Self-supervised deep reconstruction of mixed strip-shredded text documents. (November 2020)
- Record Type:
- Journal Article
- Title:
- Self-supervised deep reconstruction of mixed strip-shredded text documents. (November 2020)
- Main Title:
- Self-supervised deep reconstruction of mixed strip-shredded text documents
- Authors:
- Paixão, Thiago M.
Berriel, Rodrigo F.
Boeres, Maria C.S.
Koerich, Alessandro L.
Badue, Claudine
De Souza, Alberto F.
Oliveira-Santos, Thiago
- Abstract:
- Highlights:
Paper shreds matching via self-supervised deep learning.
Training with simulated cuts is effective for real-shredded documents.
A new public dataset with 100 strip-shredded documents (2292 shreds).
Accurate (over 90% accuracy) reconstruction of 100 mixed shredded documents.
Abstract: The reconstruction of shredded documents consists of coherently arranging fragments of paper (shreds) to recover the original document(s). A great challenge in computational reconstruction is to properly evaluate the compatibility between the shreds. While traditional pixel-based approaches are not robust to real shredding, more sophisticated solutions significantly compromise time performance. The solution presented in this work extends our previous deep learning method for single-page reconstruction to a more realistic/complex scenario: the reconstruction of several mixed shredded documents at once. In our approach, the compatibility evaluation is modeled as a two-class (valid or invalid) pattern recognition problem. The model is trained in a self-supervised manner on samples extracted from simulated-shredded documents, which obviates manual annotation. Experimental results on three datasets, including a new collection of 100 strip-shredded documents produced for this work, have shown that the proposed method outperforms the competing ones on complex scenarios, achieving accuracy above 90%.
- Is Part Of:
- Pattern recognition. Volume 107(2020:Nov.)
- Journal:
- Pattern recognition
- Issue:
- Volume 107(2020:Nov.)
- Issue Display:
- Volume 107 (2020)
- Year:
- 2020
- Volume:
- 107
- Issue Sort Value:
- 2020-0107-0000-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-11
- Subjects:
- Deep learning -- Self-supervised learning -- Fully convolutional neural networks -- Document reconstruction -- Forensics -- Optimization search
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2020.107535
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19199.xml