Word spotting and recognition via a joint deep embedding of image and text. (April 2019)
- Record Type:
- Journal Article
- Title:
- Word spotting and recognition via a joint deep embedding of image and text. (April 2019)
- Main Title:
- Word spotting and recognition via a joint deep embedding of image and text
- Authors:
- Mhiri, Mohamed
Desrosiers, Christian
Cheriet, Mohamed
- Abstract:
- Highlights: A unified framework for word spotting and recognition. An end-to-end model for the joint embedding of handwritten word texts and images. Embedded via a convolutional neural network (CNN) and a recurrent neural network (RNN). A word representation modeling character-level information.
Abstract: This work addresses three important yet challenging problems of handwritten text understanding: word recognition, query-by-example (QBE) word spotting and query-by-string (QBS) word spotting. In most existing approaches, these related tasks are considered independently. We propose a single unified framework based on deep learning to solve all three tasks efficiently and simultaneously. In this framework, an end-to-end deep neural network architecture is used for the joint embedding of handwritten word texts and images. Word images are embedded via a convolutional neural network (CNN), which is trained to predict a representation modeling character-level information. The output of the last convolutional layer is considered as representation in the joint embedding subspace. Likewise, a recurrent neural network (RNN) is used to map a sequence of characters to the joint subspace representation. Finally, a model based on multi-layer perceptrons is proposed to predict the matching probability between two embedding vectors. Experiments on five databases of documents written in three languages show our method to yield state-of-the-art performance for QBE and QBS word spotting. The proposed method also obtains competitive results for word recognition, when compared against approaches tailored specifically for this task.
- Is Part Of:
- Pattern recognition. Volume 88(2019:Apr.)
- Journal:
- Pattern recognition
- Issue:
- Volume 88(2019:Apr.)
- Issue Display:
- Volume 88 (2019)
- Year:
- 2019
- Volume:
- 88
- Issue Sort Value:
- 2019-0088-0000-0000
- Page Start:
- 312
- Page End:
- 320
- Publication Date:
- 2019-04
- Subjects:
- Representation learning -- Word image representation -- Word text representation -- Word spotting -- Word recognition
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2018.11.017
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9397.xml