A database of unconstrained Vietnamese online handwriting and recognition experiments by recurrent neural networks. (June 2018)
- Record Type:
- Journal Article
- Title:
- A database of unconstrained Vietnamese online handwriting and recognition experiments by recurrent neural networks. (June 2018)
- Main Title:
- A database of unconstrained Vietnamese online handwriting and recognition experiments by recurrent neural networks
- Authors:
- Nguyen, Hung Tuan
Nguyen, Cuong Tuan
Bao, Pham The
Nakagawa, Masaki
- Abstract:
- Highlights: A Vietnamese online handwriting database is made and analyzed. Vietnamese online handwritten text poses a challenge due to many delayed strokes. Long Short-Term Memory neural networks are effective at processing delayed strokes.
Abstract: We present our efforts to create a database of unconstrained Vietnamese online handwritten text sampled from pen-based devices. The database stores handwritten text for paragraphs, lines, words, and characters, with the ground truth associated with every paragraph and line. We show a detailed statistical analysis of the handwritten text in this database and describe recognition experiments using several recent methods including the Bidirectional Long Short-Term Memory (BLSTM) network. Overall, our database contains over 480,000 strokes from more than 380,000 characters, which, at present, is the largest database of Vietnamese online handwritten text. Although Vietnamese script is based on a fixed set of alphabet letters, the recognition of Vietnamese online handwritten text poses a difficult challenge because of many diacritical marks, which usually result in delayed strokes during writing. We designed and implemented an online handwriting-collection tool to gather data, as well as a line-segmentation tool and a delayed-stroke-detection tool to analyze collected handwritten text. We also conducted a statistical analysis based on the writer profiles. We applied a number of state-of-the-art recognition methods to unconstrained Vietnamese handwriting to evaluate their performance, including the BLSTM network, which is an efficient architecture derived from the Recurrent Neural Network (RNN) and is often applied to sequence labeling problems. The BLSTM network achieved 90% character recognition accuracy, despite many long sequences with several delayed strokes. Our database is open access for research to stimulate the development of handwriting research technology.
- Is Part Of:
- Pattern recognition. Volume 78(2018:Jun.)
- Journal:
- Pattern recognition
- Issue:
- Volume 78(2018:Jun.)
- Issue Display:
- Volume 78 (2018)
- Year:
- 2018
- Volume:
- 78
- Issue Sort Value:
- 2018-0078-0000-0000
- Page Start:
- 291
- Page End:
- 306
- Publication Date:
- 2018-06
- Subjects:
- Vietnamese online handwriting -- Database -- Handwriting recognition -- Recurrent neural network -- Long short term memory
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2018.01.013
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11317.xml