SCUT-HCCDoc: A new benchmark dataset of handwritten Chinese text in unconstrained camera-captured documents. (December 2020)

Record Type:: Journal Article
Title:: SCUT-HCCDoc: A new benchmark dataset of handwritten Chinese text in unconstrained camera-captured documents. (December 2020)
Main Title:: SCUT-HCCDoc: A new benchmark dataset of handwritten Chinese text in unconstrained camera-captured documents
Authors:: Zhang, Hesuo
Liang, Lingyu
Jin, Lianwen
Abstract:: Highlights: We propose a new large-scale dataset, SCUT-HCCDoc, for handwritten Chinese text detection, recognition and spotting in natural images. We statistically analyze the samples and annotations of SCUT-HCCDoc at the image, text and character levels. We use state-of-the-art methods for baseline evaluation of text line detection, text line recognition, and end-to-end text spotting. A comprehensive analysis of the challenges of the dataset and of the benchmark tests is provided. Abstract: In this paper, we introduce a large-scale dataset, called SCUT-HCCDoc, to address challenging detection and recognition problems of handwritten Chinese text (HCT) in camera-captured documents. Despite extensive studies of optical character recognition (OCR) and offline handwriting recognition for document images, text detection and recognition in camera-captured documents remains an unsolved problem that merits extensive study. With recent advances in deep learning, researchers have proposed useful architectures for feature learning, detection, and recognition of scene text. However, the performance of deep learning methods depends heavily on the amount and diversity of training data. Previous OCR and offline HCT datasets were built under specific constraints, and most recent scene text datasets are for non-handwritten text. Hence, a comprehensive benchmark for handwritten scene text is lacking. This study focuses on scenes with …
Is Part Of:: Pattern recognition. Volume 108(2020:Dec.)
Journal:: Pattern recognition
Issue:: Volume 108(2020:Dec.)
Issue Display:: Volume 108 (2020)
Year:: 2020
Volume:: 108
Issue Sort Value:: 2020-0108-0000-0000
Page Start:
Page End:
Publication Date:: 2020-12
Subjects:: Document analysis and recognition -- Handwritten Chinese text recognition -- Handwritten Chinese text detection -- Benchmark dataset
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
Journal URLs:: http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
DOI:: 10.1016/j.patcog.2020.107559
Languages:: English
ISSNs:: 0031-3203
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 13920.xml