Scale robust deep oriented-text detection network. (June 2020)
- Record Type:
- Journal Article
- Title:
- Scale robust deep oriented-text detection network. (June 2020)
- Main Title:
- Scale robust deep oriented-text detection network
- Authors:
- Zheng, Yuqiang
Xie, Yuan
Qu, Yanyun
Yang, Xiaodong
Li, Cuihua
Zhang, Yan
- Abstract:
- Highlights: We propose a highly efficient one-stage deep text-detection model for natural scene multi-oriented text with robustness to scale variation. The proposed deep model contains the feature extraction block, the feature refining block and the prediction block. The feature refining block embeds an upsampling operation, a Residual Convolution Unit and a Chained Residual Pooling Unit in order to keep text detection in a higher-resolution feature map. Our method is evaluated on ICDAR2015 and MSRA-TD500, achieving detection F-scores over 83.1% and 79.0%, respectively.
Abstract: Text detection is a prerequisite of text recognition, and multi-oriented text detection has recently become a hot topic. The existing multi-oriented text detection methods fall short when facing two issues: 1) text scales vary over a wide range, and 2) there is a foreground-background class imbalance. In this paper, we propose a scale-robust deep multi-oriented text-detection model, which not only has the efficiency of a one-stage deep detection model, but also has accuracy comparable to that of a two-stage deep text-detection model. We design the feature refining block to fuse multi-scale context features for the purpose of keeping text detection in a higher-resolution feature map. Moreover, in order to mitigate the foreground-background class imbalance, Focal Loss is adopted to up-weight the hard-to-classify samples. Our method is evaluated on four benchmark text datasets: ICDAR2013, ICDAR2015, COCO-Text and MSRA-TD500. The experimental results demonstrate that our method is superior to the existing one-stage deep text-detection models and comparable to the state-of-the-art text detection methods.
- Is Part Of:
- Pattern recognition. Volume 102(2020:Jun.)
- Journal:
- Pattern recognition
- Issue:
- Volume 102(2020:Jun.)
- Issue Display:
- Volume 102 (2020)
- Year:
- 2020
- Volume:
- 102
- Issue Sort Value:
- 2020-0102-0000-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-06
- Subjects:
- Scene text detection -- One-stage -- Scale robust
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2019.107180
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12955.xml