Realtime multi-scale scene text detection with scale-based region proposal network. (February 2020)
- Record Type:
- Journal Article
- Title:
- Realtime multi-scale scene text detection with scale-based region proposal network. (February 2020)
- Main Title:
- Realtime multi-scale scene text detection with scale-based region proposal network
- Authors:
- He, Wenhao
Zhang, Xu-Yao
Yin, Fei
Luo, Zhenbo
Ogier, Jean-Marc
Liu, Cheng-Lin
- Abstract:
- Highlights: We propose a novel network named SRPN to realize both text/non-text localization and scale estimation efficiently. A two-stage detection scheme based on SRPN is proposed to avoid using multi-scale pyramid input and achieve faster detection speed. The proposed method achieves remarkable speedup on ICDAR2015, ICDAR2013 and MSRA-TD500 while keeping competitive performance. Ablation experiments are given to prove reasonableness of the proposed method from different aspects.
Abstract: Multi-scale approaches have been widely used for achieving high accuracy for scene text detection, but they usually slow down the speed of the whole system. In this paper, we propose a two-stage framework for realtime multi-scale scene text detection. The first stage employs a novel Scale-based Region Proposal Network (SRPN) which can localize text of wide scale range and estimate text scale efficiently. Based on SRPN, non-text regions are filtered out, and text region proposals are generated. Moreover, based on text scale estimation by SRPN, small or big texts in region proposals are resized into a unified normal scale range. The second stage then adopts a Fully Convolutional Network based scene text detector to localize text words from proposals of the first stage. Text detector in the second stage detects texts of narrow scale range but accurately. Since most non-text regions are eliminated through SRPN efficiently, and texts in proposals are properly scaled to avoid multi-scale pyramid processing, the whole system is quite fast. We evaluate both performance and speed of the proposed method on datasets ICDAR2015, ICDAR2013, and MSRA-TD500. On ICDAR2015, our system can reach the state-of-the-art F-measure score of 85.40% at 16.5 fps (frames per second), and competitive performance of 79.66% at 35.1 fps, either of which is more than 5 times faster than previous best methods. On ICDAR2013 and MSRA-TD500, we also achieve remarkable speedup while keeping competitive performance. Ablation experiments are also provided to demonstrate the reasonableness of our method.
- Is Part Of:
- Pattern recognition. Volume 98(2020:Feb.)
- Journal:
- Pattern recognition
- Issue:
- Volume 98(2020:Feb.)
- Issue Display:
- Volume 98 (2020)
- Year:
- 2020
- Volume:
- 98
- Issue Sort Value:
- 2020-0098-0000-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-02
- Subjects:
- Scene text detection -- Multi-scale -- Speedup -- Scale-based region proposal network
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2019.107026
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12059.xml