ReLaText: Exploiting visual relationships for arbitrary-shaped scene text detection with graph convolutional networks. (March 2021)
- Record Type:
- Journal Article
- Title:
- ReLaText: Exploiting visual relationships for arbitrary-shaped scene text detection with graph convolutional networks. (March 2021)
- Main Title:
- ReLaText: Exploiting visual relationships for arbitrary-shaped scene text detection with graph convolutional networks
- Authors:
- Ma, Chixiang
Sun, Lei
Zhong, Zhuoyao
Huo, Qiang
- Abstract:
- Highlights: We propose a new arbitrary-shaped text detection approach by formulating text detection as a visual relationship detection problem and demonstrate the effectiveness of this new formulation by starting from using a "link" relationship to address the challenging text-line grouping problem. We adopt a GCN-based visual relationship detection framework to effectively leverage context information to improve link prediction accuracy so that even text instances with large inter-character or very small inter-line spacing can be robustly detected by our text detector. Significantly improved text detection accuracy demonstrates the effectiveness of visual relationship prediction based text-line grouping and the effectiveness of GCN-based link relationship prediction. Our new text detector has achieved state-of-the-art performance on five public text detection benchmarks, namely RCTW-17, MSRA-TD500, Total-Text, CTW1500 and DAST1500.
Abstract: We introduce a new arbitrary-shaped text detection approach named ReLaText by formulating text detection as a visual relationship detection problem. To demonstrate the effectiveness of this new formulation, we start from using a "link" relationship to address the challenging text-line grouping problem firstly. The key idea is to decompose text detection into two subproblems, namely detection of text primitives and prediction of link relationships between nearby text primitive pairs. Specifically, an anchor-free region proposal network based text detector is first used to detect text primitives of different scales from different feature maps of a feature pyramid network, from which a text primitive graph is constructed by linking each pair of nearby text primitives detected from a same feature map with an edge. Then, a Graph Convolutional Network (GCN) based link relationship prediction module is used to prune wrongly-linked edges in the text primitive graph to generate a number of disjoint subgraphs, each representing a detected text instance. As GCN can effectively leverage context information to improve link prediction accuracy, our GCN based text-line grouping approach can achieve better text detection accuracy than previous text-line grouping methods, especially when dealing with text instances with large inter-character or very small inter-line spacing. Consequently, the proposed ReLaText achieves state-of-the-art performance on five public text detection benchmarks, namely RCTW-17, MSRA-TD500, Total-Text, CTW1500 and DAST1500.
- Is Part Of:
- Pattern recognition. Volume 111(2021)
- Journal:
- Pattern recognition
- Issue:
- Volume 111(2021)
- Issue Display:
- Volume 111, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 111
- Issue:
- 2021
- Issue Sort Value:
- 2021-0111-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-03
- Subjects:
- Arbitrary-Shaped text detection -- Graph convolutional network -- Link prediction -- Visual relationship detection
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2020.107684
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14921.xml
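
Note on the abstract: the final grouping step it describes (pruning wrongly-linked edges so that the text primitive graph falls apart into disjoint subgraphs, one per text instance) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the link scores have already been produced by some classifier (the paper uses a GCN-based module, which is not reproduced here), and the function name `group_primitives`, the `(i, j, score)` edge format and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def group_primitives(num_primitives, scored_edges, keep_threshold=0.5):
    """Group text primitives into instances by pruning low-score links.

    scored_edges: iterable of (i, j, link_score) tuples. The scores stand in
    for the output of a link-relationship predictor (hypothetical format).
    keep_threshold: assumed cut-off; edges scoring below it are pruned.
    """
    # Union-find parent table: every primitive starts as its own group.
    parent = list(range(num_primitives))

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, score in scored_edges:
        if score >= keep_threshold:  # keep the edge, merge the two groups
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Each surviving connected component is one detected text instance.
    groups = defaultdict(list)
    for p in range(num_primitives):
        groups[find(p)].append(p)
    return sorted(groups.values())


# Five primitives in a chain; the link between 2 and 3 is judged wrong.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.1), (3, 4, 0.7)]
instances = group_primitives(5, edges)
```

In this toy input the low-scoring edge between primitives 2 and 3 is dropped, so the chain splits into two groups, which corresponds to two separate text instances.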