A new method for multi-oriented graphics-scene-3D text classification in video. (January 2016)
- Record Type:
- Journal Article
- Title:
- A new method for multi-oriented graphics-scene-3D text classification in video. (January 2016)
- Main Title:
- A new method for multi-oriented graphics-scene-3D text classification in video
- Authors:
- Xu, Jiamin
Shivakumara, Palaiahnakote
Lu, Tong
Tan, Chew Lim
Uchida, Seiichi
- Abstract:
- Text detection and recognition in video is challenging due to the presence of different types of texts, namely, graphics (video caption), scene (natural text), 2D, 3D, static and dynamic texts. Developing a universal method that works well for all the types is hard. In this paper, we propose a novel method for classifying graphics-scene and 2D–3D texts in video to enhance text detection and recognition accuracies. We first propose an iterative method to classify static and dynamic clusters based on the fact that static texts have zero velocity while dynamic texts have non-zero velocity. This results in text candidates for both static and dynamic texts regardless of 2D and 3D types. We then propose symmetry for text candidates using stroke width distances and medial axis values, which results in potential text candidates. We group potential text candidates using their geometrical properties to form text regions. Next, for each text region, we study the distribution of the dominant medial axis values given by ring radius transform in a new way to classify graphics and scene texts. Similarly, we study the proximity among the pixels that satisfy the gradient directions symmetry to classify 2D and 3D texts. We evaluate each step of the proposed method in terms of classification and recognition rates through classification with the existing methods to show that video text classification is effective and necessary for enhancing the capability of current text detection and recognition systems.
Highlights: We propose a novel method for classifying graphics-scene and 2D–3D texts in video. An iterative procedure to identify text candidates is presented. Stroke width and medial axis are explored for classifying graphics and scene texts. Gradient directions and medial axis are combined for classifying 2D and 3D texts.
- Is Part Of:
- Pattern recognition. Volume 49(2016:Jan.)
- Journal:
- Pattern recognition
- Issue:
- Volume 49(2016:Jan.)
- Issue Display:
- Volume 49 (2016)
- Year:
- 2016
- Volume:
- 49
- Issue Sort Value:
- 2016-0049-0000-0000
- Page Start:
- 19
- Page End:
- 42
- Publication Date:
- 2016-01
- Subjects:
- Text detection -- Text recognition -- Stroke width distance -- Ring radius transform -- Graphics and scene text -- 2D and 3D texts
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2015.07.002
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9064.xml
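- Illustrative note:
- The abstract describes separating static from dynamic text on the basis that static text has zero velocity and dynamic text has non-zero velocity. The sketch below is only a rough illustration of that idea and is not the authors' algorithm: it assumes grayscale frames given as nested lists, uses the mean absolute intensity change between consecutive frames as a stand-in for velocity, and splits pixels into two groups with a simple iterative two-mean threshold. The names `motion_proxy` and `split_static_dynamic` are hypothetical.

```python
from typing import List, Sequence

Frame = List[List[float]]  # one grayscale frame: rows of pixel intensities


def motion_proxy(frames: Sequence[Frame]) -> List[List[float]]:
    """Mean absolute intensity change per pixel over consecutive frames.

    Assumption: this stands in for "velocity"; the paper's actual
    velocity measure is not specified in the abstract.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    rows, cols = len(frames[0]), len(frames[0][0])
    proxy = [[0.0] * cols for _ in range(rows)]
    for prev, curr in zip(frames, frames[1:]):
        for r in range(rows):
            for c in range(cols):
                proxy[r][c] += abs(curr[r][c] - prev[r][c])
    steps = len(frames) - 1
    return [[v / steps for v in row] for row in proxy]


def split_static_dynamic(proxy: List[List[float]],
                         max_iter: int = 50) -> List[List[bool]]:
    """Iteratively split pixels into static (False) and dynamic (True).

    A 1-D two-mean clustering: the threshold is the midpoint of the two
    cluster means and is refined until the means stop changing.
    """
    values = [v for row in proxy for v in row]
    lo, hi = min(values), max(values)
    if lo == hi:
        # All pixels move equally: call them dynamic only if they move at all.
        return [[v > 0 for v in row] for row in proxy]
    for _ in range(max_iter):
        threshold = (lo + hi) / 2.0
        static = [v for v in values if v <= threshold]
        dynamic = [v for v in values if v > threshold]
        new_lo = sum(static) / len(static)
        new_hi = sum(dynamic) / len(dynamic)
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    threshold = (lo + hi) / 2.0
    return [[v > threshold for v in row] for row in proxy]


if __name__ == "__main__":
    # Toy clip: the right-hand column flickers, the rest never changes.
    clip = [
        [[10, 10, 0], [10, 10, 50]],
        [[10, 10, 50], [10, 10, 0]],
        [[10, 10, 0], [10, 10, 50]],
    ]
    for row in split_static_dynamic(motion_proxy(clip)):
        print(row)
```

In the toy clip only the right-hand column changes between frames, so only those pixels fall in the dynamic group. A real pipeline would go on to apply the later stages named in the abstract (stroke width, medial axis and gradient-direction symmetry), which this sketch does not attempt.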