Non-native text analysis: A survey. (7th September 2015)
- Record Type:
- Journal Article
- Title:
- Non-native text analysis: A survey. (7th September 2015)
- Main Title:
- Non-native text analysis: A survey
- Authors:
- MASSUNG, SEAN
ZHAI, CHENGXIANG - Abstract:
- Abstract: Non-native speakers of English far outnumber native speakers; English is the main language of books, newspapers, airports, air-traffic control, international business, academic conferences, science, technology, diplomacy, sports, international competitions, pop music, and advertising (British Council2014 ). Online education in the form of massive online open courses is also primarily in English—even teaching English. This creates enormous amounts of text written by non-native speakers, which in turn generates a need for grammar correction and analysis. Even aside from massive online open courses, the number of English learners in Asia alone is in the tens of millions. In this paper, we provide a survey of the two main areas of existing work on non-native text analysis, prefaced by an overview of common datasets used by researchers, comparing their attributes and potential uses. Then, an introduction to native language identification follows: determining the native language of an author based on text in the second language. This section is subdivided into various techniques and a shared task on this classification problem. Next, we discuss non-native grammatical error correction—finding and modifying text to fix errors or to make it sound more fluent. Again, we discuss different methods before investigating a relevant shared task. Lastly, we end with conclusions and potential future directions. While this survey primarily focuses on detecting and correctingAbstract: Non-native speakers of English far outnumber native speakers; English is the main language of books, newspapers, airports, air-traffic control, international business, academic conferences, science, technology, diplomacy, sports, international competitions, pop music, and advertising (British Council2014 ). Online education in the form of massive online open courses is also primarily in English—even teaching English. This creates enormous amounts of text written by non-native speakers, which in turn generates a need for grammar correction and analysis. Even aside from massive online open courses, the number of English learners in Asia alone is in the tens of millions. In this paper, we provide a survey of the two main areas of existing work on non-native text analysis, prefaced by an overview of common datasets used by researchers, comparing their attributes and potential uses. Then, an introduction to native language identification follows: determining the native language of an author based on text in the second language. This section is subdivided into various techniques and a shared task on this classification problem. Next, we discuss non-native grammatical error correction—finding and modifying text to fix errors or to make it sound more fluent. Again, we discuss different methods before investigating a relevant shared task. Lastly, we end with conclusions and potential future directions. While this survey primarily focuses on detecting and correcting non-native English text, many approaches are general and can be used across any language pairing. … (more)
- Is Part Of:
- Natural language engineering. Volume 22:Part 2(2016)
- Journal:
- Natural language engineering
- Issue:
- Volume 22:Part 2(2016)
- Issue Display:
- Volume 22, Issue 2, Part 2 (2016)
- Year:
- 2016
- Volume:
- 22
- Issue:
- 2
- Part:
- 2
- Issue Sort Value:
- 2016-0022-0002-0002
- Page Start:
- 163
- Page End:
- 186
- Publication Date:
- 2015-09-07
- Subjects:
- Natural language processing (Computer science) -- Periodicals
Software engineering -- Periodicals
006.35 - Journal URLs:
- http://journals.cambridge.org/action/displayJournal?jid=NLE ↗
- DOI:
- 10.1017/S1351324915000303 ↗
- Languages:
- English
- ISSNs:
- 1351-3249
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library HMNTS - ELD Digital store
- Ingest File:
- 154.xml