A supervised learning approach for heading detection. Issue 4 (8th January 2020)
- Record Type:
- Journal Article
- Title:
- A supervised learning approach for heading detection. Issue 4 (8th January 2020)
- Main Title:
- A supervised learning approach for heading detection
- Authors:
- Budhiraja, Sahib Singh
Mago, Vijay
- Other Names:
- Saen, Reza Farzipoor, guest editor.
Song, Malin, guest editor.
Fisher, Ron, guest editor.
- Abstract:
- As the popularity of the portable document format (PDF) increases, research that facilitates PDF text analysis or extraction is necessary. Heading detection is a crucial component of PDF‐based text classification processes. This research involves training a supervised learning model to detect headings by systematically testing and selecting classifier features using recursive feature elimination. Results indicate that a decision tree is the best classifier, with an accuracy of 95.83%, a sensitivity of 0.981, and a specificity of 0.946. This research into heading detection contributes to the field of PDF‐based text extraction and can be applied to the automation of large‐scale PDF text analysis in a variety of professional and policy‐based contexts.
- Is Part Of:
- Expert systems. Volume 37:Issue 4(2020)
- Journal:
- Expert systems
- Issue:
- Volume 37:Issue 4(2020)
- Issue Display:
- Volume 37, Issue 4 (2020)
- Year:
- 2020
- Volume:
- 37
- Issue:
- 4
- Issue Sort Value:
- 2020-0037-0004-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2020-01-08
- Subjects:
- heading detection -- machine learning -- supervised learning algorithm -- text segmentation
Expert systems (Computer science)
006.33
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1468-0394
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/exsy.12520
- Languages:
- English
- ISSNs:
- 0266-4720
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 22882.xml
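
The abstract reports accuracy, sensitivity, and specificity for a binary heading / non-heading classifier. As a minimal sketch of how these three figures relate to a confusion matrix, the function below computes them from raw counts. The function name `binary_metrics` and the counts in the usage lines are hypothetical illustrations; they are not taken from the article and do not reproduce its results.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: headings correctly labelled as headings
    fn: headings missed (labelled as non-headings)
    tn: non-headings correctly labelled as non-headings
    fp: non-headings wrongly labelled as headings
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # share of all lines labelled correctly
        "sensitivity": tp / (tp + fn),      # share of true headings that were found
        "specificity": tn / (tn + fp),      # share of non-headings correctly rejected
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=45, fn=5, tn=90, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are reported separately because headings are usually much rarer than body text, so accuracy alone can look high even when many headings are missed.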