The NoisyOffice Database: A Corpus To Train Supervised Machine Learning Filters For Image Processing. (30th November 2019)
- Record Type:
- Journal Article
- Main Title:
- The NoisyOffice Database: A Corpus To Train Supervised Machine Learning Filters For Image Processing
- Authors:
- Castro-Bleda, M J
España-Boquera, S
Pastor-Pellicer, J
Zamora-Martínez, F
- Abstract:
- This paper presents the 'NoisyOffice' database. It consists of images of printed text documents with noise mainly caused by uncleanliness from a generic office, such as coffee stains and footprints on documents or folded and wrinkled sheets with degraded printed text. This corpus is intended to train and evaluate supervised learning methods for cleaning, binarization and enhancement of noisy images of grayscale text documents. As an example, several experiments of image enhancement and binarization are presented by using deep learning techniques. Double-resolution images are also provided for testing super-resolution methods. The corpus is freely available at the UCI Machine Learning Repository. Finally, a challenge organized by Kaggle Inc. to denoise images, using the database, is described in order to show its suitability for benchmarking of image processing systems.
- Is Part Of:
- Computer journal. Volume 63:Number 11(2020)
- Journal:
- Computer journal
- Issue:
- Volume 63:Number 11(2020)
- Issue Display:
- Volume 63, Issue 11 (2020)
- Year:
- 2020
- Volume:
- 63
- Issue:
- 11
- Issue Sort Value:
- 2020-0063-0011-0000
- Page Start:
- 1658
- Page End:
- 1667
- Publication Date:
- 2019-11-30
- Subjects:
- optical character recognition -- image processing -- binarization -- denoising -- super resolution -- machine learning -- neural networks -- deep learning
Computers -- Periodicals
005.1
- Journal URLs:
- http://comjnl.oxfordjournals.org/
http://ukcatalogue.oup.com/
- DOI:
- 10.1093/comjnl/bxz098
- Languages:
- English
- ISSNs:
- 0010-4620
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.060000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15099.xml