Accelerating the process of web page segmentation via template clustering. (2016)
- Record Type:
- Journal Article
- Title:
- Accelerating the process of web page segmentation via template clustering. (2016)
- Main Title:
- Accelerating the process of web page segmentation via template clustering
- Authors:
- Zeleny, Jan
Burget, Radek
- Abstract:
- Page segmentation is often one of the initial steps when performing data mining on a web page. In recent years, several page segmentation methods have been developed that are based on the visual perception of the web page. In this paper, we propose a generic method for improving the efficiency of virtually all vision-based segmentation algorithms. Our method, called cluster-based page segmentation, takes the widespread concept of web templates and utilises it to improve the efficiency of vision-based page segmentation by clustering web pages and performing the segmentation on the clusters instead of on each page in a cluster. To demonstrate the efficiency of our method, we offer experimental results gathered using three different vision-based segmentation algorithms.
- Is Part Of:
- International journal of intelligent information and database systems. Volume 9:Number 2(2016)
- Journal:
- International journal of intelligent information and database systems
- Issue:
- Volume 9:Number 2(2016)
- Issue Display:
- Volume 9, Issue 2 (2016)
- Year:
- 2016
- Volume:
- 9
- Issue:
- 2
- Issue Sort Value:
- 2016-0009-0002-0000
- Page Start:
- 134
- Page End:
- 154
- Publication Date:
- 2016
- Subjects:
- VIPS -- page segmentation -- vision-based page segmentation -- web page segmentation -- web page preprocessing -- segmentation performance -- template detection -- template clustering -- data mining -- visual perception -- web templates
Database management -- Computer programs -- Periodicals
Information retrieval -- Computer programs -- Periodicals
Information storage and retrieval systems -- Computer programs -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Intelligent agents (Computer software) -- Periodicals
006.33
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijiids
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1751-5858
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 7638.xml