Layout-based computation of web page similarity ranks. Issue 110 (February 2018)
- Record Type:
- Journal Article
- Title:
- Layout-based computation of web page similarity ranks. Issue 110 (February 2018)
- Main Title:
- Layout-based computation of web page similarity ranks
- Authors:
- Bozkir, Ahmet Selman
Akcapinar Sezer, Ebru
- Abstract:
- Highlights: A novel web page layout similarity measuring and ranking approach is proposed. Visual layout of a web page is enabled to be used as a query item. As a representation method, wireframes are proposed and verified. Form elements, animations and whitespaces are introduced as structural features. A novel and verified SWPS-40 dataset is introduced for web page layout similarity related further studies. Experiments show that use of wireframe representation along with structural and vision based features to the web page visual similarity comparison is highly reasonable.
Abstract: In this paper, we propose a ranking approach which considers visual similarities among web pages by using structure and vision-based features. Throughout the study, we aim to understand and represent the web page visual structure as in the way people do by focusing on the layout similarity through the wireframe design. The conducted study is composed of two parts. In the first part, structural similarities are analyzed with the proposed concept of "layout components" along with visual inspection of DOM trees. In this way, five types of structural layout components are proposed and revealed. Moreover, whitespaces are also utilized since they are important visual cues in the visual perception of web pages. In the second part, a computer-vision based method named histogram of oriented gradients (HOG) is employed to reveal local visual cues in terms of edge orientations. Following the feature extraction phases, extracted feature histograms are mapped on spatial information preserving multilevel and multi-resolution bag of features representation method named spatial pyramid matching. In this way, three goals were achieved: (1) the visual layout of web pages were mapped and compared in a multi-resolution schema; (2) the intermediate process of visual segmentation was removed; and (3) efficient and easily comparable web page layout signatures were generated. We also conducted a questionnaire study covering 312 subjects. This helped us to create a benchmark dataset involving similarity scores collected from individuals. So far, there exists no web page layout similarity ranking oriented corpus in the literature. Our suggested approach achieved a remarkable ranking performance at top-5 and top-10 retrieval results. According to the findings of the comparative study, our approach outperforms some structure and vision-based studies in the literature. With this achievement, web pages could be employed as a query item to find other, similar web pages by taking into consideration that they are web pages, instead of images or anything else.
- Is Part Of:
- International journal of human-computer studies. Issue 110(2018)
- Journal:
- International journal of human-computer studies
- Issue:
- Issue 110(2018)
- Issue Display:
- Volume 110, Issue 110 (2018)
- Year:
- 2018
- Volume:
- 110
- Issue:
- 110
- Issue Sort Value:
- 2018-0110-0110-0000
- Page Start:
- 95
- Page End:
- 114
- Publication Date:
- 2018-02
- Subjects:
- Web page layout -- Layout similarity -- Similarity ranking -- Bag of features -- Spatial pyramid matching -- Histogram of oriented gradients
Human-machine systems -- Periodicals
Systems engineering -- Periodicals
Human engineering -- Periodicals
Human engineering
Human-machine systems
Systems engineering
Periodicals
Electronic journals
004.019
- Journal URLs:
- http://www.sciencedirect.com/science/journal/10715819
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ijhcs.2017.10.008
- Languages:
- English
- ISSNs:
- 1071-5819
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4542.288100
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5469.xml
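
Illustrative note on the abstract: the pipeline it describes (edge-orientation histograms pooled over a spatial pyramid, then compared to rank pages) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. It uses a simplified magnitude-weighted orientation histogram rather than full HOG (no block normalization), the level weights follow the commonly used spatial pyramid matching scheme, and the wireframe rendering and structural layout components from the paper are not modelled. All function names and the toy "pages" are hypothetical.

```python
import math


def orientation_histogram(img, r0, r1, c0, c1, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations in a region."""
    hist = [0.0] * bins
    rows, cols = len(img), len(img[0])
    # Skip the outermost pixels so central differences stay inside the image.
    for r in range(max(r0, 1), min(r1, rows - 1)):
        for c in range(max(c0, 1), min(c1, cols - 1)):
            gx = img[r][c + 1] - img[r][c - 1]
            gy = img[r + 1][c] - img[r - 1][c]
            mag = math.hypot(gx, gy)
            if mag > 0.0:
                angle = math.atan2(gy, gx) % math.pi  # fold direction into [0, pi)
                hist[min(int(angle / math.pi * bins), bins - 1)] += mag
    return hist


def pyramid_signature(img, levels=3, bins=9):
    """Concatenate per-cell histograms over a 1x1, 2x2, 4x4, ... grid."""
    rows, cols = len(img), len(img[0])
    top = levels - 1
    signature = []
    for level in range(levels):
        cells = 2 ** level
        level_hist = []
        for i in range(cells):
            for j in range(cells):
                level_hist += orientation_histogram(
                    img, i * rows // cells, (i + 1) * rows // cells,
                    j * cols // cells, (j + 1) * cols // cells, bins)
        total = sum(level_hist)
        # Assumed weighting: finer levels count more; the weights sum to 1.
        weight = 1.0 / 2 ** top if level == 0 else 1.0 / 2 ** (top - level + 1)
        signature += [weight * v / total if total else 0.0 for v in level_hist]
    return signature


def layout_similarity(sig_a, sig_b):
    """Histogram intersection of two pyramid signatures."""
    return sum(min(a, b) for a, b in zip(sig_a, sig_b))


def rank_pages(query, candidates, levels=3):
    """Sort candidate pages by similarity to the query page, best first."""
    q = pyramid_signature(query, levels)
    scored = [(name, layout_similarity(q, pyramid_signature(img, levels)))
              for name, img in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def make_page(size, filled):
    """Toy stand-in for a rendered wireframe: 1.0 where a block is drawn."""
    return [[1.0 if filled(r, c) else 0.0 for c in range(size)]
            for r in range(size)]


sidebar = make_page(16, lambda r, c: c < 4)       # left sidebar layout
wide_sidebar = make_page(16, lambda r, c: c < 5)  # slightly wider sidebar
header = make_page(16, lambda r, c: r < 4)        # top header layout
ranking = rank_pages(sidebar, {"wide_sidebar": wide_sidebar, "header": header})
print(ranking)
```

In this toy setup the page with the slightly wider sidebar ranks above the header-only page, because its edges share both orientation and approximate position with the query; the finest pyramid level is where the small positional shift costs it some score.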