Profile-Based Focused Crawling for Social Media-Sharing Websites. (22nd April 2009)
- Record Type:
- Journal Article
- Title:
- Profile-Based Focused Crawling for Social Media-Sharing Websites. (22nd April 2009)
- Main Title:
- Profile-Based Focused Crawling for Social Media-Sharing Websites
- Authors:
- Zhang, Zhiyong
Nasraoui, Olfa
- Other Names:
- Shih, Timothy, Academic Editor.
- Abstract:
- We present a novel profile-based focused crawling system for dealing with the increasingly popular social media-sharing websites. In this system, we treat the user profiles as ranking criteria for guiding the crawling process. Furthermore, we divide a user's profile into two parts, an internal part, which comes from the user's own contribution, and an external part, which comes from the user's social contacts. In order to expand the crawling topic, a cotagging topic-discovery scheme was adopted for social media-sharing websites. In order to efficiently and effectively extract data for the focused crawling, a path string-based page classification method is first developed for identifying list pages, detail pages, and profile pages. The identification of the correct type of page is essential for our crawling, since we want to distinguish between list, profile, and detail pages in order to extract the correct information from each type of page, and subsequently estimate a reasonable ranking for each link that is encountered while crawling. Our experiments prove the robustness of our profile-based focused crawler, as well as a significant improvement in harvest ratio, compared to breadth-first and online page importance computation (OPIC) crawlers, when crawling the Flickr website for two different topics.
- Is Part Of:
- EURASIP journal on image and video processing. Volume 2009(2009)
- Journal:
- EURASIP journal on image and video processing
- Issue:
- Volume 2009(2009)
- Issue Display:
- Volume 2009, Issue 2009 (2009)
- Year:
- 2009
- Volume:
- 2009
- Issue:
- 2009
- Issue Sort Value:
- 2009-2009-2009-0000
- Page Start:
- Page End:
- Publication Date:
- 2009-04-22
- Subjects:
- Image processing -- Digital techniques -- Periodicals
Digital video -- Periodicals
Traitement d'images
Vidéo numérique
Digital video
Image processing -- Digital techniques
Periodicals
Electronic journal
Electronic journals
621.367
- Journal URLs:
- https://jivp-eurasipjournals.springeropen.com/
http://link.springer.com/
- DOI:
- 10.1155/2009/856037
- Languages:
- English
- ISSNs:
- 1687-5176
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24873.xml