Boosted seed oversampling for local community ranking. Issue 2 (March 2020)
- Record Type:
- Journal Article
- Title:
- Boosted seed oversampling for local community ranking. Issue 2 (March 2020)
- Main Title:
- Boosted seed oversampling for local community ranking
- Authors:
- Krasanakis, Emmanouil
Schinas, Emmanouil
Papadopoulos, Symeon
Kompatsiaris, Yiannis
Symeonidis, Andreas
- Abstract:
- Highlights: Example seed nodes may be too few and non-central to help produce high quality community ranks. Finding more seed nodes can improve the quality of ranking algorithms. Proposed a novel seed oversampling method to find more seed nodes. Proposed a boosting scheme to combine seed oversampling iterations. Produced higher quality ranks compared to the neighborhood inflation heuristic.
Abstract: Local community detection is an emerging topic in network analysis that aims to detect well-connected communities encompassing sets of priorly known seed nodes. In this work, we explore the similar problem of ranking network nodes based on their relevance to the communities characterized by seed nodes. However, seed nodes may not be central enough or sufficiently many to produce high quality ranks. To solve this problem, we introduce a methodology we call seed oversampling, which first runs a node ranking algorithm to discover more nodes that belong to the community and then reruns the same ranking algorithm for the new seed nodes. We formally discuss why this process improves the quality of calculated community ranks if the original set of seed nodes is small and introduce a boosting scheme that iteratively repeats seed oversampling to further improve rank quality when certain ranking algorithm properties are met. Finally, we demonstrate the effectiveness of our methods in improving community relevance ranks given only a few random seed nodes of real-world network communities. In our experiments, boosted and simple seed oversampling yielded better rank quality than the previous neighborhood inflation heuristic, which adds the neighborhoods of original seed nodes to seeds.
- Is Part Of:
- Information processing & management. Volume 57:Issue 2(2020:Mar.)
- Journal:
- Information processing & management
- Issue:
- Volume 57:Issue 2(2020:Mar.)
- Issue Display:
- Volume 57, Issue 2 (2020)
- Year:
- 2020
- Volume:
- 57
- Issue:
- 2
- Issue Sort Value:
- 2020-0057-0002-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-03
- Subjects:
- Network ranking -- Local communities -- Seed oversampling -- Unsupervised rank boosting
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2019.06.002
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12552.xml
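
Illustrative sketch (not part of the catalogue record or of the article itself): the abstract describes seed oversampling as running a node ranking algorithm, promoting well-ranked nodes to seeds, and rerunning the same algorithm, and it describes neighborhood inflation as adding the neighborhoods of the original seeds to the seed set. The Python below is one possible reading of those two descriptions under stated assumptions. Personalized PageRank is assumed as the ranking algorithm, and the promotion rule (keep every node ranked at least as high as the lowest-ranked original seed) is an assumption made for illustration; the abstract specifies neither. The function names, parameters, and the toy graph are hypothetical, and the boosting scheme is not sketched because the abstract does not describe how iterations are combined.

```python
from typing import Callable, Dict, Hashable, Iterable, List

Graph = Dict[Hashable, List[Hashable]]   # undirected graph as adjacency lists
Ranks = Dict[Hashable, float]


def personalized_pagerank(graph: Graph, seeds: Iterable[Hashable],
                          alpha: float = 0.85, iterations: int = 100) -> Ranks:
    """Power iteration for personalized PageRank (assumed ranking algorithm)."""
    seed_set = set(seeds)
    # Restart distribution: uniform over the seed nodes, zero elsewhere.
    restart = {v: (1.0 / len(seed_set) if v in seed_set else 0.0) for v in graph}
    ranks = dict(restart)
    for _ in range(iterations):
        nxt = {v: (1.0 - alpha) * restart[v] for v in graph}
        for v, neighbors in graph.items():
            if not neighbors:
                continue  # isolated nodes pass nothing on in this sketch
            share = alpha * ranks[v] / len(neighbors)
            for u in neighbors:
                nxt[u] += share
        ranks = nxt
    return ranks


def seed_oversampling(graph: Graph, seeds: Iterable[Hashable],
                      rank: Callable[[Graph, Iterable[Hashable]], Ranks]
                      = personalized_pagerank) -> Ranks:
    """Rank once, promote well-ranked nodes to seeds, then rank again."""
    seed_set = set(seeds)
    first_pass = rank(graph, seed_set)
    # Assumed promotion rule: keep nodes ranked at least as high as the
    # lowest-ranked original seed (this always retains the original seeds).
    threshold = min(first_pass[s] for s in seed_set)
    new_seeds = {v for v, r in first_pass.items() if r >= threshold}
    return rank(graph, new_seeds)


def neighborhood_inflation(graph: Graph, seeds: Iterable[Hashable],
                           rank: Callable[[Graph, Iterable[Hashable]], Ranks]
                           = personalized_pagerank) -> Ranks:
    """Baseline: add each original seed's neighbors to the seeds, then rank."""
    inflated = set(seeds)
    for s in set(seeds):
        inflated.update(graph[s])
    return rank(graph, inflated)


# Hypothetical toy graph: two 4-node cliques joined by the single edge d-e.
example_graph: Graph = {
    "a": ["b", "c", "d"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
    "d": ["a", "b", "c", "e"],
    "e": ["d", "f", "g", "h"],
    "f": ["e", "g", "h"], "g": ["e", "f", "h"], "h": ["e", "f", "g"],
}

oversampled = seed_oversampling(example_graph, ["a"])
inflated = neighborhood_inflation(example_graph, ["a"])
```

With the single seed "a", both functions return a score for every node of the toy graph, and nodes in the seed's own clique score higher than nodes in the far clique. Swapping in another ranking algorithm only requires passing a different `rank` callable with the same signature.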