A parallel fuzzy clustering algorithm for large graphs using Pregel. (15th July 2017)
- Record Type:
- Journal Article
- Title:
- A parallel fuzzy clustering algorithm for large graphs using Pregel. (15th July 2017)
- Main Title:
- A parallel fuzzy clustering algorithm for large graphs using Pregel
- Authors:
- Bhatia, Vandana
Rani, Rinkle - Abstract:
- Highlights: We propose a parallel fuzzy clustering algorithm (PGFC) for large graphs. Pregel and Hadoop frameworks are used for processing of massive graph data. PGFC is scalable and produces good quality of clusters. The performance is validated using graphs having upto millions of nodes. Results reveal that PGFC significantly outperforms state-of-the-art algorithms. Abstract: Large graphs are scale free and ubiquitous having irregular relationships. Clustering is used to find existent similar patterns in graphs and thus help in getting useful insights. In real-world, nodes may belong to more than one cluster thus, it is essential to analyze fuzzy cluster membership of nodes. Traditional centralized fuzzy clustering algorithms incur high communication cost and produce poor quality of clusters when used for large graphs. Thus, scalable solutions are obligatory to handle huge amount of data in less computational time with minimum disk access. In this paper, we proposed a parallel fuzzy clustering algorithm named 'PGFC' for handling scalable graph data. It will be advantageous from the viewpoint of expert systems to develop a clustering algorithm that can assure scalability along with better quality of clusters for handling large graphs.The algorithm is parallelized using bulk synchronous parallel (BSP) based Pregel model. The cluster centers are initialized using degree centrality measure, resulting in lesser number of iterations. The performance of PGFC is compared withHighlights: We propose a parallel fuzzy clustering algorithm (PGFC) for large graphs. Pregel and Hadoop frameworks are used for processing of massive graph data. PGFC is scalable and produces good quality of clusters. The performance is validated using graphs having upto millions of nodes. Results reveal that PGFC significantly outperforms state-of-the-art algorithms. Abstract: Large graphs are scale free and ubiquitous having irregular relationships. Clustering is used to find existent similar patterns in graphs and thus help in getting useful insights. In real-world, nodes may belong to more than one cluster thus, it is essential to analyze fuzzy cluster membership of nodes. Traditional centralized fuzzy clustering algorithms incur high communication cost and produce poor quality of clusters when used for large graphs. Thus, scalable solutions are obligatory to handle huge amount of data in less computational time with minimum disk access. In this paper, we proposed a parallel fuzzy clustering algorithm named 'PGFC' for handling scalable graph data. It will be advantageous from the viewpoint of expert systems to develop a clustering algorithm that can assure scalability along with better quality of clusters for handling large graphs.The algorithm is parallelized using bulk synchronous parallel (BSP) based Pregel model. The cluster centers are initialized using degree centrality measure, resulting in lesser number of iterations. The performance of PGFC is compared with other state of art clustering algorithms using synthetic graphs and real world networks. The experimental results reveal that the proposed PGFC scales up linearly to handle large graphs and produces better quality of clusters when compared to other graph clustering counterparts. … (more)
- Is Part Of:
- Expert systems with applications. Volume 78(2017)
- Journal:
- Expert systems with applications
- Issue:
- Volume 78(2017)
- Issue Display:
- Volume 78, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 78
- Issue:
- 2017
- Issue Sort Value:
- 2017-0078-2017-0000
- Page Start:
- 135
- Page End:
- 144
- Publication Date:
- 2017-07-15
- Subjects:
- Clustering -- Graphs -- Big data mining -- Fuzzy C-Mean -- Pregel
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33 - Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174 ↗
http://www.elsevier.com/journals ↗ - DOI:
- 10.1016/j.eswa.2017.02.005 ↗
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 2757.xml