Evaluation of web service clustering using Dirichlet Multinomial Mixture model based approach for Dimensionality Reduction in service representation. Issue 4 (July 2020)
- Record Type:
- Journal Article
- Title:
- Evaluation of web service clustering using Dirichlet Multinomial Mixture model based approach for Dimensionality Reduction in service representation. Issue 4 (July 2020)
- Main Title:
- Evaluation of web service clustering using Dirichlet Multinomial Mixture model based approach for Dimensionality Reduction in service representation
- Authors:
- Agarwal, Neha
Sikka, Geeta
Awasthi, Lalit Kumar
- Abstract:
- Highlights: Investigate topic modeling techniques for dimensionality reduction in web service. Propose a DMM model based approach to overcome short text problems. Apply clustering algorithms to group similar services into the same domain. Validate the performance of the proposed methodology using standard evaluation criteria.
Abstract: In recent years, the functionality of services has mainly been described in short natural-language text. Keyword-based searching for web service discovery is not efficient at providing relevant results. Clustering services according to their similarity reduces the search space and, as a result, the search time of the web service discovery process. In the domain of web service clustering, topic modeling techniques such as Latent Dirichlet Allocation (LDA), Correlated Topic Model (CTM), and Hierarchical Dirichlet Process (HDP) are therefore adopted for dimensionality reduction and feature representation of services in vector space. However, because the services are described in the form of short text, these techniques are not efficient, owing to a lack of occurring words, limited content, etc. In this paper, the performance of web service clustering is evaluated by applying various topic modeling techniques with different clustering algorithms to a dataset crawled from the ProgrammableWeb repository. The Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture (GSDMM) model is proposed for dimensionality reduction and feature representation of services, to overcome the limitations of short text clustering. Results show that GSDMM with K-Means or Agglomerative clustering outperforms all other methods. The performance of clustering is evaluated based on three extrinsic and two intrinsic evaluation criteria. The dimensionality reduction achieved by GSDMM is 90.88%, 88.84%, and 93.13% on three real-time crawled datasets, which is satisfactory, as the performance of clustering is also enhanced by deploying this technique.
- Is Part Of:
- Information processing & management. Volume 57:Issue 4(2020:Jul.)
- Journal:
- Information processing & management
- Issue:
- Volume 57:Issue 4(2020:Jul.)
- Issue Display:
- Volume 57, Issue 4 (2020)
- Year:
- 2020
- Volume:
- 57
- Issue:
- 4
- Issue Sort Value:
- 2020-0057-0004-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-07
- Subjects:
- Web service clustering -- Dirichlet Multinomial Mixture (DMM) model -- Latent Dirichlet Allocation (LDA) -- Topic modeling techniques -- Clustering techniques
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2020.102238
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20467.xml