Using Hidden Markov Models for the accurate linguistic analysis of process model activity labels. (July 2019)
- Record Type:
- Journal Article
- Title:
- Using Hidden Markov Models for the accurate linguistic analysis of process model activity labels. (July 2019)
- Main Title:
- Using Hidden Markov Models for the accurate linguistic analysis of process model activity labels
- Authors:
- Leopold, Henrik
van der Aa, Han
Offenberg, Jelmer
Reijers, Hajo A.
- Abstract:
- Many process model analysis techniques rely on the accurate analysis of the natural language contents captured in the models' activity labels. Since these labels are typically short and diverse in terms of their grammatical style, standard natural language processing tools are not suitable to analyze them. While a dedicated technique for the analysis of process model activity labels was proposed in the past, it suffers from considerable limitations. First of all, its performance varies greatly among data sets with different characteristics and it cannot handle uncommon grammatical styles. What is more, adapting the technique requires in-depth domain knowledge. We use this paper to propose a machine learning-based technique for activity label analysis that overcomes the issues associated with this rule-based state of the art. Our technique conceptualizes activity label analysis as a tagging task based on a Hidden Markov Model. By doing so, the analysis of activity labels no longer requires the manual specification of rules. An evaluation using a collection of 15,000 activity labels demonstrates that our machine learning-based technique outperforms the state of the art in all aspects.
Highlights: We propose a machine learning-based technique for activity label analysis. We conceptualize activity label analysis as a tagging task based on a Hidden Markov Model. Our technique overcomes the issues associated with the rule-based state of the art. Our technique no longer requires the manual specification of rules. A comparative evaluation with 15,000 labels demonstrates the superiority of our technique.
- Is Part Of:
- Information systems. Volume 83(2019)
- Journal:
- Information systems
- Issue:
- Volume 83(2019)
- Issue Display:
- Volume 83, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 83
- Issue:
- 2019
- Issue Sort Value:
- 2019-0083-2019-0000
- Page Start:
- 30
- Page End:
- 39
- Publication Date:
- 2019-07
- Subjects:
- Label analysis -- Process model -- Natural language -- Hidden Markov models
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2019.02.005
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20396.xml