OILog: An online incremental log keyword extraction approach based on MDP-LSTM neural network. Issue 95 (January 2021)
- Record Type:
- Journal Article
- Title:
- OILog: An online incremental log keyword extraction approach based on MDP-LSTM neural network. Issue 95 (January 2021)
- Main Title:
- OILog: An online incremental log keyword extraction approach based on MDP-LSTM neural network
- Authors:
- Duan, Xiaoyu
Ying, Shi
Cheng, Hailong
Yuan, Wanli
Yin, Xiang
- Abstract:
- Log keyword extraction is an indispensable part of log anomaly detection. There are two main challenges in keyword extraction, one is that the essence of logs is unstructured, and different vendors usually define different log formats, the other one is that the most of the traditional method cannot update the log keywords incrementally to match the newly generated log data, so the extraction accuracy is low.
To solve these problems, we introduce an online incremental keyword extraction method OILog. The essential idea of this method is that log templates are usually the longest combination of high-frequency words. OILog builds models by using a deep Long Short-Term Memory network (LSTM) for capturing both high-frequency log keywords in real-time and new log keywords generated by the system, which can transform unstructured raw logs into structured logs quickly. To improve the efficiency and accuracy of the model, we proposed an improved particle swarm optimization algorithm, which changes the traditional topology structure of Particle Swarm Optimization algorithm (PSO) into a multilayer structure and applies a new particle velocity update formula to increase the attraction between particles. We summarized the previous works and validated OILog using real log data collected from four systems. The results show that OILog has superiority in terms of both accuracy and robustness.
Highlights: This paper introduces an online incremental keyword extraction approach. We proposed an improved PSO algorithm (MDPSO) and use it in deep LSTM (MDP-LSTM). We evaluated six optimization algorithms by using six benchmark functions. We conducted extensive experimental evaluations on four real log datasets.
- Is Part Of:
- Information systems. Issue 95(2021)
- Journal:
- Information systems
- Issue:
- Issue 95(2021)
- Issue Display:
- Volume 95, Issue 95 (2021)
- Year:
- 2021
- Volume:
- 95
- Issue:
- 95
- Issue Sort Value:
- 2021-0095-0095-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-01
- Subjects:
- Keyword extraction -- Optimization algorithm -- Text mining -- LSTM -- Deep learning
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2020.101618
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14656.xml