Symbolic pattern recognition for sequential data. Issue 4 (2nd October 2017)
- Record Type:
- Journal Article
- Title:
- Symbolic pattern recognition for sequential data. Issue 4 (2nd October 2017)
- Main Title:
- Symbolic pattern recognition for sequential data
- Authors:
- Akbilgic, Oguz
Howe, J. Andrew
- Abstract:
- Sources of sequential data surround and pervade our lives. Our bodies continuously generate sequential data such as heart rate and blood pressure. In global finance, stock indices and currency exchange rates change every second. The movement of clouds, the coordinates of the planets, the score of a soccer game, etc., are all examples of sequential data transitioning from one state to another in regular time steps. There is a mature body of literature related to modeling time-dependent sequential data, or time series, in which every model relies upon specific assumptions for applicability to data. However, there are other kinds of sequential data that are not collected with respect to time; for example, DNA sequences (if we ignore the gene mutations). Modeling of this type of sequential data does not have a body of literature that is as mature. This is somewhat perplexing, since sequential data modeling should cover both time-dependent and non-time-dependent data. In this study, we introduce a framework called Symbolic Pattern Recognition for modeling pattern transition behavior of sequential data that is expressed by a finite alphabet of symbols. Our framework can be used to characterize, predict, simulate, cluster, and classify multiple series based on their observed pattern transition behaviors. We document our proposed framework and apply it to perform unsupervised clustering of 13 different species of mollusks based on their DNA sequences. Our model correctly clusters the mollusks into their respective genera, matching results obtained via more traditional, supervised genetic analysis.
- Is Part Of:
- Sequential analysis. Volume 36:Issue 4(2017)
- Journal:
- Sequential analysis
- Issue:
- Volume 36:Issue 4(2017)
- Issue Display:
- Volume 36, Issue 4 (2017)
- Year:
- 2017
- Volume:
- 36
- Issue:
- 4
- Issue Sort Value:
- 2017-0036-0004-0000
- Page Start:
- 528
- Page End:
- 540
- Publication Date:
- 2017-10-02
- Subjects:
- Genetic analysis -- pattern recognition -- sequential data -- symbolic data -- time series modeling
62H30 -- 60Jxx -- 62P10
Sequential analysis -- Periodicals
519.54
- Journal URLs:
- http://www.tandfonline.com/toc/lsqa20/current
http://www.tandfonline.com/
- DOI:
- 10.1080/07474946.2017.1394719
- Languages:
- English
- ISSNs:
- 0747-4946
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8242.279500
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 5680.xml