Mining frequent biological sequences based on bitmap without candidate sequence generation. (1st February 2016)
- Record Type:
- Journal Article
- Title:
- Mining frequent biological sequences based on bitmap without candidate sequence generation. (1st February 2016)
- Main Title:
- Mining frequent biological sequences based on bitmap without candidate sequence generation
- Authors:
- Wang, Qian
Davis, Darryl N
Ren, Jiadong
- Abstract:
- Biological sequences carry a lot of important genetic information of organisms. Furthermore, there is an inheritance law related to protein function and structure which is useful for applications such as disease prediction. Frequent sequence mining is a core technique for association rule discovery, but existing algorithms suffer from low efficiency or poor error rate because biological sequences differ from general sequences with more characteristics. In this paper, an algorithm for mining Frequent Biological Sequence based on Bitmap, FBSB, is proposed. FBSB uses bitmaps as the simple data structure and transforms each row into a quicksort list QS-list for sequence growth. For the continuity and accuracy requirement of biological sequence mining, tested sequences used during the mining process of FBSB are real ones instead of generated candidates, and all the frequent sequences can be mined without any errors. Comparing with other algorithms, the experimental results show that FBSB can achieve a better performance on both run time and scalability.
Highlights: Frequent sequence mining is used to discover genetic laws from biological sequence. Less memory is required because bitmap is used for sequence record and joining. Quicksort list makes the tested sequence real ones instead of excessive candidates. Joining operation is simplified by subtraction and efficient for runtime saving. All frequent sequences can be mined without any loss, so the error rate is 0.
- Is Part Of:
- Computers in biology and medicine. Volume 69(2016)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 69(2016)
- Issue Display:
- Volume 69, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 69
- Issue:
- 2016
- Issue Sort Value:
- 2016-0069-2016-0000
- Page Start:
- 152
- Page End:
- 157
- Publication Date:
- 2016-02-01
- Subjects:
- Biological sequence -- Frequent pattern -- Bitmap -- Quicksort list
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2015.12.016
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 68.xml