(Un/Semi-)supervised SMS text message SPAM detection. (August 2015)
- Record Type:
- Journal Article
- Title:
- (Un/Semi-)supervised SMS text message SPAM detection. (August 2015)
- Main Title:
- (Un/Semi-)supervised SMS text message SPAM detection
- Authors:
- GIANNELLA, CHRIS R.
- WINDER, RANSOM
- WILSON, BRANDON
- Abstract:
- We address the problem of unsupervised and semi-supervised SMS (Short Message Service) text message SPAM detection. We develop a content-based Bayesian classification approach which is a modest extension of the technique discussed by Resnik and Hardisty in 2010. The approach assumes that the bodies of the SMS messages arise from a probabilistic generative model and estimates the model parameters by Gibbs sampling using an unlabeled, or partially labeled, SMS training message corpus. The approach classifies new SMS messages as SPAM or HAM (non-SPAM) by zero-thresholding their logit estimates. We tested the approach on a publicly available SMS corpus collected from the UK. Used in semi-supervised fashion, the approach clearly outperformed a competing algorithm, Semi-Boost. Used in unsupervised fashion, the approach outperformed a fully supervised classifier, an SVM (Support Vector Machine), when the number of training messages used by the SVM was small and performed comparably otherwise. We believe the approach works well and is a useful tool for SMS SPAM detection.
- Is Part Of:
- Natural language engineering. Volume 21:Part 4(2015)
- Journal:
- Natural language engineering
- Issue:
- Volume 21:Part 4(2015)
- Issue Display:
- Volume 21, Issue 4, Part 4 (2015)
- Year:
- 2015
- Volume:
- 21
- Issue:
- 4
- Part:
- 4
- Issue Sort Value:
- 2015-0021-0004-0004
- Page Start:
- 553
- Page End:
- 567
- Publication Date:
- 2015-08
- Subjects:
- Natural language processing (Computer science) -- Periodicals
- Software engineering -- Periodicals
- 006.35
- Journal URLs:
- http://journals.cambridge.org/action/displayJournal?jid=NLE
- DOI:
- 10.1017/S1351324914000102
- Languages:
- English
- ISSNs:
- 1351-3249
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD Digital store
- Ingest File:
- 3116.xml