A novel statistical mechanics-based metric for characterization of text documents. (October 2020)
- Record Type:
- Journal Article
- Title:
- A novel statistical mechanics-based metric for characterization of text documents. (October 2020)
- Main Title:
- A novel statistical mechanics-based metric for characterization of text documents
- Authors:
- Biswas, Mainak
Chakrabartty, Shubhro
Suri, Jasjit S.
Song, Hanjung
- Abstract:
- Highlights: A new similarity measure based on statistical mechanics is proposed. The new measure, the DSSM, finds the similarity between two documents based on uniqueness of energy. The DSSM showed higher performance than conventional measures.
Abstract: Machine learning (ML) uses intelligence-based statistical models to resolve characterization problems. However, the accuracy of ML models decreases as the volume of data increases, since they are not able to capture the diversity and important regularities accompanying the huge volume. In this regard, the statistical mechanics (SM) paradigm for ML models must be investigated. SM, like Big Data, also deals with a large volume of instances. A correlation principle is proposed between data represented as words in a text corpus and a large volume of particles within a system. Based on this principle, a new similarity measure, the degeneracy-based statistical similarity measure (DSSM), is proposed. The DSSM is directly proportional to the ratio of the total energy of two documents with maximally similar words to the total number of distinct words in the document. The DSSM showed higher performance than existing measures such as Euclidean, Cosine, SMTP, and MBSM.
- Is Part Of:
- Computers & electrical engineering. Volume 87(2020)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 87(2020)
- Issue Display:
- Volume 87, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 87
- Issue:
- 2020
- Issue Sort Value:
- 2020-0087-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-10
- Subjects:
- Similarity measures -- Machine learning -- Statistical mechanics -- Classification
00-01 -- 99-00
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2020.106812
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14610.xml