Well-calibrated confidence measures for multi-label text classification with a large number of labels. (February 2022)

Record Type:: Journal Article
Title:: Well-calibrated confidence measures for multi-label text classification with a large number of labels. (February 2022)
Main Title:: Well-calibrated confidence measures for multi-label text classification with a large number of labels
Authors:: Maltoudoglou, Lysimachos
Paisios, Andreas
Lenc, Ladislav
Martínek, Jiří
Král, Pavel
Papadopoulos, Harris
Abstract:: Highlights: We propose a novel approach to address the computationally demanding nature of Label Powerset (LP) Inductive Conformal Prediction (ICP) for multi-label classification with a large number of labels. We mathematically establish the validity of the proposed approach and provide experimental results that highlight its computational efficiency. We present prediction-set results on data-sets for multi-label text classification problems where this was previously computationally challenging, and show that they can be practically useful. Results show that the BERT classifier surpasses the non-contextualised (word2vec-based) classifiers by a large margin. In addition, to the best of our knowledge, our BERT implementation achieved state-of-the-art results on the data-sets used.
Abstract: We extend our previous work on Inductive Conformal Prediction (ICP) for multi-label text classification and present a novel approach for addressing the computational inefficiency of the Label Powerset (LP) ICP, arising when dealing with a high number of unique labels. We present experimental results using the original and the proposed efficient LP-ICP on two English and one Czech language data-sets. Specifically, we apply the LP-ICP on three deep Artificial Neural Network (ANN) classifiers of two types: one based on contextualised (BERT) and two on non-contextualised (word2vec) word-embeddings. In the LP-ICP setting we assign nonconformity scores to label-sets from which the corresponding p-values and prediction-sets …
Is Part Of:: Pattern recognition. Volume 122(2022)
Journal:: Pattern recognition
Issue:: Volume 122(2022)
Issue Display:: Volume 122, Issue 2022 (2022)
Year:: 2022
Volume:: 122
Issue:: 2022
Issue Sort Value:: 2022-0122-2022-0000
Page Start:
Page End:
Publication Date:: 2022-02
Subjects:: Text classification -- Multi-label -- Word2vec -- Bert -- Conformal prediction -- Label powerset -- Computational efficiency -- Nonconformity measure -- Confidence measure
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
Journal URLs:: http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
DOI:: 10.1016/j.patcog.2021.108271
Languages:: English
ISSNs:: 0031-3203
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 19791.xml