Face mask recognition from audio: The MASC database and an overview on the mask challenge. (February 2022)
- Record Type:
- Journal Article
- Title:
- Face mask recognition from audio: The MASC database and an overview on the mask challenge. (February 2022)
- Main Title:
- Face mask recognition from audio: The MASC database and an overview on the mask challenge
- Authors:
- Mohamed, Mostafa M.
Nessiem, Mina A.
Batliner, Anton
Bergler, Christian
Hantke, Simone
Schmitt, Maximilian
Baird, Alice
Mallol-Ragolta, Adria
Karas, Vincent
Amiriparian, Shahin
Schuller, Björn W.
- Abstract:
- Highlights: Introduction of the Mask Augsburg Speech Corpus (MASC) database. Summarising the Mask Sub-Challenge (MSC) and its baseline approaches. Explanation and comparison of the approaches of top participants in the challenge. Summarising the results of the Mask Sub-Challenge from ComParE 2020. Introduction of novel fusion results, by way of fusing approaches from the best participants. Conducting a discussion of the approaches and the results, regarding several aspects. Introducing a proof of concept demonstration Android app. Benchmarking the serving run-time of the top models.
Abstract: The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models that classify whether a speaker is wearing a mask or not. This article reviews the Mask Sub-Challenge (MSC) of the INTERSPEECH 2020 COMputational PARalinguistics challengE (ComParE), which focused on the following classification task: Given an audio chunk of a speaker, classify whether the speaker is wearing a mask or not. First, we report the collection of the Mask Augsburg Speech Corpus (MASC) and the baseline approaches used to solve the problem, achieving a performance of 71.8 % Unweighted Average Recall (UAR). We then summarise the methodologies explored in the submitted and accepted papers that mainly used two common patterns: (i) phonetic-based audio features, or (ii) spectrogram representations of audio combined with Convolutional Neural Networks (CNNs) typically used in image processing. Most approaches enhance their models by adapting ensembles of different models and attempting to increase the size of the training data using various techniques. We review and discuss the results of the participants of this sub-challenge, where the winner scored a UAR of 80.1 %. Moreover, we present the results of fusing the approaches, leading to a UAR of 82.6 %. Finally, we present a smartphone app that can be used as a proof of concept demonstration to detect in real-time whether users are wearing a face mask; we also benchmark the run-time of the best models.
- Is Part Of:
- Pattern recognition. Volume 122(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 122(2022)
- Issue Display:
- Volume 122, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 122
- Issue:
- 2022
- Issue Sort Value:
- 2022-0122-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-02
- Subjects:
- COVID-19 -- Deep learning -- Masks -- Voice biometrics -- Acoustic modelling
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2021.108361
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19791.xml
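
Note on the metric: the abstract reports all results as Unweighted Average Recall (UAR), that is, the recall of each class averaged with equal weight, so that the larger class cannot dominate the score. The following is a minimal sketch of that computation, not the challenge's official scoring code; the function name and the labels "mask" and "clear" are assumptions chosen for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls; every class counts equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly recognised samples per true class
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1

    # Recall per class, then a plain (unweighted) mean over the classes.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical labels: three masked chunks, one unmasked chunk.
    truth = ["mask", "mask", "mask", "clear"]
    predicted = ["mask", "mask", "clear", "clear"]
    print(unweighted_average_recall(truth, predicted))
```

For a two-class task such as this one, a classifier that always predicts the same class has a UAR of 0.5 regardless of how imbalanced the data are, which is why the measure suits class-imbalanced settings.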