Automated detection of sigmatism using deep learning applied to multichannel speech signal. (July 2021)
- Record Type:
- Journal Article
- Title:
- Automated detection of sigmatism using deep learning applied to multichannel speech signal. (July 2021)
- Main Title:
- Automated detection of sigmatism using deep learning applied to multichannel speech signal
- Authors:
- Krecichwost, Michal
Mocko, Natalia
Badura, Pawel
- Abstract:
- Highlights: Computer-aided speech therapy of sigmatism in children. Acquisition of the speech signal in 15 spatially distributed channels. The speech sample represented by a 4D acoustic volume. A dedicated multibranch CNN for classification of pronunciation disorders. Valuable spatial acoustic information in a high-frequency band up to 22 kHz.
Abstract: This paper presents a system for the analysis of acoustic data for the computer-aided diagnosis and therapy of sigmatism in children. The analysis is focused on the detection and recognition of selected articulation disorders in sibilant sounds. The system relies on the dedicated data acquisition device recording the speech signal using 15 microphones spatially arranged around the speaker's mouth. The collected speech corpus contains 923 samples of the /s/ and /ʃ/ consonants from 98 five- and six-year-old children with either normative or pathological pronunciation features. Each recording is supplemented with a detailed speech therapy annotation. A dedicated multibranch convolutional neural network architecture was designed for the speech sample classification. The filter bank energy feature maps are extracted from each channel along with their two derivatives in the time domain. The feature maps are aggregated along different dimensions to constitute a four-dimensional data structure called acoustic volume, being the input data for the deep network. We proposed three ways to aggregate the multichannel data into the acoustic volume and two techniques for the data augmentation to enlarge the available dataset and avoid overfitting. Classification experiments involving different data subsets have proven the system's ability to detect the analyzed pronunciation disorders with reasonable accuracy. The framework with speech data organized spatially in five channels provides the most efficient classification. …
- Is Part Of:
- Biomedical signal processing and control. Volume 68(2021)
- Journal:
- Biomedical signal processing and control
- Issue:
- Volume 68(2021)
- Issue Display:
- Volume 68, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 68
- Issue:
- 2021
- Issue Sort Value:
- 2021-0068-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-07
- Subjects:
- Multichannel acoustic signal -- Speech analysis -- Sigmatism -- Deep learning -- Convolutional neural network -- Computer-aided speech diagnosis
Signal processing -- Periodicals
Biomedical engineering -- Periodicals
Signal Processing, Computer-Assisted -- Periodicals
Image Processing, Computer-Assisted -- Periodicals
Biomedical Engineering -- Periodicals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/17468094
http://www.elsevier.com/journals
http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%2329675%232006%23999989998%23626449%23FLA%23&_cdi=29675&_pubType=J&_auth=y&_acct=C000045259&_version=1&_urlVersion=0&_userid=836873&md5=664b5cf9a57fc91971a17faf20c32ec1
- DOI:
- 10.1016/j.bspc.2021.102612
- Languages:
- English
- ISSNs:
- 1746-8094
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2087.880400
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 23797.xml