Efficient query-by-example spoken document retrieval combining phone multigram representation and dynamic time warping. Issue 1 (January 2019)
- Record Type:
- Journal Article
- Title:
- Efficient query-by-example spoken document retrieval combining phone multigram representation and dynamic time warping. Issue 1 (January 2019)
- Main Title:
- Efficient query-by-example spoken document retrieval combining phone multigram representation and dynamic time warping
- Authors:
- Lopez-Otero, Paula
Parapar, Javier
Barreiro, Alvaro
- Abstract:
- Highlights: Query-by-example spoken document retrieval (QbESDR) strategies are time-consuming. A fast QbESDR strategy using different-sized n-grams and inverted indices is proposed. This was used to select candidates for a dynamic time warping (DTW)-based system. Score fusion of DTW and the proposed approach is also assessed. The paper reports effective and efficient solutions for the QbESDR task.
Abstract: Query-by-example spoken document retrieval (QbESDR) aims at finding those documents in a set that include a given spoken query. Current approaches are, in general, not valid for real-world applications, since they mostly focus on being effective (i.e. reliably detecting in which documents the query is present), whereas practical implementations must also be efficient (i.e. the search must be performed in a limited time) in order to allow for a satisfactory user experience. In addition, systems usually search for exact matches of the query, which limits the number of relevant documents retrieved by the search. This paper proposes a representation of the documents and queries for QbESDR based on combining different-sized phone n-grams obtained from automatic transcriptions, namely the phone multigram representation. Since phone transcriptions usually contain errors, several hypotheses for the query transcriptions are combined in order to ease the impact of these errors. The proposed system stores the documents in inverted indices, which leads to fast and efficient search. Different combinations of the phone multigram strategy with a state-of-the-art system based on pattern matching using dynamic time warping (DTW) are proposed: one consists of a two-stage system that intends to be as effective as, but more efficient than, a DTW-based system, while the other aims at improving the performance achieved by these two systems by combining their output scores. Experiments performed on the MediaEval 2014 Query-by-Example Search on Speech (QUESST 2014) evaluation framework suggest that the phone multigram representation for QbESDR is a successful approach, and that the assessed combinations with a DTW-based strategy lead to more efficient and effective QbESDR systems. In addition, the phone multigram approach succeeded in increasing the detection of non-exact matches of the queries.
- Is Part Of:
- Information processing & management. Volume 56:Issue 1(2019:Jan.)
- Journal:
- Information processing & management
- Issue:
- Volume 56:Issue 1(2019:Jan.)
- Issue Display:
- Volume 56, Issue 1 (2019)
- Year:
- 2019
- Volume:
- 56
- Issue:
- 1
- Issue Sort Value:
- 2019-0056-0001-0000
- Page Start:
- 43
- Page End:
- 60
- Publication Date:
- 2019-01
- Subjects:
- Query-by-example spoken document retrieval -- Phone decoding -- Phone n-grams -- Phone posteriorgrams -- Dynamic time warping
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2018.09.002
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9140.xml
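
Illustrative note: the abstract above describes indexing documents by different-sized phone n-grams in inverted indices and pooling several transcription hypotheses of the query. The following is a minimal sketch of that general idea only, not the authors' implementation. The function names (`multigrams`, `build_index`, `search`), the n-gram range of 1 to 3, and the scoring rule (counting distinct shared n-grams) are all assumptions made for illustration; the record does not specify the article's actual weighting scheme or its DTW stage.

```python
from collections import defaultdict

def multigrams(phones, n_min=1, n_max=3):
    """Return every phone n-gram of the sequence for n in [n_min, n_max]."""
    return [tuple(phones[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(phones) - n + 1)]

def build_index(docs, n_min=1, n_max=3):
    """Inverted index: map each n-gram to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, phones in docs.items():
        for gram in multigrams(phones, n_min, n_max):
            index[gram].add(doc_id)
    return index

def search(index, query_hypotheses, n_min=1, n_max=3):
    """Rank documents by the number of distinct query n-grams they contain.

    N-grams are pooled over all query transcription hypotheses.
    This count-based score is a hypothetical stand-in for illustration.
    """
    grams = set()
    for hyp in query_hypotheses:
        grams.update(multigrams(hyp, n_min, n_max))
    scores = defaultdict(int)
    for gram in grams:
        for doc_id in index.get(gram, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy phone transcriptions (made-up data, not from the article).
docs = {"d1": ["k", "a", "t"], "d2": ["d", "o", "g"]}
index = build_index(docs)
ranked = search(index, [["k", "a", "t"], ["g", "a", "t"]])
print(ranked)
```

In a two-stage arrangement of the kind the abstract mentions, the top-ranked documents from such a lookup would be the candidates handed to a slower DTW-based matcher; that second stage is not sketched here.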