Supporting accessibility and reproducibility in language research in the Alveo virtual laboratory. (September 2017)
- Record Type:
- Journal Article
- Title:
- Supporting accessibility and reproducibility in language research in the Alveo virtual laboratory. (September 2017)
- Main Title:
- Supporting accessibility and reproducibility in language research in the Alveo virtual laboratory
- Authors:
- Cassidy, Steve
Estival, Dominique
- Abstract:
- Highlights: Reviews a number of publications in CSL regarding their practice in using and citing data collections. Finds that authors are keen to identify and share data but that practices vary in how precise they are or how easy it is to get the data. Reviews research workflows in speech and language, including the use of software tools. Suggests a 'hierarchy of needs' for reproducibility in speech and language research. Describes how the Alveo Virtual Laboratory supports a model of research that facilitates data sharing and citation of software tools.
Abstract: Reproducibility is an important part of scientific research and studies published in speech and language research usually make some attempt at ensuring that the work reported could be reproduced by other researchers. This paper looks at the current practice in the field relating to the citation and availability of both data and software methods. It is common to use widely available shared datasets in this field which helps to ensure that studies can be reproduced; however a brief survey of recent papers shows a wide range of styles of citation of data only some of which clearly identify the exact data used in the study. Similarly, practices in describing and sharing software artefacts vary considerably from detailed descriptions of algorithms to linked repositories. The Alveo Virtual Laboratory is a web based platform to support research based on collections of text, speech and video. Alveo provides a central repository for language data and provides a set of services for discovery and analysis of data. We argue that some of the features of the Alveo platform may make it easier for researchers to share their data more precisely and cite the exact software tools used to develop published results. Alveo makes use of ideas developed in other areas of science and we discuss these and how they can be applied to speech and language research.
- Is Part Of:
- Computer speech & language. Volume 45(2017)
- Journal:
- Computer speech & language
- Issue:
- Volume 45(2017)
- Issue Display:
- Volume 45, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 45
- Issue:
- 2017
- Issue Sort Value:
- 2017-0045-2017-0000
- Page Start:
- 375
- Page End:
- 391
- Publication Date:
- 2017-09
- Subjects:
- Corpus infrastructure -- Data citation -- Reproducibility -- Research methods -- eResearch -- Research workflow
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2017.01.003
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 2060.xml