FAIR data pipeline: provenance-driven data management for traceable scientific workflows. (3rd October 2022)
- Record Type:
- Journal Article
- Title:
- FAIR data pipeline: provenance-driven data management for traceable scientific workflows. (3rd October 2022)
- Main Title:
- FAIR data pipeline: provenance-driven data management for traceable scientific workflows
- Authors:
- Mitchell, Sonia Natalie
Lahiff, Andrew
Cummings, Nathan
Hollocombe, Jonathan
Boskamp, Bram
Field, Ryan
Reddyhoff, Dennis
Zarebski, Kristian
Wilson, Antony
Viola, Bruno
Burke, Martin
Archibald, Blair
Bessell, Paul
Blackwell, Richard
Boden, Lisa A.
Brett, Alys
Brett, Sam
Dundas, Ruth
Enright, Jessica
Gonzalez-Beltran, Alejandra N.
Harris, Claire
Hinder, Ian
David Hughes, Christopher
Knight, Martin
Mano, Vino
McMonagle, Ciaran
Mellor, Dominic
Mohr, Sibylle
Marion, Glenn
Matthews, Louise
McKendrick, Iain J.
Mark Pooley, Christopher
Porphyre, Thibaud
Reeves, Aaron
Townsend, Edward
Turner, Robert
Walton, Jeremy
Reeve, Richard
- Abstract:
- Modern epidemiological analyses to understand and combat the spread of disease depend critically on access to, and use of, data. Rapidly evolving data, such as data streams changing during a disease outbreak, are particularly challenging. Data management is further complicated by data being imprecisely identified when used. Public trust in policy decisions resulting from such analyses is easily damaged and is often low, with cynicism arising where claims of 'following the science' are made without accompanying evidence. Tracing the provenance of such decisions back through open software to primary data would clarify this evidence, enhancing the transparency of the decision-making process. Here, we demonstrate a Findable, Accessible, Interoperable and Reusable (FAIR) data pipeline. Although developed during the COVID-19 pandemic, it allows easy annotation of any data as they are consumed by analyses, or conversely traces the provenance of scientific outputs back through the analytical or modelling source code to primary data. Such a tool provides a mechanism for the public, and fellow scientists, to better assess scientific evidence by inspecting its provenance, while allowing scientists to support policymakers in openly justifying their decisions. We believe that such tools should be promoted for use across all areas of policy-facing research. This article is part of the theme issue 'Technical challenges of modelling real-life epidemics and examples of overcoming these'.
- Is Part Of:
- Philosophical transactions. Volume 380:Number 2233(2022)
- Journal:
- Philosophical transactions
- Issue:
- Volume 380:Number 2233(2022)
- Issue Display:
- Volume 380, Issue 2233 (2022)
- Year:
- 2022
- Volume:
- 380
- Issue:
- 2233
- Issue Sort Value:
- 2022-0380-2233-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-10-03
- Subjects:
- FAIR -- provenance -- data management -- epidemiology -- modelling -- COVID-19
Physical sciences -- Periodicals
Engineering -- Periodicals
Mathematics -- Periodicals
500
- Journal URLs:
- https://royalsocietypublishing.org/loi/rsta
- DOI:
- 10.1098/rsta.2021.0300
- Languages:
- English
- ISSNs:
- 1364-503X
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library STI - ELD Digital store
- Ingest File:
- 23455.xml