113 Data extraction for the digital research environment. (15th December 2021)
- Record Type:
- Journal Article
- Title:
- 113 Data extraction for the digital research environment. (15th December 2021)
- Main Title:
- 113 Data extraction for the digital research environment
- Authors:
- Key, Daniel
Booth, John
Shah, Mohsin
Briggs, Lydia
Spiridou, Anastassia
Sebire, Neil J
Bryant, William A
- Abstract:
- GOSH's adoption of Electronic Patient Records is world leading. However, the large volumes and heterogeneity of data available can make it difficult for secondary uses to efficiently locate and extract the information they need. Large amounts of data also exist in pre-EPIC and non-EPIC systems that allow both longer time-series and wider selection of variables to analyse, but this can be non-trivial to harmonise (at the variable level) or link (at the patient level) to EPIC data or to other non-EPIC sources. Furthermore, data must be adequately de-identified before it's shared with external collaborators and the expertise to perform this safely may not be available within a small project group. The Digital Research Environment team have developed a Python library along with supporting documentation and procedures that aims to simplify and standardise the process of extracting and de-identifying hospital data before making it available on our web-based research platform. We present both the highly flexible and general software API that we use for internal development and a process called Standardised Data Extraction. This retrieves a set of well-defined Research Data Views (RDVs) via a simple csv template that subject matter experts, irrespective of IT proficiency, can easily work with. These tools allow the seamless extraction of consistent datasets that cross the EPIC adoption boundary in April 2019 and go back as far as the year 2000. Simple data requests, or refreshes of existing data, can thereby be turned around by the DRE team in hours or days. We also store all projects under version control with a complete record of software packages used, enabling a high degree of repeatability for follow-up investigations including newly entered patient data and/or broadening the data sets to be retrieved.
- Is Part Of:
- Archives of disease in childhood. Volume 106(2021)Supplement 3
- Journal:
- Archives of disease in childhood
- Issue:
- Volume 106(2021)Supplement 3
- Issue Display:
- Volume 106, Issue 3 (2021)
- Year:
- 2021
- Volume:
- 106
- Issue:
- 3
- Issue Sort Value:
- 2021-0106-0003-0000
- Page Start:
- A42
- Page End:
- A42
- Publication Date:
- 2021-12-15
- Subjects:
- Children -- Diseases -- Periodicals
Infants -- Diseases -- Periodicals
618.920005
- Journal URLs:
- http://adc.bmjjournals.com/
http://www.bmj.com/archive
- DOI:
- 10.1136/archdischild-2021-gosh.113
- Languages:
- English
- ISSNs:
- 0003-9888
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 27126.xml