Machine learning and big scientific data. (20th January 2020)
- Record Type:
- Journal Article
- Title:
- Machine learning and big scientific data. (20th January 2020)
- Main Title:
- Machine learning and big scientific data
- Authors:
- Hey, Tony
Butler, Keith
Jackson, Sam
Thiyagalingam, Jeyarajan
- Abstract:
- This paper reviews some of the challenges posed by the huge growth of experimental data generated by the new generation of large-scale experiments at UK national facilities at the Rutherford Appleton Laboratory (RAL) site at Harwell near Oxford. Such 'Big Scientific Data' comes from the Diamond Light Source and Electron Microscopy Facilities, the ISIS Neutron and Muon Facility and the UK's Central Laser Facility. Increasingly, scientists are now required to use advanced machine learning and other AI technologies both to automate parts of the data pipeline and to help find new scientific discoveries in the analysis of their data. For commercially important applications, such as object recognition, natural language processing and automatic translation, deep learning has made dramatic breakthroughs. Google's DeepMind has now used the deep learning technology to develop their AlphaFold tool to make predictions for protein folding. Remarkably, it has been able to achieve some spectacular results for this specific scientific problem. Can deep learning be similarly transformative for other scientific problems? After a brief review of some initial applications of machine learning at the RAL, we focus on challenges and opportunities for AI in advancing materials science. Finally, we discuss the importance of developing some realistic machine learning benchmarks using Big Scientific Data coming from several different scientific domains. We conclude with some initial examples of our 'scientific machine learning' benchmark suite and of the research challenges these benchmarks will enable. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
- Is Part Of:
- Philosophical transactions. Volume 378:Number 2166(2020)
- Journal:
- Philosophical transactions
- Issue:
- Volume 378:Number 2166(2020)
- Issue Display:
- Volume 378, Issue 2166 (2020)
- Year:
- 2020
- Volume:
- 378
- Issue:
- 2166
- Issue Sort Value:
- 2020-0378-2166-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-01-20
- Subjects:
- machine learning -- materials science -- atmospheric science -- electron microscopy -- image processing -- AI benchmarks
Physical sciences -- Periodicals
Engineering -- Periodicals
Mathematics -- Periodicals
500
- Journal URLs:
- https://royalsocietypublishing.org/loi/rsta
- DOI:
- 10.1098/rsta.2019.0054
- Languages:
- English
- ISSNs:
- 1364-503X
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library STI - ELD Digital store
- Ingest File:
- 12788.xml