When less is more: How increasing the complexity of machine learning strategies for geothermal energy assessments may not lead toward better estimates. (May 2023)
- Record Type:
- Journal Article
- Title:
- When less is more: How increasing the complexity of machine learning strategies for geothermal energy assessments may not lead toward better estimates. (May 2023)
- Main Title:
- When less is more: How increasing the complexity of machine learning strategies for geothermal energy assessments may not lead toward better estimates
- Authors:
- Mordensky, Stanley P.
Lipor, John J.
DeAngelo, Jacob
Burns, Erick R.
Lindsey, Cary R.
- Abstract:
- Highlights: Modern machine learning approaches can improve upon systems built with expert decisions. Expert decisions can have a greater influence on the predictions from a model than the conceptual framework of the model (e.g., turning an otherwise linear approach non-linear).
When using data from the 2008 USGS geothermal resource assessment, the F1 scores for the machine learning approaches appear low (F1 score < 0.10), do not improve with increasing model complexity, and, therefore, indicate the fundamental limitations of the input feature data. Until new improved feature data are incorporated into the assessment process, simple non-linear algorithms (e.g., XGBoost) perform equally well or better than more complex methods (e.g., artificial neural networks), while remaining easier to interpret. Novel machine learning performance metrics are needed in order to properly apply machine learning approaches to positive-unlabeled data.
Abstract: Previous moderate- and high-temperature geothermal resource assessments of the western United States utilized data-driven methods and expert decisions to estimate resource favorability. Although expert decisions can add confidence to the modeling process by ensuring reasonable models are employed, expert decisions also introduce human and, thereby, model bias. This bias can present a source of error that reduces the predictive performance of the models and confidence in the resulting resource estimates. Our study aims to develop robust data-driven methods with the goals of reducing bias and improving predictive ability. We present and compare nine favorability maps for geothermal resources in the western United States using data from the U.S. Geological Survey's 2008 geothermal resource assessment. Two favorability maps are created using the expert decision-dependent methods from the 2008 assessment (i.e., weight-of-evidence and logistic regression). With the same data, we then create six different favorability maps using logistic regression (without underlying expert decisions), XGBoost, and support-vector machines paired with two training strategies. The training strategies are customized to address the inherent challenges of applying machine learning to the geothermal training data, which have no negative examples and severe class imbalance. We also create another favorability map using an artificial neural network. We demonstrate that modern machine learning approaches can improve upon systems built with expert decisions. We also find that XGBoost, a non-linear algorithm, produces greater agreement with the 2008 results than linear logistic regression without expert decisions, because the expert decisions in the 2008 assessment rendered the otherwise linear approaches non-linear despite the fact that the 2008 assessment used only linear methods. The F1 scores for all approaches appear low (F1 score < 0.10), do not improve with increasing model complexity, and, therefore, indicate the fundamental limitations of the input features (i.e., training data). Until improved feature data are incorporated into the assessment process, simple non-linear algorithms (e.g., XGBoost) perform equally well or better than more complex methods (e.g., artificial neural networks) and remain easier to interpret.
- Is Part Of:
- Geothermics. Volume 110(2023)
- Journal:
- Geothermics
- Issue:
- Volume 110(2023)
- Issue Display:
- Volume 110, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 110
- Issue:
- 2023
- Issue Sort Value:
- 2023-0110-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-05
- Subjects:
- Geothermal resource assessment -- Machine learning -- Class imbalance -- Logistic regression -- Support-vector machines -- XGBoost -- Artificial neural network -- Feature importance -- Positive-unlabeled data -- Positive-unlabeled learning
Hydrogeology -- Periodicals
Geothermal resources -- Periodicals
Énergie géothermique -- Périodiques
GEOTHERMAL ENGINEERING
GEOTHERMAL ENERGY
GEOTHERMAL EXPLORATION
Geothermal resources
Hydrogeology
Periodicals
Electronic journals
621.44
- Journal URLs:
- http://www.journals.elsevier.com/geothermics/
http://www.elsevier.com/journals
http://www.sciencedirect.com/science/journal/03756505
- DOI:
- 10.1016/j.geothermics.2023.102662
- Languages:
- English
- ISSNs:
- 0375-6505
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4161.040000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 26172.xml