Machine Learning Accurately Predicts Next Season NHL Player Injury Before It Occurs: Validation of 10,449 Player-Years from 2007-17. Issue 7 (31st July 2020)
- Record Type:
- Journal Article
- Title:
- Machine Learning Accurately Predicts Next Season NHL Player Injury Before It Occurs: Validation of 10,449 Player-Years from 2007-17. Issue 7 (31st July 2020)
- Main Title:
- Machine Learning Accurately Predicts Next Season NHL Player Injury Before It Occurs: Validation of 10,449 Player-Years from 2007-17
- Authors:
- Wright, Audrey
Karnuta, Jaret
Luu, Bryan
Haeberle, Heather
Makhni, Eric
Schickendantz, Mark
Ramkumar, Prem
- Abstract:
- Objectives: With the accumulation of big data surrounding the National Hockey League (NHL) and the advent of advanced computational processors, machine learning (ML) is ideally suited to develop a predictive algorithm capable of imbibing historical data to accurately project a future player's availability to play based on prior injury and performance. To the end of leveraging available analytics to permit data-driven injury prevention strategies and informed decisions for NHL franchises beyond static logistic regression (LR) analysis, the objective of this study of NHL players was to (1) characterize the epidemiology of publicly reported NHL injuries from 2007-17, (2) determine the validity of a machine learning model in predicting next season injury risk for both goalies and non-goalies, and (3) compare the performance of modern ML algorithms versus LR analyses.
Methods: Hockey player data was compiled for the years 2007 to 2017 from two publicly reported databases in the absence of an official NHL-approved database. Attributes acquired from each NHL player from each professional year included: age, 85 player metrics, and injury history. A total of 5 ML algorithms were created for both non-goalie and goalie data: Random Forest, K-Nearest Neighbors, Naive Bayes, XGBoost, and Top 3 Ensemble. Logistic regression was also performed for both non-goalie and goalie data. Area under the receiver operating characteristics curve (AUC) primarily determined validation.
Results: Player data was generated from 2,109 non-goalies and 213 goalies with an average follow-up of 4.5 years. The results are shown below in Table 1. For models predicting following season injury risk for non-goalies, XGBoost performed the best with an AUC of 0.948, compared to an AUC of 0.937 for logistic regression. For models predicting following season injury risk for goalies, XGBoost had the highest AUC with 0.956, compared to an AUC of 0.947 for LR.
Conclusion: Advanced ML models such as XGBoost outperformed LR and demonstrated good to excellent capability of predicting whether a publicly reportable injury is likely to occur the next season. As more player-specific data become available, algorithm refinement may be possible to strengthen predictive insights and allow ML to offer quantitative risk management for franchises, present opportunity for targeted preventative intervention by medical personnel, and replace regression analysis as the new gold standard for predictive modeling.
Figure 1. …
- Is Part Of:
- Orthopaedic journal of sports medicine. Volume 8:Issue 7(2020)Supplement 6
- Journal:
- Orthopaedic journal of sports medicine
- Issue:
- Volume 8:Issue 7(2020)Supplement 6
- Issue Display:
- Volume 8, Issue 7, Part 6 (2020)
- Year:
- 2020
- Volume:
- 8
- Issue:
- 7
- Part:
- 6
- Issue Sort Value:
- 2020-0008-0007-0006
- Page Start:
- Page End:
- Publication Date:
- 2020-07-31
- Subjects:
- Sports medicine -- Periodicals
Orthopedics -- Periodicals
Arthroscopy -- Periodicals
Arthroplasty -- Periodicals
Knee -- Surgery -- Periodicals
616.7
- Journal URLs:
- http://www.sagepublications.com/
- DOI:
- 10.1177/2325967120S00360
- Languages:
- English
- ISSNs:
- 2325-9671
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14923.xml
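
The abstract in this record names the area under the receiver operating characteristic curve (AUC) as its primary validation measure when comparing models. As a rough illustration of what that number means, the sketch below computes AUC in plain Python as the share of (injured, not injured) player pairs in which the injured player received the higher predicted risk. The function name `roc_auc`, the toy labels, and both score lists are hypothetical and invented for this example; none of them come from the study, and this is not the authors' code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up toy data, NOT from the study: 1 = injured next season, 0 = not injured.
toy_labels = [1, 0, 1, 0, 0, 1]
# Hypothetical predicted injury probabilities from two unnamed models.
toy_model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]
toy_model_b = [0.6, 0.5, 0.7, 0.4, 0.3, 0.45]

auc_a = roc_auc(toy_labels, toy_model_a)
auc_b = roc_auc(toy_labels, toy_model_b)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

The pairwise form is quadratic in the number of players, which is fine for a toy example; a rank-based computation or an established library routine would be the usual choice at the scale of thousands of player-years.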