Multi‐Layer Feature Selection Incorporating Weighted Score‐Based Expert Knowledge toward Modeling Materials with Targeted Properties. Issue 2 (15th January 2020)
- Record Type:
- Journal Article
- Title:
- Multi‐Layer Feature Selection Incorporating Weighted Score‐Based Expert Knowledge toward Modeling Materials with Targeted Properties. Issue 2 (15th January 2020)
- Main Title:
- Multi‐Layer Feature Selection Incorporating Weighted Score‐Based Expert Knowledge toward Modeling Materials with Targeted Properties
- Authors:
- Liu, Yue
Wu, Jun‐Ming
Avdeev, Maxim
Shi, Si‐Qi
- Abstract:
- Selecting proper descriptors or features is one of the central problems in exploring structure–activity relationships of materials using machine learning models. The current feature selection algorithms usually require tedious hyperparameter tuning and do not actively consider the prior knowledge of domain experts about the features. Here, this work proposes a data‐driven multi‐layer feature selection method incorporating domain expert knowledge named DML‐FSdek, which is automated, with users entering training data without manual tuning of the hyperparameters. The domain expert knowledge is quantified by means of weighted scoring and integrated into the selection process to eliminate the risk of crucial features being removed. The test studies on ten material properties datasets demonstrate the potential of the approach to automatically search for a reduced feature set with lower root mean square errors than those for the initial feature set. Essentially, the most relevant material features, the number of which is much smaller than that in the original feature set, are automatically selected to establish a closer and more accurate structure–activity relationship for the materials of interest. As a result, the method represents the targeted properties of materials with a smaller and more interpretable set of features while ensuring equal or better prediction accuracy.
A data‐driven multi‐layer feature selection method incorporating domain expert knowledge named DML‐FSdek is proposed, which is automated, with users entering training data without manual tuning of the hyperparameters. The DML‐FSdek represents the targeted properties of materials with a smaller and more interpretable set of features while ensuring equal or better prediction accuracy.
- Is Part Of:
- Advanced theory and simulations. Volume 3:Issue 2(2020)
- Journal:
- Advanced theory and simulations
- Issue:
- Volume 3:Issue 2(2020)
- Issue Display:
- Volume 3, Issue 2 (2020)
- Year:
- 2020
- Volume:
- 3
- Issue:
- 2
- Issue Sort Value:
- 2020-0003-0002-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2020-01-15
- Subjects:
- domain expert knowledge -- feature selection -- machine learning -- materials properties prediction
Science -- Simulation methods -- Periodicals
Science -- Methodology -- Periodicals
Engineering -- Simulation methods -- Periodicals
Engineering -- Methodology -- Periodicals
507.21
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/adts.201900215
- Languages:
- English
- ISSNs:
- 2513-0390
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0696.935575
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13637.xml