A comprehensive comparison of two variable importance analysis techniques in high dimensions: Application to an environmental multi-indicators system. (August 2015)
- Record Type:
- Journal Article
- Title:
- A comprehensive comparison of two variable importance analysis techniques in high dimensions: Application to an environmental multi-indicators system. (August 2015)
- Main Title:
- A comprehensive comparison of two variable importance analysis techniques in high dimensions: Application to an environmental multi-indicators system
- Authors:
- Wei, Pengfei
Lu, Zhenzhou
Song, Jingwen
- Abstract:
- Permutation variable importance measure (PVIM) based on random forest and Morris' screening design are two effective techniques for measuring the variable importance in high dimensions. The former technique is developed in the machine learning discipline and widely used in bioinformatics, while the latter technique is popular in scientific computing. We present three main contributions to variable importance analysis (VIA). First, through theoretical derivation, we show that the PVIM converges to double the non-standardized Sobol' total effect index. This observation indicates that the PVIM is especially useful for variable screening as it captures both the individual and interaction effects. Second, three numerical examples with different types of model behavior are presented for comparing the performances of these two techniques. The main conclusions are as follows. For high-dimensional additive or approximately additive models, the PVIM is much more efficient than Morris' screening design when used for both variable importance ranking and variable screening. For high-dimensional models mainly governed by interaction effects, the performance of PVIM degrades, but it is still a competitive technique. Finally, the two techniques are applied to an environmental multi-indicators system for improving the robustness of the partial order structure of this system.
Highlights: The relative merits of PVIM and Morris' design are compared. The PVIM is proved to converge to double the unscaled Sobol' total effect indices. The two techniques are compared with three high-dimensional examples. The two techniques are applied to an environmental multi-indicators system.
- Is Part Of:
- Environmental modelling & software. Volume 70(2015:Aug.)
- Journal:
- Environmental modelling & software
- Issue:
- Volume 70(2015:Aug.)
- Issue Display:
- Volume 70 (2015)
- Year:
- 2015
- Volume:
- 70
- Issue Sort Value:
- 2015-0070-0000-0000
- Page Start:
- 178
- Page End:
- 190
- Publication Date:
- 2015-08
- Subjects:
- High-dimensional model -- Random forest -- Permutation variable importance measure -- Morris' screening design
Environmental monitoring -- Computer programs -- Periodicals
Ecology -- Computer simulation -- Periodicals
Digital computer simulation -- Periodicals
Computer software -- Periodicals
Environmental Monitoring -- Periodicals
Computer Simulation -- Periodicals
Environnement -- Surveillance -- Logiciels -- Périodiques
Écologie -- Simulation, Méthodes de -- Périodiques
Simulation par ordinateur -- Périodiques
Logiciels -- Périodiques
Computer software
Digital computer simulation
Ecology -- Computer simulation
Environmental monitoring -- Computer programs
Periodicals
Electronic journals
363.70015118
- Journal URLs:
- http://www.sciencedirect.com/science/journal/13648152
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.envsoft.2015.04.015
- Languages:
- English
- ISSNs:
- 1364-8152
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3791.522800
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 1857.xml
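
The abstract's first result, that the PVIM converges to double the non-standardized Sobol' total effect index, can be illustrated numerically. The sketch below is not the authors' code and makes several assumptions: the toy function `model`, the independent standard normal inputs, and the helper name `permutation_importance` are all hypothetical, and the exact function is used in place of a fitted random forest so that only the permutation step is tested. For this toy function the unscaled total effects E[Var(Y | X_-i)] work out by hand to 2, 4 and 1.

```python
import numpy as np


def model(x):
    # Hypothetical test function: additive terms plus one interaction (x0 * x2).
    return x[:, 0] + 2.0 * x[:, 1] + x[:, 0] * x[:, 2]


def permutation_importance(f, x, rng):
    # Mean increase in squared error when one input column is shuffled.
    # The exact function stands in for a fitted forest (an assumption),
    # so the baseline error is zero and the increase is the full error.
    y = f(x)
    scores = np.empty(x.shape[1])
    for i in range(x.shape[1]):
        x_perm = x.copy()
        x_perm[:, i] = rng.permutation(x[:, i])  # break the link between x_i and y
        scores[i] = np.mean((y - f(x_perm)) ** 2)
    return scores


rng = np.random.default_rng(0)
x = rng.standard_normal((200_000, 3))

pvim = permutation_importance(model, x, rng)

# Unscaled total effects for this toy function with N(0, 1) inputs,
# derived by hand: E[(1 + x2)^2] = 2, 2^2 = 4, E[x0^2] = 1.
total_effect = np.array([2.0, 4.0, 1.0])

ratio = pvim / total_effect
print(ratio)  # each entry should be close to 2, up to sampling noise
```

The third input has no additive term, yet it receives a nonzero score through its interaction with the first input, which is the property the abstract cites as making the PVIM suitable for variable screening.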