A data-driven approach to selection of critical process steps in the semiconductor manufacturing process considering missing and imbalanced data. (July 2019)