A novel methodology to classify test cases using natural language processing and imbalanced learning. (October 2020)