A Multimodal Model for College English Teaching Using Text and Image Feature Extraction. (16th August 2022)