A novel double-mGBDT-based Q-learning. (23rd March 2022)