Boosting question answering over knowledge graph with reward integration and policy evaluation under weak supervision. Issue 2 (March 2023)