Learning efficient push and grasp policy in a totebox from simulation. (2nd July 2020)