Learning‐based T‐sHDP(λ) for optimal control of a class of nonlinear discrete‐time systems. (24th October 2021)