Stateful human-centered visual captioning system to aid video surveillance. (September 2019)