A Bayesian evaluation framework for subjectively annotated visual recognition tasks. (March 2022)