An Attention Enhanced Cross-Modal Image–Sound Mutual Generation Model for Birds. (1st February 2021)