A dataset for learning stylistic and cultural correlations between music and videos. Issue 2 (21st February 2022)
- Record Type:
- Journal Article
- Title:
- A dataset for learning stylistic and cultural correlations between music and videos. Issue 2 (21st February 2022)
- Main Title:
- A dataset for learning stylistic and cultural correlations between music and videos
- Authors:
- Chen, Xinyi
Zhang, Hui
Wu, Songruoyao
Zheng, Jun
Sun, Lingyun
Zhang, Kejun
- Other Names:
- Li, Zijin (guest editor)
McAdams, Stephen (guest editor)
- Abstract:
- Music–visual retrieval is of broad interest in the field of Music Information Retrieval (MIR). Most research relies on emotional tags or is based on content but does not consider stylistic and cultural differences between music and videos. As a result, only one‐sided dimensions are considered for automatic music retrieval for videos, while the stylistic correlation between audio‐visual is ignored. At the same time, the needs of different cultural regions cannot be well met. Therefore, the first labelled extensive Music Video (MV) dataset, Next‐MV, is constructed in this paper consisting of 6000 pieces of 30‐s MV fragments, including five music style labels and four cultural labels. The proposed Next‐Net framework is built to study the correlation between music style and visual style. The optimal audiovisual feature set and model structure are obtained in the experiments. The accuracy reached 71.1%, higher than the baseline model (66.9%). Furthermore, in the cross‐cultural experiment, it is found that the accuracy of the general fusion model (71.1%) is between the model trained by within‐dataset (76%) and the model trained by cross‐dataset (60%), indicating that culture has a significant influence on the correlation between music and visual. The experiments of pair classification on cultures are further carried out. It is found that Rock and Dance are more culturally influenced than R&B and Hip‐hop. Among all the cultures discussed, Chinese and Japanese music and videos show great differences among most of the styles, while Korean music videos styles are more similar to western styles than other eastern cultures.
- Is Part Of:
- Cognitive computation and systems. Volume 4:Issue 2(2022)
- Journal:
- Cognitive computation and systems
- Issue:
- Volume 4:Issue 2(2022)
- Issue Display:
- Volume 4, Issue 2 (2022)
- Year:
- 2022
- Volume:
- 4
- Issue:
- 2
- Issue Sort Value:
- 2022-0004-0002-0000
- Page Start:
- 177
- Page End:
- 187
- Publication Date:
- 2022-02-21
- Subjects:
- cross culture -- cross‐modal processing -- deep learning architectures -- multimodal fusion -- new datasets
Cognitive science -- Periodicals
Artificial intelligence -- Periodicals
Neurosciences -- Periodicals
Computer science -- Periodicals
Neurosciences
Computer science
Cognitive science
Artificial intelligence
Periodicals
Electronic journals
006.3
- Journal URLs:
- https://digital-library.theiet.org/content/journals/ccs
https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8694204
https://ietresearch.onlinelibrary.wiley.com/loi/25177567
http://www.theiet.org/
- DOI:
- 10.1049/ccs2.12043
- Languages:
- English
- ISSNs:
- 2517-7567
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22627.xml