
arXiv:2602.19778v4 Announce Type: replace-cross Abstract: Automatic Chord Recognition (ACR) is constrained by the scarcity of aligned chord labels, as well-aligned annotations are costly to acquire. At the same time, open-weight pre-trained models are more accessible than their proprietary training data. In this work, we present a two-stage training pipeline that leverages pre-trained models together with unlabeled audio. The proposed method decouples training into two stages. In the first stage, we use a pre-trained BTC model as a teacher to generate pseudo-labels for over 1,000 hours of dive
The research addresses a persistent challenge in Automatic Chord Recognition (ACR), the scarcity of aligned chord labels, by leveraging increasingly accessible pre-trained models and large unlabeled audio collections, a common pattern in current AI development.
If the approach holds up, more label-efficient training for chord recognition may also inform other audio AI applications beyond music.
The proposed two-stage training pipeline aims to make high-performing ACR systems easier and cheaper to develop by reducing reliance on expensive, manually aligned chord annotations and using readily available unlabeled audio instead.
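The pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the teacher is a toy linear classifier rather than the pre-trained BTC model, the 25-class chord vocabulary and 24-dimensional frame features are hypothetical, and the names `toy_teacher` and `pseudo_label` are not from the paper. The sketch only shows the general idea of a frozen teacher assigning a chord class to each frame of unlabeled audio.

```python
import math
import random

NUM_CHORDS = 25   # assumption: 12 major + 12 minor + one "no chord" class
FEATURE_DIM = 24  # assumption: size of a hypothetical per-frame feature vector


def softmax(logits):
    """Convert raw scores into probabilities (shifted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_teacher(frame, weights):
    """Stand-in for a pre-trained teacher: per-frame chord probabilities."""
    logits = [sum(w * x for w, x in zip(row, frame)) for row in weights]
    return softmax(logits)


def pseudo_label(frames, weights):
    """Label each unlabeled frame with the teacher's most probable chord class."""
    labels = []
    for frame in frames:
        probs = toy_teacher(frame, weights)
        labels.append(max(range(len(probs)), key=probs.__getitem__))
    return labels


rng = random.Random(0)
# Frozen "teacher" parameters and unlabeled frames, both randomly generated here.
weights = [[rng.gauss(0, 1) for _ in range(FEATURE_DIM)] for _ in range(NUM_CHORDS)]
frames = [[rng.gauss(0, 1) for _ in range(FEATURE_DIM)] for _ in range(100)]
labels = pseudo_label(frames, weights)
```

In a real system the resulting labels would serve as training targets for a separate model; how the paper uses them beyond this step is not covered by the excerpt above.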
- AI researchers
- Music technology companies
- Audio analysis software developers
- Companies relying on manual data annotation for audio analysis
Improved performance and broader accessibility of automatic chord recognition systems.
Accelerated development of AI applications in music creation, education, and entertainment.
Enhanced ability to analyze and categorize large audio datasets for various sound recognition tasks beyond music.
Read at arXiv cs.LG