
arXiv:2603.08683v2. Abstract: Autoregressive "language" models (LMs) trained on raw waveforms can be repurposed for lossless audio compression, but prior work is limited to 8-bit audio, leaving open whether such approaches work for practical settings (16/24-bit) and can compete with existing codecs. We benchmark LM-based compression on full-fidelity audio across diverse domains (music, speech, bioacoustics), sampling rates (16kHz-48kHz), and bit depths (8, 16, 24-bit). Standard sample-level tokenization becomes intractable at higher bit depths due to vocabulary size
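The vocabulary-size problem named in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes one token per PCM sample, a hypothetical embedding width of 512, and the usual pairing of a predictive model with an ideal entropy coder (a sample predicted with probability p costs about -log2 p bits). The function names are made up for this example.

```python
import math


def vocab_size(bit_depth: int) -> int:
    """Distinct token values when every PCM sample is a single token."""
    return 2 ** bit_depth


def embedding_params(bit_depth: int, hidden_width: int = 512) -> int:
    """Size of a token embedding table; hidden_width is an assumed value."""
    return vocab_size(bit_depth) * hidden_width


def ideal_code_length_bits(probabilities) -> float:
    """Bits an ideal entropy coder would spend: sum of -log2 p per sample."""
    return sum(-math.log2(p) for p in probabilities)


if __name__ == "__main__":
    for depth in (8, 16, 24):
        print(depth, vocab_size(depth), embedding_params(depth))
```

Under these assumptions the vocabulary grows from 256 entries at 8 bits to 65,536 at 16 bits and 16,777,216 at 24 bits, so a per-sample softmax and embedding table quickly become impractical. A model that predicts no better than uniform spends exactly the bit depth per sample, which is why compression gains depend entirely on how sharply the model predicts the next sample.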
Advances in language models are enabling their application to new modalities such as audio, in this case extending LM-based lossless compression beyond the 8-bit setting of prior work.
This work points to a possible shift in audio compression toward learned, model-driven methods that could compress more efficiently while reproducing the signal exactly.
Traditional audio compression codecs may face significant competition from AI-driven methods, potentially leading to more efficient data storage and transmission of high-fidelity audio.
- AI compute providers
- Cloud storage providers
- Audio streaming services
- Content creators
- Legacy audio codec developers
- Companies reliant on current compression standards
Lossless audio compression ratios will improve significantly, reducing data transfer and storage costs.
New applications for high-fidelity audio, previously constrained by data size, will become economically viable.
The underlying 'language' model approach could generalize to other forms of sensor data compression, creating a new AI-driven standard for raw data encoding.
Read at arXiv cs.LG