Time Series as Language: A Universal Tokenizer for General-Purpose Time Series Foundation Models

arXiv:2606.09861v1 (announce type: new). Abstract: While Next-Token Prediction (NTP) has unified LLM pretraining, its adaptation to unbounded, continuous time series (TS) remains open. To bridge the gap, we introduce UniTok, a universal tokenizer that transforms TS into discrete tokens, and UniTok-FM, a foundation model pretrained via NTP on these tokens. UniTok-FM is a general-purpose foundation model that supports zero-shot and prompt-boosted forecasting, as well as few-shot generation and classification via training-free in-context inference, a capability not achieved by prior works.
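The abstract does not say how UniTok maps continuous values to discrete tokens. To make the general idea concrete, here is a minimal sketch of one common approach: scale the series by its mean absolute value, clip it, and assign each value to a uniform bin. This is not UniTok's method. The function names, the bin count `n_bins`, and the clipping bound `limit` are all illustrative assumptions.

```python
from typing import List, Tuple


def tokenize(series: List[float], n_bins: int = 16,
             limit: float = 3.0) -> Tuple[List[int], float]:
    """Map a real-valued series to token ids in [0, n_bins - 1].

    Hypothetical scheme (mean scaling + uniform binning), not UniTok itself.
    Returns the tokens and the scale needed to map them back.
    """
    # Scale by mean absolute value so series of different magnitudes
    # share one vocabulary; fall back to 1.0 for an all-zero series.
    scale = (sum(abs(x) for x in series) / len(series)) or 1.0
    width = 2 * limit / n_bins  # width of each bin on the scaled axis
    tokens = []
    for x in series:
        z = max(-limit, min(limit, x / scale))  # clip unbounded values
        idx = int((z + limit) / width)          # uniform bin index
        tokens.append(min(idx, n_bins - 1))     # z == limit goes in the last bin
    return tokens, scale


def detokenize(tokens: List[int], scale: float, n_bins: int = 16,
               limit: float = 3.0) -> List[float]:
    """Map token ids back to bin-center values on the original scale."""
    width = 2 * limit / n_bins
    return [(-limit + (t + 0.5) * width) * scale for t in tokens]


tokens, scale = tokenize([1.0, 2.0, 3.0, 2.0])
approx = detokenize(tokens, scale)
```

Once a series is a sequence of integer ids, a standard next-token-prediction objective can in principle be applied to it just as to text. The trade-off in this toy scheme is quantization error: values inside the clipping range are recovered only to within half a bin width (times the scale), and values outside it are clipped.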
The success of Large Language Models (LLMs) and the push for general-purpose AI capabilities are driving efforts to carry the LLM pretraining recipe over to other data modalities, such as time series.
This work suggests a path towards unifying diverse time series tasks under a single foundation model paradigm, potentially accelerating AI development in fields from finance to climate science.
Traditional fragmented approaches to time series analysis may be superseded by general-purpose foundation models, allowing for zero-shot and few-shot learning across forecasting, generation, and classification.
- AI researchers and developers
- Industries reliant on time series forecasting (finance, logistics, energy)
- Cloud AI providers
- Specialized time series modeling startups (if they cannot adapt)
- Traditional statistical modeling approaches
- Domain-specific time series software vendors (if not integrated)
The ability to tokenize and unify time series data could lead to more robust and generalized AI applications.
This framework might enable the development of more autonomous AI agents capable of understanding and interacting with complex dynamic environments.
A universal time series model could democratize access to advanced predictive analytics, shifting competitive advantage towards data availability and model deployment over bespoke algorithm development.
Read at arXiv cs.LG