SIGNAL · AI · May 22, 2026, 4:00 AM · Signal 75 · Medium term

How Many Different Outputs Can a Transformer Generate?

arXiv:2605.22223v1 · Announce Type: new

Abstract: We study how we can leverage only a handful of characteristics of a transformer's architecture to closely predict the number of different sequences it can output, both qualitatively and quantitatively. We provide an upper bound depending on the length of the prompt, which we show empirically to be tight up to a factor less than 10, across architectures and model sizes. Our analysis also provides a theoretical explanation for previously observed empirical failures of transformers on simple sequence tasks, such as copying and cramming. Formally, we

Why this matters

Why now

Published in 2026, this research offers a theoretical account of transformer limitations at a time when large language models keep advancing yet still fail on simple sequence tasks such as copying and cramming.

Why it’s important

Knowing how many different sequences a transformer can generate, and where that capacity runs out, informs architectural choices and sets realistic performance expectations for future AI research and application development.

What changes

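The paper gives an upper bound, dependent on prompt length, on the number of different sequences a transformer can output, and reports that the bound is empirically tight up to a factor of less than 10 across architectures and model sizes. It also explains previously observed failures on copying and cramming, and could guide the design of more robust models.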

Winners

· AI researchers
· Model architects
· Companies developing specialized AI models

Losers

· Developers unaware of fundamental architectural limitations
· Applications relying on transformers for tasks beyond their theoretical capacity

Second-order effects

Direct

The number of different sequences a transformer can output becomes predictable from a handful of architectural characteristics and the length of the prompt.

Second

AI model design shifts towards architectures that explicitly overcome, or are designed within, these newly understood generative limits.

Third

The development of 'limitation-aware' AI leads to a new generation of hybrid models that combine transformers with other mechanisms for tasks where diversity or specific sequences are critical.

Editorial confidence: 90 / 100 · Structural impact: 60 / 100

Original report

This signal links to a primary source. The Continuum Brief monitors and indexes it as part of the live intelligence stream; we do not republish source content.

Read at arXiv cs.LG

#cs.LG

Tracked by The Continuum Brief · live intelligence network
