SIGNAL · AI · Jun 30, 2026, 4:00 AM · Signal 75 · Short term

How Far Do On-Prem Open LLMs Get on Text-to-SQL? A Cross-Family Size x Technique Frontier on BIRD

arXiv:2606.29733v1 · Announce Type: cross

Abstract: Organizations that cannot send data to a cloud API increasingly ask: how good is Text-to-SQL if the model must run on-premises on open weights, and which popular accuracy "recipes" are worth their compute? We answer with an honest, fully reproducible benchmark on the BIRD development split (n=1534, Execution Accuracy), evaluating three open model families across two generations -- Qwen2.5-Coder (7B/14B/32B), CodeLlama-Instruct (7B/13B/34B), and Llama-3.x (8B, 70B) -- under one matched protocol, ablating a model-agnostic recipe (schema linking,
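The abstract reports Execution Accuracy, a metric that runs the predicted and the reference SQL against the database and compares what they return. The sketch below illustrates that idea on a toy in-memory SQLite table. It is not the paper's evaluation harness: the table, the queries, and the helper names `run_query` and `execution_accuracy` are hypothetical, and it assumes the common convention of comparing result rows as unordered sets.

```python
import sqlite3


def run_query(conn, sql):
    """Return the result rows as a set, or None if the query fails to execute."""
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose result sets match."""
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold_rows = run_query(conn, gold_sql)
        predicted_rows = run_query(conn, predicted_sql)
        # A prediction counts only if the gold query ran and both results agree.
        if gold_rows is not None and predicted_rows == gold_rows:
            correct += 1
    return correct / len(pairs)


# Toy database (hypothetical; BIRD ships real SQLite databases).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES ('Ana', 'eng', 120), ('Bo', 'eng', 100), ('Cy', 'ops', 90);
    """
)

pairs = [
    # Same rows, different text and ordering: counted as correct.
    ("SELECT name FROM employees WHERE dept = 'eng'",
     "SELECT name FROM employees WHERE dept = 'eng' ORDER BY name"),
    # Off-by-one comparison returns different rows: counted as wrong.
    ("SELECT name FROM employees WHERE salary > 100",
     "SELECT name FROM employees WHERE salary >= 100"),
    # Misspelled column makes the predicted query fail: counted as wrong.
    ("SELECT nme FROM employees",
     "SELECT name FROM employees"),
]

score = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {score:.3f}")
```

Because the comparison is on returned rows rather than SQL text, two differently written queries can both be scored correct, and a syntactically plausible query that returns the wrong rows is scored wrong.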

Why this matters

Why now

Open-weight LLMs are proliferating while data privacy concerns grow, so organizations need to know now how well these models perform when they must run on-premises.

Why it’s important

The benchmark gives organizations that must deploy on-premises a reproducible reference point for Text-to-SQL accuracy, which informs whether they can use LLMs without sending sensitive data to a cloud API.

What changes

By evaluating three open model families under one matched protocol on the BIRD development split, the study makes it clearer which open models and accuracy techniques are worth their compute for on-premises Text-to-SQL, which can inform deployment strategies.

Winners

· Organizations with strict data privacy requirements
· Open-source LLM developers
· AI data security solution providers

Losers

· Cloud AI API providers (for specific use cases)
· Companies relying solely on proprietary models for sensitive data

Second-order effects

Direct

Increased adoption of on-premises open LLMs for sensitive enterprise data tasks such as Text-to-SQL.

Second

A shift in compute and data architectures toward hybrid or fully on-premises AI deployments in certain industries.

Third

Enhanced data sovereignty and reduced reliance on external cloud providers for critical AI functions across various sectors.

Editorial confidence: 90 / 100 · Structural impact: 60 / 100

Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.

Read at arXiv cs.LG

#cs.CL #cs.DB #cs.LG

Tracked by The Continuum Brief · live intelligence network
