Sustainability Is Not Linear: Quantifying Performance, Energy, and Privacy Trade-offs in On-Device Intelligence

arXiv:2603.26603v2. Abstract: The migration of Large Language Models (LLMs) from cloud clusters to edge devices promises enhanced privacy and offline accessibility, but this transition encounters a harsh reality: the physical constraints of mobile batteries, thermal limits, and, most importantly, memory constraints. To navigate this landscape, we constructed a replicable and reproducible experimental pipeline to profile the complex interplay between energy consumption, latency, and quality of LLMs on mobile devices. We harness this pipeline to conduct an empirical c
As LLMs proliferate and demand grows for privacy and offline functionality on edge devices, managing the models' significant computational and energy demands has become a pressing problem.
This research offers a framework for understanding and optimizing the trade-offs between performance, energy consumption, and privacy in on-device AI, which bear directly on the feasibility and scalability of edge LLM deployments.
The ability to quantify and manage these trade-offs will enable more efficient product development, better user experience, and accelerated adoption of powerful AI on consumer devices, shifting focus from pure model size to optimized deployment strategies.
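One common way to make such trade-offs concrete is to keep only the configurations that no other configuration beats on every axis at once (a Pareto front). The sketch below is illustrative only: the `Run` record, the configuration names, and all numbers are hypothetical placeholders, not measurements or methods from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One profiled configuration (hypothetical fields, made-up values)."""
    name: str
    latency_s: float  # time per response, lower is better
    energy_j: float   # energy per response, lower is better
    quality: float    # task score in [0, 1], higher is better


def dominates(a: Run, b: Run) -> bool:
    """True if a is at least as good as b on every axis and strictly better on one."""
    no_worse = (a.latency_s <= b.latency_s
                and a.energy_j <= b.energy_j
                and a.quality >= b.quality)
    strictly_better = (a.latency_s < b.latency_s
                       or a.energy_j < b.energy_j
                       or a.quality > b.quality)
    return no_worse and strictly_better


def pareto_front(runs: list[Run]) -> list[Run]:
    """Keep the runs that no other run dominates, preserving input order."""
    return [r for r in runs if not any(dominates(o, r) for o in runs)]


# Placeholder data for illustration only.
RUNS = [
    Run("config_a", latency_s=1.0, energy_j=5.0, quality=0.60),
    Run("config_b", latency_s=1.5, energy_j=8.0, quality=0.62),
    Run("config_c", latency_s=3.0, energy_j=20.0, quality=0.75),
    Run("config_d", latency_s=3.5, energy_j=30.0, quality=0.74),
]

FRONT = [r.name for r in pareto_front(RUNS)]
print(FRONT)
```

With these placeholder values, `config_d` drops out because `config_c` is faster, cheaper in energy, and scores higher; the remaining three each win on at least one axis, so choosing among them is a product decision rather than a measurement question.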
- Mobile device manufacturers
- On-device AI chip designers
- Edge computing platform providers
- AI model compression companies
- Cloud-dependent AI service providers (for specific use cases)
- Inefficient LLM architectures
Widespread adoption of high-performance, energy-efficient LLMs on mobile devices will become more feasible.
This will accelerate the distributed intelligence paradigm, reducing reliance on centralized cloud infrastructure for many AI tasks.
Enhanced on-device AI capabilities could lead to new privacy-centric applications and business models, potentially disrupting existing data-intensive industries.
Read at arXiv cs.LG