Efficient Zeroth-Order Federated Finetuning of Language Models on Resource-Constrained Devices

arXiv:2502.10239v3. Abstract: Federated Learning (FL) is a promising paradigm for finetuning Large Language Models (LLMs) across distributed data sources while preserving data privacy. However, finetuning such large models is challenging on edge devices due to its high resource demand. Zeroth-order Optimization (ZO) estimates gradients through finite-difference approximations, which rely on function evaluations under random perturbations of the model parameters. Consequently, ZO with task alignment provides a potential solution, allowing finetuning using only forwar
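The finite-difference idea in the abstract can be illustrated with a minimal sketch. It assumes a generic two-point (central-difference) estimator with Gaussian perturbations on a toy quadratic loss; it is not the paper's method, and it leaves out both the federated setting and task alignment. The names `zo_gradient`, `toy_loss`, `zo_sgd`, and the values of `eps` and `lr` are illustrative assumptions.

```python
import numpy as np

def zo_gradient(loss_fn, params, rng, eps=1e-3):
    """Two-point zeroth-order gradient estimate (illustrative, not the paper's estimator).

    Only two evaluations of loss_fn are needed; no backpropagation is involved.
    """
    z = rng.standard_normal(params.shape)        # random perturbation direction
    loss_plus = loss_fn(params + eps * z)        # evaluation at the perturbed point
    loss_minus = loss_fn(params - eps * z)       # evaluation at the mirrored point
    directional = (loss_plus - loss_minus) / (2.0 * eps)
    return directional * z                       # project the scalar back onto z

def toy_loss(w):
    """Hypothetical stand-in for a model's loss: squared distance to a fixed target."""
    target = np.array([1.0, -2.0, 0.5])
    return float(np.sum((w - target) ** 2))

def zo_sgd(loss_fn, init, steps=2000, lr=0.02, seed=0):
    """Plain gradient descent driven by the zeroth-order estimate."""
    rng = np.random.default_rng(seed)
    w = init.copy()
    for _ in range(steps):
        w -= lr * zo_gradient(loss_fn, w, rng)
    return w

w_final = zo_sgd(toy_loss, np.zeros(3))
print("final loss:", toy_loss(w_final))
```

Each step calls the loss function twice and stores no activations or gradients for backpropagation, which is the property that makes this family of methods attractive on memory-constrained devices.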
The proliferation of LLMs and the increasing demand for on-device AI necessitate more efficient and privacy-preserving finetuning methods for resource-constrained edge devices.
This development could significantly lower the computational barriers for deploying advanced AI models on a wider range of hardware, enabling more distributed and private AI applications.
Efficient finetuning of LLMs on edge devices through zeroth-order optimization could reduce dependence on centralized cloud infrastructure for model updates and personalization.
- Edge device manufacturers
- On-device AI application developers
- Privacy-focused AI solutions
- Distributed computing platforms
- Centralized cloud AI providers (for certain use cases)
- Traditional high-resource finetuning methods
More powerful and personalized AI experiences become available directly on consumer devices without constant cloud connectivity.
This democratizes access to sophisticated AI model adaptation, potentially leading to a fragmentation of proprietary AI models and more open-source development at the edge.
Reduced data transmission for AI updates enhances data privacy and security, and could lead to new regulatory frameworks for ethical on-device AI.