The Fine-Tuning Trap: Evaluating Negative Transfer and the Role of PEFT in Sub-1B Mathematical Reasoning

arXiv:2606.06920v1. Abstract: Deploying Small Language Models (SLMs) on edge devices requires efficient fine-tuning strategies that adapt models to new tasks without degrading their general capabilities. In this study, we benchmark five sub-1B models (135M-1B) on mathematical reasoning tasks and uncover a critical vulnerability: Full Fine-Tuning (Full FT) actively harms performance in models under 300M parameters, often dropping accuracy below zero-shot baselines. This "negative transfer" makes Parameter-Efficient Fine-Tuning (PEFT) not just an efficiency preference, but a st
This research arrives as Small Language Models (SLMs) become central to edge deployment and efficient AI, driven by growing demand for local processing and reduced reliance on cloud infrastructure.
Strategic readers should care because the study identifies a critical limitation in fine-tuning SLMs: negative transfer, in which full fine-tuning can push accuracy below the zero-shot baseline. That calls into question the viability of current development strategies for compact AI.
The finding that full fine-tuning can harm performance in models under 300M parameters changes how developers should approach adapting small models, making PEFT not just an efficiency choice but a necessity for certain applications.
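The abstract describes negative transfer as fine-tuned accuracy falling below the zero-shot baseline. A minimal Python sketch of that check follows. The names `EvalResult`, `transfer_delta`, and `is_negative_transfer`, the model labels, and the accuracy values are all hypothetical placeholders for illustration; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    model: str            # hypothetical label for a model/training-method pair
    zero_shot_acc: float  # accuracy before any fine-tuning
    tuned_acc: float      # accuracy after fine-tuning


def transfer_delta(result: EvalResult) -> float:
    """Accuracy change from fine-tuning; negative means the tuned model is worse."""
    return result.tuned_acc - result.zero_shot_acc


def is_negative_transfer(result: EvalResult, tolerance: float = 0.0) -> bool:
    """Flag runs whose tuned accuracy falls below the zero-shot baseline.

    `tolerance` is an assumed knob for ignoring small, noise-level drops.
    """
    return transfer_delta(result) < -tolerance


# Placeholder numbers for illustration only; NOT results from the paper.
runs = [
    EvalResult("tiny-model/full-ft", zero_shot_acc=0.10, tuned_acc=0.06),
    EvalResult("tiny-model/peft", zero_shot_acc=0.10, tuned_acc=0.14),
]

flagged = [r.model for r in runs if is_negative_transfer(r)]
print(flagged)
```

Always evaluating the untouched checkpoint alongside every tuned variant is what makes this comparison possible. Without the zero-shot number, a regression caused by fine-tuning can look like an ordinary weak result.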
- PEFT developers
- Edge AI providers
- On-device AI applications
- Specialized SLM architects
- General full fine-tuning methodologies
- Cloud-dependent AI models
- Developers neglecting negative transfer
- Computational resource-intensive training methods
Increased adoption and research into Parameter-Efficient Fine-Tuning (PEFT) methods for small models.
Accelerated development of more robust and less vulnerable Small Language Models specifically designed for efficient adaptation.
Enhanced overall energy efficiency in AI deployments as reliance on full fine-tuning and larger models diminishes for specific tasks.
Read at arXiv cs.LG