PEFT-Arena: Understanding Parameter-Efficient Finetuning from a Stability-Plasticity Perspective

arXiv:2605.28819v1 (announce type: new)

Abstract: Parameter-efficient finetuning (PEFT) has become the standard approach for adapting large language models, yet evaluations largely emphasize downstream accuracy while overlooking the retention of pretrained capabilities. We argue that PEFT should be assessed through the stability-plasticity dilemma: the trade-off between target-task adaptation and resistance to forgetting. We introduce PEFT-Arena, a benchmark that jointly measures downstream performance and general capability retention. Across methods, we find distinct stability-plasticity profil
The proliferation of PEFT methods for large language models calls for an evaluation framework that goes beyond downstream accuracy alone, which motivates research into the balance between stability and plasticity.
A deeper understanding of PEFT's trade-offs will directly influence how large language models are adapted and deployed, impacting their reliability and utility in real-world applications.
The evaluation of PEFT methods will likely evolve beyond simple downstream accuracy to include metrics for catastrophic forgetting and retention of general capabilities, leading to more robust finetuning strategies.
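To make the idea of a joint measurement concrete, here is a minimal sketch of how a stability-plasticity profile might be computed from benchmark scores. It is not the metric defined by PEFT-Arena, which the abstract excerpt does not specify. The function name `profile`, the `StabilityPlasticity` container, and the choice of "target-task gain" for plasticity and "mean fraction of general-benchmark score retained" for stability are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StabilityPlasticity:
    plasticity: float  # absolute gain on the target task after finetuning
    stability: float   # mean fraction of each general-benchmark score retained


def profile(base_target, tuned_target, base_general, tuned_general):
    """Hypothetical stability-plasticity summary (an illustrative assumption,
    not the PEFT-Arena definition).

    base_target / tuned_target: target-task score before and after finetuning.
    base_general / tuned_general: dicts mapping general-benchmark names to
    scores before and after finetuning.
    """
    if base_general.keys() != tuned_general.keys():
        raise ValueError("general benchmarks must match before and after")
    if not base_general:
        raise ValueError("at least one general benchmark is required")
    if any(score <= 0 for score in base_general.values()):
        raise ValueError("base general scores must be positive")

    plasticity = tuned_target - base_target
    retention = [tuned_general[name] / base_general[name] for name in base_general]
    return StabilityPlasticity(plasticity=plasticity, stability=mean(retention))


# Made-up scores, used only to show the call shape.
example = profile(
    base_target=0.40,
    tuned_target=0.70,
    base_general={"benchmark_a": 0.80, "benchmark_b": 0.50},
    tuned_general={"benchmark_a": 0.60, "benchmark_b": 0.50},
)
```

Under these assumptions, a method with high plasticity and a stability value well below 1.0 adapts strongly but forgets, while a stability value near 1.0 indicates that general capabilities were largely retained.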
- AI researchers
- Model developers focusing on practical deployment
- Enterprises using finetuned LLMs
- PEFT methods with poor stability-plasticity balance
- Evaluations solely focused on accuracy benchmarks
New PEFT methods optimized for both stability and plasticity will emerge, leading to more balanced model adaptations.
This improved understanding will accelerate LLM deployment in sensitive applications requiring predictable behavior and robust knowledge retention.
The benchmark could become a standard for model adaptation, indirectly shaping compute resource allocation and research priorities within the broader AI ecosystem.
Read at arXiv cs.LG