R2IF: Aligning Reasoning with Decisions via Composite Rewards for Interpretable LLM Function Calling

arXiv:2604.20316v2. Abstract: Function calling empowers large language models (LLMs) to interface with external tools, yet existing RL-based approaches suffer from misalignment between reasoning processes and tool-call decisions. We propose R2IF, a reasoning-aware RL framework for interpretable function calling, adopting a composite reward integrating format/correctness constraints, Chain-of-Thought Effectiveness Reward (CER), and Specification-Modification-Value (SMV) reward, optimized via GRPO. Experiments on BFCL/ACEBench show R2IF outperforms baselines by up to 34.62%
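To make the idea of a composite reward optimized via GRPO more concrete, here is a minimal sketch in Python. It is not the paper's implementation: the abstract does not say how the components are combined or how CER and SMV are computed, so the weighted sum, the weights, the `RewardParts` container, and the assumption that every component score lies in [0, 1] are all hypothetical. The only part taken from GRPO as commonly described is the group-relative advantage, which normalizes each sampled response's reward by the mean and standard deviation of its group.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class RewardParts:
    """Hypothetical per-response component scores, each assumed to lie in [0, 1]."""
    format_ok: float  # output follows the expected tool-call format
    correct: float    # tool call matches the reference call
    cer: float        # stand-in for the Chain-of-Thought Effectiveness Reward
    smv: float        # stand-in for the Specification-Modification-Value reward


def composite_reward(parts, weights=(0.1, 0.5, 0.2, 0.2)):
    # Assumption: a plain weighted sum with illustrative weights.
    # The source does not state the actual combination rule.
    w_fmt, w_cor, w_cer, w_smv = weights
    return (w_fmt * parts.format_ok + w_cor * parts.correct
            + w_cer * parts.cer + w_smv * parts.smv)


def group_relative_advantages(rewards, eps=1e-8):
    # GRPO-style advantage: center and scale each reward within its group
    # of responses sampled for the same prompt. Population standard
    # deviation is used here; eps avoids division by zero when all
    # rewards in the group are equal.
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Usage: score a group of three sampled responses for one prompt.
group = [
    RewardParts(format_ok=1.0, correct=1.0, cer=0.8, smv=0.6),
    RewardParts(format_ok=1.0, correct=0.0, cer=0.4, smv=0.2),
    RewardParts(format_ok=0.0, correct=0.0, cer=0.1, smv=0.0),
]
rewards = [composite_reward(p) for p in group]
advantages = group_relative_advantages(rewards)
```

Responses whose reward is above the group mean receive a positive advantage and those below receive a negative one, so the policy update favors the better responses within each group without needing a separate value model.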
The rapid advancement and adoption of LLMs necessitate more effective and reliable function calling mechanisms to expand their utility and safety.
Improved LLM function calling enhances autonomy and reliability for AI agents, driving their practical application in complex tasks.
LLMs can now interface with external tools more accurately and interpretably, reducing errors and increasing trust in their autonomous functions.
- AI agent developers
- SaaS platforms integrating LLMs
- Enterprises adopting AI agents
- Inefficient LLM function calling methods
- Manual workflow processes
More robust and general-purpose AI agents become feasible for deployment across various industries.
Increased automation of white-collar tasks, potentially leading to significant productivity gains and workforce restructuring.
New business models emerging around highly autonomous AI systems that manage and execute complex operational workflows.
Read at arXiv cs.LG