Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models

arXiv:2605.21123v1 Announce Type: cross Abstract: Direct Preference Optimization (DPO) is successful for alignment in LLMs but still faces challenges in text-to-image generation. Existing studies are confined to denoising diffusion models while overlooking flow-matching, and suffer from an objective mismatch when applying discrete NLP-based DPO to regression-based generative tasks. In this paper, we derive a generalized DPO objective that covers both diffusion and flow-matching via a unified reverse-time SDE framework, and point out from a gradient perspective that the standard DPO objective
The paper addresses two limitations in applying Direct Preference Optimization to generative models: an objective mismatch that arises when the discrete, NLP-style DPO objective is applied to regression-based generative tasks, and the neglect of flow-matching models in existing work.
If the generalized DPO objective performs as the abstract suggests, it could improve the alignment and output quality of text-to-image and other generative models.
Because the objective is derived from a unified reverse-time SDE framework, a single preference-optimization method would apply to both diffusion and flow-matching models, potentially leading to more controllable outputs.
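For context, the objective mismatch mentioned above is easier to see next to the standard DPO loss as used for LLMs, which is built on sequence log-likelihoods rather than a regression target. The sketch below shows that standard loss for a single preference pair; it is not the generalized or linear objective derived in the paper, and the function name, argument names, and `beta` default are illustrative assumptions.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard (LLM-style) DPO loss for one preference pair.

    logp_w / logp_l: policy log-likelihoods of the preferred / rejected sample.
    ref_logp_w / ref_logp_l: the same quantities under the frozen reference model.
    beta: temperature on the implicit reward (illustrative default).
    """
    # Implicit reward margin: how much more the policy favors the preferred
    # sample over the rejected one, measured relative to the reference model.
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    z = beta * margin
    # -log(sigmoid(z)), written in a numerically stable form for either sign.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

The loss falls as the policy raises the likelihood of the preferred sample relative to the rejected one. Diffusion and flow-matching models are trained with regression losses and do not expose such log-likelihoods directly, which is where a likelihood-based objective of this kind becomes an awkward fit.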
- AI researchers
- Generative AI developers
- Content creators
- Digital media
- Existing generative model fine-tuning techniques
- Generative models with poor alignment capabilities
Improved generative AI models capable of producing higher-quality and more aligned outputs based on preferences.
Accelerated development of AI agents that leverage advanced generative capabilities for complex tasks and creative endeavors.
Potential for new industries and creative fields enabled by highly controllable and sophisticated AI-generated content.
Read at arXiv cs.LG