
arXiv:2509.14969v2. Abstract: We introduce a new adaptive step-size strategy for convex optimization with stochastic gradient that exploits the local geometry of the objective function only by means of a first-order stochastic oracle and without any hyper-parameter tuning. The method comes from a theoretically-grounded adaptation of the Adaptive Gradient Descent Without Descent method to the stochastic setting. We prove the convergence of stochastic gradient descent with our step-size under various assumptions, and we show that it empirically competes against tuned baseli
The continuous push for more efficient and robust machine learning algorithms drives research into advanced optimization techniques like adaptive step-size strategies.
Better optimization methods can shorten and stabilize the training of complex AI models, which in turn speeds up the development and deployment of AI applications.
This research introduces a hyperparameter-free, theoretically grounded adaptive step-size strategy for stochastic gradient descent, potentially making AI model training more accessible and efficient for practitioners.
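As a rough illustration of the kind of rule involved, the sketch below implements the original deterministic Adaptive Gradient Descent Without Descent step size (Malitsky and Mishchenko), which the abstract names as the starting point. It is not the stochastic variant proposed in the paper, whose exact rule is not reproduced here. The function name `adgd`, the warm-up step `lam0`, and the toy quadratic objective are assumptions made for this example.

```python
import math

def adgd(grad, x0, lam0=1e-6, iters=200):
    """Deterministic AdGD sketch: step size from local curvature, no tuning.

    Assumption: this is the original deterministic rule, not the paper's
    stochastic variant. `lam0` is a tiny warm-up step chosen for this example.
    """
    x_prev = list(x0)
    g_prev = grad(x_prev)
    # Warm-up: one very small step so two distinct iterates are available.
    x = [xi - lam0 * gi for xi, gi in zip(x_prev, g_prev)]
    lam_prev, theta = lam0, math.inf
    for _ in range(iters):
        g = grad(x)
        dx = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_prev)))
        dg = math.sqrt(sum((a - b) ** 2 for a, b in zip(g, g_prev)))
        # Cap 1: do not let the step grow too fast between iterations.
        growth = math.sqrt(1.0 + theta) * lam_prev
        # Cap 2: inverse of a local Lipschitz estimate of the gradient.
        local = dx / (2.0 * dg) if dg > 0.0 else math.inf
        lam = min(growth, local)
        if math.isinf(lam):
            break  # gradient unchanged after the warm-up step; nothing to estimate
        x_prev, g_prev = x, g
        x = [xi - lam * gi for xi, gi in zip(x, g)]
        theta, lam_prev = lam / lam_prev, lam
    return x

# Toy convex quadratic (hypothetical): f(x) = 0.5 * sum(c_i * (x_i - t_i)^2).
curv = [1.0, 2.0, 3.0]
target = [1.0, -2.0, 3.0]

def grad(x):
    return [c * (xi - ti) for c, xi, ti in zip(curv, x, target)]

x_final = adgd(grad, [0.0, 0.0, 0.0])
print(x_final)  # should be close to `target`
```

The only inputs are a gradient oracle and a starting point. The step size is estimated from successive iterates and gradients, so no learning rate is tuned by hand. Carrying this idea over to noisy stochastic gradients is the paper's contribution and is not attempted in this sketch.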
- AI/ML researchers
- Companies developing AI models
- Developers leveraging machine learning frameworks
- Less efficient optimization methods
- Developers reliant on manual hyperparameter tuning
More stable and faster convergence in stochastic optimization for machine learning models.
Reduced computational cost and time for training large-scale AI systems, accelerating AI development cycles.
Potentially enables new classes of AI applications that were previously too computationally expensive or difficult to train reliably.
Read at arXiv cs.LG