SIGNAL · AI · Jun 9, 2026, 4:00 AM · Signal 55 · Medium term

Full-Batch Gradient Descent Outperforms One-Pass SGD: Sample Complexity Separation in Single-Index Learning

Source: arXiv cs.LG

arXiv:2602.02431v2 · Announce Type: replace-cross

Abstract: It is folklore that reusing training data more than once can improve the statistical efficiency of gradient-based learning. While this phenomenon has been extensively studied in linear regression, the benefit of multi-pass gradient descent (GD, which reuses all the data) over one-pass stochastic gradient descent (online SGD, which uses each data point only once) is not well-understood in nonlinear and non-convex settings, except for a loss modification mechanism achieved by the first two passes on the data. In this work, we consider lea
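To make the two data-access patterns named in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's setup: the tanh link function, the squared loss, the dimensions, the step sizes, and every function name are illustrative assumptions. It shows only the difference in how data is used (one update per sample, each sample seen once, versus repeated averaged updates over the whole dataset) and does not reproduce the sample complexity separation the paper claims.

```python
import math
import random


def make_data(n, w_star, rng):
    """Draw n samples (x, y) with Gaussian x and y = tanh(<w_star, x>).
    The tanh link is an assumption chosen for illustration."""
    d = len(w_star)
    xs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    ys = [math.tanh(sum(wi * xi for wi, xi in zip(w_star, x))) for x in xs]
    return xs, ys


def grad_one(w, x, y):
    """Gradient of 0.5 * (tanh(<w, x>) - y)^2 with respect to w."""
    pred = math.tanh(sum(wi * xi for wi, xi in zip(w, x)))
    scale = (pred - y) * (1.0 - pred * pred)
    return [scale * xi for xi in x]


def one_pass_sgd(xs, ys, w0, lr):
    """Online SGD: one update per sample, each sample used exactly once."""
    w = list(w0)
    for x, y in zip(xs, ys):
        g = grad_one(w, x, y)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def full_batch_gd(xs, ys, w0, lr, passes):
    """Multi-pass GD: every update averages the gradient over all samples."""
    w = list(w0)
    n = len(xs)
    for _ in range(passes):
        total = [0.0] * len(w)
        for x, y in zip(xs, ys):
            g = grad_one(w, x, y)
            total = [ti + gi for ti, gi in zip(total, g)]
        w = [wi - lr * ti / n for wi, ti in zip(w, total)]
    return w


def alignment(w, w_star):
    """Cosine similarity between the estimate and the true direction."""
    dot = sum(a * b for a, b in zip(w, w_star))
    norm = math.sqrt(sum(a * a for a in w)) * math.sqrt(sum(b * b for b in w_star))
    return dot / norm


rng = random.Random(0)          # fixed seed so runs are repeatable
d, n = 5, 200                   # toy dimension and sample size (assumptions)
w_star = [1.0] + [0.0] * (d - 1)
xs, ys = make_data(n, w_star, rng)
w0 = [0.1] * d                  # small nonzero starting point

w_sgd = one_pass_sgd(xs, ys, w0, lr=0.05)
w_gd = full_batch_gd(xs, ys, w0, lr=0.5, passes=200)
print("one-pass SGD alignment:", alignment(w_sgd, w_star))
print("full-batch GD alignment:", alignment(w_gd, w_star))
```

With a monotone link such as tanh, both procedures face an easy problem, so this toy run should not be read as evidence for or against the paper's result. The gap the title refers to concerns how many samples each method needs in the single-index setting the authors analyze.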

Why this matters
Why now

The paper addresses a long-standing piece of machine learning folklore, namely that reusing training data improves the statistical efficiency of gradient-based learning, and offers new theoretical insights in nonlinear and non-convex settings.

Why it’s important

This research refines our understanding of a fundamental optimization technique, gradient-based training, and could influence how future AI models are trained for sample efficiency and performance.

What changes

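The paper reports a sample complexity separation in single-index learning: full-batch gradient descent, which reuses all the data on every pass, outperforms one-pass (online) SGD, which uses each data point only once. This gives theoretical backing, in a nonlinear and non-convex setting, to the view that multi-pass data reuse can beat single-pass training.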

Winners
  • AI researchers
  • Machine learning engineers
  • Companies with large datasets
Losers
  • Developers relying solely on one-pass SGD for all applications
Second-order effects
Direct

Refined theoretical understanding of gradient descent algorithms for AI training.

Second

Improved efficiency and performance in training large and complex AI models through better algorithm selection.

Third

Acceleration of research into novel optimization techniques that leverage multi-pass data access more effectively.

Editorial confidence: 85 / 100 · Structural impact: 40 / 100