
arXiv:2605.21648v1 Announce Type: new Abstract: We develop a mean-field theory of dropout as a perturbation of critical signal propagation at the edge of chaos. Dropout shifts the perfect-alignment fixed point, making the depth scale for information propagation finite even at critical initialization. We derive critical and crossover scaling laws for correlation decay and establish that smooth activations and kinked, ReLU-like activations constitute distinct universality classes, with different critical exponents and a universal two-parameter scaling collapse in detuning and dropout strength.
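The abstract's claim that dropout shifts the perfect-alignment fixed point can be illustrated with a minimal sketch. It is not taken from the paper. It assumes a bias-free ReLU network in the infinite-width limit, inverted dropout with keep probability `keep_prob` applied to the post-activations, independent masks for the two inputs, and equal input norms. Under those assumptions the correlation `c` between two inputs' pre-activations evolves layer to layer as `keep_prob * f(c)`, where `f` is the normalized ReLU (arc-cosine) kernel. The function names are hypothetical.

```python
import math


def relu_corr_map(c, keep_prob=1.0):
    """One layer of the assumed mean-field correlation map (ReLU, no bias).

    f(c) is the normalized arc-cosine kernel, with f(1) = 1. Dropout with
    independent masks rescales the variance by 1/keep_prob but leaves the
    cross-covariance unchanged, so the correlation picks up a factor keep_prob.
    """
    c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
    f = (math.sqrt(1.0 - c * c) + c * (math.pi - math.acos(c))) / math.pi
    return keep_prob * f


def iterate(c0, keep_prob, depth):
    """Apply the correlation map `depth` times, starting from c0."""
    c = c0
    for _ in range(depth):
        c = relu_corr_map(c, keep_prob)
    return c


if __name__ == "__main__":
    for keep_prob in (1.0, 0.99, 0.9):
        c_deep = iterate(1.0, keep_prob, depth=500)
        print(f"keep_prob={keep_prob}: correlation after 500 layers = {c_deep:.4f}")
```

Without dropout (`keep_prob=1.0`), `c = 1` is a fixed point of this map, so identical inputs stay perfectly aligned at every depth. For any `keep_prob < 1`, `c = 1` maps to `keep_prob`, and the iteration settles at a fixed point strictly below 1, which is the qualitative effect the abstract describes.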
As models grow in scale and complexity, making them efficient and robust requires a firmer theoretical understanding of core techniques such as dropout.
This research gives a theoretical account of dropout, a widely used technique for stabilizing and improving deep learning models, which could make AI development more predictable and scalable.
A stronger theoretical framework for dropout could support more deliberate choices of model architecture and training strategy, particularly for large, critical systems.
- AI researchers
- Deep learning practitioners
- AI hardware developers
- Cloud AI providers
- Trial-and-error AI development approaches
- Less theoretically grounded AI research
Improved understanding of dropout leads to more stable and performant deep learning models.
Optimized model training and architecture design could accelerate AI application deployment in various sectors.
Deeper theoretical insights into neural network dynamics could unlock new AI paradigms beyond current deep learning frameworks.
Read at arXiv cs.LG