
arXiv:2605.24072v1 (announce type: cross)
Abstract: Finite-width fully connected neural networks with Gaussian-initialized weights deviate from their infinite-width Gaussian limit, exhibiting non-vanishing higher-order cumulants. We approximate these deviations, for a neural network evaluated in a finite number of inputs, using multidimensional Edgeworth expansions of arbitrary order $4m-1$, with $m\in\mathbb{N}$. Assuming that the corresponding Gaussian limit has an invertible covariance matrix and that the activation function is polynomially bounded, we establish a bound of order $n^{-m}$ on t
This paper advances the theory of finite-width neural networks, the regime in which every practical model operates, rather than the idealized infinite-width limit.
Quantifying how far a finite-width network deviates from its infinite-width Gaussian limit could help in designing more robust, predictable, and interpretable AI models.
The ability to approximate these deviations using Edgeworth expansions of arbitrary order provides a new mathematical toolset for optimizing neural network architectures and understanding their statistical properties.
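As a rough illustration of the "non-vanishing higher-order cumulants" mentioned in the abstract, the sketch below estimates the excess kurtosis (the standardized fourth cumulant) of a random network's output at a single input. It is not the paper's method. The setup is an assumption chosen for simplicity: one hidden layer, ReLU activation, standard normal weights, a unit-norm input, and 1/sqrt(width) output scaling. The function name `output_excess_kurtosis` and its parameters are hypothetical.

```python
import numpy as np

def output_excess_kurtosis(width, num_networks=200_000, seed=0):
    """Monte Carlo estimate of the excess kurtosis of a random network's output.

    Assumed setup (not taken from the paper): one hidden layer, ReLU,
    standard normal weights, unit-norm input, 1/sqrt(width) scaling.
    """
    rng = np.random.default_rng(seed)
    # For a unit-norm input x, each pre-activation w_i . x is standard normal,
    # so the pre-activations can be sampled directly.
    pre = rng.standard_normal((num_networks, width))
    # Output-layer weights, one row per independently initialized network.
    v = rng.standard_normal((num_networks, width))
    # Scalar network output for each sampled initialization.
    out = (v * np.maximum(pre, 0.0)).sum(axis=1) / np.sqrt(width)
    centered = out - out.mean()
    m2 = np.mean(centered**2)
    m4 = np.mean(centered**4)
    # A Gaussian has m4 / m2**2 == 3, so this difference is 0 in the limit.
    return m4 / m2**2 - 3.0

for width in (8, 32):
    print(width, output_excess_kurtosis(width))
```

Under these assumptions a short calculation gives an excess kurtosis of exactly 15/n for width n, so the estimate should shrink roughly in proportion to 1/n as the hidden layer widens. Cumulants of this kind are what an Edgeworth expansion uses to correct the Gaussian approximation.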
- AI researchers
- ML model developers
- Statistical learning theory
- AI hardware architects
- Black-box AI approaches
Improved theoretical understanding of neural network behavior could lead to more principled model design.
This could enable more efficient training and deployment of specialized neural networks, potentially reducing computational overhead.
A deeper theoretical foundation might unlock new AI capabilities that are currently hampered by unpredictable model behavior.
Read at arXiv cs.LG