Kernel Renormalization in Bayesian Deep Neural Networks: the Equivalent Wishart Ansatz in the Proportional Regime

arXiv:2605.29684v1. Abstract: The scaling limit where both the size of the training set $P$ and the width $N$ of a deep neural network grow at the same rate, the so-called proportional-width regime, has been intensely studied for shallow, single-hidden-layer networks. However, extending these non-perturbative results from shallow architectures to deep non-linear networks has proven very challenging. Here we present an effective approximate approach to predict the generalization performance of Bayesian multi-layer perceptrons (MLPs) of fixed depth $L$ on arbitrary high-dimensi
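The paper tackles the problem of predicting the generalization performance of deep non-linear networks in the proportional-width regime, where the training-set size $P$ and the width $N$ grow at the same rate. It extends non-perturbative results that were previously limited to shallow, single-hidden-layer architectures.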
Improved theoretical understanding of deep neural networks can lead to more robust, efficient, and interpretable AI systems, impacting their development and deployment across various sectors.
This research provides an effective approximate approach for predicting the generalization performance of Bayesian multi-layer perceptrons (MLPs) of fixed depth $L$, which could accelerate advances in the design and optimization of complex AI models.
- AI researchers
- Machine learning engineers
- Tech companies developing AI
- Academic institutions
- Inefficient AI development
- Black-box AI models
Further theoretical breakthroughs in deep learning leading to more predictable model behavior.
Reduced computational costs for training and evaluating deep networks due to better theoretical guidance.
Accelerated development of novel AI applications and more reliable autonomous systems.
Source: arXiv (cs.LG)