TBP-mHC: full expressivity for manifold-constrained hyper connections through transportation polytopes

arXiv:2605.21724v1. Abstract: Hyper-Connections (HC) improve residual networks by introducing learnable mixing across multiple residual streams, but unconstrained mixing leads to training instability. Manifold-Constrained Hyper-Connections (mHC) address this by enforcing approximate double stochasticity via Sinkhorn normalization, while mHC-lite ensures exact constraints through convex combinations of permutation matrices at the cost of factorial complexity. KromHC reduces this cost using Kronecker-product parameterizations, but restricts the mixing matrices to a structured s
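The three prior constructions named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the paper's code: the function names (`sinkhorn`, `birkhoff_mix`, `kron`, `max_sum_error`), the iteration count, and the use of normalized nonnegative weights are all assumptions made for illustration. It shows only the general techniques: Sinkhorn normalization gives an approximately doubly stochastic matrix, a convex combination of all n! permutation matrices gives an exactly doubly stochastic one, and the Kronecker product of two doubly stochastic matrices is again doubly stochastic.

```python
import itertools
import math


def sinkhorn(logits, iters=50):
    """Approximate double stochasticity: alternately normalize rows and columns.

    `logits` is a square nested list; `iters` is an illustrative default.
    """
    n = len(logits)
    m = [[math.exp(v) for v in row] for row in logits]  # strictly positive entries
    for _ in range(iters):
        # Row step: scale each row so it sums to 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Column step: scale each column so it sums to 1.
        col = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col[j] for j in range(n)] for i in range(n)]
    return m


def birkhoff_mix(weights, n):
    """Exact double stochasticity: convex combination of all n! permutation matrices.

    `weights` must hold n! nonnegative numbers; they are normalized to sum to 1.
    The n! term count is the factorial cost mentioned in the abstract.
    """
    perms = list(itertools.permutations(range(n)))
    if len(weights) != len(perms):
        raise ValueError("need one weight per permutation (n! in total)")
    total = sum(weights)
    m = [[0.0] * n for _ in range(n)]
    for w, perm in zip(weights, perms):
        for i, j in enumerate(perm):
            m[i][j] += w / total  # permutation matrix has a 1 at (i, perm[i])
    return m


def kron(a, b):
    """Kronecker product of two square nested-list matrices."""
    na, nb = len(a), len(b)
    size = na * nb
    return [
        [a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(size)]
        for i in range(size)
    ]


def max_sum_error(m):
    """Largest deviation of any row sum or column sum from 1."""
    n = len(m)
    rows = [abs(sum(row) - 1.0) for row in m]
    cols = [abs(sum(m[i][j] for i in range(n)) - 1.0) for j in range(n)]
    return max(rows + cols)
```

Calling `max_sum_error` on the output of each function gives a quick check of how close a matrix is to doubly stochastic. A Kronecker product of two 2x2 factors yields a 4x4 mixing matrix from far fewer parameters than a general 4x4 one, which is why such a parameterization covers only a structured subset of doubly stochastic matrices.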
The paper addresses the tension between training stability and expressivity in Hyper-Connections, proposing TBP-mHC to overcome the limitations of earlier constrained variants (mHC, mHC-lite, and KromHC).
Better residual architectures feed into more capable AI systems, with effects across applications ranging from enterprise software to autonomous agents.
This research provides a method for designing more stable and powerful residual networks, potentially leading to faster training and more robust AI models.
- AI/ML Researchers
- Deep Learning-focused Companies
- AI Model Developers
- Prior, less efficient neural network architectures
More efficient and stable deep learning model training becomes possible.
This could accelerate the development of more complex and reliable AI agents and autonomous systems.
Advanced AI capabilities stemming from such improvements might enable new automation paradigms in various industries.
Read at arXiv cs.LG