SIGNAL · AI · May 26, 2026, 4:00 AM · Signal 75 · Short term

Depth Registers Unlock W4A4 on SwiGLU: A Reader/Generator Decomposition

Source: arXiv cs.CL


arXiv:2604.18128v2. Abstract: We study post-training W4A4 quantization in a controlled 300M-parameter SwiGLU decoder-only language model trained on 5B tokens of FineWeb-Edu, and ask which input-activation sites dominate the error. Naive round-to-nearest W4A4 collapses validation perplexity from FP16 23.6 to 1727. A simple residual-axis training-time intervention -- Depth Registers with a register-magnitude hinge loss (DR+sink) -- reduces this to 119 (about 14x) at matched FP16 PPL and matched zero-shot capacity, and composes with SmoothQuant to 39.9 PPL. The residual ~2 P
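To make "naive round-to-nearest" concrete, here is a minimal sketch of symmetric 4-bit round-to-nearest (RTN) quantization applied to activations. It is a generic illustration, not the paper's setup: the per-token absmax scale, the Gaussian toy data, and the 100x outlier in channel 0 are all assumptions chosen for the example, and the helper names `rtn_quantize` and `relative_error` are hypothetical. It shows one commonly cited reason 4-bit activations are fragile: a single large-magnitude channel stretches the quantization grid, so ordinary channels are rounded very coarsely. The last lines redo the arithmetic on the perplexity figures quoted in the abstract.

```python
import random

def rtn_quantize(row, bits=4):
    """Symmetric round-to-nearest fake quantization of one vector.

    Assumption: one absmax scale per vector (per token); the source
    abstract does not specify the granularity it uses.
    """
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit
    absmax = max(abs(v) for v in row)
    scale = absmax / qmax if absmax > 0 else 1.0
    # Snap to the signed integer grid [-8, 7], then map back to floats.
    return [max(-qmax - 1, min(qmax, round(v / scale))) * scale for v in row]

def relative_error(rows, skip=0):
    """Relative L2 error of RTN, measured on channels after the first `skip`."""
    num = den = 0.0
    for row in rows:
        deq = rtn_quantize(row)
        for v, d in zip(row[skip:], deq[skip:]):
            num += (v - d) ** 2
            den += v ** 2
    return (num / den) ** 0.5

rng = random.Random(0)
# Toy activations: 64 tokens x 256 channels of unit Gaussian noise.
tokens = [[rng.gauss(0.0, 1.0) for _ in range(256)] for _ in range(64)]
# Hypothetical outlier: inflate channel 0 of every token by 100x.
outlier_tokens = [[row[0] * 100.0] + row[1:] for row in tokens]

# Error is measured on channels 1..255 only, so both cases compare
# the same underlying values.
err_clean = relative_error(tokens, skip=1)
err_outlier = relative_error(outlier_tokens, skip=1)
print(f"4-bit RTN error on ordinary channels, no outlier:   {err_clean:.3f}")
print(f"4-bit RTN error on ordinary channels, with outlier: {err_outlier:.3f}")

# Perplexity figures quoted in the abstract above.
ppl_rtn, ppl_dr_sink = 1727, 119
print(f"RTN -> DR+sink perplexity reduction: {ppl_rtn / ppl_dr_sink:.1f}x")
```

Measuring the error only on the non-outlier channels is a deliberate choice: the outlier itself is represented almost exactly, so the damage shows up in everything that shares its scale.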

Why this matters
Why now

Demand for cheaper inference keeps pushing research toward lower-precision quantization, and 4-bit weights with 4-bit activations (W4A4) remains a setting where naive post-training methods break down.

Why it’s important

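In a controlled 300M-parameter setting, this research substantially narrows the perplexity gap for 4-bit weight and 4-bit activation (W4A4) quantization: from 1727 with naive round-to-nearest to 119 with DR+sink, and to 39.9 when combined with SmoothQuant, against an FP16 baseline of 23.6. If the result carries over to larger models, it could make language models cheaper to operate.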

What changes

If the results hold beyond the 300M-parameter test model, Depth Registers suggest a training-time pathway to running LLMs under W4A4 on less capable hardware than currently required.

Winners
  • AI hardware manufacturers (edge devices)
  • Cloud AI providers (reduced infrastructure costs)
  • AI developers (wider deployment options)
  • Consumers (more accessible AI features)
Losers
  • Manufacturers of memory-intensive high-end AI accelerators
Second-order effects
Direct

More powerful AI models become deployable on constrained devices, such as smartphones, IoT devices, or embedded systems.

Second

This democratizes access to advanced AI capabilities, fostering innovation in new applications and services.

Third

Reduced computational and energy demands for AI could alleviate some pressure on energy grids and contribute to more sustainable AI development.

Editorial confidence: 90 / 100 · Structural impact: 60 / 100
Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.
