DCC: Data-Centric Compilation of Machine Learning Kernels for Processing-In-Memory Architectures

arXiv:2511.15503v2. Abstract: High-performance Host processors can integrate Processing-In-Memory (PIM) devices, which can accelerate memory-intensive kernels of Machine Learning (ML) models, including Large Language Models (LLMs), by leveraging the large memory bandwidth available at PIM cores. However, the Host processor needs consecutive elements distributed across DRAM banks, while PIM cores need consecutive elements within their local banks. This necessitates data rearrangements in ML kernel execution that pose significant performance and programmability challenges
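The layout mismatch described in the abstract can be illustrated with a toy model. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes element-granular striping across banks, whereas real DRAM address mappings interleave at coarser granularity and depend on the memory controller.

```python
# Illustrative sketch only. Element-granular interleaving is an assumption;
# real address mappings are coarser and hardware-specific.

def host_interleaved(data, num_banks):
    """Host-friendly layout: consecutive elements are striped across banks."""
    banks = [[] for _ in range(num_banks)]
    for i, x in enumerate(data):
        banks[i % num_banks].append(x)
    return banks

def pim_local(data, num_banks):
    """PIM-friendly layout: each bank holds one contiguous chunk."""
    chunk = -(-len(data) // num_banks)  # ceiling division
    return [data[b * chunk:(b + 1) * chunk] for b in range(num_banks)]

def rearrange_host_to_pim(banks):
    """Rebuild the logical order from the striped layout, then re-chunk it."""
    num_banks = len(banks)
    total = sum(len(b) for b in banks)
    logical = [banks[i % num_banks][i // num_banks] for i in range(total)]
    return pim_local(logical, num_banks)

data = list(range(8))
striped = host_interleaved(data, 4)     # [[0, 4], [1, 5], [2, 6], [3, 7]]
local = rearrange_host_to_pim(striped)  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

In this toy model every element except the first and last ends up in a different position after the rearrangement, which hints at why such data movement can be costly for large ML tensors.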
The increasing computational demands of ever-larger Machine Learning models, especially LLMs, are pushing the limits of traditional computing architectures, making PIM solutions more urgent.
Processing-In-Memory (PIM) architectures can significantly accelerate AI workloads by addressing memory bandwidth bottlenecks, which is crucial for continued AI progress and efficiency.
Better compilation for PIM lets these specialized architectures be used more efficiently, which could translate into performance gains for AI deployments.
- PIM device manufacturers
- AI hardware developers
- Hyperscale cloud providers
- Large Language Model developers
- Traditional CPU/GPU designers
- Memory-bound AI applications without PIM integration
Improved performance and energy efficiency for AI applications, particularly those with high memory bandwidth requirements.
Accelerated development and deployment of more complex Machine Learning models, pushing boundaries in fields like natural language processing.
Potential for new form factors and edge AI devices enabled by lower power consumption and higher performance from PIM integrated systems.
Read at arXiv cs.LG