The Push Toward Memory-Centric Computing

For decades, computing has been dominated by the von Neumann architecture: processors and memory kept separate, with data shuttling back and forth across a bus. As models and datasets grow, this separation has become a bottleneck. The energy cost of moving data now often outweighs the cost of the computation itself. Memory-centric computing is emerging as a response.
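
To make the scale concrete, here is a back-of-envelope comparison in Python, using illustrative 45 nm energy figures commonly cited in the literature (Horowitz, ISSCC 2014). The exact numbers vary by process and design; the ratio is the point.

```python
# Back-of-envelope comparison of compute vs. data-movement energy.
# Figures are illustrative, commonly cited 45 nm estimates
# (Horowitz, ISSCC 2014); exact values vary by process and design.

FP32_ADD_PJ = 0.9      # ~energy for one 32-bit floating-point add
DRAM_READ_PJ = 640.0   # ~energy to fetch one 32-bit word from off-chip DRAM

ratio = DRAM_READ_PJ / FP32_ADD_PJ
print(f"Fetching one word from DRAM costs ~{ratio:.0f}x a 32-bit add")
# -> roughly 700x: streaming operands to and from memory, not arithmetic,
#    dominates the energy budget of data-intensive workloads.
```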

The idea is to collapse the distance between computation and memory. Instead of fetching data repeatedly, processing happens directly where the data resides. This can be achieved through processing-in-memory (PIM) architectures, non-volatile memory with compute capabilities, or 3D-stacked memory integrated tightly with logic.
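
As a concrete illustration of the PIM idea, the NumPy sketch below simulates analog matrix-vector multiplication on a resistive crossbar, one of the most studied in-memory compute primitives: matrix weights are stored as cell conductances, the input vector is applied as row voltages, and the summed column currents deliver the dot products. The array size, value ranges, and noise level are arbitrary assumptions for illustration.

```python
import numpy as np

# Minimal sketch of analog in-memory matrix-vector multiplication on a
# resistive crossbar: weights are stored as cell conductances G, the input
# vector is applied as row voltages V, and each column's summed current
# I = G^T @ V is a dot product, computed where the data resides.

rng = np.random.default_rng(0)
G = rng.uniform(0.1, 1.0, size=(4, 3))   # conductances: 4 rows x 3 columns
V = rng.uniform(0.0, 0.5, size=4)        # input voltages on the 4 word lines

# Ideal column currents (Kirchhoff's current law sums each column).
I_ideal = G.T @ V

# Real devices drift and vary; a crude noise model shows why staying
# reliable as storage while computing is a genuine challenge.
G_noisy = G * rng.normal(1.0, 0.05, size=G.shape)
I_noisy = G_noisy.T @ V

print("ideal currents:", I_ideal)
print("noisy currents:", I_noisy)
```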

The potential benefits are substantial. Reducing data movement cuts power consumption drastically, which matters for AI training, where energy use has become a major constraint. Effective bandwidth improves as well, since operations happen inside the memory stack rather than across slower interconnects. For workloads dominated by linear algebra and tensor operations, the gains are particularly striking.
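
A quick roofline-style estimate shows why such workloads benefit: a dense matrix-vector multiply performs about two floating-point operations per matrix element fetched, so its arithmetic intensity sits near 0.5 flop/byte no matter how large the matrix gets. The sketch below works through the numbers; the sizes are arbitrary.

```python
# Rough roofline-style estimate for a dense matrix-vector multiply y = A @ x:
# ~2*M*N flops, but ~4*M*N bytes of A must be streamed in (fp32), so the
# arithmetic intensity stays near 0.5 flop/byte regardless of problem size.

M, N = 4096, 4096
flops = 2 * M * N                  # one multiply + one add per matrix element
bytes_moved = 4 * (M * N + N + M)  # fp32: matrix A, input x, output y

intensity = flops / bytes_moved
print(f"arithmetic intensity ~ {intensity:.2f} flop/byte")

# Modern processors can sustain tens of flops per byte of external memory
# bandwidth, so this workload is bandwidth-bound: moving compute into the
# memory stack attacks the actual limiter rather than adding idle flops.
```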

Challenges remain. Designing memory that can compute while still being reliable as storage is not trivial. New programming models are needed to expose in-memory compute capabilities to developers. Standardization across hardware vendors is still lacking, making adoption uneven. Yet research prototypes from academia and industry consistently demonstrate large efficiency gains compared to traditional systems.
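
To make the programming-model gap concrete, here is a deliberately hypothetical Python mock of what an offload-style PIM interface might look like: data is partitioned across memory banks, a kernel runs where each partition lives, and only scalar results cross the bus. The PIMBank class and its launch method are invented for illustration and do not correspond to any vendor's actual API.

```python
# Hypothetical sketch of an offload-style PIM programming model. PIMBank
# and launch are invented names, loosely inspired by existing offload
# models; this mocks the semantics in plain Python, not real hardware.

from typing import Callable, List

class PIMBank:
    """Mock of a memory bank with its own small compute unit."""
    def __init__(self, data: List[float]):
        self.data = data  # resides "in" the bank; never shipped to the host

    def launch(self, kernel: Callable[[List[float]], float]) -> float:
        # On real hardware the kernel would run on the bank's local logic;
        # here we execute it in place to illustrate the programming model.
        return kernel(self.data)

# Host code: partition data across banks, reduce where the data lives,
# and move only one scalar per bank across the bus.
banks = [PIMBank([float(i) for i in range(b * 1000, (b + 1) * 1000)])
         for b in range(4)]
partial_sums = [bank.launch(lambda xs: sum(xs)) for bank in banks]
print("total:", sum(partial_sums))  # host combines 4 scalars, not 4000 floats
```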

Companies are beginning to explore commercial solutions. Samsung and SK Hynix are experimenting with PIM-enabled DRAM. Startups are working on resistive RAM and phase-change memory as candidates for hybrid storage-compute devices. Cloud providers are watching closely, as energy savings at scale translate into massive operational benefits.

The shift to memory-centric computing represents a deeper architectural change than adding more cores or faster GPUs. It is about rethinking where computation should live. If successful, it could mark the end of the von Neumann bottleneck and the beginning of a new era where memory is no longer passive but active, shaping the pace of computation itself.


