
Integration of PIM into the Memory Hierarchy

Information about Today's Paper:

- Title: Livia: Data-Centric Computing Throughout the Memory Hierarchy
- Author: Elliot Lockerman, et al.
- Published: ASPLOS'20

My Confidence: 3/5


My Thoughts:
- In the near future, computer systems will have to reduce the amount of data movement. Memory bandwidth has improved far more slowly in recent years than compute performance, so when dealing with larger computational models, performance becomes bound by memory bandwidth. This problem is called the memory wall.
- By distributing computation units (memory service elements, MSEs) that perform near-memory computation throughout the memory hierarchy, this work succeeds in reducing data movement even for tasks with irregular memory accesses.
- On an AVL tree traversal task, Livia is about 1.6 times faster than a conventional multi-core processor in a cycle-level simulation (a sketch of this task-chaining style follows this list).
- This appears to be a pioneering paper on integrating PIM into the memory hierarchy.
- This approach lets PIM benefit from memory access locality; as a result, data movement is reduced compared to a conventional multi-core processor or a system that uses a hybrid of cache and PIM.
- Cache coherence is maintained in a multi-core processor with L1 and L2 as private caches, L3 as a shared cache, and main memory.
- I was curious how the cache coherence protocol and the dispatch of tasks to the memory service elements work together.
- When a memory service element needs to find out which private cache holds a given piece of data, does it broadcast to all cores to check? (A directory-based alternative is sketched below.)
- I was not sure what specific technique the paper's mention of using FPGAs for memory service elements refers to.
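
To make the task-chaining idea above concrete, here is a minimal C++ sketch of a pointer-chasing lookup where each step is offloaded as a small task. The `mse_spawn` call and the plain BST node are my own illustrative stand-ins, not the paper's actual API: on Livia-like hardware the continuation would execute at the MSE nearest to where the node resides, while this software sketch simply runs it inline.

```cpp
// Minimal sketch of Livia-style task chaining for pointer chasing.
// Hypothetical API: mse_spawn() would dispatch the continuation to the
// memory service element (MSE) nearest to where *node currently resides;
// here it just runs the task inline so the sketch stays executable.
#include <cstdio>
#include <functional>

struct Node {           // plain BST node, a stand-in for an AVL node
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Stand-in for Livia's task dispatch: on real hardware only the tiny
// task would migrate toward the data, not the cache lines to the core.
void mse_spawn(Node* node, const std::function<void(Node*)>& task) {
    task(node);  // inline execution in this software sketch
}

// Each traversal step is its own task: it touches one node, then
// re-spawns itself near the next node instead of fetching every node.
void lookup(Node* node, int key, bool* found) {
    if (!node) { *found = false; return; }
    if (node->key == key) { *found = true; return; }
    Node* next = key < node->key ? node->left : node->right;
    mse_spawn(next, [key, found](Node* n) { lookup(n, key, found); });
}

int main() {
    // Build a tiny tree: 2 <- 4 -> 6
    Node root{4}, l{2}, r{6};
    root.left = &l; root.right = &r;

    bool found = false;
    lookup(&root, 6, &found);
    std::printf("found: %s\n", found ? "yes" : "no");
}
```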
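
On the broadcast question: I do not know how Livia itself resolves this, but in an inclusive hierarchy a common alternative is a sharer directory co-located with the shared L3, so locating a line takes one directory read rather than a broadcast to every core. The sketch below assumes a simple bit-vector directory; all names here are illustrative.

```cpp
// Minimal sketch of a directory-based lookup, assuming a bit-vector
// sharer directory co-located with the shared L3: finding which private
// caches hold a line is one directory read, not a broadcast.
#include <bitset>
#include <cstdint>
#include <cstdio>
#include <unordered_map>

constexpr int kNumCores = 4;

struct DirEntry {
    std::bitset<kNumCores> sharers;  // which private caches hold the line
    bool dirty = false;              // true if one core holds it modified
};

// Directory keyed by cache-line address (address >> 6 for 64-byte lines).
std::unordered_map<uint64_t, DirEntry> directory;

// An MSE (or any requester) consults the directory instead of broadcasting.
DirEntry lookup_sharers(uint64_t addr) {
    auto it = directory.find(addr >> 6);
    return it != directory.end() ? it->second : DirEntry{};
}

int main() {
    uint64_t addr = 0x1000;
    directory[addr >> 6].sharers.set(2);   // core 2 caches the line
    DirEntry e = lookup_sharers(addr);
    std::printf("line cached by %zu core(s)\n", e.sharers.count());
}
```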

Related Papers:

- MVC: Enabling Fully Coherent Multi-Data-Views through the Memory Hierarchy with Processing in Memory