I was wondering, would partitioning the HBM2 memory be a good idea?
Suppose the memory controller is configured so that both the GPU and the CPU can read and write all memory directly through pointers (zero copy). This is already possible, so that is not an issue.
But what if the memory layout were set up so that one stack is primarily available for the CPU to execute from, and the other stacks are primarily for the GPU? With support from the OS, the CPU and the GPU could then access memory in parallel, making the most of the HBM2 features. Also, most GPU memory seems to be used for texture storage, and in the near future M.2 SSDs should be fast enough and cheap enough to stream textures in directly. Combined with the texture caching that is always present in GPU memory, that should be enough.
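To make the partitioning idea a bit more concrete, here is a rough sketch of the kind of address-to-stack mapping I have in mind. Everything in it is an assumption for illustration: four stacks of 4 GiB each, mapped back to back with no interleaving across stacks, and stack 0 as the one the OS prefers for the CPU. The names `stack_of`, `preferred_owner` and `can_proceed_in_parallel` are made up, and a real memory controller may well interleave addresses differently.

```python
# Hypothetical layout (assumed, not from any real product):
# 4 HBM2 stacks of 4 GiB each, mapped back to back with no interleaving.
GIB = 1 << 30
STACK_SIZE = 4 * GIB
NUM_STACKS = 4
CPU_STACK = 0  # stack the OS would prefer for CPU code and data


def stack_of(phys_addr):
    """Return the index of the stack that holds this physical address."""
    if not 0 <= phys_addr < NUM_STACKS * STACK_SIZE:
        raise ValueError("address outside the HBM2 range")
    return phys_addr // STACK_SIZE


def preferred_owner(phys_addr):
    """Return 'cpu' for the CPU-preferred stack and 'gpu' for the others."""
    return "cpu" if stack_of(phys_addr) == CPU_STACK else "gpu"


def can_proceed_in_parallel(cpu_addr, gpu_addr):
    """Two accesses do not compete for a stack if they land on different stacks."""
    return stack_of(cpu_addr) != stack_of(gpu_addr)
```

Because both sides can still reach every address (zero copy), the split would only be a placement preference for the OS allocator, not a hard wall.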
But for a CPU to take good advantage of HBM2, would the cache line size not have to be optimized for it?
How would this work with pseudo channel mode?
Does the GPU have the same issue?
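Here is the back-of-the-envelope arithmetic behind the cache line question, as a small sketch. As I read the JEDEC HBM2 material, a legacy-mode channel is 128 bits wide with a burst length of 2, and pseudo channel mode splits it into two 64-bit pseudo channels with a burst length of 4. Treat those figures, and the 64-byte and 128-byte cache line sizes used below, as my assumptions rather than facts about any particular chip.

```python
def bytes_per_burst(bus_width_bits, burst_length):
    """Bytes moved by one burst on a channel of the given width."""
    return bus_width_bits // 8 * burst_length


def bursts_per_line(line_bytes, bus_width_bits, burst_length):
    """Number of bursts needed to fill one cache line (rounded up)."""
    burst = bytes_per_burst(bus_width_bits, burst_length)
    return -(-line_bytes // burst)  # ceiling division


# Assumed HBM2 modes as (bus width in bits, burst length).
LEGACY_MODE = (128, 2)
PSEUDO_CHANNEL_MODE = (64, 4)

for name, mode in (("legacy", LEGACY_MODE), ("pseudo channel", PSEUDO_CHANNEL_MODE)):
    for line_bytes in (64, 128):  # assumed cache line sizes
        print(name, line_bytes, bursts_per_line(line_bytes, *mode))
```

Under these assumptions both modes move the same number of bytes per burst, so a cache line fill costs the same number of bursts either way. What pseudo channel mode would change is how many independent accesses can be in flight at once.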
Does anybody have some good thoughts?
http://www.anandtech.com/show/10527/sk-hynix-adds-hbm2-4-gb-memory-q3
http://www.anandtech.com/show/9969/jedec-publishes-hbm2-specification