The problem is that the core CPU uArch would need to be completely redesigned to take advantage of the kind of memory bandwidth this enables. You're talking about eliminating the need for an L3 cache, since this memory would effectively become your L3 cache. Possibly even your L2 cache. Throw a GPU in there and it gets even more complicated, since the CPU and GPU would both be sharing this memory. Games would also need to be completely redesigned so that more data can be fed straight to the GPU, and so that the CPU and GPU can work on the same data sets without copying or moving the data around.
I wish some tech site would do a "The Life of a Texture" article so we could see just how much time is wasted copying and moving data vs. how much time goes to actual processing.
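For a rough feel of what such an article might measure, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the `copy_fraction` helper is hypothetical, and the texture size, link bandwidth, and processing time are placeholder values, not measurements of any real system.

```python
def copy_fraction(size_bytes, copies, copy_bw_bytes_per_s, process_time_s):
    """Return the fraction of total time spent moving data rather than processing it."""
    copy_time_s = copies * size_bytes / copy_bw_bytes_per_s
    return copy_time_s / (copy_time_s + process_time_s)


# Placeholder inputs (assumptions, NOT measured values):
texture_bytes = 4096 * 4096 * 4   # one uncompressed 4096x4096 RGBA8 texture
link_bw = 16e9                    # assumed 16 GB/s between CPU and GPU memory
process_time = 1e-3               # assumed 1 ms of actual work on the texture

# Compare a shared-memory case (0 copies) against one or several copies.
for copies in (0, 1, 3):
    frac = copy_fraction(texture_bytes, copies, link_bw, process_time)
    print(f"{copies} copies: {frac:.0%} of the time is spent moving data")
```

With zero copies the fraction is zero by construction, which is the shared-memory case described above. Plug in real numbers from a profiler to see where an actual workload lands.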