- Oct 14, 1999
Intel has supported multiprocessing for a while now and is gaining a lot of headway with each successive generation, even making it happen, after a fashion, on a single core. AMD has been putting a lot of work into its next-generation multiprocessor platform, the Opteron-Hammer. This comes after a huge success with the Athlon-MP.
One new idea coming out of the Hammer development is that one processor can forward information in its cache to another processor over a very high-speed link based on HyperTransport (HT) technology. (Funny how Intel recycles the abbreviation "HT" for its "Hyper-Threading" technology...) Something around 20GB/second of throughput will be possible along the HT pathways in an 8-way Opteron system.
With the advent of HyperTransport technology in SMP systems, it would seem that multiple processors could benefit from a "dummy processor" that contains only old cache information already discarded by the other CPUs, acting as an L3 cache, only faster than what you'd normally expect out of an L3 cache. Why faster? Because the entire "dummy processor" could be full-speed on-die cache of around 2MB, taking up about the same space as a regular CPU core. Minimal logic could be built into this "dummy processor" so that it never actually does any work, but merely acts as the keeper of old cache information. An 8-way system using one of these could be referred to as maybe a "7+1 way system", meaning 7 CPUs and 1 dummy processor.
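To make the idea a bit more concrete, here is a rough toy model in Python of what the "dummy processor" would be doing: each CPU has a small private cache, and lines it evicts go into a shared store that any CPU can pull from before going to main memory. This is only a sketch under my own assumptions. The names (`LRUCache`, `System`, `access`), the LRU replacement policy, and the tiny sizes are all made up for illustration, and the model ignores cache coherence, writes, and HT link latency entirely.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU set of cache-line addresses (hypothetical model, not real hardware)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def lookup(self, addr):
        """Return True on a hit and mark the line as most recently used."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        return False

    def remove(self, addr):
        self.lines.pop(addr, None)

    def insert(self, addr):
        """Insert addr; return the evicted address, or None if nothing was evicted."""
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            victim, _ = self.lines.popitem(last=False)
            return victim
        return None


class System:
    """N working CPUs plus one 'dummy processor' that only stores evicted lines."""

    def __init__(self, n_cpus, cpu_lines, dummy_lines):
        self.cpus = [LRUCache(cpu_lines) for _ in range(n_cpus)]
        self.dummy = LRUCache(dummy_lines)
        self.stats = {"local": 0, "dummy": 0, "memory": 0}

    def access(self, cpu, addr):
        """Simulate a read by one CPU; return where the line was found."""
        cache = self.cpus[cpu]
        if cache.lookup(addr):
            self.stats["local"] += 1
            return "local"
        if self.dummy.lookup(addr):
            # Assumption: the line moves back to the requesting CPU.
            self.dummy.remove(addr)
            source = "dummy"
        else:
            source = "memory"
        self.stats[source] += 1
        evicted = cache.insert(addr)
        if evicted is not None:
            # Lines discarded by a CPU land in the dummy processor.
            self.dummy.insert(evicted)
        return source


# A "7+1 way" box with deliberately tiny caches so evictions happen quickly.
box = System(n_cpus=7, cpu_lines=2, dummy_lines=4)
for addr in (0x100, 0x140, 0x180):
    box.access(0, addr)          # third access pushes 0x100 out of CPU 0
print(box.access(1, 0x100))      # CPU 1 finds it in the dummy processor
```

In this toy version the dummy processor behaves like what is usually called a victim cache, just shared across sockets. Whether it would beat a conventional L3 in practice depends on things the model leaves out, mainly the latency of the hop across the link and the coherence traffic.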
Is this idea plausible as a replacement for conventional L3 designs, or would it be too complicated to implement?
