IBM's Power7 heats up server competition at Hot Chips

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
IBM's Power7 heats up server competition at Hot Chips

Embedded DRAM, multithreading give edge over AMD, Intel

Among its several advances, Power7 uses a mix of SRAM and IBM's embedded DRAM technology to pack as much cache as any of its competitors, or more, onto the same die as the processor. That's a big shift from the past three Power generations, which used cache on a separate die in a multichip module.

The shift from the two-core Power6 to the 4-, 6- and 8-core Power7 drove the need for more memory, a change that took years of effort both in IBM's silicon-on-insulator process technology and in memory architecture, said Bill Starke, an IBM Power architect who has worked on four generations of Power chips.

"We knew when we hit this level of multicore design, we would have to make the shift," Starke said. "We've been talking about this for several processor generations," he said.

The eDRAM cache of more than 16 Mbytes, improved off-chip signaling techniques "and a few more ingredients," helped IBM get beyond the 300 Gbyte/second memory bandwidth of the Power6. In addition, Power7 is said to pack as many as eight DDR3 memory channels.

http://www.eetimes.com/news/se...cleID=219400955&pgno=1

eDRAM coming to Power7 processors...could we be seeing something like this in Bulldozer? IBM is certainly legitimizing the technology with this move.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,315
10,031
126
Power7 already? Did I miss out on Power6? I remember the pictures on The Register of the Power5 chip, it was huge. Is the Power6 any bigger?
 

mozartrules

Member
Jun 13, 2009
53
0
0
They need some major speed improvements compared to the Power6. I write a heavily CPU-bound application at work and am in the process of converting from the 4.2GHz Power6 to a 2.53GHz E5540 Xeon (still trying to get X5570s). We are doing the change for price reasons, but the E5540 is about 15% faster. This was not a surprise since we had already looked at SPEC ratings. Unless you need the specific features of the architecture, the Nehalem-based Xeons are likely to be as good (= much better once you factor in price).
 

Idontcare

Originally posted by: VirtualLarry
Power7 already? Did I miss out on Power6? I remember the pictures on The Register of the Power5 chip, it was huge. Is the Power6 any bigger?

Yeah, Power6 (and Power6+) has been out about a year now, with serious clock speeds, all the way up to 5GHz.

Of course big GHz multiplied by little IPC means they net out at about the same performance as all the other leading-edge HPC solutions out there.
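
To put a rough number on the GHz-times-IPC point, here is a back-of-the-envelope sketch using the figures mozartrules gave above (a 4.2GHz Power6 versus a 2.53GHz E5540 that runs his application about 15% faster). Treating performance as simply clock × IPC is an assumption, and `relative_ipc` is a made-up helper for illustration, not a measurement.

```python
def relative_ipc(clock_a_ghz, clock_b_ghz, speedup_b_over_a):
    """IPC of chip B relative to chip A, assuming perf = clock * IPC."""
    return speedup_b_over_a * clock_a_ghz / clock_b_ghz


# Figures quoted in the thread: Power6 at 4.2GHz, E5540 at 2.53GHz,
# with the E5540 about 15% faster on one CPU-bound application.
ratio = relative_ipc(clock_a_ghz=4.2, clock_b_ghz=2.53, speedup_b_over_a=1.15)
print(f"Implied E5540 IPC relative to Power6: {ratio:.2f}x")
```

Under those assumptions the Xeon gets through roughly 1.9 times the work per clock on that one workload, which is the "little IPC" side of the trade. Other applications will land elsewhere.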
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
IDC, you should read this article about Beckton :)
http://www.semiaccurate.com/20...ecton-8-cores-and-all/

I think we'll see Embedded DRAM in the future with all 3 CPU manufacturers, AMD/IBM/Intel.

From SPEC CPU scores, CPU-wise the Power 6 was about equal to Core Duo (Yonah) in single-thread performance at equal clock speeds. Either way, I see Nehalem-EX as a significant threat to Power 7.
 

Idontcare

More on Power7:

I've included a list of specs in the accompanying slide: basically, the POWER7 is an 8-, 6-, and 4-core chip with 1.2 billion transistors, running at an undisclosed clock speed. A shared L3 cache of up to 32 Mbytes in size will use eDRAM. The POWER7 will scale up to 32 sockets and 1,024 threads. Not surprisingly, it will be backward-compatible with the POWER6.

http://www.pcmag.com/article2/0,2817,2351965,00.asp

There's a couple of snazzy die-shots highlighting the chip layout and the core layout.

Details:

  • 567mm² and 1.2B xtors
  • 8 cores, 32MB L3$ (eDRAM)
  • Each core supports 4-way SMT
  • Dual DDR3 IMCs
  • Platform supports up to 32 sockets
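
A quick sanity check on how those numbers line up with the 1,024-thread figure quoted above. This is only arithmetic on the listed specs, and it assumes one 8-core chip per socket, which the slide does not state explicitly.

```python
# Spec figures from the list above.
CORES_PER_CHIP = 8
SMT_WAYS = 4        # hardware threads per core
MAX_SOCKETS = 32

# Assumption: one chip per socket.
threads_per_chip = CORES_PER_CHIP * SMT_WAYS
threads_per_system = threads_per_chip * MAX_SOCKETS

print(threads_per_chip)    # 32
print(threads_per_system)  # 1024
```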
 

heyheybooboo

Diamond Member
Jun 29, 2007
6,278
0
0
Originally posted by: Idontcare
More on Power7:

But can it run Windows? :D
 

Idontcare

Some more info on IBM's Power7 and Sun's Rainbow Falls (Niagara 3)

Sun, IBM push multicore boundaries

Both pack 128 threads in a server CPU socket

Power7 packs as many as 32 cores supporting 128 threads on a four-chip module with links to handle up to 32 sockets in a system. "It is scaling well beyond anything we've ever really seen before," said Peter Glaskowsky, a technology analyst for Envisioneering Group (Seaford, NY).

At the chip level, IBM claims Power7 will deliver four times the performance of Power6, thanks to the combination of expanded cache and interconnects.

Colors of Sun's Rainbow

Sun's Rainbow Falls will require about 30 percent more power than the previous generation, called T2+, which integrated eight cores. However, the new chip, built in a 40nm TSMC process, is about the same size as the T2+.

The design uses the same basic core Sun employed on the previous T2 chip, but it enhanced its floating-point unit and added new block ciphers and hash functions to an embedded cryptographic accelerator.

http://www.eetimes.com/news/se...l;?articleID=219500130
 

Idontcare

Originally posted by: IntelUser2000
IDC, you should read this article about Beckton :)
http://www.semiaccurate.com/20...ecton-8-cores-and-all/

I think we'll see Embedded DRAM in the future with all 3 CPU manufacturers, AMD/IBM/Intel.

From SPEC CPU scores, CPU-wise the Power 6 was about equal to Core Duo (Yonah) in single-thread performance at equal clock speeds. Either way, I see Nehalem-EX as a significant threat to Power 7.

Hey thanks for jogging my memory with the pm, I meant to address this and I forgot.

Yeah, it is interesting how they are using a ring-bus and cache slices to deal with the communication topology inherent in such physically large chips. They have literally brought onto the chip the same basic methodology used to deal with the time-delay issues that were present when just as much cache and as many cores sat in one system but were physically distributed across a motherboard, electrically connected by sockets and PCB traces instead of a monolithic on-die interconnect.

Instead of dealing with NUMA coherency and hops on multi-socket systems, we get to deal with it in miniaturized fashion: L3$ coherency across slices that sit within one die but are still physically separated enough to drive heterogeneous access penalties depending on the distance to the slice (for now they've kept it all within "1 hop", to extend the loose analogy even further).

The on-die interconnect makes things faster, but it doesn't negate the fundamental issue of making data resident on one part of the chip accessible to the xtors that need to process it 20mm and a dozen clock ticks away.
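
For anyone who wants to see the distance-dependent penalty in numbers, here is a toy model of it. Everything in it is an assumption for illustration: the eight ring stops, the bidirectional ring, and the `base_cycles` and `cycles_per_hop` values are made up, and none of it is taken from Intel's or IBM's actual designs.

```python
def ring_hops(src, dst, n_stops):
    """Shortest hop count between two stops on a bidirectional ring."""
    d = abs(src - dst) % n_stops
    return min(d, n_stops - d)


def access_latency(core, cache_slice, n_stops=8, base_cycles=30, cycles_per_hop=2):
    """Toy L3 access latency: a fixed cost plus a per-hop ring penalty.

    All parameter defaults are hypothetical, chosen only to show the shape.
    """
    return base_cycles + cycles_per_hop * ring_hops(core, cache_slice, n_stops)


# Latency seen by core 0 when the data lives in each of the eight slices.
latencies = [access_latency(0, s) for s in range(8)]
print(latencies)
```

The local slice is the cheapest and the slice on the opposite side of the ring is the most expensive, which is the same non-uniform access picture as NUMA, just shrunk down onto one die.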

What the Intel guys did with Beckton is pretty sophisticated in how they took care to implement a robust ring-bus topology such that no data is sent that the receiver can't accept. Maybe I'm just easily impressed but the description given by Charlie of the ring-bus topology is pretty swanky IMO.

And I think you are right, what you said in the pm, that this probably provides us some insight into how Intel has already engineered the ring-bus topology on Larrabee and Sandy Bridge.