Intel Sandy Bridge-EP versus AMD Interlagos

Chess Gator

Member
Jan 16, 2008
124
0
76
Hello group,

In the 3rd quarter of 2011, Intel will be launching its next-generation CPU with Sandy Bridge-EP, which is expected to feature 8 cores, 16MB of L3 cache (although some rumours put this at 20MB), 4 DDR3 memory controllers, 2 QuickPath 1.1 links and 32 lanes of PCI-Express 3.0.

http://www.realworldtech.com/page.cfm?ArticleID=RWT091810191937&p=1

In approximately the same timeframe, AMD will be launching Interlagos Opteron, which is based on AMD's new Bulldozer core architecture. Interlagos has 8 Bulldozer modules, thus 16 cores per CPU!

http://www.hpcwire.com/features/Intel-AMD-Gear-Up-for-2011-Server-Chip-Battle-103060704.html

It looks like by the 3rd Quarter of 2011, Interlagos Opteron will finally be better than Intel CPUs for computer chess play!

Perhaps someone more technical can comment.

Cordially,

CG
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
Correct me if I am wrong, as I have not read every BD article out there, but isn't a BD module 2 INT units and 1 FP unit? So a module is not really 2 full cores.

And SB will have HT, which will give you 16 threads, but at a per-thread performance penalty (some percentage).

I do not think anyone really knows how they will compare at this point in time.
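For what it's worth, the thread-count arithmetic so far can be sketched in a few lines of Python. This is only a rough tally using the figures quoted above (8 Bulldozer modules with 2 integer cores each, 8 Sandy Bridge-EP cores with 2 Hyper-Threading threads each). The `Chip` class and its field names are made up for illustration, and the tally says nothing about how fast each thread actually is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Hypothetical container for the core counts quoted in this thread."""
    name: str
    units: int             # Bulldozer modules, or Sandy Bridge cores
    threads_per_unit: int  # integer cores per module, or HT threads per core

    @property
    def hardware_threads(self) -> int:
        # Threads the OS would see; this is not a measure of performance.
        return self.units * self.threads_per_unit


interlagos = Chip("Interlagos", units=8, threads_per_unit=2)
sandy_bridge_ep = Chip("Sandy Bridge-EP", units=8, threads_per_unit=2)

for chip in (interlagos, sandy_bridge_ep):
    print(f"{chip.name}: {chip.units} x {chip.threads_per_unit} = "
          f"{chip.hardware_threads} hardware threads")
```

Both chips come out at the same thread count, so the real question is how much hardware backs each of those threads.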
 

Martimus

Diamond Member
Apr 24, 2007
4,490
157
106
Read these two articles for a good in-depth preview of both architectures, and how they might work for your applications:

Bulldozer Architecture Overview:
http://www.realworldtech.com/page.cfm?ArticleID=RWT082610181333&p=1

Sandy Bridge Architecture Overview:
http://www.realworldtech.com/page.cfm?ArticleID=RWT091810191937

EDIT: Oops, didn't realize that you already linked the two articles.

For all intents and purposes, you can equate a BD Module to a SB core. From the architecture previews, it appears that the BD module design has prioritized total module throughput over single threaded throughput, while the SB core design prioritized single threaded throughput over total core throughput. In other words, it looks like SB might be faster per thread, but BD might be faster at working on multiple threads.

Both architectures have improvements in both single threaded IPC and multi-threaded IPC, but they really diverged in which area they focused on.
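To make the module-versus-core trade-off concrete, here is a toy Python model of it. It is a sketch under loud assumptions: the two "second thread gain" numbers are placeholders I picked for illustration, not measurements or vendor figures, and the function names are hypothetical. It only shows how much faster per thread an SB core would need to be for one core to match one BD module when both run two threads.

```python
def unit_throughput(per_thread_speed: float, second_thread_gain: float) -> float:
    """Throughput of one module/core running two threads (arbitrary units).

    per_thread_speed: speed of a single thread running alone.
    second_thread_gain: extra throughput the second thread adds, as a
        fraction of the first thread (0.0 = nothing, 1.0 = perfect doubling).
    """
    return per_thread_speed * (1.0 + second_thread_gain)


def break_even_speed_ratio(bd_second_gain: float, sb_second_gain: float) -> float:
    """Per-thread speed advantage SB needs for one core to equal one module."""
    return (1.0 + bd_second_gain) / (1.0 + sb_second_gain)


# PLACEHOLDER values, chosen only to illustrate the shape of the argument.
BD_SECOND_THREAD_GAIN = 0.8  # assumed gain from a module's second integer core
SB_SECOND_THREAD_GAIN = 0.2  # assumed gain from Hyper-Threading

ratio = break_even_speed_ratio(BD_SECOND_THREAD_GAIN, SB_SECOND_THREAD_GAIN)
print(f"SB would need about {ratio:.2f}x the per-thread speed to break even")
```

Plug in your own guesses for the two gains. The point is only that "faster per thread" and "faster per module" can both be true at once.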
 

Soleron

Senior member
May 10, 2009
337
0
71
Correct me if I am wrong, as I have not read every BD article out there, but isn't a BD module 2 INT units and 1 FP unit? So a module is not really 2 full cores.

The single FP unit is made of two 128-bit FPUs that can operate independently. The FP unit in the current Opterons is 128-bit. Hence FP performance per core does not drop and may actually go up due to design changes.

Think of it as two integer cores with two FPUs that can be combined to execute AVX instructions as well. Intel does have a 256-bit FP unit per core but it can't be split into two units (so not using AVX means half goes unused) and using AVX on SB does share some of the MUL/ADD hardware with the integer units so it's not perfectly independent.

BD will more than likely have lower performance per core than SB. We don't have enough information to conclude which will perform better overall.
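The FP width argument can be put into numbers with a small Python sketch. It follows only the simplified description in this post (a BD module with two 128-bit FPUs that can gang up for one 256-bit op, an SB core with one 256-bit unit that cannot be split) and assumes 8 modules versus 8 cores. It ignores clock speed, multiple issue ports and memory bandwidth, and the function names are hypothetical.

```python
def ops_per_clock(fpu_bits: int, fpu_count: int, op_bits: int) -> int:
    """FP ops of width op_bits that one module/core can start per clock.

    FPUs narrower than the op are ganged together; an FPU wider than the
    op still runs just one op and leaves the rest of its width idle.
    """
    if op_bits <= fpu_bits:
        return fpu_count
    fpus_needed = -(-op_bits // fpu_bits)  # ceiling division
    return fpu_count // fpus_needed


def chip_fp_bits_per_clock(units: int, fpu_bits: int, fpu_count: int,
                           op_bits: int) -> int:
    """Peak FP bits processed per clock across the whole chip."""
    return units * ops_per_clock(fpu_bits, fpu_count, op_bits) * op_bits


for op_bits in (128, 256):
    # Assumed layouts: BD = 8 modules x two 128-bit FPUs,
    #                  SB = 8 cores   x one 256-bit FPU.
    bd = chip_fp_bits_per_clock(8, fpu_bits=128, fpu_count=2, op_bits=op_bits)
    sb = chip_fp_bits_per_clock(8, fpu_bits=256, fpu_count=1, op_bits=op_bits)
    print(f"{op_bits}-bit ops: BD {bd} bits/clock, SB {sb} bits/clock")
```

Under these assumptions the two chips tie on 256-bit AVX work, and BD only pulls ahead on 128-bit code, which is the "half goes unused" point above.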
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
The single FP unit is made of two 128-bit FPUs that can operate independently. The FP unit in the current Opterons is 128-bit. Hence FP performance per core does not drop and may actually go up due to design changes.

Think of it as two integer cores with two FPUs that can be combined to execute AVX instructions as well. Intel does have a 256-bit FP unit per core but it can't be split into two units (so not using AVX means half goes unused) and using AVX on SB does share some of the MUL/ADD hardware with the integer units so it's not perfectly independent.

Thank you, that makes sense. It will be very interesting to finally see some performance numbers.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Hello group,

In the 3rd quarter of 2011, Intel will be launching its next-generation CPU with Sandy Bridge-EP, which is expected to feature 8 cores, 16MB of L3 cache (although some rumours put this at 20MB), 4 DDR3 memory controllers, 2 QuickPath 1.1 links and 32 lanes of PCI-Express 3.0.

http://www.realworldtech.com/page.cfm?ArticleID=RWT091810191937&p=1

In approximately the same timeframe, AMD will be launching Interlagos Opteron, which is based on AMD's new Bulldozer core architecture. Interlagos has 8 Bulldozer modules, thus 16 cores per CPU!

http://www.hpcwire.com/features/Intel-AMD-Gear-Up-for-2011-Server-Chip-Battle-103060704.html

It looks like by the 3rd Quarter of 2011, Interlagos Opteron will finally be better than Intel CPUs for computer chess play!

Perhaps someone more technical can comment.

Cordially,

CG

What about the 10-core Intel? Or you're not talking about that one, right? Again, a link to where AMD finally beats Intel at chess? You don't have one, you say. Like I didn't know. Let's wait till both chips are out.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Read these two articles for a good in-depth preview of both architectures, and how they might work for your applications:

Bulldozer Architecture Overview:
http://www.realworldtech.com/page.cfm?ArticleID=RWT082610181333&p=1

Sandy Bridge Architecture Overview:
http://www.realworldtech.com/page.cfm?ArticleID=RWT091810191937

EDIT: Oops, didn't realize that you already linked the two articles.

For all intents and purposes, you can equate a BD Module to a SB core. From the architecture previews, it appears that the BD module design has prioritized total module throughput over single threaded throughput, while the SB core design prioritized single threaded throughput over total core throughput. In other words, it looks like SB might be faster per thread, but BD might be faster at working on multiple threads.

Both architectures have improvements in both single threaded IPC and multi-threaded IPC, but they really diverged in which area they focused on.

Yep, that's the way that article makes it sound. But who says Intel wasn't thinking server?
 

OCGuy

Lifer
Jul 12, 2000
27,224
37
91
I would wait for actual reviews before jumping to any conclusions about performance differences.

Sandy Bridge can't come out fast enough for me.
 

nyker96

Diamond Member
Apr 19, 2005
5,630
2
81
This new site, Real World Tech, is very impressive; it has a ton of articles on current architectures. I'm slowly reading through a few of them and am really liking them.
 

JFAMD

Senior member
May 16, 2009
565
0
0
Correct me if I am wrong, as I have not read every BD article out there, but isn't a BD module 2 INT units and 1 FP unit? So a module is not really 2 full cores.

And SB will have HT, which will give you 16 threads, but at a per-thread performance penalty (some percentage).

I do not think anyone really knows how they will compare at this point in time.

Actually the FPU can run in either dual 128-bit or single 256-bit. I would not characterize it as "not really 2 full cores".
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
Actually the FPU can run in either dual 128-bit or single 256-bit. I would not characterize it as "not really 2 full cores".
Well, one module (~= 2 cores) can do only one AVX op, right? So at least for that, it's not equal to 2 cores.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
This new site, Real World Tech, is very impressive; it has a ton of articles on current architectures. I'm slowly reading through a few of them and am really liking them.

New site? They've been around for over 13 years.
 

Soleron

Senior member
May 10, 2009
337
0
71
Well, one module (~= 2 cores) can do only one AVX op, right? So at least for that, it's not equal to 2 cores.

How many applications for the Xeon/Opteron market would benefit from more than eight heavy AVX threads per 16-core processor? If they exist, wouldn't you be better off using GPGPU for that kind of thing?

Even if not, I'm sure AMD doesn't mind losing that small percentage of the market in exchange for not needing double the FP die space for the rest of it which would be wasteful. Not sure what SB does here; do they have way more FP die area?

AMD is making tradeoffs (like any CPU design). The one they chose to make is not duplicating some of the core logic for each core in order to save power, and they believe it won't make much difference in the majority of workloads that will actually be run on servers. Intel believes the same about HT: the benefit is worth the extra circuitry.
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
How many applications for the Xeon/Opteron market would benefit from more than eight heavy AVX threads per 16-core processor? If they exist, wouldn't you be better off using GPGPU for that kind of thing?
Depends on the application and the size of the problem.
But I think you misunderstood me - I never said that sharing the FP units is a bad idea - actually most consumer apps aren't that heavy on FP anyways and you've got to recompile your program to use AVX anyhow - only that 1 module (2 threads) can't run 2 AVX ops, so "not really 2 full cores" is correct.
 

Soleron

Senior member
May 10, 2009
337
0
71
Depends on the application and the size of the problem.
But I think you misunderstood me - I never said that sharing the FP units is a bad idea - actually most consumer apps aren't that heavy on FP anyways and you've got to recompile your program to use AVX anyhow - only that 1 module (2 threads) can't run 2 AVX ops, so "not really 2 full cores" is correct.

OK. AMD defines a core as an integer core for marketing purposes. If you define it to include an FP unit then you are correct. I'm not sure what academic literature would define it as. However historically CPUs did not include FP units anyway so any definition with an FP unit would have to except those.

As I said above, Intel's SB does share part of the FP unit with the integer units, such that it can't use certain parts of the integer core at the same time as running an AVX op. So that's 'not really a full core' either. Though it won't affect performance except in corner cases.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
OK. AMD defines a core as an integer core for marketing purposes. If you define it to include an FP unit then you are correct. I'm not sure what academic literature would define it as. However historically CPUs did not include FP units anyway so any definition with an FP unit would have to except those.

FP units have been a part of the "core" since the 486 days. So I think most people will define core to include an FP unit.
 

Sp12

Senior member
Jun 12, 2010
799
0
76
Well, this is obviously a departure from previous designs. As the FP unit can perform two 128-bit operations per clock, I don't think it's fair to call it an untrue core.
 

JFAMD

Senior member
May 16, 2009
565
0
0
If ~90% of the typical workloads are integer, that should drive people's decisions.

The reality is that people don't buy cores, they buy processors. They judge those on performance, price and power consumption for the most part.
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
If ~90% of the typical workloads are integer, that should drive people's decisions.

The reality is that people don't buy cores, they buy processors. They judge those on performance, price and power consumption for the most part.
Well, the discussion is not about usefulness, but about semantics only ;)

Does it make sense to share the FP unit for AVX ops, which only get used for new/recompiled apps, considering that FP is usually not that heavily used in consumer apps? Personally I'd say yes, but in the end that's up for discussion.

But we're arguing whether or not that still constitutes 2 cores or something else. Actually, it's not that easy to come up with an exact definition of "core" (considering that the definition should work as well for µcs and other stuff) if you think about it.
 

JFAMD

Senior member
May 16, 2009
565
0
0
Worst of all, if you start to lock down the definition of "core" you stifle innovation.

We are probably 5-10 years away from a world where "socket" is the next thing to go on a server. I can see several scenarios where the OS becomes natively clustered and you add processing resources the way people add memory or hard drives today.

In a world of fabric computing things get even more interesting.

Those that try to lock down innovation by forcing strict interpretations do us no favors.

Think about this. When the world went to dual core, the definition of "processor" changed forever. I don't recall anyone ever saying "that is not a true processor because those two cores are sharing a memory controller and some other features."

Processors today are a combination of shared and discrete resources. The winner in all of this will be the person who learns how to share best those components that can reduce cost and power consumption while keeping discrete those that cause bottlenecks.
 

ilkhan

Golden Member
Jul 21, 2006
1,117
1
0
True, but Intel didn't add a second core and proclaim it a dual processor, did they? They added a second core and said dual core.
What AMD is trying to do is add half a core and say it's a dual core. That'll work on most of the sheep, but if you need the part that's being shared, it sucks. If you can't split the "dual" part in half and have it still work, it's not really "dual" anything.
 

Soleron

Senior member
May 10, 2009
337
0
71
True, but Intel didn't add a second core and proclaim it a dual processor, did they? They added a second core and said dual core.
What AMD is trying to do is add half a core and say it's a dual core. That'll work on most of the sheep, but if you need the part that's being shared, it sucks. If you can't split the "dual" part in half and have it still work, it's not really "dual" anything.

If AMD will sell you a higher performance level in the same power and similar pricing as Intel's part, isn't that good enough regardless of what the die looks like? Conversely if AMD does not offer higher performance then it won't sell regardless of how many cores they claim.

So don't worry about it. Performance, price and power will still determine whether it sells, not core/module marketing.

Look at Intel's SB unit sharing for AVX. That's not "dual anything" either.
 

JFAMD

Senior member
May 16, 2009
565
0
0
True, but Intel didn't add a second core and proclaim it a dual processor, did they? They added a second core and said dual core.
What AMD is trying to do is add half a core and say it's a dual core. That'll work on most of the sheep, but if you need the part that's being shared, it sucks. If you can't split the "dual" part in half and have it still work, it's not really "dual" anything.

Please define how doubling the integer execution pipelines is only half a core.

And each module can have 2 128-bit FPUs (and will for non AVX apps...)

I think people that are trying to argue "less than a core" are focused on semantics and not on what really matters to customers.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
Why introduce the term "module" then?

Why say that BD is a 4-module CPU with each module containing 2 integer units and 1 FP unit? Why not just say it's an 8-core CPU and save the confusion?

Again, this is just semantics anyways. Who knows, modules may be the future term instead of cores (IBM Power7).
 

JFAMD

Senior member
May 16, 2009
565
0
0
Modules are a design term. We will not market modules. A 16-core processor will be a 16-core processor, not an 8-module processor.

We introduced them to help people understand the modularity behind the architecture. In the future, for instance, we could swap out a processing core module with a GPU module if we needed to.