AMD describes multichip module: 12-core Magny-Cours

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
AMD describes multichip module

For its part, AMD is stealing a page from archrival Intel by building its 12-core Magny-Cours out of two of its six-core Istanbul dice. Intel released some of its early multicore chips using two dies sharing a single front-side bus and external memory controller.

AMD is improving on Intel's approach by providing on each die a two-channel DDR3 memory controller and four coherent HyperTransport 3 links. As a result, data in a two-chip system can travel between any two cores in a single hop and in two hops in a four-chip server.

"Basically we are taking a leaf from [Intel's] book but doing it differently," said Pat Conway, principal member of technical staff at AMD, who said to expect use of MCMs in the company's future CPUs.

http://www.eetimes.com/news/se...cleID=219400955&pgno=2

Kinda impressive to see an AMD technical staff member actually give a nod of acknowledgement to the Intel team there. :thumbsup:
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
I see the core race has replaced the MHz race. I suppose this has its place in the server world, though.
 

zsdersw

Lifer
Oct 29, 2003
10,505
2
0
Originally posted by: dmens
<sarc>

but it's not a true 12 core!!!!!1111

I'm wondering if the same people who complained about C2Q, etc. not being "true" quad-core processors are going to say the same of an AMD MCM chip.

Probably not.. as they're mostly biased for-AMD / against-Intel, anyway.. but we'll see.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
Ok, but when are we going to see the 12-core on socket AM3??? (For that matter, when is the desktop version of the six-core coming out?)
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: zsdersw
Originally posted by: dmens
<sarc>

but it's not a true 12 core!!!!!1111

I'm wondering if the same people who complained about C2Q, etc. not being "true" quad-core processors are going to say the same of an AMD MCM chip.

Probably not.. as they're mostly biased for-AMD / against-Intel, anyway.. but we'll see.

I believe that the issue was that each "half" of the early C2Q MCM chips from Intel had to go through the FSB to talk to the other half, thus negating any benefit of having them on the same chip. There was almost no difference between that and having a 2P dual core setup...
The same cannot be said for the Magny Cours...
 
soccerballtux

Dec 30, 2004
12,553
2
76
Originally posted by: Viditor
I believe that the issue was that each "half" of the early C2Q MCM chips from Intel had to go through the FSB to talk to the other half, thus negating any benefit of having them on the same chip. There was almost no difference between that and having a 2P dual core setup...
The same cannot be said for the Magny Cours...

Please elaborate.
 

Viditor

Originally posted by: soccerballtux
Please elaborate.

The MCM version of the C2Q did not share any cache, nor did it have any other directly connecting hardware between the two cores or two "halves" of the CPU other than the system's FSB. Therefore, the connection between the two parts of the MCM (which was basically two dual-core chips) was essentially the same as it was between two dual cores in two separate sockets.
For Magny-Cours, they are using 4 cHT links (cache-coherent HyperTransport, which is different from normal HT). This means that the two "halves" can take advantage of their proximity to reduce latency by communicating directly with each other. In fact, in a 2-socket system, all cores will be able to "talk" using only one hop (i.e. the signal isn't routed through anything else). The latency should be excellent...

BTW, I would also agree that it's not a "true" 12-core in that the two "halves" aren't sharing the direct connect architecture. However, neither is it as slow as having each "half" in its own socket. There is a definite operational advantage to putting the two halves together in this case.
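
To make the "one hop" idea concrete, here is a rough Python sketch that counts hops with a breadth-first search over a graph of dies. The wiring is an assumption: the four-die, fully connected graph simply encodes the article's claim that any two cores in a two-socket system are one hop apart, and the FSB-style graph is a simplified stand-in for the older Intel MCM layout. All node names are made up for illustration; none of this is a published wiring diagram.

```python
from collections import deque


def hop_count(links, src, dst):
    """Return the fewest links crossed from src to dst, or None if unreachable."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == dst:
            return hops
        for nxt in links.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


def fully_connected(nodes):
    """Build an adjacency map in which every node links to every other node."""
    return {n: [m for m in nodes if m != n] for n in nodes}


# ASSUMPTION: 2-socket Magny-Cours modeled as 4 dies, each linked to all others.
magny_2s = fully_connected(["s0.die0", "s0.die1", "s1.die0", "s1.die1"])

# ASSUMPTION: FSB-style MCM modeled as two dies that only meet at the chipset.
fsb_mcm = {"die0": ["chipset"], "die1": ["chipset"], "chipset": ["die0", "die1"]}

same_package = hop_count(magny_2s, "s0.die0", "s0.die1")
other_socket = hop_count(magny_2s, "s0.die0", "s1.die1")
via_fsb = hop_count(fsb_mcm, "die0", "die1")
print(same_package, other_socket, via_fsb)
```

Under these assumptions both Magny-Cours paths come out as a single hop, while the FSB-style pair has to go through the chipset. The hop count alone says nothing about how long each hop takes.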
 

Idontcare

Originally posted by: Viditor
In fact, in a 2 socket system, all cores will be able to "talk" using only one hop (i.e. the signal isn't routed through anything else). The latency should be excellent...

BTW, I would also agree that it's not a "true" 12-core in that the two "halves" aren't sharing the direct connect architecture. However, neither is it as slow as having each "half" in its own socket. There is a definite operational advantage to putting the two halves together in this case.

Viditor, if the setup is such that the CPUs in two sockets can now talk to each other with only a 1-hop penalty, the same 1-hop penalty imposed on two MCM'ed dies, then how is an MCM'ed Magny-Cours any better than a 2S Istanbul?

Shouldn't both setups deliver identical performance?

The way you wrote your post, it reads to me that Magny-Cours vs. 2S Istanbul is no better (and no worse) an improvement than Kentsfield was versus a dual-socket Wolfdale setup.

Am I misinterpreting your statement?
 

Viditor

Originally posted by: Idontcare
Viditor, if the setup is such that the CPUs in two sockets can now talk to each other with only a 1-hop penalty, the same 1-hop penalty imposed on two MCM'ed dies, then how is an MCM'ed Magny-Cours any better than a 2S Istanbul?

Shouldn't both setups deliver identical performance?

The way you wrote your post, it reads to me that Magny-Cours vs. 2S Istanbul is no better (and no worse) an improvement than Kentsfield was versus a dual-socket Wolfdale setup.

Am I misinterpreting your statement?

Yes and no... the latency of the hop will be different. From one half of the package to the other is a direct hop of much shorter length than to the cores in the other socket. In other words, there is no centralized HT controller; they are integral to the cores themselves.

 

Idontcare

Originally posted by: Viditor
Yes and no... the latency of the hop will be different. From one half of the package to the other is a direct hop of much shorter length than to the cores in the other socket. In other words, there is no centralized HT controller; they are integral to the cores themselves.

Oh, I'm with you now. I was mistaking the jargon "hop" to imply a quantum of latency, be it what it may. Meaning 1 hop = X ns latency regardless of where you hop to or from, and 2 hops then being 2X ns latency, etc.

So you are saying "hops" are more of a schematic linguistic thing to denote where the data request resides at any point in time, but the time required to execute that "hop" is entirely variable... a 2-hop transaction on one system could be faster than a 1-hop request on another, sort of deal?

edit: The analogy I had in mind is how latency for caches is described in terms of "cycles", where 2-cycle latency is mathematically twice as slow (in absolute ns latency terms) as 1-cycle latency. Clockspeeds are normalized out, etc.
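
A tiny Python sketch of that distinction, for what it's worth. The per-link latencies below are invented purely for illustration (nobody in this thread has posted measured numbers), and the system and node names are hypothetical. The only point is that total latency is the sum of whatever each individual link costs, so hop count and nanoseconds need not track each other.

```python
def path_latency_ns(path, link_ns):
    """Sum the per-link latencies along a path given as a list of node names."""
    return sum(link_ns[(a, b)] for a, b in zip(path, path[1:]))


# HYPOTHETICAL numbers: short links in system A, one long link in system B.
system_a = {("die0", "die1"): 40, ("die1", "die2"): 40}
system_b = {("cpu0", "cpu1"): 100}

two_hops_a = path_latency_ns(["die0", "die1", "die2"], system_a)
one_hop_b = path_latency_ns(["cpu0", "cpu1"], system_b)
print(two_hops_a, one_hop_b)
```

With these made-up figures the 2-hop path in system A finishes before the 1-hop path in system B, which is the "sort of deal" described above.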
 

Idontcare

Some more tidbits surrounding the HotChips presentation AMD made regarding Magny-Cours can be seen at Goto-san's weekly digest:

http://pc.watch.impress.co.jp/...i/20090826_310619.html

Intel guys, try to keep from laughing beer out your nose when you read the "Economics" section of this AMD presentation slide (after having been trolled to death by fanboys for taking advantage of these same economics over four (!) years ago):
http://pc.watch.impress.co.jp/.../html/kaigai4.jpg.html

edit: Ouch, I forgot about this: the probe filter takes away 1MB of your L3$ per chip when activated, reducing IPC per core, but with the intent of making it back up in system scaling over many threads.
 

geokilla

Platinum Member
Oct 14, 2006
2,012
3
81
So AMD's finally ditching their true X core status. I like the way AMD is headed, assuming everything works out well.

However, there is a problem with this. WHAT'S THE POINT OF HAVING MORE THAN QUAD CORES?! Only lately have programs been optimized for multi-core use. I'd rather see them spend their time trying to improve the speed of their processors than slapping more cores into one CPU. Seriously, after a year of using my rig in sig, compared to my AMD X2 rig that I gave to my sister, the everyday performance difference is minimal. I bet if I had chosen better parts back then for that rig, the AMD rig would feel just as quick and snappy. The only time I feel that the computer is slow is during Folding@Home, and that's probably due to a slow hard drive and not enough RAM.
 

Viditor

Originally posted by: geokilla
So AMD's finally ditching their true X core status. I like the way AMD is headed, assuming everything works out well.

However, there is a problem with this. WHAT'S THE POINT OF HAVING MORE THAN QUAD CORES?! Only lately have programs been optimized for multi-core use. I'd rather see them spend their time trying to improve the speed of their processors than slapping more cores into one CPU. Seriously, after a year of using my rig in sig, compared to my AMD X2 rig that I gave to my sister, the everyday performance difference is minimal. I bet if I had chosen better parts back then for that rig, the AMD rig would feel just as quick and snappy. The only time I feel that the computer is slow is during Folding@Home, and that's probably due to a slow hard drive and not enough RAM.

This is server only (Opteron)...and believe me, there are a HUGE number of places this can be used there. Most especially in virtual machines and cloud servers.
 

Sahakiel

Golden Member
Oct 19, 2001
1,746
0
86
Originally posted by: geokilla
So AMD's finally ditching their true X core status. I like the way AMD is headed, assuming everything works out well.

However, there is a problem with this. WHAT'S THE POINT OF HAVING MORE THAN QUAD CORES?! Only lately have programs been optimized for multi-core use. I'd rather see them spend their time trying to improve the speed of their processors than slapping more cores into one CPU. Seriously, after a year of using my rig in sig, compared to my AMD X2 rig that I gave to my sister, the everyday performance difference is minimal. I bet if I had chosen better parts back then for that rig, the AMD rig would feel just as quick and snappy. The only time I feel that the computer is slow is during Folding@Home, and that's probably due to a slow hard drive and not enough RAM.

Hm, if I remember correctly, the whole reason we even have the current emphasis on multi-core is because the speeds stopped scaling a lot sooner than expected. I think it was about... 10 years ago? Around that time, all the major chip companies including IBM were talking about hitting a clock speed wall due to quantum effects somewhere past 40nm or so. Unfortunately, they ran into problems a lot earlier than that so we ended up with shifting philosophies midway through the lifetime of the Pentium 4, which, by the way, probably represents the pinnacle design for the old school of computer architecture.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
It's not just that. Single-thread performance scaling in general has diminished. Plus, multi-core offers more opportunities for power management than single cores. I'm pretty sure most of the "easy" architectural enhancements for single-thread performance have been exploited.
 

Soleron

Senior member
May 10, 2009
337
0
71
According to JF at AMD, AMD were always against MCM connected outside the package (i.e. over FSB) as it degraded performance. Magny-Cours is connected by a fast HT link within the package so is almost as good as a native 12-core. This was interpreted by the media as AMD attacking all MCMs and picked up wrongly by AMD fanboys.

Magny-Cours, being on two dies, should be cheaper to produce than Intel. The upgrades made to the platform (DDR3, faster and more HT links, AMD chipset) could even get 4P AMD competitive on performance with 4P Nehalem on the server (they're already competitive on platform price).
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
I can see where AMD NEEDS all the things that were discussed at Hot Chips. I agree that AMD needs to do these things.

But IF AMD tries to do all this on BD on 32nm high-k/metal gate, gate-first tech, it will be a disaster. I won't even say I think AMD will fail. I will just say it: they will fail.

1) FMA4 - Maybe, we'll see.

2) 32nm high-k/metal gate, gate-first - Likely.

3) SMT - Possible.

4) Clock speed with all of the above and power efficiency on a first-time arch - Probabilities are very low for GOOD all-round performance.

5) Fusion fits in above somewhere.


IF AMD tries all this on one chip on a new process, it will fail.
 

Nemesis 1

Originally posted by: Soleron
According to JF at AMD, AMD were always against MCM connected outside the package (i.e. over FSB) as it degraded performance. Magny-Cours is connected by a fast HT link within the package so is almost as good as a native 12-core. This was interpreted by the media as AMD attacking all MCMs and picked up wrongly by AMD fanboys.

Magny-Cours, being on two dies, should be cheaper to produce than Intel. The upgrades made to the platform (DDR3, faster and more HT links, AMD chipset) could even get 4P AMD competitive on performance with 4P Nehalem on the server (they're already competitive on platform price).


We'll see about the performance vs. Intel. But you're way wrong about AMD's being cheaper than Intel's.

1. Intel's is a fully modular chip = Intel can add cores very cheaply.

2. AMD pays Intel for every chip/core they sell. AMD uses a more expensive process.

AMD has more defects because of Intel's modular chip. Intel sells in higher volume, so they utilize more of their resources to produce the lowest-cost chip = MARGINS.

As for MCM, Intel used L2 cache for core communications. FSB was for off-core memory. AMD is not using L3 cache for the two six-core dies to communicate with each other. If you don't get this part, you're hopelessly lost.



 
Scholzpdx

Apr 20, 2008
10,067
990
126
Nemesis 1, I'm almost certain Intel pays AMD for every chip they sell because of the 64-bit licensing. And didn't AMD develop a layer of SSE?
 

Nemesis 1

Originally posted by: Scholzpdx
Nemesis 1, I'm almost certain Intel pays AMD for every chip they sell because of the 64-bit licensing. And didn't AMD develop a layer of SSE?

Well, rather than being almost certain, provide a link to the Intel/AMD license agreement, highlighting where Intel pays any royalties to AMD. Why is it that AMD doesn't want the whole agreement public? It's clearly stated that AMD pays Intel.

 

Idontcare

Originally posted by: Nemesis 1
I can see where AMD NEEDS all the things that were discussed at Hot Chips. I agree that AMD needs to do these things.

But IF AMD tries to do all this on BD on 32nm high-k/metal gate, gate-first tech, it will be a disaster. I won't even say I think AMD will fail. I will just say it: they will fail.

1) FMA4 - Maybe, we'll see.

2) 32nm high-k/metal gate, gate-first - Likely.

3) SMT - Possible.

4) Clock speed with all of the above and power efficiency on a first-time arch - Probabilities are very low for GOOD all-round performance.

5) Fusion fits in above somewhere.


IF AMD tries all this on one chip on a new process, it will fail.

Nemesis, I think you might be misunderstanding the reasoning behind why folks like to make quips about "new architecture + new process tech = too risky".

If the node delivers its SPICE model specs, then the only reason a new chip design would fail is if the chip design itself fails, in which case that would have been the outcome whether it was implemented on a newer process or an older process.

If a node doesn't deliver its SPICE model specs, then the chip can fail, but so too will any other chip (including a prior existing architecture design) that is manufactured on that node.

The "new tech and new architecture is a bad idea" rule of thumb is born from risk management at the business strategy level...and the risks that are being managed are ones of time to market and gross margin maintenance not "does the chip work? does the node work?" type risk mitigation.

New designs carry with them an extra burden of verification and validation above and beyond that posed by shrinking a pre-existing architecture.

Thus the way to manage the risk of missing time-to-market opportunity for extracting entitlement gross margins from your newly released process technology is to plan to produce an existing architecture on the new node in parallel to producing a new architecture (which will take an extra 6-9 months minimum for the added verification and validation work).

If any one of the items you list above is not robustly implemented, then either the chip would have failed at any node (be it 45nm, 32nm, 22nm... bad design is bad design) or any chip design would have failed at that given node (be it BD or PhIII... bad xtors are bad xtors).

Doing old design + new tech doesn't change the risk of the new design having problems, nor does it change the risk of the new node having problems. But it does reduce the risk of failing to meet time to market and gross margin (yields, etc) targets for that first year that a new node is in manufacturing.
 

Idontcare

Originally posted by: Nemesis 1
We'll see about the performance vs. Intel. But you're way wrong about AMD's being cheaper than Intel's.

Originally posted by: Nemesis 1
2. AMD uses more expensive process.

It's not clear to me that you have access to the necessary fab data (yields, cycle-time, per-wafer production costs, etc) to make such a conclusion/statement in favor of either manufacturer.

Do you? Or is this just speculation on your part?

Originally posted by: Nemesis 1
AMD has more defects because of Intel's modular chip.

What!?

Originally posted by: Nemesis 1
Intel sells in higher volume, so they utilize more of their resources to produce the lowest-cost chip = MARGINS.

Cost structure is one thing, but gross margins are more impacted by ASP, and Intel commands an ASP well above AMD's; that is what makes the margins for Intel so much more compelling.

Cost cutting is the primary means of improving gross margins for commodity products (DRAM, NAND flash, orange juice, corn, cattle, gold, etc.), but in microprocessors the primary means of improving gross margins is to improve the ASP.
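
For anyone who wants the arithmetic spelled out, here is a minimal Python sketch using the usual definition, gross margin = (ASP - unit cost) / ASP. Every dollar figure below is invented for illustration and is not an actual Intel or AMD number; the vendor labels are hypothetical too.

```python
def gross_margin(asp, unit_cost):
    """Gross margin as a fraction of the average selling price (ASP)."""
    return (asp - unit_cost) / asp


# HYPOTHETICAL figures: same unit cost, different ASPs.
high_asp_vendor = gross_margin(asp=200.0, unit_cost=50.0)
low_asp_vendor = gross_margin(asp=100.0, unit_cost=50.0)

# The low-ASP vendor then cuts its unit cost by 20%.
low_asp_after_cut = gross_margin(asp=100.0, unit_cost=40.0)

print(high_asp_vendor, low_asp_vendor, low_asp_after_cut)
```

In this made-up case the vendor with the higher ASP sits at a 75% margin against 50%, and even a 20% cost cut only lifts the low-ASP vendor to about 60%. Real margins depend on yields, product mix and pricing that none of us can see from the outside.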
 

zsdersw

Originally posted by: Viditor
I believe that the issue was that each "half" of the early C2Q MCM chips from Intel had to go through the FSB to talk to the other half, thus negating any benefit of having them on the same chip. There was almost no difference between that and having a 2P dual core setup...
The same cannot be said for the Magny Cours...

Actually, no. The primary issue was that there are two separate pieces of silicon in the chip. That's what most were referring to when they said C2Q wasn't a "true" quad-core CPU.