Intel Clarkdale previewed


Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: MODEL3
So the 8mm2 die space/MB could be a possible scenario for L3 die+write buffer die+die of additional transistors.

IDC what do you think?

Sounds reasonable to me. They won't use exactly identical SRAM cell sizes and layout for each new design, even on the same process node, though.

As Inteluser is alluding to, changes in associativity matter, as well as the aggressiveness of the die layout in terms of just how much silicon real-estate they budgeted for the sram and the GHz/W profile they wanted to hit with the sram.

Lots of tradeoffs get made, which is why we try to limit the comparisons we make (and the extrapolations from there) based on SRAM comparisons.

SRAM comparisons help, but it's never quite apples to apples. It might be a Gala apple compared to a Red Delicious apple once we get down into the nitty gritty of the design choices that were made from product to product.
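
To make the kind of extrapolation being discussed concrete, here is a minimal sketch of the arithmetic. It assumes MODEL3's 8 mm2/MB figure applies uniformly, which is exactly the simplification cautioned against above; the cache sizes and the `l3_area_mm2` helper are illustrative assumptions, not measured values.

```python
# Rough L3 area estimate from a per-megabyte density figure.
# Assumption: a flat 8 mm^2 per MB (cells + write buffers + extra logic),
# taken from MODEL3's scenario. Real designs vary with associativity,
# layout aggressiveness, and the GHz/W target.
MM2_PER_MB = 8.0


def l3_area_mm2(cache_mb, mm2_per_mb=MM2_PER_MB):
    """Return the estimated die area in mm^2 for a cache of cache_mb megabytes."""
    return cache_mb * mm2_per_mb


# Illustrative cache sizes only (hypothetical inputs).
for cache_mb in (4, 8):
    print(f"{cache_mb} MB L3 -> about {l3_area_mm2(cache_mb):.0f} mm^2")
```

Changing `mm2_per_mb` shows how sensitive the estimate is to the density assumption, which is why these numbers are only a loose guide.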
 

ilkhan

Golden Member
Jul 21, 2006
1,117
1
0
IDC: numbers ARE helpful. Looks like your image does a great job of comparing them too. I agree with your labels as the best we can do with what we know.

I wonder what one of the design team engineers would make of discussions like these. You know, somebody that knows the answers and just can't talk about it.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: ilkhan
I wonder what one of the design team engineers would make of discussions like these. You know, somebody that knows the answers and just can't talk about it.

If they know you, trust you, and are comfortable discussing it offline with you, then you get nice little PMs or emails every now and then to help you avoid making yourself out to be more of a fool than usual while posting in the public domain :laugh: ;) :p

And yes, it infuriates them to no end at how we bumble around and make the silliest of mistakes in our assignments and so on.

Prior to my exiting TI I actually avoided these forums like the plague anytime a new node was going to be released because it was just so painful to read the volumes of misinformation that passed as gospel (like wiki to some extent) in thread after thread while knowing there was no credible way I could weigh in on it without compromising my work confidentiality clauses.

Now that I am on the outside looking in, I try to be respectful of those who cannot post but have so much to say, particularly in those touchy EU/Intel threads :laugh:
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Nice job IDC.
I have a question.
If we look at the 2 core parts, it seems to me they are the same.
Where is the GPU?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: MODEL3
Nice job IDC.
I have a question.
If we look at the 2 core parts, it seems to me they are the same.
Where is the GPU?

Sitting to the left, way way to the left ;)

http://i272.photobucket.com/al...ucket/clarkdaleidf.jpg

(It's MCM'ed; the GPU is on the other die. What you see in that Clarkdale shot is two cores. I merely outlined one of them in red to compare with one scaled Nehalem core, to show how little the architecture appears to have changed.)
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: MODEL3
lol, I must go to sleep.
I put the GPU die into the CPU die.

You are a Renaissance man! Being all fusion/sandybridge on us already :laugh: Here we are stuck in 2009 and your mind is already thinking 2010...save those thoughts and we'll see you in a year to discuss :laugh:
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Originally posted by: Idontcare

You are a Renaissance man! Being all fusion/sandybridge on us already :laugh: Here we are stuck in 2009 and your mind is already thinking 2010...save those thoughts and we'll see you in a year to discuss :laugh:

LOL!


(For those that care)
I think there's a little more than just drive encryption behind putting AES-NI on Westmere. It may have to do with accelerating HDCP decode.

http://www.anandtech.com/video/showdoc.aspx?i=3411&p=2

See how, in AT's article, the path from Application to Graphics Driver to GPU Decode uses AES-128 encryption. That specific part will be 10-15x faster on Westmere cores, which will help lower CPU utilization when playing HD video.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: IntelUser2000
Originally posted by: Idontcare

You are a Renaissance man! Being all fusion/sandybridge on us already :laugh: Here we are stuck in 2009 and your mind is already thinking 2010...save those thoughts and we'll see you in a year to discuss :laugh:

LOL!


(For those that care)
I think there's a little more than just drive encryption behind putting AES-NI on Westmere. It may have to do with accelerating HDCP decode.

http://www.anandtech.com/video/showdoc.aspx?i=3411&p=2

See how, in AT's article, the path from Application to Graphics Driver to GPU Decode uses AES-128 encryption. That specific part will be 10-15x faster on Westmere cores, which will help lower CPU utilization when playing HD video.

It definitely will help. And will this ISA extension make its way into Atom's ISA (or LRBi) at some point? For this type of application it seems all the more needed in the lower-power chips or as part of the IGP.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Hmm. I seriously don't know. It looks like Lincroft doesn't have any ISA extensions so far. ISA extensions end up being transistors, and they might have been too power/die/time constrained to fit them within Lincroft (Moorestown platform).

Continuing from yesterday, ilkhan asked:
"What's the queue in Bloomfield for?"

The Athlon 64 X2 had what's called the System Request Interface, or SRI, which acts as a router for multi-core systems. But it might be simpler than a router, which could be why Intel called it a "queue" (it merely queues requests).

By the way, the upper left portion of Clarkdale looks like it serves a similar function to the queue on Bloomfield. I thought it could have been the PCU, but no matter how bad the scaling would have been, 1 million transistors couldn't take up that much space. :p

Here are more raw shots, without the lines.

http://download.intel.com/pressroom/images/Nehalem.jpg

I like this pic: http://www.3dnews.ru/_imgdata/...intel/nehalem-core.jpg

Slightly off topic: Jasper Forest is yet another die. Jasper Forest is Bloomfield with the PCI Express controller.

Jasper Forest die:
http://www.techpowerup.com/img/09-09-15/143a.jpg

Maybe it will help us.
 

ilkhan

Golden Member
Jul 21, 2006
1,117
1
0
Wow, that Jasper Forest die REALLY looks like they just took Bloomfield and added a section (presumably the PCI-E) on the left between the cores/cache and the gen-IO/QPI link. Is that a lot of empty space, or are the colors just not showing much?
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
I think it's the colors. The section which looks empty is the PCI Express controller. BTW, if you haven't noticed already, it looks like the Lynnfield shots have the die flipped horizontally.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Originally posted by: Idontcare
Here's Hans de Vries' Shanghai and Nehalem annotated diemaps

Wow... that's a really detailed diagram. I'm impressed.

The funny part about the DDR3 partitioning is that I'm used to seeing the cores stacked on top of each other and not side by side, so it takes a while for me to figure out which way is "up". The two halves of DATA on each side are quite right, but the stuff in between is a little off. There's an enormous amount of design reuse in the DDR3 portion, so luckily, if blocks look alike on the die shot, chances are they are the exact same block.

So how come no one wants to try to dissect the core to find the individual pieces?


Originally posted by: IntelUser2000

Jasper Forest die:
http://www.techpowerup.com/img/09-09-15/143a.jpg

Maybe it will help us.

Oh geez, that's a beautiful shot. I could probably break down the DDR3 portion into individual functional blocks with that.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Originally posted by: ilkhan
Wow, that Jasper Forest die REALLY looks like they just took Bloomfield and added a section (presumably the PCI-E) on the left between the cores/cache and the gen-IO/QPI link. Is that a lot of empty space, or are the colors just not showing much?

lol. Funny you'd say that. The portion on the left (where you see a gap) was actually designed in parallel in a different database by another team. We were to basically drop in the design when it was finished, so to me it really did look like a huge blank space that we couldn't touch (i.e. drop signal repeaters into, etc.), at least until the design was dropped in.
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
If the GPU die communicates with the CPU die over a QPI link, that also means the memory controller communicates with the CPU die over a QPI link, correct?

The QPI link (3.2 GHz?) is 12.8 GB/s in each direction, if I remember correctly.

Does anyone know at what speed the memory controller inside the Lynnfield die can communicate?

Also, is this going to result in (slightly) slower performance for the Clarkdales in some applications, relative to the Lynnfield architecture?
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Originally posted by: MODEL3
If the GPU die communicates with the CPU die over a QPI link, that also means the memory controller communicates with the CPU die over a QPI link, correct?

The QPI link (3.2 GHz?) is 12.8 GB/s in each direction, if I remember correctly.

Does anyone know at what speed the memory controller inside the Lynnfield die can communicate?

Also, is this going to result in (slightly) slower performance for the Clarkdales in some applications, relative to the Lynnfield architecture?

First question: Correct

Second question:
Yeah, looks like it. Clarkdale has two disadvantages compared to Lynnfield:

1. 1/2 the shared L3 cache
2. Early benchmarks indicate memory controller performance similar to AMD's IMC, which isn't bad, but is worse than Lynnfield.

Lynnfield: 16-18GB/s
Clarkdale: 10-11GB/s
Penryn: 6-7GB/s
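
To put those ranges side by side, here is a small sketch that takes the midpoint of each quoted range and expresses it relative to Lynnfield. Using the midpoint is my own simplification, and the helper names are hypothetical; the only inputs are the three ranges listed above.

```python
# Measured memory bandwidth ranges quoted above, in GB/s (low, high).
measured = {
    "Lynnfield": (16.0, 18.0),
    "Clarkdale": (10.0, 11.0),
    "Penryn": (6.0, 7.0),
}


def midpoint(bounds):
    """Midpoint of a (low, high) range; a simplification, not a benchmark result."""
    low, high = bounds
    return (low + high) / 2


def relative_to(baseline, platform):
    """Ratio of one platform's midpoint bandwidth to the baseline's."""
    return midpoint(measured[platform]) / midpoint(measured[baseline])


for name, bounds in measured.items():
    share = relative_to("Lynnfield", name)
    print(f"{name}: ~{midpoint(bounds):.1f} GB/s, {share:.0%} of Lynnfield")
```

By this rough measure Clarkdale lands at around 60% of Lynnfield's bandwidth, while still well ahead of Penryn.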
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Originally posted by: IntelUser2000
Originally posted by: MODEL3
If the GPU die communicates with the CPU die over a QPI link, that also means the memory controller communicates with the CPU die over a QPI link, correct?

The QPI link (3.2 GHz?) is 12.8 GB/s in each direction, if I remember correctly.

Does anyone know at what speed the memory controller inside the Lynnfield die can communicate?

Also, is this going to result in (slightly) slower performance for the Clarkdales in some applications, relative to the Lynnfield architecture?

First question: Correct

Second question:
Yeah, looks like it. Clarkdale has two disadvantages compared to Lynnfield:

1. 1/2 the shared L3 cache
2. Early benchmarks indicate memory controller performance similar to AMD's IMC, which isn't bad, but is worse than Lynnfield.

Lynnfield: 16-18GB/s
Clarkdale: 10-11GB/s
Penryn: 6-7GB/s

Yes, for me the L3 is a problem in some applications. I wrote about it 1.5 months ago and some members disagreed with me back then.

Can you clarify the numbers a little (10-11GB/s, etc.)?
I asked about the speed at which the memory controller communicates with the Clarkdale CPU die (and about the Lynnfield case).
 

ilkhan

Golden Member
Jul 21, 2006
1,117
1
0
The GPU portion communicates via QPI; it's basically a slightly cut-down X58 chip plus GPU, on the socket but off the CPU die.
The CPU has half the cache because it's organized in 2MB blocks under the cores on the dies. Less dies, less cache.
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
ilkhan, do you have an estimate regarding the question I asked IntelUser2000?
 

ilkhan

Golden Member
Jul 21, 2006
1,117
1
0
memory <-> memory controller <-> QPI <-> CPU.
Memory performance will be slower than Bloomfield/Lynnfield, but the degree to which that'll show in everyday use is...minimal, at most.

Beyond those, I'm not sure what you're asking. The speed it communicates at? With what? The QPI link? It probably depends on the model, but I'm not sure. Wasn't there a CPU-Z pic of Clarkdale around here somewhere? That would show the QPI speed on it. I'd find it but I'm headed for bed. ;)
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Originally posted by: MODEL3
Originally posted by: ilkhan
Your comment about a 32nm Q8200 intrigues me, but I really don't think there's a spot for it in their lineup. Intel already has 9 SKUs planned for s1156, from 2.8GHz duals @ $87 to 2.93GHz quads @ $562. While I think they could get to a $100 quad if they really wanted to (perhaps, as you say, a slow 32nm quad with 4MB cache), I think that faster duals are better for the low end of the market at this point. $87, $123, $143, $176, $196, $284, $562. There might be room for a $99 SKU, but the rest is pretty tight.

AMD has what, 5? 6? different die designs out for Phenom/Athlon II already? I can't keep track of the AMD side; the names have no order and it's not like AMD keeps their word on it anyway (core/cache unlocking, I'm looking at you).

Probably there is not a spot in their lineup, but certainly there isn't a spot for a 32nm 8200 based on Intel's strategy (what Intel is trying to do to the chipset/IGP/VGA competition...)

Although it would be good for people who have dual cores, like the 65nm E1X00 / E2XX0 / E4X00 / E6X00 or the 45nm E3X00 / E5X00 / E6X00, to be able to upgrade for little money to a 32nm 775 quad, I don't think it is very likely to happen...

I just wanted to point out what kind of die size increase the integration of the memory controller and PCI Express brings, on top of the Nehalem architecture relative to the Core 2 architecture (which is natural...)

Actually, the 32nm Clarkdale CPU die alone (not including the separate IGP die) should be just a little bit bigger (in size, mm2) than a 32nm 8200 (with 4MB native cache) die.

I like this little info vid from Intel. It's simple enough, and well done for what it entails.

http://www.youtube.com/watch?v...RROZmQ&feature=channel

 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Originally posted by: ilkhan
memory <-> memory controller <-> QPI <-> CPU.
Memory performance will be slower than Bloomfield/Lynnfield, but the degree to which that'll show in everyday use is...minimal, at most.

Sure, my friend.
I agree completely.
This is an easy thing to figure out.
I am asking something else (see below).

Originally posted by: ilkhan
Beyond those, I'm not sure what you're asking. The speed it communicates at? With what? The QPI link? It probably depends on the model, but I'm not sure. Wasn't there a CPU-Z pic of Clarkdale around here somewhere? That would show the QPI speed on it. I'd find it but I'm headed for bed. ;)

Let me be clearer about what I am asking.

Clarkdale case:
The CPU die communicates with the GPU die over a QPI link.
This means that the memory controller, which is in the GPU die, communicates with the CPU die at whatever speed the QPI link can provide.
Like I said, the QPI link at 3.2 GHz delivers 12.8 GB/s in each direction.
So depending on the QPI clock speed we get a different communication speed.
I asked if someone knows what the QPI clock speed will be. (As you say, the QPI link will probably have different clock speeds depending on the model, but the variation (%) will probably be small.)
I am asking if someone knows the GB/s range.
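
For reference, the 12.8 GB/s figure follows from the link clock like this. The sketch assumes a full-width QPI link that moves 2 bytes of payload per transfer in each direction and transfers on both clock edges; the 2.4 GHz entry is only an illustrative slower setting, not a known Clarkdale spec, and `qpi_bandwidth_gb_s` is a made-up helper name.

```python
# QPI per-direction bandwidth from the link clock.
# Assumptions: double data rate (2 transfers per clock) and
# 16 data bits = 2 bytes of payload per transfer, per direction.
TRANSFERS_PER_CLOCK = 2
BYTES_PER_TRANSFER = 2


def qpi_bandwidth_gb_s(clock_ghz):
    """Return the peak one-way QPI bandwidth in GB/s for a given link clock in GHz."""
    transfer_rate_gt_s = clock_ghz * TRANSFERS_PER_CLOCK
    return transfer_rate_gt_s * BYTES_PER_TRANSFER


# 3.2 GHz is the clock quoted above; 2.4 GHz is a hypothetical slower link.
for clock_ghz in (2.4, 3.2):
    print(f"{clock_ghz} GHz QPI -> {qpi_bandwidth_gb_s(clock_ghz):.1f} GB/s per direction")
```

So a slower QPI clock scales the ceiling down linearly, and that ceiling is a peak link rate, not what a memory benchmark would actually report.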

Lynnfield case:

The memory controller is within the CPU die.
With this kind of connection what is the communication speed?

 
Apr 20, 2008
10,067
990
126
These speeds are not amazing, but if it's priced well it should be fine. If it cannot compete in the budget sector now, then Intel might have a big problem. AMD looks like they will be all right against this new generation. This will not bode well with their investors if the performance and price are not good. I don't think it's just me, but these look pretty messed up. I severely doubt their new 32 nanometer quad cores would perform so horribly.