AMD K10.5 is 10-20 percent faster than K10


Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: Viditor
Originally posted by: Kuzi
Notice how most desktop applications (not synthetic) today don't use more than two cores, and even the ones that do make use of more than two cores only get a minor boost in performance (going from 2 to 4 cores), generally speaking of course. So the more cores you have, the harder it becomes to make use of the extra cores. How many programs are out right now that give 4x the performance from 4 cores vs a single core?

As to Hyperthreading in Nehalem (8x threads), my guess is that the increase in performance will be minimal for two reasons. First, it will never be as effective as having 8 real cores, and second is what I mentioned above, that in general most applications today don't get much benefit even from 4 cores. On the server side it might be very different, and my guess is Nehalem with Hyperthreading will show its true strength there.

As to AMD, my guess is that they plan to have two quad-core processors in the same package (non-native), which is what Intel has been doing with their CPUs for a while. You know it's cheaper and easier to do, but even at 45nm the die size would be too big, so maybe they need to wait till 32nm to do that. Good luck AMD.

I agree completely about hyperthreading...in fact, I don't know of a good reason for Intel to be bringing it back (unless it helps with the CSI interface in some way).

As to AMD's upcoming 8 and 16 cores CPUs, I don't know...
What we know so far is:
1. They are part of the Fusion project in that they will be designed for modularity. This lends credence to your prediction of an MCM in that, with all of the different variables (xCPU + xGPU = 8 cores), it would be VERY expensive to design and produce all of them...

2. On the other hand, we also know that they will utilize DC architecture and have a crossbar switch. This has always been a feature of monolithic design only (in fact I don't know how you could do it with an MCM). There's also the problem of how you deal with the on-die memory controller in an MCM...

BTW, while an MCM is cheaper from a yield standpoint, it's not necessarily cheaper overall (they are very expensive to design, which is why AMD doesn't have one).

My understanding is that Nehalem was designed by the recycled Prescott team in Oregon. So injecting hyperthreading is something those guys would be expected to know how to do, versus the Israeli design team.

Also, hyperthreading need not incur a performance penalty just because >1 thread/core is involved. IBM and SUN both have many products in the market which have >1 thread/core and do not incur a thread-level performance penalty. It comes down to design of course, and tradeoffs. I'm just saying that assuming a priori that Nehalem's SMT will be sub-par is not an assumption I'm comfortable making.

Why do hyperthreading? Well, you do save die space and TDP on a normalized thread level. If you can get 50% of the benefit of having a second thread on one core versus a dedicated core, but at only the cost of a 15% increase in die size and a 20% increase in power consumption, then that is a win-win.

Granted if you can't invoke SMT without the performance/die-size or performance/watt paying dividends then you shouldn't bother.
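A back-of-the-envelope check of that trade-off, using the hypothetical 50%/15%/20% numbers from the post (illustrative assumptions, not measured figures for any real chip):

```python
# SMT trade-off sketch, normalized units. Hypothetical figures from
# the post above: a second SMT thread gives 50% of the benefit of a
# real second core, for +15% die area and +20% power.

def perf_per_area_and_watt(throughput, area, power):
    # Returns (performance per mm^2, performance per watt), normalized.
    return throughput / area, throughput / power

# Baseline: one core, one thread.
base = perf_per_area_and_watt(1.0, 1.0, 1.0)

# SMT core: 1.5x throughput for 1.15x area and 1.2x power.
smt = perf_per_area_and_watt(1.5, 1.15, 1.20)

# Two full cores: 2x throughput for 2x area and 2x power.
dual = perf_per_area_and_watt(2.0, 2.0, 2.0)

print(base)  # (1.0, 1.0)
print(smt)   # ~(1.30, 1.25) -> better perf/mm^2 AND perf/W than baseline
print(dual)  # (1.0, 1.0)    -> no better per mm^2 or per watt
```

On these (made-up) numbers the SMT core wins on both efficiency metrics, which is the "win-win" the post describes.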

Why do it for desktop? Well, even if desktop apps don't use >2 threads, you can still make cheaper chips by selling "dual-thread" single core chips (smaller die, less heat, etc).

So I still see this all falling into the bucket of "if you can do SMT for a net gain, why wouldn't you?". To me this is a QED on why do SMT. The question is what does AMD do in response. The arguments above remind me of the early days when the AMD camp was rallying themselves around the "monolithic quad-core will be teh uber, MCM FTL!".

Just because you can do hyperthreading badly, and can make it worse than not having it at all, doesn't mean that is a foregone conclusion. Intel showed MCM was not the end of the world; IBM and SUN have already shown SMT is not the end of the world. So what has AMD got cooking to deal with it? Hopefully not another PR campaign of "our threads are pure! none of this so-called resource sharing will taint the purity of our native thread processors."
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Of course Nehalem having Hyperthreading is an advantage, no one can deny that. I mean worst case is you just turn it off and you still get a processor that is faster than the competition.

Sorry to say, AMD will not be able to claim the highest performance like they once did. Not now, not in a few years even. Intel will not give them that chance anymore. We can at least hope they stay competitive enough and produce CPUs/GPUs with a good price/performance ratio. Second place is not bad as long as you are making money, and AMD hasn't made money in a while.
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: Phynaz
Because Nehalem is very wide and SMT will show some nice performance gains.

Not sure what you mean here...could you elaborate?

I'd be very interested in what makes you think an MCM is any more expensive to design than any other CPU architecture. Especially since the "expense" would be man-hours, and Intel did it in nine months. There's also the "expense" of AMD not being in the quad core market for a year.

I didn't say that it was more expensive for Intel's current design if that's what you mean...
The expense is in designing an MCM around an on-die memory controller with Direct Connect architecture. How would you even do that???

AMD doesn't have an MCM because they picked the wrong strategy. Need I post the link to the Hector Ruiz quote that he wished he had gone MCM instead of native?
Exactly what I just said...it was too damned expensive so he couldn't.
BTW, he didn't say instead of, he said in addition to...and he said it wasn't to be because of the cost.
By all means, post the link!
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: Idontcare

My understanding is that Nehalem was designed by the recycled Prescott team in Oregon. So injecting hyperthreading is something those guys would be expected to know how to do, versus the Israeli design team.

To be clear, I don't think anyone doubts that Intel has the ability to add hyperthreading...my question was why would they want to?

Also, hyperthreading need not incur a performance penalty just because >1 thread/core is involved. IBM and SUN both have many products in the market which have >1 thread/core and do not incur a thread-level performance penalty. It comes down to design of course, and tradeoffs. I'm just saying that assuming a priori that Nehalem's SMT will be sub-par is not an assumption I'm comfortable making.

Again, I don't doubt that Nehalem's SMT will be as good as it was for the P4...
But remember that in some cases, SMT caused a reduction in performance.
However, the real question is: Under what circumstances is SMT effective?
Certainly it's most effective when you have a single core and multiple threads. As the number of cores increase, the effectiveness of SMT decreases unless you change the software to use more and more threads.
As has been pointed out, 3 cores appears to be about the limit with current software.
Since quads are coming to be the norm, I don't see what advantage you get from SMT (which in all cases is less efficient than a full core at retiring a thread).
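The diminishing-returns point can be sketched with a toy scheduling model (everything here is an illustrative assumption, including the 1.5x combined throughput for a 2-way SMT core; nothing is measured):

```python
# Toy model: T runnable software threads on a chip with C cores,
# each core supporting 2-way SMT. Assumed (illustratively): a core
# runs 1 thread at rate 1.0, or 2 threads at 1.5 combined.

def chip_throughput(threads, cores, smt_combined=1.5):
    # Cores that get at least one thread:
    singles = min(threads, cores)
    # Cores that get a second (SMT) thread once all cores are busy:
    doubled = max(0, min(threads - cores, cores))
    return (singles - doubled) * 1.0 + doubled * smt_combined

# With only ~3 software threads on a quad, SMT adds nothing:
print(chip_throughput(3, 4))   # 3.0 -- identical to no-SMT
# With 8 threads, the second thread contexts finally pay off:
print(chip_throughput(8, 4))   # 6.0 vs 4.0 without SMT
```

In this model SMT only earns its keep once the thread count exceeds the core count, which is the crux of the argument either way: it supports the "few desktop threads" point and equally the "heavily threaded server" counterpoint.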

Why do hyperthreading? Well, you do save die space and TDP on a normalized thread level. If you can get 50% of the benefit of having a second thread on one core versus a dedicated core, but at only the cost of a 15% increase in die size and a 20% increase in power consumption, then that is a win-win.

Granted if you can't invoke SMT without the performance/die-size or performance/watt paying dividends then you shouldn't bother.

Why do it for desktop? Well, even if desktop apps don't use >2 threads, you can still make cheaper chips by selling "dual-thread" single core chips (smaller die, less heat, etc).

So I still see this all falling into the bucket of "if you can do SMT for a net gain, why wouldn't you?". To me this is a QED on why do SMT. The question is what does AMD do in response. The arguments above remind me of the early days when the AMD camp was rallying themselves around the "monolithic quad-core will be teh uber, MCM FTL!".

Just because you can do hyperthreading badly, and can make it worse than not having it at all, doesn't mean that is a foregone conclusion. Intel showed MCM was not the end of the world; IBM and SUN have already shown SMT is not the end of the world. So what has AMD got cooking to deal with it? Hopefully not another PR campaign of "our threads are pure! none of this so-called resource sharing will taint the purity of our native thread processors."

The problem I have with this statement is that there really isn't a "normalized thread level".
With the exception of a few specialized apps, 2-3 threads tends to be the lot.
As you point out, hyperthreading increases die size and power usage... and for single and possibly dual core chips I can see the logic. The thing is that both Intel and AMD are aiming to make quad core the smallest die in the fairly near future, so adding the expense of designing and manufacturing an SMT chip makes little sense to me.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Where do you get this stuff? Jay Leno?

SMT on Nehalem will be less effective than on P4? Nehalem has more execution hardware to keep busy, therefore SMT will be more effective.

Increasing the number of cores decreases SMT effectiveness? I guess I'll have to stop buying those 64 core Power 6+ servers, as I'm not buying effective hardware.

3 cores the limit with current software...yeah okay...see above.

Please also enlighten us all on what exactly retiring a thread is. As a matter of fact, how about any proof for anything you just posted?

Thanks,
Q
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: Phynaz
Where do you get this stuff? Jay Leno?

SMT on Nehalem will be less effective than on P4? Nehalem has more execution hardware to keep busy, therefore SMT will be more effective.

Your Mother and I told you that you'd go blind someday...
I never said that, please reread the post.

Increasing the number of cores decreases SMT effectiveness? I guess I'll have to stop buying those 64 core Power 6+ servers, as I'm not buying effective hardware.

Ummm...this is basic logic, a 6 year old should be able to get it. The more cores you add, the less there is for SMT to do.


3 cores the limit with current software...yeah okay...see above.

Effective limit, yes...
Please list for me the software that uses more...


Go ahead and speak your last word...I can see that this is gonna degrade into one of those postings, and I really don't want to head down that path.
 

coldpower27

Golden Member
Jul 18, 2004
1,676
0
76
Originally posted by: Idontcare
Originally posted by: coldpower27
No, you're forgetting to incorporate the fact that a lot of the Barcelona die is core logic; doubling the cache on Barcelona would not make it 570mm2. 1MB of cache at the 65nm node shouldn't be much more than 25mm2 each, so you're looking at close to 400mm2. Still way too costly for anything besides the MP environment though.

At 45nm, 8MB of total cache with the Barcelona core as it is now would probably come in at the 210-270mm2 range, quite doable.

On the topic of cache sizes, I was always surprised that AMD didn't spin a teh uber cache size Opteron targeted at that 8xxx server market. Sure the die size would be intentionally large, sub-600mm^2, but it would mostly be large regions of cache arrays that could be fused off to maintain yields, as well as down-binned to desktop SKUs to ensure the whole mix was saleable material.

I never quite settled on a rationale for why AMD only differentiated their Sempron chips by cache size at the lower end, but not their upper-end Opterons (and now Barcelonas).

In the case of AMD, I guess they wanted a unified product lineup; having one core for ALL MP environments allows AMD to manage its inventory better. And as was said, I think due to AMD's lack of cache expertise back then, as well as being strained for capacity and the fact that they had an IMC, they decided it wasn't worth it in their case to have huge LV3 caches.
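coldpower27's area arithmetic quoted above works out roughly like this. The ~283mm2 Barcelona die size and 4MB total on-die cache are my assumptions from figures commonly cited at the time, not official numbers; the ~25mm2 per MB is the figure from the post:

```python
# Rough sanity check of the quoted cache-area estimate.
# Assumptions: Barcelona at 65nm ~283mm^2 with 4MB total on-die
# cache (4x512KB L2 + 2MB L3). The 25mm^2/MB of SRAM at 65nm is
# the figure from the post.

barcelona_die = 283    # mm^2 at 65nm (assumed, approximate)
total_cache_mb = 4     # MB on-die today (assumed)
mm2_per_mb = 25        # mm^2 per MB of cache at 65nm (from the post)

# Doubling total cache means adding another 4MB of arrays:
doubled_die = barcelona_die + total_cache_mb * mm2_per_mb
print(doubled_die)  # 383 -- "close to 400mm2", nowhere near 570mm2
```

The point of the arithmetic: because most of the die is core logic rather than cache, doubling the cache grows the die far less than doubling the die.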
 

coldpower27

Golden Member
Jul 18, 2004
1,676
0
76
Originally posted by: Kuzi
Originally posted by: coldpower27
Originally posted by: Kuzi
Actually I meant that Shanghai K10.5 will have 8MB (2MB L2 + 6MB L3) cache. So if Phenom's weakness was really the small 512K L2 cache per core, then I'm suggesting AMD engineers would have designed K10.5 to have 1MB L2 cache per core and 4MB L3 shared cache (instead of 6MB). That would still give K10.5 8MB total cache (4MB+4MB), so the die size would be similar to what the actual K10.5 will be at.

Would that be possible? LV3 cache is obviously easier to yield as it is considerably slower, so maybe it's a cost trade-off? Though even with double the total cache, I expect the Shanghai derivatives to be smaller than the current Barcelona.

It is my understanding that as the processor gets more cores, a large shared cache pool becomes more important. I believe even Intel will start producing desktop processors with large L3 caches in the future.

As many people mentioned before, the L2 cache size in AMD CPUs is less important because of the integrated memory controller. It's just that in the current Barcelona/Phenom state, the IMC is running too slow at 1.8GHz, which increases L3 cache latency and lowers overall performance. Hopefully this problem will be rectified when Shanghai is released. And who knows, there could be other variables slowing Phenom right now.

Yeah, that is what Nehalem is shaping up to look like: 8MB of LV3 cache shared among the 4 cores. Right now we don't know yet if they will have each core pair sharing an LV2 cache on Nehalem, though, as that would be pretty sweet.

We know from Dunnington that Intel can do it, having a pair of cores share LV2 and the entire 6-core processor share a large LV3 cache.

 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
Originally posted by: Viditor
Ummm...this is basic logic, a 6 year old should be able to get it. The more cores you add, the less there is for SMT to do.

lol hope you're not a grade school teacher. more cores cost money. fewer SMT cores can do the same job for less money.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: dmens
Originally posted by: Viditor
Ummm...this is basic logic, a 6 year old should be able to get it. The more cores you add, the less there is for SMT to do.

lol hope you're not a grade school teacher. more cores cost money. fewer SMT cores can do the same job for less money.

Yeah I can't figure out why you wouldn't want to do it unless you can't figure out how to do it effectively of course.

Selling dual-core quad-thread teeny-tiny 45nm chips sure makes good business sense as opposed to selling largish 4-core chips. If you have a choice, of course. If you can't SMT, then quad-core chips are viable too.

I personally don't get the direction of the arguments that SMT is a non-issue for AMD, because equally hand-wavy arguments can be made that quad-cores themselves are a non-issue for AMD.

This is a weird position to take, very non-forward-looking, and doesn't really negate the value-add merits of SMT. Shrug, I'm not passionate enough to try and understand everything though, I guess.
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
You know what would be great? If AT or any other big hardware site decided to invest some time/resources looking at why Phenom/Agena has such lackluster performance: testing how the IMC found on Agena impacts overall performance, the difference in performance between increasing just the core clock frequency and increasing the IMC frequency, how it affects the L3 cache and its latency, etc etc. I'm quite surprised that no site has bothered to look into this in depth.

 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: Cookie Monster
You know what would be great? If AT or any other big hardware site decided to invest some time/resources looking at why Phenom/Agena has such lackluster performance: testing how the IMC found on Agena impacts overall performance, the difference in performance between increasing just the core clock frequency and increasing the IMC frequency, how it affects the L3 cache and its latency, etc etc. I'm quite surprised that no site has bothered to look into this in depth.

My inclination is to suspect that if they were considering doing such a scientific analysis of the K10 then they are likely holding out for B3 steppings to be in hand.

Why invest all that time and effort into something (B2) which you could only hope would approximate B3-like results (with the TLB patch disabled), when you could just wait a few months and have the real deal in hand?

Now if Phenom were dominating the top-end of performance even with the TLB issues (i.e. a parallel universe where C2D was never invented and Intel was still fielding the P4), then you could bet there would be all kinds of impassioned editorials out there on how to squeeze an extra 5% out of your teh sick Phenom 9900 by juicing the IMC this way or the L3 cache that way.

As it stands, it's pretty hard to get fired up as a reviewer to spend quality time with a chip that has trouble beating out the absolute slowest quad-core the competition is fielding.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Originally posted by: Viditor
Originally posted by: Phynaz
Where do you get this stuff? Jay Leno?

SMT on Nehalem will be less effective than on P4? Nehalem has more execution hardware to keep busy, therefore SMT will be more effective.

Your Mother and I told you that you'd go blind someday...
I never said that, please reread the post.

Increasing the number of cores decreases SMT effectiveness? I guess I'll have to stop buying those 64 core Power 6+ servers, as I'm not buying effective hardware.

Ummm...this is basic logic, a 6 year old should be able to get it. The more cores you add, the less there is for SMT to do.


3 cores the limit with current software...yeah okay...see above.

Effective limit, yes...
Please list for me the software that uses more...


Go ahead and speak your last word...I can see that this is gonna degrade into one of those postings, and I really don't want to head down that path.

Oracle
SAS
PeopleSoft
SAP R3
Cognos
Business Objects
OpenView
IIS
Apache
Websphere
And thousands more.

How many more of the applications that will use as many cores as you can throw at them would you like me to list?

Now let's see if you turn this into the type of thread you say you don't want...I.E. the one that proves you wrong again.
 

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
K10.5 probably will be 10-20% faster than K10. The problem with that is that people are easily hitting 4ghz+ on E8400 chips! That gives K10.5 a 25% deficit (again) even if it manages to run at 3ghz. There's also the fact that the E8400 can be had for less $$$ than the cheapest Phenoms...

AMD is in big trouble right now. They're only now catching up with the Q6600 chips that came out ages ago.

If they don't pull off Fusion without a hitch, with stellar performance, they're not going to last very much longer at all. :(
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: SickBeast
K10.5 probably will be 10-20% faster than K10. The problem with that is that people are easily hitting 4ghz+ on E8400 chips! That gives K10.5 a 25% deficit (again) even if it manages to run at 3ghz. There's also the fact that the E8400 can be had for less $$$ than the cheapest Phenoms...

AMD is in big trouble right now. They're only now catching up with the Q6600 chips that came out ages ago.

If they don't pull off Fusion without a hitch, with stellar performance, they're not going to last very much longer at all. :(

I thought Fusion was a budget part, not a performance part. Did it change again?
 

harpoon84

Golden Member
Jul 16, 2006
1,084
0
0
Originally posted by: SickBeast
K10.5 probably will be 10-20% faster than K10. The problem with that is that people are easily hitting 4ghz+ on E8400 chips! That gives K10.5 a 25% deficit (again) even if it manages to run at 3ghz. There's also the fact that the E8400 can be had for less $$$ than the cheapest Phenoms...

You're comparing a quad core to a dual core. IF 45nm Phenoms can increase IPC to Penryn levels as claimed, AND clocks to 3GHz, then AMD is in with a shout. Though Intel would probably just increase Penryn clocks in such a case, and we'll be back to square one.
 

heyheybooboo

Diamond Member
Jun 29, 2007
6,278
0
0
Originally posted by: Idontcare
Originally posted by: SickBeast
K10.5 probably will be 10-20% faster than K10. The problem with that is that people are easily hitting 4ghz+ on E8400 chips! That gives K10.5 a 25% deficit (again) even if it manages to run at 3ghz. There's also the fact that the E8400 can be had for less $$$ than the cheapest Phenoms...

AMD is in big trouble right now. They're only now catching up with the Q6600 chips that came out ages ago.

If they don't pull off Fusion without a hitch, with stellar performance, they're not going to last very much longer at all. :(

I thought Fusion was a budget part, not a performance part. Did it change again?

I believe you are correct. Fusion concept - business desktop --> midrange. 'Torrenza' would be enterprise, and to some extent enthusiast - multicore, multisocket, multi-GPU, application accelerators ...


Originally posted by: Cookie Monster
You know what would be great? If AT or any other big hardware sites decide to invest some time/resources looking at why a phenom/agena has such lackluster performance. Testing how the IMC found on agena impacts on overall performance, the difference in performance between increasing just the core clock frequency and increasing the IMC frequency, how it affects the L3 cache and its latency, etc etc. Im quite surprised that no site has bothered to look into this in depth.

I believe this has been done - possibly at XS and other 'extreme' sites. Though improvements in latency did not 'scale' a linear 1:1 with clock speed, there were significant improvements. The example I recall was something like a 50% increase in clock speed resulting in a 35% improvement in latency.

As I understand it (which is not very well :p ), the L1/L2 cache runs at CPU clock speed - the memory controller and L3 cache run at their own independent frequency.

Same with the HyperTransport bus - it has its own unique frequency.

All of these independent frequencies work off of the ol' AMD clock generator. (It's sometimes called the CPU host clock but that is probably misleading.) Each of these 'independent' operations has a unique clock multiplier.

This is not "your old AMD". The memory is not tied to processor frequency BUT it is still a product of the 200MHz clock generator.
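The multiplier scheme described above can be sketched as follows. The multiplier values are example numbers roughly matching a 2.3GHz Phenom with the 1.8GHz NB/L3 domain mentioned earlier in the thread, not any official SKU spec:

```python
# Sketch of how K10-era frequencies derive from the 200MHz
# reference clock via independent per-domain multipliers.
# Multiplier values are illustrative examples, not a real SKU.

REF_CLOCK = 200  # MHz reference ("host") clock

multipliers = {
    "cpu_core": 11.5,       # ~2300 MHz; L1/L2 run in this domain
    "northbridge": 9.0,     # ~1800 MHz; IMC + L3 cache domain
    "hypertransport": 9.0,  # ~1800 MHz; HT link base frequency
}

for domain, mult in multipliers.items():
    print(f"{domain}: {REF_CLOCK * mult:.0f} MHz")
```

Raising the reference clock raises every domain at once, while each multiplier can (within limits) be moved independently, which is why overclocking results depend on which knob was actually turned.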

When some folks say, "I can't OC this new AMD - it's crap," I think the overwhelming majority of them do not understand the new rules they are operating under.

Now part of all this are the issues (which AMD must address) regarding the performance of the HT bus. I would imagine the engineers at AMD understand that, too - lol

But at issue is the overall new "arch". AMD wants to establish all of this "independence". They want their 'cores' to each function independently. They want independent 'power planes'. They want the memory controller and L3 cache to run at their own independent frequency.

I think I understand the principle behind all of this.

And I also think these are issues that Intel is going to have to deal with eventually. The problem for AMD is that Intel has a lot more engineers, resources and $$$ to deal with the issues of "independence" in CPU cores and overall platform development.

 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: harpoon84
Originally posted by: SickBeast
K10.5 probably will be 10-20% faster than K10. The problem with that is that people are easily hitting 4ghz+ on E8400 chips! That gives K10.5 a 25% deficit (again) even if it manages to run at 3ghz. There's also the fact that the E8400 can be had for less $$$ than the cheapest Phenoms...

You're comparing a quad core to a dual core. IF 45nm Phenoms can increase IPC to Penryn levels as claimed, AND clocks to 3GHz, then AMD is in with a shout. Though Intel would probably just increase Penryn clocks in such a case, and we'll be back to square one.

We? Do you mean AMD?
 

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
Originally posted by: harpoon84
Originally posted by: SickBeast
K10.5 probably will be 10-20% faster than K10. The problem with that is that people are easily hitting 4ghz+ on E8400 chips! That gives K10.5 a 25% deficit (again) even if it manages to run at 3ghz. There's also the fact that the E8400 can be had for less $$$ than the cheapest Phenoms...

You're comparing a quad core to a dual core. IF 45nm Phenoms can increase IPC to Penryn levels as claimed, AND clocks to 3GHz, then AMD is in with a shout. Though Intel would probably just increase Penryn clocks in such a case, and we'll be back to square one.
The quad core Penryns will also clock at 4ghz+.

I'm just saying, even at 3ghz and 20% faster, the Phenom will still be at least 25% slower than intel's best offerings.

In many situations, there is no substitute for clockspeed (by that I mean at least one high-performance core). My guess is that in 90% of situations, an E8400 at 4ghz would crush a bug-free Phenom at 3ghz, even 4 cores vs. 2. :beer:
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Hyperthreading:

One reason contributing to the loss in performance with HT enabled on the P4 was the replay system in the Netburst architecture. The additional resources needed to execute the second thread greatly hurt the performance of the CPU with HT.

http://www.xbitlabs.com/articl...pu/display/replay.html

Nehalem, having its fundamental architecture derived from Core 2, won't have that problem. The performance loss from having Hyperthreading will be much less, if any. The great increase in functional units and bandwidth will also help.

The Pentium 4, and the architecture it's based on, Netburst, was from the beginning an "experimental" CPU. It used radical concepts never tried before, and used them very aggressively: aggressive speculation, trace cache, double-pumped ALUs, replay, extremely long pipelines. In the end though, it didn't bring much benefit in the real world. Perhaps in the future we'll have Netburst revisited with a much saner design.
 

harpoon84

Golden Member
Jul 16, 2006
1,084
0
0
Originally posted by: Idontcare
Originally posted by: harpoon84
Originally posted by: SickBeast
K10.5 probably will be 10-20% faster than K10. The problem with that is that people are easily hitting 4ghz+ on E8400 chips! That gives K10.5 a 25% deficit (again) even if it manages to run at 3ghz. There's also the fact that the E8400 can be had for less $$$ than the cheapest Phenoms...

You're comparing a quad core to a dual core. IF 45nm Phenoms can increase IPC to Penryn levels as claimed, AND clocks to 3GHz, then AMD is in with a shout. Though Intel would probably just increase Penryn clocks in such a case, and we'll be back to square one.

We? Do you mean AMD?

Well, we, as the consumer, will be presented with the same situation in such a case. But I guess AMD would be back to square one as well. ;)
 

harpoon84

Golden Member
Jul 16, 2006
1,084
0
0
Originally posted by: SickBeast
The quad core Penryns will also clock at 4ghz+.

I'm just saying, even at 3ghz and 20% faster, the Phenom will still be at least 25% slower than intel's best offerings.

In many situations, there is no substitute for clockspeed (by that I mean at least one high-performance core). My guess is that in 90% of situations, an E8400 at 4ghz would crush a bug-free Phenom at 3ghz, even 4 cores vs. 2. :beer:

Yes, the $1000 QX9650, free of FSB limitations unlike the Q9450/Q9550, can freely overclock to 4GHz+, but it draws in excess of 150W at such speeds.

http://www.anandtech.com/cpuch...howdoc.aspx?i=3184&p=2

Again, *IF* 45nm Phenom can clock to 3GHz *AND* increase IPC to Penryn levels (which I doubt, but we'll see) then AMD would effectively have a CPU equal to Intel's current flagship QX9650. This would be a best case scenario for AMD, and I would expect Intel to respond by releasing Penryn quads at higher clockspeeds up to 3.6GHz, even 3.8GHz perhaps. This should keep Intel in the lead by roughly the same margin as they are today.

However, 4GHz+ as you suggested is out of the question as it would exceed the 125W TDP limit, at least on current steppings of 45nm silicon.
 

coldpower27

Golden Member
Jul 18, 2004
1,676
0
76
Originally posted by: harpoon84
Originally posted by: SickBeast
K10.5 probably will be 10-20% faster than K10. The problem with that is that people are easily hitting 4ghz+ on E8400 chips! That gives K10.5 a 25% deficit (again) even if it manages to run at 3ghz. There's also the fact that the E8400 can be had for less $$$ than the cheapest Phenoms...

You're comparing a quad core to a dual core. IF 45nm Phenoms can increase IPC to Penryn levels as claimed, AND clocks to 3GHz, then AMD is in with a shout. Though Intel would probably just increase Penryn clocks in such a case, and we'll be back to square one.

Though it's been a while since AMD has had a process which clocks higher right away than the previous one, as the 90nm and 65nm SOI processes AMD has utilized so far both initially clocked lower than previous offerings.

Even normalizing for the 65nm process and the improvements in IPC from K8 to K10, you're only at about 2.65GHz for K10 clockspeed normalized to K8 performance (though this is on a quad instead of a dual), or 2.8GHz for K8 itself vs the 90nm SOI 3.2GHz dual core.
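The kind of normalization being done here treats performance as IPC x clock; a minimal sketch of that arithmetic (the 20% IPC uplift is an illustrative assumption, not a measured K8-to-K10 figure):

```python
# Performance ~ IPC x clock. Given an assumed IPC uplift from the
# old core to the new one, the clock the new core needs to match a
# given old-core clock falls out directly.

def equivalent_clock(old_clock_ghz, ipc_uplift):
    # ipc_uplift = 1.20 means the new core does 20% more work per clock
    return old_clock_ghz / ipc_uplift

# Illustrative: matching a 3.2GHz K8 with an assumed ~20% IPC uplift
print(round(equivalent_clock(3.2, 1.20), 2))  # 2.67 GHz
```

With a ~20% uplift the break-even clock lands in the mid-2GHz range, which is the ballpark the post arrives at.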

And if AMD is really going to get 45nm out the door in mid-year, then I expect it to be in August at the earliest.

We'll see, won't we, if there will be any stock 3GHz Phenom parts this year...
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Originally posted by: harpoon84
Yes, the $1000 QX9650, free of FSB limitations unlike the Q9450/Q9550, can freely overclock to 4GHz+, but it draws in excess of 150W at such speeds.

http://www.anandtech.com/cpuch...howdoc.aspx?i=3184&p=2

Again, *IF* 45nm Phenom can clock to 3GHz *AND* increase IPC to Penryn levels (which I doubt, but we'll see) then AMD would effectively have a CPU equal to Intel's current flagship QX9650. This would be a best case scenario for AMD, and I would expect Intel to respond by releasing Penryn quads at higher clockspeeds up to 3.6GHz, even 3.8GHz perhaps. This should keep Intel in the lead by roughly the same margin as they are today.

However, 4GHz+ as you suggested is out of the question as it would exceed the 125W TDP limit, at least on current steppings of 45nm silicon.

I agree here, I don't think Intel would release quad-cores (45nm) at anything higher than 3.6GHz, maybe 3.8GHz. The power draw is too high above that (4GHz+).

So even if Intel has a QXxxxx running at 3.6GHz at that time, and AMD releases K10.5 all the way up to 3GHz initially, it should not be a problem because that Intel CPU may sell for $1000, and AMD can (will have to) sell theirs for much less.

I would worry more about the IPC performance of K10.5; you know they have to compete with Nehalem too. Unless Intel delays it and gives AMD a small break, or they price Nehalem too high initially.

 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: Kuzi
Originally posted by: harpoon84
Yes, the $1000 QX9650, free of FSB limitations unlike the Q9450/Q9550, can freely overclock to 4GHz+, but it draws in excess of 150W at such speeds.

http://www.anandtech.com/cpuch...howdoc.aspx?i=3184&p=2

Again, *IF* 45nm Phenom can clock to 3GHz *AND* increase IPC to Penryn levels (which I doubt, but we'll see) then AMD would effectively have a CPU equal to Intel's current flagship QX9650. This would be a best case scenario for AMD, and I would expect Intel to respond by releasing Penryn quads at higher clockspeeds up to 3.6GHz, even 3.8GHz perhaps. This should keep Intel in the lead by roughly the same margin as they are today.

However, 4GHz+ as you suggested is out of the question as it would exceed the 125W TDP limit, at least on current steppings of 45nm silicon.

I agree here, I don't think Intel would release quad-cores (45nm) at anything higher than 3.6GHz, maybe 3.8GHz. The power draw is too high above that (4GHz+).
So even if Intel has a QXxxxx running at 3.6GHz at that time, and AMD releases K10.5 all the way up to 3GHz initially, it should not be a problem because that Intel CPU may sell for $1000, and AMD can (will have to) sell theirs for much less.

I would worry more about the IPC performance of K10.5; you know they have to compete with Nehalem too. Unless Intel delays it and gives AMD a small break, or they price Nehalem too high initially.

I wouldn't be too hasty in assuming Intel's process engineers are sitting around doing nothing to improve 45nm power consumption characteristics, nor would I assume that Intel doesn't have a stepping team working on at least one more Penryn stepping just as they did with Kentsfield (B3 -> G0).

In fact I expect the upcoming 45nm stepping release here in March to do just that.