What if they had made northwood at 90nm?

Maximilian · Feb 18, 2006

What if prescott never materialised and they moved northwood to 90nm? For a cpu that runs hotter needs +200mhz and more cache to perform the same as a lower clocked northwood its pretty crap. Why didnt they just move northwood to 90nm? The change in process alone would have allowed for more clockspeed without having to lengthen the pipeline no?

AzNPinkTuv · Feb 18, 2006

then my friend, intel wouldnt be intel

stevty2889 · Feb 18, 2006

Unfortunatly, no, they tried getting higher clock speeds out of Northwood, but just couldn't do it without lengthening the pipeline.

Maximilian · Feb 18, 2006

O... well its nice to know they tried

stevty2889 · Feb 18, 2006

Originally posted by: Soviet
O... well its nice to know they tried

heh, they had to try something, when prescott was so far behind on it's release.

StrangerGuy · Feb 18, 2006

I'll bet it will runs just as hot as the Prescotts. As far as I can tell, it was the 90nm process that caused so much problems for Intel, not the Prescott core itself.

dexvx · Feb 18, 2006

Originally posted by: StrangerGuy
I'll bet it will runs just as hot as the Prescotts. As far as I can tell, it was the 90nm process that caused so much problems for Intel, not the Prescott core itself.

Do you know what a Dothan is. Same timeframe for release date.

StrangerGuy · Feb 18, 2006

Originally posted by: dexvx

Originally posted by: StrangerGuy
I'll bet it will runs just as hot as the Prescotts. As far as I can tell, it was the 90nm process that caused so much problems for Intel, not the Prescott core itself.

Click to expand...

Do you know what a Dothan is. Same timeframe for release date.

Dothan is optimized for low power consumption from the ground up. P4s are not. Plus I have seen somewhere that Dothan consumes more idle power than (possibly same-clocked) Banias

stevty2889 · Feb 19, 2006

Originally posted by: StrangerGuy
I'll bet it will runs just as hot as the Prescotts. As far as I can tell, it was the 90nm process that caused so much problems for Intel, not the Prescott core itself.

THe prescott design was a big part of the problem, they increased the pipeline length, added EM64T, doubled the cache size, built in vanderpool, and execute disable bit, which all added a lot of transistors. A die shrunk northwood would have run cooler, they just were unable to get a higher clock speed without increasing the pipeline length. They did the right thing with cedarmill/presler however, but just doing a die shrink, without drasticly changing the design, and it uses a little less power, and runs around 10c cooler than Prescott/Smithfield.

Cooler · Feb 19, 2006

Then again they only got 400mhz on Prescotts over the northwood core so they might have been better off with the shrink.

stevty2889 · Feb 19, 2006

Originally posted by: Cooler
Then again they only got 400mhz on Prescotts over the northwood core so they might have been better off with the shrink.

Maybe if they added some L2 cache and SSE3/EM64T to northwoods it would have helped some, as long as the cache didn't have the higher latency like the Prescotts. They would have just had added features though, and not gained any clock speed for 3 years, even if it was only 400mhz.

frostedflakes · Feb 19, 2006

Netburst was supposed to scale to 10GHz... hence the reason for increasing the length of the pipeline. Die shrinks are only good for very small increases in clock speed, but if you want to make huge jumps you need to make a longer pipeline. Die shrinks are more important for reducing cost and reducing voltage/heat. For whatever reason, a long pipeline just creates a lot of heat. Don't know exactly why, although I think it may have something to do with the leakage currents?

IMHO Netburst was a huge engineering failure, probably one of the biggest in this industry, Intel totally underestimated the heat caused by a longer pipeline (or had assumed die shrinks/lower voltage could keep it in check, but even at 65nm Netburst consumes more than some 90nm and 130nm products). This is why they have ditched it and are going for a more P6 like design -- short pipe, low GHz, high IPC, multiple cores, large cache, etc.

dmens · Feb 19, 2006

Prescott wasn't a shrink. It was supposed to be. If it were a straight shrink done by the northwood team, it would've hit 5ghz, probably higher.

Fox5 · Feb 19, 2006

Originally posted by: Soviet
What if prescott never materialised and they moved northwood to 90nm? For a cpu that runs hotter needs +200mhz and more cache to perform the same as a lower clocked northwood its pretty crap. Why didnt they just move northwood to 90nm? The change in process alone would have allowed for more clockspeed without having to lengthen the pipeline no?

If prescott had hit 5ghz as planned, it wouldn't have seemed like such a dumb move. Prescott basically hit the heat output levels planned for 5ghz at like 3ghz.

Oh, and there's no guarantee northwood would have clocked much higher at 90nm, AMD, IBM, and Intel haven't been getting much in the way of clock speed increases out of smaller processes. Going from 130nm to 90nm, AMD only gained about 200mhz (minus additional things like SOI), same for IBM, and Intel going from 90nm to 65nm only gained about the same. (I consider it a 200mhz gain on the p4s, because the only 3.6ghz and 3.8ghz p4s had to use throttling, while the new 3.6ghz p4s don't)

Furen · Feb 19, 2006

I dont think Netburst is an engineering failure but rather an architecture that hit the limits of the fabrication methods. AMD also experienced pretty horrible heat problems and the A64 would have probably been a scorcher, too, without SOI. And both of these architectures are pretty much as different as they can be, one is deep and narrow and the other is shallow and wide. We claim that prescott bites just because Intel wasn't able to make it hit its 5GHz clock speed target... at that clock it would (at least) match AMD's best.

Remember that hindsight is 20/20. Intel probably wouldn't have been able to get more than 600-1000MHz out of a Northwood die-shrink. It might sound like a lot but the prospect of a 4GHz-4.4GHz Pentium 4 by 2006 would have sounded ridiculously laughable back in 2003. Intel went from 1GHz to 3.06GHz between March 2000 and November 2002 so it expected its 90nm Desktop CPU to be able to scale clocks much better than that when it was launched in 2h03 (as originally intented). It was Intel's goal to have Prescott reach around 5GHz by this time, which would have been a considerable slowdown in clock speed increases (clocks basically tripled in the two-and-a-half years before 2003) but considering how hammer was taking longer than anticipated (I'd venture say heat and yield problems, since hammers were huge) Prescott would have been the top dog for a while after its launch.

Regardless, A die-shrunk Northwood would have lacked x86-64, which would have been a huge blow to Intel in the server arena.

stevty2889 · Feb 19, 2006

Originally posted by: dmens
Prescott wasn't a shrink. It was supposed to be. If it were a straight shrink done by the northwood team, it would've hit 5ghz, probably higher.

I don't think they would have gotten much more than 3.6ghz out of the northwood even with a die shrink. They tried using a couple extra layers to squeeze some more out of the northwood when the prescott's release date got behind schedule, but they just couldn't get any more out of it without increasing the pipeline length.

dmens · Feb 19, 2006

Maybe. All I know was prescott is not resentative of anything at all.

thecoolnessrune · Feb 19, 2006

Originally posted by: Furen
I dont think Netburst is an engineering failure but rather an architecture that hit the limits of the fabrication methods. AMD also experienced pretty horrible heat problems and the A64 would have probably been a scorcher, too, without SOI. And both of these architectures are pretty much as different as they can be, one is deep and narrow and the other is shallow and wide. We claim that prescott bites just because Intel wasn't able to make it hit its 5GHz clock speed target... at that clock it would (at least) match AMD's best.

Remember that hindsight is 20/20. Intel probably wouldn't have been able to get more than 600-1000MHz out of a Northwood die-shrink. It might sound like a lot but the prospect of a 4GHz-4.4GHz Pentium 4 by 2006 would have sounded ridiculously laughable back in 2003. Intel went from 1GHz to 3.06GHz between March 2000 and November 2002 so it expected its 90nm Desktop CPU to be able to scale clocks much better than that when it was launched in 2h03 (as originally intented). It was Intel's goal to have Prescott reach around 5GHz by this time, which would have been a considerable slowdown in clock speed increases (clocks basically tripled in the two-and-a-half years before 2003) but considering how hammer was taking longer than anticipated (I'd venture say heat and yield problems, since hammers were huge) Prescott would have been the top dog for a while after its launch.

Regardless, A die-shrunk Northwood would have lacked x86-64, which would have been a huge blow to Intel in the server arena.

QFT :thumbsup: Well written.

frostedflakes · Feb 20, 2006

But no matter how you cut it, long pipes are not efficient. Even if we had better fabrification methods to keep heat output of Netburst in check, a shorter pipeline CPU using the same processes could perform just as well while using less power.

I think it kind of serves Intel right for Netburst to backfire so bad on them. Next time they shouldn't let marketing control their engineering, as Netburst way basically about "hey look at us, we have more GHz than the competition -- our product rox." Heck, when I first started getting into computers, I fell for the hype.

DrMrLordX · Feb 20, 2006

I'm surprised that Northwoods crapped out at 3.6 ghz. I thought their wall was 4 ghz or so?

dmens · Feb 20, 2006

Was everyone carping about netburst when northwood was doing well? It had a 25 stage pipeline, and even with all the extra high activity transistor gate width of a netburst pipeline, its thermals weren't that bad. In fact, if intel had the foresight to let the experienced northwood team (instead of another team which will go unnamed) handle the prescott project and slapped a memory controller on there, it'd be doing pretty well against A64. Prescott was netburst style, but it was a pretty awful implementation. Same goes for the abandoned tejas (again, that was not the northwood team).

As for the whole marketing driven design, that is overhyped. You'd have to be insane to imagine some marketing guy sitting in the conceptual phase of the project. Most of the architects I know would probably quit outright if that happened.

I work with plenty of engineers who bitch about prescott (and rightfully so), but they all know the problem was implementation, first and foremost. The concept behind netburst is sound, even if the P4 architects were told to ignore thermals because of intel's faith in scaling transistor technology faster than everyone expects. There is nothing inherently wrong with a narrower pipeline with a higher clock if you can keep it fed. Note how many design decisions on the P4 were geared towards that purpose: the trace cache/predictor, replay, deeper memory buffers, etc.

As for all the I-told-you-so efficiency preachings, if people were so concerned about efficiency, nobody would do branch prediction, or load speculation, or many other innovations now considered standard in the industry. The key is to strike a balance on both hardware and software. And there lies yet another problem that intel did not expect, the fact that the P4 optimized were not picked up by the development community. If intel did manage to own say, 99% of the market, the developers would probably have switched. The performance difference between conservative and aggressively P4-oriented codes is quite amazing.

The netburst path was chosen based on many predictions, so of which worked out, and some didn't. It isn't a simple case of "long narrow pipe bad". The uarch could well have worked if everything lined up correctly. Hell, netburst concepts are still used for next-next generation projects. TheInq had an article today that revealed its name, heh.

Rock Hydra · Feb 20, 2006

Originally posted by: dmens
Was everyone carping about netburst when northwood was doing well? It had a 25 stage pipeline, and even with all the extra high activity transistor gate width of a netburst pipeline, its thermals weren't that bad. In fact, if intel had the foresight to let the experienced northwood team (instead of another team which will go unnamed) handle the prescott project and slapped a memory controller on there, it'd be doing pretty well against A64. Prescott was netburst style, but it was a pretty awful implementation. Same goes for the abandoned tejas (again, that was not the northwood team).

As for the whole marketing driven design, that is overhyped. You'd have to be insane to imagine some marketing guy sitting in the conceptual phase of the project. Most of the architects I know would probably quit outright if that happened.

I work with plenty of engineers who bitch about prescott (and rightfully so), but they all know the problem was implementation, first and foremost. The concept behind netburst is sound, even if the P4 architects were told to ignore thermals because of intel's faith in scaling transistor technology faster than everyone expects. There is nothing inherently wrong with a narrower pipeline with a higher clock if you can keep it fed. Note how many design decisions on the P4 were geared towards that purpose: the trace cache/predictor, replay, deeper memory buffers, etc.

As for all the I-told-you-so efficiency preachings, if people were so concerned about efficiency, nobody would do branch prediction, or load speculation, or many other innovations now considered standard in the industry. The key is to strike a balance on both hardware and software. And there lies yet another problem that intel did not expect, the fact that the P4 optimized were not picked up by the development community. If intel did manage to own say, 99% of the market, the developers would probably have switched. The performance difference between conservative and aggressively P4-oriented codes is quite amazing.

The netburst path was chosen based on many predictions, so of which worked out, and some didn't. It isn't a simple case of "long narrow pipe bad". The uarch could well have worked if everything lined up correctly. Hell, netburst concepts are still used for next-next generation projects. TheInq had an article today that revealed its name, heh.

Interesting.

frostedflakes · Feb 20, 2006

What is the concept behind Netburst, though? "Stretching out the pipeline faster than we can control the heat output inherent to stretching out the pipeline"? As semiconductor technology advances, I think small increases in pipeline length are necessary and prudent, but I think Intel tried to do too much too fast with Netburst. For example, AMD went from a 10-stage pipeline with K7 to a 12-stage pipeline with K8 to (I'd assume) allow them to bump up the clocks a bit. SOI and other technologies would allow them to keep heat under control, so they went for it. I think Intel is doing the same thing with P6 --> P8 (I want to say 12-stage --> 14-stage, but I'm not sure this is completely accurate).

Then again, I'm no engineer (hope to be one in a few years, though), but this is how I understand these things at least. And like you mentioned it's easy to look back on things now and ask "What the heck were they thinking?" I'm sure Intel had no idea heat output would spiral out of control so quickly, otherwise they would've stuck with P6. They definitely deserve kudos for trying out something new, it's just a shame it didn't work out as expected.

dmens · Feb 20, 2006

Not quite. The netburst concept is to increase instruction throughput by stretching the pipeline, then keeping it busy with a combination of accurate speculations and quick restarts.

P5 to P6 went from 5 to 12 stages, and P6 worked out pretty well. As I said, the P4 architects were basically given a blank check on thermals because at the conclusion of the P6, they had so many transistors the engineers didn't know what to do with them. Besides, most of the time, if you want to add width to the machine, it'll cost a pipeline stage because the frequency penalty without adding the pipestage will end up tanking your real performance. The issue is determined by examining the ROI on various axes. So there's nothing wrong with a longer pipeline, especially if the lengthened portions are not critical to performance.

Fox5 · Feb 20, 2006

I'd say Netburst is a better design than the Athlon core, longer IS better than wider. Typically parallel problems can easily be made sequential, but the reverse isn't true, so Netburst should have a higher efficiency than Athlon. (when you consider Athlon has additional execution units)

And it's not like the Athlon and P-M's don't feature fairly long pipelines of their own.

BTW, didn't the final P3s have severe heat problems as well? Maybe we'll see a P4 based design come back in a few generations.

What if they had made northwood at 90nm?

Lifer

Senior member

Diamond Member

Lifer

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Platinum Member

Diamond Member

Golden Member

Diamond Member

Platinum Member

Diamond Member

Diamond Member

Lifer

Platinum Member

Diamond Member

Diamond Member

Platinum Member

Diamond Member