Intel's secret weapon against Hammer?

KenAF

Senior member
Jan 6, 2002
684
0
0
We recently learned all about "Prescott," Intel's next-generation Pentium 4. We learned it will have twice the L1 data cache (16KB), twice the L2 cache (1MB), an 800MHz system bus, 13 new instructions that might be SSE3, reduced latency on integer multiplies, more write combining buffers, improved prefetching and branch prediction, plus "Hyperthreading 2."

Who cares, right? Nothing for AMD to worry about. It's just a P4 with more cache, SSE3, and a few other minor optimizations and improvements. Or is it?

Chiparchitect.com took a detailed look at the Prescott and Northwood core (die) pictures. And they came up with some very interesting conclusions...it seems that Intel is still keeping many of Prescott's improvements under wraps.

First, they compared the size of a 256K L2 block in both Northwood and Prescott to the size of the trace cache. On Northwood, a 256K L2 block is 2.4 times the size of the t-cache, while a 256K L2 block on Prescott is only 1.6 times the t-cache size. Thus, it appears that Prescott has a ~160kByte (16K uOps) trace cache that is about 33% larger than the 12K uOps trace cache on Northwood and Willamette. This change will almost certainly improve IPC. Indeed, this change is even more significant, from a design standpoint, than the extra L1 and L2 cache.
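As a rough sanity check on those ratios, here is a back-of-envelope sketch. It assumes a 256K L2 block is a fair common yardstick across both dies, which is my assumption and not something the article states. The area ratio alone gives about 1.5x, so a 16K uOps estimate (1.33x) only fits if part of the extra area goes to something other than uOp storage.

```python
# Back-of-envelope check of the die-photo ratios quoted above.
# Assumption (not from the article): a 256K L2 block is a usable common
# yardstick for area on both dies.

NORTHWOOD_L2_TO_TC = 2.4   # 256K L2 block area / trace cache area (Northwood)
PRESCOTT_L2_TO_TC = 1.6    # same ratio measured on Prescott
NORTHWOOD_TC_KUOPS = 12    # Northwood trace cache capacity, in K uOps


def trace_cache_growth(old_ratio: float, new_ratio: float) -> float:
    """Relative trace-cache area growth, measured in units of one L2 block."""
    return old_ratio / new_ratio


growth = trace_cache_growth(NORTHWOOD_L2_TO_TC, PRESCOTT_L2_TO_TC)

# Upper bound on capacity if every bit of extra area were uOp storage.
upper_bound_kuops = NORTHWOOD_TC_KUOPS * growth

print(f"trace cache area growth: {growth:.2f}x")
print(f"capacity if all area were storage: {upper_bound_kuops:.0f}K uOps")
print(f"16K vs 12K uOps: {16 / 12 - 1:.0%} larger")
```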

Secondly, when considering the increased trace cache, and after comparing the layout of the pipeline stages for Northwood and Prescott, they've come to the conclusion that Prescott is actually a 4-way design; that is, it issues and retires 4 instructions per clock cycle, up from the 3 in Willamette and Northwood. This would represent a significant design change from the Pentium 4--a potential 33% improvement in the work performed every cycle. Of course, there wouldn't be much point to this change without additional execution resources...

Perhaps the most significant revelation from Chiparchitect.com's analysis is the apparent difference in the Rapid Execution Engine (Intel's term for their double-pumped ALUs running at twice the clock speed). According to their observations, the Prescott appears to DOUBLE the number of Rapid Execution Engines. That is, whereas the Northwood and Willamette have a single [effective] 32-bit ALU running at twice chip frequency, the Prescott appears to mirror or "double up" this silicon. Said another way, the Prescott appears to double the number of integer execution resources on the current P4--this is the type of thing you would expect of a dual core processor.

In summary....if their silicon observations hold up, then Prescott will be more than deserving of the Pentium 5 name. Whereas the current 3.06GHz Northwood issues 3 instructions per cycle from 2 threads to 1 execution core, it looks like Prescott may do 4 instructions per cycle from 2 threads to 2 execution cores. This would *significantly* increase IPC and Hyperthreading throughput on Prescott. Whereas the current P4 may get a 5% to 15% boost (on average) with Hyperthreading, Prescott could well get a 30% to 90% improvement with Hyperthreading.

It's not inconceivable that Prescott would exceed the Athlon in IPC with these changes. And it could come close to Hammer's IPC. All while running at substantially higher clock speeds...

Comments? Thoughts?
 

paralazarguer

Banned
Jun 22, 2002
1,887
0
0
Alright. They neglect to mention that prescott will also have more stages in its pipeline to reach higher clock frequencies.
It's not inconceivable that Prescott would exceed the Athlon in IPC with these changes

That statement is ridiculous at best.
 

Electrode

Diamond Member
May 4, 2001
6,063
2
81
You mention dual 32-bit ALU's. Anyone think that it will be possible to use them together, as a 64-bit ALU? Might those new instructions be akin to x86-64?
 

AgaBoogaBoo

Lifer
Feb 16, 2003
26,108
5
81
I'm thinking that they could very well be used together for 64-bit. What they might do, later on after the 32-bit release, is release a 64-bit version of the same chip with a different ALU design. Another thing: is it possible for the BIOS to let a user pick between 64-bit and 32-bit, assuming they make it so the two 32-bit ALUs can work together as a "virtual" 64-bit ALU?
 

KenAF

Senior member
Jan 6, 2002
684
0
0
That statement is ridiculous at best.
The Athlon has more execution resources than the P4, so of course it has higher IPC. But based on their findings, that will change with Prescott; for integer operations, Prescott will have 33% more usable execution resources than Athlon, rather than 33% less (based on available ports).

Alright. They neglect to mention that prescott will also have more stages in its pipeline to reach higher clock frequencies.
Prescott is believed to have two additional pipeline stages. But Prescott's improved prefetching and branch prediction should more than make up for the difference; AMD has touted the benefits of these changes to the Hammer as well. With the second "Rapid Execution Engine," an extra pipeline stage or two would be required to support writeback of results to both L1 caches/register files. So the added pipeline stages in that case would increase IPC, not lower it.
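To put the pipeline-depth argument in numbers, here is a toy model of branch misprediction cost. Every figure in it (branch fraction, mispredict rates, penalties) is a made-up illustrative value, not a measurement of Northwood or Prescott. It only shows that a slightly better predictor can offset a slightly longer penalty.

```python
# Toy model: cost of branch mispredictions as pipeline depth changes.
# All numbers are illustrative assumptions, not measured figures.


def effective_cpi(base_cpi: float, branch_fraction: float,
                  mispredict_rate: float, penalty_cycles: int) -> float:
    """Base CPI plus average stall cycles per instruction from mispredicts."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical shorter pipeline: 20-cycle mispredict penalty.
shallow = effective_cpi(base_cpi=1.0, branch_fraction=0.20,
                        mispredict_rate=0.05, penalty_cycles=20)

# Hypothetical pipeline with two extra stages and a better predictor.
deep = effective_cpi(base_cpi=1.0, branch_fraction=0.20,
                     mispredict_rate=0.04, penalty_cycles=22)

print(f"shorter pipeline CPI: {shallow:.3f}")
print(f"deeper pipeline CPI:  {deep:.3f}")
```

With these assumed inputs the deeper pipeline comes out slightly ahead (lower CPI), but the result flips if the predictor improves less than the penalty grows.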

You mention dual 32-bit ALU's. Anyone think that it will be possible to use them together, as a 64-bit ALU? Might those new instructions be akin to x86-64?
The author doesn't seem to think they are implemented that way, but he does think the dual / mirrored "Rapid Execution Design" could have been originally designed / intended to support 64-bit, before it was modified for Prescott. So though Prescott almost certainly doesn't support 64-bit, this technology may have come out of Intel's "Yamhill" 64-bit research.
 

alexruiz

Platinum Member
Sep 21, 2001
2,836
556
126
Originally posted by: KenAF
It's not inconceivable that Prescott would exceed the Athlon in IPC with these changes. And it could come close to Hammer's IPC. All while running at substantially higher clock speeds...

Comments? Thoughts?

Let's wait and see... It will be better than Northwood, but reaching the same IPC as an Athlon... well, either they took a very good bribe or the stuff they are smoking is decreasing in quality... ;)

I still think that for a CPU design that generated over 2000 patents, there WILL be a lot more than just SSE2 support, HyperTransport and an integrated memory controller... Time will tell.
 

OddTSi

Senior member
Feb 14, 2003
371
0
0
I think that all of this is just as hypothetical as (perhaps even more so than) all of the things that the Inq posts on their site. It's good speculation to base some discussion on, but that's all it is: speculation.

As far as the 64-bit processor thing goes, why is it that everyone thinks that going from a 32-bit to a 64-bit processor automatically nets a 100% performance increase?
 

EdipisReks

Platinum Member
Sep 30, 2000
2,722
0
0
Originally posted by: alexruiz
well, either they took a very good bribe or the stuff they are smoking is decreasing in quality

no, no, no, it's increasing in quality. if it were decreasing, they would cough more, not believe more. regardless, i wanna see Prescott in action, not read some opinion that is based on a what if of a what if. that is a lot to come up with from a picture, after all.
 

Duvie

Elite Member
Feb 5, 2001
16,215
0
71
I agree we will see... But if the AMD Hammer is any more tardy, then even if Prescott didn't have these things now, it could by then... AMD still needs to get those Hammers up to the level they need to be, or they will come out neck and neck and not far ahead as once thought.

I am more interested to see how skewed and screwed up the PR rating will be... Regardless of whether it is compared to a T-bird or not, the fact remains that most see it as a pretty good comparison to a P4 Northwood, and reviews back that up at the higher speeds. Every advancement Intel makes will widen the gap even further...
 

Vegito

Diamond Member
Oct 16, 1999
8,329
0
0
all i can say is that i spend a lot of money crunching numbers in excel and I bet ya, this wont be fast enough... i got a few dual 2.8 xeons and crap and excel is still slow..
 

paralazarguer

Banned
Jun 22, 2002
1,887
0
0
No kidding. I have to run a couple of macros at work each day dealing with Access databases and some Excel spreadsheets, and they take 10-30 minutes each to run (on a 1GHz machine). Hurry up computers, get faster!
 

AgaBoogaBoo

Lifer
Feb 16, 2003
26,108
5
81
For one, I don't think this. All I know is that 64bit is the next giant leap for speed to increase greatly.
 

Bovinicus

Diamond Member
Aug 8, 2001
3,145
0
0
It's a good looking CPU. AMD definitely has fierce competition on their hands. I think they will continue to at least make CPUs with a lot of value, but they may not be able to compete in the high end. Only the benchmarks will tell...
 

CZroe

Lifer
Jun 24, 2001
24,195
857
126
If only AMD could have created a new standard LAST year
 

zephyrprime

Diamond Member
Feb 18, 2001
7,512
2
81
You mention dual 32-bit ALU's. Anyone think that it will be possible to use them together, as a 64-bit ALU? Might those new instructions be akin to x86-64?
It just doesn't work that way.
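One way to see part of the problem is a software model of a 64-bit add built from two 32-bit adds. This is only a sketch and says nothing about how Prescott's ALUs are actually wired; the helper names `add32` and `add64_from_halves` are made up for illustration. The point is that the high half cannot finish until the low half has produced its carry, so two independent 32-bit units are not automatically one 64-bit unit. A real 64-bit design also needs wider registers and address paths.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def add32(a: int, b: int, carry_in: int = 0) -> Tuple[int, int]:
    """Model a 32-bit adder: returns (32-bit sum, carry out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64_from_halves(x: int, y: int) -> int:
    """64-bit add built from two 32-bit adds chained by the carry."""
    lo, carry = add32(x, y)                  # low halves go first
    hi, _ = add32(x >> 32, y >> 32, carry)   # high halves must wait for carry
    return (hi << 32) | lo                   # overflow past bit 63 is dropped


x = 0x00000001_FFFFFFFF
y = 0x00000000_00000001
print(hex(add64_from_halves(x, y)))
```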

Ever since I heard that the prescott would have more pipeline stages, I realized that the Prescott was a big P4 update. Way way more than the northwood. But if it really has dual integer cores, that would be something else.
 

jjyiz28

Platinum Member
Jan 11, 2003
2,901
0
0
Originally posted by: Wingznut
Originally posted by: AgaBooga
For one, I don't think this. All I know is that 64bit is the next giant leap for speed to increase greatly.
Maybe you ought to read the AnandTech FAQ,

The myths and realities of 64-bit computing for starters.

Until the need to address more than 4GB of memory arrives, 64-bit won't give much of an improvement over 32-bit.

gee, really?? 64bit processor will only give us an increase from 2^32 = 4gig, to 2^64 = 1.84*10^19?? thats all higher bit processor will do?? thats all its good for? no improvement other than that?? just more address space??


someone correct me if im wrong, but don't pentium pro chips and up have a 36-bit address bus? so max addressable is 2^36 = 64GB (about 68.7 billion bytes). itanium has a 44-bit address bus.
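For reference, the address-space arithmetic for the widths mentioned in this thread can be checked with a few lines. This is a minimal sketch; the helper names are made up, and it only computes powers of two, not what any particular chipset or OS actually supports.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def human(n: int) -> str:
    """Format a power-of-two byte count using binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    idx = 0
    value = n
    while value >= 1024 and idx < len(units) - 1:
        value //= 1024
        idx += 1
    return f"{value} {units[idx]}"


for bits in (32, 36, 44, 64):
    n = addressable_bytes(bits)
    print(f"{bits}-bit: {n:.3e} bytes = {human(n)}")
```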
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
That's right. 64bit no better than 32bit.

The proof is in the fact that the most serious servers, workstations, and supercomputers are 64bit.

Yep, no reason to put 64bit on the desktop.

Oh, and that's why there was no benefit, beyond larger address space, when the PC desktop migrated from 16bit to 32bit.

Damn PC makers, trying to shove this newfangled 64bit technology down Joe Shmoe's mouth...

Oops, guess I should read the bible...er, I mean the AT 64bit myths FAQ.
 

jjyiz28

Platinum Member
Jan 11, 2003
2,901
0
0
Originally posted by: Idontcare
That's right. 64bit no better than 32bit.

The proof is in the fact that the most serious servers, workstations, and supercomputers are 64bit.

Yep, no reason to put 64bit on the desktop.

Oh, and that's why there was no benefit, beyond larger address space, when the PC desktop migrated from 16bit to 32bit.

Damn PC makers, trying to shove this newfangled 64bit technology down Joe Shmoe's mouth...

Oops, guess I should read the bible...er, I mean the AT 64bit myths FAQ.

you do know that nt, 9x, etc. are 32-bit OSes?? and yes, it took a long time for a 32-bit OS to come out after the first 32-bit CPU came out, but that's not how it works now.
There WILL be a benefit when desktop cpus move to the 64-bit athlon and a 64-bit athlon version of XP.



 

Vegito

Diamond Member
Oct 16, 1999
8,329
0
0
Originally posted by: paralazarguer
No kidding. I have to run a couple of macros as work each day dealing with access databases and some excel spreadsheets and they take 10-30 minutes to run each one (on 1ghz) Hurry up computers, get faster!

u think excel will crunch better in 64 bit mode ?
 

Wingznut

Elite Member
Dec 28, 1999
16,968
2
0
Originally posted by: Idontcare
That's right. 64bit no better than 32bit.

The proof is in the fact that the most serious servers, workstations, and supercomputers are 64bit.

Yep, no reason to put 64bit on the desktop.

Oh, and that's why there was no benefit, beyond larger address space, when the PC desktop migrated from 16bit to 32bit.

Damn PC makers, trying to shove this newfangled 64bit technology down Joe Shmoe's mouth...

Oops, guess I should read the bible...er, I mean the AT 64bit myths FAQ.
Seems like you know a lot about 32-bit vs 64-bit...

I don't suppose you'd mind explaining to us what part of Sohcan's FAQ is inaccurate? Or exactly how 64-bit will speed up your desktop experience?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: Wingznut
Originally posted by: Idontcare
That's right. 64bit no better than 32bit.

The proof is in the fact that the most serious servers, workstations, and supercomputers are 64bit.

Yep, no reason to put 64bit on the desktop.

Oh, and that's why there was no benefit, beyond larger address space, when the PC desktop migrated from 16bit to 32bit.

Damn PC makers, trying to shove this newfangled 64bit technology down Joe Shmoe's mouth...

Oops, guess I should read the bible...er, I mean the AT 64bit myths FAQ.
Seems like you know a lot about 32-bit vs 64-bit...

I don't suppose you'd mind explaining to us what part of Sohcan's FAQ is inaccurate? Or exactly how 64-bit will speed up your desktop experience?


What did I write that would make you assume that I know a lot about 32bit vs 64bit?

What did I write that made you assume any part of Sohcan's FAQ is inaccurate?

-Idontcare

(Oh, and since you seem to find one's position of employment important enough to include in your sig...)

- 65nm CMOS R&D, Texas Instruments

**Not Speaking for TI, Inc.**