Could a Pentium 4 reach 10 GHz on Intel's latest process node?

superrockie

Member
Aug 11, 2013
28
0
0
We all know that Intel's Pentium 4 CPUs were designed for higher clock speeds than they reached. The max that Intel could get out of the Pentium 4 was 3.2 GHz because of heat constraints. My question is: could these CPUs really reach 10 GHz if they were made on Intel's latest process node?
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
The P4 reached 3.8 GHz on 90nm at 115W.

10 GHz for the whole CPU? No. 10 GHz for the ALUs? Yes.
 

Abwx

Lifer
Apr 2, 2011
11,835
4,789
136
Max was 3.73 GHz or so. 10 GHz is doable with the current process,
but its IPC would still be at the level of a 4 GHz Haswell.
 

Yuriman

Diamond Member
Jun 25, 2004
5,530
141
106
As I understand it, no. It's not my field though.

Intel's latest node isn't designed around hitting 10GHz.
 

ElFenix

Elite Member
Super Moderator
Mar 20, 2000
102,393
8,552
126
I thought silicon had reliability issues before you got to 10 GHz.
 

Abwx

Lifer
Apr 2, 2011
11,835
4,789
136
As I understand it, no. It's not my field though.

Intel's latest node isn't designed around hitting 10GHz.

With a 30-50 million transistor CPU it's possible. Current transistors
have transition frequencies 3 times higher than their 90nm siblings;
actually it's the transistor inflation that limits frequency.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
The P4 reached 3.8 GHz on 90nm at 115W.

10 GHz for the whole CPU? No. 10 GHz for the ALUs? Yes.

Yeah I was about to say. A 10GHz P4 means a 20GHz ALU = 50ps cycle times. lol, that would really suck to deal with. All synchronous logic would go out the window.
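
The arithmetic behind the 50ps figure can be checked with a short sketch. The function name `cycle_time_ps` is only illustrative, and the 10 GHz core clock is the hypothetical from the question, not a measured value:

```python
def cycle_time_ps(freq_ghz: float) -> float:
    """Return the clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz

core_ghz = 10.0          # hypothetical P4 core clock from the question
alu_ghz = 2 * core_ghz   # double-pumped fast ALUs run at twice the core clock

print(cycle_time_ps(core_ghz))  # 100.0 (ps per core cycle)
print(cycle_time_ps(alu_ghz))   # 50.0 (ps per fast-ALU cycle)
```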
 

Abwx

Lifer
Apr 2, 2011
11,835
4,789
136
Yeah I was about to say. A 10GHz P4 means a 20GHz ALU = 50ps cycle times. lol, that would really suck to deal with. All synchronous logic would go out the window.

Not sure that the double pumped ALUs were effectively
implemented IIRC.
 

Lepton87

Platinum Member
Jul 28, 2009
2,544
9
81
On air or with exotic cooling? With exotic cooling I think it shouldn't be difficult. The FX-8150 already reached past 8GHz, so maybe it would be enough to reach 10GHz if made on an Intel process.
 

Tristor

Senior member
Jul 25, 2007
314
0
71
Pentium 4s were primarily thermally limited, and the efficiency gains Intel achieved following the failure of the Pentium 4 are not exclusively tied to die shrinks. The total rearchitecture that came with Core made as much difference, if not more, in getting them past their prior thermal limits and gaining a significant amount of IPC. Do I think you could maybe get a P4 on a 22nm process to 10GHz? Probably under LN2. Would it matter? Nope, because Haswell is about 90% more efficient clock for clock than the P4, and if you remove the process shrink efficiency gains, it's still 60% more efficient simply due to architecture. Under LN2 you can get Haswell to 7GHz, which would still be much, much faster than the P4.
 

Bateluer

Lifer
Jun 23, 2001
27,730
8
0
Perhaps, but even if it did, it'd still perform worse than the current Haswell line.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Not sure that the double pumped ALUs were effectively
implemented IIRC.


I don't know about you, but the LVS circuits that got the ALUs double pumped in the first place are goddamn genius to me.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,571
10,206
126
Is there a power-efficiency reason why we don't see double-clocked ALUs and shaders these days?

For example, the P4 had double-pumped ALUs, the Core2 didn't.

Another example: consider Fermi's shaders and their "hot clock" (double the core frequency), versus the more numerous but slower shaders (running at core clock) in Kepler.

Is the slower, but wider, trend something that we will see more of, in the quest for power-efficiency?
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
I don't know about you but the lvs circuits to get the ALUs to get double pumped in the first place is goddamn genius to me.
It works, as does domino logic.

Here is one of the papers, describing Intel's tech:
http://ctho.org/toread/forclass/18-722/logicfamilies/Deleganes05.pdf

The shmoo plot goes up to 4.2 GHz base clock (8.4 GHz ALUs) at 1.4V.

I think, due to different effects of shrinks (lower power consumption, lower FO4 cycle times, shorter wire lengths) it might even be possible to push the P4 core to 10 GHz.

Just remember, today we're looking at 4C or more with SMT + GPU + IMC + SA etc. at far below 100W TDP. The P4 was just one core + FSB.

We might look at the Quark or MIC cores, which were derived from older x86 cores. E.g. the Quark is derived from the 486 pipeline but got a Pentium-class ISA. It achieved a 4x-10x clock increase over the original designs at different process nodes.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
Is there a power-efficiency reason why we don't see double-clocked ALUs and shaders these days?

For example, the P4 had double-pumped ALUs, the Core2 didn't.

Another example, consider Fermi's shader's and their "hot clock" (double core freq), versus the greater number, but slower (running at core clock) shaders in Kepler.

Is the slower, but wider, trend something that we will see more of, in the quest for power-efficiency?
Yes, it is.

Switching clocks wastes energy, and storing intermediate results in latches does the same. And there is more to it. E.g. the gates evaluate the results during the low/high phases of the clock signal. The faster the clock has to switch, the lower the usable percentage of time gets. There's noise, skew, etc.
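
The shrinking usable percentage can be illustrated with a rough sketch. It assumes a fixed per-cycle overhead for skew, jitter and latching; the 20ps value and the name `usable_fraction` are made up for illustration and are not taken from any real design:

```python
def usable_fraction(freq_ghz: float, overhead_ps: float) -> float:
    """Fraction of each clock cycle left for logic after a fixed timing overhead."""
    period_ps = 1000.0 / freq_ghz
    return max(0.0, (period_ps - overhead_ps) / period_ps)

OVERHEAD_PS = 20.0  # ASSUMED skew + jitter + latch overhead, illustrative only

for ghz in (4.0, 10.0, 20.0):
    share = usable_fraction(ghz, OVERHEAD_PS)
    print(f"{ghz:>5.1f} GHz: {share:.0%} of the cycle usable")
```

With a fixed overhead, the same picoseconds eat a larger share of each cycle as the clock goes up, which is one reason the slower-but-wider approach wastes less.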
 

Abwx

Lifer
Apr 2, 2011
11,835
4,789
136
I don't know about you but the lvs circuits to get the ALUs to get double pumped in the first place is goddamn genius to me.

The genius idea was to call "double pumped" the fact that
a single ALU could receive two micro-ops simultaneously,
and to market it as double the frequency. Of course, literally
the frequency of micro-ops is doubled; indeed the integer
performance was all but doubled compared to previous designs...

http://books.google.fr/books?id=gni...onepage&q=pentium 4 double pumped alu&f=false
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,645
2,464
136
Intel's latest node isn't designed around hitting 10GHz.

Nodes are not designed to hit certain frequencies. The frequency of a chip is equally dependent on its design and the capability of the node.

To put it simply, the node makers push the transistor switching speed as high as they can -- modern Intel 22nm transistors have switching times of less than 10ps, that is, they can reach well over 100GHz speeds. However, the ability to switch a single transistor really, really fast isn't all that useful. Logic in chips is constructed of chains of transistors, and the clock speed of the chip is the inverse of the time it takes for all the transistors in the longest chain of the CPU to switch in sequence. So, a hypothetical CPU that runs on 10ps transistors and has a longest path of 20 transistors would be 5GHz.
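
The hypothetical above works out as follows. This is a simplified sketch that, like the example in the post, counts only transistor switching time along the longest chain and ignores wire delay, setup time and clock skew; `max_clock_ghz` is an illustrative name:

```python
def max_clock_ghz(switch_time_ps: float, longest_chain: int) -> float:
    """Clock limit when the cycle must cover every switch in the longest chain."""
    path_delay_ps = switch_time_ps * longest_chain
    return 1000.0 / path_delay_ps

print(max_clock_ghz(10.0, 20))  # 5.0  (the example from the post)
print(max_clock_ghz(10.0, 10))  # 10.0 (halving the chain doubles the clock)
```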

You *could* use Intel's latest node to build very high clock speed chips. Intel has chosen not to because chips built with lower clockspeeds but more logic per stage seem to outperform the dumber chips.
 

Abwx

Lifer
Apr 2, 2011
11,835
4,789
136
To put it simply, the node makers push the transistor switching speed as high as they can -- modern Intel 22nm transistors have switching times of less than 10ps, that is, they can reach well over 100GHz speeds.

Actually it's well over 300GHz for CPUs that work at 3-5GHz;
that's all the margin needed to render the rise times negligible
with respect to a single cycle duration.

Edit: I think that the 10ps is the transmission
delay, not the actual switching time.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
The genius idea was to call double pumped the fact that
a single ALU could receive two micro ops simultaneously
and market it as double the frequency , of course , literaly
the frequency of micro ops is doubled , indeed the integer
performance was all but doubled compared to previous designs...

http://books.google.fr/books?id=gni...onepage&q=pentium 4 double pumped alu&f=false
I had to search the keywords again to get the book text. But there is no clear statement, just that it's not certain how it's done. But as many other sources suggest (esp. the ISSCC paper or the one above), the fast ALUs can be used for simple ops (bit ops, narrow adds, single-bit shifts). And they're really that fast (4 FO4 delays?). The slow ALUs (known from day one of the Willamette presentations) do the remaining ops. Some ops can be executed in a staggered way, and back-to-back operations (one op using the result of the previous one) were possible too, IIRC. I don't know how flags were handled.

Actualy it s well over 300GHz for CPUs that works at 3-5GHz,
that s all the margin needed to render the rising times negligibles
in respect of a single cycle duration.

Edit : i think that the 10ps is the transmission
delay not the actual switching time.
The transistor switching speed is just part of the equation. Pipeline stages are measured in FO4 delays, and each stage is itself multiple layers of gates. P4 was ~16 FO4, fast ALUs 8 FO4, K10 and similar ones were 20+ FO4.
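
To put those FO4 numbers next to the 10 GHz question, here is a rough sketch that computes how short one FO4 delay would have to be for a stage of a given depth to fit in one cycle. It assumes the whole cycle is available for logic (no skew or latch overhead), and `fo4_delay_needed_ps` is an illustrative name:

```python
def fo4_delay_needed_ps(target_ghz: float, fo4_per_stage: float) -> float:
    """FO4 delay (ps) required so a stage of the given depth fits in one cycle."""
    period_ps = 1000.0 / target_ghz
    return period_ps / fo4_per_stage

# Stage depths quoted in the post; target clocks are the thread's hypothetical.
print(fo4_delay_needed_ps(10.0, 16))  # 6.25 (P4 pipeline stage at 10 GHz)
print(fo4_delay_needed_ps(20.0, 8))   # 6.25 (fast ALU at 20 GHz)
print(fo4_delay_needed_ps(10.0, 20))  # 5.0  (a 20 FO4 design at 10 GHz)
```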
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
It works, as does domino logic.

Here is one of the papers, describing Intel's tech:
http://ctho.org/toread/forclass/18-722/logicfamilies/Deleganes05.pdf

The shmoo plot goes up to 4.2 GHz base clock (8.4 GHz ALUs) at 1.4V.

Thanks but yeah I know it works. :)

However, as you scale down the process you have a couple of problems. Transistor unity frequency increases, but it's hard to say how that scales with the pass gate logic used in LVS designs. That, and your RC delays don't get the same improvement.

I'm actually more concerned about clocking at 20GHz. How much will skew and jitter take away from that 50ps cycle time? At 7GHz domino circuits had to do a lot of tricks to get that done. At 20GHz it will take another stroke of genius.

The genius idea was to call double pumped the fact that
a single ALU could receive two micro ops simultaneously
and market it as double the frequency ,

If marketing calls the whole CPU double the frequency then yeah, that's a marketing bullet. But the ALU fast clocks were actually 2x.
 