CPU improvements year to year

Gikaseixas

Platinum Member
Jul 1, 2004
2,836
218
106
I'm a bit concerned with CPU tech progress these days because we seem to be getting fairly small increases lately (5-10% improvements in performance).
AMD is not offering much of a fight, and I'm of the opinion that this is the main reason why most of us (i7 4xxx users) will not upgrade to DC, just like many still have their 3930Ks.

IMO competition is the answer, just take a look at Nvidia and AMD trading blows in the GPU arena. My HD7970 is showing its age but my 4770K is still one beast of a CPU.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Look at performance/watt plus IGP. Plenty of progress. Just not in the exact area you want.

Also, even if there were endless competition, you are not getting any big IPC boost anymore, or some huge frequency increase on a 1000W CPU.
 

firewolfsm

Golden Member
Oct 16, 2005
1,848
29
91
Performance improvements are still high in the server space, and that's what really matters anyway.
 
Aug 11, 2008
10,451
642
126
I think DC proves that Intel has pretty much maxed out core performance. Even with special enhancements for overclocking, they can't really reach 5GHz, so I think with a mobile-oriented design they are pushing against the wall. Not sure any competition from AMD would change that. If you go from Ivy to DC you get a pretty good improvement, actually: maybe 10% IPC and another 10% clock speed. The only thing I think a more competitive AMD might do would be to push Intel to offer a hex-core on the mainstream platform. But since 4 Intel cores still beat 8 cores from AMD, Intel can get by with only offering hex-cores on the enthusiast platform.

As for improvements in performance per watt and IGP: on the desktop, totally irrelevant except for some niche applications like the Brix or AIO form factors.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
I'm a bit concerned with CPU tech progress these days because we seem to be getting fairly small increases lately (5-10% improvements in performance).

IMO competition is the answer, just take a look at Nvidia and AMD trading blows in the GPU arena. My HD7970 is showing its age but my 4770K is still one beast of a CPU.

No, the main problem is clock speeds. You can't get much higher clock speeds and competition won't do anything about that.
 

teejee

Senior member
Jul 4, 2013
361
199
116
No, the main problem is clock speeds. You can't get much higher clock speeds and competition won't do anything about that.

Researchers have developed transistors with speeds close to 1 THz, so 200 times faster than the transistors in Intel's CPUs. Of course a lot of work is needed to reach these speeds in a CPU, but that is why we need competition.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Researchers have developed transistors with speeds close to 1 THz, so 200 times faster than the transistors in Intel's CPUs. Of course a lot of work is needed to reach these speeds in a CPU, but that is why we need competition.

Yeah, and those chips had what, less than 200 transistors, and were 1-bit. Not bad when you need billions of those for a 64-bit CPU. And now manufacture those at world scale, at low cost and with high yields. Intel is researching all those things, too. That's why they are 4 years ahead of TSMC, their closest competitor. As Paul Otellini said: "At the end of the day, the best transistors win, no matter what you're building, a server or a phone."
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Researchers have developed transistors with speeds close to 1 THz, so 200 times faster than the transistors in Intel's CPUs. Of course a lot of work is needed to reach these speeds in a CPU, but that is why we need competition.

I think you let yourself get carried away by the PR headlines. And I assume it's from the great master of R&D spin, IBM.

It's not really a big issue to have a few transistors run at hundreds of GHz. The problem is to have billions of them do it at the same time.

Also, these transistors are radio-frequency (RF) transistors.
 
Last edited:

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
I just got back from HP Discover.

All the talk there (outside of The Machine) was power consumption. If AMD is keeping on with the big core line then they aren't following the market.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
I just got back from HP Discover.

All the talk there (outside of The Machine) was power consumption. If AMD is keeping on with the big core line then they aren't following the market.

That means Intel is switching to Atom-based CPUs for the enterprise segment?? :p

You can have low power big cores if you have the design and the process at the right time.
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
I just got back from HP Discover.

All the talk there (outside of The Machine) was power consumption. If AMD is keeping on with the big core line then they aren't following the market.
In AMD's defense, they do have lower power targets with Steamroller designs.
Researchers have developed transistors with speeds close to 1 THz, so 200 times faster than the transistors in Intels CPU's. Of course a lot of work needed to reach these speeds in a CPU, but that is why we need competition.
Those are discrete transistors, not integrated ones.
 
Last edited:

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,610
4,530
75
It's not really a big issue to have a few transistors run at hundreds of GHz. The problem is to have billions of them do it at the same time.
Right. The problem is that all the parts of a CPU (or at the very least one core) have to receive a clock signal before a new clock signal can begin.

Here's a map of a Haswell chip. From that I estimate that each core is about 7mm by 3.5mm. Since all parts of the chip have to be synchronized, the longest dimension is more-or-less what matters. Let's assume a similar chip is created, which uses silicon photonics to receive a light pulse as a clock signal. (I'll ignore whether an LED can blink that fast.) Let's also assume that the light pulses are generated in the centers of the cores, and have to propagate to the farthest corners, about 4mm (4*10^-3 m) away. Light travels 3*10^8 m/s. A clock signal consists of a pulse followed by a non-pulse, so that halves the maximum speed. So that takes a little over 10^-11 s, and means the clock speed must be less than about 38 GHz. Under the very best conditions.

Pulses travel slower in copper than light through a vacuum. Silicon at the endpoints is probably still slower. Clock pulses also tend to follow more circuitous routes, such as a grid. Assuming a copper grid that starts in the middle of the core, that's about 16GHz in the very best case. If the pulse starts at one corner, it's more like 8GHz. (Say, that's getting close to real speeds, isn't it?)

See why we don't have 100GHz cores?
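The back-of-the-envelope limit above can be sketched in a few lines. This is only an illustration of the post's reasoning; the 4mm worst-case distance is the post's own estimate for a Haswell core, not a measured value.

```python
# Speed-of-light limit on clock distribution, following the reasoning above.

C_VACUUM = 3.0e8   # speed of light in a vacuum, m/s
DISTANCE = 4.0e-3  # ~4 mm from core center to farthest corner (post's estimate)

def max_clock_hz(distance_m, signal_speed_m_s):
    """Upper bound on clock frequency if one full cycle (a pulse followed by
    a non-pulse) must traverse `distance_m` before the next cycle starts."""
    one_way_time = distance_m / signal_speed_m_s
    return 1.0 / (2.0 * one_way_time)  # factor of 2: pulse + non-pulse

# Best case: a light pulse launched from the core's center.
print(f"{max_clock_hz(DISTANCE, C_VACUUM) / 1e9:.1f} GHz")  # ~37.5 GHz
```

Plugging in slower propagation speeds for copper, or longer grid-shaped routes, drops the bound toward the 8-16GHz figures in the post.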
 

SunburstLP

Member
Jun 15, 2014
86
20
81
I registered just to say thanks for this. I'm a simple guy who is enthusiastic about computers and technology, but I'd never considered this type of thing as it would apply to microelectronics engineering. Thank goodness the planet has so many people that are way smarter than I am.:thumbsup:

Is it just process, design, and/or layout choices that are keeping HSW seemingly stuck below 5GHz? Or, to phrase it differently: what factors in a processor's design will ultimately limit its top-end clock speed? I apologize for such a huge and open question, but I do mean within the context of the info you've already shared in your post, Ken. Thanks.

 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Is it just process, design, and/or layout choices that are keeping HSW seemingly stuck below 5GHz? Or, to phrase it differently: what factors in a processor's design will ultimately limit its top-end clock speed? I apologize for such a huge and open question, but I do mean within the context of the info you've already shared in your post, Ken. Thanks.

Transistor (too big a gate means higher delay), design (if Core weren't optimized it wouldn't go beyond 1-2GHz), and physics (material characteristics of silicon).
 

shady28

Platinum Member
Apr 11, 2004
2,520
397
126
Where do they get the audacity to say the gaming performance is 21 times better? (I think E8500 to 4670 is about 2 times faster in games.)
http://hexus.net/tech/reviews/cpu/56005-intel-core-i7-4770k-22nm-haswell/?page=4

I'm sure there is something or other where Haswell used heterogeneous compute models to get that kind of boost, but if that's the metric then one could probably find something where a Snapdragon 800 or Apple A7 toasts Haswell too.

This is a bit more revealing :
http://techreport.com/news/24886/haswell-compared-to-everything


The i7-950 (quad core) was released in Q2 2009 - almost exactly 5 years ago. The i7-4770K is only about 60% faster than that chip on this benchmark - for both multi and single thread.

The fastest C2Q on this chart is the Q9400 - which was one of the slower clocked last gen C2Q models. I would have preferred a comparison to Q9550 or Q9650. Even with the slower Q9400 though, the i7-4770K was only about 100-110% faster here. If we had a Q9650 vs i7-4770, it would be more like 90% faster. Q9550 / Q9650 came out in Q1 2008.

So it took roughly 6 years to double general purpose compute performance.
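The speedups quoted above imply a compound annual improvement rate, which puts the "5-10% per year" complaint from the opening post in context. A minimal sketch, using the post's rough figures (60% over 5 years, 2x over 6 years), not benchmark data:

```python
# Compound annual growth rate implied by a total speedup over N years.

def annual_rate(total_speedup, years):
    """Per-year improvement factor implied by `total_speedup` over `years`."""
    return total_speedup ** (1.0 / years) - 1.0

print(f"i7-950 -> i7-4770K (1.6x over 5 yrs): {annual_rate(1.6, 5):.1%}/yr")  # ~9.9%/yr
print(f"Q9650  -> i7-4770K (2.0x over 6 yrs): {annual_rate(2.0, 6):.1%}/yr")  # ~12.2%/yr
```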
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
I registered just to say thanks for this. I'm a simple guy who is enthusiastic about computers and technology, but I'd never considered this type of thing as it would apply to microelectronics engineering. Thank goodness the planet has so many people that are way smarter than I am.:thumbsup:

Is it just process, design, and/or layout choices that are keeping HSW seemingly stuck below 5GHz? Or, to phrase it differently: what factors in a processor's design will ultimately limit its top-end clock speed? I apologize for such a huge and open question, but I do mean within the context of the info you've already shared in your post, Ken. Thanks.
Process is definitely a big part of it, as is design. Really, they're intertwined.

Haswell runs at ~4GHz because CPUs are built for high clock speeds. GPUs on the other hand have only just barely broken the 1GHz barrier.

However, if you were to take a Pentium 4 and throw it on 22nm, 5GHz stock clocks would probably be relatively easy to achieve.

[attached image: power.jpg]

Intel's 22nm process is rather mobile-friendly, at the cost of the high-performance, high-power part of the spectrum. If it were a 22nm planar process, it'd clock a tad higher at the high end of the spectrum (at ~1.1V), but not nearly enough to justify the performance losses in the lower-voltage part of the spectrum.
 

teejee

Senior member
Jul 4, 2013
361
199
116
Right. The problem is that all the parts of a CPU (or at the very least one core) have to receive a clock signal before a new clock signal can begin.

No, this is not necessary. Just a convenient way of doing it today.

http://en.m.wikipedia.org/wiki/Asynchronous_circuit

A huge amount of research is spent on creating faster electronics. Of course this will lead to breakthroughs of different kinds.
The whole semiconductor industry since the '60s has been based on silicon wafers. New materials could change the industry completely in the long term.
 

Lepton87

Platinum Member
Jul 28, 2009
2,544
9
81
Transistor (too big a gate means higher delay), design (if Core weren't optimized it wouldn't go beyond 1-2GHz), and physics (material characteristics of silicon).

To be more specific: transistor performance (especially scaling with voltage), pipeline length, and FO4. A short FO4 of 13 is what allowed the IBM POWER6 to be clocked at 5GHz at stock on an ancient 65nm process. BTW, does anyone have any idea what the FO4 is for modern Intel CPUs or AMD's CPUs/APUs? I could only find that P4s at mid-3GHz had an FO4 delay of 16.4 (estimated, not even official), probably Northwood, not Prescott, but I'm not sure.
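The relation behind those FO4 numbers is simply cycle time ≈ FO4 depth × per-gate FO4 delay, so you can back-solve one from the other. A sketch using the POWER6 figures quoted above; the result is an implied value, not an official spec:

```python
# Back-solve the per-gate FO4 delay from a clock frequency and an FO4 depth
# (the pipeline's critical path measured in fanout-of-4 inverter delays).

def implied_fo4_delay_ps(freq_ghz, fo4_depth):
    """FO4 delay in picoseconds implied by a clock and an FO4 pipeline depth."""
    cycle_time_ps = 1000.0 / freq_ghz  # one cycle, in ps
    return cycle_time_ps / fo4_depth

# POWER6: 5 GHz at an FO4 depth of 13, on 65nm
print(f"{implied_fo4_delay_ps(5.0, 13):.1f} ps per FO4")  # ~15.4 ps
```

The same arithmetic run the other way shows why a deeper-FO4 design on the same process clocks lower.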
 
Last edited:

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
To be more specific: transistor performance (especially scaling with voltage), pipeline length, and FO4. A short FO4 of 13 is what allowed the IBM POWER6 to be clocked at 5GHz at stock on an ancient 65nm process. BTW, does anyone have any idea what the FO4 is for modern Intel CPUs or AMD's CPUs/APUs? I could only find that P4s at mid-3GHz had an FO4 delay of 16.4 (estimated, not even official), probably Northwood, not Prescott, but I'm not sure.

Not sure if this is the paper you got your info from, but they have estimates.

http://www.cse.ohio-state.edu/~teodores/workshops/wmos/resources/Papers/zhang_wmos2014.pdf