Has Intel stayed on schedule?

Aug 11, 2008
10,451
642
126
CPUPowerConsumption.png


Not that I am in disagreement with the spirit of your post, but 22nm IB did better than 20% over 32nm SB in terms of power consumption, except for the corner case of >4.5GHz clockspeeds (well above the intended max clocks for either 32nm or 22nm).

The TDP for IB decreased from 95W @32nm to 77W @22nm for a good reason: the power savings really were there to make it happen.

I think what made 22nm seem so "meh" to us enthusiasts is that 32nm SB was just so awesome. 5GHz clocks on a HKMG process that even now GloFo only wishes they had. It makes for a tough act to follow.

Are these numbers from your personal tests? I was just basing my estimate on the TDP of a quad core, 77W being slightly more than 80% of the 95W Sandy Bridge quad TDP. You are much more of an engineering person than I am, so I will accept your numbers. It still seems like Intel is falling into the same pattern of always saying the next chip out will be the big breakthrough.

First it was "wait for Ivy Bridge, it will really lower power consumption." Then it was "wait for Haswell, that's a tock, it will really change things," but already some are saying Intel will not really be competitive with ARM until Broadwell. The problem I see is that ARM is becoming so integrated into the landscape that when Intel finally gets x86 competitive, it may be too late. And believe me, I really would hate to see ARM force x86 out of the mobile market. I have an ARM tablet, and I have never hated a computing device so much since my Windows ME desktop with a Celeron and 2MB (yes, MB) of integrated graphics.
 

TidusZ

Golden Member
Nov 13, 2007
1,765
2
81
They could release a new processor every day and it would still take 10 years until it was worth upgrading, since all they make now are processors with 4% more performance, 20% less power, and 10% less overclock.

I'll probably buy a new computer when my current one dies of old age.
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
They could release a new processor every day and it would still take 10 years until it was worth upgrading, since all they make now are processors with 4% more performance, 20% less power, and 10% less overclock.
Man, I really hope you aren't a statistician or scientist for a living.
 

Sleepingforest

Platinum Member
Nov 18, 2012
2,375
0
76
They could release a new processor every day and it would still take 10 years until it was worth upgrading, since all they make now are processors with 4% more performance, 20% less power, and 10% less overclock.
At a 4% performance improvement per day, you'd double the baseline's performance in roughly 18 days. At a 10% improvement every 14 months or so, you're looking at doubling performance roughly every 102 months, which is about 8.5 years.

More fun facts: if each chip's power draw is 80% of the previous one's (20% less), then by the time performance doubles (about seven generations at 10% each), the chip is only drawing roughly 20% of the original's power.
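
For anyone who wants to check the compounding, here is a quick back-of-the-envelope sketch in C. It just plugs in the figures assumed above (4% per daily release, 10% per ~14-month generation, 20% power reduction per generation); nothing here is measured data.

```c
/* Sanity check of the compounding claims above; the 4%, 10%, 20% and
   14-month figures are the assumptions from this thread, not measurements. */
#include <math.h>
#include <stdio.h>

int main(void) {
    double steps_daily = log(2.0) / log(1.04);  /* releases to double at 4% each: ~17.7 */
    double steps_gen   = log(2.0) / log(1.10);  /* generations to double at 10% each: ~7.3 */
    double months      = steps_gen * 14.0;      /* ~102 months, about 8.5 years */
    double rel_power   = pow(0.80, steps_gen);  /* power left after those generations: ~0.20 */

    printf("Releases to double performance at 4%% each:   %.1f\n", steps_daily);
    printf("Generations to double at 10%% each:           %.1f (~%.0f months)\n", steps_gen, months);
    printf("Power relative to the original at that point: %.2f\n", rel_power);
    return 0;
}
```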

Finally, Ivy Bridge will actually overclock just as well as Sandy Bridge, provided that you are willing to de-lid and lap the IHS and your heatsink. I mean, you're overclocking, might as well totally destroy your chances at warranty. :awe:
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Are these numbers from your personal tests? I was just basing my estimate on the TDP of a quad core, 77W being slightly more than 80% of the 95W Sandy Bridge quad TDP. You are much more of an engineering person than I am, so I will accept your numbers. It still seems like Intel is falling into the same pattern of always saying the next chip out will be the big breakthrough.

First it was "wait for Ivy Bridge, it will really lower power consumption." Then it was "wait for Haswell, that's a tock, it will really change things," but already some are saying Intel will not really be competitive with ARM until Broadwell. The problem I see is that ARM is becoming so integrated into the landscape that when Intel finally gets x86 competitive, it may be too late. And believe me, I really would hate to see ARM force x86 out of the mobile market. I have an ARM tablet, and I have never hated a computing device so much since my Windows ME desktop with a Celeron and 2MB (yes, MB) of integrated graphics.
Yeah they are from here.

But you are right, Intel is really good at telling you how great the next chip will be. Nehalem and all the static CMOS hype was the one that really ground my gears :colbert:
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
No, it isn't flat or tapped out; it's just increasing less and less for more and more effort, as I've indicated. The low-hanging fruit has been picked.

The improvements in IPC for legacy code have been getting smaller with pretty much every uarch generation since x86 began, if you ignore the anomalous Netburst CPUs and stick to the progression from Pentium 3 to Pentium M to Core to Core 2. It's just hard to notice because clock speed increased a lot during that time, but that has now stopped due to hitting the power wall.
I have to agree. Although I'm sure Intel's engineers have a few crackpot ideas on the backburner, I don't buy the "Intel is being lazy" theory either. Intel is on the cutting edge here, and they have to deliver performance improvements in accordance with their 2%/1% rule (2% more performance for 1% more power). Even if we held power usage constant the performance gains would be limited, when in fact power consumption is trending downwards, further capping those performance gains.

On the chip side Intel can always throw caution (and power bills) to the wind and release a high-clocked, high-voltage chip that offers higher performance for significantly higher power consumption. But otherwise the performance gains will come from mild clockspeed increases (at best) and IPC improvements. And I'm not sure there's anyone working the IPC mill as hard as Intel right now.
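
Just to spell that rule out, a minimal sketch of how a 2-for-1 gate would work; the candidate features and their numbers below are invented for illustration, not Intel data.

```c
/* Toy version of the "2% performance per 1% power" rule described above.
   Feature names and estimates are made up. */
#include <stdio.h>

struct feature {
    const char *name;
    double perf_gain_pct;   /* estimated performance gain, % */
    double power_cost_pct;  /* estimated power cost, % */
};

static int passes_2_for_1(const struct feature *f) {
    return f->perf_gain_pct >= 2.0 * f->power_cost_pct;
}

int main(void) {
    struct feature candidates[] = {
        { "bigger uop cache",        3.0, 1.0 },
        { "wider decode",            2.0, 2.5 },
        { "better branch predictor", 1.5, 0.5 },
    };
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++)
        printf("%-25s %s\n", candidates[i].name,
               passes_2_for_1(&candidates[i]) ? "accept" : "reject");
    return 0;
}
```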

CPUPowerConsumption.png


Not that I am in disagreement with the spirit of your post, but 22nm IB did better than 20% over 32nm SB in terms of power consumption, except for the corner case of >4.5GHz clockspeeds (well above the intended max clocks for either 32nm or 22nm).

The TDP for IB decreased from 95W @32nm to 77W @22nm for a good reason: the power savings really were there to make it happen.

I think what made 22nm seem so "meh" to us enthusiasts is that 32nm SB was just so awesome. 5GHz clocks on a HKMG process that even now GloFo only wishes they had. It makes for a tough act to follow.
And it's not just a 5% CPU performance improvement alongside a 20% power reduction, but all of that with something approaching a 50% increase in GPU performance. Intel did improve performance by quite a bit on IVB, it just wasn't CPU performance.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
And it's not just a 5% CPU performance improvement alongside a 20% power reduction, but all of that with something approaching a 50% increase in GPU performance. Intel did improve performance by quite a bit on IVB, it just wasn't CPU performance.

I really want to see IVB-EP against SNB-EP. Without the GPU eating into the transistor budget and TDP envelope, we'll have a very good idea of what 22nm can do for CPUs.
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
I really want to see IVB-EP against SNB-EP. Without the GPU eating into the transistor budget and TDP envelope, we'll have a very good idea of what 22nm can do for CPUs.
The fundamental CPU architecture won't be any different from IVB of course. So it will probably boil down to some combination of more cores, more cache, and less power.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
The fundamental CPU architecture won't be any different from IVB of course. So it will probably boil down to some combination of more cores, more cache, and less power.


And this is a server guy's delight. :)
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
I have to agree. Although I'm sure Intel's engineers have a few crackpot ideas on the backburner, I don't buy the "Intel is being lazy" theory either. Intel is on the cutting edge here, and they have to deliver performance improvements in accordance with their 2%/1% rule (2% more performance for 1% more power). Even if we held power usage constant the performance gains would be limited, when in fact power consumption is trending downwards, further capping those performance gains.

Their R&D budget does not say "we are being lazy" either. Intel spends an amazing amount of cash on R&D.

RDExpenditures2011.png


We all know Intel deftly manages their finances; if they wanted to be lazy and milk along the sequential improvements, they'd do so by saving some coin and cutting back on R&D.

Now, to be sure, one could make the argument that Intel could fund their R&D teams to ever higher levels and develop even better IPC than they currently do, if only they chose to spend all their profits on R&D. From that POV one could argue that Intel is being lazy, because there is more they could do but they are opting not to be that aggressive.

But that is a business-management decision all the same, not an emotion-based decision involving laziness or procrastination.
 

R0H1T

Platinum Member
Jan 12, 2013
2,583
164
106
Their R&D budget does not say "we are being lazy" either. Intel spends an amazing amount of cash on R&D.
Can you break down the numbers (roughly speaking) as to how much they spend on IC vs. node R&D? Because from the charts it does look like AMD/TSMC are a lot more efficient in this regard :hmm:
 

Hulk

Diamond Member
Oct 9, 1999
5,142
3,737
136
No, it isn't flat or tapped out; it's just increasing less and less for more and more effort, as I've indicated. The low-hanging fruit has been picked.

The improvements in IPC for legacy code have been getting smaller with pretty much every uarch generation since x86 began, if you ignore the anomalous Netburst CPUs and stick to the progression from Pentium 3 to Pentium M to Core to Core 2. It's just hard to notice because clock speed increased a lot during that time, but that has now stopped due to hitting the power wall.


I have thought about your comment and I have to admit you are right. The front end of Core has stayed 4-wide, and you have to ask yourself why. The only conclusion I can come to is that they still haven't extracted enough instruction-level parallelism to fully saturate it.

Now that I think about it, the first indication of this was the return of Hyper-Threading with Nehalem. Of course HT won't increase IPC, but it will increase overall performance in most applications. Back with Nehalem, Intel was already scrambling to use more of the execution engine.

The constant attention to the branch predictor, larger internal structures, memory disambiguation: these are all efforts to prevent pipeline stalls and increase IPC.

I was thinking that they could perhaps simply go 5 or 6 wide to increase IPC, but in reality I guess there is only so much instruction-level parallelism you can extract from a single thread, right?
 

Fx1

Golden Member
Aug 22, 2012
1,215
5
81
Intel is drip-feeding performance increases so it can save big increases for the years when it needs them.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Can you break down the numbers (roughly speaking) as to how much they spend on IC vs. node R&D? Because from the charts it does look like AMD/TSMC are a lot more efficient in this regard :hmm:

The TSMC number gives you an indication of how much it costs to develop a leading-edge process node, second only to Intel's. But it doesn't include any of the expenses related to designing chips.

So take the TSMC number and multiply it by, say, 1.5x to get the amount Intel is probably shoveling into their process-node R&D.

The remainder of their expansive R&D budget is what Intel allocates towards developing the chips that are designed to be used on those leading-edge nodes.

Look at any of the truly fabless companies: they spend billions per year on R&D and don't have fabs. That shows you what it costs to develop the chips that are to be manufactured on the leading-edge nodes.
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
I was thinking that they could perhaps simply go 5 or 6 wide to increase IPC, but in reality I guess there is only so much instruction-level parallelism you can extract from a single thread, right?
Bingo. At some point you hit a dependent instruction that you can't schedule around, predict, or otherwise get out of the way. So the only thing you can do is spend the time executing it while everything else on that core waits.

Intel may go 5-6 wide in the future if they run out of other tricks. But there's no evidence right now that it would improve performance by a great deal. ILP is basically a special case of Amdahl's Law.
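
To make that concrete, here's a tiny hypothetical C example (not from this thread): in chain() every operation depends on the previous result, so extra decode/execute width buys nothing, while split() exposes four independent chains that a wide out-of-order core can actually fill.

```c
/* Illustration of the ILP wall: a serial dependency chain vs. four
   independent chains doing the same total amount of work. */
#include <stdint.h>

uint64_t chain(uint64_t x, int n) {
    for (int i = 0; i < n; i++)
        x = x * 3 + 1;               /* each step waits on the previous result */
    return x;
}

uint64_t split(uint64_t x, int n) {
    uint64_t a = x, b = x + 1, c = x + 2, d = x + 3;
    for (int i = 0; i < n; i += 4) { /* four independent chains per iteration */
        a = a * 3 + 1;
        b = b * 3 + 1;
        c = c * 3 + 1;
        d = d * 3 + 1;
    }
    return a ^ b ^ c ^ d;
}
```

No matter how wide the front end gets, chain() is limited by the latency of one multiply-add per step; that's the "special case of Amdahl's Law" in action.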
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
The thing with going wider on decode is that it doesn't just cost the transistor space for the decoders. The decode process is at least partially serial on some level, because it has to figure out where instructions start. Even if that's marked in the instruction cache, you still need to scan the markings unless you have a much more complex data structure. That said, I don't think Intel uses instruction boundary marking at all anymore.

If decoding takes longer, you could end up needing more pipeline stages or a lower clock speed. These days multiple pipe stages are spent on decoding, while Mitch Alsup commented that he got a simple x86 design with a 6-stage pipeline running at 3GHz, so it stands to reason that adding decode width ate up extra pipeline stages.

I don't really know the implementation details of a decoder, but if anything causes the power usage to grow non-linearly with the decoder count, then it's a bad perf/W move. And anything that increases instruction throughput will ultimately waste more energy on branch mispredictions if you can't also reduce the mispredict penalty.

I could see Intel increasing the width of their uop cache interface first, although I'm sure that leads to its own problems... quite a lot of wires moving those big uops around.
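
As a toy illustration of that serial step (this is a made-up length table, not a real x86 decoder): you cannot know where instruction N+1 starts until you have at least length-decoded instruction N, no matter how many full decoders sit downstream.

```c
/* Hypothetical sketch of x86-style variable-length "length decode". The
   encoding is fake; the point is only that finding each instruction start
   is serial in the previous instruction's length. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Fake length decoder: pretend the low two bits of the first byte encode
   a 1..4 byte instruction length. */
static size_t insn_length(const uint8_t *p) {
    return (size_t)(p[0] & 0x3u) + 1;
}

static size_t find_starts(const uint8_t *block, size_t block_len,
                          size_t *starts, size_t max_starts) {
    size_t off = 0, n = 0;
    while (off < block_len && n < max_starts) {
        starts[n++] = off;
        off += insn_length(block + off);  /* must finish before the next start is known */
    }
    return n;
}

int main(void) {
    uint8_t fetch_block[16] = { 0x03, 0x00, 0x01, 0x02, 0x03, 0x00 };
    size_t starts[16];
    size_t n = find_starts(fetch_block, sizeof fetch_block, starts, 16);
    for (size_t i = 0; i < n; i++)
        printf("insn %zu starts at byte %zu\n", i, starts[i]);
    return 0;
}
```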
 

TidusZ

Golden Member
Nov 13, 2007
1,765
2
81
Man, I really hope you aren't a statistician or scientist for a living.

The point I was making is that I see no reason to buy a new processor, since the performance increase is minimal. Back in the day I went from a 1.4GHz T-Bird to an Athlon 3000+ to a Core 2 Duo E4300 in two-year gaps, and it was a huge improvement each time. Now you can look at a two-year-old Sandy Bridge at 5GHz vs. a brand-new Haswell, which will probably overclock to 4.5-5GHz (being generous), and the performance will be about the same.

I love PCs and getting more performance, and I don't care about the power my PC consumes. It's not like paying $200/month for gasoline; it's maybe $20/month tops. My monitor uses more power than my PC most of the time, and I'd rather have 50% more performance like in the past than save a dollar or two a month on power.

Hence I'll buy a new PC in 10 years, when the performance increase is justifiable. I already regret waiting for Ivy Bridge to upgrade my Q9550, since I didn't know it was going to be a 4% to 7% performance increase with lower overclocks and more degradation.
 
Last edited:

Sleepingforest

Platinum Member
Nov 18, 2012
2,375
0
76
The other thing that could happen is that power prices could shoot up in an effort to reduce electricity usage. Then you'd also care.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Bingo. At some point you hit a dependent instruction that you can't schedule around, predict, or otherwise get out of the way. So the only thing you can do is spend the time executing it while everything else on that core waits.

Intel may go 5-6 wide in the future if they run out of other tricks. But there's no evidence right now that it would improve performance by a great deal. ILP is basically a special case of Amdahl's Law.

Doesn't Poulson do something silly, like 11 or 12 wide? I wonder if x86 will get that wide in the coming decade.
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
Over 10 years that would be 10^62 improvement factor!!!!!!!
I actually wasn't paying attention to his hyperbole; my comment was about his rather ridiculous assertion that one outlier of a data point (Ivy Bridge) spells doom for the world.

Pentium 4 was terrible, so obviously that set of data points meant the collapse of x86 processors.

Anyway, there just isn't enough evidence that we're getting only "4%" gains every year. Might as well complain that the sky is falling.
 
Last edited:

Hulk

Diamond Member
Oct 9, 1999
5,142
3,737
136
I am wondering: if Intel did go 5 or 6 wide on the execution engine, then perhaps it would be possible to use that extra execution capability, when a single thread cannot, with 2 logical cores instead of 1?

So, just to take one tiny example of the many permutations microprocessor designers are confronted with at every turn...

Make the core wider in an attempt to extract as much instruction-level parallelism from a single thread as possible and so increase IPC, and also grow and optimize the structures supporting this goal.

Since at times you might have a lot of unused execution potential, create 2 logical HT cores for this one physical core. From what I've read, HT requires a pretty small number of transistors to implement compared to a full physical core.

OR

Keep the basic core design the same and just add another core.


It would be pretty interesting to have a microprocessor with 4 super-wide physical cores and 8 logical cores. Or I should say, it would be interesting to model how such a processor would perform.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
The improvements in IPC for legacy code have been getting smaller with pretty much every uarch generation since x86 began, if you ignore the anomalous Netburst CPUs and stick to the progression from Pentium 3 to Pentium M to Core to Core 2. It's just hard to notice because clock speed increased a lot during that time, but that has now stopped due to hitting the power wall.

This is a good point and part of the problem. To really take advantage of Intel's newer microarchitectures, software engineers need to actively use instructions that dramatically speed up specific classes of algorithms and data structures. Some of Intel's recent moves acknowledge developers' reluctance to do this: by spending the extra xtors and improving the compilers, adopting AVX2 doesn't burden devs with learning much in the way of new tech; they simply flip a compiler switch and see if their application benefits (sort of like a giant distributed Monte Carlo simulation across thousands of software packages).

Still, these programs have to work on older CPUs, so developers need to use DLLs that can be loaded at runtime according to the CPU arch. That means more regression testing, more complex debugging, more software support costs and, ultimately, the abandonment of dual-track approaches based on economic realities.

So, except for applications that have 'a need for speed', new techniques just don't make it into new applications, because a solid 15% boost in application performance just isn't worth the associated cost structure. The golden handcuffs strike again!
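
For what it's worth, the runtime-dispatch pattern described above looks roughly like this with GCC/Clang on x86; a minimal sketch only, with made-up function names and a trivial kernel, and real products would put the variants in separately built libraries/DLLs as described.

```c
/* Pick an AVX2 or baseline code path at startup based on what the CPU
   actually supports. Uses GCC/Clang's __builtin_cpu_supports(). */
#include <stdio.h>

static void process_baseline(float *dst, const float *src, int n) {
    for (int i = 0; i < n; i++)
        dst[i] = src[i] * 2.0f;     /* plain scalar path for older CPUs */
}

__attribute__((target("avx2")))
static void process_avx2(float *dst, const float *src, int n) {
    for (int i = 0; i < n; i++)     /* compiler is free to vectorize with AVX2 here */
        dst[i] = src[i] * 2.0f;
}

typedef void (*process_fn)(float *, const float *, int);

static process_fn pick_impl(void) {
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? process_avx2 : process_baseline;
}

int main(void) {
    float in[8] = {1, 2, 3, 4, 5, 6, 7, 8}, out[8];
    process_fn process = pick_impl();
    process(out, in, 8);
    printf("out[0] = %f\n", (double)out[0]);
    return 0;
}
```

This is the extra test/support burden in code form: every exercised path (baseline and AVX2) has to be built, regression-tested, and debugged separately.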
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
It would be pretty interesting to have a microprocessor with 4 super-wide physical cores and 8 logical cores. Or I should say, it would be interesting to model how such a processor would perform.

I thought Haswell is an 8-port CPU core. That's pretty dang wide. Unless you are referring to data paths or something else.