Haswell Core Count


BenchPress

Senior member
Nov 8, 2011
392
0
0
So you're implying that Intel should be making chips that are 4-times+ the size of Haswells to achieve GPGPU levels of performance that can be achieved by spending only $450, and somehow that's supposed to prove your point?
You're still not getting it. That graph shows a $300 CPU without AVX2 beating a $500 GTX 680. And yes, the HD 7970 outperforms it, but Intel doesn't need 4x the core count to catch up. It just needs AVX2, and it would still be cheaper and more power efficient!

Also keep in mind that GPUs have a 10% or higher failure rate. That's ridiculous. There's a reason NVIDIA's Tesla cards cost several times more than the consumer parts.
 

Don Karnage

Platinum Member
Oct 11, 2011
2,865
0
0
Let's see,

Intel will have to compete against their own CPUs first; no one will upgrade from a 4-core IB to a 4-core Haswell unless it has a huge IPC gain. Remember, both IB and Haswell will be on the same 22nm, so don't expect to see a huge power reduction like SB to IB.

Secondly, in 2013 (the same year Intel will introduce Haswell) AMD will introduce Steamroller. From what we know now, it seems that Steamroller will have more than 8 threads. It will take an enormous IPC gain for a 4-core, 8-thread Haswell to be competitive against a 10-12 thread Steamroller in multithreaded apps.

I believe we may see a cheaper 6 core Intel CPU with Haswell-E.

No one will upgrade from a quad to another quad? Why did people go from a 760 to a 2500K? Or a 2500K to a 3570K?
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
Because the CPU can't handle such HPC workloads. You need a massive bus width, gigabytes of attached very fast RAM, and thousands of processors!
No, you don't. The only reason the GPU has a massive bus width is because it has tiny caches so it has no other choice but to store everything in RAM. The CPU only requires a fraction of the RAM bandwidth because most memory accesses hit the L1, L2 or L3 cache, which also have a much lower latency and lower power consumption than a RAM access.
You can count cores however you want, at the end of the day the GPU has many many more than does the on-die CPU.
You are blinded by core count. The 1536-core GTX 680 is beaten by a 4-core CPU. Even if you do count cores the same way, it's a pathetic result for the GPU. The HD 7970 doesn't fare much better considering that this CPU still doesn't feature AVX2.
If Intel was as confident in Haswell and AVX2 as you are, then they wouldn't have bothered with the Knights Corner co-processor.
AVX2 will be in every CPU from Haswell forward. Larrabee on the other hand got cancelled and they're making up for the investment by trying to sell it to the HPC market. Also note that many LRBni instructions resurfaced as part of AVX2.

So yeah, Intel is very confident in AVX2. And they designed VEX to be extendable to 1024-bit...
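
To make that concrete, here's a rough sketch (hypothetical function and array names, single precision) of the kind of gather + FMA code AVX2 enables, the two operations that used to be a GPU/LRBni exclusive; compile with something like -mavx2 -mfma:

#include <immintrin.h>

/* Sketch: gather eight floats through an index table, then apply a
   fused multiply-add, eight lanes per instruction. Assumes n is a
   multiple of 8 and all pointers are valid. */
void scaled_gather_fma(float *dst, const float *table, const int *idx,
                       const float *scale, const float *bias, int n)
{
    for (int i = 0; i < n; i += 8) {
        __m256i vidx = _mm256_loadu_si256((const __m256i *)(idx + i));
        __m256  vt   = _mm256_i32gather_ps(table, vidx, 4);      /* AVX2 gather   */
        __m256  vs   = _mm256_loadu_ps(scale + i);
        __m256  vb   = _mm256_loadu_ps(bias + i);
        _mm256_storeu_ps(dst + i, _mm256_fmadd_ps(vt, vs, vb));  /* FMA: vt*vs+vb */
    }
}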
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
No one will upgrade from a quad to another quad? Why did people go from a 760 to a 2500K? Or a 2500K to a 3570K?

760 45nm to 2500K 32nm (more performance from IPC and frequency increase + less power consumption)
2500k 32nm to 3570K 22nm (more performance from IPC and frequency increase + less power consumption)

IB 22nm to Haswell 22nm (if Haswell has 10% more IPC and almost the same power consumption, I don't see any reason for anyone to upgrade from IB at that time).
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
I'm sorry, but that behemoth has 3-4 times the performance with a TDP of 210W, while the Intel Core i7-3820 has a TDP of 125W.
Yes, but the i7-3820 uses a 32 nm process, while TSMC's 28 nm process is much closer to Intel's 22 nm process in mass availability. And Intel is going to keep that process advantage for the foreseeable future. Also, it takes only a small reduction in clock speed to bring that TDP down. The i7-3820QM has a TDP of only 45 W and likely won't score much lower than the i7-3820.

And keep in mind that the performance/Watt improvement won't end with AVX2. AVX-1024 could be implemented by splitting each instruction into four 256-bit operations at issue. That doesn't increase throughput, but the CPU's power-hungry front-end would only have to deliver 1/4 of the instruction count, and the schedulers would have much less switching activity as well. So power consumption would go down, and it has the added advantage of more latency hiding. It's a tiny extension that would make the CPU behave very much like a GPU.
I believe that you will find that GPUs are far more efficient than your CPU for GPGPU. Just wait and see how much more efficient the GK100/110 will be.
Again, that doesn't matter for the mainstream market because a lot of people will buy the GTX 680 or GPUs based on the same architecture. So developers are not inclined to adopt GPGPU when a large percentage of the latest hardware can't run it efficiently.
It is the reason most supercomputers are using GPUs rather than CPUs for highly compute-intensive scenarios. ;)
Supercomputers aren't using GPUs because of some magical feature the CPU won't ever have. They're using them because they fit the SPMD programming model. That advantage will go away when the CPU supports AVX2, which is clearly oriented at SPMD!
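
For what SPMD actually looks like in code, here's a minimal sketch (a made-up kernel, nothing Haswell-specific): every iteration is independent, so a compiler targeting AVX2 can map eight single-precision elements onto one 256-bit register instead of launching GPU threads, e.g. with gcc -O3 -mavx2 -mfma:

/* SPMD-style kernel: the same program applied to every data element. */
void saxpy(float *y, const float *x, float a, int n)
{
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];   /* one FMA per element once vectorized */
}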

Besides, let's get something straight, the topic is about Haswell's core count: That is, four cores for the mainstream part. Hence this isn't about supercomputers or anything of the like. It's about how much better a quad-core Haswell will be compared to previous quad-cores, and as an extension how this will affect mainstream GPGPU. Its days are numbered.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
IB 22nm to Haswell 22nm (if Haswell has 10% more IPC and almost the same power consumption, I don't see any reason for anyone to upgrade from IB at that time).

If you seriously think SB to IB was a 'good' upgrade just because of 22nm, and at the same time dismiss the next 'Tock' just because it is on the same process, then you clearly do not understand what a new uArch can bring to the table.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
I know what a new uArch can bring to the table, but I also know that Intel will be targeting increased iGPU performance too. Haswell's iGPU will be bigger than IB's, and it will eat precious die size from the CPU cores.
 

frostedflakes

Diamond Member
Mar 1, 2005
7,925
1
0
People get too hung up on core counts. Architectural improvements have allowed Intel to squeeze significantly more performance out of the same number of cores. For example, here's a $200 2500K vs. what used to be a, what, $400 Q6600? Both are quad cores, but the 2500K slaughters the Q6600, offering at least double the performance on average. Also notable is that the dual-core w/HT i3-2100 outperforms the Q6600 quad core even in most heavily threaded tasks. In the end, who cares how many cores the CPU has, as long as performance isn't stagnating, which it isn't.

http://www.anandtech.com/bench/Product/288?vs=53

Like others have explained, Intel's mainstream processors are intended for regular users, and for the vast majority of regular users something like an i5 is already overkill. Intel is more concerned with improving per-core performance, since 99.9% of programs out there can't utilize more than eight threads, and with improving integrated GPU performance, which tends to bottleneck a person in games and other tasks much more than CPU performance does in everyday computing.

If you want or need more cores, you're part of a niche market and you have to pay a premium for their enthusiast/workstation oriented parts. Just the way it is.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
I know what a new uArch can bring to the table, but I also know that Intel will be targeting increased iGPU performance too. Haswell's iGPU will be bigger than IB's, and it will eat precious die size from the CPU cores.
Precious die size? Ivy Bridge is tiny. You should check the size and cost of AMD chips. Intel will have no trouble making the IGP bigger while also adding AVX2 and TSX to the CPU cores, and still make a lot of profit.

Also note that Haswell will feature various IGP models. Even the top model won't result in a big die, and the rest is still tiny.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
If you seriously think SB to IB was a 'good' upgrade just because of 22nm, and at the same time dismiss the next 'Tock' just because it is on the same process, then you clearly do not understand what a new uArch can bring to the table.

+1

Sandy Bridge, on the same 32nm process as Westmere, brought a 12-15% performance increase across the board. Ivy Bridge over Sandy Bridge brings less than half that. At the same time, power consumption at the same frequency dropped by more than 10%, which is why the mobile chips got such a big advancement.
 
Aug 11, 2008
10,451
642
126
People get too hung up on core counts. Architectural improvements have allowed Intel to squeeze significantly more performance out of the same number of cores. For example, here's a $200 2500K vs. what used to be a, what, $400 Q6600? Both are quad cores, but the 2500K slaughters the Q6600, offering at least double the performance on average. Also notable is that the dual-core w/HT i3-2100 outperforms the Q6600 quad core even in most heavily threaded tasks. In the end, who cares how many cores the CPU has, as long as performance isn't stagnating, which it isn't.

http://www.anandtech.com/bench/Product/288?vs=53

Like others have explained, Intel's mainstream processors are intended for regular users, and for the vast majority of regular users something like an i5 is already overkill. Intel is more concerned with improving per-core performance, since 99.9% of programs out there can't utilize more than eight threads, and with improving integrated GPU performance, which tends to bottleneck a person in games and other tasks much more than CPU performance does in everyday computing.

If you want or need more cores, you're part of a niche market and you have to pay a premium for their enthusiast/workstation oriented parts. Just the way it is.

I see what you are saying, but by the same token, look at how the price has come down on quad cores. So if they can bring the price of a quad core down from $400 to $200, why can't they put out a $350 six-core? I agree that right now most people don't need more than a quad core. But if I were to buy a machine, I would want it to last several years at a minimum. Do you really think a quad core won't be inadequate in 3 to 5 years? You mention Intel is more interested in increasing per-core performance than in increasing the number of cores. I basically agree with that strategy to a point, but they seem to have reached a point of diminishing returns in trying to increase per-core performance. I think the reason people are disappointed is that IB neither increased per-core performance (at least to any significant degree) nor increased core count.
And I am beginning to fear Haswell will also be a minimal upgrade except for the IGP, which I don't really care about on the desktop.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Nehalem brought 20-30% gains because of Hyperthreading. The per-thread gains were no better than what Ivy Bridge had over Sandy Bridge.

The question is, will it end up like Sandy Bridge, with an overall 12-15% gain, or be like Nehalem, which gained 3-5% per thread but had a critical feature that boosted performance enormously elsewhere? And are we counting TSX as that feature?
 

Mr. Pedantic

Diamond Member
Feb 14, 2010
5,039
0
76
Yeah, but the people who do such big jobs can easily afford those processors. CPUs are probably the least of their costs anyway. But individuals don't run that sort of stuff.

No, they can't. The whole purpose of Folding@home, SETI@home, Muon, DCNET, etc. is that THEY CAN'T AFFORD THE COMPUTING POWER THEY NEED.

-Tasks being accelerated by dedicated and low power units like QuickSync and/or being offloaded to GPUs.
-Tablets becoming the potential biggest threat to PC.
-Multi-threading being extremely limited in the consumer space
http://www.anandtech.com/bench/Product/443?vs=287
- True. But if you have dedicated hardware for every single processor-heavy application you need, that is a heck of a lot of design and development time for a very specialized job. And besides, while QuickSync is good at its job, people by and large shun GPU video encoding because it has traditionally been very poor in quality compared to software encoding.
- Really? You think I want to do work on a tablet? You're dreaming.
- Really? As I said, the only major application I can think of that needs multithreading that doesn't have it is MS Office.

You see applications like Adobe CS4/WME/3DSMax/Blender already benefiting minimally from 2 additional cores. The gains are about the same as what Hyperthreading offered. Hell, the 3770K is beating the 3960X! I could point out a few more that will benefit absolutely nothing going to 8 cores but still benefit moving from 4 to 6 cores.
True. But it is still multithreaded, and your circular reasoning basically ensures that if nobody buys or demands CPUs with more cores or threads, Adobe sure as hell won't spend time and money programming for more.

-Low power increasing in importance
So more threads is even better, because there's more opportunity for power gating in the same chip.

So why should they make a dedicated chip for the few that want those extra few cores in the consumer space? Why not just keep making derivatives (3960X-like chips) of the Xeons, whose customers really need the extra performance?
We are not talking about making dedicated chips. We are talking about making EVERY chip in the lineup better.

Pardon any typos, I hate typing on my phone.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
The question is, will it end up like Sandy Bridge, with an overall 12-15% gain, or be like Nehalem, which gained 3-5% per thread but had a critical feature that boosted performance enormously elsewhere? And are we counting TSX as that feature?
Is AVX2's ability to make many code loops run up to eight times faster not good enough for you? ;)

TSX is revolutionary in its own right, but its role will grow when the core count increases beyond four.
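
For the curious, a rough sketch of what TSX's RTM interface looks like in practice (a hypothetical shared counter with a simple fallback spinlock; compile with -mrtm and check CPUID before relying on it):

#include <immintrin.h>

/* Sketch of lock elision with RTM: optimistically run the critical
   section as a hardware transaction; fall back to a real lock if the
   transaction aborts or the lock is already held. */
static volatile int fallback_lock = 0;   /* 0 = free, 1 = held */

void increment(long *counter)
{
    if (_xbegin() == _XBEGIN_STARTED) {
        if (fallback_lock)            /* a lock holder is active:        */
            _xabort(0xff);            /* abort and use the fallback path */
        (*counter)++;
        _xend();                      /* commit the transaction */
        return;
    }
    /* Fallback path: take the lock the old-fashioned way. */
    while (__sync_lock_test_and_set(&fallback_lock, 1))
        ;                             /* spin until free */
    (*counter)++;
    __sync_lock_release(&fallback_lock);
}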
 

LOL_Wut_Axel

Diamond Member
Mar 26, 2011
4,310
8
81
You've been fooled by GPU marketing. There is no such thing as a GPU with thousands of cores. So I'm terribly sorry, but you're the only one being delusional here.

GPU manufacturers count each SIMD lane as an individual core. Using the same "logic", mainstream Haswell will have 64 cores, running at over three times the clock frequency of the latest GPUs. For the record, the 22 nm HD 4000 has 128 such "cores"; however, they're running at only 1150 MHz. So there's really not that big a difference between a CPU and a GPU. We certainly don't need a big jump in core count.

That said, the instruction set is just part of the reason Haswell will kill mainstream GPGPU. CPUs can already put the latest GPUs to shame at GPGPU workloads. The reason for this is that there's no round-trip delay, no bandwidth bottleneck, and no hard register limit. And Haswell will strengthen these benefits with a GPU-like instruction set extension!

Haswell obviously won't kill mainstream GPGPU overnight, but it's blatantly obvious that GPGPU has no future. Adding AVX2 to the CPU won't cause any compromises. In contrast, for the GPU to become any better at GPGPU it has to sacrifice a considerable amount of graphics performance. It basically has to become more like a CPU. But that's downright silly. If becoming more like the CPU is the answer then why not let the CPU handle these workloads in the first place? AVX2 was the only missing bit to make that happen.

That's an interesting perspective.

I was thinking of getting an HD 7950 if it was a considerable amount less expensive than the GTX 670, more for the high compute performance than for the price difference. But Sandy/Ivy Bridge don't support AVX2, so would the HD 7950 still be a good alternative if you want both good gaming and compute performance? The only compute-heavy thing I'd do is folding@home... speaking of which, the results I've seen in f@h for the GTX 680 are terri-bad and they don't seem good for the HD 7900 series either. It looks like CPU folding continues to be the way to go.

Everything seems to be converging when it comes to performance. For example, nowadays there's barely any difference between RISC and x86, and GPUs (except Kepler) are targeting general-purpose (floating point, anyway) computing more. As we've seen with Tahiti, you do give up a bit of gaming performance, and some efficiency in both perf/W and perf/mm^2 in gaming tasks, to gain compute performance, but it's still a very good compromise, barring the fact that most people don't give two cents about compute performance.
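
As a side note on the "64 cores" figure quoted above, the arithmetic behind GPU-style lane counting is simple enough to write down (the two-FMA-ports-per-core number is an assumption based on what has been reported for Haswell):

#include <stdio.h>

int main(void)
{
    /* Count "cores" the way GPU marketing does: one per SIMD lane. */
    int cores          = 4;          /* mainstream Haswell quad-core        */
    int fma_ports      = 2;          /* assumed: two 256-bit FMA ports/core */
    int lanes_per_port = 256 / 32;   /* eight single-precision lanes        */

    printf("GPU-style \"core\" count: %d\n", cores * fma_ports * lanes_per_port);
    return 0;   /* prints 64, the figure quoted above */
}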
 

WhoFarted

Junior Member
Mar 16, 2012
6
0
0
A 3930K is 500 USD,
plus a cheap mobo at 250-300 USD.

That's roughly 750-800 USD.

Now let's try that overseas here in EU:

3930K = 900 USD.
Granted, there are cheap mobos for 350 USD in my territory, but most are 400 USD.

That comes to 1250-1300 USD on average.

That's more than 50% more than the US cost.
A cheap mobo + 3550k costs me 500-600 USD.
(Depending on mobo/features).


So no, it's not just "SIMPLE CHEAP JUST A BIT MORE" for everyone.
And this is ignoring that if I lived in the US, there'd suddenly be CRAZY deals on mobo + mainstream i5/i7 combos at most big retailers.

I've seen this kind of price difference before. Is this a result of VAT or just a higher retail price (or both)?
 

Lonbjerg

Diamond Member
Dec 6, 2009
4,419
0
0
Usually VAT. Without VAT the prices are very similar.

To add to that, for the uninformed:
There is 25% VAT on EVERYTHING in Denmark.

On TOP of that, many goods have extra taxes:
Sugar
Fat
Electronics

So it's all "hidden" taxes making hardware so much more expensive... politicians and trade barriers, not retail profit, dictate prices.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
I do not understand why things like VAT and other country-specific taxes come into the discussion. Intel and AMD have no control over what other countries' politicians do.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I do not understand why things like VAT and other country-specific taxes come into the discussion. Intel and AMD have no control over what other countries' politicians do.

Because you need to deduct them to compare prices.

Some people tend to use price + VAT and then whine that Intel/AMD/nVidia is making them pay more than some other country. And that's a very wrong assumption.

Electronics have a global price. Any variation in price is due to VAT and retail markup.
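
A trivial sketch of that deduction, using the 25% rate and the 900 USD figure already mentioned in this thread (illustration only):

#include <stdio.h>

/* Deduct VAT before comparing prices across regions:
   ex-VAT price = listed price / (1 + VAT rate). */
int main(void)
{
    double vat_rate  = 0.25;    /* e.g. Denmark's 25% VAT       */
    double eu_listed = 900.0;   /* EU shelf price, VAT included */

    printf("EU price excluding VAT: %.2f USD\n", eu_listed / (1.0 + vat_rate));
    return 0;   /* compare this figure, not the shelf price, with US pre-tax prices */
}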
 

Denithor

Diamond Member
Apr 11, 2004
6,300
23
81
Wonder if the iGPU on SB/IB could be used to run physics for games?

Finally, a valid reason for all those extra transistors...

:)
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
Wonder if the iGPU on SB/IB could be used to run physics for games?

Finally, a valid reason for all those extra transistors...

:)

SB = no

IB = maybe, as it is OpenCL compatible.

Haswell = most likely.

Not only games, but imagine being able to use the IGP for processor tasks similar to CUDA, without actually having to utilize your main GPU.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
IB = maybe, as it is OpenCL compatible.

Haswell = most likely.

Not only games, but imagine being able to use the IGP for processor tasks similar to CUDA, without actually having to utilize your main GPU.
The desktop Haswell chips will only support GT2, not GT3. The motivation is simple: for office systems a very basic IGP suffices, and multimedia/gaming systems have a discrete GPU anyway.

This also means the IGP will offer less computing power than AVX2, and it's far less efficient at generic computing anyway. The future of gaming physics is with AVX2.
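
To illustrate why physics maps so naturally onto AVX2, here's a minimal sketch (made-up structure-of-arrays particle state): every particle update is independent, so a vectorizing compiler can process eight of them per FMA instruction:

/* One explicit Euler step over a structure-of-arrays particle set. */
void integrate(float *px, float *py, float *pz,
               const float *vx, const float *vy, const float *vz,
               float dt, int n)
{
    for (int i = 0; i < n; ++i) {
        px[i] += vx[i] * dt;
        py[i] += vy[i] * dt;
        pz[i] += vz[i] * dt;
    }
}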
 

Denithor

Diamond Member
Apr 11, 2004
6,300
23
81
Not only games, but imagine being able to use the IGP for processor tasks similar to CUDA, without actually having to utilize your main GPU.

Plus, there won't be competing varieties like there are in discrete GPU space (AMD architecture significantly different from nVidia). Everyone programs one way and it just works...

The desktop Haswell chips will only support GT2, not GT3. The motivation is simple: for office systems a very basic IGP suffices, and multimedia/gaming systems have a discrete GPU anyway.

This also means the IGP will offer less computing power than AVX2, and it's far less efficient at generic computing anyway. The future of gaming physics is with AVX2.

So why does so much of the real estate need to be taken up with a virtually useless iGPU? It's overpowered already for 'typical' office/home uses and it's not likely to replace 200-300W monster discrete GPUs anytime soon.

If they can find distinct ways to use the hardware (QuickSync, perhaps physics, etc) it could still be worth including, but to keep increasing the portion of the chip for something few people really need/use seems kinda pointless.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
So why does so much of the real estate need to be taken up with a virtually useless iGPU? It's overpowered already for 'typical' office/home uses and it's not likely to replace 200-300W monster discrete GPUs anytime soon.

If they can find distinct ways to use the hardware (QuickSync, perhaps physics, etc) it could still be worth including, but to keep increasing the portion of the chip for something few people really need/use seems kinda pointless.
Like I said, desktop Haswell chips will only support up to GT2, not GT3. So it's not taking up a lot of die space. Mobile chips will have GT1 to GT3, depending on your gaming needs. So you're not paying for what you don't need either.

But seriously, the IGP is worthless for GPGPU. Even the latest GTX 680 gets beaten by current quad-core CPUs, so why would anyone have any hope that a much weaker IGP is going to be faster than a much more powerful CPU with AVX2?

People need to let go of the idea that GPUs are magically faster than CPUs. They merely have wide vector units, FMA, and gather. With AVX2 you get the exact same things on the CPU, and it already has a vastly superior memory hierarchy and much more advanced scheduling.