Haswell Core Count


BenchPress

Senior member
Nov 8, 2011
392
0
0
So you're implying that Intel should be making chips that are 4-times+ the size of Haswells to achieve GPGPU levels of performance that can be achieved by spending only $450, and somehow that's supposed to prove your point?
You're still not getting it. That graph shows a $300 CPU without AVX2 beating a $500 GTX 680. And yes, the HD 7970 outperforms it, but Intel doesn't need 4x the core count to catch up. It just needs AVX2, and it would still be cheaper and more power efficient!

Also keep in mind that GPUs have a 10% or higher failure rate. That's ridiculous. There's a reason NVIDIA's Tesla cards cost several times more than the consumer parts.
 

Don Karnage

Platinum Member
Oct 11, 2011
2,865
0
0
Let's see,

Intel will have to compete against their own CPUs first; no one will upgrade from a 4-core IB to a 4-core Haswell unless it has a huge IPC gain. Remember, both IB and Haswell will be on the same 22nm, so don't expect to see a huge power reduction like SB to IB.

Secondly, in 2013 (the same year Intel will introduce Haswell) AMD will introduce Steamroller. From what we know now, it seems that Steamroller will have more than 8 threads. It will take an enormous IPC gain for a 4-core, 8-thread Haswell to be competitive against a 10-12 thread Steamroller in multithreaded apps.

I believe we may see a cheaper 6 core Intel CPU with Haswell-E.

No one will upgrade from a quad to another quad? Why did people go from a 760 to a 2500K? Or a 2500K to a 3570K?
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
Because the CPU can't handle such HPC workloads. You need a massive bus width, gigabytes of attached very fast RAM, and thousands of processors!
No, you don't. The only reason the GPU has a massive bus width is because it has tiny caches so it has no other choice but to store everything in RAM. The CPU only requires a fraction of the RAM bandwidth because most memory accesses hit the L1, L2 or L3 cache, which also have a much lower latency and lower power consumption than a RAM access.
You can count cores however you want, at the end of the day the GPU has many many more than does the on-die CPU.
You are blinded by core count. The 1536-core GTX 680 is beaten by a 4-core CPU. Even if you do count cores the same way, it's a pathetic result for the GPU. The HD 7970 doesn't fare much better considering that this CPU still doesn't feature AVX2.
If Intel was as confident in Haswell and AVX2 as you are, then they wouldn't have bothered with the Knights Corner co-processor.
AVX2 will be in every CPU from Haswell forward. Larrabee on the other hand got cancelled and they're making up for the investment by trying to sell it to the HPC market. Also note that many LRBni instructions resurfaced as part of AVX2.

So yeah, Intel is very confident in AVX2. And they designed VEX to be extendable to 1024-bit...
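
To make that concrete, here's a rough sketch (hypothetical function and array names, single precision) of the kind of gather + FMA code AVX2 enables, the two operations that used to be a GPU/LRBni exclusive; compile with something like -mavx2 -mfma:

#include <immintrin.h>

/* Sketch: gather eight floats through an index table, then apply a
   fused multiply-add, eight lanes per instruction. Assumes n is a
   multiple of 8 and all pointers are valid. */
void scaled_gather_fma(float *dst, const float *table, const int *idx,
                       const float *scale, const float *bias, int n)
{
    for (int i = 0; i < n; i += 8) {
        __m256i vidx = _mm256_loadu_si256((const __m256i *)(idx + i));
        __m256  vt   = _mm256_i32gather_ps(table, vidx, 4);      /* AVX2 gather   */
        __m256  vs   = _mm256_loadu_ps(scale + i);
        __m256  vb   = _mm256_loadu_ps(bias + i);
        _mm256_storeu_ps(dst + i, _mm256_fmadd_ps(vt, vs, vb));  /* FMA: vt*vs+vb */
    }
}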
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
No one will upgrade from a quad to another quad? Why did people go from a 760 to a 2500K? Or a 2500K to a 3570K?

760 45nm to 2500K 32nm (more performance from IPC and frequency increase + less power consumption)
2500k 32nm to 3570K 22nm (more performance from IPC and frequency increase + less power consumption)

IB 22nm to Haswell 22nm (if Haswell has 10% more IPC and almost the same power consumption, I don't see any reason for anyone to upgrade from IB at that time).
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
I'm sorry, but that behemoth has 3-4 times the performance with a TDP of 210W, while the Intel Core i7-3820 has a TDP of 125W.
Yes, but the i7-3820 uses a 32 nm process, while TSMC's 28 nm process is much closer to Intel's 22 nm process in mass availability. And Intel is going to keep that process advantage for the foreseeable future. Also, it takes only a small reduction in clock speed to bring that TDP down. The i7-3820QM has a TDP of only 45 W and likely won't score much lower than the i7-3820.

And keep in mind that the performance/Watt improvement won't end with AVX2. AVX-1024 could be implemented by splitting each instruction into four 256-bit operations at issue. That doesn't increase throughput, but the CPU's power-hungry front-end would only have to deliver 1/4 of the instruction count, and the schedulers would have much less switching activity as well. So power consumption would go down, and it has the added advantage of more latency hiding. It's a tiny extension that would make the CPU behave very much like a GPU.
I believe that you will find that GPUs are far more efficient than your CPU for GPGPU. Just wait and see how much more efficient the GK100/110 will be.
Again, that doesn't matter for the mainstream market because a lot of people will buy the GTX 680 or GPUs based on the same architecture. So developers are not inclined to adopt GPGPU when a large percentage of the latest hardware can't run it efficiently.
It is the reason most supercomputers are using GPUs rather than CPUs for highly compute-intensive scenarios. ;)
Supercomputers aren't using GPUs because of some magical feature the CPU won't ever have. They're using them because they fit the SPMD programming model. That advantage will go away when the CPU supports AVX2, which is clearly oriented at SPMD!
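
For what SPMD actually looks like in code, here's a minimal sketch (a made-up kernel, nothing Haswell-specific): every iteration is independent, so a compiler targeting AVX2 can map eight single-precision elements onto one 256-bit register instead of launching GPU threads, e.g. with gcc -O3 -mavx2 -mfma:

/* SPMD-style kernel: the same program applied to every data element. */
void saxpy(float *y, const float *x, float a, int n)
{
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];   /* one FMA per element once vectorized */
}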

Besides, let's get something straight, the topic is about Haswell's core count: That is, four cores for the mainstream part. Hence this isn't about supercomputers or anything of the like. It's about how much better a quad-core Haswell will be compared to previous quad-cores, and as an extension how this will affect mainstream GPGPU. Its days are numbered.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
IB 22nm to Haswell 22nm (if Haswell has 10% more IPC and almost the same power consumption, I don't see any reason for anyone to upgrade from IB at that time).

If you seriously think SB to IB was a 'good' upgrade just because of 22nm, and at the same time dismiss the next 'Tock' just because it is on the same process, then you clearly do not understand what a new uArch can bring to the table.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
I know what a new uArch can bring to the table, but I also know that Intel will be targeting increased iGPU performance too. Haswell's iGPU will be bigger than IB's, and it will eat precious die size from the CPU cores.
 

frostedflakes

Diamond Member
Mar 1, 2005
7,925
1
0
People get too hung up on core counts. Architectural improvements have allowed Intel to squeeze significantly more performance out of the same number of cores. For example, here's a $200 2500K vs. what used to be a, what, $400 Q6600? Both are quad cores, but the 2500K slaughters the Q6600, offering at least double the performance on average. Also notable is that the dual-core w/HT i3-2100 outperforms the Q6600 quad core even in most heavily threaded tasks. In the end, who cares how many cores the CPU has, as long as performance isn't stagnating, which it isn't.

http://www.anandtech.com/bench/Product/288?vs=53

Like others have explained, Intel's mainstream processors are intended for regular users, and for the vast majority of regular users something like an i5 is already overkill. Intel is more concerned with improving per-core performance, since 99.9% of programs out there can't utilize more than eight threads, and with improving integrated GPU performance, which tends to bottleneck a person in games and other tasks much more than CPU performance does in everyday computing.

If you want or need more cores, you're part of a niche market and you have to pay a premium for their enthusiast/workstation oriented parts. Just the way it is.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
I know what a new uArch can bring to the table, but I also know that Intel will be targeting increased iGPU performance too. Haswell's iGPU will be bigger than IB's, and it will eat precious die size from the CPU cores.
Precious die size? Ivy Bridge is tiny. You should check the size and cost of AMD chips. Intel will have no trouble making the IGP bigger while also adding AVX2 and TSX to the CPU cores, and still make a lot of profit.

Also note that Haswell will feature various IGP models. Even the top model won't result in a big die, and the rest is still tiny.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
If you seriously think SB to IB was a 'good' upgrade just because of 22nm, and at the same time dismiss the next 'Tock' just because it is on the same process, then you clearly do not understand what a new uArch can bring to the table.

+1

Sandy Bridge, on the same 32nm process as Westmere, brought a 12-15% performance increase across the board. Ivy Bridge over Sandy Bridge brings less than half that. At the same time, power consumption at the same frequency dropped by more than 10%, which is why the mobile chips got such a big advancement.
 
Aug 11, 2008
10,451
642
126
People get too hung up on core counts. Architectural improvements have allowed Intel to squeeze significantly more performance out of the same number of cores. For example, here's a $200 2500K vs. what used to be a, what, $400 Q6600? Both are quad cores, but the 2500K slaughters the Q6600, offering at least double the performance on average. Also notable is that the dual-core w/HT i3-2100 outperforms the Q6600 quad core even in most heavily threaded tasks. In the end, who cares how many cores the CPU has, as long as performance isn't stagnating, which it isn't.

http://www.anandtech.com/bench/Product/288?vs=53

Like others have explained, Intel's mainstream processors are intended for regular users, and for the vast majority of regular users something like an i5 is already overkill. Intel is more concerned with improving per-core performance, since 99.9% of programs out there can't utilize more than eight threads, and with improving integrated GPU performance, which tends to bottleneck a person in games and other tasks much more than CPU performance does in everyday computing.

If you want or need more cores, you're part of a niche market and you have to pay a premium for their enthusiast/workstation oriented parts. Just the way it is.

I see what you are saying, but by the same token, look at how the price has come down on quad cores. So if they can bring the price of a quad core down from $400 to $200, why can't they put out a $350 six-core? I agree that right now most people don't need more than a quad core. But if I were to buy a machine, I would want it to last several years at a minimum. Do you really think a quad core won't be inadequate in 3 to 5 years? You mention Intel is more interested in increasing per-core performance than in increasing the number of cores. I basically agree with that strategy to a point, but they seem to have reached a point of diminishing returns in trying to increase per-core performance. I think the reason people are disappointed is that IB neither increased per-core performance (at least to any significant degree) nor increased core count.
And I am beginning to fear Haswell will also be a minimal upgrade except for the IGP, which I don't really care about on the desktop.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Nehalem brought 20-30% gains because of Hyperthreading. The per-thread gains were no better than what Ivy Bridge had over Sandy Bridge.

The question is, will it end up like Sandy Bridge, with an overall 12-15% gain, or be like Nehalem, which gained 3-5% per thread but had a critical feature that boosted performance enormously elsewhere? And are we counting TSX as that feature?
 

Mr. Pedantic

Diamond Member
Feb 14, 2010
5,039
0
76
Yeah, but the people who do such big jobs can easily afford those processors. CPUs are probably the least of their costs anyway. But individuals don't run that sort of stuff.

No, they can't. The whole purpose of Folding@home, SETI@home, Muon, DCNET, etc. is that THEY CAN'T AFFORD THE COMPUTING POWER THEY NEED.

-Tasks being accelerated by dedicated and low power units like QuickSync and/or being offloaded to GPUs.
-Tablets becoming the potential biggest threat to PC.
-Multi-threading being extremely limited in the consumer space
http://www.anandtech.com/bench/Product/443?vs=287
- True. But if you have dedicated hardware for every single processor-heavy application you need, that is a heck of a lot of design and development time for a very specialized job. And besides, while QuickSync is good at its job, people by and large shun GPU video encoding because it has traditionally been very poor in quality compared to software encoding.
- Really? You think I want to do work on a tablet? You're dreaming.
- Really? As I said, the only major application I can think of that needs multithreading that doesn't have it is MS Office.

You see applications like Adobe CS4/WME/3DSMax/Blender already benefiting minimally from 2 additional cores. The gains are about the same as what Hyperthreading offered. Hell, the 3770K is beating the 3960X! I could point out a few more that will benefit absolutely nothing going to 8 cores but still benefit moving from 4 to 6 cores.
True. But it is still multithreaded, and your circular reasoning basically ensures that if nobody buys or demands CPUs with more cores or threads, Adobe sure as hell won't spend time and money programming for more.

-Low power increasing in importance
So more threads is even better, because there's more opportunity for power gating in the same chip.

So why should they make a dedicated chip for the few that want those extra few cores in the consumer space? Why not just keep making derivatives (3960X-like chips) of the Xeons, whose customers really need the extra performance?
We are not talking about making dedicated chips. We are talking about making EVERY chip in the lineup better.

Pardon any typos, I hate typing on my phone.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
The question is, will it end up like Sandy Bridge, with an overall 12-15% gain, or be like Nehalem, which gained 3-5% per thread but had a critical feature that boosted performance enormously elsewhere? And are we counting TSX as that feature?
Is AVX2's ability to make many code loops run up to eight times faster not good enough for you? ;)

TSX is revolutionary in its own right, but its role will grow when the core count increases beyond four.
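
For the curious, a rough sketch of what TSX's RTM interface looks like in practice (a hypothetical shared counter with a simple fallback spinlock; compile with -mrtm and check CPUID before relying on it):

#include <immintrin.h>

/* Sketch of lock elision with RTM: optimistically run the critical
   section as a hardware transaction; fall back to a real lock if the
   transaction aborts or the lock is already held. */
static volatile int fallback_lock = 0;   /* 0 = free, 1 = held */

void increment(long *counter)
{
    if (_xbegin() == _XBEGIN_STARTED) {
        if (fallback_lock)            /* a lock holder is active:        */
            _xabort(0xff);            /* abort and use the fallback path */
        (*counter)++;
        _xend();                      /* commit the transaction */
        return;
    }
    /* Fallback path: take the lock the old-fashioned way. */
    while (__sync_lock_test_and_set(&fallback_lock, 1))
        ;                             /* spin until free */
    (*counter)++;
    __sync_lock_release(&fallback_lock);
}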
 

LOL_Wut_Axel

Diamond Member
Mar 26, 2011
4,310
8
81
You've been fooled by GPU marketing. There is no such thing as a GPU with thousands of cores. So I'm terribly sorry, but you're the only one being delusional here.

GPU manufacturers count each SIMD lane as an individual core. Using the same "logic", mainstream Haswell will have 64 cores, running at over three times the clock frequency of the latest GPUs. For the record, the 22 nm HD 4000 has 128 such "cores"; however, they're running at only 1150 MHz. So there's really not that big a difference between a CPU and a GPU. We certainly don't need a big jump in core count.

That said, the instruction set is just part of the reason Haswell will kill mainstream GPGPU. CPUs can already put the latest GPUs to shame at GPGPU workloads. The reason for this is that there's no round-trip delay, no bandwidth bottleneck, and no hard register limit. And Haswell will strengthen these benefits with a GPU-like instruction set extension!

Haswell obviously won't kill mainstream GPGPU overnight, but it's blatantly obvious that GPGPU has no future. Adding AVX2 to the CPU won't cause any compromises. In contrast, for the GPU to become any better at GPGPU it has to sacrifice a considerable amount of graphics performance. It basically has to become more like a CPU. But that's downright silly. If becoming more like the CPU is the answer then why not let the CPU handle these workloads in the first place? AVX2 was the only missing bit to make that happen.

That's an interesting perspective.

I was thinking of getting an HD 7950 if it was a considerable amount less expensive than the GTX 670, more for the high compute performance than for the price difference. But Sandy/Ivy Bridge don't support AVX2, so would the HD 7950 still be a good alternative if you want both good gaming and compute performance? The only compute-heavy thing I'd do is folding@home... speaking of which, the results I've seen in f@h for the GTX 680 are terri-bad and they don't seem good for the HD 7900 series either. It looks like CPU folding continues to be the way to go.

Everything seems to be converging when it comes to performance. For example, nowadays there's barely any difference between RISC and x86, and GPUs (except Kepler) are targeting general-purpose (floating point, anyway) computing more. As we've seen with Tahiti, you do give up a bit of gaming performance, and some efficiency in both perf/W and perf/mm^2 in gaming tasks, to gain compute performance, but it's still a very good compromise, barring the fact that most people don't give two cents about compute performance.
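
As a side note on the "64 cores" figure quoted above, the arithmetic behind GPU-style lane counting is simple enough to write down (the two-FMA-ports-per-core number is an assumption based on what has been reported for Haswell):

#include <stdio.h>

int main(void)
{
    /* Count "cores" the way GPU marketing does: one per SIMD lane. */
    int cores          = 4;          /* mainstream Haswell quad-core        */
    int fma_ports      = 2;          /* assumed: two 256-bit FMA ports/core */
    int lanes_per_port = 256 / 32;   /* eight single-precision lanes        */

    printf("GPU-style \"core\" count: %d\n", cores * fma_ports * lanes_per_port);
    return 0;   /* prints 64, the figure quoted above */
}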
 

WhoFarted

Junior Member
Mar 16, 2012
6
0
0
A 3930K is 500 USD,
plus a cheap mobo at 250-300 USD.

That's roughly 750-800 USD.

Now let's try that overseas here in EU:

3930K = 900 USD.
Granted, there are cheap mobos for 350 USD in my territory, but most are 400 USD.

That comes to 1250-1300 USD on average.

That's more than 50% more than the US cost.
A cheap mobo + 3550k costs me 500-600 USD.
(Depending on mobo/features).


So no, it's not just "SIMPLE CHEAP JUST A BIT MORE" for everyone.
And this is ignoring that if I lived in the US, there'd suddenly be CRAZY deals on mobo + mainstream i5/i7 combos at most big retailers.

I've seen this kind of price difference before. Is this a result of VAT or just a higher retail price (or both)?
 

Lonbjerg

Diamond Member
Dec 6, 2009
4,419
0
0
Usually VAT. Without VAT the prices are very similar.

To add to that, for the uninformed:
There is 25% VAT on EVERYTHING in Denmark.

On TOP of that, many goods have extra taxes:
Sugar
Fat
Electronics

So it's all "hidden" taxes making hardware so much more expensive... politicians and trade barriers, not retail profit, dictate prices.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
I do not understand why things like VAT and other country-specific taxes come into the discussion. Intel and AMD have no control over what other countries' politicians do.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I do not understand why things like VAT and other country-specific taxes come into the discussion. Intel and AMD have no control over what other countries' politicians do.

Because you need to deduct them to compare prices.

Some people tend to use price + VAT and then whine that Intel/AMD/nVidia is making them pay more than some other country. And that's a very wrong assumption.

Electronics have a global price. Any variation in price is due to VAT and retail markup.
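
A trivial sketch of that deduction, using the 25% rate and the 900 USD figure already mentioned in this thread (illustration only):

#include <stdio.h>

/* Deduct VAT before comparing prices across regions:
   ex-VAT price = listed price / (1 + VAT rate). */
int main(void)
{
    double vat_rate  = 0.25;    /* e.g. Denmark's 25% VAT       */
    double eu_listed = 900.0;   /* EU shelf price, VAT included */

    printf("EU price excluding VAT: %.2f USD\n", eu_listed / (1.0 + vat_rate));
    return 0;   /* compare this figure, not the shelf price, with US pre-tax prices */
}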
 

Denithor

Diamond Member
Apr 11, 2004
6,300
23
81
Wonder if the iGPU on SB/IB could be used to run physics for games?

Finally, a valid reason for all those extra transistors...

:)
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
Wonder if the iGPU on SB/IB could be used to run physics for games?

Finally, a valid reason for all those extra transistors...

:)

SB = no

IB = maybe, as it is OpenCL compatible.

Haswell = most likely.

Not only games, but imagine being able to use the IGP for processor tasks similar to CUDA, without actually having to utilize your main GPU.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
IB = maybe, as it is OpenCL compatible.

Haswell = most likely.

Not only games, but imagine being able to use the IGP for processor tasks similar to CUDA, without actually having to utilize your main GPU.
The desktop Haswell chips will only support GT2, not GT3. The motivation is simple: for office systems a very basic IGP suffices, and multimedia/gaming systems have a discrete GPU anyway.

This also means the IGP will offer less computing power than AVX2, and it's far less efficient at generic computing anyway. The future of gaming physics is with AVX2.
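
To illustrate why physics maps so naturally onto AVX2, here's a minimal sketch (made-up structure-of-arrays particle state): every particle update is independent, so a vectorizing compiler can process eight of them per FMA instruction:

/* One explicit Euler step over a structure-of-arrays particle set. */
void integrate(float *px, float *py, float *pz,
               const float *vx, const float *vy, const float *vz,
               float dt, int n)
{
    for (int i = 0; i < n; ++i) {
        px[i] += vx[i] * dt;
        py[i] += vy[i] * dt;
        pz[i] += vz[i] * dt;
    }
}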
 

Denithor

Diamond Member
Apr 11, 2004
6,300
23
81
Not only games, but imagine being able to use the IGP for processor tasks similar to CUDA, without actually having to utilize your main GPU.

Plus, there won't be competing varieties like there are in discrete GPU space (AMD architecture significantly different from nVidia). Everyone programs one way and it just works...

The desktop Haswell chips will only support GT2, not GT3. The motivation is simple: for office systems a very basic IGP suffices, and multimedia/gaming systems have a discrete GPU anyway.

This also means the IGP will offer less computing power than AVX2, and it's far less efficient at generic computing anyway. The future of gaming physics is with AVX2.

So why does so much of the real estate need to be taken up with a virtually useless iGPU? It's overpowered already for 'typical' office/home uses and it's not likely to replace 200-300W monster discrete GPUs anytime soon.

If they can find distinct ways to use the hardware (QuickSync, perhaps physics, etc) it could still be worth including, but to keep increasing the portion of the chip for something few people really need/use seems kinda pointless.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
So why does so much of the real estate need to be taken up with a virtually useless iGPU? It's overpowered already for 'typical' office/home uses and it's not likely to replace 200-300W monster discrete GPUs anytime soon.

If they can find distinct ways to use the hardware (QuickSync, perhaps physics, etc) it could still be worth including, but to keep increasing the portion of the chip for something few people really need/use seems kinda pointless.
Like I said, desktop Haswell chips will only support up to GT2, not GT3. So it's not taking up a lot of die space. Mobile chips will have GT1 to GT3, depending on your gaming needs. So you're not paying for what you don't need either.

But seriously, the IGP is worthless for GPGPU. Even the latest GTX 680 gets beaten by current quad-core CPUs, so why would anyone have any hope that a much weaker IGP is going to be faster than a much more powerful CPU with AVX2?

People need to let go of the idea that GPUs are magically faster than CPUs. They merely have wide vector units, FMA, and gather. With AVX2 you get the exact same things on the CPU, and it already has a vastly superior memory hierarchy and much more advanced scheduling.