[THG]Core i7-4770K: Haswell's Performance-Previewed


inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
I'll agree it's a niche product affecting a small set of users, but note that it has been used worldwide, daily, for several years, by thousands of users in big companies like Procter & Gamble and Logitech just for this application: http://www.planogrambuilder.com/
so it's arguably more than a proof of concept

Anyway, the take-home message of the excerpt of my post quoted above (by ShintaiDK) is that the port was very easy, even though I use low-level constructs (intrinsics in wrapper classes). For developers relying on the vectorizer it will just be a matter of recompilation; for those using Intel libraries such as MKL/IPP, merely linking with new static libraries or shipping new DLLs.
I assume you are the developer of Kribi? Can you share performance numbers with and without the AVX2 codepath on Haswell (if you can, of course)?
 

bronxzv

Senior member
Jun 13, 2011
460
0
71

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
I had to try :D. Anyway kudos for supporting AVX2 at least, we can only hope some other software companies will follow suit.

There should be much more adoption with AVX2 than with AVX. The reason is that doubling the integer width is actually more useful for PCs than doubling the FP width. Video processing is supposedly all integer, so encode/decode/playback should all benefit. If AVX had doubled both, we'd likely have seen more adoption, but that hasn't happened.

But because AVX only doubles floating point, we only see it used in scenarios where we'll never see a direct benefit, like Xeon workstations or the strategic pushing of new (yet niche) markets, and in few PC applications.
 

Acanthus

Lifer
Aug 28, 2001
19,915
2
76
ostif.org
40% increase over his current hardware. Not 40% in a single generation.

This, it has to beat my current CPU, on average, by 40% in single threaded performance.

I don't care how this is achieved, IPC, clockspeed, cache, chipset, etc.

That is just the metric I use to decide when to upgrade, because it stops me from wasting mounds of cash on marginal upgrades :p
 

Revolution 11

Senior member
Jun 2, 2011
952
79
91
This, it has to beat my current CPU, on average, by 40% in single threaded performance.

I don't care how this is achieved, IPC, clockspeed, cache, chipset, etc.

That is just the metric I use to decide when to upgrade, because it stops me from wasting mounds of cash on marginal upgrades :p
You might be waiting a good 3 to 7 years at the current rate of 5-15% annual IPC improvements. :eek: And Ivy shows that clock-speeds will not necessarily progress with new process nodes.
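As a rough check on that estimate, here is a minimal sketch of the compounding arithmetic. It assumes the 40% target has to come from IPC alone, that gains compound at a fixed rate each year, and that clock speeds stay flat; the function name `years_to_target` is made up for illustration.

```python
import math

def years_to_target(target_gain, annual_gain):
    """Years of compounding `annual_gain` needed to reach `target_gain`.

    Both arguments are fractions, e.g. 0.40 for a 40% improvement.
    Assumes a constant yearly gain and no clock-speed changes.
    """
    return math.log(1.0 + target_gain) / math.log(1.0 + annual_gain)

# The 5-15% annual range mentioned in the post above.
for rate in (0.05, 0.10, 0.15):
    print(f"{rate:.0%} per year -> {years_to_target(0.40, rate):.1f} years")
```

Under those assumptions the wait comes out at roughly 2.4 years for 15% annual gains and roughly 6.9 years for 5%, which is in the same ballpark as the 3 to 7 years quoted.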
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,076
3,908
136
There should be much more adoption with AVX2 than with AVX. The reason is that doubling the integer width is actually more useful for PCs than doubling the FP width. Video processing is supposedly all integer, so encode/decode/playback should all benefit. If AVX had doubled both, we'd likely have seen more adoption, but that hasn't happened.

But because AVX only doubles floating point, we only see it used in scenarios where we'll never see a direct benefit, like Xeon workstations or the strategic pushing of new (yet niche) markets, and in few PC applications.

When did INT SIMD become this magical beast that, once overcome, will transform all our PCs into the Spruce Moose? I haven't seen (and don't see outside of AnandTech) a chorus of people cursing CPU architectures for their lack of 256-bit INT SIMD support.

Just look at K8 vs K10 to get an idea of how much performance you gain on a comparable architecture when you double the execution width (K8 did SSE2 via two 64-bit ops). Remember that lots of AVX instructions have comparable SSE2+ instructions. Even PC games that are quite FPU heavy haven't seen massive adoption of AVX.
 

NTMBK

Lifer
Nov 14, 2011
10,448
5,829
136
I honestly think that FMA is going to be more useful than AVX2 for games. 3D geometry has a natural vector length of 3 or 4 (the W component for perspective projections) and games don't need 64-bit precision, so 256-bit vectors aren't much use. But FMA is available on 128-bit vectors, and performs what used to be two ops in one.

(And don't forget that Piledriver already implements both FMA3 and FMA4, so it should see a nice boost if it starts getting used.)
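To put rough numbers on the FMA point, here is a small sketch that only counts vector instructions. It assumes a 4x4 matrix times 4-component vector transform written as a sum of scaled columns, with one 4-wide instruction per multiply, add, or fused multiply-add. It says nothing about latency or throughput on real hardware, and the function names are made up for illustration.

```python
def transform_op_counts(columns=4):
    """Count 4-wide vector instructions for result = sum(col_i * v_i).

    Returns (separate, fused):
      separate: one multiply per column plus the adds that combine them
      fused:    one initial multiply, then one FMA per remaining column
    """
    separate = columns + (columns - 1)
    fused = 1 + (columns - 1)
    return separate, fused

def register_fill(components=4, bits_per_component=32, register_bits=256):
    """Fraction of a SIMD register occupied by a single vector."""
    return components * bits_per_component / register_bits

separate, fused = transform_op_counts()
print(f"mul+add: {separate} instructions, FMA: {fused} instructions")
print(f"a 4 x 32-bit vector fills {register_fill():.0%} of a 256-bit register")
print(f"and {register_fill(register_bits=128):.0%} of a 128-bit register")
```

Under these assumptions the transform drops from 7 instructions to 4 with FMA, while a single 4-float vector fills only half of a 256-bit register.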
 

Acanthus

Lifer
Aug 28, 2001
19,915
2
76
ostif.org
You might be waiting a good 3 to 7 years at the current rate of 5-15% annual IPC improvements. :eek: And Ivy shows that clock-speeds will not necessarily progress with new process nodes.

Yup.

And software threading isn't going to improve much, either. So I don't have any incentive to upgrade to an 8+ core CPU with the same single-threaded performance.

With the current IPC numbers I've been seeing from Haswell, I'm thinking it'll take 5.5-5.7GHz to reach my goal.

I'm thinking one more generation might do it. Moving to DDR4 and clocking it up, mature AVX2 apps, die shrink, and another slight general IPC bump through architecture tweaks.

It was torture to wait for Sandy to beat my Xeon X3220 @ 3.6GHz :D
 

thm1223

Senior member
Jun 24, 2011
336
0
71
So to sum it up, CPU performance improvement is minimal, iGPU gains are pretty significant, and power consumption is greatly reduced?
 

buklau

Member
May 4, 2012
135
0
76
I wonder if it's worth upgrading from a 4.5GHz 2500K to a Haswell 4770K? :hmm: My current 2500K would run 4.8GHz at 1.4V; I haven't tried higher since P95 load temps are already in the low 70s, and no more than 63C at 4.5GHz and 1.328V.
 

VulgarDisplay

Diamond Member
Apr 3, 2009
6,188
2
76
I wonder if it's worth upgrading from a 4.5GHz 2500K to a Haswell 4770K? :hmm: My current 2500K would run 4.8GHz at 1.4V; I haven't tried higher since P95 load temps are already in the low 70s, and no more than 63C at 4.5GHz and 1.328V.

I'm struggling with the exact same situation. I know Haswell will be an upgrade from the 2500K, especially if I go with a part that has hyperthreading. The question is, will it be worth upgrading to Haswell, or would it be better to wait a little longer?

For me, it all depends on what next-gen game development brings us in the CPU department.
 

Ventanni

Golden Member
Jul 25, 2011
1,432
142
106
Given the preliminary benchmarks, I just can't see upgrading from SB to Haswell being worth it at the high end. I think the majority of the performance improvement will come in the 10/17/25/35W segments, but at our level, where power isn't as much of a concern, we'll see minimal to marginal gains at best.
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
I'm struggling with the exact same situation. I know Haswell will be an upgrade from the 2500K, especially if I go with a part that has hyperthreading. The question is, will it be worth upgrading to Haswell, or would it be better to wait a little longer?

For me, it all depends on what next-gen game development brings us in the CPU department.

Luckily I dumped my i5 while it still had decent value, i3-540 to i7 Haswell is an easy choice to make ;)
 

VulgarDisplay

Diamond Member
Apr 3, 2009
6,188
2
76
Luckily I dumped my i5 while it still had decent value, i3-540 to i7 Haswell is an easy choice to make ;)

Maybe I'd be better off getting a 2700K and keeping the same motherboard just to get hyperthreading.

Especially if the performance difference isn't going to be worth a $400 CPU and $200 motherboard over said hyperthreaded 2700k.

EDIT: Never realized the 3770k uses the same socket as the 2500k. Interesting.
 

cytg111

Lifer
Mar 17, 2008
26,161
15,586
136
I wonder if it's worth upgrading from a 4.5GHz 2500K to a Haswell 4770K? :hmm: My current 2500K would run 4.8GHz at 1.4V; I haven't tried higher since P95 load temps are already in the low 70s, and no more than 63C at 4.5GHz and 1.328V.

No. Unless yes. Well, to be certain we need more reviews, but if your applications won't get support for AVX2 or simply don't gain from it, then from what we know now it would be a side-grade, or close to it.
It pretty much stands or falls with AVX2.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
Maybe I'd be better off getting a 2700K and keeping the same motherboard just to get hyperthreading.

Especially if the performance difference isn't going to be worth a $400 CPU and $200 motherboard over said hyperthreaded 2700k.

EDIT: Never realized the 3770k uses the same socket as the 2500k. Interesting.
A used 2600K is an even better option. You will get SMT support and great OC headroom, all on the same socket.
 

Piroko

Senior member
Jan 10, 2013
905
79
91
There should be much more adoption with AVX2 than with AVX. The reason is that doubling the integer width is actually more useful for PCs than doubling the FP width. Video processing is supposedly all integer, so encode/decode/playback should all benefit.
Not necessarily. Video transcoding can be done in either integer or float format, the common denominator being that it's low precision (for higher performance). Float might even be better for the job iirc, but don't quote me on it.
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
I could have sworn video processing was floating point intensive... hence the speedup from OpenCL.
 

Piroko

Senior member
Jan 10, 2013
905
79
91
Yeah, the OpenCL implementations will probably use float instructions. Float will also give slightly higher precision given the choice, but CPU encoders apparently were mostly integer based, because CPUs historically had much higher int throughput. At least the Xvid encoder was; I couldn't find a more recent source :x
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
I could have sworn video processing was floating point intensive... hence the speedup from OpenCL.

Video encoding is almost all integer. The speedup from GPU usage is due to the amount of parallel computations that are available.

Also note that all current implementations of non-CPU encoding sacrifice a lot of quality in order to accommodate GPU architectures.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
I could have sworn video processing was floating point intensive... hence the speedup from OpenCL.

If that were true then AMD's 8-core processors would totally suck wind compared to Intel's 4-core processors in video processing apps. Which isn't the case, except for those portions of the transcode job that must be handled serially instead of in parallel.
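The caveat about serial portions can be sketched with Amdahl's law. This is only an illustration: the parallel fractions below are hypothetical, not measurements of any transcoder, and per-core speed differences between the two vendors are ignored.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup over one core when only `parallel_fraction` of the
    work can be spread across `cores` (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Hypothetical parallel fractions for a transcode job.
for p in (0.50, 0.90, 0.99):
    four = amdahl_speedup(p, 4)
    eight = amdahl_speedup(p, 8)
    print(f"parallel={p:.0%}: 4 cores -> {four:.2f}x, 8 cores -> {eight:.2f}x")
```

The more of the job that has to run serially, the less the extra cores help, and the more single-threaded speed decides the result.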