Is Intel still on the Nehalem architecture as their groundwork?


NTMBK

Lifer
Nov 14, 2011
10,435
5,785
136
You do understand that you compare 4c+HT against 4c only and get ~33% gain from HT + IPC improvement, entirely in-line with SMT scaling in Cinebench and ~10% ST improvement per clock.

Yeah, 4 more threads certainly change a lot of performance :p

Good point, that one doesn't isolate the effects from HT. Here's one that does:

[Image: cinebench.gif]

So you're right, there's a big gain from HT; but even with HT disabled, there's still a 16.4% improvement.
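
As a rough sanity check on the figures in the quote: speedups compose multiplicatively, so the ~33% combined gain and the ~10% per-clock gain can be divided to see what is left for SMT. This is only a sketch; the function name is made up for illustration, and the inputs are the approximate numbers from the quote, not measurements.

```python
def implied_factor(total_gain_pct, known_gain_pct):
    """Return the speedup factor left over once a known gain is divided out.

    Assumes the two effects multiply (total = known * remaining),
    which is a simplification.
    """
    total = 1 + total_gain_pct / 100
    known = 1 + known_gain_pct / 100
    return total / known


# ~33% combined gain and ~10% per-clock gain, both taken from the quote above.
smt_factor = implied_factor(33.0, 10.0)
print(f"Implied SMT scaling: {(smt_factor - 1) * 100:.1f}%")
```

That leaves roughly 21% for SMT, which fits the quoted claim that the result is in line with SMT scaling in Cinebench.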
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,065
3,882
136
From a core perspective, Sandy Bridge and Haswell are the two biggest changes in the last few generations.

Sandy Bridge went from a ROB to a physical register file with a retirement queue (completely changing the way the OoO engine works), added a uop cache, and went to the ring bus.

Haswell dramatically widened the core compared to Sandy Bridge and increased load/store and cache bandwidth by a factor of 2 per core.

Skylake, compared to those two, is a very modest upgrade.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
So you're right, there's a big gain from HT; but even with HT disabled, there's still a 16.4% improvement.

If you take out the Turbo gain, it goes down to 11-12%. Applications that were really bandwidth hungry or benefited from Nehalem's parallelization improvements (even without HT) did better, but the average was 5-10%, close to the former. Xbitlabs had a great normalized comparison, but the page is down now.
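
To make the Turbo adjustment concrete, here is a minimal sketch that works backwards from the 16.4% raw figure and the 11-12% normalized figure to the clock advantage they imply. It assumes Turbo acts as a pure clock-speed multiplier and that performance scales linearly with clock, both simplifications; the function names are hypothetical.

```python
def implied_clock_ratio(raw_gain_pct, normalized_gain_pct):
    """Clock ratio that would turn the raw gain into the normalized gain."""
    return (1 + raw_gain_pct / 100) / (1 + normalized_gain_pct / 100)


def clock_normalized_gain(raw_gain_pct, clock_ratio):
    """Percent gain left after dividing out a clock-speed ratio."""
    return ((1 + raw_gain_pct / 100) / clock_ratio - 1) * 100


# 16.4% is the HT-off result quoted above; 11% and 12% bracket the estimate.
for normalized in (11.0, 12.0):
    ratio = implied_clock_ratio(16.4, normalized)
    print(f"{normalized:.0f}% per-clock gain implies ~{(ratio - 1) * 100:.1f}% higher clock")
```

Under those assumptions, the numbers correspond to Turbo contributing roughly 4 to 5% of extra clock in that test.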

Especially back in Nehalem's day, the gain in multi-threading wasn't as well perceived as it is now (another reason is that more than a few probably expected general gains to be in the 25% range, like the Athlon 64's were). Adding things like an IMC and HT is more of a special-purpose improvement than an architectural change. Single-threaded performance is a misnomer. We mean "how much faster is the new CPU without recompiling and even in our old software".

That's another reason most aren't impressed with the HEDT platform. Even if we don't care about the cost, the extra cores are often not worth it. Heck, even multi-threaded applications have big trouble scaling. Single-threaded gains lift all boats, and those come with real uarch changes.
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
Nehalem is 10 years old now and was their first Core i5/i7 architecture. Is Intel limited by the fact that their groundwork is old now? Do they need a clean slate for their future generations?

Their 2% for 1% rule might be limiting them. They need to massively increase several parts of the core, such as the OoO window, the scheduler, and the L1 instruction and data caches (widths, not just size). But they probably can't do this without breaking their 2:1 rule. I think they need to break it, and then look to recapture power efficiency elsewhere. For example, adding more power gating granularity to the caches. Double the cache, but allow portions of it to be gated off if not needed. Maybe even add power gating granularity to the buffers themselves. For example, if the uop cache fills, power up the second half and keep filling it. Power it down when it empties.
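
To illustrate the trade-off, here is a minimal sketch assuming the 2:1 rule means a feature must deliver at least 2% more performance for every 1% of added power. The function name and all the percentages are hypothetical, picked only to show how recovering power elsewhere could bring a rejected feature back under the rule.

```python
def passes_two_for_one(perf_gain_pct, power_cost_pct, ratio=2.0):
    """True if the performance gain is at least `ratio` times the power cost.

    This is one reading of the "2% for 1%" rule, not a statement of
    how Intel actually evaluates features.
    """
    return perf_gain_pct >= ratio * power_cost_pct


# Hypothetical: a much wider core gives +5% performance for +4% power.
wider_core = (5.0, 4.0)
print(passes_two_for_one(*wider_core))  # fails: 5 < 2 * 4

# Hypothetical: finer power gating claws back 2 points of that power cost.
with_gating = (5.0, 4.0 - 2.0)
print(passes_two_for_one(*with_gating))  # passes: 5 >= 2 * 2
```

The point is only that the rule is evaluated on net power, so savings from gating could offset the cost of a wider design.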