Question Intel 12th to 13th generation performance comparison


GunsMadeAmericaFree

Golden Member
Jan 23, 2007
1,392
379
136
Intel13thGenRefresh.jpg


I thought this was an interesting read - benchmark comparisons between Intel 12th generation & 13th generation:

Article with details

That's an average performance increase of 47% from one generation to the next. I wonder if AMD will have a similar increase?
 

Harry_Wild

Senior member
Dec 14, 2012
860
169
106
Most general users want to know single-thread performance comparisons. Multicore is for crunching numbers, editing graphics and video, etc. Single core is for reading email, rendering web pages, and streaming content!
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
Even if they did nothing other than use the freed space to make the caches larger, this elementary move would help performance. And there are many improvements they can make just by adding more of the already existing functional units. And by making things simpler, they can more easily implement new ways of doing things.

This is not hard to imagine.
How many of the single-threaded apps that exist in the world actually hit the cache or functional unit limits?
Those are things that can limit a highly threaded or highly parallel workload, including many games, but not single threads.
 

Kocicak

Golden Member
Jan 17, 2019
1,177
1,232
136
...
So HT on completed the task 8.7% faster. Doesn't seem like much ....

No, it is not much at all. Do not forget that this improvement is this small only because you were running 100% load. Any lesser load would make this improvement even tinier.

Now imagine that without the HT circuitry, and after improvements, you could be getting an improvement of, say, 3-5% on everything below 24 threads.

Sounds great, right? Any normal user, who does not spend all the time rendering and transcoding (or whatever) things, would welcome this with open arms.

BTW I have a gut feeling, given how huge a complication HT is and how broad its consequences and connections in the core really are, that the improvement enabled by dropping it could be even 15 percent or more.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
BTW I have a gut feeling, given how huge a complication HT is and how broad its consequences and connections in the core really are, that the improvement enabled by dropping it could be even 15 percent or more.
You've got that exactly backwards. Choosing not to extract TLP performance could be as much as a 15% loss in MT or memory-bound scenarios.

Doesn't anyone in this forum study Computer Science any more?

No, just posting your imaginative "feels"?
 
  • Like
Reactions: igor_kavinski

Exist50

Platinum Member
Aug 18, 2016
2,452
3,106
136
You've got that exactly backwards. Choosing not to extract TLP performance could be as much as a 15% loss in MT or memory-bound scenarios.

Doesn't anyone in this forum study Computer Science any more?

No, just posting your imaginative "feels"?
The entire premise of this discussion is that ST performance is worth more than the MT gain from SMT. A modern consumer CPU has plenty of other cores for thread level parallelism.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,106
136
I meant the jump from Skylake to Golden Cove. I think it was over 40% if memory serves. BTW, I just encoded a 4K 60 FPS video with Handbrake x265 with HT on and HT off.

HT off results:



HT on results:



So HT on completed the task 8.7% faster. Doesn't seem like much until you realize that this workload pegged all 32 threads on my CPU at 100% load, and with HT off, all 24 threads were also pegged to 100%.

This is a highly threaded workload of course, but my point is that having HT on still made the CPU notably faster and more efficient against 24 physical cores.
I don't think anyone is debating the fact that in sufficiently parallel workloads, SMT improves performance. Instead the question is whether that's truly meaningful for the target market vs more single thread performance.
 

Kocicak

Golden Member
Jan 17, 2019
1,177
1,232
136
A modern consumer CPU has plenty of other cores for thread level parallelism.
This differs somewhat now: while the lower-end AMD 7600X has only 6 physical cores, the 13600K from Intel has 14 of them. I do not believe that a "normal consumer" needs HT in the 13600K, but it could still be useful in the 7600X.
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
The entire premise of this discussion is that ST performance is worth more than the MT gain from SMT.
But it's not,

unless you run extremely specific software 100% of the time...and the same goes for MT and SMT.
For a normal user that runs normal stuff there will be no difference at all with the core being even 10% faster or slower.
Do a blind test on someone and reduce their clocks by 10% and see if they notice...they would only notice if they are running extremely specific things... that tell them the performance in real time, in the form of FPS or points per sec or something.
Most people have their systems run at idle, at like 800 MHz or however low modern CPUs go nowadays, and they rarely ever go to full speed.
 
  • Like
Reactions: JustViewing

Kocicak

Golden Member
Jan 17, 2019
1,177
1,232
136
...
For a normal user that runs normal stuff there will be no difference at all with the core being even 10% faster or slower.
...
Yes, then why bother improving anything. If everybody was thinking this way, we would still be counting on abaci.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,547
3,241
136
But you are assuming that this is possible without any basis in reality without having any facts. You just believe it so it must be true. You can't possibly know if the IPC improvement they have each gen isn't the max that is possible, no matter what they do.
If it were the case that the ST IPC throughput achieved is "the max that is possible", then there would never be any advancement in processor development and that design would be the end of the road. (And, believe it or not, IPC is NOT the only determinant of core performance: extra transistors can also be used to increase the achievable frequency of a design, though, as with everything, there is a cost.)

There is always a way to increase IPC in a processor by adding transistors. It is absolutely a situation of diminishing returns, but there are still gains to be made. Every design out there, and I do mean EVERY design, is a compromise between development budget, die area available, targeted yields, etc. that limits the total number of transistors and features that wind up in the core. The decision to spend ~10% of the transistor budget of a core on HT enablement is also a decision to limit the size of the L1 Data and instruction caches, the TLBs, the OoO window, improved function units, converting microcode instructions to actual transistors, etc. All of those things can afford more ST performance with more resources (in the right balance and with proper planning, of course). In a world without E-cores, it makes a TON of sense to have HT on an x86 core. In a world with a high number of e-cores, it makes a lot less sense to keep paying the price for HT on the P cores.
 
  • Like
Reactions: BorisTheBlade82

LightningZ71

Platinum Member
Mar 10, 2017
2,547
3,241
136
I meant the jump from Skylake to Golden Cove. I think it was over 40% if memory serves. BTW, I just encoded a 4K 60 FPS video with Handbrake x265 with HT on and HT off.

HT off results:



HT on results:



So HT on completed the task 8.7% faster. Doesn't seem like much until you realize that this workload pegged all 32 threads on my CPU at 100% load, and with HT off, all 24 threads were also pegged to 100%.

This is a highly threaded workload of course, but my point is that having HT on still made the CPU notably faster and more efficient against 24 physical cores.
You are doing that test in isolation of two things that are being proposed: 1) The P cores gain back the transistors used to support HT and spend that on ST improvement. 2) Some (in this case, a pair) of the P cores are replaced with E-cores.

In your case, you have 8 P cores performing at speed 1, 16 E cores performing at speed ~.7, and 8 HT threads performing at about .1. Exchange two of those P cores for 8 more E cores. Use the available transistors from HT to instead make the P cores 10% faster. That gives you 6 * 1.1 = 6.6 plus 24 * .7 = 16.8, for a total of 23.4. Your current processor has 8 + 11.2 + (8 * .1) = 20.

Those are rough numbers, and are highly situationally dependent, but, the point is that by sacrificing 2 performance cores and HT on the performance cores, you gain space for 8 additional E cores. That's exchanging two hardware cores for 8 that are roughly 70% as good. You also gain performance on the P cores. Now, you also have a processor that is even faster in lightly threaded loads, including games if that's important to you, and also faster in heavy MT loads.
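The arithmetic above can be written out as a minimal sketch. The relative speeds (1, 0.7, 0.1) and the 10% P-core uplift are the ballpark assumptions from this post, not measurements, and the function and variable names are made up for illustration.

```python
# Rough throughput model using the relative speeds quoted in the post.
# All weights are ballpark assumptions, not measured values.

P_SPEED = 1.0    # one P core, first thread
E_SPEED = 0.7    # one E core, relative to a P core
HT_SPEED = 0.1   # extra throughput from a P core's second (HT) thread


def throughput(p_cores, e_cores, ht=True, p_uplift=0.0):
    """Sum of relative per-core speeds for a fully loaded chip."""
    p_total = p_cores * P_SPEED * (1.0 + p_uplift)
    e_total = e_cores * E_SPEED
    ht_total = p_cores * HT_SPEED if ht else 0.0
    return p_total + e_total + ht_total


# Current layout: 8 P cores with HT plus 16 E cores.
current = throughput(8, 16, ht=True)
# Proposed layout: 6 P cores without HT (assumed 10% faster) plus 24 E cores.
proposed = throughput(6, 24, ht=False, p_uplift=0.10)

print(f"current  8P+16E with HT: {current:.1f}")
print(f"proposed 6P+24E, no HT:  {proposed:.1f}")
print(f"change: {100 * (proposed / current - 1):.0f}%")
```

Changing `p_uplift` or `E_SPEED` shows how sensitive the conclusion is to those two guesses.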

Your performance hit for disabling HT doesn't take into account the fact that the P cores would be FASTER without the HT circuitry, not because the HT circuitry itself imposes a significant penalty by its existence, but because it takes up space and transistor budget that could go towards making the processor itself inherently faster for single threaded work. Consequently, they would have lower resource utilization, but they would also generate less heat per unit of work performed in ST (unused resources are power gated in modern processors). That can be demonstrated now by disabling HT in most any modern processor and watching its power draw and heat dissipation drop because it's using less of its internal resources each clock. This means MORE power budget for the rest of the processor and the ability to sustain higher all-core clocks under load.
 
  • Like
Reactions: Rigg

LightningZ71

Platinum Member
Mar 10, 2017
2,547
3,241
136
But it's not,

unless you run extremely specific software 100% of the time...and the same goes for MT and SMT.
For a normal user that runs normal stuff there will be no difference at all with the core being even 10% faster or slower.
Do a blind test on someone and reduce their clocks by 10% and see if they notice...they would only notice if they are running extremely specific things... that tell them the performance in real time, in the form of FPS or points per sec or something.
Most people have their systems run at idle, at like 800 MHz or however low modern CPUs go nowadays, and they rarely ever go to full speed.
Under your premise, the P cores having HT at all would be completely unnoticed by any normal user. However, when that user goes to do something that happens to tax all the cores, they would benefit from there being more E cores and fewer P cores (down to a point of course) as it would complete that task faster. We're not arguing to eliminate HT without anything in return; the trade is fewer P cores that are faster, and more E cores to crunch the heavy loads.
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
Yes, then why bother improving anything. If everybody was thinking this way, we would still be counting on abaci.
How is that an argument on how important ST is?
A normal user is in the i3-i5 range, while the ST barriers get broken on the top halo CPU only.
If it were the case that the ST IPC throughput
...
...
achieved is "the max that is possible" then there would never be any advancement in processor development and that design would be the end of the road.
So if the IPC increase we get each gen would be the only IPC increase we get then we would not get any IPC increase ever
...
...
...
Under your premise, the P cores having HT at all would be completely unnoticed by any normal user. However, when that user goes to do something that happens to tax all the cores, they would benefit from there being more E cores and fewer P cores (down to a point of course) as it would complete that task faster. We're not arguing to eliminate HT without anything in return; the trade is fewer P cores that are faster, and more E cores to crunch the heavy loads.
No you are arguing that the ST improvement would be high enough to negate the loss of HT and that is just insane.
 

dullard

Elite Member
May 21, 2001
26,044
4,690
126
Yes, then why bother improving anything. If everybody was thinking this way, we would still be counting on abaci.
1) He is correct, most people can just barely notice a 10% change and that is only when they are paying close attention to it. If it takes 0.55 seconds instead of 0.5 seconds to open a PDF most people don't really know or care.
2) Why improve? Because while one 10% change might not be noticeable, multiple 10% changes over years are definitely noticeable. They will notice how an older computer took 0.55 seconds to open that PDF but a newer computer does it smoothly in a flash. Those little brief pauses really do make you think a computer is slow. But you will notice if a new computer doesn't have that brief pause.
3) Why improve? Because normal software can then change over time. For example, it used to take an expert lots of time to remove an object from a photo; now any layperson can click on an undesired object in a photo and it vanishes and is filled in with what the background probably looked like.
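Point 2 is just compounding. A minimal sketch, assuming each generation is a flat 10% faster (a hypothetical constant rate) and starting from the 0.55 second PDF example; `open_time` is an illustrative name, not anything from a real tool:

```python
def open_time(baseline_s, gain_per_gen, generations):
    """Task time after several generations, each `gain_per_gen` faster."""
    return baseline_s / (1.0 + gain_per_gen) ** generations


# One 10% step is hard to notice; several in a row add up.
for gen in range(0, 6):
    print(f"after {gen} generations: {open_time(0.55, 0.10, gen):.2f} s")
```

One step takes the 0.55 s case to 0.5 s, which is the barely noticeable change from point 1; five steps cut it by well over a third.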
 

LightningZ71

Platinum Member
Mar 10, 2017
2,547
3,241
136
How is that an argument on how important ST is?
A normal user is in the i3-i5 range, while the ST barriers get broken on the top halo CPU only.

So if the IPC increase we get each gen would be the only IPC increase we get then we would not get any IPC increase ever
...
...
...

No you are arguing that the ST improvement would be high enough to negate the loss of HT and that is just insane.
No, I'm arguing that the 13900K would be a better all around product if the P cores were optimized for ST at the expense of HT capabilities AND they swapped 2 of the P cores for 8 more E-cores (2 quads, roughly equivalent space). That would give better ST AND better MT throughput. It would be a 6+24 core / 30 thread product. I haven't included that in every post because it gets redundant after a while.
 
  • Like
Reactions: Rigg

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
No, I'm arguing that the 13900K would be a better all around product if the P cores were optimized for ST at the expense of HT capabilities AND they swapped 2 of the P cores for 8 more E-cores (2 quads, roughly equivalent space). That would give better ST AND better MT throughput. It would be a 6+24 core / 30 thread product. I haven't included that in every post because it gets redundant after a while.
If we take Kocicak's numbers, which are only for CB and would be different for anything else...
We get these numbers and we would need a boost on ST of the remaining 6 p-cores just to break even on MT.
Also with the thread director keeping foreground tasks on the p-cores you would lose a massive amount of things you could do as foreground tasks, almost half.

22,640 = 8 main cores + HT
19,360 = 16 E-cores
42,000 total

12,780 = 6 main cores, no HT
29,088 = 24 E-cores
41,868 total


2,130 - 1st thread on a P core
1,210 - thread on an E core
700 - 2nd thread on a P core
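For anyone who wants to redo the sums, here is a minimal sketch built only from the three per-thread scores listed above. It assumes scores simply add up per thread, which is a simplification, and the function name is illustrative. Note that with 1,210 per E-core thread, 24 E-cores come to 29,040 rather than the 29,088 in the table, so the totals differ slightly; the gap to 42,000 stays under half a percent either way.

```python
# Per-thread Cinebench scores as listed in the post.
P_FIRST = 2130   # 1st thread on a P core
P_SECOND = 700   # 2nd (HT) thread on a P core
E_THREAD = 1210  # thread on an E core


def mt_score(p_cores, e_cores, ht):
    """Additive multi-thread score; assumes perfect scaling per thread."""
    per_p = P_FIRST + (P_SECOND if ht else 0)
    return p_cores * per_p + e_cores * E_THREAD


current = mt_score(8, 16, ht=True)     # 8P with HT + 16E
proposed = mt_score(6, 24, ht=False)   # 6P without HT + 24E

# ST uplift the 6 remaining P cores would need for proposed == current.
break_even = (current - 24 * E_THREAD) / (6 * P_FIRST) - 1

print(f"current:  {current}")
print(f"proposed: {proposed}")
print(f"break-even P-core uplift: {100 * break_even:.1f}%")
```

Under these assumptions the break-even ST boost for the six remaining P cores is on the order of one to one and a half percent.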
 

Starjack

Member
Apr 10, 2016
25
0
66
Most general users want to know single-thread performance comparisons. Multicore is for crunching numbers, editing graphics and video, etc. Single core is for reading email, rendering web pages, and streaming content!

Do you think that's why Intel went for hybrid designs in its new processors? From what I gather, single-threaded performance is usually influenced more by the performance cores, even with the latest chipsets.
 

dullard

Elite Member
May 21, 2001
26,044
4,690
126
Do you think that's why Intel went for hybrid designs in its new processors? From what I gather, single-threaded performance is usually influenced more by the performance cores, even with the latest chipsets.
The P cores are for what people interface with--the software that they are actively using in front of all other software. People want that to feel snappy, so the P cores should be as fast as possible. That snappiness comes at a cost though. P cores are power hungry. Add more and more P cores and soon you have an unsustainable inferno. That is where the E cores come in. They are intended to do the grunt work--to power through the background tasks and heavy workloads.

Hybrid is trying to give people the best of both worlds: snappiness and extreme multi-threaded performance. ARM did it, Apple did it, Intel did it. Even patents show that AMD probably will do it. https://www.tomshardware.com/news/amd-patent-hybrid-cpu-rival-intel-raptor-lake-cpu

In an ideal world where power consumption doesn't matter, you would want as many P cores as possible. But that just isn't the world that we live in. We are at the point now that we can't add more P cores without making the P cores weaker. That would make the software that you are actively using more sluggish.

Intel's problem is that they reached the power limitation before AMD did. And the result was some pretty crummy CPUs without enough E cores to dominate (stay far away from any CPU with 4 E cores as that gives you the worst of both worlds).
 
  • Like
Reactions: Starjack

LightningZ71

Platinum Member
Mar 10, 2017
2,547
3,241
136
If we take Kocicak's numbers, which are only for CB and would be different for anything else...
We get these numbers and we would need a boost on ST of the remaining 6 p-cores just to break even on MT.
Also with the thread director keeping foreground tasks on the p-cores you would lose a massive amount of things you could do as foreground tasks, almost half.

22,640 = 8 main cores + HT
19,360 = 16 E-cores
42,000 total

12,780 = 6 main cores, no HT
29,088 = 24 E-cores
41,868 total
So, from the numbers you posted, without any improvement on the P cores, the performance is, essentially, a wash. With any improvement to the P cores from the transistor budget afforded by removing HT, you'd already be past it. As for thread director, it only pushes foreground tasks onto the P cores on their first threads. It uses the HT threads for overflow and very low priority tasks. I realize that there isn't a technical difference in the capabilities of each thread, but populating them does force a sharing of resources that slows down the first thread. So, in the example, you lose 25% of the threads that would be available on foreground, and those remaining threads would be faster to boot, so for the vast majority of foreground tasks that only have one or two performance critical threads, they still perform better.
 

Kocicak

Golden Member
Jan 17, 2019
1,177
1,232
136
I played a little bit with 12600K at the extreme end of multithread load, where HT is most relevant.

Here are the results:

12600K HT max MT analyze.png

You need 4 E cores to compensate for the 6 second threads of the P cores with HT,
in both the power-unlimited and power-limited scenarios, while increasing the efficiency of the CPU.

Given how small E cores are and how many of them you can already have on a CPU, I declare hyperthreading dead (in personal computers for normal people).

This function, and all the complications stemming from it, is simply not necessary now, as it can easily be compensated for by adding a small number of E cores.

(I must admit that HT is still relevant on a CPU with a very low number of E cores, such as the 12600K, but these CPUs do not make much sense anyway.)
 
  • Like
Reactions: BorisTheBlade82

Starjack

Member
Apr 10, 2016
25
0
66
I recently watched a video from a YouTuber with an i5-12600K processor (6 P cores / 4 E cores), an Nvidia GeForce RTX 2060 Super GPU and 32GB of RAM, who tried playing games with the E cores enabled and disabled. The difference was only about a percent, slightly in favor of having the E cores enabled. The difference in CPU power usage was approx. 5%. Despite this, in both scenarios the games maintained nearly identical average framerates, within about 1%.
 

dullard

Elite Member
May 21, 2001
26,044
4,690
126
I recently watched a video from a YouTuber with an i5-12600K processor (6 P cores / 4 E cores), an Nvidia GeForce RTX 2060 Super GPU and 32GB of RAM, who tried playing games with the E cores enabled and disabled. The difference was only about a percent, slightly in favor of having the E cores enabled. The difference in CPU power usage was approx. 5%. Despite this, in both scenarios the games maintained nearly identical average framerates, within about 1%.
1) The majority of games don't need the 16 threads that the 12600K can process. So, having the 4 E cores just doesn't have a chance to help most games. I'm not at all surprised the results were within 1% of each other in that video.

2) The 12600K is one of the ugly stepchild processors with just 4 E cores. I would avoid it. While @Kocicak has data above showing a ~25% performance boost with the 4 E cores, for not much more money you could get 8 P cores (the 12700F is only $25 more). 8/6 = 1.33, so you should expect nearly a 33% boost in that situation with just having more P cores (plus you'd get E cores too so the final performance would be maybe ~50% more). Finally, there is a problem with the way Intel chose to use background software--it puts all the work onto the E cores. So, if you did want to run something in the background, you are making your few E cores do all the work. There just aren't enough E cores to make that a good use of the processor. The 13600K and 13600KF solve that latter problem with enough E cores to get the background jobs done in a reasonable amount of time.

Since you use video reviews, here is one showing the 13600KF dominates the 12600K for just $5 more:
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
OK I did another encoding test. I took that 4K60 FPS video and transcoded it to x265 1080p30 FPS. It was still a fairly heavy workload, but not as much as converting the original 4K60 FPS video to x265 4K60FPS.

Hyperthreading this time yielded only 3.77% faster performance. This made me think. The efficiency cores are definitely carrying a great deal of the load, because I remember testing HT on and off in encoding workloads on previously owned CPUs ranging from the 3930K, 4930K, 5930K and 6900K and the gains were always in the double digit territory, above 20% in many cases.

So I would definitely think that HT/SMT on CPUs that have efficiency cores isn't as potent as if the CPUs had none for obvious reasons. But I'm still not convinced that HT should be totally trashed in favor of the efficiency cores. In high TLP workloads, HT can still improve efficiency and throughput as evidenced by my previous example where HT yielded 8.7%.

Rendering should yield greater gains due to being embarrassingly parallel, but that workload is something mostly done by professionals (benchmarking doesn't count) that have way more cores than enthusiasts.

We also don't know the transistor cost. 5% die space was often quoted, but that was for the PIV if I'm not mistaken. These newer CPUs may have a much smaller transistor cost, which is why Intel and AMD still keep it around.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,260
16,118
136
OK I did another encoding test. I took that 4K60 FPS video and transcoded it to x265 1080p30 FPS. It was still a fairly heavy workload, but not as much as converting the original 4K60 FPS video to x265 4K60FPS.

Hyperthreading this time yielded only 3.77% faster performance. This made me think. The efficiency cores are definitely carrying a great deal of the load, because I remember testing HT on and off in encoding workloads on previously owned CPUs ranging from the 3930K, 4930K, 5930K and 6900K and the gains were always in the double digit territory, above 20% in many cases.

So I would definitely think that HT/SMT on CPUs that have efficiency cores isn't as potent as if the CPUs had none for obvious reasons. But I'm still not convinced that HT should be totally trashed in favor of the efficiency cores. In high TLP workloads, HT can still improve efficiency and throughput as evidenced by my previous example where HT yielded 8.7%.

Rendering should yield greater gains due to being embarrassingly parallel, but that workload is something mostly done by professionals (benchmarking doesn't count) that have way more cores than enthusiasts.

We also don't know the transistor cost. 5% die space was often quoted, but that was for the PIV if I'm not mistaken. These newer CPUs may have a much smaller transistor cost, which is why Intel and AMD still keep it around.
AMD will be a whole different story. You need both before saying anything.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
AMD will be a whole different story. You need both before saying anything.

Feel free to run that test if you want with SMT on and off.

I downloaded the video from here.

And I used the latest Handbrake with priority set to high and the preset was H.265 MKV 2160p60.
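If anyone wants to script the comparison instead of clicking through the GUI, here is a minimal sketch. It assumes `HandBrakeCLI` is installed and on the PATH, the file names are placeholders, and it does not set process priority, so it is not an exact reproduction of my setup. "Percent faster" is computed here as the throughput ratio (time with SMT off divided by time with SMT on, minus one), which is one reasonable reading of the figures quoted in this thread.

```python
import subprocess
import time


def build_handbrake_cmd(src, dst, preset="H.265 MKV 2160p60"):
    """Return the HandBrakeCLI argument list for one encode."""
    return ["HandBrakeCLI", "-i", src, "-o", dst, "--preset", preset]


def time_encode(src, dst, preset="H.265 MKV 2160p60"):
    """Run one encode and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(build_handbrake_cmd(src, dst, preset), check=True)
    return time.perf_counter() - start


def percent_faster(t_off, t_on):
    """How much faster the SMT-on run was, as a throughput ratio in percent."""
    return 100.0 * (t_off / t_on - 1.0)


# Usage (run once with SMT disabled in the BIOS, once with it enabled):
#   t = time_encode("source_4k60.mp4", "out.mkv")   # placeholder file names
#   print(f"encode took {t:.1f} s")
# Then compare the two timings with percent_faster(t_off, t_on).
```

Run each configuration a couple of times and compare the averages, since a single encode can vary with background load and thermals.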