CPU Cache Question...

Status
Not open for further replies.
On a CPU like the FX-8xxx, when a single or dual game thread is evenly balanced on cores 0,2,4,6, does that mean all 8MB of L2 and all 8MB of L3 cache is utilized in the process?

The reason I ask is Starcraft 2 performance. Even on all-low settings at 1024x768, my desktop is far faster than my laptop in custom madness maps (where more than 10x as many polygons are being rendered as in a normal Starcraft 2 map). The GPU load on both is negligible (0-5%) on low settings. When things get very intense, CPU load spikes to 100% on the P8700 and 25% on my desktop, meaning two and only two game threads are being utilized. Even then, memory usage stays below the 256MB threshold for the 3650, and GPU-Z shows practically no GPU usage on either machine. In fact, the 6870 down-clocks to 300 or 775MHz because there isn't much of a load at all.
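
The "two and only two threads" reading follows from simple arithmetic on the load figures. A rough sketch, assuming the P8700 exposes 2 logical cores, the desktop exposes 8, and each busy game thread fully saturates one core (the function name is made up for illustration):

```python
def busy_threads(total_load_percent, logical_cores):
    """Approximate number of fully busy threads implied by overall CPU load."""
    return total_load_percent / 100 * logical_cores

# Assumed core counts: 2 for the P8700, 8 for the FX desktop.
print(busy_threads(100, 2))  # laptop at 100% load
print(busy_threads(25, 8))   # desktop at 25% load
```

Both cases work out to the same number of saturated cores, which is consistent with the game being limited by two threads on either machine.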

I get that the P8700 (2.5GHz, 3MB L2) is clocked lower, but per clock it is more efficient than the FX-8350. It should be in the ballpark of a 3GHz FX core. It shouldn't be 3-4x slower than the FX unless cache played a huge role in the performance. Maybe cache plays a huge role because hundreds of the same set of polygons have to be rendered over and over, and using on-die cache reduces the hit? 😕

Any relevant input is appreciated.
 
Desktop hard drives normally spin faster than laptop hard drives. Also, your desktop has an SSD.

I would be able to pick apart the technicalities further if the laptop also had an SSD.
 
FWIW the laptop has more than enough RAM to load the 1.5GB at a time that SC2 demands. After loading the map, the HDD is barely touched except for music. The desktop also has Starcraft 2 on one of the (old) 1TB WD Blue HDDs.
 
On a CPU like the FX-8xxx, when a single or dual game thread is evenly balanced on cores 0,2,4,6, does that mean all 8MB of L2 and all 8MB of L3 cache is utilized in the process?

No, the L2 cache on each module will only be used by code running on that module. So if you have two threads active, they'll use 2MB of L2 cache if they're both on the same module, or 4MB if they're on separate modules. You do get the full 8MB of L3 cache though.
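
To make the module arithmetic concrete, here is a minimal sketch. It assumes an FX-8350-style layout (4 modules, 2 cores and 2MB of L2 per module, 8MB of shared L3) and that cores 0-1 sit on module 0, cores 2-3 on module 1, and so on; that core numbering and the function names are assumptions for illustration, not something read from the hardware.

```python
# Assumed topology: 2 cores per module, 2 MB of L2 per module,
# 8 MB of L3 shared by the whole chip.
CORES_PER_MODULE = 2
L2_PER_MODULE_MB = 2
L3_SHARED_MB = 8

def module_of(core_id):
    """Assumed numbering: cores 0-1 -> module 0, cores 2-3 -> module 1, ..."""
    return core_id // CORES_PER_MODULE

def reachable_cache_mb(active_cores):
    """Return (L2 MB, L3 MB) usable by threads running on the given cores."""
    modules = {module_of(core) for core in active_cores}
    return len(modules) * L2_PER_MODULE_MB, L3_SHARED_MB

print(reachable_cache_mb([0, 1]))  # two threads on the same module
print(reachable_cache_mb([0, 2]))  # two threads on separate modules
```

The L2 figure only counts modules that have a thread running on them at that moment, while the L3 figure is the same regardless of placement.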

But I really doubt the L3 cache would play this much of a role in a game.

When you say that it's 3-4x faster on the desktop does that mean the framerate is 3-4x higher?
 
That's how it appears. I'll have to save a replay on one and then run Fraps to get an exact number, but it's far better on my desktop. In maps like Bunker Wars X the laptop dips to 2-5fps, while my desktop hits as low as 7-12fps. The ending parts of Nexus Wars drag the P8700 almost to a standstill, but on the FX you're still able to scroll around at a low framerate. The video cards in both are not stressed at all during this time.

As stated in a thread I made in the CPU forum, upgrading the laptop from a T6400 (2Ghz, 2MB L2, 35w) to the P8700 (2.53Ghz, 3MB L2, 25w) has made a huge difference in Starcraft 2 custom maps. Going from the custom madness maps being nearly unplayable to playable was a big step with the upgrade, but my desktop is much smoother and the dips in crazy battles aren't hindering my ability to control what's going on. I know it's not temps as the P8700 never breaks 55C as I have it on a cooler, and the Mobility 3650 rarely breaks 60C in any game.

It is rare to go below 60fps on low settings @ 1280x800 on regular single player SC2 and most competitive multiplayer maps with the laptop, but the custom maps are a whole different ballgame.

SPBHM posted a tidbit in the upgrade thread about SC2 performance with higher cache.

snip
http://pclab.pl/zdjecia/artykuly/focus/cpu2013/def/sc2_1920.png
all 45nm "Core 2"
E5300 16.8 FPS (2.60GHz, FSB 200, 2MB L2)
E7300 18.6 FPS (2.66GHz, FSB 266, 3MB L2)
E8200 22.1 FPS (2.66GHz, FSB 333, 6MB L2)
snip

And this is in single player with ultra settings. Custom maps still have many more polys to be rendered. Hundreds of the same/similar units going at it.

Another interesting part of the bench is these two CPUs having similar framerates.

FX 8120 3100MHz 4M/8T 8MB L2/8MB L3 24.2 FPS
FX 4100 3600MHz 2M/4T 4MB L2/8MB L3 24 FPS

And the game only uses two threads at maximum, even with ultra settings. When I get a couple straight hours free I'll get some fraps replays benched.
 
Piledriver is a better micro-architecture than Core 2: bigger caches, enhanced out-of-order execution, better branch prediction… It comes as no surprise that a modern desktop CPU outperforms an "old" mobile CPU.

PS: The CPU's caches don't store the polygons that are rendered (they're sent to the GPU's memory).
 
SPBHM posted a tidbit in the upgrade thread about SC2 performance with higher cache.
By that very graph, the Core 2's IPC does not appear to be anywhere near as superior as you initially supposed. But when comparing the Core 2 CPUs, note that the FSB is also different, meaning that memory latency and bandwidth differ, not merely cache size.

Roughly your CPU: E7200 (2.53Ghz): 17.7 FPS
FX-8150 (4.2GHz*): 26.7 FPS
FX-8320 (3.5-4GHz): 28.3 FPS
FX-8350 (4-4.2GHz): 32.3 FPS

The E7200 gets 7.00 FPS/GHz
The FX-8150 gets 6.36 FPS/GHz
The FX-8320 gets 7.08-8.09 FPS/GHz
The FX-8350 gets 7.69-8.08 FPS/GHz
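
The per-clock figures are just FPS divided by clock speed. A minimal sketch of that calculation, using the FPS and clock values listed above (the clock range for each FX chip is the base-to-turbo span given there; the helper name is made up for illustration):

```python
# name: (FPS, lowest clock in GHz, highest clock in GHz), from the figures above.
results = {
    "E7200":   (17.7, 2.53, 2.53),
    "FX-8150": (26.7, 4.2, 4.2),
    "FX-8320": (28.3, 3.5, 4.0),
    "FX-8350": (32.3, 4.0, 4.2),
}

def fps_per_ghz(fps, clock_ghz):
    """Frames per second delivered per GHz of clock speed."""
    return fps / clock_ghz

for name, (fps, low_clock, high_clock) in results.items():
    worst = fps_per_ghz(fps, high_clock)  # assumes the chip ran at its top clock
    best = fps_per_ghz(fps, low_clock)    # assumes the chip ran at its lowest clock
    if low_clock == high_clock:
        print(f"{name}: {best:.2f} FPS/GHz")
    else:
        print(f"{name}: {worst:.2f}-{best:.2f} FPS/GHz")
```

Since the actual clock during the benchmark isn't known for the turbo-capable chips, the range is the honest way to state it: the low end assumes max turbo the whole time, the high end assumes base clock.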

Basically, the PD FX CPUs are as good as the Core 2 or better, per clock, and run faster.

Meanwhile, those are averages. CPUs with fewer cores and smaller caches tend to have worse maximum frame times, which makes the game feel slower than the average indicates; this only shows up empirically when the worst FPS or frame time is measured. For example, check out reviews that include such numbers for the FM2+ CPUs and the overclockable Pentium. If the big maps have working sets larger than the cache, the average could be reduced even further.

Also, as already mentioned, polygons don't matter in any CPU-bound scenario.

* Based on other release BD AMD CPUs in the chart, it's clear that it's running at 4.2GHz, not 3.6GHz. The PD ones are probably all at max Turbo, as well, but it's not so plainly obvious.
 
The game is eating through your caches and doing a lot of swapping with DDR3. Core 2 has relatively high memory latency. That FX chip has double the Cinebench single-thread score. Its FPU is way more powerful than the FPU inside a Core 2. Only on the integer side does Core 2 have an advantage, but it is relatively minimal, and even then it depends on the instructions used.
 