
Possible Zen 2 Thread Ripper sightings

Page 2 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.
This would mean 8 dual-core CCXs. This is interesting for running many concurrent processes which do not communicate much, or at all. Multithreaded applications with appreciable inter-thread synchronization needs may fare better on processors with fully enabled (that is, 4-core) CCXs. That's my guess at least.
That will not be the case, because the IO die handles the inter-CCX communication. At least we don't see any penalties with the 3900X, which has two CCDs with 6 cores each (4 CCXs with 3 cores each).
 
@nicalandia, my point is not the number of CCDs per processor*, but the number of cores per CCX**. The latter will without doubt determine how many threads per process a data-intensive computational program (with data dependency between the threads) should have for optimum throughput.

Published measurements have shown inter-CCX communication and inter-CCD communication perform virtually identically, and perform the same as or only somewhat better than memory accesses (*). This is in contrast to inter-core communication within the same CCX. That's the whole reason to organize cores in CCXs in the first place. (**)
 
But with Zen 2 it doesn't matter where the core is; each core will have to go through the IO die to communicate.
 
Each CCX has to go through the IO die.

I wonder what the speed difference will be between a TR 16 core 4 channel and 3950X (2 channel) in games.
I don't see an obvious disadvantage for TR but who knows...
It will have a higher idle power I assume.
 
So far the 3900X has not been crippled relative to the 3800X, and it still has 12 cores in a 3+3+3+3 configuration, so a 2+2+2+2 layout will have little if any penalty compared to 4+4, because the IO die handles the communication. Yes, it's not optimal, but those 8-core Epycs are meant for systems that are PCI-heavy rather than throughput-oriented, so I doubt that we will see them on Threadripper. I believe the last 8-core Threadripper was the 1900X.
 
Sure, if you have little synchronization, there is little penalty for spreading an application over multiple CCXs. Yet there are applications with a good amount of sharing of hot data between threads, such as fast Fourier transforms. It will be interesting to measure those on 4-, 3-, and 2-core CCX SKUs.
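
To make the threads-per-process point concrete, here is a minimal Python sketch of how worker threads could be grouped so that each process stays inside one CCX. The helper name `ccx_groups` is hypothetical, and the sketch assumes that core IDs are numbered contiguously within each CCX, which a real system does not guarantee.

```python
def ccx_groups(total_cores: int, cores_per_ccx: int) -> list[list[int]]:
    """Partition core IDs 0..total_cores-1 into CCX-sized groups.

    Assumption: core IDs are contiguous within each CCX. Check the
    actual topology of the machine before relying on this.
    """
    if total_cores <= 0 or cores_per_ccx <= 0:
        raise ValueError("core counts must be positive")
    if total_cores % cores_per_ccx != 0:
        raise ValueError("total_cores must be a multiple of cores_per_ccx")
    return [
        list(range(start, start + cores_per_ccx))
        for start in range(0, total_cores, cores_per_ccx)
    ]


# A 12-core part with 3-core CCXs (the 3+3+3+3 layout discussed above):
groups_3900x = ccx_groups(12, 3)

# A hypothetical 16-core part built from 2-core CCXs would give 8 groups,
# so a tightly synchronized process would get only 2 threads each.
groups_2core = ccx_groups(16, 2)
```

On Linux, each group could then be passed to `os.sched_setaffinity` so that one worker process only runs on the cores of one CCX. Whether that helps would have to be measured per application.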
 
There was a review that turned "Game" mode on for the 3900X (the mode was originally created for Threadripper; it basically turns off all but one 8-core compute die), which basically turned the 3900X into a 3600X (3+3, single CCD). In most of the games tested it actually got worse performance in "Game" mode. I would guess that the loss of performance was due to lower total cache and not due to 3+3 vs 3+3+3+3 communication penalties.

https://www.legitreviews.com/game-mode-might-boost-performance-on-amd-ryzen-3900x-processors_213087

"When we ran other game titles with ‘Game Mode’ enabled on the 3900X we found that performance dropped. We only ran Rainbow Six: Siege, Far Cry 5 and Metro Exodus in our CPU test suite and of those titles only Metro Exodus showed a performance improvement with Game Mode enabled. It is likely that 99% of game titles run better in ‘Creator Mode’ and we hope that is the case. Now that this news is out the sites that have dozens of games that can be tested with automation will be able to crank out results in a timely manner from a larger data set."
 
Are there any downloadable benchmarks that use fast Fourier transforms?
Prime95, for example. It uses the gwnum library for the heavy lifting and spends most of the processor time in FFT. The FFT size can be configured and, multiplied by 8 to get the size in bytes, determines the size of the hot data that should be kept in a processor cache shared by all cores used by one worker process.
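
As a rough illustration of the "times 8" rule, this Python sketch estimates the hot data size for a given FFT length and compares it against the L3 cache of one CCX. The function names are made up for this example. It assumes that an FFT size given in "K" means multiples of 1024 elements, and that one CCX has 16 MiB of L3; both values should be adjusted to the actual setup.

```python
BYTES_PER_ELEMENT = 8  # from the rule of thumb above: FFT size times 8

# Assumption: 16 MiB of L3 cache per CCX. Adjust for the actual processor.
L3_PER_CCX_BYTES = 16 * 1024 * 1024


def fft_working_set_bytes(fft_length: int) -> int:
    """Estimated size in bytes of the hot data for one FFT of this length."""
    return fft_length * BYTES_PER_ELEMENT


def fits_in_cache(fft_length: int, cache_bytes: int = L3_PER_CCX_BYTES) -> bool:
    """True if the estimated hot data is no larger than the given cache.

    This ignores everything else that competes for the cache, so a
    borderline result should be read as "probably does not fit".
    """
    return fft_working_set_bytes(fft_length) <= cache_bytes


for size_k in (1024, 2048, 4096):
    length = size_k * 1024  # assumption: "K" means 1024 elements
    mib = fft_working_set_bytes(length) / (1024 * 1024)
    print(f"{size_k}K FFT: about {mib:.0f} MiB, fits: {fits_in_cache(length)}")
```

This is only an estimate of the working set, not a statement about how Prime95 or gwnum lay out their data internally.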

(I've been using the FFTW library in structural dynamics myself. I don't recall a ready-made benchmark for FFTW; maybe there is one, but those who use FFTW should have little trouble benchmarking their own use case. I have also been using a CFD program which relies heavily on FFT; it's been a while, so I don't recall whether it used a home-grown implementation or a library that would enable standalone benchmarks.)
 
It certainly will be interesting to see if HEDT makes a comeback in gaming with Zen 2 TR. That would be cool after the disappointing gaming performance of X299 and X399. My last HEDT board was the Asus X99 Deluxe with a 5930K. Recently I have moved on to AM4.
 