> as the 2xxx series and the x399 motherboards will still be produced.
> So what is the problem here? Please somebody straighten me out.
> They are? I thought those were going to be phased out, just like the first-gen Threadripper products were phased out.

I read it somewhere in one of these threads... Too late tonight to go back and re-read everything, but I am pretty sure that's true. At least for the next year or so, by which time TR40 or whatever will be old news.
The problem is that as core count goes up, Intel's clock-speed advantage goes away, and Cascade Lake-X is just Skylake-X++, which Zen 2 is already better than. The end result, I think, is that they will be nearly a wash on price/perf.
> So you can have almost any configuration of Threadripper to meet your needs and budget, as the 2xxx series and the x399 motherboards will still be produced.
> So what is the problem here? Please somebody straighten me out.

They are much less power efficient and slower (which means louder), have much lower FP throughput, and on top of that are not really good for VR/games.
> And also this.
> If you use Python and hence NumPy heavily, the Intel CPU will have a tremendous advantage due to AVX-512.
> BUT on the other hand, if you have a big workload (something you actually have to wait on), you may be much faster (>10x) using CUDA on the GPU (via Numba), making this AVX-512 point not really important anymore.

Yes, I saw it a few weeks ago: AVX2 vs. AVX-512 is about twice the speed for certain NumPy tasks. I assume in the coming years more software will be optimized for AVX-512.
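Whether any of this matters on a given box depends on how NumPy was built, not just on the CPU. A quick way to check which BLAS backend (MKL vs. OpenBLAS) your install is linked against, using only NumPy's standard `show_config` helper:

```python
import numpy as np

# Print this NumPy install's build/link configuration.
# An MKL-linked build lists "mkl" libraries; an OpenBLAS build lists
# "openblas". Together with the CPU, that determines whether the fast
# AVX2/AVX-512 kernels are actually in play.
np.show_config()
```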
> I disagree. In my link you can see a Ryzen 3900X on MKL is almost 10x slower than a Xeon, due to Intel MKL defaulting to SSE on AMD. This is for basic tasks which don't need a GPU at all. Even for small tasks it matters whether it takes 2 minutes or 15 seconds.
> In fact, since I have the 3900X on Windows (my previous link used Ubuntu), I tried to recreate the results, and they pretty much match what that guy got on Ubuntu. However, getting NumPy installed with OpenBLAS instead of MKL was a real pain on Windows with Anaconda; it took me an hour to get it working, and that was just NumPy.
> If I try to install scikit-learn on top, it fails when using the default channel, and it "downgrades" to MKL when using conda-forge, meaning you're back to the slow speed. Put otherwise: for Python/NumPy on Windows, better to buy an Intel CPU...

But that is some bug in Anaconda that needs to be fixed, or may already have been fixed.
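For what it's worth, the commonly reported workaround for MKL taking the SSE path on AMD is the undocumented `MKL_DEBUG_CPU_TYPE` environment variable. This is an assumption about your MKL version (it reportedly stopped working with MKL 2020 and later), so treat the sketch below as exactly that, a sketch:

```python
import os

# Undocumented MKL switch: present the CPU as an AVX2-capable part so MKL
# skips its vendor check. Must be set BEFORE NumPy (and thus MKL) loads.
# Reportedly ignored by MKL 2020+, and harmless on OpenBLAS builds.
os.environ["MKL_DEBUG_CPU_TYPE"] = "5"

import time
import numpy as np

# A large matmul is BLAS-bound, so its speed reveals which code path
# (SSE vs. AVX2) the library actually took on this machine.
n = 2000
rng = np.random.default_rng(0)
a = rng.random((n, n))
b = rng.random((n, n))

t0 = time.perf_counter()
c = a @ b
dt = time.perf_counter() - t0
print(f"{n}x{n} matmul took {dt:.2f}s ({2 * n**3 / dt / 1e9:.1f} GFLOP/s)")
```

Run it once with and once without the environment variable set; on an affected Ryzen + MKL combination the difference should be dramatic.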
> But that is some bug in Anaconda that needs to be fixed, or may already have been fixed.
> The Ryzen 3000 parts will have the eco-mode feature (105W->65W, 65W->45W), not present in the new TR3; it could be a great setting to make the machine quieter when doing heavy work. (It's not like I have a separate room to put the PC in.)
> A 280W TR3 could really use something similar, like 280W->180W->105W.

At release, all Epyc 7002 SKUs came with two (edit: or three) cTDPs supported by the firmware (though with a far smaller step between the cTDPs than Ryzen's eco mode). Perhaps Threadripper 3000 firmware will support cTDP too.
> I don't think there are too many 2- or 4-core dies out there. The supply of those chips would mean they are either fighting Epyc for the crippled dies, or they are crippling dies that could go into higher-margin products.

At this time there is no evidence for or against the assumption that a TR3 processor must be manufactured with exactly four CCDs.
> the 2xxx series and the x399 motherboards will still be produced.

> They are? I thought those were going to be phased out, just like the first-gen Threadripper products were phased out.

My understanding of the reporting so far was that they will remain available for some time. (This is not necessarily the same as still being manufactured.)
> And also this.
> If you use Python and hence NumPy heavily, the Intel CPU will have a tremendous advantage due to AVX-512.

> Yes, I saw it a few weeks ago: AVX2 vs. AVX-512 is about twice the speed for certain NumPy tasks. I assume in the coming years more software will be optimized for AVX-512.

People need to stop perpetuating these misunderstandings about AVX-512.
> People need to stop perpetuating these misunderstandings about AVX-512.

> It's not just AVX-512: Intel MKL, which is used by default if you use Anaconda and NumPy, runs SSE code or an even slower path on Ryzen.

Intel MKL is not alone in this. Further, Zen 2 tends to require different optimizations than Zen/Zen+ (with Zen 2's optimum or near-optimum code path being more in line with that of recent Intel microarchitectures, due to the reorganized floating-point and vector execution units). This is completely orthogonal to what I said people need to do about their understanding of AVX-512.
> At this time there is no evidence for or against the assumption that a TR3 processor must be manufactured with exactly four CCDs.
> Everybody keeps ignoring the below:
> You want 16 cores and don't need the extra IO/PCIe 4.0 lanes? Get a 3950X: $750.
> Need more PCIe lanes, quad-channel memory, or something else the Threadripper platform offers? Get a 1950X: $400 (used).
> Or more horsepower than that? Get a 2950X: $600 (used).
> Or even more? Get a 2970WX: $800 (used; just passed on one that went for $670 today).
> Or even more? Get a 2990WX: $1200 (used; just passed one up for $1150 today).
> Or, if you need the ultimate in power, PCIe lanes (and 4.0), and the fastest cores, go
> 24 cores for $1400,
> or 32 cores for $2000.
> Used prices are based on pretty close to the lowest "buy it now" prices on eBay; I have bought many of mine there.
> So you can have almost any configuration of Threadripper to meet your needs and budget, as the 2xxx series and the x399 motherboards will still be produced.
> So what is the problem here? Please somebody straighten me out.

Replying to my own post, for an update: I saw a 7551 Epyc ES for $300 on eBay and could not resist. I got two of them, a motherboard, and 128 GB of 2666 ECC RAM for $2500. Just an example of what you can get for your money if you want a lot of performance. Sixteen channels of RAM and 128 threads for $2500! Note, they only run at 2.5 GHz, but with that many cores/threads and that much RAM, who cares! (Not a gamer.)
> There is good reason to believe that they are limited to either 4 or 8 CCDs, for several reasons. Until we see anything different, any configuration of cores will be based on that.

Rome does have an 8-core version.
> While I agree that AMD is leaving a certain gap between the Ryzens and the TR 3000s (at least for the time being), that's how things are, and you have plenty of alternatives buying new or second-hand, from AMD or from Intel.

Of course, all of the alternatives come with various compromises. Whether or not these compromises are tolerable is a case-by-case consideration.
> Anything Rome can do, TR3 can do also, if they want to do it.

And possibly more. At this point we don't have solid info on how far AMD customized the I/O die and the PCB for TR3. Those who make guesses about possible and economically valid TR3 configurations should admit to themselves that these are just guesses, and wild guesses at that, unless they have direct info from inside AMD.
> How do you do that with 4 dies? 1 core per CCX? Then the cache doesn't add up to 32MB, since you have 16MB of L3 per CCX.
> 32MB means 2 CCXs; you can do that with 1 or 2 CCDs.
> > 3+0, 3+0, 3+0, 3+0
> Their 12-core does have 64MB of total L3 cache, meaning 4 CCXs. 3 active cores on one die? I don't think so; I think 2 CCDs here.
> So, no, I don't think they need a minimum of 4 dies connected to the I/O die, and I don't see any technical reason why this minimum of 4 would exist (ignoring whether it's a good idea or not).
> I think the main reason now is that 32 and 24 cores, hence always 4 dies, means a single assembly/production line for TR3.
> Also, heat and mechanical stress would need to be tested again with only 2 dies, and the production line would need changes, or a second one would need to be set up.
> Anything Rome can do, TR3 can do also, if they want to do it.

2+0, 2+0, 2+0, 2+0
> 2+0, 2+0, 2+0, 2+0

But the L3 cache doesn't add up: 4 active CCXs is 64MB of L3, and the spec says 32MB (meaning only 2 CCXs).
> But the L3 cache doesn't add up: 4 active CCXs is 64MB of L3, and the spec says 32MB (meaning only 2 CCXs).

They have done half cache before. The 1400 had only 8MB of L3 even though it was 2x2 CCX.
> L3 is cut in half?

8-core Rome has 32MB of L3 cache (half of the 12-core Rome).
> 8-core Rome has 32MB of L3 cache (half of the 12-core Rome).

> That's what I mean: half of the L3 is fused off. So the 7232P is presumably four dies with one CCX enabled per die, 2 cores each, and half of the L3 enabled. The 7252 has the full L3.

I don't think they will sell an 8MB/CCX die in Epyc 7002 or TR3; maybe in a very low-end Ryzen.
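The whole L3 back-and-forth above is just chiplet arithmetic. A minimal sketch, assuming the Zen 2 figures discussed in the thread (2 CCXs per CCD, 16MB of L3 per CCX, optionally half-fused to 8MB as on parts like the 7232P); `total_l3_mb` is a made-up helper name, not anything from AMD:

```python
# Zen 2 chiplet arithmetic: each CCD carries 2 CCXs, each CCX up to
# 4 cores and 16 MB of L3 (which some SKUs half-fuse down to 8 MB).
FULL_L3_PER_CCX_MB = 16

def total_l3_mb(active_ccxs: int, half_cache: bool = False) -> int:
    """Total L3 for a part with the given number of active CCXs."""
    per_ccx = FULL_L3_PER_CCX_MB // 2 if half_cache else FULL_L3_PER_CCX_MB
    return active_ccxs * per_ccx

# 12-core with 3 cores per CCX -> 4 active CCXs -> 64 MB, as quoted above.
print(total_l3_mb(4))                    # 64
# A 32 MB spec with full cache implies only 2 active CCXs (1 or 2 CCDs)...
print(total_l3_mb(2))                    # 32
# ...but 4 CCXs with half the L3 fused off also lands on 32 MB, which is
# the 7232P reading: four dies, one 2-core CCX each, half cache.
print(total_l3_mb(4, half_cache=True))   # 32
```

The point of the sketch: a 32MB spec alone cannot distinguish "2 full-cache CCXs" from "4 half-cache CCXs", which is exactly why the thread cannot settle the die count from the spec sheet.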