Yes. Cascade Lake and Cooper Lake are identical except for bfloat16. That's a reasonable target. Ice Lake supports a lot more, but not bfloat16. That's a more ambitious target.

Did you look at the table I linked in your quote? Each of them supports a different selection of subsets; none of them supports them all (not even Tiger Lake).
Not AVX512, though.

There's plenty of it in x264, x265, SVT-AV1 and now some in dav1d (therefore by extension rav1e too).
I think some emulators have it too, and I wouldn't be surprised to find it in Intel's Embree which is used in quite a few ray tracing code projects (oddly including one of AMD's).
Nice, that's happening sooner than I'd hoped.

I was talking about AVX512. Look up those things and you'll find the code commits.
See [Wikipedia](https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512). The target will be Ice Lake (F, CD, VL, DQ, BW, IFMA, VBMI, VBMI2, VPOPCNTDQ, BITALG, VNNI, VPCLMULQDQ, GFNI, VAES). Where it makes sense, it would be nice... (code.videolan.org)

By Pradeep Ramachandran. Finally, the acceleration that we’ve all been waiting for is here! We’ve been working extensively with Intel for the last few months to use Intel Advanced Vector Extensions 512 (AVX-512) to accelerate x265. After much effort, we’re delighted to share that we’ve been able... (x265.org)

Vector units in CPUs have become the de facto standard for acceleration of media, and other kernels that exhibit parallelism according to the single instruction, multiple data (SIMD) paradigm. SIMD on Intel® architecture processors have evolved to enable 512-bit register files in Intel® Advanced... (software.intel.com)
Probably more open source uses it than closed.
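For anyone who wants to see how their own CPU compares with the Ice Lake subset list quoted above, here is a minimal Python sketch. It assumes a Linux x86 machine where `/proc/cpuinfo` has a `flags` line; the mapping from subset names to flag strings is my own assumption about how the kernel spells them, and the helper names are made up for illustration.

```python
# Ice Lake target subsets as listed in the quoted dav1d issue, mapped to the
# flag names Linux is assumed to print in /proc/cpuinfo (assumption, not from the thread).
ICE_LAKE_TARGET = {
    "F": "avx512f",
    "CD": "avx512cd",
    "VL": "avx512vl",
    "DQ": "avx512dq",
    "BW": "avx512bw",
    "IFMA": "avx512ifma",
    "VBMI": "avx512vbmi",
    "VBMI2": "avx512_vbmi2",
    "VPOPCNTDQ": "avx512_vpopcntdq",
    "BITALG": "avx512_bitalg",
    "VNNI": "avx512_vnni",
    "VPCLMULQDQ": "vpclmulqdq",
    "GFNI": "gfni",
    "VAES": "vaes",
}


def missing_subsets(flags_line):
    """Return the target subsets whose flag is absent from a space-separated flags string."""
    present = set(flags_line.split())
    return [name for name, flag in ICE_LAKE_TARGET.items() if flag not in present]


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the first 'flags' line of /proc/cpuinfo, or '' if there is none."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""


if __name__ == "__main__":
    import os

    if os.path.exists("/proc/cpuinfo"):
        print("Missing from the Ice Lake target:", missing_subsets(read_cpu_flags()))
```

A CPU that only reports F, CD, VL, DQ and BW would come back with the other nine subsets missing, which is the "each of them supports a different selection" point in practice.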
Just a WAG... Do you think another 20% over Zen 3 is possible?

I will give an opinion based on everything I have read in the media. Zen 3 will be at least a 20% improvement over Zen 2. Zen 4 will have DDR5 and be built on 5nm. It's too far out to predict. DDR5 could be a big improvement over DDR4.
AMD already said they were very happy with what Zen3 has done, meaning customers will be happy with the results. In an earlier post (here) it was said that Zen3 would be a 20-25% improvement over Zen 2. I think that is probably very accurate. Consider that Zen3 is a new processor, whereas Zen 2 was largely an evolution of Zen through the 12nm and 7nm die shrinks. Whether 7nm+ is much better is another matter. They said AMD was hoping to get more out of the new 7nm+, similar to the shrink from 12nm to 7nm, but the boost in silicon was not as significant as they had hoped, and the architecture improvements offset that.
I look over to the side a bit and I see AMD claiming a 50% gen-on-gen gain in power efficiency for their GPU stack (with the generations within 12-18 months of one another), even without node shrinks.
I think we're looking at an AMD that's dedicated to pushing performance as much as possible. With Zen 4 comes the switch to N5. That alone should allow for more than just a 20% boost in overall performance, and as things stand, I have absolutely no reason to believe they can't do it.
While I agree, the question was about performance, not IPC.

We talked about this before. AMD stated they had a 40% IPC design goal for Zen1 over the previous core (the last iteration of Bulldozer). They achieved ~52%.
I expect the same goes for Zen3 vs Zen1 and Zen5 vs Zen3. If they were to achieve the goal for Zen3 (40% higher IPC vs Summit Ridge Zen1), then Zen3 just needs a 17.5% average IPC improvement to accomplish that.
Zen5 is a bit trickier since we know nothing about the improvements in Zen4, but if they were to be more in line with Zen1->Zen2, then Zen5 would need north of 20% to achieve that goal. It's very much possible IMO. We are talking here about strict IPC gains, without counting the process node (power/clock). Clock and IPC together would give the final performance, of course.
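The arithmetic behind those numbers is just compounding: generational gains multiply, so the gain still needed is the target ratio divided by what has already been banked. A rough Python sketch, where the function name is made up and the 15% input in the second call is purely an illustrative assumption, not a figure from this thread:

```python
def split_gain(total_gain, known_gain):
    """Given a compounded total gain and one of its two factors (as fractions,
    e.g. 0.40 for 40%), return the other factor as a fraction."""
    return (1.0 + total_gain) / (1.0 + known_gain) - 1.0


if __name__ == "__main__":
    # A 40% total with 17.5% still to come implies this much already achieved.
    print(f"implied gain so far: {split_gain(0.40, 0.175):.1%}")
    # Hypothetical: if the intermediate generation only brought 15%,
    # the next one would need this much to reach a 40% total.
    print(f"needed after a 15% step: {split_gain(0.40, 0.15):.1%}")
```

With the thread's figures, a 40% target and a 17.5% remaining step imply roughly 19% already achieved. A smaller intermediate step pushes the remaining requirement past 20%, which is the "north of 20%" case.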
Forgot to get back to this. I meant the 'marketing' 1nm. Quantum effects were already an issue with Intel's 22nm FinFETs, forget about dealing with actual feature sizes in the 1nm range; I just don’t see that happening in the next 20 years.

What's crazy is as recently as 2016 it was thought that transistors smaller than 7nm would be highly susceptible to quantum tunneling. That didn't turn out to be an issue as processes move from N7 to N7P/N7+ to 6/5 for TSMC, with even Samsung targeting 3nm GAAFET production in 2021.
The main future issue is that 1 silicon atom is 0.2nm wide. It seems the industry has said they are unsure if nodes beyond 3nm would be viable, though TSMC is researching 2nm, and Intel thinks they can do 1.4nm by 2029.
Circling back to Zen4... if on 5nm, and if it doesn't drop til 2022, it may be very interesting market-wise. If Intel can get 7nm out by then (I know, but bear with me), they *might* regain the process lead. I think this path is the only way I could see them doing so in the next 5 years. 5nm TSMC is projected to have 171.3 MTr/mm2 and 7nm Intel is projected to have 237.18 MTr/mm2.
In any case, it's remarkable that we are going to see roughly a doubling of transistor density from 7nm to 5nm so quickly, and if TSMC keeps up the cadence, it'll still be happening at a speed nearly in accordance with Moore's observation, even though we are starting to approach a literal atomic limit.
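To put the numbers above side by side, here is a small Python sketch. It only uses figures quoted in this thread (the projected densities and the ~0.2nm atom width) and treats node names as literal lengths purely for illustration, which, as noted earlier, they are not. The function names are made up.

```python
# Figures quoted in the thread (projections, not measurements).
TSMC_5NM_MTR_MM2 = 171.3
INTEL_7NM_MTR_MM2 = 237.18
SI_ATOM_WIDTH_NM = 0.2  # approximate width of one silicon atom, per the thread


def density_ratio(a_mtr_mm2, b_mtr_mm2):
    """How many times denser process A is than process B."""
    return a_mtr_mm2 / b_mtr_mm2


def atoms_across(length_nm, atom_nm=SI_ATOM_WIDTH_NM):
    """Number of silicon atoms spanning a literal length given in nm."""
    return length_nm / atom_nm


if __name__ == "__main__":
    ratio = density_ratio(INTEL_7NM_MTR_MM2, TSMC_5NM_MTR_MM2)
    print(f"Intel 7nm vs TSMC 5nm (projected): {ratio:.2f}x")
    # Density a 7nm-class node would need for 5nm to be an exact doubling.
    print(f"Half of TSMC 5nm: {TSMC_5NM_MTR_MM2 / 2:.2f} MTr/mm^2")
    for nm in (3, 2, 1.4, 1):
        print(f"{nm} nm taken literally is about {atoms_across(nm):.0f} atoms")
```

On the projected figures, Intel's 7nm would be roughly 1.4x as dense as TSMC's 5nm, and a literal 1.4nm feature would be only about seven silicon atoms across.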
OT:
That's because AMD GPUs have such terrible perf/watt compared to NV. Clearly, they have a big delta to cover, just as with the construction cores compared to Zen. Also, all we know is that RDNA2 is being manufactured on '7nm', not which precise node.
I don't know what you are talking about. Don't call me boyo, K?

1. Low hanging fruit is nice and all, but the same was claimed for RDNA3.
2. RDNA1 is on N7P. It doesn't matter which flavour of N7 RDNA2 is on, because best case is N7+ and N7+ is only 3% better in efficiency than N7P.
It's almost all uArch boyo.
This is really cool, honestly. I'm sure there's a math equation that tells us how beneficial this could be, but by adding 3D packaging, you can nearly double the number of points a given distance away. That could be really huge.

It seems to me that the X3D packaging that AMD is talking about could be TSMC's CoWoS with SoIC.
2.5D HBM and 3D SoC?
Also AMD registered some new patent applications for chiplet IVR.
A data processor is implemented as an integrated circuit. The data processor includes a processor die. The processor die is connected to an integrated voltage regulator die using die-to-die bonding. The integrated voltage regulator die provides a regulated voltage to the processor die, and the processor die operates in response to the regulated voltage.
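On the earlier remark that 3D packaging nearly doubles the number of points within a given distance: a quick way to sanity-check that is to count grid sites within some radius on one die versus two stacked dies. This is only a toy model under loud assumptions: sites sit on a unit square grid, distance is straight-line (Euclidean), the second layer sits exactly one grid pitch above the first, and the function name is made up.

```python
def sites_within(radius, layers=1, layer_gap=1.0):
    """Count grid sites within Euclidean `radius` of the origin site.

    Sites sit on a unit square grid; extra layers are stacked directly
    above the first one, `layer_gap` grid units apart (toy assumption).
    """
    r = int(radius)
    limit = radius * radius
    count = 0
    for z in range(layers):
        dz2 = (z * layer_gap) ** 2
        for x in range(-r, r + 1):
            for y in range(-r, r + 1):
                if x * x + y * y + dz2 <= limit:
                    count += 1
    return count


if __name__ == "__main__":
    one = sites_within(10)
    two = sites_within(10, layers=2)
    print(f"1 layer: {one} sites, 2 layers: {two} sites, ratio {two / one:.2f}")
```

In this model the second layer adds almost as many reachable sites as the first, so the ratio lands just under 2, and it gets closer to 2 as the radius grows relative to the layer gap.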