So I came across a bit where it's said that the Pentium 4 has double-pumped ALUs, or execution pipes, meaning that those pipes run at twice the processor's clock frequency.
1. I'm wondering if anyone here knows whether there are any modern processors where this is also done.
2. Is this done on modern GPUs?

1. Not that I know of.
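To illustrate what "double pumped" buys you, here's a minimal sketch of the throughput arithmetic (the clock figure is an assumed round number, not any specific P4 SKU):

```python
# Sketch of what "double-pumped" means for throughput; hypothetical
# numbers, not a model of the real P4 pipeline.
core_clock_ghz = 3.0                 # processor clock (assumed figure)
alu_clock_ghz = 2 * core_clock_ghz   # double-pumped ALU ticks at 2x

# Each fast-ALU tick can start one simple integer op, so per core cycle
# the double-pumped ALU can begin two simple ops instead of one.
ops_per_core_cycle = alu_clock_ghz / core_clock_ghz
print(ops_per_core_cycle)  # 2.0
```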
More of a GPU question, but my guess is that it was too difficult, would've broken GCN, or would just have taken too long. They call Navi RDNA, but IMHO it looks like RDNA2 next year will be the real "break", if you will, from GCN.

I was thinking of doing this, thanks.
Why did AMD insist on hardware scheduling on its Vega GPUs instead of transferring it to the CPU (software) like Nvidia? Is it theoretically more efficient, or was it a shortcoming of GCN they needed to address to achieve its potential?
Here's my question:
Why did K10 through BD have such highly set-associative L3 caches? I think K10 was 32-way, K10.5 was 48-way, and BD was 64-way. We seem to have settled on 16-way these days. I just double-checked, and those values for K10 and K10.5 seem correct, but it was a bit difficult to find numbers for the BD family. I remember seeing 64, but now I am also seeing 16. I'll edit this if I find something conclusive.
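For a sense of the trade-off: for a fixed cache size, higher associativity means fewer, larger sets. A quick sketch (the 6 MiB size and 64-byte lines are assumed round numbers, not the actual K10/BD specs):

```python
# Sketch: how associativity changes the set count for a fixed cache size.
# Cache size and line size below are assumed round numbers, not verified specs.
def num_sets(cache_bytes, ways, line_bytes=64):
    """Sets = (total lines) / ways; higher associativity -> fewer sets."""
    return cache_bytes // (ways * line_bytes)

# Hypothetical 6 MiB L3 at the associativities discussed above:
for ways in (16, 32, 48, 64):
    print(ways, num_sets(6 * 1024 * 1024, ways))
```

More ways per set means more tags to compare on every lookup, which is one reason high associativity tends to cost latency and power.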
"Excuse my ignorance, but better inter-module communication plus shared execution resources? The L2 probably wasn't as efficient as it needed to be. Shooting in the dark here."

That might make sense, at least for BD. Still not sure why they used so many ways on Phenom I & II, though. I have to think that cost them latency-wise.
"Why did K10 - BD have such highly set-associative L3 caches?"

Not sure, but they had server ambitions, and BD and PD both already had large L2s. So I think they figured the L3 may as well be slow, and if anything improve speed on the L2 first.
"Why did AMD insist on hardware scheduling on its Vega GPUs instead of transferring it to the CPU (software) like Nvidia?"

Hardware scheduling might ease CPU-to-GPU communication, so maybe it was out of bandwidth concerns for APUs (whose size and performance are always limited by RAM bandwidth).
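On the bandwidth point, a rough sketch of the ceiling an APU's CPU and GPU have to share (DDR4-3200 in dual channel is an assumed configuration, not any particular platform):

```python
# Sketch: peak theoretical memory bandwidth of an assumed DDR4-3200
# dual-channel setup; real platforms and sustained figures vary.
transfers_per_sec = 3200e6   # DDR4-3200: 3200 MT/s
bytes_per_transfer = 8       # one 64-bit channel moves 8 bytes per transfer
channels = 2                 # dual channel (assumed)
bandwidth_gb_s = transfers_per_sec * bytes_per_transfer * channels / 1e9
print(bandwidth_gb_s)  # 51.2
```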