Not sure if I am comforted or disappointed that you see the same issues on the EA forum, where a bumpgate article devolves into an AMD vs. NV rant. :/
Anyway, I don't think it matters much for the consumer Pascal products as it is HBM related. Charlie fishing for hits...
Well, for now, nah, it doesn't matter...
but in 8 months we will get Volta.
Au contraire, if they don't make enough revenue & profit from their HPC cards, where do you think their next target would be? If the GP100 isn't a big hit, the Pascal consumer cards will be milked way harder than Maxwell & we'll see evidence of that soon enough D:
Volta is planned for 2018, not 2017, so a fair bit more than 8 months.
When did they change the roadmap again? Or did they add Volta after they added Pascal?
Not necessarily, the new buzzword being mindshare. There are enough products in the AMD stack that are way better VFM than anything Nvidia has to offer, & yet "970gate" & disabling OC on mobile Maxwell, without serious repercussions, show us the power of brand Nvidia. So, no, I don't think AMD has too much to do with the success Nvidia has had in the last two years, which is even more baffling after GCN's DX12 exploits!

If it was possible for Nvidia to milk consumer cards any more than they are doing now, they would already have done so; otherwise they wouldn't be running their business properly (with regard to profit maximization).
If consumer Pascal is more "milkable" than Maxwell, then it won't be because of GP100 failing; it will be because of AMD failing to provide a competitive product.
You answered that yourself 😀
Volta didn't exist on any roadmap till early 2015; Pascal was in the place of Volta (2018 didn't even exist back then), it was 2016/17.

I'm pretty sure Volta has always been 2018 on Nvidia's roadmaps (except for the ones where it didn't have any date at all).
Volta HPC is scheduled to ship in 2017 for Summit and Sierra supercomputers:
http://www.anandtech.com/show/8727/nvidia-ibm-supercomputers
GeForce gaming Volta (GV104) should follow very quickly, maybe on the same timeline as Pascal, with availability around Computex 2017.
That article is from 2014 - I wouldn't put much faith in those dates.

Still, you can already find GV100 (big Volta) references in the latest CUDA library...
It seems that people are still confusing the terms "async compute", "async shaders" and "compute queue". Marketing and the press don't seem to understand the terms properly and spread the confusion 🙂
Hardware:
AMD - Each compute unit (CUs) on GCN can run multiple shaders concurrently. Each CU can run both compute (CS) and graphics (PS/VS/GS/HS/DS) tasks concurrently. The 64 KB LDS (local data store) inside a CU is dynamically split between currently running shaders. Graphics shaders also use it for intermediate storage. AMD calls this feature "Async shaders".
Intel / Nvidia: These GPUs do not support running graphics + compute concurrently on a single compute unit. One possible reason is the LDS / cache configuration (GPU on chip memory is configured differently when running graphics - CUDA even allows direct control for it). There most likely are other reasons as well. According to Intel documentation it seems that they are running the whole GPU either in compute mode or graphics mode. Nvidia is not as clear about this. Maxwell likely can run compute and graphics simultaneously, but not both in the same "shader multiprocessor" (SM).
Async compute = running shaders in the compute queue. The compute queue is like another "CPU thread". It doesn't have any ties to the main queue. You can use fences to synchronize between queues, but this is a very heavy operation and likely causes stalls. You don't want to do more than a few fences (preferably one) per frame. Just like "CPU threads", the compute queue doesn't guarantee any concurrent execution. The driver can time-slice queues (just like the OS does for CPU threads when you have more threads than CPU cores). This can still be beneficial if you have big stalls (GPU waiting for the CPU, for instance). AMD's hardware works a bit like hyperthreading. It can feed multiple queues concurrently to all the compute units. If a compute unit stalls (even small stalls can be exploited), the CU will immediately switch to another shader (also graphics<->compute). This results in higher GPU utilization.
You don't need to use the compute queue in order to execute multiple shaders concurrently. DirectX 12 and Vulkan by default run all commands concurrently, even from a single queue (at the level of concurrency supported by the hardware). The developer needs to manually insert barriers in the queue to mark synchronization points for each resource (to prevent read<->write hazards). All modern GPUs are able to execute multiple shaders concurrently. However, on Intel and Nvidia the GPU runs either graphics or compute at a time (but can run multiple compute shaders or multiple graphics shaders concurrently). So in order to maximize performance, you'd want to submit large batches of either graphics or compute to the queue at once (not alternating between both rapidly). You get a GPU stall ("wait until idle") on each graphics<->compute switch (unless you are on AMD, of course).
If you assume that a single Pascal SM cannot run mixed graphics + compute then splitting the MPs should improve the granularity. Compute and graphics might also share some higher level (more global) resources as well. Nvidia has quite sophisticated load balancing in their geometry processing. Distributed geometry data needs to be stored somewhere (SM L1 at least is partially pinned for graphics work, see this presentation: http://on-demand.gputechconf.com/gtc/2016/video/S6138.html). Also, Nvidia doesn't have separate ROP caches (AMD still does). Some portion of their L2 needs to serve ROPs when rendering graphics. This might be transparent (just another client of the cache) or might be statically pinned based on the GPU state. I don't know 🙂
Very informative post by sebbbi @ Beyond3D.
https://forum.beyond3d.com/threads/...eculation-rumors-and-discussion.56719/page-72
AMD has had plenty of mistakes to affect their "mind share"
Hawaii was too hot and loud
Fiji was too slow and memory limited
Crimson killed cards
Terrible directx 11 driver overhead
Terrible Linux drivers
Etc.
It's a very good post.
Now the question is: in Maxwell, when running parallel graphics + compute tasks, does the entire GPU need to flip between graphics <-> compute modes?
It will be interesting to see whether Pascal can do this at the TPC or GPC level, which would undoubtedly be better than the entire GPU having to change modes. I doubt they can do it at the SM level (similar to GCN, where it works at the CU level).
When did this occur??
Recently. Probably got swept under the rug here.
http://m.hardocp.com/news/2015/11/29/users_report_amd_crimson_driver_heating_killing_gpus/
Anyway, on topic: the 1080 is going to be significantly faster due to GDDR5X - my guess is 20-25%. This means Nvidia can get more users to pick the 1080 over the 1070, because overclocking the 1070 might not be enough to catch the 1080.