"GPU Multitasking" in the modern era, vendor relative ability

VirtualLarry

No Lifer
Aug 25, 2001
Just a real-world test here.

I've got a couple of R5 1600 boxes here, one with a GTX 1070 Ti 8GB card, and one with TWO AMD cards (RX 470 + RX 570). Windows 10, with, I believe, the newest video drivers on both.

I'm normally on the box with the RX 570, and I sometimes Skype and watch YouTube at the same time while mining in the background. On that box, I simply watch my mining H/s output decrease when I put the additional load on the video card.

On the box with the GTX 1070 Ti, which is arguably the more powerful card, I see stutters and FF/RW-style jumps in the video "chunks" while watching YouTube. Very disconcerting, and hard to watch. Skype was OK, just at a slightly lower frame rate than when not mining.

I put this all down to multitasking, and the very real advantage that AMD has with their GPU architecture: hardware-based scheduling, with multiple ACEs and whatnot. It just runs a lot smoother overall when you've got multiple programs demanding work from the GPU while, at the same time, the host CPU is loaded way down (mining XMR); in that situation a software-based scheduling solution is going to add a lot of latency, which is what I was seeing with the YouTube videos.

Has anyone else noticed this, with real gritty A/B-type real-world testing?
 

AdamK47

Lifer
Oct 9, 1999
One solution I know of to get around the stuttering issue is to go into task manager and kill all mining processes. This will free up resources immensely and allow for smooth PC gaming with your graphics card.
 

Muhammed

Senior member
Jul 8, 2009
having hardware-based scheduling, with multiple ACEs and whatnot
Sigh, how many times has this fallacy been repeated? NV GPUs don't lack a hardware scheduler at all. THEY ARE NOT USING A SOFTWARE SCHEDULER!
Ever thought that the reason you're seeing this disparity is that you're using TWO AMD GPUs instead of one?
 

VirtualLarry

No Lifer
Aug 25, 2001
I just did this again, with the SAME video on YouTube, and calling the SAME person on Skype, on the AMD rig, and ... no stutters whatsoever with the YouTube vid.

You really think that having a secondary AMD GPU in the rig actually accelerates Skype or YouTube MORE than just having a single AMD GPU? I've not heard of any support for multi-GPU acceleration for those things mentioned in driver release notes, have you?

Could a scheduling issue be behind why the GT1030 has such horrible frametimes in this game, as compared to AMD's 2200G and 2400G APUs (released today, so drivers aren't even well optimized yet)?

https://techreport.com/review/33235/amd-ryzen-3-2200g-and-ryzen-5-2400g-processors-reviewed/7

I cannot speak to "boots on the ground" experience with that game on the GT1030; I do own a GT1030 (in my FX-8320E rig as of last week), but I don't own that game yet.

I don't have a 2200G APU yet, either; I'll probably have one by next week, and could maybe do some testing?
 

tamz_msc

Diamond Member
Jan 5, 2017
I've noticed that on my GT 730, doing some memory-intensive Einstein@home jobs on the GPU causes visible stutter - the BOINC screensaver and desktop rendering are pretty choppy during that time.
 

Despoiler

Golden Member
Nov 10, 2007
Sigh, how many times has this fallacy been repeated? NV GPUs don't lack a hardware scheduler at all. THEY ARE NOT USING A SOFTWARE SCHEDULER!
Ever thought that the reason you're seeing this disparity is that you're using TWO AMD GPUs instead of one?

Nvidia moved away from full hardware scheduling in Kepler. https://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/3
However based on their own internal research and simulations, in their search for efficiency NVIDIA found that hardware scheduling was consuming a fair bit of power and area for few benefits. In particular, since Kepler’s math pipeline has a fixed latency, hardware scheduling of the instruction inside of a warp was redundant since the compiler already knew the latency of each math instruction it issued. So NVIDIA has replaced Fermi’s complex scheduler with a far simpler scheduler that still uses scoreboarding and other methods for inter-warp scheduling, but moves the scheduling of instructions in a warp into NVIDIA’s compiler. In essence it’s a return to static scheduling.

Pascal differs from the above approach by introducing dynamic load balancing, which is done in the driver. It's in Nvidia's own presentation of the feature. Once the driver determines the ratio of work types coming in, it distributes it to the GPU (assuming the dev didn't choose static allocation). Eventually there is hardware scheduling, but only once all of the work is being broken down for processing at the SMs.
 

VirtualLarry

No Lifer
Aug 25, 2001
I disabled the second card in Device Manager, so I'm now just using the primary RX 570. I exited and restarted NiceHash; it gave me an error about the missing video card, but started mining on the remaining primary card.

Then I about had a heart attack: my hashrate went down to 4 MH/s rather than 22 MH/s, and my entire Windows 10 experience slowed way down. By the time I decided to reboot, I checked HWMonitor, and it appeared that the MemClock had gone down to 300 MHz for some reason, rather than 1850 MHz. I shut down most of my programs and then started the mining app back up; MemClock returned to 1850 MHz, and I'm back to 22 MH/s. I guess there are still some driver quirks with the MemClock getting set way down for some reason and not bouncing back up under load.

I still kind of persistently have that problem with my other rig, with a pair of R9 260X cards: HWMonitor shows them at 300 MHz MemClock, and they don't want to mine very well. On the newest drivers, I believe. I guess AMD hasn't tested the newest drivers with ALL of their old GCN-based cards, or maybe there's some BIOS or power-management quirk with those cards. Most of my cards are XFX.
 

Headfoot

Diamond Member
Feb 28, 2008
My 290s do the same thing, VL. If you figure out the solution, definitely post again please. So far I've just been periodically rebooting the machine to "reset" back to normal, and that seems to work okay.
 

MrTeal

Diamond Member
Dec 7, 2003
I think it's just an nVidia thing. My system bogs right down when mining on my 1080 Tis and is basically unusable. Videos stutter, and the mouse lags horribly. With AMD cards, while you wouldn't have been able to game, it was perfectly usable for video watching and internet browsing while also mining.
 

VirtualLarry

No Lifer
Aug 25, 2001
I think it's just an nVidia thing. My system bogs right down when mining on my 1080 Tis and is basically unusable. Videos stutter, and the mouse lags horribly. With AMD cards, while you wouldn't have been able to game, it was perfectly usable for video watching and internet browsing while also mining.
That's been my experience as well, in a nutshell. AMD cards seem to me, from observation, to be MUCH better at "actual multitasking".
 

Crono

Lifer
Aug 8, 2001
The default intensity level may be higher for Nvidia cards. AMD cards would definitely get bogged down just the same running at higher intensity. Not sure if there's an option to change it in NiceHash (you probably can through the config file), but you usually can when pool mining via Claymore or ethminer.
 

SimianR

Senior member
Mar 10, 2011
I don't mine on NiceHash; I just use the Claymore miner and mine Ethereum with a GTX 1080. I found that I had this mouse lag and video lag as well, and when I changed the intensity to 1, which is -ethi 1 in the config, the mouse lag went away and there was only a minor performance hit (from 25 MH/s to 24 MH/s).
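
For reference, the intensity is just a command-line flag, so you can set it in your launch script too. Something like the line below (the pool and wallet here are placeholders, and the exe name may differ depending on your Claymore version):

EthDcrMiner64.exe -epool eu1.ethermine.org:4444 -ewal 0xYourWalletAddress -ethi 1

IIRC the default is -ethi 8, so dropping it to 1 trades a little hashrate for a much more responsive desktop.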
 

24601

Golden Member
Jun 10, 2007
It's because OpenCL sucks for high-performance computing like mining, as OpenCL programs usually rely on the driver to load the GPU as much as it can.

The CUDA miners are optimized closer to the metal, while the OpenCL ones are more generic and rely on whatever the AMD driver allows them to do.

Because the CUDA miners are so well optimized, you are left with fewer idle resources on the graphics card.

The last AMD graphics card that had an architecture that wasn't hilariously lopsided was Tahiti, and you will see exactly the same thing happen with Tahiti when you run hyper-optimized hand-coded miners.

The entire reason AMD jammed more and more scheduling silicon into every iteration after Tahiti is that AMD sucks at actually keeping its resources busy.

It's the same reason Ryzen gains more from hyperthreading than Skylake does: Ryzen sucks at extracting instruction-level parallelism compared to Skylake.
 

zlatan

Senior member
Mar 15, 2011
Windows and the traditional GPGPU model are the problem. A GPGPU program launches kernels, no matter what API you use, and the driver asks the OS to launch these for the user, which involves a lot of system calls. If an app is heavily optimized for a GPU and launches a lot of kernels, that means even more system calls, and Windows may give you a bad multitasking experience, because the OS kernel has a lot of work to do just to feed the GPU. It doesn't matter if you don't see a big CPU load; the problem is in the traditional GPGPU programming model, which also leads to a lot of unnecessary copy overhead. This actually gets worse with the Meltdown patch.
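
To make that concrete, here is a minimal CUDA sketch (a hypothetical toy kernel, not from any real miner) of the launch pattern I mean. Every iteration is a separate trip through the driver, and under WDDM every submission is mediated by the OS, so the per-launch overhead adds up even though the GPU work itself is trivial:

#include <cstdio>
#include <cuda_runtime.h>

// Deliberately tiny kernel: the point is launch overhead, not GPU work.
__global__ void addOne(float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;
}

int main()
{
    const int n = 1 << 20;                      // 1M floats (arbitrary size)
    float *d_x = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMemset(d_x, 0, n * sizeof(float));

    // 10,000 separate launches: each one goes through the user-mode driver
    // and, on Windows/WDDM, through an OS-managed submission queue.
    for (int pass = 0; pass < 10000; ++pass)
        addOne<<<(n + 255) / 256, 256>>>(d_x, n);

    cudaDeviceSynchronize();                    // drain the whole queue
    printf("status: %s\n", cudaGetErrorString(cudaGetLastError()));
    cudaFree(d_x);
    return 0;
}

A heavily optimized miner is effectively running the aggressive version of this loop nonstop, which is why the desktop struggles to get its own work scheduled in between.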

This can also be reproduced on AMD, although their driver is just more robust, and they specifically optimize for these scenarios, while Nvidia doesn't. But the underlying issue is still there in the OS, and there may be some GPGPU apps that lead to a bad multitasking experience even with a Radeon GPU.

AMD also solved the whole problem with ROCm on Linux. In that model a kernel launch doesn't involve an OS system call. But this approach requires a very big structural change in the OS, which is why ROCm is not supported on Windows.
 

VirtualLarry

No Lifer
Aug 25, 2001
While I agree it might have something to do with the "intensity" setting, I think the AMD drivers have something to do with it too. On my FM1 APU rig, even at max CPU load, it was still "responsive": far, far better than a Sandy Bridge i3 rig that I happened to be working on, and compared it against, the same day while updating Windows 10.

One thing I noticed is that on the FM1 rig, running Windows 10 64-bit, the CPU load would max out at 99%; it wouldn't allow (user?) threads to occupy 100% of CPU time. As a matter of fact, I could listen to internet radio flawlessly, well, unless the internet connection got bogged down, which it can during Windows 10 updates. But the CPU portion of the FM1 APU never did, and browsing was still possible while doing updates.
 

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
Larry, I do not have this issue with my 1080 Ti. One thing I can recommend is using EWBF's miner to mine Equihash coins.
 

VirtualLarry

No Lifer
Aug 25, 2001
Larry, I do not have this issue with my 1080 Ti. One thing I can recommend is using EWBF's miner to mine Equihash coins.
Interesting. So maybe the issue is NiceHash? Or how "optimized" the CUDA miners are, not leaving that extra 1-2% of GPU power spare for other things?

Now that you mention it, I think some DC projects that ran on my 7950 3GB cards were like that too, but in later years they weren't as bad, because they were modified to only use up to a set maximum percentage of the GPU's resources, leaving some free for occasional other things. I don't know if that was a driver change, or a change in the DC program that "drove" the GPUs.
 

Headfoot

Diamond Member
Feb 28, 2008
When I need to use my PC with the miner on, I just turn down the intensity and it works fine.
 

PhonakV30

Senior member
Oct 26, 2009
I use only Claymore CryptoNote, and it's great, no stutter. If I use xmr-stak-amd.exe for mining Monero, I get horrible stutter in gaming. So change your mining app for NV cards.
 

Muhammed

Senior member
Jul 8, 2009
Nvidia moved away from full hardware scheduling in Kepler. https://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/3
The concept of AMD having better scheduling than NVIDIA is BLATANTLY WRONG; NVIDIA still has more scheduling hardware than AMD even after the reduction. There are two parts to scheduling, instruction scheduling and latency tracking, and Fermi has BOTH. Kepler did without the latency-tracking scoreboarding, but it still retained the instruction part, which is arguably even more complicated than GCN's fixed-latency pipeline and simple counters. NVIDIA still uses a reduced set of scoreboards for that part.

In essence: NVIDIA had huge scheduling hardware in Fermi operating at two levels; AMD never tried that approach, and only introduced one level in GCN. NVIDIA also reduced their scheduling to one level in Kepler, but they still operate an arguably more complicated scheduler than AMD to this day.

Read here for more:
https://forum.beyond3d.com/posts/2015733/
https://forum.beyond3d.com/posts/1989879/
https://forum.beyond3d.com/posts/1975077/
https://forum.beyond3d.com/posts/1930541/
https://forum.beyond3d.com/threads/...rs-and-discussion.59649/page-102#post-1989856
 

Despoiler

Golden Member
Nov 10, 2007
The concept of AMD having better scheduling than NVIDIA is BLATANTLY WRONG; NVIDIA still has more scheduling hardware than AMD even after the reduction. There are two parts to scheduling, instruction scheduling and latency tracking, and Fermi has BOTH. Kepler did without the latency-tracking scoreboarding, but it still retained the instruction part, which is arguably even more complicated than GCN's fixed-latency pipeline and simple counters. NVIDIA still uses a reduced set of scoreboards for that part.

In essence: NVIDIA had huge scheduling hardware in Fermi operating at two levels; AMD never tried that approach, and only introduced one level in GCN. NVIDIA also reduced their scheduling to one level in Kepler, but they still operate an arguably more complicated scheduler than AMD to this day.

Read here for more:
https://forum.beyond3d.com/posts/2015733/
https://forum.beyond3d.com/posts/1989879/
https://forum.beyond3d.com/posts/1975077/
https://forum.beyond3d.com/posts/1930541/
https://forum.beyond3d.com/threads/...rs-and-discussion.59649/page-102#post-1989856

That wasn't the statement, though. We are discussing whether Nvidia does software scheduling, and the answer is they do. We know they use software to do some part of their scheduling. We know that AMD does it all in hardware. So really, Nvidia should be referred to as a hybrid solution and AMD as a hardware one.
 

hemla

Junior Member
Jul 29, 2017
If you have an integrated GPU, you might try setting your OS to use that card for display while leaving your mining cards alone?
 

JimKiler

Diamond Member
Oct 10, 2002
That wasn't the statement, though. We are discussing whether Nvidia does software scheduling, and the answer is they do. We know they use software to do some part of their scheduling. We know that AMD does it all in hardware. So really, Nvidia should be referred to as a hybrid solution and AMD as a hardware one.

Nope. Because AMD has a much simpler scheduler, they use a hybrid solution as well. In fact, they always did, from the very beginning.

Are we talking about schedulers, or about who is better, AMD versus Nvidia?
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
Same thing happens on my Pascal cards when running BOINC as well.

The solution is to suspend BOINC on my main rig when doing something that requires the absence of stutter.
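
If you'd rather script that than click through the BOINC manager, the boinccmd tool that ships with the client should do it; I believe the syntax is something like the below (check boinccmd --help on your install):

boinccmd --set_gpu_mode never
boinccmd --set_gpu_mode auto

"never" suspends GPU tasks while leaving CPU work running, and "auto" hands control back to your normal preferences.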