Strange problem with twin HD4850 cards and MW@Home

VirtualLarry

No Lifer
Aug 25, 2001
56,339
10,044
126
Using the newest released version of BOINC (7.x.42, I think?), and a pair of HD4850 VisionTek 512MB GDDR3 reference single-slot cards. Platform is a Thuban 1045T @ 2.7GHz (stock), ASRock 990FX mobo (dual PCI-E 2.0 x16 slots), running Win7 HP 64-bit SP1 + updates.

I have my 32" HDTV connected with a VGA cable, through a DVI-I to VGA adapter on the topmost card (first slot).

I thought, before I shut the box down, that both cards were running pretty much flat-out (and whining a bit, those tiny 8K-9K RPM fans do that).

Now, after restarting the box, I started two copies of GPU-Z, selected the first card in one copy and the second card in the other, and looked at the "Sensors" tab. The first card shows 100% GPU usage and temps as high as 112C. The second card shows 0% GPU usage, with all temps around 86C.

Assuming that GPU-Z is correct, it seems that one of the cards isn't even being used.

But BOINC seems to think that both cards are being used, or something. It's running two MW@Home WUs at a time.

So, is this an AMD/ATI driver problem? I'm going to re-install / roll-back to 13.12, that version seemed to work pretty well for me, or maybe 13.9.

I'm just a little confused as to why this is happening. I don't recall changing the app_info.xml for MW@Home to allow two WUs on one GPU.


Edit: I tried installing 13.12, but that doesn't support the HD4850, which was apparently moved to "legacy" before then. I then tried 13.9, which was the newest offered "legacy" driver.

GPU-Z was still reporting that only one of the cards was at 100%; the other was at 0%.

So I experimented with cc_config.xml.

Code:
<cc_config>
    <options>
        <use_all_gpus>1</use_all_gpus>
        <ignore_ati_dev>1</ignore_ati_dev> 
    </options>
</cc_config>

If I use that cc_config.xml file, BOINC reports that one of the MW@Home tasks is waiting, the other running, but it doesn't make any progress, and the fans are nearly silent. GPU-Z reports that both cards' GPU utilization is 0%.

Changing the <ignore_ati_dev> value to 0 results in one of the MW@Home tasks waiting and one running and making progress, and at least one of the cards' fans ramps up. GPU-Z reports that the first card's GPU utilization is 100%.

So, clearly, something is wrong.

AMD's CCC hardware report shows a "Primary adapter" (HD4800 series), as well as a "disabled adapter" (HD4800 series).

I don't know how to enable it. I tried going into "Screen Resolution", but I could only enable outputs on the primary card. (I remember having to do that with my quad-GPU cruncher with NV cards for F@H, I had to enable "dummy" VGA outputs on all of the cards, all 8 outputs.)

Device Manager shows two "HD4800 series" entries under Display adapters. But I seem to recall that these cards showed up twice in Device Manager even when there was only one card installed; that's how they were designed. Either that, or perhaps that was my older X1950 card. I don't remember exactly.

*Confused*

Edit: Maybe it is using both cards after all. I noticed that, while the cards are quiet, the MW@Home WUs seem to be completing in 2 minutes rather than 6. GPU-Z doesn't report high temps or any visible GPU usage on either card, though.

Could this all just be a GPU-Z reporting anomaly?

I think I do remember that one of my HD4850 cards had a nearly failing fan, but the other one was in pretty good shape. I guess that's what's going on after all.

Edit: I guess my problem is that one of the cards has a failing fan. It's showing 9600RPM, and 112C temp on ShaderCore. Ouch. Maybe I should just take that one out of commission. I tried cleaning it, but didn't get too far. These ATI cards are hard to clean.
 

Pokey

Platinum Member
Oct 20, 1999
2,766
457
126
Sounds like you have pretty much narrowed it down. It could be the fan on the offending card, but it could also be that the heat sink is no longer making good contact.
I have never tried to do that kind of maintenance, so I don't know whether it is possible, or even feasible, on an older card.
 

salvorhardin

Senior member
Jan 30, 2003
389
35
91
Have you tried removing the <ignore_ati_dev> line? That line makes BOINC ignore a specific GPU (0 = primary, 1 = secondary).
 

VirtualLarry

salvorhardin said:
Have you tried removing the <ignore_ati_dev> line? That line makes BOINC ignore a specific GPU (0 = primary, 1 = secondary).

Yeah. When one of them was ignored, I heard the fan spin up high, and the primary GPU showed 100% usage in GPU-Z, 0% on the secondary; when the other was ignored, I heard nothing, and GPU-Z for both GPUs showed 0% usage. Thus I thought that both WUs were running on one card.

I have it running without that line now, and BOINC shows two WUs being worked on.
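
For reference, here is a rough sketch of what the cc_config.xml looks like with that line removed. It is reconstructed from the file posted earlier in the thread, not copied from the machine, so treat it as an approximation:

Code:
<cc_config>
    <options>
        <use_all_gpus>1</use_all_gpus>
    </options>
</cc_config>

With only <use_all_gpus> left, BOINC should be free to schedule work on both cards. As far as I understand it, the client has to be restarted (or at least re-read the config file) before a change like this takes effect.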

Just kinda weird that one of the cards is at 100% GPU load, 112C, and 9600RPM, and the other shows 0% GPU load, 92C, and 4600RPM.
 

salvorhardin

Does your TV have any extra inputs that you can connect your 2nd GPU to? When I was running 2 AMD GPUs, I had to connect them both to my monitor (DVI and HDMI) so the 2nd GPU would show up as available in Windows and would actually process WUs.
 

Pokey

I assume you have tried each card alone, and they both work normally as single cards?