Hi All,
I'm posting this on various forums because I've been unable to get a solution to my problem. Please may I ask for some assistance from you wizened gurus? Any advice, or at least some pointers, would be invaluable to me. First, my system specs and some background:
System Specs:
HAF X Case - All fans replaced with 200mm Megaflows
ASUS Rampage IV Extreme BIOS 4004/3404 (chip1/chip2)
Intel 3930k C2 Stepping @ stock
32GB G.Skill RipjawsZ 2133MHz @ 1600MHz (will push to max speed once everything works)
2 x MSI 7970 Lightning 3GB in Crossfire @ stock (GPU 1 in slot 1 [red x16], GPU 2 in slot 3 [black x8])
Titanium HD Sound Card (slot 5 black PCI-E x1)
Samsung 840 Pro SSD (Boot) - latest firmware
OCZ Vertex 3 MaxIops (Games) - latest firmware
2 x Seagate Barracuda 1TB in RAID-0 (VMs/Projects/Storage)
1 x WD 500GB (Extra Storage)
Corsair AX1200W PSU
Sony Blu-Ray Burner/Player
2 x Iiyama Prolite 1920 x 1080 Monitors
Lamptron FC-9 Fan Controller
Koolance Dual Pump Relay (overvolt pumps 12V-->24V)
Custom Water Loop - 1 x Black Ice SR-1 480 Rad externally mounted, 1 x GTX 360 Rad internally mounted, EK 250ml X2 Advanced Res, EK-FC7970 Water Blocks for both GPU's, XSPC Raystorm Waterblock for CPU, Prolimatech LRT 1/2" ID 3/4" OD tubing, VL4N QDC's, Bitspower fittings/rotaries, 2 x Koolance PMP-450s pumps with EK Dual Top, Prolimatech PK-1/Phobya HeGrease TIM, Apollish Vegas 2000rpm fans for GTX Rad/Shark Aerocool 1500rpm fans for SR-1 Rad.
Background:
The system ran stably for 4+ months; however, GPU2 was running too hot (GPU1 at 42 degrees under load, GPU2 at 75-85 degrees under only half load). Both cards are cooled by the same model of block and both have a thin layer of Prolimatech PK-1 applied to the GPU core (using the X method). The cards are linked by a triple-slot EK Plexi bridge.
I decided to swap the cards around as a test. The temperature variations followed the same card, so I took it apart and noticed the TIM had gone hard in the middle and that there was far too much of it. I re-applied TIM using the credit card method and put the card back into its original (slot 3) position. Since that day, the card powers down after 20-30s of being on. When I turn the PC on, the blue lights on the rear PCB come on (all 8) and stay on for about 20-30s before going out. The reactor core stays blue at all times. If I'm lucky enough to get into the BIOS while the blue lights are on, the card shows in the BIOS; if the lights go out, the card won't show in the BIOS. This leads me to believe there is some kind of power saving feature kicking in, a DMI/IRQ conflict, or that the card is simply bust. I should also point out that the "Boot Device LED" is solid red from power-on until the Windows logo comes up, then it disappears and the status readout shows "AA" (everything running nominal). Before applying for an RMA, I performed vast amounts of troubleshooting/testing, including but not limited to:
- Using the PCI-E DIP switches on the motherboard to disable lanes and test each card individually (the issue follows the same card regardless of position, i.e. there is no display from the second GPU because the card powers off 20-30s after POST)
- Connecting one and then two monitors to the card to see if that 'kick starts' it into staying powered on
- Disconnecting everything apart from primary boot drive, 1 stick of RAM, GPU's, keyboard/mouse
- Uninstalling drivers using the AMD driver removal tool followed by Driver Fusion in Safe Mode, then re-installing the latest drivers (including the 13.5 Beta 2 drivers and latest CAP)
- Using the latest version of MSI Afterburner to disable ULPS, plus registry tweaks to set every "EnableUlps" value from "1" to "0" (there are about 10 of them on my machine; see the sketch after this list)
- Running with one card (required re-tubing of system because of plexi bridge and position of other components in the case)
- Doing a complete re-install of Windows with everything at 'optimised defaults' in the BIOS (note this is possible because GPU1 works fine, so I connect the monitors to that whilst testing)
- Swapping power cables from GPU1 to GPU2
- Reseating all power cables and checking they're all positioned correctly
- Ensuring there is power to the EZ-Plug next to the GPU's
- Swapping between three different CrossFire bridge cables in positions 1 and 2 on the cards
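In case it helps anyone reproduce the ULPS registry tweak mentioned above, here's roughly what I did, written up as a small Python sketch. Treat it as illustrative only: it assumes Windows with Python run as Administrator, that the display adapters sit under the standard class GUID, and that AMD's value is named "EnableUlps" (it is on my machine, give or take capitalisation).

[CODE]
# Illustrative sketch only: flip every EnableUlps registry value from 1 to 0
# under the standard display-adapter class key. Assumes Windows, an elevated
# (Administrator) Python, and 64-bit Python on a 64-bit OS (otherwise add
# winreg.KEY_WOW64_64KEY to the access flags to avoid registry redirection).
import winreg

CLASS_KEY = (r"SYSTEM\CurrentControlSet\Control\Class"
             r"\{4d36e968-e325-11ce-bfc1-08002be10318}")  # display adapters

def disable_ulps():
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CLASS_KEY) as class_key:
        index = 0
        while True:
            try:
                sub_name = winreg.EnumKey(class_key, index)  # "0000", "0001", ...
            except OSError:
                break  # ran out of subkeys
            index += 1
            sub_path = CLASS_KEY + "\\" + sub_name
            try:
                with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, sub_path, 0,
                                    winreg.KEY_READ | winreg.KEY_SET_VALUE) as k:
                    value, value_type = winreg.QueryValueEx(k, "EnableUlps")
                    if value != 0:
                        winreg.SetValueEx(k, "EnableUlps", 0, value_type, 0)
                        print("Set EnableUlps to 0 under " + sub_path)
            except FileNotFoundError:
                continue  # no EnableUlps value in this subkey
            except PermissionError:
                print("No write access to " + sub_path + "; run elevated")

if __name__ == "__main__":
    disable_ulps()
[/CODE]

A reboot is needed afterwards, since as far as I can tell the driver only reads the value at start-up.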
Realising that all of the above had been to no avail, I applied for an RMA with Overclockers.co.uk. They wanted photos of the card, and for me to reassemble the stock cooler, which I did. I sent the card off and 3 days later had a response that they were sending it back to me because no problems were found and the card passed all benchmarks. So, the card MUST be working?? I tested the card with the stock cooler in my old PC and it did indeed turn on, stay on and give me a display. Cracking, I thought, everything is hunky dory again. Wrong.

I took the card apart again, checked the PCB with a magnifying glass, gave the waterblock a thorough cleaning and ensured all thermal pads were in optimum positions. The waterblock went back on: thermal paste applied thinly and evenly, mounting pressure uniform with the screws tightened in a cross pattern, and everything looked perfect. The pads were flush, the core was making perfect contact and the rear mounting plate with the core reactor was mounted correctly. I checked in between layers for potential shorting issues - nothing. I put the cards back into the system as per the original spec (water flowing great, both cards freezing to the touch, no leaks, 8-pin power cables secure, cards seated firmly in the PCI-E slots, etc.), powered on and... SAME thing. BOTH GPUs turn on at initial power-on, then, after 20-30 seconds, the blue lights on GPU2 turn off completely, as if the card is powering down because it's not in use. GPU1 at this point is on, with 2 solid blue lights on the back.
I've been trying to rectify this issue for about 3 weeks now in the very sparse time I have to do so (maybe 5 hours a week because of work/family/gym obligations). I have tried everything I can think of and just don't know what to do next. I don't understand how both cards were working fine (albeit one very hot), then, after swapping them around and re-applying thermal paste, everything went to pot. What could it be? Is it indeed a power saving issue? Is slot 3 (black PCI-E x8) now dead? Is the motherboard now dead?? Why would it show the "Boot Device LED" at power-on and then status "AA" once in Windows?? No matter what, the second card just won't stay on. It's not even recognised in Windows, old or new build (see the quick check below). It did show up in Safe Mode one time, but as soon as I booted normally, it disappeared from Device Manager. I am truly at my wit's end and am turning to someone, anyone, with the knowledge/skill-set to help me. If anyone knows how I might proceed, or at least knows someone with these cards, I'd be eternally grateful for any information you could provide. I apologise for the wall of text but I wanted to make sure I provided every last detail 🙂
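For completeness, here's the quick check I've been using to see whether Windows can enumerate the card at all, without opening Device Manager each time. It's just a thin Python wrapper around the built-in wmic tool, so purely illustrative:

[CODE]
# Illustrative helper: list every display adapter Windows currently
# enumerates, plus its status, using the stock wmic tool. If GPU2 has
# powered itself down, it simply won't appear in this list.
import subprocess

def list_display_adapters():
    # wmic usually emits UTF-16 when its output is redirected, hence the
    # explicit decode rather than universal_newlines/text mode.
    raw = subprocess.check_output(
        ["wmic", "path", "Win32_VideoController", "get", "Name,Status"]
    )
    for line in raw.decode("utf-16", errors="replace").splitlines():
        if line.strip():
            print(line.rstrip())

if __name__ == "__main__":
    list_display_adapters()
[/CODE]

When both cards are alive I'd expect two 7970 entries; once the blue lights on GPU2 go out, only GPU1 shows up.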
Thanks very much indeed everyone and I look forward to your feedback.
Kind Regards
Malachor.