New rig totally unstable in SLI, SLI-DR, FX-55, at my wit's end

jmferraiolo

Junior Member
May 11, 2005
13
0
0


Here are my vital system specs, for a machine I had prebuilt by Monarch Computer


Athlon FX-55 @ stock
DFI Lanparty NF4 SLI-DR (310p BIOS) - have also tried 414-3
OCZ ELDCGE-K DDR PC-3200 Dual Channel Gold (in orange slots)
Zalman Copper CNPS7000B CPU Fan
Enermax EG565P-FMA REV.2 FMA Series ATX 12V Ver.2.0 535W PSU
(2) BFG 6800GT OC @ stock (02.28 bios rev, then 03.10)
hp L2335 23" widescreen monitor (DVI)

In windows:

NForce 6.53 (SM bus driver only)
Forceware 71.89, have also tried 76.10, 76.44, 76.45, 76.50...


My memory timings and board voltages have been set according to specs here:

http://www.bleedinedge.com/forum/showthread.php?t=9942

All of my plugs on the board are in place, etc.


Let me begin by saying that I feel like I've tried every tip, trick, timing, etc. that I've found
in forums all over the net, and I cannot for the life of me get this rig stable in any game in
SLI mode.

With no video drivers loaded and only 1 video card in the machine, the system has passed all Memtest,
SuperPI, StressCPU, and Prime95 runs that I've thrown at it. It seems like the underlying CPU/Memory
subsystems are doing fine, or at least appear to be.

Once I loaded the video drivers and began trying to test games, well...

Basically, the PC will crash any time SLI is enabled on the system and I go to play a game.
Half Life 2 is the main culprit, but I also get crashes in World of Warcraft. The system seems
to work fine with either of the cards operating individually in 1 slot. I did this just to make sure
that one of these 6800GTs being defective wasn't the problem. The system also works fine with both
cards in the machine with SLI disabled. And yet still, it also works fine with SLI enabled but with
the SLI bridge off ("software" SLI). It is only when SLI is enabled and the bridge is connecting
the two cards for full SLI operation that I get the issues. And yes, I have two SLI bridges and it
does it with both of them (and I've tried facing them both ways; I don't know whether this matters).

The main offending app, or the game that I have the most "success" with crashing the system is
Half Life 2. Usually it will crash inside of 10 minutes, but I've had it go for an hour or so
before crashing one time using the DNA Forceware 76.44 drivers. Of course, once I crashed with
these drivers, my system was no longer stable even inside of Windows, locking up within about 10 seconds
of logon (with a noticeable high-pitched squeal emanating from the motherboard). I quickly
went back to the 76.45 WHQLs and no longer encountered _this_ type of crash...

The crashes are particularly heinous as they knock my system to a state where the machine won't even
come back from a reset. I've had garbled, unreadable BSODs (nv4_disp.sys except totally unreadable),
nasty vertical banded test pattern type screens, and just 100% straight up black screen hard crashes
which trigger a reboot (even when I have the system NOT set to automatically restart on critical
errors). After I crash to a black screen, my monitor goes out of sync as if I were going to POST
... except it doesn't come back at all. I think this may have something to do with the cold/warm
boot problems on the SLI-DR board w/ a DVI monitor outlined in a thread on DFI-Street.com (don't have
link handy).

If I press reset, when I look at the 4 diagnostic LEDs on the NF4, all 4 light up and the system does
nothing except blink the 1st LED (which is CPU detect, I believe). It will not POST at all here.
I have to cold power it down and then power it back up in order to POST. All of this from a crash
in a game? Seems like a bit much.

The rare times when HL2 hasn't hard crashed my machine and when I've crashed in World of Warcraft,
I get a standard Windows application error with the 'the referenced memory area can not be "read"'
message. I think this traces back to some STUCK_THREAD Q article, not sure.

Here's what I've tried thus far:

Vid card
--------

1) Switching cards / SLI connector on/off
2) Flashed BFG 6800GT bios 02.28 to latest 03.10 (www.mvktech.com)
3) Different drivers, all of them crash equally
4) Underclocked the 370/1000 stock speeds on the 6800GTs to Nvidia spec 350/1000
5) Overclocked the cards just to see if it'd crash the machine quicker. It did.

Memory timings / voltages / BIOS
--------------------------------

1) Tried everything on AUTO
2) Set specs according to thread posted above
3) Updated to 414-3 bios, still same issues
4) Enabled / disabled all motherboard driven devices (IRDA, RAID, sound, etc.)

Software
--------

1) Tried Rivatuner Athlon "compatibility fix"
2) Multi-GPU / Single GPU rendering modes

A lot of times these were done on a fresh install of Windows. By my count I've reinstalled the system
10-15 times or so.


Conclusions and questions thus far
----------------------------------

I'm getting ready to RMA this thing, but I don't feel like I should have to. I am sure there
are dozens of others that have an FX-55 with this board running 2 6800GTs in SLI mode just fine
without these problems. If you're one of them, please speak up and let me know how it was accomplished.

* I have a 535W Enermax power supply with 18a each on 2 12V rails. This should be enough for SLI,
yes?

* The fact that the machine doesn't POST after a crash and requires a hard poweroff makes me
suspect there's some kind of issue with the motherboard. Am I right to assume this?

* The fact that I am getting these horrible test pattern crashes / garbled blue screens also makes
me wonder if either the cards are unstable at their gaming temps (71-75c) or if not enough power
is being supplied to them (which shouldn't be, right?) Yet the cards work fine one at a time...
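
For the 12V question above, the rail arithmetic can be sketched roughly like this. It only applies P = V * I to the 18A-per-rail figure quoted above; the load numbers are made-up placeholders (not measurements of this system), and the function names are just for illustration. Many dual-rail supplies also specify a combined 12V limit lower than the sum of the rails, so the label on the PSU is the real authority.

```python
def rail_capacity_watts(volts: float, amps: float) -> float:
    """Maximum power a single rail can deliver: P = V * I."""
    return volts * amps


def headroom_watts(capacity_w: float, loads_w: list[float]) -> float:
    """Capacity left on a rail after subtracting the loads hung on it."""
    return capacity_w - sum(loads_w)


# Figures from the post: two 12V rails rated at 18A each.
per_rail_w = rail_capacity_watts(12.0, 18.0)   # 216.0 W per rail
naive_total_w = 2 * per_rail_w                 # 432.0 W if the rails simply add

# PLACEHOLDER loads, purely illustrative. Substitute real figures for
# whatever is actually wired to one rail (e.g. both video cards).
example_loads_w = [100.0, 60.0]
spare_w = headroom_watts(per_rail_w, example_loads_w)

print(f"per rail: {per_rail_w} W, naive total: {naive_total_w} W")
print(f"spare on example rail: {spare_w} W")
```

If both cards plus drives and fans end up on the same rail, the per-rail figure matters more than the 535W total.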


Basically I guess I don't want to admit defeat. :) But I've already struggled with this thing
all week.

Any help anyone can provide would be welcomed. I'm crossposting this to a couple different
hardware forums so please excuse me if you see this twice.

 

deadseasquirrel

Golden Member
Nov 20, 2001
1,736
0
0
One problem with prebuilt is that you personally didn't get your hands in there and do it, paying close attention to every step along the way. How can we trust that the builder hooked everything up correctly? For example, the DFI board has 2 power connectors on the board itself (one HDD-type connector and one FDD-type connector) that aid in the stability of the board when running SLI (as well as better overall stability in general).

This would be one of the first things I would check.
 

jmferraiolo

Junior Member
May 11, 2005
13
0
0
I'm not sure exactly what I'm looking for when I look for these. Just about everything on my motherboard that looks like a power plug has something plugged in to it. I'll check this ASAP though, once I know what I'm looking for. Perhaps I can find a link to the board layout.
 

deadseasquirrel

Golden Member
Nov 20, 2001
1,736
0
0
here you go

One is right next to the NB fan near the top center.

The FDD one is almost smack right in the middle of the board... to the lower right of the CPU socket.

I'm not saying that this is your problem, just that it *could* be. A few of the things you point out in your post make me think "power".
 

jmferraiolo

Junior Member
May 11, 2005
13
0
0
You were right -- the FDD one was not connected! It is hidden unbelievably well, granted, but I can't believe a system builder as experienced as Monarch would overlook something like this, particularly on a shipped SLI system.

Of course it remains to be seen whether or not it will fix my issues -- I connected it, rolled back my video card BIOS to 02.28 (which is what it shipped with), and restored my drivers back to 71.89 (76.xx was causing crashing in Windows with the high pitched noise emanating from the system even after I plugged in the rogue power connector). I loaded up HL2 and went back to work -- if I come home and the system is still up, I will be very pleased. I'm kinda not banking on it though...

 

deadseasquirrel

Golden Member
Nov 20, 2001
1,736
0
0
Originally posted by: jmferraiolo
I'm kinda not banking on it though...

Now now. What kind of attitude is that? Lemme know how it goes, 'cuz I've got some other ideas. Do you have access to a live chicken? Good. It hasn't been exposed to lead has it? Good. Any astroglide? Good. You're also gonna need a bicycle pump (one that can reach 120PSI) and one of those big sausages from a deli.
 

jmferraiolo

Junior Member
May 11, 2005
13
0
0
I'll be sure to pick up all of those on the way home.

Heh, sorry for the pessimism, it's just the way my luck has gone with this machine the past week -- I find the one thing that I feel like I had yet to uncover and it may not make a difference. We'll see in about an hour.

 

jmferraiolo

Junior Member
May 11, 2005
13
0
0
Couldn't wait and called the wife -- system that I left in HL2 had crashed. So apparently plugging in the unplugged power rail hasn't stopped anything. :(

Reports I'm getting from other forums also suspect the power supply. I just don't see how that can be, but hell, maybe it is.

 

deadseasquirrel

Golden Member
Nov 20, 2001
1,736
0
0
Do you have a multimeter to test your voltages? If not, can you list the voltages showing in your BIOS for your 12v, 3.3v, and 5v (or in Windows, I suppose)?
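
As a rough sketch of what "in spec" would mean for those readings: the check below assumes the commonly cited ATX tolerance of plus or minus 5% on the 12V, 5V, and 3.3V rails (an assumption worth verifying against your PSU's documentation). The function name and sample readings are hypothetical, and BIOS sensor values are only approximate compared with a multimeter.

```python
TOLERANCE = 0.05  # assumed +/-5% window on the positive rails


def rail_status(nominal: float, measured: float, tolerance: float = TOLERANCE) -> str:
    """Classify a reading as 'low', 'ok', or 'high' against nominal +/- tolerance."""
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    if measured < low:
        return "low"
    if measured > high:
        return "high"
    return "ok"


# HYPOTHETICAL readings for illustration only, not taken from this system.
sample_readings = {12.0: 11.2, 5.0: 5.1, 3.3: 3.3}

for nominal, measured in sample_readings.items():
    print(f"{nominal}V rail reads {measured}V: {rail_status(nominal, measured)}")
```

A reading that only sags under load (in a game, not at the BIOS screen) would not show up this way, which is why a multimeter or a different PSU is the better test.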

I'm not a fan of dual rail PSUs. Not that they're bad or anything... I just don't know how the builder hooked it up. How is everything plugged in, etc?

I would test out a different PSU if you have one.
 
Nov 11, 2004
10,855
0
0
Just send the system to me. If it works, I'll keep it. If it doesn't, I'll send it back. :)

A faulty PSU would be one of my first guesses. It could also be faulty memory. Try a few dozen passes of memtest86 overnight.
 

stevty2889

Diamond Member
Dec 13, 2003
7,036
8
81
It really sounds like some sort of driver issue since it works fine in anything but SLI. You may also want to try a bios update if one is available for your motherboard and see if there are any updated chipset drivers for your motherboard. Not sure how to check this, but if both cards are on the same 12v rail, it might be too much for the power supply.
 

jmferraiolo

Junior Member
May 11, 2005
13
0
0
I don't have a multimeter, but the voltages in my BIOS look plenty strong and read consistently above the nominal voltage for each rail.

I have a Thermaltake case that has 6 fans in it as well, with front panel controls that list the speeds the fans are running at and let you adjust them. Would this draw a lot of power from the PSU?

I'd think a 535W PSU should be adequate for the hardware I have, but maybe it's not. I might pick up the OCZ 600 powerstream and see if that solves anything but it's gonna be a messy rewire job.

I did flash update the BIOS to 414-2 per someone's suggestion on another forum but that didn't solve anything.

The system passed 12 passes of memtest last night prior to my 15000th reinstall of Windows under the new BIOS. I just wanted to be sure. Overnight I ran 8 hours of Prime and it passed that too. I'm reasonably sure it's not the memory.

 

Minotar

Member
Aug 30, 2004
147
0
0
Time to send the PC back to Monarch, bud! Sounds to me like an SLI hardware problem, which probably indicates some underlying mobo problems. I personally don't like Monarch, as they failed my 25-point questionnaire when I built the rig in my sig. I sent that questionnaire to 15 different PC companies, and Monarch really disappointed me when we got down to almost doing business. I doubt it is the power supply from the data you provide, but it does sound like it is either a mobo problem or your video cards are massively overheating under SLI mode... What are your temps under full load? That can indeed cause the crashes you describe. How is your case and cooling situation? I hope Monarch gives you quicker support than they give many.