Asus KGPE-D16 and ESXi

millhouse513

Junior Member
Apr 24, 2013
Hello,

I've built a server at home for testing. The specs are as follows:
- case: Supermicro 4U tower case (convertible to rackmount)
- motherboard: KGPE-D16
- RAM: 32GB DDR3 ECC Kingston memory (9-9-9-24 I believe)
- CPU: 2x Opteron (8 cores @ 2GHz each)
- RAID: Perc6 (eBay special)
- HDDs: mix of 750GB and 1TB drives split into two small RAID arrays

The server is running ESXi version 5.1. The motherboard has almost the latest firmware (found a slightly newer version that dropped about 2 weeks ago).

The problem is that the system will spontaneously reboot without any warning and without logging anything in VMware. The only indication I have is that I lose connectivity to everything, and when VMware comes back up all I see in the logs is "system was booted".

It appears to be heat related: with the case closed, I can't get more than 1-3 days of uptime out of it before it reboots. With the case open and a fan blowing over the motherboard, it's able to stay up for about 5 days but never any longer.
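To compare the two configurations more rigorously, it can help to write down each boot time and compute the gaps between them. Below is a minimal sketch in Python, assuming the boot timestamps are noted by hand; the function name and the timestamps are hypothetical and not taken from this server. The gap between two boots only approximates uptime, since it includes the time the reboot itself takes.

```python
from datetime import datetime, timedelta


def uptimes_between_boots(boot_times):
    """Return the gaps between consecutive boot timestamps, oldest first."""
    ordered = sorted(boot_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical boot times for illustration only (not real data from the server).
first_boot = datetime(2013, 4, 1, 8, 0)
boots = [
    first_boot,
    first_boot + timedelta(days=2, hours=12),           # e.g. a run with the case closed
    first_boot + timedelta(days=7, hours=13, minutes=30),  # e.g. a run with the case open
]

for gap in uptimes_between_boots(boots):
    print(f"{gap.total_seconds() / 86400:.1f} days between boots")
```

Tagging each run with the configuration in use (case open or closed, which cards installed) would make it easier to see whether a change actually moved the time between reboots.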

The CPU temperature sensors show that the CPUs never get above 45C. I can't see the temperatures of the RAM modules from within the OS, but whenever I reboot the system after it has been on for hours and check the BIOS, no RAM module has been above 110F.
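Since the CPU reading is in Celsius and the RAM reading is in Fahrenheit, it helps to put both on the same scale. A minimal sketch in Python, using only the two figures quoted above (the function and variable names are just for illustration):

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


ram_peak_c = fahrenheit_to_celsius(110.0)  # RAM peak as read in the BIOS
cpu_peak_c = 45.0                          # CPU peak as reported by the sensors

print(f"RAM peak: {ram_peak_c:.1f} C, CPU peak: {cpu_peak_c:.1f} C")
# prints "RAM peak: 43.3 C, CPU peak: 45.0 C"
```

So the RAM peak of 110F is roughly 43C, in the same range as the CPU readings.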

I've run memtest on the system for a little over 9 hours and no errors were reported.

Everything has the latest firmware/update with two exceptions:
- the motherboard (I found a minor revision to the BIOS that came out a short while ago and haven't applied it)
- the RAID card: I don't recall the firmware revision off hand but it's not the latest.

The RAID card I know works because it worked for about two months straight w/o any issues in another box before I moved over to this one.

I've tested the power supply and it's good... At first I thought it was a PSU issue because the PSU that shipped with the case was 550 Watts; this current one is an 860W.

I've tried booting the system with only the VMware ESXi boot drive plugged in (direct to motherboard), with NO cards plugged in and NO other hard drives plugged in... no change in stability.

The PCI cards I have plugged in are:
- Dell Perc RAID card
- 6 port gigabit Ethernet card.


I'm confused as to where to look next... BIOS settings recommendations? Could it be a bad southbridge chipset?

Everything is still under warranty so I can RMA something if I need to.

Any ideas/thoughts/tips would be greatly appreciated!


Thanks!
 

nenforcer

Golden Member
Aug 26, 2008
Update to the latest BIOS and try running with only one or possibly two RAM modules at a time.

Check the temperatures on the heatsinks for the South Bridge and even the VRMs if possible.

Do all of this with the minimum amount of equipment installed (only the single boot drive) so you can rule everything else out.
 

Fizban64

Junior Member
Nov 28, 2013
Hello, hopefully you have sorted your overheating issue.

I have the same motherboard, and it can overheat. The problem is down to the heatsink on the AMD SR5690. If you have a long video card, its heat seems to add to the heat around the passive heatsink. I have had to change my video card to a pretty bog-standard little one, and I've bought a little fan to keep the heatsink in check. It used to randomly beep and the temperature was quite alarming, so if you still have issues (anyone out there), go get a little fan from eBay and improve your emotional well-being!

Hopefully ASUS are aware of this and they'll put a decent copper heatsink with a fan on future motherboards.

Good luck and happy playing