Testing stability under I/O load

soltys

Junior Member
Aug 6, 2004
Hello

I've recently discovered that the usual prime95/memtest/3dmark/etc. stress tests weren't enough for my rig.

Purely by coincidence I discovered problems with simple I/O operations: I copied one larger (around 300 MB) archive to the other disk, and 1 bit turned out to be different :).

So I made a simple shell script that, in a loop, copied a larger file, synced, and then did a binary comparison. Rinse, repeat. It turned out that I was getting approx. two 1-bit errors per ~6 hours of testing. When I backed the FSB off from 460ish to the earlier 410, everything went back to normal.
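
For anyone who wants to try the same thing, here is a minimal sketch of that copy/sync/compare loop. It is written in Python rather than as the original shell script (which isn't posted here), so the function names, the chunk size and the bit counting are my own assumptions, not the exact script that was used.

```python
import os
import shutil

CHUNK = 1 << 20  # compare 1 MiB at a time (arbitrary choice)


def copy_and_sync(src, dst):
    """Copy src to dst, then ask the OS to flush dst to disk."""
    shutil.copyfile(src, dst)
    with open(dst, "rb+") as f:
        os.fsync(f.fileno())


def count_bit_diffs(path_a, path_b):
    """Return the number of differing bits between two equally sized files."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        raise ValueError("file sizes differ")
    diffs = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(CHUNK)
            chunk_b = b.read(CHUNK)
            if not chunk_a:
                break
            if chunk_a != chunk_b:
                # Only walk byte by byte when the chunks actually differ.
                for x, y in zip(chunk_a, chunk_b):
                    diffs += bin(x ^ y).count("1")
    return diffs


def stress(src, dst, passes):
    """Repeat copy/sync/compare; return (pass_number, bit_diffs) for bad passes."""
    failures = []
    for i in range(1, passes + 1):
        copy_and_sync(src, dst)
        diffs = count_bit_diffs(src, dst)
        if diffs:
            failures.append((i, diffs))
        os.remove(dst)
    return failures
```

Calling something like `stress("big_archive.bin", "/other_disk/copy.bin", 1000)` (both paths are placeholders) and checking for a non-empty result reproduces the idea. One caveat: the comparison may well read the copy back from the OS file cache instead of the disk, so a loop like this probably exercises RAM and the memory controller at least as much as the disk controller.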

The board in question is an old rev. 1.0 GA-P965-DQ6 (with an E6400 and TwinX2048-6400C4), so pretty old by today's standards, and without too many BIOS settings for overclocking. As it was a while ago, I can't recall the precise settings. Still, they were carefully set to achieve stability under the typical testers mentioned above (tens of hours spent in Orthos and memtest).

I still wonder, though, whether the culprit was the CPU/memory, missed by Orthos etc., or the southbridge (both disks were on the ICH8R) being unable to work properly under these conditions, which my small test somehow caught (also, my board doesn't allow southbridge overvolting).


Still, putting my setup aside: do any of you perform these types of tests? If so, are there any specific tools you use for that purpose?

 

tynopik

Diamond Member
Aug 10, 2004
i wish they had more tests like this in reviews

especially in motherboard reviews: they could run some I/O stress tests where they're exercising the onboard raid, ethernet, usb, firewire, etc. all at once and see how many errors they get

from my experience, errors happen a lot more frequently than you might expect
 

VirtualLarry

No Lifer
Aug 25, 2001
I had errors like that once when using a Promise Ultra66 card on a BX6-r2 (440BX) mobo. As it turns out, there is a bug in the PCI interface on that card that corrupts the last few bytes of a 4096-byte block of data when there is heavy PCI traffic. I had to do a lot of disk transfers, as well as load down my ethernet, in order to trigger the bug. It was annoying; my 440BX was otherwise rock-stable.
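
If you want to check whether corruption follows a pattern like that (errors clustered at the end of each 4096-byte block), one rough approach is to list where the mismatches fall inside each block. This is only a sketch in Python: the function name and the default block size are assumptions for illustration, not a tool I actually used.

```python
def mismatch_positions(path_a, path_b, block=4096):
    """Return (file_offset, offset_within_block) for every differing byte."""
    positions = []
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(block)
            chunk_b = b.read(block)
            if not chunk_a and not chunk_b:
                break
            # Compare the overlapping part of the two chunks byte by byte.
            for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                if x != y:
                    positions.append((offset + i, (offset + i) % block))
            offset += max(len(chunk_a), len(chunk_b))
    return positions
```

If the second number in each pair keeps landing near `block - 1`, the damage sits at the tail of the block, which would point at a transfer problem rather than random bit flips.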

The Ultra100TX2/Ultra133TX2 cards supposedly have the bug fixed. That's what I'm currently using as a controller card.

Another good way to test this is to use QuickPAR to repair a large set of files: if the files still don't verify as correct after the repair, suspect memory errors.

I should also mention that I've done some reading, and the P5K-series boards seem to have problems with data corruption over the ICH9R chip, especially with Maxtor SATA-II HDs. Other manufacturers' ICH9R boards seem unaffected. (?)
 

soltys

Junior Member
Aug 6, 2004
I have retested this, and so far it seems that memory was holding me back after all. With relaxed timings it's working fine. Or at least I haven't found any errors yet.

Even so, the only ways to find the problem were the copy test mentioned above or GoldMemory. I had completely forgotten about this memtest alternative. Either way, it found errors somewhere around its 350th test during the first pass. In total it has 711 tests per pass.

So I'm currently running at MCH +0.3, FSB term. +0.05, FSB @468, E6400 with multi x7. Rev 1.0 GA-P965-DQ6 with the latest BIOS. At the moment the RAM is set to 5/5/5/15 @2.1v.

I'll put it through a longer torture test tomorrow :)