Is it normal to take 33+ hours for a memtest to finally report an error?

Red Squirrel

No Lifer
May 24, 2003
67,308
12,093
126
www.anyf.ca
I just built a new machine, and I'm going to use 2 sticks out of a 4-stick pack in which 1 stick failed. Since 1 stick of that batch failed, I'm testing the rest to ensure it's not all 4 and that there's not something odd going on. While testing with two sticks I got an error a couple of hours in, after a few full passes. So I took one stick out and never got an error after about 24 hours. I took it out and put the other stick in (always testing from slot 1) and got an error after 33 hours. I suppose I should have taken a pic of the error and details, but it never occurred to me. I'm rerunning the test on the stick that originally passed so I can go past 33 hours, perhaps even longer.

Considering it took that long and that many passes to get an error, should I even trust this RAM at all, even if I don't get an error on this stick? How sensitive is RAM to temperature fluctuations? Since I don't heat my house at max when I'm not home (to save on natural gas), the ambient temp ranges between around 11°C and 22°C depending on whether I'm home or not.

I have another stick I have not tested yet. If I can isolate one stick as being good I will then test the other, then I will test them again together. If they pass the test after a couple of days, is it safe to assume it's good RAM? Or am I better off just finding more RAM? Problem is, it's easy to find a motherboard that is compatible with specific RAM, but harder to find RAM that is compatible with a specific motherboard. I specifically picked this motherboard out of the RAM's compatibility list, but motherboards don't tend to have extensive RAM compatibility lists, so I will have to cross-reference each RAM vendor AND make sure I can buy the RAM through a Canadian retailer at a decent price. So it's quite a pain if I do have to buy more RAM.

If it matters, the motherboard is a Gigabyte Z97MX-Gaming 5. The RAM is G.Skill F3-17000CL9Q-16GBZH (they are 4GB modules).
 

bryanl

Golden Member
Oct 15, 2006
1,157
8
81
Yes, but it's also possible the error was due to a power transient, not that I would expect that to be the prime reason for a diagnostic to report an error with Gskill memory. In other words, Gskill is not the highest quality memory, and your PC17000 CL9 is most likely just PC12800, CL11, at best.
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,712
142
106
It's rare for me to ever run more than 1 or 2 runs of memtest; that usually finds the errors.
Further stability testing is done within the OS (compiling large projects like Firefox is the best way to hammer the CPU and memory subsystem, in my experience).

If it took that long for you, I'd look at things other than memory: CPU, memory controller, temps/voltages.

Was it at the hottest point of the day when the errors hit?
I've seen systems 24/7 stable for weeks, then get errors on exactly the hottest part of the hottest day.
 

tortillasoup

Golden Member
Jan 12, 2011
1,977
3
81
If you're overclocking, then yes, but if you're just talking stock speeds, then it's possible you have bad memory, a bad motherboard or a bad power supply. It's also possible that something inside your house or even a neighbor's house is somehow affecting the power supply in your computer, causing that transient failure. Electricity can be funny like that. Yes, capacitors are supposed to filter this sort of stuff out, but I've seen some odd things happen due to malformed electricity....

You wouldn't happen to have this computer on a backup UPS system and/or connected to anything but direct grid power, would you? I had a series of gradual hardware failures that manifested themselves as software errors, caused by a whole-house solar inverter system that was outputting a modified square wave (the inverter's owner's manual indicated as much).
 

DigDog

Lifer
Jun 3, 2011
13,468
2,103
126
so you got ONE error, in 33 hours?

you know what that means, right?
 

polarmystery

Diamond Member
Aug 21, 2005
3,907
8
81
If you're overclocking, then yes, but if you're just talking stock speeds, then it's possible you have bad memory, a bad motherboard or a bad power supply. It's also possible that something inside your house or even a neighbor's house is somehow affecting the power supply in your computer, causing that transient failure. Electricity can be funny like that. Yes, capacitors are supposed to filter this sort of stuff out, but I've seen some odd things happen due to malformed electricity....

You wouldn't happen to have this computer on a backup UPS system and/or connected to anything but direct grid power, would you? I had a series of gradual hardware failures that manifested themselves as software errors, caused by a whole-house solar inverter system that was outputting a modified square wave (the inverter's owner's manual indicated as much).

Absolute garbage.

It is the exception, not the rule, that RAM BER (bit error rate) tests result in zero errors. Things like total clock jitter (T_j), deterministic jitter (D_j), and random jitter (R_j) can all affect the UI (unit interval) available for setup/hold times on source-synchronous RAM, never mind the variations in the manufacturing process of the motherboards and RAM and the temperature deltas during normal operation.

I don't know exactly how memtest performs its tests, but one error in 33 hours (if you were running constantly) at DDR3-2133 (again assuming constant transfers through all address banks) sounds more than reasonable and is probably well below the JEDEC requirement.

I can only imagine what happens when DDR4 becomes the norm; the UI for that stuff is so embarrassingly small that you are almost guaranteed to get errors, even with the best pre-/post-emphasis filtering.
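
For a sense of scale on the "one error in 33 hours" figure above, here is a back-of-the-envelope sketch in Python. The single 64-bit channel, the peak transfer rate held for the whole run, and the function names are all assumptions made for illustration; none of them come from the post, and memtest does not actually sustain peak bandwidth. It says nothing about what any specification requires.

```python
# Back-of-the-envelope only. Assumes a single 64-bit channel moving data at
# the peak DDR3-2133 rate for the entire run, which memtest will not sustain.
TRANSFERS_PER_SECOND = 2133e6  # DDR3-2133 = 2133 million transfers per second
BITS_PER_TRANSFER = 64         # data width of one non-ECC DIMM channel


def bits_moved(hours):
    """Upper bound on bits transferred in `hours` at the assumed peak rate."""
    return hours * 3600 * TRANSFERS_PER_SECOND * BITS_PER_TRANSFER


def implied_error_ratio(errors, hours):
    """Errors per bit moved; a lower bound, since real traffic is below peak."""
    return errors / bits_moved(hours)


if __name__ == "__main__":
    print(f"bits moved in 33 h (upper bound): {bits_moved(33):.2e}")
    print(f"implied error ratio (lower bound): {implied_error_ratio(1, 33):.1e}")
```

Under those assumptions the run moves on the order of 1.6e16 bits, so a single error works out to roughly 6e-17 errors per bit. Because real test traffic is well below the peak rate, the true ratio would be somewhat higher than this.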
 

Red Squirrel

No Lifer
May 24, 2003
67,308
12,093
126
www.anyf.ca
No OCing, everything is stock. My workbench does not have a UPS, guess it would be a good idea though.

So can I actually treat this ram as good if it took that long? What would happen in a real world situation if it hit that error? I'm tired of dealing with my current system that randomly locks up or gets all sorts of weird random issues. I want to avoid that on the new system.

When I had the two sticks together it only took like an hour or two to get the error though. I'm guessing that dual channel might accelerate this? Can you run without dual channel? I want reliability over performance. I have the worst luck with weird issues in Linux; I just want a rock-solid system that won't lock up, have GUI glitches, etc.
 

exar333

Diamond Member
Feb 7, 2004
8,518
8
91
You should not get ANY errors in memtest. It is pretty much the 'least stressful' stability test you can do on your system. The fact that you got even one error in 33 hours (admittedly a rather long time to memtest, but obviously you are doing that because you had issues) points to a problem.

If you had more issues before removing the one stick, and still saw the issue after, it makes me think it is motherboard or memory-controller related. I would try XMP profiles down to 1333 MHz and work your way up. Try different voltages and so forth. Memory is always the place I start when OCing or even just validating stock speeds in a new build. If you have memory 'gremlins', any other validation you do has to be taken with a grain of salt. With so many memory settings on modern motherboards, it is easy to get something a little 'off' that can cause issues.

Some effort now might save you from a lot of grief down the road. Just my $0.02.
 

Red Squirrel

No Lifer
May 24, 2003
67,308
12,093
126
www.anyf.ca
I left the BIOS at defaults. Should I need to play with anything to get optimal settings? I tend not to mess around in there unless I specifically need to change something. I figure it reduces the number of variables in my troubleshooting process, e.g. I don't have to wonder if a setting I changed is what is causing the issue.

I'm on 17 hours / 37 passes and no errors so far on the other stick. I have another stick to test as well. The downside of that other stick being bad, though, is that it brings me down to 8GB. So it's almost worthwhile to just order another quad set and go up to 16GB. I guess to make cross-referencing compatibility sheets easier, I can start with the retailer, choose a RAM set in a decent price range, then check the manufacturer's website to see if my motherboard is listed. Then just keep doing that for each result on the retailer's site.
 

GlacierFreeze

Golden Member
May 23, 2005
1,125
1
0
Overkill.

Every error I've encountered using Memtest+ showed within seconds. I do 1 pass. No errors? Pop it in and go. No problems here.
 

tortillasoup

Golden Member
Jan 12, 2011
1,977
3
81
Overkill.

Every error I've encountered using Memtest+ showed within seconds. I do 1 pass. No errors? Pop it in and go. No problems here.

It's not overkill if you have a system that has periodic software errors/malfunctions... I had a laptop that experienced such issues, and for a long while I couldn't figure out the cause. I had undervolted the system at the peak and minimum CPU frequencies, and it interpolated the voltages in between. It had passed Prime95 for over 24 hours at both frequencies. I ran the test again with the minimum frequency forced, and after 36 hours an error showed up. I upped the voltage for the minimum frequency and the problems stopped occurring.
 

Red Squirrel

No Lifer
May 24, 2003
67,308
12,093
126
www.anyf.ca
On a similar subject, what other tests can I run on a new machine? (Boot CD type tests; there's no hard drive in this system yet.)

This is a new machine, but since I have so many issues with my current machine (weird graphic glitches, random lockups etc) I want to make sure my new machine is problem free before I migrate to it. Just sick of having to deal with trying to troubleshoot issues.
 

tortillasoup

Golden Member
Jan 12, 2011
1,977
3
81
You can run a boot-time Prime95 or a similar stress test program like those found on UBCD, or you can just run Prime95 while booted into Windows. Try what I did: run Prime95 while the CPU is forced to a lower multiplier and see if it glitches.
 

tortillasoup

Golden Member
Jan 12, 2011
1,977
3
81
Absolute garbage.

It is the exception, not the rule, that RAM BER tests result in zero errors. Things like total clock jitter (T_j), deterministic jitter (D_j), and random jitter (R_j) can all affect the UI for setup/hold times on source-synchronous RAM, never mind the variations in the manufacturing process of the motherboards and RAM and the temperature deltas during normal operation.

I don't know exactly how memtest performs its tests, but one error in 33 hours (if you were running constantly) at DDR3-2133 (again assuming constant transfers through all address banks) sounds more than reasonable and is probably well below the JEDEC requirement.

I can only imagine what happens when DDR4 becomes the norm; the UI for that stuff is so embarrassingly small that you are almost guaranteed to get errors, even with the best pre-/post-emphasis filtering.

So I must have amazing equipment then because I'm pretty certain none of my equipment makes memory errors, let alone after only 33 hours. Maybe I'll do a memtest86 test on my laptop with 16GB of ram using DDR3 1600 memory.
 

Deders

Platinum Member
Oct 14, 2012
2,401
1
91
Overkill.

Every error I've encountered using Memtest+ showed within seconds. I do 1 pass. No errors? Pop it in and go. No problems here.

I've had Memtest+ pass faulty memory through several passes; scanning again with Windows Memory Diagnostic showed up the error on the 2nd pass.
 

Red Squirrel

No Lifer
May 24, 2003
67,308
12,093
126
www.anyf.ca
So I must have amazing equipment then because I'm pretty certain none of my equipment makes memory errors, let alone after only 33 hours. Maybe I'll do a memtest86 test on my laptop with 16GB of ram using DDR3 1600 memory.

Yeah, to me an error (assuming it's legit and not the software itself glitching) is a cause for alarm. It's kinda like a bad sector on a hard drive, but in a way it's worse because it's random. Something critical can hit that spot 100 times and run fine, then suddenly crash with an unexplainable segfault or other issue. Try troubleshooting that... Computers should not generate random results when fed the same instruction.

I'm 24h / 50 passes in on the other stick that I had previously considered good, but I'll let it go for much longer.

It's still quite alarming though; I did not realize it could take that many passes for an error to show up. Makes me wonder if the RAM in my current system (which has all sorts of issues) is actually bad even though it has passed many tests.

Also is there a good way to test video card memory? I suppose that's something worth testing too. I will have to download the latest UBCD and check out the tools on there for the stress testing as well.

I figure while I have both systems more or less running in parallel I may as well "certify" the new one before I start using it. There is nothing worse than trying to troubleshoot a machine that I actually need. Only thing is, I probably should be doing these tests with the video card in the system, but it's in this system. Or does that not matter for testing RAM, CPU and motherboard? I can always do more tests after, too. The current machine will be Windows only, so at least I can still use that one while the new machine is undergoing more tests. I typically just throw a new machine right into production, but I want to change that. It seems to bite me every time.


Oh and has anyone tried this one?

http://www.stresslinux.org/sl/

Sounds interesting, because from what I'm reading it also tells me the temp and other info, which is useful to have while running a stress test.
 

BonzaiDuck

Lifer
Jun 30, 2004
15,708
1,450
126
Absolute garbage.

It is the exception, not the rule, that RAM BER tests result in zero errors. Things like total clock jitter (T_j), deterministic jitter (D_j), and random jitter (R_j) can all affect the UI for setup/hold times on source-synchronous RAM, never mind the variations in the manufacturing process of the motherboards and RAM and the temperature deltas during normal operation.

I don't know exactly how memtest performs its tests, but one error in 33 hours (if you were running constantly) at DDR3-2133 (again assuming constant transfers through all address banks) sounds more than reasonable and is probably well below the JEDEC requirement.

I can only imagine what happens when DDR4 becomes the norm; the UI for that stuff is so embarrassingly small that you are almost guaranteed to get errors, even with the best pre-/post-emphasis filtering.

We mustn't forget the random phenomenon that used to be a fascination for builders of cloud chambers: cosmic rays and alpha particles.

That's the reason for using ECC RAM in mission-critical operations. It's very helpful to study briefly the way ECC works; it is quite clever. You don't need to remember the details, but it clarifies a lot. ECC RAM is therefore likely to be a tad slower than non-ECC RAM, or so I was led to believe the last time I read anything about it.
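
To give a feel for the idea behind ECC mentioned above, here is a toy sketch in Python of the classic Hamming(7,4) code, which stores 4 data bits in 7 bits and can locate and fix any single flipped bit. This is an illustration only, not the scheme used on real modules: ECC DIMMs typically protect 64 data bits with 8 extra check bits using a wider code of the same family. The function names are made up for this example.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]  # positions 1..7


def hamming74_correct(codeword):
    """Return (data_bits, error_position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    stored[4] ^= 1  # simulate a single bit flip in memory
    recovered, position = hamming74_correct(stored)
    print(recovered == word, "flipped position:", position)
```

The three parity checks overlap so that each codeword position is covered by a unique combination of them. Recomputing the checks on read therefore gives the position of a single flipped bit directly, which is why one upset bit can be repaired without knowing in advance where it is.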
 

Red Squirrel

No Lifer
May 24, 2003
67,308
12,093
126
www.anyf.ca
We mustn't forget the random phenomenon that used to be a fascination for builders of cloud chambers: cosmic rays and alpha particles.

That's the reason for using ECC RAM in mission-critical operations. It's very helpful to study briefly the way ECC works; it is quite clever. You don't need to remember the details, but it clarifies a lot. ECC RAM is therefore likely to be a tad slower than non-ECC RAM, or so I was led to believe the last time I read anything about it.

I've actually thought of going ECC ram for my workstations. Server motherboards and CPUs tend to be way more expensive though.


This particular stick has been going for over 50 hours with no errors though. So I think it's safe to say the other stick IS bad and this one is OK. I have another spare stick; I'll probably test it too. Since I'm not in a hurry, I'll let this test go for another day. :p
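
As a rough way to think about how much a 50-hour clean run says, here is a small Python sketch. It assumes, purely for illustration, that a marginal stick throws errors at random (a Poisson process) averaging one error per 33 hours, which is a guess built on the single error seen earlier in the thread and not a measured rate. The function name is hypothetical.

```python
import math


def prob_clean_run(hours, mean_hours_between_errors):
    """Chance of seeing zero errors in `hours`, assuming Poisson arrivals."""
    return math.exp(-hours / mean_hours_between_errors)


if __name__ == "__main__":
    # 33 h between errors is an assumed rate taken from one observation.
    for hours in (1, 24, 33, 50, 100):
        chance = prob_clean_run(hours, 33)
        print(f"{hours:>3} h clean run: {chance:.0%} chance even if the stick is marginal")
```

Under that assumption, a stick as flaky as the bad one would still survive 50 hours without an error about 22% of the time, so a longer run, or a repeat run, narrows things down further.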

I still need to do some other tests using boot CDs like prime95, for the rest of the system.
 

BonzaiDuck

Lifer
Jun 30, 2004
15,708
1,450
126
I've actually thought of going ECC ram for my workstations. Server motherboards and CPUs tend to be way more expensive though.


This particular stick has been going for over 50 hours with no errors though. So I think it's safe to say the other stick IS bad and this one is OK. I have another spare stick; I'll probably test it too. Since I'm not in a hurry, I'll let this test go for another day. :p

I still need to do some other tests using boot CDs like prime95, for the rest of the system.

Before you proceed too quickly on this, you might want to try something just to be sure. Maybe you've tried re-seating the suspect RAM module in its slot; maybe you haven't.

However . . . Just about any electrical/electronics "jobber" warehouse or store should be able to sell you a bottle of "Super Contact Cleaner," and RadioShack used to sell it in a spray can with a brush nozzle. Brush the stuff on the gold RAM-module contacts and the slot itself; then blow it off with "canned air."

Then you can tell yourself that you have a bad RAM module if the same test results show it -- beyond any doubt.
 

Trainwrexk

Junior Member
Sep 29, 2021
2
2
36
Absolute garbage.

It is the exception, not the rule, that RAM BER tests result in zero errors. Things like total clock jitter (T_j), deterministic jitter (D_j), and random jitter (R_j) can all affect the UI for setup/hold times on source-synchronous RAM, never mind the variations in the manufacturing process of the motherboards and RAM and the temperature deltas during normal operation.

I don't know exactly how memtest performs its tests, but one error in 33 hours (if you were running constantly) at DDR3-2133 (again assuming constant transfers through all address banks) sounds more than reasonable and is probably well below the JEDEC requirement.

I can only imagine what happens when DDR4 becomes the norm; the UI for that stuff is so embarrassingly small that you are almost guaranteed to get errors, even with the best pre-/post-emphasis filtering.
Let me just say this... I have two desktops, one of which I use for various testing purposes. I have 32GB of RAM (4x8GB) and I ran memtest86 for one week non-stop with zero errors. If you have a bad memory stick, you have a bad memory stick. There is no confusion about this. Denial will only make it worse.
 

UsandThem

Elite Member
May 4, 2000
16,068
7,380
146
Let me just say this... I have two desktops, one of which I use for various testing purposes. I have 32GB of RAM (4x8GB) and I ran memtest86 for one week non-stop with zero errors. If you have a bad memory stick, you have a bad memory stick. There is no confusion about this. Denial will only make it worse.
Hopefully they figured out their memory issues when they posted the comment you quoted (6+ years ago). ;)
 