Ultra stability test

makizeka · Jun 18, 2011

Hello, I've been into overclocking for almost 2 years. I'm majoring in computer engineering in Italy, and my thesis is in fact about the evolutionary development of a tool to test the stability of a computer.

I am developing a new test tool for the Intel Core i7 950. The main advantage over other tests is that it is much faster (5 minutes at most).

Unlike a traditional stress test, this one is intended to exercise the processor's speedpath, meaning the slowest path through the logic circuits of the CPU, which is what limits the maximum frequency. Because of that it does not need to run for long, but it is closely tied to the processor on which it was developed.

Unlike other programs such as LINX, IBT, OCCT ..., my program does not generate much heat, because it causes few cache misses (RAM -> cache): its memory footprint is a few MB instead of the GB used by IBT & co. As a result the RAM is not tested much, and the CPU temperature does not rise much. In addition, it detects system instability sooner than other programs. Precisely for this reason, I would like more feedback from other users, whether they have the same platform or a different one.

For example, when I tried it on my own system, some instabilities were detected only by my program, while other test programs reported the system as rock solid or, in other cases, reported the instability only after a longer run.

At the end of the test there is a result: if the CPU is rock solid, the result is "Pass"; if the system is unstable, the result is "Fail!".
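
To illustrate the general idea of a Pass/Fail verdict, here is a minimal sketch in Python. It is not how the tool itself works (its internals are not described in this thread). It only assumes a common approach: run a deterministic workload several times and compare each result against a reference recorded at known-good settings. The names `workload` and `verdict`, the seed, and the iteration counts are all hypothetical.

```python
import hashlib
import struct


def workload(seed: int, iterations: int = 20000) -> str:
    """Deterministic floating-point/integer mix; returns a digest of all results."""
    digest = hashlib.sha256()
    x = float(seed)
    n = seed
    for i in range(1, iterations + 1):
        # Floating-point multiply/add, kept bounded with a modulo.
        x = (x * 1.000001 + i) % 1000003.0
        # 64-bit integer LCG step (constants chosen only for illustration).
        n = (n * 6364136223846793005 + 1442695040888963407) % 2**64
        digest.update(struct.pack("<dQ", x, n))
    return digest.hexdigest()


def verdict(reference: str, runs: int = 5, seed: int = 42) -> str:
    """Return "Pass" if every run reproduces the reference digest, else "Fail!"."""
    for _ in range(runs):
        if workload(seed) != reference:
            return "Fail!"
    return "Pass"


if __name__ == "__main__":
    # Assumption: the reference would be recorded at stock (known-good) settings.
    reference = workload(42)
    print(verdict(reference))
```

On a stable machine every run reproduces the same digest. A single silent computation error anywhere in the loop changes the digest and flips the verdict to "Fail!".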

I'm posting a screenshot of how the tool should work, for your feedback.

Download:
http://www.cad.polito.it/research/Evolutionary_Computation/files/M&G_v2a.zip

On this page you will find the next versions of the tool:
http://www.cad.polito.it/research/Evolutionary_Computation/Overclocking.html

Thank you for your cooperation. I hope to experiment together with you.

Concillian · Jun 18, 2011

There are very few cases where a maximum stable 24/7 type overclock is not limited by the cooling subsystem. I always hit my limit on temperature margin to Tj before I reach the maximum voltage I'm comfortable with. As a result, I think stressing the cooling subsystem is a big part of stability testing.

Color me skeptical of any stability test that doesn't also create significant heat.
How do you know that the speedpath you test will remain stable when the system is producing more heat than when you test and the die temperature increases?

Perhaps more importantly, how do you ensure that the system is definitely not capable of creating too much heat for the cooling subsystem to maintain a safe distance to Tj if you are not stressing at maximum temperature and monitoring temperatures?

CTho9305 · Jun 18, 2011

I have a few questions/comments:
1) How do you identify the critical paths (speedpaths)? As I understand it, the manufacturer themselves can only make educated guesses based on their models, and the guesses are almost always wrong. Actual identification of speedpath often requires the use of specialized tools that slowly raster lasers across the die while looping a test pattern and finding which parts of the chip change the pass/fail rate of the test, then combining those results with highly-detailed knowledge of the design to identify what logic lies under those spots, and finding logic paths that also hit those spots. As far as I know, speedpath details are also never externally disclosed.
2) Why aren't you adding additional operations to increase power consumption? For what it's worth, on an x86 processor the highest-power patterns are probably not going to be particularly memory-intensive, and will probably mostly fit in the cache, to keep high-power execution units as busy as possible (e.g. floating point multipliers, which are notoriously power-hungry). To properly test worst-case conditions, you need a high-power pattern for two reasons:
A) Heat. Hotter transistors slow down, so you're not finding the worst-case limits unless the transistors are also as hot as they'll ever get.
B) Resistance in the on-chip / on-die power distribution. V=IR; even though the grid (hopefully) has a very low R, as I increases, you get more drop between the externally-supplied voltage and the voltage actually seen at the devices. The "IR" drop in the power grid is enough to meaningfully further slow down the transistors. This is a pretty big deal.
3) Are you considering cross-core effects? I've heard about some research IBM did that modeled the interaction between two cores on a multi-core processor, in which a sudden power-draw increase on one of the cores resulted in a delayed droop in another core's power grid. To hit the absolute worst-case, you need to make sure that the core you're testing is experiencing the worst environment it'll see in normal operation, which includes cases where other cores are doing work too.
4) Are you considering inductance in the power distribution? To get the absolute worst-case voltage at the transistors, you may actually need your test to alternate quickly between high-power and low-power conditions to stimulate transients in the power grid (probably undershoot in the low-power to high-power case, but possibly also overshoot in the high-power to low-power case, because that could cause the chip's clock distribution network to suddenly shorten a clock cycle or two).
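
To put rough numbers on point 2B, here is a small Python sketch of the V=IR drop. All values (1.25 V supply, 1 milliohm effective grid resistance, 30 A and 100 A loads) are made-up illustrative assumptions, not measurements of any real chip, and the function name is hypothetical.

```python
def voltage_at_transistors(v_supply: float, current_a: float, grid_resistance_ohm: float) -> float:
    """Voltage seen at the devices after the IR drop in the power grid."""
    return v_supply - current_a * grid_resistance_ohm


if __name__ == "__main__":
    V_SUPPLY = 1.25   # volts, externally supplied (assumed)
    R_GRID = 0.001    # ohms, effective power-grid resistance (assumed)

    for label, current in (("low-power test", 30.0), ("high-power test", 100.0)):
        v = voltage_at_transistors(V_SUPPLY, current, R_GRID)
        print(f"{label}: {current:.0f} A -> {v:.3f} V at the transistors")
```

With these assumed numbers, the low-power pattern leaves about 1.22 V at the transistors while the high-power pattern leaves about 1.15 V, so a test that draws little current validates the speedpath at a higher effective voltage than the chip will see under full load.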

edit: Your research looks interesting. Is full text available anywhere, particularly for http://www.cad.polito.it/pap/exact/ets07b.html?

reb0rn · Jun 18, 2011

I will test it a bit... but I have Sandy Bridge, and my best tool for SB OC is the Prime blend test (but it needs at least a 24h run) with all memory allocated. Dunno if your tool is suited for the i5 2500K.

I lowered the voltage for my OC by 0.02 V, which crashes Prime blend in 4-5h, but your tool still passed 3 times, each run ~100 sec... maybe your tool is not suited for Sandy Bridge at all.
