
RAID 1 for your CPU and memory?

darfur

This was an idea that just raced through my mind.

In a multi-CPU or multi-core system, would it be possible to use only CPU 0 (and its memory) and have CPU 1 (and its memory) act as a redundant processor in case CPU 0 encounters an error (an OS crash, for example)?

I imagine something like this would be extremely useful for a mission-critical type of app, or for error control (perform the same calculations on all CPUs, and if inconsistencies appear then maybe use the most common result, or recalculate altogether?)

So, would this be possible to do? And if so, why hasn't anything like this been done before? (or at least, nothing that I've heard of)
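
As a rough illustration of the error-control idea above (same calculation everywhere, take the most common result, otherwise recalculate), here is a minimal Python sketch. The names `redundant_compute`, `replicas`, and `max_retries` are made up for this example, and plain function calls stand in for separate CPUs; it is not how any real system does it.

```python
from collections import Counter


def redundant_compute(replicas, x, max_retries=2):
    """Run the same calculation on every replica and return the majority result.

    `replicas` is a list of callables standing in for separate CPUs
    (an assumption for illustration). Results must be hashable.
    """
    for _ in range(max_retries + 1):
        results = [replica(x) for replica in replicas]
        value, votes = Counter(results).most_common(1)[0]
        # Accept the most common result only if a strict majority agrees.
        if votes > len(replicas) // 2:
            return value
        # Otherwise fall through and recalculate on all replicas.
    raise RuntimeError("replicas never reached a majority")
```

With three replicas where one returns a wrong answer, the two matching results win the vote; with only two replicas that disagree, there is no majority and the call gives up after the retries.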
 
eh...I knew I wasn't that innovative.

Any chance this might eventually trickle down to consumer grade equipment? Especially with dual core chips on the horizon?
 
I am not totally sure on the specifics, but the space shuttle has many redundant computers, several doing the same calculations. If one starts giving out bad data, it is automatically taken offline and a hot standby kicks in instantly.

I don't see the economic feasibility in consumer grade equipment though. Why cut your computing power in half just so you don't have to reboot so often? Is it worth it?
 
With a dual-core P4 with Hyper-Threading, you would have 4 logical processors. Most people would not need that, so why not put it to use?

It sure as hell would give Intel a HUGE selling point.

(I say Intel because I have a P4 and the Hyper-Threading is nice, but I wouldn't give up a thread to redundancy checks, and the AMD would only have 2 processors.)
 
The Intel chip doesn't have 4 processors per se. It has 2 physical ones plus 2 virtual ones. Also, that really only helps when multitasking.

One thing to consider also is that when running in multi-processor configs, the Intel processors take a large performance hit because they still share the memory bandwidth, whereas AMD has separate HT links for each processor, IIRC.

-Kevin
 
Also, from the front page article, the new P4 will not have HT on the dual core chips. Not sure why not...
 
I know that in some "mission critical" equipment (guidance systems for rockets etc) they sometimes use 3 separate computers; two have identical hardware/software but the third is different.
Whenever the computers need to make a "critical" decision they "vote". If they all agree, then everything is fine; if one of the computers "disagrees", the other two will double-check, and if they decide everything is OK they can turn off the third (presumably faulty) computer and continue the mission.
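
To make the voting scheme concrete, here is a small Python sketch under some loud assumptions: the class name `VotingCluster`, its `decide` method, and the idea of modelling each computer as a function are all invented for illustration, and "double-check" is modelled simply as the agreeing computers recomputing the answer.

```python
class VotingCluster:
    """Toy model of redundant computers that vote on critical decisions."""

    def __init__(self, computers):
        # name -> callable; each callable stands in for one computer (assumption).
        self.online = dict(computers)

    def decide(self, inputs):
        answers = {name: fn(inputs) for name, fn in self.online.items()}
        values = list(answers.values())
        majority = max(set(values), key=values.count)
        if values.count(majority) <= len(values) // 2:
            raise RuntimeError("no majority among online computers")

        dissenters = [name for name, a in answers.items() if a != majority]
        if dissenters:
            # The agreeing computers double-check by recomputing.
            agreeing = [name for name in answers if name not in dissenters]
            recheck = {self.online[name](inputs) for name in agreeing}
            if recheck != {majority}:
                raise RuntimeError("double-check failed")
            # Turn off the presumably faulty computer(s) and carry on.
            for name in dissenters:
                del self.online[name]
        return majority
```

After one computer is voted out, the remaining two can still detect a disagreement between themselves, but they can no longer tell which of them is wrong.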
 
Originally posted by: Gamingphreek
The Intel chip doesn't have 4 processors per se. It has 2 physical ones plus 2 virtual ones. Also, that really only helps when multitasking.

One thing to consider also is that when running in multi-processor configs, the Intel processors take a large performance hit because they still share the memory bandwidth, whereas AMD has separate HT links for each processor, IIRC.

-Kevin

Dual-Core Opterons will have only one memory controller. Memory bus is shared between both cores.

Also, why does everyone seem to think Athlons connect to memory via HT?
 
Originally posted by: Sahakiel
Dual-Core Opterons will have only one memory controller. Memory bus is shared between both cores.

Also, why does everyone seem to think Athlons connect to memory via HT?

What's an HT link? To me, HT = Hyper-Threading = marketer's term for multi-threading = running multiple threads while sharing resources.

What's a HT link? A marketer's term for serial links?
 
It's HyperTransport. It is how the CPU communicates with all the other peripherals onboard. The K8 CPUs have 3 of these links, just waiting to be used (1 is already in use, soon to be 2).

-Kevin
 
Originally posted by: darfur

So, would this be possible to do? And if so, why hasn't anything like this been done before? (or at least, nothing that I've heard of)

Yep, it's often called lockstep. IBM's G4-G6 processors (no relation to IBM's PowerPC G5 or Motorola's G4) in their zSeries mainframes have dual pipelines that execute the same instructions. At the end of each checkpoint period, the state of the two pipelines is compared, and if they mismatch, the state is reverted to the previous checkpoint. Continued failures cause the processor to halt and the OS to take over recovery.

HP's NonStop servers (formerly of Compaq, and before that Tandem), running MIPS and soon Itanium, feature socket-level lockstep support: the CPU pins are compared at each bus cycle.

And yes, core-level lockstep is coming to multi-core CPUs, sooner rather than later. 🙂

RAS design for the IBM eServer z900
HP NonStop Servers
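
For illustration only, the checkpoint-and-compare behaviour described above can be sketched in Python. Everything here is an assumption made for the example (the names `run_lockstep`, `LockstepHalt`, `checkpoint_every`, and `max_retries`, and modelling a pipeline as a function from state and instruction to new state); real hardware does this in silicon, not software.

```python
import copy


class LockstepHalt(Exception):
    """Raised when the two pipelines keep disagreeing (hypothetical name)."""


def run_lockstep(pipeline_a, pipeline_b, state, instructions,
                 checkpoint_every=4, max_retries=3):
    """Run two pipelines on the same instructions, comparing at checkpoints."""
    checkpoint = (0, copy.deepcopy(state))      # (instruction index, saved state)
    state_a = copy.deepcopy(state)
    state_b = copy.deepcopy(state)
    failures = 0
    i = 0
    while i < len(instructions):
        state_a = pipeline_a(state_a, instructions[i])
        state_b = pipeline_b(state_b, instructions[i])
        i += 1
        if i % checkpoint_every == 0 or i == len(instructions):
            if state_a == state_b:
                # States match: commit a new checkpoint.
                checkpoint = (i, copy.deepcopy(state_a))
                failures = 0
            else:
                # Mismatch: revert both pipelines to the previous checkpoint.
                failures += 1
                if failures > max_retries:
                    raise LockstepHalt("continued mismatches; halting")
                i, saved = checkpoint
                state_a = copy.deepcopy(saved)
                state_b = copy.deepcopy(saved)
    return state_a
```

A transient fault in one pipeline costs a replay from the last checkpoint; a permanent fault exhausts the retries and raises `LockstepHalt`, which is where an OS-level recovery would take over in the scheme described.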
 
Originally posted by: darfur
This was an idea that just raced through my mind.

In a multi-CPU or multi-core system, would it be possible to use only CPU 0 (and its memory) and have CPU 1 (and its memory) act as a redundant processor in case CPU 0 encounters an error (an OS crash, for example)?

I imagine something like this would be extremely useful for a mission-critical type of app, or for error control (perform the same calculations on all CPUs, and if inconsistencies appear then maybe use the most common result, or recalculate altogether?)

So, would this be possible to do? And if so, why hasn't anything like this been done before? (or at least, nothing that I've heard of)

ECC RAM works like RAID 5 already, no? It can detect and fix the very rare soft error. And arguably it is already in consumer products (the 875 and 925 chipsets support ECC).
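
For anyone curious how "detect and fix" can work, here is a toy Python sketch using a Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. This is a stand-in chosen for size: ECC memory typically protects 64-bit words with 8 check bits, and the function names `hamming74_encode` and `hamming74_correct` are made up for this example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data_bits, error_position); error_position is 0 if none found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position of the bad bit.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position
```

Flipping any one bit of an encoded word and passing it to `hamming74_correct` gives back the original 4 data bits along with the position that was repaired.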
 