XP & raid1 boot problems

Mat99

Member
Sep 10, 2008
27
0
0
Hi folks,

I have a problem that I've been trying to fix for 2 days now, but no luck.

Here's the info:
I have a system with 4 drives, 2 in raid1 and 2 in raid0. I have the OS installed on raid1. Two days ago, after restarting the PC, chkdsk started during boot (just before getting to the desktop). It ran for a while and did find and fix some errors. Then it restarted the PC, and during the startup procedure I noticed a "verify" status displayed next to my raid1 array (called Primary). I knew what would happen next, and it did: XP wouldn't load and I got the BSOD. This has happened before, and I imagine it happens because the raid array needs to be verified, but this can't be done through DOS. So I turned everything off and added a fifth (IDE) disk to the system, on which I had a clean XP installation that I did a year ago, when the first crash like this happened. I got into XP (on this 5th drive) with no problems, and Intel Matrix Storage Console saw the array problem straight away and started the "verify" process. All good, same as usual. It ran for a while and found no problems, so I did what I usually do next: shut down the PC, unplugged the IDE drive and started the PC again. Usually it's all good, I get into XP on my raid array with no problems and all is fixed. But not this time! It gets through the startup procedure, I see the splash screen, then it crashes and gives me a BSOD. This is the first time this has happened, so I'm not sure what's up.
The code says: 0x0000007B (0xF78D2524, 0xC0000034, 0x00000000, 0x00000000)
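
For reference, a stop line like the one above can be pulled apart mechanically. What follows is a minimal Python sketch, not a diagnostic tool. The two lookup tables hold only the values seen in this thread (0x7B is INACCESSIBLE_BOOT_DEVICE, 0xC0000034 is STATUS_OBJECT_NAME_NOT_FOUND), and reading the second parameter as an NTSTATUS code is an assumption that does not hold for every stop code. The function names are made up for the example.

```python
import re

# Only the values that appear in this thread; extend as needed.
BUGCHECK_NAMES = {0x0000007B: "INACCESSIBLE_BOOT_DEVICE"}
NTSTATUS_NAMES = {0xC0000034: "STATUS_OBJECT_NAME_NOT_FOUND"}


def parse_stop_line(text):
    """Return (stop_code, [four parameters]) from a BSOD stop line."""
    values = [int(v, 16) for v in re.findall(r"0x[0-9A-Fa-f]+", text)]
    if len(values) != 5:
        raise ValueError("expected a stop code followed by four parameters")
    return values[0], values[1:]


def describe_stop_line(text):
    """Name the stop code and, as an assumption, read parameter 2 as NTSTATUS."""
    code, params = parse_stop_line(text)
    return {
        "stop_code": f"0x{code:08X}",
        "name": BUGCHECK_NAMES.get(code, "unknown"),
        "param2_as_ntstatus": NTSTATUS_NAMES.get(params[1], "unknown"),
    }


if __name__ == "__main__":
    line = "0x0000007B (0xF78D2524, 0xC0000034, 0x00000000, 0x00000000)"
    print(describe_stop_line(line))
```

If the assumption holds here, the pairing would suggest that something the boot path expects by name could not be found, rather than a dead disk, but that reading is only a hint.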

I added the 5th drive again, ran antivirus on all drives, nothing was found.
I tried booting from the XP CD, but after all the setup loading, just before the options page, it crashes again and gives the same BSOD.
I tried booting in Safe Mode, same thing.
I tried booting with Last Known Good Configuration, same thing.
Through XP on the 5th drive, I ran chkdsk on all other drives, nothing was found.

If this was a new PC and I was installing stuff, I would imagine something went wrong. But I've had this setup for 2 years now and it all worked great. So I guess the settings in the BIOS are all correct, if it worked till now, right? Same for the connections and stuff, right? It all worked great until that chkdsk 2 days ago, which found and "fixed" some problem on my Primary raid1 drive.

Could it be something with the boot sector or mbr?
But everywhere I look it says "start the Recovery Console and rewrite the boot sector".. I can't, I get the BSOD before I get to the options page to select the Recovery Console.

With my 5th drive I have access to all 4 of my drives in the 2 raid arrays. Can something be done through this 5th drive? Can I fix the boot sector and/or MBR on my raid1 array through it?

Please help, I'm totally lost..

TIA,
Mat

 

Mat99

Member
Sep 10, 2008
27
0
0
Hi Kub,

thx for reply!

I'm planning to do that just now, but I have a feeling, it won't help.
I'll just break the array and I'll just have to verify it again.


But I did manage to get into the Recovery Console (I had to use the raid driver diskette during XP boot from CD), and ran the "fixboot" command. It wrote the new boot sector, I restarted the PC, but same thing: I see the splash screen, then it goes black, and just as I was supposed to see the desktop, it restarts.
I'm looking at the fixmbr option now, but I'm afraid to use it. It gives me the usual warning "you have a non-standard or corrupted mbr,......." which is probably due to the raid array. I'm afraid if I use it, it'll delete all my partitions :S
I have full access to all drives now (through the OS on my 5th IDE drive), I've checked every drive with chkdsk, antivirus, etc... no errors, no problems.. just can't freaking boot :S
 

Mat99

Member
Sep 10, 2008
27
0
0
Ok, I did try booting with only one raid drive connected, but same thing. It reboots during the splash screen. I did get the "degraded" status for the raid array. The drives should be identical, so it has to be something on them (an error or something, I mean; the data is there, I know that), right?
Man, I just don't know what else to do... :(
 

TheKub

Golden Member
Oct 2, 2001
1,756
1
0
I would tell the raid device to not do raid (vs just running a raid1 array with 1 drive) or move the drive to a non-raid controller. Obviously you have to change the boot order in the BIOS. Then if the error still occurs try the fixboot\fixmbr on that single non-raided drive.
 

Mat99

Member
Sep 10, 2008
27
0
0
Yeah, but if I do that, I won't be able to use both drives in raid1 again.......
 

TheKub

Golden Member
Oct 2, 2001
1,756
1
0
I would say do it just for testing. Besides you will have the other drive that you can use to rebuild the raid after the fact.

(Personally, RAID is overrated)

You can also download the diagnostic software from your HDD manufacturer and run the extended tests on them just to make sure that one is not flaky.
 

corkyg

Elite Member | Peripherals
Super Moderator
Mar 4, 2000
27,370
239
106
I just had recent experience with a boot drive always wanting to run CHKDSK - there were lots of corrupted files. I took that as a warning that the drive was having problems, and failure was pending.

I cloned the drive to a different one, and all was perfect. No CHKDSK - running now for 7 days straight.

Could be that only one of your RAID1 drives was failing - but the mirroring put corrupted files on both.

You need to run a hardware diagnosis on each to determine the failing drive. Replace it and rebuild your array.

Agree that RAID is overrated. :)
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
I know Stop 7Bs :)

You wouldn't see the message if the kernel wasn't loading. You're past mbr and ntldr at that point.

If you are getting an immediate 7B before the splash screen then you are in one of two spots:
1. Reading the drivers to load from the registry
2. Loading the mass storage driver (or filter) from disk.

Given a bad shutdown/hardware raid hiccup this also makes sense.

Drop down to recovery console again and navigate to windows\system32\config
Take a look at the size of that file (edit: the 'system' file in there). See if it looks 'normal'. It will be somewhere from 2.5 to, say, 6 MB in size. Anything smaller and it's taken some unrecoverable damage. Use the steps for correcting a missing or corrupt system32\config from here:

307545 How to recover from a corrupted registry that prevents Windows XP from starting
http://support.microsoft.com/d...x?scid=kb;EN-US;307545
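
If the drive is reachable from a parallel install, the size check can also be scripted. This is a minimal Python sketch under stated assumptions: the 2.5 MB floor is just the rule of thumb quoted above, not an official limit, and the function names are invented for the example.

```python
import os

MB = 1024 * 1024
# Rule-of-thumb floor taken from the post above; not an official limit.
MIN_PLAUSIBLE_BYTES = int(2.5 * MB)


def hive_looks_truncated(size_bytes, minimum=MIN_PLAUSIBLE_BYTES):
    """True when the hive is smaller than the rule-of-thumb minimum."""
    return size_bytes < minimum


def check_hive(path):
    """Return (size_in_bytes, looks_truncated) for the hive file at path."""
    size = os.path.getsize(path)
    return size, hive_looks_truncated(size)
```

With the damaged install mounted as, say, D: (a hypothetical drive letter), `check_hive(r"D:\WINDOWS\system32\config\system")` would report the size and whether it falls below the floor. A size above the floor does not prove the hive is healthy; it only rules out obvious truncation.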


Also take a look at the actual driver file for your raid controller. I can't tell you the filename...go download a new one to see what it is. Ensure this file is intact (in system32\drivers) and is the proper file size.

If it's a filter driver that's failing to load, the troubleshooting will be a bit tougher. Filter drivers can be specified in a number of places. This is the high level overview of what you'll need to do:
1. Along the lines of that kb307545 above grab a copy of the windows\repair\system file and drop it in place of your system32\config\system one (do NOT overwrite, rename original first). At the minimum this will be a system hive from when gui mode setup first completed when you built the box. You should be able to boot but you should expect some nasty frankenstein behavior like device redetection all over the ying yang. no worries. Just get booted long enough to do the next steps.
2. Open regedit and do a "load hive" (see helpfiles if needed) to load the original system hive. Search the system hive for "upperfilter" and "lowerfilter". You'll find many of these and they're normal. You are interested in the ones found for your hard disks.. for instance diskperf.
3. When you find the filter drivers do the same thing you did with your main driver...verify they exist and are not damaged.
4. Optionally: make another backup of the system hive, then edit this one. Simply delete any filter driver entries...they are needed for added functionality but not to boot. If you are in doubt about whether a filter driver belongs just compare to a working box.
(edit #2..I'm an airhead... Once you are done editing, drop the hive back into system32\config then reboot. You don't want to run indefinitely with that windows\repair version)
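
The search in step 2 can also be done offline on an exported copy of the loaded hive. The sketch below is a rough Python illustration, assuming the hive was exported from regedit as a version 5 .reg file, where REG_MULTI_SZ values appear as `hex(7):` lists of UTF-16LE bytes and long values continue on the next line after a trailing backslash. The function names are hypothetical, and it only reads text; it changes nothing in the registry.

```python
import re

FILTER_VALUE = re.compile(
    r'^"(UpperFilters|LowerFilters)"=hex\(7\):(.*)$', re.IGNORECASE
)


def decode_multi_sz(hex_csv):
    """Decode a comma-separated hex(7) byte list into a list of strings."""
    data = bytes(int(b, 16) for b in hex_csv.split(",") if b.strip())
    text = data.decode("utf-16-le")
    return [s for s in text.split("\x00") if s]


def find_filter_drivers(reg_text):
    """Return (key, value_name, [driver names]) for every filter entry."""
    # Join continuation lines: a trailing backslash, newline, then indentation.
    joined = re.sub(r"\\\r?\n\s*", "", reg_text)
    results = []
    key = None
    for line in joined.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            key = line[1:-1]
            continue
        match = FILTER_VALUE.match(line)
        if match:
            results.append((key, match.group(1), decode_multi_sz(match.group(2))))
    return results
```

A version 5 export is normally saved as UTF-16, so the file would be read with something like `open(path, encoding="utf-16").read()` before being passed in. Each driver name returned can then be checked against the files in system32\drivers, as in step 3.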


I hope this helps. It's a hell of a lot easier to do than explain. My bet would be the registry got truncated during the chkdsk (which isn't a bad thing...it was fvcked anyway).




 

Mat99

Member
Sep 10, 2008
27
0
0
Thx guys!

I actually never really had any problems till now. The only time chkdsk started was when irregular shutdown was done. This happened like 5 times since I have this setup (almost 3 years now).
I'm also getting more and more frustrated with raid. The only reason for it was to keep the data safe (I lost too much stuff in the past). So I have a drive in 3 partitions: one for OS, one for data and one for photos. I was hoping to get rid of just such problems, but now I see it just duplicates the errors as well. I have a ton of small programs, plugins, tools and scripts installed, so an OS reinstall is always a killer for me. I guess I'll have to find an alternative, something that would mirror the disk with a one-day delay or something. So if the "fresh" drive doesn't work, I'll be able to use the "day old" copy. Something like that, need to do more research on the options.

Anyhow, on to my problem. Atm I'm rebuilding the array, so I can't do anything for the next 3 hours. I'll try what Smilin is suggesting. I actually do see the splash screen, the bar animates for about 15 seconds, then it goes black, and when I'm supposed to see the desktop, I get the BSOD. If I remember correctly, I have a copy of the registry in my backup folder on my drive, which is about 1-2 months old (have to wait for the rebuild to finish to check). Do you think I should try replacing the registry with that backup copy? As far as I can remember, no hardware was added in the past months, perhaps one or two programs, but the important ones have been installed for years.
I'll do the other suggested steps and report back. Will also diagnose both drives to see if something is wrong.

Thx for the help!
Mat
 

Mat99

Member
Sep 10, 2008
27
0
0
Ok, I've checked some things:

- the system file in windows/system32/config is 6,144 KB in size
- from the Device Manager I can see that the driver for the raid controller is named iaStor.sys
On my 5th drive with working XP, the file is present in windows/system32/drivers/ and its size is 302 KB (in properties I see: version 7.8.0.1012)
On my raid OS drive, the file is also present in its windows/system32/drivers/ folder, its size is 298 KB (in properties I see: version 7.5.0.1017)
I guess all this is ok..

Will try the next suggested thing.. will report back.

Thx, Mat


Update: I do have the registry backup from the os raid drive, 2 months old. Could this be of any help?

Update#2: I can't use the suggested registry fix, since I have OEM XP installed. It says at the start of the document that I shouldn't use that procedure, since I won't be able to log into the Recovery Console later to replace the original registry hive files. I do, however, have access to the whole drive through my 5th drive, if that would work? If it's as simple as copy & paste..
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
The whole recovery console thing is a pretty big hassle. If you have access to the drive via a parallel install of some sort, definitely go that route.


Regarding the symptoms update you mentioned. If you are getting as far as seeing a splash screen with the bar moving across then you've gotten past loading drivers so those should be intact. At that point you are initializing system (rather than boot) drivers so filter drivers become more likely to be an issue.


What do you have available System Restore Point wise? Take a look in the system volume information folder for the RP* folders and see what the latest is you have. If you have a pretty recent one, just grab all 4 registry files from there (system, software, sam, security) and drop them in place (preserving originals!).
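
From a parallel install, finding the restore points and pairing their registry snapshots with the hive names can be sketched in Python. Treat this as a hedged illustration: the snapshot file names (`_REGISTRY_MACHINE_SYSTEM` and so on, inside each RP folder's `snapshot` subfolder) follow the XP layout described in KB307545 and should be confirmed on the actual drive, the function names are made up, and the code only builds a copy plan. It copies and overwrites nothing.

```python
import os
import re

# Snapshot file name -> hive name under windows\system32\config.
# Assumed from the XP layout described in KB307545; confirm on disk.
SNAPSHOT_TO_HIVE = {
    "_REGISTRY_MACHINE_SYSTEM": "system",
    "_REGISTRY_MACHINE_SOFTWARE": "software",
    "_REGISTRY_MACHINE_SAM": "sam",
    "_REGISTRY_MACHINE_SECURITY": "security",
}


def list_restore_points(restore_dir):
    """Return RP folder names sorted by their number, oldest first."""
    found = []
    for name in os.listdir(restore_dir):
        match = re.fullmatch(r"RP(\d+)", name, re.IGNORECASE)
        if match and os.path.isdir(os.path.join(restore_dir, name)):
            found.append((int(match.group(1)), name))
    return [name for _, name in sorted(found)]


def plan_hive_copy(restore_dir, rp_name, config_dir):
    """Return (source, destination) pairs for the four registry files."""
    snapshot = os.path.join(restore_dir, rp_name, "snapshot")
    plan = []
    for src_name, hive in SNAPSHOT_TO_HIVE.items():
        src = os.path.join(snapshot, src_name)
        if not os.path.isfile(src):
            raise FileNotFoundError(src)
        plan.append((src, os.path.join(config_dir, hive)))
    return plan
```

Sorting by the number rather than by name matters once the counter passes a power of ten (RP99 versus RP100). The originals in the config folder should be renamed before any file from the plan is copied into place.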

Going a full two months old with a system hive isn't usually a bad deal. Problem is you should usually move system and software (if not all 4) as a single unit. Going back two months in the software hive is going to have the likely side effect of "uninstalling" some stuff for you. Data and program files for the "uninstalls" will still be intact of course.

But hey...even those side effects are better than a non-bootable computer :)


Regarding raid:
I've fixed a LOT of dead servers. It has been my experience that high end server raid controllers with batter backed up cache and all the goodies perform really well. Anything less like say a promise controller can often cause more trouble than it prevents. Especially with raid 5. You blow a drive and it's supposed to keep rolling. In reality blowing a drive sometimes screws up the parity of the array and takes the whole thing down. So instead of redundancy you've basically introduced multiple single points of failure instead. My views may be skewed though since the only ones I ever saw were the ones that failed...who calls support when it's working? :p

IMHO the best raid for a PC/Workstation is Raid 1. The MTBF on modern drives is so incredibly high that the drives usually go obsolete before they fail...even with the doubled chance of failure. Keep good backups and all is well.

Always remember: Raid is to protect server UPTIME, not data. Backups protect data.

 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
Originally posted by: Smilin
It has been my experience that high end server raid controllers with batter backed up cache...

LOL at self typo...Meant "battery"

MMM...batter dipped raid, just like corndogs!

 

TheKub

Golden Member
Oct 2, 2001
1,756
1
0
Originally posted by: Mat99
...but now I see it just duplicates the errors as well.

Unfortunately, far far too many people have the mind set that RAID is a backup.
 

Mat99

Member
Sep 10, 2008
27
0
0
Hi guys,

thx again for helping out!

Ok, I navigated to the System Volume information on my raid os drive and I see two folders there:
_restore (a bunch of numbers and letters) with the date 10.09.2008 (I have DD/MM/YYYY)
_restore (a bunch of numbers and letters) with the date 07.09.2008

My PC crashed on monday 08.09.2008, so I opened the second folder. In it I have:
RP403 - date 08.09.2008
RP402 - date 07.09.2008
RP401 - date 06.09.2008
and others since 01.09.2008

So I used the ones from 06.09.
With the help of this document: http://www.wikihow.com/Recover...ndows-XP-from-Starting
I replaced all five files as instructed and restarted the pc.
No luck :( Same thing, I get the splash screen, bar animates for 10-15 seconds, goes black (I see the light on my mouse goes on) and 3-5 seconds later, it restarts.
I forgot to mention this yesterday, I no longer get bsod, the pc simply restarts.

I then noticed Smilin said to replace only 4 files instead of the 5 that are mentioned in the above document. So I put the original "default" file back in its place and restarted. I got the splash, the bar animates, I get the black screen, then it just hangs there.. it doesn't restart, it just stays there with the black screen (I waited for about 5 min) and I have to restart it myself.
When I restarted it, I got the F8 options and I tried Safe Mode. PC restarted soon after that, no luck.
Restarted again, used normal login, same as before, got the splash, it animates, then the black screen again where it stays, no reboot.
So no luck yet, still can't get in :( Any other ideas?


As for the raid itself, I use raid1 and I think it is a good backup thing, but I shouldn't have used it for the OS, just the data. I lost all my data 2 or 3 times before, when the drive just died. Now I have 2 identical copies, so if one dies.. That was my thinking at least, when I did it.

Thx for all the help!
Mat
 

RebateMonger

Elite Member
Dec 24, 2005
11,586
0
0
Nothing wrong with RAID 1 for an OS partition. But it won't help if the Registry gets corrupted or if the rest of the OS gets corrupted. You end up with two corrupted disks.

The best way to avoid data loss is to keep ongoing backups of the data. Modern disk imaging software is great, since you can generally restore either the entire volume or just individual files.
 

Mat99

Member
Sep 10, 2008
27
0
0
I've tried some other things, but no luck. It either hangs or simply restarts.
I added a boot log so I got the log, if this might help in any way? It's hosted here: http://shrani.si/f/1c/S3/2nHCpP6d/ntbtlog.doc (should be a .doc file in a zip; can't save .txt on this host, that's why .doc).
I'm getting more and more worried, there's no light at the end of this tunnel.. :(


I'm open to ANY ideas on how to avoid just such problems! I spend a lot of money on drives and raid and external drives already.
I just want my data to be 100% safe (I thought raid1 would handle this) and a way to restore the OS (in a matter of minutes or a few hours) if it gets corrupted like this now. I just want to avoid these "urgent" re-installations of the OS and all the programs.
I also want all this backup to be done on the fly, in real time, not something where you have to push a button once a week and wait for hours for the thing to do its job. It's too easy to forget, to not have time, etc..
Atm I have 4 SATA Seagate Barracuda 320GB drives, 2 in raid1 and 2 in raid0. Raid1 currently has 3 partitions (OS, data, photos), raid0 is intended as a scratch disk. I can lose the raid0 and use those as single drives if needed. I also have space for a 5th IDE drive (this is the one that's giving me access to the drives atm). I also have 2 external 200GB drives.
I'm still hoping I'll be able to fix the current os and get in, but sooner or later I'll have to think of a better configuration. I lost a week on this already, it can't happen again. Any suggestions?

Thx!
Mat


Update: while moving files from raid os drive to new drive, avast found a trojan in pagefile.sys. This is funny, since it didn't find it 2 days ago, when I did the full scan.
I deleted the pagefile and rebooted. No difference, the system hangs after the splash animation.
 

TheKub

Golden Member
Oct 2, 2001
1,756
1
0
Originally posted by: Mat99
I also want all this backup to be done on the fly, in real time, not something where you have to push a button once a week and wait for hours for the thing to do its job. It's too easy to forget, to not have time, etc..

Unfortunately, the more convenient the backup, the less useful (read: safe) it is. To have it real time (or even automated) you have two options: having the backup drive running in the same machine (internal or external) or on a separate machine over a network. Having the backup drive running all the time on the same PC leaves it exposed to things like power surges or some viruses. Having the disk on the network would only be a marginal improvement, as some viruses can traverse networks and a surge strong enough to damage one PC will most likely damage both. Also, remember it's the "real time" backup of RAID that bit your ass: if a file gets corrupted\deleted, your backup is IMMEDIATELY gone too. So unless you are going to be keeping archival backups in conjunction with your "real time" backup, it won't really do you any good.

Originally posted by: Mat99
I'm open to ANY ideas on how to avoid just such problems! I spend a lot of money on drives and raid and external drives already.
I just want my data to be 100% safe (I thought raid1 would handle this) and a way to restore the OS (in a matter of minutes or a few hours) if it gets corrupted like this now. I just want to avoid these "urgent" re-installations of the OS and all the programs.

My recommendation is using external storage. Once a month (or as you add programs\make major config changes) create a disk image of your OS disk. In most cases, if you have all your data separate from your apps, this backup will not need to be done frequently. And should you lose your OS drive, you can restore the backup image in a matter of minutes.

For your data drive, determine the frequency of the backups you want to do. If you can stand to lose a couple of days, do it weekly; if you want all data backed up all the time, you will most likely want to keep the external attached to your PC all the time. I would recommend at least unplugging the AC line when the drive is not actively backing up, as this should hopefully save you from power surges, and viruses can't access a powered-down disk.

Of course the above scenario ONLY protects you from hardware failure\corruption\viruses. If your house gets robbed\burned down\flooded\etc., all your data is gone. If you want to add an additional layer, you need to keep that external hard drive out of your house when it's not being used. For example, have 2 external drives: back up to one and take it to work, leave that drive at work and bring home the second. Yes, it's a lot of work and hassle, but it's up to you to determine HOW important your data is.

If spending money is not that big of an issue you need to look at online backups. They can be done automatically and with little hassle but you have to pay for the convenience. Though you will still have to do the disk imaging on your end to restore your OS should it die. Also, should there be a major failure and you need your data, depending how much you have and how fast your connection is it may take a LONG time to restore it all.

I really see your position, but if your data is THAT important then backups are THAT important to you. At companies where losing data can cost hundreds\thousands\millions of dollars, someone is rotating tapes off site and\or the data is being backed up to another storage device (san\nas). They will have multiple versions of backups, so should a file get corrupted and no one notices immediately, they have a chance to go back through the tapes and hopefully find one from before the corruption occurred. So, just like the old computer saying goes: you can have it cheap, easy, or reliable; you choose which 2.
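
The "multiple versions" idea comes down to a retention rule: keep a few recent snapshots plus some older ones, so that corruption noticed late can still be rolled back. Here is a minimal Python sketch of one such rule, not taken from any particular backup product. It assumes one dated snapshot per backup run, and the function name and default counts are invented for illustration.

```python
from datetime import date


def select_backups_to_keep(snapshot_dates, daily=7, monthly=6):
    """Keep the newest `daily` snapshots plus the oldest snapshot of each
    of the most recent `monthly` calendar months that have any snapshot."""
    ordered = sorted(set(snapshot_dates), reverse=True)  # newest first
    keep = set(ordered[:daily])

    # Walking newest -> oldest, the last value written per month is the oldest.
    oldest_in_month = {}
    for day in ordered:
        oldest_in_month[(day.year, day.month)] = day

    for month in sorted(oldest_in_month, reverse=True)[:monthly]:
        keep.add(oldest_in_month[month])
    return sorted(keep)


if __name__ == "__main__":
    have = [date(2008, 9, d) for d in range(1, 11)] + [date(2008, 8, 1)]
    for day in select_backups_to_keep(have, daily=3, monthly=2):
        print(day.isoformat())
```

Anything not in the returned list is a candidate for deletion. The trade-off is disk space against how far back a silently corrupted file can still be recovered.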
 

Mat99

Member
Sep 10, 2008
27
0
0
Thx for this!
So I'm guessing I'll keep raid1 for my data only (I keep a third copy on external disk as well), and install os on a new drive.
Do you recommend any software for disk imaging?
I could image OS to the raid1 drives as well I guess. Would leave me with 2 copies of OS.
Have to think a bit more about this.
 

TheKub

Golden Member
Oct 2, 2001
1,756
1
0
I really only have experience with Ghost, though there are likely free\open source options for cloning.
 

RebateMonger

Elite Member
Dec 24, 2005
11,586
0
0
I back up all of the PCs and Servers in my office using Windows Home Server. It automatically updates full system image backups of the PCs while retaining older versions, too. In case of a problem, I can restore individual files or entire partitions from pretty much any point in the past. Windows Drivers are automatically backed up, too, for the restoration process.

I also make separate full backups of my SBS 2003 Server (which has a single RAID 1 array) so I can keep some offline backups and because Windows Home Server backups of SBS 2003 technically aren't supported by Microsoft.

I've had my RAID 1 array corrupted by a power glitch, and I've seen MANY RAID 5 arrays corrupted or lost for many reasons.