INACCESSIBLE_BOOT_DEVICE in W2k - how to mod registry to default IDE driver?

gustep12

Junior Member
Sep 9, 2004
2
0
0
The IDE controller on my ECS K7S5A Pro mobo died unexpectedly. I went ahead and bought a new mobo, a different brand and chipset this time, because I felt the ECS had let me down in terms of reliability.

I swapped mobos and powered up the PC, only to be confronted with the blue screen of death message:

INACCESSIBLE_BOOT_DEVICE

This is running Windows 2000 with SP4.

I have read a lot about this issue, and it comes down to the following causes:

- Windows does not know how to talk to the new chipset's IDE controller, hence the error message
- Unfortunately (and stupidly), there is no easy way to force Windows to re-detect the changed IDE hardware

There are, however, two cures, aside from a completely new install from scratch:

1.) Force Windows to re-enumerate and re-detect the hardware by reinstalling Windows over the old install, using the "repair" option. If all goes well, the system eventually comes back to a working state. Note: For this to work you first have to select "Upgrade to Windows 2000", then "Repair" on a later screen.

2.) You can also try to have windows default to the simple old compatibility IDE driver in PIO mode. It's too bad that this is not a standard feature of the safe mode! It should be! Instead, you have to do it manually: Ensure that a few driver files (atapi.sys, intelide.sys, pciide.sys, and pciidex.sys) are in the winnt\system32\drivers directory, and then merge a few lines of registry code into the registry.

But how to do this? All instructions on the Internet assume that you prepare for a voluntary motherboard migration and can merge the reg file on your old system before you do the swap. My old system is dead, however! I cannot edit the w2k registry under DOS. I could probably get the WINNT\SYSTEM32\CONFIG\SYSTEM file that contains the registry keys of interest and try to edit it on another PC, but this is still tedious.

Here are my questions:

A.) Is there a blank SYSTEM. registry file somewhere that I could use to completely replace WINNT\SYSTEM32\CONFIG\SYSTEM with, and which would force W2K to re-detect all the hardware, including the IDE controller, while maintaining the software installations?

B.) What would happen if I erase the SYSTEM. registry file, and its backup SYSTEM.ALT, and other possible backups completely?

C.) Does someone have another solution as to how I can merge my "RevertToDefaultIDE.reg" file into the existing registry under DOS or on another PC with little hassle?

Thanks!

 

rumptis

Member
Sep 3, 2004
77
0
0
I have dealt with this a ton of times, and the easiest thing to do is a Windows Repair. It would be best to use a Windows 2000 CD that has SP4 slipstreamed onto it.

Just boot from the 2000 CD, skip the first repair option, and choose the second. It replaces all the system files and redetects all the hardware. You won't lose anything; you will just need to update most of your drivers, which isn't a big deal. As far as I know this is the only way to fix that problem after a hardware crash. I think you could make some changes if Windows would still boot before you put the new motherboard in, but most of us are never that lucky.
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
Originally posted by: gustep12

Here are my questions:

A.) Is there a blank SYSTEM. registry file somewhere that I could use to completely replace WINNT\SYSTEM32\CONFIG\SYSTEM with, and which would force W2K to re-detect all the hardware, including the IDE controller, while maintaining the software installations?

B.) What would happen if I erase the SYSTEM. registry file, and its backup SYSTEM.ALT, and other possible backups completely?

C.) Does someone have another solution as to how I can merge my "RevertToDefaultIDE.reg" file into the existing registry under DOS or on another PC with little hassle?

Thanks!

A.) You can see if there is a system.bak located in winnt\repair. This hive will immediately kick off GUI mode setup. However you will lose so much system configuration info it would be better to completely reload.

B.) You will boot to a "missing or corrupt winnt\system32\config\system.ced" error.

C.) Look up "load hive" in the regedit help. You can also merge a .reg file silently using regedit with the /s switch: run "regedit /s <path to reg file>".
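
One wrinkle with the "load hive" route: a SYSTEM hive loaded from another disk shows up under whatever temporary key name you pick, and it has no CurrentControlSet (that key is only a link built at boot; the Select key inside the hive says which ControlSet00x is current). So a .reg file written against CurrentControlSet has to be retargeted before it is merged. Below is a minimal Python sketch of that rewrite. The mount name OFFLINE_SYSTEM, the control set number 1, and the sample .reg fragment are all assumptions for illustration; the fragment is not the original poster's RevertToDefaultIDE.reg.

```python
# Retarget a .reg file so it can be merged into a SYSTEM hive that was
# loaded offline under a temporary key (hypothetical name: OFFLINE_SYSTEM).
OLD_PREFIX = r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"


def retarget_reg(reg_text, mount_name="OFFLINE_SYSTEM", control_set=1):
    """Return reg_text with CurrentControlSet key headers pointed at
    ControlSetNNN under the given mount name. Value lines are untouched."""
    new_prefix = "[HKEY_LOCAL_MACHINE\\" + mount_name + "\\ControlSet%03d" % control_set
    out = []
    for line in reg_text.splitlines():
        # Registry key names are case-insensitive, so compare in upper case.
        if line.upper().startswith(OLD_PREFIX.upper()):
            rest = line[len(OLD_PREFIX):]
            # Only rewrite when the prefix ends at a key boundary.
            if rest[:1] in ("\\", "]"):
                line = new_prefix + rest
        out.append(line)
    return "\n".join(out)


# Illustrative fragment only: marks the stock atapi driver as boot-start.
sample = "\n".join([
    "REGEDIT4",
    "",
    r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\atapi]",
    '"Start"=dword:00000000',
])

print(retarget_reg(sample))
```

The control set number should be read from the Select key of the loaded hive rather than assumed, and the hive has to be unloaded again before the SYSTEM file is copied back to the original disk.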

I would suggest you just do as rumptis said and perform an inplace upgrade (aka repair). Boot to your w2k CD, provide the F6 driver if needed, Hit enter at "welcome to setup", hit F8 at the EULA, Hit 'R' to repair an installation. It will preserve all your programs.



 

McMadman

Senior member
Mar 25, 2000
938
0
76
Originally posted by: Smilin

I would suggest you just do as rumptis said and perform an inplace upgrade (aka repair). Boot to your w2k CD, provide the F6 driver if needed, Hit enter at "welcome to setup", hit F8 at the EULA, Hit 'R' to repair an installation. It will preserve all your programs.

Sadly, the "repair installation" feature of 2000 was not nearly as reliable as it is in XP. I had a similar failure: after a move from a VIA-based chipset to a SiS-based one, 2k failed and was never recovered.

Or in the few cases where it did reenumerate devices correctly, there have been reports of it not properly keeping installed programs.
 

McMadman

Senior member
Mar 25, 2000
938
0
76
Well, I suppose "reports" probably wasn't the best choice of words.

One experience I had was a repair install on a friend's computer. It got Windows loaded again, but it loaded a new "user" registry, so a good amount of the basic settings were lost.

The other case was a friend running it with a similar result: all his programs needed to be reinstalled. I can't exactly speak for what he did in this case, nor is this really a great sample size for 2000 repairs of this nature.

I have done repair installs in XP and never had the user profiles get lost (the worst that happens is of course that hotfixes are lost, which is to be expected).

I don't have much explanation for the behavior, since you do the exact same steps in either 2k or xp.
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0

Yes, you can end up with a new user profile sitting in a docs and settings' .000 folder. It should never really goof anything up outside HKCU though.

I think your friend may have done a new install to the same folder on accident. If you ever have to hit 'L' during setup you've made a wrong turn :p



Craziest sh1t I've seen: You can kick off an inplace upgrade/repair and wait for it to finish hardware detection. You then shift-f10 over, bring up regedit and manually knock yourself out of gui mode setup. Then cold boot out of the install. It lets setup redetect all your hardware for you without touching any other portion of the registry. Some dude I know came up with the technique so he could run a repair on a DC without damaging it (inplace on a DC is baaaad.)
 

VirtualLarry

No Lifer
Aug 25, 2001
56,570
10,202
126
Originally posted by: gustep12
2.) You can also try to have windows default to the simple old compatibility IDE driver in PIO mode. It's too bad that this is not a standard feature of the safe mode! It should be! Instead, you have to do it manually: Ensure that a few driver files (atapi.sys, intelide.sys, pciide.sys, and pciidex.sys) are in the winnt\system32\drivers directory, and then merge a few lines of registry code into the registry.

You know, that's a VERY, VERY, GOOD IDEA! (Microsoft, are you listening?)

Perhaps add a "/stdide" switch to the BOOT.INI OS-loader commandline?

That's one of the very few features that I thought made W2K a much more "fragile" OS than Win9x ever was, because Win9x can always resort to a fail-safe mode of using "MS-DOS Compatibility mode" (really, BIOS int 13h interrupt calls) for disk I/O. (In fact, if you want to run Win9x on a partition that requires 48-bit LBA, but you don't have a controller driver that supports it, but the BIOS does, then you could conceivably run the OS, albeit slowly, by using the BIOS int 13h compatibility mode.)

Originally posted by: Smilin
Craziest sh1t I've seen: You can kick off an inplace upgrade/repair and wait for it to finish hardware detection. You then shift-f10 over, bring up regedit and manually knock yourself out of gui mode setup. Then cold boot out of the install. It lets setup redetect all your hardware for you without touching any other portion of the registry. Some dude I know came up with the technique so he could run a repair on a DC without damaging it (inplace on a DC is baaaad.)

Wow, that's some crazy sh1t indeed. Isn't there a registry setting that you can set that causes Windows to go through an entire hardware-level redetection when next booted? (Sysprep sets this, I think, right?)

I'm kind of curious how you would "manually knock yourself out of gui mode setup", using a command window... you mean forcibly kill the GUI setup process? What kind of user shell do you end up with then, since the command window that you've been using, is a child window/process of the GUI setup program, isn't it?

 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0

Sysprep isn't quite the same. If you have a non-inbox mass storage driver you'll still need to add it for sysprep to work. Its big purpose is to change SIDs. Besides, the system has to be bootable to run sysprep, which isn't required for an inplace upgrade.


I'll PM you *some* of the other details about GUI mode. :p
 

McMadman

Senior member
Mar 25, 2000
938
0
76
Originally posted by: Smilin
Yes, you can end up with a new user profile sitting in a docs and settings' .000 folder. It should never really goof anything up outside HKCU though.

That I believe did happen, which wasn't so horrible, the system was still running after a motherboard swap, which was better than a total reinstall.

When I had my failure, I fought it for hours trying to repair and just gave up and deleted the entire install (dualboot)

Originally posted by: Smilin
Craziest sh1t I've seen: You can kick off an inplace upgrade/repair and wait for it to finish hardware detection. You then shift-f10 over, bring up regedit and manually knock yourself out of gui mode setup. Then cold boot out of the install. It lets setup redetect all your hardware for you without touching any other portion of the registry. Some dude I know came up with the technique so he could run a repair on a DC without damaging it (inplace on a DC is baaaad.)

Now that just seems like a strange way of doing a redetection of hardware, but then again we are talking about windows which often results in these odd problems.

I've never had to work with anything that mission critical myself, but I have to wonder what would make someone think of doing that in the first place, I've NEVER heard of shift+F10 during an install!

Originally posted by: VirtualLarry
Perhaps add a "/stdide" switch to the BOOT.INI OS-loader commandline?

That's one of the very few features that I thought made W2K a much more "fragile" OS than Win9x ever was, because Win9x can always resort to a fail-safe mode of using "MS-DOS Compatibility mode"

I agree, while of course reverting to something as horribly slow as "compatibility mode" sucks for speed, it would potentially allow you to change/remove the old ide controller and allow something standard to be installed after you can no longer boot (failure/etc) Safe mode should be just that, a bare minimum load of windows that allows you access to some standard components/gui for diagnostic/troubleshooting.

The "recovery console" is mostly useless since you can't easily remove faulty devices, and if your SAM file is corrupt then you can't even login to it as administrator anyways (granted you won't be able to do anything at this point short of restore an old copy using something like ntfsdos or using another bootable computer with the drive slaved.)

In 9x you could so easily go into safe mode, remove ALL devices from device manager and reboot (might need to force installation via "add new hardware wizard") but you can totally recover. I wouldn't even think of doing this in 2k/xp!
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
Originally posted by: McMadman
I agree, while of course reverting to something as horribly slow as "compatibility mode" sucks for speed, it would potentially allow you to change/remove the old ide controller and allow something standard to be installed after you can no longer boot (failure/etc) Safe mode should be just that, a bare minimum load of windows that allows you access to some standard components/gui for diagnostic/troubleshooting.

You're kind of assuming that the failed drive is ATA. Might be SCSI, SATA or RAID. It's a tough transition to bootstrap a computer up from int13 calls all the way to 32/64bit preemptive multitasking. They have crammed quite a bit of cleverness in there already. If you want some kinda compatibility mode you're going to have to throw a pretty big wrench into the architecture.

The "recovery console" is mostly useless since you can't easily remove faulty devices, and if your SAM file is corrupt then you can't even login to it as administrator anyways (granted you won't be able to do anything at this point short of restore an old copy using something like ntfsdos or using another bootable computer with the drive slaved.)

The recovery console is VERY useful. VERY.
You can easily remove faulty devices, see the "listsvc" and "disable" commands.
Also if your SAM is so corrupt it can't be read you can just use a repair, regback or sysvol version to boot. If it's damaged "just right" you'll need a parallel to get in to swap the hives. Only seen this once ever.
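
For context on "listsvc" and "disable": every entry under the Services key carries a Start value, and those Recovery Console commands list and change it. Here is a small Python lookup of the standard start types as a reference sketch; the helper names are made up for illustration.

```python
# Standard values of the "Start" DWORD under each Services\<name> key.
START_TYPES = {
    0: "SERVICE_BOOT_START",    # loaded by the boot loader, e.g. disk drivers
    1: "SERVICE_SYSTEM_START",  # started during kernel initialization
    2: "SERVICE_AUTO_START",    # started by the service control manager
    3: "SERVICE_DEMAND_START",  # started manually or on demand
    4: "SERVICE_DISABLED",      # never started
}


def describe_start(value):
    """Return the symbolic name for a Start value."""
    return START_TYPES.get(value, "unknown (%d)" % value)


def loaded_by_boot_loader(value):
    """True when the driver must be readable before the kernel runs."""
    return value == 0


for start in (0, 3, 4):
    print(start, describe_start(start), loaded_by_boot_loader(start))
```

A mass storage driver that the system boots from needs the boot start type, which is why a driver left at any other value cannot bring up the boot disk.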


The aforementioned registry/setup trick was due to some guy's enum key looking quite a bit like ascii porn after an array failure.

Among other things I specialize in fixing "dead" servers for a living so I see more weird crap in a few days than most people see in years.
 

dclive

Elite Member
Oct 23, 2003
5,626
2
81
Originally posted by: Smilin


Craziest sh1t I've seen: You can kick off an inplace upgrade/repair and wait for it to finish hardware detection. You then shift-f10 over, bring up regedit and manually knock yourself out of gui mode setup. Then cold boot out of the install. It lets setup redetect all your hardware for you without touching any other portion of the registry. Some dude I know came up with the technique so he could run a repair on a DC without damaging it (inplace on a DC is baaaad.)


E-mail me whatever you've got on this - I'd love to read up on it!
 

McMadman

Senior member
Mar 25, 2000
938
0
76
Originally posted by: Smilin
Originally posted by: McMadman
I agree, while of course reverting to something as horribly slow as "compatibility mode" sucks for speed, it would potentially allow you to change/remove the old ide controller and allow something standard to be installed after you can no longer boot (failure/etc) Safe mode should be just that, a bare minimum load of windows that allows you access to some standard components/gui for diagnostic/troubleshooting.

You're kind of assuming that the failed drive is ATA. Might be SCSI, SATA or RAID. It's a tough transition to bootstrap a computer up from int13 calls all the way to 32/64bit preemptive multitasking. They have crammed quite a bit of cleverness in there already. If you want some kinda compatibility mode you're going to have to throw a pretty big wrench into the architecture.

I thought that ATA/SCSI would both accept int13 calls, although on the same note, usually SCSI/SATA/RAID devices will have their own controller card which windows should already have had the drivers installed.

The "recovery console" is mostly useless since you can't easily remove faulty devices, and if your SAM file is corrupt then you can't even login to it as administrator anyways (granted you won't be able to do anything at this point short of restore an old copy using something like ntfsdos or using another bootable computer with the drive slaved.)

The recovery console is VERY useful. VERY.
You can easily remove faulty devices, see the "listsvc" and "disable" commands.
Also if your SAM is so corrupt it can't be read you can just use a repair, regback or sysvol version to boot. If it's damaged "just right" you'll need a parallel to get in to swap the hives. Only seen this once ever.

Whoops, I forgot that "disable" would work for devices. I guess I haven't fully had to use the recovery console as much to have good success with it.

I have actually had to do some basic recovery of the registry hives a few too many times. Windows wouldn't load in GUI mode (it froze at "logging in", and the recovery console reported an invalid password). I've used ntfsdos to access the system volume information (windows xp in all cases) and locate a fairly recent registry snapshot to restore as the active registry, which solved the problem. Ironically, most of the times I've seen this happen were on a laptop (in some cases more than once).

The aforementioned registry/setup trick was due to some guy's enum key looking quite a bit like ascii porn after an array failure.

Among other things I specialize in fixing "dead" servers for a living so I see more weird crap in a few days than most people see in years.

That registry/setup trick is neat, I'll have to remember it. What other steps were needed in this case if any?

Of course doing this for a living, you'll see a lot more oddities, and I'm sure some of the servers are mission critical so need to recover asap with no changes in security/structure of the server.
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
I thought that ATA/SCSI would both accept int13 calls, although on the same note, usually SCSI/SATA/RAID devices will have their own controller card which windows should already have had the drivers installed.
Yep, they all do int13 and that's what ntldr uses initially to get boot.ini loaded, ntdetect.com run, the system hive loaded and all "boot" services/drivers loaded (not started), ntoskrnl and hal.dll are also loaded. Keep in mind all that crap won't fit in real-mode memory so it has to transition to 32 bit flat mode. By the time it passes control to the kernel it has initialized page tables and is in 32 bit protected mode. The kernel actually starts the storage driver almost immediately.

So basically by the time the kernel starts the OS is already dependent on a driver. Your "compatibility mode" would pretty much have to make do without a kernel!?! Windows does come with a whole slew of "inbox" drivers that are supplied by the manufacturer, but most modern servers have already outgrown those drivers due to either new hardware that has come out after the OS was released or firmware updates that make the inbox drivers incompatible.

ntldr can also start with ntbootdd.sys instead of using int13s for all its work. This is the exact same code as the driver, but ntldr simply can't do anything as useful with it as the kernel can.

I've used ntfsdos to access the system volume information (windows xp in all cases) and locate a fairly recent registry snapshot to restore as the active registry
Yet another reason why XP rocks compared to 2k. People are actually dumb enough to turn off system restore points too. *sigh*


Generally what I see with recovering the servers isn't as much about ASAP. Some cases it can take hours to get a 'dead' server back to life. The time is often better spent slapping down a new install and restoring. No, what I usually get is: Yeah this is our sole Exchange server and DC and we haven't had a good backup in 8 months. So yeah, you gotta save the day and it gets 'creative' from time to time.

Sometimes it is about ASAP and backups are available (God bless those customers :D). In that case it's all about getting it bootable with just enough functionality to run the restore software. Repair hives come in handy.

There are many methods to recover and if you *truly* find yourself in such a pickle that you need that registry trick, give MS a call. If your butt can be saved they'll do it. A guy in their Enterprise Platforms Support group developed it. It's not documented outside MS.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,570
10,202
126
Originally posted by: Smilin
Originally posted by: McMadman
I agree, while of course reverting to something as horribly slow as "compatibility mode" sucks for speed, it would potentially allow you to change/remove the old ide controller and allow something standard to be installed after you can no longer boot (failure/etc) Safe mode should be just that, a bare minimum load of windows that allows you access to some standard components/gui for diagnostic/troubleshooting.

You're kind of assuming that the failed drive is ATA. Might be SCSI, SATA or RAID. It's a tough transition to bootstrap a computer up from int13 calls all the way to 32/64bit preemptive multitasking. They have crammed quite a bit of cleverness in there already. If you want some kinda compatibility mode you're going to have to throw a pretty big wrench into the architecture.

I made the implicit assumption that the OS was using a "multi()" BOOT.INI loader line (i.e. BIOS Int13h-accessible). If it used a "SCSI()" line, then things would be a little different. (Actually, I am curious: is that one way to get around an INACCESSIBLE_BOOT_DEVICE boot failure, if you manually copy a vendor-specific SCSI-port disk driver over to, what is it, NTDISK.SYS or something, with the bootloader files, and then change the BOOT.INI line? Or will it still barf, because it doesn't have that controller's registry settings in the CurrentControlSet, etc.?)
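
To make the multi() versus scsi() distinction concrete, here is a minimal Python sketch that parses the ARC path portion of a BOOT.INI entry and reports which mechanism the loader would use to reach the disk. The function and field names are made up for illustration, and the signature() syntax that W2K also accepts is deliberately not handled.

```python
import re

# Matches e.g. multi(0)disk(0)rdisk(0)partition(1)\WINNT
ARC_RE = re.compile(
    r"^(multi|scsi)\((\d+)\)disk\((\d+)\)rdisk\((\d+)\)partition\((\d+)\)(\\.*)?$",
    re.IGNORECASE,
)


def parse_arc(path):
    """Split a multi()/scsi() ARC path into its numbered components."""
    m = ARC_RE.match(path.strip())
    if m is None:
        raise ValueError("not a multi()/scsi() ARC path: %r" % path)
    kind = m.group(1).lower()
    return {
        "syntax": kind,
        "adapter": int(m.group(2)),
        "disk": int(m.group(3)),
        "rdisk": int(m.group(4)),
        "partition": int(m.group(5)),
        "directory": m.group(6) or "",
        # multi() means the loader reads the disk through BIOS int 13h;
        # scsi() means it loads ntbootdd.sys from the boot files instead.
        "needs_ntbootdd": kind == "scsi",
    }


for entry in (r"multi(0)disk(0)rdisk(0)partition(1)\WINNT",
              r"scsi(0)disk(0)rdisk(0)partition(1)\WINNT"):
    info = parse_arc(entry)
    print(entry, "->", "ntbootdd.sys" if info["needs_ntbootdd"] else "int 13h")
```

Either way this only covers the loader phase; once the kernel takes over it still needs a working boot-start storage driver of its own.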

The aforementioned registry/setup trick was due to some guy's enum key looking quite a bit like ascii porn after an array failure.

Among other things I specialize in fixing "dead" servers for a living so I see more weird crap in a few days than most people see in years.

You indeed must see some interesting things then. :) Thanks for sharing.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,570
10,202
126
Originally posted by: Smilin
ntldr can also start with ntbootdd.sys instead of using int13s for all its work.
Ahh. That's the one. I always forget that filename. :|

Originally posted by: Smilin
Generally what I see with recovering the servers isn't as much about ASAP. Some cases it can take hours to get a 'dead' server back to life. The time is often better spent slapping down a new install and restoring. No, what I usually get is: Yeah this is our sole Exchange server and DC and we haven't had a good backup in 8 months. So yeah, you gotta save the day and it gets 'creative' from time to time.

So true. I had to recover someone's W2K SP4 system recently, they were running a firmware-RAID mirror setup, and removed one of the drives, and suddenly both drives in the mirror came up blank. Somehow, the first sector got zero'ed out on both, have no idea how. They had a lot of valuable personal data in one of the extended partitions that they hadn't backed up in some time. Spent a few days scanning the disk for boot sectors and rebuilding partition tables for them. Informed them in no uncertain terms that "a mirror is NOT a backup". Bleh. Users. :)

Originally posted by: Smilin
Sometimes it is about ASAP and backups are available (God bless those customers :D). In that case it's all about getting it bootable with just enough functionality to run the restore software. Repair hives come in handy.

Ghost 2003 is definitely my friend. That being said, I can't remember the last time that I made a W2K ERD for my system, might be a good idea to do that now, as a matter of fact. Hmm.
 

dclive

Elite Member
Oct 23, 2003
5,626
2
81
Originally posted by: VirtualLarry

Ghost 2003 is definitely my friend. That being said, I can't remember the last time that I made a W2K ERD for my system, might be a good idea to do that now, as a matter of fact. Hmm.

Run NTBACKUP. Do a system state backup. Note that your %SystemRoot%\repair\regback directory (c:\winnt\repair\regback on a default W2k install) now has a copy of your current (as of the time the backup was run) registry. :)

 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
Originally posted by: VirtualLarry
I made the implicit assumption that the OS was using a "multi()" BOOT.INI loader line (i.e. BIOS Int13h-accessible). If it used a "SCSI()" line, then things would be a little different. (Actually, I am curious: is that one way to get around an INACCESSIBLE_BOOT_DEVICE boot failure, if you manually copy a vendor-specific SCSI-port disk driver over to, what is it, NTDISK.SYS or something, with the bootloader files, and then change the BOOT.INI line? Or will it still barf, because it doesn't have that controller's registry settings in the CurrentControlSet, etc.?)

Stop 7B, INACCESSIBLE_BOOT_DEVICE can be caused by a whole slew of reasons. It's very rare that it's due to problems with the boot loader phase. I've seen it maybe one or two times in the last 50 or so 7Bs I've fixed. You can eliminate nearly every boot loader phase problem by just using a floppy to boot with. The vast majority of 7Bs are related to either filesystem corruption or filter drivers. Backup and Antivirus software vendors have a really bad track record when it comes to properly installing, upgrading and removing their boot and system start filter drivers. Veritas is particularly bad.

Boot loader phase Stop 7Bs will happen *immediately*. If you see the splash screen at all before the Stop 7B you have already entered Kernel phase.

ntldr uses ntbootdd.sys prior to accessing the registry. Specific settings for the real driver (cpqarray.sys or whatever) are only applied once ntoskrnl.exe begins using it.

edit: To clarify ntldr accesses the system hive using ntbootdd.sys but is not paying any particular attention to the mass storage driver settings at this point.
 

Auric

Diamond Member
Oct 11, 1999
9,591
2
71
Don't see any reason not to just repair. It's EZ and fast. If you plan on moving to a different storage controller, change it to the "standard" generic one first. This could be done before making backup images in anticipation of a mobo failure where the storage controller is likely to be changed but is prolly o'erkill.
 

McMadman

Senior member
Mar 25, 2000
938
0
76
Originally posted by: Smilin
I've used ntfsdos to access the system volume information (windows xp in all cases) and locate a fairly recent registry snapshot to restore as the active registry
Yet another reason why XP rocks compared to 2k. People are actually dumb enough to turn off system restore points too. *sigh*

System restore is quite often useful, though granted, when it comes to removing any sort of spyware/viruses/worms it can be quite annoying when it restores what you undo. The fact that it'll make and store a number of system states makes recovery much easier when the problem is registry related.

Generally what I see with recovering the servers isn't as much about ASAP. Some cases it can take hours to get a 'dead' server back to life. The time is often better spent slapping down a new install and restoring. No, what I usually get is: Yeah this is our sole Exchange server and DC and we haven't had a good backup in 8 months. So yeah, you gotta save the day and it gets 'creative' from time to time.

Quite scary when you think about it, how necessary the server is to a customer, yet they haven't spent the time to get a decent backup procedure in place

Sometimes it is about ASAP and backups are available (God bless those customers :D). In that case it's all about getting it bootable with just enough functionality to run the restore software. Repair hives come in handy.

This I'm sure is quite rare, but a godsend when it can make life much easier for repair, and get the system up and running.

And in hindsight, my problem wasn't an inaccessible boot device error. It was a hardlock (safe mode hung on viaagp.sys trying to initialize it), but this was still a failure due to a motherboard swap and a case of a repair install not fully repairing.
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
Originally posted by: McMadman

And in hindsight, my problem wasn't an inaccessible boot device error. It was a hardlock (safe mode hung on viaagp.sys trying to initialize it), but this was still a failure due to a motherboard swap and a case of a repair install not fully repairing.

Keep in mind the Microsoft supported method for swapping hardware is a clean install and a restore of data. A repair install to swap hardware is not officially supported. Unofficially it usually works anyway :)



Worst 'dead' server I've ever seen:
Customer had three external drive arrays with ~10 disks each. Each one a raid 5 array. He spanned all three arrays into one 6TB volume. They lost a fan in one of the external arrays and like 3 or 4 disks burned up. The whole 6TB volume went.

Here's the kicker:
They had been too busy using the tape drives to shovel data IN to the array to use them for backups. Last complete backup was almost a year earlier. I can't say what the nature of the data is without putting anonymity at risk but it was worth millions. I was told several hundred million by their president but I think he was exaggerating. Due to time constraints they couldn't call a data recovery service and I'm not sure if Ontrack would handle that size of job.

How he got lucky:
The volume wasn't full and the data was mostly sitting on the earlier 1/2 of the volume and it was the last array that died. We trimmed up the volume size to 2/3 with diskprobe then let chkdsk rip. Based on size about 75% of the data was intact. Because of where the MFT was located relative to the data it left anything it couldn't repair as 0 byte files so they could tell what was good and what wasn't. They pulled most of the rest of it back on to the array from workstations where people had been downloading big chunks to work on.

Where they weren't so smart:
The obvious of course. They made a backup but then went back to using the tapes to shovel data in. The company president didn't fire the IT guy as far as I know.
 

McMadman

Senior member
Mar 25, 2000
938
0
76
Originally posted by: Smilin
Originally posted by: McMadman

And in hindsight, my problem wasn't an inaccessible boot device error. It was a hardlock (safe mode hung on viaagp.sys trying to initialize it), but this was still a failure due to a motherboard swap and a case of a repair install not fully repairing.

Keep in mind the Microsoft supported method for swapping hardware is a clean install and a restore of data. A repair install to swap hardware is not officially supported. Unofficially it usually works anyway :)

True, most of the time people will say to just reinstall windows if possible for the cleanest windows install when swapping - granted I probably held a record with my old 98se install (installed in 99, went through 4-6 motherboards varying from via to sis to nforce2) at the very end however it was getting pretty unstable, but it was of course over 4 years old (in some cases it worked better than my reinstall on the same hardware)

They had been too busy using the tape drives to shovel data IN to the array to use them for backups. Last complete backup was almost a year earlier. I can't say what the nature of the data is without putting anonymity at risk but it was worth millions. I was told several hundred million by their president but I think he was exaggerating. Due to time constraints they couldn't call a data recovery service and I'm not sure if Ontrack would handle that size of job.

Laugh! that's just totally backwards, obviously to be taking data from tape, the data either never/rarely changes or was backed up from another location - while I hope there was some reasoning behind this, I'm sure they could have easily expanded to do regular full backups with some incremental backups thrown in.

I'm sure that Ontrack or some other professional data recovery service would have been willing to do it (not cheaply, of course). If that data is critical to a business, any downtime is unacceptable, which is why a proper backup system should have been implemented in the first place.

How he got lucky:
The volume wasn't full and the data was mostly sitting on the earlier 1/2 of the volume and it was the last array that died. We trimmed up the volume size to 2/3 with diskprobe then let chkdsk rip. Based on size about 75% of the data was intact. Because of where the MFT was located relative to the data it left anything it couldn't repair as 0 byte files so they could tell what was good and what wasn't. They pulled most of the rest of it back on to the array from workstations where people had been downloading big chunks to work on.

I'm not too sure how this setup works, I know the basics on raid5, but not about spanning multiple arrays into one logical volume. From what I know the third array would be fragments unrecoverable by parity data so you'd be left with whatever could be recovered from the stripes.

He did get lucky by having most of the data on the first 2 arrays. If the MFT had been on that third array, or if it had been the second array that failed, recovery would have been MUCH harder.
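
The parity point can be shown with a toy example. This is a simplified Python sketch of single-parity (RAID 5 style) recovery on one stripe, assuming equal-sized blocks and ignoring the way real controllers rotate parity across disks. It shows why one lost disk per array is recoverable while the 3 or 4 burned disks in the story were not.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe):
    """Rebuild the single missing block (marked None) of one stripe.

    stripe holds the data blocks plus the parity block. XOR parity can
    only solve for one unknown, so anything else raises ValueError.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one block, "
                         "got %d missing" % len(missing))
    survivors = [block for block in stripe if block is not None]
    return xor_blocks(survivors)


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One disk lost: the missing block comes back from the survivors.
print(rebuild_missing([data[0], None, data[2], parity]))

# Two disks lost in the same stripe: nothing to solve with.
try:
    rebuild_missing([None, None, data[2], parity])
except ValueError as err:
    print("unrecoverable:", err)
```

Since a spanned volume simply appends one array after another, losing the last array takes out the last third of the address space, which fits the description of trimming the volume to 2/3 before running chkdsk.
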
Where they weren't so smart:
The obvious of course. They made a backup but then went back to using the tapes to shovel data in. The company president didn't fire the IT guy as far as I know.

I'd be really amazed if the IT guy didn't get fired, at the very least though they sure learned a lesson and hopefully they've fixed their backup situation, since next time they might not get off so easily.
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
I didn't get the impression from the president that he was going to fire his IT guy. Seriously.

After the whole thing was done it took serious arm twisting from my manager to get them to sit down and discuss their situation with us. We weren't about to let them walk away from the case without making some serious changes going forward. It was like talking to a wall. I see more trouble in their future.

 

McMadman

Senior member
Mar 25, 2000
938
0
76
I can somewhat understand the president being ignorant about the topic (isn't it almost always the case) but alas, if they refuse to change and a similar failure happens in the future with less recovery, they can only blame themselves