Grub 1.5 - Error 17 - CentOS 6 with HP Smartarray

Scarpozzi

So... I've been called to swoop in and save the day on a server that I didn't set up. I'm only playing detective, trying to figure out how it was set up so I can come up with a fix, and I need help figuring out the best way to go about it.

I work for a Dell shop and have only worked with Dell/IBM servers for the past 12 years. I know very little about HP.

This is an HP ProLiant server that stops at Grub 1.5 with Error 17. It will not boot from the local file system, but I was able to boot a live CD and run fdisk -l. The HP Smart Array driver names its devices differently from the usual /dev/sda, /dev/sda1, /dev/sda2 ... /dev/sda5.

What would be /dev/sda shows up as /dev/cciss/c0d0, and the partitions are /dev/cciss/c0d0p1 through p5.
p1 is mapped somehow in fstab to /state/partition1. I believe this partition contains /boot and /etc because they don't show up in p2, which appears to be /.
p3 is swap
p4 and p5 appear to be multipathed copies of the same partition. I was only able to mount one of the two for some reason.
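
For anyone else coming from /dev/sdX naming, here is a minimal Python sketch of how the cciss device names above break down, assuming the usual c<controller>d<logical drive>p<partition> pattern. The function name `parse_cciss` is made up for illustration and is not part of any HP tool.

```python
import re

# Assumed pattern: /dev/cciss/c<controller>d<logical drive>[p<partition>]
CCISS_RE = re.compile(r"^/dev/cciss/c(\d+)d(\d+)(?:p(\d+))?$")


def parse_cciss(path):
    """Split a cciss device path into its numeric parts (hypothetical helper).

    Returns None when the path does not follow the cciss naming pattern.
    """
    match = CCISS_RE.match(path)
    if match is None:
        return None
    controller, drive, partition = match.groups()
    return {
        "controller": int(controller),
        "drive": int(drive),
        # A whole-disk path such as /dev/cciss/c0d0 has no partition suffix.
        "partition": int(partition) if partition is not None else None,
    }


if __name__ == "__main__":
    for name in ("/dev/cciss/c0d0", "/dev/cciss/c0d0p1", "/dev/sda1"):
        print(name, parse_cciss(name))
```

So /dev/cciss/c0d0p1 reads as controller 0, logical drive 0, partition 1, which lines up with what /dev/sda1 would be on a plain SCSI/SATA disk.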

When I attempt to mount /dev/cciss/c0d0p1, I get the wrong filesystem or bad superblock error. I tried picking a different superblock with fsck and had no luck.
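
As a rough way to see whether any superblock copy survived, the sketch below looks for the ext2/3/4 magic number (0xEF53) at the primary superblock and at the places backups usually live. It assumes the partition was ext2/3/4 and was created with mke2fs defaults (blocks per group = 8 x block size, sparse backups in groups 1, 3, 5, 7, 9); none of that is confirmed for this server. It only reads, never writes, and the function names are illustrative.

```python
import struct

EXT_MAGIC = 0xEF53    # s_magic value shared by ext2, ext3 and ext4
SB_PRIMARY = 1024     # the primary superblock always starts at byte 1024
MAGIC_OFFSET = 56     # position of s_magic inside the superblock


def backup_offsets(block_size, groups=(1, 3, 5, 7, 9)):
    """Byte offsets of likely backup superblocks (assumes mke2fs defaults)."""
    blocks_per_group = 8 * block_size          # assumption: default layout
    first_data_block = 1 if block_size == 1024 else 0
    return [(first_data_block + g * blocks_per_group) * block_size
            for g in groups]


def has_ext_magic(path, offset):
    """True if a superblock starting at byte `offset` carries the ext magic."""
    with open(path, "rb") as dev:
        dev.seek(offset + MAGIC_OFFSET)
        raw = dev.read(2)
    return len(raw) == 2 and struct.unpack("<H", raw)[0] == EXT_MAGIC


def scan(path):
    """Report where superblock magics were found on a device or image file."""
    result = {"primary": has_ext_magic(path, SB_PRIMARY), "backups": []}
    for block_size in (1024, 2048, 4096):
        for offset in backup_offsets(block_size):
            if has_ext_magic(path, offset):
                # Store (block size, block number) for the candidate backup.
                result["backups"].append((block_size, offset // block_size))
    return result


if __name__ == "__main__":
    import sys
    print(scan(sys.argv[1]))
```

Run it as root from the live CD against the partition (for example /dev/cciss/c0d0p1) or, better, against an image of it. Any (block size, block number) pair it reports is a candidate for e2fsck's -B and -b options. If it finds nothing at all, the filesystem metadata is probably gone or the partition was never ext2/3/4.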

I was told by the admin that did the damage that he ran fsck on the system and rebooted. I don't know what switches he used, but I'm thinking he did something to wipe /state/partition1, i.e. /dev/cciss/c0d0p1.

Got any cool tips for trying to restore a partition? The admin in charge of the system does not have backups for it (never paid for them), and I would prefer not to resort to an OS reinstall to replace /boot and /etc. I fear that it would remove most of his config and he'll be knocking on my door more down the road.
 
Thanks. I was looking at TestDisk as an option, but I've had limited success with it. The guy that worked on it admitted that he wasn't sure what he was doing....so I have no clue what flags he ran on fsck... It makes me wonder if there was some override to run it on the filesystem while it was up. (not sure if that can be done)

I just got that system up last week after it had been down for weeks and now I'm in a worse place than I was before. If I owned the system, I'd take full ownership of root... Since I'm always the one bailing them out, I probably should and save myself trouble... sorry...had to vent.
 
I am using Knoppix as my live CD and sure enough it has TestDisk on it. The first run of TestDisk only found the two partitions I can read (not counting swap), which are listed as /var and /state/partition.

So I guess what I thought was / was actually /var. I'm doing the 'deep' search for partitions now to see if it can find anything else on the partitions before going the reinstall OS route. Not sure if there are any other restore utils...but I'm going to wash my hands of this until they get a backup solution in place.
 
It looks like the root partition is unavailable and this is going to be an OS reinstall. Thanks for the suggestion of TestDisk. It at least lets me know that I can stop spinning my wheels on this and send it back to the guy that messed it up. 🙂
 
Yeah, without knowing more about how it was initially set up, you are really just digging in the dark. If you at least knew how many partitions there were, roughly what sizes, if LVM was used, etc., you might have had a chance. But as is, you were screwed from the beginning. I've managed to recover disks from corrupted partition tables and LVM layouts a few times, but you need to know how things were initially set up. Without that, you are simply screwed.
 