Bunch of libs hosed + ldconfig errors

EricMartello

Senior member
Apr 17, 2003
910
0
0
Ran into a problem I haven't seen before with no apparent cause. Yesterday one of my boxes "dropped off" the net and upon logging onto the console and reviewing the logs, I could see that somehow many shared libs got hosed. I don't know how that happened. We were able to get it back up but it's still throwing out errors like the ones below.

Anyone know how to fix this and what may have caused it in the first place? It is running CentOS 3.9. I am hoping that I can fix this without having to completely reinstall the OS. I have already tried reinstalling the packages that contain the .so files, but that just throws out more ldconfig errors...

/sbin/ldconfig: /usr/lib/libgmp.so.3.3.2 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libgmp.so.3 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libmp.so.3 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libslang-utf8.so.1.4.5 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libnewt.so.0.51.5 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libnewt.so.0.51 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libz.so.1.1.4 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libcursesw.so is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libform.so is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libgcj.so.3 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libnautilus-adapter.so.2 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: /usr/lib/libnautilus-adapter.so is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: Cannot mmap file /usr/lib/libssl.so.0.

/sbin/ldconfig: /usr/lib/libcupsimage.so.2 is not an ELF file - it has the wrong magic bytes at the start.

/sbin/ldconfig: Cannot mmap file /usr/lib/libssl.so.4.

There are many more of these errors, but you get the idea...
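For anyone hitting the same messages: the "wrong magic bytes" complaint means the file does not begin with the four-byte ELF signature (0x7f followed by "ELF"). A rough way to list every affected library in one pass, instead of reading ldconfig output line by line, might look like the sketch below. The function name `find_non_elf_libs` and the default directory are assumptions for illustration, not anything from the original system. Note that a few legitimate `.so` files (for example `libc.so` on many systems) are plain-text linker scripts and will show up here as false positives.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every valid ELF file


def find_non_elf_libs(lib_dir="/usr/lib"):
    """Return paths of *.so* files in lib_dir that do not start with the ELF magic.

    Hypothetical helper: it only inspects the first four bytes and does not
    validate the rest of the file.
    """
    bad = []
    for name in sorted(os.listdir(lib_dir)):
        if ".so" not in name:
            continue
        path = os.path.join(lib_dir, name)
        # isfile() follows symlinks; directories and dangling links are skipped.
        if not os.path.isfile(path):
            continue
        try:
            with open(path, "rb") as fh:
                head = fh.read(4)
        except OSError:
            bad.append(path)  # an unreadable library is worth a look too
            continue
        if head != ELF_MAGIC:
            bad.append(path)
    return bad


if __name__ == "__main__":
    for path in find_non_elf_libs():
        print(path)
```

Anything it prints is only a candidate for closer inspection; it says nothing about what damaged the file.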
 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
If you can I would run memtest and check for SMART errors from the drives.
 

EricMartello

Crusty said: "If you can I would run memtest and check for SMART errors from the drives."

Yeah, I did that, and I did a fsck on reboot too. The disks seem to be OK. This problem happened after performing an "optimize" command on a large MySQL table. The MySQL (v5.0) daemon crapped out and left the table in a "crashed" state. I repaired the table using myisamchk -r and that seemed to work, but about 3 hours later this happened. I don't know if the optimize command somehow triggered this, but it did run for a few hours before the MySQL daemon finally died.

In any case, is there some way to restore/replace the damaged libs so I can get all the services working again?
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
Reinstalling them is going to be the simplest. Do you mean that the packages won't reinstall at all, or that reinstalling one just causes more ldconfig errors about different libs?
 

EricMartello

Nothinman said: "Reinstalling them is going to be the simplest, do you mean that the packages won't reinstall at all or that reinstalling one just causes more ldconfig errors about different libs?"

Some of them would not reinstall until I deleted an old symlink which had been converted into a directory. After deleting the directory I was able to use rpm to do a --force install.

I spent a while replacing all the libs one by one and it seems to have fixed the problem, but I am still unsure why this happened in the first place. I am guessing the disk subsystem crapped out when running that optimize command on the MySQL database, because there are other services that hit the DB too... but it's only a guess.
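Replacing libs one by one goes faster if the damaged paths are first pulled out of the ldconfig output and mapped to their owning packages. Here is a minimal sketch of that step, assuming the two message shapes quoted earlier in this thread. The helper names `damaged_paths` and `owner_query_commands` are made up for illustration, and the commands are only built, not executed. `rpm -qf <file>` is the standard query for the package that owns a file.

```python
import re

# Patterns for the two ldconfig message shapes quoted in this thread.
_NOT_ELF = re.compile(r"^/sbin/ldconfig: (\S+) is not an ELF file")
_NO_MMAP = re.compile(r"^/sbin/ldconfig: Cannot mmap file (\S+?)\.?$")


def damaged_paths(ldconfig_output):
    """Return the unique library paths named in the error lines, in order."""
    paths = []
    for line in ldconfig_output.splitlines():
        line = line.strip()
        match = _NOT_ELF.match(line) or _NO_MMAP.match(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


def owner_query_commands(paths):
    """Build (but do not run) one `rpm -qf` argument list per damaged path."""
    return [["rpm", "-qf", path] for path in paths]


if __name__ == "__main__":
    sample = (
        "/sbin/ldconfig: /usr/lib/libz.so.1.1.4 is not an ELF file - "
        "it has the wrong magic bytes at the start.\n"
        "/sbin/ldconfig: Cannot mmap file /usr/lib/libssl.so.4.\n"
    )
    for cmd in owner_query_commands(damaged_paths(sample)):
        print(" ".join(cmd))
```

Each printed command can be run by hand to find the package to reinstall. The sketch deliberately stops short of reinstalling anything itself.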
 

Nothinman

If files magically became directories, then yeah, you had some filesystem corruption, which is most likely a hardware issue.
 

EricMartello

Nothinman said: "If files magically became directories, then yea you had some filesystem corruption which is most likely a hardware issue."

My working theory is that it is a combination of two main factors. First, the disks in this box are set up as a striped RAID array using Linux's built-in software RAID. Since that is fully CPU dependent and not handled by a secondary controller, it can falter when the disks get hit hard.

The MySQL optimize command that ran for hours only to crash the daemon when other disk-intensive processes were launched by cron resulted in an overload, which at that time may have corrupted some libs. For whatever reason the system decided to restart on its own and upon restarting the corrupted libs prevented most services from starting.

An on-site tech got SSH running again for me and I was able to fix the rest. What do you think? Plausible, or should I be backing up hourly now? :)
 

Nothinman

EricMartello said: "Since that is fully CPU dependent and not handled by a secondary controller, it can falter when the disks get hit hard."

No, just plain high I/O won't cause any issues with software RAID unless some hardware is at fault. Either the drives, cables, memory, etc. can handle the I/O or they can't, and if they can't you'll have problems regardless.

EricMartello said: "The MySQL optimize command that ran for hours only to crash the daemon when other disk-intensive processes were launched by cron resulted in an overload, which at that time may have corrupted some libs. For whatever reason the system decided to restart on its own and upon restarting the corrupted libs prevented most services from starting."

Which tends to indicate hardware as the problem. Cable noise, heat, memory, etc. are all more likely than software RAID itself.
 

Nothinman

You could set up a periodic rsync or something too, just in case. After the initial run it should be fairly quick.
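As a rough idea of what such a job could look like, here is a small wrapper that cron might call. The function names and the source and destination paths are placeholders, not anything from this box. `-a` is rsync's archive mode, and because rsync only transfers what changed, later runs are much quicker than the first.

```python
import subprocess


def rsync_command(src, dest):
    """Build the rsync argument list for mirroring the contents of src into dest."""
    # -a: archive mode (recursive; preserves permissions, times, and symlinks).
    # A trailing slash on the source copies its contents, not the directory itself.
    return ["rsync", "-a", src.rstrip("/") + "/", dest]


def run_backup(src, dest):
    """Run rsync once; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(rsync_command(src, dest), check=True)


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    print(" ".join(rsync_command("/etc", "/mnt/backup/etc")))
```

Only `rsync_command` is exercised in the example line. `run_backup` would be the piece a scheduled job actually calls.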