I'm at a loss, what keeps crashing my system?

Red Squirrel

No Lifer
May 24, 2003
67,526
12,193
126
www.anyf.ca
It did it two nights in a row now. I think I'll seriously look at doing a full-blown software upgrade to see if it goes away: a newer distro and a newer/different virtualization solution. Heck, I might just put Xen on there and do a P2V of the current server setup. It's time for me to get with the times and go fully virtual instead of hybrid. :biggrin:
 

Red Squirrel

OK, just an update on this issue. It's still present, and I have not sunk any more money into the server to try fixing it. But I think I may have another lead: it seems to happen more when I have my P2P VM running. The VM has Transmission on it, and I use it more or less as a torrent box for Linux distros and stuff. Once in a while I'll power-download a bunch of stuff at once (I'll have 50+ torrents going) and just leave it on. I'll close it if I need my internet, then let it go again. Once most of the torrents are done downloading, I just let them seed for a few weeks. So tons of packets are going in and out of that VM, as well as the server's physical NIC. I've recently been transferring a lot of stuff and have noticed my system crashing more often since then.

Could the issue actually be the NIC? My hunch was always that it was storage related, but now I'm starting to wonder if the NIC has been degraded since that big power outage, and when I overload it, something goes wonky that affects the system as a whole. It's an onboard NIC. I'm wondering if it's worth my while putting in a separate NIC, such as an Intel server NIC. I do have a spare PCIe x1 slot I could use.
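One cheap way to test the NIC theory before buying a card is to watch the interface error counters while the torrent VM is busy. Below is a minimal sketch, assuming the usual Linux `/proc/net/dev` layout (two header lines, then 16 counters per interface). The function name `parse_net_dev` and the sample numbers are made up for illustration and are not from this server.

```python
def parse_net_dev(text):
    """Return {iface: error/drop counters} parsed from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:      # the first two lines are column headers
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) < 16:
            continue                        # unexpected layout, skip the line
        values = [int(f) for f in fields[:16]]
        stats[name.strip()] = {
            "rx_errs": values[2],           # receive errors
            "rx_drop": values[3],           # receive drops
            "tx_errs": values[10],          # transmit errors
            "tx_drop": values[11],          # transmit drops
        }
    return stats


# Hypothetical sample, NOT taken from the server in this thread.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 9000000    6000   12    3    0     0          0         0  8000000    5000    4    0    0     0       0          0
"""

for iface, counters in parse_net_dev(SAMPLE).items():
    print(iface, counters)
```

On the host you would pass in `open("/proc/net/dev").read()` instead of `SAMPLE` and compare two readings taken a few minutes apart. Counters that climb under load would support the NIC theory; flat counters would not rule it out, but would make it less likely.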

Oh, and another reason my hunch is now on the P2P VM: it's the one that crashes the hardest when the host spits out those errors. I have another Linux VM acting as a SQL server for a dev and test environment, and that one has not been crashing as much. It will get a couple of IDE I/O errors on the console but continue to work. The P2P one will spam I/O errors at an insane rate and need a hard reset. The Windows VMs keep going normally.

I recently bought a dual-core server for cheap, so I may also look at loading Xen on it and migrating some of the stuff over.
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
The torrent VM probably just shows the most symptoms because it's the one doing the most I/O.
 

Red Squirrel

The SQL one is probably doing more I/O, though. MySQL does a lot of RAM caching, so maybe it's not as much as I think? The two Windows boxes that access it write about 10 queries per second to it.
 

Scarpozzi

Lifer
Jun 13, 2000
26,389
1,778
126
How long has it been since you've run fsck on your Fedora box? If you really lost RAID5 (multiple disk failure), bringing it back up will likely corrupt whatever files happened to be open in RW mode on the file system. I've worked with quite a few file systems over the years running RAID5 where the backplane failed and I was able to get the system up and running, but often you can get file corruption that won't cause a problem until that file or files are accessed again. It's always a huge YMMV.

I've worked primarily with NSS (NetWare) and NTFS (Windows) for those kinds of issues, but it would apply to Reiser, ext3, etc. All journaling file systems can get whacked the same way by a RAID failure.

My recommendation is to save all your VMs to another system (export the data) and do a clean install of Fedora/VirtualBox. If you want to troubleshoot some more, try running as many file system checks as you can to see if you can find/isolate any problems.
 

Red Squirrel

I did do a fsck on the RAID and it repaired a bunch of errors. Is that not good enough? Should I copy all the data to other storage, delete the RAID, rebuild/reformat, and then copy the data back? I do have backups, but they would have been overwritten by now.

It also happened again last night when I ran an external drive backup job. It seems anything with high I/O causes it.
 

LCTSI

Member
Aug 17, 2010
93
0
66
Good luck. This type of problem is particularly hard to track down.

My Ubuntu 10.04 install at work decided to do the "hung task" thing any time I would run `apt-get upgrade`. I could read the disk end to end without any problems. No hardware problems that I could find. puppet/aide never found anything changed.

I was about due for a refresh (that install had started as Ubuntu 8.04 anyway), so I just dropped the drive in the secondary slot of a new machine. (It runs fine in a VM!)

You might make a copy of grub menu.lst and try setting `highres=off` on the kernel command-line.

EDIT: I thought I had run memtest86 on the machine. After I threw it back into the pile, it got reimaged and deployed; there was a recent ticket for that service tag, and they say it had a bad stick of RAM. Check yo RAM: memtest86 for 24h.
 

rasczak

Lifer
Jan 29, 2005
10,453
22
81
I think I ran into something similar a few months ago after a power outage. Our Ultra 45 ended up toasting 16 GB of memory. If you haven't done so already, I would highly recommend doing a memory swap with a known-good source.
 

Red Squirrel

I had done a memtest and all was good, but I'm starting to wonder if it may still be worth a shot to just swap the RAM anyway. Can memtest sometimes "miss" an error, especially these random errors where it may not fail every time it hits that address? Or do RAM addresses either work or not?
 

Nothinman

If the hardware isn't failing consistently, then nothing will be able to detect it every time. That's why people recommend running more than one full pass if you're having issues.
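To put a rough number on that: suppose a flaky cell only trips a given pass with probability p. If passes are treated as independent (a simplification, since real faults often depend on temperature and access pattern), the chance that n passes catch it at least once is 1 - (1 - p)^n. A small sketch, where the 10% per-pass figure is invented purely for illustration:

```python
def detection_probability(p_per_pass, passes):
    """Chance that at least one of `passes` independent passes trips the fault."""
    return 1.0 - (1.0 - p_per_pass) ** passes


# 0.10 is a made-up per-pass detection rate, not a measured value.
for n in (1, 5, 10, 20):
    print(n, "passes:", round(detection_probability(0.10, n), 3))
```

The curve flattens as n grows, which is the argument for leaving the test running for many passes rather than stopping after one clean run.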
 

lxskllr

No Lifer
Nov 30, 2004
57,478
7,681
126
I usually let it go overnight. I'm satisfied if it goes at least 8 hours without failure.
 

Red Squirrel

Here we go again. This is a huge wave of crashes, with hardly any VMs running. This is the first time it's happened this badly while I'm actively using the system. Normally it happens overnight.


Code:
[root@borg ~]# dmesg 
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
 CIFS VFS: Error connecting to socket. Aborting operation
 CIFS VFS: cifs_mount failed w/return code = -113
ata4: exception Emask 0x10 SAct 0x0 SErr 0x4040000 action 0xe frozen
ata4: irq_stat 0x00000040, connection status changed
ata4: SError: { CommWake DevExch }
ata4: hard resetting link
ata4: SATA link down (SStatus 0 SControl 300)
ata4: EH complete
ata4: exception Emask 0x10 SAct 0x0 SErr 0x4040000 action 0xe frozen
ata4: irq_stat 0x00000040, connection status changed
ata4: SError: { CommWake DevExch }
ata4: hard resetting link
ata4: SATA link down (SStatus 0 SControl 300)
ata4: EH complete
INFO: task kjournald:1339 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald     D ffff8800905ab400     0  1339      2
 ffff880218e13ca0 0000000000000046 0000000000000086 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff88021ab88000 ffff8801007d2dc0
 ffff88021ab88348 00000001a02d4a6f ffff88021a470798 ffff88021ab88348
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffffa0024deb>] wait_on_buffer+0x41/0x45 [jbd]
 [<ffffffffa002540e>] journal_commit_transaction+0x55d/0xf2f [jbd]
 [<ffffffff81049a0d>] ? try_to_del_timer_sync+0x58/0x63
 [<ffffffffa0028ad8>] kjournald+0xe3/0x23a [jbd]
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffffa00289f5>] ? kjournald+0x0/0x23a [jbd]
 [<ffffffff810534bf>] kthread+0x49/0x76
 [<ffffffff81011719>] child_rip+0xa/0x11
 [<ffffffff81010a37>] ? restore_args+0x0/0x30
 [<ffffffff81053476>] ? kthread+0x0/0x76
 [<ffffffff8101170f>] ? child_rip+0x0/0x11

INFO: task VirtualBox:10082 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
VirtualBox    D ffff88015d0409c0     0 10082  12565
 ffff8801cac998f8 0000000000000086 0000000000000092 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff88020ed344a0 ffff8801fb1944a0
 ffff88020ed347e8 00000001a02d4a6f ffff88021a470798 ffff88020ed347e8
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffff810e154f>] wait_on_buffer+0x41/0x45
 [<ffffffff810e22fe>] __block_prepare_write+0x2ba/0x2fd
 [<ffffffffa0037e85>] ? ext3_get_block+0x0/0xfc [ext3]
 [<ffffffff8108d627>] ? add_to_page_cache_locked+0x9a/0xae
 [<ffffffff810e24b6>] block_write_begin+0x86/0xd8
 [<ffffffffa00374c3>] ext3_write_begin+0xdf/0x1a7 [ext3]
 [<ffffffffa0037e85>] ? ext3_get_block+0x0/0xfc [ext3]
 [<ffffffff8108e0fa>] generic_file_buffered_write+0x14b/0x643
 [<ffffffff810d623b>] ? mnt_drop_write+0x82/0x143
 [<ffffffff8108e9e7>] __generic_file_aio_write_nolock+0x25e/0x292
 [<ffffffff8108f233>] generic_file_aio_write+0x67/0xc3
 [<ffffffffa003443f>] ext3_file_write+0x1e/0x9f [ext3]
 [<ffffffff810beb88>] do_sync_write+0xe7/0x12d
 [<ffffffffa04268a2>] ? RTMemFree+0x1e/0x20 [vboxdrv]
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff812c20fa>] ? _spin_lock+0x9/0xc
 [<ffffffff81031103>] ? need_resched+0x1e/0x28
 [<ffffffff81120e84>] ? security_file_permission+0x11/0x13
 [<ffffffff810bf444>] vfs_write+0xab/0x105
 [<ffffffff810bf562>] sys_write+0x47/0x6f
 [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b

INFO: task kjournald:1339 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald     D ffff8800905ab400     0  1339      2
 ffff880218e13ca0 0000000000000046 0000000000000086 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff88021ab88000 ffff8801007d2dc0
 ffff88021ab88348 00000001a02d4a6f ffff88021a470798 ffff88021ab88348
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffffa0024deb>] wait_on_buffer+0x41/0x45 [jbd]
 [<ffffffffa002540e>] journal_commit_transaction+0x55d/0xf2f [jbd]
 [<ffffffff81049a0d>] ? try_to_del_timer_sync+0x58/0x63
 [<ffffffffa0028ad8>] kjournald+0xe3/0x23a [jbd]
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffffa00289f5>] ? kjournald+0x0/0x23a [jbd]
 [<ffffffff810534bf>] kthread+0x49/0x76
 [<ffffffff81011719>] child_rip+0xa/0x11
 [<ffffffff81010a37>] ? restore_args+0x0/0x30
 [<ffffffff81053476>] ? kthread+0x0/0x76
 [<ffffffff8101170f>] ? child_rip+0x0/0x11

INFO: task VirtualBox:10082 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
VirtualBox    D ffff88015d0409c0     0 10082  12565
 ffff8801cac998f8 0000000000000086 0000000000000092 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff88020ed344a0 ffff8801fb1944a0
 ffff88020ed347e8 00000001a02d4a6f ffff88021a470798 ffff88020ed347e8
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffff810e154f>] wait_on_buffer+0x41/0x45
 [<ffffffff810e22fe>] __block_prepare_write+0x2ba/0x2fd
 [<ffffffffa0037e85>] ? ext3_get_block+0x0/0xfc [ext3]
 [<ffffffff8108d627>] ? add_to_page_cache_locked+0x9a/0xae
 [<ffffffff810e24b6>] block_write_begin+0x86/0xd8
 [<ffffffffa00374c3>] ext3_write_begin+0xdf/0x1a7 [ext3]
 [<ffffffffa0037e85>] ? ext3_get_block+0x0/0xfc [ext3]
 [<ffffffff8108e0fa>] generic_file_buffered_write+0x14b/0x643
 [<ffffffff810d623b>] ? mnt_drop_write+0x82/0x143
 [<ffffffff8108e9e7>] __generic_file_aio_write_nolock+0x25e/0x292
 [<ffffffff8108f233>] generic_file_aio_write+0x67/0xc3
 [<ffffffffa003443f>] ext3_file_write+0x1e/0x9f [ext3]
 [<ffffffff810beb88>] do_sync_write+0xe7/0x12d
 [<ffffffffa04268a2>] ? RTMemFree+0x1e/0x20 [vboxdrv]
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff812c20fa>] ? _spin_lock+0x9/0xc
 [<ffffffff81031103>] ? need_resched+0x1e/0x28
 [<ffffffff81120e84>] ? security_file_permission+0x11/0x13
 [<ffffffff810bf444>] vfs_write+0xab/0x105
 [<ffffffff810bf562>] sys_write+0x47/0x6f
 [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b

INFO: task dir:23398 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dir           D 0000000000000002     0 23398  12627
 ffff880054579a78 0000000000000086 0000000000000096 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff8801f1438000 ffff88021f1e2dc0
 ffff8801f1438348 00000001a02d4a6f ffff88021a470798 ffff8801f1438348
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff8113cb81>] ? submit_bio+0xe0/0xe9
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffffa0035cc7>] wait_on_buffer+0x41/0x45 [ext3]
 [<ffffffffa0035f4a>] __ext3_get_inode_loc+0x27f/0x2d8 [ext3]
 [<ffffffffa0036003>] ext3_iget+0x60/0x398 [ext3]
 [<ffffffffa003bea8>] ext3_lookup+0x81/0xc5 [ext3]
 [<ffffffff810d06fd>] ? d_alloc+0x18e/0x19b
 [<ffffffff810c63de>] do_lookup+0xd3/0x15d
 [<ffffffff810c7b05>] __link_path_walk+0x602/0x754
 [<ffffffff810c8125>] path_walk+0x61/0xc4
 [<ffffffff810c838c>] do_path_lookup+0x165/0x1be
 [<ffffffff810c999e>] user_path_at+0x52/0x8c
 [<ffffffff810c8e6e>] ? putname+0x30/0x39
 [<ffffffff810c99a9>] ? user_path_at+0x5d/0x8c
 [<ffffffff810c21bc>] vfs_lstat_fd+0x1e/0x4b
 [<ffffffff810d518e>] ? mntput_no_expire+0x31/0x144
 [<ffffffff810c220b>] sys_newlstat+0x22/0x3c
 [<ffffffff810c60bf>] ? path_put+0x1d/0x21
 [<ffffffff8107cf8d>] ? audit_syscall_entry+0x101/0x135
 [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b

INFO: task FahCore_a3.exe:11036 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
FahCore_a3.ex D 0000000000000002     0 11036   3111
 ffff8801c929fd08 0000000000000086 0000000000000001 00000000014285f9
 ffffffff8162a500 ffffffff8162a500 ffff8801007d2dc0 ffff88021a932dc0
 ffff8801007d3108 0000000100000000 ffff8801c929fce8 ffff8801007d3108
Call Trace:
 [<ffffffffa00285e0>] log_wait_commit+0xbd/0x116 [jbd]
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff81094370>] ? write_cache_pages+0x179/0x3b4
 [<ffffffffa0023c01>] journal_stop+0x189/0x1c1 [jbd]
 [<ffffffffa0024d09>] journal_force_commit+0x23/0x26 [jbd]
 [<ffffffffa003e668>] ext3_force_commit+0x26/0x28 [ext3]
 [<ffffffffa0035c47>] ext3_write_inode+0x39/0x3f [ext3]
 [<ffffffff810dbfd1>] __writeback_single_inode+0x1de/0x332
 [<ffffffff810dc14d>] sync_inode+0x28/0x40
 [<ffffffffa0034562>] ext3_sync_file+0xa2/0xb0 [ext3]
 [<ffffffff810df337>] do_fsync+0x55/0x8a
 [<ffffffff810df39a>] __do_fsync+0x2e/0x44
 [<ffffffff810df3cb>] sys_fsync+0xb/0xd
 [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b

INFO: task kjournald:1339 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald     D ffff8800905ab400     0  1339      2
 ffff880218e13ca0 0000000000000046 0000000000000086 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff88021ab88000 ffff8801007d2dc0
 ffff88021ab88348 00000001a02d4a6f ffff88021a470798 ffff88021ab88348
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffffa0024deb>] wait_on_buffer+0x41/0x45 [jbd]
 [<ffffffffa002540e>] journal_commit_transaction+0x55d/0xf2f [jbd]
 [<ffffffff81049a0d>] ? try_to_del_timer_sync+0x58/0x63
 [<ffffffffa0028ad8>] kjournald+0xe3/0x23a [jbd]
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffffa00289f5>] ? kjournald+0x0/0x23a [jbd]
 [<ffffffff810534bf>] kthread+0x49/0x76
 [<ffffffff81011719>] child_rip+0xa/0x11
 [<ffffffff81010a37>] ? restore_args+0x0/0x30
 [<ffffffff81053476>] ? kthread+0x0/0x76
 [<ffffffff8101170f>] ? child_rip+0x0/0x11

INFO: task VirtualBox:10082 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
VirtualBox    D ffff88015d0409c0     0 10082  12565
 ffff8801cac998f8 0000000000000086 0000000000000092 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff88020ed344a0 ffff8801fb1944a0
 ffff88020ed347e8 00000001a02d4a6f ffff88021a470798 ffff88020ed347e8
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffff810e154f>] wait_on_buffer+0x41/0x45
 [<ffffffff810e22fe>] __block_prepare_write+0x2ba/0x2fd
 [<ffffffffa0037e85>] ? ext3_get_block+0x0/0xfc [ext3]
 [<ffffffff8108d627>] ? add_to_page_cache_locked+0x9a/0xae
 [<ffffffff810e24b6>] block_write_begin+0x86/0xd8
 [<ffffffffa00374c3>] ext3_write_begin+0xdf/0x1a7 [ext3]
 [<ffffffffa0037e85>] ? ext3_get_block+0x0/0xfc [ext3]
 [<ffffffff8108e0fa>] generic_file_buffered_write+0x14b/0x643
 [<ffffffff810d623b>] ? mnt_drop_write+0x82/0x143
 [<ffffffff8108e9e7>] __generic_file_aio_write_nolock+0x25e/0x292
 [<ffffffff8108f233>] generic_file_aio_write+0x67/0xc3
 [<ffffffffa003443f>] ext3_file_write+0x1e/0x9f [ext3]
 [<ffffffff810beb88>] do_sync_write+0xe7/0x12d
 [<ffffffffa04268a2>] ? RTMemFree+0x1e/0x20 [vboxdrv]
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff812c20fa>] ? _spin_lock+0x9/0xc
 [<ffffffff81031103>] ? need_resched+0x1e/0x28
 [<ffffffff81120e84>] ? security_file_permission+0x11/0x13
 [<ffffffff810bf444>] vfs_write+0xab/0x105
 [<ffffffff810bf562>] sys_write+0x47/0x6f
 [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b

INFO: task dir:23398 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dir           D 0000000000000002     0 23398  12627
 ffff880054579a78 0000000000000086 0000000000000096 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff8801f1438000 ffff88021f1e2dc0
 ffff8801f1438348 00000001a02d4a6f ffff88021a470798 ffff8801f1438348
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff8113cb81>] ? submit_bio+0xe0/0xe9
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffffa0035cc7>] wait_on_buffer+0x41/0x45 [ext3]
 [<ffffffffa0035f4a>] __ext3_get_inode_loc+0x27f/0x2d8 [ext3]
 [<ffffffffa0036003>] ext3_iget+0x60/0x398 [ext3]
 [<ffffffffa003bea8>] ext3_lookup+0x81/0xc5 [ext3]
 [<ffffffff810d06fd>] ? d_alloc+0x18e/0x19b
 [<ffffffff810c63de>] do_lookup+0xd3/0x15d
 [<ffffffff810c7b05>] __link_path_walk+0x602/0x754
 [<ffffffff810c8125>] path_walk+0x61/0xc4
 [<ffffffff810c838c>] do_path_lookup+0x165/0x1be
 [<ffffffff810c999e>] user_path_at+0x52/0x8c
 [<ffffffff810c8e6e>] ? putname+0x30/0x39
 [<ffffffff810c99a9>] ? user_path_at+0x5d/0x8c
 [<ffffffff810c21bc>] vfs_lstat_fd+0x1e/0x4b
 [<ffffffff810d518e>] ? mntput_no_expire+0x31/0x144
 [<ffffffff810c220b>] sys_newlstat+0x22/0x3c
 [<ffffffff810c60bf>] ? path_put+0x1d/0x21
 [<ffffffff8107cf8d>] ? audit_syscall_entry+0x101/0x135
 [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b

INFO: task FahCore_a3.exe:11036 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
FahCore_a3.ex D 0000000000000002     0 11036   3111
 ffff8801c929fd08 0000000000000086 0000000000000001 00000000014285f9
 ffffffff8162a500 ffffffff8162a500 ffff8801007d2dc0 ffff88021a932dc0
 ffff8801007d3108 0000000100000000 ffff8801c929fce8 ffff8801007d3108
Call Trace:
 [<ffffffffa00285e0>] log_wait_commit+0xbd/0x116 [jbd]
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff81094370>] ? write_cache_pages+0x179/0x3b4
 [<ffffffffa0023c01>] journal_stop+0x189/0x1c1 [jbd]
 [<ffffffffa0024d09>] journal_force_commit+0x23/0x26 [jbd]
 [<ffffffffa003e668>] ext3_force_commit+0x26/0x28 [ext3]
 [<ffffffffa0035c47>] ext3_write_inode+0x39/0x3f [ext3]
 [<ffffffff810dbfd1>] __writeback_single_inode+0x1de/0x332
 [<ffffffff810dc14d>] sync_inode+0x28/0x40
 [<ffffffffa0034562>] ext3_sync_file+0xa2/0xb0 [ext3]
 [<ffffffff810df337>] do_fsync+0x55/0x8a
 [<ffffffff810df39a>] __do_fsync+0x2e/0x44
 [<ffffffff810df3cb>] sys_fsync+0xb/0xd
 [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b

INFO: task dir:23532 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dir           D ffff8800905aad80     0 23532  27593
 ffff8800a9a5ba78 0000000000000082 0000000000000096 ffff88021ab28f30
 ffffffff8162a500 ffffffff8162a500 ffff88006dfc2dc0 ffff8801fb132dc0
 ffff88006dfc3108 00000001a02d4a6f ffff88021a470798 ffff88006dfc3108
Call Trace:
 [<ffffffff8101686f>] ? read_tsc+0xe/0x24
 [<ffffffff810590b6>] ? getnstimeofday+0x54/0xb0
 [<ffffffff812c08a3>] io_schedule+0x63/0xa5
 [<ffffffff810e15e1>] sync_buffer+0x3b/0x3f
 [<ffffffff812c0de1>] __wait_on_bit+0x47/0x79
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff810e15a6>] ? sync_buffer+0x0/0x3f
 [<ffffffff812c0e7d>] out_of_line_wait_on_bit+0x6a/0x77
 [<ffffffff81053861>] ? wake_bit_function+0x0/0x2a
 [<ffffffff8113cb81>] ? submit_bio+0xe0/0xe9
 [<ffffffff810e150a>] __wait_on_buffer+0x36/0x3a
 [<ffffffffa0035cc7>] wait_on_buffer+0x41/0x45 [ext3]
 [<ffffffffa0035f4a>] __ext3_get_inode_loc+0x27f/0x2d8 [ext3]
 [<ffffffffa0036003>] ext3_iget+0x60/0x398 [ext3]
 [<ffffffffa003bea8>] ext3_lookup+0x81/0xc5 [ext3]
 [<ffffffff810d06fd>] ? d_alloc+0x18e/0x19b
 [<ffffffff810c63de>] do_lookup+0xd3/0x15d
 [<ffffffff810c7b05>] __link_path_walk+0x602/0x754
 [<ffffffff810c5fa9>] ? mntput+0x18/0x1a
 [<ffffffff810c8125>] path_walk+0x61/0xc4
 [<ffffffff810c838c>] do_path_lookup+0x165/0x1be
 [<ffffffff810c999e>] user_path_at+0x52/0x8c
 [<ffffffff810c1f60>] ? cp_new_stat+0xe2/0xef
 [<ffffffff812c20fa>] ? _spin_lock+0x9/0xc
 [<ffffffff810d623b>] ? mnt_drop_write+0x82/0x143
 [<ffffffff810c21bc>] vfs_lstat_fd+0x1e/0x4b
 [<ffffffff810c220b>] sys_newlstat+0x22/0x3c
 [<ffffffff810c60bf>] ? path_put+0x1d/0x21
 [<ffffffff8107cf8d>] ? audit_syscall_entry+0x101/0x135
 [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b


Damn, I need to build a new server; this thing is on its last legs. I just can't afford it.

Even "dir" locks up. It's almost like the whole raid array crapped out, yet mdadm is showing no signs of issues. Everything is locking up. This is brutal.
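A paste that long is easier to read once the repeats are collapsed. Here is a rough sketch that counts the "blocked for more than N seconds" reports per task. The sample lines are a subset copied from the dmesg output above; the function name `summarize_hung_tasks` is just a placeholder.

```python
import re
from collections import Counter

# Matches lines such as: INFO: task kjournald:1339 blocked for more than 120 seconds.
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def summarize_hung_tasks(dmesg_text):
    """Count hung-task reports per (task name, pid)."""
    counts = Counter()
    for match in HUNG_RE.finditer(dmesg_text):
        counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Subset of the INFO lines from the paste above.
SAMPLE = """\
INFO: task kjournald:1339 blocked for more than 120 seconds.
INFO: task VirtualBox:10082 blocked for more than 120 seconds.
INFO: task kjournald:1339 blocked for more than 120 seconds.
INFO: task dir:23398 blocked for more than 120 seconds.
INFO: task FahCore_a3.exe:11036 blocked for more than 120 seconds.
"""

for (name, pid), n in summarize_hung_tasks(SAMPLE).most_common():
    print(f"{name} (pid {pid}): {n} report(s)")
```

In the paste, the call traces for kjournald, VirtualBox and dir all pass through `io_schedule`, and FahCore is waiting in `log_wait_commit`, so the tasks appear to be stuck waiting on disk I/O or the ext3 journal rather than failing independently.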


Edit: Everything is back to normal now, as if it never happened. Oddly, I just noticed that with Windows 7, when you lose a network drive you have to reboot. Is this normal? In XP I could just try connecting to it again and it would pick it up.
 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
One of your disks looks like it is not responding to commands over SATA for some reason and the driver is trying to reset the device. Have you replaced all of your drive cabling?
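For reference, the lines behind that reading are the `ata4:` messages in the dmesg paste. A small sketch that groups libata messages by port and lists ports that reported a link down; the sample is copied from the paste, and the helper names are placeholders:

```python
import re
from collections import defaultdict

# Optional "[  123.456]" timestamp prefix, then "ataN:" or "ataN.MM:" and the message.
ATA_RE = re.compile(r"^(?:\[\s*\d+\.\d+\]\s*)?(ata\d+(?:\.\d+)?): (.*)$", re.MULTILINE)


def ata_events(dmesg_text):
    """Group libata messages by port, e.g. {'ata4': ['hard resetting link', ...]}."""
    events = defaultdict(list)
    for port, message in ATA_RE.findall(dmesg_text):
        events[port].append(message.strip())
    return dict(events)


def ports_with_link_down(events):
    """Return the ports that logged a 'SATA link down' message."""
    return sorted(
        port for port, msgs in events.items()
        if any(m.startswith("SATA link down") for m in msgs)
    )


# Copied from the dmesg paste earlier in the thread.
SAMPLE = """\
ata4: exception Emask 0x10 SAct 0x0 SErr 0x4040000 action 0xe frozen
ata4: irq_stat 0x00000040, connection status changed
ata4: SError: { CommWake DevExch }
ata4: hard resetting link
ata4: SATA link down (SStatus 0 SControl 300)
ata4: EH complete
"""

events = ata_events(SAMPLE)
print(ports_with_link_down(events))
```

Run against the full `dmesg` output, this would show whether ata4 is the only port logging resets or whether other ports do it too.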
 

Red Squirrel

Yep, all the drives, cables and backplanes were replaced. I did notice that error about the disk reset too, but I'm wondering why mdadm is not picking it up as a failed drive. It would be nice to know which disk it is so I can at least manually fail it and replace it.
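On working out which disk sits behind ata4: one possible approach is to resolve each `/sys/block/sdX` entry and look for an `ataN` component in the path. This is a sketch under a big assumption, namely that the kernel exposes `ataN` in the sysfs device path. Newer kernels do; an older one may not, in which case the boot-time `ataN` lines in dmesg are the fallback. The example path is hypothetical.

```python
import glob
import os
import re


def ata_port_from_sysfs_path(path):
    """Return 'ataN' if the resolved sysfs path has an ataN component, else None."""
    match = re.search(r"/(ata\d+)/", path)
    return match.group(1) if match else None


def map_disks_to_ata_ports():
    """Map sdX -> ataN (or None) by resolving the /sys/block/sdX entries."""
    mapping = {}
    for block in sorted(glob.glob("/sys/block/sd*")):
        resolved = os.path.realpath(block)
        mapping[os.path.basename(block)] = ata_port_from_sysfs_path(resolved)
    return mapping


# Hypothetical resolved path, for illustration only.
example = "/sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/block/sdd"
print(ata_port_from_sysfs_path(example))
print(map_disks_to_ata_ports())
```

Also worth noting: in the paste, the ata4 sequence ends with "SATA link down", so it is possible that no disk is attached to that port at the moment the error fires. That might be one reason mdadm sees nothing to fail, though this is a guess from the log rather than something confirmed.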


But this led me to realize I have not done a SMART check in a while. This is VERY bad... it seems almost all the drives are failing:




Code:
[root@borg raid1]# smartctl /dev/sda
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

[root@borg raid1]# 
[root@borg raid1]# 
[root@borg raid1]# smartctl -a /dev/sda
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD6402AAEX-00Z3A0
Serial Number:    WD-WCATR4058891
Firmware Version: 05.01D05
User Capacity:    640,135,028,736 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 27 17:45:58 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (12360) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 145) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3037)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   174   170   021    Pre-fail  Always       -       4300
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       78
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   087   087   000    Old_age   Always       -       10218
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       76
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       35
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       42
194 Temperature_Celsius     0x0022   118   098   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[root@borg raid1]# smartctl -a /dev/sdb
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    WD-WCAW32643966
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 27 17:46:04 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (16800) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 173) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   178   175   021    Pre-fail  Always       -       4091
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       34
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       2559
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       32
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       22
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       13
194 Temperature_Celsius     0x0022   118   101   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[root@borg raid1]# smartctl -a /dev/sdc
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    WD-WCAW32668254
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 27 17:46:19 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (16800) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 173) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       2
  3 Spin_Up_Time            0x0027   177   174   021    Pre-fail  Always       -       4108
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       34
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       2537
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       32
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       22
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       13
194 Temperature_Celsius     0x0022   118   099   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[root@borg raid1]# smartctl -a /dev/sdd
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    WD-WCAW32467397
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 27 17:46:30 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (16680) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 172) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       3
  3 Spin_Up_Time            0x0027   174   172   021    Pre-fail  Always       -       4266
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       34
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       2563
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       32
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       22
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       13
194 Temperature_Celsius     0x0022   118   105   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[root@borg raid1]# smartctl -a /dev/sde
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    WD-WCAW32071637
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 27 17:46:40 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (16800) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 173) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  3 Spin_Up_Time            0x0027   173   172   021    Pre-fail  Always       -       4308
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       43
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3436
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       42
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       27
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       15
194 Temperature_Celsius     0x0022   118   100   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       4

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[root@borg raid1]# smartctl -a /dev/sdf
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    WD-WCAW32590153
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 27 17:46:46 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (16860) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 174) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       2
  3 Spin_Up_Time            0x0027   177   175   021    Pre-fail  Always       -       4133
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       25
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1673
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       23
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       17
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       9
194 Temperature_Celsius     0x0022   116   104   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[root@borg raid1]# smartctl -a /dev/sdg
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    WD-WCAW31551813
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 27 17:46:54 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (16680) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 172) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   174   173   021    Pre-fail  Always       -       4258
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       42
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3109
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       40
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       27
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       16
194 Temperature_Celsius     0x0022   117   100   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[root@borg raid1]#



The first one is the OS drive; then sdg, sdb, sdc, sdd, sde and sdf are part of the RAID. This is exactly what happened to the other drives I replaced: they all slowly started dying. What could be causing all these drives to fail? Bad PSU? When I tested it the voltage was stable, but maybe it spikes or something?
 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
Why do you think your drives are failing? smartctl says PASSED for every drive you tested.
 

Red Squirrel

No Lifer
May 24, 2003
67,526
12,193
126
www.anyf.ca
If you look at the stats, most of them have a nonzero raw value for raw read error rate, and a couple have one for multi zone error rate. They're also running at 30C; is that maybe too hot, and why they're failing? Though that would not explain this issue in winter, when the drives run at around 20C.

I guess I should probably run a long test to see what that says; those are just the real-time stats.
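
For reference, a minimal Python sketch of what running the long test on each RAID member could look like. It only builds and prints the smartctl command lines and does not execute anything. The helper name long_test_commands and the device list are assumptions for this example; `-t long` starts the extended self-test and `-l selftest` reads the self-test log afterwards. The output above gives a recommended polling time of roughly 172 to 174 minutes per drive.

```python
# Hypothetical helper: builds (but does not run) the smartctl commands
# for an extended self-test on one device.
def long_test_commands(device):
    start = ["smartctl", "-t", "long", device]          # start the extended self-test
    results = ["smartctl", "-l", "selftest", device]    # read the self-test log later
    return start, results


if __name__ == "__main__":
    # Assumed device list: the six RAID members named in this thread.
    raid_members = ["/dev/sdb", "/dev/sdc", "/dev/sdd",
                    "/dev/sde", "/dev/sdf", "/dev/sdg"]
    for dev in raid_members:
        start, results = long_test_commands(dev)
        print(" ".join(start))
        print(" ".join(results))
```

The commands would need to be run as root, and the log is only worth reading once the test has had time to finish.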
 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
I think you are reading the SMART values incorrectly. The first thing to look for is something in the WHEN_FAILED column; you've got none. The second thing would be to look for any VALUE that is at or below its corresponding THRESH value, but since nothing shows up in WHEN_FAILED, none of your VALUEs will have dropped to their THRESH. Thus, according to your drive manufacturer, your drives are still within normal operating specs.
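
As a rough illustration of the check described above, here is a minimal Python sketch (not part of smartmontools) that scans the attribute table from smartctl -a and reports any row where WHEN_FAILED is not "-" or where VALUE is at or below THRESH. The function name failing_attributes, the regular expression, and the choice to skip rows whose THRESH is 000 are assumptions made for this example. The sample rows are copied from the /dev/sde output earlier in the thread.

```python
import re

# One row of the "Vendor Specific SMART Attributes with Thresholds" table.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>0x[0-9a-fA-F]+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>.+?)\s*$"
)


def failing_attributes(smart_output):
    """Return (attribute name, reason) pairs for rows that look failed."""
    problems = []
    for line in smart_output.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header lines and anything else that is not a table row
        value, thresh = int(m["value"]), int(m["thresh"])
        if m["when_failed"] != "-":
            problems.append((m["name"], "WHEN_FAILED=" + m["when_failed"]))
        elif thresh > 0 and value <= thresh:
            # Assumption: a THRESH of 000 is treated as "cannot fail".
            problems.append((m["name"], f"VALUE {value} <= THRESH {thresh}"))
    return problems


SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       4
"""

if __name__ == "__main__":
    print(failing_attributes(SAMPLE))  # prints []
```

Note that the nonzero RAW_VALUE entries do not trip the check: only the normalized VALUE and the WHEN_FAILED column are compared, which is the point made above.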
 

Red Squirrel

No Lifer
May 24, 2003
67,526
12,193
126
www.anyf.ca
Oh ok, that's good to know. I was looking at the raw value. Some of them have a 1 or 2 for the error rate. So as long as "when failed" is blank then I'm good?
 

afsajghfd

Junior Member
Aug 20, 2012
1
0
0
Hi all, I would like to know if there is any news on this topic. Our hosting currently seems to have the same malfunction.

Thanks!
 

Red Squirrel

No Lifer
May 24, 2003
67,526
12,193
126
www.anyf.ca
It's been happening to me a lot again lately. I recently deployed a new VM for environmental monitoring. It's something temporary, as I am working on my own app to put on my physical environmental monitoring server. I could not figure out the dependencies to install it manually on that server, so I just used a VM appliance for now while I code my own app that will run on the physical box.

But since I've had that VM running it's been happening more. So I'm really starting to wonder if it really is a VirtualBox-related issue.

Once I take down that VM I will probably try a different VM solution such as KVM to see if the issues go away. I have a few other VMs for stuff such as torrents but I can easily just rebuild those.
 

Red Squirrel

No Lifer
May 24, 2003
67,526
12,193
126
www.anyf.ca
Still does it every now and then. Been a while since the last time though. Never really found out the root cause. Gave up on it. Been wanting to build a new server so I can rule out hardware and software all at the same time and hopefully put this issue behind me.

I did shut down my UO game server though, so the dev environment is down and no longer needed (3 VMs). I only have a p2p VM running and I can easily remake that if I want to. That opens me up to another VM solution. Might look at KVM.
 

Red Squirrel

No Lifer
May 24, 2003
67,526
12,193
126
www.anyf.ca
Well, I jinxed myself. It did it again, and one of my custom apps gave an error writing to the log file, indicating there must have been some I/O hangup at the time.

Code:
ssh: page allocation failure. order:5, mode:0x4020
Pid: 14666, comm: ssh Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff812412cb>] dev_queue_xmit+0x344/0x453
 [<ffffffff81265ff9>] ip_finish_output2+0x1a8/0x1ea
 [<ffffffff8104502f>] ? local_bh_enable+0xd/0xf
 [<ffffffff812660a3>] ip_finish_output+0x68/0x6a
 [<ffffffff81266140>] ip_output+0x9b/0xa0
 [<ffffffff8126545a>] ip_local_out+0x20/0x24
 [<ffffffff81265c92>] ip_queue_xmit+0x2c9/0x31f
 [<ffffffff81049c35>] ? __mod_timer+0xbb/0xcd
 [<ffffffff81275586>] ? tcp_current_mss+0x74/0xf4
 [<ffffffff812760c1>] tcp_transmit_skb+0x60b/0x64e
 [<ffffffff812947be>] ? bictcp_acked+0x46/0x65
 [<ffffffff8127898b>] __tcp_push_pending_frames+0x725/0x80f
 [<ffffffff81275586>] ? tcp_current_mss+0x74/0xf4
 [<ffffffff812738aa>] tcp_data_snd_check+0x27/0x118
 [<ffffffff81273dd0>] tcp_rcv_established+0xf2/0x853
 [<ffffffff8127b8b4>] tcp_v4_do_rcv+0x1db/0x389
 [<ffffffff8104502f>] ? local_bh_enable+0xd/0xf
 [<ffffffff8127bed7>] tcp_v4_rcv+0x475/0x6a8
 [<ffffffffa05d12b3>] ? vboxNetFltLinuxForwardSegment+0x66/0x75 [vboxnetflt]
 [<ffffffff81261a0d>] ip_local_deliver_finish+0x103/0x19f
 [<ffffffff81261b1b>] ip_local_deliver+0x72/0x7a
 [<ffffffff81261601>] ip_rcv_finish+0x305/0x321
 [<ffffffff8126187c>] ip_rcv+0x25f/0x295
 [<ffffffff81240875>] netif_receive_skb+0x3e6/0x40b
 [<ffffffffa0106ce1>] e1000_receive_skb+0x5b/0x76 [e1000e]
 [<ffffffffa0106ed8>] e1000_clean_rx_irq+0x1dc/0x28a [e1000e]
 [<ffffffff811478ce>] ? cfq_idle_slice_timer+0x0/0x9f
 [<ffffffffa01054e8>] e1000_clean+0x94/0x2a4 [e1000e]
 [<ffffffff8123ee45>] net_rx_action+0xd4/0x1fd
 [<ffffffff8104515a>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d2f>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI> 
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:   0
CPU    2: hi:  186, btch:  31 usd:  30
CPU    3: hi:  186, btch:  31 usd:  33
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:   0
CPU    2: hi:  186, btch:  31 usd:  47
CPU    3: hi:  186, btch:  31 usd:  62
Active:132283 inactive:1508049 dirty:161830 writeback:2625 unstable:0
 free:10126 slab:90087 mapped:168019 pagetables:12444 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:24016kB min:4700kB low:5872kB high:7048kB active:202620kB inactive:2616020kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:7044kB min:6588kB low:8232kB high:9880kB active:326512kB inactive:3416176kB present:4654080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5309*4kB 98*8kB 90*16kB 10*32kB 2*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 24164kB
Node 0 Normal: 1464*4kB 15*8kB 2*16kB 6*32kB 4*64kB 4*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7224kB
1612609 total pagecache pages
58515 pages in swap cache
Swap cache stats: add 3704262, delete 3645747, find 11825741/12339098
Free swap  = 163321668kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
577140 pages shared
1279163 pages non-shared
nfsd: page allocation failure. order:5, mode:0x4020
Pid: 7697, comm: nfsd Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff812412cb>] dev_queue_xmit+0x344/0x453
 [<ffffffff81265ff9>] ip_finish_output2+0x1a8/0x1ea
 [<ffffffff812c0936>] ? _cond_resched+0x9/0x38
 [<ffffffff812660a3>] ip_finish_output+0x68/0x6a
 [<ffffffff81266140>] ip_output+0x9b/0xa0
 [<ffffffff8126545a>] ip_local_out+0x20/0x24
 [<ffffffff81265c92>] ip_queue_xmit+0x2c9/0x31f
 [<ffffffff810d1570>] ? iput+0x2f/0x65
 [<ffffffff810d0dcb>] ? d_alloc_anon+0x25/0x123
 [<ffffffff812760c1>] tcp_transmit_skb+0x60b/0x64e
 [<ffffffff8127898b>] __tcp_push_pending_frames+0x725/0x80f
 [<ffffffff81275586>] ? tcp_current_mss+0x74/0xf4
 [<ffffffff810dde8f>] ? page_cache_pipe_buf_release+0x14/0x1d
 [<ffffffff8126b1dd>] tcp_push+0x8b/0x8d
 [<ffffffff8126dd30>] tcp_sendpage+0x458/0x499
 [<ffffffff81233ae0>] kernel_sendpage+0x16/0x1f
 [<ffffffffa0383760>] svc_sendto+0x16e/0x2a5 [sunrpc]
 [<ffffffff81120d00>] ? security_inode_getattr+0x20/0x26
 [<ffffffffa06c3fd4>] ? encode_fattr3+0x137/0x13f [nfsd]
 [<ffffffffa06c4054>] ? encode_post_op_attr+0x78/0x93 [nfsd]
 [<ffffffff810b7ac7>] ? virt_to_head_page+0x31/0x41
 [<ffffffffa03842aa>] svc_tcp_sendto+0x52/0xb5 [sunrpc]
 [<ffffffff812c0ed5>] ? mutex_lock+0x22/0x33
 [<ffffffffa038c6ec>] svc_send+0x70/0x9e [sunrpc]
 [<ffffffffa03825c1>] svc_process+0x467/0x63f [sunrpc]
 [<ffffffff812c1edb>] ? __down_read+0x3d/0xbd
 [<ffffffffa06b8868>] nfsd+0x147/0x1a5 [nfsd]
 [<ffffffffa06b8721>] ? nfsd+0x0/0x1a5 [nfsd]
 [<ffffffffa06b8721>] ? nfsd+0x0/0x1a5 [nfsd]
 [<ffffffff810534bf>] kthread+0x49/0x76
 [<ffffffff81011719>] child_rip+0xa/0x11
 [<ffffffff81010a37>] ? restore_args+0x0/0x30
 [<ffffffff81053476>] ? kthread+0x0/0x76
 [<ffffffff8101170f>] ? child_rip+0x0/0x11

Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:  49
CPU    1: hi:  186, btch:  31 usd: 155
CPU    2: hi:  186, btch:  31 usd: 189
CPU    3: hi:  186, btch:  31 usd:  33
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  31
CPU    1: hi:  186, btch:  31 usd: 181
CPU    2: hi:  186, btch:  31 usd:  55
CPU    3: hi:  186, btch:  31 usd:  62
Active:132283 inactive:1508369 dirty:161830 writeback:2625 unstable:0
 free:9960 slab:90087 mapped:168019 pagetables:12444 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:23868kB min:4700kB low:5872kB high:7048kB active:202620kB inactive:2616020kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:6528kB min:6588kB low:8232kB high:9880kB active:326512kB inactive:3417456kB present:4654080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5401*4kB 99*8kB 51*16kB 10*32kB 2*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23916kB
Node 0 Normal: 1348*4kB 15*8kB 2*16kB 5*32kB 4*64kB 4*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6728kB
1612984 total pagecache pages
58528 pages in swap cache
Swap cache stats: add 3704275, delete 3645747, find 11825741/12339100
Free swap  = 163321668kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
577655 pages shared
1277731 pages non-shared
FahCore_a3.exe: page allocation failure. order:3, mode:0x4020
Pid: 23478, comm: FahCore_a3.exe Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff812412cb>] dev_queue_xmit+0x344/0x453
 [<ffffffff81265ff9>] ip_finish_output2+0x1a8/0x1ea
 [<ffffffff812660a3>] ip_finish_output+0x68/0x6a
 [<ffffffff81266140>] ip_output+0x9b/0xa0
 [<ffffffff8126545a>] ip_local_out+0x20/0x24
 [<ffffffff81265c92>] ip_queue_xmit+0x2c9/0x31f
 [<ffffffff8123ab13>] ? skb_release_data+0xc6/0xcb
 [<ffffffff8126cf77>] ? sk_stream_alloc_skb+0x38/0xed
 [<ffffffff812760c1>] tcp_transmit_skb+0x60b/0x64e
 [<ffffffff8127898b>] __tcp_push_pending_frames+0x725/0x80f
 [<ffffffff81275586>] ? tcp_current_mss+0x74/0xf4
 [<ffffffff812738aa>] tcp_data_snd_check+0x27/0x118
 [<ffffffff81273dd0>] tcp_rcv_established+0xf2/0x853
 [<ffffffff8127b8b4>] tcp_v4_do_rcv+0x1db/0x389
 [<ffffffff8104502f>] ? local_bh_enable+0xd/0xf
 [<ffffffff8127bed7>] tcp_v4_rcv+0x475/0x6a8
 [<ffffffffa05d12b3>] ? vboxNetFltLinuxForwardSegment+0x66/0x75 [vboxnetflt]
 [<ffffffff81261a0d>] ip_local_deliver_finish+0x103/0x19f
 [<ffffffff81261b1b>] ip_local_deliver+0x72/0x7a
 [<ffffffff81261601>] ip_rcv_finish+0x305/0x321
 [<ffffffff8126187c>] ip_rcv+0x25f/0x295
 [<ffffffff81240875>] netif_receive_skb+0x3e6/0x40b
 [<ffffffffa0106ce1>] e1000_receive_skb+0x5b/0x76 [e1000e]
 [<ffffffffa0106ed8>] e1000_clean_rx_irq+0x1dc/0x28a [e1000e]
 [<ffffffff81057f0a>] ? sched_clock_cpu+0x10f/0x120
 [<ffffffffa01054e8>] e1000_clean+0x94/0x2a4 [e1000e]
 [<ffffffff8123ee45>] net_rx_action+0xd4/0x1fd
 [<ffffffff8104515a>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d2f>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI> 
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:   0
CPU    2: hi:  186, btch:  31 usd:  40
CPU    3: hi:  186, btch:  31 usd:   0
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:  61
CPU    2: hi:  186, btch:  31 usd:  14
CPU    3: hi:  186, btch:  31 usd:  26
Active:132846 inactive:1507468 dirty:160016 writeback:1756 unstable:0
 free:10302 slab:90485 mapped:168022 pagetables:12456 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:23832kB min:4700kB low:5872kB high:7048kB active:203476kB inactive:2613552kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:7932kB min:6588kB low:8232kB high:9880kB active:327908kB inactive:3416320kB present:4654080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5453*4kB 24*8kB 99*16kB 2*32kB 2*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 24164kB
Node 0 Normal: 1687*4kB 44*8kB 2*16kB 1*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 7868kB
1612378 total pagecache pages
58594 pages in swap cache
Swap cache stats: add 3704360, delete 3645766, find 11825742/12339131
Free swap  = 163321668kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
571180 pages shared
1284697 pages non-shared
ssh: page allocation failure. order:5, mode:0x4020
Pid: 14666, comm: ssh Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff811589c0>] ? swiotlb_dma_mapping_error+0x18/0x25
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffffa05d1705>] ? vboxNetFltLinuxPacketHandler+0x443/0x481 [vboxnetflt]
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff812412cb>] dev_queue_xmit+0x344/0x453
 [<ffffffff81265ff9>] ip_finish_output2+0x1a8/0x1ea
 [<ffffffff812c20fa>] ? _spin_lock+0x9/0xc
 [<ffffffff812660a3>] ip_finish_output+0x68/0x6a
 [<ffffffff81266140>] ip_output+0x9b/0xa0
 [<ffffffff8126545a>] ip_local_out+0x20/0x24
 [<ffffffff81265c92>] ip_queue_xmit+0x2c9/0x31f
 [<ffffffff8123ab13>] ? skb_release_data+0xc6/0xcb
 [<ffffffff81275256>] ? tcp_established_options+0x2e/0xa4
 [<ffffffff81275586>] ? tcp_current_mss+0x74/0xf4
 [<ffffffff812760c1>] tcp_transmit_skb+0x60b/0x64e
 [<ffffffff812947be>] ? bictcp_acked+0x46/0x65
 [<ffffffff8127898b>] __tcp_push_pending_frames+0x725/0x80f
 [<ffffffff81275586>] ? tcp_current_mss+0x74/0xf4
 [<ffffffff812738aa>] tcp_data_snd_check+0x27/0x118
 [<ffffffff81273dd0>] tcp_rcv_established+0xf2/0x853
 [<ffffffff8127b8b4>] tcp_v4_do_rcv+0x1db/0x389
 [<ffffffff8104502f>] ? local_bh_enable+0xd/0xf
 [<ffffffff8127bed7>] tcp_v4_rcv+0x475/0x6a8
 [<ffffffffa05d12b3>] ? vboxNetFltLinuxForwardSegment+0x66/0x75 [vboxnetflt]
 [<ffffffff81261a0d>] ip_local_deliver_finish+0x103/0x19f
 [<ffffffff81261b1b>] ip_local_deliver+0x72/0x7a
 [<ffffffff81261601>] ip_rcv_finish+0x305/0x321
 [<ffffffff8126187c>] ip_rcv+0x25f/0x295
 [<ffffffff81240875>] netif_receive_skb+0x3e6/0x40b
 [<ffffffffa0106ce1>] e1000_receive_skb+0x5b/0x76 [e1000e]
 [<ffffffffa0106ed8>] e1000_clean_rx_irq+0x1dc/0x28a [e1000e]
 [<ffffffffa01054e8>] e1000_clean+0x94/0x2a4 [e1000e]
 [<ffffffff81252b15>] ? __qdisc_run+0x1af/0x206
 [<ffffffff8123ee45>] net_rx_action+0xd4/0x1fd
 [<ffffffff8104515a>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d2f>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI> 
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:  54
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:   3
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd: 169
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:  84
Active:130543 inactive:1511788 dirty:126434 writeback:14307 unstable:0
 free:10162 slab:88340 mapped:168012 pagetables:12456 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:23820kB min:4700kB low:5872kB high:7048kB active:196200kB inactive:2628732kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:7384kB min:6588kB low:8232kB high:9880kB active:325972kB inactive:3418420kB present:4654080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5296*4kB 39*8kB 131*16kB 4*32kB 2*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23976kB
Node 0 Normal: 1571*4kB 14*8kB 2*16kB 1*32kB 0*64kB 3*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 7356kB
1614451 total pagecache pages
58595 pages in swap cache
Swap cache stats: add 3704640, delete 3646045, find 11825773/12339230
Free swap  = 163321488kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
589042 pages shared
1267114 pages non-shared
ssh: page allocation failure. order:5, mode:0x4020
Pid: 14666, comm: ssh Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffff8105314d>] ? posix_timer_fn+0x0/0xba
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff8123f0fd>] net_tx_action+0xd6/0x113
 [<ffffffff8104515a>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d2f>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI> 
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   1
CPU    1: hi:  186, btch:  31 usd: 150
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:   3
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  30
CPU    1: hi:  186, btch:  31 usd: 182
CPU    2: hi:  186, btch:  31 usd:  27
CPU    3: hi:  186, btch:  31 usd:  84
Active:130543 inactive:1511788 dirty:126434 writeback:14307 unstable:0
 free:10162 slab:88340 mapped:168012 pagetables:12456 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:23820kB min:4700kB low:5872kB high:7048kB active:196200kB inactive:2628732kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:7384kB min:6588kB low:8232kB high:9880kB active:325972kB inactive:3418420kB present:4654080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5296*4kB 39*8kB 131*16kB 4*32kB 2*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23976kB
Node 0 Normal: 1509*4kB 15*8kB 2*16kB 1*32kB 0*64kB 3*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 7116kB
1614451 total pagecache pages
58599 pages in swap cache
Swap cache stats: add 3704644, delete 3646045, find 11825776/12339235
Free swap  = 163321488kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
589162 pages shared
1267354 pages non-shared
ssh: page allocation failure. order:5, mode:0x4020
Pid: 14666, comm: ssh Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff812412cb>] dev_queue_xmit+0x344/0x453
 [<ffffffff81265ff9>] ip_finish_output2+0x1a8/0x1ea
 [<ffffffff812660a3>] ip_finish_output+0x68/0x6a
 [<ffffffff81266140>] ip_output+0x9b/0xa0
 [<ffffffff8126545a>] ip_local_out+0x20/0x24
 [<ffffffff81265c92>] ip_queue_xmit+0x2c9/0x31f
 [<ffffffff8126cf77>] ? sk_stream_alloc_skb+0x38/0xed
 [<ffffffff812760c1>] tcp_transmit_skb+0x60b/0x64e
 [<ffffffff8127898b>] __tcp_push_pending_frames+0x725/0x80f
 [<ffffffff81275586>] ? tcp_current_mss+0x74/0xf4
 [<ffffffff812738aa>] tcp_data_snd_check+0x27/0x118
 [<ffffffff81273dd0>] tcp_rcv_established+0xf2/0x853
 [<ffffffff8127b8b4>] tcp_v4_do_rcv+0x1db/0x389
 [<ffffffff8104502f>] ? local_bh_enable+0xd/0xf
 [<ffffffff8127bed7>] tcp_v4_rcv+0x475/0x6a8
 [<ffffffffa05d12b3>] ? vboxNetFltLinuxForwardSegment+0x66/0x75 [vboxnetflt]
 [<ffffffff81261a0d>] ip_local_deliver_finish+0x103/0x19f
 [<ffffffff81261b1b>] ip_local_deliver+0x72/0x7a
 [<ffffffff81261601>] ip_rcv_finish+0x305/0x321
 [<ffffffff8126187c>] ip_rcv+0x25f/0x295
 [<ffffffff81240875>] netif_receive_skb+0x3e6/0x40b
 [<ffffffffa0106ce1>] e1000_receive_skb+0x5b/0x76 [e1000e]
 [<ffffffffa0106ed8>] e1000_clean_rx_irq+0x1dc/0x28a [e1000e]
 [<ffffffffa01054e8>] e1000_clean+0x94/0x2a4 [e1000e]
 [<ffffffff812529af>] ? __qdisc_run+0x49/0x206
 [<ffffffff8123ee45>] net_rx_action+0xd4/0x1fd
 [<ffffffff8104515a>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d2f>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI> 
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:  50
CPU    2: hi:  186, btch:  31 usd:  47
CPU    3: hi:  186, btch:  31 usd:   3
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  31
CPU    1: hi:  186, btch:  31 usd: 122
CPU    2: hi:  186, btch:  31 usd:  23
CPU    3: hi:  186, btch:  31 usd:  84
Active:130543 inactive:1512218 dirty:126434 writeback:14307 unstable:0
 free:9836 slab:88340 mapped:168012 pagetables:12456 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:23376kB min:4700kB low:5872kB high:7048kB active:196200kB inactive:2629172kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:6524kB min:6588kB low:8232kB high:9880kB active:325972kB inactive:3419700kB present:4654080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5267*4kB 12*8kB 113*16kB 4*32kB 2*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23356kB
Node 0 Normal: 1294*4kB 14*8kB 2*16kB 1*32kB 0*64kB 3*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 6248kB
1614881 total pagecache pages
58616 pages in swap cache
Swap cache stats: add 3704661, delete 3646045, find 11825781/12339245
Free swap  = 163321488kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
589137 pages shared
1267413 pages non-shared
FahCore_a3.exe: page allocation failure. order:5, mode:0x4020
Pid: 23478, comm: FahCore_a3.exe Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff8123e407>] ? __netif_schedule+0x18/0x1a
 [<ffffffffa0103fe9>] ? netif_tx_wake_queue+0x34/0x39 [e1000e]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff8123f0fd>] net_tx_action+0xd6/0x113
 [<ffffffff8104515a>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d2f>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI> 
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:  42
CPU    2: hi:  186, btch:  31 usd:  60
CPU    3: hi:  186, btch:  31 usd:   3
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  31
CPU    1: hi:  186, btch:  31 usd: 122
CPU    2: hi:  186, btch:  31 usd:  23
CPU    3: hi:  186, btch:  31 usd:  84
Active:130543 inactive:1512218 dirty:126434 writeback:14307 unstable:0
 free:9762 slab:88340 mapped:168012 pagetables:12456 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:23080kB min:4700kB low:5872kB high:7048kB active:196200kB inactive:2629172kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:6524kB min:6588kB low:8232kB high:9880kB active:325972kB inactive:3419700kB present:4654080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5267*4kB 13*8kB 97*16kB 4*32kB 2*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23108kB
Node 0 Normal: 1294*4kB 14*8kB 2*16kB 1*32kB 0*64kB 3*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 6248kB
1614881 total pagecache pages
58633 pages in swap cache
Swap cache stats: add 3704678, delete 3646045, find 11825782/12339251
Free swap  = 163321488kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
589027 pages shared
1267638 pages non-shared
ssh: page allocation failure. order:5, mode:0x4020
Pid: 14666, comm: ssh Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff8123e407>] ? __netif_schedule+0x18/0x1a
 [<ffffffffa0103fe9>] ? netif_tx_wake_queue+0x34/0x39 [e1000e]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff8123f0fd>] net_tx_action+0xd6/0x113
 [<ffffffff8104515a>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d2f>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff8123a9f9>] ? __kfree_skb+0x74/0x78
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI>  [<ffffffff810b93a3>] ? kmem_cache_free+0xa5/0xb8
 [<ffffffff8123a9f9>] ? __kfree_skb+0x74/0x78
 [<ffffffff8126b216>] ? sk_eat_skb+0x37/0x5b
 [<ffffffff8126c833>] ? tcp_recvmsg+0x81f/0xb6e
 [<ffffffff8123626c>] ? sock_common_recvmsg+0x32/0x47
 [<ffffffff812342ca>] ? __sock_recvmsg+0x6d/0x7a
 [<ffffffff812343c5>] ? sock_aio_read+0xee/0xfe
 [<ffffffff81152500>] ? copy_user_generic_string+0x30/0x40
 [<ffffffff810becb5>] ? do_sync_read+0xe7/0x12d
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff8103c847>] ? finish_task_switch+0x31/0xc9
 [<ffffffff812c0812>] ? thread_return+0xab/0xd9
 [<ffffffff81120e84>] ? security_file_permission+0x11/0x13
 [<ffffffff810bf645>] ? vfs_read+0xbb/0x102
 [<ffffffff810bf750>] ? sys_read+0x47/0x6e
 [<ffffffff8101027a>] ? system_call_fastpath+0x16/0x1b

Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:   0
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:   1
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:   1
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:  33
Active:127767 inactive:1518813 dirty:158486 writeback:3501 unstable:0
 free:10546 slab:83534 mapped:167992 pagetables:12456 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:24716kB min:4700kB low:5872kB high:7048kB active:191824kB inactive:2640760kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:8024kB min:6588kB low:8232kB high:9880kB active:319244kB inactive:3434492kB present:4654080kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5417*4kB 55*8kB 132*16kB 6*32kB 3*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 24988kB
Node 0 Normal: 1549*4kB 84*8kB 44*16kB 8*32kB 3*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 8532kB
1618644 total pagecache pages
58472 pages in swap cache
Swap cache stats: add 3704828, delete 3646356, find 11825811/12339319
Free swap  = 163321296kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
603413 pages shared
1252549 pages non-shared
ssh: page allocation failure. order:1, mode:0x4020
Pid: 14666, comm: ssh Tainted: G        W 2.6.27.25-78.2.56.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff810938e7>] __alloc_pages_internal+0x436/0x457
 [<ffffffff8123ab60>] ? pskb_expand_head+0x48/0x14e
 [<ffffffff810b8742>] kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff8123ba24>] ? skb_copy+0x30/0x97
 [<ffffffff8123b398>] __alloc_skb+0x6f/0x135
 [<ffffffff8123ba24>] skb_copy+0x30/0x97
 [<ffffffffa05d131f>] vboxNetFltLinuxPacketHandler+0x5d/0x481 [vboxnetflt]
 [<ffffffff81239418>] ? __skb_clone+0x29/0x11b
 [<ffffffff81240d4d>] dev_hard_start_xmit+0x12b/0x258
 [<ffffffff81252a53>] __qdisc_run+0xed/0x206
 [<ffffffff8123f022>] qdisc_run+0x36/0x3b
 [<ffffffff8123f0fd>] net_tx_action+0xd6/0x113
 [<ffffffff8104515a>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d2f>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI>  [<ffffffff810dfd66>] ? free_buffer_head+0x24/0x3e
 [<ffffffff8101fde5>] ? __cpus_empty+0x18/0x45
 [<ffffffff8101ff23>] ? native_flush_tlb_others+0xa1/0xca
 [<ffffffff812c20fa>] ? _spin_lock+0x9/0xc
 [<ffffffff81020044>] ? flush_tlb_page+0x82/0x8c
 [<ffffffff8102d32f>] ? ptep_clear_flush_young+0x4e/0x5d
 [<ffffffff810a5f97>] ? page_referenced_one+0x83/0xeb
 [<ffffffffa042a91f>] ? RTSpinlockReleaseNoInts+0x10/0x12 [vboxdrv]
 [<ffffffff810a6ec8>] ? page_referenced+0x87/0x107
 [<ffffffff8109794c>] ? shrink_page_list+0x164/0x61b
 [<ffffffff81097632>] ? isolate_lru_pages+0x174/0x1d1
 [<ffffffff81097feb>] ? shrink_inactive_list+0x1bf/0x473
 [<ffffffff8109837b>] ? shrink_zone+0xdc/0xff
 [<ffffffff81099424>] ? try_to_free_pages+0x25c/0x3c9
 [<ffffffff8109768f>] ? isolate_pages_global+0x0/0x34
 [<ffffffff810cbaa4>] ? do_select+0x4da/0x505
 [<ffffffff8109375b>] ? __alloc_pages_internal+0x2aa/0x457
 [<ffffffff810b8742>] ? kmalloc_large_node+0x66/0xa5
 [<ffffffff810ba084>] ? __kmalloc_node_track_caller+0x29/0x103
 [<ffffffff81237584>] ? sock_alloc_send_skb+0x9f/0x20e
 [<ffffffff8123b398>] ? __alloc_skb+0x6f/0x135
 [<ffffffff81237584>] ? sock_alloc_send_skb+0x9f/0x20e
 [<ffffffff812c2209>] ? _spin_unlock_bh+0x13/0x15
 [<ffffffff81236ddc>] ? release_sock+0xb0/0xbc
 [<ffffffff81121498>] ? security_socket_getpeersec_dgram+0x11/0x13
 [<ffffffff812a5927>] ? unix_stream_sendmsg+0x13b/0x2b8
 [<ffffffff8123415e>] ? __sock_sendmsg+0x59/0x62
 [<ffffffff8123424d>] ? sock_aio_write+0xe6/0xf6
 [<ffffffff81152500>] ? copy_user_generic_string+0x30/0x40
 [<ffffffff810beb88>] ? do_sync_write+0xe7/0x12d
 [<ffffffff81053829>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff8103c847>] ? finish_task_switch+0x31/0xc9
 [<ffffffff81120e84>] ? security_file_permission+0x11/0x13
 [<ffffffff810bf457>] ? vfs_write+0xbe/0x105
 [<ffffffff810bf562>] ? sys_write+0x47/0x6f
 [<ffffffff8101027a>] ? system_call_fastpath+0x16/0x1b

Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:  56
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:  19
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd: 177
CPU    2: hi:  186, btch:  31 usd:  24
CPU    3: hi:  186, btch:  31 usd: 163
Active:109568 inactive:1563173 dirty:74941 writeback:831 unstable:0
 free:10285 slab:56685 mapped:165533 pagetables:12234 bounce:0
Node 0 DMA free:9444kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:7624kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 3243 7788 7788
Node 0 DMA32 free:24376kB min:4700kB low:5872kB high:7048kB active:159976kB inactive:2719564kB present:3321004kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:7320kB min:6588kB low:8232kB high:9880kB active:278296kB inactive:3533128kB present:4654080kB pages_scanned:64 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB
Node 0 DMA32: 5865*4kB 1*8kB 17*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 24284kB
Node 0 Normal: 1537*4kB 15*8kB 14*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 7324kB
1639202 total pagecache pages
58424 pages in swap cache
Swap cache stats: add 3875323, delete 3816899, find 11862398/12416849
Free swap  = 163242300kB
Total swap = 164095932kB
2031616 pages RAM
208820 pages reserved
897339 pages shared
950835 pages non-shared


This is a different kind of crash though. I suspect it happened while I was deleting over 1TB of data. For some reason deleting data seems to grind the system to a halt; even vim becomes unusable.
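
For anyone trying to read the dump above: "order:1" means the kernel wanted two contiguous pages (8kB with 4kB pages), and the "Node 0 ...: N*4kB N*8kB ..." lines list how many free blocks of each size each zone had when the dump was printed. Below is a rough Python sketch that tallies those lines. The function names (`parse_buddy_line`, `summarize`) are made up for illustration and are not part of any kernel tool, and the numbers only describe the moment of the dump, so treat the result as a hint rather than a diagnosis.

```python
import re

# Free-list lines copied verbatim from the dump above.
DUMP_LINES = [
    "Node 0 DMA: 5*4kB 4*8kB 1*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 2*1024kB 1*2048kB 1*4096kB = 9444kB",
    "Node 0 DMA32: 5865*4kB 1*8kB 17*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 24284kB",
    "Node 0 Normal: 1537*4kB 15*8kB 14*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 7324kB",
]


def parse_buddy_line(line):
    """Return (zone_name, {block_size_kb: free_block_count}) for one line."""
    zone, _, rest = line.partition(":")
    counts = {int(size): int(n) for n, size in re.findall(r"(\d+)\*(\d+)kB", rest)}
    return zone.strip(), counts


def summarize(line, order=1, page_kb=4):
    """Return (zone, total_free_kb, blocks_big_enough, kb_in_too_small_blocks).

    Assumes 4kB pages unless page_kb says otherwise; an order-N request
    needs one free block of at least page_kb * 2**N kB.
    """
    zone, counts = parse_buddy_line(line)
    need_kb = page_kb << order
    total_kb = sum(n * size for size, n in counts.items())
    big_enough = sum(n for size, n in counts.items() if size >= need_kb)
    too_small_kb = sum(n * size for size, n in counts.items() if size < need_kb)
    return zone, total_kb, big_enough, too_small_kb


for line in DUMP_LINES:
    zone, total_kb, big_enough, too_small_kb = summarize(line)
    share = 100.0 * too_small_kb / total_kb
    print(f"{zone}: {total_kb} kB free, {big_enough} blocks >= 8kB, "
          f"{share:.0f}% of free memory in single 4kB pages")
```

On these lines, 23460 of the 24284 kB free in DMA32 (roughly 97%) and 6148 of the 7324 kB free in Normal (roughly 84%) sit in single 4kB pages. That would be consistent with free memory being low and fragmented when the network path asked for 8kB at interrupt time, but it does not by itself say what caused it.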


What scares me is that I will need to do some power maintenance that involves physically moving the UPS and the batteries to a new rack. If this server goes down, it's probably game over. It was a shutdown and startup that actually initiated all of these issues. It's old hardware and it's on its last legs. Though if I'm quick enough and don't let everything cool down, I'll hopefully be ok. I seriously need to start saving up for a new server. I'm really living on the edge with this box: I have no redundancy, and this server does everything.
 