Data Corruption Bug In Kernel 3.6.3 - EXT4

lxskllr · Oct 24, 2012

It can only happen under fairly specific circumstances, but it's something to be aware of.

Date Tue, 23 Oct 2012 18:19:13 -0400
From Theodore Ts'o
Subject Re: Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?)

On Tue, Oct 23, 2012 at 09:57:08PM +0100, Nix wrote:
>
> It is now quite clear that this is a bug introduced by one or more of
> the post-3.6.1 ext4 patches (which have all been backported at least to
> 3.5, so the problem is probably there too).
>
> [ 60.290844] EXT4-fs error (device dm-3): ext4_mb_generate_buddy:741: group 202, 1583 clusters in bitmap, 1675 in gd
> [ 60.291426] JBD2: Spotted dirty metadata buffer (dev = dm-3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
>

I think I've found the problem. I believe the commit at fault is commit
14b4ed22a6 (upstream commit eeecef0af5e):

jbd2: don't write superblock when if its empty

which first appeared in v3.6.2.

The reason why the problem happens rarely is that the effect of the
buggy commit is that if the journal's starting block is zero, we fail
to truncate the journal when we unmount the file system. This can
happen if we mount and then unmount the file system fairly quickly,
before the log has a chance to wrap. After the first time this has
happened, it's not a disaster, since when we replay the journal, we'll
just replay some extra transactions. But if this happens twice, the
oldest valid transaction will still not have gotten updated, but some
of the newer transactions from the last mount session will have gotten
written by the very latest transactions, and when we then try to do
the extra transaction replays, the metadata blocks can end up getting
very scrambled indeed.

*Sigh*. My apologies for not catching this when I reviewed this
patch. I believe the following patch should fix the bug; once it's
reviewed by other ext4 developers, I'll push this to Linus ASAP.

- Ted

commit 26de1ba5acc39f0ab57ce1ed523cb128e4ad73a4
Author: Theodore Ts'o <tytso@mit.edu>
Date: Tue Oct 23 18:15:22 2012 -0400

jbd2: fix a potential fs corrupting bug in jbd2_mark_journal_empty

Fix a potential file system corrupting bug which was introduced by
commit eeecef0af5ea4efd763c9554cf2bd80fc4a0efd3: jbd2: don't write
superblock when if its empty.

We should only skip writing the journal superblock if there is nothing
to do --- not just when s_start is zero.

This has caused users to report file system corruptions in ext4 that
look like this:

EXT4-fs error (device sdb3): ext4_mb_generate_buddy:741: group 436, 22902 clusters in bitmap, 22901 in gd
JBD2: Spotted dirty metadata buffer (dev = sdb3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.

after the file system has been corrupted.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 0f16edd..0064181 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1351,18 +1351,20 @@ void jbd2_journal_update_sb_log_tail(journal_t *journal, tid_t tail_tid,
 static void jbd2_mark_journal_empty(journal_t *journal)
 {
     journal_superblock_t *sb = journal->j_superblock;
+    __be32 new_tail_sequence;

     BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex));
     read_lock(&journal->j_state_lock);
-    /* Is it already empty? */
-    if (sb->s_start == 0) {
+    new_tail_sequence = cpu_to_be32(journal->j_tail_sequence);
+    /* Nothing to do? */
+    if (sb->s_start == 0 && sb->s_sequence == new_tail_sequence) {
         read_unlock(&journal->j_state_lock);
         return;
     }
     jbd_debug(1, "JBD2: Marking journal as empty (seq %d)\n",
               journal->j_tail_sequence);

-    sb->s_sequence = cpu_to_be32(journal->j_tail_sequence);
+    sb->s_sequence = new_tail_sequence;
     sb->s_start = cpu_to_be32(0);
     read_unlock(&journal->j_state_lock);
https://lkml.org/lkml/2012/10/23/690

Jodell88 · Oct 24, 2012

Guess what kernel I'm running? :\

lxskllr · Oct 24, 2012

Jodell88 said:
Guess what kernel I'm running? :\

You Archers... :shakes head:

Good old Debian, slow and stable. I'm on 3.2.0

:^P :^D

Jodell88 · Oct 24, 2012

If I want to, I can use the linux-lts kernel. That's on 3.0.48.

Red Squirrel · Oct 24, 2012

Looks like I'm safe.

ryan@falcon:~$ uname -a
Linux falcon 3.2.0-26-generic #41-Ubuntu SMP Thu Jun 14 17:49:24 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
ryan@falcon:~$

Though I did try to run a system update today and it went to hell. Not sure if that would have updated the kernel or not, but I'm kinda glad it failed now. I'll wait to try again.
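
The same `uname` check can be done from a program. The sketch below uses the standard `uname(2)` call and flags only releases 3.6.2 and 3.6.3, which is an assumption taken from this thread (the suspect commit "first appeared in v3.6.2"); it does not cover the other stable branches mentioned in the email. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Parse "major.minor.patch" from the start of a release string such as
 * "3.2.0-26-generic". Returns true only if all three numbers were found. */
bool parse_release(const char *release, int *major, int *minor, int *patch)
{
    assert(release && major && minor && patch);
    return sscanf(release, "%d.%d.%d", major, minor, patch) == 3;
}

/* True for 3.6.2 and 3.6.3 only (assumption taken from this thread). */
bool is_3_6_2_or_3_6_3(const char *release)
{
    int major, minor, patch;
    if (!parse_release(release, &major, &minor, &patch))
        return false;
    return major == 3 && minor == 6 && (patch == 2 || patch == 3);
}

/* Ask the running kernel for its release string via uname(2). */
bool running_kernel_is_3_6_2_or_3_6_3(void)
{
    struct utsname u;
    if (uname(&u) != 0)
        return false;
    return is_3_6_2_or_3_6_3(u.release);
}
```

Calling `is_3_6_2_or_3_6_3("3.2.0-26-generic")` on the release string shown above returns false.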

Cerb · Oct 25, 2012

lxskllr said:
You Archers... :shakes head:

Good old Debian, slow and stable. I'm on 3.2.0

:^P :^D

I'm an Archer, and I'm on a 3.4.x-ck, with 3.2.29 on my virtual dev instance (Slackware, because Awt and GTK+ are borked, ATM). But, still, yikes!

The worst will be for users of new hardware, where an older kernel could mean poor feature support (such as unstable virtualization, or poor GPU or NIC support).

Jodell88 · Oct 25, 2012

Based on comments by the devs working on fixing this bug on the phoronix forums, I don't believe that I have anything to fear. It is a very difficult bug to trigger, and one of the two people that this bug triggered for was running experimental features. I'm confident that nothing will happen to me.

VinDSL · Oct 25, 2012

SOURCE: http://www.h-online.com/open/news/i...-ext4-data-corruption-bug-Update-1736110.html

Update 25-10-12:
Theodore Ts'o has continued his investigation of the bug and has found that the problem was more esoteric than was first thought. The user who reported the problem was using umount -l, which immediately unmounts the filesystem without waiting for it to stop being busy. The bug is now thought to be caused when the machine is being shut down while it is in the process of unmounting the filesystem with an already compromised journal.

The developers are still working to pinpoint the exact problem and it might actually involve more kernel components than just the ext4 drivers. In any case, it has become clear that the bug needs a very specific configuration to surface and is unlikely to affect most users.

Red Squirrel · Oct 25, 2012

That's the nice thing with Linux. Most of the time these very serious bugs are hard to trigger, and they fix them up pretty fast.

lxskllr · Oct 31, 2012

Fixed...

author Linus Torvalds <torvalds@linux-foundation.org>
Tue, 30 Oct 2012 22:35:16 +0000 (15:35 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
Tue, 30 Oct 2012 22:35:16 +0000 (15:35 -0700)
Pull ext4 bugfix from Ted Ts'o:
"This fixes the root cause of the ext4 data corruption bug which raised
a ruckus on LWN, Phoronix, and Slashdot.

This bug only showed up when non-standard mount options
(journal_async_commit and/or journal_checksum) were enabled, and when
the file system was not cleanly unmounted, but the root cause was the
inode bitmap modifications were not being properly journaled.

This could potentially lead to minor file system corruptions (pass 5
complaints with the inode allocation bitmap) after an unclean shutdown
under the wrong/unlucky workloads, but it turned into major failure if
the journal_checksum and/or journal_async_commit was enabled."

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix unjournaled inode bitmap modification

https://git.kernel.org/?p=linux/ker...ff;h=8c673cbc7682b3f2862fe42f8069cac20c09e160
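
To see whether a filesystem is mounted with either of the two options named in the pull request, one could scan its comma-separated mount options. The sketch below is a plain string check with hypothetical function names; it assumes the caller supplies the options string (for example the fourth field of a line in /proc/mounts) and that the options are listed there under exactly these names, which should be verified on the system in question.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if `name` appears as a whole comma-separated token in `opts`,
 * e.g. has_mount_option("rw,journal_checksum", "journal_checksum").
 * An option carrying a value ("data=ordered") does not match "data". */
bool has_mount_option(const char *opts, const char *name)
{
    assert(opts && name);
    size_t len = strlen(name);
    if (len == 0)
        return false;
    const char *p = opts;
    while (*p) {
        const char *end = strchr(p, ',');
        size_t tok = end ? (size_t)(end - p) : strlen(p);
        if (tok == len && strncmp(p, name, len) == 0)
            return true;
        if (!end)
            break;
        p = end + 1;  /* move past the comma to the next token */
    }
    return false;
}

/* True if either option named in the pull request is present. */
bool uses_nonstandard_journal_options(const char *opts)
{
    return has_mount_option(opts, "journal_async_commit") ||
           has_mount_option(opts, "journal_checksum");
}
```

Matching whole tokens rather than substrings keeps a differently named option that merely contains one of these strings from being counted.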
