GFS2 Kernel Oops
Sunday, October 30th, 2011For a few years I’ve run a system using DRBD replication between two machines with GFS2 running in dual primary mode to test a theory on a particular type of web hosting I’ve been developing.
For months the system will run fine, then, out of the blue, one of the nodes will drop from the cluster, reboot and we’ve never seen anything in the logs. It’ll run another 120-180 days without incident and then will reboot again with no real indication of the problem. We knew it was a kernel panic or kernel oops, but, the logs never were flushed to disk when the machine was rebooted.
Imagine our luck when two days in a row, at roughly the same time of day, the node rebooted. Even though we have remote syslog set up, we’ve never caught it.
/etc/sysctl.conf was changed so that panic_on_oops was set to 0, a number of terminal sessions were opened from another machine tailing various logs, and we were hoping to have the problem occur again.
/etc/sysctl.conf:
kernel.panic=5 kernel.panic_on_oops=0
At 6:25am, coincidentally during log rotation, the GFS2 partition umounted, but, the machine didn’t reboot. Checking our terminal, we still had access to dmesg, and, we had some logs:
GFS2: fsid=gamma:gfs1.0: fatal: invalid metadata block GFS2: fsid=gamma:gfs1.0: bh = 1211322 (magic number) GFS2: fsid=gamma:gfs1.0: function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 401 GFS2: fsid=gamma:gfs1.0: about to withdraw this file system GFS2: fsid=gamma:gfs1.0: telling LM to unmount GFS2: fsid=gamma:gfs1.0: withdrawn Pid: 18047, comm: gzip Not tainted 3.0.0 #1 Call Trace: [] ? gfs2_lm_withdraw+0xd9/0x10a [ ] ? gfs2_meta_check_ii+0x3c/0x48 [ ] ? gfs2_meta_indirect_buffer+0xf0/0x14a [ ] ? gfs2_block_map+0x1a3/0x9fe [ ] ? drive_stat_acct+0xf3/0x12e [ ] ? do_mpage_readpage+0x160/0x49f [ ] ? pagevec_lru_move_fn+0xab/0xc1 [ ] ? gfs2_unstuff_dinode+0x383/0x383 [ ] ? mpage_readpages+0xd0/0x12a [ ] ? gfs2_unstuff_dinode+0x383/0x383 [ ] ? bit_waitqueue+0x14/0x63 [ ] ? gfs2_readpages+0x67/0xa8 [ ] ? sd_prep_fn+0x2c1/0x902 [ ] ? gfs2_readpages+0x3b/0xa8 [ ] ? __do_page_cache_readahead+0x11b/0x1c0 [ ] ? ra_submit+0x19/0x1d [ ] ? generic_file_aio_read+0x2b4/0x5e0 [ ] ? do_sync_read+0xab/0xe3 [ ] ? vfs_read+0xa3/0x10f [ ] ? sys_read+0x45/0x6e [ ] ? system_call_fastpath+0x16/0x1b ------------[ cut here ]------------ WARNING: at fs/buffer.c:1188 gfs2_block_map+0x2be/0x9fe() Hardware name: PDSMi VFS: brelse: Trying to free free buffer Modules linked in: Pid: 18047, comm: gzip Not tainted 3.0.0 #1 Call Trace: [ ] ? gfs2_block_map+0x2be/0x9fe [ ] ? warn_slowpath_common+0x78/0x8c [ ] ? warn_slowpath_fmt+0x45/0x4a [ ] ? gfs2_block_map+0x2be/0x9fe [ ] ? drive_stat_acct+0xf3/0x12e [ ] ? do_mpage_readpage+0x160/0x49f [ ] ? pagevec_lru_move_fn+0xab/0xc1 [ ] ? gfs2_unstuff_dinode+0x383/0x383 [ ] ? mpage_readpages+0xd0/0x12a [ ] ? gfs2_unstuff_dinode+0x383/0x383 [ ] ? bit_waitqueue+0x14/0x63 [ ] ? gfs2_readpages+0x67/0xa8 [ ] ? sd_prep_fn+0x2c1/0x902 [ ] ? gfs2_readpages+0x3b/0xa8 [ ] ? __do_page_cache_readahead+0x11b/0x1c0 [ ] ? ra_submit+0x19/0x1d [ ] ? generic_file_aio_read+0x2b4/0x5e0 [ ] ? do_sync_read+0xab/0xe3 [ ] ? vfs_read+0xa3/0x10f [ ] ? sys_read+0x45/0x6e [ ] ? system_call_fastpath+0x16/0x1b ---[ end trace 54fad1a4877f173c ]--- BUG: unable to handle kernel paging request at ffffffff813b8f0f IP: [ ] __brelse+0x7/0x26 PGD 1625067 PUD 1629063 PMD 12001e1 Oops: 0003 [#1] SMP CPU 0 Modules linked in: Pid: 18047, comm: gzip Tainted: G W 3.0.0 #1 Supermicro PDSMi/PDSMi+ RIP: 0010:[ ] [ ] __brelse+0x7/0x26 RSP: 0018:ffff880185d85800 EFLAGS: 00010286 RAX: 00000000e8df8948 RBX: ffff8801f3fb6c18 RCX: ffff880185d857d0 RDX: 0000000000000010 RSI: 000000000002ccee RDI: ffffffff813b8eaf RBP: 0000000000000000 R08: ffff880185d85890 R09: ffff8801f3fb6c18 R10: 00000000000029e0 R11: 0000000000000078 R12: ffff880147986000 R13: ffff880147986140 R14: 00000000000029c1 R15: 0000000000001000 FS: 00007fb7d320f700(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff813b8f0f CR3: 0000000212a73000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process gzip (pid: 18047, threadinfo ffff880185d84000, task ffff880147dc4ed0) Stack: ffffffff811537f6 0000000000000000 00000000000022a1 0000000000000001 000000000000ffff 0000000000000000 ffffffff8102a7ce 0000000000000001 0000000100000008 00000000fffffffb 0000000000000001 ffff8801f3fb6c18 Call Trace: [ ] ? gfs2_block_map+0x2be/0x9fe [ ] ? warn_slowpath_common+0x7d/0x8c [ ] ? printk+0x43/0x48 [ ] ? alloc_page_buffers+0x62/0xba [ ] ? block_read_full_page+0x141/0x260 [ ] ? gfs2_unstuff_dinode+0x383/0x383 [ ] ? do_mpage_readpage+0x49b/0x49f [ ] ? pagevec_lru_move_fn+0xab/0xc1 [ ] ? gfs2_unstuff_dinode+0x383/0x383 [ ] ? mpage_readpages+0xd0/0x12a [ ] ? gfs2_unstuff_dinode+0x383/0x383 [ ] ? bit_waitqueue+0x14/0x63 [ ] ? gfs2_readpages+0x67/0xa8 [ ] ? sd_prep_fn+0x2c1/0x902 [ ] ? gfs2_readpages+0x3b/0xa8 [ ] ? __do_page_cache_readahead+0x11b/0x1c0 [ ] ? ra_submit+0x19/0x1d [ ] ? generic_file_aio_read+0x2b4/0x5e0 [ ] ? do_sync_read+0xab/0xe3 [ ] ? vfs_read+0xa3/0x10f [ ] ? sys_read+0x45/0x6e [ ] ? system_call_fastpath+0x16/0x1b Code: 31 00 45 31 f6 fe 85 88 00 00 00 48 89 df e8 a2 1a fc ff eb 03 45 31 f6 5b 4c 89 f0 5d 41 5c 41 5d 41 5e c3 8b 47 60 85 c0 74 05 ff 4f 60 c3 48 c7 c2 96 7b 4d 81 be a4 04 00 00 31 c0 48 c7 RIP [ ] __brelse+0x7/0x26 RSP CR2: ffffffff813b8f0f ---[ end trace 54fad1a4877f173d ]---
As I suspected, log rotation appeared to trigger the problem and handed us the above traceback. Running fsck.gfs2 resulted in:
# fsck.gfs2 -y /dev/drbd1 Initializing fsck Validating Resource Group index. Level 1 RG check. (level 1 passed) Error: resource group 7339665 (0x6ffe91): free space (64473) does not match bitmap (64658) The rgrp was fixed. Error: resource group 7405179 (0x70fe7b): free space (64249) does not match bitmap (64299) (50 blocks were reclaimed) The rgrp was fixed. Error: resource group 7470693 (0x71fe65): free space (65456) does not match bitmap (65464) (8 blocks were reclaimed) The rgrp was fixed. ...snip... Ondisk and fsck bitmaps differ at block 133061348 (0x7ee5ae4) Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free) Metadata type is 0 (free) Succeeded. Ondisk and fsck bitmaps differ at block 133061349 (0x7ee5ae5) Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free) Metadata type is 0 (free) Succeeded. RG #133061031 (0x7ee59a7) free count inconsistent: is 65232 should be 65508 Inode count inconsistent: is 37 should be 0 Resource group counts updated Inode count inconsistent: is 1267 should be 1266 Resource group counts updated Pass5 complete The statfs file is wrong: Current statfs values: blocks: 188730628 (0xb3fcd04) free: 176443034 (0xa844e9a) dinodes: 644117 (0x9d415) Calculated statfs values: blocks: 188730628 (0xb3fcd04) free: 177426468 (0xa935024) dinodes: 493059 (0x78603) The statfs file was fixed. Writing changes to disk gfs2_fsck complete
Filesystem was remounted after a 7 minute fsck and we’ll see if it happens again tomorrow.