dm_dust_btrfs rescue

btrfs raid1: handling a dropped disk plus exhausted metadata space

root@localhost:~# btrfs fi usage -T /mnt/btrfs_practice/
WARNING: failed to get device size for <missing disk>: No such file or directory
Overall:
    Device size:                   1.50GiB
    Device allocated:           1022.00MiB
    Device unallocated:          514.00MiB
    Device missing:                1.00GiB
    Device slack:                    0.00B
    Used:                        672.43MiB
    Free (estimated):            348.16MiB      (min: 348.16MiB)
    Free (statfs, df):            91.16MiB
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:                5.50MiB      (used: 0.00B)
    Multiple profiles:                  no

                  Data      Metadata System
Id Path           RAID1     RAID1    RAID1    Unallocated Total     Slack
-- -------------- --------- -------- -------- ----------- --------- -----
 1 <missing disk> 426.25MiB 76.75MiB  8.00MiB   513.00MiB   1.00GiB     -
 2 /dev/loop2     426.25MiB 76.75MiB  8.00MiB     1.00MiB 512.00MiB     -
-- -------------- --------- -------- -------- ----------- --------- -----
   Total          426.25MiB 76.75MiB  8.00MiB   514.00MiB   1.50GiB 0.00B
   Used           335.09MiB  1.11MiB 16.00KiB

The missing device cannot be replaced directly:

root@localhost:~# btrfs replace start -B 1 /dev/loop3 /mnt/btrfs_practice/
Performing full device TRIM /dev/loop3 (1.00GiB) ...
ERROR: ioctl(DEV_REPLACE_START) failed on "/mnt/btrfs_practice/": No space left on device

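One possible reading of the ENOSPC (an assumption, since the source does not state the cause): the surviving device /dev/loop2 has only 1.00MiB unallocated in the table above, so no new chunk can be allocated on it. A minimal sketch that flags such devices from `btrfs fi usage -T` text follows; the helper name `low_unallocated` and the 16 MiB threshold are made up for illustration.

```shell
# Flag devices whose Unallocated column in `btrfs fi usage -T` output is
# below a threshold given in MiB. low_unallocated is a hypothetical helper.
low_unallocated() {
    # $1: threshold in MiB; the usage table is read from stdin
    awk -v limit="$1" '
        # Device rows start with a numeric devid. Unallocated is the third
        # field from the end (followed by Total and Slack).
        $1 ~ /^[0-9]+$/ {
            val = $(NF - 2)
            n = val + 0                     # numeric prefix, e.g. 1 from 1.00MiB
            if (val ~ /GiB$/)      n *= 1024
            else if (val ~ /KiB$/) n /= 1024
            else if (val !~ /iB$/) n /= 1048576   # plain bytes, e.g. 0.00B
            if (n < limit)
                printf "devid %s: %s unallocated\n", $1, val
        }'
}
```

Usage, assuming the same mount point as above: `btrfs fi usage -T /mnt/btrfs_practice/ | low_unallocated 16`.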
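Workaround:
Create a new btrfs on /dev/loop3 (mounted at /tmp/loop3), take a read-only snapshot of /mnt/btrfs_practice as /mnt/btrfs_practice/backup (btrfs send requires a read-only snapshot), then send it across: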

btrfs send /mnt/btrfs_practice/backup | btrfs receive /tmp/loop3

btrfs subvolume set-default /tmp/loop3/backup

Then remount loop3 and the data is available again.
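The steps above can be sketched end to end as a script. This is a dry-run sketch, not the author's exact command history: the `mkfs.btrfs`, `mount`, and `snapshot -r` invocations are inferred from the description, and `run` and `migrate` are hypothetical helper names. With `DRY_RUN=1` (the default here) it only prints the commands.

```shell
# Dry-run by default: print each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# migrate <source mount> <new device> <new mount point>
# WARNING: with DRY_RUN=0 this formats <new device>.
migrate() {
    src=$1 dev=$2 dst=$3
    run mkfs.btrfs "$dev"                       # new filesystem on the spare device
    run mkdir -p "$dst"
    run mount "$dev" "$dst"
    run btrfs subvolume snapshot -r "$src" "$src/backup"   # send needs a read-only snapshot
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ btrfs send $src/backup | btrfs receive $dst"
    else
        btrfs send "$src/backup" | btrfs receive "$dst"
    fi
    run btrfs subvolume set-default "$dst/backup"   # make the received subvolume the default
    run umount "$dst"
    run mount "$dev" "$dst"                     # remount to pick up the new default
}

migrate /mnt/btrfs_practice /dev/loop3 /tmp/loop3
```

Note that `btrfs receive` leaves the received subvolume read-only; the notes do not show whether that was changed afterwards.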

fix device size

  ┌───────────────────┬───────────────────────────────────────────────────────────────┐
  │       Item        │                             Value                             │
  ├───────────────────┼───────────────────────────────────────────────────────────────┤
  │ Superblock record │ 1073741824 bytes (1 GiB)                                      │
  ├───────────────────┼───────────────────────────────────────────────────────────────┤
  │ Actual device     │ 1072693248 bytes (1 MiB less)                                 │
  ├───────────────────┼───────────────────────────────────────────────────────────────┤
  │ Error             │ total_bytes should be at most 1072693248 but found 1073741824 │
  └───────────────────┴───────────────────────────────────────────────────────────────┘

root@localhost:~# mount /dev/mapper/dust_btrfs /mnt/btrfs_practice/
mount: /mnt/btrfs_practice: wrong fs type, bad option, bad superblock on /dev/mapper/dust_btrfs, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.
# dmesg...

[12387.417768] BTRFS info (device dm-1): using crc32c (crc32c-lib) checksum algorithm
[12387.425092] BTRFS error (device dm-1): device total_bytes should be at most 1072693248 but found 1073741824
[12387.425104] BTRFS error (device dm-1): failed to read chunk tree: -22
[12387.429301] BTRFS error (device dm-1): open_ctree failed: -22
root@localhost:~# btrfs rescue fix-device-size /dev/mapper/dust_btrfs
Fixed device size for devid 1, old size: 1073741824 new size: 1072693248
Fixed super total bytes, old size: 1073741824 new size: 1072693248
Fixed unaligned/mismatched total_bytes for super block and device items
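The size mismatch in the kernel message can be checked with a little arithmetic. A small sketch, assuming the dmesg wording shown above; `size_mismatch` is a hypothetical helper name.

```shell
# Read dmesg text on stdin, pull both sizes out of the total_bytes error,
# and print how far the recorded size overshoots the real device.
size_mismatch() {
    sed -n 's/.*total_bytes should be at most \([0-9][0-9]*\) but found \([0-9][0-9]*\).*/\1 \2/p' |
    while read -r actual recorded; do
        diff=$((recorded - actual))
        echo "recorded=$recorded actual=$actual diff=$diff bytes ($((diff / 1048576)) MiB)"
    done
}

echo '[12387.425092] BTRFS error (device dm-1): device total_bytes should be at most 1072693248 but found 1073741824' |
    size_mismatch
# prints: recorded=1073741824 actual=1072693248 diff=1048576 bytes (1 MiB)
```

The 1 MiB difference matches what `btrfs rescue fix-device-size` trimmed off in the output above.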

Unexpected power loss: clearing the log tree with zero-log (bad tree block start, mirror 1 want 524288000 have 0)

[35709.020914] BTRFS info (device loop1): start tree-log replay
[35709.021153] BTRFS error (device loop1): bad tree block start, mirror 1 want 524288000 have 0
[35709.021317] BTRFS warning (device loop1): failed to read log tree
[35709.026257] BTRFS error (device loop1): open_ctree failed: -5
root@localhost:~# mount -o loop ./btrfs11.img /mnt/btrfs_practice/
mount: /mnt/btrfs_practice: can't read superblock on /dev/loop1.
       dmesg(1) may have more information after failed mount system call.

btrfs rescue zero-log

root@localhost:~# btrfs rescue zero-log ./btrfs11.img
checksum verify failed on 524288000 wanted 0x00000000 found 0xb6bde3e4
Couldn't setup log root tree
Clearing log on ./btrfs11.img, previous log_root 524288000, level 0

root@localhost:~# mount -o loop ./btrfs11.img /mnt/btrfs_practice/

[35807.682528] loop1: detected capacity change from 0 to 2097152
[35807.695142] BTRFS: device fsid 43f62df9-e3bc-483f-afba-f10edc4e12f0 devid 1 transid 13 /dev/loop1 (7:1) scanned by mount (44264)
[35807.696078] BTRFS info (device loop1): first mount of filesystem 43f62df9-e3bc-483f-afba-f10edc4e12f0
[35807.696270] BTRFS info (device loop1): using crc32c (crc32c-lib) checksum algorithm
[35807.713139] BTRFS info (device loop1): enabling ssd optimizations
[35807.713150] BTRFS info (device loop1): enabling free space tree

bad superblock

root@localhost:~# mount -o loop ./btrfs33.img /mnt/btrfs_practice/
mount: /mnt/btrfs_practice: wrong fs type, bad option, bad superblock on /dev/loop1, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.
[37848.878398] loop1: detected capacity change from 0 to 2097152
[37848.890849] EXT4-fs (loop1): VFS: Can't find ext4 filesystem
[37848.891709] EXT4-fs (loop1): VFS: Can't find ext4 filesystem
[37848.892328] EXT4-fs (loop1): VFS: Can't find ext4 filesystem
[37848.910358] ISOFS: Unable to identify CD-ROM format.
[37848.910740] FAT-fs (loop1): bogus number of reserved sectors
[37848.910750] FAT-fs (loop1): Can't find a valid FAT filesystem
[37848.911521] hfs: can't find a HFS filesystem on dev loop1
[37848.912326] hfsplus: unable to find HFS+ superblock
[37848.916174] XFS (loop1): Invalid superblock magic number
btrfs rescue super-recover <device>
root@localhost:~# btrfs rescue super-recover ./btrfs33.img
Make sure this is a btrfs disk otherwise the tool will destroy other fs, Are you sure? [y/N]: y
Recovered bad superblocks successful
root@localhost:~# mount -o loop ./btrfs33.img /mnt/btrfs_practice/
root@localhost:~# findmnt /mnt/btrfs_practice
TARGET              SOURCE     FSTYPE OPTIONS
/mnt/btrfs_practice /dev/loop1 btrfs  rw,relatime,seclabel,ssd,space_cache=v2,subvolid=5,subvol=/
root@localhost:~#

couldn't read tree root && open_ctree failed: -5 (combined exercise)

dmesg

[38732.871141] loop1: detected capacity change from 0 to 2097152
[38732.882712] BTRFS: device fsid 160086ed-2388-472f-92c6-47ee8e2e1407 devid 1 transid 12 /dev/loop1 (7:1) scanned by mount (56215)
[38732.884333] BTRFS info (device loop1): first mount of filesystem 160086ed-2388-472f-92c6-47ee8e2e1407
[38732.884353] BTRFS info (device loop1): using crc32c (crc32c-lib) checksum algorithm
[38732.890271] BTRFS error (device loop1): bad tree block start, mirror 1 want 33423360 have 0
[38732.890479] BTRFS error (device loop1): bad tree block start, mirror 2 want 33423360 have 0
[38732.890533] BTRFS warning (device loop1): couldn't read tree root
[38732.894913] BTRFS error (device loop1): open_ctree failed: -5

btrfs check --readonly

root@localhost:~# btrfs check --readonly ./btrfs44.img
Opening filesystem to check...
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
bad tree block 33423360, bytenr mismatch, want=33423360, have=0
Couldn't read tree root
ERROR: cannot open file system

btrfs check --repair fails

root@localhost:~# btrfs check --repair ./btrfs44.img
enabling repair mode
WARNING:

        Do not use --repair unless you are advised to do so by a developer
        or an experienced user, and then only after having accepted that no
        fsck can successfully repair all types of filesystem corruption. E.g.
        some software or hardware bugs can fatally damage a volume.
        The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
bad tree block 33423360, bytenr mismatch, want=33423360, have=0
Couldn't read tree root
ERROR: cannot open file system
root@localhost:~#

Using a backup superblock

 Starting point selection:
    -s|--super <superblock>   use this superblock copy
root@localhost:~# btrfs check -s 1 ./btrfs44.img

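My practice image is only 1 GB, so it holds just two superblock copies: the primary (copy 0, at 64 KiB) and one mirror (copy 1, at 64 MiB).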

root@localhost:~# btrfs check -s 1 ./btrfs44.img
using SB copy 1, bytenr 67108864
Opening filesystem to check...
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
bad tree block 33423360, bytenr mismatch, want=33423360, have=0
Couldn't read tree root
ERROR: cannot open file system

The backup superblock fails too: it points at the same unreadable tree root (33423360).
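Why a 1 GB image has so few superblock copies can be worked out from the fixed copy offsets (64 KiB, 64 MiB, 256 GiB; the last two appear in the tool output in these notes). A minimal sketch; `sb_copies` is a hypothetical helper name, and it assumes each copy occupies 4 KiB.

```shell
# List the btrfs superblock copy offsets and say which ones fit on a device
# of the given size in bytes.
sb_copies() {
    dev_size=$1
    i=0
    for off in 65536 67108864 274877906944; do
        # A copy is usable only if the whole 4 KiB block lies inside the device.
        if [ $((off + 4096)) -le "$dev_size" ]; then
            echo "copy $i at $off: fits"
        else
            echo "copy $i at $off: beyond device end"
        fi
        i=$((i + 1))
    done
}

sb_copies 1073741824
# prints:
# copy 0 at 65536: fits
# copy 1 at 67108864: fits
# copy 2 at 274877906944: beyond device end
```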

btrfs restore fails

root@localhost:~# btrfs restore -i -o ./btrfs44.img ./btrfs44_restored/
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
bad tree block 33423360, bytenr mismatch, want=33423360, have=0
Couldn't read tree root
Could not open root, trying backup super
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
checksum verify failed on 33423360 wanted 0x00000000 found 0xb6bde3e4
bad tree block 33423360, bytenr mismatch, want=33423360, have=0
Couldn't read tree root
Could not open root, trying backup super
ERROR: superblock bytenr 274877906944 is larger than device size 1073741824
Could not open root, trying backup super

btrfs-find-root

root@localhost:~# btrfs-find-root ./btrfs44.img
Couldn't read tree root
Superblock thinks the generation is 12
Superblock thinks the level is 0
Well block 32014336(gen: 11 level: 0) seems good, but generation/level doesn't match, want gen: 12 level: 0
Well block 31571968(gen: 10 level: 0) seems good, but generation/level doesn't match, want gen: 12 level: 0
Well block 30949376(gen: 9 level: 0) seems good, but generation/level doesn't match, want gen: 12 level: 0
Well block 30572544(gen: 8 level: 0) seems good, but generation/level doesn't match, want gen: 12 level: 0
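The restore step uses the candidate with the highest generation from this list. Picking it by hand is easy with four lines, but it can also be scripted. A small sketch, assuming the "Well block N(gen: G level: L)" wording shown above; `best_root` is a hypothetical helper name.

```shell
# Read btrfs-find-root output on stdin and print the bytenr of the
# candidate tree root with the highest generation.
best_root() {
    # Rewrite each candidate line as "<gen> <bytenr>", sort by generation
    # descending, and keep the bytenr of the first entry.
    sed -n 's/^Well block \([0-9][0-9]*\)(gen: \([0-9][0-9]*\) level: [0-9][0-9]*).*/\2 \1/p' |
    sort -k1,1nr | head -n 1 | awk '{ print $2 }'
}
```

Usage, assuming the same image as above: `btrfs restore -t "$(btrfs-find-root ./btrfs44.img 2>/dev/null | best_root)" -i -o -v ./btrfs44.img ./btrfs44_restored/`. An older generation means the most recent changes may be missing from the restored files.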

Restore with an explicit tree root location (the newest candidate, block 32014336 at gen 11)

root@localhost:~# btrfs restore -t 32014336 -i -o -v ./btrfs44.img ./btrfs44_restored/
parent transid verify failed on 32014336 wanted 12 found 11
parent transid verify failed on 32014336 wanted 12 found 11
parent transid verify failed on 32014336 wanted 12 found 11
Ignoring transid failure
root@localhost:~# ls ./btrfs44_restored/
Backups  Docker  Documents  Music  Photos  Thumbnails  Videos

Summary

Even in a scenario this hopeless, restore still got the data out in the end. Grateful!

raid1 missing device: the basic procedure

root@localhost:~/btrfs44_restored# btrfs fi usage -T /mnt/btrfs_practice/
WARNING: failed to get device size for <missing disk>: No such file or directory
Overall:
    Device size:                   2.00GiB
    Device allocated:              1.63GiB
    Device unallocated:          374.50MiB
    Device missing:                1.00GiB
    Device slack:                    0.00B
    Used:                        840.77MiB
    Free (estimated):            494.55MiB      (min: 494.55MiB)
    Free (statfs, df):           400.43MiB
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:                5.50MiB      (used: 0.00B)
    Multiple profiles:                  no

                  Data      Metadata  System
Id Path           RAID1     RAID1     RAID1    Unallocated Total   Slack
-- -------------- --------- --------- -------- ----------- ------- -----
 1 <missing disk> 726.38MiB 102.38MiB  8.00MiB   187.25MiB 1.00GiB     -
 2 /dev/loop1     726.38MiB 102.38MiB  8.00MiB   187.25MiB 1.00GiB     -
-- -------------- --------- --------- -------- ----------- ------- -----
   Total          726.38MiB 102.38MiB  8.00MiB   374.50MiB 2.00GiB 0.00B
   Used           419.07MiB   1.30MiB 16.00KiB
root@localhost:~/btrfs44_restored# btrfs replace start 1 /dev/loop2 /mnt/btrfs_practice/
Performing full device TRIM /dev/loop2 (1.00GiB) ...
root@localhost:~/btrfs44_restored# btrfs replace status /mnt/btrfs_practice/
Started on  3.May 21:04:52, finished on  3.May 21:04:53, 0 write errs, 0 uncorr. read errs
root@localhost:~/btrfs44_restored# btrfs fi usage -T /mnt/btrfs_practice/
Overall:
    Device size:                   2.00GiB
    Device allocated:              1.82GiB
    Device unallocated:          188.25MiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                        840.75MiB
    Free (estimated):            434.50MiB      (min: 431.84MiB)
    Free (statfs, df):           328.43MiB
    Data ratio:                       1.95
    Metadata ratio:                   1.48
    Global reserve:                5.50MiB      (used: 0.00B)
    Multiple profiles:                 yes      (data, metadata, system)

              Data     Data      Metadata  Metadata  System   System
Id Path       single   RAID1     single    RAID1     single   RAID1   Unallocated Total   Slack
-- ---------- -------- --------- --------- --------- -------- ------- ----------- ------- -----
 1 /dev/loop2        - 726.38MiB         - 102.38MiB        - 8.00MiB   187.25MiB 1.00GiB     -
 2 /dev/loop1 42.25MiB 726.38MiB 112.00MiB 102.38MiB 32.00MiB 8.00MiB     1.00MiB 1.00GiB     -
-- ---------- -------- --------- --------- --------- -------- ------- ----------- ------- -----
   Total      42.25MiB 726.38MiB 112.00MiB 102.38MiB 32.00MiB 8.00MiB   188.25MiB 2.00GiB 0.00B
   Used          0.00B 419.07MiB     0.00B   1.30MiB 16.00KiB   0.00B
root@localhost:~/btrfs44_restored# btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/btrfs_practice/
Done, had to relocate 9 out of 9 chunks
root@localhost:~/btrfs44_restored# btrfs fi usage -T /mnt/btrfs_practice/
Overall:
    Device size:                   2.00GiB
    Device allocated:              1.50GiB
    Device unallocated:          512.00MiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                        840.83MiB
    Free (estimated):            476.93MiB      (min: 476.93MiB)
    Free (statfs, df):           475.93MiB
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:                5.50MiB      (used: 0.00B)
    Multiple profiles:                  no

              Data      Metadata System
Id Path       RAID1     RAID1    RAID1    Unallocated Total   Slack
-- ---------- --------- -------- -------- ----------- ------- -----
 1 /dev/loop2 640.00MiB 96.00MiB 32.00MiB   256.00MiB 1.00GiB     -
 2 /dev/loop1 640.00MiB 96.00MiB 32.00MiB   256.00MiB 1.00GiB     -
-- ---------- --------- -------- -------- ----------- ------- -----
   Total      640.00MiB 96.00MiB 32.00MiB   512.00MiB 2.00GiB 0.00B
   Used       419.07MiB  1.33MiB 16.00KiB
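After the replace, the usage output reports `Multiple profiles: yes` because of the single chunks on /dev/loop1, and the balance with `-dconvert`/`-mconvert` clears that. A minimal sketch of checking for that state before balancing; `has_mixed_profiles` is a hypothetical helper name.

```shell
# Read `btrfs fi usage` text on stdin; exit 0 if it reports
# "Multiple profiles: yes", exit 1 otherwise.
has_mixed_profiles() {
    awk '/Multiple profiles:/ { found = ($3 == "yes") } END { exit !found }'
}
```

Usage, assuming the same mount point as above: `btrfs fi usage -T /mnt/btrfs_practice/ | has_mixed_profiles && btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt/btrfs_practice/`. The `soft` filter is an option not used in the run above; it limits the balance to chunks that are not already raid1, whereas the plain convert relocated all 9 chunks.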

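Summary

With raid1 this is actually fairly easy to handle: a full copy of the data is still there, so there is no need to hunt through clues for a fix.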

chunk-recover

[41783.759328] loop1: detected capacity change from 0 to 2097152
[41783.771329] BTRFS: device fsid 3da92481-25e6-4e85-9265-dc9fb23b088d devid 1 transid 12 /dev/loop1 (7:1) scanned by mount (64417)
[41783.772925] BTRFS info (device loop1): first mount of filesystem 3da92481-25e6-4e85-9265-dc9fb23b088d
[41783.772947] BTRFS info (device loop1): using crc32c (crc32c-lib) checksum algorithm
[41783.779253] BTRFS error (device loop1): bad tree block start, mirror 1 want 22085632 have 0
[41783.779697] BTRFS error (device loop1): bad tree block start, mirror 2 want 22085632 have 0
[41783.779804] BTRFS error (device loop1): failed to read chunk root
[41783.783596] BTRFS error (device loop1): open_ctree failed: -5
root@localhost:~/btrfs44_restored# btrfs rescue chunk-recover ./btrfs66.img
Scanning: 0 in dev0scan chunk headers error
root@localhost:~/btrfs44_restored# btrfs-find-root ./btrfs66.img
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
WARNING: cannot read chunk root, continue anyway
Superblock thinks the generation is 12
Superblock thinks the level is 0
root@localhost:~/btrfs44_restored# btrfs restore -i -o ./btrfs66.img ./btrfs66_dir
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: cannot read chunk root
Could not open root, trying backup super
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: cannot read chunk root
Could not open root, trying backup super
ERROR: superblock bytenr 274877906944 is larger than device size 1073741824
Could not open root, trying backup super
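Looking back over the cases in these notes, each dmesg signature went with a particular tool. The sketch below just encodes that mapping as observed here. It is not a general diagnostic, `suggest_rescue` is a hypothetical helper name, and the chunk-root case did not succeed in the run above.

```shell
# Read dmesg text on stdin and print the tool these notes tried for each
# recognised btrfs error line.
suggest_rescue() {
    while IFS= read -r line; do
        case $line in
            *"device total_bytes should be at most"*)
                echo "btrfs rescue fix-device-size" ;;
            *"failed to read log tree"*)
                echo "btrfs rescue zero-log" ;;
            *"couldn't read tree root"*)
                echo "btrfs-find-root, then btrfs restore -t <bytenr>" ;;
            *"failed to read chunk root"*)
                echo "btrfs rescue chunk-recover (failed in these notes)" ;;
        esac
    done
}
```

Usage: `dmesg | suggest_rescue`. The bad-superblock case is not covered, because in that run the kernel log contained no BTRFS line to match on.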
