dm_dust
dm-dust
ddrescue_manual
btrfs_doc
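Use Claude Code to log in to the Linux machine
Use truncate to create a 10G file
Create a loop device
Create a btrfs filesystem on that loop device
Fill it to 90% with small, fragmented files
Purpose: rescue practice
Use dm-dust on top of the loop device to mark bad blocks that fall inside files, creating a new device mapper (dm) device
Use the dm device to simulate a rescue from bad sectors
In every case, image the filesystem first, then carry out the rescue on that image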
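The setup steps above could be scripted roughly as follows. This is a dry-run sketch, not the exact commands used for this write-up: the backing file name `disk.img` and the bad block number are hypothetical, and the loop device and dm names are taken from the lsblk output in these notes. The dm-dust table line follows the form given in the kernel's dm-dust documentation (start, length, `dust`, device, offset, block size).

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to execute them for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_lab() {
    img=disk.img                        # hypothetical backing file name
    loop=/dev/loop4                     # loop device seen in lsblk in these notes
    dm=dust1                            # dm device name used in these notes
    sectors=$((10 * 1024 * 1024 * 2))   # 10 GiB expressed in 512-byte sectors
    bad_block=12345                     # hypothetical block number (4096-byte units)

    run truncate -s 10G "$img"
    run losetup "$loop" "$img"
    run mkfs.btrfs "$loop"
    # (mount the filesystem, fill it with test files, unmount it)
    run dmsetup create "$dm" --table "0 $sectors dust $loop 0 4096"
    run dmsetup message "$dm" 0 addbadblock "$bad_block"
    # Reads of the marked blocks only start failing after "enable".
    run dmsetup message "$dm" 0 enable
}

setup_lab
```

Printing the commands first makes it easy to review the sector count and block number before touching a real device.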
Main part
# lsblk
/dev/loop4 7:4 0 10G 0 loop
└─/dev/mapper/dust1 253:0 0 10G 0 dm
# ddrescue -n -r3 /dev/mapper/dust1 dust1.img
# cp dust1.img dust1.img.bak
# mount -o loop,ro dust1.img /tmp/dust1
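A variant of the imaging step, sketched under the assumption that a ddrescue mapfile is wanted so an interrupted run can be resumed. The mapfile name `dust1.map` is hypothetical, and the two-pass split (fast copy first, retries second) follows the pattern shown in the GNU ddrescue manual rather than the single command used above. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

image_device() {
    src=$1    # failing device
    img=$2    # image file to write
    map=$3    # ddrescue mapfile (records which areas were read)

    # Pass 1: copy everything that reads cleanly, skip the scraping phase.
    run ddrescue -n "$src" "$img" "$map"
    # Pass 2: return to the bad areas and retry them up to 3 times.
    run ddrescue -r3 "$src" "$img" "$map"
    # Keep an untouched copy before any repair attempt.
    run cp "$img" "$img.bak"
}

image_device /dev/mapper/dust1 dust1.img dust1.map
```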
Scrub result: the scan finds a damaged file
# btrfs scrub start -B -r /tmp/dust1
Starting scrub on devid 1
scrub done for e7a97f8a-d34a-4502-890d-1c5e84c71d1b
Scrub started: Sun Apr 26 22:50:47 2026
Status: finished
Duration: 0:00:01
Total to scrub: 9.02GiB
Rate: 9.02GiB/s
Error summary: csum=1
Corrected: 0
Uncorrectable: 1
Unverified: 0
ERROR: there are 1 uncorrectable errors
[226329.136556] BTRFS info (device loop5): scrub: started on devid 1
[226329.523873] BTRFS error (device loop5): scrub: unable to fixup (regular) error at logical 3496345600 on dev /dev/loop5 physical 3773169664
[226329.523902] BTRFS warning (device loop5): scrub: checksum error at logical 3496345600 on dev /dev/loop5, physical 3773169664 root 5 inode 2205 offset 14983168 length 4096 links 1 (path: data/large_batch/17/data_1.bin)
[226329.523905] BTRFS error (device loop5): bdev /dev/loop5 errs: wr 0, rd 1, flush 0, corrupt 2, gen 0
[226330.275749] BTRFS info (device loop5): scrub: finished on devid 1 with status: 0
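The checksum warning above carries everything needed to locate the file: logical address, physical offset, root, inode and path. A small sketch of pulling those fields out with sed is shown below. The function name `parse_scrub_warning` is made up for illustration, and the pattern is only written against the message format seen in this log; other kernel versions may word it differently.

```shell
#!/bin/sh
# Extract the key fields from a btrfs scrub "checksum error" kernel message.
parse_scrub_warning() {
    printf '%s\n' "$1" | sed -n \
        's/.*checksum error at logical \([0-9]*\) on dev [^,]*, physical \([0-9]*\) root \([0-9]*\) inode \([0-9]*\) offset \([0-9]*\).*(path: \(.*\))$/logical=\1 physical=\2 root=\3 inode=\4 offset=\5 path=\6/p'
}

# Sample line copied from the dmesg output above.
line='[226329.523902] BTRFS warning (device loop5): scrub: checksum error at logical 3496345600 on dev /dev/loop5, physical 3773169664 root 5 inode 2205 offset 14983168 length 4096 links 1 (path: data/large_batch/17/data_1.bin)'

parse_scrub_warning "$line"
```

On a live system the same function could be fed lines from `dmesg | grep 'checksum error'`.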
Find the real path from the inode
# find /tmp/dust1 -xdev -inum 2205
/tmp/dust1/data/large_batch/17/data_1.bin
# umount ./dust1.img
# mount -o loop ./dust1.img /tmp/dust1
# rm -f /tmp/dust1/data/large_batch/17/data_1.bin
Run scrub again
# btrfs scrub start -B /tmp/dust1
Starting scrub on devid 1
scrub done for e7a97f8a-d34a-4502-890d-1c5e84c71d1b
Scrub started: Sun Apr 26 23:01:24 2026
Status: finished
Duration: 0:00:01
Total to scrub: 8.92GiB
Rate: 8.92GiB/s
Error summary: csum=1
Corrected: 0
Uncorrectable: 1
Unverified: 0
ERROR: there are 1 uncorrectable errors
[226966.739102] BTRFS info (device loop5): scrub: started on devid 1
[226967.097099] BTRFS error (device loop5): scrub: unable to fixup (regular) error at logical 3496345600 on dev /dev/loop5 physical 3773169664
[226967.097108] BTRFS error (device loop5): bdev /dev/loop5 errs: wr 0, rd 1, flush 0, corrupt 2, gen 0
[226967.835720] BTRFS info (device loop5): scrub: finished on devid 1 with status: 0
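# btrfs inspect-internal logical-resolve -v 3773169664 /tmp/dust1
/tmp/dust1/data/large_batch/4/data_3.bin
Note: 3773169664 is the physical offset from the scrub message, whereas logical-resolve expects a logical address (3496345600 here). The file printed above is therefore probably unrelated to the damaged block.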
man btrfs-inspect-internal
logical-resolve [-Pvo] [-s <bufsize>] <logical> <path>
(needs root privileges)
resolve paths to all files at given logical address in the linear filesystem space
Options
-P skip the path resolving and print the inodes instead
-o ignore offsets, find all references to an extent instead of a single block. Requires kernel support for the V2 ioctl (added in 4.15). The results might need further processing to filter out unwanted extents by the offset that is supposed to be obtained by other means.
-s <bufsize>
set internal buffer for storing the file names to bufsize, default is 64KiB, maximum 16MiB. Buffer sizes over 64KiB require kernel support for the V2 ioctl (added in 4.15).
-v (deprecated) alias for global -v option
# rm /tmp/dust1/data/large_batch/4/data_3.bin
# btrfs scrub start -B /tmp/dust1
Starting scrub on devid 1
scrub done for e7a97f8a-d34a-4502-890d-1c5e84c71d1b
Scrub started: Sun Apr 26 23:11:38 2026
Status: finished
Duration: 0:00:01
Total to scrub: 8.83GiB
Rate: 8.83GiB/s
Error summary: csum=1
Corrected: 0
Uncorrectable: 1
Unverified: 0
ERROR: there are 1 uncorrectable errors
# btrfs device stats /tmp/dust1
[/dev/loop5].write_io_errs 0
[/dev/loop5].read_io_errs 0
[/dev/loop5].flush_io_errs 0
[/dev/loop5].corruption_errs 1
[/dev/loop5].generation_errs 0
Use btrfs check --readonly --check-data-csum ./dust1.img to get the address 3496349696, then find the files that this address maps to
# btrfs check --check-data-csum --readonly ./dust1.img
Opening filesystem to check...
Checking filesystem on ./dust1.img
UUID: e7a97f8a-d34a-4502-890d-1c5e84c71d1b
[1/8] checking log skipped (none written)
[2/8] checking root items
[3/8] checking extents
[4/8] checking free space tree
[5/8] checking fs roots
[6/8] checking csums against data
mirror 1 bytenr 3496349696 csum 0x8941f998 expected csum 0xd4d5b685
ERROR: errors found in csum tree
[7/8] checking root refs
[8/8] checking quota groups skipped (not enabled on this FS)
found 9459388416 bytes used, error(s) found
total csum bytes: 9220356
total tree bytes: 17743872
total fs tree bytes: 4456448
total extent tree bytes: 3276800
btree space waste bytes: 967464
file data blocks allocated: 9441648640
referenced 9441648640
# btrfs inspect-internal logical-resolve -v 3496349696 /tmp/dust1
/tmp/dust1/backup/contract_backup_ref.txt
/tmp/dust1/documents/contracts/important_contract.txt
# cat /tmp/dust1/documents/contracts/important_contract.txt
cat: /tmp/dust1/documents/contracts/important_contract.txt: Input/output error
# cat /tmp/dust1/backup/contract_backup_ref.txt
cat: /tmp/dust1/backup/contract_backup_ref.txt: Input/output error
# rm /tmp/dust1/documents/contracts/important_contract.txt
# rm /tmp/dust1/backup/contract_backup_ref.txt
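It is worth comparing the address that btrfs check reported with the one in the scrub messages. The arithmetic below, using only numbers from the logs above, shows they differ by exactly one 4096-byte block. That may be why deleting the file named in the scrub warning did not clear the error, though these notes do not confirm the cause.

```shell
#!/bin/sh
scrub_logical=3496345600    # "logical" value from the scrub dmesg lines
check_bytenr=3496349696     # "bytenr" value from btrfs check --check-data-csum
block_size=4096             # "length" value from the scrub warning

gap=$((check_bytenr - scrub_logical))
echo "gap in bytes:  $gap"
echo "gap in blocks: $((gap / block_size))"
```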
# btrfs device stats /tmp/dust1 -z
[/dev/loop5].write_io_errs 0
[/dev/loop5].read_io_errs 0
[/dev/loop5].flush_io_errs 0
[/dev/loop5].corruption_errs 6
[/dev/loop5].generation_errs 0
# btrfs scrub start -B /tmp/dust1
Starting scrub on devid 1
scrub done for e7a97f8a-d34a-4502-890d-1c5e84c71d1b
Scrub started: Sun Apr 26 23:21:03 2026
Status: finished
Duration: 0:00:01
Total to scrub: 8.83GiB
Rate: 8.83GiB/s
Error summary: no errors found
# btrfs device stats /tmp/dust1
[/dev/loop5].write_io_errs 0
[/dev/loop5].read_io_errs 0
[/dev/loop5].flush_io_errs 0
[/dev/loop5].corruption_errs 0
[/dev/loop5].generation_errs 0
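For scripted checks, the counters printed by btrfs device stats can be summed so that a single number says whether anything is still wrong. The sketch below assumes the two-column output format shown above; `stats_total` is a hypothetical helper name, and the sample text is the earlier output in which corruption_errs was 6.

```shell
#!/bin/sh
# Sum the counter column of `btrfs device stats` output read from stdin.
stats_total() {
    awk '{ total += $2 } END { print total + 0 }'
}

# Sample copied from the `btrfs device stats /tmp/dust1 -z` output above.
sample='[/dev/loop5].write_io_errs 0
[/dev/loop5].read_io_errs 0
[/dev/loop5].flush_io_errs 0
[/dev/loop5].corruption_errs 6
[/dev/loop5].generation_errs 0'

printf '%s\n' "$sample" | stats_total
```

On a mounted filesystem the equivalent call would be `btrfs device stats /tmp/dust1 | stats_total`, with a result of 0 meaning all counters are clean.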
Rescue complete!
Summary
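I assumed that deleting the bad file named in dmesg would be enough. It was not: scrub kept reporting the same error after the deletion. In the end, btrfs check --readonly --check-data-csum ./dust1.img revealed the address of the damaged data.
btrfs inspect-internal logical-resolve -v <address> /tmp/dust1 then mapped that address to the affected files, which were deleted.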
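The two-step approach from the summary can be sketched as a small pipeline: pull the bad addresses out of the btrfs check output, then print the logical-resolve command to run for each one. The helper names `extract_bad_bytenrs` and `suggest_resolve` are made up for illustration, the pattern is only written against the "mirror ... bytenr ... csum" line seen in this log, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Read `btrfs check --check-data-csum` output on stdin and
# print each distinct bad bytenr once.
extract_bad_bytenrs() {
    sed -n 's/^mirror [0-9][0-9]* bytenr \([0-9][0-9]*\) csum .*/\1/p' | sort -un
}

# For each address on stdin, print the logical-resolve command to run.
suggest_resolve() {
    mnt=$1
    while read -r addr; do
        echo "btrfs inspect-internal logical-resolve $addr $mnt"
    done
}

# Sample lines copied from the btrfs check output above.
sample='[6/8] checking csums against data
mirror 1 bytenr 3496349696 csum 0x8941f998 expected csum 0xd4d5b685
ERROR: errors found in csum tree'

printf '%s\n' "$sample" | extract_bad_bytenrs | suggest_resolve /tmp/dust1
```

Keeping the last step as printed commands leaves the decision about what to delete with the operator.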