CAUTION: before you continue, I'm not responsible for any data loss that these steps might cause. Everyone should have a backup of their data.
Here is how to restore a badly corrupted btrfs filesystem using btrfs restore.
PRE STEPS (confirming the filesystem is badly corrupted by trying simple mounts and restores):
* assume your volume is /dev/md127, that you have an available /root folder to dump temp data to, and that you will be dumping your restore to /USB
* you can change any of those variables if that's not the case
Here we assume the filesystem is so corrupt that it's not mounting regularly:
# mount /dev/md127 /data
mount: wrong fs type, bad option, bad superblock on /dev/md127,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so
And it's not mounting with the recovery option either:
# mount -o recovery /dev/md127 /data
mount: wrong fs type, bad option, bad superblock on /dev/md127,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so
dmesg can give more information, and output like the below usually means we should try btrfs restore:
# dmesg | tail
... skipping some of the output ...
btrfs bad tree block start 16965259068170886083 17881085411328
btrfs bad tree block start 16965259068170886083 17881085411328
btrfs: failed to read tree root on md127
btrfs bad tree block start 6500176962124313767 17881085181952
btrfs bad tree block start 6500176962124313767 17881085181952
btrfs: failed to read tree root on md127
btrfs bad tree block start 9323312780541546747 17881025347584
btrfs bad tree block start 9323312780541546747 17881025347584
btrfs: failed to read tree root on md127
btrfs: open_ctree failed
NOTE: open_ctree failed can mean many things… you could simply be missing a needed btrfs device. Make sure all of the btrfs devices that are part of the filesystem are present, try running "btrfs device scan", and then repeat the mount commands.
Now try a regular btrfs restore. First I run it dry (-D) to see if it can recover anything; dry meaning it's a test run, it won't actually do any writes.
# btrfs restore -F -i -D -v /dev/md127 /dev/null
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
Csum didn't match
Couldn't read tree root
Could not open root, trying backup super
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
Csum didn't match
Couldn't read tree root
Could not open root, trying backup super
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
checksum verify failed on 17881085575168 found 2E29147B wanted 4EBACCFF
Csum didn't match
Couldn't read tree root
Could not open root, trying backup super
If it had worked (it would have listed the files to be restored), I would have run this to continue with the actual restore (note: the dry-run -D flag is dropped):
btrfs restore -F -i -v /dev/md127 /USB
But since that didn't show anything would be restored, we need to try "btrfs restore" from other tree locations.
STEPS (restoring corrupt filesystem):
The process is similar to this article about undeleting files in btrfs.
Here we will try to restore from another tree location (well block #); we find these by running btrfs-find-root.
Note: if you don't have any well block numbers / tree locations, it's probably because you don't have COW enabled and you also didn't take any snapshots (not having snapshots isn't all that bad, but not having COW enabled is bad, as you probably won't have any well block #s / tree locations). Also, I believe running btrfs balance and btrfs defrag might clear out older tree locations (well block numbers), making this type of recovery impossible.
# what is this: these are my commands and notes for restoring a filesystem that simply
# will not mount and will not work with "btrfs restore" using default options

## First run this like so to get the location of all of the tree locations ("well blocks"):
nohup btrfs-find-root /dev/md127 &> /root/000-btrfs-find-root.1 &

## Monitor this process like this:
tail -f /root/000-btrfs-find-root.1

## When done, create this script ("vi 111-btrfs-restore-from-tree.sh") OR copy-paste below to create it.
## THE SCRIPT IS JUST THIS ONE LINE:
## for i in `tac /root/000-btrfs-find-root.1 | grep 'Well block' | awk '{print $3}'`; do echo "--- Well block $i ---"; btrfs restore -F -D -i -v -t $i /dev/md127 /dev/null 2>&1 | tee /root/333-btrfs-restore-wb-$i.1; done;
cat > /root/111-btrfs-restore-from-tree.sh << EOF
for i in \`tac /root/000-btrfs-find-root.1 | grep 'Well block' | awk '{print \$3}'\`; do echo "--- Well block \$i ---"; btrfs restore -F -D -i -v -t \$i /dev/md127 /dev/null 2>&1 | tee /root/333-btrfs-restore-wb-\$i.1; done;
EOF
chmod +x /root/111-btrfs-restore-from-tree.sh

## Then run the script like so to see which well block has the most output:
nohup /root/111-btrfs-restore-from-tree.sh &> /root/222-restore-from-tree.1 &

## Use any command below to find out which well block / tree location has the most
## restored files & folders (usually the biggest 333* file has the most):
ls -lisahSr /root/333*
for i in /root/333*; do echo -n "$i : "; cat $i | grep "^Restoring" | wc -l; done;
for i in /root/333*; do echo -n "$i : "; cat $i | grep "^Restoring" | wc -l; done | sort -nk3

# NOTE: bigger well block numbers are newer and thus have more current data. So it's up
# to you to decide whether to pick a bigger wb # in the filename or pick the file with
# the larger filesize. Bigger filesize means more files to restore; bigger wb # means
# more recent data. So a compromise has to be made... unless of course your biggest
# file is also your final wb ...
I personally would favor the more recent file (so the bigger wb #, rather than the bigger size). The most recent tree location is the last well block entry (at the bottom of the file) in 000-btrfs-find-root.1.

# when I find the wb # I like, I can restore its contents like so
# (imagine my wb # is 123412341234):
btrfs restore -F -i -o -v -t 123412341234 /dev/md127 /USB

# -i to ignore errors, -o to overwrite (optional), -v to show every file and folder
# name that is being restored, -t to select a different tree location.

NOTE: -F is a feature of "btrfs restore" on ReadyNAS OS6; it might be upstream in the btrfs tools as well. All it does is answer "yes" automatically when the btrfs restore process stops and asks "do you want to continue looping on this file?" (in other words, it's asking whether you want to continue with this recovery, and the answer is of course yes). So -F makes btrfs restore automatic instead of interactive (which is good, as you just want to start the procedure and leave it alone until it's done).
What I'm doing here is basically:
First run this to find all of the tree locations, which are the numbers after "Well block":
btrfs-find-root /dev/md127
example output:
Well block 42434560 seems great, but generation doesn't match, have=711129, want=712257 level 0
Well block 48267264 seems great, but generation doesn't match, have=711130, want=712257 level 0
Well block 93323264 seems great, but generation doesn't match, have=711623, want=712257 level 0
Well block 94601216 seems great, but generation doesn't match, have=711624, want=712257 level 0
Well block 143163392 seems great, but generation doesn't match, have=712062, want=712257 level 0
Well block 156008448 seems great, but generation doesn't match, have=712061, want=712257 level 0
Well block 164462592 seems great, but generation doesn't match, have=712062, want=712257 level 0
Well block 164855808 seems great, but generation doesn't match, have=712064, want=712257 level 0
Well block 166133760 seems great, but generation doesn't match, have=712065, want=712257 level 0
Well block 9120399425536 seems great, but generation doesn't match, have=712083, want=712257 level 0
Well block 9737799925760 seems great, but generation doesn't match, have=712091, want=712257 level 0
Well block 10669166821376 seems great, but generation doesn't match, have=712251, want=712257 level 0
Super think's the tree root is at 17881085575168, chunk root 17091680108544
Then run a dry btrfs restore against each tree location to see how many files and folders would be restored:
btrfs restore -D -t <well block> -v -F -i /dev/md127 /dev/null
Then I pick the one with the biggest output / most recent tree location.
– the most recent tree location has the bigger well block number (I assume this, as these look like transaction IDs, and transaction IDs are incremented in filesystems)
– the biggest output from "btrfs restore -D" means more files and folders would be restored
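The "pick the most recent tree location" part can be scripted. Here's a minimal sketch; the sample file below is a stand-in I made up for /root/000-btrfs-find-root.1 (two lines copied from the output format shown above), so point the grep at the real log in practice:

```shell
# stand-in for /root/000-btrfs-find-root.1, in btrfs-find-root's output format
cat > /tmp/000-btrfs-find-root.sample << 'EOF'
Well block 42434560 seems great, but generation doesn't match, have=711129, want=712257
Well block 10669166821376 seems great, but generation doesn't match, have=712251, want=712257
EOF

# the last "Well block" line is the most recent tree location; field 3 is the block number
WB=$(grep 'Well block' /tmp/000-btrfs-find-root.sample | tail -n 1 | awk '{print $3}')
echo "most recent well block: $WB"   # -> most recent well block: 10669166821376
```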
Once I find the well block number that gives me the most recent output with a lot of data, I plug in a USB drive, mount it at /USB (or wherever), and then run:
btrfs restore -t <well block> -v -F -i /dev/md127 /USB
-o is optional; -F is optional too (but if you don't use it you will need to type "yes" or "y" a lot, telling the process to continue).
Now you wait for the restore to finish.
Example of picking a tree location / well block number: I ran the above scripts and found that the biggest 333* file was the one for well block 10669166821376:
# ls -lisah
204405 4.0K -rw-r--r-- 1 root root 1.3K Jul 24 13:39 000-btrfs-find-root.1
204554 4.0K -rwxr-xr-x 1 root root  219 Jul 24 14:17 111-btrfs-restore-from-tree.sh
204564 4.3M -rw-r--r-- 1 root root 4.3M Jul 24 14:12 222-restore-from-tree.1
204584 468K -rw-r--r-- 1 root root 468K Jul 24 14:12 333-btrfs-restore-wb-10669166821376.1
204591 352K -rw-r--r-- 1 root root 349K Jul 24 14:12 333-btrfs-restore-wb-143163392.1
204590 352K -rw-r--r-- 1 root root 349K Jul 24 14:12 333-btrfs-restore-wb-156008448.1
204589 352K -rw-r--r-- 1 root root 350K Jul 24 14:12 333-btrfs-restore-wb-164462592.1
204588 352K -rw-r--r-- 1 root root 349K Jul 24 14:12 333-btrfs-restore-wb-164855808.1
204587 352K -rw-r--r-- 1 root root 349K Jul 24 14:12 333-btrfs-restore-wb-166133760.1
204595 352K -rw-r--r-- 1 root root 350K Jul 24 14:12 333-btrfs-restore-wb-42434560.1
204594 352K -rw-r--r-- 1 root root 350K Jul 24 14:12 333-btrfs-restore-wb-48267264.1
204586 352K -rw-r--r-- 1 root root 350K Jul 24 14:12 333-btrfs-restore-wb-9120399425536.1
204593 352K -rw-r--r-- 1 root root 350K Jul 24 14:12 333-btrfs-restore-wb-93323264.1
204592 352K -rw-r--r-- 1 root root 349K Jul 24 14:12 333-btrfs-restore-wb-94601216.1
204585 352K -rw-r--r-- 1 root root 350K Jul 24 14:12 333-btrfs-restore-wb-9737799925760.1

# for i in /root/333*; do echo -n "$i : "; cat $i | grep "^Restoring" | wc -l; done | sort -nk3
/root/333-btrfs-restore-wb-143163392.1 : 1828
/root/333-btrfs-restore-wb-156008448.1 : 1828
/root/333-btrfs-restore-wb-164462592.1 : 1828
/root/333-btrfs-restore-wb-164855808.1 : 1828
/root/333-btrfs-restore-wb-166133760.1 : 1828
/root/333-btrfs-restore-wb-42434560.1 : 1828
/root/333-btrfs-restore-wb-48267264.1 : 1828
/root/333-btrfs-restore-wb-9120399425536.1 : 1828
/root/333-btrfs-restore-wb-93323264.1 : 1828
/root/333-btrfs-restore-wb-94601216.1 : 1828
/root/333-btrfs-restore-wb-9737799925760.1 : 1828
/root/333-btrfs-restore-wb-10669166821376.1 : 3517
Also, the last well block entry at the bottom of 000-btrfs-find-root.1 is 10669166821376.
So here I would choose 10669166821376, as it's the most recent well block number and it's also the biggest 333 file (it has the most output when running btrfs restore in dry mode, meaning it has the most restore entries; sure enough it has 3517 "Restoring" lines).
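Going from the winning 333* file back to the number you pass to -t can also be scripted. A sketch using made-up sample counts in the format the counting loop above produces (the filenames follow the same naming scheme; the sed expressions are my own):

```shell
# stand-in for the per-well-block "Restoring" counts gathered earlier
cat > /tmp/wb-counts.sample << 'EOF'
/root/333-btrfs-restore-wb-94601216.1 : 1828
/root/333-btrfs-restore-wb-10669166821376.1 : 3517
EOF

# sort by the count (3rd whitespace field), take the biggest, then strip the
# filename down to just the well block number for use with "btrfs restore -t"
BEST=$(sort -nk3 /tmp/wb-counts.sample | tail -n 1 | awk '{print $1}')
WB=$(echo "$BEST" | sed 's/.*-wb-//; s/\.1$//')
echo "restore with: btrfs restore -t $WB ..."   # -> restore with: btrfs restore -t 10669166821376 ...
```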
I would restore it like so:
# one way:
btrfs restore -t 10669166821376 -v -F -i -o /dev/md127 /USB

# or another way, with nohup & logging:
mkdir -p /USB/logs
mkdir -p /USB/data
nohup btrfs restore -t 10669166821376 -v -F -i -o /dev/md127 /USB/data > /USB/logs/restore.log 2>&1 &

# then monitor the operation like so:
tail -f /USB/logs/restore.log

# or watch the folder size of /USB/data grow:
watch -n 10 "df -h; du -sh /USB/data;"
SIDENOTE: how do you find out how much data might need to be dumped?
Use btrfs-show-super:
# btrfs-show-super /dev/md127
superblock: bytenr=65536, device=/dev/md127
---------------------------------------------------------
csum			0x4545078c [match]
bytenr			65536
flags			0x1
magic			_BHRfS_M [match]
fsid			cbbeb9dc-ab51-46df-96f3-13b5ccd28373
label			0e123:data
generation		712257
root			17881085575168
sys_array_size		226
chunk_root_generation	712252
root_level		0
chunk_root		17091680108544
chunk_root_level	1
log_root		0
log_root_transid	0
log_root_level		0
total_bytes		11987456360448
bytes_used		9327327404032
sectorsize		4096
nodesize		32768
leafsize		32768
stripesize		4096
root_dir		6
num_devices		1
compat_flags		0x0
compat_ro_flags		0x0
incompat_flags		0x21 ( MIXED_BACKREF | BIG_METADATA )
csum_type		0
csum_size		4
cache_generation	30
uuid_tree_generation	0
dev_item.uuid		0d8e7df1-8162-4af7-8df4-94383523ca15
dev_item.fsid		cbbeb9dc-ab51-46df-96f3-13b5ccd28373 [match]
dev_item.type		0
dev_item.total_bytes	11987456360448
dev_item.bytes_used	10001976393728
dev_item.io_align	4096
dev_item.io_width	4096
dev_item.sector_size	4096
dev_item.devid		1
dev_item.dev_group	0
dev_item.seek_speed	0
dev_item.bandwidth	0
dev_item.generation	0
Look (grep) for bytes:
# btrfs-show-super /dev/md127 | grep bytes
total_bytes		11987456360448
bytes_used		9327327404032
dev_item.total_bytes	11987456360448
dev_item.bytes_used	10001976393728
So we know there is at most 9,327,327,404,032 bytes of data (8.5 TiB), and I will need to mount a location at /USB that has that much space (two 6 TB USB drives LVMed or BTRFSed together provide 12 TB, which will fit 8.5 TiB). Note that in reality there might be less data, as "bytes_used" includes metadata and snapshots, not just data. When running btrfs restore we are not restoring any of the snapshots (you could restore the snapshots if you wanted to). Also, btrfs restore gets the full filesystem (aside from the snapshots), so if you only want a particular subvolume (or folder) you can use the --path-regex option of "btrfs restore" to ask it to only dump data from that subfolder.
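The bytes-to-TiB arithmetic can be done with awk; a one-liner sketch (here the bytes_used line is echoed inline for illustration; in practice pipe "btrfs-show-super /dev/md127 | grep bytes_used" into it):

```shell
# 9327327404032 bytes / 1024^4 = roughly 8.5 TiB needed on the restore target
echo "bytes_used 9327327404032" | awk '{printf "%.1f TiB\n", $2/1024/1024/1024/1024}'   # -> 8.5 TiB
```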
The end.