Can't boot after crash, BTRFS bad tree block start

I foolishly thought I should use KVM with PCI passthrough instead of dual booting (selecting my OS from the UEFI BIOS). It worked, and I could even install the GPU driver (Intel Arc B580), but I had no internet.
While troubleshooting that, I booted into my Windows 10 VM using Virtual Machine Manager, but my PC froze during the Windows boot.
I waited a while, then forced a shutdown.
Now I can’t boot into CachyOS and get the following (I copied the text from a photo I took):


[0.798877] hub 8-01:1.0: config failed, hub doesn't have any ports! (err -19)
:: running early hook [udev]
Starting systemd-udevd version 257.3-1-arch
:: running hook [udev]
:: Triggering uevents...
:: running hook [keymap]
:: Loading keymap...done.
:: running hook [plymouth]
ERROR: Failed to mount 'UUID=fa2fcf69-ddac-492b-a03c-15b256d7a8df' on real root
You are now being dropped into an emergency shell.
sh: can't access tty; job control turned off
[rootfs ~]#

When trying to access my root partition from a live environment I get the following errors (from dmesg):


[  397.353745] BTRFS error (device nvme0n1p2): bad tree block start, mirror 1 want 2129288511488 have 1444175314944
[  397.353845] BTRFS error (device nvme0n1p2): bad tree block start, mirror 2 want 2129288511488 have 1444175314944
[  397.353851] BTRFS error (device nvme0n1p2): failed to read block groups: -5
[  397.354708] BTRFS error (device nvme0n1p2): open_ctree failed

I would love to recover the whole SSD, or at least a couple files like my browser bookmarks.

Here is the SMART output:

=== START OF INFORMATION SECTION ===
Model Number:                       WD_BLACK SN850X 2000GB
Serial Number:                      244615801785
Firmware Version:                   620361WD
PCI Vendor/Subsystem ID:            0x15b7
IEEE OUI Identifier:                0x001b44
Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      8224
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          2,000,398,934,016 [2.00 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            001b44 8b40fee2b3
Local Time is:                      Sat Feb 22 09:59:26 2025 UTC
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x00df):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp Verify
Log Page Attributes (0x1e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     90 Celsius
Critical Comp. Temp. Threshold:     94 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W    9.00W       -    0  0  0  0        0       0
 1 +     6.00W    6.00W       -    0  0  0  0        0       0
 2 +     4.50W    4.50W       -    0  0  0  0        0       0
 3 -   0.0250W       -        -    3  3  3  3     5000   10000
 4 -   0.0050W       -        -    4  4  4  4     3900   45700

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        53 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    10,606,606 [5.43 TB]
Data Units Written:                 8,501,318 [4.35 TB]
Host Read Commands:                 54,467,387
Host Write Commands:                93,201,363
Controller Busy Time:               59
Power Cycles:                       100
Power On Hours:                     480
Unsafe Shutdowns:                   12
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged

Read Self-test Log failed: Invalid Field in Command (0x4002)

Also I have a segfault error in dmesg, I don’t know if it’s related:

[   54.942071] kwin-6.0-reset-[2025]: segfault at 0 ip 00007844e5131ba4 sp 00007fffbdd9cd78 error 4 in libQt6Core.so.6.8.2[2d9ba4,7844e4ee6000+3ba000] likely on CPU 7 (core 1, socket 0)

This isn’t advice, but I seem to recall an instance where I had crashed like you did, and it corrupted everything. I right-clicked on the partition after deleting it and clicked Recover, and I could see the files. I didn’t care about them at the time, and now I wish I had clicked. Those were not the exact steps, so please don’t follow them. I also saw something about someone booting off a USB drive in some way and grabbing data off the disk. I think it just depends on how bad the hard shutdown was. It seems to have a harsher effect on Linux machines, but that’s partially selection bias. Either way, I’ve installed Ventoy 8 times, and Rufus hasn’t corrupted yet. Sometimes you have to shake your head.

Where did you click recover?

I tried probing it within the Live environment (from USB). But I haven’t been able to repair/restore the block and I didn’t want to try more invasive steps that could corrupt more of the SSD.

I had several crashes or moments where I had to hard reset, but Linux never failed, until now.

Using

sudo mount -o ro,rescue=all /dev/nvme0n1p2 /mnt

I can mount the SSD, but none of my own files seem to be there (photos, browser profiles, games, etc.).
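For anyone in the same spot, here is a minimal sketch of how files could be copied off a read-only rescue mount if they turn up under a subvolume directory rather than at the top level of the mount. It assumes a layout with an "@home" directory at the top of /mnt, which is not confirmed anywhere in this thread. The function name salvage, the user name youruser, and the backup path are hypothetical.

```shell
#!/bin/sh
# Sketch only: copy selected paths out of a read-only rescue mount.
# Assumptions (not confirmed in this thread): the filesystem is already
# mounted read-only, and user data sits under a subvolume directory
# such as "@home" at the top level of that mount.

# salvage SRC_ROOT DEST REL_PATH...
# Copies each REL_PATH found under SRC_ROOT into DEST, keeping the same
# relative layout. It keeps going after a failure and returns non-zero
# if any path was missing or could not be copied.
salvage() {
    src_root=$1
    dest=$2
    shift 2
    failed=0
    mkdir -p "$dest" || return 1
    for rel in "$@"; do
        if [ -e "$src_root/$rel" ]; then
            parent=$dest/$(dirname "$rel")
            mkdir -p "$parent" || { failed=1; continue; }
            # -a copies recursively and preserves modes and timestamps
            cp -a "$src_root/$rel" "$parent/" || failed=1
        else
            echo "not found: $src_root/$rel" >&2
            failed=1
        fi
    done
    [ "$failed" -eq 0 ]
}

# Example, from a root shell in the live environment (paths hypothetical):
#   mount -o ro,rescue=all /dev/nvme0n1p2 /mnt
#   ls /mnt        # look for directories such as @home
#   salvage /mnt /path/to/backup "@home/youruser/.mozilla"
```

The copy target should be on a different disk, so that nothing is written to the damaged SSD.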