Hi!
Long story short: I get frequent kernel panics (about one per hour) while playing WoW. I haven't tested other games yet. I finally got kdump working, and here is the first dmesg output:
[ 584.908903] [ C10] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 584.908907] [ C10] #PF: supervisor read access in kernel mode
[ 584.908910] [ C10] #PF: error_code(0x0000) - not-present page
[ 584.908912] [ C10] PGD 0 P4D 0
[ 584.908915] [ C10] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 584.908918] [ C10] CPU: 10 PID: 4276 Comm: wineserver Kdump: loaded Tainted: G OE 6.10.8-2-cachyos #1 56996ea3d65410c787cafb5fc91984f901b413d0
[ 584.908922] [ C10] Hardware name: To Be Filled By O.E.M. X570 Taichi/X570 Taichi, BIOS P5.63 08/22/2024
[ 584.908924] [ C10] RIP: 0010:dcn10_set_drr+0xe2/0x110 [amdgpu]
[ 584.909202] [ C10] Code: 85 ff 74 44 48 8b 17 48 85 d2 74 3c 48 8b 92 28 01 00 00 48 85 d2 74 0b 48 89 ee e8 98 84 5f dc 48 8b 03 48 8b b8 f8 00 00 00 <48> 8b 07 48 8b 80 40 01 00 00 48 85 c0 74 0f ba 02 00 00 00 be 00
[ 584.909204] [ C10] RSP: 0018:ffffb40e00424db0 EFLAGS: 00010086
[ 584.909207] [ C10] RAX: ffff89f2e1ac0be0 RBX: ffffb40e00424e08 RCX: 0000000000000000
[ 584.909209] [ C10] RDX: 0000000080010035 RSI: 00000000000141e4 RDI: 0000000000000000
[ 584.909211] [ C10] RBP: ffffb40e00424db4 R08: 0000000000000008 R09: 000000009f3f07ff
[ 584.909212] [ C10] R10: ffffb40e00424da0 R11: 0000000080000000 R12: ffffb40e00424e10
[ 584.909214] [ C10] R13: ffff89ef55f80178 R14: ffff89ef55f804f0 R15: ffff89ef453faf60
[ 584.909216] [ C10] FS: 00007a9dd4731b40(0000) GS:ffff89f66ed00000(0000) knlGS:0000000000000000
[ 584.909218] [ C10] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 584.909220] [ C10] CR2: 0000000000000000 CR3: 000000023add8000 CR4: 0000000000f50ef0
[ 584.909222] [ C10] PKRU: 55555554
[ 584.909224] [ C10] Call Trace:
[ 584.909226] [ C10] <IRQ>
[ 584.909228] [ C10] ? __die_body.cold+0x8/0x12
[ 584.909233] [ C10] ? page_fault_oops+0x15a/0x2e0
[ 584.909237] [ C10] ? generic_reg_set_ex+0x156/0x2d0 [amdgpu d42adf081d1bd1efedaa3586cdc4ecc046215f96]
[ 584.909454] [ C10] ? exc_page_fault+0x81/0x190
[ 584.909458] [ C10] ? asm_exc_page_fault+0x26/0x30
[ 584.909464] [ C10] ? dcn10_set_drr+0xe2/0x110 [amdgpu d42adf081d1bd1efedaa3586cdc4ecc046215f96]
[ 584.909706] [ C10] dc_stream_adjust_vmin_vmax+0x195/0x360 [amdgpu d42adf081d1bd1efedaa3586cdc4ecc046215f96]
[ 584.909876] [ C10] dm_crtc_high_irq+0x230/0x2b0 [amdgpu d42adf081d1bd1efedaa3586cdc4ecc046215f96]
[ 584.910066] [ C10] amdgpu_dm_irq_handler+0x85/0x1f0 [amdgpu d42adf081d1bd1efedaa3586cdc4ecc046215f96]
[ 584.910253] [ C10] amdgpu_irq_dispatch+0xd6/0x210 [amdgpu d42adf081d1bd1efedaa3586cdc4ecc046215f96]
[ 584.910402] [ C10] amdgpu_ih_process+0x83/0x100 [amdgpu d42adf081d1bd1efedaa3586cdc4ecc046215f96]
[ 584.910544] [ C10] amdgpu_irq_handler+0x23/0x60 [amdgpu d42adf081d1bd1efedaa3586cdc4ecc046215f96]
[ 584.910685] [ C10] __handle_irq_event_percpu+0x4d/0x1b0
[ 584.910688] [ C10] handle_irq_event+0x3b/0x90
[ 584.910690] [ C10] handle_edge_irq+0x9a/0x260
[ 584.910693] [ C10] __common_interrupt+0x41/0xa0
[ 584.910696] [ C10] common_interrupt+0x80/0xa0
[ 584.910698] [ C10] </IRQ>
[ 584.910699] [ C10] <TASK>
[ 584.910701] [ C10] asm_common_interrupt+0x26/0x40
[ 584.910703] [ C10] RIP: 0010:_raw_spin_unlock_irqrestore+0x1d/0x40
[ 584.910705] [ C10] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 c6 07 00 0f 1f 00 f7 c6 00 02 00 00 74 06 fb 0f 1f 44 00 00 <65> ff 0d 74 1c b7 62 74 05 e9 f0 04 24 00 e8 10 33 f4 fe e9 e6 04
[ 584.910707] [ C10] RSP: 0018:ffffb40e1cfc7bb8 EFLAGS: 00000206
[ 584.910709] [ C10] RAX: ffff89f66ed25a00 RBX: ffffb40e1cfc7bc0 RCX: ffffb40e26f03d68
[ 584.910710] [ C10] RDX: 0000000000000000 RSI: 0000000000000287 RDI: ffff89f66ed259c0
[ 584.910711] [ C10] RBP: ffff89ef99788000 R08: ffff89f66ed25a20 R09: 0000000000000000
[ 584.910713] [ C10] R10: 0000000000000000 R11: 0000000000000100 R12: ffff89f66ed25a00
[ 584.910714] [ C10] R13: 0000000000000287 R14: ffff89f66ed259c0 R15: ffff89f66ed259c0
[ 584.910717] [ C10] ? srso_alias_return_thunk+0x5/0xfbef5
[ 584.910720] [ C10] schedule_hrtimeout_range_clock+0x203/0x2c0
[ 584.910722] [ C10] ? __pfx_hrtimer_wakeup+0x10/0x10
[ 584.910726] [ C10] ep_poll+0x623/0x6f0
[ 584.910730] [ C10] ? __pfx_ep_autoremove_wake_function+0x10/0x10
[ 584.910733] [ C10] __x64_sys_epoll_wait+0x19b/0x1e0
[ 584.910737] [ C10] do_syscall_64+0x82/0x190
[ 584.910739] [ C10] ? do_syscall_64+0x8e/0x190
[ 584.910742] [ C10] ? srso_alias_return_thunk+0x5/0xfbef5
[ 584.910743] [ C10] ? syscall_exit_to_user_mode_prepare+0x148/0x170
[ 584.910746] [ C10] ? srso_alias_return_thunk+0x5/0xfbef5
[ 584.910748] [ C10] ? syscall_exit_to_user_mode+0x73/0x1f0
[ 584.910750] [ C10] ? srso_alias_return_thunk+0x5/0xfbef5
[ 584.910751] [ C10] ? do_syscall_64+0x8e/0x190
[ 584.910756] [ C10] ? srso_alias_return_thunk+0x5/0xfbef5
[ 584.910757] [ C10] ? syscall_exit_to_user_mode_prepare+0x148/0x170
[ 584.910760] [ C10] ? srso_alias_return_thunk+0x5/0xfbef5
[ 584.910761] [ C10] ? syscall_exit_to_user_mode+0x73/0x1f0
[ 584.910763] [ C10] ? srso_alias_return_thunk+0x5/0xfbef5
[ 584.910765] [ C10] ? do_syscall_64+0x8e/0x190
[ 584.910767] [ C10] ? do_syscall_64+0x8e/0x190
[ 584.910769] [ C10] ? srso_alias_return_thunk+0x5/0xfbef5
[ 584.910771] [ C10] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 584.910773] [ C10] RIP: 0033:0x7a9dd4bded17
[ 584.910793] [ C10] Code: ff ff ff ff eb ba 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 80 3d 55 c3 0d 00 00 41 89 ca 74 10 b8 e8 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 c3 48 83 ec 28 89 54 24 18 48 89 74 24 10
[ 584.910794] [ C10] RSP: 002b:00007fffdaf84038 EFLAGS: 00000202 ORIG_RAX: 00000000000000e8
[ 584.910796] [ C10] RAX: ffffffffffffffda RBX: 00007fffdaf84050 RCX: 00007a9dd4bded17
[ 584.910797] [ C10] RDX: 0000000000000080 RSI: 00007fffdaf84040 RDI: 000000000000000e
[ 584.910799] [ C10] RBP: 00007fffdaf84040 R08: 0000000000000007 R09: 00005ee10e8cd600
[ 584.910800] [ C10] R10: 000000000000000a R11: 0000000000000202 R12: 00007fffdaf84050
[ 584.910801] [ C10] R13: 00007fffdaf84808 R14: 0000000000000000 R15: 0000000000000001
[ 584.910805] [ C10] </TASK>
[ 584.910806] [ C10] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq nf_conntrack_netbios_ns nf_conntrack_broadcast nft_masq nft_reject_ipv4 bridge stp llc nf_nat_tftp nf_conntrack_tftp nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib ccm nft_reject_inet algif_aead nf_reject_ipv4 crypto_null nf_reject_ipv6 nft_reject des3_ede_x86_64 cbc nft_ct des_generic libdes md4 nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables cmac algif_hash algif_skcipher af_alg bnep vfat fat joydev mousedev intel_rapl_msr amd_atl intel_rapl_common kvm_amd iwlmvm hid_generic kvm snd_hda_codec_hdmi crct10dif_pclmul uvcvideo mac80211 videobuf2_vmalloc crc32_pclmul snd_hda_intel polyval_clmulni uvc snd_usb_audio snd_intel_dspcfg videobuf2_memops polyval_generic snd_virtuoso btusb ghash_clmulni_intel snd_intel_sdw_acpi videobuf2_v4l2 libarc4 snd_hda_codec snd_oxygen_lib snd_usbmidi_lib sha512_ssse3 btrtl ucsi_ccg videodev sha1_ssse3 snd_mpu401_uart snd_ump iwlwifi typec_ucsi btintel snd_hda_core aesni_intel snd_rawmidi
[ 584.910857] [ C10] btbcm typec videobuf2_common snd_seq_device gf128mul btmtk snd_hwdep crypto_simd roles usbhid cryptd mc snd_pcm bluetooth cfg80211 wmi_bmof mxm_wmi intel_wmi_thunderbolt ccp rapl igb pcspkr snd_timer k10temp i2c_piix4 ptp crc16 snd pps_core thunderbolt dca rfkill soundcore mac_hid lz4 lz4_compress winesync(OE) pkcs8_key_parser i2c_dev crypto_user dm_mod loop nfnetlink zram ip_tables x_tables amdgpu btrfs blake2b_generic video libcrc32c amdxcp xor i2c_algo_bit raid6_pq drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper drm_buddy nvme drm_display_helper nvme_core sha256_ssse3 cec nvme_auth xhci_pci xhci_pci_renesas wmi crc32c_generic crc32c_intel
[ 584.910902] [ C10] CR2: 0000000000000000
[ 584.910904] [ C10] ---[ end trace 0000000000000000 ]---
[ 584.910905] [ C10] RIP: 0010:dcn10_set_drr+0xe2/0x110 [amdgpu]
[ 584.911137] [ C10] Code: 85 ff 74 44 48 8b 17 48 85 d2 74 3c 48 8b 92 28 01 00 00 48 85 d2 74 0b 48 89 ee e8 98 84 5f dc 48 8b 03 48 8b b8 f8 00 00 00 <48> 8b 07 48 8b 80 40 01 00 00 48 85 c0 74 0f ba 02 00 00 00 be 00
[ 584.911139] [ C10] RSP: 0018:ffffb40e00424db0 EFLAGS: 00010086
[ 584.911142] [ C10] RAX: ffff89f2e1ac0be0 RBX: ffffb40e00424e08 RCX: 0000000000000000
[ 584.911144] [ C10] RDX: 0000000080010035 RSI: 00000000000141e4 RDI: 0000000000000000
[ 584.911146] [ C10] RBP: ffffb40e00424db4 R08: 0000000000000008 R09: 000000009f3f07ff
[ 584.911148] [ C10] R10: ffffb40e00424da0 R11: 0000000080000000 R12: ffffb40e00424e10
[ 584.911149] [ C10] R13: ffff89ef55f80178 R14: ffff89ef55f804f0 R15: ffff89ef453faf60
[ 584.911151] [ C10] FS: 00007a9dd4731b40(0000) GS:ffff89f66ed00000(0000) knlGS:0000000000000000
[ 584.911154] [ C10] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 584.911156] [ C10] CR2: 0000000000000000 CR3: 000000023add8000 CR4: 0000000000f50ef0
[ 584.911158] [ C10] PKRU: 55555554
[ 584.911160] [ C10] Kernel panic - not syncing: Fatal exception in interrupt
[ 584.912166] [ C10] Kernel Offset: 0x1b400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
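For anyone skimming the trace above: the bytes marked `<48> 8b 07` in the `Code:` line are where the fault hit. As far as I can tell they decode to a load through RDI, and RDI is all zeroes in the register dump, which would match the NULL address in CR2. Below is a rough Python sketch that pulls the faulting symbol and the registers out of a pasted oops so this can be checked quickly. The helper name `summarize_oops` and the regexes are my own assumptions about the log layout, not part of any kernel tooling.

```python
import re

# Assumed layout of the kernel-mode RIP line: "RIP: 0010:symbol+0xoff/0xsize [module]".
RIP_RE = re.compile(
    r"RIP: 0010:(?P<sym>[\w.]+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<mod>\w+)\])?"
)
# One register assignment such as "RDI: 0000000000000000" (16 hex digits).
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def summarize_oops(text):
    """Hypothetical helper: return the first kernel RIP and the registers after it."""
    summary = {"rip": None, "regs": {}}
    for line in text.splitlines():
        if summary["rip"] is None:
            match = RIP_RE.search(line)
            if match:
                summary["rip"] = match.groupdict()
            continue
        if "Call Trace:" in line:
            break  # the register dump ends where the call trace begins
        for name, value in REG_RE.findall(line):
            summary["regs"][name] = int(value, 16)
    return summary


# A few lines copied from the dmesg output above.
sample = """\
[ 584.908924] [ C10] RIP: 0010:dcn10_set_drr+0xe2/0x110 [amdgpu]
[ 584.909204] [ C10] RSP: 0018:ffffb40e00424db0 EFLAGS: 00010086
[ 584.909207] [ C10] RAX: ffff89f2e1ac0be0 RBX: ffffb40e00424e08 RCX: 0000000000000000
[ 584.909209] [ C10] RDX: 0000000080010035 RSI: 00000000000141e4 RDI: 0000000000000000
[ 584.909224] [ C10] Call Trace:
"""

info = summarize_oops(sample)
print(info["rip"]["sym"], info["rip"]["off"], info["rip"]["mod"])
print("RDI is NULL:", info["regs"]["RDI"] == 0)
```

This only reads the text of the log; it does not resolve the offset to a source line, which would need the matching amdgpu debug symbols.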
Is there something I can do? I’ll try to downgrade to an older kernel now.
Best regards,
Mr nUUb
EDIT: Is there an archive where I can download older kernel versions? A few days ago I accidentally deleted /var/cache/*
…
EDIT 2: I think I found the issue: "Kernel panic during amdgpu IRQ" (#3142) in the drm/amd issue tracker on GitLab.
EDIT 3: Temporary workaround: disable VRR. With VRR enabled, I can very easily trigger the crash (and graphical artifacts) by rapidly enabling/disabling nightlight (KDE Plasma). With VRR disabled, I cannot reproduce it anymore.