System crash when using iGPU
Documenting for future use. This is still not resolved as of:
Date: 2023-01-21
Kernel: 6.1.6-060106-generic
Mesa: mesa-va-drivers/kinetic 23.1~git2301210600.797b83~oibaf~k amd64
i915 Firmware: linux-firmware/kinetic-updates,now 20220923.gitf09bebf3-0ubuntu1.3 all [installed]
I’m using a 13900K for my homelab, which has mostly been excellent. However, the integrated GPU has major issues with hardware video transcoding. I’m testing hardware transcoding on Jellyfin, as well as Tdarr’s Boosh-Transcode using QSV GPU & FFMPEG.
Summary: the GPU eventually crashes, more frequently when running more streams. jellyfin-ffmpeg sometimes recovers and reverts to software transcoding, but more often it just hangs until killed.
FWIW, this was also an issue on the 12th-gen CPU I recently upgraded from, so I’m not expecting it to be fixed anytime soon. So much for 2 codec engines 🌩
ecodes when transcoding using QSV:
[ 534.980513] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:1:a014184f, in tdarr-ffmpeg [2058]
[ 534.980612] i915 0000:00:10.0: [drm] Resetting chip for stopped heartbeat on rcs0
[ 535.082049] i915 0000:00:10.0: [drm] tdarr-ffmpeg[1974] context reset due to GPU hang
[ 535.082057] i915 0000:00:10.0: [drm] tdarr-ffmpeg[2058] context reset due to GPU hang
[ 535.094994] i915 0000:00:10.0: [drm] GuC firmware i915/tgl_guc_70.bin version 70.5.1
[ 535.094998] i915 0000:00:10.0: [drm] HuC firmware i915/tgl_huc.bin version 7.9.3
[ 301.022234] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:4:44df4a95, in ffmpeg [5460]
[ 301.051882] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:4:98f287b8, in ffmpeg [5262]
[ 194.711129] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:1:a9fa002d, in ffmpeg [5383]
[ 194.712922] i915 0000:00:10.0: [drm] Resetting rcs0 for CS error
[ 194.744843] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:4:ec85561a, in ffmpeg [5383]
[ 1182.201301] i915 0000:00:10.0: [drm] ffmpeg[13947] context reset due to GPU hang
[ 1182.201328] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:8:cc3768fb, in ffmpeg [13947]
[ 135.855164] i915 0000:00:10.0: [drm] ffmpeg[5065] context reset due to GPU hang
[ 143.000691] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:4:9c5a5653, in ffmpeg [5065]
[ 143.004613] i915 0000:00:10.0: [drm] Resetting vcs0 for CS error
[ 143.004641] i915 0000:00:10.0: [drm] ffmpeg[5065] context reset due to GPU hang
[ 150.907517] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:4:9c595551, in ffmpeg [5065]
[ 157.992504] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:0:00000000
ecodes when transcoding using VAAPI:
[ 272.803882] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:4:cc2b051d, in ffmpeg [4529]
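To compare runs, it can help to tally the hang lines instead of reading dmesg by eye. The following is a minimal sketch, not a tested diagnostic tool: it pulls the ecode, process name, and PID out of `GPU HANG` lines in the format shown in the logs above and counts hangs per process. The function names `parse_hangs` and `count_by_process` are hypothetical, and the three colon-separated ecode fields are treated as an opaque string since their exact meaning isn't established here.

```python
import re
from collections import Counter

# Matches lines such as:
#   [ 534.980513] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:1:a014184f, in tdarr-ffmpeg [2058]
# The ", in <process> [<pid>]" suffix is optional because some hang lines omit it.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-f]+:[0-9a-f]+:[0-9a-f]+)"
    r"(?:, in (?P<proc>\S+) \[(?P<pid>\d+)\])?"
)

def parse_hangs(lines):
    """Return (ecode, process, pid) tuples for every GPU HANG line (hypothetical helper)."""
    hangs = []
    for line in lines:
        m = HANG_RE.search(line)
        if m:
            pid = int(m.group("pid")) if m.group("pid") else None
            hangs.append((m.group("ecode"), m.group("proc"), pid))
    return hangs

def count_by_process(hangs):
    """Count hangs per process name; lines without a process are grouped as 'unknown'."""
    return Counter(proc or "unknown" for _, proc, _ in hangs)

# Sample lines copied from the logs above.
SAMPLE = """\
[ 534.980513] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:1:a014184f, in tdarr-ffmpeg [2058]
[ 534.980612] i915 0000:00:10.0: [drm] Resetting chip for stopped heartbeat on rcs0
[ 157.992504] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:0:00000000
[ 272.803882] i915 0000:00:10.0: [drm] GPU HANG: ecode 12:4:cc2b051d, in ffmpeg [4529]
"""

hangs = parse_hangs(SAMPLE.splitlines())
for proc, n in count_by_process(hangs).most_common():
    print(f"{proc}: {n}")
```

To use it on a live system, the sample string could be replaced with the saved output of `dmesg`, for example by reading a text file line by line and passing the lines to `parse_hangs`.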
Somewhat relevant links
- https://community.frame.work/t/hard-freezing-on-fedora-36-with-the-new-12th-gen-system/20675/23
- https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs
- https://gitlab.freedesktop.org/drm/intel/-/issues/4858
- https://community.frame.work/t/hard-freezing-on-fedora-36-with-the-new-12th-gen-system/20675/47
- https://community.frame.work/t/hard-freezing-on-fedora-36-with-the-new-12th-gen-system/20675/138
- https://forums.servethehome.com/index.php?threads/lga-1700-alder-lake-servers.35719/page-9