Is anyone else experiencing low performance with the latest scx-scheds?

After updating to the latest scx-scheds (v1.0.9), performance drops from 80+ fps with <10 ms frame times to <60 fps with >25 ms, and responses to any input (keyboard or mouse) are very choppy. I used to run scx_lavd in performance mode.

Game: Space Marine 2 (only game I play :rofl:)
CPU: i9-13900KF
GPU: RTX 3700 Ti, 570 closed-source drivers, GSP disabled
KDE Plasma Wayland with VRR Automatic

I’ve reverted with Timeshift and frozen the package by ignoring it in pacman.conf for the moment.
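
For reference, holding a package back in pacman.conf looks roughly like this. It is a sketch that assumes the installed package is named scx-scheds; adjust the name if you use a different variant.

```ini
# /etc/pacman.conf
# Add the IgnorePkg line under the existing [options] section.
[options]
IgnorePkg = scx-scheds
```

Remove the line again once a fixed version is out, otherwise pacman will keep skipping the package on upgrades.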

Where can I start debugging?
Thanks in advance!

Were you using v1.0.8 before?

edit: Try scx-scheds-git and see if you can reproduce.

Yes

Same results. I also noticed this time that GPU utilization never goes above 75%, where before it was >95% (with v1.0.8). I seem to remember it was the same with v1.0.9.

Going through git log, I’ve found these commits to be the suspects:

52b3691ce2686bcab1eef04c6dee8db32d7b8370 # scx_lavd: Reuse bpf_ktime_get_ns() as many as possible
94dc1c1c26faad5ce3be79970273a4bee1895f74 # Merge pull request #1107 from multics69/lavd-opt-preemption
758fba63902502993b40efc9bd32de793e44d833 # scx_lavd: Properly reinitialize cpumask for autopilot mode
c5e3cfa6575e01660cd31707564db25725378e80 # scx_lavd: Turn on frequency scaling in performance mode
5b10a44ee14d8d4c10b9c45edeb68dd8f3587ec9 # Merge pull request #1112 from multics69/lavd-opt-autopilot
ba80d5e20455c9cccf145a8546dc74cbf7490967 # scx_lavd: Add a fast path to pick_idle_cpu() when overloaded
16e13f962f94dd0be6d234d4376d9c35e51d9535 # Merge pull request #1117 from multics69/lavd-pick-idle-cpu
923e6b64dca394225deb46b8659b55d6ca010718 # scx_lavd: Change the default time slice to 5 msec
eb4a8ed3187a48356d87dce507b796e23ca75d97 # Merge pull request #1119 from multics69/lavd-ts-5000
b15eb6993ccc021482bf5974d86578d5952dd330 # scx_lavd: Kick an idle cpu as early as possible on ops.select_cpu()
8ec4c0146667912068a754dee8a723aaebfa5e8e # scx_lavd: Refactoring lavd_dispatch() for readability
9c6f58602a1839ca0fd12cc7d8667a6b998523fa # scx_lavd: Ensure to check all compute domains when task stealing
467bfdfcbbbd0143c68dd46ffdbbb316c2c7deab # Merge pull request #1125 from multics69/lavd-opt-kick
b126ffde3f0b65d69eeb2d75a7ac76c62e22c4ee # scx_lavd: Prioritize a migration-disabled task
e81412b14a85cb52c0a58b1f57bd9d95fc4284c4 # Merge pull request #1126 from multics69/lavd-mig-dis
2b7033a9bdc6df62935ddd794f7ee2f115046b86 # scx_lavd: Add a fast path for a migration-disabled task in pick_idle_cpu()
35c4d7d652c0d721e4065c951127961e8ac7f2cb # Merge pull request #1156 from multics69/lavd-fast-path-mig
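
For anyone who wants to build a similar list themselves, here is a minimal sketch. It assumes a local clone of the scx repository and that the two releases are tagged v1.0.8 and v1.0.9 (the tag names are an assumption; check `git tag` first). `lavd_commits` is a hypothetical helper name. Filtering by path usually drops the merge commits, so the output will be shorter than the list above.

```shell
# Print commits between two revisions that touch scx_lavd,
# one per line in "hash # subject" form.
# Assumption: the release tags are named v1.0.8 / v1.0.9 in your clone.
lavd_commits() {
    repo=$1 good=$2 bad=$3
    git -C "$repo" log --format='%H # %s' "$good..$bad" -- scheds/rust/scx_lavd
}

# Usage, from inside a clone of the scx repository:
#   lavd_commits . v1.0.8 v1.0.9
```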

Thanks, I have notified the lavd developer. Which kernel version are you using, and are you passing any flags to lavd?

6.13.1-2-cachyos
scx_lavd --performance

Thank you! I’ll stay here in case you need to do any other tests.

What if you omit --performance?

Somewhat better, but far from v1.0.8. The choppiness is gone.

60 fps, 16 ms, 80-85% GPU utilization

I don’t know whether you’re active on Discord (I pinged you there), but if you’re available to test bisected builds, please start with the first build of scx-scheds-git from Nginx Directory. It is built from the exact tag the repo package is built on, but with all lavd commits reverted. In theory, it should match the performance of 1.0.8.

pacman -Qi scx-scheds-git
...
Version                   : 1.0.8.r162.ga878d8be-2

Confirming that the provided version is installed. The game runs smooth like silk linen :sunglasses:

From the same link, I have pushed a -3 build, please try that too :slight_smile:

The game runs flawlessly! :metal: :sunglasses: :metal:

❯ systemctl status scx_loader.service
...
CGroup: /system.slice/scx_loader.service
        ├─1153 /usr/bin/scx_loader
        └─1201 scx_lavd --performance

❯ pacman -Qi scx-scheds-git
...
Version                   : 1.0.8.r162.ga878d8be-3

-4 is up, please try that build.

This is the bad one. Which PR is guilty? :stuck_out_tongue_winking_eye:

Don’t know yet, please try -5 on the same link. This or the next build should hopefully be the last build that you need to test.

Same as -4, even a bit worse.

Found the bad commit, please try -6.

This time it runs perfectly :metal: :sunglasses: :metal:

Nice, below is the bad commit:

commit b15eb6993ccc021482bf5974d86578d5952dd330
Author: Changwoo Min <changwoo@igalia.com>
Date:   Fri Dec 20 17:29:39 2024 +0900

    scx_lavd: Kick an idle cpu as early as possible on ops.select_cpu()
    
    Right after direct dispatch to an idle CPU, kick the CPU to reduce the latency.
    
    Signed-off-by: Changwoo Min <changwoo@igalia.com>

diff --git a/scheds/rust/scx_lavd/src/bpf/main.bpf.c b/scheds/rust/scx_lavd/src/bpf/main.bpf.c
index f75c840d..a4658f93 100644
--- a/scheds/rust/scx_lavd/src/bpf/main.bpf.c
+++ b/scheds/rust/scx_lavd/src/bpf/main.bpf.c
@@ -1010,6 +1010,7 @@ s32 BPF_STRUCT_OPS(lavd_select_cpu, struct task_struct *p, s32 prev_cpu,
          * disptach the task to the idle cpu right now.
          */
         direct_dispatch(p, taskc, 0);
+        scx_bpf_kick_cpu(cpu_id, SCX_KICK_IDLE);
         return cpu_id;
     }
 

One-liners can cause great harm too :smile:
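
For anyone who wants to hunt a regression like this on their own, the build-by-build narrowing in this thread is what `git bisect` automates. Below is a minimal sketch, not necessarily how the builds here were produced. It assumes a local clone of the scx repository, that the good and bad revisions are reachable as v1.0.8 and v1.0.9 (assumed tag names), and that you have a scriptable check that exits 0 on a good build and 1 on a bad one. `bisect_lavd` and the check command are hypothetical.

```shell
# Bisect scx_lavd changes between a known-good and a known-bad revision.
# $4 is a shell command that exits 0 when the checkout is good, 1 when bad.
# Prints the hash of the first bad commit on stdout; progress goes to stderr.
bisect_lavd() {
    repo=$1 good=$2 bad=$3 check=$4
    # Restrict the search to commits that touch the scx_lavd sources.
    git -C "$repo" bisect start "$bad" "$good" -- scheds/rust/scx_lavd >&2 || return 1
    # Let git check out candidates and run the check on each one.
    git -C "$repo" bisect run sh -c "$check" >&2
    # After a finished bisect, refs/bisect/bad points at the first bad commit.
    git -C "$repo" rev-parse refs/bisect/bad
    # Return the working tree to where it was before bisecting.
    git -C "$repo" bisect reset >&2
}

# Usage (the check script is hypothetical, supply your own):
#   bisect_lavd . v1.0.8 v1.0.9 './my-benchmark-check.sh'
```

When the test is manual, as it was here (launch the game and watch the frame times), skip `git bisect run` and mark each build yourself with `git bisect good` or `git bisect bad` after trying it.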

Yeah :metal:t2::sunglasses::metal:t2: Great job @naim