Strange performance using Ansible (and maybe other programs)

Hi,

I recently switched to CachyOS and I like it. After hopping from distro to distro every couple of years, I am finally back on “Arch” (in a broader sense) - I think the last time was around 2010 or so.

So here is the thing: I am using Ansible (more precisely, ansible-playbook) to manage some systems (a webserver plus some local VMs). So far this has never been a problem; the playbooks ran smoothly. But on CachyOS this is not the case - they are slow as hell. I have tried quite a lot of things:

  • running them from a VM - no problem. Even an Arch Linux VM had no problem.
  • using a container on CachyOS with podman
  • switching schedulers and even trying other kernels (including core/linux from the Arch repo)

All of those work fine, except for the kernel/scheduler switch on CachyOS (no matter which combination, same “bad” performance). My main playbook is for my webserver and I run it quite regularly. It normally finishes in ~1min (with some variation, sure); on CachyOS it is more like 3m30s+ (up to 12 min!). I also tested it in a “blank”, almost unconfigured CachyOS VM installation - same issue.

So I am a little confused and have no idea where to look next. It's not really a showstopper for me: it still works, just very slowly, and there are other options I can use (podman + container, or a small VM that I start up and shut down afterwards). But I would like to understand the underlying problem - where is the difference between CachyOS and any other distro here? As Ansible is actually quite simple (copying some Python scripts to a remote host via scp and executing them), this may also impact other programs.

The scheduler probably differs, but I wouldn’t expect it to have that much of an impact?

Hm, I fear it may be some strange SSH configuration or network latency issue. I haven’t figured it out so far, but when running it “locally” (against a test VM, a clone of the webserver that I set up to test updates and such), it does fine. I am really out of ideas, but I want to understand what makes CachyOS so terribly slow in this specific case. I am not running a VPN or anything like that (and if I were, the Docker container/Ansible VM would have the same issue as the host).

So Ansible locally to a test VM (nothing to do on the VM side):
~20s from a VM and also from CachyOS

Running it against an external webserver (located in the same country):
~1min from a VM or Docker container (the container should be using the same kernel and scheduler as the host)
~3min+ (up to 10!) from CachyOS (host or from a test VM) - each task/step is significantly slower!
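
To check whether the per-step slowdown comes from the SSH round trip itself rather than from Ansible, one could time a trivial remote command repeatedly from each environment (host, container, VM). This is only a minimal sketch, assuming Python 3 and key-based login; `time_command`, `summarize` and the host `user@webserver` are made-up names for illustration, not part of my actual setup:

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=False: a failing command is still timed instead of raising.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max) of the measured durations."""
    return min(durations), statistics.median(durations), max(durations)


# Example call with a hypothetical host (replace with the real one):
#   cmd = ["ssh", "-o", "BatchMode=yes", "user@webserver", "true"]
#   print(summarize(time_command(cmd, runs=10)))
```

If the median for a bare `ssh ... true` is already several times higher on the CachyOS host than inside the container, that would point at the connection rather than at Ansible itself.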

I checked quite a lot of settings; today I compared the network settings in sysctl with an Arch Linux default, without much luck. A tcpdump capture shows a “high” number of “spurious retransmissions” when I execute Ansible from the command line directly on the host. From Docker/Podman those aren’t there. I may try it over a network cable too - maybe it's some strange handling of the WLAN adapter, although that should usually also affect a container/VM. I even downgraded ssh to the Arch Linux version.
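
The retransmissions could also be put into numbers without tcpdump, by reading the kernel's TCP counters before and after a playbook run. This is a minimal sketch, assuming the Linux /proc/net/snmp layout (for each protocol, a line of field names followed by a line of values); the helper names are made up for illustration. Note that `RetransSegs` counts all retransmitted segments, not only the ones a capture tool would label “spurious”:

```python
def parse_snmp(text, proto="Tcp"):
    """Parse /proc/net/snmp-style text and return {field: int} for one protocol."""
    rows = [line.split() for line in text.splitlines() if line.startswith(proto + ":")]
    if len(rows) < 2:
        raise ValueError(f"no header/value pair found for {proto}")
    header, values = rows[0], rows[1]
    # Skip the leading "Tcp:" label on both lines.
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def retransmit_stats(before, after):
    """Return (retransmitted segments, sent segments, ratio) between two snapshots."""
    retrans = after["RetransSegs"] - before["RetransSegs"]
    sent = after["OutSegs"] - before["OutSegs"]
    ratio = retrans / sent if sent else 0.0
    return retrans, sent, ratio


def read_tcp_counters(path="/proc/net/snmp"):
    """Read the current TCP counters from the kernel (Linux only)."""
    with open(path) as f:
        return parse_snmp(f.read())


# Usage idea: snapshot, run the playbook, snapshot again.
#   before = read_tcp_counters()
#   ... run ansible-playbook ...
#   print(retransmit_stats(before, read_tcp_counters()))
```

The counters are kept per network namespace, so they would have to be read in the same environment where Ansible runs (host, container or VM) for the comparison to make sense.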

Further ideas would be great :slight_smile: I would like to solve this issue.