Hello guys! Has anybody tried to compile and install the CachyOS kernel with AutoFDO and Propeller?
At first I thought it was already enabled in the AUR linux-cachyos package, but it’s not.
I’m not sure what I can say except that my stock Cachy kernel installed through the kernel manager does indeed have AutoFDO and Propeller enabled. Obviously don’t clone an ancient source tree, but I don’t see why the directions on that page and in the PKGBUILD wouldn’t work. Just start from the current kernel.
I understand that but also I think it’s as much or more about optimizing for your workload, and I’ve personally been fine with Cachy’s presumably well-chosen representative workload there, especially given the expectation of
Thanks for revisiting and coming back with the response. That’s about what I expected which was why I couldn’t really offer much beyond “it’s enabled in github so presumably the kernel and build process isn’t broken”
IMO it’d be more worthwhile when setting up a system with a well-defined workload, where I could profile with a known, more representative workload (say a server or other system dedicated to a specific performance-sensitive workload), but I’m also not sure I’d choose a rolling release there.
3 bash scripts + 1 bash script to automate everything. It’s possible to make the whole process convenient.
I did the same for systemd, which is not optimized in CachyOS: just taking the sources from the Arch Linux repository and compiling them for my processor at least. The main idea is to see what can really be achieved.
A rolling release might not seem optimal, but at the same time it allows unleashing the real potential.
for example:
also I might have done it wrong. it would be helpful if someone clarified:
first kernel build: autofdo + debug + (lto?). LTO is needed for Propeller later, but it’s bad for AutoFDO profiling
second kernel build: autofdo + autofdo profile + debug + thinLTO (full LTO produced enormously large packages for me, +~200MB)
Even Debian stable is on a reasonably fast kernel at this point. The challenge of a rolling distro is that there are so many potential moving pieces that something else might change while you’re optimizing elsewhere, particularly with a fairly binary-centric one like the Arch derivatives. Too much else going on around your optimizations, imo. If your desire is to roll out compiler optimizations everywhere and be on the bleeding edge, a source-based distro like Gentoo feels like a better fit. But that’s just my opinion.
Even scripted, the compilation takes time, and to be a net benefit in a meaningful way that time needs to balance with the performance gains.
Your gains don’t seem outside of the expected range for an already-optimized kernel.
All I can say is the docs say not to.
# Enable AUTOFDO_CLANG for the first compilation to create a kernel, which can be used for profiling
# Workflow:
# https://cachyos.org/blog/2411-kernel-autofdo/
# 1. Compile Kernel with _autofdo=yes and _build_debug=yes
# 2. Boot the kernel in QEMU or on your system, see Workload
# 3. Profile the kernel and convert the profile, see Generating the Profile for AutoFDO
# 4. Put the profile into the sourcedir
# 5. Run kernel build again with the _autofdo_profile_name path to profile specified
: "${_autofdo:=no}"
# Name for the AutoFDO profile
: "${_autofdo_profile_name:=}"
# Propeller should be applied, after the kernel is optimized with AutoFDO
# Workflow:
# 1. Proceed with above AutoFDO Optimization, but enable at the final compilation also _propeller
# 2. Boot into the AutoFDO Kernel and profile it
# 3. Convert the profile into the propeller profile, example:
# create_llvm_prof --binary=/usr/src/debug/linux-cachyos-rc/vmlinux --profile=propeller.data --format=propeller --propeller_output_module_name --out=propeller_cc_profile.txt --propeller_symorder=propeller_ld_profile.txt
# 4. Place the propeller_cc_profile.txt and propeller_ld_profile.txt into the srcdir
# 5. Enable _propeller_prefix
: "${_propeller:=no}"
# Enable this after the profiles have been generated
: "${_propeller_profiles:=no}"
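For step 3 of the Propeller workflow, here is a minimal sketch that just prints the `create_llvm_prof` command from the PKGBUILD comment for a given vmlinux, perf data file and output prefix. The function name `propeller_convert_cmd` and the idea of passing a prefix are my own assumptions for illustration; the flags are copied from the comment above and nothing is executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PKGBUILD): prints the conversion
# command quoted in the PKGBUILD comment instead of running it.
propeller_convert_cmd() {
    local vmlinux="$1" perf_data="$2" prefix="$3"
    # The two output files share a prefix: <prefix>_cc_profile.txt and
    # <prefix>_ld_profile.txt, matching the names used in step 4.
    printf 'create_llvm_prof --binary=%s --profile=%s --format=propeller --propeller_output_module_name --out=%s_cc_profile.txt --propeller_symorder=%s_ld_profile.txt\n' \
        "$vmlinux" "$perf_data" "$prefix" "$prefix"
}

# Reproduces the example line from the PKGBUILD comment.
propeller_convert_cmd /usr/src/debug/linux-cachyos-rc/vmlinux propeller.data propeller
```

Printing first makes it easy to check that the vmlinux path matches the kernel you actually booted before running the real thing.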
# Run the build
makepkg --cleanbuild -sfi --skipinteg
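Since the options are declared with the `: "${_autofdo:=no}"` idiom, they should be overridable from the environment, so the two AutoFDO passes could be scripted roughly like this. This is only a dry-run sketch: `autofdo_build_cmd` and the profile name `kernel.afdo` are hypothetical, and I’m assuming `_autofdo=yes` has to stay set for the second pass.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the makepkg command line for each AutoFDO pass.
# Assumption: the PKGBUILD picks up _autofdo, _build_debug and
# _autofdo_profile_name from the environment.
MAKEPKG_ARGS="--cleanbuild -sfi --skipinteg"

autofdo_build_cmd() {
    # $1: pass number; $2: profile file name (pass 2 only, hypothetical)
    case "$1" in
        1)
            # Pass 1: instrumentation-ready kernel with debug info for profiling.
            echo "_autofdo=yes _build_debug=yes makepkg $MAKEPKG_ARGS"
            ;;
        2)
            # Pass 2: rebuild with the converted profile placed in the srcdir.
            [ -n "${2:-}" ] || { echo "pass 2 needs a profile name" >&2; return 1; }
            echo "_autofdo=yes _autofdo_profile_name=$2 makepkg $MAKEPKG_ARGS"
            ;;
        *)
            echo "unknown pass: $1" >&2
            return 1
            ;;
    esac
}

autofdo_build_cmd 1
autofdo_build_cmd 2 kernel.afdo
```

To actually build, the printed line can be pasted into a shell in the PKGBUILD directory; profiling and profile conversion happen between the two passes.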
As for Gentoo, I don’t think it’s fast. It doesn’t even have LTO out of the box.
I tried compiling various CachyOS packages from source (Arch/CachyOS repos): no real gains at all. Even firefox-pure has almost the same performance (it has PGO on board).
ing instruction at address: 0xffffffff82324beb with counter sum 28, instruction name: NOOPL
I20251108 21:08:22.559561 52871 llvm_propeller_binary_address_mapper.cc:463] Started reading the binary content from: /usr/src/debug/linux-sakkan//vmlinux
E20251108 21:08:22.594228 52871 create_llvm_prof.cc:238] INTERNAL: Failed to read the LLVM_BB_ADDR_MAP section from /usr/src/debug/linux-sakkan//vmlinux: unable to read SHT_LLVM_BB_ADDR_MAP section with index 61: unsupported SHT_LLVM_BB_ADDR_MAP version: 3.
Is it because I enabled full LTO?
I have the same issue, my research says the propeller toolchain may be outdated, but I’m not sure how to resolve that in the context of CachyOS. I think we’re using the vmlinux provided by CachyOS.
create_llvm_prof.cc:238] INTERNAL: Failed to read the LLVM_BB_ADDR_MAP section from /usr/src/debug/linux-cachyos-lto/vmlinux: unable to read SHT_LLVM_BB_ADDR_MAP section with index 61: unsupported SHT_LLVM_BB_ADDR_MAP version: 3.
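Both logs fail on the same thing: the `SHT_LLVM_BB_ADDR_MAP` section in vmlinux has a version that `create_llvm_prof` refuses to read, which would fit the "outdated toolchain" theory rather than a full-LTO problem, though that’s an assumption on my part. A small sketch to pull the rejected version out of such a log line (the function name `bb_addr_map_version` is made up for illustration):

```shell
#!/usr/bin/env bash
# Reads create_llvm_prof log lines on stdin and prints the section version
# it reported as unsupported, if any.
bb_addr_map_version() {
    sed -n 's/.*unsupported SHT_LLVM_BB_ADDR_MAP version: \([0-9][0-9]*\).*/\1/p'
}

# Log line copied from the post above.
log='E20251108 21:08:22.594228 52871 create_llvm_prof.cc:238] INTERNAL: Failed to read the LLVM_BB_ADDR_MAP section from /usr/src/debug/linux-sakkan//vmlinux: unable to read SHT_LLVM_BB_ADDR_MAP section with index 61: unsupported SHT_LLVM_BB_ADDR_MAP version: 3.'

printf '%s\n' "$log" | bb_addr_map_version   # prints 3
```

If the same version shows up with both thin and full LTO builds, that would point at the compiler/tool pairing rather than the LTO mode.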