
[DO NOT MERGE] [POC] fix(kernel): re-enable selftests package build#17045

Draft
rlmenge wants to merge 6 commits into tomls/base/main from rlmenge/tomls/kselftests-failures

rlmenge commented May 6, 2026

Merge Checklist

All boxes should be checked before merging the PR (just tick any boxes which don't apply to this PR)

  • The toolchain has been rebuilt successfully (or no changes were made to it)
  • The toolchain/worker package manifests are up-to-date
  • Any updated packages successfully build (or no packages were changed)
  • Packages depending on static components modified in this PR (Golang, *-static subpackages, etc.) have had their Release tag incremented.
  • Package tests (%check section) have been verified with RUN_CHECK=y for existing SPEC files, or added to new SPEC files
  • All package sources are available
  • cgmanifest files are up-to-date and sorted (./cgmanifest.json, ./toolkit/scripts/toolchain/cgmanifest.json, .github/workflows/cgmanifest.json)
  • LICENSE-MAP files are up-to-date (./LICENSES-AND-NOTICES/SPECS/data/licenses.json, ./LICENSES-AND-NOTICES/SPECS/LICENSES-MAP.md, ./LICENSES-AND-NOTICES/SPECS/LICENSE-EXCEPTIONS.PHOTON)
  • All source files have up-to-date hashes in the *.signatures.json files
  • sudo make go-tidy-all and sudo make go-test-coverage pass
  • Documentation has been updated to match any changes to the build system
  • Ready to merge

Summary

This patchset fixes the kernel selftests build and packaging path end-to-end. The main goal is to make kernel-selftests-internal build reliably, include the expected selftest binaries, and remain installable in images.

The failures were a chain rather than a single bug:

  1. One BPF light-skeleton selftest required kernel symbol probing during the build, which fails under mock's restricted systemd-nspawn environment.
  2. Many kernel selftests dropped -fPIE by resetting CFLAGS, then failed to link under Azure Linux's hardened PIE-by-default toolchain.
  3. Once the exec selftests built, RPM generated a bogus Requires: /usr/bin/inc from selftest-local scripts, making kernel-selftests-internal uninstallable in images.
  4. With the three failures above fixed, the selftests package build is re-enabled.

Patch Order

1. fix(kernel): drop BPF lskel variant of test_ksyms_weak

test_progs and test_progs-no_alu32 were missing from kernel-selftests-internal because the BPF selftests build failed while generating test_ksyms_weak.lskel.h.

That light-skeleton path asks libbpf/bpftool to resolve weak kernel symbols. In the mock systemd-nspawn build chroot, /proc/kallsyms reads are blocked with EACCES, which is expected container hardening: kallsyms exposes kernel symbol information and should not be available to an unprivileged build container.
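To check whether a given build environment shows the same restriction, a small probe along these lines may help. It is a minimal sketch, not part of the patch: `probe_symbol_table` is a hypothetical helper name, and it only reports whether the file can be opened and read, not whether symbol addresses are masked.

```python
def probe_symbol_table(path="/proc/kallsyms"):
    """Return 'readable', 'blocked', or 'missing' for the given path.

    Hypothetical helper: it only distinguishes the three outcomes that
    matter for the failure described above.
    """
    try:
        with open(path, "r") as handle:
            handle.readline()  # force an actual read, not just an open
        return "readable"
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        # Python raises PermissionError for both EACCES and EPERM.
        return "blocked"


if __name__ == "__main__":
    print(probe_symbol_table())
```

Running it inside the build chroot and on the host should show whether the two environments differ.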

Because the header generation failed, the BPF test_progs binaries did not link, and the spec's permissive install hooks allowed the missing binaries to go unnoticed until RPM contents were inspected.
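One way to stop this kind of omission from going unnoticed is to compare the installed tree against a list of binaries that must exist. The sketch below is illustrative only: `missing_binaries` is a hypothetical helper, and the install directory passed to it is an assumption about where the BPF selftests land, not something the spec defines.

```python
from pathlib import Path


def missing_binaries(install_dir, expected):
    """Return the names from `expected` that are not regular files in install_dir."""
    root = Path(install_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    import sys

    # The two binaries this patch is concerned with; the directory is
    # supplied by the caller because its location is an assumption here.
    required = ["test_progs", "test_progs-no_alu32"]
    absent = missing_binaries(sys.argv[1], required)
    if absent:
        raise SystemExit("missing selftest binaries: " + ", ".join(absent))
```

A check like this fails the build loudly instead of relying on someone inspecting the RPM contents afterwards.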

The patch removes the optional light-skeleton variant of test_ksyms_weak:

  • removes test_ksyms_weak.c from LSKELS_EXTRA
  • removes the matching test_weak_syms_lskel subtest
  • removes the now-unused test_ksyms_weak.lskel.h include

This mirrors the Fedora-side fix noted in the kernel changelog while keeping the rest of the BPF selftests enabled.

2. fix(kernel): pass USERCFLAGS=-fPIE to kselftests for hardened toolchain

Azure Linux's hardened toolchain links executables as PIE by default. For PIE linking to succeed, C objects also need to be compiled with PIE-compatible code generation, normally -fPIE.
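Whether a linked binary ended up position-independent can be checked from its ELF header. The following is a rough sketch rather than a replacement for `file` or `readelf`: it only reads `e_type`, so it cannot tell a PIE executable apart from a shared object (both are `ET_DYN`). The function names are hypothetical.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # position-independent: PIE executable or shared object


def elf_type(path):
    """Return the e_type field of an ELF file."""
    with open(path, "rb") as handle:
        header = handle.read(18)
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError(f"{path} is not an ELF file")
    # e_ident[EI_DATA] at offset 5: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type


def looks_position_independent(path):
    """True when the file is ET_DYN, which is what a PIE link produces."""
    return elf_type(path) == ET_DYN
```

A selftest binary built with the hardened defaults would be expected to report `ET_DYN` here.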

Many selftest Makefiles reset CFLAGS with assignments like:

CFLAGS := ...

That drops inherited hardening flags. The link step still uses PIE defaults, so affected tests fail with relocation errors such as:

relocation R_X86_64_32 ... can not be used when making a PIE object

The patch adds:

USERCFLAGS="-fPIE"

to the kselftests make invocation. tools/testing/selftests/lib.mk appends $(USERCFLAGS) after per-test CFLAGS resets, so -fPIE survives consistently across the selftest tree.
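The ordering argument can be illustrated with a toy model of the two assignment operators. This is not how make evaluates variables internally; it is a simplified sketch in which `:=` replaces the flag list and `+=` appends to it, and the flag values other than `-fPIE` are made up for illustration.

```python
def apply_assignments(assignments, initial=""):
    """Apply (operator, value) pairs to a flag string, in order.

    ':=' models a per-test Makefile resetting CFLAGS.
    '+=' models a later append such as 'CFLAGS += $(USERCFLAGS)'.
    """
    flags = initial.split()
    for operator, value in assignments:
        if operator == ":=":
            flags = value.split()
        elif operator == "+=":
            flags = flags + value.split()
        else:
            raise ValueError(f"unsupported operator: {operator}")
    return " ".join(flags)


# Inherited flags (illustrative) are lost when a per-test Makefile resets CFLAGS.
without_fix = apply_assignments([(":=", "-Wall")], initial="-O2 -fPIE")

# With USERCFLAGS=-fPIE appended after the reset, the flag is restored.
with_fix = apply_assignments([(":=", "-Wall"), ("+=", "-fPIE")], initial="-O2 -fPIE")

print(without_fix)
print(with_fix)
```

The point of the model is only that an append placed after the reset is unaffected by it, which is why the flag is passed through USERCFLAGS rather than through the inherited CFLAGS.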

This caused previously missing vDSO, exec, firmware, mount, timens, and BPF-related selftest binaries to build and package correctly.

3. fix(kernel): filter bogus inc selftest dependency

After the PIE fix, the exec selftests successfully installed additional files, including:

script-exec.inc
script-noexec.inc

These scripts use:

#!/usr/bin/env inc

RPM's shebang dependency generator interpreted that as:

Requires: /usr/bin/inc

That dependency is bogus for the selftests package. The inc helper is shipped inside kernel-selftests-internal at:

/usr/libexec/kselftests/exec/inc

It is not a system interpreter provided by /usr/bin/inc. The generated dependency made kernel-selftests-internal uninstallable in images.

The patch extends the existing requires filter from:

%define __requires_exclude ^liburandom_read.so.*$

to:

%define __requires_exclude ^liburandom_read.so.*$|^/usr/bin/inc$

This keeps the package installable without hiding a real external dependency.
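The effect of the extended pattern can be sanity-checked outside RPM. The sketch below uses Python's `re` module as a stand-in, on the assumption that this particular pattern behaves the same way under RPM's regex matching; the sample requirement strings are illustrative and `is_filtered` is a hypothetical helper.

```python
import re

OLD_FILTER = r"^liburandom_read.so.*$"
NEW_FILTER = r"^liburandom_read.so.*$|^/usr/bin/inc$"


def is_filtered(requirement, pattern=NEW_FILTER):
    """True when the requirement string would be excluded by the pattern."""
    return re.search(pattern, requirement) is not None


if __name__ == "__main__":
    samples = [
        "/usr/bin/inc",
        "/usr/bin/env",
        "/usr/bin/include",
        "liburandom_read.so()(64bit)",
    ]
    for sample in samples:
        print(sample, is_filtered(sample, OLD_FILTER), is_filtered(sample))
```

Because the new alternative is anchored at both ends, it excludes only the exact string `/usr/bin/inc` and leaves other `/usr/bin/...` requirements in place.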

Does this affect the toolchain?

NO

Test Methodology

Validation

Build/package validation

  • Rebuilt LLVM with the corrected clang GCC triple.
  • Rebuilt the kernel using the fixed clang package.
  • Verified kernel-selftests-internal-6.18.5-1.8.azl4.x86_64 was produced.
  • Verified previously missing selftest binaries were present in the RPM, including BPF test_progs, test_progs-no_alu32, vDSO tests, exec tests, and firmware tests.
  • Verified the bogus /usr/bin/inc RPM dependency was filtered.

Image/VM validation

  • Built a vm-base image with kernel-selftests-internal included for validation.
  • Published the image to Azure Shared Image Gallery.
  • Deployed a real Azure VM from the image.
  • Verified the VM booted kernel 6.18.5-1.8.azl4.x86_64.
  • Verified kernel-selftests-internal was installed on the VM.
  • Verified /usr/libexec/kselftests contained the expected test collections and 772 executable files.
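A count like the one above can be reproduced with a short script. This is a sketch under the assumption that "executable" means a regular file with at least one execute permission bit set; the exact method used for the figure reported here is not stated, and `count_executables` is a hypothetical helper.

```python
import os
import stat


def count_executables(root):
    """Count regular files under root that have any execute bit set."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not os.path.isfile(full_path):
                continue  # skips broken symlinks and special files
            mode = os.stat(full_path).st_mode
            if mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
                count += 1
    return count


if __name__ == "__main__":
    print(count_executables("/usr/libexec/kselftests"))
```

Comparing the printed number between builds gives a quick signal that a selftest collection has silently dropped out of the package.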

Kselftest smoke results on Azure VM

Medium suite results:

| Collection | Result |
| --- | --- |
| vDSO | 7/7 passed |
| timens | 8/8 passed |
| exec | 12/13 passed; check-exec-tests.sh failed |
| firmware | 1/1 reported SKIP, expected for this environment |
| bpf | ran successfully; remaining failures are runtime/environmental |

Extended BPF-only run completed 13 top-level BPF tests before the wall-clock cap:

| Test | Result |
| --- | --- |
| test_tag | passed |
| test_maps | passed |
| test_lru_map | passed |
| test_sockmap | passed |
| test_tcpnotify_user | passed |
| test_kmod.sh | passed |
| test_tc_edt.sh | passed |
| test_verifier | failed |
| test_progs | failed, but executed thousands of subtests |
| test_progs-no_alu32 | failed, but executed thousands of subtests |
| test_progs-cpuv4 | failed, but executed thousands of subtests |
| test_lirc_mode2.sh | failed, likely environment/device dependent |
| test_tc_tunnel.sh | failed, likely environment/network dependent |

The important packaging validation is that the large BPF binaries now build, install, and run. The remaining BPF failures appear to be runtime/test-environment issues, especially repeated module load failures such as:

Failed to load bpf_testmod.ko into the kernel: -13

That is likely related to module loading/signing/permission behavior on the VM rather than the original build or packaging failures.
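The `-13` in that message is a negated errno value. A quick way to decode such codes is shown below; it is a small sketch with a hypothetical helper name, and the human-readable message text comes from the platform's C library.

```python
import errno
import os


def describe_errno(code):
    """Map a (possibly negated) errno value to its symbolic name and message."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    name, message = describe_errno(-13)
    print(name, message)
```

On Linux, 13 is `EACCES`, the same error class as the blocked `/proc/kallsyms` reads in the build chroot, although the cause on the VM is a separate runtime issue.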

rlmenge added 6 commits May 5, 2026 22:02
Disable kernel selftests as a temporary mitigation while kselftests
packaging is fixed separately, and bump azl_pkgrelease to avoid reusing
the previous NVR for a changed package set.

… Linux stage2

Temporary LLVM workaround kept on this branch only to unblock local kernel selftests investigation. Do not merge this commit with the kernel selftests fixes.

The kernel-selftests-internal RPM was missing test_progs and
test_progs-no_alu32 because their build silently failed.

Root cause: building test_ksyms_weak.lskel.h runs
'bpftool gen skeleton -L test_ksyms_weak.bpf.o', which loads the
BPF program through libbpf. libbpf probes /proc/kallsyms to resolve
__ksym __weak symbols. mock 6.7's systemd-nspawn build chroot blocks
/proc/kallsyms reads (EACCES), so the lskel header is never generated
and the test_progs / test_progs-no_alu32 link step fails. The kernel
spec then tolerates the missing binaries with 'cp ... || true' on
the install hooks, so the failure is invisible until you check what's
actually packaged.

Fix: in %prep, drop test_ksyms_weak.c from LSKELS_EXTRA in the BPF
selftests Makefile and remove the matching subtest from
prog_tests/ksyms_btf.c so the file still compiles without
test_ksyms_weak.lskel.h. This mirrors the upstream Fedora fix
'selftests/bpf: Remove ksyms_weak_lskel test' by Artem Savkov which
is referenced in kernel.spec's changelog but not present in the
6.18.5 source tree we pull from microsoft/CBL-Mariner-Linux-Kernel.

Verified by inspecting the resulting kernel-selftests-internal RPM
contents: test_progs, test_progs-no_alu32, urandom_read, and
urandom_read-no_alu32 are now packaged, and 'test_progs --help'
runs cleanly in a mock smoke test.

Many kselftest binaries (timens/, exec/, firmware/, mount/, vDSO/, ...)
were failing to link with errors like:

    ld: relocation R_X86_64_32 against '.rodata.str1.1' can not be used
        when making a PIE object; recompile with -fPIE

and silently getting omitted from kernel-selftests-internal because
the kernel spec uses 'cp ... || true' on the install hooks.

Root cause: Azure Linux's redhat-hardened-cc1 spec adds -fPIE to
compile and -pie to link by default. The kernel selftests Makefile
passes -fPIE down via EXTRA_CFLAGS, but many per-test Makefiles do
'CFLAGS := -Wall ...' which fully resets CFLAGS and drops the
inherited -fPIE. The link step then runs with -pie but objects
compiled without -fPIE, producing R_X86_64_32 relocations that PIE
links can't accept.

Fix: pass USERCFLAGS=-fPIE on the kselftests make line.
tools/testing/selftests/lib.mk does 'CFLAGS += $(USERCFLAGS)'
AFTER the per-test CFLAGS reset, so -fPIE is reliably re-added for
every selftest target.

Verified by inspecting the resulting kernel-selftests-internal RPM:
the previously-missing vdso_test_*, gettime_perf, set-exec, and
fw_namespace binaries are now packaged. Smoke-tested in mock:
'file vdso_test_abi' reports 'ELF 64-bit LSB pie executable' and
'vdso_test_abi' runs cleanly with valid TAP output.

The exec selftests install script-exec.inc and script-noexec.inc with a /usr/bin/env inc shebang. RPM's dependency generator turns those into Requires: /usr/bin/inc, but the inc helper is shipped inside kernel-selftests-internal under /usr/libexec/kselftests/exec/inc, not as a system interpreter.

Filter /usr/bin/inc in __requires_exclude alongside liburandom_read.so.* so kernel-selftests-internal can be installed into images after the PIE fix makes the exec samples build and package successfully.

github-actions Bot commented May 6, 2026

🔒❌ Lock files are out of date

FIX: run this and commit the result:

azldev component update -p llvm -p opencryptoki

Or download the fix patch and apply it:

gh run download 25409408457 -R microsoft/azurelinux -n locks-patch
git apply locks.patch

Changed components (2)

| Component | New upstream commit |
| --- | --- |
| llvm | 659c0740e5c29098632207467aa89ce0083d5892 |
| opencryptoki | 31e5908208f5ea88fc649e244a861f070d57b813 |

github-actions Bot commented May 6, 2026

📄❌ Rendered specs are out of date

FIX: run this and commit the result:

azldev component render llvm

Or download the fix patch and apply it:

gh run download 25409408457 -R microsoft/azurelinux -n rendered-specs-patch
git apply rendered-specs.patch

| Category | Count |
| --- | --- |
| Content diffs | 1 |
| Extra files (untracked) | 0 |
| Missing files (deleted) | 0 |

Content diffs

`specs/l/llvm/llvm.spec`
--- committed/specs/l/llvm/llvm.spec
+++ rendered/specs/l/llvm/llvm.spec
@@ -3527,13 +3527,8 @@
 
 %changelog
 ## START: Generated by rpmautospec
-<<<<<<< HEAD
 * Thu Apr 30 2026 Daniel McIlvaney <damcilva@microsoft.com> - 21.1.8-5
 - feat: introduce deterministic commit resolution via Azure Linux lock file
-=======
-* Tue Apr 28 2026 azldev <azurelinux@microsoft.com> - 21.1.8-5
-- Latest state for llvm
->>>>>>> 554641f25f (fix(llvm): correct clang's default GCC triple for Azure Linux stage2)
 
 * Thu Jan 22 2026 Josh Stone <jistone@redhat.com> - 21.1.8-4
 - Fix s390x vector miscompilation (rhbz#2430017)
