[ELFLOADER] handle PT_LOAD host-page overlaps on NON4KPAGE hosts#3657

Open
yzewei wants to merge 1 commit into ptitSeb:main from yzewei:16kto4k
Conversation

@yzewei
Contributor

@yzewei yzewei commented Mar 12, 2026

Problem

On non-4K page systems, adjacent PT_LOAD segments of an x86_64 ELF can fall into the same host page.

For example, in an x86 Node.js binary, an R|X segment and an adjacent R segment can share the same 16K page.

Under the current logic, host page permissions and segment-level permissions become inconsistent during loading, which ultimately leads to SIGSEGV.

Solution

For host pages shared in this way, adopt a permission-stacking strategy: the page receives the union of the permissions of every PT_LOAD that touches it.
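The stacking strategy can be sketched as below. This is only an illustration under stated assumptions, not the code from the patch: `seg_t`, `pf_to_prot` and `host_page_prot_union` are hypothetical names, and the `SEG_PF_*` constants simply mirror the standard ELF `PF_X`/`PF_W`/`PF_R` values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Same numeric values as the ELF PF_X / PF_W / PF_R segment flags. */
#define SEG_PF_X 0x1u
#define SEG_PF_W 0x2u
#define SEG_PF_R 0x4u

/* Hypothetical, minimal view of a PT_LOAD program header. */
typedef struct {
    uint64_t vaddr;  /* p_vaddr */
    uint64_t memsz;  /* p_memsz */
    uint32_t flags;  /* p_flags */
} seg_t;

/* Hypothetical helper: translate ELF segment flags into host PROT_* bits. */
static int pf_to_prot(uint32_t flags)
{
    int prot = PROT_NONE;
    if (flags & SEG_PF_R) prot |= PROT_READ;
    if (flags & SEG_PF_W) prot |= PROT_WRITE;
    if (flags & SEG_PF_X) prot |= PROT_EXEC;
    return prot;
}

/* Union of the permissions of every segment that touches the host page
 * [page_addr, page_addr + page_size). */
static int host_page_prot_union(const seg_t *segs, size_t n,
                                uint64_t page_addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    int prot = PROT_NONE;
    for (size_t i = 0; i < n; ++i) {
        uint64_t start = segs[i].vaddr;
        uint64_t end = start + segs[i].memsz;
        if (segs[i].memsz != 0 && start < page_addr + page_size && end > page_addr)
            prot |= pf_to_prot(segs[i].flags);
    }
    return prot;
}
```

With the two segments from the Node.js binary, a 16K host page at 0x3000000 would end up readable and executable, while the following page would be read-only.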

Modifications

Main Changes:

  • Added local helpers that compute the host-page range of a segment and convert ELF segment flags to host PROT_* bits.

  • On NON4KPAGE systems, added ElfPhdrNeedsCopyLoad64() checks:

      • Force copy-load if the current PT_LOAD does not completely cover its host pages.

      • Force copy-load if the host pages covered by the current PT_LOAD overlap with other PT_LOADs.

  • Retained the original mmap fast path, but only for segments that completely cover their host pages and do not share a host page with another PT_LOAD.

  • After all segments are loaded, recompute the final permission union per host page and apply it with mprotect in one pass.
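One possible reading of the two copy-load conditions is sketched below. It is not the body of ElfPhdrNeedsCopyLoad64() from the patch: `load_seg_t`, `page_floor`, `page_ceil` and `seg_needs_copy_load` are hypothetical names, and the early return for 4K hosts is an assumption based on the description of the change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal view of a PT_LOAD program header. */
typedef struct {
    uint64_t vaddr;  /* p_vaddr */
    uint64_t memsz;  /* p_memsz */
} load_seg_t;

/* Round an address down / up to a multiple of the (power-of-two) page size. */
static uint64_t page_floor(uint64_t a, uint64_t ps) { return a & ~(ps - 1); }
static uint64_t page_ceil(uint64_t a, uint64_t ps)  { return (a + ps - 1) & ~(ps - 1); }

/* Returns 1 if segment idx should be copy-loaded instead of mmap'd. */
static int seg_needs_copy_load(const load_seg_t *segs, size_t n, size_t idx,
                               uint64_t host_page_size)
{
    assert(host_page_size != 0 && (host_page_size & (host_page_size - 1)) == 0);
    assert(idx < n);

    /* Assumption: 4K hosts keep the existing path untouched. */
    if (host_page_size <= 4096)
        return 0;

    uint64_t start = segs[idx].vaddr;
    uint64_t end   = start + segs[idx].memsz;
    uint64_t lo    = page_floor(start, host_page_size);
    uint64_t hi    = page_ceil(end, host_page_size);

    /* Condition 1: the segment does not completely cover its host pages. */
    if (start != lo || end != hi)
        return 1;

    /* Condition 2: another PT_LOAD touches one of the same host pages. */
    for (size_t i = 0; i < n; ++i) {
        if (i == idx)
            continue;
        uint64_t s = segs[i].vaddr;
        uint64_t e = s + segs[i].memsz;
        if (s < hi && e > lo)
            return 1;
    }
    return 0;
}
```

Under this reading, a segment stays on the mmap fast path only when both checks return 0, which matches the "completely covers and does not share" wording above.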

Impact

For 4K page systems, existing paths remain unchanged.

For NON4K page systems, only unsafe PT_LOADs fall back to copy-load; the original fast path is preserved wherever possible.

Verification

I verified the following locally:

  • The startup failure of node --version on a LoongArch large-page system is fixed.
  • Test binary: a downloaded Node.js build.
  • Its PT_LOAD segments look like this:
  LOAD           0x0000000002c00000 0x0000000003000000 0x0000000003000000
                 0x00000000000006a1 0x00000000000006a1  R E    0x200000
  LOAD           0x0000000002c01000 0x0000000003001000 0x0000000003001000
                 0x000000000379c6f2 0x000000000379c6f2  R      0x1000
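The arithmetic behind this layout can be checked with a small helper. This is a sketch only; `segments_share_host_page` is a hypothetical name, and the addresses are the p_vaddr and p_memsz values from the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the last byte of segment A and the first byte of segment B
 * lie in the same host page of the given (power-of-two) size. */
static int segments_share_host_page(uint64_t a_vaddr, uint64_t a_memsz,
                                    uint64_t b_vaddr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    assert(a_memsz != 0);
    uint64_t a_last = a_vaddr + a_memsz - 1;  /* last byte of segment A */
    return (a_last / page_size) == (b_vaddr / page_size);
}
```

For the R E segment (0x3000000, size 0x6a1) and the R segment starting at 0x3001000, the helper reports no sharing with 4K pages but sharing with 16K or 64K pages, since both addresses fall inside the 16K page that starts at 0x3000000.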

@yzewei
Contributor Author

yzewei commented Mar 12, 2026

Additional note: deferred mprotect cannot resolve this issue; it cannot recover the overwritten EXEC statements.

@xiangzhai
Contributor

Hi,

I have only fixed one NON4KPAGE issue myself, and I am often confused by the other NON4KPAGE issues (https://github.com/ptitSeb/box64/issues?q=NON4KPAGE), so I reviewed the patch and asked @yzewei a (possibly unrelated) question: what if the .text section overlaps the .got section? #2579 (comment)

RFC to other NON4KPAGE hunters.

Thanks,
Leslie Zhai

@yzewei
Contributor Author

yzewei commented Mar 16, 2026

@runlevel5 I've modified the logic here; could you please review it and see if there are any potential negative impacts?

@runlevel5
Contributor

I am a bit concerned about a possible regression on 4K-page systems from the unconditional mprotect on all pages. The old code only called mprotect for non-writable segments (!(flags & PF_W)). The new code calls mprotect on every page of every segment unconditionally. On 4K-page systems where segments don't overlap, this means extra mprotect syscalls for writable segments that previously didn't need them. The setProtection_elf + mprotect pair is called for every covered host page, even when the segment was already mmap'd with the correct protections. There could be a performance impact from the extra syscalls during ELF load, though on second thought it is unlikely to be measurable.

@runlevel5
Contributor

@yzewei @xiangzhai please test on 4K-page systems to confirm there are no regressions (the ElfPhdrNeedsCopyLoad64 early return handles this, but the mprotect loop change affects all page sizes)

@yzewei
Contributor Author

yzewei commented Mar 17, 2026

I am a bit concerned about a possible regression on 4K-page systems from the unconditional mprotect on all pages. The old code only called mprotect for non-writable segments (!(flags & PF_W)). The new code calls mprotect on every page of every segment unconditionally. On 4K-page systems where segments don't overlap, this means extra mprotect syscalls for writable segments that previously didn't need them. The setProtection_elf + mprotect pair is called for every covered host page, even when the segment was already mmap'd with the correct protections. There could be a performance impact from the extra syscalls during ELF load, though on second thought it is unlikely to be measurable.

Yes, that's correct.

Because of the unified final protection process, 4K hosts will now also undergo the same final recalculation step. Therefore, this may introduce an additional mprotect() system call compared to the previous delayed path that only used !(PF_W).

The purpose of this approach is to unify the final protection model for 4K and non-4K hosts and to stop relying on the old getProtection(page) replay logic. So this ensures that the goal is achieved correctly.

It only happens during ELF loading, not during stable execution, so the performance impact should be minimal.

@yzewei
Contributor Author

yzewei commented Mar 17, 2026

@yzewei @xiangzhai please test on 4K-page systems to confirm there are no regressions (the ElfPhdrNeedsCopyLoad64 early return handles this, but the mprotect loop change affects all page sizes)

Agreed.

4K regression testing is indeed necessary.

I've only reproduced and verified the non-4K scenario locally so far. However, I don't have a suitable 4K page environment on hand. Could you please help me verify it? Thank you very much!

Signed-off-by: Zewei Yang <yangzewei@loongson.cn>
@runlevel5
Contributor

@yzewei I've only reproduced and verified the non-4K scenario locally so far. However, I don't have a suitable 4K page environment on hand. Could you please help me verify it? Thank you very much!

I tested on ARM64 with 4K and 16K pages: all ctest tests passed and node --version runs correctly.
I also tested on PPC64LE with 64K pages: ctest passed there too. node --version crashed (a totally different issue), but I could confirm that the ELF loads correctly (copy-load path).

@yzewei
Contributor Author

yzewei commented Mar 17, 2026

@yzewei I've only reproduced and verified the non-4K scenario locally so far. However, I don't have a suitable 4K page environment on hand. Could you please help me verify it? Thank you very much!

I tested on ARM64 with 4K and 16K pages: all ctest tests passed and node --version runs correctly. I also tested on PPC64LE with 64K pages: ctest passed there too. node --version crashed (a totally different issue), but I could confirm that the ELF loads correctly (copy-load path).

Thanks!

@yzewei
Contributor Author

yzewei commented Mar 25, 2026

@ptitSeb What are your thoughts on this?

@ptitSeb
Owner

ptitSeb commented Mar 25, 2026

I need to think about it, ELF Loader is quite brittle for now.
