feat(heap-profiling): add initial heap profiling primitives [PROF-15190] by scottgerring · Pull Request #2166 · DataDog/libdatadog

scottgerring · 2026-06-26T13:19:14Z

What does this PR do?

Adds initial support for eBPF heap profiling to libdatadog for the ebpf heap profiling project. These changes provide the ability for libdatadog users to introduce a USDT into consuming applications allocation and deallocation paths which is fired behind a poisson distributed sampling path every ~0.5 MiB of allocation. This will give us something to hook in user applications from the eBPF side to sample allocations for allocation tracking and sample allocations+frees for live heap.

I expect we will make more changes to this as we go along, but as this is a standalone feature and is largely based on prior work in ddprof it would be nice to get things in now.

A brief description of the change being made with this pull request.

Interesting Details

I have vendored usdt.h; it is not consistently available on all our build environments (read: old alpine) and this seems to be the common pattern. The API is stable and I see no concerns with this and have updated the license stuff appropriately
I have gated the binding generation behind an env var and committed the bindings. This is because the generation imposes a fair bit of extra stuff (llvm and supporting bits) on every build job and developer. I've added a CI job to validate that the bindings generated reflect what's checked in to avoid the footgun.

Motivation

An enthusiasm for heap profiling

Additional Notes

Anything else we should know when reviewing?

PoC-y branch for the full host profiler with heap profiler
The profiling backend + frontend are done; you can see heap profiles pushed through using this infrastructure in staging

How to test the change?

Describe here in detail how the change can be validated.

datadog-prod-us1-4 · 2026-06-26T13:19:20Z

Tests

✨ Fix all issues with BitsAI

⚠️ Warnings

🚦 1 Pipeline job failed

DataDog/apm-reliability/libdatadog | benchmarks

ℹ️ Info

No other issues found (see more)

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage (details)
• Patch Coverage: 85.77%
• Overall Coverage: 74.56% (+0.10%)

Useful? React with 👍 / 👎

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: bebaa38 | Docs | Datadog PR Page | Give us feedback!}

github-actions · 2026-06-26T13:21:36Z

📚 Documentation Check Results

⚠️ 627 documentation warning(s) found

📦 `libdd-heap-allocator` - 44 warning(s)

📦 `libdd-heap-gotter-ffi` - 322 warning(s)

📦 `libdd-heap-gotter` - 44 warning(s)

📦 `libdd-heap-sampler` - 44 warning(s)

📦 `tools` - 173 warning(s)

Updated: 2026-07-03 17:32:41 UTC | Commit: ae516cc | missing-docs job results

github-actions · 2026-06-26T13:22:21Z

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

Base Branch: origin/main
PR Branch: origin/sgg/heap-prof-poc

Summary by Rule

Rule	Base Branch	PR Branch	Change

Annotation Counts by File

File	Base Branch	PR Branch	Change

Annotation Stats by Crate

Crate	Base Branch	PR Branch	Change
`clippy-annotation-reporter`	5	5	No change (0%)
`datadog-ffe-ffi`	1	1	No change (0%)
`datadog-ipc`	22	22	No change (0%)
`datadog-live-debugger`	4	4	No change (0%)
`datadog-live-debugger-ffi`	10	10	No change (0%)
`datadog-profiling-replayer`	4	4	No change (0%)
`datadog-sidecar`	45	45	No change (0%)
`libdd-common`	13	13	No change (0%)
`libdd-common-ffi`	12	12	No change (0%)
`libdd-data-pipeline`	6	6	No change (0%)
`libdd-ddsketch`	2	2	No change (0%)
`libdd-dogstatsd-client`	1	1	No change (0%)
`libdd-profiling`	13	13	No change (0%)
`libdd-remote-config`	3	3	No change (0%)
`libdd-telemetry`	20	20	No change (0%)
`libdd-tinybytes`	4	4	No change (0%)
`libdd-trace-normalization`	2	2	No change (0%)
`libdd-trace-obfuscation`	3	3	No change (0%)
`libdd-trace-stats`	1	1	No change (0%)
`libdd-trace-utils`	11	11	No change (0%)
Total	182	182	No change (0%)

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

github-actions · 2026-06-26T13:23:17Z

🔒 Cargo Deny Results

⚠️ 4 issue(s) found, showing only errors (advisories, bans, sources)

📦 `libdd-heap-allocator` - ✅ No issues

📦 `libdd-heap-gotter-ffi` - 1 error(s)

Show output

error[unsound]: Rand is unsound with a custom logger using `rand::rng()`
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:121:1
    │
121 │ rand 0.8.5 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ unsound advisory detected
    │
    ├ ID: RUSTSEC-2026-0097
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0097
    ├ It has been reported (by @lopopolo) that the `rand` library is [unsound](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library) (i.e. that safe code using the public API can cause Undefined Behaviour) when all the following conditions are met:
      
      - The `log` and `thread_rng` features are enabled
      - A [custom logger](https://docs.rs/log/latest/log/#implementing-a-logger) is defined
      - The custom logger accesses `rand::rng()` (previously `rand::thread_rng()`) and calls any `TryRng` (previously `RngCore`) methods on `ThreadRng`
      - The `ThreadRng` (attempts to) reseed while called from the custom logger (this happens every 64 kB of generated data)
      - Trace-level logging is enabled or warn-level logging is enabled and the random source (the `getrandom` crate) is unable to provide a new seed
      
      `TryRng` (previously `RngCore`) methods for `ThreadRng` use `unsafe` code to cast `*mut BlockRng<ReseedingCore>` to `&mut BlockRng<ReseedingCore>`. When all the above conditions are met this results in an aliased mutable reference, violating the Stacked Borrows rules. Miri is able to detect this violation in sample code. Since construction of [aliased mutable references is Undefined Behaviour](https://doc.rust-lang.org/stable/nomicon/references.html), the behaviour of optimized builds is hard to predict.
    ├ Announcement: https://github.com/rust-random/rand/pull/1763
    ├ Solution: Upgrade to >=0.10.1 OR <0.10.0, >=0.9.3 OR <0.9.0, >=0.8.6 (try `cargo update -p rand`)
    ├ rand v0.8.5
      └── (dev) libdd-common v5.0.0
          └── libdd-common-ffi v37.0.0
              └── libdd-heap-gotter-ffi v37.0.0

advisories FAILED, bans ok, sources ok

📦 `libdd-heap-gotter` - ✅ No issues

📦 `libdd-heap-sampler` - ✅ No issues

📦 `tools` - 3 error(s)

Show output

error[vulnerability]: Quadratic run time when checking a start tag for duplicate attribute names
   ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:73:1
   │
73 │ quick-xml 0.37.5 registry+https://github.com/rust-lang/crates.io-index
   │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
   │
   ├ ID: RUSTSEC-2026-0194
   ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0194
   ├ `BytesStart::attributes()` returns an `Attributes` iterator which, by default
     (`with_checks(true)`), rejects a start tag that repeats an attribute name. For
     each attribute yielded, the iterator compared the new name against every name
     seen so far in the same tag using a linear scan, so a start tag with `N`
     distinct attribute names cost `O(N²)` byte comparisons. There was no bound on
     `N` other than the size of the buffered start tag.
     
     ## Impact
     
     Any code that parses untrusted XML and iterates a start tag's attributes with
     the default duplicate check enabled can be made to spend CPU time quadratic in
     the number of attributes on a single tag. Because the check is pure computation
     with no `.await`/I/O, an I/O-based timeout on the consumer (for example a read
     or request timeout) cannot interrupt it while it runs.
     
     Measured cost of a single start tag, release build:
     
     | Attributes on one tag | Time |
     |---|---|
     | 80,000  | ~6 s   |
     | 800,000 | ~10 min |
     
     The cost grows with the square of the attribute count, so a start tag of a few
     tens of megabytes can stall a parsing thread for hours. No memory is exhausted
     and the parser does not crash; the effect is CPU exhaustion on the thread doing
     the parsing: a single crafted start tag can pin a CPU core for minutes to hours,
     denying service to that worker. A deployment that places a wall-clock bound on
     parsing, or confines it to a non-critical thread, may consider the availability
     impact lower.
     
     ## Affected code paths
     
     * `BytesStart::attributes()` / `Attributes` iterated with checks enabled (the
       default), and `BytesStart::try_get_attribute`.
     * `NsReader`, which resolves namespaces by iterating a tag's attributes and so
       reaches the same check internally.
     
     Consumers that iterate attributes with `.attributes().with_checks(false)` and do
     not use `NsReader` are not affected.
     
     This was reported as reachable by a remote, unauthenticated attacker in a
     real-world RPKI relying party (NLnet Labs Routinator) via a crafted RRDP
     `snapshot.xml`.
     
     ## Remediation
     
     Upgrade to `quick-xml >= 0.41.0`, where the duplicate check keeps the linear
     scan for start tags with a small number of attributes and switches to an `O(1)`
     hash pre-filter above a threshold, making the whole tag `O(N)`. The reported
     `AttrError::Duplicated` positions are unchanged.
     
     If upgrading is not possible and duplicate-name detection is not required,
     disable it with `.attributes().with_checks(false)` (this does not help
     `NsReader` consumers, which have no equivalent opt-out before 0.41.0).
   ├ Announcement: https://github.com/tafia/quick-xml/issues/969
   ├ Solution: Upgrade to >=0.41.0 (try `cargo update -p quick-xml`)
   ├ quick-xml v0.37.5
     └── tools v37.0.0

error[vulnerability]: Unbounded namespace-declaration allocation in `NsReader` enables memory-exhaustion denial of service
   ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:73:1
   │
73 │ quick-xml 0.37.5 registry+https://github.com/rust-lang/crates.io-index
   │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
   │
   ├ ID: RUSTSEC-2026-0195
   ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0195
   ├ `NsReader` resolves namespaces by calling `NamespaceResolver::push` for every
     `Start`/`Empty` event *before* the event is returned to the caller. `push`
     iterated all `xmlns` / `xmlns:*` attributes on the start tag and, for each one,
     appended the prefix bytes to an internal buffer and pushed a `NamespaceBinding`
     (32 bytes on 64-bit) to an internal `Vec`, with no upper bound on the number of
     declarations.
     
     ## Impact
     
     A start tag with `N` namespace declarations drove roughly `3×` the tag's byte
     size in `NamespaceResolver` heap, allocated *inside* `quick-xml` before the
     `NsReader` consumer ever received the event and could inspect or reject it. A
     consumer that bounds its *input* size therefore still cannot bound this
     allocation: an `M`-byte start tag yields on the order of `3 × M` bytes of
     resolver heap the caller never sees.
     
     On untrusted XML this lets a remote, unauthenticated attacker force large heap
     allocations with a single start tag. With several `NsReader`s running
     concurrently on independent inputs (a common server pattern), the allocations
     stack and can exhaust process memory, causing the operating system to kill the
     process (OOM). This was confirmed against a real-world RPKI relying party (NLnet
     Labs Routinator), where concurrent RRDP validation workers parsing a crafted
     `snapshot.xml` exceeded the memory limit and the process was OOM-killed.
     
     ## Affected code paths
     
     Consumers using `NsReader` (which always calls `NamespaceResolver::push` before
     yielding `Start`/`Empty`), or calling `NamespaceResolver::push` directly. A plain
     `Reader` that does not perform namespace resolution is not affected.
     
     ## Remediation
     
     Upgrade to `quick-xml >= 0.41.0`. `NamespaceResolver::push` now rejects a start
     tag that declares more than `DEFAULT_MAX_DECLARATIONS_PER_ELEMENT` (256)
     namespace bindings, returning the new `NamespaceError::TooManyDeclarations`
     instead of allocating without limit. The limit is configurable via
     `NamespaceResolver::set_max_declarations_per_element` (use `usize::MAX` to
     restore the previous unbounded behavior), and `NsReader::resolver_mut()` is
     provided to reach it.
     
     There is no clean workaround for `NsReader` consumers before 0.41.0, as the
     allocation happens inside the reader with no configuration knob to cap it.
   ├ Announcement: https://github.com/tafia/quick-xml/issues/970
   ├ Solution: Upgrade to >=0.41.0 (try `cargo update -p quick-xml`)
   ├ quick-xml v0.37.5
     └── tools v37.0.0

error[unsound]: Rand is unsound with a custom logger using `rand::rng()`
   ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:76:1
   │
76 │ rand 0.8.5 registry+https://github.com/rust-lang/crates.io-index
   │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ unsound advisory detected
   │
   ├ ID: RUSTSEC-2026-0097
   ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0097
   ├ It has been reported (by @lopopolo) that the `rand` library is [unsound](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library) (i.e. that safe code using the public API can cause Undefined Behaviour) when all the following conditions are met:
     
     - The `log` and `thread_rng` features are enabled
     - A [custom logger](https://docs.rs/log/latest/log/#implementing-a-logger) is defined
     - The custom logger accesses `rand::rng()` (previously `rand::thread_rng()`) and calls any `TryRng` (previously `RngCore`) methods on `ThreadRng`
     - The `ThreadRng` (attempts to) reseed while called from the custom logger (this happens every 64 kB of generated data)
     - Trace-level logging is enabled or warn-level logging is enabled and the random source (the `getrandom` crate) is unable to provide a new seed
     
     `TryRng` (previously `RngCore`) methods for `ThreadRng` use `unsafe` code to cast `*mut BlockRng<ReseedingCore>` to `&mut BlockRng<ReseedingCore>`. When all the above conditions are met this results in an aliased mutable reference, violating the Stacked Borrows rules. Miri is able to detect this violation in sample code. Since construction of [aliased mutable references is Undefined Behaviour](https://doc.rust-lang.org/stable/nomicon/references.html), the behaviour of optimized builds is hard to predict.
   ├ Announcement: https://github.com/rust-random/rand/pull/1763
   ├ Solution: Upgrade to >=0.10.1 OR <0.10.0, >=0.9.3 OR <0.9.0, >=0.8.6 (try `cargo update -p rand`)
   ├ rand v0.8.5
     └── (dev) libdd-common v5.0.0
         └── tools v37.0.0

advisories FAILED, bans ok, sources ok

Updated: 2026-07-03 17:33:59 UTC | Commit: ae516cc | dependency-check job results

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be77a1cb93

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Copilot

Pull request overview

Adds the first set of heap-profiling building blocks to libdatadog: a Linux-focused sampler that emits ddheap:* USDT probes, plus Rust and runtime-injection frontends (GlobalAlloc wrapper + GOT interposition) and CI to keep the checked-in bindgen artifacts in sync.

Changes:

Introduces libdd-heap-sampler (C implementation + checked-in bindgen outputs) with vendored usdt.h and a workflow to verify regenerated bindings match the committed artifacts.
Adds libdd-heap-allocator (Rust GlobalAlloc wrapper) and libdd-heap-gotter (ELF GOT-patching injector) with demos and tests.
Adds libdd-heap-gotter-ffi to expose install/update/restore over a small C ABI for use by other runtimes.

Reviewed changes

Copilot reviewed 42 out of 47 changed files in this pull request and generated 9 comments.

Show a summary per file

File	Description
tools/docker/Dockerfile.build	Adds new heap crates to Docker build caching/stubs.
NOTICE	Notes bundled `usdt.h` license attribution.
Cargo.toml	Adds new heap crates to workspace members.
Cargo.lock	Records new crate deps (incl. bindgen split versions).
.github/workflows/verify-heap-sampler-bindings.yml	CI job to regenerate/verify checked-in bindgen artifacts.
.github/CODEOWNERS	Assigns heap crates to profiling team.
libdd-heap-sampler/Cargo.toml	New sampler crate metadata + build deps.
libdd-heap-sampler/build.rs	Builds C sources on Linux; optional bindgen regen via env var.
libdd-heap-sampler/README.md	High-level design and binding-regeneration instructions.
libdd-heap-sampler/vendor/usdt.h	Vendored libbpf/usdt single-header implementation.
libdd-heap-sampler/vendor/README.md	Documents vendoring source/license and refresh procedure.
libdd-heap-sampler/src/lib.rs	Rust façade over generated bindings + basic layout/smoke tests.
libdd-heap-sampler/src/generated/bindings.rs	Checked-in bindgen Rust declarations.
libdd-heap-sampler/src/generated/dd_heap_sampler_static_wrappers.c	Checked-in bindgen static-inline wrappers.
libdd-heap-sampler/src/tl_state.c	TLS state init/seed logic for per-thread sampling.
libdd-heap-sampler/src/sample_flag.c	Arch-specific sampled-pointer marking (x86 header, arm64 TBI).
libdd-heap-sampler/src/probes.c	Non-inline USDT probe emission functions.
libdd-heap-sampler/src/allocation_requested.c	Poisson sampler slow-path + size bumping logic.
libdd-heap-sampler/src/allocation_created.c	Applies sample flag + emits alloc probe + closes reentry guard.
libdd-heap-sampler/src/allocation_freed.c	Emits free probe and returns raw pointer/size adjustments.
libdd-heap-sampler/src/allocation_realloc.c	Realloc helpers for sampled-old teardown behavior.
libdd-heap-sampler/include/datadog/heap/tl_state.h	Public TLS state struct + fast getters.
libdd-heap-sampler/include/datadog/heap/sample_flag.h	Public flagging API + x86/arm64 fast check helpers.
libdd-heap-sampler/include/datadog/heap/probes.h	Public probe API + Linux USDT include.
libdd-heap-sampler/include/datadog/heap/allocation_requested.h	Public pre-alloc hook API + fast path inline.
libdd-heap-sampler/include/datadog/heap/allocation_created.h	Public post-alloc hook API + fast path inline.
libdd-heap-sampler/include/datadog/heap/allocation_freed.h	Public pre-free hook API + fast path inline.
libdd-heap-sampler/include/datadog/heap/allocation_realloc.h	Public realloc helper API.
libdd-heap-allocator/Cargo.toml	New Rust allocator wrapper crate.
libdd-heap-allocator/src/lib.rs	Exposes `SampledAllocator`.
libdd-heap-allocator/src/allocator.rs	Implements `GlobalAlloc` wrapper + tests.
libdd-heap-allocator/README.md	Usage docs + example pointer.
libdd-heap-allocator/examples/usdt_demo.rs	Demo binary emitting USDT alloc samples.
libdd-heap-gotter/Cargo.toml	New GOT interposition crate.
libdd-heap-gotter/src/lib.rs	Public API + Linux/no-op cross-platform surface.
libdd-heap-gotter/src/hooks.rs	Allocator/dlopen/pthread_create hook implementations.
libdd-heap-gotter/src/elf.rs	ELF/DYNAMIC parsing + relocation patching + mprotect guard.
libdd-heap-gotter/README.md	GOTter overview documentation.
libdd-heap-gotter/examples/gotter_usdt_demo.rs	Demo using GOT overrides to drive sampler via Rust allocations.
libdd-heap-gotter/tests/install.rs	Linux integration smoke test for install/restore.
libdd-heap-gotter-ffi/Cargo.toml	New C-ABI wrapper crate around gotter.
libdd-heap-gotter-ffi/build.rs	Generates C header via cbindgen.
libdd-heap-gotter-ffi/cbindgen.toml	C header generation configuration.
libdd-heap-gotter-ffi/src/lib.rs	Exposes install/update/restore/is-installed over C ABI.
libdd-heap-gotter-ffi/README.md	Documents C ABI and lifetime constraints.
libdd-heap-gotter-ffi/examples/cdylib_demo.rs	Demo dynamic-loading and calling the C ABI.
libdd-heap-gotter-ffi/tests/install.rs	FFI smoke test for install/restore roundtrip.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Adds a Criterion benchmark comparing direct System allocation, the SampledAllocator hot path, and direct sampler calls, plus a small bench.sh helper for libdd-heap-sampler. Kept on a separate change because it pulls criterion (and thus bindgen via libdd-heap-sampler's build) into the bench compile set, which the libdatadog benchmarking-platform image cannot build without libclang-dev.

yannham

Partial review: libdd-heap-allocator, libdd-heap-gotter-ffi.

yannham · 2026-07-03T09:54:36Z

+To wrap a custom allocator instead; note that this is kind of ill-advised; we want to see _all_ allocations for the process:
+
+```rust
+#[global_allocator]
+static ALLOC: SampledAllocator<MyAllocator> = SampledAllocator::new(MyAllocator::new());
+```


Also, I'm not sure to understand this mention. Why wrapping a custom allocator is ill-advised? Doesn't the wrapping reach the same blast radius anyway, which is the current project/compilation unit/whatever (This is not a rhetorical question, I don't remember how custom allocators are scoped) ?

Suggested change

To wrap a custom allocator instead; note that this is kind of ill-advised; we want to see _all_ allocations for the process:

```rust

#[global_allocator]

static ALLOC: SampledAllocator<MyAllocator> = SampledAllocator::new(MyAllocator::new());

```

To wrap a custom allocator instead:

```rust

#[global_allocator]

static ALLOC: SampledAllocator<MyAllocator> = SampledAllocator::new(MyAllocator::new());

```

Note that this is kind of ill-advised: we want to see _all_ allocations for the process.

You could foreseeably use a custom allocator that doesn't go through the global allocator and isn't used everywhere, so our wrapper doesn't see all the heap allocations for the process that don't go through that wrapper. I think you need nightly to do this, but you can use Vec::new_in to allocate from a custom allocator.

What we want is that all allocations go through one point that we wrap so we have complete visibility. I can see the wording here needs to be improved!

Cleaned up the note to make this clearer

What I mean is that, when you do:

#[global_allocator] static ALLOC: SampledAllocator<System> = SampledAllocator::<System>::DEFAULT;

It only uses SampledAllocator for whatever is the scope of static ALLOC. If this is only valid for say a crate (I have no idea), or a specific module, then the wrapper will also only be valid for this scope. Code outside of this will use the global allocator unwrapped, won't it? It looks like whether you put the default here or a custom allocator, or put differenty whatever is on the right of the = should not change the scope of where ALLOC is used as the allocator, and thus which allocations are tracked. Or at least that's what I would naively think.

yannham · 2026-07-03T10:07:31Z

+fn assert_ok(result: VoidResult, what: &str) {
+    match result {
+        VoidResult::Ok => {}
+        VoidResult::Err(err) => panic!("{what} failed: {err}"),
+    }
+}


Nit: I think you can just do assert_eq!(foo, Ok(_)) to the same effect instead of defining this helper. It should print foo and Err(e) in debug mode.

Co-authored-by: Yann Hamdaoui <yann.hamdaoui@gmail.com>

yannham

Partial review of libdd-heap-gotter

yannham · 2026-07-03T13:46:53Z

+    unsafe impl Send for LiveAlloc {}
+    unsafe impl Sync for LiveAlloc {}
+    impl LiveAlloc {
+        fn as_slice_mut(&self) -> &'static mut [u8] {


I think this function is unsafe, because you can just call it two times in a row and get aliasing mutable reference to self.ptr. So either it should take &mut self to avoid that, plus ensuring that nobody else can create mutable reference from another alias, or just mark this function as unsafe and let the caller enforce the invariants.

yannham · 2026-07-03T13:47:22Z

+        fn new(seed: u64) -> Self {
+            Rng(seed)
+        }
+        fn next(&mut self) -> u64 {
+            let mut z = self.0.wrapping_add(0x9e37_79b9_7f4a_7c15);
+            self.0 = z;
+            z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);
+            z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);
+            z ^ (z >> 31)
+        }
+        fn range(&mut self, lo: usize, hi: usize) -> usize {
+            lo + (self.next() as usize) % (hi - lo).max(1)
+        }
+        fn choice<'a, T>(&mut self, xs: &'a [T]) -> &'a T {
+            &xs[(self.next() as usize) % xs.len()]
+        }


Nit: spacing

Suggested change

fn new(seed: u64) -> Self {

Rng(seed)

}

fn next(&mut self) -> u64 {

let mut z = self.0.wrapping_add(0x9e37_79b9_7f4a_7c15);

self.0 = z;

z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);

z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);

z ^ (z >> 31)

}

fn range(&mut self, lo: usize, hi: usize) -> usize {

lo + (self.next() as usize) % (hi - lo).max(1)

}

fn choice<'a, T>(&mut self, xs: &'a [T]) -> &'a T {

&xs[(self.next() as usize) % xs.len()]

}

fn new(seed: u64) -> Self {

Rng(seed)

}

fn next(&mut self) -> u64 {

let mut z = self.0.wrapping_add(0x9e37_79b9_7f4a_7c15);

self.0 = z;

z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);

z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);

z ^ (z >> 31)

}

fn range(&mut self, lo: usize, hi: usize) -> usize {

lo + (self.next() as usize) % (hi - lo).max(1)

}

fn choice<'a, T>(&mut self, xs: &'a [T]) -> &'a T {

&xs[(self.next() as usize) % xs.len()]

}

yannham · 2026-07-03T13:53:44Z

@@ -0,0 +1,960 @@
+// Copyright 2025-Present Datadog, Inc. https://www.datadoghq.com/


Maybe not for everything, but could this file re-use existing Rust infrastructure for elf parsing, such as the elf crate for example?

good idea - i've not looked into it tbh! It'll need to work from dl_iterate_phdr in the running process but potentially. I will leave this open and have a look next week

yannham · 2026-07-03T13:54:53Z

+            // Fallback for degenerate .gnu.hash (e.g. executables with only
+            // undefined imports): estimate dynsym entry count from the
+            // distance between DT_SYMTAB and DT_STRTAB. This works because
+            // the linker always places .dynsym immediately before .dynstr.


Is it true for all linkers (mold, lld, ld) or is there any guarantee of that, or does it just happen to be the case right now?

Good catch. The GNU hash walk itself is ported from ddprof, but this DT_STRTAB - DT_SYMTAB fallback i've added. It's already a fallback for malformed binaries, so we should generally not get here, so I don't think we need to dig into mold etc. to validate - at least for the moment.

If it underestimates, the likely failure mode is that we skip patching some relocations. If it overestimates, valid relocation symbol indexes in normal loaded objects should still keep us safe. I've rewritten the comment to make this a bit clearer.

yannham · 2026-07-03T13:56:54Z

+    None
+}
+
+/// Mirror of ddprof's `check`: defining symbol, function/object/notype.


I know clankers love to do that when you translate from existing code (here I guess ddprof), but I would clean those references to ddprof (unless they help understanding or give proper attribution): if this crate is supposed to be a stand-alone implementation in the future, it's a bit confusing for the reader IMHO.

I actively left it around to help reviewers, but if it janks you out then it's obviously not being very helpful. I'll remove it!

yannham · 2026-07-03T14:41:43Z

+
+/// Resolved address of the real `malloc`; filled by `install_heap_overrides`.
+/// The bare `usize` payload is a function pointer.
+pub(crate) static ORIG_MALLOC: AtomicUsize = AtomicUsize::new(0);


Any reason to use usize here instead of e.g. AtomicPointer<c_void>?

Well I'm not sure it would be more ergonomic given what we do with it, so feel free to ignore. I didn't know you couldn't convert pointers to function pointers freely.

yannham · 2026-07-03T14:44:16Z

+        // SAFETY: caller guarantees T is the right `extern "C" fn(...)`
+        // type and that `v` was written by `apply_overrides` with a
+        // function of that signature.
+        Some(core::mem::transmute_copy::<usize, T>(&v))


Why do we need transmute_copy here, instead of just transmute(v)?

yannham · 2026-07-03T14:45:42Z

+    *const libc::pthread_attr_t,
+    StartRoutine,
+    *mut c_void,
+) -> c_int;


Do you know how stable those signatures are, or if there's any way to avoid drift as much as possible (possibly with runtime checks)?

yannham · 2026-07-03T14:48:10Z

+        return;
+    };
+    if ptr.is_null() {
+        return;


I don't know what is the behavior of free(NULL), but shouldn't we forward the call anyway just to make sure to preserve behavior? Maybe it sets errno somewhere or something and some crazy people rely on that?

yannham · 2026-07-03T14:49:23Z

+#[no_mangle]
+pub unsafe extern "C" fn gotter_calloc(nmemb: usize, size: usize) -> *mut c_void {
+    let Some(real): Option<CallocFn> = load_fn(&ORIG_CALLOC) else {
+        return std::ptr::null_mut();


Same remark about null handling.

scottgerring force-pushed the sgg/heap-prof-poc branch from b6aab83 to 72d9f77 Compare June 26, 2026 13:19

scottgerring force-pushed the sgg/heap-prof-poc branch from 72d9f77 to fc5c074 Compare June 26, 2026 13:22

scottgerring force-pushed the sgg/heap-prof-poc branch 2 times, most recently from b80ec4a to 11cf066 Compare June 26, 2026 14:37

scottgerring changed the title ~~[PROF-15190]: DRAFT - add initial heap profiling crates~~ feat(heap-profiling): add initial heap profiling primitives [PROF-15190] Jun 29, 2026

scottgerring force-pushed the sgg/heap-prof-poc branch from 11cf066 to f40cfdf Compare June 29, 2026 09:53

github-actions Bot added the ci-build label Jun 29, 2026

scottgerring force-pushed the sgg/heap-prof-poc branch 19 times, most recently from 6fe0523 to d10a58f Compare July 1, 2026 07:01

chatgpt-codex-connector Bot reviewed Jul 1, 2026

View reviewed changes

address review comments

13be13c

scottgerring force-pushed the sgg/heap-prof-poc branch from ad0beee to 06d136f Compare July 1, 2026 14:59

move realloc logic into sampler

d46e2c4

scottgerring force-pushed the sgg/heap-prof-poc branch from 06d136f to d46e2c4 Compare July 1, 2026 15:02

scottgerring requested a review from Copilot July 1, 2026 15:08

Copilot started reviewing on behalf of scottgerring July 1, 2026 15:08 View session

Copilot AI reviewed Jul 1, 2026

View reviewed changes

scottgerring added 2 commits July 1, 2026 17:27

make the gotter demo do wilder things with libc allocation APIs

a89cfb7

address copilot feedback

3db26ad

jbachorik added the sphinx:critical Sphinx: critical — human review required label Jul 2, 2026

scottgerring force-pushed the sgg/heap-prof-poc branch 2 times, most recently from 17c7bba to 38fba33 Compare July 2, 2026 14:47

address jb robot army review comments

c051bd7

scottgerring force-pushed the sgg/heap-prof-poc branch from 38fba33 to c051bd7 Compare July 3, 2026 08:13

scottgerring mentioned this pull request Jul 3, 2026

feat(heap-profiling): expose libdatadog heap profiling primitive to users [PROF-15190] DataDog/dd-trace-rs#273

Draft

address a correctness issue

f0a58df

scottgerring force-pushed the sgg/heap-prof-poc branch from 0caaaad to f0a58df Compare July 3, 2026 09:24

scottgerring added 4 commits July 3, 2026 12:53

x86-64 can't sample 4 KiB alignment; add a test to confirm this

2533727

skip deepbind

99cfed4

fix issue in test

00e15fd

scottgerring force-pushed the sgg/heap-prof-poc branch from af47843 to 00e15fd Compare July 3, 2026 12:11

cleanup APIs

adc2f14

yannham reviewed Jul 3, 2026

View reviewed changes

scottgerring and others added 3 commits July 3, 2026 15:55

Apply suggestions from code review

9fc5603

Co-authored-by: Yann Hamdaoui <yann.hamdaoui@gmail.com>

more code review work

ff846ec

better tests

df27ec1

yannham reviewed Jul 3, 2026

View reviewed changes

more review comments & a correctness fix

bebaa38

		@@ -0,0 +1,960 @@
		// Copyright 2025-Present Datadog, Inc. https://www.datadoghq.com/

Uh oh!

Conversation

scottgerring commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Interesting Details

Motivation

Additional Notes

How to test the change?

Uh oh!

datadog-prod-us1-4 Bot commented Jun 26, 2026 • edited by datadog-datadog-prod-us1-2 Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Warnings

ℹ️ Info

Uh oh!

github-actions Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📚 Documentation Check Results

📦 libdd-heap-allocator - 44 warning(s)

📦 libdd-heap-gotter-ffi - 322 warning(s)

📦 libdd-heap-gotter - 44 warning(s)

📦 libdd-heap-sampler - 44 warning(s)

📦 tools - 173 warning(s)

Uh oh!

github-actions Bot commented Jun 26, 2026

Clippy Allow Annotation Report

Summary by Rule

Annotation Counts by File

Annotation Stats by Crate

About This Report

Uh oh!

github-actions Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔒 Cargo Deny Results

📦 libdd-heap-allocator - ✅ No issues

📦 libdd-heap-gotter-ffi - 1 error(s)

📦 libdd-heap-gotter - ✅ No issues

📦 libdd-heap-sampler - ✅ No issues

📦 tools - 3 error(s)

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yannham left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yannham Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

scottgerring commented Jun 26, 2026 •

edited

Loading

datadog-prod-us1-4 Bot commented Jun 26, 2026 •

edited by datadog-datadog-prod-us1-2 Bot

Loading

github-actions Bot commented Jun 26, 2026 •

edited

Loading

📦 `libdd-heap-allocator` - 44 warning(s)

📦 `libdd-heap-gotter-ffi` - 322 warning(s)

📦 `libdd-heap-gotter` - 44 warning(s)

📦 `libdd-heap-sampler` - 44 warning(s)

📦 `tools` - 173 warning(s)

github-actions Bot commented Jun 26, 2026 •

edited

Loading

📦 `libdd-heap-allocator` - ✅ No issues

📦 `libdd-heap-gotter-ffi` - 1 error(s)

📦 `libdd-heap-gotter` - ✅ No issues

📦 `libdd-heap-sampler` - ✅ No issues

📦 `tools` - 3 error(s)

yannham Jul 3, 2026 •

edited

Loading