
Fix #154: hisi_gen2_read_exit_cb segfault that emptied jxf22 traces #155

Merged
widgetii merged 1 commit into master from
fix/issue-154-hisi-gen2-read-cb-segfault
May 4, 2026


@widgetii widgetii commented May 4, 2026

Summary

  • Root cause of #154 (ipctool trace produces empty output on Hi3518EV200 + libsns_jxf22.so despite write() being used) was a one-line bug in hisi_gen2_read_exit_cb: memset(&buf, 0, sizeof(nbyte)) zeroed the local pointer instead of the buffer, so the next copy_from_process(..., NULL, ..) SIGSEGV'd the tracer right after the first i2c_read(). The streamer kept running untraced, and the trace stopped at ~10 lines, which is exactly the empty-trace symptom in the bug report.
  • Defensive src/ptrace.c hardening to match strace's PTRACE setup: PTRACE_O_TRACESYSGOOD + gate syscall stops on SIGTRAP|0x80, PTRACE_O_TRACEEXEC + a PTRACE_EVENT_EXEC handler, drop the redundant PTRACE_ATTACH, drop the early PTRACE_SYSCALL race on new clones, NULL-guard syscall_open. Also adds an IPCTOOL_TRACE_DEBUG=1 env knob (zero overhead unless set) for triaging future "fd-open-but-not-in-trace" reports.
  • New soi_jx entry in tools/trace_segment.py's INIT_PATTERNS (reg 0x12, init 0x40, stream-on 0x00) covers JXF22/JXF23/JXH62/SOI 8-bit-register sensors.

Diagnostic trail

This took several rounds of bisection on real Hi3518EV200 hardware before the actual bug surfaced: the empty-trace symptom looked like a ptrace clone-following or signal-handling issue (every PR in the #145 to #153 chain also looked at those angles). The breakthrough was opening a core dump from the running tracer in gdb:

#0 copy_from_process (child=2317, addr=3204084084, ptr=0x0, size=1) at src/ptrace.c:153
#1 hisi_gen2_read_exit_cb (proc=…, fd=19, …, nbyte=1, sysret=1) at src/ptrace.c:509
#2 syscall_read_exit (…)  at src/ptrace.c:1112
#3 exit_syscall (…)
#4 do_trace

buf=0x0 in frame #1 made the bug obvious. The supporting hardening in this PR was developed during the diagnostic process before the segfault was the confirmed root cause; each item has independent justification (matches strace, prevents a real edge case) so they're kept.

Verification — Definition of Done from #154

  • Hi3518EV200 + Majestic + libsns_jxf22.so capture via ipctool trace → 79 sensor_write_register lines (target ≥50, matches strace -f baseline of 79).
  • tools/trace_segment.py on the captured log → init_pattern: soi_jx, 75 init events.
  • tools/trace_to_driver.py --sensor jxf22 → emits jxf22_linear_init that passes gcc -Wall -Wextra -fsyntax-only.
  • tools/trace_diff.py against OpenIPC/glutinium/hi35xx_sensor_jxf22/src/jxf22_sensor_ctl.c::sensor_linear_1080p30_init → 96.1% address / 100% value match (target ≥90 / ≥80). Raw byte-for-byte sequence comparison (after normalising hex padding) is 100% identical across all 79 writes; the 96.1% scoped diff drops 4 trailing writes that the segmenter classifies as post_init because 0x12=0x00 is treated as stream-on.
  • tools/test_pipeline.sh (CI smoke test) still passes — synthetic SmartSens and Sony IMX traces still segment + generate + compile.
  • SC2315E pattern detection unchanged: synthetic SmartSens trace still hits init_pattern=smartsens (the new soi_jx entry is appended after smartsens/sony_imx in the table). trace_to_driver and trace_diff are unchanged so an unchanged SC2315E trace produces an unchanged diff. End-to-end re-run on real SC2315E hardware not done — no access from this environment.
  • Documented in docs/sensor-driver-extraction.md: bullet in "When the trace is empty anyway" with a core-dump + gdb recipe explaining the i2c_read()-then-silence signature; IPCTOOL_TRACE_DEBUG=1 documented in Troubleshooting; soi_jx row added to the family table.

Test plan

  • tools/test_pipeline.sh (CI gate) — passes locally
  • Cross-compile with OpenIPC CI toolchain + UPX-pack — succeeds, no warnings on ptrace.c
  • On-hardware capture on a private Hi3518EV200 lab camera — ipctool stays alive, captures 79 jxf22 register writes
  • Pipeline end-to-end on the captured trace — segments, generates, diffs cleanly against glutinium reference (100% byte-for-byte across the full 79-write init sequence)
  • Cross-platform smoke: re-run an SC2315E capture on a Hi3516CV300/EV200 box to confirm the prior 100/100/100 diff is preserved (requires hardware access from a maintainer)

Fixes #154

🤖 Generated with Claude Code

Root cause:

  unsigned char *buf = alloca(nbyte);
  memset(&buf, 0, sizeof(nbyte));   // <-- bug
  copy_from_process(proc->pid, remote_addr, buf, nbyte);

`memset(&buf, 0, sizeof(nbyte))` zeroed the local POINTER variable
(not the buffer it pointed at), so `buf` became NULL. The next
`copy_from_process(..., buf, ..)` then SIGSEGV'd the tracer at
`buf[i / sizeof(size_t)] = ret`.

On HISI_V2 / V2A targets where the sensor driver does an I2C read of
the chip-ID register right after `I2C_SLAVE_FORCE` (jxf22, sc2235,
ar0130, ...) ipctool died immediately after the first `i2c_read()`
line. The streamer kept running untraced and finished its 79
register writes with no observer; the captured log stopped at ~10
lines. That is the empty-trace symptom of #154 - and a latent
hazard for other V2/V2A sensor families that follow the same probe
pattern.

The memset was redundant - copy_from_process overwrites buf in full
- so just drop it.
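
The pointer-versus-buffer mix-up can be reproduced outside ipctool. The
following is a minimal sketch, not the code from src/ptrace.c: the helper
names `clear_buggy` and `clear_fixed` are hypothetical, and it assumes a
mainstream Linux ABI where a null pointer is all-zero bits and
`sizeof(size_t)` equals the pointer size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Buggy form: the destination is the ADDRESS OF the pointer variable, so
 * memset() zeroes sizeof(size_t) bytes of the pointer itself. The buffer
 * it pointed at is left untouched. Returns the pointer value left behind. */
unsigned char *clear_buggy(unsigned char *buf, size_t nbyte)
{
    memset(&buf, 0, sizeof(nbyte)); /* same shape as the line flagged above */
    return buf;                     /* NULL under the stated ABI assumption */
}

/* Intended form: zero the nbyte bytes that buf points at. */
unsigned char *clear_fixed(unsigned char *buf, size_t nbyte)
{
    assert(buf != NULL);
    memset(buf, 0, nbyte);
    return buf;                     /* pointer value is unchanged */
}
```

Calling `clear_buggy` on a valid array hands back a null pointer while the
array keeps its contents, which matches the `ptr=0x0` seen in the gdb
backtrace. The actual fix in this commit is simply to delete the memset.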

Supporting ptrace.c hardening (matches strace's setup):

* PTRACE_O_TRACESYSGOOD + gate syscall stops on (SIGTRAP|0x80).
  Defends against a stray real SIGTRAP being processed as a syscall
  enter, which would flip per-PID enter/exit parity and corrupt
  every subsequent register read.
* PTRACE_O_TRACEEXEC + a PTRACE_EVENT_EXEC handler. Without it the
  kernel signals execve completion with a legacy plain SIGTRAP that
  the wait loop would inject back, killing the tracee.
* Drop the redundant PTRACE_ATTACH on the main tracee - it always
  returned EPERM because the child PTRACE_TRACEME'd itself first.
* Drop the early `ptrace(PTRACE_SYSCALL, new_child, 0, 0)` in the
  CLONE/FORK/VFORK handler - it raced with the kernel's auto-attach
  and is unnecessary because the new tracee's SIGSTOP arrives
  through the wait loop on its own.
* NULL-guard `syscall_open` against `copy_from_process_str()`
  failure (would otherwise NULL-deref in strcmp/IS_PREFIX).
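
To illustrate the first two items, here is a hedged sketch of how a wait
loop can tell the stop kinds apart once `PTRACE_O_TRACESYSGOOD` and
`PTRACE_O_TRACEEXEC` are set via `PTRACE_SETOPTIONS`. The enum and the
function name `classify_wait_status` are hypothetical and do not come
from src/ptrace.c; only the status encoding (Linux, glibc headers) is
standard.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

enum stop_kind {
    STOP_NOT_STOPPED,  /* exited, killed by a signal, or continued */
    STOP_SYSCALL,      /* syscall enter/exit stop */
    STOP_EXEC_EVENT,   /* PTRACE_EVENT_EXEC stop */
    STOP_OTHER_EVENT,  /* another PTRACE_EVENT_* stop (clone, fork, ...) */
    STOP_SIGNAL        /* real signal, to be re-injected into the tracee */
};

enum stop_kind classify_wait_status(int status)
{
    if (!WIFSTOPPED(status))
        return STOP_NOT_STOPPED;

    int sig = WSTOPSIG(status);              /* bits 8..15 of status */
    unsigned event = (unsigned)status >> 16; /* PTRACE_EVENT_* code, if any */

    /* With TRACESYSGOOD, syscall stops carry bit 7 in the stop signal. */
    if (sig == (SIGTRAP | 0x80))
        return STOP_SYSCALL;
    /* With TRACEEXEC, execve completion is reported as an event stop. */
    if (sig == SIGTRAP && event == PTRACE_EVENT_EXEC)
        return STOP_EXEC_EVENT;
    if (sig == SIGTRAP && event != 0)
        return STOP_OTHER_EVENT;
    /* A plain SIGTRAP lands here and no longer flips enter/exit parity. */
    return STOP_SIGNAL;
}
```

Keeping the classification in one pure function makes the parity logic
easy to check with synthetic status words, without a live tracee.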

New diagnostic env knob:

* `IPCTOOL_TRACE_DEBUG=1` makes `syscall_open` and `syscall_write_exit`
  log to stderr (filename, fd, callback wired). Off by default,
  zero overhead unless set. Used to triage cases where /proc/<pid>/fd
  shows a device open but the trace contains no banner/writes for it.
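
One common way to keep such a knob cheap is to read the environment once
and cache the answer. The sketch below shows that pattern only; the
function names are hypothetical, and treating exactly the string "1" as
"on" is an assumption, since the commit message does not say how other
values are handled.

```c
#include <stddef.h>
#include <stdlib.h>

/* Pure helper: decide whether a raw environment value enables debugging.
 * Assumption: only the exact string "1" counts as enabled. */
int parse_debug_flag(const char *value)
{
    return value != NULL && value[0] == '1' && value[1] == '\0';
}

/* Cached accessor: getenv() runs on the first call only, so later calls
 * on the hot tracing path cost a single integer comparison. */
int trace_debug_enabled(void)
{
    static int cached = -1; /* -1 means the environment was not read yet */
    if (cached < 0)
        cached = parse_debug_flag(getenv("IPCTOOL_TRACE_DEBUG"));
    return cached;
}
```

A caller would then wrap each diagnostic line in
`if (trace_debug_enabled()) fprintf(stderr, ...)`.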

New segmenter pattern:

* `tools/trace_segment.py` adds `soi_jx` (reg 0x12, init=0x40,
  stream-on=0x00) for SOI/JX 8-bit-register sensors (JXF22, JXF23,
  JXH62, ...). Tried after smartsens/sony_imx so existing detection
  is unchanged.
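
The marker idea behind `soi_jx` can be sketched in a few lines. This is
not a port of tools/trace_segment.py (which is Python); it is a C
illustration using only the three values given above. The struct, the
function name, and the choice to count the stream-on write as the last
init event are all assumptions.

```c
#include <stddef.h>

struct reg_write {
    unsigned reg;
    unsigned val;
};

/* Marker values for the soi_jx family as stated in the commit message. */
enum {
    SOI_JX_CTRL_REG = 0x12,
    SOI_JX_INIT_VAL = 0x40,
    SOI_JX_STREAM_ON_VAL = 0x00
};

/* Count the writes in the init segment: from the first 0x12=0x40 through
 * the first later 0x12=0x00, inclusive. Returns 0 if either marker is
 * missing. Writes after the stream-on marker are treated as post-init. */
size_t soi_jx_count_init(const struct reg_write *w, size_t n)
{
    size_t start = n; /* n means "init marker not seen yet" */

    for (size_t i = 0; i < n; i++) {
        if (w[i].reg != SOI_JX_CTRL_REG)
            continue;
        if (start == n && w[i].val == SOI_JX_INIT_VAL)
            start = i;
        else if (start != n && w[i].val == SOI_JX_STREAM_ON_VAL)
            return i - start + 1;
    }
    return 0;
}
```

Under this reading, any writes that follow 0x12=0x00 fall outside the
counted segment, which is consistent with the trailing writes being
classified as post_init in the verification notes.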

Documentation:

* `docs/sensor-driver-extraction.md`: document the segfault-induced
  empty trace in "When the trace is empty anyway" with a core-dump
  + gdb recipe; document `IPCTOOL_TRACE_DEBUG=1` in Troubleshooting;
  add the soi_jx row to the family table.

Verification (Definition-of-Done from #154):

  [x] Hi3518EV200 + Majestic + libsns_jxf22.so capture: 79 writes
      (target >=50, matches strace -f baseline).
  [x] tools/trace_segment.py emits init_pattern=soi_jx with 75 init
      events (was 0).
  [x] tools/trace_to_driver.py emits jxf22_linear_init that passes
      `gcc -Wall -Wextra -fsyntax-only`.
  [x] tools/trace_diff.py vs OpenIPC/glutinium hi35xx_sensor_jxf22
      reports 96.1% address / 100% value match (target >=90/>=80).
  [x] tools/test_pipeline.sh (CI) still passes.
  [x] SC2315E pattern detection unchanged: synthetic SmartSens
      trace still hits init_pattern=smartsens (soi_jx is tried
      after smartsens). The trace_to_driver/trace_diff scripts are
      unchanged so an unchanged SC2315E trace produces an
      unchanged diff. End-to-end re-verification on real SC2315E
      hardware not run (no access from this environment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@widgetii widgetii merged commit 7e446d4 into master May 4, 2026
3 checks passed
@widgetii widgetii deleted the fix/issue-154-hisi-gen2-read-cb-segfault branch May 4, 2026 04:32