
feat(wasm-runtime): complete host function infrastructure for WASM challenges #49

Closed
echobt wants to merge 4 commits into main from feat/wasm-runtime-infrastructure-gaps

Conversation


@echobt echobt commented Feb 18, 2026

Summary

Closes multiple gaps in the WASM runtime infrastructure so that external challenge modules built against platform-challenge-sdk-wasm can be loaded and executed with full feature support — including per-challenge storage, container execution delegation, challenge data loading, and correct ABI handling.

Changes

Storage host functions (GAP 1 & GAP 4)

  • Register storage_delete and storage_propose_write handlers in StorageHostFunctions::register() — these were declared as constants but never wired into the linker (see the registration sketch after this list)
  • Implement handle_storage_delete and handle_storage_propose_write with proper policy enforcement, key/value validation, and backend delegation
  • Wire a real StorageBackend into WasmChallengeExecutor via WasmExecutorConfig.storage_backend instead of silently using NoopStorageBackend
  • Pass storage_host_config and storage_backend through to InstanceConfig with allow_direct_writes: true
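
A minimal sketch of the missing registration wiring, assuming the wasmtime and anyhow crates; the "platform_storage" namespace, the handler body, and the RuntimeState contents are illustrative placeholders, not the PR's implementation:

```rust
use wasmtime::{Caller, Linker};

// Placeholder for the real runtime state (backend, policy, counters).
struct RuntimeState;

fn register(linker: &mut Linker<RuntimeState>) -> anyhow::Result<()> {
    // Without this call the name exists only as a declared constant, and any
    // module importing storage_delete fails at instantiation.
    linker.func_wrap(
        "platform_storage",
        "storage_delete",
        |_caller: Caller<'_, RuntimeState>, _key_ptr: i32, _key_len: i32| -> i32 {
            // Real handler: validate the key, enforce policy, delegate to the
            // configured StorageBackend, and map errors to a status code.
            0
        },
    )?;
    Ok(())
}
```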

get_tasks ABI fix (GAP 2)

  • Change execute_get_tasks() from call_return_i32 to call_return_i64 to match the SDK export, which returns a packed (len << 32) | ptr via pack_ptr_len (see the sketch after this list)
  • Remove the call to non-existent get_tasks_result_len export
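
The packed return convention in miniature; pack_ptr_len below is a stand-in reproducing the (len << 32) | ptr layout named above, not the SDK's actual code:

```rust
// Guest side: pack a pointer and length into one i64 return value.
fn pack_ptr_len(ptr: u32, len: u32) -> i64 {
    (((len as u64) << 32) | ptr as u64) as i64
}

fn main() {
    let packed = pack_ptr_len(0x1000, 256);

    // Host side: unpack, mirroring the executor's new handling.
    let out_len = (packed >> 32) as i32; // high 32 bits carry the length
    let out_ptr = (packed & 0xFFFF_FFFF) as i32; // low 32 bits carry the pointer

    assert_eq!((out_ptr, out_len), (0x1000, 256));
}
```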

EvaluationOutput type alignment (GAP 3)

  • Add missing metrics: Option<Vec<u8>> and details: Option<Vec<u8>> fields (with #[serde(default)]) to the executor-side EvaluationOutput and EvaluationInput structs so bincode deserialization succeeds when WASM modules populate these fields
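
A self-contained sketch of the alignment, assuming bincode 1.x and a trimmed-down field set; the real structs carry more fields than shown:

```rust
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct EvaluationOutput {
    valid: bool,
    score: i64,
    #[serde(default)]
    metrics: Option<Vec<u8>>,
    #[serde(default)]
    details: Option<Vec<u8>>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let guest_side = EvaluationOutput {
        valid: true,
        score: 8_700,
        metrics: Some(vec![1, 2, 3]),
        details: None,
    };
    let bytes = bincode::serialize(&guest_side)?;

    // This round-trip only works because the host-side struct declares the
    // same metrics/details fields the guest serialized; bincode carries no
    // field names to fall back on.
    let host_side: EvaluationOutput = bincode::deserialize(&bytes)?;
    assert_eq!(host_side.score, 8_700);
    Ok(())
}
```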

Container execution host functions (GAP 5)

  • New platform_container namespace with container_run host function
  • ContainerPolicy controls allowed images, resource limits, network access, and max containers per execution
  • Implementation delegates to docker run via std::process::Command with timeout enforcement
  • ContainerRunRequest/ContainerRunResponse types added to both host and guest SDK
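
The response type's field set, as quoted in the review comments below; the construction here is purely illustrative:

```rust
pub struct ContainerRunResponse {
    pub exit_code: i32,
    pub stdout: Vec<u8>,
    pub stderr: Vec<u8>,
    pub duration_ms: u64,
}

fn main() {
    let resp = ContainerRunResponse {
        exit_code: 0,
        stdout: b"ok\n".to_vec(),
        stderr: Vec::new(),
        duration_ms: 120,
    };
    // Exit code 0 signals success, mirroring process conventions.
    if resp.exit_code == 0 {
        print!("{}", String::from_utf8_lossy(&resp.stdout));
    }
}
```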

Challenge data loading host functions (GAP 6)

  • New platform_data namespace with data_get and data_list host functions
  • DataBackend trait with NoopDataBackend and FilesystemDataBackend implementations (trait shape sketched after this list)
  • DataPolicy controls allowed paths, max file size, and enable/disable
  • DataState tracks per-execution access counters
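
The trait shape implied by the diff: get's signature is taken from the review excerpt further down, while list's signature and the Send + Sync bounds are assumptions:

```rust
#[derive(Debug)]
#[allow(dead_code)]
enum DataError {
    PathNotAllowed(String),
    Io(String),
}

trait DataBackend: Send + Sync {
    fn get(&self, challenge_id: &str, key: &str) -> Result<Option<Vec<u8>>, DataError>;
    fn list(&self, challenge_id: &str, prefix: &str) -> Result<Vec<String>, DataError>;
}

// The no-op backend is a safe default: it never exposes any data.
struct NoopDataBackend;

impl DataBackend for NoopDataBackend {
    fn get(&self, _challenge_id: &str, _key: &str) -> Result<Option<Vec<u8>>, DataError> {
        Ok(None)
    }
    fn list(&self, _challenge_id: &str, _prefix: &str) -> Result<Vec<String>, DataError> {
        Ok(Vec::new())
    }
}
```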

Arena allocator sizing (GAP 7)

  • Add huge-arena feature to challenge-sdk-wasm for 16 MiB arena allocation (vs 1 MiB default / 4 MiB large-arena)
  • Properly gate large-arena so it does not conflict with huge-arena (one possible cfg gating is sketched below)
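
One way the gating can look, with huge-arena taking precedence so enabling both features still compiles; the sizes come from this PR, the cfg arrangement is an assumption:

```rust
#[cfg(feature = "huge-arena")]
const ARENA_SIZE: usize = 16 * 1024 * 1024; // 16 MiB

#[cfg(all(feature = "large-arena", not(feature = "huge-arena")))]
const ARENA_SIZE: usize = 4 * 1024 * 1024; // 4 MiB

#[cfg(not(any(feature = "large-arena", feature = "huge-arena")))]
const ARENA_SIZE: usize = 1024 * 1024; // 1 MiB default

fn main() {
    println!("arena size: {} bytes", ARENA_SIZE);
}
```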

Bridge score conversion fix (GAP 8)

  • Fix output_to_response to divide by 10_000.0 instead of 100.0, matching score_f64_scaled which multiplies by 10,000
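
The arithmetic behind the fix; score_f64_scaled is a stand-in matching its documented ×10,000 behavior, not the SDK's actual source:

```rust
fn score_f64_scaled(score: f64) -> i64 {
    (score * 10_000.0).round() as i64
}

fn main() {
    let raw = score_f64_scaled(0.87); // 8_700 on the wire

    let fixed = raw as f64 / 10_000.0; // new divisor: recovers 0.87
    let buggy = raw as f64 / 100.0;    // old divisor: yields 87.0, far out of range

    assert_eq!(fixed, 0.87);
    assert_eq!(buggy, 87.0);
}
```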

HTTP host function signature fix (GAP 9)

  • Align http_get host registration to accept 4 parameters (req_ptr, req_len, resp_ptr, resp_len) matching the guest SDK extern declaration, instead of only 3 parameters with a hardcoded response buffer size
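
The guest-side import the host must mirror; the function name and 4-parameter shape come from this PR, while the "platform_network" module name is an assumption for illustration:

```rust
// Any mismatch between this declaration and the host's linker registration
// surfaces as an "incompatible import type" error at instantiation time.
#[link(wasm_import_module = "platform_network")]
extern "C" {
    fn http_get(req_ptr: i32, req_len: i32, resp_ptr: i32, resp_len: i32) -> i32;
}
```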

Runtime integration

  • Add DataState, ContainerState to RuntimeState
  • Add data_policy, data_backend, container_policy to InstanceConfig with sensible defaults
  • Register DataHostFunctions and ContainerHostFunctions in the linker during instantiation
  • Propagate all new config fields through InstanceConfig::default()
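
Why a trailing ..Default::default() picks up the new fields, in a reduced model (the real InstanceConfig holds many more fields and a trait-object backend):

```rust
use std::sync::Arc;

// Stand-in types; the real ones live in wasm-runtime-interface.
#[derive(Default)]
struct DataPolicy;
#[derive(Default)]
struct ContainerPolicy;
#[derive(Default)]
struct NoopDataBackend;

#[derive(Default)]
#[allow(dead_code)]
struct InstanceConfig {
    data_policy: DataPolicy,
    data_backend: Arc<NoopDataBackend>,
    container_policy: ContainerPolicy,
}

fn main() {
    // Every field not named explicitly is filled from Default, which is how
    // the pre-existing InstanceConfig initializations absorbed the three new
    // fields without listing them.
    let _cfg = InstanceConfig {
        data_policy: DataPolicy,
        ..Default::default()
    };
}
```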

Testing

  • All existing unit tests pass (700+ tests across workspace)
  • Pre-push CI checks pass: formatting, cargo check, clippy, tests

Summary by CodeRabbit

Release Notes

  • New Features

    • Added container execution support for running containerized workloads with policy controls
    • Introduced data access layer enabling key-value get and list operations
    • Enhanced storage operations with delete and propose_write capabilities
    • Added "huge-arena" configuration option for WASM memory allocation
  • Bug Fixes

    • Adjusted score normalization (now divided by 10,000 instead of 100)

The host registered http_get with 3 params (req_ptr, req_len, resp_ptr)
but the guest SDK declares 4 params (req_ptr, req_len, resp_ptr, resp_len).
This mismatch caused WASM instantiation failures with 'incompatible import
type' when a module used http_get.

Added resp_len parameter to both the linker closure and handle_http_get
function. Removed the now-unused DEFAULT_RESPONSE_BUF_SIZE constant.
…and data modules

Add ..Default::default() to three InstanceConfig initializations in
wasm_executor.rs that were missing the container_policy, data_policy,
and data_backend fields introduced by the container and data module
integration.

coderabbitai bot commented Feb 18, 2026

📝 Walkthrough

Walkthrough

This PR introduces container execution and data access host functions for WASM runtime, adds configurable storage backend injection to the WASM executor, enhances evaluation I/O structures with optional fields, expands arena size configuration, and adjusts score normalization calculation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Executor Storage Backend Configuration**<br>bins/validator-node/src/main.rs, bins/validator-node/src/wasm_executor.rs | Added storage_backend field to WasmExecutorConfig; wired InMemoryStorageBackend into executor initialization; updated EvaluationInput/Output with optional task_definition, environment_config, metrics, and details fields; integrated storage_host_config into instance configuration; changed get_tasks result handling to use 64-bit packed values. |
| **Challenge SDK WASM Enhancements**<br>crates/challenge-sdk-wasm/Cargo.toml, crates/challenge-sdk-wasm/src/alloc_impl.rs, crates/challenge-sdk-wasm/src/lib.rs, crates/challenge-sdk-wasm/src/types.rs | Added "huge-arena" feature (16 MiB) alongside "large-arena" (4 MiB); added ContainerRunRequest and ContainerRunResponse public types with execution and response metadata; re-exported container types from lib. |
| **Container Execution Host Functions**<br>crates/wasm-runtime-interface/src/container.rs | New module implementing policy-driven container execution with ContainerPolicy, ContainerState, and ContainerHostFunctions; includes image whitelisting, resource limits, timeout enforcement, Docker-based execution, and comprehensive error handling via ContainerHostStatus and ContainerExecError. |
| **Data Access Host Functions**<br>crates/wasm-runtime-interface/src/data.rs | New module implementing data backend access with DataPolicy, DataState, and DataHostFunctions; supports get and list operations with FilesystemDataBackend and NoopDataBackend implementations; enforces read limits and key/value size constraints. |
| **Runtime Integration**<br>crates/wasm-runtime-interface/src/runtime.rs, crates/wasm-runtime-interface/src/lib.rs | Extended InstanceConfig and RuntimeState with data_policy, data_backend, container_policy, and corresponding state fields; added reset utility methods; registered DataHostFunctions and ContainerHostFunctions in linker; re-exported new public types. |
| **Storage Operations**<br>crates/wasm-runtime-interface/src/storage.rs | Added host function implementations for storage_delete and storage_propose_write; integrated operations counting and error status mapping into existing StorageHostFunctions registration. |
| **Score Normalization**<br>crates/wasm-runtime-interface/src/bridge.rs | Changed score divisor from 100.0 to 10,000.0 in output_to_response; updated test assertion to pass 10,000 instead of 100 for output construction. |
| **Network Configuration**<br>crates/wasm-runtime-interface/src/network.rs | Removed DEFAULT_RESPONSE_BUF_SIZE constant; updated handle_http_get to accept explicit resp_len parameter instead of using hardcoded default. |

Sequence Diagram(s)

sequenceDiagram
    participant WASM as WASM Module
    participant Host as Container Host<br/>(container_run)
    participant Policy as Policy Engine
    participant Docker as Docker Executor
    participant Mem as WASM Memory

    WASM->>Mem: Write request (image,<br/>command, env, etc.)
    WASM->>Host: Call container_run(req_ptr,<br/>req_len, resp_ptr)
    Host->>Mem: Read ContainerRunRequest
    Host->>Policy: Check enabled
    Host->>Policy: Check image allowed
    Host->>Policy: Validate resource limits
    alt Policy Violation
        Host->>Host: Set error status
        Host->>Mem: Write error response
        Host->>WASM: Return error code
    else Policy Pass
        Host->>Docker: execute_container(image,<br/>command, env, ...)
        Docker-->>Host: ContainerRunResponse<br/>(exit_code, stdout,<br/>stderr, duration)
        Host->>Mem: Write response
        Host->>WASM: Return success status
    end
sequenceDiagram
    participant WASM as WASM Module
    participant Host as Data Host<br/>(data_get/list)
    participant Policy as Policy Engine
    participant Backend as Data Backend
    participant Mem as WASM Memory

    WASM->>Mem: Write key/prefix
    WASM->>Host: Call data_get(key_ptr,<br/>key_len, resp_ptr)
    Host->>Mem: Read key
    Host->>Policy: Check enabled
    Host->>Policy: Validate key size
    Host->>Policy: Check read count
    alt Policy Violation
        Host->>Host: Set error status
        Host->>WASM: Return status code
    else Policy Pass
        Host->>Backend: get(challenge_id, key)
        Backend-->>Host: Option<Vec<u8>>
        alt Value Found
            Host->>Mem: Write value
            Host->>WASM: Return length
        else Not Found
            Host->>WASM: Return NotFound status
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 New containers run with flair,
Data flows through host-space air,
Storage backends, scores precise—
WASM gets a feature spice!
Platform grows, one function at a time. 🚀

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 29.58%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title directly captures the main objective: completing the host function infrastructure for WASM challenges. It accurately reflects the PR's core purpose of closing gaps in the WASM runtime by adding storage, container, and data host functions. |



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (8)
crates/challenge-sdk-wasm/src/types.rs (1)

131-151: Consider adding an is_success() helper to ContainerRunResponse.

SandboxExecResponse (line 88) provides is_success() for convenience. The new ContainerRunResponse has the same exit_code field but lacks this helper. Adding it would keep the API surface consistent.

Proposed addition
 pub struct ContainerRunResponse {
     pub exit_code: i32,
     pub stdout: Vec<u8>,
     pub stderr: Vec<u8>,
     pub duration_ms: u64,
 }
+
+impl ContainerRunResponse {
+    pub fn is_success(&self) -> bool {
+        self.exit_code == 0
+    }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/challenge-sdk-wasm/src/types.rs` around lines 131 - 151,
ContainerRunResponse lacks a convenience method to check success like
SandboxExecResponse::is_success(); add a public method fn is_success(&self) ->
bool on ContainerRunResponse that returns true when self.exit_code == 0,
mirroring the behavior of SandboxExecResponse::is_success(), and update any
docs/tests if present to use ContainerRunResponse::is_success for consistency
across the API.
bins/validator-node/src/main.rs (1)

379-379: Prefer the already-imported Arc alias over std::sync::Arc.

Arc is imported at line 34 (use std::sync::Arc;). Using the fully qualified path is inconsistent with the rest of the file.

Proposed fix
-        storage_backend: std::sync::Arc::new(wasm_runtime_interface::InMemoryStorageBackend::new()),
+        storage_backend: Arc::new(wasm_runtime_interface::InMemoryStorageBackend::new()),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bins/validator-node/src/main.rs` at line 379, The field initialization for
storage_backend uses the fully-qualified std::sync::Arc::new; replace it with
the already-imported Arc alias to match the rest of the file by changing
std::sync::Arc::new(wasm_runtime_interface::InMemoryStorageBackend::new()) to
Arc::new(wasm_runtime_interface::InMemoryStorageBackend::new()) so the
storage_backend assignment uses Arc::new consistently.
crates/wasm-runtime-interface/src/runtime.rs (1)

196-236: RuntimeState::new now takes 16 parameters — consider a builder or config struct.

The #[allow(clippy::too_many_arguments)] suppression is already present, but adding two more parameters (now 16 total) pushes this further. A builder or a dedicated RuntimeStateConfig struct passed as a single argument would improve readability and reduce the chance of argument-ordering mistakes.

Not urgent given the existing suppression, but worth considering as the surface continues to grow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/runtime.rs` around lines 196 - 236,
RuntimeState::new has grown to 16 positional parameters which is error-prone;
replace it with a single config/builder object (e.g., RuntimeStateConfig or
RuntimeStateBuilder) that contains the current fields (network_policy,
sandbox_policy, network_state, exec_state, time_state, consensus_state,
terminal_state, data_state, container_state, memory_export, challenge_id,
validator_id, restart_id, config_version, storage_state, fixed_timestamp_ms,
limits), then update RuntimeState::new to accept that config (or expose
RuntimeState::from(config) / RuntimeStateBuilder::build()) and construct Self
from the config fields; ensure existing call sites are migrated to construct the
config or use the builder to set named fields to avoid ordering mistakes.
bins/validator-node/src/wasm_executor.rs (2)

406-422: execute_get_tasks and execute_configure use InMemoryStorageBackend instead of self.config.storage_backend.

Unlike execute_evaluation_with_sandbox and execute_validation (which clone the configured backend), execute_get_tasks (line 418) and execute_configure (line 497) hard-code Arc::new(InMemoryStorageBackend::new()). If this is intentional (these calls don't need persistent storage), a brief comment would clarify the rationale. If not, this is an inconsistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bins/validator-node/src/wasm_executor.rs` around lines 406 - 422, The
InstanceConfig in execute_get_tasks and execute_configure currently hardcodes
storage_backend to Arc::new(InMemoryStorageBackend::new()) which is inconsistent
with execute_evaluation_with_sandbox and execute_validation; replace the
hard-coded backend with a clone of the configured backend (use
self.config.storage_backend.clone()) when constructing InstanceConfig (or, if
intentional, add a clear inline comment at the InstanceConfig creation in
execute_get_tasks and execute_configure explaining why ephemeral
InMemoryStorageBackend is required); update the references in those two
functions (execute_get_tasks and execute_configure) to use the same
storage_backend source as the other execution paths.

169-190: Storage config wiring looks correct, but reliance on ..Default::default() silently picks up new fields.

The explicit StorageHostConfig override with allow_direct_writes: true and require_consensus: false is appropriate. However, the trailing ..Default::default() at line 189 means any new fields added to InstanceConfig in the future will silently receive their default values. This is fine for now (it fills data_policy, data_backend, container_policy per the relevant snippet from InstanceConfig::default()), but be aware that if a future field requires non-default initialization in the evaluation path, it will be silently missed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bins/validator-node/src/wasm_executor.rs` around lines 169 - 190, The
InstanceConfig construction uses a trailing ..Default::default(), which can
silently omit initialization for any future fields; replace the blanket default
spread with an explicit construction of all InstanceConfig fields required for
evaluation (i.e., list each field instead of using ..Default::default()),
ensuring you preserve the explicit StorageHostConfig override
(allow_direct_writes: true, require_consensus: false) and the other set values
(network_policy, sandbox_policy, exec_policy, time_policy, audit_logger,
memory_export, challenge_id, validator_id, restart_id, config_version,
storage_backend, fixed_timestamp_ms, consensus_policy, terminal_policy);
alternatively, if exhaustive listing is impractical right now, add an explicit
comment/TODO by the InstanceConfig::default() usage and add a unit-test or
runtime assertion that validates InstanceConfig::default() fields are suitable
for the evaluation path so future additions won't be silently missed.
crates/wasm-runtime-interface/src/data.rs (1)

345-392: Duplicate memory helpers across data.rs and container.rs.

read_wasm_memory, write_wasm_memory, and get_memory in this file are nearly identical to read_memory, write_bytes/write_result, and get_memory in container.rs. Consider extracting shared helpers into a common module (e.g., wasm_memory.rs) to reduce duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/data.rs` around lines 345 - 392, Duplicate
wasm memory helpers exist in data.rs (read_wasm_memory, write_wasm_memory,
get_memory) and container.rs (read_memory, write_bytes/write_result,
get_memory); extract these into a single shared module (e.g., wasm_memory.rs)
that exposes unified helpers (keep signatures using Caller<RuntimeState>),
replace the copies in data.rs and container.rs with imports from the new module,
and provide small adapter wrappers in either file if naming/signature
differences exist; also update use/import paths and remove the duplicated
functions from both files to avoid conflicts.
crates/wasm-runtime-interface/src/container.rs (2)

88-96: Image allowlist matching: tag-only, no registry/namespace awareness.

The current matching logic (i == image || image.starts_with(&format!("{}:", i))) handles name and name:tag but does not account for fully-qualified references like docker.io/library/ubuntu. If the allowed list contains "ubuntu", a request for "docker.io/library/ubuntu:latest" would be rejected. This may be intentional but is worth documenting, especially for challenge authors.
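
A runnable restatement of the quoted rule, assuming a free-function form of the check; it shows the fully-qualified case being rejected:

```rust
fn is_allowed(allowed_images: &[&str], image: &str) -> bool {
    allowed_images
        .iter()
        .any(|i| *i == image || image.starts_with(&format!("{}:", i)))
}

fn main() {
    let allowed = ["ubuntu"];
    assert!(is_allowed(&allowed, "ubuntu"));       // bare name
    assert!(is_allowed(&allowed, "ubuntu:22.04")); // name:tag
    // Same image, fully qualified: rejected by the current rule.
    assert!(!is_allowed(&allowed, "docker.io/library/ubuntu:latest"));
}
```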

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/container.rs` around lines 88 - 96, The
is_image_allowed function currently only matches exact image or tag-only forms
and misses fully-qualified references; update is_image_allowed to normalize the
requested image by stripping any digest (@...) and tag (:...), then take the
repository path's last segment (the image name after the final '/'), and match
allowed_images against: wildcard "*", exact match against the full original
image string, exact match against the normalized image name, and exact match
against allowed entries that may include namespace/registry (so also allow
comparing allowed entry to the full repository path). Use the existing symbols
is_image_allowed and allowed_images and ensure the matching preserves the
current tag-only behavior (i.e., still allow matches like "name:tag").

343-360: Consider hardening the docker run invocation with security flags.

The container runs with Docker defaults, which includes full capability sets and writable root filesystem. For a multi-tenant challenge evaluation environment, consider adding flags like --cap-drop=ALL, --security-opt=no-new-privileges, --pids-limit, and --read-only (if feasible). This reduces the blast radius of a compromised container.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/container.rs` around lines 343 - 360, The
docker invocation built via the Command named cmd should be hardened by
appending security flags: add args like "--cap-drop=ALL",
"--security-opt=no-new-privileges", and a sensible "--pids-limit=<n>" and
"--read-only" to the existing cmd construction; if "--read-only" is used, ensure
any necessary writable paths (e.g., request.working_dir) are mounted as tmpfs or
explicit bind mounts (or skip read-only when working_dir requires write). Update
the block that builds cmd (the Command::new("docker") sequence that uses
network_mode, memory_limit_mb, request.env_vars, request.working_dir,
request.image, and request.command) to include these flags and handle exceptions
for writable mounts accordingly.

Comment on lines +432 to 445
     let result = instance
         .call_return_i64("get_tasks")
         .map_err(|e| anyhow::anyhow!("WASM get_tasks call failed: {}", e))?;

-    let result_data = if result_ptr > 0 {
-        let len = instance
-            .call_return_i32("get_tasks_result_len")
-            .unwrap_or(0);
-        if len > 0 {
-            instance
-                .read_memory(result_ptr as usize, len as usize)
-                .unwrap_or_default()
-        } else {
-            Vec::new()
-        }
+    let out_len = (result >> 32) as i32;
+    let out_ptr = (result & 0xFFFF_FFFF) as i32;
+
+    let result_data = if out_ptr > 0 && out_len > 0 {
+        instance
+            .read_memory(out_ptr as usize, out_len as usize)
+            .unwrap_or_default()
+    } else {
+        Vec::new()
+    };

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Find the wasm_executor.rs file and check its structure
fd "wasm_executor.rs" --type f



🏁 Script executed:

# Search for call_return_i64 method definition
rg "call_return_i64" --type rust -B 2 -A 5



🏁 Script executed:

# Search for the instance type and its methods in wasm_executor.rs
rg "fn call_return_i64|impl.*Instance|struct.*Instance" --type rust -A 3



🏁 Script executed:

# Find the read_memory method definition
rg "fn read_memory" --type rust -A 5



🏁 Script executed:

# Check broader context in wasm_executor.rs around the code in question
sed -n '400,500p' bins/validator-node/src/wasm_executor.rs | cat -n



🏁 Script executed:

# Search for other uses of read_memory in wasm_executor.rs to see error handling patterns
rg "read_memory" bins/validator-node/src/wasm_executor.rs -B 1 -A 2



call_return_i64 method confirmed to exist — the ABI fix is correctly implemented. The bit-unpacking logic (len << 32) | ptr is properly inverted via (result >> 32) and (result & 0xFFFF_FFFF).

On line 442, .unwrap_or_default() silently discards read_memory errors, which diverges from the error-handling pattern used elsewhere in this file (e.g., map_err with explicit error messages). Since read_memory can fail for out-of-bounds access or invalid addresses, consider replacing .unwrap_or_default() with proper error handling or at least logging to preserve debuggability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bins/validator-node/src/wasm_executor.rs` around lines 432 - 445, The code
silently swallows errors from instance.read_memory when building result_data
(after calling instance.call_return_i64 for "get_tasks"); replace the
.unwrap_or_default() on the read_memory call with proper error propagation and
contextual messaging (e.g., map_err(...)?), so read_memory failures (invalid
address/out-of-bounds) produce an anyhow::Error with a clear message rather than
returning an empty Vec; update the result_data assignment to propagate the error
using the same pattern as the surrounding map_err usage for call_return_i64 and
include identifiers like instance.read_memory, call_return_i64("get_tasks"), and
result_data in the error context.

 ) -> EvalResponse {
     if output.valid {
-        let score = output.score as f64 / 100.0;
+        let score = output.score as f64 / 10_000.0;

⚠️ Potential issue | 🟠 Major

Score normalization now correctly matches score_f64_scaled.

The divisor 10_000.0 aligns with score_f64_scaled in the SDK which multiplies by 10_000.0. The test at line 188 is updated accordingly.

However, note that bins/validator-node/src/main.rs at line 1184 normalizes the same output.score differently:

let normalized = (output.score as f64) / i64::MAX as f64;

A challenge using score_f64_scaled(1.0) produces score = 10_000. The bridge correctly yields 1.0, but the validator's process_wasm_evaluations yields ≈ 1.08e-15. One of these paths is wrong — the validator path should likely use / 10_000.0 as well.

Proposed fix in bins/validator-node/src/main.rs
-                let normalized = (output.score as f64) / i64::MAX as f64;
+                let normalized = (output.score as f64) / 10_000.0;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/bridge.rs` at line 98, The bridge
normalizes output.score by dividing by 10_000.0 (matching SDK's
score_f64_scaled), but bins/validator-node's process_wasm_evaluations currently
normalizes with i64::MAX; update that path so the same score normalization is
applied: in the process_wasm_evaluations function change the computation of the
variable normalized (which uses output.score) to divide by 10_000.0 (consistent
with score_f64_scaled and the bridge implementation) and ensure any
comments/tests reflect this scaling.

Comment on lines +335 to +410
fn execute_container(
    request: &ContainerRunRequest,
    timeout: Duration,
    memory_limit_mb: u64,
    network_mode: &str,
) -> Result<ContainerRunResponse, ContainerExecError> {
    let start = Instant::now();

    let mut cmd = Command::new("docker");
    cmd.arg("run");
    cmd.arg("--rm");
    cmd.args(["--network", network_mode]);
    cmd.args(["--memory", &format!("{}m", memory_limit_mb)]);

    for (key, value) in &request.env_vars {
        cmd.args(["-e", &format!("{}={}", key, value)]);
    }

    if let Some(ref dir) = request.working_dir {
        cmd.args(["-w", dir]);
    }

    cmd.arg(&request.image);
    for arg in &request.command {
        cmd.arg(arg);
    }

    let has_stdin = request.stdin.as_ref().is_some_and(|s| !s.is_empty());

    if has_stdin {
        cmd.stdin(std::process::Stdio::piped());
    } else {
        cmd.stdin(std::process::Stdio::null());
    }
    cmd.stdout(std::process::Stdio::piped());
    cmd.stderr(std::process::Stdio::piped());

    let mut child = cmd
        .spawn()
        .map_err(|e| ContainerExecError::ExecutionFailed(e.to_string()))?;

    if has_stdin {
        if let Some(ref stdin_data) = request.stdin {
            if let Some(ref mut stdin) = child.stdin {
                use std::io::Write;
                let _ = stdin.write_all(stdin_data);
            }
        }
        child.stdin.take();
    }

    let output = loop {
        if start.elapsed() > timeout {
            let _ = child.kill();
            return Err(ContainerExecError::ExecutionTimeout(timeout.as_secs()));
        }
        match child.try_wait() {
            Ok(Some(_)) => {
                break child
                    .wait_with_output()
                    .map_err(|e| ContainerExecError::ExecutionFailed(e.to_string()))?
            }
            Ok(None) => std::thread::sleep(Duration::from_millis(50)),
            Err(e) => return Err(ContainerExecError::ExecutionFailed(e.to_string())),
        }
    };

    let duration_ms = start.elapsed().as_millis() as u64;

    Ok(ContainerRunResponse {
        exit_code: output.status.code().unwrap_or(-1),
        stdout: output.stdout,
        stderr: output.stderr,
        duration_ms,
    })
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if cpu_limit is used in container.rs and where it appears
rg -n 'cpu_limit' crates/wasm-runtime-interface/src/container.rs



🏁 Script executed:

#!/bin/bash
# Check the function signature of execute_container and where it's called
rg -B5 -A1 'fn execute_container' crates/wasm-runtime-interface/src/container.rs



🏁 Script executed:

#!/bin/bash
# Check where execute_container is called to see what parameters are passed
rg -n 'execute_container\(' crates/wasm-runtime-interface/src/container.rs



🏁 Script executed:

#!/bin/bash
# Look at the ContainerRunRequest struct to see if cpu_limit is a field
rg -A20 'struct ContainerRunRequest' crates/wasm-runtime-interface/src/container.rs



🏁 Script executed:

#!/bin/bash
# Check the entire context around the handle_container_run function to see how cpu_limit is used
sed -n '192,333p' crates/wasm-runtime-interface/src/container.rs | head -80



🏁 Script executed:

#!/bin/bash
# Look for where memory_limit is extracted/determined in handle_container_run
sed -n '270,310p' crates/wasm-runtime-interface/src/container.rs



🏁 Script executed:

#!/bin/bash
# Verify cpu_limit is never used in the execute_container function body
sed -n '335,410p' crates/wasm-runtime-interface/src/container.rs | grep -i cpu



🏁 Script executed:

#!/bin/bash
# Check if there are any other references to cpu_limit in the entire file
rg -c 'cpu_limit' crates/wasm-runtime-interface/src/container.rs



🏁 Script executed:

#!/bin/bash
# Check the full handle_container_run to see all variable extractions
sed -n '192,310p' crates/wasm-runtime-interface/src/container.rs



🏁 Script executed:

#!/bin/bash
# Check ContainerPolicy struct to see if max_cpu_count exists
rg -A20 'struct ContainerPolicy' crates/wasm-runtime-interface/src/container.rs



🏁 Script executed:

#!/bin/bash
# Verify the complete execute_container signature and confirm cpu_limit is never referenced
sed -n '335,345p' crates/wasm-runtime-interface/src/container.rs



Apply CPU limit from request/policy to Docker command.

request.cpu_limit and policy.max_cpu_count are defined but never enforced. The Docker command receives only a --memory flag; no --cpus limit is applied, allowing a guest to consume unbounded CPU on the host.

Extract cpu_limit in handle_container_run following the same pattern as memory_limit:

let cpu_limit = request
    .cpu_limit
    .unwrap_or(policy.max_cpu_count)
    .min(policy.max_cpu_count);

Then pass it to execute_container and apply --cpus to the Docker command:

cmd.args(["--cpus", &cpu_limit.to_string()]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/container.rs` around lines 335 - 410, The
Docker run never applies a CPU limit: extract cpu_limit in handle_container_run
the same way memory is handled (let cpu_limit =
request.cpu_limit.unwrap_or(policy.max_cpu_count).min(policy.max_cpu_count)),
pass that cpu_limit into execute_container, update execute_container signature
to accept the cpu_limit parameter, and add the CPUs flag when building the
command (use cmd.args(["--cpus", &cpu_limit.to_string()]) alongside the existing
memory handling) so the container is constrained by request.cpu_limit /
policy.max_cpu_count.

Comment on lines +376 to +400
    if has_stdin {
        if let Some(ref stdin_data) = request.stdin {
            if let Some(ref mut stdin) = child.stdin {
                use std::io::Write;
                let _ = stdin.write_all(stdin_data);
            }
        }
        child.stdin.take();
    }

    let output = loop {
        if start.elapsed() > timeout {
            let _ = child.kill();
            return Err(ContainerExecError::ExecutionTimeout(timeout.as_secs()));
        }
        match child.try_wait() {
            Ok(Some(_)) => {
                break child
                    .wait_with_output()
                    .map_err(|e| ContainerExecError::ExecutionFailed(e.to_string()))?
            }
            Ok(None) => std::thread::sleep(Duration::from_millis(50)),
            Err(e) => return Err(ContainerExecError::ExecutionFailed(e.to_string())),
        }
    };

⚠️ Potential issue | 🟡 Minor

Silent stdin.write_all failure and timeout-before-poll ordering.

Two concerns:

  1. Line 380: let _ = stdin.write_all(stdin_data) silently discards write errors. If the write fails (e.g., pipe broken), the container runs with incomplete or no input, producing misleading results. At minimum, log a warning.

  2. Lines 387–389: The timeout check runs before try_wait, so a container that finishes at the exact timeout boundary is killed and reported as timed-out. Swapping the order (check try_wait first, then timeout) would avoid losing valid results in this edge case.

Proposed fix
     if has_stdin {
         if let Some(ref stdin_data) = request.stdin {
             if let Some(ref mut stdin) = child.stdin {
                 use std::io::Write;
-                let _ = stdin.write_all(stdin_data);
+                if let Err(e) = stdin.write_all(stdin_data) {
+                    warn!(error = %e, "container_run: failed to write stdin");
+                }
             }
         }
         child.stdin.take();
     }
 
     let output = loop {
-        if start.elapsed() > timeout {
-            let _ = child.kill();
-            return Err(ContainerExecError::ExecutionTimeout(timeout.as_secs()));
-        }
         match child.try_wait() {
             Ok(Some(_)) => {
                 break child
                     .wait_with_output()
                     .map_err(|e| ContainerExecError::ExecutionFailed(e.to_string()))?
             }
-            Ok(None) => std::thread::sleep(Duration::from_millis(50)),
+            Ok(None) => {
+                if start.elapsed() > timeout {
+                    let _ = child.kill();
+                    return Err(ContainerExecError::ExecutionTimeout(timeout.as_secs()));
+                }
+                std::thread::sleep(Duration::from_millis(50));
+            }
             Err(e) => return Err(ContainerExecError::ExecutionFailed(e.to_string())),
         }
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/container.rs` around lines 376 - 400, The
code silently ignores errors from writing to the child stdin and checks the
timeout before polling the child, causing missed successful exits; change the
stdin write to handle errors (return
Err(ContainerExecError::ExecutionFailed(...)) or at minimum log a warning) when
stdin.write_all(stdin_data) fails (referencing child.stdin and stdin.write_all),
and reorder the loop in the exec routine so that child.try_wait() is called
first and only then check start.elapsed() > timeout (preserving the existing
kill and ContainerExecError::ExecutionTimeout behavior) to avoid killing
processes that have just exited.

Comment on lines +120 to +131
impl DataBackend for FilesystemDataBackend {
    fn get(&self, challenge_id: &str, key: &str) -> Result<Option<Vec<u8>>, DataError> {
        let path = self.base_dir.join(challenge_id).join(key);
        if !path.starts_with(self.base_dir.join(challenge_id)) {
            return Err(DataError::PathNotAllowed(key.to_string()));
        }
        match std::fs::read(&path) {
            Ok(data) => Ok(Some(data)),
            Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok(None),
            Err(e) => Err(DataError::Io(e.to_string())),
        }
    }

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, examine the actual file to confirm it matches the snippet
cat -n crates/wasm-runtime-interface/src/data.rs | head -160



🌐 Web query:

Rust Path::starts_with behavior with .. components normalization

💡 Result:

Path::starts_with is a purely lexical, component-by-component prefix check: it matches only whole path components and does not resolve/normalize .. (parent) components into their lexical parents. [1]

What is normalized for the comparison is whatever Path::components() normalizes: repeated separators are ignored, . is mostly removed (except a leading ./), and trailing separators are ignored—but “no other normalization takes place”, specifically calling out that a/c and a/b/../c remain distinct (largely because full normalization can change meaning with symlinks). [2][3]

So, for example (conceptually):

  • Path::new("a/b/../c").starts_with("a")true (first component matches)
  • Path::new("a/b/../c").starts_with("a/c")false (because the second component is b, not c; .. is not collapsed)
  • Path::new("../x").starts_with("..")true (.. is just a normal ParentDir component)

If you need .. collapsed, you must do it explicitly (e.g., filesystem-based canonicalize, or the nightly lexical normalizer normalize_lexically). [2]

Sources: Rust Path::starts_with docs [1], Path::components normalization rules (and explicit non-normalization of ..) [2], Rust path normalization rationale (symlink semantics) [3].

[1] (doc.rust-lang.org)
[2] (doc.rust-lang.org)
[3] (rust-lang.github.io)


Critical: Path traversal vulnerability in FilesystemDataBackend.

Path::starts_with performs a purely lexical, component-by-component prefix check without normalizing .. components. A malicious key like ../../etc/passwd produces /data/challenge1/../../etc/passwd, which matches the prefix check (first two components match) but resolves to /etc/passwd when read by std::fs::read(), breaking the sandbox.

The same vulnerability exists in the list() method (line 135).

Fix: Canonicalize paths after joining before the starts_with check, or reject keys containing .. components:

Proposed fix
impl DataBackend for FilesystemDataBackend {
    fn get(&self, challenge_id: &str, key: &str) -> Result<Option<Vec<u8>>, DataError> {
-       let path = self.base_dir.join(challenge_id).join(key);
-       if !path.starts_with(self.base_dir.join(challenge_id)) {
+       let base = self.base_dir.join(challenge_id);
+       let path = base.join(key);
+       let safe = if let (Ok(canon_base), Ok(canon_path)) = (
+           std::fs::canonicalize(&base),
+           std::fs::canonicalize(&path),
+       ) {
+           canon_path.starts_with(&canon_base)
+       } else {
+           !key.split(['/', '\\']).any(|c| c == "..")
+       };
+       if !safe {
            return Err(DataError::PathNotAllowed(key.to_string()));
        }

Apply analogously to list().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/data.rs` around lines 120 - 131, The get
and list methods on FilesystemDataBackend are vulnerable to path traversal
because starts_with is used before resolving '..' components; update both
FilesystemDataBackend::get and FilesystemDataBackend::list to validate paths
after canonicalization (or reject keys containing ".." components) by joining
base_dir.join(challenge_id).join(key), canonicalizing the resulting path (e.g.,
std::fs::canonicalize) and then ensuring the canonicalized path starts_with the
canonicalized base_dir.join(challenge_id); if canonicalization fails or the
check fails, return DataError::PathNotAllowed(key.to_string()) (or appropriate
DataError), otherwise proceed to read or list files.

Comment on lines +263 to +290
    if caller.data().data_state.reads >= caller.data().data_state.policy.max_reads_per_execution {
        return DataHostStatus::InternalError.to_i32();
    }

    let challenge_id = caller.data().data_state.challenge_id.clone();
    let backend = std::sync::Arc::clone(&caller.data().data_state.backend);

    let value = match backend.get(&challenge_id, &key_str) {
        Ok(Some(v)) => v,
        Ok(None) => return DataHostStatus::NotFound.to_i32(),
        Err(err) => {
            warn!(error = %err, "data_get: backend read failed");
            return DataHostStatus::IoError.to_i32();
        }
    };

    caller.data_mut().data_state.reads += 1;

    if buf_len < 0 || value.len() > buf_len as usize {
        return DataHostStatus::BufferTooSmall.to_i32();
    }

    if let Err(err) = write_wasm_memory(caller, buf_ptr, &value) {
        warn!(error = %err, "data_get: failed to write value to wasm memory");
        return DataHostStatus::InternalError.to_i32();
    }

    value.len() as i32

⚠️ Potential issue | 🟡 Minor

Read counter incremented even when buffer is too small — guest can burn read quota without receiving data.

On line 279, reads is incremented after a successful backend fetch. But if the guest-provided buffer is too small (line 281–282), BufferTooSmall is returned after the counter has already been bumped. A misbehaving guest could exhaust its read allowance by repeatedly supplying undersized buffers.

Move the increment to after the buffer-size check and successful write.

Proposed fix
     let value = match backend.get(&challenge_id, &key_str) {
         Ok(Some(v)) => v,
         Ok(None) => return DataHostStatus::NotFound.to_i32(),
         Err(err) => {
             warn!(error = %err, "data_get: backend read failed");
             return DataHostStatus::IoError.to_i32();
         }
     };
 
-    caller.data_mut().data_state.reads += 1;
-
     if buf_len < 0 || value.len() > buf_len as usize {
         return DataHostStatus::BufferTooSmall.to_i32();
     }
 
     if let Err(err) = write_wasm_memory(caller, buf_ptr, &value) {
         warn!(error = %err, "data_get: failed to write value to wasm memory");
         return DataHostStatus::InternalError.to_i32();
     }
 
+    caller.data_mut().data_state.reads += 1;
+
     value.len() as i32
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/data.rs` around lines 263 - 290, The read
quota (caller.data().data_state.reads) is incremented too early—move the
increment so it only happens after verifying the buffer is large enough and
after a successful write to guest memory: keep the backend.get(...) call and
error handling as-is, then perform the buf_len check and call
write_wasm_memory(caller, buf_ptr, &value); only if the buffer check passes and
write_wasm_memory returns Ok should you increment
caller.data_mut().data_state.reads and then return value.len() as i32; ensure
NotFound/IoError/BufferTooSmall/InternalError flows are preserved.

@echobt echobt closed this Feb 20, 2026