
chore: audit changes #26

Merged

tac0turtle merged 6 commits into main from marko/audit on Mar 17, 2026

Conversation


@tac0turtle tac0turtle commented Mar 16, 2026

Overview

Minor changes to clean up some code.

Summary by CodeRabbit

  • Documentation

    • Added a comprehensive architectural reference and local setup guide.
  • New Features

    • Optional configurable CORS origin via environment variable.
    • API now enforces a maximum of 20 addresses for multi-balance requests.
  • Improvements

    • Friendlier client error messages with richer internal logging.
    • Metadata URL scheme validation for safer fetches.
    • Security response headers added to the frontend.
    • Server container runs as non-root; SSE uses keep-alive.
    • Address/hash lookups use exact matching.
  • Chores

    • Consolidated developer tasks/CI into grouped targets; updated dev/build tooling and base images.
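The new 20-address cap on multi-balance requests comes down to a small input guard. A stdlib-only sketch of that logic (the real handler, parameter names, and error wiring in atlas-server may differ):

```rust
// Hypothetical sketch of the multi-balance guard described above; the limit
// of 20 comes from the PR summary, everything else is illustrative.
const MAX_MULTI_BALANCE_ADDRESSES: usize = 20;

fn validate_multi_balance(addresses: &[String]) -> Result<(), String> {
    if addresses.is_empty() {
        return Err("at least one address is required".to_string());
    }
    if addresses.len() > MAX_MULTI_BALANCE_ADDRESSES {
        return Err(format!(
            "too many addresses: {} (maximum {})",
            addresses.len(),
            MAX_MULTI_BALANCE_ADDRESSES
        ));
    }
    Ok(())
}

fn main() {
    let ok = vec!["0xabc".to_string(); 20];
    assert!(validate_multi_balance(&ok).is_ok());

    let too_many = vec!["0xabc".to_string(); 21];
    assert!(validate_multi_balance(&too_many).is_err());
}
```

Rejecting oversized batches up front keeps a single request from fanning out into an unbounded number of RPC balance calls.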


coderabbitai bot commented Mar 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d017442f-a5cb-4b4f-9835-cb8b7984e475

📥 Commits

Reviewing files that changed from the base of the PR and between 25074e5 and 6a207d5.

📒 Files selected for processing (2)
  • .env.example
  • backend/crates/atlas-server/src/api/mod.rs
✅ Files skipped from review due to trivial changes (1)
  • .env.example

📝 Walkthrough

Walkthrough

Adds architecture docs and reorganizes developer tasks; upgrades dependencies and container user; switches many SQL comparisons from case-insensitive to exact equality; refactors HTTP provider initialization; adds optional CORS origin config; widens SSE return types to KeepAliveStream; several handler and indexer adjustments; adds frontend security headers.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Docs & CI tooling<br>`AGENTS.md`, `Justfile`, `docker-compose.yml` | Adds the AGENTS.md architecture doc; consolidates the Justfile into grouped targets (quality/backend/frontend/docker/ci) and adds docker test-run orchestration; docker-compose moves to parameterized env defaults and removes published DB/server ports. |
| Dependencies & container runtime<br>`backend/Cargo.toml`, `backend/Dockerfile` | Bumps alloy and reqwest and relaxes the serde constraint; the Dockerfile updates the Rust base image and makes the final image run as a non-root `atlas` user. |
| Common types & errors<br>`backend/crates/atlas-common/src/error.rs`, `backend/crates/atlas-common/src/types.rs` | Adds `AtlasError::Validation(String)` mapped to 400; adjusts `Pagination` offset casting order. |
| API error handling<br>`backend/crates/atlas-server/src/api/error.rs` | Returns structured client-facing JSON `{"error": ...}` messages, logs opaque/internal errors, and exposes only user-relevant messages. |
| Router, CORS & config<br>`backend/crates/atlas-server/src/api/mod.rs`, `backend/crates/atlas-server/src/config.rs`, `backend/crates/atlas-server/src/main.rs` | Adds `cors_origin: Option<String>` to Config; changes the `build_router` signature to accept `cors_origin` and uses a helper to build a conditional `CorsLayer`; main forwards the configured origin. |
| SSE keep-alive<br>`backend/crates/atlas-server/src/api/handlers/sse.rs` | Widens SSE return types to `Sse<KeepAliveStream<...>>`, updating handler and helper signatures. |
| SQL equality normalization<br>`backend/crates/atlas-server/src/api/handlers/...`<br>`addresses.rs`, `etherscan.rs`, `logs.rs`, `nfts.rs`, `proxy.rs`, `tokens.rs` | Replaces many `LOWER(field) = LOWER($x)` predicates with exact `field = $x` equality across handlers and joins; adjusts related queries, counts, and joins. |
| Provider & indexer refactor<br>`backend/crates/atlas-server/src/indexer/...`<br>`fetcher.rs`, `indexer.rs`, `metadata.rs` | Simplifies provider types and construction: replaces `ProviderBuilder::new().on_http` / `Http<Client>` aliases with `RootProvider::new_http` / `RootProvider<Ethereum>`, and adapts call signatures and metadata extraction (token URI, contract metadata). |
| Etherscan & NFT handler adjustments<br>`backend/crates/atlas-server/src/api/handlers/etherscan.rs`, `.../nfts.rs` | Switches provider init to `connect_http`/`new_http`, removes the full-tx flag on block fetch, adds a guard on multi-balance size, validates metadata URL schemes, and updates `provider.call` usage. |
| Logs & tokens specifics<br>`backend/crates/atlas-server/src/api/handlers/logs.rs`, `.../tokens.rs` | Changes `tx_hash` and address lookups to exact-match comparisons; updates token supply/balance queries to use exact-address lookups. |
| Indexer dataflow tweaks<br>`backend/crates/atlas-server/src/indexer/*` | Adjusts signer extraction and mappings aligned with the provider simplification and modified return-value shapes. |
| Frontend security<br>`frontend/nginx.conf` | Adds security response headers: X-Content-Type-Options, X-Frame-Options, Referrer-Policy, and Permissions-Policy. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Router as API Router (with CORS)
    participant Handler
    participant DB as Postgres
    participant RPC as Ethereum RPC / Provider
    Client->>Router: HTTP request / SSE connect
    Router->>Handler: route + CORS check
    alt regular API call
        Handler->>DB: SQL query (exact-match predicates)
        DB-->>Handler: rows
        Handler->>RPC: provider call (RootProvider::new_http / connect_http)
        RPC-->>Handler: RPC response
        Handler-->>Router: JSON response
        Router-->>Client: HTTP response
    else SSE flow
        Handler->>RPC: subscribe to block events
        RPC-->>Handler: event stream
        Handler->>Router: KeepAlive-wrapped SSE stream
        Router-->>Client: SSE stream (keep-alive)
    end
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • pthmas

Poem

🐰
I nibbled through queries, made matches exact,
Tuned providers, kept events alive and packed.
CORS found a home, a non-root burrow too,
Headers snug, tasks tidy — a carrot for you. 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title 'chore: audit changes' is too vague and generic. While it uses semantic commit convention format, it lacks specificity about what was actually changed. | Replace with a specific, descriptive title that highlights the main changes, such as 'chore: upgrade dependencies and enforce case-sensitive address comparisons' or 'chore: refactor address handling and reorganize build targets'. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 93.18%, which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can customize the high-level summary generated by CodeRabbit.

Configure the reviews.high_level_summary_instructions setting to provide custom instructions for generating the high-level summary.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/crates/atlas-server/src/api/handlers/nfts.rs (1)

59-75: ⚠️ Potential issue | 🟠 Major

The NFT read/update path is now case-sensitive.

These queries used to tolerate mixed-case stored contract addresses; now they only match the lowercase form produced by normalize_address(). If any existing nft_contracts / nft_tokens / nft_transfers rows are checksummed, collection/token lookups will miss them and the metadata updater can execute an UPDATE that touches 0 rows before the follow-up fetch_one() fails.

Also applies to: 137-145, 172-172, 232-247, 374-419

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-server/src/api/handlers/nfts.rs` around lines 59 - 75,
The database lookups/updates are failing for checksummed addresses because
comparisons are now case-sensitive; fix by making address comparisons
case-insensitive: either normalize addresses before DB binds using
normalize_address(...) everywhere you bind addresses (e.g., when selecting the
collection, in the nft_contracts UPDATE after fetch_collection_metadata, and in
the nft_tokens/nft_transfers queries referenced) or change the SQL predicates to
compare LOWER(address) = LOWER($1) (and corresponding WHERE clauses like WHERE
LOWER(address) = LOWER($3) for the UPDATE). Update all occurrences mentioned
(selection, the UPDATE in the nft_contracts path, and the other query sites) so
lookups and updates match regardless of case.
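The first option the prompt describes — normalizing on the Rust side before binding — can be sketched with a stdlib-only helper. The project's actual `normalize_address()` may differ; this is only an illustration of the shape:

```rust
// Illustrative address normalization applied before binding parameters into
// exact-match SQL (WHERE address = $1). Hypothetical; the real helper in
// atlas-server may also validate length and hex content.
fn normalize_address(addr: &str) -> String {
    addr.trim().to_ascii_lowercase()
}

fn main() {
    assert_eq!(
        normalize_address(" 0xAbCdEF0123456789abcdef0123456789ABCDEF01 "),
        "0xabcdef0123456789abcdef0123456789abcdef01"
    );
}
```

With every bind normalized this way, exact equality matches rows that were stored normalized — but, as the comment points out, legacy checksummed rows still need either a one-off lowercasing migration or a `LOWER(address) = LOWER($1)` predicate to be found.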
backend/crates/atlas-server/src/indexer/metadata.rs (1)

254-310: ⚠️ Potential issue | 🟡 Minor

Add tests for the new metadata extraction/conversion paths.

The changed logic around Lines 254-310 and Line 522 is not covered by the current tests in this file (which only validate resolve_uri). Please add focused tests for call-result mapping and supply conversion edge cases.

As per coding guidelines: "Add unit tests for new logic in a #[cfg(test)] mod tests block in the same file, and run with cargo test --workspace".

Also applies to: 510-523

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-server/src/indexer/metadata.rs` around lines 254 - 310,
Add unit tests in a #[cfg(test)] mod tests block in the same file to exercise
the new call-result mapping and conversion logic in fetch_nft_contract_metadata
(name/symbol/totalSupply mapping to Option and i64) and
fetch_erc20_contract_metadata (decimals -> Option<i16>, totalSupply ->
BigDecimal conversion and fallback). Write focused tests that mock the contract
call results for: successful values, large totalSupply that needs BigDecimal
conversion, absent/None responses, and invalid conversion paths to ensure
.ok().map(...) behavior and COALESCE handling are correct; reuse the pattern
from existing resolve_uri tests and run them with cargo test --workspace. Ensure
tests assert the database-bound values or the mapped return values from those
functions (via returned Result or by calling helper extraction logic) to cover
edge cases mentioned.
🧹 Nitpick comments (1)
backend/crates/atlas-server/src/api/handlers/sse.rs (1)

50-95: Please add a regression test for the lag/backfill path.

The new recv() / DB backfill loop is the risky part of this change, but the same-file tests still only cover event serialization. A test that simulates Lagged, verifies ascending replay, and confirms duplicate live blocks are skipped would make this much safer.

As per coding guidelines, "Add unit tests for new logic in a #[cfg(test)] mod tests block in the same file, and run with cargo test --workspace."

Also applies to: 128-132

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-server/src/api/handlers/sse.rs` around lines 50 - 95,
Add a unit test in a #[cfg(test)] mod tests block in the same file that
exercises the recv()/DB backfill path: mock or stub the broadcast receiver so
rx.recv().await first returns Err(broadcast::error::RecvError::Lagged(n)) and
then later yields live blocks, mock fetch_blocks_after(&pool, cursor).await to
return one or more batches of blocks (including a block with number <=
last_block_number to assert deduplication), and assert that the produced stream
(the loop using rx.recv(), last_block_number, block_to_event and
FETCH_BATCH_SIZE) yields backfilled events in ascending order and that
duplicate/older live blocks are skipped; use the same helper functions
(fetch_blocks_after, block_to_event) and verify behavior for the batch_len <
FETCH_BATCH_SIZE break condition and for erroring fetch_blocks_after to cover
the warning path, then run cargo test --workspace.
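The deduplication invariant that the suggested test targets — replayed and live blocks are emitted only if strictly newer than the last block already sent — can be isolated from the tokio machinery and checked on its own. A hypothetical sketch (names mirror the review's `last_block_number`, but this is not the handler's actual code):

```rust
// Emit a block only if it is strictly newer than the last one sent;
// duplicates and older blocks (e.g. from a backfill overlapping the live
// stream) are skipped. Illustrative only.
fn emit_if_newer(last_block_number: &mut Option<u64>, block_number: u64) -> bool {
    match *last_block_number {
        Some(last) if block_number <= last => false, // duplicate or older: skip
        _ => {
            *last_block_number = Some(block_number);
            true
        }
    }
}

fn main() {
    let mut last = None;
    // Backfilled batch in ascending order, then duplicate/older live blocks.
    let emitted: Vec<u64> = [5, 6, 7, 7, 6, 8]
        .into_iter()
        .filter(|&n| emit_if_newer(&mut last, n))
        .collect();
    assert_eq!(emitted, vec![5, 6, 7, 8]);
}
```

A regression test over this invariant is cheap precisely because the filtering logic does not need a live broadcast channel to be exercised.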
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@AGENTS.md`:
- Around line 66-67: Update the API docs to reflect current backend behavior:
clarify that TimeoutLayer::with_status_code(StatusCode::REQUEST_TIMEOUT,
Duration::from_secs(10)) wraps all HTTP routes except SSE streams (exclude SSE
routes from the 10s timeout), and change the description for the polling vs live
endpoints to state that GET /api/status is the polling fallback returning the
payload { block_height, indexed_at } while GET /api/events is the SSE endpoint
for live updates; update the two doc locations that mention the timeout and the
endpoints accordingly.
- Line 17: Update the unlabeled fenced code block in AGENTS.md (the block
starting at the example directory tree) to include a language identifier (e.g.,
add "text" after the opening triple backticks) so the block is labeled and
satisfies markdown lint rule MD040; locate the fenced block in AGENTS.md and
change the opening fence from ``` to ```text.

In `@backend/Cargo.toml`:
- Line 29: The Cargo.toml entry for serde currently uses a floating "1.0" which
can bump MSRV requirements; either declare a workspace rust-version in
[workspace.package] (e.g., add rust-version = "1.xx") or tighten the serde
dependency to a fixed patch (e.g., serde = "1.0.Y") to guarantee MSRV
stability—update the manifest accordingly and ensure the serde line and/or
[workspace.package] rust-version are changed so CI and downstream consumers have
a stable minimum Rust version.
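Either fix the review proposes amounts to a small manifest change. An illustrative fragment (the version numbers here are placeholders, not the project's actual MSRV):

```toml
# Option A: declare a workspace MSRV so dependency patch bumps that raise
# rust-version fail resolution instead of breaking CI silently.
[workspace.package]
rust-version = "1.75"   # placeholder; pick the actual tested toolchain

# Option B: pin serde to a known-good patch instead (illustrative):
# serde = { version = "=1.0.210", features = ["derive"] }
```

With a `rust-version` declared, Cargo's resolver (1.74+) will prefer dependency versions compatible with the stated MSRV.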

In `@backend/crates/atlas-common/src/error.rs`:
- Around line 46-47: Add a colocated unit test in a #[cfg(test)] mod tests block
that asserts AtlasError::Validation(...) maps to HTTP 400; specifically,
construct an AtlasError::Validation variant and call the status-code mapping
(the function/method containing the match with
AtlasError::InvalidInput/Validation -> 400) and assert it equals 400. Place this
test in the same file alongside other tests and run with cargo test --workspace.

In `@backend/crates/atlas-common/src/types.rs`:
- Around line 246-248: Pagination::offset currently exposes offset-based
pagination; replace it with a keyset/cursor helper. Remove or deprecate the
offset(&self) method on the Pagination type and add a cursor helper (e.g.,
Pagination::cursor_after or Pagination::keyset) that returns an opaque cursor
(Option<String> or Option<CursorStruct>) encoding the last-seen key(s) needed
for keyset queries (for example primary key and tie-breaker like created_at),
and implement a complementary helper (e.g., Pagination::apply_to_query or
Pagination::bind_keyset) to convert that cursor into query parameters for WHERE
... > (key) ORDER BY key LIMIT. Update any call sites that used offset() to
consume the new cursor API. Ensure the cursor is stable/opaque (base64 or
similar) and include reference to Pagination::offset, Pagination::cursor_after
(or chosen names), and Pagination::apply_to_query when making these
replacements.
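The cursor shape the prompt describes can be sketched without any database code. Everything below is hypothetical — the real `Pagination` type, key columns, and encoding are assumptions (the review suggests base64; hex is used here to stay stdlib-only):

```rust
// Opaque keyset cursor encoding the last-seen primary key, as an
// illustration of the Pagination::cursor_after idea from the review.
#[derive(Debug, PartialEq)]
struct Cursor {
    last_id: i64,
}

impl Cursor {
    fn encode(&self) -> String {
        format!("{:x}", self.last_id) // opaque-ish token; real code might base64 a struct
    }
    fn decode(s: &str) -> Option<Cursor> {
        i64::from_str_radix(s, 16).ok().map(|last_id| Cursor { last_id })
    }
}

// The query side then becomes "WHERE id > $1 ORDER BY id LIMIT $2" instead
// of OFFSET, so each page costs O(limit) regardless of how deep it is.
fn main() {
    let c = Cursor { last_id: 4096 };
    let token = c.encode();
    assert_eq!(Cursor::decode(&token), Some(Cursor { last_id: 4096 }));
    assert_eq!(Cursor::decode("not-hex!"), None); // malformed cursors are rejected
}
```

The practical payoff: `OFFSET n` forces the database to scan and discard `n` rows per page, while a keyset predicate seeks directly to the continuation point via the index.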

In `@backend/crates/atlas-server/src/api/handlers/addresses.rs`:
- Around line 96-97: The SQL joins changed to exact equality and must be
restored to case-insensitive/comparable-address matching to avoid splitting
checksummed/mixed-case legacy rows; update the JOIN conditions in the queries
used by list_addresses() and get_address() (and the other affected blocks) to
compare normalized forms (e.g., compare LOWER(address) or apply the same
normalize_address() logic on both sides) instead of using plain '=' so legacy
mixed-case/checksummed rows still match.

In `@backend/crates/atlas-server/src/api/handlers/etherscan.rs`:
- Line 343: The SQL WHERE clauses in etherscan.rs now assume addresses are
stored canonical-lowercase and will miss mixed-case rows; update each query that
filters by address (the txlist, tokentx, tokenbalance, getabi, getsourcecode and
proxy metadata lookup queries inside etherscan.rs) to either compare normalized
forms or normalize the incoming parameter: call the same
normalize_address()-equivalent on the query side or use an expression like
lower(column) = lower($1) so both sides are normalized, e.g., modify the WHERE
conditions (e.g., the clause "WHERE from_address = $1 OR to_address = $1" and
the similar clauses at the other noted locations) to normalize columns or the
parameter consistently to guarantee matches regardless of stored case.

In `@backend/crates/atlas-server/src/api/handlers/nfts.rs`:
- Around line 302-324: The current validate_metadata_url_scheme(url: &str) only
checks the scheme and still allows SSRF via private addresses or redirect
chains; update validation so that before performing the HTTP fetch you (1)
perform DNS resolution for the host (async resolver) and reject any resolved
A/AAAA addresses in RFC1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16),
loopback (127.0.0.0/8, ::1), link-local (169.254.0.0/16, fe80::/10) and other
non-public ranges, (2) ensure the HTTP client does not blindly follow redirects
— either disable automatic redirects or intercept Location headers and apply the
same scheme+host+resolved-IP checks for each redirect target up to a safe
redirect limit, and (3) alternatively offer/require an allowlist of hosts to
fetch; modify validate_metadata_url_scheme to only do scheme checks and add a
new async resolve_and_validate_host(host: &str) function (and redirect
validation flow) to enforce IP-range and redirect checks before fetching.
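The resolved-IP check from step (1) can be expressed over `std::net` types. This sketch covers the ranges the comment lists (RFC1918, loopback, link-local, plus unspecified); a production version would also reject broadcast and other special-purpose ranges, and would be applied to every address DNS returns and to every redirect hop:

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// Reject non-public resolved addresses before fetching metadata (SSRF guard).
// Illustrative sketch; not the handler's actual code.
fn is_public_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            !(v4.is_private()        // 10/8, 172.16/12, 192.168/16
                || v4.is_loopback()  // 127/8
                || v4.is_link_local()// 169.254/16
                || v4.is_unspecified())
        }
        IpAddr::V6(v6) => {
            let seg = v6.segments();
            let unique_local = (seg[0] & 0xfe00) == 0xfc00; // fc00::/7
            let link_local = (seg[0] & 0xffc0) == 0xfe80;   // fe80::/10
            !(v6.is_loopback() || v6.is_unspecified() || unique_local || link_local)
        }
    }
}

fn main() {
    assert!(!is_public_ip(IpAddr::V4(Ipv4Addr::new(10, 0, 0, 1))));
    assert!(!is_public_ip(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1))));
    assert!(!is_public_ip(IpAddr::V4(Ipv4Addr::new(169, 254, 1, 1))));
    assert!(is_public_ip(IpAddr::V4(Ipv4Addr::new(93, 184, 216, 34))));
    assert!(!is_public_ip(IpAddr::V6(Ipv6Addr::LOCALHOST)));
}
```

The manual bit checks for the IPv6 ranges avoid depending on recently stabilized `Ipv6Addr` helpers.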

In `@backend/crates/atlas-server/src/api/mod.rs`:
- Around line 166-172: The build_cors_layer function currently panics on an
invalid CORS_ORIGIN because it calls expect when parsing into
axum::http::HeaderValue; change it to validate and return a startup error
instead of panicking by making build_cors_layer return Result<CorsLayer, E>
(e.g., anyhow::Error or a config error type), parse the origin with
.parse().map_err(|e| format_err!("invalid CORS_ORIGIN: {}", e)) and propagate
that error, or accept a pre-parsed HeaderValue from config; update callers to
handle the Result and fail startup with a structured error rather than calling
expect. Ensure references to AllowOrigin::exact and CorsLayer are preserved when
constructing the successful result.
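The validate-then-propagate shape can be shown with a stdlib-only stand-in. Real code would parse into axum's `HeaderValue` and return the error from `build_cors_layer`; this hypothetical helper only illustrates failing startup with a structured error instead of panicking:

```rust
// Stdlib-only sketch: validate CORS_ORIGIN shape and return Result instead
// of calling expect(). Illustrative; the real check is HeaderValue parsing.
fn validate_cors_origin(origin: &str) -> Result<&str, String> {
    let rest = origin
        .strip_prefix("https://")
        .or_else(|| origin.strip_prefix("http://"))
        .ok_or_else(|| format!("invalid CORS_ORIGIN {origin:?}: missing http(s) scheme"))?;
    if rest.is_empty() || rest.contains('/') || rest.contains(char::is_whitespace) {
        return Err(format!("invalid CORS_ORIGIN {origin:?}: expected scheme://host[:port]"));
    }
    Ok(origin)
}

fn main() {
    assert!(validate_cors_origin("https://explorer.example.com").is_ok());
    assert!(validate_cors_origin("http://localhost:5173").is_ok());
    assert!(validate_cors_origin("example.com").is_err());      // no scheme
    assert!(validate_cors_origin("https://example.com/x").is_err()); // path, not origin
}
```

A misconfigured origin then surfaces as a clear startup error rather than a panic inside layer construction.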

In `@backend/crates/atlas-server/src/indexer/indexer.rs`:
- Around line 253-256: The indexer currently sends live_block via
self.block_events_tx.send(...) before persisting the batch (see
live_block_from_fetched, collect_block and the send call near where
next_to_process is incremented), which can advertise blocks that fail to write;
move the send so it happens only after write_batch(...) returns success (i.e.,
in the post-commit paths that currently call write_batch), and do the same for
the other occurrence around lines 338–340 — construct the live_block before the
write but call block_events_tx.send(live_block) only after write_batch succeeds
(and do not send on write errors or retries).
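The ordering fix — build the event before the write, publish it only after the write succeeds — can be illustrated with a toy channel. `write_batch` here is a stand-in for the indexer's real persistence, and everything else is hypothetical:

```rust
use std::sync::mpsc;

// Toy stand-in for the indexer's batch persistence.
fn write_batch(fail: bool) -> Result<(), &'static str> {
    if fail { Err("db write failed") } else { Ok(()) }
}

// Construct the live-block event first, but send it only after the write
// succeeds, so subscribers never see a block that was not persisted.
fn persist_then_publish(
    tx: &mpsc::Sender<u64>,
    block_number: u64,
    fail: bool,
) -> Result<(), &'static str> {
    let live_block = block_number; // built before the write
    write_batch(fail)?;            // on error, return without publishing
    tx.send(live_block).expect("receiver alive");
    Ok(())
}

fn main() {
    let (tx, rx) = mpsc::channel();
    assert!(persist_then_publish(&tx, 100, false).is_ok());
    assert!(persist_then_publish(&tx, 101, true).is_err());
    let published: Vec<u64> = rx.try_iter().collect();
    assert_eq!(published, vec![100]); // the failed write was never advertised
}
```

The `?` after `write_batch` is what enforces the invariant: any error path exits before the send.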

In `@backend/crates/atlas-server/src/indexer/metadata.rs`:
- Around line 260-266: The current mapping from
contract.totalSupply().call().await to total_supply uses
try_into().unwrap_or(0i64), which turns conversion failures into 0; change it to
yield None on overflow by replacing the unwrap_or with a safe conversion that
returns Option<i64> (e.g., use .and_then(|r| r.try_into().ok()) or an equivalent
.map(...).flatten()), so total_supply is None on conversion failure and existing
nft_contracts.total_supply is preserved by COALESCE; touch the expression around
contract.totalSupply().call().await, try_into, and total_supply to implement
this.
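The difference between the two conversions is easy to demonstrate in isolation. `u128` stands in for the on-chain U256 here, and the function is a sketch, not the file's actual code:

```rust
// unwrap_or(0) silently zeroes an overflowing totalSupply; try_into().ok()
// preserves "unknown" as None, letting a SQL COALESCE keep the previously
// stored value instead of clobbering it with 0.
fn to_total_supply(raw: Result<u128, ()>) -> Option<i64> {
    raw.ok().and_then(|v| i64::try_from(v).ok())
}

fn main() {
    assert_eq!(to_total_supply(Ok(10_000)), Some(10_000));
    assert_eq!(to_total_supply(Ok(u128::MAX)), None); // overflow -> None, not 0
    assert_eq!(to_total_supply(Err(())), None);       // failed call -> None
}
```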

In `@docker-compose.yml`:
- Around line 6-7: The healthcheck still uses the hardcoded atlas/atlas
credentials and will fail when POSTGRES_USER or POSTGRES_PASSWORD are
overridden; update the healthcheck to read and use the POSTGRES_USER and
POSTGRES_PASSWORD environment variables (instead of the literal "atlas") so
probes reflect configured credentials—modify the healthcheck command to source
or reference POSTGRES_USER and POSTGRES_PASSWORD (or switch to pg_isready/psql
invoked with those env vars) so the container becomes healthy with custom creds
(look for POSTGRES_USER, POSTGRES_PASSWORD and the healthcheck entry in the
docker-compose.yml to change).
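A healthcheck along those lines might look like the following fragment (service and variable names are assumed to match the compose file's defaults; the defaults themselves are illustrative):

```yaml
services:
  postgres:
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-atlas}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-atlas}
    healthcheck:
      # $$ defers expansion to the container's shell, so the probe follows
      # whatever credentials the container was actually started with.
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER:-atlas} -d $${POSTGRES_DB:-atlas}"]
      interval: 5s
      timeout: 3s
      retries: 10
```

The doubled `$$` matters: a single `$` would be interpolated by Compose on the host at parse time, not by the container at probe time.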

In `@frontend/nginx.conf`:
- Line 20: The nginx config forwards Host using "proxy_set_header Host
$server_name;" which, together with "server_name _;" causes the backend to
receive Host: _; update the proxy headers to use the client-provided hostname by
replacing occurrences of the directive "proxy_set_header Host $server_name;"
with "proxy_set_header Host $host;" so the backend receives the original Host
header instead of "_".

In `@Justfile`:
- Around line 6-24: The aggregate targets (quality, ci, and check) no longer
include the frontend static checks so running just quality/ci can skip
frontend-lint and frontend-typecheck; update the Justfile so the quality and ci
aggregates (and check if intended) depend on frontend-lint and
frontend-typecheck in addition to existing backend targets (e.g., add
frontend-lint and frontend-typecheck to the dependency lists for the
quality/ci/check recipes), ensuring the aggregate graph includes both frontend
static checks and the existing backend targets (refer to the targets named
quality, ci, check, frontend-lint, and frontend-typecheck).

---

Outside diff comments:
In `@backend/crates/atlas-server/src/api/handlers/nfts.rs`:
- Around line 59-75: The database lookups/updates are failing for checksummed
addresses because comparisons are now case-sensitive; fix by making address
comparisons case-insensitive: either normalize addresses before DB binds using
normalize_address(...) everywhere you bind addresses (e.g., when selecting the
collection, in the nft_contracts UPDATE after fetch_collection_metadata, and in
the nft_tokens/nft_transfers queries referenced) or change the SQL predicates to
compare LOWER(address) = LOWER($1) (and corresponding WHERE clauses like WHERE
LOWER(address) = LOWER($3) for the UPDATE). Update all occurrences mentioned
(selection, the UPDATE in the nft_contracts path, and the other query sites) so
lookups and updates match regardless of case.

In `@backend/crates/atlas-server/src/indexer/metadata.rs`:
- Around line 254-310: Add unit tests in a #[cfg(test)] mod tests block in the
same file to exercise the new call-result mapping and conversion logic in
fetch_nft_contract_metadata (name/symbol/totalSupply mapping to Option and i64)
and fetch_erc20_contract_metadata (decimals -> Option<i16>, totalSupply ->
BigDecimal conversion and fallback). Write focused tests that mock the contract
call results for: successful values, large totalSupply that needs BigDecimal
conversion, absent/None responses, and invalid conversion paths to ensure
.ok().map(...) behavior and COALESCE handling are correct; reuse the pattern
from existing resolve_uri tests and run them with cargo test --workspace. Ensure
tests assert the database-bound values or the mapped return values from those
functions (via returned Result or by calling helper extraction logic) to cover
edge cases mentioned.

---

Nitpick comments:
In `@backend/crates/atlas-server/src/api/handlers/sse.rs`:
- Around line 50-95: Add a unit test in a #[cfg(test)] mod tests block in the
same file that exercises the recv()/DB backfill path: mock or stub the broadcast
receiver so rx.recv().await first returns
Err(broadcast::error::RecvError::Lagged(n)) and then later yields live blocks,
mock fetch_blocks_after(&pool, cursor).await to return one or more batches of
blocks (including a block with number <= last_block_number to assert
deduplication), and assert that the produced stream (the loop using rx.recv(),
last_block_number, block_to_event and FETCH_BATCH_SIZE) yields backfilled events
in ascending order and that duplicate/older live blocks are skipped; use the
same helper functions (fetch_blocks_after, block_to_event) and verify behavior
for the batch_len < FETCH_BATCH_SIZE break condition and for erroring
fetch_blocks_after to cover the warning path, then run cargo test --workspace.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7a9d1e8b-c8ed-4cd7-8ef3-a4ac8f8fa7f3

📥 Commits

Reviewing files that changed from the base of the PR and between f6d18cb and d5571c4.

📒 Files selected for processing (22)
  • AGENTS.md
  • Justfile
  • backend/Cargo.toml
  • backend/Dockerfile
  • backend/crates/atlas-common/src/error.rs
  • backend/crates/atlas-common/src/types.rs
  • backend/crates/atlas-server/src/api/error.rs
  • backend/crates/atlas-server/src/api/handlers/addresses.rs
  • backend/crates/atlas-server/src/api/handlers/etherscan.rs
  • backend/crates/atlas-server/src/api/handlers/logs.rs
  • backend/crates/atlas-server/src/api/handlers/nfts.rs
  • backend/crates/atlas-server/src/api/handlers/proxy.rs
  • backend/crates/atlas-server/src/api/handlers/sse.rs
  • backend/crates/atlas-server/src/api/handlers/tokens.rs
  • backend/crates/atlas-server/src/api/mod.rs
  • backend/crates/atlas-server/src/config.rs
  • backend/crates/atlas-server/src/indexer/fetcher.rs
  • backend/crates/atlas-server/src/indexer/indexer.rs
  • backend/crates/atlas-server/src/indexer/metadata.rs
  • backend/crates/atlas-server/src/main.rs
  • docker-compose.yml
  • frontend/nginx.conf


Comment on `AGENTS.md` line 17 (the unlabeled code fence under the `## Repository Layout` heading)

⚠️ Potential issue | 🟡 Minor

Add a language to the fenced code block.

Line 17 uses an unlabeled fenced block, which triggers markdown linting (MD040).

🧹 Suggested patch
-```
+```text
 atlas/
 ├── backend/
 ...
-```
+```
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 17-17: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` at line 17, Update the unlabeled fenced code block in AGENTS.md
(the block starting at the example directory tree) to include a language
identifier (e.g., add "text" after the opening triple backticks) so the block is
labeled and satisfies markdown lint rule MD040; locate the fenced block in
AGENTS.md and change the opening fence from ``` to ```text.

Comment on lines +66 to +67
`TimeoutLayer::with_status_code(StatusCode::REQUEST_TIMEOUT, Duration::from_secs(10))` wraps all routes — returns 408 if any handler exceeds 10s.


⚠️ Potential issue | 🟡 Minor

Update API behavior docs to match current backend conventions.

Two statements are currently out of sync with the project conventions:

  • Line 66 should call out that SSE routes are excluded from the 10s timeout wrapper.
  • Line 82-83 should document /api/status as the polling fallback payload { block_height, indexed_at }, with /api/events as live updates.
📝 Suggested doc patch
-`TimeoutLayer::with_status_code(StatusCode::REQUEST_TIMEOUT, Duration::from_secs(10))` wraps all routes — returns 408 if any handler exceeds 10s.
+`TimeoutLayer::with_status_code(StatusCode::REQUEST_TIMEOUT, Duration::from_secs(10))` wraps all non-SSE routes — returns 408 if any handler exceeds 10s.

 ### Frontend API client
 - Base URL: `/api` (proxied by nginx to `atlas-api:3000`)
-- Fast polling endpoint: `GET /api/height` → `{ block_height, indexed_at }` — used by navbar every 2s
-- Chain status: `GET /api/status` → full chain info, fetched once on page load
+- Live updates: `GET /api/events` (SSE)
+- Polling fallback: `GET /api/status` → `{ block_height, indexed_at }`

Based on learnings: "Use GET /api/status endpoint for navbar polling fallback …" and "Use GET /api/events endpoint for SSE live updates …", plus "Wrap all HTTP routes except SSE with TimeoutLayer...".

Also applies to: 82-84

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` around lines 66 - 67, Update the API docs to reflect current
backend behavior: clarify that
TimeoutLayer::with_status_code(StatusCode::REQUEST_TIMEOUT,
Duration::from_secs(10)) wraps all HTTP routes except SSE streams (exclude SSE
routes from the 10s timeout), and change the description for the polling vs live
endpoints to state that GET /api/status is the polling fallback returning the
payload { block_height, indexed_at } while GET /api/events is the SSE endpoint
for live updates; update the two doc locations that mention the timeout and the
endpoints accordingly.


 # Serialization
-serde = { version = ">=1.0.171, <1.0.220", features = ["derive"] }
+serde = { version = "1.0", features = ["derive"] }

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail

# Check whether rust-version is declared in workspace/package manifests
rg -n '^\s*rust-version\s*=' --glob '**/Cargo.toml'

# Confirm serde constraint in this workspace manifest
rg -n '^\s*serde\s*=' backend/Cargo.toml

# Inspect currently locked serde version (if lockfile exists)
if [ -f backend/Cargo.lock ]; then
  rg -n -C2 '^name = "serde"$|^version = "' backend/Cargo.lock
elif [ -f Cargo.lock ]; then
  rg -n -C2 '^name = "serde"$|^version = "' Cargo.lock
else
  echo "No Cargo.lock found at repo root or backend/"
fi
```

Repository: evstack/atlas

Length of output: 39


🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail

# List repository structure to understand layout
echo "=== Repository root files ==="
ls -la | head -20

# Check if backend directory exists
if [ -d "backend" ]; then
  echo -e "\n=== Backend directory contents ==="
  ls -la backend/ | head -20
fi

# Try to find all Cargo.toml files
echo -e "\n=== Cargo.toml files ==="
find . -maxdepth 3 -name "Cargo.toml" -type f 2>/dev/null | head -20

# If backend/Cargo.toml exists, inspect it
if [ -f "backend/Cargo.toml" ]; then
  echo -e "\n=== backend/Cargo.toml content ==="
  cat -n backend/Cargo.toml | head -50
fi

# Check for Cargo.lock files
echo -e "\n=== Cargo.lock files ==="
find . -maxdepth 2 -name "Cargo.lock" -type f 2>/dev/null
```

Repository: evstack/atlas

Length of output: 3134


Consider pinning rust-version or tightening serde constraints for MSRV stability.

Line 29 uses unconstrained serde 1.0 without a declared workspace rust-version. Patch updates within the 1.0.x range could introduce higher MSRV requirements. Add rust-version to [workspace.package] or narrow serde to a specific patch version.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/Cargo.toml` at line 29, The Cargo.toml entry for serde currently uses
a floating "1.0" which can bump MSRV requirements; either declare a workspace
rust-version in [workspace.package] (e.g., add rust-version = "1.xx") or tighten
the serde dependency to a fixed patch (e.g., serde = "1.0.Y") to guarantee MSRV
stability—update the manifest accordingly and ensure the serde line and/or
[workspace.package] rust-version are changed so CI and downstream consumers have
a stable minimum Rust version.

Comment on lines +46 to 47
AtlasError::InvalidInput(_) | AtlasError::Validation(_) => 400,
AtlasError::Unauthorized(_) => 401,

🛠️ Refactor suggestion | 🟠 Major

Add a colocated unit test for the new Validation mapping.

Line 46 adds new status-code logic, but this file still lacks a #[cfg(test)] assertion for AtlasError::Validation(_) -> 400.

✅ Suggested test addition
 impl AtlasError {
     pub fn status_code(&self) -> u16 {
         match self {
             AtlasError::NotFound(_) => 404,
             AtlasError::InvalidInput(_) | AtlasError::Validation(_) => 400,
@@
         }
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::AtlasError;
+
+    #[test]
+    fn validation_maps_to_400() {
+        let err = AtlasError::Validation("bad field".to_string());
+        assert_eq!(err.status_code(), 400);
+    }
+}

As per coding guidelines: backend/**/*.rs: "Add unit tests for new logic in a #[cfg(test)] mod tests block in the same file, and run with cargo test --workspace."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-common/src/error.rs` around lines 46 - 47, Add a
colocated unit test in a #[cfg(test)] mod tests block that asserts
AtlasError::Validation(...) maps to HTTP 400; specifically, construct an
AtlasError::Validation variant and call the status-code mapping (the
function/method containing the match with AtlasError::InvalidInput/Validation ->
400) and assert it equals 400. Place this test in the same file alongside other
tests and run with cargo test --workspace.

Comment on lines 246 to 248
 pub fn offset(&self) -> i64 {
-    ((self.page.saturating_sub(1)) * self.limit) as i64
+    (self.page.saturating_sub(1) as i64) * self.limit()
 }

⚠️ Potential issue | 🟠 Major

Replace offset() with cursor/keyset pagination helper.

Pagination::offset() still encodes offset-based pagination (Line 247), which violates repo pagination rules for large tables and promotes slow scan patterns in downstream queries. Please switch this shared API to cursor/keyset semantics.

Suggested refactor direction
 impl Pagination {
-    pub fn offset(&self) -> i64 {
-        (self.page.saturating_sub(1) as i64) * self.limit()
-    }
+    /// Cursor for descending keyset pagination.
+    /// Query shape at call sites:
+    /// `WHERE number <= $cursor ORDER BY number DESC LIMIT $1`
+    pub fn cursor(&self, total_count: i64) -> i64 {
+        total_count
+            .saturating_sub(1)
+            .saturating_sub((self.page.saturating_sub(1) as i64) * self.limit())
+    }
 
     pub fn limit(&self) -> i64 {
         self.limit.min(100) as i64
     }
 }

As per coding guidelines, "{backend/migrations/*.sql,backend/**/*.rs}: Never use OFFSET for pagination on large tables — use keyset/cursor pagination instead".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-pub fn offset(&self) -> i64 {
-    (self.page.saturating_sub(1) as i64) * self.limit()
-}
+/// Cursor for descending keyset pagination.
+/// Query shape at call sites:
+/// `WHERE number <= $cursor ORDER BY number DESC LIMIT $1`
+pub fn cursor(&self, total_count: i64) -> i64 {
+    total_count
+        .saturating_sub(1)
+        .saturating_sub((self.page.saturating_sub(1) as i64) * self.limit())
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-common/src/types.rs` around lines 246 - 248,
Pagination::offset currently exposes offset-based pagination; replace it with a
keyset/cursor helper. Remove or deprecate the offset(&self) method on the
Pagination type and add a cursor helper (e.g., Pagination::cursor_after or
Pagination::keyset) that returns an opaque cursor (Option<String> or
Option<CursorStruct>) encoding the last-seen key(s) needed for keyset queries
(for example primary key and tie-breaker like created_at), and implement a
complementary helper (e.g., Pagination::apply_to_query or
Pagination::bind_keyset) to convert that cursor into query parameters for WHERE
... > (key) ORDER BY key LIMIT. Update any call sites that used offset() to
consume the new cursor API. Ensure the cursor is stable/opaque (base64 or
similar) and include reference to Pagination::offset, Pagination::cursor_after
(or chosen names), and Pagination::apply_to_query when making these
replacements.
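The arithmetic of the suggested cursor helper can be sanity-checked in isolation. This standalone sketch restates `Pagination` with only the two fields the helper touches (the real struct has more; field types here are an assumption):

```rust
// Minimal stand-in for the real Pagination type; only the fields the
// suggested cursor() helper needs are modeled here.
struct Pagination {
    page: u32,
    limit: u32,
}

impl Pagination {
    fn limit(&self) -> i64 {
        self.limit.min(100) as i64
    }

    // Cursor for descending keyset pagination, as in the suggested refactor:
    // `WHERE number <= $cursor ORDER BY number DESC LIMIT $limit`
    fn cursor(&self, total_count: i64) -> i64 {
        total_count
            .saturating_sub(1)
            .saturating_sub((self.page.saturating_sub(1) as i64) * self.limit())
    }
}

fn main() {
    // 95 rows numbered 0..=94, 10 per page: page 1 starts at 94, page 2 at 84.
    assert_eq!(Pagination { page: 1, limit: 10 }.cursor(95), 94);
    assert_eq!(Pagination { page: 2, limit: 10 }.cursor(95), 84);
    // saturating_sub keeps an out-of-range page 0 from underflowing.
    assert_eq!(Pagination { page: 0, limit: 10 }.cursor(95), 94);
    println!("ok");
}
```

The query then scans forward from the cursor with an index on `number`, instead of skipping `page * limit` rows on every request.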

Comment on lines 260 to 266
 let total_supply = contract
     .totalSupply()
     .call()
     .await
     .ok()
-    .map(|r| r._0.try_into().unwrap_or(0i64));
+    .map(|r| r.try_into().unwrap_or(0i64));


⚠️ Potential issue | 🟠 Major

Avoid coercing overflowed totalSupply to zero.

Line 265 currently turns U256 -> i64 conversion failures into 0, which can silently corrupt nft_contracts.total_supply for large collections while still setting metadata_fetched = true. Prefer None on overflow so COALESCE preserves existing data.

Proposed fix
     let total_supply = contract
         .totalSupply()
         .call()
         .await
         .ok()
-        .map(|r| r.try_into().unwrap_or(0i64));
+        .and_then(|r| i64::try_from(r).ok());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-server/src/indexer/metadata.rs` around lines 260 - 266,
The current mapping from contract.totalSupply().call().await to total_supply
uses try_into().unwrap_or(0i64), which turns conversion failures into 0; change
it to yield None on overflow by replacing the unwrap_or with a safe conversion
that returns Option<i64> (e.g., use .and_then(|r| r.try_into().ok()) or an
equivalent .map(...).flatten()), so total_supply is None on conversion failure
and existing nft_contracts.total_supply is preserved by COALESCE; touch the
expression around contract.totalSupply().call().await, try_into, and
total_supply to implement this.
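The behavioral difference is easy to demonstrate with std conversions alone; in this sketch `u128` stands in for the on-chain `U256` type (an assumption made purely for illustration):

```rust
fn main() {
    // u128 stands in for U256 in this sketch.
    let huge: u128 = u128::MAX;
    let normal: u128 = 42;

    // unwrap_or(0) silently coerces an overflowing supply to zero...
    let coerced: i64 = i64::try_from(huge).unwrap_or(0);
    assert_eq!(coerced, 0); // looks like an empty collection

    // ...while and_then(.. .ok()) keeps the failure visible as None,
    // letting a COALESCE in SQL preserve the existing column value.
    let safe: Option<i64> = Some(huge).and_then(|r| i64::try_from(r).ok());
    assert_eq!(safe, None);

    let ok: Option<i64> = Some(normal).and_then(|r| i64::try_from(r).ok());
    assert_eq!(ok, Some(42));
    println!("ok");
}
```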

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
backend/crates/atlas-server/src/api/mod.rs (1)

165-171: ⚠️ Potential issue | 🟠 Major

Avoid startup panic on invalid CORS_ORIGIN.

Line 170 still uses expect(...); a bad env value hard-crashes the server. Return a structured error instead and propagate it to startup handling.

Proposed fix
-fn build_cors_layer(cors_origin: Option<String>) -> CorsLayer {
+fn build_cors_layer(cors_origin: Option<String>) -> anyhow::Result<CorsLayer> {
     let origin = match cors_origin {
         Some(origin) => {
-            let header_value = origin
-                .parse::<axum::http::HeaderValue>()
-                .expect("CORS_ORIGIN is not a valid HTTP header value");
+            let header_value = origin.parse::<axum::http::HeaderValue>()?;
             AllowOrigin::exact(header_value)
         }
         None => AllowOrigin::any(),
     };
-    CorsLayer::new()
+    Ok(CorsLayer::new()
         .allow_origin(origin)
         .allow_methods(Any)
-        .allow_headers(Any)
+        .allow_headers(Any))
 }
-pub fn build_router(state: Arc<AppState>, cors_origin: Option<String>) -> Router {
+pub fn build_router(state: Arc<AppState>, cors_origin: Option<String>) -> anyhow::Result<Router> {
@@
-    Router::new()
+    Ok(Router::new()
@@
-        .layer(build_cors_layer(cors_origin))
+        .layer(build_cors_layer(cors_origin)?)
         .layer(TraceLayer::new_for_http())
+    )
 }
-    let app = api::build_router(state, config.cors_origin.clone());
+    let app = api::build_router(state, config.cors_origin.clone())?;
#!/bin/bash
# Verify the panic path and call chain are present.
rg -n -C3 'build_cors_layer|expect\("CORS_ORIGIN is not a valid HTTP header value"\)|build_router\(' backend/crates/atlas-server/src/api/mod.rs backend/crates/atlas-server/src/main.rs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-server/src/api/mod.rs` around lines 165 - 171, The
build_cors_layer function currently panics on invalid CORS_ORIGIN because it
calls expect; change build_cors_layer to return Result<CorsLayer, anyhow::Error>
(or a suitable error type) instead of CorsLayer, replace the parse().expect(...)
with parse().map_err(|e| anyhow::anyhow!("invalid CORS_ORIGIN header value: {}",
e)) (or create a descriptive error), and update callers (e.g., build_router and
startup initialization) to propagate and handle the error at startup rather than
letting the process panic; ensure AllowOrigin::exact(...) is only called on the
successful parsed HeaderValue and the error is surfaced to main so startup can
log/exit cleanly.
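The same propagate-instead-of-panic pattern, reduced to a std-only sketch (the function name and default below are hypothetical, not the server's real config API):

```rust
// Hypothetical stand-in for build_cors_layer: parse an optional env value,
// returning Err instead of panicking so startup code can log and exit cleanly.
fn build_listen_port(raw: Option<String>) -> Result<u16, String> {
    match raw {
        Some(v) => v
            .parse::<u16>()
            .map_err(|e| format!("invalid PORT value {v:?}: {e}")),
        None => Ok(8080), // backwards-compatible default when unset
    }
}

fn main() {
    assert_eq!(build_listen_port(None), Ok(8080));
    assert_eq!(build_listen_port(Some("3000".into())), Ok(3000));
    // A bad value surfaces as a structured error, not a process abort.
    assert!(build_listen_port(Some("not-a-port".into())).is_err());
    println!("ok");
}
```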
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@backend/crates/atlas-server/src/api/mod.rs`:
- Around line 165-171: The build_cors_layer function currently panics on invalid
CORS_ORIGIN because it calls expect; change build_cors_layer to return
Result<CorsLayer, anyhow::Error> (or a suitable error type) instead of
CorsLayer, replace the parse().expect(...) with parse().map_err(|e|
anyhow::anyhow!("invalid CORS_ORIGIN header value: {}", e)) (or create a
descriptive error), and update callers (e.g., build_router and startup
initialization) to propagate and handle the error at startup rather than letting
the process panic; ensure AllowOrigin::exact(...) is only called on the
successful parsed HeaderValue and the error is surfaced to main so startup can
log/exit cleanly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 579506ff-f2ab-42e9-a4e8-47bbda0338c3

📥 Commits

Reviewing files that changed from the base of the PR and between d5571c4 and 08d3c83.

📒 Files selected for processing (3)
  • backend/crates/atlas-server/src/api/mod.rs
  • backend/crates/atlas-server/src/indexer/indexer.rs
  • backend/crates/atlas-server/src/main.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/crates/atlas-server/src/indexer/indexer.rs

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
docker-compose.yml (1)

5-11: Minor inconsistency: POSTGRES_DB is hardcoded but referenced as variable in healthcheck.

Line 5 sets POSTGRES_DB: atlas as a fixed value, but line 11's healthcheck references ${POSTGRES_DB:-atlas}. While this works (the interpolation defaults to "atlas", matching the hardcoded value), it's inconsistent with the parameterized pattern used for POSTGRES_USER and POSTGRES_PASSWORD. Consider parameterizing POSTGRES_DB as well for full consistency:

🔧 Suggested fix
     environment:
-      POSTGRES_DB: atlas
+      POSTGRES_DB: ${POSTGRES_DB:-atlas}
       POSTGRES_USER: ${POSTGRES_USER:-atlas}
       POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-atlas}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 5 - 11, POSTGRES_DB is hardcoded while
POSTGRES_USER and POSTGRES_PASSWORD use parameterized defaults; change the
POSTGRES_DB definition to use the same pattern (e.g., POSTGRES_DB:
${POSTGRES_DB:-atlas}) so the environment variable and the healthcheck's
reference (${POSTGRES_DB:-atlas}) are consistent; update the service env block
where POSTGRES_DB is set and ensure the healthcheck test still uses the existing
${POSTGRES_DB:-atlas} reference.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/crates/atlas-server/src/api/handlers/nfts.rs`:
- Around line 343-356: validate_resolved_host currently resolves the hostname
and checks IPs but the later metadata fetch re-resolves the host (TOCTOU/DNS
rebinding risk); change the fetch so it connects directly to the
already-resolved SocketAddr(s) and not by hostname: after
tokio::net::lookup_host in validate_resolved_host, pick/verify the allowed
SocketAddr(s) (reject private/internal ranges) and pass that SocketAddr into the
HTTP client connect (or use a custom connector) while setting the HTTP Host
header to the original hostname, or alternately re-resolve immediately before
connect and compare addresses to the previously-validated set to fail on any
mismatch; apply this to the code path that performs the metadata fetch so the
network connection uses the validated IP(s) rather than a fresh hostname
resolution.
- Around line 147-148: Handlers that currently paginate using LIMIT ... OFFSET
in the touched NFT queries must be converted to keyset/cursor pagination:
replace OFFSET-based SQL (the query string containing "LIMIT $2 OFFSET $3") with
a deterministic ORDER BY (e.g., ORDER BY created_at DESC, id DESC or similar)
and add a cursor parameter used in the WHERE clause (e.g., WHERE (created_at,
id) < ($cursor_created_at, $cursor_id)) so the query uses a cursor and LIMIT
only; update the corresponding handler signatures to accept/return an encoded
cursor (and decode it into cursor_created_at and cursor_id), remove the OFFSET
handling, and apply the same change at the other occurrences referenced (lines
~431-432 and ~468-469) ensuring tests and API docs are updated to reflect the
new cursor parameter/response fields.
- Around line 302-369: Add a #[cfg(test)] mod tests block in the same file and
include unit tests covering the new functions: for validate_metadata_url_scheme
write tests asserting Ok for "http://..." and "https://..." and Err for
disallowed schemes (e.g., "file://"); for is_non_public_ip assert true for
loopback (127.0.0.1, ::1), link-local and private ranges and false for a public
IP; for validate_resolved_host add async #[tokio::test] cases using IP-literal
hosts (e.g., "http://127.0.0.1/" should Err, "http://8.8.8.8/" should Ok) so you
avoid external DNS and exercise the tokio::net::lookup_host path — keep tests in
the same file and run with cargo test --workspace.

---

Nitpick comments:
In `@docker-compose.yml`:
- Around line 5-11: POSTGRES_DB is hardcoded while POSTGRES_USER and
POSTGRES_PASSWORD use parameterized defaults; change the POSTGRES_DB definition
to use the same pattern (e.g., POSTGRES_DB: ${POSTGRES_DB:-atlas}) so the
environment variable and the healthcheck's reference (${POSTGRES_DB:-atlas}) are
consistent; update the service env block where POSTGRES_DB is set and ensure the
healthcheck test still uses the existing ${POSTGRES_DB:-atlas} reference.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9f0390c9-e61d-4716-9c80-a9b6e5046181

📥 Commits

Reviewing files that changed from the base of the PR and between 08d3c83 and 4e090dd.

📒 Files selected for processing (10)
  • Justfile
  • backend/crates/atlas-common/src/types.rs
  • backend/crates/atlas-server/src/api/handlers/nfts.rs
  • backend/crates/atlas-server/src/api/handlers/sse.rs
  • backend/crates/atlas-server/src/api/mod.rs
  • backend/crates/atlas-server/src/config.rs
  • backend/crates/atlas-server/src/indexer/indexer.rs
  • backend/crates/atlas-server/src/main.rs
  • docker-compose.yml
  • frontend/nginx.conf
🚧 Files skipped from review as they are similar to previous changes (3)
  • backend/crates/atlas-server/src/indexer/indexer.rs
  • backend/crates/atlas-server/src/config.rs
  • backend/crates/atlas-common/src/types.rs

Comment on lines 147 to 148
LIMIT $2 OFFSET $3"
)

⚠️ Potential issue | 🟠 Major

OFFSET pagination in touched queries violates repo rule and won’t scale.

These handlers still page with OFFSET, which degrades badly on large NFT tables. Please switch these endpoints to keyset/cursor pagination.

As per coding guidelines: Never use OFFSET for pagination on large tables — use keyset/cursor pagination instead.

Also applies to: 431-432, 468-469

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-server/src/api/handlers/nfts.rs` around lines 147 - 148,
Handlers that currently paginate using LIMIT ... OFFSET in the touched NFT
queries must be converted to keyset/cursor pagination: replace OFFSET-based SQL
(the query string containing "LIMIT $2 OFFSET $3") with a deterministic ORDER BY
(e.g., ORDER BY created_at DESC, id DESC or similar) and add a cursor parameter
used in the WHERE clause (e.g., WHERE (created_at, id) < ($cursor_created_at,
$cursor_id)) so the query uses a cursor and LIMIT only; update the corresponding
handler signatures to accept/return an encoded cursor (and decode it into
cursor_created_at and cursor_id), remove the OFFSET handling, and apply the same
change at the other occurrences referenced (lines ~431-432 and ~468-469)
ensuring tests and API docs are updated to reflect the new cursor
parameter/response fields.

Comment on lines +302 to +369
/// Validate that a resolved metadata URL uses an allowed scheme.
///
/// Only `http` and `https` are permitted after IPFS/Arweave resolution.
/// `file://`, `ftp://`, `gopher://`, and other schemes are blocked to prevent
/// SSRF via non-HTTP channels.
fn validate_metadata_url_scheme(url: &str) -> Result<(), AtlasError> {
    let scheme_end = url
        .find("://")
        .ok_or_else(|| AtlasError::Validation(format!("Metadata URL has no scheme: {}", url)))?;
    let scheme = &url[..scheme_end];
    match scheme {
        "http" | "https" => Ok(()),
        other => Err(AtlasError::Validation(format!(
            "Disallowed URL scheme '{}' in metadata URL",
            other
        ))),
    }
}

/// Returns true if the IP address is in a private, loopback, link-local, or
/// other non-public range that should not be reachable from metadata fetches.
fn is_non_public_ip(ip: &std::net::IpAddr) -> bool {
    match ip {
        std::net::IpAddr::V4(v4) => {
            v4.is_loopback() // 127.0.0.0/8
                || v4.is_private() // 10/8, 172.16/12, 192.168/16
                || v4.is_link_local() // 169.254/16
                || v4.is_broadcast()
                || v4.is_unspecified()
                || v4.octets()[0] == 100 && (v4.octets()[1] & 0xC0) == 64 // 100.64/10 (CGNAT)
        }
        std::net::IpAddr::V6(v6) => {
            v6.is_loopback() // ::1
                || v6.is_unspecified()
                || (v6.segments()[0] & 0xFFC0) == 0xFE80 // fe80::/10 link-local
                || (v6.segments()[0] & 0xFE00) == 0xFC00 // fc00::/7 ULA
        }
    }
}

/// Resolve the host from a URL and reject non-public IP addresses to prevent SSRF.
async fn validate_resolved_host(url: &str) -> Result<(), AtlasError> {
    let parsed = reqwest::Url::parse(url)
        .map_err(|e| AtlasError::Validation(format!("Invalid metadata URL: {}", e)))?;

    let host = parsed
        .host_str()
        .ok_or_else(|| AtlasError::Validation("Metadata URL has no host".to_string()))?;

    let port = parsed.port_or_known_default().unwrap_or(80);
    let addr_str = format!("{}:{}", host, port);

    let addrs = tokio::net::lookup_host(&addr_str).await.map_err(|e| {
        AtlasError::MetadataFetch(format!("DNS resolution failed for {}: {}", host, e))
    })?;

    for addr in addrs {
        if is_non_public_ip(&addr.ip()) {
            return Err(AtlasError::Validation(format!(
                "Metadata URL host {} resolves to non-public IP {}",
                host,
                addr.ip()
            )));
        }
    }

    Ok(())
}

⚠️ Potential issue | 🟠 Major

Add same-file unit tests for the new URL/IP validation logic.

New security logic was added (validate_metadata_url_scheme, is_non_public_ip, validate_resolved_host) without a #[cfg(test)] mod tests block in this file.

As per coding guidelines: Add unit tests for new logic in a #[cfg(test)] mod tests block in the same file, and run with cargo test --workspace.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-server/src/api/handlers/nfts.rs` around lines 302 - 369,
Add a #[cfg(test)] mod tests block in the same file and include unit tests
covering the new functions: for validate_metadata_url_scheme write tests
asserting Ok for "http://..." and "https://..." and Err for disallowed schemes
(e.g., "file://"); for is_non_public_ip assert true for loopback (127.0.0.1,
::1), link-local and private ranges and false for a public IP; for
validate_resolved_host add async #[tokio::test] cases using IP-literal hosts
(e.g., "http://127.0.0.1/" should Err, "http://8.8.8.8/" should Ok) so you avoid
external DNS and exercise the tokio::net::lookup_host path — keep tests in the
same file and run with cargo test --workspace.
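For `is_non_public_ip` specifically, the checks can be exercised standalone. This sketch restates the function from the diff above (with parentheses added around the CGNAT check for clarity) so the suggested assertions run without the rest of the handler:

```rust
use std::net::IpAddr;

// Restated from the handler under review so the range checks can be
// exercised without the surrounding SSRF plumbing.
fn is_non_public_ip(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() // 127.0.0.0/8
                || v4.is_private() // 10/8, 172.16/12, 192.168/16
                || v4.is_link_local() // 169.254/16
                || v4.is_broadcast()
                || v4.is_unspecified()
                || (v4.octets()[0] == 100 && (v4.octets()[1] & 0xC0) == 64) // 100.64/10 (CGNAT)
        }
        IpAddr::V6(v6) => {
            v6.is_loopback() // ::1
                || v6.is_unspecified()
                || (v6.segments()[0] & 0xFFC0) == 0xFE80 // fe80::/10 link-local
                || (v6.segments()[0] & 0xFE00) == 0xFC00 // fc00::/7 ULA
        }
    }
}

fn main() {
    assert!(is_non_public_ip(&"127.0.0.1".parse().unwrap()));
    assert!(is_non_public_ip(&"10.0.0.1".parse().unwrap()));
    assert!(is_non_public_ip(&"100.64.0.1".parse().unwrap())); // CGNAT
    assert!(is_non_public_ip(&"::1".parse().unwrap()));
    assert!(is_non_public_ip(&"fe80::1".parse().unwrap())); // link-local
    assert!(!is_non_public_ip(&"8.8.8.8".parse().unwrap())); // public
    println!("ok");
}
```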

Comment on lines +343 to +356
async fn validate_resolved_host(url: &str) -> Result<(), AtlasError> {
    let parsed = reqwest::Url::parse(url)
        .map_err(|e| AtlasError::Validation(format!("Invalid metadata URL: {}", e)))?;

    let host = parsed
        .host_str()
        .ok_or_else(|| AtlasError::Validation("Metadata URL has no host".to_string()))?;

    let port = parsed.port_or_known_default().unwrap_or(80);
    let addr_str = format!("{}:{}", host, port);

    let addrs = tokio::net::lookup_host(&addr_str).await.map_err(|e| {
        AtlasError::MetadataFetch(format!("DNS resolution failed for {}: {}", host, e))
    })?;

⚠️ Potential issue | 🟠 Major

SSRF check still has a DNS rebinding TOCTOU gap.

Line 354 resolves and validates IPs, but Line 385 performs a fresh hostname-based request that can re-resolve differently. A rebinding attacker can pass validation and still hit internal targets at fetch time.

Also applies to: 375-387

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-server/src/api/handlers/nfts.rs` around lines 343 - 356,
validate_resolved_host currently resolves the hostname and checks IPs but the
later metadata fetch re-resolves the host (TOCTOU/DNS rebinding risk); change
the fetch so it connects directly to the already-resolved SocketAddr(s) and not
by hostname: after tokio::net::lookup_host in validate_resolved_host,
pick/verify the allowed SocketAddr(s) (reject private/internal ranges) and pass
that SocketAddr into the HTTP client connect (or use a custom connector) while
setting the HTTP Host header to the original hostname, or alternately re-resolve
immediately before connect and compare addresses to the previously-validated set
to fail on any mismatch; apply this to the code path that performs the metadata
fetch so the network connection uses the validated IP(s) rather than a fresh
hostname resolution.
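A std-only sketch of the "re-resolve and compare" variant (the helper name and flow are hypothetical; in practice pinning the validated address via the HTTP client, e.g. reqwest's `ClientBuilder::resolve`, avoids the second lookup entirely):

```rust
use std::collections::HashSet;
use std::net::{SocketAddr, ToSocketAddrs};

// Hypothetical helper: re-resolve immediately before connecting and only
// proceed on an address that was in the previously validated set.
fn recheck_resolution(
    host_port: &str,
    validated: &HashSet<SocketAddr>,
) -> Result<SocketAddr, String> {
    let fresh: Vec<SocketAddr> = host_port
        .to_socket_addrs()
        .map_err(|e| format!("re-resolution failed: {e}"))?
        .collect();
    // Fail closed if the answer flipped since validation (DNS rebinding).
    fresh
        .into_iter()
        .find(|a| validated.contains(a))
        .ok_or_else(|| "resolved address changed since validation".to_string())
}

fn main() {
    // IP literals resolve locally, so the demo needs no real DNS.
    let validated: HashSet<SocketAddr> =
        "127.0.0.1:80".to_socket_addrs().unwrap().collect();
    assert!(recheck_resolution("127.0.0.1:80", &validated).is_ok());
    // A different answer at connect time is rejected.
    assert!(recheck_resolution("127.0.0.2:80", &validated).is_err());
    println!("ok");
}
```

The connection must then be made to the returned `SocketAddr` (with the Host header set to the original hostname), not by hostname again.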

@pthmas pthmas left a comment

looks good overall, added small nits

/// Build the Axum router.
///
/// `cors_origin`: when `Some`, restrict CORS to that exact origin; when `None`,
/// allow any origin (backwards-compatible default for development / self-hosted

similar to what you pointed out yesterday, no need to talk about previous design as this is not in prod anywhere

/// allow any origin (backwards-compatible default for development / self-hosted
/// deployments).
///
/// NOTE: Rate limiting has not yet been added here. The `tower_governor` crate

could we upgrade the governor to the correct version and add rate limiting? can be another PR

pub api_port: u16,
/// If set, restrict CORS to this exact origin. When unset, any origin is allowed
/// (backwards-compatible default for development / self-hosted deployments).
pub cors_origin: Option<String>,

Could we add CORS_ORIGIN, POSTGRES_USER, and POSTGRES_PASSWORD to .env.example?

@tac0turtle tac0turtle requested a review from pthmas March 17, 2026 14:18
@tac0turtle tac0turtle merged commit 6cafa1e into main Mar 17, 2026
8 checks passed
@tac0turtle tac0turtle deleted the marko/audit branch March 17, 2026 15:06