codex-reply should rehydrate sessions from disk when thread not found in memory #12596

@zsxkib

problem

i use codex as an MCP server inside claude code a lot. the two work really well as a team — claude handles the high-level orchestration and codex does the heavy lifting with shell access, file edits, etc.

the problem is when i've been working with just codex directly and it already has all the context for what i'm doing. i want to tell claude "go chat with codex `019c7616-b3e9-7782-9ea8-b53e6bb09329`" and have it pick up right where i left off. but if the MCP server has restarted since that session (which happens all the time: claude code restarts, laptop sleep, and so on), `codex-reply` just returns "Session not found" even though the full JSONL transcript is still sitting on disk at `~/.codex/sessions/`.

this is frustrating because the data is RIGHT THERE.

the fix is surprisingly small

all three building blocks already exist in the codebase, they just aren't wired together in the `codex-reply` error path:

  1. `find_thread_path_by_id_str()` in `core/src/rollout/list.rs` — locates the JSONL rollout file by threadId (sqlite first, file search fallback)
  2. `read_session_meta_line()` in `core/src/rollout/list.rs` — reads the session metadata to recover the original cwd
  3. `resume_thread_from_rollout()` in `core/src/thread_manager.rs` — creates a new thread with the full conversation history loaded

the change is ~65 lines of new code in `mcp-server/src/message_processor.rs`:

  • when `get_thread(thread_id)` fails, call a new `try_rehydrate_from_disk()` method
  • that method chains the three functions above: find rollout → read session meta → resume thread
  • if rehydration also fails (e.g. no rollout file on disk), return the original "Session not found" error unchanged
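
a rough sketch of that fallback chain, to make the control flow concrete. this is not the code from the PR: `MockServer`, `Thread`, the hash maps standing in for memory and disk, and every signature below are assumptions made so the example compiles on its own. only the function names and the "Session not found" message come from the description above.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for a codex thread; the real type is not shown in this issue.
#[derive(Debug, Clone, PartialEq)]
pub struct Thread {
    pub thread_id: String,
    pub cwd: PathBuf,
    pub rollout: PathBuf,
}

// Hypothetical mock: hash maps simulate in-memory threads and files on disk.
#[derive(Default)]
pub struct MockServer {
    pub threads: HashMap<String, Thread>,       // live sessions in memory
    pub rollouts: HashMap<String, PathBuf>,     // thread id -> rollout file
    pub session_cwd: HashMap<PathBuf, PathBuf>, // rollout file -> recorded cwd
}

impl MockServer {
    // Stand-in for find_thread_path_by_id_str(): locate the rollout by thread id.
    fn find_thread_path_by_id_str(&self, id: &str) -> Option<PathBuf> {
        self.rollouts.get(id).cloned()
    }

    // Stand-in for read_session_meta_line(): recover the original cwd.
    fn read_session_meta_line(&self, rollout: &Path) -> Option<PathBuf> {
        self.session_cwd.get(rollout).cloned()
    }

    // Stand-in for resume_thread_from_rollout(): registers a new thread
    // (with a new id, an assumption of this mock) backed by the rollout.
    fn resume_thread_from_rollout(&mut self, rollout: &Path, cwd: PathBuf) -> Thread {
        let thread = Thread {
            thread_id: format!("resumed-{}", self.threads.len()),
            cwd,
            rollout: rollout.to_path_buf(),
        };
        self.threads.insert(thread.thread_id.clone(), thread.clone());
        thread
    }

    // find rollout -> read session meta -> resume thread; None if any step fails.
    fn try_rehydrate_from_disk(&mut self, id: &str) -> Option<Thread> {
        let rollout = self.find_thread_path_by_id_str(id)?;
        let cwd = self.read_session_meta_line(&rollout)?;
        Some(self.resume_thread_from_rollout(&rollout, cwd))
    }

    // Memory first, disk second, and the original error if both miss.
    pub fn codex_reply(&mut self, id: &str) -> Result<Thread, String> {
        if let Some(thread) = self.threads.get(id) {
            return Ok(thread.clone());
        }
        self.try_rehydrate_from_disk(id)
            .ok_or_else(|| "Session not found".to_string())
    }
}
```

the point of the shape is that the happy path is untouched: a thread that is still in memory never hits the disk lookup, and a thread with no rollout file gets exactly the same error as before.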

the only other change needed is a 4-line `pub fn auth_manager()` getter on `ThreadManager` (needed to pass to `resume_thread_from_rollout`).
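
for illustration, a getter of that shape might look like the sketch below. the `AuthManager` stand-in, the `auth_manager` field, and the `Arc` return type are all assumptions here; the real `ThreadManager` layout isn't shown in this issue.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the auth manager type.
#[derive(Debug, Default)]
pub struct AuthManager;

pub struct ThreadManager {
    // Assumed field name and type, for illustration only.
    auth_manager: Arc<AuthManager>,
}

impl ThreadManager {
    pub fn new(auth_manager: Arc<AuthManager>) -> Self {
        Self { auth_manager }
    }

    /// Returns a shared handle to the auth manager so a caller can pass it
    /// along when resuming a thread.
    pub fn auth_manager(&self) -> Arc<AuthManager> {
        Arc::clone(&self.auth_manager)
    }
}
```

returning a cloned `Arc` keeps the field private and hands out a cheap reference-counted handle rather than a borrow tied to the manager's lifetime.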

the response already includes `threadId` in the output, so the client automatically picks up the new thread ID for subsequent calls. no client-side changes needed.

i already built this

i had a working implementation in #12594 (closed per contribution policy). it compiles, passes all 11 existing tests, and the diff is +75/-10 lines across 2 files. happy to share more details or help test if this gets picked up.

use case

this would make the codex MCP server way more useful for multi-agent workflows where you want to hand off context between sessions. right now the workaround is to either keep the MCP server running indefinitely (not realistic) or re-explain everything from scratch in a new session.

Labels: enhancement, mcp, mcp-server
