problem
i use codex as an MCP server inside claude code a lot. the two work really well as a team — claude handles the high-level orchestration and codex does the heavy lifting with shell access, file edits, etc.
the problem is when i've been working with just codex directly and it already has all the context for what i'm doing. i want to tell claude "go chat with codex `019c7616-b3e9-7782-9ea8-b53e6bb09329`" and have it pick up right where i left off. but if the MCP server has restarted since that session (which happens all the time: claude code restarts, laptop sleep, etc.), `codex-reply` just returns "Session not found" even though the full JSONL transcript is still sitting on disk at `~/.codex/sessions/`.
this is frustrating because the data is RIGHT THERE.
the fix is surprisingly small
all three building blocks already exist in the codebase, they just aren't wired together in the `codex-reply` error path:
- `find_thread_path_by_id_str()` in `core/src/rollout/list.rs` — locates the JSONL rollout file by threadId (sqlite first, file search fallback)
- `read_session_meta_line()` in `core/src/rollout/list.rs` — reads the session metadata to recover the original cwd
- `resume_thread_from_rollout()` in `core/src/thread_manager.rs` — creates a new thread with the full conversation history loaded
the change is ~65 lines of new code in `mcp-server/src/message_processor.rs`:
- when `get_thread(thread_id)` fails, call a new `try_rehydrate_from_disk()` method
- that method chains the three functions above: find rollout → read session meta → resume thread
- if rehydration also fails (e.g. no rollout file on disk), return the original "Session not found" error unchanged
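to make the control flow concrete, here's a rough, self-contained sketch of the fallback. it is not the actual patch: `Server`, `Thread`, `Rollout`, and the helper methods are hypothetical stand-ins for the real codex types and for the three functions listed above, and the `resumed-N` id scheme is made up for illustration. the only things taken from the description are the order of the steps and the rule that a failed rehydration returns the original "Session not found" error.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Hypothetical stand-ins for illustration only; these are not the real codex types.
#[derive(Debug, Clone, PartialEq)]
struct Thread {
    thread_id: String,
    cwd: PathBuf,
    history: Vec<String>,
}

// What survives a restart: session metadata plus the transcript.
struct Rollout {
    cwd: PathBuf,
    history: Vec<String>,
}

#[derive(Default)]
struct Server {
    live: HashMap<String, Thread>,      // in-memory threads, gone after a restart
    on_disk: HashMap<PathBuf, Rollout>, // stand-in for ~/.codex/sessions/
    resumed: u32,                       // counter used to mint new thread ids
}

impl Server {
    // Stand-in for find_thread_path_by_id_str(): locate the rollout by thread id.
    fn find_rollout(&self, thread_id: &str) -> Option<PathBuf> {
        let path = PathBuf::from(format!("{thread_id}.jsonl"));
        self.on_disk.contains_key(&path).then_some(path)
    }

    // Stand-in for read_session_meta_line(): recover the original cwd.
    fn read_cwd(&self, path: &Path) -> Option<PathBuf> {
        self.on_disk.get(path).map(|r| r.cwd.clone())
    }

    // Stand-in for resume_thread_from_rollout(): new thread, full history loaded.
    fn resume(&mut self, path: &Path, cwd: PathBuf) -> Option<Thread> {
        let history = self.on_disk.get(path)?.history.clone();
        self.resumed += 1;
        let thread = Thread {
            thread_id: format!("resumed-{}", self.resumed),
            cwd,
            history,
        };
        self.live.insert(thread.thread_id.clone(), thread.clone());
        Some(thread)
    }

    // Chain the three steps; any missing piece yields None.
    fn try_rehydrate_from_disk(&mut self, thread_id: &str) -> Option<Thread> {
        let path = self.find_rollout(thread_id)?;
        let cwd = self.read_cwd(&path)?;
        self.resume(&path, cwd)
    }

    // Reply path: live lookup first, disk fallback second, original error last.
    fn codex_reply(&mut self, thread_id: &str) -> Result<Thread, String> {
        if let Some(t) = self.live.get(thread_id) {
            return Ok(t.clone());
        }
        self.try_rehydrate_from_disk(thread_id)
            .ok_or_else(|| "Session not found".to_string())
    }
}

fn main() {
    let mut server = Server::default();
    server.on_disk.insert(
        PathBuf::from("abc.jsonl"),
        Rollout {
            cwd: PathBuf::from("/work/project"),
            history: vec!["hello".to_string()],
        },
    );

    // "abc" is not live (simulated restart), but its rollout is on disk.
    match server.codex_reply("abc") {
        Ok(t) => println!("{} in {:?}, {} messages", t.thread_id, t.cwd, t.history.len()),
        Err(e) => println!("error: {e}"),
    }

    // No rollout on disk: the original error comes back unchanged.
    println!("{:?}", server.codex_reply("missing"));
}
```

the point of the shape is that the fallback only runs on the path that already fails today, so sessions that are still live behave exactly as before.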
the only other change needed is a 4-line `pub fn auth_manager()` getter on `ThreadManager` (needed to pass to `resume_thread_from_rollout`).
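for reference, the getter is roughly this shape. this is a sketch, not the real code: `AuthManager` and `ThreadManager` here are hypothetical stand-ins, and i'm assuming the manager holds its auth handle behind an `Arc`, which may not match the actual struct.

```rust
use std::sync::Arc;

// Hypothetical stand-ins; the real AuthManager and ThreadManager carry far more state.
struct AuthManager {
    label: String,
}

struct ThreadManager {
    // Assumption: the auth handle is shared via Arc.
    auth_manager: Arc<AuthManager>,
}

impl ThreadManager {
    // Hand out a cheap clone of the shared handle so a caller can pass it on.
    pub fn auth_manager(&self) -> Arc<AuthManager> {
        Arc::clone(&self.auth_manager)
    }
}

fn main() {
    let manager = ThreadManager {
        auth_manager: Arc::new(AuthManager { label: "default".to_string() }),
    };
    let handle = manager.auth_manager();
    println!("auth handle: {}", handle.label);
}
```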
the response already includes `threadId` in the output, so the client automatically picks up the new thread ID for subsequent calls. no client-side changes needed.
i already built this
i had a working implementation in #12594 (closed per contribution policy). it compiles, passes all 11 existing tests, and the diff is +75/-10 lines across 2 files. happy to share more details or help test if this gets picked up.
use case
this would make the codex MCP server way more useful for multi-agent workflows where you want to hand off context between sessions. right now the workaround is to either keep the MCP server running indefinitely (not realistic) or re-explain everything from scratch in a new session.