| title | status | authors | based_on | category | source | tags |
|---|---|---|---|---|---|---|
| Code-Then-Execute Pattern | emerging | | | Tool Use & Environment | | |

Plan lists are opaque; we want full data-flow analysis and taint tracking.

Have the LLM output a sandboxed program or DSL script:
- LLM writes code that calls tools and untrusted-data processors.
- Static checker/taint engine verifies flows (e.g., no tainted variable reaches send_email.recipient).
- Interpreter runs the code in a locked sandbox.

x = calendar.read(today)
y = QuarantineLLM.format(x)
email.write(to="john@acme.com", body=y)

Applies to complex multi-step agents such as SQL copilots and software-engineering bots.
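As a rough illustration of the static check described above, the following Python sketch parses a script like the three-line example with the standard `ast` module and reports any tainted value that reaches a protected argument. The policy tables (`TAINT_SOURCES`, `SINKS`), the helper names, and the choice of the `to` argument of `email.write` as the protected sink are assumptions made for illustration, not the design used by CaMeL. The sketch handles only straight-line assignments and keyword arguments; branches, loops, positional arguments, and aliasing would need a real data-flow analysis.

```python
import ast

# Hypothetical policy tables; the call names come from the example script.
TAINT_SOURCES = {"calendar.read"}   # calls whose result is treated as untrusted
SINKS = {"email.write": {"to"}}     # call -> keyword arguments that must stay clean


def call_name(node: ast.Call) -> str:
    """Return the dotted name of a call, e.g. 'email.write'."""
    return ast.unparse(node.func)


def is_tainted(expr: ast.expr, tainted: set[str]) -> bool:
    """An expression is tainted if it reads a tainted variable or calls a source."""
    for sub in ast.walk(expr):
        if isinstance(sub, ast.Name) and sub.id in tainted:
            return True
        if isinstance(sub, ast.Call) and call_name(sub) in TAINT_SOURCES:
            return True
    return False


def check_script(source: str) -> list[str]:
    """Return policy violations; an empty list means the script may be run."""
    tainted: set[str] = set()
    violations: list[str] = []
    for stmt in ast.parse(source).body:
        # Inspect every sink call in this statement using the current taint state.
        for node in ast.walk(stmt):
            if isinstance(node, ast.Call) and call_name(node) in SINKS:
                protected = SINKS[call_name(node)]
                for kw in node.keywords:
                    if kw.arg in protected and is_tainted(kw.value, tainted):
                        violations.append(
                            f"line {stmt.lineno}: tainted value flows to "
                            f"{call_name(node)}.{kw.arg}"
                        )
        # Propagate taint through simple assignments (right-hand side to target).
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if is_tainted(stmt.value, tainted):
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
    return violations


SAFE = '''x = calendar.read(today)
y = QuarantineLLM.format(x)
email.write(to="john@acme.com", body=y)
'''

UNSAFE = '''x = calendar.read(today)
email.write(to=x, body="hi")
'''

print(check_script(SAFE))    # []
print(check_script(UNSAFE))  # ['line 2: tainted value flows to email.write.to']
```

In this sketch the script is only parsed, never executed, so the check can run before the interpreter touches any tool. A sandboxed interpreter would then run only scripts for which `check_script` returns an empty list.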
- Pros: Formal verifiability; replay logs.
- Cons: Requires DSL design and static-analysis infra.
- Debenedetti et al., CaMeL (2025); Beurer-Kellner et al., §3.1 (5).