RDI-Foundation · sofasogood · Nov 19, 2025 · Nov 19, 2025 · Nov 19, 2025 · Nov 19, 2025
diff --git a/.github/workflows/submit.yml b/.github/workflows/submit.yml
@@ -0,0 +1,41 @@
+name: Submit Solution
+
+on:
+  push:
+    branches: [main]
+  workflow_dispatch:
+
+jobs:
+  submit-attacker:
+    runs-on: ubuntu-latest
+    if: contains(github.event.head_commit.message, '[submit]') || contains(github.event.head_commit.message, '[submit-attacker]')
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Submit Attacker
+        uses: LambdaLabsML/agentbeats-submission-action@main
+        with:
+          api_key: ${{ secrets.COMPETITION_API_KEY }}
+          role: 'attacker'
+          submission_path: './scenarios/security_arena/agents/attacker'
+          run_tests: 'false'
+          openai_api_key: ${{ secrets.OPENAI_API_KEY }}
+          openai_base_url: ${{ secrets.OPENAI_BASE_URL }}
+
+  submit-defender:
+    runs-on: ubuntu-latest
+    if: contains(github.event.head_commit.message, '[submit]') || contains(github.event.head_commit.message, '[submit-defender]')
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Submit Defender
+        uses: LambdaLabsML/agentbeats-submission-action@main
+        with:
+          api_key: ${{ secrets.COMPETITION_API_KEY }}
+          role: 'defender'
+          submission_path: './scenarios/security_arena/agents/defender'
+          run_tests: 'false'
+          openai_api_key: ${{ secrets.OPENAI_API_KEY }}
+          openai_base_url: ${{ secrets.OPENAI_BASE_URL }}
diff --git a/.gitignore b/.gitignore
@@ -3,4 +3,5 @@
 .python-version
 .venv/
 __pycache__/
-*.pyc
+*.pyc
+results
diff --git a/README.md b/README.md
@@ -1,3 +1,75 @@
+# Lambda × Berkeley AgentBeats Security Arena
+
+**Building the Future of AI Security Testing**
+
+Repository: https://github.com/LambdaLabsML/agentbeats-lambda
+
+## Track Description
+
+The Agent Security Arena challenges participants to advance the field of AI agent security evaluation. Implement realistic security scenarios from our curated library of 400+ specifications—testing vulnerabilities like prompt injection, data exfiltration, and jailbreaking—using an industry-standard adversarial testing framework. Your contributions will help define how we evaluate and secure AI agents operating in real-world environments, from financial advisors to healthcare systems.
+
+Browse the current scenario library on our [scenario browser](https://agentbeats-lambda.s3.us-east-1.amazonaws.com/index.html).
+
+## Competition Structure: Two Phases
+
+### Overview
+
+As AI agents gain autonomy and take on sensitive tasks, current security testing methods fall short. The Agent Security Arena provides a framework for testing AI vulnerabilities through realistic adversarial scenarios. The competition runs in two sequential phases; all participants compete in both.
+
+- **Phase 1 (November 24 – January 16)**: Implement security scenarios that test real vulnerabilities.
+- **Phase 2 (February 2 – February 23)**: Compete with advanced attack or defense agents.
+
+This track focuses on building realistic test scenarios that reveal actual vulnerabilities before they're exploited in production. Participants will balance creating challenging attack scenarios while maintaining clear success criteria and realistic constraints.
+
+### Key Dates
+
+| Date | Milestone |
+|------|-----------|
+| Nov 24, 2024 | Phase 1 begins - Start building scenarios |
+| Jan 16, 2025 | Phase 1 submissions due |
+| Feb 2 , 2025 | Phase 2 begins - Agent competition launches |
+| Feb 23, 2025 | Winners announced |
+
+### Model Constraint
+
+**Use gpt-oss-20b** to ensure fair compute (fits in 80GB H100); no proprietary API advantage.
+
+## Getting Started
+
+**[View Full Competition Documentation →](scenarios/security_arena/README.md)**
+
+- Phase 1 documentation: [scenarios/security_arena/docs/phase1.md](scenarios/security_arena/docs/phase1.md)
+- Phase 2 documentation: [scenarios/security_arena/docs/phase2.md](scenarios/security_arena/docs/phase2.md)
+
+Quick start:
+```bash
+uv run agentbeats-run scenarios/security_arena/scenario_portfolioiq.toml
+```
+
+### Documentation
+
+Study existing examples:
+- **PortfolioIQ** — Data injection in financial risk assessment
+  `scenarios/security_arena/plugins/portfolioiq.py`
+- **Thingularity** — Information disclosure from shopping assistant
+  `scenarios/security_arena/plugins/thingularity.py`
+
+Core docs:
+- `README.md` - Framework architecture and usage
+- `SCENARIO_SPECIFICATIONS.md` - Plugin interface and submission requirements
+
+## Support
+
+Lambda engineers have set up dedicated support for participants:
+
+- **Discord**: Support channel
+- **GitHub Issues**: Bug reports and technical questions
+- **Response Time**: Critical issues same-day; general questions within 24 hours
+
+We're committed to helping you succeed - ask us anything about the framework, scenario implementation, or evaluation criteria.
+
+---
+
 ## Quickstart
 1. Clone (or fork) the repo:
 ```

diff --git a/pyproject.toml b/pyproject.toml
@@ -12,6 +12,8 @@ dependencies = [
     "a2a-sdk>=0.3.5",
     "google-adk>=1.14.1",
     "google-genai>=1.36.0",
+    "jinja2>=3.1.0",
+    "openai>=2.8.1",
     "pydantic>=2.11.9",
     "python-dotenv>=1.1.1",
     "uvicorn>=0.35.0",

diff --git a/scenarios/debate/adk_debate_judge.py b/scenarios/debate/adk_debate_judge.py
-Original file line number
+Diff line change
@@ Expand Up / @@ -3,4 +3,5 @@ @@
     .python-version
     .venv/
     __pycache__/
-    *.pyc
+    *.pyc
+    results