2 changes: 2 additions & 0 deletions README.md
@@ -81,8 +81,10 @@ The key is stored once and shared across all agents on the same machine.
Start your agent and ask naturally:

- *"How is authentication implemented?"*
- *"Find the exact regex or string match for this token parser"*
- *"Show me error handling patterns across services"*
- *"Find similar features to guide my implementation"*
- *"Show me who calls this handler and what it depends on"*

No special commands needed — the agent picks up the skill automatically.

47 changes: 34 additions & 13 deletions skills/codealive-context-engine/SKILL.md
@@ -1,6 +1,6 @@
---
name: codealive-context-engine
description: Semantic code search and AI-powered codebase Q&A across indexed repositories. Use when understanding code beyond local files, exploring dependencies, discovering cross-project patterns, planning features, debugging, or onboarding. Queries like "How does X work?", "Show me Y patterns", "How is library Z used?". Provides search (fast, returns file locations and descriptions) and chat-with-codebase (slower, costs more, but returns synthesized answers).
description: Semantic code search and AI-powered codebase Q&A across indexed repositories. Use when understanding code beyond local files, exploring dependencies, discovering cross-project patterns, planning features, debugging, or onboarding. Queries like "How does X work?", "Show me Y patterns", "How is library Z used?". The default path is semantic search plus grep search; chat-with-codebase is slower, more expensive, and usually secondary.
---

# CodeAlive Context Engine
@@ -38,12 +38,15 @@ Do NOT retry the failed script until setup completes successfully.
| Tool | Script | Speed | Cost | Best For |
|------|--------|-------|------|----------|
| **List Data Sources** | `datasources.py` | Instant | Free | Discovering indexed repos and workspaces |
| **Search** | `search.py` | Fast | Low | Finding code locations, descriptions, identifiers |
| **Semantic Search** | `search.py` | Fast | Low | Finding relevant artifacts by meaning |
| **Grep Search** | `grep.py` | Fast | Low | Exact text and regex matches with line previews |
| **Fetch Artifacts** | `fetch.py` | Fast | Low | Retrieving full content for search results |
| **Artifact Relationships** | `relationships.py` | Fast | Low | Drilling into call graph, inheritance, references for one artifact |
| **Chat with Codebase** | `chat.py` | Slow | High | Synthesized answers, architectural explanations |

**Cost guidance:** Search is lightweight and should be the default starting point. Chat with Codebase invokes an LLM on the server side, making it significantly more expensive per call — use it when you need a synthesized, ready-to-use answer rather than raw search results.
**Cost guidance:** `semantic_search` and `grep_search` are the default starting point. Chat with Codebase invokes an LLM on the server side, can take up to 30 seconds, and is significantly more expensive per call — use it only when you need a synthesized, ready-to-use answer rather than raw search results.

**Highest-confidence guidance:** If your agent supports subagents and the task needs maximum reliability or depth, prefer a subagent-driven workflow that combines `search.py`, `grep.py`, `fetch.py`, `relationships.py`, and local file reads. `chat.py` is optional synthesis, not the default path.

**Three-step workflow (search → triage → load real content):**
1. **Search** — find relevant code locations with descriptions and identifiers
@@ -85,8 +88,9 @@ python scripts/datasources.py

```bash
python scripts/search.py "JWT token validation" my-backend
python scripts/search.py "error handling patterns" workspace:platform-team --mode deep
python scripts/search.py "authentication flow" my-repo --description-detail full
python scripts/search.py "authentication flow" my-repo --path src/auth --ext .py
python scripts/grep.py "AuthService" my-repo
python scripts/grep.py "auth\\(" my-repo --regex
```

### 3. Fetch full content (for external repos)
@@ -108,7 +112,7 @@

```bash
python scripts/relationships.py "my-org/backend::src/models.py::User" --profile
python scripts/relationships.py "my-org/backend::src/svc.py::Service" --profile allRelevant --max-count 200
```

### 5. Chat with codebase (slower, richer answers)
### 5. Chat with codebase (slower, optional synthesis)

```bash
python scripts/chat.py "Explain the authentication flow" my-backend
```

@@ -135,11 +139,9 @@ python scripts/search.py <query> <data_sources...> [options]

| Option | Description |
|--------|-------------|
| `--mode auto` | Default. Intelligent semantic search — use 80% of the time |
| `--mode fast` | Quick lexical search for known terms |
| `--mode deep` | Exhaustive search for complex cross-cutting queries. Resource-intensive |
| `--description-detail short` | Default. Brief description of each result |
| `--description-detail full` | More detailed description of each result |
| `--max-results N` | Optional cap for the number of returned artifacts |
| `--path PATH` | Repo-relative path or directory scope (repeatable) |
| `--ext EXT` | File extension scope such as `.py` or `.ts` (repeatable) |

**`description` is a triage pointer ONLY** — it tells you which artifacts are
worth a closer look. It is NOT the source of truth and you must NOT draw
@@ -148,6 +150,25 @@ source: use `fetch.py <identifier>` for external repos, or your editor's
file-read tool on the path for repos in the current working directory. Treat
only that real `content` as ground truth.
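To make the search → triage → load loop concrete, here is a minimal sketch in Python. The `StubClient` below stands in for whatever wraps `search.py` and `fetch.py` — only the `identifier` and `description` fields mirror the real search output; the class, method names, and data are invented for illustration:

```python
# Sketch of the search -> triage -> fetch loop. StubClient stands in for the
# real CodeAlive scripts; only the identifier/description fields are real.
class StubClient:
    def search(self, query):
        # A search result is a pointer: identifier + short description.
        return [
            {"identifier": "my-org/backend::src/auth.py::login", "description": "JWT login handler"},
            {"identifier": "my-org/backend::docs/notes.md", "description": "meeting notes"},
        ]

    def fetch(self, identifier):
        # Real code would call fetch.py / the API; this just fakes content.
        return f"def login(): ...  # full source for {identifier}"

client = StubClient()

# 1. Search: get pointers, not truth.
hits = client.search("JWT token validation")

# 2. Triage: use descriptions only to decide what is worth fetching.
relevant = [h for h in hits if "login" in h["description"].lower() or "jwt" in h["description"].lower()]

# 3. Load real content: only fetched source is ground truth.
for hit in relevant:
    content = client.fetch(hit["identifier"])
    print(hit["identifier"], "->", content)
```

The point of the stub is the shape of the loop: descriptions narrow the candidate set, and conclusions are drawn only from fetched content.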

### `grep.py` — Exact / Regex Search

Returns artifact-level matches with line previews. Use this when the pattern
itself matters more than semantic similarity.

```bash
python scripts/grep.py <query> <data_sources...> [--regex] [--max-results N] [--path PATH] [--ext EXT]
```

| Option | Description |
|--------|-------------|
| `--regex` | Interpret the query as a regex pattern |
| `--max-results N` | Optional cap for the number of returned artifacts |
| `--path PATH` | Repo-relative path or directory scope (repeatable) |
| `--ext EXT` | File extension scope such as `.py` or `.ts` (repeatable) |

Line previews are still search evidence, not source of truth. Use `fetch.py`
or your local file-read tool before drawing conclusions about behavior.

### `fetch.py` — Fetch Artifact Content

Retrieves the full source code content for artifacts found via search. Use this for external repositories you cannot access locally.
@@ -192,7 +213,7 @@ python scripts/relationships.py <identifier> [--profile PROFILE] [--max-count N]

Sends your question to an AI consultant that has full context of the indexed codebase. Returns synthesized, ready-to-use answers. Supports conversation continuity for follow-ups.

**This is more expensive than search** because it runs an LLM inference on the server side. Prefer search when you just need to locate code. Use chat when you need explanations, comparisons, or architectural analysis.
**This is more expensive than search** because it runs an LLM inference on the server side. Prefer search when you just need to locate code. Use chat when you need explanations, comparisons, or architectural analysis after search. It can take up to 30 seconds.

```bash
python scripts/chat.py <question> <data_sources...> [options]
```

@@ -270,7 +291,7 @@ This skill works standalone, but delivers the best experience when combined with
| Component | What it provides |
|-----------|-----------------|
| **This skill** | Query patterns, workflow guidance, cost-aware tool selection |
| **MCP server** | Direct `codebase_search`, `fetch_artifacts`, `get_artifact_relationships`, `codebase_consultant`, `get_data_sources` tools |
| **MCP server** | Direct `semantic_search`, `grep_search`, `fetch_artifacts`, `get_artifact_relationships`, `chat`, `get_data_sources` tools plus deprecated aliases |

When both are installed, prefer the MCP server's tools for direct operations and this skill's scripts for guided workflows.

@@ -298,7 +298,7 @@ Use when:
1. **Use natural language** - CodeAlive understands intent, not just keywords
2. **Be specific about context** - Include domain/layer info (API, database, frontend)
3. **Leverage workspaces** - Search across multiple repos for patterns
4. **Start with chat** - Ask "How does X work?" before searching
4. **Start with search** - Use semantic search first, then grep when the literal pattern matters; use chat only after you have evidence and still need synthesis
5. **Iterate** - Use follow-up questions to drill deeper
6. **Combine with local tools** - CodeAlive for discovery, Read for details
7. **Think like a librarian** - Focus on "what" and "why", not "where"
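The search-first ordering in points 4–6 can be sketched as a tiny decision helper. This is purely illustrative — the return values mirror the tool names in this skill, but the function itself is not part of it:

```python
# Illustrative tool-selection sketch mirroring the best practices above.
def choose_tool(have_search_evidence: bool, need_exact_pattern: bool, need_synthesis: bool) -> str:
    if need_exact_pattern:
        # The literal pattern matters more than meaning.
        return "grep_search"
    if need_synthesis and have_search_evidence:
        # Chat is a last step, after search evidence still leaves gaps.
        return "chat"
    # Default starting point: cheap, fast semantic search.
    return "semantic_search"

print(choose_tool(False, False, True))   # no evidence yet -> search first
```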
27 changes: 14 additions & 13 deletions skills/codealive-context-engine/references/workflows.md
@@ -28,24 +28,24 @@ Review output to understand:
- What workspaces group related repos
- Which data sources to use for exploration

### Step 2: Get Architectural Overview
### Step 2: Understand Entry Points
```bash
python chat.py "Provide an architectural overview of this codebase. What are the main components, how do they interact, and what's the tech stack?" my-backend-repo
python search.py "main application entry point, startup initialization" my-backend-repo
```
```

### Step 3: Understand Entry Points
### Step 3: Explore Key Features
```bash
python search.py "main application entry point, startup initialization" my-backend-repo
python search.py "main features, core capabilities, major services" my-backend-repo
```

### Step 4: Explore Key Features
### Step 4: Get Architectural Overview Only If Needed
```bash
python chat.py "What are the main features/capabilities of this system?" my-backend-repo
python chat.py "Provide an architectural overview of this codebase. What are the main components, how do they interact, and what's the tech stack?" my-backend-repo
```

### Step 5: Understand Data Models
```bash
python search.py "database models, schemas, entity definitions" my-backend-repo --mode auto
python search.py "database models, schemas, entity definitions" my-backend-repo
```

**Progressive Discovery:**
@@ -61,18 +61,19 @@ python search.py "database models, schemas, entity definitions" my-backend-repo

### Example: Understanding User Authentication

#### Step 1: Start with High-Level Question
#### Step 1: Start with Search
```bash
python chat.py "How is user authentication implemented? Describe the flow from login to session management" my-backend
python search.py "user authentication, login flow, session management" my-backend
python grep.py "refresh token" my-backend
```


#### Step 2: Find Entry Points
#### Step 2: Use Chat Only If You Still Need Synthesis
```bash
python search.py "user login endpoint, authentication API" my-backend
python chat.py "How is user authentication implemented? Describe the flow from login to session management" my-backend
```

Save conversation_id for follow-up questions.

#### Step 3: Trace Through Layers
```bash
# API Layer
```
2 changes: 1 addition & 1 deletion skills/codealive-context-engine/scripts/fetch.py
@@ -15,7 +15,7 @@
# Fetch multiple artifacts
python fetch.py "my-org/backend::src/auth.py::login" "my-org/backend::src/utils.py::helper"

Identifiers come from codebase_search results (the `identifier` field).
Identifiers come from semantic/grep search results (the `identifier` field).
The format is: {owner/repo}::{path}::{symbol} (for symbols/chunks)
{owner/repo}::{path} (for files)

115 changes: 115 additions & 0 deletions skills/codealive-context-engine/scripts/grep.py
@@ -0,0 +1,115 @@
#!/usr/bin/env python3
"""
CodeAlive Grep Search - exact text or regex search across indexed repositories.

Usage:
python grep.py "AuthService" my-repo
python grep.py "auth\\(" my-repo --regex --max-results 25
python grep.py "TODO" workspace:backend-team --path src --ext .py
"""

import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent / "lib"))

from api_client import CodeAliveClient


def format_grep_results(results: dict) -> str:
    items = results.get("results", []) if isinstance(results, dict) else []
    if not items:
        return "No results found."

    output = []
    for idx, result in enumerate(items, 1):
        location = result.get("location", {})
        file_path = location.get("path") or result.get("path")
        matches = result.get("matches", [])

        output.append(f"\n--- Result #{idx} [{result.get('kind', 'Artifact')}] ---")
        if file_path:
            output.append(f"  File: {file_path}")
        if result.get("identifier"):
            output.append(f"  Identifier: {result['identifier']}")
**Review comment on lines +26 to +34 (medium):** The result formatting in grep.py is missing logic to check for `filePath` and to extract the file path from the identifier if explicit path fields are missing. This logic is present in search.py and should be included here for consistency and to ensure the file path is displayed whenever possible.

        location = result.get("location", {})
        file_path = location.get("path") or result.get("filePath") or result.get("path")
        identifier = result.get("identifier", "")
        matches = result.get("matches", [])

        if not file_path and identifier and "::" in identifier:
            parts = identifier.split("::")
            if len(parts) >= 2:
                file_path = parts[1]

        output.append(f"\n--- Result #{idx} [{result.get('kind', 'Artifact')}] ---")
        if file_path:
            output.append(f"  File: {file_path}")
        if identifier:
            output.append(f"  Identifier: {identifier}")

        if result.get("matchCount") is not None:
            output.append(f"  Match count: {result['matchCount']}")

        for match in matches:
            output.append(
                "    "
                f"{match.get('lineNumber', '?')}:{match.get('startColumn', '?')}-"
                f"{match.get('endColumn', '?')} {match.get('lineText', '')}"
            )

    output.append(
        "\nHint: match previews are search evidence only. Fetch the full source "
        "with `python fetch.py <identifier>` or read the local file before reasoning about behavior."
    )
    return "\n".join(output)


def main():
    if len(sys.argv) < 3:
        print("Error: Missing required arguments.", file=sys.stderr)
        print(
            "Usage: python grep.py <query> <data_source> [data_source2...] "
            "[--regex] [--max-results N] [--path PATH] [--ext EXT]",
            file=sys.stderr,
        )
        sys.exit(1)

    query = sys.argv[1]
    data_sources = []
    paths = []
    extensions = []
    max_results = None
    regex = False

    i = 2
    while i < len(sys.argv):
        arg = sys.argv[i]
        if arg == "--regex":
            regex = True
            i += 1
        elif arg == "--max-results" and i + 1 < len(sys.argv):
            max_results = int(sys.argv[i + 1])
            i += 2
**Review comment on lines +75 to +77 (medium):** The `int()` conversion for `--max-results` will raise a `ValueError` and cause the script to crash with a stack trace if a non-integer value is provided. It is better to handle this gracefully with a user-friendly error message.

Suggested change:

        elif arg == "--max-results" and i + 1 < len(sys.argv):
            try:
                max_results = int(sys.argv[i + 1])
            except ValueError:
                print(f"Error: --max-results must be an integer, got '{sys.argv[i + 1]}'", file=sys.stderr)
                sys.exit(1)
            i += 2

        elif arg == "--path" and i + 1 < len(sys.argv):
            paths.append(sys.argv[i + 1])
            i += 2
        elif arg == "--ext" and i + 1 < len(sys.argv):
            extensions.append(sys.argv[i + 1])
            i += 2
        elif arg == "--help":
            print(__doc__)
            sys.exit(0)
        else:
            data_sources.append(arg)
            i += 1
**Review comment on lines +87 to +89 (medium):** The current argument parsing logic treats any unknown argument as a data source. This can lead to confusing behavior if a user makes a typo in a flag (e.g. `--max-result` instead of `--max-results`), as the typo will be added to the list of data sources and likely cause an API error later. It is safer to validate that unknown arguments do not start with `--`:

        elif arg.startswith("--"):
            print(f"Error: Unknown option '{arg}'", file=sys.stderr)
            sys.exit(1)
        else:
            data_sources.append(arg)
            i += 1


    if not data_sources:
        print(
            "Error: At least one data source is required. Run datasources.py to see available sources.",
            file=sys.stderr,
        )
        sys.exit(1)

    try:
        client = CodeAliveClient()
        results = client.grep_search(
            query=query,
            data_sources=data_sources,
            paths=paths or None,
            extensions=extensions or None,
            max_results=max_results,
            regex=regex,
        )
        print(format_grep_results(results))
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
main()
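To illustrate the output shape the formatter produces, here is a trimmed standalone replica of `format_grep_results` run against a fabricated payload. The field names (`results`, `location`, `matches`, `lineNumber`, `lineText`) mirror the script above; the payload data itself is invented:

```python
# Trimmed replica of format_grep_results, exercised on a fabricated payload.
def format_preview(results: dict) -> str:
    items = results.get("results", []) if isinstance(results, dict) else []
    if not items:
        return "No results found."
    output = []
    for idx, result in enumerate(items, 1):
        path = result.get("location", {}).get("path") or result.get("path")
        output.append(f"--- Result #{idx} [{result.get('kind', 'Artifact')}] ---")
        if path:
            output.append(f"  File: {path}")
        for match in result.get("matches", []):
            output.append(f"    {match.get('lineNumber', '?')}: {match.get('lineText', '')}")
    return "\n".join(output)

payload = {
    "results": [{
        "kind": "File",
        "location": {"path": "src/auth.py"},
        "matches": [{"lineNumber": 42, "lineText": "class AuthService:"}],
    }]
}
print(format_preview(payload))
```

Each result renders as a header line, an optional file path, and one indented preview line per match — enough for triage, but still only pointers into the real source.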