Purpose: LLM-enhanced analysis, model interpretation, and AI assistance for GNN models
Pipeline Step: Step 13: LLM processing (13_llm.py)
Category: AI Enhancement / Analysis
Status: ✅ Production Ready
Version: 1.6.0
Last Updated: 2026-04-16
- LLM-based model analysis and interpretation
- Natural language explanations of GNN structures
- Active Inference concept clarification
- Model optimization suggestions
- Automated documentation generation
- Multi-provider LLM support (Ollama, OpenAI, OpenRouter, Perplexity; Anthropic keys appear in the provider matrix when present)
- Automatic preference for local Ollama when no cloud API keys are set (`LLMProcessor`)
- Context-aware prompt generation
- Structured output parsing
- Rate limiting and error handling
Description: Main LLM processing function with automatic Ollama recovery. Processes GNN files using LLM analysis with multi-provider support.
Parameters:
- `target_dir` (Path): Directory containing GNN files to analyze
- `output_dir` (Path): Output directory for LLM analyses
- `verbose` (bool): Enable verbose logging (default: False)
- `analysis_type` (str, optional): Type of analysis ("comprehensive", "summary", "explain", "optimize") (default: "comprehensive")
- `provider` (str, optional): LLM provider ("auto", "openai", "anthropic", "ollama") (default: "auto")
  - "auto": Automatically select the best available provider (checks API keys, then Ollama)
  - "openai": Use OpenAI API (requires OPENAI_API_KEY)
  - "anthropic": Use Anthropic API (requires ANTHROPIC_API_KEY)
  - "ollama": Use local Ollama (requires an Ollama installation)
- `llm_tasks` (str, optional): Specific tasks ("all", "summarize", "explain", "optimize") (default: "all")
- `llm_timeout` (int, optional): Timeout for LLM API calls in seconds (default: 60)
- `max_tokens` (int, optional): Maximum tokens in response (default: 2000)
- `model` (str, optional): Specific model to use (provider-specific)
- `**kwargs`: Additional LLM processing options
Returns: bool - True if processing succeeded, False otherwise
Example:
from llm import process_llm
from pathlib import Path
import logging
success = process_llm(
    target_dir=Path("input/gnn_files"),
    output_dir=Path("output/13_llm_output"),
    verbose=True,
    analysis_type="comprehensive",
    provider="auto",
    llm_tasks="all"
)

analyze_gnn_file_with_llm(file_path: Path, verbose: bool = False, ollama_model: Optional[str] = None) -> Dict[str, Any] | Coroutine
Description: Analyze a GNN file (heuristic extractors + optional LLM summary). When ollama_model is set (e.g. from process_llm after _select_best_ollama_model), summarization uses that tag on Ollama via LLMOperations.summarize_gnn.
Parameters:
- `file_path` (Path): Path to the GNN `.md` file
- `verbose` (bool): Verbose logging
- `ollama_model` (str, optional): Resolved Ollama tag for the per-file summary; if omitted, summarization follows `LLMProcessor` defaults
Returns: Result dict, or a coroutine if called while an event loop is already running
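Because a coroutine comes back only when an event loop is already running, callers can normalize both return shapes. A minimal sketch, assuming `analyze_gnn_file_with_llm` is importable from `llm` (the normalization wrapper is illustrative, not part of the module):

```python
import asyncio
import inspect
from pathlib import Path

from llm import analyze_gnn_file_with_llm  # import path assumed

async def analyze_in_async_context(path: Path) -> dict:
    """Inside a running event loop the call yields a coroutine, so await it."""
    result = analyze_gnn_file_with_llm(path, verbose=True)
    if inspect.iscoroutine(result):
        result = await result
    return result

# Outside any event loop the function returns the result dict directly;
# here we drive the async wrapper with asyncio.run for demonstration.
analysis = asyncio.run(analyze_in_async_context(Path("input/gnn_files/model.md")))
print(sorted(analysis))
```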
Description: Extract variable definitions from GNN content.
Parameters:
content(str): GNN content string
Returns: List[Dict[str, Any]] - List of variable dictionaries with name, type, dimensions
Description: Extract connection definitions from GNN content.
Parameters:
content(str): GNN content string
Returns: List[Dict[str, Any]] - List of connection dictionaries with source, target, type
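A short sketch of the two extractors documented above; the names `extract_variables` and `extract_connections` and the import path are hypothetical stand-ins for however the module actually exposes them:

```python
from pathlib import Path

# Hypothetical names/import path for the extractors documented above.
from llm.processor import extract_variables, extract_connections

content = Path("input/gnn_files/model.md").read_text()

variables = extract_variables(content)      # [{"name": ..., "type": ..., "dimensions": ...}, ...]
connections = extract_connections(content)  # [{"source": ..., "target": ..., "type": ...}, ...]

print(f"{len(variables)} variables, {len(connections)} connections")
```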
generate_model_insights(gnn_content: str, analysis_results: Dict[str, Any] = None) -> Dict[str, Any]
Description: Generate insights from GNN model analysis.
Parameters:
- `gnn_content` (str): GNN content string
- `analysis_results` (Dict[str, Any], optional): Previous analysis results
Returns: Dict[str, Any] - Insights dictionary with complexity, patterns, recommendations
Description: Generate comprehensive documentation for a GNN model using an LLM.
Parameters:
- `gnn_content` (str): GNN content string
- `model_name` (str, optional): Name of the model
Returns: str - Generated documentation as markdown string
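A usage sketch combining the two helpers above. `generate_model_insights` matches the documented signature; the documentation function's name `generate_model_documentation` and both import paths are assumptions:

```python
from pathlib import Path

from llm.processor import (  # import path assumed
    generate_model_insights,
    generate_model_documentation,  # name hypothetical
)

content = Path("input/gnn_files/model.md").read_text()

insights = generate_model_insights(content)
# Documented keys: complexity, patterns, recommendations.
print(insights.get("complexity"), insights.get("recommendations"))

docs_md = generate_model_documentation(content, model_name="model")
Path("output/model_docs.md").write_text(docs_md)
```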
- Ollama — local inference via the `ollama` Python client when functional, else CLI recovery (`ollama chat` JSON mode when supported, else `ollama run`). Default model tag `smollm2:135m-instruct-q4_K_S` (`llm.defaults.DEFAULT_OLLAMA_MODEL`; overridable via `OLLAMA_MODEL`, `OLLAMA_TEST_MODEL`, or `input/config.yaml` `llm.model`).
- OpenAI — cloud API when `OPENAI_API_KEY` is set.
- OpenRouter — when `OPENROUTER_API_KEY` is set.
- Perplexity — when `PERPLEXITY_API_KEY` is set.
| Location | Role |
|---|---|
| `llm/providers/ollama_provider.py` | `OllamaProvider`: initialize, validate_config, generate_response, generate_stream, analyze, close |
| `llm/llm_processor.py` | `LLMProcessor` merges `get_default_provider_configs()` into `provider_configs` so env vars apply even when the caller passes None/{} |
| `llm/processor.py` | `_start_ollama_if_needed`, `_select_best_ollama_model`, `_model_is_cached`; step-13 orchestration and `provider_matrix` |
Model selection (`_select_best_ollama_model`): `OLLAMA_MODEL` or `OLLAMA_TEST_MODEL` → optional `input/config.yaml` `llm.model` if that name matches an installed tag → built-in preference list (smaller models first: `smollm2`, `tinyllama`, `gemma3:4b`, `gemma2:2b`, …) → first `ollama list` entry → `llm.defaults.DEFAULT_OLLAMA_MODEL`.
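A hedged sketch of that precedence (the authoritative logic lives in `llm/processor.py`; this standalone function only mirrors the documented order):

```python
import os

# Preference list mirroring the documented order, smaller models first.
PREFERRED_TAGS = ["smollm2", "tinyllama", "gemma3:4b", "gemma2:2b"]
DEFAULT_OLLAMA_MODEL = "smollm2:135m-instruct-q4_K_S"  # llm.defaults.DEFAULT_OLLAMA_MODEL

def select_best_ollama_model(installed: list[str], config_model: str | None = None) -> str:
    """Resolve an Ollama tag using the precedence described above."""
    # 1. Explicit environment overrides win outright.
    env_tag = os.environ.get("OLLAMA_MODEL") or os.environ.get("OLLAMA_TEST_MODEL")
    if env_tag:
        return env_tag
    # 2. input/config.yaml llm.model, only if it names an installed tag.
    if config_model and config_model in installed:
        return config_model
    # 3. Built-in preference list.
    for preferred in PREFERRED_TAGS:
        for tag in installed:
            if tag.startswith(preferred):
                return tag
    # 4. First `ollama list` entry, else the packaged default.
    return installed[0] if installed else DEFAULT_OLLAMA_MODEL
```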
Request wiring: The tag chosen above is passed to LLMProcessor.get_response as model_name for every structured PromptType prompt and for custom prompts (same value as cache keys). AnalysisType.SUMMARY tasks prefer Ollama first when registered, then OpenAI / OpenRouter / Perplexity, so local runs are not blocked by exhausted cloud quota when a key is still present. For per-file summaries, process_llm passes the resolved tag into analyze_gnn_file_with_llm so it matches the prompt loop. Override defaults with OLLAMA_MODEL or input/config.yaml llm.model. To avoid OpenAI retries when quota is zero, unset OPENAI_API_KEY for local-only runs.
- `LLMProcessor` loads API keys from the environment; Ollama is enabled unless `OLLAMA_DISABLED` is truthy (`1`, `true`).
- Step 13 (`process_llm`) probes the Ollama CLI (`ollama list`) and records status in `provider_matrix.ollama`.
- If the unified processor initializes, prompts use the selected local model; on failure, structured fallbacks and cache still write outputs.
- If no provider works, processing continues with recovery text and logged warnings.
- `provider` (str): LLM provider to use (default: "auto")
  - "auto": Automatically select best available provider
  - "openai": Use OpenAI API (requires OPENAI_API_KEY)
  - "anthropic": Use Anthropic API (requires ANTHROPIC_API_KEY)
  - "ollama": Use local Ollama (requires Ollama installation)
- `analysis_type` (str): Type of analysis to perform (default: "comprehensive")
  - "comprehensive": Full model analysis
  - "summary": Brief summary only
  - "explain": Concept explanations
  - "optimize": Optimization suggestions
- `llm_tasks` (str): Specific tasks to perform (default: "all")
  - "all": All available tasks
  - "summarize": Generate model summary
  - "explain": Explain Active Inference concepts
  - "optimize": Suggest optimizations
- `llm_timeout` (int): Timeout for LLM API calls in seconds (default: 60)
- `max_tokens` (int): Maximum tokens in response (default: 2000)
- `temperature` (float): LLM temperature (default: 0.7)
- `OLLAMA_MODEL`: Model tag for requests (default `llm.defaults.DEFAULT_OLLAMA_MODEL`)
- `OLLAMA_TEST_MODEL`: Overrides model name in tests when set
- `OLLAMA_MAX_TOKENS`: Default `num_predict` cap (default 256 in processor config)
- `OLLAMA_TIMEOUT`: Client/CLI subprocess timeout in seconds (default 60 in env wiring; provider default 30s unless configured)
- `OLLAMA_HOST`: Optional base URL for the Python `ollama` client (empty = client default)
- `OLLAMA_DISABLED`: Set to `1` or `true` to skip registering Ollama as a provider
- `OPENAI_API_KEY`, `OPENROUTER_API_KEY`, `PERPLEXITY_API_KEY`, `ANTHROPIC_API_KEY` (Anthropic appears in summaries/matrix when present)
- `DEFAULT_PROVIDER`: e.g. `ollama` (see `llm_processor.get_preferred_providers_from_env`)
- `json`: Configuration and output
- `pathlib`: File operations
- `openai`: OpenAI API
- `anthropic`: Anthropic API (when used by callers)
- `ollama` (PyPI): Python client; if import fails or `chat` is missing, `OllamaProvider` uses the `ollama` CLI when on PATH
- `utils.pipeline_template`: Logging utilities
- `pipeline.config`: Configuration management
- `llm.processor`: Core LLM logic
from llm import process_llm

success = process_llm(
    target_dir=Path("input/gnn_files"),
    output_dir=Path("output/13_llm_output"),
    logger=logger,
    analysis_type="comprehensive"
)

success = process_llm(
    target_dir=Path("input/gnn_files"),
    output_dir=Path("output/13_llm_output"),
    logger=logger,
    provider="ollama",  # Force Ollama
    llm_tasks="all"
)

- `{model}_llm_analysis.md`: Full analysis report
- `{model}_llm_summary.json`: Structured summary
- `{model}_llm_explanations.md`: Concept explanations
- `llm_processing_summary.json`: Processing summary
output/13_llm_output/
├── model_name_llm_analysis.md
├── model_name_llm_summary.json
├── model_name_llm_explanations.md
└── llm_processing_summary.json
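To inspect a run programmatically, the processing summary can be loaded directly; its schema is not specified here, so treat key access as exploratory:

```python
import json
from pathlib import Path

summary_path = Path("output/13_llm_output/llm_processing_summary.json")
summary = json.loads(summary_path.read_text())

# Dump the top of the document to discover the actual schema.
print(json.dumps(summary, indent=2)[:500])
```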
- Duration: 3.73 seconds
- Memory: 28.8 MB
- Status: SUCCESS
- Provider Used: Ollama (recovery)
Added: Automatic Ollama availability check

# Check if Ollama is available
import subprocess

result = subprocess.run(['ollama', 'list'],
                        capture_output=True, timeout=5)
if result.returncode == 0:
    # Use Ollama as recovery
    ...

- `src/tests/test_llm_overall.py`: Module-level tests
- `src/tests/test_llm_functional.py`: Functional tests
- `src/tests/test_llm_ollama.py`: Ollama-specific tests
- `src/tests/test_llm_ollama_integration.py`: Ollama integration tests
Measure on demand:

uv run pytest src/tests/test_llm*.py \
  --cov=src/llm --cov-report=term-missing

- Ollama detection and availability check
- Model selection and prioritization
- LLM processing with Ollama integration
- Recovery mode when Ollama unavailable
- Error handling and recovery
- Timeout management for LLM calls
Symptom: "Ollama not found in PATH" message
Solution:
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh
# Or download from https://ollama.com

Verification:

ollama --version
which ollama

Symptom: "Ollama is installed but may not be running"
Solution:
# Start Ollama service
ollama serve
# In a separate terminal, verify it's running
ollama list

Alternative: Run Ollama in background

# macOS/Linux
nohup ollama serve > /dev/null 2>&1 &
# Or use system service (if configured)
systemctl start ollama

Symptom: "Ollama is running but no models are installed"
Solution:
# Default pipeline tag (see llm.defaults.DEFAULT_OLLAMA_MODEL)
ollama pull smollm2:135m-instruct-q4_K_S
# Alternates
ollama pull gemma3:4b
ollama pull tinyllama
ollama pull llama2:7b

Suggested models (latency and RAM depend on CPU/GPU — measure locally):
- Default / small instruct: `smollm2:135m-instruct-q4_K_S`
- Balanced: `tinyllama`, `gemma3:4b`
- Larger: `llama2:7b` and up

View Installed Models:

ollama list

Symptom: "Prompt execution timed out" or slow responses
Solution:
- ✅ Automatic: The module now uses adaptive timeouts based on prompt complexity
- Environment variable override:
  export OLLAMA_TIMEOUT=120  # Increase timeout to 120 seconds
- Use a smaller/faster tag:
  export OLLAMA_MODEL=smollm2:135m-instruct-q4_K_S
Performance Tips:
- Use GPU acceleration if available (Ollama detects automatically)
- Close other applications to free memory
- Use smaller models for routine analysis
Symptom: Wrong model being used or "model not found" errors
Solution:
# Override model selection via environment variable
export OLLAMA_MODEL=tinyllama
# Or specify in command
OLLAMA_MODEL=tinyllama python src/13_llm.py --target-dir input/gnn_files

Automatic Selection:
- ✅ `_select_best_ollama_model` picks from installed tags using the ordered preference list in `llm/processor.py` (starts with `smollm2`, `tinyllama`, `gemma3:4b`, `gemma2:2b`, …)

Check Which Model Was Used:

# View LLM results
cat output/13_llm_output/llm_results/llm_results.json | grep "selected_model"

Symptom: "Proceeding with recovery LLM analysis" messages
Explanation: This is expected when Ollama is not available. The module provides basic analysis without live LLM interaction.
Solution (if you want LLM features):
- Install and start Ollama (see issues #1 and #2)
- Install at least one model (see issue #3)
- Re-run the LLM step
Recovery Capabilities:
- ✅ Basic pattern extraction
- ✅ Variable and connection identification
- ✅ Structure analysis
- ❌ No natural language generation
- ❌ No model interpretation
Symptom: Step 13 takes several minutes (3m+ per model)
Causes:
- Large models (llama2:70b, etc.)
- CPU-only inference (no GPU)
- Complex/long prompts
- Multiple GNN files being processed
Solutions:
- Use a smaller tag (see `llm.defaults.DEFAULT_OLLAMA_MODEL`):
  export OLLAMA_MODEL=smollm2:135m-instruct-q4_K_S
- Reduce prompt complexity:
  export OLLAMA_MAX_TOKENS=256  # Shorter responses
- Enable GPU acceleration (if available):
  # Ollama uses the GPU automatically when one is detected; no flag is needed
- Process files individually:
  # Process one file at a time
  python src/13_llm.py --target-dir input/gnn_files --gnn-file specific_model.md
Performance: Measure with your hardware; smaller instruct models are usually faster on CPU.
Symptom: System slowdown or "out of memory" errors
Solution:
- Use smaller models (e.g. the default `smollm2:135m-instruct-q4_K_S`).
- Limit concurrent processing:
  - Process files one at a time
  - Close other applications
- Monitor resource usage:
  # Monitor Ollama memory usage
  ps aux | grep ollama
  htop  # or top
Memory: Check `ollama show <tag>` and a system monitor while loading models.
- Automatic Ollama availability check
- Model listing and validation
- Service health monitoring (port 11434; see the sketch after this list)
- Helpful installation instructions when not found
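A minimal health probe against that port, assuming the standard Ollama `/api/tags` listing route (the module's own check shells out to `ollama list` instead):

```python
import json
import urllib.request

def ollama_healthy(host: str = "http://localhost:11434") -> bool:
    """Return True if the Ollama server answers on its API port."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
            models = json.load(resp).get("models", [])
            print(f"Ollama up with {len(models)} model(s) installed")
            return True
    except OSError:  # covers URLError, ConnectionRefusedError, timeouts
        return False

print(ollama_healthy())
```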
- Prioritizes small, fast models for quick execution
- Automatic recovery chain
- Environment variable override support
- Logs selected model for transparency
- File-by-file progress indicators
- Prompt-by-prompt completion tracking
- Detailed logging with emoji indicators 📝
- Clear success/failure indicators ✅/❌
- Graceful recovery when Ollama unavailable
- Per-prompt error handling
- Timeout protection with retry logic
- Comprehensive error messages
- Install and Start Ollama Before Running:
  # Terminal 1: Start Ollama
  ollama serve
  # Terminal 2: Run pipeline
  python src/main.py --only-steps "13" --verbose
- Use Appropriate Model for Task:
  - Quick / default: `smollm2:135m-instruct-q4_K_S`
  - Balanced: `tinyllama`, `gemma3:4b`
  - Deep analysis: `llama2:7b` and larger
- Monitor Performance:
  # Run with verbose logging
  python src/13_llm.py --verbose --target-dir input/gnn_files
  # Check timing in results
  cat output/13_llm_output/llm_results/llm_results.json
- Optimize for Speed:
  export OLLAMA_MODEL=smollm2:135m-instruct-q4_K_S
  export OLLAMA_MAX_TOKENS=256
  export OLLAMA_TIMEOUT=30
- Check Results Quality:
  # View generated analyses
  cat output/13_llm_output/llm_results/prompts_*/technical_description.md
  cat output/13_llm_output/llm_results/llm_summary.md
# Model selection
export OLLAMA_MODEL=tinyllama  # Override automatic selection
export OLLAMA_TEST_MODEL=smollm2:135m-instruct-q4_K_S  # Test/CI model

# Performance tuning
export OLLAMA_MAX_TOKENS=512  # Maximum response length
export OLLAMA_TIMEOUT=60  # Request timeout (seconds)
export OLLAMA_HOST=http://localhost:11434  # Ollama server URL

# Behavior
export OLLAMA_DISABLED=1  # Skip Ollama provider registration
export DEFAULT_PROVIDER=ollama  # Prefer Ollama when keys allow

# In your code or config
from llm.llm_processor import get_default_provider_configs
configs = get_default_provider_configs()
configs['ollama']['default_model'] = 'my-custom-model'
configs['ollama']['default_max_tokens'] = 1024

- No API Keys: Automatically fall back to Ollama if available
- Ollama Unavailable: Skip LLM analysis, log informative message, continue pipeline
- LLM Timeout: Retry with shorter timeout, then skip if still fails
- Invalid Response: Parse what's possible, log warning
- Provider Unavailable: No API keys and Ollama not available (recovery: skip analysis)
- API Errors: Rate limits, network errors (recovery: retry with backoff; see the sketch after this list)
- Timeout Errors: LLM response too slow (recovery: use faster model or skip)
- Parsing Errors: Invalid LLM response format (recovery: use raw response)
- Automatic Recovery: Try next available provider automatically
- Partial Analysis: Generate what's possible, report failures
- Resource Cleanup: Proper cleanup of LLM connections on errors
- Informative Messages: Clear error messages with recovery suggestions
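A generic sketch of the retry-with-backoff recovery named above (not the module's exact implementation; the exception types to catch depend on the provider client):

```python
import random
import time

def call_with_backoff(call, max_retries: int = 3, base_delay: float = 1.0):
    """Retry a flaky LLM call with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except (TimeoutError, ConnectionError) as exc:
            if attempt == max_retries - 1:
                raise  # Out of retries: surface the error to the caller.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            print(f"LLM call failed ({exc!r}); retrying in {delay:.1f}s")
            time.sleep(delay)
```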
- Input: Receives GNN models from Step 3 (gnn processing) and execution results from Step 12 (execute)
- Output: Generates LLM analyses for Step 16 (analysis), Step 20 (website generation), and Step 23 (report generation)
- Dependencies: Requires GNN parsing results from `3_gnn.py` output; optionally uses execution results from `12_execute.py`
- gnn/: Reads parsed GNN model data for analysis
- execute/: Optionally uses execution results for enhanced analysis
- analysis/: Provides LLM insights for statistical analysis
- report/: Provides LLM-generated summaries for reports
- OpenAI API: Cloud-based LLM analysis
- Anthropic API: Cloud-based LLM analysis
- Ollama: Local LLM execution for privacy and offline use
3_gnn.py (GNN parsing)
↓
12_execute.py (Execution results) [optional]
↓
13_llm.py (LLM analysis)
↓
├→ 16_analysis.py (Enhanced analysis)
├→ 20_website.py (LLM summaries)
├→ 23_report.py (Report generation)
└→ output/13_llm_output/ (Standalone analyses)
Features:
- Multi-provider LLM support (OpenAI, Anthropic, Ollama)
- Automatic Ollama recovery
- Context-aware prompt generation
- Structured output parsing
- Rate limiting and error handling
Known Issues:
- None currently
- Next Version: Enhanced prompt optimization
- Future: Multi-modal LLM support
Last Updated: 2026-04-16
Maintainer: GNN Pipeline Team
Status: ✅ Production Ready
Version: 1.6.0
Architecture Compliance: ✅ 100% Thin Orchestrator Pattern