diff --git a/CLAUDE.md b/CLAUDE.md index 9b91bfeda7d..386368996d9 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -92,6 +92,9 @@ just test-integration # Run a specific integration test case (e.g., "grafted" test case) TEST_CASE=grafted just test-integration + +# Override ports if using different service ports (e.g., for local development) +POSTGRES_TEST_PORT=5432 ETHEREUM_TEST_PORT=8545 IPFS_TEST_PORT=5001 just test-integration ``` **⚠️ Test Verification Requirements:** diff --git a/Cargo.lock b/Cargo.lock index 65a4bfdbeb0..dac8104545d 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -2895,6 +2895,7 @@ dependencies = [ "stable-hash 0.3.4", "stable-hash 0.4.4", "strum_macros", + "tempfile", "thiserror 2.0.17", "tiny-keccak 1.5.0", "tokio", diff --git a/README.md b/README.md index 118a7c8a846..a9d5349a5e5 100644 --- a/README.md +++ b/README.md @@ -114,6 +114,44 @@ Very large `graph-node` instances can also be configured using a the `graph-node` needs to connect to multiple chains or if the work of indexing and querying needs to be split across [multiple databases](./docs/config.md). +#### Log Storage + +`graph-node` supports storing and querying subgraph logs through multiple backends: + +- **File**: Local JSON Lines files (recommended for local development) +- **Elasticsearch**: Enterprise-grade search and analytics (for production) +- **Loki**: Grafana's log aggregation system (for production) +- **Disabled**: No log storage (default) + +**Quick example (file-based logs for local development):** +```bash +mkdir -p ./graph-logs + +cargo run -p graph-node --release -- \ + --postgres-url $POSTGRES_URL \ + --ethereum-rpc mainnet:archive:https://... \ + --ipfs 127.0.0.1:5001 \ + --log-store-backend file \ + --log-store-file-dir ./graph-logs +``` + +Logs are queried via GraphQL at `http://localhost:8000/graphql`: +```graphql +query { + _logs(subgraphId: "QmYourSubgraphHash", level: ERROR, first: 10) { + timestamp + level + text + } +} +``` + +**For complete documentation**, see the **[Log Store Guide](./docs/log-store.md)**, which covers: +- How to configure each backend (Elasticsearch, Loki, File) +- Complete GraphQL query examples +- Choosing the right backend for your use case +- Performance considerations and best practices + ## Contributing Please check [CONTRIBUTING.md](CONTRIBUTING.md) for development flow and conventions we use. diff --git a/docs/environment-variables.md b/docs/environment-variables.md index 560e5fe87a4..56c7883cd54 100644 --- a/docs/environment-variables.md +++ b/docs/environment-variables.md @@ -302,3 +302,55 @@ those. Disabling the store call cache may significantly impact performance; the actual impact depends on the average execution time of an `eth_call` compared to the cost of a database lookup for a cached result. (default: false) + +## Log Store Configuration + +`graph-node` supports storing and querying subgraph logs through multiple backends: Elasticsearch, Loki, local files, or disabled. + +**For complete log store documentation**, including detailed configuration, querying examples, and choosing the right backend, see the **[Log Store Guide](log-store.md)**. 
+ +### Quick Reference + +**Backend selection:** +- `GRAPH_LOG_STORE_BACKEND`: `disabled` (default), `elasticsearch`, `loki`, or `file` + +**Elasticsearch:** +- `GRAPH_LOG_STORE_ELASTICSEARCH_URL`: Elasticsearch endpoint URL (required) +- `GRAPH_LOG_STORE_ELASTICSEARCH_USER`: Username (optional) +- `GRAPH_LOG_STORE_ELASTICSEARCH_PASSWORD`: Password (optional) +- `GRAPH_LOG_STORE_ELASTICSEARCH_INDEX`: Index name (default: `subgraph`) + +**Loki:** +- `GRAPH_LOG_STORE_LOKI_URL`: Loki endpoint URL (required) +- `GRAPH_LOG_STORE_LOKI_TENANT_ID`: Tenant ID (optional) + +**File-based:** +- `GRAPH_LOG_STORE_FILE_DIR`: Log directory (required) +- `GRAPH_LOG_STORE_FILE_MAX_SIZE`: Max file size in bytes (default: 104857600 = 100MB) +- `GRAPH_LOG_STORE_FILE_RETENTION_DAYS`: Retention period (default: 30) + +**Deprecated variables** (will be removed in future versions): +- `GRAPH_ELASTICSEARCH_URL` → use `GRAPH_LOG_STORE_ELASTICSEARCH_URL` +- `GRAPH_ELASTICSEARCH_USER` → use `GRAPH_LOG_STORE_ELASTICSEARCH_USER` +- `GRAPH_ELASTICSEARCH_PASSWORD` → use `GRAPH_LOG_STORE_ELASTICSEARCH_PASSWORD` +- `GRAPH_ELASTIC_SEARCH_INDEX` → use `GRAPH_LOG_STORE_ELASTICSEARCH_INDEX` + +### Example: File-based Logs for Local Development + +```bash +mkdir -p ./graph-logs +export GRAPH_LOG_STORE_BACKEND=file +export GRAPH_LOG_STORE_FILE_DIR=./graph-logs + +graph-node \ + --postgres-url postgresql://graph:pass@localhost/graph-node \ + --ethereum-rpc mainnet:https://... \ + --ipfs 127.0.0.1:5001 +``` + +See the **[Log Store Guide](log-store.md)** for: +- Detailed configuration for all backends +- How log stores work internally +- GraphQL query examples +- Choosing the right backend for your use case +- Best practices and troubleshooting diff --git a/docs/log-store.md b/docs/log-store.md new file mode 100644 index 00000000000..8be1cddecd8 --- /dev/null +++ b/docs/log-store.md @@ -0,0 +1,853 @@ +# Log Store Configuration and Usage + +This guide explains how to configure subgraph indexing logs storage in graph-node. + +## Table of Contents + +- [Overview](#overview) +- [How Log Stores Work](#how-log-stores-work) +- [Log Store Types](#log-store-types) + - [File-based Logs](#file-based-logs) + - [Elasticsearch](#elasticsearch) + - [Loki](#loki) + - [Disabled](#disabled) +- [Configuration](#configuration) + - [Environment Variables](#environment-variables) + - [CLI Arguments](#cli-arguments) + - [Configuration Precedence](#configuration-precedence) +- [Querying Logs](#querying-logs) +- [Migrating from Deprecated Configuration](#migrating-from-deprecated-configuration) +- [Choosing the Right Backend](#choosing-the-right-backend) +- [Best Practices](#best-practices) +- [Troubleshooting](#troubleshooting) + +## Overview + +Graph Node supports multiple logs storage backends for subgraph indexing logs. Subgraph indexing logs include: +- **User-generated logs**: Explicit logging from subgraph mapping code (`log.info()`, `log.error()`, etc.) +- **Runtime logs**: Handler execution, event processing, data source activity +- **System logs**: Warnings, errors, and diagnostics from the indexing system + +**Available backends:** +- **File**: JSON Lines files on local filesystem (for local development) +- **Elasticsearch**: Enterprise-grade search and analytics (for production) +- **Loki**: Grafana's lightweight log aggregation system (for production) +- **Disabled**: No log storage (default) + +All backends share the same query interface through GraphQL, making it easy to switch between them. 
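+For example, switching from file-based logs in local development to Elasticsearch in production is purely a configuration change; the `_logs` query described later in this guide behaves the same against either backend. A minimal sketch (endpoint values are illustrative):
+
+```bash
+# Local development: store logs as JSON Lines files
+export GRAPH_LOG_STORE_BACKEND=file
+export GRAPH_LOG_STORE_FILE_DIR=./graph-logs
+
+# Production: switch backends by changing only these variables
+# export GRAPH_LOG_STORE_BACKEND=elasticsearch
+# export GRAPH_LOG_STORE_ELASTICSEARCH_URL=http://localhost:9200
+```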
+ +**Important Note:** When log storage is disabled (the default), subgraph logs still appear in stdout/stderr as they always have. The "disabled" setting simply means logs are not stored separately in a queryable format. You can still see logs in your terminal or container logs - they just won't be available via the `_logs` GraphQL query. + +## How Log Stores Work + +### Architecture + +``` +┌─────────────────┐ +│ Subgraph Code │ +│ (mappings) │ +└────────┬────────┘ + │ log.info(), log.error(), etc. + ▼ +┌─────────────────┐ +│ Graph Runtime │ +│ (WebAssembly) │ +└────────┬────────┘ + │ Log events + ▼ +┌─────────────────┐ +│ Log Drain │ ◄─── slog-based logging system +└────────┬────────┘ + │ Write + ▼ +┌─────────────────┐ +│ Log Store │ ◄─── Configurable backend +│ (ES/Loki/File) │ +└────────┬────────┘ + │ + ▼ +┌─────────────────┐ +│ GraphQL API │ ◄─── Unified query interface +│ (port 8000) │ +└─────────────────┘ +``` + +### Log Flow + +1. **Log sources** generate logs from: + - User mapping code (explicit `log.info()`, `log.error()`, etc. calls) + - Subgraph runtime (handler execution, event processing, data source triggers) + - System warnings and errors (indexing issues, constraint violations, etc.) +2. **Graph runtime** captures these logs with metadata (timestamp, level, source location) +3. **Log drain** formats logs and writes to configured backend +4. **Log store** persists logs and handles queries +5. **GraphQL API** exposes logs through the `_logs` query + +### Log Entry Structure + +Each log entry contains: +- **`id`**: Unique identifier +- **`subgraphId`**: Deployment hash (QmXxx...) +- **`timestamp`**: ISO 8601 timestamp (e.g., `2024-01-15T10:30:00.123456789Z`) +- **`level`**: CRITICAL, ERROR, WARNING, INFO, or DEBUG +- **`text`**: Log message +- **`arguments`**: Key-value pairs from structured logging +- **`meta`**: Source location (module, line, column) + +## Log Store Types + +### File-based Logs + +**Best for:** Local development, testing + +#### How It Works + +File-based logs store each subgraph's logs in a separate JSON Lines (`.jsonl`) file: + +``` +graph-logs/ +├── QmSubgraph1Hash.jsonl +├── QmSubgraph2Hash.jsonl +└── QmSubgraph3Hash.jsonl +``` + +Each line in the file is a complete JSON object representing one log entry. + +#### Storage Format + +```json +{"id":"QmTest-2024-01-15T10:30:00.123456789Z","subgraphId":"QmTest","timestamp":"2024-01-15T10:30:00.123456789Z","level":"error","text":"Handler execution failed, retries: 3","arguments":[{"key":"retries","value":"3"}],"meta":{"module":"mapping.ts","line":42,"column":10}} +``` + +#### Query Performance + +File-based logs stream through files line-by-line with bounded memory usage. + +**Performance characteristics:** +- Query time: O(n) where n = number of log entries +- Memory usage: O(skip + first) - only matching entries kept in memory +- Suitable for: Development and testing + +#### Configuration + +**Minimum configuration (CLI):** +```bash +graph-node \ + --postgres-url postgresql://graph:pass@localhost/graph-node \ + --ethereum-rpc mainnet:https://... 
\ + --ipfs 127.0.0.1:5001 \ + --log-store-backend file \ + --log-store-file-dir ./graph-logs +``` + +**Full configuration (environment variables):** +```bash +export GRAPH_LOG_STORE_BACKEND=file +export GRAPH_LOG_STORE_FILE_DIR=/var/log/graph-node +export GRAPH_LOG_STORE_FILE_MAX_SIZE=104857600 # 100MB +export GRAPH_LOG_STORE_FILE_RETENTION_DAYS=30 +``` + +#### Features + +**Advantages:** +- No external dependencies +- Simple setup (just specify a directory) +- Human-readable format (JSON Lines) +- Easy to inspect with standard tools (`jq`, `grep`, etc.) +- Good for debugging during development + +**Limitations:** +- Not suitable for production with high log volume +- No indexing (O(n) query time scales with file size) +- No automatic log rotation or retention management +- Single file per subgraph (no sharding) + +#### When to Use + +Use file-based logs when: +- Developing subgraphs locally +- Testing on a development machine +- Running low-traffic subgraphs (< 1000 total logs/day including system logs) +- You want simple log access without external services + +### Elasticsearch + +**Best for:** Production deployments, high log volume, advanced search + +#### How It Works + +Elasticsearch stores logs in indices with full-text search capabilities, making it ideal for production deployments with high log volume. + +**Architecture:** +``` +graph-node → Elasticsearch HTTP API → Elasticsearch cluster + → Index: subgraph-logs-* + → Query DSL for filtering +``` + +#### Features + +**Advantages:** +- **Indexed searching**: Fast queries even with millions of logs +- **Full-text search**: Powerful text search across log messages +- **Scalability**: Handles billions of log entries +- **High availability**: Supports clustering and replication +- **Kibana integration**: Rich visualization and dashboards for operators +- **Time-based indices**: Efficient retention management + +**Considerations:** +- Requires Elasticsearch cluster (infrastructure overhead) +- Resource-intensive (CPU, memory, disk) + +#### Configuration + +**Minimum configuration (CLI):** +```bash +graph-node \ + --postgres-url postgresql://graph:pass@localhost/graph-node \ + --ethereum-rpc mainnet:https://... \ + --ipfs 127.0.0.1:5001 \ + --log-store-backend elasticsearch \ + --log-store-elasticsearch-url http://localhost:9200 +``` + +**Full configuration with authentication:** +```bash +graph-node \ + --postgres-url postgresql://graph:pass@localhost/graph-node \ + --ethereum-rpc mainnet:https://... \ + --ipfs 127.0.0.1:5001 \ + --log-store-backend elasticsearch \ + --log-store-elasticsearch-url https://es.example.com:9200 \ + --log-store-elasticsearch-user elastic \ + --log-store-elasticsearch-password secret \ + --log-store-elasticsearch-index subgraph-logs +``` + +**Environment variables:** +```bash +export GRAPH_LOG_STORE_BACKEND=elasticsearch +export GRAPH_LOG_STORE_ELASTICSEARCH_URL=http://localhost:9200 +export GRAPH_LOG_STORE_ELASTICSEARCH_USER=elastic +export GRAPH_LOG_STORE_ELASTICSEARCH_PASSWORD=secret +export GRAPH_LOG_STORE_ELASTICSEARCH_INDEX=subgraph-logs +``` + +#### Index Configuration + +Logs are stored in the configured index (default: `subgraph`). The index mapping is automatically created. 
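+Once graph-node has written some logs, you can confirm the index was created with a quick check against the cat API. This is only a sanity check, assuming a local unauthenticated cluster; adjust the URL and index name to your setup:
+
+```bash
+# List indices whose name starts with the configured index (default: subgraph)
+curl "http://localhost:9200/_cat/indices/subgraph*?v"
+```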
+ +**Recommended index settings for production:** +```json +{ + "settings": { + "number_of_shards": 3, + "number_of_replicas": 1, + "refresh_interval": "5s" + } +} +``` + +#### Query Performance + +**Performance characteristics:** +- Query time: O(log n) with indexing +- Memory usage: Minimal (server-side filtering) +- Suitable for: Millions to billions of log entries + +#### When to Use + +Use Elasticsearch when: +- Running production deployments +- High log volume +- Need advanced search and filtering +- Want to build dashboards with Kibana +- Need high availability and scalability +- Have DevOps resources to manage Elasticsearch or can set up a managed ElasticSearch deployment + +### Loki + +**Best for:** Production deployments, Grafana users, cost-effective at scale + +#### How It Works + +Loki is Grafana's log aggregation system, designed to be cost-effective and easy to operate. Unlike Elasticsearch, Loki only indexes metadata (not full-text), making it more efficient for time-series log data. + +**Architecture:** +``` +graph-node → Loki HTTP API → Loki + → Stores compressed chunks + → Indexes labels only +``` + +#### Features + +**Advantages:** +- **Cost-effective**: Lower storage costs than Elasticsearch +- **Grafana integration**: Native integration with Grafana +- **Horizontal scalability**: Designed for cloud-native deployments +- **Multi-tenancy**: Built-in tenant isolation +- **Efficient compression**: Optimized for log data +- **LogQL**: Powerful query language similar to PromQL +- **Lower resource usage**: Less CPU/memory than Elasticsearch + +**Considerations:** +- No full-text indexing (slower text searches) +- Best used with Grafana (less tooling than Elasticsearch) +- Younger ecosystem than Elasticsearch +- Query performance depends on label cardinality + +#### Configuration + +**Minimum configuration (CLI):** +```bash +graph-node \ + --postgres-url postgresql://graph:pass@localhost/graph-node \ + --ethereum-rpc mainnet:https://... \ + --ipfs 127.0.0.1:5001 \ + --log-store-backend loki \ + --log-store-loki-url http://localhost:3100 +``` + +**With multi-tenancy:** +```bash +graph-node \ + --postgres-url postgresql://graph:pass@localhost/graph-node \ + --ethereum-rpc mainnet:https://... \ + --ipfs 127.0.0.1:5001 \ + --log-store-backend loki \ + --log-store-loki-url http://localhost:3100 \ + --log-store-loki-tenant-id my-graph-node +``` + +**Environment variables:** +```bash +export GRAPH_LOG_STORE_BACKEND=loki +export GRAPH_LOG_STORE_LOKI_URL=http://localhost:3100 +export GRAPH_LOG_STORE_LOKI_TENANT_ID=my-graph-node +``` + +#### Labels + +Loki uses labels for indexing. Graph Node automatically creates labels: +- `subgraph_id`: Deployment hash +- `level`: Log level +- `job`: "graph-node" + +#### Query Performance + +**Performance characteristics:** +- Query time: O(n) for text searches, O(log n) for label queries +- Memory usage: Minimal (server-side processing) +- Suitable for: Millions to billions of log entries +- Best performance with label-based filtering + +#### When to Use + +Use Loki when: +- Already using Grafana for monitoring +- Need cost-effective log storage at scale +- Want simpler operations than Elasticsearch +- Multi-tenancy is required +- Log volume is very high (> 1M logs/day) +- Full-text search is not critical + +### Disabled + +**Best for:** Minimalist deployments, reduced overhead + +#### How It Works + +When log storage is disabled (the default), subgraph logs are **still written to stdout/stderr** along with all other graph-node logs. 
They are just **not stored separately** in a queryable format. + +**Important:** "Disabled" does NOT mean logs are discarded. It means: +- Logs appear in stdout/stderr (traditional behavior) +- Logs are not stored in a separate queryable backend +- The `_logs` GraphQL query returns empty results + +This is the default behavior - logs continue to work exactly as they did before this feature was added. + +#### Configuration + +**Explicitly disable:** +```bash +export GRAPH_LOG_STORE_BACKEND=disabled +``` + +**Or simply don't configure a backend** (defaults to disabled): +```bash +# No log store configuration = disabled +graph-node \ + --postgres-url postgresql://graph:pass@localhost/graph-node \ + --ethereum-rpc mainnet:https://... \ + --ipfs 127.0.0.1:5001 +``` + +#### Features + +**Advantages:** +- Zero additional overhead +- No external dependencies +- Minimal configuration +- Logs still appear in stdout/stderr for debugging + +**Limitations:** +- Cannot query logs via GraphQL (`_logs` returns empty results) +- No separation of subgraph logs from other graph-node logs in stdout +- Logs mixed with system logs (harder to filter programmatically) +- No structured querying or filtering capabilities + +#### When to Use + +Use disabled log storage when: +- Running minimal test deployments with less dependencies +- Exposing logs to users is not required for your use case +- You'd like subgraph logs sent to external log collection (e.g., container logs) + +## Configuration + +### Environment Variables + +Environment variables are the recommended way to configure log stores, especially in containerized deployments. + +#### Backend Selection + +```bash +GRAPH_LOG_STORE_BACKEND= +``` +Valid values: `disabled`, `elasticsearch`, `loki`, `file` + +#### Elasticsearch + +```bash +GRAPH_LOG_STORE_ELASTICSEARCH_URL=http://localhost:9200 +GRAPH_LOG_STORE_ELASTICSEARCH_USER=elastic # Optional +GRAPH_LOG_STORE_ELASTICSEARCH_PASSWORD=secret # Optional +GRAPH_LOG_STORE_ELASTICSEARCH_INDEX=subgraph # Default: "subgraph" +``` + +#### Loki + +```bash +GRAPH_LOG_STORE_LOKI_URL=http://localhost:3100 +GRAPH_LOG_STORE_LOKI_TENANT_ID=my-tenant # Optional +``` + +#### File + +```bash +GRAPH_LOG_STORE_FILE_DIR=/var/log/graph-node +GRAPH_LOG_STORE_FILE_MAX_SIZE=104857600 # Default: 100MB +GRAPH_LOG_STORE_FILE_RETENTION_DAYS=30 # Default: 30 +``` + +### CLI Arguments + +CLI arguments provide the same functionality as environment variables and the two can be mixed together. + +#### Backend Selection + +```bash +--log-store-backend +``` + +#### Elasticsearch + +```bash +--log-store-elasticsearch-url +--log-store-elasticsearch-user +--log-store-elasticsearch-password +--log-store-elasticsearch-index +``` + +#### Loki + +```bash +--log-store-loki-url +--log-store-loki-tenant-id +``` + +#### File + +```bash +--log-store-file-dir +--log-store-file-max-size +--log-store-file-retention-days +``` + +### Configuration Precedence + +When multiple configuration methods are used: + +1. **CLI arguments** take highest precedence +2. **Environment variables** are used if no CLI args provided +3. **Defaults** are used if neither is set + +## Querying Logs + +All log backends share the same GraphQL query interface. Logs are queried through the subgraph-specific GraphQL endpoint: + +- **Subgraph by deployment**: `http://localhost:8000/subgraphs/id/` +- **Subgraph by name**: `http://localhost:8000/subgraphs/name/` + +The `_logs` query is automatically scoped to the subgraph in the URL, so you don't need to pass a `subgraphId` parameter. 
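+For example, both endpoint forms accept the same query. A sketch using the name-based endpoint (the subgraph name `example/my-subgraph` is hypothetical):
+
+```bash
+curl -X POST http://localhost:8000/subgraphs/name/example/my-subgraph \
+  -H "Content-Type: application/json" \
+  -d '{"query": "{ _logs(first: 5) { timestamp level text } }"}'
+```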
+ +**Note**: Queries return all log types - both user-generated logs from mapping code and system-generated runtime logs (handler execution, events, warnings, etc.). Use the `search` filter to search for specific messages, or `level` to filter by severity. + +### Basic Query + +Query the `_logs` field at your subgraph's GraphQL endpoint: + +```graphql +query { + _logs( + first: 100 + ) { + id + timestamp + level + text + } +} +``` + +**Example endpoint**: `http://localhost:8000/subgraphs/id/QmYourDeploymentHash` + +### Query with Filters + +```graphql +query { + _logs( + level: ERROR + from: "2024-01-01T00:00:00Z" + to: "2024-01-31T23:59:59Z" + search: "timeout" + first: 50 + skip: 0 + ) { + id + timestamp + level + text + arguments { + key + value + } + meta { + module + line + column + } + } +} +``` + +### Available Filters + +| Filter | Type | Description | +|--------|------|-------------| +| `level` | LogLevel | Filter by level: CRITICAL, ERROR, WARNING, INFO, DEBUG | +| `from` | String | Start timestamp (ISO 8601) | +| `to` | String | End timestamp (ISO 8601) | +| `search` | String | Case-insensitive substring search in log messages | +| `first` | Int | Number of results to return (default: 100, max: 1000) | +| `skip` | Int | Number of results to skip for pagination (max: 10000) | + +### Response Fields + +| Field | Type | Description | +|-------|------|-------------| +| `id` | String | Unique log entry ID | +| `timestamp` | String | ISO 8601 timestamp with nanosecond precision | +| `level` | LogLevel | Log level (CRITICAL, ERROR, WARNING, INFO, DEBUG) | +| `text` | String | Complete log message with arguments | +| `arguments` | [(String, String)] | Structured key-value pairs | +| `meta.module` | String | Source file name | +| `meta.line` | Int | Line number | +| `meta.column` | Int | Column number | + +### Query Examples + +#### Recent Errors + +```graphql +query RecentErrors { + _logs( + level: ERROR + first: 20 + ) { + timestamp + text + meta { + module + line + } + } +} +``` + +#### Search for Specific Text + +```graphql +query SearchTimeout { + _logs( + search: "timeout" + first: 50 + ) { + timestamp + level + text + } +} +``` + +#### Handler Execution Logs + +```graphql +query HandlerLogs { + _logs( + search: "handler" + first: 50 + ) { + timestamp + level + text + } +} +``` + +#### Time Range Query + +```graphql +query LogsInRange { + _logs( + from: "2024-01-15T00:00:00Z" + to: "2024-01-15T23:59:59Z" + first: 1000 + ) { + timestamp + level + text + } +} +``` + +#### Pagination + +```graphql +# First page +query Page1 { + _logs( + first: 100 + skip: 0 + ) { + id + text + } +} + +# Second page +query Page2 { + _logs( + first: 100 + skip: 100 + ) { + id + text + } +} +``` + +### Querying the logs store using cURL + +```bash +curl -X POST http://localhost:8000/subgraphs/id/ \ + -H "Content-Type: application/json" \ + -d '{ + "query": "{ _logs(level: ERROR, first: 10) { timestamp level text } }" + }' +``` + +### Performance Considerations + +**File-based:** _for development only_ +- Streams through files line-by-line (bounded memory usage) +- Memory usage limited to O(skip + first) entries +- Query time is O(n) where n = total log entries in file + +**Elasticsearch:** +- Indexed queries are fast regardless of size +- Text searches are optimized with full-text indexing +- Can handle billions of log entries +- Best for production with high query volume + +**Loki:** +- Label-based queries are fast (indexed) +- Text searches scan compressed chunks (slower than Elasticsearch) +- Good 
performance with proper label filtering +- Best for production with Grafana integration + +## Choosing the Right Backend + +### Decision Matrix + +| Scenario | Recommended Backend | Reason | +|----------|-------------------|-----------------------------------------------------------------------------------| +| Local development | **File** | Simple, no dependencies, easy to inspect | +| Testing/staging | **File** or **Elasticsearch** | File for simplicity, ES if testing production config | +| Production | **Elasticsearch** or **Loki** | Both handle scale well | +| Using Grafana | **Loki** | Native integration | +| Cost-sensitive at scale | **Loki** | Lower storage costs | +| Want rich ecosystem | **Elasticsearch** | More tools and plugins | +| Minimal deployment | **Disabled** | No overhead | + +### Resource Requirements + +#### File-based +- **Disk**: Minimal (log files only) +- **Memory**: Depends on file size during queries +- **CPU**: Minimal +- **Network**: None +- **External services**: None + +#### Elasticsearch +- **Disk**: High (indices + replicas) +- **Memory**: 4-8GB minimum for small deployments +- **CPU**: Medium to high +- **Network**: HTTP API calls +- **External services**: Elasticsearch cluster + +#### Loki +- **Disk**: Medium (compressed chunks) +- **Memory**: 2-4GB minimum +- **CPU**: Low to medium +- **Network**: HTTP API calls +- **External services**: Loki server + +## Best Practices + +### General + +1. **Start with file-based for development** - Simplest setup, easy debugging +2. **Use Elasticsearch or Loki for production** - Better performance and features +3. **Monitor log volume** - Set up alerts if log volume grows unexpectedly (includes both user logs and system-generated runtime logs) +4. **Set retention policies** - Don't keep logs forever (disk space and cost) +5. **Use structured logging** - Pass key-value pairs to log functions for better filtering + +### File-based Logs + +1. **Monitor file size** - While queries use bounded memory, larger files take longer to scan (O(n) query time) +2. **Archive old logs** - Manually archive/delete old files or implement external rotation +3. **Monitor disk usage** - Files can grow quickly with verbose logging +4. **Use JSON tools** - `jq` is excellent for inspecting .jsonl files locally + +**Example local inspection:** +```bash +# Count logs by level +cat graph-logs/QmExample.jsonl | jq -r '.level' | sort | uniq -c + +# Find errors in last 1000 lines +tail -n 1000 graph-logs/QmExample.jsonl | jq 'select(.level == "error")' + +# Search for specific text +cat graph-logs/QmExample.jsonl | jq 'select(.text | contains("timeout"))' +``` + +### Elasticsearch + +1. **Use index patterns** - Time-based indices for easier management +2. **Configure retention** - Use Index Lifecycle Management (ILM) +3. **Monitor cluster health** - Set up Elasticsearch monitoring +4. **Tune for your workload** - Adjust shards/replicas based on log volume +5. **Use Kibana** - Visualize and explore logs effectively + +**Example Elasticsearch retention policy:** +```json +{ + "policy": "graph-logs-policy", + "phases": { + "hot": { "min_age": "0ms", "actions": {} }, + "warm": { "min_age": "7d", "actions": {} }, + "delete": { "min_age": "30d", "actions": { "delete": {} } } + } +} +``` + +### Loki + +1. **Use proper labels** - Don't over-index, keep label cardinality low +2. **Configure retention** - Set retention period in Loki config +3. **Use Grafana** - Native integration provides best experience +4. 
**Compress efficiently** - Loki's compression works best with batch writes +5. **Multi-tenancy** - Use tenant IDs if running multiple environments + +**Example Grafana query:** +```logql +{subgraph_id="QmExample", level="error"} |= "timeout" +``` + +## Troubleshooting + +### File-based Logs + +**Problem: Log file doesn't exist** +- Check `GRAPH_LOG_STORE_FILE_DIR` is set correctly +- Verify directory is writable by graph-node + +**Problem: Queries are slow** +- Subgraph logs file may be very large +- Consider archiving old logs or implementing retention +- For high-volume production use, switch to Elasticsearch or Loki + +**Problem: Disk filling up** +- Implement log rotation +- Reduce log verbosity in subgraph code +- Set up monitoring for disk usage + +### Elasticsearch + +**Problem: Cannot connect to Elasticsearch** +- Verify `GRAPH_LOG_STORE_ELASTICSEARCH_URL` is correct +- Check Elasticsearch is running: `curl http://localhost:9200` +- Verify authentication credentials if using security features +- Check network connectivity and firewall rules + +**Problem: No logs appearing in Elasticsearch** +- Check Elasticsearch cluster health +- Verify index exists: `curl http://localhost:9200/_cat/indices` +- Check graph-node logs for write errors +- Verify Elasticsearch has disk space + +**Problem: Queries are slow** +- Check Elasticsearch cluster health and resources +- Verify indices are not over-sharded +- Consider adding replicas for query performance +- Review query patterns and add appropriate indices + +### Loki + +**Problem: Cannot connect to Loki** +- Verify `GRAPH_LOG_STORE_LOKI_URL` is correct +- Check Loki is running: `curl http://localhost:3100/ready` +- Verify tenant ID if using multi-tenancy +- Check network connectivity + +**Problem: No logs appearing in Loki** +- Check Loki service health +- Verify Loki has disk space for chunks +- Check graph-node logs for write errors +- Verify Loki retention settings aren't deleting logs immediately + +**Problem: Queries return no results in Grafana** +- Check label selectors match what graph-node is sending +- Verify time range includes when logs were written +- Check Loki retention period +- Verify tenant ID matches if using multi-tenancy + +## Further Reading + +- [Environment Variables Reference](environment-variables.md) +- [Graph Node Configuration](config.md) +- [Elasticsearch Documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html) +- [Grafana Loki Documentation](https://grafana.com/docs/loki/latest/) diff --git a/graph/Cargo.toml b/graph/Cargo.toml index 33cfbd40eb0..4dac56f40fa 100644 --- a/graph/Cargo.toml +++ b/graph/Cargo.toml @@ -108,6 +108,7 @@ clap.workspace = true maplit = "1.0.2" hex-literal = "1.1" wiremock = "0.6.5" +tempfile = "3.8" [build-dependencies] tonic-build = { workspace = true } diff --git a/graph/src/components/log_store/config.rs b/graph/src/components/log_store/config.rs new file mode 100644 index 00000000000..84118ca4864 --- /dev/null +++ b/graph/src/components/log_store/config.rs @@ -0,0 +1,216 @@ +use slog::{warn, Logger}; +use std::env; + +/// Read environment variable with fallback to deprecated key +/// +/// This helper function implements backward compatibility for environment variables. +/// It first tries the new key, then falls back to the old (deprecated) key with a warning. 
+/// +/// # Arguments +/// * `logger` - Logger for emitting deprecation warnings +/// * `new_key` - The new environment variable name +/// * `old_key` - The deprecated environment variable name +/// +/// # Returns +/// The value of the environment variable if found, or None if neither key is set +pub fn read_env_with_fallback(logger: &Logger, new_key: &str, old_key: &str) -> Option { + // Try new key first + if let Ok(value) = env::var(new_key) { + return Some(value); + } + + // Fall back to old key with deprecation warning + if let Ok(value) = env::var(old_key) { + warn!( + logger, + "Using deprecated environment variable '{}', please use '{}' instead", old_key, new_key + ); + return Some(value); + } + + None +} + +/// Read environment variable with default value and fallback +/// +/// Similar to `read_env_with_fallback`, but returns a default value if neither key is set. +/// +/// # Arguments +/// * `logger` - Logger for emitting deprecation warnings +/// * `new_key` - The new environment variable name +/// * `old_key` - The deprecated environment variable name +/// * `default` - Default value to return if neither key is set +/// +/// # Returns +/// The value of the environment variable, or the default if neither key is set +pub fn read_env_with_default( + logger: &Logger, + new_key: &str, + old_key: &str, + default: &str, +) -> String { + read_env_with_fallback(logger, new_key, old_key).unwrap_or_else(|| default.to_string()) +} + +/// Parse u64 from environment variable with fallback +/// +/// Reads an environment variable with fallback support and parses it as a u64. +/// Returns the default value if the variable is not set or cannot be parsed. +/// +/// # Arguments +/// * `logger` - Logger for emitting deprecation warnings +/// * `new_key` - The new environment variable name +/// * `old_key` - The deprecated environment variable name +/// * `default` - Default value to return if parsing fails or neither key is set +/// +/// # Returns +/// The parsed u64 value, or the default if parsing fails or neither key is set +pub fn read_u64_with_fallback(logger: &Logger, new_key: &str, old_key: &str, default: u64) -> u64 { + read_env_with_fallback(logger, new_key, old_key) + .and_then(|s| s.parse().ok()) + .unwrap_or(default) +} + +/// Parse u32 from environment variable with fallback +/// +/// Reads an environment variable with fallback support and parses it as a u32. +/// Returns the default value if the variable is not set or cannot be parsed. 
+/// +/// # Arguments +/// * `logger` - Logger for emitting deprecation warnings +/// * `new_key` - The new environment variable name +/// * `old_key` - The deprecated environment variable name +/// * `default` - Default value to return if parsing fails or neither key is set +/// +/// # Returns +/// The parsed u32 value, or the default if parsing fails or neither key is set +pub fn read_u32_with_fallback(logger: &Logger, new_key: &str, old_key: &str, default: u32) -> u32 { + read_env_with_fallback(logger, new_key, old_key) + .and_then(|s| s.parse().ok()) + .unwrap_or(default) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_read_new_key_takes_precedence() { + let logger = crate::log::logger(true); + std::env::set_var("NEW_KEY_PRECEDENCE", "new_value"); + std::env::set_var("OLD_KEY_PRECEDENCE", "old_value"); + + let result = read_env_with_fallback(&logger, "NEW_KEY_PRECEDENCE", "OLD_KEY_PRECEDENCE"); + assert_eq!(result, Some("new_value".to_string())); + + std::env::remove_var("NEW_KEY_PRECEDENCE"); + std::env::remove_var("OLD_KEY_PRECEDENCE"); + } + + #[test] + fn test_read_old_key_when_new_not_present() { + let logger = crate::log::logger(true); + std::env::remove_var("NEW_KEY_FALLBACK"); + std::env::set_var("OLD_KEY_FALLBACK", "old_value"); + + let result = read_env_with_fallback(&logger, "NEW_KEY_FALLBACK", "OLD_KEY_FALLBACK"); + assert_eq!(result, Some("old_value".to_string())); + + std::env::remove_var("OLD_KEY_FALLBACK"); + } + + #[test] + fn test_read_returns_none_when_neither_present() { + let logger = crate::log::logger(true); + std::env::remove_var("NEW_KEY_NONE"); + std::env::remove_var("OLD_KEY_NONE"); + + let result = read_env_with_fallback(&logger, "NEW_KEY_NONE", "OLD_KEY_NONE"); + assert_eq!(result, None); + } + + #[test] + fn test_read_with_default() { + let logger = crate::log::logger(true); + std::env::remove_var("NEW_KEY_DEFAULT"); + std::env::remove_var("OLD_KEY_DEFAULT"); + + let result = read_env_with_default( + &logger, + "NEW_KEY_DEFAULT", + "OLD_KEY_DEFAULT", + "default_value", + ); + assert_eq!(result, "default_value"); + + std::env::remove_var("NEW_KEY_DEFAULT"); + std::env::remove_var("OLD_KEY_DEFAULT"); + } + + #[test] + fn test_read_u64_with_fallback() { + let logger = crate::log::logger(true); + std::env::set_var("NEW_KEY_U64", "12345"); + + let result = read_u64_with_fallback(&logger, "NEW_KEY_U64", "OLD_KEY_U64", 999); + assert_eq!(result, 12345); + + std::env::remove_var("NEW_KEY_U64"); + + // Test with old key + std::env::set_var("OLD_KEY_U64", "67890"); + let result = read_u64_with_fallback(&logger, "NEW_KEY_U64", "OLD_KEY_U64", 999); + assert_eq!(result, 67890); + + std::env::remove_var("OLD_KEY_U64"); + + // Test with default + let result = read_u64_with_fallback(&logger, "NEW_KEY_U64", "OLD_KEY_U64", 999); + assert_eq!(result, 999); + } + + #[test] + fn test_read_u32_with_fallback() { + let logger = crate::log::logger(true); + std::env::set_var("NEW_KEY_U32", "123"); + + let result = read_u32_with_fallback(&logger, "NEW_KEY_U32", "OLD_KEY_U32", 999); + assert_eq!(result, 123); + + std::env::remove_var("NEW_KEY_U32"); + + // Test with old key + std::env::set_var("OLD_KEY_U32", "456"); + let result = read_u32_with_fallback(&logger, "NEW_KEY_U32", "OLD_KEY_U32", 999); + assert_eq!(result, 456); + + std::env::remove_var("OLD_KEY_U32"); + + // Test with default + let result = read_u32_with_fallback(&logger, "NEW_KEY_U32", "OLD_KEY_U32", 999); + assert_eq!(result, 999); + } + + #[test] + fn test_invalid_u64_uses_default() { + let 
logger = crate::log::logger(true); + std::env::set_var("NEW_KEY_INVALID", "not_a_number"); + + let result = read_u64_with_fallback(&logger, "NEW_KEY_INVALID", "OLD_KEY_INVALID", 999); + assert_eq!(result, 999); + + std::env::remove_var("NEW_KEY_INVALID"); + } + + #[test] + fn test_invalid_u32_uses_default() { + let logger = crate::log::logger(true); + std::env::set_var("NEW_KEY_INVALID_U32", "not_a_number"); + + let result = + read_u32_with_fallback(&logger, "NEW_KEY_INVALID_U32", "OLD_KEY_INVALID_U32", 999); + assert_eq!(result, 999); + + std::env::remove_var("NEW_KEY_INVALID_U32"); + } +} diff --git a/graph/src/components/log_store/elasticsearch.rs b/graph/src/components/log_store/elasticsearch.rs new file mode 100644 index 00000000000..1b3a30e1019 --- /dev/null +++ b/graph/src/components/log_store/elasticsearch.rs @@ -0,0 +1,217 @@ +use async_trait::async_trait; +use reqwest::Client; +use serde::Deserialize; +use serde_json::json; +use std::collections::HashMap; +use std::time::Duration; + +use crate::log::elastic::ElasticLoggingConfig; +use crate::prelude::DeploymentHash; + +use super::{LogEntry, LogMeta, LogQuery, LogStore, LogStoreError}; + +pub struct ElasticsearchLogStore { + endpoint: String, + username: Option, + password: Option, + client: Client, + index: String, + timeout: Duration, +} + +impl ElasticsearchLogStore { + pub fn new(config: ElasticLoggingConfig, index: String, timeout: Duration) -> Self { + Self { + endpoint: config.endpoint, + username: config.username, + password: config.password, + client: config.client, + index, + timeout, + } + } + + fn build_query(&self, query: &LogQuery) -> serde_json::Value { + let mut must_clauses = Vec::new(); + + // Filter by subgraph ID + must_clauses.push(json!({ + "term": { + "subgraphId": query.subgraph_id.to_string() + } + })); + + // Filter by log level + if let Some(level) = &query.level { + must_clauses.push(json!({ + "term": { + "level": level.as_str() + } + })); + } + + // Filter by time range + if query.from.is_some() || query.to.is_some() { + let mut range = serde_json::Map::new(); + if let Some(from) = &query.from { + range.insert("gte".to_string(), json!(from)); + } + if let Some(to) = &query.to { + range.insert("lte".to_string(), json!(to)); + } + must_clauses.push(json!({ + "range": { + "timestamp": range + } + })); + } + + // Filter by text search + if let Some(search) = &query.search { + must_clauses.push(json!({ + "match": { + "text": search + } + })); + } + + json!({ + "query": { + "bool": { + "must": must_clauses + } + }, + "from": query.skip, + "size": query.first, + "sort": [ + { "timestamp": { "order": "desc" } } + ] + }) + } + + async fn execute_search( + &self, + query_body: serde_json::Value, + ) -> Result, LogStoreError> { + let url = format!("{}/{}/_search", self.endpoint, self.index); + + let mut request = self + .client + .post(&url) + .json(&query_body) + .timeout(self.timeout); + + // Add basic auth if credentials provided + if let (Some(username), Some(password)) = (&self.username, &self.password) { + request = request.basic_auth(username, Some(password)); + } + + let response = request.send().await.map_err(|e| { + LogStoreError::QueryFailed( + anyhow::Error::from(e).context("Elasticsearch request failed"), + ) + })?; + + if !response.status().is_success() { + let status = response.status(); + // Include response body in error context for debugging + // The body is part of the error chain but not the main error message to avoid + // leaking sensitive Elasticsearch internals in logs + let body_text = 
response + .text() + .await + .unwrap_or_else(|_| "".to_string()); + return Err(LogStoreError::QueryFailed( + anyhow::anyhow!("Elasticsearch query failed with status {}", status) + .context(format!("Response body: {}", body_text)), + )); + } + + let response_body: ElasticsearchResponse = response.json().await.map_err(|e| { + LogStoreError::QueryFailed( + anyhow::Error::from(e).context( + "failed to parse Elasticsearch search response: response format may have changed or be invalid", + ), + ) + })?; + + let entries = response_body + .hits + .hits + .into_iter() + .filter_map(|hit| self.parse_log_entry(hit.source)) + .collect(); + + Ok(entries) + } + + fn parse_log_entry(&self, source: ElasticsearchLogDocument) -> Option { + let level = source.level.parse().ok()?; + let subgraph_id = DeploymentHash::new(&source.subgraph_id).ok()?; + + // Convert arguments HashMap to Vec<(String, String)> + let arguments: Vec<(String, String)> = source.arguments.into_iter().collect(); + + Some(LogEntry { + id: source.id, + subgraph_id, + timestamp: source.timestamp, + level, + text: source.text, + arguments, + meta: LogMeta { + module: source.meta.module, + line: source.meta.line, + column: source.meta.column, + }, + }) + } +} + +#[async_trait] +impl LogStore for ElasticsearchLogStore { + async fn query_logs(&self, query: LogQuery) -> Result, LogStoreError> { + let query_body = self.build_query(&query); + self.execute_search(query_body).await + } + + fn is_available(&self) -> bool { + true + } +} + +// Elasticsearch response types +#[derive(Debug, Deserialize)] +struct ElasticsearchResponse { + hits: ElasticsearchHits, +} + +#[derive(Debug, Deserialize)] +struct ElasticsearchHits { + hits: Vec, +} + +#[derive(Debug, Deserialize)] +struct ElasticsearchHit { + #[serde(rename = "_source")] + source: ElasticsearchLogDocument, +} + +#[derive(Debug, Deserialize)] +struct ElasticsearchLogDocument { + id: String, + #[serde(rename = "subgraphId")] + subgraph_id: String, + timestamp: String, + level: String, + text: String, + arguments: HashMap, + meta: ElasticsearchLogMeta, +} + +#[derive(Debug, Deserialize)] +struct ElasticsearchLogMeta { + module: String, + line: i64, + column: i64, +} diff --git a/graph/src/components/log_store/file.rs b/graph/src/components/log_store/file.rs new file mode 100644 index 00000000000..af40ce41329 --- /dev/null +++ b/graph/src/components/log_store/file.rs @@ -0,0 +1,363 @@ +use async_trait::async_trait; +use serde::{Deserialize, Serialize}; +use std::cmp::Reverse; +use std::collections::BinaryHeap; +use std::fs::File; +use std::io::{BufRead, BufReader}; +use std::path::PathBuf; + +use crate::prelude::DeploymentHash; + +use super::{LogEntry, LogMeta, LogQuery, LogStore, LogStoreError}; + +pub struct FileLogStore { + directory: PathBuf, + // TODO: Implement log rotation when file exceeds max_file_size + #[allow(dead_code)] + max_file_size: u64, + // TODO: Implement automatic cleanup of logs older than retention_days + #[allow(dead_code)] + retention_days: u32, +} + +impl FileLogStore { + pub fn new( + directory: PathBuf, + max_file_size: u64, + retention_days: u32, + ) -> Result { + // Create directory if it doesn't exist + std::fs::create_dir_all(&directory) + .map_err(|e| LogStoreError::InitializationFailed(e.into()))?; + + Ok(Self { + directory, + max_file_size, + retention_days, + }) + } + + /// Get log file path for a subgraph + fn log_file_path(&self, subgraph_id: &DeploymentHash) -> PathBuf { + self.directory.join(format!("{}.jsonl", subgraph_id)) + } + + /// Parse a JSON line 
into a LogEntry + fn parse_line(&self, line: &str) -> Option { + let doc: FileLogDocument = serde_json::from_str(line).ok()?; + + let level = doc.level.parse().ok()?; + let subgraph_id = DeploymentHash::new(&doc.subgraph_id).ok()?; + + Some(LogEntry { + id: doc.id, + subgraph_id, + timestamp: doc.timestamp, + level, + text: doc.text, + arguments: doc.arguments, + meta: LogMeta { + module: doc.meta.module, + line: doc.meta.line, + column: doc.meta.column, + }, + }) + } + + /// Check if an entry matches the query filters + fn matches_filters(&self, entry: &LogEntry, query: &LogQuery) -> bool { + // Level filter + if let Some(level) = query.level { + if entry.level != level { + return false; + } + } + + // Time range filters + if let Some(ref from) = query.from { + if entry.timestamp < *from { + return false; + } + } + + if let Some(ref to) = query.to { + if entry.timestamp > *to { + return false; + } + } + + // Text search (case-insensitive) + if let Some(ref search) = query.search { + if !entry.text.to_lowercase().contains(&search.to_lowercase()) { + return false; + } + } + + true + } +} + +/// Helper struct to enable timestamp-based comparisons for BinaryHeap +/// Implements Ord based on timestamp field for maintaining a min-heap of recent entries +struct TimestampedEntry { + entry: LogEntry, +} + +impl PartialEq for TimestampedEntry { + fn eq(&self, other: &Self) -> bool { + self.entry.timestamp == other.entry.timestamp + } +} + +impl Eq for TimestampedEntry {} + +impl PartialOrd for TimestampedEntry { + fn partial_cmp(&self, other: &Self) -> Option { + Some(self.cmp(other)) + } +} + +impl Ord for TimestampedEntry { + fn cmp(&self, other: &Self) -> std::cmp::Ordering { + self.entry.timestamp.cmp(&other.entry.timestamp) + } +} + +#[async_trait] +impl LogStore for FileLogStore { + async fn query_logs(&self, query: LogQuery) -> Result, LogStoreError> { + let file_path = self.log_file_path(&query.subgraph_id); + + if !file_path.exists() { + return Ok(vec![]); + } + + let file = File::open(&file_path).map_err(|e| LogStoreError::QueryFailed(e.into()))?; + let reader = BufReader::new(file); + + // Calculate how many entries we need to keep in memory + // We need skip + first entries to handle pagination + let needed_entries = (query.skip + query.first) as usize; + + // Use a min-heap (via Reverse) to maintain only the top N most recent entries + // This bounds memory usage to O(skip + first) instead of O(total_log_entries) + let mut top_entries: BinaryHeap> = + BinaryHeap::with_capacity(needed_entries + 1); + + // Stream through the file line-by-line, applying filters and maintaining bounded collection + for line in reader.lines() { + // Skip malformed lines + let line = match line { + Ok(l) => l, + Err(_) => continue, + }; + + // Parse the line into a LogEntry + let entry = match self.parse_line(&line) { + Some(e) => e, + None => continue, + }; + + // Apply filters early to avoid keeping filtered-out entries in memory + if !self.matches_filters(&entry, &query) { + continue; + } + + let timestamped = TimestampedEntry { entry }; + + // Maintain only the top N most recent entries by timestamp + // BinaryHeap with Reverse creates a min-heap, so we can efficiently + // keep the N largest (most recent) timestamps + if top_entries.len() < needed_entries { + top_entries.push(Reverse(timestamped)); + } else if let Some(Reverse(oldest)) = top_entries.peek() { + // If this entry is more recent than the oldest in our heap, replace it + if timestamped.entry.timestamp > oldest.entry.timestamp { + 
top_entries.pop(); + top_entries.push(Reverse(timestamped)); + } + } + } + + // Convert heap to sorted vector (most recent first) + let mut result: Vec = top_entries + .into_iter() + .map(|Reverse(te)| te.entry) + .collect(); + + // Sort by timestamp descending (most recent first) + result.sort_by(|a, b| b.timestamp.cmp(&a.timestamp)); + + // Apply skip and take to get the final page + Ok(result + .into_iter() + .skip(query.skip as usize) + .take(query.first as usize) + .collect()) + } + + fn is_available(&self) -> bool { + self.directory.exists() && self.directory.is_dir() + } +} + +// File log document format (JSON Lines) +#[derive(Debug, Serialize, Deserialize)] +struct FileLogDocument { + id: String, + #[serde(rename = "subgraphId")] + subgraph_id: String, + timestamp: String, + level: String, + text: String, + arguments: Vec<(String, String)>, + meta: FileLogMeta, +} + +#[derive(Debug, Serialize, Deserialize)] +struct FileLogMeta { + module: String, + line: i64, + column: i64, +} + +#[cfg(test)] +mod tests { + use super::super::LogLevel; + use super::*; + use std::io::Write; + use tempfile::TempDir; + + #[test] + fn test_file_log_store_initialization() { + let temp_dir = TempDir::new().unwrap(); + let store = FileLogStore::new(temp_dir.path().to_path_buf(), 1024 * 1024, 30); + assert!(store.is_ok()); + + let store = store.unwrap(); + assert!(store.is_available()); + } + + #[test] + fn test_log_file_path() { + let temp_dir = TempDir::new().unwrap(); + let store = FileLogStore::new(temp_dir.path().to_path_buf(), 1024 * 1024, 30).unwrap(); + + let subgraph_id = DeploymentHash::new("QmTest").unwrap(); + let path = store.log_file_path(&subgraph_id); + + assert_eq!(path, temp_dir.path().join("QmTest.jsonl")); + } + + #[tokio::test] + async fn test_query_nonexistent_file() { + let temp_dir = TempDir::new().unwrap(); + let store = FileLogStore::new(temp_dir.path().to_path_buf(), 1024 * 1024, 30).unwrap(); + + let query = LogQuery { + subgraph_id: DeploymentHash::new("QmNonexistent").unwrap(), + level: None, + from: None, + to: None, + search: None, + first: 100, + skip: 0, + }; + + let result = store.query_logs(query).await; + assert!(result.is_ok()); + assert_eq!(result.unwrap().len(), 0); + } + + #[tokio::test] + async fn test_query_with_sample_data() { + let temp_dir = TempDir::new().unwrap(); + let store = FileLogStore::new(temp_dir.path().to_path_buf(), 1024 * 1024, 30).unwrap(); + + let subgraph_id = DeploymentHash::new("QmTest").unwrap(); + let file_path = store.log_file_path(&subgraph_id); + + // Write some test data + let mut file = File::create(&file_path).unwrap(); + let log_entry = FileLogDocument { + id: "log-1".to_string(), + subgraph_id: "QmTest".to_string(), + timestamp: "2024-01-15T10:30:00Z".to_string(), + level: "error".to_string(), + text: "Test error message".to_string(), + arguments: vec![], + meta: FileLogMeta { + module: "test.ts".to_string(), + line: 42, + column: 10, + }, + }; + writeln!(file, "{}", serde_json::to_string(&log_entry).unwrap()).unwrap(); + + // Query + let query = LogQuery { + subgraph_id, + level: None, + from: None, + to: None, + search: None, + first: 100, + skip: 0, + }; + + let result = store.query_logs(query).await; + assert!(result.is_ok()); + + let entries = result.unwrap(); + assert_eq!(entries.len(), 1); + assert_eq!(entries[0].id, "log-1"); + assert_eq!(entries[0].text, "Test error message"); + assert_eq!(entries[0].level, LogLevel::Error); + } + + #[tokio::test] + async fn test_query_with_level_filter() { + let temp_dir = 
TempDir::new().unwrap(); + let store = FileLogStore::new(temp_dir.path().to_path_buf(), 1024 * 1024, 30).unwrap(); + + let subgraph_id = DeploymentHash::new("QmTest").unwrap(); + let file_path = store.log_file_path(&subgraph_id); + + // Write test data with different levels + let mut file = File::create(&file_path).unwrap(); + for (id, level) in [("log-1", "error"), ("log-2", "info"), ("log-3", "error")] { + let log_entry = FileLogDocument { + id: id.to_string(), + subgraph_id: "QmTest".to_string(), + timestamp: format!("2024-01-15T10:30:{}Z", id), + level: level.to_string(), + text: format!("Test {} message", level), + arguments: vec![], + meta: FileLogMeta { + module: "test.ts".to_string(), + line: 42, + column: 10, + }, + }; + writeln!(file, "{}", serde_json::to_string(&log_entry).unwrap()).unwrap(); + } + + // Query for errors only + let query = LogQuery { + subgraph_id, + level: Some(LogLevel::Error), + from: None, + to: None, + search: None, + first: 100, + skip: 0, + }; + + let result = store.query_logs(query).await; + assert!(result.is_ok()); + + let entries = result.unwrap(); + assert_eq!(entries.len(), 2); + assert!(entries.iter().all(|e| e.level == LogLevel::Error)); + } +} diff --git a/graph/src/components/log_store/loki.rs b/graph/src/components/log_store/loki.rs new file mode 100644 index 00000000000..e06405feb55 --- /dev/null +++ b/graph/src/components/log_store/loki.rs @@ -0,0 +1,283 @@ +use async_trait::async_trait; +use reqwest::Client; +use serde::Deserialize; +use std::collections::HashMap; +use std::time::Duration; + +use crate::prelude::DeploymentHash; + +use super::{LogEntry, LogMeta, LogQuery, LogStore, LogStoreError}; + +pub struct LokiLogStore { + endpoint: String, + tenant_id: Option, + client: Client, +} + +impl LokiLogStore { + pub fn new(endpoint: String, tenant_id: Option) -> Result { + let client = Client::builder() + .timeout(Duration::from_secs(10)) + .build() + .map_err(|e| LogStoreError::InitializationFailed(e.into()))?; + + Ok(Self { + endpoint, + tenant_id, + client, + }) + } + + fn build_logql_query(&self, query: &LogQuery) -> String { + let mut selectors = vec![format!("subgraphId=\"{}\"", query.subgraph_id)]; + + // Add log level selector if specified + if let Some(level) = &query.level { + selectors.push(format!("level=\"{}\"", level.as_str())); + } + + // Base selector + let selector = format!("{{{}}}", selectors.join(",")); + + // Add line filter for text search if specified + let query_str = if let Some(search) = &query.search { + format!("{} |~ \"(?i){}\"", selector, regex::escape(search)) + } else { + selector + }; + + query_str + } + + async fn execute_query( + &self, + query_str: &str, + from: &str, + to: &str, + limit: u32, + ) -> Result, LogStoreError> { + let url = format!("{}/loki/api/v1/query_range", self.endpoint); + + let mut request = self + .client + .get(&url) + .query(&[ + ("query", query_str), + ("start", from), + ("end", to), + ("limit", &limit.to_string()), + ("direction", "backward"), // Most recent first + ]) + .timeout(Duration::from_secs(10)); + + // Add X-Scope-OrgID header for multi-tenancy if configured + if let Some(tenant_id) = &self.tenant_id { + request = request.header("X-Scope-OrgID", tenant_id); + } + + let response = request.send().await.map_err(|e| { + LogStoreError::QueryFailed(anyhow::Error::from(e).context("Loki request failed")) + })?; + + if !response.status().is_success() { + let status = response.status(); + return Err(LogStoreError::QueryFailed(anyhow::anyhow!( + "Loki query failed with status {}", + 
status + ))); + } + + let response_body: LokiResponse = response.json().await.map_err(|e| { + LogStoreError::QueryFailed( + anyhow::Error::from(e) + .context("failed to parse Loki response: response format may have changed"), + ) + })?; + + if response_body.status != "success" { + return Err(LogStoreError::QueryFailed(anyhow::anyhow!( + "Loki query failed with status: {}", + response_body.status + ))); + } + + // Parse results + let entries = response_body + .data + .result + .into_iter() + .flat_map(|stream| { + let stream_labels = stream.stream; // Take ownership + stream + .values + .into_iter() + .filter_map(move |value| self.parse_log_entry(value, &stream_labels)) + }) + .collect(); + + Ok(entries) + } + + fn parse_log_entry( + &self, + value: LokiValue, + _labels: &HashMap, + ) -> Option { + // value is [timestamp_ns, log_line] + // We expect the log line to be JSON with our log entry structure + let log_data: LokiLogDocument = serde_json::from_str(&value.1).ok()?; + + let level = log_data.level.parse().ok()?; + let subgraph_id = DeploymentHash::new(&log_data.subgraph_id).ok()?; + + Some(LogEntry { + id: log_data.id, + subgraph_id, + timestamp: log_data.timestamp, + level, + text: log_data.text, + arguments: log_data.arguments.into_iter().collect(), + meta: LogMeta { + module: log_data.meta.module, + line: log_data.meta.line, + column: log_data.meta.column, + }, + }) + } +} + +#[async_trait] +impl LogStore for LokiLogStore { + async fn query_logs(&self, query: LogQuery) -> Result, LogStoreError> { + let logql_query = self.build_logql_query(&query); + + // Calculate time range + let from = query.from.as_deref().unwrap_or("now-1h"); + let to = query.to.as_deref().unwrap_or("now"); + + // Execute query with limit + skip to handle pagination + let limit = query.first + query.skip; + + let mut entries = self.execute_query(&logql_query, from, to, limit).await?; + + // Apply skip/first pagination + if query.skip > 0 { + entries = entries.into_iter().skip(query.skip as usize).collect(); + } + entries.truncate(query.first as usize); + + Ok(entries) + } + + fn is_available(&self) -> bool { + true + } +} + +// Loki response types +#[derive(Debug, Deserialize)] +struct LokiResponse { + status: String, + data: LokiData, +} + +#[derive(Debug, Deserialize)] +struct LokiData { + // Part of Loki API response, required for deserialization + #[allow(dead_code)] + #[serde(rename = "resultType")] + result_type: String, + result: Vec, +} + +#[derive(Debug, Deserialize)] +struct LokiStream { + stream: HashMap, // Labels + values: Vec, +} + +#[derive(Debug, Deserialize)] +struct LokiValue( + // Timestamp in nanoseconds since epoch (part of Loki API, not currently used) + #[allow(dead_code)] String, + // Log line (JSON document) + String, +); + +#[derive(Debug, Deserialize)] +struct LokiLogDocument { + id: String, + #[serde(rename = "subgraphId")] + subgraph_id: String, + timestamp: String, + level: String, + text: String, + arguments: HashMap, + meta: LokiLogMeta, +} + +#[derive(Debug, Deserialize)] +struct LokiLogMeta { + module: String, + line: i64, + column: i64, +} + +#[cfg(test)] +mod tests { + use super::super::LogLevel; + use super::*; + + #[test] + fn test_build_logql_query_basic() { + let store = LokiLogStore::new("http://localhost:3100".to_string(), None).unwrap(); + let query = LogQuery { + subgraph_id: DeploymentHash::new("QmTest").unwrap(), + level: None, + from: None, + to: None, + search: None, + first: 100, + skip: 0, + }; + + let logql = store.build_logql_query(&query); + assert_eq!(logql, 
"{subgraphId=\"QmTest\"}"); + } + + #[test] + fn test_build_logql_query_with_level() { + let store = LokiLogStore::new("http://localhost:3100".to_string(), None).unwrap(); + let query = LogQuery { + subgraph_id: DeploymentHash::new("QmTest").unwrap(), + level: Some(LogLevel::Error), + from: None, + to: None, + search: None, + first: 100, + skip: 0, + }; + + let logql = store.build_logql_query(&query); + assert_eq!(logql, "{subgraphId=\"QmTest\",level=\"error\"}"); + } + + #[test] + fn test_build_logql_query_with_text_filter() { + let store = LokiLogStore::new("http://localhost:3100".to_string(), None).unwrap(); + let query = LogQuery { + subgraph_id: DeploymentHash::new("QmTest").unwrap(), + level: None, + from: None, + to: None, + search: Some("transaction failed".to_string()), + first: 100, + skip: 0, + }; + + let logql = store.build_logql_query(&query); + assert!(logql.contains("{subgraphId=\"QmTest\"}")); + assert!(logql.contains("|~")); + assert!(logql.contains("transaction failed")); + } +} diff --git a/graph/src/components/log_store/mod.rs b/graph/src/components/log_store/mod.rs new file mode 100644 index 00000000000..e951c37d1df --- /dev/null +++ b/graph/src/components/log_store/mod.rs @@ -0,0 +1,333 @@ +pub mod config; +pub mod elasticsearch; +pub mod file; +pub mod loki; + +use async_trait::async_trait; +use std::path::PathBuf; +use std::str::FromStr; +use std::sync::Arc; +use thiserror::Error; + +use crate::prelude::DeploymentHash; + +#[derive(Error, Debug)] +pub enum LogStoreError { + #[error("log store query failed: {0}")] + QueryFailed(#[from] anyhow::Error), + + #[error("log store is unavailable")] + Unavailable, + + #[error("log store initialization failed: {0}")] + InitializationFailed(anyhow::Error), + + #[error("log store configuration error: {0}")] + ConfigurationError(anyhow::Error), +} + +/// Configuration for different log store backends +#[derive(Debug, Clone)] +pub enum LogStoreConfig { + /// No logging - returns empty results + Disabled, + + /// Elasticsearch backend + Elasticsearch { + endpoint: String, + username: Option, + password: Option, + index: String, + timeout_secs: u64, + }, + + /// Loki (Grafana's log aggregation system) + Loki { + endpoint: String, + tenant_id: Option, + }, + + /// File-based logs (JSON lines format) + File { + directory: PathBuf, + max_file_size: u64, + retention_days: u32, + }, +} + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum LogLevel { + Critical, + Error, + Warning, + Info, + Debug, +} + +impl LogLevel { + pub fn as_str(&self) -> &'static str { + match self { + LogLevel::Critical => "critical", + LogLevel::Error => "error", + LogLevel::Warning => "warning", + LogLevel::Info => "info", + LogLevel::Debug => "debug", + } + } +} + +impl FromStr for LogLevel { + type Err = String; + + fn from_str(s: &str) -> Result { + match s.trim().to_lowercase().as_str() { + "critical" => Ok(LogLevel::Critical), + "error" => Ok(LogLevel::Error), + "warning" => Ok(LogLevel::Warning), + "info" => Ok(LogLevel::Info), + "debug" => Ok(LogLevel::Debug), + _ => Err(format!("Invalid log level: {}", s)), + } + } +} + +#[derive(Debug, Clone)] +pub struct LogMeta { + pub module: String, + pub line: i64, + pub column: i64, +} + +#[derive(Debug, Clone)] +pub struct LogEntry { + pub id: String, + pub subgraph_id: DeploymentHash, + pub timestamp: String, + pub level: LogLevel, + pub text: String, + pub arguments: Vec<(String, String)>, + pub meta: LogMeta, +} + +#[derive(Debug, Clone)] +pub struct LogQuery { + pub subgraph_id: DeploymentHash, + pub 
level: Option<LogLevel>,
+    pub from: Option<String>,
+    pub to: Option<String>,
+    pub search: Option<String>,
+    pub first: u32,
+    pub skip: u32,
+}
+
+#[async_trait]
+pub trait LogStore: Send + Sync + 'static {
+    async fn query_logs(&self, query: LogQuery) -> Result<Vec<LogEntry>, LogStoreError>;
+    fn is_available(&self) -> bool;
+}
+
+/// Factory for creating LogStore instances from configuration
+pub struct LogStoreFactory;
+
+impl LogStoreFactory {
+    /// Create a LogStore from configuration
+    pub fn from_config(config: LogStoreConfig) -> Result<Arc<dyn LogStore>, LogStoreError> {
+        match config {
+            LogStoreConfig::Disabled => Ok(Arc::new(NoOpLogStore)),
+
+            LogStoreConfig::Elasticsearch {
+                endpoint,
+                username,
+                password,
+                index,
+                timeout_secs,
+            } => {
+                let timeout = std::time::Duration::from_secs(timeout_secs);
+                let client = reqwest::Client::builder()
+                    .timeout(timeout)
+                    .build()
+                    .map_err(|e| LogStoreError::InitializationFailed(e.into()))?;
+
+                let config = crate::log::elastic::ElasticLoggingConfig {
+                    endpoint,
+                    username,
+                    password,
+                    client,
+                };
+
+                Ok(Arc::new(elasticsearch::ElasticsearchLogStore::new(
+                    config, index, timeout,
+                )))
+            }
+
+            LogStoreConfig::Loki {
+                endpoint,
+                tenant_id,
+            } => Ok(Arc::new(loki::LokiLogStore::new(endpoint, tenant_id)?)),
+
+            LogStoreConfig::File {
+                directory,
+                max_file_size,
+                retention_days,
+            } => Ok(Arc::new(file::FileLogStore::new(
+                directory,
+                max_file_size,
+                retention_days,
+            )?)),
+        }
+    }
+
+    /// Parse configuration from environment variables
+    ///
+    /// Supports both new (GRAPH_LOG_STORE_*) and old (deprecated) environment variable names
+    /// for backward compatibility. The new keys take precedence when both are set.
+    pub fn from_env() -> Result<LogStoreConfig, LogStoreError> {
+        // Logger for deprecation warnings
+        let logger = crate::log::logger(false);
+
+        // Read backend selector with backward compatibility
+        let backend = config::read_env_with_default(
+            &logger,
+            "GRAPH_LOG_STORE_BACKEND",
+            "GRAPH_LOG_STORE",
+            "disabled",
+        );
+
+        match backend.to_lowercase().as_str() {
+            "disabled" | "none" => Ok(LogStoreConfig::Disabled),
+
+            "elasticsearch" | "elastic" | "es" => {
+                let endpoint = config::read_env_with_fallback(
+                    &logger,
+                    "GRAPH_LOG_STORE_ELASTICSEARCH_URL",
+                    "GRAPH_ELASTICSEARCH_URL",
+                )
+                .ok_or_else(|| {
+                    LogStoreError::ConfigurationError(anyhow::anyhow!(
+                        "Elasticsearch endpoint not set. Use GRAPH_LOG_STORE_ELASTICSEARCH_URL environment variable"
+                    ))
+                })?;
+
+                let username = config::read_env_with_fallback(
+                    &logger,
+                    "GRAPH_LOG_STORE_ELASTICSEARCH_USER",
+                    "GRAPH_ELASTICSEARCH_USER",
+                );
+
+                let password = config::read_env_with_fallback(
+                    &logger,
+                    "GRAPH_LOG_STORE_ELASTICSEARCH_PASSWORD",
+                    "GRAPH_ELASTICSEARCH_PASSWORD",
+                );
+
+                let index = config::read_env_with_default(
+                    &logger,
+                    "GRAPH_LOG_STORE_ELASTICSEARCH_INDEX",
+                    "GRAPH_ELASTIC_SEARCH_INDEX",
+                    "subgraph",
+                );
+
+                // Default: 10 seconds query timeout
+                // Configurable via GRAPH_LOG_STORE_ELASTICSEARCH_TIMEOUT environment variable
+                let timeout_secs = config::read_u64_with_fallback(
+                    &logger,
+                    "GRAPH_LOG_STORE_ELASTICSEARCH_TIMEOUT",
+                    "GRAPH_ELASTICSEARCH_TIMEOUT",
+                    10,
+                );
+
+                Ok(LogStoreConfig::Elasticsearch {
+                    endpoint,
+                    username,
+                    password,
+                    index,
+                    timeout_secs,
+                })
+            }
+
+            "loki" => {
+                let endpoint = config::read_env_with_fallback(
+                    &logger,
+                    "GRAPH_LOG_STORE_LOKI_URL",
+                    "GRAPH_LOG_LOKI_ENDPOINT",
+                )
+                .ok_or_else(|| {
+                    LogStoreError::ConfigurationError(anyhow::anyhow!(
+                        "Loki endpoint not set. 
Use GRAPH_LOG_STORE_LOKI_URL environment variable" + )) + })?; + + let tenant_id = config::read_env_with_fallback( + &logger, + "GRAPH_LOG_STORE_LOKI_TENANT_ID", + "GRAPH_LOG_LOKI_TENANT", + ); + + Ok(LogStoreConfig::Loki { + endpoint, + tenant_id, + }) + } + + "file" | "files" => { + let directory = config::read_env_with_fallback( + &logger, + "GRAPH_LOG_STORE_FILE_DIR", + "GRAPH_LOG_FILE_DIR", + ) + .ok_or_else(|| { + LogStoreError::ConfigurationError(anyhow::anyhow!( + "File log directory not set. Use GRAPH_LOG_STORE_FILE_DIR environment variable" + )) + }) + .map(PathBuf::from)?; + + // Default: 100MB per file (104857600 bytes) + // Configurable via GRAPH_LOG_STORE_FILE_MAX_SIZE environment variable + let max_file_size = config::read_u64_with_fallback( + &logger, + "GRAPH_LOG_STORE_FILE_MAX_SIZE", + "GRAPH_LOG_FILE_MAX_SIZE", + 100 * 1024 * 1024, + ); + + // Default: 30 days retention + // Configurable via GRAPH_LOG_STORE_FILE_RETENTION_DAYS environment variable + let retention_days = config::read_u32_with_fallback( + &logger, + "GRAPH_LOG_STORE_FILE_RETENTION_DAYS", + "GRAPH_LOG_FILE_RETENTION_DAYS", + 30, + ); + + Ok(LogStoreConfig::File { + directory, + max_file_size, + retention_days, + }) + } + + _ => Err(LogStoreError::ConfigurationError(anyhow::anyhow!( + "Unknown log store backend: {}. Valid options: disabled, elasticsearch, loki, file", + backend + ))), + } + } +} + +/// A no-op LogStore that returns empty results. +/// +/// Used when log storage is disabled (the default). Note that subgraph logs +/// still appear in stdout/stderr - they're just not stored in a queryable format. +pub struct NoOpLogStore; + +#[async_trait] +impl LogStore for NoOpLogStore { + async fn query_logs(&self, _query: LogQuery) -> Result, LogStoreError> { + Ok(vec![]) + } + + fn is_available(&self) -> bool { + false + } +} diff --git a/graph/src/components/mod.rs b/graph/src/components/mod.rs index 8abdc96f0b0..2dfc34f6373 100644 --- a/graph/src/components/mod.rs +++ b/graph/src/components/mod.rs @@ -50,6 +50,9 @@ pub mod server; /// Components dealing with storing entities. pub mod store; +/// Components dealing with log storage. 
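+///
+/// Provides the `LogStore` trait for querying stored subgraph logs, the
+/// `LogStoreConfig` enum describing the available backends (Elasticsearch,
+/// Loki, file-based, or disabled), and `LogStoreFactory` for building a store
+/// from that configuration or from environment variables.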
+pub mod log_store; + pub mod link_resolver; pub mod trigger_processor; diff --git a/graph/src/log/common.rs b/graph/src/log/common.rs new file mode 100644 index 00000000000..1d6c921015d --- /dev/null +++ b/graph/src/log/common.rs @@ -0,0 +1,237 @@ +use std::collections::HashMap; +use std::fmt; + +use serde::Serialize; +use slog::*; + +/// Serializer for concatenating key-value arguments into a string +pub struct SimpleKVSerializer { + kvs: Vec<(String, String)>, +} + +impl Default for SimpleKVSerializer { + fn default() -> Self { + Self::new() + } +} + +impl SimpleKVSerializer { + pub fn new() -> Self { + Self { kvs: Vec::new() } + } + + /// Returns the number of key-value pairs and the concatenated string + pub fn finish(self) -> (usize, String) { + ( + self.kvs.len(), + self.kvs + .iter() + .map(|(k, v)| format!("{}: {}", k, v)) + .collect::>() + .join(", "), + ) + } +} + +impl Serializer for SimpleKVSerializer { + fn emit_arguments(&mut self, key: Key, val: &fmt::Arguments) -> slog::Result { + self.kvs.push((key.into(), val.to_string())); + Ok(()) + } +} + +/// Serializer for extracting key-value pairs into a Vec +pub struct VecKVSerializer { + kvs: Vec<(String, String)>, +} + +impl Default for VecKVSerializer { + fn default() -> Self { + Self::new() + } +} + +impl VecKVSerializer { + pub fn new() -> Self { + Self { kvs: Vec::new() } + } + + pub fn finish(self) -> Vec<(String, String)> { + self.kvs + } +} + +impl Serializer for VecKVSerializer { + fn emit_arguments(&mut self, key: Key, val: &fmt::Arguments) -> slog::Result { + self.kvs.push((key.into(), val.to_string())); + Ok(()) + } +} + +/// Serializer for extracting key-value pairs into a HashMap +pub struct HashMapKVSerializer { + kvs: Vec<(String, String)>, +} + +impl Default for HashMapKVSerializer { + fn default() -> Self { + Self::new() + } +} + +impl HashMapKVSerializer { + pub fn new() -> Self { + HashMapKVSerializer { kvs: Vec::new() } + } + + pub fn finish(self) -> HashMap { + self.kvs.into_iter().collect() + } +} + +impl Serializer for HashMapKVSerializer { + fn emit_arguments(&mut self, key: Key, val: &fmt::Arguments) -> slog::Result { + self.kvs.push((key.into(), val.to_string())); + Ok(()) + } +} + +/// Log metadata structure +#[derive(Clone, Debug, Serialize)] +#[serde(rename_all = "camelCase")] +pub struct LogMeta { + pub module: String, + pub line: i64, + pub column: i64, +} + +/// Converts an slog Level to a string representation +pub fn level_to_str(level: Level) -> &'static str { + match level { + Level::Critical => "critical", + Level::Error => "error", + Level::Warning => "warning", + Level::Info => "info", + Level::Debug => "debug", + Level::Trace => "trace", + } +} + +/// Builder for common log entry fields across different drain implementations +pub struct LogEntryBuilder<'a> { + record: &'a Record<'a>, + values: &'a OwnedKVList, + subgraph_id: String, + timestamp: String, +} + +impl<'a> LogEntryBuilder<'a> { + pub fn new( + record: &'a Record<'a>, + values: &'a OwnedKVList, + subgraph_id: String, + timestamp: String, + ) -> Self { + Self { + record, + values, + subgraph_id, + timestamp, + } + } + + /// Builds the log ID in the format: subgraph_id-timestamp + pub fn build_id(&self) -> String { + format!("{}-{}", self.subgraph_id, self.timestamp) + } + + /// Builds the text field by concatenating the message with all key-value pairs + pub fn build_text(&self) -> String { + // Serialize logger arguments + let mut serializer = SimpleKVSerializer::new(); + self.record + .kv() + .serialize(self.record, &mut 
serializer) + .expect("failed to serialize logger arguments"); + let (n_logger_kvs, logger_kvs) = serializer.finish(); + + // Serialize log message arguments + let mut serializer = SimpleKVSerializer::new(); + self.values + .serialize(self.record, &mut serializer) + .expect("failed to serialize log message arguments"); + let (n_value_kvs, value_kvs) = serializer.finish(); + + // Build text with all key-value pairs + let mut text = format!("{}", self.record.msg()); + if n_logger_kvs > 0 { + use std::fmt::Write; + write!(text, ", {}", logger_kvs).unwrap(); + } + if n_value_kvs > 0 { + use std::fmt::Write; + write!(text, ", {}", value_kvs).unwrap(); + } + + text + } + + /// Builds arguments as a Vec of tuples (for file drain) + pub fn build_arguments_vec(&self) -> Vec<(String, String)> { + let mut serializer = VecKVSerializer::new(); + self.record + .kv() + .serialize(self.record, &mut serializer) + .expect("failed to serialize log message arguments into vec"); + serializer.finish() + } + + /// Builds arguments as a HashMap (for elastic and loki drains) + pub fn build_arguments_map(&self) -> HashMap { + let mut serializer = HashMapKVSerializer::new(); + self.record + .kv() + .serialize(self.record, &mut serializer) + .expect("failed to serialize log message arguments into hash map"); + serializer.finish() + } + + /// Builds metadata from the log record + pub fn build_meta(&self) -> LogMeta { + LogMeta { + module: self.record.module().into(), + line: self.record.line() as i64, + column: self.record.column() as i64, + } + } + + /// Gets the level as a string + pub fn level_str(&self) -> &'static str { + level_to_str(self.record.level()) + } + + /// Gets the timestamp + pub fn timestamp(&self) -> &str { + &self.timestamp + } + + /// Gets the subgraph ID + pub fn subgraph_id(&self) -> &str { + &self.subgraph_id + } +} + +/// Creates a new asynchronous logger with consistent configuration +pub fn create_async_logger(drain: D, chan_size: usize, use_block_overflow: bool) -> Logger +where + D: Drain + Send + 'static, + D::Err: std::fmt::Debug, +{ + let mut builder = slog_async::Async::new(drain.fuse()).chan_size(chan_size); + + if use_block_overflow { + builder = builder.overflow_strategy(slog_async::OverflowStrategy::Block); + } + + Logger::root(builder.build().fuse(), o!()) +} diff --git a/graph/src/log/elastic.rs b/graph/src/log/elastic.rs index eb285d3d6e6..d2506211afd 100644 --- a/graph/src/log/elastic.rs +++ b/graph/src/log/elastic.rs @@ -1,6 +1,4 @@ use std::collections::HashMap; -use std::fmt; -use std::fmt::Write; use std::result::Result; use std::sync::{Arc, Mutex}; use std::time::Duration; @@ -15,10 +13,11 @@ use serde::ser::Serializer as SerdeSerializer; use serde::Serialize; use serde_json::json; use slog::*; -use slog_async; use crate::util::futures::retry; +use super::common::{create_async_logger, LogEntryBuilder, LogMeta}; + /// General configuration parameters for Elasticsearch logging. #[derive(Clone, Debug)] pub struct ElasticLoggingConfig { @@ -33,28 +32,14 @@ pub struct ElasticLoggingConfig { } /// Serializes an slog log level using a serde Serializer. -fn serialize_log_level(level: &Level, serializer: S) -> Result +fn serialize_log_level(level: &str, serializer: S) -> Result where S: SerdeSerializer, { - serializer.serialize_str(match level { - Level::Critical => "critical", - Level::Error => "error", - Level::Warning => "warning", - Level::Info => "info", - Level::Debug => "debug", - Level::Trace => "trace", - }) + serializer.serialize_str(level) } -// Log message meta data. 
-#[derive(Clone, Debug, Serialize)] -#[serde(rename_all = "camelCase")] -struct ElasticLogMeta { - module: String, - line: i64, - column: i64, -} +type ElasticLogMeta = LogMeta; // Log message to be written to Elasticsearch. #[derive(Clone, Debug, Serialize)] @@ -67,71 +52,10 @@ struct ElasticLog { timestamp: String, text: String, #[serde(serialize_with = "serialize_log_level")] - level: Level, + level: String, meta: ElasticLogMeta, } -struct HashMapKVSerializer { - kvs: Vec<(String, String)>, -} - -impl HashMapKVSerializer { - fn new() -> Self { - HashMapKVSerializer { - kvs: Default::default(), - } - } - - fn finish(self) -> HashMap { - let mut map = HashMap::new(); - self.kvs.into_iter().for_each(|(k, v)| { - map.insert(k, v); - }); - map - } -} - -impl Serializer for HashMapKVSerializer { - fn emit_arguments(&mut self, key: Key, val: &fmt::Arguments) -> slog::Result { - self.kvs.push((key.into(), format!("{}", val))); - Ok(()) - } -} - -/// A super-simple slog Serializer for concatenating key/value arguments. -struct SimpleKVSerializer { - kvs: Vec<(String, String)>, -} - -impl SimpleKVSerializer { - /// Creates a new `SimpleKVSerializer`. - fn new() -> Self { - SimpleKVSerializer { - kvs: Default::default(), - } - } - - /// Collects all key/value arguments into a single, comma-separated string. - /// Returns the number of key/value pairs and the string itself. - fn finish(self) -> (usize, String) { - ( - self.kvs.len(), - self.kvs - .iter() - .map(|(k, v)| format!("{}: {}", k, v)) - .collect::>() - .join(", "), - ) - } -} - -impl Serializer for SimpleKVSerializer { - fn emit_arguments(&mut self, key: Key, val: &fmt::Arguments) -> slog::Result { - self.kvs.push((key.into(), format!("{}", val))); - Ok(()) - } -} - /// Configuration for `ElasticDrain`. #[derive(Clone, Debug)] pub struct ElasticDrainConfig { @@ -309,43 +233,18 @@ impl Drain for ElasticDrain { type Err = (); fn log(&self, record: &Record, values: &OwnedKVList) -> Result { - // Don't sent `trace` logs to ElasticSearch. + // Don't send `trace` logs to ElasticSearch. 
if record.level() == Level::Trace { return Ok(()); } - let timestamp = Utc::now().to_rfc3339_opts(SecondsFormat::Nanos, true); - let id = format!("{}-{}", self.config.custom_id_value, timestamp); - - // Serialize logger arguments - let mut serializer = SimpleKVSerializer::new(); - record - .kv() - .serialize(record, &mut serializer) - .expect("failed to serializer logger arguments"); - let (n_logger_kvs, logger_kvs) = serializer.finish(); - - // Serialize log message arguments - let mut serializer = SimpleKVSerializer::new(); - values - .serialize(record, &mut serializer) - .expect("failed to serialize log message arguments"); - let (n_value_kvs, value_kvs) = serializer.finish(); - // Serialize log message arguments into hash map - let mut serializer = HashMapKVSerializer::new(); - record - .kv() - .serialize(record, &mut serializer) - .expect("failed to serialize log message arguments into hash map"); - let arguments = serializer.finish(); - - let mut text = format!("{}", record.msg()); - if n_logger_kvs > 0 { - write!(text, ", {}", logger_kvs).unwrap(); - } - if n_value_kvs > 0 { - write!(text, ", {}", value_kvs).unwrap(); - } + let timestamp = Utc::now().to_rfc3339_opts(SecondsFormat::Nanos, true); + let builder = LogEntryBuilder::new( + record, + values, + self.config.custom_id_value.clone(), + timestamp.clone(), + ); // Prepare custom id for log document let mut custom_id = HashMap::new(); @@ -356,17 +255,13 @@ impl Drain for ElasticDrain { // Prepare log document let log = ElasticLog { - id, + id: builder.build_id(), custom_id, - arguments, + arguments: builder.build_arguments_map(), timestamp, - text, - level: record.level(), - meta: ElasticLogMeta { - module: record.module().into(), - line: record.line() as i64, - column: record.column() as i64, - }, + text: builder.build_text(), + level: builder.level_str().to_string(), + meta: builder.build_meta(), }; // Push the log into the queue @@ -386,10 +281,6 @@ pub fn elastic_logger( error_logger: Logger, logs_sent_counter: Counter, ) -> Logger { - let elastic_drain = ElasticDrain::new(config, error_logger, logs_sent_counter).fuse(); - let async_drain = slog_async::Async::new(elastic_drain) - .chan_size(20000) - .build() - .fuse(); - Logger::root(async_drain, o!()) + let elastic_drain = ElasticDrain::new(config, error_logger, logs_sent_counter); + create_async_logger(elastic_drain, 20000, false) } diff --git a/graph/src/log/factory.rs b/graph/src/log/factory.rs index 1e8aef33b2e..58306a448b3 100644 --- a/graph/src/log/factory.rs +++ b/graph/src/log/factory.rs @@ -1,11 +1,15 @@ use std::sync::Arc; +use std::time::Duration; use prometheus::Counter; use slog::*; +use crate::components::log_store::LogStoreConfig; use crate::components::metrics::MetricsRegistry; use crate::components::store::DeploymentLocator; use crate::log::elastic::*; +use crate::log::file::{file_logger, FileDrainConfig}; +use crate::log::loki::{loki_logger, LokiDrainConfig}; use crate::log::split::*; use crate::prelude::ENV_VARS; @@ -23,20 +27,20 @@ pub struct ComponentLoggerConfig { #[derive(Clone)] pub struct LoggerFactory { parent: Logger, - elastic_config: Option, + log_store_config: Option, metrics_registry: Arc, } impl LoggerFactory { - /// Creates a new factory using a parent logger and optional Elasticsearch configuration. + /// Creates a new factory using a parent logger and optional log store configuration. 
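+    ///
+    /// Component loggers only forward to Elasticsearch; the Loki and file
+    /// backends are used for the subgraph loggers created via `subgraph_logger`.
+    /// With `None` or `LogStoreConfig::Disabled`, only the terminal logger is used.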
pub fn new( logger: Logger, - elastic_config: Option, + log_store_config: Option, metrics_registry: Arc, ) -> Self { Self { parent: logger, - elastic_config, + log_store_config, metrics_registry, } } @@ -45,7 +49,7 @@ impl LoggerFactory { pub fn with_parent(&self, parent: Logger) -> Self { Self { parent, - elastic_config: self.elastic_config.clone(), + log_store_config: self.log_store_config.clone(), metrics_registry: self.metrics_registry.clone(), } } @@ -62,56 +66,127 @@ impl LoggerFactory { None => term_logger, Some(config) => match config.elastic { None => term_logger, - Some(config) => self - .elastic_config - .clone() - .map(|elastic_config| { - split_logger( - term_logger.clone(), - elastic_logger( - ElasticDrainConfig { - general: elastic_config, - index: config.index, - custom_id_key: String::from("componentId"), - custom_id_value: component.to_string(), - flush_interval: ENV_VARS.elastic_search_flush_interval, - max_retries: ENV_VARS.elastic_search_max_retries, - }, + Some(elastic_component_config) => { + // Check if we have Elasticsearch configured in log_store_config + match &self.log_store_config { + Some(LogStoreConfig::Elasticsearch { + endpoint, + username, + password, + .. + }) => { + // Build ElasticLoggingConfig on-demand + let elastic_config = ElasticLoggingConfig { + endpoint: endpoint.clone(), + username: username.clone(), + password: password.clone(), + client: reqwest::Client::new(), + }; + + split_logger( term_logger.clone(), - self.logs_sent_counter(None), - ), - ) - }) - .unwrap_or(term_logger), + elastic_logger( + ElasticDrainConfig { + general: elastic_config, + index: elastic_component_config.index, + custom_id_key: String::from("componentId"), + custom_id_value: component.to_string(), + flush_interval: ENV_VARS.elastic_search_flush_interval, + max_retries: ENV_VARS.elastic_search_max_retries, + }, + term_logger.clone(), + self.logs_sent_counter(None), + ), + ) + } + _ => { + // No Elasticsearch configured, just use terminal logger + term_logger + } + } + } }, } } - /// Creates a subgraph logger with Elasticsearch support. + /// Creates a subgraph logger with multi-backend support. pub fn subgraph_logger(&self, loc: &DeploymentLocator) -> Logger { let term_logger = self .parent .new(o!("subgraph_id" => loc.hash.to_string(), "sgd" => loc.id.to_string())); - self.elastic_config - .clone() - .map(|elastic_config| { - split_logger( + // Determine which drain to use based on log_store_config + let drain = match &self.log_store_config { + Some(LogStoreConfig::Elasticsearch { + endpoint, + username, + password, + index, + .. 
+ }) => { + // Build ElasticLoggingConfig on-demand + let elastic_config = ElasticLoggingConfig { + endpoint: endpoint.clone(), + username: username.clone(), + password: password.clone(), + client: reqwest::Client::new(), + }; + + Some(elastic_logger( + ElasticDrainConfig { + general: elastic_config, + index: index.clone(), + custom_id_key: String::from("subgraphId"), + custom_id_value: loc.hash.to_string(), + flush_interval: ENV_VARS.elastic_search_flush_interval, + max_retries: ENV_VARS.elastic_search_max_retries, + }, + term_logger.clone(), + self.logs_sent_counter(Some(loc.hash.as_str())), + )) + } + + None => None, + + Some(LogStoreConfig::Loki { + endpoint, + tenant_id, + }) => { + // Use Loki + Some(loki_logger( + LokiDrainConfig { + endpoint: endpoint.clone(), + tenant_id: tenant_id.clone(), + flush_interval: Duration::from_secs(5), + subgraph_id: loc.hash.to_string(), + }, term_logger.clone(), - elastic_logger( - ElasticDrainConfig { - general: elastic_config, - index: ENV_VARS.elastic_search_index.clone(), - custom_id_key: String::from("subgraphId"), - custom_id_value: loc.hash.to_string(), - flush_interval: ENV_VARS.elastic_search_flush_interval, - max_retries: ENV_VARS.elastic_search_max_retries, - }, - term_logger.clone(), - self.logs_sent_counter(Some(loc.hash.as_str())), - ), - ) - }) + )) + } + + Some(LogStoreConfig::File { + directory, + max_file_size, + retention_days, + }) => { + // Use File + Some(file_logger( + FileDrainConfig { + directory: directory.clone(), + subgraph_id: loc.hash.to_string(), + max_file_size: *max_file_size, + retention_days: *retention_days, + }, + term_logger.clone(), + )) + } + + Some(LogStoreConfig::Disabled) => None, + }; + + // Combine terminal and storage drain + drain + .map(|storage_drain| split_logger(term_logger.clone(), storage_drain)) .unwrap_or(term_logger) } diff --git a/graph/src/log/file.rs b/graph/src/log/file.rs new file mode 100644 index 00000000000..beb4e218ea4 --- /dev/null +++ b/graph/src/log/file.rs @@ -0,0 +1,231 @@ +use std::fs::{File, OpenOptions}; +use std::io::{BufWriter, Write}; +use std::path::PathBuf; +use std::sync::{Arc, Mutex}; + +use chrono::prelude::{SecondsFormat, Utc}; +use serde::Serialize; +use slog::*; + +use super::common::{create_async_logger, LogEntryBuilder, LogMeta}; + +/// Configuration for `FileDrain`. +#[derive(Clone, Debug)] +pub struct FileDrainConfig { + /// Directory where log files will be stored + pub directory: PathBuf, + /// The subgraph ID used for the log filename + pub subgraph_id: String, + /// Maximum file size in bytes + pub max_file_size: u64, + /// Retention period in days + pub retention_days: u32, +} + +/// Log document structure for JSON Lines format +#[derive(Clone, Debug, Serialize)] +#[serde(rename_all = "camelCase")] +struct FileLogDocument { + id: String, + subgraph_id: String, + timestamp: String, + level: String, + text: String, + arguments: Vec<(String, String)>, + meta: FileLogMeta, +} + +type FileLogMeta = LogMeta; + +/// An slog `Drain` for logging to local files in JSON Lines format. +/// +/// Each subgraph gets its own .jsonl file with log entries. +/// Format: One JSON object per line +/// ```jsonl +/// {"id":"QmXxx-2024-01-15T10:30:00Z","subgraphId":"QmXxx","timestamp":"2024-01-15T10:30:00Z","level":"error","text":"Error message","arguments":[],"meta":{"module":"test.rs","line":42,"column":10}} +/// ``` +pub struct FileDrain { + config: FileDrainConfig, + error_logger: Logger, + writer: Arc>>, +} + +impl FileDrain { + /// Creates a new `FileDrain`. 
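+    ///
+    /// Creates `config.directory` if it does not already exist and opens (or
+    /// creates) `<subgraph_id>.jsonl` inside it in append mode.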
+ pub fn new(config: FileDrainConfig, error_logger: Logger) -> std::io::Result { + std::fs::create_dir_all(&config.directory)?; + + let path = config + .directory + .join(format!("{}.jsonl", config.subgraph_id)); + let file = OpenOptions::new().create(true).append(true).open(path)?; + + Ok(FileDrain { + config, + error_logger, + writer: Arc::new(Mutex::new(BufWriter::new(file))), + }) + } +} + +impl Drain for FileDrain { + type Ok = (); + type Err = Never; + + fn log(&self, record: &Record, values: &OwnedKVList) -> std::result::Result<(), Never> { + // Don't write `trace` logs to file + if record.level() == Level::Trace { + return Ok(()); + } + + let timestamp = Utc::now().to_rfc3339_opts(SecondsFormat::Nanos, true); + let builder = + LogEntryBuilder::new(record, values, self.config.subgraph_id.clone(), timestamp); + + // Build log document + let log_doc = FileLogDocument { + id: builder.build_id(), + subgraph_id: builder.subgraph_id().to_string(), + timestamp: builder.timestamp().to_string(), + level: builder.level_str().to_string(), + text: builder.build_text(), + arguments: builder.build_arguments_vec(), + meta: builder.build_meta(), + }; + + // Write JSON line (synchronous, buffered) + let mut writer = self.writer.lock().unwrap(); + if let Err(e) = serde_json::to_writer(&mut *writer, &log_doc) { + error!(self.error_logger, "Failed to serialize log to JSON: {}", e); + return Ok(()); + } + + if let Err(e) = writeln!(&mut *writer) { + error!(self.error_logger, "Failed to write newline: {}", e); + return Ok(()); + } + + // Flush to ensure durability + if let Err(e) = writer.flush() { + error!(self.error_logger, "Failed to flush log file: {}", e); + } + + Ok(()) + } +} + +/// Creates a new asynchronous file logger. +/// +/// Uses `error_logger` to print any file logging errors, +/// so they don't go unnoticed. 
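+///
+/// A minimal usage sketch (illustrative only: the directory, subgraph hash,
+/// and log message are placeholders, and a discard logger stands in for a
+/// real error logger):
+/// ```ignore
+/// use std::path::PathBuf;
+///
+/// let config = FileDrainConfig {
+///     directory: PathBuf::from("./graph-logs"),
+///     subgraph_id: "QmYourSubgraphHash".to_string(),
+///     max_file_size: 100 * 1024 * 1024,
+///     retention_days: 30,
+/// };
+/// let logger = file_logger(config, Logger::root(slog::Discard, o!()));
+/// info!(logger, "handler finished"; "block" => 1234567);
+/// ```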
+pub fn file_logger(config: FileDrainConfig, error_logger: Logger) -> Logger { + let file_drain = match FileDrain::new(config, error_logger.clone()) { + Ok(drain) => drain, + Err(e) => { + error!(error_logger, "Failed to create FileDrain: {}", e); + // Return a logger that discards all logs + return Logger::root(slog::Discard, o!()); + } + }; + + create_async_logger(file_drain, 20000, true) +} + +#[cfg(test)] +mod tests { + use super::*; + use tempfile::TempDir; + + #[test] + fn test_file_drain_creation() { + let temp_dir = TempDir::new().unwrap(); + let error_logger = Logger::root(slog::Discard, o!()); + + let config = FileDrainConfig { + directory: temp_dir.path().to_path_buf(), + subgraph_id: "QmTest".to_string(), + max_file_size: 1024 * 1024, + retention_days: 30, + }; + + let drain = FileDrain::new(config, error_logger); + assert!(drain.is_ok()); + + // Verify file was created + let file_path = temp_dir.path().join("QmTest.jsonl"); + assert!(file_path.exists()); + } + + #[test] + fn test_log_entry_format() { + let arguments = vec![ + ("key1".to_string(), "value1".to_string()), + ("key2".to_string(), "value2".to_string()), + ]; + + let doc = FileLogDocument { + id: "test-id".to_string(), + subgraph_id: "QmTest".to_string(), + timestamp: "2024-01-15T10:30:00Z".to_string(), + level: "error".to_string(), + text: "Test error message".to_string(), + arguments, + meta: FileLogMeta { + module: "test.rs".to_string(), + line: 42, + column: 10, + }, + }; + + let json = serde_json::to_string(&doc).unwrap(); + assert!(json.contains("\"id\":\"test-id\"")); + assert!(json.contains("\"subgraphId\":\"QmTest\"")); + assert!(json.contains("\"level\":\"error\"")); + assert!(json.contains("\"text\":\"Test error message\"")); + assert!(json.contains("\"arguments\"")); + } + + #[test] + fn test_file_drain_writes_jsonl() { + use std::io::{BufRead, BufReader}; + + let temp_dir = TempDir::new().unwrap(); + let error_logger = Logger::root(slog::Discard, o!()); + + let config = FileDrainConfig { + directory: temp_dir.path().to_path_buf(), + subgraph_id: "QmTest".to_string(), + max_file_size: 1024 * 1024, + retention_days: 30, + }; + + let drain = FileDrain::new(config.clone(), error_logger).unwrap(); + + // Create a test record + let logger = Logger::root(drain, o!()); + info!(logger, "Test message"; "key" => "value"); + + // Give async drain time to write (in real test we'd use proper sync) + std::thread::sleep(std::time::Duration::from_millis(100)); + + // Read the file + let file_path = temp_dir.path().join("QmTest.jsonl"); + let file = File::open(file_path).unwrap(); + let reader = BufReader::new(file); + + let lines: Vec = reader.lines().map_while(|r| r.ok()).collect(); + + // Should have written at least one line + assert!(!lines.is_empty()); + + // Each line should be valid JSON + for line in lines { + let parsed: serde_json::Value = serde_json::from_str(&line).unwrap(); + assert!(parsed.get("id").is_some()); + assert!(parsed.get("subgraphId").is_some()); + assert!(parsed.get("timestamp").is_some()); + assert!(parsed.get("level").is_some()); + assert!(parsed.get("text").is_some()); + } + } +} diff --git a/graph/src/log/loki.rs b/graph/src/log/loki.rs new file mode 100644 index 00000000000..d921546ca26 --- /dev/null +++ b/graph/src/log/loki.rs @@ -0,0 +1,325 @@ +use std::collections::HashMap; +use std::sync::{Arc, Mutex}; +use std::time::Duration; + +use chrono::prelude::{SecondsFormat, Utc}; +use reqwest::Client; +use serde::Serialize; +use serde_json::json; +use slog::*; + +use 
super::common::{create_async_logger, LogEntryBuilder, LogMeta}; + +/// Configuration for `LokiDrain`. +#[derive(Clone, Debug)] +pub struct LokiDrainConfig { + pub endpoint: String, + pub tenant_id: Option, + pub flush_interval: Duration, + pub subgraph_id: String, +} + +/// A log entry to be sent to Loki +#[derive(Clone, Debug)] +struct LokiLogEntry { + timestamp_ns: String, // Nanoseconds since epoch as string + line: String, // JSON-serialized log entry + labels: HashMap, // Stream labels (subgraphId, level, etc.) +} + +/// Log document structure for JSON serialization +#[derive(Clone, Debug, Serialize)] +#[serde(rename_all = "camelCase")] +struct LokiLogDocument { + id: String, + subgraph_id: String, + timestamp: String, + level: String, + text: String, + arguments: HashMap, + meta: LokiLogMeta, +} + +type LokiLogMeta = LogMeta; + +/// A slog `Drain` for logging to Loki. +/// +/// Loki expects logs in the following format: +/// ```json +/// { +/// "streams": [ +/// { +/// "stream": {"subgraphId": "QmXxx", "level": "error"}, +/// "values": [ +/// ["", ""], +/// ["", ""] +/// ] +/// } +/// ] +/// } +/// ``` +pub struct LokiDrain { + config: LokiDrainConfig, + client: Client, + error_logger: Logger, + logs: Arc>>, +} + +impl LokiDrain { + /// Creates a new `LokiDrain`. + pub fn new(config: LokiDrainConfig, error_logger: Logger) -> Self { + let client = Client::builder() + .timeout(Duration::from_secs(30)) + .build() + .expect("failed to create HTTP client for LokiDrain"); + + let drain = LokiDrain { + config, + client, + error_logger, + logs: Arc::new(Mutex::new(vec![])), + }; + drain.periodically_flush_logs(); + drain + } + + fn periodically_flush_logs(&self) { + let flush_logger = self.error_logger.clone(); + let logs = self.logs.clone(); + let config = self.config.clone(); + let client = self.client.clone(); + let mut interval = tokio::time::interval(self.config.flush_interval); + + crate::tokio::spawn(async move { + loop { + interval.tick().await; + + let logs_to_send = { + let mut logs = logs.lock().unwrap(); + let logs_to_send = (*logs).clone(); + logs.clear(); + logs_to_send + }; + + // Do nothing if there are no logs to flush + if logs_to_send.is_empty() { + continue; + } + + // Group logs by labels (Loki streams) + let streams = group_by_labels(logs_to_send); + + // Build Loki push request body + let streams_json: Vec<_> = streams + .into_iter() + .map(|(labels, entries)| { + json!({ + "stream": labels, + "values": entries.into_iter() + .map(|e| vec![e.timestamp_ns, e.line]) + .collect::>() + }) + }) + .collect(); + + let body = json!({ + "streams": streams_json + }); + + let url = format!("{}/loki/api/v1/push", config.endpoint); + + let mut request = client + .post(&url) + .json(&body) + .timeout(Duration::from_secs(30)); + + if let Some(ref tenant_id) = config.tenant_id { + request = request.header("X-Scope-OrgID", tenant_id); + } + + match request.send().await { + Ok(resp) if resp.status().is_success() => { + // Success + } + Ok(resp) => { + error!( + flush_logger, + "Loki push failed with status: {}", + resp.status() + ); + } + Err(e) => { + error!(flush_logger, "Failed to send logs to Loki: {}", e); + } + } + } + }); + } +} + +impl Drain for LokiDrain { + type Ok = (); + type Err = (); + + fn log(&self, record: &Record, values: &OwnedKVList) -> std::result::Result<(), ()> { + // Don't send `trace` logs to Loki + if record.level() == Level::Trace { + return Ok(()); + } + + let now = Utc::now(); + let timestamp = now.to_rfc3339_opts(SecondsFormat::Nanos, true); + let 
timestamp_ns = now.timestamp_nanos_opt().unwrap().to_string();
+
+        let builder = LogEntryBuilder::new(
+            record,
+            values,
+            self.config.subgraph_id.clone(),
+            timestamp.clone(),
+        );
+
+        // Build log document
+        let log_doc = LokiLogDocument {
+            id: builder.build_id(),
+            subgraph_id: builder.subgraph_id().to_string(),
+            timestamp,
+            level: builder.level_str().to_string(),
+            text: builder.build_text(),
+            arguments: builder.build_arguments_map(),
+            meta: builder.build_meta(),
+        };
+
+        // Serialize to JSON line
+        let line = match serde_json::to_string(&log_doc) {
+            Ok(l) => l,
+            Err(e) => {
+                error!(self.error_logger, "Failed to serialize log to JSON: {}", e);
+                return Ok(());
+            }
+        };
+
+        // Build labels for Loki stream
+        let mut labels = HashMap::new();
+        labels.insert("subgraphId".to_string(), builder.subgraph_id().to_string());
+        labels.insert("level".to_string(), builder.level_str().to_string());
+
+        // Create log entry
+        let entry = LokiLogEntry {
+            timestamp_ns,
+            line,
+            labels,
+        };
+
+        // Push to buffer
+        let mut logs = self.logs.lock().unwrap();
+        logs.push(entry);
+
+        Ok(())
+    }
+}
+
+/// Groups log entries by their labels to create Loki streams.
+/// Returns one `(labels, entries)` pair per unique label set.
+fn group_by_labels(
+    entries: Vec<LokiLogEntry>,
+) -> Vec<(HashMap<String, String>, Vec<LokiLogEntry>)> {
+    let mut streams: HashMap<String, (HashMap<String, String>, Vec<LokiLogEntry>)> =
+        HashMap::new();
+    for entry in entries {
+        // Create a deterministic string key from the labels
+        let label_key = serde_json::to_string(&entry.labels).unwrap_or_default();
+
+        streams
+            .entry(label_key)
+            .or_insert_with(|| (entry.labels.clone(), Vec::new()))
+            .1
+            .push(entry);
+    }
+
+    // Convert to a vec of (labels, entries) tuples
+    streams.into_values().collect()
+}
+
+/// Creates a new asynchronous Loki logger.
+///
+/// Uses `error_logger` to print any Loki logging errors,
+/// so they don't go unnoticed. 
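+///
+/// A minimal usage sketch (illustrative only: the endpoint, subgraph hash,
+/// and log message are placeholders, and a discard logger stands in for a
+/// real error logger):
+/// ```ignore
+/// use std::time::Duration;
+///
+/// let config = LokiDrainConfig {
+///     endpoint: "http://localhost:3100".to_string(),
+///     tenant_id: None,
+///     flush_interval: Duration::from_secs(5),
+///     subgraph_id: "QmYourSubgraphHash".to_string(),
+/// };
+/// let logger = loki_logger(config, Logger::root(slog::Discard, o!()));
+/// error!(logger, "handler failed"; "block" => 1234567);
+/// ```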
+pub fn loki_logger(config: LokiDrainConfig, error_logger: Logger) -> Logger { + let loki_drain = LokiDrain::new(config, error_logger); + create_async_logger(loki_drain, 20000, true) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_group_by_labels() { + let mut labels1 = HashMap::new(); + labels1.insert("subgraphId".to_string(), "QmTest".to_string()); + labels1.insert("level".to_string(), "error".to_string()); + + let mut labels2 = HashMap::new(); + labels2.insert("subgraphId".to_string(), "QmTest".to_string()); + labels2.insert("level".to_string(), "info".to_string()); + + let entries = vec![ + LokiLogEntry { + timestamp_ns: "1000000000".to_string(), + line: "log1".to_string(), + labels: labels1.clone(), + }, + LokiLogEntry { + timestamp_ns: "2000000000".to_string(), + line: "log2".to_string(), + labels: labels1.clone(), + }, + LokiLogEntry { + timestamp_ns: "3000000000".to_string(), + line: "log3".to_string(), + labels: labels2.clone(), + }, + ]; + + let streams = group_by_labels(entries); + + // Should have 2 streams (one for each unique label set) + assert_eq!(streams.len(), 2); + + // Find streams by label and verify counts + for (labels, entries) in streams { + if labels.get("level") == Some(&"error".to_string()) { + assert_eq!(entries.len(), 2, "Error stream should have 2 entries"); + } else if labels.get("level") == Some(&"info".to_string()) { + assert_eq!(entries.len(), 1, "Info stream should have 1 entry"); + } else { + panic!("Unexpected label combination"); + } + } + } + + #[test] + fn test_loki_log_document_serialization() { + let mut arguments = HashMap::new(); + arguments.insert("key1".to_string(), "value1".to_string()); + + let doc = LokiLogDocument { + id: "test-id".to_string(), + subgraph_id: "QmTest".to_string(), + timestamp: "2024-01-15T10:30:00Z".to_string(), + level: "error".to_string(), + text: "Test error".to_string(), + arguments, + meta: LokiLogMeta { + module: "test.rs".to_string(), + line: 42, + column: 10, + }, + }; + + let json = serde_json::to_string(&doc).unwrap(); + assert!(json.contains("\"id\":\"test-id\"")); + assert!(json.contains("\"subgraphId\":\"QmTest\"")); + assert!(json.contains("\"level\":\"error\"")); + assert!(json.contains("\"text\":\"Test error\"")); + } +} diff --git a/graph/src/log/mod.rs b/graph/src/log/mod.rs index dfe8ab35379..93d2f6b7c17 100644 --- a/graph/src/log/mod.rs +++ b/graph/src/log/mod.rs @@ -27,8 +27,11 @@ use std::{fmt, io, result}; use crate::prelude::ENV_VARS; pub mod codes; +pub mod common; pub mod elastic; pub mod factory; +pub mod file; +pub mod loki; pub mod split; pub fn logger(show_debug: bool) -> Logger { diff --git a/graph/src/schema/api.rs b/graph/src/schema/api.rs index 86b13a9f3f2..1e8ef384dd9 100644 --- a/graph/src/schema/api.rs +++ b/graph/src/schema/api.rs @@ -11,7 +11,7 @@ use crate::cheap_clone::CheapClone; use crate::data::graphql::{ObjectOrInterface, ObjectTypeExt, TypeExt}; use crate::data::store::IdType; use crate::env::ENV_VARS; -use crate::schema::{ast, META_FIELD_NAME, META_FIELD_TYPE, SCHEMA_TYPE_NAME}; +use crate::schema::{ast, LOGS_FIELD_NAME, META_FIELD_NAME, META_FIELD_TYPE, SCHEMA_TYPE_NAME}; use crate::data::graphql::ext::{ camel_cased_names, DefinitionExt, DirectiveExt, DocumentExt, ValueExt, @@ -350,7 +350,7 @@ pub(in crate::schema) fn api_schema( ) -> Result { // Refactor: Don't clone the schema. 
let mut api = init_api_schema(input_schema)?; - add_meta_field_type(&mut api.document); + add_builtin_field_types(&mut api.document); add_types_for_object_types(&mut api, input_schema)?; add_types_for_interface_types(&mut api, input_schema)?; add_types_for_aggregation_types(&mut api, input_schema)?; @@ -445,18 +445,24 @@ fn init_api_schema(input_schema: &InputSchema) -> Result .map_err(|e| APISchemaError::SchemaCreationFailed(e.to_string())) } -/// Adds a global `_Meta_` type to the schema. The `_meta` field -/// accepts values of this type -fn add_meta_field_type(api: &mut s::Document) { +/// Adds built-in field types to the schema. Currently adds `_Meta_` and `_Log_` types +/// which are used by the `_meta` and `_logs` fields respectively. +fn add_builtin_field_types(api: &mut s::Document) { lazy_static! { static ref META_FIELD_SCHEMA: s::Document = { let schema = include_str!("meta.graphql"); s::parse_schema(schema).expect("the schema `meta.graphql` is invalid") }; + static ref LOGS_FIELD_SCHEMA: s::Document = { + let schema = include_str!("logs.graphql"); + s::parse_schema(schema).expect("the schema `logs.graphql` is invalid") + }; } api.definitions .extend(META_FIELD_SCHEMA.definitions.iter().cloned()); + api.definitions + .extend(LOGS_FIELD_SCHEMA.definitions.iter().cloned()); } fn add_types_for_object_types( @@ -1073,6 +1079,7 @@ fn add_query_type(api: &mut s::Document, input_schema: &InputSchema) -> Result<( fields.append(&mut agg_fields); fields.append(&mut fulltext_fields); fields.push(meta_field()); + fields.push(logs_field()); let typedef = s::TypeDefinition::Object(s::ObjectType { position: Pos::default(), @@ -1278,6 +1285,102 @@ fn meta_field() -> s::Field { META_FIELD.clone() } +fn logs_field() -> s::Field { + lazy_static! { + static ref LOGS_FIELD: s::Field = s::Field { + position: Pos::default(), + description: Some( + "Query execution logs emitted by the subgraph during indexing. \ + Results are sorted by timestamp in descending order (newest first)." + .to_string() + ), + name: LOGS_FIELD_NAME.to_string(), + arguments: vec![ + // level: LogLevel + s::InputValue { + position: Pos::default(), + description: Some( + "Filter logs by severity level. Only logs at this level will be returned." + .to_string() + ), + name: String::from("level"), + value_type: s::Type::NamedType(String::from("LogLevel")), + default_value: None, + directives: vec![], + }, + // from: String (RFC3339 timestamp) + s::InputValue { + position: Pos::default(), + description: Some( + "Filter logs from this timestamp onwards (inclusive). \ + Must be in RFC3339 format (e.g., '2024-01-15T10:30:00Z')." + .to_string() + ), + name: String::from("from"), + value_type: s::Type::NamedType(String::from("String")), + default_value: None, + directives: vec![], + }, + // to: String (RFC3339 timestamp) + s::InputValue { + position: Pos::default(), + description: Some( + "Filter logs until this timestamp (inclusive). \ + Must be in RFC3339 format (e.g., '2024-01-15T23:59:59Z')." + .to_string() + ), + name: String::from("to"), + value_type: s::Type::NamedType(String::from("String")), + default_value: None, + directives: vec![], + }, + // search: String (full-text search) + s::InputValue { + position: Pos::default(), + description: Some( + "Search for logs containing this text in the message. \ + Case-insensitive substring match. Maximum length: 1000 characters." 
+ .to_string() + ), + name: String::from("search"), + value_type: s::Type::NamedType(String::from("String")), + default_value: None, + directives: vec![], + }, + // first: Int (default 100, max 1000) + s::InputValue { + position: Pos::default(), + description: Some( + "Maximum number of logs to return. Default: 100, Maximum: 1000." + .to_string() + ), + name: String::from("first"), + value_type: s::Type::NamedType(String::from("Int")), + default_value: Some(s::Value::Int(100.into())), + directives: vec![], + }, + // skip: Int (default 0, max 10000) + s::InputValue { + position: Pos::default(), + description: Some( + "Number of logs to skip (for pagination). Default: 0, Maximum: 10000." + .to_string() + ), + name: String::from("skip"), + value_type: s::Type::NamedType(String::from("Int")), + default_value: Some(s::Value::Int(0.into())), + directives: vec![], + }, + ], + field_type: s::Type::NonNullType(Box::new(s::Type::ListType(Box::new( + s::Type::NonNullType(Box::new(s::Type::NamedType(String::from("_Log_")))), + )))), + directives: vec![], + }; + } + LOGS_FIELD.clone() +} + #[cfg(test)] mod tests { use crate::{ diff --git a/graph/src/schema/logs.graphql b/graph/src/schema/logs.graphql new file mode 100644 index 00000000000..be80e029152 --- /dev/null +++ b/graph/src/schema/logs.graphql @@ -0,0 +1,61 @@ +""" +The severity level of a log entry. +Log levels are ordered from most to least severe: CRITICAL > ERROR > WARNING > INFO > DEBUG +""" +enum LogLevel { + "Critical errors that require immediate attention" + CRITICAL + "Error conditions that indicate a failure" + ERROR + "Warning conditions that may require attention" + WARNING + "Informational messages about normal operations" + INFO + "Detailed diagnostic information for debugging" + DEBUG +} + +""" +A log entry emitted by a subgraph during indexing. +Logs can be generated by the subgraph's AssemblyScript code using the `log.*` functions. +""" +type _Log_ { + "Unique identifier for this log entry" + id: String! + "The deployment hash of the subgraph that emitted this log" + subgraphId: String! + "The timestamp when the log was emitted, in RFC3339 format (e.g., '2024-01-15T10:30:00Z')" + timestamp: String! + "The severity level of the log entry" + level: LogLevel! + "The log message text" + text: String! + "Additional structured data passed to the log function as key-value pairs" + arguments: [_LogArgument_!]! + "Metadata about the source location in the subgraph code where the log was emitted" + meta: _LogMeta_! +} + +""" +A key-value pair of additional data associated with a log entry. +These correspond to arguments passed to the log function in the subgraph code. +""" +type _LogArgument_ { + "The parameter name" + key: String! + "The parameter value, serialized as a string" + value: String! +} + +""" +Source code location metadata for a log entry. +Indicates where in the subgraph's AssemblyScript code the log statement was executed. +""" +type _LogMeta_ { + "The module or file path where the log was emitted" + module: String! + "The line number in the source file" + line: Int! + "The column number in the source file" + column: Int! 
+} diff --git a/graph/src/schema/mod.rs b/graph/src/schema/mod.rs index f4e098a4b3e..0720101ce49 100644 --- a/graph/src/schema/mod.rs +++ b/graph/src/schema/mod.rs @@ -42,6 +42,9 @@ pub const INTROSPECTION_SCHEMA_FIELD_NAME: &str = "__schema"; pub const META_FIELD_TYPE: &str = "_Meta_"; pub const META_FIELD_NAME: &str = "_meta"; +pub const LOGS_FIELD_TYPE: &str = "_Log_"; +pub const LOGS_FIELD_NAME: &str = "_logs"; + pub const INTROSPECTION_TYPE_FIELD_NAME: &str = "__type"; pub const BLOCK_FIELD_TYPE: &str = "_Block_"; diff --git a/graphql/src/execution/execution.rs b/graphql/src/execution/execution.rs index 48477d3eb5f..158f4a4b2e1 100644 --- a/graphql/src/execution/execution.rs +++ b/graphql/src/execution/execution.rs @@ -8,7 +8,7 @@ use graph::{ }, futures03::future::TryFutureExt, prelude::{s, CheapClone}, - schema::{is_introspection_field, INTROSPECTION_QUERY_TYPE, META_FIELD_NAME}, + schema::{is_introspection_field, INTROSPECTION_QUERY_TYPE, LOGS_FIELD_NAME, META_FIELD_NAME}, util::{herd_cache::HerdCache, lfu_cache::EvictStats, timed_rw_lock::TimedMutex}, }; use lazy_static::lazy_static; @@ -231,6 +231,9 @@ where /// Whether to include an execution trace in the result pub trace: bool, + + /// The log store to use for querying logs. + pub log_store: Arc, } pub(crate) fn get_field<'a>( @@ -264,6 +267,7 @@ where // `cache_status` is a dead value for the introspection context. cache_status: AtomicCell::new(CacheStatus::Miss), trace: ENV_VARS.log_sql_timing(), + log_store: self.log_store.cheap_clone(), } } } @@ -273,11 +277,12 @@ pub(crate) async fn execute_root_selection_set_uncached( selection_set: &a::SelectionSet, root_type: &sast::ObjectType, ) -> Result<(Object, Trace), Vec> { - // Split the top-level fields into introspection fields and - // regular data fields + // Split the top-level fields into introspection fields, + // logs fields, meta fields, and regular data fields let mut data_set = a::SelectionSet::empty_from(selection_set); let mut intro_set = a::SelectionSet::empty_from(selection_set); let mut meta_items = Vec::new(); + let mut logs_fields = Vec::new(); for field in selection_set.fields_for(root_type)? { // See if this is an introspection or data field. We don't worry about @@ -285,6 +290,8 @@ pub(crate) async fn execute_root_selection_set_uncached( // the data_set SelectionSet if is_introspection_field(&field.name) { intro_set.push(field)? 
+ } else if field.name == LOGS_FIELD_NAME { + logs_fields.push(field) } else if field.name == META_FIELD_NAME || field.name == "__typename" { meta_items.push(field) } else { @@ -292,6 +299,15 @@ pub(crate) async fn execute_root_selection_set_uncached( } } + // Validate that _logs queries cannot be combined with regular entity queries + if !logs_fields.is_empty() && !data_set.is_empty() { + return Err(vec![QueryExecutionError::ValidationError( + None, + "The _logs query cannot be combined with other entity queries in the same request" + .to_string(), + )]); + } + // If we are getting regular data, prefetch it from the database let (mut values, trace) = if data_set.is_empty() && meta_items.is_empty() { (Object::default(), Trace::None) @@ -314,6 +330,64 @@ pub(crate) async fn execute_root_selection_set_uncached( ); } + // Resolve logs fields, if there are any + for field in logs_fields { + use graph::data::graphql::object; + + // Build log query from field arguments + let log_query = crate::store::logs::build_log_query(field, ctx.query.schema.id()) + .map_err(|e| vec![e])?; + + // Query the log store + let log_entries = ctx.log_store.query_logs(log_query).await.map_err(|e| { + vec![QueryExecutionError::StoreError( + anyhow::Error::from(e).into(), + )] + })?; + + // Convert log entries to GraphQL values + let log_values: Vec = log_entries + .into_iter() + .map(|entry| { + // Convert arguments Vec<(String, String)> to GraphQL objects + let arguments: Vec = entry + .arguments + .into_iter() + .map(|(key, value)| { + object! { + key: key, + value: value, + __typename: "_LogArgument_" + } + }) + .collect(); + + // Convert log level to string + let level_str = entry.level.as_str().to_uppercase(); + + object! { + id: entry.id, + subgraphId: entry.subgraph_id.to_string(), + timestamp: entry.timestamp, + level: level_str, + text: entry.text, + arguments: arguments, + meta: object! { + module: entry.meta.module, + line: r::Value::Int(entry.meta.line), + column: r::Value::Int(entry.meta.column), + __typename: "_LogMeta_" + }, + __typename: "_Log_" + } + }) + .collect(); + + let response_key = Word::from(field.response_key()); + let logs_object = Object::from_iter(vec![(response_key, r::Value::List(log_values))]); + values.append(logs_object); + } + Ok((values, trace)) } diff --git a/graphql/src/query/mod.rs b/graphql/src/query/mod.rs index 641eb4581bb..86244c4cc71 100644 --- a/graphql/src/query/mod.rs +++ b/graphql/src/query/mod.rs @@ -29,6 +29,9 @@ pub struct QueryExecutionOptions { /// Whether to include an execution trace in the result pub trace: bool, + + /// The log store to use for querying logs. + pub log_store: Arc, } /// Executes a query and returns a result. 
@@ -52,6 +55,7 @@ where max_skip: options.max_skip, cache_status: Default::default(), trace: options.trace, + log_store: options.log_store, }); let selection_set = selection_set diff --git a/graphql/src/runner.rs b/graphql/src/runner.rs index 293dcaa111b..3af945e030f 100644 --- a/graphql/src/runner.rs +++ b/graphql/src/runner.rs @@ -26,6 +26,7 @@ pub struct GraphQlRunner { store: Arc, load_manager: Arc, graphql_metrics: Arc, + log_store: Arc, } #[cfg(debug_assertions)] @@ -44,6 +45,7 @@ where store: Arc, load_manager: Arc, registry: Arc, + log_store: Arc, ) -> Self { let logger = logger.new(o!("component" => "GraphQlRunner")); let graphql_metrics = Arc::new(GraphQLMetrics::new(registry)); @@ -52,6 +54,7 @@ where store, load_manager, graphql_metrics, + log_store, } } @@ -186,6 +189,7 @@ where max_first: max_first.unwrap_or(ENV_VARS.graphql.max_first), max_skip: max_skip.unwrap_or(ENV_VARS.graphql.max_skip), trace: do_trace, + log_store: self.log_store.cheap_clone(), }, )); } diff --git a/graphql/src/store/logs.rs b/graphql/src/store/logs.rs new file mode 100644 index 00000000000..44b60041ffa --- /dev/null +++ b/graphql/src/store/logs.rs @@ -0,0 +1,174 @@ +use graph::components::log_store::LogQuery; +use graph::prelude::{q, r, DeploymentHash, QueryExecutionError}; + +use crate::execution::ast as a; + +const MAX_FIRST: u32 = 1000; +const MAX_SKIP: u32 = 10000; +const MAX_TEXT_LENGTH: usize = 1000; + +/// Validate and sanitize text search input to prevent injection attacks +fn validate_text_input(text: &str) -> Result<(), &'static str> { + if text.is_empty() { + return Err("search text cannot be empty"); + } + + if text.len() > MAX_TEXT_LENGTH { + return Err("search text exceeds maximum length of 1000 characters"); + } + + // Reject strings that look like Elasticsearch query DSL to prevent injection + if text + .chars() + .any(|c| matches!(c, '{' | '}' | '[' | ']' | ':' | '"')) + { + return Err("search text contains invalid characters ({}[]:\")"); + } + + Ok(()) +} + +/// Validate RFC3339 timestamp format +fn validate_timestamp(timestamp: &str) -> Result<(), &'static str> { + if !timestamp.contains('T') { + return Err("must be in ISO 8601 format (e.g., 2024-01-15T10:30:00Z)"); + } + + if !timestamp.ends_with('Z') && !timestamp.contains('+') && !timestamp.contains('-') { + return Err("must include timezone (Z or offset like +00:00)"); + } + + if timestamp.len() > 50 { + return Err("timestamp exceeds maximum length"); + } + + // Check for suspicious characters that could be injection attempts + if timestamp + .chars() + .any(|c| matches!(c, '{' | '}' | ';' | '\'' | '"')) + { + return Err("timestamp contains invalid characters"); + } + + Ok(()) +} + +pub fn build_log_query( + field: &a::Field, + subgraph_id: &DeploymentHash, +) -> Result { + let mut level = None; + let mut from = None; + let mut to = None; + let mut search = None; + let mut first = 100; + let mut skip = 0; + + // Parse arguments + for (name, value) in &field.arguments { + match name.as_str() { + "level" => { + if let r::Value::Enum(level_str) = value { + level = Some(level_str.parse().map_err(|e: String| { + QueryExecutionError::InvalidArgumentError( + field.position, + "level".to_string(), + q::Value::String(e), + ) + })?); + } + } + "from" => { + if let r::Value::String(from_str) = value { + validate_timestamp(from_str).map_err(|e| { + QueryExecutionError::InvalidArgumentError( + field.position, + "from".to_string(), + q::Value::String(format!("Invalid timestamp: {}", e)), + ) + })?; + from = Some(from_str.clone()); + } + } + 
"to" => { + if let r::Value::String(to_str) = value { + validate_timestamp(to_str).map_err(|e| { + QueryExecutionError::InvalidArgumentError( + field.position, + "to".to_string(), + q::Value::String(format!("Invalid timestamp: {}", e)), + ) + })?; + to = Some(to_str.clone()); + } + } + "search" => { + if let r::Value::String(search_str) = value { + validate_text_input(search_str).map_err(|e| { + QueryExecutionError::InvalidArgumentError( + field.position, + "search".to_string(), + q::Value::String(format!("Invalid search text: {}", e)), + ) + })?; + search = Some(search_str.clone()); + } + } + "first" => { + if let r::Value::Int(first_val) = value { + let first_i64 = *first_val; + if first_i64 < 0 { + return Err(QueryExecutionError::InvalidArgumentError( + field.position, + "first".to_string(), + q::Value::String("first must be non-negative".to_string()), + )); + } + let first_u32 = first_i64 as u32; + if first_u32 > MAX_FIRST { + return Err(QueryExecutionError::InvalidArgumentError( + field.position, + "first".to_string(), + q::Value::String(format!("first must not exceed {}", MAX_FIRST)), + )); + } + first = first_u32; + } + } + "skip" => { + if let r::Value::Int(skip_val) = value { + let skip_i64 = *skip_val; + if skip_i64 < 0 { + return Err(QueryExecutionError::InvalidArgumentError( + field.position, + "skip".to_string(), + q::Value::String("skip must be non-negative".to_string()), + )); + } + let skip_u32 = skip_i64 as u32; + if skip_u32 > MAX_SKIP { + return Err(QueryExecutionError::InvalidArgumentError( + field.position, + "skip".to_string(), + q::Value::String(format!("skip must not exceed {}", MAX_SKIP)), + )); + } + skip = skip_u32; + } + } + _ => { + // Unknown argument, ignore + } + } + } + + Ok(LogQuery { + subgraph_id: subgraph_id.clone(), + level, + from, + to, + search, + first, + skip, + }) +} diff --git a/graphql/src/store/mod.rs b/graphql/src/store/mod.rs index 6a4850b6a86..8f77a832b90 100644 --- a/graphql/src/store/mod.rs +++ b/graphql/src/store/mod.rs @@ -1,3 +1,4 @@ +pub mod logs; mod prefetch; mod query; mod resolver; diff --git a/graphql/src/store/resolver.rs b/graphql/src/store/resolver.rs index 426e921f2c6..6a237e8a17d 100644 --- a/graphql/src/store/resolver.rs +++ b/graphql/src/store/resolver.rs @@ -12,8 +12,8 @@ use graph::data::value::{Object, Word}; use graph::derive::CheapClone; use graph::prelude::*; use graph::schema::{ - ast as sast, INTROSPECTION_SCHEMA_FIELD_NAME, INTROSPECTION_TYPE_FIELD_NAME, META_FIELD_NAME, - META_FIELD_TYPE, + ast as sast, INTROSPECTION_SCHEMA_FIELD_NAME, INTROSPECTION_TYPE_FIELD_NAME, LOGS_FIELD_NAME, + META_FIELD_NAME, META_FIELD_TYPE, }; use graph::schema::{ErrorPolicy, BLOCK_FIELD_TYPE}; @@ -353,6 +353,23 @@ impl Resolver for StoreResolver { return Ok(()); } + // Check if the query only contains debugging fields (_meta, _logs). + // If so, don't add indexing errors - these queries are specifically for debugging + // failed subgraphs and should work without errors. + // Introspection queries (__schema, __type) still get the indexing_error to inform + // users the subgraph has issues, but they return data. + let only_debugging_fields = result + .data() + .map(|data| { + data.iter() + .all(|(key, _)| key == META_FIELD_NAME || key == LOGS_FIELD_NAME) + }) + .unwrap_or(false); + + if only_debugging_fields { + return Ok(()); + } + // Add the "indexing_error" to the response. 
assert!(result.errors_mut().is_empty()); *result.errors_mut() = vec![QueryError::IndexingError]; @@ -364,9 +381,10 @@ impl Resolver for StoreResolver { ErrorPolicy::Deny => { let mut data = result.take_data(); - // Only keep the _meta, __schema and __type fields from the data + // Only keep the _meta, _logs, __schema and __type fields from the data let meta_fields = data.as_mut().and_then(|d| { let meta_field = d.remove(META_FIELD_NAME); + let logs_field = d.remove(LOGS_FIELD_NAME); let schema_field = d.remove(INTROSPECTION_SCHEMA_FIELD_NAME); let type_field = d.remove(INTROSPECTION_TYPE_FIELD_NAME); @@ -376,6 +394,9 @@ impl Resolver for StoreResolver { if let Some(meta_field) = meta_field { meta_fields.push((Word::from(META_FIELD_NAME), meta_field)); } + if let Some(logs_field) = logs_field { + meta_fields.push((Word::from(LOGS_FIELD_NAME), logs_field)); + } if let Some(schema_field) = schema_field { meta_fields .push((Word::from(INTROSPECTION_SCHEMA_FIELD_NAME), schema_field)); diff --git a/node/src/bin/manager.rs b/node/src/bin/manager.rs index 792df8853c9..97564e7f6fc 100644 --- a/node/src/bin/manager.rs +++ b/node/src/bin/manager.rs @@ -1033,7 +1033,13 @@ impl Context { let load_manager = Arc::new(LoadManager::new(&logger, vec![], vec![], registry.clone())); - Arc::new(GraphQlRunner::new(&logger, store, load_manager, registry)) + Arc::new(GraphQlRunner::new( + &logger, + store, + load_manager, + registry, + Arc::new(graph::components::log_store::NoOpLogStore), + )) } async fn networks(&self) -> anyhow::Result { diff --git a/node/src/launcher.rs b/node/src/launcher.rs index 9c0bef19e44..9d0633e9267 100644 --- a/node/src/launcher.rs +++ b/node/src/launcher.rs @@ -36,6 +36,7 @@ use tokio_util::sync::CancellationToken; use crate::config::Config; use crate::helpers::watch_subgraph_updates; +use crate::log_config_provider::{LogStoreConfigProvider, LogStoreConfigSources}; use crate::network_setup::Networks; use crate::opt::Opt; use crate::store_builder::StoreBuilder; @@ -364,6 +365,7 @@ fn build_graphql_server( metrics_registry: Arc, network_store: &Arc, logger_factory: &LoggerFactory, + log_store: Arc, ) -> GraphQLQueryServer> { let shards: Vec<_> = config.stores.keys().cloned().collect(); let load_manager = Arc::new(LoadManager::new( @@ -377,6 +379,7 @@ fn build_graphql_server( network_store.clone(), load_manager, metrics_registry, + log_store, )); GraphQLQueryServer::new(logger_factory, graphql_runner.clone()) @@ -440,20 +443,121 @@ pub async fn run( info!(logger, "Starting up"; "node_id" => &node_id); - // Optionally, identify the Elasticsearch logging configuration - let elastic_config = opt - .elasticsearch_url - .clone() - .map(|endpoint| ElasticLoggingConfig { - endpoint, - username: opt.elasticsearch_user.clone(), - password: opt.elasticsearch_password.clone(), - client: reqwest::Client::new(), - }); + // Create log store configuration provider + // Build LogStoreConfig from CLI args with backward compatibility + let cli_config = if let Some(backend) = opt.log_store_backend.as_ref() { + // New generic CLI args used + match backend.to_lowercase().as_str() { + "elasticsearch" | "elastic" | "es" => { + let url = opt + .log_store_elasticsearch_url + .clone() + .or_else(|| { + if opt.elasticsearch_url.is_some() { + warn!( + logger, + "Using deprecated --elasticsearch-url, use --log-store-elasticsearch-url instead" + ); + } + opt.elasticsearch_url.clone() + }); + + url.map(|endpoint| { + let index = opt + .log_store_elasticsearch_index + .clone() + .or_else(|| 
std::env::var("GRAPH_LOG_STORE_ELASTICSEARCH_INDEX").ok()) + .or_else(|| std::env::var("GRAPH_ELASTIC_SEARCH_INDEX").ok()) + .unwrap_or_else(|| "subgraph".to_string()); + + let timeout_secs = std::env::var("GRAPH_LOG_STORE_ELASTICSEARCH_TIMEOUT") + .or_else(|_| std::env::var("GRAPH_ELASTICSEARCH_TIMEOUT")) + .ok() + .and_then(|s| s.parse::().ok()) + .unwrap_or(10); + + graph::components::log_store::LogStoreConfig::Elasticsearch { + endpoint, + username: opt + .log_store_elasticsearch_user + .clone() + .or_else(|| opt.elasticsearch_user.clone()), + password: opt + .log_store_elasticsearch_password + .clone() + .or_else(|| opt.elasticsearch_password.clone()), + index, + timeout_secs, + } + }) + } + + "loki" => opt.log_store_loki_url.clone().map(|endpoint| { + graph::components::log_store::LogStoreConfig::Loki { + endpoint, + tenant_id: opt.log_store_loki_tenant_id.clone(), + } + }), + + "file" | "files" => opt.log_store_file_dir.clone().map(|directory| { + graph::components::log_store::LogStoreConfig::File { + directory: std::path::PathBuf::from(directory), + max_file_size: opt.log_store_file_max_size.unwrap_or(100 * 1024 * 1024), + retention_days: opt.log_store_file_retention_days.unwrap_or(30), + } + }), + + "disabled" | "none" => None, + + other => { + warn!(logger, "Invalid log store backend: {}", other); + None + } + } + } else if opt.elasticsearch_url.is_some() { + // Old Elasticsearch-specific CLI args used (backward compatibility) + warn!( + logger, + "Using deprecated --elasticsearch-url CLI argument, \ + please use --log-store-backend elasticsearch --log-store-elasticsearch-url instead" + ); + + let index = opt + .log_store_elasticsearch_index + .clone() + .or_else(|| std::env::var("GRAPH_LOG_STORE_ELASTICSEARCH_INDEX").ok()) + .or_else(|| std::env::var("GRAPH_ELASTIC_SEARCH_INDEX").ok()) + .unwrap_or_else(|| "subgraph".to_string()); + + let timeout_secs = std::env::var("GRAPH_LOG_STORE_ELASTICSEARCH_TIMEOUT") + .or_else(|_| std::env::var("GRAPH_ELASTICSEARCH_TIMEOUT")) + .ok() + .and_then(|s| s.parse::().ok()) + .unwrap_or(10); + + Some( + graph::components::log_store::LogStoreConfig::Elasticsearch { + endpoint: opt.elasticsearch_url.clone().unwrap(), + username: opt.elasticsearch_user.clone(), + password: opt.elasticsearch_password.clone(), + index, + timeout_secs, + }, + ) + } else { + // No CLI config provided + None + }; + + let log_config_provider = LogStoreConfigProvider::new(LogStoreConfigSources { cli_config }); + + // Resolve log store (for querying) and config (for drains) + // Priority: GRAPH_LOG_STORE env var → CLI config → NoOp/None + let (log_store, log_store_config) = log_config_provider.resolve(&logger); // Create a component and subgraph logger factory let logger_factory = - LoggerFactory::new(logger.clone(), elastic_config, metrics_registry.clone()); + LoggerFactory::new(logger.clone(), log_store_config, metrics_registry.clone()); let arweave_resolver = Arc::new(ArweaveClient::new( logger.cheap_clone(), @@ -560,6 +664,7 @@ pub async fn run( metrics_registry.clone(), &network_store, &logger_factory, + log_store.clone(), ); let index_node_server = IndexNodeServer::new( diff --git a/node/src/lib.rs b/node/src/lib.rs index a0fe189f1f7..7344fc89a04 100644 --- a/node/src/lib.rs +++ b/node/src/lib.rs @@ -9,6 +9,7 @@ pub mod chain; pub mod config; mod helpers; pub mod launcher; +pub mod log_config_provider; pub mod manager; pub mod network_setup; pub mod opt; diff --git a/node/src/log_config_provider.rs b/node/src/log_config_provider.rs new file mode 100644 index 
00000000000..133be35af14
--- /dev/null
+++ b/node/src/log_config_provider.rs
@@ -0,0 +1,232 @@
+use graph::components::log_store::{LogStore, LogStoreConfig, LogStoreFactory, NoOpLogStore};
+use graph::prelude::*;
+use slog::{info, warn, Logger};
+use std::sync::Arc;
+
+/// Configuration sources for log store resolution
+pub struct LogStoreConfigSources {
+    /// Log store config from CLI arguments (any backend)
+    pub cli_config: Option<LogStoreConfig>,
+}
+
+/// Provider for resolving log store configuration from multiple sources
+///
+/// It handles multi-source configuration with the following priority:
+/// 1. GRAPH_LOG_STORE environment variable (supports all backends)
+/// 2. CLI configuration (any backend)
+/// 3. NoOp/None (disabled)
+pub struct LogStoreConfigProvider {
+    sources: LogStoreConfigSources,
+}
+
+impl LogStoreConfigProvider {
+    /// Create a new provider with given configuration sources
+    pub fn new(sources: LogStoreConfigSources) -> Self {
+        Self { sources }
+    }
+
+    /// Resolve and create a LogStore for querying logs
+    ///
+    /// Priority: GRAPH_LOG_STORE env var → CLI config → NoOp
+    pub fn resolve_log_store(&self, logger: &Logger) -> Arc<dyn LogStore> {
+        // Try GRAPH_LOG_STORE environment variable
+        match LogStoreFactory::from_env() {
+            Ok(config) => match LogStoreFactory::from_config(config) {
+                Ok(store) => {
+                    info!(
+                        logger,
+                        "Log store initialized from GRAPH_LOG_STORE environment variable"
+                    );
+                    return store;
+                }
+                Err(e) => {
+                    warn!(
+                        logger,
+                        "Failed to initialize log store from GRAPH_LOG_STORE: {}, falling back to CLI config",
+                        e
+                    );
+                    // Fall through to CLI fallback
+                }
+            },
+            Err(_) => {
+                // No GRAPH_LOG_STORE env var, fall through to CLI config
+            }
+        }
+
+        // Try CLI config
+        if let Some(cli_store) = self.resolve_cli_store(logger) {
+            return cli_store;
+        }
+
+        // Default to NoOp
+        info!(
+            logger,
+            "No log store configured, queries will return empty results"
+        );
+        Arc::new(NoOpLogStore)
+    }
+
+    /// Resolve LogStoreConfig for drain selection (write side)
+    ///
+    /// Priority: GRAPH_LOG_STORE env var → CLI config → None
+    pub fn resolve_log_store_config(&self, _logger: &Logger) -> Option<LogStoreConfig> {
+        // Try GRAPH_LOG_STORE environment variable
+        // Note: from_env() returns Ok(Disabled) when GRAPH_LOG_STORE is not set,
+        // so we need to check if it's actually configured
+        if let Ok(config) = LogStoreFactory::from_env() {
+            if !matches!(config, LogStoreConfig::Disabled) {
+                return Some(config);
+            }
+        }
+
+        // Fallback to CLI config (any backend)
+        self.sources.cli_config.clone()
+    }
+
+    /// Convenience method: Resolve both log store and config at once
+    ///
+    /// This is the primary entry point for most callers, as it resolves both
+    /// the LogStore (for querying) and LogStoreConfig (for drain selection)
+    /// in a single call.
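The `resolve` method defined next combines the read side (a `LogStore` for queries) and the write side (a `LogStoreConfig` for drains). As a compact illustration of the precedence the doc comments describe, here is the same env-over-CLI-over-disabled rule reduced to a pure function over hypothetical config values, not the PR's types:

```rust
// Sketch: the env-var → CLI → disabled precedence, with illustrative types.
#[derive(Debug, Clone, PartialEq)]
enum LogBackend {
    Elasticsearch { endpoint: String },
    Loki { endpoint: String },
    File { directory: String },
    Disabled,
}

/// Pick the effective backend: an explicitly configured environment value
/// wins, then the CLI value, and finally the store is disabled.
fn resolve_backend(env: Option<LogBackend>, cli: Option<LogBackend>) -> LogBackend {
    match env {
        Some(cfg) if cfg != LogBackend::Disabled => cfg,
        _ => cli.unwrap_or(LogBackend::Disabled),
    }
}

fn main() {
    let cli = Some(LogBackend::File { directory: "./graph-logs".into() });

    // No env override: the CLI configuration is used.
    assert_eq!(resolve_backend(None, cli.clone()), cli.clone().unwrap());

    // An env override beats the CLI value.
    let env = Some(LogBackend::Loki { endpoint: "http://localhost:3100".into() });
    assert_eq!(resolve_backend(env.clone(), cli), env.unwrap());
}
```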
+ pub fn resolve(&self, logger: &Logger) -> (Arc, Option) { + let store = self.resolve_log_store(logger); + let config = self.resolve_log_store_config(logger); + + if let Some(ref cfg) = config { + info!(logger, "Log drain initialized"; "backend" => format!("{:?}", cfg)); + } + + (store, config) + } + + /// Helper: Try to create log store from CLI config (any backend) + fn resolve_cli_store(&self, logger: &Logger) -> Option> { + self.sources.cli_config.as_ref().map(|config| { + match LogStoreFactory::from_config(config.clone()) { + Ok(store) => { + info!(logger, "Log store initialized from CLI configuration"); + store + } + Err(e) => { + warn!( + logger, + "Failed to initialize log store from CLI config: {}, using NoOp", e + ); + Arc::new(NoOpLogStore) + } + } + }) + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_no_config_returns_noop() { + std::env::remove_var("GRAPH_LOG_STORE"); + + let logger = graph::log::logger(true); + let provider = LogStoreConfigProvider::new(LogStoreConfigSources { cli_config: None }); + + let store = provider.resolve_log_store(&logger); + assert!(!store.is_available()); + + let config = provider.resolve_log_store_config(&logger); + assert!(config.is_none()); + } + + #[test] + fn test_elastic_from_cli() { + std::env::remove_var("GRAPH_LOG_STORE"); + + let logger = graph::log::logger(true); + let cli_config = LogStoreConfig::Elasticsearch { + endpoint: "http://localhost:9200".to_string(), + username: Some("user".to_string()), + password: Some("pass".to_string()), + index: "test-index".to_string(), + timeout_secs: 10, + }; + + let provider = LogStoreConfigProvider::new(LogStoreConfigSources { + cli_config: Some(cli_config), + }); + + let config = provider.resolve_log_store_config(&logger); + assert!(config.is_some()); + + if let Some(LogStoreConfig::Elasticsearch { + endpoint, + username, + password, + index, + .. + }) = config + { + assert_eq!(endpoint, "http://localhost:9200"); + assert_eq!(username, Some("user".to_string())); + assert_eq!(password, Some("pass".to_string())); + assert_eq!(index, "test-index"); + } else { + panic!("Expected Elasticsearch config"); + } + } + + #[test] + fn test_resolve_convenience_method() { + std::env::remove_var("GRAPH_LOG_STORE"); + + let logger = graph::log::logger(true); + let cli_config = LogStoreConfig::Elasticsearch { + endpoint: "http://localhost:9200".to_string(), + username: None, + password: None, + index: "test-index".to_string(), + timeout_secs: 10, + }; + + let provider = LogStoreConfigProvider::new(LogStoreConfigSources { + cli_config: Some(cli_config), + }); + + let (_store, config) = provider.resolve(&logger); + assert!(config.is_some()); + + if let Some(LogStoreConfig::Elasticsearch { endpoint, .. 
}) = config { + assert_eq!(endpoint, "http://localhost:9200"); + } else { + panic!("Expected Elasticsearch config"); + } + } + + #[test] + fn test_loki_from_cli() { + std::env::remove_var("GRAPH_LOG_STORE"); + + let logger = graph::log::logger(true); + let cli_config = LogStoreConfig::Loki { + endpoint: "http://localhost:3100".to_string(), + tenant_id: Some("test-tenant".to_string()), + }; + + let provider = LogStoreConfigProvider::new(LogStoreConfigSources { + cli_config: Some(cli_config), + }); + + let config = provider.resolve_log_store_config(&logger); + assert!(config.is_some()); + + if let Some(LogStoreConfig::Loki { + endpoint, + tenant_id, + }) = config + { + assert_eq!(endpoint, "http://localhost:3100"); + assert_eq!(tenant_id, Some("test-tenant".to_string())); + } else { + panic!("Expected Loki config"); + } + } +} diff --git a/node/src/opt.rs b/node/src/opt.rs index 3708a7da493..a4ccc73cbfe 100644 --- a/node/src/opt.rs +++ b/node/src/opt.rs @@ -165,18 +165,92 @@ pub struct Opt { #[clap(long, help = "Enable debug logging")] pub debug: bool, + // ============================================ + // Log Store Configuration - NEW GENERIC ARGS + // ============================================ + #[clap( + long = "log-store-backend", + value_name = "BACKEND", + help = "Log store backend to use (disabled, elasticsearch, loki, file)" + )] + pub log_store_backend: Option, + + // --- Elasticsearch Configuration --- + #[clap( + long = "log-store-elasticsearch-url", + value_name = "URL", + help = "Elasticsearch URL for log storage" + )] + pub log_store_elasticsearch_url: Option, + #[clap( + long = "log-store-elasticsearch-user", + value_name = "USER", + help = "Elasticsearch username for authentication" + )] + pub log_store_elasticsearch_user: Option, + #[clap( + long = "log-store-elasticsearch-password", + value_name = "PASSWORD", + hide_env_values = true, + help = "Elasticsearch password for authentication" + )] + pub log_store_elasticsearch_password: Option, + #[clap( + long = "log-store-elasticsearch-index", + value_name = "INDEX", + help = "Elasticsearch index name (default: subgraph)" + )] + pub log_store_elasticsearch_index: Option, + + // --- Loki Configuration --- + #[clap( + long = "log-store-loki-url", + value_name = "URL", + help = "Loki URL for log storage" + )] + pub log_store_loki_url: Option, + #[clap( + long = "log-store-loki-tenant-id", + value_name = "TENANT_ID", + help = "Loki tenant ID for multi-tenancy" + )] + pub log_store_loki_tenant_id: Option, + + // --- File Configuration --- + #[clap( + long = "log-store-file-dir", + value_name = "DIR", + help = "Directory for file-based log storage" + )] + pub log_store_file_dir: Option, + #[clap( + long = "log-store-file-max-size", + value_name = "BYTES", + help = "Maximum log file size in bytes (default: 104857600 = 100MB)" + )] + pub log_store_file_max_size: Option, + #[clap( + long = "log-store-file-retention-days", + value_name = "DAYS", + help = "Number of days to retain log files (default: 30)" + )] + pub log_store_file_retention_days: Option, + + // ================================================ + // DEPRECATED - OLD ELASTICSEARCH-SPECIFIC ARGS + // ================================================ #[clap( long, value_name = "URL", env = "ELASTICSEARCH_URL", - help = "Elasticsearch service to write subgraph logs to" + help = "DEPRECATED: Use --log-store-elasticsearch-url instead. 
Elasticsearch service to write subgraph logs to" )] pub elasticsearch_url: Option, #[clap( long, value_name = "USER", env = "ELASTICSEARCH_USER", - help = "User to use for Elasticsearch logging" + help = "DEPRECATED: Use --log-store-elasticsearch-user instead. User to use for Elasticsearch logging" )] pub elasticsearch_user: Option, #[clap( @@ -184,7 +258,7 @@ pub struct Opt { value_name = "PASSWORD", env = "ELASTICSEARCH_PASSWORD", hide_env_values = true, - help = "Password to use for Elasticsearch logging" + help = "DEPRECATED: Use --log-store-elasticsearch-password instead. Password to use for Elasticsearch logging" )] pub elasticsearch_password: Option, #[clap( diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml index 9276137fd13..4470fab39b0 100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -80,6 +80,15 @@ importers: specifier: 0.34.0 version: 0.34.0 + tests/integration-tests/logs-query: + devDependencies: + '@graphprotocol/graph-cli': + specifier: 0.69.0 + version: 0.69.0(@types/node@24.3.0)(bufferutil@4.0.9)(encoding@0.1.13)(node-fetch@2.7.0(encoding@0.1.13))(typescript@5.9.2)(utf-8-validate@5.0.10) + '@graphprotocol/graph-ts': + specifier: 0.34.0 + version: 0.34.0 + tests/integration-tests/multiple-subgraph-datasources: devDependencies: '@graphprotocol/graph-cli': @@ -308,6 +317,15 @@ importers: specifier: 0.31.0 version: 0.31.0 + tests/runner-tests/logs-query: + devDependencies: + '@graphprotocol/graph-cli': + specifier: 0.69.0 + version: 0.69.0(@types/node@24.3.0)(bufferutil@4.0.9)(encoding@0.1.13)(node-fetch@2.7.0(encoding@0.1.13))(typescript@5.9.2)(utf-8-validate@5.0.10) + '@graphprotocol/graph-ts': + specifier: 0.34.0 + version: 0.34.0 + tests/runner-tests/substreams: devDependencies: '@graphprotocol/graph-cli': @@ -3340,7 +3358,7 @@ snapshots: binary-install-raw: 0.0.13(debug@4.3.4) chalk: 3.0.0 chokidar: 3.5.3 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.3.4 docker-compose: 0.23.19 dockerode: 2.5.8 fs-extra: 9.1.0 @@ -3379,7 +3397,7 @@ snapshots: binary-install-raw: 0.0.13(debug@4.3.4) chalk: 3.0.0 chokidar: 3.5.3 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.3.4 docker-compose: 0.23.19 dockerode: 2.5.8 fs-extra: 9.1.0 @@ -3420,7 +3438,7 @@ snapshots: binary-install-raw: 0.0.13(debug@4.3.4) chalk: 3.0.0 chokidar: 3.5.3 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.3.4 docker-compose: 0.23.19 dockerode: 2.5.8 fs-extra: 9.1.0 @@ -3461,7 +3479,7 @@ snapshots: binary-install-raw: 0.0.13(debug@4.3.4) chalk: 3.0.0 chokidar: 3.5.3 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.3.4 docker-compose: 0.23.19 dockerode: 2.5.8 fs-extra: 9.1.0 @@ -3502,7 +3520,7 @@ snapshots: binary-install-raw: 0.0.13(debug@4.3.4) chalk: 3.0.0 chokidar: 3.5.3 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.3.4 docker-compose: 0.23.19 dockerode: 2.5.8 fs-extra: 9.1.0 @@ -3542,7 +3560,7 @@ snapshots: binary-install-raw: 0.0.13(debug@4.3.4) chalk: 3.0.0 chokidar: 3.5.3 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.3.4 docker-compose: 0.23.19 dockerode: 2.5.8 fs-extra: 9.1.0 @@ -3583,7 +3601,7 @@ snapshots: binary-install-raw: 0.0.13(debug@4.3.4) chalk: 3.0.0 chokidar: 3.5.3 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.3.4 docker-compose: 0.23.19 dockerode: 2.5.8 fs-extra: 9.1.0 @@ -3984,7 +4002,7 @@ snapshots: chalk: 4.1.2 clean-stack: 3.0.1 cli-progress: 3.12.0 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.4.1(supports-color@8.1.1) ejs: 3.1.10 get-package-type: 0.1.0 globby: 11.1.0 @@ -4020,7 +4038,7 @@ snapshots: chalk: 4.1.2 clean-stack: 3.0.1 cli-progress: 3.12.0 - debug: 
4.3.4(supports-color@8.1.1) + debug: 4.4.1(supports-color@8.1.1) ejs: 3.1.10 fs-extra: 9.1.0 get-package-type: 0.1.0 @@ -4032,7 +4050,7 @@ snapshots: natural-orderby: 2.0.3 object-treeify: 1.1.33 password-prompt: 1.1.3 - semver: 7.4.0 + semver: 7.7.2 string-width: 4.2.3 strip-ansi: 6.0.1 supports-color: 8.1.1 @@ -4069,7 +4087,7 @@ snapshots: natural-orderby: 2.0.3 object-treeify: 1.1.33 password-prompt: 1.1.3 - semver: 7.6.3 + semver: 7.7.2 string-width: 4.2.3 strip-ansi: 6.0.1 supports-color: 8.1.1 @@ -4140,7 +4158,7 @@ snapshots: is-wsl: 2.2.0 lilconfig: 3.1.3 minimatch: 9.0.5 - semver: 7.6.3 + semver: 7.7.2 string-width: 4.2.3 supports-color: 8.1.1 tinyglobby: 0.2.14 @@ -4152,7 +4170,7 @@ snapshots: dependencies: '@oclif/core': 2.16.0(@types/node@24.3.0)(typescript@5.9.2) chalk: 4.1.2 - debug: 4.3.4(supports-color@8.1.1) + debug: 4.4.1(supports-color@8.1.1) transitivePeerDependencies: - '@swc/core' - '@swc/wasm' @@ -4866,11 +4884,9 @@ snapshots: dependencies: ms: 2.1.3 - debug@4.3.4(supports-color@8.1.1): + debug@4.3.4: dependencies: ms: 2.1.2 - optionalDependencies: - supports-color: 8.1.1 debug@4.3.7(supports-color@8.1.1): dependencies: @@ -5098,7 +5114,7 @@ snapshots: execa@5.1.1: dependencies: - cross-spawn: 7.0.3 + cross-spawn: 7.0.6 get-stream: 6.0.1 human-signals: 2.1.0 is-stream: 2.0.1 @@ -5162,7 +5178,7 @@ snapshots: follow-redirects@1.15.11(debug@4.3.4): optionalDependencies: - debug: 4.3.4(supports-color@8.1.1) + debug: 4.3.4 follow-redirects@1.15.11(debug@4.3.7): optionalDependencies: diff --git a/server/index-node/src/service.rs b/server/index-node/src/service.rs index 09ddfd29038..d54e658c430 100644 --- a/server/index-node/src/service.rs +++ b/server/index-node/src/service.rs @@ -153,6 +153,7 @@ where max_first: u32::MAX, max_skip: u32::MAX, trace: false, + log_store: Arc::new(graph::components::log_store::NoOpLogStore), }; let (result, _) = execute_query(query_clone.cheap_clone(), None, None, options).await; query_clone.log_execution(0); diff --git a/store/test-store/src/store.rs b/store/test-store/src/store.rs index af973c32993..d33a1de1b15 100644 --- a/store/test-store/src/store.rs +++ b/store/test-store/src/store.rs @@ -558,7 +558,7 @@ async fn execute_subgraph_query_internal( error_policy, query.schema.id().clone(), graphql_metrics(), - LOAD_MANAGER.clone() + LOAD_MANAGER.clone(), ) .await ); @@ -572,6 +572,7 @@ async fn execute_subgraph_query_internal( max_first: u32::MAX, max_skip: u32::MAX, trace, + log_store: std::sync::Arc::new(graph::components::log_store::NoOpLogStore), }, ) .await; diff --git a/store/test-store/tests/graphql/introspection.rs b/store/test-store/tests/graphql/introspection.rs index 6a978bccfc5..750c6f34f62 100644 --- a/store/test-store/tests/graphql/introspection.rs +++ b/store/test-store/tests/graphql/introspection.rs @@ -125,6 +125,7 @@ async fn introspection_query(schema: Arc, query: &str) -> QueryResult max_first: u32::MAX, max_skip: u32::MAX, trace: false, + log_store: Arc::new(graph::components::log_store::NoOpLogStore), }; let result = diff --git a/store/test-store/tests/graphql/mock_introspection.json b/store/test-store/tests/graphql/mock_introspection.json index d2eca61b928..f98ef44bda7 100644 --- a/store/test-store/tests/graphql/mock_introspection.json +++ b/store/test-store/tests/graphql/mock_introspection.json @@ -175,6 +175,47 @@ "enumValues": null, "possibleTypes": null }, + { + "kind": "ENUM", + "name": "LogLevel", + "description": "The severity level of a log entry.\nLog levels are ordered from most to least severe: CRITICAL > 
ERROR > WARNING > INFO > DEBUG", + "fields": null, + "inputFields": null, + "interfaces": null, + "enumValues": [ + { + "name": "CRITICAL", + "description": "Critical errors that require immediate attention", + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "ERROR", + "description": "Error conditions that indicate a failure", + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "WARNING", + "description": "Warning conditions that may require attention", + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "INFO", + "description": "Informational messages about normal operations", + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "DEBUG", + "description": "Detailed diagnostic information for debugging", + "isDeprecated": false, + "deprecationReason": null + } + ], + "possibleTypes": null + }, { "kind": "INTERFACE", "name": "Node", @@ -720,6 +761,91 @@ }, "isDeprecated": false, "deprecationReason": null + }, + { + "name": "_logs", + "description": "Query execution logs emitted by the subgraph during indexing. Results are sorted by timestamp in descending order (newest first).", + "args": [ + { + "name": "level", + "description": "Filter logs by severity level. Only logs at this level will be returned.", + "type": { + "kind": "ENUM", + "name": "LogLevel", + "ofType": null + }, + "defaultValue": null + }, + { + "name": "from", + "description": "Filter logs from this timestamp onwards (inclusive). Must be in RFC3339 format (e.g., '2024-01-15T10:30:00Z').", + "type": { + "kind": "SCALAR", + "name": "String", + "ofType": null + }, + "defaultValue": null + }, + { + "name": "to", + "description": "Filter logs until this timestamp (inclusive). Must be in RFC3339 format (e.g., '2024-01-15T23:59:59Z').", + "type": { + "kind": "SCALAR", + "name": "String", + "ofType": null + }, + "defaultValue": null + }, + { + "name": "search", + "description": "Search for logs containing this text in the message. Case-insensitive substring match. Maximum length: 1000 characters.", + "type": { + "kind": "SCALAR", + "name": "String", + "ofType": null + }, + "defaultValue": null + }, + { + "name": "first", + "description": "Maximum number of logs to return. Default: 100, Maximum: 1000.", + "type": { + "kind": "SCALAR", + "name": "Int", + "ofType": null + }, + "defaultValue": "100" + }, + { + "name": "skip", + "description": "Number of logs to skip (for pagination). 
Default: 0, Maximum: 10000.", + "type": { + "kind": "SCALAR", + "name": "Int", + "ofType": null + }, + "defaultValue": "0" + } + ], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "LIST", + "name": null, + "ofType": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "OBJECT", + "name": "_Log_", + "ofType": null + } + } + } + }, + "isDeprecated": false, + "deprecationReason": null } ], "inputFields": null, @@ -1344,6 +1470,239 @@ "enumValues": null, "possibleTypes": null }, + { + "kind": "OBJECT", + "name": "_LogArgument_", + "description": "A key-value pair of additional data associated with a log entry.\nThese correspond to arguments passed to the log function in the subgraph code.", + "fields": [ + { + "name": "key", + "description": "The parameter name", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "SCALAR", + "name": "String", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "value", + "description": "The parameter value, serialized as a string", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "SCALAR", + "name": "String", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + } + ], + "inputFields": null, + "interfaces": [], + "enumValues": null, + "possibleTypes": null + }, + { + "kind": "OBJECT", + "name": "_LogMeta_", + "description": "Source code location metadata for a log entry.\nIndicates where in the subgraph's AssemblyScript code the log statement was executed.", + "fields": [ + { + "name": "module", + "description": "The module or file path where the log was emitted", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "SCALAR", + "name": "String", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "line", + "description": "The line number in the source file", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "SCALAR", + "name": "Int", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "column", + "description": "The column number in the source file", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "SCALAR", + "name": "Int", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + } + ], + "inputFields": null, + "interfaces": [], + "enumValues": null, + "possibleTypes": null + }, + { + "kind": "OBJECT", + "name": "_Log_", + "description": "A log entry emitted by a subgraph during indexing.\nLogs can be generated by the subgraph's AssemblyScript code using the `log.*` functions.", + "fields": [ + { + "name": "id", + "description": "Unique identifier for this log entry", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "SCALAR", + "name": "String", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "subgraphId", + "description": "The deployment hash of the subgraph that emitted this log", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "SCALAR", + "name": "String", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "timestamp", + "description": "The timestamp when the log was emitted, in RFC3339 format (e.g., '2024-01-15T10:30:00Z')", + "args": [], + "type": { + "kind": "NON_NULL", + 
"name": null, + "ofType": { + "kind": "SCALAR", + "name": "String", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "level", + "description": "The severity level of the log entry", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "ENUM", + "name": "LogLevel", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "text", + "description": "The log message text", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "SCALAR", + "name": "String", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "arguments", + "description": "Additional structured data passed to the log function as key-value pairs", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "LIST", + "name": null, + "ofType": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "OBJECT", + "name": "_LogArgument_", + "ofType": null + } + } + } + }, + "isDeprecated": false, + "deprecationReason": null + }, + { + "name": "meta", + "description": "Metadata about the source location in the subgraph code where the log was emitted", + "args": [], + "type": { + "kind": "NON_NULL", + "name": null, + "ofType": { + "kind": "OBJECT", + "name": "_LogMeta_", + "ofType": null + } + }, + "isDeprecated": false, + "deprecationReason": null + } + ], + "inputFields": null, + "interfaces": [], + "enumValues": null, + "possibleTypes": null + }, { "kind": "OBJECT", "name": "_Meta_", @@ -1431,7 +1790,9 @@ { "name": "language", "description": null, - "locations": ["FIELD_DEFINITION"], + "locations": [ + "FIELD_DEFINITION" + ], "args": [ { "name": "language", @@ -1448,7 +1809,11 @@ { "name": "skip", "description": null, - "locations": ["FIELD", "FRAGMENT_SPREAD", "INLINE_FRAGMENT"], + "locations": [ + "FIELD", + "FRAGMENT_SPREAD", + "INLINE_FRAGMENT" + ], "args": [ { "name": "if", @@ -1469,7 +1834,11 @@ { "name": "include", "description": null, - "locations": ["FIELD", "FRAGMENT_SPREAD", "INLINE_FRAGMENT"], + "locations": [ + "FIELD", + "FRAGMENT_SPREAD", + "INLINE_FRAGMENT" + ], "args": [ { "name": "if", @@ -1490,13 +1859,17 @@ { "name": "entity", "description": "Marks the GraphQL type as indexable entity. 
Each type that should be an entity is required to be annotated with this directive.", - "locations": ["OBJECT"], + "locations": [ + "OBJECT" + ], "args": [] }, { "name": "subgraphId", "description": "Defined a Subgraph ID for an object type", - "locations": ["OBJECT"], + "locations": [ + "OBJECT" + ], "args": [ { "name": "id", @@ -1517,7 +1890,9 @@ { "name": "derivedFrom", "description": "creates a virtual field on the entity that may be queried but cannot be set manually through the mappings API.", - "locations": ["FIELD_DEFINITION"], + "locations": [ + "FIELD_DEFINITION" + ], "args": [ { "name": "field", diff --git a/store/test-store/tests/graphql/query.rs b/store/test-store/tests/graphql/query.rs index f206fe2644f..9908866610b 100644 --- a/store/test-store/tests/graphql/query.rs +++ b/store/test-store/tests/graphql/query.rs @@ -616,6 +616,7 @@ async fn execute_query_document_with_variables( STORE.clone(), LOAD_MANAGER.clone(), METRICS_REGISTRY.clone(), + Arc::new(graph::components::log_store::NoOpLogStore), )); let target = QueryTarget::Deployment(id.clone(), Default::default()); let query = Query::new(query, variables, false); @@ -726,6 +727,7 @@ where STORE.clone(), LOAD_MANAGER.clone(), METRICS_REGISTRY.clone(), + Arc::new(graph::components::log_store::NoOpLogStore), )); let target = QueryTarget::Deployment(id.clone(), Default::default()); let query = Query::new(query, variables, false); diff --git a/tests/integration-tests/logs-query/abis/Contract.abi b/tests/integration-tests/logs-query/abis/Contract.abi new file mode 100644 index 00000000000..02da1a9e7f3 --- /dev/null +++ b/tests/integration-tests/logs-query/abis/Contract.abi @@ -0,0 +1,33 @@ +[ + { + "inputs": [], + "stateMutability": "nonpayable", + "type": "constructor" + }, + { + "anonymous": false, + "inputs": [ + { + "indexed": false, + "internalType": "uint16", + "name": "x", + "type": "uint16" + } + ], + "name": "Trigger", + "type": "event" + }, + { + "inputs": [ + { + "internalType": "uint16", + "name": "x", + "type": "uint16" + } + ], + "name": "emitTrigger", + "outputs": [], + "stateMutability": "nonpayable", + "type": "function" + } +] diff --git a/tests/integration-tests/logs-query/package.json b/tests/integration-tests/logs-query/package.json new file mode 100644 index 00000000000..c1a68515054 --- /dev/null +++ b/tests/integration-tests/logs-query/package.json @@ -0,0 +1,13 @@ +{ + "name": "logs-query-subgraph", + "version": "0.0.0", + "private": true, + "scripts": { + "codegen": "graph codegen --skip-migrations", + "deploy:test": "graph deploy test/logs-query --version-label v0.0.1 --ipfs $IPFS_URI --node $GRAPH_NODE_ADMIN_URI" + }, + "devDependencies": { + "@graphprotocol/graph-cli": "0.69.0", + "@graphprotocol/graph-ts": "0.34.0" + } +} diff --git a/tests/integration-tests/logs-query/schema.graphql b/tests/integration-tests/logs-query/schema.graphql new file mode 100644 index 00000000000..1459e34353b --- /dev/null +++ b/tests/integration-tests/logs-query/schema.graphql @@ -0,0 +1,4 @@ +type Trigger @entity { + id: ID! + x: Int! 
+} diff --git a/tests/integration-tests/logs-query/src/mapping.ts b/tests/integration-tests/logs-query/src/mapping.ts new file mode 100644 index 00000000000..8f4d55e6a9f --- /dev/null +++ b/tests/integration-tests/logs-query/src/mapping.ts @@ -0,0 +1,39 @@ +import { Trigger as TriggerEvent } from "../generated/Contract/Contract"; +import { Trigger } from "../generated/schema"; +import { log } from "@graphprotocol/graph-ts"; + +export function handleTrigger(event: TriggerEvent): void { + let entity = new Trigger(event.transaction.hash.toHex()); + entity.x = event.params.x; + entity.save(); + + // Generate various log levels and types for testing + let x = event.params.x as i32; + + if (x == 0) { + log.info("Processing trigger with value zero", []); + } + + if (x == 1) { + log.error("Error processing trigger", []); + } + + if (x == 2) { + log.warning("Warning: unusual trigger value", ["hash", event.transaction.hash.toHexString()]); + } + + if (x == 3) { + log.debug("Debug: trigger details", ["blockNumber", event.block.number.toString()]); + } + + if (x == 4) { + log.info("Handler execution successful", ["entity_id", entity.id]); + } + + if (x == 5) { + log.error("Critical timeout error", []); + } + + // Log for every event to test general log capture + log.info("Trigger event processed", []); +} diff --git a/tests/integration-tests/logs-query/subgraph.yaml b/tests/integration-tests/logs-query/subgraph.yaml new file mode 100644 index 00000000000..be3061d5533 --- /dev/null +++ b/tests/integration-tests/logs-query/subgraph.yaml @@ -0,0 +1,26 @@ +specVersion: 0.0.5 +description: Logs Query Test Subgraph +repository: https://github.com/graphprotocol/graph-node +schema: + file: ./schema.graphql +dataSources: + - kind: ethereum/contract + name: Contract + network: test + source: + address: "0x5FbDB2315678afecb367f032d93F642f64180aa3" + abi: Contract + startBlock: 0 + mapping: + kind: ethereum/events + apiVersion: 0.0.6 + language: wasm/assemblyscript + entities: + - Trigger + abis: + - name: Contract + file: ./abis/Contract.abi + eventHandlers: + - event: Trigger(uint16) + handler: handleTrigger + file: ./src/mapping.ts diff --git a/tests/src/config.rs b/tests/src/config.rs index 46f22b141e7..cada62cac2f 100644 --- a/tests/src/config.rs +++ b/tests/src/config.rs @@ -140,10 +140,17 @@ impl GraphNodeConfig { let bin = fs::canonicalize("../target/debug/gnd") .expect("failed to infer `graph-node` program location. (Was it built already?)"); + // Allow overriding IPFS port via environment variable + let ipfs_port = std::env::var("IPFS_TEST_PORT") + .ok() + .and_then(|p| p.parse().ok()) + .unwrap_or(3001); + let ipfs_uri = format!("http://localhost:{}", ipfs_port); + Self { bin, ports: GraphNodePorts::default(), - ipfs_uri: "http://localhost:3001".to_string(), + ipfs_uri, log_file: TestFile::new("integration-tests/graph-node.log"), } } @@ -154,10 +161,17 @@ impl Default for GraphNodeConfig { let bin = fs::canonicalize("../target/debug/graph-node") .expect("failed to infer `graph-node` program location. 
(Was it built already?)"); + // Allow overriding IPFS port via environment variable + let ipfs_port = std::env::var("IPFS_TEST_PORT") + .ok() + .and_then(|p| p.parse().ok()) + .unwrap_or(3001); + let ipfs_uri = format!("http://localhost:{}", ipfs_port); + Self { bin, ports: GraphNodePorts::default(), - ipfs_uri: "http://localhost:3001".to_string(), + ipfs_uri, log_file: TestFile::new("integration-tests/graph-node.log"), } } @@ -219,7 +233,9 @@ impl Config { .stderr(stderr) .args(args.clone()) .env("GRAPH_STORE_WRITE_BATCH_DURATION", "5") - .env("ETHEREUM_REORG_THRESHOLD", "0"); + .env("ETHEREUM_REORG_THRESHOLD", "0") + .env("GRAPH_LOG_STORE_BACKEND", "file") + .env("GRAPH_LOG_STORE_FILE_DIR", "/tmp/integration-test-logs"); status!( "graph-node", @@ -284,17 +300,28 @@ impl Default for Config { let num_parallel_tests = std::env::var("N_CONCURRENT_TESTS") .map(|x| x.parse().expect("N_CONCURRENT_TESTS must be a number")) .unwrap_or(1000); + + // Allow overriding ports via environment variables + let postgres_port = std::env::var("POSTGRES_TEST_PORT") + .ok() + .and_then(|p| p.parse().ok()) + .unwrap_or(3011); + let eth_port = std::env::var("ETHEREUM_TEST_PORT") + .ok() + .and_then(|p| p.parse().ok()) + .unwrap_or(3021); + Config { db: DbConfig { host: "localhost".to_string(), - port: 3011, + port: postgres_port, user: "graph-node".to_string(), password: "let-me-in".to_string(), name: "graph-node".to_string(), }, eth: EthConfig { network: "test".to_string(), - port: 3021, + port: eth_port, host: "localhost".to_string(), }, graph_node: GraphNodeConfig::from_env(), diff --git a/tests/src/fixture/mod.rs b/tests/src/fixture/mod.rs index 62000dc5e8e..fc3c627c8d1 100644 --- a/tests/src/fixture/mod.rs +++ b/tests/src/fixture/mod.rs @@ -556,6 +556,7 @@ pub async fn setup_inner( stores.network_store.clone(), Arc::new(load_manager), mock_registry.clone(), + Arc::new(graph::components::log_store::NoOpLogStore), )); let indexing_status_service = Arc::new(IndexNodeService::new( diff --git a/tests/tests/integration_tests.rs b/tests/tests/integration_tests.rs index 322eb643533..4ee95fd1205 100644 --- a/tests/tests/integration_tests.rs +++ b/tests/tests/integration_tests.rs @@ -1028,6 +1028,71 @@ async fn test_poi_for_failed_subgraph(ctx: TestContext) -> anyhow::Result<()> { let resp = Subgraph::query_with_vars(FETCH_POI, vars).await?; assert_eq!(None, resp.get("errors")); assert!(resp["data"]["proofOfIndexing"].is_string()); + + // Test that _logs query works on failed subgraphs (critical for debugging!) 
+ // Wait a moment for logs to be written + sleep(Duration::from_secs(2)).await; + + let query = r#"{ + _logs(first: 100) { + id + timestamp + level + text + } + }"# + .to_string(); + + let resp = subgraph.query(&query).await?; + + // Should not have GraphQL errors when querying logs on failed subgraph + assert!( + resp.get("errors").is_none(), + "Expected no errors when querying _logs on failed subgraph, got: {:?}", + resp.get("errors") + ); + + let logs = resp["data"]["_logs"] + .as_array() + .context("Expected _logs to be an array")?; + + // The critical assertion: _logs query works on failed subgraphs + // This enables debugging even when the subgraph has crashed + println!( + "Successfully queried _logs on failed subgraph, found {} log entries", + logs.len() + ); + + // Print a sample of logs to see what's available (for documentation/debugging) + if !logs.is_empty() { + println!("Sample logs from failed subgraph:"); + for (i, log) in logs.iter().take(5).enumerate() { + println!( + " Log {}: level={:?}, text={:?}", + i + 1, + log["level"].as_str(), + log["text"].as_str() + ); + } + } + + // Verify we can also filter by level on failed subgraphs + let query = r#"{ + _logs(level: ERROR, first: 100) { + level + text + } + }"# + .to_string(); + + let resp = subgraph.query(&query).await?; + assert!( + resp.get("errors").is_none(), + "Expected no errors when filtering _logs by level on failed subgraph" + ); + + println!("✓ _logs query works on failed subgraphs - critical for debugging!"); + Ok(()) } @@ -1284,6 +1349,189 @@ async fn test_declared_calls_struct_fields(ctx: TestContext) -> anyhow::Result<( Ok(()) } +async fn test_logs_query(ctx: TestContext) -> anyhow::Result<()> { + let subgraph = ctx.subgraph; + assert!(subgraph.healthy); + + // Wait a moment for logs to be written + sleep(Duration::from_secs(2)).await; + + // Test 1: Query all logs + let query = r#"{ + _logs(first: 100) { + id + timestamp + level + text + } + }"# + .to_string(); + let resp = subgraph.query(&query).await?; + + let logs = resp["data"]["_logs"] + .as_array() + .context("Expected _logs to be an array")?; + + // We should have logs from the subgraph (user logs + system logs) + assert!( + !logs.is_empty(), + "Expected to have logs, got none. 
Response: {:?}", + resp + ); + + // Test 2: Filter by ERROR level + let query = r#"{ + _logs(level: ERROR, first: 100) { + level + text + } + }"# + .to_string(); + let resp = subgraph.query(&query).await?; + let error_logs = resp["data"]["_logs"] + .as_array() + .context("Expected _logs to be an array")?; + + // Check that we have error logs and they're all ERROR level + for log in error_logs { + assert_eq!( + log["level"].as_str(), + Some("ERROR"), + "Expected ERROR level, got: {:?}", + log + ); + } + + // Test 3: Search for specific text + let query = r#"{ + _logs(search: "timeout", first: 100) { + id + text + } + }"# + .to_string(); + let resp = subgraph.query(&query).await?; + let timeout_logs = resp["data"]["_logs"] + .as_array() + .context("Expected _logs to be an array")?; + + // If we have timeout logs, verify they contain the word "timeout" + for log in timeout_logs { + let text = log["text"] + .as_str() + .context("Expected text field to be a string")?; + assert!( + text.to_lowercase().contains("timeout"), + "Expected log to contain 'timeout', got: {}", + text + ); + } + + // Test 4: Pagination + let query = r#"{ + _logs(first: 2, skip: 0) { + id + } + }"# + .to_string(); + let resp = subgraph.query(&query).await?; + let first_page = resp["data"]["_logs"] + .as_array() + .context("Expected _logs to be an array")?; + + let query = r#"{ + _logs(first: 2, skip: 2) { + id + } + }"# + .to_string(); + let resp = subgraph.query(&query).await?; + let second_page = resp["data"]["_logs"] + .as_array() + .context("Expected _logs to be an array")?; + + // If we have enough logs, verify pages are different + if first_page.len() == 2 && !second_page.is_empty() { + let first_ids: Vec<_> = first_page.iter().map(|l| &l["id"]).collect(); + let second_ids: Vec<_> = second_page.iter().map(|l| &l["id"]).collect(); + + // Verify no overlap between pages + for id in &second_ids { + assert!( + !first_ids.contains(id), + "Log ID {:?} appears in both pages", + id + ); + } + } + + // Test 5: Query with arguments field to verify structured logging + let query = r#"{ + _logs(first: 10) { + text + arguments { + key + value + } + } + }"# + .to_string(); + let resp = subgraph.query(&query).await?; + let logs_with_args = resp["data"]["_logs"] + .as_array() + .context("Expected _logs to be an array")?; + + // Verify arguments field is present (even if empty for some logs) + for log in logs_with_args { + assert!( + log.get("arguments").is_some(), + "Expected arguments field to exist in log: {:?}", + log + ); + } + + // Test 6: Verify that combining _logs with regular entity queries returns a validation error + let query = r#"{ + _logs(first: 10) { + id + text + } + triggers { + id + x + } + }"# + .to_string(); + let resp = subgraph.query(&query).await?; + + // Should have errors, not data + assert!( + resp.get("errors").is_some(), + "Expected errors when combining _logs with entity queries, got: {:?}", + resp + ); + + // Verify the error message mentions the validation issue + let errors = resp["errors"] + .as_array() + .context("Expected errors to be an array")?; + assert!( + !errors.is_empty(), + "Expected at least one error in response" + ); + + let error_msg = errors[0]["message"] + .as_str() + .context("Expected error message to be a string")?; + assert!( + error_msg.contains("_logs") && error_msg.contains("cannot be combined"), + "Expected validation error about _logs combination, got: {}", + error_msg + ); + + Ok(()) +} + async fn wait_for_blockchain_block(block_number: i32) -> bool { // Wait up to 5 
minutes for the expected block to appear const STATUS_WAIT: Duration = Duration::from_secs(300); @@ -1337,22 +1585,27 @@ async fn integration_tests() -> anyhow::Result<()> { "declared-calls-struct-fields", test_declared_calls_struct_fields, ), + TestCase::new("logs-query", test_logs_query), ]; // Filter the test cases if a specific test name is provided - let cases_to_run: Vec<_> = if let Some(test_name) = test_name_to_run { + let cases_to_run: Vec<_> = if let Some(ref test_name) = test_name_to_run { cases .into_iter() - .filter(|case| case.name == test_name) + .filter(|case| case.name == *test_name) .collect() } else { cases }; - // Here we wait for a block in the blockchain in order not to influence - // block hashes for all the blocks until the end of the grafting tests. - // Currently the last used block for grafting test is the block 3. - assert!(wait_for_blockchain_block(SUBGRAPH_LAST_GRAFTING_BLOCK).await); + // Only wait for blockchain blocks if running the grafting test + let needs_grafting_setup = cases_to_run.iter().any(|case| case.name == "grafted"); + if needs_grafting_setup { + // Here we wait for a block in the blockchain in order not to influence + // block hashes for all the blocks until the end of the grafting tests. + // Currently the last used block for grafting test is the block 3. + assert!(wait_for_blockchain_block(SUBGRAPH_LAST_GRAFTING_BLOCK).await); + } let contracts = Contract::deploy_all().await?;
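The runner now filters the case list before deciding whether to wait for the grafting block, so a filtered run such as `TEST_CASE=logs-query` skips the wait entirely. A stand-alone sketch of that select-then-conditionally-prepare flow, with illustrative types rather than the harness's own:

```rust
// Sketch: filter test cases by an optional name and run expensive setup
// only when a selected case actually needs it.
struct TestCase {
    name: &'static str,
    needs_grafting_setup: bool,
}

fn select_cases(all: Vec<TestCase>, wanted: Option<&str>) -> Vec<TestCase> {
    match wanted {
        Some(name) => all.into_iter().filter(|c| c.name == name).collect(),
        None => all,
    }
}

fn main() {
    let all = vec![
        TestCase { name: "grafted", needs_grafting_setup: true },
        TestCase { name: "logs-query", needs_grafting_setup: false },
    ];

    let selected = select_cases(all, std::env::var("TEST_CASE").ok().as_deref());

    if selected.iter().any(|c| c.needs_grafting_setup) {
        // Placeholder for the expensive blockchain wait performed above.
        println!("waiting for the grafting block before running tests");
    }

    for case in &selected {
        println!("running {}", case.name);
    }
}
```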