voro-guard
The code index service. A hardened FastAPI service that indexes repositories, extracts symbols across 8 languages, builds Solidity call graphs, and signs artifacts with HMAC-SHA256 for trust verification.
Purpose
voro-guard provides structured code intelligence to the VORO pipeline. It indexes source code into searchable symbol databases, computes call graphs for reachability analysis, and wraps everything in signed artifact envelopes for trust verification. voro-brain consumes this service via MCP stdio to enrich its exploitability assessments.
Architecture
voro-brain (ExploitabilityAssessor)
  → spawns: python -m app.mcp_server (stdio)
    → MCP server starts managed FastAPI subprocess on 127.0.0.1:18765
      → index_repo() → POST /v1/index → ArtifactEnvelope
      → search_symbols() → POST /v1/search → SymbolMatch[]
      → get_symbol() → POST /v1/get → Symbol
      → outline_file() → POST /v1/outline → FileOutline
      → callgraph() → POST /v1/callgraph → CallGraph
HTTP API
| Method | Path | Purpose |
|---|---|---|
| GET | /health | Service health check (no auth) |
| POST | /v1/index | Index a repository, create signed artifact |
| POST | /v1/search | Search symbols by query |
| POST | /v1/get | Get specific symbol by ID |
| POST | /v1/outline | List all files and symbols in artifact |
| POST | /v1/callgraph | Build Solidity call graph from file |
| GET | /v1/metrics | Service metrics snapshot |
All /v1/* endpoints require Bearer token authentication.
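As a rough illustration of the Bearer requirement, the sketch below builds (but does not send) an authenticated `POST /v1/search` request using only the standard library. The base URL is taken from the local Build & Run command, and the JSON body is an assumption that mirrors the `search_symbols` MCP tool parameters; the actual request schema lives in `app/models/schemas.py` and may differ. The token, artifact ID, and fingerprint values are placeholders.

```python
import json
import urllib.request

# Assumption: local address from the Build & Run command; adjust for your deployment.
BASE_URL = "http://127.0.0.1:8080"


def build_search_request(token: str, query: str, workspace_id: str,
                         artifact_id: str, source_fingerprint: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST /v1/search request."""
    # Assumption: body fields mirror the search_symbols MCP tool parameters.
    body = {
        "query": query,
        "workspace_id": workspace_id,
        "artifact_id": artifact_id,
        "source_fingerprint": source_fingerprint,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # required on all /v1/* endpoints
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_search_request(
    token="example-token",
    query="transfer",
    workspace_id="ws1",
    artifact_id="0123456789abcdef01234567",
    source_fingerprint="sha256:example",
)
```

Passing `req` to `urllib.request.urlopen` would send it; a request without the `Authorization` header should be rejected by any `/v1/*` endpoint.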
MCP Tools (stdio interface)
| Tool | Purpose |
|---|---|
| index_repo(source_type, source_id, workspace_id, source_revision) | Index a repository |
| search_symbols(query, workspace_id, artifact_id, source_fingerprint) | Search symbols |
| get_symbol(symbol_id, workspace_id, artifact_id, source_fingerprint) | Get symbol detail |
| outline_file(workspace_id, artifact_id, source_fingerprint) | File/symbol outline |
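For readers unfamiliar with MCP over stdio, a tool invocation is a JSON-RPC 2.0 `tools/call` message written as a single line to the server's stdin. The sketch below only serializes such a message for `index_repo`; the helper name `tools_call_message` is hypothetical, the argument values are placeholders taken from the envelope example, and a real session must first complete the MCP `initialize` handshake (an MCP client library normally handles that).

```python
import json


def tools_call_message(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one JSON-RPC 2.0 line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, so the result is safe to send
    # over a newline-delimited stdio transport.
    return json.dumps(message)


# Placeholder argument values for illustration only.
line = tools_call_message(
    1,
    "index_repo",
    {
        "source_type": "github",
        "source_id": "owner/repo",
        "workspace_id": "ws1",
        "source_revision": "commit_sha",
    },
)
```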
Supported Languages
Symbol extraction is regex-based and runs line by line; it skips AST parsing in favor of speed and portability:
- Python
- JavaScript
- TypeScript
- Go
- Rust
- Java
- PHP
- Solidity (with additional call graph analysis, visibility metadata, and reachability tracking)
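To make the regex-based, line-by-line approach concrete, here is a minimal sketch covering two of the listed languages. The patterns, the `PATTERNS` table, and `extract_symbols` are illustrative assumptions, not the contents of `app/core/parser.py`.

```python
import re
from typing import Iterator

# Assumption: illustrative patterns only; the real parser's patterns may differ.
PATTERNS = {
    "python": [
        ("function", re.compile(r"^\s*(?:async\s+)?def\s+([A-Za-z_]\w*)")),
        ("class", re.compile(r"^\s*class\s+([A-Za-z_]\w*)")),
    ],
    "solidity": [
        ("contract", re.compile(r"^\s*(?:abstract\s+)?contract\s+([A-Za-z_$][\w$]*)")),
        ("interface", re.compile(r"^\s*interface\s+([A-Za-z_$][\w$]*)")),
        ("function", re.compile(r"^\s*function\s+([A-Za-z_$][\w$]*)")),
    ],
}


def extract_symbols(source: str, language: str) -> Iterator[dict]:
    """Scan source one line at a time and yield a record per matched declaration."""
    for line_no, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in PATTERNS.get(language, []):
            match = pattern.match(line)
            if match:
                yield {
                    "kind": kind,
                    "name": match.group(1),
                    "line": line_no,
                    "language": language,
                }
                break  # at most one symbol per line in this sketch


SOLIDITY_SAMPLE = """contract Vault {
    function deposit() external payable {}
    function _credit(address to) internal {}
}
"""

symbols = list(extract_symbols(SOLIDITY_SAMPLE, "solidity"))
```

A scan like this needs no per-language parser dependency, which is where the portability comes from; the cost is that declarations split across several lines or hidden inside comments and strings can be missed or misreported.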
Security Model
| Layer | Mechanism |
|---|---|
| Artifact Signing | HMAC-SHA256 over canonical JSON (deterministic field ordering) |
| Trust Modes | strict (signature required, production) vs legacy (unsigned allowed, dev only) |
| Bearer Auth | Required for all /v1/* endpoints |
| Path Safety | Symlink escape detection, secret file filtering, binary exclusion |
| Identity Verification | workspace_id + source_fingerprint + artifact_id verified on all queries |
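The signing layer can be sketched with the standard library. This is an assumption-laden illustration, not the code in `app/core/signing.py`: it takes "canonical JSON" to mean sorted keys with compact separators, and it does not know which envelope fields the real service actually signs.

```python
import hashlib
import hmac
import json


def canonical_json(data: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal dicts yield equal bytes."""
    # Assumption: this is one common canonical form; the service's exact form may differ.
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(data: dict, key: bytes) -> str:
    """Return the hex HMAC-SHA256 of the canonical serialization."""
    return hmac.new(key, canonical_json(data), hashlib.sha256).hexdigest()


def verify(data: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(data, key), signature)


key = b"example-signing-key"  # placeholder; a real key would come from CODE_INDEX_SIGNING_KEY
payload = {"workspace_id": "ws1", "stats": {"file_count": 2, "symbol_count": 5}}
signature = sign(payload, key)
```

Deterministic field ordering matters because HMAC operates on bytes: two semantically identical payloads serialized with different key orders would otherwise produce different signatures.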
ArtifactEnvelope (output contract)
Every indexed repository produces a signed ArtifactEnvelope:
{
  "schema_version": "c35-v1",
  "workspace_id": "ws1",
  "source_type": "github|git|local_repo|snapshot",
  "source_id": "owner/repo",
  "source_revision": "commit_sha",
  "source_fingerprint": "sha256:...",
  "artifact_id": "24-char sha256 prefix",
  "manifest": {
    "signer": "voro-index-guard",
    "signed_at": "ISO8601",
    "signature": "hmac_hex"
  },
  "payload": {
    "files": [...],
    "symbols": [...],
    "stats": { "file_count": N, "symbol_count": N },
    "token_savings_estimate": { ... }
  }
}
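One way to picture how the identity fields and trust modes from the Security Model apply to an envelope at query time is the sketch below. The function `may_serve` and its exact policy are assumptions for illustration; in particular, it only checks that a signature is present in strict mode and leaves cryptographic verification out.

```python
def may_serve(envelope: dict, workspace_id: str, artifact_id: str,
              source_fingerprint: str, trust_mode: str = "strict") -> bool:
    """Decide whether a query may be answered from this envelope (illustrative policy)."""
    # All three identity fields must match the values supplied with the query.
    identity_ok = (
        envelope.get("workspace_id") == workspace_id
        and envelope.get("artifact_id") == artifact_id
        and envelope.get("source_fingerprint") == source_fingerprint
    )
    if not identity_ok:
        return False
    if trust_mode == "strict":
        # Presence check only; verifying the HMAC itself is omitted in this sketch.
        return bool(envelope.get("manifest", {}).get("signature"))
    # Unsigned artifacts are tolerated only in legacy mode; unknown modes are rejected.
    return trust_mode == "legacy"


# Placeholder envelopes for illustration only.
signed = {
    "workspace_id": "ws1",
    "artifact_id": "0123456789abcdef01234567",
    "source_fingerprint": "sha256:example",
    "manifest": {"signer": "voro-index-guard", "signature": "hmac_hex"},
}
unsigned = {**signed, "manifest": {}}
```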
Symbol Fields
| Field | Type | Note |
|---|---|---|
| id | string | 24-char SHA256 prefix |
| kind | string | function, class, interface, contract |
| name | string | Symbol name |
| file | string | File path within repo |
| line | int | Line number |
| language | string | One of 8 supported languages |
| snippet | string | Source code context (plus/minus 2 lines) |
| visibility | string | Solidity only: public, external, internal, private |
| payable | bool | Solidity only |
| reachable | bool | Solidity only; derived from call graph analysis |
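The `reachable` flag can be understood as a graph traversal. The sketch below assumes that a function counts as reachable when it is `public` or `external`, or is called (directly or transitively) from such a function; that definition, the function name, and the input shapes are assumptions, since the document does not spell out the rule used in `app/core/callgraph.py`.

```python
from collections import deque


def reachable_functions(visibility: dict[str, str],
                        calls: dict[str, set[str]]) -> set[str]:
    """Breadth-first search over call edges, starting from externally callable functions."""
    # Assumption: entry points are functions with public or external visibility.
    entry_points = [name for name, vis in visibility.items()
                    if vis in ("public", "external")]
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        for callee in calls.get(current, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical contract: deposit() calls _credit(); _unused() is never called.
visibility = {"deposit": "external", "_credit": "internal", "_unused": "private"}
calls = {"deposit": {"_credit"}}
reachable = reachable_functions(visibility, calls)
```

Under this definition `_unused` stays unreachable, which is the kind of signal a downstream exploitability assessment can use to deprioritize findings in dead code.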
Deployment
Production via Zeabur (Docker):
- Image: Python 3.12-slim from Dockerfile
- Port: 8080
- Required env vars: CODE_INDEX_SERVICE_TOKEN, CODE_INDEX_SIGNING_KEY, CODE_INDEX_TRUST_MODE=strict
- Persistent volume: /data/artifacts
- Health check: GET /health → 200
# Smoke test
./scripts/smoke_prod.sh https://<domain> <token>
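Before running the smoke test, it can help to confirm that the required environment variables are present. The following pre-flight check is a hypothetical helper, not part of the repository; it only encodes the three variables listed above.

```python
from typing import Mapping


def validate_env(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems with the production environment (empty if none)."""
    problems = []
    for name in ("CODE_INDEX_SERVICE_TOKEN", "CODE_INDEX_SIGNING_KEY"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    if env.get("CODE_INDEX_TRUST_MODE") != "strict":
        problems.append("CODE_INDEX_TRUST_MODE should be 'strict' in production")
    return problems


# Placeholder values for illustration only.
good_env = {
    "CODE_INDEX_SERVICE_TOKEN": "example-token",
    "CODE_INDEX_SIGNING_KEY": "example-key",
    "CODE_INDEX_TRUST_MODE": "strict",
}
problems = validate_env(good_env)
```

In a deployment script this would typically be called as `validate_env(os.environ)`, exiting with a non-zero status when the returned list is not empty.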
Module Structure
app/
├── main.py # FastAPI app setup
├── config.py # Settings/env configuration
├── security.py # Bearer token auth
├── metrics.py # Request/success/deny counters
├── mcp_server.py # FastMCP stdio wrapper + managed subprocess
├── models/schemas.py # Pydantic request/response models
├── routes/
│ ├── index.py # POST /v1/index (artifact signing)
│ └── query.py # Search/get/outline/metrics/callgraph
└── core/
    ├── artifacts.py # Artifact persistence & verification
    ├── callgraph.py # Solidity call graph analysis
    ├── identity.py # Source fingerprinting
    ├── indexer.py # GitHub & local repo indexing
    ├── ingest.py # File discovery & reading
    ├── parser.py # Symbol extraction (8 languages)
    ├── safety.py # Symlink/secret/binary checks
    ├── signing.py # HMAC-SHA256 signing
    └── store.py # Symbol indexing & querying
Build & Run
pip install -r requirements.txt
pytest tests/unit/ # Run tests
uvicorn app.main:app --host 0.0.0.0 --port 8080 # Run HTTP API
python -m app.mcp_server # Run MCP stdio server
Current State
- Version: 0.1.0
- Tests: 8 unit test modules
- Branch: main at 45cdb1d
- Shipped: Phase 2.0 (MCP wrapper), Phase 3.0 (Solidity call graphs with visibility)
- Open issue: #5 — Parse Solidity visibility modifiers for reachability (blocks voro-brain Phase 2.4 re-evaluation)