voro-guard

The code index service. A hardened FastAPI service that indexes repositories, extracts symbols across 8 languages, builds Solidity call graphs, and signs artifacts with HMAC-SHA256 for trust verification.

Purpose

voro-guard provides structured code intelligence to the VORO pipeline. It indexes source code into searchable symbol databases, computes call graphs for reachability analysis, and wraps everything in signed artifact envelopes for trust verification. voro-brain consumes this service via MCP stdio to enrich its exploitability assessments.

Architecture

voro-brain (ExploitabilityAssessor)
  → spawns: python -m app.mcp_server (stdio)
    → MCP server starts managed FastAPI subprocess on 127.0.0.1:18765
      → index_repo() → POST /v1/index → ArtifactEnvelope
      → search_symbols() → POST /v1/search → SymbolMatch[]
      → get_symbol() → POST /v1/get → Symbol
      → outline_file() → POST /v1/outline → FileOutline
      → callgraph() → POST /v1/callgraph → CallGraph

HTTP API

| Method | Path | Purpose |
|---|---|---|
| GET | /health | Service health check (no auth) |
| POST | /v1/index | Index a repository, create signed artifact |
| POST | /v1/search | Search symbols by query |
| POST | /v1/get | Get specific symbol by ID |
| POST | /v1/outline | List all files and symbols in artifact |
| POST | /v1/callgraph | Build Solidity call graph from file |
| GET | /v1/metrics | Service metrics snapshot |

All /v1/* endpoints require Bearer token authentication.
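As a minimal sketch of what an authenticated call might look like, the snippet below builds (but does not send) a POST to /v1/search using only the standard library. The JSON field names are an assumption borrowed from the search_symbols tool parameters; the actual request schema lives in app/models/schemas.py and may differ.

```python
import json
import urllib.request


def build_search_request(base_url: str, token: str, body: dict) -> urllib.request.Request:
    """Build an authenticated POST to /v1/search without sending it."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # required on every /v1/* route
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_search_request(
    "http://127.0.0.1:18765",
    "example-token",
    # Assumed body fields, mirroring the search_symbols tool parameters.
    {"query": "transfer", "workspace_id": "ws1"},
)
# urllib.request.urlopen(req) would perform the call against a running service.
```

A request without the Authorization header should be rejected on any /v1/* route, while GET /health needs no token.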

MCP Tools (stdio interface)

| Tool | Purpose |
|---|---|
| index_repo(source_type, source_id, workspace_id, source_revision) | Index a repository |
| search_symbols(query, workspace_id, artifact_id, source_fingerprint) | Search symbols |
| get_symbol(symbol_id, workspace_id, artifact_id, source_fingerprint) | Get symbol detail |
| outline_file(workspace_id, artifact_id, source_fingerprint) | File/symbol outline |
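For orientation, MCP tool invocations travel as JSON-RPC 2.0 messages with the method tools/call, and the stdio transport sends one JSON message per line. The sketch below only shows the shape of such a message for search_symbols; the helper name tool_call_message and the argument values are illustrative, and a real client must complete the MCP initialize handshake before calling tools.

```python
import json


def tool_call_message(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Placeholder values; artifact_id and source_fingerprint come from index_repo.
line = tool_call_message(
    1,
    "search_symbols",
    {
        "query": "withdraw",
        "workspace_id": "ws1",
        "artifact_id": "<artifact_id>",
        "source_fingerprint": "sha256:<fingerprint>",
    },
)
```

In practice an MCP client library handles this framing, so voro-brain would not normally build these messages by hand.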

Supported Languages

Symbol extraction is regex-based and runs line by line. No AST is built, which trades some precision for speed and portability:

  • Python
  • JavaScript
  • TypeScript
  • Go
  • Rust
  • Java
  • PHP
  • Solidity (with additional call graph analysis, visibility metadata, and reachability tracking)
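To illustrate the regex-based, line-by-line approach, here is a minimal sketch covering two of the eight languages. The patterns and the extract_symbols helper are assumptions made for illustration, not the contents of app/core/parser.py.

```python
import re
from typing import Iterator

# Illustrative patterns only; the real parser's patterns are not shown in this document.
PATTERNS = {
    "python": [
        ("function", re.compile(r"^\s*(?:async\s+)?def\s+([A-Za-z_]\w*)")),
        ("class", re.compile(r"^\s*class\s+([A-Za-z_]\w*)")),
    ],
    "solidity": [
        ("contract", re.compile(r"^\s*(?:abstract\s+)?contract\s+([A-Za-z_$][\w$]*)")),
        ("interface", re.compile(r"^\s*interface\s+([A-Za-z_$][\w$]*)")),
        ("function", re.compile(r"^\s*function\s+([A-Za-z_$][\w$]*)")),
    ],
}


def extract_symbols(source: str, language: str) -> Iterator[dict]:
    """Scan each line once and yield the first matching symbol, if any."""
    for lineno, text in enumerate(source.splitlines(), start=1):
        for kind, pattern in PATTERNS.get(language, []):
            match = pattern.match(text)
            if match:
                yield {"kind": kind, "name": match.group(1),
                       "line": lineno, "language": language}
                break


SAMPLE = """contract Vault {
    function deposit() external payable {}
    function _sweep() internal {}
}"""

symbols = list(extract_symbols(SAMPLE, "solidity"))
```

Because nothing is parsed into a tree, a declaration split across several lines or hidden inside a comment can be missed or misreported; that is the precision cost of skipping the AST.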

Security Model

| Layer | Mechanism |
|---|---|
| Artifact Signing | HMAC-SHA256 over canonical JSON (deterministic field ordering) |
| Trust Modes | strict (signature required, production) vs legacy (unsigned allowed, dev only) |
| Bearer Auth | Required for all /v1/* endpoints |
| Path Safety | Symlink escape detection, secret file filtering, binary exclusion |
| Identity Verification | workspace_id + source_fingerprint + artifact_id verified on all queries |
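The path safety layer can be pictured roughly as follows. This is a sketch under stated assumptions: the deny-list in SECRET_NAMES, the NUL-byte binary heuristic, and every function name here are illustrative and are not taken from app/core/safety.py.

```python
import os
import tempfile
from pathlib import Path

# Illustrative deny-list; the service's real secret filter is not documented here.
SECRET_NAMES = {".env", "id_rsa", "credentials.json"}


def is_inside_root(root: Path, candidate: Path) -> bool:
    """True if candidate, after resolving symlinks, still lives under root."""
    root_real = root.resolve()
    cand_real = candidate.resolve()
    return cand_real == root_real or root_real in cand_real.parents


def looks_binary(head: bytes) -> bool:
    """Common heuristic: a NUL byte in the first chunk suggests binary data."""
    return b"\x00" in head


def is_safe_file(root: Path, path: Path) -> bool:
    if not is_inside_root(root, path):
        return False  # a symlink (or ..) escapes the repository root
    if path.name in SECRET_NAMES:
        return False
    try:
        with path.open("rb") as fh:
            head = fh.read(8192)
    except OSError:
        return False
    return not looks_binary(head)


# Small self-check in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    repo = base / "repo"
    repo.mkdir()
    (repo / "a.py").write_text("x = 1\n")
    (repo / ".env").write_text("KEY=value\n")
    (repo / "blob.bin").write_bytes(b"\x00\x01\x02")
    (base / "outside.txt").write_text("not part of the repo\n")
    os.symlink(base / "outside.txt", repo / "link.txt")
    results = {p.name: is_safe_file(repo, p) for p in sorted(repo.iterdir())}
```

Resolving both paths before comparing them is what catches the symlink case: link.txt sits inside the repository but resolves to a file outside it.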

ArtifactEnvelope (output contract)

Every indexed repository produces a signed ArtifactEnvelope:

{
  "schema_version": "c35-v1",
  "workspace_id": "ws1",
  "source_type": "github|git|local_repo|snapshot",
  "source_id": "owner/repo",
  "source_revision": "commit_sha",
  "source_fingerprint": "sha256:...",
  "artifact_id": "24-char sha256 prefix",
  "manifest": {
    "signer": "voro-index-guard",
    "signed_at": "ISO8601",
    "signature": "hmac_hex"
  },
  "payload": {
    "files": [...],
    "symbols": [...],
    "stats": { "file_count": N, "symbol_count": N },
    "token_savings_estimate": { ... }
  }
}
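The signing and trust-mode behavior can be sketched as below. Two points are assumptions rather than documented facts: that the signature covers every envelope field except manifest, and that canonical JSON means sorted keys with compact separators. The function names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_json(obj) -> bytes:
    """Deterministic serialization: sorted keys, no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def sign_envelope(envelope: dict, key: bytes) -> str:
    # Assumption: the manifest (which holds the signature) is excluded from the MAC.
    unsigned = {k: v for k, v in envelope.items() if k != "manifest"}
    return hmac.new(key, canonical_json(unsigned), hashlib.sha256).hexdigest()


def verify_envelope(envelope: dict, key: bytes, trust_mode: str = "strict") -> bool:
    signature = envelope.get("manifest", {}).get("signature")
    if not signature:
        # Unsigned artifacts are only tolerated in legacy (dev) mode.
        return trust_mode == "legacy"
    expected = sign_envelope(envelope, key)
    return hmac.compare_digest(expected, signature)  # constant-time comparison


key = b"example-signing-key"
envelope = {
    "schema_version": "c35-v1",
    "workspace_id": "ws1",
    "payload": {"files": [], "symbols": []},
}
envelope["manifest"] = {
    "signer": "voro-index-guard",
    "signature": sign_envelope(envelope, key),
}

ok = verify_envelope(envelope, key)
tampered = {**envelope, "workspace_id": "ws2"}
tampered_ok = verify_envelope(tampered, key)
```

Sorting keys before hashing is what makes the signature independent of the order in which fields were inserted or serialized, so a consumer can re-serialize the envelope and still verify it.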

Symbol Fields

| Field | Type | Note |
|---|---|---|
| id | string | 24-char SHA256 prefix |
| kind | string | function, class, interface, contract |
| name | string | Symbol name |
| file | string | File path within repo |
| line | int | Line number |
| language | string | One of 8 supported languages |
| snippet | string | Source code context (±2 lines) |
| visibility | string | Solidity only: public, external, internal, private |
| payable | bool | Solidity only |
| reachable | bool | Solidity only; derived from call graph analysis |
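One plausible way to derive the reachable flag is a breadth-first walk of the call graph starting from externally callable functions. The sketch assumes that "reachable" means reachable from a public or external function, which this document does not state explicitly; mark_reachable and the sample data are hypothetical.

```python
from collections import deque


def mark_reachable(visibility: dict[str, str],
                   calls: dict[str, set[str]]) -> dict[str, bool]:
    """Flag every function reachable from a public/external entry point."""
    # Assumption: only public and external functions can be called from outside.
    entry_points = [name for name, vis in visibility.items()
                    if vis in ("public", "external")]
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        for callee in calls.get(caller, ()):
            if callee in visibility and callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return {name: name in seen for name in visibility}


visibility = {
    "deposit": "external",
    "_credit": "internal",
    "_orphan": "private",
}
calls = {"deposit": {"_credit"}, "_orphan": {"_credit"}}

reachable = mark_reachable(visibility, calls)
```

Here _credit is reachable through deposit, while _orphan is not, because nothing externally callable leads to it.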

Deployment

Production via Zeabur (Docker):

  • Image: Python 3.12-slim from Dockerfile
  • Port: 8080
  • Required env vars: CODE_INDEX_SERVICE_TOKEN, CODE_INDEX_SIGNING_KEY, CODE_INDEX_TRUST_MODE=strict
  • Persistent volume: /data/artifacts
  • Health check: GET /health → 200

# Smoke test
./scripts/smoke_prod.sh https://<domain> <token>
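A startup check for the required environment variables might look like the sketch below. The load_settings helper, the fail-fast behavior, and the default of strict when CODE_INDEX_TRUST_MODE is unset are assumptions for illustration; app/config.py may handle this differently.

```python
import os

REQUIRED_VARS = ("CODE_INDEX_SERVICE_TOKEN", "CODE_INDEX_SIGNING_KEY")
TRUST_MODES = ("strict", "legacy")


def load_settings(env=None) -> dict:
    """Read deployment settings, failing fast when something is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    # Assumed default: fall back to the safer mode when the variable is unset.
    trust_mode = env.get("CODE_INDEX_TRUST_MODE", "strict")
    if trust_mode not in TRUST_MODES:
        raise ValueError(f"unknown trust mode: {trust_mode!r}")
    return {
        "service_token": env["CODE_INDEX_SERVICE_TOKEN"],
        "signing_key": env["CODE_INDEX_SIGNING_KEY"].encode("utf-8"),
        "trust_mode": trust_mode,
    }


settings = load_settings({
    "CODE_INDEX_SERVICE_TOKEN": "example-token",
    "CODE_INDEX_SIGNING_KEY": "example-signing-key",
    "CODE_INDEX_TRUST_MODE": "strict",
})
```

Failing at startup rather than on the first request keeps a misconfigured deployment from passing the health check while being unable to sign artifacts.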

Module Structure

app/
├── main.py              # FastAPI app setup
├── config.py            # Settings/env configuration
├── security.py          # Bearer token auth
├── metrics.py           # Request/success/deny counters
├── mcp_server.py        # FastMCP stdio wrapper + managed subprocess
├── models/schemas.py    # Pydantic request/response models
├── routes/
│   ├── index.py         # POST /v1/index (artifact signing)
│   └── query.py         # Search/get/outline/metrics/callgraph
└── core/
    ├── artifacts.py     # Artifact persistence & verification
    ├── callgraph.py     # Solidity call graph analysis
    ├── identity.py      # Source fingerprinting
    ├── indexer.py       # GitHub & local repo indexing
    ├── ingest.py        # File discovery & reading
    ├── parser.py        # Symbol extraction (8 languages)
    ├── safety.py        # Symlink/secret/binary checks
    ├── signing.py       # HMAC-SHA256 signing
    └── store.py         # Symbol indexing & querying

Build & Run

pip install -r requirements.txt
pytest tests/unit/ # Run tests
uvicorn app.main:app --host 0.0.0.0 --port 8080 # Run HTTP API
python -m app.mcp_server # Run MCP stdio server

Current State

  • Version: 0.1.0
  • Tests: 8 unit test modules
  • Branch: main at 45cdb1d
  • Shipped: Phase 2.0 (MCP wrapper), Phase 3.0 (Solidity call graphs with visibility)
  • Open issue: #5 — Parse Solidity visibility modifiers for reachability (blocks voro-brain Phase 2.4 re-evaluation)