voro-brain
The intelligence layer. A Bayesian scoring engine that consumes agent-builder scan output and produces calibrated ThreatReports with certainty quantification.
Purpose
voro-brain transforms raw scanner findings into risk-calibrated threat assessments. It applies Bayesian inference with Beta distributions to score vulnerabilities across 6 threat dimensions, producing a structured ThreatReport with confidence intervals, safety grades, and actionable verdicts.
Architecture
voro-brain report <target> --audit-json <path> --repo-path <path>
→ AgentBuilderAdapter.convert_audit() # Translate AB JSON → Findings
→ IntentAnalyzer.analyze() # Honeypot, time bomb detection (27 patterns, 7 langs)
→ FeatureExtractor.extract() # ERC detection, proxy patterns (30 fields, 46 detectors)
→ BrainEngine.score_findings() # Bayesian scoring → RiskScore (6 dimensions)
→ ReportAssembler.assemble() # Wire results → ThreatReport
→ VerdictWriter.write_verdict() # Plain-language verdict + severity
→ SafetyGrade.compute() # Deterministic A-F grading
→ JSON output to stdout
Module Inventory
| Module | Responsibility |
|---|---|
| src/brain/ | Bayesian inference, Beta distributions, 6-dimension scoring, Fernet-encrypted priors |
| src/analysis/ | Pipeline orchestration, honeypot/backdoor detection, feature extraction, differential cache |
| src/models/ | Pydantic models: Finding, ScanResult, RiskScore, ThreatReport, DecisionMetadata |
| src/ingestion/ | Agent-builder JSON → voro-brain Findings (46 → 18 category mapping) |
| src/output/ | Report assembly, verdicts, A-F safety grading, lock v2 attestation |
| src/security/ | Dual-LLM isolation: untrusted zone → schema validator → privileged scoring |
| src/calibration/ | 24-parameter calibration engine with grid/random sweeps |
| src/exploitability/ | MCP client to voro-guard for reachability analysis |
| src/feed/ | Threat landscape feed: ingest → distill → publish to exploit store |
| src/postmortem/ | Fetch verified source from 7 chains, compare brain score vs actual exploits |
| src/bulk/ | Bulk scanning: load targets from YAML, sequential scan, aggregate analytics |
| src/daemon/ | Warm brain kernel over Unix domain socket for low-latency repeat calls |
6 Threat Dimensions
Every scanned target is scored across 6 orthogonal risk dimensions:
| Dimension | What It Measures |
|---|---|
| fund_safety | Direct financial risk — reentrancy, flash loans, price manipulation |
| access_control | Authorization gaps — missing ownership checks, privilege escalation |
| external_risk | Third-party exposure — oracle manipulation, untrusted calls |
| code_integrity | Code quality issues — integer overflow, logic bugs, unchecked returns |
| dependency_health | Supply chain risk — vulnerable dependencies, outdated packages |
| agent_autonomy | Agentic risk — autonomous execution, unscoped permissions, prompt injection |
Each dimension produces a risk score (0-10) with confidence intervals derived from Beta distribution posteriors.
Bayesian Scoring
voro-brain uses Beta distributions — not point estimates — to represent belief about vulnerability risk. This means every score carries calibrated certainty.
How It Works
- Prior beliefs are loaded from corpus-calibrated baselines (772 pattern priors from 1,113 contracts)
- Scanner findings update these priors via Bayesian inference using the Beta-Binomial conjugate model
- Confidence intervals narrow as more evidence accumulates
- Per-technique thresholds prevent known-dangerous patterns from scoring artificially low
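The conjugate update described above can be sketched in a few lines. This is an illustrative model of the mechanism, not voro-brain's actual implementation: the function names, the normal-approximation credible interval, and the example prior values are all assumptions.

```python
import math

def beta_binomial_update(alpha: float, beta: float, hits: int, trials: int):
    """Beta-Binomial conjugate update: a Beta(alpha, beta) prior combined
    with `hits` positives out of `trials` observations yields a
    Beta(alpha + hits, beta + trials - hits) posterior."""
    return alpha + hits, beta + (trials - hits)

def beta_summary(alpha: float, beta: float, z: float = 1.96):
    """Posterior mean plus a normal-approximation credible interval
    (a sketch; an exact engine would use Beta quantiles instead)."""
    mean = alpha / (alpha + beta)
    var = (alpha * beta) / ((alpha + beta) ** 2 * (alpha + beta + 1))
    half = z * math.sqrt(var)
    return mean, max(0.0, mean - half), min(1.0, mean + half)

# A weak illustrative prior, updated by scanner observations:
a, b = beta_binomial_update(2.0, 8.0, hits=6, trials=10)   # -> Beta(8, 12)
mean, lo, hi = beta_summary(a, b)

# More evidence narrows the interval, as the bullet list notes:
a2, b2 = beta_binomial_update(a, b, hits=60, trials=100)
_, lo2, hi2 = beta_summary(a2, b2)
assert (hi2 - lo2) < (hi - lo)
```

The key property is the last assertion: accumulating evidence shrinks the credible interval even when the mean barely moves, which is what lets a score carry calibrated certainty rather than a bare point estimate.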
Calibration Corpus
- 1,113 contracts (1,039 clean + 74 vulnerable) with known ground truth
- Scanner confidence defaults: Regex 0.439, Opengrep 0.093, Slither 0.7455
- 3-tier parameter space: Tier A (thresholds), Tier B (normalization), Tier C (Bayesian dynamics)
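One plausible way the per-scanner confidence defaults could enter the update is as fractional pseudo-counts, so a low-precision scanner moves the posterior less than a high-precision one. The weighting scheme below is an assumption for illustration; only the three confidence values come from the corpus figures above.

```python
# Calibrated per-scanner confidence defaults (from the corpus section).
SCANNER_CONFIDENCE = {"regex": 0.439, "opengrep": 0.093, "slither": 0.7455}

def weighted_update(alpha: float, beta: float, findings):
    """Hypothetical evidence weighting: each (scanner, hit) observation
    contributes a fractional pseudo-count scaled by that scanner's
    calibrated confidence. `hit` is 1 for a confirmed signal, 0 otherwise."""
    for scanner, hit in findings:
        w = SCANNER_CONFIDENCE[scanner]
        alpha += w * hit
        beta += w * (1 - hit)
    return alpha, beta
```

Under this sketch a Slither finding shifts belief roughly eight times as much as an Opengrep finding, which is the intended effect of calibrating confidence per scanner.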
Key Capabilities
- Corpus-calibrated: All priors derived from benchmark data, not guesswork
- Certainty-aware: Every score is a distribution with confidence bounds, not a single number
- 6-dimensional: Risk decomposed into orthogonal dimensions for nuanced assessment
- Deterministic grading: A-F safety grades computed from dimension scores with policy constraints
- Feed pipeline: Ingests real-world exploit data (DeFiHackLabs, Immunefi) to improve priors
- Post-mortem analysis: Fetch verified source for past exploits, compare brain predictions vs reality
- Daemon mode: Warm brain singleton over Unix domain socket for sub-second repeat scans
- Dual-LLM isolation: Untrusted code never reaches the privileged scoring zone without schema validation
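Deterministic grading can be illustrated with a worst-dimension policy, where one critical dimension cannot be averaged away by five healthy ones. The cutoffs and the worst-dimension rule here are assumptions; the README only states that grades are deterministic functions of dimension scores under policy constraints.

```python
def safety_grade(dimension_scores: dict[str, float]) -> str:
    """Hypothetical A-F policy: grade on the riskiest dimension
    (scores are 0-10, higher = riskier). Cutoff values are assumed."""
    worst = max(dimension_scores.values())
    for grade, cutoff in [("A", 2.0), ("B", 4.0), ("C", 6.0), ("D", 8.0)]:
        if worst <= cutoff:
            return grade
    return "F"

# A single bad dimension drags the grade down regardless of the others:
safety_grade({"fund_safety": 1.0, "agent_autonomy": 7.5})  # -> "D"
```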
CLI Reference
# Core scanning
voro-brain scan <target> --json-output # Full scan (AB subprocess + brain scoring)
voro-brain report <target> --audit-json <path> # Brain-only scoring (skip AB)
# Calibration
voro-brain calibrate run --strategy grid --tier A # Parameter sweep
voro-brain calibrate recommend --top 5 # Best configurations
voro-brain calibrate baseline # Reset to calibrated defaults
# Benchmarks
voro-brain benchmark run # Run against ground truth
voro-brain benchmark report <ID> # View results
voro-brain benchmark compare <BASE> <RUN> # Compare runs
# Feed pipeline
voro-brain feed run # Full feed cycle: poll → distill → publish
# Daemon
voro-brain daemon start # Start warm kernel (Unix domain socket)
voro-brain daemon status # Check daemon health
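A client for the warm kernel might look like the sketch below. The socket path and the newline-delimited JSON wire format are assumptions; check the daemon module for the real protocol.

```python
import json
import socket

def daemon_request(payload: dict, sock_path: str = "/tmp/voro-brain.sock") -> dict:
    """Send one JSON request to the warm-kernel daemon over a Unix domain
    socket and read one newline-terminated JSON reply. The path and the
    line-delimited framing are assumptions, not the documented protocol."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall(json.dumps(payload).encode() + b"\n")
        buf = b""
        while not buf.endswith(b"\n"):
            chunk = s.recv(4096)
            if not chunk:
                break
            buf += chunk
        return json.loads(buf)
```

Keeping the kernel resident is what makes repeat scans cheap: the client pays only for socket round-trips, not for reloading priors and pattern databases on every call.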
ThreatReport Output
The ThreatReport is the primary output contract consumed by voro-web:
ThreatReport {
report_id, brain_version, scanned_at, scan_duration_ms,
target, target_type, languages_detected, files_scanned, lines_analyzed,
verdict, verdict_severity,
dimensions: ReportDimensionScore[], // 6 dimensions with risk_score + CI bounds
findings: ReportFinding[], // Individual findings with severity + confidence
total_findings, gated_findings_count,
standards_checked, standards_violations,
decision_metadata: DecisionMetadata // Verdict drivers, data quality flags
}
JSON Schema at schemas/threat_report.schema.json is the contract between voro-brain and voro-web TypeScript types.
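For a feel of the shape, here is a stdlib-dataclass sketch of the contract above. The real models are Pydantic (src/models/) with validation and more fields; the credible-interval field names below are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ReportDimensionScore:
    """One of the 6 dimensions; CI field names are assumed."""
    dimension: str        # e.g. "fund_safety"
    risk_score: float     # 0-10
    ci_lower: float       # credible-interval bounds
    ci_upper: float

@dataclass
class ThreatReport:
    """Shape-only sketch of the output contract; not the Pydantic model."""
    report_id: str
    verdict: str
    verdict_severity: str
    dimensions: list[ReportDimensionScore] = field(default_factory=list)
    total_findings: int = 0
```

On the voro-web side, the schema at schemas/threat_report.schema.json is the source of truth; a sketch like this only mirrors it.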
Database Schemas
| Database | Purpose |
|---|---|
| pattern_library.db | 772 vulnerability patterns + canary honeypots |
| local_evidence.db | Per-scan observations, outcomes, staged feedback (90-day TTL) |
| exploits.db | Historical exploit intelligence from feed pipeline |
| standards_xref.db | CWE ↔ SWC ↔ OWASP cross-standard mappings |
| calibration_results.db | Parameter sweep trials and composite F1 scores |
| benchmark_results.db | Precision, recall, F1, AUROC per benchmark run |
| baseline_priors.json | Fernet-encrypted Bayesian priors (pattern_id → alpha, beta) |
Current State
- Tests: 919+ passing (1,531 total with integration tests)
- Priors: 772 patterns calibrated from 1,113-contract corpus
- Benchmarks: 807 ground truth labels across 3 datasets (DeFiVulnLabs, SmartBugs, DeFiHackLabs)
- Exploitability: Phase 2.4 on HOLD — assessor scored 0/10 on known-vulnerable contracts; remediation planned
- Recalibration: In progress — improving scanner confidence defaults and dimension thresholds