voro-brain

The intelligence layer. A Bayesian scoring engine that consumes agent-builder scan output and produces calibrated ThreatReports with certainty quantification.

Purpose

voro-brain transforms raw scanner findings into risk-calibrated threat assessments. It applies Bayesian inference with Beta distributions to score vulnerabilities across 6 threat dimensions, producing a structured ThreatReport with confidence intervals, safety grades, and actionable verdicts.

Architecture

voro-brain report <target> --audit-json <path> --repo-path <path>
  → AgentBuilderAdapter.convert_audit()   # Translate AB JSON → Findings
  → IntentAnalyzer.analyze()              # Honeypot, time bomb detection (27 patterns, 7 langs)
  → FeatureExtractor.extract()            # ERC detection, proxy patterns (30 fields, 46 detectors)
  → BrainEngine.score_findings()          # Bayesian scoring → RiskScore (6 dimensions)
  → ReportAssembler.assemble()            # Wire results → ThreatReport
  → VerdictWriter.write_verdict()         # Plain-language verdict + severity
  → SafetyGrade.compute()                 # Deterministic A-F grading
  → JSON output to stdout

Module Inventory

Module               Responsibility
src/brain/           Bayesian inference, Beta distributions, 6-dimension scoring, Fernet-encrypted priors
src/analysis/        Pipeline orchestration, honeypot/backdoor detection, feature extraction, differential cache
src/models/          Pydantic models: Finding, ScanResult, RiskScore, ThreatReport, DecisionMetadata
src/ingestion/       Agent-builder JSON → voro-brain Findings (46 → 18 category mapping)
src/output/          Report assembly, verdicts, A-F safety grading, lock v2 attestation
src/security/        Dual-LLM isolation: untrusted zone → schema validator → privileged scoring
src/calibration/     24-parameter calibration engine with grid/random sweeps
src/exploitability/  MCP client to voro-guard for reachability analysis
src/feed/            Threat landscape feed: ingest → distill → publish to exploit store
src/postmortem/      Fetch verified source from 7 chains, compare brain score vs actual exploits
src/bulk/            Bulk scanning: load targets from YAML, sequential scan, aggregate analytics
src/daemon/          Warm brain kernel over Unix domain socket for low-latency repeat calls

6 Threat Dimensions

Every scanned target is scored across 6 orthogonal risk dimensions:

Dimension          What It Measures
fund_safety        Direct financial risk: reentrancy, flash loans, price manipulation
access_control     Authorization gaps: missing ownership checks, privilege escalation
external_risk      Third-party exposure: oracle manipulation, untrusted calls
code_integrity     Code quality issues: integer overflow, logic bugs, unchecked returns
dependency_health  Supply chain risk: vulnerable dependencies, outdated packages
agent_autonomy     Agentic risk: autonomous execution, unscoped permissions, prompt injection

Each dimension produces a risk score (0-10) with confidence intervals derived from Beta distribution posteriors.
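
To make the relationship between a Beta posterior and a bounded score concrete, here is a minimal sketch. It assumes the score is the posterior mean scaled to 0-10 and that the interval is an equal-tailed 95% interval estimated by sampling. The function name, the scaling rule, the interval level, and the example parameters are all illustrative assumptions, not voro-brain's actual implementation.

```python
import random


def risk_score_with_interval(alpha, beta, level=0.95, draws=20_000, seed=0):
    """Return (score, low, high) on a 0-10 scale for a Beta(alpha, beta) posterior.

    Assumption: score = 10 * posterior mean. The interval is an equal-tailed
    interval estimated from seeded Monte Carlo draws, so results are repeatable.
    """
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    low = samples[int(tail * draws)]
    high = samples[int((1.0 - tail) * draws) - 1]
    mean = alpha / (alpha + beta)
    return 10 * mean, 10 * low, 10 * high


# Hypothetical posterior for one dimension: 8 pseudo-counts of "risky", 2 of "safe".
score, low, high = risk_score_with_interval(alpha=8.0, beta=2.0)
print(f"risk_score={score:.2f}, interval=({low:.2f}, {high:.2f})")
```

A posterior with larger alpha + beta concentrates more tightly around its mean, so the same mean score comes with narrower bounds.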

Bayesian Scoring

voro-brain represents its belief about vulnerability risk as a Beta distribution rather than a point estimate, so every score carries an explicit measure of how certain it is.

How It Works

  1. Prior beliefs are loaded from corpus-calibrated baselines (772 pattern priors from 1,113 contracts)
  2. Scanner findings update these priors via Bayesian inference using the Beta-Binomial conjugate model
  3. Confidence intervals narrow as more evidence accumulates
  4. Per-technique thresholds prevent known-dangerous patterns from scoring artificially low
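
The steps above can be sketched with the standard Beta-Binomial conjugate update, where observed hits are added to alpha and misses to beta. The class name, the example counts, and the idea of enforcing the per-technique threshold as a simple score floor are assumptions made for illustration. The source does not say how voro-brain implements the thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Belief about a pattern's risk, held as Beta(alpha, beta)."""
    alpha: float
    beta: float

    @property
    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self):
        n = self.alpha + self.beta
        return self.alpha * self.beta / (n * n * (n + 1))

    def update(self, hits, misses):
        # Conjugate update: successes add to alpha, failures add to beta.
        return BetaBelief(self.alpha + hits, self.beta + misses)


def apply_floor(score, floor):
    """Hypothetical per-technique floor: never report below `floor`."""
    return max(score, floor)


prior = BetaBelief(alpha=2.0, beta=8.0)        # illustrative prior, mean 0.2
posterior = prior.update(hits=12, misses=4)    # illustrative evidence
score = apply_floor(10 * posterior.mean, floor=7.0)

print(prior.variance, posterior.variance)      # variance shrinks with this evidence
print(score)
```

In this example the posterior variance is smaller than the prior variance, which is what narrows the interval, and the floor raises the reported score to 7.0 because the unfloored value falls below it.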

Calibration Corpus

  • 1,113 contracts (1,039 clean + 74 vulnerable) with known ground truth
  • Scanner confidence defaults: Regex 0.439, Opengrep 0.093, Slither 0.7455
  • 3-tier parameter space: Tier A (thresholds), Tier B (normalization), Tier C (Bayesian dynamics)
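
For readers unfamiliar with grid sweeps, here is a minimal sketch of the idea behind a threshold-tier sweep: try every combination of candidate values and rank the trials by F1. The parameter names (`flag_threshold`, `min_confidence`), the candidate values, and the six labeled samples are invented for this sketch and do not come from the voro-brain corpus or its calibration engine.

```python
from itertools import product

# Synthetic (risk_score, scanner_confidence, is_vulnerable) samples, invented here.
SAMPLES = [
    (1.0, 0.9, False),
    (2.5, 0.3, False),
    (4.0, 0.7, False),
    (6.5, 0.5, True),
    (8.0, 0.8, True),
    (9.0, 0.9, True),
]


def f1(params):
    """F1 of the rule 'flag when score and confidence both clear their thresholds'."""
    tp = fp = fn = 0
    for score, confidence, vulnerable in SAMPLES:
        flagged = (score >= params["flag_threshold"]
                   and confidence >= params["min_confidence"])
        if flagged and vulnerable:
            tp += 1
        elif flagged:
            fp += 1
        elif vulnerable:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def grid_sweep(grid):
    """Evaluate every parameter combination; return trials sorted by F1, best first."""
    names = list(grid)
    trials = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        trials.append((f1(params), params))
    return sorted(trials, key=lambda trial: trial[0], reverse=True)


# Hypothetical threshold-tier grid.
TIER_A_GRID = {"flag_threshold": [3.0, 5.0, 7.0], "min_confidence": [0.2, 0.6]}
best_f1, best_params = grid_sweep(TIER_A_GRID)[0]
print(best_f1, best_params)
```

A random sweep would replace the exhaustive `product` loop with sampled combinations, which scales better as the number of parameters grows.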

Key Capabilities

  • Corpus-calibrated: All priors derived from benchmark data, not guesswork
  • Certainty-aware: Every score is a distribution with confidence bounds, not a single number
  • 6-dimensional: Risk decomposed into orthogonal dimensions for nuanced assessment
  • Deterministic grading: A-F safety grades computed from dimension scores with policy constraints
  • Feed pipeline: Ingests real-world exploit data (DeFiHackLabs, Immunefi) to improve priors
  • Post-mortem analysis: Fetch verified source for past exploits, compare brain predictions vs reality
  • Daemon mode: Warm brain singleton over Unix domain socket for sub-second repeat scans
  • Dual-LLM isolation: Untrusted code never reaches the privileged scoring zone without schema validation
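
As one way to picture the deterministic grading capability listed above, the sketch below maps the worst dimension score to a letter and then applies a single policy constraint. The cutoffs, the choice of the worst dimension as the driver, and the "critical finding caps the grade at D" rule are hypothetical. The source does not state voro-brain's actual grading policy.

```python
# Hypothetical cutoffs: a worst-dimension score below the bound earns the letter.
GRADE_CUTOFFS = [(2.0, "A"), (4.0, "B"), (6.0, "C"), (8.0, "D")]


def base_grade(worst_score):
    """Map a 0-10 risk score to a letter; higher risk means a worse grade."""
    for upper_bound, letter in GRADE_CUTOFFS:
        if worst_score < upper_bound:
            return letter
    return "F"


def safety_grade(dimension_scores, has_critical_finding=False):
    """Deterministic grade: same inputs always produce the same letter."""
    grade = base_grade(max(dimension_scores.values()))
    # Hypothetical policy constraint: a critical finding caps the grade at D.
    # Letters sort alphabetically, so grade < "D" means "better than D".
    if has_critical_finding and grade < "D":
        grade = "D"
    return grade


scores = {"fund_safety": 1.0, "access_control": 3.5}
print(safety_grade(scores))                             # B
print(safety_grade(scores, has_critical_finding=True))  # D
```

Because the function uses only comparisons on its inputs, with no sampling or model calls, repeated runs on the same report cannot disagree.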

CLI Reference

# Core scanning
voro-brain scan <target> --json-output # Full scan (AB subprocess + brain scoring)
voro-brain report <target> --audit-json <path> # Brain-only scoring (skip AB)

# Calibration
voro-brain calibrate run --strategy grid --tier A # Parameter sweep
voro-brain calibrate recommend --top 5 # Best configurations
voro-brain calibrate baseline # Reset to calibrated defaults

# Benchmarks
voro-brain benchmark run # Run against ground truth
voro-brain benchmark report <ID> # View results
voro-brain benchmark compare <BASE> <RUN> # Compare runs

# Feed pipeline
voro-brain feed run # Full feed cycle: poll → distill → publish

# Daemon
voro-brain daemon start # Start warm kernel (Unix domain socket)
voro-brain daemon status # Check daemon health

ThreatReport Output

The ThreatReport is the primary output contract consumed by voro-web:

ThreatReport {
  report_id, brain_version, scanned_at, scan_duration_ms,
  target, target_type, languages_detected, files_scanned, lines_analyzed,
  verdict, verdict_severity,
  dimensions: ReportDimensionScore[],   // 6 dimensions with risk_score + CI bounds
  findings: ReportFinding[],            // Individual findings with severity + confidence
  total_findings, gated_findings_count,
  standards_checked, standards_violations,
  decision_metadata: DecisionMetadata   // Verdict drivers, data quality flags
}

The JSON Schema at schemas/threat_report.schema.json is the contract between voro-brain and the voro-web TypeScript types.
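
A consumer that wants a quick sanity check before full schema validation could confirm that the top-level fields of the ThreatReport outline are present. The sketch below is a minimal example of that check. The helper name and the sample payload are assumptions, and treating every field as required is also an assumption, since the schema file may mark some as optional.

```python
import json

# Top-level field names taken from the ThreatReport outline in this document.
THREAT_REPORT_FIELDS = frozenset({
    "report_id", "brain_version", "scanned_at", "scan_duration_ms",
    "target", "target_type", "languages_detected", "files_scanned",
    "lines_analyzed", "verdict", "verdict_severity", "dimensions",
    "findings", "total_findings", "gated_findings_count",
    "standards_checked", "standards_violations", "decision_metadata",
})


def missing_fields(report):
    """Return the sorted top-level fields absent from a parsed report dict."""
    return sorted(THREAT_REPORT_FIELDS - set(report))


# Deliberately incomplete placeholder payload, not real voro-brain output.
raw = '{"report_id": "example", "verdict": "placeholder", "dimensions": [], "findings": []}'
report = json.loads(raw)
print(missing_fields(report))
```

This only checks that keys exist. Types, nested structures, and value ranges are what the JSON Schema itself is for.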

Database Schemas

Database                Purpose
pattern_library.db      772 vulnerability patterns + canary honeypots
local_evidence.db       Per-scan observations, outcomes, staged feedback (90-day TTL)
exploits.db             Historical exploit intelligence from feed pipeline
standards_xref.db       CWE ↔ SWC ↔ OWASP cross-standard mappings
calibration_results.db  Parameter sweep trials and composite F1 scores
benchmark_results.db    Precision, recall, F1, AUROC per benchmark run
baseline_priors.json    Fernet-encrypted Bayesian priors (pattern_id → alpha, beta)

Current State

  • Tests: 919+ passing (1,531 total with integration tests)
  • Priors: 772 patterns calibrated from 1,113-contract corpus
  • Benchmarks: 807 ground truth labels across 3 datasets (DeFiVulnLabs, SmartBugs, DeFiHackLabs)
  • Exploitability: Phase 2.4 on HOLD — assessor scored 0/10 on known-vulnerable contracts; remediation planned
  • Recalibration: In progress — improving scanner confidence defaults and dimension thresholds