AI Inference Integration
The oracle-bridge provides AI-powered endpoints for the DAO's governance and collaboration tools. All AI features route through the oracle-bridge (a Node.js off-chain service), which communicates with an Ollama instance running on-cluster. This keeps inference costs off-chain while leaving results verifiable and auditable.
Epic: BL-045 — implemented across oracle-bridge, governance-suite, dao-suite.
Architecture Overview
Browser / Suite frontend
        │
        ▼
oracle-bridge (Node.js, port 3000 / 8787 staging)
        │
        ├── PostgreSQL (proposal summary cache, embeddings)
        │
        └── Ollama HTTP API (Theo node: 192.168.2.160, port 11434)
                │
                ├── mistral:7b (drafting assistant)
                ├── llama3.2:3b (semantic search / embeddings)
                └── tinyllama:1.1b (fallback / health checks)

The oracle-bridge is the only service that talks to Ollama. Frontends never call Ollama directly.
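For orientation, this is roughly what a non-streaming call against Ollama's `/api/generate` endpoint looks like from the bridge side. This is an illustrative sketch, not the actual oracle-bridge client; the `buildGeneratePayload` helper and its default model are assumptions for the example.

```typescript
// Illustrative payload builder for Ollama's /api/generate endpoint.
// stream: false makes Ollama return one JSON object with a `response` field
// instead of a stream of chunks.
interface GeneratePayload {
  model: string;
  prompt: string;
  stream: boolean;
}

function buildGeneratePayload(prompt: string, model = "mistral:7b"): GeneratePayload {
  return { model, prompt, stream: false };
}

// Hypothetical wrapper: POSTs the payload to the configured Ollama host and
// returns the generated text.
async function generate(host: string, prompt: string): Promise<string> {
  const res = await fetch(`${host}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGeneratePayload(prompt)),
  });
  if (!res.ok) throw new Error(`Ollama returned ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response;
}
```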
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| OLLAMA_HOST | http://192.168.2.160:11434 | Ollama base URL (Theo node internal) |
| OLLAMA_DEFAULT_MODEL | mistral:7b | Model used for drafting and summaries |
| OLLAMA_EMBED_MODEL | llama3.2:3b | Model used for embedding generation |
| OLLAMA_TIMEOUT_MS | 30000 | Request timeout in milliseconds |
| OLLAMA_CIRCUIT_OPEN_THRESHOLD | 3 | Failures before circuit opens |
| OLLAMA_CIRCUIT_RESET_MS | 60000 | Time before circuit half-opens |
| AI_CACHE_TTL_SECONDS | 3600 | PostgreSQL cache TTL for summaries |
For local development against the cluster Theo node, set OLLAMA_HOST in .env.local. For staging/production the value is injected via Docker secrets.
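A config loader over these variables might look like the sketch below. The helper names (`intFromEnv`, `loadAiConfig`) are illustrative, not the oracle-bridge's actual loader; the defaults mirror the table above.

```typescript
type Env = Record<string, string | undefined>;

// Illustrative config reader: falls back to the documented defaults when a
// variable is unset or unparseable, and parses integers with an explicit radix.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw === "") return fallback;
  const parsed = parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

function loadAiConfig(env: Env) {
  return {
    ollamaHost: env.OLLAMA_HOST ?? "http://192.168.2.160:11434",
    defaultModel: env.OLLAMA_DEFAULT_MODEL ?? "mistral:7b",
    embedModel: env.OLLAMA_EMBED_MODEL ?? "llama3.2:3b",
    timeoutMs: intFromEnv(env, "OLLAMA_TIMEOUT_MS", 30000),
    circuitOpenThreshold: intFromEnv(env, "OLLAMA_CIRCUIT_OPEN_THRESHOLD", 3),
    circuitResetMs: intFromEnv(env, "OLLAMA_CIRCUIT_RESET_MS", 60000),
    cacheTtlSeconds: intFromEnv(env, "AI_CACHE_TTL_SECONDS", 3600),
  };
}
```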
Model Selection
Each cluster node has different model availability:
| Node | Internal Address | Models | Notes |
|---|---|---|---|
| Theo (.160) | ollama-theo-svc.hello-world:11434 | TinyLlama 1.1B, Llama 3.2 3B, Mistral 7B Q4 | Coby's dedicated node, 16GB RAM, GTX 1060 |
| Aurora (.159) | 192.168.2.159:31434 | Various | Graydon's primary, 64GB RAM, GTX 1070 |
Model swaps take 10–30 seconds on Theo (only one model loaded at a time). Design prompts to tolerate cold-start latency.
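One way to tolerate cold-start latency is an explicit per-call timeout, with a larger budget for the first request after a model swap. A minimal sketch (the `withTimeout` helper is an assumption for illustration, not the bridge's actual mechanism, which uses OLLAMA_TIMEOUT_MS):

```typescript
// Illustrative timeout wrapper: rejects if the wrapped promise does not
// settle within `ms` milliseconds. A caller could pass a generous budget
// (covering the 10-30s model swap) for the first request after a cold start.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```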
Circuit Breaker Pattern
The oracle-bridge wraps every Ollama call in a circuit breaker (src/ai/circuitBreaker.ts). This prevents cascading failures when Ollama is unreachable or overloaded.
States
CLOSED ──(3 failures)──► OPEN ──(60s timeout)──► HALF-OPEN ──(success)──► CLOSED
                           ▲                         │
                           └────────(failure)────────┘

| State | Behavior |
|---|---|
|---|---|
| CLOSED | Normal — all requests pass through |
| OPEN | Fast-fail — returns 503 immediately, no Ollama calls |
| HALF-OPEN | One probe request allowed — success closes, failure reopens |
Implementation
import { CircuitBreaker } from '../ai/circuitBreaker';

const breaker = new CircuitBreaker({
  failureThreshold: parseInt(process.env.OLLAMA_CIRCUIT_OPEN_THRESHOLD ?? '3', 10),
  resetTimeoutMs: parseInt(process.env.OLLAMA_CIRCUIT_RESET_MS ?? '60000', 10),
});

async function callOllama(prompt: string): Promise<string> {
  return breaker.execute(() => ollamaClient.generate(prompt));
}

When the circuit is OPEN, endpoints return:
{ "error": "AI service temporarily unavailable", "code": "CIRCUIT_OPEN" }

GlitchTip alerts fire when the circuit opens (tag: ai.circuit_open).
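For illustration, the state machine described above can be sketched as follows. This is not the real implementation in src/ai/circuitBreaker.ts; the class and method names are assumptions, and the injectable clock exists only to make the sketch testable without real waiting.

```typescript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

// Minimal illustrative circuit breaker matching the documented transitions:
// CLOSED -(threshold failures)-> OPEN -(reset timeout)-> HALF_OPEN,
// then success closes and failure reopens.
class SimpleCircuitBreaker {
  private state: CircuitState = "CLOSED";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private failureThreshold = 3,
    private resetTimeoutMs = 60000,
    private now: () => number = () => Date.now(),
  ) {}

  // Effective state; an elapsed reset timeout moves OPEN to HALF_OPEN.
  getState(): CircuitState {
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "HALF_OPEN";
    }
    return this.state;
  }

  // While OPEN the caller should fast-fail (e.g. return 503) without calling Ollama.
  canRequest(): boolean {
    return this.getState() !== "OPEN";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "CLOSED";
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.getState() === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = this.now();
      this.failures = 0;
    }
  }
}
```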
API Reference
All endpoints require a valid session cookie (httpOnly, domain=.helloworlddao.com) unless otherwise noted. Oracle-bridge validates the session against auth-service before serving AI responses.
Base URL
| Environment | URL |
|---|---|
| Local | http://localhost:3000 |
| Staging | https://staging-oracle.helloworlddao.com |
| Production | https://oracle.helloworlddao.com |
POST /api/ai/summarize
Generate a concise summary of a governance proposal. Results are cached in PostgreSQL by proposal_id with a configurable TTL.
Request
POST /api/ai/summarize
Content-Type: application/json
Cookie: session=<token>
{
"proposal_id": "prop_abc123",
"title": "Allocate 500 DOM for community garden",
"body": "We propose to allocate 500 DOM tokens from the treasury..."
}

Response
{
"summary": "Allocates 500 DOM from treasury to fund a shared garden space, benefiting ~40 members. Vote closes in 5 days.",
"cached": false,
"model": "mistral:7b",
"latency_ms": 1240
}

| Field | Type | Description |
|---|---|---|
| summary | string | 1–3 sentence plain-language summary |
| cached | boolean | true if returned from PostgreSQL cache |
| model | string | Ollama model that generated the response |
| latency_ms | number | Time from request to response (0 if cached) |
Error Responses
| Status | Code | Meaning |
|---|---|---|
| 503 | CIRCUIT_OPEN | Ollama unreachable — circuit is open |
| 422 | VALIDATION_ERROR | Missing or invalid proposal_id / body |
| 401 | UNAUTHORIZED | Invalid or expired session |
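A client can use these codes to decide whether a failed call is worth retrying. A sketch of that decision (the helper name and retry budgets are illustrative assumptions, not part of the API):

```typescript
// Illustrative retry policy for the AI endpoints. Returns a suggested delay
// in milliseconds before retrying, or null when a retry cannot succeed.
interface AiError {
  code: string;
}

function retryDelayMs(status: number, body: AiError): number | null {
  // CIRCUIT_OPEN: the breaker half-opens after ~60s, so a delayed retry can work.
  if (status === 503 && body.code === "CIRCUIT_OPEN") return 60000;
  // Validation and auth failures: the caller must fix the request or session first.
  if (status === 422 || status === 401) return null;
  // Other 5xx: transient server error, short backoff.
  if (status >= 500) return 5000;
  return null;
}
```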
POST /api/ai/search
Semantic search across proposals using cosine similarity over stored embeddings. Results are ranked by semantic relevance, not keyword match.
Request
POST /api/ai/search
Content-Type: application/json
Cookie: session=<token>
{
"query": "environmental sustainability funding",
"limit": 10,
"threshold": 0.75
}

Response
{
"results": [
{
"proposal_id": "prop_xyz789",
"title": "Green Energy Infrastructure Grant",
"score": 0.91,
"excerpt": "Funding allocation for solar panel installation..."
}
],
"query_embedding_ms": 340,
"search_ms": 12
}

| Field | Type | Description |
|---|---|---|
| results[].score | float | Cosine similarity (0.0–1.0, higher = more relevant) |
| threshold | float | Request parameter: minimum score to include (default: 0.75) |
| limit | int | Request parameter: max results (default: 10, max: 50) |
Embeddings are generated using llama3.2:3b and stored in PostgreSQL as vector(4096). New proposals are embedded asynchronously after creation.
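The ranking step reduces to cosine similarity between the query embedding and each stored vector. A plain implementation for reference (illustrative; in production the comparison runs against the PostgreSQL-stored vectors, not in application code):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). For embedding vectors, scores
// near 1.0 indicate semantically similar text; results under the request's
// `threshold` would be dropped before ranking.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // undefined for zero vectors; report 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```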
POST /api/ai/draft
AI-assisted proposal drafting. Accepts a brief intent and returns a structured first draft.
Request
POST /api/ai/draft
Content-Type: application/json
Cookie: session=<token>
{
"intent": "Propose adding a community library of maker tools",
"category": "infrastructure",
"max_tokens": 400
}

Response
{
"draft": {
"title": "Establish Community Maker Tool Library",
"body": "## Summary\n\nThis proposal establishes a shared library of maker tools...",
"suggested_budget": null,
"tags": ["infrastructure", "community", "tools"]
},
"model": "mistral:7b",
"latency_ms": 2100
}

The draft is a starting point — members must edit and review before submitting. The governance-suite proposal creation form pre-fills from the draft response.
POST /api/ai/embed
Generate an embedding vector for arbitrary text. Used internally for indexing; also available for advanced integrations.
Request
POST /api/ai/embed
Content-Type: application/json
Cookie: session=<token>
{
"text": "Sustainable agriculture and community food systems"
}

Response
{
"embedding": [0.021, -0.043, ...],
"dimensions": 4096,
"model": "llama3.2:3b"
}

GET /api/ai/health
Returns Ollama connectivity status and circuit state. Does not require authentication.
Response
{
"status": "ok",
"circuit": "CLOSED",
"model_available": true,
"ollama_host": "http://192.168.2.160:11434",
"default_model": "mistral:7b"
}

Proposal Summary Caching
Summaries are cached in PostgreSQL (proposal_summaries table) to avoid redundant Ollama calls. Cache key is proposal_id. Cache is invalidated when the proposal body changes (detected by SHA-256 hash of the body).
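The freshness check can be sketched as comparing the hash stored with the cached summary against a hash of the current body. Helper names here are illustrative, not the actual oracle-bridge code:

```typescript
import { createHash } from "node:crypto";

// Illustrative cache-freshness check: a cached summary is reusable only if
// the SHA-256 of the current proposal body matches the hash stored alongside it.
function bodyHash(body: string): string {
  return createHash("sha256").update(body, "utf8").digest("hex");
}

function cacheIsFresh(storedHash: string, currentBody: string): boolean {
  return storedHash === bodyHash(currentBody);
}
```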
A reindex cron job runs nightly to generate summaries for any proposals added without a cached summary:
# Manually trigger reindex (oracle-bridge admin endpoint)
curl -X POST https://oracle.helloworlddao.com/api/ai/admin/reindex \
  -H "X-API-Token: fos_<token>"

Monitoring & Alerts
GlitchTip DSN: https://017a18...@glitchtip.founderyos.dev/4
| Alert | Trigger | Tag |
|---|---|---|
| Circuit opened | Circuit transitions CLOSED → OPEN | ai.circuit_open |
| High latency | Ollama response > 15s | ai.latency_high |
| Cache miss spike | Cache hit rate drops below 40% | ai.cache_miss_rate |
All AI errors are tagged with service:oracle-bridge and component:ai in GlitchTip.
Local Development
To run AI endpoints locally without a cluster connection, use the Ollama desktop app or Docker:
# Install Ollama locally
curl -fsSL https://ollama.ai/install.sh | sh
# Pull required models
ollama pull mistral:7b
ollama pull llama3.2:3b
# Point oracle-bridge at local Ollama
echo 'OLLAMA_HOST=http://localhost:11434' >> oracle-bridge/.env.local
# Start oracle-bridge
cd oracle-bridge && npm run dev

The circuit breaker is disabled in test mode (NODE_ENV=test) to avoid flakiness in unit tests.