Plugin pipeline
Xenovia Runtime is a high-performance, Go-based LLM proxy. Every request passes through six plugins in sequence. Each plugin implements a `PreLLMHook` (before the upstream call) and a `PostLLMHook` (after the response).
Middleware wraps the entire router at the fasthttp level and stamps X-Xenovia-Session-Id and X-Xenovia-Trace-Id into the response headers before the first body byte. This is required for streaming responses — the headers must be sent before the stream opens.
Plugin details
1. Auth
- Accepts `Authorization: Bearer xe_...` or `X-Xenovia-Key: xe_...`.
- Resolves the key against Redis (`apikey:{key}`, 5-minute TTL). On cache miss, calls the control plane `POST /api/v1/internal/auth/verify`.
- The resolved identity is an HMAC-signed blob containing `proxy_id` and `org_id`.
- `X-Xenovia-Agent-Path` header validation prevents cross-proxy key use: the resolved proxy ID must match the path segment.
- Keys are never logged; only an 8-character SHA-256 hex prefix appears in logs.
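
The key handling above can be sketched in a few lines of Go. This is an illustration, not the runtime's code: the function names `extractKey` and `logPrefix` are hypothetical, checking `Authorization` before `X-Xenovia-Key` is an assumption (the order is not documented here), and the log prefix is assumed to be taken from the SHA-256 digest of the raw key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// extractKey returns the API key from the values of the two supported headers.
// Assumption: Authorization wins when both headers are present.
func extractKey(authorization, xenoviaKey string) (string, bool) {
	if strings.HasPrefix(authorization, "Bearer ") {
		key := strings.TrimSpace(strings.TrimPrefix(authorization, "Bearer "))
		if key != "" {
			return key, true
		}
	}
	if key := strings.TrimSpace(xenoviaKey); key != "" {
		return key, true
	}
	return "", false
}

// logPrefix returns the first 8 hex characters of the key's SHA-256 digest,
// which can appear in logs in place of the key itself.
func logPrefix(key string) string {
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:])[:8]
}
```

A handler would call `extractKey(r.Header.Get("Authorization"), r.Header.Get("X-Xenovia-Key"))` and pass only `logPrefix(key)` to its logger.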
2. Provider routing
- Rewrites the request’s provider field to match the proxy’s upstream configuration.
- Two rewrite paths:
  - vLLM: OpenAI-format request + `base_url` → vLLM provider with SSRF-safe endpoint validation.
  - Cloud: OpenAI-format request → Anthropic, Gemini, Azure, or Bedrock with automatic format translation.
- SSRF protection: link-local IPs and cloud metadata endpoints are blocked in vLLM `base_url` values.
- Provider credentials are held in the proxy configuration and resolved from the control plane. Your application never needs provider API keys.
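
To make the SSRF rule concrete, here is a minimal sketch of a `base_url` check, assuming the validation works on the literal host in the URL. The function name `validateBaseURL` and the contents of the two blocklists are illustrative assumptions, not the runtime's actual lists, and the sketch does not resolve DNS names, which a production check would likely also need to do.

```go
package main

import (
	"errors"
	"net/netip"
	"net/url"
	"strings"
)

// blockedHosts and blockedAddrs are illustrative, not exhaustive.
var blockedHosts = map[string]bool{
	"metadata.google.internal": true,
}

var blockedAddrs = map[netip.Addr]bool{
	netip.MustParseAddr("fd00:ec2::254"): true, // AWS IPv6 metadata address
}

// validateBaseURL rejects URLs whose host is a link-local IP or a known
// cloud metadata endpoint. Hostnames are not resolved in this sketch.
func validateBaseURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("base_url must use http or https")
	}
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return errors.New("base_url has no host")
	}
	if blockedHosts[host] {
		return errors.New("base_url points at a metadata endpoint")
	}
	addr, err := netip.ParseAddr(host)
	if err != nil {
		return nil // not an IP literal; nothing more to check here
	}
	addr = addr.Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	if addr.IsLinkLocalUnicast() || addr.IsLinkLocalMulticast() || blockedAddrs[addr] {
		return errors.New("base_url points at a link-local or metadata address")
	}
	return nil
}
```

The common IPv4 metadata address `169.254.169.254` needs no separate entry because it falls inside the link-local range.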
3. Session
Five-strategy resolution chain (evaluated in priority order):

| Priority | Source | Mechanism |
|---|---|---|
| 1 | X-Xenovia-Session-Id header | Must be a valid UUID; validated before use |
| 2 | previous_response_id (Responses API) | Looked up via respchain:{keyHash}:{resp_id} in Redis |
| 3 | user field (Chat Completions) | Looked up via usersession:{keyHash}:{userHash} in Redis |
| 4 | Message fingerprint | SHA-256 of messages[:-1]; looked up via chain:{keyHash}:{hash} |
| 5 | New session | Generates a fresh UUID |
The session's turn counter is incremented on each request (`INCR sessionturn:{session_id}`) with a 30-minute sliding TTL. Sessions are proxy-scoped; cross-proxy session hijacking is prevented by ownership verification.
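
The resolution chain can be sketched as a single function that tries each strategy in order. Everything below is a hedged illustration: `Request`, `resolve`, and the `lookup` callback (a stand-in for a Redis `GET`) are hypothetical names, the fingerprint is assumed to be the SHA-256 of the JSON encoding of all messages except the last, `userHash` is assumed to be a SHA-256 hex digest, and an invalid session header is assumed to fall through to the next strategy rather than fail the request. Ownership verification is left out.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"regexp"
)

var uuidRe = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// Request holds only the fields the session plugin looks at.
type Request struct {
	SessionHeader      string // X-Xenovia-Session-Id
	PreviousResponseID string // Responses API
	User               string // Chat Completions
	Messages           []Message
}

func hashHex(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// fingerprint hashes every message except the last one.
func fingerprint(msgs []Message) (string, bool) {
	if len(msgs) < 2 {
		return "", false // no earlier history to match against
	}
	raw, err := json.Marshal(msgs[:len(msgs)-1])
	if err != nil {
		return "", false
	}
	return hashHex(raw), true
}

// newUUID builds a random (version 4) UUID from crypto/rand.
func newUUID() string {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		panic(err)
	}
	b[6] = b[6]&0x0f | 0x40
	b[8] = b[8]&0x3f | 0x80
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

// resolve walks the five strategies in priority order.
func resolve(r Request, keyHash string, lookup func(key string) (string, bool)) string {
	if uuidRe.MatchString(r.SessionHeader) {
		return r.SessionHeader
	}
	if r.PreviousResponseID != "" {
		if id, ok := lookup("respchain:" + keyHash + ":" + r.PreviousResponseID); ok {
			return id
		}
	}
	if r.User != "" {
		if id, ok := lookup("usersession:" + keyHash + ":" + hashHex([]byte(r.User))); ok {
			return id
		}
	}
	if fp, ok := fingerprint(r.Messages); ok {
		if id, ok := lookup("chain:" + keyHash + ":" + fp); ok {
			return id
		}
	}
	return newUUID()
}
```

Because the fingerprint ignores the final message, two requests that share the same history but differ in their newest message produce the same lookup key.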
4. Trace
- Opens a trace record with session context at `PreLLMHook`.
- Emits child trace steps: `request_received`, `llm`, `policy`, `intent`, `escalation`, `tool` (call + result), `request_finished`.
- Each step has its own trace ID linked to the parent via `parent_trace_id`.
- Custom properties from `X-Xenovia-Property-*` headers are attached to the trace (max 20 properties; keys ≤ 64 chars; values ≤ 512 chars; `policy_` prefix is reserved).
- Async persistence via a bounded goroutine pool (256 concurrent). A `sync.Once` dedup guard prevents duplicate rows under concurrent streaming hooks.
- Captures: request/response bodies (truncated), tokens, latency, TTFT, tool calls/results, session turn, policy decision, intent score.
- Optional direct ClickHouse write in addition to control plane persistence.
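
The combination of a bounded pool and a `sync.Once` guard can be sketched as follows. This is one plausible shape, not the runtime's implementation: `tracePersister` and its methods are hypothetical names, the pool is modelled as a buffered-channel semaphore, and the guard is assumed to be keyed by trace ID. The map of guards is never pruned here, which a long-running service would need to handle.

```go
package main

import "sync"

// tracePersister writes each trace at most once, with bounded concurrency.
type tracePersister struct {
	slots chan struct{} // semaphore: one token per in-flight write
	wg    sync.WaitGroup
	mu    sync.Mutex
	seen  map[string]*sync.Once
	write func(traceID string)
}

func newTracePersister(limit int, write func(traceID string)) *tracePersister {
	return &tracePersister{
		slots: make(chan struct{}, limit),
		seen:  make(map[string]*sync.Once),
		write: write,
	}
}

// onceFor returns the dedup guard for a trace ID, creating it if needed.
func (p *tracePersister) onceFor(traceID string) *sync.Once {
	p.mu.Lock()
	defer p.mu.Unlock()
	o, ok := p.seen[traceID]
	if !ok {
		o = new(sync.Once)
		p.seen[traceID] = o
	}
	return o
}

// Persist schedules a write for traceID. Repeat calls for the same ID,
// for example from concurrent streaming hooks, are ignored.
func (p *tracePersister) Persist(traceID string) {
	p.onceFor(traceID).Do(func() {
		p.slots <- struct{}{} // blocks while the pool is full
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			defer func() { <-p.slots }()
			p.write(traceID)
		}()
	})
}

// Wait blocks until all scheduled writes have finished.
func (p *tracePersister) Wait() { p.wg.Wait() }
```

With `newTracePersister(256, writeRow)`, at most 256 writes run at once, and calling `Persist` repeatedly with the same trace ID produces a single row.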

Response headers:

| Header | Value |
|---|---|
| `X-Xenovia-Session-Id` | Resolved session UUID |
| `X-Xenovia-Trace-Id` | Per-request trace UUID |

Request headers:

| Header | Description |
|---|---|
| `X-Xenovia-Property-{key}` | Custom trace property (key ≤ 64, value ≤ 512, no `policy_` prefix) |
| `X-Xenovia-Session-Path` | Hierarchical path tag for trace grouping |
| `X-Xenovia-Parent-Trace-Id` | Parent trace UUID for cross-request linkage |
| `X-Xenovia-Trace-Flow-Id` | Flow-level grouping UUID |
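
The property limits can be expressed as a small validation routine. The sketch below rests on several assumptions that are not documented here: `collectProperties` is a hypothetical name, a violation rejects the whole set (the runtime may instead drop the offending property), lengths are counted in bytes, and keys are lowercased because HTTP header names are case-insensitive.

```go
package main

import (
	"errors"
	"strings"
)

const (
	propertyPrefix = "x-xenovia-property-"
	reservedPrefix = "policy_"
	maxProperties  = 20
	maxKeyLen      = 64
	maxValueLen    = 512
)

// collectProperties extracts custom trace properties from a map of
// header name to value and enforces the documented limits.
func collectProperties(headers map[string]string) (map[string]string, error) {
	props := make(map[string]string)
	for name, value := range headers {
		lower := strings.ToLower(name)
		if !strings.HasPrefix(lower, propertyPrefix) {
			continue // not a property header
		}
		key := lower[len(propertyPrefix):]
		switch {
		case key == "" || len(key) > maxKeyLen:
			return nil, errors.New("property key must be 1 to 64 characters")
		case strings.HasPrefix(key, reservedPrefix):
			return nil, errors.New("property keys starting with policy_ are reserved")
		case len(value) > maxValueLen:
			return nil, errors.New("property value exceeds 512 characters")
		}
		props[key] = value
	}
	if len(props) > maxProperties {
		return nil, errors.New("more than 20 properties supplied")
	}
	return props, nil
}
```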
5. Policy
- Rego policies are fetched per proxy from the control plane (`GET /api/v1/internal/proxies/{id}/policies`) and cached in Redis for 5 minutes.
- Two independent policies per proxy: request-stage and response-stage.
- OPA evaluates policies with a 2-second compile timeout and a 200-millisecond eval timeout. Compile results are cached by `(agentID, policyHash)` using singleflight to prevent duplicate compiles under load.
- Optional HMAC-signed policies: stored as `v1:<hmac_hex>:<rego>` in Redis; signature verified before use.
- Response-stage policy failures fail open (logged, counted with an atomic counter).
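
Verifying the `v1:<hmac_hex>:<rego>` envelope might look like the sketch below. The hash function is an assumption (HMAC-SHA256 over the raw Rego text; the source only says HMAC), and `signPolicy` and `verifyPolicy` are hypothetical names. The key would come from `XENOVIA_POLICY_SIGNING_KEY`.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"strings"
)

// signPolicy wraps Rego source in the v1:<hmac_hex>:<rego> envelope.
func signPolicy(key []byte, rego string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(rego))
	return "v1:" + hex.EncodeToString(mac.Sum(nil)) + ":" + rego
}

// verifyPolicy checks the signature and returns the Rego source.
func verifyPolicy(key []byte, stored string) (string, error) {
	// Split into at most three parts so colons inside the Rego text survive.
	parts := strings.SplitN(stored, ":", 3)
	if len(parts) != 3 || parts[0] != "v1" {
		return "", errors.New("not a v1 signed policy")
	}
	want, err := hex.DecodeString(parts[1])
	if err != nil {
		return "", errors.New("malformed signature")
	}
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(parts[2]))
	if !hmac.Equal(want, mac.Sum(nil)) { // constant-time comparison
		return "", errors.New("signature mismatch")
	}
	return parts[2], nil
}
```

Using `hmac.Equal` instead of comparing hex strings avoids leaking timing information about the expected signature.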
6. Intent
- Intent configuration fetched per proxy: intent text, capability list, and a semantic trigger.
- Trigger axes:
  - `turn_scope`: `first_only`, `first_n`, `all` (default)
  - `on_tools`: `always` (default), `require`, `ignore`
  - `min_content_chars`: minimum message length to score
  - `sample_rate`: 0.0–1.0 for probabilistic scoring
- Scoring request sent to guardrail service (`POST {GUARDRAIL_URL}/score`) or control plane fallback, with a 15-second timeout. Payloads are truncated to 4096 characters per field.
- Actions: `allow` (pass through), `block` (403), `escalate` (403 + async operator notification).
- The block/escalate reason is never forwarded to the agent; it is logged server-side only.
- Fail mode: `XENOVIA_INTENT_FAIL_MODE=open` (default, fail open) or `closed` (503 on scoring errors).
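
How actions and the fail mode map to HTTP status codes can be sketched in one function. The name `intentStatus`, the sentinel `errScoring`, and the convention that `0` means "pass the request through" are assumptions made for this illustration; treating an unrecognised action as `allow` is also an assumption.

```go
package main

import (
	"errors"
	"net/http"
)

// errScoring is a hypothetical sentinel for a failed scoring call.
var errScoring = errors.New("intent scoring failed")

// intentStatus returns the HTTP status to send, or 0 to pass through.
func intentStatus(action string, scoreErr error, failMode string) int {
	if scoreErr != nil {
		if failMode == "closed" {
			return http.StatusServiceUnavailable // 503 on scoring errors
		}
		return 0 // fail open: let the request continue
	}
	switch action {
	case "block", "escalate":
		// escalate also triggers an async operator notification (not shown).
		return http.StatusForbidden
	default:
		return 0
	}
}
```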
Supported providers
| Provider | Format |
|---|---|
| OpenAI | Native |
| Anthropic | Auto-translated from OpenAI format |
| Google Gemini | Auto-translated from OpenAI format |
| Azure OpenAI | Auto-translated from OpenAI format |
| Amazon Bedrock | Auto-translated from OpenAI format |
| Groq | OpenAI-compatible |
| vLLM (self-hosted) | OpenAI-compatible |
Supported endpoints
| Endpoint | Use case |
|---|---|
| `POST /v1/chat/completions` | Chat, agents, tool calling |
| `POST /v1/responses` | OpenAI Agents SDK (Responses API) |
| `POST /v1/embeddings` | RAG, vector search |
| `POST /v1/completions` | Legacy text completions |
Runtime environment variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `CONTROL_PLANE_URL` | Yes | — | Control plane base URL |
| `RUNTIME_SHARED_SECRET` | Yes | — | Sent as X-Runtime-Secret on internal CP calls |
| `PORT` | No | 8080 | HTTP listen port |
| `REDIS_URL` | No | redis://localhost:6379 | Redis connection URL |
| `CLICKHOUSE_URL` | No | unset | ClickHouse for direct trace writes |
| `GUARDRAIL_URL` | No | unset | Guardrail scoring service URL |
| `GUARDRAIL_SECRET` | No | unset | Guardrail service auth token |
| `XENOVIA_INTENT_FAIL_MODE` | No | open | open or closed |
| `XENOVIA_POLICY_SIGNING_KEY` | No | unset | HMAC key for signed Rego policies |
| `XENOVIA_IDENTITY_SIGNING_KEY` | No | RUNTIME_SHARED_SECRET | HMAC key for identity blobs |
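
The table translates directly into a config loader. The sketch below is illustrative only: the `Config` struct, `loadConfig`, and `loadConfigFromEnv` are hypothetical names, and rejecting an unknown fail mode at startup is an assumption. The lookup function is passed in so the defaults can be exercised without touching the process environment.

```go
package main

import (
	"errors"
	"os"
)

// Config mirrors the environment variable table.
type Config struct {
	ControlPlaneURL     string
	RuntimeSharedSecret string
	Port                string
	RedisURL            string
	ClickHouseURL       string
	GuardrailURL        string
	GuardrailSecret     string
	IntentFailMode      string
	PolicySigningKey    string
	IdentitySigningKey  string
}

func orDefault(value, fallback string) string {
	if value == "" {
		return fallback
	}
	return value
}

// loadConfig reads variables through getenv and applies the documented defaults.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		ControlPlaneURL:     getenv("CONTROL_PLANE_URL"),
		RuntimeSharedSecret: getenv("RUNTIME_SHARED_SECRET"),
		Port:                orDefault(getenv("PORT"), "8080"),
		RedisURL:            orDefault(getenv("REDIS_URL"), "redis://localhost:6379"),
		ClickHouseURL:       getenv("CLICKHOUSE_URL"),
		GuardrailURL:        getenv("GUARDRAIL_URL"),
		GuardrailSecret:     getenv("GUARDRAIL_SECRET"),
		IntentFailMode:      orDefault(getenv("XENOVIA_INTENT_FAIL_MODE"), "open"),
		PolicySigningKey:    getenv("XENOVIA_POLICY_SIGNING_KEY"),
	}
	if cfg.ControlPlaneURL == "" || cfg.RuntimeSharedSecret == "" {
		return Config{}, errors.New("CONTROL_PLANE_URL and RUNTIME_SHARED_SECRET are required")
	}
	if cfg.IntentFailMode != "open" && cfg.IntentFailMode != "closed" {
		return Config{}, errors.New("XENOVIA_INTENT_FAIL_MODE must be open or closed")
	}
	// The identity signing key falls back to the shared secret.
	cfg.IdentitySigningKey = orDefault(getenv("XENOVIA_IDENTITY_SIGNING_KEY"), cfg.RuntimeSharedSecret)
	return cfg, nil
}

// loadConfigFromEnv reads from the real process environment.
func loadConfigFromEnv() (Config, error) { return loadConfig(os.Getenv) }
```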