Setup

pip install langchain langchain-openai
import os
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    api_key=os.environ["XENOVIA_API_KEY"],
    base_url=f"https://runtime.xenovia.io/a/{os.environ['XENOVIA_PROXY_ID']}/openai/v1"
)
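A quick way to confirm the routing is to invoke the client directly; this is an ordinary LangChain call, just sent through the proxy (the prompt below is only a placeholder):
# Single direct call through the Xenovia proxy
response = llm.invoke("Say hello in one sentence.")
print(response.content)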

Chains

Use the configured llm in any LangChain chain. Every LLM call in the chain routes through Xenovia individually — each call gets its own trace.
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}")
])

chain = prompt | llm
response = chain.invoke({"input": "What is LangChain?"})
print(response.content)
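For instance, a two-step chain that drafts and then summarizes text makes two LLM calls, so it shows up as two separate traces; the prompts below are illustrative only:
draft_prompt = ChatPromptTemplate.from_messages([
    ("user", "Write a short paragraph about {topic}.")
])
summary_prompt = ChatPromptTemplate.from_messages([
    ("user", "Summarize this in one sentence:\n\n{draft}")
])

# Two LLM calls, two traces: one for the draft, one for the summary
two_step = (
    draft_prompt
    | llm
    | (lambda msg: {"draft": msg.content})
    | summary_prompt
    | llm
)
summary = two_step.invoke({"topic": "runtime governance"}).content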

Agents with tools

Tool definitions are passed to the upstream LLM through Xenovia. The request-stage Rego policy evaluates input.tool_names before the call reaches the model — a blocked tool returns 403 before the LLM is called.
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool

@tool
def search(query: str) -> str:
    """Search the web for information."""
    return f"Results for: {query}"

# The agent prompt must include an agent_scratchpad placeholder for intermediate steps
agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
    MessagesPlaceholder("agent_scratchpad")
])

agent = create_tool_calling_agent(llm, [search], agent_prompt)
executor = AgentExecutor(agent=agent, tools=[search])
result = executor.invoke({"input": "Search for recent AI news"})
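The agent loop is not required for tool schemas to reach the proxy; binding tools to the model directly has the same effect. A minimal sketch, reusing the search tool defined above:
# The bound model includes the tool definition in every request it sends
llm_with_tools = llm.bind_tools([search])
msg = llm_with_tools.invoke("Find recent AI news")
print(msg.tool_calls)  # tool calls proposed by the model, if any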

RAG pipeline

Both LLM and embedding calls route through Xenovia. Policies and traces apply to the full pipeline, not only the final generation step.
from langchain_openai import OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

embeddings = OpenAIEmbeddings(
    api_key=os.environ["XENOVIA_API_KEY"],
    base_url=f"https://runtime.xenovia.io/a/{os.environ['XENOVIA_PROXY_ID']}/openai/v1"
)

vectorstore = InMemoryVectorStore.from_texts(
    ["Xenovia provides runtime governance for AI agents"],
    embedding=embeddings
)
retriever = vectorstore.as_retriever()

# Combine retriever with LLM
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# The RAG prompt must expose "context" and "question" variables to match the dict below
rag_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using only this context:\n\n{context}"),
    ("user", "{question}")
])

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)
answer = rag_chain.invoke("What does Xenovia provide?")
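The retriever output is passed to the prompt as a raw list of documents above; if you prefer a cleaner context string, a small formatting step is a common pattern (sketched here):
def format_docs(docs):
    # Join the retrieved document texts into one context string
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)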

Session tracking

To group all LLM calls in a multi-step chain under one session, pass X-Xenovia-Session-Id as a default header on the client.
import uuid
from langchain_openai import ChatOpenAI

session_id = str(uuid.uuid4())

llm = ChatOpenAI(
    model="gpt-4o-mini",
    api_key=os.environ["XENOVIA_API_KEY"],
    base_url=f"https://runtime.xenovia.io/a/{os.environ['XENOVIA_PROXY_ID']}/openai/v1",
    default_headers={"X-Xenovia-Session-Id": session_id}
)
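Chains built from this client share the session automatically; for example, reusing the prompt from the Chains section, both invocations below are grouped under the same session id:
session_chain = prompt | llm

# Both calls carry the same X-Xenovia-Session-Id header
session_chain.invoke({"input": "First question in the session"})
session_chain.invoke({"input": "Follow-up question in the same session"})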

Handling policy blocks

LangChain propagates the upstream 403 as an openai.PermissionDeniedError. Catch it in your chain’s error handler.
from openai import PermissionDeniedError

try:
    result = executor.invoke({"input": "Delete all records"})
except PermissionDeniedError as e:
    print(f"Blocked by policy: {e.message}")
If your LangChain workflow also uses OpenAIEmbeddings, configure that client with the same Xenovia base_url and the same session header so retrieval and generation traces stay correlated.
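For example, assuming your langchain-openai version exposes default_headers on OpenAIEmbeddings (recent versions do), the embeddings client from the RAG section becomes:
embeddings = OpenAIEmbeddings(
    api_key=os.environ["XENOVIA_API_KEY"],
    base_url=f"https://runtime.xenovia.io/a/{os.environ['XENOVIA_PROXY_ID']}/openai/v1",
    default_headers={"X-Xenovia-Session-Id": session_id}
)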