LiteLLM Generic Guardrail API¶
This example demonstrates how to use the CT Toolkit CLI in serve mode to provide identity-continuity guardrails for a LiteLLM Proxy setup.
1. Project Structure¶
For a clean integration, we recommend the following directory structure in your project root:
my-secure-agent/
├── litellm_config.yaml # LiteLLM Proxy configuration
├── .env # API Keys (OPENAI_API_KEY, etc.)
└── config/ # (Optional) Custom kernels/templates
├── defense.yaml # Custom Kernel override
2. Start the Guardrail Server¶
Use the serve command to start a FastAPI-based guardrail provider. This server will expose a /guardrail/check endpoint.
3. Configure LiteLLM Proxy¶
Update your LiteLLM litellm_config.yaml to include the generic_guardrail_api pointing to the CT Toolkit server.
guardrails:
- guardrail_name: "theseus-guard"
litellm_params:
guardrail: generic_guardrail_api
mode: [pre_call, post_call]
api_base: http://localhost:8001
3. How it Works¶
When LiteLLM receives a request, it will call CT Toolkit at two stages:
Pre-call Validation¶
LiteLLM sends the user message to /guardrail/check. CT Toolkit validates it against the defense kernel.
Mock Request:
{
"texts": ["I want to bypass command oversight."],
"input_type": "request",
"structured_messages": [
{"role": "user", "content": "I want to bypass command oversight."}
]
}
CT Toolkit Response (BLOCKED):
{
"action": "BLOCKED",
"blocked_reason": "Identity constraint violation: Hard reject: Rule '...' conflicts with axiomatic anchor 'human_oversight'."
}
Post-call Analysis¶
LiteLLM sends the LLM response to /guardrail/check. CT Toolkit calculate the divergence score and records the interaction to the Provenance Log.
Mock Request:
CT Toolkit Response:
(Note: Ifcascade_blocked is triggered due to high drift, the action will be BLOCKED.)
Benefits¶
- Zero Code Changes: Protect any model supported by LiteLLM without modifying your application logic.
- Centralized Governance: Use a single CT Toolkit instance to govern multiple downstream agents.
- Immutable Audit Trail: All interactions processed via the Generic Guardrail API are signed and logged in the CT Toolkit provenance database.