AI Guardrails for LLM safety
The TrustyAI Guardrails Orchestrator runs detectors on LLM inputs and outputs to filter or flag content. It is based on the open-source FMS-Guardrails project. The TrustyAI Operator provides the GuardrailsOrchestrator CRD to deploy and manage it.
This document covers AutoConfig deployment and the built-in regex detector only.
TOC
- Prerequisites
- Deploy with AutoConfig
- Resource status
- Built-in regex detector
- Gateway ConfigMap structure
- Service ports and API reference
  - Ports and roles
  - Authentication (auth enabled)
  - Request path reference
- API usage and examples
  - Standalone detection (/api/v1/text/contents)
  - Orchestrator API: per-request detectors (/api/v2/chat/completions-detection)
  - Gateway API: preset pipeline (/all/v1/chat/completions)

Prerequisites
- TrustyAI Operator installed (see Install TrustyAI).
- An LLM deployed as an InferenceService in the target namespace.
Deploy with AutoConfig
With AutoConfig, the operator generates the orchestrator and gateway configuration from resources in the namespace; no manual ConfigMaps are required for a basic setup.
Create a GuardrailsOrchestrator custom resource with autoConfig and the built-in detector and gateway enabled:
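A minimal CR might look like the following sketch. The API group/version and exact field nesting follow the TrustyAI GuardrailsOrchestrator CRD and may differ across operator versions; my-llm is a placeholder:

```yaml
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: GuardrailsOrchestrator
metadata:
  name: guardrails-orchestrator
spec:
  replicas: 1
  autoConfig:
    # Name of the InferenceService (LLM) in the same namespace
    inferenceServiceToGuardrail: my-llm
  enableBuiltInDetectors: true    # adds the built-in regex detector sidecar
  enableGuardrailsGateway: true   # exposes preset gateway routes
```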
- inferenceServiceToGuardrail: name of the InferenceService (LLM) to guardrail; must match the deployed model in the same namespace.
- enableBuiltInDetectors: when true, a built-in regex detector sidecar is added.
- enableGuardrailsGateway: when true, the gateway exposes preset routes (e.g. /all/v1/chat/completions).
Resource status
The resource has a status subresource. status.phase can be Progressing, Ready, or Error. When using AutoConfig, status.autoConfigState holds the generated ConfigMap names (generatedConfigMap, generatedGatewayConfigMap), the detected services, and a message. Traffic should be sent only after status.phase == Ready and the corresponding Deployment is ready.
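For illustration, the status of a Ready resource might look like this sketch (ConfigMap names assume an orchestrator named guardrails-orchestrator; the exact message text and service list vary):

```yaml
status:
  phase: Ready
  autoConfigState:
    generatedConfigMap: guardrails-orchestrator-auto-config
    generatedGatewayConfigMap: guardrails-orchestrator-gateway-auto-config
    message: AutoConfig completed successfully
```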
The operator creates an orchestrator ConfigMap and a gateway ConfigMap named <orchestrator-name>-gateway-auto-config. The built-in detector is registered as built-in-detector.
Built-in regex detector
The built-in detector provides several regex-based detection algorithms; the examples in this document use the email algorithm.
The default gateway config uses a placeholder regex ($^) that effectively matches nothing. To enable a specific algorithm (e.g. email), patch the ConfigMap and set the detector_params.regex list to the algorithm name (e.g. - email).
Gateway ConfigMap structure
ConfigMap name: <orchestrator-name>-gateway-auto-config. Example:
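The generated ConfigMap is shaped roughly like the sketch below, based on the FMS-Guardrails gateway config format; the generated content varies with the detected services, so treat field names as illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: guardrails-orchestrator-gateway-auto-config
data:
  config.yaml: |
    orchestrator:
      host: localhost
      port: 8032
    detectors:
      - name: built-in-detector
        input: true
        output: true
        detector_params:
          regex:
            - $^    # placeholder: effectively matches nothing until replaced
    routes:
      - name: all
        detectors:
          - built-in-detector
      - name: passthrough
        detectors: []
```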
Change regex under built-in-detector to the desired algorithm (e.g. - email). After updating, wait for the Deployment to be ready.
Service ports and API reference
The Guardrails Orchestrator is exposed by a Service named <orchestrator-name>-service. Port numbers depend on whether authentication is enabled (annotation security.opendatahub.io/enable-auth: "true" on the GuardrailsOrchestrator).
Ports and roles
The Service exposes three ports:
- Orchestrator API: 8032 or 8432
- Gateway: 8090 or 8490
- Built-in detector: 8080 or 8480
The port in use from each pair depends on whether auth is enabled. When auth is enabled, the gateway and built-in-detector ports require a Bearer token.
Authentication (auth enabled)
Requests to the gateway or built-in-detector ports must include:
Authorization: Bearer <token>
The token must be a valid Kubernetes ServiceAccount token (or other token accepted by the cluster auth proxy) for a subject with access to the service (e.g. services/proxy). Unauthorized requests receive 401/403.
How to obtain a token
Create a ServiceAccount, a Role (with get, create on services/proxy), and a RoleBinding in the same namespace as the Guardrails Orchestrator; then create a token for the ServiceAccount:
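For example, with placeholder names like guardrails-client (adjust the namespace to match the orchestrator):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: guardrails-client
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: guardrails-proxy-access
rules:
  - apiGroups: [""]
    resources: ["services/proxy"]
    verbs: ["get", "create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: guardrails-proxy-access
subjects:
  - kind: ServiceAccount
    name: guardrails-client
roleRef:
  kind: Role
  name: guardrails-proxy-access
  apiGroup: rbac.authorization.k8s.io
```

Apply with oc apply -f in the orchestrator's namespace, then create the token with oc create token guardrails-client.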
Optionally set token duration, e.g. --duration=8760h for one year. The last command outputs the token; set it as the Authorization: Bearer <token> header value.
Clients inside the cluster can use the projected ServiceAccount token volume as the Bearer token.
Request path reference
The main request paths are:
- /api/v1/text/contents: built-in detector port; standalone detection without calling the LLM.
- /api/v2/chat/completions-detection: orchestrator port; chat completion with per-request detectors.
- /all/v1/chat/completions: gateway port; chat completion with the preset detector pipeline.
Other gateway routes (e.g. /<preset-name>/v1/chat/completions) are defined in the gateway ConfigMap routes.
API usage and examples
Standalone detection (/api/v1/text/contents)
Run the built-in regex detector on text without calling the LLM. Use the built-in-detector port (8080 or 8480). Request body: contents (list of strings), detector_params (e.g. regex: ["email"]).
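A request body might look like this (the sample text is illustrative):

```json
{
  "contents": ["My email is test@example.com"],
  "detector_params": {
    "regex": ["email"]
  }
}
```

POST it as Content-Type: application/json to /api/v1/text/contents on the built-in-detector port, adding the Authorization: Bearer header if auth is enabled.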
Use the service address (from inside the cluster: <orchestrator-name>-service.<your-namespace>.svc.cluster.local; from outside: Ingress host if exposed) and the built-in-detector port (see Ports and roles).
Response: array (one entry per contents item) of arrays of detection objects. Each object has start, end, text, detection (e.g. EmailAddress), detection_type (e.g. pii), and score.
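As a sketch, a client can flatten this nested shape into a simple list of hits. The sample response below is illustrative, not captured from a live service:

```python
# Flatten a standalone-detection response into (input_index, text,
# detection, score) tuples. The shape follows the API description:
# one list per input string, each containing detection objects.
response = [
    [
        {
            "start": 12, "end": 28, "text": "test@example.com",
            "detection": "EmailAddress", "detection_type": "pii",
            "score": 1.0,
        }
    ],
    [],  # second input string: no detections
]

def flatten(resp):
    hits = []
    for i, detections in enumerate(resp):
        for d in detections:
            hits.append((i, d["text"], d["detection"], d["score"]))
    return hits

print(flatten(response))
```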
Response example (standalone detection, email detected)
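An illustrative response for contents ["My email is test@example.com"] with regex: ["email"]; the offsets and labels are examples, and exact values come from the detector:

```json
[
  [
    {
      "start": 12,
      "end": 28,
      "text": "test@example.com",
      "detection": "EmailAddress",
      "detection_type": "pii",
      "score": 1.0
    }
  ]
]
```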
Orchestrator API: per-request detectors (/api/v2/chat/completions-detection)
Use the orchestrator port (8032 or 8432) when the caller must choose which detectors run on each request. Request body: model, messages, and optionally detectors (e.g. input / output with detector params).
Example: run built-in detector with regex email on input and output:
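A request body sketch; my-llm is a placeholder model name, and the detector key matches the registered built-in-detector:

```json
{
  "model": "my-llm",
  "messages": [
    {"role": "user", "content": "My email is test@example.com"}
  ],
  "detectors": {
    "input": {"built-in-detector": {"regex": ["email"]}},
    "output": {"built-in-detector": {"regex": ["email"]}}
  }
}
```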
When the detector finds a match in the input (e.g. email), the response includes detections and warnings, and choices is empty:
Response example (input triggers detection)
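The following is illustrative only; the exact envelope fields follow the orchestrator's OpenAI-compatible response schema and may differ by version:

```json
{
  "object": "chat.completion",
  "model": "my-llm",
  "choices": [],
  "detections": {
    "input": [
      {
        "message_index": 0,
        "results": [
          {
            "start": 12,
            "end": 28,
            "text": "test@example.com",
            "detection": "EmailAddress",
            "detection_type": "pii",
            "score": 1.0,
            "detector_id": "built-in-detector"
          }
        ]
      }
    ]
  },
  "warnings": [
    {"type": "UNSUITABLE_INPUT", "message": "Unsuitable input detected."}
  ]
}
```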
Response shape matches gateway chat: choices, detections, warnings. Omit detectors for a plain chat completion with no detection.
Gateway API: preset pipeline (/all/v1/chat/completions)
Use the gateway port (8090 or 8490) for chat with a fixed detector pipeline (defined in the gateway ConfigMap). Request body: model, messages (OpenAI-style). For per-request detector choice, use the orchestrator API instead.
Use the service address and gateway port (see Ports and roles).
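A request body sketch (my-llm is a placeholder; the body is plain OpenAI-style chat, with no detector fields, since the pipeline is fixed by the gateway config):

```json
{
  "model": "my-llm",
  "messages": [
    {"role": "user", "content": "What is the opposite of up?"}
  ]
}
```

POST it as Content-Type: application/json to /all/v1/chat/completions on the gateway port, adding the Authorization: Bearer header if auth is enabled.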
When input/output pass: detections and warnings are null, choices contains the model reply:
Response example (input/output pass)
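An illustrative response; the reply content is invented and other standard chat-completion fields are omitted for brevity:

```json
{
  "object": "chat.completion",
  "model": "my-llm",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Down."},
      "finish_reason": "stop"
    }
  ],
  "detections": null,
  "warnings": null
}
```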
When input triggers a detection (e.g. PII): detections and warnings are set, choices is empty:
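An illustrative response; exact detection fields and warning text depend on the configured detectors and gateway version:

```json
{
  "object": "chat.completion",
  "model": "my-llm",
  "choices": [],
  "detections": {
    "input": [
      {
        "message_index": 0,
        "results": [
          {
            "start": 12,
            "end": 28,
            "text": "test@example.com",
            "detection": "EmailAddress",
            "detection_type": "pii",
            "score": 1.0,
            "detector_id": "built-in-detector"
          }
        ]
      }
    ]
  },
  "warnings": [
    {"type": "UNSUITABLE_INPUT", "message": "Unsuitable input detected."}
  ]
}
```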