POST /v1/chat/completions
Create a chat completion with guardrails applied. This endpoint is compatible with the OpenAI Chat Completions API, with additional guardrails-specific extensions.

Request
model: The LLM model to use for chat completion (e.g., “gpt-4o”, “llama-3.1-8b”).
messages: The list of messages in the current conversation.
stream: If set, partial message deltas are sent as server-sent events.
max_tokens: The maximum number of tokens to generate.
temperature: Sampling temperature to use (0.0 to 2.0).
top_p: Top-p sampling parameter (0.0 to 1.0).
stop: Stop sequences at which the API will stop generating further tokens.
presence_penalty: Presence penalty parameter (-2.0 to 2.0).
frequency_penalty: Frequency penalty parameter (-2.0 to 2.0).
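Because the endpoint is OpenAI-compatible, the standard parameters above map onto a familiar request body. A minimal sketch; all values are illustrative:

```python
import json

# Minimal OpenAI-compatible request body using the standard
# parameters documented above (illustrative values).
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this ticket."},
    ],
    "max_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.95,
    "stop": ["\n\n"],
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
    "stream": False,
}

# Serialize to JSON for the POST body.
print(json.dumps(payload, indent=2))
```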
Guardrails Extensions
Guardrails-specific options:
config_id: The guardrails configuration ID to use.
config_ids: List of configuration IDs to combine. Cannot be used with config_id.
thread_id: The ID of an existing thread to continue (minimum 16 characters).
context: Additional context data for the conversation.
options: Additional generation options:
- rails: Which rails to enable ({"input": true, "output": true})
- log: Logging options ({"activated_rails": true, "llm_calls": true})
- output_vars: Variables to extract from context
state: State object to continue the interaction. Must contain an events or state key.

Response
id: Unique identifier for the chat completion.
object: Always “chat.completion”.
created: Unix timestamp of when the completion was created.
model: The model used for the completion.
choices: Array of completion choices.
index: The index of this choice.
finish_reason: The reason generation stopped: “stop”, “length”, or “content_filter”.
Guardrails-specific output data:
config_id: The configuration ID that was used.
state: Updated state object for continuing the conversation.
log: Generation log data (if requested):
- activated_rails: List of rails that were activated
- llm_calls: Details of LLM calls made
- stats: Performance statistics
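Putting the pieces together: the sketch below builds a request that uses the guardrails extensions and reads the guardrails-related fields back out of a response shaped like the one documented above. The top-level placement of the extension fields, the guardrails_data key, and all values are assumptions for illustration; check your server's actual schema.

```python
# Sketch of a guardrails-enabled request body. Extension-field
# placement (top level vs. nested) is an assumption.
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hi, can you help me?"}],
    "config_id": "content-safety",  # guardrails configuration to apply
    "options": {
        "rails": {"input": True, "output": True},
        "log": {"activated_rails": True, "llm_calls": True},
    },
}

# A response with the fields documented above (illustrative values;
# "guardrails_data" is a hypothetical key for the guardrails output).
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1720000000,
    "model": "gpt-4o",
    "choices": [{"index": 0, "finish_reason": "stop"}],
    "guardrails_data": {
        "config_id": "content-safety",
        "log": {"activated_rails": [], "llm_calls": [], "stats": {}},
    },
}

# Detect whether a rail blocked the output via finish_reason.
choice = response["choices"][0]
if choice["finish_reason"] == "content_filter":
    print("Response was blocked by a guardrail")
else:
    print("finish_reason:", choice["finish_reason"])
```

A caller that wants to continue the same conversation would carry the returned state (or a thread_id) into the next request.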