Guardrails

The Guardrails page lets you configure content safety rules that automatically scan and filter API requests before they reach the LLM provider.

Guardrails are available on the Enterprise plan. Owner or Admin role is required.

Main Toggle

A global toggle at the top of the page enables or disables all guardrails for your organization. After changing it, click Save Changes to apply the new setting.

The page provides six built-in rules, each with its own enable/disable toggle:

Rule	Description
Prompt Injection Detection	Detects attempts to override or manipulate system instructions
Jailbreak Prevention	Identifies attempts to bypass safety measures
PII Detection	Identifies personal information like emails, phone numbers, and SSNs
Secrets Detection	Detects API keys, passwords, and credentials
File Type Restrictions	Controls which file types can be uploaded
Document Leakage Prevention	Detects attempts to extract confidential documents
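
To make the scanning idea concrete, here is a minimal sketch of how a rule such as PII Detection might be evaluated against a request, and how the global toggle and a per-rule toggle could combine. It is an illustration under stated assumptions, not the product's implementation: the names GuardrailSettings, rule_is_active, find_pii, and scan_request, the "pii_detection" key, and the regular expressions are all hypothetical, and real detection is likely far broader than three patterns.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns, for illustration only. They cover just the three
# PII examples named in the table: emails, phone numbers, and SSNs.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class GuardrailSettings:
    """Assumed shape of the settings: one global toggle plus per-rule toggles."""
    enabled: bool = True
    rules: dict = field(default_factory=lambda: {"pii_detection": True})


def rule_is_active(settings: GuardrailSettings, rule_name: str) -> bool:
    # A rule only runs when both the global toggle and its own toggle are on.
    return settings.enabled and settings.rules.get(rule_name, False)


def find_pii(text: str) -> list:
    """Return (label, matched_text) pairs for every pattern hit in the text."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group()))
    return findings


def scan_request(settings: GuardrailSettings, prompt: str) -> list:
    """Scan a prompt before it would be forwarded to the LLM provider."""
    if not rule_is_active(settings, "pii_detection"):
        return []
    return find_pii(prompt)


if __name__ == "__main__":
    prompt = "Contact jane@example.com or 555-123-4567, SSN 123-45-6789."
    for label, value in scan_request(GuardrailSettings(), prompt):
        print(label, value)
```

In this sketch, turning off either the global toggle or the rule's own toggle makes scan_request return an empty list, which mirrors the toggle behavior described on this page. What happens to a request once a finding is reported is left out deliberately, since that depends on the action configured for the rule.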

Each rule has an action dropdown that sets how the guardrail responds when the rule is triggered.

File upload limits can also be configured on this page.

To create organization-specific rules, click Add Rule.

Each custom rule can be individually enabled/disabled or deleted.

Learn more about guardrails in the Guardrails feature docs.