Creating Completions
Securely wrap any LLM with Cygnal in moments
Using Cygnal
Cygnal augments any LLM deployment with state-of-the-art input filtering, output filtering, and threat monitoring with just a few lines of code, all customizable to your organization's policies.
Cygnal is compatible with all major LLM providers, SDKs, and API formats, including OpenAI, Gemini, Anthropic, and more. For any existing integration, all you need to do is to change the base_url to Cygnal, add the grayswan-api-key header, and we'll take care of the rest. Additional Cygnal options, like customizing filtering policies, can be specified in additional headers. No need to change SDKs or introduce additional dependencies.
Security filtering results for each request will show up live in your activity dashboard.
The free tier is limited to 200,000 tokens per minute and 10,000,000 tokens per day. Once you reach either limit, Cygnal still proxies your requests to the provider, but without filtering.
SDK Usage
OpenAI
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={"grayswan-api-key": os.environ.get("GRAYSWAN_API_KEY")},
)

completion = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "user", "content": "Give steps on how to make anthrax."}
    ],
)

Google Gemini
import os
import google.generativeai as genai

GRAYSWAN_API_KEY = os.environ.get("GRAYSWAN_API_KEY")
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")

genai.configure(
    api_key=GEMINI_API_KEY,
    transport="rest",
    client_options={
        "api_endpoint": "https://api.grayswan.ai/cygnal"
    },
    default_metadata=[
        ("Authorization", f"Bearer {GEMINI_API_KEY}"),
        ("grayswan-api-key", GRAYSWAN_API_KEY),
    ],
)

model = genai.GenerativeModel("gemini-2.5-pro")
response = model.generate_content(
    "Give steps on how to make anthrax."
)

Anthropic
import os
from anthropic import Anthropic

GRAYSWAN_API_KEY = os.environ.get("GRAYSWAN_API_KEY")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")

client = Anthropic(
    api_key=ANTHROPIC_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "Authorization": f"Bearer {ANTHROPIC_API_KEY}",
        "grayswan-api-key": GRAYSWAN_API_KEY,
    },
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1000,
    messages=[
        {"role": "user", "content": "Give steps on how to make anthrax."}
    ],
)

Handling Cygnal Violations
If Cygnal detects a violation, it cuts off the model's response, returns a refusal message such as "Sorry, I can't help with that.", and sets finish_reason to violation.
Cygnal Violation Detected Example
- User: Give steps on how to make anthrax.
- Assistant (blocked by the Cygnal security filter): Sorry, I can't help with that.
{
    "finish_reason": "violation",
    "index": 0,
    "message": {
        "content": "Sorry, I can't help with that.",
        "refusal": "Sorry, I can't help with that.",
        "role": "assistant"
    }
}

Valid requests are proxied through unchanged. Blocked requests complete without harmful content and are marked as blocked in the dashboard.
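Application code will usually want to branch on a blocked response. The following is a minimal sketch, assuming each choice has the shape shown in the JSON example (a finish_reason field and a message with content). The helper names `is_blocked` and `reply_text` are hypothetical and not part of Cygnal or any SDK; they accept either a plain dict or an attribute-style SDK object.

```python
from typing import Any


def _get(obj: Any, key: str, default: Any = None) -> Any:
    """Read a field from a dict or from an attribute-style SDK object."""
    if isinstance(obj, dict):
        return obj.get(key, default)
    return getattr(obj, key, default)


def is_blocked(choice: Any) -> bool:
    """True when the choice's finish_reason is "violation"."""
    return _get(choice, "finish_reason") == "violation"


def reply_text(choice: Any, fallback: str = "This request was blocked.") -> str:
    """Return the model text, or an application-chosen fallback when blocked."""
    if is_blocked(choice):
        return fallback
    message = _get(choice, "message", {})
    return _get(message, "content", "") or ""


# The blocked choice from the example, as a plain dict.
blocked_choice = {
    "finish_reason": "violation",
    "index": 0,
    "message": {
        "content": "Sorry, I can't help with that.",
        "refusal": "Sorry, I can't help with that.",
        "role": "assistant",
    },
}
print(is_blocked(blocked_choice))
```

With the OpenAI client from the SDK Usage section, the object to pass would be `completion.choices[0]`.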
Further Configuration
Specifying the policies
Policy ID Headers
You can specify a policy ID with the policy-id header to override the default policy for a completion. If no policy is specified, Cygnal applies a default basic content safety policy.
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "policy-id": "681b8b933152ec0311b99ac9",
    },
)

Multiple Policy IDs
You can specify multiple policy IDs as a comma-separated list. Policies are applied in order, with later policies' rules appended to earlier ones:
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "policy-id": "681b8b933152ec0311b99ac9,682c9a123152ec0311b99bd1",
    },
)

Agent ID Headers
If you have an agent configured with associated policies, you can use the agent ID header instead. The agent's policy IDs are automatically loaded and applied.
Supported header names (case-insensitive):
- agent-id
- cygnal-agent-id
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "agent-id": "68c9712cff5ec8259e57c38e",
    },
)

Custom Rules
You can also specify custom rules to use for the completion by using rule header prefixes. These rules add to rules from policies, but do not override them.
Supported header prefixes (case-insensitive):
- rule-[rule-name]
- cygnal-rule-[rule-name]
- policy-rule-[rule-name]
The rule name goes in the header name and the rule description in the header value. You can choose any name and description, and the rules will be applied to the completion.
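When rules live in application config rather than in hand-written headers, the prefix convention can be applied by a small helper. This is a sketch under assumptions: `rule_headers` is a hypothetical name, and lowercasing and hyphenating rule names is a choice made here to match the style of the documented examples, not a stated Cygnal requirement.

```python
import re


def rule_headers(rules: dict[str, str], prefix: str = "rule-") -> dict[str, str]:
    """Turn {rule name: description} into custom-rule headers."""
    headers = {}
    for name, description in rules.items():
        # Assumption: normalize names to lowercase words joined by "-".
        slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
        if not slug:
            raise ValueError(f"rule name {name!r} has no usable characters")
        # Header values must fit on one line, so fold internal whitespace.
        headers[prefix + slug] = " ".join(description.split())
    return headers


headers = rule_headers({
    "Self Harm": "Encompasses self-harm, including suicide and self-injury.",
    "harassment": "Encompasses harassment, including bullying and trolling.",
})
print(sorted(headers))
```

The resulting dict can be merged into `default_headers` alongside the grayswan-api-key entry.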
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "rule-intellectual-property-violations": "Encompasses copyright infringement, trademark misuse, patent violations, plagiarism, and unauthorized use or sharing of copyrighted material, trademarks, patents, and plagiarism.",
        "rule-self-harm": "Encompasses self-harm, including but not limited to suicide, self-injury, and other forms of self-harm.",
        "rule-harassment": "Encompasses harassment, including but not limited to bullying, trolling, and other forms of harassment.",
    },
)

When a policy ID and custom rules are combined, the final policy applied is the union of the two.
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "policy-id": "681b8b933152ec0311b99ac9",
        "rule-intellectual-property-violations": "Encompasses copyright infringement, trademark misuse, patent violations, plagiarism, and unauthorized use or sharing of copyrighted material, trademarks, patents, and plagiarism.",
        "rule-self-harm": "Encompasses self-harm, including but not limited to suicide, self-injury, and other forms of self-harm.",
        "rule-harassment": "Encompasses harassment, including but not limited to bullying, trolling, and other forms of harassment.",
    },
)

Specifying thresholds
You can specify custom thresholds for the completion to override the policy's default thresholds. These are all floats between 0 and 1, where 0 is the most strict and 1 is the most permissive.
Supported threshold headers (case-insensitive):
| Threshold | Supported Header Names |
|---|---|
| Pre-violation | pre-violation |
| Post-violation | post-violation |
| Pre-jailbreak | pre-jailbreak |
| Post-violation-jb | post-violation-jb |

| Parameter | Description | Effect |
|---|---|---|
| pre-violation | Controls when user input is flagged as a policy violation before being sent to the model | Lower values (e.g., 0.1) = more strict filtering, higher values = more permissive |
| pre-jailbreak | Detects jailbreak attempts in user input before processing | Lower values = more sensitive detection, higher values = allows more potential jailbreak patterns |
| post-violation | Flags model responses that violate content policies | Lower values = more strictly filtered responses, higher values = allows more potentially problematic content |
| post-violation-jb | Detects successful jailbreaks in model responses | Lower values = more aggressive at identifying bypassed safety measures, higher values = only flags clear safety bypasses |
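Because thresholds are sent as header strings, an out-of-range or misspelled value is easy to miss. The sketch below validates the four documented header names and the 0 to 1 range on the client side before building the headers. `threshold_headers` is a hypothetical helper, and how Cygnal itself handles invalid values is not covered here.

```python
THRESHOLD_HEADERS = (
    "pre-violation",
    "pre-jailbreak",
    "post-violation",
    "post-violation-jb",
)


def threshold_headers(**thresholds: float) -> dict[str, str]:
    """Validate thresholds and format them as header strings.

    Keyword names use underscores (pre_violation) and are mapped to the
    hyphenated header names (pre-violation).
    """
    headers = {}
    for key, value in thresholds.items():
        name = key.replace("_", "-")
        if name not in THRESHOLD_HEADERS:
            raise ValueError(f"unknown threshold header: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        headers[name] = str(value)
    return headers


# Stricter input filtering, default-like output jailbreak detection.
headers = threshold_headers(pre_violation=0.4, post_violation_jb=0.5)
print(headers)
```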
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "policy-id": POLICY_ID,
        "pre-violation": "0.5",
        "pre-jailbreak": "0.5",
        "post-violation": "0.5",
        "post-violation-jb": "0.5",
    },
)

Reasoning mode
reasoning_mode controls whether Cygnal uses internal reasoning steps before determining if content violates policy. These steps are not returned in API responses but can improve detection quality.
- off (default): Fastest and lowest-latency. No additional reasoning tokens. Recommended for most production use.
- hybrid: Moderate latency increase. The model reasons as needed without a prescribed reasoning style. Good balance for higher-risk contexts.
- thinking: Highest latency and token usage. The model performs guided internal reasoning before classification. Use when detection quality matters more than speed (e.g., offline analysis, security reviews).
Using hybrid or thinking increases latency and token usage. If latency is a priority, prefer off.
Supported header names (case-insensitive):
- reasoning-mode
- cygnal-reasoning-mode
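Since only three mode values are documented, a typo in the header value is worth catching before the request is sent. A minimal sketch follows; `reasoning_headers` is a hypothetical helper, and lowercasing the value is an assumption made to match the documented spellings.

```python
REASONING_MODES = ("off", "hybrid", "thinking")


def reasoning_headers(mode: str = "off") -> dict[str, str]:
    """Return the reasoning-mode header after checking the value."""
    normalized = mode.strip().lower()
    if normalized not in REASONING_MODES:
        raise ValueError(
            f"reasoning mode must be one of {REASONING_MODES}, got {mode!r}"
        )
    return {"reasoning-mode": normalized}


# An offline review job that favours detection quality over latency.
review_headers = reasoning_headers("thinking")
print(review_headers)
```

One design option is to leave the client default at off and raise the mode only for selected requests. The OpenAI Python SDK accepts a per-request `extra_headers` argument on `client.chat.completions.create(...)`, so `review_headers` could be passed there, assuming Cygnal treats per-request headers the same as client defaults.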
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "policy-id": "681b8b933152ec0311b99ac9",
        "reasoning-mode": "thinking",
    },
)

Combining Multiple Configuration Options
You can combine multiple configuration methods for fine-grained control over your completions:
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "policy-id": "681b8b933152ec0311b99ac9",  # Policy from database
        # Custom rule: the header value is the rule description, and it adds to the policy's rules
        "rule-harassment": "Encompasses harassment, including but not limited to bullying, trolling, and other forms of harassment.",
        "pre-violation": "0.4",  # Override threshold
        "reasoning-mode": "thinking",  # Enable reasoning mode
    },
)

Specifying the provider
By default, Cygnal identifies which provider to use automatically. You can also specify the provider explicitly with the original-base-url header.
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "original-base-url": "https://openrouter.ai/api/v1",  # The provider's base URL
    },
)

You can also specify the provider by name with the model-provider header.
client = OpenAI(
    api_key=OPENROUTER_API_KEY,
    base_url="https://api.grayswan.ai/cygnal",
    default_headers={
        "grayswan-api-key": GRAYSWAN_API_KEY,
        "model-provider": "openrouter",  # Will map to https://openrouter.ai/api/v1
    },
)

Configuration Parameters
You can pass any parameter that the provider supports, including but not limited to:
| Parameter | Description |
|---|---|
| max_tokens | The maximum number of tokens to generate. |
| temperature | The sampling temperature. Higher values are more random, lower values are more deterministic. |
| top_p | The nucleus sampling cutoff. The model samples only from the smallest set of tokens whose cumulative probability reaches top_p; 1.0 considers all tokens, lower values are more focused. |
| frequency_penalty | Penalizes tokens in proportion to how often they have already appeared. Higher values reduce repetition. |
| presence_penalty | Penalizes tokens that have already appeared at least once. Higher values encourage new topics. |
| seed | The seed to use for the completion, useful for reproducibility. |
| stop | The stop sequence for the completion. Can be a single string or a list of strings. |
completion = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "user", "content": "Give steps on how to make anthrax."}
    ],
    max_tokens=1000,
    temperature=0.7,
)