Point your existing OpenAI or Anthropic client at Arc Gate to get real-time behavioral monitoring, drift detection, and injection blocking. No model weights required. One URL change.
Arc Gate sits between your application and the OpenAI or Anthropic API. It applies three detection layers, and each layer catches what the others miss.
Catches explicit injection language before the request reaches the upstream API. Covers 35+ patterns, including DAN mode, persona hijacks, instruction overrides, and authority claims.
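As a rough illustration of how phrase-level blocking of this kind can work, here is a minimal sketch using regular expressions. The four patterns and the `find_injection` helper are hypothetical examples chosen for this sketch; they are not Arc Gate's actual rule set.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not Arc Gate's real rules.
INJECTION_PATTERNS = [
    re.compile(r"\bDAN mode\b", re.IGNORECASE),                                         # DAN mode
    re.compile(r"\bignore (all |any )?(previous|prior) instructions\b", re.IGNORECASE),  # instruction override
    re.compile(r"\byou are now\b", re.IGNORECASE),                                      # persona hijack
    re.compile(r"\bas your (developer|administrator)\b", re.IGNORECASE),                # authority claim
]


def find_injection(prompt: str) -> Optional[str]:
    """Return the source of the first matching pattern, or None if nothing matches."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            return pattern.pattern
    return None
```

A proxy built this way would reject the request when `find_injection` returns a pattern and forward it otherwise.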
Analyzes the logprob distribution of each response. Measures Fisher-Rao distance from the deployment baseline. Catches behavioral drift even when the text looks normal.
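For reference, the Fisher-Rao distance between two categorical distributions p and q has the closed form 2·arccos(Σ √(pᵢqᵢ)). The sketch below computes it in plain Python. It assumes both distributions have already been aligned over the same set of tokens; how Arc Gate builds its baseline and aligns token sets is not described here, and the helper names are illustrative.

```python
import math


def normalize_logprobs(logprobs):
    """Turn a list of log-probabilities into probabilities that sum to 1."""
    probs = [math.exp(lp) for lp in logprobs]
    total = sum(probs)
    return [p / total for p in probs]


def fisher_rao_distance(p, q):
    """Fisher-Rao geodesic distance between two categorical distributions.

    Assumes p and q are defined over the same support, in the same order.
    """
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))  # Bhattacharyya coefficient
    bc = min(1.0, max(0.0, bc))  # clamp against floating-point overshoot
    return 2.0 * math.acos(bc)
```

The distance is 0 for identical distributions and reaches its maximum of π when the two distributions share no probability mass.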
Tracks a stability scalar over the full session. Catches gradual manipulation campaigns that look innocent turn by turn — the same mechanism that caught Crescendo in whitebox testing.
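The definition of the stability scalar is not given here, so the following is only a sketch of the general idea, using a one-sided CUSUM (cumulative sum) detector as a stand-in. The `SessionMonitor` class, the `slack` and `threshold` values, and the per-turn drift scores are all assumptions made for illustration.

```python
class SessionMonitor:
    """One-sided CUSUM over per-turn drift scores (a stand-in, not Arc Gate's D(t))."""

    def __init__(self, slack=0.2, threshold=0.5):
        self.slack = slack          # drift per turn that is treated as normal
        self.threshold = threshold  # accumulated excess that triggers an alert
        self.value = 0.0

    def update(self, turn_score):
        # Accumulate only the excess above the slack; never drop below zero.
        self.value = max(0.0, self.value + turn_score - self.slack)
        return self.value

    def is_unstable(self):
        return self.value > self.threshold
```

With these example settings, a session whose turns each score 0.3 adds about 0.1 per turn and crosses the threshold after several turns, even though no single turn does, while a session that stays at or below the slack never accumulates anything.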
Replace your OpenAI base URL with your Arc Gate endpoint. Your API key, your model, your prompts — nothing else changes.

    import openai

    client = openai.OpenAI(
        api_key="sk-...",
        base_url="https://web-production-6e47f.up.railway.app/v1"  # ← only change
    )

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": user_input}]
    )
    # Requests are monitored and blocked automatically.

Arc Sentry and Arc Gate solve the same problem from different positions.
| | Arc Sentry | Arc Gate |
|---|---|---|
| Model access | Whitebox — needs weights | Blackbox — API only |
| Works with | Mistral 7B, Qwen 2.5 7B, Llama 3.1 8B | GPT-4, Claude, Gemini, any OpenAI-compatible API |
| Detection layer | Residual stream (pre-generate) | Logprob distribution (post-generate) |
| Session monitor | ✓ D(t) scalar | ✓ D(t) scalar |
| Phrase blocking | ✓ | ✓ |
| Dashboard | — (library only) | ✓ hosted |
| Setup | pip install + calibrate | One URL change |
Arc Gate analyzes the response distribution; unlike Arc Sentry, it cannot read the model's internal state. Highly subtle single-turn attacks may pass the geometric layer and be caught only by phrase matching. The session D(t) monitor becomes more accurate after 10+ requests, once it has built a deployment baseline. For whitebox detection on open-source models, use Arc Sentry.
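To make the warm-up behavior concrete, here is a minimal sketch of a baseline that averages observed distributions and reports itself ready only after a minimum number of requests. The `DeploymentBaseline` class is hypothetical, and the default of 10 simply mirrors the "10+ requests" figure above; it is not a documented Arc Gate parameter.

```python
class DeploymentBaseline:
    """Running mean of observed distributions, with a warm-up gate (illustrative only)."""

    def __init__(self, min_requests=10):
        self.min_requests = min_requests
        self.count = 0
        self.mean = None

    def observe(self, dist):
        """Fold one distribution (same support and order each time) into the running mean."""
        self.count += 1
        if self.mean is None:
            self.mean = list(dist)
        else:
            self.mean = [m + (d - m) / self.count for m, d in zip(self.mean, dist)]

    def is_ready(self):
        """True once enough requests have been seen to trust the baseline."""
        return self.count >= self.min_requests
```

Until `is_ready()` returns true, a monitor built this way would treat distance-from-baseline scores as provisional.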
500 free requests per month — enough to see it work in your deployment. Upgrade when you need more.
Send any prompt. Normal messages pass through. Injection attempts are blocked in real time.