Core Concepts
What is Delimiter?#
Delimiter is Sentry for AI apps. It monitors everything that can take your AI app offline — rate limit exhaustion, credit depletion, provider outages. Two lines of code, zero maintenance — Delimiter automatically detects every AI provider your app talks to.
How it works#
Delimiter prevents two types of chat blackouts:
- Rate limit exhaustion (429 errors) — monitored via SDK response headers, real-time
- Credit depletion (account suspended) — monitored via provider billing APIs, polled every 15–30 min
SDK flow:
Your App → AI Provider → Response → Delimiter reads headers → Dashboard updates
When you call delimiter.init(), the SDK hooks into Node's HTTP layer. Every outbound request is checked against known AI provider domains (api.openai.com, api.anthropic.com, etc.). When a match is found, Delimiter reads rate-limit and credit headers from the response.
Provider Connections: Connect your provider accounts in the dashboard to monitor credit balances directly from billing APIs. Delimiter never stores your credentials — authentication is handled via a secure credential broker.
Reporting is async and fire-and-forget. Your application code runs exactly as it did before, with zero overhead in the critical path.
This means Delimiter works with any AI SDK, any framework (LangChain, Vercel AI SDK, LiteLLM), or raw fetch() calls. If it makes an HTTP request to an AI API, Delimiter sees it.
What Delimiter never does#
- Never reads, stores, or transmits your API keys. The SDK only reads response headers (rate-limit numbers), not request headers (where API keys live). Your keys stay in your environment.
- Never modifies requests or responses. Delimiter is a passive observer at the network layer. Your API calls are untouched.
- Never adds latency. Header extraction and reporting happen asynchronously, after the response is already returned to your code.
- Never fails loudly. If the Delimiter backend is unreachable, the SDK silently continues. Your application is never affected.
Supported providers#
Delimiter works at the network layer — any AI provider that returns rate-limit headers over HTTP is automatically supported.
| Provider | Domain | Status |
| --- | --- | --- |
| OpenAI | api.openai.com | Auto-detected |
| Anthropic | api.anthropic.com | Auto-detected |
| Google Gemini | generativelanguage.googleapis.com | Auto-detected |
| Mistral | api.mistral.ai | Auto-detected |
| Cohere | api.cohere.com | Auto-detected |
| Groq | api.groq.com | Auto-detected |
| DeepSeek | api.deepseek.com | Auto-detected |
| xAI | api.x.ai | Auto-detected |
| Perplexity | api.perplexity.ai | Auto-detected |
| Together AI | api.together.xyz | Auto-detected |
| Fireworks AI | api.fireworks.ai | Auto-detected |
| Replicate | api.replicate.com | Auto-detected |
| Azure OpenAI | *.openai.azure.com | Auto-detected |
| Amazon Bedrock | bedrock-runtime.*.amazonaws.com | Auto-detected |
| OpenRouter | openrouter.ai | Auto-detected |
New providers are added to the SDK's detection list over time. When a new provider is added, your existing delimiter.init() call picks it up automatically — no code changes.
Two tiers of monitoring#
| | SDK (Passive) | Provider Connections |
|---|---|---|
| What it monitors | Rate limits (requests, tokens) + credit headers | Credit balances, spend |
| How it works | Reads HTTP response headers | Polls provider billing APIs securely |
| Setup | delimiter.init('key') — 2 lines | Click "Connect" in dashboard |
| Credentials | Never sees your API keys | You authenticate directly with provider |
| Data freshness | Real-time (every API call) | Polled (15–30 min intervals) |
Architecture#
Delimiter has three parts:
- SDK (npm package) — hooks into the HTTP layer at startup, detects AI provider API calls by domain, extracts rate-limit and credit headers from responses, and reports them asynchronously.
- Dashboard (web app) — receives reports from the SDK, aggregates the data, and displays your rate limit and credit usage in real-time.
- Provider Connections — optional billing API integration for spend monitoring on providers that don't expose cost data in response headers.
