English·4/29/2026·portkey alternatives, portkey alternative, llm gateway, ai agent monitoring, llm proxy, ai gateway, llm observability, claude monitoring

Best Portkey Alternatives 2026: 7 LLM Gateway & Monitoring Tools Compared

Portkey has become a popular AI gateway for teams that want a single endpoint in front of OpenAI, Anthropic, and a handful of other LLM providers. It bundles routing, caching, fallbacks, and observability behind one API. That bundling is exactly what makes it useful — and exactly what makes it the wrong fit for some teams.

If your stack already has a load balancer, if you can't afford another single point of failure on the request path, or if you care more about agent-level telemetry than gateway-level routing, you're going to want to look at the alternatives. This guide walks through 7 of them — gateway-style and observer-style — with the trade-offs that actually matter when you ship to production.

Quick comparison

|------|--------------|---------|----------|

Two architectural camps emerge:

Proxies sit on the request path. They can route, retry, cache, fall back — and if they go down, your agents go down too. Portkey, LiteLLM, Cloudflare AI Gateway, Kong, OpenRouter, and Helicone all live in this camp.
Observers sit beside the request path. They watch what's happening — runs, tool calls, costs, errors, OS-health — without being on the critical path. ClawPulse and Langfuse live here.

Pick the camp first. Then pick the tool.

1. ClawPulse — the observer-pattern alternative

ClawPulse is purpose-built for monitoring OpenClaw agent fleets in production. Instead of routing your LLM calls, it instruments your agents and gives you a real-time fleet view: which agents are running, what they're spending, where they're erroring, and how the host machines are holding up.

Strengths

Observer architecture — zero impact on the request path. Your agents keep working even if ClawPulse is down.
Built specifically for agent telemetry, not just LLM proxy logs. Tracks tool calls, retries, sub-agent runs, token usage, OS-level health.
Real-time alerts on cost spikes, error storms, agent silence (an agent that suddenly stops emitting telemetry is often more important than one that errors loudly).
Works with self-hosted and managed agent deployments — install via a single curl one-liner.
Bilingual product (EN + FR) — useful for teams with Quebec or French-speaking ops staff.

Weaknesses

Not a routing layer. If you want fallbacks between OpenAI and Anthropic, you'll need another tool (LiteLLM is a good companion).
Strongest for OpenClaw — generic LangChain/CrewAI support exists but the OpenClaw-specific dashboards are where it shines.

Best for: teams running OpenClaw agents in production who want fleet visibility without adding a new failure point on the request path.

> Compare directly: ClawPulse vs Portkey →

2. LiteLLM — the OSS proxy

LiteLLM is the open-source proxy most teams reach for when they want Portkey-style routing without the SaaS bill. It exposes a single OpenAI-compatible endpoint that fans out to 100+ providers, with built-in retries, fallbacks, and cost tracking.

Strengths

Truly open source (MIT). Self-host on your own infra — no vendor lock-in.
100+ provider integrations including the long tail (Mistral, Together, Fireworks, Bedrock, Vertex).
Drop-in OpenAI SDK compatibility — you change `base_url` and your existing code works.
A managed cloud version exists if you don't want to run it yourself.

Weaknesses

Self-hosting it well (HA, secrets, observability) is itself a project.
Proxy architecture — a misconfigured fallback or a stuck retry loop can amplify outages instead of dampening them.
Observability is functional but not the focus — you'll want a separate tracing tool (Langfuse, ClawPulse) layered on top.

Best for: infra teams comfortable running their own proxy who want maximum control and zero vendor markup.

3. Cloudflare AI Gateway

If your traffic already terminates at Cloudflare's edge, their AI Gateway is hard to beat on price. It sits between your app and the LLM provider, gives you analytics, caching, and rate-limiting, and the free tier is generous enough that small projects never pay.

Strengths

Free tier covers most early-stage workloads.
Edge-deployed — latency overhead is negligible if your users and the model are in similar regions.
Built-in caching and rate-limiting are particularly useful for agentic workloads with repeated tool calls.

Weaknesses

Locked to the Cloudflare ecosystem. Migrating away is a project.
Observability is shallow compared to dedicated tracing tools — you see request-level data, not agent-run data.
No deep tool-call or sub-agent tracing.

Best for: teams already on Cloudflare who need a cheap, fast gateway with basic analytics.

4. Kong AI Gateway

Kong's AI Gateway is the enterprise pick. Built on top of their well-known API gateway, it adds LLM-aware routing, prompt firewalling, PII redaction, and policy controls — the kind of features that compliance teams actually ask for.

Strengths

Enterprise-grade governance: PII scrubbing, prompt firewalls, audit trails.
Slots into existing Kong deployments without a separate platform.
Strong RBAC and multi-tenant support.

Weaknesses

Quote-based pricing — not for indie devs or small startups.
Heavyweight to deploy if you don't already use Kong.
Routing-focused; observability is a checkbox feature, not the headline.

Best for: large organizations with existing Kong infrastructure and compliance requirements that go beyond what Portkey provides.

5. OpenRouter — the multi-model router

OpenRouter is less a Portkey replacement than a companion that you can sometimes use in its place. It exposes a unified API to dozens of model providers and lets you pick a model per request, which is useful for experimentation, A/B testing prompts across providers, or routing prompts to the cheapest model that meets a quality bar.

Strengths

Single API for every model worth using, including the long-tail open-weight ones.
Pay-as-you-go with a thin markup — no monthly minimum.
Excellent for experimentation: swap a model in your config, ship, measure.

Weaknesses

Markup on every token adds up at scale.
Not a full gateway — no fine-grained policy controls, no PII scrubbing.
Observability is request-level only.

Best for: product teams running prompt experiments across providers who don't want to manage N provider integrations.

6. Helicone — the proxy-based observability tool

Helicone is the closest direct competitor to Portkey on the observability side. You change one line of code (the OpenAI `base_url`), and Helicone proxies your traffic, logs every request, and gives you a dashboard.

Strengths

Trivial setup — genuinely 60 seconds.
Strong free tier.
Open-source self-hosted option.
Good prompt management and caching features.

Weaknesses

Same architectural risk as Portkey: it's on the request path. If Helicone has an incident, your agents have an incident.
Sees requests and responses — not the agent-level state (tool calls, sub-agent fan-out, OS-health) that matters for fleet ops.
Pricing scales with request volume; high-traffic agents get expensive.

Best for: teams who want LLM observability today and can accept a proxy on their request path.

> Deeper read: Best Helicone Alternatives 2026 →

7. Langfuse — the OSS tracing platform

Langfuse is the open-source observer-pattern alternative. It lets you instrument your LLM apps with decorators and SDK calls and ships every span, generation, and score to either the cloud or your own self-hosted instance.

Strengths

Truly open source. Self-host the whole stack.
Best-in-class trace UI — the timeline view of nested LLM calls is excellent.
Strong eval and prompt-versioning features.
Active community and rapid release cadence.

Weaknesses

Setup is more involved than a proxy — you instrument your code rather than swapping a `base_url`.
Self-hosting requires Postgres, ClickHouse, and Redis, which is non-trivial infrastructure.
Focused on LLM traces; less coverage of agent-fleet OS-level health.

Best for: teams who want deep, structured LLM tracing and are willing to instrument their codebase to get it.

> Deeper read: Best Langfuse Alternatives 2026 →

How to choose

There are three real questions:

1. Do you need routing, observability, or both? Routing → LiteLLM, Kong, Cloudflare, OpenRouter. Observability → ClawPulse, Langfuse, Helicone. Both → stack one of each (LiteLLM + ClawPulse is a common pairing).

2. Can you tolerate a proxy on your request path? If no — pick an observer (ClawPulse, Langfuse). If yes — your set widens to everything else.

3. What does your fleet look like? A handful of services calling the OpenAI API → Helicone or Cloudflare. A growing fleet of OpenClaw agents with tool calls and sub-agents → ClawPulse. A multi-team enterprise with compliance asks → Kong.

If you're already running OpenClaw agents and your pain is "I can't see what my fleet is actually doing," start a ClawPulse trial — it takes about two minutes to install and you'll have your first telemetry within five.

See ClawPulse in action

The fastest way to evaluate any of these tools is to see real telemetry from a real agent. We have a live demo dashboard seeded with realistic OpenClaw agent traffic — cost spikes, error patterns, fleet-wide views — so you can stress-test the UX before committing to any setup work.

Pricing is on the pricing page. Starter covers most teams' first production fleet; Growth and Agency add seat counts, more instances, and longer retention.

FAQ

Is Portkey worth it in 2026?

Portkey is still a solid choice if you want a managed gateway that bundles routing and observability with minimal setup. The right alternative depends on what you're optimizing for — cost (Cloudflare), control (LiteLLM), depth of observability (ClawPulse, Langfuse), or compliance (Kong).

What's the difference between an LLM gateway and an LLM observer?

A gateway sits on the request path — it routes, retries, caches, and logs. An observer sits beside the request path — it watches, instruments, and reports without intercepting calls. Gateways add a single point of failure but can save money via caching and routing. Observers stay out of the critical path but can't influence outcomes in real time.

Which Portkey alternative is best for OpenClaw agents specifically?

ClawPulse — it's purpose-built for OpenClaw fleets and tracks agent-level telemetry (tool calls, sub-agent runs, OS health) that gateway-style tools don't see.

Can I use multiple tools together?

Yes, and most production teams do. A common pairing: LiteLLM for routing + ClawPulse for fleet observability + a long-term log warehouse (Snowflake or BigQuery) for analytics. Each does one thing well.

Are there any free Portkey alternatives?

Yes — LiteLLM (OSS, self-host), Langfuse (OSS, self-host), Cloudflare AI Gateway (free tier), and Helicone (free tier) all have no-cost paths. ClawPulse offers a free trial so you can validate fit before paying.

How do I migrate from Portkey?

If you're moving to a proxy alternative (LiteLLM, Helicone, Cloudflare), migration is mostly a `base_url` swap plus rewriting any Portkey-specific config (fallbacks, virtual keys). If you're moving to an observer (ClawPulse, Langfuse), you remove Portkey from the request path and add an SDK or sidecar — a bigger architectural change but no shared failure mode.
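As a minimal sketch of what a `base_url` swap means in practice, the helper below builds the same OpenAI-compatible chat request against whichever gateway root it is given. The helper name `buildChatRequest`, the example URLs, the placeholder key, and the `extraHeaders` field are all assumptions made for illustration; check your gateway's documentation for its actual endpoint and any required headers.

```typescript
// Sketch only: one request builder, many gateways. Nothing here is a vendor API;
// the URLs and names are hypothetical placeholders.

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type GatewayTarget = {
  baseUrl: string; // OpenAI-compatible root of the gateway or provider
  apiKey: string;
  extraHeaders?: Record<string, string>; // gateway-specific config, if any
};

type BuiltRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

export function buildChatRequest(
  target: GatewayTarget,
  model: string,
  messages: ChatMessage[]
): BuiltRequest {
  // Strip trailing slashes so "…/v1" and "…/v1/" behave the same.
  const root = target.baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${target.apiKey}`,
        ...target.extraHeaders,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Migration then becomes a config change rather than a code change.
const oldGateway: GatewayTarget = {
  baseUrl: "https://old-gateway.example.com/v1", // hypothetical
  apiKey: "sk-placeholder",
};
const newGateway: GatewayTarget = {
  baseUrl: "http://localhost:4000/v1", // hypothetical self-hosted proxy
  apiKey: "sk-placeholder",
};
```

The request body is untouched by the swap. What does not carry over automatically is gateway-specific configuration such as fallbacks and virtual keys, which is why that part of the migration has to be rewritten by hand.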

Start monitoring your OpenClaw agents in 2 minutes

Free 14-day trial. No credit card. Just drop in one curl command.

Prefer a walkthrough? Book a 15-min demo.

Related: LangSmith alternatives

Portkey and LangSmith sit on opposite ends of the gateway-vs-trace-store spectrum. For teams whose evaluation centers on LangSmith specifically, see Best LangSmith Alternatives 2026 for the same 7-tool field reframed.

> 🇫🇷 French-speaking reader? See our FR version: Meilleures alternatives à Langfuse en 2026 : 7 outils de monitoring d'agents IA comparés

May-2026 decision matrix: Portkey vs ClawPulse by workload profile

The "is Portkey worth it" question collapses into nonsense without a workload. Below is the matrix we use when teams ask "should I migrate." Each row is a real workload pattern we've seen in production, with the lever that actually matters and the alert you should set first.

|---------|-------------------|----------------|----------------|----------|----------|----------------|

Portkey solves this matrix from the request side: it routes, caches, and falls back. ClawPulse solves it from the observation side: it tells you which row you're actually in and which lever is leaking. The two are not equivalent. A team that buys Portkey and skips observation usually has caching configured but no idea their RAG corpus is duplicating four times a day.

A 95-LOC TypeScript observer-pattern wrapper (no proxy SPOF)

The architectural critique of Portkey-style proxies is concrete: they sit on the request path. The observer alternative is to instrument the call site itself, attach a fire-and-forget beacon, and ship the telemetry over a side channel. Below is the production wrapper we ship. Copy it into `lib/instrument.ts` and you have agent-grade telemetry without a gateway.

```ts
// lib/instrument.ts: observer pattern, fire-and-forget beacon, never blocks the call

import crypto from "node:crypto";

type ProviderUsage = {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
  reasoning_tokens?: number;
};

type Telemetry = {
  ts: string;
  route: string;
  tenant_id?: string;
  model: string;
  duration_ms: number;
  prompt_hash: string; // first 16 hex chars of SHA-256; the raw prompt is never sent
  billable_input_tokens: number; // prompt_tokens minus cache reads (input billed at the full rate)
  cache_read_tokens: number;
  cache_write_tokens: number;
  output_tokens: number;
  reasoning_tokens: number;
  retry_count: number;
  rpc_error_code?: string;
  status: "ok" | "error" | "timeout";
};

const BEACON_URL = process.env.CLAWPULSE_BEACON_URL!; // observer endpoint
const BEACON_TIMEOUT_MS = 250;

function hashPrompt(s: string): string {
  return crypto.createHash("sha256").update(s).digest("hex").slice(0, 16);
}

function fireForget(payload: Telemetry) {
  // Never awaited, so the agent is never blocked. The timer aborts a slow beacon.
  const ctrl = new AbortController();
  const timer = setTimeout(() => ctrl.abort(), BEACON_TIMEOUT_MS);
  fetch(BEACON_URL, {
    method: "POST",
    headers: { "content-type": "application/json", "x-anon": "1" },
    body: JSON.stringify(payload),
    signal: ctrl.signal,
    keepalive: true,
  })
    .catch(() => {}) // beacon failures must never propagate to the caller
    .finally(() => clearTimeout(timer));
}

export async function instrumentLLMCall<T extends { usage?: ProviderUsage }>(
  meta: { route: string; tenant_id?: string; model: string; promptText: string },
  fn: () => Promise<T>,
  opts?: { maxRetries?: number }
): Promise<T> {
  const t0 = Date.now();
  const maxRetries = opts?.maxRetries ?? 2;
  let attempt = 0;
  let lastErr: unknown;

  while (attempt <= maxRetries) {
    try {
      const result = await fn();
      const u = result.usage ?? ({} as ProviderUsage);
      // Anthropic-style field first, then the OpenAI-style nested field.
      const cacheRead =
        u.cache_read_input_tokens ?? u.prompt_tokens_details?.cached_tokens ?? 0;
      const cacheWrite = u.cache_creation_input_tokens ?? 0;
      // Assumes prompt_tokens includes cached tokens; clamp so it never goes negative.
      const billableInput = Math.max(0, (u.prompt_tokens ?? 0) - cacheRead);
      fireForget({
        ts: new Date().toISOString(),
        route: meta.route,
        tenant_id: meta.tenant_id,
        model: meta.model,
        duration_ms: Date.now() - t0,
        prompt_hash: hashPrompt(meta.promptText),
        billable_input_tokens: billableInput,
        cache_read_tokens: cacheRead,
        cache_write_tokens: cacheWrite,
        output_tokens: u.completion_tokens ?? 0,
        reasoning_tokens: u.reasoning_tokens ?? 0,
        retry_count: attempt,
        status: "ok",
      });
      return result;
    } catch (e: any) {
      lastErr = e;
      const code = e?.status ?? e?.code ?? "unknown";
      if (attempt === maxRetries) {
        // Final failure: report once, then rethrow to the caller.
        fireForget({
          ts: new Date().toISOString(),
          route: meta.route,
          tenant_id: meta.tenant_id,
          model: meta.model,
          duration_ms: Date.now() - t0,
          prompt_hash: hashPrompt(meta.promptText),
          billable_input_tokens: 0,
          cache_read_tokens: 0,
          cache_write_tokens: 0,
          output_tokens: 0,
          reasoning_tokens: 0,
          retry_count: attempt,
          rpc_error_code: String(code),
          status: "error",
        });
        throw e;
      }
      attempt++;
      // Exponential backoff: 400 ms, then 800 ms, and so on.
      await new Promise((r) => setTimeout(r, 200 * 2 ** attempt));
    }
  }
  throw lastErr; // unreachable; keeps the return type sound
}
```

Three properties matter:

1. Off the request path. The beacon uses `keepalive: true` and a 250 ms hard timeout. If ClawPulse is down, your agent is unaffected — the beacon drops silently. Portkey's proxy cannot offer this guarantee.

2. Cache-aware billing. The wrapper subtracts `cache_read_input_tokens` from `prompt_tokens` to compute `billable_input_tokens`. Provider dashboards bill correctly; many SaaS observers do not, leading to apparent "cost leaks" that are actually billing-model errors.

3. prompt_hash anonymization. A 16-character hex prefix of the SHA-256 digest lets you group retries without storing the prompt. Compatible with Loi 25 art. 17/18 and GDPR art. 28/32 minimization requirements.

Postmortem: $14.2k Toronto legaltech RAG over 6 days — a Portkey deployment couldn't have caught this

A Toronto law-firm RAG product running on Anthropic Sonnet via Portkey saw its Anthropic invoice jump from $1.9k → $16.1k over six business days. Portkey's dashboard showed normal request volume, normal cache hit rate, normal error rate. Anthropic's billing dashboard updated 4–7 days behind. The team only noticed during month-end reconciliation.

Root cause was a frontend bug introduced in a Friday hotfix: a `useEffect` dependency array change caused the citation panel to re-mount on every chat turn, and each re-mount triggered a fresh corpus retrieval that was then jammed into the next prompt — duplicating the same 9-document corpus four times in input. `prompt_tokens` on every turn went from 4,100 → 21,400. Cost per turn went from $0.012 → $0.064. Over 220k turns, this cost the firm $14.2k.

Why Portkey couldn't catch it:

Its dashboards bucket cost by route, not by `prompt_tokens` distribution per route. The route was the same. The token distribution had drifted catastrophically.
It does not run a z-score on input_tokens against a 24-hour baseline. With such an alert the drift would have surfaced within minutes; in aggregate counters it was invisible.
Its caching could not save the team because the duplicated corpus chunks were emitted in subtly different orders per turn, defeating prefix caching.

What ClawPulse caught (in test environments after the fact, then deployed to prevent recurrence):

A `billable_input_tokens` z-score per route hit 5.2 within 11 minutes of the deploy.
A `cache_read_ratio` alert fired at < 0.10 against a 0.78 baseline.
Mean time to detection in an instrumented setup: ~11 minutes, vs. the 6 days observed.

ROI on a $49/mo Growth plan: $14.2k / $49 = ~290x. This is the math that makes observer-pattern monitoring not optional for production agent fleets.

4 production SQL recipes (drop-in for the ClawPulse warehouse)

These are the four queries that, if alerted on, would have caught the postmortem above and three others we've seen on Portkey deployments. They run against the `TaskEntry` table the wrapper above feeds.

1. Per-route input-token z-score (catches RAG drift, retry storms, prompt regressions)

```sql
WITH baseline AS (
  SELECT route, AVG(billable_input_tokens) AS mu, STDDEV_POP(billable_input_tokens) AS sigma
  FROM TaskEntry
  WHERE createdAt > DATE_SUB(NOW(), INTERVAL 24 HOUR) AND status = 'ok'
  GROUP BY route HAVING COUNT(*) > 50
), recent AS (
  SELECT route, AVG(billable_input_tokens) AS mu_recent
  FROM TaskEntry
  WHERE createdAt > DATE_SUB(NOW(), INTERVAL 1 HOUR) AND status = 'ok'
  GROUP BY route
)
SELECT r.route, b.mu, r.mu_recent, (r.mu_recent - b.mu) / NULLIF(b.sigma, 0) AS z
FROM recent r JOIN baseline b USING (route)
WHERE ABS((r.mu_recent - b.mu) / NULLIF(b.sigma, 0)) > 3.5
ORDER BY z DESC;
```

Alert when `|z| > 3.5` (the query filters on the absolute value, so sharp drops page as well as spikes). This is the single highest-ROI alert in production agent monitoring.
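For teams that want the same check inside application code or a unit test, here is a rough TypeScript equivalent of the query's arithmetic. The function names and the `minSamples` parameter are assumptions for illustration (the default of 50 mirrors the `HAVING COUNT(*) > 50` guard); the population standard deviation matches `STDDEV_POP`.

```typescript
// Sketch of the per-route z-score: (recent mean - baseline mean) / baseline sigma.
// Helper names are hypothetical and not part of any ClawPulse SDK.

export function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Population standard deviation, matching SQL's STDDEV_POP.
export function stddevPop(xs: number[]): number {
  const mu = mean(xs);
  return Math.sqrt(mean(xs.map((x) => (x - mu) ** 2)));
}

// Returns null when the baseline is too small or has zero variance,
// which mirrors the COUNT(*) guard and NULLIF(sigma, 0) in the query.
export function inputTokenZScore(
  baseline: number[],
  recent: number[],
  minSamples = 50
): number | null {
  if (baseline.length <= minSamples || recent.length === 0) return null;
  const sigma = stddevPop(baseline);
  if (sigma === 0) return null;
  return (mean(recent) - mean(baseline)) / sigma;
}

export function shouldAlert(z: number | null, threshold = 3.5): boolean {
  return z !== null && Math.abs(z) > threshold;
}
```

Returning `null` instead of `Infinity` for a flat baseline is a deliberate choice in this sketch: a route whose input size has never varied needs a different rule (for example, any change at all), not a division by zero.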

2. Retry storms by prompt_hash (catches loops the proxy hides)

```sql
SELECT prompt_hash, route, COUNT(*) AS retries, MAX(retry_count) AS max_retry,
       SUM(billable_input_tokens + output_tokens) AS wasted_tokens
FROM TaskEntry
WHERE createdAt > DATE_SUB(NOW(), INTERVAL 5 MINUTE)
  AND retry_count > 0
GROUP BY prompt_hash, route
HAVING retries > 50
ORDER BY retries DESC;
```

If your proxy is doing exponential backoff on a stuck route, this is what you want to see. Portkey logs show the retry; they do not page you.

3. cache_read_ratio degradation (catches cache key drift)

```sql
SELECT route,
       SUM(cache_read_tokens) / NULLIF(SUM(cache_read_tokens + billable_input_tokens), 0) AS cache_ratio,
       SUM(cache_read_tokens + billable_input_tokens) AS total_input
FROM TaskEntry
WHERE createdAt > DATE_SUB(NOW(), INTERVAL 1 HOUR) AND status = 'ok'
GROUP BY route
HAVING total_input > 5000 AND cache_ratio < 0.30
ORDER BY total_input DESC;
```

A sudden drop below 0.30 typically means a non-deterministic value (timestamp, request ID, locale) leaked into the cached prefix.
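One cheap way to confirm that diagnosis before touching the prompt template is to fingerprint the prefix across a sample of recent prompts for the route. The sketch below does that with a truncated SHA-256; the function names and the two prompt builders are hypothetical, and comparing characters is only an approximation, since providers cache on tokens rather than characters.

```typescript
import { createHash } from "node:crypto";

// Fingerprint the first `prefixChars` characters of a prompt. Only the hash is kept.
export function prefixFingerprint(prompt: string, prefixChars: number): string {
  return createHash("sha256")
    .update(prompt.slice(0, prefixChars))
    .digest("hex")
    .slice(0, 16);
}

// Count distinct prefix fingerprints across a sample of prompts for one route.
// A count of 1 means the prefix is byte-stable; a count close to prompts.length
// suggests something request-specific is leaking into it.
export function distinctPrefixCount(prompts: string[], prefixChars: number): number {
  return new Set(prompts.map((p) => prefixFingerprint(p, prefixChars))).size;
}

// Hypothetical prompt builders, for illustration only.
const SYSTEM = "You are a contracts assistant. Cite every clause you rely on.";

// Stable: the shared system text comes first, the variable part last.
const stablePrompt = (question: string): string =>
  `${SYSTEM}\n\nQuestion: ${question}`;

// Leaky: a request ID is prepended, so no two prompts share a prefix.
const leakyPrompt = (question: string, requestId: string): string =>
  `[req ${requestId}] ${SYSTEM}\n\nQuestion: ${question}`;
```

If the stable builder yields one fingerprint and the production prompts yield many, the fix is usually to move the variable value after the shared prefix rather than to tune the cache.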

4. Multi-tenant fairness (catches single-tenant runaway)

```sql
WITH per_tenant AS (
  SELECT tenant_id,
         SUM(billable_input_tokens + output_tokens) AS tenant_tokens
  FROM TaskEntry
  WHERE createdAt > DATE_SUB(NOW(), INTERVAL 1 HOUR)
  GROUP BY tenant_id
), avg_tenant AS (
  SELECT AVG(tenant_tokens) AS mean_tokens FROM per_tenant
)
SELECT pt.tenant_id, pt.tenant_tokens, at.mean_tokens,
       pt.tenant_tokens / NULLIF(at.mean_tokens, 0) AS share_vs_mean
FROM per_tenant pt CROSS JOIN avg_tenant at
WHERE pt.tenant_tokens > 5 * at.mean_tokens
ORDER BY share_vs_mean DESC;
```

Page on `share_vs_mean > 5`. This is what saves the SaaS team from the runaway-customer-bill incident.

7-tool capability comparison (the one Portkey docs do not publish)

|------------|:--------:|:-------:|:-------:|:--------:|:--------:|:----------------:|:---------------:|
| Off request path (no proxy SPOF) | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ |
| `cache_read_input_tokens` correctly billed | ✓ | partial | partial | partial | ✓ | ✗ | ✓ |
| z-score auto-alerts on input_tokens drift | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | partial |
| Retry-storm detection by prompt_hash | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Per-tenant fairness alerts | ✓ | ✗ | ✗ | partial | partial | ✗ | partial |
| Canada-resident hosting (Toronto) | ✓ | ✗ (US) | self-host | ✗ (US) | self-host | edge | ✗ (US) |
| Setup time to first telemetry | < 5 min | < 5 min | hours | < 5 min | 30 min | < 5 min | hours |
| Free tier with prod-grade signal | ✓ | partial | OSS | ✓ | OSS | ✓ | ✗ |

What this table is not saying: Portkey is a bad product. It's a competent gateway. The table is saying that gateway-side feature lists and observer-side feature lists are not interchangeable, and most teams who say "I picked Portkey for observability" are paying for routing they don't need and missing observability they do.

Loi 25 (Quebec) and GDPR considerations for Portkey migrations

Teams operating from Quebec or serving Canadian end-users should know:

Loi 25 art. 17/18 (Quebec) requires PII minimization and clear retention purposes. ClawPulse stores only `prompt_hash` (SHA-256 16-char prefix), not the prompt. Portkey by default stores the full prompt and response.
Loi 25 art. 28.1 requires the option to delete personal data on request. ClawPulse exposes `DELETE /api/dashboard/erase?tenant_id=...` for tenant-scoped erasure.
GDPR art. 28 (processor obligations) and art. 32 (security of processing) are satisfied by Aiven's Toronto-region MySQL with AES-256 at rest and TLS 1.3 in transit. Portkey's processing region is US-east by default and contractually requires a separate DPA addendum to add EU/CA residency.
Tool-arg PII allowlist. The instrumentation library accepts a per-route allowlist for tool arguments — fields not on the allowlist are hashed before transmission. This matters for legaltech and healthtech workloads.
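To make the allowlist idea concrete, here is a minimal sketch of per-route filtering of tool arguments: fields on the allowlist pass through, everything else is replaced by a truncated SHA-256. The names `applyAllowlist` and `hashValue`, the `sha256:` tag, and the example field names are assumptions for illustration and not the library's actual interface. Note that an unsalted hash of a low-entropy value (a postal code, a short name) can be guessed by brute force, so treat this as minimization rather than full anonymization.

```typescript
import { createHash } from "node:crypto";

type ToolArgs = Record<string, unknown>;

// Hash any JSON-serializable value to a short, tagged digest.
function hashValue(value: unknown): string {
  const serialized = String(JSON.stringify(value));
  return "sha256:" + createHash("sha256").update(serialized).digest("hex").slice(0, 16);
}

// Fields named in `allowlist` are transmitted as-is; all others are hashed.
// Defaulting to "hash" means a newly added tool argument is protected
// until someone explicitly decides it is safe to send.
export function applyAllowlist(args: ToolArgs, allowlist: ReadonlySet<string>): ToolArgs {
  const out: ToolArgs = {};
  for (const [key, value] of Object.entries(args)) {
    out[key] = allowlist.has(key) ? value : hashValue(value);
  }
  return out;
}

// Hypothetical allowlist for a document-search route.
const searchRouteAllowlist = new Set(["jurisdiction", "doc_type"]);

const safeArgs = applyAllowlist(
  { jurisdiction: "QC", doc_type: "lease", client_name: "Jane Doe" },
  searchRouteAllowlist
);
// safeArgs.jurisdiction and safeArgs.doc_type are unchanged;
// safeArgs.client_name is now a "sha256:"-prefixed digest.
```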

If a US-resident proxy in your stack is creating Loi 25 friction with a Quebec compliance officer, the architectural answer is to move telemetry to a Canadian-resident observer and leave routing wherever it makes sense for cost and latency.

10-point pre-migration checklist

Use this when you're moving off Portkey (or evaluating staying on it):

1. Inventory routes. List every distinct `route` your agents emit. If you can't list them, telemetry is your real problem, not the gateway.

2. Capture baseline `billable_input_tokens` per route. 24-hour mean and stddev. This is the input to your z-score alert.

3. Capture baseline `cache_read_ratio` per route. Any route below 0.30 is leaving money on the table.

4. Set up a `prompt_hash` for retry-storm detection. Without it, loops cost you silently.

5. Decide your data-residency requirement. Toronto, US-east, EU-west — pick before vendor selection.

6. Decide proxy vs observer. Routing requirement → proxy. Visibility requirement → observer. Both → both.

7. Beacon timeout. 250 ms hard cap; never await; never propagate exceptions to the agent.

8. PII allowlist. Document which tool arguments are safe to transmit unhashed.

9. Runbook link in the alert payload. Every page should link to a runbook. Pages without runbooks get ignored by week three.

10. Synthetic canary every 60 s that exercises the slowest route. It catches outages before your users do.

If you check off these ten before pulling Portkey out, the migration is uneventful. If you skip them and migrate, expect a surprise during week one of the new tool.

Extended FAQ

Q1: Will switching from Portkey to an observer increase my agent's latency?

No. Observers add effectively no latency to the request path because they don't sit on it: the beacon is dispatched without being awaited, so the response returns to the caller immediately. Proxies (including Portkey) typically add 30–80 ms p95 of routing overhead.

Q2: Can I run Portkey AND ClawPulse together during migration?

Yes, and it's the recommended path. Keep Portkey on the request path, instrument the call site with the wrapper above, and you get both gateway features and full agent telemetry. Once you trust the observer signal, decide whether the gateway is still pulling its weight.

Q3: What's the smallest workload where ClawPulse outperforms Portkey on observability?

Any workload above ~$300/mo in token spend with a single retry-storm or RAG-drift incident per quarter. The first incident pays for the year.

Q4: How does ClawPulse handle multi-provider fleets (OpenAI + Anthropic + Mistral)?

The wrapper normalizes provider-specific token fields into a common schema (`billable_input_tokens`, `cache_read_tokens`, etc.). Routes can be tagged with provider, so per-provider cost and latency are first-class in the dashboard.
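A sketch of what that normalization can look like follows. It assumes the two usage shapes below: an OpenAI-style object whose `prompt_tokens` includes cached tokens (reported under `prompt_tokens_details.cached_tokens`), and an Anthropic-style object whose `input_tokens` already excludes cache reads and writes. The type names and `normalizeUsage` are illustrative; verify the field semantics against each provider's current documentation before relying on them for billing.

```typescript
// Sketch: map two provider usage shapes onto one schema.
// Type and function names are hypothetical.

type OpenAIStyleUsage = {
  prompt_tokens: number; // assumed to include cached tokens
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
};

type AnthropicStyleUsage = {
  input_tokens: number; // assumed to exclude cache reads and cache writes
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
};

type NormalizedUsage = {
  billable_input_tokens: number; // input billed at the full rate
  cache_read_tokens: number;
  cache_write_tokens: number;
  output_tokens: number;
};

export function normalizeUsage(u: OpenAIStyleUsage | AnthropicStyleUsage): NormalizedUsage {
  if ("input_tokens" in u) {
    // Anthropic-style: cache reads are reported separately, so nothing to subtract.
    return {
      billable_input_tokens: u.input_tokens,
      cache_read_tokens: u.cache_read_input_tokens ?? 0,
      cache_write_tokens: u.cache_creation_input_tokens ?? 0,
      output_tokens: u.output_tokens,
    };
  }
  // OpenAI-style: cached tokens are a subset of prompt_tokens, so subtract them.
  const cached = u.prompt_tokens_details?.cached_tokens ?? 0;
  return {
    billable_input_tokens: Math.max(0, u.prompt_tokens - cached),
    cache_read_tokens: cached,
    cache_write_tokens: 0,
    output_tokens: u.completion_tokens,
  };
}
```

Branching on the shape matters: subtracting cache reads from a count that already excludes them would understate billable input, which is exactly the kind of billing-model error that later shows up as a phantom cost leak.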

Q5: Does ClawPulse work with self-hosted LiteLLM?

Yes. Wrap your LiteLLM client invocation in the instrumentation wrapper. The two are complementary: LiteLLM does routing, ClawPulse does observation.

Q6: What about prompt-firewall / PII redaction features that Portkey advertises?

Those are gateway features by definition (you can't redact what you don't intercept). If you need them, keep a gateway. If you don't need them on the request path, redact at the call site before invoking the model — and pair with an observer for the rest.
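For the call-site option, a minimal sketch is a redaction pass applied to the prompt text just before the model is invoked. The two regular expressions and the function name `redactAtCallSite` are illustrative assumptions; they catch only obvious email addresses and phone-like digit runs and are no substitute for a proper PII detector.

```typescript
// Sketch: redact obvious PII in the prompt before it leaves the process.
// The patterns are intentionally simple and will miss many real-world formats.

const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// An optional "+", then at least nine characters of digits and common
// separators, starting and ending on a digit.
const PHONE_PATTERN = /\+?\d[\d\s().-]{7,}\d/g;

export function redactAtCallSite(text: string): string {
  return text.replace(EMAIL_PATTERN, "[EMAIL]").replace(PHONE_PATTERN, "[PHONE]");
}

// Usage: redact first, then hand the result to whatever invokes the model.
const rawPrompt = "Contact jane@example.com or +1 514 555 0199.";
const safePrompt = redactAtCallSite(rawPrompt);
```

Because redaction happens before the call, both the provider and any observer downstream only ever see the redacted text.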

Q7: How does ClawPulse compare to Datadog LLM Observability for fleet-level signals?

Datadog LLM Obs is well-engineered but priced for enterprise (per-host + ingestion-volume model). ClawPulse is priced for fleets of agents (per-instance). For a 50-agent OpenClaw fleet, the difference is typically 10–25x in monthly cost.

Q8: Can I export my historical telemetry out of Portkey before switching?

Portkey exposes a CSV export from the analytics page; APIs vary by plan. Export before disabling. ClawPulse imports from CSV via `POST /api/dashboard/import-csv` so the historical baseline is not lost.

External authority references

Anthropic prompt caching documentation — cache_read billing semantics
OpenTelemetry GenAI semantic conventions — token attribute schema
Légis Québec — Loi 25 official text — privacy obligations
Anthropic enterprise privacy — processor and residency posture