Log calls, track versions, compare models, and optimize cost — all in three lines of code.
Latency, tokens, model, cost — captured automatically for every request. No agent frameworks required.
Tag every call with a version string. See cost, latency, and token counts broken down per version.
Spend per prompt, per model, and projected monthly cost — updated live. Catch a prompt going rogue before your bill does.
Track p50/p95 latency per prompt and version. See the histogram, catch regressions, and get alerted when p95 goes red.
p50 420ms · p95 1.2s
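The p50/p95 figures above can be computed with a simple nearest-rank percentile over recorded latencies. This is an illustrative sketch, not the Promptive implementation; the sample data is made up.

```typescript
// Nearest-rank percentile over latency samples (ms).
// Illustrative only — not the actual Promptive internals.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Hypothetical latencies for one prompt version:
const latencies = [380, 410, 420, 450, 470, 520, 610, 800, 950, 1200];
const p50 = percentile(latencies, 50); // 470
const p95 = percentile(latencies, 95); // 1200
```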
Run the same prompt on Haiku, Sonnet, and Opus simultaneously. Compare output, latency, and cost side by side before you ship.
Automatic model downgrade suggestions, caching opportunities, and token reduction tips — with confidence ratings and savings estimates.
Cost spikes and slow calls are flagged automatically. Know within seconds when something breaks in production.
Define test cases and run them against two models or prompt versions. Catch regressions with contains, regex, JSON validity, and LLM-as-judge scoring.
Set a monthly call limit and logging pauses when you hit it — your app keeps running, you just stop being tracked. No overages. Ever.
Works with any LLM provider — OpenAI, Anthropic, or any OpenAI-compatible API.
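The budget-cap behavior described above can be sketched as a counter that stops logging once the monthly limit is hit, while the underlying LLM call always runs. This is hypothetical logic to illustrate the guarantee, not the SDK's actual internals.

```typescript
// Sketch of a monthly call budget: logging pauses at the limit,
// but the wrapped call itself is never blocked.
class CallBudget {
  private used = 0;
  constructor(private limit: number) {}

  async track<T>(call: () => Promise<T>, log: (ms: number) => void): Promise<T> {
    const shouldLog = this.used < this.limit;
    if (shouldLog) this.used++;
    const start = Date.now();
    const result = await call(); // the app keeps running either way
    if (shouldLog) log(Date.now() - start);
    return result;
  }
}
```

With a limit of 2, a third tracked call still returns its result; only the log entry is skipped.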
Install the SDK
npm install @promptive/sdk
Initialize with your project key
import { Promptive } from "@promptive/sdk";
const promptive = new Promptive({ projectKey: process.env.PROMPTIVE_KEY });

Wrap your LLM call
const result = await promptive.wrap({
promptId: "summarizer",
version: "v2.0",
call: () => anthropic.messages.create({ model, messages }),
});

Tag every LLM call with a prompt ID and version string. Promptive groups metrics by version automatically — side-by-side latency, cost, p95, and token counts. No custom tooling.
Version Analysis · summarizer
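Conceptually, per-version grouping is an aggregation over logged calls keyed by version string. The log shape below is a hypothetical illustration, not Promptive's real schema.

```typescript
// Hypothetical log entry shape — for illustration only.
interface LogEntry {
  promptId: string;
  version: string;
  costUsd: number;
  latencyMs: number;
  tokens: number;
}

// Aggregate calls, cost, latency, and tokens per version.
function byVersion(logs: LogEntry[]) {
  const groups = new Map<
    string,
    { calls: number; costUsd: number; totalLatencyMs: number; tokens: number }
  >();
  for (const l of logs) {
    const g = groups.get(l.version) ?? { calls: 0, costUsd: 0, totalLatencyMs: 0, tokens: 0 };
    g.calls++;
    g.costUsd += l.costUsd;
    g.totalLatencyMs += l.latencyMs;
    g.tokens += l.tokens;
    groups.set(l.version, g);
  }
  return groups;
}
```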
Run any prompt on Haiku, Sonnet, and Opus simultaneously. See outputs, latency, and cost side by side — so the choice is data, not instinct.
Model Comparison · 3 models
Haiku 4.5 (Fastest · Cheapest) · Latency 1.2s · Cost $0.0012
Sonnet 4.6 · Latency 4.1s · Cost $0.0098
Opus 4.6 · Latency 8.7s · Cost $0.082
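Running one prompt against several models at once is, at its core, a concurrent fan-out with per-call timing. The sketch below illustrates the idea; `callModel` is a hypothetical stand-in for a real provider client, not part of the Promptive API.

```typescript
// Fan the same prompt out to several models concurrently, timing each call.
// `callModel` is a placeholder for an actual provider client.
async function compareModels(
  models: string[],
  callModel: (model: string) => Promise<string>,
) {
  return Promise.all(
    models.map(async (model) => {
      const start = Date.now();
      const output = await callModel(model);
      return { model, output, latencyMs: Date.now() - start };
    }),
  );
}
```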
Promptive analyzes your usage and surfaces actionable suggestions: model downgrades, token reduction opportunities, and caching wins — each with a confidence rating based on your actual call history.
Cost Optimization
⚡ Switch to Haiku 4.5 · 4 calls on Sonnet 4 · 3–5× faster · $0.11–$0.17/month
Model Usage: → Haiku 4.5 saves $0.16/mo
Type a question — "What's my most expensive prompt?" or "Show me slow calls from last week" — and Trace converts it to SQL, runs it against your real log data, and answers in seconds. Grounded in your actual data, not hallucinated.
Trace · Natural Language Queries
summarizer is your most expensive prompt — $0.42 spent on 312 calls this month.
Avg cost/call: $0.0013 · Model: claude-sonnet-4-6
Switching to Haiku could save ~$0.34/mo.
Define test cases once, run them against two models or prompt versions side by side. Every result shows pass/fail, latency, tokens, and cost — so you know exactly which model is worth the trade-off.
Evals · Regression Testing
Extract tenant name from lease
Is this a fixed-term agreement?
Validate JSON output format
Summarize key clauses
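Three of the four scoring modes named above (contains, regex, JSON validity) are deterministic string checks; LLM-as-judge additionally needs a model call. The sketch below illustrates those deterministic checks only, and is not the Promptive eval API.

```typescript
// Deterministic eval checks — illustrative, not the actual eval API.
type Check =
  | { kind: "contains"; value: string }
  | { kind: "regex"; pattern: RegExp }
  | { kind: "json" };

function passes(output: string, check: Check): boolean {
  switch (check.kind) {
    case "contains":
      return output.includes(check.value);
    case "regex":
      return check.pattern.test(output);
    case "json":
      try {
        JSON.parse(output);
        return true;
      } catch {
        return false;
      }
  }
}
```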
Every logged call can be replayed directly in the Promptive playground. Edit the prompt, swap the model, and compare outputs — without touching your production code.
Playground · summarizer v1.0.0
system
You are a concise summarizer. Return 3 bullet points.
user
Summarize this lease agreement: The tenant agrees to pay…
assistant · 4.2s · $0.008
• Monthly rent: $2,400 due on the 1st
• 12-month term, no early exit
• Pets allowed with $500 deposit