Stemma
LLM observability for developers

Full visibility into every LLM call

Log calls, track versions, compare models, and optimize cost — all in three lines of code.

Log every LLM call

Latency, tokens, model, cost — captured automatically for every request. No agent frameworks required.

summarizer · 423ms · $0.0008
Free

Prompt version tracking

Tag every call with a version string. See cost, latency, and token counts broken down per version.

v1.0
13.4s
v2.0
4.2s ✓
Builder

Real-time cost tracking

Spend per prompt, per model, and projected monthly cost — updated live. Catch a prompt going rogue before your bill does.

$0.42 today · proj. $12/mo
Free
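
One simple way a projected monthly figure can be derived is a run-rate extrapolation: divide month-to-date spend by the days elapsed, then multiply by the days in the month. The sketch below assumes that approach; `projectMonthlySpend` is a hypothetical helper, not part of the Stemma SDK, and the product may project differently.

```typescript
// Hypothetical run-rate projection; not Stemma's actual formula.
// Extrapolates month-to-date spend linearly to the end of the (UTC) month.
function projectMonthlySpend(spendToDateUsd: number, now: Date): number {
  const year = now.getUTCFullYear();
  const month = now.getUTCMonth();
  // Day 0 of the next month is the last day of the current month.
  const daysInMonth = new Date(Date.UTC(year, month + 1, 0)).getUTCDate();
  const startOfMonthMs = Date.UTC(year, month, 1);
  const elapsedDays = (now.getTime() - startOfMonthMs) / (24 * 60 * 60 * 1000);
  if (elapsedDays <= 0) return 0; // nothing to extrapolate from yet
  return (spendToDateUsd / elapsedDays) * daysInMonth;
}
```

A linear run rate is easy to reason about but overreacts early in the month, when a single busy day dominates the average.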

Latency monitoring

Track p50/p95 latency per prompt and version. See the histogram, catch regressions, and get alerted when p95 goes red.

p50

420ms

p95

1.2s

Free

Model comparison

Run the same prompt on Haiku, Sonnet, and Opus simultaneously. Compare output, latency, and cost side by side before you ship.

Haiku · $0.001
Sonnet · $0.010
Opus · $0.082
Builder

Cost intelligence

Automatic model downgrade suggestions, caching opportunities, and token reduction tips — with confidence ratings and savings estimates.

Switch to Haiku → save $14–18/mo
Builder

Anomaly detection

Cost spikes and slow calls are flagged automatically. Know within seconds when something breaks in production.

Cost spike · 3.2× avg
Free
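
A flag like "3.2× avg" suggests a threshold against a running baseline. The sketch below shows one common way to do that, assuming a simple running mean; `findCostSpikes`, its default threshold, and its minimum-history parameter are illustrative choices, not Stemma's actual detector.

```typescript
// Hypothetical spike detector; thresholds and names are assumptions.
// Flags a call when its cost exceeds `threshold` times the running mean
// of earlier non-spike calls. Returns the indexes of flagged calls.
function findCostSpikes(costs: number[], threshold = 3, minHistory = 5): number[] {
  const spikes: number[] = [];
  let baselineSum = 0;
  let baselineCount = 0;
  costs.forEach((cost, index) => {
    const hasBaseline = baselineCount >= minHistory;
    if (hasBaseline && cost > threshold * (baselineSum / baselineCount)) {
      spikes.push(index);
      return; // keep spikes out of the baseline so they don't mask later ones
    }
    baselineSum += cost;
    baselineCount += 1;
  });
  return spikes;
}
```

Requiring a minimum history before flagging avoids false alarms on the first few calls, when the average is not yet meaningful.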

Hard cost caps

Set a monthly call limit and logging pauses when you hit it — your app keeps running, you just stop being tracked. No overages. Ever.

7,423 calls used · 10,000 cap
All plans
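
The key property described here is that hitting the cap never affects the application: logging stops, the call itself does not. A minimal sketch of that behavior, assuming an in-memory counter; `CappedLogger` is a hypothetical class introduced for illustration, not the SDK's internals.

```typescript
// Hypothetical capped logger; illustrates "pause logging, never fail the app".
class CappedLogger<T> {
  private used = 0;
  readonly entries: T[] = [];

  constructor(private readonly monthlyCap: number) {}

  // Returns true if the entry was recorded, false if it was dropped
  // because the cap was reached. Never throws.
  log(entry: T): boolean {
    if (this.used >= this.monthlyCap) return false;
    this.used += 1;
    this.entries.push(entry);
    return true;
  }

  remaining(): number {
    return Math.max(0, this.monthlyCap - this.used);
  }
}
```

Returning a boolean instead of throwing keeps the cap invisible to the calling code path, which matches the "your app keeps running" behavior described above.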

Up and running in 60 seconds

Works with any LLM provider — OpenAI, Anthropic, or any OpenAI-compatible API.

1

Install the SDK

npm install @stemma/sdk

2

Initialize with your project key

import { Stemma } from "@stemma/sdk";
const stemma = new Stemma({ projectKey: process.env.STEMMA_KEY });

3

Wrap your LLM call

const result = await stemma.wrap({
  promptId: "summarizer",
  version:  "v2.0",
  call: () => anthropic.messages.create({ model, max_tokens: 1024, messages }),
});
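
To make the wrapping pattern concrete, here is a minimal sketch of what a timing wrapper of this kind could do: run the call, pass its result through untouched, and record latency whether it succeeds or fails. This is not Stemma's implementation; `timedCall`, `CallRecord`, and the in-memory `records` array are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch, NOT the Stemma SDK. Shows the general pattern of
// timing an async call and recording metadata without altering its result.
interface CallRecord {
  promptId: string;
  version: string;
  latencyMs: number;
  ok: boolean;
}

const records: CallRecord[] = []; // stand-in for a real log sink

async function timedCall<T>(opts: {
  promptId: string;
  version: string;
  call: () => Promise<T>;
}): Promise<T> {
  const start = Date.now();
  let ok = false;
  try {
    const result = await opts.call();
    ok = true;
    return result; // the caller sees exactly what the provider returned
  } finally {
    // Runs on success and on failure, so errors are timed too.
    records.push({
      promptId: opts.promptId,
      version: opts.version,
      latencyMs: Date.now() - start,
      ok,
    });
  }
}
```

Because the record is written in `finally`, a thrown provider error still propagates to the caller unchanged.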
Version tracking

Know exactly which prompt version costs what

Tag every LLM call with a prompt ID and version string. Stemma groups metrics by version automatically — side-by-side latency, cost, p95, and token counts. No custom tooling.

  • p50 / p95 / p99 latency per version
  • Cost per call with monthly projection
  • High-latency alerts and suggested experiments
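
Grouping by version and computing percentiles can be sketched as follows. The `LoggedCall` shape and the nearest-rank percentile method are assumptions made for illustration; Stemma's own aggregation may use different fields or a different percentile definition.

```typescript
// Hypothetical log shape; field names are assumptions for illustration.
interface LoggedCall {
  version: string;
  latencyMs: number;
  costUsd: number;
}

interface VersionStats {
  calls: number;
  p50Ms: number;
  p95Ms: number;
  avgCostUsd: number;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("percentile of empty array");
  return value;
}

function statsByVersion(calls: LoggedCall[]): Record<string, VersionStats> {
  // Bucket calls by their version tag.
  const groups: Record<string, LoggedCall[]> = {};
  for (const call of calls) {
    const group = groups[call.version] ?? [];
    group.push(call);
    groups[call.version] = group;
  }
  // Summarize each bucket.
  const stats: Record<string, VersionStats> = {};
  for (const version of Object.keys(groups)) {
    const group = groups[version] ?? [];
    const latencies = group.map((c) => c.latencyMs);
    const totalCost = group.reduce((sum, c) => sum + c.costUsd, 0);
    stats[version] = {
      calls: group.length,
      p50Ms: percentile(latencies, 50),
      p95Ms: percentile(latencies, 95),
      avgCostUsd: totalCost / group.length,
    };
  }
  return stats;
}
```

With few samples per version, p95 collapses toward the maximum, so tail percentiles are only informative once a version has accumulated a reasonable number of calls.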

Version Analysis · summarizer

Version  Model             Avg    p95    Cost/call
v1.0.0   claude-sonnet-4   13.4s  21.8s  $0.042
v1.0.1   claude-haiku-4.5  4.2s   6.1s   $0.008

↓ 69% faster · ↓ 81% cheaper
Model comparison

Pick the right model before you ship

Run any prompt on Haiku, Sonnet, and Opus simultaneously. See outputs, latency, and cost side by side — so the choice is data, not instinct.

  • Compare up to 5 Claude models at once
  • Fastest and cheapest model highlighted
  • Token breakdown per model
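
Running one prompt against several models at the same time is, at its core, a fan-out with per-call timing. The sketch below assumes each model is exposed as an async function; `ModelRunner`, `compareModels`, and `cheapest` are hypothetical names, and a real runner would call the provider SDK rather than return canned values.

```typescript
// Hypothetical sketch: `ModelRunner` stands in for a real provider call.
type ModelRunner = (prompt: string) => Promise<{ output: string; costUsd: number }>;

interface ComparisonRow {
  model: string;
  output: string;
  latencyMs: number;
  costUsd: number;
}

async function compareModels(
  prompt: string,
  runners: Array<{ model: string; run: ModelRunner }>,
): Promise<ComparisonRow[]> {
  // map() starts every call before any is awaited, so they run concurrently.
  return Promise.all(
    runners.map(async ({ model, run }) => {
      const start = Date.now();
      const { output, costUsd } = await run(prompt);
      return { model, output, latencyMs: Date.now() - start, costUsd };
    }),
  );
}

// Picks the lowest-cost row, or undefined for an empty comparison.
function cheapest(rows: ComparisonRow[]): ComparisonRow | undefined {
  return rows.reduce<ComparisonRow | undefined>(
    (best, row) => (!best || row.costUsd < best.costUsd ? row : best),
    undefined,
  );
}
```

`Promise.all` rejects as soon as any runner fails; a comparison that should tolerate one failing model could use `Promise.allSettled` instead.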

Model Comparison · 3 models

Haiku 4.5

Fastest · Cheapest

Latency

1.2s

Cost

$0.0012

The quick brown fox…

Sonnet 4.6

Latency

4.1s

Cost

$0.0098

The quick brown fox…

Opus 4.6

Latency

8.7s

Cost

$0.082

The quick brown fox…
Cost intelligence

Automatic optimization recommendations

Stemma analyzes your usage and surfaces actionable suggestions: model downgrades, token reduction opportunities, and caching wins — each with a confidence rating based on your actual call history.

  • Savings shown as a range (estimate ±20%)
  • Current vs. optimized annual cost
  • Stacked model usage bar for visual comparison
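
The arithmetic behind a savings range and an annual comparison can be sketched as below. It assumes the saving is driven by a single price ratio between the current and suggested model and that the range is a flat ±20% band around the point estimate; `estimateSavings` and its parameters are hypothetical, and the real recommendation engine presumably weighs more factors.

```typescript
// Hypothetical savings arithmetic; not Stemma's actual model.
interface SavingsEstimate {
  lowUsd: number;            // monthly saving, lower bound
  midUsd: number;            // monthly saving, point estimate
  highUsd: number;           // monthly saving, upper bound
  annualCurrentUsd: number;  // yearly cost on the current model
  annualOptimizedUsd: number; // yearly cost after switching
}

// priceRatio = (suggested model cost) / (current model cost) for the same calls.
function estimateSavings(
  currentMonthlyUsd: number,
  priceRatio: number,
  band = 0.2,
): SavingsEstimate {
  const midUsd = currentMonthlyUsd * (1 - priceRatio);
  return {
    lowUsd: midUsd * (1 - band),
    midUsd,
    highUsd: midUsd * (1 + band),
    annualCurrentUsd: currentMonthlyUsd * 12,
    annualOptimizedUsd: (currentMonthlyUsd - midUsd) * 12,
  };
}
```

For example, `estimateSavings(100, 0.25)` models a switch to a model that costs a quarter as much for the same traffic.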

Cost Intelligence

⚡ Switch to Haiku 4.5

4 calls on Sonnet 4 · 3–5× faster

$0.11–$0.17

/month

Low data: 4 calls · View Prompt →

Model Usage

Claude Sonnet 4 · $0.18/mo

→ Haiku 4.5 saves $0.16/mo

Playground

Replay any production call in the browser

Every logged call can be replayed directly in the Stemma playground. Edit the prompt, swap the model, and compare outputs — without touching your production code.

  • One-click replay from any log entry
  • Edit prompt and re-run live
  • Latency and cost shown per run

Playground · summarizer v1.0.0

system

You are a concise summarizer. Return 3 bullet points.

user

Summarize this lease agreement: The tenant agrees to pay…

assistant · 4.2s · $0.008

• Monthly rent: $2,400 due on the 1st
• 12-month term, no early exit
• Pets allowed with $500 deposit