Stemma

Debug LLMs in Production.

Log every call, version your prompts, and track cost — all in one dashboard. Hard-capped at your plan limit. No surprises.

10,000 free calls · No overages, ever · Your keys, never ours.
Screenshot: Stemma dashboard overview (app.stemma.dev)
Logging

Log Everything, Automatically

Every LLM call is captured — latency, token count, cost, model, and the full input/output. Zero extra code after the one-time setup.

  • Latency & p95 tracking
  • Full prompt & completion capture
  • Token counts per call
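Under the hood, automatic capture amounts to timing the call, recording its metadata, and passing the response through untouched. A minimal sketch of that pattern in plain TypeScript (hypothetical names; the real @stemma/sdk internals may differ):

```typescript
// Hypothetical sketch of the capture pattern; not the actual SDK internals.
interface LogEntry {
  promptId: string;
  latencyMs: number;
  output: unknown;
}

async function wrapAndLog<T>(
  promptId: string,
  call: () => Promise<T>,
  sink: (entry: LogEntry) => void,
): Promise<T> {
  const start = Date.now();
  const output = await call();                               // run the underlying LLM call
  sink({ promptId, latencyMs: Date.now() - start, output }); // record structured metadata
  return output;                                             // response passes through unchanged
}
```

Because the wrapper only observes, your application code never changes shape: the wrapped call returns exactly what the provider returned.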
Screenshot: Stemma logs view
Screenshot: Stemma prompt versions
Versioning

Compare Prompt Versions

Tag each call with a prompt ID and version number. Metrics update in real time, so you can see exactly what changed between iterations.

  • Side-by-side metric comparison
  • Real-time version diff
  • Regression detection
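The side-by-side comparison boils down to grouping logged calls by version and aggregating their metrics. A rough sketch, assuming a simplified record shape rather than Stemma's actual schema:

```typescript
// Illustrative only: a simplified record shape, not Stemma's actual schema.
interface CallRecord {
  promptId: string;
  version: string;
  latencyMs: number;
  costUsd: number;
}

interface VersionStats {
  calls: number;
  avgLatencyMs: number;
  totalCostUsd: number;
}

// Group logged calls by version so v1 and v2 can be compared side by side.
function compareVersions(records: CallRecord[]): Map<string, VersionStats> {
  const stats = new Map<string, VersionStats>();
  for (const r of records) {
    const s = stats.get(r.version) ?? { calls: 0, avgLatencyMs: 0, totalCostUsd: 0 };
    s.avgLatencyMs = (s.avgLatencyMs * s.calls + r.latencyMs) / (s.calls + 1); // running mean
    s.calls += 1;
    s.totalCostUsd += r.costUsd;
    stats.set(r.version, s);
  }
  return stats;
}
```

A regression then shows up as a version whose average latency or total cost jumps relative to its predecessor.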
Cost Control

Hard Caps, Zero Surprises

Track spend per prompt and set hard monthly caps. When the limit is hit, calls stop gracefully — no overages, no bill shock.

  • Cost per prompt breakdown
  • Monthly hard caps
  • Projected spend forecast
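The cap behavior can be pictured as a spend guard that refuses calls once the budget is reached. An illustrative sketch only; the names and the client-side placement are assumptions, since Stemma enforces caps on its side:

```typescript
// Illustrative sketch of a hard monthly cap; treat the names here as
// assumptions, not Stemma's actual enforcement logic.
class SpendGuard {
  private spentUsd = 0;

  constructor(private readonly capUsd: number) {}

  // Runs the call if it fits under the cap; otherwise returns null
  // instead of throwing, so the caller can degrade gracefully.
  async run<T>(estimatedCostUsd: number, call: () => Promise<T>): Promise<T | null> {
    if (this.spentUsd + estimatedCostUsd > this.capUsd) {
      return null; // cap reached: stop the call, never incur an overage
    }
    const result = await call();
    this.spentUsd += estimatedCostUsd;
    return result;
  }
}
```

Returning null rather than throwing is what "stop gracefully" means here: the caller can fall back to a cached answer or a cheaper model instead of crashing.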
Screenshot: Stemma cost tracking

Three Lines of Code

Add Stemma to any existing project in under a minute.

TypeScript

// 1. Install

npm install @stemma/sdk

// 2. Wrap any LLM call

import { Stemma } from "@stemma/sdk";

const stemma = new Stemma({
  apiKey: "YOUR_API_KEY",
});

const response = await stemma.wrap({
  promptId: "my-prompt",
  version:  "v1",
  call: () => openai.chat.completions.create({
    model: "gpt-4o",
    messages,
  }),
});

Common Questions

Everything you need to know before getting started.

What is Stemma?

Stemma is an LLM observability tool. Add three lines to your app and every prompt call gets logged — latency, token counts, cost, input, output — all in a searchable dashboard.

How do I get started?

Install the SDK with npm install @stemma/sdk, create a project to get an API key, then wrap your LLM calls with stemma.wrap(). That's it — no proxy required.

Does it work with any provider?

Yes. Stemma works with any provider — OpenAI, Anthropic, Google, Mistral, local models. You pass the call result directly, so there's no SDK lock-in.

Why not just use console logs?

Console logs vanish. Stemma persists every call with structured metadata, lets you diff prompt versions, see cost trends, replay requests with copy-as-curl, and alerts you when costs spike.

What happens when I hit my plan's call limit?

Your app keeps running — only logging stops for the rest of the month. No surprise shutdowns, no broken production calls. Upgrade to Builder for 25,000 calls/month.

“Built by an indie developer, for indie developers — because production visibility shouldn't cost enterprise prices.”

— Stemma