Log every call, version your prompts, and track cost — all in one dashboard. Hard-capped at your plan limit. No surprises.

Every LLM call is captured — latency, token count, cost, model, and the full input/output. Zero extra code after the one-time setup.
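As a hedged sketch of what "captured" means in practice, here is a hypothetical record shape and a cost calculation. The field names and per-million-token prices are assumptions for illustration, not the actual Stemma schema:

```typescript
// Hypothetical shape of one captured call — field names are
// illustrative assumptions, not the real Stemma schema.
interface CapturedCall {
  model: string;
  latencyMs: number;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  input: string;
  output: string;
}

// Cost is derivable from token counts plus a per-million-token
// price table (prices here are placeholders, not real rates).
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricePerMTokIn: number,
  pricePerMTokOut: number
): number {
  return (inputTokens * pricePerMTokIn + outputTokens * pricePerMTokOut) / 1_000_000;
}
```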


Tag each call with a prompt ID and version number. Metrics update in real time, so you can see exactly what changed between iterations.
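To illustrate the idea (this is not the SDK's actual storage model), version-level metrics amount to keying aggregates by prompt ID plus version, which is what makes per-iteration diffs possible:

```typescript
// Illustrative sketch: aggregate latencies per promptId@version so
// two versions of the same prompt can be compared side by side.
const latencies = new Map<string, number[]>();

function record(promptId: string, version: string, latencyMs: number): void {
  const key = `${promptId}@${version}`;
  const list = latencies.get(key) ?? [];
  list.push(latencyMs);
  latencies.set(key, list);
}

function meanLatency(promptId: string, version: string): number {
  const list = latencies.get(`${promptId}@${version}`) ?? [];
  return list.length ? list.reduce((a, b) => a + b, 0) / list.length : 0;
}
```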
Track spend per prompt and set hard monthly caps. When the limit is hit, calls stop gracefully — no overages, no bill shock.
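A minimal sketch of the graceful-stop behavior, assuming a hypothetical budget shape (illustrative only, not the SDK's enforcement logic):

```typescript
// Hypothetical monthly budget — not the real Stemma API.
type Budget = { capUsd: number; spentUsd: number };

// Whether the next call may proceed under the monthly cap.
function underCap(budget: Budget): boolean {
  return budget.spentUsd < budget.capUsd;
}

// Graceful stop: once the cap is reached, skip the call and return
// null instead of throwing, so there is no overage and no crash.
function callWithCap<T>(budget: Budget, call: () => T): T | null {
  return underCap(budget) ? call() : null;
}
```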

Add Stemma to any existing project in under a minute.
// 1. Install
npm install @stemma/sdk

// 2. Wrap any LLM call
import { Stemma } from "@stemma/sdk";

const stemma = new Stemma({
  apiKey: "YOUR_API_KEY",
});

const response = await stemma.wrap({
  promptId: "my-prompt",
  version: "v1",
  call: () => openai.chat.completions.create({
    model: "gpt-4o",
    messages,
  }),
});

Everything you need to know before getting started.
What is Stemma?
Stemma is an LLM observability tool. Add three lines to your app and every prompt call gets logged — latency, token counts, cost, input, output — all in a searchable dashboard.
How do I get started?
Install the SDK with npm install @stemma/sdk, create a project to get an API key, then wrap your LLM calls with stemma.wrap(). That's it — no proxy required.
Does Stemma work with providers other than OpenAI?
Yes. Stemma works with any provider — OpenAI, Anthropic, Google, Mistral, local models. Your call's result is passed straight through, so there's no SDK lock-in.
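One way to see why there is no lock-in: a wrapper of this kind can be generic over the call's return type, so it never depends on any one provider's response shape. This is an illustrative sketch, not the SDK internals:

```typescript
// Illustrative sketch (not the real SDK): a wrapper that is generic
// over T works with any provider's client, since it only times the
// call and passes the result straight through.
async function wrapAny<T>(
  meta: { promptId: string; version: string },
  call: () => Promise<T>
): Promise<T> {
  const start = Date.now();
  const result = await call();
  const latencyMs = Date.now() - start;
  // A real implementation would ship { ...meta, latencyMs } to a backend.
  console.log(`${meta.promptId}@${meta.version}: ${latencyMs}ms`);
  return result;
}
```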
Why not just use console.log?
Console logs vanish. Stemma persists every call with structured metadata, lets you diff prompt versions, see cost trends, and replay requests with copy-as-curl, and alerts you when costs spike.
What happens when I hit my plan's call limit?
Your app keeps running — only logging stops for the rest of the month. No surprise shutdowns, no broken production calls. Upgrade to Builder for 25,000 calls/month.
“Built by an indie developer, for indie developers — because production visibility shouldn't cost enterprise prices.”
— Stemma