Pick a preset, hit Run, and see exactly what Stemma captures — latency, tokens, and cost per call.
This is a simulated demo. Sign up to run real calls with your own API keys.
You are a concise summarizer. Return a 2–3 sentence summary of the user's text.
The transformer architecture, introduced in the paper "Attention Is All You Need" (Vaswani et al., 2017), replaced recurrent networks with a self-attention mechanism. This allowed for far greater parallelism during training and enabled models to capture long-range dependencies in text more effectively. The architecture became the foundation for GPT, BERT, and virtually every large language model built since.
Hit Run to see the simulated response →