
You deployed your OpenAI integration last week. The daily summary email says $4.20 spent yesterday. Sounds fine. But by the time that email arrives, a runaway loop that fired 3,000 requests at 3 AM has already finished. You lost $12, your rate limit got hammered, and you found out 18 hours later.
Real-time cost monitoring closes that gap. This guide shows you how to get per-minute spend visibility on your OpenAI API in under ten minutes, using Spanlens, a free open-source observability proxy.
Why a Daily Summary Is Not Enough
OpenAI’s usage dashboard shows aggregate daily totals. That is useful for accounting but too slow for catching problems. Three scenarios where daily totals leave you blind:
- A retry loop bug sends 500 requests in 2 minutes. You hit your rate limit before you notice.
- You accidentally left gpt-4o in a code path that should use gpt-4o-mini. Costs are 5x higher, but the daily total looks normal until end of month.
- A new feature shipped on Friday afternoon. Usage is up 300%. Is that expected growth or a bug? You cannot tell until Monday.
Four Numbers That Actually Matter
Instead of daily totals, track these four metrics in real time:
- Spend per hour: spikes signal runaway loops or traffic anomalies
- Cost by model: shows if you are accidentally using an expensive model
- Requests per minute: tracks traffic volume independently of cost
- Error rate: a rising error rate combined with rising spend often means retries are multiplying costs
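To make the four numbers concrete, here is a minimal sketch of how they could be computed from a list of logged requests. The `RequestLog` shape and the `summarize` function are hypothetical names for illustration, not part of Spanlens or the OpenAI SDK.

```typescript
// Hypothetical shape of one logged API call; the field names are assumptions.
interface RequestLog {
  timestamp: number; // Unix epoch, in milliseconds
  model: string;
  costUsd: number;
  ok: boolean; // false when the call returned an error
}

interface Metrics {
  spendPerHourUsd: number;
  costByModelUsd: Record<string, number>;
  requestsPerMinute: number;
  errorRate: number;
}

// Summarize every log that falls inside [windowStartMs, windowEndMs).
function summarize(
  logs: RequestLog[],
  windowStartMs: number,
  windowEndMs: number,
): Metrics {
  if (windowEndMs <= windowStartMs) {
    throw new Error("window must have a positive length");
  }
  const inWindow = logs.filter(
    (l) => l.timestamp >= windowStartMs && l.timestamp < windowEndMs,
  );
  const minutes = (windowEndMs - windowStartMs) / 60_000;

  let totalCost = 0;
  let errors = 0;
  const costByModelUsd: Record<string, number> = {};
  for (const l of inWindow) {
    totalCost += l.costUsd;
    if (!l.ok) errors += 1;
    costByModelUsd[l.model] = (costByModelUsd[l.model] ?? 0) + l.costUsd;
  }

  return {
    // Scale the window's spend to an hourly rate.
    spendPerHourUsd: (totalCost * 60) / minutes,
    costByModelUsd,
    requestsPerMinute: inWindow.length / minutes,
    errorRate: inWindow.length === 0 ? 0 : errors / inWindow.length,
  };
}
```

Calling `summarize(logs, now - 10 * 60_000, now)` would give the four numbers for the last ten minutes. A short window reacts faster to spikes, while a longer one smooths out noise.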
Set Up Spanlens in 5 Minutes
Spanlens acts as a proxy in front of the OpenAI API. Change one line in your code and every request gets logged, costed, and traced automatically.
Step 1: Sign up at spanlens.io and create a project. You get an API key in the format sl_live_...
Step 2: Change your baseURL to point to the Spanlens proxy:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  // Send requests to the Spanlens proxy instead of the default OpenAI endpoint.
  baseURL: "https://api.spanlens.io/proxy/openai/v1",
  defaultHeaders: {
    // The Spanlens API key from Step 1.
    "Authorization": `Bearer ${process.env.SPANLENS_API_KEY}`,
  },
});
```

That is the entire setup. Your existing OpenAI calls work unchanged. No SDK wrapper, no new function calls, no restructuring.
Reading the Dashboard
Once your first request flows through, the Spanlens dashboard shows spend in real time. The “Traffic and spend” chart updates every minute. The solid line tracks request volume; the dashed line tracks dollar cost.
When a runaway loop fires, you see both lines spike at the same time. The anomaly detector flags it automatically with the exact timestamp, cost delta, and percentage increase.
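The idea behind spike detection can be sketched in a few lines. This is not Spanlens's actual detector, whose internals are not described here. It is one simple approach, assuming per-minute cost buckets and a hypothetical `detectSpike` helper that compares the latest minute to the average of the minutes before it.

```typescript
// Hypothetical per-minute cost bucket; the names are assumptions.
interface MinuteBucket {
  minuteStartMs: number; // start of the one-minute bucket, epoch ms
  costUsd: number;
}

interface Spike {
  minuteStartMs: number;
  costDeltaUsd: number; // latest cost minus the baseline
  percentIncrease: number; // delta relative to the baseline, in percent
}

// Flag the latest bucket when its cost exceeds `factor` times the mean
// cost of all preceding buckets. Returns null when there is no spike,
// too little history, or a zero baseline (percent increase is undefined).
function detectSpike(buckets: MinuteBucket[], factor = 3): Spike | null {
  if (buckets.length < 2) return null;

  const latest = buckets[buckets.length - 1];
  const history = buckets.slice(0, -1);
  const baseline =
    history.reduce((sum, b) => sum + b.costUsd, 0) / history.length;

  if (baseline <= 0 || latest.costUsd <= baseline * factor) return null;

  const costDeltaUsd = latest.costUsd - baseline;
  return {
    minuteStartMs: latest.minuteStartMs,
    costDeltaUsd,
    percentIncrease: (costDeltaUsd / baseline) * 100,
  };
}
```

A fixed multiple of the trailing mean is crude but easy to reason about. A production detector would likely also account for daily traffic patterns, so that normal morning ramp-ups are not flagged.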

You can configure Slack or email alerts to fire the moment a spike is detected. That means you know within minutes instead of the next morning.
Breaking Down Costs by Model
The dashboard also breaks down spend by model, so you can see exactly where your budget is going.

A common finding: an hourly background job is using gpt-4o for a task that gpt-4o-mini handles just as well. Switching that one call typically cuts the monthly bill by 30 to 50 percent.
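The arithmetic behind a model switch is simple enough to check by hand. The sketch below estimates the monthly cost of one recurring job from its token counts and per-million-token prices. The `ModelPrice` type and `monthlyCostUsd` function are hypothetical, and any prices you pass in should come from the current OpenAI pricing page rather than from this example.

```typescript
// Price of a model in USD per one million tokens. Values are supplied by
// the caller; nothing here reflects real list prices.
interface ModelPrice {
  inputUsdPerMTok: number;
  outputUsdPerMTok: number;
}

// Estimated monthly cost of a job that makes `callsPerMonth` identical calls.
function monthlyCostUsd(
  callsPerMonth: number,
  inputTokensPerCall: number,
  outputTokensPerCall: number,
  price: ModelPrice,
): number {
  const costPerCall =
    (inputTokensPerCall * price.inputUsdPerMTok +
      outputTokensPerCall * price.outputUsdPerMTok) /
    1_000_000;
  return costPerCall * callsPerMonth;
}

// Share of the job's cost saved by moving from `before` to `after`, in percent.
function savingsPercent(beforeUsd: number, afterUsd: number): number {
  if (beforeUsd <= 0) throw new Error("beforeUsd must be positive");
  return ((beforeUsd - afterUsd) / beforeUsd) * 100;
}
```

An hourly job makes roughly 24 × 30 = 720 calls a month, so `monthlyCostUsd(720, inputTokens, outputTokens, price)` evaluated once per model gives the two figures to compare. The saving on the whole bill depends on how much of your total spend that job represents.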
A 10-Minute Cost Audit
If you just connected Spanlens to an existing project, here is a quick audit that surfaces optimizations fast:

- Open the Requests page and sort by cost descending. Look at the top 10 calls. Are they using the right model for the task?
- Switch to the Stats page and set the range to 7 days. Check if spend per day is growing faster than requests per day. If cost grows faster than volume, you likely have model drift somewhere: a more expensive model has crept into a frequently called code path.
- Go to the Anomalies tab. Look at any flagged events from the past week. Each one links to the exact requests that triggered it.
The entire audit takes about 10 minutes and almost always surfaces at least one change worth making.
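The second audit step, comparing spend growth to request growth, can also be expressed as a small check. This is a sketch under assumptions: the `DailyStat` shape and `compareGrowth` function are hypothetical, and the daily figures would be copied or exported from whatever stats view you use.

```typescript
// Hypothetical daily totals; the field names are assumptions.
interface DailyStat {
  date: string; // e.g. "2025-01-31"
  requests: number;
  costUsd: number;
}

interface GrowthReport {
  requestGrowth: number; // 0.5 means +50% from first day to last day
  costGrowth: number;
  costOutpacesVolume: boolean;
}

// Compare the first and last day of the range (days must be in date order).
function compareGrowth(days: DailyStat[]): GrowthReport {
  if (days.length < 2) throw new Error("need at least two days");
  const first = days[0];
  const last = days[days.length - 1];
  if (first.requests <= 0 || first.costUsd <= 0) {
    throw new Error("first day must have non-zero requests and cost");
  }

  const requestGrowth = last.requests / first.requests - 1;
  const costGrowth = last.costUsd / first.costUsd - 1;
  return {
    requestGrowth,
    costGrowth,
    // True when cost per request has risen over the range.
    costOutpacesVolume: costGrowth > requestGrowth,
  };
}
```

Comparing only the first and last day is sensitive to one-off outliers. Averaging the first and last few days of the range would be a more robust variant.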
Two Patterns to Watch For
Runaway retry loops. An error condition triggers retries. Each retry costs money. Without rate limiting on the retry logic, a single bad request can multiply into hundreds. Spanlens shows this as a sudden vertical spike on the hourly chart.
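One way to keep a retry loop from running away is to cap the number of attempts and back off between them. The helper below is a minimal sketch, not part of the OpenAI SDK or Spanlens. `withRetry` and its parameters are hypothetical names.

```typescript
// Run `fn`, retrying on failure up to `maxAttempts` total attempts.
// The wait doubles after each failed attempt: baseDelayMs, 2x, 4x, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Only wait if another attempt is coming.
      if (attempt < maxAttempts) {
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error instead of looping forever.
  throw lastError;
}
```

With a hard cap, the worst case for one logical request is `maxAttempts` billed calls instead of hundreds. A fuller version would also skip retries for errors that cannot succeed on a second try, such as invalid requests.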
Model drift. A code review bumps a model from gpt-4o-mini to gpt-4o in a hot path. The change looks small in a diff but can double your daily bill. Watching cost-per-request over time catches this within hours of the deploy.
Start Monitoring in 5 Minutes
Spanlens is free to start, open source (MIT), and takes a single line change to set up. No agent to deploy, no infrastructure to manage, no vendor lock-in.
Prefer to self-host? The full source is available on GitHub. Docker image included.