01 Overview
Token Costs provides programmatic access to LLM token pricing. Instead of hardcoding prices or scraping pricing pages yourself, fetch daily-updated JSON or use the npm package.
Supported providers: OpenAI, Anthropic, Google, OpenRouter
Update frequency: Daily at 00:01 UTC
02 Quick Start
Option 1: NPM Package
npm install token-costs
import { CostClient } from 'token-costs';
const client = new CostClient();
// Get pricing
const result = await client.getModelPricing('openai', 'gpt-4o');
console.log(result.pricing.input); // 2.5
// Calculate cost
const cost = await client.calculateCost('anthropic', 'claude-sonnet-4', {
inputTokens: 1500,
outputTokens: 800,
});
console.log(cost.totalCost); // 0.0165
Option 2: Direct Fetch (No Dependencies)
Fetch JSON directly from this site:
const response = await fetch('https://mikkotikkanen.github.io/token-costs/api/v1/openai.json');
const data = await response.json();
const gpt4o = data.current.models['gpt-4o'];
console.log(`Input: $${gpt4o.input}/M tokens`);
console.log(`Output: $${gpt4o.output}/M tokens`);
03 NPM Package
Installation
npm install token-costs
CostClient
The main class for fetching and caching pricing data:
import { CostClient } from 'token-costs';
const client = new CostClient();
// Get pricing for a specific model
const result = await client.getModelPricing('openai', 'gpt-4o');
// Returns: { provider, modelId, pricing, date, stale }
// Get all models for a provider
const models = await client.getProviderModels('anthropic');
// Returns: { 'claude-sonnet-4': { input, output, ... }, ... }
// List model IDs
const ids = await client.listModels('google');
// Returns: ['gemini-1.5-pro', 'gemini-1.5-flash', ...]
// Calculate cost
const cost = await client.calculateCost('openai', 'gpt-4o', {
inputTokens: 1000,
outputTokens: 500,
cachedInputTokens: 200, // optional
});
// Returns: { inputCost, outputCost, totalCost, usedCachedPricing, date, stale }
Custom Providers & Offline Mode
Add custom models or use entirely custom pricing data:
// Add custom models alongside remote data
const client = new CostClient({
customProviders: {
'my-company': {
'internal-llm': { input: 0.50, output: 1.00, context: 32000 }
},
'openai': {
'gpt-4-custom': { input: 25, output: 50 } // Override/add to openai
}
}
});
// Offline mode - no remote fetching, only use provided data
const offlineClient = new CostClient({
offline: true,
customProviders: {
'openai': {
'gpt-4o': { input: 2.5, output: 10 }
}
}
});
Client Options
const client = new CostClient({
// Disable remote fetching (only use customProviders)
offline: false,
// Custom provider data (merged with remote, custom takes precedence)
customProviders: { ... },
// Custom base URL (default: GitHub raw content)
baseUrl: 'https://your-mirror.com/api/v1',
// Custom fetch function
fetch: customFetch,
// Clock offset in milliseconds (if server clock is wrong)
timeOffsetMs: 2 * 60 * 60 * 1000, // +2 hours
// External cache for serverless (Redis, etc.)
externalCache: {
get: (key) => redis.get(key),
set: (key, value) => redis.set(key, value, 'EX', 86400),
},
});
04 Stale Data Handling
The stale flag indicates that the pricing data may be outdated. It is set when:
- Update window - Data updates at 00:01 UTC. Between midnight and 00:01, yesterday's data is "current"
- Fetch failure - If crawlers fail to update, yesterday's data persists
- Clock skew - Your server's date differs from the data's date
const client = new CostClient();
const result = await client.getModelPricing('openai', 'gpt-4o');
if (result.stale) {
// Data date doesn't match today's date
console.warn(`Using ${result.date} pricing`);
}
// Stale data is still valid - prices rarely change daily
// Typical staleness is just a few hours around midnight UTC
Best Practices
For most applications, stale data is acceptable - LLM prices rarely change daily. However, if freshness matters:
const client = new CostClient();
const result = await client.getModelPricing('openai', 'gpt-4o');
if (result.stale) {
// Option 1: Use it anyway (recommended for most cases)
console.log(`Using ${result.date} pricing`);
// Option 2: Alert but continue
await sendAlert('Pricing data is stale');
// Option 3: Refuse to calculate (strict mode)
throw new Error('Pricing data is stale, aborting');
}
// In cost calculations, staleness is also returned:
const cost = await client.calculateCost('openai', 'gpt-4o', { inputTokens: 1000, outputTokens: 500 });
console.log(`Cost: $${cost.totalCost} (data from ${cost.date}, stale: ${cost.stale})`);
Error Handling
import { CostClient, ClockMismatchError } from 'token-costs';
const client = new CostClient();
try {
const pricing = await client.getModelPricing('openai', 'gpt-4o');
} catch (error) {
if (error instanceof ClockMismatchError) {
// Server clock is significantly off
console.error(`Clock off by ${error.daysDiff} days`);
}
}
05 API Endpoints
JSON files updated daily at 00:01 UTC:
OpenAI: api/v1/openai.json
Anthropic: api/v1/anthropic.json
OpenRouter: api/v1/openrouter.json
06 Data Format
API Response Structure
{
"current": {
"date": "2026-01-12",
"models": {
"gpt-4o": {
"input": 2.5,
"output": 10,
"cached": 1.25,
"context": 128000
},
"gpt-4o-mini": {
"input": 0.15,
"output": 0.6,
"context": 128000
}
}
},
"previous": {
"date": "2026-01-11",
"models": { ... }
}
}
Model Fields
| Field | Type | Description |
|---|---|---|
| input | number | USD per million input tokens |
| output | number | USD per million output tokens |
| cached | number? | USD per million cached input tokens |
| context | number? | Maximum context window size (tokens) |
| maxOutput | number? | Maximum output tokens |
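The per-million arithmetic these fields imply can be sketched with a small helper (hypothetical, not the package's own calculateCost implementation; the cached-token handling is an assumption):

```javascript
// Hypothetical helper showing the per-million cost arithmetic.
// Assumption: cached tokens bill at the `cached` rate when present,
// falling back to the full input rate otherwise.
function computeCost(pricing, usage) {
  const cached = usage.cachedInputTokens ?? 0;
  const cachedRate = pricing.cached ?? pricing.input;
  const inputCost =
    ((usage.inputTokens - cached) * pricing.input + cached * cachedRate) / 1_000_000;
  const outputCost = (usage.outputTokens * pricing.output) / 1_000_000;
  return { inputCost, outputCost, totalCost: inputCost + outputCost };
}

// gpt-4o entry from the response above: $2.50/M input, $10/M output
const gpt4o = { input: 2.5, output: 10, cached: 1.25, context: 128000 };
const cost = computeCost(gpt4o, { inputTokens: 1000, outputTokens: 500 });
console.log(cost.totalCost); // ≈ 0.0075
```

This matches the earlier Quick Start figure: 1,500 input at $3/M plus 800 output at $15/M for claude-sonnet-4 gives $0.0165.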
Why Current and Previous?
Data updates at 00:01 UTC. If your application fetches at 23:59 UTC and caches, it might have "yesterday's" data when the calendar date changes.
The dual structure lets you:
- Detect if your cached data is stale
- Fall back to previous data if current is missing
- Show users when prices last changed
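The stale-detection and fallback logic can be sketched as a small pure function over the parsed response (a sketch, not the client's actual implementation; the current date is passed in explicitly here for testability):

```javascript
// Sketch of staleness detection plus previous-day fallback over the
// parsed provider JSON. todayUtc is an ISO date string like '2026-01-12'.
function pickModels(data, todayUtc) {
  if (data.current && data.current.date === todayUtc) {
    return { models: data.current.models, date: data.current.date, stale: false };
  }
  // Current block is missing or dated: fall back and flag the result stale
  const block = data.current ?? data.previous;
  return { models: block.models, date: block.date, stale: true };
}

const data = {
  current: { date: '2026-01-12', models: { 'gpt-4o': { input: 2.5, output: 10 } } },
  previous: { date: '2026-01-11', models: { 'gpt-4o': { input: 2.5, output: 10 } } },
};
console.log(pickModels(data, '2026-01-13').stale); // true
```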
07 Advanced Usage
Serverless / Edge Functions
In serverless environments, use external cache to avoid refetching on every cold start:
import { CostClient } from 'token-costs';
import { Redis } from '@upstash/redis';
const redis = new Redis({ url: process.env.REDIS_URL, token: process.env.REDIS_TOKEN });
const client = new CostClient({
externalCache: {
get: (key) => redis.get(key),
set: (key, value) => redis.set(key, value, { ex: 86400 }),
},
});
Custom Fetch
For environments without global fetch or to add custom headers:
import { CostClient } from 'token-costs';
import nodeFetch from 'node-fetch';
const client = new CostClient({
fetch: nodeFetch,
});
Self-Hosting
Mirror the JSON files and point the client to your server:
const client = new CostClient({
baseUrl: 'https://your-cdn.com/token-costs/api/v1',
});
08 Historical Data
Full price change history is stored in the GitHub repository:
github.com/mikkotikkanen/token-costs/tree/main/history/prices
Historical Format
{
"provider": "openai",
"lastCrawled": "2026-01-12T00:01:00Z",
"pricingUrl": "https://openai.com/api/pricing/",
"changes": [
{
"date": "2026-01-11",
"changeType": "added",
"pricing": {
"modelId": "gpt-4o",
"modelName": "GPT-4o",
"inputPricePerMillion": 2.5,
"outputPricePerMillion": 10,
"contextWindow": 128000
}
},
{
"date": "2026-01-15",
"changeType": "updated",
"pricing": { ... },
"previousPricing": { ... }
}
]
}
Change types: added, updated, removed
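One way to consume this format is to replay the changes array in date order and recover the latest known pricing per model (a hypothetical reducer, not a package API; it assumes removed entries also carry a pricing.modelId):

```javascript
// Hypothetical reducer over the history format above: replay `changes`
// chronologically so later entries win, and removals drop the model.
function latestPricing(history) {
  const models = {};
  const sorted = [...history.changes].sort((a, b) => a.date.localeCompare(b.date));
  for (const change of sorted) {
    const id = change.pricing.modelId;
    if (change.changeType === 'removed') {
      delete models[id];
    } else {
      models[id] = change.pricing; // 'added' or 'updated'
    }
  }
  return models;
}
```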
09 For LLM Providers
The Problem
Every tool that tracks LLM pricing has to scrape your website. This is:
- Fragile - breaks when you update your pricing page
- Wasteful - dozens of tools hitting your site daily
- Inaccurate - scrapers miss updates or parse incorrectly
The Proposal
Add a /llm_prices.json file to your website root (like robots.txt):
{
"gpt-4o": {
"input": 2.5,
"output": 10,
"cached": 1.25,
"context": 128000
},
"gpt-4o-mini": {
"input": 0.15,
"output": 0.6,
"context": 128000
}
}
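A consuming tool could fetch and sanity-check such a file along these lines (a sketch; the validation rules are assumptions rather than part of any spec, and the origin passed to fetchProviderPrices is whatever provider site you target):

```javascript
// Assumed shape check for one /llm_prices.json entry: input and output
// are required non-negative numbers, everything else is optional.
function isValidEntry(entry) {
  return (
    typeof entry === 'object' && entry !== null &&
    typeof entry.input === 'number' && entry.input >= 0 &&
    typeof entry.output === 'number' && entry.output >= 0
  );
}

// Fetch the file from a provider origin and keep only well-formed
// entries; model IDs are the object keys.
async function fetchProviderPrices(origin) {
  const res = await fetch(`${origin}/llm_prices.json`);
  const raw = await res.json();
  return Object.fromEntries(
    Object.entries(raw).filter(([, entry]) => isValidEntry(entry))
  );
}
```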
Why This Structure
- Model ID as key - matches what developers use in API calls
- Flat object - simple to parse, no nested hierarchies
- Minimal fields - only what's needed for cost calculation
- Per-million pricing - avoids floating point issues with per-token prices
Fields
| Field | Required | Description |
|---|---|---|
| input | Yes | USD per million input tokens |
| output | Yes | USD per million output tokens |
| cached | No | USD per million cached input tokens |
| context | No | Maximum context window size (tokens) |
Benefits
- One source of truth - you control the data
- Always accurate - updated when you change prices
- Zero scraping - tools fetch a single JSON file
- Tiny payload - typically under 2KB