01 Overview
Token Costs provides programmatic access to LLM token pricing. Instead of hardcoding prices or scraping pricing pages yourself, fetch daily-updated JSON or use the npm package.
Supported providers: OpenAI, Anthropic, Google, OpenRouter
Update frequency: Daily at 00:01 UTC
02 Quick Start
Option 1: NPM Package
npm install token-costs
import { CostClient } from 'token-costs';
const client = new CostClient();
// Get pricing
const result = await client.getModelPricing('openai', 'gpt-4o');
console.log(result.pricing.input); // 2.5
// Calculate cost
const cost = await client.calculateCost('anthropic', 'claude-sonnet-4', {
inputTokens: 1500,
outputTokens: 800,
});
console.log(cost.totalCost); // 0.0165
Option 2: Direct Fetch (No Dependencies)
Fetch JSON directly from this site:
const response = await fetch('https://mikkotikkanen.github.io/token-costs/api/v1/openai.json');
const data = await response.json();
const gpt4o = data.current.models['gpt-4o'];
console.log(`Input: $${gpt4o.input}/M tokens`);
console.log(`Output: $${gpt4o.output}/M tokens`);
03 NPM Package
Installation
npm install token-costs
CostClient
The main class for fetching and caching pricing data:
import { CostClient } from 'token-costs';
const client = new CostClient();
// Get pricing for a specific model
const result = await client.getModelPricing('openai', 'gpt-4o');
// Returns: { provider, modelId, pricing, date, stale }
// Get all models for a provider
const models = await client.getProviderModels('anthropic');
// Returns: { 'claude-sonnet-4': { input, output, ... }, ... }
// List model IDs
const ids = await client.listModels('google');
// Returns: ['gemini-1.5-pro', 'gemini-1.5-flash', ...]
// Calculate cost
const cost = await client.calculateCost('openai', 'gpt-4o', {
inputTokens: 1000,
outputTokens: 500,
cachedInputTokens: 200, // optional
});
// Returns: { inputCost, outputCost, totalCost, usedCachedPricing, date, stale }
Custom Providers & Offline Mode
Add custom models or use entirely custom pricing data:
// Add custom models alongside remote data
const client = new CostClient({
customProviders: {
'my-company': {
'internal-llm': { input: 0.50, output: 1.00, context: 32000 }
},
'openai': {
'gpt-4-custom': { input: 25, output: 50 } // Override/add to openai
}
}
});
// Offline mode - no remote fetching, only use provided data
const offlineClient = new CostClient({
offline: true,
customProviders: {
'openai': {
'gpt-4o': { input: 2.5, output: 10 }
}
}
});
Client Options
const client = new CostClient({
// Disable remote fetching (only use customProviders)
offline: false,
// Custom provider data (merged with remote, custom takes precedence)
customProviders: { ... },
// Custom base URL (default: GitHub raw content)
baseUrl: 'https://your-mirror.com/api/v1',
// Custom fetch function
fetch: customFetch,
// Clock offset in milliseconds (if server clock is wrong)
timeOffsetMs: 2 * 60 * 60 * 1000, // +2 hours
// External cache for serverless (Redis, etc.)
externalCache: {
get: (key) => redis.get(key),
set: (key, value) => redis.set(key, value, 'EX', 86400),
},
});
04 Stale Data Handling
The stale flag indicates that the pricing data may be outdated. It is set when:
- Update window - Data updates at 00:01 UTC. Between midnight and 00:01, yesterday's data is "current"
- Fetch failure - If crawlers fail to update, yesterday's data persists
- Clock skew - Your server's date differs from the data's date
const client = new CostClient();
const result = await client.getModelPricing('openai', 'gpt-4o');
if (result.stale) {
// Data date doesn't match today's date
console.warn(`Using ${result.date} pricing`);
}
// Stale data is still valid - prices rarely change daily
// Typical staleness is just a few hours around midnight UTC
Best Practices
For most applications, stale data is acceptable - LLM prices rarely change daily. However, if freshness matters:
const client = new CostClient();
const result = await client.getModelPricing('openai', 'gpt-4o');
if (result.stale) {
// Option 1: Use it anyway (recommended for most cases)
console.log(`Using ${result.date} pricing`);
// Option 2: Alert but continue
await sendAlert('Pricing data is stale');
// Option 3: Refuse to calculate (strict mode)
throw new Error('Pricing data is stale, aborting');
}
// In cost calculations, staleness is also returned:
const cost = await client.calculateCost('openai', 'gpt-4o', { inputTokens: 1000, outputTokens: 500 });
console.log(`Cost: $${cost.totalCost} (data from ${cost.date}, stale: ${cost.stale})`);
Error Handling
import { CostClient, ClockMismatchError } from 'token-costs';
const client = new CostClient();
try {
const pricing = await client.getModelPricing('openai', 'gpt-4o');
} catch (error) {
if (error instanceof ClockMismatchError) {
// Server clock is significantly off
console.error(`Clock off by ${error.daysDiff} days`);
}
}
05 API Endpoints
JSON files updated daily at 00:01 UTC:
OpenAI: api/v1/openai.json
Anthropic: api/v1/anthropic.json
OpenRouter: api/v1/openrouter.json
06 Data Format
API Response Structure
{
"current": {
"date": "2026-01-12",
"models": {
"gpt-4o": {
"input": 2.5,
"output": 10,
"cached": 1.25,
"context": 128000
},
"gpt-4o-mini": {
"input": 0.15,
"output": 0.6,
"context": 128000
}
}
},
"previous": {
"date": "2026-01-11",
"models": { ... }
}
}
Model Fields
| Field | Type | Description |
|---|---|---|
| input | number | USD per million input tokens |
| output | number | USD per million output tokens |
| cached | number? | USD per million cached input tokens |
| context | number? | Maximum context window size (tokens) |
| maxOutput | number? | Maximum output tokens |
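The per-million arithmetic these fields imply can be sketched with a small helper (hypothetical, not the package's own calculateCost implementation; the cached-token handling is an assumption):

```javascript
// Hypothetical helper showing the per-million cost arithmetic.
// Assumption: cached tokens bill at the `cached` rate when present,
// falling back to the full input rate otherwise.
function computeCost(pricing, usage) {
  const cached = usage.cachedInputTokens ?? 0;
  const cachedRate = pricing.cached ?? pricing.input;
  const inputCost =
    ((usage.inputTokens - cached) * pricing.input + cached * cachedRate) / 1_000_000;
  const outputCost = (usage.outputTokens * pricing.output) / 1_000_000;
  return { inputCost, outputCost, totalCost: inputCost + outputCost };
}

// gpt-4o entry from the response above: $2.50/M input, $10/M output
const gpt4o = { input: 2.5, output: 10, cached: 1.25, context: 128000 };
const cost = computeCost(gpt4o, { inputTokens: 1000, outputTokens: 500 });
console.log(cost.totalCost); // ≈ 0.0075
```

This matches the earlier Quick Start figure: 1,500 input at $3/M plus 800 output at $15/M for claude-sonnet-4 gives $0.0165.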
Why Current and Previous?
Data updates at 00:01 UTC. If your application fetches at 23:59 UTC and caches, it might have "yesterday's" data when the calendar date changes.
The dual structure lets you:
- Detect if your cached data is stale
- Fall back to previous data if current is missing
- Show users when prices last changed
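The stale-detection and fallback logic can be sketched as a small pure function over the parsed response (a sketch, not the client's actual implementation; the current date is passed in explicitly here for testability):

```javascript
// Sketch of staleness detection plus previous-day fallback over the
// parsed provider JSON. todayUtc is an ISO date string like '2026-01-12'.
function pickModels(data, todayUtc) {
  if (data.current && data.current.date === todayUtc) {
    return { models: data.current.models, date: data.current.date, stale: false };
  }
  // Current block is missing or dated: fall back and flag the result stale
  const block = data.current ?? data.previous;
  return { models: block.models, date: block.date, stale: true };
}

const data = {
  current: { date: '2026-01-12', models: { 'gpt-4o': { input: 2.5, output: 10 } } },
  previous: { date: '2026-01-11', models: { 'gpt-4o': { input: 2.5, output: 10 } } },
};
console.log(pickModels(data, '2026-01-13').stale); // true
```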
07 Advanced Usage
Serverless / Edge Functions
In serverless environments, use external cache to avoid refetching on every cold start:
import { CostClient } from 'token-costs';
import { Redis } from '@upstash/redis';
const redis = new Redis({ url: process.env.REDIS_URL, token: process.env.REDIS_TOKEN });
const client = new CostClient({
externalCache: {
get: (key) => redis.get(key),
set: (key, value) => redis.set(key, value, { ex: 86400 }),
},
});
Custom Fetch
For environments without global fetch or to add custom headers:
import { CostClient } from 'token-costs';
import nodeFetch from 'node-fetch';
const client = new CostClient({
fetch: nodeFetch,
});
Self-Hosting
Mirror the JSON files and point the client to your server:
const client = new CostClient({
baseUrl: 'https://your-cdn.com/token-costs/api/v1',
});
08 Historical Data
Full price change history is stored in the GitHub repository:
github.com/mikkotikkanen/token-costs/tree/main/history/prices
Historical Format
{
"provider": "openai",
"lastCrawled": "2026-01-12T00:01:00Z",
"pricingUrl": "https://openai.com/api/pricing/",
"changes": [
{
"date": "2026-01-11",
"changeType": "added",
"pricing": {
"modelId": "gpt-4o",
"modelName": "GPT-4o",
"inputPricePerMillion": 2.5,
"outputPricePerMillion": 10,
"contextWindow": 128000
}
},
{
"date": "2026-01-15",
"changeType": "updated",
"pricing": { ... },
"previousPricing": { ... }
}
]
}
Change types: added, updated, removed
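One way to consume this format is to replay the changes array in date order and recover the latest known pricing per model (a hypothetical reducer, not a package API; it assumes removed entries also carry a pricing.modelId):

```javascript
// Hypothetical reducer over the history format above: replay `changes`
// chronologically so later entries win, and removals drop the model.
function latestPricing(history) {
  const models = {};
  const sorted = [...history.changes].sort((a, b) => a.date.localeCompare(b.date));
  for (const change of sorted) {
    const id = change.pricing.modelId;
    if (change.changeType === 'removed') {
      delete models[id];
    } else {
      models[id] = change.pricing; // 'added' or 'updated'
    }
  }
  return models;
}
```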
09 For LLM Providers
The Problem
Every tool that tracks LLM pricing has to scrape your website. This is:
- Fragile - breaks when you update your pricing page
- Wasteful - dozens of tools hitting your site daily
- Inaccurate - scrapers miss updates or parse incorrectly
The Proposal
Add a /llm_prices.json file to your website root (like robots.txt):
{
"gpt-4o": {
"input": 2.5,
"output": 10,
"cached": 1.25,
"context": 128000
},
"gpt-4o-mini": {
"input": 0.15,
"output": 0.6,
"context": 128000
}
}
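A consuming tool could fetch and sanity-check such a file along these lines (a sketch; the validation rules are assumptions rather than part of any spec, and the origin passed to fetchProviderPrices is whatever provider site you target):

```javascript
// Assumed shape check for one /llm_prices.json entry: input and output
// are required non-negative numbers, everything else is optional.
function isValidEntry(entry) {
  return (
    typeof entry === 'object' && entry !== null &&
    typeof entry.input === 'number' && entry.input >= 0 &&
    typeof entry.output === 'number' && entry.output >= 0
  );
}

// Fetch the file from a provider origin and keep only well-formed
// entries; model IDs are the object keys.
async function fetchProviderPrices(origin) {
  const res = await fetch(`${origin}/llm_prices.json`);
  const raw = await res.json();
  return Object.fromEntries(
    Object.entries(raw).filter(([, entry]) => isValidEntry(entry))
  );
}
```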
Why This Structure
- Model ID as key - matches what developers use in API calls
- Flat object - simple to parse, no nested hierarchies
- Minimal fields - only what's needed for cost calculation
- Per-million pricing - avoids floating point issues with per-token prices
Fields
| Field | Required | Description |
|---|---|---|
| input | Yes | USD per million input tokens |
| output | Yes | USD per million output tokens |
| cached | No | USD per million cached input tokens |
| context | No | Maximum context window size (tokens) |
Benefits
- One source of truth - you control the data
- Always accurate - updated when you change prices
- Zero scraping - tools fetch a single JSON file
- Tiny payload - typically under 2KB