Chat Completions API

AI / ML

OpenAI

GPT-4o and GPT-4-turbo language model inference for chat, completion, and function calling

Reported Latency

Typical performance from public reports -- not live measurements

Average: 850ms

p50

600ms

p95

2,200ms

p99

4,500ms

99.81%observed uptime

Free tier

None (credit-based)

Paid starts at

$0.50 / 1M input tokens (GPT-4o-mini)

Authentication

API Key

Rate limit

500 RPM (Tier 1)

Protocols

RESTWebSocket (Realtime)

SDKs

PythonNode.jsGo.NETJava

Regions

USEU

Last updated: 2026-04-12