Completions Response Format
Here’s the response schema as a TypeScript type:
```typescript
// Definitions of subtypes are below
type Response = {
  id: string;
  // Depending on whether you set "stream" to "true" and
  // whether you passed in "messages" or a "prompt", you
  // will get a different output shape
  choices: (NonStreamingChoice | StreamingChoice | NonChatChoice)[];
  created: number; // Unix timestamp
  model: string;
  object: 'chat.completion' | 'chat.completion.chunk';
  system_fingerprint?: string; // Only present if the provider supports it
  // Usage data is always returned for non-streaming.
  // When streaming, you will get one usage object at
  // the end accompanied by an empty choices array.
  usage?: ResponseUsage;
};
```
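For reference, here is a minimal sketch of obtaining a `Response` with `fetch`. The base URL, model slug, and `OPENROUTER_API_KEY` environment variable are assumptions for illustration; substitute your own values. Note that the `Response` type above shadows the global fetch `Response` within this module.

```typescript
// A sketch, not an official client. URL, model slug, and env var are assumed.
async function createCompletion(prompt: string): Promise<Response> {
  const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.OPENROUTER_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'openai/o3-mini', // assumed model slug
      messages: [{ role: 'user', content: prompt }],
    }),
  });
  // The body parses into the Response shape defined above.
  return (await res.json()) as Response;
}
```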
```typescript
// If the provider returns usage, we pass it down
// as-is. Otherwise, we count using the GPT-4 tokenizer.
type ResponseUsage = {
  /** Including images and tools if any */
  prompt_tokens: number;
  /** The tokens generated */
  completion_tokens: number;
  /** Sum of the above two fields */
  total_tokens: number;
  /** Detailed breakdown of completion tokens */
  completion_tokens_details?: {
    accepted_prediction_tokens?: number | null;
    audio_tokens?: number | null;
    reasoning_tokens?: number; // Tokens used for reasoning (for models like o1, o3)
    rejected_prediction_tokens?: number | null;
    image_tokens?: number;
  };
  /** Detailed breakdown of prompt tokens */
  prompt_tokens_details?: {
    audio_tokens?: number | null;
    cached_tokens?: number;
  };
  /** Total cost of the request */
  cost?: number;
  /** Whether the request used Bring Your Own Key */
  is_byok?: boolean;
  /** Detailed cost breakdown */
  cost_details?: {
    upstream_inference_cost?: number;
    upstream_inference_prompt_cost?: number;
    upstream_inference_completions_cost?: number;
  };
};
```
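Most of these fields are optional, so it is safest to read them with optional chaining. A small illustrative helper (the function name is ours, not part of the schema):

```typescript
// Illustrative only: logs whichever usage fields the provider returned.
function logUsage(usage?: ResponseUsage): void {
  if (!usage) return; // streaming chunks before the final one carry no usage
  console.log(`prompt=${usage.prompt_tokens} completion=${usage.completion_tokens}`);
  const reasoning = usage.completion_tokens_details?.reasoning_tokens;
  if (reasoning !== undefined) console.log(`reasoning tokens: ${reasoning}`);
  const cached = usage.prompt_tokens_details?.cached_tokens;
  if (cached) console.log(`cached prompt tokens: ${cached}`);
  if (usage.cost !== undefined) {
    console.log(`cost: $${usage.cost}${usage.is_byok ? ' (BYOK)' : ''}`);
  }
}
```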
```typescript
// Subtypes:
type NonChatChoice = {
  finish_reason: string | null;
  text: string;
  error?: ErrorResponse;
};

type NonStreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  delta: {
    content: string | null;
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ErrorResponse = {
  code: number; // See "Error Handling" section
  message: string;
  metadata?: Record<string, unknown>; // Contains additional error information such as provider details, the raw error message, etc.
};

type ToolCall = {
  id: string;
  type: 'function';
  function: FunctionCall;
};

type FunctionCall = {
  name: string;
  arguments: string; // JSON-encoded arguments
};
```
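The three choice shapes share no explicit discriminant field, so one way to tell them apart at runtime is by which payload property is present. A sketch, with guard names of our choosing:

```typescript
type Choice = NonStreamingChoice | StreamingChoice | NonChatChoice;

// Each variant carries a distinct payload field: text, delta, or message.
const isNonChat = (c: Choice): c is NonChatChoice => 'text' in c;
const isStreaming = (c: Choice): c is StreamingChoice => 'delta' in c;

// Pull the generated text out of any of the three shapes.
function extractText(c: Choice): string | null {
  if (isNonChat(c)) return c.text;
  if (isStreaming(c)) return c.delta.content;
  return c.message.content;
}
```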
Here's an example response:
```json
{
  "id": "gen-1770283226-LLYbKoYDuMJMdKzmarHz",
  "created": 1770283226,
  "model": "o3-mini",
  "object": "chat.completion",
  "system_fingerprint": null,
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Building the world's tallest skyscraper is an immensely complex undertaking...",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      },
      "provider_specific_fields": {
        "native_finish_reason": "completed"
      }
    }
  ],
  "usage": {
    "completion_tokens": 1671,
    "prompt_tokens": 16,
    "total_tokens": 1687,
    "completion_tokens_details": {
      "accepted_prediction_tokens": null,
      "audio_tokens": null,
      "reasoning_tokens": 512,
      "rejected_prediction_tokens": null,
      "image_tokens": 0
    },
    "prompt_tokens_details": {
      "audio_tokens": null,
      "cached_tokens": 0
    },
    "cost": 0.00737,
    "is_byok": false,
    "cost_details": {
      "upstream_inference_cost": 0.00737,
      "upstream_inference_prompt_cost": 1.76e-05,
      "upstream_inference_completions_cost": 0.0073524
    }
  },
  "provider": "OpenAI"
}
```
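The example above is a non-streaming response. When streaming, as noted in the `Response` schema, the final chunk delivers usage alongside an empty choices array. A sketch for consuming already-parsed chunks (SSE parsing omitted):

```typescript
// Illustrative handler for parsed streaming chunks.
function handleChunk(chunk: Response): void {
  if (chunk.choices.length === 0 && chunk.usage) {
    // Final chunk: usage only, no choices.
    console.log(`total tokens: ${chunk.usage.total_tokens}`);
    return;
  }
  for (const choice of chunk.choices) {
    // Streaming chunks carry StreamingChoice objects with a delta payload.
    if ('delta' in choice && choice.delta.content) {
      process.stdout.write(choice.delta.content);
    }
  }
}
```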
When using models that support reasoning (like the OpenAI o1 and o3 series), the reasoning_tokens field in completion_tokens_details shows how many tokens were used for the model's internal reasoning process. These tokens are counted within completion_tokens and contribute to the overall cost.
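For instance, in the example response above, 512 of the 1,671 completion tokens were spent on reasoning, leaving 1,159 tokens of visible output. A sketch of that arithmetic (the helper name is ours):

```typescript
// Completion tokens that appear in the visible answer, excluding reasoning.
function visibleCompletionTokens(usage: ResponseUsage): number {
  return usage.completion_tokens - (usage.completion_tokens_details?.reasoning_tokens ?? 0);
}
// For the example response: 1671 - 512 = 1159.
```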
Finish Reason
Some models and providers may have additional finish reasons. The raw finish_reason string returned by the model is available via the native_finish_reason property.
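When logging or debugging, it can help to keep both values; a sketch (the helper name is ours):

```typescript
// Prefer the normalized reason; fall back to the provider's raw string.
function describeFinish(choice: NonStreamingChoice): string {
  return choice.finish_reason ?? choice.native_finish_reason ?? 'unknown';
}
```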