Tokenthon provides a compatibility layer that enables you to use the standard OpenAI SDK to access our powerful AI models. With minimal code changes, you can quickly integrate Tokenthon's cost-effective AI capabilities into your existing applications.
This compatibility layer is primarily intended for seamless integration with existing OpenAI-based applications. While we maintain the core OpenAI interface, some advanced features have limitations. For access to Tokenthon's full feature set, consider using our native API.
Using Tokenthon's OpenAI-compatible API can save you 38-90% on AI costs compared to direct OpenAI billing. The same models, at a fraction of the price.
If you encounter any issues with the OpenAI SDK compatibility feature, please let us know through our support channels. We continuously improve this compatibility layer based on user feedback.
To use the OpenAI SDK compatibility feature, you'll need to:
- Use an official OpenAI SDK (or any compatible HTTP client)
- Update your base URL to point to Tokenthon's API
- Replace your API key with a Tokenthon API key
- Use Tokenthon model names (e.g., `gpt-auto`, `gpt-5-mini`, `gpt-5`, `gpt-5.2`)
- Review the documentation below for what features are supported
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_TOKENTHON_API_KEY",
    base_url="https://api.tokenthon.com/v1/"
)

response = client.chat.completions.create(
    model="gpt-auto",  # Tokenthon auto-selects the best model
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"}
    ],
)

print(response.choices[0].message.content)
```

Tokenthon supports the following model names through the OpenAI compatibility layer:
| Model Name | Description | Tier |
|---|---|---|
| gpt-auto | Automatically selects the best available model | Free & Premium |
| gpt-5-mini | Lightweight, fast model for simple tasks | Free & Premium |
| gpt-5 | Premium model for complex tasks | Premium |
| gpt-5.2 | Advanced model with enhanced capabilities | Premium |
Premium models (gpt-5 and gpt-5.2) require a paid subscription plan. Free tier users can use gpt-auto and gpt-5-mini.
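If your code may run under either tier, a small client-side guard keeps premium model names from reaching the API on a free plan. This is a sketch: the helper name `pick_model` and the choice of `gpt-auto` as the fallback are our own, not part of the SDK or the Tokenthon API.

```python
# Model tiers from the table above. pick_model is a hypothetical helper,
# not part of the OpenAI SDK or the Tokenthon API.
PREMIUM_MODELS = {"gpt-5", "gpt-5.2"}

def pick_model(requested: str, has_premium_plan: bool) -> str:
    """Fall back to gpt-auto when a premium model is requested on a free plan."""
    if requested in PREMIUM_MODELS and not has_premium_plan:
        return "gpt-auto"  # available on the free tier
    return requested

print(pick_model("gpt-5", has_premium_plan=False))      # gpt-auto
print(pick_model("gpt-5-mini", has_premium_plan=False)) # gpt-5-mini
```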
Here are the most substantial differences from using the standard OpenAI API:
- Streaming is not supported - The `stream` parameter is accepted but not implemented. All responses are returned as complete messages.
- Tool calling is not supported - Tool messages (`role: "tool"`) and function calling are filtered out and not processed.
- Most parameters are ignored - Parameters like `temperature`, `top_p`, `stop`, `presence_penalty`, `frequency_penalty`, and others are accepted by the schema but not used in the actual processing.
- Response format validation - JSON schema validation is performed by the model but not strictly enforced by Tokenthon.
- Multiple choices not supported - The `n` parameter must be exactly `1`.
Most unsupported fields are silently ignored rather than producing errors. These are all documented below.
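Because unsupported fields are silently ignored, it can be useful to flag them client-side before sending a request. A minimal sketch under those rules; the function name, warning text, and the exact parameter list checked here are our own:

```python
# Roles and parameters Tokenthon filters or ignores, per the list above.
UNSUPPORTED_ROLES = {"tool", "function"}
IGNORED_PARAMS = {"temperature", "top_p", "stop",
                  "presence_penalty", "frequency_penalty"}

def prepare_request(params: dict) -> dict:
    """Drop filtered-out messages and surface silently ignored parameters."""
    if params.get("n", 1) != 1:
        raise ValueError("Tokenthon requires n == 1")
    cleaned = dict(params)
    cleaned["messages"] = [
        m for m in params.get("messages", [])
        if m.get("role") not in UNSUPPORTED_ROLES
    ]
    for name in sorted(IGNORED_PARAMS & params.keys()):
        print(f"note: '{name}' is accepted but ignored by Tokenthon")
    return cleaned
```

You could then pass the result to `client.chat.completions.create(**prepared)` as usual.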
If you've done lots of tweaking to your prompt, it's likely to be well-tuned to OpenAI specifically. You may need to adjust your prompts slightly when migrating to Tokenthon, especially if you rely on features that are not supported in our compatibility layer.
Tokenthon's compatibility layer supports both system and developer roles. However, our internal API uses a slightly different role system:
- OpenAI `system` messages are mapped to Tokenthon's `developer` role
- OpenAI `developer` messages are mapped to Tokenthon's `developer` role
- `user` and `assistant` roles remain unchanged
This mapping is handled automatically, so you don't need to change your message structure.
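Conceptually, the translation looks like the sketch below. It happens server-side, so you never call anything like this yourself; it is shown only to make the mapping concrete:

```python
# How the compatibility layer maps OpenAI roles to Tokenthon roles
# (illustrative only; the real mapping runs on the server).
ROLE_MAP = {
    "system": "developer",     # OpenAI system -> Tokenthon developer
    "developer": "developer",  # OpenAI developer -> Tokenthon developer
    "user": "user",            # unchanged
    "assistant": "assistant",  # unchanged
}

print(ROLE_MAP["system"])  # developer
```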
Rate limits follow Tokenthon's standard limits for the /v1/chat/completions endpoint. Please refer to your account dashboard or contact support for specific rate limit details.
| Field | Support Status | Notes |
|---|---|---|
| model | Fully supported | Use Tokenthon model names (gpt-auto, gpt-5-mini, gpt-5, gpt-5.2) |
| max_tokens | Accepted | Validated but not passed to underlying model |
| max_completion_tokens | Accepted | Validated but not passed to underlying model |
| stream | Accepted | Not implemented - always returns complete response |
| stream_options | Accepted | Ignored |
| top_p | Accepted | Ignored - only affects sampling in original OpenAI API |
| parallel_tool_calls | Accepted | Ignored - tool calling not supported |
| stop | Accepted | Ignored |
| temperature | Accepted | Validated (must be between 0 and 2) but ignored |
| n | Accepted | Must be exactly 1 - multiple choices not supported |
| logprobs | Accepted | Ignored |
| metadata | Accepted | Ignored |
| response_format | Fully supported | Supports text, json_object, and json_schema |
| prediction | Accepted | Ignored |
| presence_penalty | Accepted | Ignored |
| frequency_penalty | Accepted | Ignored |
| seed | Accepted | Ignored |
| service_tier | Accepted | Ignored |
| audio | Accepted | Ignored - audio input not supported |
| logit_bias | Accepted | Ignored |
| store | Accepted | Ignored |
| user | Accepted | Ignored |
| modalities | Accepted | Ignored |
| top_logprobs | Accepted | Ignored |
| reasoning_effort | Accepted | Ignored |
Tool calling and function calling are not supported in the current version of the OpenAI compatibility layer. All tool-related parameters are ignored.
Tools (tools[n].function fields)
| Field | Support Status |
|---|---|
| name | Accepted but ignored |
| description | Accepted but ignored |
| parameters | Accepted but ignored |
| strict | Accepted but ignored |
Functions (functions[n] fields)
| Field | Support Status |
|---|---|
| name | Accepted but ignored |
| description | Accepted but ignored |
| parameters | Accepted but ignored |
| strict | Accepted but ignored |
Developer Role
Fields for messages[n].role == "developer"
| Field | Support Status | Notes |
|---|---|---|
| content | Fully supported | Mapped to Tokenthon's developer role |
| name | Accepted | Ignored |
System Role
Fields for messages[n].role == "system"
| Field | Support Status | Notes |
|---|---|---|
| content | Fully supported | Mapped to Tokenthon's developer role |
| name | Accepted | Ignored |
User Role
Fields for messages[n].role == "user"
| Field | Variant | Sub-field | Support Status |
|---|---|---|---|
| content | string | | Fully supported |
| content | array, type == "text" | | Fully supported |
| content | array, type == "image_url" | url | Fully supported |
| content | array, type == "image_url" | detail | Accepted but ignored |
| content | array, type == "input_audio" | | Ignored - audio not supported |
| content | array, type == "file" | | Ignored |
| name | | | Accepted but ignored |
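Putting the variants above together, a user message mixing a text part and an image part looks like this. The URL is a placeholder, and `detail` is included only to note that Tokenthon accepts it but ignores it:

```python
# A user message combining the supported content variants.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/photo.jpg",  # fully supported
                "detail": "high",                        # accepted but ignored
            },
        },
    ],
}

print([part["type"] for part in message["content"]])  # ['text', 'image_url']
```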
Assistant Role
Fields for messages[n].role == "assistant"
| Field | Variant | Support Status |
|---|---|---|
| content | string | Fully supported |
| content | array, type == "text" | Fully supported |
| content | array, type == "refusal" | Accepted but ignored |
| tool_calls | | Accepted but ignored - tool calling not supported |
| function_call | | Accepted but ignored - function calling not supported |
| audio | | Ignored - audio not supported |
| refusal | | Ignored |
Tool Role
Fields for messages[n].role == "tool"
| Field | Variant | Support Status |
|---|---|---|
| content | string | Filtered out - tool messages not supported |
| content | array, type == "text" | Filtered out - tool messages not supported |
| tool_call_id | | Filtered out - tool messages not supported |
| tool_choice | | Filtered out - tool messages not supported |
| name | | Filtered out - tool messages not supported |
Function Role
Fields for messages[n].role == "function"
| Field | Variant | Support Status |
|---|---|---|
| content | string | Filtered out - function messages not supported |
| content | array, type == "text" | Filtered out - function messages not supported |
| tool_choice | | Filtered out - function messages not supported |
| name | | Filtered out - function messages not supported |
| Field | Support Status | Notes |
|---|---|---|
| id | Fully supported | Format: chatcmpl-{jobId} |
| choices[] | Fully supported | Always has a length of 1 |
| choices[].finish_reason | Fully supported | Values: stop, length, content_filter |
| choices[].index | Fully supported | Always 0 |
| choices[].message.role | Fully supported | Always assistant |
| choices[].message.content | Fully supported | The assistant's response |
| choices[].message.tool_calls | Always empty | Tool calling not supported |
| choices[].message.refusal | Always empty | Not supported |
| choices[].message.audio | Always empty | Audio not supported |
| object | Fully supported | Always chat.completion |
| created | Fully supported | Unix timestamp |
| model | Fully supported | The requested model name |
| finish_reason | Fully supported | Same as choices[].finish_reason |
| content | Fully supported | Same as choices[].message.content |
| usage.completion_tokens | Fully supported | Estimated token count |
| usage.prompt_tokens | Fully supported | Estimated token count |
| usage.total_tokens | Fully supported | Sum of prompt and completion tokens |
| usage.completion_tokens_details | Always empty | Not supported |
| usage.prompt_tokens_details | Always empty | Not supported |
| logprobs | Always empty | Not supported |
| service_tier | Always empty | Not supported |
| system_fingerprint | Always empty | Not supported |
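The shape described above can be illustrated with a hand-written sample. All values here are made up for illustration; only the field layout follows the table:

```python
# A sample response dict shaped like the fields in the table above.
sample = {
    "id": "chatcmpl-abc123",  # chatcmpl-{jobId}
    "object": "chat.completion",
    "model": "gpt-auto",
    "choices": [{
        "index": 0,  # always 0
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "Hello!"},
    }],
    "usage": {"prompt_tokens": 12, "completion_tokens": 5, "total_tokens": 17},
}

usage = sample["usage"]
# total_tokens is the sum of prompt and completion tokens
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(sample["choices"][0]["message"]["content"])  # Hello!
```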
The compatibility layer maintains consistent error formats with the OpenAI API. Error responses include:
```json
{
  "error": {
    "message": "Error description here",
    "type": "error_type",
    "param": null,
    "code": "status_code_string"
  }
}
```

Error types are mapped based on HTTP status codes:
| Status Code | Error Type |
|---|---|
| 400 | invalid_request_error |
| 401 | invalid_request_error |
| 403 | permission_error |
| 404 | not_found_error |
| 408 | timeout_error |
| 429 | rate_limit_error |
| 500 | api_error |
| 502 | api_error |
| 503 | service_unavailable_error |
| 504 | timeout_error |
The detailed error messages will not be equivalent to OpenAI's. We recommend only using the error messages for logging and debugging.
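For logging, the status-code table above can be mirrored client-side. The entries below come straight from that table; the `api_error` fallback for unlisted codes is our assumption, not documented behavior:

```python
# Status-code -> error-type mapping from the table above.
ERROR_TYPES = {
    400: "invalid_request_error",
    401: "invalid_request_error",
    403: "permission_error",
    404: "not_found_error",
    408: "timeout_error",
    429: "rate_limit_error",
    500: "api_error",
    502: "api_error",
    503: "service_unavailable_error",
    504: "timeout_error",
}

def error_type(status_code: int) -> str:
    """Look up the error type; 'api_error' as a default is our guess."""
    return ERROR_TYPES.get(status_code, "api_error")

print(error_type(429))  # rate_limit_error
```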
While the OpenAI SDK automatically manages headers, here is the complete list of headers supported by the Tokenthon API for developers who need to work with them directly.
| Header | Support Status | Notes |
|---|---|---|
authorization | Fully supported | Bearer token authentication |
content-type | Fully supported | Must be application/json |
x-request-id | Supported | Request tracking identifier |
Note: Tokenthon does not return rate limit headers in the current implementation. Please monitor your usage through the dashboard.
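If you bypass the SDK, you can set these headers yourself. A standard-library sketch that builds (but does not send) a request; the API key and request-id values are placeholders:

```python
import json
import urllib.request

body = json.dumps({
    "model": "gpt-auto",
    "messages": [{"role": "user", "content": "Hello"}],
}).encode("utf-8")

req = urllib.request.Request(
    "https://api.tokenthon.com/v1/chat/completions",
    data=body,
    headers={
        "Authorization": "Bearer YOUR_TOKENTHON_API_KEY",  # placeholder key
        "Content-Type": "application/json",                # required
        "X-Request-Id": "trace-001",                       # optional tracking id
    },
    method="POST",
)

# urllib.request.urlopen(req) would send it; here we only inspect the request.
print(req.method, req.get_header("Authorization")[:6])  # POST Bearer
```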
```python
response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that responds in JSON."},
        {"role": "user", "content": "Extract the name and age from this text: John is 25 years old."}
    ],
    response_format={"type": "json_object"}
)

print(response.choices[0].message.content)
```

```python
response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Extract structured data from this text: The meeting is on January 15, 2024 at 3 PM."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "event_extraction",
            "schema": {
                "type": "object",
                "properties": {
                    "date": {"type": "string"},
                    "time": {"type": "string"}
                },
                "required": ["date", "time"]
            }
        }
    }
)

print(response.choices[0].message.content)
```

```python
response = client.chat.completions.create(
    model="gpt-auto",
    messages=[
        {"role": "system", "content": "You are a helpful customer service agent."},
        {"role": "user", "content": "I have a question about my order."},
        {"role": "assistant", "content": "I'd be happy to help! What's your order number?"},
        {"role": "user", "content": "Order #12345"}
    ]
)

print(response.choices[0].message.content)
```