Choose the right model for your use case — from automatic smart routing to high-performance specialized models.
Tokenthon gives you access to OpenAI's latest GPT models with intelligent fallbacks and transparent pricing. Each model is optimized for different scenarios, ensuring you get the best performance and value for your specific needs.
gpt-auto provides intelligent model selection, automatically choosing the best available model based on:
- Current service availability
- System load and demand
- Request complexity
gpt-5-mini is optimized for speed and efficiency without compromising quality:
- Fast response times
- Cost-effective processing
- High availability
gpt-5 is a high-performance model with advanced reasoning capabilities for complex tasks:
- Advanced reasoning & analysis
- Higher accuracy & coherence
- Complex problem solving
gpt-5.2 is our latest-generation model with cutting-edge reasoning and breakthrough creative capabilities:
- Superior reasoning & analysis
- Advanced creative writing
- Expanded context understanding
All Tokenthon models now support a comprehensive multi-role message system that enables rich, structured conversations:
Our API supports four distinct message roles: system, developer, assistant, and user. This enables more sophisticated conversation flows and better control over AI behavior.
- system: High-level instructions that set the overall behavior and personality of the AI. Typically appears at the start of the conversation.
- developer: Developer-provided instructions for task-specific guidance, formatting requirements, or implementation details.
- assistant: AI-generated responses from the model. Can be included to maintain conversation history and context.
- user: End-user messages and queries. Every conversation must contain at least one user message.
All models (gpt-auto, gpt-5, gpt-5.2, and gpt-5-mini) fully support all four roles, giving you maximum flexibility in designing your conversations.
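As a sketch, here is what a request body using all four roles might look like, following the `/api/v1/messages` request shape used elsewhere in this document. The specific instructions and the ordering conventions (system first, developer second) are illustrative assumptions, not API requirements beyond the stated rule that at least one user message is required.

```typescript
// Illustrative request body exercising all four message roles.
// Content strings and ordering are examples, not API requirements.
const body = {
  model: "gpt-auto",
  messages: [
    { role: "system", content: "You are a concise technical assistant." },
    { role: "developer", content: "Answer in Markdown, under 200 words." },
    { role: "user", content: "Explain what a JWT is." },
    { role: "assistant", content: "A JWT is a signed, URL-safe token..." },
    // At least one user message must be present in every conversation.
    { role: "user", content: "How do I verify one?" }
  ],
  response_format: { format: "text" }
};
console.log(JSON.stringify(body, null, 2));
```

The same body can be passed to `fetch` exactly as in the TypeScript example later in this page.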
Technical Specifications
| Feature | gpt-auto | gpt-5 / gpt-5.2 | gpt-5-mini |
|---|---|---|---|
| Model Selection | Automatic | Manual | Manual |
| Response Quality | Adaptive | Premium | Standard |
| Response Speed | Variable | Standard | Fast |
| Availability | High | Limited during high demand | High |
| Fallback Support | Automatic (to gpt-5-mini) | None | None |
Select the model that best fits your specific requirements. When in doubt, start with gpt-auto for optimal performance and reliability.
Use gpt-auto for:
- Production applications requiring high availability
- Variable workloads with different complexity levels
- Maximum reliability with automatic failover
- Cost optimization with smart model selection
Use gpt-5/gpt-5.2 for:
- Complex reasoning and analytical tasks
- Research and academic applications
- Code generation with complex logic
- Creative writing and content creation
Use gpt-5-mini for:
- Quick responses and real-time applications
- Simple classification and extraction tasks
- High-volume processing with budget constraints
- Chatbots and basic conversational AI
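One way to encode the guidance above is a small default-model picker. The task categories here are illustrative assumptions for this sketch; they are not part of the Tokenthon API.

```typescript
// Sketch: map a rough task category to a default model, following the
// guidance above. Task names are illustrative, not API concepts.
type Task = "chat" | "classification" | "reasoning" | "creative" | "general";

function defaultModel(task: Task): string {
  switch (task) {
    case "chat":
    case "classification":
      return "gpt-5-mini"; // fast, budget-friendly, high volume
    case "reasoning":
    case "creative":
      return "gpt-5"; // premium quality for complex work
    default:
      return "gpt-auto"; // smart routing when the workload varies
  }
}
```

Starting with gpt-auto and only pinning a specific model once you know your workload keeps the automatic failover described below working for you.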
Using Models in Your API Requests
TypeScript Example
```typescript
const response = await fetch("https://api.tokenthon.com/api/v1/messages", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": "<YOUR_API_KEY>"
  },
  body: JSON.stringify({
    model: "gpt-auto",
    messages: [
      { role: "user", content: "Write a bedtime story about a unicorn." }
    ],
    response_format: { format: "text" }
  })
});
const data = await response.json();
console.log(data);
```

cURL Example
```bash
curl -X POST "https://api.tokenthon.com/api/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: <YOUR_API_KEY>" \
  -d '{
    "model": "gpt-auto",
    "messages": [{
      "role": "user",
      "content": "Write a bedtime story about a unicorn."
    }],
    "response_format": { "format": "text" }
  }'
```

Available Models
"model": "gpt-auto""model": "gpt-5""model": "gpt-5.2""model": "gpt-5-mini"Simply replace "model": "gpt-auto" with your desired model
name in the examples above.
During high usage periods, gpt-5 may experience temporary capacity
limits. The gpt-auto model automatically handles these situations by
seamlessly falling back to gpt-5-mini to ensure your requests are
always processed.
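If you pin gpt-5 directly instead of using gpt-auto, you can reproduce a simple version of this fallback on the client side. This is a minimal sketch: the HTTP status codes treated as "at capacity" (429 and 503) are assumptions about how the API signals capacity limits, not documented behavior.

```typescript
// Sketch: client-side fallback from gpt-5 to gpt-5-mini when the primary
// model is at capacity. gpt-auto does this for you server-side; the
// 429/503 capacity signals below are assumptions for illustration.
async function sendWithFallback(
  send: (model: string) => Promise<Response>,
  primary = "gpt-5",
  fallback = "gpt-5-mini"
): Promise<Response> {
  const res = await send(primary);
  // Retry on capacity-style errors only; other failures go to the caller.
  if (res.status === 429 || res.status === 503) {
    return send(fallback);
  }
  return res;
}
```

The `send` callback would wrap the `fetch` call from the TypeScript example above, substituting the model name into the request body.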