LLM Model Selection¶
Configure which AI models power different features across your account.
Overview¶
Optimize for cost, performance, speed, or specific capabilities by selecting different models for different tasks.
Supported Providers:
- OpenAI
  - GPT-5-mini, GPT-4.1-mini, GPT-4.1-nano - Fast, cost-effective models for production use
- Anthropic
  - Claude 4 family - Excellent for nuanced conversations and longer context
- Google
  - Gemini models - Strong analysis with competitive pricing
Availability
Providers and models vary by tier and region.
Model Selection by Task¶
Configure different models for each task type:
- Chat Messages
  - Powers assistant conversations and responses
  - Recommended: GPT-5-mini (all tiers)
- Summary Generation
  - Creates conversation summaries and titles
  - Recommended: GPT-4.1-mini (cost-effective, sufficient quality)
- QA Generation
  - Generates Q&A pairs from knowledge bases for RAG
  - Recommended: GPT-5-mini (high quality) or GPT-4.1-mini (balanced)
- Shared Chat
  - Powers public chatbot widgets and customer-facing interactions
  - Recommended: GPT-5-mini (reliability is critical in production)
- Function Creation
  - Generates Python code for custom functions
  - Recommended: GPT-5-mini (superior code quality)
- Function Test & Fix
  - Debugs and fixes function code
  - Recommended: GPT-5-mini (reliable) or GPT-4.1-mini (faster iterations)
Configuration¶
- Go to Account → LLM Settings
- Select Provider and Model for each task type
- Click Save Changes
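If you script your account setup, the per-task selection above can be modeled as a simple mapping that is validated before saving. The structure below, including the task keys and model identifiers, is an illustrative assumption rather than a documented schema; the UI steps above are the supported path.

```python
# Sketch of a per-task model configuration mirroring the task types on
# this page. The dict layout is hypothetical, for illustration only.

SUPPORTED_MODELS = {
    "openai": {"gpt-5-mini", "gpt-4.1-mini", "gpt-4.1-nano"},
    "anthropic": {"claude-4-sonnet", "claude-4-haiku"},
}

llm_settings = {
    "chat":              ("openai", "gpt-5-mini"),
    "summary":           ("openai", "gpt-4.1-mini"),
    "qa_generation":     ("openai", "gpt-4.1-mini"),
    "shared_chat":       ("openai", "gpt-5-mini"),
    "function_creation": ("openai", "gpt-5-mini"),
    "function_fix":      ("openai", "gpt-4.1-mini"),
}

def validate(settings: dict) -> None:
    """Reject any provider/model pair not in the supported list."""
    for task, (provider, model) in settings.items():
        if model not in SUPPORTED_MODELS.get(provider, set()):
            raise ValueError(f"{task}: {provider}/{model} is not supported")

validate(llm_settings)  # raises ValueError if any pair is invalid
```

Validating before saving mirrors what the settings page does for you: an unsupported pair fails fast instead of silently falling back.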
Test Changes
Test your assistants after changing models to verify quality meets expectations.
Model Comparison¶
OpenAI Models:
| Model | Speed | Cost | Best For |
|---|---|---|---|
| GPT-5-mini | Fast | Low | Production chat, complex reasoning (default) |
| GPT-4.1-mini | Very Fast | Very Low | Summaries, high-volume tasks |
| GPT-4.1-nano | Very Fast | Lowest | Simple tasks, batch processing |
Anthropic Models:
| Model | Speed | Cost | Best For |
|---|---|---|---|
| Claude 4 Sonnet | Fast | Medium | Balanced performance and cost |
| Claude 4 Haiku | Very Fast | Low | Quick responses, simple tasks |
View current pricing in your Billing Dashboard.
Cost Optimization¶
Task-Based Strategy:
Use premium models where they add value:
- Chat: GPT-5-mini (customer-facing)
- Summaries: GPT-4.1-mini (high volume)
- QA Generation: GPT-4.1-mini (batch processing)
- Shared Chat: GPT-5-mini (public-facing)
- Functions: GPT-5-mini (code quality matters)
- Function Fix: GPT-4.1-mini (iterative)
Savings: 30-50% by using GPT-4.1-mini for non-customer-facing tasks
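The 30-50% figure is straightforward arithmetic over per-task token volumes. Here is a sketch with placeholder prices and volumes; every number below is a hypothetical assumption, so check the Billing Dashboard for your real rates.

```python
# Illustrative cost comparison behind the 30-50% savings estimate.
# Prices (USD per 1M tokens) and monthly volumes (millions of tokens)
# are placeholders, not real rates.

PRICE_PER_M = {"gpt-5-mini": 2.00, "gpt-4.1-mini": 0.80}

volume = {"chat": 10, "summary": 8, "qa_generation": 6, "function_fix": 4}

def monthly_cost(assignment: dict) -> float:
    return sum(volume[t] * PRICE_PER_M[m] for t, m in assignment.items())

all_premium = {t: "gpt-5-mini" for t in volume}
mixed = {
    "chat": "gpt-5-mini",          # customer-facing: keep premium
    "summary": "gpt-4.1-mini",     # high volume, low stakes
    "qa_generation": "gpt-4.1-mini",
    "function_fix": "gpt-4.1-mini",
}

saving = 1 - monthly_cost(mixed) / monthly_cost(all_premium)
print(f"savings: {saving:.0%}")
```

With these placeholder numbers the mixed assignment lands at roughly a 39% saving; plug in your own volumes and rates to see where your account falls in the 30-50% band.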
Development vs Production:
- Development: GPT-4.1-mini for all tasks
- Production: Use GPT-5-mini for customer-facing features
Monitor and Adjust:
- Check usage in Billing Dashboard
- Identify high-cost tasks
- Test cheaper alternatives
- Adjust based on results
Trade-offs¶
Use GPT-5-mini when:
- Accuracy is critical (customer support, medical, legal)
- Complex reasoning required
- Brand reputation matters (public-facing)
- User satisfaction is priority
Use GPT-4.1-mini or GPT-4.1-nano when:
- Simple, repetitive tasks
- High volume, low stakes (testing, internal tools)
- Speed is priority
- Cost optimization needed
Advanced: Model-Specific Features¶
Context Windows:
- GPT-5-mini: 128K tokens
- GPT-4.1-mini: 128K tokens
- Claude 4: 200K tokens
Context window size matters for long conversations, large documents, and multi-turn interactions.
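A rough pre-flight check makes the limit concrete. The sketch below approximates tokens as words x 1.3, which is a crude assumption; the provider's real tokenizer will give different counts.

```python
# Rough check that a conversation fits a model's context window,
# using the window sizes listed above. Token counts are approximated
# as word count * 1.3 -- a sketch, not a real tokenizer.

CONTEXT_WINDOW = {
    "gpt-5-mini": 128_000,
    "gpt-4.1-mini": 128_000,
    "claude-4": 200_000,
}

def approx_tokens(text: str) -> int:
    return int(len(text.split()) * 1.3)

def fits(model: str, messages: list[str], reserve_for_reply: int = 4_000) -> bool:
    """True if the messages plus a reply budget fit the model's window."""
    used = sum(approx_tokens(m) for m in messages)
    return used + reserve_for_reply <= CONTEXT_WINDOW[model]
```

For workloads that routinely fail this check on 128K models, the larger Claude 4 window is the natural fallback.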
Model Strengths:
- GPT-5-mini
  - Strong function calling, reliable JSON output, excellent code generation
  - Best for: E-commerce, technical support, customer-facing chat
- Claude 4
  - Longer context, ethical reasoning, nuanced conversations
  - Best for: Customer service, long conversations
Recommended Configuration¶
Default: All tasks use GPT-5-mini (good balance for getting started)
Recommended Starter Config:
- Chat: GPT-5-mini
- Summary: GPT-4.1-mini
- QA Generation: GPT-4.1-mini
- Shared Chat: GPT-5-mini ⭐
- Function Creation: GPT-5-mini ⭐
- Function Test & Fix: GPT-4.1-mini
This starter configuration prioritizes customer-facing quality and code reliability while keeping costs low.
Troubleshooting¶
- Model not available
  - Check that your tier includes access and that the provider is available in your region.
- Responses seem lower quality
  - Verify the selected model, then test with a premium model to compare. Some models handle certain tasks better than others.
- Unexpectedly high costs
  - Review usage in the Billing Dashboard, identify high-cost tasks, and consider downgrading low-value tasks.
- Settings not saving
  - Ensure you clicked "Save Changes", check for errors, then refresh the page and retry.
Best Practices¶
Testing Model Changes:
- Create test assistant with new configuration
- Run test conversations with typical queries
- Compare quality against previous model
- Check response time
- Monitor costs for a few days
- Gather feedback if public-facing
- Adjust based on results
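Steps like these can be scripted. The harness below runs the same queries against two configurations and records latency; `ask()` is a placeholder for your real assistant call, so its outputs and timings are only meaningful once you swap in the actual client.

```python
# Minimal harness for the testing steps above: run identical queries
# against a baseline and a candidate model, then compare side by side.

import time

def ask(model: str, query: str) -> str:
    # Placeholder stand-in for the real assistant API call.
    return f"[{model}] answer to: {query}"

def benchmark(model: str, queries: list[str]) -> dict:
    latencies, answers = [], []
    for q in queries:
        t0 = time.perf_counter()
        answers.append(ask(model, q))
        latencies.append(time.perf_counter() - t0)
    return {
        "model": model,
        "avg_latency_s": sum(latencies) / len(latencies),
        "answers": answers,
    }

queries = ["Where is my order?", "How do I reset my password?"]
baseline = benchmark("gpt-5-mini", queries)
candidate = benchmark("gpt-4.1-mini", queries)
# Review the paired answers manually before changing the production setting.
```

Quality comparison stays a human judgment here; the script just makes sure both models see identical inputs.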
Monthly Reviews:
- Review usage in Billing Dashboard
- Identify high-cost tasks
- Test cheaper alternatives
- Check for new models
- Adjust based on metrics
FAQ¶
- Can I use different models for different assistants?
  - No. Model selection is account-wide, so all assistants use your configured models.
- How quickly do changes take effect?
  - Immediately for new conversations. Existing conversations continue with their current model.
- Does model selection affect tier limits?
  - No. Tier limits are independent of model selection, though cost per message varies by model.
- What happens if a model is deprecated?
  - You'll receive advance notice, and your configuration will auto-update to a comparable model.
Related Documentation¶
- Billing & Pricing - Understand costs per model
- Assistants - Create and configure AI assistants
- Knowledge Bases - How QA generation works
- Templates & Functions - Custom function creation