LLM Model Selection¶
Configure which AI models power different features across your account.
Overview¶
Optimize for cost, performance, speed, or specific capabilities by selecting different models for different tasks.
Supported Providers:

- OpenAI
    - GPT-4, GPT-4o: best for complex reasoning and reliable output
- Anthropic
    - Claude 3 family: excellent for nuanced conversations and longer context
- Google
    - Gemini models: strong analysis with competitive pricing
Availability: Providers and models vary by tier and region.
Model Selection by Task¶
Configure a different model for each task type (see the configuration sketch after this list):

- Chat Messages: powers assistant conversations and responses. Recommended: GPT-4o or Claude 3 (Pro); GPT-4o-mini (Basic/Free).
- Summary Generation: creates conversation summaries and titles. Recommended: GPT-4o-mini (cost-effective, sufficient quality).
- QA Generation: generates Q&A pairs from knowledge bases for RAG. Recommended: GPT-4o (highest quality) or GPT-4o-mini (balanced).
- Shared Chat: powers public chatbot widgets and customer-facing interactions. Recommended: GPT-4o or Claude 3 (reliability is critical in production).
- Function Creation: generates Python code for custom functions. Recommended: GPT-4o (superior code quality).
- Function Test & Fix: debugs and fixes function code. Recommended: GPT-4o (most reliable) or GPT-4o-mini (faster iterations).
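To make the per-task mapping concrete, here is a minimal sketch of that configuration expressed as data. The schema, key names, and the `TASK_MODELS` structure are illustrative assumptions, not the product's actual settings format; the real settings live in the UI described in the next section.

```python
# Hypothetical per-task model configuration (illustrative only; the real
# settings live in Account -> LLM Settings, not in code).
TASK_MODELS = {
    "chat":            {"provider": "openai", "model": "gpt-4o"},
    "summary":         {"provider": "openai", "model": "gpt-4o-mini"},
    "qa_generation":   {"provider": "openai", "model": "gpt-4o-mini"},
    "shared_chat":     {"provider": "anthropic", "model": "claude-3-sonnet"},
    "function_create": {"provider": "openai", "model": "gpt-4o"},
    "function_fix":    {"provider": "openai", "model": "gpt-4o-mini"},
}

def model_for(task: str) -> str:
    """Return the configured model name for a task type."""
    return TASK_MODELS[task]["model"]

print(model_for("summary"))  # -> gpt-4o-mini
```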
Configuration¶
1. Go to Account → LLM Settings
2. Select a provider and model for each task type
3. Click Save Changes
Test changes: after switching models, test your assistants to verify that quality meets expectations.
Model Comparison¶
OpenAI Models:
| Model | Speed | Cost | Best For |
|---|---|---|---|
| GPT-4o | Fast | Medium | Production chat, complex reasoning |
| GPT-4o-mini | Very Fast | Low | Summaries, high-volume tasks |
| GPT-4 Turbo | Medium | High | Complex analysis, critical tasks |
Anthropic Models:
| Model | Speed | Cost | Best For |
|---|---|---|---|
| Claude 3 Opus | Medium | High | Sophisticated reasoning, long conversations |
| Claude 3 Sonnet | Fast | Medium | Balanced performance and cost |
| Claude 3 Haiku | Very Fast | Low | Quick responses, simple tasks |
View current pricing in your Billing Dashboard.
Cost Optimization¶
Task-Based Strategy:

Match the model to the task, reserving premium models for the places they add value:

- Chat: GPT-4o (customer-facing)
- Summaries: GPT-4o-mini (high volume)
- QA Generation: GPT-4o-mini (batch processing)
- Shared Chat: GPT-4o (public-facing)
- Function Creation: GPT-4o (code quality matters)
- Function Test & Fix: GPT-4o-mini (iterative)
Typical savings: 40-60% compared with using premium models for every task (see the worked example below).
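As a rough illustration of where that range comes from, the sketch below compares a premium-everywhere setup against the mixed strategy above. The per-million-token prices and monthly volumes are invented placeholders, not actual rates; check your Billing Dashboard for real pricing.

```python
# Illustrative cost comparison. Prices and volumes below are made-up
# placeholders; substitute real numbers from your Billing Dashboard.
PRICE_PER_M_TOKENS = {"gpt-4o": 10.00, "gpt-4o-mini": 1.00}  # assumed USD

# Assumed monthly token volume per task, in millions of tokens.
monthly_volume = {
    "chat": 12, "summaries": 25, "qa_generation": 20,
    "shared_chat": 12, "function_create": 2, "function_fix": 4,
}
mixed_strategy = {
    "chat": "gpt-4o", "summaries": "gpt-4o-mini", "qa_generation": "gpt-4o-mini",
    "shared_chat": "gpt-4o", "function_create": "gpt-4o", "function_fix": "gpt-4o-mini",
}

def monthly_cost(strategy):
    return sum(monthly_volume[t] * PRICE_PER_M_TOKENS[m] for t, m in strategy.items())

premium = monthly_cost({task: "gpt-4o" for task in monthly_volume})
mixed = monthly_cost(mixed_strategy)
print(f"premium: ${premium:.2f}, mixed: ${mixed:.2f}, "
      f"savings: {100 * (1 - mixed / premium):.0f}%")
```

With these placeholder numbers the mixed strategy saves about 59%, near the top of the cited range; actual savings depend on your task volumes and real per-token prices.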
Development vs Production:
- Development: GPT-4o-mini for all tasks
- Production: Upgrade customer-facing features to GPT-4o
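If your team scripts its own model calls around separate development and production environments, one generic way to implement this split is to key the choice off an environment variable. This is a pattern sketch under that assumption (the `APP_ENV` name is invented), not a feature of the account-wide settings described here.

```python
import os

# Generic pattern, not a product feature: pick models by environment.
# Assumes your deployment sets APP_ENV to "production" in prod.
IS_PROD = os.getenv("APP_ENV") == "production"

CHAT_MODEL = "gpt-4o" if IS_PROD else "gpt-4o-mini"  # premium only in prod
SUMMARY_MODEL = "gpt-4o-mini"                        # cheap everywhere

print(f"chat={CHAT_MODEL}, summary={SUMMARY_MODEL}")
```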
Monitor and Adjust:
- Check usage in Billing Dashboard
- Identify high-cost tasks
- Test cheaper alternatives
- Adjust based on results
Trade-offs¶
Use Premium Models (GPT-4o, Claude 3 Opus) when:

- Accuracy is critical (customer support, medical, legal)
- Complex reasoning is required
- Brand reputation matters (public-facing)
- User satisfaction is the priority

Use Budget Models (GPT-4o-mini, Claude 3 Haiku) when:

- The task is simple and repetitive
- Volume is high and stakes are low (testing, internal tools)
- Speed is the priority
- Costs need to be kept down
Advanced: Model-Specific Features¶
Context Windows:

- GPT-4o: 128K tokens
- Claude 3: 200K tokens
- GPT-4o-mini: 128K tokens

Context window size matters for long conversations, large documents, and multi-turn interactions; see the token-counting sketch below.
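To check whether a document fits a model's context window before sending it, you can count tokens locally. The sketch below uses the `tiktoken` library (which covers OpenAI models and, in recent versions, knows the gpt-4o encoding); the 128K limit and the headroom figure are rough working assumptions.

```python
import tiktoken  # pip install tiktoken

GPT_4O_CONTEXT = 128_000  # tokens, per the table above

def fits_in_context(text: str, limit: int = GPT_4O_CONTEXT) -> bool:
    """Roughly check whether `text` fits in GPT-4o's context window,
    leaving headroom for the system prompt and the model's reply."""
    enc = tiktoken.encoding_for_model("gpt-4o")
    n_tokens = len(enc.encode(text))
    headroom = 4_000  # assumed budget for prompt scaffolding + response
    return n_tokens + headroom <= limit

print(fits_in_context("A short test document."))  # True
```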
Model Strengths:

- GPT-4o: strong function calling, reliable JSON output (see the sketch below), and excellent code generation. Best for e-commerce and technical support.
- Claude 3: longer context, ethical reasoning, and nuanced conversations. Best for customer service and long conversations.
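As an illustration of the "reliable JSON output" point, the OpenAI API offers a JSON response mode. This sketch uses the official `openai` Python SDK and assumes an `OPENAI_API_KEY` environment variable; it is a generic API example, not a feature of this product's settings page.

```python
from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY set

client = OpenAI()

# response_format forces the model to emit valid JSON; the word "JSON"
# must appear somewhere in the messages for this mode to be accepted.
resp = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system",
         "content": "Reply in JSON with keys 'sentiment' and 'confidence'."},
        {"role": "user", "content": "I love this product!"},
    ],
)
print(resp.choices[0].message.content)  # e.g. {"sentiment": "positive", ...}
```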
Recommended Configuration¶
Default: All tasks use GPT-4o-mini (good balance for getting started)
Recommended Starter Config:
- Chat: GPT-4o-mini
- Summary: GPT-4o-mini
- QA Generation: GPT-4o-mini
- Shared Chat: GPT-4o ⭐
- Function Creation: GPT-4o ⭐
- Function Test & Fix: GPT-4o-mini
This starter configuration prioritizes customer-facing quality and code reliability while keeping costs low.
Troubleshooting¶
- Model not available: check that your tier includes access to the model and that the provider is available in your region.
- Responses seem lower quality: verify which model is selected and compare against a premium model; some models handle certain tasks better than others.
- Unexpectedly high costs: review usage in the Billing Dashboard, identify high-cost tasks, and consider downgrading low-value tasks.
- Settings not saving: ensure you clicked "Save Changes", check for error messages, then refresh the page and retry.
Best Practices¶
Testing Model Changes:

1. Create a test assistant with the new configuration
2. Run test conversations with typical queries (see the comparison sketch after this list)
3. Compare quality against the previous model
4. Check response time
5. Monitor costs for a few days
6. Gather user feedback if the assistant is public-facing
7. Adjust based on results
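For a quick side-by-side check outside the product, you can send the same prompt to two candidate models and compare answers and latency. This sketch uses the official `openai` Python SDK and assumes `OPENAI_API_KEY` is set; judging answer quality is still a manual step.

```python
import time
from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY set

client = OpenAI()
PROMPT = "Summarize our refund policy in two sentences."  # a typical query

for model in ("gpt-4o", "gpt-4o-mini"):
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
    )
    elapsed = time.perf_counter() - start
    print(f"--- {model} ({elapsed:.2f}s) ---")
    print(resp.choices[0].message.content)
```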
Monthly Reviews:
- Review usage in Billing Dashboard
- Identify high-cost tasks
- Test cheaper alternatives
- Check for new models
- Adjust based on metrics
FAQ¶
- Can I use different models for different assistants? No; model selection is account-wide, and all assistants use your configured models.
- How quickly do changes take effect? Immediately for new conversations; existing conversations continue with their current model.
- Does model selection affect tier limits? No; tier limits are independent of model selection, though the cost per message varies by model.
- What happens if a model is deprecated? You'll receive advance notice, and your configuration will auto-update to a comparable model.
Related Documentation¶
- Billing & Pricing - Understand costs per model
- Assistants - Create and configure AI assistants
- Knowledge Bases - How QA generation works
- Templates & Functions - Custom function creation