
LLM Model Selection

Configure which AI models power different features across your account.

Overview

Optimize for cost, performance, speed, or specific capabilities by selecting different models for different tasks.

Supported Providers:

OpenAI
GPT-4, GPT-4o - Best for complex reasoning and reliable output
Anthropic
Claude 3 family - Excellent for nuanced conversations and longer context
Google
Gemini models - Strong analysis with competitive pricing

Availability

Providers and models vary by tier and region.

Model Selection by Task

Configure different models for each task type:

Chat Messages
Powers assistant conversations and responses
Recommended: GPT-4o or Claude 3 (Pro), GPT-4o-mini (Basic/Free)
Summary Generation
Creates conversation summaries and titles
Recommended: GPT-4o-mini (cost-effective, sufficient quality)
QA Generation
Generates Q&A pairs from knowledge bases for RAG
Recommended: GPT-4o (high quality) or GPT-4o-mini (balanced cost and quality)
Shared Chat
Powers public chatbot widgets and customer-facing interactions
Recommended: GPT-4o or Claude 3 (reliability critical for production)
Function Creation
Generates Python code for custom functions
Recommended: GPT-4o (superior code quality)
Function Test & Fix
Debugs and fixes function code
Recommended: GPT-4o (reliable), GPT-4o-mini (faster iterations)

Configuration

  1. Go to Account → LLM Settings
  2. Select Provider and Model for each task type
  3. Click Save Changes
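
If you maintain several accounts or environments, you may want to script the same change. The sketch below is hypothetical: the endpoint, payload keys, and auth header are illustrative assumptions, since the only documented way to change models is the settings page above.

```python
# Hypothetical sketch only: the endpoint, payload shape, and auth scheme
# are assumptions for illustration. The documented way to change models
# is the Account -> LLM Settings page.
import requests

API_BASE = "https://api.example.com/v1"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"                 # hypothetical credential

payload = {
    "chat": {"provider": "openai", "model": "gpt-4o"},
    "summary": {"provider": "openai", "model": "gpt-4o-mini"},
    "shared_chat": {"provider": "anthropic", "model": "claude-3-sonnet"},
}

resp = requests.put(
    f"{API_BASE}/account/llm-settings",  # hypothetical endpoint
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
```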

Test Changes

Test your assistants after changing models to verify quality meets expectations.

Model Comparison

OpenAI Models:

Model         Speed      Cost    Best For
GPT-4o        Fast       Medium  Production chat, complex reasoning
GPT-4o-mini   Very Fast  Low     Summaries, high-volume tasks
GPT-4 Turbo   Medium     High    Complex analysis, critical tasks

Anthropic Models:

Model            Speed      Cost    Best For
Claude 3 Opus    Medium     High    Sophisticated reasoning, long conversations
Claude 3 Sonnet  Fast       Medium  Balanced performance and cost
Claude 3 Haiku   Very Fast  Low     Quick responses, simple tasks

View current pricing in your Billing Dashboard.

Cost Optimization

Task-Based Strategy:

Use premium models where they add value and budget models everywhere else:

  • Chat: GPT-4o (customer-facing)
  • Summaries: GPT-4o-mini (high volume)
  • QA Generation: GPT-4o-mini (batch processing)
  • Shared Chat: GPT-4o (public-facing)
  • Functions: GPT-4o (code quality matters)
  • Function Fix: GPT-4o-mini (iterative)

Typical savings: 40-60% compared with running premium models for every task.
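
A back-of-envelope check of that range, as a sketch: the per-token prices and monthly volumes are made-up placeholders (input tokens only), so substitute real figures from your Billing Dashboard.

```python
# Back-of-envelope check of the savings claim. Prices are illustrative
# (input tokens only) and volumes are made up -- substitute real numbers
# from your Billing Dashboard.
PRICE_PER_1M_INPUT = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}  # USD, illustrative

VOLUME_M_TOKENS = {  # monthly input tokens per task, in millions (illustrative)
    "chat": 40, "summary": 50, "qa": 40,
    "shared_chat": 15, "functions": 2, "function_fix": 5,
}

MIXED = {  # the task-based strategy above
    "chat": "gpt-4o", "summary": "gpt-4o-mini", "qa": "gpt-4o-mini",
    "shared_chat": "gpt-4o", "functions": "gpt-4o", "function_fix": "gpt-4o-mini",
}

def monthly_cost(config):
    return sum(VOLUME_M_TOKENS[t] * PRICE_PER_1M_INPUT[m]
               for t, m in config.items())

premium = monthly_cost({t: "gpt-4o" for t in VOLUME_M_TOKENS})
mixed = monthly_cost(MIXED)
print(f"all-premium ${premium:.2f} vs mixed ${mixed:.2f} "
      f"-> {100 * (1 - mixed / premium):.0f}% savings")
```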

Development vs Production:

  • Development: GPT-4o-mini for all tasks
  • Production: Upgrade customer-facing features to GPT-4o
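
One way to make the split mechanical is to key the model map off an environment variable. A minimal sketch, assuming an APP_ENV variable and the task names used in this guide:

```python
import os

# Sketch: one place to switch the whole account between configurations.
# The APP_ENV variable and dict layout are assumptions; the model names
# match the recommendations above.
TASKS = ("chat", "summary", "qa", "shared_chat", "functions", "function_fix")

CONFIGS = {
    "development": {task: "gpt-4o-mini" for task in TASKS},
    "production": {
        "chat": "gpt-4o",           # customer-facing
        "summary": "gpt-4o-mini",
        "qa": "gpt-4o-mini",
        "shared_chat": "gpt-4o",    # public-facing
        "functions": "gpt-4o",
        "function_fix": "gpt-4o-mini",
    },
}

active = CONFIGS[os.environ.get("APP_ENV", "development")]
print(active)
```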

Monitor and Adjust:

  1. Check usage in Billing Dashboard
  2. Identify high-cost tasks
  3. Test cheaper alternatives
  4. Adjust based on results

Trade-offs

Use Premium Models (GPT-4o, Claude 3 Opus) when:

  • Accuracy is critical (customer support, medical, legal)
  • Complex reasoning is required
  • Brand reputation matters (public-facing)
  • User satisfaction is the priority

Use Budget Models (GPT-4o-mini, Claude 3 Haiku) when:

  • Tasks are simple and repetitive
  • Volume is high and stakes are low (testing, internal tools)
  • Speed is the priority
  • Cost optimization is needed
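
The two checklists collapse to a small routing rule. A minimal sketch; the boolean flags are assumptions about how you classify your own tasks:

```python
# Sketch: the trade-off checklists above as a routing rule. The flags are
# assumptions about how you classify your own tasks.
def pick_model(accuracy_critical: bool, complex_reasoning: bool,
               public_facing: bool) -> str:
    if accuracy_critical or complex_reasoning or public_facing:
        return "gpt-4o"       # premium tier
    return "gpt-4o-mini"      # budget tier: simple, high-volume, speed-first

print(pick_model(accuracy_critical=False, complex_reasoning=False,
                 public_facing=True))   # -> gpt-4o
```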

Advanced: Model-Specific Features

Context Windows:

  • GPT-4o: 128K tokens
  • Claude 3: 200K tokens
  • GPT-4o-mini: 128K tokens

Context size matters for long conversations, large documents, and multi-turn interactions.
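
You can estimate whether a prompt will fit before sending it. A sketch using OpenAI's tiktoken tokenizer (needs a release recent enough to know gpt-4o); the 4,096-token reply budget is an assumption to tune, and Claude uses a different tokenizer, so counts for Anthropic models are only approximate.

```python
# Sketch: estimate whether a prompt fits a context window using tiktoken.
# Window sizes are from the list above; the reply budget is an assumption.
import tiktoken

CONTEXT_WINDOW = {"gpt-4o": 128_000, "gpt-4o-mini": 128_000}

def fits(model: str, prompt: str, reply_budget: int = 4_096) -> bool:
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(prompt)) + reply_budget <= CONTEXT_WINDOW[model]

print(fits("gpt-4o", "Summarize the attached report ..."))
```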

Model Strengths:

GPT-4o
Strong function calling, reliable JSON output, excellent code generation
Best for: E-commerce, technical support
Claude 3
Longer context, ethical reasoning, nuanced conversations
Best for: Customer service, long conversations

Default: All tasks use GPT-4o-mini (a good balance for getting started)

Recommended Starter Config:

  • Chat: GPT-4o-mini
  • Summary: GPT-4o-mini
  • QA Generation: GPT-4o-mini
  • Shared Chat: GPT-4o ⭐
  • Function Creation: GPT-4o ⭐
  • Function Test & Fix: GPT-4o-mini

This configuration prioritizes customer-facing quality and code reliability while keeping costs low.


Troubleshooting

Model not available
Check that your tier includes access to the model, and verify the provider is available in your region.
Responses seem lower quality
Verify the selected model, then test with a premium model to compare. Some models handle certain tasks better than others.
Unexpectedly high costs
Review usage in the Billing Dashboard, identify high-cost tasks, and consider downgrading low-value ones.
Settings not saving
Ensure you clicked "Save Changes". Check for error messages, then refresh the page and retry.

Best Practices

Testing Model Changes:

  1. Create test assistant with new configuration
  2. Run test conversations with typical queries (see the sketch after this list)
  3. Compare quality against previous model
  4. Check response time
  5. Monitor costs for a few days
  6. Gather feedback if public-facing
  7. Adjust based on results
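
For steps 2-4, it can help to query the provider API directly and compare candidate models on identical inputs. A sketch using the OpenAI Python SDK, assuming OPENAI_API_KEY is set in the environment; judging answer quality is still manual:

```python
# Sketch: collect side-by-side responses and latency for two candidate
# models straight from the provider API (OpenAI Python SDK).
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
QUERIES = ["How do I reset my password?", "What are your business hours?"]

for model in ("gpt-4o", "gpt-4o-mini"):
    for query in QUERIES:
        start = time.perf_counter()
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": query}],
        )
        elapsed = time.perf_counter() - start
        answer = resp.choices[0].message.content
        print(f"{model} | {elapsed:.2f}s | {answer[:80]}")
```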

Monthly Reviews:

  1. Review usage in Billing Dashboard
  2. Identify high-cost tasks (see the sketch after this list)
  3. Test cheaper alternatives
  4. Check for new models
  5. Adjust based on metrics
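
For step 2, a few lines over a usage export will surface the expensive tasks. A sketch only: the record fields are assumptions to map onto whatever your Billing Dashboard actually exports.

```python
# Sketch: rank tasks by spend. The record shape is an assumption -- map
# the fields to whatever your Billing Dashboard export actually contains.
from collections import defaultdict

records = [  # illustrative placeholder rows
    {"task": "chat", "model": "gpt-4o", "cost_usd": 412.10},
    {"task": "summary", "model": "gpt-4o-mini", "cost_usd": 18.40},
    {"task": "qa", "model": "gpt-4o-mini", "cost_usd": 32.75},
]

spend = defaultdict(float)
for row in records:
    spend[row["task"]] += row["cost_usd"]

for task, cost in sorted(spend.items(), key=lambda kv: -kv[1]):
    print(f"{task:<12} ${cost:>9.2f}")
```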

FAQ

Can I use different models for different assistants?
Model selection is account-wide. All assistants use your configured models.
How quickly do changes take effect?
Immediately for new conversations. Existing conversations continue with their current model.
Does model selection affect tier limits?
No. Tier limits are independent of model selection. Costs per message vary by model.
What happens if a model is deprecated?
You'll receive advance notice. Your configuration will auto-update to a comparable model.

Next Steps