
LLM Model Selection

Configure which AI models power different features across your account.

Overview

Optimize for cost, performance, speed, or specific capabilities by selecting different models for different tasks.

Supported Providers:

OpenAI
GPT-4, GPT-4o - Best for complex reasoning and reliable output
Anthropic
Claude 3 family - Excellent for nuanced conversations and longer context
Google
Gemini models - Strong analysis with competitive pricing

Availability

Providers and models vary by tier and region.

Model Selection by Task

Configure different models for each task type:

Chat Messages
Powers assistant conversations and responses
Recommended: GPT-4o or Claude 3 (Pro), GPT-4o-mini (Basic/Free)
Summary Generation
Creates conversation summaries and titles
Recommended: GPT-4o-mini (cost-effective, sufficient quality)
QA Generation
Generates Q&A pairs from knowledge bases for RAG
Recommended: GPT-4o (high quality) or GPT-4o-mini (balanced cost and quality)
Shared Chat
Powers public chatbot widgets and customer-facing interactions
Recommended: GPT-4o or Claude 3 (reliability critical for production)
Function Creation
Generates Python code for custom functions
Recommended: GPT-4o (superior code quality)
Function Test & Fix
Debugs and fixes function code
Recommended: GPT-4o (reliable), GPT-4o-mini (faster iterations)

Configuration

  1. Go to Account → LLM Settings
  2. Select Provider and Model for each task type
  3. Click Save Changes
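
If you maintain several accounts or environments, you may want to script the same change. The sketch below is hypothetical: the endpoint, payload keys, and auth header are illustrative assumptions, since the only documented way to change models is the settings page above.

```python
# Hypothetical sketch only: the endpoint, payload shape, and auth scheme
# are assumptions for illustration. The documented way to change models
# is the Account -> LLM Settings page.
import requests

API_BASE = "https://api.example.com/v1"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"                 # hypothetical credential

payload = {
    "chat": {"provider": "openai", "model": "gpt-4o"},
    "summary": {"provider": "openai", "model": "gpt-4o-mini"},
    "shared_chat": {"provider": "anthropic", "model": "claude-3-sonnet"},
}

resp = requests.put(
    f"{API_BASE}/account/llm-settings",  # hypothetical endpoint
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
```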

Test Changes

Test your assistants after changing models to verify quality meets expectations.

Model Comparison

OpenAI Models:

Model         Speed      Cost    Best For
GPT-4o        Fast       Medium  Production chat, complex reasoning
GPT-4o-mini   Very Fast  Low     Summaries, high-volume tasks
GPT-4 Turbo   Medium     High    Complex analysis, critical tasks

Anthropic Models:

Model            Speed      Cost    Best For
Claude 3 Opus    Medium     High    Sophisticated reasoning, long conversations
Claude 3 Sonnet  Fast       Medium  Balanced performance and cost
Claude 3 Haiku   Very Fast  Low     Quick responses, simple tasks

View current pricing in your Billing Dashboard.

Cost Optimization

Task-Based Strategy:

Use premium models where they add value and budget models everywhere else:

  • Chat: GPT-4o (customer-facing)
  • Summaries: GPT-4o-mini (high volume)
  • QA Generation: GPT-4o-mini (batch processing)
  • Shared Chat: GPT-4o (public-facing)
  • Functions: GPT-4o (code quality matters)
  • Function Fix: GPT-4o-mini (iterative)

Typical savings: 40-60% compared with running premium models for every task.
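
A back-of-envelope check of that range, as a sketch: the per-token prices and monthly volumes are made-up placeholders (input tokens only), so substitute real figures from your Billing Dashboard.

```python
# Back-of-envelope check of the savings claim. Prices are illustrative
# (input tokens only) and volumes are made up -- substitute real numbers
# from your Billing Dashboard.
PRICE_PER_1M_INPUT = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}  # USD, illustrative

VOLUME_M_TOKENS = {  # monthly input tokens per task, in millions (illustrative)
    "chat": 40, "summary": 50, "qa": 40,
    "shared_chat": 15, "functions": 2, "function_fix": 5,
}

MIXED = {  # the task-based strategy above
    "chat": "gpt-4o", "summary": "gpt-4o-mini", "qa": "gpt-4o-mini",
    "shared_chat": "gpt-4o", "functions": "gpt-4o", "function_fix": "gpt-4o-mini",
}

def monthly_cost(config):
    return sum(VOLUME_M_TOKENS[t] * PRICE_PER_1M_INPUT[m]
               for t, m in config.items())

premium = monthly_cost({t: "gpt-4o" for t in VOLUME_M_TOKENS})
mixed = monthly_cost(MIXED)
print(f"all-premium ${premium:.2f} vs mixed ${mixed:.2f} "
      f"-> {100 * (1 - mixed / premium):.0f}% savings")
```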

Development vs Production:

  • Development: GPT-4o-mini for all tasks
  • Production: Upgrade customer-facing features to GPT-4o
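
One way to make the split mechanical is to key the model map off an environment variable. A minimal sketch, assuming an APP_ENV variable and the task names used in this guide:

```python
import os

# Sketch: one place to switch the whole account between configurations.
# The APP_ENV variable and dict layout are assumptions; the model names
# match the recommendations above.
TASKS = ("chat", "summary", "qa", "shared_chat", "functions", "function_fix")

CONFIGS = {
    "development": {task: "gpt-4o-mini" for task in TASKS},
    "production": {
        "chat": "gpt-4o",           # customer-facing
        "summary": "gpt-4o-mini",
        "qa": "gpt-4o-mini",
        "shared_chat": "gpt-4o",    # public-facing
        "functions": "gpt-4o",
        "function_fix": "gpt-4o-mini",
    },
}

active = CONFIGS[os.environ.get("APP_ENV", "development")]
print(active)
```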

Monitor and Adjust:

  1. Check usage in Billing Dashboard
  2. Identify high-cost tasks
  3. Test cheaper alternatives
  4. Adjust based on results

Trade-offs

Use Premium Models (GPT-4o, Claude 3 Opus) when:

  • Accuracy is critical (customer support, medical, legal)
  • Complex reasoning is required
  • Brand reputation matters (public-facing)
  • User satisfaction is the priority

Use Budget Models (GPT-4o-mini, Claude 3 Haiku) when:

  • Tasks are simple and repetitive
  • Volume is high and stakes are low (testing, internal tools)
  • Speed is the priority
  • Cost optimization is needed
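
The two checklists collapse to a small routing rule. A minimal sketch; the boolean flags are assumptions about how you classify your own tasks:

```python
# Sketch: the trade-off checklists above as a routing rule. The flags are
# assumptions about how you classify your own tasks.
def pick_model(accuracy_critical: bool, complex_reasoning: bool,
               public_facing: bool) -> str:
    if accuracy_critical or complex_reasoning or public_facing:
        return "gpt-4o"       # premium tier
    return "gpt-4o-mini"      # budget tier: simple, high-volume, speed-first

print(pick_model(accuracy_critical=False, complex_reasoning=False,
                 public_facing=True))   # -> gpt-4o
```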

Advanced: Model-Specific Features

Context Windows:

  • GPT-4o: 128K tokens
  • Claude 3: 200K tokens
  • GPT-4o-mini: 128K tokens

Context size matters for long conversations, large documents, and multi-turn interactions.
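
You can estimate whether a prompt will fit before sending it. A sketch using OpenAI's tiktoken tokenizer (needs a release recent enough to know gpt-4o); the 4,096-token reply budget is an assumption to tune, and Claude uses a different tokenizer, so counts for Anthropic models are only approximate.

```python
# Sketch: estimate whether a prompt fits a context window using tiktoken.
# Window sizes are from the list above; the reply budget is an assumption.
import tiktoken

CONTEXT_WINDOW = {"gpt-4o": 128_000, "gpt-4o-mini": 128_000}

def fits(model: str, prompt: str, reply_budget: int = 4_096) -> bool:
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(prompt)) + reply_budget <= CONTEXT_WINDOW[model]

print(fits("gpt-4o", "Summarize the attached report ..."))
```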

Model Strengths:

GPT-4o
Strong function calling, reliable JSON output, excellent code generation
Best for: E-commerce, technical support
Claude 3
Longer context, ethical reasoning, nuanced conversations
Best for: Customer service, long conversations

Default: All tasks use GPT-4o-mini (a good balance for getting started)

Recommended Starter Config:

  • Chat: GPT-4o-mini
  • Summary: GPT-4o-mini
  • QA Generation: GPT-4o-mini
  • Shared Chat: GPT-4o ⭐
  • Function Creation: GPT-4o ⭐
  • Function Test & Fix: GPT-4o-mini

This configuration prioritizes customer-facing quality and code reliability while keeping costs low.


Troubleshooting

Model not available
Check that your tier includes access to the model, and verify the provider is available in your region.
Responses seem lower quality
Verify the selected model, then test with a premium model to compare. Some models handle certain tasks better than others.
Unexpectedly high costs
Review usage in the Billing Dashboard, identify high-cost tasks, and consider downgrading low-value ones.
Settings not saving
Ensure you clicked "Save Changes". Check for error messages, then refresh the page and retry.

Best Practices

Testing Model Changes:

  1. Create test assistant with new configuration
  2. Run test conversations with typical queries (see the sketch after this list)
  3. Compare quality against previous model
  4. Check response time
  5. Monitor costs for a few days
  6. Gather feedback if public-facing
  7. Adjust based on results
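
For steps 2-4, it can help to query the provider API directly and compare candidate models on identical inputs. A sketch using the OpenAI Python SDK, assuming OPENAI_API_KEY is set in the environment; judging answer quality is still manual:

```python
# Sketch: collect side-by-side responses and latency for two candidate
# models straight from the provider API (OpenAI Python SDK).
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
QUERIES = ["How do I reset my password?", "What are your business hours?"]

for model in ("gpt-4o", "gpt-4o-mini"):
    for query in QUERIES:
        start = time.perf_counter()
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": query}],
        )
        elapsed = time.perf_counter() - start
        answer = resp.choices[0].message.content
        print(f"{model} | {elapsed:.2f}s | {answer[:80]}")
```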

Monthly Reviews:

  1. Review usage in Billing Dashboard
  2. Identify high-cost tasks (see the sketch after this list)
  3. Test cheaper alternatives
  4. Check for new models
  5. Adjust based on metrics
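
For step 2, a few lines over a usage export will surface the expensive tasks. A sketch only: the record fields are assumptions to map onto whatever your Billing Dashboard actually exports.

```python
# Sketch: rank tasks by spend. The record shape is an assumption -- map
# the fields to whatever your Billing Dashboard export actually contains.
from collections import defaultdict

records = [  # illustrative placeholder rows
    {"task": "chat", "model": "gpt-4o", "cost_usd": 412.10},
    {"task": "summary", "model": "gpt-4o-mini", "cost_usd": 18.40},
    {"task": "qa", "model": "gpt-4o-mini", "cost_usd": 32.75},
]

spend = defaultdict(float)
for row in records:
    spend[row["task"]] += row["cost_usd"]

for task, cost in sorted(spend.items(), key=lambda kv: -kv[1]):
    print(f"{task:<12} ${cost:>9.2f}")
```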

FAQ

Can I use different models for different assistants?
Model selection is account-wide. All assistants use your configured models.
How quickly do changes take effect?
Immediately for new conversations. Existing conversations continue with their current model.
Does model selection affect tier limits?
No. Tier limits are independent of model selection. Costs per message vary by model.
What happens if a model is deprecated?
You'll receive advance notice. Your configuration will auto-update to a comparable model.

Next Steps