Knowledge Bases

Knowledge bases (also called "Stores") allow your assistants to access your specific documents and data, enabling accurate, contextual responses using Retrieval-Augmented Generation (RAG).

What Are Knowledge Bases?

A knowledge base is a collection of documents that your assistant can search and reference when responding to users. This enables:

  • Accurate answers based on your actual documentation
  • Up-to-date information without retraining the AI
  • Specific knowledge about your products, services, or domain

How It Works

When a user asks a question, the system searches your documents for relevant information and uses it to generate accurate, contextual responses.

The Process:

  1. Upload - You upload documents to a knowledge base
  2. Process - Documents are chunked and converted to embeddings for search
  3. Connect - Link the knowledge base to your assistant
  4. Query - User asks a question
  5. Retrieve - System finds the most relevant document sections
  6. Generate - Assistant crafts an answer using your specific information

Advanced: How RAG Works

Retrieval-Augmented Generation (RAG) combines semantic search with AI generation:

  • Documents are split into chunks (manageable sections)
  • Each chunk becomes a vector embedding (numerical representation capturing meaning)
  • Embeddings are stored in OpenAI's vector store
  • When users ask questions, their query is also embedded
  • The system finds chunks with similar embeddings (semantic similarity)
  • Top matching chunks provide context for the AI's response

Why semantic search matters: A search for "refund policy" can retrieve chunks about "returns," "money back guarantee," or "cancellation procedures" because they are close in meaning to the query, not merely because they share keywords.
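
The ranking step can be illustrated with a minimal sketch. The three-dimensional vectors below are made up by hand for illustration; real embeddings come from an embedding model and typically have hundreds or thousands of dimensions. The function names are hypothetical and are not part of the product.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_chunks(query_vec, chunk_vecs, k=2):
    """Return the names of the k chunks most similar to the query."""
    ranked = sorted(
        chunk_vecs,
        key=lambda name: cosine_similarity(query_vec, chunk_vecs[name]),
        reverse=True,
    )
    return ranked[:k]

# Hand-written toy "embeddings" (assumption: not produced by any real model).
CHUNKS = {
    "returns": [0.9, 0.1, 0.0],
    "money back guarantee": [0.8, 0.2, 0.1],
    "shipping times": [0.1, 0.9, 0.2],
}
QUERY = [0.85, 0.15, 0.05]  # stands in for the embedded query "refund policy"

print(top_chunks(QUERY, CHUNKS))
```

With these toy vectors, the two refund-related chunks rank above "shipping times" even though neither contains the word "refund".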

What You Can Do

Task                    Description
Create Stores           Set up new knowledge bases
Upload Files            Add documents to your stores
Supported Formats       See what file types you can upload
Connect to Assistants   Link stores to your chatbots

Storage Limits by Tier

Tier     Max Stores   Storage per Store
Basic    1            5 MB
Medium   2            30 MB
Pro      5            150 MB

See Tier Limits for complete details.
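
To check a planned layout against these limits before uploading, a small helper along the following lines may be useful. This is a sketch only: the limits are copied from the tier table above, and the function names are hypothetical rather than part of any product API.

```python
# Limits copied from the tier table: (max stores, max MB per store).
TIER_LIMITS = {
    "Basic": (1, 5),
    "Medium": (2, 30),
    "Pro": (5, 150),
}

def fits_tier(tier, store_sizes_mb):
    """True if the planned stores fit within the given tier's limits."""
    max_stores, max_mb = TIER_LIMITS[tier]
    return len(store_sizes_mb) <= max_stores and all(
        size <= max_mb for size in store_sizes_mb
    )

def smallest_tier(store_sizes_mb):
    """Return the lowest tier that fits the plan, or None if none does."""
    for tier in ("Basic", "Medium", "Pro"):
        if fits_tier(tier, store_sizes_mb):
            return tier
    return None

# A plan with three stores of 50 MB, 15 MB, and 5 MB.
print(smallest_tier([50, 15, 5]))  # prints "Pro"
```

A plan with three stores needs the Pro tier regardless of size, because Basic and Medium allow at most one and two stores.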

Best Practices

What to Upload

Good candidates:

  • Product documentation and manuals
  • FAQ documents
  • Policy documents (returns, privacy, terms)
  • Training materials and guides
  • Technical specifications

Avoid:

  • Image-heavy documents (images aren't processed)
  • Scanned PDFs without OCR
  • Password-protected files
  • Raw data exports without context

Content Organization Strategy

Use Multiple Stores When:

  • You have distinct topic areas (products vs. policies vs. support)
  • Different assistants need different knowledge
  • Content has different update cycles
  • You want to control which information is available to which assistant

Use One Store When:

  • All content is related to a single domain
  • Your assistant needs access to everything
  • You're at your store limit (Basic tier: 1 store, Medium: 2 stores)
  • Content is interconnected and often referenced together

Example Structure:

For an e-commerce business on the Pro tier (three stores exceed the Basic and Medium store limits):

Store 1: "Product Catalog" (50 MB)
- Product descriptions, specs, features
- Updated frequently

Store 2: "Customer Support" (15 MB)
- FAQs, troubleshooting guides
- Shipping and returns policies
- Updated occasionally

Store 3: "Company Info" (5 MB)
- About us, brand story, values
- Rarely updated

Document Preparation

Do:

  • Use clear headings and section breaks
  • Write in complete sentences with context
  • Use descriptive file names
  • Remove duplicate content

Don't:

  • Rely on images for critical information
  • Use excessive formatting that obscures text
  • Upload multiple versions of the same document
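
One way to catch duplicate content before uploading is to fingerprint each document's text and group matching fingerprints. The sketch below assumes you have already extracted each file's text into a dictionary; the function names and file names are hypothetical. It only detects documents that are identical after lowercasing and whitespace normalization, so a revised version with changed wording would not be flagged.

```python
import hashlib
from collections import defaultdict

def content_fingerprint(text):
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_duplicates(documents):
    """Group document names whose normalized text is identical.

    `documents` maps a file name to its extracted text.
    """
    groups = defaultdict(list)
    for name, text in documents.items():
        groups[content_fingerprint(text)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Hypothetical file names and contents, for illustration only.
docs = {
    "returns-policy.txt": "Returns are accepted.\nSee the policy page.",
    "returns-policy-copy.txt": "Returns are accepted.  See the policy page.",
    "shipping.txt": "Shipping details are listed here.",
}
print(find_duplicates(docs))
```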

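Common Use Cases

Each setup below uses three stores, which requires the Pro tier under the limits listed in Storage Limits by Tier.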

E-commerce Customer Support

Scenario: You run an online store and want your chatbot to answer product and policy questions.

Recommended setup:

Store 1: "Product Information" (30 MB)
- Product descriptions and specifications
- Size guides and measurement charts
- Care instructions

Store 2: "Policies & Shipping" (10 MB)
- Return and exchange policy
- Shipping information
- Warranty terms

Store 3: "FAQ" (5 MB)
- Common questions and answers
- Troubleshooting guides

Why this works: It separates frequently updated product content from stable policies, which makes each store easier to maintain.

SaaS Product Documentation

Scenario: You provide a software product and want to help users learn how to use it.

Recommended setup:

Store 1: "Getting Started" (15 MB)
- Onboarding guides
- Quick start tutorials
- Basic concepts

Store 2: "Feature Documentation" (50 MB)
- Detailed feature guides
- Advanced usage
- Configuration options

Store 3: "API Reference" (20 MB)
- API documentation
- Code examples
- Integration guides

Why this works: It organizes content by user journey, so beginners get the basics and advanced users get detailed docs.

Professional Services

Scenario: You offer consulting or services and want to automate client FAQs.

Recommended setup:

Store 1: "Services & Pricing" (8 MB)
- Service descriptions
- Pricing packages
- Engagement process

Store 2: "Case Studies & Examples" (25 MB)
- Client success stories
- Project examples
- Testimonials

Store 3: "Resources" (12 MB)
- Whitepapers
- Industry insights
- Best practices guides

Why this works: It helps prospects understand your services while showcasing your expertise.

Testing Your Knowledge Base

After uploading content, test your assistant to ensure it uses the information correctly.

Create a test plan:

  1. Baseline questions: Ask about content you know is in your docs
  2. Variation testing: Ask the same question different ways
  3. Edge cases: Ask questions that your docs cover only partially or not at all

Example test suite for e-commerce:

Baseline:
- "What is your return policy?"
- "How long does shipping take?"

Variations:
- "Can I return this if I don't like it?"
- "How do I send something back?"

Edge cases:
- "Can I return a sale item?"
- "What if my item arrives damaged?"
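
A test plan like this can be kept as data and rerun whenever content changes. The following is a minimal sketch, not a prescribed workflow: `ask` stands for whatever function sends a question to your assistant and returns its answer as text, `canned_assistant` is a hypothetical stand-in, and the expected keywords are illustrative assumptions that you would replace with terms from your own documents.

```python
# Questions come from the example test suite; the expected keywords are
# illustrative assumptions, so replace them with terms from your own docs.
TEST_SUITE = {
    "baseline": [
        ("What is your return policy?", ["return"]),
        ("How long does shipping take?", ["shipping"]),
    ],
    "variations": [
        ("Can I return this if I don't like it?", ["return"]),
        ("How do I send something back?", ["return"]),
    ],
    "edge cases": [
        ("Can I return a sale item?", ["sale"]),
        ("What if my item arrives damaged?", ["damaged"]),
    ],
}

def run_suite(ask, suite):
    """Return (category, question, missing_keywords) for each failing case."""
    failures = []
    for category, cases in suite.items():
        for question, keywords in cases:
            answer = ask(question).lower()
            missing = [kw for kw in keywords if kw.lower() not in answer]
            if missing:
                failures.append((category, question, missing))
    return failures

def canned_assistant(question):
    """Hypothetical stand-in; swap in a call to your real assistant."""
    return "Please see our return policy for details."

for failure in run_suite(canned_assistant, TEST_SUITE):
    print(failure)
```

Keyword matching is a crude check. It flags answers that miss an expected topic, but reading the responses is still the only way to confirm they are correct.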

Advanced: Optimizing for Chunking

Documents are automatically split into chunks. You can improve retrieval by structuring documents so that each chunk stays coherent on its own.

Good for chunking:

  • Clear section breaks with headings
  • Paragraphs of reasonable length (3-8 sentences)
  • Logical content flow

Bad for chunking:

  • Very long paragraphs without breaks
  • Important info split across distant sections
  • Dense walls of text
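
To see why paragraph structure matters, consider a generic paragraph-packing chunker. This is an illustrative sketch only: the platform's actual chunking strategy is not described here, and the function name and `max_chars` parameter are assumptions. It keeps whole paragraphs together when they fit and cuts at an arbitrary position when a single paragraph is too long.

```python
def chunk_by_paragraph(text, max_chars=500):
    """Pack whole paragraphs into chunks of at most max_chars characters."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # An oversized paragraph has no natural break, so it is cut
        # at a fixed length, possibly in the middle of a sentence.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks

well_structured = "Returns\n\nItems can be sent back.\n\nShipping\n\nOrders ship daily."
wall_of_text = "x" * 1200  # one long paragraph with no breaks

print(len(chunk_by_paragraph(well_structured, max_chars=40)))
print([len(c) for c in chunk_by_paragraph(wall_of_text, max_chars=500)])
```

In this sketch the well-structured text splits at paragraph boundaries, while the unbroken paragraph is cut wherever the size limit falls.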

Next Steps

Create Your First Store