File Formats Reference

Complete reference of file types supported for knowledge base uploads.

Quick Reference

Category Extensions
Documents .pdf, .doc, .docx, .txt, .md, .pptx
Data .json
Code .py, .js, .ts, .html, .css, .java, .c, .cpp, .cs, .go, .rb, .php, .sh
Other .tex

Document Formats

PDF (.pdf)

Property Value
Extension .pdf
Max size 500 MB
Best for Manuals, reports, documentation

Tips:

  • Use PDFs with selectable text (not scanned images)
  • Run OCR on scanned documents before uploading
  • Remove password protection before upload

Microsoft Word (.doc, .docx)

Property Value
Extensions .doc, .docx
Max size 500 MB
Best for Business documents, policies

Tips:

  • Images within documents are not processed for text
  • Table content is extracted
  • Formatting is converted to plain text

Plain Text (.txt)

Property Value
Extension .txt
Max size 500 MB
Best for Simple content, logs

Tips:

  • Most reliable format
  • No conversion needed
  • Ideal for clean, structured content

Markdown (.md)

Property Value
Extension .md
Max size 500 MB
Best for Technical docs, READMEs

Tips:

  • Headings help with organization
  • Code blocks are preserved
  • Links are extracted as text

Presentation Formats

PowerPoint (.pptx)

Property Value
Extension .pptx
Max size 500 MB
Best for Slide content

Tips:

  • Text from slides is extracted
  • Speaker notes are included
  • Images and charts are not processed

Data Formats

JSON (.json)

Property Value
Extension .json
Max size 500 MB
Best for Structured data

Tips:

  • Nested structures are flattened
  • Keys and values are extracted as text
  • Good for product catalogs, configurations
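
How the service flattens JSON internally is not specified here. As a rough illustration of what flattening nested structures into key/value text can look like, the sketch below walks a parsed document and emits one dotted path per leaf value. The function name `flatten_json` and the dotted-path and `[index]` conventions are assumptions made for this example only.

```python
import json


def flatten_json(value, prefix=""):
    """Return a dict mapping key paths to leaf values rendered as text.

    Hypothetical helper: the path format is an illustration, not the
    format the upload service actually produces.
    """
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten_json(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten_json(child, f"{prefix}[{index}]"))
    else:
        # Leaf value (string, number, boolean, or null): store it as text.
        flat[prefix] = str(value)
    return flat


if __name__ == "__main__":
    catalog = json.loads(
        '{"product": {"name": "Lamp", "tags": ["home", "light"], "price": 19.5}}'
    )
    for path, text in flatten_json(catalog).items():
        print(f"{path}: {text}")
```

Previewing a file this way before upload can show whether the keys alone carry enough context to be useful as text.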

Code Files

Python (.py)

Property Value
Extension .py
Max size 500 MB

JavaScript (.js)

Property Value
Extension .js
Max size 500 MB

TypeScript (.ts)

Property Value
Extension .ts
Max size 500 MB

HTML (.html)

Property Value
Extension .html
Max size 500 MB

CSS (.css)

Property Value
Extension .css
Max size 500 MB

Additional Code Files

Extension Language
.java Java
.c, .cpp C/C++
.cs C#
.rb Ruby
.php PHP
.go Go
.sh Shell

Other Formats

LaTeX (.tex)

Property Value
Extension .tex
Max size 500 MB
Best for Academic papers, technical documents

Tips:

  • LaTeX markup is preserved as text
  • Good for mathematical documentation
  • Include comments for context

Tips for code files:

  • Include docstrings and comments
  • Well-documented code extracts better
  • Remove sensitive data (API keys, credentials)
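
One way to act on the last tip is to scan source files for likely credentials before uploading. The sketch below is a minimal example under stated assumptions: the two patterns and the helper name `find_possible_secrets` are illustrative, they will miss many real secret formats, and they can flag harmless lines, so treat the output as a prompt for manual review.

```python
import re

# Illustrative patterns only (assumption): real credentials take many other forms.
SECRET_PATTERNS = [
    # Assignments such as API_KEY = "..." or db_password: '...'
    re.compile(
        r"(api[_-]?key|secret|password|token)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
    # PEM private key headers
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def find_possible_secrets(text):
    """Return (line_number, stripped_line) pairs that match any pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = 'import os\nAPI_KEY = "not-a-real-key-123"\nname = "demo"\n'
    for number, line in find_possible_secrets(sample):
        print(f"line {number}: {line}")
```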

Upload Limits

Per-File Limits

Limit Value
Maximum file size 500 MB
File name length 255 characters

Storage Limits by Tier

Tier Per Store
Basic 5 MB
Medium 30 MB
Pro 150 MB
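
To estimate whether a set of files fits in a store, the tier limits above can be applied as simple arithmetic. This sketch assumes that 1 MB means 1024 × 1024 bytes and that the limit counts raw file sizes; neither is stated in this reference, and `remaining_store_bytes` is a hypothetical helper.

```python
# Per-store limits from the table above, in MB.
TIER_LIMITS_MB = {"Basic": 5, "Medium": 30, "Pro": 150}

# Assumption: the service counts 1 MB as 1024 * 1024 bytes.
BYTES_PER_MB = 1024 * 1024


def remaining_store_bytes(tier, file_sizes_bytes):
    """Return the bytes left in a store; a negative result means over the limit."""
    limit_bytes = TIER_LIMITS_MB[tier] * BYTES_PER_MB
    return limit_bytes - sum(file_sizes_bytes)


if __name__ == "__main__":
    sizes = [2 * BYTES_PER_MB, 1 * BYTES_PER_MB]
    left = remaining_store_bytes("Basic", sizes)
    print(f"Basic store has {left / BYTES_PER_MB:.1f} MB left")
```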

Unsupported Formats

The following are not supported:

Format Reason
Images (.jpg, .png, .gif) Cannot extract text
Videos (.mp4, .mov) Cannot process
Audio (.mp3, .wav) Cannot transcribe
Spreadsheets (.csv, .xlsx) Not currently supported
XML/YAML (.xml, .yaml, .yml) Not currently supported
Rich Text (.rtf) Not currently supported
Executables (.exe, .app) Security risk
Archives (.zip, .rar) Must extract first
Password-protected files Cannot access content

Workarounds:

  • Images: Use OCR to extract text, save as PDF
  • Videos/Audio: Create transcripts as text files
  • Spreadsheets: Convert to JSON or plain text
  • Archives: Extract and upload individual files
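
For the spreadsheet workaround, a CSV export can be converted to JSON with the Python standard library. The sketch below is one possible approach, assuming a UTF-8 CSV file whose first row holds the column names; `csv_to_json` is a hypothetical helper, and `.xlsx` files would first need to be exported to CSV.

```python
import csv
import json
from pathlib import Path


def csv_to_json(csv_path, json_path):
    """Write each CSV row as a JSON object keyed by the header row.

    Returns the number of data rows written.
    """
    with open(csv_path, newline="", encoding="utf-8") as source:
        rows = list(csv.DictReader(source))
    Path(json_path).write_text(
        json.dumps(rows, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return len(rows)
```

All values are written as strings, because CSV carries no type information.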

Optimization Tips

Before Uploading

  1. Remove unnecessary content
     • Cover pages
     • Blank pages
     • Duplicate content

  2. Ensure text is extractable
     • For scanned docs, run OCR first
     • Test by trying to copy/paste text

  3. Structure your content
     • Use clear headings
     • Organize logically
     • Include context

File Size Optimization

Format Optimization
PDF Remove images, flatten layers
Word Remove tracked changes, images
Excel (before converting to JSON or plain text) Remove empty cells, formatting
Code Remove node_modules, build files
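
For code projects, dependency and build directories can be skipped while collecting files to upload. The sketch below is a minimal example; the directory names in `SKIPPED_DIRS` other than `node_modules` are assumptions about what counts as build output, and `files_to_upload` is a hypothetical helper.

```python
import os

# Assumption: adjust this set to match your project's build output directories.
SKIPPED_DIRS = {"node_modules", "build", "dist", "__pycache__", ".git"}


def files_to_upload(root):
    """Return a sorted list of file paths under root, skipping SKIPPED_DIRS."""
    selected = []
    for current_dir, dir_names, file_names in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dir_names[:] = [name for name in dir_names if name not in SKIPPED_DIRS]
        for file_name in file_names:
            selected.append(os.path.join(current_dir, file_name))
    return sorted(selected)
```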

Content Quality

Better content = better responses:

  • Clear, well-written text
  • Accurate information
  • Up-to-date content
  • Relevant to assistant's purpose

Troubleshooting

File won't upload

  1. Check that the file extension is supported
  2. Verify that the file size is under 500 MB
  3. Ensure the file isn't corrupted
  4. Try a different browser
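
The first two checks, along with the file name length limit from Per-File Limits, can be run locally before uploading. This is a sketch under stated assumptions: it treats extensions as case-insensitive and 1 MB as 1024 × 1024 bytes, neither of which this reference confirms, and `upload_problems` is a hypothetical helper rather than part of the product.

```python
from pathlib import Path

# Extensions listed in the Quick Reference table.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".txt", ".md", ".pptx", ".json",
    ".py", ".js", ".ts", ".html", ".css", ".java", ".c", ".cpp",
    ".cs", ".go", ".rb", ".php", ".sh", ".tex",
}
MAX_FILE_BYTES = 500 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_NAME_LENGTH = 255


def upload_problems(name, size_bytes):
    """Return a list of reasons the file is likely to be rejected (empty if none)."""
    problems = []
    suffix = Path(name).suffix.lower()  # assumption: matching ignores case
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 500 MB")
    if len(name) > MAX_NAME_LENGTH:
        problems.append("file name is longer than 255 characters")
    return problems


if __name__ == "__main__":
    for name, size in [("guide.pdf", 1024), ("data.csv", 1024)]:
        print(name, upload_problems(name, size) or "looks OK")
```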

Content not being found

  1. Verify that the upload completed (Finish Upload clicked)
  2. Check that the file has extractable text
  3. Test with specific queries
  4. Review file quality

Poor response quality

  1. Review content accuracy
  2. Improve document structure
  3. Add more relevant documents
  4. Check for conflicting information