Complete reference of file types supported for knowledge base uploads.
Quick Reference
| Category | Extensions |
|---|---|
| Documents | .pdf, .doc, .docx, .txt, .md, .pptx |
| Data | .json |
| Code | .py, .js, .ts, .html, .css, .java, .c, .cpp, .cs, .go, .rb, .php, .sh |
| Other | .tex |
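The table above can be turned into a quick local check before uploading. This is a minimal sketch, not part of the product: the `is_supported` helper is hypothetical, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions copied from the Quick Reference table.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".txt", ".md", ".pptx",       # Documents
    ".json",                                               # Data
    ".py", ".js", ".ts", ".html", ".css", ".java", ".c",   # Code
    ".cpp", ".cs", ".go", ".rb", ".php", ".sh",
    ".tex",                                                # Other
}


def is_supported(filename: str) -> bool:
    """Return True if the file's final extension is in the supported list.

    Assumption: the comparison ignores case (".PDF" is treated like ".pdf").
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("manual.pdf")` returns `True`, while `is_supported("prices.csv")` returns `False`.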
PDF (.pdf)
| Property | Value |
|---|---|
| Extension | .pdf |
| Max size | 500 MB |
| Best for | Manuals, reports, documentation |
Tips:
- Use PDFs with selectable text (not scanned images)
- Run OCR on scanned documents before uploading
- Remove password protection before upload
Microsoft Word (.doc, .docx)
| Property | Value |
|---|---|
| Extensions | .doc, .docx |
| Max size | 500 MB |
| Best for | Business documents, policies |
Tips:
- Images within documents are not processed for text
- Table content is extracted
- Formatting is converted to plain text
Plain Text (.txt)
| Property | Value |
|---|---|
| Extension | .txt |
| Max size | 500 MB |
| Best for | Simple content, logs |
Tips:
- Most reliable format
- No conversion needed
- Ideal for clean, structured content
Markdown (.md)
| Property | Value |
|---|---|
| Extension | .md |
| Max size | 500 MB |
| Best for | Technical docs, READMEs |
Tips:
- Headings help with organization
- Code blocks are preserved
- Links are extracted as text
PowerPoint (.pptx)
| Property | Value |
|---|---|
| Extension | .pptx |
| Max size | 500 MB |
| Best for | Slide content |
Tips:
- Text from slides is extracted
- Speaker notes are included
- Images and charts are not processed
JSON (.json)
| Property | Value |
|---|---|
| Extension | .json |
| Max size | 500 MB |
| Best for | Structured data |
Tips:
- Nested structures are flattened
- Keys and values extracted as text
- Good for product catalogs, configurations
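To get a feel for what "flattened" means for a nested JSON file, the sketch below turns nested objects and arrays into one `path: value` line per leaf. The exact output format the service produces is not documented here, so the dotted-path notation and the `flatten` and `to_text` helpers are assumptions for illustration only.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a {path: leaf_value} dict.

    Assumption: dict keys are joined with "." and list items are
    addressed as "[index]". The real service may use another scheme.
    """
    items = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            items.update(flatten(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            items.update(flatten(child, f"{prefix}[{index}]"))
    else:
        # Leaf value: string, number, boolean, or null.
        items[prefix] = value
    return items


def to_text(raw_json: str) -> str:
    """Render a JSON document as one 'path: value' line per leaf."""
    flat = flatten(json.loads(raw_json))
    return "\n".join(f"{path}: {leaf}" for path, leaf in flat.items())
```

Previewing a product catalog this way can show whether key names carry enough context on their own once the nesting is gone.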
Code Files
Python (.py)
| Property | Value |
|---|---|
| Extension | .py |
| Max size | 500 MB |
JavaScript (.js)
| Property | Value |
|---|---|
| Extension | .js |
| Max size | 500 MB |
TypeScript (.ts)
| Property | Value |
|---|---|
| Extension | .ts |
| Max size | 500 MB |
HTML (.html)
| Property | Value |
|---|---|
| Extension | .html |
| Max size | 500 MB |
CSS (.css)
| Property | Value |
|---|---|
| Extension | .css |
| Max size | 500 MB |
Additional Code Files
| Extension | Language |
|---|---|
| .java | Java |
| .c, .cpp | C/C++ |
| .cs | C# |
| .rb | Ruby |
| .php | PHP |
| .go | Go |
| .sh | Shell |
LaTeX (.tex)
| Property | Value |
|---|---|
| Extension | .tex |
| Max size | 500 MB |
| Best for | Academic papers, technical documents |
Tips:
- LaTeX markup is preserved as text
- Good for mathematical documentation
- Include comments for context
Tips for code files:
- Include docstrings and comments
- Well-documented code extracts better
- Remove sensitive data (API keys, credentials)
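Removing sensitive data is easier with a quick scan before upload. The sketch below flags lines that look like hard-coded credentials. The patterns and the `find_possible_secrets` helper are hypothetical and far from exhaustive, so treat a clean result as a hint rather than a guarantee.

```python
import re

# Hypothetical patterns for illustration; real secrets take many more forms.
SECRET_PATTERNS = [
    # name = "value" or name: 'value' where the name suggests a credential
    re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password|passwd)\b"
        r"\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
    # PEM-encoded private key header
    re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
]


def find_possible_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that match any secret pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

Run it over each code file's text and review every reported line by hand before uploading.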
Upload Limits
Per-File Limits
| Limit | Value |
|---|---|
| Maximum file size | 500 MB |
| File name length | 255 characters |
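Both per-file limits can be checked locally before starting an upload. This is a minimal sketch: the `check_limits` and `check_path` helpers are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the unit is not defined here.

```python
import os

MAX_FILE_BYTES = 500 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_NAME_CHARS = 255


def check_limits(name: str, size_bytes: int) -> list[str]:
    """Return a list of human-readable problems; empty means within limits."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"{name}: {size_bytes} bytes exceeds the 500 MB limit")
    if len(name) > MAX_NAME_CHARS:
        problems.append(f"file name is {len(name)} characters (limit 255)")
    return problems


def check_path(path: str) -> list[str]:
    """Apply check_limits to a file on disk."""
    return check_limits(os.path.basename(path), os.path.getsize(path))
```

An empty list from `check_path("manual.pdf")` means the file passes both checks under these assumptions.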
Storage Limits by Tier
| Tier | Per Store |
|---|---|
| Basic | 5 MB |
| Medium | 30 MB |
| Pro | 150 MB |
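The tier limits above can be used to estimate whether a set of files fits in one store. This is a rough sketch under two assumptions: storage is counted by raw file size, and 1 MB is 1024 × 1024 bytes. The service may measure usage differently, and the helper names are hypothetical.

```python
from typing import Optional

# Per-store limits copied from the tier table, in MB.
TIER_LIMITS_MB = {"Basic": 5, "Medium": 30, "Pro": 150}
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def remaining_bytes(tier: str, file_sizes: list[int]) -> int:
    """Bytes left in a store of the given tier (negative if over the limit)."""
    return TIER_LIMITS_MB[tier] * BYTES_PER_MB - sum(file_sizes)


def fits(tier: str, file_sizes: list[int]) -> bool:
    """True if the files' combined raw size is within the tier's limit."""
    return remaining_bytes(tier, file_sizes) >= 0


def smallest_tier(file_sizes: list[int]) -> Optional[str]:
    """Name of the smallest tier that holds all files, or None if none does."""
    for tier in sorted(TIER_LIMITS_MB, key=TIER_LIMITS_MB.get):
        if fits(tier, file_sizes):
            return tier
    return None
```

Because every per-store limit is well below the 500 MB per-file limit, the store limit is usually the one that decides whether a file can be added.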
The following are not supported:
| Format | Reason |
|---|---|
| Images (.jpg, .png, .gif) | Cannot extract text |
| Videos (.mp4, .mov) | Cannot process |
| Audio (.mp3, .wav) | Cannot transcribe |
| Spreadsheets (.csv, .xlsx) | Not currently supported |
| XML/YAML (.xml, .yaml, .yml) | Not currently supported |
| Rich Text (.rtf) | Not currently supported |
| Executables (.exe, .app) | Security risk |
| Archives (.zip, .rar) | Must extract first |
| Password-protected | Cannot access |
Workarounds:
- Images: Use OCR to extract text, save as PDF
- Videos/Audio: Create transcripts as text files
- Spreadsheets: Convert to JSON or plain text
- Archives: Extract and upload individual files
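The spreadsheet workaround can be scripted for CSV files with the standard library alone. The sketch below converts CSV text into a JSON array with one object per row, using the header row as keys. The helper names are hypothetical, and `.xlsx` files would first need to be exported to CSV.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text (with a header row) to a JSON array of objects.

    All values stay strings, because CSV carries no type information.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2, ensure_ascii=False)


def convert_file(csv_path: str, json_path: str) -> None:
    """Read a CSV file and write the converted JSON next to it."""
    with open(csv_path, newline="", encoding="utf-8") as source:
        converted = csv_to_json(source.read())
    with open(json_path, "w", encoding="utf-8") as target:
        target.write(converted)
```

For example, `convert_file("catalog.csv", "catalog.json")` produces a `.json` file that falls within the supported formats.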
Optimization Tips
Before Uploading
- Remove unnecessary content
  - Cover pages
  - Blank pages
  - Duplicate content
- Ensure text is extractable
  - For scanned docs, run OCR first
  - Test by trying to copy/paste text
- Structure your content
  - Use clear headings
  - Organize logically
  - Include context
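Exact duplicate files are the easiest kind of duplicate content to catch automatically. The sketch below groups files whose bytes are identical by comparing SHA-256 digests. The helper names are hypothetical, and it will not catch near-duplicates such as the same text saved in two formats.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_duplicates(contents: dict[str, bytes]) -> list[list[str]]:
    """Group names whose contents are byte-for-byte identical.

    Returns only groups with two or more names, each sorted by name.
    """
    by_digest = defaultdict(list)
    for name in sorted(contents):
        digest = hashlib.sha256(contents[name]).hexdigest()
        by_digest[digest].append(name)
    return [names for names in by_digest.values() if len(names) > 1]


def scan_folder(folder: str) -> list[list[str]]:
    """Find duplicate files under a folder (reads each file fully into memory)."""
    contents = {
        str(path): path.read_bytes()
        for path in Path(folder).rglob("*")
        if path.is_file()
    }
    return group_duplicates(contents)
```

Keep one file from each reported group and drop the rest before uploading. For very large files, hashing in chunks would be gentler on memory than `read_bytes`.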
File Size Optimization
| Format | Optimization |
|---|---|
| PDF | Remove images, flatten layers |
| Word | Remove tracked changes, images |
| Excel (before converting to JSON or plain text) | Remove empty cells, formatting |
| Code | Remove node_modules, build files |
Content Quality
Better content = better responses:
- Clear, well-written text
- Accurate information
- Up-to-date content
- Relevant to the assistant's purpose
Troubleshooting
File won't upload
- Check that the file extension is supported
- Verify that the file size is under 500 MB
- Ensure the file isn't corrupted
- Try a different browser
Content not being found
- Verify that the upload completed (Finish Upload was clicked)
- Check that the file has extractable text
- Test with specific queries
- Review file quality
Poor response quality
- Review content accuracy
- Improve document structure
- Add more relevant documents
- Check for conflicting information