Importing data
Enhance agent capabilities by importing data into its knowledge base for tailored and relevant responses.
Supercharge your Voiceflow agent by importing content from websites, documents, and support articles- giving it the context it needs to respond with real, reliable information.
Supported data sources
Voiceflow supports five different data sources:
Source | Description |
---|---|
URL(s) | Paste one or more URLs to import the content from those pages. Use sitemap.xml to bulk import entire websites. You can only import publicly accessible URLs. |
Sitemap | Import all pages from a website into your knowledge base, ideal for full-site crawls. For most websites, the sitemap is found at https://website.com/sitemap.xml . |
Upload file | Upload .pdf , .txt , or .docx files (Max 10 mb). Only text can be imported into the knowledge base. |
Plain text | Paste raw content directly. Great for fast prototyping or testing. |
Zendesk | Import data from your Zendesk knowledge base. Requires Zendesk admin access. |
On the pro plan, each project can support up to 3,000 different sources (URLs, documents, etc) of knowledge base content.
Heads up!Make sure you don't import confidential or proprietary data unless your use case allows it. Any data that you import may be included in LLM-generated responses.
Refresh rate settings
When importing data from URLs or sitemaps, you can schedule a refresh rate to automatically re-sync your content, keeping your knowledge base up to date.
Option | Best for... |
---|---|
Never | Static content that doesn't need refreshing |
Daily | Frequently updated content (e.g., blogs, news sites) |
Weekly | Occasionally updated info (e.g., support centers) |
Monthly | Stable content (e.g., product policies, pricing) |
Be careful with refresh rates!When an LLM chunking strategy is enabled, every re-sync will consume credits. If your content doesn't change often, we'd recommend you reduce your refresh rate frequency. When LLM chunking strategies are disabled, re-syncs don't consume any credits.
LLM chunking strategies
You can use LLM chunking strategies to optimize the data in your agent's knowledge base. The chunking strategies use AI to process your data in various ways, optimizing it for agent. The quality of your content directly impacts your agent's ability to answer user questions.
Voiceflow supports five different chunking strategies:
Strategy name | Ideal use case |
---|---|
Smart chunking | Breaks and clusters content into topic-based sections.
|
FAQ optimization | Converts structured data into question-and-answer pairs to help the LLM handle direct user queries.
|
Remove HTML and noise | Strips out excess formatting, inline code, and markdown artifacts that may confuse the LLM.
|
Add topic headers | Prepends each chunk with a short summary line and header to signpost the LLM on available information.
|
Summarize | Automatically extracts the most important ideas or paragraphs from large, verbose documents.
|
We recommend experimenting with various combinations of chunking strategies to see which best fits your use-case.
Troubleshooting data imports
If something goes wrong when importing your data, hover over the❗ icon to learn why.

Import errors are handled gracefully. Failed files won’t break your project - the remaining files will be processed.
Advanced: importing data using Voiceflow's API
Voiceflow's Knowledge Base API allows you to manage your Knowledge base programmatically. Common API use cases include:
- Importing content from Notion, Google Docs, or Airtable via an SDK or API request
- Managing KB entries dynamically across multiple workspaces, outside of the Voiceflow Creator
- Debugging & testing chunk processing behind the scenes
Updated about 17 hours ago