Importing data
Add company-specific knowledge to your agent so it can generate accurate responses.
The Knowledge Base lets your agent access and answer questions using your own content. You can upload documents, connect URLs, or add text directly, and the agent will use that information to provide accurate, grounded responses. It’s an easy way to give your agent expertise on your product, company, or domain without having to script every answer manually.
Supported data sources
Voiceflow supports five different data sources:
| Source | Description |
|---|---|
| URL(s) | Paste one or more URLs to import the content from those pages. Use sitemap.xml to bulk import entire websites. You can only import publicly accessible URLs. |
| Sitemap | Import all pages from a website into your knowledge base, ideal for full-site crawls. For most websites, the sitemap is found at https://website.com/sitemap.xml. |
| Upload file | Upload .pdf, .txt, or .docx files (max 10 MB). Only text can be imported into the knowledge base. |
| Plain text | Paste raw content directly. Great for fast prototyping or testing. |
| Zendesk | Import data from your Zendesk knowledge base. Requires Zendesk admin access. |
The number of data sources you can connect to your project varies depending on your plan. Learn more on our pricing page.
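Before importing a sitemap, it can help to check which pages it actually lists. The following is a minimal sketch using only the Python standard library. It assumes the sitemap follows the standard sitemaps.org format, and the `list_sitemap_urls` helper and the example XML are illustrative, not part of Voiceflow.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Illustrative sitemap content; a real one would be downloaded from
# a location such as https://website.com/sitemap.xml.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://website.com/</loc></url>
  <url><loc>https://website.com/pricing</loc></url>
</urlset>"""

urls = list_sitemap_urls(EXAMPLE_SITEMAP)
print(urls)
```

Pages that are missing from this list, or that are not publicly accessible, would need to be added separately.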
Import settings
Refresh rate settings
When importing data from a URL or sitemap, you can optionally set a refresh rate. This allows your project to periodically re-sync content from your website, keeping your knowledge base up to date. The following options are available:
| Option | Best for... |
|---|---|
| Never | Static content that doesn't need refreshing |
| Daily | Frequently updated content (e.g., blogs, news sites) |
| Weekly | Occasionally updated info (e.g., support centers) |
| Monthly | Stable content (e.g., product policies, pricing) |
Please note that when an LLM chunking strategy is enabled, every re-sync consumes credits. If your content doesn't change often, we recommend choosing a less frequent refresh rate. No credits are consumed when syncing data with no LLM chunking strategy selected.
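To compare refresh rates, a rough estimate of yearly credit usage can be sketched as follows. The per-sync cost is a hypothetical input (`credits_per_sync`). The actual amount depends on your content and plan and is not stated here.

```python
# Approximate number of re-syncs per year for each refresh option.
RESYNCS_PER_YEAR = {"Never": 0, "Daily": 365, "Weekly": 52, "Monthly": 12}


def estimate_yearly_credits(
    refresh_rate: str, credits_per_sync: float, llm_chunking: bool
) -> float:
    """Estimate credits used per year by scheduled re-syncs.

    credits_per_sync is a hypothetical figure supplied by the caller.
    """
    if not llm_chunking:
        # Syncs without an LLM chunking strategy consume no credits.
        return 0.0
    return RESYNCS_PER_YEAR[refresh_rate] * credits_per_sync


for rate in RESYNCS_PER_YEAR:
    print(rate, estimate_yearly_credits(rate, credits_per_sync=2.0, llm_chunking=True))
```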
LLM chunking strategies
When your agent looks up data from the knowledge base, it finds the chunks of data that are most similar to the provided query. While you can upload manually chunked data to your knowledge base using Voiceflow's API, most users will find it easiest to chunk their data automatically using an LLM chunking strategy. This allows AI to split your data into optimized chunks, helping your agent find useful answers to your users' questions.
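To illustrate what "most similar to a query" means, here is a toy sketch of chunk lookup. It ranks chunks by cosine similarity over simple word counts, which is a stand-in chosen for readability. It is not a description of how Voiceflow scores chunks internally, and all names below are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vec = tokenize(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, tokenize(chunk)),
        reverse=True,
    )
    return ranked[:k]


chunks = [
    "Refunds are available within 30 days of purchase.",
    "Our office is open Monday to Friday.",
    "To reset your password, open the account settings page.",
]
best = top_chunks("How do I reset my password?", chunks, k=1)
print(best)
```

The sketch also shows why chunk quality matters: a chunk that mixes several unrelated topics scores lower against any single question than a focused one.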
Voiceflow supports five different chunking strategies:
| Strategy name | Ideal use case |
|---|---|
| Smart chunking | Breaks and clusters content into topic-based sections. |
| FAQ optimization | Converts structured data into question-and-answer pairs to help the LLM handle direct user queries. |
| Remove HTML and noise | Strips out excess formatting, inline code, and markdown artifacts that may confuse the LLM. |
| Add topic headers | Prepends each chunk with a short summary line and header to signpost the LLM on available information. |
| Summarize | Automatically extracts the most important ideas or paragraphs from large, verbose documents. |
Chunking strategies aren't one-size-fits-all, so we recommend experimenting with different combinations of strategies on each of your data sources to see which helps your agent generate the best responses.
Metadata
You can optionally attach metadata to each document or webpage you upload to your knowledge base. This can then be used as a filter by the Knowledge base tool when querying data.
This can be useful if you have multiple brands, product lines, or subscription tiers and want to ensure the correct category of information is returned. For example, if you're building an agent for a SaaS platform that has different policies for enterprise and self-serve customers, you could add metadata with the key plan and the value enterprise or regular to each data source, depending on which type of customer the information is relevant to.
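The plan example above can be sketched as a simple filter. This is an illustration of the idea only: the document structure and the exact-match rule in `filter_by_metadata` are assumptions, not the Knowledge base tool's actual behavior.

```python
# Illustrative documents, each tagged with optional metadata.
documents = [
    {"name": "Enterprise SLA", "metadata": {"plan": "enterprise"}},
    {"name": "Self-serve refund policy", "metadata": {"plan": "regular"}},
    {"name": "Getting started guide", "metadata": {}},
]


def filter_by_metadata(docs: list[dict], required: dict) -> list[dict]:
    """Keep documents whose metadata matches every required key/value pair."""
    return [
        doc
        for doc in docs
        if all(doc["metadata"].get(key) == value for key, value in required.items())
    ]


enterprise_docs = filter_by_metadata(documents, {"plan": "enterprise"})
print([doc["name"] for doc in enterprise_docs])
```

Under this rule, a document without a plan value is excluded from filtered queries, so content that applies to every customer would need to be tagged for each plan or queried without a filter.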
Troubleshooting data imports
If something goes wrong when importing your data, hover over the ❗ icon to learn why.
Import errors are handled gracefully. A failed file won't break your project, and the remaining files will still be processed.
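The "fail one, continue with the rest" behavior can be pictured with a short sketch. The checks below mirror the upload limits listed earlier (.pdf, .txt, or .docx, up to 10 MB). The function names and error messages are illustrative, and treating 10 MB as 10 × 1024 × 1024 bytes is an assumption.

```python
import os

ALLOWED_EXTENSIONS = {".pdf", ".txt", ".docx"}
MAX_SIZE_BYTES = 10 * 1024 * 1024  # assumes 10 MB means 10 * 1024 * 1024 bytes


def validate_file(name: str, size_bytes: int) -> None:
    """Raise ValueError if the file would not be accepted for upload."""
    extension = os.path.splitext(name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or 'none'}")
    if size_bytes > MAX_SIZE_BYTES:
        raise ValueError("file exceeds the 10 MB limit")


def import_files(files: list[tuple[str, int]]) -> tuple[list[str], dict[str, str]]:
    """Process every file, recording failures instead of stopping at the first one."""
    imported, failed = [], {}
    for name, size_bytes in files:
        try:
            validate_file(name, size_bytes)
            imported.append(name)
        except ValueError as error:
            failed[name] = str(error)
    return imported, failed


imported, failed = import_files(
    [("guide.pdf", 1_000), ("video.mp4", 1_000), ("notes.txt", 20 * 1024 * 1024)]
)
print(imported, failed)
```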
Advanced: importing data using Voiceflow's API
Voiceflow's Knowledge Base API allows you to manage your Knowledge base programmatically. Common API use cases include:
- Importing content from Notion, Google Docs, or Airtable via an SDK or API request
- Managing KB entries dynamically across multiple workspaces, outside of the Voiceflow Creator
- Debugging & testing chunk processing behind the scenes
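As a rough sketch of the first use case, the snippet below builds (but does not send) an HTTP request that asks an API to import a web page. The endpoint URL, the payload shape, and the use of an `Authorization` header are all assumptions made for illustration. Check Voiceflow's Knowledge Base API reference for the real endpoint and request format.

```python
import json
import urllib.request


def build_import_request(
    endpoint: str, api_key: str, page_url: str
) -> urllib.request.Request:
    """Build a POST request asking the API to import one URL.

    The payload shape here is hypothetical, not Voiceflow's documented format.
    """
    payload = json.dumps({"data": {"type": "url", "url": page_url}}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        method="POST",
        headers={"Authorization": api_key, "Content-Type": "application/json"},
    )


# Placeholder endpoint and key; replace with values from the API reference.
request = build_import_request(
    "https://api.example.com/knowledge-base/docs",
    "YOUR_API_KEY",
    "https://website.com/pricing",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect or log the payload before anything is uploaded.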