Append website text to RAG knowledge base

This guide explains how to incrementally add information scraped from external websites to your AI Employee's memory. You will learn how to configure scraping instructions, verify the update history, and reset the website memory when necessary.

RAG website append overview

The RAG (Retrieval Augmented Generation) Website Append feature allows you to add knowledge to your AI Employee by pointing it to specific URLs. The system visits these sites, scrapes the content, and converts it into structured topics tagged with rag_autogenerated_from_websites.

This process is incremental: you can add new sources without overwriting existing knowledge. When a newer record contains updated information, the agent prioritizes it over the older one.

🗒️ NOTE

Website memory is independent of text memory. Resetting one does not affect the other.

Add websites to memory

To add new information, you must provide the URL and specific instructions on what data to extract.

Navigate to Builder > Attributes (or to Portal > Settings > Knowledge Base).
Locate the attribute project_attributes_rag_knowledge_base_append_websites.

Enter the website details using the format below:

Website: https://example.com
Objective: Collect information about pricing and services offered
Keywords: pricing, services

Save your changes.

Once saved, the system processes the request asynchronously. Because the agent must visit and read the site, this process often takes 10 minutes or more.
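
If you prepare entries for several sites, a small helper can keep the field text consistent. The following is a minimal sketch, not part of the product: format_append_entry is a hypothetical name, and it assumes each of the three fields (Website, Objective, Keywords) goes on its own line. You still paste the result into the attribute by hand.

```python
def format_append_entry(website: str, objective: str, keywords: list[str]) -> str:
    """Build the text to paste into the append-websites attribute.

    Hypothetical helper: the one-field-per-line layout is an assumption
    based on the example in this guide.
    """
    if not website.startswith(("http://", "https://")):
        raise ValueError(f"expected an absolute URL, got {website!r}")
    lines = [
        f"Website: {website}",
        f"Objective: {objective}",
        f"Keywords: {', '.join(keywords)}",
    ]
    return "\n".join(lines)


entry = format_append_entry(
    "https://example.com",
    "Collect information about pricing and services offered",
    ["pricing", "services"],
)
print(entry)
```

Rejecting relative URLs early is a design choice of this sketch; the guide does not say how the system handles them.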

❗❗ IMPORTANT

The text extracted from the site must fit within system limits (~100,000 characters). For very large websites, add specific pages one by one rather than the entire domain root.
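
To decide which pages to submit one by one, you can estimate page sizes before adding them. This is a rough sketch under stated assumptions: pages_over_limit is a hypothetical helper, the page texts are ones you have collected yourself, and the system may count characters differently from Python's len, so treat the result as a guide rather than a guarantee.

```python
CHAR_LIMIT = 100_000  # approximate limit quoted in this guide


def pages_over_limit(pages: dict[str, str], limit: int = CHAR_LIMIT) -> list[str]:
    """Return the URLs whose text is longer than `limit` characters."""
    return [url for url, text in pages.items() if len(text) > limit]


# Placeholder texts stand in for real page content.
sample = {
    "https://example.com/pricing": "x" * 12_000,
    "https://example.com/docs": "x" * 250_000,
}
oversized = pages_over_limit(sample)
print(oversized)
```

Any URL reported here is a candidate for being split into narrower pages before you add it.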

Upon successful processing, the system automatically clears the field to indicate readiness for new input.

Verify update history

You can verify that your instructions were received and processed by checking the log attribute.

Locate the attribute project_attributes_rag_knowledge_base_append_websites_log.
Review the content. Newest entries appear at the bottom.

The log uses a specific format to separate entries:

Separator: ---
Header: 📅 Append Log Entry [timestamp]

🗒️ NOTE

This log contains your instructions and URLs, not the scraped content itself. It serves as an audit trail and backup.
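
If you copy the log text out for review, the separator and header described above are enough to split it into entries. The sketch below is illustrative only: parse_append_log is a hypothetical helper, the sample timestamps are made up, and the exact timestamp format inside the brackets is not documented here, so the parser keeps it as an opaque string.

```python
import re

# Matches the header line described in this guide; the timestamp is kept as-is.
HEADER = re.compile(r"^📅 Append Log Entry \[(?P<timestamp>[^\]]*)\]$")


def parse_append_log(log_text: str) -> list[dict]:
    """Split copied log text into entries, oldest first."""
    entries = []
    body = None  # lines of the entry being read; None before the first header
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            body = []
            entries.append({"timestamp": header.group("timestamp"), "lines": body})
        elif line == "---":
            body = None  # a separator closes the current entry
        elif line and body is not None:
            body.append(line)
    return entries


# Made-up sample in the assumed layout.
sample_log = """---
📅 Append Log Entry [2024-01-01 09:00]
Website: https://example.com
Objective: Collect information about pricing and services offered
Keywords: pricing, services
---
📅 Append Log Entry [2024-02-01 09:00]
Website: https://example.com/faq
Objective: Collect answers to common questions
Keywords: faq
"""
entries = parse_append_log(sample_log)
latest = entries[-1]  # newest entries appear at the bottom of the log
print(latest["timestamp"], latest["lines"][0])
```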

Reset website memory

If you need to clear all website-based knowledge (for example, if a scraped site has completely changed its offerings), you can perform a reset.

Locate the attribute project_attributes_rag_knowledge_base_append_websites_reset.
Set the value to True.

This action immediately deletes all knowledge topics tagged as rag_autogenerated_from_websites. The attribute will automatically revert to False once the reset is complete.

🚨 WARNING

Deleted topics cannot be restored. However, your instructions remain preserved in the project_attributes_rag_knowledge_base_append_websites_log attribute, so you can copy them back into the append field to rebuild the memory.
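
Conceptually, the reset removes topics by tag and leaves everything else alone. The following in-memory model is only an illustration of that behavior, not the platform's implementation: the Topic class, reset_website_memory, and the manually_added tag are all hypothetical.

```python
from dataclasses import dataclass, field

WEBSITE_TAG = "rag_autogenerated_from_websites"  # tag named in this guide


@dataclass
class Topic:
    """Hypothetical stand-in for a knowledge topic."""
    title: str
    tags: set[str] = field(default_factory=set)


def reset_website_memory(topics: list[Topic]) -> list[Topic]:
    """Keep only the topics that were not generated from websites."""
    return [t for t in topics if WEBSITE_TAG not in t.tags]


topics = [
    Topic("Competitor pricing", {WEBSITE_TAG}),
    Topic("Refund policy", {"manually_added"}),  # hypothetical tag
]
remaining = reset_website_memory(topics)
print([t.title for t in remaining])
```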

Reset and update workflow

If you provide new URLs in the append field and set the reset switch to True simultaneously, the system will:

Clear the old memory first.
Process the new scraping request immediately after.
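
The ordering matters: because the reset runs first, the new request is not wiped by it. The sketch below models that sequence with plain Python data. It is an assumption-laden illustration: apply_update is a hypothetical function, memory is modeled as (tag, text) pairs, and the scraping step is replaced by a placeholder string.

```python
WEBSITE_TAG = "rag_autogenerated_from_websites"  # tag named in this guide


def apply_update(memory, append_text, reset):
    """Return (new_memory, steps) for one save of both attributes.

    memory is a list of (tag, text) pairs; scraping is stubbed out.
    """
    steps = []
    if reset:
        # Step 1: drop everything that came from websites.
        memory = [item for item in memory if item[0] != WEBSITE_TAG]
        steps.append("reset")
    if append_text.strip():
        # Step 2: process the new request (placeholder for real scraping).
        memory = memory + [(WEBSITE_TAG, f"scraped per: {append_text.strip()}")]
        steps.append("append")
    return memory, steps


before = [(WEBSITE_TAG, "old pricing"), ("other", "keep me")]
after, steps = apply_update(before, "Website: https://example.com/new", reset=True)
print(steps)
print(after)
```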

Example scenario: Competitor product launch

Context: A sales manager wants the AI Employee to be aware of a competitor's newly launched "Pro Plan" to improve objection handling.

Action:

The manager locates the competitor's pricing page URL.

They paste the following into the project_attributes_rag_knowledge_base_append_websites attribute:

Website: https://competitor.com/pricing/pro-plan
Objective: Gather details on feature limits and pricing tiers
Keywords: pro plan, pricing, limits, cost

After 15 minutes, they check project_attributes_rag_knowledge_base_append_websites_log to confirm the instruction was logged.
The AI Employee can now discuss how their product compares to the competitor's new "Pro Plan."
