This guide explains how the Assessor module automatically tests and evaluates your AI Employees. The Assessor generates realistic test conversations based on your agent's configuration, executes them across voice and chat channels, and produces performance scores to validate agent behavior before going live.
How the Assessor works
The Assessor uses a client-server architecture to run automated tests at scale:
Scenario generation: The Assessor reads your agent's Intent Type Map (ITM) and Agent Main Instruction (AMI) to generate realistic test conversations that cover each configured intent.
Test execution: The Assessor role-plays as a customer and conducts conversations with your AI Employee through phone calls or web chat sessions.
Performance scoring: After each conversation, the Assessor evaluates whether the agent completed the required steps and reached the expected Call-To-Action (CTA). Each intent receives a score from 0 to 100.
Reporting: Results are saved as assessment reports that detail which intents passed, which failed, and why.
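The scoring step above can be sketched in Python. This is an illustrative model only: the 60/40 weighting between step completion and CTA achievement, and the function and parameter names, are assumptions, not the Assessor's actual formula.

```python
def score_intent(required_steps, completed_steps, cta_reached):
    """Score one intent from 0 to 100 (illustrative weighting, not the real formula)."""
    if not required_steps:
        return 100 if cta_reached else 0
    # Fraction of required conversation steps the agent actually completed.
    step_ratio = len(set(completed_steps) & set(required_steps)) / len(required_steps)
    # Hypothetical split: 60% for step completion, 40% for reaching the CTA.
    return round(60 * step_ratio + 40 * (1 if cta_reached else 0))

# e.g. 2 of 3 steps completed and the CTA reached:
print(score_intent(["greet", "collect_date", "confirm"], ["greet", "collect_date"], True))  # 80
```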
Architecture
The Assessor consists of two components:
Assessor Client: Generates test scenarios from your agent's configuration, conducts conversations, and evaluates results.
Assessor Server: Manages phone number pooling, test queue orchestration, and resource allocation. The server coordinates testing resources so multiple assessments can run in parallel without bottlenecks.
This separation allows the system to scale efficiently: the server handles resource logistics while the client focuses on test quality.
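The server's phone number pooling can be pictured as a blocking queue: a test acquires a number, runs, and returns it, so parallel assessments never share a line. This is a minimal sketch; the class and method names are illustrative, not the Assessor Server's API.

```python
import queue

class PhoneNumberPool:
    """Minimal sketch of server-side number pooling (names are illustrative)."""

    def __init__(self, numbers):
        self._free = queue.Queue()
        for n in numbers:
            self._free.put(n)

    def acquire(self, timeout=30):
        # Blocks until a number is free, so concurrent tests never collide on a line.
        return self._free.get(timeout=timeout)

    def release(self, number):
        # Return the number to the pool for the next queued test.
        self._free.put(number)

pool = PhoneNumberPool(["+15550100", "+15550101"])
number = pool.acquire()
# ... place the voice test call to the AI Employee here ...
pool.release(number)
```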
Supported channels
The Assessor can test AI Employees across multiple channels:
| Channel | Description |
| --- | --- |
| Voice (phone) | Places phone calls to your AI Employee using pooled phone numbers managed by the Assessor Server. |
| Web chat | Conducts text-based conversations through the AssessorChatFlow, testing chat agent responses and conversation logic. |
Test scenarios
The Assessor automatically generates test scenarios from two sources:
Intent Type Map (ITM): Defines the intents your agent handles (e.g., booking an appointment, requesting a quote). The Assessor creates test conversations that exercise each intent.
Agent Main Instruction (AMI): Provides additional context about your agent's behavior, tone, and procedures. The Assessor uses this to generate realistic customer personas and conversation topics.
Tests can be configured to cover both working-hours and non-working-hours scenarios by setting the run_both_wh attribute.
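The effect of run_both_wh can be illustrated as a simple scenario expansion: each intent yields one working-hours scenario, plus an after-hours variant when the flag is set. The function name and scenario fields below are assumptions for illustration; only the run_both_wh attribute comes from this guide.

```python
def expand_scenarios(intents, run_both_wh=False):
    """Expand each intent into one or two test scenarios (illustrative sketch)."""
    scenarios = []
    for intent in intents:
        scenarios.append({"intent": intent, "working_hours": True})
        if run_both_wh:
            # Add an after-hours variant so out-of-hours behavior is also tested.
            scenarios.append({"intent": intent, "working_hours": False})
    return scenarios

# One intent, both time windows -> two test scenarios.
print(expand_scenarios(["book_appointment"], run_both_wh=True))
```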
Assessment results
Each test run produces a report containing:
Intent scores: A 0β100 score for each tested intent based on step completion and CTA achievement.
Conversation transcripts: Full records of each test conversation.
QA analysis: Detailed breakdown of failed intents explaining what went wrong and which steps were missed.
Assessment reports are saved in the agent's Knowledge Base with the label report.
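A report with the contents listed above might be modeled like this. The field and method names are hypothetical; only the report contents (intent scores, transcripts, QA analysis) and the Knowledge Base label come from this guide, and treating any score below 100 as failing is an assumption.

```python
from dataclasses import dataclass

@dataclass
class AssessmentReport:
    """Illustrative shape of a saved assessment report (field names are assumptions)."""
    intent_scores: dict   # intent name -> 0-100 score
    transcripts: list     # one full transcript per test conversation
    qa_analysis: dict     # failed intent -> explanation of missed steps
    kb_label: str = "report"  # label used when saving to the Knowledge Base

    def failed_intents(self, passing_score=100):
        # Passing threshold is an assumption; adjust to your own criteria.
        return [i for i, s in self.intent_scores.items() if s < passing_score]

report = AssessmentReport(
    intent_scores={"book_appointment": 80, "request_quote": 100},
    transcripts=[],
    qa_analysis={"book_appointment": "missed the confirmation step"},
)
print(report.failed_intents())  # ['book_appointment']
```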
Run an assessment
Navigate to your AI Employee's configuration in the Builder.
Configure the intents you want to test in the Intent Type Map.
Trigger an assessment run.
Monitor the progress as the Assessor executes conversations across your configured channels.
Review the assessment report to identify any intents that need improvement.
NOTE
The Assessor can be configured to run automatically when an agent's configuration is updated by setting run_on_update to true.
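A settings fragment combining the two documented flags might look like the following. The surrounding structure and key nesting are hypothetical; only run_on_update and run_both_wh are named in this guide.

```python
# Hypothetical agent settings fragment; only the two flag names are documented.
agent_settings = {
    "assessor": {
        "run_on_update": True,   # re-run the assessment whenever the config changes
        "run_both_wh": False,    # also test non-working-hours scenarios when True
    }
}
```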
