This article explains how we measure the performance of our AI Employees. The primary metric is called Agent Success, which evaluates whether an AI Employee's interaction with a user was successful from a business perspective.
Success codes
| Code | Value | Description |
| --- | --- | --- |
| Yes-A | 1 | Ideal Scenario: The agent successfully completed its primary task without any issues. |
| Yes-B | 1 | Fallback Success: The agent encountered a problem but successfully used a backup plan to save the customer lead. |
| No | 0 | Failure: The agent failed due to a technical or logical error, resulting in a lost opportunity. |
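
The table above can be read as a lookup from status code to numeric value. The following is a minimal Python sketch, assuming the codes are stored as the strings shown in the table. The names `AGENT_SUCCESS_VALUE` and `success_rate` are hypothetical, and averaging the values into a rate is an assumption about how the numbers might be used, not something this article specifies.

```python
# Hypothetical lookup built from the table above: status code -> numeric value.
AGENT_SUCCESS_VALUE = {"Yes-A": 1, "Yes-B": 1, "No": 0}


def success_rate(statuses):
    """Return the share of sessions whose status maps to 1 (assumed aggregate)."""
    if not statuses:
        raise ValueError("at least one session status is required")
    return sum(AGENT_SUCCESS_VALUE[s] for s in statuses) / len(statuses)


# Three successful sessions out of four.
print(success_rate(["Yes-A", "Yes-B", "No", "Yes-A"]))  # 0.75
```

Because Yes-A and Yes-B both map to 1, a rate computed this way would not distinguish ideal runs from fallback saves; counting the codes separately would be needed for that.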
Example scenarios
Agent Success = "Yes-A": This status is automatically assigned when the AI Employee handles the entire interaction according to its main script, without errors or deviations. The final outcome of the user's decision (e.g., whether they made a purchase or not) does not matter, as long as the AI flawlessly did its job.
Example: A customer wants to book an appointment. The agent gathers the necessary information and successfully sends an SMS with the appointment link. The agent's task is complete.
Agent Success = "Yes-B": This status is assigned when the AI Employee starts on the main script, runs into an issue, and then successfully switches to a "fallback" procedure so that the business does not lose the customer (the lead is "saved").
Example 1: An agent tries to send an appointment link via SMS, but the customer never receives it. Instead of ending the call, the AI transfers the customer to a human colleague or manually collects their information to be passed to a manager.
Example 2: A customer asks for an appointment on Monday, and the AI mistakenly says the day is unavailable. The AI then recovers by offering Tuesday instead, and the customer agrees and books the appointment.
Example 3: The AI confirms an appointment, but an API error prevents it from being saved in the CRM system. The agent then automatically sends an apology SMS to the customer and emails the business managers about the failure, allowing them to manually book it and save the customer.
Agent Success = "No": This status is assigned when a technical or logical failure occurs that the AI Employee cannot recover from, leading to a potential loss of revenue.
Example 1: A customer wants to book an appointment on Monday. The AI incorrectly states that Monday is a non-working day and ends the conversation without offering an alternative.
Example 2: The AI successfully agrees on an appointment time with a customer and says, "Your appointment is booked," before hanging up. However, an API error occurs, and the appointment is never actually created, leading to a lost opportunity.
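
The scenarios above reduce to two questions: did the session hit an issue, and if so, was the lead saved? The sketch below expresses that rule in Python. It is only a summary of the examples, not the actual assessment logic; the function `classify_session` and its two boolean flags are hypothetical names introduced for illustration.

```python
def classify_session(issue_occurred: bool, lead_saved: bool) -> str:
    """Map two hypothetical session flags to an Agent Success status."""
    if not issue_occurred:
        return "Yes-A"  # main script completed without errors or deviations
    if lead_saved:
        return "Yes-B"  # a fallback procedure kept the lead
    return "No"  # unrecovered failure, opportunity lost


# SMS with the appointment link was sent on the first attempt.
print(classify_session(issue_occurred=False, lead_saved=True))
# CRM save failed, but the apology SMS and manager email went out.
print(classify_session(issue_occurred=True, lead_saved=True))
# The agent said "booked" and hung up, but the appointment was never created.
print(classify_session(issue_occurred=True, lead_saved=False))
```

Note that the customer's final decision is not an input: in line with the Yes-A definition, only the agent's own behavior determines the status.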
Automated vs. human assessment
The evaluation process involves both AI and human oversight to ensure accuracy.
Agent Success (Auto Assessment): An AI system automatically analyzes each session and assigns an Agent Success status (Yes-A, Yes-B, or No).
Agent Success (Human Assessment): A human assessor can review any session. If they disagree with the AI's automatic assessment, they can override it by setting a different Agent Success status.
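
The override rule described in this section can be sketched as follows: a session keeps its automatic status unless a human assessor has set one. This is a minimal Python illustration under assumed names; `SessionAssessment`, `auto_status`, `human_status`, and `effective_status` are hypothetical and do not reflect any actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionAssessment:
    """Hypothetical record holding both assessments for one session."""

    auto_status: str  # assigned by the AI system: "Yes-A", "Yes-B", or "No"
    human_status: Optional[str] = None  # set only if a human reviewed the session

    def effective_status(self) -> str:
        """Return the human assessment when present, otherwise the automatic one."""
        if self.human_status is not None:
            return self.human_status
        return self.auto_status


# No human review: the automatic status stands.
print(SessionAssessment(auto_status="Yes-A").effective_status())
# A human assessor disagreed and overrode the automatic status.
print(SessionAssessment(auto_status="Yes-A", human_status="No").effective_status())
```

Keeping both fields, rather than overwriting the automatic value, is a design choice of this sketch: it would let disagreements between the two assessments be reviewed later.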