Designing Conversational AI Agents
When designing Conversational AI agents we deal with three broad aspects:
The possible behavior, language and actions that the user of the agent may want to use to achieve their goal (get an answer to a question, complete a transaction, make a change to a policy, etc).
The possible and available behaviors, language and actions that the AI agent has to help the user achieve the goal.
The context of the conversation and the overall rules that govern the interaction and dictate how language should be interpreted and when certain rules should be applied.
A conversation designer needs to be able to manipulate and reason about all of these elements within a Conversational AI framework in order to design for the right outcomes.
The OpenDialog approach directly supports you in designing across these elements and combining them in a number of different ways.
The diagram below illustrates the key components at a high level. When user input comes in, we classify it (through Semantic Classifiers) to understand what the user is trying to say. We then contextualise it based on the state of the overall business process we are supporting, the specific type of conversation we are having and what we know about the user. We can then reason about where the conversation should transition next before we go ahead and generate a response for the user. This reasoning cycle is illustrated below.
This functionality is supported by the OpenDialog AI Agent Orchestration Layer in combination with the OpenDialog Conversation Framework.
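To make the cycle concrete, here is a minimal Python sketch of the four steps: classify, contextualise, reason about the transition, and respond. It is not OpenDialog code and does not reflect the platform's API. Every name in it (`Context`, `classify`, `contextualise`, `TRANSITIONS`, `RESPONSES`, `handle_turn`) is hypothetical, and keyword matching stands in for a real semantic classifier.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical conversation state; not an OpenDialog type."""
    process_state: str                        # state of the overall business process
    conversation: str                         # type of conversation in progress
    user: dict = field(default_factory=dict)  # what we know about the user


def classify(utterance: str) -> str:
    """Stand-in for a semantic classifier: naive keyword matching."""
    text = utterance.lower()
    if "address" in text or "deliver" in text:
        return "provide_address"
    if text.startswith("yes"):
        return "confirm"
    return "unknown"


def contextualise(intent: str, ctx: Context) -> str:
    """Interpret the classified intent in light of the current context."""
    # The same "confirm" means different things in different conversations.
    if intent == "confirm":
        return f"confirm_{ctx.conversation}"
    return intent


# Assumed transition table: (current state, contextualised intent) -> next state
TRANSITIONS = {
    ("awaiting_address", "provide_address"): "awaiting_confirmation",
    ("awaiting_confirmation", "confirm_delivery"): "delivery_complete",
}

# Assumed canned responses, keyed by the state the conversation moved to
RESPONSES = {
    "awaiting_confirmation": "Thanks. Shall I use that address?",
    "delivery_complete": "Great, your delivery details are saved.",
}


def handle_turn(utterance: str, ctx: Context) -> str:
    """Run one pass of the cycle: classify, contextualise, transition, respond."""
    intent = classify(utterance)
    interpreted = contextualise(intent, ctx)
    next_state = TRANSITIONS.get((ctx.process_state, interpreted))
    if next_state is None:
        # No valid transition from the current state: leave the state unchanged.
        return "Sorry, I did not understand that."
    ctx.process_state = next_state
    return RESPONSES[next_state]
```

Calling `handle_turn("Please deliver to 1 Main Street", ctx)` with `ctx = Context("awaiting_address", "delivery")` moves the context to `awaiting_confirmation`. The point of the sketch is that the reply depends on both the classified input and the current context, not on the input alone.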
The OpenDialog conversation framework enables much of the flexibility that the platform provides. It consists of levels and components that help you define this space within which your agent and the user communicate. It allows you to take a design-system approach to conversation design, going from high-level descriptions to individual turns within a conversation.
The different levels of the framework are:
When we start an OpenDialog application we start with a Scenario. This holds the highest-level description of the space we are designing.
A scenario is the highest level in the OpenDialog framework. It encompasses the full set of functionality that makes up the conversational application. For example, your scenario might be a Pizza Order activity, an Insurance Claim activity or a Customer Support activity.
Within a Scenario your AI agent and the user will have conversations. A conversation refers to communication for specific goals. Conversations can be viewed as steps in the customer journey, or conversations to be had. Examples: a welcome conversation, a payment and delivery conversation.
Conversations are then further split into scenes. A scene deals with a specific stage, aspect or subgoal of a conversation. It is a middle layer in the model that allows for much flexibility in building out conversations. Example: a payment conversation can include a scene to enter payment data, a scene to confirm payment and a scene to finalize payment. A delivery conversation can include a scene to collect delivery details such as the address.
Scenes are made up of turns. In a turn the user and application exchange specific information or intents. Example: the scene to collect a delivery address can have two turns: a turn to collect the address details and a turn to confirm the address details that were collected.
A turn consists of intents. An intent holds the message and its meaning. An intent can come from the app or the user. For example the application may request an address and the user may provide it or the application may display an address for confirmation and the user can confirm it.
This is where you can connect back to what we said at the start about designing a space where both the agent and the user are represented. We have intents both for the app (our AI agent) and the user.
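The five levels nest inside one another, which can be sketched as plain Python dataclasses. This illustrates the hierarchy only, under the assumption that each level simply contains a list of the level below it. The class and field names are hypothetical and do not correspond to OpenDialog's internal data model or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Intent:
    name: str
    speaker: str       # "app" (the AI agent) or "user"
    message: str = ""  # example wording; purely illustrative


@dataclass
class Turn:
    name: str
    intents: List[Intent] = field(default_factory=list)


@dataclass
class Scene:
    name: str
    turns: List[Turn] = field(default_factory=list)


@dataclass
class Conversation:
    name: str
    scenes: List[Scene] = field(default_factory=list)


@dataclass
class Scenario:
    name: str
    conversations: List[Conversation] = field(default_factory=list)


def intent_paths(scenario: Scenario) -> List[str]:
    """List every intent with its full path from the scenario down."""
    paths = []
    for conv in scenario.conversations:
        for scene in conv.scenes:
            for turn in scene.turns:
                for intent in turn.intents:
                    paths.append(
                        f"{scenario.name} / {conv.name} / {scene.name} / "
                        f"{turn.name} / {intent.name} ({intent.speaker})"
                    )
    return paths


# The delivery-address example from the text, expressed in this structure.
pizza = Scenario("Pizza Order", [
    Conversation("Delivery", [
        Scene("Collect delivery address", [
            Turn("Collect address", [
                Intent("request_address", "app", "Where should we deliver?"),
                Intent("provide_address", "user"),
            ]),
            Turn("Confirm address", [
                Intent("display_address", "app", "Is this address correct?"),
                Intent("confirm_address", "user"),
            ]),
        ]),
    ]),
])

for path in intent_paths(pizza):
    print(path)
```

Walking the structure yields one line per intent, four in this example, and each line shows whether the intent belongs to the app or to the user.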
As you start designing in OpenDialog you will see that we provide multiple ways for you to explore this space and get familiar with the concepts, which, in turn, will equip you with the tools to create flexible Gen-AI powered conversational applications.