Build a Real-Time Voice AI Assistant with Twilio's ConversationRelay, LiteLLM, and Python

April 15, 2025
Written by
Hao Wang
Twilion
Reviewed by
Paul Kamp
Twilion

Voice AI assistants are evolving rapidly, and real-time interactions with LLMs are becoming smoother and more human-like. Twilio's ConversationRelay helps you build on this trend: it lets you quickly integrate an LLM of your choice with the voice channel over Twilio Voice.

This project integrates Twilio's ConversationRelay with LiteLLM, allowing you to pick from multiple large language model (LLM) providers with a standardized API. In this tutorial, I’ll show you how to use ConversationRelay and LiteLLM to integrate LLMs from OpenAI, Anthropic, and DeepSeek with Twilio Voice in Python. Let’s get started!

Features

This Voice AI assistant is designed for a quick integration with Twilio Voice. Here are features of the setup you’ll build:

  • Real-time streaming responses via a WebSocket server (using the FastAPI framework)
  • Multi-provider LLM support using LiteLLM
  • Smoother voice interactions, thanks to a system prompt that tells the LLM it's a voice assistant
  • A straightforward Twilio Voice integration through ConversationRelay, letting you call the AI and chat at any hour!

Prerequisites

To get started, ensure you have the following:

  • Python 3.8+
  • API keys for the LLM providers you’d like to test (In this demo, I’ve shown support for OpenAI, Anthropic, and DeepSeek)
  • ngrok for exposing your test server to Twilio
  • A Twilio account and a registered phone number
      • You can sign up for a free Twilio account here
      • Search for and purchase a Twilio phone number with these instructions (make sure you select one with Voice support)

Install the server

1. Clone the repository and change into its directory:

cd ConvRelay-LiteLLM

2. Create and enter a virtual environment:

python3 -m venv env
source env/bin/activate

3. Install dependencies:

pip install fastapi 'uvicorn[standard]' litellm python-dotenv

4. Configure API keys: Create a .env file and add your API credentials:

OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
DEEPSEEK_API_KEY=your_deepseek_key
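LiteLLM reads each provider's key from these environment variables, so it's worth confirming they're loaded before the server starts. Here's a minimal sketch of that check; `missing_keys` is a hypothetical helper, not part of the repository:

```python
# Hypothetical startup check: make sure the .env keys are visible to LiteLLM.
import os

try:
    from dotenv import load_dotenv  # provided by python-dotenv
    load_dotenv()  # copies the keys from .env into os.environ
except ImportError:
    pass  # fall back to whatever is already in the environment

PROVIDER_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "DEEPSEEK_API_KEY"]

def missing_keys(env=None):
    """Return the provider keys that are not set, so startup can warn early."""
    env = os.environ if env is None else env
    return [k for k in PROVIDER_KEYS if not env.get(k)]

if __name__ == "__main__":
    absent = missing_keys()
    if absent:
        print("Missing keys:", ", ".join(absent))
```

You only need keys for the providers you actually plan to test; a warning for the others is harmless.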

5. Pick which LLM provider to use by selecting the model in the draft_response function.
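Because LiteLLM standardizes the call signature, switching providers comes down to changing the model string. Here's a rough sketch of what draft_response might look like; the exact model names are assumptions, so check LiteLLM's model list for the ones your accounts support:

```python
# Sketch: the model string alone decides which provider LiteLLM calls.
# Model names below are assumptions -- verify against LiteLLM's docs.
MODELS = {
    "openai": "gpt-4o",
    "anthropic": "anthropic/claude-3-5-sonnet-20240620",
    "deepseek": "deepseek/deepseek-chat",
}

def pick_model(provider):
    """Map a short provider name to a LiteLLM model string."""
    if provider not in MODELS:
        raise ValueError(f"unknown provider: {provider}")
    return MODELS[provider]

async def draft_response(messages, provider="openai"):
    """Stream response tokens for a message history via LiteLLM."""
    from litellm import acompletion  # imported lazily; requires litellm

    response = await acompletion(
        model=pick_model(provider),
        messages=messages,
        stream=True,  # yield tokens as they arrive for real-time speech
    )
    async for chunk in response:
        token = chunk.choices[0].delta.content
        if token:
            yield token
```

Streaming matters here: sending tokens to ConversationRelay as they arrive keeps the spoken reply feeling responsive instead of waiting for the full completion.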

6. Improve or customize the system prompt as needed, using the voice-specific version below as a reference. (You can see our best practices here).

You are a helpful, concise, and reliable voice assistant. Your responses will be converted directly to speech, so always reply in plain, unformatted text that sounds natural when spoken.
When given a transcribed user request:
1. Silently fix likely transcription errors. Focus on intended meaning over literal wording. For example, interpret “buy milk two tomorrow” as “buy milk tomorrow.”
2. Keep answers short and direct unless the user asks for more detail.
3. Prioritize clarity and accuracy. Avoid bullet points, formatting, or unnecessary filler.
4. Answer questions directly. Acknowledge or confirm commands.
5. If you don't understand the request, say: “I'm sorry, I didn't understand that.”
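The system prompt seeds each call's message history, so every request to the LLM carries it. A sketch of how that wiring might look (the helper names are hypothetical, and SYSTEM_PROMPT is abbreviated here; use the full prompt above in practice):

```python
# Hypothetical wiring: seed each call's history with the voice-specific prompt.
SYSTEM_PROMPT = (
    "You are a helpful, concise, and reliable voice assistant. "
    "Your responses will be converted directly to speech, so always reply "
    "in plain, unformatted text that sounds natural when spoken."
)  # abbreviated; paste the full prompt from above in practice

def new_conversation(system_prompt=SYSTEM_PROMPT):
    """Start a per-call message history with the system prompt first."""
    return [{"role": "system", "content": system_prompt}]

def add_user_turn(history, transcript):
    """Append one transcribed utterance so the LLM sees the full dialog."""
    history.append({"role": "user", "content": transcript})
    return history
```

Keeping one history per call means the assistant remembers earlier turns in the same phone conversation.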

Run the server

1. Start the WebSocket server:

python server_litellm.py
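Under the hood, the server exchanges JSON frames with ConversationRelay over the WebSocket: incoming "prompt" frames carry the caller's transcribed speech in a voicePrompt field, and outgoing "text" frames carry the tokens to speak, with last marking the end of a reply. A stripped-down sketch of that dispatch (field names follow Twilio's ConversationRelay docs, but verify them against the current protocol reference):

```python
# Sketch of the ConversationRelay frame handling inside the WebSocket loop.
import json

def make_text_frame(token, last=False):
    """Build one outgoing frame; ConversationRelay speaks each token it gets."""
    return {"type": "text", "token": token, "last": last}

def handle_frame(raw, respond):
    """Dispatch one incoming frame; respond maps a transcript to reply text."""
    msg = json.loads(raw)
    if msg.get("type") == "prompt":
        reply = respond(msg.get("voicePrompt", ""))
        return make_text_frame(reply, last=True)
    return None  # setup and other frame types are ignored in this sketch
```

In the real server, respond would stream tokens from the LLM, sending one "text" frame per token with last=True only on the final one.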

2. Expose the server with ngrok:

ngrok http 8000

3. Copy the HTTPS URL ngrok provides, without the scheme (‘https://’). For example, I copied ‘03ca546a6a10.ngrok.app’.

Screenshot of an Ngrok session status showing connection details, an active session, and a forwarding URL.

Connect Twilio ConversationRelay

Now, we’ll integrate our server with Twilio Voice using ConversationRelay.

1. Set up a TwiML Bin with the ngrok URL (without the scheme) you copied in the last step. (You can create a TwiML Bin from your Twilio Console.)

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect>
    <ConversationRelay url="wss://your-ngrok-domain.ngrok.io/ws" welcomeGreeting="Welcome to the Twilio and LiteLLM ConversationRelay demo! I'm an A I Assistant." />
  </Connect>
</Response>

2. Link your Twilio phone number to the TwiML Bin. On your phone number's configuration page, under Voice Configuration, set the A call comes in dropdown to TwiML Bin and select the bin you just created.

3. Now you’re ready – make a test call to your Twilio number and interact with the AI assistant in real-time!

Conclusion

And there you have it. Your Voice AI assistant brings together Twilio's powerful ConversationRelay and LiteLLM's provider flexibility to quickly create a hotline for you to call an LLM. As you saw, you can switch between providers to test the capabilities of various LLMs for your use cases.

Want to extend its capabilities? Check out the ConversationRelay docs, or see our other ConversationRelay tutorials for topics such as advanced interruption handling and function calling. The possibilities are endless!

Hao Wang is a Solution Architect at Twilio, dedicated to empowering customers to maximize the potential of Twilio’s products. With a strong passion for emerging technologies and Voice AI, Hao is always exploring innovative ways to drive impactful solutions.