How to Make Outgoing Calls with Twilio Voice, the OpenAI Realtime API, and Python
OpenAI recently launched their Realtime API, exposing the multimodal capabilities of their GPT-4o model. When they launched, we posted our tutorial on how you could build a voice AI assistant in Python. Since then, many of you have asked for a demonstration of how to have the AI call out to a number.
Don’t worry, I’ve got you covered. In this tutorial, I’ll show you how to make an outbound phone call using Python, Twilio Voice and Media Streams, and the OpenAI Realtime API. I’ll show an example filter function, which demonstrates how to check if a phone number is allowed to be called, then (assuming it is!) begins a phone call. Finally, after a user picks up the call, we’ll have OpenAI’s Realtime API talk first to kick off a conversation.
Sounds good? Well, the AI will sound even better… let’s code.
Prerequisites
To follow along, ensure you have:
- Python 3.9+ installed. Download it from here. (I used
3.9.13
here, but newer versions should work too. Verify your version if issues arise.) - A Twilio account. If you don’t have one yet, you can sign up for a free trial here.
- A Twilio number with Voice capabilities to make an outbound call. Here are instructions to purchase one.
- An OpenAI account and an OpenAI API Key with OpenAI Realtime API access. Sign up here to get one.
- ngrok or another tunneling solution to expose your local server to the internet for testing. You can download ngrok here.
- Either:
- A second Twilio phone number where you can place a call using the Twilio Dev Phone. Or
- A phone number to a device where you can receive phone calls, that you’ve added to your Twilio Verified Caller IDs. You can find a tutorial here.
Awesome, let’s do this.
Build the Python outbound AI call application
Step 1: Set up your project
To start, create a project directory and set up your Python environment:
As you can see there, we’ll do our work in a virtual environment. Activate the virtual environment:
- On Windows:
.\venv\Scripts\activate
- On macOS/Linux:
source venv/bin/activate
Step 2: Install the required packages
Once the virtual environment is active, install the necessary Python packages using pip
:
These packages provide the tools needed to handle HTTP requests and WebSockets, and to simplify interactions with Twilio and OpenAI.
I’m using FastAPI here, just like in the Python inbound OpenAI Realtime example. I found it more straightforward to handle websockets and the asynchronous code than some other frameworks.
Step 3: Create the project files
We will create a file named main.py
for our main server code. We’ll also use an .env
file to store sensitive environment variables. ( More information on this strategy here)
Create a .env
file to securely store API keys and other variables:
Add the following to your .env
file, replacing my placeholders with your actual keys. Find your TWILIO_ACCOUNT_SID
and TWILIO_AUTH_TOKEN
in your Twilio Console. The PHONE_NUMBER_FROM
should be the Twilio phone number you purchased in the Prerequisites, formatted as E.164 (e.g., +18885551212
). Set DOMAIN
to nothing for now—we'll address it later. You can copy my PORT
and set it to 6060
.
Now, create the main.py
file:
Great! Now, open main.py
with your favorite text editor or IDE and let’s get to it.
Step 4: Write the server code
With the project's structure ready, the following steps will guide you through writing the server code. I’ll try to explain the trickier parts, but you can skip the explanations for the parts you understand (and paste the code directly).
Step 4.1 Import dependencies, set constants, and set environment variables
Add this at the top of the main.py
file:
As you can see, we first import all of the packages we’ll use, then load all the environment variables in the .env
file (that we discussed above) using load_dotenv()
. We then initialize a FastAPI instance for routing as well as the Twilio client we’ll be using to make our outbound call.
We also define the system message, voice, and server port. Then, we choose the OpenAI events to log to the console.
SYSTEM_MESSAGE
is instructions we send to OpenAI, basically controlling the AI’s behavior during the phone call, while VOICE
controls how the AI will sound. (You can find more information in OpenAI’s Realtime API Reference.)
Step 4.2 Define FastAPI Routes for HTTP and WebSocket handling
After the above code, implement the main HTTP and WebSocket routes for server interactions:
The /media-stream
WebSocket route maintains a live connection for continuous data exchange between Twilio and OpenAI. As audio events come in, audio is proxied between the two – response.audio.delta
from OpenAI, and media payloads from Twilio.
There is a lot going on here. I’m skipping some explanations, but you can read more details in our initial tutorial.
Step 4.3 Set up the initial OpenAI Session
Next, we initialize the session with OpenAI to configure our phone interaction, and send a conversation item to get the AI to talk first. Paste this next:
I explain similar code in more detail in the previous Python tutorial. However, you’re here, so here’s a brief explanation of what’s going on… well, here:
- Session Update/Initialization: We use the
initialize_session
function to configure the session with our desired settings, such as the AI voice and system message (set in the constants in Step 4.1). After that, we send asession.update
event to OpenAI to update our session’s configuration ( more details can be found here).Another important detail is we set the inbound and outbound audio format tog711_ulaw
. This format is supported by Twilio and Media Streams, so we don’t have to do any transcoding. - AI talks first: This code is new for this tutorial. Since we’re dialing a number, we want the AI to talk when the call is picked up. We send a manual conversation update with
conversation.item.create
andresponse.create
. This causes the OpenAI Realtime API to “go first” in this conversation, and greet the person who answers the phone.
Step 4.4 Implement the outbound call functionality
In this section, we'll implement the functionality to make an outbound call using the Twilio API. This involves verifying that you are allowed to make calls to the number you specify, and only then making the call.
Step 4.4.1 Phone number validation
Next, paste in my example phone number validation code:
This function checks if the given phone number to
is allowed to receive calls from your application.
Working through exactly who you are allowed to call is beyond the scope of this tutorial, but if it’s a Twilio phone number you control or one of your validated Outgoing Caller IDs, it’s a safe bet. client.incoming_phone_numbers.list(phone_number=to)
is checking the former, while client.outgoing_caller_ids.list(phone_number=to)
is checking the latter.
Step 4.4.2 Create the outbound call function and a Call SID logger
Next, paste in the outbound calling code:
The make_call
function initiates an outbound call to the specified phone number using Twilio's Python Helper Library. On connect, it connects to your WebSocket route to start proxying audio between OpenAI and Twilio. (The code to do that is in the outbound_twiml
variable.)
Finally, we define the log_call_sid
function to print out the Call SID when we make the outbound call.
Step 4.5 Launch the server
Next, we’ll run through our logic while starting the server. Paste this at the end of main.py
, then save.
This segment employs argument parsing for phone number input, then executes the call setup and starts the server using Uvicorn.
You must pass in a --call
parameter when you start the code, for example --call=+18885551212
. (That’s controlled with required=True
.) If you do, we’ll run through the logic to check your outbound call permissions, then initiate the call.
Okay, awesome! Let’s move on to running and testing it.
Run and test your code
In the next steps, I’ll cover how to get the code to run so you can have the AI make an outbound call.
Step 1: Launch ngrok
You need to use ngrok or a similar product (a VPS, reverse proxy, etc.) to expose your server to Twilio.
Run the following command. (If you changed the port from 6060
, update it here):
Step 1.1 Set the DOMAIN variable
Earlier, we left the DOMAIN
variable in the .env
file blank – let’s set it now.
Copy the Forwarding address from ngrok, without the protocol (omitting the https://
in my image).
Here’s an example using my .env
(with fake values other than DOMAIN
and PORT
):
Further up the digital page, we built a filter function which makes sure we’re only calling numbers we have permission to call. One part of that function allows you to call Twilio numbers you own.
If you’re new to the Dev Phone, go through the Twilio Dev Phone tutorial. It will ask you to install the Twilio CLI and add your account credentials.
When you’re done, run twilio dev-phone
in your console. A screen should pop up that looks like this:
In the Phone Number box, choose the Twilio number you’ll call to test this app. If you have that number configured, it’ll warn you before overwriting the config. Quadruple check the number is okay to use (there’s no Undo!), then hit Use this phone number.
Step 3: Place an outbound call
Run the following in your console, replacing the placeholder number with your Twilio Dev Phone number (or, alternatively, a Verified Caller ID number):
Pick it up and you’ll hear a greeting – go ahead and respond. Enjoy your call with the Realtime API!
Debugging your setup
Assuming your server is running, here are the first places to check if you have issues placing an outbound call:
- Is ngrok running? Can you see any errors on the ngrok screen? Is the
DOMAIN
variable properly set in the.env
file? - Is your code calling OpenAI correctly? See more information in their documentation.
- Have you checked the Error Logs in the Developer Tools?
- Did you get error 21216 from Twilio ? Do you need to add a Primary Caller Profile in TrustHub ?
Conclusion
Congratulations! You successfully created an AI voice assistant that will place an outbound call using Twilio Voice and the OpenAI Realtime API using Python. The code is now ready for your modifications – but check our Python repo first to see if we’ve already implemented some of your dream functionality.
Have fun!
Next steps:
- For an inbound calling version of the app with advanced features (including interruption handling), try the Code Exchange app or the repo.
- Check out the Twilio documentation and OpenAI's Realtime API docs for more advanced features.
- See OpenAI’s documentation on concepts.
Paul Kamp is the Technical Editor-in-Chief of the Twilio Blog. He struck up a conversation about various Python frameworks with the AI to test this tutorial. But you don’t have to call to get in touch with him – reach Paul at pkamp [at] twilio.com.
Related Posts
Related Resources
Twilio Docs
From APIs to SDKs to sample apps
API reference documentation, SDKs, helper libraries, quickstarts, and tutorials for your language and platform.
Resource Center
The latest ebooks, industry reports, and webinars
Learn from customer engagement experts to improve your own communication.
Ahoy
Twilio's developer community hub
Best practices, code samples, and inspiration to build communications and digital engagement experiences.