Live Translation with OpenAI’s Realtime API

By Twilio

  • Flex
  • Twilio
  • Applications

This application demonstrates how to use Twilio and OpenAI's Realtime API for bidirectional voice language translation between a caller and a contact center agent.

The AI Assistant intercepts voice audio from one party, translates it, and speaks the translation in the other party's preferred language. OpenAI's Realtime API significantly reduces latency, which helps the exchange feel like a natural two-way voice conversation.

See here for a video demo of the real-time translation app in action.

Below is a high-level architecture diagram of how this application works:

Architecture of Live Translation with OpenAI’s Realtime API in Twilio Flex

This application uses the following Twilio products in conjunction with OpenAI's Realtime API, orchestrated by this middleware application:

  • Voice
  • Studio
  • Flex
  • TaskRouter

Two separate Voice calls are initiated, both proxied by this middleware service. The caller is asked to choose their preferred language, and the conversation is then queued for the next available agent in Twilio Flex. Once the agent is connected, the middleware intercepts the audio from both parties via Media Streams and forwards it to OpenAI's Realtime API for translation. The translated audio is then forwarded to the other party.
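The forwarding step can be sketched as a pair of message translations between the two WebSocket protocols. This is a minimal illustration, not the middleware's actual code: the function names are hypothetical, the `response.audio.delta` event name is an assumption that may differ between Realtime API versions, and the sketch assumes both sides are configured for the same audio encoding (for example G.711 μ-law) so the base64 payload can be passed through unchanged.

```python
import json
from typing import Optional


def twilio_media_to_openai(raw_message: str) -> Optional[str]:
    """Turn a Twilio Media Streams 'media' message into a Realtime
    'input_audio_buffer.append' event. Other message types return None."""
    message = json.loads(raw_message)
    if message.get("event") != "media":
        return None  # e.g. 'connected', 'start', 'stop' carry no audio
    return json.dumps({
        "type": "input_audio_buffer.append",
        # Assumption: encodings match, so the base64 audio passes through as-is.
        "audio": message["media"]["payload"],
    })


def openai_audio_to_twilio(raw_event: str, other_party_stream_sid: str) -> Optional[str]:
    """Turn a Realtime audio delta into a Twilio 'media' message addressed
    to the OTHER party's stream. Non-audio events return None."""
    event = json.loads(raw_event)
    # Assumption: the audio delta event is named 'response.audio.delta'.
    if event.get("type") != "response.audio.delta":
        return None
    return json.dumps({
        "event": "media",
        "streamSid": other_party_stream_sid,
        "media": {"payload": event["delta"]},
    })
```

In a full middleware, each function would sit between a Twilio WebSocket and an OpenAI WebSocket: audio received on the caller's stream is appended to the translation session, and the translated audio is written to the agent's stream (and vice versa for the opposite direction).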

Step 1:

Step 2: You need an API key to get started.
Get a free API key

Step 3: Set up the code sample locally

Deploy

OR

Want to set up this code sample locally? Follow the setup instructions in the README
