Twilio Video technical overview


Integrate real-time video calling functionality into your web, iOS, and Android applications with Twilio Video. Twilio Video is a communications platform built on WebRTC that provides user access management, media services, and signaling to support scalable video applications.

Twilio Video is programmable, giving you full control over how video appears in your application. You aren't constrained to any particular format and can tune performance for your use case. Twilio Video capabilities include screen sharing, recordings, noise cancellation, virtual backgrounds, dominant speaker detection, and support for Twilio Voice.

Twilio provides SDKs and APIs to build Twilio Video into your app, and tools to monitor and optimize video quality and app performance.


Video Rooms

Video Rooms are the core building blocks of a Twilio Video experience. Participants join a Room and can then exchange audio, video, and other data in real time.

Sharing media tracks

All Participant media tracks (video, audio, and data) go through the Twilio Cloud, which acts as a Selective Forwarding Unit (SFU) to share the media with other Participants.

Video Rooms use a publish-subscribe model for three types of Participant tracks: video, audio, and data. A Participant publishes their video, audio, or data tracks, and all other Participants can subscribe to those published tracks. Participant tracks go to the Twilio Cloud SFU, which forwards that data to the other Participants. You control which tracks to include in your application.

Rooms can have up to 50 concurrent Participants. Clients publish each Participant media track only once to the SFU, which clones and routes it to subscribers (other Participants). This means that a Participant's upstream bandwidth and battery consumption aren't affected by the number of Participants in a Room.
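
To illustrate why publishing once matters, the sketch below compares per-Participant upload under the SFU model described above with a hypothetical peer-to-peer mesh, in which each client would send a separate copy to every other Participant. The mesh comparison and the 1,500 kbps track bitrate are illustrative assumptions, not figures from Twilio.

```python
def upstream_streams(participants: int, topology: str) -> int:
    """Return how many copies of its media one Participant uploads."""
    if participants < 1:
        raise ValueError("a Room needs at least one Participant")
    if topology == "sfu":
        # Publish once; the SFU clones and routes the track to subscribers.
        return 1
    if topology == "mesh":
        # Hypothetical peer-to-peer case: one copy per remote Participant.
        return participants - 1
    raise ValueError(f"unknown topology: {topology!r}")


def upstream_kbps(participants: int, track_kbps: int, topology: str) -> int:
    """Upload bandwidth for one Participant, given an assumed track bitrate."""
    return upstream_streams(participants, topology) * track_kbps


if __name__ == "__main__":
    for size in (2, 10, 50):
        sfu = upstream_kbps(size, 1500, "sfu")
        mesh = upstream_kbps(size, 1500, "mesh")
        print(f"{size} Participants: SFU {sfu} kbps up, mesh {mesh} kbps up")
```

Under these assumptions the SFU figure stays constant as the Room grows, while the mesh figure grows linearly with the number of Participants.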

You can create Video Rooms in two ways:

  • Using the Rooms API resource before a Participant joins a Room
  • Automatically, on the client side, when a Participant requests to join a Room

While the API allows fine-grained control over Room configuration, client-side Room creation is better for rapid scaling to a large number of Rooms. In both cases, Rooms have a maximum duration of 24 hours, and Participants can remain connected for up to 24 hours.
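
As a rough sketch of the first option, the snippet below builds (but does not send) an HTTP request that creates a Room through the Rooms REST resource. It uses only the Python standard library; in practice, Twilio's server-side helper libraries wrap this call. The credentials and Room name are placeholders, and the function name `build_create_room_request` is hypothetical.

```python
import base64
import urllib.parse
import urllib.request

ROOMS_URL = "https://video.twilio.com/v1/Rooms"


def build_create_room_request(
    api_key_sid: str,
    api_key_secret: str,
    unique_name: str,
    max_participants: int = 50,
) -> urllib.request.Request:
    """Build a POST request that would create a Room with the given name."""
    body = urllib.parse.urlencode(
        {"UniqueName": unique_name, "MaxParticipants": max_participants}
    ).encode("ascii")
    # HTTP Basic authentication with an API key SID and secret.
    credentials = f"{api_key_sid}:{api_key_secret}".encode("utf-8")
    auth = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        ROOMS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {auth}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


if __name__ == "__main__":
    request = build_create_room_request("SKxxxx", "your_api_key_secret", "daily-standup")
    print(request.get_method(), request.full_url)
    # Passing the request to urllib.request.urlopen() would actually send it.
```

With client-side creation, none of this is needed: the Room comes into existence when the first Participant connects with a valid Access Token.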

Learn more about Video Rooms.


Components of a Twilio Video app

A Twilio Video app requires both a front-end and a back-end component to support Video Rooms:

  • The back end is the application server that generates Access Tokens for Participants. Your application server can also use Video APIs to create and manage Room settings or Recordings.
  • The front end is the mobile client or web browser client app that users interact with and that connects to the Twilio Cloud. Twilio Video has SDKs for JavaScript, iOS, and Android.
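
To make the back-end role more concrete, here is a minimal sketch of minting an Access Token for one Participant. An Access Token is a signed JSON Web Token (JWT); the claim layout below reflects our understanding of what Twilio's helper libraries emit and should be treated as an assumption. In production, prefer the official helper library for your server language. The function name `make_video_token` and all credential values are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_video_token(
    account_sid: str,
    api_key_sid: str,
    api_key_secret: str,
    identity: str,
    room: str,
    ttl_seconds: int = 3600,
    now: Optional[int] = None,
) -> str:
    """Return an HS256-signed JWT granting `identity` access to `room`."""
    issued_at = int(time.time()) if now is None else now
    # Assumed header and claim layout; verify against Twilio's documentation.
    header = {"typ": "JWT", "alg": "HS256", "cty": "twilio-fpa;v=1"}
    payload = {
        "jti": f"{api_key_sid}-{issued_at}",
        "iss": api_key_sid,
        "sub": account_sid,
        "iat": issued_at,
        "exp": issued_at + ttl_seconds,
        "grants": {"identity": identity, "video": {"room": room}},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        api_key_secret.encode("utf-8"),
        signing_input.encode("ascii"),
        hashlib.sha256,
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


if __name__ == "__main__":
    token = make_video_token("ACxxxx", "SKxxxx", "your_api_key_secret", "alice", "daily-standup")
    print(token.count("."))  # a JWT has three dot-separated segments
```

The application server returns this token to the front end, which passes it to the Video SDK when connecting to the Room. The API key secret never leaves the server.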

Learn more about the components of a Twilio Video app.


Twilio Video capabilities

You can add functionality and tooling to enhance, customize, and optimize your app.

Screen sharing

Your Twilio Video app can include screen capture so that Participants can share their device screen with other Participants in a Video Room.

Learn how to capture a Participant's screen to share in a Room as a video track.

Recordings and Compositions

You can record Video Room content. Because all Participant audio, video, and data passes through Twilio's SFU, Twilio can save that media for you to retrieve after a Room session completes.

Twilio records and stores each Participant media track (video, audio, and data) as a separate file. For example, if Participants share audio and video, each Participant will have a file for video and another for audio. You can choose to record all the tracks in a Room, or capture specific Participants and tracks.

After recording a Room, you can customize the layout of the final recorded video using Compositions. The Composition service takes individual track recordings, formats them visually according to your specifications, and creates an output file in MP4 or WebM format.

You can choose to store Recordings and Compositions in the Twilio Cloud by default, or set up external AWS S3 storage.

Learn more about recordings and compositions.

Diagnostic tools

An end user's network and device setup influence the quality of a video call. Twilio Video tools can give end users feedback about their connectivity before they join a call, display real-time information and metrics about a call, and capture logs for app monitoring.

See the full list of Twilio Video diagnostic and troubleshooting tools.

Adaptive simulcast

Simulcast is a scalable video technique that helps you determine video quality for each Participant based on their available bandwidth. Adaptive simulcast dynamically enables and disables video quality layers to improve bandwidth and CPU usage. This helps save device resources in cases such as a presentation or grid UI layout, when the application doesn't need a Participant's highest resolution video. Adaptive simulcast ensures that publishers are only encoding the video quality layers needed at a given moment.
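
The idea of encoding only the layers that are needed can be pictured with a toy model. This is not Twilio's algorithm: the three layer names and the `layers_to_encode` function are hypothetical, and real layer selection happens inside the SDK and the SFU.

```python
from typing import Iterable, List

# Hypothetical quality layers, ordered from lowest to highest resolution.
LAYERS = ("low", "medium", "high")


def layers_to_encode(subscriber_requests: Iterable[str]) -> List[str]:
    """Return the layers a publisher must encode to satisfy all subscribers."""
    wanted = set(subscriber_requests)
    unknown = wanted - set(LAYERS)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    # Encode a layer only if at least one subscriber currently wants it.
    return [layer for layer in LAYERS if layer in wanted]


if __name__ == "__main__":
    # Grid layout: every subscriber renders a small tile.
    print(layers_to_encode(["low", "low", "low"]))
    # One subscriber pins the publisher full screen.
    print(layers_to_encode(["low", "high", "low"]))
```

In the grid case the publisher in this model skips the medium and high layers entirely, which is the bandwidth and CPU saving the paragraph above describes.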

Learn more in Working with VP8 Adaptive Simulcast.

Virtual backgrounds

You can add virtual backgrounds, background blurring, or other custom video filters in JavaScript applications using the Twilio Video Processors SDK. See a demo of the Video Processors SDK and a blog post about how to use the Video Processors to create virtual backgrounds.

Noise cancellation

Twilio Video Noise Cancellation (powered by Krisp) is an AI-based plugin that filters out background noise in real time. You can host and serve the Krisp audio plugin for JavaScript, iOS, or Android in your application. The plugin runs as part of the audio pipeline between the microphone and audio encoder and removes unwanted sounds during a preprocessing step.

Learn how to add noise cancellation to your Twilio Video app.


Scaling your Twilio Video app

Twilio Video enforces default concurrency and request quotas to optimize resource usage while enabling application growth. Quotas apply per Account SID and include the following:

  • Concurrent Rooms quota
  • Concurrent Participants quota
  • REST API read/write request quotas

To manage quotas and scale your app:

  • Use ad-hoc Rooms to avoid creating unused Rooms.
  • Use status callbacks to reduce read requests.
  • Implement retries with exponential backoff.
  • Request higher quotas by contacting Twilio sales.
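
The retry recommendation in the list above can be sketched as follows. This is a generic pattern rather than Twilio-specific code: `RetryableError` is a hypothetical exception standing in for whatever your HTTP client raises on a throttled or transient failure, and the delay values are illustrative defaults.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying, such as throttling."""


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on RetryableError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            # The ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads out clients that fail at the same time.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Wrap any quota-limited REST call in `call_with_backoff` so that a burst of rejected requests backs off instead of retrying immediately. The `sleep` parameter exists so tests can substitute a fake.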

Learn more about quotas and limits and how to scale your Twilio Video application.


Diagnostic and troubleshooting tools

Gain insight into your video applications and provide feedback to end users about their setup and connectivity both before and during video calls.

  • Video Insights: Analytics and aggregations in the Twilio Console for observing your application, discovering trends, and troubleshooting Rooms and Participants.
  • JavaScript Room Monitor: A browser-based tool that displays real-time information and metrics about a Twilio Video Room. It gathers and processes information from the Room object, including information about Participants' bandwidth, packet loss, and jitter, and displays the information in a modal window in the video application.
  • JavaScript Logger: Capture logs generated by the Twilio Video JS SDK in real time so that you can monitor your front-end applications and see how they behave in production.
  • JavaScript Video diagnostics application: An open-source ReactJS application that tests Participants' device and software setup, connectivity with the Twilio Cloud, and network performance. It provides feedback to end users about network quality and device setup, and includes recommendations for improving video call quality. This application is built on top of the Preflight API and RTC Diagnostics SDK.
    • Preflight API: Functions for testing connectivity to the Twilio Cloud. The API can identify signaling and media connectivity issues and provides a report at the end of the test.
    • RTC Diagnostics SDK: Functions to test a Participant's input and output devices, including microphones, speakers, and cameras, as well as functionality to confirm that a Participant meets the network bandwidth requirements to make a voice call or conduct a video call.

Networking considerations

Twilio Video uses WebRTC to provide real-time video and audio communication in Rooms. Review the list of ports and protocols that Twilio uses during video calls so that you can help end users connect to your application.

Additionally, you can learn more about locations of Twilio servers and global low latency. Connecting to Twilio infrastructure that's closest to your end users will help reduce round-trip time and latency on video calls.

For more information, see Networking considerations for Video applications.


Integrate with other Twilio products

You can integrate other Twilio services into your Video application. Consider adding the following services: