Twilio Changelog | Feb. 12, 2025

<Gather> New Multi-Provider Speech Recognition Models - upcoming General Availability

TL;DR: New speech models and providers, plus “Twilio Picks” multi-provider resiliency for speech recognition, are now available in <Gather>!

<Gather>, the Twilio platform’s utterance-based Speech-to-Text (STT) capability, takes a significant step forward for voice application builders starting this month. The GA release adds support for the latest Speech-to-Text API capabilities from Google (V2 of their Speech APIs, including new and improved speech models) and from Deepgram (including their Nova-2 speech models). These are available in Twilio <Gather> TwiML calls, in the Gather Studio Widget, and in Twilio’s SDKs. Developers can select a specific provider and speech model (“Customer picks” mode) or, new with GA, specify a language and optionally a generic speech model and let Twilio pick the best provider for that model/language combination (“Twilio Picks” mode). “Twilio Picks” is the default for new and migrated customers. In this mode, should any cloud speech provider suffer an unexpected slowdown in responsiveness, the Twilio platform can also switch speech providers automatically on the fly, adding a layer of resiliency to business-critical speech-based, IVR, and Virtual Agent services.
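As a rough illustration of the two selection modes, the sketch below builds <Gather> TwiML with Python's standard library. It assumes that leaving out the `speechModel` attribute and supplying only a `language` corresponds to “Twilio Picks”, and that naming a provider-specific model corresponds to “Customer picks”. The model value `deepgram_nova-2` and the helper name `gather_twiml` are illustrative assumptions; check the <Gather> TwiML documentation for the values your account supports.

```python
import xml.etree.ElementTree as ET


def gather_twiml(prompt, language="en-US", speech_model=None):
    """Build a <Response><Gather> TwiML document and return it as a string."""
    response = ET.Element("Response")
    attrs = {"input": "speech", "language": language}
    if speech_model is not None:
        # Assumption: a provider-specific model name selects "Customer picks".
        attrs["speechModel"] = speech_model
    gather = ET.SubElement(response, "Gather", attrs)
    ET.SubElement(gather, "Say").text = prompt
    return ET.tostring(response, encoding="unicode")


# Language only: leave provider and model selection to Twilio (assumed).
twilio_picks = gather_twiml("How can we help you today?")

# Explicit model: the value below is an assumed example, not a verified one.
customer_picks = gather_twiml(
    "How can we help you today?", speech_model="deepgram_nova-2"
)

print(twilio_picks)
print(customer_picks)
```

The only difference between the two documents is the presence of the `speechModel` attribute, so switching modes for a given prompt is a one-attribute change.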

Developers can also choose speech models and providers for each individual question or prompt (utterance), as suits their application or use case: for example, picking one provider for conversation and another for numbers or addresses, or re-prompting with a different provider if the first one did not understand the response. This helps achieve optimal speech recognition accuracy and performance.
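One way to picture per-prompt selection with a re-prompt fallback is sketched below. Everything specific here is an assumption for illustration: the `MODELS_BY_PROMPT_KIND` table, the model names in it, the helper names, and the `/speech-result` webhook path are hypothetical, and a real application would decide when to re-prompt from the speech result its webhook receives.

```python
import xml.etree.ElementTree as ET

# Hypothetical preferences per prompt kind: the first entry is used on the
# first ask, later entries on re-prompts. Model names are assumed examples.
MODELS_BY_PROMPT_KIND = {
    "conversation": ["googlev2_telephony", "deepgram_nova-2"],
    "numbers": ["deepgram_nova-2", "googlev2_short"],
}


def model_for(kind, attempt):
    """Return the speech model for a prompt kind and 0-based attempt number."""
    candidates = MODELS_BY_PROMPT_KIND[kind]
    # Stay on the last candidate once the list is exhausted.
    return candidates[min(attempt, len(candidates) - 1)]


def prompt_twiml(kind, prompt, attempt=0, language="en-US"):
    """Build TwiML that asks `prompt` with the model chosen for this attempt."""
    response = ET.Element("Response")
    gather = ET.SubElement(response, "Gather", {
        "input": "speech",
        "language": language,
        "speechModel": model_for(kind, attempt),
        # Hypothetical webhook: if the answer was not understood, it would
        # reply with prompt_twiml(kind, prompt, attempt + 1).
        "action": f"/speech-result?kind={kind}&attempt={attempt}",
    })
    ET.SubElement(gather, "Say").text = prompt
    return ET.tostring(response, encoding="unicode")


first_ask = prompt_twiml("numbers", "Please say your account number.")
second_ask = prompt_twiml("numbers", "Sorry, please repeat that.", attempt=1)
```

Keeping the choice in a small lookup table means the provider used for a given kind of question can be changed without touching the call flow itself.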

How can we take advantage of these new <Gather> Multi-Provider Speech Recognition Models?

By default, new Twilio accounts will start using the new providers and the new BI (Billable Item; see the Voice Pricing page) for “Twilio Picks” mode starting 3/17/25.

Existing Beta customers who have already opted to use the latest V2 Speech APIs (from Google) via the Console checkbox will be opted into “Twilio Picks” mode upon the GA Gather rollout, starting 2/17/25. If Beta customers used TwiML (“Customer picks” mode) to select the new providers and speech models, their model/vendor selection (and BIs) will not change.

Existing Gather customers using TwiML or Studio who take no action will be automatically moved, in three waves, to the new Gather speech providers and the new BI (Billable Item; see the Voice Pricing page), from 4/1/25 through 6/1/25.

Customers who need longer than the migration window above to adjust their applications for the results delivered by the new, improved Gather can contact Twilio Support to temporarily “opt out” of the new Google speech functionality. The opt-out is available for a limited time, under increased pricing effective 6/1/25, and expires on 9/30/25.

Note: SDK customers who wish to take advantage of the new functionality may have to upgrade to the latest version of their SDK and its helper libraries before recompiling their app to use it. SDK customers who are not changing a production application, or who are not using the new Gather speech model/language variant functionality, do not have to do anything; they will be moved to the new speech provider APIs and the new BI during the 4/1/25-6/1/25 moves, along with all other Gather customers.

Customer Benefits and Pricing Changes

With these new speech recognition capabilities, providers, and support for their latest STT API versions, Twilio delivers industry-leading speech recognition accuracy and improved performance in noisy environments. Builders can choose from a wider array of speech models suited to their use cases, from longer answers to short utterances. Examples range from customer service automations like form-filling and survey responses to speaking naturally to LLM bots in IVRs and Virtual Agents, and more!

By delivering first-in-the-industry query-by-query multi-provider speech recognition, with multi-provider resiliency to reliably deliver the best speech recognition results, the Twilio Platform sets a new industry benchmark for value, especially when accompanied by new lower pricing for the default “Twilio Picks” mode. Customers who want fine-grained control can also, as a premium service, explicitly select their speech recognition provider on the Twilio platform and choose from the latest speech models and language variants those providers offer, as soon as providers make them available.

 

Voice