How to Use Twilio Speech Recognition
Twilio Speech Recognition is a powerful addition to voice applications powered by the TwiML <Gather> verb. Instead of just taking DTMF tones as input you can use the full expressiveness of spoken language in a variety of languages.
We’ll build a hotline that returns facts about cats, numbers, and Chuck Norris to have some fun with this feature and also show its usefulness in interactive voice response (IVR) applications. If you learn best from video or just want to see this in action, the full tutorial is also available on the Twilio YouTube channel.
The code for the application is available in this repo on GitHub.
Hello, How Can I Help You?
We’ll use Twilio Functions to build our application. If you’re new to Twilio Functions you can follow this video tutorial to learn how it works. Since Twilio Functions runs inside the Twilio Runtime there are no prerequisites for this project other than signing up for a free Twilio account and getting a voice-enabled Twilio phone number.
The call flow for our application works like this: when the user calls, they’ll be presented with a menu provided by the TwiML returned by /facts. The user will speak a command and the resulting words will be processed by /fact-command. This function will redirect to the appropriate function for the type of fact requested by the caller. Once at /cat-facts, /number-facts, or /chuck-facts, the caller can either receive more of that type of fact or head back to the main menu.
Let’s create the /facts function to get this started. Head to the Functions section of the Twilio Runtime and create a new Function using the Blank template. Set the Function name to “Twilio Facts” and the path to /facts. Then add the following code to the code section at the bottom:
The input attribute specifies that this <Gather> uses the speech input method only. If you also want to accept DTMF tones you can specify dtmf speech, but we’ll use speech only for now. The hints attribute provides a comma-separated set of hints that tell the speech recognition engine which words we’re expecting to receive as input. These hints improve the accuracy of the application. We’re done with /facts, so save this function before moving on.
Create another new Function using the Blank template. Name it “Fact Command” and give it a path of /fact-command. In the code section we’ll add some code that just tests the speech recognition to see if things are working so far:
The key line gets the lowercase value of SpeechResult out of the event parameter and stores it in command. We set the command to lowercase to make comparison easier in future steps.
Head to the setup page for your Twilio phone number and set the incoming call to use the “Twilio Facts” Function.
Call your phone number and say “cat”, “number”, or “Chuck Norris” at the prompt to make sure everything is working so far.
What Kind of Fact Do You Want?
With the basics of the app working, update /fact-command to redirect the call to the appropriate fact Function based on what the caller says. Replace the code in /fact-command with the following code:
The code redirects the call flow based on the input received. If the caller says something unexpected, we redirect them back to the main menu.
Implement the /cat-facts Function next. Cat facts will be retrieved from http://catfact.ninja/fact using the got library (did you know that blue-eyed, pure white cats are frequently deaf?). Create a Function called “Cat Facts” with the path /cat-facts and add the following code in the code section:
Our application first checks to see if the caller said “menu” after receiving a cat fact and if so the app redirects them back to the menu. Otherwise, a new cat fact is fetched and presented to the caller. The caller is redirected to the main menu if an error occurs in fetching a cat fact. This ensures that we don’t hang up on our caller if catfact.ninja happens to go down for some reason.
The number facts and Chuck Norris facts Functions are almost identical, so you can implement them with the following code. Here’s the /number-facts code:
Here’s the /chuck-facts code:
Once all of these functions are created and saved, call your Twilio Facts Hotline and test it out. You should be able to amaze your friends with random cat facts, some number trivia or at the very least have a Chuck Norris joke at the ready at any time.
What’s Next?
Now that you know how Twilio Speech Recognition works, consider trying the following:
- Try using the partialResultCallback attribute to access partial speech results in real time
- Build an ordering system that combines DTMF tones and speech
Whatever you build, let me know about it at @brentschooley.