How to Handle Incoming Twilio WhatsApp Audio Messages in Go
Voice recordings let users communicate with your application without typing their messages, which is useful when you are building social media customer support systems, automated transcription services, or voice-driven data collection platforms.
In this tutorial, you'll learn how to handle incoming WhatsApp audio messages and transcribe them into text in a Go application using Twilio and AssemblyAI.
Prerequisites
To follow along with this tutorial, you’ll need the following:
- Go 1.22 or above
- A Twilio account (free or paid); if you don't have one yet, sign up for a new account
- An AssemblyAI account
- Ngrok installed on your computer, and an ngrok account
- Your preferred text editor or IDE
- Prior experience with developing in Go would be ideal but is not required
Create a new Go project
To get started, let’s create a new Go project. Open your terminal, navigate to the directory where you want to create the project, and run the commands below.
After running the commands above, open the project in your preferred code editor or IDE.
Install the required dependencies
Install the Twilio Go Helper Library
To simplify the application's interaction with the Twilio WhatsApp API, install the Twilio Go Helper Library using the command below.
Install the AssemblyAI Go SDK
Next, to transcribe incoming WhatsApp audio messages to text, the application will use AssemblyAI, a speech-to-text service. To simplify interacting with AssemblyAI, we'll use its Go SDK.
Run the command below to install it.
Store your credentials as environment variables
Now, we'll store the API credentials in a .env file so that they'll be accessible as environment variables. To do this, create a .env file in your project folder and add the following variables:
Next, install the godotenv package, which loads the environment variables into the application, using the command below.
Retrieve the required credentials
Retrieve your Twilio credentials
To retrieve your Twilio API credentials, log in to your Twilio Console dashboard. You will find your Twilio Account SID and Auth Token under the Account Info section, as shown in the screenshot below.
Retrieve your AssemblyAI API key
To obtain your AssemblyAI API key, log in to your AssemblyAI dashboard. You will find your API key as shown in the screenshot below.
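With the credentials stored as environment variables, it can help to check at startup that none of them are missing. The sketch below is a minimal, standard-library-only illustration of that idea. The variable names (TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, ASSEMBLYAI_API_KEY) and the requireEnv helper are assumptions made for this example, so substitute the names used in your own .env file. It also assumes the values are already present in the process environment.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requireEnv looks up each key with the supplied lookup function and returns
// the values, or an error naming every key that is missing or empty.
func requireEnv(lookup func(string) (string, bool), keys ...string) (map[string]string, error) {
	values := make(map[string]string, len(keys))
	var missing []string
	for _, key := range keys {
		value, ok := lookup(key)
		if !ok || value == "" {
			missing = append(missing, key)
			continue
		}
		values[key] = value
	}
	if len(missing) > 0 {
		return nil, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return values, nil
}

func main() {
	// Hypothetical variable names; use the ones defined in your .env file.
	creds, err := requireEnv(os.LookupEnv, "TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "ASSEMBLYAI_API_KEY")
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("loaded", len(creds), "credentials")
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise with fake values, without touching the real environment.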
Create the application logic
Now, let’s create the application's core logic for processing incoming WhatsApp audio messages. Specifically, we need to set up a webhook URL to handle these messages from Twilio. In the project's root directory, create a file named main.go and add the following code to it.
Here is the breakdown of the above code:
- The init() function loads the environment variables and sets up the global credential variables
- The downloadFile() function downloads the audio file from the Twilio endpoint
- The transcribeAudio() function sends the downloaded audio file to AssemblyAI for transcription
- The main() function sets up an HTTP server that handles incoming requests
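To make the download step more concrete, here is a minimal sketch of what a function like downloadFile() could look like using only the standard library. It is not necessarily the same as the code in main.go: the function signature, the newMediaRequest helper, the file name, and the placeholder credentials are all assumptions for illustration. It authenticates with HTTP Basic Auth, using the Account SID as the username and the Auth Token as the password, which Twilio expects when authentication is enforced on media URLs. A local test server stands in for Twilio so that the sketch runs offline.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// newMediaRequest builds a GET request for a media URL, authenticated with
// HTTP Basic Auth (Account SID as username, Auth Token as password).
func newMediaRequest(mediaURL, accountSID, authToken string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, mediaURL, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(accountSID, authToken)
	return req, nil
}

// downloadFile streams the media at mediaURL into destPath and returns the
// number of bytes written.
func downloadFile(client *http.Client, mediaURL, accountSID, authToken, destPath string) (int64, error) {
	req, err := newMediaRequest(mediaURL, accountSID, authToken)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status %s", resp.Status)
	}
	out, err := os.Create(destPath)
	if err != nil {
		return 0, err
	}
	defer out.Close()
	return io.Copy(out, resp.Body)
}

func main() {
	// A local stand-in for the Twilio media endpoint so the sketch runs offline.
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if user, pass, ok := r.BasicAuth(); !ok || user != "ACxxxx" || pass != "token" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.Write([]byte("fake-audio-bytes"))
	}))
	defer server.Close()

	// Hypothetical destination file and placeholder credentials.
	dest := filepath.Join(os.TempDir(), "voice-note.ogg")
	defer os.Remove(dest)
	n, err := downloadFile(server.Client(), server.URL, "ACxxxx", "token", dest)
	if err != nil {
		fmt.Println("download failed:", err)
		return
	}
	fmt.Println("downloaded", n, "bytes")
}
```

Checking the status code before writing the file avoids saving an error page as if it were audio.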
Connect the app to the Twilio WhatsApp Sandbox
Let’s configure the Twilio sandbox to accept incoming WhatsApp messages and forward them to our application. To do this, go to your Twilio Console dashboard, and navigate to Explore products > Messaging > Try it out > Send a WhatsApp message, as shown in the screenshot below.
On the Try WhatsApp page, copy your Twilio WhatsApp number and send the displayed join message to that number, as shown in the screenshot below.
Next, open the .env file and replace the placeholder <twilio–whatsapp-number> with your actual Twilio WhatsApp number.
Start the application
Let's start the application. You can do this by running the command below.
The application will run on localhost and listen on port 8080.
Set up a Twilio WhatsApp Webhook
When Twilio receives an incoming message, it forwards the message details to your application's webhook URL. For the webhook URL to work, you have to make the application accessible over the internet using ngrok. To do this, open another terminal and run the following command:
The command above will generate a Forwarding URL. Copy it, as shown in the terminal output below.
Now, on the Twilio Try WhatsApp page, click on the Sandbox Settings option and configure the sandbox settings as follows.
- When a message comes in: add the generated ngrok forwarding URL and append /webhook
- Method: POST
After setting the configuration, click the Save button, as shown in the screenshot below, to save your changes.
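With this configuration, Twilio sends a form-encoded POST request to /webhook for every incoming message. The sketch below shows, under some assumptions, how a handler might pick the audio attachment out of that request. NumMedia, MediaUrl0, MediaContentType0, and From are standard Twilio webhook parameters, while the firstAudioURL and webhookHandler names, the phone number, and the example URL are hypothetical. The request is simulated with httptest, so no network access or running server is needed.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strconv"
	"strings"
)

// firstAudioURL scans Twilio's webhook form fields (NumMedia, MediaUrlN,
// MediaContentTypeN) and returns the URL of the first audio attachment.
func firstAudioURL(form url.Values) (string, bool) {
	count, err := strconv.Atoi(form.Get("NumMedia"))
	if err != nil {
		return "", false
	}
	for i := 0; i < count; i++ {
		idx := strconv.Itoa(i)
		if strings.HasPrefix(form.Get("MediaContentType"+idx), "audio/") {
			return form.Get("MediaUrl" + idx), true
		}
	}
	return "", false
}

// webhookHandler accepts only POST requests and reports whether the message
// carried an audio attachment.
func webhookHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	if err := r.ParseForm(); err != nil {
		http.Error(w, "invalid form body", http.StatusBadRequest)
		return
	}
	mediaURL, ok := firstAudioURL(r.PostForm)
	if !ok {
		fmt.Fprintln(w, "no audio attachment")
		return
	}
	// A real handler would download and transcribe the audio at this point.
	fmt.Fprintln(w, "audio received from", r.PostForm.Get("From"), "at", mediaURL)
}

func main() {
	// Simulate the form-encoded POST that Twilio sends, without a network call.
	form := url.Values{
		"From":              {"whatsapp:+15550100"},
		"NumMedia":          {"1"},
		"MediaContentType0": {"audio/ogg"},
		"MediaUrl0":         {"https://example.com/media/ME123"},
	}
	req := httptest.NewRequest(http.MethodPost, "/webhook", strings.NewReader(form.Encode()))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	webhookHandler(rec, req)
	fmt.Print(rec.Body.String())
}
```

Checking the content type rather than assuming that any attachment is audio lets the handler ignore images or documents sent to the same number.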
Next, in the .env file, replace the placeholder <ngrok-forwarding-URL> with your actual value.
Test the application
To test the application, open your WhatsApp app and send a voice note to your Twilio number. You should receive a reply with your voice note transcribed to text, as shown in the screenshot below.
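One way a webhook can send such a reply is by returning TwiML in its HTTP response. The sketch below illustrates that approach only; the application in this tutorial may deliver the reply differently (for example, through the Twilio REST API), and the twimlResponse type and buildReply function are hypothetical names. Because transcription can take a while, a TwiML reply is only suitable when the result is ready before Twilio's webhook request times out.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// twimlResponse models the minimal TwiML document
// <Response><Message>...</Message></Response>.
type twimlResponse struct {
	XMLName xml.Name `xml:"Response"`
	Message string   `xml:"Message"`
}

// buildReply renders a TwiML reply carrying the given text; encoding/xml
// escapes characters such as < and & in the message body.
func buildReply(text string) (string, error) {
	body, err := xml.Marshal(twimlResponse{Message: text})
	if err != nil {
		return "", err
	}
	return xml.Header + string(body), nil
}

func main() {
	reply, err := buildReply("Transcription: hello from a voice note")
	if err != nil {
		fmt.Println("could not build reply:", err)
		return
	}
	fmt.Println(reply)
}
```

Building the document with encoding/xml instead of string concatenation means that a transcription containing characters like & or < cannot break the XML.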
That is how to handle incoming Twilio WhatsApp audio messages in Go
In this tutorial, you learned how to handle incoming WhatsApp audio messages in Go using the Twilio WhatsApp API and AssemblyAI. The application converts WhatsApp audio messages into readable text and responds to users with the transcriptions.
Popoola Temitope is a mobile developer and a technical writer who loves writing about frontend technologies. He can be reached on LinkedIn.