How to Add SMS Capabilities to Your AI Agents in Minutes

January 13, 2025

Abraham Maslow famously said, "It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail." This timeless wisdom applies just as well to AI agents today. If our AI systems can only communicate via chat, we risk limiting their potential. But what if we equipped them with more tools—like mobile messaging—to make their communication more versatile and impactful?

In this guide, we’ll explore how you can enhance your AI agents and add messaging (SMS, WhatsApp, or RCS) capabilities in just a few steps. Whether you’re building customer support agents, appointment schedulers, or travel booking agents, enabling them to send text messages opens up new possibilities for user engagement and satisfaction.

We’ll walk you through the basics of what AI agents are, how they differ from large language models (LLMs), and how to integrate Twilio Programmable Messaging into your AI projects. Let’s get started!

What is an AI Agent, and How Does It Differ from an LLM?

At its core, an AI agent is a program that performs tasks autonomously, often guided by a goal or set of instructions. While an LLM (large language model), like OpenAI’s GPT, is designed to generate text based on a prompt, an agent is designed to act. Agents make decisions, take actions, and manage workflows, often leveraging LLMs as one of their underlying tools. On top of that, agents use input and output guardrails to ensure they only act on the problems they are intended to solve.

Think of it this way: an LLM is like a well-read librarian, providing information and context when asked, whereas an AI agent is more like a project manager: it takes the librarian’s advice, interprets it, and then executes tasks like sending messages, scheduling appointments, or answering customer calls.

If you’d like to dive deeper into the differences, these two articles do a fantastic job of explaining the concepts:

By understanding this distinction, you can start building AI agents that don’t just respond to text prompts—they proactively solve real-world problems.

Enhancing AI Agents with Communication Tools

Imagine this: you’re on a service hotline, explaining to the AI agent you want to cancel a shipment. The interaction is seamless, but what happens next? Instead of leaving you wondering if the cancellation went through, the agent sends you a confirmation message on your phone instantly. It’s quick, reliable, and gives you peace of mind.

Or consider this scenario: you schedule a doctor’s appointment with an AI receptionist. A few days before the appointment, you receive a reminder via SMS, or better, a Rich Communication Services (RCS) message, complete with the appointment details and a convenient option to reschedule if needed. No missed appointments, no confusion, just efficient communication.

These examples highlight why AI agents need to go beyond pure text-based chat systems. By integrating communication channels like SMS, WhatsApp, or RCS, AI agents can create a more cohesive and reliable user experience. In the next section, we’ll dive into how to integrate these capabilities into your AI agent.

Adding SMS Capabilities to Your AI Agent: The Concept and Code

To enable SMS capabilities in your AI agent, the approach involves defining the tool—in this case, sending an SMS using Twilio Programmable Messaging—and making it available for your agent to use. Once the tool is defined, you provide the agent with the necessary prompt to trigger the tasks with the given tool. Let’s walk through the concept and provide some sample code using JavaScript with the OpenAI SDK and LangChain, respectively.

Understanding the Process

At its core, the integration involves:

  • Tool Definition: Setting up a function that performs the action (e.g., sending an SMS/WhatsApp/RCS via Twilio). This function becomes a “tool” your agent can call.
  • Agent Execution: Providing the AI model with a prompt that explains the tool and its purpose, so the model knows how to use it effectively.
  • CLI: Using the @inquirer/prompts package, you can build a CLI to interact with your naive agent.

Prerequisites

  • Dependencies: Install the packages used by both code samples:
npm install @inquirer/prompts twilio openai @langchain/openai @langchain/core zod dotenv
  • Environment Variables: Create a .env file in your project directory with the following content:
OPENAI_API_KEY=your_openai_api_key
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_MESSAGING_SERVICE=your_twilio_messaging_service_sid

Adding SMS Capabilities with the OpenAI SDK

The sample code below, openai-agent-cli.mjs, shows how to define the messaging tool and call it from a CLI when the program runs.

import { input } from '@inquirer/prompts';
import OpenAI from 'openai';
import twilio from 'twilio';
import dotenv from 'dotenv';
dotenv.config();
const { OPENAI_API_KEY, TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, TWILIO_MESSAGING_SERVICE } = process.env;
if (!OPENAI_API_KEY || !TWILIO_ACCOUNT_SID || !TWILIO_AUTH_TOKEN || !TWILIO_MESSAGING_SERVICE) {
   console.error("Please provide valid OpenAI API key, Twilio Account SID, Twilio Auth Token, and Twilio Messaging Service.");
   process.exit(1);
}
const client = twilio(TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN);
const openai = new OpenAI({ apiKey: OPENAI_API_KEY }); // the constructor expects an options object
let messages = [
   { role: "system", content: "You are a helpful receptionist at a dentist office." },
];
const tools = [
   {
       "type": "function",
       "function": {
           "name": "sendConfirmation",
           "description": "Sends a confirmation via SMS, WhatsApp or RCS message to a given phone number.",
           "parameters": {
               type: "object",
               properties: {
                   message: {
                       type: "string",
                       description: "The content of the message. Keep the message short and sweet!"
                   },
                   to: {
                       type: "string",
                       description: "The recipient's phone number in valid E.164 format."
                   },
                   channel: {
                       type: "string",
                       enum: ["sms", "whatsapp", "rcs"],
                       description: "The channel to send the message on."
                   }
               },
               required: ["message", "to", "channel"]
           }
       }
   }
];
async function sendAnswerToOpenAI(answer) {
   messages = messages.concat({ role: "user", content: answer });
   const completion = await openai.chat.completions.create({
       model: "gpt-4o-mini",
       tools,
       messages
   });
   const isFunctionCall = completion.choices[0].message.tool_calls && completion.choices[0].message.tool_calls.length > 0;
   const hasContent = completion.choices[0].message.content;
   if (isFunctionCall) {
       const toolCall = completion.choices[0].message.tool_calls[0];
       const name = toolCall.function.name;
       const args = JSON.parse(toolCall.function.arguments); // no error handling at this time
       if (name === "sendConfirmation") { // only one tool in this example
           const numberPrefix = args.channel === "whatsapp" ? "whatsapp:" :
               args.channel === "rcs" ? "rcs:" : "";
           try {
               await client.messages.create({
                   body: args.message,
                   from: TWILIO_MESSAGING_SERVICE,
                   to: numberPrefix + args.to,
               });
               console.info("Confirmation message sent successfully!");
           } catch (error) {
               console.error("Error sending confirmation message:", error);
               messages = messages.concat({ role: "assistant", content: error.message });
               return error.message;
           }
           const confirmation = `I just sent you a confirmation message to ${args.to}! Is there anything else I can help you with?`;
           messages = messages.concat({ role: "assistant", content: confirmation });
           return confirmation;
       }
   }
   if (hasContent) {
       const output = completion.choices[0].message.content;
       messages = messages.concat({ role: "assistant", content: output });
       return output;
   }
}
console.log("***************************");
console.log("Welcome to the Dentist Office");
console.log("***************************");
let user_reply;
let ai_reply = await sendAnswerToOpenAI("Hi, can you help me schedule a checkup appointment with the dentist?");
while (true) {
   user_reply = await input({ message: ai_reply });
   ai_reply = await sendAnswerToOpenAI(user_reply);
}

This code snippet demonstrates how to integrate Twilio’s messaging services with an AI agent using OpenAI’s SDK. First, the code sets up the necessary environment variables and dependencies. Next, it defines a “sendConfirmation” tool, which allows the AI agent to send SMS, WhatsApp, or RCS messages through Twilio. The tool is introduced to the AI model via the tools array passed to openai.chat.completions.create, which lets the model request messaging tasks on its own.

In the interaction loop, the AI agent processes user inputs and determines if it needs to call the “sendConfirmation” function. If a tool call is detected, the agent sends the message through Twilio's API, providing real-time feedback to the user. This approach illustrates the process of extending the AI agent's capabilities with additional communication channels.
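The dispatch step described above can be sketched in isolation. This is a minimal, hedged example rather than part of the original script: handleToolCall and the handlers map are hypothetical names, and the tool call is assumed to have the shape the sample code reads from the Chat Completions response (function.name plus a JSON string in function.arguments). It also adds the JSON error handling that the sample skips.

```javascript
// Hypothetical helper: routes one tool call to a matching handler.
// Assumes the shape { function: { name, arguments } }, where `arguments`
// is a JSON string generated by the model.
async function handleToolCall(toolCall, handlers) {
    const { name, arguments: rawArgs } = toolCall.function;
    // Object.hasOwn avoids matching inherited keys such as "toString"
    if (!Object.hasOwn(handlers, name)) {
        return { ok: false, error: `Unknown tool: ${name}` };
    }
    let args;
    try {
        args = JSON.parse(rawArgs); // the model may produce malformed JSON
    } catch {
        return { ok: false, error: `Invalid JSON arguments for ${name}` };
    }
    try {
        return { ok: true, result: await handlers[name](args) };
    } catch (error) {
        return { ok: false, error: error.message };
    }
}
```

With a map such as { sendConfirmation: async (args) => { /* send via Twilio */ } }, the loop only has to check the ok flag and decide what to tell the user, and adding a second tool becomes a matter of registering another handler.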

You can run the code with node openai-agent-cli.mjs. At this point, the naive agent has no guardrails and might drift into unrelated topics. As shown in this video, it’s still straightforward to schedule an appointment.

The code responsible for sending the SMS, WhatsApp, or RCS message starts at the if (name === "sendConfirmation") branch, and this is where the critical work of validation, sanitization, and security checks should take place. For simplicity, these checks are omitted in this post. For production usage, this step is vital! Executing parameters generated by an LLM without proper scrutiny can pose significant security risks, such as injection attacks or misuse of resources. By thoroughly vetting the inputs (ensuring they adhere to expected formats and rejecting suspicious data), you can mitigate potential vulnerabilities.
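As a rough illustration of such checks, here is a minimal sketch of a validator for the arguments the model generates. The function name validateConfirmationArgs, the message length cap, and the error wording are assumptions made for this example; they are not part of the Twilio or OpenAI SDKs, and a real deployment would likely need stricter rules.

```javascript
// Hypothetical validator for LLM-generated sendConfirmation arguments.
// All limits below are illustrative assumptions, not Twilio requirements.
const ALLOWED_CHANNELS = ["sms", "whatsapp", "rcs"];
const E164_PATTERN = /^\+[1-9]\d{1,14}$/; // "+", no leading zero, at most 15 digits
const MAX_MESSAGE_LENGTH = 320; // assumed cap to keep messages short

function validateConfirmationArgs(args) {
    const errors = [];
    if (typeof args?.message !== "string" || args.message.trim() === "") {
        errors.push("message must be a non-empty string");
    } else if (args.message.length > MAX_MESSAGE_LENGTH) {
        errors.push(`message must be at most ${MAX_MESSAGE_LENGTH} characters`);
    }
    if (typeof args?.to !== "string" || !E164_PATTERN.test(args.to)) {
        errors.push("to must be a phone number in E.164 format");
    }
    if (!ALLOWED_CHANNELS.includes(args?.channel)) {
        errors.push("channel must be one of: " + ALLOWED_CHANNELS.join(", "));
    }
    return { valid: errors.length === 0, errors };
}
```

Calling this before client.messages.create and refusing to send when valid is false keeps malformed input away from the API. A format check alone does not prove that the number belongs to the person in the conversation, so comparing it against the customer record is a sensible additional step.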

For even greater safety, you might consider offloading this execution to a separate API call rather than running it in the same Node.js thread as your main application logic. This separation ensures that the execution process operates in a controlled, isolated environment, reducing the risk of any unintended actions. Such an approach is particularly useful in production systems where robust error handling, logging, and monitoring are essential.
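One way to picture this separation is a thin client that hands the validated arguments to a separate messaging service over HTTP. The sketch below is hypothetical: requestConfirmation, the endpoint, the payload shape, and the JSON response are all assumptions for illustration, and it relies on the global fetch and AbortSignal.timeout available in recent Node.js versions.

```javascript
// Hypothetical client for an internal messaging service that owns the
// Twilio credentials. Endpoint, payload, and response shape are assumed.
async function requestConfirmation(args, { endpoint, fetchImpl = fetch, timeoutMs = 5000 } = {}) {
    const response = await fetchImpl(endpoint, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        // Forward only the expected fields, never the raw model output
        body: JSON.stringify({ message: args.message, to: args.to, channel: args.channel }),
        signal: AbortSignal.timeout(timeoutMs), // give up if the service hangs
    });
    if (!response.ok) {
        throw new Error(`Messaging service responded with status ${response.status}`);
    }
    return response.json();
}
```

In this arrangement the agent process never touches the Twilio credentials, and the messaging service can apply its own rate limits, logging, and monitoring. The fetchImpl option exists only so the function can be exercised with a stub in tests.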

Taking these precautions not only secures your application but also creates a scalable and resilient architecture for handling AI-driven actions. In the next section, we’ll see how to achieve the same result with LangChain, an LLM-agnostic framework for AI agents.

Adding SMS Capabilities with LangChain

In LangChain, the process of integrating tools like sending an SMS via Twilio follows a similar pattern, but with some unique abstractions that simplify working with different AI models and tasks. The code langchain-agent-cli.mjs shows how you can define and execute a tool to send messages with LangChain.

import { input } from '@inquirer/prompts';
import { ChatOpenAI } from "@langchain/openai";
import { DynamicStructuredTool } from "@langchain/core/tools";
import { AIMessage, HumanMessage, SystemMessage } from "@langchain/core/messages";
import { z } from "zod";
import twilio from 'twilio';
import dotenv from 'dotenv';
dotenv.config();
const { OPENAI_API_KEY, TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, TWILIO_MESSAGING_SERVICE } = process.env;
if (!OPENAI_API_KEY || !TWILIO_ACCOUNT_SID || !TWILIO_AUTH_TOKEN || !TWILIO_MESSAGING_SERVICE) {
   console.error("Please provide valid OpenAI API key, Twilio Account SID, Twilio Auth Token, and Twilio Messaging Service.");
   process.exit(1);
}
const client = twilio(TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN);
const llm = new ChatOpenAI({
   model: "gpt-4o-mini",
   temperature: 0,
   apiKey: OPENAI_API_KEY,
});
const confirmationSchema = z.object({
   message: z.string().describe("Message to the patient confirming the appointment. Keep the message short and sweet!"),
   to: z.string().describe("The recipient's phone number in valid E.164 format."),
   channel: z.enum(["sms", "whatsapp", "rcs"]).describe("The channel to send the message on."),
});
const sendConfirmationTool = new DynamicStructuredTool({
   name: "sendConfirmation",
   description: "Sends a confirmation via SMS, WhatsApp or RCS message to a given phone number.",
   schema: confirmationSchema,
   // Runs only after the arguments have been validated against confirmationSchema
   func: async ({ message, to, channel }) => {
       const numberPrefix = channel === "whatsapp" ? "whatsapp:" :
           channel === "rcs" ? "rcs:" : "";
       await client.messages.create({
           body: message,
           from: TWILIO_MESSAGING_SERVICE,
           to: numberPrefix + to,
       });
       return "Confirmation message sent successfully!";
   },
});
const llmWithTools = llm.bindTools([sendConfirmationTool]);
console.log("***************************");
console.log("Welcome to the Dentist Office");
console.log("***************************");
const messages = [
   new SystemMessage("You are a helpful receptionist at a dentist office."),
   new HumanMessage("Hi, how are you?"),
];
while (true) {
   const ai_reply = await llmWithTools.invoke(messages);
   let reply = ai_reply.content;
   const isToolCall = ai_reply.tool_calls && ai_reply.tool_calls.length > 0;
   if (isToolCall) {
       const call = ai_reply.tool_calls[0];
       if (call.name === "sendConfirmation") { // only one tool in this example
           try {
               // invoke() validates call.args against the zod schema, then runs func
               console.info(await sendConfirmationTool.invoke(call.args));
               reply = "I just sent you a confirmation message! Is there anything else I can help you with?";
           } catch (error) {
               console.error("Error sending confirmation message:", error);
               reply = "Sorry, I couldn't send the confirmation message. Could you double-check your phone number?";
           }
       }
   }
   const user_reply = await input({ message: reply });
   // Record the assistant turn before the user's answer so the history stays in order
   messages.push(new AIMessage(reply));
   messages.push(new HumanMessage(user_reply));
}

From the user’s point of view, there is not much of a difference between the two pieces of code. Both use OpenAI’s gpt-4o-mini model with slightly different prompts. The main difference between this snippet and the previous one is in the implementation. The second one makes use of LangChain’s abstractions, which simplify the process of defining and using tools. Instead of manually defining a tools array and managing tool calls, LangChain provides structured components like DynamicStructuredTool and ChatOpenAI. This makes code maintenance easier, as the code is less dependent on OpenAI-specific API calls and can switch to different models more easily.

Additionally, LangChain integrates schema validation via zod, so the tool’s parameters can be validated before execution. This reduces the likelihood of errors and makes the implementation more robust, because you can check at runtime whether the LLM provided the right parameters before making a third-party API call. LangChain also binds tools to the AI model directly, streamlining the process of invoking tools during interactions.

You can run the code with node langchain-agent-cli.mjs. You will notice that the UX is almost the same as with the previous code snippet.

 

Closing Thoughts

With the integration of SMS, WhatsApp, and RCS, your AI agent now has a diverse set of tools to enhance its communication capabilities. The real power lies in the agent’s ability to utilize these tools seamlessly to engage with users in a more dynamic, personalized way. Whether it’s sending a reminder, confirming a shipment, or handling a customer inquiry, your agent is now equipped to handle a variety of tasks with ease.

As your AI agent evolves, integrating with systems like Segment can further empower it by giving it access to detailed customer profiles. By doing so, the agent can bypass repetitive prompts, such as asking for contact preferences or their phone number, and focus on providing a more personalized experience right from the start. This allows for smoother, faster interactions, creating a more engaging experience for users.

The more tools your AI agent has at its disposal, the more versatile and efficient it becomes. From basic tasks like sending SMS notifications to complex multi-channel communication, each new capability adds another layer of sophistication to your agent. The possibilities are endless as you continue to build and expand its functionality.

Did you know there’s a simpler way to build AI agents with Twilio that requires little to no coding? Twilio’s AI Assistants make it easy to create powerful, multi-channel AI agents in minutes, and the necessary guardrails are already built in. Learn more by visiting Twilio AI Assistants.

We can’t wait to see what you build!