Messaging Architecture for Independent Software Vendors (ISVs)
Time to read: 11 minutes
Independent Software Vendors (ISVs) are an especially important category of customers for whom building a trusted and scalable messaging architecture is paramount. If you aren’t sure whether you are an ISV, check out this blog post. If you are an ISV, you are serving hundreds to thousands of customers, and exponentially more end-users. There is a multitude of design patterns to consider when rolling out a messaging solution, and it’s easy to go down the wrong path. That’s where Twilio comes in: we help you avoid common pitfalls.
Twilio is a trusted provider in the telecommunications ecosystem. We help our customers navigate the complexities of the ever-changing communications landscape, and we’re here to empower you to build the best architecture for your messaging solutions. Designing the correct architecture is critical to ensure your infrastructure can scale, ultimately allowing you to support your customer base. From choosing the right sender type, to assessing design considerations, to error handling, and much more, we’ve got you covered.
This architecture-focused post will guide you through the steps to build out your messaging strategy as an ISV.
In this post we will discuss:
- A brief overview of message senders and sender selection strategy
- Messaging architecture guidance
- Reliability considerations
Prerequisites
We recommend that you read:
- ISVs: Set up for Success for foundational, ISV-specific information
- Using subaccounts to understand how to leverage subaccounts.
The foundation of building a robust messaging architecture starts with a scalable account structure in Twilio. As an ISV, you will want to separate usage and billing for your customers, among other things, and this is where subaccounts are a valuable resource. For the remainder of this post, we will stick to ISV customer implementations that leverage subaccounts. The diagram below provides a high-level example of an ISV and customer rollout using subaccounts.
Below is a high-level architecture for Owl Inc. Owl Inc. has two projects: one for development (Dev) work and the other for production (Prod) traffic. The Prod account will contain the subaccounts that Owl Inc. provisions for each of its customers.
Sender Selection and Strategy
Sender classification
There are two broad categories of senders: Alphanumeric and Numeric Senders.
While we won’t get into all the differences between the different sender types, a few key differences are:
Alphanumeric:
- Doesn’t support two-way messaging
- Requires you to provide an alternative path for end-users to opt out
Below is a helpful diagram showing the taxonomy of Sender IDs.
Sender Matrix
| | Short Codes | A2P 10DLC | Toll-Free SMS | Alphanumeric Sender ID (Non registered) | Alphanumeric Sender ID (Registered) | Long Code (Sending Internationally) |
| --- | --- | --- | --- | --- | --- | --- |
| MPS | Country dependent | 3-180 based on Trust Score | 3+ | 10+ | 10+ | 10 |
| Message Volume | Unlimited | T-Mobile limits 1000-200k** | Unlimited | Unlimited | Unlimited | Unlimited |
| Registration Interface | Application Form | API | API***** | Application | API | |
| Voice | No | Yes | Yes | No | No | Yes, if within same country |
| DLRs | Country dependent | Carrier | Handset (US/CA) | Country dependent | Country dependent | Country dependent |
| Provisioning Time | 6-10 weeks | Minutes | Instant**** | Minutes | Country specific | Instant |
| Recurring Fees* | Yes. Check country specific pricing | Registration + Campaign fees*** | Depends on throughput | No | Yes, country specific | No |
| Additional Carrier Fees | Yes | Yes | Yes | No | No | No |
* Fees outside of phone number cost.
** Depends on Starter vs. Standard. Standard brands can send over 200k with a Special Business Review approval.
*** Difference in cost depending on Starter vs. Standard Brands.
**** Verified Toll Free number takes 5-7 business days.
***** Verification process requires application form.
Sender Decision Flow Chart
The chart below highlights a few key questions to ask your customers when they are looking to send messages. Answering these questions will help you select the right sender for your workload.
Why each question is important to ask:
- Use Case: Countries have different regulations on which types of messages can be sent using which types of sender (alphanumeric or numeric). In certain countries, a Regulatory Bundle must be completed before provisioning a number. Broadly speaking, there are three types of messaging:
  - Conversational: A back-and-forth conversation that takes place via text, where an end-user initiates the conversation
  - Transactional: When a consumer gives their phone number to a business and asks to be contacted in the future
  - Promotional: A message that contains a sales or marketing promotion. Adding a call-to-action (e.g., a coupon code to an informational text) may place the message in the promotional category
- One-way vs. Two-way Messaging: Not all phone numbers are capable of both sending and receiving SMS messages. In countries where there are multiple sender types, there are usually trade-offs when it comes to choosing the right sender.
- Country:
  - When possible, always select a local sender, especially for two-way messaging. This is best for deliverability and the end-user experience.
  - Sending from a non-local sender could result in the sender ID being overwritten, which hurts the user experience, especially if the use case is to support two-way messaging.
  - In countries where alphanumeric pre-registration is required, registration also affects deliverability. It is best to use pre-registered alphanumeric senders, as carriers tend to favor this mechanism for sending.
  - In countries that do not require pre-registration, Twilio will overwrite the sender with an effective local sender, such as a shared short code (SC), where necessary to deliver the message to the end user. Note that the local sender that Twilio sends from may change, so keep only one-way workloads on this sender.
  - Keep in mind that in many countries you will need to provide documentation as part of local regulations. For example, German local numbers require proof of identity and a local address.
- High vs. Low Message Throughput: Understanding your customers' throughput requirements will help ensure you select the right sender. For example, for time-sensitive use cases like delivering One Time Passcodes (OTP), a short code is normally the best sender, whereas for a conversational use case a local 10-digit number may be preferred.
- Cost vs. Deliverability: Senders have trade-offs. Customers should prioritize deliverability; however, budget constraints or limited inventory may make it challenging to select the ideal sender. Please speak with your Account Team about these trade-offs.
- Phone provisioning urgency: Certain senders, such as pre-registered Alphanumeric Senders and short codes, take longer to provision than long codes. Ensure you are aware of lead times when working with your customers.
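As a rough illustration, a few of the questions above can be turned into a first-pass filter. This is only a sketch: the `SenderRequirements` fields, the sender labels, and the ordering rule are hypothetical names invented for this example, and the function encodes only the rules stated in this post (alphanumeric senders are one-way, short codes suit time-sensitive traffic, short codes and pre-registered alphanumeric senders take longer to provision). Country regulations, cost, and inventory still need a real discovery conversation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SenderRequirements:
    two_way: bool              # do end-users need to reply?
    time_sensitive: bool       # e.g. One Time Passcodes
    urgent_provisioning: bool  # must the sender be live quickly?


def candidate_senders(req: SenderRequirements) -> List[str]:
    """Return hypothetical sender labels, most suitable first."""
    candidates = ["short_code", "local_long_code", "alphanumeric_preregistered"]
    if req.two_way:
        # Alphanumeric senders do not support two-way messaging.
        candidates.remove("alphanumeric_preregistered")
    if req.urgent_provisioning:
        # Short codes and pre-registered alphanumeric senders have long lead times.
        candidates = [c for c in candidates if c == "local_long_code"]
    # Time-sensitive traffic favors a short code; otherwise favor a local number.
    preferred = "short_code" if req.time_sensitive else "local_long_code"
    # sorted() is stable, so the remaining candidates keep their relative order.
    return sorted(candidates, key=lambda c: c != preferred)


print(candidate_senders(SenderRequirements(True, False, False)))
```

For a conversational (two-way, not time-sensitive) workload this puts the local long code first and drops the alphanumeric option.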
Now that we understand the main questions to ask, let's walk through a quick example.
Owl Inc. is a fictional ISV that provides a customer engagement platform. Falcon Flights, a client of Owl Inc., wants to send out order status updates to their customers.
The diagram below shows the questions Owl Inc. should take into consideration as they engage in a discovery conversation with Falcon Flights.
United States 🇺🇸
US-based senders should reference our phone number guide.
Brazil 🇧🇷
Owl Inc. is expanding into other countries. They’d like to offer conversational use cases within Brazil. Let's walk through the decision tree for Brazil.
Internationally based senders can reference our international phone number guide.
Please read our Best Practices for Scaling with Messaging Services documentation for more details.
I can’t find a number…
In certain countries, Twilio isn’t able to provide numbers for a variety of reasons, but there are alternatives you can consider:
- WhatsApp: WhatsApp is the preferred contact method in many parts of the world, for example in Brazil and India. There are specific steps for Integrating WhatsApp with Independent Software Vendors (ISV) and System Integrators (SI).
- Email: For certain workloads, email is a fantastic complement or alternative. Here is a great blog on ISVs using Twilio SendGrid Email APIs.
Architecture Considerations
It is critical to understand a few key architectural considerations for messaging.
Messaging rate limits
Let’s understand the flow of messages that pass through Twilio. There are a few systems involved, and these systems affect the speed at which Twilio accepts and delivers messages.
- Twilio’s API edge is the first checkpoint, and it determines how fast Twilio can receive messages. It is defined as the number of concurrent requests that Twilio can receive. Each Twilio account / subaccount has its own request limits, or the number of simultaneous requests you can make to Twilio.
For example, let’s pretend that your Twilio account has a request concurrency limit of 50, and on average the Twilio API’s response latency is 250ms. We can calculate the total number of requests that can be made in one second as follows:
Number of requests per second = (API concurrency * 1000 milliseconds) / API latency in milliseconds
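Plugging the example numbers into the formula gives a quick sanity check. The function name below is just a label for this sketch, and the concurrency limit of 50 and latency of 250 ms are the illustrative values from the paragraph above, not actual account limits.

```python
def max_requests_per_second(concurrency: int, latency_ms: float) -> float:
    """Upper bound on requests per second for a given concurrency limit."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return concurrency * 1000 / latency_ms


# 50 concurrent requests, each taking about 250 ms to complete.
print(max_requests_per_second(50, 250))  # 200.0
```

In other words, with these example values the account can sustain roughly 200 API requests per second before hitting its concurrency limit.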
Twilio provides a Twilio-Request-Duration header on each API response, allowing you to evaluate the API’s processing time compared to your network latency. If you exceed your concurrency limit, you will receive HTTP 429 error messages. Some best practices we recommend are:
- Implement exponential backoff to retry messages that fail to send.
- Use a third-party message queue (such as RabbitMQ) with a Quality of Service (QoS) setting and a fixed number of consumers or threads that feed from the queue and send messages to Twilio. Since concurrency is applied per subaccount, ISVs should implement these queues at the subaccount level as well.
- Use a reverse proxy to throttle outbound HTTP requests from your application(s).
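The first two practices can be combined in a small client-side wrapper. The following is a minimal sketch, not a definitive implementation: `RateLimited` and `send_with_backoff` are hypothetical names, the `send` callable stands in for whatever code actually calls Twilio and is assumed to raise `RateLimited` on an HTTP 429, and the delay values are arbitrary. The semaphore plays the role of the fixed number of consumers, and would be created once per subaccount.

```python
import random
import threading
import time
from typing import Any, Callable


class RateLimited(Exception):
    """Raised by the caller-supplied send function on an HTTP 429 (assumption)."""


def send_with_backoff(
    send: Callable[[], Any],
    gate: threading.BoundedSemaphore,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send(), retrying on RateLimited with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            # Hold one concurrency slot only while the request is in flight.
            with gate:
                return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random time up to the exponential cap.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))


# One gate per subaccount, sized to that subaccount's concurrency limit.
subaccount_gate = threading.BoundedSemaphore(50)
```

Randomizing the delay keeps many workers that were throttled at the same moment from retrying in lockstep.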
- Twilio egress to the downstream provider is measured in messages per second (MPS) or message segments per second. To understand more about a message segment check out our blog post called What the Heck is a Segment?.
Note: ISVs that want to expose message length to their customers can embed a segment calculator.
As explained earlier, each sender type has a different rate limit. Sender rate limits define how fast Twilio sends messages to downstream carrier partners.
If you have selected a short code (SC) with 100 MPS, Twilio will send messages to downstream providers at 100 segments per second. If you are sending at a rate higher than your available MPS, messages will be queued.
- Queuing occurs when messages are sent at a higher rate than the sender’s available MPS. Messages are always queued First In First Out (FIFO). All Twilio queues are 4 hours long, so the queue can hold messages for up to 4 hours before they expire.
For example, if you have a short code sender with a default of 100 messages per second (MPS), you get a 4-hour queue that holds 100 * 4 * 60 * 60 = 1,440,000 message segments.
When messages are queued, they are checked against the validity period (an attribute on the Message Resource) and the max queue size. If the message validity period is shorter than the time the message would wait in the queue, meaning Twilio cannot dequeue the message before the validity period expires, the message is rejected with an HTTP 429 error. If no validity period is defined, the max queue size is used, which corresponds to a validity period of 4 hours. In this situation, messages that do not fit in the queue fail with the error ‘30001 - Queue overflow’.
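The queue arithmetic can be sketched as a local estimate that an ISV might run before submitting a large batch. This is plain arithmetic based on the 4-hour queue described above, not a Twilio API; the function names are hypothetical, and the number of segments already queued is something your own application would have to track.

```python
from typing import Optional

QUEUE_WINDOW_SECONDS = 4 * 60 * 60  # the 4-hour queue described above


def queue_capacity_segments(mps: int) -> int:
    """Segments a sender's queue can hold at a given MPS."""
    return mps * QUEUE_WINDOW_SECONDS


def estimated_wait_seconds(queued_segments: int, mps: int) -> float:
    """Time a new segment would wait behind the segments already queued."""
    return queued_segments / mps


def will_expire_in_queue(
    queued_segments: int, mps: int, validity_period_seconds: Optional[int] = None
) -> bool:
    """True if the estimated wait exceeds the validity period (or the 4-hour window)."""
    limit = QUEUE_WINDOW_SECONDS if validity_period_seconds is None else validity_period_seconds
    return estimated_wait_seconds(queued_segments, mps) > limit


print(queue_capacity_segments(100))  # 1440000
```

For instance, with 60,000 segments already queued on a 100 MPS short code, a new message waits about 600 seconds, so a validity period of 300 seconds would be too short.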
There are two strategies for sending messages to Twilio:
1. Sending with a sender specified via the From parameter in a POST request
2. Sending with a Messaging Service specified in a POST request
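To make the difference concrete, here is a sketch that builds (but does not send) the form-encoded body for each strategy. `To`, `From`, `Body`, and `MessagingServiceSid` are parameters of the Messages resource; the helper name, the placeholder SIDs, and the phone numbers are made up for illustration. For simplicity this sketch requires exactly one of the two sender options. Actually sending the request would also require HTTP Basic authentication with your Account SID and Auth Token, which is omitted here.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode

API_BASE = "https://api.twilio.com/2010-04-01"


def build_message_request(
    account_sid: str,
    to: str,
    body: str,
    from_: Optional[str] = None,
    messaging_service_sid: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (url, form-encoded body) for a POST to the Messages resource."""
    # Simplification for this sketch: exactly one sender option must be given.
    if (from_ is None) == (messaging_service_sid is None):
        raise ValueError("Provide exactly one of from_ or messaging_service_sid")
    params = {"To": to, "Body": body}
    if from_ is not None:
        params["From"] = from_  # strategy 1: explicit sender
    else:
        params["MessagingServiceSid"] = messaging_service_sid  # strategy 2
    url = f"{API_BASE}/Accounts/{account_sid}/Messages.json"
    return url, urlencode(params)


# Placeholder subaccount SID and numbers, for illustration only.
SUBACCOUNT_SID = "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
print(build_message_request(SUBACCOUNT_SID, "+15558675310", "Your order has shipped",
                            from_="+15005550006"))
print(build_message_request(SUBACCOUNT_SID, "+15558675310", "Your order has shipped",
                            messaging_service_sid="MGXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"))
```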
Messages sent with a defined sender type are queued directly to the sender queue and receive responses synchronously. If any errors arise, they will be either HTTP 429 or Twilio error ‘30001 - Queue overflow’. It is best to implement exponential backoff to retry failed messages.
If a messaging service is used, the messages are first sent to the messaging service’s queue. These messages are dequeued at a rate higher than the available MPS of senders, which means the messaging service queues will always be empty. When the downstream senders’ queues are full, messages fail asynchronously with the Twilio error ‘30001 - Queue overflow’.
Implementing your own queue
Handling messaging rate limits can be architecturally challenging, and if you’re an ISV using subaccounts to manage your customers, you will want to consider building scalable designs such as a multi-queue worker system that distributes resources in order to adhere to concurrency limitations.
Implementing this architecture allows ISVs to:
- Segment traffic resources
- Prioritize high priority messages vs. low priority messages
- Scale horizontally
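A minimal in-memory sketch of the idea follows. It keeps one priority queue per subaccount so that one customer's backlog cannot starve another, and so that high-priority traffic (such as OTPs) is drained before low-priority traffic. The class and method names are hypothetical, and a production system would more likely use a durable broker than Python's `heapq`; the point is only to show the shape of the design.

```python
import heapq
import itertools
from collections import defaultdict
from typing import Any, Dict, List, Tuple

HIGH, LOW = 0, 1  # lower number is dequeued first


class SubaccountQueues:
    """One priority queue per subaccount (hypothetical helper for illustration)."""

    def __init__(self) -> None:
        self._queues: Dict[str, List[Tuple[int, int, Any]]] = defaultdict(list)
        # Monotonic counter keeps FIFO order within the same priority.
        self._counter = itertools.count()

    def enqueue(self, subaccount_sid: str, message: Any, priority: int = LOW) -> None:
        entry = (priority, next(self._counter), message)
        heapq.heappush(self._queues[subaccount_sid], entry)

    def next_batch(self, subaccount_sid: str, concurrency_limit: int) -> List[Any]:
        """Pop up to concurrency_limit messages, highest priority first."""
        queue = self._queues[subaccount_sid]
        batch = []
        while queue and len(batch) < concurrency_limit:
            _, _, message = heapq.heappop(queue)
            batch.append(message)
        return batch


queues = SubaccountQueues()
queues.enqueue("AC_falcon", "promo-1")
queues.enqueue("AC_falcon", "otp-1", priority=HIGH)
print(queues.next_batch("AC_falcon", concurrency_limit=2))  # ['otp-1', 'promo-1']
```

Because each subaccount has its own queue, workers can be added per subaccount and sized to that subaccount's concurrency limit, which is what allows the design to scale horizontally.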
Reliability Considerations
Now that we have a good understanding of the right sender types and architecture, let's think about reliability and how to make our infrastructure resilient.
This section will walk through a few ways to handle failures for your messaging workloads.
Status Callbacks
Status callbacks are specified on each message sent or on the Messaging Service. As the message goes through Twilio and the messaging ecosystem, Twilio will send status updates to the specified status callback URL.
If you see statuses such as undelivered and failed, start investigating what might be leading to them.
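As a sketch of what handling a status callback might look like, the function below parses the form-encoded body of a callback request and flags the statuses worth investigating. `MessageSid`, `MessageStatus`, `AccountSid`, and `ErrorCode` are parameters Twilio includes in status callbacks; the function name, the returned dictionary shape, and the sample values are assumptions made for this example. It is framework-agnostic, so wiring it to a web server is left out.

```python
from typing import Any, Dict
from urllib.parse import parse_qs

NEEDS_ATTENTION = {"undelivered", "failed"}


def parse_status_callback(form_body: str) -> Dict[str, Any]:
    """Extract the fields an ISV typically stores from a status callback body."""
    # parse_qs returns lists of values; keep the first value of each field.
    fields = {key: values[0] for key, values in parse_qs(form_body).items()}
    status = fields.get("MessageStatus", "")
    return {
        "message_sid": fields.get("MessageSid"),
        "account_sid": fields.get("AccountSid"),  # identifies the subaccount
        "status": status,
        "error_code": fields.get("ErrorCode"),
        "needs_attention": status in NEEDS_ATTENTION,
    }


# Sample body with placeholder identifiers.
sample = "MessageSid=SM123&AccountSid=AC123&MessageStatus=undelivered&ErrorCode=30003"
print(parse_status_callback(sample))
```

Storing the account SID alongside the status makes it straightforward to roll failures up per customer.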
Debugger Event Webhook
The Debugger Event webhook is the first place to start when it comes to triaging errors and resolving them.
Any time there is an error or warning, a webhook is sent to the specified endpoint. ISVs should ensure that subaccounts are included. The payload sent for each event contains the following properties:
| PROPERTY | DESCRIPTION |
| --- | --- |
| Sid | Unique identifier of this Debugger event. |
| AccountSid | Unique identifier of the account that generated the Debugger event. |
| ParentAccountSid | Unique identifier of the parent account. This parameter only exists if the above account is a subaccount. |
| Timestamp | Time of occurrence of the Debugger event. |
| Level | Severity of the Debugger event. Possible values are Error and Warning. |
| PayloadType | application/json |
| Payload | JSON data specific to the Debugger Event. |
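The properties above map naturally onto a small handler. The sketch below takes the webhook parameters as an already-parsed dictionary (how your web framework produces that dictionary is out of scope) and uses only the property names from the table. The function names are hypothetical, and the contents of `Payload` in the sample are made-up placeholder keys, since the payload structure depends on the event.

```python
import json
from collections import Counter
from typing import Any, Dict, Iterable


def summarize_debugger_event(params: Dict[str, str]) -> Dict[str, Any]:
    """Normalize one Debugger webhook into a record that is easy to store."""
    payload = params.get("Payload")
    if params.get("PayloadType") == "application/json" and payload:
        details = json.loads(payload)
    else:
        details = {}
    return {
        "event_sid": params.get("Sid"),
        "account_sid": params.get("AccountSid"),
        # ParentAccountSid is only present when the event came from a subaccount.
        "is_subaccount": "ParentAccountSid" in params,
        "level": params.get("Level"),
        "timestamp": params.get("Timestamp"),
        "details": details,
    }


def errors_by_account(events: Iterable[Dict[str, str]]) -> Counter:
    """Count Error-level events per account to spot a struggling customer."""
    return Counter(e["AccountSid"] for e in events if e.get("Level") == "Error")
```

Grouping by `AccountSid` gives an ISV a quick per-customer view of which subaccounts are generating the most errors.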
Fallback URL
When a message is sent into Twilio’s infrastructure, we send a webhook to the provisioned endpoint via the incoming webhook. However, if your infrastructure is down, your application won’t get the POST request.
One strategy to mitigate this is to add a fallback URL at the Messaging Service or Phone number level.
Note: Make sure that your fallback URL is hosted in a different service than that of your primary URL. If both your primary and fallback URL are in the same service and that service is not responding, the fallback URL won’t help. In the case of network interruptions, consider webhook overrides to fine-tune webhook logic.
Twilio Functions + Sync
Twilio Functions is a serverless environment that allows builders to write application code and deploy without worrying about underlying infrastructure.
Twilio Sync is a serverless storage layer.
A fallback strategy consists of the following steps:
- Create a Twilio Function
- POST inbound messages to a Sync Resource (List, Map or Document)
- When your infrastructure recovers, poll the Sync Resource to collect information that was missed during the outage
While this strategy isn’t perfect, it does provide the benefit of persisting all the messages in a single source to update your records when your application comes back online. Additionally, within Twilio Functions you can write logic to respond back to the end-user.
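The recovery step can be sketched as an idempotent replay. This example assumes that the fallback Function stored each inbound webhook's parameters (including `MessageSid`) as the data of a Sync item, and that your application has already fetched those items into a list of dictionaries; fetching from Sync is out of scope here. The function name and the `handle` callback are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, Set


def replay_missed_messages(
    items: Iterable[Dict[str, Any]],
    processed_sids: Set[str],
    handle: Callable[[Dict[str, Any]], None],
) -> int:
    """Process stored inbound messages once each; return how many were replayed."""
    replayed = 0
    for item in items:
        sid = item.get("MessageSid")
        # Skip malformed items and anything the application already handled.
        if sid is None or sid in processed_sids:
            continue
        handle(item)
        processed_sids.add(sid)
        replayed += 1
    return replayed


# Sample data with placeholder identifiers.
stored = [
    {"MessageSid": "SM1", "From": "+15558675310", "Body": "STOP"},
    {"MessageSid": "SM2", "From": "+15558675311", "Body": "Where is my order?"},
    {"MessageSid": "SM1", "From": "+15558675310", "Body": "STOP"},
]
seen = {"SM2"}
print(replay_missed_messages(stored, seen, handle=print))
```

Keying on `MessageSid` makes the replay safe to run more than once, which matters if recovery itself is interrupted.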
Event Streams
Event Streams is an API that allows developers to aggregate Twilio events and send them to a specified destination. Destinations include Sink types (such as Amazon Kinesis) and a webhook.
Messaging Insights
Messaging Insights is a dashboard that aggregates account-level Messaging metrics. For ISVs, it’s a helpful view of parent account and subaccount data, making it easier to understand high-level trends and to monitor key metrics such as deliverability, opt-outs and errors.
Conclusion
That’s it! You’re now fully equipped with the information you need as an ISV to build and launch a supercharged messaging solution. As you implement the messaging recommendations, use this list to check off the major components related to SMS messaging.
We hope this tutorial was valuable for learning the ins and outs of messaging architecture. We can’t wait to see what you build.
Valerie is a Principal Solutions Engineer at Twilio, focused on enabling Platform ISVs to build innovative engagement solutions on Twilio. You can reach her at vlim[at]twilio.com.
Pathik Soni is a Principal Solutions Engineer helping Enterprise ISV partners reimagine their customer engagement experience using Twilio.
Josh Siverson is a Principal Solutions Engineer focused on helping ISV Partners build scalable architectures and business on Twilio. You can reach him at jsiverson [at] twilio.com.