Fun with Markov Chains, Python, and Twilio SMS
One of the many allures of Twitter is that you can tweet at your favorite celebrity and (maybe) get a response. Still though, tweeting isn’t quite as intimate as trading text messages. So we thought it’d be fun to use Markov Chains, Programmable SMS, and Python to create a bot that impersonates your favorite Twitter personality.
We could use the code below to create an SMS chat bot that sounds like anyone with a Twitter account. But to show off its true potential, we need someone with a distinct and recognizable tweeting style. Someone with a huge personality. Someone who has the best words.
Someone like Donald Trump.
Ever wish that you could debate Trump? Drop him a text at: 847-55-TRUMP (847-558-7867).
There are three steps to create this bot:
- Download the tweets for a given user to create a corpus of text.
- Use the corpus to generate a sentence in the style of the Tweeter.
- Reply to a text message with that sentence.
To follow along, you’ll need Python, a Twitter account, and a free Twilio account.
Download All Tweets from a User
Before we get started, let’s give credit to Filip Hráček whose Automatic Donald Trump was the inspiration for this idea. Check out his post for an excellent explanation of how to implement Markov chains in Dart.
Markov chains begin with a corpus, a library of text used to train your model. We’ll use a modified version of this tweet_dumper script to pull down tweets from the Twitter API.
To get started, create and activate a new virtual environment (python -m venv is one way to do it).
Then go to wherever you keep your code, make a directory for your project, and create a file called get_tweets.py.
Install the tweepy package to connect to the Twitter API with pip install tweepy.
In get_tweets.py, import tweepy, csv, and re (Python’s regular expression module).
We’ll need Twitter credentials to access the API. Create a new app on the Twitter Application Manager. You can fill in any ol’ domain for the website and leave the callback URL blank. Once created, generate your credentials:
Add those credentials to the bottom of your file (you’ll want to extract those creds to environment variables if you deploy this script, but hard-coding works for now):
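Here’s a rough sketch of the environment-variable route mentioned above. The variable names and the TWITTER_* keys are our own assumptions, not anything Twitter or tweepy requires, so rename them to match the rest of your script. The placeholder defaults mean hard-coding still works for now.

```python
import os

# Assumed names: neither these variables nor the TWITTER_* keys are
# mandated by Twitter or tweepy. Replace the placeholder defaults with
# your real credentials, or export the environment variables instead.
consumer_key = os.environ.get("TWITTER_CONSUMER_KEY", "YOUR_CONSUMER_KEY")
consumer_secret = os.environ.get("TWITTER_CONSUMER_SECRET", "YOUR_CONSUMER_SECRET")
access_key = os.environ.get("TWITTER_ACCESS_KEY", "YOUR_ACCESS_KEY")
access_secret = os.environ.get("TWITTER_ACCESS_SECRET", "YOUR_ACCESS_SECRET")
```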
We’ll create a function that pulls down all tweets for a given screen name. We have to do this iteratively, as the Twitter API returns at most 200 tweets per request. Also, we can only retrieve the 3,200 most recent tweets. Once we hit that limit, our new_tweets array will come back empty and we’ll know to stop iterating.
Until then, we keep querying Twitter and adding to all_tweets. To make sure we’re getting the words straight from Trump’s fingertips, we’ll only keep the ones where tweet.source == 'Twitter for (delete that conditional if you create a bot for other tweeters).
To do all this, add this code to the bottom of get_tweets.py:
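A minimal sketch of that loop might look like the following. It assumes api is an authenticated tweepy.API object (the commented lines show one way to build it). The source argument is a hypothetical stand-in for the tweet.source check described above: pass the client name you want to keep, or None to keep everything.

```python
# Assumed setup (needs tweepy and your credentials):
#   auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
#   auth.set_access_token(access_key, access_secret)
#   api = tweepy.API(auth)


def get_all_tweets(api, screen_name, source=None):
    """Page backwards through a user's timeline, 200 tweets at a time."""
    all_tweets = []
    max_id = None  # no upper bound on the first request

    while True:
        kwargs = {"screen_name": screen_name, "count": 200}
        if max_id is not None:
            kwargs["max_id"] = max_id
        new_tweets = api.user_timeline(**kwargs)

        # An empty page means we've reached the end of what Twitter returns.
        if not new_tweets:
            break

        for tweet in new_tweets:
            # Keep everything, or only tweets sent from the given client.
            if source is None or tweet.source == source:
                all_tweets.append(tweet)

        # Ask for tweets strictly older than the oldest one we just saw.
        max_id = new_tweets[-1].id - 1

    return all_tweets
```

Passing api in as an argument, rather than reading a global, also makes the function easy to try out against a fake object before you spend any API calls.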
Add a function at the bottom of your file to strip out some miscellaneous text and make our replies feel more like text messages and less like tweets:
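Something along these lines would do the job. Treat the patterns as illustrative guesses at what counts as “miscellaneous text” (links, hashtags, @ symbols, retweets, encoded ampersands) and tune them to taste; clean_tweet is our own name for the helper.

```python
import re


def clean_tweet(text):
    """Strip tweet-specific clutter so the text reads like an SMS."""
    text = re.sub(r"https?://\S+", "", text)  # links
    text = re.sub(r"#\S+", "", text)          # hashtags
    text = re.sub(r"\.?@", "", text)          # "@" or ".@", keeping the name
    text = re.sub(r"^RT\b.*", "", text, flags=re.DOTALL)  # whole retweets
    text = text.replace("&amp;", "and")       # HTML-encoded ampersands
    text = re.sub(r"\s+", " ", text)          # newlines and extra spaces
    return text.strip()
```

Retweets collapse to an empty string, which makes them easy to skip when writing the corpus.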
Then add a function to:
- Take an array of raw tweets as a parameter
- Open a CSV file for writing
- Iterate through each tweet
- Write each non-blank clean tweet to the file
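Those four steps could be sketched like this. The function name, the tweets.csv default, and the clean parameter are all assumptions: clean stands in for the tweet-cleaning function described earlier, and defaults to plain whitespace stripping so the sketch runs on its own. Each tweet is expected to expose its text as tweet.text, as tweepy’s tweet objects do.

```python
import csv


def write_tweets_to_csv(tweets, path="tweets.csv", clean=str.strip):
    """Write one cleaned tweet per row and return how many rows were written."""
    written = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for tweet in tweets:
            text = clean(tweet.text)
            if text:  # skip tweets that are blank after cleaning
                writer.writerow([text])
                written += 1
    return written
```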
Finally, add code to the bottom of your file to retrieve the tweets and write them to a CSV:
Run your script with python get_tweets.py. About half of Trump’s 3,200 tweets make it past our filters, so you should end up with around 1,600 rows.
Generate Sentences With a Markov Chain
You may not realize it, but you see Markov chains every day: they power the auto-suggest feature on your phone’s keyboard. When it comes to sentence generation, Markov chains ask, “Based on the last word you typed and all the phrases you’ve typed in the past, what are you most likely to type next?” For an in-depth explanation of the mechanics of Markov chains, check out Filip’s post or Victor Powell’s excellent Markov Chains Explained Visually.
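To make the idea concrete, here’s a toy word-level chain using only the standard library. It’s a deliberately simplified sketch of the technique, not how any particular library implements it, and build_chain and generate are names we made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words that followed it in the corpus."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=12, rng=random):
    """Walk the chain from `start`, picking each next word at random."""
    words = [start]
    # Stop at the length cap or when the last word has no known successor.
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)
```

Because followers are stored with repeats, a word that followed “will” three times in the corpus is three times as likely to be picked as one that followed it once.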
Fortunately, we don’t need to do the Markov calculations by hand. Jeremy Singer-Vine’s markovify package abstracts the generation of text-based Markov chains, letting you generate sentences from a corpus in just two lines.
Install the markovify package with pip install markovify.
Create a file called app.py. Paste this code to import markovify, create a model based on the corpus, and print a short Trumpov chain:
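A sketch of what app.py might contain is below. It assumes the corpus lives in tweets.csv with one tweet per row, and it uses markovify’s NewlineText so each tweet is treated as its own sentence; load_model and trumpov_chain are our own helper names. Reading the rows back through the csv module removes any quoting the CSV writer added.

```python
import csv


def load_model(corpus_path="tweets.csv"):
    """Build a Markov model from a CSV file with one tweet per row."""
    import markovify  # third-party; install with: pip install markovify

    with open(corpus_path, newline="", encoding="utf-8") as f:
        tweets = [row[0] for row in csv.reader(f) if row]
    # NewlineText treats every line as a separate sentence.
    return markovify.NewlineText("\n".join(tweets))


def trumpov_chain(model, max_chars=140):
    """Return a short generated sentence, or None if markovify gives up."""
    return model.make_short_sentence(max_chars)


# Usage, once tweets.csv exists:
#   print(trumpov_chain(load_model()))
```

Note that make_short_sentence can return None when it fails to build a sentence within the limit, so a quick re-run is sometimes needed.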
Run python app.py and marvel at either the plausibility or absurdity of this computer-generated statement. Now we just need to hook that code up to a phone number.
Reply to an SMS with Python and Twilio
When someone texts our Twilio number, Twilio makes an HTTP request to our app. In return, it expects an HTTP response in the form of TwiML – a simple set of XML tags that tell Twilio what to do next.
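To see what that XML looks like, here’s a small sketch that builds a one-message reply with nothing but the standard library. twiml_reply is a hypothetical helper for illustration only; in the actual app, the Twilio helper library generates this for us.

```python
import xml.etree.ElementTree as ET


def twiml_reply(body):
    """Return a TwiML document that tells Twilio to reply with `body`."""
    response = ET.Element("Response")
    ET.SubElement(response, "Message").text = body
    xml = ET.tostring(response, encoding="unicode")  # escapes &, <, > for us
    return '<?xml version="1.0" encoding="UTF-8"?>' + xml
```

Calling twiml_reply("Hello") produces a Response element wrapping a single Message element, which is all Twilio needs to send a text back.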
We’ll use Flask, the Python microframework, to handle that POST request. We’ll use the Twilio helper library to generate a TwiML response that sends a reply message.
Install those two packages with pip install flask twilio.
Delete everything in app.py and replace it with:
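Here’s one way the new app.py could be laid out. It’s a sketch under a few assumptions: create_app is our own factory function, model is any object with markovify’s make_short_sentence method, and the MessagingResponse import path is the one used by recent versions of the Twilio helper library. The fallback text is there because make_short_sentence can return None.

```python
def create_app(model):
    """Build a Flask app that answers each incoming SMS with a generated sentence."""
    # Third-party packages; install with: pip install flask twilio
    from flask import Flask
    from twilio.twiml.messaging_response import MessagingResponse

    app = Flask(__name__)

    @app.route("/message", methods=["POST"])
    def message():
        # Fall back to a fixed reply if no sentence could be generated.
        sentence = model.make_short_sentence(140) or "Text me again."
        resp = MessagingResponse()
        resp.message(sentence)
        return str(resp)  # TwiML as a string

    return app


# Usage, with a markovify model built from your corpus:
#   app = create_app(model)
#   app.run(debug=True)
```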
Start your app with python app.py.
Assuming you’re working on your local machine, you’ll need a publicly accessible URL that forwards to localhost so that Twilio can reach your script. The fastest way to do this is with ngrok.
In a separate terminal window, start ngrok (for example, ngrok http 5000 if your app is on Flask’s default port) and copy the URL it gives you to your clipboard.
Sign up for a free Twilio account if you don’t have one. Buy a phone number, then set up your number. Scroll down to the Message section and find the A Message Comes In field. Paste your ngrok URL and append the /message endpoint. Then save your configuration and text your burning question to your great Twilio phone number.
What’s Next?
Nice work! With just a few lines of Python, you:
- Mined Twitter data using the Twitter API
- Created Markov Chains in Python
- Replied to an SMS in Python using Twilio
Armed with those skills, you’ll probably come up with a creation far more useful than a bot that pretends to be Donald Trump. To aid you in that endeavor, here are some resources that may be helpful:
- Matt Makai’s How to Build an SMS Slack bot
- The Twilio and Python Quickstarts
- The Twilio Tutorials (which feature clone-able, production ready apps)
If this post inspires you to build something cool, or if you have any questions, I’d love to hear about it. Drop me a line at gb@twilio.com or find me on Twitter at @greggyb.
PS – Please vote. You can register via HelloVote by texting HELLO to 384-387.
Many thanks to Ricky and Matt for the reviews.