Build a Movie recommendation app with Wikidata and Twilio Messaging
In this tutorial, you'll learn how to build a text-based movie recommendation application that provides movie suggestions via SMS. Additionally, you'll be able to retrieve detailed information about any recommended movies.
To build this application, you will use the Twilio Programmable Messaging API to receive and send SMS messages. You will learn the basics of SPARQL and apply this knowledge to interact with the Wikidata Query Service. This service will be used to fetch movies stored in Wikidata based on the search criteria provided in the received SMS message. Additionally, you'll use the Wikipedia API to gather details about the movies retrieved.
SPARQL (SPARQL Protocol and RDF Query Language) is a query language and protocol used to retrieve and manipulate data stored in Resource Description Framework (RDF) format. It's commonly used to query linked data sources, including semantic web data.
Wikidata is a free and open knowledge base maintained by the Wikimedia Foundation. It serves as a central storage repository for structured data and provides support for Wikipedia and other Wikimedia projects.
The Wikidata Query Service (WDQS) is an online service provided by the Wikimedia Foundation that allows users to run SPARQL queries against the Wikidata knowledge base. It enables users to retrieve specific information or perform complex data analyses using the structured data stored in Wikidata.
The Wikipedia API is an interface provided by Wikimedia that allows developers to programmatically access and retrieve information from Wikipedia articles. It provides methods for searching articles, fetching article content, obtaining metadata, and more, enabling integration with external applications and services.
Tutorial Requirements
To follow this tutorial you will need the following:
- A Twilio account - Sign up here
- An ngrok account and the ngrok CLI
- A basic understanding of JavaScript
- A basic understanding of how to use the Twilio SMS API to build an app
- Node.js v18+ installation
- Git installation
Getting the boilerplate code
In this section, you will clone a repository containing the boilerplate code needed to build the SMS-based movie recommendation system.
Open a terminal window and navigate to a suitable location for your project. Run the following commands to clone the repository containing the boilerplate code and navigate to the boilerplate directory:
This boilerplate code includes an Express.js project that serves the application and a file named genres.json. The genres.json file contains a list of movie genres extracted from Wikidata; each item in the list has a Wikidata ID and a name.
This Node.js application comes with the following packages:
- express: a fast, unopinionated, minimalist web framework for Node.js, providing a robust set of features to develop web and mobile applications.
- body-parser: a Node.js middleware that parses incoming request bodies before your handlers run, making the data sent in the request body easier to handle.
- dotenv: a Node.js package that loads environment variables from a .env file into process.env.
- node-dev: a development tool for Node.js applications that automatically restarts the server when changes are detected in the source code.
- twilio: a package that allows you to interact with the Twilio API.
- node-fetch: a lightweight module that brings the Fetch API to Node.js, enabling simplified HTTP requests and responses in server-side applications.
Use the following command to install the packages mentioned above:
Collecting the credentials
In this section, you are going to retrieve the Twilio credentials that will allow you to interact with the Twilio API. Additionally, you will buy a new Twilio phone number with SMS capabilities if you don't have one.
The .env file included in the cloned repository is where you will store all credentials.
Open a new browser tab and log in to your Twilio Console. Once you are in the console, copy the Account SID and Auth Token and store these credentials in the .env file found in the root directory of the boilerplate code:
Navigate to the Buy a Number page in your Twilio Console, and purchase a number with SMS capabilities if you don't own one.
Detecting incoming messages
In this section, you will write the code that will allow you to detect and respond to SMS messages sent to your Twilio phone number.
Go to the server.js file, and add the following code to the import section:
This code block imports the `twilio` package, which provides functionality for interacting with the Twilio API.
Go to the server.js file, and add the following code below the line where `body-parser` was set (around line 11):
This code creates a new instance of the Twilio client using the `twilio` package, with the Twilio Account SID and Auth Token obtained from the environment variables, and then stores the client instance in a constant named `client`.
Add the following code below the `client` constant:
Here, the code defines a function named `sendMessage()` which takes three parameters: `from` (sender's phone number), `to` (recipient's phone number), and `message` (the content of the message to be sent).
Within this function, the `messages.create()` method provided by the Twilio client is called to send a message. The `create()` method is provided with an object containing the message details, including the message `body`, sender, and recipient.
When the message is successfully sent, the Twilio message SID is logged to the console.
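As a rough sketch of the steps described above (not necessarily the tutorial's exact code), the client setup and `sendMessage()` might look like the following. The environment variable names `TWILIO_ACCOUNT_SID` and `TWILIO_AUTH_TOKEN` are assumptions, and the client is passed in as an extra first argument, which the three-parameter version described above does not do, so that the function can be exercised without real credentials.

```javascript
// In server.js the client would be created roughly like this (variable names assumed):
//   import twilio from 'twilio';
//   const client = twilio(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);

// Sends an SMS through a Twilio-style client and logs the resulting message SID.
// `client` is a parameter here only to keep the sketch self-contained.
async function sendMessage(client, from, to, message) {
  try {
    const result = await client.messages.create({ body: message, from, to });
    console.log(`Message sent, SID: ${result.sid}`);
    return result.sid;
  } catch (error) {
    console.error('Failed to send message:', error.message);
    return undefined;
  }
}
```

In the actual server.js, `client` would simply be the module-level constant, and the function would keep the `(from, to, message)` signature.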
Add the following code below the `/ping` route:
The code above defines an Express.js route for handling incoming HTTP POST requests at the path `/incomingMessage`.
This route is designed to receive SMS messages from users.
Upon receiving a message, it extracts the `To`, `Body`, and `From` properties from the request body. It then logs these details and proceeds to call the previously defined `sendMessage()` function, sending the SMS message received back to the sender.
Next, it sends a response indicating that the request has been successfully processed.
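A minimal sketch of such a route handler is shown below. The factory name `makeIncomingMessageHandler` is hypothetical; it exists only so that `sendMessage()` can be injected and the handler run without Express or Twilio installed. Replying with an empty TwiML document is also an assumption, chosen here so that the HTTP response is not itself interpreted as a message to send.

```javascript
// Builds the handler for POST /incomingMessage.
function makeIncomingMessageHandler(sendMessage) {
  return async (req, res) => {
    // Twilio posts form-encoded fields; body-parser exposes them on req.body.
    const { To, Body, From } = req.body;
    console.log(`Message from ${From} to ${To}: ${Body}`);

    // Echo the text back: the Twilio number (To) becomes the sender.
    await sendMessage(To, From, Body);

    // Acknowledge the webhook with an empty TwiML response.
    res.type('text/xml').send('<Response></Response>');
  };
}

// Registration in server.js would look roughly like:
//   app.post('/incomingMessage', makeIncomingMessageHandler(sendMessage));
```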
Go back to your terminal and run the following command to start the server application:
Open another terminal tab, and run the following ngrok command:
The command above exposes your local web server, running on port `3000`, to the internet using the ngrok service.
Copy the forwarding HTTPS URL provided, and paste it into a text editor where you can edit it. Next, add the text `/incomingMessage` to the end of it so that your complete URL looks similar to the following:
Go back to the browser tab where you accessed your Twilio Console and navigate to Active Numbers under Phone Numbers > Manage. Click on the number you are using in this tutorial and scroll down to Messaging. Under A MESSAGE COMES IN, set the HTTP method to POST and assign it the ngrok HTTPS URL you combined earlier. Then, click Save to save the configuration.
Using your preferred method, send an SMS to your Twilio number with the following text:
The message above asks the application to recommend Keanu Reeves action movies released between 2000 and 2023.
For now, the application should simply reply with the same message you sent:
Handling incoming messages
In this section, you will write the code that will be responsible for extracting the movie filters contained inside the incoming messages.
An SMS message sent to the application to get movie recommendations should contain the `/recommend` command followed by the movie genre, release year range, and cast:
The application might respond with a message similar to the following:
To get the details of one of the recommended movies, you should send an SMS containing the `/details` command followed by the movie number on the list:
Once the application is complete, it will respond with a message similar to the following:
In your project root directory, create a file named messageHandler.js and add the following code to it:
This code defines a function named `processGenreFilter()`. It takes a single parameter, `unprocessedFilterValue`, representing the raw input for the `'genre'` filter.
Within the function, the unprocessed filter value is trimmed to remove any leading or trailing whitespace, and the processed filter value is returned.
Add the following code below the `processGenreFilter()` function:
Here, the code defines the `processYearFilter()` function, which is responsible for handling unprocessed filter values associated with the `'year'` filter. It takes a single parameter, `unprocessedFilterValue`, representing the raw input for the year filter.
The function splits this input into start and end years based on the hyphen (`-`) separator, trims any whitespace, and then returns an array containing both values.
Add the following code below the `processYearFilter()` function:
This code defines the `processCastFilter()` function. It accepts `unprocessedFilterValue` as a parameter, representing the unprocessed value of a cast filter provided by the user.
The function splits this value into individual actors using commas as delimiters, trims any leading or trailing whitespace from each actor, and stores the resulting array of actors in the `filterValue` variable. Finally, the function returns this array as the processed `cast` filter.
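The three helpers described above are small enough to sketch together. This is an approximation built from the descriptions, not necessarily the exact code from the repository.

```javascript
// Trims surrounding whitespace from the raw genre text, e.g. " action " -> "action".
function processGenreFilter(unprocessedFilterValue) {
  return unprocessedFilterValue.trim();
}

// Splits "2000-2023" on the hyphen and trims each part.
function processYearFilter(unprocessedFilterValue) {
  return unprocessedFilterValue.split('-').map((year) => year.trim());
}

// Splits a comma-separated list of names and trims each one.
function processCastFilter(unprocessedFilterValue) {
  const filterValue = unprocessedFilterValue.split(',').map((actor) => actor.trim());
  return filterValue;
}
```

Note that a single year such as `"2005"` yields a one-element array, so any code consuming the year filter has to handle both one and two values.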
Add the following code below the `processCastFilter()` function:
This code defines the `getFilters()` function, which takes the incoming message body as a parameter. The function is responsible for processing the user's input message and extracting filters such as genre, year, and cast.
It splits the incoming message into an array of filters, and for each filter, it further splits it into key and value. The function then checks the filter key and processes the filter value accordingly using functions like `processGenreFilter()`, `processYearFilter()`, or `processCastFilter()`.
The processed filters are stored in the `filters` object, which is then logged to the console, and the object is returned.
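The exact message syntax is not shown in this text, so the sketch below assumes one: filters separated by semicolons, with each key and value separated by a colon, as in `/recommend genre: action; year: 2000-2023; cast: Keanu Reeves`. The `processors` map is a compact, hypothetical stand-in for the three helper functions.

```javascript
// Compact stand-ins for processGenreFilter(), processYearFilter(), processCastFilter().
const processors = {
  genre: (value) => value.trim(),
  year: (value) => value.split('-').map((year) => year.trim()),
  cast: (value) => value.split(',').map((actor) => actor.trim()),
};

// ASSUMED message shape: "/recommend genre: action; year: 2000-2023; cast: Keanu Reeves"
function getFilters(incomingMessage) {
  const filters = {};
  const withoutCommand = incomingMessage.replace('/recommend', '');

  for (const rawFilter of withoutCommand.split(';')) {
    const separatorIndex = rawFilter.indexOf(':');
    if (separatorIndex === -1) continue; // skip fragments without "key: value"

    const key = rawFilter.slice(0, separatorIndex).trim().toLowerCase();
    const value = rawFilter.slice(separatorIndex + 1);
    if (Object.hasOwn(processors, key)) {
      filters[key] = processors[key](value);
    }
  }

  console.log(filters);
  return filters;
}
```

Unknown keys are ignored rather than rejected, which keeps the parser forgiving of typos at the cost of silently dropping a filter.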
Add the following code below the `getFilters()` function:
This code defines the `handleIncomingMessage()` function, which is exported and takes `incomingMessage` as a parameter. The function is responsible for handling the incoming user message and generating a response based on the detected commands like `'/recommend'` or `'/details'`.
If the message contains `'/recommend'`, it extracts filters using the `getFilters()` function and converts the `filters` object to a JSON string, forming a `reply`, and then returns the `reply`.
If the message contains `'/details'`, it extracts the movie index and constructs a `reply` indicating the selected movie index, and then returns the `reply`.
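The command dispatch described above could be sketched as follows. The `getFilters()` shown here is only a placeholder for the real function, and the reply wording and the fallback for unrecognized commands are assumptions. In messageHandler.js the function would additionally be exported.

```javascript
// Placeholder for the getFilters() described earlier; it just returns the
// text that follows the command.
function getFilters(incomingMessage) {
  return { raw: incomingMessage.replace('/recommend', '').trim() };
}

async function handleIncomingMessage(incomingMessage) {
  // Fallback reply (assumed; not described in the tutorial text).
  let reply = 'Unknown command. Try /recommend or /details.';

  if (incomingMessage.includes('/recommend')) {
    const filters = getFilters(incomingMessage);
    reply = JSON.stringify(filters);
  } else if (incomingMessage.includes('/details')) {
    // "/details 6" -> 6
    const movieIndex = parseInt(incomingMessage.replace('/details', '').trim(), 10);
    reply = `You selected movie number ${movieIndex}`;
  }

  return reply;
}
```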
Go to the server.js file and add the following code to the import statements section:
Here the code imports the `handleIncomingMessage()` function from the messageHandler.js file.
Update the `/incomingMessage` endpoint code with the following:
Here, instead of just sending back the incoming message, the endpoint calls the `handleIncomingMessage()` function with the incoming message as an argument and waits for a response. The response returned is saved in a constant named `reply`, which is then used as the message body to be sent back to the user.
Using your preferred method, send an SMS to your Twilio number to test the `/recommend` command, with the following text:
The application should send back the following response:
Now do the same to test the `/details` command:
The application should send back the following response:
Getting Started with SPARQL
In this section, you will learn the basics of SPARQL, a query language that will be used to retrieve information from the Wikidata knowledge base. A SPARQL query mainly consists of a `SELECT` and a `WHERE` clause.
The `SELECT` clause specifies the variables that you want to retrieve, typically denoted by names starting with a question mark. These variables represent the information you're interested in obtaining.
The `WHERE` clause contains constraints or conditions on these variables, typically expressed as triples. Triples consist of a subject, a predicate, and an object, forming statements about the data.
When a query is executed, the SPARQL engine attempts to fill in the variables with actual values from the database, generating triples that match the specified criteria. Each result returned represents a combination of variable values found in the database.
A SPARQL query looks similar to this:
The query triple pattern above can be interpreted as follows: `?human` has the `occupation` of `actor`. This query communicates to Wikidata that you are searching for individuals whose occupation is acting. In this context, `?human` serves as the subject, `occupation` as the predicate, and `actor` as the object.
The results for this query could include thousands of actors, such as Halle Berry, Tom Hanks, and Sandra Bullock.
To avoid duplicate results and limit their number, you could add the following to the query:
The `DISTINCT` keyword added in the `SELECT` clause eliminates duplicate results.
The `LIMIT` clause added at the end restricts the number of results returned by the query to 100. This is particularly useful when dealing with a large number of results or when you only need a subset of the results.
Up until now, our SPARQL queries have been written using human-readable names for educational purposes.
On Wikidata, items and properties are not identified by human-readable names like "occupation" (a property) and "actor" (an item). This is because names like "actor" can refer to multiple entities, such as an occupation or the name of a song, album, movie, and so on. Therefore, Wikidata assigns a unique identifier to each item and property.
To find the identifier for an item, navigate to the Special Search page provided by Wikidata, enter the desired item name, and in the Search in filter select only main to limit the search to items. Next, click on the Search button, and copy the Q-number associated with the result that best matches what you're looking for, typically based on its description. In the image below, the identifier of the actor item is Q33999:
Similarly, to find the identifier for a property, enter the desired property name and, in the Search in filter, select property instead of main, which limits the search to properties. Next, click on the Search button, and copy the P-number associated with the result that best matches what you're looking for. In the image below, the identifier of the occupation property is P106:
Lastly, it's important to include prefixes. For straightforward WDQS triples, items should be prefixed with `wd:` and properties with `wdt:`.
Now our query should be:
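Based on the identifiers and prefixes given above, the query would plausibly read as follows. It is shown here inside a JavaScript template string, since that is how the application holds its queries later on; the constant name `actorsQuery` is only illustrative.

```javascript
// The occupation/actor pattern written with Wikidata identifiers:
// wdt:P106 is the "occupation" property and wd:Q33999 is the "actor" item.
const actorsQuery = `
SELECT DISTINCT ?human
WHERE {
  ?human wdt:P106 wd:Q33999 .
}
LIMIT 100`;

console.log(actorsQuery.trim());
```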
Click here to try the query above in the Wikidata Query Service. Then “Run” the query on the WDQS page. Scroll down, and the results should look similar to the following:
The results above only show the identifiers. However, you can click on them to view their corresponding Wikidata page, which includes a human-readable label.
Change the query above to the following to see the human-readable label next to the identifiers:
Here the `?humanLabel` variable was added to the `SELECT` clause and a `SERVICE` keyword was added inside the `WHERE` clause so that a human-readable label is shown for each entity in the results.
The `SERVICE` keyword is used to indicate the use of a service, in this case, the Wikibase label service. This service is used to retrieve human-readable labels for entities, making the results more understandable. The `bd:serviceParam` part specifies parameters for the label service, such as the language of the labels. `[AUTO_LANGUAGE]` is a placeholder that automatically selects the appropriate language based on the user's preferences or the available data.
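A sketch of the query with the label service added is shown below, again held in a JavaScript string. The `,en` fallback after `[AUTO_LANGUAGE]` and the constant names are assumptions. The last lines build a link that opens the query in the WDQS editor, which accepts a URL-encoded query after the `#`.

```javascript
// Adds ?humanLabel and the Wikibase label service to the earlier query.
const actorsWithLabelsQuery = `
SELECT DISTINCT ?human ?humanLabel
WHERE {
  ?human wdt:P106 wd:Q33999 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
LIMIT 100`;

// Link that opens the query in the Wikidata Query Service editor.
const wdqsLink = `https://query.wikidata.org/#${encodeURIComponent(actorsWithLabelsQuery.trim())}`;
console.log(wdqsLink);
```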
Click here to try the query above in the Wikidata Query Service. Then “Run” the query on the WDQS page. Scroll down, and the results should look similar to the following:
SPARQL also allows you to search for entities by using their human-readable name:
Here the code adds the `rdfs:label` predicate to a triple pattern located inside the `WHERE` clause to search for humans who have the English label "Keanu Reeves" and removes the `LIMIT` keyword.
You can try the query above here.
To learn more about SPARQL please visit the official Wikidata SPARQL tutorial page.
Retrieving data from Wikidata and Wikipedia
In this section, you will write the code responsible for transforming the filters extracted from the incoming message into SPARQL queries. Next, you will use these queries with the Wikidata Query Service to obtain movie recommendations. Additionally, you will use the Wikipedia API to retrieve movie details.
Create a file named wikidataRetriever.js and add the following code to it:
This code block imports the `fetch` function from the `node-fetch` package. Additionally, it imports the `fs` module.
Following this, the code reads the content of a file named genres.json synchronously using the `fs.readFileSync()` method. The file contains JSON data representing the movie genres extracted from Wikidata. The content is then parsed into a JavaScript object using `JSON.parse()`, and the resulting object is assigned to the variable `genres`.
Add the following code below the `genres` variable:
Here, the code defines a function named `getGenreID()` that takes a single parameter, `targetGenre`. This function is responsible for finding the genre ID based on a given target genre. It iterates through the `genres` array, which is populated from the genres.json file, comparing the lowercase name of each genre with the lowercase `targetGenre`. If a match is found, it assigns the corresponding genre ID to the `genreID` variable and returns it.
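The lookup can be sketched as below. The two entries are sample data standing in for the contents of genres.json, and the property names `id` and `name` are assumed, since the file's exact structure is not shown here.

```javascript
// In wikidataRetriever.js the list would come from the file, roughly:
//   const genres = JSON.parse(fs.readFileSync('genres.json', 'utf8'));
// Sample entries only; property names `id` and `name` are assumed.
const genres = [
  { id: 'Q188473', name: 'action film' },
  { id: 'Q157443', name: 'comedy film' },
];

// Returns the Wikidata ID of the genre whose name matches, ignoring case,
// or undefined when nothing matches.
function getGenreID(targetGenre) {
  let genreID;
  for (const genre of genres) {
    if (genre.name.toLowerCase() === targetGenre.toLowerCase()) {
      genreID = genre.id;
      break;
    }
  }
  return genreID;
}
```

Because the comparison is an exact (case-insensitive) match, the genre text in the SMS has to be spelled the way the name is stored in the file.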
Add the following code below the `getGenreID()` function:
In this code block, the function `generateActorSparqlQuery()` is declared. It takes a single parameter, `actorName`. The purpose of this function is to generate a SPARQL query to retrieve information about a human actor based on their name.
The SPARQL query is constructed using the provided `actorName` and includes conditions that restrict the results to entities with the actor occupation and the given label.
Lastly, the generated SPARQL query is returned.
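One way to write this, following the `rdfs:label` pattern introduced in the SPARQL section, is sketched below. Escaping the name before interpolating it is an addition that the description above does not mention; it keeps a stray quote in the SMS from breaking the query.

```javascript
// Builds a query for humans with occupation "actor" (wd:Q33999) whose
// English label equals the given name.
function generateActorSparqlQuery(actorName) {
  // Escape backslashes and double quotes so the name stays inside the string literal.
  const safeName = actorName.replace(/\\/g, '\\\\').replace(/"/g, '\\"');

  return `
SELECT DISTINCT ?human
WHERE {
  ?human wdt:P106 wd:Q33999 ;
         rdfs:label "${safeName}"@en .
}`.trim();
}

console.log(generateActorSparqlQuery('Keanu Reeves'));
```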
Add the following code below the `generateActorSparqlQuery()` function:
Here, the code defines an asynchronous function named `sendRequest()`, which takes a `sparqlQuery` parameter. This function is responsible for sending an HTTP request to the Wikidata SPARQL endpoint, using the provided SPARQL query.
The function constructs the URL with the encoded query, sets the necessary headers, and uses the `fetch` function to make the request.
If the response status is 200, it parses and returns the JSON bindings from the response. However, the function logs an error message and returns `undefined` if an error occurs during the process.
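A hedged sketch of such a request function follows. The `fetchImpl` parameter is an extra that is not part of the description above: it defaults to the global `fetch` available in Node.js 18+ (the tutorial itself imports `fetch` from node-fetch) and lets a stub be passed in for testing. The User-Agent value is a made-up placeholder.

```javascript
const WDQS_ENDPOINT = 'https://query.wikidata.org/sparql';

// Sends a SPARQL query to WDQS and returns the result bindings,
// or undefined if anything goes wrong.
async function sendRequest(sparqlQuery, fetchImpl = fetch) {
  const url = `${WDQS_ENDPOINT}?query=${encodeURIComponent(sparqlQuery)}&format=json`;
  const headers = {
    Accept: 'application/sparql-results+json',
    // Hypothetical identifier; replace with something that describes your app.
    'User-Agent': 'movie-recommender-sketch/0.1',
  };

  try {
    const response = await fetchImpl(url, { headers });
    if (response.status !== 200) {
      throw new Error(`Unexpected status ${response.status}`);
    }
    const data = await response.json();
    return data.results.bindings;
  } catch (error) {
    console.error('WDQS request failed:', error.message);
    return undefined;
  }
}
```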
Add the following code below the `sendRequest()` function:
In this code block, an asynchronous function named `getCastIDs()` is declared. This function takes a `cast` parameter, which is an array of actor names. The purpose of this function is to retrieve the Wikidata IDs of actors based on their names.
It iterates through the provided array, generates a SPARQL query for each actor using the `generateActorSparqlQuery()` function, sends a request to Wikidata using `sendRequest()`, and extracts the actor ID from the response. Lastly, the function returns an array of actor IDs.
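The loop might look roughly like this. To keep the sketch self-contained, the query builder and the request function are passed in as the hypothetical parameters `buildQuery` and `sendRequest`; in the tutorial's file they would be the module-level functions. The `human` binding name matches the `?human` variable used in the earlier queries and is otherwise an assumption.

```javascript
// Looks up the Wikidata Q-number for each actor name in `cast`.
async function getCastIDs(cast, buildQuery, sendRequest) {
  const castIDs = [];

  for (const actorName of cast) {
    const bindings = await sendRequest(buildQuery(actorName));
    if (!bindings || bindings.length === 0) continue; // actor not found

    // The value is an entity URI; the Q-number is its last path segment.
    const entityUri = bindings[0].human.value;
    castIDs.push(entityUri.split('/').pop());
  }

  return castIDs;
}
```

Actors that cannot be resolved are skipped, so the caller can detect a failed lookup by comparing the length of the result with the length of `cast`.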
Add the following code below the `getCastIDs()` function:
Here, the code defines a function named `handleCastArgs()`. This function takes an array of actor IDs (`castIDs`) and is responsible for generating a portion of a SPARQL query related to the cast.
If only one actor ID exists, it appends the ID to the `cast` string. If there are multiple actor IDs, it iterates through the array, appending each actor ID separated by commas, and adding appropriate prefixes and suffixes. The resulting `cast` string is then returned.
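A possible shape for this helper is sketched below. It assumes the fragment is a comma-separated object list with the `wd:` prefix already applied; where exactly the tutorial's code adds the prefix is not stated.

```javascript
// Turns ["Q1", "Q2"] into "wd:Q1, wd:Q2". Placed after a predicate such as
// wdt:P161 (cast member), the list requires every listed actor to be in the film.
function handleCastArgs(castIDs) {
  let cast = '';

  if (castIDs.length === 1) {
    cast = `wd:${castIDs[0]}`;
  } else {
    castIDs.forEach((id, index) => {
      const separator = index < castIDs.length - 1 ? ', ' : '';
      cast += `wd:${id}${separator}`;
    });
  }

  return cast;
}
```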
Add the following code below the `handleCastArgs()` function:
In this code block, a function named `handleYearArgs()` is declared. This function takes an array of year values, representing a year or a range of years. It constructs and returns an object containing the start and end dates in the format required for SPARQL queries.
If there are multiple years, it uses the first and last years in the array to create a date range. Otherwise, it uses a single year for both start and end dates.
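As an illustration, the date handling could be written as follows. The property names `startDate` and `endDate`, and the choice to cover the whole of the final year, are assumptions; the literals use the `xsd:dateTime` typed-literal syntax that SPARQL filters expect.

```javascript
// Converts ["2000", "2023"] (or ["2005"]) into SPARQL dateTime literals.
function handleYearArgs(year) {
  const startYear = year[0];
  const endYear = year.length > 1 ? year[year.length - 1] : year[0];

  return {
    startDate: `"${startYear}-01-01T00:00:00Z"^^xsd:dateTime`,
    endDate: `"${endYear}-12-31T23:59:59Z"^^xsd:dateTime`,
  };
}

console.log(handleYearArgs(['2000', '2023']));
```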
Add the following code below the `handleYearArgs()` function:
This code defines a function named `generateMoviesSparqlQuery()`. This function takes three parameters: `genre`, `cast`, and `year`. It is responsible for constructing a SPARQL query used to retrieve distinct movie items with their labels and associated Wikipedia articles based on specified filters such as genre, cast, and release year.
The function dynamically generates parts of the query based on the provided parameters, and the resulting query is returned.
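The sketch below shows one way such a query could be assembled. It assumes the arguments arrive already prepared: `genre` as a Q-number, `cast` as a `wd:`-prefixed object list, and `year` as an object with `startDate` and `endDate` literals. The variable names, the English-only label language, and the `LIMIT 10` are likewise assumptions. The properties used are P31 (instance of) with Q11424 (film), P136 (genre), P161 (cast member), and P577 (publication date).

```javascript
function generateMoviesSparqlQuery(genre, cast, year) {
  // Each clause is only included when the corresponding filter was provided.
  const genreClause = genre ? `?movie wdt:P136 wd:${genre} .` : '';
  const castClause = cast ? `?movie wdt:P161 ${cast} .` : '';
  const yearClause = year
    ? `?movie wdt:P577 ?date .
  FILTER(?date >= ${year.startDate} && ?date <= ${year.endDate})`
    : '';

  return `
SELECT DISTINCT ?movie ?movieLabel ?article
WHERE {
  ?movie wdt:P31 wd:Q11424 .
  ${genreClause}
  ${castClause}
  ${yearClause}
  ?article schema:about ?movie ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10`;
}
```

The `schema:about` pattern restricts results to movies that have an English Wikipedia article, which is what makes the later details lookup possible.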
Add the following code below the `generateMoviesSparqlQuery()` function:
Here, the code exports an asynchronous function named `getMovies()`. This function takes a `filters` parameter representing various criteria (such as `genre`, `cast`, and `year`) to filter movie results.
It first retrieves the genre ID using `getGenreID()` and actor IDs using `getCastIDs()`. If the number of actors found does not match the expected count, it logs an error and returns an object indicating the failure to find actors.
The function then constructs the SPARQL query using `generateMoviesSparqlQuery()`, sends the query to Wikidata via `sendRequest()`, and stores the response in a variable named `movies`.
If movies are found, it returns an object with the movies found; otherwise, it returns an error object.
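The control flow described above can be sketched as follows. The helper functions are supplied through a hypothetical `deps` object so the example runs without network access, and the shapes of the returned objects (`{ movies }` and `{ error }`) are assumptions. In the tutorial's file the function would also be exported.

```javascript
// Orchestrates the lookup: genre ID, actor IDs, query, request.
async function getMovies(filters, deps) {
  const genreID = filters.genre ? deps.getGenreID(filters.genre) : undefined;
  const cast = filters.cast ?? [];
  const castIDs = await deps.getCastIDs(cast);

  // Every requested actor must have been resolved to a Wikidata ID.
  if (castIDs.length !== cast.length) {
    console.error('Some actors could not be found on Wikidata');
    return { error: 'actors not found' };
  }

  const query = deps.generateMoviesSparqlQuery(genreID, castIDs, filters.year);
  const movies = await deps.sendRequest(query);

  return movies && movies.length > 0 ? { movies } : { error: 'movies not found' };
}
```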
Add the following code below the `getMovies()` function:
The code defines a function named `truncate()`, which takes a `str` parameter representing an input string and a `max` parameter that specifies the maximum length the string can be. This function is responsible for truncating the input string to the specified maximum length and adding an ellipsis character if the string exceeds that length.
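A minimal version of this helper, written from the description above, could be:

```javascript
// Returns `str` unchanged when it fits; otherwise cuts it to `max`
// characters and appends a single ellipsis character (U+2026).
function truncate(str, max) {
  if (str.length <= max) return str;
  return str.slice(0, max) + '\u2026';
}
```

Note that in this sketch the ellipsis is added on top of `max`, so a truncated string is `max + 1` characters long.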
Add the following code below the `truncate()` function:
This code exports an asynchronous function named `getMoviePageSummary()`, whose body is wrapped in a `try-catch` block. This function takes a `title` parameter representing the movie title.
It constructs a URL for the Wikipedia API using the `title`, and makes a request to retrieve the page summary.
Once the summary is retrieved, the code calls the `truncate()` function to truncate the summary to a maximum of 800 characters and returns the truncated summary. If an error occurs, it returns `undefined`.
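The text does not say which Wikipedia endpoint is used, so the sketch below assumes the REST "page summary" endpoint, whose JSON response carries the plain-text summary in an `extract` field. The `fetchImpl` parameter is an addition for testing and defaults to the global `fetch` in Node.js 18+. The truncation is inlined here rather than calling `truncate()` so that the block stands alone.

```javascript
// ASSUMPTION: uses the Wikipedia REST page-summary endpoint.
async function getMoviePageSummary(title, fetchImpl = fetch) {
  try {
    const url = `https://en.wikipedia.org/api/rest_v1/page/summary/${encodeURIComponent(title)}`;
    const response = await fetchImpl(url);
    if (!response.ok) throw new Error(`Unexpected status ${response.status}`);

    const page = await response.json();
    const summary = page.extract; // plain-text summary of the article

    // Same effect as truncate(summary, 800).
    return summary.length > 800 ? summary.slice(0, 800) + '\u2026' : summary;
  } catch (error) {
    console.error('Could not fetch summary:', error.message);
    return undefined;
  }
}
```

Keeping the reply to roughly 800 characters limits the number of SMS segments needed to deliver it.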
Go back to the messageHandler.js file and add the following code to the top of the file:
This code imports the `getMoviePageSummary()` and `getMovies()` functions from the wikidataRetriever.js file. Additionally, it declares a variable named `movies` and initializes it as an empty array. The `movies` array will be used to store the fetched movie data.
Add the following code below the `getFilters()` function:
This code defines an asynchronous function named `getMoviesRecommendations()`. This function takes a `filters` parameter representing the user's preferences for movie recommendations.
It awaits the result from the `getMovies()` function, which fetches movie data based on the specified filters. If an error occurs during the retrieval, the function returns an error message. Otherwise, it constructs a `reply` string containing a list of recommended movies.
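A sketch of how the reply might be assembled is shown below. `getMovies` is passed in as a parameter only to keep the example self-contained; the result shape (`{ movies }` or `{ error }`), the `movieLabel` binding name, and the reply wording are all assumptions.

```javascript
let movies = []; // module-level cache, later read by the /details command

async function getMoviesRecommendations(filters, getMovies) {
  const result = await getMovies(filters);
  if (result.error) {
    return `Sorry, something went wrong: ${result.error}`;
  }

  // Remember the list so "/details <n>" can refer back to it.
  movies = result.movies;

  let reply = 'Recommended movies:\n';
  movies.forEach((movie, index) => {
    reply += `${index + 1}. ${movie.movieLabel.value}\n`;
  });
  return reply.trim();
}
```

Storing the list in a single module-level variable is simple, but it means all users share the most recent recommendation list; a per-sender store would be needed for more than one concurrent user.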
Add the following code below the `getMoviesRecommendations()` function:
Here, the code defines an asynchronous function named `getMovieDetails()`. This function takes a `title` parameter representing the movie title.
It awaits the result from the `getMoviePageSummary()` function, which fetches the Wikipedia page summary for the specified movie title. If the details are found, the function returns them; otherwise, it sets a `reply` indicating the failure to find the movie details.
Replace the content of the `handleIncomingMessage()` function with the following:
With the code changes above, the `handleIncomingMessage()` function, instead of returning a JSON stringified version of the `filters` for the `'/recommend'` command or a simple response for the `'/details'` command, now invokes the functions `getMoviesRecommendations()` and `getMovieDetails()` to generate appropriate responses for the user.
For the `'/recommend'` command, it processes the filters using the `getFilters()` function and then calls the `getMoviesRecommendations()` function with the obtained `filters`. The result of this function call is stored in the `reply` variable, and it is returned as the response to the user.
Similarly, for the `'/details'` command, it extracts the movie index from the incoming message, uses the index to select a movie stored in the `movies` array, retrieves the corresponding movie URL, extracts the `title`, and then calls the `getMovieDetails()` function with the obtained `title`. The response from this function call is stored in the `reply` variable, and it is returned as the response to the user.
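The `/details` branch involves the most steps, so it is sketched on its own below. The function names `titleFromArticleUrl` and `replyForDetails` are hypothetical, the `article` binding name is assumed, and the summary lookup is injected so the example runs offline. The fallback messages are also illustrative.

```javascript
// Extracts the page title from a Wikipedia article URL,
// e.g. "https://en.wikipedia.org/wiki/Some_Title" -> "Some_Title".
function titleFromArticleUrl(articleUrl) {
  const lastSegment = new URL(articleUrl).pathname.split('/').pop();
  return decodeURIComponent(lastSegment);
}

// Resolves the movie chosen with "/details <n>" against the cached list.
async function replyForDetails(incomingMessage, movies, getMoviePageSummary) {
  const movieIndex = parseInt(incomingMessage.replace('/details', '').trim(), 10);
  const movie = movies[movieIndex - 1]; // the list shown to the user is 1-based
  if (!movie) {
    return 'Please send a number from the most recent recommendation list.';
  }

  const title = titleFromArticleUrl(movie.article.value);
  const details = await getMoviePageSummary(title);
  return details ?? 'Sorry, the details for this movie could not be found.';
}
```

Checking that the index points at an existing entry guards against a `/details` request sent before any `/recommend` request, when the cached list is still empty.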
Using your preferred method send the following SMS to your Twilio Number to receive action movie recommendations starring the actor Keanu Reeves:
The application should send back a response similar to the following:
Now send the following SMS to retrieve the movie details for the sixth movie on the list (John Wick):
The application should send back the following response:
Conclusion
In this tutorial, you learned how to build a text-based movie recommendation application that provides movie suggestions via SMS. First, you learned how to leverage the Twilio Programmable Messaging API for message handling. Next, you learned the basics of SPARQL and used this knowledge to retrieve movies from the Wikidata Query Service. Lastly, you learned how to use the Wikipedia API to retrieve movie details.