Creating a Recipe Recommendation Chatbot with Ollama and Twilio

January 30, 2025
Written by Carlos Mucuho

In this tutorial, you will learn how to build a WhatsApp recipe recommendation chatbot that uses Retrieval-Augmented Generation (RAG) to provide recipe suggestions and computer vision to process images containing cooking ingredients. You'll create a chatbot that can understand recipe requests, analyze ingredient photos, and engage in natural conversations about cooking.

To build this application, you will use the Twilio Programmable Messaging API for handling WhatsApp messages and Ollama, an open-source tool for running large language models (LLMs) locally. You'll implement a RAG system that uses the Nomic embed text model and Chroma to generate and store recipe embeddings, the Mistral NeMo model to generate contextual responses, and the LLaVA model to process ingredient images.

RAG is a technique that enhances language model responses by retrieving relevant information from a knowledge base before generating answers. This approach improves accuracy and grounds the model's responses in factual data.

Embeddings are numerical representations of text that capture semantic meaning, allowing the system to understand and compare the similarity between recipes and user queries effectively.

The Twilio Programmable Messaging API is a service that enables developers to programmatically send and receive WhatsApp messages from their applications.

Ollama is a tool that simplifies running and managing different AI models locally.

Chroma serves as a vector database for efficiently storing and retrieving recipe embeddings.
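
To make the embedding-similarity comparison described above concrete, here is a minimal illustrative sketch; the three-dimensional vectors are made up for demonstration, while real embeddings from the Nomic embed text model have hundreds of dimensions:

// Cosine similarity: values near 1 mean the texts are semantically close, values near 0 mean unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy "embeddings" for three pieces of text (purely illustrative values).
const tomatoPasta = [0.9, 0.1, 0.3];
const tomatoSalad = [0.8, 0.2, 0.35];
const chocolateCake = [0.1, 0.9, 0.05];

console.log(cosineSimilarity(tomatoPasta, tomatoSalad));   // close to 1: semantically similar
console.log(cosineSimilarity(tomatoPasta, chocolateCake)); // noticeably lower: semantically distant

Later in the tutorial, Chroma performs this kind of comparison for you: the recipe collection is configured to use cosine similarity when matching a query embedding against the stored recipe embeddings.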

Here's what your chatbot will look like in action by the end of this tutorial:

RAG powered recipe recommendation Twilio WhatsApp chatbot demo

Tutorial Requirements

To follow this tutorial, you will need the following components:

  • A recent version of Node.js and npm installed on your machine.
  • A free Twilio account with access to the WhatsApp Sandbox, and a smartphone with WhatsApp installed to test the chatbot.
  • Docker installed, which you will use to run the Chroma vector database.
  • ngrok installed, which you will use to expose your local server to Twilio.
  • A machine capable of running Ollama and the models used in this tutorial (the model downloads total several gigabytes).

Creating The Project Directory

In this section, you will create a new project directory for your WhatsApp recipe recommendation chatbot, initialize a new Node project, and install the required packages to interact with Ollama, Twilio, and Chroma.

Open a terminal window and navigate to a suitable location for your project. Run the following commands to create the project directory and navigate into it:

mkdir recipes-chatbot
cd recipes-chatbot

Use the following command to create a directory named images, where the chatbot will store the images sent by the user:

mkdir images

Download this image, store it in the images directory, and name it image.jpg. The image shows an assortment of fruits and vegetables on top of a wooden table.

Download this recipes.txt file and store it in your project root directory. This file contains a list of recipes that might be suggested to the user.

Run the following command to create a new Node.js project:

npm init -y

Using your preferred text editor, open the package.json file and set the project type to module.

{
  "name": "recipes-chatbot",
  "type": "module",
  …
}

Now, use the following command to install the packages needed to build this application:

npm install ollama cross-fetch chromadb twilio express body-parser dotenv

With the command above, you installed the following packages:

  • ollama: a client library that lets you use JavaScript to interact with the Ollama server API. It will handle communication with the LLMs used throughout this tutorial.
  • cross-fetch: a package that adds Fetch API support to Node.js, enabling resource fetching such as API requests. It will be used to send HTTP requests to the Ollama server, preventing request timeouts that may occur when an LLM needs extra time to generate a response.
  • chromadb: a JavaScript package for working with the Chroma database. It will be used to store and retrieve the embeddings created for the recipes stored in the recipes.txt file.
  • twilio: a package that allows you to interact with the Twilio API. It will be used to create the Twilio WhatsApp chatbot.
  • express: a minimal and flexible Node.js back-end web application framework that simplifies the creation of web applications and APIs. It will be used to serve the Twilio WhatsApp chatbot.
  • body-parser: an Express body-parsing middleware. It will be used to parse the URL-encoded request bodies sent to the Express application.
  • dotenv: a Node.js package that loads environment variables from a .env file into process.env. It will be used to retrieve the Twilio API credentials and the Ollama base URL that you will soon store in a .env file.

Setting Up Ollama

In this section, you’ll set up Ollama to interact with LLMs. You’ll download the necessary models and start the Ollama server, which will handle natural language processing tasks for your chatbot.

To get started with Ollama, download the correct installer for your operating system from the official Ollama site.

With Ollama now installed, you can begin interacting with LLMs. Before using any commands, start Ollama by opening the app or running this command in your terminal:

ollama serve

This command will start the Ollama server, which runs on port 11434, making it ready to accept commands. You can confirm it’s running by visiting http://localhost:11434 in a browser.

If you get the following error in the terminal, it means that the Ollama server is already running: Error: listen tcp 127.0.0.1:11434: bind: address already in use

To download a model without running it, use the ollama pull command. For instance, to pull the Nomic embed text model, run:

ollama pull nomic-embed-text

This command will download the model, making it available for later use. The Nomic embed text model is designed to create text embeddings. You will use it to generate embeddings for the recipes stored in the recipes.txt file and for the user queries.
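
If you want to confirm the model responds before wiring it into the application, here is an optional, minimal check that uses the ollama JavaScript client you installed earlier; the file name embed-check.js and the sample sentence are just placeholders:

// embed-check.js — an optional sanity check, not part of the chatbot code.
// Assumes the Ollama server is running locally on the default port 11434.
import { Ollama } from 'ollama';

const ollama = new Ollama({ host: 'http://localhost:11434' });

const response = await ollama.embed({
  model: 'nomic-embed-text',
  input: 'Crispy salt and pepper potatoes',
});

// One embedding is returned per input string; each embedding is an array of numbers.
console.log(response.embeddings.length, response.embeddings[0].length);

Save it in your project directory and run it with node embed-check.js; you should see the number of embeddings returned and the dimensionality of the first one.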

You can check all downloaded models by running:

ollama list

The output will show a list of models, including their sizes and the date they were last modified:

NAME                       ID              SIZE      MODIFIED
nomic-embed-text:latest    0a109f422b47    274 MB    2 weeks ago

Now use the same command to pull the LLaVA model:

ollama pull llava

LLaVA (Large Language and Vision Assistant) is a multimodal AI model that integrates natural language processing with computer vision capabilities. It processes and analyzes images to identify objects, their relationships, scene composition, and visual attributes, then generates accurate descriptions using natural language. You will use this model to describe the images sent by the user and enumerate possible ingredients in them.

To see all available models on your device so far, run:

ollama list

You should see something similar to:

NAME                       ID              SIZE      MODIFIED
nomic-embed-text:latest    0a109f422b47    274 MB    2 weeks ago
llava:latest               8dd30f6b0cb1    4.7 GB    3 weeks ago

To pull a model and execute it in one step, use the ollama run command. For example, to run the Mistral NeMo large language model, type:

ollama run mistral-nemo

If the model isn’t already on your device, this command will download it and start the model so you can begin querying it right away. The Mistral NeMo model is a versatile conversational LLM that is ideal for dialogue-based interactions, such as chatbots and virtual assistants. You will use this model to answer the user queries.

When executed, your terminal will display:

csfm1993:~$ ollama run mistral-nemo
>>> Send a message (/? for help)

Try sending a simple prompt to test the model’s capabilities:

"Hello, can you introduce yourself?"

Here is the response:

>>> Hello, can you introduce yourself?
Hello! I'm a text-based AI assistant. (I don't have personal experiences or feelings, but I'm here to help answer your questions
and provide information as best I can.) You can call me Assistant. How about you? Would you like to tell me something about
yourself?

The prompt asks the model to introduce itself. The response may vary but typically includes a brief, friendly introduction about the model’s purpose and functions.

Once you are done testing the model, send the command /bye to end the chat.

Setting Up Chroma With Docker

In this section, you will set up Chroma using Docker to store and retrieve embeddings for your recipes. This step ensures that your chatbot can efficiently search and recommend recipes based on user queries.

To set up Chroma with Docker, run the following command in your terminal from the root directory of your project:

docker run -d --rm --name chromadb -e IS_PERSISTENT=TRUE -e ANONYMIZED_TELEMETRY=TRUE -v ./chroma:/chroma/chroma -p 8000:8000 chromadb/chroma

Here is a breakdown of what the command above does:

  • docker run: Starts a new Docker container based on the specified image, in this case, chromadb/chroma.
  • -d: Runs the container in the background (detached mode), allowing the terminal to be free for other tasks.
  • --rm: Automatically removes the container once it stops, keeping your system clean and free of unused containers.
  • --name chromadb: Assigns the name "chromadb" to the container, making it easier to refer to in future Docker commands.
  • -e IS_PERSISTENT=TRUE: Sets an environment variable that ensures data stored in Chroma is persistent, meaning it will not be lost after the container stops or restarts.
  • -e ANONYMIZED_TELEMETRY=TRUE: Enables anonymized telemetry, allowing Chroma to collect usage statistics for improving the product without tracking any personal information.
  • -v ./chroma:/chroma/chroma: Mounts a volume from your local ./chroma directory to the /chroma/chroma directory inside the container. This allows Chroma to store data in a location on your local system, which can persist across container restarts.
  • -p 8000:8000: Maps port 8000 on your local machine to port 8000 in the container, making Chroma accessible via http://localhost:8000.

After running this command, Chroma will be set up and running, with its data stored persistently in a subdirectory named chroma located in your project root directory.

To ensure that the Chroma container is running, run the following command:

docker ps

This command lists all the active containers currently running on your system. You should see output similar to the following:

CONTAINER ID   IMAGE             COMMAND                  CREATED              STATUS              PORTS                                       NAMES
346a8043a119   chromadb/chroma   "/docker_entrypoint.…"   About a minute ago   Up About a minute   0.0.0.0:8000->8000/tcp, :::8000->8000/tcp   chromadb

The docker ps output shows the container's unique CONTAINER ID, the image used (chromadb/chroma), the command it's executing, and how long ago it was created. The STATUS confirms it's running, while PORTS shows that port 8000 is mapped to your local machine.

If you see a similar output, it confirms that the Chroma container is successfully running and accessible on port 8000.
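
You can also check connectivity from Node.js with a short, optional script; this sketch assumes the chromadb client exposes a heartbeat() helper, as recent versions do, and the file name chroma-check.js is just a placeholder:

// chroma-check.js — optional connectivity check, not part of the chatbot code.
import { ChromaClient } from 'chromadb';

const chroma = new ChromaClient({ path: 'http://localhost:8000' });

try {
  // heartbeat() returns a timestamp when the Chroma server is reachable.
  const beat = await chroma.heartbeat();
  console.log('Chroma is reachable, heartbeat:', beat);
} catch (error) {
  console.error('Could not reach Chroma on port 8000:', error.message);
}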

Collecting And Storing Your Credentials

In this section, you will collect and store your Twilio credentials that will allow you to interact with the Twilio API.

Open a new browser tab and log in to your Twilio Console. Once you are in the console, copy the Account SID and Auth Token, create a new file named .env in your project's root directory, and store these credentials in it:

TWILIO_ACCOUNT_SID=<your Twilio Account SID>
TWILIO_AUTH_TOKEN=<your Twilio Auth Token>

Now, add the Ollama base URL to the same .env file:

TWILIO_ACCOUNT_SID=<your Twilio Account SID>
TWILIO_AUTH_TOKEN=<your Twilio Auth Token>
OLLAMA_BASE_URL="http://localhost:11434"

If you are running Ollama on a remote machine, make sure to replace the Ollama base URL with the correct one.

Setting Up The Database Helper

In this section, you will create database helper functions to manage recipes and query embeddings. These functions will handle the creation, storage, and retrieval of embeddings using Ollama and Chroma.

First, take a look at the content of the recipes.txt file. Open it in your preferred text editor, and you should see something like this:

Title: Miso-Butter Roast Chicken With Acorn Squash Panzanella
Ingredients:
['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher salt, divided, plus more', '2 small acorn squash (about 3 lb. total)', …]
Instructions:
Pat chicken dry with paper towels …
--+--
Title: Crispy Salt and Pepper Potatoes
Ingredients:
['2 large egg whites', '1 pound new potatoes (about 1 inch in diameter)', '2 teaspoons kosher salt', …]
Instructions:
Preheat oven to 400°F …
--+--

This file contains several recipes, each formatted with a title, ingredients, and cooking instructions. Each recipe is separated by the marker --+--.

Now, you will create the database helper functions to manage recipe embeddings. First, create a file named dbHelper.js in your project root directory.

Next, add the following code to set up the necessary imports and initialize the Ollama and Chroma clients:

import { Ollama } from 'ollama';
import { ChromaClient } from 'chromadb';
import fs from 'fs';
import fetch from 'cross-fetch';
import process from 'process';
import 'dotenv/config';
const ollamaBaseURL = process.env.OLLAMA_BASE_URL;
const ollama = new Ollama({ host: ollamaBaseURL, fetch: fetch });
const chroma = new ChromaClient({ path: 'http://localhost:8000' });
const collection = await chroma.getOrCreateCollection({ name: 'recipes', metadata: { 'hnsw:space': 'cosine' } });

This code initializes the Ollama and Chroma clients. The application uses Ollama to generate embeddings and Chroma as its vector database. The collection where the embeddings will be stored is named recipes, and it is configured to use cosine similarity for comparing recipe embeddings.

Next, add the functions for processing the recipe data:

async function splitIntoChunks(fileName) {
  let content = fs.readFileSync(fileName, { encoding: 'utf-8' });
  const chunks = content.split('--+--');
  chunks.splice((chunks.length - 1), 1);
  return chunks;
}
async function generateEmbeddings(chunks) {
  const response = await ollama.embed({ model: 'nomic-embed-text', input: chunks });
  return response.embeddings;
}

These functions handle the text chunking and embedding generation. The splitIntoChunks function reads the recipes file and places each recipe into its own text chunk, while generateEmbeddings converts these chunks into numerical vectors using Ollama and the Nomic embed text model.
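
To see why the splice call is needed, consider a small, made-up string that uses the same --+-- separator as recipes.txt:

// Two made-up recipes followed by the trailing separator, mirroring the structure of recipes.txt.
const sample = 'Title: Recipe A\nInstructions: ...\n--+--\nTitle: Recipe B\nInstructions: ...\n--+--\n';

const chunks = sample.split('--+--');
console.log(chunks.length); // 3 — the trailing separator leaves a final chunk with no recipe in it

chunks.splice(chunks.length - 1, 1); // drop that empty trailing chunk
console.log(chunks.length); // 2 — one chunk per recipe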

Now, add the function to create and store recipe embeddings:

async function createEmbeddings() {
  let fileName = 'recipes.txt';
  const chunks = await splitIntoChunks(fileName)
  const chunkIdentifiers = [];
  for (let i = 0; i < chunks.length; i++) {
    chunkIdentifiers.push(`${fileName}-${i}`);
  }
  const metadatas = Array(chunks.length).fill({ source: fileName });
  const embeddings = await generateEmbeddings(chunks);
  console.log('embeddings', embeddings.length);
  await collection.add({ ids: chunkIdentifiers, embeddings: embeddings, documents: chunks, metadatas: metadatas });
}

This function manages the entire embedding creation process. It generates unique identifiers for each recipe chunk and stores them as documents in Chroma along with their embeddings and metadata.

Finally, add the function to retrieve recipes based on user queries:

export async function retrieveRecipes(args) {
  const query = args.RAGQuery;
  const numberOfRecipes = args.numberOfRecipes !== undefined ? args.numberOfRecipes : 3
  const queryEmbedding = (await ollama.embed({ model: 'nomic-embed-text', input: query })).embeddings[0];
  const docsFound = await collection.query({ queryEmbeddings: [queryEmbedding], nResults: numberOfRecipes });
  return docsFound;
}

The retrieveRecipes function performs a semantic search using embedding similarity. It converts the query into an embedding, finds the most similar recipes in the database, and limits the number of results to the value specified in the numberOfRecipes property of the args object (defaulting to three when it is not provided).
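
Chroma groups query results per query embedding, so with a single query the matching recipes end up in documents[0]. The object below is a simplified illustration of that shape, with made-up values; the chat logic you will write later reads it exactly this way:

// Simplified, illustrative shape of a Chroma query result for one query embedding.
const exampleResult = {
  ids: [['recipes.txt-12', 'recipes.txt-87', 'recipes.txt-3']],
  documents: [['Title: ...first recipe...', 'Title: ...second recipe...', 'Title: ...third recipe...']],
  metadatas: [[{ source: 'recipes.txt' }, { source: 'recipes.txt' }, { source: 'recipes.txt' }]],
  distances: [[0.21, 0.27, 0.31]],
};

// chat.js will join the matches for the first (and only) query into a single string:
console.log(exampleResult.documents[0].join('\n\n'));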

To ensure that the database helper functions are working correctly, you can add a test function. In dbHelper.js, add the following code:

async function test() {
  await createEmbeddings();
  console.log('created embeddings')
  const args = {
    RAGQuery: 'Suggest two recipes with cheese',
    numberOfRecipes: 2,
  }
  const recipes = await retrieveRecipes(args);
  console.log(`recipes found: ${recipes.documents}`);
}
test();

This test function first calls the createEmbeddings() function to generate and store embeddings for the recipes. It then performs a test query to retrieve recipes that include cheese and limits the number of results to two by calling the retrieveRecipes() function and passing the args object as an argument. The results are logged to the console, allowing you to verify that the embeddings and retrieval functions are working correctly.

Run the following command to test the functions:

node dbHelper.js

You should get an output similar to the following:

embeddings 543
created embeddings
recipes found:
Title: Creole Cream Cheese
Ingredients:
['2 quarts skim milk', '1⁄4 cup buttermilk', '1 rennet tablet']
Instructions:
In a large heavy-bottomed saucepot over medium heat with a kitchen thermometer attached to the rim, warm the milk and buttermilk to 85°F …
,
Title: Syrniki (Сырники / Farmer’s Cheese Pancakes)
Ingredients:
['2 egg yolks', '2 cups (1 pound) tvorog or farmer’s cheese, homemade or store-bought', '...']
Instructions:
In a medium bowl, beat the egg yolks into the farmer’s cheese, then stir in the sugar. Mix together 1⁄2 cup of the flour, the baking powder, and the salt and add to the cheese mixture. If the mixture seems dry, add a little heavy cream …
,
…

The output above shows that the embeddings were created for all the recipes stored in the recipes.txt file, and also shows the result for the test query to retrieve recipes that have cheese as an ingredient.

After you are done testing the functions in the database helper, comment out the test() function call:

// test();

Adding Image Processing

In this section, you will add image processing capabilities to your chatbot. This step involves managing images sent by the user, using Ollama and the LLaVA model to describe the images, and identifying ingredients.

In your project root directory, create a new file named handleImages.js to manage image-related functionality. Start with the basic setup:

import { Ollama } from 'ollama';
import * as fs from 'fs';
import fetch from 'cross-fetch';
import process from 'process';
import 'dotenv/config';
const ollamaBaseURL = process.env.OLLAMA_BASE_URL;
const ollama = new Ollama({ host: ollamaBaseURL, fetch: fetch });
const imagePath = './images/image.jpg';

This code sets up the image processing environment, creating an Ollama client and defining the path where the application will temporarily store uploaded images.

Add the function to handle image storage:

export async function storeImage(mediaURL) {
  return new Promise((resolve) => {
    fetch(mediaURL)
      .then((res) => {
        // Stream the downloaded image to disk and resolve once the file is fully written.
        const fileStream = fs.createWriteStream(imagePath);
        res.body.pipe(fileStream);
        fileStream.on('finish', () => resolve(imagePath));
      }).catch((error) => {
        console.error(error);
        resolve(undefined);
      });
  });
}

The storeImage function downloads images from Twilio's media URLs and saves them in the images directory under the name image.jpg for processing.

Add the image description function:

export async function describeImage() {
  try {
    const prompt = 'Provide a detailed description of the image, enumerating each visible element.';
    const imageData = fs.readFileSync(imagePath).toString('base64');
    const response = await ollama.chat({
      model: 'llava',
      messages: [{ role: 'user', content: prompt, images: [imageData] }],
    });
    return response.message.content;
  } catch (error) {
    return 'Failed to use the Vision model to answer your message'
  }
}

This function uses Ollama and the LLaVA model to analyze the image stored in the images directory, provide detailed descriptions, and enumerate visible items.

If there aren't any errors, the function returns the response generated by the LLaVA model; otherwise, it returns a message stating that the application failed to use the vision model to describe the image.

To ensure that the image processing functions are working correctly, you can add a test function.

Add the following code at the bottom of the handleImages.js file:

async function test() {
  const response = await describeImage();
  console.log(`image description: ${response}`);
}
test()

This test function calls the describeImage() function to generate the sample image description and logs the result to the console. This function allows you to verify that the image processing functions are working correctly.

Run the following command to test the functions:

node handleImages.js

You should get an output similar to the following:

image description:  The image is a color photograph featuring an array of fresh produce arranged on a wooden surface. The items displayed include various fruits and vegetables:
1. Multiple bunches of carrots with green tops in different areas across the table.
2. A cluster of grapes, which appear ripe.
3. A head of lettuce with crisp leaves at the center.
4. Three ears of corn lying on their sides.
5. A large watermelon with a visible rind pattern and cut into quarters, revealing a red interior.
6. Multiple bell peppers in different colors (red, yellow, and green) spread across the table, with some partially overlapping.
7. Two apples lying next to each other; one is closer to the viewer than the other.
8. A bunch of bananas with their characteristic curved shape, near the watermelon and apples.
9. Three tomatoes that vary in color from red to orange-red and are located close to the bell peppers.
10. A pile of whole onions at the center-left of the table.
The background is a simple wooden surface, and the image has a high-contrast, selective-focus composition where the produce stands out prominently against the plain backdrop. There are no texts or distinctive markings visible in this image. The style of the image suggests a focus on healthy eating with vibrant colors emphasizing freshness and variety.

The output above shows that the LLaVA model was able to make a decent description of the image contents.

After you are done testing the functions in the handleImages.js file, comment out the test() function call:

// test();

Implementing The Chat Functionality

In this section, you will create a new file called chat.js to handle all of your chatbot's conversational logic. This file will manage the interaction between the user and the chatbot, using Ollama for interacting with the Mistral NeMo LLM.

Create a new file named chat.js in your project root directory. In the chat.js file, start by importing the required dependencies and setting up the Ollama client:

import { Ollama } from 'ollama';
import fetch from 'cross-fetch';
import { retrieveRecipes } from './dbHelper.js';
import process from 'process';
import 'dotenv/config';
const ollamaBaseURL = process.env.OLLAMA_BASE_URL;
const ollama = new Ollama({ host: ollamaBaseURL, fetch: fetch });

This initial setup establishes the connection to Ollama and imports necessary modules.

Next, define the message history array and the retrieval tool for recipe searches:

const messages = [];
const retrievalTool = {
  type: 'function',
  function: {
    name: 'retrieve_recipes',
    description: 'Given an ingredient or ingredients names retrieve recipes',
    parameters: {
      type: 'object',
      properties: {
        RAGQuery: {
          type: 'string',
          description: 'The ingredient or ingredients names',
        },
        numberOfRecipes: {
          type: 'number',
          description: 'The number of recipes',
        }
      },
      required: ['RAGQuery'],
    },
  },
};

Here you create a messages array to maintain conversation history and define a retrieval tool that follows OpenAI's function calling format. This tool specification tells the LLM how to structure recipe search requests: the model extracts the ingredient names and the desired number of recipes from the user query and places them in the RAGQuery and numberOfRecipes parameters, respectively.
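
When the model decides to call the tool, the response message carries a tool_calls array instead of plain text. The object below illustrates roughly what that looks like with the ollama client, using made-up values; note that the arguments arrive as an object you can pass straight to your own function:

// Illustrative shape of a tool-calling response from ollama.chat (values are examples only).
const exampleResponse = {
  message: {
    role: 'assistant',
    content: '',
    tool_calls: [
      {
        function: {
          name: 'retrieve_recipes',
          arguments: { RAGQuery: 'tomatoes', numberOfRecipes: 2 },
        },
      },
    ],
  },
};

// The chat() function you will build next dispatches on these two fields:
const call = exampleResponse.message.tool_calls[0].function;
console.log(call.name, call.arguments); // retrieve_recipes { RAGQuery: 'tomatoes', numberOfRecipes: 2 }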

Add the function to manage chat history:

export function addToChatHistory(message) {
  messages.push(message);
}

This simple yet crucial function maintains the conversation context by adding each message exchange to the history array. This context helps the LLM provide more coherent and relevant responses throughout the conversation.

Now, implement the main chat function with initial query processing:

export async function chat(query) {
  try {
    const model = 'mistral-nemo';
    const initialPrompt = `You are an assistant that, given the following user query, returns a string. If the user is looking for a recipe by name or ingredient, return the appropriate query. If the user is not trying to find a recipe, answer the query directly. 
    Query: ${query}`;
    addToChatHistory({ 'role': 'user', 'content': initialPrompt });
    const response = await ollama.chat({
      model: model,
      messages: messages,
      tools: [retrievalTool],
    });
    if (!response.message.tool_calls || response.message.tool_calls.length === 0) {
      console.log('The model didn\'t use the function. Its response was:');
      return response.message.content;
    }
  } catch (error) {
    console.error(error)
    return 'Failed to use an LLM to answer your message'
  }
}

This section of the chat function initializes the conversation. It sets the model to Mistral NeMo and creates an initial prompt that helps the LLM decide whether to use the tool defined earlier to search for recipes or to answer the query directly.

If the LLM decides not to use the tool, the code returns its direct answer to the query. If an error occurs while querying the LLM, the code returns a message stating that the chatbot failed to use an LLM to answer the query.

Add the recipe retrieval processing:

export async function chat(query) {
  try {
    …
    let recipes;
    if (response.message.tool_calls) {
      const availableFunctions = {
        retrieve_recipes: retrieveRecipes,
      };
      for (const tool of response.message.tool_calls) {
        const functionToCall = availableFunctions[tool.function.name];
        const functionResponse = await functionToCall(tool.function.arguments);
        recipes = functionResponse.documents[0].join('\n\n');
        console.log('function response', functionResponse.documents);
      }
    }
  } catch (error) {
    …
  }
}

This section handles the actual recipe retrieval. When the LLM decides to search for recipes, the code calls the retrieveRecipes function (located in dbHelper.js), passing the RAGQuery and numberOfRecipes values extracted by the model as arguments. It then joins the retrieved recipes into a single string.

Finally, add the response generation:

export async function chat(query) {
  try {
    …
    const RAGPrompt = `You are an assistant for question-answering tasks. Using only the provided context, answer the query. If you don't know the answer, simply say that you don't know. 
    Query: ${query}
    Context: ${recipes}`;
    addToChatHistory({ 'role': 'user', 'content': RAGPrompt });
    const finalResponse = await ollama.chat({
      model: model,
      messages: messages,
    });
    addToChatHistory({ 'role': 'assistant', 'content': finalResponse.message.content });
    return finalResponse.message.content;
  } catch (error) {
    …
  }
}

This final section generates the response sent to the user. It builds a RAG prompt that combines the user's query with the retrieved recipes as context, asks the LLM to answer using only that context, and stores both the prompt and the model's answer in the chat history.

To ensure that the chat functionality is working correctly, you can add a test function.

Add the following code at the bottom of the chat.js file:

async function test() {
  let query = 'Suggest 2 recipes with tomatoes';
  let response = await chat(query);
  console.log(`query: ${query}`, `response : ${response}`);
  query = 'Tell me more about the first recipe';
  response = await chat(query);
  console.log(`query: ${query}`, `response : ${response}`);
}
test();

This test function performs two queries: one to suggest recipes with tomatoes and another to provide more details about the first recipe retrieved in the first query. The results are logged to the console, allowing you to verify that the chat functionality is working correctly.

Run the following command to test the chat functionality:

node chat.js

The output for the first query should look similar to the following:

retrieveRecipes, the query is tomatoes
function response [
  [
	'\n' +
  	' \n' +
  	'Title: Fried Green Tomatoes \n' +
  	' \n' +
  	'Ingredients:\n' +
  	"['2 large green tomatoes (1 pound), cleaned and sliced into ½-inch slices', …'] \n" +
  	' \n' +
  	'Instructions:\n' +
  	'Sprinkle tomato slices with 1 teaspoon salt and 1 teaspoon pepper, pour buttermilk into a shallow bowl. Place the tomatoes in and soak for 15 minutes.\n' +
…
]
query: Suggest 2 recipes with tomatoes response : **Recipe Index**
1. **Burst Cherry Tomato Pasta**: This recipe uses cherry tomatoes as the main ingredient, creating a chunky, flavorful sauce that coats the casarecce pasta perfectly. It's a quick and easy dish to prepare.
2. **Fried Green Tomatoes**: While not using ripe tomatoes, this recipe does use green tomatoes as its primary ingredient. The tomatoes are coated in a seasoned flour and cornmeal mixture before being fried until golden brown. This is a unique way to enjoy tomatoes if you're looking for something different.

The LLM analyzes the first query and decides to use the tool to retrieve recipes that have tomatoes as an ingredient. Then the code adds the recipes found to the RAGPrompt and sends the prompt to the LLM to generate a final response.

Here is the output for the second query on the test() function:

The model didn't use the function. Its response was:
query: Tell me more about the first recipe response : The first recipe recommended is **Burst Cherry Tomato Pasta**. This is a delightful and quick meal that makes the most of cherry tomatoes, creating a chunky sauce with them. Here's a bit more detail about it:
- **Preparation Time**: About 30 minutes
- **Servings**: 4-6 servings
- **Ingredients**:
  - ½ cup extra-virgin olive oil, plus more for drizzling
  - 6 garlic cloves, smashed
  - 2½ lb. cherry tomatoes (about 4 pints)
  - 2 large sprigs basil, plus 1 cup basil leaves, torn if large
  - ¾ tsp. crushed red pepper flakes
  - 1½ tsp. kosher salt, plus more
  - A pinch of sugar (optional)
  - 12 oz. casarecce or other medium-size pasta
  - 1 oz. Parmesan, finely grated (about ½ cup), plus more for serving
- **Instructions**:
  1. Heat ½ cup oil in a large heavy pot over low heat. Add garlic and cook until softened but not browned.
  2. Increase the heat to medium and add tomatoes, basil sprigs, red pepper flakes, and 1½ tsp. salt. Cook until some of the tomatoes begin to burst and release their juices.
  3. Smash some of the tomatoes with a spoon to help release their liquid, then continue to cook until a chunky sauce forms.
  4. Meanwhile, cook pasta in boiling salted water until al dente.
  5. Drain pasta, add it to the pot with the sauce, and stir until coated. Remove from heat and stir in Parmesan.
  6. Divide pasta among bowls, drizzle with oil, top with more Parmesan and basil leaves.
This recipe is perfect for a light lunch or dinner, and it's easy to customize by adding other vegetables or proteins if desired. Enjoy your cooking!

The LLM analyzes the second query, decides not to use the tool, and uses the context provided in the message history to answer the query directly.

After you are done testing the chat functionality, comment out the test() function call.

// test();

Setting Up The Server

In this section, you will create a new file called server.js to handle web server setup and Twilio integration. This file will manage incoming messages, process them through the chatbot, and send responses back to the user.

Create a new file named server.js in your project root directory. Start with the necessary imports:

import express from 'express';
import bodyParser from 'body-parser';
import twilio from 'twilio';
import process from 'process';
import 'dotenv/config';
import { describeImage, storeImage } from './handleImages.js';
import { chat, addToChatHistory } from './chat.js';

This brings in all required dependencies, including Express for your web server and Twilio for WhatsApp messaging.

Set up the Express server and Twilio client:

const app = express();
const port = 3000;
app.use(express.json());
app.use(bodyParser.urlencoded({ extended: false }));
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const twilioClient = twilio(accountSid, authToken);

This code initializes your Express application and configures middleware for parsing incoming requests. It also sets up the Twilio client using the credentials stored in the environment variables.

Add the message splitting utility:

function splitMessage(text) {
  const maxLength = 1600;
  const parts = [];
  let start = 0;
  while (start < text.length) {
    let end = start + maxLength;
    if (end > text.length) {
      end = text.length;
    } else {
      const lastSpace = text.lastIndexOf('\n', end);
      if (lastSpace > start) {
        end = lastSpace;
      }
    }
    parts.push(text.slice(start, end).trim());
    start = end;
  }
  return parts;
}

This function handles Twilio's message length limitations by splitting long messages into smaller parts. It preserves readability by splitting at the last newline that appears before the 1,600-character limit.
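
As a quick illustration of the splitter's behavior, here is a synthetic example you could paste temporarily below the function; the repeated string is only there to produce a reply longer than 1,600 characters:

// A ~2,400-character reply made of newline-separated lines (synthetic data).
const longReply = 'Recipe line...\n'.repeat(160);

const parts = splitMessage(longReply);
console.log(parts.length);                         // 2 — split at a newline before the 1,600-character limit
console.log(parts.every((p) => p.length <= 1600)); // true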

Implement the message sending function:

async function sendMessage(message, from, to) {
  if (message.length > 1600) {
    const parts = splitMessage(message)
    for (let i = 0; i < parts.length; i++) {
      console.log('part', parts[i].length)
      await twilioClient.messages
        .create({ body: parts[i], from: from, to: to })
        .then((msg) => console.log(msg.sid));
    }
  } else {
    twilioClient.messages
      .create({ body: message, from: from, to: to })
      .then((msg) => console.log(msg.sid));
  }
}

This function sends messages through Twilio's WhatsApp API, handling single and multi-part responses when messages exceed Twilio's length limit.

Add the image processing route:

app.post('/incomingMessage', async (req, res) => {
  const { To, Body, From } = req.body;
  const mediaURL = req.body['MediaUrl0'];
  if (mediaURL !== undefined) {
    const imagePath = await storeImage(mediaURL);
    const imageDescription = await describeImage(imagePath);
    // Store both the prompt and the generated description in the chat history.
    addToChatHistory({ 'role': 'user', 'content': 'Describe the image with details, listing every visible item' });
    addToChatHistory({ 'role': 'assistant', 'content': imageDescription });
    const message = imageDescription;
    await sendMessage(message, To, From);
    res.set('Content-Type', 'text/xml');
    res.send('').status(200);
  } else {
  }
});

This code defines the /incomingMessage endpoint which handles incoming text and media messages.

When an image is attached to a message, the application saves the image and generates a detailed description. Both the prompt used to describe the image and the generated description are stored in the chat history. The application then sends the image description back to the sender.

Finally, add the text message handling:

app.post('/incomingMessage', async (req, res) => {
  …
  if (mediaURL !== undefined) {
     …
  } else {
    const message = await chat(Body);
    await sendMessage(message, To, From);
    res.set('Content-Type', 'text/xml');
    res.send('').status(200);
  }
});
app.listen(port, () => {
  console.log(`Express server running on port ${port}`);
});

The code added completes the endpoint by handling text messages. When the application receives a text message, it processes it through the chat() function (located in the chat.js file) and sends the generated response back to the sender.

Finally, the code starts the server and prints a message saying that the server is running on port 3000.

Go back to your terminal and run the following command to start serving the chatbot application:

node server.js

Open another tab in the terminal and run the following command to expose the application:

ngrok http 3000

Copy the https Forwarding URL provided by ngrok. Return to your Twilio Console homepage, select the Develop tab, navigate to Messaging, select Try it out, and choose Send a WhatsApp message to access the WhatsApp Sandbox section.

Navigating to the Twilio WhatsApp Sandbox page

On reaching the Sandbox page, scroll to find the connection instructions and follow them to connect to the Twilio sandbox. You'll need to send a specified message to the provided Twilio Sandbox WhatsApp Number.

Connecting to the Twilio WhatsApp Sandbox

Once connected, return to the top of the page and click the Sandbox settings button to view the WhatsApp Sandbox configuration. In the settings, input your ngrok https URL into the "When a message comes in" field, adding /incomingMessage at the end. Choose POST as your method, save your changes, and your WhatsApp bot will be ready to process incoming messages. Verify that your URL matches the format shown below.
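
The exact domain depends on your ngrok account, but the webhook value should look similar to the placeholder below (your forwarding URL will differ):

https://<your-ngrok-subdomain>.ngrok-free.app/incomingMessage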

Configuring the Twilio WhatsApp Sandbox settings

Open a WhatsApp client and send a message containing an image of cooking ingredients; the chatbot will respond with a detailed image description listing the ingredients it was able to identify.

Next, send a message asking for a recipe containing a specific ingredient found in the image, and the chatbot will respond with recipe recommendations.

RAG powered recipe recommendation Twilio WhatsApp chatbot demo

Conclusion

In this tutorial, you learned how to create a WhatsApp recipe recommendation chatbot that combines computer vision with retrieval-augmented generation (RAG). You used the Twilio Programmable Messaging API to handle WhatsApp messages, Ollama to run AI models locally, and a RAG system backed by Chroma for efficient recipe storage and retrieval.

You also combined multiple AI models for different tasks: the Nomic embed text model for generating recipe embeddings, the Mistral NeMo model for natural conversation, and the LLaVA model for image analysis. Along the way, you built a chatbot that not only understands recipe requests but can also analyze images of ingredients and maintain contextual conversations about cooking.

The code for the entire application is available in the following repository: https://github.com/CSFM93/twilio-recipes-chatbot .