Create an AI Summarizer Bot with Ollama, LangChain, and Twilio

March 19, 2025

Drowning in information but starving for knowledge? Imagine having your own personal AI summarizer that distills lengthy texts into clear, concise summaries, running directly on your hardware, respecting your privacy, and accessible anywhere through a simple text message.

In this tutorial, I'll walk you through creating a powerful AI summarization tool that leverages Ollama's local language model capabilities, LangChain's flexible orchestration, and Twilio's seamless communication infrastructure to bring this vision to life over SMS.

Prerequisites

To follow along with today's project, you'll need:

  • Python 3 installed on your machine
  • A machine that can run Ollama (Linux, macOS, or Windows)
  • A free Twilio account with an SMS-capable phone number

The power of local LLMs

Yes, you can run Meta's Llama, Google's Gemma, Microsoft's Phi, and DeepSeek R1 for free on your own hardware.

We do that with an application called Ollama.

Ollama

Ollama allows you to run open-source large language models locally. Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile. It optimizes setup and configuration details, including GPU usage.
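
As a concrete example, a minimal Modelfile might look like the following (an illustrative sketch; see Ollama's Modelfile documentation for the full syntax):

# Base the package on the gemma3 weights
FROM gemma3

# Example generation setting (value chosen for illustration)
PARAMETER temperature 0.7

# A system prompt baked into the packaged model
SYSTEM "You are a concise assistant that answers in plain language."

You can then build and run your custom package with ollama create my-assistant -f Modelfile followed by ollama run my-assistant.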

You can install Ollama with a single command on Linux:

curl -fsSL https://ollama.com/install.sh | sh

Windows and macOS versions are also available on the downloads page.

Once you have it installed and running, you can pull a model by running the following command.

ollama pull <name-of-model>

For a complete list of supported models and model variants, see the Ollama model library.

Screenshot of model library webpage on Ollama showing a list of AI models with descriptions and popularity metrics.

Running Gemma

To pull a Gemma model by Google, use the following command.

ollama pull gemma3

If you would like to test the model, you can interact with it in two different ways:

In the terminal:

  • Run ollama run gemma3 to start interacting via the command line directly

Via the API:

  • All of your local models are automatically served on localhost:11434
  • Send an application/json request to the API endpoint of Ollama to interact. By default, the answer streams back to the user.
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?"
}'
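
If you'd rather receive the complete answer as a single JSON object instead of a stream, set the stream flag to false:

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'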

Integrate with LangChain

To use our locally running Ollama model, we'll use LangChain's Python library. Let's get started by creating a directory for our project.

mkdir ollama-langchain-sms
cd ollama-langchain-sms

Set up our environment

First, let's create a virtual environment and install the necessary packages:

python -m venv venv 
source venv/bin/activate # On Windows: venv\Scripts\activate 
pip install langchain langchain-community twilio python-dotenv beautifulsoup4

Create our summarizer bot

Now, let's write the code for our summarizer bot. Create a file called `summarize.py`:

from langchain_community.llms import Ollama
from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import WebBaseLoader
from langchain.prompts import PromptTemplate
from twilio.rest import Client
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Twilio configuration
account_sid = os.getenv("TWILIO_ACCOUNT_SID")
auth_token = os.getenv("TWILIO_AUTH_TOKEN")
twilio_phone = os.getenv("TWILIO_PHONE_NUMBER")

# Initialize Twilio client
client = Client(account_sid, auth_token)

# Initialize our local Gemma model
llm = Ollama(model="gemma3")


def load_text(url):
    """Load the article/blog post from a URL"""
    loader = WebBaseLoader(url)
    # Skip SSL certificate verification; convenient for testing,
    # but avoid this in production
    loader.requests_kwargs = {'verify': False}
    docs = loader.load()
    return docs

# Create a prompt template for summarization
summary_template = """
You are an expert summarizer. Your task is to create a concise summary of the 
following text. The summary should be no more than 5-6 sentences long.

TEXT: {text}

SUMMARY:
"""

# Create the prompt
prompt = PromptTemplate(
    input_variables=["text"],
    template=summary_template,
)

# Create the summarization chain; the "stuff" chain type passes the
# whole document to the model in a single prompt
summarize_chain = load_summarize_chain(llm=llm, prompt=prompt, chain_type="stuff")


def summarize_text(text):
    """Summarize the given text using our local LLM"""
    # The chain returns a dict; the summary text is under "output_text"
    summary = summarize_chain.invoke(text)
    return summary

def send_summary(summary, to_number):
    """Send the summary via Twilio SMS"""
    message = client.messages.create(
        body=summary,
        from_=twilio_phone,
        to=to_number
    )
    return message.sid

Let's break down what our summarize.py code is doing:

  • Setting up connections: First, we load our environment variables and initialize our Twilio client for sending SMS messages. This method can also be used to send messages over WhatsApp or RCS. We also initialize our local Ollama model, specifically the Gemma model we pulled earlier.

  • Web content loading: We've added a `load_text()` function that uses LangChain's `WebBaseLoader` to fetch and process content directly from URLs. This means our summarizer can now work with articles and blog posts from the web.

  • Creating the prompt template: We define a prompt template that instructs the AI to act as an expert summarizer. This template includes placeholders for the text to be summarized and guidelines for keeping the summary concise (5-6 sentences).

  • Building the LangChain chain: We create a LangChain load_summarize_chain with the "stuff" chain type, which is optimized for document summarization. This creates a reusable pipeline for text summarization.

  • Creating utility functions:

    • summarize_text(): This function takes the loaded documents, passes them through our summarization chain, and returns the chain's output, a dict whose "output_text" key holds the summary.

    • send_summary(): This function sends the generated summary to a specified phone number using Twilio's SMS capabilities.
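
Before building a user interface, you can sanity-check the whole pipeline with a short script like this (a minimal sketch; the URL is just a placeholder, and it assumes Ollama is running and your .env file is in place):

from summarize import load_text, summarize_text

# Fetch a page, summarize it, and print the result
docs = load_text("https://example.com/some-article")
result = summarize_text(docs)
print(result["output_text"])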

 

The power of this approach lies in its simplicity and modularity. LangChain allows us to swap out different models, modify our prompts, or add additional processing steps without rewriting the entire application. Meanwhile, running the model locally with Ollama means your data never leaves your machine during the summarization process.
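
For example, pointing the pipeline at a different local model, or changing the summary style, is a small edit to summarize.py (assuming you've already pulled the other model with ollama pull):

# Hypothetical swap: summarize with Llama 3.2 instead of Gemma
llm = Ollama(model="llama3.2")

# Or adjust the prompt, e.g. to get bullet-point summaries
summary_template = """
You are an expert summarizer. Summarize the following text as 3-5
short bullet points.

TEXT: {text}

SUMMARY:
"""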

Set up Twilio

To use Twilio for sending summaries, you'll need to sign up for a Twilio account, get an SMS-capable Twilio phone number, and copy your Account SID and Auth Token from the Twilio Console. Then create a .env file in the project directory with the following variables:

TWILIO_ACCOUNT_SID=your_account_sid
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_PHONE_NUMBER=your_twilio_number

Create a Simple Web Interface

Let's create a simple application using Streamlit to provide a web interface for our summarizer. Create a new file and name it `app.py` (don't name it streamlit.py, as that would shadow the streamlit package and break the import):

import streamlit as st
from summarize import load_text, summarize_text, send_summary

st.title("Summarizer Bot")

with st.sidebar:
    # Collect inputs in a form so the app only reruns on submit
    with st.form(key='my_form'):
        url = st.text_area(
            label="What is the URL?",
            max_chars=250
        )
        number = st.text_input(
            label="What is the phone number?",
            max_chars=250,
        )
        submitted = st.form_submit_button("Summarize")

if submitted and url:
    docs = load_text(url)
    response = summarize_text(docs)
    response = response["output_text"]
    st.write(response)
    if number:
        send_summary(response, number)

I like using Streamlit, since it provides an interactive and user-friendly interface with minimal Python code. The file above not only creates the interface for our app, but also imports the functions from summarize.py. To run the application, we first need to install the streamlit package. In your terminal, run:

pip install streamlit

After installing Streamlit, you can start the application by running:

streamlit run app.py
A screenshot showing the web interface of the Summarizer Bot providing an overview of its functionalities.

The sidebar contains a form where users can input a URL and (optionally) a phone number (hidden in the screenshot for privacy). When the form is submitted with a URL, the app fetches the content, summarizes it, and displays the summary. If a phone number is provided, it also sends the summary via SMS.

You can test it out by adding the URL of an article or blog post you need a summary for, along with a number where the summary should be sent.

Mobile screen showing a text message summarizing research on the potential of large language models.

Take it further

Here are some ways to enhance your summarizer bot:

  • Add support for different summarization styles (bullet points, executive summary, etc.)
  • Implement a webhook to receive URLs via Twilio and return summaries automatically (a starting sketch follows this list)
  • Create a scheduled service that summarizes news from your favorite sources daily
  • Fine-tune your local model to improve summarization quality for specific domains
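
For the webhook idea above, here's a minimal sketch using Flask (my choice for illustration; any web framework would work). It assumes Flask is installed (pip install flask), that summarize.py is in the same directory, and that your Twilio number's incoming-message webhook points at the /sms endpoint, for example through an ngrok tunnel:

from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse

from summarize import load_text, summarize_text

app = Flask(__name__)

@app.route("/sms", methods=["POST"])
def sms_reply():
    # Twilio sends the text of the incoming SMS in the "Body" field
    url = request.form.get("Body", "").strip()
    resp = MessagingResponse()
    try:
        docs = load_text(url)
        summary = summarize_text(docs)["output_text"]
        resp.message(summary)
    except Exception:
        resp.message("Sorry, I couldn't summarize that URL.")
    return str(resp)

if __name__ == "__main__":
    app.run(port=5000)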

Conclusion

By combining Ollama, LangChain, and Twilio we've created a powerful, privacy-focused AI summarization tool that runs entirely on your own hardware. With our enhanced version that can process web content directly, you can quickly get summaries of articles, blog posts, and other web content with just a URL.

The best part? This is just the beginning. As open-source models continue to improve, your local summarizer bot will only get better with time. Happy summarizing! If you are interested in learning more about Ollama, LangChain, or other AI-related technologies, check out my YouTube channel, Rishab in Cloud.