How to Build an Email Monitoring Application with SMS and Semantic Analysis

January 09, 2025
Written by
Feranmi Odugbemi
Contributor
Opinions expressed by Twilio contributors are their own
Reviewed by
Diane Phan
Twilion

 

Managing your inbox can be overwhelming. What if you could have an intelligent assistant that reads your emails, determines their importance based on the ones that really matter to you, and texts you about them?

In this tutorial, you'll learn how to build a Python application that does just that, using Twilio SMS and semantic analysis with sentence transformers. Haven’t heard of semantic analysis? Semantic analysis is the process of understanding the meaning and context of words and sentences in natural language.

This smart email monitoring application will:

  • Retrieve unread emails from your inbox
  • Use semantic similarity to analyze and classify each email's importance according to your use case
  • Send SMS notifications about important emails that match specific keywords
  • Provide detailed console output for monitoring

By the end of this tutorial, you'll be able to stay on top of your most crucial communications without being tied to your inbox.

Prerequisites

Before proceeding, make sure you have the following:

  • A free Twilio account. If you don't have one, sign up here.
  • Python 3.7 or later installed on your computer.
  • Basic knowledge of Python programming.
  • An email account that supports IMAP access. Most major email providers (Gmail, Outlook, Yahoo) support this.

Set Up Your Development Environment

You can start by setting up a clean development environment. Open your terminal and run the following commands:

mkdir email-monitor
cd email-monitor
python3 -m venv venv
source venv/bin/activate  # On Windows, use venv\Scripts\activate

This creates a new directory for your project and sets up a virtual environment to manage your dependencies.

Install Required Packages

Now, install the necessary Python packages:

pip install twilio python-dotenv colorama chardet sentence-transformers scikit-learn

Here's what each package does:

  • twilio: Twilio's Python SDK, used here for sending SMS messages
  • sentence-transformers: For semantic text similarity analysis
  • python-dotenv: For loading environment variables from a .env file
  • colorama: For adding color to console output
  • chardet: For detecting character encoding in emails
  • scikit-learn: For computing cosine similarity

Configure Your Environment Variables

To keep your sensitive information secure, you'll use environment variables. Create a file named .env in your project directory with the following content:

EMAIL=your_email@example.com
PASSWORD=your_email_password
IMAP_SERVER=your_imap_server
account_sid=your_twilio_account_sid
auth_token=your_twilio_auth_token
twilio_phone_number=your_twilio_phone_number
to_phone_number=your_personal_phone_number

Replace the placeholder values with your actual credentials. Here's how to get each:

  • EMAIL: Your email address
  • PASSWORD: Your email app password (see section below on how to get this for Gmail)
  • IMAP_SERVER: The IMAP server for your email provider (e.g., imap.gmail.com for Gmail)
  • account_sid and auth_token: Find these in your Twilio Console
  • twilio_phone_number: Your Twilio phone number with SMS enabled, in E.164 format
  • to_phone_number: The phone number that will receive the alerts (on a trial account, the verified number associated with your Twilio account), also in E.164 format, e.g., +2348160435459
Never commit your .env file to version control. Add it to your .gitignore file to prevent accidental exposure of your credentials.
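
Before running the application, it can help to confirm that every variable is actually set. The following is a minimal sketch and not part of the original script: the helper name find_missing_settings and the REQUIRED_SETTINGS tuple are hypothetical, and it assumes the variable names from the .env file above.

```python
import os

# Names taken from the .env file above; the tuple itself is an illustrative assumption.
REQUIRED_SETTINGS = (
    "EMAIL", "PASSWORD", "IMAP_SERVER",
    "account_sid", "auth_token",
    "twilio_phone_number", "to_phone_number",
)


def find_missing_settings(env=None, required=REQUIRED_SETTINGS):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

One way to use it would be to call find_missing_settings() right after load_dotenv() and stop with a clear message if the returned list is not empty.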

Get an App Password for Gmail

If you're using Gmail, you'll need to generate an app password instead of using your regular account password. This is necessary for securely logging into your Gmail account via IMAP and authenticating the connection. Gmail requires an app password when two-factor authentication (2FA) is enabled. Here's how:

  1. Go to your Google Account.
  2. Select "Security" from the left navigation panel.
  3. Under "How you sign in to Google," select "2-Step Verification" and verify your identity. The process won't work unless 2-Step Verification is activated.
Google settings showing 2-Step Verification enabled since October 14, 2023.
  4. Go to App passwords. You may need to sign in.
  5. Enter a name for the app (e.g., "Email-analyzer") and click "Create".
Screen showing creation of an app-specific password with Email-analyzer as the name and a Create button.
  6. Google will display a 16-character app password. Copy this password.
  7. Use this app password in your .env file for the PASSWORD field.

Build the Email Analyzer

The main components of the email monitoring application will be implemented in a file named email_monitor.py. This is where you will add the core functionality to retrieve emails, analyze their importance according to your set keywords, and send an SMS.

To start, create a new file named email_monitor.py in your project directory. Then copy the code snippet below and paste it into the file.

import imaplib
from twilio.rest import Client
import email
from email.header import decode_header
import os
import time
from colorama import init, Fore, Style
import chardet
from dotenv import load_dotenv
from sentence_transformers import SentenceTransformer
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
# Initialize colorama
init()
# Load environment variables from .env file
load_dotenv()
# Email configuration from environment variables
EMAIL = os.getenv("EMAIL")
PASSWORD = os.getenv("PASSWORD")
IMAP_SERVER = os.getenv("IMAP_SERVER")
# Twilio credentials from environment variables
account_sid = os.getenv("account_sid")
auth_token = os.getenv("auth_token")
twilio_phone_number = os.getenv("twilio_phone_number")
to_phone_number = os.getenv("to_phone_number")
# Initialize the sentence transformer model
model = SentenceTransformer('all-MiniLM-L6-v2')

The above code does the following:

  • Imports our libraries for email fetching, SMS sending, text analysis, and environment variable management.
  • Loads sensitive information like email credentials and Twilio API keys from the .env file.
  • Loads a pre-trained NLP model ('all-MiniLM-L6-v2') for generating sentence embeddings to compare and analyze text semantically.

Decode the emails

Implement the functions below for the email decoding functionality. Copy and paste the following code below:

def safe_decode(payload):
    if payload is None:
        return ""
    # List of encodings to try
    encodings = ['utf-8', 'iso-8859-1', 'windows-1252', 'ascii', 'latin-1', 'cp1252']
    # Try chardet first
    detected = chardet.detect(payload)
    if detected['encoding']:
        encodings.insert(0, detected['encoding'])
    # Try each encoding
    for encoding in encodings:
        try:
            return payload.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # LookupError: chardet may report an encoding name Python does not know
            continue
    # If all else fails, return a string representation of the bytes
    return str(payload)

The safe_decode function is crucial for handling email content because:

  • Emails can come from various sources with different character encodings
  • Some emails might use non-standard or legacy encodings
  • The function tries multiple common encodings to ensure readability
  • It uses chardet to automatically detect the most likely encoding
  • If all decoding attempts fail, it returns a string representation as a fallback
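
To see the fallback idea in isolation, here is a simplified sketch that leaves out chardet and uses only the standard library. The function name decode_with_fallback and its short encoding list are assumptions made for illustration and are not the tutorial's safe_decode.

```python
def decode_with_fallback(payload, encodings=("utf-8", "latin-1")):
    """Try each encoding in order and return the first successful decode."""
    for encoding in encodings:
        try:
            return payload.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue
    # Last resort: a printable representation of the raw bytes
    return str(payload)


utf8_bytes = "café".encode("utf-8")
latin1_bytes = "café".encode("latin-1")

print(decode_with_fallback(utf8_bytes))    # decoded as UTF-8
print(decode_with_fallback(latin1_bytes))  # UTF-8 fails, latin-1 succeeds
```

Order matters in a list like this: latin-1 maps every possible byte to a character, so it never raises an error, and any encoding listed after it is never reached.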

Retrieve Unread Emails

You will now implement the email retrieval function. Copy and paste the following code below:

def get_emails():
    # Connect to the IMAP server
    mail = imaplib.IMAP4_SSL(IMAP_SERVER)
    mail.login(EMAIL, PASSWORD)
    mail.select("inbox")
    # Search for unread emails sent on or after the given date
    _, search_data = mail.search(None, '(UNSEEN SENTSINCE 18-Nov-2024)')
    email_ids = search_data[0].split()
    emails = []
    for email_id in email_ids:
        try:
            _, msg_data = mail.fetch(email_id, "(RFC822)")
            raw_email = msg_data[0][1]
            email_message = email.message_from_bytes(raw_email)
            subject = decode_header(email_message["Subject"])[0][0]
            if isinstance(subject, bytes):
                subject = safe_decode(subject)
            body = ""
            if email_message.is_multipart():
                for part in email_message.walk():
                    if part.get_content_type() == "text/plain":
                        payload = part.get_payload(decode=True)
                        body = safe_decode(payload)
                        break
            else:
                payload = email_message.get_payload(decode=True)
                body = safe_decode(payload)
            emails.append({
                "id": email_id.decode(), 
                "subject": subject, 
                "body": body,
                "from": email_message["From"]
            })
        except Exception as e:
            print(f"Error processing email {email_id}: {str(e)}")
            continue
    mail.close()
    mail.logout()
    return emails

This get_emails function performs several important tasks:

  • Connects securely to your email server using SSL
  • Uses the SENTSINCE 18-Nov-2024 filter so that only unread emails sent on or after November 18, 2024 are processed (adjust this date to suit your use case)
  • Handles both single-part and multi-part email messages
  • Extracts the sender, subject, and body text
  • Uses the safe_decode function to handle various character encodings
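
Rather than hard-coding the date, the search string could be built from a relative window. The sketch below is one possible approach and not part of the original script: build_search_criteria and its days_back parameter are hypothetical names.

```python
from datetime import date, timedelta

# IMAP dates use English month abbreviations regardless of locale,
# so the names are spelled out instead of relying on strftime("%b").
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_search_criteria(days_back=7, today=None):
    """Build an IMAP search string for unread mail sent in the last `days_back` days."""
    today = today or date.today()
    since = today - timedelta(days=days_back)
    imap_date = f"{since.day:02d}-{MONTHS[since.month - 1]}-{since.year}"
    return f"(UNSEEN SENTSINCE {imap_date})"


print(build_search_criteria(days_back=7, today=date(2024, 11, 25)))
# (UNSEEN SENTSINCE 18-Nov-2024)
```

The returned string could then be passed as the second argument to mail.search() in place of the fixed filter.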

Analyze emails with semantic similarity analysis

Next, you will implement the core functionality, where the magic happens: analyzing emails using semantic analysis. Copy and paste the code below:

def check_email_relevance(subject, body, keywords, similarity_threshold=0.7):
    """
    Check if email content matches any of the keywords using semantic similarity
    """
    email_text = f"{subject}. {body}"
    sentences = [s.strip() for s in email_text.split('.') if s.strip()]
    # Nothing to compare if the email has no text at all
    if not sentences:
        return False, [], []
    # Encode keywords and sentences
    keyword_embeddings = model.encode(keywords, convert_to_tensor=True)
    sentence_embeddings = model.encode(sentences, convert_to_tensor=True)
    # Calculate similarities
    similarities = cosine_similarity(
        sentence_embeddings.cpu().numpy(), 
        keyword_embeddings.cpu().numpy()
    )
    # Get maximum similarity for each keyword
    max_similarities = np.max(similarities, axis=0)
    matched_keyword_indices = np.where(max_similarities > similarity_threshold)[0]
    if len(matched_keyword_indices) > 0:
        matched_keywords = [keywords[idx] for idx in matched_keyword_indices]
        relevant_sentences = []
        for keyword_idx in matched_keyword_indices:
            best_sentence_idx = np.argmax(similarities[:, keyword_idx])
            relevant_sentences.append(sentences[best_sentence_idx])
        return True, matched_keywords, relevant_sentences
    return False, [], []

Semantic similarity analysis uses the sentence transformer model to convert text into numerical vectors.

The default similarity threshold of 0.7 keeps matches targeted: a keyword counts as matched only when at least one sentence in the email scores above 0.7 against it. Because the comparison is based on meaning rather than exact wording, it can catch semantic variations (e.g., "Position available" might match "Job opportunity"). The keywords themselves are defined later, in the main() function.

The function returns both matched keywords and the relevant sentences for context.
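
The matching logic can be illustrated without the model. The sketch below uses plain Python in place of scikit-learn's cosine_similarity, and made-up 3-dimensional vectors in place of real embeddings. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def matched_keyword_indices(sentence_vectors, keyword_vectors, threshold=0.7):
    """Return indices of keywords whose best sentence score exceeds the threshold."""
    matches = []
    for k_idx, keyword in enumerate(keyword_vectors):
        best = max(cosine(sentence, keyword) for sentence in sentence_vectors)
        if best > threshold:
            matches.append(k_idx)
    return matches


# Toy 3-dimensional vectors, made up for illustration.
sentence_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
keyword_vectors = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(matched_keyword_indices(sentence_vectors, keyword_vectors))  # [0]
```

The first keyword scores about 0.707 against both sentences, just above the threshold, while the second keyword is orthogonal to both and scores 0. Raising the threshold slightly would drop the first match too, which shows how sensitive the result is to that one parameter.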

Implement the Twilio SMS notifications

Let's add the SMS notification functionality. Copy and paste the code below:

def print_green(text):
    print(f"{Fore.GREEN}{text}{Style.RESET_ALL}")
def send_sms(message):
    twilio_client = Client(account_sid, auth_token)
    message = twilio_client.messages.create(
        body=message,
        from_=twilio_phone_number,
        to=to_phone_number
    )
    print(f"SMS sent! Message SID: {message.sid}")

The code initializes a Twilio client with your account credentials and sends an SMS using Twilio's messages.create() method.
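
Email subjects can be long, and longer SMS bodies are typically split into multiple segments. As an optional sketch, the helper below trims the notification text before it is passed to send_sms. The name build_sms_body and the default of 160 characters (roughly one GSM-7 segment) are assumptions for illustration, not part of the original script.

```python
def build_sms_body(subject, matched_keywords, max_length=160):
    """Compose a short notification and trim it to at most `max_length` characters."""
    body = f"Important email: {subject}\nMatched keywords: {', '.join(matched_keywords)}"
    if len(body) <= max_length:
        return body
    # Leave room for a three-character ellipsis marker.
    return body[: max_length - 3] + "..."
```

A call such as send_sms(build_sms_body(subject, matched_keywords)) would then keep each alert to a predictable size.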

Let's implement the main function that ties everything together. Copy and paste the code below:

def main():
    # Define your keywords to match
    keywords = [
        "job opportunity",
        "career opening",
        "interview request",
    ]
    print(f"Monitoring emails for keywords: {', '.join(keywords)}")
    print("\nChecking for new emails...")
    emails = get_emails()
    matched_count = 0
    for email in emails:
        is_relevant, matched_keywords, relevant_sentences = check_email_relevance(
            email["subject"], 
            email["body"], 
            keywords
        )
        if is_relevant:
            matched_count += 1
            summary = (f"Important email from {email['from']}\n"
                       f"Subject: {email['subject']}\n"
                       f"Matched keywords: {', '.join(matched_keywords)}\n"
                       f"Relevant content: {' '.join(relevant_sentences)}")
            print_green("Email found...sending SMS")
            print(summary)
            # Send SMS
            sms_message = f"Important email: {email['subject']}\nMatched keywords: {', '.join(matched_keywords)}"
            send_sms(sms_message)
    print(f"\nEmail check complete:")
    print(f"Total new emails: {len(emails)}")
    print(f"Matched emails: {matched_count}")
if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\nScript terminated by user.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

The main() function:

  • Defines a list of keywords that describe the emails you want to be notified about; in this case, they target job opportunities
  • Uses keywords chosen to cover a job search (adjust them as needed)
  • Provides color-coded console output for better visibility
  • Sends concise SMS notifications for important emails
  • Is wrapped in error handling for user termination (Ctrl+C) and unexpected exceptions

Run and test the application

You can get the code from this GitHub repository.

The script uses the date "November 18, 2024" as a filter for emails. Make sure to adjust this date in the get_emails() function to match your testing needs.

When run, the script checks for unread emails sent on or after your set date. To test it:

Send yourself these sample emails:

Subject: Senior Developer Position - Interview Request  
Body: Following your application for the Senior Developer role, we were impressed with your background in cloud architecture. We would like to schedule a technical interview to discuss your experience with AWS and microservices.  
Subject: System Maintenance Notice  
Body: Our cloud infrastructure will undergo scheduled maintenance this weekend. Expected downtime is 2 hours. Please save all work and close active connections before Saturday 2 AM EST.  
Subject: Career Advancement Opportunity  
Body: Based on your LinkedIn profile, I think you'd be a perfect fit for a Lead Software Architect position at our firm. The role offers competitive compensation ($150-180k) and the chance to work with cutting-edge AI technologies. Would you be interested in learning more?  
Subject: Urgent: Production Pipeline Issue  
Body: The CI/CD pipeline is failing on the main branch. Build errors indicate dependency conflicts in the Node.js packages. Need immediate assistance to resolve this blocking issue.  
Subject: Team Sync-up Meeting  
Body: Let's meet tomorrow at 10 AM to discuss the progress on the user authentication module. Please prepare updates on your assigned tasks and any blocking issues.  
Subject: AWS Service Disruption  
Body: Multiple EC2 instances in the us-east-1 region are showing high latency. The monitoring dashboard indicates potential memory leaks. Engineering team is investigating.
Screenshot of an email inbox with six unread notifications from various subjects.

To run your email monitor, make sure all your credentials are correctly set in the .env file, then execute:

python email_monitor.py

Console output with progress bars indicating the usage of different spaCy models like en_core_web_sm and en_core_web_md.

On the first run, your application will take some time to download the 'all-MiniLM-L6-v2' model.

Log displaying monitoring of emails for job-related keywords and sending SMS alerts for important messages.

Your application has successfully gone through the emails and sent an SMS as seen below:

Two text messages from Twilio trial account about interview request and job opportunity.

What's next for building email monitoring systems and Twilio SMS?

Congratulations! You've built an email monitoring system that demonstrates how modern NLP techniques and cloud communications can combine into a cost-effective productivity tool. By leveraging sentence transformers and semantic similarity, the system identifies important emails without relying on expensive API calls or complex setups.

This email monitoring application can be extended in various ways:

  • Keyword Categories: Group keywords by importance level.
  • Web Interface: Add a web interface for monitoring and configuration.
  • Scheduled Runs: You can schedule the script to run at specific times using cron jobs.
  • Email Actions: Implement functionality to reply to or forward important emails automatically.
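
As an alternative to cron for the scheduled-runs idea, the check could be repeated from within Python. The following is a minimal sketch under stated assumptions: run_periodically is a hypothetical helper, and the five-minute default interval is arbitrary.

```python
import time


def run_periodically(check, interval_seconds=300, max_runs=None, sleep=time.sleep):
    """Call `check` repeatedly, pausing `interval_seconds` between runs.

    `max_runs=None` loops until interrupted; `sleep` is injectable for testing.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        check()
        runs += 1
        # Skip the pause after the final run
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

Calling run_periodically(main) at the bottom of email_monitor.py would check the inbox every five minutes until you stop the script with Ctrl+C.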

As you continue to develop this application, consider exploring more of Twilio's communication APIs. The possibilities for enhancing and customizing this tool are endless!

For more tutorials and ideas, check out the Twilio Blog, where you can explore articles like Building a Multilingual Email App with SendGrid and Amazon Translate, Send Personalized Emails Using Gemini, SendGrid, and Node.js, and Send SMS with Node.js and AWS Lambda.

This tutorial was written by Feranmi Odugbemi, a software developer specializing in AI and cloud communications. With extensive experience in Python and a passion for creating innovative solutions, Feranmi Odugbemi enjoys exploring the intersection of artificial intelligence and practical applications. For more information or to get in touch, visit Feranmi Odugbemi or email feranmiodugbemi@gmail.com .