Build an SMS Proxy that Redacts PII from Conversation Threads Using Twilio SMS, Pangea Redact Service, and Python

December 06, 2022
Written by
Nicolas Vautier
Contributor
Opinions expressed by Twilio contributors are their own
Reviewed by
Mia Adjei
Twilion


“It is better to give than to receive” is a seemingly universal proverb, but it does not apply to your personal data. In an internet era fueled by how much information companies can acquire about their customers, giving those customers a way to limit unintentional data leaks in digital communications is a gift that both delights them and earns their trust. In this post, you will learn how to redact sensitive or personal information unintentionally sent through Twilio-powered SMS conversations using Pangea’s Redact service.

By the end of the tutorial you’ll have:

  • Set up a free Pangea account and Access Token for interacting with the Redact Service
  • Set up a Django and Python application that utilizes the Twilio SMS and Pangea SDKs to redact sensitive information
  • Run the application locally on your workstation
  • Configured Twilio webhooks to invoke your app using ngrok

If you’d like to try a live version of the redact service before building your own, send a message that contains an email address to the following number.

Note: The live demo is a slightly modified version of the tutorial code configured to automatically respond to messages.

+1 415 662-0675

Try something like “Hi, my name is Nicolas Vautier and my email address is nicolas.vautier@pangea.cloud”

Requirements:

Set up your Pangea account and Access Token

Once you’ve signed up for Pangea, log in and access the Pangea Console. Set up the Redact service by selecting Redact from the left-hand navigation menu.

Pangea Console

Review the benefits of the service and select Next to continue.

Redact Service informational dialog

Create an Access Token by selecting a token name, expiration date, and token scope — or use the default values by selecting Done.

Create token dialog

Make a note of the Config ID, service Domain, and access Token. You will use each of these values to interact with the service from your app’s code in the next step.

Note: You can quickly copy each value to your system’s clipboard using the shortcuts.

Redact service dashboard

Select Rulesets from the left-hand navigation menu, and select PII. Enable the types of data you would like redacted from messages sent through your service, for example, EMAIL_ADDRESS and PERSON. Confirm the configuration changes by clicking Save.

Configure Redact Ruleset

Get the code

Clone the SMS proxy app:

git clone https://github.com/pangeacyber/redact-twilio-proxy.git

Change the working directory to your new Django project, redact-twilio-proxy, with the following command:

cd redact-twilio-proxy

Take a moment to explore the project's files and configure them to your environments.

  • manage.py - Django's command-line utility for administrative tasks, such as running a development web server.
  • .env - Contains the environment variables the app will reference.
  • requirements.txt - Contains the application dependencies. The Twilio Python Helper Library and Pangea SDK are both listed.
  • redact/views.py - The application source file that contains a single function to handle incoming SMS messages.

Install the Python modules used by the application

Before installing dependencies it is recommended that you create and activate a virtual environment where you can install them without affecting your global environment. You can do so in the root project directory with the following commands if you are working in a Mac or Unix environment:

python3 -m venv venv
source venv/bin/activate

If you are working in a Windows environment, run the commands below instead:

python -m venv venv
venv\Scripts\activate

Then to install each dependency listed in the requirements.txt file, run the following command from the root of the project directory:

pip3 install -r requirements.txt
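
If you are not sure whether the virtual environment is active before installing, a quick check from Python can confirm it. This is a minimal sketch that uses only the standard library; the function name in_virtualenv is a hypothetical helper and is not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
```

If this prints False, activate the environment again before running the pip3 command above.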

Configure and run the app

The application source reads 7 variables from the .env file.

  • ACCOUNT_SID and AUTH_TOKEN to authenticate your app with the Twilio service.
  • PANGEA_DOMAIN, PANGEA_CONFIG_ID, and PANGEA_AUTH_TOKEN to authenticate with the Pangea service.
  • TARGET_NUMBER and OWNER_NUMBER are used to determine where to forward incoming SMS messages.

Modify the .env file by replacing each {REPLACE} tag with the corresponding value. For the PANGEA_ variables, use the three values you noted in the previous section, or retrieve them from the Redact service dashboard in the Pangea Console. The Twilio values for ACCOUNT_SID and AUTH_TOKEN can be found on the landing page of the Twilio Console. OWNER_NUMBER and TARGET_NUMBER should each be a valid E.164 phone number you’d like to test with. For example, set OWNER_NUMBER to your mobile phone number and TARGET_NUMBER to the number of a friend you’d like to start a message thread with. An example of an E.164-formatted US number is +16502223333.

Note: You can also set both OWNER_NUMBER and TARGET_NUMBER to your personal mobile number and reply to your own messages.
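
Because the app compares incoming numbers against OWNER_NUMBER and TARGET_NUMBER, a badly formatted value can silently break the routing. The sketch below shows one way to check the E.164 shape (a plus sign followed by up to 15 digits, with no leading zero) before saving the file. The helper name is_e164 is hypothetical, and the check only validates the format, not whether the number actually exists.

```python
import re

# E.164: "+" followed by 2 to 15 digits, the first of which is not zero.
E164_PATTERN = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """Return True if the string has the shape of an E.164 phone number."""
    return E164_PATTERN.fullmatch(number) is not None


for candidate in ("+16502223333", "650-222-3333", "+1 650 222 3333"):
    print(candidate, "->", is_e164(candidate))
```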

Verify your changes to the .env file with the git diff sub-command:

git diff

The output should look similar to this:

diff --git a/.env b/.env
index 7442c6a..9fd8304 100644
--- a/.env
+++ b/.env
@@ -1,7 +1,7 @@
-ACCOUNT_SID={REPLACE}
-AUTH_TOKEN={REPLACE}
-PANGEA_DOMAIN={REPLACE}
-PANGEA_CONFIG_ID={REPLACE}
-PANGEA_AUTH_TOKEN={REPLACE}
-OWNER_NUMBER={REPLACE}
-TARGET_NUMBER={REPLACE}
+ACCOUNT_SID=ACXXXXXXXXXXXXXXXXXXXXXXXXXX
+AUTH_TOKEN=XXXXXXXXXXXXXXXXXXXXXXXXX
+PANGEA_DOMAIN=aws.us.pangea.cloud
+PANGEA_CONFIG_ID=pci_XXXXXXXXXXXXXXXXXXXXXX
+PANGEA_AUTH_TOKEN=pts_XXXXXXXXXXXXXXXXXXXXXXXX
+OWNER_NUMBER=+1305XXXXXXX
+TARGET_NUMBER=+1305XXXXXXX
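
As an alternative to reading the diff by eye, the same check can be sketched in Python: parse the .env text and list any of the seven variables that are missing or still set to {REPLACE}. The helpers parse_env and missing_settings are hypothetical, and the parser is deliberately simpler than python-dotenv (it ignores quoting and variable expansion).

```python
REQUIRED_VARS = (
    "ACCOUNT_SID",
    "AUTH_TOKEN",
    "PANGEA_DOMAIN",
    "PANGEA_CONFIG_ID",
    "PANGEA_AUTH_TOKEN",
    "OWNER_NUMBER",
    "TARGET_NUMBER",
)
PLACEHOLDER = "{REPLACE}"


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_settings(env: dict) -> list:
    """Return the required names that are unset, empty, or placeholders."""
    return [
        name
        for name in REQUIRED_VARS
        if not env.get(name) or env[name] == PLACEHOLDER
    ]


sample = "ACCOUNT_SID=ACXXXXXXXX\nAUTH_TOKEN={REPLACE}\n"
print("Still to configure:", missing_settings(parse_env(sample)))
```

To check your real file, pass the contents of .env to parse_env instead of the sample string.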

Code walkthrough

Inspect the contents of the redact/views.py source file. The user-defined environment variables set in the .env file are loaded by the load_dotenv() function.

# The os module is used below to read environment variables
import os

# Load the .env file into environment variables
from dotenv import load_dotenv
load_dotenv()

Then, read the Twilio account variables and assign them to accountSid and authToken respectively. Use them to instantiate the Client class defined in the Twilio SDK.

# Read the Twilio SID and Auth Token from the environment variables
accountSid = os.getenv("ACCOUNT_SID")
authToken = os.getenv("AUTH_TOKEN")

# Import the Twilio SDK
from twilio.rest import Client
from twilio.twiml.messaging_response import MessagingResponse

# Instantiate a Twilio Client using the accountSid and authToken
twilioClient = Client(accountSid, authToken)

The Pangea SDK classes are imported and the PANGEA_ values are assigned to their respective variables. A PangeaConfig object is created and used to create a Redact instance.

# Import the Pangea SDK
from pangea.config import PangeaConfig
from pangea.services import Redact

# Read the Pangea domain, Auth Token, and Config ID from the environment variables
pangeaDomain = os.getenv("PANGEA_DOMAIN")
redactToken = os.getenv("PANGEA_AUTH_TOKEN")
redactConfigId = os.getenv("PANGEA_CONFIG_ID")

# Instantiate a Pangea configuration object with the endpoint domain and configId
redactConfig = PangeaConfig(domain=pangeaDomain, config_id=redactConfigId)
redactService = Redact(redactToken, config=redactConfig)

Read each of the recipient numbers into local variables. These values will be used to determine where to relay incoming messages to.

# Read the target recipients numbers from environment variables
ownerNumber = os.getenv("OWNER_NUMBER")
targetNumber = os.getenv("TARGET_NUMBER")

The code reviewed thus far will execute when the Django app is loaded by the web server. Next, declare a function to handle requests. You will configure Twilio to invoke this function each time your Twilio-owned number receives an SMS. The function’s response and the actions it takes are determined by the contents of the object passed in as the request parameter.

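# Import the Django decorators used below
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST

@require_POST
@csrf_exempt
def index(request):
    print(f"Event: {request.POST}")

    # Define a response object, in case a response to the sender is required
    resp = MessagingResponse()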

Determine the destinationNumber to relay the message to.

  • If the message originated from the owner, send it to the target.
  • If the message came from the target, send it to the owner.
  • If the message came from any other number, reply to the sender with the redacted message.

# Determine the destination number
if request.POST['From'].endswith(ownerNumber):
  # If the message is from the owner, send it to target
  destinationNumber = targetNumber
elif request.POST['From'].endswith(targetNumber):
  # If the message is from the target, send it to owner
  destinationNumber = ownerNumber
else:
  # If the message is from any other number, reply to the sender
  destinationNumber = request.POST['From']
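
The three routing rules above can also be expressed as a small pure function, which makes them easy to check without a running server. This is only a sketch: pick_destination is a hypothetical name, and the phone numbers are made-up examples. It keeps the same endswith comparison as the snippet above.

```python
def pick_destination(sender: str, owner_number: str, target_number: str) -> str:
    """Choose where to relay a message based on who sent it."""
    if sender.endswith(owner_number):
        # The owner wrote the message, so relay it to the target.
        return target_number
    if sender.endswith(target_number):
        # The target wrote the message, so relay it to the owner.
        return owner_number
    # Anyone else gets the redacted message sent back to them.
    return sender


owner = "+13055550100"
target = "+13055550199"
print(pick_destination(owner, owner, target))
print(pick_destination(target, owner, target))
print(pick_destination("+14155550123", owner, target))
```

Note that when OWNER_NUMBER and TARGET_NUMBER are set to the same value, the first rule always matches, so messages from that number are relayed back to it.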

Read the original message from the incoming request and pass it to the redact method of the redactService. This invokes the Pangea service, which returns a modified version of the supplied text depending on which redact rulesets are enabled. You can explore the different rulesets or create your own in the Pangea Console.

originalMessage = request.POST['Body']
print(f"Redacting PII from: {originalMessage}")
redactResponse = redactService.redact(originalMessage)
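
To get a feel for what a redacted message looks like without calling the service, the sketch below replaces email addresses with a generic tag using a regular expression. This is a local stand-in for illustration only: it is not how Pangea detects PII, the function redact_emails_locally is hypothetical, and the <EMAIL_ADDRESS> placeholder format is an assumption based on the ruleset name.

```python
import re

# A deliberately simple email pattern; real PII detection is far more thorough.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails_locally(text: str, tag: str = "<EMAIL_ADDRESS>") -> str:
    """Replace anything that looks like an email address with a tag."""
    return EMAIL_PATTERN.sub(tag, text)


original = "Hi, my email address is nicolas.vautier@pangea.cloud"
print(redact_emails_locally(original))
```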

If the response returned from the Redact service is marked as a success, extract the redacted_text from the result and use the twilioClient to relay the redacted message to the destinationNumber. The function then returns a blank response to complete its execution.

if redactResponse.success:
  print(f"Response: {redactResponse.result}")
  redactedMessage = redactResponse.result.redacted_text

  # Send the redacted message to the destinationNumber
  twilioClient.messages.create(
    body=redactedMessage,
    from_=request.POST['To'],
    to=destinationNumber
  )

If the relayed message was affected by the redact operation, notify the original sender that a modified version of their message was sent to the recipient.  

# If a redacted message was sent, notify the sender via an automated response
if redactedMessage != originalMessage:
  resp.message("AUTOMATED RESPONSE: You sent a message with sensitive, personal information. Our system redacted that information so that you can remain protected. The recipient of that message cannot access your sensitive information through this conversation.")

If the redact request fails, or if there are any other failures, reply to the sender with the error details.

else:
  print(f"Redact Request Error: {redactResponse.response.text}")
  if redactResponse.result and redactResponse.result.errors:
    for err in redactResponse.result.errors:
      print(f"\t{err.detail}")
      resp.message(err.detail)
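
The success path of the walkthrough (redact, relay, then notify the sender only if something changed) can be sketched as a function that receives the redact and send operations as arguments. This makes the flow testable with stand-ins and without network access. Everything here is hypothetical scaffolding: relay_message, NOTICE, and the fake redact and send callables are not part of the project or of either SDK.

```python
from typing import Callable, Optional

NOTICE = "AUTOMATED RESPONSE: part of your message was redacted before it was forwarded."


def relay_message(
    original: str,
    destination: str,
    redact: Callable[[str], str],
    send: Callable[[str, str], None],
) -> Optional[str]:
    """Redact a message, relay it, and return a notice if it was changed."""
    redacted = redact(original)
    # Only the redacted text is ever forwarded to the destination.
    send(destination, redacted)
    if redacted != original:
        return NOTICE
    return None


# Stand-ins for the Pangea and Twilio calls, used only for this demonstration.
outbox = []
fake_redact = lambda text: text.replace("a@b.com", "<EMAIL_ADDRESS>")
fake_send = lambda to, body: outbox.append((to, body))

print(relay_message("mail me at a@b.com", "+13055550199", fake_redact, fake_send))
print(outbox)
```

In the real view, the redact callable would wrap redactService.redact and the send callable would wrap twilioClient.messages.create, with the error branch above handling failed requests.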

Run the sample

Django projects are bundled with the command-line utility, manage.py, to help you interact with your project. Use the following command to start a development server on your local machine:

python3 manage.py runserver

You’ll see the following output on the command line:

Performing system checks...

System check identified no issues (0 silenced).

You have unapplied migrations; your app may not work properly until they are applied.
Run 'python manage.py migrate' to apply them.

September 18, 2022 - 15:50:53
Django version 4.1, using settings 'mysite.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.

Your web app is now running! Your function is waiting to respond to requests sent to http://127.0.0.1:8000/redact/, but this URL is only reachable from your local computer. You’ll need to make the endpoint accessible publicly so Twilio can query it when an SMS is received.
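
Before exposing the server, you may want to exercise the endpoint locally with a request shaped like the one Twilio sends. The sketch below builds a form-encoded POST containing only the three fields the view reads (From, To, and Body); real Twilio requests carry many more fields. The helper build_webhook_request is hypothetical and the numbers are made-up. Keep in mind that actually sending it makes the app call Pangea and attempt a real outbound SMS through Twilio.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webhook_request(url: str, sender: str, recipient: str, body: str) -> Request:
    """Build a form-encoded POST that mimics a minimal Twilio SMS webhook."""
    form = urlencode({"From": sender, "To": recipient, "Body": body}).encode("utf-8")
    # Supplying data makes urllib issue a POST request.
    return Request(
        url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_webhook_request(
    "http://127.0.0.1:8000/redact/",
    sender="+13055550100",
    recipient="+14155550123",
    body="Hi, my email address is someone@example.com",
)
print(request.get_method(), request.full_url)

# To send it while the development server is running, uncomment:
# from urllib.request import urlopen
# print(urlopen(request).read().decode())
```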

Make your app accessible publicly on the internet with ngrok

Twilio has two great tutorials covering this use case: Test Your Webhooks Locally with ngrok, which walks through it in detail, and Using Ngrok in 2022, which covers the latest version. Visit the links for details, or follow these quick instructions to get started:

  • Download the version for your particular system and install it to a location of your choice.
  • Using the terminal, navigate to the folder you installed it to.
  • Run ./ngrok http 8000 on Linux and Mac, or ngrok http 8000 on Windows, to start ngrok and tell it which port to expose to the public internet.

You should see the following output on the command line:

ngrok                                                       (Ctrl+C to quit)

Visit http://localhost:4040/ to inspect, replay, and modify your requests

Session Status                    online
Account                           your name (Plan: Free)
Version                           3.1.0
Region                            United States (us)
Latency                           99ms
Web Interface                     http://127.0.0.1:4040
Forwarding                        https://4762-47-156-19-205.ngrok.io -> http://localhost:8000

Make a note of the Forwarding base URL above. It will be needed to configure your Twilio programmable phone number to query your app for instruction when an SMS is received.

If you do not already have a Twilio number, follow these instructions:

  • Go to your Phone Numbers Dashboard.
  • Click Buy a Number.
  • Search for a number that suits you.
  • Click Buy.
  • Confirm your purchase, then click Setup Number.

Buy a number page in the Twilio console

Otherwise, navigate to the Active numbers panel of the Twilio console and select the number you’d like to use with this service.

Active numbers page in the Twilio console

Under Messaging, look for the line that says “A message comes in.” Change the first box to “Webhook” and add the ngrok base URL with the path to the redact/ endpoint appended to it in the second box. Yours should look similar to https://[your generated ID].ngrok.io/redact/. Save the configuration. The function on your local machine will now be invoked every time an SMS is sent to this number.

Configure a phone numbers webhooks page in the Twilio console
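
A common mistake at this step is a missing or doubled slash between the ngrok base URL and the redact/ path. The small sketch below joins the two safely; webhook_url is a hypothetical helper, and the base URL is the example value from the ngrok output above.

```python
def webhook_url(forwarding_base: str, path: str = "redact/") -> str:
    """Join the ngrok Forwarding base URL and the endpoint path."""
    # Strip any slashes at the seam so exactly one separates the two parts.
    return forwarding_base.rstrip("/") + "/" + path.lstrip("/")


print(webhook_url("https://4762-47-156-19-205.ngrok.io"))
```

The trailing slash on redact/ is kept on purpose, matching the URL the development server listens on.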

Test and verify the redacted conversation

That's it! You now have a proxy number between the OWNER_NUMBER and TARGET_NUMBER you configured in the .env file. From a phone with either number, send an SMS to the Twilio number you purchased and configured to invoke your function. The Pangea Redact service inspects the contents of each SMS for the data types you enabled in the Pangea Console and replaces any matches with the appropriate generic tag before the message is forwarded to the other participant. Similarly, when the recipient replies, their message is forwarded back to you, creating a conversation thread in the SMS apps on both phones. You can update the data types that get redacted at any time without redeploying or changing your app.

Conclusion

In this article, you learned how to build and run an SMS proxy that redacts sensitive information from conversations. Whatever your use case, whether a live customer support channel or an automated AI chatbot, minimizing the unintentional disclosure of your users’ private or sensitive data can have a significant impact on their trust in your service.

Nicolas Vautier is a Developer Advocate at Pangea Cyber and is a privacy and data security enthusiast. If you have questions, comments, or ideas for future posts, Nicolas can be reached at nicolas.vautier@pangea.cloud. Follow him on Twitter @DeveloperEnvY to join his journey as he helps unify security services for app builders @PangeaCyber.