WhatsApp Media with C#, .NET Core and the Pho.to API

February 19, 2019

When the WhatsApp API for Twilio launched last year it only supported text.

Not anymore!

The WhatsApp API for Twilio now supports media, including images and files.  

If you would like to see a full integration of Twilio APIs in a .NET Core application then check out this free 5-part video series I created. It's separate from this blog post tutorial but will give you a full rundown of many APIs at once.

So let's build an app that receives an image from WhatsApp, processes the image using one of the many options available on the Pho.to API and then returns the processed image.

What we'll need to get started.

Create a new folder for your project and navigate into it.

>mkdir PhotoFiddler
>cd PhotoFiddler

Once in your new folder, we can make a new .NET Core WebAPI project using the command below:

>dotnet new webapi

This command will create a WebAPI project from the template that comes with the SDK.

Setting up the basics

Open the PhotoFiddler folder in your code editor.  The first thing we will do is update the PhotoFiddler.csproj file to include the various NuGet packages. We also need a user secrets ID: generate a GUID with a GUID generator and use it in place of GUID in the UserSecretsId element.

<Project Sdk="Microsoft.NET.Sdk.Web">
 <PropertyGroup>
   <TargetFramework>netcoreapp2.2</TargetFramework>
   <AspNetCoreHostingModel>InProcess</AspNetCoreHostingModel>
   <UserSecretsId>GUID</UserSecretsId>

 </PropertyGroup>
 <ItemGroup>
   <PackageReference Include="Microsoft.AspNetCore.App" />
   <PackageReference Include="Microsoft.AspNetCore.Razor.Design" Version="2.2.0" PrivateAssets="All" />
   <PackageReference Include="twilio" Version="5.26.0" />
   <PackageReference Include="twilio.aspnet.common" Version="5.20.1" />
   <PackageReference Include="twilio.aspnetcore.mvc" Version="1.0.2" />
 </ItemGroup>
</Project>

If we return to the command line we can fetch and install our newly added packages with the following command.        

>dotnet restore

We will need some API keys from Pho.to which can be generated here.  We will create a model to map these keys to. Add a new file called PhotoApiSettings.cs in the root of your project and add the following code to it:

namespace PhotoFiddler
{
   public class PhotoApiSettings
   {
       public string PrivateKey {get; set;}
       public string AppId {get; set;}
   }
}

We will add a PhotoApiSettings section to the appsettings.json file, but only with placeholder values. The real keys go into User Secrets so that they don't end up in source control.  If you are unsure of how to do this then check out this blog post.

The secrets.json file uses the same structure, with the actual keys as values. The placeholder version in appsettings.json looks like this:

{
 "PhotoApiSettings": {
   "PrivateKey": "set in User Secrets",
   "AppId": "set in User Secrets"
 }
}

We can map the Pho.to keys to our model from within the ConfigureServices method of the Startup.cs file. First, add the using statement that pulls PhotoApiSettings into the class, then add the call that reads the settings from configuration, ensuring it comes before the services.AddMvc() method call:

using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using PhotoFiddler;
...
...
public void ConfigureServices(IServiceCollection services)
       {
           services.Configure<PhotoApiSettings>(Configuration.GetSection("PhotoApiSettings"));
           services.AddMvc().SetCompatibilityVersion(CompatibilityVersion.Version_2_1);
       }
...

We can then inject these settings into any class using IOptions<TOptions> and the Options Pattern.

Getting prepped for our processing

The Pho.to API requires us to make an initial API call with a URL to our image.  The image will then be taken, added to a task list, and queued for processing. A request ID is given in return.  A second API call with the request ID is made to see if the image has been processed.

Sending a URL to our image is not the most foolproof approach, and the Pho.to API is unable to read the image directly from the Twilio media URL.  Therefore, we need to save the incoming image locally and then serve that up to Pho.to.  Let's write a service to handle all of these steps.

Create a new folder in the root of the project called Services.  We can then add an interface for our service called IPhotoProcessor.cs into this folder and add the following code.

using System.Threading.Tasks;

namespace PhotoFiddler.Services
{
   public interface IPhotoProcessor
   {
       Task<string> Process(string incomingImageUrl, string sid, string host);
      
   }
}

Next, we will create the implementation of this Service.  Add a file called PhotoProcessor.cs in the Services folder and add the following code.

using System;
using System.IO;
using System.Threading.Tasks;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Xml;
using System.Linq;
using System.Xml.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Collections.Generic;

using Microsoft.Extensions.Options;

namespace PhotoFiddler.Services
{
   public class PhotoProcessor : IPhotoProcessor
   {
       public async Task<string> Process(string incomingImageUrl, string sid, string host)
       {
           return null;
       }

   }
}

Saving the incoming image

The first thing we want to do is save the incoming image locally.

We will do this by adding a private method to the service and then calling it in the public method Process.  The private method will return the location and file name of the newly stored image.

public async Task<string> Process(string incomingImageUrl, string sid, string host)
       {
           var imageLocation = await SaveImageLocally(incomingImageUrl, sid);
           return null;
       }

       private async Task<string> SaveImageLocally(string imageUrl, string sid)
       {
           var root = "/wwwroot";
           var dir = "/images/";
           var filename = $"{sid}.jpg";
           var path = Environment.CurrentDirectory + root + dir;
           var saveLocation = path + filename;
           var rootLocation = dir + filename;

           // CreateDirectory does nothing if the folder already exists
           Directory.CreateDirectory(path);

           using (var httpClient = new HttpClient())
           {
               byte[] imageBytes = await httpClient.GetByteArrayAsync(imageUrl);

               // Write and close the file in one call so no file handle is left open
               await File.WriteAllBytesAsync(saveLocation, imageBytes);
           }

           // Path relative to the site root, e.g. /images/<sid>.jpg
           return rootLocation;
       }
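
The method above always saves the file with a .jpg extension. WhatsApp can deliver other image types, so one option is to pick the extension from the content type that Twilio reports in the MediaContentType0 webhook parameter. The sketch below is only an illustration under that assumption: the MediaFileNames class, its methods and the lookup table are hypothetical names that are not part of the original project.

```csharp
using System;
using System.Collections.Generic;

public static class MediaFileNames
{
    // Hypothetical lookup table; extend it with whatever types you expect to receive.
    private static readonly Dictionary<string, string> Extensions =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "image/jpeg", ".jpg" },
            { "image/png", ".png" }
        };

    // Returns the extension for a content type, falling back to .jpg when it is unknown.
    public static string ExtensionFor(string contentType)
    {
        if (contentType != null && Extensions.TryGetValue(contentType.Trim(), out var extension))
        {
            return extension;
        }
        return ".jpg";
    }

    // Builds the file name from the message SID and the reported content type.
    public static string FileNameFor(string sid, string contentType)
    {
        return sid + ExtensionFor(contentType);
    }
}
```

With something like this in place, SaveImageLocally could take the content type as an extra argument and call MediaFileNames.FileNameFor(sid, contentType) instead of hardcoding the extension.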

We are saving our file to the wwwroot folder, which is the default location for static files. We will also need to enable our application to serve up static files.  Let's do that next.

Open the Startup.cs file and add app.UseStaticFiles(); to the Configure method as shown below:

 

public void Configure(IApplicationBuilder app, IHostingEnvironment env)
       {
        ...
           app.UseStaticFiles();
           app.UseHttpsRedirection();
           app.UseMvc();
       }

Whilst we are in the Startup.cs file we will register our service to .NET Core's IoC container from within the ConfigureServices method.

...
using PhotoFiddler.Services;
...
public void ConfigureServices(IServiceCollection services)
       {
           services.Configure<PhotoApiSettings>(Configuration.GetSection("PhotoApiSettings"));
           services.AddScoped<IPhotoProcessor, PhotoProcessor>();
           services.AddMvc().SetCompatibilityVersion(CompatibilityVersion.Version_2_1);
       }

I have chosen to configure my service as Scoped as I want the instance to be around for the lifetime of the request.  You can read more about the service registration options in the Microsoft documentation.

Midway test run

Let's test that we are saving the incoming image.

To do this we need to write and set up a webhook. When we send a WhatsApp message to Twilio, it will trigger the webhook.

The template comes with a ValuesController, which we can repurpose for our use by renaming it to PhotoController.  Let's remove some of the extra code and add in some of our own.  We can also inject our newly created service into the constructor.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Http;
using Twilio.Rest.Api.V2010.Account;
using Twilio.AspNet.Common;
using Twilio.TwiML;
using PhotoFiddler.Services;


namespace PhotoFiddler.Controllers
{
   [Route("api/[controller]")]
   [ApiController]
   public class PhotoController : ControllerBase
   {

       private readonly IPhotoProcessor _photoProcessor;

       public PhotoController(IPhotoProcessor photoProcessor)
       {
           _photoProcessor = photoProcessor;
       }

       [HttpPost]
       public async Task<IActionResult> Post()
       {
           return Content("Ahoy!");
       }
   }
}

When Twilio calls our webhook, it sends a request body with all the details and contents of the incoming WhatsApp message.  We will need two values from the incoming request: MediaUrl0, which is the URL of the photo we receive, and MessageSid, the unique identifier for the message. We can get these from the Request from within our Post method in PhotoController.cs.

       public async Task<IActionResult> Post()
       {
           var incomingImage = Request.Form["MediaUrl0"];
           var sid = Request.Form["MessageSid"];
           //var host = HttpContext.Request.Host.ToString();
           var host = "https://<NGROK>.ngrok.io/";
           var processedImage = await _photoProcessor.Process(incomingImage, sid, host);

           return Content("Ahoy!");
       }
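
The Post method above assumes every incoming message carries a photo. Twilio also sends a NumMedia value and a MediaContentType0 value with the webhook request, so a small guard could check those before any processing starts. This is a minimal sketch under the assumption that the form values have been copied into a dictionary; the IncomingMessage class and HasImage method are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class IncomingMessage
{
    // True when the webhook form reports at least one media item
    // and the first item's content type starts with "image/".
    public static bool HasImage(IDictionary<string, string> form)
    {
        string numMedia;
        string contentType;
        int count;

        if (!form.TryGetValue("NumMedia", out numMedia) ||
            !int.TryParse(numMedia, out count) ||
            count < 1)
        {
            return false;
        }

        return form.TryGetValue("MediaContentType0", out contentType)
            && contentType != null
            && contentType.StartsWith("image/", StringComparison.OrdinalIgnoreCase);
    }
}
```

Inside the controller, the dictionary could be built with Request.Form.ToDictionary(kv => kv.Key, kv => kv.Value.ToString()), and the method could reply with a plain text message when HasImage returns false.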

We will also need the host domain to be able to send a URL to Pho.to. We will be using ngrok (find out how to set this up here) so that we don't have to publish our application to a server.  I have commented out the code that would get the host domain and hardcoded my ngrok domain for testing purposes.

Once ngrok is installed you can start it off by running the following command in the terminal.

 

>ngrok http <PORT_NUMBER> -host-header="localhost:<PORT_NUMBER>"

This will give you an ngrok URL that can be used to publicly expose your project without the need to deploy.  It will look something like this: https://123456.ngrok.io/. Remember to update the host variable with the URL ngrok gives you.

Next, we need to run our application.

I, personally, like to change the launchSettings.json file, found in the Properties folder of the project, so that it no longer uses ports 5000 and 5001 or SSL when running locally, as this seems to limit problems with self-signed certificates and the like.

The launchUrl can be updated to reflect the renaming of ValuesController to PhotoController. Here is an example:

{
 "$schema": "http://json.schemastore.org/launchsettings.json",
 "iisSettings": {
   "windowsAuthentication": false,
   "anonymousAuthentication": true,
   "iisExpress": {
     "applicationUrl": "http://localhost:3238",
     "sslPort": 44341
   }
 },
 "profiles": {
   "IIS Express": {
     "commandName": "IISExpress",
     "launchBrowser": true,
     "launchUrl": "api/photo",
     "environmentVariables": {
       "ASPNETCORE_ENVIRONMENT": "Development"
     }
   },
   "PhotoFiddler": {
     "commandName": "Project",
     "launchBrowser": true,
     "launchUrl": "api/photo",
     "applicationUrl": "http://localhost:3238",
     "environmentVariables": {
       "ASPNETCORE_ENVIRONMENT": "Development"
     }
   }
 }
}

You can run your solution with the following command from the CLI

>dotnet run

Next, we need to paste the URL of our webhook, https://<NGROK>.ngrok.io/api/photo, into the Twilio API for WhatsApp configuration under A Message Comes In, and save.

screenshot of twilio console

 

Now take a selfie and WhatsApp it to +1 415 523 8886 (or whichever number you set up with your sandbox).

If it all worked you should receive Ahoy! in your WhatsApp inbox, and if you go to the wwwroot folder in your project you should now have an images folder with the image you sent inside.

Result!

gif of woman making a victory pose

Processing the image with Pho.to API

Now that our images are saving locally, we can make the call to the Pho.to API.

For this we will need the PhotoApiSettings we mapped earlier. We will inject them directly into the PhotoProcessor constructor in the PhotoProcessor.cs file.

…
using PhotoFiddler;
…
public class PhotoProcessor : IPhotoProcessor
   {
       private readonly PhotoApiSettings _photoApiSettings;
       public PhotoProcessor(IOptions<PhotoApiSettings> photoApiSettings)
       {
           _photoApiSettings = photoApiSettings.Value;
       }
      …
   }

We will add a new private method to the PhotoProcessor class and call it in the Process method.

       public async Task<string> Process(string incomingImageUrl, string sid, string host)
       {
           var imageLocation = await SaveImageLocally(incomingImageUrl, sid);

           var imageUrl = host + imageLocation;

           var processedImageUrl = await GetProcessedImage(imageUrl);

           return processedImageUrl;
       }

       private async Task<string> GetProcessedImage(string imageUrl)
       {
           string processedImageUrl = string.Empty;
           return processedImageUrl;

       }
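
One detail worth watching in Process: the hardcoded host ends with a slash and the saved image location starts with one, so plain concatenation yields a URL with a double slash after the domain, which some servers may treat differently. A possible way around it, sketched here with a hypothetical PublicUrls helper that is not part of the original project, is to let System.Uri resolve the rooted path against the host.

```csharp
using System;

public static class PublicUrls
{
    // Resolves a rooted path such as "/images/abc.jpg" against a host such as
    // "https://example.ngrok.io/" (with or without the trailing slash).
    public static string Combine(string host, string rootedPath)
    {
        var baseUri = new Uri(host, UriKind.Absolute);
        return new Uri(baseUri, rootedPath).AbsoluteUri;
    }
}
```

For example, PublicUrls.Combine("https://123456.ngrok.io/", "/images/abc.jpg") gives https://123456.ngrok.io/images/abc.jpg, and the same result comes back when the host has no trailing slash.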

The Pho.to API also requires each request to be signed: the XML we send is hashed with HMAC-SHA1, using our private key as the key. So we will add another private method to the PhotoProcessor class that handles that.

private string EncodeKey(string input, byte[] key)
       {
           // HMAC-SHA1 of the input, keyed with the private key, returned as lowercase hex
           using (var hmac = new HMACSHA1(key))
           {
               var hash = hmac.ComputeHash(Encoding.ASCII.GetBytes(input));
               return hash.Aggregate("", (s, e) => s + String.Format("{0:x2}", e));
           }
       }
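
Signing bugs are hard to spot because a wrong signature simply gets rejected by the API. One way to gain some confidence is to run the same HMAC-SHA1 logic in isolation against a widely published test vector. The sketch below does that with a hypothetical Signing class (not part of the original project) that mirrors what EncodeKey does.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Signing
{
    // Computes HMAC-SHA1 over the data, keyed with the private key,
    // and returns the 40-character lowercase hex digest.
    public static string HmacSha1Hex(string data, string privateKey)
    {
        using (var hmac = new HMACSHA1(Encoding.ASCII.GetBytes(privateKey)))
        {
            var hash = hmac.ComputeHash(Encoding.ASCII.GetBytes(data));
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Calling Signing.HmacSha1Hex("The quick brown fox jumps over the lazy dog", "key") should return de7c9b85b8b78aa6bc8a7a36f70a90701c9db4d9, the commonly cited HMAC-SHA1 value for that input.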

We will now create the first API request, which adds our image to the Pho.to task list and gets a request ID in response.

To do this we will create an HTTP client and make a POST to the addtask endpoint.

Pho.to is expecting XML in a particular format which includes the URL to the image we want processed.  This is also where we can specify how we would like the image to be manipulated.  Check out this page for some fun ideas!  I have chosen to make the subject of my photos into aliens, Martians to be specific!

When we receive our successful response, we parse the XML to convert it to an XElement object and extract the request_id value which we will need for the second API call.

       private async Task<string> GetProcessedImage(string imageUrl)
       {
           string processedImageUrl = string.Empty;

           using (var httpClient = new HttpClient())
           {
               var apiEndPointPost = "http://opeapi.ws.pho.to/addtask";
               var apiEndPointGet = "http://opeapi.ws.pho.to/getresult?request_id=";
               string requestId = "";

               httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/xml"));

               string xmlMessage = $@"<image_process_call>
                                     <image_url>{imageUrl}</image_url>
                                   <methods_list>
                                       <method>
                                           <name>caricature</name>
                                           <params>type=10</params>
                                       </method>
                                   </methods_list>
                               </image_process_call>";

               var key = Encoding.ASCII.GetBytes(_photoApiSettings.PrivateKey);
               var keySha = EncodeKey(xmlMessage, key);
               var values = new Dictionary<string, string>
                   {
                       { "app_id", _photoApiSettings.AppId },
                       { "sign_data", keySha },
                       {"data", xmlMessage}
                   };

               var content = new FormUrlEncodedContent(values);

               var responseMessage = await
                   httpClient
                       .PostAsync(apiEndPointPost, content);

               if (responseMessage.IsSuccessStatusCode)
               {
                   var response = await responseMessage.Content.ReadAsStringAsync();

                   var xml = XElement.Parse(response).Descendants().FirstOrDefault(x => x.Name == "request_id");
                   requestId = xml?.Value;
               }

               return processedImageUrl;

           }
       }
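
The same "find the first element with this name" lookup is needed again for the status and the result URL, so it can be handy to pull it into a small helper and try it against a sample document. The sketch below assumes a made-up response shape purely for illustration; check the Pho.to documentation for the real one. PhotoResponses and ReadValue are hypothetical names.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class PhotoResponses
{
    // Returns the text of the first descendant element with the given name,
    // or null when no such element exists. Throws XmlException on malformed XML.
    public static string ReadValue(string xml, string elementName)
    {
        var element = XElement.Parse(xml)
            .Descendants()
            .FirstOrDefault(x => x.Name.LocalName == elementName);
        return element?.Value;
    }
}

// Example with an invented sample document:
// var sample = "<image_process_response><request_id>abc-123</request_id></image_process_response>";
// PhotoResponses.ReadValue(sample, "request_id") returns "abc-123"
// PhotoResponses.ReadValue(sample, "status") returns null
```

Returning null for a missing element keeps the caller in charge of deciding whether that is an error or just a response that is not ready yet.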

As we have added our photo to a queue, when we call the second API, we may be told our image isn't ready yet.  So let's code a do-while loop to handle this.  Add this in immediately after the newly added code above, but before the return statement.

 private async Task<string> GetProcessedImage(string imageUrl)
       {
...
               string status;

               int i = 0;
               do
               {
                   // Wait a second without blocking the request thread
                   await Task.Delay(1000);
                   var url = apiEndPointGet + requestId;
                   var responseGet = await httpClient.GetAsync(url);
                   var contentString = await responseGet.Content.ReadAsStringAsync();

                   var xmlGet = XElement.Parse(contentString).Descendants();
                   var xmlStatus = xmlGet.FirstOrDefault(x => x.Name == "status");
                   status = xmlStatus?.Value;
                   ++i;

                   if (status == "OK")
                   {
                       var xmlUrl = xmlGet.FirstOrDefault(x => x.Name == "result_url");
                       processedImageUrl = xmlUrl?.Value ?? "empty node";
                   }
               }
               while (i < 10 && status == "InProgress");

               if (i == 10 && status == "InProgress")
               {
                   Console.WriteLine("Retrieve processed image : Timeout error.");
                   return string.Empty;
               }

           }
           return processedImageUrl;
}

This code will call the API up to ten times at one-second intervals whilst the status is set to InProgress.  Once the status is OK it will extract the URL of our processed image and we return that.
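
The retry logic can also be separated from the HTTP details, which makes it easy to test without calling the API. The sketch below is one possible shape, not the approach used in the original code: Polling and UntilFinished are hypothetical names, and the status check is passed in as a delegate so any source of status strings can be plugged in.

```csharp
using System;
using System.Threading.Tasks;

public static class Polling
{
    // Calls checkStatus up to maxAttempts times, waiting for `delay` before each call,
    // and stops as soon as the status is anything other than "InProgress".
    // Returns the last status seen (null if maxAttempts is zero).
    public static async Task<string> UntilFinished(
        Func<Task<string>> checkStatus, int maxAttempts, TimeSpan delay)
    {
        string status = null;
        for (var attempt = 0; attempt < maxAttempts; attempt++)
        {
            await Task.Delay(delay);
            status = await checkStatus();
            if (status != "InProgress")
            {
                break;
            }
        }
        return status;
    }
}
```

A caller would pass a delegate that requests the getresult endpoint and returns the status value, then treat a final status of InProgress as a timeout.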

Sending our processed image

The last piece of our application is to respond to Twilio with the URL to our processed image.  We will do this in the webhook endpoint that we set up in the PhotoController.cs file using TwiML, Twilio's XML-based markup language.

       [HttpPost]
       public async Task<IActionResult> Post()
       {
           var incomingImage = Request.Form["MediaUrl0"];
           var sid = Request.Form["MessageSid"];

           //var host = HttpContext.Request.Host.ToString();
           var host = "https://<NGROK>.ngrok.io/";
           var processedImage = await _photoProcessor.Process(incomingImage, sid, host);

           var twiml = $@"<Response>
                               <Message>
                                   <Media>{processedImage}</Media>
                               </Message>
                           </Response>";

           return new ContentResult { Content = twiml, ContentType = "application/xml" };
       }
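
Because the TwiML above is assembled with string interpolation, a result URL containing a character such as & would produce invalid XML, and an empty URL would send an empty Media element. A hedged alternative is to build the response with XElement, which escapes values for us, and to fall back to a text reply when there is no image. TwimlReplies, ForImage and the fallback wording are hypothetical and not part of the original project.

```csharp
using System.Xml.Linq;

public static class TwimlReplies
{
    // Builds <Response><Message>...</Message></Response>.
    // Sends the image when a URL is available, otherwise a short text body.
    public static string ForImage(string processedImageUrl)
    {
        var message = string.IsNullOrEmpty(processedImageUrl)
            ? new XElement("Message",
                new XElement("Body", "Sorry, we could not process that photo."))
            : new XElement("Message",
                new XElement("Media", processedImageUrl));

        // DisableFormatting keeps the output on one line with no added whitespace
        return new XElement("Response", message).ToString(SaveOptions.DisableFormatting);
    }
}
```

The controller could then return new ContentResult { Content = TwimlReplies.ForImage(processedImage), ContentType = "application/xml" } in place of the interpolated string.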

Testing out our processor

Run your project again; ngrok should still be running. Take a selfie and send it to yourself and see if you'd make a good Martian 👽!

 

screenshot of whatsapp with processed image

What next?

There are a few differences between MMS and WhatsApp media messages. Media messages can only be sent to WhatsApp users if there is an active messaging ‘session’ established. Messaging sessions are created when a user responds to a template message or when the user initiates the conversation, and they stay active for 24 hours after the last message the user sends. WhatsApp media messages also do not support some of the file types that MMS does. For more information on file type support check out the FAQs.

We covered quite a few coding challenges in this post, from implementing a do while loop to parsing XML into an XElement.  What we didn't do was write any code that handled what happens when any of these stages goes wrong.  So that could be the next step.

The Pho.to API has a lot of uses and many of the processes can be stacked.  You could make our Martian a Martian cartoon by changing the XML message we send out from the PhotoProcessor to the following:

<image_process_call>
<image_url>{imageUrl}</image_url>
   <methods_list>
       <method>
           <name>caricature</name>
           <params>type=10;cartoon=true</params>
        </method>
   </methods_list>
</image_process_call>

If you would like to see my completed code it can be found on my GitHub.

I can't wait to see what you build!  Please share it with me through any of my contact channels!