
Serverless Computing – A Paradigm Shift

In the beginning

When I first started programming (on the ZX Spectrum), you would typically write a program such as this (apologies if it’s syntactically incorrect, but I don’t have a Speccy interpreter to hand):

10 PRINT "Hello World"
20 GOTO 10

You could type that into a computer at Tandy’s and chuckle as the shop assistants tried to work out how to stop the program (sometimes, you might not even use “Hello World”, but something more profane).

However, no matter how long it took them to work out how to stop the program, the shop only paid for the electricity the computer used while it was switched on. Further, there would only ever be a finite and, presumably (I never tried), predictable number of “Hello World” messages produced in an hour.

The N-Tier Revolution

Fast forward a few years, and everyone’s talking about N-Tier computing. We’re writing programs that run on servers. Some of those servers are big and expensive, but pretty much the same statements hold true. No matter how big and complex the program you run on the server, it’s your server (well, actually, it probably belongs to a company that you work for in some capacity). For example, if you have a poorly written SQL Server stored procedure that scans an entire table, the same two statements are still true: no matter how long it takes to run, the price of execution is consistent, and the amount of time it takes to run is predictable (although you may decide that, if it’s slow, speeding it up is a better use of your time than calculating exactly how slow it is).

Using Other People’s Computers

And now we come to cloud computing… or do we? The chronology is a little off on this, and that’s mainly because everyone keeps forgetting what cloud computing actually is. You’re renting time on somebody else’s computer. If I were twenty years older, I might have started this post by saying that “this was what was done back in the ’70s and ’80s, in a manner of speaking”. But I’m not, so we’ll jump back to the mid-to-late ’90s and SETI. Anyone who had a computer back in those days of dial-up connections and 14.4K modems will remember that SETI (the Search for Extra-Terrestrial Intelligence) was distributing a program for everyone to run on their computer instead of toaster screensavers*.

Wait – isn’t that what cloud computing is?

Yes – but SETI did it in reverse. You would dial up, download a chunk of data, and your machine would process it as a screensaver.

Everyone was happy to do that for free, because we all want to find aliens. But what if there had been a cost associated with each byte of data processed? Clearly something similar was in Amazon’s mind when they started with this idea.

Back to the Paradigm Shift

So, the title and gist of this post: the two statements that have held true right through, from programs written on the ZX Spectrum, to programs written in Turbo C and Pascal, to programs written in C# running on a dedicated server, have now changed. Here are the four big changes brought about by the cloud**:

1. Development and Debugging

If you write a program, you can no longer be sure of the cost or the execution time. This isn’t a scare post: both are broadly predictable; but the paradigm has now changed. If you’re in the middle of writing an Azure function, or a GCP BigQuery query, and it doesn’t work, you can’t just close your laptop and go for dinner while you have a think, because while you do, nodes are lighting up all over the world trying to complete your task. The lights are dimming in Seattle while your Azure function crashes again and again.

2. Scale and Accessibility

The second big change is the way that your code is scaled. For example, you might be used to parallelising slow code so that you can make use of all available threads; however, in our new world, if you do that, you may actually be making it harder for your cloud platform of choice to scale your code.

Because you pay per minute of computing time (typically this is more expensive than storage), code that is unnecessarily slow or inefficient may not cause your system to slow down; what it will probably do is cost you more money to run at the same speed.

In some cases, you might find it’s more efficient to do the opposite of what you have thus far believed to be accepted industry best practice. For example, it may prove more cost-efficient to de-normalise data that you need to access.

3. Logging

This isn’t exactly new: you should always have been logging in your programs – it’s just common sense. However, there’s a new emphasis here: you’re not running this on your own server (you’re not even running it on a customer’s server) – it’s somebody else’s. That means that if it crashes, there’s a limited amount of investigation that you can do. As a result, you need to log profusely.

4. Microservices and Message Busses

IMHO, there are two reasons why the microservice architecture has become so popular in the cloud world. The first is that it’s easy: spinning up a new Azure function endpoint takes minutes.

Secondly, it’s more scaleable. Microservices make your code more scaleable because it’s easy for the cloud provider to instantiate two instances of your program, and then three, and then a million. If your program does one small thing, then only that small thing needs to be instantiated. If your program does twenty different things, it can still scale, but it’ll cost more, because it will require more processing power.

Finally, instead of simply calling the service that you need, there is now the option to place a message on a queue; apart from separating your program into areas of defined responsibility, this means that, when your cloud provider of choice scales your service out, all the instances can pick up a message and deal with it.
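To make the parallelisation point from section 2 and the queueing point above a little more concrete, here’s a rough sketch (the function, queue and payload names are invented): instead of churning through a batch in-process with Parallel.ForEach, each item is dropped onto a queue so that the platform can scale out the small consumer that processes it.

// A hypothetical fan-out function: rather than processing the batch itself
// (e.g. Parallel.ForEach(deliveries, ProcessDelivery)), it enqueues each item
// and lets the cloud provider scale out the consumer function.
[FunctionName("FanOutDeliveries")]
public static HttpResponseMessage Run(
    [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestMessage req,
    [ServiceBus("examplequeue")] ICollector<string> queue,
    TraceWriter log)
{
    string body = req.Content.ReadAsStringAsync().GetAwaiter().GetResult();
    List<string> deliveries = Newtonsoft.Json.JsonConvert.DeserializeObject<List<string>>(body);

    foreach (string delivery in deliveries)
    {
        queue.Add(delivery);
    }

    log.Info($"Enqueued {deliveries.Count} deliveries");
    return req.CreateResponse(HttpStatusCode.OK);
}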

Summary

So, what’s the conclusion: is cloud computing good, or bad? In five years time, the idea that someone in your organisation will know the spec of the machine your business critical application is running on seems a little far-fetched. The days of having a trusted server that has a load of hugely important stuff on, but nobody really knows what, and has been running since 1973 are numbered.

Obviously, there’s a price to pay for everything. In the case of the cloud, it’s complexity – not of the code, and not really of the combined system, but typically you are introducing dozens of moving parts to a system. If you decide to segregate your database, you might find that you have several databases, tens, or even hundreds of independent processes and endpoints, you could even spread that across multiple cloud providers. It’s not that hard to lose track of where these pieces are living and what they are doing.

Footnotes

* If you don’t get this reference then you’re probably under 30.

** Clearly this is an opinion piece.

Setting up an e-mail Notification System using Logic Apps

One of the newer features of Microsoft’s Azure offering is Logic Apps: these are basically a workflow system, not totally dissimilar to Windows Workflow (WF, so as not to get sued by panda bears). I’ve worked with a number of workflow systems in the past, from standard offerings to completely bespoke versions. The problem always seems to be that, once people start using them, they become the first thing you reach for to solve every problem. That’s not to say that you can’t solve every problem using a workflow (obviously, it depends which workflow and what you’re doing), but they are not always the best solution. In fact, they tend to be at their best when they are small and simple.

With that in mind, I thought I’d start with a very straightforward e-mail alert system. In fact, all this is going to do is read a Service Bus queue and send an e-mail. I discussed a similar idea here, but that was using a custom-written function.

Create a Logic App

The first step is to create a new Logic App project:

There are three options here: create a blank logic app, choose from a template (for example, process a service bus message), or define your own with a given trigger. We’ll start from a blank app:

Trigger

Obviously, for a workflow to make sense, it has to start on an event or a schedule. In our case, we are going to run from a service bus entry, so let’s pick that from the menu that appears:

In this case, we’ll choose Peek-Lock, so that we don’t lose our message if something fails. I can now provide the connection details, or simply pick the service bus from a list that it already knows about:

It’s not immediately obvious, but you have to provide a connection name here:

If you choose Peek-Lock, you’ll be presented with an explanation of what that means, and then a screen such as the following:

In addition to picking the queue name, you can also choose the queue type (as well as listening to the queue itself, you can run your workflow from the dead-letter queue – which is very useful in its own right, and may even be a better use case for this type of workflow). Finally, you can choose how frequently to poll the queue.

If you now pick “New step”, you should be faced with an option:

In our case, let’s provide a condition (so that only queue messages with “e-mail” in the message result in an e-mail):

Before progressing to the next stage – let’s have a look at the output of this (you can do this by running the workflow and viewing the “raw output”):

Clearly the content data here is not what was entered. A quick search revealed that the data is Base64 encoded, so we have to make a small tweak in advanced mode:
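For reference, the sort of expression you end up with in advanced mode looks something like this (ContentData is the property name the Service Bus connector used at the time; treat the exact shape as a sketch rather than gospel):

@contains(base64ToString(triggerBody()?['ContentData']), 'e-mail')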

Okay – finally, we can add the step that actually sends the e-mail. In this instance, I simply picked Outlook.com, and allowed Azure access to my account:

The last step is to complete the message. Because we only took a “peek-lock”, we now need to manually complete the message. In the designer, we just need to add an action:

Then tell it that we want to use the service bus again. As you can see – that’s one of the options in the list:

Finally, it wants the name of the queue, and asks for the lock token – which it helpfully offers from dynamic content:

Testing

To test this, we can add a message to our test queue using the Service Bus Explorer:

I won’t bother with a screenshot of the e-mail, but I will show this:

Which provides a detailed overview of exactly what has happened in the run.

Summary

Having a workflow system built into Azure seems like a double-edged sword. On the one hand, you could potentially use it to easily augment functionality and quickly plug holes; on the other hand, you might find very complex workflows popping up all over the system, creating an indecipherable architecture.

Working with Multiple Cloud Providers – Part 3 – Linking Azure and GCP

This is the third and final post in a short series on linking up Azure with GCP (for Christmas). In the first post, I set up a basic Azure function that updated some data in table storage, and then in the second post, I configured the GCP link from PubSub into BigQuery.

In the post, we’ll square this off by adapting the Azure function to post a message directly to PubSub; then, we’ll call the Azure function with Santa’a data, and watch that appear in BigQuery. At least, that was my plan – but Microsoft had other ideas.

It turns out that Azure functions have a dependency on Newtonsoft.Json 9.0.1, and the GCP client libraries require 10+. So instead of being a 10-minute job on Boxing Day to link the two, it turned into a mammoth task. Obviously, I spent the first few hours searching for a way around this – surely other people have faced this, and there’s a redirect, setting, or way of banging the keyboard that makes it work? Turns out not.

The next idea was to experiment with contacting the Google server directly, as is described here. Unfortunately, you still need the Auth libraries.

Finally, I swapped out the function for a WebJob. WebJobs give you a little more flexibility, and have no hard dependencies. So, on with the show (albeit a little more involved than expected).

WebJob

In this post I described how to create a basic WebJob. Here, we’re going to do something similar. In our case, we’re going to listen for an Azure Service Bus Message, and then update the Azure Storage table (as described in the previous post), and call out to GCP to publish a message to PubSub.

Handling a Service Bus Message

We weren’t originally going to take this approach, but I found that WebJobs play much nicer with a Service Bus message, than with trying to get them to fire on a specific endpoint. In terms of scaleability, adding a queue in the middle can only be a good thing. We’ll square off the contactable endpoint at the end with a function that will simply convert the endpoint to a message on the queue. Here’s what the WebJob Program looks like:

public static void ProcessQueueMessage(
    [ServiceBusTrigger("localsantaqueue")] string message,
    TextWriter log,
    [Table("Delivery")] ICollector<TableItem> outputTable)
{
    Console.WriteLine("test");
 
    log.WriteLine(message);
 
    // deserialise the incoming message into a TableItem
    TableItem item = Newtonsoft.Json.JsonConvert.DeserializeObject<TableItem>(message);
    if (string.IsNullOrWhiteSpace(item.PartitionKey)) item.PartitionKey = item.childName.First().ToString();
    if (string.IsNullOrWhiteSpace(item.RowKey)) item.RowKey = item.childName;
 
    outputTable.Add(item);
 
    GCPHelper.AddMessageToPubSub(item).GetAwaiter().GetResult();
    
    log.WriteLine("DeliveryComplete Finished");
 
}

Effectively, this is the same logic as the function (obviously, we now have the GCPHelper, and we’ll come to that in a minute). First, here’s the code for the TableItem model:

[JsonObject(MemberSerialization.OptIn)]
public class TableItem : TableEntity
{
    [JsonProperty]
    public string childName { get; set; }
 
    [JsonProperty]
    public string present { get; set; }
}

As you can see, we need to decorate the members with specific serialisation instructions. The reason is that this model is used by both GCP (which only needs what you see on the screen) and Azure (which needs the inherited properties).
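To make that concrete: because of MemberSerialization.OptIn, only the members marked with [JsonProperty] end up in the JSON that is sent to PubSub, so the inherited TableEntity properties (PartitionKey, RowKey, Timestamp, ETag) are left out. The serialised message therefore looks something like this (values invented):

{"childName":"Alice","present":"Bicycle"}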

GCPHelper

As described here, you’ll need to install the client package for GCP into the Azure Function App that we created in post one of this series (referenced above):

Install-Package Google.Cloud.PubSub.V1 -Pre

Here’s the helper code that I mentioned:

public static class GCPHelper
{
    public static async Task AddMessageToPubSub(TableItem toSend)
    {
        // Serialise the table item; this JSON is what ends up in BigQuery
        string jsonMsg = Newtonsoft.Json.JsonConvert.SerializeObject(toSend);

        // Point the GCP client libraries at the credentials file deployed
        // alongside the WebJob
        Environment.SetEnvironmentVariable(
            "GOOGLE_APPLICATION_CREDENTIALS",
            Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Test-Project-8d8d83hs4hd.json"));
        GrpcEnvironment.SetLogger(new ConsoleLogger());

        string projectId = "test-project-123456";
        TopicName topicName = new TopicName(projectId, "test");

        // Publish the message and shut the publisher down cleanly
        SimplePublisher simplePublisher = 
            await SimplePublisher.CreateAsync(topicName);
        string messageId = 
            await simplePublisher.PublishAsync(jsonMsg);
        await simplePublisher.ShutdownAsync(TimeSpan.FromSeconds(15));
    }
}

I detailed in this post how to create a credentials file; you’ll need to do that to allow the WebJob to be authorised. The Json file referenced above was created using that process.

Azure Config

You’ll need to create an Azure message queue (I’ve called mine localsantaqueue):

I would also download the Service Bus Explorer (I’ll be using it later for testing).

GCP Config

We already have a Dataflow job, a PubSub topic and a BigQuery dataset, so GCP should require no further configuration, except to ensure that the permissions are correct.

The Service Account user (which I give more details of here) needs to have PubSub permissions. For now, we’ll make them an editor, although in this instance they probably only need publish rights:

Test

We can do a quick test using the Service Bus Explorer and publish a message to the queue:

The ultimate test is that we can then see this in the BigQuery Table:

Lastly, the Function

This won’t be a completely function free post. The last step is to create a function that adds a message to the queue:

[FunctionName("Function1")]
public static HttpResponseMessage Run(
    [HttpTrigger(AuthorizationLevel.Function, "post")]HttpRequestMessage req,             
    TraceWriter log,
    [ServiceBus("localsantaqueue")] ICollector<string> queue)
{
    log.Info("C# HTTP trigger function processed a request.");
    var parameters = req.GetQueryNameValuePairs();
    string childName = parameters.First(a => a.Key == "childName").Value;
    string present = parameters.First(a => a.Key == "present").Value;
    string json = $"{{ 'childName': '{childName}', 'present': '{present}' }}";
    queue.Add(json);
    

    return req.CreateResponse(HttpStatusCode.OK);
}

So now we have an endpoint for our imaginary Xamarin app to call into.

Summary

Both GCP and Azure are relatively immature platforms for this kind of interaction. The GCP client libraries seem to be missing functionality (and GCP is still heavily weighted away from .Net). The Azure libraries (especially functions) seem to be in a pickle, too, with strange dependencies that make it very difficult to communicate outside of Azure. As a result, this task (which should have taken an hour or so) took a great deal of time, completely unnecessarily.

Having said that, it is clearly possible to link the two systems, if a little long-winded.

References

https://blog.falafel.com/rest-google-cloud-pubsub-with-oauth/

https://github.com/Azure/azure-functions-vs-build-sdk/issues/107

https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-service-bus

https://stackoverflow.com/questions/48092003/adding-to-a-queue-using-an-azure-function-in-c-sharp/48092276#48092276

Working with Multiple Cloud Providers – Part 2 – Getting Data Into BigQuery

In this post, I described how we might attempt to help Santa and his delivery drivers to deliver presents to every child in the world, using the combined power of Google and Microsoft.

In this, the second part of the series (there will be one more), I’m going to describe how we might set up a GCP pipeline that feeds that data into BigQuery (Google’s big data warehouse offering). We’ll first set up BigQuery, then the PubSub topic, and finally the Dataflow, ready for Part 3, which will join the two systems together.

BigQuery

Once you navigate to the BigQuery section of the GCP console, you’ll be able to create a Dataset:

You can now set up a new table. As this is an illustration, we’ll keep it as simple as possible, but you can see that this might be much more complex:

One thing to bear in mind about BigQuery, and cloud data storage in general, is that it often makes sense to de-normalise your data: storage is often much cheaper than CPU time.

PubSub

Now that we have somewhere to put the data, we could simply have the Azure function write the data into BigQuery. However, we might then run into problems if the data flow suddenly spiked. For this reason, Google recommends the use of PubSub as a shock absorber.

Let’s create a PubSub topic. I’ve written in more detail on this here:

DataFlow

The last piece of the jigsaw is Dataflow. Dataflow can be used for much more complex tasks than simply taking data from one place and putting it in another, but in this case, that’s all we need. Before we can set up a new Dataflow job, we’ll need to create a storage bucket:

We’ll create the bucket as Regional for now:

Remember that the bucket name must be unique (so no-one can ever pick pcm-data-flow-bucket again!)

Now, we’ll move onto the DataFlow itself. We get a number of dataflow templates out of the box; and we’ll use one of those. Let’s launch dataflow from the console:

Here we create a new Dataflow job:

We’ll pick “PubSub to BigQuery”:

You’ll then get asked for the name of the topic (which was created earlier) and the storage bucket (again, created earlier); you’re form should look broadly like this when you’re done:

I strongly recommend specifying a maximum number of workers, at least while you’re testing.

Testing

Finally, we’ll test it. PubSub allows you to publish a message:

Next, visit the Dataflow to see what’s happening:

Looks interesting! Finally, in BigQuery, we can see the data:

Summary

We now have the two separate cloud systems functioning independently. Step three will be to join them together.

Working with Multiple Cloud Providers – Part 1 – Azure Function

Regular readers (if there are such things for this blog) may have noticed that I’ve recently been writing a lot about two main cloud providers. I won’t link to all the articles, but if you’re interested, a quick search for either Azure or Google Cloud Platform will yield several results.

Since it’s Christmas, I thought I’d do something a bit different and try to combine them. This isn’t completely frivolous; both have advantages and disadvantages: GCP is very geared towards big data, whereas the Azure Service Fabric provides a lot of functionality that might fit well with a much smaller LOB app.

So, what if we had the following scenario:

Santa has to deliver presents to every child in the world in one night. Santa is only one man* and Google tells me there are 1.9B children in the world, so he contracts the work out to a series of delivery drivers. That works out at around 79M deliveries every hour; let’s assume that each delivery driver can work 24 hours**. If each driver can make, say, 100 deliveries per hour, that means we need around 790,000 drivers. Every delivery driver has an app that links to their depot, recording deliveries, schedules, etc.

That would be a good app to write in, say, Xamarin, and maybe have an Azure service running it; here’s the obligatory box diagram:

The service might talk to the service bus, control stock, send e-mails: all kinds of LOB jobs. Now, I’m not saying for a second that Azure can’t cope with this, but what if we suddenly want all of these instances to feed metrics into a single data store? There are 190*** countries in the world; if each has a depot, then there are ~416K messages / hour going into each Azure service, but 79M / hour going into a single DB. Because it’s Christmas, let’s assume that Azure can’t cope with this, or let’s say that GCP is a little cheaper at this scale, or that we have some Hadoop jobs that we’d like to run on the data. In theory, we can link these systems, which might look something like this:

So, we have multiple instances of the Azure architecture, and they all feed into a single GCP service.
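For anyone checking the back-of-an-envelope maths behind those figures:

1.9B children / 24 hours ≈ 79M deliveries per hour
79M deliveries per hour / 100 deliveries per driver ≈ 790,000 drivers
79M messages per hour / 190 depots ≈ 416K messages per hour, per depot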

Disclaimer

At no point during this post will I attempt to publish 79M records / hour to GCP BigQuery. Neither will any Xamarin code be written or demonstrated – you have to use your imagination for that bit.

Proof of Concept

Given the disclaimer I’ve just made, calling this a proof of concept seems a little disingenuous; but let’s imagine that we know that the volumes aren’t a problem and concentrate on how to link these together.

Azure Service

Let’s start with the Azure Service. We’ll create an Azure function that accepts a HTTP message, updates a DB and then posts a message to Google PubSub.

Storage

For the purpose of this post, let’s store our individual instance data in Azure Table Storage. I might come back at a later date and work out how and whether it would make sense to use CosmosDB instead.

We’ll set-up a new table called Delivery:

Azure Function

Now that we have somewhere to store the data, let’s create an Azure Function App that updates it. In this example, we’ll create a new Function App from VS:

In order to test this locally, change local.settings.json to point to your storage location described above.
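Something along these lines should do it (the connection name matches the one used in the table binding below; the values are placeholders, so swap in your own storage account details or the storage emulator):

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "AzureWebJobsDashboard": "UseDevelopmentStorage=true",
    "santa_azure_table_storage": "DefaultEndpointsProtocol=https;AccountName=<your-account>;AccountKey=<your-key>;EndpointSuffix=core.windows.net"
  }
}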

And here’s the code to update the table:

    public static class DeliveryComplete
    {
        [FunctionName("DeliveryComplete")]
        public static HttpResponseMessage Run(
            [HttpTrigger(AuthorizationLevel.Function, "post", Route = null)]HttpRequestMessage req, 
            TraceWriter log,            
            [Table("Delivery", Connection = "santa_azure_table_storage")] ICollector<TableItem> outputTable)
        {
            log.Info("C# HTTP trigger function processed a request.");
 
            // parse query parameter
            string childName = req.GetQueryNameValuePairs()
                .FirstOrDefault(q => string.Compare(q.Key, "childName", true) == 0)
                .Value;
 
            string present = req.GetQueryNameValuePairs()
                .FirstOrDefault(q => string.Compare(q.Key, "present", true) == 0)
                .Value;            
 
            var item = new TableItem()
            {
                childName = childName,
                present = present,                
                RowKey = childName,
                PartitionKey = childName.First().ToString()                
            };
 
            outputTable.Add(item);            
 
            return req.CreateResponse(HttpStatusCode.OK);
        }
 
        public class TableItem : TableEntity
        {
            public string childName { get; set; }
            public string present { get; set; }
        }
    }

Testing

There are two ways to test this. The first is to just press F5: that will launch the function as a local service, and you can use Postman or similar to test it; the alternative is to deploy to the cloud. If you choose the latter, your local.settings.json will not come with you, so you’ll need to add an app setting:

Remember to save this setting, otherwise, you’ll get an error saying that it can’t find your setting, and you won’t be able to work out why – ask me how I know!

Now, if you run a test …
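For example, a request along the following lines should do it (the host name and function key are your own; locally, the Functions host typically listens on port 7071 and no key is needed):

POST https://<your-function-app>.azurewebsites.net/api/DeliveryComplete?childName=Alice&present=Bicycle&code=<your-function-key>

POST http://localhost:7071/api/DeliveryComplete?childName=Alice&present=Bicycle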

You should be able to see your table updated (shown here using Storage Explorer):

Summary

We now have a working Azure function that updates a storage table with some basic information. In the next post, we’ll create a GCP service that pipes all this information into BigQuery, and then we’ll link the two systems.

Footnotes

* Remember, all the guys in Santa suits are just helpers.
** That brandy you leave out really hits the spot!
*** I just Googled this – it seems a bit low to me, too.

References

https://docs.microsoft.com/en-us/azure/azure-functions/functions-how-to-use-azure-function-app-settings#manage-app-service-settings

https://anthonychu.ca/post/azure-functions-update-delete-table-storage/

https://stackoverflow.com/questions/44961482/how-to-specify-output-bindings-of-azure-function-from-visual-studio-2017-preview

Debugging Recommendations Engine

Here I wrote about how to set up and configure the MS Azure recommendations engine.

One thing that has become painfully apparent while working with recommendations is how difficult it is to work out what has gone wrong when you don’t get any recommendations. The following is a handy checklist for the next time this happens to me… so others may, or may not, find this useful*:

1. Check the model was correctly generated

Once you have produced a recommendations model, you can access that model by simply navigating to it. The url is in the following format:

{recommendations uri}/ui

For example:

https://pcmrecasd4zzx2asdf.azurewebsites.net/ui

This gives you a screen such as this:

The status (listed in the centre of the screen) tells you whether the build has finished and, if so, whether it succeeded or not.

If the build has failed, you can select that row and drill into it to find out why.

In the following example, there is a reference in the usage data to an item that is not in the catalogue.

Other reasons that the model build may fail include invalid, corrupt or missing data in either file.

2. Check the recommendation in the interface

In order to exclude other factors (such as your own code), you can interrogate the model directly by simply clicking on the “Score” link above; you will be presented with a screen such as this:

In here, you can request direct recommendations to see how the model behaves.

3. Volume

If you find that your score is consistently returning as zero, then the issue may be with the volume of usage data that you have provided. Around 1k** rows of usage data is the sort of volume you should be dealing with; this figure was based on a catalogue of around 20 – 30 products.

4. Distribution

The number of users matters – for the above figures, a minimum of 15** users was necessary to get any scores back. If the data sample is across too small a user base, it won’t return anything.

Footnotes

* Although this post is written by me, and is for my benefit, I stole much of its content from wiser work colleagues.

** Arbitrary values – your mileage may vary.

Adding to an Existing Azure Blob

In this post I briefly cover the concept of Storage Accounts and Blob Storage; however, there is more to blobs than this simple use case. In this post, I’ll explore creating a blob file from a text stream, and then adding to that file.

As is stated in the post referenced above, Azure provides a facility for storing files in what are known as Azure Blobs.

In order to upload a file to a blob, you need a storage account, and a container. Setting these up is a relatively straightforward process and, again, is covered in the post above.

Our application here will take the form of a simple console app that will prompt the user for some text, and then add it to the file in Azure.

Set-up

Once you’ve set-up your console app, you’ll need the Azure NuGet Storage package.

Also, add the connection string for your storage account to the app.config:

<connectionStrings>
    <add name="Storage" connectionString="DefaultEndpointsProtocol=https;AccountName=testblob;AccountKey=wibble/dslkdsjdljdsoicj/rkDL7Ocs+aBuq3hpUnUQ==;EndpointSuffix=core.windows.net"/>
</connectionStrings>

Here’s the basic code for the console app:

static void Main(string[] args)
{
    Console.Write("Please enter text to add to the blob: ");
    string text = Console.ReadLine();
 
    UploadNewText(text);
 
    Console.WriteLine("Done");
    Console.ReadLine();
}

I’ll bet you’re glad I posted that, otherwise you’d have been totally lost. The following snippets are possible implementations of the method UploadNewText().

Uploading to BlockBlob

The following code will upload a file to a blob container:

string connection = ConfigurationManager.ConnectionStrings["Storage"].ConnectionString;
string fileName = "test.txt";
string containerString = "mycontainer";
 
using (MemoryStream stream = new MemoryStream())
using (StreamWriter sw = new StreamWriter(stream))
{
    sw.Write(text);
    sw.Flush();
    stream.Position = 0;
 
    CloudStorageAccount storage = CloudStorageAccount.Parse(connection);
    CloudBlobClient client = storage.CreateCloudBlobClient();
    CloudBlobContainer container = client.GetContainerReference(containerString);
    CloudBlockBlob blob = container.GetBlockBlobReference(fileName);
    blob.UploadFromStream(stream);
}

(note that the name of the container in this code is case sensitive)

If we have a look at the storage account, a text file has, indeed been created:

New Blob

But, what if we want to add to that? Well, running the same code again will work, but it will replace the existing file. To prove that, I’ve changed the text to “Test data 2” and run it again:

Test Data

So, how do we add to the file? One possibility is to download the existing file, add to it, and upload it again; that would look something like this:

string connection = ConfigurationManager.ConnectionStrings["Storage"].ConnectionString;
string fileName = "test.txt";
string containerString = "mycontainer";
 
CloudStorageAccount storage = CloudStorageAccount.Parse(connection);
CloudBlobClient client = storage.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference(containerString);
CloudBlockBlob blob = container.GetBlockBlobReference(fileName);
 
using (MemoryStream stream = new MemoryStream())
{
    blob.DownloadToStream(stream);
 
    using (StreamWriter sw = new StreamWriter(stream))
    {
        sw.Write(text);
        sw.Flush();
        stream.Position = 0;
 
        blob.UploadFromStream(stream);
    }
}

This obviously means two round trips to the server, which isn’t the best thing in the world. Another possible option is to use the Append Blob…

Azure Append Blob Storage

There is a blob type that allows you to append to it without downloading the existing content first; for example:

string connection = ConfigurationManager.ConnectionStrings["Storage"].ConnectionString;
string fileName = "testAppend.txt";
string containerString = "mycontainer";
 
CloudStorageAccount storage = CloudStorageAccount.Parse(connection);
CloudBlobClient client = storage.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference(containerString);
CloudAppendBlob blob = container.GetAppendBlobReference(fileName);
if (!blob.Exists()) blob.CreateOrReplace();
 
using (MemoryStream stream = new MemoryStream())
using (StreamWriter sw = new StreamWriter(stream))
{
    sw.Write("Test data 4");
    sw.Flush();
    stream.Position = 0;
 
    blob.AppendFromStream(stream);                
}

There are a few things to note here:

  • The reason that I changed the name of the blob is that you can’t append to a BlockBlob (at least not using an AppendBlob); so it has to have been created for the purpose of appending.
  • While UploadFromStream will just create the file if it doesn’t exist, with the AppendBlob, you need to do it explicitly.

PutBlock

The final alternative here is to use PutBlock. This can bridge the gap by allowing the addition of blocks to an existing block blob. However, you either need to maintain the block ID list manually, or download the existing block list; here’s an example of creating, or adding to, a file using the PutBlock method:

string connection = ConfigurationManager.ConnectionStrings["Storage"].ConnectionString;
string fileName = "test4.txt";
string containerString = "mycontainer";
 
CloudStorageAccount storage = CloudStorageAccount.Parse(connection);
CloudBlobClient client = storage.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference(containerString);
CloudBlockBlob blob = container.GetBlockBlobReference(fileName);
 
ShowBlobBlockList(blob);
 
using (MemoryStream stream = new MemoryStream())
using (StreamWriter sw = new StreamWriter(stream))
{
    sw.Write(text);
    sw.Flush();
    stream.Position = 0;
 
    double seconds = (DateTime.Now - new DateTime(2000, 1, 1)).TotalSeconds;
    string blockId = Convert.ToBase64String(
        ASCIIEncoding.ASCII.GetBytes(seconds.ToString()));
 
    Console.WriteLine(blockId);
    //string blockHash = GetMD5HashFromStream(bytes);                
 
    List<string> newList = new List<string>();
    if (blob.Exists())
    {
        IEnumerable<ListBlockItem> blockList = blob.DownloadBlockList();
 
        newList.AddRange(blockList.Select(a => a.Name));
    }
 
    newList.Add(blockId);
 
    blob.PutBlock(blockId, stream, null);
    blob.PutBlockList(newList.ToArray());
}

The code above owes a lot to the advice given on this Stack Overflow question.

In order to avoid conflicts in the Block Ids, I’ve used a count of seconds since an arbitrary date. Obviously, this won’t work in all cases. Further, it’s worth noting that the code above still does two trips to the server (it has to download the block list).
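If the seconds-since-2000 approach feels too fragile, one alternative (an untested sketch, rather than what the code above actually uses) is to base the block ID on a GUID; every ID is then unique and, since the storage service expects all block IDs within a blob to be the same length, consistently sized:

string blockId = Convert.ToBase64String(Guid.NewGuid().ToByteArray());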

The commented MD5 hash allows you to provide some form of check on the data being valid, should you choose to use it.
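The post doesn’t include GetMD5HashFromStream, but a plausible implementation (an illustrative sketch; it needs System.Security.Cryptography) might look like this. The resulting base64 string can be passed as the third argument to PutBlock so that the storage service can validate the block on arrival:

public static string GetMD5HashFromStream(Stream stream)
{
    using (MD5 md5 = MD5.Create())
    {
        // Hash the whole stream, then rewind it so it can still be uploaded
        stream.Position = 0;
        byte[] hash = md5.ComputeHash(stream);
        stream.Position = 0;

        // Azure expects the Content-MD5 value as a base64 string
        return Convert.ToBase64String(hash);
    }
}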

What is ShowBlobBlockList(blob)?

The following function will give some details relating to the existing blocks (it is shamelessly plagiarised from here):

public static void ShowBlobBlockList(CloudBlockBlob blockBlob)
{
    if (!blockBlob.Exists()) return;
 
    IEnumerable<ListBlockItem> blockList = blockBlob.DownloadBlockList(BlockListingFilter.All);
    int index = 0;
    foreach (ListBlockItem blockListItem in blockList)
    {
        index++;
        Console.WriteLine("Block# {0}, BlockID: {1}, Size: {2}, Committed: {3}",
            index, blockListItem.Name, blockListItem.Length, blockListItem.Committed);
    }
}

Summary

Despite being an established technology, these methods and techniques are sparsely documented on the web. Obviously, there are Microsoft docs, and they are helpful, but, unfortunately, not exhaustive.

References

https://stackoverflow.com/questions/33088964/append-to-azure-append-blob-using-appendtextasync-results-in-missing-data

https://docs.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs

http://www.c-sharpcorner.com/UploadFile/40e97e/windows-azure-blockblob-putblock-method/

https://docs.microsoft.com/is-is/rest/api/storageservices/put-block

https://www.red-gate.com/simple-talk/cloud/platform-as-a-service/azure-blob-storage-part-4-uploading-large-blobs/

https://stackoverflow.com/questions/46368954/can-putblock-be-used-to-append-to-an-existing-blockblob-in-azure

Azure Recommendations

Azure provides a number of pre-configured machine learning services out of the box. One of these (still in beta at the time of writing) is Recommendations. The idea is that it will try to work out what, given a list of items, you would prefer, based on knowledge about your habits. There’s a lot of information online about this but, briefly, it can work out what your preference is based on a combination of your past activity and the past activity of others that have shown an interest in the same items.

Obviously, the “items” could be products, films, aardvarks, or sheep; Azure doesn’t know anything about the content of what its recommending; if you’ve bought* “A” in the past, and 75% of everyone else that has bought “A” has also bought “B” then there’s a chance you’ll want “B”. “A” could be an apple, and “B” could be a pair of sunglasses; so obviously, you need to be careful about the data that you feed it.

Recommendations API

The first thing to note is that the Recommendations API represents an earlier attempt by Microsoft to implement this, and is due to be discontinued early next year (2018). If you try to use it to follow any of the online tutorials, you’ll get into a world of hurt.

Deploy Recommendations

The new method of creating a recommendations service is via a wizard (which, I believe, builds a custom ARM template behind the scenes). This is the start of the deployment, and it gives you a screen similar to the following (once you’ve logged in):

As you can see, there’s clearly some re-branding in progress here; anyway, complete the form and create the service.

Another thing that has changed in the new version of this is that the free pricing tier has disappeared:

After a few screens, it starts the deployment process:

The next screen that is displayed shows all the connection strings and keys in one handy reference:

… they are just below this screenshot.

This should create four separate services:

Sample Project

Microsoft provides a sample project that should work out of the box (they actually provide more than one – some of which work better than others). This one uses AutoRest, but there’s another referenced at the bottom of this post.

In this project, open Recommendations.Sample.Program.cs and, at the top of the main function, enter the details that you noted after the creation of your service. If you didn’t note them, then you can still find them. You’ll notice that four separate services were created: an App Service, a Storage Account, App Insights and an App Service Plan.

recommendationsEndPointUri

Is found in the URL of the App Service:

apiAdminKey

Is found in the Application Settings of the App Service:

connectionString

Is the connection string of the storage account:

If you run this now, you should find that it will process and score the recommendations:

So, for example, we can see from the results above that people that bought DHF-01159 are recommended to buy DHF-01055 (although it doesn’t seem very convinced).

Footnotes

* The term that Azure uses here is “Purchase”. Different actions have different weighting (configurable), but by default, you would assume that buying something is more important than, for example, clicking on it (“Click” is another action). These actions can mean anything you choose; in the sheep example above, “Purchase” might mean shearing, and “Click” might mean photographing.

References

http://pmichaels.net/2017/08/06/deploying-azure-recommendation-service-using-arm-template/

https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-overview-of-diagnostic-logs

https://docs.microsoft.com/en-us/azure/cognitive-services/cognitive-services-recommendations-ui-intro

https://gallery.cortanaintelligence.com/Tutorial/Recommendations-Solution

https://github.com/Microsoft/Cognitive-Recommendations-Windows

SendGrid – Azure e-mail functionality

In this post, I discussed the prospect of sending an e-mail from an Azure function in order to alert someone that something had gone wrong. In one of the comments, it was suggested that I should look into a third party tool called “SendGrid”, and this post is the result of that investigation.

Azure Configuration

SendGrid is a third party application, and so the first thing you need to do is to create an account:

The free tier covers you for 25,000 e-mails per month. However, you do get a scary warning that, because this isn’t a Microsoft product, it is not covered by Azure credits.

Anyway, click create and, after a while, your new SendGrid account should be created:

You’ll need to get the API Key: to do this, select Manage:

That takes you to https://app.sendgrid.com, where you can select to create an API Key:

Clearly, you wouldn’t want a full access account to just send an e-mail in real life… but Restricted Access has a form that would take longer to fill in, and I can’t be mythered*… so we’ll go with full access for now.

Once you’ve created it, and given it a name, you should have a key (remember that key – don’t write it down, or copy it, you must remember it!).

Code

Create a new Function App, and add the SendGrid NuGet package:

https://www.nuget.org/packages/Sendgrid/

In this case, let’s create a HttpTrigger function (this will fire when a web address is accessed); the body of which needs to look something like this:

        [FunctionName("SendEmail")]
        public static async Task<HttpResponseMessage> Run(
            [HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)]HttpRequestMessage req, 
            TraceWriter log)
        {
            log.Info("C# HTTP trigger function processed a request.");

            // parse query parameter
            var addresses = req.GetQueryNameValuePairs();
            string[] addressArr = addresses
                .Where(a => a.Key == "address")
                .Select(a => a.Value).ToArray();

            if (addressArr.Count() == 0)
            {
                return req.CreateResponse(HttpStatusCode.BadRequest, 
                    "Please pass an address in the query string");
            }
            else
            {
                HttpStatusCode status = await CallSendEmailAsync(
                    "[email protected]", addressArr, "test e-mail",
                    "Once more unto the breach, dear friends, once more;\n" +
                    "Or close the wall up with our English dead.   \n" +
                    "In peace there's nothing so becomes a man     \n" +
                    "As modest stillness and humility:             \n" +
                    "But when the blast of war blows in our ears,  \n" +
                    "Then imitate the action of the tiger;         \n" +
                    "Stiffen the sinews, summon up the blood,");
                switch (status)
                {
                    case HttpStatusCode.OK:
                    case HttpStatusCode.Accepted:
                    {
                        return req.CreateResponse(status, "Mail Successfully Sent");
                    }
                    default:
                    {
                        return req.CreateResponse(status, "Unable to send e-mail");
                    }
                }
            }
        }

The `CallSendEmailAsync` helper method might look like this:

public static async Task<HttpStatusCode> CallSendEmailAsync(string from, string[] recipients, string subject, string body)
{
    EmailAddress fromAddress = new EmailAddress(from);
 
    SendGridMessage message = new SendGridMessage()
    {
        From = fromAddress,
        Subject = subject,
        HtmlContent = body
    };
 
    message.AddTos(recipients.Select(r => { return new EmailAddress(r); }).ToList());
   
    string sendGridApiKey = "AB.keythisisthekey.nkfdhfkjfhkjfd0ei8L9xTyaTCzy_sV5gPJNX-3";
 
    SendGridClient client = new SendGridClient(sendGridApiKey);
    Response response = await client.SendEmailAsync(message);
 
    return response.StatusCode;
}

The key is the string that I said you should remember earlier.

You can just paste the URL into a browser, give it the e-mail addresses in the format:

https://<your-function-url>?address=recipient.one@example.com&address=recipient.two@example.com

As you will see from the link at the bottom of this post, OK (200) signifies that the message is valid, but not queued to be delivered; and Accepted (202) indicates that the message is valid and queued. My guess is that if you get an OK, it means that the mail has already been delivered.

Footnotes

* It probably took longer to type this out than to do it.

References

https://go.sendgrid.com/

https://sendgrid.com/docs/API_Reference/Web_API_v3/Mail/errors.html

AutoRest

In the world of WCF and SOAP, if you want to consume a WCF service, you simply ask VS to add a service reference, direct it to the WSDL file, and it generates a client for you. Wouldn’t it be nice if you could do the same for a REST service!

It turns out that you can (although, admittedly, the process is not as seamless). The tool of choice for providing this WSDL emulation is called Swagger. This post isn’t about Swagger, and it assumes that you know your Swagger source. Azure has its own flavour of Swagger, called “Azure Swagger” (seriously). Consuming that is the main focus of this post.

AutoRest

AutoRest is the client-side tool that allows you to automatically create proxy classes for your REST interactions, and is available via NuGet… but the NuGet version doesn’t work at the time of writing.

Instead, use Node Package Manager (NPM). First, install it, via this link.

Next, install AutoRest. Your friend here is the Package Manager Console:

npm install -g autorest

Update the version (this is, as I understand it, not strictly necessary, as it will install the latest version):

autorest --latest

Finally, check the version that is installed (okay, clearly this isn’t necessary at all, but I like to check that the computer and I both have a synchronised view of events as often as possible):

autorest --list-installed

Next, check where you are:

Then call AutoRest:

AutoRest -Input [SwaggerLocation] -Namespace [Namespace]

A note on Namespaces

It’s tempting to be dismissive of the namespace. Remember that, the namespace that you give will be assigned to classes and used by classes… so, if you were, for example, to give a namespace that matched the class name (so, for example, if you’re creating Recommendations, you’ll have a class called “Recommendations”) then you’ll find you’ll get thousands of strange errors that apparently make no sense. Think carefully about the namespace!

Where is the Swagger Location?

Obviously, the answer is: it depends… however, for Azure, most services have it readily available; let’s take Recommendations:

References

https://github.com/Azure/autorest

https://azure.microsoft.com/en-gb/resources/videos/inside-autorest-with-david-justice/

https://www.nuget.org/packages/autorest/

https://dzimchuk.net/generating-clients-for-your-apis-with-autorest/

https://github.com/Azure/autorest/blob/master/README.md#installing-autorest