Impacts on Productivity

It was after reading the excellent book Soft Skills: The software developer’s life manual by John Sonmez that I decided to push forward with furthering my career by starting a blog and creating YouTube videos. I did a number of posts and episodes of a web series before intentionally taking a break to analyse what I was doing and look at how I could improve. Over the last couple of months I have spent time planning projects and reading various books on development processes, as opposed to specific technologies. So before I start writing my next series of posts covering specific coding projects, I wanted to get my thoughts down on a topic that I often find both interesting and frustrating: productivity.

Almost all companies developing software follow one of the development methodologies. I have never experienced one that didn’t, but I’m sure they exist and they are probably quite chaotic. Having a defined method helps maintain control and ensures people are following the same processes. But just saying you are following Agile or Scrum or even Waterfall does not mean you are being as productive as possible within your chosen methodology.

I have heard of places “rigidly enforcing agile”. This doesn’t even make sense to me. Agile allows you to quickly adapt and adjust, so by rigidly following processes you can very easily end up stifling the qualities that you are trying to instil in your teams. I previously worked in government, where following the processes appeared more important than the products being delivered. We even had far more people managing and administering processes than we ever had in the development teams. This didn’t help with delivery; in fact, it was rare for us to complete a product, and when we did, teams were more relieved it was over than happy with the quality of the work.

I was looking at productivity from a work perspective, but after reading another great book, Scrum by Jeff Sutherland, I decided to look at myself and where I reduce my own productivity in general. It turns out that the reasons and results for personal and work productivity aligned quite well.

At times I have a short attention span. My long-suffering partner may say that’s more often than I would, but it’s just part of who I am. One of the results of this is that I will often have several books on the go at any one time, along with other projects and even video games. To me this has always been normal and I didn’t see a problem with it. So I carried out an experiment and chose my reading as the first thing to work on. Over a year I read on average 2 books a month, and I would typically be reading 2 different fiction books, 1 non-fiction title such as a biography or popular science, and 1 related to technology and software. I would often read in the evening after 9pm and would jump in and out of the books depending on my mood. Over the last couple of months I have set a new rule: only read 1 book at a time. I am now reading 3-4 books a month, so about double if I keep to this over a year. And it comes down to the simple fact that I remained engaged with the same book; I didn’t have to remember where I was, or get back into the story or subject, as I was already there. It’s not that I sped up my reading, it was just more focused, so I didn’t have to pause when I started and I also read for longer in each session.

At work we will often switch between projects based on the work items in our sprint. Our work items are prioritised by how important they are to the product and their impact on the customer. Often the priorities put bugs first, then other backlog items. At face value this makes perfect sense: why create a new feature when your old ones are broken? So I did another experiment. I reordered the items in my sprint by product area: I did all the backup bugs and features first, then looked at a group of caching bugs and features. By doing this I ignored my normal prioritisation order, and my velocity increased. However, later in the sprint, due to various things going on, I moved between different areas repeatedly. Guess what? My velocity dropped like a stone. My takeaway from this was that I need to come up with a prioritisation system that takes into account the potential drop in productivity. If I can do more, more quickly, then delaying that priority 2 bug to later in the sprint is not really going to impact its delivery date. While I’m talking about priorities, take a look at how you prioritise bugs. On a 1-5 scale people seem far more confident assigning a 1 or 2 to a bug than a 5. Is that typo that doesn’t change the meaning of the message really a 3 or 4? Be honest, it’s probably a 5, as it’s been there for a year and nobody has noticed.

My next observation on productivity was another from my personal life: smartphones and the internet. Now, I work in technology and am a big fan of what is now available on the internet and the information I can access on my phone, but it has its place. How much time do people spend on their phones at home? Just sitting there looking at Facebook as if somebody would have done something spectacular in the last ten minutes. I have an odd and unpredictable group of friends, but still, the vast majority of the time nothing will have happened. Or looking up one thing on Wikipedia and then ending up reading about something really random and unrelated. I ended up on the Salem witch trials a few weeks ago, which, while very interesting, didn’t provide any help with developing for Google Home, which was what I was supposed to be doing. So now I am trying to turn off the phone at home for a few hours when I’m working. So far it’s working, and not just for my development projects, but also in stopping me missing out on jobs around the house.

Now, I’m a professional and I’m not going to sit at work just playing on my phone, but that 1 minute check every half hour to an hour is disruptive to the flow of work. People have become so used to checking that it is becoming a muscle memory action. I forgot my phone one day last week and I didn’t miss it. In fact, when I got home I had 10 emails on my personal account, all marketing emails, and 1 text message, ironically from my girlfriend letting me know I had forgotten my phone. But going back to my earlier point about losing productivity by switching focus, not having my phone did help as I wasn’t distracted. In the Soft Skills book I mentioned at the start, the author talks about batching tasks and specifically focuses on emails. When an email comes in you get a notification and a little icon, and you have to check in case it’s something important. It rarely is, but since you’re there you reply, do what was asked, or even just think and plan what you would need to do. Continuing my experiments, I tried the batching approach and it worked. I only checked and responded to emails in 3 blocks during the working day: at the start of the day, after lunch and before I left. Not only did I find this approach helped me focus and be more productive, I realised something else: nobody noticed that I was doing it. When I started I expected emails saying “do you have an update on my last email?”, but it turns out that not replying for 3 hours doesn’t bring the building down.

My final observations for this post are specifically regarding Scrum. Now, I am a big fan of Scrum; quite simply, it works. My caveat is that it works when followed sensibly. Just having daily stand-ups and sprints does not mean you are following Scrum or aiding productivity.

I once did a sprint at home. I had let a lot of DIY jobs build up to the point that I didn’t want to face any of them, so I wrote a list that ranged from grouting tiles and replacing a light fitting to re-plastering a wall, and then prioritised it. I originally planned to do it in 1 week, before I was laughed at and told I should probably make it 2. But it worked. I’ve seen sprints in software where the key concept of having a demonstrable product is not adhered to. The team works on features and items slip between sprints; often it is the QA-type tasks that are moved to the next sprint and even beyond that. Every few sprints a nice feature can be presented. The problem I see with this is that you are adding the retrospectives and planning meetings as if you were in a sprint, but not providing the deliverable of the sprint. In reality you are just reducing your productivity by adding processes for the sake of them. If this is the case then you should reassess and work out what you can deliver in a sprint.

Stand-up meetings are another essential part of a sprint. These meetings should leave you energised and ready to take on your tasks, knowing that blockers are being addressed. They should also be short. A 10-15 minute stand-up is in reality a 15-20 minute break in work, as people have to get up from their desks and move to the meeting. Now, if you make this meeting 30 minutes, it gets worse. If your stand-ups are going to be that long, watch how many people get up and go get a coffee before or after the meeting (in some cases both). A 30 minute meeting is an investment in time and you will see people treat it as much more than just an update: not by way of the content they provide in the meeting, but in how they prepare for attending. So now you are at a 40-45 minute disruption to work, which over a working week adds up to around 3.5 hours, or roughly half a working day, for each person attending.

This was quite a long post, and I have purposely not tried to provide solutions to the problems; most of what I have looked at were things I noticed by examining my own productivity and how it changes through various factors. But what I want to get across is that we should look at our own actions to see if we are doing something that reduces our productivity. We should also check that when we are following a methodology, we really are following it and not just going through the processes for their own sake. If you are putting more effort into doing Agile than being agile, maybe some adjustments are needed.

Getting Started with DocumentDB and Azure Functions

Introduction
Serverless computing is another buzzword being passed around at the moment. Instead of setting up web servers and database servers, you create everything in the cloud; you don’t even need to use a development environment, as you can enter everything from your browser.

I can be quite cynical at times when “the next big thing” is being talked about, but this (along with bots and AI) is one I can really get behind. Several real-world scenarios I have faced recently have been easily solved using these or their Amazon Web Services equivalents. I will share an example below.

I was faced with coming up with a solution that coincidentally involved a bot at one stage of the application. We needed to get a small amount of information about the current user that would allow us to direct information to the correct web services. I needed to make it as lightweight as possible because, due to the complexity of components outside of my control, slow performance was a concern. Only a week or two earlier I had created my first Alexa Skill using two parts of AWS: DynamoDB, their NoSQL database, and Lambda, their serverless functions. The speed and reduced overhead that these provided made me think that they would be the solution. Most of my team are .NET developers, so I decided to use Azure for this specific project, which meant using DocumentDB and some Azure Functions.

In this post I am going to go through the basic process of creating a DocumentDB database and then four functions to replicate the basic CRUD commands.

DocumentDB
DocumentDB is a NoSQL database, which, to be honest, is a term I have heard for a while but only really paid attention to recently. To me it was SQL that made databases, except for that one week where I did the old Java Developer certification and you had to write an old-fashioned application!

Within a DocumentDB database you save Documents, which are JSON objects. This makes them a great fit for use with a Web API service.
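For illustration, the User documents we create later in this post are stored as plain JSON along these lines (the id and JobTitle values here are purely made up; DocumentDB also adds its own system properties, such as _rid, _etag and _ts, to each stored document):

{
  "id": "12345",
  "JobTitle": "Developer"
}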

We are going to create a new DocumentDB, so within the Azure Portal we add a new item and search for DocumentDB. The initial creation wizard is very simple: enter a unique ID and the other settings here. These are all pretty standard across Azure. The wizard does give you the option to select DocumentDB or MongoDB; this is really aimed at users who are already using MongoDB and intending to migrate over, so we are going to leave it as the default of DocumentDB.

CreateDocumentDB

Once we select create it will take a minute to provision and deploy this for us.

InitiallyDeployed

Select Add Database and then complete the extensive wizard for this: just enter a name!

DB

We now need to add a collection to this database. Be warned that collections are billable, so if you are not using this for production purposes, reduce the throughput (RU/s) to the lowest possible (400 as of writing this).

AddCol1

AddCol2

With that done, we have our DocumentDB set up with all that is needed. We will come back later to get the connection details, but we shouldn’t need to carry out any further configuration.

Azure Function

Azure Functions are lightweight, serverless and event driven. It even says that on the Azure site. What that really means is: you don’t have to code much, they are fast, Azure handles the scaling, and you can call them from pretty much anything. I’m not going to go into setting up webhooks or timer-based triggers in this post; I’m just going to use basic HTTP triggers. But most of the principles are the same, and Azure has a large stock of sample functions to help with whatever trigger you want to use.

First off we need to add a new Function App. This uses a pretty basic Azure creation wizard.

CreateFunctionApp

Once the Function App has finished deploying and you select it, it will open across almost the full width of the browser and you will be shown a getting started page. You can choose your function type from here, or if you select New Function you will be able to see all the options available. I have used the drop-down lists to refine the search and selected the HttpTrigger.

Create

For the purposes of this demo we don’t need to worry about the Authorization level, so I’m going to leave it at Function. The steps here are generic and we will use them throughout the rest of the document to build each of the four functions.

Once the function has been created you will see something similar to this:

capture10

We won’t be needing the boilerplate code, but before deleting it, if you haven’t used Functions before, it may be worth scrolling to the bottom of the display and testing the simulator. By entering a value for name into the JSON you get a simple response from the function when you press Run.

capture13

Before we move on, you can check out the Integrate and Manage sections selected from the left-hand panel. Integrate allows you to select the types of trigger and output (including parameters) you want your function to have, along with settings such as Authentication. Most of this is set when you choose your function type, but you can make changes here if, say, you wanted to have an action triggered by a DocumentDB but with a request sent out via HTTP to another service.

capture11

From the manage tab you can enable/disable or delete your function.

capture12

With our function created, we can now write some code into the browser.

Adding a NuGet Package to your Function

This step will be used within all the functions in this post, so I’m putting it at the top and getting it out of the way. From the Function UI you cannot directly add NuGet packages, and to connect to DocumentDB you need a NuGet package installed. Problem? Nope, not really. Although you can’t add them in the normal way, they are still easily supported. All you need to do is select View Files and add a new file called Project.json, as shown below:

capture14

Inside that file you add the following JSON. If you need other examples of this, or are unsure of the settings for a particular NuGet package, then you can just copy the details from the project.json file in any Visual Studio solution. That’s what I did!

{
  "frameworks": {
    "net46":{
      "dependencies": {
        "Microsoft.Azure.DocumentDB": "1.10.0"
      }
    }
   }
}

Function 1 – Create New Document

For our first function we are going to create a new document based on the JSON received in the body of the request. The full code is below and will be explained underneath.

using System.Net;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Newtonsoft.Json;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    string DB = "Users";
    string COLLECTION = "UsersCollection";
    string ENDPOINT = "<YOUR-ENDPOINT>";
    string KEY = "<YOUR-KEY>";

    dynamic data = await req.Content.ReadAsAsync<object>();
    string id = data?.Id;
    if (string.IsNullOrEmpty(id))
        return req.CreateResponse(HttpStatusCode.BadRequest, "Please pass an Id in the request body");
    DocumentClient client = new DocumentClient(new Uri(ENDPOINT), KEY);
    try
    {
        await client.ReadDocumentAsync(UriFactory.CreateDocumentUri(DB, COLLECTION, id));

        return req.CreateResponse(HttpStatusCode.Conflict, "This user already exists in the database.");
    }
    catch (DocumentClientException ex)
    {
        if (ex.StatusCode == HttpStatusCode.NotFound)
        {
            User u = new User();
            u.Id = data?.Id;
            u.JobTitle = data?.JobTitle;
            await client.CreateDocumentAsync(UriFactory.CreateDocumentCollectionUri(DB, COLLECTION), u);

            return req.CreateResponse(HttpStatusCode.OK, "The following user was created successfully: " + id);
        }
        else
        {
            return req.CreateResponse(HttpStatusCode.BadRequest, "Please pass a name on the query string or in the request body");
        }
    }
}

public class User
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
    public string JobTitle { get; set; }
}

First I have created a few variables to contain the details needed to connect to the DocumentDB that we created earlier. I then read the body of the request into a dynamic and take the Id value from the JSON to use later on. To simplify this post I am only checking the body for the Id; you could also check the header, to allow developers to choose how they want to send requests.

Using the Microsoft.Azure.DocumentDB package we added via our Project.json file, we first create a DocumentClient to allow us to access our database, then inside a try/catch block we attempt to read the document. If this is successful, the document already exists, so we return an error response to the user. If the document isn’t found, an exception is thrown. We check whether its status code is HttpStatusCode.NotFound, and if it is, we create the document using the CreateDocumentAsync method.

There is also a small User class at the end of the file for serializing the JSON.

The DocumentDB library has a UriFactory static class which you may have noticed we have used at several points. This class provides helpers for creating all the types of Uri needed for accessing the database.
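As a quick illustration of what those helpers produce (using the database and collection names from this post and a made-up document id), the two we rely on build URIs of the following form:

Uri collectionUri = UriFactory.CreateDocumentCollectionUri("Users", "UsersCollection");
// dbs/Users/colls/UsersCollection

Uri documentUri = UriFactory.CreateDocumentUri("Users", "UsersCollection", "12345");
// dbs/Users/colls/UsersCollection/docs/12345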

Function 2 – Get a Document

Now that we have created a document in the database, we need to be able to retrieve it. We could create a function that allows us to do a complex search, or one that returns only specific values. However, for the purposes of this walkthrough, we are going to simply return the whole JSON document and allow the calling system to deal with anything else.

As with the previous example, the whole code is shown below and explained underneath.

using System.Net;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Newtonsoft.Json;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    string DB = "Users";
    string COLLECTION = "UsersCollection";
    string ENDPOINT = "<YOUR-ENDPOINT>";
    string KEY = "<YOUR-KEY>";

    dynamic data = await req.Content.ReadAsAsync<object>();
    string id = data?.Id;
    if (string.IsNullOrEmpty(id))
        return req.CreateResponse(HttpStatusCode.BadRequest, "Please pass an Id in the request body");
    DocumentClient client = new DocumentClient(new Uri(ENDPOINT), KEY);
    try
    {
        IQueryable<User> users = client.CreateDocumentQuery<User>(UriFactory.CreateDocumentCollectionUri(DB, COLLECTION)).Where(u => u.Id == id);
        User currentUser = new User();
        foreach(var u in users)
        {
            log.Info(u.Id);
            currentUser = u;
        }
        if (string.IsNullOrEmpty(currentUser.Id))
            return req.CreateResponse(HttpStatusCode.BadRequest, "Unable to access user within DB");
        string rtn = JsonConvert.SerializeObject(currentUser);

        return req.CreateResponse(HttpStatusCode.OK, $"{rtn}");
    }
    catch (DocumentClientException ex)
    {
        if (ex.StatusCode == HttpStatusCode.NotFound)
            return req.CreateResponse(HttpStatusCode.BadRequest, "This user does not exist in the database.");

        return req.CreateResponse(HttpStatusCode.BadRequest, $"An unknown error has occured. Message: {ex.Message}");
    }
}

public class User
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
    public string JobTitle { get; set; }
}

A fair amount of the code for this function is the same as the previous example. What I have done, inside the try/catch block, is call the CreateDocumentQuery method on my DocumentClient. This returns an IQueryable of User objects. As I am trying to keep the demonstration simple I have not implemented IEnumerable on my User class, so I am still using a foreach loop (not the cleanest code, but it allows the point to be made). As I am searching only by the Id, which as the partition key on the DocumentDB must be unique, we can be sure only a single value will be returned.

We then use the SerializeObject method from Newtonsoft.Json’s JsonConvert class to create a string that is returned as the body of the response. This enables the calling system to retrieve the full JSON document from the database.

Function 3 – Update a Document

As we are using DocumentDB and just storing JSON objects, instead of updating an object directly we are going to replace the JSON object with an updated copy.

using System.Net;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Newtonsoft.Json;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    string DB = "Users";
    string COLLECTION = "UsersCollection";
    string ENDPOINT = "<YOUR-ENDPOINT>";
    string KEY = "<YOUR-KEY>";

    dynamic data = await req.Content.ReadAsAsync<object>();
    string id = data?.Id;
    string jobTitle = data?.JobTitle;
    if (string.IsNullOrEmpty(id))
        return req.CreateResponse(HttpStatusCode.BadRequest, "Please pass an Id in the request body");
    if (string.IsNullOrEmpty(jobTitle))
        return req.CreateResponse(HttpStatusCode.BadRequest, "No job title provided for update.");
    DocumentClient client = new DocumentClient(new Uri(ENDPOINT), KEY);
    try
    {
        IQueryable<User> users = client.CreateDocumentQuery<User>(UriFactory.CreateDocumentCollectionUri(DB, COLLECTION)).Where(u => u.Id == id);
        User userToUpdate = new User();
        foreach(var u in users)
        {
            userToUpdate = u;
        }
        if (string.IsNullOrEmpty(userToUpdate.Id))
            return req.CreateResponse(HttpStatusCode.BadRequest, "Unable to access user within DB");
        userToUpdate.JobTitle = jobTitle;
        await client.ReplaceDocumentAsync(UriFactory.CreateDocumentUri(DB, COLLECTION, id), userToUpdate);

        return req.CreateResponse(HttpStatusCode.OK, $"The job title for user {id} has been updated to {jobTitle}");
    }
    catch (DocumentClientException ex)
    {
        if (ex.StatusCode == HttpStatusCode.NotFound)
            return req.CreateResponse(HttpStatusCode.BadRequest, "This user does not exist in the database.");
        return req.CreateResponse(HttpStatusCode.BadRequest, $"An unknown error has occured. Message: {ex.Message}");
    }
}

public class User
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
    public string JobTitle { get; set; }
}

First we use the CreateDocumentQuery method as we did in our retrieve function. After checking that the document exists, we edit it to include the new JobTitle. Finally we call the ReplaceDocumentAsync method on our DocumentClient, passing the document URI (built using the UriFactory class) and the updated object to store in the DB.

Function 4 – Delete a Document

Our final function to create our basic CRUD functionality is to delete a Document.

using System.Net;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Newtonsoft.Json;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    string DB = "Users";
    string COLLECTION = "UsersCollection";
    string ENDPOINT = "<YOUR-ENDPOINT>";
    string KEY = "<YOUR-KEY>";

    dynamic data = await req.Content.ReadAsAsync<object>();
    string id = data?.Id;
    if (string.IsNullOrEmpty(id))
        return req.CreateResponse(HttpStatusCode.BadRequest, "Please pass an Id in the request body");
    DocumentClient client = new DocumentClient(new Uri(ENDPOINT), KEY);
    try
    {
        await client.ReadDocumentAsync(UriFactory.CreateDocumentUri(DB, COLLECTION, id));
        await client.DeleteDocumentAsync(UriFactory.CreateDocumentUri(DB, COLLECTION, id));

        return req.CreateResponse(HttpStatusCode.OK, $"The following user has been deleted from the database : {id}");
    }
    catch (DocumentClientException ex)
    {
        if (ex.StatusCode == HttpStatusCode.NotFound)
            return req.CreateResponse(HttpStatusCode.BadRequest, "This user does not exist in the database.");
        return req.CreateResponse(HttpStatusCode.BadRequest, $"An unknown error has occured. Message: {ex.Message}");
    }
}

Unlike our previous functions, the delete function does not need to know about the User object, so the class is not included. Our first step, after checking the request has an Id, is to try to read the document. This allows us to confirm that a document with the provided Id does indeed exist. If it does, we then call DeleteDocumentAsync on our DocumentClient using a document URI created by the UriFactory.

Testing

Now that you have functions for creating, retrieving, updating and deleting, you can test them against your DocumentDB database. At the bottom of the browser page for the function you can update the request JSON as shown below, and when you press Run you will see in the Output panel the text we entered for the return message.
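For example, a request body for the create function might look like this (the Id and JobTitle values are purely illustrative):

{
  "Id": "12345",
  "JobTitle": "Developer"
}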

Test1

Now you can test the retrieve operation by going to that function and changing the JSON request to match, for example, the body shown below. You will then see the full JSON document returned.
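A retrieve request only needs the Id, so the request body can be as simple as this (again, an illustrative value):

{
  "Id": "12345"
}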

capture16

Final Words

This post has covered a fairly simple example; that said, we now have all the steps needed for accessing and managing data in our database. In future posts I will be incorporating Azure Functions and DocumentDB into a bot that can store data on behalf of a user. If you have any questions or comments, please add them below.

Deploying a bot with the Microsoft Bot Framework

Introduction

So far in my posts/videos on the bot framework we have been using the emulator to communicate with our bot. To finish off the basic tutorials for the bots we are now going to connect our bot to another channel, in this case Slack.

Connector Service

The Bot Framework’s Connector Service allows us to easily integrate our bots with various applications. To do this we need to register our bot on the Bot Framework site. In this post we will just be setting up our bot for dev purposes, so we won’t worry too much about icons, help site details, etc.

The first thing we need to do is host our application somewhere. I am just using a basic Azure App for this. We will need the endpoint details shortly when we register our bot.
Azure Publish
When you register a new bot you need to give it a name and handle. I am just going to use the same for both here.
Capture1
Next we configure the endpoint, making sure to add api/messages on the end. For the App ID and password, press the button and you will be taken to another window to access/generate these.
Capture4
Bot ID and PW
One important note: when generating the password, you will not be able to access it again later. If you don’t save the password immediately in your bot, you will need to regenerate it.

Just like many app stores, you will need to provide some details on the publisher, such as support details and terms of use. These are important when you are pushing out a live application; for development purposes we just need to ensure they are in valid formats.
Publisher Profile
If you intend to use App Insights then you can configure this at the end of the form. Then all that remains is to confirm you have read the terms and privacy statement and press Register.
Capture7
You will now be navigated to your app’s page. You will notice that the channels for Skype and Web Chat are already enabled. You don’t need to press the Publish button when working in dev, so this can be left for now. You can press the Test button on the right of the page to confirm connectivity to your published bot.
Capture8

Slack

When you have registered your bot you will be able to see a list of available channels. Each one has different configuration steps based on its own system’s requirements. We are going to go through using Slack for this demo. Slack is a great chat application that works well if you need to section people into teams.
Capture9
To publish your bot to Slack you need to set up a Slack account so you can administer your team, and also have an account on api.slack.com, which will allow you to develop Slack applications.
Capture10
When you choose to connect to a Channel you follow a basic set of instructions from the bot framework site. These are usually just steps on what to do in the other application. For Slack you are asked to create a new App which is what we do here.
Capture11
The bot framework wizard shows you what information you need to input into Slack. So in our case we need to copy the redirect URI into the appropriate part of the Slack configuration form.
Capture12
Once we have done this we need to create a bot user; this is the handle we are going to use to invite and call our bot.
Capture14
The next step is to copy the App Credentials into the appropriate boxes in the form on the bot framework site. When you authorise the bot, you will be prompted by Slack to confirm that you are happy for the bot framework to make these changes.
Capture18
Capture19
You will now be able to invite your bot into your group on Slack and ask it your normal questions. In Slack all bots are considered non-sentient users, so they respond to you but don’t initiate conversations.
Capture20
Capture21
We can now test this by asking our bot a couple of questions.
Capture22
One problem we encounter is that if you have prompts, you need to configure Slack to allow interactive messages.
Capture23
If we go back to our app in api.slack.com we can enable interactive messages.
Capture24
We can now run our commands which contain prompts without a problem.
Capture25
What next?

I see this as the last of the basic tutorials for the bot framework; we can now build bots using dialogs, LUIS and prompts, and publish them to a channel. You can experiment with other channels, and the process is quite straightforward for each of them.

Adding prompts to a Microsoft bot

Introduction

In recent posts I have covered the basics of creating a bot. One of the last features that was added was the ability to remove a team from a competition. In a normal desktop or web application, basic user experience rules would have us ensure that the user was prompted to confirm such a decision. Our bot didn’t have this feature, which would increase the possibility of users mistakenly deleting data. In today’s post we are going to look at adding prompts to our bot.

Basic yes or no questions

The first prompt we will look at is a basic confirmation. This prompt will allow the user to respond with Yes or No. As some of the channels we can connect our bot into allow for buttons, we will automatically get these displayed. We will see these in the image from our emulator later in the post, but we need to remember that depending on the channel used (e.g. SMS) buttons may not be available to the user.

To implement our prompt we call PromptDialog.Confirm. We need to provide the context, the name of a suitable Action and also some options for what we will display in the prompt. In this example, for the options we just provide a caption for the dialog. In the example later in the post we will create a more complete set of PromptOptions.

private string TeamName;

[LuisIntent("RemoveTeam")]
public async Task RemoveTeam(IDialogContext context, LuisResult result)
{
    EntityRecommendation rec;
    if (result.TryFindEntity("TeamName", out rec))
    {
        TeamName = rec.Entity;
        if (champs.DoesTeamExist(TeamName))
        {
            PromptDialog.Confirm(context, RemoveTeamPromptAsync, 
                    $"Are you sure you want to delete the team { TeamName }?");
        }
        else
        {
            await context.PostAsync($"The team {TeamName} was not found.");
            context.Wait(MessageReceived);
        }
    }
    else
    {
        await context.PostAsync("An error occured. We were unable to remove the team.");
        context.Wait(MessageReceived);
    }

}

In the RemoveTeamPromptAsync method we created, we need to add the code for handling the user’s response. For a confirm dialog the result is always going to be a bool, so we can await this result and use it to determine the action to carry out. If the user selects or states No then no action is taken and the user is informed of this. At the end of the method we have the IDialogContext await the next message.

private async Task RemoveTeamPromptAsync(IDialogContext context, IAwaitable<bool> result)
{
    if (await result)
    {
        champs.RemoveTeam(TeamName);
        await context.PostAsync($"{TeamName} has been removed from the championships.");
    }
    else
    {
        await context.PostAsync($"OK, we have not deleted them.");
    }
    context.Wait(MessageReceived);
}

We can test our code now in the Emulator and see how the prompt is displayed to the users.

Simple Prompt

Custom prompts

Although Yes/No prompts are great, sometimes you want to provide a more specific set of options. Imagine creating a booking application where the user asks to book an appointment. You would not respond with a yes or no question asking if they want to book until you had more details. So rather than requiring the user to specify a time initially, you could offer them a list of available times via a prompt.

In the example below I have added a new feature to my Championships class that returns the top three teams. This is a simple example, but provides enough to show the necessary steps. The LUIS application has also been updated to have a new Intent called RemoveGoodTeam. This allows the user to inform the bot to “Remove a good team” and the bot will return a list of the top three teams and ask the user which it should remove.

I have called the GetTopThreeTeams method from my sample app (which lives outside of the bot code) and assigned the result to a List; this will be used in the prompt options. We then create a PromptOptions object which we will use with our PromptDialog. Within these options we can provide a number of settings. In this example, not only do I provide the prompt for the user as we did in our earlier example, but I also provide messages for incorrect entries and too many attempts, the list of options to display (taken from our GetTopThreeTeams method) and finally how many attempts the user gets.

The final line of the code calls the PromptDialog but this time instead of Confirm we call Choice.

[LuisIntent("RemoveGoodTeam")]
public async Task RemoveGoodTeam(IDialogContext context, LuisResult result)
{
    List<string> goodTeams = champs.GetTopThreeTeams();
    PromptOptions options = new PromptOptions("Select which of the top teams to remove",
            "Sorry please try again", "I give up on you", goodTeams, 2);
    PromptDialog.Choice(context, RemoveGoodTeamAsync, options);
}

The code for our action is pretty much the same as our first example, with the notable exception that we are awaiting a string and not a bool. The string value will be one of the options we provided earlier. The bot will only return one of our specified options, so we don’t have to worry as much about invalid entries as we would have done if the user had sent the request themselves.

private async Task RemoveGoodTeamAsync(IDialogContext context, IAwaitable<string> result)
{
    string res = await result;
    if (champs.DoesTeamExist(res))
    {
        champs.RemoveTeam(res);
        await context.PostAsync($"{res} has been removed from the championships.");
    }
    else
    {
        await context.PostAsync($"The team {res} was not found.");
    }
    context.Wait(MessageReceived);
}

When we test this in the emulator we can see the options now displayed instead of the Yes/No prompts. If we typed in a team name that was not on the options list, the bot would return the message we entered for incorrect entries. After the specified number of attempts had failed, the user would have to re-ask the bot to remove a good team.

CustomPrompt

What next?

So far in the last three posts we have created a basic bot, added natural language support with LUIS and now provided prompts and questions. Our next stage is to deploy our bot outside of our local environment and connect up to some of the channels available.

Using Natural Language in a bot with LUIS

Introduction

Bots may be great, but one thing that can take up a lot of a developer’s time is coding for as many variations of input as they can think of. Even with all this work, some customers may find that the bot does not recognise what they are entering. A great way to handle this is to use a natural language service, so you can leave the variations of language to it. As part of Microsoft’s Cognitive Services, Microsoft has developed a system called Language Understanding Intelligent Service, or simply LUIS. In this blog we are going to create a simple LUIS application and integrate it with a bot. For those who read my last post, we are simply using the same bot but creating a new LuisDialog to replace our old Dialog.

Create a LUIS application

Much of our work with LUIS will be done within the browser. From the LUIS site we are able to create and configure our applications. We will also use this site to train our applications, which is something we will cover later on in this post. For now we are going to create our application.

If you navigate to the LUIS site luis.ai you will be taken to your “My Applications” page. If you are logging in for the first time, you will need to register with a Live account.

Within My Applications we have the option to create a new App. When you select this, a dialog will appear and you will need to enter an application name. I will be sticking with the English application culture for this one, mainly as that is the only language I can do more than just order food and beer in. When you press Add App you will need to wait for a minute or two whilst the application is built.

Add New Application

Once the application has been created, the browser will navigate to that applications page.

Application Area

Responding to the user – Intents

In a LUIS application, each different type of command you want to raise based on the user’s input needs to be created as an Intent. When you create an Intent you need to provide a name for it and an example utterance.

The intent should be named with the same care as a class or method name in your code. The name will be used in the bot code to identify the intent, and it is also how you will train and configure your application within LUIS.

An utterance is an example of what a user might say when communicating with the service. In the example below we have an intent that provides the number of teams called TeamCount. The example utterance we provide is “How many teams are there”. This is a natural way to ask for this information, it is not the only way however, so we will want to add some more later on.

Create New Intent

Once we have created an Intent, we can add more utterances from the application page. To do this, from the “New utterances” tab enter a new expression in the text box and press the arrow key. You will now have the option to select which intent this utterance should be assigned to. Note that there is always an intent called “None”; this indicates that the application does not understand what has been asked. This will be used in our code to send a simple message back to the user saying that we don’t understand their question.

Adding utterances

Training and publishing

Before we can use our app we need to publish it. But before we can do that we need to press the Train button. This takes all the information on utterances that we have entered and does its magic in the background.

Training

Once trained, we can then select Publish and a dialog similar to the one below will be shown. It’s worth noting that you can test your application by entering text into the Query field and pressing enter. This will open a new page in your browser with the JSON response.
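As a rough illustration, the response is JSON along these lines (the exact fields depend on the LUIS version, and the intents and scores here are just made-up examples based on this post’s application):

{
  "query": "how many teams are there",
  "intents": [
    { "intent": "TeamCount", "score": 0.97 },
    { "intent": "None", "score": 0.02 }
  ],
  "entities": []
}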

Capture6

Getting the keys

So we have our application built, trained and published; what we need to do now is link it to our bot code. This is actually relatively simple, but what we need from the LUIS site is the App Id and our own subscription key. These will be copied into the LuisDialog we create. You can get your App Id from the App Settings page.

App Key

The subscription key can be found on your main account settings page. When you sign up for LUIS you get a subscription key that allows you to use apps for development purposes. If you need more calls than this will allow, you can purchase a key from Azure and add this to your application.

Sub Key

Updating your bot with a LuisDialog

OK, for a blog that has the C# tag it has taken us some time to get to code. But here we are. What we are going to do is create a new LuisDialog inside of a bot project. I am not going to go through the bot application as this was covered in my last post. For now, we just create a new class within our project. I have called mine ChampionshipDialogLUIS. This class extends LuisDialog. We then add the LuisModel attribute, which has a parameter for the App Id and one for the Subscription Key; these are the details we took from our application on the LUIS site.

[LuisModel("e0357190-4455-4506-8764-055b7e04a674",
    "a91e3e2044be4be99c291c54a153f3a6")]
[Serializable]
public class ChampionshipDialogLUIS : LuisDialog<object>
{

}

In a change from our standard Dialog, in the LuisDialog we create a method for each of our intents and use the LuisIntent attribute with the name of the intent, as shown below. For the built-in None intent we just use the attribute LuisIntent("").

The LuisDialog has some functionality already built in that saves us a bit of work, so when we finish a method we just add context.Wait(MessageReceived) and the LuisDialog takes care of listening for the next input. The remainder of the code for our TeamCount intent is the same as it would be for a standard dialog.

[LuisModel("e0357190-4455-4506-8764-055b7e04a674",
    "a91e3e2044be4be99c291c54a153f3a6")]
[Serializable]
public class ChampionshipDialogLUIS : LuisDialog<object>
{
    [LuisIntent("TeamCount")]
    public async Task GetTeamCount(IDialogContext context, LuisResult result)
    {
        Championships champs = new Championships();
        await context.PostAsync($"There are {champs.GetTeamCount()} teams in the championships.");
        context.Wait(MessageReceived);
    }
        
    [LuisIntent("")]
    public async Task None(IDialogContext context, LuisResult result)
    {
        await context.PostAsync("No clue what you are talking about");
        context.Wait(MessageReceived);
    }
}

What I have now gone and done is add a few more intents in exactly the same way as our TeamCount one. This just shows how easy it is to have the LuisDialog handle the various messages sent to the bot.

[LuisIntent("TopTeam")]
public async Task TopTeam(IDialogContext context, LuisResult result)
{
    await context.PostAsync($"The highest rated team is {champs.GetHighestRatedTeam()}.");
    context.Wait(MessageReceived);
}

[LuisIntent("BottomTeam")]
public async Task BottomTeam(IDialogContext context, LuisResult result)
{
    await context.PostAsync($"The lowest rated team is {champs.GetLowestRatedTeam()}");
    context.Wait(MessageReceived);
}

Getting user entered data – Entities

The problem with what we have done so far is that we have only handled simple requests from the user. In reality a user is going to want to do more than just ask for some predefined information; they are going to want to ask a more detailed question. For this we will bring in Entities.

Entities allow us to take part of the expression entered by the user and use this as an argument. For example, my test app provides information about a football championship, it is therefore logical that the user will want to ask about a specific team. So that is what we are going to add.

Back in the LUIS site, we add a new entity. For this we enter a name using the same care as with our intents. In my case I have called mine TeamName.

Capture7

What I have also done is create an intent that allows us to remove a specific team, called RemoveTeam. When you are entering utterances, you can select a word and declare it as an entity. This tells the application that the utterance is for the RemoveTeam intent, but that the text in this area forms an entity. This should become clearer when we look at the code.

Selecting a sample entity

Back to the code

In our code we add a new method for removing a team and add the necessary attribute. This method is different to the others as we now use the TryFindEntity method on the LuisResult. The first argument is the name of our entity (see, this is why I keep going on about care in naming them). The output is an EntityRecommendation; it’s easy to think of this as a recommendation because it may not always be correct, and it depends on the training and experience of the application. Once we have it we can get the Entity from the EntityRecommendation and use that in our code to remove the selected team.

[LuisIntent("RemoveTeam")]
public async Task RemoveTeam(IDialogContext context, LuisResult result)
{
    string teamName = "";
    EntityRecommendation rec;
    if (result.TryFindEntity("TeamName", out rec))
    {
        teamName = rec.Entity;
        try
        {
            champs.RemoveTeam(teamName);
            await context.PostAsync($"{teamName} has been removed from the championships.");
        }
        catch (TeamNotFoundException)
        {
            await context.PostAsync($"The team {teamName} was not found.");
        }
    }
    else
    {
        await context.PostAsync("An error occured. We were unable to remove the team.");
    }
    context.Wait(MessageReceived);
}

Retraining

Before we go any further, we should retrain our application from the web page. This will make sure any changes that we have made, such as our new entities, are included. Once we have our application working and it is receiving requests, we can go to the Suggest tab and look at what intent the application has assigned to the input from our bot. This won’t always be correct, but you can select each one, choose the correct intent and add entities if appropriate. Something you will notice is that the more you use this, the more accurate it becomes.

Retraining

On the right of the web page are some statistics on how accurately the application has performed. It also shows how many utterances have been entered for each intent. If you have an intent that isn’t getting picked up as you expect, it might be worth checking whether you have provided enough utterances for the application to be able to accurately predict the intent.

Response Recommendations

Testing in the emulator

We are now coming to the end, so we just need to check this in the emulator. As you can see from the screenshot, the application can pick up what you are trying to say and then provide the right response.

Emulator

What next

This post has covered natural language and my last post covered the basics of bots, so continuing the theme I will be writing about prompts and images in your conversations in my next post.

Creating your first bot with the Microsoft Bot Framework

Introduction

This post is actually my second on getting started with the bot framework. I’m starting again as Microsoft has provided significant updates to the bot framework with V3 and some things are now done in a different way. I will be posting several tutorials over the coming days which will take you from the basics and onto more advanced topics. Today we will just look at creating a bot with a simple dialog.

Create the project

To create a bot application you should install the Visual Studio template from here. For my example I have taken a basic solution which contained a project called FootballData. This project is just used for simple demonstrations and gives us something to allow our bot to get data in response to our input.

Create your dialog

When you first create a bot project you get some sample code in the MessagesController class. Rather than use this, we are going to create our own Dialog. To do this you create a new class in the root of the project; I am going to call mine ChampionshipsDialog. This class needs to implement IDialog and also be marked as serializable. At this stage the dialog should look similar to this. (If you have Visual Studio generate the methods for the interface, they won’t be marked as async, but that will be needed later, so it’s worth adding it now.)

[Serializable]
public class ChampionshipsDialog : IDialog
{
    public async Task StartAsync(IDialogContext context)
    {

    }
}

When our Dialog is first created the StartAsync method will be called. Here we need to wait for a message from the user. Additionally, what we are going to want to do is carry out an action based on the user’s input, and ensure that we can still listen out for more messages.

public static Championships champs;
        
public async Task StartAsync(IDialogContext context)
{
    champs = new Championships();
    context.Wait(MessageReceivedAsync);
}

private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<object> result)
{
    var msg = await result as Activity;
    //we will carry out our actions here

    context.Wait(MessageReceivedAsync);
}

In the code above, all we have done is wait for MessageReceivedAsync to be called and then get the Activity object from the result. The Activity is where we will get the text entered by the user. In earlier versions of the bot framework the result would be of type Message; this changed in v3, and the Activity can now be of various types. In our application we are only going to call this Dialog if the Activity is of the type Message, but that will be covered in the next section. The other thing I have done is, at the end of MessageReceivedAsync, tell the context to wait for further input. This ensures that our bot keeps listening and doesn’t just shut down after the first message.

Now we are in a position to add some functionality to our Dialog. First of all, we are just going to see if the text sent to the bot contains “how many teams”. If this is the message we can then call the appropriate method in our Championships class and then post back to the user the result.

private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<object> result)
{
    var msg = await result as Activity;
    if (msg.Text.Contains("how many teams"))
    {
      await context.PostAsync($"There are { champs.GetTeamCount() } teams in the championships.");
    }
    context.Wait(MessageReceivedAsync);
}

Just providing simple answers like that is not really much use, so we can now add cases to respond to different questions and statements from our users. We go through and interpret the message and carry out actions accordingly. A more complete example is shown below.

private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<object> result)
{
    var msg = await result as Activity;
    if (msg.Text.Contains("how many teams"))
    {
       await context.PostAsync($"There are { champs.GetTeamCount() } teams in the championships.");
    }
    else if (msg.Text.StartsWith("who") || msg.Text.StartsWith("which"))
    {
       if (msg.Text.Contains("best") || msg.Text.Contains("top") || msg.Text.Contains("greatest"))
       {
          await context.PostAsync($"The top rated team is { champs.GetHighestRatedTeam() }");
       }
       else if (msg.Text.Contains("worst") || msg.Text.Contains("bottom") || msg.Text.Contains("lowest"))
       {
          await context.PostAsync($"The lowest rated team is { champs.GetLowestRatedTeam() }");
       }
       else
       {
          await context.PostAsync("Sorry I didnt understand the question.");
       }
    }
    else if (msg.Text.StartsWith("remove"))
    {
       string team = msg.Text.Replace("remove", "").Trim();
       try
       {
          champs.RemoveTeam(team);
          await context.PostAsync($"The team { team } has been removed from the championships.");
       }
       catch (TeamNotFoundException)
       {
          await context.PostAsync($"The team { team } was not found.");
       }
    }
    else
    {
       await context.PostAsync("Sorry I didnt understand the question.");
    }
    context.Wait(MessageReceivedAsync);
}

Don’t be too worried about what I’m calling in the Championships class; the important thing is to look at the structure of the dialog. There are a couple of things here that I would like to point out. The first is that at the end I have added a final else statement for when we don’t get any recognised input. It is important to always send a response back to the user, even if it is just to tell them that you have no idea what they are asking. The second is that we are having to do a lot of checking of what the text says, and even using or statements to allow for users using different phrases. This can add up to a lot of time spent trying to work out how users will ask a question. We can improve this by using a natural language service such as LUIS, which my next post will cover.

Wire things together

For now, let’s connect our dialog to our bot. Go back to the MessagesController class and, in the Post method, replace everything within the if statement for ActivityTypes.Message with the statement:

await Conversation.SendAsync(activity, () => new ChampionshipsDialog());

This statement is telling the bot to use our new Dialog. The full code looks similar to this:

public async Task<HttpResponseMessage> Post([FromBody]Activity activity)
{
    if (activity.Type == ActivityTypes.Message)
    {
       await Conversation.SendAsync(activity, () => new ChampionshipsDialog());
    }
    else
    {
       HandleSystemMessage(activity);
    }
    var response = Request.CreateResponse(HttpStatusCode.OK);
    return response;
}

Test in the emulator

If we debug the bot service now and open up the bot emulator, which can be downloaded from here, we can communicate and get the expected results. We now have our simple bot up and running.

Emulator

What next

The code in this post is a simplistic view of a bot. We haven’t covered prompts to the user to confirm an action, holding state, multiple users or many other features. I will be blogging regularly on bots and will be going through many of the more advanced features, so you can subscribe or keep checking back here for more info. The Microsoft site dev.botframework.com has some great information as well.

Detecting a persons mood via an image with the Microsoft Emotion API

Introduction

In this post we are going to continue looking at the Microsoft Cognitive Services. Much of the core code for this will be similar to the last blog post on OCR, but this time we are going to get emotion data from the faces in an image.

Sign up for cognitive services

Just like in the previous post on OCR, the Emotion API that we are going to use is part of the Microsoft Cognitive Services. If you have not already, you will need to sign up to use these APIs here. We will be using the Emotion API, so sign up for this subscription. It is in preview so is free for 30,000 transactions a month; there are some other limits which you can read up on, but for our test it should be more than adequate. We will be using the keys from this subscription later on.

Get Mood from Emotion API

Before we call the Emotion API, we need to set up a data structure for the returned JSON. The data structures below should be pretty self-explanatory and will allow us to iterate through multiple faces later on.

public class ImageData
{
    public FaceRectangle faceRectangle { get; set; }
    public Scores scores { get; set; }
}

public class FaceRectangle
{
    public int left { get; set; }
    public int top { get; set; }
    public int width { get; set; }
    public int height { get; set; }
}

public class Scores
{
    public decimal anger { get; set; }
    public decimal contempt { get; set; }
    public decimal disgust { get; set; }
    public decimal fear { get; set; }
    public decimal happiness { get; set; }
    public decimal neutral { get; set; }
    public decimal sadness { get; set; }
    public decimal surprise { get; set; }
}

With our data structure in place we can now call the Emotion API at https://api.projectoxford.ai/emotion/v1.0/recognize. The rest of the code is the same as our previous OCR example, although this time we will deserialize to a list of ImageData objects.

public async Task<List> GetOCRData(string filename)
{
    var client = new HttpClient();
    client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "5f6067eea83497fdfaa4ce");
    var uri = "https://api.projectoxford.ai/emotion/v1.0/recognize";
    HttpResponseMessage response;
    using (var content = new ByteArrayContent(GetBytesFromFilepath(filename)))
    {
        content.Headers.ContentType = 
            new System.Net.Http.Headers.MediaTypeHeaderValue("application/octet-stream");
        response = await client.PostAsync(uri, content).ConfigureAwait(false);
    }
    var data = JsonConvert.DeserializeObject<List>(
        await response.Content.ReadAsStringAsync());

    return data;
}

We added a helper method to get a byte array from the selected file path, although we could have set the MediaTypeHeaderValue to “application/json” and sent a URI rather than the file bytes; there is a sketch of that variant after the helper method below.

private byte[] GetBytesFromFilepath(string filePath)
{
    Image img = Image.FromFile(filePath);
    using (var stream = new MemoryStream())
    {
        img.Save(stream, System.Drawing.Imaging.ImageFormat.Jpeg);
        return stream.ToArray();
    }
}
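As mentioned above, the file bytes could also be replaced by a URL. Below is a rough sketch of that variant; it assumes the endpoint accepts a JSON body with a url property, in line with the other Cognitive Services endpoints, so treat it as illustrative rather than the definitive call.

// Hedged sketch: send an image URL as JSON instead of the raw bytes.
// Requires System.Text for Encoding.
public async Task<List<ImageData>> GetEmotionDataFromUrl(string imageUrl)
{
    var client = new HttpClient();
    client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<your subscription key>");
    var uri = "https://api.projectoxford.ai/emotion/v1.0/recognize";

    var json = JsonConvert.SerializeObject(new { url = imageUrl });
    HttpResponseMessage response;
    using (var content = new StringContent(json, Encoding.UTF8, "application/json"))
    {
        response = await client.PostAsync(uri, content).ConfigureAwait(false);
    }

    return JsonConvert.DeserializeObject<List<ImageData>>(
        await response.Content.ReadAsStringAsync());
}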

Sample App – The Mood Elevator

The data returned from the API is a series of decimal values, one per emotion, and the score closest to 1 indicates the detected emotion.

To display the result of the returned emotion, I have created a basic MVC application that allows a user to browse for and upload a file. I have then created a MoodAnalysis class to determine the mood value based on a set of criteria that I have chosen. In my case, I have assigned different values for happy based on decimal values of 0.9, 0.8 and so on.
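The exact thresholds are not important for this post, but as a rough illustration a minimal MoodAnalysis could look like the sketch below. The step values and elevator names are made up for this example; only the method signatures match how the class is called in the Index action that follows.

// Hedged sketch of a MoodAnalysis class - thresholds and names are illustrative only.
public class MoodAnalysis
{
    public int GetMood(Scores scores)
    {
        // Map the dominant score to a step on the "mood elevator".
        if (scores.happiness >= 0.9m) return 7;
        if (scores.happiness >= 0.5m) return 6;
        if (scores.neutral >= 0.5m) return 5;
        if (scores.surprise >= 0.5m) return 4;
        if (scores.sadness >= 0.5m) return 3;
        if (scores.fear >= 0.5m || scores.disgust >= 0.5m) return 2;
        return 1; // anger or contempt dominant
    }

    public string ConvertStepToElevatorName(int step)
    {
        switch (step)
        {
            case 7: return "Grateful";
            case 6: return "Hopeful";
            case 5: return "Curious";
            case 4: return "Impatient";
            case 3: return "Worried";
            case 2: return "Defensive";
            default: return "Angry";
        }
    }
}

The Index action below then wires this up to the uploaded file.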

public async Task<ActionResult> Index(HttpPostedFileBase file)
{
    string guid = Guid.NewGuid().ToString() + ".jpg";
    if (file != null && file.ContentLength > 0)
    {
        var fileName = Path.GetFileName(file.FileName);
        var path = Path.Combine(Server.MapPath("~/Content/upload"), guid);
        file.SaveAs(path);
        PhotoReader pr = new PhotoReader();
        var rtn = await pr.GetOCRData(path);
        MoodAnalysis ma = new MoodAnalysis();
        int elScore = ma.GetMood(rtn.FirstOrDefault().scores);
        ViewBag.EmotionScore = elScore;
        ViewBag.EmotionMessage = ma.ConvertStepToElevatorName(elScore);
    }
    
    return View();
}

Using this information I am then able to display a result on a chart, as shown in the example below:

AngryMoodElevator

 

Conclusion

In this brief introduction to the Emotion API we have uploaded a file and retrieved the emotion scores for analysis. I have tried not to overcomplicate it by not showing my breakdown of the score results, but with the returned decimal values you will be able to experiment based on your application's needs. If you have any questions or comments, please add them below.

Using Microsoft Cognitive Services to OCR business cards

Introduction

A couple of days ago I attended the AzureCraft event in London hosted by Microsoft and the UK Azure User Group. It was a really good event, and one of the many demos in the keynote (which overran by an hour) briefly touched on the OCR capability in the Microsoft Cognitive Services. Although that example was there to show the capability of Azure Functions, I quickly jotted down a note to try out the OCR capability. So after spending yesterday losing at golf (it was only 1 point so I demand a recount!), today I have tried creating an OCR application for the first time. In this post I am going to cover the code I wrote for a basic application that allows you to get information from a photo of a business card.

A caveat I would like to add before getting into the code is that this is my first run through with this technology and the code has not been optimised yet; I just thought I would get it all down while it was fresh in my mind.

Sign up for cognitive services

The OCR function we are going to use is part of the Microsoft Cognitive Services. You will need to sign up to use these APIs here. We will be using the Computer Vision API, so sign up for this subscription. It is in preview so is free for 5,000 transactions a month. We will be using the keys from this subscription later on.

Get OCR data from Computer Vision API

Our first action is to send a file to the service and get the JSON response with the OCR data. For this we create a GetOCRData method that takes the filename of our image file. We need to add some details to the DefaultRequestHeaders, which will include the subscription key that you get when signing up for the trial (don't try using mine, it won't work). There are several options for the content we send: if we just wanted to send a URI we could use application/json, however in our case we are going to push a local file to the service so we need to use application/octet-stream.

Once we have posted our request, we can deserialize the return value. I used Newtonsoft.Json by installing it from NuGet, although there are alternatives that can be used.

public async Task<string> GetOCRData(string filename)
{
    var client = new HttpClient();
    client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "8ecc1bf65ff1c4e831191");
    var uri = "https://api.projectoxford.ai/vision/v1.0/ocr";
    HttpResponseMessage response;
    using (var content = new ByteArrayContent(GetBytesFromFilepath(filename)))
    {
        content.Headers.ContentType =
            new System.Net.Http.Headers.MediaTypeHeaderValue("application/octet-stream");
        response = await client.PostAsync(uri, content).ConfigureAwait(false);
    }
    var x = JsonConvert.DeserializeObject(await response.Content.ReadAsStringAsync());

    return x.ToString();
}

Our code above uses this helper method to get the bytes from a file

private byte[] GetBytesFromFilepath(string filePath)
{
    Image img = Image.FromFile(filePath);
    using (var stream = new MemoryStream())
    {
        img.Save(stream, ImageFormat.Jpeg);
        return stream.ToArray();
    }
}

To allow me to easily test this I created a basic WPF application that contains a textbox, button and textarea. This allows me to check the returned values as I test the app. I also have a series of unit tests (a sketch of one follows the screenshot), but found the visual display to be a great help. With the code as it stands we get the response shown below

JSON Return
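As mentioned above, I also have a series of unit tests. A hedged sketch of one is below; it assumes an MSTest project, a sample image copied to the test output folder (the file name is made up) and that GetOCRData lives in a class I am calling PhotoReader purely for this example.

// Requires Microsoft.VisualStudio.TestTools.UnitTesting.
[TestClass]
public class PhotoReaderTests
{
    [TestMethod]
    public async Task GetOCRData_ReturnsTextForSampleCard()
    {
        // SampleCard.jpg is a test image set to "Copy to Output Directory".
        var reader = new PhotoReader();
        string result = await reader.GetOCRData("SampleCard.jpg");

        Assert.IsFalse(string.IsNullOrWhiteSpace(result));
    }
}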

Interpreting the JSON

What we have so far just gives us the raw JSON response; we can at least see that the OCR is working, but it's not very user-friendly at this stage. To allow us to handle the returned data we are going to parse the JSON into an OCRData class. For this we need to create the OCRData class and several other classes to cover the Regions, Lines and Words which are returned by the web request.

public class OCRData
{
    public List<Region> regions { get; set; }
}

public class Region
{
    public string boundingBox { get; set; }
    public List<Line> lines { get; set; }
}

public class Line
{
    public string boundingBox { get; set; }
    public List<Word> words { get; set; }
}

public class Word
{
    public string boundingBox { get; set; }
    public string text { get; set; }
}

With these in place we can change our GetOCRData method to return a Task<OCRData> and change the JsonConvert.DeserializeObject call to the generic version that returns an OCRData type, as shown in the sample code below

public async Task<OCRData> GetOCRData(string filename)
{
    var client = new HttpClient();
    client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "8ecc1bf6548a4bdf1c4e831191");
    var uri = "https://api.projectoxford.ai/vision/v1.0/ocr";
    HttpResponseMessage response;
    using (var content = new ByteArrayContent(GetBytesFromFilepath(filename)))
    {
        content.Headers.ContentType =
            new System.Net.Http.Headers.MediaTypeHeaderValue("application/octet-stream");
        response = await client.PostAsync(uri, content).ConfigureAwait(false);
    }
    var x = JsonConvert.DeserializeObject<OCRData>(
        await response.Content.ReadAsStringAsync());

    return x;
}

Populating a ContactCard object and Regex, Regex everywhere!

We now have an OCRData object being returned, but we still need to analyse this data and create a usable contact object from this information. To do this we create a ContactCard class that includes a number of properties likely to appear on a business card. I have also created an override for ToString to allow us to display the contact data easily in our OCR Tester application.

public class ContactCard
{
    public string Name { get; set; }
    public string Company { get; set; }
    public string Position { get; set; }
    public string PhoneNo { get; set; }
    public string Email { get; set; }
    public string Website { get; set; }
    public string Facebook { get; set; }
    public string Twitter { get; set; }

    public override string ToString()
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("Name: " + Name);
        sb.AppendLine("Company: " + Company);
        sb.AppendLine("Position: " + Position);
        sb.AppendLine("Phone: " + PhoneNo);
        sb.AppendLine("Email: " + Email);
        sb.AppendLine("Website: " + Website);
        sb.AppendLine("Facebook: " + Facebook);
        sb.AppendLine("Twitter: " + Twitter);

        return sb.ToString();
    }
}

With the ContactCard object ready and our OCR data available, we now expand out our code. We create a new method called ReadBusinessCard which takes the filename for the image file. This method first calls our GetOCRData method and then creates a new ContactCard. For each property in ContactCard we then call a method to get the relevant data out of our OCRData object.

For the Name, Company and Position I have actually cheated and given defined locations for these based on the sample cards I am using to test this app. The reason for this is that they are much harder to detect. My intention is to identify these using the LUIS natural language service, but this would overcomplicate this post so they have been omitted. For those interested in using this natural language service, my previous blog post on the Bot Framework has examples of this.
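To make that concrete, here is a hedged sketch of those hard-coded lookups. The line positions are assumptions that happen to match my sample cards, and they are exactly the sort of guesswork that a natural language service would replace.

// Illustration only: assumes the name is the first line, the position the
// second and the company the third on the sample cards. Requires System.Linq.
private string GetName(Region r)
{
    return string.Join(" ", r.lines[0].words.Select(w => w.text));
}

private string GetPosition(Region r)
{
    return string.Join(" ", r.lines[1].words.Select(w => w.text));
}

private string GetCompany(Region r)
{
    return string.Join(" ", r.lines[2].words.Select(w => w.text));
}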

What I have done for this version is create a GetFromRegex method that allows us to send a pattern to the method and have a Regex check done on all the returned words. This then allows us to populate the contact card with phone numbers, emails, websites, twitter handles etc.

public async Task ReadBusinessCard(string filename)
{
    OCRData data = await GetOCRData(filename).ConfigureAwait(false);
    ContactCard cc = new ContactCard();
    Region r = data.regions[0];
    cc.Name = GetName(r);
    cc.Company = GetCompany(r);
    cc.Position = GetPosition(r);
    cc.PhoneNo = GetFromRegex(r, @"^\d+$");
    cc.Email = GetFromRegex(r, @"^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$");
    cc.Website = GetFromRegex(r, @"^www\.", "facebook");
    cc.Facebook = GetFromRegex(r, @"^www\.Facebook\.com");
    cc.Twitter = GetFromRegex(r, "^@");

    return cc;
}

private string GetFromRegex(Region r, string pattern, string notContains = null)
{
    foreach(Line l in r.lines)
    {
        foreach(Word w in l.words)
        {
            if (Regex.IsMatch(w.text, pattern, RegexOptions.IgnoreCase))
            {
                if (string.IsNullOrEmpty(notContains))
                    return w.text;
                else
                {
                    if (!w.text.Contains(notContains))
                        return w.text;
                }
            }
        }
    }
    return "";
}

Now, I'm going to be completely honest: I don't think the GetFromRegex method or my regex patterns are as good as they could be. However, for the sake of this demonstration they allow me to show how we can break down the returned OCR text data into the sections of a contact card.
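One improvement, sketched below purely as an illustration, would be to also run the pattern over each whole line rather than only individual words, so values that the OCR splits across several words (a spaced phone number, for example) can still be matched.

// Hedged sketch: match against the rebuilt line text as well as single words.
// Requires System.Linq alongside System.Text.RegularExpressions.
private string GetFromRegexLine(Region r, string pattern, string notContains = null)
{
    foreach (Line l in r.lines)
    {
        string lineText = string.Join(" ", l.words.Select(w => w.text));
        if (Regex.IsMatch(lineText, pattern, RegexOptions.IgnoreCase) &&
            (string.IsNullOrEmpty(notContains) || !lineText.Contains(notContains)))
        {
            return lineText;
        }
    }
    return "";
}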
With the code now ready and returning data, the next logical step would be to add this to a mobile app. However, for the ease of the demo I have modified my OCR Tester application to display the returned ContactCard. From here we could easily add this information to a phone's contacts or another contact application, but I will leave that for today and look at it another time.

Contact Card Return

Conclusion

This sample shows how powerful OCR with the Cognitive Services is, and how simple it is to implement. If you have any questions or suggestions for this post, please add them to the comments below.

Developing a Bot with the Microsoft Bot Framework

Introduction

In this post we are going to look at creating a very basic bot using the Microsoft Bot Framework. Although the bot itself will be simple, we will connect it to the LUIS natural language service to allow us to train the service to handle responses. From this example, it should be easy to move forward with more complex bots and features.

Create Project in Visual Studio

To create a bot application, you need to install the Bot Application template and Bot Emulator which can both be found here.

When these are installed, create a new project using the Bot Application template. We have called ours MagicEightBall.

Bot Project

The template provides the basic code needed to run a bot. The main area you will edit is the MessagesController class in the Controllers folder. The example code as shown below will return a count of how many characters you have entered. It is a good idea to debug this and test your emulator now with this code.

public async Task<Message> Post([FromBody]Message message)
{
    if (message.Type == "Message")
    {
        // calculate something for us to return
        int length = (message.Text ?? string.Empty).Length;

        // return our reply to the user
        return message.CreateReplyMessage($"You sent {length} characters");
    }
    else
    {
        return HandleSystemMessage(message);
    }
}

Using Natural Language

To simplify the process of creating question and responses for our Bot we are going to use LUIS (Language Understanding Intelligent Service). This service allows you to train intents to understand what users have entered and then respond accordingly. The LUIS site can be found here.

Create a new LUIS application

Once you have registered for LUIS, create a new application for the Magic Eight Ball as shown below. The LUIS site has an excellent 10 minute video on creating and training an app, if you have not created one before then I highly recommend that you view this.

LUIS

Train the application to understand a question

The first thing you need to do is assign an intent, which will determine the questions that the user can ask. We have created a basic one called Question that we will train. We then add new utterances as examples of the text that should be assigned to this intent. The LUIS service will learn from these and try to pick up other text that it considers as having the same meaning.

Capture3

If we want to use entities to pass data to our code, we can assign part of an utterance to that entity. In the example below, the intent would be picked up from "Am I", whilst the Action entity would be assigned to "funny". This allows you to handle various options and settings sent by users in a statement.

Capture4
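In code, anything labelled as an entity comes back on the LuisResult passed to the intent handler. The sketch below shows the idea only; it is not part of the MagicEight class that follows (which does not use entities), it assumes an entity named Action, and the exact property names on EntityRecommendation may vary slightly between SDK versions.

// Illustration only - a handler that reads a hypothetical "Action" entity.
[LuisIntent("Question")]
public async Task AnswerQuestionWithAction(IDialogContext context, LuisResult result)
{
    foreach (EntityRecommendation entity in result.Entities)
    {
        if (entity.Type == "Action")
        {
            await context.PostAsync($"You asked about being {entity.Entity}.");
        }
    }
    context.Wait(MessageReceived);
}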

Connecting to LUIS service

The code sample below is of a MagicEight class that extends LuisDialog. For each intent we should have a method with the attribute [LuisIntent([IntentName])] as shown below. This method will be called when the bot determines that this question/intent has been asked. For our example we do not care about any actions or details from the message, just that it was a valid question. We then reply with a random response from the list of answers on a standard magic eight ball.

We also provide a method for the None intent by using the attribute [LuisIntent("")]. This allows us to provide a default response to users when LUIS cannot determine the intent.

Please note the App Id and Key shown in the example below are not valid and you will need to get these from a valid LUIS application.

[LuisModel("b514324f-d6d3-418e-a911-c7fasda6699e2", "a91e3e2044a987876291c54a153f3a6")]
[Serializable]
public class MagicEight : LuisDialog<object>
{
    private static Random rand = new Random(14891734);
    string[] responses = { "It is certain.", "It is decidedly so.",
        "Without a doubt.", "Yes, definately.", "You may rely on it.",
        "As I see it, yes.", "Most likely.", "Outlook good.", "Yes.",
        "Signs point to yes.", "Reply hazy try again.", "Ask again later.",
        "Better not tell you now.", "Cannot predict now",
        "Concentrate and ask again", "Don't count on it.",
        "My reply is no.", "My sources say no.", "Outlook not so good",
        "Very doubtful."};

    [LuisIntent("Question")]
    public async Task AnswerQuestion(IDialogContext context, LuisResult result)
    {
        int x = rand.Next(responses.Length);
        string message = responses[x];
        await context.PostAsync(message);
        context.Wait(MessageReceived);
    }

    [LuisIntent("")]
    public async Task None(IDialogContext context, LuisResult result)
    {
        string message = "Sorry I did not understand";
        await context.PostAsync(message);
        context.Wait(MessageReceived);
    }
}

Our final task is to return to our MessagesController class and update the Post method to send new messages to our MagicEight class.

public async Task<Message> Post([FromBody]Message message)
{
    if (message.Type == "Message")
    {
        return await Conversation.SendAsync(message, () => new MagicEight());
    }
    else
    {
        return HandleSystemMessage(message);
    }
}

Test in Emulator

Now that we have the code complete, test the application in the emulator.

Capture5

You may not always get the correct response straight away. This is because the LUIS application has not been trained much yet; the more work you put into labelling the utterances within the LUIS application, the more accurate it will become. Any utterance sent to the service will be available for you to label, or you can approve the automatically assigned label that the service determined.

You can now take what you have learnt in this post and expand it into more complex bots and connect what you have created into web/windows applications.

Integrate Microsoft Band features into a Windows Store App

Introduction

Fitness bands are one of the big things in technology at the moment; new ones appear to be released all the time, each claiming new or improved features. When Microsoft released the Band, they also released an impressive SDK to allow developers to integrate the Band with apps across all of the major mobile platforms.

In this post I want to take you through some of the features I have added to an app that I released on Windows Phone. Hopefully these examples will help others who want to create apps for the Band, and maybe provide some ideas for devs who haven't thought about it yet.

BeepFitBand2
The completed app running on a device

The App

Rather than a hello world type app, I am going to use code from an app called BeepFit. This app allows users to run a multi-stage fitness test (otherwise known as a beep or bleep test) and to store and share their results. It also uses the Microsoft Band to send notifications to users so they don't need to worry about listening out for alerts. Additionally, at the completion of each level of the test it takes various sensor readings to add to the results. If you are interested you can get the completed app for Windows Phone from the store here.

The User Interface

I have already updated the UI to have a menu for selecting the type of device to connect to. This is to enable future updates to include other fitness devices if I choose. In this tutorial we are going to start the connection process via a command linked to this menu item.

DeviceSelection
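The command wiring itself is standard MVVM. Below is a hedged sketch using a minimal ICommand implementation written just for this example; any relay or delegate command helper you already use would do the same job.

// Minimal ICommand helper, written only for this sketch (CanExecuteChanged is never raised).
// Requires System and System.Windows.Input.
public class DelegateCommand : ICommand
{
    private readonly Action execute;
    public DelegateCommand(Action execute) { this.execute = execute; }

    public event EventHandler CanExecuteChanged;
    public bool CanExecute(object parameter) { return true; }
    public void Execute(object parameter) { execute(); }
}

// In the view model: the Microsoft Band menu item binds to this command.
public ICommand ConnectBandCommand
{
    get { return new DelegateCommand(ConnectToBand); }
}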

 

Connecting to the Band

Our first step is to connect to the Band itself. The code example below shows that we use the BandClientManager to get an array of IBandInfo. This array gives us the available bands; to simplify our code we only allow a connection if a single band is connected. We then use the BandClientManager again to connect to the device by passing in the object we obtained from our GetBandsAsync call.

With the connection established, we check if a tile is available and, if not, call a method to create the tile (this is covered in the next section). Finally, we take the bandClient object that we connected to and use its NotificationManager to send a vibration alert. We will see the NotificationManager again later in this tutorial, but for simple vibrations all we need to do is call the VibrateAsync method with a suitable value from the VibrationType enum. For our project we are going to use VibrationType.ThreeToneHigh to let the user know that the app is connected to the band.

private async void ConnectToBand()
{
    if (testRunning)
        return;

    IBandInfo[] bands = 
        await BandClientManager.Instance.GetBandsAsync();
    if (bands.Length != 1)
    {
        DeviceState = "Unable to connect to device";
    }
    else
    {
        BandName = bands[0].Name;
        bandInfo = bands[0];
        bandClient = 
            await BandClientManager.Instance.ConnectAsync(bandInfo);
        IsBandConnected = true;
        if (!IsTileCreated)
        {
            this.CreateTile();
            IsTileCreated = true;
            this.ConnectToBand();
        }
        else
        {
            await bandClient.NotificationManager.VibrateAsync
                (VibrationType.ThreeToneHigh);
            DeviceState = string.Format("Connected to Microsoft Band");
        }
    }
}

Installing a tile

In the previous section we connected to the Band and as part of that task we called our CreateTile method, which is detailed below. This method gets the installed tiles as an IEnumerable by using the GetTilesAsync method of the TileManager. We also need to check that there is capacity on the band to add another tile. After this check is carried out, we create the icons for the tile and then use these to create the new BandTile item. Key things to note when creating the tile are that it must have a unique reference (we use a constant that is a string of a GUID) and that you need to decide whether IsBadgingEnabled should be set to true. If set to true, the tile will show the number of unread notifications for the tile.

private async void CreateTile()
{
    IEnumerable<BandTile> tiles = await bandClient.TileManager.GetTilesAsync();
    int capacity = 
        await bandClient.TileManager.GetRemainingTileCapacityAsync();
    if (capacity >= 1)
    {
        WriteableBitmap iconBitmap = await this.GetBitmapFromFile(
            "ms-appx:///Images/BandLogoWhite.png");
        BandIcon tileIcon = iconBitmap.ToBandIcon();
        WriteableBitmap smallIconBitmap = await this.GetBitmapFromFile
            ("ms-appx:///Images/BandLogoSmallWhite.png");
        BandIcon smallTileIcon = smallIconBitmap.ToBandIcon();
        tileGuid = Guid.Parse(AppConstants.BandTileGuid);
        BandTile tile = new BandTile(tileGuid)
        {
            IsBadgingEnabled = true,
            Name = "Beep Fit",
            TileIcon = tileIcon,
            SmallIcon = smallTileIcon
        };
        await bandClient.TileManager.AddTileAsync(tile);
    }
}

The following is a helper method used in our CreateTile method for getting a bitmap from a file Uri.

private async Task<WriteableBitmap> GetBitmapFromFile(string uri)
{
    var file = 
        await StorageFile.GetFileFromApplicationUriAsync(new Uri(uri));
    using (var fileStream = await file.OpenAsync(FileAccessMode.Read))
    {
        var bitmap = new WriteableBitmap(1, 1);
        await bitmap.SetSourceAsync(fileStream);
        return bitmap;
    }
}

Sending Alerts to the Band

Sending alerts to the band during a test is done in the same way as when we made our initial connection earlier on. As we want to give users a notification on each lap (in lieu of the audio), we send a single OneToneHigh vibration. We also send a dialog to show the current lap and level, along with the interval time that the user should be aiming for.

private async void LapBeep()
{
    if (IsBandConnected)
    {
        await bandClient.NotificationManager.VibrateAsync
            (VibrationType.OneToneHigh);
        await bandClient.NotificationManager.ShowDialogAsync(tileGuid,
            string.Format("Level: {0} Lap: {1}/{2}", CurrentLevel,
            CurrentLap, Laps), string.Format("Lap Time: {0}", PerLap));
    }
    individualTestView.PlaySound();
}

Getting sensor data from the Band

We get sensor data by listening for changes to the readings. Each sensor has a set of supported reporting intervals that can be used. As we need the reading closest to the lap/level beep of our test, we take the first interval available as this will give us the most regular updates. In the code below we set up the readings for both the HeartRate and SkinTemperature sensors. These are the only ones relevant to our tests, but the principle is the same if you want to get readings from other sensors.

var x = bandClient.SensorManager.HeartRate.SupportedReportingIntervals;
bandClient.SensorManager.HeartRate.ReportingInterval = 
    x.FirstOrDefault();
bandClient.SensorManager.HeartRate.ReadingChanged += 
    HeartRate_ReadingChanged;
await bandClient.SensorManager.HeartRate.StartReadingsAsync();
var y =
  bandClient.SensorManager.SkinTemperature.SupportedReportingIntervals;
bandClient.SensorManager.SkinTemperature.ReportingInterval =
    y.FirstOrDefault();
bandClient.SensorManager.SkinTemperature.ReadingChanged += 
    SkinTemperature_ReadingChanged;
await bandClient.SensorManager.SkinTemperature.StartReadingsAsync();

When the SkinTemperature sensor fires its ReadingChanged event, we want to store the temperature value from the sensor. We are doing it this way for our app because the points at which we need the sensor data will be continually changing based on the current level of the test. We then do the same again for the HeartRate sensor.

void SkinTemperature_ReadingChanged(object sender, 
    Microsoft.Band.Sensors.BandSensorReadingEventArgs e)
{
    lastTempReading = e.SensorReading.Temperature;
}

void HeartRate_ReadingChanged(object sender, 
    Microsoft.Band.Sensors.BandSensorReadingEventArgs e)
{
    lastReading = e.SensorReading.HeartRate;
}

When our test reaches a new level we take the latest sensor readings, add them to the overall test data and display a notification to the user that they are on a new level. As the standard beep test has a triple beep to indicate this, we use the ThreeToneHigh vibration alert.

private async void LevelBeep(int levelNumber)
{
    if (IsBandConnected)
    {
        if (levelHeartRates == null)
            levelHeartRates = new Dictionary<int, int>();
        levelHeartRates.Add(levelNumber, lastReading);
        if (levelTempReadings == null)
            levelTempReadings = new Dictionary<int, double>();
        levelTempReadings.Add(levelNumber, lastTempReading);
        await bandClient.NotificationManager.VibrateAsync
           (VibrationType.ThreeToneHigh);
        await bandClient.NotificationManager.ShowDialogAsync(tileGuid,
            string.Format("Level: {0} Lap: {1}/{2}", CurrentLevel, 
            CurrentLap, Laps), string.Format("Lap Time: {0}", PerLap));
    }
    individualTestView.PlayTripple();
}

Ending a test

Our test is ended via a command on the Windows Phone. When this is called a number of actions are carried out:

  1. A message is sent to the band to notify the test is completed.
  2. The sensor manager is told to stop taking readings for HeartRate and SkinTemperature.
  3. The application navigates to a results page.

The difference between the SendMessageAsync method used here and the ShowDialogAsync method we have been using in other parts of the app is that this message will remain on the band and can be viewed by tapping the tile at a later date. This provides the user with an easy way to see recent test results.

private async void StopTestCommandHandler(IUICommand command)
{
  if (command.Label.Equals("Yes"))
  {
      testRunning = false;
      beepTest.Lap -= bt_Lap;
      totalTimer.Tick -= dispatcherTimer_Tick;
      beepTest.StopTest();
      StartStopBtnText = "Start";
      displayRequest.RequestRelease();
      if (IsBandConnected)
      {
          await bandClient.NotificationManager.SendMessageAsync(
              tileGuid, "Test Completed",
              string.Format("You reached level {0}.{1}",
                  CurrentLevel, CurrentLap),
              DateTimeOffset.Now, MessageFlags.ShowDialog);
          await
              bandClient.SensorManager.HeartRate.StopReadingsAsync();
          await 
          bandClient.SensorManager.SkinTemperature.StopReadingsAsync();
            //Navigate to results frame
        }
        else
        {
            //Navigate to results frame.
        }
    }
}

After the test has ended we would navigate to a page to display the results (this has been omitted from the example above). In the case of our app, the results page shows a graph with the collected results if a Band was used during the test.

wp_ss_20151007_0001

 

Conclusion

I have tried to cover a lot in this blog post. What I hope to get across is that it is quite simple to add functionality to connect your Windows Store apps to a Microsoft Band. The official developer site at developer.microsoftband.com has up-to-date documentation for the SDK and includes examples for integrating with Android and iOS.