Microsoft Cognitive Services - Text Recognition

December 29, 2016

Recently, at DDD North, I saw a talk on MS Cognitive Services. It came back to mind and sparked my interest while I was looking at some TFS APIs (see later posts for why). In this post, though, I'm simply exploring what can be done with these services.

The Hype

  • Language: detects the language of the text that you pass in
  • Topics: determines the topic being discussed
  • Key Phrases: extracts the key points (which I believe may largely equate to nouns)
  • Sentiment: whether what you are saying is positive or negative (I must admit, I don't fully understand this one, but we can try some phrases and see what it comes up with)

For some reason that I can't really fathom, topic detection requires over 100 documents, so I won't be getting that to work, as I don't have a big enough text sample. The examples given in the marketing material seem to relate to people booking and reviewing holidays, and it feels a lot like these services are overly skewed toward that particular purpose.

Set-up

Register here:

https://www.microsoft.com/cognitive-services/

Registration is free (although I believe you need a Microsoft Live account).

[Screenshot: the Cognitive Services subscription keys shown after registration (cog1)]

Client

The internal name for this at MS is Project Oxford. You don't have to install the client libraries (the underlying APIs are just REST service calls), but you get some useful objects and helpers if you do:

[Screenshot: the Project Oxford client libraries (cog2)]

Cognition

The following code is largely plagiarised from the links in the References section at the bottom of this post:

Here’s the Main function:



var requestDocs = PopulateDocuments();

// Echo the request documents so they can be compared against the responses later.
Console.WriteLine("-=Requests=-");
foreach (var eachReq in requestDocs.Documents)
{
    Console.WriteLine($"Id: {eachReq.Id} Text: {eachReq.Text}");
}
Console.WriteLine("-=End Requests=-");

// Serialise the request into the JSON body that the Text Analytics endpoints expect.
string req = JsonConvert.SerializeObject(requestDocs);

// MakeRequests is async void, so it's fire-and-forget here; the ReadLine below
// keeps the console open while the calls complete.
MakeRequests(req);
Console.WriteLine("Hit ENTER to exit...");
Console.ReadLine();

PopulateDocuments just fills the request's Documents collection with some test data:




private static LanguageRequest PopulateDocuments()
{
    LanguageRequest requestText = new Microsoft.ProjectOxford.Text.Language.LanguageRequest();
    requestText.Documents.Add(
        new Microsoft.ProjectOxford.Text.Core.Document()
        { Id = "One", Text = "The quick brown fox jumped over the hedge" });
    requestText.Documents.Add(
        new Microsoft.ProjectOxford.Text.Core.Document()
        { Id = "Two", Text = "March is a green month" });
    requestText.Documents.Add(
        new Microsoft.ProjectOxford.Text.Core.Document()
        { Id = "Three", Text = "When I press enter the program crashes" });
    requestText.Documents.Add(
        new Microsoft.ProjectOxford.Text.Core.Document()
        { Id = "4", Text = "Pressing return - the program crashes" });
    requestText.Documents.Add(
        new Microsoft.ProjectOxford.Text.Core.Document()
        { Id = "5", Text = "Los siento, no hablo Enspanol" });
 
    return requestText;
}
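
For reference, serialising that request should produce a JSON body along the following lines. This is an illustrative sketch of the shape the Text Analytics endpoints expect; the exact property casing comes from the client library's serialisation attributes:

{
  "documents": [
    { "id": "One", "text": "The quick brown fox jumped over the hedge" },
    { "id": "Two", "text": "March is a green month" },
    { "id": "Three", "text": "When I press enter the program crashes" },
    { "id": "4", "text": "Pressing return - the program crashes" },
    { "id": "5", "text": "Los siento, no hablo Enspanol" }
  ]
}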

As you can see, I dropped some Spanish in there for the language detection. The MakeRequests method and its dependencies:




static async void MakeRequests(string req)
{
    using (var client = new HttpClient())
    {
        client.BaseAddress = new Uri(BaseUrl);
 
        // Request headers.
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", AccountKey);
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
 
        // Request body. Insert your text data here in JSON format.
        byte[] byteData = Encoding.UTF8.GetBytes(req);
 
        // Detect key phrases:
        var uri = "text/analytics/v2.0/keyPhrases";
        string response = await CallEndpoint(client, uri, byteData);
        Console.WriteLine("Key Phrases");
        Console.WriteLine(ParseResponseGeneric(response));
 
        // Detect language:
        var queryString = HttpUtility.ParseQueryString(string.Empty);
        queryString["numberOfLanguagesToDetect"] = NumLanguages.ToString(CultureInfo.InvariantCulture);
        uri = "text/analytics/v2.0/languages?" + queryString;
        response = await CallEndpoint(client, uri, byteData);
        Console.WriteLine("Detect language");
        Console.WriteLine(ParseResponseLanguage(response));
 
        // Detect topic:
        queryString = HttpUtility.ParseQueryString(string.Empty);
        queryString["minimumNumberOfDocuments"] = "1";
        uri = "text/analytics/v2.0/topics?" + queryString;
        response = await CallEndpoint(client, uri, byteData);
        Console.WriteLine("Detect topic");
        Console.WriteLine(ParseResponseGeneric(response));
 
        // Detect sentiment:
        uri = "text/analytics/v2.0/sentiment";
        response = await CallEndpoint(client, uri, byteData);
        Console.WriteLine("Detect sentiment");
        Console.WriteLine(ParseResponseSentiment(response));
    }
}
private static string ParseResponseSentiment(string response)
{
    if (!string.IsNullOrWhiteSpace(response))
    {
        SentimentResponse resp = JsonConvert.DeserializeObject<SentimentResponse>(response);
        string returnVal = string.Empty;
 
        foreach (var doc in resp.Documents)
        {
            returnVal += Environment.NewLine +
                $"Sentiment: {doc.Id}, Score: {doc.Score}";
        }
 
        return returnVal;
    }
 
    return null;
}
 
private static string ParseResponseLanguage(string response)
{
    if (!string.IsNullOrWhiteSpace(response))
    {
        LanguageResponse resp = JsonConvert.DeserializeObject<LanguageResponse>(response);
        string returnVal = string.Empty;
        foreach(var doc in resp.Documents)
        {
            var detectedLanguage = doc.DetectedLanguages.OrderByDescending(l => l.Score).First();
            returnVal += Environment.NewLine +
                $"Id: {doc.Id}, " +
                $"Language: {detectedLanguage.Name}, " +
                $"Score: {detectedLanguage.Score}";
        }
        return returnVal;
    }
 
    return null;
}
 
private static string ParseResponseGeneric(string response)
{            
    if (!string.IsNullOrWhiteSpace(response))
    {
        return Environment.NewLine + response;                
    }
 
    return null;
}
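
CallEndpoint isn't shown in the snippets above; a minimal sketch, assuming it just POSTs the JSON bytes to the given relative URI and returns the response body as a string, would look something like this:

private static async Task<string> CallEndpoint(HttpClient client, string uri, byte[] byteData)
{
    // POST the JSON payload and hand back the raw response body for the callers to parse.
    using (var content = new ByteArrayContent(byteData))
    {
        content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
        var response = await client.PostAsync(uri, content);
        return await response.Content.ReadAsStringAsync();
    }
}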

The subscription key is given when you register (it's on the screen shown under "Set-up" above). Keep an eye on the number of requests, too: 5,000 seems like a lot, but when you're testing, you might find you get through them faster than you expect.
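
On that note, AccountKey in MakeRequests is just that subscription key; BaseUrl and NumLanguages are constants too, and none of them appear in the snippets above. Something along these lines should work - the region in the URL and the values here are assumptions, so use the endpoint and key shown against your own subscription:

// Assumed constants - substitute your own endpoint and subscription key.
private const string BaseUrl = "https://westus.api.cognitive.microsoft.com/";
private const string AccountKey = "<your subscription key>";
private const int NumLanguages = 1;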

Here’s the output:

[Screenshot: console output from the test run (cog3)]
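
Since the screenshot is just raw console output, it's worth noting the rough shape of the language and sentiment responses that the parsing methods above work against (illustrative values only, not the actual scores from my run):

// Languages response: one or more detected languages per document, each with a confidence score
{ "documents": [ { "id": "5", "detectedLanguages": [ { "name": "Spanish", "iso6391Name": "es", "score": 1.0 } ] } ] }

// Sentiment response: a single score per document, between 0 (negative) and 1 (positive)
{ "documents": [ { "id": "Three", "score": 0.1 } ] }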

Evaluation

So, the five phrases that I used were as follows (for reference, the sentiment score is between 0 and 1, with scores nearer 1 being more positive):

The quick brown fox jumped over the hedge

This is a basic sentence indicating an action.

The KeyPhrases API decided that the key points here were “hedge” and “quick brown fox”. It didn’t think that “jumped” was key to this sentence.

The Language API successfully worked out that it’s written in English.

The Sentiment API thought that this was a slightly negative statement.

March is a green month

This was a nonsense statement, but with a valid sentence structure.

The KeyPhrases API identified “green month” as being important, but not March.

The Language API successfully worked out that it’s written in English.

The Sentiment API thought this was a very positive statement.

When I press enter the program crashes

Again, a completely valid sentence, and one with a view to my ultimate idea for this API.

The KeyPhrases API spotted "program crashes", but not what causes it. I found this interesting, because it seems to conflict with the other phrases, which appeared to identify nouns only.

Again, the Language API knew this was English.

The Sentiment API identified that this was a negative statement… which I think I agree with.

Pressing return - the program crashes

The idea here was that it's basically the same sentence as above, just phrased differently.

The KeyPhrases API wasn’t fooled, and returned the same key phrase - this is good.

Still English, according to the Language API.

This is identified as a negative statement again, but oddly, not as negative as the previous one.

Los siento, no hablo Enspanol

I threw in a Spanish phrase (intended to mean "Sorry, I don't speak Spanish") because I felt the Language API hadn't had much of a run.

The KeyPhrase API pulled out "hablo Espanol", which, based on my very rudimentary Spanish, means the opposite of what was said.

It was correctly identified as Spanish by the Language API.

The Sentiment API identified it as the most negative statement. Perhaps because it has the words "sorry" and "no" in it?

References

Sample code:

https://text-analytics-demo.azurewebsites.net/Home/SampleCode

https://elbruno.com/2016/04/13/cognitiveservices-text-analytics-api-new-operation-detect-key-topics-in-documents/

https://mrfoxsql.wordpress.com/2016/09/13/azure-cognitive-services-apis-with-sql-server-2016-clr/


