Category Archives: Machine Learning

Add Evaluation to ML.NET Model

I’ve recently been playing around with ML.NET. I’ve documented some of my escapades here.

One thing I found was that, when trying to work out how effective the trained model was, I was manually rifling through my data and comparing each prediction with the actual result. So I created EvaluateMLNet: a small NuGet package that does this comparison for you.

Step 1 – Import the package

If you followed the previous post, you’ll have a Model project and a ConsoleApp project:

In order to use the package, start by importing the NuGet package into the ConsoleApp project:

Install-Package EvaluateMLNet

Step 2 – Add the data

The next stage is to have some data to test your model against. Add the file to your ConsoleApp project, and remember to set its Copy to Output Directory property to Copy if newer or Copy always.
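If the evaluation later fails to find the file, a quick check like the following can confirm whether it was actually copied next to the executable. This is only a sketch: the file name my-data-file.csv is a placeholder, and it assumes the file is looked up relative to the application's base directory, which may not be how EvaluateMLNet resolves the path.

```csharp
using System;
using System.IO;

static class DataFileCheck
{
    // Builds the full path of a file that sits next to the compiled executable.
    public static string ResolveDataPath(string fileName) =>
        Path.Combine(AppContext.BaseDirectory, fileName);

    static void Main()
    {
        // "my-data-file.csv" is a placeholder name; substitute your own file.
        string path = ResolveDataPath("my-data-file.csv");

        Console.WriteLine(File.Exists(path)
            ? $"Found test data at {path}"
            : $"Missing {path}: check the file's Copy to Output Directory setting");
    }
}
```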

Step 3 – Code

The main program will look something like this to begin with:

        static void Main(string[] args)
        {
            // Create single instance of sample data from first line of dataset for model input
            ModelInput sampleData = new ModelInput()
            {
                Season = @"2019-2020",
                Time = @"7:00pm",
                Home_team_name = @"Liverpool",
                Away_team_name = @"Norwich City",
                Referee = @"Michael Oliver",
                Stadium_name = @"Anfield (Liverpool)",
            };

            // Make a single prediction on the sample data and print results
            var predictionResult = ConsumeModel.Predict(sampleData);

            Console.WriteLine($"Season: {sampleData.Season}");
            Console.WriteLine($"Time: {sampleData.Time}");
            Console.WriteLine($"Home_team_name: {sampleData.Home_team_name}");
            Console.WriteLine($"Away_team_name: {sampleData.Away_team_name}");
            Console.WriteLine($"Referee: {sampleData.Referee}");
            Console.WriteLine($"Stadium_name: {sampleData.Stadium_name}");
            Console.WriteLine($"\n\nPredicted home_team_score: {predictionResult.Score}\n\n");
            Console.WriteLine("=============== End of process, hit any key to finish ===============");
            Console.ReadKey();
        }

Instead of that, start by extracting the prediction code into its own method (PredictData below) – that is, everything after:

// Make a single prediction on the sample data and print results

This should give you:

        static void Main(string[] args)
        {
            // Create single instance of sample data from first line of dataset for model input
            ModelInput sampleData = new ModelInput()
            {
                Season = @"2019-2020",
                Time = @"7:00pm",
                Home_team_name = @"Liverpool",
                Away_team_name = @"Norwich City",
                Referee = @"Michael Oliver",
                Stadium_name = @"Anfield (Liverpool)",
            };

            PredictData(sampleData);
        }

        private static float PredictData(ModelInput sampleData)
        {
            // Make a single prediction on the sample data and print results
            var predictionResult = ConsumeModel.Predict(sampleData);

            Console.WriteLine($"Season: {sampleData.Season}");
            Console.WriteLine($"Time: {sampleData.Time}");
            Console.WriteLine($"Home_team_name: {sampleData.Home_team_name}");
            Console.WriteLine($"Away_team_name: {sampleData.Away_team_name}");
            Console.WriteLine($"Referee: {sampleData.Referee}");
            Console.WriteLine($"Stadium_name: {sampleData.Stadium_name}");

            return predictionResult.Score;
        }

Note that we’re also returning the result of the prediction. In fact, that method only needs to return the result of the prediction – the Console.WriteLines are unnecessary.
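For illustration, a trimmed version of that method might look like the sketch below. The ModelInput, ModelOutput and ConsumeModel types here are cut-down stand-ins so that the snippet compiles on its own; in the real solution they come from the generated Model project, and the stubbed Predict simply returns a fixed score.

```csharp
using System;

// Stand-ins for the generated classes, so the sketch compiles on its own.
// In the real solution, delete these and reference the Model project instead.
class ModelInput
{
    public string Home_team_name { get; set; }
}

class ModelOutput
{
    public float Score { get; set; }
}

static class ConsumeModel
{
    // Placeholder only: the generated version loads the trained model and runs it.
    public static ModelOutput Predict(ModelInput input) => new ModelOutput { Score = 0f };
}

static class Program
{
    // Returns only the predicted value; no console output is needed for evaluation.
    internal static float PredictData(ModelInput sampleData) =>
        ConsumeModel.Predict(sampleData).Score;

    static void Main()
    {
        var sampleData = new ModelInput { Home_team_name = "Liverpool" };
        Console.WriteLine(PredictData(sampleData));
    }
}
```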

Finally, replace the Main method with the following:

        static void Main(string[] args)
        {
            var runEvaluation = new RunEvaluation();
            var resultStats = runEvaluation.Run<ModelInput, float>("my-data-file.csv",
                "Predicted_field_name", PredictData, 0);

            Console.WriteLine("Results");
            Console.WriteLine("Total evaluated results: {0}", resultStats.EvaluatedCount);
            Console.WriteLine("Total success results: {0}", resultStats.SuccessCount);
            Console.ReadLine();            
        }

A few comments about this code:

1. The “Predicted_field_name” is the name of the field in the class ModelInput. It’s very likely to have a capitalised first letter.
2. My data is predicting a float – if yours is not then you’ll need to change this.
3. The margin of error here is 0, which means a prediction only counts as a success when the actual value is one of the two whole numbers either side of it. For example, if the prediction was 1.3, then actual values of 1 and 2 would be considered a success, but 0 and 3 would not.
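To make the margin rule in the last point concrete, here is a minimal sketch of one way such a check could work. It is inferred purely from the 1.3 example above and is not taken from the package's source: the IsSuccess name is hypothetical, and the assumption that a non-zero margin simply widens the range by that many whole numbers on each side is mine.

```csharp
using System;

static class MarginSketch
{
    // Assumed rule: a prediction succeeds when the actual value lies between
    // floor(predicted) - margin and ceiling(predicted) + margin, inclusive.
    public static bool IsSuccess(float predicted, float actual, int margin)
    {
        double lower = Math.Floor(predicted) - margin;
        double upper = Math.Ceiling(predicted) + margin;
        return actual >= lower && actual <= upper;
    }

    static void Main()
    {
        // Replays the example from the text: a prediction of 1.3 with a margin of 0.
        foreach (float actual in new[] { 0f, 1f, 2f, 3f })
        {
            Console.WriteLine($"predicted 1.3, actual {actual}: {IsSuccess(1.3f, actual, 0)}");
        }
    }
}
```

With a margin of 0 this reproduces the example: actual values of 1 and 2 pass, while 0 and 3 fail.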

That’s it. The output will look something like this:

Summary

I realise that this is feeding a very niche crowd, but hopefully it’ll save someone a Saturday afternoon.

Predicting Football Results Using ML.Net

Have you ever wondered why milk is where it is in the supermarket? At least in the UK, the supermarkets sell milk either at cost, or even below cost, in order to attract people into the store: they don’t want you to just buy milk, because they lose money on it. You’ll know the stuff they want you to buy, because it’s at eye level, and it’s right in front of you when you walk in the door.

ML.NET is an open source machine learning platform. As with many things that Microsoft are the guardian of, they want to sell you Azure time, and so this is just another pint of milk at the back of the shop. Having said that – it’s pretty good milk!

In this post, I’m going to set up a very simple test. I’ll be using this file. It shows the results of the English Premier League from 2018-2019. I’m not a huge football fan myself, but it was the only data I could find at short notice.

Add ML.NET

The ML.NET tooling for Visual Studio (Model Builder) is in preview, so the first step is to add the feature in the Visual Studio Installer. Oddly, it’s under the “Cross Platform Development” workload:

Once you’ve added this, you might reasonably expect something to change. Either nothing will, or you’ll see a new context menu item when you right-click a project, but it won’t do anything. This is, bizarrely, because you need to explicitly enable preview features; under Tools -> Options, you’ll find this menu:

Let’s create a new console application; then right click on the project:

You’re now given a list of “scenarios”:

For our purpose, let’s select “Value prediction”. We’re going to try to predict the number of goals for the home team, based on the shots on goal. Just select the file as input data and the column to predict as home_team_goal_count:

For the input, just select home_team_shots and then Train:

It asks you for a training time here. The longer you give it, the better the model, although there will be a point at which additional time won’t make any difference. You should be able to get a reasonable prediction after 10 seconds, but I’ve picked 60 to see how good a prediction it can make. As someone who knows nothing about football, I would expect these two figures to correlate almost directly.
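If you wanted to test that expectation rather than take it on trust, one option is to compute the Pearson correlation coefficient between the two columns. The sketch below is only an illustration: the Pearson helper is hypothetical, and the numbers in Main are synthetic values chosen to be perfectly linear, not rows from the Premier League file.

```csharp
using System;
using System.Linq;

static class CorrelationSketch
{
    // Pearson correlation coefficient: +1 is a perfect positive linear
    // relationship, 0 is none, -1 is a perfect negative one.
    // Returns NaN if either column has no variation.
    public static double Pearson(double[] x, double[] y)
    {
        if (x.Length != y.Length || x.Length < 2)
        {
            throw new ArgumentException("Both columns need the same length (at least 2).");
        }

        double meanX = x.Average();
        double meanY = y.Average();
        double covariance = 0, varianceX = 0, varianceY = 0;

        for (int i = 0; i < x.Length; i++)
        {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }

        return covariance / Math.Sqrt(varianceX * varianceY);
    }

    static void Main()
    {
        // Synthetic, perfectly linear values purely to exercise the function.
        double[] shots = { 1, 2, 3, 4 };
        double[] goals = { 2, 4, 6, 8 };
        Console.WriteLine(Pearson(shots, goals));
    }
}
```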

Once you’ve finished training the model, you can Evaluate it:

So, it would appear that with 9 shots at goal, I can expect a team to score between 1 and 2 goals. If I now click the Code button, ML.NET will create two new projects for me, including a new console application, which looks like this:

        static void Main(string[] args)
        {
            // Create single instance of sample data from first line of dataset for model input
            ModelInput sampleData = new ModelInput()
            {
                Home_team_shots = 9F,
            };

            // Make a single prediction on the sample data and print results
            var predictionResult = ConsumeModel.Predict(sampleData);

            Console.WriteLine("Using model to make single prediction -- Comparing actual Home_team_goal_count with predicted Home_team_goal_count from sample data...\n\n");
            Console.WriteLine($"Home_team_shots: {sampleData.Home_team_shots}");
            Console.WriteLine($"\n\nPredicted Home_team_goal_count: {predictionResult.Score}\n\n");
            Console.WriteLine("=============== End of process, hit any key to finish ===============");
            Console.ReadKey();
        }

Let’s modify this slightly, so that we can simply ask it to predict the goal count:

        static void Main(string[] args)
        {
            Console.WriteLine("Enter shots at goal: ");
            string shots = Console.ReadLine();
            if (int.TryParse(shots, out int shotsNum))
            {
                PredictGoals(shotsNum);
            }
            else
            {
                // Report bad input instead of exiting silently
                Console.WriteLine($"'{shots}' is not a whole number.");
            }
        }

        private static void PredictGoals(int shots)
        {
            // Build the model input from the number of shots entered by the user
            ModelInput sampleData = new ModelInput()
            {
                Home_team_shots = shots,
            };

            // Make a single prediction on the input and print the result
            var predictionResult = ConsumeModel.Predict(sampleData);

            Console.WriteLine("Using model to make single prediction -- Comparing actual Home_team_goal_count with predicted Home_team_goal_count from sample data...\n\n");
            Console.WriteLine($"Home_team_shots: {sampleData.Home_team_shots}");
            Console.WriteLine($"\n\nPredicted Home_team_goal_count: {predictionResult.Score}\n\n");
            Console.WriteLine("=============== End of process, hit any key to finish ===============");
            Console.ReadKey();
        }

And now, we can get a prediction from the app:

29 shots at goal result in only 2 – 3 goals. We can glance at the spreadsheet to see how accurate this is:

It appears it is actually quite accurate!
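Rather than glancing at the spreadsheet, the same check could be scripted. The sketch below averages the actual goals over every row with a given number of shots. It rests on several assumptions: the AverageGoals helper is hypothetical, the CSV is assumed to have a header row with columns named home_team_shots and home_team_goal_count, and the naive comma split will break on quoted fields that contain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

static class SpreadsheetCheck
{
    // Averages the goals column over rows whose shots column equals `shots`.
    // Returns null when there is no header or no matching row.
    public static double? AverageGoals(IEnumerable<string> csvLines, int shots,
        string shotsColumn = "home_team_shots",
        string goalsColumn = "home_team_goal_count")
    {
        var lines = csvLines.ToList();
        if (lines.Count == 0) return null;

        var header = lines[0].Split(',').Select(h => h.Trim()).ToList();
        int shotsIdx = header.IndexOf(shotsColumn);
        int goalsIdx = header.IndexOf(goalsColumn);
        if (shotsIdx < 0 || goalsIdx < 0)
        {
            throw new ArgumentException("Expected column not found in header.");
        }

        var goals = new List<double>();
        foreach (var line in lines.Skip(1))
        {
            var cells = line.Split(',');
            if (cells.Length <= Math.Max(shotsIdx, goalsIdx)) continue; // skip short rows

            bool parsed =
                double.TryParse(cells[shotsIdx], NumberStyles.Float, CultureInfo.InvariantCulture, out double s) &&
                double.TryParse(cells[goalsIdx], NumberStyles.Float, CultureInfo.InvariantCulture, out double g);
            if (parsed && s == shots) goals.Add(g);
        }

        return goals.Count > 0 ? goals.Average() : (double?)null;
    }

    static void Main(string[] args)
    {
        // Usage: <program> <path-to-csv> <shots>
        if (args.Length < 2 || !int.TryParse(args[1], out int shots))
        {
            Console.WriteLine("Usage: <program> <path-to-csv> <shots>");
            return;
        }

        double? average = AverageGoals(File.ReadLines(args[0]), shots);
        Console.WriteLine(average.HasValue
            ? $"Average actual goals for {shots} shots: {average.Value:F2}"
            : $"No matches found with {shots} shots.");
    }
}
```

Comparing that average with the model's prediction for the same number of shots gives a rough sanity check, although a single shot count may only have a handful of matches behind it.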