In writing this blog, I probably should have started with the basics of machine learning, such as supervised and unsupervised models or training and testing datasets. But those topics have been covered a lot in this space, and most people have already used the available labelled datasets to build supervised machine learning models, or unlabelled data to find clusters and associations.
In this article, I will address the last but most important step when dealing with machine learning models: how to determine the accuracy of a model once you have implemented it. This is key because a very low accuracy means something was missed when fitting the model to the dataset you had. Most often the cause is underfitting, which occurs for two major reasons:
- The model fails to fit the dataset provided because it could not find any trend in the data
- The model is fitted to the wrong kind of data, e.g. a linear model fitted to a nonlinear dataset
Another major situation that influences the accuracy of the model is overfitting to the training dataset. This is mostly caused by the dataset having too many explanatory variables, with the model trying to incorporate every one of them.
When you implement a model, it is essential to determine its accuracy before recommending it for use in production. I will explain the metrics in layman's terms, and where each one is mostly used, in a series of articles. In this article, the focus is MAE. Below are some of the metrics you can use in machine learning:
- Mean Absolute Error
- Mean Absolute Percentage Error
- Mean Squared Error
- R squared
- Confusion Matrix
Mean Absolute Error
As the name suggests, the metric focuses on errors, that is, the difference between the actual observation and the predicted observation. MAE is mostly used to evaluate regression models such as linear models, where the observations are continuous. To implement it in any language, follow the steps below in order.
- Get the error: Error = Actual observation - Predicted observation. Once you have all the errors, you will notice that some are positive and others are negative.
- Get the absolute error: Absolute Error = |Error|. This step drops the sign, so positive and negative errors are treated alike.
- Get the average (mean) of the absolute errors: add up all the absolute errors and divide by the total number of observations.
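The steps above can be written compactly as a formula. The symbols are notation introduced here for illustration: $y_i$ is the actual observation, $\hat{y}_i$ is the predicted observation, and $n$ is the total number of observations.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
```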
Practical example: predicting the price of houses
|House description||Predicted cost (linear model)||Actual cost||Error (Actual - Predicted)||Absolute error|
|2-bedroom, 2 baths, kitchen and balcony||$18,700||$20,000||+1,300||1,300|
|3-bedroom, kitchen, 2 baths, dry cleaner, gas cooker||$43,200||$40,000||-3,200||3,200|
|3-bedroom, kitchen, 3 baths||$27,800||$30,000||+2,200||2,200|
|4-bedroom, 2 baths, dish washer, dry cleaner, kitchen||$63,200||$58,000||-5,200||5,200|
|2-bedroom, dry cleaner, electric cooker, dish washer||$22,400||$25,000||+2,600||2,600|
Getting the average of the absolute errors:
1,300 + 3,200 + 2,200 + 5,200 + 2,600 = 14,500
14,500 / 5 = 2,900
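The same calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the `houses` list simply copies the predicted and actual prices from the table, and the `mean_absolute_error` helper is a name chosen here for the example.

```python
# (predicted, actual) prices in dollars, copied from the table above.
houses = [
    (18_700, 20_000),
    (43_200, 40_000),
    (27_800, 30_000),
    (63_200, 58_000),
    (22_400, 25_000),
]


def mean_absolute_error(pairs):
    """Average of |actual - predicted| over (predicted, actual) pairs."""
    # Steps 1 and 2: compute each error and drop its sign.
    absolute_errors = [abs(actual - predicted) for predicted, actual in pairs]
    # Step 3: average the absolute errors.
    return sum(absolute_errors) / len(absolute_errors)


mae = mean_absolute_error(houses)
print(f"MAE = ${mae:,.0f}")  # prints: MAE = $2,900
```

If you would rather not write the function by hand, scikit-learn provides an equivalent `mean_absolute_error(y_true, y_pred)` in `sklearn.metrics`.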
Interpreting MAE results:
- The result can range from 0 to infinity
- The MAE result is not affected by the direction of the errors, since we use absolute errors
- The lower the result the better
- An MAE of $2,900 is our measure of model quality: on average, our model's predictions are off by approximately $2,900
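MAE takes every error across the predicted values into account, but it weights them linearly: each unit of error counts the same, so missing the right value by 5 is exactly five times as bad as missing it by 1, and no more than that. It also ignores how large an error is relative to the actual value, so a $2,900 miss on a $20,000 house counts the same as a $2,900 miss on a $58,000 house. If the size of the error relative to the actual value matters, consider using MAPE, which expresses each error as a percentage of the actual value (it will be explored in the next article).
MAE is best used in scenarios where large errors do not need a heavier penalty than small ones and the size of an error relative to the actual value is not important.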
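A small sketch of how MAE weighs errors, using made-up numbers (the `mean_absolute_error` helper and both prediction lists are illustrative assumptions, not part of any library). Two sets of predictions with the same total absolute error get the same MAE, whether the error is spread evenly or concentrated in one large miss. The last lines show that the same dollar miss is a very different share of a cheap house and an expensive one, which is the gap a percentage-based metric such as MAPE is meant to cover.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|; both lists must have equal length."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [100, 100, 100, 100, 100]

# Five misses of 2 each (total absolute error: 10).
even_misses = [98, 98, 98, 98, 98]
# Four misses of 1 and one miss of 6 (total absolute error: 10).
one_big_miss = [99, 99, 99, 99, 94]

mae_even = mean_absolute_error(actual, even_misses)
mae_big = mean_absolute_error(actual, one_big_miss)
print(mae_even, mae_big)  # prints: 2.0 2.0

# The same absolute miss as a share of two different actual prices.
miss = 2_900
share_cheap = miss / 20_000
share_expensive = miss / 58_000
print(f"{share_cheap:.1%} vs {share_expensive:.1%}")  # prints: 14.5% vs 5.0%
```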
Point to note: if you skip the second step (taking the absolute value) and average the raw errors instead, the result is called the Mean Bias Error (MBE). It measures the average bias of the model itself, that is, whether its predictions tend to fall above or below the actual observations. Interpret the result with care: positive and negative errors cancel each other out, so a model with large errors in both directions can still show an MBE close to zero.
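To make the cancelling effect concrete, here is a hedged sketch that computes the Mean Bias Error for the house example above. The `houses` list copies the predicted and actual prices from the table, and `mean_bias_error` is a helper name chosen here for illustration.

```python
# (predicted, actual) prices in dollars, copied from the house table.
houses = [
    (18_700, 20_000),
    (43_200, 40_000),
    (27_800, 30_000),
    (63_200, 58_000),
    (22_400, 25_000),
]


def mean_bias_error(pairs):
    """Average of the signed errors (actual - predicted)."""
    errors = [actual - predicted for predicted, actual in pairs]
    return sum(errors) / len(errors)


mbe = mean_bias_error(houses)
print(f"MBE = {mbe:,.0f}")  # prints: MBE = -460
```

With the Actual - Predicted convention used in this article, a negative value means the model predicts too high on average. Here the signed errors largely cancel, leaving -$460, which is much smaller in size than the MAE of $2,900 for the same predictions.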