Evaluation of Machine Learning Algorithm
Once you have done a machine learning model for classification problem, we want to know the accuracy of prediction of the model. We can use accuracy, precision, recall and f1-score to show how good the model is.
Basic Terms
Positive(P): The ground truth is positive (e.g. it is an iPhone) Negative(N): The ground truth is negative (e.g. it is not an iPhone) True Positive(TP): The prediction is positive; The ground truth is positive. False Positive(FP): The prediction is positive; The ground truth is negative. True Negative(TN): The prediction is negative; The ground truth is negative. False Negative(FN): The prediction is negative; The ground truth is positive.
Error
Proportion of all predictions that are incorrect. Error is a measurement of how bad a model is.
Accuracy
Proportion of all predictions that are correct. Accuracy is a measurement of how good a model is.
Precision
Proportion of all positive predictions that are correct. Precision is a measurement of how many positive predictions were actual positive observations.
Recall
Proportion of all real positive observations that are correct. Precision is a measure of how many actual positive observations were predicted correctly.
F1-Score
The harmonic mean(调和平均数,又称倒数平均数) of precision and recall. F1 score is an ‘average’ of both precision and recall. We use the harmonic mean because it is the appropriate way to average ratios (while arthmetric mean is appropriate when it conceptually makes sense to add things up).
其他概念
Error of Commission: a mistake that consists of doing something wrong, such as including a wrong amount, or including an amount in the wrong place Error of Omission: a mistake that consists of not doing something you should have done, or not including something such as an amount or fact that should be included (Cambridge Dictionary Online) commission errors(错分误差) and omission errors(漏分误差) commission errors + user accuracy(precision) = 1
omission errors + producer accuracy(recall) = 1
user accuracy = TP/(TP+FP)
producer accuracy = TP/(TP+FN)