{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Confusion matrix\n", "\n", "A confusion matrix is a table that summarizes the results of a classifier. For k-class classification, it is simply a $k \\times k$ table that records how the classifier's predictions compare with the true labels.\n", "\n", "For binary classification, the most common case, the confusion matrix is 2 by 2, as shown below.\n", "![confusion_matrix1](images/confusion_matrix1.png)\n", "\n", "* `TP` = True Positive (a positive sample predicted as positive)\n", "* `FP` = False Positive (a negative sample predicted as positive)\n", "* `FN` = False Negative (a positive sample predicted as negative)\n", "* `TN` = True Negative (a negative sample predicted as negative)\n", "\n", "Here is the example you were waiting for. Suppose a model makes predictions on 15 samples, with the following results.\n", "\n", "* Predicted: 1 1 1 1 1 0 0 0 0 0 1 1 1 0 1\n", "* Actual: 0 1 1 0 1 1 0 0 1 0 1 0 1 0 0\n", "\n", "![confusion_matrix2](images/confusion_matrix2.png)\n", "\n", "This is the confusion matrix. Comparing the two rows position by position gives TP = 5, FP = 4, FN = 2, TN = 4. These four values are often used to define other metrics.\n", "\n", "### Accuracy\n", "```\n", "Accuracy = (TP+TN) / (TP+FP+FN+TN)\n", "```\n", "\n", "In the example above, accuracy = (5+4) / 15 = 0.6\n", "\n", "### Precision (also PPV, positive predictive value)\n", "```\n", "precision = TP / (TP + FP)\n", "```\n", "\n", "In the example above, precision = 5 / (5+4) = 0.556\n", "\n", "### Recall (also sensitivity, true positive rate, TPR)\n", "```\n", "recall = TP / (TP + FN)\n", "```\n", "\n", "In the example above, recall = 5 / (5+2) = 0.714\n", "\n", "### Specificity (also true negative rate, TNR)\n", "```\n", "specificity = TN / (TN + FP)\n", "```\n", "\n", "In the example above, specificity = 4 / (4+4) = 0.5\n", "\n", "### F1 score\n", "```\n", "F1 = 2*TP / (2*TP+FP+FN)\n", "```\n", "\n", "In the example above, F1 score = 2*5 / (2*5+4+2) = 0.625" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" }, "main_language": "python" }, "nbformat": 4, "nbformat_minor": 2 }