|
|
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Logistic Regression Model"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
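- "In the previous lesson we studied a simple linear regression model. In this lesson we study a second model: logistic regression.\n",
- "\n",
- "Logistic regression is a generalized linear model. It has much in common with multiple linear regression, and the form of the model is nearly the same. Despite the word \"regression\" in its name, it is used mostly for classification, and most often for binary classification."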
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
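- "## 1. Model Form\n",
- "\n",
- "Logistic regression starts from the same linear form as linear regression, $y = wx + b$, where $x$ may be a multi-dimensional feature vector. The only difference is that logistic regression then applies a logistic function to $y$, turning it into a probability. Writing $\\theta$ for the parameters ($w$ together with $b$), the model is\n",
- "\n",
- "$$\n",
- "h_\\theta(x) = g(\\theta^T x) = \\frac{1}{1+e^{-\\theta^T x}}\n",
- "$$\n",
- "\n",
- "The logistic function, also called the Sigmoid function, is the core of logistic regression, so we look at it next."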
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 1.1 Sigmoid Function\n",
- "The Sigmoid function is very simple. Its formula is\n",
- "\n",
- "$$\n",
- "f(x) = \\frac{1}{1 + e^{-x}}\n",
- "$$\n",
- "\n",
- "and its graph is shown below."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "%matplotlib inline\n",
- "import matplotlib.pyplot as plt\n",
- "import numpy as np\n",
- "\n",
- "plt.figure()\n",
- "plt.axis([-10,10,0,1])\n",
- "plt.grid(True)\n",
- "X=np.arange(-10,10,0.1)\n",
- "y=1/(1+np.e**(-X))\n",
- "plt.plot(X,y,'b-')\n",
- "plt.title(\"Logistic function\")\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "\n",
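- "The output of the Sigmoid function lies strictly between 0 and 1, so any real value passed through it becomes a value in that range. This value can be interpreted as a probability. In a binary classification problem, for example, the smaller the value, the more likely the sample belongs to the first class; the larger the value, the more likely it belongs to the second class."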
- ]
- },
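The probability interpretation described above can be sketched in plain Python. This is a minimal illustration, not part of the original notebook: the helper names `sigmoid` and `predict_class` and the 0.5 threshold are assumptions chosen for the example, with label 0 standing for the first class and label 1 for the second.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), arranged so exp() never overflows."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so e is in (0, 1)
    return e / (1.0 + e)

def predict_class(z, threshold=0.5):
    """Return 1 (second class) if the probability reaches the threshold, else 0 (first class)."""
    return 1 if sigmoid(z) >= threshold else 0

# Print the raw score, its probability, and the predicted class
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(z, round(sigmoid(z), 4), predict_class(z))
```

A score of 0 maps to a probability of exactly 0.5, and the function is symmetric in the sense that `sigmoid(-z) == 1 - sigmoid(z)` up to rounding.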
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
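- "Logistic regression also assumes that the data are approximately linearly separable, that is, that the data set can be split into two parts by a linear boundary in some feature space, for example\n",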
- "\n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As you can see, the green points and the blue points above can be separated almost completely by the black plane."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
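- "### 1.2 Loss Function\n",
- "In the previous lesson we used a loss to measure the error of a regression problem. How do we measure the error of a classification problem and design a loss function for it?\n",
- "\n",
- "Logistic regression uses the Sigmoid function to map its result into the range 0 to 1. For any input data point, we write the Sigmoid output as $\\hat{y}$, the probability that the point belongs to the second class; the probability that it belongs to the first class is then $1-\\hat{y}$. If the point belongs to the second class, we want $\\hat{y}$ to be as large as possible, that is, as close to 1 as possible. If it belongs to the first class, we want $1-\\hat{y}$ to be as large as possible, that is, $\\hat{y}$ as close to 0 as possible. This suggests the following loss function\n",
- "\n",
- "$$\n",
- "loss = -(y * log(\\hat{y}) + (1 - y) * log(1 - \\hat{y}))\n",
- "$$\n",
- "\n",
- "Here $y$ is the true label and takes only the values {0, 1}, while $\\hat{y}$ is the prediction of the logistic regression, a number between 0 and 1. If $y$ is 0, the point belongs to the first class and we want $\\hat{y}$ to be as small as possible. The loss function above becomes\n",
- "\n",
- "$$\n",
- "loss = - (log(1 - \\hat{y}))\n",
- "$$\n",
- "\n",
- "When training the model we minimize the loss. Because $log$ is monotonically increasing, minimizing $-log(1 - \\hat{y})$ is the same as minimizing $\\hat{y}$, which is what we want.\n",
- "\n",
- "If $y$ is 1, the point belongs to the second class and we want $\\hat{y}$ to be as large as possible. The loss function becomes\n",
- "\n",
- "$$\n",
- "loss = -(log(\\hat{y}))\n",
- "$$\n",
- "\n",
- "Minimizing this loss is the same as maximizing $\\hat{y}$, which again matches our requirement.\n",
- "\n",
- "This shows that the loss function is a reasonable design. It is commonly known as the binary cross-entropy loss."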
- ]
- },
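To make the behaviour of this loss concrete, the following sketch evaluates it for a few label/prediction pairs. It is a plain-Python illustration under stated assumptions: the function name `binary_cross_entropy` and the sample values are made up for the example and do not come from the notebook's data.

```python
import math

def binary_cross_entropy(y_true, y_hat):
    """Loss for one sample: -(y * log(y_hat) + (1 - y) * log(1 - y_hat))."""
    return -(y_true * math.log(y_hat) + (1 - y_true) * math.log(1 - y_hat))

# (true label, predicted probability of the second class); illustrative values only
samples = [(1, 0.9), (1, 0.1), (0, 0.1), (0, 0.9)]

for y_true, y_hat in samples:
    print(y_true, y_hat, round(binary_cross_entropy(y_true, y_hat), 4))
```

Confident correct predictions (label 1 with 0.9, label 0 with 0.1) give a small loss, while confident wrong predictions give a much larger one, which is the property argued for above.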
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 1.3 Code Example\n",
- "\n",
- "Let us now work through logistic regression with a concrete example."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<torch._C.Generator at 0x7f36e27d3490>"
- ]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import torch\n",
- "from torch.autograd import Variable\n",
- "import numpy as np\n",
- "import matplotlib.pyplot as plt\n",
- "%matplotlib inline\n",
- "\n",
- "# Set the random seed\n",
- "torch.manual_seed(2021)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We read the data from `data.txt`. After loading the points, we split them into red and blue according to their labels and plot them."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f36e004f310>"
- ]
- },
- "execution_count": 13,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Read the points from data.txt\n",
- "with open('./data.txt', 'r') as f:\n",
- "    data_list = [i.split('\\n')[0].split(',') for i in f.readlines()]\n",
- "    data = [(float(i[0]), float(i[1]), float(i[2])) for i in data_list]\n",
- "\n",
- "# Normalize each feature by its maximum value\n",
- "x0_max = max([i[0] for i in data])\n",
- "x1_max = max([i[1] for i in data])\n",
- "data = [(i[0]/x0_max, i[1]/x1_max, i[2]) for i in data]\n",
- "\n",
- "x0 = list(filter(lambda x: x[-1] == 0.0, data)) # points of the first class\n",
- "x1 = list(filter(lambda x: x[-1] == 1.0, data)) # points of the second class\n",
- "\n",
- "plot_x0 = [i[0] for i in x0]\n",
- "plot_y0 = [i[1] for i in x0]\n",
- "plot_x1 = [i[0] for i in x1]\n",
- "plot_y1 = [i[1] for i in x1]\n",
- "\n",
- "plt.plot(plot_x0, plot_y0, 'ro', label='x_0')\n",
- "plt.plot(plot_x1, plot_y1, 'bo', label='x_1')\n",
- "plt.legend(loc='best')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Next we convert the data to a NumPy array and then to a Tensor, in preparation for training."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [],
- "source": [
- "np_data = np.array(data, dtype='float32') # convert to a NumPy array\n",
- "x_data = torch.from_numpy(np_data[:, 0:2]) # convert to a Tensor of size [100, 2]\n",
- "y_data = torch.from_numpy(np_data[:, 2]).unsqueeze(1)\n",
- "\n",
- "x_data = Variable(x_data)\n",
- "y_data = Variable(y_data)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
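- "In PyTorch we do not need to write the Sigmoid function ourselves. PyTorch already implements many common functions in C++, which makes them convenient to use and typically faster and more numerically stable than a hand-written version."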
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Define the logistic regression model\n",
- "w = Variable(torch.randn(2, 1), requires_grad=True)\n",
- "b = Variable(torch.zeros(1), requires_grad=True)\n",
- "\n",
- "def logistic_regression(x):\n",
- "    return torch.sigmoid(torch.mm(x, w) + b)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Before updating the parameters, we can plot the current classification result."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f36e0017610>"
- ]
- },
- "execution_count": 16,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Plot the result before the parameters are updated\n",
- "w0 = w[0].data[0]\n",
- "w1 = w[1].data[0]\n",
- "b0 = b.data[0]\n",
- "\n",
- "plot_x = np.arange(0.2, 1, 0.01)\n",
- "plot_y = (-w0.numpy() * plot_x - b0.numpy()) / w1.numpy()\n",
- "\n",
- "plt.plot(plot_x, plot_y, 'g', label='cutting line')\n",
- "plt.plot(plot_x0, plot_y0, 'ro', label='x_0')\n",
- "plt.plot(plot_x1, plot_y1, 'bo', label='x_1')\n",
- "plt.legend(loc='best')"
- ]
- },
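The green "cutting line" in the plotting code above is the set of points where the model outputs exactly 0.5, that is, where $w_0 x_1 + w_1 x_2 + b = 0$. Solving for $x_2$ gives the expression used for `plot_y`. The sketch below checks this in plain Python; the parameter values and the helper names `boundary_x2` and `probability` are hypothetical, chosen only for illustration.

```python
import math

def boundary_x2(x1, w0, w1, b):
    """Solve w0*x1 + w1*x2 + b = 0 for x2 (requires w1 != 0)."""
    return -(w0 * x1 + b) / w1

def probability(x1, x2, w0, w1, b):
    """Model output: sigmoid of the linear score."""
    return 1.0 / (1.0 + math.exp(-(w0 * x1 + w1 * x2 + b)))

# Hypothetical parameters, not the ones learned in this notebook
w0, w1, b = 2.0, -4.0, 1.0

x1 = 0.5
x2 = boundary_x2(x1, w0, w1, b)
on_line = probability(x1, x2, w0, w1, b)        # point on the boundary
above = probability(x1, x2 + 0.1, w0, w1, b)    # shifted off the boundary
print(x2, on_line, above)
```

Points on one side of the line get a probability above 0.5 and points on the other side get one below 0.5; which side is which depends on the sign of `w1`.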
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The classification is essentially random at this point. Let us compute the loss, using the formula\n",
- "\n",
- "$$\n",
- "loss = -\\{ y * log(\\hat{y}) + (1 - y) * log(1 - \\hat{y}) \\}\n",
- "$$"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Compute the loss. clamp keeps the argument of log at or above 1e-12, so log(0) = -inf never occurs.\n",
- "def binary_loss(y_pred, y):\n",
- "    logits = (y * y_pred.clamp(1e-12).log() + \\\n",
- "        (1 - y) * (1 - y_pred).clamp(1e-12).log()).mean()\n",
- "    return -logits"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Note the use of `.clamp` here. Look it up in the [documentation](http://pytorch.org/docs/0.3.0/torch.html?highlight=clamp#torch.clamp), and consider whether this function is strictly necessary and what would happen without it."
- ]
- },
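One way to explore the question above is to look at what happens when a prediction saturates at exactly 0 or 1. In floating-point arithmetic the log of 0 is `-inf`, and `0 * -inf` is `nan`, so a single saturated sample can turn the whole mean loss into `nan`. The sketch below imitates this in plain Python rather than with tensors; `log_like_tensor` and `safe_log` are hypothetical helpers written for this illustration, with `safe_log` standing in for `p.clamp(1e-12).log()`.

```python
import math

def log_like_tensor(p):
    """Return -inf for p == 0, as IEEE floating-point log does (math.log would raise instead)."""
    return math.log(p) if p > 0 else float("-inf")

def safe_log(p, eps=1e-12):
    """Raise p to at least eps before taking the log, so the result is always finite."""
    return math.log(max(p, eps))

y, y_hat = 1.0, 1.0  # a saturated but correct prediction

# Without clamping: (1 - y) * log(1 - y_hat) = 0 * -inf = nan
unclamped = -(y * log_like_tensor(y_hat) + (1 - y) * log_like_tensor(1 - y_hat))

# With clamping: the same term becomes 0 * log(1e-12) = 0
clamped = -(y * safe_log(y_hat) + (1 - y) * safe_log(1 - y_hat))

print(unclamped, clamped)
```

With the inputs used in this notebook the Sigmoid output rarely saturates, so the clamp mostly acts as a safeguard rather than changing the computed loss.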
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor(0.7655, grad_fn=<NegBackward>)\n"
- ]
- }
- ],
- "source": [
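- "y_pred = logistic_regression(x_data)\n",
- "loss = binary_loss(y_pred, y_data)\n",
- "# backward() is called inside the training loop below; calling it here as well\n",
- "# would leave a gradient that accumulates into the first update\n",
- "print(loss)"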
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
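- "With the loss in hand, we again use gradient descent to update the parameters. Automatic differentiation gives us the derivatives directly; interested readers can derive the gradient formulas by hand."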
- ]
- },
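For readers who try the hand derivation, the result for the mean loss over $N$ samples is $\partial L / \partial w_j = \frac{1}{N} \sum_i (\hat{y}_i - y_i)\, x_{ij}$ and $\partial L / \partial b = \frac{1}{N} \sum_i (\hat{y}_i - y_i)$. The sketch below checks these formulas against a central finite-difference estimate in plain Python. The toy data, the starting parameters, and all function names are assumptions made for this example; it does not use the notebook's `data.txt` or autograd.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Sigmoid of the linear score w . x + b for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def mean_loss(w, b, xs, ys):
    """Mean binary cross-entropy over all samples."""
    losses = [-(y * math.log(predict(w, b, x)) + (1 - y) * math.log(1 - predict(w, b, x)))
              for x, y in zip(xs, ys)]
    return sum(losses) / len(losses)

def analytic_grad(w, b, xs, ys):
    """Gradient from the hand-derived formula: mean of (y_hat - y) * x, and mean of (y_hat - y)."""
    n = len(xs)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in zip(xs, ys):
        err = predict(w, b, x) - y
        for j, xj in enumerate(x):
            grad_w[j] += err * xj / n
        grad_b += err / n
    return grad_w, grad_b

# Made-up toy data and parameters
xs = [(0.2, 0.7), (0.9, 0.4), (0.5, 0.5), (0.8, 0.9)]
ys = [0, 1, 0, 1]
w, b = [0.3, -0.2], 0.1

grad_w, grad_b = analytic_grad(w, b, xs, ys)

# Central finite differences as an independent check
eps = 1e-6
numeric_w0 = (mean_loss([w[0] + eps, w[1]], b, xs, ys)
              - mean_loss([w[0] - eps, w[1]], b, xs, ys)) / (2 * eps)
numeric_b = (mean_loss(w, b + eps, xs, ys) - mean_loss(w, b - eps, xs, ys)) / (2 * eps)

print(grad_w[0], numeric_w0)
print(grad_b, numeric_b)
```

The two printed pairs should agree to several decimal places, which is the same quantity that `loss.backward()` stores in `w.grad` and `b.grad`.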
- {
- "cell_type": "code",
- "execution_count": 37,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "\n",
- "During Time: 0.306 s\n"
- ]
- }
- ],
- "source": [
- "import time\n",
- "\n",
- "start = time.time()\n",
- "\n",
- "# Compute gradients with autograd and update the parameters\n",
- "for i in range(1000):\n",
- "    # forward pass: loss for the current parameters\n",
- "    y_pred = logistic_regression(x_data)\n",
- "    loss = binary_loss(y_pred, y_data)\n",
- "\n",
- "    # compute gradients and update w, b\n",
- "    loss.backward()\n",
- "    w.data = w.data - 0.1 * w.grad.data\n",
- "    b.data = b.data - 0.1 * b.grad.data\n",
- "\n",
- "    # clear the gradients of w, b\n",
- "    w.grad.data.zero_()\n",
- "    b.grad.data.zero_()\n",
- "\n",
- "during = time.time() - start\n",
- "print()\n",
- "print('During Time: {:.3f} s'.format(during))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f36cedaf310>"
- ]
- },
- "execution_count": 26,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Plot the result after the parameters are updated\n",
- "w0 = w[0].data[0]\n",
- "w1 = w[1].data[0]\n",
- "b0 = b.data[0]\n",
- "\n",
- "plot_x = np.arange(0.2, 1, 0.01)\n",
- "plot_y = (-w0.numpy() * plot_x - b0.numpy()) / w1.numpy()\n",
- "\n",
- "plt.plot(plot_x, plot_y, 'g', label='cutting line')\n",
- "plt.plot(plot_x0, plot_y0, 'ro', label='x_0')\n",
- "plt.plot(plot_x1, plot_y1, 'bo', label='x_1')\n",
- "plt.legend(loc='best')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
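- "### 1.4 torch.optim\n",
- "Updating parameters as above is tedious and repetitive. With many parameters, say 100, we would need 100 lines of update code. We could wrap the update in a function of our own, but PyTorch already provides one: the optimizers in `torch.optim`.\n",
- "\n",
- "Using `torch.optim` calls for another data type, `nn.Parameter`. It is essentially the same as a Variable, except that `nn.Parameter` requires gradients by default while a Variable does not.\n",
- "\n",
- "`torch.optim.SGD` updates parameters by gradient descent. PyTorch's optimizers include more optimization algorithms, which later lessons in this chapter cover in more detail.\n",
- "\n",
- "After passing the parameters w and b to `torch.optim.SGD` and specifying the learning rate, we can call `optimizer.step()` to update them. Below we pass the parameters to the optimizer and set the learning rate to 1.0."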
- ]
- },
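To see what an optimizer takes off our hands, here is a deliberately tiny stand-in written in plain Python. It is only a sketch of the idea behind `zero_grad()` and `step()` for plain gradient descent: the class `TinySGD`, the dictionary-based parameters, and the toy objective $(p - 3)^2$ are all hypothetical and are not how `torch.optim.SGD` is implemented.

```python
class TinySGD:
    """Toy gradient-descent optimizer over parameters stored as {'value', 'grad'} dicts."""

    def __init__(self, params, lr):
        self.params = params
        self.lr = lr

    def zero_grad(self):
        # Reset every accumulated gradient before the next backward pass
        for p in self.params:
            p["grad"] = 0.0

    def step(self):
        # One gradient-descent update per parameter: value <- value - lr * grad
        for p in self.params:
            p["value"] -= self.lr * p["grad"]

# Minimize f(p) = (p - 3)^2, whose derivative is 2 * (p - 3)
p = {"value": 0.0, "grad": 0.0}
optimizer = TinySGD([p], lr=0.1)

for _ in range(200):
    optimizer.zero_grad()
    p["grad"] += 2.0 * (p["value"] - 3.0)  # stands in for loss.backward(), which accumulates
    optimizer.step()

print(p["value"])
```

The loop has the same three-step shape as the PyTorch training loops in this notebook: clear the gradients, compute new ones, then let the optimizer apply the update to every parameter it was given.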
- {
- "cell_type": "code",
- "execution_count": 31,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Update the parameters with torch.optim\n",
- "from torch import nn\n",
- "\n",
- "w = nn.Parameter(torch.randn(2, 1))\n",
- "b = nn.Parameter(torch.zeros(1))\n",
- "\n",
- "def logistic_regression(x):\n",
- "    return torch.sigmoid(torch.mm(x, w) + b)\n",
- "\n",
- "optimizer = torch.optim.SGD([w, b], lr=1.)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 38,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "epoch: 200, Loss: 0.24529, Acc: 0.89000\n",
- "epoch: 400, Loss: 0.23901, Acc: 0.89000\n",
- "epoch: 600, Loss: 0.23409, Acc: 0.89000\n",
- "epoch: 800, Loss: 0.23013, Acc: 0.89000\n",
- "epoch: 1000, Loss: 0.22689, Acc: 0.89000\n",
- "\n",
- "During Time: 0.352 s\n"
- ]
- }
- ],
- "source": [
- "# Perform 1000 updates\n",
- "import time\n",
- "\n",
- "start = time.time()\n",
- "for e in range(1000):\n",
- "    # forward pass\n",
- "    y_pred = logistic_regression(x_data)\n",
- "    loss = binary_loss(y_pred, y_data) # compute the loss\n",
- "\n",
- "    # backward pass\n",
- "    optimizer.zero_grad() # reset the gradients to 0 through the optimizer\n",
- "    loss.backward()\n",
- "    optimizer.step() # update the parameters through the optimizer\n",
- "\n",
- "    # accuracy: a predicted probability of at least 0.5 counts as the second class\n",
- "    mask = y_pred.ge(0.5).float()\n",
- "    acc = (mask == y_data).sum().item() / y_data.shape[0]\n",
- "    if (e + 1) % 200 == 0:\n",
- "        print('epoch: {}, Loss: {:.5f}, Acc: {:.5f}'.format(e+1, loss.item(), acc))\n",
- "during = time.time() - start\n",
- "print()\n",
- "print('During Time: {:.3f} s'.format(during))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
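- "With an optimizer, updating the parameters is very simple: call **`optimizer.zero_grad()`** before backpropagation to reset the gradients to 0, then call **`optimizer.step()`** to update the parameters.\n",
- "\n",
- "After 1000 updates the loss has also dropped fairly low."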
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Next we plot the result after the update."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 33,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f36cec7e550>"
- ]
- },
- "execution_count": 33,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Plot the result after the update\n",
- "w0 = w[0].data[0]\n",
- "w1 = w[1].data[0]\n",
- "b0 = b.data[0]\n",
- "\n",
- "plot_x = np.arange(0.2, 1, 0.01)\n",
- "plot_y = (-w0.numpy() * plot_x - b0.numpy()) / w1.numpy()\n",
- "\n",
- "plt.plot(plot_x, plot_y, 'g', label='cutting line')\n",
- "plt.plot(plot_x0, plot_y0, 'ro', label='x_0')\n",
- "plt.plot(plot_x1, plot_y1, 'bo', label='x_1')\n",
- "plt.legend(loc='best')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "After the update, the model separates the two classes of points reasonably well."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
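- "### 1.5 PyTorch Loss Functions\n",
- "So far we used a hand-written loss, but PyTorch already provides the common ones. The loss for linear regression is `nn.MSELoss()`, and the binary classification loss for logistic regression is `nn.BCEWithLogitsLoss()`. For more loss functions, see the [documentation](http://pytorch.org/docs/0.3.0/nn.html#loss-functions).\n",
- "\n",
- "The built-in loss functions have two advantages. First, they are convenient: there is no need to reinvent the wheel. Second, they are implemented in C++, so they are generally faster and more numerically stable than our own implementations.\n",
- "\n",
- "In addition, for numerical stability PyTorch combines the model's Sigmoid operation and the loss in `nn.BCEWithLogitsLoss()`, so when we use this built-in loss the model must not apply Sigmoid itself."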
- ]
- },
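Why does fusing Sigmoid and the loss help stability? A commonly used rearrangement of the binary cross-entropy works directly on the raw score $z$ (the logit): $\max(z, 0) - z\,y + \log(1 + e^{-|z|})$. The sketch below compares it with the naive two-step computation in plain Python. Both function names are made up for this example, and the formula is shown as one standard stable formulation, not as a claim about PyTorch's exact internal code.

```python
import math

def naive_bce_from_logit(z, y):
    """Apply sigmoid first, then the loss; breaks down when sigmoid saturates to 0 or 1."""
    y_hat = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def stable_bce_from_logit(z, y):
    """Algebraically equivalent form that never takes the log of 0."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# For moderate scores the two versions agree
print(naive_bce_from_logit(2.0, 1.0), stable_bce_from_logit(2.0, 1.0))

# For a large score with the wrong label, sigmoid rounds to exactly 1.0
# and the naive version ends up evaluating log(0)
try:
    naive_bce_from_logit(40.0, 0.0)
    naive_failed = False
except ValueError:
    naive_failed = True

print(naive_failed, stable_bce_from_logit(40.0, 0.0))
```

The stable form returns a loss close to 40 for the second case, which is the mathematically correct value, while the naive form cannot produce a finite number at all.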
- {
- "cell_type": "code",
- "execution_count": 35,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Use the built-in loss\n",
- "criterion = nn.BCEWithLogitsLoss() # combines sigmoid and the loss in one layer, for better speed and stability\n",
- "\n",
- "w = nn.Parameter(torch.randn(2, 1))\n",
- "b = nn.Parameter(torch.zeros(1))\n",
- "\n",
- "def logistic_reg(x):\n",
- "    return torch.mm(x, w) + b\n",
- "\n",
- "optimizer = torch.optim.SGD([w, b], 1.)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 118,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor(0.6314)\n"
- ]
- }
- ],
- "source": [
- "y_pred = logistic_reg(x_data)\n",
- "loss = criterion(y_pred, y_data)\n",
- "print(loss.data)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 39,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "epoch: 200, Loss: 0.22419, Acc: 0.89000\n",
- "epoch: 400, Loss: 0.22191, Acc: 0.89000\n",
- "epoch: 600, Loss: 0.21997, Acc: 0.89000\n",
- "epoch: 800, Loss: 0.21830, Acc: 0.88000\n",
- "epoch: 1000, Loss: 0.21685, Acc: 0.88000\n",
- "\n",
- "During Time: 0.215 s\n"
- ]
- }
- ],
- "source": [
- "# Again perform 1000 updates\n",
- "\n",
- "start = time.time()\n",
- "for e in range(1000):\n",
- "    # forward pass\n",
- "    y_pred = logistic_reg(x_data)\n",
- "    loss = criterion(y_pred, y_data)\n",
- "\n",
- "    # backward pass\n",
- "    optimizer.zero_grad()\n",
- "    loss.backward()\n",
- "    optimizer.step()\n",
- "\n",
- "    # accuracy: y_pred holds logits here, and logit >= 0 is equivalent to sigmoid(logit) >= 0.5\n",
- "    mask = y_pred.ge(0).float()\n",
- "    acc = (mask == y_data).sum().item() / y_data.shape[0]\n",
- "    if (e + 1) % 200 == 0:\n",
- "        print('epoch: {}, Loss: {:.5f}, Acc: {:.5f}'.format(e+1, loss.item(), acc))\n",
- "\n",
- "during = time.time() - start\n",
- "print()\n",
- "print('During Time: {:.3f} s'.format(during))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
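- "With PyTorch's built-in loss, this run finished somewhat faster (0.215 s versus 0.352 s for the hand-written loss above). The difference is small and timings of such short runs are noisy, but this is only a tiny model. For large networks the built-in loss generally offers better speed and numerical stability, and it saves us from reinventing the wheel."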
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.7.9"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
|