|
|
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Linear Models and Gradient Descent\n",
- "\n",
- "In this section we briefly review the linear regression model and show how to use PyTorch to build it and fit its parameters."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
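- "## 1. Univariate Linear Regression\n",
- "The univariate linear model is very simple. Suppose we have inputs $x_i$ and targets $y_i$, where each $i$ indexes one data point, and we want to build the model\n",
- "\n",
- "$$\n",
- "\\hat{y}_i = w x_i + b\n",
- "$$\n",
- "\n",
- "Here $\\hat{y}_i$ is the prediction, and we want $\\hat{y}_i$ to fit the target $y_i$. Informally, we look for the $w$ and $b$ that make the error as small as possible, that is, we minimize\n",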
- "\n",
- "$$\n",
- "\\frac{1}{n} \\sum_{i=1}^n(\\hat{y}_i - y_i)^2\n",
- "$$"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
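- "So how do we minimize this error?\n",
- "\n",
- "This is where **gradient descent** comes in. It is the first optimization algorithm we meet: very simple yet very powerful, and used heavily in deep learning. Let us start from a simple example to understand how it works."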
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 2. Gradient Descent\n",
- "To understand gradient descent, we first need to be clear about what a gradient is, and then see how to use it to descend."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
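- "### 2.1 Gradient\n",
- "Mathematically, the gradient of a single-variable function is its derivative; for a multivariate function, it is the vector of partial derivatives. For example, for a function $f(x, y)$, the gradient of $f$ is\n",
- "\n",
- "$$\n",
- "(\\frac{\\partial f}{\\partial x},\\ \\frac{\\partial f}{\\partial y})\n",
- "$$\n",
- "\n",
- "which is written grad f(x, y) or $\\nabla f(x, y)$. The gradient at a specific point $(x_0,\\ y_0)$ is $\\nabla f(x_0,\\ y_0)$.\n",
- "\n",
- "For example, the gradient of $f(x) = x^2$ at $x = 1$ is $f'(1) = 2$, the slope of the tangent line at that point."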
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
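- "What does the gradient mean? Geometrically, the gradient at a point gives the direction in which the function changes fastest. Specifically, for a function $f(x, y)$ at the point $(x_0, y_0)$, the function increases fastest along the direction of $\\nabla f(x_0,\\ y_0)$. So moving along the gradient is the quickest way uphill toward a maximum, and moving against the gradient is the quickest way downhill toward a minimum."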
- ]
- },
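- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As a quick sanity check on the definition of the gradient, the sketch below approximates partial derivatives with central finite differences. The helper `numerical_gradient`, the step size `h` and the second test function are illustrative assumptions, not part of the original notebook; PyTorch's autograd, used later, computes exact gradients instead.\n",
- "\n",
- "```python\n",
- "def numerical_gradient(f, point, h=1e-6):\n",
- "    # Hypothetical helper: central-difference estimate of each partial derivative.\n",
- "    grad = []\n",
- "    for i in range(len(point)):\n",
- "        plus = list(point)\n",
- "        minus = list(point)\n",
- "        plus[i] += h\n",
- "        minus[i] -= h\n",
- "        grad.append((f(*plus) - f(*minus)) / (2 * h))\n",
- "    return grad\n",
- "\n",
- "# f(x) = x^2 has derivative 2x, so the gradient at x = 1 should be close to 2.\n",
- "grad_square = numerical_gradient(lambda x: x ** 2, [1.0])\n",
- "\n",
- "# f(x, y) = x^2 + 3y has gradient (2x, 3); at (1, 2) that is (2, 3).\n",
- "grad_xy = numerical_gradient(lambda x, y: x ** 2 + 3 * y, [1.0, 2.0])\n",
- "print(grad_square, grad_xy)\n",
- "```\n",
- "\n",
- "Finite differences are only an approximation and cost two function evaluations per parameter, which is one reason automatic differentiation is preferred in practice."
- ]
- },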
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
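- "### 2.2 The Gradient Descent Method\n",
- "With this understanding of the gradient, we can see how gradient descent works. We want to minimize the error above, i.e. find its minimum point, and we can get there by moving against the gradient.\n",
- "\n",
- "Here is an intuitive picture. Suppose we are somewhere on a big mountain and do not know the way down, so we decide to take it one step at a time: at each position we compute the gradient there and take one step along the negative gradient, the steepest downhill direction, then compute the gradient at the new position and again step in the steepest downhill direction. We keep going until we believe we have reached the foot of the mountain. Of course, walking this way we may never reach the foot and instead end up in some local valley.\n",
- "\n",
- "In our problem, this means repeatedly changing $w$ and $b$ against the gradient until we find the $w$ and $b$ that minimize the error.\n",
- "\n",
- "At each update we must decide how far to move, just as we must choose the length of each step down the mountain. This step length is called the learning rate, written $\\eta$. It matters a great deal: a learning rate that is too small makes the descent very slow, while one that is too large makes the iterates jump back and forth around the minimum instead of settling into it.\n",
- "\n",
- "Our update rule is therefore\n",
- "\n",
- "$$\n",
- "w := w - \\eta \\frac{\\partial f(w,\\ b)}{\\partial w} \\\\\n",
- "b := b - \\eta \\frac{\\partial f(w,\\ b)}{\\partial b}\n",
- "$$\n",
- "\n",
- "By iterating these updates with a suitable learning rate, we converge toward the best $w$ and $b$. This is the principle of gradient descent."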
- ]
- },
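- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To make the effect of the learning rate concrete, here is a minimal sketch that runs gradient descent on $f(x) = x^2$ in plain Python. The function `gradient_descent`, the starting point and the four learning rates are assumptions chosen for illustration only.\n",
- "\n",
- "```python\n",
- "def gradient_descent(lr, steps=20, x0=1.0):\n",
- "    # Minimise f(x) = x^2, whose derivative is 2x, starting from x0.\n",
- "    x = x0\n",
- "    history = [x]\n",
- "    for _ in range(steps):\n",
- "        x = x - lr * 2 * x  # x := x - eta * f'(x)\n",
- "        history.append(x)\n",
- "    return history\n",
- "\n",
- "small = gradient_descent(lr=0.01)     # x shrinks by a factor 0.98 per step: slow\n",
- "good = gradient_descent(lr=0.3)       # x shrinks by a factor 0.4 per step: fast\n",
- "large = gradient_descent(lr=0.95)     # factor -0.9: the sign flips on every step\n",
- "too_large = gradient_descent(lr=1.1)  # factor -1.2: the iterates diverge\n",
- "\n",
- "for name, hist in [('small', small), ('good', good), ('large', large), ('too_large', too_large)]:\n",
- "    print(name, hist[-1])\n",
- "```\n",
- "\n",
- "For this particular function each step multiplies $x$ by $1 - 2\\eta$, so the behaviour of every run can be read off directly from that factor."
- ]
- },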
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 2.3 PyTorch Implementation\n",
- "\n",
- "That covers the theory. Next we work through an example to learn more about linear models."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<torch._C.Generator at 0x7f60af667e50>"
- ]
- },
- "execution_count": 1,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import torch\n",
- "import numpy as np\n",
- "from torch.autograd import Variable\n",
- "\n",
- "torch.manual_seed(2021)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[<matplotlib.lines.Line2D at 0x7f60acfc2910>]"
- ]
- },
- "execution_count": 2,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
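- "# Generate synthetic training data\n",
- "# Note: torch.manual_seed does not seed NumPy, so this data differs from run to run\n",
- "x_train = np.random.rand(20, 1)\n",
- "y_train = x_train * 3 + 4 + 3*np.random.rand(20,1)\n",
- "\n",
- "# Plot the data\n",
- "import matplotlib.pyplot as plt\n",
- "%matplotlib inline\n",
- "\n",
- "plt.plot(x_train, y_train, 'bo')"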
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Convert to Tensors\n",
- "x_train = torch.from_numpy(x_train)\n",
- "y_train = torch.from_numpy(y_train)\n",
- "\n",
- "# Define the parameters w and b\n",
- "w = Variable(torch.randn(1), requires_grad=True) # random initialization\n",
- "b = Variable(torch.zeros(1), requires_grad=True) # initialize to 0"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Build the linear regression model\n",
- "x_train = Variable(x_train)\n",
- "y_train = Variable(y_train)\n",
- "\n",
- "def linear_model(x):\n",
- "    return x * w + b\n",
- "\n",
- "# Logistic variant of the same model (not used in this section)\n",
- "def logistic_regression(x):\n",
- "    return torch.sigmoid(x*w+b)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {},
- "outputs": [],
- "source": [
- "y_ = linear_model(x_train)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "With the steps above the model is defined. Before updating any parameters, let us first look at what the model currently outputs."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f60ad93f3d0>"
- ]
- },
- "execution_count": 6,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "plt.plot(x_train.data.numpy(), y_train.data.numpy(), 'bo', label='real')\n",
- "plt.plot(x_train.data.numpy(), y_.data.numpy(), 'ro', label='estimated')\n",
- "plt.legend()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "**Question: the red points are the predictions and appear to line up. Do they really lie on a single straight line?**"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now we need to compute our error function, namely\n",
- "\n",
- "$$\n",
- "E = \\sum_{i=1}^n(\\hat{y}_i - y_i)^2\n",
- "$$"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Compute the error (sum of squared residuals)\n",
- "def get_loss(y_, y):\n",
- "    return torch.sum((y_ - y) ** 2)\n",
- "\n",
- "loss = get_loss(y_, y_train)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor(687.4893, dtype=torch.float64, grad_fn=<SumBackward0>)\n"
- ]
- }
- ],
- "source": [
- "# Print the loss to see how large it is\n",
- "print(loss)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
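- "With the error function defined, we next need the gradients with respect to $w$ and $b$. Thanks to PyTorch's automatic differentiation we do not have to derive them by hand. Interested readers can verify that the gradients of $E$ with respect to $w$ and $b$ are\n",
- "\n",
- "$$\n",
- "\\frac{\\partial E}{\\partial w} = 2 \\sum_{i=1}^n x_i(w x_i + b - y_i) \\\\\n",
- "\\frac{\\partial E}{\\partial b} = 2 \\sum_{i=1}^n (w x_i + b - y_i)\n",
- "$$"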
- ]
- },
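- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "For readers who want to check the hand-derived gradients of the summed squared error $E$, the sketch below compares them with central finite differences in plain Python. The toy data, the starting values of `w` and `b`, and the helper names `loss` and `analytic_grad` are made up for illustration and do not come from the notebook.\n",
- "\n",
- "```python\n",
- "def loss(w, b, xs, ys):\n",
- "    # Summed squared error E = sum_i (w * x_i + b - y_i)^2\n",
- "    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys))\n",
- "\n",
- "def analytic_grad(w, b, xs, ys):\n",
- "    # dE/dw = 2 * sum_i x_i * (w * x_i + b - y_i)\n",
- "    # dE/db = 2 * sum_i (w * x_i + b - y_i)\n",
- "    dw = 2 * sum(x * (w * x + b - y) for x, y in zip(xs, ys))\n",
- "    db = 2 * sum((w * x + b - y) for x, y in zip(xs, ys))\n",
- "    return dw, db\n",
- "\n",
- "# Made-up toy data, for illustration only.\n",
- "xs = [0.0, 1.0, 2.0]\n",
- "ys = [1.0, 3.0, 5.0]\n",
- "w, b, h = 0.5, 0.0, 1e-6\n",
- "\n",
- "dw, db = analytic_grad(w, b, xs, ys)\n",
- "# Central finite differences as an independent check.\n",
- "dw_num = (loss(w + h, b, xs, ys) - loss(w - h, b, xs, ys)) / (2 * h)\n",
- "db_num = (loss(w, b + h, xs, ys) - loss(w, b - h, xs, ys)) / (2 * h)\n",
- "print(dw, db, dw_num, db_num)\n",
- "```\n",
- "\n",
- "If the mean is used instead of the sum, both gradients are simply divided by $n$."
- ]
- },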
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Backpropagate to compute the gradients automatically\n",
- "loss.backward()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([-120.5742])\n",
- "tensor([-231.9290])\n"
- ]
- }
- ],
- "source": [
- "# Inspect the gradients of w and b\n",
- "print(w.grad)\n",
- "print(b.grad)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Update the parameters once (learning rate 1e-2)\n",
- "w.data = w.data - 1e-2 * w.grad.data\n",
- "b.data = b.data - 1e-2 * b.grad.data"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "After updating the parameters, let us look at the model output again."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f60ace2d450>"
- ]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "y_ = linear_model(x_train)\n",
- "plt.plot(x_train.data.numpy(), y_train.data.numpy(), 'bo', label='real')\n",
- "plt.plot(x_train.data.numpy(), y_.data.numpy(), 'ro', label='estimated')\n",
- "plt.legend()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
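- "The example above shows that after one update the red predictions lie below the blue ground-truth points, so the fit is still poor. We therefore need several more updates."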
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "epoch: 19, loss: 18.053107839401285\n",
- "epoch: 39, loss: 16.176534764175827\n",
- "epoch: 59, loss: 15.469259285871882\n",
- "epoch: 79, loss: 15.202689710228258\n",
- "epoch: 99, loss: 15.102220561226387\n"
- ]
- }
- ],
- "source": [
- "for e in range(100): # run 100 updates\n",
- "    y_ = linear_model(x_train)\n",
- "    loss = get_loss(y_, y_train)\n",
- "    \n",
- "    w.grad.zero_() # zero the gradient, since backward() accumulates into .grad\n",
- "    b.grad.zero_() # zero the gradient, since backward() accumulates into .grad\n",
- "    loss.backward()\n",
- "    \n",
- "    w.data = w.data - 1e-2 * w.grad.data # update w\n",
- "    b.data = b.data - 1e-2 * b.grad.data # update b\n",
- "    if (e + 1) % 20 == 0:\n",
- "        print('epoch: {}, loss: {}'.format(e, loss.item()))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f60acf30b90>"
- ]
- },
- "execution_count": 14,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "y_ = linear_model(x_train)\n",
- "plt.plot(x_train.data.numpy(), y_train.data.numpy(), 'bo', label='real')\n",
- "plt.plot(x_train.data.numpy(), y_.data.numpy(), 'ro', label='estimated')\n",
- "plt.legend()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
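- "After 100 updates, the red predictions fit the blue ground-truth values reasonably well.\n",
- "\n",
- "You have now learned your first machine learning model. Keep it up and try the short exercise below."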
- ]
- },
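- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The loop above uses `Variable` and direct `.data` assignment, which still run but are older idioms. As a hedged sketch, the same training loop can be written with plain tensors and `torch.no_grad()`. The seed, the freshly generated data and the list `losses` are assumptions made so that the example is self-contained; it does not reproduce the numbers printed above.\n",
- "\n",
- "```python\n",
- "import torch\n",
- "\n",
- "torch.manual_seed(0)\n",
- "# Synthetic data in the same spirit as above: y = 3x + 4 plus uniform noise.\n",
- "x = torch.rand(20, 1)\n",
- "y = 3 * x + 4 + 3 * torch.rand(20, 1)\n",
- "\n",
- "w = torch.randn(1, requires_grad=True)\n",
- "b = torch.zeros(1, requires_grad=True)\n",
- "lr = 1e-2\n",
- "\n",
- "losses = []\n",
- "for epoch in range(100):\n",
- "    loss = torch.sum((x * w + b - y) ** 2)\n",
- "    loss.backward()\n",
- "    with torch.no_grad():  # parameter updates should not be tracked by autograd\n",
- "        w -= lr * w.grad\n",
- "        b -= lr * b.grad\n",
- "    w.grad.zero_()  # clear accumulated gradients before the next backward pass\n",
- "    b.grad.zero_()\n",
- "    losses.append(loss.item())\n",
- "\n",
- "print(losses[0], losses[-1])\n",
- "```\n",
- "\n",
- "Zeroing the gradients after the update rather than before it avoids having to special-case the first iteration, when `.grad` is still `None`."
- ]
- },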
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
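- "### 2.4 Exercise\n",
- "\n",
- "Restart the notebook and rerun the linear regression model above, but try different numbers of training iterations and different learning rates, and compare the results."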
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 3. Polynomial Regression"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
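- "Let us now go one step further and discuss polynomial regression. What is polynomial regression? It is quite simple. Recall the linear regression model above\n",
- "\n",
- "$$\n",
- "\\hat{y} = w x + b\n",
- "$$\n",
- "\n",
- "This is a first-degree polynomial in $x$. The model is simple and cannot fit more complex relationships, so we can use a higher-degree model, such as\n",
- "\n",
- "$$\n",
- "\\hat{y} = w_0 + w_1 x + w_2 x^2 + w_3 x^3 + \\cdots\n",
- "$$\n",
- "\n",
- "This can fit more complex relationships and is called a polynomial model, because it uses higher powers of $x$. Similarly, there is the multiple regression model, which has the same form but uses more variables besides $x$, such as $y$, $z$ and so on. Its loss function is the same as for simple linear regression."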
- ]
- },
- {
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "First we define the target function to fit, a cubic polynomial."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "y = 0.90 + 0.50 * x + 3.00 * x^2 + 2.40 * x^3\n"
- ]
- }
- ],
- "source": [
- "# Define the target cubic polynomial\n",
- "\n",
- "w_target = np.array([0.5, 3, 2.4]) # coefficients of x, x^2, x^3\n",
- "b_target = np.array([0.9]) # constant term\n",
- "\n",
- "f_des = 'y = {:.2f} + {:.2f} * x + {:.2f} * x^2 + {:.2f} * x^3'.format(\n",
- "    b_target[0], w_target[0], w_target[1], w_target[2]) # format the function as a string\n",
- "\n",
- "print(f_des)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can first plot this polynomial."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f60a9e47e10>"
- ]
- },
- "execution_count": 16,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Plot the curve of this function\n",
- "x_sample = np.arange(-3, 3.1, 0.1)\n",
- "y_sample = b_target[0] + w_target[0] * x_sample + w_target[1] * x_sample ** 2 + w_target[2] * x_sample ** 3\n",
- "\n",
- "plt.plot(x_sample, y_sample, label='real curve')\n",
- "plt.legend()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Next we build the dataset, which needs both x and y. Since the target is a cubic polynomial, we use $x,\\ x^2, x^3$ as the input features."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Build the data x and y\n",
- "# x is the matrix whose columns are [x, x^2, x^3]\n",
- "# y is the function value [y]\n",
- "\n",
- "x_train = np.stack([x_sample ** i for i in range(1, 4)], axis=1)\n",
- "x_train = torch.from_numpy(x_train).float() # convert to a float tensor\n",
- "\n",
- "y_train = torch.from_numpy(y_sample).float().unsqueeze(1) # convert to a float tensor with shape (61, 1)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([61, 3])\n"
- ]
- }
- ],
- "source": [
- "print(x_train.size())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Next we define the parameters to optimize, namely the $w_i$ in the function above."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 20,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Define the parameters and the model\n",
- "w = Variable(torch.randn(3, 1), requires_grad=True)\n",
- "b = Variable(torch.zeros(1), requires_grad=True)\n",
- "\n",
- "# Wrap x and y as Variables\n",
- "x_train = Variable(x_train)\n",
- "y_train = Variable(y_train)\n",
- "\n",
- "def multi_linear(x):\n",
- "    return torch.mm(x, w) + b\n",
- "\n",
- "# Mean squared error (the earlier get_loss used the sum instead of the mean)\n",
- "def get_loss(y_, y):\n",
- "    return torch.mean((y_ - y) ** 2)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can plot the model before any update against the true model."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 21,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f60a9d81f50>"
- ]
- },
- "execution_count": 21,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Plot the model before updating\n",
- "y_pred = multi_linear(x_train)\n",
- "\n",
- "plt.plot(x_train.data.numpy()[:, 0], y_pred.data.numpy(), label='fitting curve', color='r')\n",
- "plt.plot(x_train.data.numpy()[:, 0], y_sample, label='real curve', color='b')\n",
- "plt.legend()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As we can see, the two curves differ. Let us compute the error between them."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 22,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor(1144.2655, grad_fn=<MeanBackward0>)\n"
- ]
- }
- ],
- "source": [
- "# Compute the error with the get_loss redefined above (mean rather than sum of squared errors)\n",
- "loss = get_loss(y_pred, y_train)\n",
- "print(loss)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 23,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Backpropagate to compute the gradients automatically\n",
- "loss.backward()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 24,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([[ -94.7455],\n",
- " [-139.1247],\n",
- " [-629.8584]])\n",
- "tensor([-25.7413])\n"
- ]
- }
- ],
- "source": [
- "# Inspect the gradients of w and b\n",
- "print(w.grad)\n",
- "print(b.grad)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 25,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Update the parameters once (learning rate 0.001)\n",
- "w.data = w.data - 0.001 * w.grad.data\n",
- "b.data = b.data - 0.001 * b.grad.data"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f60a9e3fdd0>"
- ]
- },
- "execution_count": 26,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Plot the model after one update\n",
- "y_pred = multi_linear(x_train)\n",
- "\n",
- "plt.plot(x_train.data.numpy()[:, 0], y_pred.data.numpy(), label='fitting curve', color='r')\n",
- "plt.plot(x_train.data.numpy()[:, 0], y_sample, label='real curve', color='b')\n",
- "plt.legend()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Because we have updated only once, the two curves still differ. Let us run 100 iterations."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 27,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "epoch 20, Loss: 65.56586\n",
- "epoch 40, Loss: 15.41177\n",
- "epoch 60, Loss: 3.70702\n",
- "epoch 80, Loss: 0.97122\n",
- "epoch 100, Loss: 0.32874\n"
- ]
- }
- ],
- "source": [
- "# Run 100 parameter updates\n",
- "for e in range(100):\n",
- "    y_pred = multi_linear(x_train)\n",
- "    loss = get_loss(y_pred, y_train)\n",
- "    \n",
- "    w.grad.data.zero_()\n",
- "    b.grad.data.zero_()\n",
- "    loss.backward()\n",
- "    \n",
- "    # Update the parameters\n",
- "    w.data = w.data - 0.001 * w.grad.data\n",
- "    b.data = b.data - 0.001 * b.grad.data\n",
- "    if (e + 1) % 20 == 0:\n",
- "        print('epoch {}, Loss: {:.5f}'.format(e+1, loss.data.item()))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "After these updates the loss is much smaller than before. Let us plot the fitted curve against the true one."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 28,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<matplotlib.legend.Legend at 0x7f60a9c42290>"
- ]
- },
- "execution_count": 28,
- "metadata": {},
- "output_type": "execute_result"
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Plot the result after updating\n",
- "y_pred = multi_linear(x_train)\n",
- "\n",
- "plt.plot(x_train.data.numpy()[:, 0], y_pred.data.numpy(), label='fitting curve', color='r')\n",
- "plt.plot(x_train.data.numpy()[:, 0], y_sample, label='real curve', color='b')\n",
- "plt.legend()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "After 100 updates, the fitted curve and the true curve almost coincide."
- ]
- },
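- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As a cross-check on the gradient-descent fit, the coefficients can also be recovered with closed-form least squares. This is a different technique from the one used in this notebook and is shown only as a sketch; the names `features` and `coeffs` are illustrative, and the data is rebuilt here so the example runs on its own.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "# Same target polynomial and sample grid as in the cells above.\n",
- "x_sample = np.arange(-3, 3.1, 0.1)\n",
- "y_sample = 0.9 + 0.5 * x_sample + 3.0 * x_sample ** 2 + 2.4 * x_sample ** 3\n",
- "\n",
- "# Design matrix with columns [1, x, x^2, x^3]; the column of ones plays the role of b.\n",
- "features = np.stack([x_sample ** i for i in range(4)], axis=1)\n",
- "coeffs, *_ = np.linalg.lstsq(features, y_sample, rcond=None)\n",
- "print(np.round(coeffs, 4))\n",
- "```\n",
- "\n",
- "Because the data is noise-free, the solver should return values very close to the target coefficients 0.9, 0.5, 3.0 and 2.4, which gives a reference point for judging how far the gradient-descent solution still is from the optimum."
- ]
- },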
- {
- "cell_type": "markdown",
- "metadata": {
- "collapsed": true
- },
- "source": [
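- "## 4. Exercise\n",
- "\n",
- "The example above fits a cubic polynomial. Try fitting it with a quadratic polynomial instead and see how well you can do.\n",
- "\n",
- "**Hint: use the parameter `w = torch.randn(2, 1)` and rebuild the x dataset accordingly.**"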
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.7.9"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
|