{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# LeNet5\n", "\n", "LeNet, introduced in 1994, is one of the earliest convolutional neural networks and helped drive the development of deep learning. The work began in 1988, and after several iterations the design was named LeNet5. Its architecture rests on one observation: image features are distributed across the whole image, so convolutions with learnable, shared parameters can extract similar features at many positions while using far fewer parameters than a fully connected layer.\n", "\n", "When LeNet5 was proposed there were no GPUs to help with training, and even CPUs were slow, so the network is small. It has seven processing layers, each with trainable parameters (weights) in the original design, and it took $32 \\times 32$ pixel images as input. Small as it is, LeNet5 contains the basic building blocks of deep learning: convolutional layers, pooling layers, and fully connected layers. Many later deep learning models build on it. This notebook walks through LeNet5 and trains it on a concrete example to build a better understanding of convolutional and pooling layers." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import sys\n", "sys.path.append('..')\n", "\n", "import numpy as np\n", "import torch\n", "from torch import nn\n", "from torchvision.datasets import CIFAR10\n", "from torchvision import transforms as tfs" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import torch\n", "from torch import nn\n", "\n", "# LeNet5 for CIFAR10: 3-channel 32x32 inputs, 10 output classes\n", "lenet5 = nn.Sequential(\n", "    nn.Conv2d(3, 6, kernel_size=5), nn.Sigmoid(),   # 3x32x32 -> 6x28x28\n", "    nn.AvgPool2d(kernel_size=2, stride=2),          # -> 6x14x14\n", "    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),  # -> 16x10x10\n", "    nn.AvgPool2d(kernel_size=2, stride=2),          # -> 16x5x5\n", "    nn.Flatten(),                                   # -> 400 features\n", "    nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),\n", "    nn.Linear(120, 84), nn.Sigmoid(),\n", "    nn.Linear(84, 10)\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from utils import train\n", "\n", "# Preprocessing only (no augmentation): convert to a tensor and\n", "# normalize each channel from [0, 1] to [-1, 1]. CIFAR10 images are\n", "# already 32x32, the input size LeNet5 expects, so no resize is needed.\n", "def train_tf(x):\n", "    im_aug = tfs.Compose([\n", "        tfs.ToTensor(),\n", "        tfs.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])\n", "    ])\n", "    x = im_aug(x)\n", "    return x\n", "\n", "def test_tf(x):\n", "    im_aug = tfs.Compose([\n", "        tfs.ToTensor(),\n", "        tfs.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])\n", "    ])\n", "    x = im_aug(x)\n", "    return x\n", "\n", "train_set = CIFAR10('../../data', train=True, transform=train_tf)\n", 
"train_data = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)\n", "test_set = CIFAR10('../../data', train=False, transform=test_tf)\n", "test_data = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False)\n", "\n", "net = lenet5\n", "optimizer = torch.optim.SGD(net.parameters(), lr=1e-1)\n", "criterion = nn.CrossEntropyLoss()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "(l_train_loss, l_train_acc, l_valid_loss, l_valid_acc) = train(net, \n", " train_data, test_data, \n", " 20, \n", " optimizer, criterion,\n", " use_cuda=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "plt.plot(l_train_loss, label='train')\n", "plt.plot(l_valid_loss, label='valid')\n", "plt.xlabel('epoch')\n", "plt.legend(loc='best')\n", "plt.savefig('fig-res-lenet5-train-validate-loss.pdf')\n", "plt.show()\n", "\n", "plt.plot(l_train_acc, label='train')\n", "plt.plot(l_valid_acc, label='valid')\n", "plt.xlabel('epoch')\n", "plt.legend(loc='best')\n", "plt.savefig('fig-res-lenet5-train-validate-acc.pdf')\n", "plt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.4" } }, "nbformat": 4, "nbformat_minor": 2 }