|
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209 |
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# LeNet5\n",
- "\n",
- "LeNet 诞生于 1994 年,是最早的卷积神经网络之一,并且推动了深度学习领域的发展。自从 1988 年开始,在多次迭代后这个开拓性成果被命名为 LeNet5。LeNet5 的架构的提出是基于如下的观点:图像的特征分布在整张图像上,通过带有可学习参数的卷积,从而有效的减少了参数数量,能够在多个位置上提取相似特征。\n",
- "\n",
- "在LeNet5提出的时候,没有 GPU 帮助训练,甚至 CPU 的速度也很慢,因此,LeNet5的规模并不大。其包含七个处理层,每一层都包含可训练参数(权重),当时使用的输入数据是 $32 \\times 32$ 像素的图像。LeNet-5 这个网络虽然很小,但是它包含了深度学习的基本模块:卷积层,池化层,全连接层。它是其他深度学习模型的基础,这里对LeNet5进行深入分析和讲解,通过实例分析,加深对与卷积层和池化层的理解。"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "定义网络为:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import torch\n",
- "from torch import nn\n",
- "import torch.nn.functional as F\n",
- "\n",
- "class LeNet5(nn.Module):\n",
- " def __init__(self):\n",
- " super(LeNet5, self).__init__()\n",
- " # 1-input channel, 6-output channels, 5x5-conv\n",
- " self.conv1 = nn.Conv2d(1, 6, 5)\n",
- " # 6-input channel, 16-output channels, 5x5-conv\n",
- " self.conv2 = nn.Conv2d(6, 16, 5)\n",
- " # 16x5x5-input, 120-output\n",
- " self.fc1 = nn.Linear(16 * 5 * 5, 120) \n",
- " # 120-input, 84-output\n",
- " self.fc2 = nn.Linear(120, 84)\n",
- " # 84-input, 10-output\n",
- " self.fc3 = nn.Linear(84, 10)\n",
- "\n",
- " def forward(self, x):\n",
- " x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))\n",
- " x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))\n",
- " x = torch.flatten(x, 1) # 将结果拉升成1维向量,除了批次的维度\n",
- " x = F.relu(self.fc1(x))\n",
- " x = F.relu(self.fc2(x))\n",
- " x = self.fc3(x)\n",
- " return x"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "net = LeNet5()\n",
- "print(net)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "input = torch.randn(1, 1, 32, 32)\n",
- "out = net(input)\n",
- "print(out)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import numpy as np\n",
- "from torchvision.datasets import mnist\n",
- "from torch.utils.data import DataLoader\n",
- "from torchvision.datasets import mnist \n",
- "from torchvision import transforms as tfs\n",
- "from utils import train\n",
- "\n",
- "# 使用数据增强\n",
- "def data_tf(x):\n",
- " im_aug = tfs.Compose([\n",
- " tfs.Resize(32),\n",
- " tfs.ToTensor() #,\n",
- " #tfs.Normalize([0.5], [0.5])\n",
- " ])\n",
- " x = im_aug(x)\n",
- " return x\n",
- " \n",
- "train_set = mnist.MNIST('../../data/mnist', train=True, transform=data_tf, download=True) \n",
- "train_data = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)\n",
- "test_set = mnist.MNIST('../../data/mnist', train=False, transform=data_tf, download=True) \n",
- "test_data = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False)\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# 显示其中一个数据\n",
- "import matplotlib.pyplot as plt\n",
- "plt.imshow(train_set.data[0], cmap='gray')\n",
- "plt.title('%i' % train_set.targets[0])\n",
- "plt.colorbar()\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import matplotlib.pyplot as plt\n",
- "\n",
- "# 显示转化后的图像\n",
- "for im, label in train_data:\n",
- " print(im.shape)\n",
- " print(label.shape)\n",
- " \n",
- " img = im[0,0,:,:]\n",
- " lab = label[0]\n",
- " plt.imshow(img, cmap='gray')\n",
- " plt.title('%i' % lab)\n",
- " plt.colorbar()\n",
- " plt.show()\n",
- "\n",
- " print(im[0,0,:,:])\n",
- " break"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "scrolled": false
- },
- "outputs": [],
- "source": [
- "net = LeNet5()\n",
- "optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)\n",
- "criterion = nn.CrossEntropyLoss()\n",
- "\n",
- "res = train(net, train_data, test_data, 20, \n",
- " optimizer, criterion,\n",
- " use_cuda=True)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import matplotlib.pyplot as plt\n",
- "%matplotlib inline\n",
- "\n",
- "plt.plot(res[0], label='train')\n",
- "plt.plot(res[2], label='valid')\n",
- "plt.xlabel('epoch')\n",
- "plt.ylabel('Loss')\n",
- "plt.legend(loc='best')\n",
- "plt.savefig('fig-res-lenet5-train-validate-loss.pdf')\n",
- "plt.show()\n",
- "\n",
- "plt.plot(res[1], label='train')\n",
- "plt.plot(res[3], label='valid')\n",
- "plt.xlabel('epoch')\n",
- "plt.ylabel('Acc')\n",
- "plt.legend(loc='best')\n",
- "plt.savefig('fig-res-lenet5-train-validate-acc.pdf')\n",
- "plt.show()"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.7.9"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
|