
5-nn-sequential-module.ipynb 178 kB


  1. {
  2. "cells": [
  3. {
  4. "cell_type": "markdown",
  5. "metadata": {},
  6. "source": [
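  7. "# Multilayer Neural Networks\n",
  8. "\n",
  9. "Building on the linear regression and logistic regression models from the earlier sections, this section shows how to implement a multilayer neural network in PyTorch."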
  10. ]
  11. },
  12. {
  13. "cell_type": "markdown",
  14. "metadata": {},
  15. "source": [
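  16. "## 1. Multilayer Neural Networks\n",
  17. "The linear regression formula is $y = w x + b$ and the logistic regression formula is $y = Sigmoid(w x + b)$. Both can be viewed as single-layer neural networks, and Sigmoid is called the activation function."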
  18. ]
  19. },
  20. {
  21. "cell_type": "markdown",
  22. "metadata": {},
  23. "source": [
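  24. "### 1.1 Structure of a Neural Network\n",
  25. "\n",
  26. "Many neurons placed side by side form one layer of a neural network, and stacking several layers gives a deep neural network.\n",
  27. "\n",
  28. "![nn demo](imgs/nn-forward.gif)\n",
  29. "\n",
  30. "As the animation shows, the structure of a neural network is simple: an input layer, hidden layers, and an output layer. The size of the input layer is set by the number of features, and the output layer by the problem being solved, so the number of hidden layers and the number of neurons in each layer are the tunable settings. These choices strongly affect the model; for an interactive demonstration see [demo - classify2d](http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html). Forward propagation is just as simple: the computation is applied one layer after another."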
  31. ]
  32. },
  33. {
  34. "cell_type": "markdown",
  35. "metadata": {},
  36. "source": [
  37. "### 1.2 Multilayer Neural Network Example\n",
  38. "\n",
  39. "First, generate some sample data for training and testing."
  40. ]
  41. },
  42. {
  43. "cell_type": "code",
  44. "execution_count": 1,
  45. "metadata": {},
  46. "outputs": [
  47. {
  48. "data": {
  49. "image/png": "\n",
  50. "text/plain": [
  51. "<Figure size 432x288 with 1 Axes>"
  52. ]
  53. },
  54. "metadata": {
  55. "needs_background": "light"
  56. },
  57. "output_type": "display_data"
  58. }
  59. ],
  60. "source": [
  61. "import torch\n",
  62. "import numpy as np\n",
  63. "from torch import nn\n",
  64. "from sklearn import datasets\n",
  65. "\n",
  66. "import matplotlib.pyplot as plt\n",
  67. "%matplotlib inline\n",
  68. "\n",
  69. "# generate sample data\n",
  70. "np.random.seed(0)\n",
  71. "data_x, data_y = datasets.make_moons(200, noise=0.20)\n",
  72. "\n",
  73. "# plot data\n",
  74. "plt.scatter(data_x[:, 0], data_x[:, 1], c=data_y, cmap=plt.cm.Spectral)\n",
  75. "plt.show()"
  76. ]
  77. },
  78. {
  79. "cell_type": "code",
  80. "execution_count": 2,
  81. "metadata": {},
  82. "outputs": [],
  83. "source": [
  84. "# Variables\n",
  85. "x = torch.from_numpy(data_x).float()\n",
  86. "y = torch.from_numpy(data_y).float().unsqueeze(1)\n",
  87. "\n",
  88. "\n",
  89. "# Parameters of the two-layer network\n",
  90. "w1 = nn.Parameter(torch.randn(2, 4) * 0.1)  # 4 neurons in the hidden layer\n",
  91. "b1 = nn.Parameter(torch.zeros(4))\n",
  92. "\n",
  93. "w2 = nn.Parameter(torch.randn(4, 1) * 0.1)\n",
  94. "b2 = nn.Parameter(torch.zeros(1))\n",
  95. "\n",
  96. "# Define the model\n",
  97. "def SimpNetwork(x):\n",
  98. "    x1 = torch.mm(x, w1) + b1\n",
  99. "    x1 = torch.sigmoid(x1)  # PyTorch's built-in sigmoid activation\n",
  100. "    x2 = torch.mm(x1, w2) + b2\n",
  101. "    return x2  # return logits: BCEWithLogitsLoss applies the sigmoid itself\n",
  102. "\n",
  103. "optimizer = torch.optim.SGD([w1, b1, w2, b2], 0.1)\n",
  104. "\n",
  105. "criterion = nn.BCEWithLogitsLoss()"
  106. ]
  107. },
  108. {
  109. "cell_type": "code",
  110. "execution_count": 3,
  111. "metadata": {},
  112. "outputs": [
  113. {
  114. "name": "stdout",
  115. "output_type": "stream",
  116. "text": [
  117. "epoch: 100, loss: 0.6914874315261841\n",
  118. "epoch: 200, loss: 0.6847885251045227\n",
  119. "epoch: 300, loss: 0.658918559551239\n",
  120. "epoch: 400, loss: 0.588269054889679\n",
  121. "epoch: 500, loss: 0.4917648732662201\n",
  122. "epoch: 600, loss: 0.42251646518707275\n",
  123. "epoch: 700, loss: 0.38259515166282654\n",
  124. "epoch: 800, loss: 0.3581520915031433\n",
  125. "epoch: 900, loss: 0.34184250235557556\n",
  126. "epoch: 1000, loss: 0.330547571182251\n"
  127. ]
  128. }
  129. ],
  130. "source": [
  131. "# Train for 1000 iterations\n",
  132. "for e in range(1000):\n",
  133. "    out = SimpNetwork(x)\n",
  134. "    loss = criterion(out, y)\n",
  135. "    optimizer.zero_grad()\n",
  136. "    loss.backward()\n",
  137. "    optimizer.step()\n",
  138. "    if (e + 1) % 100 == 0:\n",
  139. "        print('epoch: {}, loss: {}'.format(e+1, loss.item()))"
  140. ]
  141. },
  142. {
  143. "cell_type": "code",
  144. "execution_count": 4,
  145. "metadata": {},
  146. "outputs": [],
  147. "source": [
  148. "def plot_decision_boundary(model, x, y):\n",
  149. " # Set min and max values and give it some padding\n",
  150. " x_min, x_max = x[:, 0].min() - 1, x[:, 0].max() + 1\n",
  151. " y_min, y_max = x[:, 1].min() - 1, x[:, 1].max() + 1\n",
  152. " h = 0.01\n",
  153. " # Generate a grid of points with distance h between them\n",
  154. " xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))\n",
  155. " # Predict the function value for the whole grid; np.c_ places the two flattened arrays side by side as columns\n",
  156. " Z = model(np.c_[xx.ravel(), yy.ravel()])\n",
  157. " Z = Z.reshape(xx.shape)\n",
  158. " # Plot the contour and training examples\n",
  159. " plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral)\n",
  160. " plt.ylabel('x2')\n",
  161. " plt.xlabel('x1')\n",
  162. " plt.scatter(x[:, 0], x[:, 1], c=y.reshape(-1), s=40, cmap=plt.cm.Spectral)"
  163. ]
  164. },
  165. {
  166. "cell_type": "code",
  167. "execution_count": 5,
  168. "metadata": {},
  169. "outputs": [
  170. {
  171. "data": {
  172. "image/png": "\n",
  173. "text/plain": [
  174. "<Figure size 432x288 with 1 Axes>"
  175. ]
  176. },
  177. "metadata": {
  178. "needs_background": "light"
  179. },
  180. "output_type": "display_data"
  181. },
  182. {
  183. "data": {
  184. "image/png": "\n",
  185. "text/plain": [
  186. "<Figure size 432x288 with 1 Axes>"
  187. ]
  188. },
  189. "metadata": {
  190. "needs_background": "light"
  191. },
  192. "output_type": "display_data"
  193. }
  194. ],
  195. "source": [
  196. "y_res = torch.sigmoid(SimpNetwork(x))\n",
  197. "#y_pred = np.argmax(y_res, axis=1)\n",
  198. "y_pred = (y_res > 0.5)*1\n",
  199. "\n",
  200. "# plot data\n",
  201. "plt.scatter(x[:, 0], x[:, 1], c=y, cmap=plt.cm.Spectral)\n",
  202. "plt.title(\"ground truth\")\n",
  203. "plt.show()\n",
  204. "\n",
  205. "plt.scatter(x[:, 0], x[:, 1], c=y_pred, cmap=plt.cm.Spectral)\n",
  206. "plt.title(\"predicted\")\n",
  207. "plt.show()"
  208. ]
  209. },
  210. {
  211. "cell_type": "markdown",
  212. "metadata": {},
  213. "source": [
  214. "## 2. Sequential and Module"
  215. ]
  216. },
  217. {
  218. "cell_type": "markdown",
  219. "metadata": {},
  220. "source": [
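  221. "\n",
  222. "For the linear regression model, the logistic regression model, and the neural network above, we defined the required parameters by hand when building the model. That is workable for small models, but for a large one, such as a 100-layer network, defining every parameter manually becomes very tedious. PyTorch therefore provides two tools for building models: Sequential and Module.\n",
  223. "\n",
  224. "Sequential lets us build a model as an ordered chain of modules, while Module is a more flexible way to define a model. Below we define the network above first with `Sequential` and then with `Module`."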
  225. ]
  226. },
  227. {
  228. "cell_type": "markdown",
  229. "metadata": {},
  230. "source": [
  231. "### 2.1 Sequential"
  232. ]
  233. },
  234. {
  235. "cell_type": "code",
  236. "execution_count": 6,
  237. "metadata": {},
  238. "outputs": [],
  239. "source": [
  240. "# Sequential\n",
  241. "seq_net = nn.Sequential(\n",
  242. " nn.Linear(2, 4), # PyTorch linear layer: wx + b\n",
  243. " nn.Tanh(),\n",
  244. " nn.Linear(4, 1)\n",
  245. ")"
  246. ]
  247. },
  248. {
  249. "cell_type": "code",
  250. "execution_count": 7,
  251. "metadata": {},
  252. "outputs": [
  253. {
  254. "data": {
  255. "text/plain": [
  256. "Linear(in_features=2, out_features=4, bias=True)"
  257. ]
  258. },
  259. "execution_count": 7,
  260. "metadata": {},
  261. "output_type": "execute_result"
  262. }
  263. ],
  264. "source": [
  265. "# A Sequential can be indexed to access each layer\n",
  266. "seq_net[0] # first layer"
  267. ]
  268. },
  269. {
  270. "cell_type": "code",
  271. "execution_count": 8,
  272. "metadata": {},
  273. "outputs": [
  274. {
  275. "name": "stdout",
  276. "output_type": "stream",
  277. "text": [
  278. "Parameter containing:\n",
  279. "tensor([[ 0.3485, 0.5085],\n",
  280. " [-0.6388, -0.1725],\n",
  281. " [ 0.4717, -0.2461],\n",
  282. " [-0.1726, 0.4927]], requires_grad=True)\n"
  283. ]
  284. }
  285. ],
  286. "source": [
  287. "# Print the weights of the first layer\n",
  288. "\n",
  289. "w0 = seq_net[0].weight\n",
  290. "print(w0)"
  291. ]
  292. },
  293. {
  294. "cell_type": "code",
  295. "execution_count": 9,
  296. "metadata": {},
  297. "outputs": [
  298. {
  299. "name": "stdout",
  300. "output_type": "stream",
  301. "text": [
  302. "epoch: 1000, loss: 0.3075895607471466\n",
  303. "epoch: 2000, loss: 0.3041735887527466\n",
  304. "epoch: 3000, loss: 0.30135470628738403\n",
  305. "epoch: 4000, loss: 0.25870421528816223\n",
  306. "epoch: 5000, loss: 0.14440153539180756\n",
  307. "epoch: 6000, loss: 0.10606899112462997\n",
  308. "epoch: 7000, loss: 0.09030225872993469\n",
  309. "epoch: 8000, loss: 0.08221166580915451\n",
  310. "epoch: 9000, loss: 0.0778866782784462\n",
  311. "epoch: 10000, loss: 0.07527764141559601\n"
  312. ]
  313. }
  314. ],
  315. "source": [
  316. "# generate sample data\n",
  317. "np.random.seed(0)\n",
  318. "data_x, data_y = datasets.make_moons(200, noise=0.20)\n",
  319. "\n",
  320. "# Variables\n",
  321. "x = torch.from_numpy(data_x).float()\n",
  322. "y = torch.from_numpy(data_y).float().unsqueeze(1)\n",
  323. "\n",
  324. "# parameters() returns the model's parameters\n",
  325. "param = seq_net.parameters()\n",
  326. "\n",
  327. "# Define the optimizer\n",
  328. "optim = torch.optim.SGD(param, 0.1)\n",
  329. "\n",
  330. "# Train for 10000 iterations\n",
  331. "for e in range(10000):\n",
  332. "    out = seq_net(x)\n",
  333. "    loss = criterion(out, y)\n",
  334. "    optim.zero_grad()\n",
  335. "    loss.backward()\n",
  336. "    optim.step()\n",
  337. "    if (e + 1) % 1000 == 0:\n",
  338. "        print('epoch: {}, loss: {}'.format(e+1, loss.item()))"
  339. ]
  340. },
  341. {
  342. "cell_type": "markdown",
  343. "metadata": {},
  344. "source": [
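  345. "After 10000 iterations the loss is lower than before. The two runs are not directly comparable, though: this one trains ten times longer, uses Tanh instead of Sigmoid in the hidden layer, and relies on the default initialization of `nn.Linear` rather than our hand-scaled random weights. Parameter initialization is covered later in the course."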
  346. ]
  347. },
  348. {
  349. "cell_type": "code",
  350. "execution_count": 10,
  351. "metadata": {},
  352. "outputs": [],
  353. "source": [
  354. "def plot_seq(x):\n",
  355. " out = torch.sigmoid(seq_net(torch.from_numpy(x).float())).data.numpy()\n",
  356. " out = (out > 0.5) * 1\n",
  357. " return out"
  358. ]
  359. },
  360. {
  361. "cell_type": "code",
  362. "execution_count": 11,
  363. "metadata": {},
  364. "outputs": [
  365. {
  366. "data": {
  367. "text/plain": [
  368. "Text(0.5, 1.0, 'sequential')"
  369. ]
  370. },
  371. "execution_count": 11,
  372. "metadata": {},
  373. "output_type": "execute_result"
  374. },
  375. {
  376. "data": {
  377. "image/png": "\n",
  378. "text/plain": [
  379. "<Figure size 432x288 with 1 Axes>"
  380. ]
  381. },
  382. "metadata": {
  383. "needs_background": "light"
  384. },
  385. "output_type": "display_data"
  386. }
  387. ],
  388. "source": [
  389. "plot_decision_boundary(lambda x: plot_seq(x), x.numpy(), y.numpy())\n",
  390. "plt.title('sequential')"
  391. ]
  392. },
  393. {
  394. "cell_type": "markdown",
  395. "metadata": {},
  396. "source": [
  397. "### 2.2 Saving Model Parameters\n",
  398. "\n",
  399. "PyTorch offers two ways to save a model: save the model structure together with its parameters, or save only the parameters."
  400. ]
  401. },
  402. {
  403. "cell_type": "code",
  404. "execution_count": 19,
  405. "metadata": {
  406. "collapsed": true
  407. },
  408. "outputs": [],
  409. "source": [
  410. "# Save the model structure and parameters together\n",
  411. "torch.save(seq_net, 'save_seq_net.pth')"
  412. ]
  413. },
  414. {
  415. "cell_type": "markdown",
  416. "metadata": {},
  417. "source": [
  418. "That is all it takes to save a model. `torch.save` takes two arguments: the object to save and the path to save it to. Loading the model back is just as simple."
  419. ]
  420. },
  421. {
  422. "cell_type": "code",
  423. "execution_count": 20,
  424. "metadata": {
  425. "collapsed": true
  426. },
  427. "outputs": [],
  428. "source": [
  429. "# Load the saved model\n",
  430. "seq_net1 = torch.load('save_seq_net.pth')"
  431. ]
  432. },
  433. {
  434. "cell_type": "code",
  435. "execution_count": 21,
  436. "metadata": {},
  437. "outputs": [
  438. {
  439. "data": {
  440. "text/plain": [
  441. "Sequential(\n",
  442. " (0): Linear(in_features=2, out_features=4, bias=True)\n",
  443. " (1): Tanh()\n",
  444. " (2): Linear(in_features=4, out_features=1, bias=True)\n",
  445. ")"
  446. ]
  447. },
  448. "execution_count": 21,
  449. "metadata": {},
  450. "output_type": "execute_result"
  451. }
  452. ],
  453. "source": [
  454. "seq_net1"
  455. ]
  456. },
  457. {
  458. "cell_type": "code",
  459. "execution_count": 22,
  460. "metadata": {},
  461. "outputs": [
  462. {
  463. "name": "stdout",
  464. "output_type": "stream",
  465. "text": [
  466. "Parameter containing:\n",
  467. "tensor([[-5.7823, 5.7006],\n",
  468. " [ 5.3129, 3.6949],\n",
  469. " [ 3.5471, -0.7431],\n",
  470. " [ 2.4003, 1.7605]], requires_grad=True)\n"
  471. ]
  472. }
  473. ],
  474. "source": [
  475. "print(seq_net1[0].weight)"
  476. ]
  477. },
  478. {
  479. "cell_type": "markdown",
  480. "metadata": {},
  481. "source": [
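  482. "As shown, we loaded the model back under the name seq_net1 and printed the parameters of its first layer.\n",
  483. "\n",
  484. "Now let's look at the second way of saving a model: saving only the parameters, not the model structure."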
  485. ]
  486. },
  487. {
  488. "cell_type": "code",
  489. "execution_count": 23,
  490. "metadata": {
  491. "collapsed": true
  492. },
  493. "outputs": [],
  494. "source": [
  495. "# Save the model parameters\n",
  496. "torch.save(seq_net.state_dict(), 'save_seq_net_params.pth')"
  497. ]
  498. },
  499. {
  500. "cell_type": "markdown",
  501. "metadata": {},
  502. "source": [
  503. "This saves the model's parameters. To load them back, we first define the model again and then load the parameters into it."
  504. ]
  505. },
  506. {
  507. "cell_type": "code",
  508. "execution_count": 24,
  509. "metadata": {},
  510. "outputs": [
  511. {
  512. "data": {
  513. "text/plain": [
  514. "<All keys matched successfully>"
  515. ]
  516. },
  517. "execution_count": 24,
  518. "metadata": {},
  519. "output_type": "execute_result"
  520. }
  521. ],
  522. "source": [
  523. "seq_net2 = nn.Sequential(\n",
  524. " nn.Linear(2, 4),\n",
  525. " nn.Tanh(),\n",
  526. " nn.Linear(4, 1)\n",
  527. ")\n",
  528. "\n",
  529. "seq_net2.load_state_dict(torch.load('save_seq_net_params.pth'))"
  530. ]
  531. },
  532. {
  533. "cell_type": "code",
  534. "execution_count": 25,
  535. "metadata": {},
  536. "outputs": [
  537. {
  538. "data": {
  539. "text/plain": [
  540. "Sequential(\n",
  541. " (0): Linear(in_features=2, out_features=4, bias=True)\n",
  542. " (1): Tanh()\n",
  543. " (2): Linear(in_features=4, out_features=1, bias=True)\n",
  544. ")"
  545. ]
  546. },
  547. "execution_count": 25,
  548. "metadata": {},
  549. "output_type": "execute_result"
  550. }
  551. ],
  552. "source": [
  553. "seq_net2"
  554. ]
  555. },
  556. {
  557. "cell_type": "code",
  558. "execution_count": 26,
  559. "metadata": {},
  560. "outputs": [
  561. {
  562. "name": "stdout",
  563. "output_type": "stream",
  564. "text": [
  565. "Parameter containing:\n",
  566. "tensor([[-5.7823, 5.7006],\n",
  567. " [ 5.3129, 3.6949],\n",
  568. " [ 3.5471, -0.7431],\n",
  569. " [ 2.4003, 1.7605]], requires_grad=True)\n"
  570. ]
  571. }
  572. ],
  573. "source": [
  574. "print(seq_net2[0].weight)"
  575. ]
  576. },
  577. {
  578. "cell_type": "markdown",
  579. "metadata": {},
  580. "source": [
  581. "This also restores the same model: printing the first layer's parameters shows that they match those obtained with the first method."
  582. ]
  583. },
  584. {
  585. "cell_type": "markdown",
  586. "metadata": {},
  587. "source": [
  588. "Of these two ways to save and load a model, we recommend the **second**, because it is more portable."
  589. ]
  590. },
  591. {
  592. "cell_type": "markdown",
  593. "metadata": {},
  594. "source": [
  595. "### 2.3 Module\n",
  596. "\n",
  597. "Next we define the same model with Module. The template for using Module is:\n",
  598. "\n",
  599. "```\n",
  600. "class NetworkName(nn.Module):\n",
  601. "    def __init__(self, some_arguments):\n",
  602. "        super(NetworkName, self).__init__()\n",
  603. "        self.layer1 = nn.Linear(num_input, num_hidden)\n",
  604. "        self.layer2 = nn.Sequential(...)\n",
  605. "        ...\n",
  606. "\n",
  607. "        # define the layers the network needs\n",
  608. "\n",
  609. "    def forward(self, x):  # define the forward pass\n",
  610. "        x1 = self.layer1(x)\n",
  611. "        x2 = self.layer2(x)\n",
  612. "        x = x1 + x2\n",
  613. "        ...\n",
  614. "        return x\n",
  615. "```\n",
  616. "\n",
  617. "Note that a Module can itself contain Sequential blocks. Module is also very flexible: arbitrarily complex operations can be written directly and readably in forward.\n",
  618. "\n",
  619. "Now let's implement the network above following this template."
  620. ]
  621. },
  622. {
  623. "cell_type": "code",
  624. "execution_count": 12,
  625. "metadata": {},
  626. "outputs": [],
  627. "source": [
  628. "class SimpNet(nn.Module):\n",
  629. "    def __init__(self, num_input, num_hidden, num_output):\n",
  630. "        super(SimpNet, self).__init__()\n",
  631. "        self.layer1 = nn.Linear(num_input, num_hidden)\n",
  632. "\n",
  633. "        self.layer2 = nn.Tanh()\n",
  634. "\n",
  635. "        self.layer3 = nn.Linear(num_hidden, num_output)\n",
  636. "\n",
  637. "    def forward(self, x):\n",
  638. "        x = self.layer1(x)\n",
  639. "        x = self.layer2(x)\n",
  640. "        x = self.layer3(x)\n",
  641. "        return x"
  642. ]
  643. },
  644. {
  645. "cell_type": "code",
  646. "execution_count": 13,
  647. "metadata": {},
  648. "outputs": [],
  649. "source": [
  650. "mo_net = SimpNet(2, 4, 1)"
  651. ]
  652. },
  653. {
  654. "cell_type": "code",
  655. "execution_count": 14,
  656. "metadata": {},
  657. "outputs": [
  658. {
  659. "name": "stdout",
  660. "output_type": "stream",
  661. "text": [
  662. "Linear(in_features=2, out_features=4, bias=True)\n"
  663. ]
  664. }
  665. ],
  666. "source": [
  667. "# A layer in the model can be accessed directly by its attribute name\n",
  668. "\n",
  669. "# First layer\n",
  670. "l1 = mo_net.layer1\n",
  671. "print(l1)"
  672. ]
  673. },
  674. {
  675. "cell_type": "code",
  676. "execution_count": 15,
  677. "metadata": {},
  678. "outputs": [
  679. {
  680. "name": "stdout",
  681. "output_type": "stream",
  682. "text": [
  683. "Parameter containing:\n",
  684. "tensor([[ 0.6988, 0.2605],\n",
  685. " [-0.4452, 0.1708],\n",
  686. " [-0.3578, 0.6637],\n",
  687. " [ 0.2984, -0.1281]], requires_grad=True)\n"
  688. ]
  689. }
  690. ],
  691. "source": [
  692. "# Print the weights of the first layer\n",
  693. "print(l1.weight)"
  694. ]
  695. },
  696. {
  697. "cell_type": "code",
  698. "execution_count": 16,
  699. "metadata": {},
  700. "outputs": [],
  701. "source": [
  702. "# Define the optimizer\n",
  703. "optim = torch.optim.SGD(mo_net.parameters(), 1.)"
  704. ]
  705. },
  706. {
  707. "cell_type": "code",
  708. "execution_count": 17,
  709. "metadata": {},
  710. "outputs": [
  711. {
  712. "name": "stdout",
  713. "output_type": "stream",
  714. "text": [
  715. "epoch: 1000, loss: 0.0754304826259613\n",
  716. "epoch: 2000, loss: 0.06512685120105743\n",
  717. "epoch: 3000, loss: 0.061497319489717484\n",
  718. "epoch: 4000, loss: 0.055132776498794556\n",
  719. "epoch: 5000, loss: 0.04916892945766449\n",
  720. "epoch: 6000, loss: 0.04603230580687523\n",
  721. "epoch: 7000, loss: 0.04394793137907982\n",
  722. "epoch: 8000, loss: 0.04242979362607002\n",
  723. "epoch: 9000, loss: 0.041267599910497665\n",
  724. "epoch: 10000, loss: 0.04034609720110893\n"
  725. ]
  726. }
  727. ],
  728. "source": [
  729. "# Train for 10000 iterations\n",
  730. "for e in range(10000):\n",
  731. "    out = mo_net(x)\n",
  732. "    loss = criterion(out, y)\n",
  733. "    optim.zero_grad()\n",
  734. "    loss.backward()\n",
  735. "    optim.step()\n",
  736. "    if (e + 1) % 1000 == 0:\n",
  737. "        print('epoch: {}, loss: {}'.format(e+1, loss.item()))"
  738. ]
  739. },
  740. {
  741. "cell_type": "code",
  742. "execution_count": 33,
  743. "metadata": {
  744. "collapsed": true
  745. },
  746. "outputs": [],
  747. "source": [
  748. "# Save the model (parameters only, via state_dict)\n",
  749. "torch.save(mo_net.state_dict(), 'module_net.pth')"
  750. ]
  751. },
  752. {
  753. "cell_type": "markdown",
  754. "metadata": {},
  755. "source": [
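  756. "As shown, we obtain a similarly low loss, and defining the model with Sequential or Module is more convenient than declaring every parameter by hand.\n",
  757. "\n",
  758. "In this section we still optimize the parameters with gradient descent. In neural networks, the gradients it needs are computed by the backpropagation algorithm, which the next lesson explains."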
  759. ]
  760. },
  761. {
  762. "cell_type": "markdown",
  763. "metadata": {},
  764. "source": [
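  765. "## Exercises\n",
  766. "\n",
  767. "* Change the number of neurons in the hidden layer, or try defining a model with 5 or more layers; increase the number of training iterations and change the learning rate, and see how the results change."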
  768. ]
  769. }
  770. ],
  771. "metadata": {
  772. "kernelspec": {
  773. "display_name": "Python 3 (ipykernel)",
  774. "language": "python",
  775. "name": "python3"
  776. },
  777. "language_info": {
  778. "codemirror_mode": {
  779. "name": "ipython",
  780. "version": 3
  781. },
  782. "file_extension": ".py",
  783. "mimetype": "text/x-python",
  784. "name": "python",
  785. "nbconvert_exporter": "python",
  786. "pygments_lexer": "ipython3",
  787. "version": "3.9.7"
  788. }
  789. },
  790. "nbformat": 4,
  791. "nbformat_minor": 2
  792. }
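
The network in section 1.2 returns raw scores (logits) rather than probabilities, because `nn.BCEWithLogitsLoss` applies the sigmoid itself. The following is a minimal sketch in plain Python, with no PyTorch dependency, of why computing the loss from the logit matches applying a sigmoid and then binary cross-entropy. The helper names `bce_from_probability` and `bce_with_logits` are illustrative and are not part of any library.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def bce_from_probability(p, y):
    """Binary cross-entropy computed from a probability p in (0, 1)."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def bce_with_logits(z, y):
    """The same loss computed directly from the logit z.

    Uses the identity max(z, 0) - z * y + log(1 + exp(-|z|)),
    which never takes the log of a value rounded to zero.
    """
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    for z, y in [(-2.0, 0.0), (0.5, 1.0), (3.0, 0.0)]:
        via_sigmoid = bce_from_probability(sigmoid(z), y)
        via_logits = bce_with_logits(z, y)
        print(f"z={z:+.1f} y={y:.0f}  sigmoid+BCE: {via_sigmoid:.6f}  logits: {via_logits:.6f}")
```

The two forms agree for moderate logits, but the logit form stays finite for extreme ones: with `z = 1000.0` and `y = 0.0`, the sigmoid rounds to exactly 1.0 and the probability form would need `log(0)`.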

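Machine learning is increasingly applied in fields such as aircraft and robotics, with the aim of using computers to achieve human-like intelligence and thereby make equipment intelligent and unmanned. This course aims to guide students to master the basic knowledge, typical methods, and techniques of machine learning, to spark their interest in the discipline through concrete application cases, and to encourage them to analyze and solve the problems and challenges facing aircraft and robots from the perspective of artificial intelligence. The course covers Python programming basics, machine learning models, and the fundamentals and implementation of unsupervised learning, supervised learning, and deep learning, and teaches how to apply machine learning to practical problems, thereby comprehensively improving students' overall abilities.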
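
Section 1.1 of the notebook describes forward propagation as applying the computation one layer after another. As a rough sketch of that idea for the 2-4-1 network used throughout the notebook, the following plain-Python version spells out each step with lists instead of tensors. The weights are made-up illustrative values, not the trained parameters, and the helper names `linear` and `forward` are assumptions of this sketch.

```python
import math


def linear(x, weights, bias):
    """Compute x @ W + b for one sample; weights has shape (n_in, n_out)."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def sigmoid(z):
    """Logistic activation (adequate for the small values used here)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Two-layer network: linear -> sigmoid -> linear, returning one logit."""
    w1, b1, w2, b2 = params
    hidden = [sigmoid(v) for v in linear(x, w1, b1)]
    return linear(hidden, w2, b2)[0]


# Illustrative weights only (not the trained values from the notebook).
params = (
    [[0.1, -0.2, 0.3, 0.0], [0.0, 0.1, -0.1, 0.2]],  # w1: 2 x 4
    [0.0, 0.0, 0.0, 0.0],                              # b1: 4
    [[0.5], [-0.5], [0.25], [0.0]],                    # w2: 4 x 1
    [0.0],                                             # b2: 1
)

logit = forward([1.0, 2.0], params)
probability = sigmoid(logit)
print(f"logit={logit:.4f}, probability={probability:.4f}")
```

The structure mirrors `SimpNetwork` in section 1.2: the hidden layer is a matrix product plus bias followed by an activation, and the output layer is a second matrix product plus bias that yields the logit.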