
Improve figure save

pull/14/MERGE
bushuhui 3 years ago
parent commit 88cb7fe231
13 changed files with 18 additions and 987 deletions
  1. 4_logistic_regression/1-Least_squares.ipynb (+11, -12)
  2. 4_logistic_regression/2-Logistic_regression.ipynb (+7, -7)
  3. 4_logistic_regression/isomap_visualize.pdf (BIN)
  4. 4_logistic_regression/logistic_confusion_matrix.pdf (BIN)
  5. 4_logistic_regression/logistic_pred_res.pdf (BIN)
  6. 4_logistic_regression/logistic_train_data.pdf (BIN)
  7. 4_logistic_regression/logstic_fuction.pdf (BIN)
  8. 4_logistic_regression/ls.ipynb (+0, -968)
  9. 4_logistic_regression/missle_est.pdf (BIN)
  10. 4_logistic_regression/missle_taj.pdf (BIN)
  11. 4_logistic_regression/pca_visualize.pdf (BIN)
  12. 4_logistic_regression/sklean_isomap_confusion_matrix.pdf (BIN)
  13. 4_logistic_regression/sklearn_linear_fitting.pdf (BIN)

4_logistic_regression/1-Least_squares.ipynb (+11, -12)

@@ -31,7 +31,7 @@
"source": [ "source": [
"### 1.1 示例\n", "### 1.1 示例\n",
"\n", "\n",
"假设我们有下面的一些观测数据,我们希望找到他们内在的规律。"
"假设我们有下面的一些观测数据,希望找到它们内在的规律。"
] ]
}, },
{ {
@@ -82,7 +82,7 @@
"$$\n", "$$\n",
"其中$\\mathbf{X}$为自变量,$\\mathbf{Y}$为因变量。\n", "其中$\\mathbf{X}$为自变量,$\\mathbf{Y}$为因变量。\n",
"\n", "\n",
"我们希望找到一个模型能够解释这些数据,假设使用最简单的线性模型来拟合数据:\n",
"希望找到一个模型能够解释这些数据,假设使用最简单的线性模型来拟合数据:\n",
"$$\n", "$$\n",
"y = ax + b\n", "y = ax + b\n",
"$$\n", "$$\n",
@@ -190,11 +190,10 @@
"梯度下降法有很多优点,其中最主要的优点是,**在梯度下降法的求解过程中只需求解损失函数的一阶导数,计算的代价比较小,这使得梯度下降法能在很多大规模数据集上得到应用。**\n", "梯度下降法有很多优点,其中最主要的优点是,**在梯度下降法的求解过程中只需求解损失函数的一阶导数,计算的代价比较小,这使得梯度下降法能在很多大规模数据集上得到应用。**\n",
"\n", "\n",
"梯度下降法的含义是通过当前点的梯度方向寻找到新的迭代点。梯度下降法的基本思想可以类比为一个下山的过程。假设这样一个场景:\n", "梯度下降法的含义是通过当前点的梯度方向寻找到新的迭代点。梯度下降法的基本思想可以类比为一个下山的过程。假设这样一个场景:\n",
"* 一个人被困在山上,需要从山上下来(i.e. 找到山的最低点,也就是山谷)。\n",
"* 一个人被困在山上,需要从山上下来,找到山的最低点,也就是山谷;\n",
"* 但此时山上的浓雾很大,导致可视度很低。因此,下山的路径就无法全部确定,他必须利用自己周围的信息去找到下山的路径。\n", "* 但此时山上的浓雾很大,导致可视度很低。因此,下山的路径就无法全部确定,他必须利用自己周围的信息去找到下山的路径。\n",
"* 这个时候,他就可以利用梯度下降算法来帮助自己下山。\n",
" - 具体来说就是,以他当前的所处的位置为基准,寻找这个位置最陡峭的地方,然后朝着山的高度下降的地方走\n",
" - 然后每走一段距离,都反复采用同一个方法,最后就能成功的抵达山谷。\n",
"* 以他当前的所处的位置为基准,寻找这个位置最陡峭的地方,然后朝着山的高度下降的地方走\n",
"* 每走一段距离,都反复采用同一个方法,最后就能成功的抵达山谷。\n",
"\n", "\n",
"\n", "\n",
"一般情况下,这座山最陡峭的地方是无法通过肉眼立马观察出来的,而是需要一个工具来测量;同时,这个人此时正好拥有测量出最陡峭方向的能力。所以,此人每走一段距离,都需要一段时间来测量所在位置最陡峭的方向,这是比较耗时的。那么为了在太阳下山之前到达山底,就要尽可能的减少测量方向的次数。这是一个两难的选择,如果测量的频繁,可以保证下山的方向是绝对正确的,但又非常耗时;如果测量的过少,又有偏离轨道的风险。所以需要找到一个合适的测量方向的频率,来确保下山的方向不错误,同时又不至于耗时太多!\n", "一般情况下,这座山最陡峭的地方是无法通过肉眼立马观察出来的,而是需要一个工具来测量;同时,这个人此时正好拥有测量出最陡峭方向的能力。所以,此人每走一段距离,都需要一段时间来测量所在位置最陡峭的方向,这是比较耗时的。那么为了在太阳下山之前到达山底,就要尽可能的减少测量方向的次数。这是一个两难的选择,如果测量的频繁,可以保证下山的方向是绝对正确的,但又非常耗时;如果测量的过少,又有偏离轨道的风险。所以需要找到一个合适的测量方向的频率,来确保下山的方向不错误,同时又不至于耗时太多!\n",
@@ -209,7 +208,7 @@
"L = \\sum_{i=1}^{N} (y_i - a x_i - b)^2\n", "L = \\sum_{i=1}^{N} (y_i - a x_i - b)^2\n",
"$$\n", "$$\n",
"\n", "\n",
"我们更新的策略是:\n",
"更新的策略是:\n",
"$$\n", "$$\n",
"\\theta^1 = \\theta^0 - \\eta \\triangledown L(\\theta)\n", "\\theta^1 = \\theta^0 - \\eta \\triangledown L(\\theta)\n",
"$$\n", "$$\n",
@@ -217,7 +216,7 @@
"\n", "\n",
"此公式的意义是:$L$是关于$\\theta$的一个函数,我们当前所处的位置为$\\theta_0$点,要从这个点走到L的最小值点,也就是山底。首先我们先确定前进的方向,也就是梯度的反向,然后走一段距离的步长,也就是$\\eta$,走完这个段步长,就到达了$\\theta_1$这个点!\n", "此公式的意义是:$L$是关于$\\theta$的一个函数,我们当前所处的位置为$\\theta_0$点,要从这个点走到L的最小值点,也就是山底。首先我们先确定前进的方向,也就是梯度的反向,然后走一段距离的步长,也就是$\\eta$,走完这个段步长,就到达了$\\theta_1$这个点!\n",
"\n", "\n",
"更新的策略是:\n",
"最终的更新方程是:\n",
"\n", "\n",
"$$\n", "$$\n",
"a^1 = a^0 + 2 \\eta [ y - (ax+b)]*x \\\\\n", "a^1 = a^0 + 2 \\eta [ y - (ax+b)]*x \\\\\n",
@@ -1410,7 +1409,7 @@
"plt.plot(t, y)\n", "plt.plot(t, y)\n",
"plt.xlabel(\"time\")\n", "plt.xlabel(\"time\")\n",
"plt.ylabel(\"height\")\n", "plt.ylabel(\"height\")\n",
"plt.savefig(\"missle_taj.pdf\")\n",
"plt.savefig(\"fig-res-missle_taj.pdf\")\n",
"plt.show()" "plt.show()"
] ]
}, },
@@ -1493,7 +1492,7 @@
"plt.plot(t, y, 'r-', label='Real data')\n", "plt.plot(t, y, 'r-', label='Real data')\n",
"plt.plot(t, y_est, 'g-x', label='Estimated data')\n", "plt.plot(t, y_est, 'g-x', label='Estimated data')\n",
"plt.legend()\n", "plt.legend()\n",
"plt.savefig(\"missle_est.pdf\")\n",
"plt.savefig(\"fig-res-missle_est.pdf\")\n",
"plt.show()\n" "plt.show()\n"
] ]
}, },
@@ -1562,7 +1561,7 @@
"plt.plot([x_min, x_max], [y_min, y_max], 'r')\n", "plt.plot([x_min, x_max], [y_min, y_max], 'r')\n",
"plt.xlabel(\"X\")\n", "plt.xlabel(\"X\")\n",
"plt.ylabel(\"Y\")\n", "plt.ylabel(\"Y\")\n",
"plt.savefig(\"sklearn_linear_fitting.pdf\")\n",
"plt.savefig(\"fig-res-sklearn_linear_fitting.pdf\")\n",
"plt.show()" "plt.show()"
] ]
}, },
@@ -1636,7 +1635,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.5.4"
"version": "3.7.9"
} }
}, },
"nbformat": 4, "nbformat": 4,

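The hunks above walk through gradient descent on the squared loss $L = \sum_{i=1}^{N} (y_i - a x_i - b)^2$ with the update $a^1 = a^0 + 2\eta[y-(ax+b)]x$. A minimal runnable sketch of that loop, assuming synthetic data and hand-picked values for eta and the iteration count (none of these names or numbers come from the notebook):

```python
import numpy as np

# Synthetic observations around a known line (illustrative, not notebook data).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=100)

# Gradient descent on L = sum_i (y_i - a*x_i - b)^2.
# dL/da = -2*sum(r*x) and dL/db = -2*sum(r), where r = y - (a*x + b),
# giving exactly the update a^1 = a^0 + 2*eta*sum(r*x) from the hunk above.
a, b = 0.0, 0.0
eta = 1e-4        # step size (learning rate), hand-picked for this data
for _ in range(5000):
    r = y - (a * x + b)             # residuals at the current (a, b)
    a += 2 * eta * np.sum(r * x)    # step along -dL/da
    b += 2 * eta * np.sum(r)        # step along -dL/db

print(a, b)       # should approach the true 2.0 and 1.0
```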

4_logistic_regression/2-Logistic_regression.ipynb (+7, -7)

@@ -19,7 +19,7 @@
"\n", "\n",
"一说回归最先想到的是终结者那句:I'll be back\n", "一说回归最先想到的是终结者那句:I'll be back\n",
"\n", "\n",
"regress,re表示back,gress等于go,数值go back to mean value,也就是I'll be back 的意思\n",
"regress,re表示back,gress等于go,数值go back to mean value,也就是I'll be back 的意思\n",
"\n", "\n",
"在数理统计中,回归是确定多种变量相互依赖的定量关系的方法\n", "在数理统计中,回归是确定多种变量相互依赖的定量关系的方法\n",
"\n", "\n",
@@ -80,7 +80,7 @@
"y=1/(1+np.e**(-X))\n", "y=1/(1+np.e**(-X))\n",
"plt.plot(X,y,'b-')\n", "plt.plot(X,y,'b-')\n",
"plt.title(\"Logistic function\")\n", "plt.title(\"Logistic function\")\n",
"plt.savefig(\"logstic_fuction.pdf\")\n",
"plt.savefig(\"fig-res-logstic_fuction.pdf\")\n",
"plt.show()" "plt.show()"
] ]
}, },
@@ -227,7 +227,7 @@
"data, label = sklearn.datasets.make_moons(200, noise=0.30)\n", "data, label = sklearn.datasets.make_moons(200, noise=0.30)\n",
"\n", "\n",
"plt.scatter(data[:,0], data[:,1], c=label)\n", "plt.scatter(data[:,0], data[:,1], c=label)\n",
"plt.savefig(\"logistic_train_data.pdf\")\n",
"plt.savefig(\"fig-res-logistic_train_data.pdf\")\n",
"plt.title(\"Original Data\")" "plt.title(\"Original Data\")"
] ]
}, },
@@ -408,7 +408,7 @@
"plt.colorbar()\n", "plt.colorbar()\n",
"plt.ylabel('Groundtruth')\n", "plt.ylabel('Groundtruth')\n",
"plt.xlabel(u'Predict')\n", "plt.xlabel(u'Predict')\n",
"plt.savefig('logistic_confusion_matrix.pdf')\n",
"plt.savefig('fig-res-logistic_confusion_matrix.pdf')\n",
"plt.show()" "plt.show()"
] ]
}, },
@@ -567,7 +567,7 @@
"\n", "\n",
"plt.scatter(proj[:, 0], proj[:, 1], c=digits.target)\n", "plt.scatter(proj[:, 0], proj[:, 1], c=digits.target)\n",
"plt.colorbar()\n", "plt.colorbar()\n",
"plt.savefig(\"pca_visualize.pdf\")\n",
"plt.savefig(\"fig-res-pca_visualize.pdf\")\n",
"plt.show()" "plt.show()"
] ]
}, },
@@ -605,7 +605,7 @@
"\n", "\n",
"plt.scatter(proj[:, 0], proj[:, 1], c=digits.target)\n", "plt.scatter(proj[:, 0], proj[:, 1], c=digits.target)\n",
"plt.colorbar()\n", "plt.colorbar()\n",
"plt.savefig(\"isomap_visualize.pdf\")\n",
"plt.savefig(\"fig-res-isomap_visualize.pdf\")\n",
"plt.show()" "plt.show()"
] ]
}, },
@@ -721,7 +721,7 @@
"plt.colorbar()\n", "plt.colorbar()\n",
"plt.ylabel(u'Groundtruth')\n", "plt.ylabel(u'Groundtruth')\n",
"plt.xlabel(u'Predict')\n", "plt.xlabel(u'Predict')\n",
"plt.savefig(\"sklean_isomap_confusion_matrix.pdf\")\n",
"plt.savefig(\"fig-res-sklean_isomap_confusion_matrix.pdf\")\n",
"plt.show()" "plt.show()"
] ]
}, },

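Every savefig hunk in this commit applies the same rename: saved figures gain a fig-res- prefix so generated outputs stand out next to the notebooks. A hypothetical helper that would centralize this convention; savefig_res and FIG_PREFIX are illustrative names, not part of the repository:

```python
import matplotlib.pyplot as plt

FIG_PREFIX = "fig-res-"   # output-naming convention introduced by this commit

def savefig_res(name, **kwargs):
    """Save the current figure as '<FIG_PREFIX><name>.pdf'.

    Example: savefig_res("missle_taj") writes fig-res-missle_taj.pdf,
    matching the renames applied throughout these notebooks.
    """
    plt.savefig(f"{FIG_PREFIX}{name}.pdf", **kwargs)
```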

4_logistic_regression/isomap_visualize.pdf (BIN)

4_logistic_regression/logistic_confusion_matrix.pdf (BIN)

4_logistic_regression/logistic_pred_res.pdf (BIN)

4_logistic_regression/logistic_train_data.pdf (BIN)

4_logistic_regression/logstic_fuction.pdf (BIN)

4_logistic_regression/ls.ipynb (+0, -968): file diff suppressed because it is too large

4_logistic_regression/missle_est.pdf (BIN)

4_logistic_regression/missle_taj.pdf (BIN)

4_logistic_regression/pca_visualize.pdf (BIN)

4_logistic_regression/sklean_isomap_confusion_matrix.pdf (BIN)

4_logistic_regression/sklearn_linear_fitting.pdf (BIN)

