From 1c99404f4d3799977ef01295c7fb70d7b723b525 Mon Sep 17 00:00:00 2001
From: bushuhui
Date: Mon, 10 Jan 2022 10:37:52 +0800
Subject: [PATCH] Improve numbering of softmax-ce

---
 5_nn/3-softmax_ce.ipynb | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/5_nn/3-softmax_ce.ipynb b/5_nn/3-softmax_ce.ipynb
index 8e5a1d1..fe99c61 100644
--- a/5_nn/3-softmax_ce.ipynb
+++ b/5_nn/3-softmax_ce.ipynb
@@ -143,7 +143,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 2. Derivation\n",
+    "## 3. Derivation\n",
     "\n",
     "First, let us be clear about what we want to compute: the gradient of the $loss$ with respect to the neuron output ($z_i$), that is:\n",
     "\n",
@@ -158,14 +158,26 @@
     "$$\n",
     "\n",
     "One might ask why this is $a_j$ rather than $a_i$. The answer lies in the $softmax$ formula: its denominator contains the outputs of all neurons, so every output other than the $i$-th also depends on $z_i$. All of the $a_j$ therefore have to be included in the calculation, and as the later steps show, the derivative must be taken separately for the two cases $i = j$ and $i \\ne j$.\n",
-    "\n",
-    "### 2.1 Partial derivative with respect to $a_j$\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 3.1 Partial derivative with respect to $a_j$\n",
     "\n",
     "$$\n",
     "\\frac{\\partial C}{\\partial a_j} = \\frac{\\partial (-\\sum_j y_j \\ln a_j)}{\\partial a_j} = -\\sum_j y_j \\frac{1}{a_j}\n",
     "$$\n",
-    "\n",
-    "### 2.2 Partial derivative with respect to $z_i$\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 3.2 Partial derivative with respect to $z_i$\n",
     "\n",
     "If $i=j$:\n",
     "\n",
@@ -188,8 +200,14 @@
     "$$\n",
     "(\\frac{u}{v})' = \\frac{u'v - uv'}{v^2} \n",
     "$$\n",
-    "\n",
-    "### 2.3 The full derivation\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 3.3 The full derivation\n",
     "\n",
     "\\begin{eqnarray}\n",
     "\\frac{\\partial C}{\\partial z_i} & = & (-\\sum_j y_j \\frac{1}{a_j} ) \\frac{\\partial a_j}{\\partial z_i} \\\\\n",
@@ -234,7 +252,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 3. Question\n",
+    "## 4. Question\n",
     "How can the softmax and cross-entropy cost function covered in this section be applied to the BP method from the previous section?"
    ]
   },