diff --git a/1_logistic_regression/Logistic_regression.ipynb b/1_logistic_regression/Logistic_regression.ipynb
index 2e6c35f..53b24cf 100644
--- a/1_logistic_regression/Logistic_regression.ipynb
+++ b/1_logistic_regression/Logistic_regression.ipynb
@@ -25,6 +25,89 @@
    ]
   },
   {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### The logistic regression expression\n",
+    "\n",
+    "The function used here is the logistic function, also known as the sigmoid function. Its formula is:\n",
+    "\n",
+    "$$\n",
+    "g(z) = \\frac{1}{1+e^{-z}}\n",
+    "$$\n",
+    "\n",
+    "As $z$ approaches positive infinity, $g(z)$ approaches 1; as $z$ approaches negative infinity, $g(z)$ approaches 0. The graph of the logistic function is shown in the figure above. Its derivative has a property that will be used in the derivation below:\n",
+    "$$\n",
+    "g'(z) = \\frac{d}{dz} \\frac{1}{1+e^{-z}} \\\\\n",
+    "      = \\frac{1}{(1+e^{-z})^2}(e^{-z}) \\\\\n",
+    "      = \\frac{1}{(1+e^{-z})} (1 - \\frac{1}{(1+e^{-z})}) \\\\\n",
+    "      = g(z)(1-g(z))\n",
+    "$$\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Logistic regression is essentially linear regression with one extra mapping from features to output: we first take a linear combination of the features and then use the function $g(z)$ as the hypothesis function for prediction. $g(z)$ maps any real value into the interval (0, 1). Substituting the linear regression expression into $g(z)$ gives the logistic regression hypothesis:\n",
+    "\n",
+    "$$\n",
+    "h_\\theta(x) = g(\\theta^T x) = \\frac{1}{1+e^{-\\theta^T x}}\n",
+    "$$"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Soft classification with logistic regression\n",
+    "\n",
+    "The logistic function squashes the output $h_\\theta(x)$ into (0, 1), and this value has a special meaning: it is the probability that the label is 1. For an input $x$, the probabilities of class 1 and class 0 are therefore:\n",
+    "\n",
+    "$$\n",
+    "P(y=1|x,\\theta) = h_\\theta(x) \\\\\n",
+    "P(y=0|x,\\theta) = 1 - h_\\theta(x)\n",
+    "$$\n",
+    "\n",
+    "Combining the two expressions into one:\n",
+    "\n",
+    "$$\n",
+    "P(y|x,\\theta) = (h_\\theta(x))^y (1 - h_\\theta(x))^{1-y}\n",
+    "$$\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Gradient ascent\n",
+    "\n",
+    "With the logistic regression expression in hand, the next step mirrors linear regression: construct the likelihood function, apply maximum likelihood estimation, and derive the iterative update rule for $\\theta$. The only difference is that we use gradient ascent rather than gradient descent, because here the likelihood is being maximized.\n",
+    "\n",
+    "Assuming the training samples are mutually independent, the likelihood function is:\n",
+    "![Loss](images/eq_loss.png)\n",
+    "\n",
+    "Taking the log of the likelihood, as before, turns it into:\n",
+    "![LogLoss](images/eq_logloss.png)\n",
+    "\n",
+    "We then take the partial derivative of the log-likelihood with respect to $\\theta$, using a single training sample as an example:\n",
+    "![LogLossDiff](images/eq_logloss_diff.png)\n",
+    "\n",
+    "In this derivation:\n",
+    "* The first step rewrites the partial derivative with respect to $\\theta$ using the rule that if $y=\\ln x$ then $y'=1/x$.\n",
+    "* The second step applies the derivative property of the sigmoid, $g'(z) = g(z)(1 - g(z))$.\n",
+    "* The third step is ordinary algebraic rearrangement.\n",
+    "\n",
+    "This gives the update direction for each iteration of gradient ascent, so the update rule for $\\theta$ is:\n",
+    "$$\n",
+    "\\theta_j := \\theta_j + \\alpha(y^i - h_\\theta(x^i)) x_j^i\n",
+    "$$\n",
+    "\n"
+   ]
+  },
+  {
    "cell_type": "code",
    "execution_count": 2,
    "metadata": {},
diff --git a/1_logistic_regression/Logistic_regression.py b/1_logistic_regression/Logistic_regression.py
index eabdefb..971b871 100644
--- a/1_logistic_regression/Logistic_regression.py
+++ b/1_logistic_regression/Logistic_regression.py
@@ -38,6 +38,72 @@
 #
 #
+# ### The logistic regression expression
+#
+# The function used here is the logistic function, also known as the sigmoid function. Its formula is:
+#
+# $$
+# g(z) = \frac{1}{1+e^{-z}}
+# $$
+#
+# As $z$ approaches positive infinity, $g(z)$ approaches 1; as $z$ approaches negative infinity, $g(z)$ approaches 0. The graph of the logistic function is shown in the figure above. Its derivative has a property that will be used in the derivation below:
+# $$
+# g'(z) = \frac{d}{dz} \frac{1}{1+e^{-z}} \\
+#       = \frac{1}{(1+e^{-z})^2}(e^{-z}) \\
+#       = \frac{1}{(1+e^{-z})} (1 - \frac{1}{(1+e^{-z})}) \\
+#       = g(z)(1-g(z))
+# $$
+#
+#
+
+# Logistic regression is essentially linear regression with one extra mapping from features to output: we first take a linear combination of the features and then use the function $g(z)$ as the hypothesis function for prediction. $g(z)$ maps any real value into the interval (0, 1). Substituting the linear regression expression into $g(z)$ gives the logistic regression hypothesis:
+#
+# $$
+# h_\theta(x) = g(\theta^T x) = \frac{1}{1+e^{-\theta^T x}}
+# $$
+
+# ### Soft classification with logistic regression
+#
+# The logistic function squashes the output $h_\theta(x)$ into (0, 1), and this value has a special meaning: it is the probability that the label is 1. For an input $x$, the probabilities of class 1 and class 0 are therefore:
+#
+# $$
+# P(y=1|x,\theta) = h_\theta(x) \\
+# P(y=0|x,\theta) = 1 - h_\theta(x)
+# $$
+#
+# Combining the two expressions into one:
+#
+# $$
+# P(y|x,\theta) = (h_\theta(x))^y (1 - h_\theta(x))^{1-y}
+# $$
+#
+#
+
+# ### Gradient ascent
+#
+# With the logistic regression expression in hand, the next step mirrors linear regression: construct the likelihood function, apply maximum likelihood estimation, and derive the iterative update rule for $\theta$. The only difference is that we use gradient ascent rather than gradient descent, because here the likelihood is being maximized.
+#
+# Assuming the training samples are mutually independent, the likelihood function is:
+# ![Loss](images/eq_loss.png)
+#
+# Taking the log of the likelihood, as before, turns it into:
+# ![LogLoss](images/eq_logloss.png)
+#
+# We then take the partial derivative of the log-likelihood with respect to $\theta$, using a single training sample as an example:
+# ![LogLossDiff](images/eq_logloss_diff.png)
+#
+# In this derivation:
+# * The first step rewrites the partial derivative with respect to $\theta$ using the rule that if $y=\ln x$ then $y'=1/x$.
+# * The second step applies the derivative property of the sigmoid, $g'(z) = g(z)(1 - g(z))$.
+# * The third step is ordinary algebraic rearrangement.
+#
+# This gives the update direction for each iteration of gradient ascent, so the update rule for $\theta$ is:
+# $$
+# \theta_j := \theta_j + \alpha(y^i - h_\theta(x^i)) x_j^i
+# $$
+#
+#
+

 # +

 # %matplotlib inline
diff --git a/1_logistic_regression/images/eq_logloss.png b/1_logistic_regression/images/eq_logloss.png
new file mode 100644
index 0000000..a802d44
Binary files /dev/null and b/1_logistic_regression/images/eq_logloss.png differ
diff --git a/1_logistic_regression/images/eq_logloss_diff.png b/1_logistic_regression/images/eq_logloss_diff.png
new file mode 100644
index 0000000..337f9c5
Binary files /dev/null and b/1_logistic_regression/images/eq_logloss_diff.png differ
diff --git a/1_logistic_regression/images/eq_loss.png b/1_logistic_regression/images/eq_loss.png
new file mode 100644
index 0000000..8e1bd6b
Binary files /dev/null and b/1_logistic_regression/images/eq_loss.png differ
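For reference, a minimal NumPy sketch of the expressions the added cells describe: the sigmoid $g(z)$, the hypothesis $h_\theta(x) = g(\theta^T x)$, and the gradient-ascent update on the log-likelihood. It is not part of the patch above; the function names, the toy two-blob dataset, the learning rate `alpha`, and the iteration count `n_iters` are illustrative assumptions, and the batch form used here simply sums the per-sample term $(y^i - h_\theta(x^i))\, x^i$ from the derivation over the training set.

```python
import numpy as np


def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def gradient_ascent(X, y, alpha=0.1, n_iters=1000):
    """Maximize the logistic regression log-likelihood by batch gradient ascent.

    X is (n_samples, n_features) and is assumed to already contain a bias
    column of ones; y is a 0/1 label vector.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        h = sigmoid(X @ theta)               # h_theta(x) for every sample
        gradient = X.T @ (y - h)             # sum_i (y^i - h_theta(x^i)) * x^i
        theta += alpha * gradient / len(y)   # averaged step; a scaling choice
    return theta


if __name__ == "__main__":
    # Two Gaussian blobs as a linearly separable toy problem (illustrative data).
    rng = np.random.default_rng(0)
    X_pos = rng.normal(loc=2.0, scale=1.0, size=(50, 2))
    X_neg = rng.normal(loc=-2.0, scale=1.0, size=(50, 2))
    X = np.vstack([X_pos, X_neg])
    X = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend the bias column
    y = np.concatenate([np.ones(50), np.zeros(50)])

    theta = gradient_ascent(X, y)
    preds = (sigmoid(X @ theta) >= 0.5).astype(int)
    print("theta:", theta)
    print("training accuracy:", (preds == y).mean())
```

Dividing the summed gradient by the number of samples keeps the step size comparable across dataset sizes; dropping that division recovers the exact per-sample update written in the derivation.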