finish '4.1 多特征(Multiple Features)'

7 years ago · 001a72fe40
--- a/notes/image/20180106_092119.png
+++ b/notes/image/20180106_092119.png
--- a/notes/image/20180107_234509.png
+++ b/notes/image/20180107_234509.png
--- a/notes/week1.md
+++ b/notes/week1.md
@@ -71,7 +71,7 @@

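   In the house price prediction example, we are given a set of house area data and use it to predict the price of a house of any area. Similarly, given a photo-age dataset, we can predict the age for a given photo.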

   ![](image\20180105_194712.png)
   ![](image/20180105_194712.png)

 2. Classification

@@ -81,7 +81,7 @@

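   The video uses a cancer tumor example, in which each diagnosis is classified as benign or malignant. Spam email filtering is likewise a classification problem in supervised learning.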

   ![](image\20180105_194839.png)
   ![](image/20180105_194839.png)

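 The video mentions the **support vector machine** algorithm, which targets the case where the number of features is so large (features being, as in the cancer example, tumor size, color, smell, and so on) that the computer would run out of memory. **Support vector machines let a computer handle an infinite number of features.**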

@@ -165,7 +165,7 @@ $h_\theta(x)=\theta_0+\theta_1x$ is one possible expression.
 >
 > $\left(x, y\right)$: an instance in the training set
 >
 > $\left(x^\left(i\right), y^\left(i\right)\right)$: the $i$-th sample instance in the training set
 > $\left(x^\left(i\right),y^\left(i\right)\right)$: the $i$-th sample instance in the training set

 ![](image/20180105_224648.png)

@@ -173,9 +173,9 @@ $h_\theta(x)=\theta_0+\theta_1x$ is one possible expression.

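 To find the minimum, we introduce the cost function, which measures the modeling error. Since we want to compute a minimum, we model the sum with a quadratic function, that is, the squared loss function from statistics (least squares):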

 $$J(\theta_0, \theta_1) = \dfrac {1}{2m} \displaystyle \sum _{i=1}^m \left ( \hat{y}_{i}- y_{i} \right)^2 = \dfrac {1}{2m} \displaystyle \sum _{i=1}^m \left (h_\theta (x_{i}) - y_{i} \right)^2$$ 
 $$J(\theta_0,\theta_1)=\dfrac{1}{2m}\displaystyle\sum_{i=1}^m\left(\hat{y}_{i}-y_{i} \right)^2=\dfrac{1}{2m}\displaystyle\sum_{i=1}^m\left(h_\theta(x_{i})-y_{i}\right)^2$$ 

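 > The coefficient $\frac{1}{2}$ does not change which parameters minimize $J$, whether or not it is present. It is included to simplify gradient descent, because the derivative of the square cancels the $\frac{1}{2}$.
 > The coefficient $\frac{1}{2}$ does not change which parameters minimize $J$, whether or not it is present. It is included to simplify gradient descent, because the derivative of the square cancels the $\frac{1}{2}$.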

 At this point, our problem has become **finding the minimum of $J\left( \theta_0, \theta_1  \right)$**.
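
The squared-error cost above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names `hypothesis` and `cost` and the toy data are assumptions made for this example and do not come from the notes or the course.

```python
def hypothesis(theta0, theta1, x):
    """Single-feature linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Hypothetical toy data lying exactly on the line y = 2x (not from the notes).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

perfect_fit = cost(0.0, 2.0, xs, ys)  # every residual is zero
poor_fit = cost(0.0, 1.0, xs, ys)     # residuals are -1, -2, -3
```

With this data, the parameters that reproduce the line exactly give a cost of zero, while any other choice gives a strictly positive cost, which is why minimizing $J$ selects the best-fitting line.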

@@ -202,7 +202,7 @@ $$J(\theta_0, \theta_1) = \dfrac {1}{2m} \displaystyle \sum _{i=1}^m \left ( \ha

 Given the dataset:

 ![](image\20180106_091307.png)
 ![](image/20180106_091307.png)

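 When $\theta_0$ is not fixed at $0$, the cost function $J\left(\theta\right)$ is plotted as a 3-D surface over $\theta_0, \theta_1$, where the height of the surface is the value of the cost function.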

--- a/notes/week2.md
+++ b/notes/week2.md
@@ -1,11 +1,39 @@
 [TOC]

 # 4 Linear Regression with Multiple Variables
 # 4 多变量线性回归(Linear Regression with Multiple Variables)

 ## 4.1 Multiple Features
 ## 4.1 多特征(Multiple Features)

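 An object being measured usually has multiple features along different dimensions. For instance, in the earlier house price prediction example, besides the area of the house there may be other features such as its age and its number of floors: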

 ![](image/20180107_234509.png)

 Since there is no longer just one feature, we introduce some new notation:

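 > $n$: the total number of features
 >
 >  ${x}^{\left( i \right)}$: the $i$-th row of the feature matrix, i.e., the $i$-th training example.
 >
 >  ${x}_{j}^{\left( i \right)}$: the $j$-th feature in the $i$-th row of the feature matrix, i.e., the $j$-th feature of the $i$-th training example.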

 Referring to the figure above, examples of this notation are ${x}^{(2)}=\begin{bmatrix} 1416\\\ 3\\\ 2\\\ 40 \end{bmatrix}, {x}^{(2)}_{1} = 1416$
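
As a rough illustration of this indexing convention, the sketch below stores the feature matrix as a list of rows and looks up $x^{(i)}$ and $x^{(i)}_j$ with the same 1-based indices used in the notation. The helper names `example` and `feature` are hypothetical, and only the second row (1416, 3, 2, 40) is taken from the text; the other rows are made-up placeholders.

```python
# Feature matrix: one row per training example, one column per feature.
# Only the second row comes from the notes; rows 1 and 3 are placeholders.
X = [
    [2000, 4, 1, 30],
    [1416, 3, 2, 40],
    [900, 2, 1, 35],
]


def example(X, i):
    """Return x^(i), the i-th training example, using a 1-based index."""
    return X[i - 1]


def feature(X, i, j):
    """Return x^(i)_j, the j-th feature of the i-th example (both 1-based)."""
    return X[i - 1][j - 1]


n = len(X[0])             # number of features per example
x_2 = example(X, 2)       # the second training example
x_2_1 = feature(X, 2, 1)  # its first feature
```

The `- 1` offsets are needed only because Python lists are 0-based while the mathematical notation counts from 1.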

 The multivariable hypothesis function $h$ is written as: $h_{\theta}\left( x \right)={\theta_{0}}+{\theta_{1}}{x_{1}}+{\theta_{2}}{x_{2}}+\dots+{\theta_{n}}{x_{n}}$

 As in the single-feature case, we treat $\theta_0$ as a base value, for example the base price of a house.

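 The parameter vector has dimension $n+1$. After $x_{0}$ is added to the feature vector, its dimension also becomes $n+1$, so $h$ can be simplified using linear algebra.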

 $h_\theta\left(x\right)=\begin{bmatrix}\theta_0\; \theta_1\; ... \;\theta_n \end{bmatrix}\begin{bmatrix}x_0 \newline x_1 \newline \vdots \newline x_n\end{bmatrix}= \theta^T x$

 > $\theta^T$: the transpose of $\theta$
 >
 > $x_0$: for computational convenience we assume $x_0^{(i)} = 1$
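
The simplification to $\theta^T x$ can be sketched as a dot product after prepending $x_0 = 1$ to the feature vector. The sketch below uses only plain Python; the helper names `add_bias` and `h` and the parameter values in `theta` are assumptions chosen for illustration, while the feature values reuse $x^{(2)}$ from the example above.

```python
def add_bias(x):
    """Prepend x_0 = 1 so the feature vector has n + 1 entries."""
    return [1.0] + list(x)


def h(theta, x):
    """Hypothesis h_theta(x) = theta^T x, with x_0 = 1 added to x."""
    x_aug = add_bias(x)
    if len(theta) != len(x_aug):
        raise ValueError("theta must have n + 1 entries for n features")
    return sum(t * xj for t, xj in zip(theta, x_aug))


# Hypothetical parameters theta_0 .. theta_4 (not from the notes).
theta = [80.0, 0.1, 10.0, 5.0, -1.0]
x = [1416, 3, 2, 40]  # the feature values of x^(2)

prediction = h(theta, x)

# The dot product matches the expanded form theta_0 + theta_1*x_1 + ... + theta_n*x_n.
expanded = theta[0] + sum(theta[j] * x[j - 1] for j in range(1, len(theta)))
```

Adding $x_0 = 1$ is what lets $\theta_0$ be handled like every other parameter, so the whole hypothesis collapses into a single inner product.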

 ## 4.2 Gradient Descent for Multiple Variables



 ## 4.3 Gradient Descent in Practice I - Feature Scaling

 ## 4.4 Gradient Descent in Practice II - Learning Rate
@@ -19,6 +47,9 @@
 ## 4.8 Working on and Submitting Programming Assignments

 # 5 Octave Matlab Tutorial

 When reviewing, simply rewatch the videos at a faster playback speed; note-taking for this chapter is deferred for now.

 ## 5.1 Basic Operations

 ## 5.2 Moving Data Around
@@ -29,4 +60,6 @@

 ## 5.5 Control Statements_ for, while, if statement

 ## 5.6 Vectorization
 ## 5.6 Vectorization

 ## 5.x Summary of Common Functions