@@ -6,22 +6,28 @@ | |||||
"source": [ | "source": [ | ||||
"# Tensor and Variable\n", | "# Tensor and Variable\n", | ||||
"\n", | "\n", | ||||
"PyTorch的简洁设计使得它入门很简单,在深入介绍PyTorch之前,本节将先介绍一些PyTorch的基础知识,使得读者能够对PyTorch有一个大致的了解,并能够用PyTorch搭建一个简单的神经网络。部分内容读者可能暂时不太理解,可先不予以深究,后续的课程将会对此进行深入讲解。\n", | |||||
"\n", | "\n", | ||||
"本节内容参考了PyTorch官方教程[^1]并做了相应的增删修改,使得内容更贴合新版本的PyTorch接口,同时也更适合新手快速入门。另外本书需要读者先掌握基础的Numpy使用,其他相关知识推荐读者参考CS231n的教程[^2]。\n", | |||||
"张量(Tensor)是一种专门的数据结构,非常类似于数组和矩阵。在PyTorch中,我们使用张量来编码模型的输入和输出,以及模型的参数。\n", | |||||
"\n", | "\n", | ||||
"[^1]: http://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html\n", | |||||
"[^2]: http://cs231n.github.io/python-numpy-tutorial/\n", | |||||
"\n" | |||||
"张量类似于`numpy`的`ndarray`,不同之处在于张量可以在GPU或其他硬件加速器上运行。事实上,张量和NumPy数组通常可以共享相同的底层内存,从而消除了复制数据的需要(请参阅使用NumPy的桥接)。张量还针对自动微分进行了优化,在Autograd部分中看到更多关于这一点的内介绍。\n", | |||||
"\n", | |||||
"`variable`是一种可以不断变化的变量,符合反向传播,参数更新的属性。PyTorch的`variable`是一个存放会变化值的内存位置,里面的值会不停变化,像装糖果(糖果就是数据,即tensor)的盒子,糖果的数量不断变化。pytorch都是由tensor计算的,而tensor里面的参数是variable形式。\n" | |||||
] | ] | ||||
}, | }, | ||||
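To make the memory-sharing point above concrete, here is a minimal sketch (assuming current PyTorch, where `torch.from_numpy` and `Tensor.numpy()` share storage with the source array for CPU tensors):

```python
import numpy as np
import torch

# torch.from_numpy creates a tensor that shares memory with the ndarray
a = np.ones(3)
t = torch.from_numpy(a)
a[0] = 100                 # mutate the NumPy array in place...
print(t)                   # tensor([100., 1., 1.], dtype=torch.float64) -- the tensor sees it

# Tensor.numpy() goes the other way and also shares memory (CPU tensors only)
b = t.numpy()
t.add_(1)                  # in-place add on the tensor...
print(b)                   # [101.   2.   2.] -- the ndarray sees it too
```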
{ | { | ||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"## 把 PyTorch 当做 NumPy 用\n", | |||||
"## 1. Tensor基本用法\n", | |||||
"\n", | "\n", | ||||
"PyTorch 的官方介绍是一个拥有强力GPU加速的张量和动态构建网络的库,其主要构件是张量,所以我们可以把 PyTorch 当做 NumPy 来用,PyTorch 的很多操作好 NumPy 都是类似的,但是因为其能够在 GPU 上运行,所以有着比 NumPy 快很多倍的速度。通过本次课程,你能够学会如何像使用 NumPy 一样使用 PyTorch,了解到 PyTorch 中的基本元素 Tensor 和 Variable 及其操作方式。" | |||||
"PyTorch基础的数据是张量,PyTorch 的很多操作好 NumPy 都是类似的,但是因为其能够在 GPU 上运行,所以有着比 NumPy 快很多倍的速度。通过本次课程,能够学会如何像使用 NumPy 一样使用 PyTorch,了解到 PyTorch 中的基本元素 Tensor 和 Variable 及其操作方式。" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"### 1.1 Tensor定义与生成" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
@@ -113,7 +119,7 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"PyTorch Tensor 使用 GPU 加速\n", | |||||
"### 1.2 PyTorch Tensor 使用 GPU 加速\n", | |||||
"\n", | "\n", | ||||
"我们可以使用以下两种方式将 Tensor 放到 GPU 上" | "我们可以使用以下两种方式将 Tensor 放到 GPU 上" | ||||
] | ] | ||||
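As a sketch of two such ways (the notebook's own cells may use slightly different calls; `.cuda()` and the `device=` argument both exist in current PyTorch, and the `torch.cuda.is_available()` guard keeps the snippet runnable on CPU-only machines):

```python
import torch

x = torch.randn(3, 4)

if torch.cuda.is_available():
    # way 1: move an existing tensor onto the GPU
    x_gpu = x.cuda()                          # equivalently: x.to('cuda')

    # way 2: create the tensor directly on the GPU
    y_gpu = torch.randn(3, 4, device='cuda')

    print(x_gpu.device, y_gpu.device)         # cuda:0 cuda:0
```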
@@ -245,7 +251,7 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"**小练习**\n", | |||||
"### 1.3 小练习\n", | |||||
"\n", | "\n", | ||||
"查阅以下[文档](http://pytorch.org/docs/0.3.0/tensors.html)了解 tensor 的数据类型,创建一个 float64、大小是 3 x 2、随机初始化的 tensor,将其转化为 numpy 的 ndarray,输出其数据类型\n", | "查阅以下[文档](http://pytorch.org/docs/0.3.0/tensors.html)了解 tensor 的数据类型,创建一个 float64、大小是 3 x 2、随机初始化的 tensor,将其转化为 numpy 的 ndarray,输出其数据类型\n", | ||||
"\n", | "\n", | ||||
@@ -284,8 +290,15 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"## Tensor的操作\n", | |||||
"Tensor 操作中的 api 和 NumPy 非常相似,如果你熟悉 NumPy 中的操作,那么 tensor 基本是一致的,下面我们来列举其中的一些操作" | |||||
"## 2. Tensor的操作\n", | |||||
"Tensor 操作中的 API 和 NumPy 非常相似,如果你熟悉 NumPy 中的操作,那么 tensor 基本是一致的,下面我们来列举其中的一些操作" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"### 2.1 基本操作" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
@@ -629,7 +642,8 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"另外,pytorch中大多数的操作都支持 inplace 操作,也就是可以直接对 tensor 进行操作而不需要另外开辟内存空间,方式非常简单,一般都是在操作的符号后面加`_`,比如" | |||||
"### 2.2 `inplace`操作\n", | |||||
"另外,pytorch中大多数的操作都支持 `inplace` 操作,也就是可以直接对 tensor 进行操作而不需要另外开辟内存空间,方式非常简单,一般都是在操作的符号后面加`_`,比如" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
@@ -692,9 +706,9 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"**小练习**\n", | |||||
"### 2.3 **小练习**\n", | |||||
"\n", | "\n", | ||||
"访问[文档](http://pytorch.org/docs/0.3.0/tensors.html)了解 tensor 更多的 api,实现下面的要求\n", | |||||
"访问[文档](http://pytorch.org/docs/tensors.html)了解 tensor 更多的 api,实现下面的要求\n", | |||||
"\n", | "\n", | ||||
"创建一个 float32、4 x 4 的全为1的矩阵,将矩阵正中间 2 x 2 的矩阵,全部修改成2\n", | "创建一个 float32、4 x 4 的全为1的矩阵,将矩阵正中间 2 x 2 的矩阵,全部修改成2\n", | ||||
"\n", | "\n", | ||||
@@ -742,28 +756,38 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"## Variable\n", | |||||
"tensor 是 PyTorch 中的完美组件,但是构建神经网络还远远不够,我们需要能够构建计算图的 tensor,这就是 Variable。Variable 是对 tensor 的封装,操作和 tensor 是一样的,但是每个 Variabel都有三个属性,Variable 中的 tensor本身`.data`,对应 tensor 的梯度`.grad`以及这个 Variable 是通过什么方式得到的`.grad_fn`" | |||||
"## 3. Variable\n", | |||||
"tensor 是 PyTorch 中的基础数据类型,但是构建神经网络还远远不够,需要能够构建计算图的 tensor,这就是 Variable。Variable 是对 tensor 的封装,操作和 tensor 是一样的,但是每个 Variabel都有三个属性:\n", | |||||
"* Variable 中的 tensor本身`.data`,\n", | |||||
"* 对应 tensor 的梯度`.grad`\n", | |||||
"* Variable 是通过什么方式得到的`.grad_fn`" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"### 3.1 Variable的基本操作" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 35, | |||||
"execution_count": 4, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [], | "outputs": [], | ||||
"source": [ | "source": [ | ||||
"# 通过下面这种方式导入 Variable\n", | |||||
"import torch\n", | |||||
"from torch.autograd import Variable" | "from torch.autograd import Variable" | ||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 36, | |||||
"execution_count": 5, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [], | "outputs": [], | ||||
"source": [ | "source": [ | ||||
"x_tensor = torch.randn(10, 5)\n", | |||||
"y_tensor = torch.randn(10, 5)\n", | |||||
"x_tensor = torch.randn(3, 4)\n", | |||||
"y_tensor = torch.randn(3, 4)\n", | |||||
"\n", | "\n", | ||||
"# 将 tensor 变成 Variable\n", | "# 将 tensor 变成 Variable\n", | ||||
"x = Variable(x_tensor, requires_grad=True) # 默认 Variable 是不需要求梯度的,所以我们用这个方式申明需要对其进行求梯度\n", | "x = Variable(x_tensor, requires_grad=True) # 默认 Variable 是不需要求梯度的,所以我们用这个方式申明需要对其进行求梯度\n", | ||||
@@ -772,7 +796,7 @@ | |||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 37, | |||||
"execution_count": 6, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [], | "outputs": [], | ||||
"source": [ | "source": [ | ||||
@@ -781,15 +805,15 @@ | |||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 44, | |||||
"execution_count": 7, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [ | "outputs": [ | ||||
{ | { | ||||
"name": "stdout", | "name": "stdout", | ||||
"output_type": "stream", | "output_type": "stream", | ||||
"text": [ | "text": [ | ||||
"tensor(-22.1040)\n", | |||||
"<SumBackward0 object at 0x7f839f4e4a90>\n" | |||||
"tensor(-7.7018)\n", | |||||
"<SumBackward0 object at 0x7f0d79305810>\n" | |||||
] | ] | ||||
} | } | ||||
], | ], | ||||
@@ -807,33 +831,19 @@ | |||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 45, | |||||
"execution_count": 8, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [ | "outputs": [ | ||||
{ | { | ||||
"name": "stdout", | "name": "stdout", | ||||
"output_type": "stream", | "output_type": "stream", | ||||
"text": [ | "text": [ | ||||
"tensor([[2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.]])\n", | |||||
"tensor([[2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.],\n", | |||||
" [2., 2., 2., 2., 2.]])\n" | |||||
"tensor([[1., 1., 1., 1.],\n", | |||||
" [1., 1., 1., 1.],\n", | |||||
" [1., 1., 1., 1.]])\n", | |||||
"tensor([[1., 1., 1., 1.],\n", | |||||
" [1., 1., 1., 1.],\n", | |||||
" [1., 1., 1., 1.]])\n" | |||||
] | ] | ||||
} | } | ||||
], | ], | ||||
@@ -856,7 +866,7 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"**小练习**\n", | |||||
"### 3.2 **小练习**\n", | |||||
"\n", | "\n", | ||||
"尝试构建一个函数 $y = x^2 $,然后求 x=2 的导数。\n", | "尝试构建一个函数 $y = x^2 $,然后求 x=2 的导数。\n", | ||||
"\n", | "\n", | ||||
@@ -931,6 +941,15 @@ | |||||
"source": [ | "source": [ | ||||
"下一次课程我们将会从导数展开,了解 PyTorch 的自动求导机制" | "下一次课程我们将会从导数展开,了解 PyTorch 的自动求导机制" | ||||
] | ] | ||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"## References\n", | |||||
"* http://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html\n", | |||||
"* http://cs231n.github.io/python-numpy-tutorial/" | |||||
] | |||||
} | } | ||||
], | ], | ||||
"metadata": { | "metadata": { | ||||
@@ -10,7 +10,7 @@ | |||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 1, | |||||
"execution_count": 2, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [], | "outputs": [], | ||||
"source": [ | "source": [ | ||||
@@ -22,7 +22,7 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"## 简单情况的自动求导\n", | |||||
"## 1. 简单情况的自动求导\n", | |||||
"下面我们显示一些简单情况的自动求导,\"简单\"体现在计算的结果都是标量,也就是一个数,我们对这个标量进行自动求导。" | "下面我们显示一些简单情况的自动求导,\"简单\"体现在计算的结果都是标量,也就是一个数,我们对这个标量进行自动求导。" | ||||
] | ] | ||||
}, | }, | ||||
@@ -61,7 +61,8 @@ | |||||
"$$\n", | "$$\n", | ||||
"\\frac{\\partial z}{\\partial x} = 2 (x + 2) = 2 (2 + 2) = 8\n", | "\\frac{\\partial z}{\\partial x} = 2 (x + 2) = 2 (2 + 2) = 8\n", | ||||
"$$\n", | "$$\n", | ||||
"如果你对求导不熟悉,可以查看以下[网址进行复习](https://baike.baidu.com/item/%E5%AF%BC%E6%95%B0#1)" | |||||
"\n", | |||||
"如果你对求导不熟悉,可以查看以下[《导数介绍资料》](https://baike.baidu.com/item/%E5%AF%BC%E6%95%B0#1)网址进行复习" | |||||
] | ] | ||||
}, | }, | ||||
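In case you want to re-run the check yourself, here is a minimal sketch reconstructing the computation implied by the formula above, i.e. $z = (x + 2)^2$ evaluated at $x = 2$ (the notebook's own earlier cell may differ slightly):

```python
import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([2]), requires_grad=True)
y = x + 2
z = y ** 2
z.backward()       # dz/dx = 2 * (x + 2)
print(x.grad)      # tensor([8.])
```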
{ | { | ||||
@@ -92,210 +93,106 @@ | |||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 9, | |||||
"execution_count": 25, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [ | "outputs": [ | ||||
{ | { | ||||
"name": "stdout", | "name": "stdout", | ||||
"output_type": "stream", | "output_type": "stream", | ||||
"text": [ | "text": [ | ||||
"tensor([[ 5.7436e-01, -8.5241e-01, 2.2845e+00, 3.6574e-01, 1.4336e+00,\n", | |||||
" 6.2769e-01, -2.4378e-01, 2.3407e+00, 3.8966e-01, 1.1835e+00,\n", | |||||
" -6.4391e-01, 9.1353e-01, -5.8734e-01, -1.9392e+00, 9.3507e-01,\n", | |||||
" 8.8518e-02, 7.2412e-01, -1.0687e+00, -6.7646e-01, 1.2672e+00],\n", | |||||
" [ 7.2998e-01, 2.0229e+00, -5.0831e-01, -6.3940e-01, -8.7033e-01,\n", | |||||
" 2.7687e-01, 6.3498e-01, -1.8736e-03, -8.4395e-01, 1.4696e+00,\n", | |||||
" -1.7850e+00, -4.5297e-01, 9.2144e-01, 8.5070e-02, -5.8926e-01,\n", | |||||
" 1.2085e+00, -9.7894e-01, -3.4309e-01, -2.4711e-02, -6.4475e-01],\n", | |||||
" [-2.8774e-01, 1.2039e+00, -5.2320e-01, 1.3787e-01, 3.9971e-02,\n", | |||||
" -5.6454e-01, -1.5835e+00, -2.0742e-01, -1.4274e+00, -3.7860e-01,\n", | |||||
" 6.2642e-01, 1.6408e+00, -1.1916e-01, 1.4388e-01, -9.5261e-01,\n", | |||||
" 4.0784e-01, 8.1715e-01, 3.9228e-01, 4.1611e-01, -3.3709e-01],\n", | |||||
" [ 3.3040e-01, 1.7915e-01, -5.7069e-02, 1.1144e+00, -1.0322e+00,\n", | |||||
" 9.9129e-01, 1.1692e+00, 7.9638e-01, -1.0943e-01, 8.2714e-01,\n", | |||||
" -1.5700e-01, -5.6686e-01, -1.9550e-01, -1.2263e+00, 1.7836e+00,\n", | |||||
" 9.1989e-01, -6.4577e-01, 9.5402e-01, -8.6525e-01, 3.9199e-01],\n", | |||||
" [-8.8085e-01, -6.3551e-03, 1.6959e+00, -7.5292e-02, -8.8929e-02,\n", | |||||
" 1.0209e+00, 8.9355e-01, -1.2029e+00, 1.9429e+00, -2.7024e-01,\n", | |||||
" -9.1289e-01, -1.3788e+00, -6.2695e-01, -6.5776e-01, 3.3640e-01,\n", | |||||
" -1.0473e-01, 9.9417e-01, 1.0128e+00, 2.4199e+00, 2.8859e-01],\n", | |||||
" [ 8.0469e-02, -1.6585e-01, -4.9862e-01, -5.5413e-01, -4.9307e-01,\n", | |||||
" -7.3808e-01, 1.3946e-02, 5.6282e-01, 9.1096e-01, -1.9281e-01,\n", | |||||
" -3.8546e-01, -1.4070e+00, 7.3520e-01, 1.7412e+00, 1.0770e+00,\n", | |||||
" 1.4837e+00, -7.4241e-01, -4.0977e-01, 1.1057e+00, -7.0222e-01],\n", | |||||
" [-2.3147e-01, -3.7781e-01, 1.0774e+00, -7.9918e-01, 1.8275e+00,\n", | |||||
" 7.6937e-01, -2.7600e-01, 1.0389e+00, 1.4457e+00, -1.2898e+00,\n", | |||||
" 1.2761e-03, 5.5406e-01, 1.8231e+00, -2.3874e-01, 1.2145e+00,\n", | |||||
" -2.1051e+00, -6.6464e-01, -8.5335e-01, -2.6258e-01, 8.0080e-01],\n", | |||||
" [ 4.2173e-01, 1.7040e-01, -3.0126e-01, -5.2095e-01, 5.5845e-01,\n", | |||||
" 5.9780e-01, -6.8320e-01, -5.2203e-01, 4.9485e-01, -8.2392e-01,\n", | |||||
" -1.7584e-01, -1.3862e+00, 1.3604e+00, -7.5567e-01, 3.1400e-01,\n", | |||||
" 1.8617e+00, -1.1887e+00, -3.1732e-01, -1.5062e-01, -1.7251e-01],\n", | |||||
" [ 1.0924e+00, 1.0899e+00, 5.7135e-01, -2.7047e-01, 1.1123e+00,\n", | |||||
" 9.3634e-01, -1.4739e+00, 5.3640e-01, -8.2090e-02, 3.3112e-02,\n", | |||||
" 6.6032e-01, 1.1448e+00, -4.2457e-01, 1.2898e+00, 3.9002e-01,\n", | |||||
" 2.7646e-01, 9.6717e-03, -1.7425e-01, -1.9732e-01, 9.7876e-01],\n", | |||||
" [ 4.4554e-01, 5.3807e-01, -2.2031e-02, 1.3198e+00, -1.1642e+00,\n", | |||||
" -6.6617e-01, -2.6982e-01, -1.0219e+00, 5.8154e-01, 1.7617e+00,\n", | |||||
" 3.3077e-01, 1.5238e+00, -5.8909e-01, 1.1373e+00, 1.0998e+00,\n", | |||||
" -1.8168e+00, -5.0699e-01, 4.0043e-01, -2.3226e+00, 7.2522e-02]],\n", | |||||
" requires_grad=True)\n" | |||||
"tensor([[1., 2.],\n", | |||||
" [3., 4.]], requires_grad=True)\n" | |||||
] | ] | ||||
} | } | ||||
], | ], | ||||
"source": [ | "source": [ | ||||
"# FIXME: the demo need improve\n", | |||||
"x = Variable(torch.randn(10, 20), requires_grad=True)\n", | |||||
"y = Variable(torch.randn(10, 5), requires_grad=True)\n", | |||||
"w = Variable(torch.randn(20, 5), requires_grad=True)\n", | |||||
"print(x)\n", | |||||
"out = torch.mean(y - torch.matmul(x, w)) # torch.matmul 是做矩阵乘法\n", | |||||
"out.backward()" | |||||
"# 定义Variable\n", | |||||
"x = Variable(torch.FloatTensor([1,2]), requires_grad=False)\n", | |||||
"b = Variable(torch.FloatTensor([5,6]), requires_grad=False)\n", | |||||
"w = Variable(torch.FloatTensor([[1,2],[3,4]]), requires_grad=True)\n", | |||||
"print(w)" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "markdown", | |||||
"cell_type": "code", | |||||
"execution_count": 26, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [], | |||||
"source": [ | "source": [ | ||||
"如果你对矩阵乘法不熟悉,可以查看下面的[网址进行复习](https://baike.baidu.com/item/%E7%9F%A9%E9%98%B5%E4%B9%98%E6%B3%95/5446029?fr=aladdin)" | |||||
"z = torch.mean(torch.matmul(w, x) + b) # torch.matmul 是做矩阵乘法\n", | |||||
"z.backward()" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | |||||
"execution_count": 6, | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [ | |||||
{ | |||||
"name": "stdout", | |||||
"output_type": "stream", | |||||
"text": [ | |||||
"tensor([[ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048],\n", | |||||
" [ 0.0034, -0.0301, -0.0040, -0.0488, 0.0187, -0.0139, -0.0374, 0.0102,\n", | |||||
" 0.0337, -0.0249, -0.0777, -0.0868, 0.0132, 0.0042, -0.0627, -0.0448,\n", | |||||
" 0.0221, -0.0324, -0.0601, 0.0048]])\n" | |||||
] | |||||
} | |||||
], | |||||
"source": [ | "source": [ | ||||
"# 得到 x 的梯度\n", | |||||
"print(x.grad)" | |||||
"如果你对矩阵乘法不熟悉,可以查看下面的[网址进行复习](https://baike.baidu.com/item/%E7%9F%A9%E9%98%B5%E4%B9%98%E6%B3%95/5446029?fr=aladdin)" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 7, | |||||
"execution_count": 27, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [ | "outputs": [ | ||||
{ | { | ||||
"name": "stdout", | "name": "stdout", | ||||
"output_type": "stream", | "output_type": "stream", | ||||
"text": [ | "text": [ | ||||
"tensor([[0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200]])\n" | |||||
"tensor([[0.5000, 1.0000],\n", | |||||
" [0.5000, 1.0000]])\n" | |||||
] | ] | ||||
} | } | ||||
], | ], | ||||
"source": [ | "source": [ | ||||
"# 得到 y 的的梯度\n", | |||||
"print(y.grad)" | |||||
"# 得到 w 的梯度\n", | |||||
"print(w.grad)" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | |||||
"execution_count": 8, | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [ | |||||
{ | |||||
"name": "stdout", | |||||
"output_type": "stream", | |||||
"text": [ | |||||
"tensor([[ 0.0172, 0.0172, 0.0172, 0.0172, 0.0172],\n", | |||||
" [ 0.0389, 0.0389, 0.0389, 0.0389, 0.0389],\n", | |||||
" [-0.0748, -0.0748, -0.0748, -0.0748, -0.0748],\n", | |||||
" [-0.0186, -0.0186, -0.0186, -0.0186, -0.0186],\n", | |||||
" [ 0.0278, 0.0278, 0.0278, 0.0278, 0.0278],\n", | |||||
" [-0.0228, -0.0228, -0.0228, -0.0228, -0.0228],\n", | |||||
" [-0.0496, -0.0496, -0.0496, -0.0496, -0.0496],\n", | |||||
" [-0.0084, -0.0084, -0.0084, -0.0084, -0.0084],\n", | |||||
" [ 0.0693, 0.0693, 0.0693, 0.0693, 0.0693],\n", | |||||
" [-0.0821, -0.0821, -0.0821, -0.0821, -0.0821],\n", | |||||
" [ 0.0419, 0.0419, 0.0419, 0.0419, 0.0419],\n", | |||||
" [-0.0126, -0.0126, -0.0126, -0.0126, -0.0126],\n", | |||||
" [ 0.0322, 0.0322, 0.0322, 0.0322, 0.0322],\n", | |||||
" [ 0.0863, 0.0863, 0.0863, 0.0863, 0.0863],\n", | |||||
" [-0.0791, -0.0791, -0.0791, -0.0791, -0.0791],\n", | |||||
" [ 0.0179, 0.0179, 0.0179, 0.0179, 0.0179],\n", | |||||
" [-0.1109, -0.1109, -0.1109, -0.1109, -0.1109],\n", | |||||
" [-0.0188, -0.0188, -0.0188, -0.0188, -0.0188],\n", | |||||
" [-0.0636, -0.0636, -0.0636, -0.0636, -0.0636],\n", | |||||
" [ 0.0223, 0.0223, 0.0223, 0.0223, 0.0223]])\n" | |||||
] | |||||
} | |||||
], | |||||
"source": [ | "source": [ | ||||
"# 得到 w 的梯度\n", | |||||
"print(w.grad)" | |||||
"具体计算的公式为:\n", | |||||
"$$\n", | |||||
"z_1 = w_{11}*x_1 + w_{12}*x_2 + b_1 \\\\\n", | |||||
"z_2 = w_{21}*x_1 + w_{22}*x_2 + b_2 \\\\\n", | |||||
"z = \\frac{1}{2} (z_1 + z_2)\n", | |||||
"$$" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"上面数学公式就更加复杂,矩阵乘法之后对两个矩阵对应元素相乘,然后所有元素求平均,有兴趣的同学可以手动去计算一下梯度,使用 PyTorch 的自动求导,我们能够非常容易得到 x, y 和 w 的导数,因为深度学习中充满大量的矩阵运算,所以我们没有办法手动去求这些导数,有了自动求导能够非常方便地解决网络更新的问题。" | |||||
"则微分计算结果是:\n", | |||||
"$$\n", | |||||
"\\frac{\\partial z}{w_{11}} = \\frac{1}{2} x_1 \\\\\n", | |||||
"\\frac{\\partial z}{w_{12}} = \\frac{1}{2} x_2 \\\\\n", | |||||
"\\frac{\\partial z}{w_{21}} = \\frac{1}{2} x_1 \\\\\n", | |||||
"\\frac{\\partial z}{w_{22}} = \\frac{1}{2} x_2\n", | |||||
"$$" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"\n" | |||||
"上面数学公式就更加复杂,矩阵乘法之后对两个矩阵对应元素相乘,然后所有元素求平均,有兴趣的同学可以手动去计算一下梯度,使用 PyTorch 的自动求导,我们能够非常容易得到 x, y 和 w 的导数,因为深度学习中充满大量的矩阵运算,所以我们没有办法手动去求这些导数,有了自动求导能够非常方便地解决网络更新的问题。" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"## 复杂情况的自动求导\n", | |||||
"上面我们展示了简单情况下的自动求导,都是对标量进行自动求导,可能你会有一个疑问,如何对一个向量或者矩阵自动求导了呢?感兴趣的同学可以自己先去尝试一下,下面我们会介绍对多维数组的自动求导机制。" | |||||
"## 2. 复杂情况的自动求导\n", | |||||
"\n", | |||||
"上面我们展示了简单情况下的自动求导,都是对标量进行自动求导,那么如何对一个向量或者矩阵自动求导?" | |||||
] | ] | ||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 11, | |||||
"execution_count": 28, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [ | "outputs": [ | ||||
{ | { | ||||
@@ -316,7 +213,7 @@ | |||||
}, | }, | ||||
{ | { | ||||
"cell_type": "code", | "cell_type": "code", | ||||
"execution_count": 13, | |||||
"execution_count": 29, | |||||
"metadata": {}, | "metadata": {}, | ||||
"outputs": [ | "outputs": [ | ||||
{ | { | ||||
@@ -423,7 +320,7 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"## 多次自动求导\n", | |||||
"## 3. 多次自动求导\n", | |||||
"通过调用 backward 我们可以进行一次自动求导,如果我们再调用一次 backward,会发现程序报错,没有办法再做一次。这是因为 PyTorch 默认做完一次自动求导之后,计算图就被丢弃了,所以两次自动求导需要手动设置一个东西,我们通过下面的小例子来说明。" | "通过调用 backward 我们可以进行一次自动求导,如果我们再调用一次 backward,会发现程序报错,没有办法再做一次。这是因为 PyTorch 默认做完一次自动求导之后,计算图就被丢弃了,所以两次自动求导需要手动设置一个东西,我们通过下面的小例子来说明。" | ||||
] | ] | ||||
}, | }, | ||||
@@ -516,7 +413,7 @@ | |||||
"cell_type": "markdown", | "cell_type": "markdown", | ||||
"metadata": {}, | "metadata": {}, | ||||
"source": [ | "source": [ | ||||
"**小练习**\n", | |||||
"## 4 练习题\n", | |||||
"\n", | "\n", | ||||
"定义\n", | "定义\n", | ||||
"\n", | "\n", | ||||
@@ -650,13 +547,6 @@ | |||||
"source": [ | "source": [ | ||||
"print(j)" | "print(j)" | ||||
] | ] | ||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"下一次课我们会介绍两种神经网络的编程方式,动态图编程和静态图编程" | |||||
] | |||||
} | } | ||||
], | ], | ||||
"metadata": { | "metadata": { | ||||
@@ -675,7 +565,7 @@ | |||||
"name": "python", | "name": "python", | ||||
"nbconvert_exporter": "python", | "nbconvert_exporter": "python", | ||||
"pygments_lexer": "ipython3", | "pygments_lexer": "ipython3", | ||||
"version": "3.6.9" | |||||
"version": "3.7.9" | |||||
} | } | ||||
}, | }, | ||||
"nbformat": 4, | "nbformat": 4, | ||||
@@ -1,220 +0,0 @@ | |||||
{ | |||||
"cells": [ | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"# 动态图和静态图\n", | |||||
"目前神经网络框架分为[静态图框架和动态图框架](https://blog.csdn.net/qq_36653505/article/details/87875279),PyTorch 和 TensorFlow、Caffe 等框架最大的区别就是他们拥有不同的计算图表现形式。 TensorFlow 使用静态图,这意味着我们先定义计算图,然后不断使用它,而在 PyTorch 中,每次都会重新构建一个新的计算图。通过这次课程,我们会了解静态图和动态图之间的优缺点。\n", | |||||
"\n", | |||||
"对于使用者来说,两种形式的计算图有着非常大的区别,同时静态图和动态图都有他们各自的优点,比如动态图比较方便debug,使用者能够用任何他们喜欢的方式进行debug,同时非常直观,而静态图是通过先定义后运行的方式,之后再次运行的时候就不再需要重新构建计算图,所以速度会比动态图更快。" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"下面我们比较 while 循环语句在 TensorFlow 和 PyTorch 中的定义" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"## TensorFlow" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 1, | |||||
"metadata": {}, | |||||
"outputs": [ | |||||
{ | |||||
"ename": "ModuleNotFoundError", | |||||
"evalue": "No module named 'tensorflow'", | |||||
"output_type": "error", | |||||
"traceback": [ | |||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |||||
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", | |||||
"\u001b[0;32m<ipython-input-1-7d11304356bb>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# tensorflow\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mtensorflow\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mfirst_counter\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconstant\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0msecond_counter\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconstant\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |||||
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'tensorflow'" | |||||
] | |||||
} | |||||
], | |||||
"source": [ | |||||
"# tensorflow\n", | |||||
"import tensorflow as tf\n", | |||||
"\n", | |||||
"first_counter = tf.constant(0)\n", | |||||
"second_counter = tf.constant(10)" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 16, | |||||
"metadata": {}, | |||||
"outputs": [], | |||||
"source": [ | |||||
"def cond(first_counter, second_counter, *args):\n", | |||||
" return first_counter < second_counter\n", | |||||
"\n", | |||||
"def body(first_counter, second_counter):\n", | |||||
" first_counter = tf.add(first_counter, 2)\n", | |||||
" second_counter = tf.add(second_counter, 1)\n", | |||||
" return first_counter, second_counter" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 17, | |||||
"metadata": {}, | |||||
"outputs": [], | |||||
"source": [ | |||||
"c1, c2 = tf.while_loop(cond, body, [first_counter, second_counter])" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 21, | |||||
"metadata": {}, | |||||
"outputs": [ | |||||
{ | |||||
"ename": "RuntimeError", | |||||
"evalue": "The Session graph is empty. Add operations to the graph before calling run().", | |||||
"output_type": "error", | |||||
"traceback": [ | |||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |||||
"\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", | |||||
"\u001b[0;32m<ipython-input-21-430d26a59053>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcompat\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mv1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mSession\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0msess\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mcounter_1_res\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcounter_2_res\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0msess\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrun\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mc1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mc2\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", | |||||
"\u001b[0;32m~/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36mrun\u001b[0;34m(self, fetches, feed_dict, options, run_metadata)\u001b[0m\n\u001b[1;32m 956\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 957\u001b[0m result = self._run(None, fetches, feed_dict, options_ptr,\n\u001b[0;32m--> 958\u001b[0;31m run_metadata_ptr)\n\u001b[0m\u001b[1;32m 959\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mrun_metadata\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 960\u001b[0m \u001b[0mproto_data\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf_session\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTF_GetBuffer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrun_metadata_ptr\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |||||
"\u001b[0;32m~/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36m_run\u001b[0;34m(self, handle, fetches, feed_dict, options, run_metadata)\u001b[0m\n\u001b[1;32m 1104\u001b[0m \u001b[0;32mraise\u001b[0m \u001b[0mRuntimeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Attempted to use a closed Session.'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1105\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgraph\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mversion\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1106\u001b[0;31m raise RuntimeError('The Session graph is empty. Add operations to the '\n\u001b[0m\u001b[1;32m 1107\u001b[0m 'graph before calling run().')\n\u001b[1;32m 1108\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", | |||||
"\u001b[0;31mRuntimeError\u001b[0m: The Session graph is empty. Add operations to the graph before calling run()." | |||||
] | |||||
} | |||||
], | |||||
"source": [ | |||||
"with tf.compat.v1.Session() as sess:\n", | |||||
" counter_1_res, counter_2_res = sess.run([c1, c2])" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 19, | |||||
"metadata": {}, | |||||
"outputs": [ | |||||
{ | |||||
"ename": "NameError", | |||||
"evalue": "name 'counter_1_res' is not defined", | |||||
"output_type": "error", | |||||
"traceback": [ | |||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |||||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", | |||||
"\u001b[0;32m<ipython-input-19-62b1e84b7d43>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcounter_1_res\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcounter_2_res\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |||||
"\u001b[0;31mNameError\u001b[0m: name 'counter_1_res' is not defined" | |||||
] | |||||
} | |||||
], | |||||
"source": [ | |||||
"print(counter_1_res)\n", | |||||
"print(counter_2_res)" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"可以看到 TensorFlow 需要将整个图构建成静态的,换句话说,每次运行的时候图都是一样的,是不能够改变的,所以不能直接使用 Python 的 while 循环语句,需要使用辅助函数 `tf.while_loop` 写成 TensorFlow 内部的形式\n", | |||||
"\n", | |||||
"这是非常反直觉的,学习成本也是比较高的\n", | |||||
"\n", | |||||
"下面我们来看看 PyTorch 的动态图机制,这使得我们能够使用 Python 的 while 写循环,非常方便" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"## PyTorch" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 3, | |||||
"metadata": {}, | |||||
"outputs": [], | |||||
"source": [ | |||||
"# pytorch\n", | |||||
"import torch\n", | |||||
"first_counter = torch.Tensor([0])\n", | |||||
"second_counter = torch.Tensor([10])" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 5, | |||||
"metadata": {}, | |||||
"outputs": [], | |||||
"source": [ | |||||
"while (first_counter < second_counter)[0]:\n", | |||||
" first_counter += 2\n", | |||||
" second_counter += 1" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 6, | |||||
"metadata": {}, | |||||
"outputs": [ | |||||
{ | |||||
"name": "stdout", | |||||
"output_type": "stream", | |||||
"text": [ | |||||
"tensor([20.])\n", | |||||
"tensor([20.])\n" | |||||
] | |||||
} | |||||
], | |||||
"source": [ | |||||
"print(first_counter)\n", | |||||
"print(second_counter)" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"可以看到 PyTorch 的写法跟 Python 的写法是完全一致的,没有任何额外的学习成本\n", | |||||
"\n", | |||||
"上面的例子展示如何使用静态图和动态图构建 while 循环,看起来动态图的方式更加简单且直观,你觉得呢?" | |||||
] | |||||
} | |||||
], | |||||
"metadata": { | |||||
"kernelspec": { | |||||
"display_name": "Python 3", | |||||
"language": "python", | |||||
"name": "python3" | |||||
}, | |||||
"language_info": { | |||||
"codemirror_mode": { | |||||
"name": "ipython", | |||||
"version": 3 | |||||
}, | |||||
"file_extension": ".py", | |||||
"mimetype": "text/x-python", | |||||
"name": "python", | |||||
"nbconvert_exporter": "python", | |||||
"pygments_lexer": "ipython3", | |||||
"version": "3.6.9" | |||||
} | |||||
}, | |||||
"nbformat": 4, | |||||
"nbformat_minor": 2 | |||||
} |
@@ -0,0 +1,100 @@ | |||||
{ | |||||
"cells": [ | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"# 动态图和静态图\n", | |||||
"目前神经网络框架分为[静态图框架和动态图框架](https://blog.csdn.net/qq_36653505/article/details/87875279),PyTorch 和 TensorFlow、Caffe 等框架最大的区别就是他们拥有不同的计算图表现形式。 TensorFlow 使用静态图,这意味着我们先定义计算图,然后不断使用它,而在 PyTorch 中,每次都会重新构建一个新的计算图。通过这次课程,我们会了解静态图和动态图之间的优缺点。\n", | |||||
"\n", | |||||
"对于使用者来说,两种形式的计算图有着非常大的区别,同时静态图和动态图都有他们各自的优点,比如动态图比较方便debug,使用者能够用任何他们喜欢的方式进行debug,同时非常直观,而静态图是通过先定义后运行的方式,之后再次运行的时候就不再需要重新构建计算图,所以速度会比动态图更快。" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"## PyTorch" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 1, | |||||
"metadata": {}, | |||||
"outputs": [], | |||||
"source": [ | |||||
"# pytorch\n", | |||||
"import torch\n", | |||||
"first_counter = torch.Tensor([0])\n", | |||||
"second_counter = torch.Tensor([10])" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 2, | |||||
"metadata": {}, | |||||
"outputs": [], | |||||
"source": [ | |||||
"while (first_counter < second_counter):\n", | |||||
" first_counter += 2\n", | |||||
" second_counter += 1" | |||||
] | |||||
}, | |||||
{ | |||||
"cell_type": "code", | |||||
"execution_count": 3, | |||||
"metadata": {}, | |||||
"outputs": [ | |||||
{ | |||||
"name": "stdout", | |||||
"output_type": "stream", | |||||
"text": [ | |||||
"tensor([20.])\n", | |||||
"tensor([20.])\n" | |||||
] | |||||
} | |||||
], | |||||
"source": [ | |||||
"print(first_counter)\n", | |||||
"print(second_counter)" | |||||
] | |||||
}, | |||||
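For comparison, here is the static-graph version of the same counter loop, adapted from the TensorFlow cells that an earlier revision of this notebook contained (it assumes TensorFlow 1.x graph semantics, reached through `tf.compat.v1` with eager execution disabled):

```python
# tensorflow, static-graph mode
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

first_counter = tf.constant(0)
second_counter = tf.constant(10)

def cond(first_counter, second_counter):
    return first_counter < second_counter

def body(first_counter, second_counter):
    # the loop body must be built from graph ops, not Python statements
    return tf.add(first_counter, 2), tf.add(second_counter, 1)

c1, c2 = tf.while_loop(cond, body, [first_counter, second_counter])

with tf.Session() as sess:
    counter_1_res, counter_2_res = sess.run([c1, c2])

print(counter_1_res, counter_2_res)   # 20 20
```

The loop has to go through the helper `tf.while_loop` because the graph is fixed before it runs, whereas the PyTorch version above simply uses Python's own `while`.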
{ | |||||
"cell_type": "markdown", | |||||
"metadata": {}, | |||||
"source": [ | |||||
"可以看到 PyTorch 的写法跟 Python 的写法是完全一致的,没有任何额外的学习成本\n", | |||||
"\n", | |||||
"上面的例子展示如何使用静态图和动态图构建 while 循环,看起来动态图的方式更加简单且直观,你觉得呢?" | |||||
] | |||||
} | |||||
], | |||||
"metadata": { | |||||
"kernelspec": { | |||||
"display_name": "Python 3", | |||||
"language": "python", | |||||
"name": "python3" | |||||
}, | |||||
"language_info": { | |||||
"codemirror_mode": { | |||||
"name": "ipython", | |||||
"version": 3 | |||||
}, | |||||
"file_extension": ".py", | |||||
"mimetype": "text/x-python", | |||||
"name": "python", | |||||
"nbconvert_exporter": "python", | |||||
"pygments_lexer": "ipython3", | |||||
"version": "3.7.9" | |||||
} | |||||
}, | |||||
"nbformat": 4, | |||||
"nbformat_minor": 2 | |||||
} |
@@ -782,7 +782,7 @@ | |||||
"name": "python", | "name": "python", | ||||
"nbconvert_exporter": "python", | "nbconvert_exporter": "python", | ||||
"pygments_lexer": "ipython3", | "pygments_lexer": "ipython3", | ||||
"version": "3.6.9" | |||||
"version": "3.7.9" | |||||
} | } | ||||
}, | }, | ||||
"nbformat": 4, | "nbformat": 4, | ||||
@@ -1,4 +1,15 @@ | |||||
# PyTorch | |||||
PyTorch is a Python-based scientific computing package that targets two kinds of use cases: |||||
* a replacement for numpy that unlocks the power of GPUs |||||
* a deep learning platform offering high flexibility and speed |||||
PyTorch's concise design makes it easy to get started. Before diving into PyTorch in depth, this part first introduces some PyTorch basics, so that everyone gets a general picture of PyTorch and can build a simple neural network with it, before studying in depth how to implement various network architectures with PyTorch. Some of the content may be hard to grasp at first; don't dwell on it, as later lessons will cover it in depth. |||||
 | |||||
## References | ## References | ||||
@@ -11,7 +11,7 @@ | |||||
## 1. Contents | ## 1. Contents | ||||
1. [课程简介](CourseIntroduction.pdf) | 1. [课程简介](CourseIntroduction.pdf) | ||||
2. [Python](0_python/) | |||||
2. [Python](0_python/README.md) | |||||
- [Install Python](references_tips/InstallPython.md) | - [Install Python](references_tips/InstallPython.md) | ||||
- [ipython & notebook](0_python/0-ipython_notebook.ipynb) | - [ipython & notebook](0_python/0-ipython_notebook.ipynb) | ||||
- [Python Basics](0_python/1_Basics.ipynb) | - [Python Basics](0_python/1_Basics.ipynb) | ||||
@@ -21,43 +21,44 @@ | |||||
- [Control Flow](0_python/5_Control_Flow.ipynb) | - [Control Flow](0_python/5_Control_Flow.ipynb) | ||||
- [Function](0_python/6_Function.ipynb) | - [Function](0_python/6_Function.ipynb) | ||||
- [Class](0_python/7_Class.ipynb) | - [Class](0_python/7_Class.ipynb) | ||||
3. [numpy & matplotlib](1_numpy_matplotlib_scipy_sympy/) | |||||
3. [numpy & matplotlib](1_numpy_matplotlib_scipy_sympy/README.md) | |||||
- [numpy](1_numpy_matplotlib_scipy_sympy/1-numpy_tutorial.ipynb) | - [numpy](1_numpy_matplotlib_scipy_sympy/1-numpy_tutorial.ipynb) | ||||
- [matplotlib](1_numpy_matplotlib_scipy_sympy/2-matplotlib_tutorial.ipynb) | - [matplotlib](1_numpy_matplotlib_scipy_sympy/2-matplotlib_tutorial.ipynb) | ||||
4. [knn](2_knn/knn_classification.ipynb) | |||||
4. [kNN](2_knn/knn_classification.ipynb) | |||||
5. [kMeans](3_kmeans/1-k-means.ipynb) | 5. [kMeans](3_kmeans/1-k-means.ipynb) | ||||
- [kMeans - Image Compression](3_kmeans/2-kmeans-color-vq.ipynb) | |||||
- [Cluster Algorithms](3_kmeans/3-ClusteringAlgorithms.ipynb) | |||||
6. [Logistic Regression](4_logistic_regression/) | 6. [Logistic Regression](4_logistic_regression/) | ||||
- [Least squares](4_logistic_regression/1-Least_squares.ipynb) | - [Least squares](4_logistic_regression/1-Least_squares.ipynb) | ||||
- [Logistic regression](4_logistic_regression/2-Logistic_regression.ipynb) | - [Logistic regression](4_logistic_regression/2-Logistic_regression.ipynb) | ||||
- [PCA and Logistic regression](4_logistic_regression/3-PCA_and_Logistic_Regression.ipynb) | |||||
7. [Neural Network](5_nn/) | 7. [Neural Network](5_nn/) | ||||
- [Perceptron](5_nn/1-Perceptron.ipynb) | - [Perceptron](5_nn/1-Perceptron.ipynb) | ||||
- [Multi-layer Perceptron & BP](5_nn/2-mlp_bp.ipynb) | - [Multi-layer Perceptron & BP](5_nn/2-mlp_bp.ipynb) | ||||
- [Softmax & cross-entroy](5_nn/3-softmax_ce.ipynb) | - [Softmax & cross-entroy](5_nn/3-softmax_ce.ipynb) | ||||
8. [PyTorch](6_pytorch/) | |||||
8. [PyTorch](6_pytorch/README.md) | |||||
- Basic | - Basic | ||||
- [basic/Tensor-and-Variable](6_pytorch/0_basic/1-Tensor-and-Variable.ipynb) | |||||
- [basic/autograd](6_pytorch/0_basic/2-autograd.ipynb) | |||||
- [basic/dynamic-graph](6_pytorch/0_basic/3-dynamic-graph.ipynb) | |||||
- [Tensor and Variable](6_pytorch/0_basic/1-Tensor-and-Variable.ipynb) | |||||
- [autograd](6_pytorch/0_basic/2-autograd.ipynb) | |||||
- NN & Optimization | - NN & Optimization | ||||
- [nn/linear-regression-gradient-descend](6_pytorch/1_NN/linear-regression-gradient-descend.ipynb) | |||||
- [nn/logistic-regression](6_pytorch/1_NN/logistic-regression.ipynb) | |||||
- [nn/nn-sequential-module](6_pytorch/1_NN/nn-sequential-module.ipynb) | |||||
- [nn/bp](6_pytorch/1_NN/bp.ipynb) | |||||
- [nn/deep-nn](6_pytorch/1_NN/deep-nn.ipynb) | |||||
- [nn/param_initialize](6_pytorch/1_NN/param_initialize.ipynb) | |||||
- [optim/sgd](6_pytorch/1_NN/optimizer/sgd.ipynb) | |||||
- [optim/adam](6_pytorch/1_NN/optimizer/adam.ipynb) | |||||
- [nn/linear-regression-gradient-descend](6_pytorch/1_NN/1-linear-regression-gradient-descend.ipynb) | |||||
- [nn/logistic-regression](6_pytorch/1_NN/2-logistic-regression.ipynb) | |||||
- [nn/nn-sequential-module](6_pytorch/1_NN/3-nn-sequential-module.ipynb) | |||||
- [nn/deep-nn](6_pytorch/1_NN/4-deep-nn.ipynb) | |||||
- [nn/param_initialize](6_pytorch/1_NN/5-param_initialize.ipynb) | |||||
- [optim/sgd](6_pytorch/1_NN/optimizer/6_1-sgd.ipynb) | |||||
- [optim/adam](6_pytorch/1_NN/optimizer/6_6-adam.ipynb) | |||||
- CNN | - CNN | ||||
- [CNN simple demo](demo_code/3_CNN_MNIST.py) | - [CNN simple demo](demo_code/3_CNN_MNIST.py) | ||||
- [cnn/basic_conv](6_pytorch/2_CNN/basic_conv.ipynb) | |||||
- [cnn/basic_conv](6_pytorch/2_CNN/1-basic_conv.ipynb) | |||||
- [cnn/mnist (demo code)](./demo_code/3_CNN_MNIST.py) | - [cnn/mnist (demo code)](./demo_code/3_CNN_MNIST.py) | ||||
- [cnn/batch-normalization](6_pytorch/2_CNN/batch-normalization.ipynb) | |||||
- [cnn/regularization](6_pytorch/2_CNN/regularization.ipynb) | |||||
- [cnn/lr-decay](6_pytorch/2_CNN/lr-decay.ipynb) | |||||
- [cnn/vgg](6_pytorch/2_CNN/vgg.ipynb) | |||||
- [cnn/googlenet](6_pytorch/2_CNN/googlenet.ipynb) | |||||
- [cnn/resnet](6_pytorch/2_CNN/resnet.ipynb) | |||||
- [cnn/densenet](6_pytorch/2_CNN/densenet.ipynb) | |||||
- [cnn/batch-normalization](6_pytorch/2_CNN/2-batch-normalization.ipynb) | |||||
- [cnn/lr-decay](6_pytorch/2_CNN/3-lr-decay.ipynb) | |||||
- [cnn/regularization](6_pytorch/2_CNN/4-regularization.ipynb) | |||||
- [cnn/vgg](6_pytorch/2_CNN/6-vgg.ipynb) | |||||
- [cnn/googlenet](6_pytorch/2_CNN/7-googlenet.ipynb) | |||||
- [cnn/resnet](6_pytorch/2_CNN/8-resnet.ipynb) | |||||
- [cnn/densenet](6_pytorch/2_CNN/9-densenet.ipynb) | |||||
- RNN | - RNN | ||||
- [rnn/pytorch-rnn](6_pytorch/3_RNN/pytorch-rnn.ipynb) | - [rnn/pytorch-rnn](6_pytorch/3_RNN/pytorch-rnn.ipynb) | ||||
- [rnn/rnn-for-image](6_pytorch/3_RNN/rnn-for-image.ipynb) | - [rnn/rnn-for-image](6_pytorch/3_RNN/rnn-for-image.ipynb) | ||||