@@ -898,14 +898,18 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 35, | |||
"execution_count": 1, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"[1, 1, 4, 8, 7, 'name', 1, [5, 4, 2, 8], 5, 4, 2, 8]\n" | |||
"ename": "NameError", | |||
"evalue": "name 'lst' is not defined", | |||
"output_type": "error", | |||
"traceback": [ | |||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", | |||
"\u001b[0;32m<ipython-input-1-c109f115d7f9>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mlst\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0minsert\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'name'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlst\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |||
"\u001b[0;31mNameError\u001b[0m: name 'lst' is not defined" | |||
] | |||
} | |||
], | |||
@@ -406,7 +406,7 @@ | |||
"cell_type": "markdown", | |||
"metadata": {}, | |||
"source": [ | |||
"For larger arrays it is inpractical to initialize the data manually, using explicit python lists. Instead we can use one of the many functions in `numpy` that generate arrays of different forms. Some of the more common are:" | |||
"For larger arrays it is impractical to initialize the data manually, using explicit python lists. Instead we can use one of the many functions in `numpy` that generate arrays of different forms. Some of the more common are:" | |||
] | |||
}, | |||
{ | |||
@@ -4879,7 +4879,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -1,3 +1,3 @@ | |||
0.85031 0.33331 0.64003 | |||
0.52522 0.21573 0.33288 | |||
0.74605 0.35135 0.45873 | |||
0.73172 0.46544 0.72373 | |||
0.32391 0.09679 0.95467 | |||
0.36052 0.78361 0.00717 |
@@ -336,7 +336,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.6.5" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -4,69 +4,67 @@ | |||
"cell_type": "markdown", | |||
"metadata": {}, | |||
"source": [ | |||
"# Softmax & 交叉熵代价函数\n" | |||
"# Softmax & Cross entropy cost function\n" | |||
] | |||
}, | |||
{ | |||
"cell_type": "markdown", | |||
"metadata": {}, | |||
"source": [ | |||
"softmax经常被添加在分类任务的神经网络中的输出层,神经网络的反向传播中关键的步骤就是求导,从这个过程也可以更深刻地理解反向传播的过程,还可以对梯度传播的问题有更多的思考。\n", | |||
"Softmax is often added as an output layer in the neural network for sorting tasks, the key process in the backward propagation is derivation. This process can also provide a deeper understanding of the back propagation process and give more thought to the problem of gradient propagation.\n", | |||
"\n", | |||
"## 1. softmax 函数\n", | |||
"## 1. softmax function\n", | |||
"\n", | |||
"softmax(柔性最大值)函数,一般在神经网络中, softmax可以作为分类任务的输出层。其实可以认为softmax输出的是几个类别选择的概率,比如我有一个分类任务,要分为三个类,softmax函数可以根据它们相对的大小,输出三个类别选取的概率,并且概率和为1。\n", | |||
"Softmax(Flexible maximum) function, usually in neural network, can work as the output layer of classification assignment. Actually we can think of softmax output as the probability of selecting several categories. For example, If I have a classification task that is divided into three classes, the Softmax function can output the probability of the selection of the three classes based on their relative size, and the probability sum is 1.\n", | |||
"\n", | |||
"softmax函数的公式是这种形式:\n", | |||
"The form of softmax function is:\n", | |||
"\n", | |||
"$$\n", | |||
"S_i = \\frac{e^{z_i}}{\\sum_k e^{z_k}}\n", | |||
"$$\n", | |||
"\n", | |||
"* $S_i$是经过softmax的类别概率输出\n", | |||
"* $z_k$是神经元的输出\n", | |||
"* $S_i$ is the class probability output that pass through the softmax\n", | |||
"* $z_k$ is the output of neuron\n", | |||
"\n", | |||
"\n", | |||
"更形象的如下图表示:\n", | |||
"More vivid expression is shown as the following graph:\n", | |||
"\n", | |||
"\n", | |||
"\n", | |||
"softmax直白来说就是将原来输出是$[3,1,-3]$通过softmax函数一作用,就映射成为(0,1)的值,而这些值的累和为1(满足概率的性质),那么我们就可以将它理解成概率,在最后选取输出结点的时候,我们就可以选取概率最大(也就是值对应最大的)结点,作为我们的预测目标!\n", | |||
"\n", | |||
"\n", | |||
"Softmax straightforward is the original output is $[3, 1, 3] $by softmax function role, is mapping the value of (0, 1), and these values are tired and 1 (meet the properties of probability), then we can understand it into probability, in the final selection of the output nodes, we can choose most probability (that is, value corresponding to the largest) node, as we predict the goal.\n", | |||
"softm\n", | |||
"\n", | |||
"首先是神经元的输出,一个神经元如下图:\n", | |||
"First is the output of neuron, the following graph shows a neuron:\n", | |||
"\n", | |||
"\n", | |||
"\n", | |||
"神经元的输出设为:\n", | |||
"we assume that the output of neuron is:\n", | |||
"\n", | |||
"$$\n", | |||
"z_i = \\sum_{j} w_{ij} x_{j} + b\n", | |||
"$$\n", | |||
"\n", | |||
"其中$W_{ij}$是第$i$个神经元的第$j$个权重,$b$是偏置。$z_i$表示该网络的第$i$个输出。\n", | |||
"Among them $W_{ij}$ is the $jth$ weight of $ith$ neuron and $b$ is the bias. $z_i$ represent the $ith$ output of this network.\n", | |||
"\n", | |||
"给这个输出加上一个softmax函数,那就变成了这样:\n", | |||
"Add a softmax function to the outpur we have:\n", | |||
"\n", | |||
"$$\n", | |||
"a_i = \\frac{e^{z_i}}{\\sum_k e^{z_k}}\n", | |||
"$$\n", | |||
"\n", | |||
"$a_i$代表softmax的第$i$个输出值,右侧套用了softmax函数。\n", | |||
"$a_i$ represent the $ith$ output value of softmax, while the right side uses softmax function.\n", | |||
"\n", | |||
"\n", | |||
"### 1.1 损失函数 loss function\n", | |||
"### 1.1 loss function\n", | |||
"\n", | |||
"在神经网络反向传播中,要求一个损失函数,这个损失函数其实表示的是真实值与网络的估计值的误差,知道误差了,才能知道怎样去修改网络中的权重。\n", | |||
"In the propagation of neural networks, we need to calculate a loss function, this loss function is actually the error between the true value and the estimation of network. Only when we get the error, it is possible to know how to change the weight in the network.\n", | |||
"\n", | |||
"损失函数可以有很多形式,这里用的是交叉熵函数,主要是由于这个求导结果比较简单,易于计算,并且交叉熵解决某些损失函数学习缓慢的问题。**[交叉熵函数](https://blog.csdn.net/u014313009/article/details/51043064)**是这样的:\n", | |||
"There are many form of loss function, what we used here is the cross entropy function, it is mainly because that the derivation reasult is quiet easy and convenient to calculate, and cross entropy can solve some lower learning rate problem**[Cross entropy function](https://blog.csdn.net/u014313009/article/details/51043064)**is this:\n", | |||
"\n", | |||
"$$\n", | |||
"C = - \\sum_i y_i ln a_i\n", | |||
"$$\n", | |||
"\n", | |||
"其中$y_i$表示真实的分类结果。\n", | |||
"Among them $y_i$ represent the truly classification result.\n", | |||
"\n" | |||
] | |||
}, | |||
@@ -74,31 +72,31 @@ | |||
"cell_type": "markdown", | |||
"metadata": {}, | |||
"source": [ | |||
"## 2. 推导过程\n", | |||
"## 2. Derive process\n", | |||
"\n", | |||
"首先,我们要明确一下我们要求什么,我们要求的是我们的$loss$对于神经元输出($z_i$)的梯度,即:\n", | |||
"Firstly, we need to make sure what we want, we want to get the gradient of our $loss$ to neuron output($z_i$), which is:\n", | |||
"\n", | |||
"$$\n", | |||
"\\frac{\\partial C}{\\partial z_i}\n", | |||
"$$\n", | |||
"\n", | |||
"根据复合函数求导法则:\n", | |||
"According to the derivation rule of composite function:\n", | |||
"\n", | |||
"$$\n", | |||
"\\frac{\\partial C}{\\partial z_i} = \\frac{\\partial C}{\\partial a_j} \\frac{\\partial a_j}{\\partial z_i}\n", | |||
"$$\n", | |||
"\n", | |||
"有个人可能有疑问了,这里为什么是$a_j$而不是$a_i$,这里要看一下$softmax$的公式了,因为$softmax$公式的特性,它的分母包含了所有神经元的输出,所以,对于不等于i的其他输出里面,也包含着$z_i$,所有的$a$都要纳入到计算范围中,并且后面的计算可以看到需要分为$i = j$和$i \\ne j$两种情况求导。\n", | |||
"Someone may have question, why we have $a_j$ instead of $a_i$. We need to check the formula of $softmax$ here, because of the special characteristcs, its denominatorc contains all the output of neurons. Therefore, for the other output which do not equal to i, it also contains $z_i$, all the $a$ are needed to be included into the calcultaion range and the calcultaion backwards need to be divide into two parts, which is $i = j$ and $i\\ne j$.\n", | |||
"\n", | |||
"### 2.1 针对$a_j$的偏导\n", | |||
"### 2.1 The partial derviation of $a_j$\n", | |||
"\n", | |||
"$$\n", | |||
"\\frac{\\partial C}{\\partial a_j} = \\frac{(\\partial -\\sum_j y_j ln a_j)}{\\partial a_j} = -\\sum_j y_j \\frac{1}{a_j}\n", | |||
"$$\n", | |||
"\n", | |||
"### 2.2 针对$z_i$的偏导\n", | |||
"### 2.2 The partial derviation of $z_i$\n", | |||
"\n", | |||
"如果 $i=j$ :\n", | |||
"If $i=j$ :\n", | |||
"\n", | |||
"\\begin{eqnarray}\n", | |||
"\\frac{\\partial a_i}{\\partial z_i} & = & \\frac{\\partial (\\frac{e^{z_i}}{\\sum_k e^{z_k}})}{\\partial z_i} \\\\\n", | |||
@@ -107,7 +105,7 @@ | |||
" & = & a_i (1 - a_i)\n", | |||
"\\end{eqnarray}\n", | |||
"\n", | |||
"如果 $i \\ne j$:\n", | |||
"IF $i \\ne j$:\n", | |||
"\\begin{eqnarray}\n", | |||
"\\frac{\\partial a_j}{\\partial z_i} & = & \\frac{\\partial (\\frac{e^{z_j}}{\\sum_k e^{z_k}})}{\\partial z_i} \\\\\n", | |||
" & = & \\frac{0 \\cdot \\sum_k e^{z_k} - e^{z_j} \\cdot e^{z_i} }{(\\sum_k e^{z_k})^2} \\\\\n", | |||
@@ -115,12 +113,12 @@ | |||
" & = & -a_j a_i\n", | |||
"\\end{eqnarray}\n", | |||
"\n", | |||
"当u,v都是变量的函数时的导数推导公式:\n", | |||
"When u, v are the dependent variable the derivation formula of derivative:\n", | |||
"$$\n", | |||
"(\\frac{u}{v})' = \\frac{u'v - uv'}{v^2} \n", | |||
"$$\n", | |||
"\n", | |||
"### 2.3 整体的推导\n", | |||
"### 2.3 Derivation of the whole\n", | |||
"\n", | |||
"\\begin{eqnarray}\n", | |||
"\\frac{\\partial C}{\\partial z_i} & = & (-\\sum_j y_j \\frac{1}{a_j} ) \\frac{\\partial a_j}{\\partial z_i} \\\\\n", | |||
@@ -135,8 +133,8 @@ | |||
"cell_type": "markdown", | |||
"metadata": {}, | |||
"source": [ | |||
"## 3. 问题\n", | |||
"如何将本节所讲的softmax,交叉熵代价函数应用到上节所讲的BP方法中?" | |||
"## 3. Question\n", | |||
"How to apply the softmax, cross entropy cost function in this section to the BP method in the previous section?" | |||
] | |||
}, | |||
{ | |||
@@ -168,7 +166,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -18,10 +18,8 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 1, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"execution_count": 5, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"import torch\n", | |||
@@ -30,7 +28,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 2, | |||
"execution_count": 6, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
@@ -47,10 +45,8 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 3, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"execution_count": 7, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"pytorch_tensor1 = torch.Tensor(numpy_tensor)\n", | |||
@@ -80,10 +76,8 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 4, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"execution_count": 8, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"# 如果 pytorch tensor 在 cpu 上\n", | |||
@@ -955,7 +949,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -35,7 +35,7 @@ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"tensor([19.], grad_fn=<AddBackward>)\n" | |||
"tensor([19.], grad_fn=<AddBackward0>)\n" | |||
] | |||
} | |||
], | |||
@@ -92,14 +92,51 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 5, | |||
"execution_count": 8, | |||
"metadata": {}, | |||
"outputs": [], | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"tensor([[-1.5318, 1.5200, -2.1316, -1.3238, 1.0080, -1.0832, -0.2814, -1.0486,\n", | |||
" 1.0807, -2.2865, 0.6545, -0.3595, 0.4229, -0.9194, 0.1690, -0.3241,\n", | |||
" 1.8970, -0.8979, -0.7827, 0.3879],\n", | |||
" [ 0.1404, -0.8016, 0.1156, -0.8397, -1.8886, 1.1072, -1.0186, 0.2249,\n", | |||
" 0.5631, 0.4391, 0.7887, -2.3255, -0.4185, 0.6559, 0.7622, 1.6883,\n", | |||
" -1.4147, 0.2579, -0.6177, 0.2172],\n", | |||
" [-0.4866, -0.0322, -1.2484, 1.1913, -0.6569, 0.0810, 0.2491, -0.1258,\n", | |||
" 2.5903, -0.8370, -0.0554, 1.2174, 0.4059, -1.0759, 0.6649, 0.1642,\n", | |||
" -0.3512, -0.7695, 1.1469, -0.3409],\n", | |||
" [ 1.8789, -1.6553, -0.7401, -0.3198, -0.1010, -0.5512, -0.4792, -0.2891,\n", | |||
" -0.2655, -0.8132, 0.7210, 1.0885, -0.9557, -0.4472, -1.5340, 0.8093,\n", | |||
" 0.9349, 0.8352, -0.0774, -0.1728],\n", | |||
" [-0.3424, 0.1938, -2.4253, -0.0229, 0.3132, -0.7731, 0.8481, -1.3002,\n", | |||
" -0.1595, -0.0364, -1.5733, 0.8882, 0.1909, -0.1404, -1.5673, -1.1809,\n", | |||
" -0.7169, 0.7074, 0.3337, -1.0738],\n", | |||
" [-0.0501, 1.6210, 0.6854, 0.2216, 0.3034, -1.2762, -0.6216, 1.4884,\n", | |||
" 0.6078, 2.1512, -0.7141, 0.4110, -0.8187, 0.9474, -0.5978, -0.2679,\n", | |||
" 1.5315, -2.1550, 2.0969, -1.7669],\n", | |||
" [ 1.4505, -0.9497, 2.0269, -1.6402, -0.0047, -0.2716, -0.2727, 0.6795,\n", | |||
" -0.7367, -0.3248, -0.5312, 0.0887, -1.4303, -0.8390, 1.5324, 0.3761,\n", | |||
" -0.4658, -0.2044, 0.3050, -0.2756],\n", | |||
" [ 0.3265, -0.2513, 1.1441, 0.3805, -1.3629, -1.3120, -1.8571, 0.1180,\n", | |||
" 0.7466, -0.2654, -0.2154, 1.0603, -0.4113, -2.5965, 1.0736, 1.1610,\n", | |||
" 0.8165, 1.5916, 1.5556, 0.3078],\n", | |||
" [-0.4417, 0.1656, -2.1743, -0.1148, -1.2795, 1.0212, -0.7035, -0.8234,\n", | |||
" 0.3010, -1.0891, -1.0676, 0.8385, -0.2886, -1.1881, 0.5097, -0.5097,\n", | |||
" -1.7893, 0.0494, -0.0162, 1.5170],\n", | |||
" [-0.6435, -1.8376, 1.0022, -0.0397, 0.7187, -0.0661, -0.8528, 1.3248,\n", | |||
" -0.2566, -2.2886, 0.8728, -0.7152, 1.6180, 0.8416, 0.2788, 0.5515,\n", | |||
" -0.1266, -1.0025, 0.1767, -0.4987]], requires_grad=True)\n" | |||
] | |||
} | |||
], | |||
"source": [ | |||
"x = Variable(torch.randn(10, 20), requires_grad=True)\n", | |||
"y = Variable(torch.randn(10, 5), requires_grad=True)\n", | |||
"w = Variable(torch.randn(20, 5), requires_grad=True)\n", | |||
"\n", | |||
"print(x)\n", | |||
"out = torch.mean(y - torch.matmul(x, w)) # torch.matmul 是做矩阵乘法\n", | |||
"out.backward()" | |||
] | |||
@@ -113,40 +150,43 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 5, | |||
"execution_count": 9, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"Variable containing:\n", | |||
"\n", | |||
"Columns 0 to 9 \n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"-0.0600 -0.0242 -0.0514 0.0882 0.0056 -0.0400 -0.0300 -0.0052 -0.0289 -0.0172\n", | |||
"\n", | |||
"Columns 10 to 19 \n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"-0.0372 0.0144 -0.1074 -0.0363 -0.0189 0.0209 0.0618 0.0435 -0.0591 0.0103\n", | |||
"[torch.FloatTensor of size 10x20]\n", | |||
"\n" | |||
"tensor([[-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177],\n", | |||
" [-0.0198, -0.0066, -0.0288, 0.0080, 0.0079, -0.0569, -0.0489, 0.0505,\n", | |||
" 0.0132, -0.0072, -0.0024, 0.0400, 0.0691, -0.0273, 0.0124, 0.0104,\n", | |||
" 0.0098, -0.0598, 0.0365, 0.0177]])\n" | |||
] | |||
} | |||
], | |||
@@ -157,27 +197,23 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 6, | |||
"execution_count": 10, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"Variable containing:\n", | |||
"1.00000e-02 *\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
" 2.0000 2.0000 2.0000 2.0000 2.0000\n", | |||
"[torch.FloatTensor of size 10x5]\n", | |||
"\n" | |||
"tensor([[0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200],\n", | |||
" [0.0200, 0.0200, 0.0200, 0.0200, 0.0200]])\n" | |||
] | |||
} | |||
], | |||
@@ -188,36 +224,33 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 7, | |||
"execution_count": 11, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"Variable containing:\n", | |||
" 0.1342 0.1342 0.1342 0.1342 0.1342\n", | |||
" 0.0507 0.0507 0.0507 0.0507 0.0507\n", | |||
" 0.0328 0.0328 0.0328 0.0328 0.0328\n", | |||
"-0.0086 -0.0086 -0.0086 -0.0086 -0.0086\n", | |||
" 0.0734 0.0734 0.0734 0.0734 0.0734\n", | |||
"-0.0042 -0.0042 -0.0042 -0.0042 -0.0042\n", | |||
" 0.0078 0.0078 0.0078 0.0078 0.0078\n", | |||
"-0.0769 -0.0769 -0.0769 -0.0769 -0.0769\n", | |||
" 0.0672 0.0672 0.0672 0.0672 0.0672\n", | |||
" 0.1614 0.1614 0.1614 0.1614 0.1614\n", | |||
"-0.0042 -0.0042 -0.0042 -0.0042 -0.0042\n", | |||
"-0.0970 -0.0970 -0.0970 -0.0970 -0.0970\n", | |||
"-0.0364 -0.0364 -0.0364 -0.0364 -0.0364\n", | |||
"-0.0419 -0.0419 -0.0419 -0.0419 -0.0419\n", | |||
" 0.0134 0.0134 0.0134 0.0134 0.0134\n", | |||
"-0.0251 -0.0251 -0.0251 -0.0251 -0.0251\n", | |||
" 0.0586 0.0586 0.0586 0.0586 0.0586\n", | |||
"-0.0050 -0.0050 -0.0050 -0.0050 -0.0050\n", | |||
" 0.1125 0.1125 0.1125 0.1125 0.1125\n", | |||
"-0.0096 -0.0096 -0.0096 -0.0096 -0.0096\n", | |||
"[torch.FloatTensor of size 20x5]\n", | |||
"\n" | |||
"tensor([[-0.0060, -0.0060, -0.0060, -0.0060, -0.0060],\n", | |||
" [ 0.0405, 0.0405, 0.0405, 0.0405, 0.0405],\n", | |||
" [ 0.0749, 0.0749, 0.0749, 0.0749, 0.0749],\n", | |||
" [ 0.0502, 0.0502, 0.0502, 0.0502, 0.0502],\n", | |||
" [ 0.0590, 0.0590, 0.0590, 0.0590, 0.0590],\n", | |||
" [ 0.0625, 0.0625, 0.0625, 0.0625, 0.0625],\n", | |||
" [ 0.0998, 0.0998, 0.0998, 0.0998, 0.0998],\n", | |||
" [-0.0050, -0.0050, -0.0050, -0.0050, -0.0050],\n", | |||
" [-0.0894, -0.0894, -0.0894, -0.0894, -0.0894],\n", | |||
" [ 0.1070, 0.1070, 0.1070, 0.1070, 0.1070],\n", | |||
" [ 0.0224, 0.0224, 0.0224, 0.0224, 0.0224],\n", | |||
" [-0.0438, -0.0438, -0.0438, -0.0438, -0.0438],\n", | |||
" [ 0.0337, 0.0337, 0.0337, 0.0337, 0.0337],\n", | |||
" [ 0.0952, 0.0952, 0.0952, 0.0952, 0.0952],\n", | |||
" [-0.0258, -0.0258, -0.0258, -0.0258, -0.0258],\n", | |||
" [-0.0494, -0.0494, -0.0494, -0.0494, -0.0494],\n", | |||
" [-0.0063, -0.0063, -0.0063, -0.0063, -0.0063],\n", | |||
" [ 0.0318, 0.0318, 0.0318, 0.0318, 0.0318],\n", | |||
" [-0.0824, -0.0824, -0.0824, -0.0824, -0.0824],\n", | |||
" [ 0.0340, 0.0340, 0.0340, 0.0340, 0.0340]])\n" | |||
] | |||
} | |||
], | |||
@@ -250,21 +283,15 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 8, | |||
"execution_count": 15, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"Variable containing:\n", | |||
" 2 3\n", | |||
"[torch.FloatTensor of size 1x2]\n", | |||
"\n", | |||
"Variable containing:\n", | |||
" 0 0\n", | |||
"[torch.FloatTensor of size 1x2]\n", | |||
"\n" | |||
"tensor([[2., 3.]], requires_grad=True)\n", | |||
"tensor([[0., 0.]])\n" | |||
] | |||
} | |||
], | |||
@@ -277,22 +304,21 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 9, | |||
"execution_count": 16, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"Variable containing:\n", | |||
" 4 27\n", | |||
"[torch.FloatTensor of size 1x2]\n", | |||
"\n" | |||
"tensor(2., grad_fn=<SelectBackward>)\n", | |||
"tensor([[ 4., 27.]], grad_fn=<CopySlices>)\n" | |||
] | |||
} | |||
], | |||
"source": [ | |||
"# 通过 m 中的值计算新的 n 中的值\n", | |||
"print(m[0,0])\n", | |||
"n[0, 0] = m[0, 0] ** 2\n", | |||
"n[0, 1] = m[0, 1] ** 3\n", | |||
"print(n)" | |||
@@ -336,7 +362,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 10, | |||
"execution_count": 17, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
@@ -345,17 +371,14 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 11, | |||
"execution_count": 18, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"Variable containing:\n", | |||
" 4 27\n", | |||
"[torch.FloatTensor of size 1x2]\n", | |||
"\n" | |||
"tensor([[ 4., 27.]])\n" | |||
] | |||
} | |||
], | |||
@@ -541,10 +564,8 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 6, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"execution_count": 38, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"x = Variable(torch.FloatTensor([2, 3]), requires_grad=True)\n", | |||
@@ -556,34 +577,32 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 7, | |||
"execution_count": 39, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"# k.backward(torch.ones_like(k)) \n", | |||
"# print(x.grad)\n", | |||
"# 和上一个的区别在于该算法是求得导数和,并不是分布求解。" | |||
] | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 40, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"Variable containing:\n", | |||
" 13\n", | |||
" 13\n", | |||
"[torch.FloatTensor of size 2]\n", | |||
"\n" | |||
"tensor([13., 13.], grad_fn=<CopySlices>)\n" | |||
] | |||
} | |||
], | |||
"source": [ | |||
"print(k)" | |||
] | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 8, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"j = torch.zeros(2, 2)\n", | |||
"\n", | |||
"k.backward(torch.FloatTensor([1, 0]), retain_graph=True)\n", | |||
"print(k)\n", | |||
"j[0] = x.grad.data\n", | |||
"\n", | |||
"x.grad.data.zero_() # 归零之前求得的梯度\n", | |||
@@ -594,6 +613,23 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 18, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"tensor([13., 13.], grad_fn=<CopySlices>)\n" | |||
] | |||
} | |||
], | |||
"source": [ | |||
"print(k)" | |||
] | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 9, | |||
"metadata": {}, | |||
"outputs": [ | |||
@@ -637,7 +673,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -5,7 +5,7 @@ | |||
"metadata": {}, | |||
"source": [ | |||
"# 动态图和静态图\n", | |||
"目前神经网络框架分为静态图框架和动态图框架,PyTorch 和 TensorFlow、Caffe 等框架最大的区别就是他们拥有不同的计算图表现形式。 TensorFlow 使用静态图,这意味着我们先定义计算图,然后不断使用它,而在 PyTorch 中,每次都会重新构建一个新的计算图。通过这次课程,我们会了解静态图和动态图之间的优缺点。\n", | |||
"目前神经网络框架分为[静态图框架和动态图框架](https://blog.csdn.net/qq_36653505/article/details/87875279),PyTorch 和 TensorFlow、Caffe 等框架最大的区别就是他们拥有不同的计算图表现形式。 TensorFlow 使用静态图,这意味着我们先定义计算图,然后不断使用它,而在 PyTorch 中,每次都会重新构建一个新的计算图。通过这次课程,我们会了解静态图和动态图之间的优缺点。\n", | |||
"\n", | |||
"对于使用者来说,两种形式的计算图有着非常大的区别,同时静态图和动态图都有他们各自的优点,比如动态图比较方便debug,使用者能够用任何他们喜欢的方式进行debug,同时非常直观,而静态图是通过先定义后运行的方式,之后再次运行的时候就不再需要重新构建计算图,所以速度会比动态图更快。" | |||
] | |||
@@ -33,10 +33,8 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 1, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"execution_count": 15, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"# tensorflow\n", | |||
@@ -48,10 +46,8 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 2, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"execution_count": 16, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"def cond(first_counter, second_counter, *args):\n", | |||
@@ -65,7 +61,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 3, | |||
"execution_count": 17, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
@@ -74,27 +70,42 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 4, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"outputs": [], | |||
"execution_count": 21, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"ename": "RuntimeError", | |||
"evalue": "The Session graph is empty. Add operations to the graph before calling run().", | |||
"output_type": "error", | |||
"traceback": [ | |||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |||
"\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", | |||
"\u001b[0;32m<ipython-input-21-430d26a59053>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcompat\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mv1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mSession\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0msess\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mcounter_1_res\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcounter_2_res\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0msess\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrun\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mc1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mc2\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", | |||
"\u001b[0;32m~/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36mrun\u001b[0;34m(self, fetches, feed_dict, options, run_metadata)\u001b[0m\n\u001b[1;32m 956\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 957\u001b[0m result = self._run(None, fetches, feed_dict, options_ptr,\n\u001b[0;32m--> 958\u001b[0;31m run_metadata_ptr)\n\u001b[0m\u001b[1;32m 959\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mrun_metadata\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 960\u001b[0m \u001b[0mproto_data\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf_session\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTF_GetBuffer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrun_metadata_ptr\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |||
"\u001b[0;32m~/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36m_run\u001b[0;34m(self, handle, fetches, feed_dict, options, run_metadata)\u001b[0m\n\u001b[1;32m 1104\u001b[0m \u001b[0;32mraise\u001b[0m \u001b[0mRuntimeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Attempted to use a closed Session.'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1105\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgraph\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mversion\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1106\u001b[0;31m raise RuntimeError('The Session graph is empty. Add operations to the '\n\u001b[0m\u001b[1;32m 1107\u001b[0m 'graph before calling run().')\n\u001b[1;32m 1108\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", | |||
"\u001b[0;31mRuntimeError\u001b[0m: The Session graph is empty. Add operations to the graph before calling run()." | |||
] | |||
} | |||
], | |||
"source": [ | |||
"with tf.Session() as sess:\n", | |||
"with tf.compat.v1.Session() as sess:\n", | |||
" counter_1_res, counter_2_res = sess.run([c1, c2])" | |||
] | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 5, | |||
"execution_count": 19, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"20\n", | |||
"20\n" | |||
"ename": "NameError", | |||
"evalue": "name 'counter_1_res' is not defined", | |||
"output_type": "error", | |||
"traceback": [ | |||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", | |||
"\u001b[0;32m<ipython-input-19-62b1e84b7d43>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcounter_1_res\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcounter_2_res\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |||
"\u001b[0;31mNameError\u001b[0m: name 'counter_1_res' is not defined" | |||
] | |||
} | |||
], | |||
@@ -197,7 +208,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -1546,7 +1546,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -129,7 +129,7 @@ | |||
" h = 0.01\n", | |||
" # Generate a grid of points with distance h between them\n", | |||
" xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))\n", | |||
" # Predict the function value for the whole grid\n", | |||
" # Predict the function value for the whole grid .c_按行连接两个矩阵,左右相加。\n", | |||
" Z = model(np.c_[xx.ravel(), yy.ravel()])\n", | |||
" Z = Z.reshape(xx.shape)\n", | |||
" # Plot the contour and training examples\n", | |||
@@ -261,6 +261,7 @@ | |||
], | |||
"source": [ | |||
"for e in range(100):\n", | |||
" #更新并自动计算\n", | |||
" out = logistic_regression(Variable(x))\n", | |||
" loss = criterion(out, Variable(y))\n", | |||
" optimizer.zero_grad()\n", | |||
@@ -688,7 +688,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -1947,7 +1947,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -561,7 +561,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -35,7 +35,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 3, | |||
"execution_count": 2, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2017-12-22T09:01:51.296457Z", | |||
@@ -63,7 +63,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 6, | |||
"execution_count": 3, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2017-12-22T09:01:51.312500Z", | |||
@@ -92,7 +92,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 7, | |||
"execution_count": 4, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2017-12-22T08:20:40.819497Z", | |||
@@ -106,11 +106,11 @@ | |||
"text": [ | |||
"Sequential(\n", | |||
" (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (1): ReLU(inplace)\n", | |||
" (1): ReLU(inplace=True)\n", | |||
" (2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (3): ReLU(inplace)\n", | |||
" (3): ReLU(inplace=True)\n", | |||
" (4): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (5): ReLU(inplace)\n", | |||
" (5): ReLU(inplace=True)\n", | |||
" (6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", | |||
")\n" | |||
] | |||
@@ -123,7 +123,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 9, | |||
"execution_count": 5, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2017-12-22T07:52:04.632406Z", | |||
@@ -157,7 +157,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 10, | |||
"execution_count": 6, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2017-12-22T09:01:54.497712Z", | |||
@@ -184,7 +184,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 12, | |||
"execution_count": 7, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2017-12-22T09:01:55.149378Z", | |||
@@ -199,43 +199,43 @@ | |||
"Sequential(\n", | |||
" (0): Sequential(\n", | |||
" (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (1): ReLU(inplace)\n", | |||
" (1): ReLU(inplace=True)\n", | |||
" (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (3): ReLU(inplace)\n", | |||
" (3): ReLU(inplace=True)\n", | |||
" (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", | |||
" )\n", | |||
" (1): Sequential(\n", | |||
" (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (1): ReLU(inplace)\n", | |||
" (1): ReLU(inplace=True)\n", | |||
" (2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (3): ReLU(inplace)\n", | |||
" (3): ReLU(inplace=True)\n", | |||
" (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", | |||
" )\n", | |||
" (2): Sequential(\n", | |||
" (0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (1): ReLU(inplace)\n", | |||
" (1): ReLU(inplace=True)\n", | |||
" (2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (3): ReLU(inplace)\n", | |||
" (3): ReLU(inplace=True)\n", | |||
" (4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (5): ReLU(inplace)\n", | |||
" (5): ReLU(inplace=True)\n", | |||
" (6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", | |||
" )\n", | |||
" (3): Sequential(\n", | |||
" (0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (1): ReLU(inplace)\n", | |||
" (1): ReLU(inplace=True)\n", | |||
" (2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (3): ReLU(inplace)\n", | |||
" (3): ReLU(inplace=True)\n", | |||
" (4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (5): ReLU(inplace)\n", | |||
" (5): ReLU(inplace=True)\n", | |||
" (6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", | |||
" )\n", | |||
" (4): Sequential(\n", | |||
" (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (1): ReLU(inplace)\n", | |||
" (1): ReLU(inplace=True)\n", | |||
" (2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (3): ReLU(inplace)\n", | |||
" (3): ReLU(inplace=True)\n", | |||
" (4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", | |||
" (5): ReLU(inplace)\n", | |||
" (5): ReLU(inplace=True)\n", | |||
" (6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", | |||
" )\n", | |||
")\n" | |||
@@ -256,7 +256,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 13, | |||
"execution_count": 8, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2017-12-22T08:52:44.049650Z", | |||
@@ -287,7 +287,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 14, | |||
"execution_count": 9, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2017-12-22T09:01:57.323034Z", | |||
@@ -415,7 +415,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -49,7 +49,7 @@ | |||
"source": [ | |||
"import sys\n", | |||
"sys.path.append('..')\n", | |||
"\n", | |||
" \n", | |||
"import numpy as np\n", | |||
"import torch\n", | |||
"from torch import nn\n", | |||
@@ -377,7 +377,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -373,7 +373,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||
@@ -41,10 +41,8 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 46, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"execution_count": 1, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"import torch\n", | |||
@@ -54,10 +52,8 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 47, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"execution_count": 2, | |||
"metadata": {}, | |||
"outputs": [], | |||
"source": [ | |||
"# 定义一个单步的 rnn\n", | |||
@@ -66,25 +62,29 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 48, | |||
"execution_count": 3, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"data": { | |||
"text/plain": [ | |||
"Parameter containing:\n", | |||
"1.00000e-02 *\n", | |||
" 6.2260 -5.3805 3.5870 ... -2.2162 6.2760 1.6760\n", | |||
"-5.1878 -4.6751 -5.5926 ... -1.8942 0.1589 1.0725\n", | |||
" 3.3236 -3.2726 5.5399 ... 3.3193 0.2117 1.1730\n", | |||
" ... ⋱ ... \n", | |||
" 2.4032 -3.4415 5.1036 ... -2.2035 -0.1900 -6.4016\n", | |||
" 5.2031 -1.5793 -0.0623 ... 0.3424 6.9412 6.3707\n", | |||
"-5.4495 4.5280 2.1774 ... 1.8767 2.4968 5.3403\n", | |||
"[torch.FloatTensor of size 200x200]" | |||
"tensor([[-2.7963e-02, 3.6102e-02, 5.6609e-03, ..., -3.0035e-02,\n", | |||
" 2.7740e-02, 2.3327e-02],\n", | |||
" [-2.8567e-02, -3.2150e-02, -2.6686e-02, ..., -4.6441e-02,\n", | |||
" 3.5804e-02, 9.7260e-05],\n", | |||
" [ 4.6686e-02, -1.5825e-02, 6.7149e-02, ..., 3.3435e-02,\n", | |||
" -2.7623e-02, -6.7693e-02],\n", | |||
" ...,\n", | |||
" [-2.0338e-02, -1.6551e-02, 5.8996e-02, ..., -4.0145e-02,\n", | |||
" -6.9111e-03, -3.2740e-02],\n", | |||
" [-2.4584e-02, 2.3591e-02, 8.3090e-03, ..., -3.6077e-02,\n", | |||
" -6.0432e-03, 5.6279e-02],\n", | |||
" [ 5.6955e-02, -5.1925e-02, 3.1950e-02, ..., -5.6692e-02,\n", | |||
" 6.1773e-02, 1.9715e-02]], requires_grad=True)" | |||
] | |||
}, | |||
"execution_count": 48, | |||
"execution_count": 3, | |||
"metadata": {}, | |||
"output_type": "execute_result" | |||
} | |||
@@ -96,14 +96,58 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 49, | |||
"metadata": { | |||
"collapsed": true | |||
}, | |||
"outputs": [], | |||
"execution_count": 4, | |||
"metadata": {}, | |||
"outputs": [ | |||
{ | |||
"data": { | |||
"text/plain": [ | |||
"tensor([[[ 1.4637, -2.0015, 0.6298, ..., -1.1210, -1.6310, 0.5122],\n", | |||
" [-0.1500, -0.6931, 0.1568, ..., -0.9185, -0.5088, -1.0746],\n", | |||
" [ 0.1717, 1.2186, -0.8093, ..., 0.8630, 0.4601, -1.0218],\n", | |||
" [-0.3034, 2.8634, 2.2470, ..., 0.1678, -2.0585, -0.9628],\n", | |||
" [-2.3764, -0.4235, -1.1760, ..., -1.2251, 0.6761, -1.0323]],\n", | |||
"\n", | |||
" [[-1.3497, -0.6778, -0.0528, ..., -0.1852, -0.3997, -0.7633],\n", | |||
" [ 1.0105, 0.7974, 0.4253, ..., -1.1167, -1.3870, -1.3583],\n", | |||
" [ 0.2785, 0.5013, -0.5881, ..., -0.0283, 0.6044, -0.3249],\n", | |||
" [-1.9298, -0.6575, -1.2878, ..., 0.5636, -0.3266, 1.9391],\n", | |||
" [ 1.3117, -1.1429, -1.5837, ..., -1.5248, -0.2046, 1.0696]],\n", | |||
"\n", | |||
" [[-0.8637, -1.0572, -0.2438, ..., 0.1011, -0.4630, 0.0526],\n", | |||
" [-0.0056, -0.9442, -0.5588, ..., -0.6881, -1.2189, -1.1846],\n", | |||
" [ 0.8341, 0.6924, -0.4376, ..., 1.1331, -0.9766, 1.3822],\n", | |||
" [-0.3815, -1.3457, 0.5320, ..., 0.8280, 0.2146, -0.8704],\n", | |||
" [-0.6424, 1.3608, -0.5325, ..., -0.3414, 1.0094, 1.2650]],\n", | |||
"\n", | |||
" [[-0.1776, -0.2037, -0.7093, ..., -1.1442, -1.0058, -0.6898],\n", | |||
" [ 0.2921, -1.9473, -0.6989, ..., 0.6852, -0.2225, -0.6484],\n", | |||
" [-0.8576, 1.9338, -1.5359, ..., -0.3545, -0.9438, 0.1476],\n", | |||
" [ 2.3669, 0.8673, 2.0521, ..., -0.4679, -0.4050, 0.7761],\n", | |||
" [ 0.3706, 1.2876, -0.5311, ..., 0.4794, -0.4209, 0.5343]],\n", | |||
"\n", | |||
" [[-0.2726, -1.2583, -0.8259, ..., 0.8811, 0.5900, 0.1770],\n", | |||
" [ 1.1066, -0.4899, 0.9143, ..., -2.2898, 0.1525, -2.2099],\n", | |||
" [-1.3824, 0.3142, 1.2140, ..., 0.5470, -0.4883, -0.3204],\n", | |||
" [ 1.8471, 0.6011, 0.0613, ..., 1.1584, -0.8014, 0.4891],\n", | |||
" [ 1.5201, -1.7853, 1.3107, ..., 0.0032, -1.3422, 0.7332]],\n", | |||
"\n", | |||
" [[ 0.3025, -0.7314, -0.2032, ..., -0.9658, -1.8131, 0.5922],\n", | |||
" [-0.0878, 0.0909, 0.7064, ..., 2.4186, -0.0863, 0.0930],\n", | |||
" [-1.4278, -1.0901, 1.6742, ..., 0.3020, -0.6106, -0.4299],\n", | |||
" [-1.8291, -1.1337, -0.2405, ..., -1.2000, 2.0510, 1.3617],\n", | |||
" [-2.7953, -0.0559, 1.0224, ..., 0.4400, 0.9099, -1.5845]]])" | |||
] | |||
}, | |||
"execution_count": 4, | |||
"metadata": {}, | |||
"output_type": "execute_result" | |||
} | |||
], | |||
"source": [ | |||
"# 构造一个序列,长为 6,batch 是 5, 特征是 100\n", | |||
"x = Variable(torch.randn(6, 5, 100)) # 这是 rnn 的输入格式" | |||
"x = Variable(torch.randn(6, 5, 100)) # 这是 rnn 的输入格式\n", | |||
"x" | |||
] | |||
}, | |||
{ | |||
@@ -22,13 +22,12 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 1, | |||
"execution_count": 4, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2018-01-01T10:09:21.223959Z", | |||
"start_time": "2018-01-01T10:09:20.758909Z" | |||
}, | |||
"collapsed": true | |||
} | |||
}, | |||
"outputs": [], | |||
"source": [ | |||
@@ -53,13 +52,12 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 2, | |||
"execution_count": 7, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2018-01-01T10:09:21.368959Z", | |||
"start_time": "2018-01-01T10:09:21.341312Z" | |||
}, | |||
"collapsed": true | |||
} | |||
}, | |||
"outputs": [], | |||
"source": [ | |||
@@ -68,19 +66,18 @@ | |||
" tfs.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]) # 标准化\n", | |||
"])\n", | |||
"\n", | |||
"train_set = MNIST('./mnist', transform=im_tfs)\n", | |||
"train_set = MNIST('./mnist', transform=im_tfs, download=True )\n", | |||
"train_data = DataLoader(train_set, batch_size=128, shuffle=True)" | |||
] | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 3, | |||
"execution_count": 8, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2018-01-01T10:09:23.526707Z", | |||
"start_time": "2018-01-01T10:09:23.489417Z" | |||
}, | |||
"collapsed": true | |||
} | |||
}, | |||
"outputs": [], | |||
"source": [ | |||
@@ -125,7 +122,7 @@ | |||
}, | |||
{ | |||
"cell_type": "code", | |||
"execution_count": 4, | |||
"execution_count": 10, | |||
"metadata": { | |||
"ExecuteTime": { | |||
"end_time": "2018-01-01T10:09:26.677033Z", | |||
@@ -137,7 +134,9 @@ | |||
"name": "stdout", | |||
"output_type": "stream", | |||
"text": [ | |||
"torch.Size([1, 3])\n" | |||
"torch.Size([1, 3])\n", | |||
"tensor([[0.3032, 0.0394, 0.2175]], grad_fn=<AddmmBackward>)\n", | |||
"torch.Size([1, 784])\n" | |||
] | |||
} | |||
], | |||
@@ -145,7 +144,7 @@ | |||
"net = autoencoder()\n", | |||
"x = Variable(torch.randn(1, 28*28)) # batch size 是 1\n", | |||
"code, _ = net(x)\n", | |||
"print(code.shape)" | |||
"print(code.shape)\n" | |||
] | |||
}, | |||
{ | |||
@@ -136,7 +136,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.5.2" | |||
"version": "3.6.8" | |||
} | |||
}, | |||
"nbformat": 4, | |||