- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# PyTorch\n",
- "\n",
- "PyTorch是基于Python的科学计算包,其旨在服务两类场合:\n",
- "* 替代NumPy发挥GPU潜能\n",
- "* 提供了高度灵活性和效率的深度学习平台\n",
- "\n",
- "PyTorch的简洁设计使得它入门很简单,本部分内容在深入介绍PyTorch之前,先介绍一些PyTorch的基础知识,让大家能够对PyTorch有一个大致的了解,并能够用PyTorch搭建一个简单的神经网络,然后在深入学习如何使用PyTorch实现各类网络结构。在学习过程,可能部分内容暂时不太理解,可先不予以深究,后续的课程将会对此进行深入讲解。\n",
- "\n",
- "\n",
- "\n",
- "\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 1. Tensor基本用法\n",
- "\n",
- "张量(Tensor)是一种专门的数据结构,非常类似于数组和矩阵。在PyTorch中,我们使用张量来编码模型的输入和输出,以及模型的参数。\n",
- "\n",
- "张量类似于`NumPy`的`ndarray`,不同之处在于张量可以在GPU或其他硬件加速器上运行。事实上,张量和NumPy数组通常可以共享相同的底层内存,从而消除了复制数据的需要(请参阅使用NumPy的桥接)。张量还针对自动微分进行了优化,在Autograd部分中看到更多关于这一点的内介绍。\n",
- "\n",
- "`variable`是一种可以不断变化的变量,符合反向传播,参数更新的属性。PyTorch的`variable`是一个存放会变化值的内存位置,里面的值会不停变化,像装糖果(糖果就是数据,即tensor)的盒子,糖果的数量不断变化。pytorch都是由tensor计算的,而tensor里面的参数是variable形式。\n",
- "\n",
- "PyTorch基础的数据是张量(Tensor),PyTorch 的很多操作好 NumPy 都是类似的,但是因为其能够在 GPU 上运行,所以有着比 NumPy 快很多倍的速度。本节内容主要包括 PyTorch 中的基本元素 Tensor 和 Variable 及其操作方式。"
- ]
- },
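- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "A side note worth keeping in mind: since PyTorch 0.4 the `Variable` API has been merged into `Tensor`, so a tensor created with `requires_grad=True` plays the role that `Variable` used to. A minimal sketch:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import torch\n",
- "\n",
- "# a tensor that tracks gradients, taking over the role of the old Variable\n",
- "w = torch.randn(3, 2, requires_grad=True)\n",
- "loss = (w * 2).sum()\n",
- "loss.backward() # backpropagation populates w.grad\n",
- "print(w.grad) # gradient of loss w.r.t. w (all 2s here)"
- ]
- },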
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 1.1 Tensor定义与生成"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "import torch\n",
- "import numpy as np"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "# 创建一个 numpy ndarray\n",
- "numpy_tensor = np.random.randn(10, 20)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "可以使用下面两种方式将numpy的ndarray转换到tensor上"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "pytorch_tensor1 = torch.tensor(numpy_tensor)\n",
- "pytorch_tensor2 = torch.from_numpy(numpy_tensor)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "使用以上两种方法进行转换的时候,会直接将 NumPy ndarray 的数据类型转换为对应的 PyTorch Tensor 数据类型"
- ]
- },
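- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The two conversion methods differ in how they handle memory: `torch.from_numpy` shares the underlying memory with the NumPy array, while `torch.tensor` makes a copy. A minimal sketch of the difference:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import numpy as np\n",
- "import torch\n",
- "\n",
- "arr = np.zeros(3)\n",
- "t_copy = torch.tensor(arr) # copies the data\n",
- "t_shared = torch.from_numpy(arr) # shares memory with arr\n",
- "\n",
- "arr[0] = 100\n",
- "print(t_copy) # unchanged: tensor([0., 0., 0.], dtype=torch.float64)\n",
- "print(t_shared) # reflects the change: tensor([100., 0., 0.], dtype=torch.float64)"
- ]
- },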
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "同时也可以使用下面的方法将 `PyTorch Tensor` 转换为 `NumPy ndarray`"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "# 如果 pytorch tensor 在 cpu 上\n",
- "numpy_array = pytorch_tensor1.numpy()\n",
- "\n",
- "# 如果 pytorch tensor 在 gpu 上\n",
- "numpy_array = pytorch_tensor1.cpu().numpy()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "需要注意 GPU 上的 Tensor 不能直接转换为 NumPy ndarray,需要使用`.cpu()`先将 GPU 上的 Tensor 转到 CPU 上"
- ]
- },
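- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "A conversion pattern that works regardless of where the tensor lives is sketched below. `.detach()` is also needed when the tensor requires gradients, because `.numpy()` refuses to operate on tensors that are part of the autograd graph."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import torch\n",
- "\n",
- "t = torch.randn(10, 20)\n",
- "if torch.cuda.is_available():\n",
- "    t = t.cuda()\n",
- "\n",
- "# detach from the autograd graph (if any), move to CPU, then convert\n",
- "numpy_array = t.detach().cpu().numpy()\n",
- "print(numpy_array.shape)"
- ]
- },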
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 1.2 PyTorch Tensor 使用 GPU 加速\n",
- "\n",
- "我们可以使用以下两种方式将 Tensor 放到 GPU 上"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "# 第一种方式是定义 cuda 数据类型\n",
- "dtype = torch.cuda.FloatTensor # 定义默认 GPU 的 数据类型\n",
- "gpu_tensor = torch.randn(10, 20).type(dtype)\n",
- "\n",
- "# 第二种方式更简单,推荐使用\n",
- "gpu_tensor = torch.randn(10, 20).cuda(0) # 将 tensor 放到第一个 GPU 上\n",
- "gpu_tensor = torch.randn(10, 20).cuda(1) # 将 tensor 放到第二个 GPU 上"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "使用第一种方式将 tensor 放到 GPU 上的时候会将数据类型转换成定义的类型,而是用第二种方式能够直接将 tensor 放到 GPU 上,类型跟之前保持一致\n",
- "\n",
- "推荐在定义 tensor 的时候就明确数据类型,然后直接使用第二种方法将 tensor 放到 GPU 上"
- ]
- },
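- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Newer PyTorch versions also offer a device-agnostic way to do this via `torch.device` and `.to()`, which avoids hard-coding GPU indices. A minimal sketch:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import torch\n",
- "\n",
- "device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')\n",
- "\n",
- "# create the tensor directly on the chosen device with an explicit dtype\n",
- "direct_tensor = torch.randn(10, 20, dtype=torch.float32, device=device)\n",
- "\n",
- "# or move an existing tensor; .to() keeps the dtype unless told otherwise\n",
- "moved_tensor = torch.randn(10, 20).to(device)\n",
- "print(moved_tensor.device)"
- ]
- },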
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "而将 tensor 放回 CPU 的操作如下"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "cpu_tensor = gpu_tensor.cpu()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Tensor 属性的访问方式"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([10, 20])\n",
- "torch.Size([10, 20])\n"
- ]
- }
- ],
- "source": [
- "# 可以通过下面两种方式得到 tensor 的大小\n",
- "print(pytorch_tensor1.shape)\n",
- "print(pytorch_tensor1.size())"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.FloatTensor\n",
- "torch.cuda.FloatTensor\n"
- ]
- }
- ],
- "source": [
- "# 得到 tensor 的数据类型\n",
- "print(pytorch_tensor1.type())\n",
- "print(gpu_tensor.type())"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "2\n"
- ]
- }
- ],
- "source": [
- "# 得到 tensor 的维度\n",
- "print(pytorch_tensor1.dim())"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "200\n"
- ]
- }
- ],
- "source": [
- "# 得到 tensor 的所有元素个数\n",
- "print(pytorch_tensor1.numel())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 2. Tensor的操作\n",
- "Tensor 操作中的 API 和 NumPy 非常相似,如果熟悉 NumPy 中的操作,那么 tensor 基本操作是一致的,下面列举其中的一些操作"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 2.1 基本操作"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([[1., 1.],\n",
- " [1., 1.],\n",
- " [1., 1.]])\n"
- ]
- }
- ],
- "source": [
- "x = torch.ones(3, 2)\n",
- "print(x) # 这是一个float tensor"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.FloatTensor\n"
- ]
- }
- ],
- "source": [
- "print(x.type())"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([[1, 1],\n",
- " [1, 1],\n",
- " [1, 1]])\n"
- ]
- }
- ],
- "source": [
- "# 将其转化为整形\n",
- "x = x.long()\n",
- "# x = x.type(torch.LongTensor)\n",
- "print(x)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([[1., 1.],\n",
- " [1., 1.],\n",
- " [1., 1.]])\n"
- ]
- }
- ],
- "source": [
- "# 再将其转回 float\n",
- "x = x.float()\n",
- "# x = x.type(torch.FloatTensor)\n",
- "print(x)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([[-1.2200, 0.9769, -2.3477],\n",
- " [ 1.0125, -1.3236, -0.2626],\n",
- " [-0.3501, 0.5753, 1.5657],\n",
- " [ 0.4823, -0.4008, -1.3442]])\n"
- ]
- }
- ],
- "source": [
- "x = torch.randn(4, 3)\n",
- "print(x)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "# 沿着行取最大值\n",
- "max_value, max_idx = torch.max(x, dim=1)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 19,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "tensor([0.9769, 1.0125, 1.5657, 0.4823])"
- ]
- },
- "execution_count": 19,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# 每一行的最大值\n",
- "max_value"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 20,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "tensor([1, 0, 2, 0])"
- ]
- },
- "execution_count": 20,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# 每一行最大值的下标\n",
- "max_idx"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 21,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([-2.5908, -0.5736, 1.7909, -1.2627])\n"
- ]
- }
- ],
- "source": [
- "# 沿着行对 x 求和\n",
- "sum_x = torch.sum(x, dim=1)\n",
- "print(sum_x)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 22,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([4, 3])\n",
- "torch.Size([1, 4, 3])\n",
- "tensor([[[-1.2200, 0.9769, -2.3477],\n",
- " [ 1.0125, -1.3236, -0.2626],\n",
- " [-0.3501, 0.5753, 1.5657],\n",
- " [ 0.4823, -0.4008, -1.3442]]])\n"
- ]
- }
- ],
- "source": [
- "# 增加维度或者减少维度\n",
- "print(x.shape)\n",
- "x = x.unsqueeze(0) # 在第一维增加\n",
- "print(x.shape)\n",
- "print(x)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 23,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([1, 1, 4, 3])\n"
- ]
- }
- ],
- "source": [
- "x = x.unsqueeze(1) # 在第二维增加\n",
- "print(x.shape)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 24,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([1, 4, 3])\n",
- "tensor([[[-1.2200, 0.9769, -2.3477],\n",
- " [ 1.0125, -1.3236, -0.2626],\n",
- " [-0.3501, 0.5753, 1.5657],\n",
- " [ 0.4823, -0.4008, -1.3442]]])\n"
- ]
- }
- ],
- "source": [
- "x = x.squeeze(0) # 减少第一维\n",
- "print(x.shape)\n",
- "print(x)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 25,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([4, 3])\n"
- ]
- }
- ],
- "source": [
- "x = x.squeeze() # 将 tensor 中所有的一维全部都去掉\n",
- "print(x.shape)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([3, 4, 5])\n",
- "torch.Size([4, 3, 5])\n",
- "torch.Size([5, 3, 4])\n"
- ]
- }
- ],
- "source": [
- "x = torch.randn(3, 4, 5)\n",
- "print(x.shape)\n",
- "\n",
- "# 使用permute和transpose进行维度交换\n",
- "x = x.permute(1, 0, 2) # permute 可以重新排列 tensor 的维度\n",
- "print(x.shape)\n",
- "\n",
- "x = x.transpose(0, 2) # transpose 交换 tensor 中的两个维度\n",
- "print(x.shape)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 27,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([3, 4, 5])\n",
- "torch.Size([12, 5])\n",
- "torch.Size([3, 20])\n"
- ]
- }
- ],
- "source": [
- "# 使用 view 对 tensor 进行 reshape\n",
- "x = torch.randn(3, 4, 5)\n",
- "print(x.shape)\n",
- "\n",
- "x = x.view(-1, 5) # -1 表示任意的大小,5 表示第二维变成 5\n",
- "print(x.shape)\n",
- "\n",
- "x = x.view(3, 20) # 重新 reshape 成 (3, 20) 的大小\n",
- "print(x.shape)"
- ]
- },
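- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "One caveat: `view` requires the tensor's memory to be contiguous, which may not hold after operations such as `permute` or `transpose`. In that case, call `.contiguous()` first, or use `.reshape()`, which copies only when necessary. A minimal sketch:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import torch\n",
- "\n",
- "x = torch.randn(3, 4, 5).permute(1, 0, 2) # no longer contiguous\n",
- "# x.view(-1, 5) would raise a RuntimeError here\n",
- "y = x.contiguous().view(-1, 5) # make the memory contiguous first, then view\n",
- "z = x.reshape(-1, 5) # reshape copies automatically when needed\n",
- "print(y.shape, z.shape)"
- ]
- },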
- {
- "cell_type": "code",
- "execution_count": 32,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([[-3.1321, -0.9734, 0.5307, 0.4975],\n",
- " [ 0.8537, 1.3424, 0.2630, -1.6658],\n",
- " [-1.0088, -2.2100, -1.9233, -0.3059]])\n"
- ]
- }
- ],
- "source": [
- "x = torch.randn(3, 4)\n",
- "y = torch.randn(3, 4)\n",
- "\n",
- "# 两个 tensor 求和\n",
- "z = x + y\n",
- "# z = torch.add(x, y)\n",
- "print(z)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 2.2 `inplace`操作\n",
- "另外,pytorch中大多数的操作都支持 `inplace` 操作,也就是可以直接对 tensor 进行操作而不需要另外开辟内存空间,方式非常简单,一般都是在操作的符号后面加`_`,比如"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 33,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "torch.Size([3, 3])\n",
- "torch.Size([1, 3, 3])\n",
- "torch.Size([3, 1, 3])\n"
- ]
- }
- ],
- "source": [
- "x = torch.ones(3, 3)\n",
- "print(x.shape)\n",
- "\n",
- "# unsqueeze 进行 inplace\n",
- "x.unsqueeze_(0)\n",
- "print(x.shape)\n",
- "\n",
- "# transpose 进行 inplace\n",
- "x.transpose_(1, 0)\n",
- "print(x.shape)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 34,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "tensor([[1., 1., 1.],\n",
- " [1., 1., 1.],\n",
- " [1., 1., 1.]])\n",
- "tensor([[2., 2., 2.],\n",
- " [2., 2., 2.],\n",
- " [2., 2., 2.]])\n"
- ]
- }
- ],
- "source": [
- "x = torch.ones(3, 3)\n",
- "y = torch.ones(3, 3)\n",
- "print(x)\n",
- "\n",
- "# add 进行 inplace\n",
- "x.add_(y)\n",
- "print(x)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 练习题\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "* 查阅[PyTorch的Tensor文档](http://pytorch.org/docs/tensors.html)了解 tensor 的数据类型,创建一个 float64、大小是 3 x 2、随机初始化的 tensor,将其转化为 numpy 的 ndarray,输出其数据类型\n",
- "* 查阅[PyTorch的Tensor文档](http://pytorch.org/docs/tensors.html)了解 tensor 更多的 API,创建一个 float32、4 x 4 的全为1的矩阵,将矩阵正中间 2 x 2 的矩阵,全部修改成2"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 参考\n",
- "* [PyTorch官方说明文档](https://pytorch.org/docs/stable/)\n",
- "* http://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html\n",
- "* http://cs231n.github.io/python-numpy-tutorial/"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.5.4"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }