- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# NumPy - a library for multidimensional data arrays"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
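- "NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived classes (such as masked arrays and matrices), and a wide range of routines.\n",
- "* These routines perform fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and more.\n",
- "* As the foundation of numerical computing in Python, NumPy is widely used in data processing, signal processing, machine learning, and other fields."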
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 1. Introduction"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
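- "The `numpy` package (module) is used in almost all numerical computation with Python. It provides high-performance vector, matrix, and higher-dimensional data structures for Python. It is implemented in C and Fortran, so performance is very good when computations are vectorized (formulated in terms of vectors and matrices).\n",
- "\n",
- "To use the `numpy` module, first import it as in the examples below:"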
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {},
- "outputs": [],
- "source": [
- "# What this line does is explained in the Matplotlib section\n",
- "%matplotlib inline\n",
- "import matplotlib.pyplot as plt"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Importing the library this way is not recommended\n",
- "from numpy import *"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [],
- "source": [
- "# This is the recommended way\n",
- "import numpy as np"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "**We recommend the second import style:** `import numpy as np`\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 2. Creating `numpy` arrays"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
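- "There are several ways to initialize new numpy arrays, for example from\n",
- "\n",
- "* a Python list or tuple\n",
- "* functions dedicated to generating numpy arrays, such as `arange`, `linspace`, etc.\n",
- "* data read from files"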
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 2.1 From lists"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "For example, to create new vector and matrix arrays from Python lists we can use the `numpy.array` function.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1, 2, 3, 4]\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "array([1, 2, 3, 4])"
- ]
- },
- "execution_count": 4,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import numpy as np\n",
- "\n",
- "a = [1, 2, 3, 4]\n",
- "print(a)\n",
- "\n",
- "# a vector: the argument to the array function is a Python list\n",
- "v = np.array(a)\n",
- "\n",
- "v"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[[1 2]\n",
- " [3 4]\n",
- " [5 6]]\n",
- "\n",
- "(3, 2)\n"
- ]
- }
- ],
- "source": [
- "# a matrix: the argument to the array function is a nested Python list\n",
- "M = np.array([[1, 2], [3, 4], [5, 6]])\n",
- "\n",
- "print(M)\n",
- "print()\n",
- "print(M.shape)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[[[ 1 2]\n",
- " [ 3 4]\n",
- " [ 5 6]]\n",
- "\n",
- " [[ 3 4]\n",
- " [ 5 6]\n",
- " [ 7 8]]\n",
- "\n",
- " [[ 5 6]\n",
- " [ 7 8]\n",
- " [ 9 10]]\n",
- "\n",
- " [[ 7 8]\n",
- " [ 9 10]\n",
- " [11 12]]]\n",
- "\n",
- "(4, 3, 2)\n"
- ]
- }
- ],
- "source": [
- "M = np.array([[[1, 2], [3, 4], [5, 6]],\n",
- "              [[3, 4], [5, 6], [7, 8]],\n",
- "              [[5, 6], [7, 8], [9, 10]],\n",
- "              [[7, 8], [9, 10], [11, 12]]])\n",
- "print(M)\n",
- "print()\n",
- "print(M.shape)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Both `v` and `M` are of type `ndarray`, which the `numpy` module provides."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(numpy.ndarray, numpy.ndarray)"
- ]
- },
- "execution_count": 7,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "type(v), type(M)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The difference between `v` and `M` is only their shapes. We can get information about the shape of an array from the `ndarray.shape` attribute."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(4,)"
- ]
- },
- "execution_count": 8,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "v.shape"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(4, 3, 2)"
- ]
- },
- "execution_count": 9,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M.shape"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The number of elements in the array is available through the `ndarray.size` attribute:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "24"
- ]
- },
- "execution_count": 10,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M.size"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Equivalently, we can use the functions `numpy.shape` and `numpy.size`:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(4, 3, 2)"
- ]
- },
- "execution_count": 11,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.shape(M)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "24"
- ]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.size(M)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
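- "So far `numpy.ndarray` looks very much like a Python list (or nested list). Why not simply use Python lists for computations instead of creating a new array type?\n",
- "\n",
- "There are several reasons:\n",
- "\n",
- "* Python lists are very general. They can contain any kind of object and are dynamically typed. They do not support mathematical operations such as matrix and dot products, and because of the dynamic typing, implementing such functions for Python lists would not be very efficient.\n",
- "* Numpy arrays are **statically typed** and **homogeneous**. The type of the elements is determined when the array is created.\n",
- "* Numpy arrays are memory efficient.\n",
- "* Because of the static typing, mathematical functions such as multiplication and addition of `numpy` arrays can be implemented in a compiled language (C and Fortran are used) and are therefore fast.\n",
- "\n",
- "Using the `dtype` (data type) attribute of an `ndarray`, we can see what type the data of an array has.\n"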
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "dtype('int64')"
- ]
- },
- "execution_count": 13,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M.dtype"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "If we try to assign a value of an incompatible type to an element of a numpy array, we get an error:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [
- {
- "ename": "ValueError",
- "evalue": "invalid literal for int() with base 10: 'hello'",
- "output_type": "error",
- "traceback": [
- "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
- "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
- "\u001b[0;32m<ipython-input-14-3eecc5e8509b>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mM\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"hello\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
- "\u001b[0;31mValueError\u001b[0m: invalid literal for int() with base 10: 'hello'"
- ]
- }
- ],
- "source": [
- "M[0,0,0] = \"hello\""
- ]
- },
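The error above arises because the string `"hello"` cannot be converted to an integer. As a minimal sketch of how the fixed element type behaves in less obvious cases, the example below assumes two small illustrative arrays (`ints` and `mixed` are names introduced here, not part of the original notebook). A float assigned into an integer array is silently truncated rather than rejected, and mixed integer/float input is promoted to a single common type at creation time.

```python
import numpy as np

ints = np.array([1, 2, 3])      # integer dtype inferred from the values
ints[0] = 2.7                   # no error: the float is truncated to 2

mixed = np.array([1, 2.5, 3])   # mixed int/float input is upcast to float64

print(ints)         # the first element is stored as 2
print(mixed.dtype)  # float64
```

Because silent truncation can hide mistakes, it is usually safer to choose the `dtype` explicitly when the intended element type matters.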
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "If we want, we can explicitly define the data type of the array we create using the `dtype` keyword argument:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1.+0.j, 2.+0.j],\n",
- " [3.+0.j, 4.+0.j]])"
- ]
- },
- "execution_count": 15,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M = np.array([[1, 2], [3, 4]], dtype=complex)\n",
- "\n",
- "M"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
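- "Common data types that can be used with `dtype` are: `int`, `float`, `complex`, `bool`, `object`, etc.\n",
- "\n",
- "We can also explicitly define the bit size of the data type, for example: `int64`, `int16`, `float128` (not available on every platform), `complex128`."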
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 2.2 Using array-generating functions"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
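- "For larger arrays it is impractical to initialize the data manually using explicit Python lists. Instead we can use one of the many functions in `numpy` that generate arrays of different forms. Some of the more common ones are:"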
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### arange"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[0 1 2 3 4 5 6 7 8 9]\n",
- "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n"
- ]
- }
- ],
- "source": [
- "# create a range\n",
- "\n",
- "x = np.arange(0, 10, 1)  # arguments: start, stop, step\n",
- "y = range(0, 10, 1)\n",
- "print(x)\n",
- "print(list(y))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([-1.00000000e+00, -9.00000000e-01, -8.00000000e-01, -7.00000000e-01,\n",
- " -6.00000000e-01, -5.00000000e-01, -4.00000000e-01, -3.00000000e-01,\n",
- " -2.00000000e-01, -1.00000000e-01, -2.22044605e-16, 1.00000000e-01,\n",
- " 2.00000000e-01, 3.00000000e-01, 4.00000000e-01, 5.00000000e-01,\n",
- " 6.00000000e-01, 7.00000000e-01, 8.00000000e-01, 9.00000000e-01])"
- ]
- },
- "execution_count": 17,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "x = np.arange(-1, 1, 0.1)\n",
- "\n",
- "x"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### linspace and logspace"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 0. , 2.5, 5. , 7.5, 10. ])"
- ]
- },
- "execution_count": 18,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# with linspace, both end points are included\n",
- "np.linspace(0, 10, 5)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 19,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([1.00000000e+00, 3.03773178e+00, 9.22781435e+00, 2.80316249e+01,\n",
- " 8.51525577e+01, 2.58670631e+02, 7.85771994e+02, 2.38696456e+03,\n",
- " 7.25095809e+03, 2.20264658e+04])"
- ]
- },
- "execution_count": 19,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.logspace(0, 10, 10, base=np.e)"
- ]
- },
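The relationship between `linspace` and `logspace` can be checked directly: `logspace(start, stop, num, base=b)` produces the values `b ** linspace(start, stop, num)`. The sketch below illustrates this under the same arguments as the cell above; the variable names `exponents`, `by_hand`, and `direct` are introduced here for illustration only.

```python
import numpy as np

# Evenly spaced exponents from 0 to 10 (both end points included)
exponents = np.linspace(0, 10, 10)

# Raising the base to those exponents by hand ...
by_hand = np.e ** exponents

# ... should agree with what logspace returns for the same arguments
direct = np.logspace(0, 10, 10, base=np.e)

print(np.allclose(by_hand, direct))
```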
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### mgrid"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 20,
- "metadata": {},
- "outputs": [],
- "source": [
- "y, x = np.mgrid[0:5, 0:5]  # similar to meshgrid in MATLAB"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 21,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0, 1, 2, 3, 4],\n",
- " [0, 1, 2, 3, 4],\n",
- " [0, 1, 2, 3, 4],\n",
- " [0, 1, 2, 3, 4],\n",
- " [0, 1, 2, 3, 4]])"
- ]
- },
- "execution_count": 21,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "x"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 22,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0, 0, 0, 0, 0],\n",
- " [1, 1, 1, 1, 1],\n",
- " [2, 2, 2, 2, 2],\n",
- " [3, 3, 3, 3, 3],\n",
- " [4, 4, 4, 4, 4]])"
- ]
- },
- "execution_count": 22,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "y"
- ]
- },
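NumPy also provides a `meshgrid` function of its own. As a small sketch, assuming the same 5 by 5 grid as above, the example below compares the two: with its default `indexing='xy'`, `np.meshgrid` returns the column-varying array first, so its outputs line up with `x` and `y` from `mgrid` in that order. The names `xx` and `yy` are introduced here for illustration.

```python
import numpy as np

# mgrid returns the row-varying array first, then the column-varying one
y, x = np.mgrid[0:5, 0:5]

# meshgrid (default indexing='xy') returns the column-varying array first
xx, yy = np.meshgrid(np.arange(5), np.arange(5))

print(np.array_equal(x, xx), np.array_equal(y, yy))
```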
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### random data"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 23,
- "metadata": {},
- "outputs": [],
- "source": [
- "from numpy import random"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 24,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[[0.57397454, 0.12434228],\n",
- " [0.74835474, 0.01034541],\n",
- " [0.91383579, 0.02807574],\n",
- " [0.14217509, 0.64698341]],\n",
- "\n",
- " [[0.65606545, 0.84787378],\n",
- " [0.31064031, 0.70205451],\n",
- " [0.30486756, 0.34702889],\n",
- " [0.47537986, 0.91154076]],\n",
- "\n",
- " [[0.32192343, 0.77700745],\n",
- " [0.80485914, 0.85919158],\n",
- " [0.29751565, 0.27228179],\n",
- " [0.57796668, 0.18255467]],\n",
- "\n",
- " [[0.50020698, 0.58134695],\n",
- " [0.14200095, 0.97556272],\n",
- " [0.32948647, 0.35170435],\n",
- " [0.27768833, 0.75059373]],\n",
- "\n",
- " [[0.23972627, 0.08461662],\n",
- " [0.1929383 , 0.80565903],\n",
- " [0.2627892 , 0.73361884],\n",
- " [0.18415944, 0.44976198]]])"
- ]
- },
- "execution_count": 24,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# uniform random numbers in [0,1)\n",
- "random.rand(5,4,2)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 25,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[-1.74300737, 1.94689131, 0.18922227, -0.20440928],\n",
- " [ 1.31664152, -0.01176745, -0.43956951, 0.53571291],\n",
- " [ 0.02140654, -0.09635041, -1.84205831, 0.64951045],\n",
- " [ 0.35682903, 0.96657395, -0.50099255, -0.80044681]])"
- ]
- },
- "execution_count": 25,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# standard normally distributed random numbers\n",
- "random.randn(4,4)"
- ]
- },
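The random values shown above change on every run. When reproducible output is wanted, one option is NumPy's `Generator` interface, created with `np.random.default_rng`. The sketch below is an alternative to the `random.rand` and `random.randn` calls used in this notebook, not a replacement the original author prescribes; the seed value 42 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # the seed is an arbitrary illustrative value

uniform = rng.random((5, 4, 2))        # uniform samples in [0, 1)
normal = rng.standard_normal((4, 4))   # standard normal samples

# A fresh generator with the same seed reproduces the same numbers
repeat = np.random.default_rng(seed=42).random((5, 4, 2))

print(np.array_equal(uniform, repeat))
```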
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### diag"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 0, 0],\n",
- " [0, 2, 0],\n",
- " [0, 0, 3]])"
- ]
- },
- "execution_count": 26,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# a diagonal matrix\n",
- "np.diag([1,2,3])"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 27,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0, 0, 0, 0],\n",
- " [1, 0, 0, 0],\n",
- " [0, 2, 0, 0],\n",
- " [0, 0, 3, 0]])"
- ]
- },
- "execution_count": 27,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# diagonal with an offset from the main diagonal\n",
- "np.diag([1,2,3], k=-1) "
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### zeros and ones"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 28,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0., 0., 0.],\n",
- " [0., 0., 0.],\n",
- " [0., 0., 0.]])"
- ]
- },
- "execution_count": 28,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.zeros((3,3))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 29,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1., 1., 1.],\n",
- " [1., 1., 1.],\n",
- " [1., 1., 1.]])"
- ]
- },
- "execution_count": 29,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.ones((3,3))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 3. File I/O"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 3.1 Comma-separated values (CSV)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
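- "A very common file format for data files is comma-separated values (CSV), or related formats such as TSV (tab-separated values). To read data from such files into Numpy arrays we can use the `numpy.genfromtxt` function. For example:"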
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 30,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "1800 1 1 -6.1 -6.1 -6.1 1\r\n",
- "1800 1 2 -15.4 -15.4 -15.4 1\r\n",
- "1800 1 3 -15.0 -15.0 -15.0 1\r\n",
- "1800 1 4 -19.3 -19.3 -19.3 1\r\n",
- "1800 1 5 -16.8 -16.8 -16.8 1\r\n",
- "1800 1 6 -11.4 -11.4 -11.4 1\r\n",
- "1800 1 7 -7.6 -7.6 -7.6 1\r\n",
- "1800 1 8 -7.1 -7.1 -7.1 1\r\n",
- "1800 1 9 -10.1 -10.1 -10.1 1\r\n",
- "1800 1 10 -9.5 -9.5 -9.5 1\r\n"
- ]
- }
- ],
- "source": [
- "!head stockholm_td_adj.dat"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 31,
- "metadata": {},
- "outputs": [],
- "source": [
- "import numpy as np\n",
- "\n",
- "data = np.genfromtxt('stockholm_td_adj.dat')"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 32,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(77431, 7)"
- ]
- },
- "execution_count": 32,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "data.shape"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 33,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "<Figure size 1008x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "%matplotlib inline\n",
- "import matplotlib.pyplot as plt\n",
- "\n",
- "fig, ax = plt.subplots(figsize=(14,4))\n",
- "ax.plot(data[:,0]+data[:,1]/12.0+data[:,2]/365, data[:,5])\n",
- "ax.axis('tight')\n",
- "ax.set_title('temperatures in Stockholm')\n",
- "ax.set_xlabel('year')\n",
- "ax.set_ylabel('temperature (C)');"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
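- "Using `numpy.savetxt` we can store a Numpy array in a delimited text file (the default delimiter is a space; pass `delimiter=','` for comma-separated output):"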
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 34,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0.34743109, 0.34666094, 0.67796236],\n",
- " [0.37775535, 0.7452935 , 0.44639271],\n",
- " [0.7097024 , 0.54721637, 0.96400871]])"
- ]
- },
- "execution_count": 34,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M = np.random.rand(3,3)\n",
- "\n",
- "M"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 35,
- "metadata": {},
- "outputs": [],
- "source": [
- "np.savetxt(\"random-matrix.csv\", M)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 36,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "3.474310879390657414e-01 3.466609365910759966e-01 6.779623624489031775e-01\r\n",
- "3.777553531256817587e-01 7.452935047749419395e-01 4.463927097637667707e-01\r\n",
- "7.097023968559375007e-01 5.472163711854115542e-01 9.640087120207403437e-01\r\n"
- ]
- }
- ],
- "source": [
- "!cat random-matrix.csv"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 37,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "0.34743 0.34666 0.67796\r\n",
- "0.37776 0.74529 0.44639\r\n",
- "0.70970 0.54722 0.96401\r\n"
- ]
- }
- ],
- "source": [
- "np.savetxt(\"random-matrix.csv\", M, fmt='%.5f')  # fmt specifies the format\n",
- "\n",
- "!cat random-matrix.csv"
- ]
- },
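The files written above are space-separated. As a minimal sketch of a true comma-separated round trip, the example below passes `delimiter=","` to both `np.savetxt` and `np.genfromtxt`. The file name `matrix.csv`, the temporary directory, and the small matrix are assumptions made for this illustration.

```python
import os
import tempfile

import numpy as np

M = np.arange(6, dtype=float).reshape(2, 3) / 4   # small illustrative matrix

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matrix.csv")        # hypothetical file name
    # Write comma-separated values with five decimal places
    np.savetxt(path, M, delimiter=",", fmt="%.5f")
    with open(path) as f:
        text = f.read()
    # The reader must be told about the delimiter as well
    loaded = np.genfromtxt(path, delimiter=",")

print(text)
print(np.allclose(M, loaded))
```

Both functions need the same delimiter; with the default whitespace splitting, `genfromtxt` would not separate the comma-joined fields.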
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 3.2 Numpy's native file format"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "This format is useful for storing Numpy arrays and reading them back. Use the functions `numpy.save` and `numpy.load`:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 38,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "random-matrix.npy: NumPy array, version 1.0, header length 118\r\n"
- ]
- }
- ],
- "source": [
- "np.save(\"random-matrix.npy\", M)\n",
- "\n",
- "!file random-matrix.npy"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 39,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0.34743109, 0.34666094, 0.67796236],\n",
- " [0.37775535, 0.7452935 , 0.44639271],\n",
- " [0.7097024 , 0.54721637, 0.96400871]])"
- ]
- },
- "execution_count": 39,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.load(\"random-matrix.npy\")"
- ]
- },
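One practical difference between the two storage routes is precision. The sketch below, under assumed file names and an arbitrary seed, saves the same matrix with `np.save` and with `np.savetxt(..., fmt="%.5f")`: the binary round trip returns the values and dtype unchanged, while the text round trip is only as accurate as the chosen format.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, for repeatability only
M = rng.random((3, 3))

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "m.npy")   # hypothetical file names
    txt_path = os.path.join(tmp, "m.txt")
    np.save(npy_path, M)                     # binary .npy format
    np.savetxt(txt_path, M, fmt="%.5f")      # text, five decimal places
    from_npy = np.load(npy_path)
    from_txt = np.genfromtxt(txt_path)

print(np.array_equal(M, from_npy))     # the binary round trip is exact
print(np.abs(M - from_txt).max())      # the text round trip carries rounding error
```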
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 4. More properties of Numpy arrays"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 40,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "int64\n",
- "8\n"
- ]
- }
- ],
- "source": [
- "M = np.array([[1, 2], [3, 4], [5, 6]])\n",
- "\n",
- "print(M.dtype)\n",
- "print(M.itemsize)  # bytes per element\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 41,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "48"
- ]
- },
- "execution_count": 41,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M.nbytes  # number of bytes"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 42,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "2"
- ]
- },
- "execution_count": 42,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M.ndim  # number of dimensions"
- ]
- },
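These properties are related: `nbytes` equals `size * itemsize`. The short sketch below checks this and shows how a smaller element type reduces the memory used. The dtype is set explicitly because the default integer type can differ between platforms; the names `M64` and `M16` are introduced here for illustration.

```python
import numpy as np

M64 = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int64)
M16 = M64.astype(np.int16)   # converted copy with 2-byte elements

print(M64.nbytes, M64.size * M64.itemsize)  # 48 48
print(M16.nbytes)                           # 12
```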
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 5. Manipulating arrays"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 5.1 Indexing"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can index elements in an array using square brackets and indices:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 43,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "1"
- ]
- },
- "execution_count": 43,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "v = np.array([1, 2, 3, 4, 5])\n",
- "\n",
- "# v is a vector with only one dimension, so it takes a single index\n",
- "v[0]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 44,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "4\n",
- "4\n",
- "[3 4]\n"
- ]
- }
- ],
- "source": [
- "# M is a matrix, or a 2-dimensional array, so it takes two indices\n",
- "print(M[1,1])\n",
- "print(M[1][1])\n",
- "print(M[1])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "If we omit an index of a multidimensional array, it returns the whole row (or, in general, an N-1 dimensional array):"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 45,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2],\n",
- " [3, 4],\n",
- " [5, 6]])"
- ]
- },
- "execution_count": 45,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 46,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([3, 4])"
- ]
- },
- "execution_count": 46,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M[1]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The same thing can be achieved by using `:` instead of an index:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 47,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([3, 4])"
- ]
- },
- "execution_count": 47,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M[1,:]  # row 1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 48,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([2, 4, 6])"
- ]
- },
- "execution_count": 48,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M[:,1]  # column 1"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can assign new values to elements in an array using indexing:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 49,
- "metadata": {},
- "outputs": [],
- "source": [
- "M[0,0] = 1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 50,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2],\n",
- " [3, 4],\n",
- " [5, 6]])"
- ]
- },
- "execution_count": 50,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 51,
- "metadata": {},
- "outputs": [],
- "source": [
- "# this also works for rows and columns\n",
- "M[1,:] = 0\n",
- "M[:,1] = -1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 52,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[ 1, -1],\n",
- " [ 0, -1],\n",
- " [ 5, -1]])"
- ]
- },
- "execution_count": 52,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 5.2 Index slicing"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Index slicing is the technical name for the syntax `M[lower:upper:step]`, which extracts part of an array:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 53,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([1, 2, 3, 4, 5])"
- ]
- },
- "execution_count": 53,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A = np.array([1,2,3,4,5])\n",
- "A"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 54,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([2, 3])"
- ]
- },
- "execution_count": 54,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[1:3]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Array slices are *mutable*: if they are assigned a new value, the original array from which the slice was extracted is modified:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 55,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 1, -2, -3, 4, 5])"
- ]
- },
- "execution_count": 55,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[1:3] = [-2,-3] # auto convert type\n",
- "A[1:3] = np.array([-2, -3]) \n",
- "\n",
- "A"
- ]
- },
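The same behaviour applies when a slice is first stored in a variable: basic slicing returns a view onto the original data, not a copy. The sketch below illustrates this and shows `copy()` as one way to obtain independent data; the names `view` and `independent` are introduced here for illustration.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])

view = A[1:3]                 # basic slicing returns a view onto A's data
view[:] = 0                   # writing through the view changes A

independent = A[1:3].copy()   # an explicit copy owns its own data
independent[:] = 9            # A is unaffected by this assignment

print(A)   # [1 0 0 4 5]
```

`np.shares_memory(A, view)` can be used to check whether two arrays overlap in memory.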
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can omit any of the three parameters in `M[lower:upper:step]`:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 56,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 1, -2, -3, 4, 5])"
- ]
- },
- "execution_count": 56,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[::]  # lower, upper, step all take the default values"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 57,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 1, -2, -3, 4, 5])"
- ]
- },
- "execution_count": 57,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[:]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 58,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 1, -3, 5])"
- ]
- },
- "execution_count": 58,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[::2]  # step is 2, lower and upper default to the beginning and end of the array"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 59,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 1, -2, -3])"
- ]
- },
- "execution_count": 59,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[:3]  # first three elements"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 60,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([4, 5])"
- ]
- },
- "execution_count": 60,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[3:]  # elements from index 3"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Negative indices count from the end of the array (positive indices count from the beginning):"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 61,
- "metadata": {},
- "outputs": [],
- "source": [
- "A = np.array([1,2,3,4,5])"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 62,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "5"
- ]
- },
- "execution_count": 62,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[-1]  # the last element in the array"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 63,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([3, 4, 5])"
- ]
- },
- "execution_count": 63,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A[-3:]  # the last three elements"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Index slicing works exactly the same way for multidimensional arrays:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 64,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[ 0, 1, 2, 3, 4],\n",
- " [10, 11, 12, 13, 14],\n",
- " [20, 21, 22, 23, 24],\n",
- " [30, 31, 32, 33, 34],\n",
- " [40, 41, 42, 43, 44]])"
- ]
- },
- "execution_count": 64,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A = np.array([[n+m*10 for n in range(5)] for m in range(5)])\n",
- "\n",
- "A"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 65,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[11, 12, 13],\n",
- " [21, 22, 23],\n",
- " [31, 32, 33]])"
- ]
- },
- "execution_count": 65,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# a block from the original array\n",
- "A[1:4, 1:4]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 66,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[ 0, 2, 4],\n",
- " [20, 22, 24],\n",
- " [40, 42, 44]])"
- ]
- },
- "execution_count": 66,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# strides\n",
- "A[::2, ::2]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 5.3 Fancy indexing"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Fancy indexing is the name for when an array or list is used in place of an index:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 67,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[[10 11 12 13 14]\n",
- " [30 31 32 33 34]\n",
- " [20 21 22 23 24]]\n",
- "[[ 0 1 2 3 4]\n",
- " [10 11 12 13 14]\n",
- " [20 21 22 23 24]\n",
- " [30 31 32 33 34]\n",
- " [40 41 42 43 44]]\n"
- ]
- }
- ],
- "source": [
- "A = np.array([[n+m*10 for n in range(5)] for m in range(5)])\n",
- "\n",
- "row_indices = [1, 3, 2]\n",
- "print(A[row_indices])\n",
- "print(A)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 68,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([11, 31, 24])"
- ]
- },
- "execution_count": 68,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "col_indices = [1, 1, -1]  # index -1 means the last element\n",
- "A[row_indices, col_indices]"
- ]
- },
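Passing a row list and a column list together pairs them element by element, which is why the result above has three entries rather than a 3 by 3 block. As a sketch of the difference, the example below uses `np.ix_` (a helper not used elsewhere in this notebook) to select every listed row crossed with every listed column; `rows`, `cols`, `pairs`, and `block` are illustrative names.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
rows = [1, 3, 2]
cols = [1, 1, -1]

# Paired selection: elements (1, 1), (3, 1) and (2, -1)
pairs = A[rows, cols]

# Cross-product selection: each row in rows with each column in cols
block = A[np.ix_(rows, cols)]

print(pairs)   # [11 31 24]
print(block)
```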
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
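- "We can also use index masks: if the index mask is a Numpy array of data type `bool`, then an element is selected (True) or not (False) depending on the value of the index mask at the position of each element:"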
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 69,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0, 1, 2, 3, 4])"
- ]
- },
- "execution_count": 69,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "B = np.array([n for n in range(5)])\n",
- "B"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 70,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0, 2])"
- ]
- },
- "execution_count": 70,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "row_mask = np.array([True, False, True, False, False])\n",
- "B[row_mask]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 71,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0, 2])"
- ]
- },
- "execution_count": 71,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# same thing\n",
- "row_mask = np.array([1,0,1,0,0], dtype=bool)\n",
- "B[row_mask]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "This feature is very useful for conditionally selecting elements from an array, for example using comparison operators:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 72,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ,\n",
- " 6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5])"
- ]
- },
- "execution_count": 72,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "x = np.arange(0, 10, 0.5)\n",
- "x"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 73,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([False, False, False, False, False, False, False, False, False,\n",
- " False, False, True, True, True, True, False, False, False,\n",
- " False, False])"
- ]
- },
- "execution_count": 73,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "mask = (5 < x) * (x < 7.5)\n",
- "\n",
- "mask"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 74,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([5.5, 6. , 6.5, 7. ])"
- ]
- },
- "execution_count": 74,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "x[mask]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 75,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([3.5, 4. , 4.5, 5. , 5.5])"
- ]
- },
- "execution_count": 75,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "x[(3<x) * (x<6)]"
- ]
- },
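The cells above combine two conditions by multiplying boolean arrays. A commonly used alternative spelling is the element-wise operators `&` (and), `|` (or), and `~` (not). The sketch below, with illustrative variable names, checks that `*` and `&` give the same mask here and shows the other two operators.

```python
import numpy as np

x = np.arange(0, 10, 0.5)

by_product = (5 < x) * (x < 7.5)   # the spelling used in the cells above
by_and = (5 < x) & (x < 7.5)       # element-wise logical and

outside = ~by_and                  # element-wise negation of the mask
either = (x < 1) | (x > 9)         # element-wise logical or

print(np.array_equal(by_product, by_and))
print(x[by_and])
print(x[either])
```

The parentheses around each comparison are required, because `&` and `|` bind more tightly than `<` and `>`.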
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 6. Functions for extracting data from arrays and creating arrays"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 6.1 where"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "An index mask can be converted to position indices using the `where` function:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 76,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(array([11, 12, 13, 14]),)"
- ]
- },
- "execution_count": 76,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "x = np.arange(0, 10, 0.5)\n",
- "mask = (5 < x) & (x < 7.5)\n",
- "\n",
- "indices = np.where(mask)\n",
- "\n",
- "indices"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 77,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([5.5, 6. , 6.5, 7. ])"
- ]
- },
- "execution_count": 77,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "x[indices] # this indexing is equivalent to the fancy indexing x[mask]"
- ]
- },
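- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Besides converting a mask to indices, `np.where` also has a three-argument form that picks values element by element. The sketch below is an added illustration rather than part of the original walkthrough; `x` and `mask` mirror the arrays above, and the name `clipped` is made up for the example.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "x = np.arange(0, 10, 0.5)\n",
- "mask = (5 < x) & (x < 7.5)\n",
- "\n",
- "# One-argument form: a tuple with one index array per dimension\n",
- "indices = np.where(mask)\n",
- "\n",
- "# Three-argument form: keep x where mask is True, use 0.0 elsewhere\n",
- "clipped = np.where(mask, x, 0.0)\n",
- "\n",
- "print(indices[0])\n",
- "print(clipped[10:16])\n",
- "```\n",
- "\n",
- "The three-argument form returns an array with the same shape as `x`, which is handy when the positions of the unselected elements still matter."
- ]
- },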
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 6.2 diag"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "With the `diag` function we can also extract the diagonal and subdiagonals of an array:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 78,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 0, 11, 22, 33, 44])"
- ]
- },
- "execution_count": 78,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.diag(A)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 79,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([10, 21, 32, 43])"
- ]
- },
- "execution_count": 79,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.diag(A, -1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 7. Linear algebra"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
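- "Vectorizing code is the key to writing efficient numerical computations with Python/NumPy. This means that as much of a program as possible should be expressed in terms of matrix and vector operations, such as matrix-matrix multiplication."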
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 7.1 Scalar-array operations"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can use the usual arithmetic operators to multiply, add, subtract, and divide arrays by scalar numbers."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 80,
- "metadata": {},
- "outputs": [],
- "source": [
- "import numpy as np\n",
- "\n",
- "v1 = np.arange(0, 5)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 81,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0, 2, 4, 6, 8])"
- ]
- },
- "execution_count": 81,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "v1 * 2"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 82,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([2, 3, 4, 5, 6])"
- ]
- },
- "execution_count": 82,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "v1 + 2"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 83,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[[ 0 2 4 6 8]\n",
- " [20 22 24 26 28]\n",
- " [40 42 44 46 48]\n",
- " [60 62 64 66 68]\n",
- " [80 82 84 86 88]]\n",
- "[[ 2 3 4 5 6]\n",
- " [12 13 14 15 16]\n",
- " [22 23 24 25 26]\n",
- " [32 33 34 35 36]\n",
- " [42 43 44 45 46]]\n"
- ]
- }
- ],
- "source": [
- "A = np.array([[n+m*10 for n in range(5)] for m in range(5)])\n",
- "\n",
- "print(A * 2)\n",
- "\n",
- "print(A + 2)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 7.2 Element-wise array-array operations"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "When we add, subtract, multiply, and divide arrays with each other, the default behavior is an **element-wise** operation:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 84,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0.12684531, 0.88008175, 0.00646408],\n",
- " [0.56140088, 0.06651575, 0.79145154]])"
- ]
- },
- "execution_count": 84,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A = np.random.rand(2, 3)\n",
- "\n",
- "A * A # element-wise multiplication"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 85,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([1., 4.])"
- ]
- },
- "execution_count": 85,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "v1 = np.array([1.0, 2.0])\n",
- "v1 * v1"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
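- "If we multiply arrays with compatible shapes, broadcasting applies the element-wise product to each row. Here `A.T` has shape (3, 2) and `v1` has shape (2,), so every row of `A.T` is multiplied by `v1`; `A * v1` fails because the shapes (2, 3) and (2,) cannot be broadcast together:"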
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 86,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "((2, 3), (2,))"
- ]
- },
- "execution_count": 86,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A.shape, v1.shape"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 87,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0.35615349, 0.93812672, 0.08039952],\n",
- " [0.74926689, 0.25790647, 0.88963562]])"
- ]
- },
- "execution_count": 87,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 88,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0.35615349, 1.49853379],\n",
- " [0.93812672, 0.51581293],\n",
- " [0.08039952, 1.77927125]])"
- ]
- },
- "execution_count": 88,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A.T * v1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 89,
- "metadata": {},
- "outputs": [
- {
- "ename": "ValueError",
- "evalue": "operands could not be broadcast together with shapes (2,3) (2,) ",
- "output_type": "error",
- "traceback": [
- "---------------------------------------------------------------------------",
- "ValueError Traceback (most recent call last)",
- "<ipython-input-89-629678c55a83> in <module>\n----> 1 A*v1\n",
- "ValueError: operands could not be broadcast together with shapes (2,3) (2,) "
- ]
- }
- ],
- "source": [
- "A*v1"
- ]
- },
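- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The error above arises because broadcasting aligns shapes starting from the trailing dimension, so `(2, 3)` and `(2,)` do not match. A minimal sketch of one way around it, assuming the goal is to scale each row of `A` by the matching entry of `v1`: add a trailing axis so that `v1` has shape `(2, 1)`. The arrays below are small made-up examples, not the random ones used above.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "A = np.array([[1.0, 2.0, 3.0],\n",
- "              [4.0, 5.0, 6.0]])   # shape (2, 3)\n",
- "v1 = np.array([10.0, 100.0])      # shape (2,)\n",
- "\n",
- "# v1[:, np.newaxis] has shape (2, 1), which broadcasts against (2, 3)\n",
- "scaled = A * v1[:, np.newaxis]\n",
- "\n",
- "# Same result as transposing, multiplying, and transposing back\n",
- "same = (A.T * v1).T\n",
- "\n",
- "print(scaled)\n",
- "print(np.array_equal(scaled, same))\n",
- "```\n",
- "\n",
- "Adding the axis explicitly keeps the intent visible in the code and avoids the double transpose."
- ]
- },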
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 7.4 Matrix algebra"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
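- "There are two ways to do matrix multiplication. The first is the `dot` function, which applies matrix-matrix, matrix-vector, or inner vector multiplication to its two arguments:"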
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 90,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[2.59833251, 1.8189686 , 1.32946437, 2.15441681, 1.55219543],\n",
- " [1.4561364 , 1.26875236, 0.97855704, 1.35013248, 1.05524471],\n",
- " [2.38061437, 1.70445667, 1.16297305, 2.27888345, 1.66499116],\n",
- " [1.08602725, 0.76015292, 0.46415646, 1.38753125, 1.00011024],\n",
- " [1.82122991, 1.34175794, 0.92375387, 1.74770416, 1.27559765]])"
- ]
- },
- "execution_count": 90,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A = np.random.rand(5, 5)\n",
- "v1 = np.random.rand(5, 1)\n",
- "\n",
- "np.dot(A, A)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 91,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[2.0139906 ],\n",
- " [1.41657535],\n",
- " [2.09784627],\n",
- " [1.2752073 ],\n",
- " [1.6253844 ]])"
- ]
- },
- "execution_count": 91,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.dot(A, v1)\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 92,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[2.08466462]])"
- ]
- },
- "execution_count": 92,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.dot(v1.T, v1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
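- "Alternatively, we can cast the array objects to the `matrix` type. This changes the behavior of the standard arithmetic operators `+, -, *` so that they use matrix algebra."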
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 93,
- "metadata": {},
- "outputs": [],
- "source": [
- "M = np.matrix(A)\n",
- "v = np.matrix(v1).T # v1 has shape (5, 1), so its transpose is a row vector"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 94,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "matrix([[0.45282687, 0.64874757, 0.70028245, 0.91412865, 0.36429705]])"
- ]
- },
- "execution_count": 94,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "v"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 95,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "matrix([[2.59833251, 1.8189686 , 1.32946437, 2.15441681, 1.55219543],\n",
- " [1.4561364 , 1.26875236, 0.97855704, 1.35013248, 1.05524471],\n",
- " [2.38061437, 1.70445667, 1.16297305, 2.27888345, 1.66499116],\n",
- " [1.08602725, 0.76015292, 0.46415646, 1.38753125, 1.00011024],\n",
- " [1.82122991, 1.34175794, 0.92375387, 1.74770416, 1.27559765]])"
- ]
- },
- "execution_count": 95,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M * M"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 96,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "matrix([[2.0139906 ],\n",
- " [1.41657535],\n",
- " [2.09784627],\n",
- " [1.2752073 ],\n",
- " [1.6253844 ]])"
- ]
- },
- "execution_count": 96,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M * v.T"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 97,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "matrix([[2.08466462]])"
- ]
- },
- "execution_count": 97,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# inner product\n",
- "v * v.T"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "If we try to add, subtract, or multiply objects with incompatible shapes, we get an error:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 98,
- "metadata": {},
- "outputs": [],
- "source": [
- "v = np.matrix([1,2,3,4,5,6]).T"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 99,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "((5, 5), (6, 1))"
- ]
- },
- "execution_count": 99,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.shape(M), np.shape(v)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 100,
- "metadata": {},
- "outputs": [
- {
- "ename": "ValueError",
- "evalue": "shapes (5,5) and (6,1) not aligned: 5 (dim 1) != 6 (dim 0)",
- "output_type": "error",
- "traceback": [
- "---------------------------------------------------------------------------",
- "ValueError Traceback (most recent call last)",
- "<ipython-input-100-e8f88679fe45> in <module>\n----> 1 M * v\n",
- "~/anaconda3/lib/python3.8/site-packages/numpy/matrixlib/defmatrix.py in __mul__(self, other)\n 218 if isinstance(other, (N.ndarray, list, tuple)) :\n 219 # This promotes 1-D vectors to row vectors\n--> 220 return N.dot(self, asmatrix(other))\n 221 if isscalar(other) or not hasattr(other, '__rmul__') :\n 222 return N.dot(self, other)\n",
- "<__array_function__ internals> in dot(*args, **kwargs)\n",
- "ValueError: shapes (5,5) and (6,1) not aligned: 5 (dim 1) != 6 (dim 0)"
- ]
- }
- ],
- "source": [
- "M * v"
- ]
- },
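- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The NumPy documentation no longer recommends the `matrix` class, even for linear algebra. On plain arrays, the `@` operator (Python 3.5+) performs the same matrix products. The following sketch uses small made-up arrays to show the equivalents of the `dot` calls above; it is an added illustration, not part of the original notebook.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "A = np.array([[1.0, 2.0],\n",
- "              [3.0, 4.0]])\n",
- "v = np.array([[1.0],\n",
- "              [2.0]])      # column vector, shape (2, 1)\n",
- "\n",
- "AA = A @ A                 # matrix-matrix product, same as np.dot(A, A)\n",
- "Av = A @ v                 # matrix-vector product, shape (2, 1)\n",
- "inner = v.T @ v            # inner product, shape (1, 1)\n",
- "\n",
- "print(AA)\n",
- "print(Av.ravel(), inner.item())\n",
- "```\n",
- "\n",
- "With `@`, the `*` operator keeps its element-wise meaning, so both kinds of product remain available on the same array objects."
- ]
- },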
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 7.5 Matrix computations and data processing"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Inverse"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 101,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[-2. , 1. ],\n",
- " [ 1.5, -0.5]])"
- ]
- },
- "execution_count": 101,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "C = np.array([[1, 2], [3, 4]])\n",
- "np.linalg.inv(C) # for an np.matrix object, C.I gives the same result"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Determinant"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 102,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "-2.0000000000000004"
- ]
- },
- "execution_count": 102,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.linalg.det(C)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
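- "#### Data statistics\n",
- "It is often useful to store datasets in NumPy arrays. NumPy provides a number of functions for computing statistics of datasets stored in arrays.\n",
- "\n",
- "For example, let's compute some properties of the Stockholm temperature dataset used above."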
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 103,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(77431, 7)"
- ]
- },
- "execution_count": 103,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import numpy as np\n",
- "data = np.genfromtxt('stockholm_td_adj.dat')\n",
- "\n",
- "# reminder: the temperature dataset is stored in the `data` variable\n",
- "np.shape(data)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### mean"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 104,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "(77431, 7)\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "6.197109684751585"
- ]
- },
- "execution_count": 104,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# the temperature data is in column 3 (zero-based index, i.e. the fourth column)\n",
- "print(data.shape)\n",
- "np.mean(data[:,3])"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 105,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0.4931528475182218"
- ]
- },
- "execution_count": 105,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A = np.random.rand(4, 3)\n",
- "np.mean(A)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Over roughly the last 200 years, the daily mean temperature in Stockholm has been about 6.2 °C."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Standard deviation and variance"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 106,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(8.282271621340573, 68.59602320966341)"
- ]
- },
- "execution_count": 106,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.std(data[:,3]), np.var(data[:,3])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Minimum and maximum"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 107,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "-25.8"
- ]
- },
- "execution_count": 107,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# lowest daily average temperature\n",
- "data[:,3].min()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 108,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "28.3"
- ]
- },
- "execution_count": 108,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# highest daily average temperature\n",
- "data[:,3].max()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### sum, prod, and trace"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 109,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
- ]
- },
- "execution_count": 109,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "d = np.arange(0, 10)\n",
- "d"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 110,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "45"
- ]
- },
- "execution_count": 110,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# sum of all elements\n",
- "np.sum(d)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 111,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "3628800"
- ]
- },
- "execution_count": 111,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# product of all elements\n",
- "np.prod(d+1)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 112,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 0, 1, 3, 6, 10, 15, 21, 28, 36, 45])"
- ]
- },
- "execution_count": 112,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# cumulative sum\n",
- "np.cumsum(d)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 113,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 1, 2, 6, 24, 120, 720, 5040,\n",
- " 40320, 362880, 3628800])"
- ]
- },
- "execution_count": 113,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# cumulative product\n",
- "np.cumprod(d+1)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 114,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "1.4446600641166332"
- ]
- },
- "execution_count": 114,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# sum of the diagonal elements, same as np.diag(A).sum()\n",
- "np.trace(A)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 7.6 Computations on subsets of arrays"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
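- "We can compute with subsets of the data in an array using indexing, fancy indexing, and the other methods of extracting data from an array (described above).\n",
- "\n",
- "For example, let's go back to the temperature dataset:"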
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 115,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "1800 1 1 -6.1 -6.1 -6.1 1\r\n",
- "1800 1 2 -15.4 -15.4 -15.4 1\r\n",
- "1800 1 3 -15.0 -15.0 -15.0 1\r\n"
- ]
- }
- ],
- "source": [
- "!head -n 3 stockholm_td_adj.dat"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
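- "The data format is: year, month, day, daily average temperature, low, high, location.\n",
- "\n",
- "If we are interested in the average temperature of one particular month, say February, we can create an index mask and use it to select only the data for that month:"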
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 116,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.])"
- ]
- },
- "execution_count": 116,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.unique(data[:,1]) # the month column takes values from 1 to 12"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 117,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[False False False ... False False False]\n"
- ]
- }
- ],
- "source": [
- "mask_feb = data[:,1] == 2\n",
- "print(mask_feb)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 118,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "-3.212109570736596\n",
- "5.090390768766271\n"
- ]
- }
- ],
- "source": [
- "# the temperature data is in column 3 (zero-based index)\n",
- "print(np.mean(data[mask_feb,3]))\n",
- "print(np.std(data[mask_feb,3]))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
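- "With these tools we have very powerful data processing capabilities at our disposal. For example, extracting the average temperature for each month of the year takes only a few lines of code:"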
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 119,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "<Figure size 432x288 with 1 Axes>"
- ]
- },
- "metadata": {
- "needs_background": "light"
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "%matplotlib inline\n",
- "import matplotlib.pyplot as plt\n",
- "\n",
- "months = np.unique(data[:,1])\n",
- "monthly_mean = [np.mean(data[data[:,1] == month, 3]) for month in months]\n",
- "\n",
- "fig, ax = plt.subplots()\n",
- "ax.bar(months, monthly_mean)\n",
- "ax.set_xlabel(\"Month\")\n",
- "ax.set_ylabel(\"Monthly avg. temp.\");"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 7.7 Calculations with higher-dimensional data"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
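- "When functions such as `min` and `max` are applied to multidimensional arrays, it is sometimes useful to apply the calculation to the entire array, and sometimes only along rows or columns. The `axis` argument specifies how these functions should behave:"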
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 120,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0.85882078, 0.0838741 , 0.4529751 ],\n",
- " [0.32355282, 0.23641565, 0.37693805],\n",
- " [0.06769945, 0.30438005, 0.9780961 ],\n",
- " [0.46162058, 0.42681981, 0.71106984]])"
- ]
- },
- "execution_count": 120,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import numpy as np\n",
- "\n",
- "m = np.random.rand(4,3)\n",
- "m"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 121,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0.978096099540799"
- ]
- },
- "execution_count": 121,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# global max\n",
- "m.max()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 122,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0.85882078, 0.42681981, 0.9780961 ])"
- ]
- },
- "execution_count": 122,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# max in each column\n",
- "m.max(axis=0)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 123,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0.85882078, 0.37693805, 0.9780961 , 0.71106984])"
- ]
- },
- "execution_count": 123,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# max in each row\n",
- "m.max(axis=1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Many other functions and methods of the `array` and `matrix` classes accept the same (optional) `axis` keyword argument."
- ]
- },
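- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As a small illustration of the same `axis` argument on other reductions, the sketch below (with a made-up array `m`) computes column means and then uses `keepdims=True` so that the row sums can be divided back into the array by broadcasting. The variable names are chosen for this example only.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "m = np.array([[1.0, 2.0, 3.0],\n",
- "              [4.0, 5.0, 6.0]])\n",
- "\n",
- "col_mean = m.mean(axis=0)                # one value per column, shape (3,)\n",
- "row_sum = m.sum(axis=1, keepdims=True)   # one value per row, shape (2, 1)\n",
- "\n",
- "# keepdims leaves a length-1 axis, so the result broadcasts against m\n",
- "row_fraction = m / row_sum\n",
- "\n",
- "print(col_mean)\n",
- "print(row_fraction.sum(axis=1))\n",
- "```\n",
- "\n",
- "Without `keepdims=True` the row sums would have shape `(2,)`, which does not broadcast against the `(2, 3)` array."
- ]
- },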
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 8. Reshaping, resizing and stacking arrays"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The shape of a NumPy array can be modified without copying the underlying data, which makes this a fast operation even for large arrays."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 124,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[[0.58458652 0.95489874 0.76873658]\n",
- " [0.79144906 0.35559767 0.96031963]\n",
- " [0.55942317 0.78723157 0.3650356 ]\n",
- " [0.04685468 0.43444695 0.33839966]]\n"
- ]
- }
- ],
- "source": [
- "import numpy as np\n",
- "\n",
- "A = np.random.rand(4, 3)\n",
- "print(A)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 125,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "4 3\n"
- ]
- }
- ],
- "source": [
- "n, m = A.shape\n",
- "print(n, m)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 126,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[0.58458652, 0.95489874, 0.76873658, 0.79144906, 0.35559767,\n",
- " 0.96031963, 0.55942317, 0.78723157, 0.3650356 , 0.04685468,\n",
- " 0.43444695, 0.33839966]])"
- ]
- },
- "execution_count": 126,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "B = A.reshape((1,n*m))\n",
- "B"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 127,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[[0.58458652]\n",
- " [0.95489874]\n",
- " [0.76873658]\n",
- " [0.79144906]\n",
- " [0.35559767]\n",
- " [0.96031963]\n",
- " [0.55942317]\n",
- " [0.78723157]\n",
- " [0.3650356 ]\n",
- " [0.04685468]\n",
- " [0.43444695]\n",
- " [0.33839966]]\n",
- "(12, 1)\n"
- ]
- }
- ],
- "source": [
- "B2 = A.reshape((n*m, 1))\n",
- "print(B2)\n",
- "print(B2.shape)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 128,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[5. , 5. , 5. , 5. , 5. ,\n",
- " 0.96031963, 0.55942317, 0.78723157, 0.3650356 , 0.04685468,\n",
- " 0.43444695, 0.33839966]])"
- ]
- },
- "execution_count": 128,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "B[0,0:5] = 5 # modify the array\n",
- "\n",
- "B"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 129,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[5. , 5. , 5. ],\n",
- " [5. , 5. , 0.96031963],\n",
- " [0.55942317, 0.78723157, 0.3650356 ],\n",
- " [0.04685468, 0.43444695, 0.33839966]])"
- ]
- },
- "execution_count": 129,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A # and the original variable is also changed. B is only a different view of the same data"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can also use the function `flatten` to turn a higher-dimensional array into a vector. Unlike `reshape`, this function creates a copy of the data."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 130,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([5. , 5. , 5. , 5. , 5. ,\n",
- " 0.96031963, 0.55942317, 0.78723157, 0.3650356 , 0.04685468,\n",
- " 0.43444695, 0.33839966])"
- ]
- },
- "execution_count": 130,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "B = A.flatten()\n",
- "\n",
- "B"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 131,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "(12,)\n"
- ]
- }
- ],
- "source": [
- "print(B.shape)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 132,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[0.88616566 0.11474399 0.49426839 0.86496944 0.44553257 0.01731081\n",
- " 0.26391484 0.81714822 0.9077824 0.45350327 0.34418481 0.30680307\n",
- " 0.22397584 0.96490185 0.25766897 0.1628303 0.35022665 0.87266285\n",
- " 0.14436895 0.2987234 0.04567582 0.62524215 0.03006832 0.15222984\n",
- " 0.86554462 0.30036796 0.66637188 0.51245662 0.46296801 0.53384373\n",
- " 0.90012971 0.00319531 0.48428543 0.24703543 0.53384405 0.48024175\n",
- " 0.17175873 0.1834814 0.43739033 0.64565657 0.49266811 0.72123815\n",
- " 0.57728476 0.76663343 0.68360823 0.34881945 0.64329004 0.79011718\n",
- " 0.7055079 0.32594224 0.48795517 0.43684614 0.32047664 0.63067622\n",
- " 0.24496431 0.25019593 0.57181523 0.38889906 0.53574819 0.02653888]\n"
- ]
- }
- ],
- "source": [
- "T = np.random.rand(3, 4, 5)\n",
- "T2 = T.flatten()\n",
- "print(T2)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 133,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([10. , 10. , 10. , 10. , 10. ,\n",
- " 0.96031963, 0.55942317, 0.78723157, 0.3650356 , 0.04685468,\n",
- " 0.43444695, 0.33839966])"
- ]
- },
- "execution_count": 133,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "B[0:5] = 10\n",
- "\n",
- "B"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 134,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[5. , 5. , 5. ],\n",
- " [5. , 5. , 0.96031963],\n",
- " [0.55942317, 0.78723157, 0.3650356 ],\n",
- " [0.04685468, 0.43444695, 0.33839966]])"
- ]
- },
- "execution_count": 134,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A # A is unchanged this time, because B holds a copy of A's data rather than a view of it"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 9. Adding and removing dimensions: newaxis, squeeze"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
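- "Matrix multiplication only works when the corresponding dimensions of the two operands match. With `newaxis` we can insert a new dimension into an array, for example to turn a vector into a column or row matrix:"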
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 135,
- "metadata": {},
- "outputs": [],
- "source": [
- "v = np.array([1,2,3])"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 136,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "(3,)\n",
- "[1 2 3]\n"
- ]
- }
- ],
- "source": [
- "print(np.shape(v))\n",
- "print(v)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 137,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "(3, 1)\n",
- "[[1]\n",
- " [2]\n",
- " [3]]\n"
- ]
- }
- ],
- "source": [
- "v2 = v.reshape(3, 1)\n",
- "print(v2.shape)\n",
- "print(v2)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 138,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "(3,)\n",
- "(3, 1)\n"
- ]
- }
- ],
- "source": [
- "# make a column matrix from the vector v\n",
- "v2 = v[:, np.newaxis]\n",
- "print(v.shape)\n",
- "print(v2.shape)\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 139,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(3, 1)"
- ]
- },
- "execution_count": 139,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# column matrix\n",
- "v[:,np.newaxis].shape"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 140,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(1, 3)"
- ]
- },
- "execution_count": 140,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# row matrix\n",
- "v[np.newaxis,:].shape"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The same result can be obtained with `np.expand_dims`:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 141,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "(3, 1)\n",
- "[[1]\n",
- " [2]\n",
- " [3]]\n"
- ]
- }
- ],
- "source": [
- "v = np.array([1,2,3])\n",
- "v3 = np.expand_dims(v, 1)\n",
- "print(v3.shape)\n",
- "print(v3)"
- ]
- },
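- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To tie this back to the matrix-multiplication remark at the start of this section, here is a minimal sketch of how the position of the inserted axis decides what `@` computes. The names `col`, `row`, `outer` and `inner` are chosen only for this illustration.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "v = np.array([1, 2, 3])\n",
- "\n",
- "col = v[:, np.newaxis]   # shape (3, 1): column matrix\n",
- "row = v[np.newaxis, :]   # shape (1, 3): row matrix\n",
- "\n",
- "# (3, 1) @ (1, 3) gives the 3x3 outer product\n",
- "outer = col @ row\n",
- "print(outer.shape)       # (3, 3)\n",
- "\n",
- "# (1, 3) @ (3, 1) gives the inner product as a 1x1 matrix\n",
- "inner = row @ col\n",
- "print(inner)             # [[14]]\n",
- "```\n",
- "\n",
- "Both operands hold the same three numbers; only the placement of the length-1 axis differs."
- ]
- },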
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Sometimes a dimension of length 1 is not wanted and needs to be dropped; `np.squeeze` does this:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 142,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "(1, 2, 3)\n",
- "[[[1 2 3]\n",
- " [2 3 4]]]\n"
- ]
- }
- ],
- "source": [
- "arr = np.array([[[1, 2, 3], [2, 3, 4]]])\n",
- "print(arr.shape)\n",
- "print(arr)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 143,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "(2, 3)\n",
- "[[1 2 3]\n",
- " [2 3 4]]\n"
- ]
- }
- ],
- "source": [
- "# the first dimension has length 1 and is not needed\n",
- "arr2 = np.squeeze(arr, 0)\n",
- "print(arr2.shape)\n",
- "print(arr2)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Note: an axis passed to `np.squeeze` can only be removed if the array has length 1 along that axis; otherwise an error is raised."
- ]
- },
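- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The note above can be checked with a short sketch. The array `demo` and the flag `squeeze_failed` are made up for this illustration, and the exact wording of the error message may differ between NumPy versions.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "demo = np.zeros((1, 2, 1, 3))\n",
- "\n",
- "# no axis given: every length-1 dimension is dropped\n",
- "print(np.squeeze(demo).shape)          # (2, 3)\n",
- "\n",
- "# one axis given: only that dimension is dropped\n",
- "print(np.squeeze(demo, axis=0).shape)  # (2, 1, 3)\n",
- "\n",
- "# an axis whose length is not 1 cannot be squeezed\n",
- "try:\n",
- "    np.squeeze(demo, axis=1)\n",
- "    squeeze_failed = False\n",
- "except ValueError as err:\n",
- "    squeeze_failed = True\n",
- "    print('ValueError:', err)\n",
- "```"
- ]
- },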
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 10. Stacking and repeating arrays"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Using the functions `repeat`, `tile`, `vstack`, `hstack`, and `concatenate`, we can build larger vectors and matrices from smaller ones."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 10.1 tile and repeat"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 144,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[[1 2]\n",
- " [3 4]]\n"
- ]
- }
- ],
- "source": [
- "a = np.array([[1, 2], [3, 4]])\n",
- "print(a)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 145,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])"
- ]
- },
- "execution_count": 145,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# repeat each element 3 times\n",
- "np.repeat(a, 3)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 146,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2, 1, 2, 1, 2],\n",
- " [3, 4, 3, 4, 3, 4]])"
- ]
- },
- "execution_count": 146,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# tile the matrix 3 times \n",
- "np.tile(a, 3)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 147,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2, 1, 2, 1, 2],\n",
- " [3, 4, 3, 4, 3, 4]])"
- ]
- },
- "execution_count": 147,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# a more explicit form: give the number of repetitions per axis\n",
- "np.tile(a, (1, 3))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 148,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2],\n",
- " [3, 4],\n",
- " [1, 2],\n",
- " [3, 4],\n",
- " [1, 2],\n",
- " [3, 4]])"
- ]
- },
- "execution_count": 148,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.tile(a, (3, 1))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 10.2 concatenate"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 149,
- "metadata": {},
- "outputs": [],
- "source": [
- "b = np.array([[5, 6]])"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 150,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2],\n",
- " [3, 4],\n",
- " [5, 6]])"
- ]
- },
- "execution_count": 150,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.concatenate((a, b), axis=0)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 151,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2, 5],\n",
- " [3, 4, 6]])"
- ]
- },
- "execution_count": 151,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.concatenate((a, b.T), axis=1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### 10.3 hstack and vstack"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 152,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2],\n",
- " [3, 4],\n",
- " [5, 6]])"
- ]
- },
- "execution_count": 152,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.vstack((a,b))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 153,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2, 5],\n",
- " [3, 4, 6]])"
- ]
- },
- "execution_count": 153,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "np.hstack((a,b.T))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 11. Copy and “deep copy”"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
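- "To achieve high performance, assignments in Python usually do not copy the underlying objects. For example, when objects are passed between functions they are passed by reference, which avoids copying large amounts of memory unnecessarily."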
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 154,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2],\n",
- " [3, 4]])"
- ]
- },
- "execution_count": 154,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A = np.array([[1, 2], [3, 4]])\n",
- "\n",
- "A"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 155,
- "metadata": {},
- "outputs": [],
- "source": [
- "# B now refers to the same array data as A\n",
- "B = A "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 156,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[10, 2],\n",
- " [ 3, 4]])"
- ]
- },
- "execution_count": 156,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# changing B also changes A\n",
- "B[0,0] = 10\n",
- "\n",
- "B"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 157,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[10, 2],\n",
- " [ 3, 4]])"
- ]
- },
- "execution_count": 157,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
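- "If we want to avoid this reference-sharing behaviour, so that we get a new, completely independent object `B` copied from `A`, we need to make a so-called “deep copy” using the function `copy`:"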
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 158,
- "metadata": {},
- "outputs": [],
- "source": [
- "B = np.copy(A)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 159,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[-5, 2],\n",
- " [ 3, 4]])"
- ]
- },
- "execution_count": 159,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# now, if we modify B, A is not affected\n",
- "B[0,0] = -5\n",
- "\n",
- "B"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 160,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[10, 2],\n",
- " [ 3, 4]])"
- ]
- },
- "execution_count": 160,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "A"
- ]
- },
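- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "When it is unclear whether two names share data, `np.shares_memory` gives a direct answer. The sketch below is only an illustration; the names `alias`, `view` and `independent` are not part of the tutorial's own code.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "A = np.arange(6).reshape(2, 3)\n",
- "\n",
- "alias = A                # plain assignment: the very same object\n",
- "view = A.reshape(6)      # new array object, same underlying data\n",
- "independent = A.copy()   # new array object with its own data\n",
- "\n",
- "print(alias is A)                         # True\n",
- "print(np.shares_memory(A, view))          # True\n",
- "print(np.shares_memory(A, independent))   # False\n",
- "\n",
- "# writing through the view changes A, writing to the copy does not\n",
- "view[0] = 100\n",
- "independent[0, 1] = -1\n",
- "print(A[0, 0], A[0, 1])                   # 100 1\n",
- "```"
- ]
- },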
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 12. Iterating over array elements"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
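- "Generally, we want to avoid iterating over the elements of an array whenever we can. The reason is that in an interpreted language such as Python (or MATLAB), iteration is very slow compared with vectorized operations.\n",
- "\n",
- "However, sometimes iteration is unavoidable. In such cases, the Python `for` loop is the most convenient way to iterate over an array:"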
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 161,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "1\n",
- "2\n",
- "3\n",
- "4\n"
- ]
- }
- ],
- "source": [
- "v = np.array([1,2,3,4])\n",
- "\n",
- "for element in v:\n",
- "    print(element)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 162,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "row [1 2]\n",
- "1\n",
- "2\n",
- "row [3 4]\n",
- "3\n",
- "4\n"
- ]
- }
- ],
- "source": [
- "M = np.array([[1,2], [3,4]])\n",
- "\n",
- "for row in M:\n",
- "    print(\"row\", row)\n",
- "    \n",
- "    for element in row:\n",
- "        print(element)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "When we need to iterate over each element of an array and modify it, the `enumerate` function makes it easy to obtain both the element and its index in the `for` loop:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 163,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "row_idx 0 row [1 2]\n",
- "col_idx 0 element 1\n",
- "col_idx 1 element 2\n",
- "row_idx 1 row [3 4]\n",
- "col_idx 0 element 3\n",
- "col_idx 1 element 4\n"
- ]
- }
- ],
- "source": [
- "for row_idx, row in enumerate(M):\n",
- "    print(\"row_idx\", row_idx, \"row\", row)\n",
- "    \n",
- "    for col_idx, element in enumerate(row):\n",
- "        print(\"col_idx\", col_idx, \"element\", element)\n",
- "        \n",
- "        # update the matrix M: square each element\n",
- "        M[row_idx, col_idx] = element ** 2"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 164,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[ 1, 4],\n",
- " [ 9, 16]])"
- ]
- },
- "execution_count": 164,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# each element in M is now squared\n",
- "M"
- ]
- },
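- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "For comparison, the element-wise squaring done by the nested loops above can also be written in two other ways. This is a sketch rather than a replacement for the example: `np.ndenumerate` yields `(index, value)` pairs for arrays of any dimension, and the vectorized form needs no Python-level loop at all. The names `looped` and `vectorized` are illustrative.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "M = np.array([[1, 2], [3, 4]])\n",
- "\n",
- "# explicit loop over (index, value) pairs, writing into a separate array\n",
- "looped = np.empty_like(M)\n",
- "for (i, j), value in np.ndenumerate(M):\n",
- "    looped[i, j] = value ** 2\n",
- "\n",
- "# vectorized: the loop runs inside NumPy\n",
- "vectorized = M ** 2\n",
- "\n",
- "print(np.array_equal(looped, vectorized))   # True\n",
- "```\n",
- "\n",
- "Writing into a separate array also avoids modifying `M` while iterating over it."
- ]
- },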
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 13. Vectorizing functions"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
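- "As mentioned several times already, to get good performance we should avoid looping over the elements of vectors and matrices and use vectorized algorithms instead. The first step in converting a scalar algorithm to a vectorized one is to make sure that the functions we write work with vector inputs."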
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 165,
- "metadata": {},
- "outputs": [],
- "source": [
- "def Theta(x):\n",
- "    \"\"\"\n",
- "    Scalar implementation of the Heaviside step function.\n",
- "    \"\"\"\n",
- "    if x >= 0:\n",
- "        return 1\n",
- "    else:\n",
- "        return 0"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 166,
- "metadata": {
- "scrolled": true
- },
- "outputs": [
- {
- "ename": "ValueError",
- "evalue": "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()",
- "output_type": "error",
- "traceback": [
- "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
- "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
- "\u001b[0;32m<ipython-input-166-b49266106206>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mTheta\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0marray\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
- "\u001b[0;32m<ipython-input-165-cb840dbb09da>\u001b[0m in \u001b[0;36mTheta\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0m阶跃函数的普遍版本\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \"\"\"\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0;32mif\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m>=\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
- "\u001b[0;31mValueError\u001b[0m: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
- ]
- }
- ],
- "source": [
- "Theta(np.array([-3,-2,-1,0,1,2,3]))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
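- "This does not work, because `Theta` as implemented cannot handle vector input: the `if` statement needs a single truth value, not an array of them.\n",
- "\n",
- "To get a vectorized version we can use the NumPy function `vectorize`. In many cases it can vectorize a function automatically:"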
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 167,
- "metadata": {},
- "outputs": [],
- "source": [
- "Theta_vec = np.vectorize(Theta)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 168,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0, 0, 0, 1, 1, 1, 1])"
- ]
- },
- "execution_count": 168,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "Theta_vec(np.array([-3,-2,-1,0,1,2,3]))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can also write the function so that it accepts vector input from the start (this takes more effort, but may give better performance):"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 169,
- "metadata": {},
- "outputs": [],
- "source": [
- "def Theta(x):\n",
- "    \"\"\"\n",
- "    Vector-aware implementation of the Heaviside step function.\n",
- "    \"\"\"\n",
- "    return 1 * (x >= 0)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 170,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([0, 0, 0, 1, 1, 1, 1])"
- ]
- },
- "execution_count": 170,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "Theta(np.array([-3,-2,-1,0,1,2,3]))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 171,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[False False False True True True True]\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "array([0, 0, 0, 1, 1, 1, 1])"
- ]
- },
- "execution_count": 171,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "a = np.array([-3,-2,-1,0,1,2,3])\n",
- "b = a>=0\n",
- "print(b)\n",
- "b*1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 172,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(0, 1)"
- ]
- },
- "execution_count": 172,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# still works for scalars as well\n",
- "Theta(-1.2), Theta(2.6)"
- ]
- },
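- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "A few other ways to write the same step function without a Python-level branch are sketched below. They are alternatives for illustration, not the approach used in this tutorial. Keep in mind that the NumPy documentation describes `np.vectorize` as a convenience wrapper that is essentially a loop, so expressions like these are usually the faster choice. The names `step_where`, `step_cast` and `step_heaviside` are illustrative.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "x = np.array([-3, -2, -1, 0, 1, 2, 3])\n",
- "\n",
- "# pick 1 where the condition holds, 0 elsewhere\n",
- "step_where = np.where(x >= 0, 1, 0)\n",
- "\n",
- "# cast the boolean mask to integers\n",
- "step_cast = (x >= 0).astype(int)\n",
- "\n",
- "# built-in ufunc; the second argument is the value returned at x == 0,\n",
- "# and the result is a float array\n",
- "step_heaviside = np.heaviside(x, 1)\n",
- "\n",
- "print(step_where)   # [0 0 0 1 1 1 1]\n",
- "print(step_cast)    # [0 0 0 1 1 1 1]\n",
- "```"
- ]
- },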
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 14. Using arrays in conditions"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
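- "When using arrays in conditions, for example in `if` statements and other boolean expressions, we need to use `any` or `all`, which test whether any or all of the array's elements evaluate to `True`."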
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 173,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1, 2],\n",
- " [3, 4]])"
- ]
- },
- "execution_count": 173,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M = np.array([[1, 2], [3, 4]])\n",
- "M"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 174,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "execution_count": 174,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "(M > 2).any()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 175,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "at least one element in M is larger than 2\n"
- ]
- }
- ],
- "source": [
- "if (M > 2).any():\n",
- "    print(\"at least one element in M is larger than 2\")\n",
- "else:\n",
- "    print(\"no element in M is larger than 2\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 176,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "not all elements in M are larger than 5\n"
- ]
- }
- ],
- "source": [
- "if (M > 5).all():\n",
- "    print(\"all elements in M are larger than 5\")\n",
- "else:\n",
- "    print(\"not all elements in M are larger than 5\")"
- ]
- },
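- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Two related details are sketched below: element-wise conditions are combined with `&` and `|` rather than the Python keywords `and` and `or`, and both `any` and `all` accept an `axis` argument to reduce along one dimension only. The name `mask` is illustrative.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "M = np.array([[1, 2], [3, 4]])\n",
- "\n",
- "# element-wise AND of two conditions; the parentheses are required\n",
- "mask = (M > 1) & (M < 4)\n",
- "print(mask)\n",
- "# [[False  True]\n",
- "#  [ True False]]\n",
- "\n",
- "# does each column contain at least one True?\n",
- "print(mask.any(axis=0))   # [ True  True]\n",
- "\n",
- "# is each row entirely True?\n",
- "print(mask.all(axis=1))   # [False False]\n",
- "```"
- ]
- },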
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 15. Type casting"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
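- "Since NumPy arrays are *statically typed*, the type of an array does not change once it has been created. However, we can explicitly cast an array to another type with the `astype` function (see also the similar `asarray` function). This always creates a new array of the new type."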
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 177,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "dtype('int64')"
- ]
- },
- "execution_count": 177,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M.dtype\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 178,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[1., 2.],\n",
- " [3., 4.]])"
- ]
- },
- "execution_count": 178,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M2 = M.astype(float)\n",
- "\n",
- "M2"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 179,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "dtype('float64')"
- ]
- },
- "execution_count": 179,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M2.dtype"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 180,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "array([[ True, True],\n",
- " [ True, True]])"
- ]
- },
- "execution_count": 180,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "M3 = M.astype(bool)\n",
- "\n",
- "M3"
- ]
- },
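- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "One detail worth knowing when casting floats to integers is that `astype(int)` drops the fractional part instead of rounding. The sketch below shows the difference; the names `as_int` and `rounded` are illustrative.\n",
- "\n",
- "```python\n",
- "import numpy as np\n",
- "\n",
- "x = np.array([1.7, -1.7, 2.0])\n",
- "\n",
- "# casting truncates toward zero\n",
- "as_int = x.astype(int)\n",
- "print(as_int)    # [ 1 -1  2]\n",
- "\n",
- "# round to the nearest integer first if that is what is wanted\n",
- "rounded = np.rint(x).astype(int)\n",
- "print(rounded)   # [ 2 -2  2]\n",
- "\n",
- "# the original array keeps its type and values\n",
- "print(x.dtype)   # float64\n",
- "```"
- ]
- },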
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 16. Further reading"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "* [A simple NumPy tutorial](https://www.runoob.com/numpy/numpy-tutorial.html)\n",
- "* [NumPy official user guide](https://www.numpy.org.cn/user/)\n",
- "* [NumPy official reference manual](https://www.numpy.org.cn/reference/)\n",
- "* [NumPy for MATLAB users](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html)"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.8.3"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 1
- }
|