1-numpy_tutorial.ipynb 160 kB

6 years ago

  1. {
  2. "cells": [
  3. {
  4. "cell_type": "markdown",
  5. "metadata": {},
  6. "source": [
  7. "# NumPy - A Library for Multidimensional Data Arrays"
  8. ]
  9. },
  10. {
  11. "cell_type": "markdown",
  12. "metadata": {},
  13. "source": [
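  14. "NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines.\n",
  15. "* These routines perform fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and more.\n",
  16. "* As the foundation of numerical computing in Python, NumPy is widely used in data processing, signal processing, machine learning, and related fields."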
  17. ]
  18. },
  19. {
  20. "cell_type": "markdown",
  21. "metadata": {},
  22. "source": [
  23. "![cover image](images/numpy.png)"
  24. ]
  25. },
  26. {
  27. "cell_type": "markdown",
  28. "metadata": {},
  29. "source": [
  30. "## 1. Introduction"
  31. ]
  32. },
  33. {
  34. "cell_type": "markdown",
  35. "metadata": {},
  36. "source": [
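  37. "The `numpy` package (module) is used in almost all numerical computation in Python. It provides high-performance vector, matrix, and higher-dimensional data structures. It is implemented in C and Fortran, so performance is very good when computations are vectorized (formulated in terms of vectors and matrices).\n",
  38. "\n",
  39. "To use the `numpy` module, first import it as in the examples below:"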
  40. ]
  41. },
  42. {
  43. "cell_type": "code",
  44. "execution_count": 1,
  45. "metadata": {},
  46. "outputs": [],
  47. "source": [
  48. "# The purpose of this line is explained in the Matplotlib tutorial\n",
  49. "%matplotlib inline\n",
  50. "import matplotlib.pyplot as plt"
  51. ]
  52. },
  53. {
  54. "cell_type": "code",
  55. "execution_count": 2,
  56. "metadata": {},
  57. "outputs": [],
  58. "source": [
  59. "# Importing the library this way is not recommended\n",
  60. "from numpy import *"
  61. ]
  62. },
  63. {
  64. "cell_type": "code",
  65. "execution_count": 3,
  66. "metadata": {},
  67. "outputs": [],
  68. "source": [
  69. "# This is the recommended way\n",
  70. "import numpy as np"
  71. ]
  72. },
  73. {
  74. "cell_type": "markdown",
  75. "metadata": {},
  76. "source": [
  77. "**We recommend the second import style:** `import numpy as np`\n"
  78. ]
  79. },
  80. {
  81. "cell_type": "markdown",
  82. "metadata": {},
  83. "source": [
  84. "## 2. Creating `numpy` arrays"
  85. ]
  86. },
  87. {
  88. "cell_type": "markdown",
  89. "metadata": {},
  90. "source": [
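  91. "There are many ways to initialize a new numpy array, for example:\n",
  92. "\n",
  93. "* from a Python list or tuple\n",
  94. "* using functions dedicated to creating numpy arrays, such as `arange`, `linspace`, etc.\n",
  95. "* reading data from files"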
  96. ]
  97. },
  98. {
  99. "cell_type": "markdown",
  100. "metadata": {},
  101. "source": [
  102. "### 2.1 From lists"
  103. ]
  104. },
  105. {
  106. "cell_type": "markdown",
  107. "metadata": {},
  108. "source": [
  109. "For example, to create new vector and matrix arrays from Python lists we can use the `numpy.array` function.\n"
  110. ]
  111. },
  112. {
  113. "cell_type": "code",
  114. "execution_count": 4,
  115. "metadata": {},
  116. "outputs": [
  117. {
  118. "name": "stdout",
  119. "output_type": "stream",
  120. "text": [
  121. "[1, 2, 3, 4]\n"
  122. ]
  123. },
  124. {
  125. "data": {
  126. "text/plain": [
  127. "array([1, 2, 3, 4])"
  128. ]
  129. },
  130. "execution_count": 4,
  131. "metadata": {},
  132. "output_type": "execute_result"
  133. }
  134. ],
  135. "source": [
  136. "import numpy as np\n",
  137. "\n",
  138. "a = [1, 2, 3, 4]\n",
  139. "print(a)\n",
  140. "\n",
  141. "# a vector: the argument to the array function is a Python list\n",
  142. "v = np.array(a)\n",
  143. "\n",
  144. "v"
  145. ]
  146. },
  147. {
  148. "cell_type": "code",
  149. "execution_count": 5,
  150. "metadata": {},
  151. "outputs": [
  152. {
  153. "name": "stdout",
  154. "output_type": "stream",
  155. "text": [
  156. "[[1 2]\n",
  157. " [3 4]\n",
  158. " [5 6]]\n",
  159. "\n",
  160. "(3, 2)\n"
  161. ]
  162. }
  163. ],
  164. "source": [
  165. "# a matrix: the argument to the array function is a nested Python list\n",
  166. "M = np.array([[1, 2], [3, 4], [5, 6]])\n",
  167. "\n",
  168. "print(M)\n",
  169. "print()\n",
  170. "print(M.shape)"
  171. ]
  172. },
  173. {
  174. "cell_type": "code",
  175. "execution_count": 6,
  176. "metadata": {},
  177. "outputs": [
  178. {
  179. "name": "stdout",
  180. "output_type": "stream",
  181. "text": [
  182. "[[[ 1 2]\n",
  183. " [ 3 4]\n",
  184. " [ 5 6]]\n",
  185. "\n",
  186. " [[ 3 4]\n",
  187. " [ 5 6]\n",
  188. " [ 7 8]]\n",
  189. "\n",
  190. " [[ 5 6]\n",
  191. " [ 7 8]\n",
  192. " [ 9 10]]\n",
  193. "\n",
  194. " [[ 7 8]\n",
  195. " [ 9 10]\n",
  196. " [11 12]]]\n",
  197. "\n",
  198. "(4, 3, 2)\n"
  199. ]
  200. }
  201. ],
  202. "source": [
  203. "M = np.array([[[1,2], [3,4], [5,6]], \\\n",
  204. " [[3,4], [5,6], [7,8]], \\\n",
  205. " [[5,6], [7,8], [9,10]], \\\n",
  206. " [[7,8], [9,10], [11,12]]])\n",
  207. "print(M)\n",
  208. "print()\n",
  209. "print(M.shape)"
  210. ]
  211. },
  212. {
  213. "cell_type": "markdown",
  214. "metadata": {},
  215. "source": [
  216. "Both `v` and `M` are of type `ndarray`, which the `numpy` module provides."
  217. ]
  218. },
  219. {
  220. "cell_type": "code",
  221. "execution_count": 7,
  222. "metadata": {},
  223. "outputs": [
  224. {
  225. "data": {
  226. "text/plain": [
  227. "(numpy.ndarray, numpy.ndarray)"
  228. ]
  229. },
  230. "execution_count": 7,
  231. "metadata": {},
  232. "output_type": "execute_result"
  233. }
  234. ],
  235. "source": [
  236. "type(v), type(M)"
  237. ]
  238. },
  239. {
  240. "cell_type": "markdown",
  241. "metadata": {},
  242. "source": [
  243. "The only difference between `v` and `M` is their shape. We can get the shape of an array from the `ndarray.shape` attribute."
  244. ]
  245. },
  246. {
  247. "cell_type": "code",
  248. "execution_count": 8,
  249. "metadata": {},
  250. "outputs": [
  251. {
  252. "data": {
  253. "text/plain": [
  254. "(4,)"
  255. ]
  256. },
  257. "execution_count": 8,
  258. "metadata": {},
  259. "output_type": "execute_result"
  260. }
  261. ],
  262. "source": [
  263. "v.shape"
  264. ]
  265. },
  266. {
  267. "cell_type": "code",
  268. "execution_count": 9,
  269. "metadata": {},
  270. "outputs": [
  271. {
  272. "data": {
  273. "text/plain": [
  274. "(4, 3, 2)"
  275. ]
  276. },
  277. "execution_count": 9,
  278. "metadata": {},
  279. "output_type": "execute_result"
  280. }
  281. ],
  282. "source": [
  283. "M.shape"
  284. ]
  285. },
  286. {
  287. "cell_type": "markdown",
  288. "metadata": {},
  289. "source": [
  290. "The number of elements in an array is available through the `ndarray.size` attribute:"
  291. ]
  292. },
  293. {
  294. "cell_type": "code",
  295. "execution_count": 10,
  296. "metadata": {},
  297. "outputs": [
  298. {
  299. "data": {
  300. "text/plain": [
  301. "24"
  302. ]
  303. },
  304. "execution_count": 10,
  305. "metadata": {},
  306. "output_type": "execute_result"
  307. }
  308. ],
  309. "source": [
  310. "M.size"
  311. ]
  312. },
  313. {
  314. "cell_type": "markdown",
  315. "metadata": {},
  316. "source": [
  317. "Equivalently, we can use the functions `numpy.shape` and `numpy.size`:"
  318. ]
  319. },
  320. {
  321. "cell_type": "code",
  322. "execution_count": 11,
  323. "metadata": {},
  324. "outputs": [
  325. {
  326. "data": {
  327. "text/plain": [
  328. "(4, 3, 2)"
  329. ]
  330. },
  331. "execution_count": 11,
  332. "metadata": {},
  333. "output_type": "execute_result"
  334. }
  335. ],
  336. "source": [
  337. "np.shape(M)"
  338. ]
  339. },
  340. {
  341. "cell_type": "code",
  342. "execution_count": 12,
  343. "metadata": {},
  344. "outputs": [
  345. {
  346. "data": {
  347. "text/plain": [
  348. "24"
  349. ]
  350. },
  351. "execution_count": 12,
  352. "metadata": {},
  353. "output_type": "execute_result"
  354. }
  355. ],
  356. "source": [
  357. "np.size(M)"
  358. ]
  359. },
  360. {
  361. "cell_type": "markdown",
  362. "metadata": {},
  363. "source": [
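  364. "So far `numpy.ndarray` looks very much like a Python list (or nested list). Why not simply use Python lists for computations instead of creating a new array type?\n",
  365. "\n",
  366. "There are several reasons:\n",
  367. "\n",
  368. "* Python lists are very general. They can contain any kind of object and are dynamically typed. They do not support mathematical operations such as matrix and dot multiplication, and because of the dynamic typing such operations cannot be implemented efficiently for Python lists.\n",
  369. "* Numpy arrays are **statically typed** and **homogeneous**. The type of the elements is determined when the array is created.\n",
  370. "* Numpy arrays are memory efficient.\n",
  371. "* Because of the static typing, mathematical functions such as multiplication and addition of `numpy` arrays can be implemented quickly in a compiled language (C and Fortran are used).\n",
  372. "\n",
  373. "Using the `dtype` (data type) attribute of an `ndarray`, we can see what type the data of an array has.\n"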
  374. ]
  375. },
  376. {
  377. "cell_type": "code",
  378. "execution_count": 13,
  379. "metadata": {},
  380. "outputs": [
  381. {
  382. "data": {
  383. "text/plain": [
  384. "dtype('int64')"
  385. ]
  386. },
  387. "execution_count": 13,
  388. "metadata": {},
  389. "output_type": "execute_result"
  390. }
  391. ],
  392. "source": [
  393. "M.dtype"
  394. ]
  395. },
  396. {
  397. "cell_type": "markdown",
  398. "metadata": {},
  399. "source": [
  400. "If we try to assign a value that cannot be converted to the array's data type to an element of a numpy array, we get an error:"
  401. ]
  402. },
  403. {
  404. "cell_type": "code",
  405. "execution_count": 14,
  406. "metadata": {},
  407. "outputs": [
  408. {
  409. "ename": "ValueError",
  410. "evalue": "invalid literal for int() with base 10: 'hello'",
  411. "output_type": "error",
  412. "traceback": [
  413. "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
  414. "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
  415. "\u001b[0;32m<ipython-input-14-3eecc5e8509b>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mM\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"hello\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
  416. "\u001b[0;31mValueError\u001b[0m: invalid literal for int() with base 10: 'hello'"
  417. ]
  418. }
  419. ],
  420. "source": [
  421. "M[0,0,0] = \"hello\""
  422. ]
  423. },
  424. {
  425. "cell_type": "markdown",
  426. "metadata": {},
  427. "source": [
  428. "If we want, we can explicitly define the data type of the array we create using the `dtype` keyword argument:"
  429. ]
  430. },
  431. {
  432. "cell_type": "code",
  433. "execution_count": 15,
  434. "metadata": {},
  435. "outputs": [
  436. {
  437. "data": {
  438. "text/plain": [
  439. "array([[1.+0.j, 2.+0.j],\n",
  440. " [3.+0.j, 4.+0.j]])"
  441. ]
  442. },
  443. "execution_count": 15,
  444. "metadata": {},
  445. "output_type": "execute_result"
  446. }
  447. ],
  448. "source": [
  449. "M = np.array([[1, 2], [3, 4]], dtype=complex)\n",
  450. "\n",
  451. "M"
  452. ]
  453. },
  454. {
  455. "cell_type": "markdown",
  456. "metadata": {},
  457. "source": [
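  458. "Common data types that can be used with `dtype` are: `int`, `float`, `complex`, `bool`, `object`, etc.\n",
  459. "\n",
  460. "We can also explicitly define the bit size of the data types, for example: `int64`, `int16`, `float128`, `complex128` (`float128` is only available on some platforms)."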
  461. ]
  462. },
  463. {
  464. "cell_type": "markdown",
  465. "metadata": {},
  466. "source": [
  467. "### 2.2 Using array-generating functions"
  468. ]
  469. },
  470. {
  471. "cell_type": "markdown",
  472. "metadata": {},
  473. "source": [
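  474. "For larger arrays it is impractical to initialize the data manually using explicit Python lists. Instead we can use one of the many functions in `numpy` that generate arrays of different forms. Some of the more common ones are:"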
  475. ]
  476. },
  477. {
  478. "cell_type": "markdown",
  479. "metadata": {},
  480. "source": [
  481. "#### arange"
  482. ]
  483. },
  484. {
  485. "cell_type": "code",
  486. "execution_count": 16,
  487. "metadata": {},
  488. "outputs": [
  489. {
  490. "name": "stdout",
  491. "output_type": "stream",
  492. "text": [
  493. "[0 1 2 3 4 5 6 7 8 9]\n",
  494. "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n"
  495. ]
  496. }
  497. ],
  498. "source": [
  499. "# create a range\n",
  500. "\n",
  501. "x = np.arange(0, 10, 1) # arguments: start, stop, step\n",
  502. "y = range(0, 10, 1)\n",
  503. "print(x)\n",
  504. "print(list(y))"
  505. ]
  506. },
  507. {
  508. "cell_type": "code",
  509. "execution_count": 17,
  510. "metadata": {},
  511. "outputs": [
  512. {
  513. "data": {
  514. "text/plain": [
  515. "array([-1.00000000e+00, -9.00000000e-01, -8.00000000e-01, -7.00000000e-01,\n",
  516. " -6.00000000e-01, -5.00000000e-01, -4.00000000e-01, -3.00000000e-01,\n",
  517. " -2.00000000e-01, -1.00000000e-01, -2.22044605e-16, 1.00000000e-01,\n",
  518. " 2.00000000e-01, 3.00000000e-01, 4.00000000e-01, 5.00000000e-01,\n",
  519. " 6.00000000e-01, 7.00000000e-01, 8.00000000e-01, 9.00000000e-01])"
  520. ]
  521. },
  522. "execution_count": 17,
  523. "metadata": {},
  524. "output_type": "execute_result"
  525. }
  526. ],
  527. "source": [
  528. "x = np.arange(-1, 1, 0.1)\n",
  529. "\n",
  530. "x"
  531. ]
  532. },
  533. {
  534. "cell_type": "markdown",
  535. "metadata": {},
  536. "source": [
  537. "#### linspace and logspace"
  538. ]
  539. },
  540. {
  541. "cell_type": "code",
  542. "execution_count": 18,
  543. "metadata": {},
  544. "outputs": [
  545. {
  546. "data": {
  547. "text/plain": [
  548. "array([ 0. , 2.5, 5. , 7.5, 10. ])"
  549. ]
  550. },
  551. "execution_count": 18,
  552. "metadata": {},
  553. "output_type": "execute_result"
  554. }
  555. ],
  556. "source": [
  557. "# with linspace, both end points are included\n",
  558. "np.linspace(0, 10, 5)"
  559. ]
  560. },
  561. {
  562. "cell_type": "code",
  563. "execution_count": 19,
  564. "metadata": {},
  565. "outputs": [
  566. {
  567. "data": {
  568. "text/plain": [
  569. "array([1.00000000e+00, 3.03773178e+00, 9.22781435e+00, 2.80316249e+01,\n",
  570. " 8.51525577e+01, 2.58670631e+02, 7.85771994e+02, 2.38696456e+03,\n",
  571. " 7.25095809e+03, 2.20264658e+04])"
  572. ]
  573. },
  574. "execution_count": 19,
  575. "metadata": {},
  576. "output_type": "execute_result"
  577. }
  578. ],
  579. "source": [
  580. "np.logspace(0, 10, 10, base=np.e)"
  581. ]
  582. },
  583. {
  584. "cell_type": "markdown",
  585. "metadata": {},
  586. "source": [
  587. "#### mgrid"
  588. ]
  589. },
  590. {
  591. "cell_type": "code",
  592. "execution_count": 20,
  593. "metadata": {},
  594. "outputs": [],
  595. "source": [
  596. "y, x = np.mgrid[0:5, 0:5] # similar to meshgrid in MATLAB"
  597. ]
  598. },
  599. {
  600. "cell_type": "code",
  601. "execution_count": 21,
  602. "metadata": {},
  603. "outputs": [
  604. {
  605. "data": {
  606. "text/plain": [
  607. "array([[0, 1, 2, 3, 4],\n",
  608. " [0, 1, 2, 3, 4],\n",
  609. " [0, 1, 2, 3, 4],\n",
  610. " [0, 1, 2, 3, 4],\n",
  611. " [0, 1, 2, 3, 4]])"
  612. ]
  613. },
  614. "execution_count": 21,
  615. "metadata": {},
  616. "output_type": "execute_result"
  617. }
  618. ],
  619. "source": [
  620. "x"
  621. ]
  622. },
  623. {
  624. "cell_type": "code",
  625. "execution_count": 22,
  626. "metadata": {},
  627. "outputs": [
  628. {
  629. "data": {
  630. "text/plain": [
  631. "array([[0, 0, 0, 0, 0],\n",
  632. " [1, 1, 1, 1, 1],\n",
  633. " [2, 2, 2, 2, 2],\n",
  634. " [3, 3, 3, 3, 3],\n",
  635. " [4, 4, 4, 4, 4]])"
  636. ]
  637. },
  638. "execution_count": 22,
  639. "metadata": {},
  640. "output_type": "execute_result"
  641. }
  642. ],
  643. "source": [
  644. "y"
  645. ]
  646. },
  647. {
  648. "cell_type": "markdown",
  649. "metadata": {},
  650. "source": [
  651. "#### random data"
  652. ]
  653. },
  654. {
  655. "cell_type": "code",
  656. "execution_count": 23,
  657. "metadata": {},
  658. "outputs": [],
  659. "source": [
  660. "from numpy import random"
  661. ]
  662. },
  663. {
  664. "cell_type": "code",
  665. "execution_count": 24,
  666. "metadata": {},
  667. "outputs": [
  668. {
  669. "data": {
  670. "text/plain": [
  671. "array([[[0.57397454, 0.12434228],\n",
  672. " [0.74835474, 0.01034541],\n",
  673. " [0.91383579, 0.02807574],\n",
  674. " [0.14217509, 0.64698341]],\n",
  675. "\n",
  676. " [[0.65606545, 0.84787378],\n",
  677. " [0.31064031, 0.70205451],\n",
  678. " [0.30486756, 0.34702889],\n",
  679. " [0.47537986, 0.91154076]],\n",
  680. "\n",
  681. " [[0.32192343, 0.77700745],\n",
  682. " [0.80485914, 0.85919158],\n",
  683. " [0.29751565, 0.27228179],\n",
  684. " [0.57796668, 0.18255467]],\n",
  685. "\n",
  686. " [[0.50020698, 0.58134695],\n",
  687. " [0.14200095, 0.97556272],\n",
  688. " [0.32948647, 0.35170435],\n",
  689. " [0.27768833, 0.75059373]],\n",
  690. "\n",
  691. " [[0.23972627, 0.08461662],\n",
  692. " [0.1929383 , 0.80565903],\n",
  693. " [0.2627892 , 0.73361884],\n",
  694. " [0.18415944, 0.44976198]]])"
  695. ]
  696. },
  697. "execution_count": 24,
  698. "metadata": {},
  699. "output_type": "execute_result"
  700. }
  701. ],
  702. "source": [
  703. "# uniform random numbers in the interval [0,1)\n",
  704. "random.rand(5,4,2)"
  705. ]
  706. },
  707. {
  708. "cell_type": "code",
  709. "execution_count": 25,
  710. "metadata": {},
  711. "outputs": [
  712. {
  713. "data": {
  714. "text/plain": [
  715. "array([[-1.74300737, 1.94689131, 0.18922227, -0.20440928],\n",
  716. " [ 1.31664152, -0.01176745, -0.43956951, 0.53571291],\n",
  717. " [ 0.02140654, -0.09635041, -1.84205831, 0.64951045],\n",
  718. " [ 0.35682903, 0.96657395, -0.50099255, -0.80044681]])"
  719. ]
  720. },
  721. "execution_count": 25,
  722. "metadata": {},
  723. "output_type": "execute_result"
  724. }
  725. ],
  726. "source": [
  727. "# standard normally distributed random numbers\n",
  728. "random.randn(4,4)"
  729. ]
  730. },
  731. {
  732. "cell_type": "markdown",
  733. "metadata": {},
  734. "source": [
  735. "#### diag"
  736. ]
  737. },
  738. {
  739. "cell_type": "code",
  740. "execution_count": 26,
  741. "metadata": {},
  742. "outputs": [
  743. {
  744. "data": {
  745. "text/plain": [
  746. "array([[1, 0, 0],\n",
  747. " [0, 2, 0],\n",
  748. " [0, 0, 3]])"
  749. ]
  750. },
  751. "execution_count": 26,
  752. "metadata": {},
  753. "output_type": "execute_result"
  754. }
  755. ],
  756. "source": [
  757. "# a diagonal matrix\n",
  758. "np.diag([1,2,3])"
  759. ]
  760. },
  761. {
  762. "cell_type": "code",
  763. "execution_count": 27,
  764. "metadata": {},
  765. "outputs": [
  766. {
  767. "data": {
  768. "text/plain": [
  769. "array([[0, 0, 0, 0],\n",
  770. " [1, 0, 0, 0],\n",
  771. " [0, 2, 0, 0],\n",
  772. " [0, 0, 3, 0]])"
  773. ]
  774. },
  775. "execution_count": 27,
  776. "metadata": {},
  777. "output_type": "execute_result"
  778. }
  779. ],
  780. "source": [
  781. "# a diagonal offset from the main diagonal\n",
  782. "np.diag([1,2,3], k=-1) "
  783. ]
  784. },
  785. {
  786. "cell_type": "markdown",
  787. "metadata": {},
  788. "source": [
  789. "#### zeros and ones"
  790. ]
  791. },
  792. {
  793. "cell_type": "code",
  794. "execution_count": 28,
  795. "metadata": {},
  796. "outputs": [
  797. {
  798. "data": {
  799. "text/plain": [
  800. "array([[0., 0., 0.],\n",
  801. " [0., 0., 0.],\n",
  802. " [0., 0., 0.]])"
  803. ]
  804. },
  805. "execution_count": 28,
  806. "metadata": {},
  807. "output_type": "execute_result"
  808. }
  809. ],
  810. "source": [
  811. "np.zeros((3,3))"
  812. ]
  813. },
  814. {
  815. "cell_type": "code",
  816. "execution_count": 29,
  817. "metadata": {},
  818. "outputs": [
  819. {
  820. "data": {
  821. "text/plain": [
  822. "array([[1., 1., 1.],\n",
  823. " [1., 1., 1.],\n",
  824. " [1., 1., 1.]])"
  825. ]
  826. },
  827. "execution_count": 29,
  828. "metadata": {},
  829. "output_type": "execute_result"
  830. }
  831. ],
  832. "source": [
  833. "np.ones((3,3))"
  834. ]
  835. },
  836. {
  837. "cell_type": "markdown",
  838. "metadata": {},
  839. "source": [
  840. "## 3. File I/O"
  841. ]
  842. },
  843. {
  844. "cell_type": "markdown",
  845. "metadata": {},
  846. "source": [
  847. "### 3.1 Comma-separated values (CSV)"
  848. ]
  849. },
  850. {
  851. "cell_type": "markdown",
  852. "metadata": {},
  853. "source": [
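  854. "A very common file format for data files is comma-separated values (CSV), or related formats such as TSV (tab-separated values). To read data from such files into NumPy arrays we can use the `numpy.genfromtxt` function. For example:"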
  855. ]
  856. },
  857. {
  858. "cell_type": "code",
  859. "execution_count": 30,
  860. "metadata": {},
  861. "outputs": [
  862. {
  863. "name": "stdout",
  864. "output_type": "stream",
  865. "text": [
  866. "1800 1 1 -6.1 -6.1 -6.1 1\r\n",
  867. "1800 1 2 -15.4 -15.4 -15.4 1\r\n",
  868. "1800 1 3 -15.0 -15.0 -15.0 1\r\n",
  869. "1800 1 4 -19.3 -19.3 -19.3 1\r\n",
  870. "1800 1 5 -16.8 -16.8 -16.8 1\r\n",
  871. "1800 1 6 -11.4 -11.4 -11.4 1\r\n",
  872. "1800 1 7 -7.6 -7.6 -7.6 1\r\n",
  873. "1800 1 8 -7.1 -7.1 -7.1 1\r\n",
  874. "1800 1 9 -10.1 -10.1 -10.1 1\r\n",
  875. "1800 1 10 -9.5 -9.5 -9.5 1\r\n"
  876. ]
  877. }
  878. ],
  879. "source": [
  880. "!head stockholm_td_adj.dat"
  881. ]
  882. },
  883. {
  884. "cell_type": "code",
  885. "execution_count": 31,
  886. "metadata": {},
  887. "outputs": [],
  888. "source": [
  889. "import numpy as np\n",
  890. "\n",
  891. "data = np.genfromtxt('stockholm_td_adj.dat')"
  892. ]
  893. },
  894. {
  895. "cell_type": "code",
  896. "execution_count": 32,
  897. "metadata": {},
  898. "outputs": [
  899. {
  900. "data": {
  901. "text/plain": [
  902. "(77431, 7)"
  903. ]
  904. },
  905. "execution_count": 32,
  906. "metadata": {},
  907. "output_type": "execute_result"
  908. }
  909. ],
  910. "source": [
  911. "data.shape"
  912. ]
  913. },
  914. {
  915. "cell_type": "code",
  916. "execution_count": 33,
  917. "metadata": {},
  918. "outputs": [
  919. {
  920. "data": {
  921. "image/png": "\n",
  922. "text/plain": [
  923. "<Figure size 1008x288 with 1 Axes>"
  924. ]
  925. },
  926. "metadata": {
  927. "needs_background": "light"
  928. },
  929. "output_type": "display_data"
  930. }
  931. ],
  932. "source": [
  933. "%matplotlib inline\n",
  934. "import matplotlib.pyplot as plt\n",
  935. "\n",
  936. "fig, ax = plt.subplots(figsize=(14,4))\n",
  937. "ax.plot(data[:,0]+data[:,1]/12.0+data[:,2]/365, data[:,5])\n",
  938. "ax.axis('tight')\n",
  939. "ax.set_title('temperatures in Stockholm')\n",
  940. "ax.set_xlabel('year')\n",
  941. "ax.set_ylabel('temperature (C)');"
  942. ]
  943. },
  944. {
  945. "cell_type": "markdown",
  946. "metadata": {},
  947. "source": [
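  948. "Using `numpy.savetxt` we can store a NumPy array in a text file in CSV-like format (by default the values are separated by spaces; pass `delimiter=','` for comma-separated output):"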
  949. ]
  950. },
  951. {
  952. "cell_type": "code",
  953. "execution_count": 34,
  954. "metadata": {},
  955. "outputs": [
  956. {
  957. "data": {
  958. "text/plain": [
  959. "array([[0.34743109, 0.34666094, 0.67796236],\n",
  960. " [0.37775535, 0.7452935 , 0.44639271],\n",
  961. " [0.7097024 , 0.54721637, 0.96400871]])"
  962. ]
  963. },
  964. "execution_count": 34,
  965. "metadata": {},
  966. "output_type": "execute_result"
  967. }
  968. ],
  969. "source": [
  970. "M = np.random.rand(3,3)\n",
  971. "\n",
  972. "M"
  973. ]
  974. },
  975. {
  976. "cell_type": "code",
  977. "execution_count": 35,
  978. "metadata": {},
  979. "outputs": [],
  980. "source": [
  981. "np.savetxt(\"random-matrix.csv\", M)"
  982. ]
  983. },
  984. {
  985. "cell_type": "code",
  986. "execution_count": 36,
  987. "metadata": {},
  988. "outputs": [
  989. {
  990. "name": "stdout",
  991. "output_type": "stream",
  992. "text": [
  993. "3.474310879390657414e-01 3.466609365910759966e-01 6.779623624489031775e-01\r\n",
  994. "3.777553531256817587e-01 7.452935047749419395e-01 4.463927097637667707e-01\r\n",
  995. "7.097023968559375007e-01 5.472163711854115542e-01 9.640087120207403437e-01\r\n"
  996. ]
  997. }
  998. ],
  999. "source": [
  1000. "!cat random-matrix.csv"
  1001. ]
  1002. },
  1003. {
  1004. "cell_type": "code",
  1005. "execution_count": 37,
  1006. "metadata": {},
  1007. "outputs": [
  1008. {
  1009. "name": "stdout",
  1010. "output_type": "stream",
  1011. "text": [
  1012. "0.34743 0.34666 0.67796\r\n",
  1013. "0.37776 0.74529 0.44639\r\n",
  1014. "0.70970 0.54722 0.96401\r\n"
  1015. ]
  1016. }
  1017. ],
  1018. "source": [
  1019. "np.savetxt(\"random-matrix.csv\", M, fmt='%.5f') # fmt specifies the output format\n",
  1020. "\n",
  1021. "!cat random-matrix.csv"
  1022. ]
  1023. },
  1024. {
  1025. "cell_type": "markdown",
  1026. "metadata": {},
  1027. "source": [
  1028. "### 3.2 NumPy's native file format"
  1029. ]
  1030. },
  1031. {
  1032. "cell_type": "markdown",
  1033. "metadata": {},
  1034. "source": [
  1035. "This format is useful for storing numpy array data and reading it back. Use the functions `numpy.save` and `numpy.load`:"
  1036. ]
  1037. },
  1038. {
  1039. "cell_type": "code",
  1040. "execution_count": 38,
  1041. "metadata": {},
  1042. "outputs": [
  1043. {
  1044. "name": "stdout",
  1045. "output_type": "stream",
  1046. "text": [
  1047. "random-matrix.npy: NumPy array, version 1.0, header length 118\r\n"
  1048. ]
  1049. }
  1050. ],
  1051. "source": [
  1052. "np.save(\"random-matrix.npy\", M)\n",
  1053. "\n",
  1054. "!file random-matrix.npy"
  1055. ]
  1056. },
  1057. {
  1058. "cell_type": "code",
  1059. "execution_count": 39,
  1060. "metadata": {},
  1061. "outputs": [
  1062. {
  1063. "data": {
  1064. "text/plain": [
  1065. "array([[0.34743109, 0.34666094, 0.67796236],\n",
  1066. " [0.37775535, 0.7452935 , 0.44639271],\n",
  1067. " [0.7097024 , 0.54721637, 0.96400871]])"
  1068. ]
  1069. },
  1070. "execution_count": 39,
  1071. "metadata": {},
  1072. "output_type": "execute_result"
  1073. }
  1074. ],
  1075. "source": [
  1076. "np.load(\"random-matrix.npy\")"
  1077. ]
  1078. },
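The save/load round trip above can be checked with a short sketch. It assumes a freshly generated random matrix and a temporary directory (both chosen here for illustration, not taken from the notebook) so that no file is left behind:

```python
import os
import tempfile

import numpy as np

M = np.random.rand(3, 3)

# Write into a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "random-matrix.npy")  # illustrative path
    np.save(path, M)
    restored = np.load(path)

# The binary .npy format stores dtype and shape along with the raw values,
# so the array comes back unchanged.
print(restored.dtype, restored.shape)
print(np.array_equal(M, restored))
```

Unlike the `savetxt` call with `fmt='%.5f'`, which rounds the values when writing text, the `.npy` round trip is expected to be lossless.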
  1079. {
  1080. "cell_type": "markdown",
  1081. "metadata": {},
  1082. "source": [
  1083. "## 4. More properties of NumPy arrays"
  1084. ]
  1085. },
  1086. {
  1087. "cell_type": "code",
  1088. "execution_count": 40,
  1089. "metadata": {},
  1090. "outputs": [
  1091. {
  1092. "name": "stdout",
  1093. "output_type": "stream",
  1094. "text": [
  1095. "int64\n",
  1096. "8\n"
  1097. ]
  1098. }
  1099. ],
  1100. "source": [
  1101. "M = np.array([[1, 2], [3, 4], [5, 6]])\n",
  1102. "\n",
  1103. "print(M.dtype)\n",
  1104. "print(M.itemsize) # bytes per element\n"
  1105. ]
  1106. },
  1107. {
  1108. "cell_type": "code",
  1109. "execution_count": 41,
  1110. "metadata": {},
  1111. "outputs": [
  1112. {
  1113. "data": {
  1114. "text/plain": [
  1115. "48"
  1116. ]
  1117. },
  1118. "execution_count": 41,
  1119. "metadata": {},
  1120. "output_type": "execute_result"
  1121. }
  1122. ],
  1123. "source": [
  1124. "M.nbytes # total number of bytes"
  1125. ]
  1126. },
  1127. {
  1128. "cell_type": "code",
  1129. "execution_count": 42,
  1130. "metadata": {},
  1131. "outputs": [
  1132. {
  1133. "data": {
  1134. "text/plain": [
  1135. "2"
  1136. ]
  1137. },
  1138. "execution_count": 42,
  1139. "metadata": {},
  1140. "output_type": "execute_result"
  1141. }
  1142. ],
  1143. "source": [
  1144. "M.ndim # number of dimensions"
  1145. ]
  1146. },
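The three attributes above are related: the total byte count is the number of elements times the size of one element. A minimal sketch, with the dtype fixed to `int64` explicitly (an assumption made here so the numbers do not depend on the platform's default integer type):

```python
import numpy as np

M = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int64)

# size is the number of elements, itemsize the bytes per element.
total_bytes = M.size * M.itemsize

print(M.size, M.itemsize, total_bytes)
print(total_bytes == M.nbytes)
```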
  1147. {
  1148. "cell_type": "markdown",
  1149. "metadata": {},
  1150. "source": [
  1151. "## 5. Manipulating arrays"
  1152. ]
  1153. },
  1154. {
  1155. "cell_type": "markdown",
  1156. "metadata": {},
  1157. "source": [
  1158. "### 5.1 Indexing"
  1159. ]
  1160. },
  1161. {
  1162. "cell_type": "markdown",
  1163. "metadata": {},
  1164. "source": [
  1165. "We can index elements of an array using square brackets and indices:"
  1166. ]
  1167. },
  1168. {
  1169. "cell_type": "code",
  1170. "execution_count": 43,
  1171. "metadata": {},
  1172. "outputs": [
  1173. {
  1174. "data": {
  1175. "text/plain": [
  1176. "1"
  1177. ]
  1178. },
  1179. "execution_count": 43,
  1180. "metadata": {},
  1181. "output_type": "execute_result"
  1182. }
  1183. ],
  1184. "source": [
  1185. "v = np.array([1, 2, 3, 4, 5])\n",
  1186. "\n",
  1187. "# v is a vector with a single dimension, so it takes one index\n",
  1188. "v[0]"
  1189. ]
  1190. },
  1191. {
  1192. "cell_type": "code",
  1193. "execution_count": 44,
  1194. "metadata": {},
  1195. "outputs": [
  1196. {
  1197. "name": "stdout",
  1198. "output_type": "stream",
  1199. "text": [
  1200. "4\n",
  1201. "4\n",
  1202. "[3 4]\n"
  1203. ]
  1204. }
  1205. ],
  1206. "source": [
  1207. "# M is a matrix (a two-dimensional array), so it takes two indices\n",
  1208. "print(M[1,1])\n",
  1209. "print(M[1][1])\n",
  1210. "print(M[1])"
  1211. ]
  1212. },
  1213. {
  1214. "cell_type": "markdown",
  1215. "metadata": {},
  1216. "source": [
  1217. "If we omit an index of a multidimensional array, it returns the whole row (or, in general, an (N-1)-dimensional array):"
  1218. ]
  1219. },
  1220. {
  1221. "cell_type": "code",
  1222. "execution_count": 45,
  1223. "metadata": {},
  1224. "outputs": [
  1225. {
  1226. "data": {
  1227. "text/plain": [
  1228. "array([[1, 2],\n",
  1229. " [3, 4],\n",
  1230. " [5, 6]])"
  1231. ]
  1232. },
  1233. "execution_count": 45,
  1234. "metadata": {},
  1235. "output_type": "execute_result"
  1236. }
  1237. ],
  1238. "source": [
  1239. "M"
  1240. ]
  1241. },
  1242. {
  1243. "cell_type": "code",
  1244. "execution_count": 46,
  1245. "metadata": {},
  1246. "outputs": [
  1247. {
  1248. "data": {
  1249. "text/plain": [
  1250. "array([3, 4])"
  1251. ]
  1252. },
  1253. "execution_count": 46,
  1254. "metadata": {},
  1255. "output_type": "execute_result"
  1256. }
  1257. ],
  1258. "source": [
  1259. "M[1]"
  1260. ]
  1261. },
  1262. {
  1263. "cell_type": "markdown",
  1264. "metadata": {},
  1265. "source": [
  1266. "The same thing can be achieved by using `:` in place of an index:"
  1267. ]
  1268. },
  1269. {
  1270. "cell_type": "code",
  1271. "execution_count": 47,
  1272. "metadata": {},
  1273. "outputs": [
  1274. {
  1275. "data": {
  1276. "text/plain": [
  1277. "array([3, 4])"
  1278. ]
  1279. },
  1280. "execution_count": 47,
  1281. "metadata": {},
  1282. "output_type": "execute_result"
  1283. }
  1284. ],
  1285. "source": [
  1286. "M[1,:] # row 1"
  1287. ]
  1288. },
  1289. {
  1290. "cell_type": "code",
  1291. "execution_count": 48,
  1292. "metadata": {},
  1293. "outputs": [
  1294. {
  1295. "data": {
  1296. "text/plain": [
  1297. "array([2, 4, 6])"
  1298. ]
  1299. },
  1300. "execution_count": 48,
  1301. "metadata": {},
  1302. "output_type": "execute_result"
  1303. }
  1304. ],
  1305. "source": [
  1306. "M[:,1] # column 1"
  1307. ]
  1308. },
  1309. {
  1310. "cell_type": "markdown",
  1311. "metadata": {},
  1312. "source": [
  1313. "We can assign new values to elements of an array using indexing:"
  1314. ]
  1315. },
  1316. {
  1317. "cell_type": "code",
  1318. "execution_count": 49,
  1319. "metadata": {},
  1320. "outputs": [],
  1321. "source": [
  1322. "M[0,0] = 1"
  1323. ]
  1324. },
  1325. {
  1326. "cell_type": "code",
  1327. "execution_count": 50,
  1328. "metadata": {},
  1329. "outputs": [
  1330. {
  1331. "data": {
  1332. "text/plain": [
  1333. "array([[1, 2],\n",
  1334. " [3, 4],\n",
  1335. " [5, 6]])"
  1336. ]
  1337. },
  1338. "execution_count": 50,
  1339. "metadata": {},
  1340. "output_type": "execute_result"
  1341. }
  1342. ],
  1343. "source": [
  1344. "M"
  1345. ]
  1346. },
  1347. {
  1348. "cell_type": "code",
  1349. "execution_count": 51,
  1350. "metadata": {},
  1351. "outputs": [],
  1352. "source": [
  1353. "# this also works for whole rows and columns\n",
  1354. "M[1,:] = 0\n",
  1355. "M[:,1] = -1"
  1356. ]
  1357. },
  1358. {
  1359. "cell_type": "code",
  1360. "execution_count": 52,
  1361. "metadata": {},
  1362. "outputs": [
  1363. {
  1364. "data": {
  1365. "text/plain": [
  1366. "array([[ 1, -1],\n",
  1367. " [ 0, -1],\n",
  1368. " [ 5, -1]])"
  1369. ]
  1370. },
  1371. "execution_count": 52,
  1372. "metadata": {},
  1373. "output_type": "execute_result"
  1374. }
  1375. ],
  1376. "source": [
  1377. "M"
  1378. ]
  1379. },
  1380. {
  1381. "cell_type": "markdown",
  1382. "metadata": {},
  1383. "source": [
  1384. "### 5.2 Index slicing"
  1385. ]
  1386. },
  1387. {
  1388. "cell_type": "markdown",
  1389. "metadata": {},
  1390. "source": [
  1391. "Index slicing is the technical name for the syntax `M[lower:upper:step]`, which extracts part of an array:"
  1392. ]
  1393. },
  1394. {
  1395. "cell_type": "code",
  1396. "execution_count": 53,
  1397. "metadata": {},
  1398. "outputs": [
  1399. {
  1400. "data": {
  1401. "text/plain": [
  1402. "array([1, 2, 3, 4, 5])"
  1403. ]
  1404. },
  1405. "execution_count": 53,
  1406. "metadata": {},
  1407. "output_type": "execute_result"
  1408. }
  1409. ],
  1410. "source": [
  1411. "A = np.array([1,2,3,4,5])\n",
  1412. "A"
  1413. ]
  1414. },
  1415. {
  1416. "cell_type": "code",
  1417. "execution_count": 54,
  1418. "metadata": {},
  1419. "outputs": [
  1420. {
  1421. "data": {
  1422. "text/plain": [
  1423. "array([2, 3])"
  1424. ]
  1425. },
  1426. "execution_count": 54,
  1427. "metadata": {},
  1428. "output_type": "execute_result"
  1429. }
  1430. ],
  1431. "source": [
  1432. "A[1:3]"
  1433. ]
  1434. },
  1435. {
  1436. "cell_type": "markdown",
  1437. "metadata": {},
  1438. "source": [
  1439. "Array slices are *mutable*: if a slice is assigned a new value, the original array from which the slice was extracted is modified:"
  1440. ]
  1441. },
  1442. {
  1443. "cell_type": "code",
  1444. "execution_count": 55,
  1445. "metadata": {},
  1446. "outputs": [
  1447. {
  1448. "data": {
  1449. "text/plain": [
  1450. "array([ 1, -2, -3, 4, 5])"
  1451. ]
  1452. },
  1453. "execution_count": 55,
  1454. "metadata": {},
  1455. "output_type": "execute_result"
  1456. }
  1457. ],
  1458. "source": [
  1459. "A[1:3] = [-2,-3] # auto convert type\n",
  1460. "A[1:3] = np.array([-2, -3]) \n",
  1461. "\n",
  1462. "A"
  1463. ]
  1464. },
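The behaviour above follows from basic slices being views onto the original data. The sketch below contrasts a view with an explicit copy; the variable names `view` and `independent` are illustrative and not part of the notebook:

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])

view = A[1:3]        # basic slicing returns a view onto A's data
view[:] = [-2, -3]   # writing through the view changes A as well

independent = A[1:3].copy()  # an explicit copy owns its own data
independent[:] = 0           # A is not affected by this assignment

print(A)
print(np.shares_memory(A, view), np.shares_memory(A, independent))
```

Calling `.copy()` is the usual way to get a slice that can be modified without touching the source array.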
  1465. {
  1466. "cell_type": "markdown",
  1467. "metadata": {},
  1468. "source": [
  1469. "We can omit any of the three parameters in `M[lower:upper:step]`:"
  1470. ]
  1471. },
  1472. {
  1473. "cell_type": "code",
  1474. "execution_count": 56,
  1475. "metadata": {},
  1476. "outputs": [
  1477. {
  1478. "data": {
  1479. "text/plain": [
  1480. "array([ 1, -2, -3, 4, 5])"
  1481. ]
  1482. },
  1483. "execution_count": 56,
  1484. "metadata": {},
  1485. "output_type": "execute_result"
  1486. }
  1487. ],
  1488. "source": [
  1489. "A[::] # lower, upper and step all take their default values"
  1490. ]
  1491. },
  1492. {
  1493. "cell_type": "code",
  1494. "execution_count": 57,
  1495. "metadata": {},
  1496. "outputs": [
  1497. {
  1498. "data": {
  1499. "text/plain": [
  1500. "array([ 1, -2, -3, 4, 5])"
  1501. ]
  1502. },
  1503. "execution_count": 57,
  1504. "metadata": {},
  1505. "output_type": "execute_result"
  1506. }
  1507. ],
  1508. "source": [
  1509. "A[:]"
  1510. ]
  1511. },
  1512. {
  1513. "cell_type": "code",
  1514. "execution_count": 58,
  1515. "metadata": {},
  1516. "outputs": [
  1517. {
  1518. "data": {
  1519. "text/plain": [
  1520. "array([ 1, -3, 5])"
  1521. ]
  1522. },
  1523. "execution_count": 58,
  1524. "metadata": {},
  1525. "output_type": "execute_result"
  1526. }
  1527. ],
  1528. "source": [
  1529. "A[::2] # step is 2; lower and upper default to the beginning and end of the array"
  1530. ]
  1531. },
  1532. {
  1533. "cell_type": "code",
  1534. "execution_count": 59,
  1535. "metadata": {},
  1536. "outputs": [
  1537. {
  1538. "data": {
  1539. "text/plain": [
  1540. "array([ 1, -2, -3])"
  1541. ]
  1542. },
  1543. "execution_count": 59,
  1544. "metadata": {},
  1545. "output_type": "execute_result"
  1546. }
  1547. ],
  1548. "source": [
  1549. "A[:3] # first three elements"
  1550. ]
  1551. },
  1552. {
  1553. "cell_type": "code",
  1554. "execution_count": 60,
  1555. "metadata": {},
  1556. "outputs": [
  1557. {
  1558. "data": {
  1559. "text/plain": [
  1560. "array([4, 5])"
  1561. ]
  1562. },
  1563. "execution_count": 60,
  1564. "metadata": {},
  1565. "output_type": "execute_result"
  1566. }
  1567. ],
  1568. "source": [
  1569. "A[3:] # elements from index 3 onward"
  1570. ]
  1571. },
  1572. {
  1573. "cell_type": "markdown",
  1574. "metadata": {},
  1575. "source": [
  1576. "Negative indices count from the end of the array (positive indices count from the beginning):"
  1577. ]
  1578. },
  1579. {
  1580. "cell_type": "code",
  1581. "execution_count": 61,
  1582. "metadata": {},
  1583. "outputs": [],
  1584. "source": [
  1585. "A = np.array([1,2,3,4,5])"
  1586. ]
  1587. },
  1588. {
  1589. "cell_type": "code",
  1590. "execution_count": 62,
  1591. "metadata": {},
  1592. "outputs": [
  1593. {
  1594. "data": {
  1595. "text/plain": [
  1596. "5"
  1597. ]
  1598. },
  1599. "execution_count": 62,
  1600. "metadata": {},
  1601. "output_type": "execute_result"
  1602. }
  1603. ],
  1604. "source": [
  1605. "A[-1] # the last element in the array"
  1606. ]
  1607. },
  1608. {
  1609. "cell_type": "code",
  1610. "execution_count": 63,
  1611. "metadata": {},
  1612. "outputs": [
  1613. {
  1614. "data": {
  1615. "text/plain": [
  1616. "array([3, 4, 5])"
  1617. ]
  1618. },
  1619. "execution_count": 63,
  1620. "metadata": {},
  1621. "output_type": "execute_result"
  1622. }
  1623. ],
  1624. "source": [
  1625. "A[-3:] # the last three elements"
  1626. ]
  1627. },
  1628. {
  1629. "cell_type": "markdown",
  1630. "metadata": {},
  1631. "source": [
  1632. "Index slicing works exactly the same way for multidimensional arrays:"
  1633. ]
  1634. },
  1635. {
  1636. "cell_type": "code",
  1637. "execution_count": 64,
  1638. "metadata": {},
  1639. "outputs": [
  1640. {
  1641. "data": {
  1642. "text/plain": [
  1643. "array([[ 0, 1, 2, 3, 4],\n",
  1644. " [10, 11, 12, 13, 14],\n",
  1645. " [20, 21, 22, 23, 24],\n",
  1646. " [30, 31, 32, 33, 34],\n",
  1647. " [40, 41, 42, 43, 44]])"
  1648. ]
  1649. },
  1650. "execution_count": 64,
  1651. "metadata": {},
  1652. "output_type": "execute_result"
  1653. }
  1654. ],
  1655. "source": [
  1656. "A = np.array([[n+m*10 for n in range(5)] for m in range(5)])\n",
  1657. "\n",
  1658. "A"
  1659. ]
  1660. },
  1661. {
  1662. "cell_type": "code",
  1663. "execution_count": 65,
  1664. "metadata": {},
  1665. "outputs": [
  1666. {
  1667. "data": {
  1668. "text/plain": [
  1669. "array([[11, 12, 13],\n",
  1670. " [21, 22, 23],\n",
  1671. " [31, 32, 33]])"
  1672. ]
  1673. },
  1674. "execution_count": 65,
  1675. "metadata": {},
  1676. "output_type": "execute_result"
  1677. }
  1678. ],
  1679. "source": [
  1680. "# a block from the original array\n",
  1681. "A[1:4, 1:4]"
  1682. ]
  1683. },
  1684. {
  1685. "cell_type": "code",
  1686. "execution_count": 66,
  1687. "metadata": {},
  1688. "outputs": [
  1689. {
  1690. "data": {
  1691. "text/plain": [
  1692. "array([[ 0, 2, 4],\n",
  1693. " [20, 22, 24],\n",
  1694. " [40, 42, 44]])"
  1695. ]
  1696. },
  1697. "execution_count": 66,
  1698. "metadata": {},
  1699. "output_type": "execute_result"
  1700. }
  1701. ],
  1702. "source": [
  1703. "# strides\n",
  1704. "A[::2, ::2]"
  1705. ]
  1706. },
  1707. {
  1708. "cell_type": "markdown",
  1709. "metadata": {},
  1710. "source": [
  1711. "### 5.3 Fancy indexing"
  1712. ]
  1713. },
  1714. {
  1715. "cell_type": "markdown",
  1716. "metadata": {},
  1717. "source": [
  1718. "Fancy indexing is the name for using an array or a list in place of an index:"
  1719. ]
  1720. },
  1721. {
  1722. "cell_type": "code",
  1723. "execution_count": 67,
  1724. "metadata": {},
  1725. "outputs": [
  1726. {
  1727. "name": "stdout",
  1728. "output_type": "stream",
  1729. "text": [
  1730. "[[10 11 12 13 14]\n",
  1731. " [30 31 32 33 34]\n",
  1732. " [20 21 22 23 24]]\n",
  1733. "[[ 0 1 2 3 4]\n",
  1734. " [10 11 12 13 14]\n",
  1735. " [20 21 22 23 24]\n",
  1736. " [30 31 32 33 34]\n",
  1737. " [40 41 42 43 44]]\n"
  1738. ]
  1739. }
  1740. ],
  1741. "source": [
  1742. "A = np.array([[n+m*10 for n in range(5)] for m in range(5)])\n",
  1743. "\n",
  1744. "row_indices = [1, 3, 2]\n",
  1745. "print(A[row_indices])\n",
  1746. "print(A)"
  1747. ]
  1748. },
  1749. {
  1750. "cell_type": "code",
  1751. "execution_count": 68,
  1752. "metadata": {},
  1753. "outputs": [
  1754. {
  1755. "data": {
  1756. "text/plain": [
  1757. "array([11, 31, 24])"
  1758. ]
  1759. },
  1760. "execution_count": 68,
  1761. "metadata": {},
  1762. "output_type": "execute_result"
  1763. }
  1764. ],
  1765. "source": [
  1766. "col_indices = [1, 1, -1] # index -1 means the last element\n",
  1767. "A[row_indices, col_indices]"
  1768. ]
  1769. },
  1770. {
  1771. "cell_type": "markdown",
  1772. "metadata": {},
  1773. "source": [
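  1774. "We can also use index masks: if the index mask is a NumPy array of data type `bool`, an element is selected (True) or not (False) depending on the value of the mask at that element's position:"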
  1775. ]
  1776. },
  1777. {
  1778. "cell_type": "code",
  1779. "execution_count": 69,
  1780. "metadata": {},
  1781. "outputs": [
  1782. {
  1783. "data": {
  1784. "text/plain": [
  1785. "array([0, 1, 2, 3, 4])"
  1786. ]
  1787. },
  1788. "execution_count": 69,
  1789. "metadata": {},
  1790. "output_type": "execute_result"
  1791. }
  1792. ],
  1793. "source": [
  1794. "B = np.array([n for n in range(5)])\n",
  1795. "B"
  1796. ]
  1797. },
  1798. {
  1799. "cell_type": "code",
  1800. "execution_count": 70,
  1801. "metadata": {},
  1802. "outputs": [
  1803. {
  1804. "data": {
  1805. "text/plain": [
  1806. "array([0, 2])"
  1807. ]
  1808. },
  1809. "execution_count": 70,
  1810. "metadata": {},
  1811. "output_type": "execute_result"
  1812. }
  1813. ],
  1814. "source": [
  1815. "row_mask = np.array([True, False, True, False, False])\n",
  1816. "B[row_mask]"
  1817. ]
  1818. },
  1819. {
  1820. "cell_type": "code",
  1821. "execution_count": 71,
  1822. "metadata": {},
  1823. "outputs": [
  1824. {
  1825. "data": {
  1826. "text/plain": [
  1827. "array([0, 2])"
  1828. ]
  1829. },
  1830. "execution_count": 71,
  1831. "metadata": {},
  1832. "output_type": "execute_result"
  1833. }
  1834. ],
  1835. "source": [
  1836. "# same thing\n",
  1837. "row_mask = np.array([1,0,1,0,0], dtype=bool)\n",
  1838. "B[row_mask]"
  1839. ]
  1840. },
  1841. {
  1842. "cell_type": "markdown",
  1843. "metadata": {},
  1844. "source": [
  1845. "This feature is very useful for conditionally selecting elements from an array, for example using comparison operators:"
  1846. ]
  1847. },
  1848. {
  1849. "cell_type": "code",
  1850. "execution_count": 72,
  1851. "metadata": {},
  1852. "outputs": [
  1853. {
  1854. "data": {
  1855. "text/plain": [
  1856. "array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ,\n",
  1857. " 6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5])"
  1858. ]
  1859. },
  1860. "execution_count": 72,
  1861. "metadata": {},
  1862. "output_type": "execute_result"
  1863. }
  1864. ],
  1865. "source": [
  1866. "x = np.arange(0, 10, 0.5)\n",
  1867. "x"
  1868. ]
  1869. },
  1870. {
  1871. "cell_type": "code",
  1872. "execution_count": 73,
  1873. "metadata": {},
  1874. "outputs": [
  1875. {
  1876. "data": {
  1877. "text/plain": [
  1878. "array([False, False, False, False, False, False, False, False, False,\n",
  1879. " False, False, True, True, True, True, False, False, False,\n",
  1880. " False, False])"
  1881. ]
  1882. },
  1883. "execution_count": 73,
  1884. "metadata": {},
  1885. "output_type": "execute_result"
  1886. }
  1887. ],
  1888. "source": [
  1889. "mask = (5 < x) * (x < 7.5)\n",
  1890. "\n",
  1891. "mask"
  1892. ]
  1893. },
  1894. {
  1895. "cell_type": "code",
  1896. "execution_count": 74,
  1897. "metadata": {},
  1898. "outputs": [
  1899. {
  1900. "data": {
  1901. "text/plain": [
  1902. "array([5.5, 6. , 6.5, 7. ])"
  1903. ]
  1904. },
  1905. "execution_count": 74,
  1906. "metadata": {},
  1907. "output_type": "execute_result"
  1908. }
  1909. ],
  1910. "source": [
  1911. "x[mask]"
  1912. ]
  1913. },
  1914. {
  1915. "cell_type": "code",
  1916. "execution_count": 75,
  1917. "metadata": {},
  1918. "outputs": [
  1919. {
  1920. "data": {
  1921. "text/plain": [
  1922. "array([3.5, 4. , 4.5, 5. , 5.5])"
  1923. ]
  1924. },
  1925. "execution_count": 75,
  1926. "metadata": {},
  1927. "output_type": "execute_result"
  1928. }
  1929. ],
  1930. "source": [
  1931. "x[(3<x) * (x<6)]"
  1932. ]
  1933. },
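The cells above combine two comparisons with `*`, which works because the product of two boolean arrays is again boolean. A sketch of two equivalent and arguably more explicit spellings, the `&` operator and `np.logical_and`, is shown below. The names `mask_product`, `mask_and` and `mask_func` are illustrative:

```python
import numpy as np

x = np.arange(0, 10, 0.5)

mask_product = (5 < x) * (x < 7.5)          # product form used in the cells above
mask_and = (5 < x) & (x < 7.5)              # element-wise AND on boolean arrays
mask_func = np.logical_and(5 < x, x < 7.5)  # function form of the same operation

print(np.array_equal(mask_product, mask_and))
print(np.array_equal(mask_and, mask_func))
print(x[mask_and])
```

The parentheses around each comparison are required with `&`, because `&` binds more tightly than `<` in Python.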
  1934. {
  1935. "cell_type": "markdown",
  1936. "metadata": {},
  1937. "source": [
  1938. "## 6. Functions for extracting data from arrays and creating arrays"
  1939. ]
  1940. },
  1941. {
  1942. "cell_type": "markdown",
  1943. "metadata": {},
  1944. "source": [
  1945. "### 6.1 where"
  1946. ]
  1947. },
  1948. {
  1949. "cell_type": "markdown",
  1950. "metadata": {},
  1951. "source": [
  1952. "An index mask can be converted to position indices using the `where` function:"
  1953. ]
  1954. },
  1955. {
  1956. "cell_type": "code",
  1957. "execution_count": 76,
  1958. "metadata": {},
  1959. "outputs": [
  1960. {
  1961. "data": {
  1962. "text/plain": [
  1963. "(array([11, 12, 13, 14]),)"
  1964. ]
  1965. },
  1966. "execution_count": 76,
  1967. "metadata": {},
  1968. "output_type": "execute_result"
  1969. }
  1970. ],
  1971. "source": [
  1972. "x = np.arange(0, 10, 0.5)\n",
  1973. "mask = (5 < x) * (x < 7.5)\n",
  1974. "\n",
  1975. "indices = np.where(mask)\n",
  1976. "\n",
  1977. "indices"
  1978. ]
  1979. },
  1980. {
  1981. "cell_type": "code",
  1982. "execution_count": 77,
  1983. "metadata": {},
  1984. "outputs": [
  1985. {
  1986. "data": {
  1987. "text/plain": [
  1988. "array([5.5, 6. , 6.5, 7. ])"
  1989. ]
  1990. },
  1991. "execution_count": 77,
  1992. "metadata": {},
  1993. "output_type": "execute_result"
  1994. }
  1995. ],
  1996. "source": [
  1997. "x[indices] # this indexing is equivalent to the fancy indexing x[mask]"
  1998. ]
  1999. },
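As a complement to the `where` call above, the following sketch shows two related forms: `np.flatnonzero`, which returns a plain 1-D index array instead of a tuple, and the three-argument form of `np.where`, which selects between two values element by element. The names `flat` and `clipped` are illustrative:

```python
import numpy as np

x = np.arange(0, 10, 0.5)
mask = (5 < x) & (x < 7.5)

indices = np.where(mask)     # tuple holding one index array per dimension
flat = np.flatnonzero(mask)  # the same positions as a plain 1-D array

# Three-argument form: take x where the mask is True, otherwise 0.0.
clipped = np.where(mask, x, 0.0)

print(indices[0], flat)
print(clipped)
```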
  2000. {
  2001. "cell_type": "markdown",
  2002. "metadata": {},
  2003. "source": [
  2004. "### 6.2 diag"
  2005. ]
  2006. },
  2007. {
  2008. "cell_type": "markdown",
  2009. "metadata": {},
  2010. "source": [
  2011. "With the `diag` function we can also extract the diagonal and subdiagonals of an array:"
  2012. ]
  2013. },
  2014. {
  2015. "cell_type": "code",
  2016. "execution_count": 78,
  2017. "metadata": {},
  2018. "outputs": [
  2019. {
  2020. "data": {
  2021. "text/plain": [
  2022. "array([ 0, 11, 22, 33, 44])"
  2023. ]
  2024. },
  2025. "execution_count": 78,
  2026. "metadata": {},
  2027. "output_type": "execute_result"
  2028. }
  2029. ],
  2030. "source": [
  2031. "np.diag(A)"
  2032. ]
  2033. },
  2034. {
  2035. "cell_type": "code",
  2036. "execution_count": 79,
  2037. "metadata": {},
  2038. "outputs": [
  2039. {
  2040. "data": {
  2041. "text/plain": [
  2042. "array([10, 21, 32, 43])"
  2043. ]
  2044. },
  2045. "execution_count": 79,
  2046. "metadata": {},
  2047. "output_type": "execute_result"
  2048. }
  2049. ],
  2050. "source": [
  2051. "np.diag(A, -1)"
  2052. ]
  2053. },
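A short sketch of how the second argument of `np.diag` selects a diagonal, and of the reverse use in which a 1-D input builds a diagonal matrix. The names `main`, `upper`, `lower` and `D` are illustrative:

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

main = np.diag(A)       # k = 0: the main diagonal
upper = np.diag(A, 1)   # k > 0: a diagonal above the main one
lower = np.diag(A, -1)  # k < 0: a diagonal below the main one

# Given a 1-D input, diag builds a square matrix with it on the diagonal.
D = np.diag([1, 2, 3])

print(main, upper, lower)
print(D)
```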
  2054. {
  2055. "cell_type": "markdown",
  2056. "metadata": {},
  2057. "source": [
  2058. "## 7. Linear algebra"
  2059. ]
  2060. },
  2061. {
  2062. "cell_type": "markdown",
  2063. "metadata": {},
  2064. "source": [
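  2065. "Vectorizing code is the key to writing efficient numerical calculations with Python/NumPy. This means that as much of a program as possible should be expressed in terms of matrix and vector operations, such as matrix-matrix multiplication."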
  2066. ]
  2067. },
  2068. {
  2069. "cell_type": "markdown",
  2070. "metadata": {},
  2071. "source": [
  2072. "### 7.1 Scalar-array operations"
  2073. ]
  2074. },
  2075. {
  2076. "cell_type": "markdown",
  2077. "metadata": {},
  2078. "source": [
  2079. "We can use the usual arithmetic operators to multiply, add, subtract and divide arrays by scalars."
  2080. ]
  2081. },
  2082. {
  2083. "cell_type": "code",
  2084. "execution_count": 80,
  2085. "metadata": {},
  2086. "outputs": [],
  2087. "source": [
  2088. "import numpy as np\n",
  2089. "\n",
  2090. "v1 = np.arange(0, 5)"
  2091. ]
  2092. },
  2093. {
  2094. "cell_type": "code",
  2095. "execution_count": 81,
  2096. "metadata": {},
  2097. "outputs": [
  2098. {
  2099. "data": {
  2100. "text/plain": [
  2101. "array([0, 2, 4, 6, 8])"
  2102. ]
  2103. },
  2104. "execution_count": 81,
  2105. "metadata": {},
  2106. "output_type": "execute_result"
  2107. }
  2108. ],
  2109. "source": [
  2110. "v1 * 2"
  2111. ]
  2112. },
  2113. {
  2114. "cell_type": "code",
  2115. "execution_count": 82,
  2116. "metadata": {},
  2117. "outputs": [
  2118. {
  2119. "data": {
  2120. "text/plain": [
  2121. "array([2, 3, 4, 5, 6])"
  2122. ]
  2123. },
  2124. "execution_count": 82,
  2125. "metadata": {},
  2126. "output_type": "execute_result"
  2127. }
  2128. ],
  2129. "source": [
  2130. "v1 + 2"
  2131. ]
  2132. },
  2133. {
  2134. "cell_type": "code",
  2135. "execution_count": 83,
  2136. "metadata": {},
  2137. "outputs": [
  2138. {
  2139. "name": "stdout",
  2140. "output_type": "stream",
  2141. "text": [
  2142. "[[ 0 2 4 6 8]\n",
  2143. " [20 22 24 26 28]\n",
  2144. " [40 42 44 46 48]\n",
  2145. " [60 62 64 66 68]\n",
  2146. " [80 82 84 86 88]]\n",
  2147. "[[ 2 3 4 5 6]\n",
  2148. " [12 13 14 15 16]\n",
  2149. " [22 23 24 25 26]\n",
  2150. " [32 33 34 35 36]\n",
  2151. " [42 43 44 45 46]]\n"
  2152. ]
  2153. }
  2154. ],
  2155. "source": [
  2156. "A = np.array([[n+m*10 for n in range(5)] for m in range(5)])\n",
  2157. "\n",
  2158. "print(A * 2)\n",
  2159. "\n",
  2160. "print(A + 2)"
  2161. ]
  2162. },
  2163. {
  2164. "cell_type": "markdown",
  2165. "metadata": {},
  2166. "source": [
  2167. "### 7.2 Element-wise array-array operations"
  2168. ]
  2169. },
  2170. {
  2171. "cell_type": "markdown",
  2172. "metadata": {},
  2173. "source": [
  2174. "When we add, subtract, multiply and divide arrays with each other, the default behaviour is **element-wise** operation:"
  2175. ]
  2176. },
  2177. {
  2178. "cell_type": "code",
  2179. "execution_count": 84,
  2180. "metadata": {},
  2181. "outputs": [
  2182. {
  2183. "data": {
  2184. "text/plain": [
  2185. "array([[0.12684531, 0.88008175, 0.00646408],\n",
  2186. " [0.56140088, 0.06651575, 0.79145154]])"
  2187. ]
  2188. },
  2189. "execution_count": 84,
  2190. "metadata": {},
  2191. "output_type": "execute_result"
  2192. }
  2193. ],
  2194. "source": [
  2195. "A = np.random.rand(2, 3)\n",
  2196. "\n",
  2197. "A * A # element-wise multiplication"
  2198. ]
  2199. },
  2200. {
  2201. "cell_type": "code",
  2202. "execution_count": 85,
  2203. "metadata": {},
  2204. "outputs": [
  2205. {
  2206. "data": {
  2207. "text/plain": [
  2208. "array([1., 4.])"
  2209. ]
  2210. },
  2211. "execution_count": 85,
  2212. "metadata": {},
  2213. "output_type": "execute_result"
  2214. }
  2215. ],
  2216. "source": [
  2217. "v1 = np.array([1.0, 2.0])\n",
  2218. "v1 * v1"
  2219. ]
  2220. },
  2221. {
  2222. "cell_type": "markdown",
  2223. "metadata": {},
  2224. "source": [
  2225. "If we multiply arrays with compatible shapes, we get an element-wise multiplication of each row:"
  2226. ]
  2227. },
  2228. {
  2229. "cell_type": "code",
  2230. "execution_count": 86,
  2231. "metadata": {},
  2232. "outputs": [
  2233. {
  2234. "data": {
  2235. "text/plain": [
  2236. "((2, 3), (2,))"
  2237. ]
  2238. },
  2239. "execution_count": 86,
  2240. "metadata": {},
  2241. "output_type": "execute_result"
  2242. }
  2243. ],
  2244. "source": [
  2245. "A.shape, v1.shape"
  2246. ]
  2247. },
  2248. {
  2249. "cell_type": "code",
  2250. "execution_count": 87,
  2251. "metadata": {},
  2252. "outputs": [
  2253. {
  2254. "data": {
  2255. "text/plain": [
  2256. "array([[0.35615349, 0.93812672, 0.08039952],\n",
  2257. " [0.74926689, 0.25790647, 0.88963562]])"
  2258. ]
  2259. },
  2260. "execution_count": 87,
  2261. "metadata": {},
  2262. "output_type": "execute_result"
  2263. }
  2264. ],
  2265. "source": [
  2266. "A"
  2267. ]
  2268. },
  2269. {
  2270. "cell_type": "code",
  2271. "execution_count": 88,
  2272. "metadata": {},
  2273. "outputs": [
  2274. {
  2275. "data": {
  2276. "text/plain": [
  2277. "array([[0.35615349, 1.49853379],\n",
  2278. " [0.93812672, 0.51581293],\n",
  2279. " [0.08039952, 1.77927125]])"
  2280. ]
  2281. },
  2282. "execution_count": 88,
  2283. "metadata": {},
  2284. "output_type": "execute_result"
  2285. }
  2286. ],
  2287. "source": [
  2288. "A.T * v1"
  2289. ]
  2290. },
  2291. {
  2292. "cell_type": "code",
  2293. "execution_count": 89,
  2294. "metadata": {},
  2295. "outputs": [
  2296. {
  2297. "ename": "ValueError",
  2298. "evalue": "operands could not be broadcast together with shapes (2,3) (2,) ",
  2299. "output_type": "error",
  2300. "traceback": [
  2301. "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
  2302. "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
  2303. "\u001b[0;32m<ipython-input-89-629678c55a83>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mA\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mv1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
  2304. "\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (2,3) (2,) "
  2305. ]
  2306. }
  2307. ],
  2308. "source": [
  2309. "A*v1"
  2310. ]
  2311. },
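The error above comes from NumPy's broadcasting rule, which compares shapes starting from the last axis: `(2, 3)` against `(2,)` pairs 3 with 2 and fails. One way to scale each row without transposing is to give the vector a trailing axis. This is a sketch with a small deterministic matrix, chosen here for illustration instead of the random one in the notebook:

```python
import numpy as np

A = np.arange(6, dtype=float).reshape(2, 3)  # shape (2, 3)
v1 = np.array([1.0, 2.0])                    # shape (2,)

# v1[:, np.newaxis] has shape (2, 1), which broadcasts across the 3 columns,
# so row i of A is multiplied by v1[i].
scaled_rows = A * v1[:, np.newaxis]

# Same numbers as the transpose approach, transposed back.
same = np.array_equal(scaled_rows, (A.T * v1).T)

print(scaled_rows)
print(same)
```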
  2312. {
  2313. "cell_type": "markdown",
  2314. "metadata": {},
  2315. "source": [
  2316. "### 7.4 Matrix algebra"
  2317. ]
  2318. },
  2319. {
  2320. "cell_type": "markdown",
  2321. "metadata": {},
  2322. "source": [
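  2323. "There are two ways to do matrix multiplication. The first is the `dot` function, which applies a matrix-matrix, matrix-vector, or inner vector multiplication to its two arguments:"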
  2324. ]
  2325. },
  2326. {
  2327. "cell_type": "code",
  2328. "execution_count": 90,
  2329. "metadata": {},
  2330. "outputs": [
  2331. {
  2332. "data": {
  2333. "text/plain": [
  2334. "array([[2.59833251, 1.8189686 , 1.32946437, 2.15441681, 1.55219543],\n",
  2335. " [1.4561364 , 1.26875236, 0.97855704, 1.35013248, 1.05524471],\n",
  2336. " [2.38061437, 1.70445667, 1.16297305, 2.27888345, 1.66499116],\n",
  2337. " [1.08602725, 0.76015292, 0.46415646, 1.38753125, 1.00011024],\n",
  2338. " [1.82122991, 1.34175794, 0.92375387, 1.74770416, 1.27559765]])"
  2339. ]
  2340. },
  2341. "execution_count": 90,
  2342. "metadata": {},
  2343. "output_type": "execute_result"
  2344. }
  2345. ],
  2346. "source": [
  2347. "A = np.random.rand(5, 5)\n",
  2348. "v1 = np.random.rand(5, 1)\n",
  2349. "\n",
  2350. "np.dot(A, A)"
  2351. ]
  2352. },
  2353. {
  2354. "cell_type": "code",
  2355. "execution_count": 91,
  2356. "metadata": {},
  2357. "outputs": [
  2358. {
  2359. "data": {
  2360. "text/plain": [
  2361. "array([[2.0139906 ],\n",
  2362. " [1.41657535],\n",
  2363. " [2.09784627],\n",
  2364. " [1.2752073 ],\n",
  2365. " [1.6253844 ]])"
  2366. ]
  2367. },
  2368. "execution_count": 91,
  2369. "metadata": {},
  2370. "output_type": "execute_result"
  2371. }
  2372. ],
  2373. "source": [
  2374. "np.dot(A, v1)\n"
  2375. ]
  2376. },
  2377. {
  2378. "cell_type": "code",
  2379. "execution_count": 92,
  2380. "metadata": {},
  2381. "outputs": [
  2382. {
  2383. "data": {
  2384. "text/plain": [
  2385. "array([[2.08466462]])"
  2386. ]
  2387. },
  2388. "execution_count": 92,
  2389. "metadata": {},
  2390. "output_type": "execute_result"
  2391. }
  2392. ],
  2393. "source": [
  2394. "np.dot(v1.T, v1)"
  2395. ]
  2396. },
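For plain arrays, the `@` operator (supported by NumPy arrays since Python 3.5) gives the same products as the `dot` calls above. The sketch below regenerates random inputs, so the printed numbers will differ from the notebook outputs; the names `AA`, `Av` and `inner` are illustrative:

```python
import numpy as np

A = np.random.rand(5, 5)
v1 = np.random.rand(5, 1)

AA = A @ A         # matrix-matrix product, shape (5, 5)
Av = A @ v1        # matrix-vector product, shape (5, 1)
inner = v1.T @ v1  # inner product, shape (1, 1)

print(np.allclose(AA, np.dot(A, A)))
print(AA.shape, Av.shape, inner.shape)
```

Using `@` on ordinary arrays keeps `*` as element-wise multiplication, so both operations stay available on the same objects.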
  2397. {
  2398. "cell_type": "markdown",
  2399. "metadata": {},
  2400. "source": [
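  2401. "Alternatively, we can cast the array objects to the type `matrix`. This changes the behaviour of the standard arithmetic operators `+, -, *` to use matrix algebra."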
  2402. ]
  2403. },
  2404. {
  2405. "cell_type": "code",
  2406. "execution_count": 93,
  2407. "metadata": {},
  2408. "outputs": [],
  2409. "source": [
  2410. "M = np.matrix(A)\n",
  2411. "v = np.matrix(v1).T # v1 is 5x1, so its transpose is a 1x5 row vector"
  2412. ]
  2413. },
  2414. {
  2415. "cell_type": "code",
  2416. "execution_count": 94,
  2417. "metadata": {},
  2418. "outputs": [
  2419. {
  2420. "data": {
  2421. "text/plain": [
  2422. "matrix([[0.45282687, 0.64874757, 0.70028245, 0.91412865, 0.36429705]])"
  2423. ]
  2424. },
  2425. "execution_count": 94,
  2426. "metadata": {},
  2427. "output_type": "execute_result"
  2428. }
  2429. ],
  2430. "source": [
  2431. "v"
  2432. ]
  2433. },
  2434. {
  2435. "cell_type": "code",
  2436. "execution_count": 95,
  2437. "metadata": {},
  2438. "outputs": [
  2439. {
  2440. "data": {
  2441. "text/plain": [
  2442. "matrix([[2.59833251, 1.8189686 , 1.32946437, 2.15441681, 1.55219543],\n",
  2443. " [1.4561364 , 1.26875236, 0.97855704, 1.35013248, 1.05524471],\n",
  2444. " [2.38061437, 1.70445667, 1.16297305, 2.27888345, 1.66499116],\n",
  2445. " [1.08602725, 0.76015292, 0.46415646, 1.38753125, 1.00011024],\n",
  2446. " [1.82122991, 1.34175794, 0.92375387, 1.74770416, 1.27559765]])"
  2447. ]
  2448. },
  2449. "execution_count": 95,
  2450. "metadata": {},
  2451. "output_type": "execute_result"
  2452. }
  2453. ],
  2454. "source": [
  2455. "M * M"
  2456. ]
  2457. },
  2458. {
  2459. "cell_type": "code",
  2460. "execution_count": 96,
  2461. "metadata": {},
  2462. "outputs": [
  2463. {
  2464. "data": {
  2465. "text/plain": [
  2466. "matrix([[2.0139906 ],\n",
  2467. " [1.41657535],\n",
  2468. " [2.09784627],\n",
  2469. " [1.2752073 ],\n",
  2470. " [1.6253844 ]])"
  2471. ]
  2472. },
  2473. "execution_count": 96,
  2474. "metadata": {},
  2475. "output_type": "execute_result"
  2476. }
  2477. ],
  2478. "source": [
  2479. "M * v.T"
  2480. ]
  2481. },
  2482. {
  2483. "cell_type": "code",
  2484. "execution_count": 97,
  2485. "metadata": {},
  2486. "outputs": [
  2487. {
  2488. "data": {
  2489. "text/plain": [
  2490. "matrix([[2.08466462]])"
  2491. ]
  2492. },
  2493. "execution_count": 97,
  2494. "metadata": {},
  2495. "output_type": "execute_result"
  2496. }
  2497. ],
  2498. "source": [
  2499. "# inner product\n",
  2500. "v * v.T"
  2501. ]
  2502. },
  2503. {
  2504. "cell_type": "markdown",
  2505. "metadata": {},
  2506. "source": [
  2507. "If we try to add, subtract or multiply objects with incompatible shapes, we get an error:"
  2508. ]
  2509. },
  2510. {
  2511. "cell_type": "code",
  2512. "execution_count": 98,
  2513. "metadata": {},
  2514. "outputs": [],
  2515. "source": [
  2516. "v = np.matrix([1,2,3,4,5,6]).T"
  2517. ]
  2518. },
  2519. {
  2520. "cell_type": "code",
  2521. "execution_count": 99,
  2522. "metadata": {},
  2523. "outputs": [
  2524. {
  2525. "data": {
  2526. "text/plain": [
  2527. "((5, 5), (6, 1))"
  2528. ]
  2529. },
  2530. "execution_count": 99,
  2531. "metadata": {},
  2532. "output_type": "execute_result"
  2533. }
  2534. ],
  2535. "source": [
  2536. "np.shape(M), np.shape(v)"
  2537. ]
  2538. },
  2539. {
  2540. "cell_type": "code",
  2541. "execution_count": 100,
  2542. "metadata": {},
  2543. "outputs": [
  2544. {
  2545. "ename": "ValueError",
  2546. "evalue": "shapes (5,5) and (6,1) not aligned: 5 (dim 1) != 6 (dim 0)",
  2547. "output_type": "error",
  2548. "traceback": [
  2549. "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
  2550. "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
  2551. "\u001b[0;32m<ipython-input-100-e8f88679fe45>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mM\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mv\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
  2552. "\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/numpy/matrixlib/defmatrix.py\u001b[0m in \u001b[0;36m__mul__\u001b[0;34m(self, other)\u001b[0m\n\u001b[1;32m 218\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mother\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mN\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mndarray\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlist\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 219\u001b[0m \u001b[0;31m# This promotes 1-D vectors to row vectors\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 220\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mN\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0masmatrix\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mother\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 221\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misscalar\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mother\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mor\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mother\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'__rmul__'\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 222\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mN\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mother\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
  2553. "\u001b[0;32m<__array_function__ internals>\u001b[0m in \u001b[0;36mdot\u001b[0;34m(*args, **kwargs)\u001b[0m\n",
  2554. "\u001b[0;31mValueError\u001b[0m: shapes (5,5) and (6,1) not aligned: 5 (dim 1) != 6 (dim 0)"
  2555. ]
  2556. }
  2557. ],
  2558. "source": [
  2559. "M * v"
  2560. ]
  2561. },
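The shape mismatch above can be detected before the multiplication is attempted. The following is a minimal sketch, assuming plain `ndarray` objects and the `@` operator rather than `np.matrix`; the helper name `can_matmul` is hypothetical and not part of NumPy.

```python
import numpy as np

def can_matmul(a, b):
    """Hypothetical helper: True if the 2-D arrays a and b can be multiplied as a @ b."""
    return a.shape[1] == b.shape[0]

M = np.arange(25.0).reshape(5, 5)
v = np.arange(1.0, 7.0).reshape(6, 1)   # 6x1 column vector, as in the failing cell
w = np.arange(1.0, 6.0).reshape(5, 1)   # 5x1 column vector

ok_bad = can_matmul(M, v)    # inner dimensions 5 and 6 differ
ok_good = can_matmul(M, w)   # inner dimensions 5 and 5 agree

# Without the check, the incompatible product raises ValueError
try:
    M @ v
    raised = False
except ValueError:
    raised = True

result_shape = (M @ w).shape
print(ok_bad, ok_good, raised, result_shape)
```

Checking the inner dimensions first gives a clearer failure point than the traceback from inside NumPy.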
  2562. {
  2563. "cell_type": "markdown",
  2564. "metadata": {},
  2565. "source": [
  2566. "### 7.5 Matrix computations and data processing"
  2567. ]
  2568. },
  2569. {
  2570. "cell_type": "markdown",
  2571. "metadata": {},
  2572. "source": [
  2573. "#### Inverse"
  2574. ]
  2575. },
  2576. {
  2577. "cell_type": "code",
  2578. "execution_count": 101,
  2579. "metadata": {},
  2580. "outputs": [
  2581. {
  2582. "data": {
  2583. "text/plain": [
  2584. "array([[-2. , 1. ],\n",
  2585. " [ 1.5, -0.5]])"
  2586. ]
  2587. },
  2588. "execution_count": 101,
  2589. "metadata": {},
  2590. "output_type": "execute_result"
  2591. }
  2592. ],
  2593. "source": [
  2594. "C = np.array([[1, 2], [3, 4]])\n",
  2595. "np.linalg.inv(C) # for an np.matrix, C.I gives the same result"
  2596. ]
  2597. },
  2598. {
  2599. "cell_type": "markdown",
  2600. "metadata": {},
  2601. "source": [
  2602. "#### Determinant"
  2603. ]
  2604. },
  2605. {
  2606. "cell_type": "code",
  2607. "execution_count": 102,
  2608. "metadata": {},
  2609. "outputs": [
  2610. {
  2611. "data": {
  2612. "text/plain": [
  2613. "-2.0000000000000004"
  2614. ]
  2615. },
  2616. "execution_count": 102,
  2617. "metadata": {},
  2618. "output_type": "execute_result"
  2619. }
  2620. ],
  2621. "source": [
  2622. "np.linalg.det(C)"
  2623. ]
  2624. },
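The two results above can be sanity-checked against their definitions. This is a small sketch, assuming the same 2x2 matrix `C`; it only uses `np.allclose` and `np.isclose` to compare up to floating-point rounding, which is why the determinant printed above is not exactly -2.

```python
import numpy as np

C = np.array([[1.0, 2.0], [3.0, 4.0]])
C_inv = np.linalg.inv(C)

# C @ C_inv should be the identity matrix up to floating-point rounding
is_identity = np.allclose(C @ C_inv, np.eye(2))

# for a 2x2 matrix [[a, b], [c, d]] the determinant is a*d - b*c = 1*4 - 2*3
det_matches = np.isclose(np.linalg.det(C), -2.0)

print(is_identity, det_matches)
```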
  2625. {
  2626. "cell_type": "markdown",
  2627. "metadata": {},
  2628. "source": [
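  2629. "#### Data statistics\n",
  2630. "It is often useful to store datasets in NumPy arrays. NumPy provides a number of functions for computing statistics of the data in an array.\n",
  2631. "\n",
  2632. "For example, let us calculate some properties of the Stockholm temperature dataset used above."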
  2633. ]
  2634. },
  2635. {
  2636. "cell_type": "code",
  2637. "execution_count": 103,
  2638. "metadata": {},
  2639. "outputs": [
  2640. {
  2641. "data": {
  2642. "text/plain": [
  2643. "(77431, 7)"
  2644. ]
  2645. },
  2646. "execution_count": 103,
  2647. "metadata": {},
  2648. "output_type": "execute_result"
  2649. }
  2650. ],
  2651. "source": [
  2652. "import numpy as np\n",
  2653. "data = np.genfromtxt('stockholm_td_adj.dat')\n",
  2654. "\n",
  2655. "# reminder: the temperature dataset is stored in the variable data\n",
  2656. "np.shape(data)"
  2657. ]
  2658. },
  2659. {
  2660. "cell_type": "markdown",
  2661. "metadata": {},
  2662. "source": [
  2663. "#### mean"
  2664. ]
  2665. },
  2666. {
  2667. "cell_type": "code",
  2668. "execution_count": 104,
  2669. "metadata": {},
  2670. "outputs": [
  2671. {
  2672. "name": "stdout",
  2673. "output_type": "stream",
  2674. "text": [
  2675. "(77431, 7)\n"
  2676. ]
  2677. },
  2678. {
  2679. "data": {
  2680. "text/plain": [
  2681. "6.197109684751585"
  2682. ]
  2683. },
  2684. "execution_count": 104,
  2685. "metadata": {},
  2686. "output_type": "execute_result"
  2687. }
  2688. ],
  2689. "source": [
  2690. "# the temperature data is in the column with index 3 (the fourth column)\n",
  2691. "print(data.shape)\n",
  2692. "np.mean(data[:,3])"
  2693. ]
  2694. },
  2695. {
  2696. "cell_type": "code",
  2697. "execution_count": 105,
  2698. "metadata": {},
  2699. "outputs": [
  2700. {
  2701. "data": {
  2702. "text/plain": [
  2703. "0.4931528475182218"
  2704. ]
  2705. },
  2706. "execution_count": 105,
  2707. "metadata": {},
  2708. "output_type": "execute_result"
  2709. }
  2710. ],
  2711. "source": [
  2712. "A = np.random.rand(4, 3)\n",
  2713. "np.mean(A)"
  2714. ]
  2715. },
  2716. {
  2717. "cell_type": "markdown",
  2718. "metadata": {},
  2719. "source": [
  2720. "Over the past 200 years, the daily mean temperature in Stockholm has averaged about 6.2 °C."
  2721. ]
  2722. },
  2723. {
  2724. "cell_type": "markdown",
  2725. "metadata": {},
  2726. "source": [
  2727. "#### Standard deviation and variance"
  2728. ]
  2729. },
  2730. {
  2731. "cell_type": "code",
  2732. "execution_count": 106,
  2733. "metadata": {},
  2734. "outputs": [
  2735. {
  2736. "data": {
  2737. "text/plain": [
  2738. "(8.282271621340573, 68.59602320966341)"
  2739. ]
  2740. },
  2741. "execution_count": 106,
  2742. "metadata": {},
  2743. "output_type": "execute_result"
  2744. }
  2745. ],
  2746. "source": [
  2747. "np.std(data[:,3]), np.var(data[:,3])"
  2748. ]
  2749. },
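The calls above use the default arguments, which compute the population statistics (`ddof=0`). A minimal sketch on a small hand-checkable array, chosen here only for illustration, shows that the variance is the square of the standard deviation and how `ddof=1` switches to the sample variance:

```python
import numpy as np

# illustrative values: mean is 5, squared deviations sum to 32
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

std_pop = np.std(x)              # population standard deviation (ddof=0, the default)
var_pop = np.var(x)              # population variance, equals std_pop ** 2
var_sample = np.var(x, ddof=1)   # sample variance, divides by N - 1 instead of N

print(std_pop, var_pop, var_sample)
```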
  2750. {
  2751. "cell_type": "markdown",
  2752. "metadata": {},
  2753. "source": [
  2754. "#### Minimum and maximum"
  2755. ]
  2756. },
  2757. {
  2758. "cell_type": "code",
  2759. "execution_count": 107,
  2760. "metadata": {},
  2761. "outputs": [
  2762. {
  2763. "data": {
  2764. "text/plain": [
  2765. "-25.8"
  2766. ]
  2767. },
  2768. "execution_count": 107,
  2769. "metadata": {},
  2770. "output_type": "execute_result"
  2771. }
  2772. ],
  2773. "source": [
  2774. "# lowest daily average temperature\n",
  2775. "data[:,3].min()"
  2776. ]
  2777. },
  2778. {
  2779. "cell_type": "code",
  2780. "execution_count": 108,
  2781. "metadata": {},
  2782. "outputs": [
  2783. {
  2784. "data": {
  2785. "text/plain": [
  2786. "28.3"
  2787. ]
  2788. },
  2789. "execution_count": 108,
  2790. "metadata": {},
  2791. "output_type": "execute_result"
  2792. }
  2793. ],
  2794. "source": [
  2795. "# highest daily average temperature\n",
  2796. "data[:,3].max()"
  2797. ]
  2798. },
  2799. {
  2800. "cell_type": "markdown",
  2801. "metadata": {},
  2802. "source": [
  2803. "#### sum, prod, and trace"
  2804. ]
  2805. },
  2806. {
  2807. "cell_type": "code",
  2808. "execution_count": 109,
  2809. "metadata": {},
  2810. "outputs": [
  2811. {
  2812. "data": {
  2813. "text/plain": [
  2814. "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
  2815. ]
  2816. },
  2817. "execution_count": 109,
  2818. "metadata": {},
  2819. "output_type": "execute_result"
  2820. }
  2821. ],
  2822. "source": [
  2823. "d = np.arange(0, 10)\n",
  2824. "d"
  2825. ]
  2826. },
  2827. {
  2828. "cell_type": "code",
  2829. "execution_count": 110,
  2830. "metadata": {},
  2831. "outputs": [
  2832. {
  2833. "data": {
  2834. "text/plain": [
  2835. "45"
  2836. ]
  2837. },
  2838. "execution_count": 110,
  2839. "metadata": {},
  2840. "output_type": "execute_result"
  2841. }
  2842. ],
  2843. "source": [
  2844. "# sum of all elements\n",
  2845. "np.sum(d)"
  2846. ]
  2847. },
  2848. {
  2849. "cell_type": "code",
  2850. "execution_count": 111,
  2851. "metadata": {},
  2852. "outputs": [
  2853. {
  2854. "data": {
  2855. "text/plain": [
  2856. "3628800"
  2857. ]
  2858. },
  2859. "execution_count": 111,
  2860. "metadata": {},
  2861. "output_type": "execute_result"
  2862. }
  2863. ],
  2864. "source": [
  2865. "# product of all elements\n",
  2866. "np.prod(d+1)"
  2867. ]
  2868. },
  2869. {
  2870. "cell_type": "code",
  2871. "execution_count": 112,
  2872. "metadata": {},
  2873. "outputs": [
  2874. {
  2875. "data": {
  2876. "text/plain": [
  2877. "array([ 0, 1, 3, 6, 10, 15, 21, 28, 36, 45])"
  2878. ]
  2879. },
  2880. "execution_count": 112,
  2881. "metadata": {},
  2882. "output_type": "execute_result"
  2883. }
  2884. ],
  2885. "source": [
  2886. "# cumulative sum\n",
  2887. "np.cumsum(d)"
  2888. ]
  2889. },
  2890. {
  2891. "cell_type": "code",
  2892. "execution_count": 113,
  2893. "metadata": {},
  2894. "outputs": [
  2895. {
  2896. "data": {
  2897. "text/plain": [
  2898. "array([ 1, 2, 6, 24, 120, 720, 5040,\n",
  2899. " 40320, 362880, 3628800])"
  2900. ]
  2901. },
  2902. "execution_count": 113,
  2903. "metadata": {},
  2904. "output_type": "execute_result"
  2905. }
  2906. ],
  2907. "source": [
  2908. "# cumulative product\n",
  2909. "np.cumprod(d+1)"
  2910. ]
  2911. },
  2912. {
  2913. "cell_type": "code",
  2914. "execution_count": 114,
  2915. "metadata": {},
  2916. "outputs": [
  2917. {
  2918. "data": {
  2919. "text/plain": [
  2920. "1.4446600641166332"
  2921. ]
  2922. },
  2923. "execution_count": 114,
  2924. "metadata": {},
  2925. "output_type": "execute_result"
  2926. }
  2927. ],
  2928. "source": [
  2929. "# sum of the diagonal elements, same as np.diag(A).sum()\n",
  2930. "np.trace(A)"
  2931. ]
  2932. },
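The reductions in this subsection are related to each other in simple ways. As a rough sketch, assuming a deterministic 4x3 array in place of the random `A` used above, the last entry of a cumulative result equals the full reduction, and `np.trace` agrees with summing `np.diag` even for a non-square array:

```python
import numpy as np

d = np.arange(0, 10)

# the final element of a cumulative sum/product is the total sum/product
last_cumsum_is_sum = np.cumsum(d)[-1] == np.sum(d)
last_cumprod_is_prod = np.cumprod(d + 1)[-1] == np.prod(d + 1)

# non-square 4x3 array: the main diagonal has min(4, 3) = 3 elements
A = np.arange(12.0).reshape(4, 3)
trace_matches_diag = np.isclose(np.trace(A), np.diag(A).sum())

print(last_cumsum_is_sum, last_cumprod_is_prod, trace_matches_diag)
```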
  2933. {
  2934. "cell_type": "markdown",
  2935. "metadata": {},
  2936. "source": [
  2937. "### 7.6 Computations on subsets of arrays"
  2938. ]
  2939. },
  2940. {
  2941. "cell_type": "markdown",
  2942. "metadata": {},
  2943. "source": [
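  2944. "We can compute with subsets of the data in an array using indexing, fancy indexing, and the other methods of extracting data from an array described above.\n",
  2945. "\n",
  2946. "For example, let us go back to the temperature dataset:"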
  2947. ]
  2948. },
  2949. {
  2950. "cell_type": "code",
  2951. "execution_count": 115,
  2952. "metadata": {},
  2953. "outputs": [
  2954. {
  2955. "name": "stdout",
  2956. "output_type": "stream",
  2957. "text": [
  2958. "1800 1 1 -6.1 -6.1 -6.1 1\r\n",
  2959. "1800 1 2 -15.4 -15.4 -15.4 1\r\n",
  2960. "1800 1 3 -15.0 -15.0 -15.0 1\r\n"
  2961. ]
  2962. }
  2963. ],
  2964. "source": [
  2965. "!head -n 3 stockholm_td_adj.dat"
  2966. ]
  2967. },
  2968. {
  2969. "cell_type": "markdown",
  2970. "metadata": {},
  2971. "source": [
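  2972. "The dataset format is: year, month, day, daily average temperature, low, high, location.\n",
  2973. "\n",
  2974. "If we are interested in the average temperature of one particular month, say February, we can create an index mask and use it to select only the data for that month:"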
  2975. ]
  2976. },
  2977. {
  2978. "cell_type": "code",
  2979. "execution_count": 116,
  2980. "metadata": {},
  2981. "outputs": [
  2982. {
  2983. "data": {
  2984. "text/plain": [
  2985. "array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.])"
  2986. ]
  2987. },
  2988. "execution_count": 116,
  2989. "metadata": {},
  2990. "output_type": "execute_result"
  2991. }
  2992. ],
  2993. "source": [
  2994. "np.unique(data[:,1]) # the month column takes values from 1 to 12"
  2995. ]
  2996. },
  2997. {
  2998. "cell_type": "code",
  2999. "execution_count": 117,
  3000. "metadata": {},
  3001. "outputs": [
  3002. {
  3003. "name": "stdout",
  3004. "output_type": "stream",
  3005. "text": [
  3006. "[False False False ... False False False]\n"
  3007. ]
  3008. }
  3009. ],
  3010. "source": [
  3011. "mask_feb = data[:,1] == 2\n",
  3012. "print(mask_feb)"
  3013. ]
  3014. },
  3015. {
  3016. "cell_type": "code",
  3017. "execution_count": 118,
  3018. "metadata": {},
  3019. "outputs": [
  3020. {
  3021. "name": "stdout",
  3022. "output_type": "stream",
  3023. "text": [
  3024. "-3.212109570736596\n",
  3025. "5.090390768766271\n"
  3026. ]
  3027. }
  3028. ],
  3029. "source": [
  3030. "# the temperature data is in the column with index 3\n",
  3031. "print(np.mean(data[mask_feb,3]))\n",
  3032. "print(np.std(data[mask_feb,3]))"
  3033. ]
  3034. },
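The masking step can be followed by hand on a tiny array. The sketch below uses four synthetic rows (made up for illustration, not taken from the Stockholm file) laid out like the first four columns of the dataset: year, month, day, temperature.

```python
import numpy as np

# synthetic rows: year, month, day, temperature
toy = np.array([
    [1800, 1, 31, -4.0],
    [1800, 2,  1, -6.0],
    [1800, 2,  2, -2.0],
    [1800, 3,  1,  1.0],
])

mask_feb = toy[:, 1] == 2            # boolean array, True for the February rows
feb_mean = toy[mask_feb, 3].mean()   # mean over the selected temperatures only

print(mask_feb, feb_mean)
```

The mask has one entry per row, so indexing with it keeps exactly the rows where the comparison is true.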
  3035. {
  3036. "cell_type": "markdown",
  3037. "metadata": {},
  3038. "source": [
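  3039. "With these tools we have very powerful data processing capabilities at our disposal. For example, extracting the average monthly temperature for each month of the year takes only a few lines of code:"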
  3040. ]
  3041. },
  3042. {
  3043. "cell_type": "code",
  3044. "execution_count": 119,
  3045. "metadata": {},
  3046. "outputs": [
  3047. {
  3048. "data": {
  3050. "text/plain": [
  3051. "<Figure size 432x288 with 1 Axes>"
  3052. ]
  3053. },
  3054. "metadata": {
  3055. "needs_background": "light"
  3056. },
  3057. "output_type": "display_data"
  3058. }
  3059. ],
  3060. "source": [
  3061. "%matplotlib inline\n",
  3062. "import matplotlib.pyplot as plt\n",
  3063. "\n",
  3064. "months = np.unique(data[:,1])\n",
  3065. "monthly_mean = [np.mean(data[data[:,1] == month, 3]) for month in months]\n",
  3066. "\n",
  3067. "fig, ax = plt.subplots()\n",
  3068. "ax.bar(months, monthly_mean)\n",
  3069. "ax.set_xlabel(\"Month\")\n",
  3070. "ax.set_ylabel(\"Monthly avg. temp.\");"
  3071. ]
  3072. },
  3073. {
  3074. "cell_type": "markdown",
  3075. "metadata": {},
  3076. "source": [
  3077. "### 7.7 Computations on higher-dimensional data"
  3078. ]
  3079. },
  3080. {
  3081. "cell_type": "markdown",
  3082. "metadata": {},
  3083. "source": [
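  3084. "When functions such as `min` and `max` are applied to a multidimensional array, it is sometimes useful to apply the computation to the entire array, and sometimes only along rows or columns. The `axis` argument lets us specify how the function should behave:"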
  3085. ]
  3086. },
  3087. {
  3088. "cell_type": "code",
  3089. "execution_count": 120,
  3090. "metadata": {},
  3091. "outputs": [
  3092. {
  3093. "data": {
  3094. "text/plain": [
  3095. "array([[0.85882078, 0.0838741 , 0.4529751 ],\n",
  3096. " [0.32355282, 0.23641565, 0.37693805],\n",
  3097. " [0.06769945, 0.30438005, 0.9780961 ],\n",
  3098. " [0.46162058, 0.42681981, 0.71106984]])"
  3099. ]
  3100. },
  3101. "execution_count": 120,
  3102. "metadata": {},
  3103. "output_type": "execute_result"
  3104. }
  3105. ],
  3106. "source": [
  3107. "import numpy as np\n",
  3108. "\n",
  3109. "m = np.random.rand(4,3)\n",
  3110. "m"
  3111. ]
  3112. },
  3113. {
  3114. "cell_type": "code",
  3115. "execution_count": 121,
  3116. "metadata": {},
  3117. "outputs": [
  3118. {
  3119. "data": {
  3120. "text/plain": [
  3121. "0.978096099540799"
  3122. ]
  3123. },
  3124. "execution_count": 121,
  3125. "metadata": {},
  3126. "output_type": "execute_result"
  3127. }
  3128. ],
  3129. "source": [
  3130. "# global max\n",
  3131. "m.max()"
  3132. ]
  3133. },
  3134. {
  3135. "cell_type": "code",
  3136. "execution_count": 122,
  3137. "metadata": {},
  3138. "outputs": [
  3139. {
  3140. "data": {
  3141. "text/plain": [
  3142. "array([0.85882078, 0.42681981, 0.9780961 ])"
  3143. ]
  3144. },
  3145. "execution_count": 122,
  3146. "metadata": {},
  3147. "output_type": "execute_result"
  3148. }
  3149. ],
  3150. "source": [
  3151. "# max in each column\n",
  3152. "m.max(axis=0)"
  3153. ]
  3154. },
  3155. {
  3156. "cell_type": "code",
  3157. "execution_count": 123,
  3158. "metadata": {},
  3159. "outputs": [
  3160. {
  3161. "data": {
  3162. "text/plain": [
  3163. "array([0.85882078, 0.37693805, 0.9780961 , 0.71106984])"
  3164. ]
  3165. },
  3166. "execution_count": 123,
  3167. "metadata": {},
  3168. "output_type": "execute_result"
  3169. }
  3170. ],
  3171. "source": [
  3172. "# max in each row\n",
  3173. "m.max(axis=1)"
  3174. ]
  3175. },
  3176. {
  3177. "cell_type": "markdown",
  3178. "metadata": {},
  3179. "source": [
  3180. "Many other functions and methods of the `array` and `matrix` classes accept the same (optional) `axis` keyword argument."
  3181. ]
  3182. },
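The meaning of `axis` is easiest to check on an array with known values. A minimal sketch, using a small integer array chosen here for illustration: `axis=0` collapses the rows and leaves one result per column, while `axis=1` collapses the columns and leaves one result per row. The same convention applies to `sum` and `mean`.

```python
import numpy as np

m = np.array([[1, 8, 3],
              [7, 2, 9]])

col_max = m.max(axis=0)     # collapse the rows: one value per column
row_max = m.max(axis=1)     # collapse the columns: one value per row
col_sum = m.sum(axis=0)     # column totals
row_mean = m.mean(axis=1)   # row averages

print(col_max, row_max, col_sum, row_mean)
```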
  3183. {
  3184. "cell_type": "markdown",
  3185. "metadata": {},
  3186. "source": [
  3187. "## 8. Reshaping, resizing and stacking arrays"
  3188. ]
  3189. },
  3190. {
  3191. "cell_type": "markdown",
  3192. "metadata": {},
  3193. "source": [
  3194. "The shape of a NumPy array can be changed without copying the underlying data, which makes reshaping fast even for large arrays."
  3195. ]
  3196. },
  3197. {
  3198. "cell_type": "code",
  3199. "execution_count": 124,
  3200. "metadata": {},
  3201. "outputs": [
  3202. {
  3203. "name": "stdout",
  3204. "output_type": "stream",
  3205. "text": [
  3206. "[[0.58458652 0.95489874 0.76873658]\n",
  3207. " [0.79144906 0.35559767 0.96031963]\n",
  3208. " [0.55942317 0.78723157 0.3650356 ]\n",
  3209. " [0.04685468 0.43444695 0.33839966]]\n"
  3210. ]
  3211. }
  3212. ],
  3213. "source": [
  3214. "import numpy as np\n",
  3215. "\n",
  3216. "A = np.random.rand(4, 3)\n",
  3217. "print(A)"
  3218. ]
  3219. },
  3220. {
  3221. "cell_type": "code",
  3222. "execution_count": 125,
  3223. "metadata": {},
  3224. "outputs": [
  3225. {
  3226. "name": "stdout",
  3227. "output_type": "stream",
  3228. "text": [
  3229. "4 3\n"
  3230. ]
  3231. }
  3232. ],
  3233. "source": [
  3234. "n, m = A.shape\n",
  3235. "print(n, m)"
  3236. ]
  3237. },
  3238. {
  3239. "cell_type": "code",
  3240. "execution_count": 126,
  3241. "metadata": {},
  3242. "outputs": [
  3243. {
  3244. "data": {
  3245. "text/plain": [
  3246. "array([[0.58458652, 0.95489874, 0.76873658, 0.79144906, 0.35559767,\n",
  3247. " 0.96031963, 0.55942317, 0.78723157, 0.3650356 , 0.04685468,\n",
  3248. " 0.43444695, 0.33839966]])"
  3249. ]
  3250. },
  3251. "execution_count": 126,
  3252. "metadata": {},
  3253. "output_type": "execute_result"
  3254. }
  3255. ],
  3256. "source": [
  3257. "B = A.reshape((1,n*m))\n",
  3258. "B"
  3259. ]
  3260. },
  3261. {
  3262. "cell_type": "code",
  3263. "execution_count": 127,
  3264. "metadata": {},
  3265. "outputs": [
  3266. {
  3267. "name": "stdout",
  3268. "output_type": "stream",
  3269. "text": [
  3270. "[[0.58458652]\n",
  3271. " [0.95489874]\n",
  3272. " [0.76873658]\n",
  3273. " [0.79144906]\n",
  3274. " [0.35559767]\n",
  3275. " [0.96031963]\n",
  3276. " [0.55942317]\n",
  3277. " [0.78723157]\n",
  3278. " [0.3650356 ]\n",
  3279. " [0.04685468]\n",
  3280. " [0.43444695]\n",
  3281. " [0.33839966]]\n",
  3282. "(12, 1)\n"
  3283. ]
  3284. }
  3285. ],
  3286. "source": [
  3287. "B2 = A.reshape((n*m, 1))\n",
  3288. "print(B2)\n",
  3289. "print(B2.shape)"
  3290. ]
  3291. },
  3292. {
  3293. "cell_type": "code",
  3294. "execution_count": 128,
  3295. "metadata": {},
  3296. "outputs": [
  3297. {
  3298. "data": {
  3299. "text/plain": [
  3300. "array([[5. , 5. , 5. , 5. , 5. ,\n",
  3301. " 0.96031963, 0.55942317, 0.78723157, 0.3650356 , 0.04685468,\n",
  3302. " 0.43444695, 0.33839966]])"
  3303. ]
  3304. },
  3305. "execution_count": 128,
  3306. "metadata": {},
  3307. "output_type": "execute_result"
  3308. }
  3309. ],
  3310. "source": [
  3311. "B[0,0:5] = 5 # modify the array\n",
  3312. "\n",
  3313. "B"
  3314. ]
  3315. },
  3316. {
  3317. "cell_type": "code",
  3318. "execution_count": 129,
  3319. "metadata": {},
  3320. "outputs": [
  3321. {
  3322. "data": {
  3323. "text/plain": [
  3324. "array([[5. , 5. , 5. ],\n",
  3325. " [5. , 5. , 0.96031963],\n",
  3326. " [0.55942317, 0.78723157, 0.3650356 ],\n",
  3327. " [0.04685468, 0.43444695, 0.33839966]])"
  3328. ]
  3329. },
  3330. "execution_count": 129,
  3331. "metadata": {},
  3332. "output_type": "execute_result"
  3333. }
  3334. ],
  3335. "source": [
  3336. "A # and the original variable is also changed. B is only a different view of the same data"
  3337. ]
  3338. },
  3339. {
  3340. "cell_type": "markdown",
  3341. "metadata": {},
  3342. "source": [
  3343. "We can also use the method `flatten` to turn a higher-dimensional array into a vector. Unlike `reshape`, this method creates a copy of the data."
  3344. ]
  3345. },
  3346. {
  3347. "cell_type": "code",
  3348. "execution_count": 130,
  3349. "metadata": {},
  3350. "outputs": [
  3351. {
  3352. "data": {
  3353. "text/plain": [
  3354. "array([5. , 5. , 5. , 5. , 5. ,\n",
  3355. " 0.96031963, 0.55942317, 0.78723157, 0.3650356 , 0.04685468,\n",
  3356. " 0.43444695, 0.33839966])"
  3357. ]
  3358. },
  3359. "execution_count": 130,
  3360. "metadata": {},
  3361. "output_type": "execute_result"
  3362. }
  3363. ],
  3364. "source": [
  3365. "B = A.flatten()\n",
  3366. "\n",
  3367. "B"
  3368. ]
  3369. },
  3370. {
  3371. "cell_type": "code",
  3372. "execution_count": 131,
  3373. "metadata": {},
  3374. "outputs": [
  3375. {
  3376. "name": "stdout",
  3377. "output_type": "stream",
  3378. "text": [
  3379. "(12,)\n"
  3380. ]
  3381. }
  3382. ],
  3383. "source": [
  3384. "print(B.shape)"
  3385. ]
  3386. },
  3387. {
  3388. "cell_type": "code",
  3389. "execution_count": 132,
  3390. "metadata": {},
  3391. "outputs": [
  3392. {
  3393. "name": "stdout",
  3394. "output_type": "stream",
  3395. "text": [
  3396. "[0.88616566 0.11474399 0.49426839 0.86496944 0.44553257 0.01731081\n",
  3397. " 0.26391484 0.81714822 0.9077824 0.45350327 0.34418481 0.30680307\n",
  3398. " 0.22397584 0.96490185 0.25766897 0.1628303 0.35022665 0.87266285\n",
  3399. " 0.14436895 0.2987234 0.04567582 0.62524215 0.03006832 0.15222984\n",
  3400. " 0.86554462 0.30036796 0.66637188 0.51245662 0.46296801 0.53384373\n",
  3401. " 0.90012971 0.00319531 0.48428543 0.24703543 0.53384405 0.48024175\n",
  3402. " 0.17175873 0.1834814 0.43739033 0.64565657 0.49266811 0.72123815\n",
  3403. " 0.57728476 0.76663343 0.68360823 0.34881945 0.64329004 0.79011718\n",
  3404. " 0.7055079 0.32594224 0.48795517 0.43684614 0.32047664 0.63067622\n",
  3405. " 0.24496431 0.25019593 0.57181523 0.38889906 0.53574819 0.02653888]\n"
  3406. ]
  3407. }
  3408. ],
  3409. "source": [
  3410. "T = np.random.rand(3, 4, 5)\n",
  3411. "T2 = T.flatten()\n",
  3412. "print(T2)"
  3413. ]
  3414. },
  3415. {
  3416. "cell_type": "code",
  3417. "execution_count": 133,
  3418. "metadata": {},
  3419. "outputs": [
  3420. {
  3421. "data": {
  3422. "text/plain": [
  3423. "array([10. , 10. , 10. , 10. , 10. ,\n",
  3424. " 0.96031963, 0.55942317, 0.78723157, 0.3650356 , 0.04685468,\n",
  3425. " 0.43444695, 0.33839966])"
  3426. ]
  3427. },
  3428. "execution_count": 133,
  3429. "metadata": {},
  3430. "output_type": "execute_result"
  3431. }
  3432. ],
  3433. "source": [
  3434. "B[0:5] = 10\n",
  3435. "\n",
  3436. "B"
  3437. ]
  3438. },
  3439. {
  3440. "cell_type": "code",
  3441. "execution_count": 134,
  3442. "metadata": {},
  3443. "outputs": [
  3444. {
  3445. "data": {
  3446. "text/plain": [
  3447. "array([[5. , 5. , 5. ],\n",
  3448. " [5. , 5. , 0.96031963],\n",
  3449. " [0.55942317, 0.78723157, 0.3650356 ],\n",
  3450. " [0.04685468, 0.43444695, 0.33839966]])"
  3451. ]
  3452. },
  3453. "execution_count": 134,
  3454. "metadata": {},
  3455. "output_type": "execute_result"
  3456. }
  3457. ],
  3458. "source": [
  3459. "A # A is unchanged this time, because B is a copy of A's data rather than a view of it"
  3460. ]
  3461. },
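The difference between a view and a copy can also be tested directly instead of inferred from printed values. The sketch below assumes a contiguous array, for which `reshape` returns a view, and uses `np.shares_memory` to ask whether two arrays overlap in memory.

```python
import numpy as np

A = np.arange(12.0).reshape(4, 3)

view = A.reshape(1, 12)   # for a contiguous array, reshape returns a view
copy = A.flatten()        # flatten always returns a copy

shares_view = np.shares_memory(A, view)
shares_copy = np.shares_memory(A, copy)

view[0, 0] = 5.0   # writes through to A
copy[1] = 10.0     # does not touch A

print(shares_view, shares_copy, A[0, 0], A[0, 1])
```

If a reshaped result must be independent of the original, calling `.copy()` on it makes that explicit.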
  3462. {
  3463. "cell_type": "markdown",
  3464. "metadata": {},
  3465. "source": [
  3466. "## 9. Adding and removing dimensions: newaxis, squeeze"
  3467. ]
  3468. },
  3469. {
  3470. "cell_type": "markdown",
  3471. "metadata": {},
  3472. "source": [
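  3473. "For matrix multiplication to work, the corresponding dimensions of the two operands must agree. With `newaxis` we can insert a new dimension into an array, for example to convert a vector into a column or row matrix:"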
  3474. ]
  3475. },
  3476. {
  3477. "cell_type": "code",
  3478. "execution_count": 135,
  3479. "metadata": {},
  3480. "outputs": [],
  3481. "source": [
  3482. "v = np.array([1,2,3])"
  3483. ]
  3484. },
  3485. {
  3486. "cell_type": "code",
  3487. "execution_count": 136,
  3488. "metadata": {},
  3489. "outputs": [
  3490. {
  3491. "name": "stdout",
  3492. "output_type": "stream",
  3493. "text": [
  3494. "(3,)\n",
  3495. "[1 2 3]\n"
  3496. ]
  3497. }
  3498. ],
  3499. "source": [
  3500. "print(np.shape(v))\n",
  3501. "print(v)"
  3502. ]
  3503. },
  3504. {
  3505. "cell_type": "code",
  3506. "execution_count": 137,
  3507. "metadata": {},
  3508. "outputs": [
  3509. {
  3510. "name": "stdout",
  3511. "output_type": "stream",
  3512. "text": [
  3513. "(3, 1)\n",
  3514. "[[1]\n",
  3515. " [2]\n",
  3516. " [3]]\n"
  3517. ]
  3518. }
  3519. ],
  3520. "source": [
  3521. "v2 = v.reshape(3, 1)\n",
  3522. "print(v2.shape)\n",
  3523. "print(v2)"
  3524. ]
  3525. },
  3526. {
  3527. "cell_type": "code",
  3528. "execution_count": 138,
  3529. "metadata": {},
  3530. "outputs": [
  3531. {
  3532. "name": "stdout",
  3533. "output_type": "stream",
  3534. "text": [
  3535. "(3,)\n",
  3536. "(3, 1)\n"
  3537. ]
  3538. }
  3539. ],
  3540. "source": [
  3541. "# make a column matrix of the vector v\n",
  3542. "v2 = v[:, np.newaxis]\n",
  3543. "print(v.shape)\n",
  3544. "print(v2.shape)\n"
  3545. ]
  3546. },
  3547. {
  3548. "cell_type": "code",
  3549. "execution_count": 139,
  3550. "metadata": {},
  3551. "outputs": [
  3552. {
  3553. "data": {
  3554. "text/plain": [
  3555. "(3, 1)"
  3556. ]
  3557. },
  3558. "execution_count": 139,
  3559. "metadata": {},
  3560. "output_type": "execute_result"
  3561. }
  3562. ],
  3563. "source": [
  3564. "# column matrix\n",
  3565. "v[:,np.newaxis].shape"
  3566. ]
  3567. },
  3568. {
  3569. "cell_type": "code",
  3570. "execution_count": 140,
  3571. "metadata": {},
  3572. "outputs": [
  3573. {
  3574. "data": {
  3575. "text/plain": [
  3576. "(1, 3)"
  3577. ]
  3578. },
  3579. "execution_count": 140,
  3580. "metadata": {},
  3581. "output_type": "execute_result"
  3582. }
  3583. ],
  3584. "source": [
  3585. "# Row matrix\n",
  3586. "v[np.newaxis,:].shape"
  3587. ]
  3588. },
  3589. {
  3590. "cell_type": "markdown",
  3591. "metadata": {},
  3592. "source": [
  3593. "The same result can be obtained with `np.expand_dims`."
  3594. ]
  3595. },
  3596. {
  3597. "cell_type": "code",
  3598. "execution_count": 141,
  3599. "metadata": {},
  3600. "outputs": [
  3601. {
  3602. "name": "stdout",
  3603. "output_type": "stream",
  3604. "text": [
  3605. "(3, 1)\n",
  3606. "[[1]\n",
  3607. " [2]\n",
  3608. " [3]]\n"
  3609. ]
  3610. }
  3611. ],
  3612. "source": [
  3613. "v = np.array([1,2,3])\n",
  3614. "v3 = np.expand_dims(v, 1)\n",
  3615. "print(v3.shape)\n",
  3616. "print(v3)"
  3617. ]
  3618. },
  3619. {
  3620. "cell_type": "markdown",
  3621. "metadata": {},
  3622. "source": [
  3623. "In some cases a dimension of length 1 needs to be removed; `np.squeeze` does this."
  3624. ]
  3625. },
  3626. {
  3627. "cell_type": "code",
  3628. "execution_count": 142,
  3629. "metadata": {},
  3630. "outputs": [
  3631. {
  3632. "name": "stdout",
  3633. "output_type": "stream",
  3634. "text": [
  3635. "(1, 2, 3)\n",
  3636. "[[[1 2 3]\n",
  3637. " [2 3 4]]]\n"
  3638. ]
  3639. }
  3640. ],
  3641. "source": [
  3642. "arr = np.array([[[1, 2, 3], [2, 3, 4]]])\n",
  3643. "print(arr.shape)\n",
  3644. "print(arr)"
  3645. ]
  3646. },
  3647. {
  3648. "cell_type": "code",
  3649. "execution_count": 143,
  3650. "metadata": {},
  3651. "outputs": [
  3652. {
  3653. "name": "stdout",
  3654. "output_type": "stream",
  3655. "text": [
  3656. "(2, 3)\n",
  3657. "[[1 2 3]\n",
  3658. " [2 3 4]]\n"
  3659. ]
  3660. }
  3661. ],
  3662. "source": [
  3663. "# The first dimension has length 1, so we do not need it\n",
  3664. "arr2 = np.squeeze(arr, 0)\n",
  3665. "print(arr2.shape)\n",
  3666. "print(arr2)"
  3667. ]
  3668. },
  3669. {
  3670. "cell_type": "markdown",
  3671. "metadata": {},
  3672. "source": [
  3673. "Note: an axis can be removed only if the array has length 1 along that axis; otherwise `np.squeeze` raises an error."
  3674. ]
  3675. },
  3676. {
  3677. "cell_type": "markdown",
  3678. "metadata": {},
  3679. "source": [
  3680. "## 10. Stacking and repeating arrays"
  3681. ]
  3682. },
  3683. {
  3684. "cell_type": "markdown",
  3685. "metadata": {},
  3686. "source": [
  3687. "The functions `repeat`, `tile`, `vstack`, `hstack`, and `concatenate` build larger vectors and matrices from smaller ones."
  3688. ]
  3689. },
  3690. {
  3691. "cell_type": "markdown",
  3692. "metadata": {},
  3693. "source": [
  3694. "### 10.1 tile and repeat"
  3695. ]
  3696. },
  3697. {
  3698. "cell_type": "code",
  3699. "execution_count": 144,
  3700. "metadata": {},
  3701. "outputs": [
  3702. {
  3703. "name": "stdout",
  3704. "output_type": "stream",
  3705. "text": [
  3706. "[[1 2]\n",
  3707. " [3 4]]\n"
  3708. ]
  3709. }
  3710. ],
  3711. "source": [
  3712. "a = np.array([[1, 2], [3, 4]])\n",
  3713. "print(a)"
  3714. ]
  3715. },
  3716. {
  3717. "cell_type": "code",
  3718. "execution_count": 145,
  3719. "metadata": {},
  3720. "outputs": [
  3721. {
  3722. "data": {
  3723. "text/plain": [
  3724. "array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])"
  3725. ]
  3726. },
  3727. "execution_count": 145,
  3728. "metadata": {},
  3729. "output_type": "execute_result"
  3730. }
  3731. ],
  3732. "source": [
  3733. "# Repeat each element three times (without an axis argument the result is flattened)\n",
  3734. "np.repeat(a, 3)"
  3735. ]
  3736. },
  3737. {
  3738. "cell_type": "code",
  3739. "execution_count": 146,
  3740. "metadata": {},
  3741. "outputs": [
  3742. {
  3743. "data": {
  3744. "text/plain": [
  3745. "array([[1, 2, 1, 2, 1, 2],\n",
  3746. " [3, 4, 3, 4, 3, 4]])"
  3747. ]
  3748. },
  3749. "execution_count": 146,
  3750. "metadata": {},
  3751. "output_type": "execute_result"
  3752. }
  3753. ],
  3754. "source": [
  3755. "# tile the matrix 3 times \n",
  3756. "np.tile(a, 3)"
  3757. ]
  3758. },
  3759. {
  3760. "cell_type": "code",
  3761. "execution_count": 147,
  3762. "metadata": {},
  3763. "outputs": [
  3764. {
  3765. "data": {
  3766. "text/plain": [
  3767. "array([[1, 2, 1, 2, 1, 2],\n",
  3768. " [3, 4, 3, 4, 3, 4]])"
  3769. ]
  3770. },
  3771. "execution_count": 147,
  3772. "metadata": {},
  3773. "output_type": "execute_result"
  3774. }
  3775. ],
  3776. "source": [
  3777. "# A more explicit form: once along the rows, three times along the columns\n",
  3778. "np.tile(a, (1, 3))"
  3779. ]
  3780. },
  3781. {
  3782. "cell_type": "code",
  3783. "execution_count": 148,
  3784. "metadata": {},
  3785. "outputs": [
  3786. {
  3787. "data": {
  3788. "text/plain": [
  3789. "array([[1, 2],\n",
  3790. " [3, 4],\n",
  3791. " [1, 2],\n",
  3792. " [3, 4],\n",
  3793. " [1, 2],\n",
  3794. " [3, 4]])"
  3795. ]
  3796. },
  3797. "execution_count": 148,
  3798. "metadata": {},
  3799. "output_type": "execute_result"
  3800. }
  3801. ],
  3802. "source": [
  3803. "np.tile(a, (3, 1))"
  3804. ]
  3805. },
  3806. {
  3807. "cell_type": "markdown",
  3808. "metadata": {},
  3809. "source": [
  3810. "### 10.2 concatenate"
  3811. ]
  3812. },
  3813. {
  3814. "cell_type": "code",
  3815. "execution_count": 149,
  3816. "metadata": {},
  3817. "outputs": [],
  3818. "source": [
  3819. "b = np.array([[5, 6]])"
  3820. ]
  3821. },
  3822. {
  3823. "cell_type": "code",
  3824. "execution_count": 150,
  3825. "metadata": {},
  3826. "outputs": [
  3827. {
  3828. "data": {
  3829. "text/plain": [
  3830. "array([[1, 2],\n",
  3831. " [3, 4],\n",
  3832. " [5, 6]])"
  3833. ]
  3834. },
  3835. "execution_count": 150,
  3836. "metadata": {},
  3837. "output_type": "execute_result"
  3838. }
  3839. ],
  3840. "source": [
  3841. "np.concatenate((a, b), axis=0)"
  3842. ]
  3843. },
  3844. {
  3845. "cell_type": "code",
  3846. "execution_count": 151,
  3847. "metadata": {},
  3848. "outputs": [
  3849. {
  3850. "data": {
  3851. "text/plain": [
  3852. "array([[1, 2, 5],\n",
  3853. " [3, 4, 6]])"
  3854. ]
  3855. },
  3856. "execution_count": 151,
  3857. "metadata": {},
  3858. "output_type": "execute_result"
  3859. }
  3860. ],
  3861. "source": [
  3862. "np.concatenate((a, b.T), axis=1)"
  3863. ]
  3864. },
  3865. {
  3866. "cell_type": "markdown",
  3867. "metadata": {},
  3868. "source": [
  3869. "### 10.3 hstack and vstack"
  3870. ]
  3871. },
  3872. {
  3873. "cell_type": "code",
  3874. "execution_count": 152,
  3875. "metadata": {},
  3876. "outputs": [
  3877. {
  3878. "data": {
  3879. "text/plain": [
  3880. "array([[1, 2],\n",
  3881. " [3, 4],\n",
  3882. " [5, 6]])"
  3883. ]
  3884. },
  3885. "execution_count": 152,
  3886. "metadata": {},
  3887. "output_type": "execute_result"
  3888. }
  3889. ],
  3890. "source": [
  3891. "np.vstack((a,b))"
  3892. ]
  3893. },
  3894. {
  3895. "cell_type": "code",
  3896. "execution_count": 153,
  3897. "metadata": {},
  3898. "outputs": [
  3899. {
  3900. "data": {
  3901. "text/plain": [
  3902. "array([[1, 2, 5],\n",
  3903. " [3, 4, 6]])"
  3904. ]
  3905. },
  3906. "execution_count": 153,
  3907. "metadata": {},
  3908. "output_type": "execute_result"
  3909. }
  3910. ],
  3911. "source": [
  3912. "np.hstack((a,b.T))"
  3913. ]
  3914. },
  3915. {
  3916. "cell_type": "markdown",
  3917. "metadata": {},
  3918. "source": [
  3919. "## 11. Copy and “deep copy”"
  3920. ]
  3921. },
  3922. {
  3923. "cell_type": "markdown",
  3924. "metadata": {},
  3925. "source": [
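  3926. "To achieve high performance, assignment in Python usually does not copy the underlying object. For example, objects passed between functions are passed by reference, which avoids unnecessary copying of large amounts of memory."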
  3927. ]
  3928. },
  3929. {
  3930. "cell_type": "code",
  3931. "execution_count": 154,
  3932. "metadata": {},
  3933. "outputs": [
  3934. {
  3935. "data": {
  3936. "text/plain": [
  3937. "array([[1, 2],\n",
  3938. " [3, 4]])"
  3939. ]
  3940. },
  3941. "execution_count": 154,
  3942. "metadata": {},
  3943. "output_type": "execute_result"
  3944. }
  3945. ],
  3946. "source": [
  3947. "A = np.array([[1, 2], [3, 4]])\n",
  3948. "\n",
  3949. "A"
  3950. ]
  3951. },
  3952. {
  3953. "cell_type": "code",
  3954. "execution_count": 155,
  3955. "metadata": {},
  3956. "outputs": [],
  3957. "source": [
  3958. "# B now refers to the same array data as A\n",
  3959. "B = A "
  3960. ]
  3961. },
  3962. {
  3963. "cell_type": "code",
  3964. "execution_count": 156,
  3965. "metadata": {},
  3966. "outputs": [
  3967. {
  3968. "data": {
  3969. "text/plain": [
  3970. "array([[10, 2],\n",
  3971. " [ 3, 4]])"
  3972. ]
  3973. },
  3974. "execution_count": 156,
  3975. "metadata": {},
  3976. "output_type": "execute_result"
  3977. }
  3978. ],
  3979. "source": [
  3980. "# Changing B also changes A\n",
  3981. "B[0,0] = 10\n",
  3982. "\n",
  3983. "B"
  3984. ]
  3985. },
  3986. {
  3987. "cell_type": "code",
  3988. "execution_count": 157,
  3989. "metadata": {},
  3990. "outputs": [
  3991. {
  3992. "data": {
  3993. "text/plain": [
  3994. "array([[10, 2],\n",
  3995. " [ 3, 4]])"
  3996. ]
  3997. },
  3998. "execution_count": 157,
  3999. "metadata": {},
  4000. "output_type": "execute_result"
  4001. }
  4002. ],
  4003. "source": [
  4004. "A"
  4005. ]
  4006. },
  4007. {
  4008. "cell_type": "markdown",
  4009. "metadata": {},
  4010. "source": [
  4011. "To avoid this reference behavior, so that `B` is a new, completely independent object copied from `A`, we need to use the function `copy` to make a so-called “deep copy”:"
  4012. ]
  4013. },
  4014. {
  4015. "cell_type": "code",
  4016. "execution_count": 158,
  4017. "metadata": {},
  4018. "outputs": [],
  4019. "source": [
  4020. "B = np.copy(A)"
  4021. ]
  4022. },
  4023. {
  4024. "cell_type": "code",
  4025. "execution_count": 159,
  4026. "metadata": {},
  4027. "outputs": [
  4028. {
  4029. "data": {
  4030. "text/plain": [
  4031. "array([[-5, 2],\n",
  4032. " [ 3, 4]])"
  4033. ]
  4034. },
  4035. "execution_count": 159,
  4036. "metadata": {},
  4037. "output_type": "execute_result"
  4038. }
  4039. ],
  4040. "source": [
  4041. "# Now changing B leaves A unaffected\n",
  4042. "B[0,0] = -5\n",
  4043. "\n",
  4044. "B"
  4045. ]
  4046. },
  4047. {
  4048. "cell_type": "code",
  4049. "execution_count": 160,
  4050. "metadata": {},
  4051. "outputs": [
  4052. {
  4053. "data": {
  4054. "text/plain": [
  4055. "array([[10, 2],\n",
  4056. " [ 3, 4]])"
  4057. ]
  4058. },
  4059. "execution_count": 160,
  4060. "metadata": {},
  4061. "output_type": "execute_result"
  4062. }
  4063. ],
  4064. "source": [
  4065. "A"
  4066. ]
  4067. },
  4068. {
  4069. "cell_type": "markdown",
  4070. "metadata": {},
  4071. "source": [
  4072. "## 12. Iterating over array elements"
  4073. ]
  4074. },
  4075. {
  4076. "cell_type": "markdown",
  4077. "metadata": {},
  4078. "source": [
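  4079. "In general, we want to avoid iterating over array elements whenever we can. The reason is that in an interpreted language such as Python (or MATLAB), iteration is very slow compared with vectorized operations.\n",
  4080. "\n",
  4081. "Sometimes, however, iteration is unavoidable. In that case the Python `for` loop is the most convenient way to iterate over an array:"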
  4082. ]
  4083. },
  4084. {
  4085. "cell_type": "code",
  4086. "execution_count": 161,
  4087. "metadata": {},
  4088. "outputs": [
  4089. {
  4090. "name": "stdout",
  4091. "output_type": "stream",
  4092. "text": [
  4093. "1\n",
  4094. "2\n",
  4095. "3\n",
  4096. "4\n"
  4097. ]
  4098. }
  4099. ],
  4100. "source": [
  4101. "v = np.array([1,2,3,4])\n",
  4102. "\n",
  4103. "for element in v:\n",
  4104. "    print(element)"
  4105. ]
  4106. },
  4107. {
  4108. "cell_type": "code",
  4109. "execution_count": 162,
  4110. "metadata": {},
  4111. "outputs": [
  4112. {
  4113. "name": "stdout",
  4114. "output_type": "stream",
  4115. "text": [
  4116. "row [1 2]\n",
  4117. "1\n",
  4118. "2\n",
  4119. "row [3 4]\n",
  4120. "3\n",
  4121. "4\n"
  4122. ]
  4123. }
  4124. ],
  4125. "source": [
  4126. "M = np.array([[1,2], [3,4]])\n",
  4127. "\n",
  4128. "for row in M:\n",
  4129. "    print(\"row\", row)\n",
  4130. "\n",
  4131. "    for element in row:\n",
  4132. "        print(element)"
  4133. ]
  4134. },
  4135. {
  4136. "cell_type": "markdown",
  4137. "metadata": {},
  4138. "source": [
  4140. "When we need to iterate over every element of an array and modify it, the `enumerate` function is a convenient way to get both the element and its index inside a `for` loop:"
  4141. ]
  4142. },
  4143. {
  4144. "cell_type": "code",
  4145. "execution_count": 163,
  4146. "metadata": {},
  4147. "outputs": [
  4148. {
  4149. "name": "stdout",
  4150. "output_type": "stream",
  4151. "text": [
  4152. "row_idx 0 row [1 2]\n",
  4153. "col_idx 0 element 1\n",
  4154. "col_idx 1 element 2\n",
  4155. "row_idx 1 row [3 4]\n",
  4156. "col_idx 0 element 3\n",
  4157. "col_idx 1 element 4\n"
  4158. ]
  4159. }
  4160. ],
  4161. "source": [
  4162. "for row_idx, row in enumerate(M):\n",
  4163. "    print(\"row_idx\", row_idx, \"row\", row)\n",
  4164. "\n",
  4165. "    for col_idx, element in enumerate(row):\n",
  4166. "        print(\"col_idx\", col_idx, \"element\", element)\n",
  4167. "\n",
  4168. "        # Update the matrix: square each element\n",
  4169. "        M[row_idx, col_idx] = element ** 2"
  4170. ]
  4171. },
  4172. {
  4173. "cell_type": "code",
  4174. "execution_count": 164,
  4175. "metadata": {},
  4176. "outputs": [
  4177. {
  4178. "data": {
  4179. "text/plain": [
  4180. "array([[ 1, 4],\n",
  4181. " [ 9, 16]])"
  4182. ]
  4183. },
  4184. "execution_count": 164,
  4185. "metadata": {},
  4186. "output_type": "execute_result"
  4187. }
  4188. ],
  4189. "source": [
  4190. "# Every element of the matrix has now been squared\n",
  4191. "M"
  4192. ]
  4193. },
  4194. {
  4195. "cell_type": "markdown",
  4196. "metadata": {},
  4197. "source": [
  4198. "## 13. Vectorizing functions"
  4199. ]
  4200. },
  4201. {
  4202. "cell_type": "markdown",
  4203. "metadata": {},
  4204. "source": [
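  4205. "As mentioned several times already, to get good performance we should avoid looping over the elements of vectors and matrices and use vectorized algorithms instead. The first step in converting a scalar algorithm to a vectorized one is to make sure that the functions we write accept vector input."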
  4206. ]
  4207. },
  4208. {
  4209. "cell_type": "code",
  4210. "execution_count": 165,
  4211. "metadata": {},
  4212. "outputs": [],
  4213. "source": [
  4214. "def Theta(x):\n",
  4215. "    \"\"\"\n",
  4216. "    Scalar implementation of the Heaviside step function.\n",
  4217. "    \"\"\"\n",
  4218. "    if x >= 0:\n",
  4219. "        return 1\n",
  4220. "    else:\n",
  4221. "        return 0"
  4222. ]
  4223. },
  4224. {
  4225. "cell_type": "code",
  4226. "execution_count": 166,
  4227. "metadata": {
  4228. "scrolled": true
  4229. },
  4230. "outputs": [
  4231. {
  4232. "ename": "ValueError",
  4233. "evalue": "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()",
  4234. "output_type": "error",
  4235. "traceback": [
  4236. "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
  4237. "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
  4238. "\u001b[0;32m<ipython-input-166-b49266106206>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mTheta\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0marray\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
  4239. "\u001b[0;32m<ipython-input-165-cb840dbb09da>\u001b[0m in \u001b[0;36mTheta\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0m阶跃函数的普遍版本\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \"\"\"\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0;32mif\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m>=\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
  4240. "\u001b[0;31mValueError\u001b[0m: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
  4241. ]
  4242. }
  4243. ],
  4244. "source": [
  4245. "Theta(np.array([-3,-2,-1,0,1,2,3]))"
  4246. ]
  4247. },
  4248. {
  4249. "cell_type": "markdown",
  4250. "metadata": {},
  4251. "source": [
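  4252. "This does not work, because the `Theta` function as implemented cannot accept vector input.\n",
  4253. "\n",
  4254. "To get a vectorized version, we can use the NumPy function `vectorize`. In many cases it can vectorize a function automatically:"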
  4255. ]
  4256. },
  4257. {
  4258. "cell_type": "code",
  4259. "execution_count": 167,
  4260. "metadata": {},
  4261. "outputs": [],
  4262. "source": [
  4263. "Theta_vec = np.vectorize(Theta)"
  4264. ]
  4265. },
  4266. {
  4267. "cell_type": "code",
  4268. "execution_count": 168,
  4269. "metadata": {},
  4270. "outputs": [
  4271. {
  4272. "data": {
  4273. "text/plain": [
  4274. "array([0, 0, 0, 1, 1, 1, 1])"
  4275. ]
  4276. },
  4277. "execution_count": 168,
  4278. "metadata": {},
  4279. "output_type": "execute_result"
  4280. }
  4281. ],
  4282. "source": [
  4283. "Theta_vec(np.array([-3,-2,-1,0,1,2,3]))"
  4284. ]
  4285. },
  4286. {
  4287. "cell_type": "markdown",
  4288. "metadata": {},
  4289. "source": [
  4290. "We can also write the function to accept vector input from the start (this takes more effort, but may give better performance):"
  4291. ]
  4292. },
  4293. {
  4294. "cell_type": "code",
  4295. "execution_count": 169,
  4296. "metadata": {},
  4297. "outputs": [],
  4298. "source": [
  4299. "def Theta(x):\n",
  4300. "    \"\"\"\n",
  4301. "    Vector-aware implementation of the Heaviside step function.\n",
  4302. "    \"\"\"\n",
  4303. "    return 1 * (x >= 0)"
  4304. ]
  4305. },
  4306. {
  4307. "cell_type": "code",
  4308. "execution_count": 170,
  4309. "metadata": {},
  4310. "outputs": [
  4311. {
  4312. "data": {
  4313. "text/plain": [
  4314. "array([0, 0, 0, 1, 1, 1, 1])"
  4315. ]
  4316. },
  4317. "execution_count": 170,
  4318. "metadata": {},
  4319. "output_type": "execute_result"
  4320. }
  4321. ],
  4322. "source": [
  4323. "Theta(np.array([-3,-2,-1,0,1,2,3]))"
  4324. ]
  4325. },
  4326. {
  4327. "cell_type": "code",
  4328. "execution_count": 171,
  4329. "metadata": {},
  4330. "outputs": [
  4331. {
  4332. "name": "stdout",
  4333. "output_type": "stream",
  4334. "text": [
  4335. "[False False False True True True True]\n"
  4336. ]
  4337. },
  4338. {
  4339. "data": {
  4340. "text/plain": [
  4341. "array([0, 0, 0, 1, 1, 1, 1])"
  4342. ]
  4343. },
  4344. "execution_count": 171,
  4345. "metadata": {},
  4346. "output_type": "execute_result"
  4347. }
  4348. ],
  4349. "source": [
  4350. "a = np.array([-3,-2,-1,0,1,2,3])\n",
  4351. "b = a>=0\n",
  4352. "print(b)\n",
  4353. "b*1"
  4354. ]
  4355. },
  4356. {
  4357. "cell_type": "code",
  4358. "execution_count": 172,
  4359. "metadata": {},
  4360. "outputs": [
  4361. {
  4362. "data": {
  4363. "text/plain": [
  4364. "(0, 1)"
  4365. ]
  4366. },
  4367. "execution_count": 172,
  4368. "metadata": {},
  4369. "output_type": "execute_result"
  4370. }
  4371. ],
  4372. "source": [
  4373. "# Also works for scalars\n",
  4374. "Theta(-1.2), Theta(2.6)"
  4375. ]
  4376. },
  4377. {
  4378. "cell_type": "markdown",
  4379. "metadata": {},
  4380. "source": [
  4381. "## 14. Using arrays in conditions"
  4382. ]
  4383. },
  4384. {
  4385. "cell_type": "markdown",
  4386. "metadata": {},
  4387. "source": [
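  4388. "When an array is used in a condition, such as an `if` statement or another boolean expression, we need to use `any` or `all`, which test whether any or all of the array's elements evaluate to `True`."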
  4389. ]
  4390. },
  4391. {
  4392. "cell_type": "code",
  4393. "execution_count": 173,
  4394. "metadata": {},
  4395. "outputs": [
  4396. {
  4397. "data": {
  4398. "text/plain": [
  4399. "array([[1, 2],\n",
  4400. " [3, 4]])"
  4401. ]
  4402. },
  4403. "execution_count": 173,
  4404. "metadata": {},
  4405. "output_type": "execute_result"
  4406. }
  4407. ],
  4408. "source": [
  4409. "M = np.array([[1, 2], [3, 4]])\n",
  4410. "M"
  4411. ]
  4412. },
  4413. {
  4414. "cell_type": "code",
  4415. "execution_count": 174,
  4416. "metadata": {},
  4417. "outputs": [
  4418. {
  4419. "data": {
  4420. "text/plain": [
  4421. "True"
  4422. ]
  4423. },
  4424. "execution_count": 174,
  4425. "metadata": {},
  4426. "output_type": "execute_result"
  4427. }
  4428. ],
  4429. "source": [
  4430. "(M > 2).any()"
  4431. ]
  4432. },
  4433. {
  4434. "cell_type": "code",
  4435. "execution_count": 175,
  4436. "metadata": {},
  4437. "outputs": [
  4438. {
  4439. "name": "stdout",
  4440. "output_type": "stream",
  4441. "text": [
  4442. "at least one element in M is larger than 2\n"
  4443. ]
  4444. }
  4445. ],
  4446. "source": [
  4447. "if (M > 2).any():\n",
  4448. "    print(\"at least one element in M is larger than 2\")\n",
  4449. "else:\n",
  4450. "    print(\"no element in M is larger than 2\")"
  4451. ]
  4452. },
  4453. {
  4454. "cell_type": "code",
  4455. "execution_count": 176,
  4456. "metadata": {},
  4457. "outputs": [
  4458. {
  4459. "name": "stdout",
  4460. "output_type": "stream",
  4461. "text": [
  4462. "not all elements in M are larger than 5\n"
  4463. ]
  4464. }
  4465. ],
  4466. "source": [
  4467. "if (M > 5).all():\n",
  4468. "    print(\"all elements in M are larger than 5\")\n",
  4469. "else:\n",
  4470. "    print(\"not all elements in M are larger than 5\")"
  4471. ]
  4472. },
  4473. {
  4474. "cell_type": "markdown",
  4475. "metadata": {},
  4476. "source": [
  4477. "## 15. Type casting"
  4478. ]
  4479. },
  4480. {
  4481. "cell_type": "markdown",
  4482. "metadata": {},
  4483. "source": [
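  4484. "Because NumPy arrays are *statically typed*, the type of an array does not change once it has been created. We can, however, explicitly cast an array to another type with the `astype` function (see also the similar `asarray` function). This always creates a new array of the new type."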
  4485. ]
  4486. },
  4487. {
  4488. "cell_type": "code",
  4489. "execution_count": 177,
  4490. "metadata": {},
  4491. "outputs": [
  4492. {
  4493. "data": {
  4494. "text/plain": [
  4495. "dtype('int64')"
  4496. ]
  4497. },
  4498. "execution_count": 177,
  4499. "metadata": {},
  4500. "output_type": "execute_result"
  4501. }
  4502. ],
  4503. "source": [
  4504. "M.dtype\n"
  4505. ]
  4506. },
  4507. {
  4508. "cell_type": "code",
  4509. "execution_count": 178,
  4510. "metadata": {},
  4511. "outputs": [
  4512. {
  4513. "data": {
  4514. "text/plain": [
  4515. "array([[1., 2.],\n",
  4516. " [3., 4.]])"
  4517. ]
  4518. },
  4519. "execution_count": 178,
  4520. "metadata": {},
  4521. "output_type": "execute_result"
  4522. }
  4523. ],
  4524. "source": [
  4525. "M2 = M.astype(float)\n",
  4526. "\n",
  4527. "M2"
  4528. ]
  4529. },
  4530. {
  4531. "cell_type": "code",
  4532. "execution_count": 179,
  4533. "metadata": {},
  4534. "outputs": [
  4535. {
  4536. "data": {
  4537. "text/plain": [
  4538. "dtype('float64')"
  4539. ]
  4540. },
  4541. "execution_count": 179,
  4542. "metadata": {},
  4543. "output_type": "execute_result"
  4544. }
  4545. ],
  4546. "source": [
  4547. "M2.dtype"
  4548. ]
  4549. },
  4550. {
  4551. "cell_type": "code",
  4552. "execution_count": 180,
  4553. "metadata": {},
  4554. "outputs": [
  4555. {
  4556. "data": {
  4557. "text/plain": [
  4558. "array([[ True, True],\n",
  4559. " [ True, True]])"
  4560. ]
  4561. },
  4562. "execution_count": 180,
  4563. "metadata": {},
  4564. "output_type": "execute_result"
  4565. }
  4566. ],
  4567. "source": [
  4568. "M3 = M.astype(bool)\n",
  4569. "\n",
  4570. "M3"
  4571. ]
  4572. },
  4573. {
  4574. "cell_type": "markdown",
  4575. "metadata": {},
  4576. "source": [
  4577. "## 16. Further reading"
  4578. ]
  4579. },
  4580. {
  4581. "cell_type": "markdown",
  4582. "metadata": {},
  4583. "source": [
  4584. "* [A short NumPy tutorial](https://www.runoob.com/numpy/numpy-tutorial.html)\n",
  4585. "* [NumPy user guide](https://www.numpy.org.cn/user/)\n",
  4586. "* [NumPy reference manual](https://www.numpy.org.cn/reference/)\n",
  4587. "* [NumPy for MATLAB users](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html)"
  4588. ]
  4589. }
  4590. ],
  4591. "metadata": {
  4592. "kernelspec": {
  4593. "display_name": "Python 3",
  4594. "language": "python",
  4595. "name": "python3"
  4596. },
  4597. "language_info": {
  4598. "codemirror_mode": {
  4599. "name": "ipython",
  4600. "version": 3
  4601. },
  4602. "file_extension": ".py",
  4603. "mimetype": "text/x-python",
  4604. "name": "python",
  4605. "nbconvert_exporter": "python",
  4606. "pygments_lexer": "ipython3",
  4607. "version": "3.8.3"
  4608. }
  4609. },
  4610. "nbformat": 4,
  4611. "nbformat_minor": 1
  4612. }

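Machine learning is increasingly applied in fields such as aircraft and robotics. Its goal is to use computers to achieve human-like intelligence, making equipment intelligent and unmanned. This course aims to guide students to master the basic knowledge, typical methods, and techniques of machine learning, to spark their interest in the subject through concrete application cases, and to encourage them to analyze and solve the problems and challenges facing aircraft and robots from the perspective of artificial intelligence. The course covers Python programming basics, machine learning models, and the fundamentals and implementation of unsupervised learning, supervised learning, and deep learning. It also teaches how to use machine learning to solve practical problems, so that students improve their overall abilities.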