Improve some description

pull/2/MERGE
bushuhui 4 years ago
parent
commit
3331b849d2
8 changed files with 439 additions and 290 deletions
  1. 0_python/1_Basics.ipynb (+83, -51)
  2. 0_python/2_Print_Statement.ipynb (+24, -18)
  3. 0_python/3_Data_Structure_1.ipynb (+214, -120)
  4. 0_python/4_Data_Structure_2.ipynb (+77, -74)
  5. 0_python/5_Control_Flow.ipynb (+7, -7)
  6. 0_python/README.md (+1, -1)
  7. 2_knn/knn_classification.ipynb (+32, -18)
  8. 3_kmeans/1-k-means.ipynb (+1, -1)

0_python/1_Basics.ipynb (+83, -51)

@@ -11,7 +11,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 1. Importing the Zen of Python\n"
"## 1. Importing Libraries and the Zen of Python\n"
] ]
}, },
{ {
@@ -22,16 +22,24 @@
{ {
"data": { "data": {
"text/plain": [ "text/plain": [
"['1_Basics.ipynb',\n",
"['.ipynb_checkpoints',\n",
" 'Python.pdf',\n",
" '1_Basics_EN.ipynb',\n",
" '2_Print_Statement_EN.ipynb',\n",
" '4_Data_Structure_2_EN.ipynb',\n",
" '5_Control_Flow_EN.ipynb',\n",
" '6_Function_EN.ipynb',\n",
" 'README.md',\n",
" 'README_EN.md',\n",
" '1_Basics.ipynb',\n",
" '2_Print_Statement.ipynb',\n", " '2_Print_Statement.ipynb',\n",
" '3_Data_Structure_1.ipynb',\n", " '3_Data_Structure_1.ipynb',\n",
" '3_Data_Structure_1_EN.ipynb',\n",
" '4_Data_Structure_2.ipynb',\n", " '4_Data_Structure_2.ipynb',\n",
" '5_Control_Flow.ipynb',\n", " '5_Control_Flow.ipynb',\n",
" '6_Function.ipynb',\n", " '6_Function.ipynb',\n",
" '7_Class.ipynb',\n", " '7_Class.ipynb',\n",
" 'Python.pdf',\n",
" 'README.md',\n",
" '.ipynb_checkpoints']"
" '7_Class_EN.ipynb']"
] ]
}, },
"execution_count": 1, "execution_count": 1,
@@ -47,7 +55,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
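@@ -86,6 +94,30 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"**The Zen of Python, by Tim Peters**\n",
"```\n",
"Beautiful is better than ugly. (Python aims at writing beautiful code.)\n",
"Explicit is better than implicit. (Beautiful code should be clear, with consistent naming and a uniform style.)\n",
"Simple is better than complex. (Beautiful code should be simple, without complicated internals.)\n",
"Complex is better than complicated. (If complexity is unavoidable, the code should still have no hard-to-follow relationships; keep interfaces simple.)\n",
"Flat is better than nested. (Beautiful code should be flat, without too much nesting.)\n",
"Sparse is better than dense. (Beautiful code is suitably spaced; do not expect to solve a problem in one line.)\n",
"Readability counts. (Beautiful code is readable.)\n",
"Special cases are not special enough to break the rules, even in the name of practicality. (These rules are supreme.)\n",
"Errors should never pass silently, unless explicitly silenced. (Catch exceptions precisely; do not write except:pass style code.)\n",
"In the face of ambiguity, refuse the temptation to guess.\n",
"There should be one, and preferably only one, obvious way to do it. (If unsure, enumerate the options.)\n",
"Although that way may not be obvious at first unless you're Dutch. (Here Dutch refers to Guido.)\n",
"Now is better than never, although never is often better than *right* now. (Think carefully before acting.)\n",
"If the implementation is hard to explain, it's a bad idea; if it is easy to explain, it may be a good idea. (A yardstick for evaluating a design.)\n",
"Namespaces are one honking great idea; let's do more of those! (A call to action.)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Variables" "## 2. Variables"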
] ]
}, },
@@ -98,7 +130,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -109,7 +141,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -133,7 +165,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 2,
"execution_count": 5,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -142,7 +174,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 3,
"execution_count": 6,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -188,7 +220,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 4,
"execution_count": 7,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -197,7 +229,7 @@
"3" "3"
] ]
}, },
"execution_count": 4,
"execution_count": 7,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -275,7 +307,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 14,
"execution_count": 9,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -284,7 +316,7 @@
"0.5" "0.5"
] ]
}, },
"execution_count": 14,
"execution_count": 9,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -315,7 +347,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 15,
"execution_count": 10,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -324,7 +356,7 @@
"5" "5"
] ]
}, },
"execution_count": 15,
"execution_count": 10,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
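@@ -337,12 +369,12 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Floor division rounds the result obtained this way down to the nearest integer."
"Floor division (floor divide) rounds the result obtained this way down to the nearest integer."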
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 16,
"execution_count": 11,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -351,7 +383,7 @@
"1.0" "1.0"
] ]
}, },
"execution_count": 16,
"execution_count": 11,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -383,7 +415,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 9,
"execution_count": 12,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -392,7 +424,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 10,
"execution_count": 13,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -401,7 +433,7 @@
"True" "True"
] ]
}, },
"execution_count": 10,
"execution_count": 13,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -412,7 +444,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 11,
"execution_count": 14,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -421,7 +453,7 @@
"False" "False"
] ]
}, },
"execution_count": 11,
"execution_count": 14,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -453,7 +485,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 20,
"execution_count": 16,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -463,7 +495,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 22,
"execution_count": 17,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -482,7 +514,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 23,
"execution_count": 18,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -491,7 +523,7 @@
"2" "2"
] ]
}, },
"execution_count": 23,
"execution_count": 18,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -513,7 +545,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 12,
"execution_count": 19,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -522,7 +554,7 @@
"10" "10"
] ]
}, },
"execution_count": 12,
"execution_count": 19,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -560,7 +592,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### 4.1 Converting from One System to Another"
"### 4.1 Converting from One Number Base to Another"
] ]
}, },
{ {
@@ -572,7 +604,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 25,
"execution_count": 20,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -581,7 +613,7 @@
"'0xaa'" "'0xaa'"
] ]
}, },
"execution_count": 25,
"execution_count": 20,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -592,7 +624,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 26,
"execution_count": 21,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -601,7 +633,7 @@
"170" "170"
] ]
}, },
"execution_count": 26,
"execution_count": 21,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -612,7 +644,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 2,
"execution_count": 22,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -621,7 +653,7 @@
"'0o10'" "'0o10'"
] ]
}, },
"execution_count": 2,
"execution_count": 22,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -639,7 +671,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 9,
"execution_count": 23,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -693,7 +725,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 10,
"execution_count": 24,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -702,7 +734,7 @@
"'b'" "'b'"
] ]
}, },
"execution_count": 10,
"execution_count": 24,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -713,7 +745,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 11,
"execution_count": 25,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -722,7 +754,7 @@
"98" "98"
] ]
}, },
"execution_count": 11,
"execution_count": 25,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -747,7 +779,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 12,
"execution_count": 26,
"metadata": { "metadata": {
"scrolled": false "scrolled": false
}, },
@@ -775,7 +807,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 35,
"execution_count": 27,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -800,7 +832,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 36,
"execution_count": 28,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -809,7 +841,7 @@
"(4, 1)" "(4, 1)"
] ]
}, },
"execution_count": 36,
"execution_count": 28,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -827,7 +859,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 13,
"execution_count": 29,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -946,14 +978,14 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 15,
"execution_count": 32,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
"output_type": "stream", "output_type": "stream",
"text": [ "text": [
"Type something here and it will be stored in variable abc \tHello world!\n"
"Type something here and it will be stored in variable abc \t10\n"
] ]
} }
], ],
@@ -963,7 +995,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 16,
"execution_count": 33,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -972,7 +1004,7 @@
"str" "str"
] ]
}, },
"execution_count": 16,
"execution_count": 33,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
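
The `1_Basics.ipynb` hunks above touch the floor-division description and the base-conversion cells. As a minimal sketch of that behaviour (the operands below are assumptions chosen to reproduce outputs visible in the diff, such as `1.0`, `'0xaa'`, `170`, `'0o10'`, `(4, 1)`, `'b'` and `98`; the notebook's own inputs are not shown here):

```python
# Floor division rounds toward negative infinity, not to the nearest integer.
floor_positive = 7 // 2      # 3
floor_negative = -7 // 2     # -4, not -3
floor_float = 2.5 // 2       # 1.0 (a float operand gives a float result)

# divmod returns the floor quotient and the remainder together.
quotient_and_remainder = divmod(9, 2)   # (4, 1)

# Conversions between number bases and between characters and code points.
as_hex = hex(170)            # '0xaa'
from_hex = int('0xaa', 16)   # 170
as_octal = oct(8)            # '0o10'
letter = chr(98)             # 'b'
code_point = ord('b')        # 98

print(floor_positive, floor_negative, floor_float)
print(quotient_and_remainder, as_hex, from_hex, as_octal, letter, code_point)
```

The negative case is the one worth remembering: `-7 // 2` is `-4` because the quotient is rounded down, not truncated toward zero.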


0_python/2_Print_Statement.ipynb (+24, -18)

@@ -47,19 +47,25 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 2,
"execution_count": 6,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
"output_type": "stream", "output_type": "stream",
"text": [ "text": [
"Hey\n"
"Hey\n",
"line1line2\n"
] ]
} }
], ],
"source": [ "source": [
"print('Hey')"
"print('Hey')\n",
"a = 'line1\\\n",
"line2\\\n",
"\\\n",
"'\n",
"print(a)"
] ]
}, },
{ {
@@ -92,7 +98,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1,
"execution_count": 7,
"metadata": { "metadata": {
"scrolled": true "scrolled": true
}, },
@@ -243,7 +249,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 13,
"execution_count": 14,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -255,12 +261,12 @@
} }
], ],
"source": [ "source": [
"print(\"I want to be printed %s\" %'here')"
"print(\"I want to be printed %s\" % 'here')"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 14,
"execution_count": 15,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -277,7 +283,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 15,
"execution_count": 16,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -301,7 +307,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 16,
"execution_count": 17,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -370,7 +376,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# 2. PrecisionWidth and FieldWidth"
"# 2. `PrecisionWidth``FieldWidth`"
] ]
}, },
{ {
@@ -384,7 +390,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 19,
"execution_count": 18,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -393,7 +399,7 @@
"'3.121312'" "'3.121312'"
] ]
}, },
"execution_count": 19,
"execution_count": 18,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -411,7 +417,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 20,
"execution_count": 19,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -420,7 +426,7 @@
"'3.12131'" "'3.12131'"
] ]
}, },
"execution_count": 20,
"execution_count": 19,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -438,22 +444,22 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 21,
"execution_count": 25,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"data": { "data": {
"text/plain": [ "text/plain": [
"'  3.12131'"
"'-33.12131'"
] ]
}, },
"execution_count": 21,
"execution_count": 25,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
], ],
"source": [ "source": [
"\"%9.5f\" % 3.121312312312"
"\"%9.5f\" % -33.121312312312"
] ]
}, },
{ {
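
The `2_Print_Statement.ipynb` hunks above add a backslash-continued string and change the `"%9.5f"` example to a negative number. A small sketch of both, using the values from the diff plus one extra value (`-333.121312312312`, an assumption added here to show what happens when the number does not fit):

```python
# A backslash at the end of a line inside a string literal joins the lines.
joined = 'line1\
line2\
\
'

# In "%9.5f", 9 is the minimum field width and 5 the digits after the point.
padded = "%9.5f" % 3.121312312312        # '  3.12131' (two leading spaces)
exact_fit = "%9.5f" % -33.121312312312   # '-33.12131' (exactly 9 characters)
too_wide = "%9.5f" % -333.121312312312   # '-333.12131' (width grows to 10)

print(repr(joined))
print(repr(padded), repr(exact_fit), repr(too_wide))
```

The field width is only a lower bound: a value that needs more characters is printed in full and never truncated.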


0_python/3_Data_Structure_1.ipynb (+214, -120)
File diff suppressed because it is too large


0_python/4_Data_Structure_2.ipynb (+77, -74)

@@ -4,6 +4,8 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Data Structures 2\n",
"\n",
"## 1. Strings" "## 1. Strings"
] ]
}, },
@@ -16,17 +18,9 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 4,
"execution_count": 1,
"metadata": {}, "metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10\n"
]
}
],
"outputs": [],
"source": [ "source": [
"String0 = 'Taj Mahal is beautiful'\n", "String0 = 'Taj Mahal is beautiful'\n",
"String1 = \"Taj Mahal is beautiful\"\n", "String1 = \"Taj Mahal is beautiful\"\n",
@@ -37,7 +31,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -67,7 +61,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 5,
"execution_count": 3,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -100,19 +94,21 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
"output_type": "stream", "output_type": "stream",
"text": [ "text": [
"Taj Mahal is beautiful\n",
"7\n", "7\n",
"-1\n" "-1\n"
] ]
} }
], ],
"source": [ "source": [
"print(String0)\n",
"print(String0.find('al'))\n", "print(String0.find('al'))\n",
"print(String0.find('am'))" "print(String0.find('am'))"
] ]
@@ -126,7 +122,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -150,7 +146,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 8,
"execution_count": 7,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -176,7 +172,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -201,7 +197,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 10,
"execution_count": 11,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -210,7 +206,7 @@
"' Taj Mahal is beautiful '" "' Taj Mahal is beautiful '"
] ]
}, },
"execution_count": 10,
"execution_count": 11,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -228,7 +224,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 11,
"execution_count": 12,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -237,7 +233,7 @@
"'------------------------Taj Mahal is beautiful------------------------'" "'------------------------Taj Mahal is beautiful------------------------'"
] ]
}, },
"execution_count": 11,
"execution_count": 12,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -255,7 +251,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 12,
"execution_count": 13,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -264,7 +260,7 @@
"'00000000Taj Mahal is beautiful'" "'00000000Taj Mahal is beautiful'"
] ]
}, },
"execution_count": 12,
"execution_count": 13,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -282,7 +278,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 13,
"execution_count": 14,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -313,7 +309,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 14,
"execution_count": 15,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -331,7 +327,7 @@
"traceback": [ "traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-14-a7d6b97b4839>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Taj'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Mahal'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Mahal'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m20\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m<ipython-input-15-a7d6b97b4839>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Taj'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Mahal'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Mahal'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m20\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mValueError\u001b[0m: substring not found" "\u001b[0;31mValueError\u001b[0m: substring not found"
] ]
} }
@@ -351,7 +347,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 15,
"execution_count": 16,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -375,7 +371,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 14,
"execution_count": 17,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -401,7 +397,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 15,
"execution_count": 18,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -427,7 +423,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 16,
"execution_count": 19,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -436,7 +432,7 @@
"'*a_a-'" "'*a_a-'"
] ]
}, },
"execution_count": 16,
"execution_count": 19,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -447,7 +443,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 6,
"execution_count": 20,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -456,7 +452,7 @@
"'1\\n2'" "'1\\n2'"
] ]
}, },
"execution_count": 6,
"execution_count": 20,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -481,7 +477,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 17,
"execution_count": 21,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -509,7 +505,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 18,
"execution_count": 22,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -534,7 +530,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 19,
"execution_count": 23,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -666,7 +662,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 16,
"execution_count": 24,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -682,7 +678,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 17,
"execution_count": 25,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -691,7 +687,7 @@
"'hello'" "'hello'"
] ]
}, },
"execution_count": 17,
"execution_count": 25,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -709,7 +705,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 20,
"execution_count": 26,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -718,7 +714,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 21,
"execution_count": 27,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -727,7 +723,7 @@
"' ***----hello---******* '" "' ***----hello---******* '"
] ]
}, },
"execution_count": 21,
"execution_count": 27,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
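@@ -799,7 +795,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Dictionaries are more like a database, because here you can index a particular sequence with a user-defined string."
"Dictionaries are more like a database, because here you can index a particular sequence with a user-defined string."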
] ]
}, },
{ {
@@ -811,7 +807,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 24,
"execution_count": 28,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -837,14 +833,14 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 25,
"execution_count": 29,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
"output_type": "stream", "output_type": "stream",
"text": [ "text": [
"{'OneTwo': 12, 'One': 1}\n"
"{'One': 1, 'OneTwo': 12}\n"
] ]
} }
], ],
@@ -856,14 +852,14 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 26,
"execution_count": 30,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
"output_type": "stream", "output_type": "stream",
"text": [ "text": [
"{'key2': [1, 2, 4], 'key1': 1, 3: (1, 4, 6)}\n"
"{'key1': 1, 'key2': [1, 2, 4], 3: (1, 4, 6)}\n"
] ]
} }
], ],
@@ -881,7 +877,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 27,
"execution_count": 31,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -905,7 +901,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1,
"execution_count": 32,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -922,14 +918,14 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 4,
"execution_count": 33,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
"output_type": "stream", "output_type": "stream",
"text": [ "text": [
"{'One': 1, 'Four': 4, 'Three': 3, 'Five': 5, 'Two': 2}\n"
"{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}\n"
] ]
} }
], ],
@@ -983,14 +979,18 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 6,
"execution_count": 34,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout",
"output_type": "stream",
"text": [
"{}\n"
"ename": "NameError",
"evalue": "name 'a1' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-34-ea21b6f6055b>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0ma1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mclear\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mNameError\u001b[0m: name 'a1' is not defined"
] ]
} }
], ],
@@ -1008,14 +1008,14 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 9,
"execution_count": 35,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
"output_type": "stream", "output_type": "stream",
"text": [ "text": [
"{'One': 1, 'Four': 4, 'Three': 3, 'Five': 5, 'Two': 2}\n"
"{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}\n"
] ]
} }
], ],
@@ -1052,16 +1052,16 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 10,
"execution_count": 36,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"data": { "data": {
"text/plain": [ "text/plain": [
"dict_values([1, 4, 3, 5, 2])"
"dict_values([1, 2, 3, 4, 5])"
] ]
}, },
"execution_count": 10,
"execution_count": 36,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -1079,16 +1079,16 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 11,
"execution_count": 37,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"data": { "data": {
"text/plain": [ "text/plain": [
"dict_keys(['One', 'Four', 'Three', 'Five', 'Two'])"
"dict_keys(['One', 'Two', 'Three', 'Four', 'Five'])"
] ]
}, },
"execution_count": 11,
"execution_count": 37,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -1106,7 +1106,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 12,
"execution_count": 38,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -1114,10 +1114,10 @@
"output_type": "stream", "output_type": "stream",
"text": [ "text": [
"[ One] 1\n", "[ One] 1\n",
"[ Four] 4\n",
"[ Two] 2\n",
"[ Three] 3\n", "[ Three] 3\n",
"[ Five] 5\n",
"[ Two] 2\n"
"[ Four] 4\n",
"[ Five] 5\n"
] ]
} }
], ],
@@ -1137,15 +1137,18 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 13,
"execution_count": 41,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"name": "stdout",
"output_type": "stream",
"text": [
"{'One': 1, 'Three': 3, 'Five': 5, 'Two': 2}\n",
"4\n"
"ename": "KeyError",
"evalue": "'Four'",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-41-807ce32acb5c>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0ma2\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0ma1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Four'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma2\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mKeyError\u001b[0m: 'Four'"
] ]
} }
], ],
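
Several outputs in the `4_Data_Structure_2.ipynb` hunks above change for reasons that are easy to miss in a notebook diff: dict outputs now appear in insertion order, and re-running the `a1.pop('Four')` cell raises `KeyError` because the key was already removed. A minimal sketch, assuming the string and dict contents shown in the diff output (the `try`/`except` wrappers and the `None` default are additions for illustration):

```python
text = 'Taj Mahal is beautiful'

# find() returns -1 when the substring is absent; index() raises ValueError.
found_at = text.find('al')        # 7
not_found = text.find('am')       # -1
try:
    text.index('Mahal', 10, 20)   # 'Mahal' does not occur in text[10:20]
    index_failed = False
except ValueError:
    index_failed = True

# Since Python 3.7, dicts are guaranteed to preserve insertion order.
a1 = {'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}
key_order = list(a1)

# pop() removes the key, so popping it a second time raises KeyError
# unless a default value is supplied.
first_pop = a1.pop('Four')          # 4
second_pop = a1.pop('Four', None)   # None instead of KeyError

print(found_at, not_found, index_failed)
print(key_order, first_pop, second_pop, a1)
```

Supplying a default to `pop()` keeps a cell safe to re-run, which avoids committing error tracebacks like the ones recorded in this diff.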


0_python/5_Control_Flow.ipynb (+7, -7)

@@ -180,7 +180,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### 4.1 For"
"### 4.1 for"
] ]
}, },
{ {
@@ -307,7 +307,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### 4.2 While"
"### 4.2 while"
] ]
}, },
{ {
@@ -363,7 +363,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 5. Break"
"## 5. break"
] ]
}, },
{ {
@@ -404,7 +404,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 6. Continue"
"## 6. continue"
] ]
}, },
{ {
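@@ -449,14 +449,14 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 7. List Comprehensions"
"## 7. List Comprehensions (列表推导)"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Python uses list comprehensions, which make it easy to generate a required list in a single line of code. For example, if I need to generate multiples of 27, I write the code with a For loop,"
"Python can use list comprehensions to generate a required list easily in a single line of code. For example, to generate multiples of 27, the code written with a `for` loop is as follows:"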
] ]
}, },
{ {
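@@ -484,7 +484,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Since you need to generate another list, a list comprehension is a more efficient way to solve this problem."
"Since you need to generate another list, a list comprehension is a more efficient way to solve this problem (this is the recommended approach)."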
] ]
}, },
{ {
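
The `5_Control_Flow.ipynb` text above contrasts a `for` loop with a list comprehension for generating multiples of 27, but the code cells themselves are not part of this diff. A minimal sketch of the comparison, assuming the first ten multiples are wanted (the range and the variable names are illustrative, not taken from the notebook):

```python
# Loop version: build the list step by step.
multiples_loop = []
for i in range(1, 11):
    multiples_loop.append(27 * i)

# Comprehension version: the same list in one expression.
multiples_comprehension = [27 * i for i in range(1, 11)]

print(multiples_loop)
print(multiples_comprehension)
```

Both produce the same list; the comprehension states the intent (a new list derived from a range) in one place.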


0_python/README.md (+1, -1)

@@ -1,7 +1,7 @@


# A Concise Python Tutorial (Learn Python in 90 Minutes) # A Concise Python Tutorial (Learn Python in 90 Minutes)


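Python is an easy-to-learn, powerful, general-purpose scripting language. Its extremely rich set of libraries makes Python capable of almost anything: web development, software development, big-data analysis, web crawling, machine learning, and more. Python's main advantage is that it lets you get most work done in a way that follows human thinking; most of the time, prepackaged libraries complete a given task quickly. Although execution may not be very efficient, this greatly shortens the time spent designing, writing, and debugging programs, which makes Python well suited to rapid trial and error.
Python is an easy-to-learn, powerful, general-purpose scripting language. Its extremely rich set of libraries makes Python capable of almost anything: web development, software development, big-data analysis, web crawling, machine learning, and more. Python's main advantage is that programs are written in a way that follows human thinking; in most cases, prepackaged libraries can complete a given task quickly. Although execution is not necessarily very efficient, this greatly shortens the time spent designing, writing, and debugging programs, which makes Python well suited to rapid experimentation and trial and error.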


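This tutorial is adapted from [IPython Notebooks to learn Python](https://github.com/rajathkmp/Python-Lectures), with some of the example code converted to Python 3. For installing Python, look up the relevant material online or see [Setting Up the Python Environment](../tips/InstallPython.md). This tutorial is adapted from [IPython Notebooks to learn Python](https://github.com/rajathkmp/Python-Lectures), with some of the example code converted to Python 3. For installing Python, look up the relevant material online or see [Setting Up the Python Environment](../tips/InstallPython.md).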




+ 32
- 18
2_knn/knn_classification.ipynb View File

@@ -7,39 +7,40 @@
"# kNN Classification\n", "# kNN Classification\n",
"\n", "\n",
"\n", "\n",
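"The k-Nearest Neighbor (kNN) classification algorithm is a theoretically mature method and one of the simplest machine learning algorithms. The idea is: ***if most of the k most similar samples (i.e., the nearest neighbors in feature space) of a sample belong to a certain class, then the sample also belongs to that class***. Although KNN in principle also relies on limit theorems, the class decision involves only a very small number of neighboring samples. Because KNN determines the class mainly from a limited number of nearby samples rather than by discriminating class regions, it is better suited than other methods to sample sets whose class regions cross or overlap heavily.\n",
"The k-Nearest Neighbor (kNN) classification algorithm is a theoretically mature method and one of the simplest machine learning algorithms. The idea is: ***if most of the k most similar samples (i.e., the nearest neighbors in feature space) of a sample belong to a certain class, then the sample also belongs to that class***. Although kNN in principle also relies on limit theorems, the class decision involves only a very small number of neighboring samples. Because kNN determines the class mainly from a limited number of nearby samples rather than by discriminating class regions, it is better suited than other methods to sample sets whose class regions cross or overlap heavily.\n",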
"\n", "\n",
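"kNN can be used not only for classification but also for regression. By finding the k nearest neighbors of a sample and assigning the average of those neighbors' attribute values to the sample, we obtain the sample's attribute. A more useful approach is to give neighbors at different distances different weights, for example a weight inversely proportional to the distance (a combining function).\n",
"kNN can be used not only for classification but also for regression. By finding the `k` nearest neighbors of a sample and assigning the average of those neighbors' attribute values to the sample, we obtain the sample's attribute. A more useful approach is to give neighbors at different distances different weights, for example a weight inversely proportional to the distance (a combining function).\n",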
"\n", "\n",
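"One main shortcoming of this algorithm in classification is that when the samples are imbalanced, e.g. one class has many samples while the other classes have few, the K neighbors of a new input sample may be dominated by the large class. This can lead to misclassification, so we need to reduce the influence of sample counts on the result. Weighting (neighbors closer to the sample get larger weights) can improve this. Another shortcoming is the high computational cost: for each text to be classified, the distance to every known sample must be computed in order to find its K nearest neighbors. A common remedy is to edit the known samples beforehand, removing those that contribute little to classification. The algorithm suits automatic classification of class regions with relatively large sample sizes, while class regions with small sample sizes are more prone to misclassification with it.\n",
"Problems with this algorithm:\n",
"1. When the samples are imbalanced, e.g. one class has many samples while the other classes have few, the K neighbors of a new input sample may be dominated by the large class. This can lead to misclassification, so we need to reduce the influence of sample counts on the result. Weighting (neighbors closer to the sample get larger weights) can improve this.\n",
"2. The computational cost is high: for each data point to be classified, the distance to every known sample must be computed in order to find its K nearest neighbors. A common remedy is to edit the known samples beforehand, removing those that contribute little to classification. The algorithm suits automatic classification of class regions with relatively large sample sizes, while class regions with small sample sizes are more prone to misclassification with it.\n",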
"\n", "\n",
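"k-NN is arguably one of the most direct methods for classifying unknown data. The figure and description below are basically enough to understand what K-NN does.\n",
"kNN is arguably one of the most direct methods for classifying unknown data. The figure and description below are basically enough to understand what kNN does.\n",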
"![knn](images/knn.png)\n", "![knn](images/knn.png)\n",
"\n", "\n",
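"In short, k-NN can be seen as: **you have a set of data whose classes you already know; when a new data point arrives, compute its distance to every point in the training data, pick the K training points closest to the new point and see which classes they belong to, then classify the new data point by majority vote**.\n"
"In short, kNN can be seen as: **you have a set of data whose classes you already know; when a new data point arrives, compute its distance to every point in the training data, select the K training points closest to the new point, see which classes they belong to, then classify the new data point by majority vote**.\n"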
] ]
}, },
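The distance-weighted regression idea described in this cell can be sketched as follows. This is a minimal illustration in plain Python, not the notebook's implementation: the function names `euclidean` and `knn_regress`, the inverse-distance weighting, the small `eps` constant, and the toy data are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_regress(train_x, train_y, query, k=3, eps=1e-9):
    """Predict a value for `query` as the inverse-distance-weighted mean
    of the targets of its k nearest training points (hypothetical helper)."""
    # Distance from the query to every training point, paired with its target
    dists = [(euclidean(x, query), y) for x, y in zip(train_x, train_y)]
    # Keep the k closest points
    nearest = sorted(dists, key=lambda pair: pair[0])[:k]
    # Closer neighbors get larger weights; eps avoids division by zero
    weights = [1.0 / (d + eps) for d, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)

# Toy 1-D data where the target equals the feature (illustrative only)
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 2.0, 3.0]

pred_mid = knn_regress(train_x, train_y, (1.5,), k=2)
pred_exact = knn_regress(train_x, train_y, (1.0,), k=3)
print(pred_mid, pred_exact)
```

With inverse-distance weights, a training point that coincides with the query dominates the prediction, which is the behavior the text asks for when it says nearer neighbors should count more.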
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Algorithm steps: (FIXME: refine the procedure further; the loop should be expressed more clearly)\n",
"## 1. Algorithm steps:\n",
"\n", "\n",
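"* step.1---Load the training samples\n",
"* step.2---Convert the sample features into numeric data\n",
"* step.3---Compute the distance dist between the unknown sample and a training sample\n",
"* step.4---Record the distance between the unknown sample and the training sample, together with the class the training sample belongs to\n",
"* step.5---Repeat steps 2 and 3 until the distances between the unknown sample and all training samples have been computed\n",
"* step.6---Sort the training samples by their distance to the unknown sample and find the K nearest ones\n",
"* step.7---Count how many times each class label appears among the K nearest neighbors\n",
"* step.8---Choose the most frequent class label as the class label of the unknown sample"
"1. Prepare the data and preprocess it;\n",
"2. Compute the **distance** between the test data and each training data point;\n",
"3. Sort by increasing distance;\n",
"4. Select the `k` points with the smallest distances;\n",
"5. Determine how often each class occurs among these `k` points;\n",
"6. Return the most frequent class among these `k` points as the predicted class of the test data.\n",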
"\n"
] ]
}, },
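The six steps listed in this cell can be sketched in plain Python as below. This is only an illustrative sketch under stated assumptions, not the notebook's own program: the names `euclidean` and `knn_classify`, the choice of Euclidean distance, and the toy data are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_classify(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 2: distance from the query to every training point
    dists = [(euclidean(x, query), label) for x, label in zip(train_x, train_y)]
    # Step 3: sort by increasing distance
    dists.sort(key=lambda pair: pair[0])
    # Step 4: keep the labels of the k closest points
    k_labels = [label for _, label in dists[:k]]
    # Step 5: count how often each class occurs among them
    counts = Counter(k_labels)
    # Step 6: return the most frequent class
    return counts.most_common(1)[0][0]

# Step 1: prepare the data (a toy, already numeric data set for illustration)
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["A", "A", "A", "B", "B", "B"]

pred_a = knn_classify(train_x, train_y, (1.1, 1.0), k=3)
pred_b = knn_classify(train_x, train_y, (5.0, 4.8), k=3)
print(pred_a, pred_b)
```

Sorting all distances costs O(n log n) per query, which is fine for small data sets; for large ones a spatial index is the usual alternative.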
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Generate data"
"## 2. Generate data"
] ]
}, },
{ {
@@ -119,7 +120,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Program"
"## 3. Program"
] ]
}, },
{ {
@@ -208,7 +209,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## sklearn program"
"## 4. sklearn program"
] ]
}, },
{ {
@@ -312,6 +313,19 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 5. Further thinking\n",
"\n",
"* If the input data set is very large, how can the distances be computed quickly?\n",
"  - kd-tree\n",
"  - Fast Library for Approximate Nearest Neighbors (FLANN)\n",
"* How do you choose the best `k`?\n",
"  - https://zhuanlan.zhihu.com/p/143092725"
]
},
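To make the kd-tree suggestion concrete, here is a simplified sketch of a kd-tree with a single-nearest-neighbor query in plain Python. It is an assumption-laden illustration, not part of the notebook: the nested-dict node layout and the names `build_kdtree`, `sq_dist`, and `nearest` are hypothetical, and production code would more likely rely on an existing implementation such as the kd-trees shipped with SciPy or scikit-learn.

```python
def build_kdtree(points, depth=0):
    """Recursively build a kd-tree as nested dicts (hypothetical minimal layout)."""
    if not points:
        return None
    axis = depth % len(points[0])           # cycle through the coordinate axes
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                  # the median point splits the space
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def sq_dist(a, b):
    """Squared Euclidean distance (enough for comparing distances)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest(node, query, best=None):
    """Return the stored point closest to `query`."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], query) < sq_dist(best, query):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    # Descend first into the side of the splitting plane that contains the query
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best

# Toy 2-D points (illustrative only)
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))
```

The pruning test is what saves work: whole subtrees are skipped when they cannot contain a closer point, so a query on well-distributed low-dimensional data typically touches far fewer points than a brute-force scan.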
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## References\n", "## References\n",
"* [Digits Classification Exercise](http://scikit-learn.org/stable/auto_examples/exercises/plot_digits_classification_exercise.html)\n", "* [Digits Classification Exercise](http://scikit-learn.org/stable/auto_examples/exercises/plot_digits_classification_exercise.html)\n",
"* [knn算法的原理与实现](https://zhuanlan.zhihu.com/p/36549000)" "* [knn算法的原理与实现](https://zhuanlan.zhihu.com/p/36549000)"
@@ -334,7 +348,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.8"
"version": "3.6.9"
} }
}, },
"nbformat": 4, "nbformat": 4,


+ 1
- 1
3_kmeans/1-k-means.ipynb View File

@@ -955,7 +955,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.8"
"version": "3.6.9"
} }
}, },
"nbformat": 4, "nbformat": 4,

