
Add English version

pull/1/MERGE
bushuhui, 5 years ago, commit 262e6bfdfb
35 changed files with 32097 additions and 129 deletions
  1. 0_python/1_Basics_EN.ipynb (+1015, -0)
  2. 0_python/2_Print_Statement_EN.ipynb (+587, -0)
  3. 0_python/3_Data_Structure_1_EN.ipynb (+1999, -0)
  4. 0_python/4_Data_Structure_2_EN.ipynb (+1180, -0)
  5. 0_python/5_Control_Flow_EN.ipynb (+693, -0)
  6. 0_python/6_Function_EN.ipynb (+1075, -0)
  7. 0_python/7_Class_EN.ipynb (+1308, -0)
  8. 0_python/README_EN.md (+36, -0)
  9. 1_numpy_matplotlib_scipy_sympy/1-numpy_tutorial_EN.ipynb (+4887, -0)
  10. 1_numpy_matplotlib_scipy_sympy/2-matplotlib_simple_tutorial_EN.ipynb (+467, -0)
  11. 1_numpy_matplotlib_scipy_sympy/3-ipython_notebook.ipynb (+0, -0)
  12. 1_numpy_matplotlib_scipy_sympy/3-ipython_notebook_EN.ipynb (+338, -0)
  13. 1_numpy_matplotlib_scipy_sympy/4-scipy_tutorial.ipynb (+0, -0)
  14. 1_numpy_matplotlib_scipy_sympy/4-scipy_tutorial_EN.ipynb (+2423, -0)
  15. 1_numpy_matplotlib_scipy_sympy/5-sympy_tutorial.ipynb (+0, -0)
  16. 1_numpy_matplotlib_scipy_sympy/5-sympy_tutorial_EN.ipynb (+2534, -0)
  17. 2_knn/knn_classification.ipynb (+48, -123)
  18. 2_knn/knn_classification_EN.ipynb (+341, -0)
  19. 3_kmeans/1-k-means.ipynb (+1, -1)
  20. 3_kmeans/1-k-means_EN.ipynb (+997, -0)
  21. 3_kmeans/2-kmeans-color-vq.ipynb (+1, -1)
  22. 3_kmeans/2-kmeans-color-vq_EN.ipynb (+241, -0)
  23. 3_kmeans/3-ClusteringAlgorithms.ipynb (+1, -1)
  24. 3_kmeans/3-ClusteringAlgorithms_EN.ipynb (+224, -0)
  25. 4_logistic_regression/1-Least_squares.ipynb (+1, -1)
  26. 4_logistic_regression/1-Least_squares_EN.ipynb (+5154, -0)
  27. 4_logistic_regression/2-Logistic_regression.ipynb (+1, -1)
  28. 4_logistic_regression/2-Logistic_regression_EN.ipynb (+705, -0)
  29. 4_logistic_regression/3-PCA_and_Logistic_Regression.ipynb (+1, -1)
  30. 4_logistic_regression/3-PCA_and_Logistic_Regression_EN.ipynb (+279, -0)
  31. 5_nn/1-Perceptron_EN.ipynb (+256, -0)
  32. 5_nn/2-mlp_bp_EN.ipynb (+4929, -0)
  33. 5_nn/3-softmax_ce_EN.ipynb (+176, -0)
  34. README_EN.md (+110, -0)
  35. tips/InstallPython_EN.md (+89, -0)

0_python/1_Basics_EN.ipynb (+1015, -0): file diff suppressed because it is too large


0_python/2_Print_Statement_EN.ipynb (+587, -0)

@@ -0,0 +1,587 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Print Statement"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The **print** function can be used in the following different ways:\n",
"\n",
" - print(\"Hello World\")\n",
" - print(\"Hello\", <Variable Containing the String>)\n",
" - print(\"Hello\" + <Variable Containing the String>)\n",
" - print(\"Hello %s\" % <variable containing the string>)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello World\n"
]
}
],
"source": [
"print(\"Hello World\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In Python, single, double and triple quotes are all used to denote strings.\n",
"Single quotes are commonly used for a single character, \n",
"double quotes for a single line, and triple quotes for a paragraph or multiple lines."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hey\n"
]
}
],
"source": [
"print('Hey')"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"My name is Rajath Kumar M.P.\n",
"\n",
"I love Python.\n"
]
}
],
"source": [
"print(\"\"\"My name is Rajath Kumar M.P.\n",
"\n",
"I love Python.\"\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Strings can be assigned to variables, say _string1_ and _string2_, which can then be referenced in the print statement."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello World\n",
"Hello World !\n"
]
}
],
"source": [
"string1 = 'World'\n",
"print('Hello', string1)\n",
"\n",
"string2 = '!'\n",
"print('Hello', string1, string2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"String concatenation is the \"addition\" of two strings. Observe that while concatenating there will be no space between the strings."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"HelloWorld!\n"
]
}
],
"source": [
"print('Hello' + string1 + string2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**%s** is used to refer to a variable which contains a string."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello World\n"
]
}
],
"source": [
"print(\"Hello %s\" % string1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Similarly, when using other data types\n",
"\n",
" - %s -> string\n",
" - %d -> integer\n",
" - %f -> float\n",
" - %o -> octal\n",
" - %x -> hexadecimal\n",
" - %e -> exponential\n",
" \n",
"This can be used for conversions inside the print statement itself."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Actual Number = 18\n",
"Float of the number = 18.000000\n",
"Octal equivalent of the number = 22\n",
"Hexadecimal equivalent of the number = 12\n",
"Exponential equivalent of the number = 1.800000e+01\n"
]
}
],
"source": [
"print(\"Actual Number = %d\" % 18)\n",
"print(\"Float of the number = %f\" % 18)\n",
"print(\"Octal equivalent of the number = %o\" % 18)\n",
"print(\"Hexadecimal equivalent of the number = %x\" % 18)\n",
"print(\"Exponential equivalent of the number = %e\" % 18)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When referring to multiple variables, a tuple in parentheses is used."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello World !\n"
]
}
],
"source": [
"print(\"Hello %s %s\" % (string1,string2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Other Examples"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following are other different ways the print statement can be put to use."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"I want to be printed here\n"
]
}
],
"source": [
"print(\"I want to be printed %s\" %'here')"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"_A_A_A_A_A_A_A_A_A_A\n"
]
}
],
"source": [
"print('_A'*10)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Jan\n",
"Feb\n",
"Mar\n",
"Apr\n",
"May\n",
"Jun\n",
"Jul\n",
"Aug\n"
]
}
],
"source": [
"print(\"Jan\\nFeb\\nMar\\nApr\\nMay\\nJun\\nJul\\nAug\")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Jan\n",
"Feb\n",
"Mar\n",
"Apr\n",
"May\n",
"Jun\n",
"Jul\n",
"Aug\n"
]
}
],
"source": [
"print(\"\\n\".join(\"Jan Feb Mar Apr May Jun Jul Aug\".split(\" \")))"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"I want \\n to be printed.\n"
]
}
],
"source": [
"print(\"I want \\\\n to be printed.\")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Routine:\n",
"\t- Eat\n",
"\t- Sleep\n",
"\t- Repeat\n",
"\n"
]
}
],
"source": [
"print(\"\"\"\n",
"Routine:\n",
"\\t- Eat\n",
"\\t- Sleep\\n\\t- Repeat\n",
"\"\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Precision Width and Field Width"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Field width is the total width of the printed number, and precision is the number of digits to the right of the decimal point. Both can be altered as required.\n",
"\n",
"The default precision width is 6."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'3.121312'"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"%f\" % 3.121312312312"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that up to 6 decimal places are returned. To specify the number of decimal places, '%(fieldwidth).(precisionwidth)f' is used."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'3.12131'"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"%.5f\" % 3.121312312312"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the field width is set larger than necessary, the data right-aligns itself to fill the specified width."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' 3.12131'"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"%9.5f\" % 3.121312312312"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Zero padding is done by adding a 0 at the start of the field width."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'00000000000003.12131'"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"%020.5f\" % 3.121312312312"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A space at the start of the field width leaves room for the sign, so positive and negative numbers stay aligned."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 3.121312\n",
"-3.121312\n"
]
}
],
"source": [
"print(\"% 9f\" % 3.121312312312)\n",
"print(\"% 9f\" % -3.121312312312)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A '+' sign can be shown before positive numbers by adding a + at the start of the field width."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"+3.121312\n",
"-3.121312\n"
]
}
],
"source": [
"print(\"%+9f\" % 3.121312312312)\n",
"print(\"% 9f\" % -3.121312312312)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As mentioned above, the data right-aligns itself when the specified field width is larger than the actual width of the data. Left alignment is done instead by adding a minus sign to the field width."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'3.121 '"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"%-9.3f\" % 3.121312312312"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
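The precision- and field-width rules walked through in the notebook above can be condensed into a short standalone sketch (the variable name here is illustrative):

```python
# Recap of %-formatting width/precision behaviour shown in the notebook above.
value = 3.121312312312

assert "%f" % value == "3.121312"                   # default precision: 6 places
assert "%.5f" % value == "3.12131"                  # .5 -> five decimal places
assert "%9.5f" % value == "  3.12131"               # width 9, right-aligned
assert "%020.5f" % value == "00000000000003.12131"  # leading 0 -> zero padding
assert "%-9.3f" % value == "3.121    "              # leading - -> left-aligned
assert "%+9f" % value == "+3.121312"                # leading + -> explicit sign
print("all formatting checks passed")
```

The same results can be obtained with `str.format()` or f-strings in modern Python, e.g. `f"{value:9.5f}"` also yields `"  3.12131"`.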

0_python/3_Data_Structure_1_EN.ipynb (+1999, -0): file diff suppressed because it is too large


0_python/4_Data_Structure_2_EN.ipynb (+1180, -0): file diff suppressed because it is too large


0_python/5_Control_Flow_EN.ipynb (+693, -0)

@@ -0,0 +1,693 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Control Flow Statements"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. If"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"if some_condition:\n",
" \n",
" algorithm"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Welcome!\n"
]
}
],
"source": [
"x = 4\n",
"if x >10:\n",
" print(\"Hello\")\n",
"else:\n",
" print(\"Welcome!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. If-else"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"if some_condition:\n",
" \n",
" algorithm\n",
" \n",
"else:\n",
" \n",
" algorithm"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hello\n"
]
}
],
"source": [
"x = 12\n",
"if x > 10:\n",
" print(\"hello\")\n",
"else:\n",
" print(\"world\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. if-elif"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"if some_condition:\n",
" \n",
" algorithm\n",
"\n",
"elif some_condition:\n",
" \n",
" algorithm\n",
"\n",
"else:\n",
" \n",
" algorithm"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x<y\n"
]
}
],
"source": [
"x = 10\n",
"y = 12\n",
"if x > y:\n",
" print(\"x>y\")\n",
"elif x < y:\n",
" print(\"x<y\")\n",
"else:\n",
" print(\"x=y\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An if statement inside another if statement (or inside an if-elif or if-else) is called a nested if statement."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x<y\n",
"x=10\n"
]
}
],
"source": [
"x = 10\n",
"y = 12\n",
"if x > y:\n",
" print(\"x>y\")\n",
"elif x < y:\n",
" print(\"x<y\")\n",
" if x==10:\n",
" print(\"x=10\")\n",
" else:\n",
" print(\"invalid\")\n",
"else:\n",
" print(\"x=y\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Loops"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4.1 For"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"for variable in something:\n",
" \n",
" algorithm"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"1\n",
"2\n",
"3\n",
"4\n"
]
}
],
"source": [
"for i in range(5):\n",
" print(i)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"2\n",
"5\n",
"6\n"
]
}
],
"source": [
"a = [1, 2, 5, 6]\n",
"for i in a:\n",
" print(i)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the first example above, i iterates over the values 0, 1, 2, 3 and 4; each time through, the loop body runs with the current value. It is also possible to iterate over a nested list, as illustrated below."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1, 2, 3]\n",
"[4, 5, 6]\n",
"[7, 8, 9]\n"
]
}
],
"source": [
"list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]\n",
"for list1 in list_of_lists:\n",
" print(list1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A use case for a nested for loop here would be:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"2\n",
"3\n",
"\n",
"4\n",
"5\n",
"6\n",
"\n",
"7\n",
"8\n",
"9\n",
"\n"
]
}
],
"source": [
"list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]\n",
"for list1 in list_of_lists:\n",
" for x in list1:\n",
" print(x)\n",
" print()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4.2 While"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"while some_condition:\n",
" \n",
" algorithm"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"4\n",
"Bye\n",
"looping 4\n",
"looping 5\n",
"looping 6\n",
"looping 7\n",
"looping 8\n",
"looping 9\n",
"looping 10\n",
"looping 11\n"
]
}
],
"source": [
"i = 1\n",
"while i < 3:\n",
" print(i ** 2)\n",
" i = i+1\n",
"print('Bye')\n",
"\n",
"# do-until style loop\n",
"while True:\n",
" #do something\n",
" i = i+1\n",
" print('looping %3d' % i)\n",
" \n",
" # check \n",
" if i>10: break"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Break"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As the name suggests, break is used to exit a loop early, as soon as a condition becomes true during execution."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"1\n",
"2\n",
"3\n",
"4\n",
"5\n",
"6\n",
"7\n"
]
}
],
"source": [
"for i in range(100):\n",
" print(i)\n",
" if i>=7:\n",
" break"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Continue"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The continue statement skips the remaining statements in the current iteration and jumps straight to the next iteration of the loop. It lets you skip certain cases without terminating the loop altogether."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"1\n",
"2\n",
"3\n",
"4\n",
"The end.\n",
"The end.\n",
"The end.\n",
"The end.\n",
"The end.\n"
]
}
],
"source": [
"for i in range(10):\n",
" if i>4:\n",
" print(\"The end.\")\n",
" continue\n",
" elif i<7:\n",
" print(i)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. List Comprehensions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python makes it simple to generate a required list with a single line of code using list comprehensions. For example, to generate the first ten multiples of 27 with a for loop:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[27, 54, 81, 108, 135, 162, 189, 216, 243, 270]\n"
]
}
],
"source": [
"res = []\n",
"for i in range(1,11):\n",
" x = 27*i\n",
" res.append(x)\n",
"print(res)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since the goal is simply to build a new list, a list comprehension is a more concise and efficient way to solve this problem."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[27, 54, 81, 108, 135, 162, 189, 216, 243, 270]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[27*x for x in range(1,11)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That's it! Just remember to enclose it in square brackets."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Reading the code: the first part is the expression to compute for each element, followed by the loop (and any condition) that drives it. Can nested loops be extended to list comprehensions? Yes, they can."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[27, 54, 81, 108, 135, 162, 189, 216, 243, 270]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[27*x for x in range(1,20) if x<=10]"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"{'108': 108,\n",
" '135': 135,\n",
" '162': 162,\n",
" '189': 189,\n",
" '216': 216,\n",
" '243': 243,\n",
" '27': 27,\n",
" '270': 270,\n",
" '54': 54,\n",
" '81': 81}"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{str(27*x):27*x for x in range(1,20) if x<=10}"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(27, 54, 81, 108, 135, 162, 189, 216, 243, 270)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tuple((27*x for x in range(1,20) if x<=10))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Adding one more loop makes the pattern clearer:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1,\n",
" 2,\n",
" 3,\n",
" 4,\n",
" 5,\n",
" 6,\n",
" 7,\n",
" 8,\n",
" 9,\n",
" 10,\n",
" 28,\n",
" 29,\n",
" 30,\n",
" 31,\n",
" 32,\n",
" 33,\n",
" 34,\n",
" 35,\n",
" 36,\n",
" 37,\n",
" 55,\n",
" 56,\n",
" 57,\n",
" 58,\n",
" 59,\n",
" 60,\n",
" 61,\n",
" 62,\n",
" 63,\n",
" 64,\n",
" 82,\n",
" 83,\n",
" 84,\n",
" 85,\n",
" 86,\n",
" 87,\n",
" 88,\n",
" 89,\n",
" 90,\n",
" 91,\n",
" 109,\n",
" 110,\n",
" 111,\n",
" 112,\n",
" 113,\n",
" 114,\n",
" 115,\n",
" 116,\n",
" 117,\n",
" 118]"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[27*i+z for i in range(50) if i<5 for z in range(1,11)]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
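The loop, break/continue, and comprehension ideas from the notebook above compose naturally; a small self-contained sketch (names are illustrative):

```python
# Three equivalent ways to build the first ten multiples of 27.
loop_result = []
for i in range(1, 11):
    loop_result.append(27 * i)

comp_result = [27 * i for i in range(1, 11)]               # list comprehension
dict_result = {str(27 * i): 27 * i for i in range(1, 11)}  # dict comprehension

assert loop_result == comp_result == list(dict_result.values())

# continue skips the rest of the current iteration; break exits the loop.
kept = []
for i in range(10):
    if i % 2:        # odd -> skip to the next iteration
        continue
    if i > 6:        # first even number past 6 -> stop entirely
        break
    kept.append(i)

print(kept)  # [0, 2, 4, 6]
```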

0_python/6_Function_EN.ipynb (+1075, -0): file diff suppressed because it is too large


0_python/7_Class_EN.ipynb (+1308, -0): file diff suppressed because it is too large


0_python/README_EN.md (+36, -0)

@@ -0,0 +1,36 @@

# A Concise Python Tutorial (Learn Python in 90 Minutes)

Python is an easy-to-learn, powerful, general-purpose scripting language. Its extremely rich ecosystem of libraries makes Python good for almost anything: web development, software development, big-data analysis, web crawling, machine learning, and more. Python's main advantage is that it lets you work the way you think: most of the time, well-packaged libraries let you finish a given task quickly. The resulting programs may not always run with the highest efficiency, but the time spent designing, writing, and debugging is greatly reduced, which makes Python ideal for rapid trial and error.

This tutorial is adapted from [IPython Notebooks to learn Python](https://github.com/rajathkmp/Python-Lectures), with some of the example code converted to Python 3. For installing Python, you can look up instructions online or refer to [Install the Python environment](../tips/InstallPython.md).

## Contents
0. [Install Python](../tips/InstallPython.md)
1. [Basics](1_Basics.ipynb)
- Why Python, Zen of Python
- Variables, Operators, Built-in functions
2. [Print statement](2_Print_Statement.ipynb)
- Tips of print
3. [Data structure - 1](3_Data_Structure_1.ipynb)
- Lists, Tuples, Sets
4. [Data structure - 2](4_Data_Structure_2.ipynb)
- Strings, Dictionaries
5. [Control flow](5_Control_Flow.ipynb)
- if, else, elif, for, while, break, continue
6. [Functions](6_Function.ipynb)
- Function define, return, arguments
- Global and local variables
- Lambda functions
7. [Class](7_Class.ipynb)
- Class define
- Inheritance


## References
* [Install the Python environment](../tips/InstallPython.md)
* [IPython Notebooks to learn Python](https://github.com/rajathkmp/Python-Lectures)
* [Liao Xuefeng's Python tutorial](https://www.liaoxuefeng.com/wiki/1016959663602400)
* [Intelligent Systems Lab introductory tutorial - Python](https://gitee.com/pi-lab/SummerCamp/tree/master/python)
* [Python tips](../tips/python)
* [Get Started with Python](Python.pdf)

1_numpy_matplotlib_scipy_sympy/1-numpy_tutorial_EN.ipynb (+4887, -0): file diff suppressed because it is too large


1_numpy_matplotlib_scipy_sympy/2-matplotlib_simple_tutorial_EN.ipynb (+467, -0): file diff suppressed because it is too large


renamed: 1_numpy_matplotlib_scipy_sympy/ipython_notebook.ipynb → 1_numpy_matplotlib_scipy_sympy/3-ipython_notebook.ipynb


1_numpy_matplotlib_scipy_sympy/3-ipython_notebook_EN.ipynb (+338, -0): file diff suppressed because it is too large


renamed: 1_numpy_matplotlib_scipy_sympy/3-scipy_tutorial.ipynb → 1_numpy_matplotlib_scipy_sympy/4-scipy_tutorial.ipynb


1_numpy_matplotlib_scipy_sympy/4-scipy_tutorial_EN.ipynb (+2423, -0): file diff suppressed because it is too large


renamed: 1_numpy_matplotlib_scipy_sympy/4-sympy_tutorial.ipynb → 1_numpy_matplotlib_scipy_sympy/5-sympy_tutorial.ipynb


1_numpy_matplotlib_scipy_sympy/5-sympy_tutorial_EN.ipynb (+2534, -0): file diff suppressed because it is too large


2_knn/knn_classification.ipynb (+48, -123): file diff suppressed because it is too large


2_knn/knn_classification_EN.ipynb (+341, -0): file diff suppressed because it is too large


3_kmeans/1-k-means.ipynb (+1, -1)

@@ -989,7 +989,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.2"
+"version": "3.6.8"
 }
 },
 "nbformat": 4,


3_kmeans/1-k-means_EN.ipynb (+997, -0): file diff suppressed because it is too large


3_kmeans/2-kmeans-color-vq.ipynb (+1, -1)

@@ -233,7 +233,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.2"
+"version": "3.6.8"
 }
 },
 "nbformat": 4,


3_kmeans/2-kmeans-color-vq_EN.ipynb (+241, -0): file diff suppressed because it is too large


3_kmeans/3-ClusteringAlgorithms.ipynb (+1, -1)

@@ -215,7 +215,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.2"
+"version": "3.6.8"
 },
 "main_language": "python"
 },


3_kmeans/3-ClusteringAlgorithms_EN.ipynb (+224, -0): file diff suppressed because it is too large


4_logistic_regression/1-Least_squares.ipynb (+1, -1)

@@ -5146,7 +5146,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.2"
+"version": "3.6.8"
 }
 },
 "nbformat": 4,


4_logistic_regression/1-Least_squares_EN.ipynb (+5154, -0): file diff suppressed because it is too large


4_logistic_regression/2-Logistic_regression.ipynb (+1, -1)

@@ -697,7 +697,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.2"
+"version": "3.6.8"
 }
 },
 "nbformat": 4,


4_logistic_regression/2-Logistic_regression_EN.ipynb (+705, -0): file diff suppressed because it is too large


4_logistic_regression/3-PCA_and_Logistic_Regression.ipynb (+1, -1)

@@ -271,7 +271,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.2"
+"version": "3.6.8"
 }
 },
 "nbformat": 4,


4_logistic_regression/3-PCA_and_Logistic_Regression_EN.ipynb (+279, -0): file diff suppressed because it is too large


5_nn/1-Perceptron_EN.ipynb (+256, -0)

@@ -0,0 +1,256 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Perceptron\n",
"\n",
"The perceptron is a linear model for binary classification. Its input is an instance's feature vector, and its output is the instance's class (+1 or -1). A perceptron corresponds to a hyperplane that separates the instances in input space into two classes; learning aims to find that hyperplane. To do so, a loss function based on misclassification is introduced and minimized by gradient descent. The perceptron learning algorithm is simple and easy to implement, and comes in a primal form and a dual form. Prediction applies the learned model to new instances, so the perceptron is a discriminative model. Proposed by Rosenblatt in 1957, it is the foundation of neural networks and support vector machines.\n",
"\n",
"It imitates a neuron in a biological nervous system: it accepts signals from multiple sources, converts them into a form suitable for propagation, and produces an output (in organisms, an electrical signal).\n",
"\n",
"![neuron](images/neuron.png)\n",
"\n",
"* dendrites\n",
"* nucleus\n",
"* axon\n",
"\n",
"The psychologist Rosenblatt conceived the perceptron as a simplified mathematical model of how a brain neuron works: it takes a set of binary inputs (from nearby neurons), multiplies each by a continuous weight (the synaptic strength of each nearby neuron), and applies a threshold: if the weighted sum exceeds the threshold, the output is 1, otherwise 0 (analogous to whether the neuron fires). For a perceptron, most inputs are either data or the outputs of other perceptrons.\n",
"\n",
"Donald Hebb proposed a surprising and far-reaching idea: knowledge and learning occur in the brain mainly through the formation and change of synapses between neurons, briefly stated as Hebb's rule:\n",
"\n",
"> When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased.\n",
"\n",
"\n",
"The perceptron does not follow this idea exactly, **but by adjusting the input weights it admits a very simple and intuitive learning scheme: given a training set of input-output examples, the perceptron should \"learn\" a function: for each example, if the perceptron's output is far too low, increase its weights; if it is far too high, decrease them.**\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. The Perceptron Model\n",
"\n",
"Suppose the input space (of feature vectors) is $X \\subseteq R^n$ and the output space is $Y=\\{-1, +1\\}$. The input $x \\in X$ is an instance's feature vector, a point in the input space; the output $y \\in Y$ is the instance's class. The function from input space to output space\n",
"\n",
"$$\n",
"f(x) = sign(w \\cdot x + b)\n",
"$$\n",
"\n",
"is called a perceptron. The parameter $w$ is the weight vector and $b$ is the bias; $w \\cdot x$ denotes the inner product of $w$ and $x$, and $sign$ is the sign function:\n",
"![sign_function](images/sign.png)\n",
"\n",
"### 1.1 Geometric interpretation \n",
"The perceptron is a linear classification model. Its hypothesis space is the set of all linear classifiers defined on the feature space, i.e. the set of functions {f | f(x) = w·x + b}. The linear equation w·x + b = 0 corresponds to a hyperplane S in the feature space $R^n$, where w is the normal vector of the hyperplane and b its intercept. The hyperplane divides the feature space into two parts: points on the two sides belong to the positive and negative classes. S is called the separating hyperplane, as shown below:\n",
"![perceptron_geometry_def](images/perceptron_geometry_def.png)\n",
"\n",
"### 1.2 Biological analogy\n",
"![perceptron_2](images/perceptron_2.PNG)\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Perceptron Learning Strategy\n",
"\n",
"Assume the training data set is linearly separable. The goal of perceptron learning is a separating hyperplane that completely separates the positive and negative training instances, i.e. finding the parameters w and b. This requires a learning strategy: define an (empirical) loss function and minimize it.\n",
"\n",
"A natural choice for the loss function is the total number of misclassified points. However, that loss is not a continuous, differentiable function of w and b, so it is hard to optimize. Another choice is the total distance from the misclassified points to the separating hyperplane.\n",
"\n",
"First, the distance from any point $x_0$ to the hyperplane is\n",
"$$\n",
"\\frac{1}{||w||} | w \\cdot x_0 + b |\n",
"$$\n",
"\n",
"Second, for a misclassified point $(x_i,y_i)$ we have $-y_i(w \\cdot x_i + b) > 0$\n",
"\n",
"Thus, if M is the set of points misclassified by hyperplane S, the total distance from the misclassified points to S is\n",
"$$\n",
"-\\frac{1}{||w||} \\sum_{x_i \\in M} y_i (w \\cdot x_i + b)\n",
"$$\n",
"Dropping the factor 1/||w|| gives the perceptron loss function.\n",
"\n",
"### Empirical risk function\n",
"\n",
"Given a data set $T = \\{(x_1,y_1), (x_2, y_2), ... (x_N, y_N)\\}$ (where $x_i \\in R^n$, $y_i \\in \\{-1, +1\\},i=1,2...N$), the loss function of the perceptron sign(w·x+b) is defined as\n",
"$$\n",
"L(w, b) = - \\sum_{x_i \\in M} y_i (w \\cdot x_i + b)\n",
"$$\n",
"where M is the set of misclassified points. This loss function is the perceptron's [empirical risk function](https://blog.csdn.net/zhzhx1204/article/details/70163099).\n",
"\n",
"Clearly, L(w,b) is non-negative: it is 0 when there are no misclassified points, and it decreases as the misclassified points become fewer and closer to the hyperplane. For a fixed set of misclassified points the loss is a linear function of w and b, and it is 0 for correctly classified points; therefore, for a given training set T, L(w,b) is a continuous, differentiable function of w and b.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Perceptron Learning Algorithm\n",
"\n",
"\n",
"The optimization problem: given a data set $T = \\{(x_1,y_1), (x_2, y_2), ... (x_N, y_N)\\}$ (where $x_i \\in R^n$, $y_i \\in \\{-1, +1\\},i=1,2...N$), find the parameters w and b that minimize the loss function (M is the set of misclassified points):\n",
"\n",
"$$\n",
"min_{w,b} L(w, b) = - \\sum_{x_i \\in M} y_i (w \\cdot x_i + b)\n",
"$$\n",
"\n",
"Perceptron learning is driven by misclassification and uses [stochastic gradient descent](https://blog.csdn.net/zbc1090549839/article/details/38149561). First choose arbitrary $w_0$ and $b_0$, then repeatedly reduce the objective by gradient descent. Instead of descending on all misclassified points in M at once, each step randomly picks a single misclassified point and descends on it.\n",
"\n",
"If the misclassified set M is held fixed, the gradient of the loss L(w,b) is\n",
"$$\n",
"\\triangledown_w L(w, b) = - \\sum_{x_i \\in M} y_i x_i \\\\\n",
"\\triangledown_b L(w, b) = - \\sum_{x_i \\in M} y_i \\\\\n",
"$$\n",
"\n",
"Randomly pick a misclassified point $(x_i,y_i)$ and update $w,b$:\n",
"$$\n",
"w = w + \\eta y_i x_i \\\\\n",
"b = b + \\eta y_i\n",
"$$\n",
"\n",
"Here $\\eta$ (0 ≤ $ \\eta $ ≤ 1) is the step size, called the learning rate in statistics. The larger the step size, the faster gradient descent approaches the minimum. If the step size is too large it may overshoot the minimum and diverge; if it is too small, reaching the minimum may take a very long time.\n",
"\n",
"Intuitively: when an instance is misclassified, adjust w and b to move the separating hyperplane toward the misclassified point, reducing that point's distance to the hyperplane, until the hyperplane passes the point and it is correctly classified.\n",
"\n",
"\n",
"\n",
"The algorithm:\n",
"```\n",
"Input: T={(x1,y1),(x2,y2)...(xN,yN)} (where xi∈X=Rn, yi∈Y={-1, +1}, i=1,2...N; learning rate η)\n",
"Output: w, b; perceptron model f(x)=sign(w·x+b)\n",
"(1) Initialize w0, b0\n",
"(2) Pick a sample (xi, yi) from the training set\n",
"(3) If yi(w·xi+b) ≤ 0, update\n",
"      w = w + η yi xi\n",
"      b = b + η yi\n",
"(4) If all samples are correctly classified, or the iteration count exceeds a limit, stop\n",
"(5) Otherwise, go to (2)\n",
"```\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Example Program\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"lines_to_end_of_cell_marker": 2
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"update weight and bias: 1.0 2.5 0.5\n",
"update weight and bias: -2.5 1.0 0.0\n",
"update weight and bias: -1.5 3.5 0.5\n",
"update weight and bias: -5.0 2.0 0.0\n",
"update weight and bias: -4.0 4.5 0.5\n",
"w = [-4.0, 4.5]\n",
"b = 0.5\n",
"ground_truth: [1, 1, 1, 1, -1, -1, -1, -1]\n",
"predicted: [1, 1, 1, 1, -1, -1, -1, -1]\n"
]
}
],
"source": [
"import random\n",
"import numpy as np\n",
"\n",
"# sign function\n",
"def sign(v):\n",
" if v > 0: return 1\n",
" else: return -1\n",
" \n",
"def perceptron_train(train_data, eta=0.5, n_iter=100):\n",
" weight = [0, 0] # 权重\n",
" bias = 0 # 偏置量\n",
" learning_rate = eta # 学习速率\n",
"\n",
" train_num = n_iter # 迭代次数\n",
"\n",
" for i in range(train_num):\n",
" #FIXME: the random chose sample is to slow\n",
" train = random.choice(train_data)\n",
" x1, x2, y = train\n",
" predict = sign(weight[0] * x1 + weight[1] * x2 + bias) # 输出\n",
" #print(\"train data: x: (%2d, %2d) y: %2d ==> predict: %2d\" % (x1, x2, y, predict))\n",
" \n",
" if y * predict <= 0: # 判断误分类点\n",
" weight[0] = weight[0] + learning_rate * y * x1 # 更新权重\n",
" weight[1] = weight[1] + learning_rate * y * x2\n",
" bias = bias + learning_rate * y # 更新偏置量\n",
" print(\"update weight and bias: \", weight[0], weight[1], bias)\n",
"\n",
" #print(\"stop training: \", weight[0], weight[1], bias)\n",
"\n",
" return weight, bias\n",
"\n",
"def perceptron_pred(data, w, b):\n",
" y_pred = []\n",
" for d in data:\n",
" x1, x2, y = d\n",
" yi = sign(w[0]*x1 + w[1]*x2 + b)\n",
" y_pred.append(yi)\n",
" \n",
" return y_pred\n",
"\n",
"# set training data\n",
"train_data = np.array([[1, 3, 1], [2, 5, 1], [3, 8, 1], [2, 6, 1], \n",
" [3, 1, -1], [4, 1, -1], [6, 2, -1], [7, 3, -1]])\n",
"\n",
"# do training\n",
"w, b = perceptron_train(train_data)\n",
"print(\"w = \", w)\n",
"print(\"b = \", b)\n",
"\n",
"# predict \n",
"y_pred = perceptron_pred(train_data, w, b)\n",
"\n",
"print(\"ground_truth: \", list(train_data[:, 2]))\n",
"print(\"predicted: \", y_pred)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## References\n",
"* [感知机(Python实现)](http://www.cnblogs.com/kaituorensheng/p/3561091.html)\n",
"* [Programming a Perceptron in Python](https://blog.dbrgn.ch/2013/3/26/perceptrons-in-python/)\n",
"* [损失函数、风险函数、经验风险最小化、结构风险最小化](https://blog.csdn.net/zhzhx1204/article/details/70163099)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
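The example program above updates Python list elements one coordinate at a time. The same stochastic update rule can be sketched in vectorized NumPy form; this is a minimal sketch, and the function and variable names are mine rather than from the course code:

```python
import numpy as np

def perceptron_train(X, y, eta=0.5, n_iter=5000, seed=0):
    """Stochastic perceptron: pick a random sample, and if it is
    misclassified apply w += eta*y_i*x_i, b += eta*y_i."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        i = rng.integers(len(X))            # pick one sample at random
        if y[i] * (X[i] @ w + b) <= 0:      # misclassified (or on the boundary)
            w += eta * y[i] * X[i]
            b += eta * y[i]
    return w, b

# the same toy data as in the notebook cell
X = np.array([[1, 3], [2, 5], [3, 8], [2, 6],
              [3, 1], [4, 1], [6, 2], [7, 3]], dtype=float)
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

w, b = perceptron_train(X, y)
pred = np.where(X @ w + b > 0, 1, -1)
print(pred)   # for this linearly separable set, the predictions match y
```

Because the data is linearly separable, the perceptron convergence theorem guarantees only a finite number of updates are needed, so a generous iteration budget with random sampling is enough here.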

+ 4929
- 0
5_nn/2-mlp_bp_EN.ipynb
File diff suppressed because it is too large


+ 176
- 0
5_nn/3-softmax_ce_EN.ipynb

@@ -0,0 +1,176 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Softmax & the Cross-Entropy Cost Function\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Softmax is often used as the output layer of neural networks for classification tasks. The key step in backpropagation is differentiation, and working through the derivative of softmax gives a deeper understanding of backpropagation as well as more insight into how gradients propagate.\n",
"\n",
"## 1. The softmax function\n",
"\n",
"In a neural network, the softmax function generally serves as the output layer for classification tasks. Its outputs can be interpreted as the probabilities of choosing each class: for example, in a three-class task, softmax maps the relative magnitudes of the three raw outputs to three class probabilities that sum to 1.\n",
"\n",
"The softmax function has the form:\n",
"\n",
"$$\n",
"S_i = \\frac{e^{z_i}}{\\sum_k e^{z_k}}\n",
"$$\n",
"\n",
"* $S_i$ is the class probability output by softmax\n",
"* $z_k$ is the output of the $k$-th neuron\n",
"\n",
"\n",
"More concretely, as shown in the figure below:\n",
"\n",
"![softmax_demo](images/softmax_demo.png)\n",
"\n",
"Put plainly, softmax maps the raw outputs $[3,1,-3]$ into values in $(0,1)$ whose sum is 1 (satisfying the properties of a probability distribution), so we can interpret them as probabilities. When choosing the output node, we simply pick the node with the largest probability (i.e., the largest value) as the prediction.\n",
"\n",
"\n",
"\n",
"First consider the output of a single neuron, shown below:\n",
"\n",
"![softmax_neuron](images/softmax_neuron.png)\n",
"\n",
"Let the neuron's output be:\n",
"\n",
"$$\n",
"z_i = \\sum_{j} w_{ij} x_{j} + b\n",
"$$\n",
"\n",
"where $w_{ij}$ is the $j$-th weight of the $i$-th neuron and $b$ is the bias; $z_i$ denotes the network's $i$-th output.\n",
"\n",
"Applying the softmax function to this output gives:\n",
"\n",
"$$\n",
"a_i = \\frac{e^{z_i}}{\\sum_k e^{z_k}}\n",
"$$\n",
"\n",
"where $a_i$ is the $i$-th output value of softmax.\n",
"\n",
"\n",
"### 1.1 The loss function\n",
"\n",
"Backpropagation requires a loss function, which measures the error between the ground truth and the network's estimate; knowing the error tells us how to adjust the network's weights.\n",
"\n",
"The loss function can take many forms. Here we use the cross-entropy function, mainly because its derivative is simple and easy to compute, and because cross-entropy avoids the slow learning that some other loss functions suffer from. The **[cross-entropy function](https://blog.csdn.net/u014313009/article/details/51043064)** is:\n",
"\n",
"$$\n",
"C = - \\sum_i y_i \\ln a_i\n",
"$$\n",
"\n",
"where $y_i$ is the true class label.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Derivation\n",
"\n",
"First, be clear about what we want: the gradient of the loss with respect to the neuron outputs $z_i$, i.e.:\n",
"\n",
"$$\n",
"\\frac{\\partial C}{\\partial z_i}\n",
"$$\n",
"\n",
"By the chain rule:\n",
"\n",
"$$\n",
"\\frac{\\partial C}{\\partial z_i} = \\sum_j \\frac{\\partial C}{\\partial a_j} \\frac{\\partial a_j}{\\partial z_i}\n",
"$$\n",
"\n",
"You may wonder why this involves $a_j$ rather than just $a_i$. Look at the softmax formula: its denominator contains the outputs of all neurons, so every $a_j$ with $j \\ne i$ also depends on $z_i$. All the $a$'s must therefore enter the computation, and as the calculation below shows, the derivative splits into the two cases $i = j$ and $i \\ne j$.\n",
"\n",
"### 2.1 Partial derivative with respect to $a_j$\n",
"\n",
"$$\n",
"\\frac{\\partial C}{\\partial a_j} = \\frac{\\partial (-\\sum_j y_j \\ln a_j)}{\\partial a_j} = - \\frac{y_j}{a_j}\n",
"$$\n",
"\n",
"### 2.2 Partial derivative with respect to $z_i$\n",
"\n",
"If $i=j$:\n",
"\n",
"\\begin{eqnarray}\n",
"\\frac{\\partial a_i}{\\partial z_i} & = & \\frac{\\partial (\\frac{e^{z_i}}{\\sum_k e^{z_k}})}{\\partial z_i} \\\\\n",
"  & = & \\frac{\\sum_k e^{z_k} e^{z_i} - (e^{z_i})^2}{(\\sum_k e^{z_k})^2} \\\\\n",
"  & = & (\\frac{e^{z_i}}{\\sum_k e^{z_k}} ) (1 - \\frac{e^{z_i}}{\\sum_k e^{z_k}} ) \\\\\n",
"  & = & a_i (1 - a_i)\n",
"\\end{eqnarray}\n",
"\n",
"If $i \\ne j$:\n",
"\\begin{eqnarray}\n",
"\\frac{\\partial a_j}{\\partial z_i} & = & \\frac{\\partial (\\frac{e^{z_j}}{\\sum_k e^{z_k}})}{\\partial z_i} \\\\\n",
"  & = & \\frac{0 \\cdot \\sum_k e^{z_k} - e^{z_j} \\cdot e^{z_i} }{(\\sum_k e^{z_k})^2} \\\\\n",
"  & = & - \\frac{e^{z_j}}{\\sum_k e^{z_k}} \\cdot \\frac{e^{z_i}}{\\sum_k e^{z_k}} \\\\\n",
"  & = & -a_j a_i\n",
"\\end{eqnarray}\n",
"\n",
"Both cases use the quotient rule, where $u$ and $v$ are both functions of the variable:\n",
"$$\n",
"(\\frac{u}{v})' = \\frac{u'v - uv'}{v^2} \n",
"$$\n",
"\n",
"### 2.3 Putting it together\n",
"\n",
"\\begin{eqnarray}\n",
"\\frac{\\partial C}{\\partial z_i} & = & \\sum_j ( - \\frac{y_j}{a_j} ) \\frac{\\partial a_j}{\\partial z_i} \\\\\n",
"  & = & - \\frac{y_i}{a_i} a_i ( 1 - a_i) + \\sum_{j \\ne i} \\frac{y_j}{a_j} a_i a_j \\\\\n",
"  & = & -y_i + y_i a_i + \\sum_{j \\ne i} y_j a_i \\\\\n",
"  & = & -y_i + a_i \\sum_{j} y_j \\\\\n",
"  & = & a_i - y_i\n",
"\\end{eqnarray}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Exercise\n",
"How would you apply the softmax and cross-entropy cost function from this section to the BP method covered in the previous section?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## References\n",
"\n",
"* Softmax & cross-entropy\n",
" * [交叉熵代价函数(作用及公式推导)](https://blog.csdn.net/u014313009/article/details/51043064)\n",
" * [手打例子一步一步带你看懂softmax函数以及相关求导过程](https://www.jianshu.com/p/ffa51250ba2e)\n",
" * [简单易懂的softmax交叉熵损失函数求导](https://www.jianshu.com/p/c02a1fbffad6)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
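The derivation above can be checked numerically. This sketch (my own code, not part of the course material) verifies the final identity $\partial C/\partial z_i = a_i - y_i$ against a centered finite-difference gradient of the cross-entropy loss:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(z, y):
    """C = -sum_i y_i * ln(a_i), where a = softmax(z)."""
    return -np.sum(y * np.log(softmax(z)))

z = np.array([3.0, 1.0, -3.0])       # the logits from the softmax figure
y = np.array([1.0, 0.0, 0.0])        # one-hot ground truth

a = softmax(z)
analytic = a - y                     # the derived gradient: a_i - y_i

# centered finite-difference approximation of dC/dz_i
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(len(z)):
    zp, zm = z.copy(), z.copy()
    zp[i] += eps
    zm[i] -= eps
    numeric[i] = (cross_entropy(zp, y) - cross_entropy(zm, y)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-5))   # prints: True
```

The max-subtraction trick leaves the softmax output unchanged (numerator and denominator are scaled by the same factor) but prevents overflow for large logits.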

+ 110
- 0
README_EN.md

@@ -0,0 +1,110 @@
# Machine Learning

This tutorial explains the basic principles of machine learning and how to implement them. It guides you through Python, the common Python libraries, and both the theory and hands-on programming of machine learning, and teaches you how to solve real problems.

Since **this course requires a lot of programming practice to achieve good results**, take the [homework and reports](https://gitee.com/pi-lab/machinelearning_homework) seriously. You may look up online materials while doing the homework, but do not copy them directly; think through the problems and write the code on your own.

![Machine Learning Cover](images/machine_learning.png)


## 1. Content
1. [Course Introduction](CourseIntroduction.pdf)
2. [Python](0_python/)
- [Install Python](tips/InstallPython.md)
- [Python Basics](0_python/1_Basics.ipynb)
- [Print Statement](0_python/2_Print_Statement.ipynb)
- [Data Structure 1](0_python/3_Data_Structure_1.ipynb)
- [Data Structure 2](0_python/4_Data_Structure_2.ipynb)
- [Control Flow](0_python/5_Control_Flow.ipynb)
- [Function](0_python/6_Function.ipynb)
- [Class](0_python/7_Class.ipynb)
3. [numpy & matplotlib](1_numpy_matplotlib_scipy_sympy/)
- [numpy](1_numpy_matplotlib_scipy_sympy/numpy_tutorial.ipynb)
- [matplotlib](1_numpy_matplotlib_scipy_sympy/matplotlib_simple_tutorial.ipynb)
- [ipython & notebook](1_numpy_matplotlib_scipy_sympy/ipython_notebook.ipynb)
4. [knn](2_knn/knn_classification.ipynb)
5. [kMeans](3_kmeans/k-means.ipynb)
6. [Logistic Regression](4_logistic_regression/)
- [Least squares](4_logistic_regression/Least_squares.ipynb)
- [Logistic regression](4_logistic_regression/Logistic_regression.ipynb)
7. [Neural Network](5_nn/)
- [Perceptron](5_nn/Perceptron.ipynb)
- [Multi-layer Perceptron & BP](5_nn/mlp_bp.ipynb)
   - [Softmax & cross-entropy](5_nn/softmax_ce.ipynb)
8. [PyTorch](6_pytorch/)
- Basic
- [short tutorial](6_pytorch/PyTorch_quick_intro.ipynb)
- [basic/Tensor-and-Variable](6_pytorch/0_basic/Tensor-and-Variable.ipynb)
- [basic/autograd](6_pytorch/0_basic/autograd.ipynb)
- [basic/dynamic-graph](6_pytorch/0_basic/dynamic-graph.ipynb)
- NN & Optimization
- [nn/linear-regression-gradient-descend](6_pytorch/1_NN/linear-regression-gradient-descend.ipynb)
- [nn/logistic-regression](6_pytorch/1_NN/logistic-regression.ipynb)
- [nn/nn-sequential-module](6_pytorch/1_NN/nn-sequential-module.ipynb)
- [nn/bp](6_pytorch/1_NN/bp.ipynb)
- [nn/deep-nn](6_pytorch/1_NN/deep-nn.ipynb)
- [nn/param_initialize](6_pytorch/1_NN/param_initialize.ipynb)
- [optim/sgd](6_pytorch/1_NN/optimizer/sgd.ipynb)
- [optim/adam](6_pytorch/1_NN/optimizer/adam.ipynb)
- CNN
- [CNN simple demo](demo_code/3_CNN_MNIST.py)
- [cnn/basic_conv](6_pytorch/2_CNN/basic_conv.ipynb)
- [cnn/minist (demo code)](./demo_code/3_CNN_MNIST.py)
- [cnn/batch-normalization](6_pytorch/2_CNN/batch-normalization.ipynb)
- [cnn/regularization](6_pytorch/2_CNN/regularization.ipynb)
- [cnn/lr-decay](6_pytorch/2_CNN/lr-decay.ipynb)
- [cnn/vgg](6_pytorch/2_CNN/vgg.ipynb)
- [cnn/googlenet](6_pytorch/2_CNN/googlenet.ipynb)
- [cnn/resnet](6_pytorch/2_CNN/resnet.ipynb)
- [cnn/densenet](6_pytorch/2_CNN/densenet.ipynb)
- RNN
- [rnn/pytorch-rnn](6_pytorch/3_RNN/pytorch-rnn.ipynb)
- [rnn/rnn-for-image](6_pytorch/3_RNN/rnn-for-image.ipynb)
- [rnn/lstm-time-series](6_pytorch/3_RNN/time-series/lstm-time-series.ipynb)
- GAN
- [gan/autoencoder](6_pytorch/4_GAN/autoencoder.ipynb)
- [gan/vae](6_pytorch/4_GAN/vae.ipynb)
- [gan/gan](6_pytorch/4_GAN/gan.ipynb)



## 2. Study Suggestions
1. To get the most out of this course, build a solid foundation in Python programming first; the machine learning methods that follow will then rest on firmer ground.
2. Each lesson starts with the theory and then moves on to the code. If you want a deeper grasp, implement each method yourself, and try to work out solutions on your own as you go: the real goal is not the code itself but the ability to analyze and solve problems.


## 3. Other Reference Materials
* Quick references
  * [Summary of related learning materials](References.md)
  * [Cheatsheets](tips/cheatsheet)

* Machine learning tips
  * [Confusion Matrix](tips/confusion_matrix.ipynb)
  * [Datasets](tips/datasets.ipynb)
  * [Practical advice for building deep neural networks](tips/构建深度神经网络的一些实战建议.md)
  * [Intro to Deep Learning](tips/Intro_to_Deep_Learning.pdf)

* Python tips
  * [Installing the Python environment](tips/InstallPython.md)
  * [Python tips](tips/python)

* Git
  * [Git Tips - quick reference of common commands, quick start](https://gitee.com/pi-lab/learn_programming/blob/master/6_tools/git/git-tips.md)
  * [Git快速入门 - Git初体验](https://my.oschina.net/dxqr/blog/134811)
  * [在win7系统下使用TortoiseGit(乌龟git)简单操作Git](https://my.oschina.net/longxuu/blog/141699)
  * [Git系统学习 - 廖雪峰的Git教程](https://www.liaoxuefeng.com/wiki/0013739516305929606dd18361248578c67b8067c8c017b000)

* Markdown
  * [Markdown——入门指南](https://www.jianshu.com/p/1e402922ee32)


## 4. Further Learning Materials

After finishing the content above, you can go further into machine learning and computer vision. For details, see:
1. [Learn Programming Step by Step](https://gitee.com/pi-lab/learn_programming)
2. Intelligent Systems Lab - training tutorials and assignments
   - [Summer Camp Tutorials](https://gitee.com/pi-lab/SummerCamp)
   - [Summer Camp Homework](https://gitee.com/pi-lab/SummerCampHomework)
3. [Research Topics of the Intelligent Systems Lab](https://gitee.com/pi-lab/pilab_research_fields)
4. [Code Cookbook - reference code and programming tips](https://gitee.com/pi-lab/code_cook)
   - You can find examples of specific functionality in this collection, which speeds up writing your own code

+ 89
- 0
tips/InstallPython_EN.md

@@ -0,0 +1,89 @@
# Installing Python Environments

Python has many libraries with fairly complex interdependencies, so please read the instructions below carefully; following them reduces the chance of problems. *The steps listed here contain many details that may not match your system, so you may still run into issues. If you do, search online for a solution* — working through such problems is good practice for solving problems on your own.

You can install the Python environment by following section `1. Windows` or `2. Linux` below. If you want to install all the required packages at once, run the command below in the root directory of this project (Python 3.5 is required). If it fails, install the packages listed in `requirements.txt` manually, one by one.
```
pip install -r requirements.txt
```


## 1. Windows

### Installing Anaconda

Anaconda bundles most of the common Python packages, so it is a convenient way to get started. Because downloads over the network can be slow, using a mirror is recommended to speed them up. See the [Anaconda mirror documentation](https://mirrors.tuna.tsinghua.edu.cn/help/anaconda) for instructions.

Find the installer that suits your system and download it from
https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/


Configure the package channels (see https://mirror.tuna.tsinghua.edu.cn/help/anaconda/):
```
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --set show_channel_urls yes
```

### Installing PyTorch
```
conda install pytorch -c pytorch
pip3 install torchvision
```




## 2. Linux

### Installing pip
```
sudo apt-get install python3-pip
```



### Setting the pip mirror

```
pip config set global.index-url 'https://mirrors.ustc.edu.cn/pypi/web/simple'
```



### Installing common packages

```
pip install -r requirements.txt
```

Or install them manually:
```
sudo pip install scipy
sudo pip install scikit-learn
sudo pip install numpy
sudo pip install matplotlib
sudo pip install pandas
sudo pip install ipython
sudo pip install jupyter
```



### Installing PyTorch

Go to the [PyTorch website](https://pytorch.org) and choose the install command that matches your operating system and CUDA version.

For example, for Linux, Python 3.5, CUDA 9.0:
```
pip3 install torch torchvision
```



## 3. [Python Tips](python/)

- [Installing and using pip](python/pip.md)
- [Installing and using virtualenv](python/virtualenv.md)
- [virtualenv_wrapper: a convenient manager for virtualenv](python/virtualenv_wrapper.md)

