diff --git a/References_notes.md b/References_notes.md new file mode 100644 index 0000000..50eb9d0 --- /dev/null +++ b/References_notes.md @@ -0,0 +1,14 @@ +## Notebooks: + +machineLearning/10_digits_classification.ipynb + +MachineLearningNotebooks/05.%20Logistic%20Regression.ipynb + +MachineLearningNotebooks/08.%20Practical_NeuralNets.ipynb + + +## Exercise +http://sofasofa.io/competitions.php?type=practice +https://www.kaggle.com/competitions + +https://github.com/wmpscc/DataMiningNotesAndPractice/blob/master/2.KMeans%E7%AE%97%E6%B3%95%E4%B8%8E%E4%BA%A4%E9%80%9A%E4%BA%8B%E6%95%85%E7%90%86%E8%B5%94%E5%AE%A1%E6%A0%B8%E9%A2%84%E6%B5%8B.md \ No newline at end of file diff --git a/exercise/exercise.ipynb b/exercise/1_python.ipynb similarity index 66% rename from exercise/exercise.ipynb rename to exercise/1_python.ipynb index 79151af..baa1375 100644 --- a/exercise/exercise.ipynb +++ b/exercise/1_python.ipynb @@ -34,10 +34,15 @@ "* Above 1,000,000 yuan, the part over 1,000,000 yuan earns 1%.\n", "Read the month's profit I from the keyboard and compute the total bonus to be paid.\n", "\n", + "\n", "### (4) Loops\n", "Print the 9x9 multiplication table\n", "\n", - "### (5) Algorithms\n", + "\n", + "### (5) Use a while loop to compute and print the sum 2-3+4-5+6...+100\n", + "\n", + "\n", + "### (6) Algorithms\n", "Given a list of numbers, sort it in descending order\n", "\n", "For example\n", @@ -45,72 +50,19 @@ "1, 10, 4, 2, 9, 2, 34, 5, 9, 8, 5, 0\n", "```\n", "\n", - "### (6) Application 1\n", + "### (7) Application 1\n", "As an independent App Store developer, you are running a limited-time promotion and need activation codes (or coupons) for your app. How would you generate 200 activation codes (or coupons) with Python?\n", "\n", "Consider what an activation code is and what properties it should have. For example, `KR603guyVvR` is an activation code.\n", "\n", - "### (7) Application 2\n", + "### (8) Application 2\n", "Find all files of a given type under a directory.\n", "For example, find all `.dll` files under `c:`.\n", "\n", - "### (8) Application 3\n", + "### (9) Application 3\n", "You have a directory of source code (say C or Python). Count how many lines of code you have written, including blank lines and comments, but report each category separately.\n", "\n" ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Numerical Computing\n", - "\n", - "\n", - "### (1) How do you add a zero-filled border around an existing array?\n", - "For example, take the 2D matrix\n", - "```\n", - "10, 34, 54, 23\n", - "31, 87, 53, 68\n", - "98, 49, 25, 11\n", - "84, 32, 67, 88\n", - "```\n", - "\n", - "and transform it into\n", - "```\n", - " 0, 0, 0, 0, 0, 0\n", - " 0, 10, 34, 
54, 23, 0\n", - " 0, 31, 87, 53, 68, 0\n", - " 0, 98, 49, 25, 11, 0\n", - " 0, 84, 32, 67, 88, 0\n", - " 0, 0, 0, 0, 0, 0\n", - "```\n", - "\n", - "### (2) Create a 5x5 matrix with the values 1, 2, 3, 4 just below the diagonal\n", - "\n", - "\n", - "### (3) Create an 8x8 matrix and fill it with a checkerboard pattern\n", - "\n", - "\n", - "### (4) Solve a system of linear equations\n", - "\n", - "Given a system of equations, how do you find its solution? There are several methods; analyze the pros and cons of each (the simplest is elimination).\n", - "\n", - "For example\n", - "```\n", - "3x + 4y + 2z = 10\n", - "5x + 3y + 4z = 14\n", - "8x + 2y + 7z = 20\n", - "```\n", - "\n", - "Write a program that solves it\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { diff --git a/exercise/1_python.py b/exercise/1_python.py new file mode 100644 index 0000000..393c30b --- /dev/null +++ b/exercise/1_python.py @@ -0,0 +1,74 @@ +# -*- coding: utf-8 -*- +# --- +# jupyter: +# jupytext_format_version: '1.2' +# kernelspec: +# display_name: Python 3 +# language: python +# name: python3 +# language_info: +# codemirror_mode: +# name: ipython +# version: 3 +# file_extension: .py +# mimetype: text/x-python +# name: python +# nbconvert_exporter: python +# pygments_lexer: ipython3 +# version: 3.5.2 +# --- + +# # Python & Machine Learning Exercises + +# ## Python +# +# ### (1) Strings +# Given a passage, count the occurrences of each word +# +# ``` +# One is always on a strange road, watching strange scenery and listening to strange music. Then one day, you will find that the things you try hard to forget are already gone. +# ``` +# +# ### (2) Combinations +# Using the digits 1, 2, 3, 4, how many distinct three-digit numbers with no repeated digits can be formed? What are they? +# +# +# ### (3) Conditionals +# A company pays bonuses as a commission on profit (I): +# * Up to and including 100,000 yuan, the bonus is 10%; +# * Between 100,000 and 200,000 yuan, the first 100,000 yuan earns 10% and the part above 100,000 yuan earns 7.5%; +# * Between 200,000 and 400,000 yuan, the part above 200,000 yuan earns 5%; +# * Between 400,000 and 600,000 yuan, the part above 400,000 yuan earns 3%; +# * Between 600,000 and 1,000,000 yuan, the part above 600,000 yuan earns 1.5%; +# * Above 1,000,000 yuan, the part over 1,000,000 yuan earns 1%. +# Read the month's profit I from the keyboard and compute the total bonus to be paid.
+# +# +# ### (4) Loops +# Print the 9x9 multiplication table +# +# +# ### (5) Use a while loop to compute and print the sum 2-3+4-5+6...+100 +# +# +# ### (6) Algorithms +# Given a list of numbers, sort it in descending order +# +# For example +# ``` +# 1, 10, 4, 2, 9, 2, 34, 5, 9, 8, 5, 0 +# ``` +# +# ### (7) Application 1 +# As an independent App Store developer, you are running a limited-time promotion and need activation codes (or coupons) for your app. How would you generate 200 activation codes (or coupons) with Python? +# +# Consider what an activation code is and what properties it should have. For example, `KR603guyVvR` is an activation code. +# +# ### (8) Application 2 +# Find all files of a given type under a directory. +# For example, find all `.dll` files under `c:`. +# +# ### (9) Application 3 +# You have a directory of source code (say C or Python). Count how many lines of code you have written, including blank lines and comments, but report each category separately. +# +# diff --git a/exercise/2_numpy.ipynb b/exercise/2_numpy.ipynb new file mode 100644 index 0000000..2a7a643 --- /dev/null +++ b/exercise/2_numpy.ipynb @@ -0,0 +1,82 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Numerical Computing\n", + "\n", + "\n", + "### (1) How do you add a zero-filled border around an existing array?\n", + "For example, take the 2D matrix\n", + "```\n", + "10, 34, 54, 23\n", + "31, 87, 53, 68\n", + "98, 49, 25, 11\n", + "84, 32, 67, 88\n", + "```\n", + "\n", + "and transform it into\n", + "```\n", + " 0, 0, 0, 0, 0, 0\n", + " 0, 10, 34, 54, 23, 0\n", + " 0, 31, 87, 53, 68, 0\n", + " 0, 98, 49, 25, 11, 0\n", + " 0, 84, 32, 67, 88, 0\n", + " 0, 0, 0, 0, 0, 0\n", + "```\n", + "\n", + "### (2) Create a 5x5 matrix with the values 1, 2, 3, 4 just below the diagonal\n", + "\n", + "\n", + "### (3) Create an 8x8 matrix and fill it with a chessboard pattern (0 for black, 1 for white)\n", + "\n", + "\n", + "### (4) Solve a system of linear equations\n", + "\n", + "Given a system of equations, how do you find its solution? There are several methods; analyze the pros and cons of each (the simplest is elimination).\n", + "\n", + "For example\n", + "```\n", + "3x + 4y + 2z = 10\n", + "5x + 3y + 4z = 14\n", + "8x + 2y + 7z = 20\n", + "```\n", + "\n", + "Write a program that solves it\n", + "\n", + "\n", + "### (5) Reverse an array (the first element becomes the last)\n", + "\n", + "\n", + "### (6) Generate a 10x10 random array and find its maximum and minimum values\n", + "\n", + "\n", + "## Reference\n", + "* [100 numpy exercises](https://github.com/rougier/numpy-100)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": 
"python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.5.2" + }, + "main_language": "python" + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/exercise/2_numpy.py b/exercise/2_numpy.py new file mode 100644 index 0000000..0b53cd7 --- /dev/null +++ b/exercise/2_numpy.py @@ -0,0 +1,70 @@ +# -*- coding: utf-8 -*- +# --- +# jupyter: +# jupytext_format_version: '1.2' +# kernelspec: +# display_name: Python 3 +# language: python +# name: python3 +# language_info: +# codemirror_mode: +# name: ipython +# version: 3 +# file_extension: .py +# mimetype: text/x-python +# name: python +# nbconvert_exporter: python +# pygments_lexer: ipython3 +# version: 3.5.2 +# --- + +# ## Numerical Computing +# +# +# ### (1) How do you add a zero-filled border around an existing array? +# For example, take the 2D matrix +# ``` +# 10, 34, 54, 23 +# 31, 87, 53, 68 +# 98, 49, 25, 11 +# 84, 32, 67, 88 +# ``` +# +# and transform it into +# ``` +# 0, 0, 0, 0, 0, 0 +# 0, 10, 34, 54, 23, 0 +# 0, 31, 87, 53, 68, 0 +# 0, 98, 49, 25, 11, 0 +# 0, 84, 32, 67, 88, 0 +# 0, 0, 0, 0, 0, 0 +# ``` +# +# ### (2) Create a 5x5 matrix with the values 1, 2, 3, 4 just below the diagonal +# +# +# ### (3) Create an 8x8 matrix and fill it with a chessboard pattern (0 for black, 1 for white) +# +# +# ### (4) Solve a system of linear equations +# +# Given a system of equations, how do you find its solution? There are several methods; analyze the pros and cons of each (the simplest is elimination). +# +# For example +# ``` +# 3x + 4y + 2z = 10 +# 5x + 3y + 4z = 14 +# 8x + 2y + 7z = 20 +# ``` +# +# Write a program that solves it +# +# +# ### (5) Reverse an array (the first element becomes the last) +# +# +# ### (6) Generate a 10x10 random array and find its maximum and minimum values +# +# +# ## Reference +# * [100 numpy exercises](https://github.com/rougier/numpy-100) diff --git a/exercise/3_matplotlib.ipynb b/exercise/3_matplotlib.ipynb new file mode 100644 index 0000000..71b36e5 --- /dev/null +++ b/exercise/3_matplotlib.ipynb @@ -0,0 +1,37 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Matplotlib\n", + "\n", + "\n", + "## (1) Plot a quadratic function together with the trapezoids used by the trapezoidal rule to integrate it\n", + "\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": 
"ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.5.2" + }, + "main_language": "python" + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/exercise/3_matplotlib.py b/exercise/3_matplotlib.py new file mode 100644 index 0000000..9c0d808 --- /dev/null +++ b/exercise/3_matplotlib.py @@ -0,0 +1,26 @@ +# -*- coding: utf-8 -*- +# --- +# jupyter: +# jupytext_format_version: '1.2' +# kernelspec: +# display_name: Python 3 +# language: python +# name: python3 +# language_info: +# codemirror_mode: +# name: ipython +# version: 3 +# file_extension: .py +# mimetype: text/x-python +# name: python +# nbconvert_exporter: python +# pygments_lexer: ipython3 +# version: 3.5.2 +# --- + +# # Matplotlib +# +# +# ## (1) Plot a quadratic function together with the trapezoids used by the trapezoidal rule to integrate it +# +# diff --git a/knn/digital classification.ipynb b/knn/digital classification.ipynb deleted file mode 100644 index 720ea6b..0000000 --- a/knn/digital classification.ipynb +++ /dev/null @@ -1,82 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Digital Classification\n", - "\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Automatically created module for IPython interactive environment\n", - "KNN score: 0.953661\n", - "LogisticRegression score: 0.908248\n" - ] - } - ], - "source": [ - "print(__doc__)\n", - "\n", - "from sklearn import datasets, neighbors, linear_model\n", - "\n", - "digits = datasets.load_digits()\n", - "X_digits = digits.data\n", - "y_digits = digits.target\n", - "\n", - "n_samples = len(X_digits)\n", - "n_train = int(0.4 * n_samples)\n", - "\n", - "X_train = X_digits[:n_train]\n", - "y_train = y_digits[:n_train]\n", - "X_test = X_digits[n_train:]\n", - "y_test = y_digits[n_train:]\n", - "\n", - "knn = 
neighbors.KNeighborsClassifier()\n", - "logistic = linear_model.LogisticRegression()\n", - "\n", - "print('KNN score: %f' % knn.fit(X_train, y_train).score(X_test, y_test))\n", - "print('LogisticRegression score: %f' % logistic.fit(X_train, y_train).score(X_test, y_test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## References\n", - "* [Supervised learning: predicting an output variable from high-dimensional observations](http://scikit-learn.org/stable/tutorial/statistical_inference/supervised_learning.html)\n", - "* [Digits Classification Exercise](http://scikit-learn.org/stable/auto_examples/exercises/plot_digits_classification_exercise.html)\n" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.5.2" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/knn/knn_classification.ipynb b/knn/knn_classification.ipynb new file mode 100644 index 0000000..be5c04e --- /dev/null +++ b/knn/knn_classification.ipynb @@ -0,0 +1,139 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# KNN Classification\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Feature dimensions: (1797, 64)\n", + "Label dimensions: (1797,)\n" + ] + } + ], + "source": [ + "% matplotlib inline\n", + "\n", + "import matplotlib.pyplot as plt\n", + "from sklearn import datasets, neighbors, linear_model\n", + "\n", + "# load data\n", + "digits = datasets.load_digits()\n", + "X_digits = digits.data\n", + "y_digits = digits.target\n", + "\n", + "print(\"Feature dimensions: \", 
X_digits.shape)\n", + "print(\"Label dimensions: \", y_digits.shape)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAABLCAYAAABQtG2+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAFI5JREFUeJztnXmcFNW1x79nNgZmYAQGB9kExBEhUVRC1ERxeUZM3guo+USjiXlGJYGHL0bNxjMfSWIkLyaicSGSIHGLS94n6Iu7LwqK4jIRA0EZIovsy7DOvvV5f1RPV912ehime7q6M+f7+fRn7u1bXfc3t27dqjp17j2iqhiGYRjZQ07YAgzDMIzDwwZuwzCMLMMGbsMwjCzDBm7DMIwswwZuwzCMLMMGbsMwjCzDBm7DMIwsIyMGbhEZICKLRaRWRD4SkctC0DBLRCpEpFFEfp/u+gM6eonIwmg7VIvIeyJyQUhaHhaR7SJyUETWisjVYegI6DlWRBpE5OGQ6l8Srb8m+qkMQ0dUy6Ui8kH0nFknImekuf6auE+riNyVTg0BLSNF5FkR2SciO0TkbhHJC0HH8SLysogcEJEPReTC7qorIwZu4B6gCSgDLgfmi8j4NGvYBtwC3J/meuPJAzYDk4ES4CbgCREZGYKWucBIVe0HfBG4RUROCUFHG/cA74RYP8AsVS2Ofo4LQ4CInAf8N3Al0Bc4E1ifTg2BNigGBgP1wB/TqSHAvcAu4ChgAt65MzOdAqIXiqeAp4EBwHTgYREp7476Qh+4RaQIuBj4karWqOoy4H+Br6VTh6r+SVWfBPaks952dNSq6hxV3aiqEVV9GtgApH3AVNXVqtrYlo1+jkm3DvDuMIH9wF/CqD/D+DHwE1V9M9pHtqrq1hD1XIw3cL4WUv2jgCdUtUFVdwDPA+m+8RsLDAHmqWqrqr4MvE43jWOhD9xAOdCiqmsD3/2N9Dd8RiIiZXhttDqk+u8VkTpgDbAdeDYEDf2AnwDXp7vudpgrIlUi8rqInJXuykUkF5gIDIo+jm+JmgZ6p1tLgK8DD2p462fcAVwqIn1EZChwAd7gHTYCfKI7dpwJA3cxcDDuuwN4j4A9GhHJBx4BHlDVNWFoUNWZeMfiDOBPQGPHv+gWfgosVNUtIdQd5PvAaGAosAD4s4ik+wmkDMgHvoR3TCYAJ+GZ1NKOiByNZ5p4IIz6o7yKd6N3ENgCVABPpllDJd5Tx3dFJF9EPofXLn26o7JMGLhrgH5x3/UDqkPQkjGISA7wEJ7tf1aYWqKPfsuAYcCMdNYtIhOAfwHmpbPe9lDVt1S1WlUbVfUBvEfhz6dZRn30712qul1Vq4DbQ9DRxteAZaq6IYzKo+fJ83g3FUVAKdAf7x1A2lDVZmAa8AVgB3AD8ATehSTlZMLAvRbIE5FjA9+dSEimgUxARARYiHd3dXG0U2QCeaTfxn0WMBLYJCI7gBuBi0Xk3TTraA/FexxOX4Wq+/AGg6BZIswlPq8g3LvtAcAI4O7oBXUPsIgQLmSqulJVJ6vqQFU9H+/p7O3uqCv0gVtVa/Gulj8RkSIR+QwwFe9uM22ISJ6IFAK5QK6IFIbhUhRlPnA88G+qWn+ojbsDETky6nJWLCK5InI+8BXS/3JwAd7FYkL08xvgGeD8dIoQkSNE5Py2fiEil+N5c4RhS10EXBs9Rv2B7+B5M6QVETkdz2wUljcJ0SeODcCM6HE5As/mvjLdWkTkhGj/6CMiN+J5ufy+WypT1dA/eFfNJ4FaYBNwWQga5uB7Tr
R95oSg4+ho3Q14ZqS2z+Vp1jEIWIrnyXEQWAVckwF9ZQ7wcAj1DsJzRayOtsmbwHkhtUE+ngvcfrzH8l8DhSHouA94KAP6xARgCbAPqMIzUZSFoOO2qIYa4DlgTHfVJdEKDcMwjCwhdFOJYRiGcXjYwG0YhpFl2MBtGIaRZXRq4BaRKSJSGZ2p9YPuFmU6TIfpMB3/rDpSwSFfTkan2K4FzsPzH30H+Iqqvp/oNwXSSwsparespdT9fvDgvbH01tojnLLCLb77sqpS07KXPhQj5FBHNYUUkUsuDdTSpI0f86ftSMfHth3rX8N65bQ4Zft3+pM4VZWGvdu7TUfkCH+7kcN3OmU7mv15SqrKvsr9KdPRNNT9/hMDd8fSeyO5TtmeSn/b7j4ukud7ZEZGu/cZsrbJ14FSy8GU6Qj2B4Da5oJYOn9dQ0K9qdbRka74flr9vl+Wah1NQ9zvNdAlSvu6c+WOyvPbR1VZVdnEyFF55OVB5Wqld24/ciWP+tZqmiL1h6WjcaQ7EXF4sT9+bD4w0Ckr3O5P8lVValpT10+1vMDJB49F05pIu785FIl0tEdn/JQnAR+q6noAEXkMz8864cBdSBGflnPbLau6+DQn/90bHoulf/TXqU5Z+fXbY+l9TTv4x55lnBxdvXJDdAb4KBnLW9q+a3FHOuIZ8oA/OB/bZ5dT9uTt58TSNbs2svuZJ7pNR905n46lF95xu1M2d/uUWHr3ql28dXVFynRsuNY9Lm9/fX4s/Vh1f6fsocmTYunuPi65pUfG0vX3ustxFJz3USy9X/ewnvdTpiPYHwDe3joilh52ceK5YanW0ZGu+H669AS/fVKtY9M3T3fyTSX+4HTVua84ZbNL/dVul1fU8/3bdvO7h71B9YvjvHYcXXwKy6vad//uSMfamyc6+V+c4Y8fNzz9VafsuJ/7Cybub9rBP/a+nrL2aLr3aCc/sq9/Adl2atcmfSfS0R6dMZUMxVtmtI0t0e8cRGS6eOtZVzR3w3IWja01FOJ3zEJ608jH56Z0t47mugMZoaNuV11G6MiU49JIvenIQB1bd7QyeIh/e16YW0xDpDbtOhoitRnRHqkiZS8nVXWBqk5U1Yn59ErVbk2H6TAdpqPH6TgUnTGVbAWGB/LDot91iaBpBODSvvti6TuOqHHKnnn3hVh6eUU9F8woouoL3iN9w31r6EXqVrLcWD0gll40wl1W+Ldn+sFFGocU0fyKf6VuoD4pHZHJJzn51+65L5ZeG7dCydSBK2LpyjG1rCI5HWvn+yaPuee4x+UTd/rr0P/92/c6ZXedMTKWbq6ChpffS0pHR2yYMSaWbvq7azscg28q6UVvGpJsjyDBtoa4PrHN3fbJ2uJYuvLdWn755dTp2PfvrgnrhRG+CeuYx7/llI3hzVg61e0RT8EB/57vuZvPcspemjk2lj5Yv42dG5Yyd7u3QkHdgXcoAFrrd6Hq2ug7w1njEgcd+tW/uoGRnjrNP7dyVuWx+erk2iN3vB8345XxjyfeMK5/3FrlxtsImrS6SmfuuN8BjhWRUSJSAFyKF+ggrXxqQiGNB6poPLiHSGsLO9nMII5KtwwKRg2jnhrqtZaIRkLTMeaEPhmho3jA8IzQ0Y/+GaEjU45LprRH3+MGU735INXbqmltbg1Nx8DjSzOiPVLFIe+4VbVFRGYBL+AtwHS/qqZ95b68PGHYZy9i/bMLUFWGM4xiKUm3DCQ3l+OYwApeQ1GGMDIUHbl5khE6JCcz2iNHcjhOw9eRKcclU9pDcnOYdOOp/N9/vohGlDJGhNMeeTkZcVxSRadWv1PVZ0ki8knLOX7UrUv7vueUXTDl0li6ZKUbK+DLy9w3ui1Tj6Fs6vcAGDUjudUS400U95XfHci5LkD9VrmuP6VyFKUpulqvn+ba0YKPVQv/crZTtu6S3zj5+TImKR1j5/vxKx768SSn7Kalj8bS8V4lxX98y82nsD1yy4508l+7yH/T/vgitz8EH1
0ByjiOMs4CoHV1cnF83693379PK/L3t7bZfbn2Xysvd/JHD95NGSd6Ona6nh+Hy7TrX05YNvrJjl+epbKfjpjzRsKyD+ed6uSvKnPP42W/KOd0vNCLrZJceyx53z3mb5ck9va56yN34carLrqeCUwDoM9itw93hubSxDERrtzkm1ODHkgAPzvhKSe/lDEki82cNAzDyDJs4DYMw8gybOA2DMPIMtIS4aVhoF/NTbs+6ZRFViaOgfvOqtRGydo0x5/99dSVtzll5fmJpxwPfXGPk29Noabg7C6Axzf5dtznrnM1nr36MidfEHCH6wpO258w1ikLuml+eb1rW84b7Hablh3u1PxkCLr/AdxRsjiWXjrPdaP64H53Fl3OAV/XmO8kp+OlnW57BGcDxveVyCr3JVfrztS9ux/X2/W8Db4DyVm6In7zlFJ3oT+Ld9uZiWdiP3fRrzrcz+OX+f1n8LzkbNxjHnDPvpcefSSWvvLNM5yy95vKnHzftftj6a6cw/lrEntB75zq981JT21yysYVxJ8fZuM2DMPocdjAbRiGkWWkx1TS378+PLLcnQlW3kEQ5LySJiffcqAgwZadI+jSdN38C52yZ1e8mPB38W5AyV7tgi5vlT8Y7ZRddW7ihWZ6f9VdWyGVJpt4k9UXTvZj8Z70fNxUsLjwuCumDImlu2I2Cc4O/GC6O0tz/PLpsfQwXBPEhim/c/In3jaTVBFcwArgjAu/GUtXneiulhiv+Xh8HR250XWG+Mfsp/b4bqyb5rhmx1F/jDPpJekSGTQtjJjproh4X/kfEv7uquuud/KDFyfXBkEaBiQeA+JnPH/+vEucfLLtEXTtjJ8NGRw/Rj1/tVP2w6PcEyboxtpVTXbHbRiGkWXYwG0YhpFldMpUIiIbgWq8p/MWVZ3Y8S+6h62z55JT2AtyhH3a0Ok1jFPNMn2WXPIQBCGnx+tYsutB8iQfQUAjoemo/N1PycnvheTk8JHWh6Zj/byfklPg6dgeoo5M6R+mI/Ucjo37bFWt6kolhfv8Vd0+9cl1TtmBoJjBrvvOJeP+6uRvi8DQq2aQW1TMqB8u74qULrHr5LjVvJbAKUymQLq27OMHc/0psRum/CbhdpNm3+jk++/8+P+cjI6OCNqqgzZsgD33u0EGmkuWMHD2teQWF1HehaUIeh3w+0f8dPLVp/nuXreudO2K8eTWtHBa6TQKcnonPbU6nuAU6VI+3cGWoLnKoJuuIbdvEeXfqEiq3v85cLKTD9pxb73I/R9nT3ftpUUjCzjplJkUFBR1yXUwaH8tOM8tK9/mu0ROmj3DKeu/OLX9NLg8RXD1THBXSCwc4QYwuPxRt+2XnpzPpGO+QUFen6Tt3fEr/L0y+cpYunypW+/593/byY+8w48uFd+uncVMJYZhGFlGZ++4FXhRRBS4T1UXxG8gItOB6QCFJF6MJSlE2LZwAYiQr0cyTEa3s0kadAAreA0UhjLadAjsuvO3IEIfLQtNhwhU7P0zgjBEh4fYHsKuXy4MvT0Q4b2VixCEoTq4x/dTASo2/iHaP4aE2h7J0tmB+7OqulVEjgReEpE1qvpqcIPoYL4AoJ8M6DgCcRcZ+q1Z5JWU0FJTzZZb5lGkfekvg5xt0qFjImdTKL1p0gbe5bUer6Psxpnk9S+h9WANW753Z2g6Jg24kMLcYhpb66jYvTi89pj9Lb89vn1XaDpOOekaevUqoamphvfeuLfH99NJo6+gML8fjS21VKxZFJqOVNDZZV23Rv/uEpHFeAGEX+34Vz79Kn1L9s3DnnbKrpju+3zmT9tNRxz7cz8+cQ5DOMhe+jOog190D4Xi2bcKpJBBevg6gtN2b53o2m2DU6vfvnW+U3b25W4w5fpHhsRinAxatC6p9ghGwwEY8rI/xTnohw/w4Dg3iPG0/TOAJvJKChjUheMStB9fu/gzTlnQvnnPg3c7ZUEfb4BhVatppY486JKOIPGRZ4J2+DHfTxgnG4CRy9qiKZ
WwPUkdD/3JfYEWtGPHT8v/Usm7Tn7rJW3zBXox6I3kdKyNW15gbfPrsXTpc+57q/j5BcmeL8Gp5vHvQIJLRjSPdZfinf2oa8deOKNtmeT+DLouteNH8B1CfFu9cO6dTj7o597VZSsOaeMWkSIR6duWBj4H/L1LtSVBU10LLerF8mrVFvaykyLSvxB6pLkxI3S01jfT2uxNimhtbgxNR11dhEi9ty50pKEpvOPS0JQRx6W2LuIfl5bw2qOuLkKkwdMRaQyvf7RqZpy3tRnSHqmiM3fcZcBiEWnb/g+q+nzHP0k9tXsbqeAtUFCUwQynVAanWwYttTVUsCR0HU37aql8xrsDVY0wlKNC0bG3KsK2Ob/1dLRGGBGSjpb9tRlxXHbubmXVq95MSo1EGBJSe+zZHWH7r+/xMpHwjksjDaxkeUYcl0xoj1TRmdBl6yEazqOLBKdTXzL/Bqfsphv8SCt3rHMfC9+ZEJxa3I9TpYu+M+0QH5nk7NW+GeKV8W7EipbP+qaeHPI4dVFyOoKPVR25FbXctNctC+oaD6Nu992MRiXpdpa/353Gfe0tjyXYEqa94bp/nbk5ENUo8SJyXSK/qi6Wjl+Vb8DDxYFcMaNT2D92n+lGao6fXh9k/HI3As7pBwNT85Nsj1HzP3TzI/zp1PGP4N9c664eeUa5HwA7Z2dyKwleM9GdTv7Vm31X1fbcVNvoI8WcSnLHJXiuxv+Pr6zwz4l4M0r8aprnRPwlI5J1F403hwSDGE/u47bVf1wxy8n3WXr40XfiMXdAwzCMLMMGbsMwjCzDBm7DMIwsQ1RT76ooIruBWqBLU+TjKO3Efo5W1Y/59ZiOjNbxUSf3YTpMxz+Djs5oaVdHu6hqt3yAikzYj+nITB22D9tHT9pHKvejqmYqMQzDyDZs4DYMw8gyunPg/thCVCHtx3Sk9vep3I/tw/bRU/aRyv10z8tJwzAMo/swU4lhGEaWYQO3YRhGltEtA7eITBGRShH5UER+kMR+NorIKhF5T0QOezEO02E6TIfpyHYd7ZIqv8KAr2IusA4YDRQAfwPGdXFfG4FS02E6TIfp6Ik6En264457EvChqq5X1SbgMWDqIX7THZgO02E6TEe262iX7hi4hwKbA/kt0e+6Qlusy79GY8GZDtNhOkxHT9LRLp2NORkWh4x1aTpMh+kwHT1NR3fccW8Fhgfyw6LfHTYaiHUJtMW6NB2mw3SYjp6iI+FOU/rBu4tfD4zCN+qP78J+ioC+gfQbwBTTYTpMh+noKToSfVJuKlHVFhGZBbyA92b2flVdfYiftUdSsS5Nh+kwHaYj23Ukwqa8G4ZhZBk2c9IwDCPLsIHbMAwjy7CB2zAMI8uwgdswDCPLsIHbMAwjy7CB2zAMI8uwgdswDCPL+H+2ihC0591JagAAAABJRU5ErkJggg==\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "# plot sample images\n", + "nplot = 10\n", + "fig, axes = plt.subplots(nrows=1, ncols=nplot)\n", + "\n", + "for i in range(nplot):\n", + " img = X_digits[i].reshape(8, 8)\n", + " axes[i].imshow(img)\n", + " axes[i].set_title(y_digits[i])\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "# split train / test data\n", + "n_samples = len(X_digits)\n", + "n_train = int(0.4 * n_samples)\n", + "\n", + "X_train = X_digits[:n_train]\n", + "y_train = y_digits[:n_train]\n", + "X_test = X_digits[n_train:]\n", + "y_test = y_digits[n_train:]\n" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "KNN score: 0.953661\n", + "LogisticRegression score: 0.908248\n" + ] + } + ], + "source": [ + "# do KNN classification\n", + "knn = neighbors.KNeighborsClassifier()\n", + "logistic = linear_model.LogisticRegression()\n", + "\n", + "print('KNN score: %f' % knn.fit(X_train, y_train).score(X_test, y_test))\n", + "print('LogisticRegression score: %f' % logistic.fit(X_train, y_train).score(X_test, y_test))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## References\n", + "* [Digits Classification Exercise](http://scikit-learn.org/stable/auto_examples/exercises/plot_digits_classification_exercise.html)\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.5.2" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/knn/knn_classification.py 
b/knn/knn_classification.py new file mode 100644 index 0000000..b0a1732 --- /dev/null +++ b/knn/knn_classification.py @@ -0,0 +1,73 @@ +# --- +# jupyter: +# jupytext_format_version: '1.2' +# kernelspec: +# display_name: Python 3 +# language: python +# name: python3 +# language_info: +# codemirror_mode: +# name: ipython +# version: 3 +# file_extension: .py +# mimetype: text/x-python +# name: python +# nbconvert_exporter: python +# pygments_lexer: ipython3 +# version: 3.5.2 +# --- + +# # KNN Classification +# +# +# + +# + +% matplotlib inline + +import matplotlib.pyplot as plt +from sklearn import datasets, neighbors, linear_model + +# load data +digits = datasets.load_digits() +X_digits = digits.data +y_digits = digits.target + +print("Feature dimensions: ", X_digits.shape) +print("Label dimensions: ", y_digits.shape) + + +# + +# plot sample images +nplot = 10 +fig, axes = plt.subplots(nrows=1, ncols=nplot) + +for i in range(nplot): + img = X_digits[i].reshape(8, 8) + axes[i].imshow(img) + axes[i].set_title(y_digits[i]) + + +# + +# split train / test data +n_samples = len(X_digits) +n_train = int(0.4 * n_samples) + +X_train = X_digits[:n_train] +y_train = y_digits[:n_train] +X_test = X_digits[n_train:] +y_test = y_digits[n_train:] + + +# + +# do KNN classification +knn = neighbors.KNeighborsClassifier() +logistic = linear_model.LogisticRegression() + +print('KNN score: %f' % knn.fit(X_train, y_train).score(X_test, y_test)) +print('LogisticRegression score: %f' % logistic.fit(X_train, y_train).score(X_test, y_test)) +# - + +# ## References +# * [Digits Classification Exercise](http://scikit-learn.org/stable/auto_examples/exercises/plot_digits_classification_exercise.html) +# diff --git a/matplotlib/stinkbug.webp b/matplotlib/stinkbug.webp deleted file mode 100644 index 08f02e0..0000000 Binary files a/matplotlib/stinkbug.webp and /dev/null differ diff --git a/matplotlib/example.png b/numpy_matplotlib_scipy_sympy/example.png similarity index 100% rename from 
matplotlib/example.png rename to numpy_matplotlib_scipy_sympy/example.png diff --git a/matplotlib/matplotlib_ani1.ipynb b/numpy_matplotlib_scipy_sympy/matplotlib_ani1.ipynb similarity index 100% rename from matplotlib/matplotlib_ani1.ipynb rename to numpy_matplotlib_scipy_sympy/matplotlib_ani1.ipynb diff --git a/matplotlib/matplotlib_ani1.py b/numpy_matplotlib_scipy_sympy/matplotlib_ani1.py similarity index 100% rename from matplotlib/matplotlib_ani1.py rename to numpy_matplotlib_scipy_sympy/matplotlib_ani1.py diff --git a/matplotlib/matplotlib_ani2.ipynb b/numpy_matplotlib_scipy_sympy/matplotlib_ani2.ipynb similarity index 100% rename from matplotlib/matplotlib_ani2.ipynb rename to numpy_matplotlib_scipy_sympy/matplotlib_ani2.ipynb diff --git a/matplotlib/matplotlib_ani2.py b/numpy_matplotlib_scipy_sympy/matplotlib_ani2.py similarity index 100% rename from matplotlib/matplotlib_ani2.py rename to numpy_matplotlib_scipy_sympy/matplotlib_ani2.py diff --git a/matplotlib/Lecture-4-Matplotlib.ipynb b/numpy_matplotlib_scipy_sympy/matplotlib_full.ipynb similarity index 100% rename from matplotlib/Lecture-4-Matplotlib.ipynb rename to numpy_matplotlib_scipy_sympy/matplotlib_full.ipynb diff --git a/matplotlib/tutorial matplotlib.ipynb b/numpy_matplotlib_scipy_sympy/matplotlib_simple_tutorial.ipynb similarity index 100% rename from matplotlib/tutorial matplotlib.ipynb rename to numpy_matplotlib_scipy_sympy/matplotlib_simple_tutorial.ipynb diff --git a/numpy_scipy_sympy/Numpy.ipynb b/numpy_matplotlib_scipy_sympy/numpy.ipynb similarity index 100% rename from numpy_scipy_sympy/Numpy.ipynb rename to numpy_matplotlib_scipy_sympy/numpy.ipynb diff --git a/numpy_scipy_sympy/Scipy.ipynb b/numpy_matplotlib_scipy_sympy/scipy.ipynb similarity index 100% rename from numpy_scipy_sympy/Scipy.ipynb rename to numpy_matplotlib_scipy_sympy/scipy.ipynb diff --git a/numpy_scipy_sympy/stockholm_td_adj.dat b/numpy_matplotlib_scipy_sympy/stockholm_td_adj.dat similarity index 100% rename from 
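numpy_scipy_sympy/stockholm_td_adj.dat rename to numpy_matplotlib_scipy_sympy/stockholm_td_adj.dat diff --git a/numpy_scipy_sympy/Sympy.ipynb b/numpy_matplotlib_scipy_sympy/sympy.ipynb similarity index 100% rename from numpy_scipy_sympy/Sympy.ipynb rename to numpy_matplotlib_scipy_sympy/sympy.ipynb diff --git a/python/tips/README.md b/python/tips/README.md new file mode 100644 index 0000000..8160ea7 --- /dev/null +++ b/python/tips/README.md @@ -0,0 +1,16 @@ +# Python Tips + + +## Python's package manager: `pip` +Python development is modular, so you can build on packages that other people have already written to get a specific task done quickly. Python has several package management tools to make installing packages easier; `pip` is currently the most widely used. + +* [Installing and using pip](pip.md) + +Because reaching servers abroad directly with pip is slow, configure a pip mirror to speed up package installation. + + +## Python virtual environments: `virtualenv` +Installing packages with `pip` is convenient and greatly speeds up development. But with so many public packages, dependency conflicts that prevent some programs from installing are inevitable. One solution is to use `docker` to build an isolated environment for the required packages, but sometimes you still want to install on the local machine; in that case use `virtualenv` to create a virtual Python environment. + +* [Installing and using virtualenv](virtualenv.md) +* [A convenient virtualenv manager: virtualenv_wrapper](virtualenv_wrapper.md) \ No newline at end of file diff --git a/python/tips/pip.md b/python/tips/pip.md new file mode 100644 index 0000000..a2a54e8 --- /dev/null +++ b/python/tips/pip.md @@ -0,0 +1,66 @@ +# Python's package manager: `pip` + +Python development is modular, so you can build on packages that other people have already written to get a specific task done quickly. Python has several package management tools to make installing packages easier; `pip` is currently the most widely used. + +## 1. Installing pip +On Ubuntu you can install pip directly from the system packages + +``` +# pip for Python 3 (Python 3 is recommended) +sudo apt-get install python3-pip + +# pip for Python 2 +sudo apt-get install python-pip +``` + +Upgrade pip +``` +sudo pip3 install --upgrade pip +``` + +After installation, run `pip` to see a brief usage summary. **Note that when pip was installed through the system package manager, installing packages system-wide with it requires sudo.** + + +## 2.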
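pip commands + +### 2.1 Searching for a package by name +``` +pip search numpy +``` +This lists many packages related to numpy; copy the package name at the start of a line and install it with the install command. Note that PyPI has since disabled its search API, so `pip search` may fail against pypi.org; searching on the PyPI website is the alternative. + + +### 2.2 Installing a package +``` +$ pip install numpy +``` +This installs the `numpy` package, and its dependencies are installed automatically as well. + +Install a package using a given URL +``` +$ pip install -f URL PACKAGE # look for the package's archives at the given URL +``` + + +### 2.3 Upgrading a package +``` +$ pip install -U PACKAGE # upgrade the package +``` + +### 2.4 Listing the packages installed on the system +``` +$ pip list +``` + +Show information about an installed package +``` +$ pip show numpy +``` + + +## 3. Configuring a pip mirror +Because reaching servers abroad directly with pip is slow, configure a pip mirror to speed up package installation. There are many pip mirrors in China; choosing any one of them makes installation much faster. + +``` +pip config set global.index-url 'https://mirrors.ustc.edu.cn/pypi/web/simple' +``` \ No newline at end of file diff --git a/python/tips/virtualenv.md b/python/tips/virtualenv.md new file mode 100644 index 0000000..1f9fc19 --- /dev/null +++ b/python/tips/virtualenv.md @@ -0,0 +1,59 @@ +# virtualenv manual + + +## 1. Install +virtualenv is a tool for creating isolated Python environments. It creates a folder containing all the executables needed to use the packages a Python project requires. +``` +pip install virtualenv +``` + +If the current pip belongs to Python 2, environments created afterwards default to Python 2; otherwise they default to Python 3. + + +## 2. Creating a virtual environment + +Create a virtual environment +``` +$ mkdir -p ~/virtualenv; cd ~/virtualenv + +$ virtualenv venv # venv is the directory name of the virtual environment +``` + +virtualenv venv creates a folder in the current directory that contains the Python executable and a copy of the pip library, so that other packages can be installed. The name of the virtual environment (venv in this example) can be anything; if the name is omitted, the files are placed in the current directory. + +Whichever directory you run the command in, it creates a copy of Python there and places it in a folder named venv. + +You can also choose which Python interpreter to use: +``` +$ virtualenv -p /usr/bin/python2.7 venv # -p specifies the path of the Python interpreter +``` + + +## 3. Using a virtual environment + +To start using a virtual environment, it must be activated: +``` +$ source ~/virtualenv/venv/bin/activate +``` + +From now on, any package you install with pip is placed in the venv folder, isolated from the globally installed Python. + +Install packages as usual, for example: +``` +$ pip install requests +``` + + +## 4. When you are done working in the virtual environment for the moment, you can deactivate it: +``` +$ deactivate +``` +This returns you to the system's default Python interpreter, and the installed libraries revert to the defaults as well. + + +## 5.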
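Deleting a virtual environment +To delete a virtual environment, just delete its folder (run rm -rf venv). + + +virtualenv has one inconvenience here: the activation script lives inside each environment's own folder, so after a while you may have many virtual environments scattered across the system and forget their names or locations. + diff --git a/python/tips/virtualenv_wrapper.md b/python/tips/virtualenv_wrapper.md new file mode 100644 index 0000000..b2ac58b --- /dev/null +++ b/python/tips/virtualenv_wrapper.md @@ -0,0 +1,59 @@ +# virtualenvwrapper + +Because virtualenv makes it hard to manage virtual environments in one place, using virtualenvwrapper directly is recommended. virtualenvwrapper provides a set of commands that make working with virtual environments convenient, and it keeps all your virtual environments in one place. + + +Install virtualenvwrapper (make sure virtualenv is already installed) +``` +pip install virtualenvwrapper +pip install virtualenvwrapper-win # use this command on Windows +``` + + +After installation, add the following to ~/.bashrc + +``` +# virtualenv +export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3 +export WORKON_HOME=/home/bushuhui/virtualenv +source /usr/local/bin/virtualenvwrapper.sh +``` +VIRTUALENVWRAPPER_PYTHON specifies which Python interpreter to use + + +## 1. Creating a virtual environment: mkvirtualenv + +``` +mkvirtualenv venv +``` +This creates a virtual environment named venv under the directory given by the WORKON_HOME variable. +To choose a Python version, pass the interpreter with "--python" +``` +mkvirtualenv --python=/usr/local/python3.5.3/bin/python venv +``` + +## 2. Basic commands + +List the existing virtual environments +``` +[root@localhost ~]# workon +py2 +py3 +``` + +Switch to a virtual environment +``` +[root@localhost ~]# workon py3 +(py3) [root@localhost ~]# +``` + +Leave the virtual environment +``` +(py3) [root@localhost ~]# deactivate +[root@localhost ~]# +``` + +Delete a virtual environment +``` +rmvirtualenv venv +```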
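
As a rough illustration of what `workon` with no arguments reports, the following Python sketch lists the environments found under a `WORKON_HOME`-style directory. It is not how virtualenvwrapper itself is implemented. The function name `list_virtualenvs`, the fallback path `~/virtualenv`, and the rule that a subdirectory counts as an environment when it contains `bin/activate` or `Scripts/activate` are all assumptions made for this example.

```python
import os


def list_virtualenvs(workon_home):
    """Return sorted names of subdirectories that look like virtual environments.

    Assumption for this sketch: an environment is a direct subdirectory that
    contains bin/activate (POSIX layout) or Scripts/activate (Windows layout).
    """
    names = []
    for entry in os.listdir(workon_home):
        env_dir = os.path.join(workon_home, entry)
        if not os.path.isdir(env_dir):
            continue  # plain files are never environments
        markers = (
            os.path.join(env_dir, "bin", "activate"),
            os.path.join(env_dir, "Scripts", "activate"),
        )
        if any(os.path.isfile(marker) for marker in markers):
            names.append(entry)
    return sorted(names)


if __name__ == "__main__":
    # Hypothetical default: fall back to ~/virtualenv when WORKON_HOME is unset.
    home = os.environ.get("WORKON_HOME", os.path.expanduser("~/virtualenv"))
    if os.path.isdir(home):
        for name in list_virtualenvs(home):
            print(name)
```

Checking for the activation script rather than listing every subdirectory keeps unrelated folders out of the result, which is also why a stray directory under `WORKON_HOME` would not show up here.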