
k-means.ipynb 210 kB


  1. {
  2. "cells": [
  3. {
  4. "cell_type": "markdown",
  5. "metadata": {},
  6. "source": [
  7. "# k-means"
  8. ]
  9. },
  10. {
  11. "cell_type": "markdown",
  12. "metadata": {},
  13. "source": [
  14. "## Theory\n",
  15. "\n",
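  16. "K-Means is probably the best-known clustering method, thanks to its speed and good scalability. It is an iterative process that moves the center of each cluster, called its centroid, to the mean position of the cluster's members and then reassigns instances to the nearest centroid.\n",
  17. "\n",
  18. "K is a hyperparameter that specifies the number of clusters and must be chosen before the algorithm runs. K-Means assigns samples to clusters automatically, but it cannot decide how many clusters there should be.\n",
  19. "\n",
  20. "K must be a positive integer smaller than the number of samples in the training set. Sometimes the number of clusters is dictated by the problem. For example, a shoe factory with three new models may want to know which potential customers suit each model, so it surveys customers and looks for three clusters in the data. In other problems the number of clusters is not specified, and the optimal number is unknown.\n",
  21. "\n",
  22. "The parameters of K-Means are the positions of the centroids and the assignment of each observation to a cluster. As with generalized linear models and decision trees, the optimal parameters are found by minimizing a cost function. The K-Means cost function is:\n",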
  23. "$$\n",
  24. "J = \\sum_{k=1}^{K} \\sum_{i \\in C_k} \\lVert x_i - u_k \\rVert^2\n",
  25. "$$\n",
  26. "\n",
  27. "$u_k$ is the centroid of the $k$-th cluster, defined as the mean of its members:\n",
  28. "$$\n",
  29. "u_k = \\frac{1}{|C_k|} \\sum_{i \\in C_k} x_i\n",
  30. "$$\n",
  31. "\n",
  32. "\n",
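  33. "The cost function is the sum of the distortions of the clusters. The distortion of a cluster is the sum of the squared distances between its centroid and its members. The more compact the members of a cluster, the smaller its distortion; the more dispersed they are, the larger its distortion.\n",
  34. "\n",
  35. "Minimizing the cost function is an iterative process of assigning observations to clusters and then moving the centroids:\n",
  36. "1. First, the centroids are initialized to random positions. In practice, each centroid is set to the position of a randomly selected observation.\n",
  37. "2. In each iteration, K-Means assigns every observation to the nearest centroid and then moves each centroid to the mean position of the observations assigned to it.\n",
  38. "3. The algorithm stops when the maximum number of iterations is reached or when the change between two successive iterations falls below a set threshold; otherwise step 2 is repeated.\n",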
  39. "\n"
  40. ]
  41. },
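The cost function above can be evaluated directly with NumPy. The following is a minimal sketch, not part of the original notebook: the function name `kmeans_cost` and the example assignment in `labels` are assumptions made for illustration, while the twelve points are the ones plotted in the next cell.

```python
import numpy as np

def kmeans_cost(X, labels, centroids):
    """Return J: the sum of squared distances from each point to its own centroid."""
    diffs = X - centroids[labels]  # row i is x_i minus the centroid of its cluster
    return float(np.sum(diffs ** 2))

# The twelve two-dimensional points used in this notebook.
X = np.array([[7, 5], [5, 7], [7, 7], [3, 3], [4, 6], [1, 4],
              [0, 0], [2, 2], [8, 7], [6, 8], [5, 5], [3, 7]], dtype=float)

# Hypothetical assignment, chosen only for illustration (cluster 0 or cluster 1).
labels = np.array([0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0])

# Each centroid u_k is the mean of the points assigned to cluster k.
centroids = np.array([X[labels == k].mean(axis=0) for k in range(2)])

J = kmeans_cost(X, labels, centroids)
print(centroids)
print(J)
```

Because each centroid is the mean of its cluster, moving either centroid away from that mean while keeping the same assignment can only increase `J`.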
  42. {
  43. "cell_type": "code",
  44. "execution_count": 1,
  45. "metadata": {},
  46. "outputs": [
  47. {
  48. "data": {
  50. "text/plain": [
  51. "<Figure size 432x288 with 1 Axes>"
  52. ]
  53. },
  54. "metadata": {
  55. "needs_background": "light"
  56. },
  57. "output_type": "display_data"
  58. }
  59. ],
  60. "source": [
  61. "%matplotlib inline\n",
  62. "import matplotlib.pyplot as plt\n",
  63. "import numpy as np\n",
  64. "\n",
  65. "X0 = np.array([7, 5, 7, 3, 4, 1, 0, 2, 8, 6, 5, 3])\n",
  66. "X1 = np.array([5, 7, 7, 3, 6, 4, 0, 2, 7, 8, 5, 7])\n",
  67. "plt.figure()\n",
  68. "plt.axis([-1, 9, -1, 9])\n",
  69. "plt.grid(True)\n",
  70. "plt.plot(X0, X1, 'k.');"
  71. ]
  72. },
  73. {
  74. "cell_type": "markdown",
  75. "metadata": {},
  76. "source": [
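  77. "Suppose K-Means initializes the centroid of the first cluster at the fifth sample and the centroid of the second cluster at the eleventh sample. We can then compute the distance from each instance to both centroids and assign it to the nearer cluster. The results are shown in the following table:\n",
  78. "![data_0](images/data_0.png)\n",
  79. "\n",
  80. "The initial centroids and the resulting cluster assignments are plotted below. The first cluster is drawn with X markers and the second with dots. The centroids are drawn with larger markers.\n",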
  81. "\n",
  82. "\n"
  83. ]
  84. },
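The assignment step described above can be reproduced in a few lines. This is a sketch under stated assumptions rather than the notebook's own code: the names `centroids`, `distances`, and `labels` are illustrative, and the fifth and eleventh samples are taken to mean zero-based indices 4 and 10.

```python
import numpy as np

X0 = np.array([7, 5, 7, 3, 4, 1, 0, 2, 8, 6, 5, 3])
X1 = np.array([5, 7, 7, 3, 6, 4, 0, 2, 7, 8, 5, 7])
X = np.column_stack([X0, X1]).astype(float)

# Initial centroids: the fifth and eleventh samples (zero-based indices 4 and 10).
centroids = X[[4, 10]]

# distances[i, k] is the Euclidean distance from sample i to centroid k.
distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)

# Each sample goes to the cluster whose centroid is nearest.
labels = distances.argmin(axis=1)

C1 = np.flatnonzero(labels == 0).tolist()
C2 = np.flatnonzero(labels == 1).tolist()
print(C1)
print(C2)
```

The resulting `C1` should agree with the index list that the next cell hard-codes for the first cluster.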
  85. {
  86. "cell_type": "code",
  87. "execution_count": 2,
  88. "metadata": {},
  89. "outputs": [
  90. {
  91. "data": {
  93. "text/plain": [
  94. "<Figure size 432x288 with 1 Axes>"
  95. ]
  96. },
  97. "metadata": {
  98. "needs_background": "light"
  99. },
  100. "output_type": "display_data"
  101. }
  102. ],
  103. "source": [
  104. "C1 = [1, 4, 5, 9, 11]\n",
  105. "C2 = list(set(range(12)) - set(C1))\n",
  106. "X0C1, X1C1 = X0[C1], X1[C1]\n",
  107. "X0C2, X1C2 = X0[C2], X1[C2]\n",
  108. "plt.figure()\n",
  109. "plt.title('1st iteration results')\n",
  110. "plt.axis([-1, 9, -1, 9])\n",
  111. "plt.grid(True)\n",
  112. "plt.plot(X0C1, X1C1, 'rx')\n",
  113. "plt.plot(X0C2, X1C2, 'g.')\n",
  114. "plt.plot(4,6,'rx',ms=12.0)\n",
  115. "plt.plot(5,5,'g.',ms=12.0);"
  116. ]
  117. },
  118. {
  119. "cell_type": "markdown",
  120. "metadata": {},
  121. "source": [
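  122. "Now we recompute the centroid of each cluster, move the centroids to their new positions, recompute the distance from every sample to the new centroids, and reassign the samples to the nearer cluster. The results are shown in the following table:\n",
  123. "\n",
  124. "![data_1](images/data_1.png)\n",
  125. "\n",
  126. "The results are plotted below:"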
  127. ]
  128. },
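The update step can be sketched the same way. The code below is an illustration, not the notebook's own implementation: it starts from the first-iteration clusters, averages each cluster to get the new centroids, and reassigns every sample. The names `new_centroids` and `new_C1` are assumptions introduced here.

```python
import numpy as np

X0 = np.array([7, 5, 7, 3, 4, 1, 0, 2, 8, 6, 5, 3])
X1 = np.array([5, 7, 7, 3, 6, 4, 0, 2, 7, 8, 5, 7])
X = np.column_stack([X0, X1]).astype(float)

# Clusters after the first assignment (zero-based sample indices).
C1 = [1, 4, 5, 9, 11]
C2 = [i for i in range(len(X)) if i not in C1]

# Update step: each centroid moves to the mean of its current members.
new_centroids = np.array([X[C1].mean(axis=0), X[C2].mean(axis=0)])

# Assignment step with the moved centroids.
distances = np.linalg.norm(X[:, None, :] - new_centroids[None, :, :], axis=2)
labels = distances.argmin(axis=1)
new_C1 = np.flatnonzero(labels == 0).tolist()

print(new_centroids.round(2))
print(new_C1)
```

The rounded centroids correspond to the large markers drawn at (3.8, 6.4) and (4.57, 4.14) in the second plot.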
  129. {
  130. "cell_type": "code",
  131. "execution_count": 3,
  132. "metadata": {},
  133. "outputs": [
  134. {
  135. "data": {
  137. "text/plain": [
  138. "<Figure size 432x288 with 1 Axes>"
  139. ]
  140. },
  141. "metadata": {
  142. "needs_background": "light"
  143. },
  144. "output_type": "display_data"
  145. }
  146. ],
  147. "source": [
  148. "C1 = [1, 2, 4, 8, 9, 11]\n",
  149. "C2 = list(set(range(12)) - set(C1))\n",
  150. "X0C1, X1C1 = X0[C1], X1[C1]\n",
  151. "X0C2, X1C2 = X0[C2], X1[C2]\n",
  152. "plt.figure()\n",
  153. "plt.title('2nd iteration results')\n",
  154. "plt.axis([-1, 9, -1, 9])\n",
  155. "plt.grid(True)\n",
  156. "plt.plot(X0C1, X1C1, 'rx')\n",
  157. "plt.plot(X0C2, X1C2, 'g.')\n",
  158. "plt.plot(3.8,6.4,'rx',ms=12.0)\n",
  159. "plt.plot(4.57,4.14,'g.',ms=12.0);"
  160. ]
  161. },
  162. {
  163. "cell_type": "markdown",
  164. "metadata": {},
  165. "source": [
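  166. "We repeat the procedure once more: move the centroids to their new positions, recompute the distance from every sample to the new centroids, and reassign the samples to the nearer cluster. The results are shown in the following table:\n",
  167. "![data_2](images/data_2.png)\n",
  168. "\n",
  169. "The results are plotted below:\n"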
  170. ]
  171. },
  172. {
  173. "cell_type": "code",
  174. "execution_count": 4,
  175. "metadata": {},
  176. "outputs": [
  177. {
  178. "data": {
  180. "text/plain": [
  181. "<Figure size 432x288 with 1 Axes>"
  182. ]
  183. },
  184. "metadata": {
  185. "needs_background": "light"
  186. },
  187. "output_type": "display_data"
  188. }
  189. ],
  190. "source": [
  191. "C1 = [0, 1, 2, 4, 8, 9, 10, 11]\n",
  192. "C2 = list(set(range(12)) - set(C1))\n",
  193. "X0C1, X1C1 = X0[C1], X1[C1]\n",
  194. "X0C2, X1C2 = X0[C2], X1[C2]\n",
  195. "plt.figure()\n",
  196. "plt.title('3rd iteration results')\n",
  197. "plt.axis([-1, 9, -1, 9])\n",
  198. "plt.grid(True)\n",
  199. "plt.plot(X0C1, X1C1, 'rx')\n",
  200. "plt.plot(X0C2, X1C2, 'g.')\n",
  201. "plt.plot(5.5,7.0,'rx',ms=12.0)\n",
  202. "plt.plot(2.2,2.8,'g.',ms=12.0);"
  203. ]
  204. },
  205. {
  206. "cell_type": "markdown",
  207. "metadata": {},
  208. "source": [
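  209. "Repeating the procedure once more shows that the centroids no longer move, so K-Means stops: it ends the re-clustering loop as soon as a stopping condition is met. Typically the condition is that the change in the cost function between two successive iterations falls below a threshold, or that the change in the centroid positions between two successive iterations falls below a threshold. If these thresholds are small enough, K-Means converges to an optimum. This optimum is a local one, however, and is not necessarily the global optimum.\n",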
  210. "\n"
  211. ]
  212. },
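The two stopping conditions for K-Means (a small change in the cost function, or a small change in the centroid positions) can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook's implementation: the function name `has_converged` and the thresholds `cost_tol` and `move_tol` are hypothetical names chosen for this example, and the cost values passed in at the end are arbitrary.

```python
import numpy as np


def has_converged(old_centroids, new_centroids, old_cost, new_cost,
                  cost_tol=1e-4, move_tol=1e-4):
    """Return True if either stopping condition is met.

    cost_tol and move_tol are illustrative thresholds, not values from the text.
    """
    # Condition 1: the cost function changed by less than cost_tol
    cost_change = abs(old_cost - new_cost)
    # Condition 2: no centroid moved farther than move_tol
    shifts = np.linalg.norm(new_centroids - old_centroids, axis=1)
    max_shift = float(np.max(shifts))
    return bool(cost_change < cost_tol or max_shift < move_tol)


# Centroids that did not move between two iterations (cost values are arbitrary)
old = np.array([[5.5, 7.0], [2.2, 2.8]])
new = np.array([[5.5, 7.0], [2.2, 2.8]])
print(has_converged(old, new, old_cost=10.0, new_cost=10.0))
```

Checking both conditions with `or` mirrors the text: either one is enough to stop. Smaller thresholds mean more iterations before the loop ends.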
  213. {
  214. "cell_type": "markdown",
  215. "metadata": {},
  216. "source": [
  217. "## Program"
  218. ]
  219. },
  220. {
  221. "cell_type": "code",
  222. "execution_count": 6,
  223. "metadata": {},
  224. "outputs": [
  225. {
  226. "data": {
  227. "text/html": [
  228. "<div>\n",
  229. "<style scoped>\n",
  230. " .dataframe tbody tr th:only-of-type {\n",
  231. " vertical-align: middle;\n",
  232. " }\n",
  233. "\n",
  234. " .dataframe tbody tr th {\n",
  235. " vertical-align: top;\n",
  236. " }\n",
  237. "\n",
  238. " .dataframe thead th {\n",
  239. " text-align: right;\n",
  240. " }\n",
  241. "</style>\n",
  242. "<table border=\"1\" class=\"dataframe\">\n",
  243. " <thead>\n",
  244. " <tr style=\"text-align: right;\">\n",
  245. " <th></th>\n",
  246. " <th>sepal-length</th>\n",
  247. " <th>sepal-width</th>\n",
  248. " <th>petal-length</th>\n",
  249. " <th>petal-width</th>\n",
  250. " <th>class</th>\n",
  251. " </tr>\n",
  252. " </thead>\n",
  253. " <tbody>\n",
  254. " <tr>\n",
  255. " <th>0</th>\n",
  256. " <td>5.1</td>\n",
  257. " <td>3.5</td>\n",
  258. " <td>1.4</td>\n",
  259. " <td>0.2</td>\n",
  260. " <td>Iris-setosa</td>\n",
  261. " </tr>\n",
  262. " <tr>\n",
  263. " <th>1</th>\n",
  264. " <td>4.9</td>\n",
  265. " <td>3.0</td>\n",
  266. " <td>1.4</td>\n",
  267. " <td>0.2</td>\n",
  268. " <td>Iris-setosa</td>\n",
  269. " </tr>\n",
  270. " <tr>\n",
  271. " <th>2</th>\n",
  272. " <td>4.7</td>\n",
  273. " <td>3.2</td>\n",
  274. " <td>1.3</td>\n",
  275. " <td>0.2</td>\n",
  276. " <td>Iris-setosa</td>\n",
  277. " </tr>\n",
  278. " <tr>\n",
  279. " <th>3</th>\n",
  280. " <td>4.6</td>\n",
  281. " <td>3.1</td>\n",
  282. " <td>1.5</td>\n",
  283. " <td>0.2</td>\n",
  284. " <td>Iris-setosa</td>\n",
  285. " </tr>\n",
  286. " <tr>\n",
  287. " <th>4</th>\n",
  288. " <td>5.0</td>\n",
  289. " <td>3.6</td>\n",
  290. " <td>1.4</td>\n",
  291. " <td>0.2</td>\n",
  292. " <td>Iris-setosa</td>\n",
  293. " </tr>\n",
  294. " </tbody>\n",
  295. "</table>\n",
  296. "</div>"
  297. ],
  298. "text/plain": [
  299. "   sepal-length  sepal-width  petal-length  petal-width        class\n",
  300. "0           5.1          3.5           1.4          0.2  Iris-setosa\n",
  301. "1           4.9          3.0           1.4          0.2  Iris-setosa\n",
  302. "2           4.7          3.2           1.3          0.2  Iris-setosa\n",
  303. "3           4.6          3.1           1.5          0.2  Iris-setosa\n",
  304. "4           5.0          3.6           1.4          0.2  Iris-setosa"
  305. ]
  306. },
  307. "execution_count": 6,
  308. "metadata": {},
  309. "output_type": "execute_result"
  310. }
  311. ],
  312. "source": [
  313. "# This line configures matplotlib to show figures embedded in the notebook, \n",
  314. "# instead of opening a new window for each figure. More about that later. \n",
  315. "# If you are using an old version of IPython, try using '%pylab inline' instead.\n",
  316. "%matplotlib inline\n",
  317. "\n",
  318. "# import libraries\n",
  319. "from numpy import *\n",
  320. "import matplotlib.pyplot as plt\n",
  321. "import pandas as pd\n",
  322. "\n",
  323. "# Load dataset\n",
  324. "names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']\n",
  325. "dataset = pd.read_csv(\"iris.csv\", header=0, index_col=0)\n",
  326. "dataset.head()\n"
  327. ]
  328. },
  329. {
  330. "cell_type": "code",
  331. "execution_count": 7,
  332. "metadata": {
  333. "lines_to_next_cell": 2
  334. },
  335. "outputs": [],
  358. "source": [
  359. "# Encode the three classes as the integers 0, 1, 2 (.loc avoids chained assignment on a copy)\n",
  360. "dataset.loc[dataset['class'] == 'Iris-setosa', 'class'] = 0\n",
  361. "dataset.loc[dataset['class'] == 'Iris-versicolor', 'class'] = 1\n",
  362. "dataset.loc[dataset['class'] == 'Iris-virginica', 'class'] = 2"
  363. ]
  364. },
  365. {
  366. "cell_type": "code",
  367. "execution_count": 8,
  368. "metadata": {
  369. "lines_to_next_cell": 2
  370. },
  371. "outputs": [],
  372. "source": [
  373. "def originalDatashow(dataSet):\n",
  374. "    # Plot the original sample points\n",
  375. "    num, dim = shape(dataSet)\n",
  376. "    marksamples = ['ob']  # marker style for the samples\n",
  377. "    for i in range(num):\n",
  378. "        plt.plot(dataSet.iat[i, 0], dataSet.iat[i, 1], marksamples[0], markersize=5)\n",
  379. "    plt.title('original dataset')\n",
  380. "    plt.xlabel('sepal length')\n",
  381. "    plt.ylabel('sepal width')\n",
  382. "    plt.show()"
  383. ]
  384. },
  385. {
  386. "cell_type": "code",
  387. "execution_count": 9,
  388. "metadata": {
  389. "lines_to_end_of_cell_marker": 2,
  390. "scrolled": true
  391. },
  392. "outputs": [
  393. {
  394. "data": {
  395. "image/png": "\n",
  396. "text/plain": [
  397. "<Figure size 432x288 with 1 Axes>"
  398. ]
  399. },
  400. "metadata": {
  401. "needs_background": "light"
  402. },
  403. "output_type": "display_data"
  404. }
  405. ],
  406. "source": [
  407. "# Select the sample data (two features)\n",
  408. "datamat = dataset.loc[:, ['sepal-length', 'sepal-width']]\n",
  409. "# Ground-truth labels\n",
  410. "labels = dataset.loc[:, ['class']]\n",
  411. "# Show the original data\n",
  412. "originalDatashow(datamat)"
  413. ]
  414. },
  415. {
  416. "cell_type": "code",
  417. "execution_count": 11,
  418. "metadata": {},
  419. "outputs": [],
  420. "source": [
  421. "def randChosenCent(dataSet, k):\n",
  422. "    \"\"\"Initialize the centroids by picking k distinct samples at random.\"\"\"\n",
  423. "\n",
  424. "    # Number of samples\n",
  425. "    m = shape(dataSet)[0]\n",
  426. "    # Indices of the samples chosen as centroids\n",
  427. "    centroidsIndex = []\n",
  428. "    # Candidate sample indices that have not been drawn yet\n",
  429. "    dataIndex = list(range(m))\n",
  430. "    for i in range(k):\n",
  431. "        # numpy's random.randint excludes the upper bound, so the index is always valid\n",
  432. "        randIndex = random.randint(0, len(dataIndex))\n",
  433. "        # Record the index of the randomly drawn sample\n",
  434. "        centroidsIndex.append(dataIndex[randIndex])\n",
  435. "        # Remove the drawn sample so it cannot be picked twice\n",
  436. "        del dataIndex[randIndex]\n",
  437. "    # Look up the chosen samples by index\n",
  438. "    centroids = dataSet.iloc[centroidsIndex]\n",
  439. "    return asmatrix(centroids)"
  440. ]
  441. },
  442. {
  443. "cell_type": "code",
  444. "execution_count": 12,
  445. "metadata": {},
  446. "outputs": [],
  447. "source": [
  448. "\n",
  449. "def distEclud(vecA, vecB):\n",
  450. "    \"\"\"Euclidean distance between two vectors.\"\"\"\n",
  451. "    return sqrt(sum(power(vecA - vecB, 2)))  # same value as numpy.linalg.norm(vecA - vecB)\n",
  452. "\n",
  453. "\n",
  454. "def kMeans(dataSet, k):\n",
  455. "    # Total number of samples\n",
  456. "    m = shape(dataSet)[0]\n",
  457. "    # Per-sample assignment: [cluster index, squared distance] (m rows x 2 columns)\n",
  458. "    clusterAssment = asmatrix(zeros((m, 2)))\n",
  459. "\n",
  460. "    # step 1: initialize the centroids from randomly chosen samples\n",
  461. "    centroids = randChosenCent(dataSet, k)\n",
  462. "    print('最初的中心=', centroids)  # the label means \"initial centroids\"\n",
  463. "\n",
  464. "    # Flag: True if any sample changed cluster during the last pass, otherwise False\n",
  465. "    clusterChanged = True\n",
  466. "    # Iteration counter\n",
  467. "    iterTime = 0\n",
  468. "\n",
  469. "    # Iterate until no sample changes its cluster assignment\n",
  470. "    while clusterChanged:\n",
  471. "        clusterChanged = False\n",
  472. "\n",
  473. "        # step 2: assign each sample to the cluster of its nearest centroid\n",
  474. "        for i in range(m):\n",
  475. "            # Start with an infinite minimum distance\n",
  476. "            minDist = inf\n",
  477. "            # Index of the nearest centroid found so far\n",
  478. "            minIndex = -1\n",
  479. "            # Compare the sample against each of the k centroids\n",
  480. "            for j in range(k):\n",
  481. "                # Distance from sample i to centroid j\n",
  482. "                distJI = distEclud(centroids[j, :], dataSet.values[i, :])\n",
  483. "                # Keep this centroid if it is the closest so far\n",
  484. "                if distJI < minDist:\n",
  485. "                    # Update the minimum distance\n",
  486. "                    minDist = distJI\n",
  487. "                    # Remember the corresponding cluster index\n",
  488. "                    minIndex = j\n",
  489. "            # If the assignment differs from the previous pass, set clusterChanged to True\n",
  490. "            if clusterAssment[i, 0] != minIndex:\n",
  491. "                clusterChanged = True\n",
  492. "            clusterAssment[i, :] = minIndex, minDist ** 2  # record cluster index and squared distance on every pass\n",
  493. "\n",
  494. "        iterTime += 1\n",
  495. "        sse = sum(clusterAssment[:, 1])\n",
  496. "        print('the SSE of %d' % iterTime + 'th iteration is %f' % sse)\n",
  497. "\n",
  498. "        # step 3: update the centroids\n",
  499. "        for cent in range(k):  # recompute each centroid once all samples are assigned\n",
  500. "            # Select all samples currently assigned to this cluster\n",
  501. "            ptsInClust = dataSet.iloc[nonzero(clusterAssment[:, 0].A == cent)[0]]\n",
  502. "            # New centroid: axis=0 averages down each column (feature)\n",
  503. "            centroids[cent, :] = mean(ptsInClust, axis=0)\n",
  504. "    return centroids, clusterAssment\n"
  505. ]
  506. },
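The assignment and update steps of `kMeans` can also be written with NumPy broadcasting instead of explicit Python loops. The sketch below is an alternative formulation, not the notebook's code: `kmeans_step` and the toy array `X` are hypothetical names introduced for illustration, and the function assumes every cluster keeps at least one sample.

```python
import numpy as np


def kmeans_step(X, centroids):
    """One k-means iteration: assign samples, then recompute centroids.

    X has shape (m, d) and centroids has shape (k, d).
    Assumes no cluster ends up empty (an empty cluster would yield NaN).
    """
    # Squared distances between every sample and every centroid, shape (m, k)
    diff = X[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    # Step 2: index of the nearest centroid for each sample
    assignment = np.argmin(sq_dist, axis=1)
    # Sum of squared errors under this assignment
    sse = sq_dist[np.arange(X.shape[0]), assignment].sum()
    # Step 3: move each centroid to the mean of its members
    new_centroids = np.array([X[assignment == j].mean(axis=0)
                              for j in range(centroids.shape[0])])
    return assignment, new_centroids, sse


# Toy data: two well-separated groups (made up for this example)
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
init = X[[0, 2]]  # use two of the samples as the initial centroids
assignment, centroids, sse = kmeans_step(X, init)
print(assignment, centroids, sse)
```

Calling `kmeans_step` repeatedly until `assignment` stops changing reproduces the loop structure of `kMeans`, while working on plain arrays rather than `numpy.matrix` objects.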
  507. {
  508. "cell_type": "code",
  509. "execution_count": 13,
  510. "metadata": {},
  511. "outputs": [
  512. {
  513. "name": "stdout",
  514. "output_type": "stream",
  515. "text": [
  516. "最初的中心= [[6.2 2.8]\n",
  517. " [6.7 3.1]\n",
  518. " [5.1 3.8]]\n",
  519. "the SSE of 1th iteration is 54.890000\n",
  520. "the SSE of 2th iteration is 37.397339\n",
  521. "the SSE of 3th iteration is 37.268236\n",
  522. "the SSE of 4th iteration is 37.201302\n",
  523. "the SSE of 5th iteration is 37.155048\n",
  524. "the SSE of 6th iteration is 37.141172\n"
  525. ]
  526. }
  527. ],
  528. "source": [
  529. "# Run k-means clustering\n",
  530. "k = 3  # number of clusters, chosen by the user\n",
  531. "mycentroids, clusterAssment = kMeans(datamat, k)"
  532. ]
  533. },
  534. {
  535. "cell_type": "code",
  536. "execution_count": 14,
  537. "metadata": {},
  538. "outputs": [],
  539. "source": [
  540. "def datashow(dataSet, k, centroids, clusterAssment):  # show the clustering result in 2-D\n",
  541. "    from matplotlib import pyplot as plt\n",
  542. "    num, dim = shape(dataSet)  # number of samples num, number of dimensions dim\n",
  543. "\n",
  544. "    if dim != 2:\n",
  545. "        print('sorry,the dimension of your dataset is not 2!')\n",
  546. "        return 1\n",
  547. "    marksamples = ['or', 'ob', 'og', 'ok', '^r', '^b', '<g']  # marker styles for the samples\n",
  548. "    if k > len(marksamples):\n",
  549. "        print('sorry,your k is too large,please add length of the marksample!')\n",
  550. "        return 1\n",
  551. "    # Plot all samples\n",
  552. "    for i in range(num):\n",
  553. "        markindex = int(clusterAssment[i, 0])  # convert the matrix entry to an int cluster index\n",
  554. "        # The two features map to the x and y axes; marker style and size follow\n",
  555. "        plt.plot(dataSet.iat[i, 0], dataSet.iat[i, 1], marksamples[markindex], markersize=6)\n",
  556. "\n",
  557. "    # Plot the centroids\n",
  558. "    markcentroids = ['o', '*', '^']  # centroid marker styles (only three are defined, so k must be <= 3)\n",
  559. "    label = ['0', '1', '2']\n",
  560. "    c = ['yellow', 'pink', 'red']\n",
  561. "    for i in range(k):\n",
  562. "        plt.plot(centroids[i, 0], centroids[i, 1], markcentroids[i], markersize=15, label=label[i], c=c[i])\n",
  563. "    plt.legend(loc='upper left')\n",
  564. "    plt.xlabel('sepal length')\n",
  565. "    plt.ylabel('sepal width')\n",
  566. "\n",
  567. "    plt.title('k-means cluster result')  # title\n",
  568. "    plt.show()\n",
  569. "\n",
  570. "\n",
  571. "# Plot the ground-truth classes\n",
  572. "def trgartshow(dataSet, k, labels):\n",
  573. "    from matplotlib import pyplot as plt\n",
  574. "\n",
  575. "    num, dim = shape(dataSet)\n",
  576. "    label = ['0', '1', '2']\n",
  577. "    marksamples = ['ob', 'or', 'og', 'ok', '^r', '^b', '<g']\n",
  578. "    # Draw the scatter plot sample by sample, using one marker style per class\n",
  579. "    for i in range(num):\n",
  580. "        plt.plot(dataSet.iat[i, 0], dataSet.iat[i, 1], marksamples[int(labels.iat[i, 0])], markersize=6)\n",
  581. "    for i in range(0, num, 50):  # one labelled point per class for the legend (assumes 50 consecutive rows per class)\n",
  582. "        plt.plot(dataSet.iat[i, 0], dataSet.iat[i, 1], marksamples[int(labels.iat[i, 0])], markersize=6,\n",
  583. "                 label=label[int(labels.iat[i, 0])])\n",
  584. "    plt.legend(loc='upper left')\n",
  585. "\n",
  586. "    # Add axis labels and a title\n",
  587. "    plt.xlabel('sepal length')\n",
  588. "    plt.ylabel('sepal width')\n",
  589. "    plt.title('iris true result')  # title\n",
  590. "\n",
  591. "    # Show the figure\n",
  592. "    plt.show()\n",
  593. "    # label=labels.iat[i,0]"
  594. ]
  595. },
  596. {
  597. "cell_type": "code",
  598. "execution_count": 15,
  599. "metadata": {},
  600. "outputs": [
  601. {
  602. "data": {
  603. "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEWCAYAAACJ0YulAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvqOYd8AAAIABJREFUeJzt3XmcHHWd//HXOzMhyYSQuCQqMmQmEOXHKSThUjwg8UIOV3TFjcsi+IgmovID3f1xLKfhUJF4EXYEd4VkAUVdbhSCUZDLhEACQQUhx0SOECTkINfk8/ujaiY9PT1T1d3V1dXdn+fj0Y9MV1d/61M1nf5Ofa+PzAznnHMOYFC1A3DOOZcdXik455zr4ZWCc865Hl4pOOec6+GVgnPOuR5eKTjnnOvhlYIrm6RlkqZUO460SbpQ0pxqx1FNkuZL+kK143DJ8UrBuSqS1C7JJDVXO5ZySTpF0oPVjsOVxysF52pYnMqkHioclx6vFFyiJO0j6QVJn+3n9Qsl/VzSHEnrJC2R9C5JZ0t6RdJKSR/O2X+kpOskvShplaRvSmoKX9tL0v2S1kh6VdJcSaNy3rtM0tclLZa0VtLNkoaGr42WdIek1yW9JukBSQX/P0jaT9K94X4vSzqnwD4flNSZt62nWU3SoZIWSHojLOO74W6/D/99XdJ6SUeE+58q6RlJf5f0a0ltOeWapC9LehZ4tkAs3Xcfp0laAdwfbj9c0kPhOT8p6YM57zlF0vPh7+QFSVNzfl9zCpTdnHfMfYBrgCPC83i90LV02eeVgkuMpAnAr4GvmNmNA+x6HHAD8BZgUfieQcDuwMXAf+bs+9/ANmA8cDDwYaC7DVvAZcA7gH2APYAL8471T8BHgXHAgcAp4fazgE5gDPA24Bygz5ovkkYA9wH3hMcZD8wb4Nz68z3ge2a2C7AX8LNw+/vDf0eZ2c5m9rCkE8J4PhnG9wCQfz0/ARwG7DvAMT9AcF0+Iml34E7gm8A/AF8HfiFpjKThwPeBj5nZCOA9wBPFnJyZPQN8CXg4PI9RUe9x2eSVgkvK+4DbgJPN7I6IfR8ws1+b2Tbg5wRffJeb2VbgJqBd0ihJbwOOAc4wsw1m9gpwFXASgJk9Z2b3mtlmM1sNfJfgizDX983sb2b2GnA7cFC4fSuwG9BmZlvN7AErvBDYscBLZnalmW0ys3Vm9mhxl6bneOMljTaz9Wb2yAD7fgm4zMyeCa/RpcBBuXcL4euvmdmbA5RzYXjd3gQ+B9xlZneZ2XYzuxdYQHB9AbYD+0saZmYvmtnTJZyjqwNeKbikfAl4yMzmd2+QNDVsSlgv6e6cfV/O+flN4FUz68p5DrAz0AYMBl4MmzxeJ7iLeGtY/tsk3RQ2K70BzAFG58X1Us7PG8NyAb4NPAf8Jmw2+X/9nNcewF+jTj6G04B3AX+S9EdJxw6wbxvwvZxzfo3grmj3nH1Wxjhm7j5twKe7ywzLPRLYzcw2AJ8h+B2+KOlOSf8n/qm5euKVgkvKl4Cxkq7q3mBmc8OmhJ3N7GMllLkS2AyMNrNR4WMXM9svfP1SgiafA8Jmmc8RfHlGCv/iP8vM9gSOB86UNLmfGPaMUeQGoKX7SdjvMSbneM+a2WcJKrQrgFvCZptCdycrgS/mnPMoMxtmZg/lnkKMmHL3WQnckFfmcDO7PIzv12b2IYK7pz8BPy50XsDbYx7P1SivFFxS1hG03b9f0uVJFGhmLwK/Aa6UtIukQWHncncT0QhgPbA2bDP/RtyyJR0rabwkAWuBLoImlHx3ALtJOkPSEEkjJB1WYL+/AEMlfVzSYOA8YEjO8T4naYyZbQe6O2G3A6vDf3MrnmuAsyXtF753pKRPxz23fswBjpP0EUlNkoaGneOt4R3XCWEltZngmnZfiycIfqdjJY0Ezh7gGC8DrZJ2KjNWV0VeKbjEmNnrwIeAj0m6JKFiTwZ2ApY
CfwduIfhrFuAiYALBl/qdwC+LKPedBB3I64GHgavN7Lf5O5nZOoJzOo6gKepZ4KgC+60FZgDXAqsI/sLOHY30UeBpSesJOp1PMrM3zWwjMBP4Q9isc7iZ/YrgbuKmsFnsKaCUO63c+FYC3R3YqwnuHL5B8B0wCDgT+BtBU9UHgOnh++4FbgYWAwsJKsn+3A88Dbwk6dVy4nXVI0+y45xzrpvfKTjnnOvhlYJzzrkeXik455zrUfFKIRzpsEhSnw6qcGr9aklPhA9fbdE556oojYWyvgY8A+zSz+s3m9npcQsbPXq0tbe3JxGXc841jIULF75qZmOi9qtopSCpFfg4wZC7M5Mos729nQULFiRRlHPONQxJy+PsV+nmo1nAv1F4UlC3ExWsYnmLpD0K7SBpWrjC5ILVq1dXJFDnnHMVrBTCtV1eMbOFA+x2O9BuZgcC9wI/LbSTmXWY2SQzmzRmTOTdj3POuRJV8k7hvcDxkpYRrHx5tPJSF5rZGjPbHD69FphYwXicc85FqFifgpmdTbhOSpjM4+tm9rncfSTtFq5vA8GiZM+UcqytW7fS2dnJpk2byoi48oYOHUprayuDBw+udijOOVdQ6mn6JF0MLDCz24CvSjqeIInKa+xIgFKUzs5ORowYQXt7O8H6ZgMxgqVuHiNYw20EcChwBDEX2CyJmbFmzRo6OzsZN25cxY7jnHPlSKVSCNfYnx/+fH7O9p67iXJs2rQpRoWwFbgO+BbwSvh8K8Fy/YMJVjT+N4Jl75P/S14Su+66K95Rnqy5S+Zy7rxzWbF2BWNHjmXm5JlMPWBqtcNyrmbVTULvgSuE9QSLTD5OkGcl15bw8QJBhsb/Ae5iRy6WtGJ0xZq7ZC7Tbp/Gxq3B73T52uVMu30agFcMzpWoAZa52EpQIfyRvhVCvo0EzUrHhO9zWXbuvHN7KoRuG7du5Nx551YpIudqXwNUCtcR3CFsjtoxtJlg2fifFH2ke+65h7333pvx48dz+eWJ5JlxA1ixdkVR251z0eq8UjCCPoSoO4R8G8P3xc810dXVxZe//GXuvvtuli5dyo033sjSpUuLPK4rxtiRY4va7pyLVueVwsMEncqleDl8fzyPPfYY48ePZ88992SnnXbipJNO4tZbby3x2C6OmZNn0jK4pde2lsEtzJw8s0oROVf76rxSeIzS+wa2EfRDxLNq1Sr22GPHKh2tra2sWrWqxGO7OKYeMJWO4zpoG9mGEG0j2+g4rsM7mZ0rQ92MPipsHaVXClvC97ssm3rAVK8EnEtQnd8pjKD0OQc7he+PZ/fdd2flypU9zzs7O9l9991LPLZzzlVHnVcKh1J6pdAMHBJ770MOOYRnn32WF154gS1btnDTTTdx/PHHl3hs55yrjjpvPjqCYKbyCyW8923h++Npbm7mhz/8IR/5yEfo6uri1FNPZb/99ivhuM45Vz11XimIYOmKsyhuWGpL+L7iZiAfc8wxHHPMMUW9xznnsqTOm48gWMtoAjAk5v5DCFbwPrViETnnXFY1QKUwGLiboH+hJWLflnC/u6jEonjOOZd1DVApQLC43Tzgu8CewHCCOwKF/w4Pt3833C/5xfCcc64W1HmfQq7BwBeBacDDsHURPLMn7PM8DJ4AHE4l8yk451wtaKBKoZuA98BLe8LfO+Glg2GPt1c7KOecy4QGaT7KYwadLwc/r3o5eO5SN3fJXNpntTPookG0z2pn7pK51Q7JuYbXmJXC2vWwrSv4eWtX8LxMp556Km9961vZf//9yy6rEXQnyFm+djmG9STI8YrBuepqzEqh82XYvj34efv24G6hTKeccgr33HNP2eU0Ck+Q41w21X+fwlPPwpq1vbflp8VcsxZ+t6D3tl1Hwv7vjH2Y97///Sxbtqy0GBuQJ8hxLpvq/05hXCsM2al3RZDfh5D7fJCC/ce1phNfg/IEOc5lU/1XCsOHwSH7wehRMCjidAcNgl1HBfsPH5ZOfA3KE+Q4l03
1XykANDXBvnvBXq19m466ScHr++4V7O8qyhPkOJdN9d+nkGvnlqB5qKvAENRBgp2Hpx9TA/MEOc5lT2PcKXRbtzGv/yDn9M1g/YaSi/7sZz/LEUccwZ///GdaW1u57rrrygi0+nwOgXONqbHuFNaug+0W3BUMHgzj94DnVsKWrcH219fBO95aUtE33nhjwsFWT/ccgu4ho91zCAD/y965OtdgdwrhnUB3Z/Lot+zohM59vcH5HALnGldj3Sm0DIOxu8FuY3Zs6+6EfnE1vPp69WLLEJ9D4FzjaqxK4YABJqPtNqZ3ZdHAxo4cy/K1ywtud87Vt8ZqPsr18svwgQ/AqlXVjiRzfA6Bc42rcSuF73wHHnwQzj672pFkjs8hcK5xNVbzUbc33oDZs4PF8G65Bc49F/beu9pRZYrPIXCuMVX8TkFSk6RFku4o8NoQSTdLek7So5LaKx0PEFQI3fMVtmyBs84qu8iVK1dy1FFHse+++7Lffvvxve99r+wyXfl8voVzxUmj+ehrwDP9vHYa8HczGw9cBVxR8Wg2b4bLL4eN4ZDLri64/35YsGDg90Vobm7myiuvZOnSpTzyyCP86Ec/YunSpQkE7ErlORucK15FKwVJrcDHgWv72eUE4Kfhz7cAk6X+FidKyA03wLZtvbdt2gRf+UpZxe62225MmDABgBEjRrDPPvuwyjuxq8rnWzhXvErfKcwC/g3Y3s/ruwMrAcxsG7AW2DV/J0nTJC2QtGD16tWlR9PVBRdeCOvzMq2ZwZIlcN99pZedY9myZSxatIjDDjsskfJcaXy+hXPFq1ilIOlY4BUzW1huWWbWYWaTzGzSmDFlzCW49VZYu7bwaxs2wOmn78jIVqL169dz4oknMmvWLHbZZZeyynLl8ZwNzhWvkncK7wWOl7QMuAk4WtKcvH1WAXsASGoGRgJrKhKNGZx3Xt+7hFydnfCLX5R8iK1bt3LiiScydepUPvnJT5ZcjkuGz7dwrngVqxTM7GwzazWzduAk4H4z+1zebrcB/xr+/KlwnwLrWidg/nxYEdFssGEDnHEGbN1adPFmxmmnncY+++zDmWeeWVqMLlE+38K54qU+T0HSxcACM7sNuA64QdJzwGsElUdlnHde8KUfZe1auO46+NKXiir+D3/4AzfccAMHHHAABx10EACXXnopxxxzTCnRuoT4fAvnipNKpWBm84H54c/n52zfBHy64gEsWgRPPBFv3w0b4Jxz4OSToaUlev/QkUceSaVucmrRjDtn0LGwgy7roklNTJs4jas/fnW1w3LORWiMZS4uuCAYdhrX5s0wa1bl4qlzM+6cwewFs+myLgC6rIvZC2Yz484ZVY7MORel/iuFv/4V7r23uFFFGzfCpZfC3/9eubjqWMfCjqK2O+eyo24qhX6bbmbO7DtZLY6uLrjkkvKCytMozUvddwhxtzvnsqMuKoWhQ4eyZs2awl+6CxeWVils2hSMWEqImbFmzRqGDh2aWJlZ1aSmorY757KjLlZJbW1tpbOzk4KznW+6qbzCn+lv2abiDR06lNbW1sTKy6ppE6cxe8Hsgtudc9lWF5XC4MGDGTduXLXDcKHuUUY++si52qNaa+eeNGmSLShzRVPnnGs0khaa2aSo/eqiT8E551wyvFJoQFOun4IuUs9jyvVTqh1SyTyJjsu6uXOhvR0GDQr+nVvCRzSJMuLySqHBTLl+CvNemNdr27wX5tVkxeBJdFzWzZ0L06bB8uXBmpzLlwfPi/lST6KMYnifQoPRRf3nMLILauuz0D6rneVrl/fZ3jayjWVnLEs/IOfytLcHX+L52tpg2bL0ygDvU3ANwJPouKzrb2HmqAWbky6jGF4puJrlSXRc1o3t56PY3/ZKlVEMrxQazORxk4vanmWeRMdl3cyZfRdbbmkJtqdZRjG8Umgw9518X58KYPK4ydx3cjL5qdPkSXRc1k2dCh0dQfu/FPzb0RFsT7OMYnhHs3PONQDvaHb9SmJ
sf1QZPn/AudpUF2sfufi6x/Zv3LoRoGdsPxC72SWqjCSO4ZyrDm8+ajBJjO2PKsPnDziXPd585ApKYmx/VBk+f8C52uWVQoNJYmx/VBk+f8C52uWVQoNJYmx/VBk+f8C52uWVQoNJYmx/VBk+f8C52uUdzc451wC8o7kKsjI2PytxOFdJaeYYaCQ+TyEhWRmbn5U4nKuk7hwDG4OPeU+OAajc8g+NwpuPEpKVsflZicO5Skoqx0Aj8eajlGVlbH5W4nCuktLOMdBIvFJISFbG5mclDucqKe0cA43EK4WEZGVsflbicK6S0s4x0Ei8UkhIVsbmZyUO5yop7RwDjcQ7mp1zrgFUvaNZ0lBJj0l6UtLTki4qsM8pklZLeiJ8fKFS8TSSGXfOoPniZnSRaL64mRl3zijqdUhnroPPp3Aueyo5T2EzcLSZrZc0GHhQ0t1m9kjefjeb2ekVjKOhzLhzBrMXzO553mVdPc+v/vjVka9DOnMdfD6Fc9kU2XwkaQhwItBOTiViZhfHPojUAjwITDezR3O2nwJMKqZS8OajgTVf3EyXdfXZ3qQmtp2/LfJ1SGeug8+ncC5dSTYf3QqcAGwDNuQ84gTRJOkJ4BXg3twKIceJkhZLukXSHv2UM03SAkkLVq9eHefQDavQF37u9qjXIZ25Dj6fwrlsitN81GpmHy2lcDPrAg6SNAr4laT9zeypnF1uB240s82Svgj8FDi6QDkdQAcEdwqlxNIomtTU751AnNchmNNQ6K/4JOc6pHEM51zx4twpPCTpgHIOYmavA78FPpq3fY2ZbQ6fXgtMLOc4DqZNDNrl37oe5v8XvOON3tu7/+3vfZDOXAefT+FcNvV7pyBpCWDhPp+X9DxB57EAM7MDBypY0hhgq5m9LmkY8CHgirx9djOzF8OnxwPPlHwmDtjRWTzu0tkcuQIumwePzJzes737346FHXRZF01qYtrEaT3bYUdH77nzzmXF2hWMHTmWmZNnJtoBnMYxnHPF67ejWVLbQG80swLLUfV6/4EEzUFNBHckPzOziyVdDCwws9skXUZQGWwDXiPoiP7TQOV6R3MMb7wB73gHbNgAw4bBokWw997Vjso5V0VxO5r7vVPo/tKXdIOZ/Ute4TcA/1LwjTvevxg4uMD283N+Phs4OypIV6TZs6G7st+yBc46C+64o7oxOedqQpw+hf1yn0hqwtv+C0piMlaciWUD2ryZjZecv2Oh+a4uttx7D+TcXcWJs9xzKfs8siSBbC5xivCkMS4LBupTOBs4Bxgm6Y3uzcAWwpFAbockJmPFmVgWZc43PsIJW7b02ta8pYvnP3cse/7ppVhxlnsuSZxHZiSQzSVOEZ40xmVFnMlrl4XNPJmQ1T6FJCZjxZlYNqCuLla9pZnd1/V9af1g2Pmue2l/6guRcZZ7LmWfR5YkkM0lThGeNMZVWtl9CpImhD/+POfnHmb2eBnx1Z0kJmPFmVg2oFtvZZfNhV/aeStw+ums/Mzygo2GuXGWey5ln0eWJJDNJU4RnjTGZcVAfQpXho8fAY8SNBn9OPz5R5UPrbYkkdwmdwJZnO29mMF55zFiywD7dHbyxeWjC76UG2e551LWeWRNAtlc4hThSWNcVvRbKZjZUWZ2FPAiMMHMJpnZRIIRRavSCrBWJDEZK87Esn7Nnx/9Z+WGDXznri52GTSs1+b8OMs9l7LOI2sSyOYSpwhPGuMyw8wGfABPx9mW1mPixImWVXMWz7G2q9pMF8rarmqzOYvnFF3G9DumW9NFTcaFWNNFTTb9junx3vie95gF9wsDP4YPt0fO+3xknOWeS8nnkUVz5pi1tZlJwb9ziv+9xikigcM41y+C+WGR37FxOppvJFgAb064aSqws5l9tnJVVf+y2tFcVYsWwZFH7hi6EuUtb4HOzr5/mjrn6laSq6R+Hnga+Fr4WBpuc1lxwQWwaVPs3be9uYFvf3p3T26TMTNmQHNzkF6yuTl43ogxuOrydJy
17q9/hf33L6pSgGCI6h5nwuvDgv4Cz+NcXTNmBBPR802fDlenNLUjCzG4yol7pzDQ2kc/M7N/ylkYrxeLWBCvUrxSyHPqqXDDDbCtuPH/bzbB1YfA18N1az25TXU1N0NXgRG7TU1F/2prOgZXOWXPUyBoKgI4NpmQXEUsXFjS/9hhXXDUsh3PPblNdRX6Mh5oe73G4KpvoAXxupe0ngL83syeTSckV5Qnn+z1NGo2cn+ve3Kb6mpq6v+v9EaKwVVfnI7mscB/Snpe0s8lfUXSQZUOzJUmao6BJ7fJpmn9TOHob3u9xuCqL7JSMLMLzOxogtVSHwC+ASysdGCuNFMPmErHcR20jWxDiLaRbb06kaNed9Vx9dVBh273X+VNTel38GYhBld9ceYpnAe8F9gZWAQ8CDyQ07yUKu9ods654iU5T+GTwK7AfcAvgVurVSFUUhK5EKLKSCvHQBLn0lBqJJFB1ByCtE4j6jhp5Y6okV9b7Ykz7RnYBfgYMBP4C/BgnPdV4lGJZS7mLJ5jLTNbjAvpebTMbClqaYeoMqbfMb3Xa92PpJd/SOJcGsqcOWYtLb2XAmlpydwaE9OnF161ZHr48UnrNKKOEyeOJGKtkV9bppDgMhf7A+8DPgBMAlYSNB+dP+AbK6QSzUdJ5EKIKiOtHANJnEtDqZFEBlFzCNI6jajjpJU7okZ+bZmSxDyFbpcDvwe+D/zRzLaWG1zWJJELIaqMtHIMJHEuDaVGEhlEzSFI6zSijpNW7oga+bXVpDijj441s2+Z2UP1WCFAMrkQospIK8dAEufSUGokkUF/cwW6t6d1GlHHSSt3RI382mpSnI7mupfE2P2oMtLKMeDzEIpUI4kMouYQpHUaUcdJK3dEjfzaalOcjocsPSqVTyGJXAhRZaSVYyCJc2koNZLIYPp0s6amoFO1qWlHJ3O3tE4j9zjv3nerrbr7z2ZbthYVRxKx1sivLTNIqqM5a3yegnMZsvIleL4T9myFPd5e7WjcAMqepyDpdkm39fdINtz6kMZchynXT0EXqecx5fopSYXvMi6NcflTpgTzILofUwb6eJlB58vBz6teDp4XW4bLnIGWzv7AQG80s99VJKIIWb1TmLtkLtNun8bGrTuynxWbpyCqjCnXT2HeC/P6vG/yuMncd/J95Z+Ey6y5c4P+g9zkei0t0NEBUxNaoWTKFJjX9+PF5MlwX6GP1+vrYMmzsH17UFMd8E6mfGpEcWW41JSdTyGrsloppDHXQRep3/faBbX1e3TFSWNcvvr/eFHwa+Kp52DN6zuejx6F9h9fXBkuNYnNU5D0TuAyYF9gaPd2M9uzrAjrTBpzHVzjqvq4/KeehTVre2/Lr0XWrMXm9/6D7dYHR/KJ895Z4eBckuIMSf0vYDawDTgKuB6YU8mgalEacx1c46r6uPxxrTBkp94VQf6f/jnPN24Sy17aiXOubU0pQJeUOJXCMDObR9DUtNzMLgQ+Xtmwak8acx0mj5tc8H39bXf1I41x+ZP7+RhNngwMHwaH7AejRwX9BwNY/+Ygbv3DKPY7ZT+WLhs2YNkue+JUCpslDQKelXS6pH8kWEbb5UgiT0FUGfedfF+fCsA7mRvD1KlBp3JbW/DHeltbsp3MEHQE53959+ogbmqCffeCvVr774CQ+PG8Vv75kr3YuKmpbxku8+IsiHcI8AwwCrgEGAl8y8weqXx4fWW1o9m5hvHGelj8F+ja3ve1pkFw4N6wy/D043IDSqyj2cz+GBY4CPiqma2LGcBQgoX0hoTHucXMLsjbZwhBH8VEYA3wGTNbFqd851yVrNvYuz9h0KBgWCoE29dv8EqhhkU2H0maJGkJsBhYIulJSRNjlL0ZONrM3g0cBHxU0uF5+5wG/N3MxgNXAVcUF348cSaVZSUxTVQinpo5lyRmWkVllUnrOHGOESfWCoszaSzqVOKcxrIn18F2Y+Mm0bl6J373yrgdndDbLZi/UKa0EvWUq1biLErUOhg
ElcH7cp4fCSyOs4ZGzntagMeBw/K2/xo4Ivy5GXiVsEmrv0exax/FSTqTlcQ0UYl4auZcksiAEpVVJq3jxDlGnFgrbPLkwiFMnrxjn6hTiXvJn7/pSds674/2P//xnLUM3WYtLWY3zt1m9vRzZvP/aPbIk2WdS1qJespVK3F2I8EkO4vM7OC8bY+b2YSoCkdSE7AQGA/8yMz+Pe/1p4CPmlln+PyvYcXxan9lFtunEGdSWVYS00Ql4qmZc0liplVUVpm0jhPnGHFirbA4E8+iTiXuJf/Rl57llt+N4r/vGdOnDF5cDa++DgeUPjchrUQ95aqVOLslNqNZ0ixgGHAjYMBngE2EcxXM7PEYwYwCfgV8xcyeytkeq1KQNA2YBjB27NiJywtd5X4MumgQRt9zFGL7Bdtj75OGqBnLNXMugwYVnr4q7Wh7jhLnWy6N48Q5RtFTgZOXxOVK65JHiXOMNOKIUitx7jhmmQvi5Xg38C7gAuBCYB/gYOBK4DtxgjGz14HfAh/Ne2kVsEcYcDPByKY1Bd7fYWaTzGzSmDFj8l8eUJwJYVmZNBaViKdmziWJmVZRWWXSOk6cY8SJNQOiTiWtSx4lrUQ95aqVOIsVJ/PaUQM8ju7vfZLGhHcISBoGfAj4U95utwH/Gv78KeB+i7p1KVKcSWVZSUwTlYinZs4liZlWUVll0jpOnGPEibXCBpx4Foo6lbQueZS0EvWUq1biLFpUpwPwNuA64O7w+b7AaTHedyCwiKCj+ing/HD7xcDx4c9DgZ8DzwGPAXtGlVtKkp04SWeykpgmKhFPzZxLEhlQorLKpHWcOMeIE2uF5Xc253Yyd4s6lbQueZS0EvWUq1biNEu2o/lugvWPzjWzd4fNPIvM7IBEa6eYfPKac84VL8k+hdFm9jNgO4CZbQMKjFGobZkY2+96y8og8CTiSKCMJE615sbMl6GRzjVRUbcSwHxgV+Dx8PnhwO/i3IZU4lGJHM2ZGNvvesvKIPAk4kigjCRONUtj5iutkc41LhJsPpoA/ADYn6BvYAzwKTNbXLGaagCVaD7KxNh+11tWBoEnEUcCZSRxqlkaM19pjXSucSWaeS3sR9gbEPBnM9tafoilqUSlkImx/a63rAwCTyKOBMq/eztNAAAQ+0lEQVRI4lSzNGa+0hrpXONKrE9B0qcJcio8DXwCuDm8e6gbmRjb73rLyiDwJOJIoIwkTrUWx8yXqpHONWlxOpr/w8zWSToSmEwwPHV2ZcNKVybG9rvesjIIPIk4EigjiVOtyTHzJWqkc01cVKcDwfBTCPI0/3Putmo8KtHRbJaRsf2ut6wMAk8ijgTKSOJUszJmPg2NdK5xkGBH8x0Ey1F8CJgAvAk8ZsGS2KnzeQrOOVe8JOcp/BPBEtcfsWANo38AvlFmfM5FSyKPQVqD1ZOII2KfrJxqPY3/z8o0l0yJczuRpUelmo9cxiSRxyCtwepJxBGxT1ZOtZ7G/2dlmktaSKr5KGu8+ahBJJHHIK3B6knEEbFPVk61nsb/Z2WaS1oSnaeQJV4pNIgk8hgMUjD//lBgBLCOYNnFRwjSRiYliZwMEfskcYgk1NP4/6xMc0lLkn0KzqWvnDwGQwYB18CyZvgNcDlwUfjvbwi2cw2Q0BzMJHIyROyTxCGSUE/j/7MyzSVrvFJw2VRqHoPhwFNvBc6CsdtgZ4IF2geF/+5MsJ2zCKbdrC8/1iRyMkTsk8QhklBP4/+zMs0lc+J0PGTp4R3NDaTYPAZDBpk9+3YzG2LxPk5DzOx9Zral/FiTyMkQsU8Sh0hCPY3/z8o0lzTgHc2u8VxDcAewsYj3tADfBb5YkYicywrvU3DlycLg6qJiMOBbFFchEO7/rfD9ScRRhhTyKTgXKc7tRJYe3nyUgiwMri46hj+Y2XAr7WM1PHx/EnGUKIV8Cq6x4c1HrmRZGFxddAyzgH8HtpRwsCHAFcDXEoijRCnkU3CNzZuPXOlWrChueyZiWEfpQ0y
3hO9PIo4SRRwnC78S1xi8UnB9ZWFwddExjAAGl3iwncL3JxFHiVLIp+BcHF4puL6yMLi66BgOpfRKoRk4JKE4SpRCPgXnYonT8ZClh3c0pyQLg6uLimG7mY2z0j5We4bvTyKOMqSQT8E1Lryj2TUen6fgXH+8o9llXxID73PLGH8ZvNJKMJoojiHARODU4o87UBw+icDlqLmPRpzbiSw9vPmoTiQx8L5QGWOGmb28t5m12MAfpRYLlrhYl41zcXUpSx8NvPnIZVoSA+/7K2OvsfDcObD8dNh1W9CPPJhgxOo2YE0ztP2Q4A6h1M7pGHH4JIKGl6WPhudTcNmWxELzUWV051M4hN75FB4l2XwKWVo032VKlj4acSuF5jSCca6PsWML/wlVzMD7qDLGtsHDy+HhvNfb2uIfI4k4XMOqxY+GdzS76khi4H1UGRmZY+AaV01+NOJ0PGTp4R3NdSSJgfdRZWRkjoFrXFn5aOAdzc4557r5PAXnnHNFq1ilIGkPSb+VtFTS05L6rEss6YOS1kp6InycX6l46sXcJXNpn9XOoIsG0T6rnblLypzwVc3ZNFFxxIkzK+eShBkzoLk5GJrS3Bw8T1k9XU5XojhtTKU8gN2ACeHPI4C/APvm7fNB4I5iym3kPoU5i+dYy8wW40J6Hi0zW2zO4jInfFVjNk1UHHHizMq5JGH69N7n0f3IT8RcQfV0OV1fZK1PQdKtwA/N7N6cbR8Evm5mx8Ytp5H7FNpntbN8bd/xbW0j21h2xrKYhbRnYzZNVBxx4szKuSShuRm6uvpub2qCbdtSCaGeLqfrK1OT1yS1A78H9jezN3K2fxD4BdAJ/I2ggni6wPunAdMAxo4dO3F5oU9uAxh00SCsQC5hIbZfkNCEr7RETjyLEWdWziUJUv+vpfSHWz1dTtdXZjqaJe1M8MV/Rm6FEHocaDOzdwM/AP63UBlm1mFmk8xs0pgxYyobcIaNHVl4xkt/2wvvnJFsLVFxxIkzK+eShKam4rZXQD1dTle6ilYKkgYTVAhzzeyX+a+b2Rtmtj78+S5gsKTRlYypls2cPJOWwb1nwrQMbmHm5AQnfKUliYlnWTmXJEybVtz2Cqiny+nKEKfjoZQHIOB6YNYA+7ydHU1YhwIrup/392jkjmazoLO57ao204Wytqvaiutk7ikkI7Npkph4lpVzScL06WZNTUEPb1NTqp3M3erpcrreqHZHs6QjgQeAJUB3i+Q5wNiwMrpG0unAdIK1K98EzjSzhwYqt5E7mp1zrlRV71MwswfNTGZ2oJkdFD7uMrNrzOyacJ8fmtl+ZvZuMzs8qkJw1NdA8gyMy3fO9earpNaSuXODNuaNYbrJ5ct3tDlPnVq9uEoxYwbMnr3jeVfXjudXX12dmJxzvvZRTamngeQZGJfvXCOpevORq4AVK4rbnmWFKoSBtjvnUuGVQi2pp4HkGRiX75zryyuFWlJPA8kzMC7fOdeXVwq1ZOpU6OgI+hCk4N+OjtrrZIagM3n69B13Bk1NwXPvZHauqryj2TnnGoB3NCcskTwGaamVuQy1Emda/Hq4DPB5CjHMXTKXabdPY+PWYH7A8rXLmXZ70PY99YCMNd3UylyGWokzLX49XEZ481EMieQxSEutzGWolTjT4tfDVZg3HyVoxdrC8wD6215VtTKXoVbiTItfD5cRXinEkEgeg7TUylyGWokzLX49XEZ4pRBDInkM0lIrcxlqJc60+PVwGeGVQgxTD5hKx3EdtI1sQ4i2kW10HNeRvU5mqJ25DLUSZ1r8eriM8I5m55xrAN7R7FxSksj74HMQXI3weQrODSSJvA8+B8HVEG8+cm4gSeR98DkILgO8+ci5JCSR98HnILga4pWCcwNJIu+Dz0FwNcQrBecGkkTeB5+D4GqIVwrODSSJvA8+B8HVEO9ods65BuAdzc4554rmlYJzzrkeXik455zr4ZWCc865Hl4pOOec6+GVgnPOuR5eKTjnnOvhlYJ
zzrkeXik455zrUbFKQdIekn4raamkpyV9rcA+kvR9Sc9JWixpQqXiaSie0MU5V6JKJtnZBpxlZo9LGgEslHSvmS3N2edjwDvDx2HA7PBfVypP6OKcK0PF7hTM7EUzezz8eR3wDLB73m4nANdb4BFglKTdKhVTQzj33B0VQreNG4PtzjkXIZU+BUntwMHAo3kv7Q6szHneSd+KA0nTJC2QtGD16tWVCrM+eEIX51wZKl4pSNoZ+AVwhpm9UUoZZtZhZpPMbNKYMWOSDbDeeEIX51wZKlopSBpMUCHMNbNfFthlFbBHzvPWcJsrlSd0cc6VoZKjjwRcBzxjZt/tZ7fbgJPDUUiHA2vN7MVKxdQQPKGLc64MlRx99F7gX4Alkp4It50DjAUws2uAu4BjgOeAjcDnKxhP45g61SsB51xJKlYpmNmDgCL2MeDLlYrBOedccXxGs3POuR5eKTjnnOvhlYJzzrkeXik455zroaCvt3ZIWg0sr2IIo4FXq3j8YtRKrB5nsmolTqidWOshzjYzi5z9W3OVQrVJWmBmk6odRxy1EqvHmaxaiRNqJ9ZGitObj5xzzvXwSsE551wPrxSK11HtAIpQK7F6nMmqlTihdmJtmDi9T8E551wPv1NwzjnXwysF55xzPbxSGICkJkmLJN1R4LVTJK2W9ET4+EKVYlwmaUkYw4ICr0vS9yU9J2mxpAnViDOMJSrWD0pam3NNz69SnKMk3SLpT5KekXRE3uuZuKYx4szK9dw7J4YnJL0h6Yy8fap+TWPGmZVr+n8lPS3pKUk3Shqa9/oQSTeH1/PRMPtlLJVcOrsefI0gt/Qu/bx+s5mdnmI8/TnKzPqbsPIx4J3h4zBgdvhvtQwUK8ADZnZsatEU9j3gHjP7lKSdgLysRZm5plFxQgaup5n9GTgIgj+0CBJp/Spvt6pf05hxQpWvqaTdga8C+5rZm5J+BpwE/HfObqcBfzez8ZJOAq4APhOnfL9T6IekVuDjwLXVjqVMJwDXW+ARYJSk3aodVFZJGgm8nyBBFGa2xcxez9ut6tc0ZpxZNBn4q5nlr0pQ9Wuap784s6IZGCapmeCPgb/lvX4C8NPw51uAyWHis0heKfRvFvBvwPYB9jkxvNW9RdIeA+xXSQb8RtJCSdMKvL47sDLneWe4rRqiYgU4QtKTku6WtF+awYXGAauB/wqbDq+VNDxvnyxc0zhxQvWvZ76TgBsLbM/CNc3VX5xQ5WtqZquA7wArgBcJMlb+Jm+3nutpZtuAtcCuccr3SqEASccCr5jZwgF2ux1oN7MDgXvZUSun7Ugzm0Bw+/1lSe+vUhxxRMX6OMH6LO8GfgD8b9oBEvwFNgGYbWYHAxuA/1eFOKLEiTML17NH2MR1PPDzasYRJSLOql9TSW8huBMYB7wDGC7pc0mV75VCYe8Fjpe0DLgJOFrSnNwdzGyNmW0On14LTEw3xJ44VoX/vkLQ/nlo3i6rgNy7mNZwW+qiYjWzN8xsffjzXcBgSaNTDrMT6DSzR8PntxB8+ebKwjWNjDMj1zPXx4DHzezlAq9l4Zp26zfOjFzTKcALZrbazLYCvwTek7dPz/UMm5hGAmviFO6VQgFmdraZtZpZO8Ft5P1m1qsmzmvvPJ6gQzpVkoZLGtH9M/Bh4Km83W4DTg5HdxxOcKv5YsqhxopV0tu72z0lHUrw+Yz1QU6Kmb0ErJS0d7hpMrA0b7eqX9M4cWbheub5LP03yVT9muboN86MXNMVwOGSWsJYJtP3++c24F/Dnz9F8B0Wa6ayjz4qgqSLgQVmdhvwVUnHA9uA14BTqhDS24BfhZ/RZuB/zOweSV8CMLNrgLuAY4DngI3A56sQZ9xYPwVMl7QNeBM4Ke4HOWFfAeaGzQjPA5/P6DWNijMr17P7D4EPAV/M2Za5axojzqpfUzN7VNItBE1Z24BFQEfe99N1wA2SniP4fjopbvm+zIVzzrke3nzknHOuh1cKzjnnenil4Jxzrod
XCs4553p4peCcc66HVwrOFSlcKbPQyrkFtydwvE9I2jfn+XxJmU8i72qTVwrOZd8ngH0j93IuAV4puLoTzp6+M1y07ClJnwm3T5T0u3BBvl93z0oP//L+noL18Z8KZ6oi6VBJD4cLzj2UM3s4bgw/kfRY+P4Twu2nSPqlpHskPSvpWznvOU3SX8L3/FjSDyW9h2DG/LfD+PYKd/90uN9fJL0voUvnnM9odnXpo8DfzOzjECwzLWkwwQJmJ5jZ6rCimAmcGr6nxcwOChfp+wmwP/An4H1mtk3SFOBS4MSYMZxLsLTAqZJGAY9Jui987SDgYGAz8GdJPwC6gP8gWL9oHXA/8KSZPSTpNuAOM7slPB+AZjM7VNIxwAUE6+E4VzavFFw9WgJcKekKgi/TByTtT/BFf2/4pdpEsOxwtxsBzOz3knYJv8hHAD+V9E6CZb8HFxHDhwkWVfx6+HwoMDb8eZ6ZrQWQtBRoA0YDvzOz18LtPwfeNUD5vwz/XQi0FxGXcwPySsHVHTP7i4J0jscA35Q0j2BV1qfN7Ij+3lbg+SXAb83sHxWkM5xfRBgCTgyzee3YKB1GcIfQrYvS/h92l1Hq+50ryPsUXN2R9A5go5nNAb5N0CTzZ2CMwjzGkgard4KU7n6HIwlW6FxLsNxw9/LNpxQZxq+Br+SsqHlwxP5/BD4g6S0KljrObaZaR3DX4lzFeaXg6tEBBG34TxC0t3/TzLYQrHB5haQngSfovQb9JkmLgGsI8tsCfAu4LNxe7F/jlxA0Ny2W9HT4vF9hrolLgceAPwDLCLJlQZDT4xthh/VehUtwLhm+SqpreJLmA183swVVjmNnM1sf3in8CviJmRVKHO9cxfidgnPZcWF4d/MU8AJVTp/pGpPfKTjnnOvhdwrOOed6eKXgnHOuh1cKzjnnenil4JxzrodXCs4553r8f9DTYyw3KDdCAAAAAElFTkSuQmCC\n",
  604. "text/plain": [
  605. "<Figure size 432x288 with 1 Axes>"
  606. ]
  607. },
  608. "metadata": {
  609. "needs_background": "light"
  610. },
  611. "output_type": "display_data"
  612. },
  613. {
  614. "data": {
  615. "image/png": "\n",
  616. "text/plain": [
  617. "<Figure size 432x288 with 1 Axes>"
  618. ]
  619. },
  620. "metadata": {
  621. "needs_background": "light"
  622. },
  623. "output_type": "display_data"
  624. }
  625. ],
  626. "source": [
  627. "# Plot the results\n",
  628. "datashow(datamat, k, mycentroids, clusterAssment)\n",
  629. "trgartshow(datamat, 3, labels)"
  630. ]
  631. },
  632. {
  633. "cell_type": "markdown",
  634. "metadata": {},
  635. "source": [
  636. "## How to use sklearn to do the classification\n"
  637. ]
  638. },
  639. {
  640. "cell_type": "code",
  641. "execution_count": 18,
  642. "metadata": {},
  643. "outputs": [
  644. {
  645. "data": {
  646. "text/plain": [
  647. "<Figure size 432x288 with 0 Axes>"
  648. ]
  649. },
  650. "metadata": {},
  651. "output_type": "display_data"
  652. },
  653. {
  654. "data": {
  655. "image/png": "\n",
  656. "text/plain": [
  657. "<Figure size 288x288 with 1 Axes>"
  658. ]
  659. },
  660. "metadata": {
  661. "needs_background": "light"
  662. },
  663. "output_type": "display_data"
  664. }
  665. ],
  666. "source": [
  667. "from sklearn.datasets import load_digits\n",
  668. "import matplotlib.pyplot as plt \n",
  669. "from sklearn.cluster import KMeans\n",
  670. "\n",
  671. "# load the digits dataset\n",
  672. "digits, dig_label = load_digits(return_X_y=True)\n",
  673. "\n",
  674. "# draw one digit\n",
  675. "plt.gray() \n",
  676. "plt.matshow(digits[0].reshape([8, 8])) \n",
  677. "plt.show() \n",
  678. "\n",
  679. "# compute the number of train/test samples\n",
  680. "N = len(digits)\n",
  681. "N_train = int(N*0.8)\n",
  682. "N_test = N - N_train\n",
  683. "\n",
  684. "# split train/test data\n",
  685. "x_train = digits[:N_train, :]\n",
  686. "y_train = dig_label[:N_train]\n",
  687. "x_test = digits[N_train:, :]\n",
  688. "y_test = dig_label[N_train:]\n",
  689. "\n"
  690. ]
  691. },
  692. {
  693. "cell_type": "code",
  694. "execution_count": 19,
  695. "metadata": {},
  696. "outputs": [
  697. {
  698. "data": {
  699. "image/png": "\n",
  700. "text/plain": [
  701. "<Figure size 432x288 with 10 Axes>"
  702. ]
  703. },
  704. "metadata": {
  705. "needs_background": "light"
  706. },
  707. "output_type": "display_data"
  708. }
  709. ],
  710. "source": [
  711. "# run k-means with 10 clusters (one per digit class)\n",
  712. "kmeans = KMeans(n_clusters=10, random_state=0).fit(x_train)\n",
  713. "\n",
  714. "# kmeans.labels_ - output label\n",
  715. "# kmeans.cluster_centers_ - cluster centers\n",
  716. "\n",
  717. "# draw cluster centers\n",
  718. "fig, axes = plt.subplots(nrows=1, ncols=10)\n",
  719. "for i in range(10):\n",
  720. "    img = kmeans.cluster_centers_[i].reshape(8, 8)\n",
  721. "    axes[i].imshow(img)"
  722. ]
  723. },
  724. {
  725. "cell_type": "markdown",
  726. "metadata": {},
  727. "source": [
  728. "## Exercise - How to calculate the accuracy?\n",
  729. "\n",
  730. "1. How to match cluster labels to ground-truth labels\n",
  731. "2. How to handle the ambiguity of some digits"
  732. ]
  733. },
  734. {
  735. "cell_type": "markdown",
  736. "metadata": {},
  737. "source": [
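  738. "## Evaluating clustering performance\n",
  739. "\n",
  740. "Method 1: If the data used for evaluation comes with ground-truth class labels, use the Adjusted Rand Index (ARI). ARI plays a role similar to accuracy in a classification problem, while also accounting for the fact that cluster IDs cannot be matched one-to-one with class labels.\n",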
  741. "\n"
  742. ]
  743. },
  744. {
  745. "cell_type": "code",
  746. "execution_count": 20,
  747. "metadata": {},
  748. "outputs": [
  749. {
  750. "name": "stdout",
  751. "output_type": "stream",
  752. "text": [
  753. "ari_train = 0.687021\n"
  754. ]
  755. }
  756. ],
  757. "source": [
  758. "from sklearn.metrics import adjusted_rand_score\n",
  759. "\n",
  760. "ari_train = adjusted_rand_score(y_train, kmeans.labels_)\n",
  761. "print(\"ari_train = %f\" % ari_train)"
  762. ]
  763. },
  764. {
  765. "cell_type": "markdown",
  766. "metadata": {},
  767. "source": [
  768. "Given the contingency table:\n",
  769. "![ARI_ct](images/ARI_ct.png)\n",
  770. "\n",
  771. "the adjusted index is:\n",
  772. "![ARI_define](images/ARI_define.png)\n",
  773. "\n",
  774. "* [ARI reference](https://davetang.org/muse/2017/09/21/adjusted-rand-index/)"
  775. ]
  776. },
  777. {
  778. "cell_type": "markdown",
  779. "metadata": {},
  780. "source": [
  781. "\n",
  782. "\n",
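  783. "Method 2: If the data used for evaluation has no class labels, use the Silhouette Coefficient to measure the quality of the clustering result. The silhouette coefficient accounts for both the cohesion and the separation of the clusters. It ranges over [-1, 1], and a larger value indicates a better clustering. \n",
  784. "\n",
  785. "The silhouette coefficient is computed as follows: \n",
  786. "1. For the i-th sample $x_i$ of the clustered data, compute the mean distance from $x_i$ to all other samples in its own cluster, denoted $a_i$; this quantifies cohesion within the cluster \n",
  787. "2. Pick a cluster $b$ that does not contain $x_i$ and compute the mean distance from $x_i$ to all samples in $b$; repeat over all other clusters and take the smallest of these mean distances, denoted $b_i$; this quantifies separation between clusters \n",
  788. "3. For sample $x_i$, the silhouette coefficient is $sc_i = \\frac{b_i - a_i}{\\max(a_i, b_i)}$ \n",
  789. "4. Finally, average $sc_i$ over all samples in $\\mathbf{X}$ to obtain the overall silhouette coefficient of the current clustering."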
  790. ]
  791. },
  792. {
  793. "cell_type": "code",
  794. "execution_count": 21,
  795. "metadata": {},
  796. "outputs": [
  797. {
  798. "data": {
  799. "image/png": "\n",
  800. "text/plain": [
  801. "<Figure size 720x720 with 6 Axes>"
  802. ]
  803. },
  804. "metadata": {
  805. "needs_background": "light"
  806. },
  807. "output_type": "display_data"
  808. },
  809. {
  810. "data": {
  811. "image/png": "\n",
  812. "text/plain": [
  813. "<Figure size 720x720 with 1 Axes>"
  814. ]
  815. },
  816. "metadata": {
  817. "needs_background": "light"
  818. },
  819. "output_type": "display_data"
  820. }
  821. ],
  822. "source": [
  823. "import numpy as np\n",
  824. "from sklearn.cluster import KMeans\n",
  825. "from sklearn.metrics import silhouette_score\n",
  826. "import matplotlib.pyplot as plt\n",
  827. "\n",
  828. "plt.rcParams['figure.figsize']=(10,10)\n",
  829. "plt.subplot(3,2,1)\n",
  830. "\n",
  831. "x1=np.array([1,2,3,1,5,6,5,5,6,7,8,9,7,9]) # initialize the raw data\n",
  832. "x2=np.array([1,3,2,2,8,6,7,6,7,1,2,1,1,3])\n",
  833. "X=np.array(list(zip(x1,x2))).reshape(len(x1),2)\n",
  834. "\n",
  835. "plt.xlim([0,10])\n",
  836. "plt.ylim([0,10])\n",
  837. "plt.title('Instances')\n",
  838. "plt.scatter(x1,x2)\n",
  839. "\n",
  840. "colors=['b','g','r','c','m','y','k','b']\n",
  841. "markers=['o','s','D','v','^','p','*','+']\n",
  842. "\n",
  843. "clusters=[2,3,4,5,8]\n",
  844. "subplot_counter=1\n",
  845. "sc_scores=[]\n",
  846. "for t in clusters:\n",
  847. "    subplot_counter += 1\n",
  848. "    plt.subplot(3,2,subplot_counter)\n",
  849. "    kmeans_model=KMeans(n_clusters=t).fit(X) # fit the KMeans model\n",
  850. "\n",
  851. "    for i,l in enumerate(kmeans_model.labels_):\n",
  852. "        plt.plot(x1[i],x2[i],color=colors[l],marker=markers[l],ls='None')\n",
  853. "\n",
  854. "    plt.xlim([0,10])\n",
  855. "    plt.ylim([0,10])\n",
  856. "\n",
  857. "    sc_score=silhouette_score(X,kmeans_model.labels_,metric='euclidean') # compute the silhouette coefficient\n",
  858. "    sc_scores.append(sc_score)\n",
  859. "\n",
  860. "    plt.title('k=%s,silhouette coefficient=%0.03f'%(t,sc_score))\n",
  861. "\n",
  862. "plt.figure()\n",
  863. "plt.plot(clusters,sc_scores,'*-') # plot the silhouette coefficient against the number of clusters\n",
  864. "plt.xlabel('Number of Clusters')\n",
  865. "plt.ylabel('Silhouette Coefficient Score')\n",
  866. "\n",
  867. "plt.show() "
  868. ]
  869. },
  870. {
  871. "cell_type": "markdown",
  872. "metadata": {},
  873. "source": [
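  874. "## How to determine 'k'?\n",
  875. "\n",
  876. "The elbow method gives a rough estimate of a reasonable number of clusters. K-means ultimately aims for *the sum of squared distances from every data point to its cluster center to stabilize, so the best number of clusters can be found by watching how this value changes with K. Ideally, as the curve keeps falling and then flattens, there is a point where the slope changes sharply (the elbow); from the K at that point onward, adding more cluster centers no longer changes the clustering structure much*.\n",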
  877. "\n"
  878. ]
  879. },
  880. {
  881. "cell_type": "code",
  882. "execution_count": 22,
  883. "metadata": {},
  884. "outputs": [
  885. {
  886. "data": {
  887. "image/png": "\n",
  888. "text/plain": [
  889. "<Figure size 720x720 with 1 Axes>"
  890. ]
  891. },
  892. "metadata": {
  893. "needs_background": "light"
  894. },
  895. "output_type": "display_data"
  896. }
  897. ],
  898. "source": [
  899. "import numpy as np\n",
  900. "from sklearn.cluster import KMeans\n",
  901. "from scipy.spatial.distance import cdist\n",
  902. "import matplotlib.pyplot as plt\n",
  903. "\n",
  904. "cluster1=np.random.uniform(0.5,1.5,(2,10))\n",
  905. "cluster2=np.random.uniform(5.5,6.5,(2,10))\n",
  906. "cluster3=np.random.uniform(3,4,(2,10))\n",
  907. "\n",
  908. "X=np.hstack((cluster1,cluster2,cluster3)).T\n",
  909. "plt.scatter(X[:,0],X[:,1])\n",
  910. "plt.xlabel('x1')\n",
  911. "plt.ylabel('x2')\n",
  912. "plt.show()"
  913. ]
  914. },
  915. {
  916. "cell_type": "code",
  917. "execution_count": 23,
  918. "metadata": {},
  919. "outputs": [
  920. {
  921. "data": {
  922. "image/png": "\n",
  923. "text/plain": [
  924. "<Figure size 720x720 with 1 Axes>"
  925. ]
  926. },
  927. "metadata": {
  928. "needs_background": "light"
  929. },
  930. "output_type": "display_data"
  931. }
  932. ],
  933. "source": [
  934. "K=range(1,10)\n",
  935. "meandistortions=[]\n",
  936. "\n",
  937. "for k in K:\n",
  938. "    kmeans=KMeans(n_clusters=k)\n",
  939. "    kmeans.fit(X)\n",
  940. "    meandistortions.append(sum(np.min(cdist(X,kmeans.cluster_centers_,'euclidean'),axis=1))/X.shape[0])\n",
  941. "\n",
  942. "plt.plot(K,meandistortions,'bx-')\n",
  943. "plt.xlabel('k')\n",
  944. "plt.ylabel('Average Dispersion')\n",
  945. "plt.title('Selecting k with the Elbow Method')\n",
  946. "plt.show()"
  947. ]
  948. },
  949. {
  950. "cell_type": "markdown",
  951. "metadata": {},
  952. "source": [
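  953. "As the figure above shows, going from K=1 to K=2 and then to K=3, each change in K substantially alters the overall clustering structure and the average distance drops sharply. This means these K values still leave the algorithm a lot of room to improve, so they do not reflect the true number of clusters. Once K grows beyond 3, the average distance falls much more slowly, which means increasing K further brings little additional benefit and suggests that K=3 is the best number of clusters for this data."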
  954. ]
  955. }
  956. ],
  957. "metadata": {
  958. "jupytext_formats": "ipynb,py",
  959. "kernelspec": {
  960. "display_name": "Python 3",
  961. "language": "python",
  962. "name": "python3"
  963. },
  964. "language_info": {
  965. "codemirror_mode": {
  966. "name": "ipython",
  967. "version": 3
  968. },
  969. "file_extension": ".py",
  970. "mimetype": "text/x-python",
  971. "name": "python",
  972. "nbconvert_exporter": "python",
  973. "pygments_lexer": "ipython3",
  974. "version": "3.5.2"
  975. }
  976. },
  977. "nbformat": 4,
  978. "nbformat_minor": 2
  979. }
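The notebook's accuracy exercise asks how cluster IDs can be matched to ground-truth labels. One simple possibility, sketched here as an assumption rather than as the notebook's intended answer, is a majority vote: name each cluster after the most frequent true label among its members, then count how many samples agree with their cluster's name. The function name `majority_vote_accuracy` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def majority_vote_accuracy(y_true, cluster_ids):
    """Map each cluster to its most common true label, then score accuracy."""
    # Collect the true labels of the samples that fell into each cluster.
    members = defaultdict(list)
    for label, cluster in zip(y_true, cluster_ids):
        members[cluster].append(label)

    # Name each cluster after the label that occurs most often inside it.
    mapping = {c: Counter(labels).most_common(1)[0][0]
               for c, labels in members.items()}

    # A sample counts as correct when its cluster's name equals its true label.
    correct = sum(mapping[c] == t for t, c in zip(y_true, cluster_ids))
    return correct / len(y_true), mapping


# Toy data (made up for illustration): 9 samples, 3 classes, 3 clusters.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
cluster_ids = [2, 2, 2, 0, 0, 1, 1, 1, 1]
accuracy, mapping = majority_vote_accuracy(y_true, cluster_ids)
print(mapping)
print("accuracy = %.3f" % accuracy)
```

With the notebook's variables this could be called as `majority_vote_accuracy(y_train, kmeans.labels_)` right after the digits clustering cell. Under a majority vote, two clusters may receive the same digit, which is one way the ambiguity of similar-looking digits shows up. If a strict one-to-one matching is wanted, the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment` applied to the contingency table) is a common alternative.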

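Machine learning is increasingly applied in fields such as aircraft and robotics, with the goal of using computers to achieve human-like intelligence and thereby make equipment intelligent and unmanned. This course aims to guide students to master the basic knowledge, typical methods, and techniques of machine learning, to spark their interest in the subject through concrete application cases, and to encourage them to analyze and solve the problems and challenges facing aircraft and robots from the perspective of artificial intelligence. The course covers Python programming basics, machine learning models, and the fundamentals and implementation of unsupervised learning, supervised learning, and deep learning, as well as how to use machine learning to solve practical problems, so that students comprehensively improve their overall abilities.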
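The four silhouette steps listed in the notebook can also be followed by hand, which may help when checking what `silhouette_score` reports. The sketch below is a plain-Python illustration under stated assumptions: Euclidean distance, at least two clusters, and a score of 0 for a sample that is alone in its cluster (a common convention). The names `euclidean` and `mean_silhouette` and the four toy points are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))


def mean_silhouette(points, labels):
    """Average silhouette coefficient over all samples."""
    # Group sample indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Assumed convention: a sample alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # Step 1: a_i, mean distance to the other members of the same cluster.
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # Step 2: b_i, smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in idxs) / len(idxs)
            for other, idxs in clusters.items()
            if other != lab
        )
        # Step 3: per-sample coefficient, guarding against a zero denominator.
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    # Step 4: average over all samples.
    return sum(scores) / len(scores)


# Toy data (made up for illustration): two tight, well-separated pairs.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
score = mean_silhouette(points, labels)
print("silhouette = %.4f" % score)
```

Because the two pairs are far apart compared with their internal spacing, the score comes out close to 1. The double loop is quadratic in the number of samples, so for anything beyond small examples the library routine is the better choice.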