
k-means.ipynb 195 kB


  1. {
  2. "cells": [
  3. {
  4. "cell_type": "markdown",
  5. "metadata": {},
  6. "source": [
  7. "# k-means"
  8. ]
  9. },
  10. {
  11. "cell_type": "markdown",
  12. "metadata": {},
  13. "source": [
  14. "## Theory\n",
  15. "\n",
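  16. "K-Means is arguably the best-known clustering method, thanks to its speed and good scalability. The algorithm repeatedly moves the cluster centers, also called centroids, to the mean position of their members and then reassigns the samples to clusters.\n",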
  17. "\n",
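  18. "K is a hyperparameter that sets the number of clusters. K-Means assigns samples to clusters automatically, but it cannot decide how many clusters there should be.\n",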
  19. "\n",
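  20. "K must be a positive integer smaller than the number of samples in the training set. Sometimes the number of clusters is dictated by the problem. For example, a shoe factory with three new styles wants to know the potential customers for each style, so it surveys its customers and looks for three clusters in the data. In other problems the number of clusters is not given, and the optimal number is uncertain.\n",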
  21. "\n",
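  22. "The parameters of K-Means are the centroid positions and the assignment of observations to clusters. As with generalized linear models and decision trees, the optimal parameters minimize a cost function. The K-Means cost function is:\n",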
  23. "$$\n",
  24. "J = \\sum_{k=1}^{K} \\sum_{i \\in C_k} \\| x_i - u_k \\|^2\n",
  25. "$$\n",
  26. "\n",
  27. "$u_k$ is the centroid of the $k$-th cluster, defined as:\n",
  28. "$$\n",
  29. "u_k = \\frac{1}{|C_k|} \\sum_{i \\in C_k} x_i\n",
  30. "$$\n",
  31. "\n",
  32. "\n",
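  33. "The cost function is the sum of the distortions of all clusters. The distortion of a cluster is the sum of squared distances between its centroid and its members. The more compact a cluster is, the smaller its distortion; the more scattered its members, the larger its distortion.\n",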
  34. "\n",
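  "Written out per cluster (the symbol $J_k$ is introduced here only for illustration and is not used elsewhere in this notebook), the distortion of cluster $k$ and the total cost are:\n",
  "$$\n",
  "J_k = \\sum_{i \\in C_k} \\| x_i - u_k \\|^2, \\qquad J = \\sum_{k=1}^{K} J_k\n",
  "$$\n",
  "\n",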
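  35. "Minimizing the cost function is an iterative process of reassigning observations to clusters and moving the centroids:\n",
  36. "1. First, the centroids are initialized at random positions. In practice, each centroid is placed at a randomly chosen observation.\n",
  37. "2. In each iteration, K-Means assigns every observation to its nearest centroid and then moves each centroid to the mean position of all members of its cluster.\n",
  38. "3. The algorithm stops when the maximum number of iterations is reached or the change between two iterations falls below a set threshold; otherwise step 2 is repeated.\n",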
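  "\n",
  "As a sketch of step 2 in symbols (the iteration index $t$ is notation assumed here and is not used elsewhere in this notebook, and ties are assumed to be broken arbitrarily), the assignment and update steps are:\n",
  "$$\n",
  "C_k^{(t)} = \\{ i : \\| x_i - u_k^{(t)} \\|^2 \\le \\| x_i - u_j^{(t)} \\|^2 \\ \\text{for all } j \\}, \\qquad u_k^{(t+1)} = \\frac{1}{|C_k^{(t)}|} \\sum_{i \\in C_k^{(t)}} x_i\n",
  "$$\n",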
  39. "\n"
  40. ]
  41. },
  42. {
  43. "cell_type": "code",
  44. "execution_count": 3,
  45. "metadata": {},
  46. "outputs": [
  47. {
  48. "data": {
  49. "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAD8CAYAAABXe05zAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAADgNJREFUeJzt3U+I3Pd5x/HPZ1cZZaSEJOCwpZKpdAgpIlCcFcFT0zB0ekhIqC8tOOAUsoe9JI6TpgQ7UHLUJYT4kBaMPbl4SKBKDiE1ccp251BmENEfQyIpAeM6thybOAcnWRd+U2mfHrTbUY2q/cman77zzL5fMKBd764fnp197+i3O/o6IgQAyGOp9AAAgNtDuAEgGcINAMkQbgBIhnADQDKEGwCSIdwAkAzhBoBkCDcAJHOgiQ96zz33xLFjx5r40LW99dZbOnz4cNEZ5gW7mGIXU+xiah52ce7cud9GxAfrvG0j4T527JjOnj3bxIeubTgcqtvtFp1hXrCLKXYxxS6m5mEXtn9V9225VAIAyRBuAEiGcANAMoQbAJIh3ACQDOEGgGQINwAkQ7gBIBnCDQDJEG4ASIZwA0AyhBsAkiHcAJAM4QaAZAg3ACRDuAEgmVrhtv1l2xdt/9z2d22/u+nBAAA3t2e4bR+R9EVJJyPiI5KWJT3U9GAAgJure6nkgKS27QOSDkn6dXMjAWjaeDzWYDDQeDwuPQregT3DHRGvSvqGpJclvSbpdxHxk6YHA9CM8XisXq+nfr+vXq9HvBPa87Bg2x+Q9KCk45LelPQvth+OiGfe9nbrktYlaWVlRcPhcPbT3oatra3iM8wLdjHFLqTBYKCqqrS9va2qqtTv91VVVemxikp3v4iIW94k/a2kp294+e8k/dOt3md1dTVK29zcLD3C3GAXU+wiYjQaRbvdjqWlpWi32zEajUqPVNw83C8knY09erx7q3ON+2VJ99s+ZNuSepIuN/R9BEDDOp2ONjY2tLa2po2NDXU6ndIj4TbteakkIs7YPi3pvKSrki5IerLpwQA0p9PpqKoqop3UnuGWpIj4uqSvNzwLAKAGnjkJAMkQbgBIhnADQDKEGwCSIdwAkAzhBoBkCDcAJEO4ASAZwg0AyRBuAEiGcANAMoQbAJIh3ACQDOEGgGQINwAkQ7jRuPF4rFOnTnEordjFjeZlFxlPvK91kALwTu2eKD6ZTNRqtfb1UVnsYmpedrE7R1VVGgwGaT4nPOJGo4bDoSaTia5du6bJZJLrJO0ZYxdT87KL3Tm2t7dTfU4INxrV7XbVarW0vLysVqulbrdbeqRi2MXUvOxid46lpaVUnxMulaBRuyeKD4dDdbvdFH8NbQq7mJqXXezO0e/3tba2luZzQrjRuE6nk+YLomnsYmpedpHxxHsulQBAMoQbAJIh3ACQDOEGgGQINwAkQ7gBIBnCDQDJEG4ASIZwA0AyhBsAkiHcAJAM4QaAZAg3ACRDuAEgmVrhtv1+26dt/8L2Zdt5/v1DAFgwdf897ick/Tgi/sZ2S9KhBmcCANzCno+4bb9P0sclPS1JETGJiDebHgyYtYyneQM3U+dSyXFJb0j6ju0Ltp+yfbjhuYCZ2j3Nu9/vq9frEW+kVudSyQFJH5X0SEScsf2EpMck/eONb2R7XdK6JK2srBQ/LXlra6v4DPOCXUiDwUBVVWl7e1tVVanf76uqqtJjFcX9YirdLiLiljdJfyTppRte/gtJ/3qr91ldXY3SNjc3S48wN9hFxGg0ina7HUtLS9Fut2M0GpUeqTjuF1PzsAtJZ2OPHu/e9rxUEhGvS3rF9od3XtWTdKmZbyNAM3ZP815bW9PGxkaqg2GBt6v7WyWPSBrs/EbJi5I+19xIQDMynuYN3EytcEfE85JONjwLAKAGnjkJAMkQbgBIhnADQDKEGwCSIdwAkAzhBoBkCDcAJEO4ASAZwg0AyRBuAEiG
cANAMoQbAJIh3ACQDOEGgGQIN3AXjcdjnTp1ijMvxS7uRN2DFADcod0DiyeTiVqt1r4+iYdd3BkecQN3yXA41GQy0bVr1zSZTHIdTjtj7OLOEG7gLul2u2q1WlpeXlar1VK32y09UjHs4s5wqQS4S3YPLB4Oh+p2u/v60gC7uDOEG7iLOp0OkdrBLt45LpUAQDKEGwCSIdwAkAzhBoBkCDcAJEO4ASAZwg0AyRBuAEiGcANAMoQbAJIh3ACQDOEGgGQINwAkQ7gBIJna4ba9bPuC7R81ORAA4NZu5xH3o5IuNzUIAKCeWuG2fVTSpyQ91ew4i4VTrAE0oe4JON+S9FVJ721wloXCKdYAmrJnuG1/WtJvIuKc7e4t3m5d0rokraysFD+1eWtrq+gMg8FAVVVpe3tbVVWp3++rqqois5TexTxhF1PsYirdLiLiljdJpyRdkfSSpNcl/ZekZ271Pqurq1Ha5uZm0f//aDSKdrsdy8vL0W63YzQaFZul9C7mCbuYYhdT87ALSWdjjx7v3vZ8xB0Rj0t6XJJ2HnH/Q0Q83My3kcXBKdYAmsIp7w3iFGsATbitcEfEUNKwkUkAALXwzEkASIZwA0AyhBsAkiHcAJAM4QaAZAg3ACRDuAEgGcINAMkQbgBIhnADQDKEGwCSIdwAkAzhBoBkCDcAJEO4ASAZwo3Gcdo9MFucgINGcdo9MHs84kajhsOhJpOJrl27pslkkuskbWBOEW40qtvtqtVqaXl5Wa1WS91ut/RIQHpcKkGjOO0emD3CjcZx2j0wW1wqAYBkCDcAJEO4ASAZwg0AyRBuAEiGcANAMoQbAJIh3ACQDOEGgGQINwAkQ7gBIBnCDQDJEG4ASIZwA0Aye4bb9r22N21fsn3R9qN3YzAAwM3V+fe4r0r6SkSct/1eSeds/1tEXGp4NgDATez5iDsiXouI8zt//oOky5KOND0YZmM8HmswGHDCOrBAbusat+1jku6TdKaJYTBbuyes9/t99Xo94g0siNpHl9l+j6TvS/pSRPz+Jv99XdK6JK2srBQ/zXtra6v4DKUNBgNVVaXt7W1VVaV+v6+qqkqPVRT3iyl2MZVuFxGx503SuyQ9J+nv67z96upqlLa5uVl6hOJGo1G02+1YWlqKdrsdo9Go9EjFcb+YYhdT87ALSWejRl8jotZvlVjS05IuR8Q3G/0ugpnaPWF9bW1NGxsbHNgLLIg6l0oekPRZST+z/fzO674WEc82NxZmpdPpqKoqog0skD3DHRH/Icl3YRYAQA08cxIAkiHcAJAM4QaAZAg3ACRDuAEgGcINAMkQbgBIhnADQDKEGwCSIdwAkAzhBoBkCDcAJEO4ASAZwg0AyRBuAEiGcANAMoQbAJIh3ACQDOEGgGQINwAkQ7gBIBnCDQDJEG4ASIZwA0AyhBsAkiHcAJAM4QaAZAg3ACRDuAEgGcINAMkQbgBIhnADQDKEGwCSIdwAkEytcNv+hO1f2n7B9mNNDwUA+P/tGW7by5K+LemTkk5I+oztE00PBgC4uTqPuD8m6YWIeDEiJpK+J+nBZse6M+PxWIPBQOPxuPQoADBzdcJ9RNIrN7x8Zed1c2k8HqvX66nf76vX6xFvAAvnwKw+kO11SeuStLKyouFwOKsPfVsGg4GqqtL29raqqlK/31dVVUVmmRdbW1vFPh/zhl1MsYupbLuoE+5XJd17w8tHd173f0TEk5KelKSTJ09Gt9udxXy37eDBg/8b74MHD2ptbU2dTqfILPNiOByq1Odj3rCLKXYxlW0XdS6V/FTSh2wft92S9JCkHzY71jvX6XS0sbGhtbU1bWxs7PtoA1g8ez7ijoirtr8g6TlJy5L6EXGx8cnuQKfTUVVVRBvAQqp1jTsinpX0bMOzAABq4JmTAJAM4QaAZAg3ACRDuAEgGcINAMkQbgBIhnADQDKEGwCSIdwAkAzhBoBkCDcAJEO4ASAZwg0AyRBuAEiGcANAMoQbAJIh3ACQ
jCNi9h/UfkPSr2b+gW/PPZJ+W3iGecEuptjFFLuYmodd/ElEfLDOGzYS7nlg+2xEnCw9xzxgF1PsYopdTGXbBZdKACAZwg0AySxyuJ8sPcAcYRdT7GKKXUyl2sXCXuMGgEW1yI+4AWAhLWS4bX/C9i9tv2D7sdLzlGL7Xtubti/Zvmj70dIzlWR72fYF2z8qPUtJtt9v+7TtX9i+bLtTeqZSbH9552vj57a/a/vdpWeqY+HCbXtZ0rclfVLSCUmfsX2i7FTFXJX0lYg4Iel+SZ/fx7uQpEclXS49xBx4QtKPI+JPJf2Z9ulObB+R9EVJJyPiI5KWJT1Udqp6Fi7ckj4m6YWIeDEiJpK+J+nBwjMVERGvRcT5nT//Qde/QI+UnaoM20clfUrSU6VnKcn2+yR9XNLTkhQRk4h4s+xURR2Q1LZ9QNIhSb8uPE8tixjuI5JeueHlK9qnsbqR7WOS7pN0puwkxXxL0lclbZcepLDjkt6Q9J2dy0ZP2T5ceqgSIuJVSd+Q9LKk1yT9LiJ+UnaqehYx3Hgb2++R9H1JX4qI35ee526z/WlJv4mIc6VnmQMHJH1U0j9HxH2S3pK0L38OZPsDuv638eOS/ljSYdsPl52qnkUM96uS7r3h5aM7r9uXbL9L16M9iIgflJ6nkAck/bXtl3T90tlf2n6m7EjFXJF0JSJ2/+Z1WtdDvh/9laT/jIg3IuK/Jf1A0p8XnqmWRQz3TyV9yPZx2y1d/2HDDwvPVIRt6/q1zMsR8c3S85QSEY9HxNGIOKbr94d/j4gUj6xmLSJel/SK7Q/vvKon6VLBkUp6WdL9tg/tfK30lOQHtQdKDzBrEXHV9hckPafrPyXuR8TFwmOV8oCkz0r6me3nd173tYh4tuBMKO8RSYOdBzYvSvpc4XmKiIgztk9LOq/rv4F1QUmeQckzJwEgmUW8VAIAC41wA0AyhBsAkiHcAJAM4QaAZAg3ACRDuAEgGcINAMn8DzWXEr0zzEqRAAAAAElFTkSuQmCC\n",
  50. "text/plain": [
  51. "<Figure size 432x288 with 1 Axes>"
  52. ]
  53. },
  54. "metadata": {
  55. "needs_background": "light"
  56. },
  57. "output_type": "display_data"
  58. }
  59. ],
  60. "source": [
  61. "%matplotlib inline\n",
  62. "import matplotlib.pyplot as plt\n",
  63. "import numpy as np\n",
  64. "\n",
  65. "X0 = np.array([7, 5, 7, 3, 4, 1, 0, 2, 8, 6, 5, 3])\n",
  66. "X1 = np.array([5, 7, 7, 3, 6, 4, 0, 2, 7, 8, 5, 7])\n",
  67. "plt.figure()\n",
  68. "plt.axis([-1, 9, -1, 9])\n",
  69. "plt.grid(True)\n",
  70. "plt.plot(X0, X1, 'k.');"
  71. ]
  72. },
  73. {
  74. "cell_type": "markdown",
  75. "metadata": {},
  76. "source": [
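  77. "Suppose K-Means initializes the centroid of the first cluster at the 5th sample and the centroid of the second cluster at the 11th sample. We can then compute the distance from every instance to both centroids and assign each instance to the nearest cluster. The results are shown in the table below:\n",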
  78. "![data_0](images/data_0.png)\n",
  79. "\n",
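  "For instance, taking the first sample $(7, 5)$ and the initial centroids $(4, 6)$ and $(5, 5)$ from the data above, the squared distances are:\n",
  "$$\n",
  "(7-4)^2 + (5-6)^2 = 10, \\qquad (7-5)^2 + (5-5)^2 = 4\n",
  "$$\n",
  "so this sample is assigned to the second cluster.\n",
  "\n",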
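  80. "The initial centroids and the resulting first assignment are plotted below. The first cluster is drawn with X markers and the second with dots. The centroids are highlighted with larger markers.\n",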
  81. "\n",
  82. "\n"
  83. ]
  84. },
  85. {
  86. "cell_type": "code",
  87. "execution_count": 5,
  88. "metadata": {},
  89. "outputs": [
  90. {
  91. "data": {
  92. "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAEICAYAAAB/Dx7IAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAFYlJREFUeJzt3X+U3XV95/Hnm0kGCbHQ3bhpgQmD1UUpHIWE1pHanXHctayi5+w5ZW2RHM160vasgkU2qyLUVSmttVTtWntYTVlg1mwOup6CWHUnc/donbJJkF1+RM6hMGQAXbHKj4CdScJ7//je4U5CkrkDc/O9n5nn45x7Zr7f+73f+7qf3Lzudz73znwjM5EkleOYugNIkubH4pakwljcklQYi1uSCmNxS1JhLG5JKozFrY6KiDdExH01Z/hwRHyhzgwvVkRkRLyi7hzqDha3AIiI90bEjoiYiojr53G7iYh40+Guz8xvZ+bp7W7/YkXEYEQ8fFCGP8zM93TqPo+2iLg+Ij5Rdw7VZ1ndAdQ1HgU+AbwZOK7mLIcUEQFEZj5bd5ZDiYhlmbmv7hxa/DziFgCZ+ZXM/CrwDwdfFxGrIuLWiHg8In4SEd+OiGMi4kZgDXBLROyJiE2HuO1zR8CH2z4iXhcR323u//9ExOCs2zci4uqI+FvgGeDlEfHuiNgVEU9FxAMR8TvNbY8Hvg6c1Nz/nog4KSI+GhE3zdrn2yLinub9NSLi1bOum4iIyyPi/0bEExHx3yPiJYcas4h4V0T8bUT8WUT8A/DR5voNzXw/jYhvRMSpzfXR3PZHEfFkRNwVEWfOepzvOWjf3znEfW4ELgI2NR/fLc31/zEiHmmOyX0RMXyozFokMtOLl+cuVEfd1x+07hrgL4HlzcsbqI58ASaANx1hf4PAw7OWD9geOJnqxeJfUx1I/Mvm8sua1zeA3cAvU/2EuBx4C/BLQAD/gqrQzznU/TXXfRS4qfn9Pweebt7PcmATcD/QOyvf/wZOAv4JsAv43cM8tncB+4D3NbMdB7y9ub9XN9d9BPhuc/s3AzuBE5vZXw384qzH+Z6D9v2dWcsJvKL5/fXAJ2ZddzowCZzUXO4Hfqnu55KXzl084lY79gK/CJyamXuzmrdeqD9y807gtsy8LTOfzcxvATuoinzG9Zl5T2bua97/1zLz77Pyv4BvUr2YtOPfAl/LzG9l5l7gU1SF+/pZ23w2Mx/NzJ8AtwCvPcL+Hs3MP29m+xnwu8A1mbkrq2mTPwRe2zzq3gu8FHgV1Qvfrsz8QZu5j2Q/cCxwRkQsz8yJzPz7BdivupTFrXb8CdVR5DebUxMfXMB9nwr8ZnPa4vGIeBz4NaoXihmTs28QEedHxN81p20epyr5VW3e30nAQzMLWc2XT1Id+c/44azvnwFWHmF/kwctnwp8ZtZj+QnV0fXJmbkN+M/A54AfRcR1EfFzbeY+rMy8H3g/1U8WP4qILRFx0ovdr7qXxa05ZeZTmfmBzHw58DbgsllzqPM98j54+0ngxsw8cdbl+Mz8o0PdJiKOBb5MdaS8OjNPBG6jKsd28jxKVa4z+wugD3hkno/jedmaJoHfOejxHJeZ3wXIzM9m5lrgDKppm//QvN3TwIpZ+/mFedwnmfnfMvPXqB5bAn/8wh6OSmBxC6g+EdF8E64H6ImIl0TEsuZ1b42IVzRL7gmqH81nPtnx/4CXz+OuDt7+JuCCiHhzRMzc72BEnHKY2/dSTQs8BuyLiPOBf3XQ/v9pRJxwmNtvBd4SEcMRsRz4ADAFfHcej+FI/hL4UET8MkBEnBARv9n8/tyI+NXm/T4N/COtcbwT+DcRsSKqz2v/uyPcxwFjGBGnR8Qbmy9q/wj8bNZ+tQhZ3JrxEar/8B+kmnf+WXMdwCuB/wnsAcaBv8jMseZ11wAfaU4NXN7G/RywfWZOUr2h92GqMp6kOgo95HMzM58C
LqEq4J8Cvw389azrvw98CXigeR8nHXT7+5qP78+BHwMXABdk5nQb2eeUmf+D6mh3S0Q8CdwNnN+8+ueA/9LM/RDVm7B/0rzuz4BpqlL+r8DIEe7mi1Tz2Y9HxFepXsj+qPl4fgj8M+BDC/F41J1mPhkgSSqER9ySVBiLW5IKY3FLUmEsbkkqTEf+yNSqVauyv7+/E7tu29NPP83xxx9fa4Zu4Vi0OBYtjkVLN4zFzp07f5yZL2tn244Ud39/Pzt27OjErtvWaDQYHBysNUO3cCxaHIsWx6KlG8YiIh6ae6uKUyWSVBiLW5IKY3FLUmEsbkkqjMUtSYWxuCWpMBa3JBXG4pakwljcklQYi1uSCmNxS1JhLG5JKozFLUmFsbglqTAWtyQVxuKWpMJY3JJUmLaKOyJ+PyLuiYi7I+JLEfGSTgeT1AGf/CSMjR24bmysWq9izFncEXEycAmwLjPPBHqAd3Q6mKQOOPdcuPDCVnmPjVXL555bby7NS7vnnFwGHBcRe4EVwKOdiySpY4aGYOtWuPBC+s8/H77+9Wp5aKjuZJqHyMy5N4q4FLga+Bnwzcy86BDbbAQ2AqxevXrtli1bFjjq/OzZs4eVK1fWmqFbOBYtjkWlf/Nm+m+8kYmLL2Ziw4a649SuG54XQ0NDOzNzXVsbZ+YRL8DPA9uAlwHLga8C7zzSbdauXZt1GxsbqztC13AsWhyLzNy2LXPVqnzw4oszV62qlpe4bnheADtyjj6eubTz5uSbgAcz87HM3At8BXj9C3hBkVS3mTntrVurI+3mtMnz3rBUV2unuHcDr4uIFRERwDCwq7OxJHXE9u0HzmnPzHlv315vLs3LnG9OZubtEXEzcAewD/gecF2ng0nqgE2bnr9uaMg3JwvT1qdKMvMPgD/ocBZJUhv8zUlJKozFLUmFsbglqTAWtyQVxuKWpMJY3JJUGItbkgpjcUtSYSxuSSqMxS1JhbG4JakwFrckFcbilqTCWNzqHM8o3uJYaAFZ3Ooczyje4lg8z/jkONd8+xrGJ8drzzGye6T2HPPR7lnepfmbdUZxfu/34POfX7pnFHcsDjA+Oc7wDcNM75+mt6eX0fWjDPQN1JZjat8UI5MjteWYL4+41VlDQ1VRffzj1dclWlSAYzFLY6LB9P5p9ud+pvdP05ho1JrjWZ6tNcd8WdzqrLGx6ujyyiurr0v5pLSOxXMG+wfp7emlJ3ro7ellsH+w1hzHcEytOebLqRJ1zqwzij93XsPZy0uJY3GAgb4BRteP0phoMNg/WNv0xEyOzWOb2TC0oYhpErC41UlHOqP4Uisrx+J5BvoGuqIoB/oGmFoz1RVZ2mVxq3M8o3iLY6EF5By3JBXG4pakwljcklQYi1uSCmNxS1JhLG5JKozFLUmFsbglqTAWtyQVxuKWpMJY3JJUGItbi9OhThV2OJ5CTIWxuLU4HXyqsMPxFGIqUFvFHREnRsTNEfH9iNgVEeX8/UMtTbNPFXa48j74b2RLhWj3iPszwN9k5quA1wC7OhdJWiAz5X3BBXDttQded+211XpLWwWa8+9xR8QJwK8D7wLIzGlgurOxpAUyNAQf+xhcfnm1fM45VWlffjl86lOWtorUzokUTgMeA/4qIl4D7AQuzcynO5pMWiiXXVZ9vfxyXnvmmXD33VVpz6yXChOZeeQNItYBfwecl5m3R8RngCcz88qDttsIbARYvXr12i1btnQocnv27NnDypUra83QLRyLymsvuYQT77qLx886izs/+9m649TO50VLN4zF0NDQzsxc19bGmXnEC/ALwMSs5TcAXzvSbdauXZt1GxsbqztC13AsMvNP/zQzIn961lmZEdXyEufzoqUbxgLYkXP08cxlzqmSzPxhRExGxOmZeR8wDNz7Ql9VpKNu1pz2neecw+Add7TmvJ0uUYHaPVnw+4CRiOgFHgDe3blI0gIaG4OrrmrNaTcarbK+6io4+2zfoFRx2iruzLwTaG/uReoW
M5/TvuWW55fzZZdVpe3nuFUgf3NSi1M7v1zTzi/pSF3I4tbitH17e0fSM+W9ffvRySUtgHbnuKWybNrU/rZDQ06VqCgecUtSYSxuSSqMxS1JhbG4JakwFrckFcbilqTCWNySVBiLW5IKY3FLUmEsbkkqjMUtHSUjd43Q/+l+jvlPx9D/6X5G7hqpO5IK5d8qkY6CkbtG2HjLRp7Z+wwADz3xEBtv2QjARWddVGe02oxPjtOYaDDYP8hA30DdcYpicUtHwRWjVzxX2jOe2fsMV4xesSSLe3xynOEbhpneP01vTy+j60ct73lwqkQ6CnY/sXte6xe7xkSD6f3T7M/9TO+fpjHRqDtSUSxu6ShYc8Kaea1f7Ab7B+nt6aUneujt6WWwf7DuSEWxuKWj4Orhq1mxfMUB61YsX8HVw1fXlKheA30DjK4f5eNDH3ea5AVwjls6Cmbmsa8YvYLdT+xmzQlruHr46iU5vz1joG/Awn6BLG7pKLnorIuWdFFr4ThVIkmFsbglqTAWtyQVxuKWpMJY3JJUGItbkgpjcUtSYSxuSSqMxS1JhbG4JakwFrckFcbilqTCWNySVBiLW5IK03ZxR0RPRHwvIm7tZKBF4ZOfhLGxA9eNjVXrJelFms8R96XArk4FWVTOPRcuvLBV3mNj1fK559abS9Ki0FZxR8QpwFuAL3Q2ziIxNARbt1ZlfdVV1detW6v1kvQiRWbOvVHEzcA1wEuByzPzrYfYZiOwEWD16tVrt2zZssBR52fPnj2sXLmy1gz9mzfTf+ONTFx8MRMbNtSWoxvGols4Fi2ORUs3jMXQ0NDOzFzX1saZecQL8FbgL5rfDwK3znWbtWvXZt3GxsbqDbBtW+aqVZlXXll93battii1j0UXcSxaHIuWbhgLYEfO0a0zl3amSs4D3hYRE8AW4I0RcdP8X0+WkJk57a1b4WMfa02bHPyGpSS9AHMWd2Z+KDNPycx+4B3Atsx8Z8eTlWz79gPntGfmvLdvrzeXpEXBs7x3wqZNz183NOSbk5IWxLyKOzMbQKMjSSRJbfE3JyWpMBa3JBXG4pakwljcklQYi1uSCmNxS1JhLG5JKozFLUmFsbglqTAWtyQVxuKWpMJY3JJUGItbkgpjcUtSYSxuddz45DjXfPsaxifH644iLQqeSEEdNT45zvANw0zvn6a3p5fR9aMM9A3UHUsqmkfc6qjGRIPp/dPsz/1M75+mMdGoO5JUPItbHTXYP0hvTy890UNvTy+D/YN1R5KK51SJOmqgb4DR9aM0JhoM9g86TSItAItbHTfQN2BhSwvIqRJJKozFLUmFsbglqTAWtyQVxuKWpMJY3JJUGItbkgpjcUtSYSxuSSqMxS1JhbG4JakwFrckFcbilqTCWNySVJg5izsi+iJiLCLujYh7IuLSoxFMknRo7fw97n3ABzLzjoh4KbAzIr6Vmfd2OJsk6RDmPOLOzB9k5h3N758CdgEndzqYFsb45Dgju0c8w7q0iMxrjjsi+oGzgds7EUYLa+YM65sf3MzwDcOWt7RItH3qsohYCXwZeH9mPnmI6zcCGwFWr15No9FYqIwvyJ49e2rPULeR3SNM7ZviWZ5lat8Um8c2M7Vmqu5YtfJ50eJYtJQ2FpGZc28UsRy4FfhGZl471/br1q3LHTt2LEC8F67RaDA4OFhrhrrNHHFP7Zvi2GXHMrp+dMmf+9HnRYtj0dINYxEROzNzXTvbtvOpkgC+COxqp7TVPWbOsL7htA2WtrSItDNVch5wMXBXRNzZXPfhzLytc7G0UAb6BphaM2VpS4vInMWdmd8B4ihkkSS1wd+clKTCWNySVBiLW5IKY3FLUmEsbkkqjMUtSYWxuCWpMBa3JBXG4pakwljcklQYi1uSCmNxS1JhLG5JKozFLUmFsbglqTAWtyQVxuKWpMJY3JJUGItbkgpjcUtSYSxuSSqMxS1JhbG4JakwFrckFcbilqTCWNySVBiLW5IKY3FLUmEs
bkkqjMUtSYWxuCWpMBa3JBXG4pakwljcklQYi1uSCtNWcUfEb0TEfRFxf0R8sNOhJEmHN2dxR0QP8DngfOAM4Lci4oxOB3sxxifHGdk9wvjkeN1RJGnBtXPE/SvA/Zn5QGZOA1uAt3c21gs3PjnO8A3DbH5wM8M3DFvekhadZW1sczIwOWv5YeBXD94oIjYCGwFWr15No9FYiHzzNrJ7hKl9UzzLs0ztm2Lz2Gam1kzVkqVb7Nmzp7Z/j27jWLQ4Fi2ljUU7xd2WzLwOuA5g3bp1OTg4uFC7npdjJ49lZLIq72OXHcuGoQ0M9A3UkqVbNBoN6vr36DaORYtj0VLaWLQzVfII0Ddr+ZTmuq400DfA6PpRNpy2gdH1o0u+tCUtPu0ccW8HXhkRp1EV9juA3+5oqhdpoG+AqTVTlrakRWnO4s7MfRHxXuAbQA+wOTPv6XgySdIhtTXHnZm3Abd1OIskqQ3+5qQkFcbilqTCWNySVBiLW5IKY3FLUmEsbkkqjMUtSYWxuCWpMBa3JBXG4pakwljcklQYi1uSCmNxS1JhLG5JKozFLUmFsbglqTCRmQu/04jHgIcWfMfzswr4cc0ZuoVj0eJYtDgWLd0wFqdm5sva2bAjxd0NImJHZq6rO0c3cCxaHIsWx6KltLFwqkSSCmNxS1JhFnNxX1d3gC7iWLQ4Fi2ORUtRY7Fo57glabFazEfckrQoWdySVJhFWdwR8RsRcV9E3B8RH6w7T10ioi8ixiLi3oi4JyIurTtTnSKiJyK+FxG31p2lThFxYkTcHBHfj4hdETFQd6a6RMTvN/9v3B0RX4qIl9SdqR2Lrrgjogf4HHA+cAbwWxFxRr2parMP+EBmngG8Dvj3S3gsAC4FdtUdogt8BvibzHwV8BqW6JhExMnAJcC6zDwT6AHeUW+q9iy64gZ+Bbg/Mx/IzGlgC/D2mjPVIjN/kJl3NL9/iuo/6Mn1pqpHRJwCvAX4Qt1Z6hQRJwC/DnwRIDOnM/PxelPVahlwXEQsA1YAj9acpy2LsbhPBiZnLT/MEi2r2SKiHzgbuL3eJLX5NLAJeLbuIDU7DXgM+KvmtNEXIuL4ukPVITMfAT4F7AZ+ADyRmd+sN1V7FmNx6yARsRL4MvD+zHyy7jxHW0S8FfhRZu6sO0sXWAacA3w+M88GngaW5PtAEfHzVD+NnwacBBwfEe+sN1V7FmNxPwL0zVo+pbluSYqI5VSlPZKZX6k7T03OA94WERNUU2dvjIib6o1Um4eBhzNz5ievm6mKfCl6E/BgZj6WmXuBrwCvrzlTWxZjcW8HXhkRp0VEL9WbDX9dc6ZaRERQzWXuysxr685Tl8z8UGaekpn9VM+HbZlZxJHVQsvMHwKTEXF6c9UwcG+Nkeq0G3hdRKxo/l8ZppA3apfVHWChZea+iHgv8A2qd4k3Z+Y9Nceqy3nAxcBdEXFnc92HM/O2GjOpfu8DRpoHNg8A7645Ty0y8/aIuBm4g+oTWN+jkF9991feJakwi3GqRJIWNYtbkgpjcUtSYSxuSSqMxS1JhbG4JakwFrckFeb/AyaUIWRb0bIhAAAAAElFTkSuQmCC\n",
  93. "text/plain": [
  94. "<Figure size 432x288 with 1 Axes>"
  95. ]
  96. },
  97. "metadata": {
  98. "needs_background": "light"
  99. },
  100. "output_type": "display_data"
  101. }
  102. ],
  103. "source": [
  104. "C1 = [1, 4, 5, 9, 11]\n",
  105. "C2 = list(set(range(12)) - set(C1))\n",
  106. "X0C1, X1C1 = X0[C1], X1[C1]\n",
  107. "X0C2, X1C2 = X0[C2], X1[C2]\n",
  108. "plt.figure()\n",
  109. "plt.title('1st iteration results')\n",
  110. "plt.axis([-1, 9, -1, 9])\n",
  111. "plt.grid(True)\n",
  112. "plt.plot(X0C1, X1C1, 'rx')\n",
  113. "plt.plot(X0C2, X1C2, 'g.')\n",
  114. "plt.plot(4,6,'rx',ms=12.0)\n",
  115. "plt.plot(5,5,'g.',ms=12.0);"
  116. ]
  117. },
  118. {
  119. "cell_type": "markdown",
  120. "metadata": {},
  121. "source": [
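  122. "Now we recompute the centroids of both clusters, move them to their new positions, recompute the distance from every sample to the new centroids, and reassign the samples by distance. The results are shown in the table below:\n",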
  123. "\n",
  124. "![data_1](images/data_1.png)\n",
  125. "\n",
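  "As a check on the update step, averaging the five members of the first cluster after the first assignment, $(5,7)$, $(4,6)$, $(1,4)$, $(6,8)$ and $(3,7)$, gives:\n",
  "$$\n",
  "u_1 = \\frac{1}{5} (5+4+1+6+3,\\ 7+6+4+8+7) = (3.8,\\ 6.4)\n",
  "$$\n",
  "The same computation over the seven members of the second cluster gives $u_2 = (32/7,\\ 29/7) \\approx (4.57,\\ 4.14)$.\n",
  "\n",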
  126. "The resulting plot:"
  127. ]
  128. },
  129. {
  130. "cell_type": "code",
  131. "execution_count": 8,
  132. "metadata": {},
  133. "outputs": [
  134. {
  135. "data": {
  136. "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAEICAYAAAB/Dx7IAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAFQZJREFUeJzt3X+U3Hdd7/Hnu5sfNAkWNLBIm7DRo60RxZoUuvSqu271UKhyz9FbfpT0Qg43VxQt3t6DFm6lFirq8XjAg/ZeLKk0rOTWwrlirVJNd/VCY23SVkubopWkSUtLA9gfm8Juk7zvH/PdO0PYzc4mO/nOZ/b5OGfO7nfmO9/ve967+9rvfL4z84nMRJJUjtPqLkCSND8GtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuLZiIeGtEfG6W29ZGxERE9J3qulpquDQibqtr/wshIvZFxIV116F6GdyLWEQsj4iPRcTDEfFMRNwbERd1Yl+ZuT8zV2XmkWrf4xHx9k7sq9r+QERkRCxpqWE0M3+6U/s81SLi6oj4RN116NQzuBe3JcAB4CeAM4D/AdwUEQM11tSWOo/c59L6z0LqBIN7EcvMQ5l5dWbuy8yjmXkLsBfYABARQxHxSERcERFPRMRjEfG26ftHxHdFxGci4umI+Efge2fbV+sRcERcC/wY8JFq+OQj1TrnRMTfRMTXI+KLEXFJy/3/JCKui4hbI+IQMBwRr4uIe6r9H4iIq1t2+ffV1yerfQweO5QTEa+OiLsi4qnq66tbbhuPiPdHxOerZyO3RcTqWR7bdJ9+LSIeB26orr+4ehbzZETcERE/3HKfX4uIR6ttfzEiRloe5weO3fYM+3wN8B7gDdXj+6fq+rdGxJeq7e6NiEtn+5moYJnpxQuZCdAPfBM4p1oeAg4D1wBLgdcCzwIvrG7fDtwErAReDjwKfG6WbQ8ACSyplseBt7fcvpLG0f/baDwTOBf4KrC+uv1PgKeAC2gccDyvqu+HquUfBr4C/MeZ9ldd99bp+oDvBP4d2FTt703V8ne11PdvwPcDp1fLvz3LY5vu0+8Ay6v1zwWeAF4F9AH/GdhX3X529Vhf2lLr97Y8zg8cs+1HWpb3ARdW318NfOKYHj4NnF0tfzfwg3X/XnlZ+ItH3AIgIpYCo8DHM/PBlpueA67JzOcy81ZgAji7Gqr4OeA3snHk/gXg4ydRwsXAvsy8ITMPZ+Y9wKeA/9Syzp9n5uez8ezgm5k5npn3Vcv/DHySxrBPO14H/Gtmbqv290ngQeBnWta5ITP/JTO/QeMf1I8cZ3tHgfdl5mS1/hbgf2XmnZl5JDM/DkwC5wNHaAT4+ohYmo1nPP/WZt1zOQq8PCJOz8zHMvP+BdquuojBLSLiNGAbMAW885ibv5aZh1uWnwVWAS+iOUY+7eGTKONlwKuqYYUnI+JJ4FLgJS3rtO6LiHhVRIxFxMGIeAr4BWDG4YwZvHSGeh8GzmxZfrzl++nHPZuDmfnNluWXAVcc83jW0DjKfgh4F40j5iciYntEvLTNumeVmYeAN9Dow2MR8ZcRcc7Jblfdx+Be5CIigI/RGCb5ucx8rs27HqQxPLCm5bq189j1sR9LeQD4u8x8QctlVWa+4zj3+VPgM8CazDwD+J9AzLLusb5MI1xbraUx3HMiZno81x7zeFZUR/Zk5p9m5n+oakgawywAh4AVLdt5CbP7tseYmZ/NzJ+iMUzyIPDHJ/Zw1M0Mbl0H/ADwM9VT/LZk42V9nwaujogVEbGexjhuu74CfE/L8i3A90fEpohYWl3Oi4gfOM42ng98PTO/GRGvBN7ccttBGsMG3zPjPeHWan9vrk6YvgFYX9WxEP4Y+IXqWUFExMrqZOrzI+LsiPjJiFhO45zCN6paAe4FXhsR3xkRL6FxZD6brwAD1TMmIqI/Il4fEStpDMtMtGxXPcTgXsQi4mXAf6U
xdvt49eqEiXm8EuGdNIYPHqdxUu2Geez+w8DPR8S/R8QfZOYzwE8Db6RxNPw4zZN9s/lF4JqIeAb4DRrj0ABk5rPAtcDnq6GK81vvmJlfozGufgXwNeDdwMWZ+dV5PIZZZeYu4L8AH6Fx0vMhGidHqR7Tb9M4+fo48GLgyuq2bcA/0TgJeRvwv4+zmz+rvn4tIu6m8ff832j07+s0xvvfMct9VbDIdCIFSSqJR9ySVBiDW5IKY3BLUmEMbkkqTEc+DGf16tU5MDDQiU237dChQ6xcubLWGrqFvWiyF032oqkberF79+6vZuaL2lm3I8E9MDDArl27OrHpto2PjzM0NFRrDd3CXjTZiyZ70dQNvYiItt957FCJJBXG4JakwhjcklQYg1uSCmNwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFaat4I6IX42I+yPiCxHxyYh4XqcLk9QBv/u7MDb2rdeNjTWuVzHmDO6IOBP4FWBjZr4c6APe2OnCJHXAeefBJZc0w3tsrLF83nn11qV5aXfOySXA6RHxHLAC+HLnSpLUMcPDcNNNcMklDFx0EfzVXzWWh4frrkzzEJk590oRlwPXAt8AbsvMS2dYZwuwBaC/v3/D9u3bF7jU+ZmYmGDVqlW11tAt7EWTvWgY2LqVgW3b2LdpE/s2b667nNp1w+/F8PDw7szc2NbKmXncC/BC4HbgRcBS4P8AbznefTZs2JB1Gxsbq7uErmEvmuxFZt5+e+bq1bl306bM1asby4tcN/xeALtyjjyevrRzcvJCYG9mHszM54BPA68+gX8okuo2PaZ9002NI+1q2OTbTliqq7UT3PuB8yNiRUQEMALs6WxZkjrirru+dUx7esz7rrvqrUvzMufJycy8MyJuBu4GDgP3AB/tdGGSOuDd7/7264aHPTlZmLZeVZKZ7wPe1+FaJElt8J2TklQYg1uSCmNwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG41TnOKN5kL5q6pRfdUscJMLjVOc4o3mQvmrqlF91Sxwlod5Z3af5aZhTnHe+A665bvDOK24umbulFwTPee8Stzhoebvxxvv/9ja8F/FF0jL1o6pZeVHUMbNtW1M/E4FZnjY01jqiuuqrxdTFPSmsvmrqlF1Ud+zZtKupnYnCrc1pmFOeaaxb3jOL2oqlbelHwjPcGtzrHGcWb7EVTt/SiW+o4AZ6cVOc4o3iTvWjqll50Sx0nwCNuSSqMwS1JhTG4Va6Z3vk2m0LeESe1w+BWuY5959tsCnpHnNQOg1vlan0H3mzh3frSswJOOkntMLhVtuOFt6GtHmVwq3wzhbehrR7m67jVG7rlg4ukU8AjbvWObvngIqnDDG71jm754CKpwwxu9YZu+eAi6RQwuFW+mU5EtvNSQalQBrfKdrxXjxje6lFtBXdEvCAibo6IByNiT0QMdrowaU7tvOTP8FYPaveI+8PAX2fmOcArgD2dK0lq07Gfp3y89a688ls/Z9nPLlHB5nwdd0ScAfw48FaAzJwCpjpbltSGmT5PeSbTn2ly002N5dYjdalA7bwBZx1wELghIl4B7AYuz8xDHa1MWigFz+YtzSQy8/grRGwE/gG4IDPvjIgPA09n5lXHrLcF2ALQ39+/Yfv27R0quT0TExOsWrWq1hq6hb1oGNi6lYFt29i3aVNjjsFFzt+Lpm7oxfDw8O7M3NjWypl53AvwEmBfy/KPAX95vPts2LAh6zY2NlZ3CV3DXmTm7bdnrl6dezdtyly9urG8yPl70dQNvQB25Rx5PH2Z8+RkZj4OHIiIs6urRoAHTuAfilSPgmfzlmbS7qtKfhkYjYh/Bn4E+K3OlSQtsIJn85Zm0tanA2bmvUB7Yy9Styl4Nm9pJr5zUpIKY3BLUmEMbkkqjMEtSYUxuCWpMAa
3JBXG4JakwhjcklQYg1uSCmNwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWTqGdB3bywf/7QXYe2Fl3KbWzFyeurc/jlnTydh7YyciNI0wdmWJZ3zJ2XLaDwTWDdZdVC3txcjzilk6R8X3jTB2Z4kgeYerIFOP7xusuqTb24uQY3NIpMjQwxLK+ZfRFH8v6ljE0MFR3SbWxFyfHoRLpFBlcM8iOy3Ywvm+coYGhRT00YC9OjsEtnUKDawYNqYq9OHEOlUhSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMAa3NE+j940y8KEBTvvN0xj40ACj943WXZIWGT8dUJqH0ftG2fIXW3j2uWcBePiph9nyF1sAuPSHLq2zNC0iHnFL8/DeHe/9/6E97dnnnuW9O95bU0VajNoO7ojoi4h7IuKWThYkdbP9T+2f1/VSJ8zniPtyYE+nCulFzmLde9aesXZe10ud0FZwR8RZwOuA6ztbTu+YnsX6qrGrGLlxxPDuEdeOXMuKpSu+5boVS1dw7ci1NVWkxSgyc+6VIm4GPgg8H/jvmXnxDOtsAbYA9Pf3b9i+ffsClzo/ExMTrFq1qrb9j+4fZeverRzlKKdxGpvXbebStfWcvKq7F91kIXrxt1/5W67fez1PTD7Bi5e/mLevezsX9l+4QBWeOv5eNHVDL4aHh3dn5sa2Vs7M416Ai4E/qr4fAm6Z6z4bNmzIuo2NjdW6/zv235Gnf+D07PvNvjz9A6fnHfvvqK2WunvRTexFk71o6oZeALtyjmydvrTzcsALgJ+NiNcCzwO+IyI+kZlvOYF/KouGs1hL6pQ5gzszrwSuBIiIIRpDJYZ2G5zFWlIn+DpuSSrMvN45mZnjwHhHKpEktcUjbkkqjMEtSYUxuCWpMAa3JBXG4JakwhjcklQYg1uSCmNwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWpMIY3JJUGINbHeds99LCmtfncUvzNT3b/dSRKZb1LWPHZTucFUg6SR5xq6PG940zdWSKI3mEqSNTjO8br7skqXgGtzpqaGCIZX3L6Is+lvUtY2hgqO6SpOI5VKKOcrZ7aeEZ3Oo4Z7uXFpZDJZJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMHMGd0SsiYixiHggIu6PiMtPRWGSpJm183nch4ErMvPuiHg+sDsi/iYzH+hwbZKkGcx5xJ2Zj2Xm3dX3zwB7gDM7XZgWxs4DOxndP+oM61IPmdcYd0QMAOcCd3aiGC2s6RnWt+7dysiNI4a31CPanrosIlYBnwLelZlPz3D7FmALQH9/P+Pj4wtV4wmZmJiovYa6je4fZfLwJEc5yuThSbaObWVy7WTdZdXK34sme9FUWi8iM+deKWIpcAvw2cz8/bnW37hxY+7atWsByjtx4+PjDA0N1VpD3aaPuCcPT7J8yXJ2XLZj0c/96O9Fk71o6oZeRMTuzNzYzrrtvKokgI8Be9oJbXWP6RnWN6/bbGhLPaSdoZILgE3AfRFxb3XdezLz1s6VpYUyuGaQybWThrbUQ+YM7sz8HBCnoBZJUht856QkFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMAa3JBXG4JakwhjcklQYg1uSCmNwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMAa3JBXG4JakwhjcklSYtoI7Il4TEV+MiIci4tc7XZQkaXZzBndE9AF/CFwErAfeFBHrO13Yydh5YCej+0fZeWBn3aVI0oJr54j7lcBDmfmlzJwCtgOv72xZJ27ngZ2M3DjC1r1bGblxxPCW1HOWtLH
OmcCBluVHgFcdu1JEbAG2APT39zM+Pr4Q9c3b6P5RJg9PcpSjTB6eZOvYVibXTtZSS7eYmJio7efRbexFk71oKq0X7QR3WzLzo8BHATZu3JhDQ0MLtel5WX5gOaMHGuG9fMlyNg9vZnDNYC21dIvx8XHq+nl0G3vRZC+aSutFO0MljwJrWpbPqq7rSoNrBtlx2Q42r9vMjst2LPrQltR72jnivgv4vohYRyOw3wi8uaNVnaTBNYNMrp00tCX1pDmDOzMPR8Q7gc8CfcDWzLy/45VJkmbU1hh3Zt4K3NrhWiRJbfCdk5JUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMAa3JBXG4JakwkRmLvxGIw4CDy/4hudnNfDVmmvoFvaiyV402YumbujFyzLzRe2s2JHg7gYRsSszN9ZdRzewF032osleNJXWC4dKJKkwBrckFaaXg/ujdRfQRexFk71oshdNRfWiZ8e4JalX9fIRtyT1JINbkgrTk8EdEa+JiC9GxEMR8et111OXiFgTEWMR8UBE3B8Rl9ddU50ioi8i7omIW+qupU4R8YKIuDkiHoyIPRExWHdNdYmIX63+Nr4QEZ+MiOfVXVM7ei64I6IP+EPgImA98KaIWF9vVbU5DFyRmeuB84FfWsS9ALgc2FN3EV3gw8BfZ+Y5wCtYpD2JiDOBXwE2ZubLgT7gjfVW1Z6eC27glcBDmfmlzJwCtgOvr7mmWmTmY5l5d/X9MzT+QM+st6p6RMRZwOuA6+uupU4RcQbw48DHADJzKjOfrLeqWi0BTo+IJcAK4Ms119OWXgzuM4EDLcuPsEjDqlVEDADnAnfWW0ltPgS8GzhadyE1WwccBG6oho2uj4iVdRdVh8x8FPg9YD/wGPBUZt5Wb1Xt6cXg1jEiYhXwKeBdmfl03fWcahFxMfBEZu6uu5YusAT4UeC6zDwXOAQsyvNAEfFCGs/G1wEvBVZGxFvqrao9vRjcjwJrWpbPqq5blCJiKY3QHs3MT9ddT00uAH42IvbRGDr7yYj4RL0l1eYR4JHMnH7mdTONIF+MLgT2ZubBzHwO+DTw6ppraksvBvddwPdFxLqIWEbjZMNnaq6pFhERNMYy92Tm79ddT10y88rMPCszB2j8PtyemUUcWS20zHwcOBARZ1dXjQAP1FhSnfYD50fEiupvZYRCTtQuqbuAhZaZhyPincBnaZwl3pqZ99dcVl0uADYB90XEvdV178nMW2usSfX7ZWC0OrD5EvC2muupRWbeGRE3A3fTeAXWPRTy1nff8i5JhenFoRJJ6mkGtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuSSrM/wNZ1XFVcoOSCQAAAABJRU5ErkJggg==\n",
  137. "text/plain": [
  138. "<Figure size 432x288 with 1 Axes>"
  139. ]
  140. },
  141. "metadata": {
  142. "needs_background": "light"
  143. },
  144. "output_type": "display_data"
  145. }
  146. ],
  147. "source": [
  148. "C1 = [1, 2, 4, 8, 9, 11]\n",
  149. "C2 = list(set(range(12)) - set(C1))\n",
  150. "X0C1, X1C1 = X0[C1], X1[C1]\n",
  151. "X0C2, X1C2 = X0[C2], X1[C2]\n",
  152. "plt.figure()\n",
  153. "plt.title('2nd iteration results')\n",
  154. "plt.axis([-1, 9, -1, 9])\n",
  155. "plt.grid(True)\n",
  156. "plt.plot(X0C1, X1C1, 'rx')\n",
  157. "plt.plot(X0C2, X1C2, 'g.')\n",
  158. "plt.plot(3.8,6.4,'rx',ms=12.0)\n",
  159. "plt.plot(4.57,4.14,'g.',ms=12.0);"
  160. ]
  161. },
  162. {
  163. "cell_type": "markdown",
  164. "metadata": {},
  165. "source": [
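  166. "We repeat the procedure once more: move the centroids to their new positions, recompute the distance from every sample to the new centroids, and reassign the samples by distance. The results are shown in the table below:\n",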
  167. "![data_2](images/data_2.png)\n",
  168. "\n",
  169. "The resulting plot:\n"
  170. ]
  171. },
  172. {
  173. "cell_type": "code",
  174. "execution_count": 11,
  175. "metadata": {},
  176. "outputs": [
  177. {
  178. "data": {
  179. "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAEICAYAAAB/Dx7IAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAFKhJREFUeJzt3X+Q3HV9x/Hn2wQiIQjaYCwk4VColuqoJagn1d41dgoVdabTMlAM1bTNFKvir+IPpFpptONYC1aLjXKM4FXKANNRC2oNd1U7EUnAiiFqGRJyICi08uNAL4S8+8d+jz3DXW4vt5vvfu6ej5mby+5+9/t9f9/Ze91nP7u3n8hMJEnleErdBUiSZsbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMGttoiIjIjjprjt+oj4kwNd0141jEbEs+usYTYi4oMR8fm661B3MLhFRHw+Iu6JiIci4kcR8Wft3H9mnpqZn6uO9YaI+FY797+3iBje+xwyc0lm3tHJ4x4oEdFT/aJcWHctqofBLYCPAD2Z+TTgtcDfRsSJk21Yd1jUffx96ebaNLcY3CIzt2bm2PjF6us5ABHRFxF3RcS7I+Je4LLq+r+qRuk/joi1+9r/+Ag4In4d+DTQW01dPFDdvigiPhYROyPiJxHx6Yg4ZKrjR8TTI+LLEXFfRPys+vfyavv1wCuAT1bH+GR1/RNTORFxeERcXt3/zoh4f0Q8pbrtDRHxraqen0XE9og4dR/ntqOq7XvAIxGxMCKOiohrqv1vj4i3Ttj+JRGxuXp285OI+PjE85xk36+a5LDfqL4/UJ1jb0QcFxH/GREPRsT9EfGv+/o/UdkMbgEQEf8UEY8CPwDuAa6bcPOzgGcAxwDrIuIU4F3A7wLHA5OFy5Nk5jbgL4BN1dTFEdVNfwf8GvAi4DjgaOCvpzo+jcftZdXllcDPgU9Wxzgf+Cbw5uoYb56klH8EDgeeDfw2cDbwxgm3vxT4IbAU+ChwaUTEPk7tTODVwBHAHuBLwH9X57EaeFtE/F617cXAxdWzm+cAV+1jv1N5ZfX9iOocNwEXAl8Dng4sr85Rc5TBLQAy803AYTRGq9cCYxNu3gN8IDPHMvPnwOnAZZn5/cx8BPjg/h63CsR1wNsz8/8y82Hgw8AZUx0/M/83M6/JzEer7dfTCOBWjreg2vd7M/PhzNwB/D2wZsJmd2bmZzLzceBzwK8Cy/ax209k5kjVm5OAIzPzQ5m5q5pX/8yE83kMOC4ilmbmaGZ+u5W6W/AYjV9kR2XmLzKzo68jqF4Gt56QmY9XP/DLgXMm3HRfZv5iwuWjgJEJl++cxWGPBBYDWyLigWr65CvV9ZMePyIWR8Q/V9McD9GYOjiiCuXpLAUO2qvmO2mMjsfdO/6PzHy0+ueSfexzYi+OAY4aP5fqfN5HM/j/lMazix9ExE0RcVoLNbfiPCCA70TE1ummr1Q2X0zRZBZSzXFX9v4IyXuAFRMur5zBvvfe1/00pjp+IzPvbvE+7wSeC7w0M++NiBcBt9AIrsm23/t446PT26rrVgJTHbsVE483AmzPzOMn3TDzf4Azqzn1PwCujohfAR6h8QsMeOKZwZGT7YNJzi8z7wX+vLrvbwFfj4hvZObt+3E+6nKOuOe5iHhmRJwREUsiYkE1F3smsHEfd7sKeENEnBARi4EPzOCQPwGWR8TBAJm5h8ZUwj9ExDOrmo6eMCc8mcNohP0DEfGMSY7/Exrz109STX9cBayPiMMi4hjgHUC73iP9HeDh6gXLQ6qePj8iTgKIiNdHxJHVeT9Q3WcP8CPgqRHx6og4CHg/sGiKY9xX3eeJc4yIPxp/gRb4GY1w39Omc1KXMbiVNKZF7qLxA/8x4G2Z+cUp75B5PXARcANwe/W9VTcAW4F7I+L+6rp3V/v5djX18XUaI+qpXAQcQmP0/G0aUysTXQz8YfWukE9Mcv+
30Bjh3gF8C/gXYGAG5zCl6hfDaTReaN1e1fhZGi+GApwCbI2I0arOM6p5+weBN1Xb3l3VdxeTqKZv1gP/VU3HvIzG3PqN1X6/CJw7V963ricLF1KQpLI44pakwhjcklQYg1uSCmNwS1JhOvI+7qVLl2ZPT08ndt2yRx55hEMPPbTWGrqFvWiyF032oqkberFly5b7M3Oq9+7/ko4Ed09PD5s3b+7Erls2PDxMX19frTV0C3vRZC+a7EVTN/QiIlr+C2SnSiSpMAa3JBXG4JakwhjcklQYg1uSCmNwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwLQV3RLw9IrZGxPcj4gsR8dROFyapAz76URga+uXrhoYa16sY0wZ3RBwNvBVYlZnPBxYAZ3S6MEkdcNJJcPrpzfAeGmpcPumkeuvSjLS65uRC4JCIeAxYDPy4cyVJ6pj+frjqKjj9dHpOPRWuv75xub+/7so0A5GZ028UcS6wHvg58LXMPGuSbdYB6wCWLVt24pVXXtnmUmdmdHSUJUuW1FpDt7AXTfaioWdggJ4rrmDHmjXsWLu27nJq1w2Pi/7+/i2ZuaqljTNzn1/A04EbgCOBg4B/A16/r/uceOKJWbehoaG6S+ga9qLJXmTmDTdkLl2a29esyVy6tHF5nuuGxwWwOafJ4/GvVl6cfBWwPTPvy8zHgGuBl+/HLxRJdRuf077qqsZIu5o2edILlupqrQT3TuBlEbE4IgJYDWzrbFmSOuKmm355Tnt8zvumm+qtSzMy7YuTmXljRFwN3AzsBm4BNnS6MEkdcN55T76uv98XJwvT0rtKMvMDwAc6XIskqQX+5aQkFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMAa3JBXG4JakwhjcUjebbKmxqbgE2bxhcEvdbO+lxqbiEmTzisEtdbMJS41NGd4TPmPbT/mbHwxudY4rijfNphf7Cu8SQ7tbHhfdUsd+MLjVOa4o3jTbXkwW3iWGNnTP46Jb6tgfra5xNpMv15zsLrX2olrfMC+4oCvWNyy+F23sZ/G9aGMd3bD+Jm1ec1Laf/39cM45cOGFje8ljQzbrR29mCv97JbzqOroueKKovppcKuzhobgkkvgggsa3+fzorTt6MVc6We3nEdVx441a8rqZ6tD85l8OVXSXWrrxfjT4fGnn3tfrkHRvWhzP4vuRZvrGBoaqv3xiVMl6gquKN40215M9kJkK28V7Ebd8rjoljr2R6sJP5MvR9zdxV40FdmL6UaC+zlSLLIXHdINvcARtzRHtPKWv1JH3tpvBrfUzfZ+Oj+Vkp7ma9YW1l2ApH0477zWt+3vL+btbJodR9ySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMAa3JBXG4JakwhjcklQYg1uSCtNScEfEERFxdUT8ICK2RURvpwuTJE2u1RH3xcBXMvN5wAuBbZ0rSWqzglfzliYzbXBHxOHAK4FLATJzV2Y+0OnCpLYpeTVvaRKtfDrgscB9wGUR8UJgC3BuZj7S0cqkdpnwedU9p54K11/f2kelSl0qGgsv7GODiFXAt4GTM/PGiLgYeCgzL9hru3XAOoBly5adeOWVV3ao5NaMjo6yZMmSWmvoFvaioWdggJ4rrmDHmjXsWLu27nJq5+OiqRt60d/fvyUzV7W08XRL5ADPAnZMuPwK4N/3dR+XLusu9iKfWN5r+5o1tS9Y3C18XDR1Qy9o59JlmXkvMBIRz62uWg3cth+/UKR6TFj+a8fatS7zpeK1+q6StwCDEfE94EXAhztXktRmJa/mLU2ipaXLMvO7QGtzL1K3mWz5L5f5UsH8y0lJKozBLUmFMbglqTAGtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNzSgeCCxU32YtYMbulAcMHiJnsxay19Hre
kWZqwYDHnnAOXXDJ/Fyy2F7PmiFs6UPr7G0F14YWN7/M5qOzFrBjc0oEyNNQYXV5wQeP7fF7z0l7MisEtHQgTFizmQx+a3wsW24tZM7ilA8EFi5vsxaz54qR0ILhgcZO9mDVH3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMC0Hd0QsiIhbIuLLnSxIkrRvMxlxnwts61Qhc9GmkU185JsfYdPIprpLkTSHtLSQQkQsB14NrAfe0dGK5ohNI5tYfflqdj2+i4MXHMzGszfSu6K37rIkzQGtroBzEXAecNhUG0TEOmAdwLJlyxgeHp51cbMxOjpaaw2DOwcZ2z3GHvYwtnuMgaEBxlaO1VJL3b3oJvaiyV40ldaLaYM7Ik4DfpqZWyKib6rtMnMDsAFg1apV2dc35aYHxPDwMHXWsGhkEYMjg0+MuNf2r61txF13L7qJvWiyF02l9aKVEffJwGsj4veBpwJPi4jPZ+brO1ta2XpX9LLx7I0M7ximr6fPaRJJbTNtcGfme4H3AlQj7ncZ2q3pXdFrYEtqO9/HLUmFafXFSQAycxgY7kglkqSWOOKWpMIY3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuNVxrnYvtdeMPo9bmilXu5fazxG3Omp4xzC7Ht/F4/k4ux7fxfCO4bpLkopncM9Tg7cO0nNRD0/5m6fQc1EPg7cOduQ4fT19HLzgYBbEAg5ecDB9PX0dOY40nzhVMg8N3jrIui+t49HHHgXgzgfvZN2X1gFw1gvOauuxXO1eaj+Dex46f+P5T4T2uEcfe5TzN57f9uAGV7uX2s2pknlo54M7Z3S9pO5icM9DKw9fOaPrJXUXg3seWr96PYsPWvxL1y0+aDHrV6+vqSJJM2Fwz0NnveAsNrxmA8ccfgxBcMzhx7DhNRs6Mr8tqf18cXKeOusFZxnUUqEccUtSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmGmDe6IWBERQxFxW0RsjYhzD0RhkqTJtfIn77uBd2bmzRFxGLAlIv4jM2/rcG2SpElMO+LOzHsy8+bq3w8D24CjO12Y2mPTyCYGdw66wro0h8xojjsieoAXAzd2ohi11/gK6wPbB1h9+WrDW5ojWv50wIhYAlwDvC0zH5rk9nXAOoBly5YxPDzcrhr3y+joaO011G1w5yBju8fYwx7Gdo8xMDTA2MqxusuqlY+LJnvRVFovIjOn3yjiIODLwFcz8+PTbb9q1arcvHlzG8rbf8PDw/T19dVaQ93GR9xju8dYtHARG8/eOO/XfvRx0WQvmrqhFxGxJTNXtbJtK+8qCeBSYFsroa3uMb7C+tpj1xra0hzSylTJycAa4NaI+G513fsy87rOlaV26V3Ry9jKMUNbmkOmDe7M/BYQB6AWSVIL/MtJSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMAa3JBXG4JakwhjcklQYg1uSCmNwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFcbglqTCGNySVBiDW5IKY3BLUmEMbkkqjMEtSYUxuCWpMC0Fd0ScEhE/jIjbI+I9nS5KkjS1aYM7IhYAnwJOBU4AzoyIEzpd2GxsGtnE4M5BNo1sqrsUSWq7VkbcLwFuz8w7MnMXcCXwus6Wtf82jWxi9eWrGdg+wOrLVxvekuachS1sczQwMuHyXcBL994oItYB6wCWLVvG8PBwO+qbscGdg4ztHmMPexjbPcbA0ABjK8dqqaVbjI6O1vb/0W3sRZO9aCqtF60Ed0sycwOwAWDVqlXZ19fXrl3PyKKRRQyONMJ
70cJFrO1fS++K3lpq6RbDw8PU9f/RbexFk71oKq0XrUyV3A2smHB5eXVdV+pd0cvGszey9ti1bDx747wPbUlzTysj7puA4yPiWBqBfQbwxx2tapZ6V/QytnLM0JY0J00b3Jm5OyLeDHwVWAAMZObWjlcmSZpUS3PcmXkdcF2Ha5EktcC/nJSkwhjcklQYg1uSCmNwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWpMIY3JJUGINbkgpjcEtSYQxuSSqMwS1JhTG4JakwBrckFSYys/07jbgPuLPtO56ZpcD9NdfQLexFk71oshdN3dCLYzLzyFY27Ehwd4OI2JyZq+quoxvYiyZ70WQvmkrrhVMlklQYg1uSCjOXg3tD3QV0EXvRZC+a7EVTUb2Ys3PckjRXzeURtyTNSQa3JBVmTgZ3RJwSET+MiNsj4j1111OXiFgREUMRcVtEbI2Ic+uuqU4RsSAibomIL9ddS50i4oiIuDoifhAR2yKit+6a6hIRb69+Nr4fEV+IiKfWXVMr5lxwR8QC4FPAqcAJwJkRcUK9VdVmN/DOzDwBeBnwl/O4FwDnAtvqLqILXAx8JTOfB7yQedqTiDgaeCuwKjOfDywAzqi3qtbMueAGXgLcnpl3ZOYu4ErgdTXXVIvMvCczb67+/TCNH9Cj662qHhGxHHg18Nm6a6lTRBwOvBK4FCAzd2XmA/VWVauFwCERsRBYDPy45npaMheD+2hgZMLlu5inYTVRRPQALwZurLeS2lwEnAfsqbuQmh0L3AdcVk0bfTYiDq27qDpk5t3Ax4CdwD3Ag5n5tXqras1cDG7tJSKWANcAb8vMh+qu50CLiNOAn2bmlrpr6QILgd8ELsnMFwOPAPPydaCIeDqNZ+PHAkcBh0bE6+utqjVzMbjvBlZMuLy8um5eioiDaIT2YGZeW3c9NTkZeG1E7KAxdfY7EfH5ekuqzV3AXZk5/szrahpBPh+9Ctiemfdl5mPAtcDLa66pJXMxuG8Cjo+IYyPiYBovNnyx5ppqERFBYy5zW2Z+vO566pKZ783M5ZnZQ+PxcENmFjGyarfMvBcYiYjnVletBm6rsaQ67QReFhGLq5+V1RTyQu3Cugtot8zcHRFvBr5K41XigczcWnNZdTkZWAPcGhHfra57X2ZeV2NNqt9bgMFqYHMH8Maa66lFZt4YEVcDN9N4B9YtFPKn7/7JuyQVZi5OlUjSnGZwS1JhDG5JKozBLUmFMbglqTAGtyQVxuCWpML8P42o419LPfFMAAAAAElFTkSuQmCC\n",
  180. "text/plain": [
  181. "<Figure size 432x288 with 1 Axes>"
  182. ]
  183. },
  184. "metadata": {
  185. "needs_background": "light"
  186. },
  187. "output_type": "display_data"
  188. }
  189. ],
  190. "source": [
  191. "C1 = [0, 1, 2, 4, 8, 9, 10, 11]\n",
  192. "C2 = list(set(range(12)) - set(C1))\n",
  193. "X0C1, X1C1 = X0[C1], X1[C1]\n",
  194. "X0C2, X1C2 = X0[C2], X1[C2]\n",
  195. "plt.figure()\n",
  196. "plt.title('3rd iteration results')\n",
  197. "plt.axis([-1, 9, -1, 9])\n",
  198. "plt.grid(True)\n",
  199. "plt.plot(X0C1, X1C1, 'rx')\n",
  200. "plt.plot(X0C2, X1C2, 'g.')\n",
  201. "plt.plot(5.5,7.0,'rx',ms=12.0)\n",
  202. "plt.plot(2.2,2.8,'g.',ms=12.0);"
  203. ]
  204. },
  205. {
  206. "cell_type": "markdown",
  207. "metadata": {},
  208. "source": [
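  209. "Repeating the procedure above once more leaves the centroids unchanged. K-Means stops iterating once a stopping condition is met. Typically, the condition is that the change in the cost function between two consecutive iterations falls below a threshold, or that the centroids move less than a threshold between two consecutive iterations. If these thresholds are small enough, K-Means converges to an optimum. However, this optimum is only a local one and is not necessarily the global optimum.\n",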
  210. "\n"
  211. ]
  212. },
  213. {
  214. "cell_type": "markdown",
  215. "metadata": {},
  216. "source": [
  217. "## Program"
  218. ]
  219. },
  220. {
  221. "cell_type": "code",
  222. "execution_count": 10,
  223. "metadata": {},
  224. "outputs": [
  225. {
  226. "data": {
  227. "text/html": [
  228. "<div>\n",
  229. "<style scoped>\n",
  230. " .dataframe tbody tr th:only-of-type {\n",
  231. " vertical-align: middle;\n",
  232. " }\n",
  233. "\n",
  234. " .dataframe tbody tr th {\n",
  235. " vertical-align: top;\n",
  236. " }\n",
  237. "\n",
  238. " .dataframe thead th {\n",
  239. " text-align: right;\n",
  240. " }\n",
  241. "</style>\n",
  242. "<table border=\"1\" class=\"dataframe\">\n",
  243. " <thead>\n",
  244. " <tr style=\"text-align: right;\">\n",
  245. " <th></th>\n",
  246. " <th>sepal-length</th>\n",
  247. " <th>sepal-width</th>\n",
  248. " <th>petal-length</th>\n",
  249. " <th>petal-width</th>\n",
  250. " <th>class</th>\n",
  251. " </tr>\n",
  252. " </thead>\n",
  253. " <tbody>\n",
  254. " <tr>\n",
  255. " <th>0</th>\n",
  256. " <td>5.1</td>\n",
  257. " <td>3.5</td>\n",
  258. " <td>1.4</td>\n",
  259. " <td>0.2</td>\n",
  260. " <td>Iris-setosa</td>\n",
  261. " </tr>\n",
  262. " <tr>\n",
  263. " <th>1</th>\n",
  264. " <td>4.9</td>\n",
  265. " <td>3.0</td>\n",
  266. " <td>1.4</td>\n",
  267. " <td>0.2</td>\n",
  268. " <td>Iris-setosa</td>\n",
  269. " </tr>\n",
  270. " <tr>\n",
  271. " <th>2</th>\n",
  272. " <td>4.7</td>\n",
  273. " <td>3.2</td>\n",
  274. " <td>1.3</td>\n",
  275. " <td>0.2</td>\n",
  276. " <td>Iris-setosa</td>\n",
  277. " </tr>\n",
  278. " <tr>\n",
  279. " <th>3</th>\n",
  280. " <td>4.6</td>\n",
  281. " <td>3.1</td>\n",
  282. " <td>1.5</td>\n",
  283. " <td>0.2</td>\n",
  284. " <td>Iris-setosa</td>\n",
  285. " </tr>\n",
  286. " <tr>\n",
  287. " <th>4</th>\n",
  288. " <td>5.0</td>\n",
  289. " <td>3.6</td>\n",
  290. " <td>1.4</td>\n",
  291. " <td>0.2</td>\n",
  292. " <td>Iris-setosa</td>\n",
  293. " </tr>\n",
  294. " </tbody>\n",
  295. "</table>\n",
  296. "</div>"
  297. ],
  298. "text/plain": [
  299. " sepal-length sepal-width petal-length petal-width class\n",
  300. "0 5.1 3.5 1.4 0.2 Iris-setosa\n",
  301. "1 4.9 3.0 1.4 0.2 Iris-setosa\n",
  302. "2 4.7 3.2 1.3 0.2 Iris-setosa\n",
  303. "3 4.6 3.1 1.5 0.2 Iris-setosa\n",
  304. "4 5.0 3.6 1.4 0.2 Iris-setosa"
  305. ]
  306. },
  307. "execution_count": 10,
  308. "metadata": {},
  309. "output_type": "execute_result"
  310. }
  311. ],
  312. "source": [
  313. "# This line configures matplotlib to show figures embedded in the notebook, \n",
  314. "# instead of opening a new window for each figure. More about that later. \n",
  315. "# If you are using an old version of IPython, try using '%pylab inline' instead.\n",
  316. "%matplotlib inline\n",
  317. "\n",
  318. "# import libraries\n",
  319. "from numpy import *\n",
  320. "import matplotlib.pyplot as plt\n",
  321. "import pandas as pd\n",
  322. "\n",
  323. "# Load dataset\n",
  324. "names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']\n",
  325. "dataset = pd.read_csv(\"iris.csv\", header=0, index_col=0)\n",
  326. "dataset.head()\n"
  327. ]
  328. },
  329. {
  330. "cell_type": "code",
  331. "execution_count": null,
  332. "metadata": {
  333. "lines_to_next_cell": 2
  334. },
  335. "outputs": [],
  336. "source": [
  337. "# Encode the three classes as the integers 0, 1, 2\n",
  338. "dataset.loc[dataset['class'] == 'Iris-setosa', 'class'] = 0\n",
  339. "dataset.loc[dataset['class'] == 'Iris-versicolor', 'class'] = 1\n",
  340. "dataset.loc[dataset['class'] == 'Iris-virginica', 'class'] = 2"
  341. ]
  342. },
  343. {
  344. "cell_type": "code",
  345. "execution_count": null,
  346. "metadata": {
  347. "lines_to_next_cell": 2
  348. },
  349. "outputs": [],
  350. "source": [
  351. "def originalDatashow(dataSet):\n",
  352. "    # Plot the raw sample points\n",
  353. "    num, dim = shape(dataSet)\n",
  354. "    marksamples = ['ob']  # marker style for the samples\n",
  355. "    for i in range(num):\n",
  356. "        plt.plot(dataSet.iat[i, 0], dataSet.iat[i, 1], marksamples[0], markersize=5)\n",
  357. "    plt.title('original dataset')\n",
  358. "    plt.xlabel('sepal length')\n",
  359. "    plt.ylabel('sepal width')\n",
  360. "    plt.show()"
  361. ]
  362. },
  363. {
  364. "cell_type": "code",
  365. "execution_count": null,
  366. "metadata": {
  367. "lines_to_end_of_cell_marker": 2,
  368. "scrolled": true
  369. },
  370. "outputs": [],
  371. "source": [
  372. "# Select the sample data (two features)\n",
  373. "datamat = dataset.loc[:, ['sepal-length', 'sepal-width']]\n",
  374. "# Ground-truth labels\n",
  375. "labels = dataset.loc[:, ['class']]\n",
  376. "# Show the raw data\n",
  377. "originalDatashow(datamat)"
  378. ]
  379. },
  380. {
  381. "cell_type": "code",
  382. "execution_count": 12,
  383. "metadata": {},
  384. "outputs": [],
  385. "source": [
  386. "def randChosenCent(dataSet, k):\n",
  387. "    \"\"\"Initialize the centroids by picking k distinct samples at random.\"\"\"\n",
  388. "\n",
  389. "    # Number of samples\n",
  390. "    m = shape(dataSet)[0]\n",
  391. "    # Indices of the samples chosen as centroids\n",
  392. "    centroidsIndex = []\n",
  393. "    # Indices of the samples that can still be chosen\n",
  394. "    dataIndex = list(range(m))\n",
  395. "    for i in range(k):\n",
  396. "        # Draw a random position in dataIndex (numpy's randint excludes the upper bound)\n",
  397. "        randIndex = random.randint(0, len(dataIndex))\n",
  398. "        # Record the index of the chosen sample\n",
  399. "        centroidsIndex.append(dataIndex[randIndex])\n",
  400. "        # Remove it so the same sample is not chosen twice\n",
  401. "        del dataIndex[randIndex]\n",
  402. "    # Look up the chosen samples by position\n",
  403. "    centroids = dataSet.iloc[centroidsIndex]\n",
  404. "    return mat(centroids)"
  405. ]
  406. },
  407. {
  408. "cell_type": "code",
  409. "execution_count": 15,
  410. "metadata": {},
  411. "outputs": [],
  412. "source": [
  413. "\n",
  414. "def distEclud(vecA, vecB):\n",
  415. "    \"\"\"Euclidean distance between two vectors.\"\"\"\n",
  416. "    return sqrt(sum(power(vecA - vecB, 2)))  # equivalent to la.norm(vecA - vecB)\n",
  417. "\n",
  418. "\n",
  419. "def kMeans(dataSet, k):\n",
  420. "    # Total number of samples\n",
  421. "    m = shape(dataSet)[0]\n",
  422. "    # Assignment of each sample: [cluster index, squared distance] (m rows x 2 columns)\n",
  423. "    clusterAssment = mat(zeros((m, 2)))\n",
  424. "\n",
  425. "    # step1: initialize the centroids from randomly chosen samples\n",
  426. "    centroids = randChosenCent(dataSet, k)\n",
  427. "    print('initial centroids =', centroids)\n",
  428. "\n",
  429. "    # Flag: True if any assignment changed during the current iteration, otherwise False\n",
  430. "    clusterChanged = True\n",
  431. "    # Iteration counter\n",
  432. "    iterTime = 0\n",
  433. "\n",
  434. "    # Iterate until no sample changes its cluster\n",
  435. "    while clusterChanged:\n",
  436. "        clusterChanged = False\n",
  437. "\n",
  438. "        # step2: assign every sample to the cluster of its nearest centroid\n",
  439. "        for i in range(m):\n",
  440. "            # Start with an infinite distance\n",
  441. "            minDist = inf\n",
  442. "            # Start with an invalid cluster index\n",
  443. "            minIndex = -1\n",
  444. "            # Compare the sample against each of the k centroids\n",
  445. "            for j in range(k):\n",
  446. "                # Distance from sample i to centroid j\n",
  447. "                distJI = distEclud(centroids[j, :], dataSet.values[i, :])\n",
  448. "                # Is this the smallest distance so far?\n",
  449. "                if distJI < minDist:\n",
  450. "                    # Remember the smallest distance\n",
  451. "                    minDist = distJI\n",
  452. "                    # and the corresponding cluster index\n",
  453. "                    minIndex = j\n",
  454. "            # The assignment differs from the previous iteration: set the flag\n",
  455. "            if clusterAssment[i, 0] != minIndex:\n",
  456. "                clusterChanged = True\n",
  457. "            clusterAssment[i, :] = minIndex, minDist ** 2  # assign the sample to its nearest cluster\n",
  458. "\n",
  459. "        iterTime += 1\n",
  460. "        sse = sum(clusterAssment[:, 1])\n",
  461. "        print('the SSE of %d' % iterTime + 'th iteration is %f' % sse)\n",
  462. "\n",
  463. "        # step3: update the centroids\n",
  464. "        for cent in range(k):  # recompute each centroid once all samples are assigned\n",
  465. "            # All samples currently in this cluster\n",
  466. "            ptsInClust = dataSet.iloc[nonzero(clusterAssment[:, 0].A == cent)[0]]\n",
  467. "            # Update the centroid: axis=0 averages down the columns\n",
  468. "            centroids[cent, :] = mean(ptsInClust, axis=0)\n",
  469. "    return centroids, clusterAssment\n"
  470. ]
  471. },
  472. {
  473. "cell_type": "code",
  474. "execution_count": 16,
  475. "metadata": {},
  476. "outputs": [
  477. {
  478. "name": "stdout",
  479. "output_type": "stream",
  480. "text": [
  481. "initial centroids = [[5. 3.5]\n",
  482. " [4.9 2.4]\n",
  483. " [7.1 3. ]]\n",
  484. "the SSE of 1th iteration is 68.800000\n",
  485. "the SSE of 2th iteration is 41.374283\n",
  486. "the SSE of 3th iteration is 38.641949\n",
  487. "the SSE of 4th iteration is 38.030526\n",
  488. "the SSE of 5th iteration is 37.513984\n",
  489. "the SSE of 6th iteration is 37.174201\n",
  490. "the SSE of 7th iteration is 37.136261\n",
  491. "the SSE of 8th iteration is 37.123702\n"
  492. ]
  493. }
  494. ],
  495. "source": [
  496. "# Run k-means clustering\n",
  497. "k = 3  # number of clusters, chosen by the user\n",
  498. "mycentroids, clusterAssment = kMeans(datamat, k)"
  499. ]
  500. },
  501. {
  502. "cell_type": "code",
  503. "execution_count": 17,
  504. "metadata": {},
  505. "outputs": [],
  506. "source": [
  507. "def datashow(dataSet, k, centroids, clusterAssment):  # show the clustering result in 2D\n",
  508. "    from matplotlib import pyplot as plt\n",
  509. "    num, dim = shape(dataSet)  # number of samples num, number of dimensions dim\n",
  510. "\n",
  511. "    if dim != 2:\n",
  512. "        print('sorry, the dimension of your dataset is not 2!')\n",
  513. "        return 1\n",
  514. "    marksamples = ['or', 'ob', 'og', 'ok', '^r', '^b', '<g']  # marker styles for the samples\n",
  515. "    if k > len(marksamples):\n",
  516. "        print('sorry, your k is too large, please add more entries to marksamples!')\n",
  517. "        return 1\n",
  518. "    # Plot all samples\n",
  519. "    for i in range(num):\n",
  520. "        markindex = int(clusterAssment[i, 0])  # cluster index, converted from a matrix entry to int\n",
  521. "        # The two features map to the x and y axes; marker style and size follow\n",
  522. "        plt.plot(dataSet.iat[i, 0], dataSet.iat[i, 1], marksamples[markindex], markersize=6)\n",
  523. "\n",
  524. "    # Plot the centroids\n",
  525. "    markcentroids = ['o', '*', '^']  # marker styles for the centroids\n",
  526. "    label = ['0', '1', '2']\n",
  527. "    c = ['yellow', 'pink', 'red']\n",
  528. "    for i in range(k):\n",
  529. "        plt.plot(centroids[i, 0], centroids[i, 1], markcentroids[i], markersize=15, label=label[i], c=c[i])\n",
  530. "    plt.legend(loc='upper left')\n",
  531. "    plt.xlabel('sepal length')\n",
  532. "    plt.ylabel('sepal width')\n",
  533. "\n",
  534. "    plt.title('k-means cluster result')  # title\n",
  535. "    plt.show()\n",
  536. "\n",
  537. "\n",
  538. "# Plot the samples with their true labels\n",
  539. "def trgartshow(dataSet, k, labels):\n",
  540. "    from matplotlib import pyplot as plt\n",
  541. "\n",
  542. "    num, dim = shape(dataSet)\n",
  543. "    label = ['0', '1', '2']\n",
  544. "    marksamples = ['ob', 'or', 'og', 'ok', '^r', '^b', '<g']\n",
  545. "    # Draw the grouped scatter plot one sample at a time\n",
  546. "    for i in range(num):\n",
  547. "        plt.plot(dataSet.iat[i, 0], dataSet.iat[i, 1], marksamples[int(labels.iat[i, 0])], markersize=6)\n",
  548. "    for i in range(0, num, 50):\n",
  549. "        plt.plot(dataSet.iat[i, 0], dataSet.iat[i, 1], marksamples[int(labels.iat[i, 0])], markersize=6,\n",
  550. "                 label=label[int(labels.iat[i, 0])])\n",
  551. "    plt.legend(loc='upper left')\n",
  552. "\n",
  553. "    # Add the axis labels and the title\n",
  554. "    plt.xlabel('sepal length')\n",
  555. "    plt.ylabel('sepal width')\n",
  556. "    plt.title('iris true result')  # title\n",
  557. "\n",
  558. "    # Show the figure\n",
  559. "    plt.show()\n",
  560. "    # label=labels.iat[i,0]"
  561. ]
  562. },
  563. {
  564. "cell_type": "code",
  565. "execution_count": 18,
  566. "metadata": {},
  567. "outputs": [
  568. {
  569. "data": {
  570. "image/png": "\n",
  571. "text/plain": [
  572. "<Figure size 432x288 with 1 Axes>"
  573. ]
  574. },
  575. "metadata": {},
  576. "output_type": "display_data"
  577. },
  578. {
  579. "data": {
  580. "image/png": "\n",
  581. "text/plain": [
  582. "<Figure size 432x288 with 1 Axes>"
  583. ]
  584. },
  585. "metadata": {},
  586. "output_type": "display_data"
  587. }
  588. ],
  589. "source": [
  590. "# Plot the results\n",
  591. "datashow(datamat, k, mycentroids, clusterAssment)\n",
  592. "trgartshow(datamat, 3, labels)"
  593. ]
  594. },
  595. {
  596. "cell_type": "markdown",
  597. "metadata": {},
  598. "source": [
  599. "## How to use sklearn to do the classification\n"
  600. ]
  601. },
  602. {
  603. "cell_type": "code",
  604. "execution_count": 21,
  605. "metadata": {},
  606. "outputs": [
  607. {
  608. "data": {
  609. "text/plain": [
  610. "<Figure size 432x288 with 0 Axes>"
  611. ]
  612. },
  613. "metadata": {},
  614. "output_type": "display_data"
  615. },
  616. {
  617. "data": {
  618. "image/png": "\n",
  619. "text/plain": [
  620. "<Figure size 288x288 with 1 Axes>"
  621. ]
  622. },
  623. "metadata": {
  624. "needs_background": "light"
  625. },
  626. "output_type": "display_data"
  627. }
  628. ],
  629. "source": [
  630. "from sklearn.datasets import load_digits\n",
  631. "import matplotlib.pyplot as plt \n",
  632. "from sklearn.cluster import KMeans\n",
  633. "\n",
  634. "# load the digits dataset\n",
  635. "digits, dig_label = load_digits(return_X_y=True)\n",
  636. "\n",
  637. "# draw one digit\n",
  638. "plt.gray() \n",
  639. "plt.matshow(digits[0].reshape([8, 8])) \n",
  640. "plt.show() \n",
  641. "\n",
  642. "# compute the number of train/test samples\n",
  643. "N = len(digits)\n",
  644. "N_train = int(N*0.8)\n",
  645. "N_test = N - N_train\n",
  646. "\n",
  647. "# split train/test data\n",
  648. "x_train = digits[:N_train, :]\n",
  649. "y_train = dig_label[:N_train]\n",
  650. "x_test = digits[N_train:, :]\n",
  651. "y_test = dig_label[N_train:]\n",
  652. "\n"
  653. ]
  654. },
  655. {
  656. "cell_type": "code",
  657. "execution_count": 28,
  658. "metadata": {},
  659. "outputs": [
  660. {
  661. "data": {
  662. "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAA/CAYAAADAByJpAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAEHNJREFUeJztnWtsVNUWx/97ZjptZzqtVCqUhxT0+kDN1QZB8CZqfIAaJfpBfMRH1OADTPygxpuguWpEwUdqolGJuQokPquQaCwPTa0GTASjeBUEeRR5SC1tebUzbWdm3w90NmvvdqbnnHmcHrp+yYR1WNNz/nPmnDV7r7P23kJKCYZhGMY7+NwWwDAMw9iDAzfDMIzH4MDNMAzjMThwMwzDeAwO3AzDMB6DAzfDMIzH4MDNMAzjMSwFbiHELCHEViHEdiHEk/kWxTpYB+tgHSerjpwgpcz4AuAHsAPAJABBAJsATB7s73L9Yh2sg3WwDq/ryNVL9H2otAghpgP4j5RyZt/2v/sC/gsZ/ibtTiORiLY9duxYZQcCAc134MABZff29uLw4cMQQqDv+Np7pZTCjg6/369tn3HGGcqOx+Oab8+ePcpOJpNIJBLpdmtJh893oqMzfvx47b2nnnqqdixKS0uLsnt6etDW1qb2Zb7X7vkoLy/XtidOnKgdi/Lnn38qO5FIIBaLpdut7fMxYcIE7b2VlZXK/vvvvzXf/v376XH6nQO7OjJRXFysbHqtALr+rq4u7N69G6FQCABw9OjRnOoYNWqUsquqqjTfH3/8oexkMone3t60+7GrY8SIEdr2uHHjlG3eS9FoVLP379+vrq+Ojg4Ax89ZMplEMpkcVAeNCzU1Ndp7S0tL0+qgn7+rqwt79uxR8SelI4Xd83Haaael3c503wJAe3s7Pe6gOgYiMPhbMBbAHrK9F8A0KztPkQq2ADBlyhTN9+KLLyrbvDgWLVqk7F27dqGpqQlFRUUAkDFYWOGUU07Rtt9++21lmyf6scceU3ZXV5d24p1AL7YFCxZovrvuukvZ5md85ZVXlL1582asXLlSBZTOzs6sNF166aXa9tKlS5W9d+9ezffwww8ru62tTQsYTkgFOQB4/vnnNd+cOXOU/frrr2u+Z599Vtnd3d04duxYVjoyQQPV+++/r/nKysqU3dDQgIULF+Kcc84BADQ2NgI4fg8M1kgaCDMY3XnnncqeO3eu5rvuuuuUfezYMbS0tKgflVRjw44Oet9eddVVmm/x4sXKpj+uAPDLL78ou7GxEcuWLcP06dMBACtWrEBvby/Kyspw6NAhSzpoXHj11Vc130UXXaTscDis+VpbW5W9atUq1NXVYerUqQCAjz76yNKxKfS7uOOOOzTf/PnzlW3+WL/22mva9ocffqhs+iNnByuB2xJCiLkA5g76xjzDOlgH62AdXtcxGFYC9z4AtD8/ru//NKSUSwAsAex1/awSCoUstRTyrcNsAbmlIxKJDInzEQwGLb0v3zpousJNHaNGjbLUGyzEdToUro+qqiqtN5hMJge8hwrxvVhp3eZbR66wErg3APiHEGIijgfsWwHcbucgtCtJu3MAMGnSJGUfOXJE882ePVvZiUQCDQ0NKC8vh9/vx19//WVHAgC960e7+wBwySWXKPvxxx/XfPTCc9LdNbniiiuUfdlll2m+d999V9lnn3225rvpppuUHY/HsXz5clRWViIQCGDXrl22ddAu6JIlSzQfPVdmGuaNN97QdEyfPh0VFRXw+Xxoa2uzrePqq69W9jXXXKP5tm/fruwZM2ZovnPPPVfZUkqsW7fO9rEp9DPT1AgAPPnkiSKEVBokBf3M06ZNQ09PD0KhkEqJCSEcp0poKgDQU2srVqzQfDSn6/f74fP5EIlE4PP50N7ejkAgACFEv2c46aDpxIceekjz0edPmzZt0nznn3++squqqvDEE0+ocxKLxVB
cXIxYLGb5fNA03g033KD5tm7dquwvvvhC89FnMclkEu3t7di0aZNKt9qFXm8LFy7UfB9//LGy6XUEALfeequ2/fnnnys7b6kSKWVcCDEfwGocfzL7Xynlb46OlgV+vx8VFRWOAkMuMb8UtwgEAhg5ciQOHDiQkx+TbHSUlZXh8OHDrmkAhtb3Ultbi6ampuNP//uCdqERQiAcDqvGkM/nc0VHIBDAjBkz0NDQACml+kEpND6fD2PHjsXOnTsLfux8YCnHLaX8EsCXedYyKCUlJSgpKQGgVxQMV0KhkHqw5+YFGQwG1QOqgwcPuqZjqDBmzBiMGTMGAPDJJ5+4piMYDKpUltmbLSTjx49X1VMffPCBazrKy8tVdYvZS/AaOXs4mQlaNWDmRHfs2JHWZ7bisq0koSkbs/tCKwWWLVum+XJdrfD7778r26wMoMeiKQkA2LJli7adbe/j4osvVraZGrj99hPZsA0bNmg+s4t+4YUXKvurr76yrYOWV5oVGzRVcv/992s+p13edIwcOVLZtJII0FNa+/bpj3jMUkqaxnPSG6LVEc8995zm2717t7LN78FMJdHr45tvvrGtg+aily9frvm+++47ZZvpi9GjR2vb9DukqYFM5ZsU+r10d3drvhdeOFGVvHbtWs1nlvxlGz9oGShNFQH6DzS9HwA9xQLocchpQ4eHvDMMw3gMDtwMwzAegwM3wzCMxyhIjpvmpcwyvrPOOkvZZn0nzV8B/XPedkk9MAL6j9KkoyWnTdMHhpojA2me0UkOk+b1zWHcTz31lLLN3Nhnn32mbWc7WrK6ulrZ5ujI77//XtlmTvenn37StmnJmpMc9/r165Vtno9Zs2YpO/VgOoWZw8wW+jluueUWzUfzsfQ6AvQReoBehubk+qAlkfTzA3oZq3l9pEYnpli1apWy16xZY1sH/czmtAfXXnutss1nD2bumua8rea1KfTBqpmnfuSRR5RNp2kAgHfeeUfbNqdIsAstPTRjAn0WYU5jYY7ENkd4OoFb3AzDMB6DAzfDMIzHKEiqhHazzNIpmrIwux9mGVqmWfmsQI9FZ+ED9FGJZveUdn0B4Omnn1Y2nVDHCeZnampqUjYdzQkA99xzj7ZNR4r99pv9MVF0siuzO0fLlMxZ5sxuJi33dAI9lrkvOgqvublZ8+U6VUJTYOYkRLR7e/PNN2s+M1XS1dWVlY6ZM2em9dFJ2sxRt2YKxywXtAtNXZqTkNF7xExRmGV55vmxyw8//KDst956S/PRY5ujTM3zU19fr2yzrNAKNMX5zDPPaD5aimmWNZvfJ02lOLlvAW5xMwzDeA5LLW4hRDOAowASAOJSyimZ/+LkZsuWLZYnm8on27Ztc20oM6Wurg7FxcWu6xgqHDx40LWh7pT6+noUFRW5rmPdunXw+/2u6+ju7nZdQ66w0+K+Qkp54XAP2ikmTZqkVcS4RU1NTb+J/d3g7rvvxoMPPui2jCHDiBEj+s1T7QYzZ87EjTfe6LYM1NbW9qvWcoOioiLLM1oOZQqS46ar3phfHs0Rm7lUc7WL5uZm9cttdYYzSqahyDS/aQ7vNS/8iooKzJs3D5FIBPfee6/6f6uaaFlbRUWF5qPldOaE7OZCAqWlpbjyyisRCoUc5co2b96sbPNc0+/MzDubZWiJRAItLS1aztwOdIWTCy64QPPRlWfMMkTzeEIIlJaWQgjhqFSSXh9vvvmm5qOf+frrr9d8ZqmclBLRaNRW646+99dff1U2feYB6Hlc8wf722+/1bZjsZhq7TqB3o/mkHn67MF8BmTmf7u7u7FhwwbHrV1axmcuSkCfA5mLG5x++unathBCzZDoJMdNY4ZZmvzpp58q24xj5oyX9FqiJZt2sBq4JYA1ffPTvt03Z60rZPuAMlcsXrwYQggkk0lXZjtLUV9f73r3TwiBL790fQ4yANnPR5ErnE7XmWvMuni3cBIo88FQuT6yxWrg/peUcp8Q4jQAa4UQv0sptZ/
3QqwckWptSynTBvBC6FiwYAEqKytx5MgRzJ8/f8B8ZiF03HbbbYhEIujs7OzXQiykjtmzZyMcDiMajfaboKuQOkpKSuDz+SClTFvZUQgdpaWlak1FN3WMHj0agUAAiUSi3wCrQupIPf+QUqYNnMPp+sgFlpqKUsp9ff/+DWAFgKkDvGeJlHJKPnPgqeCYqYVZCB2p3GV5eXnaCfILoSOVzsg0EqsQOlLHz5QqKYSOVM/H7esjpSNTT6wQOlIpqEypkuF03w6V6yMXDNriFkKEAfiklEf77GsAPDvIn2nQbpI5RSoNOuY0nXSq0Xg8rv1yO8lx09XaGxoaNN+ZZ545oF5Ar/+ORqMIh8MIhUIqj1lcXIxAIGA5r0pz3Pfdd5/mo7kzc8VzusJ3Z2cnOjo6EAwGtZyanZVWfv75Z2WbPRi6KKv5w0Dz4Z2dnWhqaoLf73c0nBnQV1qZN2+e5qutrVU2nd4T0KfPjMViWL9+PYqLi9HT04OXX34Z4XAYwWDQcr13pqHVNF86UG49RTQaRTAYdLziDaAvJkunAwD08QbmCkkvvfSSsuPxuPbcxgl0MV863BvQr0Xz2QsdfyGlzLjavBXo+b788ss1H109afLkyZqP5p3j8Th6enqy+l7ofWuuCLRx40Zlm3Xr5sLkFPrjbuf+sZIqGQVgRd/FGQDwvpTSWUY9C6LR6JDIGx46dEg9fEkkEggEAtrDtULR2tqKlStXAjjxhbuR625tbXW0bFqu6ejowHvvvQfg+PmgiwgUkra2NtdXAwKGTk55qBCLxRw19oYqVpYu2wngnwXQkpFIJKJVN+R6cQOrVFdXa5PX0HUxC0lNTY02EZI5oqyQOmhvxelIsGyprq7GAw88oLbT5fzzzbhx47QWlltL7eViIqNc4PaD8xRlZWVaj96sBPIaBWkq0haI2a2i3TuzNM7svme74gntiphdnaVLlyrbXG7KLHeiq6PQhxxWuzq0G37eeedpvjlz5ijbbLmZZYp00VHa6rfasqCpAXM4PV381Nzfo48+qm1v27bN0vHSQXtSZgkk7ZKb0yWY76X7od3mH3/80bYm88amvQpz9RPze3JSfke773QKAJquAPRUiamDlnfmAhp0zdV16PlpbGzUfLleA5Xe9+Yi33RGxLq6Os23evVqbTvbYE3vdfOaX7RokbKrqqo0n1ny9/XXXyvb6bniIe8MwzAegwM3wzCMx+DAzTAM4zFErvNRACCEaAXQCcDZEsY6Iy3sZ4KUssr8T9YxpHXstrgP1sE6TgYdVrQMqGNApJR5eQHYOBT2wzqGpg7eB+9jOO0jl/uRUnKqhGEYxmtw4GYYhvEY+QzcuZpBMNv9sI7c/n0u98P74H0Ml33kcj/5eTjJMAzD5A9OlTAMw3iMvARuIcQsIcRWIcR2IcSTWeynWQjxPyHEz0KIjYP/BetgHayDdZxcOgYkV+UppOTFD2AHgEkAggA2AZjscF/NAEayDtbBOljHcNSR7pWPFvdUANullDullD0APgTgxhR6rIN1sA7W4XUdA5KPwD0WwB6yvbfv/5yQWuvyx74lhVgH62AdrGM46RiQwq8AYI9B17pkHayDdbCO4aYjHy3ufQDGk+1xff9nG2lhrUvWwTpYB+s4iXWk3WlOXzjeit8JYCJOJPXPc7CfMIAIsdcDmMU6WAfrYB3DRUe6V85TJVLKuBBiPoDVOP5k9r9SSifrWWW11iXrYB2sg3V4XUc6eOQkwzCMx+CRkwzDMB6DAzfDMIzH4MDNMAzjMThwMwzDeAwO3AzDMB6DAzfDMIzH4MDNMAzjMThwMwzDeIz/A3IZWsVEJuJMAAAAAElFTkSuQmCC\n",
  663. "text/plain": [
  664. "<Figure size 432x288 with 10 Axes>"
  665. ]
  666. },
  667. "metadata": {
  668. "needs_background": "light"
  669. },
  670. "output_type": "display_data"
  671. }
  672. ],
  673. "source": [
  674. "# fit k-means with 10 clusters (one per digit class)\n",
  675. "kmeans = KMeans(n_clusters=10, random_state=0).fit(x_train)\n",
  676. "\n",
  677. "# kmeans.labels_ - output label\n",
  678. "# kmeans.cluster_centers_ - cluster centers\n",
  679. "\n",
  680. "# draw cluster centers\n",
  681. "fig, axes = plt.subplots(nrows=1, ncols=10)\n",
  682. "for i in range(10):\n",
  683. "    img = kmeans.cluster_centers_[i].reshape(8, 8)\n",
  684. "    axes[i].imshow(img)"
  685. ]
  686. },
  687. {
  688. "cell_type": "markdown",
  689. "metadata": {},
  690. "source": [
  691. "## Exercise - How to calculate the accuracy?\n",
  692. "\n",
  693. "1. How can cluster labels be matched to ground-truth labels?\n",
  694. "2. How should digits whose cluster assignment is ambiguous be handled?"
  695. ]
  696. },
  697. {
  698. "cell_type": "markdown",
  699. "metadata": {},
  700. "source": [
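  701. "## Evaluating clustering performance\n",
  702. "\n",
  703. "Method 1: If the data being evaluated comes with ground-truth class labels, use the Adjusted Rand Index (ARI). ARI plays a role similar to accuracy in classification, while also handling the fact that cluster IDs cannot be matched one-to-one with class labels.\n",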
  704. "\n"
  705. ]
  706. },
  707. {
  708. "cell_type": "code",
  709. "execution_count": 31,
  710. "metadata": {},
  711. "outputs": [
  712. {
  713. "name": "stdout",
  714. "output_type": "stream",
  715. "text": [
  716. "ari_train = 0.687021\n"
  717. ]
  718. }
  719. ],
  720. "source": [
  721. "from sklearn.metrics import adjusted_rand_score\n",
  722. "\n",
  723. "ari_train = adjusted_rand_score(y_train, kmeans.labels_)\n",
  724. "print(\"ari_train = %f\" % ari_train)"
  725. ]
  726. },
  727. {
  728. "cell_type": "markdown",
  729. "metadata": {},
  730. "source": [
  731. "Given the contingency table:\n",
  732. "![ARI_ct](images/ARI_ct.png)\n",
  733. "\n",
  734. "the adjusted index is:\n",
  735. "![ARI_define](images/ARI_define.png)\n",
  736. "\n",
  737. "* [ARI reference](https://davetang.org/muse/2017/09/21/adjusted-rand-index/)"
  738. ]
  739. },
  740. {
  741. "cell_type": "markdown",
  742. "metadata": {},
  743. "source": [
  744. "\n",
  745. "\n",
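  746. "Method 2: If the data being evaluated has no class labels, use the Silhouette Coefficient to measure the quality of the clustering. It accounts for both the cohesion and the separation of the clusters, takes values in $[-1, 1]$, and a larger value indicates a better clustering.\n",
  747. "\n",
  748. "The silhouette coefficient is computed as follows:\n",
  749. "1. For the $i$-th sample $x_i$ in the clustered data, compute the mean distance between $x_i$ and all other samples in its own cluster; denote it $a_i$. It quantifies cohesion within the cluster.\n",
  750. "2. For each cluster that does not contain $x_i$, compute the mean distance between $x_i$ and all samples in that cluster; take the smallest of these means over all other clusters and denote it $b_i$. It quantifies separation between clusters.\n",
  751. "3. The silhouette coefficient of sample $x_i$ is $sc_i = \\frac{b_i - a_i}{\\max(b_i, a_i)}$.\n",
  752. "4. Finally, average $sc_i$ over all samples in $\\mathbf{X}$ to obtain the overall silhouette coefficient of the clustering."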
  753. ]
  754. },
  755. {
  756. "cell_type": "code",
  757. "execution_count": 34,
  758. "metadata": {},
  759. "outputs": [
  760. {
  761. "data": {
  762. "image/png": "\n",
  763. "text/plain": [
  764. "<Figure size 720x720 with 6 Axes>"
  765. ]
  766. },
  767. "metadata": {
  768. "needs_background": "light"
  769. },
  770. "output_type": "display_data"
  771. },
  772. {
  773. "data": {
  774. "image/png": "\n",
  775. "text/plain": [
  776. "<Figure size 720x720 with 1 Axes>"
  777. ]
  778. },
  779. "metadata": {
  780. "needs_background": "light"
  781. },
  782. "output_type": "display_data"
  783. }
  784. ],
  785. "source": [
  786. "import numpy as np\n",
  787. "from sklearn.cluster import KMeans\n",
  788. "from sklearn.metrics import silhouette_score\n",
  789. "import matplotlib.pyplot as plt\n",
  790. "\n",
  791. "plt.rcParams['figure.figsize']=(10,10)\n",
  792. "plt.subplot(3,2,1)\n",
  793. "\n",
  794. "x1=np.array([1,2,3,1,5,6,5,5,6,7,8,9,7,9]) # initialize the raw data\n",
  795. "x2=np.array([1,3,2,2,8,6,7,6,7,1,2,1,1,3])\n",
  796. "X=np.array(list(zip(x1,x2))).reshape(len(x1),2)\n",
  797. "\n",
  798. "plt.xlim([0,10])\n",
  799. "plt.ylim([0,10])\n",
  800. "plt.title('Instances')\n",
  801. "plt.scatter(x1,x2)\n",
  802. "\n",
  803. "colors=['b','g','r','c','m','y','k','b']\n",
  804. "markers=['o','s','D','v','^','p','*','+']\n",
  805. "\n",
  806. "clusters=[2,3,4,5,8]\n",
  807. "subplot_counter=1\n",
  808. "sc_scores=[]\n",
  809. "for t in clusters:\n",
  810. "    subplot_counter +=1\n",
  811. "    plt.subplot(3,2,subplot_counter)\n",
  812. "    kmeans_model=KMeans(n_clusters=t).fit(X) # fit the KMeans model\n",
  813. "\n",
  814. "    for i,l in enumerate(kmeans_model.labels_):\n",
  815. "        plt.plot(x1[i],x2[i],color=colors[l],marker=markers[l],ls='None')\n",
  816. "\n",
  817. "    plt.xlim([0,10])\n",
  818. "    plt.ylim([0,10])\n",
  819. "\n",
  820. "    sc_score=silhouette_score(X,kmeans_model.labels_,metric='euclidean') # compute the silhouette coefficient\n",
  821. "    sc_scores.append(sc_score)\n",
  822. "\n",
  823. "    plt.title('k=%s,silhouette coefficient=%0.03f'%(t,sc_score))\n",
  824. "\n",
  825. "plt.figure()\n",
  826. "plt.plot(clusters,sc_scores,'*-') # plot the silhouette coefficient against the number of clusters\n",
  827. "plt.xlabel('Number of Clusters')\n",
  828. "plt.ylabel('Silhouette Coefficient Score')\n",
  829. "\n",
  830. "plt.show() "
  831. ]
  832. },
  833. {
  834. "cell_type": "markdown",
  835. "metadata": {},
  836. "source": [
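  837. "## How to determine 'k'?\n",
  838. "\n",
  839. "The elbow method gives a rough estimate of a reasonable number of clusters. K-means ultimately aims for *the sum of squared distances from every data point to the center of its assigned cluster to level off, so the best number of clusters can be found by watching how this value changes with K. Ideally, as the curve keeps decreasing and flattens out, there is a point where the slope changes sharply (the elbow); from the K at this point onward, adding more cluster centers no longer changes the clustering structure of the data much*.\n",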
  840. "\n"
  841. ]
  842. },
  843. {
  844. "cell_type": "code",
  845. "execution_count": 37,
  846. "metadata": {},
  847. "outputs": [
  848. {
  849. "data": {
  850. "image/png": "\n",
  851. "text/plain": [
  852. "<Figure size 720x720 with 1 Axes>"
  853. ]
  854. },
  855. "metadata": {
  856. "needs_background": "light"
  857. },
  858. "output_type": "display_data"
  859. }
  860. ],
  861. "source": [
  862. "import numpy as np\n",
  863. "from sklearn.cluster import KMeans\n",
  864. "from scipy.spatial.distance import cdist\n",
  865. "import matplotlib.pyplot as plt\n",
  866. "\n",
  867. "cluster1=np.random.uniform(0.5,1.5,(2,10))\n",
  868. "cluster2=np.random.uniform(5.5,6.5,(2,10))\n",
  869. "cluster3=np.random.uniform(3,4,(2,10))\n",
  870. "\n",
  871. "X=np.hstack((cluster1,cluster2,cluster3)).T\n",
  872. "plt.scatter(X[:,0],X[:,1])\n",
  873. "plt.xlabel('x1')\n",
  874. "plt.ylabel('x2')\n",
  875. "plt.show()"
  876. ]
  877. },
  878. {
  879. "cell_type": "code",
  880. "execution_count": 38,
  881. "metadata": {},
  882. "outputs": [
  883. {
  884. "data": {
  885. "image/png": "\n",
  886. "text/plain": [
  887. "<Figure size 720x720 with 1 Axes>"
  888. ]
  889. },
  890. "metadata": {
  891. "needs_background": "light"
  892. },
  893. "output_type": "display_data"
  894. }
  895. ],
  896. "source": [
  897. "K=range(1,10)\n",
  898. "meandistortions=[]\n",
  899. "\n",
  900. "for k in K:\n",
  901. "    kmeans=KMeans(n_clusters=k)\n",
  902. "    kmeans.fit(X)\n",
  903. "    meandistortions.append(sum(np.min(cdist(X,kmeans.cluster_centers_,'euclidean'),axis=1))/X.shape[0])\n",
  904. "\n",
  905. "plt.plot(K,meandistortions,'bx-')\n",
  906. "plt.xlabel('k')\n",
  907. "plt.ylabel('Average Dispersion')\n",
  908. "plt.title('Selecting k with the Elbow Method')\n",
  909. "plt.show()"
  910. ]
  911. },
  912. {
  913. "cell_type": "markdown",
  914. "metadata": {},
  915. "source": [
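  916. "As the figure above shows, when the number of clusters goes from 1 to 2 and then to 3, each change of K alters the overall clustering structure substantially and the average distance drops sharply. This means the algorithm still has plenty of room to improve, so K=1 and K=2 do not reflect the true number of clusters. Once K goes beyond 3, the average distance decreases much more slowly, which means that increasing K further no longer helps much and suggests that K=3 is the best number of clusters for this data."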
  917. ]
  918. }
  919. ],
  920. "metadata": {
  921. "jupytext_formats": "ipynb,py",
  922. "kernelspec": {
  923. "display_name": "Python 3",
  924. "language": "python",
  925. "name": "python3"
  926. },
  927. "language_info": {
  928. "codemirror_mode": {
  929. "name": "ipython",
  930. "version": 3
  931. },
  932. "file_extension": ".py",
  933. "mimetype": "text/x-python",
  934. "name": "python",
  935. "nbconvert_exporter": "python",
  936. "pygments_lexer": "ipython3",
  937. "version": "3.5.2"
  938. }
  939. },
  940. "nbformat": 4,
  941. "nbformat_minor": 2
  942. }
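
The four silhouette steps listed in the notebook can be checked by hand. The following is a minimal sketch, not part of the original notebook: the helper name `silhouette_by_hand` is hypothetical, and `n_clusters=3`, `n_init=10`, and `random_state=0` are arbitrary choices made here for reproducibility. It reuses the 14 toy points from the silhouette cell and compares the result with scikit-learn's `silhouette_score`.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def silhouette_by_hand(X, labels):
    """Mean silhouette coefficient, following steps 1-4 of the notebook (hypothetical helper)."""
    D = cdist(X, X, "euclidean")          # all pairwise distances
    unique_labels = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            # Singleton cluster: score 0 by convention (scikit-learn does the same)
            scores[i] = 0.0
            continue
        # Step 1: mean distance to the other members of the same cluster
        # (the self-distance is 0, so divide by n_same - 1)
        a_i = D[i, same].sum() / (n_same - 1)
        # Step 2: smallest mean distance to any other cluster
        b_i = min(D[i, labels == c].mean() for c in unique_labels if c != labels[i])
        # Step 3: per-sample silhouette coefficient
        scores[i] = (b_i - a_i) / max(a_i, b_i)
    # Step 4: average over all samples
    return scores.mean()


# Same toy data as in the notebook's silhouette cell
x1 = np.array([1, 2, 3, 1, 5, 6, 5, 5, 6, 7, 8, 9, 7, 9])
x2 = np.array([1, 3, 2, 2, 8, 6, 7, 6, 7, 1, 2, 1, 1, 3])
X = np.column_stack((x1, x2)).astype(float)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X).labels_
manual_sc = silhouette_by_hand(X, labels)
sklearn_sc = silhouette_score(X, labels, metric="euclidean")
print("manual = %.6f, sklearn = %.6f" % (manual_sc, sklearn_sc))
```

The explicit double loop is only meant to mirror the written steps; for real data sets `silhouette_score` is the better choice because it is vectorized and chunked.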

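The elbow paragraph in the notebook speaks of the sum of squared distances, while the notebook's code plots the mean (unsquared) Euclidean distance. As a hedged alternative sketch, the squared-distance version can be read directly from the `inertia_` attribute of a fitted `KMeans`. The seed, the `n_init=10` setting, and the variable names below are assumptions made for this example; the three uniform blobs imitate the notebook's synthetic data.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

# Three well-separated uniform blobs, similar to the notebook's synthetic data
rng = np.random.RandomState(0)  # fixed seed (assumption, for reproducibility)
X = np.vstack([
    rng.uniform(0.5, 1.5, (10, 2)),
    rng.uniform(5.5, 6.5, (10, 2)),
    rng.uniform(3.0, 4.0, (10, 2)),
])

ks = list(range(1, 10))
sse = []  # sum of squared distances to the nearest center, one value per k
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    sse.append(model.inertia_)

# Cross-check for k=3: inertia_ should match the squared distances computed by hand
model3 = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
manual_sse3 = cdist(X, model3.cluster_centers_, "sqeuclidean").min(axis=1).sum()

for k, s in zip(ks, sse):
    print("k=%d  SSE=%.3f" % (k, s))
```

Plotting `sse` against `ks` with `plt.plot(ks, sse, 'bx-')` gives an elbow curve analogous to the notebook's figure, with the cost function $J$ itself on the vertical axis.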
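Machine learning is increasingly applied to aircraft, robotics, and related fields, with the goal of using computers to achieve human-like intelligence and thereby make equipment intelligent and unmanned. This course aims to guide students through the fundamentals, typical methods, and techniques of machine learning, to spark interest in the subject through concrete application cases, and to encourage students to analyze and solve the problems and challenges facing aircraft and robots from an artificial-intelligence perspective. The course covers Python programming basics, machine learning models, the fundamentals and implementation of unsupervised learning, supervised learning, and deep learning, and how to apply machine learning to practical problems, so as to improve students' overall abilities.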
