From 1efdf82e07df44bfc2badaa29eea7dcc71738144 Mon Sep 17 00:00:00 2001 From: bushuhui Date: Fri, 7 Aug 2020 19:33:33 +0800 Subject: [PATCH] Improve some English representation --- 3_kmeans/1-k-means.ipynb | 2 +- 3_kmeans/1-k-means_EN.ipynb | 13 ++++++------- 4_logistic_regression/1-Least_squares.ipynb | 2 +- 4_logistic_regression/1-Least_squares_EN.ipynb | 2 +- 4_logistic_regression/2-Logistic_regression.ipynb | 2 +- 4_logistic_regression/2-Logistic_regression_EN.ipynb | 16 ++++++++-------- 6 files changed, 18 insertions(+), 19 deletions(-) diff --git a/3_kmeans/1-k-means.ipynb b/3_kmeans/1-k-means.ipynb index 6de3fda..27c5cbf 100644 --- a/3_kmeans/1-k-means.ipynb +++ b/3_kmeans/1-k-means.ipynb @@ -955,7 +955,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.8" + "version": "3.6.9" } }, "nbformat": 4, diff --git a/3_kmeans/1-k-means_EN.ipynb b/3_kmeans/1-k-means_EN.ipynb index 2fddac6..3d06d3e 100644 --- a/3_kmeans/1-k-means_EN.ipynb +++ b/3_kmeans/1-k-means_EN.ipynb @@ -25,7 +25,7 @@ "J = \\sum_{k=1}^{K} \\sum_{i \\in C_k} | x_i - u_k|^2\n", "$$\n", "\n", - "$u_k$is the centriod poisition$k$个类的重心位置,定义为:\n", + "$u_k$ is the centroid position of the samples in cluster $C_k$, defined as:\n", "$$\n", "u_k = \\frac{1}{|C_k|} \\sum_{x \\in C_k} x\n", "$$\n", @@ -468,7 +468,7 @@ "\n", " # step1: Initialize cluster center by the sample point that generate randomly\n", " centroids = randChosenCent(dataSet, k)\n", - " print('最初的中心=', centroids)\n", + " print('Initial centers=', centroids)\n", "\n", " # Flag bit,if the result of sample classification before and after iteration has changed, the value is True\n", " clusterChanged = True\n", @@ -483,7 +483,7 @@ " for i in range(m):\n", " # Initially define distance as infinite\n", " minDist = inf;\n", - " # Initialize index value初始化索引值\n", + " # Initialize index value\n", " minIndex = -1\n", " # Calculate the distance of each sample and k centriods\n", " for j in 
range(k):\n", @@ -491,7 +491,7 @@ " distJI = distEclud(centroids[j, :], dataSet.values[i, :])\n", " # Judeg if the distance if the minimum\n", " if distJI < minDist:\n", - " # Update to get the minimum distance更新获取到最小距离\n", + " # Update to get the minimum distance\n", " minDist = distJI\n", " # Get corresponding cluster numbers\n", " minIndex = j\n", @@ -808,8 +808,7 @@ "1. For the ith smapel in the clusterded data$x_i$, calculate the average value between $x_i$ and all the other smaple in the same cluster, written as $a_i$, used to quantify the cohesion within a cluster\n", "2. Choose a cluster $b$ outside of $x_i$, calculate the average distance between $x_i$ and all samples in cluster $b$, traverse all other cluster, find the closest average distance and noted as $b_i$, which can be used to quantify the degree of separation between clusters.\n", "3. For sample $x_i$, Silhouette Coefficient is $sc_i = \\frac{b_i−a_i}{max(b_i,a_i)}$ \n", - "4. Finally, calculate average value for all sample $\\mathbf{X}$, which will be the Silhouette Coefficient for current cluster result.\n", - "4. 最后,对所以样本集合$\\mathbf{X}$求出平均值,即为当前聚类结果的整体轮廓系数。" + "4. Finally, average $sc_i$ over all samples in $\\mathbf{X}$; the result is the overall Silhouette Coefficient of the current clustering result." 
] }, { @@ -1004,7 +1003,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.5" + "version": "3.6.9" } }, "nbformat": 4, diff --git a/4_logistic_regression/1-Least_squares.ipynb b/4_logistic_regression/1-Least_squares.ipynb index c883166..115d86b 100644 --- a/4_logistic_regression/1-Least_squares.ipynb +++ b/4_logistic_regression/1-Least_squares.ipynb @@ -5146,7 +5146,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.8" + "version": "3.6.9" } }, "nbformat": 4, diff --git a/4_logistic_regression/1-Least_squares_EN.ipynb b/4_logistic_regression/1-Least_squares_EN.ipynb index 09ca9a7..6fbe12e 100644 --- a/4_logistic_regression/1-Least_squares_EN.ipynb +++ b/4_logistic_regression/1-Least_squares_EN.ipynb @@ -4406,7 +4406,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.8" + "version": "3.6.9" } }, "nbformat": 4, diff --git a/4_logistic_regression/2-Logistic_regression.ipynb b/4_logistic_regression/2-Logistic_regression.ipynb index fa86c44..e3a94a6 100644 --- a/4_logistic_regression/2-Logistic_regression.ipynb +++ b/4_logistic_regression/2-Logistic_regression.ipynb @@ -698,7 +698,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.8" + "version": "3.6.9" } }, "nbformat": 4, diff --git a/4_logistic_regression/2-Logistic_regression_EN.ipynb b/4_logistic_regression/2-Logistic_regression_EN.ipynb index 042df24..e5b2225 100644 --- a/4_logistic_regression/2-Logistic_regression_EN.ipynb +++ b/4_logistic_regression/2-Logistic_regression_EN.ipynb @@ -6,13 +6,13 @@ "source": [ "# Logistic Regression\n", "\n", - "Logistic Regression model is actually apply a logical function at the basis of linear regression. Beacause of this logical function, Logistic regression has become a shining star at the field of machine learning and the core of computational advertising. 
This chapter is mainly introducing the basis of logistic regression. \n", + "The logistic regression model applies a logistic function on top of linear regression. Because this method is simple and effective, it has become a shining star in the field of machine learning and the core of computational advertising. This chapter mainly introduces the basics of logistic regression. \n", "\n", "\n", "## 1. Logestic regression model\n", - "logestic is a model which can be understood easily, which is equal to $y=f(x)$, shows the relationship between independent variable $x$ and dependent variable $y$. The most common questions are like the doctor's observation, hearing, asking and cutting during treatment, and then determining whether the patient is ill or what kind of disease he has. The observation, hearing and cutting is to obtain the independent variable $x$, that is, the characteristic data, while the determination of whether the patient is ill is equivalent to the acquisition of the dependent variable $y$, that is, the prediction classification.\n", + "Logistic regression is an easily understood model of the form $y=f(x)$, which describes the relationship between the independent variable $x$ and the dependent variable $y$. A common example is a doctor's observing, listening, questioning, and pulse-taking during treatment, followed by determining whether the patient is ill or what kind of disease he has. The observing, listening, questioning, and pulse-taking obtain the independent variable $x$, that is, the feature data, while determining whether the patient is ill corresponds to obtaining the dependent variable $y$, that is, the predicted class.\n", "\n", - "The most simple regression is linear regression. Using Andrew NG's handout, as shown in the figure, $X$ is number point -- tumor size, $Y$ is the observed value -- whether a malignant tumor or not. 
By building a linear regression model, as shown in $h_\\theta(x)$, we can predict whether $h_\\theta(x)) \\ GE 0.5$ is malignant and $h_\\theta(x) \\lt 0.5$ is benign according to the size of the tumor.\n", + "The simplest regression is linear regression. Following Andrew Ng's handout, as shown in the figure, $X$ is the data point (tumor size) and $Y$ is the observed value (whether the tumor is malignant). By building a linear regression model $h_\\theta(x)$, we can predict from the size of the tumor that it is malignant when $h_\\theta(x) \\ge 0.5$ and benign when $h_\\theta(x) \\lt 0.5$.\n", "\n", "![LinearRegression](images/fig1.gif)\n", "\n", @@ -24,12 +24,12 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 1, "metadata": {}, "outputs": [ { "data": { - "image/png": "\n", + "image/png": "\n", "text/plain": [ "
" ] @@ -61,13 +61,13 @@ "source": [ "### 1.1 Logestic regression expression\n", "\n", - "This function is called logestic function, which is also called sigmoid function. The formula of function is as following:\n", + "This function is called the logistic function, also known as the sigmoid function. It is defined as:\n", "\n", "$$\n", "g(z) = \\frac{1}{1+e^{-z}}\n", "$$\n", "\n", - "For logistic function, when z is approach infinity, g(z) is approach to 1. While z is approach minus infinity, g(z) is approach to 0. The graph for logestic function is shown as upper figure. Logistic funciton has an attribute when doing derivative which will be used in the following derivation, the characteristic is:\n", + "For the logistic function, as $z$ approaches infinity, $g(z)$ approaches 1; as $z$ approaches negative infinity, $g(z)$ approaches 0. The graph of the logistic function is shown in the figure above. The logistic function has a derivative property that will be used in the following derivation:\n", "\n", "$$\n", "g'(z) = \\frac{d}{dz} \\frac{1}{1+e^{-z}} \\\\\n", @@ -700,7 +700,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.8" + "version": "3.6.9" } }, "nbformat": 4,