{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Data Structures 2\n", "\n", "## 1. Strings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Strings are ordered, text-based data, written by enclosing the text in single, double, or triple quotes." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "String0 = 'Taj Mahal is beautiful'\n", "String1 = \"Taj Mahal is beautiful\"\n", "String2 = '''Taj Mahal\n", "is\n", "beautiful'''" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Taj Mahal is beautiful <class 'str'>\n", "Taj Mahal is beautiful <class 'str'>\n", "Taj Mahal\n", "is\n", "beautiful <class 'str'>\n" ] } ], "source": [ "print(String0, type(String0))\n", "print(String1, type(String1))\n", "print(String2, type(String2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "String indexing and slicing work the same way as for lists, which were explained in detail earlier." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "M\n", "Mahal is beautiful\n" ] } ], "source": [ "print(String0[4])\n", "print(String0[4:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.1 Built-in Functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**find( )** returns the index at which the given substring first occurs in the string. If the substring is not found, it returns **-1**. Take care not to confuse this returned -1 with the reverse index -1, which refers to the last character." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Taj Mahal is beautiful\n", "7\n", "-1\n" ] } ], "source": [ "print(String0)\n", "print(String0.find('al'))\n", "print(String0.find('am'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The returned index is the position of the first character of the substring that was searched for." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a\n" ] } ], "source": [ "print(String0[7])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**find( )** also accepts optional start and end indices, so that the search is limited to the part of the string between them." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ 
{ "name": "stdout", "output_type": "stream", "text": [ "2\n", "2\n" ] } ], "source": [ "print(String0.find('j',1))\n", "print(String0.find('j',1,3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**capitalize( )** capitalizes the first character of the string and converts the remaining characters to lowercase." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Observe the first letter in this sentence. can you change this sentence\n" ] } ], "source": [ "String3 = 'observe the first letter in this sentence. can you change this sentence'\n", "print(String3.capitalize())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**center( )** centers the string within a field of the specified width." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'                        Taj Mahal is beautiful                        '" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "String0.center(70)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The remaining space on either side can also be filled with any other character." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'------------------------Taj Mahal is beautiful------------------------'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "String0.center(70,'-')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**zfill( )** pads the string on the left with zeros until it reaches the specified field width." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'00000000Taj Mahal is beautiful'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "String0.zfill(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**expandtabs( )** lets you change the spacing of tab characters. By default, '\\t' is expanded to tab stops that are 8 columns apart." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "h\te\tl\tl\to\n", "h e l l o\n", "h   e   l   l   o\n" ] } ], "source": [ "s = 'h\\te\\tl\\tl\\to'\n", "print(s)\n", "print(s.expandtabs(1))\n", "print(s.expandtabs(4))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "**index( )** works the same way as **find( )**. The only difference is that **find( )** returns -1 when the substring is not found in the string, whereas **index( )** raises a ValueError." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "4\n" ] }, { "ename": "ValueError", "evalue": "substring not found", "output_type": "error", "traceback": [ "\u001b[0;31m------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0mTraceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Taj'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m 
\u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Mahal'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mString0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Mahal'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m20\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mValueError\u001b[0m: substring not found" ] } ], "source": [ "print(String0.index('Taj'))\n", "print(String0.index('Mahal',0))\n", "print(String0.index('Mahal',10,20))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**endswith( )** checks whether the string ends with the character or substring given as input." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "False\n" ] } ], "source": [ "print(String0.endswith('y'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start and stop indices can also be specified." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "True\n" ] } ], "source": [ "print(String0.endswith('l',0))\n", "print(String0.endswith('M',0,5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**count( )** counts how many times the given character or substring occurs in the string. Start and stop indices can also be specified or left out. (They are optional arguments, a topic that is dealt with under functions.)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4\n", "2\n" ] } ], "source": [ "print(String0.count('a',0))\n", "print(String0.count('a',5,10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**join( )** inserts the string it is called on between the elements of the input sequence." ] }, { "cell_type": "code", 
"execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'*a_a-'" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "'a'.join('*_-')" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'1\\n2'" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "'\\n'.join(['1', '2'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "'*_-' 是输入字符串而字符'a'被添加在每一个元素之间。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**join( )** 函数也可以被用来将列表转化为字符串。" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['T', 'a', 'j', ' ', 'M', 'a', 'h', 'a', 'l', ' ', 'i', 's', ' ', 'b', 'e', 'a', 'u', 't', 'i', 'f', 'u', 'l']\n", "Taj Mahal is beautiful\n" ] } ], "source": [ "a = list(String0)\n", "print(a)\n", "b = ''.join(a)\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在将它转化成字符串之前，**join( )** 函数可以被用来在列表元素中插入任意的字符。" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " /i/s/ /b/e/a/u/t/i/f/u/l\n" ] } ], "source": [ "c = '/'.join(a)[18:]\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**split( )** 函数被用来将一个字符串转化为列表。把它想成与**join()** 相反地函数。" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[' ', 'i', 's', ' ', 'b', 'e', 'a', 'u', 't', 'i', 'f', 'u', 'l']\n" ] } ], "source": [ "d = c.split('/')\n", "print(d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在 **split( )** 函数中，还可以指定分割字符串的次数，或者新返回列表应该包含的元素数量。元素的数量总是比指定的数量多1，这是因为它被分割了指定的次数。" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[' ', 'i', 's', ' 
/b/e/a/u/t/i/f/u/l']\n", "4\n" ] } ], "source": [ "e = c.split('/',3)\n", "print(e)\n", "print(len(e))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**lower( )** converts all uppercase letters to lowercase." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Taj Mahal is beautiful\n", "taj mahal is beautiful\n" ] } ], "source": [ "print(String0)\n", "print(String0.lower())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**upper( )** converts all lowercase letters to uppercase." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'TAJ MAHAL IS BEAUTIFUL'" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "String0.upper()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**replace( )** replaces every occurrence of one substring with another string." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Bengaluru is beautiful'" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "String0.replace('Taj Mahal','Bengaluru')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**strip( )** removes unwanted characters from the left and right ends of the string." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "f = ' hello '" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If no characters are specified, it removes all whitespace on the left and right of the data." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'hello'" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f.strip()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When characters are specified, **strip( )** removes those characters from both ends of the string for as long as they appear there." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "f = ' ***----hello---******* '" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "' ***----hello---******* '" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f.strip('*')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "必须删除星号，但没有。这是因为在左边和右边都有一个空格。在strip函数中。字符需要按照它们出现的特定顺序输入。" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "----hello---\n", "hello\n" ] } ], "source": [ "print(f.strip(' *'))\n", "print(f.strip(' *-'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**lstrip( )** 和 **rstrip( )** 函数具有与strip函数相同的功能，但唯一的区别是**lstrip()** 只删除左边的内容，而**rstrip()** 只删除右边的内容。" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "----hello---******* \n", " ***----hello---\n" ] } ], "source": [ "print(f.lstrip(' *'))\n", "print(f.rstrip(' *'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. 词典" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "词典更像数据库，因为在这里你可以用用户定义的字符串索引特定的序列。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "为了定义一个词典，让一个变量和{ }或dict()相等。" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " \n" ] } ], "source": [ "d0 = {}\n", "d1 = dict()\n", "print(type(d0), type(d1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "词典的工作方式有点像列表，但增加了分配自己索引样式的功能。" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 1, 'OneTwo': 12}\n" ] } ], "source": [ "d0['One'] = 1\n", "d0['OneTwo'] = 12 \n", "print(d0)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'key1': 1, 'key2': [1, 2, 4], 3: (1, 4, 6)}\n" ] } ], "source": [ "d1 = {\"key1\":1, \"key2\":[1,2,4], 3:(1, 4, 6)}\n", "print(d1)" ] }, 
{ "cell_type": "markdown", "metadata": {}, "source": [ "This is what a dictionary looks like. The value 1 stored in d0 can now be accessed through its key 'One'." ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n" ] } ], "source": [ "print(d0['One'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two related lists can be merged to form a dictionary." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "names = ['One', 'Two', 'Three', 'Four', 'Five']\n", "numbers = [1, 2, 3, 4, 5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**zip( )** pairs up the elements of the two lists." ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}\n" ] } ], "source": [ "d2 = zip(names,numbers)\n", "print(dict(d2))" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}\n" ] } ], "source": [ "d3 = {names[i]:numbers[i] for i in range(len(names))}\n", "print(d3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**zip( )** combines the two lists into a sequence of pairs, matching each element of one list with the corresponding element of the other. Each pair is a tuple, because the pairing is fixed once it is assigned and the values should not change. In Python 3, zip returns an iterator, so it can be consumed only once and has to be created again before it is reused.\n", "\n", "To convert these pairs into a dictionary, we can use the **dict( )** function." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}\n" ] } ], "source": [ "d2 = zip(names,numbers)\n", "\n", "a1 = dict(d2)\n", "print(a1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.1 Built-in Functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**clear( )** erases every entry of the dictionary, leaving it empty." ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{}\n" ] } ], "source": [ "a1 = {1:10, 2:20}\n", "a1.clear()\n", "print(a1)" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "字典也可以使用循环来构建。" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}\n" ] } ], "source": [ "a1 = {names[i]:numbers[i] for i in range(len(names))}\n", "print(a1)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}\n" ] } ], "source": [ "for i in range(len(names)):\n", " a1[names[i]] = numbers[i]\n", "print(a1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**values( )** 函数返回了一个包含字典中所有赋值的列表" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict_values([1, 2, 3, 4, 5])" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a1.values()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**keys( )** 函数返回包含赋值的所有索引或键。" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict_keys(['One', 'Two', 'Three', 'Four', 'Five'])" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a1.keys()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**items()** 返回一个列表同时也包含该列表，但是字典中的每个元素都在一个元组中。这与使用zip函数得到的结果相同。" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ One] 1\n", "[ Two] 2\n", "[ Three] 3\n", "[ Four] 4\n", "[ Five] 5\n" ] } ], "source": [ "a1.items()\n", "\n", "for (k,v) in a1.items():\n", " print(\"[%6s] %d\" % (k, v))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**pop()** 函数用于删除特定的元素，并且这个删除的元素可以被分配给一个新的变量。但是请记住，只存储值而不存储键。因为它只是一个索引值。" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { 
"ename": "TypeError", "evalue": "pop expected at least 1 arguments, got 0", "output_type": "error", "traceback": [ "\u001b[0;31m------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0mTraceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0ma2\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0ma1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma2\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mTypeError\u001b[0m: pop expected at least 1 arguments, got 0" ] } ], "source": [ "a2 = a1.pop()\n", "print(a1)\n", "print(a2)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 1 }