
Python String Summary: Worth Bookmarking! (Part 2)

2022-05-20 10:59:48


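　　String Methods

　　str.split(sep=None, maxsplit=-1): The split method takes two parameters, sep and maxsplit. When called with their default values, it splits the string at every run of whitespace. The method returns a list of strings: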

　　string = "Apple, Banana, Orange, Blueberry"

　　print(string.split())

　　Output:

　　['Apple,', 'Banana,', 'Orange,', 'Blueberry']

　　The string is not split cleanly, because the pieces still contain the comma. We can pass sep=',' to split at each comma instead:

　　print(string.split(sep=','))

　　Output:

　　['Apple', ' Banana', ' Orange', ' Blueberry']

　　This is better than the previous split, but some of the pieces now begin with a space. Using sep=', ' removes it:

　　# Notice the whitespace after the comma

　　print(string.split(sep=', '))

　　Output:

　　['Apple', 'Banana', 'Orange', 'Blueberry']

　　Now the string is split cleanly. Sometimes we do not want to split at every separator; the maxsplit parameter sets the maximum number of splits to perform:

　　print(string.split(sep=', ', maxsplit=1))

　　print(string.split(sep=', ', maxsplit=2))

　　Output:

　　['Apple', 'Banana, Orange, Blueberry']

　　['Apple', 'Banana', 'Orange, Blueberry']
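　　Two details of split are easy to miss: the default call collapses runs of whitespace, while an explicit sep=' ' does not, and rsplit applies maxsplit from the right. The following is a minimal sketch; the sample strings are made up for illustration.

```python
# Default split() treats any run of whitespace as one separator
# and ignores leading and trailing whitespace.
messy = "  Apple   Banana\tOrange\n"
default_split = messy.split()

# An explicit single-space separator keeps the empty string that
# sits between two consecutive spaces.
spaced = "a  b"
explicit_split = spaced.split(" ")

# rsplit() counts maxsplit from the right-hand side instead.
fruits = "Apple, Banana, Orange, Blueberry"
right_split = fruits.rsplit(", ", maxsplit=1)

print(default_split)   # ['Apple', 'Banana', 'Orange']
print(explicit_split)  # ['a', '', 'b']
print(right_split)     # ['Apple, Banana, Orange', 'Blueberry']
```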

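　　str.splitlines(keepends=False): Sometimes we want to process a corpus that uses different line boundaries ('\n', '\n\n', '\r', '\r\n'), and we want to split it into lines rather than into individual words. The splitlines method does this. When keepends=True, the line-break characters are kept in the result; otherwise they are excluded:

　　import nltk # You may have to `pip install nltk` to use this library.

　　# The Gutenberg corpus may also need a one-time download: nltk.download('gutenberg')

　　macbeth = nltk.corpus.gutenberg.raw('shakespeare-macbeth.txt')

　　print(macbeth.splitlines(keepends=True)[:5])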

　　Output:

　　['[The Tragedie of Macbeth by William Shakespeare 1603]\n', '\n', '\n', 'Actus Primus. Scoena Prima.\n', '\n']
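　　The same behavior can be checked without downloading a corpus. This small sketch uses a made-up string that mixes three line-ending styles and compares the two keepends settings.

```python
# A made-up sample mixing '\n', '\r\n', and '\r' line endings.
text = "first line\nsecond line\r\nthird line\rfourth line"

without_ends = text.splitlines()             # line breaks dropped
with_ends = text.splitlines(keepends=True)   # line breaks kept

print(without_ends)
# ['first line', 'second line', 'third line', 'fourth line']
print(with_ends)
# ['first line\n', 'second line\r\n', 'third line\r', 'fourth line']
```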

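　　str.strip([chars]): The strip method removes leading and trailing whitespace, or the characters given in chars, from both ends of a string. For example: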

　　string = " Apple Apple Apple no apple in the box apple apple "

　　stripped_string = string.strip()

　　print(stripped_string)

　　left_stripped_string = (

　　 stripped_string

　　 .lstrip('Apple')

　　 .lstrip()

　　 .lstrip('Apple')

　　 .lstrip()

　　 .lstrip('Apple')

　　 .lstrip()

　　)

　　print(left_stripped_string)

　　capitalized_string = left_stripped_string.capitalize()

　　print(capitalized_string)

　　right_stripped_string = (

　　 capitalized_string

　　 .rstrip('apple')

　　 .rstrip()

　　 .rstrip('apple')

　　 .rstrip()

　　)

　　print(right_stripped_string)

　　Output:

　　Apple Apple Apple no apple in the box apple apple

　　no apple in the box apple apple

　　No apple in the box apple apple

　　No apple in the box
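　　One caveat about the chars argument: it is treated as a set of characters, not as a prefix or suffix. The strip example works because each run of 'Apple' or 'apple' is followed by a space, which is not in the set. The sketch below, with a made-up file name, shows where this can surprise, along with a slicing alternative for removing an exact suffix.

```python
name = "latest.txt"
suffix = ".txt"

# rstrip('.txt') removes any trailing '.', 't', or 'x' characters,
# so it also eats the final 't' of 'latest'.
set_based = name.rstrip(suffix)

# Removing the exact suffix: check with endswith(), then slice.
# (Python 3.9+ also offers str.removesuffix for this.)
exact = name[:-len(suffix)] if name.endswith(suffix) else name

print(set_based)  # lates
print(exact)      # latest
```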

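　　In the strip example, we used the lstrip and rstrip methods, which remove leading characters from the left side and trailing characters from the right side of the string, respectively (whitespace by default). We also used the capitalize method, which converts the string to sentence case.

　　str.zfill(width): The zfill method pads the string on the left with 0 characters until it reaches the specified width. For example: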

　　example = "0.8" # len(example) is 3

　　example_zfill = example.zfill(5) # len(example_zfill) is 5

　　print(example_zfill)

　　Output:

　　000.8
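　　zfill also handles a leading sign and leaves strings that are already wide enough unchanged. A short sketch with made-up values:

```python
# The sign stays in front; zeros are inserted after it.
negative = "-42".zfill(5)
positive = "+7".zfill(4)

# A string already at least `width` long is returned unchanged.
unchanged = "12345".zfill(3)

print(negative)   # -0042
print(positive)   # +007
print(unchanged)  # 12345
```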

　　str.isalpha(): Returns True if the string is non-empty and all of its characters are alphabetic; otherwise it returns False:

　　# Alphabet string

　　alphabet_one = "Learning"

　　print(alphabet_one.isalpha())

　　# Contains whitespace

　　alphabet_two = "Learning Python"

　　print(alphabet_two.isalpha())

　　# Contains a comma

　　alphabet_three = "Learning,"

　　print(alphabet_three.isalpha())

　　Output:

　　True

　　False

　　False

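　　str.isalnum() returns True if all characters in the string are alphanumeric; str.isdecimal() returns True if they are all decimal characters (the digits used to write base-10 numbers); str.isdigit() returns True if they are all digits, which also covers characters such as superscript digits; and str.isnumeric() returns True if they are all numeric characters, which additionally covers characters such as fractions.

　　str.islower() returns True if all cased characters in the string are lowercase; str.isupper() returns True if all cased characters are uppercase; and str.istitle() returns True if the first letter of every word is capitalized: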
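　　The three numeric checks are easy to confuse, so here is a small sketch that runs them side by side. The sample strings are chosen for illustration: plain digits, the superscript two (U+00B2), the fraction one half (U+00BD), and a number with a decimal point.

```python
samples = {
    "plain digits": "123",
    "superscript two": "\u00b2",
    "one half": "\u00bd",
    "decimal point": "12.5",
}

# For each sample, record (isdecimal, isdigit, isnumeric).
results = {
    label: (text.isdecimal(), text.isdigit(), text.isnumeric())
    for label, text in samples.items()
}

for label, flags in results.items():
    print(label, flags)
# plain digits (True, True, True)
# superscript two (False, True, True)
# one half (False, False, True)
# decimal point (False, False, False)
```

　　Note that none of these methods accepts '12.5', because the decimal point is not a digit; validating floats usually means attempting float() inside a try block instead.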

　　# islower() example

　　string_one = "Artificial Neural Network"

　　print(string_one.islower())

　　string_two = string_one.lower() # converts string to lowercase

　　print(string_two.islower())

　　# isupper() example

　　string_three = string_one.upper() # converts string to uppercase

　　print(string_three.isupper())

　　# istitle() example

　　print(string_one.istitle())

　　Output:

　　False

　　True

　　True

　　True

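　　str.endswith(suffix) returns True if the string ends with the specified suffix, and str.startswith(prefix) returns True if the string starts with the specified prefix. Both also accept a tuple of candidates: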

　　sentences = ['Time to master data science', 'I love statistical computing', 'Eat, sleep, code']

　　# endswith() example

　　for one_sentence in sentences:

　　 print(one_sentence.endswith(('science', 'computing', 'Code')))

　　Output:

　　True

　　True

　　False

　　# startswith() example

　　for one_sentence in sentences:

　　 print(one_sentence.startswith(('Time', 'I ', 'Ea')))

　　Output:

　　True

　　True

　　True
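　　The third endswith result is False because both methods are case-sensitive: 'Eat, sleep, code' ends with 'code', not 'Code'. One common way to make the check case-insensitive, sketched here as an option rather than the only approach, is to normalize the case first.

```python
sentence = "Eat, sleep, code"

# Case-sensitive check: 'Code' does not match 'code'.
exact_match = sentence.endswith("Code")

# Lowercase both sides before comparing.
relaxed_match = sentence.lower().endswith("Code".lower())

print(exact_match)    # False
print(relaxed_match)  # True
```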

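　　str.find(substring) returns the lowest index at which the substring occurs in the string, or -1 if it does not occur. str.rfind(substring) returns the highest index. str.index(substring) and str.rindex(substring) likewise return the lowest and highest index of the substring when it is found, but they raise a ValueError when the substring is not in the string: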

　　string = "programming"

　　# find() and rfind() examples

　　print(string.find('m'))

　　print(string.find('pro'))

　　print(string.rfind('m'))

　　print(string.rfind('game'))

　　# index() and rindex() examples

　　print(string.index('m'))

　　print(string.index('pro'))

　　print(string.rindex('m'))

　　print(string.rindex('game'))

　　Output:

　　6

　　0

　　7

　　-1

　　6

　　0

　　7

　　---------------------------------------------------------------------------

　　ValueError Traceback (most recent call last)

　　~\AppData\Local\Temp/ipykernel_11336/3954098241.py in

　　 11 print(string.index('pro')) # Output: 0

　　 12 print(string.rindex('m')) # Output: 7

　　---> 13 print(string.rindex('game')) # Output: ValueError: substring not found

　　ValueError: substring not found
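　　Since index raises while find returns -1, the two call for different handling. The sketch below wraps index in a helper named locate, a hypothetical name used only for this illustration, and shows the equivalent find check.

```python
def locate(text, substring):
    """Hypothetical helper: return the lowest index, or None if absent."""
    try:
        return text.index(substring)
    except ValueError:
        return None

word = "programming"

found_at = locate(word, "gram")   # substring present
missing = locate(word, "game")    # substring absent

# With find(), compare against -1 rather than relying on truthiness,
# because a match at index 0 is falsy.
starts_with_pro = word.find("pro") != -1

print(found_at)         # 3
print(missing)          # None
print(starts_with_pro)  # True
```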

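　　str.maketrans(dict_map) creates a translation table from a dictionary mapping, and str.translate(table) replaces each character of the string that appears in the table with its mapped value. For example: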

　　example = "abcde"

　　mapped = {'a':'1', 'b':'2', 'c':'3', 'd':'4', 'e':'5'}

　　print(example.translate(example.maketrans(mapped)))

　　Output:

　　12345
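　　Besides a dictionary, str.maketrans also accepts two equal-length strings, plus an optional third string of characters to delete. A brief sketch with made-up inputs:

```python
# Two-string form: 'a' -> '1', 'b' -> '2', 'c' -> '3'.
digits_table = str.maketrans("abc", "123")
replaced = "aabbcc".translate(digits_table)

# Three-argument form: every character in the third string is deleted.
punctuation_table = str.maketrans("", "", ",.!")
cleaned = "Hello, world!".translate(punctuation_table)

print(replaced)  # 112233
print(cleaned)   # Hello world
```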

　　String Operations

　　Looping Through a String

　　Strings are iterable, so they support looping with a for loop and with enumerate:

　　# For-loop example

　　word = "bank"

　　for letter in word:

　　 print(letter)

　　Output:

　　b

　　a

　　n

　　k

　　# Enumerate example

　　for idx, value in enumerate(word):

　　 print(idx, value)

　　Output:

　　0 b

　　1 a

　　2 n

　　3 k

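　　Strings and Relational Operators

　　When two strings are compared with relational operators (>, <, ==, and so on), their elements are compared index by index using each character's code point, which for ASCII characters is its ASCII decimal value. For example: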

　　print('a' > 'b')

　　print('abc' > 'b')

　　Output:

　　False

　　False

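　　In both cases the output is False. The relational operator first compares the ASCII values of the elements at index 0 of the two strings. Since b is greater than a, it returns False; in this case the ASCII values of the remaining elements and the lengths of the strings do not matter.

　　When the first characters are equal, the comparison continues from index 0, element by element, until it finds a pair of characters with different ASCII values. For example: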

　　print('abd' > 'abc')

　　Output:

　　True
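　　The code points behind these comparisons can be inspected with ord. The sketch below also covers two cases the examples above do not: a string compared with its own prefix, and uppercase versus lowercase letters. The word list is made up for illustration.

```python
# Code points used by the comparison.
code_a, code_b = ord("a"), ord("b")   # 97 and 98

# If one string is a prefix of the other, the shorter one is smaller.
prefix_is_smaller = "abc" < "abcd"

# Uppercase letters have lower code points than lowercase ones.
upper_before_lower = "Z" < "a"        # 90 < 97

# This affects sorting; key=str.lower gives a case-insensitive order.
words = ["banana", "apple", "Cherry"]
default_order = sorted(words)
relaxed_order = sorted(words, key=str.lower)

print(code_a, code_b)      # 97 98
print(prefix_is_smaller)   # True
print(upper_before_lower)  # True
print(default_order)       # ['Cherry', 'apple', 'banana']
print(relaxed_order)       # ['apple', 'banana', 'Cherry']
```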

　　Checking String Membership

　　The in operator checks whether a substring is a member of a string:

　　print('data' in 'dataquest')

　　print('gram' in 'programming')

　　Output:

　　True

　　True

　　Another way to check string membership, replace substrings, or match patterns is to use regular expressions:

　　import re

　　substring = 'gram'

　　string = 'programming'

　　replacement = '1234'

　　# Check membership

　　print(re.search(substring, string))

　　# Replace string

　　print(re.sub(substring, replacement, string))

　　Output:

　　<re.Match object; span=(3, 7), match='gram'>

　　pro1234ming
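　　re.search returns a match object when the pattern is found and None otherwise, so a membership test usually checks for None. Also, characters such as + have a special meaning in patterns; re.escape makes them literal. A minimal sketch with made-up inputs:

```python
import re

# A successful search yields a match object with position details.
match = re.search("gram", "programming")
found = match is not None
span = match.span()

# An unsuccessful search yields None.
not_found = re.search("game", "programming") is None

# '1+1' as a pattern means "one or more 1s, then a 1", so it does not
# match the literal text '1+1'. re.escape() treats '+' literally.
unescaped_hit = re.search("1+1", "1+1=2") is not None
escaped_hit = re.search(re.escape("1+1"), "1+1=2") is not None

print(found, span)    # True (3, 7)
print(not_found)      # True
print(unescaped_hit)  # False
print(escaped_hit)    # True
```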

　　String Formatting

　　f-strings and the str.format() method are used to format strings. Both use curly-brace {} placeholders. For example:

　　monday, tuesday, wednesday = "Monday", "Tuesday", "Wednesday"

　　format_string_one = "{} {} {}".format(monday, tuesday, wednesday)

　　print(format_string_one)

　　format_string_two = "{2} {1} {0}".format(monday, tuesday, wednesday)

　　print(format_string_two)

　　format_string_three = "{one} {two} {three}".format(one=tuesday, two=wednesday, three=monday)

　　print(format_string_three)

　　format_string_four = f"{monday} {tuesday} {wednesday}"

　　print(format_string_four)

　　Output:

　　Monday Tuesday Wednesday

　　Wednesday Tuesday Monday

　　Tuesday Wednesday Monday

　　Monday Tuesday Wednesday

　　f-strings are more readable, and they run faster than the str.format() method. For these reasons, f-strings are the preferred way to format strings.
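　　Both f-strings and str.format() accept a format specification after a colon inside the braces. The sketch below shows a few common ones; the variable names and values are made up for illustration.

```python
price = 3.14159
count = 42
name = "Ada"
ratio = 0.256

two_decimals = f"{price:.2f}"   # fixed-point, two decimal places
zero_padded = f"{count:05d}"    # integer padded with zeros to width 5
right_aligned = f"{name:>6}"    # right-aligned in a field of width 6
percentage = f"{ratio:.1%}"     # multiplied by 100, with a % sign

# The same specification works with str.format().
same_with_format = "{:.2f}".format(price)

print(two_decimals)      # 3.14
print(zero_padded)       # 00042
print(right_aligned)     #    Ada
print(percentage)        # 25.6%
print(same_with_format)  # 3.14
```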

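　　Handling Quotes and Apostrophes

　　The apostrophe (') delimits a string in Python. To tell Python that an apostrophe is part of the text rather than the end of the string, we must use the Python escape character (\). An apostrophe inside a single-quoted string is therefore written as \'. Unlike apostrophes, quotation marks can be handled in several ways in Python, including the following: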

　　# 1. Represent string with single quotes (`''`) and quoted statement with double quotes (`""`)

　　quotes_one = '"Friends don\'t let friends use minibatches larger than 32" - Yann LeCun'

　　print(quotes_one)

　　# 2. Represent string with double quote `("")` and quoted statement with escape and double quote `(\"statement\")`

　　quotes_two = "\"Friends don\'t let friends use minibatches larger than 32\" - Yann LeCun"

　　print(quotes_two)

　　# 3. Represent string with triple quotes (`""""""`) and quoted statement with double quotes (`""`)

　　quote_three = """"Friends don\'t let friends use minibatches larger than 32" - Yann LeCun"""

　　print(quote_three)

　　Output:

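　　"Friends don't let friends use minibatches larger than 32" - Yann LeCun

　　"Friends don't let friends use minibatches larger than 32" - Yann LeCun

　　"Friends don't let friends use minibatches larger than 32" - Yann LeCun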

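　　Final Words

　　Strings are among the most common data types in any programming language, so a fluent, flexible command of their properties and methods really matters. Review them regularly and keep an eye out for them in everyday code!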
