s1 = '白日依山尽,黄河入海流,欲穷千里目,更上一层楼。'
s2 = "If we don't have the guts to try anything, what's the meaning of life?"
s3 = '危楼高百丈,伸手欲摘星,不敢高声语,恐惊天上人。'
How can I tell whether the content of a string is Chinese?
import re
p = re.compile('[\u4e00-\u9fa5]')
s1 = '白日依山尽,黄河入海流,欲穷千里目,更上一层楼。'
s2 = "If we don't have the guts to try anything, what's the meaning of life?"
s3 = '危楼高百丈,伸手欲摘星,不敢高声语,恐惊天上人。'
import re
p = re.compile('[^\x00-\xff]')
print(''.join(i.group() for i in p.finditer(s1)) == s1)
print(''.join(i.group() for i in p.finditer(s2)) == s2)
print(''.join(i.group() for i in p.finditer(s3)) == s3)
Experts, is there a better way to tell whether a string consists of double-byte characters?
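Check by code range: look it up in the encoding table. As I recall, when a byte is greater than a certain value, the byte that follows belongs with it, and the two together form one double-byte character.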
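A minimal sketch of the byte-range idea above, assuming the text is encoded as GBK (the thread does not name an encoding, so that choice is an assumption). In GBK a lead byte falls in 0x81-0xFE and is followed by one trail byte. The function names `count_gbk_chars` and `is_double_byte` are made up for illustration.

```python
def count_gbk_chars(data: bytes):
    """Walk GBK-encoded bytes and count single-byte and double-byte characters.

    Assumes `data` is well-formed GBK: a byte in 0x81-0xFE is a lead byte
    and the byte after it belongs to the same character.
    """
    single = double = 0
    i = 0
    while i < len(data):
        if 0x81 <= data[i] <= 0xFE and i + 1 < len(data):
            double += 1
            i += 2  # skip the trail byte as well
        else:
            single += 1
            i += 1
    return single, double


def is_double_byte(text: str, encoding: str = "gbk") -> bool:
    """Return True if every character of `text` encodes to exactly two bytes."""
    if not text:
        return False
    try:
        return all(len(ch.encode(encoding)) == 2 for ch in text)
    except UnicodeEncodeError:
        # The character has no representation in the chosen encoding.
        return False


print(count_gbk_chars("a白b".encode("gbk")))
print(is_double_byte("白日依山尽"))
print(is_double_byte("If we don't"))
```

Unlike the regex `[^\x00-\xff]`, which only tests Unicode code points, this works on the encoded bytes, so the result depends on which encoding you pick.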