Python 中的 str.isdigit()、isnumeric() 和 isdecimal() 有什么区别?

新手上路,请多包涵

当我运行这些方法时

s.isdigit()
s.isnumeric()
s.isdecimal()

对于 -- 的每个值,我总是得到输出或全部 True 或全部 --- s False 的每个值(当然是字符串)。三者有什么区别?您能否提供一个示例,给出两个 True 和一个 False (反之亦然)?

原文由 matteo_c 发布,翻译遵循 CC BY-SA 4.0 许可协议

阅读 1.2k
2 个回答

它主要是关于 unicode 分类。以下是一些显示差异的示例:

 >>> def spam(s):
...     for attr in 'isnumeric', 'isdecimal', 'isdigit':
...         print(attr, getattr(s, attr)())
...
>>> spam('½')
isnumeric True
isdecimal False
isdigit False
>>> spam('³')
isnumeric True
isdecimal False
isdigit True

具体行为 官方文档中。

找到所有这些的脚本:

 import sys
import unicodedata
from collections import defaultdict

d = defaultdict(list)
for i in range(sys.maxunicode + 1):
    s = chr(i)
    t = s.isnumeric(), s.isdecimal(), s.isdigit()
    if len(set(t)) == 2:
        try:
            name = unicodedata.name(s)
        except ValueError:
            name = f'codepoint{i}'
        print(s, name)
        d[t].append(s)

原文由 wim 发布,翻译遵循 CC BY-SA 4.0 许可协议

根据定义, isdecimal()isdigit()isnumeric() 。也就是说,如果字符串是 decimal ,那么它也将是 digitnumeric

因此,给定一个字符串 s 并用这三种方法测试它,只会有 4 种类型的结果。

 +-------------+-----------+-------------+----------------------------------+
| isdecimal() | isdigit() | isnumeric() |          Example                 |
+-------------+-----------+-------------+----------------------------------+
|    True     |    True   |    True     | "038", "੦੩੮", "038"           |
|  False      |    True   |    True     | "⁰³⁸", "🄀⒊⒏", "⓪③⑧"          |
|  False      |  False    |    True     | "↉⅛⅘", "ⅠⅢⅧ", "⑩⑬㊿", "壹貳參"  |
|  False      |  False    |  False      | "abc", "38.0", "-38"             |
+-------------+-----------+-------------+----------------------------------+


1.一些字符的例子 isdecimal()==True

(因此 isdigit()==Trueisnumeric()==True

 "0123456789"  DIGIT ZERO~NINE
"٠١٢٣٤٥٦٧٨٩"  ARABIC-INDIC DIGIT ZERO~NINE
"०१२३४५६७८९"  DEVANAGARI DIGIT ZERO~NINE
"০১২৩৪৫৬৭৮৯"  BENGALI DIGIT ZERO~NINE
"੦੧੨੩੪੫੬੭੮੯"  GURMUKHI DIGIT ZERO~NINE
"૦૧૨૩૪૫૬૭૮૯"  GUJARATI DIGIT ZERO~NINE
"୦୧୨୩୪୫୬୭୮୯"  ORIYA DIGIT ZERO~NINE
"௦௧௨௩௪௫௬௭௮௯"  TAMIL DIGIT ZERO~NINE
"౦౧౨౩౪౫౬౭౮౯"  TELUGU DIGIT ZERO~NINE
"೦೧೨೩೪೫೬೭೮೯"  KANNADA DIGIT ZERO~NINE
"൦൧൨൩൪൫൬൭൮൯"  MALAYALAM DIGIT ZERO~NINE
"๐๑๒๓๔๕๖๗๘๙"  THAI DIGIT ZERO~NINE
"໐໑໒໓໔໕໖໗໘໙"  LAO DIGIT ZERO~NINE
"༠༡༢༣༤༥༦༧༨༩"  TIBETAN DIGIT ZERO~NINE
"၀၁၂၃၄၅၆၇၈၉"  MYANMAR DIGIT ZERO~NINE
"០១២៣៤៥៦៧៨៩"  KHMER DIGIT ZERO~NINE
"0123456789"  FULLWIDTH DIGIT ZERO~NINE
"𝟎𝟏𝟐𝟑𝟒𝟓𝟔𝟕𝟖𝟗"  MATHEMATICAL BOLD DIGIT ZERO~NINE
"𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"  MATHEMATICAL DOUBLE-STRUCK DIGIT ZERO~NINE
"𝟢𝟣𝟤𝟥𝟦𝟧𝟨𝟩𝟪𝟫"  MATHEMATICAL SANS-SERIF DIGIT ZERO~NINE
"𝟬𝟭𝟮𝟯𝟰𝟱𝟲𝟳𝟴𝟵"  MATHEMATICAL SANS-SERIF BOLD DIGIT ZERO~NINE
"𝟶𝟷𝟸𝟹𝟺𝟻𝟼𝟽𝟾𝟿"  MATHEMATICAL MONOSPACE DIGIT ZERO~NINE

2.一些字符的例子 isdecimal()==False 但是 isdigit()==True

(因此 isnumeric()==True

 "⁰¹²³⁴⁵⁶⁷⁸⁹"  SUPERSCRIPT ZERO~NINE
"₀₁₂₃₄₅₆₇₈₉"  SUBSCRIPT ZERO~NINE
"🄀⒈⒉⒊⒋⒌⒍⒎⒏⒐"  DIGIT ZERO~NINE FULL STOP
"🄁🄂🄃🄄🄅🄆🄇🄈🄉🄊"  DIGIT ZERO~NINE COMMA
"⓪①②③④⑤⑥⑦⑧⑨"  CIRCLED DIGIT ZERO~NINE
"⓿❶❷❸❹❺❻❼❽❾"  NEGATIVE CIRCLED DIGIT ZERO~NINE
"⑴⑵⑶⑷⑸⑹⑺⑻⑼"  PARENTHESIZED DIGIT ONE~NINE
"➀➁➂➃➄➅➆➇➈"  DINGBAT CIRCLED SANS-SERIF DIGIT ONE~NINE
"⓵⓶⓷⓸⓹⓺⓻⓼⓽"  DOUBLE CIRCLED DIGIT ONE~NINE
"➊➋➌➍➎➏➐➑➒"  DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ONE~NINE
"፩፪፫፬፭፮፯፰፱"  ETHIOPIC DIGIT ONE~NINE

3.一些字符示例 isdecimal()==Falseisdigit()==False 但是 isnumeric()==True

 "½⅓¼⅕⅙⅐⅛⅑⅒⅔¾⅖⅗⅘⅚⅜⅝⅞⅟↉"  VULGAR FRACTION
"৴৵৶৷৸৹"  BENGALI CURRENCY NUMERATOR
"௰௱௲"  TAMIL NUMBER TEN, ONE HUNDRED, ONE THOUSAND
"౸౹౺౻౼౽౾"  TELUGU FRACTION DIGIT
"൰൱൲൳൴൵"  MALAYALAM NUMBER, MALAYALAM FRACTION
"༳༪༫༬༭༮༯༰༱༲"  TIBETAN DIGIT HALF ZERO~NINE
"፲፳፴፵፶፷፸፹፺፻፼"  ETHIOPIC NUMBER TEN~NINETY, HUNDRED, TEN THOUSAND
"៰៱៲៳៴៵៶៷៸៹"  KHMER SYMBOL LEK ATTAK
"ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩⅪⅫⅬⅭⅮⅯ"  ROMAN NUMERAL
"ⅰⅱⅲⅳⅴⅵⅶⅷⅸⅹⅺⅻⅼⅽⅾⅿ"  SMALL ROMAN NUMERAL
"ↀↁↂↅↆ"  ROMAN NUMERAL
"⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵㊶㊷㊸㊹㊺㊻㊼㊽㊾㊿"  CIRCLED NUMBER TEN~FIFTY
"㉈㉉㉊㉋㉌㉍㉎㉏"  CIRCLED NUMBER TEN~EIGHTY ON BLACK SQUARE
"⑽⑾⑿⒀⒁⒂⒃⒄⒅⒆⒇"  PARENTHESIZED NUMBER TEN~TWENTY
"⒑⒒⒓⒔⒕⒖⒗⒘⒙⒚⒛"  NUMBER TEN~TWENTY FULL STOP
"⓫⓬⓭⓮⓯⓰⓱⓲⓳⓴"  NEGATIVE CIRCLED NUMBER ELEVEN
"⓾➉❿➓"  various styles of CIRCLED NUMBER TEN
"🄌"  DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO
"〇"  IDEOGRAPHIC NUMBER ZERO
"〡〢〣〤〥〦〧〨〩〸〹〺"  HANGZHOU NUMERAL ONE~TEN, TWENTY, THIRTY
"㆒㆓㆔㆕"  IDEOGRAPHIC ANNOTATION ONE~FOUR MARK
"㈠㈡㈢㈣㈤㈥㈦㈧㈨㈩"  PARENTHESIZED IDEOGRAPH ONE~TEN
"㊀㊁㊂㊃㊄㊅㊆㊇㊈㊉"  CIRCLED IDEOGRAPH ONE~TEN
"一二三四五六七八九十壹貳參肆伍陸柒捌玖拾零百千萬億兆弐貮贰㒃㭍漆什㐅陌阡佰仟万亿幺兩㠪亖卄卅卌廾廿"  CJK UNIFIED IDEOGRAPH
"參拾兩零六陸什"  CJK COMPATIBILITY IDEOGRAPH
"𐄇𐄈𐄉𐄊𐄋𐄌𐄍𐄎𐄏𐄐𐄑𐄒𐄓𐄔𐄕𐄖𐄗𐄘"  AEGEAN NUMBER ONE~NINE, TEN~NINETY
"𐄙𐄚𐄛𐄜𐄝𐄞𐄟𐄠𐄡𐄢𐄣𐄤𐄥𐄦𐄧𐄨𐄩𐄪"  AEGEAN NUMBER ONE~NINE HUNDRED, ONE~NINE THOUSAND
"𐄬𐄭𐄮𐄯𐄰𐄱𐄲𐄳"  AEGEAN NUMBER TEN~NINETY THOUSAND
"𐅀𐅁𐅂𐅃𐅆𐅇𐅈𐅉𐅊𐅋𐅌𐅍𐅎𐅏𐅐𐅑𐅒𐅓𐅔𐅕𐅖𐅗𐅘𐅙𐅚𐅛𐅜𐅝𐅞𐅟𐅠𐅡𐅢𐅣𐅤𐅥𐅦𐅧𐅨𐅩𐅪𐅫𐅬𐅭𐅮𐅯𐅰𐅱𐅲𐅳𐅴"  GREEK ACROPHONIC ATTIC
"𝍠𝍡𝍢𝍣𝍤𝍥𝍦𝍧𝍨"  COUNTING ROD UNIT DIGIT ONE~NINE
"𝍩𝍪𝍫𝍬𝍭𝍮𝍯𝍰𝍱"  COUNTING ROD TENS DIGIT ONE~NINE

原文由 AnnieFromTaiwan 发布,翻译遵循 CC BY-SA 4.0 许可协议

撰写回答
你尚未登录,登录后可以
  • 和开发者交流问题的细节
  • 关注并接收问题和回答的更新提醒
  • 参与内容的编辑和改进,让解决方法与时俱进
推荐问题
logo
Stack Overflow 翻译
子站问答
访问
宣传栏