Python character encoding conversion is incomplete

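Here is the situation: when files are transferred from a Mac to Windows, their (Chinese) file names come out garbled. So I wanted to write a Python script to fix the encoding of the file names. The code is as follows: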

import os

def convert_gbk_to_utf8():
    # ROOT_PATH is the directory that holds the garbled file names
    for file in os.listdir(ROOT_PATH):
        new_file = file.encode("gbk", "ignore").decode("utf-8", "ignore")
        os.renames(os.path.join(ROOT_PATH, file), os.path.join(ROOT_PATH, new_file))

The result is unsatisfactory: some of the file names are only partially converted. Why is that?

[Screenshot: result after conversion]

Garbled file names:

鏁堟灉鍥_K11.1_璐圭敤鏄庣粏_鏈彁浜よ璐_.png
鏁堟灉鍥_K11.2_璐圭敤鏄庣粏_濉啓璺ˉ璐_png

File names printed by os.listdir():

[Screenshot: file names printed by listdir]

3 answers

Tranch

✓ Accepted

The original file names probably aren't GBK-encoded, are they? Try gb18030.

Reference: Chinese in Mac OS X 10.7 Lion
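The suggestion above can be sketched in Python 3 roughly as follows. This is a minimal sketch, not the answerer's code: the helper name `repair_name` and its parameters are hypothetical, and it assumes the garbled names are UTF-8 bytes that were displayed through a GBK-family codec. It uses strict error handling, so a name that cannot be round-tripped is reported as None rather than silently shortened.

```python
from typing import Optional


def repair_name(garbled: str, wrong_codec: str = "gb18030",
                right_codec: str = "utf-8") -> Optional[str]:
    """Undo mojibake (hypothetical helper, not from the original thread).

    Re-encode the garbled text with the codec that mis-decoded the bytes,
    then decode those bytes with the codec they were really written in.
    Returns None when either step fails, instead of a lossy guess.
    """
    try:
        return garbled.encode(wrong_codec).decode(right_codec)
    except UnicodeError:
        # Covers both UnicodeEncodeError and UnicodeDecodeError.
        return None


# Simulate the problem: UTF-8 bytes viewed through a GBK-family codec.
garbled = "效果".encode("utf-8").decode("gb18030")
restored = repair_name(garbled)
```

A rename loop could call `repair_name` on each entry from os.listdir() and skip the entries for which it returns None, so that problem names stay visible instead of being mangled further.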

依云

Posted on 2015-04-21

It looks like some strange characters have been mixed in:

>>> xsel | iconv -t gb18030
效果K11.1_费用明细_未提交计.png
效果K11.2_费用明细_填写路桥png

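This looks fine, but viewing it in Vim reveals:

[Screenshot: strange characters in the middle of the names]

Don't use errors='ignore' unless you know exactly what it does and really want that behavior.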
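To illustrate why errors='ignore' is risky, here is a small sketch. The file name and the helper `convert` are made up for the example. With "ignore", anything the codec cannot handle is dropped without any warning, whereas the default strict mode raises an exception that points at the problem.

```python
def convert(text: str, errors: str) -> bytes:
    """Encode text as GBK with the given error handler (illustrative helper)."""
    return text.encode("gbk", errors)


# Hypothetical file name; the emoji has no GBK encoding.
name = "report_😀_final.png"

# "ignore" silently drops the emoji, so the result no longer matches the input.
lossy = convert(name, "ignore")

# "strict" (the default handler) refuses to guess and raises instead.
try:
    convert(name, "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# The same happens when decoding: a truncated UTF-8 sequence just vanishes.
truncated = "图".encode("utf-8")[:2].decode("utf-8", "ignore")
```

Here `lossy` has lost a character and `truncated` is empty, yet no error was reported in either case. That is the same kind of silent loss that produces partially converted file names.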

itlearner

Posted on 2015-04-27

def convert_gbk_to_utf8():
    for file in os.listdir(ROOT_PATH):
        new_file = file.encode("gbk", "ignore").decode("utf-8", "ignore")
        os.renames(os.path.join(ROOT_PATH, file), os.path.join(ROOT_PATH, new_file))

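Judging from the question, the asker wants to convert the encoding from GBK to UTF-8. In Python, converting from one encoding to another goes through Unicode as the intermediate form. Converting from a specific encoding to Unicode is done with decode, not encode; encode converts Unicode into a specific encoding.

So changing this line in the code above:

new_file = file.encode("gbk", "ignore").decode("utf-8", "ignore")

to:

new_file = file.decode("gbk", "ignore").encode("utf-8", "ignore")

should do it.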
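The decode-then-encode pattern described above can be sketched as follows. Note that in Python 3 a str is already Unicode and has no decode method, so the pattern applies to bytes objects (or to Python 2 byte strings). The function name `transcode` and its defaults are hypothetical, chosen only for this example.

```python
def transcode(data: bytes, src: str = "gbk", dst: str = "utf-8") -> bytes:
    """Convert encoded bytes from one encoding to another via Unicode."""
    text = data.decode(src)   # bytes in the source encoding -> Unicode str
    return text.encode(dst)   # Unicode str -> bytes in the target encoding


gbk_bytes = "中文".encode("gbk")
utf8_bytes = transcode(gbk_bytes)
```

Strict error handling is used on purpose: if the input is not really in the source encoding, the call raises instead of returning damaged data.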
