After manually downloading a model from Hugging Face, how do I put the model files in the right path?


Posted 2024-01-07, Zhejiang

I need to run ChatGLM3 locally, so I ran the code below:

from transformers import AutoTokenizer, AutoModel


tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True, device='cuda')
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])

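Running the code above first downloads about 10 GB of model files from Hugging Face. The problem is that the download from Hugging Face is far too slow, so I want to download the files manually: https://huggingface.co/THUDM/chatglm3-6b/tree/main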

The trouble is Hugging Face's odd local directory structure:

╰─➤  fd -a -u pytorch_model | grep chatglm3-6b
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model-00007-of-00007.bin
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model-00003-of-00007.bin
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model-00006-of-00007.bin
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model-00004-of-00007.bin
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model-00001-of-00007.bin
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model.bin.index.json
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model-00002-of-00007.bin
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model-00005-of-00007.bin
/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/.no_exist/e46a14881eae613281abbd266ee918e93a56018f/pytorch_model.bin

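There is this baffling snapshots/e46a14881eae613281abbd266ee918e93a56018f in the path! So once I have downloaded the model files by hand, how do I know which path to put them in? How am I supposed to know whether it is e46a14881eae613281abbd266ee918e93a56018f or e46a14881eae613281abbd266ee918e93a56018w?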

Tags: algorithms, artificial intelligence, chatgpt, pytorch, huggingface


AI Bot (BETA)

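When you download model files manually from Hugging Face, you do need to place them in the path the library expects. The identifier in the local directory structure (such as e46a14881eae613281abbd266ee918e93a56018f) is not random: it is the Git commit hash of the repository revision that was downloaded, and the cache stores each revision under snapshots/<commit hash>. The file refs/main inside the model's cache folder records which commit hash the main branch points to.

One way to find the path is to ask huggingface_hub where a file that is already cached lives:

from huggingface_hub import try_to_load_from_cache

# Path of an already-cached file; None if it has not been cached yet.
path = try_to_load_from_cache("THUDM/chatglm3-6b", "config.json")
print(path)

For a cached file, the printed path has the form below, and its parent directory is where the other files of that revision are expected:

/home/pon/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/<commit hash>/config.json

Note that rebuilding the cache layout by hand is fragile, because the files under snapshots/ are normally symbolic links into a sibling blobs/ directory. If this does not work, check the Hugging Face documentation or ask in another community.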
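As a rough illustration of where the snapshots/<hash> directory in the listing comes from, here is a minimal sketch that resolves it using only the standard library. It assumes the cache layout visible in the question (a models--ORG--NAME folder with a snapshots/ subfolder) plus a refs/<branch> file holding the commit hash. The helper names repo_folder_name and find_snapshot_dir are made up for this example and are not part of any library.

```python
from pathlib import Path
from typing import Optional


def repo_folder_name(repo_id: str) -> str:
    """Map a repo id such as 'THUDM/chatglm3-6b' to its cache folder name."""
    return "models--" + repo_id.replace("/", "--")


def find_snapshot_dir(cache_root, repo_id: str, revision: str = "main") -> Optional[Path]:
    """Return <cache_root>/<repo folder>/snapshots/<commit hash> for a branch.

    Returns None when the branch has no refs file or the snapshot
    directory does not exist (that is, nothing has been cached yet).
    """
    repo_dir = Path(cache_root) / repo_folder_name(repo_id)
    ref_file = repo_dir / "refs" / revision
    if not ref_file.is_file():
        return None
    # The refs file contains only the commit hash the branch points to.
    commit_hash = ref_file.read_text().strip()
    snapshot = repo_dir / "snapshots" / commit_hash
    return snapshot if snapshot.is_dir() else None
```

Calling find_snapshot_dir(Path.home() / ".cache/huggingface/hub", "THUDM/chatglm3-6b") should return the same snapshots/... directory that appears in the fd output, provided the cache lives in the default location. The cache root can be moved with environment variables such as HF_HOME, in which case the first argument has to change accordingly.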

1 answer

universe_king


Posted 2024-01-18, Zhejiang

Updated 2024-02-17

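Pasting a reply from someone in a group chat (not verified. Why not? Because I have switched to Baidu's PaddlePaddle.)

Just change THUDM/chatglm3-6b to the path of your local model folder.
Point it at the outermost folder, the one that contains the model files.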
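The suggestion above could look roughly like the sketch below, which passes a local folder to from_pretrained instead of the repo id. It assumes every file from the repository page has been downloaded into a single folder. The helper names missing_files and load_local_chatglm and the short list of required files are illustrative assumptions, not an official checklist. transformers is imported inside the function so that the file check works without it.

```python
from pathlib import Path


def missing_files(model_dir, required=("config.json", "tokenizer_config.json")):
    """Return the names in `required` that are not present in model_dir.

    The default tuple is an assumed minimal set; a real download also
    needs the weight shards, tokenizer model, and remote-code .py files.
    """
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


def load_local_chatglm(model_dir):
    """Load tokenizer and model from a local folder instead of the Hub."""
    missing = missing_files(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")

    # Imported lazily so the check above runs even without transformers.
    from transformers import AutoModel, AutoTokenizer

    # Same calls as in the question, with the repo id replaced by a path.
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_dir, trust_remote_code=True, device="cuda")
    return tokenizer, model.eval()
```

For example, if the files were saved to a hypothetical folder /data/models/chatglm3-6b, then tokenizer, model = load_local_chatglm("/data/models/chatglm3-6b") would replace the two from_pretrained calls in the question. With a plain folder like this, the snapshots/<hash> cache layout does not need to be reproduced.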
