This article shares how to quickly run interesting Hugging Face models locally with Docker, with less code and less time than the original projects require.
If you are familiar with Python, most model projects can be deployed and running locally in about ten minutes.
Preface
For ease of presentation, I chose an image-processing model. Before diving into the details, let's take a look at what this model project actually does.
The AI model used for the image processing above is one I found on Hugging Face. With the platform's explosive growth, more and more interesting models and datasets have appeared there; the number of models alone currently exceeds 45,000.
These models share an interesting trait: they run well on the cloud platform, but as soon as you try to run them locally, you have to struggle. In the GitHub repositories associated with these projects you can always find user feedback along the lines of: "I can't run this model and code locally; setting up the environment and calling the code is too much trouble."
In fact, in our daily work and study we often run into situations like the Hugging Face one above: many models run fine "in the cloud" but refuse to run as soon as they land on a local machine. The cause may be differences in the operating system environment or the device's CPU architecture (x86/ARM), a Python runtime that is too new or too old, the wrong version of a package installed via pip, or verbose example code that buries the essential calls.
So, is there a lazy way to get around these time-wasting problems?
After some tinkering, I found a fairly reliable solution: use Docker containers together with Towhee to create a one-click runtime environment for the model.
Take the model mentioned at the beginning of this article: if we want to quickly call it and run a quick restoration pass over our pictures, it is really not difficult. A single docker run command plus twenty or thirty lines of Python code is all it takes.
Next, I will use Tencent ARC Lab's open-source GFPGAN model as an example to show how to quickly run a model that is published online.
Because this model is based on PyTorch, this article first covers how to build a generic Docker base image for PyTorch-based models. If readers are interested, I will cover other model frameworks in a later article.
Building a generic Docker base image for PyTorch models
I have uploaded the complete sample code for this chapter to GitHub: https://github.com/soulteary/docker-pytorch-playground ; interested readers can grab it there. If you want to save even more trouble, you can also use the image I have already built directly as the base image: https://hub.docker.com/r/soulteary/docker-pytorch-playground .
If you are interested in how the base image is packaged, keep reading this chapter; if you only care about how to run the model quickly, skip straight to the next chapter.
Back to the topic: for the following three reasons, I recommend that anyone who wants to quickly reproduce a model locally use the container approach:
- You want to avoid environment interference (pollution) between different projects
- You want project dependencies to be explicit, so anyone can reproduce the results on any device
- You want the time cost of reproducing a model to be lower; nobody enjoys the 80% of repetitive work outside model tuning (especially environment setup and basic configuration)
With the advantages of the container approach covered, let's look at how to write the Dockerfile for this kind of base image, and the reasoning behind each step.
Considering that the model may need to run on both x86 and ARM devices, I recommend using miniconda3 as the base image: it is Debian-based and ships with the conda toolkit built in.
FROM continuumio/miniconda3:4.11.0
For the base environment image, I recommend pinning a specific version number instead of latest; this keeps your container "stable" and reduces unexpected surprises when you need to rebuild repeatedly. If you have particular version requirements, you can browse the image's tag list to find a more suitable version. This article will not go into the details of conda and miniconda; interested readers can find more information in the official repository, and I will write a more detailed article about them if there is demand.
Because we will frequently use the OpenGL API, we need to install the libgl1-mesa-glx package in the base image. If you want to know more about this package, you can read its documentation in the official Debian package repository. To shorten installation time, I switched the APT sources to the domestic Tsinghua mirror.
RUN sed -i -e "s/deb.debian.org/mirrors.tuna.tsinghua.edu.cn/" /etc/apt/sources.list && \
sed -i -e "s/security.debian.org/mirrors.tuna.tsinghua.edu.cn/" /etc/apt/sources.list && \
apt update
RUN apt install -y libgl1-mesa-glx
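Why does a headless container need an OpenGL library? OpenCV, which the GFPGAN dependencies pull in later, loads libGL at import time. A tiny check like the following (just a sketch, and it assumes opencv-python is already installed in the environment) typically fails on a bare Debian image until the package above is in place:

```python
# Why libgl1-mesa-glx matters: OpenCV (opencv-python), which the GFPGAN
# dependencies pull in later, dynamically loads libGL when imported.
# Without the package above, this import usually fails with an error like
# "ImportError: libGL.so.1: cannot open shared object file".
import cv2  # assumes opencv-python is already installed

print(cv2.__version__)
```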
With the basic system dependencies installed, we can start preparing the model runtime environment. Take the PyTorch installation as an example:
RUN pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
RUN conda install -y pytorch
Again, to save time downloading Python packages from PyPI, I switched the download source to the domestic Tsinghua mirror. Once conda install -y pytorch has finished, our basic runtime environment is ready.
Since everyone's network environment differs, here are some other commonly used mirror sources in China; adjust the package index according to your own situation to get faster downloads.
# Tsinghua
https://pypi.tuna.tsinghua.edu.cn/simple
# Alibaba Cloud
http://mirrors.aliyun.com/pypi/simple
# Baidu
https://mirror.baidu.com/pypi/simple
# USTC
https://pypi.mirrors.ustc.edu.cn/simple
# Douban
http://pypi.douban.com/simple
In the steps above we need to download close to 200 MB of packages (conda about 14 MB, pytorch about 44 MB, mkl about 140 MB), so a little patience is required.
To make our base image compatible with both x86 and ARM, in addition to the base environment installation above, we also need to pin the torch and torchvision versions; there have been some discussions about this in the PyTorch community.
RUN pip3 install --upgrade torch==1.9.0 torchvision==0.10.0
The command above replaces torch with the pinned version. During the actual image build, about 800 MB of additional data has to be downloaded; even with domestic mirrors this can take quite a while, so consider grabbing a cold can of Coke from the fridge to ease the wait. 🥤
With all the dependencies above handled, we come to the final step of building the image. To make it easier to run various PyTorch models later, I recommend installing Towhee directly into the base image:
# https://docs.towhee.io/Getting%20Started/quick-start/
RUN pip install towhee
At this point, the Dockerfile for a generic base image for PyTorch-based models is complete. For easier reading, here is the full file:
FROM continuumio/miniconda3:4.11.0
RUN sed -i -e "s/deb.debian.org/mirrors.tuna.tsinghua.edu.cn/" /etc/apt/sources.list && \
sed -i -e "s/security.debian.org/mirrors.tuna.tsinghua.edu.cn/" /etc/apt/sources.list && \
apt update
RUN apt install -y libgl1-mesa-glx
RUN pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
RUN conda install -y pytorch
RUN pip3 install --upgrade torch==1.9.0 torchvision==0.10.0
RUN pip install towhee
Save the above content as a Dockerfile, then execute docker build -t soulteary/docker-pytorch-playground . ; when the command finishes, our PyTorch base image is built.
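Before moving on, you can give the freshly built image a quick smoke test. A minimal check (just a sketch) is to save the snippet below as check.py and pipe it into the container with docker run --rm -i soulteary/docker-pytorch-playground python3 < check.py:

```python
# check.py: quick smoke test for the base image. The pinned torch and
# torchvision versions should import cleanly, and simple CPU tensor math
# should work before we build anything on top of this image.
import torch
import torchvision
import towhee  # imported only to confirm it is installed

print("torch:", torch.__version__)              # expect 1.9.0
print("torchvision:", torchvision.__version__)  # expect 0.10.0
print(torch.ones(2, 2) @ torch.ones(2, 2))      # trivial matmul as a smoke test
```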
If you don't want to spend time building, you can also use the base image I have already built (it automatically resolves to the right variant for x86 / ARM devices) and pull it straight from Docker Hub:
# Pull the latest version directly
docker pull soulteary/docker-pytorch-playground
# Or use an image pinned to a specific version
docker pull soulteary/docker-pytorch-playground:2022.05.19
With the base image in hand, we can move on to the runtime environment and invocation program for the specific model mentioned above.
Writing a Model Invoker in Python
We can find the official usage example in the GFPGAN project: https://github.com/TencentARC/GFPGAN/blob/master/inference_gfpgan.py . The original file is relatively long, about 155 lines, so I will not post it here.
As mentioned in the previous section, we can use Towhee to be "lazy": the sample code shrinks to about 30 lines and even gains a small extra feature, scanning all the pictures in the working directory, handing them over to the model for processing, and finally generating a static page that compares the pictures before and after processing.
import warnings
warnings.warn('The unoptimized RealESRGAN is very slow on CPU. We do not use it. '
              'If you really want to use it, please modify the corresponding codes.')

from gfpgan import GFPGANer
import towhee


@towhee.register
class GFPGANerOp:
    def __init__(self,
                 model_path='/GFPGAN.pth',
                 upscale=2,
                 arch='clean',
                 channel_multiplier=2,
                 bg_upsampler=None) -> None:
        self._restorer = GFPGANer(model_path, upscale, arch, channel_multiplier, bg_upsampler)

    def __call__(self, img):
        cropped_faces, restored_faces, restored_img = self._restorer.enhance(
            img, has_aligned=False, only_center_face=False, paste_back=True)
        return restored_faces[0][:, :, ::-1]


(
    towhee.glob['path']('*.jpg')
        .image_load['path', 'img']()
        .GFPGANerOp['img', 'face']()
        .show(formatter=dict(img='image', face='image'))
)
The warnings block above is kept only to preserve the "original flavor" of the upstream example; remove it and the line count gets even shorter. Save the code above as app.py; we will use it later.
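For readers who want to see what the Towhee pipeline is doing under the hood, here is a rough hand-written equivalent (a sketch only; it uses OpenCV for image loading instead of Towhee's image_load operator, and assumes opencv-python is installed and /GFPGAN.pth exists):

```python
# A rough, hand-written equivalent of the Towhee pipeline in app.py:
# glob the jpg files, load each image, run GFPGANer, keep the first
# restored face, and save it next to the original.
import glob

import cv2
from gfpgan import GFPGANer

restorer = GFPGANer(model_path='/GFPGAN.pth', upscale=2, arch='clean',
                    channel_multiplier=2, bg_upsampler=None)

for path in glob.glob('*.jpg'):
    img = cv2.imread(path)  # BGR image, which is what GFPGANer expects
    _, restored_faces, _ = restorer.enhance(
        img, has_aligned=False, only_center_face=False, paste_back=True)
    face = restored_faces[0]                  # first restored face, still BGR
    cv2.imwrite('restored_' + path, face)     # save next to the original
    # The Towhee operator above instead returns face[:, :, ::-1] (BGR -> RGB)
    # so that show() can render it for display.
```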
Now that we have the program needed to call the model, let's look at how to build the application container image that the specific model (GFPGAN) needs in order to run.
Building the application image for the specific model
I have also uploaded the complete code for this part to GitHub so that everyone can be "lazy": https://github.com/soulteary/docker-gfpgan . The companion pre-built image is at https://hub.docker.com/r/soulteary/docker-gfpgan .
Back to the topic: with the base image above, for each different model we only need to make some image-specific adjustments in day-to-day use.
Let's take a look at how to customize the application image for the GFPGAN project mentioned above.
Again we start by writing a Dockerfile; the first line declares that the application image we are building is based on the base image above.
FROM soulteary/docker-pytorch-playground:2022.05.19
The advantage of doing this is that in day-to-day use we save a lot of image build time and local disk space; large model containers benefit especially from Docker's layer reuse.
Next, we need to place the model files we want to use into the application image and install the remaining Python dependencies.
Downloading models from Hugging Face and GitHub over the domestic network is slow and prone to interruptions, so I recommend downloading the required model files in advance and placing them in the appropriate directory while building the image. As for how the model is ultimately used, both baking it into the image and mounting it dynamically at runtime work fine.
The GFPGAN project relies on two model files: a ResNet50-based face detection model from the https://github.com/xinntao/facexlib project, and the GFPGAN generative adversarial network model used for the actual image restoration, the "protagonist" of this article. The first file, detection_Resnet50_Final.pth, can be downloaded from https://github.com/xinntao/facexlib/releases/tag/v0.1.0 ; for the second model, we need to choose according to our own hardware:
- If you need to run the model on a CPU, download GFPGANCleanv1-NoCE-C2.pth from https://github.com/TencentARC/GFPGAN/releases/tag/v0.2.0 , or GFPGANv1.3.pth from https://github.com/TencentARC/GFPGAN/releases/tag/v1.3.0 ; this type of model can process black-and-white portrait pictures.
- If you can run the model on a GPU, download GFPGANv1.pth from https://github.com/TencentARC/GFPGAN/releases/tag/v0.1.0 or from https://share.weiyun.com/ShYoCCoc ; this type of model can handle color portrait pictures.
- Besides GitHub, we can also download models directly from Hugging Face (though with fewer versions to choose from): https://huggingface.co/TencentARC/GFPGANv1/tree/main/experiments/pretrained_models
After placing the downloaded model files and the new Dockerfile in the same directory, let's continue fleshing out the Dockerfile: install the project dependencies and copy the models into the right locations inside the container:
# Install the model-related code libraries
RUN pip install gfpgan realesrgan
# Copy the pre-downloaded model to its expected location, to avoid surprises during the image build
COPY detection_Resnet50_Final.pth /opt/conda/lib/python3.9/site-packages/facexlib/weights/detection_Resnet50_Final.pth
# Pick the line matching the model version you downloaded; a single model file is enough
COPY GFPGANCleanv1-NoCE-C2.pth /GFPGAN.pth
# COPY GFPGANCleanv1-NoCE-C2_original.pth /GFPGAN.pth
# COPY GFPGANv1.pth /GFPGAN.pth
# COPY GFPGANv1.3.pth /GFPGAN.pth
In addition to gfpgan, I also installed realesrgan; this package can make the background of the processed image look better and more natural.
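If you do want to experiment with it, GFPGANer accepts a bg_upsampler argument. The sketch below shows roughly how a RealESRGAN upsampler could be wired in, following the pattern in the upstream inference_gfpgan.py; the local model path is an assumption (you would need to download RealESRGAN_x2plus.pth yourself), and on CPU this is very slow, which is exactly why app.py leaves it out:

```python
# Rough sketch: plugging RealESRGAN in as GFPGAN's background upsampler.
# Assumes realesrgan/basicsr are installed and that RealESRGAN_x2plus.pth
# has been downloaded locally (the path below is hypothetical).
from basicsr.archs.rrdbnet_arch import RRDBNet
from gfpgan import GFPGANer
from realesrgan import RealESRGANer

bg_model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                   num_block=23, num_grow_ch=32, scale=2)
bg_upsampler = RealESRGANer(
    scale=2,
    model_path='/RealESRGAN_x2plus.pth',  # hypothetical local path
    model=bg_model,
    tile=400,        # tile the background to limit memory use
    tile_pad=10,
    pre_pad=0,
    half=False)      # half precision only makes sense on GPU

restorer = GFPGANer('/GFPGAN.pth', upscale=2, arch='clean',
                    channel_multiplier=2, bg_upsampler=bg_upsampler)
```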
With the basic dependencies and models configured, a few simple finishing touches remain:
# Copy the model-invocation program saved in the previous step into the image
COPY app.py /entrypoint.py
# Declare a clean working directory
WORKDIR /data
# You could bake the test dataset directly into the container here,
# or mount it dynamically at runtime instead
# COPY imgs/*.jpg ./
# Install a few more dependencies the project needs
RUN pip install IPython pandas
# Towhee currently only supports displaying model results directly
# and does not yet support saving the rendered output to a file,
# so we apply a small patch here to add that capability
RUN sed -i -e "s/display(HTML(table))/with open('result.html', 'w') as file:\n file.write(HTML(table).data)/" /opt/conda/lib/python3.9/site-packages/towhee/functional/mixins/display.py
CMD ["python3", "/entrypoint.py"]
The code above is heavily commented to explain each step, so I won't repeat it here. Two design notes: app.py is copied to the / root directory rather than into the working directory, which keeps usage simpler, because the working directory is reserved for the input images and the processing results; and the container uses CMD instead of ENTRYPOINT for the default command, which makes it easier for users to run a different command directly or to enter the container for debugging.
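For the curious, what the sed patch effectively does is swap Towhee's inline display call for a file write. In isolation the idea looks like this (a toy illustration with a placeholder table, not Towhee's actual code path):

```python
# Toy illustration of the patched behaviour: instead of rendering the
# comparison table inline via IPython's display(), persist it to a file
# so the result survives outside a notebook environment.
from IPython.display import HTML

table = "<table><tr><td>before</td><td>after</td></tr></table>"  # placeholder HTML

with open('result.html', 'w') as file:
    file.write(HTML(table).data)  # HTML(...).data is the raw HTML string
```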
Again, for readability, I've merged the contents of the Dockerfile above together:
FROM soulteary/docker-pytorch-playground:2022.05.19
RUN pip install gfpgan realesrgan
COPY detection_Resnet50_Final.pth /opt/conda/lib/python3.9/site-packages/facexlib/weights/detection_Resnet50_Final.pth
# Larger model files can be mounted at runtime
# instead of being copied into the container here
COPY GFPGANCleanv1-NoCE-C2.pth /GFPGAN.pth
COPY app.py /entrypoint.py
WORKDIR /data
RUN pip install IPython pandas
RUN sed -i -e "s/display(HTML(table))/with open('result.html', 'w') as file:\n file.write(HTML(table).data)/" /opt/conda/lib/python3.9/site-packages/towhee/functional/mixins/display.py
CMD ["python3", "/entrypoint.py"]
After saving the above content as a Dockerfile, we run the build command to produce the application image:
docker build -t pytorch-playground-gfpgan -f Dockerfile .
After a short wait, we have an application image containing both the model and the invocation program.
Next, let's see how to use this image to reproduce the results shown at the beginning of the article.
Using the model application image
If you downloaded the model file in the previous step and baked it into the image, then we only need to gather some black-and-white or color portrait pictures (chosen according to the model), put them into a directory (the data directory), and run a single command to invoke the model:
docker run --rm -it -v `pwd`/data:/data soulteary/docker-gfpgan
If you don't want to bother finding pictures, you can use the sample pictures I prepared in the project: https://github.com/soulteary/docker-gfpgan/tree/main/data .
That covers the case where the model is baked into the application image; now let's look at what to do when it is not.
If you chose not to package the GFPGAN model into the image when building the application image above, then we need to mount the model file into the container at runtime. To keep the project structure clear, I created a directory named model in the project to hold the model files mentioned above.
The complete directory structure looks like this:
.
├── data
│   ├── Audrey Hepburn.jpg
│   ├── Bruce Lee.jpg
│   ├── Edison.jpg
│   ├── Einstein.jpg
│   └── Lu Xun.jpg
└── model
    └── GFPGANCleanv1-NoCE-C2.pth
Once the model and the pictures to be processed are ready, we again run a single command, this time mounting the model file into the container, and let the model work its "magic":
docker run --rm -it -v `pwd`/model/GFPGANCleanv1-NoCE-C2.pth:/GFPGAN.pth -v `pwd`/data:/data soulteary/docker-gfpgan
After the command finishes, an extra result.html file appears in the data directory, recording the images before and after model processing. Open it directly in a browser and you will see something like the following:
At this point we have covered how to package a PyTorch container base image, how to package an application image for a specific model, and how to quickly invoke the model. If the opportunity arises later, I will talk about further performance tuning on top of these images, and about packaging images for frameworks other than PyTorch.
Finally
To finish this article I need to thank two good friends, Towhee core developers @houjie and @guorentong, for their help. They sorted out the model-invocation code that, despite being only a few lines, was quite troublesome for a Python rookie like me.
In upcoming articles, I plan to cover model training and inference on M1 devices, and to keep experimenting with more interesting AI projects.
--EOF
This article is published under the Attribution 4.0 International (CC BY 4.0) license. You are welcome to reprint or adapt it, as long as you credit the source.
Author of this article: Su Yang
Created: May 20, 2022 · Word count: 10,723 · Reading time: 22 minutes · Permalink: https://soulteary.com/2022/05/20/use-docker-to-run-huggingface-models.html