
This article covers how to efficiently turn a code repository's commit history and directory structure into a "cool video". It also shows how to run gource in Docker on devices with different CPU architectures, and how to generate these visualization videos many times faster on a device with the latest M1 Pro chip.

Preface

A while ago, to celebrate the community project Milvus reaching 10,000 stars on GitHub, I made a video that dynamically shows the project's commit activity and the changes to its directory structure over time.

Visualizing a code repository with Gource

Recently, when a colleague was talking about the "blood and tears" history of maintaining open source projects, he brought this video up again. It reminded me of how painful making it was: the approach at the time was to run gource in Docker. For a repository with a relatively large number of commits, such as Milvus (14,000 commits), generating the visualization video took at least an hour on the i9 processor I had at hand. When I ran the same operation on an M1 device (M1 Pro), the same workload took as long as half a day, perhaps because the application in the Docker image is not optimized for ARM chips, or perhaps because the program version in the image is not new enough.
Either way, that result makes no sense.

The code changes behind an open source project with more than ten thousand commits

Leaving aside the "unexpected" result on the M1, even an hour of video generation time made me quite uncomfortable. As a veteran programmer who cares about efficiency, I spent some time and finally worked out the "correct answer" to this problem: with a program compiled for the M1 chip, the entire video generation time can be shortened to about half an hour, an obvious improvement over before.

Before I explain how I did it, I would like to introduce gource, the open source software behind all this.

About Gource

In 2009, Andrew Caudwell, an engineer from New Zealand, wanted to visualize the history stored in various version control systems, so he wrote Gource in C++. In 2011, after the project migrated from Google Code to GitHub, it settled into a pattern of roughly annual updates.

Fortunately, as of this writing, the software has received two important updates this year, including support for Retina screens and a large number of font-scaling fixes. The regular expression library it uses has been upgraded to PCRE2, and the program version is now 0.53.

Because the project only provides a Windows build on its GitHub release page, if we want the new version for Linux or macOS, we have to compile it ourselves. (The version in the Ubuntu APT repository is still stuck at 0.51, released in 2019.)

Next, let's talk about how to compile it. If you want to use Docker or an x86 device, you can skip to the later sections of this article.

Compile Gource on M1 devices

To compile the new version on macOS, we first need to install gource's dependencies:

 brew install pkg-config freetype2 pcre2 glew sdl2 sdl2_image boost glm autoconf

If the dependencies above are not installed, running ./configure will fail with errors such as the following:

 checking for FT2... configure: error: in `/Users/soulteary/lab/gource-0.53':
configure: error: The pkg-config script could not be found or is too old.  Make sure it
is in your PATH or set the PKG_CONFIG environment variable to the full
path to pkg-config.

...

No package 'libpcre2-8' found

...
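
Before running ./configure, it can help to ask pkg-config directly which libraries it can see. The following is a minimal sketch and not part of the original build steps: the helper name missing_modules is made up for illustration, and every module name other than libpcre2-8 (which appears in the error above) is an assumption about what the configure script looks for.

```shell
# Print every pkg-config module from the argument list that cannot be found.
# missing_modules is a hypothetical helper, not something gource ships.
missing_modules() {
    for module in "$@"; do
        # --exists returns non-zero when no .pc file for the module is found
        pkg-config --exists "$module" || printf '%s\n' "$module"
    done
}

# Example call (module names besides libpcre2-8 are assumptions):
# missing_modules freetype2 libpcre2-8 glew sdl2 SDL2_image libpng
```

An empty result suggests that pkg-config can locate everything in the list. Boost and glm are handled separately, as described next.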

After installing the dependencies, we also need to set the build parameters so that the program can find them during compilation. Otherwise, errors similar to the following will occur:

 checking for boostlib >= 1.46 (104600)... configure: We could not detect the boost libraries (version 1.46 or higher). If you have a staged boost library (still not installed) please specify $BOOST_ROOT in your environment and do not give a PATH to --with-boost option.  If you are sure you have boost installed, then check your version number looking in <boost/version.hpp>. See http://randspringer.de/boost for more documentation.
configure: error: Boost Filesystem >= 1.46 is required. Please see INSTALL

...

configure: error: GLM headers are required. Please see INSTALL

...

For Boost, we can specify its directory simply with the --with-boost parameter. For glm (OpenGL Mathematics), which is a header-only math library, we have to pass the include path to configure through CPPFLAGS.

But how do we find where brew installed glm or boost on macOS? The following two methods can be used together.

The first way is to use the brew list command to get a detailed file listing for an installed package and pick the correct directory out of the output. Taking boost as an example, running brew list boost prints output similar to the following:

 /opt/homebrew/Cellar/boost/1.78.0_1/include/boost/ (15026 files)
/opt/homebrew/Cellar/boost/1.78.0_1/lib/libboost_atomic-mt.dylib
/opt/homebrew/Cellar/boost/1.78.0_1/lib/libboost_chrono-mt.dylib
...

Here, /opt/homebrew/Cellar/boost/1.78.0_1/ is the root directory of boost. Turn this path into the parameter --with-boost=/opt/homebrew/Cellar/boost/1.78.0_1/ and boost can be used in the build.

The second way is to use the pkg-config tool to print the directory flags needed for compiling a C++ project, which is especially suitable for libraries like glm. Running pkg-config glm --libs --cflags prints a flag that can be used directly when compiling:

 -I/opt/homebrew/Cellar/glm/0.9.9.8/include

Combining the above, it is not hard to arrive at the complete configure command for the M1 device:

 ./configure --with-boost=/opt/homebrew/Cellar/boost/1.78.0_1/ CPPFLAGS="-I/opt/homebrew/Cellar/glm/0.9.9.8/include"

When the command finishes, barring surprises, we will see log output similar to the following:

 checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... build-aux/install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking build system type... arm-apple-darwin21.4.0
checking host system type... arm-apple-darwin21.4.0
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for style of include used by make... GNU
checking dependency style of g++... gcc3
checking for timegm... yes
checking for unsetenv... yes
checking how to run the C++ preprocessor... g++ -E
checking for X... disabled
checking for a sed that does not truncate output... /usr/bin/sed
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking dependency style of gcc... gcc3
checking for the pthreads library -lpthreads... no
checking whether pthreads work without any flags... yes
checking for joinable pthread attribute... PTHREAD_CREATE_JOINABLE
checking if more special flags are required for pthreads... -D_THREAD_SAFE
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking whether we are using the Microsoft C compiler... no
checking windows.h usability... no
checking windows.h presence... no
checking for windows.h... no
checking for GL/gl.h... no
checking for OpenGL/gl.h... yes
checking for OpenGL library... -framework OpenGL
checking for GL/glu.h... no
checking for OpenGL/glu.h... yes
checking for OpenGL Utility library... yes
checking for varargs GLU tesselator callback function type... no
checking for pkg-config... /opt/homebrew/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for FT2... yes
checking for PCRE2... yes
checking for GLEW... yes
checking for SDL2... yes
checking for PNG... yes
checking for IMG_LoadPNG_RW... yes
checking for IMG_LoadJPG_RW... yes
checking for boostlib >= 1.46 (104600) includes in "/opt/homebrew/Cellar/boost/1.78.0_1//include"... yes
checking for boostlib >= 1.46 (104600) lib path in "/opt/homebrew/Cellar/boost/1.78.0_1//lib/arm-darwin21.4.0"... no
checking for boostlib >= 1.46 (104600) lib path in "/opt/homebrew/Cellar/boost/1.78.0_1//lib"... yes
checking for boostlib >= 1.46 (104600)... yes
checking whether the Boost::System library is available... yes
checking for exit in -lboost_system... yes
checking whether the Boost::Filesystem library is available... yes
checking for exit in -lboost_filesystem... yes
checking glm/glm.hpp usability... yes
checking glm/glm.hpp presence... yes
checking for glm/glm.hpp... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: executing depfiles commands

Next, run make, and the system will start compiling gource. After the compilation completes, run sudo make install, and gource is built and installed.
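
The configure and build steps can also be collected into a small helper so that no versioned Cellar path has to be typed by hand. This is only a sketch under a few assumptions: Homebrew is installed, the function is run from inside the gource source directory, and the name build_gource is made up for illustration. It relies on brew --prefix, which prints a version-independent install path for a formula, and on getconf to pick a parallel job count.

```shell
# Resolve Homebrew install prefixes at run time, then configure and build.
# build_gource is a hypothetical helper; run it from the gource source directory.
build_gource() {
    # `brew --prefix <formula>` prints a path that does not contain
    # the version number, unlike the Cellar paths shown above.
    boost_root="$(brew --prefix boost)" || return 1
    glm_root="$(brew --prefix glm)" || return 1

    ./configure --with-boost="$boost_root" CPPFLAGS="-I$glm_root/include" || return 1

    # One make job per CPU core; fall back to 1 if the count is unavailable.
    ncpu="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    make -j"$ncpu"
}
```

The install step stays the same as before: sudo make install. If the parallel build misbehaves, a plain make is the safe fallback.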

Code repository visualization with Gource on M1 devices

Before using gource to make a video, we need to estimate the disk space the project will require. The size of the generated output depends heavily on the number of commits in the repository, the total number of files and directories, and how long the project has been maintained. Take the Milvus repository mentioned earlier as an example.

This repository has been maintained since 2019 and has about 14,000 commits so far. If we want to generate a 1280x720 video and show each day of commits for 1 second, the process will output more than 370 GB of temporary data (a stream of PPM frames), so make sure you have enough free disk space before starting the visualization.
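
For a rough feel of where such numbers come from, the size of a raw frame stream can be estimated with simple arithmetic. The sketch below assumes uncompressed 24-bit RGB frames (3 bytes per pixel) and ignores the small header on each frame. The function name ppm_stream_bytes is made up for illustration, and the width and height should be the pixel dimensions of the frames actually written, which I assume can be larger than the viewport on a high-DPI display.

```shell
# Estimate the size in bytes of a raw PPM frame stream.
# Assumes 3 bytes per pixel (24-bit RGB) and ignores per-frame headers.
# Arguments: frame width, frame height, frames per second, video length in seconds.
ppm_stream_bytes() {
    echo $(( $1 * $2 * 3 * $3 * $4 ))
}

# Example: one minute of 1280x720 frames at 30 frames per second
ppm_stream_bytes 1280 720 30 60   # prints 4976640000
```

That is roughly 83 MB for every second of video at this resolution and frame rate, so a project with a long history adds up quickly.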

Download the code repository for visualization

The first step is to download the repository we want to visualize to the local machine, for example:

 git clone https://github.com/milvus-io/milvus.git

Visual rendering with Gource

Next, we tell gource the maximum resolution of the video we want, along with a few key details:

  • How long each day should be displayed in the video (1 second in this example)
  • The maximum frame rate of the video (30 frames per second in this example)
  • The output file name, and the directory of the repository we just downloaded with git clone

 gource --viewport 1280x720 \
    --high-dpi \
    --seconds-per-day 1 \
    --output-ppm-stream milvus.ppm \
    --output-framerate 30 \
    milvus

Run the above command, and the program will open a preview window and start visualizing each commit in the repository along with the directory structure at that time.

Frame-by-frame rendering with Gource

After a relatively long wait (about 19 minutes), the command finishes and we get a temporary file containing all of the repository's commit and directory-change information: milvus.ppm.

Generate the final video file using ffmpeg

The file from the previous step is 370 GB in size. To get a file that is convenient for later editing or distribution on various platforms, we also need to use ffmpeg to transcode it.

If you have not installed ffmpeg, you can use the following command to install it (the version used in this article is 5.0.1).

 brew install ffmpeg

To generate a milvus.mp4 file in H.264 format with good compatibility, you can use the following command:

 ffmpeg -y -r 30 -f image2pipe -loglevel info -vcodec ppm -i ./milvus.ppm -vcodec libx264 -preset medium -pix_fmt yuv420p -crf 1 -threads 0 -bf 0 ./milvus.mp4

ffmpeg running at full throttle on an M1 device

Wait patiently for the command to complete (about 14 minutes), and we get a video file with a cool result. Compared with the 370 GB temporary file from the previous step, the video file is much smaller and needs only about 12 GB of space.
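
If disk space is tight, the intermediate file can be avoided by piping the frames straight into ffmpeg: gource writes the PPM stream to standard output when the output name is -, and ffmpeg reads from standard input with -i -. The sketch below simply reuses the flags from the two commands in this article. The function name render_piped is made up for illustration, and timings for this variant are not covered here.

```shell
# Render a repository and encode it in one pass, without a temporary PPM file.
# render_piped is a hypothetical helper.
# Arguments: repository directory, output video file.
render_piped() {
    repo_dir="$1"
    out_file="$2"
    # "--output-ppm-stream -" sends frames to stdout; "-i -" makes ffmpeg read stdin.
    gource --viewport 1280x720 \
        --high-dpi \
        --seconds-per-day 1 \
        --output-ppm-stream - \
        --output-framerate 30 \
        "$repo_dir" |
    ffmpeg -y -r 30 -f image2pipe -loglevel info -vcodec ppm -i - \
        -vcodec libx264 -preset medium -pix_fmt yuv420p -crf 1 -threads 0 -bf 0 \
        "$out_file"
}

# Example: render_piped milvus ./milvus.mp4
```

The trade-off is that the frames are not kept, so trying different ffmpeg settings means rendering again.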

Visualizing code repositories with Docker

If you are not chasing conversion speed and can accept running this as an "offline task", you can consider using the gource image from the open source project sandrokeil/docker-files.

Usage is very simple; only one command is required:

 docker run --rm -it -v `pwd`/repo:/repos -v `pwd`/results:/results -v `pwd`/avatars:/avatars -v `pwd`/mp3s:/mp3s sandrokeil/gource:latest

Before running the above command, we need to do some simple preparation:

  • Put our code repository in the repo directory in the current directory.
  • Put the user avatars we plan to replace in the avatars directory.
  • If you want the program to add background music while generating the video, put the mp3 file in the mp3s directory.

When the command finishes, we can find the visualization video in the local results directory.
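
The host directories mounted by the docker run command can be created up front. This is a small sketch whose directory names are taken from that command; the helper name prepare_gource_dirs is made up for illustration. Creating them yourself avoids relying on Docker to create missing mount sources, which on Linux typically end up owned by root.

```shell
# Create the four host directories that the docker run command mounts.
# prepare_gource_dirs is a hypothetical helper.
# Optional argument: base directory (defaults to the current directory).
prepare_gource_dirs() {
    base="${1:-.}"
    mkdir -p "$base/repo" "$base/results" "$base/avatars" "$base/mp3s"
}

# Example: prepare_gource_dirs
```

After that, place the repository under repo and any avatars or music in their directories before starting the container.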

Other

Besides faithfully replaying every commit in the repository, Gource also supports filtering by start and end time through parameters, filtering to the contributions of specified users, and, with the help of the shell, filtering to the changes in a specified directory.

So when we want to review a major release, or celebrate a community member becoming a project maintainer, replaying these "fragments of time mixed with code" as a video may be a good idea.
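
As an illustration of the shell-based directory filter, one option is to export Gource's pipe-delimited custom log, keep only the lines for one directory, and feed the result back in. This is a sketch rather than the exact method used for the Milvus video: the helper name filter_log_by_dir and the example directory, dates and user names are made up, and the file paths in the exported log are assumed to start with a slash.

```shell
# Keep only custom-log entries whose file path starts with the given prefix.
# Custom log lines look like: timestamp|username|type|file|colour
# filter_log_by_dir is a hypothetical helper.
filter_log_by_dir() {
    awk -F'|' -v prefix="$1" 'index($4, prefix) == 1'
}

# Possible usage (the directory name is only an example):
#   gource --output-custom-log full.log milvus
#   filter_log_by_dir /internal/ < full.log > internal.log
#   gource --log-format custom internal.log
#
# Time and user filters are built in, for example:
#   gource --start-date '2021-01-01' --stop-date '2021-12-31' milvus
#   gource --user-show-filter 'alice|bob' milvus
```

Matching on a path prefix rather than a regular expression keeps the helper predictable when directory names contain special characters.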

Finally

I hope this content helps those of you with the same needs.

In the next article on this topic, I will share a relatively lightweight project visualization solution other than gource.

--EOF


This article is licensed under "Attribution 4.0 International (CC BY 4.0)". You are welcome to reprint or modify it, but you must credit the source.

Author of this article: Su Yang

Creation time: May 10, 2022. Word count: 8873 words. Reading time: 18 minutes. Link to this article: https://soulteary.com/2022/05/10/lets-talk-about-code-warehouse-visualization-with -source.html

