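For work, I need to download only the commit objects in .git/objects from a remote git repository. The .git metadata contains some useful information. However, pack files contain commit, tree, and blob objects together, and the tree and blob objects are large, so I do not want to download them. I may need to rewrite the client or write a Python script.
I have seen some projects (https://github.com/lijiejie/G...) that download files step by step by parsing the index, and after the download the commit/tree/blob objects are stored separately.
But this approach accesses the .git folder over HTTP, and the index file cannot be downloaded directly from GitHub. I do not understand how GitHub and the git client communicate.
I tried reading the underlying C and Java source code, but the complicated structure confused me.
Is there a way to download only the commit objects? Or how can I locate the key code in the underlying source?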
I am not sure whether there are non-Chinese speakers on segmentfault, so here is the English version as well:
For my work, I need to download only the commit objects in .git/objects from a remote git repository. The .git metadata contains some useful information. However, a pack file includes commit, tree, and blob objects together. The tree and blob objects are too large, so they should not be downloaded. Maybe I need to rewrite the git client or write a Python script.
I saw some projects (like https://github.com/lijiejie/G...) that download the files step by step by parsing the index file, and in the end the commit/tree/blob objects are stored separately after the download.
But this method needs to access the .git folder via HTTP, and you cannot download the index file directly from GitHub.
I don't understand how GitHub communicates with the git client.
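From what I have read so far, a git client talking to an HTTP(S) remote first asks the server for its list of refs at `<repo>/info/refs?service=git-upload-pack`, and the reply is framed in "pkt-lines" (a 4-digit hex length followed by the payload). The sketch below is only my attempt to decode that first reply in Python. The function names are my own, it assumes the older (v0) advertisement format, and I have not confirmed that GitHub answers a plain `urllib` request this way.

```python
import urllib.request


def parse_pkt_lines(data: bytes):
    """Split a pkt-line stream into payloads.

    Each packet starts with 4 hex digits giving its total length
    (including those 4 digits). Lengths below 4 are special markers
    such as the flush packet "0000"; they are returned as None.
    """
    packets = []
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length < 4:
            packets.append(None)  # flush or other special packet
            pos += 4
            continue
        packets.append(data[pos + 4:pos + length])
        pos += length
    return packets


def parse_ref_advertisement(data: bytes):
    """Return {ref name: object id} from a v0 ref advertisement."""
    refs = {}
    for payload in parse_pkt_lines(data):
        if payload is None or payload.startswith(b"#"):
            continue  # skip flush packets and the "# service=..." banner
        # The first ref line carries capabilities after a NUL byte.
        line = payload.split(b"\0", 1)[0].rstrip(b"\n")
        object_id, name = line.split(b" ", 1)
        refs[name.decode()] = object_id.decode()
    return refs


def fetch_refs(repo_url: str):
    """Ask a smart-HTTP remote for its refs (performs a network request)."""
    url = repo_url.rstrip("/") + "/info/refs?service=git-upload-pack"
    with urllib.request.urlopen(url) as response:
        return parse_ref_advertisement(response.read())
```

Calling `fetch_refs("https://github.com/git/git")` should, if my understanding is right, return a dictionary of branch and tag names mapped to commit ids. The actual objects are then requested with a second POST to `<repo>/git-upload-pack`, which this sketch does not cover.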
I tried to read the underlying source code in C (https://github.com/git/git) and Java (https://github.com/eclipse/jg...), but the complicated structure puzzled me.
Is there any way to download only the commit objects? Or how can I find the key code in the underlying source code?
If you know how to do this, please let me know. Thank you very much.
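One direction I have come across but not fully verified is git's partial clone feature: `git clone --filter=tree:0` asks the server to leave out all tree and blob objects, which sounds close to what I want. A minimal Python sketch follows. The helper names are my own, and it assumes a reasonably recent git on the PATH and a server that allows filtered fetches.

```python
import subprocess


def commit_only_clone_cmd(repo_url: str, dest: str):
    """Build a clone command that requests no trees or blobs.

    --bare avoids a checkout, which would otherwise fetch the trees
    and blobs of HEAD on demand.
    """
    return ["git", "clone", "--bare", "--filter=tree:0", repo_url, dest]


def commit_log_cmd(dest: str):
    """Build a command that lists commits without touching trees."""
    return ["git", "-C", dest, "log", "--all", "--format=%H %an %s"]


def clone_commits_only(repo_url: str, dest: str):
    """Run the filtered clone and return the commit log as text."""
    subprocess.run(commit_only_clone_cmd(repo_url, dest), check=True)
    result = subprocess.run(
        commit_log_cmd(dest), check=True, capture_output=True, text=True
    )
    return result.stdout
```

For example, `clone_commits_only("https://github.com/git/git", "git-commits.git")` would be the call I would try first. Commands that need file contents (such as `git show` with a patch) would presumably trigger extra downloads, so I would stick to log-style queries.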