Apache Pulsar is a cloud-native messaging and streaming platform that unifies message queuing and stream processing. It provides a unified consumption model that supports both queue and stream scenarios: for queue scenarios it offers enterprise-grade read and write quality of service with strong consistency guarantees, and for stream scenarios it offers high throughput and low latency. Its architecture separates storage from compute and supports enterprise-grade and financial-grade features such as large clusters, multi-tenancy, millions of topics, cross-region data replication, persistent storage, tiered storage, and high scalability.
GitHub address: http://github.com/apache/pulsar/

At the beginning of 2022, He Zhangjian (GitHub: @Shoothzj) from HUAWEI CLOUD, a young but well-known community veteran, was officially selected as an Apache Pulsar Committer.

He Zhangjian is a senior engineer at Huawei Cloud IoT. He graduated from Xidian University in 2017 and has since worked as a Java developer in the IoT department of Huawei Cloud, focusing on cloud native, IoT, message middleware, APM, and related fields.

It is worth mentioning that He Zhangjian, born in 1997, is the second "post-95s" Committer in the Pulsar community, after Zhang Yong (who became a Pulsar Committer in mid-January 2021). The "post-95s" generation is bringing fresh energy to the Apache Pulsar open source project, and the strength of the new generation should not be underestimated!

Congratulations to He Zhangjian on becoming an Apache Pulsar Committer! We also interviewed him about his story with Apache Pulsar. Let's take a look at his open source journey.

Systematically participating in open source for the first time

Messaging systems used in the IoT field often need to meet the following requirements:

  • Support for a massive number of topics
  • Support for multi-tenancy
  • Containerized, cloud-native deployment
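To make the first two requirements concrete, here is a minimal sketch of how Pulsar's fully qualified topic names (persistent://tenant/namespace/topic) let one cluster hold many topics for many tenants. The class `TopicNames`, its validation rules, and the tenant, namespace, and device names are hypothetical and used only for illustration; they are not part of the Pulsar client API.

```java
// Hypothetical helper for illustration only; not part of the Pulsar client API.
final class TopicNames {
    private TopicNames() {}

    /**
     * Builds a fully qualified persistent topic name of the form
     * persistent://tenant/namespace/topic.
     * This sketch rejects empty parts and parts containing '/' so that the
     * three-level structure stays unambiguous.
     */
    static String persistent(String tenant, String namespace, String topic) {
        for (String part : new String[] {tenant, namespace, topic}) {
            if (part == null || part.isEmpty() || part.contains("/")) {
                throw new IllegalArgumentException("invalid topic name component: " + part);
            }
        }
        return "persistent://" + tenant + "/" + namespace + "/" + topic;
    }

    public static void main(String[] args) {
        // One topic per device: the tenant isolates a customer,
        // the namespace groups that customer's topics.
        for (int device = 1; device <= 3; device++) {
            System.out.println(persistent("iot-tenant-a", "telemetry", "device-" + device));
        }
    }
}
```

In an IoT deployment with one topic per device, the topic count grows with the device count, which is why support for a very large number of topics matters alongside tenant isolation.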

These requirements gradually led my team and me to Pulsar and its community. We began our research in April and May 2020, starting by attending TGIP-CN (Thank God It's Pulsar) every week and steadily familiarizing ourselves with the code. At that time, the early Pulsar 2.5 release still had many problems to solve. I got in touch with Li Penghui, Chen Hang, and others to discuss many features and issues, and I submitted my first PR in November 2020. During 2020 and 2021, I used Pulsar in depth and submitted many PRs.

As for my first PR: Pulsar 2.5 had no GC log configuration at the time, although commercial software generally requires one. After I raised the question and got approval, I went ahead and fixed the GC logging issue. I still remember the excitement when that first PR was merged, because I knew the feature might be used by tens of thousands of people, and that tens of thousands of people taking part in the open source community would see my work. That sense of achievement is "open and transparent".
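For readers unfamiliar with GC logging, the following sketch lists the standard JVM options that turn it on. It is not the change made in the PR and does not reproduce Pulsar's startup scripts; the class `GcLogFlags` and the log path are hypothetical. The flags themselves are regular HotSpot options: the `-XX:+PrintGC...` family for JDK 8 and unified logging (`-Xlog`) for JDK 9 and later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the Pulsar code base.
final class GcLogFlags {
    private GcLogFlags() {}

    /** Returns the JVM options that enable GC logging to logFile for the given Java major version. */
    static List<String> forJavaMajorVersion(int major, String logFile) {
        List<String> flags = new ArrayList<>();
        if (major >= 9) {
            // JDK 9+ unified logging: all gc-tagged messages, written to a file,
            // decorated with wall-clock time, uptime, level, and tags.
            flags.add("-Xlog:gc*:file=" + logFile + ":time,uptime,level,tags");
        } else {
            // JDK 8 and earlier use the legacy HotSpot flags.
            flags.add("-XX:+PrintGCDetails");
            flags.add("-XX:+PrintGCDateStamps");
            flags.add("-Xloggc:" + logFile);
        }
        return flags;
    }

    public static void main(String[] args) {
        // The log path is an assumption chosen for this example.
        System.out.println(String.join(" ", forJavaMajorVersion(8, "logs/gc.log")));
        System.out.println(String.join(" ", forJavaMajorVersion(17, "logs/gc.log")));
    }
}
```

Options like these are typically passed to the JVM through the service's startup script or an environment variable, so operators can inspect pause times after an incident.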

Compared with my earlier experience of submitting occasional PRs to other open source projects, the Pulsar community is the first large community to which I have devoted so much time and energy on my own initiative, and its code quality is excellent.

Contributions behind the Committer title

My main contributions in the community concern the reliability of the Java Client and the Broker. I have also contributed many PRs to the Pulsar Go Client. Because most of the main repository is written in Java, clients in other languages have received fewer contributions and their features have not yet caught up, so those clients still need to be filled out.

I care a lot about DFX (i.e. Design for Failure). Huawei pays close attention to software reliability and security, so when we evaluated Pulsar we tested many uncommon scenarios, such as repeated restarts, forced shutdowns, and physical machines going offline. As a result, I have also made contributions in the community around reliability and security features.

Open source community vs work environment

"Volunteer" mechanism

When I first joined the community, I was not very comfortable with its volunteer-based environment. At work, a problem can be solved by finding the person responsible for it, but community members have no strong stake in each other's problems. Mutual help and question answering in an open source community are entirely voluntary. So when you turn to the community for help in an emergency, you need to describe the problem clearly and lay out its causes and consequences; otherwise you may not get a timely solution or feedback. For this reason, I have developed the habit of describing the full context when raising issues or asking questions in the community, which makes communication more efficient.

Verification and Testing

Work and the community also differ in testing. In most workplaces, the product design process runs from product manager to engineer to tester. The open source community has no complete functional testing or large-scale integration testing when a PR is submitted; only at release time is a version verified by PMC members and by some community developers and users. This way of collaborating pushes the community to take unit testing seriously: whenever a problem is found, unless the case is highly complex or unit testing does not apply, about 80% of fixes are covered by unit tests. These ideas and processes have strongly influenced my own philosophy and what I ask of my team. I often think about how integration testing and unit testing can coexist and complement each other. I am working hard to put this into practice, and I encourage others to add unit tests when fixing bugs.
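The habit of pairing each bug fix with a unit test can be sketched as follows. Everything here is hypothetical and not taken from the Pulsar code base: `Backoff.delayMs` stands in for a fixed method, and the imagined bug is that a naive `initialMs << attempt` gives wrong results for large attempt counts. The checks use plain Java rather than a test framework so that the example stays self-contained.

```java
// Hypothetical example; neither class exists in Pulsar.
final class Backoff {
    private Backoff() {}

    /** Delay before retry number attempt (0-based): initialMs doubled per attempt, capped at maxMs. */
    static long delayMs(long initialMs, long maxMs, int attempt) {
        if (initialMs <= 0 || maxMs < initialMs || attempt < 0) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = initialMs;
        for (int i = 0; i < attempt && delay < maxMs; i++) {
            // Compare against maxMs / 2 before doubling so the multiplication cannot overflow.
            delay = (delay > maxMs / 2) ? maxMs : delay * 2;
        }
        return delay;
    }
}

final class BackoffRegressionTest {
    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // Normal behaviour: 100 -> 200 -> 400 -> 800.
        check(Backoff.delayMs(100, 60_000, 0) == 100, "attempt 0 should return the initial delay");
        check(Backoff.delayMs(100, 60_000, 3) == 800, "attempt 3 should double three times");
        // Regression checks for the fixed bug: large attempt counts must hit the cap
        // and must never produce a zero or negative delay.
        check(Backoff.delayMs(100, 60_000, 64) == 60_000, "large attempts should be capped");
        check(Backoff.delayMs(1, Long.MAX_VALUE, 200) > 0, "delay must stay positive");
        System.out.println("all checks passed");
    }
}
```

The regression checks encode the exact inputs that exposed the bug, so the same mistake is caught by the build if it is ever reintroduced.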

Code writing and review

Code-writing practices also differ between work and the community. Many companies have their own strict coding guidelines. In the community, as long as a style is not prohibited, existing code stays as it is unless a change has a clear advantage or a good reason behind it.

Code submission and review also work very differently. At work, people are used to submitting all their code at once for unified testing. The community, by contrast, requires each PR to have a clear topic and expects code to be split across several submissions. Each PR is then small, the Git history is cleaner, and problems are easier to spot, so feedback and suggestions arrive sooner. I have learned a lot from this.

Open source: from shallow to deep

Community members at home and abroad are very enthusiastic and generous in sharing their experience and technical insights, which has given me a great deal of valuable experience. I can now feel that fixing bugs in the Pulsar project is harder than it was two years ago: the community is growing, Pulsar keeps evolving, the main scenarios are increasingly solid, and many problems have already been resolved. From here on, anyone who wants to enter the community by finding bugs, fixing bugs, or improving documentation will need to put more effort into learning. The bar for joining is rising, and everyone has a lot to learn.

On getting started with Pulsar, I have the following three suggestions:

  1. At present the community only has the "help wanted" label. The community could build a classification mechanism for documents and issues and mark their difficulty, so that both newcomers and veterans can pick tasks that match their abilities.
  2. Newcomers can work through the features one by one by following the community's operation documents, and report problems as soon as they find them, such as code that does not run or errors in the documentation. Boldly trying to fix those problems is a faster way into the community.
  3. How should you read the source code? First, do not read line by line, because your reading speed cannot keep up with the pace of code updates. Pulsar currently has hundreds of thousands of lines of code, and as the number of commits grows, the code changes quickly. The module layout also differs from earlier versions. So focus on the relationships between modules and on the main path. If possible, step through the code in a debugger to connect the whole flow end to end, and then read in depth the parts that interest you.

Committer - a stronger sense of belonging

As a Committer, I feel a stronger sense of belonging and responsibility to the community. Actively handling issues and answering questions in WeChat groups now feel like a mission. From Volunteer to Contributor to Committer, the change in identity also brings a qualitative change in attitude.

It is worth mentioning that in 2021 I was the only contributor on my team. This year, our team plans to devote more people to contributing to the community. I also hope to use my influence to encourage others to contribute.

Thanks

Thanks to the Pulsar community for providing such an open and dynamic platform where everyone can contribute code and grow together. I have benefited a great deal from it and deeply feel the sense of self-worth that contributing brings. The community already has hundreds of contributors. I believe more and more people will take part in the future and the open source atmosphere will become ever more active. I would also like to thank my team leaders for supporting my collaborative work with the community.

Interview comments

He Zhangjian loves to share. He has taken part many times in industry conferences and community events such as Pulsar Summit Asia 2020, ApacheCon Asia 2021, and Pulsar Meetup, where he has presented his own and his team's practice with Apache Pulsar. Throughout the interview, his eloquence made his love for and commitment to open source and the Apache Pulsar project easy to feel. We thank him for showing us the Pulsar open source community as he experiences and understands it.

Join the Apache Pulsar community

Participating in open source can earn you recognition inside and outside your company and in the community, and lets you make like-minded friends from many fields. It can also increase your personal influence and support your personal development. Open source is not only for coders: community work, documentation, and other areas give everyone a place to use their skills.

Apache Pulsar is a global open source project that, as of now, has 513 contributors and 2.7K+ forks. We have prepared a participation guide, and we welcome more and more people to help the Apache Pulsar project keep developing and improving.

Apache Pulsar Official Contribution Guide

Interviewee: He Zhangjian @HUAWEI
Interviewers: Chicken Chop @StreamNative, Haiqi @StreamNative

