In a keynote at the recent ApacheCon Asia conference, Han Qing, co-founder and CEO of Kyligence and Apache Kylin PMC member, gave a speech titled "From Open Source to Product, Thinking and Practice of Open Source Project Productization". He shared Apache Kylin's latest progress and future plans, compared technology thinking with product thinking, and discussed how to operate open source communities and projects through product thinking. Read on for the details.
The following is the transcript of Han Qing's speech at the conference.
Hello everyone! I'm Han Qing, and I'm very happy to speak with you at ApacheCon Asia today. I remember that the last time I attended ApacheCon was in Vancouver in 2016. At that time, Apache Kylin had just graduated as the first top-level project from China. We presented it at ApacheCon and took part in the international community. Over the past five years, we have seen many projects from China enter the incubator and then graduate to become top-level projects. Our voice in the community as a whole is also growing. I am very happy to see our strength, technology, and content increasingly heard in the global open source community.
I believe many speakers today will cover open source culture, community operations, technology, and other topics. I would like to share some of our experiences and views from another perspective, the product perspective: how to use product thinking to run an open source project. Today's talk consists of the following three parts:
- Introduction and future plans of Apache Kylin
- Technology VS products in open source projects
- How to operate an open source community/project through product thinking
Introduction and future planning of Apache Kylin
Apache Kylin was contributed to the ASF by the eBay China R&D Center in 2014 and became an incubator project. In December 2015, Apache Kylin graduated as a top-level project, the first top-level project from China. IPMC's VP Ted Dunning also spoke very highly of us at the time. He said that Apache Kylin represents the contribution and participation of China and other Asian countries in the international open source community.
In the few years since its birth, Apache Kylin has gained more than 1,500 customers. Global customers range from our former company eBay to Cisco, Walmart, Apple, Amazon, Microsoft, and Europe's OLX Group. There are also many domestic customers: many sizeable Internet companies use Apache Kylin as an indispensable component of their big data analysis stack. We are also very happy to see more and more community members contributing to the evolution and iteration of Apache Kylin.
What is Apache Kylin used for? As shown in the architecture diagram, it serves as a core piece of the traditional data warehouse: the data mart or OLAP layer. Users define the corresponding data model in Kylin, such as a star model, snowflake model, or constellation model. In Kylin 4.0.0-beta, released at the beginning of this year, we removed the dependency on HBase and can use Parquet directly as storage, which is better suited to cloud applications in the cloud-native era. This is also one direction of our next stage of product evolution.
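To make the idea of a data model and the OLAP layer more concrete, here is a minimal sketch in Python of a star model and a precomputed cube. It is only an illustration under stated assumptions, not Kylin's implementation or API: the table names, columns, and the `build_cube` and `query` helpers are all hypothetical, and a real system would store the aggregates in files such as Parquet rather than in memory.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical star model: one fact table plus two dimension tables.
sales_fact = [
    {"date_id": 1, "product_id": 10, "amount": 100.0},
    {"date_id": 1, "product_id": 11, "amount": 50.0},
    {"date_id": 2, "product_id": 10, "amount": 70.0},
]
dim_date = {1: {"year": 2020}, 2: {"year": 2021}}
dim_product = {10: {"category": "phone"}, 11: {"category": "laptop"}}

DIMENSIONS = ("year", "category")


def denormalize(fact_rows):
    """Join each fact row with its dimension rows (the star join)."""
    for row in fact_rows:
        yield {
            "year": dim_date[row["date_id"]]["year"],
            "category": dim_product[row["product_id"]]["category"],
            "amount": row["amount"],
        }


def build_cube(fact_rows):
    """Precompute SUM(amount) for every combination of dimensions."""
    flat_rows = list(denormalize(fact_rows))
    cube = {}
    for size in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, size):
            totals = defaultdict(float)
            for row in flat_rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row["amount"]
            cube[dims] = dict(totals)
    return cube


def query(cube, dims, key):
    """Answer an aggregate query by lookup instead of rescanning the facts."""
    return cube[tuple(dims)].get(tuple(key), 0.0)


cube = build_cube(sales_fact)
print(query(cube, ["category"], ["phone"]))
```

The design choice this sketch tries to show is the trade-off behind an OLAP layer: more work and storage at build time in exchange for cheap lookups at query time.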
In addition, Kylin has evolved in many other ways, such as supporting real-time capabilities, including the use of Flink for the corresponding processing. Throughout this process much keeps changing, but we always serve users' OLAP needs. Technology in the industry has also been developing continuously. The Kylin community hopes that through continuous innovation it can combine these developments with its own strengths and bring more value to community users. If you are interested in these technologies or topics, please join the Kylin community to discuss and contribute.
Case sharing
Let me introduce two simple cases. The first is a telecommunications company from Europe. I learned about this case at an exhibition in Spain. They analyze the network quality of the entire country, including the model and version of the mobile phones in use and related information, in order to improve the monitoring and management of overall network service quality. They only need a very small Hadoop cluster, which supports a large number of applications at a small cost.
The second case is OLX Group, a cross-border e-commerce platform from Germany. OLX Group is part of the global Internet giant Prosus, which has also invested in companies such as Tencent. They deploy Apache Kylin on K8s, while Kylin's HBase cluster is hosted on Amazon EMR together with Hadoop HDFS, and the data is backed up to S3. The data architecture also has an automatic restoration process, which can restore all environments from S3 at any time when a crash is detected in the deployment. OLX Group uses OKTA for SAML federated identity authentication at user login, and OpenLDAP for user authorization. Analysts and non-technical users get a consistent, comprehensively monitored, stable, and scalable cross-team environment in which they can easily build cubes and use Apache Kylin. For more details on this case, see "Kylin on AWS Cloud Operation and Maintenance Practice".
Kylin version iteration
Next, I will introduce the current version of Apache Kylin.
In Kylin 4.0.0-beta, released this year, we removed the dependency on HBase and added support for Parquet-based storage, and Apache Kylin 4 has been tested and put into production at different companies. For example, community users such as Youzan have already shared some performance optimization and operations practices at a Meetup, and the results are quite good.
We will take on several more important tasks this year. One is to support Spark 3, so that Spark's latest capabilities can be adopted quickly. In addition, Apache Kylin is known for its performance in many cases, but the pursuit of performance never ends. We plan to bring in new technologies this year, including LocalCache and SoftAffinity. Although storage and compute are separated, soft affinity keeps computation close to the data it has already cached. This is a relatively new field for us, and we are constantly exploring. I hope you find points of interest here that we can discuss further in the community. We have done the relevant research and can see that these changes can make a very big difference, continuously improving the performance and stability of the system.
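As a rough illustration of the soft affinity idea, the sketch below prefers to send the same file to the same executor so that its local cache stays warm, and falls back to any free executor when the preferred ones are busy. This is an assumption-laden sketch, not Kylin's actual scheduler: the `preferred_executors` and `assign` functions, the executor names, and the file path are hypothetical, and rendezvous hashing is simply one common way to get a stable preference order.

```python
import hashlib


def preferred_executors(file_path, executors, replicas=2):
    """Rank executors for a file with rendezvous (highest-random-weight) hashing."""
    def weight(executor):
        digest = hashlib.sha256(f"{executor}:{file_path}".encode()).hexdigest()
        return int(digest, 16)

    # The same file always yields the same ranking, so its cache stays warm.
    return sorted(executors, key=weight, reverse=True)[:replicas]


def assign(file_path, executors, busy):
    """Soft affinity: prefer a cache-warm executor, else fall back to any free one."""
    for executor in preferred_executors(file_path, executors):
        if executor not in busy:
            return executor
    for executor in executors:
        if executor not in busy:
            return executor
    return None  # every executor is busy


# Hypothetical cluster and file, for illustration only.
executors = ["exec-1", "exec-2", "exec-3"]
path = "s3://example-bucket/cube/part-0001.parquet"
print(assign(path, executors, busy=set()))
```

The affinity is "soft" because the preference is only a hint: when the cache-warm executors are unavailable, the task still runs elsewhere and reads from remote storage.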
On the other hand, we will keep looking for ways to remove the dependency on Hadoop as a whole. Although Apache Kylin was born as OLAP on Hadoop, with the rapid development of cloud computing in the past two years, cloud native has become the general trend, and we will spend more energy this year on embracing it.
Thanks to the pluggable architecture Apache Kylin has had from its earliest days, we can replace the underlying storage and the corresponding dependencies at any time, and we are also gradually migrating to K8s.
In the future, we will continue to invest more in cloud native as a whole. The core goal is to transform Apache Kylin from something that relies on Hadoop to do OLAP into a purely self-contained OLAP capability. We will also completely migrate resource scheduling and its dependencies to K8s, and we will move storage toward object storage. We also hope to keep replacing some other components with more general ones, so that the whole system has fewer dependencies and is simpler to deploy.
We expect to get there next year: if users provide a set of machines or some K8s resources, we can deploy and run directly on them without any Hadoop dependency. At the same time, we will ensure that the whole system transitions smoothly, which is very important for customers who use Hadoop now; they can protect their existing investments without having to worry about the future transition. Many things repeat themselves. I believe that in the next three to five years, the rise of cloud computing and cloud native will definitely have some impact on Hadoop. How do we make the transition smoothly? How do we migrate these existing applications at minimal cost? I believe this is a direction worth exploring and investing in.
Technology VS Product
Next, I will share some thoughts we have gained in the process of building open source projects and operating communities. On the technology side, this conference has many technical experts and industry leaders, but the product side may be somewhat overlooked. Today I want to discuss how technology and product relate.
From this picture, you can see that products and technologies are actually different. Technology research and development is more about breaking through a certain technical challenge and then innovating, while products often turn technologies or ideas into market offerings that serve more users and application scenarios. The starting points of the two are different. Technology often means in-depth research and investment starting from a single point, while a product may first need to consider the market situation, production cost, and other issues. The two are different, but they are closely connected.
Technology is about making yourself feel cool, and a product is about making others feel cool. Technology makes us feel cool when we have built a good algorithm, architecture, or framework and get a great sense of accomplishment. But when we make products, the situation is different, because it is not enough for us to feel cool ourselves. We must make others feel cool: users should be able to use the product happily and comfortably.
Take Apache Kylin as an example. We were very excited when we first built it, but the first three months after open sourcing were very painful, because many users in the community ran into all kinds of problems with compatibility, compilation, adaptation, and so on. What we have come to feel most strongly over the past two years of work is that getting a technical point right is far from enough. How do we let more people use the technology? In other words, how does the product help others use it well? There is often much more to think about than the technical point of view alone, and much more that has to be done.
Another point: technology focuses more on the problem itself, while a product focuses more on value. With technology, we often encounter challenges such as performance, concurrency, or certain algorithms, and we turn them into a solution by drawing on techniques, papers, and skills. But from the product point of view, what matters is how to turn that into something valuable once technology has solved the problem. This is not to say that the project should be monetized, because the open source project itself is free. What we should pay more attention to is how to enable users to obtain value from the project. This is actually a very challenging point.
When I first went to the United States for community exchanges, many people asked why we had open sourced the whole GUI as well. This is a very important point. If we only focused on the technology itself, could sharing just a script solve the problem? Maybe it could, but users would have to do a lot of work to use it. We open sourced the GUI directly in the hope that users could use the project right away. When the product becomes easier to use, it brings more value to users, and users can focus on the business value after their problem is solved, rather than on the technology itself. The pursuit of performance is endless, but we still need to find a balance between technology and product through innovative approaches.
I sum it up in one sentence: without good technology, a product cannot be competitive, but without a good product, technology has no vitality. Once excellent technology is built, it has to become a product that people actually use. Good technology needs a good product to go with it, and a good product needs technology to support it.
We can see that open source is currently the best and fastest way to continuously polish technology-based products. Through open source, a project can mature, be used, and even educate the market. Especially in the past two years, the open source community and its projects have received more attention, and I hope that more friends will be inspired to take part in and contribute to open source, so that the open source community keeps growing.
How to operate open source communities and projects through product thinking
Next, I will share how to operate an open source project like a product. After participating in open source for so many years, my suggestion to everyone is to think about it from more than a technical perspective, because even a very simple open source project, or even a small tool, needs continued promotion and evolution to find users. In essence, this is the same as making a product. We often joke that the product manager is the CEO of a product. In the same way, the person in charge of an open source project is actually its CEO. How to operate the product and the community is a question that goes beyond the technology itself.
The picture above shows Product Led Growth. It is often used to describe the development stages of commercial products, but the same applies to open source projects, except that the process may not require marketing and sales teams, because the open source community itself can already play that role.
Why open source at this stage? When a new technology or product goes to market, open source lets users adopt it earlier and at a lower cost. In the later stages, however, there is one thing that cannot be ignored. You can see a Customer Success Team in the picture. Its counterpart in the open source community is continuous community operations and user support.
From the early days of Apache Kylin's open sourcing until now, supporting the community has kept us very busy. You can also see that throughout this process, our purpose is to get open source users to actually use the project, which coincides closely with this life cycle.
In addition, if you want to make an open source project bigger, I highly recommend taking a look at the Go-to-Market Model from A16Z. There are two different modes, one called Top down and the other called Bottom up. They open up the market from different directions, so I won't go into more detail here.
Open sourcing a project is actually just the beginning. The person in charge of operations cannot do all the work alone, so a team must be built accordingly. Recently, I have also seen more and more Chinese open source projects invest heavily in this area, which has indeed brought huge impact and gains. I also hope that this picture gives you more ideas about open source projects and product thinking.
Recently, I have chatted with many friends in open source. When operating open source projects, it is easy to get stuck at a certain point. For example, many people have built an open source project and gained a lot of users, but when they want to pursue a bigger dream with it, they often find that there are many challenges.
This picture can help us establish product capabilities, match industry needs, and solve customers' real pain points in order to gain more users. Take Apache Kylin as an example. At that time, doing large-scale data analysis on Hadoop was very difficult and very inefficient. Through Apache Kylin's OLAP Server, users can get analysis reports in the shortest time, without spending a lot of time running various scripts. When users have a pain point, we have a very good solution for it, and that solution is broadly applicable, customers will use it more and more. When we design our own open source projects or technologies, we must answer questions such as product value and positioning. If we cannot answer these questions from a business perspective, the product may be taken in the wrong direction.
Finally, I will share the Valley of Death with you. Although it is usually used to evaluate startups, the essence is the same for open source projects. Every open source project has its own life cycle, and of course many open source projects cannot escape the valley of death.
Recently, everyone has seen that many Apache projects have been retired, probably because they had lost some of their value. When operating an open source project, I hope everyone keeps this in mind: open sourcing something does not mean that someone is bound to use it. The project will also go through the valley of death. What we can all do is keep our open source projects evolving and iterating to bring greater value to users.
For example, in the five years since Apache Kylin was open sourced, if we had only provided Hadoop-based solutions, the project might slowly disappear in a few years. As the community develops, we keep discussing this, hoping to get through one valley of death after another. Since last year, we have gradually embraced cloud native. I hope this picture gives everyone more to think about, especially those in charge of open source projects, who are responsible both to themselves and to their communities.
Thank you very much. I hope to have the opportunity to get together with you and talk about how to make open source better, so that our communities not only grow large and influential in China, but also have an impact on the world.