Lead: The annual 11.11 is here again, and the best gift for technical people is a technical guide! After years of development, shopping festivals are no longer confined to the e-commerce industry. Businesses in all walks of life now run similar promotional campaigns: the automotive industry has 818, and Xiaomi has its Mi Fan Festival. These events pose many new challenges for basic software, and they have also produced many best practices.

Ahead of 11.11, PingCAP is holding a series of in-depth discussions with users including Bitauto, JD.com, and Zhongtong, hoping to reveal what technical problems are hidden behind sales that soar year after year, and what technical architecture can steadily withstand traffic peaks.

The technical challenges of JD 11.11

The annual 618 and 11.11 are major tests for JD.com (Jingdong), and JD Cloud is the cornerstone of JD Group's technical support. During this period, JD's core retail business and the JD Logistics system face the pressure of PB-level data growth. With JD's 11.11 order volume and turnover growing rapidly every year, the JD Cloud database, which sits at the core of most of JD's business systems, is naturally under considerable pressure.

Yang Mu, senior product manager at JD Cloud, feels this deeply: many business systems involved in JD 11.11 depend on databases, for example analytics dashboards, report data, and waybill data. Because of product promotions and discount windows, user orders tend to peak in fixed time periods, and the number of visits to these databases rises rapidly.

His database team can clearly see the peaks in back-end monitoring. When JD 11.11 is in full swing, consumers and orders pour in, brands and merchants quickly set new records, and metrics such as CPU usage and QPS begin to soar, sometimes for several minutes and sometimes for several hours.

How is stability guaranteed?

The JD Cloud database needs to smoothly support the thousands of core business systems that JD Group has moved to the cloud during JD 11.11. Contingency planning, stress testing, contingency drills, and real-time monitoring are all essential. The JD Cloud Database team has accumulated rich experience here, and divides the preparation into 8 steps:

  1. Identify the scope of protection;
  2. Business volume estimation and pre-inspection;
  3. Contingency plan preparation;
  4. Monitoring and alert review;
  5. Business stress test;
  6. Contingency plan drill;
  7. On-duty support during 11.11;
  8. Technical review.

Based on past experience, Yang Mu expects JD's business volume during 11.11 to be 10 times the usual level. A peak of this size requires additional resources, but because JD Cloud's databases already run on the cloud, the team only needs to plan and allocate resources according to the estimated data volume and run sufficient stress tests to make sure the storage capacity and performance of the databases meet the requirements. When the traffic peak actually arrives, it is often enough to wait quietly for it to pass, and extreme situations do not arise.
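
The estimate-then-provision reasoning above can be illustrated with a small calculation. This is only a sketch under stated assumptions: the 10x multiplier comes from the article, while the function name `nodes_needed`, the baseline QPS, the per-node throughput, and the headroom ratio are hypothetical values chosen for illustration, not figures from JD Cloud.

```python
import math

def nodes_needed(baseline_qps: float, peak_multiplier: float,
                 per_node_qps: float, headroom: float = 0.3) -> int:
    """Return how many nodes cover the estimated peak with spare headroom."""
    if per_node_qps <= 0 or not 0 <= headroom < 1:
        raise ValueError("per_node_qps must be positive and headroom in [0, 1)")
    # Estimated peak load: everyday load scaled by the expected multiplier
    peak_qps = baseline_qps * peak_multiplier
    # Keep a `headroom` fraction of each node's measured capacity in reserve
    usable_per_node = per_node_qps * (1.0 - headroom)
    return math.ceil(peak_qps / usable_per_node)

# Hypothetical inputs: 19,000 QPS on a normal day, 10x at peak (from the article),
# 25,000 QPS per node as measured in a stress test, 20% headroom
print(nodes_needed(baseline_qps=19_000, peak_multiplier=10,
                   per_node_qps=25_000, headroom=0.2))
```

The per-node figure is exactly what a stress test is meant to supply, which is why the estimation and stress-testing steps go together.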

In particular, most of JD Logistics has already moved to the cloud, so support and preparation work is in effect always in progress. The cloud database relies on a series of database-level mechanisms, such as a high-availability architecture, automatic failover, and elastic scaling, to ensure that data can be backed up, failures can be switched over, and capacity can be expanded incrementally, so it copes calmly with the pressure of massive data during JD 11.11.

With TiDB, these tasks become easier. TiDB's distributed architecture supports scaling to massive data volumes, which effectively removes the capacity and performance bottlenecks of stand-alone MySQL. As Yang Mu describes it, scaling out only requires expanding the TiDB and TiKV nodes in advance according to the needs of the business side. The expansion itself takes a click on the console, after which the DBA can relax with a cup of tea and wait, which greatly improves DBA efficiency. In addition, TiDB is open source, so there is no technical lock-in, and it is easy to use on the cloud or even to deploy across clouds.

To lower the technical threshold for teams within the group to use TiDB, JD Cloud and PingCAP jointly launched Cloud-TiDB, a distributed database on the cloud that provides TiDB as a service on JD Cloud. As a result, business teams no longer need their own DBAs for database services and can entrust them entirely to JD Cloud.

During this year's JD 618 and 11.11, Cloud-TiDB has been successfully applied to JD Logistics' logistics business expense system, large-piece sorting system, waybill accrual detail system, and other businesses. The overall deployment is close to 6,000 cores across 30 TiDB clusters, and it has brought significant improvements in cost, efficiency, and experience. Yang Mu said with a smile that developers no longer need to spend all day optimizing the system and can go home early, while operations staff no longer need to stay up all night supporting the system and can lose a little less hair.

Logistics business system

During 11.11, what shoppers look forward to most after ordering is receiving their parcels. At JD.com, JD Logistics is responsible for delivering the ordered items to buyers, so it is easy to imagine that the JD Logistics business system holds a very large amount of data. Several main tables hold 2 billion, 5 billion, and 10 billion records respectively, and the data doubled to 22 billion records within half a year of the system going live. The original sharded MySQL architecture (split databases and split tables) ran into problems: some complex SQL was not supported, and cross-shard statistical reports were difficult to implement.
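
To see why cross-shard statistical reports are awkward, consider the following minimal sketch. It uses Python's built-in `sqlite3` with in-memory databases as stand-ins for MySQL shards. The table `waybill_fee`, its columns, and the sample rows are hypothetical and are not JD's schema. The point is that the application has to scatter the query to every shard and merge the partial results itself.

```python
import sqlite3

def make_shard(rows):
    """Create one in-memory database standing in for a single MySQL shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE waybill_fee (waybill_id INTEGER, region TEXT, fee REAL)")
    conn.executemany("INSERT INTO waybill_fee VALUES (?, ?, ?)", rows)
    return conn

def average_fee_by_region(shards):
    """Scatter the query to every shard, then merge partial sums and counts."""
    totals = {}
    query = "SELECT region, SUM(fee), COUNT(*) FROM waybill_fee GROUP BY region"
    for conn in shards:
        for region, fee_sum, count in conn.execute(query):
            prev_sum, prev_count = totals.get(region, (0.0, 0))
            totals[region] = (prev_sum + fee_sum, prev_count + count)
    # Averaging per-shard averages would be wrong when shards hold different
    # row counts, so divide the merged sums by the merged counts instead
    return {region: s / c for region, (s, c) in totals.items()}

# Hypothetical data: the "north" region is spread over both shards
shards = [
    make_shard([(1, "north", 10.0), (2, "south", 30.0)]),
    make_shard([(3, "north", 20.0), (4, "north", 30.0)]),
]
print(average_fee_by_region(shards))
```

On a single logical table, the same report would be one statement of the form `SELECT region, AVG(fee) FROM waybill_fee GROUP BY region`, with no merge logic in the application.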

After the system was migrated to TiDB, overall performance was excellent: writes and updates took about 100 milliseconds, and ordinary queries and SUM queries took only 20 to 30 milliseconds. A system with tens of billions of records was migrated from MySQL to TiDB without modifying the business code. Only the user name and password of the JDBC connection were changed, making it a seamless migration from MySQL to TiDB. TiDB's good compatibility with MySQL reduces the cost of trial and error, testing, and migration, so the investment pays off quickly.

In addition, Yang Mu pointed out that the migration to TiDB brought the business side a pleasant surprise. Calculated over a two-year cycle, the cost of using TiDB is only 37% of that of MySQL, mainly because TiDB has a very good data compression ratio. For example, data that occupied 10.8 TB in MySQL takes only 3.2 TB after migrating to TiDB, and that figure is the total for all three replicas. TiDB has helped the business departments greatly reduce their IT spending.
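
The storage figures quoted above can be checked with simple arithmetic. The sketch below uses only the two sizes and the replica count from the article. The article does not say whether the 10.8 TB MySQL figure includes replicas, so that is left open, and the 37% cost figure cannot be derived from the storage ratio alone.

```python
MYSQL_TB = 10.8   # footprint reported for the data in MySQL
TIDB_TB = 3.2     # footprint reported for TiDB, all three replicas included
REPLICAS = 3      # TiDB replica count stated in the article

# Share of the original footprint that the TiDB deployment occupies
tidb_share = TIDB_TB / MYSQL_TB
# Space taken by a single TiDB replica
per_replica_tb = TIDB_TB / REPLICAS

print(f"TiDB footprint: {tidb_share:.1%} of the MySQL footprint")
print(f"Per replica: {per_replica_tb:.2f} TB")
```

In other words, the three-replica TiDB deployment occupies a little under 30% of the space reported for MySQL, or roughly 1.07 TB per replica.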

Logistics large-piece sorting system

Some real-time dashboards and core reports of JD Logistics' large-piece sorting system ran on MySQL. As the data volume grew and the SQL became more complex, the reports and dashboards performed poorly and the user experience suffered. Sharding (splitting databases and tables) is relatively intrusive to the code, requires major structural changes, and carries high risk.

During 618, JD Logistics used TiDB to serve real-time dashboards and core reports for the business, and used a self-developed system called Honeycomb to synchronize data between MySQL and TiDB in near real time. After the migration from MySQL to TiDB, across a total of hundreds of metrics, overall performance improved 8-fold.

Waybill accrual detail system

The waybill accrual detail system records detailed data for certain waybills. Data grows by tens of millions of records per day, and the largest single table holds close to 20 billion records. At this data volume, MySQL struggles to cope. JD Logistics tried Presto, but the cost was relatively high. It later used ElasticSearch for queries, but that was unstable and required heavy maintenance.

After the business system was migrated to TiDB, the massive-data problem was solved: TiDB supports tens of billions of records without strain. Cost is 30% lower than with the previous MySQL + ElasticSearch solution, and performance meets business requirements: the TP99 for querying business data from single tables with tens of billions of records is about 500 milliseconds. In addition, adjusting and modifying table structures in TiDB is very simple, which reduces operation and maintenance costs.
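
For readers unfamiliar with the metric, TP99 is the latency that 99% of requests stay at or below. The following sketch computes it with the nearest-rank method. The function name `tp99` and the sample latencies are hypothetical, and monitoring systems may use a different percentile definition (for example, interpolation).

```python
def tp99(latencies_ms):
    """Nearest-rank 99th percentile: the smallest sample such that at
    least 99% of all samples are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.99 * n, which avoids floating-point rounding
    rank = -(-99 * len(ordered) // 100)  # 1-based rank
    return ordered[rank - 1]

# Hypothetical sample: 95 fast queries plus a slow tail
samples = [120] * 95 + [300, 350, 400, 480, 900]
print(tp99(samples))
```

A single 900 ms outlier among 100 requests does not move the TP99, which is why the metric is a common way to state a latency target for the bulk of traffic.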

Having passed the severe tests of 618 and 11.11, TiDB runs stably in multiple level-0 systems of JD Group without any incidents. Feedback from the business teams has been relatively good, and it has become a benchmark case within the group. This has given Yang Mu and his team the confidence to keep promoting TiDB within the group and to drive business development through technological progress. The deployment is expected to double again by the end of 2021, reaching the scale of 10,000 cores.

For an enterprise, a big promotion not only supports business innovation but also serves as a large-scale training exercise and full-link drill for its own technical architecture. Through the extreme test of the big promotion, the company's IT architecture, organizational processes, and staff skills all improve greatly. The experience and thinking gained will also speed up the company's everyday business innovation, improve the efficiency of technology-driven innovation, and create a new growth engine.


PingCAP