This year's Cloud Native Programming Challenge is organized around the theme "Challenge Serverless Innovation Practices". It continues to explore three popular technology areas in depth (RocketMQ, Dubbo3, and Serverless) and gives young people who love technology a stage for tackling world-class technical problems. We hope contestants can use the technology at their disposal to create greater value for society as a whole.
Track 1: RocketMQ storage system design for hot and cold read and write scenarios
With 600,000 in cash prizes on offer, you can enter any of the three tracks. For more task definitions, prize details, and new techniques, sign up here:
https://tianchi.aliyun.com/specials/promotion/cloudnative2021
1. Background of the contest problem
Apache RocketMQ, a distributed messaging middleware, has carried message traffic on the scale of trillions during Double Eleven over the years, providing businesses with high-performance, low-latency, stable, and reliable messaging services. Reading freshly written data in real time and reading historical data are both common storage access patterns, and they occur at the same time. Optimizing for this mixed read-write scenario can therefore greatly improve the stability of the storage system. Meanwhile, Intel® Optane™ persistent memory, a distinct kind of storage device, narrows the gap between traditional memory and storage and is expected to give RocketMQ's performance a foothold for another leap.
2. Problem analysis
There are roughly two key points in this contest: how to design tiered storage, and what role AEP plays in this scenario.
First, tiered storage is a familiar concept and a widely used technology. Broadly speaking, it means storing data on different levels of media and migrating or copying data between those media, either automatically or manually. There is no single optimal tiered storage design. It has to be tailored to the specific scenario, and extreme conditions should be taken into account as far as possible. In this contest, the following conditions matter:
- The evaluation machine is a 4-core, 8G ECS instance with a 400G ESSD PL1 cloud disk (throughput up to 350 MiB/s) and 126G of Optane™ persistent memory. Ranked by speed, the storage media are 8G DRAM > 126G AEP > 400G ESSD.
- During the correctness evaluation, the ECS instance is restarted and the data on the Optane device is cleared. The restart simulates a power failure, so at least one copy of the data must be kept on the ESSD.
- During the performance evaluation, 50% of the queues are consumed starting from their current maximum position, and the rest are consumed starting from 0. Distinguishing "hot" from "cold" data and handling each differently can therefore improve the stability and efficiency of the system.
- The operating system does provide PageCache, but it behaves "stupidly" in some extreme situations. In mixed reading and writing, for example, newly written "hot data" may be evicted because memory is insufficient, and the data read during consumption then pollutes an already fragile PageCache. In the worst case the PageCache becomes completely ineffective and every read and write goes to the SSD.
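The hot/cold distinction described above can be sketched as a small classifier that looks at where the first read of a queue starts relative to the queue's tail. This is only an illustrative sketch: the class name `QueueClassifier`, its methods, and the `hotLagThreshold` parameter are hypothetical and are not part of the contest interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Classifies each queue as HOT or COLD from the first read it receives (illustrative only). */
final class QueueClassifier {
    enum Temperature { UNKNOWN, HOT, COLD }

    // Next offset to be assigned per queue, i.e. the current tail of the queue.
    private final Map<Integer, Long> nextWriteOffset = new HashMap<>();
    private final Map<Integer, Temperature> temperature = new HashMap<>();
    // Assumed tolerance: how far behind the tail a first read may start and still count as hot.
    private final long hotLagThreshold;

    QueueClassifier(long hotLagThreshold) {
        this.hotLagThreshold = hotLagThreshold;
    }

    /** Records one appended message and returns the offset assigned to it. */
    synchronized long onAppend(int queueId) {
        long offset = nextWriteOffset.getOrDefault(queueId, 0L);
        nextWriteOffset.put(queueId, offset + 1);
        return offset;
    }

    /** Classifies the queue on its first read; later reads keep the first verdict. */
    synchronized Temperature onRead(int queueId, long readOffset) {
        Temperature known = temperature.get(queueId);
        if (known != null) {
            return known;
        }
        long tail = nextWriteOffset.getOrDefault(queueId, 0L);
        Temperature verdict =
                (tail - readOffset <= hotLagThreshold) ? Temperature.HOT : Temperature.COLD;
        temperature.put(queueId, verdict);
        return verdict;
    }

    /** Returns the verdict so far, or UNKNOWN if the queue has not been read yet. */
    synchronized Temperature temperatureOf(int queueId) {
        return temperature.getOrDefault(queueId, Temperature.UNKNOWN);
    }
}
```

A queue whose first read starts at the tail is marked hot, and one whose first read starts at 0 with a long backlog is marked cold. A very short queue read from 0 would also be marked hot under this rule, which may or may not be what a real solution wants.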
The second key point is how to use AEP efficiently.
- Intel® Optane™ memory is a unique independent storage device that can bridge the gap between traditional memory and storage.
- When and how to use a device depends on its characteristics. Mechanical hard disks, for example, rely on physical addressing, so their sequential read and write throughput is far higher than their random throughput. DRAM and solid state drives use circuit addressing, so, leaving aside hardware and software cache optimizations, the gap between sequential and random access is small. Optane memory's characteristics differ from those of other storage media, and understanding them in depth helps you use it more flexibly and efficiently.
https://developer.aliyun.com/article/770338?groupCode=aliyundb
3. Problem solving ideas
Tiered storage:
- Because the storage media differ greatly in capacity and speed, hot and cold data can be handled separately: newly written data is kept in DRAM, and cold data is copied to AEP in time, before it is read.
- ESSD bandwidth is very limited, and migrating cold data afterwards would consume that scarce resource, so double writing can be used in the write phase.
- The queues that receive hot or cold reads and writes are chosen at random, so the program must identify them on its own and treat them differently.
- You can maintain your own cache in memory to reduce the dependence on PageCache.
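As one possible illustration of the last point, the sketch below keeps recently written messages in a byte-bounded in-memory cache and evicts the oldest entries first. The class `HotMessageCache`, its key format, and the FIFO eviction policy are assumptions made for this example. A real solution might prefer off-heap buffers or a different eviction policy.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** A byte-bounded cache of recently written messages, evicting the oldest first (illustrative only). */
final class HotMessageCache {
    private final long capacityBytes;
    private long usedBytes = 0;
    // LinkedHashMap iterates in insertion order, so the first entry is always the oldest one.
    private final LinkedHashMap<String, byte[]> entries = new LinkedHashMap<>();

    HotMessageCache(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    private static String key(int queueId, long offset) {
        return queueId + ":" + offset;
    }

    /** Caches a message, evicting the oldest entries until the byte budget is respected. */
    synchronized void put(int queueId, long offset, byte[] payload) {
        if (payload.length > capacityBytes) {
            return; // larger than the whole cache, so it is not cached at all
        }
        String k = key(queueId, offset);
        byte[] previous = entries.remove(k); // remove first so a rewrite counts as newest
        if (previous != null) {
            usedBytes -= previous.length;
        }
        entries.put(k, payload);
        usedBytes += payload.length;
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (usedBytes > capacityBytes && it.hasNext()) {
            usedBytes -= it.next().getValue().length;
            it.remove();
        }
    }

    /** Returns the cached payload, or null on a miss (the caller then reads from AEP or ESSD). */
    synchronized byte[] get(int queueId, long offset) {
        return entries.get(key(queueId, offset));
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

Because the program controls eviction here, cold reads served from AEP or ESSD never push hot data out of this cache, which is the behavior PageCache cannot guarantee.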
4. How to get good results
Because the score is the total time of all phases, and the phases affect one another, feel free to be creative and look for the most "cost-effective" optimizations.
For example, you could organize the data to optimize the later read phase, or, to optimize write performance on the ESSD, append data strictly sequentially and build the index in AEP or DRAM. We hope every contestant achieves a result they are satisfied with!
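The append-only idea can be sketched as follows: payloads are appended sequentially to a single file standing in for the ESSD, and a map in DRAM records where each message lives. The class `AppendOnlyLog` and its layout are hypothetical. Forcing the channel on every append is a simplification to keep the durability point visible, and a real solution would most likely batch writes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/** Appends payloads sequentially to one file and indexes them in memory (illustrative only). */
final class AppendOnlyLog implements AutoCloseable {
    private final FileChannel channel;
    private long writePosition;
    // In-memory index: "queueId:offset" -> {file position, payload length}.
    private final Map<String, long[]> index = new HashMap<>();

    AppendOnlyLog(Path file) throws IOException {
        channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
        writePosition = channel.size();
    }

    /** Appends the payload at the end of the file, forces it to disk, then indexes it. */
    synchronized void append(int queueId, long offset, byte[] payload) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(payload);
        long position = writePosition;
        while (buffer.hasRemaining()) {
            position += channel.write(buffer, position);
        }
        channel.force(false); // simplification: one flush per message
        index.put(queueId + ":" + offset, new long[] {writePosition, payload.length});
        writePosition = position;
    }

    /** Reads a payload back through the index, or returns null if it was never written. */
    synchronized byte[] read(int queueId, long offset) throws IOException {
        long[] location = index.get(queueId + ":" + offset);
        if (location == null) {
            return null;
        }
        ByteBuffer buffer = ByteBuffer.allocate((int) location[1]);
        long position = location[0];
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, position);
            if (n < 0) {
                throw new IOException("unexpected end of log");
            }
            position += n;
        }
        return buffer.array();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Note that this sketch keeps the index only in DRAM and writes no record headers, so the index could not be rebuilt after the simulated power failure. A real solution would need to persist enough metadata on the ESSD to recover it.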
5. Optane technical reference documents:
- Intel Optane persistent memory introduction:
- Optane Persistent Memory (AEP) working mode:
https://code.aliyun.com/dts_test/dts-contest/blob/master/doc/appdirect-tips.md
- PMEM IO official website:
- How to simulate PMEM:
- PMEM programming guides:
- PMDK sample program:
https://github.com/pmem/pmdk-examples
- The evaluation environment uses PMEM:
- Using PMEMKV from Java:
https://github.com/pmem/pmemkv-java
- Java* Support for Intel® Optane™ DC Persistent Memory:
- Java persistent memory programming tutorial (video):