
Background

Drafting Design, the design product of Drafting Technology, is a multi-scenario online design platform focused on commercial design. It breaks down the technical barriers between software and hardware, brings creative content and design tools together, and provides high-quality solutions for design needs across scenarios and media types, including images and video, to make design simpler.

Drafting Technology uses Elasticsearch (hereinafter referred to as ES) as its log retrieval component. As business volume has grown, about 2 TB of new data arrives every day and must be retained for 15 to 30 days, which puts considerable pressure on the disks and the system. To guarantee log write and query performance, most ES nodes use high-performance cloud disks with a high unit storage cost. In practice, however, data older than 7 days is accessed only occasionally, so keeping all data on high-performance cloud disks inevitably leads to excessive cost and wasted space.

Solution

With index lifecycle management (ILM) and the data tiers formalized in version 7.10, Elasticsearch supports tiered data storage. Different nodes can be assigned different disk media to separate hot data from cold data. For example, storing warm and cold data on HDDs provides more usable space at lower cost. This feature is well suited to log indexing scenarios.

Using JuiceFS instead of HDDs as the storage medium for warm and cold data effectively provides unlimited storage space. With ES index lifecycle management, the whole lifecycle of an index, from creation through migration to deletion, is handled automatically without manual intervention.

In our practice, we first upgraded the ES cluster to version 7.13, the latest at the time. We then split the nodes into hot and cold: hot nodes prioritize performance, while cold nodes prioritize storage capacity and cost. At the same time, we changed how indices and templates are managed, configuring the data lifecycle, index templates, and data streams used for index writes.

The flow of the entire index after adjustment is shown in the figure below:

image

When the index is created, index.routing.allocation.require.box_type: hot is set so that the index is allocated to the hot nodes.
When the index enters the warm phase, the setting is changed to index.routing.allocation.require.box_type: warm and the index migrates to the warm nodes (the cold data nodes), where the data is actually stored in JuiceFS.
When the index enters the delete phase, ES automatically deletes the index data.
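
The flow above can be sketched as an index template plus an ILM policy. The Python snippet below only builds the two request bodies (for PUT _index_template/<name> and PUT _ilm/policy/<name>) and prints them. The policy name, the index pattern, and the rollover thresholds are assumptions for illustration and are not taken from the production cluster; the 7-day and 30-day ages follow the retention figures quoted in the background section.

```python
import json

# Hypothetical names: the article does not give the real ones.
POLICY_NAME = "logs-policy"
INDEX_PATTERN = "logs-*"

# Request body for the index template: new backing indices of the
# data stream are pinned to nodes tagged box_type=hot.
index_template = {
    "index_patterns": [INDEX_PATTERN],
    "data_stream": {},
    "template": {
        "settings": {
            "index.lifecycle.name": POLICY_NAME,
            "index.routing.allocation.require.box_type": "hot",
        }
    },
}

# Request body for the ILM policy: hot -> warm -> delete.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                # Rollover thresholds are assumed values.
                "actions": {"rollover": {"max_age": "1d", "max_size": "50gb"}}
            },
            "warm": {
                "min_age": "7d",
                # Re-route the index to nodes tagged box_type=warm.
                "actions": {"allocate": {"require": {"box_type": "warm"}}},
            },
            "delete": {"min_age": "30d", "actions": {"delete": {}}},
        }
    }
}

print(json.dumps(index_template, indent=2))
print(json.dumps(ilm_policy, indent=2))
```

Keeping the allocation filter in the template and overriding it in the warm phase means no per-index manual step is needed when an index ages out of the hot tier.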

Customer benefits

What is JuiceFS used in the solution?

JuiceFS is an enterprise-grade distributed file system designed for cloud environments. It provides full POSIX compatibility and gives applications a low-cost shared file system with effectively unlimited space. Data stored in JuiceFS is persisted in object storage (for example, Amazon S3 or Alibaba Cloud OSS) and combined with the JuiceFS metadata service to provide high-performance file storage. JuiceFS offers a fully managed service on public clouds worldwide that can be set up in about ten minutes with a few clicks. JuiceFS was also open-sourced on GitHub in early 2021, has attracted attention and contributions from developers around the world, and has earned 3,700+ stars so far.

image

In this solution, once the warm nodes of the ES cluster store their data in JuiceFS, we no longer need to plan or expand capacity for these nodes, and no data migration is needed when a node fails. This reduces cost and makes operations considerably easier.

JuiceFS uses object storage as its persistence layer, which is billed elastically, so its TCO is lower than that of ordinary cloud disks. In this ES cluster, the cloud disks used by the hot nodes cost 1,000 yuan/TB/month, while the fully managed JuiceFS service plus object storage costs about 250 yuan/TB/month. The total capacity of the ES cluster is over 60 TB. With hot/cold tiering, 75% of the data is stored in JuiceFS, which cuts storage cost alone by nearly 60%. Counting the time and effort saved for the operations team, the solution reduces the TCO of the customer's data storage by at least 70%.
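
The storage saving can be checked with a few lines of arithmetic. The sketch below uses only the prices and proportions quoted above and assumes exactly 60 TB of total capacity (the article says "60 TB+"), so the result is approximate.

```python
# Figures from the article; 60 TB is taken as a round lower bound.
HOT_PRICE = 1000   # yuan per TB per month, high-performance cloud disk
JFS_PRICE = 250    # yuan per TB per month, JuiceFS service + object storage
TOTAL_TB = 60
JFS_SHARE = 0.75   # fraction of data held in JuiceFS after tiering


def monthly_cost(total_tb: float, jfs_share: float) -> float:
    """Monthly storage cost when jfs_share of the data lives in JuiceFS."""
    hot_tb = total_tb * (1 - jfs_share)
    jfs_tb = total_tb * jfs_share
    return hot_tb * HOT_PRICE + jfs_tb * JFS_PRICE


baseline = monthly_cost(TOTAL_TB, 0.0)      # everything on cloud disks
tiered = monthly_cost(TOTAL_TB, JFS_SHARE)  # 15 TB hot + 45 TB in JuiceFS
saving = 1 - tiered / baseline

print(f"baseline: {baseline:.0f} yuan/month")
print(f"tiered:   {tiered:.0f} yuan/month")
print(f"saving:   {saving:.2%}")
```

With these inputs the saving comes out at roughly 56%, which matches the "nearly 60%" figure.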

Practice

Cluster configuration

The cluster has 9 nodes in total: a single master node (elastic_001) and 8 data nodes, of which 5 are hot data nodes (elastic_002 to elastic_006) and 3 are cold data nodes (elastic_007 to elastic_009).
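
For the box_type allocation filter to work, each data node has to carry a matching custom attribute. The sketch below generates illustrative elasticsearch.yml lines for the nine nodes. The node names come from the article; the exact settings on the real nodes are not shown there, so the attribute values (including tagging the cold data nodes as box_type: warm to match the filter) are assumptions.

```python
# Node names are from the article; the settings are assumed for illustration.
MASTER_NODE = "elastic_001"
HOT_NODES = [f"elastic_{i:03d}" for i in range(2, 7)]    # 002..006
WARM_NODES = [f"elastic_{i:03d}" for i in range(7, 10)]  # 007..009


def node_settings(name: str) -> list[str]:
    """Return illustrative elasticsearch.yml lines for one node."""
    if name == MASTER_NODE:
        return [f"node.name: {name}", "node.roles: [ master ]"]
    tier = "hot" if name in HOT_NODES else "warm"
    # node.attr.<key> defines a custom attribute that allocation filters
    # such as index.routing.allocation.require.box_type can match on.
    return [f"node.name: {name}", f"node.attr.box_type: {tier}"]


for node in [MASTER_NODE, *HOT_NODES, *WARM_NODES]:
    print("\n".join(node_settings(node)))
    print("---")
```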

image

Directory mounting and configuration

JuiceFS is mounted on the ES cold data nodes and serves as the ES data directory.

Each node has a 2 TB data disk mounted at /data. The ES process runs in a container, and its data lives under /data/elastic on the host. Because the container mounts this host directory, a symbolic link cannot be used for the ES data directory /data/elastic. Instead, a Linux bind mount is used to mount a JuiceFS subdirectory onto the /data/elastic path. For example, on node 007:

# ./juicefs mount gd-elasticsearch-jfs \
--cache-dir=/data/jfsCache --cache-size=307200 \
--upload-limit=800  /jfs
# mount -o bind /jfs/data-elastic-pro-007 /data/elastic

After this, the contents of /jfs/data-elastic-pro-007 are visible under /data/elastic.

Similar mounting operations are also done on nodes 008 and 009.

If you are not familiar with the basic operations such as initialization and mounting of JuiceFS, please refer to the official JuiceFS documentation.

ES performs many random writes when an index rolls over. To keep write performance up, the writeback parameter is added when JuiceFS is mounted, so that data is written to the local disk first and uploaded to object storage asynchronously in the background. Data waiting to be uploaded is staged on the local disk under /data/jfsCache/gd-elasticsearch-jfs/rawstaging/. Do not delete any files in this directory, or data may be lost.

cache-size and upload-limit cap the local read cache at 300 GiB and the upload bandwidth to object storage at 800 Mbps. attrcacheto and entrycacheto set the timeouts, in seconds, of the kernel's attribute cache and entry cache.
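
As a quick sanity check on these limits, the sketch below converts the two mount values into more familiar units and estimates how long one day of log data would take to upload at the bandwidth cap. It assumes cache-size is given in MiB and upload-limit in Mbps, that roughly 2 TB (decimal) reaches JuiceFS per day, and that the link is the only bottleneck; these are simplifications, not measurements.

```python
# Values from the mount command and the background section.
CACHE_SIZE_MIB = 307200    # --cache-size
UPLOAD_LIMIT_MBPS = 800    # --upload-limit, megabits per second
DAILY_DATA_TB = 2          # approximate new data per day (assumed decimal TB)

cache_gib = CACHE_SIZE_MIB / 1024           # MiB -> GiB
upload_mb_per_s = UPLOAD_LIMIT_MBPS / 8     # megabits -> megabytes per second

# 1 TB = 1,000,000 MB in decimal units.
upload_seconds = DAILY_DATA_TB * 1_000_000 / upload_mb_per_s
upload_hours = upload_seconds / 3600

print(f"read cache:   {cache_gib:.0f} GiB")
print(f"upload rate:  {upload_mb_per_s:.0f} MB/s")
print(f"daily upload: {upload_hours:.1f} hours at the cap")
```

Under these assumptions a day's data needs well under 24 hours of upload time, so the staging directory should drain faster than it fills.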

Performance optimization

Reduce node load

Before JuiceFS was adopted, the lifecycle policy of the ES cluster included a Force Merge, configured as warm.actions.forcemerge.max_num_segments: 1. This causes the data to be re-merged at rollover, which puts heavy pressure on the CPU. The step is unnecessary in this setup. Disabling Force Merge avoids the extra performance overhead and reduces node load.

Rollover parameter configuration optimization

Because warm-phase data is written to JuiceFS and ultimately persisted in object storage, the application layer does not need to keep multiple copies. The number of replicas can be set to 0 when the index moves to the warm phase, that is, warm.actions.number_of_replicas: 0.

In addition, since an index is no longer written to once it has migrated to the warm phase, it can be made read-only there with warm.actions.readonly: {}. Turning off writes to the index reduces memory usage.
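
The three warm-phase changes described in this section (dropping Force Merge, setting replicas to 0, and making the index read-only) can be sketched as a transformation on the warm phase of an ILM policy. The "before" phase below is a hypothetical reconstruction, not the original production policy. Note that in the ILM policy body, number_of_replicas is an option of the allocate action.

```python
import json

# Hypothetical warm phase before the optimization.
warm_before = {
    "min_age": "7d",
    "actions": {
        "allocate": {"require": {"box_type": "warm"}},
        "forcemerge": {"max_num_segments": 1},
    },
}


def optimize_warm_phase(phase: dict) -> dict:
    """Return a copy of the phase with the optimizations applied."""
    # 1. Drop forcemerge to avoid re-merging segments on the warm nodes.
    actions = {k: v for k, v in phase["actions"].items() if k != "forcemerge"}
    # 2. Object storage already provides durability, so keep no replicas.
    allocate = dict(actions.get("allocate", {}))
    allocate["number_of_replicas"] = 0
    actions["allocate"] = allocate
    # 3. Warm indices receive no more writes, so mark them read-only.
    actions["readonly"] = {}
    return {**phase, "actions": actions}


warm_after = optimize_warm_phase(warm_before)
print(json.dumps(warm_after, indent=2))
```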

Summary

As time passes and business volume grows, enterprises inevitably face the twin challenges of storing and managing data at ever larger scale. In this case, Drafting Technology made full use of the lifecycle management capabilities of Elasticsearch and tiered its log data according to business needs. Frequently used hot data is stored on SSDs, while low-frequency data older than 7 days is stored in the more cost-effective JuiceFS, saving the customer about 60% of storage costs. JuiceFS also gives applications nearly unlimited elasticity, eliminating operations tasks such as capacity planning, expansion, and data migration, and improving the efficiency of the enterprise IT architecture.


