On July 23, 2021, Chaos Mesh 2.0 reached general availability (GA). It is an exciting release and a solid step toward a closed-loop chaos engineering ecosystem.

Making chaos engineering easier has always been the goal of Chaos Mesh, and building a closed-loop chaos engineering ecosystem is a key step toward it. After nearly a year of sustained effort, we have made major improvements in three areas: ease of use, experiment orchestration and scheduling, and the variety of fault types.

Ease of use

We have always been committed to improving usability. With the Chaos Mesh 1.0 GA release, we introduced Chaos Dashboard so that users can run chaos experiments through a graphical interface. In Chaos Mesh 2.0, Chaos Dashboard has been improved significantly:

  • Chaos Dashboard supports creating, viewing, and updating AWSChaos and GCPChaos experiments, so that running chaos experiments in a cloud environment feels consistent with running them on Kubernetes;
  • For each chaos experiment, Chaos Dashboard now displays more detailed events, making the experiment easier to observe.

Native experiment orchestration and scheduling

When running chaos experiments, a single experiment is often not enough to simulate a failure scenario, and starting and stopping experiments manually is tedious and risky. Previously, injection and recovery could be automated by combining Argo with Chaos Mesh. Chaos Mesh 2.0 adds Workflow, a native feature for orchestrating scenarios: it can run multiple experiments serially or in parallel, and can interleave notifications and health checks to build complex experiment scenarios.
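The serial and parallel composition that Workflow offers can be pictured with a small model. The sketch below is not the Chaos Mesh API: the Experiment, Serial, and Parallel classes, the timeline function, and the experiment names and durations are all hypothetical, chosen only to show how nesting the two modes determines when each experiment starts and ends.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Experiment:
    name: str
    duration: int  # seconds the fault stays injected


@dataclass
class Serial:
    children: List["Node"]  # run one after another


@dataclass
class Parallel:
    children: List["Node"]  # all start at the same time


Node = Union[Experiment, Serial, Parallel]


def timeline(node: Node, start: int = 0) -> Tuple[List[Tuple[str, int, int]], int]:
    """Return ([(name, start, end), ...], end_time) for a hypothetical workflow tree."""
    if isinstance(node, Experiment):
        end = start + node.duration
        return [(node.name, start, end)], end
    spans: List[Tuple[str, int, int]] = []
    if isinstance(node, Serial):
        cursor = start
        for child in node.children:
            # Each child begins when the previous one has finished.
            child_spans, cursor = timeline(child, cursor)
            spans.extend(child_spans)
        return spans, cursor
    if isinstance(node, Parallel):
        end = start
        for child in node.children:
            # Every child begins together; the group ends with the slowest child.
            child_spans, child_end = timeline(child, start)
            spans.extend(child_spans)
            end = max(end, child_end)
        return spans, end
    raise TypeError(f"unsupported node: {node!r}")


# Hypothetical scenario: a network fault, then two faults at once, then an I/O fault.
workflow = Serial([
    Experiment("network-delay", 30),
    Parallel([Experiment("pod-kill", 10), Experiment("cpu-stress", 20)]),
    Experiment("io-latency", 15),
])
spans, total = timeline(workflow)
for name, begin, end in spans:
    print(f"{name}: {begin}s to {end}s")
print(f"total: {total}s")  # total: 65s
```

In this model a serial group lasts as long as the sum of its children and a parallel group as long as its slowest child, which is why the example finishes after 65 seconds.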

Previously, periodically executed chaos was defined with only "cron: @every 10s" and "duration: 5s", which did not cover everyone's needs. For example, a single execution may last longer than the execution interval. Such a definition is legal, but the old format had no way to describe the expected behavior in that case. Following the design of the Kubernetes CronJob, we introduced a new custom object, Schedule, which adds more explicit properties to periodically executed tasks, such as whether multiple experiments may run at the same time, thereby constraining the behavior.
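To see why an explicit concurrency property matters, consider an experiment that fires every 10 seconds but lasts 25 seconds. The sketch below is a simplified simulation, not Chaos Mesh code: plan_runs and max_overlap are hypothetical helpers, and the policy names "Allow" and "Forbid" are borrowed from the concurrencyPolicy values of the Kubernetes CronJob on the assumption that Schedule follows the same idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    start: int  # second at which the experiment starts
    end: int    # second at which the experiment ends


def plan_runs(every: int, duration: int, horizon: int, policy: str) -> List[Run]:
    """Return the runs started within [0, horizon) under the given policy.

    "Allow":  every tick starts a new run, even if one is still active.
    "Forbid": a tick is skipped while the most recent run is still active.
    """
    if policy not in ("Allow", "Forbid"):
        raise ValueError(f"unknown policy: {policy}")
    runs: List[Run] = []
    for tick in range(0, horizon, every):
        still_active = bool(runs) and runs[-1].end > tick
        if policy == "Forbid" and still_active:
            continue  # skip this tick instead of stacking a second experiment
        runs.append(Run(start=tick, end=tick + duration))
    return runs


def max_overlap(runs: List[Run]) -> int:
    """Return the largest number of runs that are active at the same moment."""
    # An end at time t sorts before a start at time t, so back-to-back runs
    # are not counted as overlapping.
    events = sorted([(r.start, 1) for r in runs] + [(r.end, -1) for r in runs])
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


allow = plan_runs(every=10, duration=25, horizon=60, policy="Allow")
forbid = plan_runs(every=10, duration=25, horizon=60, policy="Forbid")
print(len(allow), max_overlap(allow))    # 6 3
print(len(forbid), max_overlap(forbid))  # 2 1
```

With "Allow", up to three experiments are active at once in this example; with "Forbid", ticks that arrive while a run is still active are skipped, so only the runs starting at 0 and 30 seconds take place.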

For these definition changes, we provide a migration tool to help users migrate and upgrade; it is shipped with the release. To upgrade from 1.x to 2.0, refer to the document Upgrade to Chaos Mesh 2.0.

More fault types

Chaos Mesh already supports system-level fault injection such as NetworkChaos, IOChaos, and StressChaos, as well as cloud-service fault injection such as AWSChaos and GCPChaos. Chaos Mesh 2.0 adds application-layer fault injection.

JVMChaos

JVM-based languages such as Java and Kotlin are widely used in the industry. For JVM applications, fault injection can be implemented relatively easily through JVM bytecode enhancement and the Java agent mechanism. At present, Chaos Mesh implements JVMChaos with the help of chaos-exec-jvm, supporting application-level fault injection such as method delays, return value modification, out-of-memory errors, and thrown exceptions. For more information, refer to the document Simulating JVM Application Failure.

HTTPChaos

HTTPChaos is a chaos type newly supported in 2.0. It hijacks HTTP requests and responses on the server side and can abort connections, inject delays, or modify headers and bodies. It suits any scenario that uses HTTP as the communication protocol. For more information, refer to the document Simulating HTTP Failure.

Chaosd, a fault injection tool for physical machines

Chaos Mesh is designed specifically for Kubernetes; for physical machine environments, we provide Chaosd. Chaosd evolved from chaos-daemon, with additional chaos experiment features tailored to physical machines. It supports several types of fault injection on physical machines, including process, network, JVM, stress, and disk faults.

Looking to the future

Chaos Mesh is still under active development. Over the next few months, we plan to make Chaos Mesh more powerful, including:

  • Injecting JVMChaos at runtime, which makes JVMChaos cheaper and more convenient to use.
  • A plug-in mechanism that lets users build custom chaos experiments while still using the scheduling capabilities of Chaos Mesh.

In addition, we have found that the way users run chaos experiments is a very valuable resource, and good chaos experiment scenarios can be reused in many places. In the future, we will launch a platform where users can share the chaos experiment scenarios they have used.

Quick start

Open https://chaos-mesh.org/interactive-tutorial in your browser to quickly try Chaos Mesh 2.0 using cloud resources!

Acknowledgments

Thanks to all the contributors of Chaos Mesh ( https://github.com/chaos-mesh/chaos-mesh/graphs/contributors ). From 1.0 to 2.0, Chaos Mesh would not have been possible without the efforts of every contributor!

Finally, everyone is welcome to open an issue for Chaos Mesh or to follow the documentation and start contributing code. Chaos Mesh looks forward to your participation and feedback!


PingCAP