KCL: A declarative cloud-native configuration strategy language

Prologue: Take Ant's typical site-building scenario as an example. After adopting Kusion, the user-side configuration code shrank to 5.5% of its original size, the four platforms that users previously had to deal with were consolidated behind a single unified code base, and, when no other exceptions occurred, delivery time dropped from 2 days to 2 hours...
Note: This article is based on the talk "KCL declarative cloud-native configuration strategy language" given by Chai Shushan at the 2021 GIAC Conference.

0. Hello GIAC

Hello everyone, I am from Ant. I am very happy to share the "KCL configuration strategy language" with you in the new programming language paradigms session at GIAC. KCL is an in-house DSL for cloud-native infrastructure configuration, developed as part of Ant's Kusion solution. It has already been trialed and promoted on a small scale in scenarios such as site building.

Let's first look at a simple piece of KCL code:

schema GIACInvitation[name: str]:
    Name:     str = name
    Topic:    str = "Talk topic"
    Company?: str = None
    Type:     str = "Speaker"
    Address:  str = "Shenzhen"

invitation = GIACInvitation("Name") {
    Topic:   "KCL configuration strategy language"
    Company: "Ant Group"
}

This example first uses schema to define a GIACInvitation structure. The structure takes a name parameter of type str and declares a set of attributes with type annotations and default values. The code then uses declarative syntax to construct invitation, an instance of GIACInvitation.

Although this example is simple, it contains schema, the most important language structure in KCL. As the example shows, KCL tries to make configuration code easier to write and maintain through declarative syntax and static type checking. This was the original motivation for designing KCL: we hope to solve the problems of configuration as code in the cloud-native field by applying mature techniques and theory from the programming language field.

1. The background of the KCL language

In classic Linux/UNIX operating systems, we interact with the system's built-in tools and the kernel through the shell, and manage higher-level applications through shell scripts. The shell language greatly simplifies the kernel's programming interface: it not only makes the operating system easier to use, but also simplifies the management and operation of upper-level applications and improves productivity. As the de facto standard for container management, Kubernetes has become the Linux/UNIX of the cloud computing era. By analogy with UNIX, however, Kubernetes still lacks an interactive language and toolset that match its declarative, open, and shared design philosophy.

1.1 Why design the KCL language?

K8s has become the operating system of cloud computing, but it still lacks a fully functional, shell-like interactive interface. Although there are many open source solutions, none is as mature as the UNIX shell, and in particular none meets the large-scale engineering requirements of leading Internet companies. There is a gap between cloud-native technology and its adoption in enterprises that needs to be filled. This is the problem that cloud-native engineering has to solve, and it is also the starting point for designing the KCL language.

1.2 Now is a good time

The cloud-native idea is one of great openness and democratization. The result is that everything is configurable and all configuration is code. Everyone is equal before configuration code, and every user can interact with the underlying platform facilities by adjusting it. Writing and maintaining configuration is therefore becoming a necessary skill for software engineers in the cloud computing era. Driven by the growing demand for cloud-native configuration as code, many leading companies in Silicon Valley have already carried out large-scale practice and validation in this direction, which gives us a great deal of experience to draw on.

For these reasons, Ant's Kusion project uses the KCL configuration strategy language to simplify how users access cloud-native infrastructure. Its design goals are not only to improve the openness and efficiency of Ant's infrastructure, but also to make sharing and collaboration easier. In that sense it is positioned as the shell language of the cloud-native era. Although KCL is still in the exploration and practice stage, in this article we share some of the ideas behind its design and implementation, in the hope of contributing a little to the rapid arrival of cloud native.

1.3 The history of KCL

Initial research and design work on KCL began in 2019. kcl-0.1 was released in March 2020; its grammar was customized from Python's, and it was developed with tools such as the Go-based Grumpy and ANTLR. In the second half of 2020 the implementation was switched to Python to speed up development and iteration. The kcl-0.2.x releases introduced a large number of language features, added Plugin extension support, and added an IDEA plug-in. In the first half of 2021 we began to unify, optimize, and consolidate the language features. kcl-0.3 improved the type system, integrated a unit testing tool, improved execution performance, and added API support for other languages such as Go; it also provided VSCode support through LSP. In the second half of 2021, KCL began to roll out in common scenarios such as site building, while static type checking was introduced, performance was optimized, and the language documentation was improved.

2. Design principles of KCL language

Drawing on Ant's years of experience with its classic operations middle platform, and after weighing the pros and cons of the various problems involved, the Kusion project explored and rethought the large-scale operations system, and proposed and put into practice a cloud-native collaborative development model based on infrastructure as code. KCL is the declarative configuration programming language that the Kusion project designed to support this collaborative development model. Simplicity, stability, efficiency, and engineering readiness are the design principles of the KCL language.

2.1 Simplicity is king

Simplicity not only lowers the cost of learning and communication, but also reduces the risk of defects in code. Whether it is the KISS principle pursued by UNIX or the "less is more" design philosophy advocated by Go, a simple and easy-to-use interface has always been a goal of successful products. Following the same principle, KCL keeps only the essential elements of a modern programming language, while still providing basic, flexible capabilities for writing configuration definitions through automatic type inference, restricted control flow, and schema. Removing language features has always been an important goal of KCL's design work.

2.1.1 Declarative syntax

Declarative programming is a paradigm that stands alongside imperative programming. In declarative programming you only state the result you want, and the execution engine is responsible for how it is achieved. It is simpler to use, reduces the complexity and side effects of imperative assembly, and keeps configuration code clear and readable, while the complex execution logic is already handled by the Kubernetes system. KCL supports declarative syntax by simplifying the syntax for instantiating schema structures, and it reduces the complexity of imperative, procedural programming by providing only a small number of statements. For schema and configuration-related syntax, KCL aims for each configuration requirement to have, as far as possible, one fixed way of being written, so that configuration code stays as uniform as possible.

For example, schema, the core structure of KCL's declarative syntax, can be instantiated declaratively:

schema Name:
    firstName: str
    lastName: str

schema Person:
    name: Name = {
        firstName: "John"
        lastName: "default"
    }

JohnDoe = Person {
    name.lastName: "Doe"
}

First, a Name structure is defined with schema; it contains two required attributes of string type. Person then reuses the Name type to declare a name attribute and gives it a default value to make it easier to use. Finally, when the JohnDoe configuration is defined, only the name.lastName attribute needs to be filled in; the remaining attributes take their default values.

For standard business applications, wrapping a reusable model in a KCL schema gives front-end users the simplest possible configuration interface. For example, based on sofa.SofaAppConfiguration in Ant's internal Konfig library, an App can be customized by adding just a few configuration parameters.

appConfiguration = sofa.SofaAppConfiguration {
    resource: resource.Resource {
        cpu: "4"
        memory: "8Gi"
        disk: "50Gi"
    }
    overQuota: True
}

Declarative syntax describes only the necessary parameters (all other parameters take their default configuration), which greatly simplifies the configuration code that ordinary users have to write.

2.1.2 Order-independent grammar

Unlike imperative programming, KCL promotes a declarative syntax that is better suited to configuration definitions. Taking the Fibonacci sequence as an example, a set of declarative definitions can be viewed as a system of equations. The order in which the equations are written does not affect the solution of the system; computing the dependencies between attributes and "solving" them is done by the KCL interpreter. This avoids a large amount of imperative assembly and order-checking code.

schema Fib:
    n1: int = n - 1
    n2: int = n1 - 1
    n: int
    value: int

    if n <= 1:
        value = 1
    elif n == 2:
        value = 1
    else:
        value = (Fib {n: n1}).value + (Fib {n: n2}).value

fib8 = (Fib {n: 8}).value  # 21

The members n, n1, and n2 defined in Fib depend on one another, but those dependencies have nothing to do with the order in which the members are written. The KCL engine automatically computes the correct evaluation order from the dependencies in the declarative code, and reports abnormal situations such as circular references.
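The idea of deriving an evaluation order from dependencies can be sketched in a few lines of Python. This is only an illustration of the general technique (a topological sort over attribute dependencies), not KCL's actual implementation; the RULES table and the solve function are hypothetical names introduced here, and the dependencies are listed by hand rather than extracted from source code.

```python
from graphlib import CycleError, TopologicalSorter

# Each attribute maps to (names it depends on, function that computes it).
# The declaration order is deliberately "wrong": n1 and n2 come before n.
RULES = {
    "n1": (["n"], lambda env: env["n"] - 1),
    "n2": (["n1"], lambda env: env["n1"] - 1),
    "n": ([], lambda env: 8),
}


def solve(rules):
    """Evaluate every attribute after the attributes it depends on."""
    graph = {name: set(deps) for name, (deps, _) in rules.items()}
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # err.args[1] holds the nodes that form the cycle.
        raise ValueError(f"circular reference: {err.args[1]}") from err
    env = {}
    for name in order:
        env[name] = rules[name][1](env)
    return env


result = solve(RULES)
print(result)
```

Reordering the entries of RULES does not change the result, and two attributes that depend on each other are reported as a circular reference instead of being evaluated.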

2.1.3 Merging configurations with the same name

As the business and the development and operations teams grow more complex, writing and maintaining configuration code becomes more complex as well. The same configuration parameters may be scattered across multiple modules owned by multiple teams, and a complete application configuration only takes effect once these identical and differing parameters from different places have been combined. Identical parameters may also conflict because different teams modify them. Manually keeping same-name configurations in sync and merging the differing ones is a great challenge.

For example, the application configuration model in the Konfig library is split into a base configuration and a stack configuration for each environment. At run time these must be merged into one application configuration according to a merge strategy. In other words, the front-end configuration in the large shared library must be mergeable automatically: it can be defined several times, merged, and then instantiated to produce a single front-end configuration. With KCL's language capabilities and Konfig's best practices, writing configuration is simplified by automatically merging the baseline configuration with the environment configuration. For the standard SOFA application opsfree, for instance, the baseline and environment configurations are maintained separately and finally handed to the platform tooling for merging and checking. By automatically merging configurations with the same name, KCL achieves its design goal of simplifying collaborative development across teams.

For example, the base configuration collects the common settings:

appConfiguration = sofa.SofaAppConfiguration {
    mainContainer: container.Main {
        readinessProbe: probe_tpl.defaultSofaReadinessProbe
    }
    resource: res_tpl.medium
    releaseStrategy: "percent"
}

Then the pre-release environment fine-tunes certain parameters based on the base configuration:

appConfiguration = sofa.SofaAppConfiguration {
    resource: resource.Resource {
        cpu: "4"
        memory: "8Gi"
        disk: "50Gi"
    }
    overQuota: True
}

The merged pre configuration is in effect a single SofaAppConfiguration, equivalent to the code below (by default the environment configuration takes priority over the baseline configuration):

appConfiguration = sofa.SofaAppConfiguration {
    mainContainer: container.Main {
        readinessProbe: probe_tpl.defaultSofaReadinessProbe
    }
    resource: resource.Resource {
        cpu: "4"
        memory: "8Gi"
        disk: "50Gi"
    }
    overQuota: True
    releaseStrategy: "percent"
}

Although merging of same-name configurations currently only applies to an application's main package configuration, it has already brought observable benefits.
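The merge behaviour shown in the three listings above can be approximated with a recursive dictionary merge in which the environment value wins. The following Python sketch is an illustration under stated assumptions, not the KCL merge algorithm: configurations are modelled as plain dicts, the merge function is a hypothetical helper, and the values given for the medium resource template are invented for the example.

```python
def merge(base, override):
    """Return a new dict in which values from override win over base.

    Nested dicts are merged key by key; any other value is replaced whole.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


base = {
    "mainContainer": {"readinessProbe": "defaultSofaReadinessProbe"},
    # Invented stand-in values for res_tpl.medium (assumption).
    "resource": {"cpu": "2", "memory": "4Gi", "disk": "20Gi"},
    "releaseStrategy": "percent",
}
pre = {
    "resource": {"cpu": "4", "memory": "8Gi", "disk": "50Gi"},
    "overQuota": True,
}

app_configuration = merge(base, pre)
print(app_configuration)
```

Neither input is modified, so the baseline can be merged with each environment in turn.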

2.2 Stability overrides everything

The more fundamental a component is, the higher its stability requirements, and the more often it is reused, the greater the payoff from stability. Stability is an indispensable requirement in the infrastructure field: it requires not only logical correctness but also a lower probability of errors.

2.2.1 Static typing and strong immutability

Many configuration languages check types dynamically at run time. The biggest drawback of dynamic typing is that it can only check the types of attributes that are actually executed, which makes it hard to catch type errors early, during development. Static typing not only finds most type errors ahead of time, but also avoids the performance cost of dynamic type checking at run time.

In addition to static typing, KCL uses the final keyword to prohibit modification of certain important attributes. Static typing combined with strong immutability of attributes provides a stronger stability guarantee for configuration code. For example, the apiVersion information in CafeDeployment is a constant configuration parameter, and final provides the guarantee for this kind of configuration:

schema CafeDeployment:
    final apiVersion: str = "apps.cafe.cloud.alipay.com/v1alpha1"
    final kind: str = 123  # type error

schema ContainerPort:
    containerPort: int = 8080
    protocol: "TCP" | "UDP" | "SCTP" = "TCP"
    ext? : str = None

The apiVersion and kind attributes in this code are protected by final and cannot be modified. The kind attribute, however, contains an error: the type of its initial value differs from the declared type of the attribute. Static type checking makes such errors easy to find and fix during development.
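As a rough analogy only, the two guarantees can be imitated in Python: checking the declarations themselves before any instance exists, and exposing the values through a read-only view. Python performs these checks when the code runs, so this is not static typing in the KCL sense; CAFE_DEPLOYMENT, check_defaults, and freeze are hypothetical names introduced for this sketch.

```python
from types import MappingProxyType

# (declared type, default value) for each attribute of the schema above.
CAFE_DEPLOYMENT = {
    "apiVersion": (str, "apps.cafe.cloud.alipay.com/v1alpha1"),
    "kind": (str, 123),  # same mistake as in the KCL example
}


def check_defaults(fields):
    """Return the attributes whose default does not match the declared type."""
    return [
        name
        for name, (declared, default) in fields.items()
        if not isinstance(default, declared)
    ]


def freeze(fields):
    """Build a read-only view of the default values."""
    return MappingProxyType(
        {name: default for name, (_, default) in fields.items()}
    )


errors = check_defaults(CAFE_DEPLOYMENT)  # found without creating an instance
frozen = freeze(CAFE_DEPLOYMENT)
print(errors)
```

Assigning to frozen["apiVersion"] raises a TypeError, which plays the role that final plays in the KCL code.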

2.2.2 Runtime type and logic checks

A KCL schema is not only a typed structure; it can also be used at run time to validate existing untyped JSON and YAML data. In addition, semantic checks can be written in a schema's check block, and they are run automatically when the schema is instantiated. Schema inheritance and mixins can also be used to build check rules that combine several related schemas.

The following example shows common uses of check:

import regex

schema sample:
    foo: str
    bar: int
    fooList: [str]

    check:
        bar > 0 # minimum, also supports the exclusive case
        bar < 100, "message" # maximum, also supports the exclusive case
        len(fooList) > 0 # min length, also supports the exclusive case
        len(fooList) < 100 # max length, also supports the exclusive case
        regex.match(foo, "^The.*Foo$") # regex match
        isunique(fooList) # unique
        bar in range(100) # range
        bar in [2, 4, 6, 8] # enum
        multiplyof(bar, 2) # multipleOf

Each statement in check consists of an expression that produces a bool result and an optional error message (each plain bool expression is in effect a simplified assert statement). Built-in syntax and functions make it possible to validate attribute values logically at run time.
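The "expression plus optional message" shape of a check statement maps naturally onto a list of predicate and message pairs that are evaluated when a value is instantiated. The Python sketch below mirrors part of the check block above under that assumption; CHECKS and instantiate are hypothetical names, and all messages except "message" are invented for the example.

```python
import re

# (predicate, error message) pairs, mirroring part of the check block above.
CHECKS = [
    (lambda c: c["bar"] > 0, "bar must be positive"),
    (lambda c: c["bar"] < 100, "message"),
    (lambda c: 0 < len(c["fooList"]) < 100, "fooList length out of range"),
    (lambda c: re.match(r"^The.*Foo$", c["foo"]) is not None,
     "foo has the wrong format"),
    (lambda c: len(set(c["fooList"])) == len(c["fooList"]),
     "fooList contains duplicates"),
    (lambda c: c["bar"] in (2, 4, 6, 8), "bar is not an allowed value"),
]


def instantiate(config):
    """Return config unchanged if every check holds, else raise ValueError."""
    failures = [message for predicate, message in CHECKS
                if not predicate(config)]
    if failures:
        raise ValueError("; ".join(failures))
    return config


sample = instantiate({"foo": "The Foo", "bar": 4, "fooList": ["a", "b"]})
print(sample)
```

Collecting every failed message before raising, rather than stopping at the first one, is a design choice of this sketch and is not claimed to match KCL's behaviour.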

2.2.3 Built-in test support

Unit testing is an effective way to improve code quality. Building on the existing schema syntax, KCL ships a built-in kcl-test command that provides a flexible unit testing framework (it can be combined with the testing package to specify typed literal values for command-line arguments).

Built-in testing tools

schema TestPerson:
    a = Person{}
    assert a.name == 'kcl'

schema TestPerson_age:
    a = Person{}
    assert a.age == 1

The kcl-test command not only runs unit tests but also reports the execution time of each test, and a regular expression argument can be used to select which tests to run. In addition, kcl-test ./... recursively runs the unit tests in subdirectories, and integration tests and Plugin tests are supported as well.

2.3 Efficiency is the eternal pursuit

KCL not only simplifies programming through its declarative style, but also provides an efficient development experience through module support, the mixin feature, built-in lint and fmt tools, and IDE plug-ins.

2.3.1 Useful syntax in schema

Schema is the core syntactic structure for writing configuration programs in KCL, and almost every one of its features is designed to improve efficiency in a specific business scenario. For example, when defining and instantiating deeply nested configuration parameters, you can specify an attribute's path directly in both the definition and the initialization.

schema A:
    a: b: c: int
    a: b: d: str = 'abc'

A {
    a.b.c: 5
}

For safety, every attribute is also non-empty by default, and this is checked automatically at instantiation.

A schema is not only a standalone, type-annotated configuration object; an existing schema can also be extended through inheritance:

schema Person:
    firstName: str
    lastName: str

# schema Scholar inherits schema Person
schema Scholar(Person):
    fullName: str = firstName + '_' + lastName
    subject: str

JohnDoe = Scholar {
    firstName: "John",
    lastName: "Doe",
    subject: "CS"
}

In this code, Scholar inherits from Person and extends it with additional attributes. As a sub-schema, Scholar can directly access firstName and the other attributes defined in its parent.

Inheritance is the basic means of code reuse in OOP, but multiple inheritance brings the diamond inheritance problem. KCL deliberately simplifies inheritance and keeps only single inheritance. A schema can still mix in and reuse shared code fragments through the mixin feature: each capability is written as a mixin and then "mixed into" different structures with the mixin statement.

For example, mixing FullnameMixin into Person adds new attributes or logic (including check blocks) to the schema:

schema FullnameProtocol:
    firstName : str = "default"
    lastName : str

mixin FullnameMixin for FullnameProtocol:
    fullName : str = "${firstName} ${lastName}"

schema relax Person:
    mixin [FullnameMixin]
    firstName : str = "default"
    lastName : str

With these language capabilities, platform-side engineers can extend structures through single inheritance, define the dependencies and values of attributes within a structure through the mixin mechanism, and complete declarative structure definitions using the order-independent style inside a structure. Common features such as conditional logic and default values are supported as well.
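For readers more familiar with general-purpose languages, the division of labour between single inheritance and mixins can be sketched in Python. Note the difference: Python expresses a mixin through its multiple-inheritance mechanism, whereas KCL has a dedicated mixin statement and allows only one parent schema. The class names below are illustrative and only loosely follow the KCL examples.

```python
class Person:
    """Base structure with two attributes and a default value."""

    def __init__(self, first_name="default", last_name=""):
        self.first_name = first_name
        self.last_name = last_name


class FullNameMixin:
    """Reusable fragment; expects the host to provide first_name and last_name."""

    @property
    def full_name(self):
        return f"{self.first_name} {self.last_name}"


class Scholar(FullNameMixin, Person):
    """Single 'real' parent (Person) plus one mixed-in capability."""

    def __init__(self, first_name, last_name, subject):
        super().__init__(first_name, last_name)
        self.subject = subject


john = Scholar("John", "Doe", "CS")
print(john.full_name, john.subject)
```

The mixin carries no state of its own; it only derives a value from attributes that the host structure already defines, which is the same role FullnameMixin plays above.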

2.3.2 doc, fmt, lint and peripheral LSP tools

Although code is the core of programming, the documentation and supporting tools that accompany it also have a large effect on programming efficiency. KCL's design philosophy is therefore not limited to the language itself, but also covers documentation, code formatting tools, code style checking tools, and IDE support. kcl-doc extracts and generates documentation directly from configuration code; automated documentation reduces both the cost of manual maintenance and the cost of learning and communication. kcl-fmt conveniently formats all code in the current directory (including nested subdirectories) into a single style, and uniformly formatted code in turn lowers the cost of communication and code review. The kcl-lint tool evaluates KCL code in parallel against a set of built-in risk-detection rules, so that users can improve their code based on the results.

2.4 Engineering solutions

For a language to be used in engineering practice, a good design is not enough; it also needs complete solutions for routine scenarios such as upgrades, extension, and integration.

2.4.1 Multi-dimensional interface

KCL provides nearly equivalent functional interfaces at different levels of abstraction for ordinary users (the KCL command line), KCL language customizers (Go API, Python API), KCL library extenders (Plugin), and IDE developers (LSP service), thereby offering maximum flexibility.

2.4.2 A configuration DB with a thousand faces

KCL is a configuration-oriented programming language, and the core of configuration is structured data. The complete body of KCL code can therefore be treated as a configuration database. Using KCL's query and update capability for configuration parameters (the override/-O command), attribute values can be queried, temporarily modified, or modified and saved by giving the corresponding configuration attribute path.

Treating configuration code as the single source of truth for this DB not only makes it possible to bring in mature query and analysis techniques from the database field, but also to restructure the configuration code from the configuration's point of view. In automated operations in particular, a pull request containing configuration changes generated by a program makes it easy to bring developers in for code review, and operating through these different interfaces is a good way for people and machines to cooperate.
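Addressing a configuration value by its attribute path is the core of this "configuration as database" view. The Python sketch below shows path-based query and override on a nested dict; it illustrates the idea only and says nothing about how the override/-O command is implemented. The functions get_path and override are hypothetical, and the dotted path format is an assumption.

```python
import copy


def get_path(config, path):
    """Look up a value by a dotted attribute path such as 'a.b.c'."""
    node = config
    for key in path.split("."):
        node = node[key]
    return node


def override(config, path, value):
    """Return a copy of config with the attribute at path set to value."""
    updated = copy.deepcopy(config)
    *parents, leaf = path.split(".")
    node = updated
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated


config = {"appConfiguration": {"resource": {"cpu": "4", "memory": "8Gi"}}}

# A "temporary modification": the original stays untouched.
patched = override(config, "appConfiguration.resource.cpu", "8")
print(get_path(patched, "appConfiguration.resource.cpu"))
```

Returning a modified copy corresponds to a temporary modification; saving the change would mean writing the result back to the source, which this sketch leaves out.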

2.4.3 Smooth version upgrade

As the business and the code evolve, the APIs of the modules involved gradually decay. KCL adopts strict dependency version management, combines it with built-in syntax and checking tools to ensure that APIs can be upgraded and migrated smoothly, and relies on integration testing and the code review process to improve code safety. KCL uses the @deprecated feature to give a warning at an early stage of this decay, leaving users a time window in which to migrate and upgrade; once the API has fully decayed, it can report an error to force the related code to be upgraded.

For example, if an upgrade replaces the name attribute with fullName, the old attribute can be marked with @deprecated:

schema Person:
    @deprecated(version="1.1.0", reason="use fullName instead", strict=True)
    name: str
    ... # Omitted contents

person = Person {
    # report an error when a deprecated attribute is configured
    name: "name"
}

In this way, when Person is instantiated, the statement that initializes the name attribute immediately triggers an error message.
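The two phases described above (warn first, fail later) can be sketched in Python as a lookup in a table of deprecated attributes. This is an illustration of the pattern, not of KCL's decorator implementation; DEPRECATED and make_person are hypothetical names, and reading strict=True as "raise an error instead of warning" is an assumption based on the comment in the KCL example.

```python
import warnings

# Deprecation metadata, copied from the decorator arguments in the example above.
DEPRECATED = {
    "name": {"version": "1.1.0", "reason": "use fullName instead", "strict": True},
}


def make_person(**attrs):
    """Build a Person-like dict, rejecting or flagging deprecated attributes."""
    for key in attrs:
        info = DEPRECATED.get(key)
        if info is None:
            continue
        message = f"{key} is deprecated (version {info['version']}): {info['reason']}"
        if info["strict"]:
            raise ValueError(message)  # the API has fully decayed
        warnings.warn(message, DeprecationWarning, stacklevel=2)  # transition window
    return dict(attrs)


person = make_person(fullName="John Doe")
print(person)
```

Switching strict from False to True at a later release is what turns the early warning into a hard error.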

2.4.4 Built-in module, KCL module, plug-in module

KCL is a configuration-oriented programming language that provides engineering-grade extensibility through built-in modules, KCL modules, and plug-in modules.

User code can use builtin functions directly, without importing them (for example, len to compute the length of a list, or typeof to obtain the type of a value), and basic types such as strings come with built-in methods (for example, for converting the case of a string). More complex common tasks are provided through the standard library: for example, importing the math library with import gives access to mathematical functions, and importing the regex library gives access to regular expressions. KCL code can itself be organized into modules; the Konfig library, for example, abstracts infrastructure and various standard applications into modules for upper-level users. In addition, the Plugin mechanism makes it possible to develop KCL plug-ins in Python. For example, there is currently a meta plug-in that queries the central configuration information over the network, and an app-context plug-in that obtains context information about the current application to simplify code.

3. The realization principle of KCL language

3.1 Overall architecture

Although KCL is a language dedicated to cloud-native configuration and policy definition, its implementation architecture is similar to that of most procedural and functional programming languages: internally it follows the classic "three-stage" compiler architecture. The following is the architecture diagram of the KCL implementation:

There are mainly the following key modules:

Parser: analyzes the KCL source code and produces an AST (abstract syntax tree).
Compiler: traverses the AST several times, performs semantic checks on it (such as type checking and invalid-code checking), optimizes it (for example, by folding constant expressions), and finally generates bytecode that the virtual machine can execute.
Virtual Machine (VM): executes the bytecode generated by the Compiler, computes the resulting configuration, and serializes it to YAML/JSON for output.

The advantage of splitting the architecture into three stages is that front ends for the KCL source language and back ends for the target machine can be combined freely, which greatly reduces the work of building a compiler. At present, for example, the KCL bytecode definition and the back-end virtual machine are developed in-house, and the virtual machine is mainly used to compute configuration results and serialize them to YAML/JSON. If KCL needs to run in another special scenario, such as in a browser, a WASM-compatible back end can be written and KCL can then easily be ported to the browser, while the syntax and semantics of KCL itself, and the compiler's front-end code, need no changes.
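The three stages can be made concrete with a toy pipeline for arithmetic expressions. The Python sketch below is not KCL's parser, bytecode, or virtual machine: it borrows Python's own parser for the front end, and the opcode names (PUSH, LOAD, ADD, SUB, MUL) and every function name are invented for the illustration.

```python
import ast
import json
import operator

OPS = {ast.Add: ("ADD", operator.add),
       ast.Sub: ("SUB", operator.sub),
       ast.Mult: ("MUL", operator.mul)}


def parse(source):
    """Front end: source text -> AST (borrowing Python's own parser)."""
    return ast.parse(source, mode="eval").body


def fold(node):
    """Optimization pass: collapse operations whose operands are constants."""
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            return ast.Constant(OPS[type(node.op)][1](left.value, right.value))
        return ast.BinOp(left, node.op, right)
    return node


def compile_ast(node):
    """Middle stage: AST -> list of (opcode, argument) stack instructions."""
    if isinstance(node, ast.Constant):
        return [("PUSH", node.value)]
    if isinstance(node, ast.Name):
        return [("LOAD", node.id)]
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return (compile_ast(node.left) + compile_ast(node.right)
                + [(OPS[type(node.op)][0], None)])
    raise SyntaxError(f"unsupported syntax: {type(node).__name__}")


def run(code, variables):
    """Back end: execute the bytecode on a small stack machine."""
    by_opcode = {name: func for name, func in OPS.values()}
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(variables[arg])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(by_opcode[opcode](left, right))
    return stack.pop()


bytecode = compile_ast(fold(parse("replicas * (2 + 3)")))
output = json.dumps({"total": run(bytecode, {"replicas": 4})})
print(bytecode, output)
```

Because run only sees the instruction list, it could be replaced by a different back end without touching parse, fold, or compile_ast, which is the property the three-stage split is meant to provide.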

3.2 Communication principle between Go and Python

To better unlock the capabilities of the KCL configuration strategy language and its integration with higher-level automation products, KCLVM currently provides APIs in Python and Go. (By way of comparison, the well-known compiler back end LLVM has a well-designed API that developers can use to quickly build their own programming languages.) With these APIs, users can quickly build peripheral language tools, tools that automatically query and modify KCL code, and so on, strengthening the automation capabilities around the language. Services can then be built on top of this to help more users create their own cloud-native configuration-as-code applications or connect to the infrastructure quickly.

The main body of KCLVM is implemented in Python, while many cloud-native applications are written in Go. To better serve cloud-native application users, KCLVM first built a communication channel between Go programs and Python programs based on CGo and CPython, and on top of it designed RPC calls from Python functions to Go functions. Call parameters are stored in JSON format, so that the capabilities of the KCLVM Python compiler carry over smoothly to Go code, and KCL code can be manipulated from Go with a one-line import and a call.
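What "call parameters are stored in JSON format" means at a language boundary can be shown with a small dispatcher: the caller sends a method name and parameters as JSON text, and receives a JSON reply. The Python sketch below covers only this encoding and dispatch step and leaves out CGo and CPython entirely; the request shape, the REGISTRY, and the eval_config function are hypothetical and are not KCLVM's API.

```python
import json

REGISTRY = {}


def expose(func):
    """Register a function so that it can be called by name."""
    REGISTRY[func.__name__] = func
    return func


@expose
def eval_config(source, overrides):
    # Hypothetical stand-in for a compiler entry point.
    return {"source_length": len(source), "overrides": sorted(overrides)}


def handle(request_json):
    """Decode a JSON request, call the named function, encode the reply."""
    request = json.loads(request_json)
    try:
        result = REGISTRY[request["method"]](**request["params"])
        return json.dumps({"result": result, "error": None})
    except Exception as err:  # report every failure to the caller as data
        return json.dumps({"result": None, "error": str(err)})


reply = handle(json.dumps({
    "method": "eval_config",
    "params": {"source": "a = 1", "overrides": ["b", "a"]},
}))
print(reply)
```

Only strings cross the boundary in this sketch, which is why JSON is a convenient format when the two sides do not share a type system.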

Supplement: In the course of turning this into a service, the approach of calling Python through CGO also ran into some problems. First, the Go + CGO + Python combination made cross-compilation difficult, which created challenges for automated testing and packaging in ACI. Second, when called through CGO, Python does not support cross-language multi-threaded concurrency and therefore cannot exploit multi-core performance. Finally, even when the Python virtual machine is compiled into the Go program through CGO, the Python standard library and third-party libraries still have to be installed.

3.3 Principles of Collaborative Configuration

Once we have a configuration language that is easy to use and ensures stability, the next problem is how to use configuration as code to improve collaboration. For this purpose, KCL configuration can be divided into two kinds: user-side and platform-side. The final configuration is determined jointly by the user-side and platform-side configuration, so there are two collaboration problems:

Collaboration between platform-side configuration and user-side configuration
Collaboration between user-side configurations

To address these collaboration problems, KCL introduces, at the technical level, abstract models such as order-independent syntax and merging of same-name configurations, which cover the different collaborative configuration scenarios.

Take the picture above as an example. During compilation, KCL code forms two graphs (in general, directed acyclic graphs that express the direct reference and ownership relations between configurations), corresponding respectively to the declarations inside a structure and the declarations where the structure is used. The compilation process can be divided into three simple steps:

First, define the structure on the platform side, forming the graph of the structure's internal declarations.
Second, declare and merge the graphs of the different user-side configuration code.
Finally, substitute the result computed from the user-side configuration graph into the platform-side structure's internal declaration graph and solve it, obtaining the complete configuration graph definition.

With this simple calculation process, most substitution operations can be completed at compile time, and only a small amount of computation is needed at run time to obtain the final solution. Type checking and value checking can still be performed while the graphs are being compiled and merged. The difference is that type checking generalizes and takes the upper bound in the partial order (checking whether a variable's value satisfies a given type or a subtype of it), whereas value checking specializes and takes the infimum in the partial order (for example, merging two dictionaries into one).
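The three steps and the "merge two dictionaries" form of value checking can be sketched with plain dicts in Python. Everything here runs at run time, so the sketch says nothing about what KCL does at compile time; unify, platform_schema, and the team_a and team_b configurations are hypothetical, and treating a conflict between two user-side values as an error is an assumption of this sketch.

```python
def unify(left, right, path=""):
    """Merge two configuration values; conflicting scalars are an error."""
    if isinstance(left, dict) and isinstance(right, dict):
        merged = dict(left)
        for key, value in right.items():
            child = f"{path}.{key}".lstrip(".")
            merged[key] = unify(left[key], value, child) if key in left else value
        return merged
    if left == right:
        return left
    raise ValueError(f"conflicting values at {path!r}: {left!r} vs {right!r}")


def platform_schema(user):
    """Step 1: platform-side structure with a default and a derived attribute."""
    config = {"replicas": 1, **user}
    config["label"] = f"{config['app']}-{config['env']}"
    return config


# Step 2: merge the user-side configurations written by different teams.
team_a = {"app": "opsfree", "resource": {"cpu": "4"}}
team_b = {"env": "pre", "resource": {"memory": "8Gi"}}
user_side = unify(team_a, team_b)

# Step 3: substitute the merged user-side result into the platform structure.
final = platform_schema(user_side)
print(final)
```

The merged dictionary carries at least as much information as each input, which is the sense in which merging moves toward a more specialized value.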

4. Prospects for the future

KCL is still developing rapidly, and it has already been trialed in some applications. We hope that KCL will provide stronger capabilities for the Kusion technology stack and play an active role in the evolution of operations, trustworthiness, and cloud-native architecture. We also provide flexible extension and integration options for special, non-standard applications; for example, we are considering how to make the back end support the WebAssembly platform in order to enable more integration scenarios.

When the time is right, we hope to open source all of KCL's code and contribute to the rapid adoption of cloud-native configuration as code.

Thank you all.
