Author: observable

As distributed and serverless applications are adopted by more and more developers and enterprises, the operation and maintenance problems hidden behind them have gradually emerged: request links in a microservice architecture are so long that locating a problem takes a long time, and routine monitoring becomes very difficult for operations teams. To take a specific example, completing a single user request in a distributed application may require processing by several different microservices, and a failure or performance problem in any one of them can significantly affect the response to that request. As the business expands, the call chain becomes more and more complex. Printing logs or relying on APM performance monitoring alone makes it hard to get a full view or to drill down, so troubleshooting and performance analysis feel like the blind men touching an elephant.

Faced with this problem, Google published the paper "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure" [1] to introduce its distributed tracing technology. The paper argues that a distributed tracing system should meet the following business requirements:

• Low performance loss: The overhead that the tracing system imposes on services should be negligible, especially for performance-sensitive applications.
• Low intrusion: Business code should be touched as little as possible, ideally not at all.
• Rapid scaling: The system should scale quickly as the business or the number of microservices grows.
• Real-time display: Data is collected with low latency, so the system can be monitored in real time and abnormal conditions can be handled quickly.


In addition to the above requirements, the paper also describes in detail the three core links of distributed tracing: data collection, data persistence, and data display. Data collection means instrumenting the code and deciding what each request should report. Data persistence means storing the reported data on disk. Data display means querying the requests associated with a TraceID and presenting them in the interface.
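The three links can be illustrated with a minimal sketch. It assumes an in-memory list in place of disk storage, and the names `STORE`, `report_span`, and `query_trace` are hypothetical; they do not come from the Dapper paper.

```python
import uuid

# Hypothetical in-memory "storage"; a real system would persist to disk.
STORE = []

def report_span(trace_id, name, start, end):
    """Collection and persistence: record one instrumented operation."""
    STORE.append({"trace_id": trace_id, "name": name, "start": start, "end": end})

def query_trace(trace_id):
    """Display: fetch every record of one request, ordered by start time."""
    spans = [s for s in STORE if s["trace_id"] == trace_id]
    return sorted(spans, key=lambda s: s["start"])

trace_id = uuid.uuid4().hex
report_span(trace_id, "db-query", start=2.0, end=3.0)
report_span(trace_id, "http-handler", start=1.0, end=4.0)
report_span(uuid.uuid4().hex, "other-request", start=0.0, end=1.0)

# Only the two spans that share the TraceID are returned, in time order.
timeline = [s["name"] for s in query_trace(trace_id)]
print(timeline)
```

The TraceID is the only thing that ties the records together, which is why every reported record has to carry it.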


With the publication of this paper, distributed tracing was accepted by more and more people and its technical concepts gradually took shape. Related products sprang up, and distributed tracing products such as Uber's Jaeger and Twitter's Zipkin made a name for themselves. But this process also brought a problem: although most of these products are based on the Google Dapper paper, each has its own data collection standard and SDK, and the implementations differ. To solve this problem, OpenTracing and OpenCensus were born.

OpenTracing

For many developers, getting an application to support distributed tracing is difficult. Trace data has to be passed not only within a process but also between processes. Harder still, other components also need to support distributed tracing, such as open source services like NGINX, Cassandra, and Redis, or open source libraries like gRPC and ORMs that the service pulls in.


Before OpenTracing, many distributed tracing systems were implemented with application-level instrumentation using incompatible APIs, which made developers uneasy about coupling their applications tightly to any particular tracing system. At the same time, these application-level instrumentation APIs had very similar semantics. The OpenTracing specification was created to solve this API incompatibility and to standardize how trace data is passed from one library to another and from one process to the next. It is a lightweight standardization layer that sits between an application or library and a tracing or log analysis system.

Advantages

The advantage of OpenTracing is that it defines a vendor-independent and platform-independent protocol standard, so developers only need to change the Tracer to add or replace the underlying monitoring implementation. For this reason, the Cloud Native Computing Foundation (CNCF) officially accepted OpenTracing in 2016 as its third project. The first two, Kubernetes and Prometheus, have both become de facto standards in the cloud-native and open source world. This also shows how much importance the industry attaches to observability and unified standards.

OpenTracing consists of an API specification, frameworks and libraries that implement the specification, and project documentation. It standardizes the following:

• Backend-independent APIs: A traced service only needs to call the relevant API and is then supported by any tracing backend that implements it.
• Management of the span, the smallest unit of tracing: APIs are defined for starting a span, finishing a span, and recording span timing.
• Transfer of trace data between processes: APIs are defined to make it easy to pass trace data along.
• Multi-language support: Languages such as Go, Python, JavaScript, Java, C#, Objective-C, C++, Ruby, and PHP are covered. Tracers such as Zipkin, LightStep, and Appdash are supported, and OpenTracing can be easily integrated into frameworks like gRPC, Flask, DropWizard, Django, and Go Kit.


Introduction to core terms

• Trace - A complete request link.
• Span - A logical unit of a call, with a start time and a duration. Each Span encapsulates the following state:
  • Operation name
  • Start timestamp
  • Finish timestamp
  • Span Tags - A set of key-value pairs. The key must be a String, and the value can be a String, Boolean, or Number.
  • Span Logs - A set of logs for the span. Each log entry contains a key-value pair and a timestamp. The key must be a String, and the value can be of any type.
  • References - Zero or more causally related Spans. References between Spans are established through the SpanContext.
• SpanContext - The carrier through which other causally related Spans are referenced.

OpenTracing currently defines two types of references: ChildOf and FollowsFrom. Both reference types model a direct causal relationship between a child span and a parent span.

In a ChildOf relationship, the parent Span waits for the child Span to return. The execution time of the child Span contributes to that of the parent Span, and the parent depends on the child's result. Besides serial tasks, our logic also contains many parallel tasks whose Spans run in parallel; in that case one parent Span waits for all of its parallel child Spans to finish and merges their results. In distributed applications, some upstream systems do not depend on the results of downstream systems in any way, for example when an upstream system sends messages to a downstream system through a message queue. In this case, the child Span of the downstream system has a FollowsFrom relationship with the parent Span of the upstream system.
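The difference between the two reference types can be made concrete with a small model. This is only a sketch under stated assumptions: the `Span` class, the `CHILD_OF`/`FOLLOWS_FROM` constants, and `finish_parent` are hypothetical names for illustration, not the OpenTracing API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

CHILD_OF = "child_of"
FOLLOWS_FROM = "follows_from"

@dataclass
class Span:
    name: str
    start: float
    finish: Optional[float] = None
    # Each reference is a (reference_type, referenced_span) pair.
    references: List[Tuple[str, "Span"]] = field(default_factory=list)

def finish_parent(parent, spans, now):
    """Finish the parent only once every ChildOf child has finished."""
    children = [s for s in spans
                if any(t == CHILD_OF and ref is parent for t, ref in s.references)]
    if any(c.finish is None for c in children):
        raise RuntimeError("a ChildOf child is still running")
    # The parent cannot end before its slowest ChildOf child.
    parent.finish = max([now] + [c.finish for c in children])
    return parent.finish

parent = Span("handle-order", start=0.0)
query_a = Span("query-stock", start=1.0, finish=3.0, references=[(CHILD_OF, parent)])
query_b = Span("query-price", start=1.0, finish=5.0, references=[(CHILD_OF, parent)])
# Sent through a message queue: still running, but the parent does not wait for it.
notify = Span("send-email", start=2.0, references=[(FOLLOWS_FROM, parent)])

end = finish_parent(parent, [query_a, query_b, notify], now=4.0)
print(end)
```

The two parallel ChildOf spans determine when the parent can finish, while the unfinished FollowsFrom span is ignored.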

Data model

With the terminology in place, we can see that the OpenTracing specification has three key, interconnected types: Tracer, Span, and SpanContext. The technical model is also clear: a Trace is implicitly defined by the Spans that belong to it. Each call is a Span, and each Span must carry the global TraceID. A Trace can be regarded as a directed acyclic graph (DAG) of Spans, in which the Spans are connected end to end. The TraceID and related content are carried in the SpanContext and passed along the Span "path" in sequence through the transport protocol. Together, this describes the whole course of one client request through a distributed application. Besides the DAG, which reflects the business perspective, a sequence diagram along a time axis is also a good representation of a Trace, because it shows information such as component call times and ordering more clearly.
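How a SpanContext carries the TraceID along the path can be sketched as follows. The header names `x-trace-id` and `x-span-id` and the helper functions are assumptions made for this example only; real tracers define their own wire formats.

```python
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str

# Hypothetical header names, chosen for this sketch only.
TRACE_HEADER = "x-trace-id"
SPAN_HEADER = "x-span-id"

def inject(ctx, carrier):
    """Write the context into an outgoing carrier, e.g. HTTP headers."""
    carrier[TRACE_HEADER] = ctx.trace_id
    carrier[SPAN_HEADER] = ctx.span_id

def extract(carrier):
    """Read the context from an incoming carrier, or None if absent."""
    if TRACE_HEADER not in carrier:
        return None
    return SpanContext(carrier[TRACE_HEADER], carrier.get(SPAN_HEADER, ""))

def start_span(parent=None):
    """Reuse the parent's TraceID, or start a new trace when there is none."""
    trace_id = parent.trace_id if parent else uuid.uuid4().hex
    return SpanContext(trace_id, uuid.uuid4().hex[:16])

# Upstream service: start a trace and attach it to the outgoing request.
client_ctx = start_span()
headers = {}
inject(client_ctx, headers)

# Downstream service: continue the same trace with a new span.
server_ctx = start_span(parent=extract(headers))
print(server_ctx.trace_id == client_ctx.trace_id)
```

Every span gets its own span ID, but the TraceID stays constant across processes, which is what lets a backend reassemble the DAG later.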


### Best Practices

• Application code

Developers can use OpenTracing to describe causal relationships between services and add fine-grained logging information.

• Library code

Libraries that take intermediate control of requests can integrate with OpenTracing. For example, a web middleware library can use OpenTracing to create spans for requests, and an ORM library can use OpenTracing to describe high-level ORM semantics and the specific SQL queries executed.

• RPC/IPC framework

Any service that calls across processes can use OpenTracing to standardize the format of the trace data.

Related Products

Tracing products that follow the OpenTracing protocol include Jaeger, Zipkin, LightStep, and Appdash. OpenTracing can also be easily integrated into open source frameworks such as gRPC, Flask, Django, and Go kit.

OpenCensus

Across the observability field, operations teams began to pay attention to logs and metrics in addition to distributed tracing in order to better implement DevOps. Metrics monitoring is an important part of observability. It covers machine metrics such as CPU, memory, disk, and network; network protocol metrics such as gRPC request latency and error rate; and business metrics such as the number of users and visits.

OpenCensus provides a unified instrumentation tool that captures tracing Spans across services as well as application-level Metrics.


Advantages

• OpenTracing only supports Traces, whereas OpenCensus supports both Traces and Metrics.
• OpenTracing only defines a specification, whereas OpenCensus also includes an Agent and a Collector.
• OpenCensus has a larger group of backers than OpenTracing, including Google and Microsoft.

What it provides

• A standard communication protocol and a consistent API for handling Metrics and Traces.
• Libraries for multiple languages: Java, C++, Go, .NET, Python, PHP, Node.js, Erlang, and Ruby.
• Integration with RPC frameworks.
• Integration with storage and analysis tools.
• Fully open source, with support for third-party integrations and exporter plug-ins.
• No additional servers or agents are required to support OpenCensus.

Introduction to core terms

In addition to the OpenTracing terms, OpenCensus defines some new ones.
• Tags
OpenCensus allows metrics to be associated with dimensions at recording time, so that measurements can be analyzed from different angles.
• Stats
Stats collect the observations recorded by libraries and applications, then aggregate and export the statistics. They consist of two parts: Recording and Views (aggregation queries over the recorded measurements).
• Trace
In addition to the Span attributes provided by OpenTracing, OpenCensus also supports Parent SpanId, Remote Parent, Attributes, Annotations, Message Events, Links, and other attributes.
• Agent
The OpenCensus Agent is a daemon that lets OpenCensus deployments in multiple languages share Exporters. Instead of the traditional approach of tearing down and reconfiguring OpenCensus Exporters for every language library and every application, with the Agent you only enable the OpenCensus Agent Exporter once for each target language. Operations teams then manage a single set of exporters, ingest data from applications in many languages, and send it to the backend of their choice, while the impact of restarts or redeployments on applications is kept to a minimum. Finally, the Agent also comes with "Receivers", which let the Agent accept observability data in the formats of other backends, such as Zipkin, Jaeger, or Prometheus, and route it to the Exporter of choice.

• Collector
As an important part of OpenCensus, the Collector is written in Go and can accept traffic from any application for which a Receiver is available, regardless of programming language and deployment method. The benefit is obvious: for services or applications that provide Metrics and Traces, a single exporter component is enough to get data out of applications written in many languages.

Developers only need to manage and maintain a single Exporter, and all applications send their data through the OpenCensus Exporter. Developers are also free to choose the backend that the business needs and to change it at any time. Because sending large amounts of data over the network may involve send failures, the Collector provides buffering and retry capabilities to protect data integrity and availability.


• Exporters
OpenCensus can upload data to a variety of backends through different exporter implementations, for example Prometheus for stats, OpenZipkin for traces, Stackdriver Monitoring for stats and Stackdriver Trace for traces, Jaeger for traces, and Graphite for stats.
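The Collector's buffering and retry behaviour described above can be sketched in a few lines. This is a simplified illustration, not the Collector's actual implementation: the class name `BufferingForwarder`, its parameters, and the flaky backend are all hypothetical, and a real implementation would add back-off between retries.

```python
from collections import deque

class BufferingForwarder:
    """Hypothetical stand-in for buffer-and-retry forwarding."""

    def __init__(self, send, max_retries=3, capacity=1000):
        self.send = send                     # callable that raises on failure
        self.max_retries = max_retries
        self.queue = deque(maxlen=capacity)  # oldest items drop when full

    def receive(self, item):
        self.queue.append(item)

    def flush(self):
        """Try to send every buffered item; keep the ones that still fail."""
        delivered = 0
        for _ in range(len(self.queue)):
            item = self.queue.popleft()
            for _attempt in range(self.max_retries):
                try:
                    self.send(item)
                    delivered += 1
                    break
                except ConnectionError:
                    continue
            else:
                self.queue.append(item)      # all retries failed: re-buffer
        return delivered

calls = {"n": 0}
received = []

def flaky_backend(item):
    """Simulated backend in which every odd-numbered call fails."""
    calls["n"] += 1
    if calls["n"] % 2 == 1:
        raise ConnectionError("backend unavailable")
    received.append(item)

forwarder = BufferingForwarder(flaky_backend)
for span in ["span-1", "span-2", "span-3"]:
    forwarder.receive(span)
delivered = forwarder.flush()
print(delivered, received)
```

Because failed items go back into the buffer rather than being dropped, a later `flush()` can still deliver them once the backend recovers.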

Related Products

Products that follow the OpenCensus protocol include Prometheus, SignalFX, Stackdriver, and Zipkin.

Evaluated by functions and features, OpenTracing and OpenCensus each have clear strengths and weaknesses: OpenTracing supports more languages and is less coupled to other systems, while OpenCensus supports both metrics and distributed tracing, from the API layer down to the infrastructure layer. For many developers struggling to choose, a new idea kept coming up: is there a project that can integrate OpenTracing and OpenCensus and also support observability data for logs?

OpenTelemetry

In answer to the previous question, let's first look at what a typical troubleshooting process for a service looks like:
• Discover anomalies through various preset alarms (Metrics/Logs)
• Open the monitoring dashboard, look for abnormal behavior, and identify the abnormal module by querying (Metrics)
• Query and analyze the abnormal module and its associated logs to find the core error message (Logs)
• Locate the code that caused the problem using detailed call chain data (Tracing)

To obtain better observability and solve problems like these quickly, Tracing, Metrics, and Logs are all indispensable.

At the same time, the industry already has a wealth of open source and commercial solutions, including:

• Metrics: Zabbix, Nagios, Prometheus, InfluxDB, OpenFalcon, OpenCensus
• Tracing: Jaeger, Zipkin, SkyWalking, OpenTracing, OpenCensus
• Logs: ELK, Splunk, SumoLogic, Loki, Loggly

There are many solutions, and each has its own protocol formats and data types, so it is difficult for different solutions to be compatible or interoperate. In real business scenarios, several solutions are often mixed, and developers can only write adapters to bridge them.

What is OpenTelemetry

OpenTelemetry was created to integrate Traces, Metrics, and Logs more effectively. As a CNCF incubation project, OpenTelemetry merges the OpenTracing and OpenCensus projects and consists of a set of specifications, APIs, SDKs, tools, and integrations. It brings developers a unified standard for Metrics, Tracing, and Logs: all three share the same metadata structure and can easily be correlated with one another.

OpenTelemetry is vendor- and platform-independent and does not itself provide observability backend services. Depending on user needs, observability data can be exported to different backends for storage, query, and visualization, such as Prometheus, Jaeger, or cloud vendor services.

Advantages

The core strengths of OpenTelemetry are concentrated in the following areas:

• Breaking vendor lock-in once and for all

When operations engineers find that their tools fall short but switching would cost too much, all they can do is curse the vendor for trapping them yet again. OpenTelemetry aims to break this fate by providing a standardized instrumentation framework. Because it is pluggable, common technical protocols and formats can be added easily, which leaves users freer in their choice of services.

• Specification and protocol unification

OpenTelemetry takes a standards-based approach. The focus on standards is especially important for OpenTelemetry because tracing has to interoperate across languages. Many languages come with type definitions that can be used in implementations, such as interfaces for creating reusable components. The standards cover both the specifications needed for the internal implementation of the observability client and the protocol specifications the client needs to communicate with the outside world. Specifically, they include:

• API: Defines the types and operations for Metrics, Tracing, and Logs data.
• SDK: Defines the requirements for language-specific implementations of the API, along with configuration, data processing, and export concepts.
• Data: Defines the OpenTelemetry Protocol (OTLP). Although OpenTelemetry components support the Zipkin v2 and Jaeger Thrift protocol formats, these are provided as third-party contributed libraries. OTLP is the only format that OpenTelemetry supports natively.


Each language implements the specification through its API. The API contains language-specific definitions of types and interfaces: the abstract classes, types, and interfaces used by the concrete language implementation. It also contains no-op implementations to support local testing and to provide tooling for unit tests. The definition of the API lives in each language's implementation. As the OpenTelemetry Python client puts it: "The opentelemetry-api package includes abstract classes and no-op implementations that make up the specification-compliant OpenTelemetry API." The JavaScript client has a similar definition: "This package provides everything needed to interact with the OpenTelemetry API, including all TypeScript interfaces, enums, and no-op implementations. It works both on the server and in the browser."
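The split between an abstract API with a no-op default and a concrete SDK can be illustrated with a small sketch. The classes below (`Tracer`, `NoOpTracer`, `RecordingTracer`) and the `set_tracer`/`get_tracer` helpers are hypothetical names for this example; they are not the actual classes of the opentelemetry-api package.

```python
from abc import ABC, abstractmethod

class Tracer(ABC):
    """API layer: the interface that instrumented code depends on."""

    @abstractmethod
    def start_span(self, name):
        ...

class NoOpTracer(Tracer):
    """API default: accepts every call and records nothing."""

    def start_span(self, name):
        return None

class RecordingTracer(Tracer):
    """SDK layer: a concrete implementation that actually keeps data."""

    def __init__(self):
        self.started = []

    def start_span(self, name):
        self.started.append(name)
        return name

_tracer = NoOpTracer()

def set_tracer(tracer):
    """Install an implementation; typically done once at application start."""
    global _tracer
    _tracer = tracer

def get_tracer():
    return _tracer

def handle_request():
    # Library code only touches the API, so it runs with or without an SDK.
    get_tracer().start_span("handle_request")
    return "ok"

before = handle_request()   # no SDK installed: the call is a no-op
sdk = RecordingTracer()
set_tracer(sdk)
after = handle_request()    # same code, now recorded by the SDK
print(before, after, sdk.started)
```

The design choice this shows is that a library can be instrumented once against the API and ship safely, because nothing is recorded or exported until the application installs an SDK.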

• Multilingual SDK implementation and integration

OpenTelemetry implements an SDK for each common language, combining exporters with the API. An SDK is a concrete, executable implementation of the API. Supported languages include C++, .NET, Erlang/Elixir, Go, Java, JavaScript, PHP, Python, Ruby, Rust, and Swift.

The OpenTelemetry SDK uses the OpenTelemetry API to generate observability data in the language of choice and exports that data to a backend. It also allows common libraries and frameworks to be enhanced. Users can rely on the SDK for both automatic instrumentation and manual instrumentation, and it integrates support for third-party libraries (Log4j, Logback, and so on). These packages are generally implemented according to the specifications and definitions in opentelemetry-specification, adapted to the characteristics of each language, and give the client its basic ability to collect observability data: passing metadata between services and processes, adding trace instrumentation and exporting trace data, and creating, using, and exporting Metrics.

• Implementation of data collection system

A fundamental principle of tracing practice is that collecting observability data should be orthogonal to business logic. The Collector follows this principle and minimizes the impact of the observability client on the original business logic. OpenTelemetry's collection system is based on the OpenCensus Service and includes the Agent and the Collector. The Collector collects, transforms, and exports observability data. It can receive data in multiple formats (such as OTLP, Jaeger, and Prometheus) and send it to one or more backends, and it can process and filter the data before exporting it. The Collector contrib package supports additional data formats and backends.

Architecturally, the Collector has two modes. In the first, the Collector is deployed on the same host as the application (for example as a DaemonSet in Kubernetes) or in the same Pod as the application (for example as a sidecar in Kubernetes), and the telemetry data collected by the application is passed directly to the Collector over the loopback network. This is called Agent mode. In the second, the Collector runs as independent middleware to which applications send their telemetry data. This is called Gateway mode. The two modes can be used alone or together, as long as the protocol format of the exported data matches the protocol format of the receiving side.

• Automatic code injection technology
OpenTelemetry has also started to provide automatic code injection, and currently supports automatic instrumentation of the mainstream Java frameworks.

• Cloud native architecture
OpenTelemetry was designed with cloud-native characteristics in mind and also provides a Kubernetes Operator for rapid deployment.

Data types supported by OpenTelemetry

• Metrics
Metrics are measurements about a service, captured at runtime. Logically, the moment one of these measurements is captured is called a Metric event, which contains not only the measurement itself but also the time it was captured and the associated metadata. Application and request metrics are important indicators of availability and performance. Custom metrics provide insight into how availability affects user experience and the business.

OpenTelemetry currently defines three metric instruments:
• counter: A value that is summed over time. Think of it as a car's odometer: it only ever goes up.
• measure: A value that is aggregated over time. It represents a value within some defined range.
• observer: Captures a set of current values at a particular point in time, like the fuel gauge in a vehicle.
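The behaviour of the three instruments can be modelled with a short sketch. The classes below are hypothetical illustrations of the concepts in the list above, not the OpenTelemetry metrics API.

```python
class Counter:
    """Monotonic sum: like an odometer, it only goes up."""

    def __init__(self):
        self.value = 0

    def add(self, amount):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount

class Measure:
    """Records individual values so they can be aggregated later."""

    def __init__(self):
        self.values = []

    def record(self, value):
        self.values.append(value)

    def summary(self):
        return {"count": len(self.values), "min": min(self.values),
                "max": max(self.values), "sum": sum(self.values)}

class Observer:
    """Reads the current value on demand via a callback, like a fuel gauge."""

    def __init__(self, callback):
        self.callback = callback

    def observe(self):
        return self.callback()

request_count = Counter()
request_count.add(1)
request_count.add(2)

latency_ms = Measure()
for value in (12, 30, 18):
    latency_ms.record(value)

fuel = {"litres": 40}
gauge = Observer(lambda: fuel["litres"])
first = gauge.observe()
fuel["litres"] = 35
second = gauge.observe()

print(request_count.value, latency_ms.summary(), first, second)
```

The counter and the measure are driven by the application calling them, whereas the observer is pulled: it reports whatever the value happens to be at the moment it is read.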

• Logs

Logs are timestamped text records, structured or unstructured, with metadata. Although each log is an independent data source, it can be attached to a Span of a Trace, so that when a call chain is analyzed node by node, the related logs can be viewed alongside it.
In OpenTelemetry, any data that is not part of a distributed Trace or a Metric is a log. Logs are often used to determine the root cause of a problem and usually contain information about who changed what and what the result of the change was.

• Traces

A Trace tracks a single request, which may be initiated by an application or by a user. Distributed tracing is a form of tracing that crosses network and application boundaries. Each unit of work in a Trace is called a Span, and a Trace is a tree of Spans. A Span represents the work done by an individual service or component that the request passes through, and Spans also provide request, error, and duration metrics that can be used to debug availability and performance problems. A Span contains a Span Context, a set of globally unique identifiers that represent the unique request to which each Span belongs. We usually call this the TraceID.

• Baggage

In addition to Trace propagation, OpenTelemetry provides Baggage for propagating key-value pairs. Baggage is used to index observability events in one service with attributes provided by an earlier service in the same transaction, which helps establish causal relationships between events. Although Baggage can be used to prototype other cross-cutting concerns, the mechanism is primarily intended to convey values for the OpenTelemetry observability system. These values can be read from Baggage and used as additional dimensions for metrics or as additional context for logs and traces.
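As a rough sketch of how Baggage key-value pairs might travel between services, the example below serializes them into a single header value. It assumes a comma-separated, percent-encoded `key=value` format loosely modelled on the W3C Baggage header and ignores properties and size limits; `encode_baggage` and `decode_baggage` are hypothetical helpers, not OpenTelemetry functions.

```python
from urllib.parse import quote, unquote

def encode_baggage(entries):
    """Serialize key-value pairs into one header value (simplified format)."""
    return ",".join(
        f"{quote(key, safe='')}={quote(value, safe='')}"
        for key, value in entries.items()
    )

def decode_baggage(header):
    """Parse the header value back into a dict, skipping malformed members."""
    entries = {}
    for member in header.split(","):
        if "=" not in member:
            continue
        key, value = member.split("=", 1)
        entries[unquote(key.strip())] = unquote(value.strip())
    return entries

# An upstream service attaches attributes it knows about the transaction.
baggage = {"user.tier": "gold", "region": "cn beijing"}
header = encode_baggage(baggage)

# A downstream service reads them and uses them as extra metric dimensions.
dimensions = {"endpoint": "/pay", **decode_baggage(header)}
print(header)
print(dimensions)
```

The downstream service never computed `user.tier` itself; it only consumed the value that an earlier service in the same transaction placed in Baggage.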

Just the first step, or a one-stop shop?

Putting the above together, OpenTelemetry covers the specification, the API definition, the implementation of the specification, and the collection and transmission of all observability data types. An application needs only one SDK to generate all types of data in a unified way, and a cluster needs only one OpenTelemetry Collector deployment to collect all types of data. Moreover, Metrics, Tracing, and Logging share the same metadata and can be correlated seamlessly.

What OpenTelemetry sets out to solve is the first step in unifying observability data: standardizing the collection and transmission of that data through APIs and SDKs. OpenTelemetry does not intend to rewrite every component. Instead, it reuses the common tools already established in each area as much as possible by providing a secure, platform-independent, vendor-independent set of protocols, components, and standards. Its positioning is very clear: it unifies data collection and the standard specifications, and it does not officially cover how the data is used, stored, displayed, or alerted on. For an overall observability solution, however, this means OpenTelemetry only unifies the production of data. It offers no clear answer for how to store the data or how to use it for analysis and alerting, and these problems are very prominent.

• Storage methods of various types of data

Metrics can be stored in Prometheus, InfluxDB, or other time series databases, and Tracing can be connected to Jaeger, OpenCensus, or Zipkin. Choosing among these backend services and then operating and maintaining them is a hard problem.

• Data analysis (visualization and correlation)

How can the collected data be analyzed in a unified way? Different data requires different platforms for processing. Displaying Metrics, Logging, and Tracing on one platform and jumping between the three based on their correlation requires a great deal of custom development, which is a heavy burden for operations teams.

• Anomaly detection and diagnosis

Beyond daily visual monitoring, anomaly detection and root cause diagnosis for applications are important operational requirements, which means feeding OpenTelemetry data into AIOps. But many development and operations teams have not yet fully implemented basic DevOps, let alone AIOps.

Best practice: connecting to Application Real-Time Monitoring Service (ARMS) through OpenTelemetry

In response to the problems above, Alibaba Cloud provides Application Real-Time Monitoring Service (ARMS) to help operations teams with data analysis, anomaly detection, and diagnosis. ARMS supports several ways of ingesting OpenTelemetry Trace data: you can report it directly to ARMS or forward it through an OpenTelemetry Collector.

(1) Report directly

• Ingest OpenTelemetry Trace data through the ARMS Java Agent

For Java applications, installing the ARMS Java Agent is recommended. The agent has built-in tracing instrumentation for a large number of common components and automatically reports Trace data in OpenTelemetry format. It works out of the box without additional configuration. For details, see Monitoring Java Applications [2].

• Report Trace data by combining the ARMS Java Agent with the OpenTelemetry Java SDK

ARMS Java Agent v2.7.1.3 and above supports the OpenTelemetry Java SDK extension. While the ARMS Java Agent automatically collects trace data for common components, you can also use the OpenTelemetry SDK to instrument custom methods. For details, see OpenTelemetry Java SDK Support [3].

• Report Trace data directly through OpenTelemetry

You can also instrument the application with the OpenTelemetry SDK and report Trace data directly through the Jaeger Exporter. For details, see Report Java application data through OpenTelemetry [4].

(2) Forwarding through OpenTelemetry Collector

• Forward Trace data through ARMS for OpenTelemetry Collector

In a Container Service (ACK) environment, you can install ARMS for OpenTelemetry Collector with one click and use it to forward Trace data. ARMS for OpenTelemetry Collector provides lossless trace statistics (local pre-aggregation, so the statistics are not affected by the sampling rate), dynamic configuration and tuning, state management, and an out-of-the-box Trace Dashboard on Grafana, making it easier to use, more stable, and more reliable.
The procedure for connecting through ARMS for OpenTelemetry Collector is as follows:

  1. Install ARMS for OpenTelemetry Collector from the ACK console application directory.
    a. Log in to the Container Service Management Console [5].
    b. Select Market > App Market in the left navigation bar.
    c. On the Marketplace page, search for the ack-arms-cmonitor component by keyword, and then click ack-arms-cmonitor.
    d. On the ack-arms-cmonitor page, click One Click Deploy in the upper right corner.
    e. In the Create panel, select the target cluster and click Next. Note: The namespace defaults to arms-prom.
    f. Click OK.
    g. Click Clusters in the left navigation bar, and then click the name of the cluster where the ack-arms-cmonitor component was just installed.
    h. Select Workload > Daemon Sets in the left navigation bar, and select the namespace as arms-prom at the top of the page.
    i. Click otel-collector-service and check whether the otel-collector-service (Service) is running normally. If the various Receiver ports are exposed to receive OpenTelemetry data, the installation succeeded.
  2. Modify the Exporter Endpoint address in the application SDK to otel-collector-service:Port.

• Forward Trace data through the open source OpenTelemetry Collector

To forward Trace data to ARMS using the open source OpenTelemetry Collector, you only need to modify the access point (Endpoint) and authentication information (Token) in the Exporter.

    exporters:
      otlp:
        endpoint: <endpoint>:8090
        tls:
          insecure: true
        headers:
          Authentication: <token>

Notes

• Replace <endpoint> with the Endpoint for your reporting region, for example: http://tracing-analysis-dc-bj.aliyuncs.com:8090.
• Replace <token> with the Token obtained from your console, for example: b590lhguqs@3a7 9b_b590lhguqs@53d **8301.

(3) Guidelines for using OpenTelemetry Trace

To make better use of OpenTelemetry Trace data, ARMS provides a range of diagnostic capabilities, such as trace details, pre-aggregated dashboards, post-aggregation analysis in Trace Explorer, and association of traces with business logs.

• Trace details

The left side of the trace details panel shows the call sequence and the time spent in each interface along the trace. The right side shows additional details and associated metrics, such as database SQL, JVM, and host monitoring metrics.

• Pre-aggregated dashboards

ARMS provides a variety of pre-aggregated metric dashboards based on OpenTelemetry Trace data, including application overview, interface calls, and database calls.


• Trace Explorer post-aggregation analysis

For OpenTelemetry Trace data, ARMS provides flexible multi-dimensional filtering and post-aggregation analysis, such as querying the abnormal traces of a specific application. Trace data can also be aggregated by dimensions such as IP and interface.


• Association of traces with business logs

ARMS supports associating OpenTelemetry Traces with business logs, so that business exceptions can be investigated from the perspective of application interfaces.

Related Links

[1] "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure"
https://static.googleusercontent.com/media/research.google.com/en-US//archive/papers/dapper-2010-1.pdf

[2] Monitoring Java applications
https://help.aliyun.com/document_detail/97924.html

[3] OpenTelemetry Java SDK support
https://help.aliyun.com/document_detail/260950.htm#task-2086462

[4] Report Java application data through OpenTelemetry
https://help.aliyun.com/document_detail/413964.htm#task-104185

[5] Container Service Management Console
https://cs.console.aliyun.com/


