
In recent years, as Internet of Everything technology has developed, Mesh technology has gradually emerged. Mesh is a networking technology that combines multiple access points into a single network to provide service. Compared with traditional WiFi networking, Mesh networking is more stable, faster, and more scalable. With its self-organizing, self-managing, and self-healing characteristics, WiFi Mesh will also occupy an important position in the future Internet of Everything.

Targeting the emerging WiFi Mesh scenario, Baidu Security presented the talk "BadMesher: New Attack Surfaces of Wi-Fi Mesh Network" online at Black Hat Europe 2021. The talk discussed the security attack surfaces of WiFi Mesh, introduced MeshFuzzer, an automated vulnerability mining tool that we designed and implemented, and demonstrated its results in real vulnerability mining.

Topic Interpretation

Basic concepts

EasyMesh concept

EasyMesh is a standardized certification scheme launched by the WiFi Alliance, which has gone through three stages of development:

Figure 1 EasyMesh development process

In 2018, each manufacturer implemented Mesh technology on its own. There was no unified standard, so devices from different manufacturers could not interoperate.

In 2019, the WiFi Alliance launched EasyMesh V1, which introduced the Onboarding process and the Auto-Config process and used the 1905 control protocol to implement most of the control functions in Mesh.

In 2020, the WiFi Alliance launched EasyMesh V2 and V3. V3 adds more control features, especially security features such as authorization and integrity verification for control messages.

Dozens of vendors have passed EasyMesh certification, including MediaTek, Huawei, and ZTE.

EasyMesh architecture

The architecture of EasyMesh is shown in Figure 2, which contains two key links and two key roles.

Figure 2 EasyMesh architecture diagram

Key links

1. Fronthaul link: the exposed WiFi link, that is, the SSID that a mobile phone can find and connect to normally

2. Backhaul link: the hidden WiFi link, that is, an SSID that does not show up in scans and is provided specifically for Mesh

Key roles
1. Controller role: the manager of the Mesh network. It sends control instructions to Agents to manage the Mesh network, which makes the network self-organizing, self-managing, and self-healing

2. Agent role: the executor of the Mesh network. It accepts control instructions from the Controller, executes the tasks, and reports the results back to the Controller

A role here is a logical entity and is not tied to a specific device. A device can act as a Controller, as an Agent, or as a Controller and an Agent at the same time.

Mesh network construction process
The entire Mesh network construction process is divided into the following 2 steps:

1. Onboarding
2. Discovery and Configuration

Onboarding process
The Onboarding process helps a device that has not yet joined the Mesh network to join it. We call such a device an Enrollee device. The entire process is implemented through the 1905 Push Button Configuration protocol (hereinafter referred to as 1905 PBC). 1905 PBC has the following 3 characteristics:

1. Feature 1: Both parties need to press a button

2. Feature 2: It is implemented on top of WiFi Protected Setup (WPS)

3. Feature 3: Its messages are encoded as TLV (Type-Length-Value)

As Figure 3 shows, 1905 PBC carries a dedicated flag in the Multi-AP Extension part, and this flag indicates that the Backhaul SSID is being requested. The Enrollee device can therefore obtain the network access credentials of the Mesh link through 1905 PBC.


Figure 3 Multi-AP Extension

The entire Onboarding process is shown in Figure 4:

Figure 4 Onboarding process

First, press the button on both devices so that they enter the network configuration state.

Second, the Enrollee device runs 1905 PBC over the Fronthaul SSID. After the M1-M8 exchange, the Existing Agent returns the Backhaul SSID and password to the Enrollee device, and the Enrollee device can then connect to the Backhaul SSID and join the Mesh network.

At this point, the Onboarding process is complete.

Discovery and Configuration process

The overall process is shown in Figure 5:

Figure 5 Discovery and Configuration process

After completing the Onboarding process, the Enrollee device needs to find the Controller in the Mesh network to obtain the basic configuration of the current Mesh network. This step uses the IEEE 1905.1a control protocol. The Enrollee device broadcasts an "AP Autoconfig Search" packet to probe for a Controller. If there is a Controller in the network, it replies with "AP Autoconfig Response", and the Enrollee device has then found the Controller. At this point, the Discovery process is complete.
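
To make the Discovery exchange more concrete, here is a minimal Python sketch that builds the bytes of an "AP Autoconfig Search" frame without sending it. The 8-byte CMDU header layout, the message type values (0x0007 for Search, 0x0008 for Response), the EtherType 0x893A, and the multicast address are assumptions based on IEEE 1905.1 and should be checked against the specification. A real Search message also carries mandatory TLVs that are omitted here, and all function names are illustrative.

```python
import struct

ETHERTYPE_1905 = 0x893A                          # assumed IEEE 1905.1 EtherType
MULTICAST_1905 = bytes.fromhex("0180c2000013")   # assumed 1905 multicast address
MSG_AP_AUTOCONFIG_SEARCH = 0x0007                # assumed message type values
MSG_AP_AUTOCONFIG_RESPONSE = 0x0008
END_OF_MESSAGE_TLV = b"\x00\x00\x00"             # type 0, length 0


def build_cmdu(msg_type: int, msg_id: int, tlvs: bytes = b"", relay: bool = False) -> bytes:
    """Build a single-fragment CMDU: 8-byte header, TLVs, end-of-message TLV."""
    flags = 0x80 | (0x40 if relay else 0)        # last-fragment bit, optional relay bit
    # version, reserved, message type, message id, fragment id, flags
    header = struct.pack("!BBHHBB", 0, 0, msg_type, msg_id, 0, flags)
    return header + tlvs + END_OF_MESSAGE_TLV


def build_search_frame(src_mac: bytes, msg_id: int) -> bytes:
    """Wrap an AP Autoconfig Search CMDU in an Ethernet header (mandatory TLVs omitted)."""
    eth = MULTICAST_1905 + src_mac + struct.pack("!H", ETHERTYPE_1905)
    return eth + build_cmdu(MSG_AP_AUTOCONFIG_SEARCH, msg_id, relay=True)


def is_autoconfig_response(frame: bytes) -> bool:
    """Return True if an Ethernet frame carries an AP Autoconfig Response CMDU."""
    if len(frame) < 14 + 8:                      # Ethernet header + CMDU header
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    (msg_type,) = struct.unpack("!H", frame[16:18])
    return ethertype == ETHERTYPE_1905 and msg_type == MSG_AP_AUTOCONFIG_RESPONSE


frame = build_search_frame(bytes.fromhex("020000000001"), msg_id=0x1234)
print(frame.hex())
```

An Enrollee would send such a frame on the Backhaul link and treat the first frame for which `is_autoconfig_response` returns True as the Controller's answer.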

The Configuration process synchronizes the configuration of the current Mesh network to the Enrollee device, such as the user name and password of the Mesh network, the choice of communication channel, and the parameters that maintain network stability. It is implemented through "AP Autoconfig WiFi Simple Configuration" (WSC) messages. Once the Enrollee device has obtained the basic configuration of the Mesh network, it can join the Mesh network as a real Agent, and the construction of the Mesh network is complete.

Mesh network control process

Maintaining and managing the Mesh network is an important task, and it is implemented through IEEE 1905.1a. IEEE 1905.1a is essentially a protocol that sits between the physical layer and the network layer and defines control technology for wired and wireless links in the home network. In the Mesh scenario, IEEE 1905.1a acts as the carrier for a variety of control protocols, such as device discovery, device configuration, and device management. The entire implementation is based on Type-Length-Value (TLV). Some of the EasyMesh control protocols are shown in Table 1:


Table 1 Part of EasyMesh control protocol

Take the "Multi-AP Policy Config Request Message" as an example. As Figure 6 shows, its command word is 0x8003, and the Steering Policy it carries follows the basic TLV layout: Type is 0x89, len is 21, and value is the corresponding payload.


Figure 6 Multi-AP Policy Config Message
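
The TLV framing described above can be sketched in a few lines of Python. This sketch assumes the 1905 layout of a 1-byte type followed by a 2-byte big-endian length. The 21-byte payload is a placeholder and does not model the real Steering Policy fields, and the helper names are illustrative.

```python
import struct
from typing import Tuple


def pack_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV as: type (1 byte), length (2 bytes, big-endian), value."""
    return struct.pack("!BH", tlv_type, len(value)) + value


def read_tlv_header(data: bytes) -> Tuple[int, int]:
    """Return (type, declared length) from the first 3 bytes of a TLV."""
    tlv_type, length = struct.unpack("!BH", data[:3])
    return tlv_type, length


# Placeholder payload of 21 zero bytes; the real Steering Policy fields are not modeled.
steering_policy = pack_tlv(0x89, bytes(21))
print(steering_policy[:3].hex())           # 890015: type 0x89, length 21
print(read_tlv_header(steering_policy))    # (137, 21)
```

Note that the receiver learns how many bytes to read only from the length field, which is why that field is the natural target for the attacks discussed later.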

Attack surface analysis

After analyzing the networking and control processes of the entire Mesh network, let's look at the actual attack surfaces. The attack vectors are two key protocols:
1. 1905 Push Button Configuration Protocol
2. IEEE 1905.1a Control Protocol
They correspond to two key attack surfaces:
1. Attack the network construction process
2. Attack the network control process

Attack the Mesh network construction process

Attack Existing Agent
Attacker: "Bad" Enrollee Agent
Victim: Existing Agent
Attack vector: 1905 Push Button Configuration Protocol (M1, M3, M5, M7)
The entire attack process is shown in Figure 7.

Figure 7 Attacking Existing Agent

The attacker builds a malicious Enrollee device to attack the Existing Agent. Specifically, it sends malformed M1, M3, M5, and M7 packets over 1905 PBC, which can trigger TLV parsing vulnerabilities in the Existing Agent while it processes M1, M3, M5, and M7.

Attack Enrollee Agent

Attacker: "Bad" Existing Agent
Victim: Enrollee Agent
Attack vector: 1905 Push Button Configuration Protocol (M2, M4, M6, M8)
The entire attack process is shown in Figure 8


Figure 8 Attacking Enrollee Agent

The attacker builds a malicious Existing Agent device to attack the Enrollee device. Specifically, it replies with malformed M2, M4, M6, and M8 packets over 1905 PBC, which can trigger TLV parsing vulnerabilities in the Enrollee device while it processes M2, M4, M6, and M8.

Attack the Mesh network control process

After analyzing the attack surface of Mesh network construction, let's look at the attack surface of Mesh network control.
Attacker: "Bad" Existing Agent
Victim: Controller and other Existing Agents
Attack vector: IEEE 1905.1a Control Protocol

An attacker can send malformed 1905 packets to trigger 1905 TLV parsing vulnerabilities in the Controller and in Existing Agents. Figure 9 shows a malicious packet we designed for "AP_AUTOCONFIGURATION_WSC_MESSAGE". We set the len field of the SSID to 0xFF, although an actual SSID is at most 64 long, and filled the SSID payload with 0xFF. The actual packet captured in Figure 10 shows that the SSID field is indeed filled with our 0xFF payload, which is not what SSID parsing expects.


Figure 9 Simulation of sending a malformed IEEE 1905.1a control packet

Figure 10 The actual IEEE 1905.1a control packet
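
As a rough illustration of the malformed SSID field, the Python sketch below builds an SSID attribute whose length is 0xFF and whose value is filled with 0xFF bytes, then shows a bounds check that would reject it. It assumes the SSID is carried as a WSC attribute with a 2-byte ID (0x1045) and a 2-byte length, and it reuses the upper bound of 64 mentioned in the text. The actual packet in Figure 9 may be laid out differently, and the helper names are illustrative.

```python
import struct

ATTR_SSID = 0x1045     # assumed WSC attribute ID for SSID
MAX_SSID_LEN = 64      # upper bound quoted in the text


def wsc_attr(attr_id: int, value: bytes) -> bytes:
    """Encode a WSC attribute: ID (2 bytes), length (2 bytes), value."""
    return struct.pack("!HH", attr_id, len(value)) + value


def ssid_length_ok(attr: bytes) -> bool:
    """Check that the declared SSID length fits both the limit and the data present."""
    if len(attr) < 4:
        return False
    attr_id, length = struct.unpack("!HH", attr[:4])
    return attr_id == ATTR_SSID and length <= MAX_SSID_LEN and length == len(attr) - 4


normal = wsc_attr(ATTR_SSID, b"MeshBackhaul")
malformed = wsc_attr(ATTR_SSID, b"\xff" * 0xFF)   # length 0xFF, value filled with 0xFF

print(ssid_length_ok(normal), ssid_length_ok(malformed))   # True False
```

A parser that copies `length` bytes into a fixed 64-byte SSID buffer without a check like `ssid_length_ok` would write past the end of that buffer.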

Automation tool MeshFuzzer

MeshFuzzer architecture

MeshFuzzer contains two Fuzzing subsystems: one for 1905 PBC and one for 1905.1a. The overall architecture is shown in Figure 11.


Figure 11 MeshFuzzer architecture

The upper part is the Fuzzing subsystem we designed for 1905 PBC. We take the WPS interaction data exchanged between real devices as input, pass it through our TLV mutation system, and finally send it with our 802.1 packet sender. At the same time, the device is connected over a serial port so that we can monitor its crash status in real time.

The lower part is the Fuzzing subsystem we designed for IEEE 1905.1a. We implemented most of the control protocol fields in EasyMesh. These messages likewise pass through our TLV mutation system and are finally sent with our 1905 packet sender, and we monitor the crash status with dedicated 1905 packets.

Mutation strategy

Since both target protocols are implemented on top of TLV, we can use a unified mutation strategy to drive the Fuzzing efficiently.

Mutation strategy 1: Mutate the length field. A length that is too long or too short can trigger conventional memory corruption vulnerabilities in TLV parsing: a length that is too short can lead to an out-of-bounds read or an integer overflow, and a length that is too long can cause an out-of-bounds write and similar problems. Figure 12 shows the effect of mutating the length field to a too-short value in our actual test.

Mutation strategy 2: Randomly add, delete, and modify existing TLV blocks, which may lead to memory corruption caused by logic flaws, such as Double-Free and UAF. Figure 13 shows the effect of randomly adding TLV blocks in our actual test.

Figure 12 Too short length field

Figure 13 Randomly increase the TLV block
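
The two mutation strategies can be sketched as follows. This is a simplified Python illustration under the assumption of 1905-style TLVs (1-byte type, 2-byte length), not the actual MeshFuzzer code. The function names and the set of forged lengths are hypothetical choices.

```python
import random
import struct
from typing import Dict, List, Optional, Tuple

TLV = Tuple[int, bytes]  # (type, value)


def encode(tlvs: List[TLV], forged_lengths: Optional[Dict[int, int]] = None) -> bytes:
    """Serialize TLVs; forged_lengths maps a TLV index to a fake length field."""
    forged_lengths = forged_lengths or {}
    out = b""
    for index, (tlv_type, value) in enumerate(tlvs):
        length = forged_lengths.get(index, len(value))
        out += struct.pack("!BH", tlv_type, length) + value
    return out


def mutate_length(tlvs: List[TLV], rng: random.Random) -> bytes:
    """Strategy 1: forge one length field so it is too short or too long."""
    index = rng.randrange(len(tlvs))
    real = len(tlvs[index][1])
    candidates = [c for c in (0, real - 1, real + 1, 0xFF, 0xFFFF)
                  if 0 <= c <= 0xFFFF and c != real]
    return encode(tlvs, {index: rng.choice(candidates)})


def mutate_blocks(tlvs: List[TLV], rng: random.Random) -> bytes:
    """Strategy 2: randomly add, delete, or modify a whole TLV block."""
    tlvs = list(tlvs)                       # work on a copy, keep the seed intact
    op = rng.choice(["add", "delete", "modify"])
    index = rng.randrange(len(tlvs))
    if op == "add":
        tlvs.insert(index, rng.choice(tlvs))             # duplicate an existing block
    elif op == "delete" and len(tlvs) > 1:
        del tlvs[index]
    else:                                   # modify (also used when only one block is left)
        tlv_type, value = tlvs[index]
        tlvs[index] = (tlv_type, bytes(rng.randrange(256) for _ in value))
    return encode(tlvs)


seed = [(0x01, b"\xaa" * 6), (0x89, bytes(21))]
rng = random.Random(0)
print(mutate_length(seed, rng).hex())
print(mutate_blocks(seed, rng).hex())
```

Seeding the random generator makes a fuzzing run reproducible, which helps when a crash has to be replayed later.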

Fuzzing network construction process

Software and hardware selection

Hardware part: We chose an Ubuntu machine or a Raspberry Pi 4 and used a wireless USB network card to send packets.

Software part: We chose to modify wpa_supplicant to build our custom Fuzzer. The reason is that wpa_supplicant itself supports the 1905 PBC protocol, so we can add our mutation strategies at its different stages and fuzz the Mesh network construction phase efficiently and stably.

Figure 14 wpa_supplicant implementation code

Actual Fuzzing Existing Agent

With the customized Fuzzing tool above, we can simulate the entire 1905 PBC process and inject Fuzzing Payloads in the M1, M3, M5, and M7 stages. Figure 15 shows the crash log of an out-of-bounds write vulnerability in the TLV parsing of the M7 stage, captured during Fuzzing. Figure 16 shows the actual packet we captured.

Figure 15 Out-of-bounds write in the M7 stage

Figure 16 Actual packet of the out-of-bounds write in the M7 stage

We monitor crashes by pinging the target device and capturing crash logs in real time through the serial port.

Actual Fuzzing Enrollee Agent

The other victim of the network construction process is the "Enrollee" that has not yet joined the network. We simulate a malicious "Existing" Agent to fuzz the "Enrollee". To make the Enrollee keep trying to join the Mesh network, we wrote a script, shown in Figure 17.

Figure 17 Script that makes the Enrollee keep joining the Mesh network

We injected Fuzzing Payloads in the M2, M4, M6, and M8 stages. Figure 18 shows an out-of-bounds write vulnerability in the TLV parsing of the M6 stage, triggered during Fuzzing. Figure 19 shows the actual packet we captured.

Figure 18 Out-of-bounds write in the M8 stage

Figure 19 Actual packet of the out-of-bounds write in the M8 stage

Here too, we monitor crashes by pinging the target device and capturing crash logs in real time through the serial port.

Fuzzing network control process

Software and hardware selection

Hardware part: We chose a MacBook Pro because it offers better support for sending 1905 packets.
Software part: We chose the existing open source library pyieee1905 and developed custom protocol fields on top of it, which greatly reduces our Fuzzer development workload. We only need to implement the control protocols in EasyMesh to fuzz the network control part.

Figure 20 pyieee1905

monitoring module

Since most 1905 processing modules run as separate processes, we cannot capture their crashes directly through the serial port, nor can we monitor the state of the 1905 process by sending Ping probes to the device. Instead, we chose the 1905 Topology Query Message provided in EasyMesh, which is used to query the capabilities supported by the 1905 process of another device. Whether the device replies to this message tells us whether its 1905 process is alive and working normally.

Figure 21 Topology Query Message

Each time we send a Fuzzing Payload, we follow it with a 1905 Topology Query. If we get a reply, the 1905 Daemon is working normally. If we do not, the 1905 Daemon may have a problem, so we record the Fuzzing Payload just sent, save it locally, and wait for the process to restart.


Figure 22 1905 crash monitoring and saving
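
The monitoring logic described above can be sketched as a small loop. This is a minimal Python illustration rather than the code shown in Figure 22. The `send` and `probe` callbacks are hypothetical stand-ins for the 1905 packet sender and the Topology Query check, and the file naming and wait limits are arbitrary assumptions.

```python
import time
from pathlib import Path
from typing import Callable, Iterable, List


def fuzz_with_liveness_check(
    payloads: Iterable[bytes],
    send: Callable[[bytes], None],       # hypothetical: sends one fuzzing payload
    probe: Callable[[], bool],           # hypothetical: True if a Topology Query got a reply
    crash_dir: Path,
    restart_wait: float = 5.0,
    max_waits: int = 60,
) -> List[Path]:
    """Send each payload, probe the 1905 daemon, and save payloads that silence it."""
    crash_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, payload in enumerate(payloads):
        send(payload)
        if probe():                      # daemon answered, so move on to the next payload
            continue
        path = crash_dir / f"crash_{index:06d}.bin"
        path.write_bytes(payload)        # keep the suspicious payload for later triage
        saved.append(path)
        for _ in range(max_waits):       # wait for the daemon to come back
            if probe():
                break
            time.sleep(restart_wait)
    return saved
```

Passing the sender and the probe in as callbacks keeps the loop independent of how packets are actually put on the wire, so the same loop can be exercised with stubs before it is pointed at a device.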


Figure 23 Actual crash

Actual effect

Using MeshFuzzer, we found multiple memory corruption vulnerabilities caused by TLV parsing in the EasyMesh solution of the MediaTek MT7915, as well as 1 security issue that violates the security design guidelines. These findings were assigned a total of 19 CVEs, and the list of issues is shown in Figure 24. MediaTek has since fixed all of the problems and released security patches.

Figure 24 MT7915 security issues

Security Recommendation

To deal with memory corruption vulnerabilities caused by TLV parsing, we recommend parsing the data packet completely first, then checking the type and length of each TLV one by one, and only then processing it. When a length or type check fails, the data packet should be discarded.

A good example is wpa_supplicant. Figure 25 shows how wpa_supplicant processes TLV packets, following a parse -> dispatch -> validate -> process flow.

Figure 25 Examples of correct TLV processing
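
The recommendation can be illustrated with a strict parser that walks the whole buffer and rejects the packet on the first failed check. This Python sketch is not wpa_supplicant's code. It assumes 1905-style TLVs (1-byte type, 2-byte length), and the table of allowed types and length bounds is hypothetical.

```python
import struct
from typing import Dict, List, Optional, Tuple

# Hypothetical table of accepted TLV types and their (min, max) value lengths.
ALLOWED: Dict[int, Tuple[int, int]] = {
    0x01: (6, 6),      # e.g. a fixed 6-byte MAC address
    0x89: (1, 512),    # e.g. a variable-length policy TLV
}


def parse_tlvs_strict(data: bytes) -> Optional[List[Tuple[int, bytes]]]:
    """Parse the whole buffer first; return None (discard the packet) on any failed check."""
    tlvs, offset = [], 0
    while offset < len(data):
        if len(data) - offset < 3:               # truncated header
            return None
        tlv_type, length = struct.unpack_from("!BH", data, offset)
        offset += 3
        if tlv_type not in ALLOWED:              # unknown type
            return None
        low, high = ALLOWED[tlv_type]
        if not low <= length <= high:            # length outside the bounds for this type
            return None
        if length > len(data) - offset:          # declared length exceeds the data present
            return None
        tlvs.append((tlv_type, data[offset:offset + length]))
        offset += length
    return tlvs


good = b"\x01\x00\x06" + b"\xaa" * 6 + b"\x89\x00\x02\x01\x02"
bad = b"\x89\x00\xff\x01"                        # claims 255 bytes, carries 1
print(parse_tlvs_strict(good) is not None, parse_tlvs_strict(bad) is None)   # True True
```

Only after `parse_tlvs_strict` returns a list would the handler for each TLV run, so no handler ever sees a length that has not been checked against both the type's bounds and the actual buffer.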

For the issue that violates the security design guidelines, the EasyMesh V3 standard has a section that specifically describes the security capabilities of the 1905 protocol. For example, the Backhaul and Fronthaul links must be isolated, message integrity protection must be added, and 1905 packets must be encrypted. We recommend that manufacturers comply with the EasyMesh standard when implementing EasyMesh so that the security capabilities of the 1905 protocol are in place.

Summary

The whole topic can be summarized as follows:
1. We have discovered multiple security attack surfaces in WiFi Mesh. Attackers can launch attacks on the devices in the Mesh network during the Mesh network construction phase and the network control phase;

2. We have developed an automated vulnerability mining tool MeshFuzzer, which can automatically mine security vulnerabilities introduced by vendors when implementing EasyMesh;

3. In practice, we found multiple security issues in the EasyMesh solution of the MT7915 chip, obtained 19 CVEs, and gave corresponding repair suggestions.


Baidu Security