SCA (Software Composition Analysis) is, in simple terms, a technique for identifying, managing, and tracking software by analyzing the information and features it contains.
1. Introduction
In modern development projects, most of the code in an application is open source, and the remainder mainly serves as "glue" that assembles and calls the various functions. According to statistics from the China Academy of Information and Communications Technology (CAICT), 88.2% of enterprises in China had used open source technology in 2020, an increase of 6.8% over 2018, and a further 9.5% planned to use it. The share of open source software in Chinese applications has clearly grown year by year. Does open source make this software more secure? According to the academy's data, 84% of popular open source projects in 2020 contained at least one vulnerability, up 9 percentage points from 2019, and 60% contained high-risk vulnerabilities, up 11 percentage points from 2019.
Two examples follow:
- The most notable example is last year's vulnerability in Log4j2, an open source Java logging tool. The logging framework is widely used in common services (spring-boot-starter-log4j2, Apache Struts2, Apache Flink). According to an impact-surface query for the Apache Log4j2 vulnerability, as many as 44,029 open source projects are affected, involving 340,657 packages across the related versions.
- In 2020, the first domestic judgment in a GPL copyright dispute was published. The first-instance judgment held that the GPL 3.0 license is a civil legal act of a contractual nature and can be treated as a copyright agreement between the licensor and the user, which falls within the scope of China's "Contract Law". The court ordered the two infringing defendant companies to stop the infringement and to compensate the plaintiff company for economic losses and reasonable rights-protection expenses totaling 500,000 yuan.
As these examples show, open source software carries license risks in addition to vulnerability risks. Avoiding problematic components reduces risk during development.
2. SCA
2.1 Introduction
At this point you may be wondering: if open source software carries so many risks, how can they be avoided? This is where SCA, the subject of this article, comes in. SCA (Software Composition Analysis) is, in simple terms, a technique for identifying, managing, and tracking software by analyzing the information and features it contains. SCA can analyze projects written in any language, such as Java, Golang, Python, and JavaScript, and it can also identify components in binaries, firmware, and similar artifacts. The analysis process works as follows: first decompress the target source code or binary file and extract features (component name, version number) from it; then identify and analyze those features to determine which components and versions the service uses; finally, correlate them with a database of known vulnerabilities to identify known risks.
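The extract, identify, and correlate steps described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the names `Component`, `VULN_DB`, `extract_component`, and `scan` are hypothetical, the feature extraction assumes artifact names of the form `name-version.jar`, and the tiny in-memory database matches exact versions only, whereas real tools work with version ranges and richer fingerprints.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Component:
    name: str
    version: str


# Illustrative in-memory vulnerability database (an assumption for this sketch):
# component name -> list of (advisory id, set of affected versions).
VULN_DB = {
    "log4j-core": [("CVE-2021-44228", {"2.14.0", "2.14.1"})],
}

# Assumes artifact names such as "log4j-core-2.14.1.jar".
_JAR_NAME = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.]*)\.jar$")


def extract_component(filename: str) -> Optional[Component]:
    """Step 1: extract the features (name, version) from an artifact name."""
    match = _JAR_NAME.match(filename)
    if match is None:
        return None
    return Component(match.group("name"), match.group("version"))


def scan(filenames: List[str]) -> List[Tuple[Component, str]]:
    """Steps 2 and 3: identify components and correlate them with known advisories."""
    findings = []
    for filename in filenames:
        component = extract_component(filename)
        if component is None:
            continue  # not something this sketch can identify
        for advisory_id, affected in VULN_DB.get(component.name, []):
            if component.version in affected:
                findings.append((component, advisory_id))
    return findings
```

Calling `scan` on a list of file names returns one `(Component, advisory id)` pair per match, which is the raw material for the notifications discussed later.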
2.2 Difficulties
From the description above, SCA may not sound complicated: it only needs to extract key information and match it. In real-world deployment, however, a number of problems arise, and they make building a mature, easy-to-use SCA tool harder than it looks. The following figure introduces some common difficulties in SCA projects.
3. Rollout
3.1 Architecture
The goal here is to detect the risks of the relevant components automatically and to shift security left, so that developers become aware of risks as early as possible and high-risk services are blocked from going online.
The detection architecture is as follows:
Dewu's security testing platform currently runs not only SCA but also SAST and other tests, shifting security as far left as possible so that problems are found earlier.
3.2 Implementation
The main languages in use at Dewu today are Java, Go, Python, and JavaScript. All four are currently supported, including Jar files packaged from Java projects and statically compiled Go binaries.
Two approaches are currently in use: analyzing build output and parsing package management files.
- Method 1: build output. Taking a Java project as an example, the compiled Jar file is obtained by integrating with the publishing platform. Because the JAR file format is based on the ZIP file format, the file can be unpacked to read the information for all third-party dependency packages in its directory.
- Method 2: package management files. The configuration file for the corresponding language is parsed. Taking Go as an example, the dependency information is obtained by parsing the go.mod and go.sum files.
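Both methods can be sketched in a few lines. The helper names `list_bundled_jars` and `parse_go_mod` are hypothetical, and the sketch makes two assumptions not stated above: that third-party dependencies appear as nested `.jar` entries inside the archive (as in Spring Boot's `BOOT-INF/lib/` layout), and that only `require` directives in go.mod matter (go.sum, `replace`, and `exclude` are ignored).

```python
import io
import zipfile
from typing import Dict, List


def list_bundled_jars(jar_bytes: bytes) -> List[str]:
    """Method 1: a JAR is a ZIP archive, so nested dependency jars can be listed directly."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return sorted(
            entry.rsplit("/", 1)[-1]
            for entry in jar.namelist()
            if entry.endswith(".jar")
        )


def parse_go_mod(text: str) -> Dict[str, str]:
    """Method 2: return {module path: version} for every `require` entry in go.mod."""
    deps: Dict[str, str] = {}
    in_block = False
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop comments such as "// indirect"
        if not line:
            continue
        if in_block:
            if line == ")":
                in_block = False
            else:
                parts = line.split()
                if len(parts) >= 2:
                    deps[parts[0]] = parts[1]
        elif line.replace(" ", "") == "require(":
            in_block = True  # start of a multi-line require block
        elif line.startswith("require "):
            parts = line.split()
            if len(parts) >= 3:
                deps[parts[1]] = parts[2]  # single-line form: require path version
    return deps
```

The file names returned by `list_bundled_jars` still need a name and version extracted from them, while `parse_go_mod` yields module paths and versions directly.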
The security analysis of project dependencies is carried out on the server side. Using the vulnerability library maintained by the security operations team, dependencies with security defects can be identified quickly. When a security risk is detected in a dependency, a notification is sent to the project owner, urging them to fix it.
To make sure developers pay attention to security issues, the detection results are also posted to the GitLab Merge Request, so reviewers see the security findings directly while reviewing the code.
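As a rough sketch of how scan results might be posted to a Merge Request, the snippet below formats the findings and builds (but does not send) a request to the GitLab notes endpoint, `POST /projects/:id/merge_requests/:merge_request_iid/notes`. The function names, the comment layout, and the use of a personal access token are assumptions for illustration; the actual integration at Dewu is not described in this article.

```python
import json
import urllib.request
from typing import List, Tuple


def format_findings(findings: List[Tuple[str, str, str]]) -> str:
    """Render (component, version, advisory id) tuples as a short comment body."""
    if not findings:
        return "SCA scan: no known vulnerable dependencies found."
    lines = ["SCA scan: vulnerable dependencies found: {}".format(len(findings))]
    for component, version, advisory in findings:
        lines.append("- {} {}: {}".format(component, version, advisory))
    return "\n".join(lines)


def build_note_request(base_url: str, project_id: int, mr_iid: int,
                       token: str, body: str) -> urllib.request.Request:
    """Build the HTTP request that would add a comment to a Merge Request."""
    url = "{}/api/v4/projects/{}/merge_requests/{}/notes".format(
        base_url.rstrip("/"), project_id, mr_iid
    )
    data = json.dumps({"body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send the comment. Keeping request construction separate from sending makes the formatting easy to test without a GitLab instance.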
3.3 Vulnerability library
A vulnerability library cannot be built overnight. At present ours is based mainly on open sources such as NVD, GHSA, GLAD, and Go VulnDB, supplemented by vulnerability information collected manually. Note that although NVD's official vulnerability library is very complete, the CPE information it defines does not map exactly onto component information, and many CVE records lack data such as affected versions and fixed versions. This information has to be determined by combining NVD with other vulnerability intelligence.
The good news is that CVE data will be published in the 5.0 format this summer (all CVE records currently use the 4.0 format). The new format adds several data fields. In addition to the required data (CVE ID number, affected product, affected version, and public references), records can carry optional data such as severity scores, researcher credit, additional languages, lists of affected products, additional references, and community contributions. This optional data will enhance CVE records for downstream users and for the entire vulnerability management community. For details, please refer to:
https://www.cve.org/Media/News/item/news/2022/01/11/Changes-Coming-to-CVE-Record
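Combining several vulnerability sources, as this section describes, can be illustrated with a small merge-and-match sketch. Everything in it is an assumption for illustration: the record fields (`package`, `introduced`, `fixed`), the function names, and the version comparison, which handles purely numeric dotted versions only. Real ecosystems need their own version ordering rules.

```python
from typing import Dict, List, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '2.14.1' or 'v1.7.7' into a comparable tuple (numeric dotted versions only)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def merge_advisories(sources: List[Dict[str, dict]]) -> Dict[str, dict]:
    """Merge per-source records keyed by advisory id.

    Earlier sources take precedence; later sources only fill in missing fields.
    """
    merged: Dict[str, dict] = {}
    for source in sources:
        for advisory_id, record in source.items():
            target = merged.setdefault(advisory_id, {})
            for field, value in record.items():
                if target.get(field) is None:
                    target[field] = value
    return merged


def is_affected(version: str, record: dict) -> Optional[bool]:
    """Return True or False when a fixed version is known, None when data is missing."""
    fixed = record.get("fixed")
    if fixed is None:
        return None  # cannot decide from this record alone
    introduced = record.get("introduced") or "0"
    return parse_version(introduced) <= parse_version(version) < parse_version(fixed)
```

Returning `None` rather than guessing keeps records with missing version data visible, so they can be completed from another source or reviewed manually.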
4. Reflections
SCA detection is currently used not only for security testing but also plays an important role in emergency response and asset inventory. Typical cases are the Log4j2 and Fastjson emergency responses. Through the platform, you can query the services that reference a given component, quickly obtain information such as the service owner, release time, and component version, and notify the responsible owner to upgrade.
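The emergency-response lookup described above amounts to an inverted index from component to the services that use it. The sketch below is hypothetical: the `ServiceDependency` fields and the function names are illustrative and do not reflect the platform's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class ServiceDependency:
    service: str
    owner: str
    component: str
    version: str


def build_index(records: Iterable[ServiceDependency]) -> Dict[str, List[ServiceDependency]]:
    """Group dependency records by component name for fast lookup."""
    index: Dict[str, List[ServiceDependency]] = defaultdict(list)
    for record in records:
        index[record.component].append(record)
    return index


def services_using(index: Dict[str, List[ServiceDependency]], component: str,
                   versions: Optional[Set[str]] = None) -> List[ServiceDependency]:
    """Return the records for a component, optionally restricted to given versions."""
    hits = index.get(component, [])
    if versions is None:
        return list(hits)
    return [record for record in hits if record.version in versions]
```

During an incident, the owners of the affected services can then be collected with something like `{r.owner for r in services_using(index, "fastjson")}`.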
At present, Dewu Security mainly uses SCA tools to manage the security of open source components and SAST to scan source code for security vulnerabilities. Together, the two address common security problems in the early stages of software development. In future plans, SCA will shift as far left as possible, detecting security risks before development begins or ensuring that the components developers reference during development are safe. Achieving this requires an internal private package repository at Dewu in which every package has passed security testing.
In general, SCA is not a particularly advanced technology: OWASP released a similar open source detection tool, DependencyCheck, back in 2012. It matters more today because modern development relies increasingly on open source components. Once a referenced component is at risk, the project that uses it is at risk too, and the impact of supply chain security issues grows accordingly. This does not mean that using open source software is a bad thing. It greatly improves development efficiency and should not be abandoned because of security concerns. The goal of security is to minimize disruption to the business while making the systems that support it safer and more reliable.
*Text / Zhang Yangyang
@Dewu Technology public account