A better way to analyze assemblies in .NET Core, beyond ReflectionOnlyLoad

As we all know, using the Assembly.LoadFile() method to analyze an assembly file has certain limitations. If you only want to analyze the assembly, but do not need to execute the assembly, what should you do? Today, I will teach you through a simple experiment.

When writing a .NET program, if we need to analyze an assembly file, we can load it with Assembly.LoadFile() and then inspect the Assembly object that the method returns. However, Assembly.LoadFile() loads the assembly into the program for the purpose of execution, so it places strict requirements on the file being loaded. For example, if an assembly that it depends on is missing, LoadFile() or the type inspection that follows will throw an exception. As another example, loading a .NET Framework assembly in .NET Core can also throw an exception. If we only want to analyze the assembly and do not need to execute it, we need a way to read the assembly file without loading it for execution.

.NET Framework provides Assembly.ReflectionOnlyLoad() to achieve a similar effect, but this method is not supported in .NET Core because it depends on AppDomain. Microsoft once proposed a System.Reflection.TypeLoader in an experimental lab project to implement this function in .NET Core, but for some reason, this class was not provided in the official version of .NET Core.

We know that .NET assemblies are files in PE format, and .NET provides the class PEReader (located in the NuGet package System.Reflection.Metadata) for analyzing PE files, so we can use PEReader to analyze assembly files.

From a PEReader we obtain a MetadataReader by calling GetMetadataReader(). Its TypeDefinitions property lists all the types in the assembly, and calling GetMethods() on a type definition lists all the methods defined in that type. For efficiency, members such as TypeDefinitions and GetMethods() return handle types such as TypeDefinitionHandle and MethodDefinitionHandle. A handle is only a lightweight reference to a location in the metadata; it does not carry details such as type names, method names, or method parameters. To get those details, we pass the handle to MetadataReader methods such as GetTypeDefinition() and GetMethodDefinition(). The following code loads an assembly and prints every type in it together with the methods defined in each type:


// Install-Package System.Reflection.Metadata
using System;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

string file = @"E:\Microsoft.AspNetCore.Components.Web.dll";
using FileStream fileStream = File.OpenRead(file);
using PEReader peReader = new PEReader(fileStream);
// A native DLL or EXE is a valid PE file but carries no CLI metadata.
if (!peReader.HasMetadata)
{
    Console.WriteLine($"{file} doesn't contain CLI metadata.");
    return;
}
var mdReader = peReader.GetMetadataReader();
// A netmodule has metadata but no assembly manifest.
if (!mdReader.IsAssembly)
{
    Console.WriteLine($"{file} is not an assembly.");
    return;
}
foreach (var typeHandler in mdReader.TypeDefinitions)
{
    // Resolve the handle to the type definition, then read its strings from the metadata.
    var typeDef = mdReader.GetTypeDefinition(typeHandler);
    string name = mdReader.GetString(typeDef.Name);
    string nameSpace = mdReader.GetString(typeDef.Namespace);
    Console.WriteLine($"***********{nameSpace}.{name}***********");
    foreach (var methodHandler in typeDef.GetMethods())
    {
        var methodDef = mdReader.GetMethodDefinition(methodHandler);
        Console.WriteLine(mdReader.GetString(methodDef.Name));
    }
}
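
The same handle-then-resolve pattern applies to method parameters, which the listing above does not print. The following is a minimal sketch rather than a definitive implementation: the class name MethodLister, the method ListMethods, and the output format are made up for illustration, while the MetadataReader calls are the ones from System.Reflection.Metadata.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper, not part of any library.
public static class MethodLister
{
    // Returns one line per method, formatted as "Namespace.Type::Method(param1, param2)".
    public static List<string> ListMethods(string path)
    {
        var result = new List<string>();
        using FileStream stream = File.OpenRead(path);
        using var peReader = new PEReader(stream);
        if (!peReader.HasMetadata)
        {
            return result;
        }
        MetadataReader md = peReader.GetMetadataReader();
        foreach (TypeDefinitionHandle typeHandle in md.TypeDefinitions)
        {
            TypeDefinition typeDef = md.GetTypeDefinition(typeHandle);
            string typeName = md.GetString(typeDef.Namespace) + "." + md.GetString(typeDef.Name);
            foreach (MethodDefinitionHandle methodHandle in typeDef.GetMethods())
            {
                MethodDefinition methodDef = md.GetMethodDefinition(methodHandle);
                var paramNames = new List<string>();
                foreach (ParameterHandle paramHandle in methodDef.GetParameters())
                {
                    Parameter param = md.GetParameter(paramHandle);
                    // Sequence number 0 describes the return value, not a parameter.
                    if (param.SequenceNumber == 0)
                    {
                        continue;
                    }
                    paramNames.Add(md.GetString(param.Name));
                }
                string methodName = md.GetString(methodDef.Name);
                result.Add($"{typeName}::{methodName}({string.Join(", ", paramNames)})");
            }
        }
        return result;
    }
}
```

Calling MethodLister.ListMethods(file) with the path used above would return the lines instead of printing them, which makes the result easier to filter or test. Note that this sketch reads only parameter names; parameter types live in the method signature blob and need separate decoding.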

When using PEReader, we first obtain a handle (such as TypeDefinitionHandle) and then ask MetadataReader for the details behind that handle. This performs well, but the code is cumbersome, and more advanced operations take real effort. For example, if we want to read the CustomAttribute information of an assembly, PEReader does not provide a simple method for it, and we need to be quite familiar with the PE and metadata formats to write the corresponding code.
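
To give a sense of the effort involved, here is a rough sketch of reading the AssemblyCompanyAttribute value with MetadataReader alone. The class CompanyReader and its methods are hypothetical names chosen for this illustration. The sketch assumes the attribute has a single string constructor argument, and it decodes the value blob by hand: a two-byte prolog of 0x0001 followed by a serialized string.

```csharp
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper, not part of any library.
public static class CompanyReader
{
    // Returns the AssemblyCompanyAttribute value, or null if the attribute is absent.
    public static string? ReadCompany(string path)
    {
        using FileStream stream = File.OpenRead(path);
        using var peReader = new PEReader(stream);
        if (!peReader.HasMetadata)
        {
            return null;
        }
        MetadataReader md = peReader.GetMetadataReader();
        if (!md.IsAssembly)
        {
            return null;
        }
        foreach (CustomAttributeHandle handle in md.GetAssemblyDefinition().GetCustomAttributes())
        {
            CustomAttribute attr = md.GetCustomAttribute(handle);
            if (GetAttributeTypeName(md, attr) != "System.Reflection.AssemblyCompanyAttribute")
            {
                continue;
            }
            // Custom attribute blob: prolog 0x0001, then the fixed arguments in order.
            BlobReader blob = md.GetBlobReader(attr.Value);
            if (blob.ReadUInt16() != 0x0001)
            {
                return null;
            }
            return blob.ReadSerializedString();
        }
        return null;
    }

    // Finds the full name of the attribute type by walking from its constructor to the owning type.
    private static string? GetAttributeTypeName(MetadataReader md, CustomAttribute attr)
    {
        EntityHandle ctor = attr.Constructor;
        EntityHandle owner;
        if (ctor.Kind == HandleKind.MemberReference)
        {
            // Attribute type defined in another assembly.
            owner = md.GetMemberReference((MemberReferenceHandle)ctor).Parent;
        }
        else if (ctor.Kind == HandleKind.MethodDefinition)
        {
            // Attribute type defined in the assembly being read.
            owner = md.GetMethodDefinition((MethodDefinitionHandle)ctor).GetDeclaringType();
        }
        else
        {
            return null;
        }
        if (owner.Kind == HandleKind.TypeReference)
        {
            TypeReference typeRef = md.GetTypeReference((TypeReferenceHandle)owner);
            return md.GetString(typeRef.Namespace) + "." + md.GetString(typeRef.Name);
        }
        if (owner.Kind == HandleKind.TypeDefinition)
        {
            TypeDefinition typeDef = md.GetTypeDefinition((TypeDefinitionHandle)owner);
            return md.GetString(typeDef.Namespace) + "." + md.GetString(typeDef.Name);
        }
        return null;
    }
}
```

A call such as CompanyReader.ReadCompany(file) would return the company string for the assembly path used earlier. Even for this single-argument attribute the code has to distinguish several handle kinds and know the blob layout, which is the overhead the paragraph above describes.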

We can use the third-party NuGet package AsmResolver.DotNet to simplify reading and analyzing assembly files; it exposes a higher-level object model than PEReader. The following code loads an assembly, prints its company information, and then prints every type in the assembly together with the methods defined in each type:


// Install-Package AsmResolver.DotNet
using System;
using System.Linq;
using AsmResolver;

string file = @"E:\Microsoft.AspNetCore.Components.Web.dll";
// Note: this is AsmResolver's ModuleDefinition, not the one in the System.Reflection.Metadata namespace.
var moduleDef = AsmResolver.DotNet.ModuleDefinition.FromFile(file);
var asmCompanyAttr = moduleDef.Assembly?.CustomAttributes.FirstOrDefault(c =>
    c.Constructor?.DeclaringType?.FullName == "System.Reflection.AssemblyCompanyAttribute");
// The attribute may be absent, so guard against null before reading its first constructor argument.
var utf8Value = (Utf8String?)asmCompanyAttr?.Signature?.FixedArguments[0].Element;
var strValue = (string?)utf8Value;
Console.WriteLine($"companyname:{strValue}");
foreach (var typeDef in moduleDef.GetAllTypes())
{
    string? name = typeDef.Name;
    string? nameSpace = typeDef.Namespace;
    Console.WriteLine($"***********{nameSpace}.{name}***********");
    foreach (var methodDef in typeDef.Methods)
    {
        Console.WriteLine(methodDef.Name);
    }
}

In short, if we need to analyze an assembly and also want to run the code in it, we can use Assembly.LoadFile(). If we only want to analyze the assembly and do not need to run it, PEReader is a better choice, and we can also choose the AsmResolver.DotNet NuGet package for a higher-level API. The author of this article, Yang Zhongke, used AsmResolver.DotNet when he implemented the function of "judging whether an assembly was developed by Microsoft" in the open source project Zack.Commons. You can check the GitHub code repository of this project to view the source code.
