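1. Introduction to the class loading mechanism
Java's default class loading mechanism is the parent delegation model.
Flink offers a child-first class loading mode to handle conflicts between user classes and framework classes. This reduces, to some extent, the version incompatibilities between Flink's dependencies and the user's dependencies that a framework upgrade can introduce. It does not eliminate them: replacing these classes with the user's versions can itself cause errors such as NoSuchMethodError inside the Flink framework at runtime.
For certain core classes, however, Flink still loads from the environment provided by the framework first.
To support this, Flink provides two class loading configuration options:
- classloader.resolve-order: takes one of two values, child-first or parent-first. The former is the mode described above, which loads user classes first; the latter is Java's default parent delegation model. Flink defaults to child-first.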
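- classloader.parent-first-patterns.default: defines a list of patterns that match key classes. Any class matched by one of these patterns is always loaded from the Flink environment first rather than from user code. For example, flink-connector-kafka_2.11-1.12.2.jar has to be placed in Flink's lib directory; simply adding it as an extra classpath entry is not enough.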
2. Source code walkthrough
The entry point is the buildUserCodeClassLoader method of ClientUtils:
public static URLClassLoader buildUserCodeClassLoader(
        List<URL> jars, List<URL> classpaths, ClassLoader parent, Configuration configuration) {
    URL[] urls = new URL[jars.size() + classpaths.size()];
    for (int i = 0; i < jars.size(); i++) {
        urls[i] = jars.get(i);
    }
    for (int i = 0; i < classpaths.size(); i++) {
        urls[i + jars.size()] = classpaths.get(i);
    }
    // 1. Read alwaysParentFirstLoaderPatterns from the configuration; classes matching
    //    these patterns are always loaded from the Flink environment first
    final String[] alwaysParentFirstLoaderPatterns =
            CoreOptions.getParentFirstLoaderPatterns(configuration);
    // 2. Read the class loading mode (resolve order) from the configuration
    final String classLoaderResolveOrder =
            configuration.getString(CoreOptions.CLASSLOADER_RESOLVE_ORDER);
    FlinkUserCodeClassLoaders.ResolveOrder resolveOrder =
            FlinkUserCodeClassLoaders.ResolveOrder.fromString(classLoaderResolveOrder);
    final boolean checkClassloaderLeak =
            configuration.getBoolean(CoreOptions.CHECK_LEAKED_CLASSLOADER);
    return FlinkUserCodeClassLoaders.create(
            resolveOrder,
            urls,
            parent,
            alwaysParentFirstLoaderPatterns,
            NOOP_EXCEPTION_HANDLER,
            checkClassloaderLeak);
}
Take a look at CoreOptions.getParentFirstLoaderPatterns(configuration). Like the read of classLoaderResolveOrder, it calls Configuration.getString():
public static String[] getParentFirstLoaderPatterns(Configuration config) {
    String base = config.getString(ALWAYS_PARENT_FIRST_LOADER_PATTERNS);
    String append = config.getString(ALWAYS_PARENT_FIRST_LOADER_PATTERNS_ADDITIONAL);
    return parseParentFirstLoaderPatterns(base, append);
}
Here are the declarations of the three ConfigOptions in turn:
2.1 CLASSLOADER_RESOLVE_ORDER
public static final ConfigOption<String> CLASSLOADER_RESOLVE_ORDER =
        ConfigOptions.key("classloader.resolve-order")
                .defaultValue("child-first")
                .withDescription(
                        "Defines the class resolution strategy when loading classes from user code, meaning whether to"
                                + " first check the user code jar (\"child-first\") or the application classpath (\"parent-first\")."
                                + " The default settings indicate to load classes first from the user code jar, which means that user code"
                                + " jars can include and load different dependencies than Flink uses (transitively).");
As shown, the default value is set to child-first.
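The string read from this option is turned into a ResolveOrder by ResolveOrder.fromString, whose body is not shown in this walkthrough. As a rough illustration only, such a mapping could look like the sketch below. The enum name ResolveOrderSketch, the case-insensitive comparison, and the error message are assumptions, not Flink's actual source.

```java
/** Hypothetical stand-in for FlinkUserCodeClassLoaders.ResolveOrder; not Flink's real code. */
enum ResolveOrderSketch {
    CHILD_FIRST,
    PARENT_FIRST;

    /** Maps the configured string to a resolve order and rejects anything else. */
    static ResolveOrderSketch fromString(String value) {
        // Assumption: the comparison ignores case.
        if ("parent-first".equalsIgnoreCase(value)) {
            return PARENT_FIRST;
        } else if ("child-first".equalsIgnoreCase(value)) {
            return CHILD_FIRST;
        }
        throw new IllegalArgumentException("Unknown resolve order: " + value);
    }
}
```

Failing fast on an unrecognized value means a typo in the configuration surfaces when the class loader is built rather than silently falling back to one of the two modes.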
2.2 ALWAYS_PARENT_FIRST_LOADER_PATTERNS
public static final ConfigOption<String> ALWAYS_PARENT_FIRST_LOADER_PATTERNS =
        ConfigOptions.key("classloader.parent-first-patterns.default")
                .defaultValue(
                        "java.;scala.;org.apache.flink.;com.esotericsoftware.kryo;org.apache.hadoop.;javax.annotation.;org.slf4j;org.apache.log4j;org.apache.logging;org.apache.commons.logging;ch.qos.logback;org.xml;javax.xml;org.apache.xerces;org.w3c")
                .withDeprecatedKeys("classloader.parent-first-patterns")
                .withDescription(
                        "A (semicolon-separated) list of patterns that specifies which classes should always be"
                                + " resolved through the parent ClassLoader first. A pattern is a simple prefix that is checked against"
                                + " the fully qualified class name. This setting should generally not be modified. To add another pattern we"
                                + " recommend to use \"classloader.parent-first-patterns.additional\" instead.");
The default value here is
"java.;scala.;org.apache.flink.;com.esotericsoftware.kryo;org.apache.hadoop.;javax.annotation.;org.slf4j;org.apache.log4j;org.apache.logging;org.apache.commons.logging;ch.qos.logback;org.xml;javax.xml;org.apache.xerces;org.w3c"
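The values are separated by ;. A class whose fully qualified name starts with one of these prefixes, such as java. or org.apache.flink., is always loaded from the Flink environment first.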
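To make the prefix rule concrete, here is a minimal sketch of the check in plain Java. The class ParentFirstMatchSketch, its isParentFirst helper, and the shortened pattern list are made up for illustration; com.example.MyJob stands for an arbitrary user class.

```java
/** Illustrative only: checks a class name against semicolon-separated prefixes. */
final class ParentFirstMatchSketch {
    // Shortened excerpt of the default value; the real list is longer.
    static final String PATTERNS = "java.;scala.;org.apache.flink.;org.slf4j";

    /** Returns true if className starts with any non-empty prefix in patterns. */
    static boolean isParentFirst(String className, String patterns) {
        for (String prefix : patterns.split(";")) {
            if (!prefix.isEmpty() && className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A Flink class matches the "org.apache.flink." prefix.
        System.out.println(isParentFirst("org.apache.flink.api.common.functions.MapFunction", PATTERNS));
        // A user class matches none of the prefixes.
        System.out.println(isParentFirst("com.example.MyJob", PATTERNS));
    }
}
```

Because the comparison is a plain startsWith, a pattern without a trailing dot such as org.slf4j also matches any other package whose name merely begins with those characters.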
2.3 ALWAYS_PARENT_FIRST_LOADER_PATTERNS_ADDITIONAL
public static final ConfigOption<String> ALWAYS_PARENT_FIRST_LOADER_PATTERNS_ADDITIONAL =
        ConfigOptions.key("classloader.parent-first-patterns.additional")
                .defaultValue("")
                .withDescription(
                        "A (semicolon-separated) list of patterns that specifies which classes should always be"
                                + " resolved through the parent ClassLoader first. A pattern is a simple prefix that is checked against"
                                + " the fully qualified class name. These patterns are appended to \""
                                + ALWAYS_PARENT_FIRST_LOADER_PATTERNS.key()
                                + "\".");
As with 2.2, this option can be understood as the user-defined set of classes that should be loaded from the Flink environment first.
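The body of parseParentFirstLoaderPatterns(base, append) is not shown in this walkthrough. The following is a sketch of how the default and additional lists could be merged into one array, assuming empty entries are skipped and surrounding whitespace is trimmed. The class PatternMergeSketch and the method mergePatterns are hypothetical names, and com.example.shared. is a made-up user prefix.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: merges two semicolon-separated pattern lists into one array. */
final class PatternMergeSketch {
    static String[] mergePatterns(String base, String additional) {
        List<String> result = new ArrayList<>();
        for (String source : new String[] {base, additional}) {
            if (source == null) {
                continue;
            }
            for (String pattern : source.split(";")) {
                String trimmed = pattern.trim();
                // An empty default such as "" contributes no patterns.
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] merged = mergePatterns("java.;scala.", "com.example.shared.");
        System.out.println(String.join(", ", merged));
    }
}
```

Keeping the additional patterns in a separate option lets users extend the list without copying and maintaining the long default string.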
2.4 Creating the user code class loader
public static URLClassLoader create(
        ResolveOrder resolveOrder,
        URL[] urls,
        ClassLoader parent,
        String[] alwaysParentFirstPatterns,
        Consumer<Throwable> classLoadingExceptionHandler,
        boolean checkClassLoaderLeak) {
    // Create a different class loader depending on resolveOrder
    switch (resolveOrder) {
        case CHILD_FIRST:
            return childFirst(
                    urls,
                    parent,
                    alwaysParentFirstPatterns,
                    classLoadingExceptionHandler,
                    checkClassLoaderLeak);
        case PARENT_FIRST:
            return parentFirst(
                    urls, parent, classLoadingExceptionHandler, checkClassLoaderLeak);
        default:
            throw new IllegalArgumentException(
                    "Unknown class resolution order: " + resolveOrder);
    }
}
2.5 ChildFirstClassLoader
The parent-first class loader is simply the native parent delegation model, so only the loading logic of ChildFirstClassLoader needs a closer look:
protected Class<?> loadClassWithoutExceptionHandling(String name, boolean resolve)
        throws ClassNotFoundException {
    // First, check if the class has already been loaded
    Class<?> c = findLoadedClass(name);
    if (c == null) {
        // check whether the class should go parent-first
        for (String alwaysParentFirstPattern : alwaysParentFirstPatterns) {
            if (name.startsWith(alwaysParentFirstPattern)) {
                return super.loadClassWithoutExceptionHandling(name, resolve);
            }
        }
        try {
            // check the URLs
            c = findClass(name);
        } catch (ClassNotFoundException e) {
            // let URLClassLoader do it, which will eventually call the parent
            c = super.loadClassWithoutExceptionHandling(name, resolve);
        }
    } else if (resolve) {
        resolveClass(c);
    }
    return c;
}
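- First, check whether the class has already been loaded;
- If it has not, try to match its name against alwaysParentFirstPatterns;
- A class that matches is loaded through the superclass's loading logic, which still follows the parent delegation model;
- A class that does not match is loaded by this class loader itself via findClass; only if that throws ClassNotFoundException does it fall back to the parent-delegating path.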
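The steps above can be reproduced outside Flink with a plain URLClassLoader subclass. The sketch below is a simplified stand-in, not Flink's ChildFirstClassLoader: the class name ChildFirstSketch is made up, it overrides loadClass directly instead of loadClassWithoutExceptionHandling, and it leaves out Flink's exception handler and leak check.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Simplified child-first loader for illustration; not Flink's implementation. */
final class ChildFirstSketch extends URLClassLoader {
    private final String[] parentFirstPrefixes;

    ChildFirstSketch(URL[] urls, ClassLoader parent, String[] parentFirstPrefixes) {
        super(urls, parent);
        this.parentFirstPrefixes = parentFirstPrefixes;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Step 1: reuse the class if this loader has already loaded it.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                // Steps 2 and 3: matching names go through normal parent delegation.
                for (String prefix : parentFirstPrefixes) {
                    if (name.startsWith(prefix)) {
                        return super.loadClass(name, resolve);
                    }
                }
                try {
                    // Step 4: look in this loader's own URLs first.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the user URLs: fall back to parent delegation.
                    c = super.loadClass(name, resolve);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With an empty URL array nothing can be found locally, so every request ends up at the parent. That makes the fallback path easy to observe before pointing the loader at real jars.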