In some business scenarios, you need to convert Word files to PDF. Word files are easy to edit, whereas PDF files are more convenient to view, keep their formatting largely unchanged, and are harder to edit by mistake.
If you use Java, you can convert Word files to PDF with the help of the open source Jacob library. What is Jacob? Here is the official definition:
Jacob is a Java library that enables Java applications to communicate with Microsoft Windows DLLs or COM libraries, using a custom DLL that the Jacob Java classes communicate with through JNI.
In other words, Jacob performs the conversion by calling the Office application installed on the local system through jacob.dll, so using Jacob comes with some restrictions:
- Because it relies on a DLL, it only runs on Windows;
- A Word-compatible application, such as Microsoft Office or WPS Office, must be installed on the system.
The following sections walk through the setup and the code.
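Because of the Windows restriction, it can help to fail fast with a clear message before Jacob is ever loaded. The following is a minimal sketch; the class name JacobPlatformCheck and its methods are hypothetical helpers, not part of Jacob, and the check only inspects the standard os.name system property. It cannot tell whether Word or WPS is actually installed.

```java
import java.util.Locale;

/** Hypothetical guard (not part of Jacob): reject non-Windows platforms early. */
final class JacobPlatformCheck {

    /** Returns true when the given os.name value denotes a Windows system. */
    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Throws if the current JVM is not running on Windows. */
    static void requireWindows() {
        String osName = System.getProperty("os.name");
        if (!isWindows(osName)) {
            throw new UnsupportedOperationException(
                    "Jacob needs Windows and a local Office installation, but os.name is " + osName);
        }
    }

    public static void main(String[] args) {
        System.out.println("Windows detected: " + isWindows(System.getProperty("os.name")));
    }
}
```

Calling requireWindows() at the start of a conversion turns a confusing native-library error on Linux or macOS into an explicit message.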
1. Introduce Jacob dependency
The official Jacob repository on GitHub is https://github.com/freemansoft/jacob-project . The latest version at the time of writing is 1.20.
If you are using Jacob for the first time, it is recommended to download the zip package directly from GitHub, because converting Word to PDF requires not only jacob.jar but also the jacob.dll library, and both are included in the zip package.
For convenient later use, you can install the downloaded jar into the local Maven repository. The command, annotated option by option, is as follows:
mvn install:install-file
-Dfile=C:\Users\admin\Downloads\jacob-1.20\jacob.jar # location of the jar file
-DgroupId=com.jacob # groupId to assign to the installed jar
-DartifactId=jacob # artifactId to assign to the installed jar
-Dpackaging=jar # packaging type, here jar
-Dversion=1.20 # version number; any value that follows Maven's version naming rules works
The line breaks and comments above are only for explanation. Written on one line, as it should be entered, the command looks like this:
mvn install:install-file -Dfile=C:\Users\admin\Downloads\jacob-1.20\jacob.jar -DgroupId=com.jacob -DartifactId=jacob -Dpackaging=jar -Dversion=1.20
After the command succeeds, the installed artifact appears in the local Maven repository.
At this point, you can reference it by adding the following dependency to the pom.xml file:
<dependency>
    <groupId>com.jacob</groupId>
    <artifactId>jacob</artifactId>
    <version>1.20</version>
</dependency>
2. Copy the jacob.dll file
The jacob.dll file comes in x86 and x64 variants; choose the one that matches your system. In this example, jacob-1.20-x64.dll is copied into the jdk/bin directory. Alternatively, you can copy it into the jdk/jre/bin directory.
For official documentation, please refer to: https://github.com/freemansoft/jacob-project/blob/main/docs/UsingJacob.md
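As an alternative to copying the DLL into the JDK directory, Jacob's library loader also reads the jacob.dll.path system property, which must be set before the first Jacob class is loaded. The sketch below shows one way to set it; the class JacobDllLocator and the folder C:\libs\jacob are hypothetical. It picks the DLL by JVM bitness, since the DLL has to match the JVM that loads it, and it assumes the sun.arch.data.model property, which HotSpot JVMs provide but other JVMs may not.

```java
import java.io.File;

/** Hypothetical helper (not part of Jacob): point Jacob at a DLL outside the JDK. */
final class JacobDllLocator {

    /** Picks the 1.20 DLL name for a JVM data model such as "64" or "32". */
    static String dllNameFor(String dataModel) {
        return "64".equals(dataModel) ? "jacob-1.20-x64.dll" : "jacob-1.20-x86.dll";
    }

    /** Sets jacob.dll.path to the matching DLL inside dllDirectory and returns that path. */
    static String configure(File dllDirectory) {
        // Assumption: sun.arch.data.model is available (true on HotSpot JVMs).
        String dataModel = System.getProperty("sun.arch.data.model");
        File dll = new File(dllDirectory, dllNameFor(dataModel));
        System.setProperty("jacob.dll.path", dll.getAbsolutePath());
        return dll.getAbsolutePath();
    }

    public static void main(String[] args) {
        // Hypothetical folder that holds the DLLs from the downloaded zip package.
        System.out.println("jacob.dll.path = " + configure(new File("C:\\libs\\jacob")));
    }
}
```

This keeps the JDK installation untouched, which is convenient when several projects or JDK versions share one machine.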
3. Code implementation
package com.magic.jacob;

import java.io.File;
import java.nio.file.Files;
import java.util.Objects;

import com.jacob.activeX.ActiveXComponent;
import com.jacob.com.ComThread;
import com.jacob.com.Dispatch;
import com.jacob.com.Variant;

/**
 * Utility class for converting Word files to PDF
 */
public class WordToPdfUtils {

    /** Word export format constant for PDF (wdExportFormatPDF), value 17 */
    private static final int WORD_FORMAT_PDF = 17;

    private static final String MS_OFFICE_APPLICATION = "Word.Application";
    private static final String WPS_OFFICE_APPLICATION = "KWPS.Application";

    /**
     * Convert Word to PDF with Microsoft Office.
     * If the conversion fails, you may need to download and install the SaveAsPDFandXPS.exe add-in.
     * @param wordFile Word file
     * @param pdfFile PDF file
     */
    public static void msOfficeToPdf(String wordFile, String pdfFile) {
        wordToPdf(wordFile, pdfFile, MS_OFFICE_APPLICATION);
    }

    /**
     * Convert Word to PDF with WPS Office.
     * @param wordFile Word file
     * @param pdfFile PDF file
     */
    public static void wpsOfficeToPdf(String wordFile, String pdfFile) {
        wordToPdf(wordFile, pdfFile, WPS_OFFICE_APPLICATION);
    }

    /**
     * Convert Word to PDF.
     * @param wordFile Word file
     * @param pdfFile PDF file
     * @param application Office application (COM program ID)
     */
    private static void wordToPdf(String wordFile, String pdfFile, String application) {
        Objects.requireNonNull(wordFile);
        Objects.requireNonNull(pdfFile);
        Objects.requireNonNull(application);
        ActiveXComponent app = null;
        Dispatch document = null;
        try {
            File outFile = new File(pdfFile);
            // Create the target directory if it does not exist; otherwise the export fails
            if (!outFile.getParentFile().exists()) {
                Files.createDirectories(outFile.getParentFile().toPath());
            }
            // Delete the target file first if it already exists
            if (outFile.exists()) {
                outFile.delete();
            }
            // Choose the application according to what is installed locally:
            // use KWPS.Application for WPS Office,
            // and Word.Application for Microsoft Office
            app = new ActiveXComponent(application);
            app.setProperty("Visible", new Variant(false));
            // 3 = msoAutomationSecurityForceDisable: do not run macros in opened documents
            app.setProperty("AutomationSecurity", new Variant(3));
            Dispatch documents = app.getProperty("Documents").toDispatch();
            // Open(FileName, ConfirmConversions = false, ReadOnly = true)
            document = Dispatch.call(documents, "Open", wordFile, false, true).toDispatch();
            Dispatch.call(document, "ExportAsFixedFormat", pdfFile, WORD_FORMAT_PDF);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (document != null) {
                Dispatch.call(document, "Close", false);
            }
            if (app != null) {
                app.invoke("Quit", 0);
            }
            ComThread.Release();
        }
    }
}
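The utility above takes both paths explicitly. When the PDF should simply sit next to the Word file, the target path can be derived from the source path. The following is a small sketch of that idea; PdfPathDeriver and toPdfPath are hypothetical names, and the helper only handles the .doc and .docx extensions.

```java
/** Hypothetical helper: derive the PDF path from a Word file path. */
final class PdfPathDeriver {

    /** True if path ends with ext, ignoring case. */
    private static boolean hasExtension(String path, String ext) {
        return path.regionMatches(true, path.length() - ext.length(), ext, 0, ext.length());
    }

    /** Replaces a trailing .docx or .doc with .pdf; otherwise appends .pdf. */
    static String toPdfPath(String wordFile) {
        if (hasExtension(wordFile, ".docx")) {
            return wordFile.substring(0, wordFile.length() - ".docx".length()) + ".pdf";
        }
        if (hasExtension(wordFile, ".doc")) {
            return wordFile.substring(0, wordFile.length() - ".doc".length()) + ".pdf";
        }
        return wordFile + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(toPdfPath("D:\\Test\\test_word.docx"));
    }
}
```

The result can then be passed as the pdfFile argument of msOfficeToPdf() or wpsOfficeToPdf().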
If the DLL file cannot be found, an UnsatisfiedLinkError is thrown. The error output looks like this:
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jacob-1.20-x64 in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at com.jacob.com.LibraryLoader.loadJacobLibrary(LibraryLoader.java:184)
at com.jacob.com.JacobObject.<clinit>(JacobObject.java:110)
at com.magic.springlearning.jacob.WordToPdfUtils.officeToPdf(WordToPdfUtils.java:31)
at com.magic.springlearning.jacob.WordToPdfUtils.main(WordToPdfUtils.java:17)
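When this error appears, a quick way to diagnose it is to check which directories on java.library.path actually contain the DLL. The sketch below does only that; LibraryPathProbe and findOnPath are hypothetical names, and the DLL file name is taken from the error message above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical diagnostic: look for a native library file on a search path. */
final class LibraryPathProbe {

    /** Returns every existing file named fileName in the directories of searchPath. */
    static List<File> findOnPath(String searchPath, String fileName) {
        List<File> hits = new ArrayList<>();
        if (searchPath == null) {
            return hits;
        }
        for (String entry : searchPath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue;
            }
            File candidate = new File(entry, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String libraryPath = System.getProperty("java.library.path");
        List<File> hits = findOnPath(libraryPath, "jacob-1.20-x64.dll");
        if (hits.isEmpty()) {
            System.out.println("DLL not found in any directory of: " + libraryPath);
        } else {
            hits.forEach(f -> System.out.println("Found: " + f.getAbsolutePath()));
        }
    }
}
```

If nothing is found, either copy the DLL into one of the listed directories or start the JVM with -Djava.library.path pointing at the folder that contains it.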
4. Test verification
public static void main(String[] args) {
    wpsOfficeToPdf("D:\\Test\\test_word.docx", "D:\\Test\\test_word.pdf");
}
Because WPS Office is installed locally, the wpsOfficeToPdf() method is used here. In testing, the msOfficeToPdf() method also worked, but the converted file was twice as large.
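Because wordToPdf() only prints exceptions, a failed conversion does not surface to the caller. One simple programmatic check after the call is to confirm that the output file exists and starts with the PDF signature %PDF-. The following is a minimal sketch with hypothetical names (PdfHeaderCheck, looksLikePdf); it checks only the header and does not validate the rest of the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical check: does a file start with the PDF signature? */
final class PdfHeaderCheck {

    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if file is a regular file whose first bytes are "%PDF-". */
    static boolean looksLikePdf(Path file) throws IOException {
        if (!Files.isRegularFile(file)) {
            return false;
        }
        byte[] head = new byte[PDF_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // Read until the buffer is full or the file ends
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        return filled == head.length && Arrays.equals(head, PDF_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // Output path taken from the test above
        System.out.println(looksLikePdf(Paths.get("D:\\Test\\test_word.pdf")));
    }
}
```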