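Office to PDF Conversion Options [Java Edition]
The best-performing solutions for converting Office documents to PDF
After a long search I finally found several workable conversion options. Without further ado, the reliable ones currently fall into the following categories:
- Open-source components: OpenOffice / LibreOffice
- Enterprise APIs: WPS / Office
- System-dependent: documents4j + Windows WPS / Windows Office
- Pure dependency: aspose-words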
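Open-source components
Taking LibreOffice as an example, you need to set up an Office server and expose an API port for Java or other services to call.
Pros: open source and free
Cons: the converted output does not always match the original layout; service stability is poor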
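Installation steps
Taking a Linux server as an example, the environment is set up as follows (the commands below use the Apache OpenOffice package):
- Download the installation package from the official website
- Extract the downloaded package (Apache_OpenOffice_4.1.14_Linux_x86-64_install-rpm_zh-CN.tar.gz) to /tmp/OpenOffice
- Change to /tmp/OpenOffice/zh-CN/RPMS and run yum localinstall *.rpm
- After installation, a desktop-integration directory is created under the current directory. Change to desktop-integration and run yum localinstall openoffice4.1.14-redhat-menus-4.1.14-9811.noarch.rpm
- Change to /opt/openoffice4/program/. To avoid errors when OpenOffice starts, first run yum install libXext.x86_64 && yum groupinstall "X Window System"
- Start the service:
nohup /opt/openoffice4/program/soffice -headless -accept="socket,host={{IP}},port={{Port}};urp;" -nofirststartwizard &
The environment setup for OpenOffice and LibreOffice is essentially the same; only the file names differ, while the locations are largely identical.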
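The launch command from the last step can also be assembled from Java, for instance when a supervising service is expected to start the Office process itself. The following is a minimal sketch rather than part of the original setup: the class name SofficeLauncher, the host 127.0.0.1, and the port 8100 are illustrative assumptions, and the install path is the one used in the steps above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds the headless soffice launch command. */
final class SofficeLauncher {

    /** Builds the same arguments as the shell command, one list entry per argument. */
    static List<String> buildCommand(String sofficePath, String host, int port) {
        List<String> command = new ArrayList<>();
        command.add(sofficePath);
        command.add("-headless");
        // No shell is involved here, so the accept string needs no surrounding quotes.
        command.add("-accept=socket,host=" + host + ",port=" + port + ";urp;");
        command.add("-nofirststartwizard");
        return command;
    }

    /** Starts the service as a child process that shares this JVM's stdout/stderr. */
    static Process start(String sofficePath, String host, int port) throws IOException {
        return new ProcessBuilder(buildCommand(sofficePath, host, port))
                .inheritIO()
                .start();
    }

    public static void main(String[] args) {
        // Assumed values: adjust the path, host and port to your environment.
        List<String> command = buildCommand("/opt/openoffice4/program/soffice", "127.0.0.1", 8100);
        // Only prints the command; call start(...) to actually launch the process.
        System.out.println(String.join(" ", command));
    }
}
```

Passing the arguments as a list avoids shell quoting issues around the semicolons in the accept string.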
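Enterprise APIs
Taking WPS as an example, you need to register and verify an enterprise account, then convert documents by calling its API over HTTP. See the WPS Open Platform for details.
Pros: good conversion quality
Cons: closed source; billed per conversion; documents leave your own environment (data-leak risk)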
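System-dependent
Taking documents4j + Windows WPS as an example, conversion is done through inter-process communication.
Pros: good conversion quality; data does not leave your environment
Cons: requires a Windows environment; enterprise use requires a license
Setup steps
- Install WPS and a JRE on Windows
- Write the Java code and run the service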
Code example
pom.xml
<dependencies>
    <dependency>
        <groupId>com.documents4j</groupId>
        <artifactId>documents4j-local</artifactId>
        <version>1.1.12</version>
    </dependency>
    <dependency>
        <groupId>com.documents4j</groupId>
        <artifactId>documents4j-transformer-msoffice-word</artifactId>
        <version>1.1.12</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <!-- Java version and source encoding used when compiling the project -->
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.7.0</version>
            <configuration>
                <target>1.8</target>
                <source>1.8</source>
                <encoding>UTF-8</encoding>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>3.1.0</version>
            <configuration>
                <archive>
                    <manifest>
                        <mainClass>Main</mainClass>
                    </manifest>
                </archive>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
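*.java
//Main.java
import org.cikaros.convert.DocumentConvertRemote;
import org.cikaros.convert.IDocumentConvert;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
public class Main {
    public static void main(String[] args) throws RemoteException {
        // Export the converter as an RMI remote object on an anonymous port
        IDocumentConvert convert = new DocumentConvertRemote();
        IDocumentConvert skeleton = (IDocumentConvert) UnicastRemoteObject.exportObject(convert, 0);
        // Start a registry on port 10099 and bind the exported object under the interface name
        Registry registry = LocateRegistry.createRegistry(10099);
        registry.rebind(IDocumentConvert.class.getName(), skeleton);
    }
}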
package org.cikaros.convert;
import java.rmi.Remote;
import java.rmi.RemoteException;
public interface IDocumentConvert extends Remote {
    // Takes the bytes of a DOCX file and returns the bytes of the resulting PDF
    byte[] convert(byte[] data) throws RemoteException;
}
package org.cikaros.convert;
import com.documents4j.api.DocumentType;
import com.documents4j.api.IConverter;
import com.documents4j.job.LocalConverter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.rmi.RemoteException;
public class DocumentConvertRemote implements IDocumentConvert {
    // synchronized: only one conversion runs at a time
    @Override
    public synchronized byte[] convert(byte[] data) throws RemoteException {
        IConverter converter = LocalConverter.builder()
                .build();
        try (ByteArrayInputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            converter.convert(in).as(DocumentType.DOCX).to(out).as(DocumentType.PDF).execute();
            return out.toByteArray();
        } catch (IOException e) {
            // Report the failure to the caller instead of silently returning an empty array
            throw new RemoteException("DOCX to PDF conversion failed", e);
        } finally {
            // Always release the converter, even when the conversion fails
            converter.shutDown();
        }
    }
}
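To see the RMI wiring on its own, without Windows, WPS, or documents4j, the export, bind, lookup, and call sequence can be exercised inside a single JVM. The sketch below is an illustration under stated assumptions: RmiRoundTrip, ByteConverter, and ReverseConverter are hypothetical names, ByteConverter only mirrors the shape of IDocumentConvert, and the stand-in implementation reverses the bytes instead of producing a PDF.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Arrays;

final class RmiRoundTrip {

    /** Same shape as IDocumentConvert, redeclared so the sketch is self-contained. */
    public interface ByteConverter extends Remote {
        byte[] convert(byte[] data) throws RemoteException;
    }

    /** Stand-in implementation: reverses the bytes instead of converting a document. */
    static final class ReverseConverter implements ByteConverter {
        @Override
        public byte[] convert(byte[] data) {
            byte[] out = new byte[data.length];
            for (int i = 0; i < data.length; i++) {
                out[i] = data[data.length - 1 - i];
            }
            return out;
        }
    }

    /** Exports, binds, looks up and calls the converter, then cleans up. */
    static byte[] roundTrip(byte[] input) throws Exception {
        // Keep the demo on the loopback interface (assumption: nothing else set this).
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        ByteConverter impl = new ReverseConverter();
        ByteConverter stub = (ByteConverter) UnicastRemoteObject.exportObject(impl, 0);
        // Port 0 lets the runtime pick a free port; the article's service uses 10099.
        Registry registry = LocateRegistry.createRegistry(0);
        try {
            String name = ByteConverter.class.getName();
            registry.rebind(name, stub);
            ByteConverter remote = (ByteConverter) registry.lookup(name);
            return remote.convert(input);
        } finally {
            // Unexport both objects so the JVM can exit.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(roundTrip(new byte[] {1, 2, 3})));
    }
}
```

In the real service the registry and the exported object stay alive for the lifetime of the process; the cleanup in the finally block is only there so that this demo terminates.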
Usage in other projects
...
// Host and port come from where the service above is deployed
Registry registry = LocateRegistry.getRegistry(IP, PORT);
IDocumentConvert convert = (IDocumentConvert) registry.lookup(IDocumentConvert.class.getName());
// Adjust the file paths as needed
// readAllBytes reads the whole file; a single InputStream.read() call may return fewer bytes
byte[] docx = Files.readAllBytes(DOCX_FILE_PATH);
byte[] pdf = convert.convert(docx);
Files.write(PDF_FILE_PATH, pdf);
...
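The bytes returned over RMI are opaque to the client, so a failed conversion is easy to miss until someone opens the file. A small sanity check before writing the result can help; the sketch below relies only on the fact that PDF files begin with the ASCII marker %PDF-. The class name PdfCheck and the method looksLikePdf are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side check for the bytes returned by the conversion service. */
final class PdfCheck {

    // Every PDF file starts with this ASCII marker, followed by the version number.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data is non-empty and starts with the PDF header marker. */
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

On the client this could be called right after convert.convert(docx), refusing to write the output file when looksLikePdf(pdf) is false. It only inspects the header and does not validate the rest of the document.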
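Pure dependency
Simply add the dependency to your project. See the official website.
Pros: good conversion quality; data does not leave your environment; cross-platform
Cons: enterprise use requires a license, which is exorbitantly priced; the trial version adds a watermark
Setup steps
- Prepare a JRE
- Call the API at the appropriate place in your project
Code example
pom.xml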
<dependencies>
    <!-- <dependency>-->
    <!--     <groupId>com.aspose</groupId>-->
    <!--     <artifactId>aspose-words</artifactId>-->
    <!--     <version>19.5.0</version>-->
    <!--     <scope>system</scope>-->
    <!--     <systemPath>${basedir}/src/main/resources/aspose-words/aspose-words-19.5jdk.jar</systemPath>-->
    <!-- </dependency>-->
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-words</artifactId>
        <version>20.12</version>
        <scope>system</scope>
        <systemPath>${basedir}/src/main/resources/aspose-words/aspose-words-20.12-jdk17.jar</systemPath>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <!-- Java version and source encoding used when compiling the project -->
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.7.0</version>
            <configuration>
                <target>1.8</target>
                <source>1.8</source>
                <encoding>UTF-8</encoding>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>3.1.0</version>
            <configuration>
                <archive>
                    <manifest>
                        <mainClass>Main</mainClass> <!-- entry-point class -->
                    </manifest>
                </archive>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef> <!-- jar name suffix; the generated jar is named like project-1.0-SNAPSHOT-jar-with-dependencies.jar -->
                </descriptorRefs>
            </configuration>
            <!-- With this section, mvn package or mvn install builds the jar directly -->
            <!-- Without it, run mvn package assembly:single -->
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
*.java
import com.aspose.words.Document;
import com.aspose.words.PdfSaveOptions;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
public class Main {
    public static void main(String[] args) {
        String docx = "...";
        String pdf = "...";
        try (
            InputStream input = Files.newInputStream(Paths.get(docx));
            FileOutputStream output = new FileOutputStream(pdf);
        ) {
            // Load the Word document and save it as PDF with default options
            Document wordDoc = new Document(input);
            PdfSaveOptions pso = new PdfSaveOptions();
            wordDoc.save(output, pso);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
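The example above leaves both paths as placeholders. When many files are converted, the output path is often derived from the input path instead of being typed by hand. A minimal sketch of such a helper follows; the class name PdfPaths and the choice to write the PDF next to the source file are assumptions for illustration, not part of the original code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that derives the PDF path from a source document path. */
final class PdfPaths {

    /** Replaces the last extension of the file name with ".pdf", in the same directory. */
    static Path pdfPathFor(Path source) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // dot > 0 keeps names such as ".hidden" intact instead of treating them as an extension.
        String base = dot > 0 ? name.substring(0, dot) : name;
        return source.resolveSibling(base + ".pdf");
    }

    public static void main(String[] args) {
        System.out.println(pdfPathFor(Paths.get("docs", "report.docx")));
    }
}
```

The resulting path can be converted with toString() and used as the pdf variable in the Main class above.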
https://blog.cikaros.top/doc/4bd75a2e.html