Business needs
Product Manager: Xiao Ming, we need to build an attachment upload feature. The attachments may be images, PDFs, or videos.
Xiao Ming: That's doable, but the file size has to be limited, ideally to no more than 30MB. Larger files will upload slowly and put heavy load on the server.
Product Manager: After discussing it, video is a must-have. Just cap it at 50MB.
Xiao Ming: OK.
Tester: Uploading this file is too slow. I tried a 50MB file and it took a minute.
Xiao Ming: What? That slow?
Product Manager: That won't do, it's far too slow. Find a way to optimize it.
The road to optimization
Identify the problem
The overall call chain for a file upload is: front end (browser) -> back-end service -> file service.
Xiao Ming found that nearly 30 seconds passed before the front end even began sending the upload request to the back end, which was probably caused by the browser being slow to parse the file.
The back-end service's request to the file service was also relatively slow.
Solution
Xiao Ming: Does the file service have an asynchronous interface?
File service: Not at the moment.
Xiao Ming: This upload is really slow. Do you have any optimization suggestions?
File service: No. I took a look, and it really is just that slow.
Xiao Ming: ...
In the end, Xiao Ming decided to change the back end's synchronous response to an asynchronous one, to reduce the time users spend waiting.
The back-end implementation was adjusted to fit the business: the front-end call immediately receives an asynchronous request identifier, and the back end later uses that identifier to look up the result that the file service returned synchronously.
The drawback is just as obvious: if the asynchronous upload fails, the user is not told.
However, time was short, so after weighing the pros and cons this version went live as a stopgap.
Xiao Ming has had some free time recently, so he started thinking about implementing a file service himself.
File service
Because the existing file service is very primitive, Xiao Ming decided to implement one himself and optimize it in the following respects:
(1) Compression
(2) Asynchronous
(3) Instant upload ("second pass")
(4) Concurrency
(5) Direct connection
Compression
In day-to-day development, clarify the requirement with the product manager as early as possible and, where the business allows it, let users upload and download compressed files.
This matters because network transfer is very time-consuming.
Another benefit of compression is that it saves storage space, although storage cost is usually not a major concern.
Advantages: simple to implement, with a noticeable effect.
Disadvantages: it depends on the business, and the product manager has to be convinced. If the product needs image preview or video playback, compression is not a good fit.
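As a rough illustration of the compression idea, here is a minimal sketch that gzips a payload before transfer and restores it on the other side. The class name GzipSketch and its methods are hypothetical helpers, not part of the file service discussed here; only the standard java.util.zip classes are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of the original file service.
final class GzipSketch {

    private GzipSketch() {
    }

    /** Compresses the payload with GZIP before it is sent over the network. */
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed at this point, so the trailer has been written.
        return buffer.toByteArray();
    }

    /** Restores the original bytes on the receiving side. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

How much this helps depends on the content: text-like data usually shrinks a lot, while formats such as JPEG and MP4 are already compressed and gain little.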
Asynchronous
For time-consuming operations, the natural choice is asynchronous execution, which shortens the time users spend waiting synchronously.
After receiving the file content, the server returns a request identifier and executes the processing logic asynchronously.
How does the caller get the execution result?
There are generally two common approaches:
(1) Provide a result query interface
This is relatively simple, but some polling queries are wasted because the result is not ready yet.
(2) Provide an asynchronous result callback
This is more work to implement, but the caller gets the result as soon as it is available.
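Both approaches can be sketched in a few lines. The example below is only an illustration under assumed names (AsyncUploadSketch, Status, submit, query are all hypothetical): submit returns a request identifier immediately, query covers approach (1), and the optional callback covers approach (2). The store method is a placeholder for the slow call to the storage layer.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;

// Hypothetical sketch; names and statuses are assumptions, not a real file service API.
final class AsyncUploadSketch {

    enum Status { PROCESSING, SUCCESS, FAILED, UNKNOWN }

    private final Map<String, Status> results = new ConcurrentHashMap<>();
    private final ExecutorService worker = Executors.newFixedThreadPool(4);

    /**
     * Accepts the content, returns a request identifier right away,
     * and runs the slow processing on a worker thread.
     * The callback (approach 2) may be null when the caller prefers polling.
     */
    String submit(byte[] content, BiConsumer<String, Status> callback) {
        String requestId = UUID.randomUUID().toString();
        results.put(requestId, Status.PROCESSING);
        worker.execute(() -> {
            Status status;
            try {
                store(content);
                status = Status.SUCCESS;
            } catch (RuntimeException e) {
                status = Status.FAILED;
            }
            // Record the result first, then notify, so a query from the callback sees it.
            results.put(requestId, status);
            if (callback != null) {
                callback.accept(requestId, status);
            }
        });
        return requestId;
    }

    /** Approach 1: the caller polls with the identifier. */
    Status query(String requestId) {
        return results.getOrDefault(requestId, Status.UNKNOWN);
    }

    void shutdown() {
        worker.shutdown();
    }

    // Placeholder for the slow call to the storage layer.
    private void store(byte[] content) {
        if (content == null || content.length == 0) {
            throw new IllegalArgumentException("empty file");
        }
    }
}
```

Recording the failure status is what lets the user eventually learn that an asynchronous upload failed, whether through polling or through the callback.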
Instant upload ("second pass")
You have probably used a cloud drive and noticed that sometimes even a very large file finishes uploading in an instant.
How is this achieved?
The file content is run through a hash function, and the resulting hash value serves as a practically unique fingerprint of that content.
Before uploading, the client checks whether the hash already exists on the server. If it does, the server simply adds a reference to the stored file and the transfer step is skipped.
Of course, the benefit only shows when you store a large volume of user files with a meaningful duplication rate.
The pseudocode is as follows:
public FileUploadResponse uploadByHash(final String fileName,
                                       final String fileBase64) {
    FileUploadResponse response = new FileUploadResponse();

    // Check whether a file with the same content hash already exists
    String fileHash = Md5Util.md5(fileBase64);
    FileInfoExistsResponse fileInfoExistsResponse = fileInfoExists(fileHash);
    if (!RespCodeConst.SUCCESS.equals(fileInfoExistsResponse.getRespCode())) {
        // The existence check itself failed: pass the error back to the caller
        response.setRespCode(fileInfoExistsResponse.getRespCode());
        response.setRespMessage(fileInfoExistsResponse.getRespMessage());
        return response;
    }

    Boolean exists = fileInfoExistsResponse.getExists();
    FileUploadByHashRequest request = new FileUploadByHashRequest();
    request.setFileName(fileName);
    request.setFileHash(fileHash);
    // asyncFlag is not declared in this snippet; it is assumed to be a field of the enclosing client class
    request.setAsyncFlag(asyncFlag);

    // Send the content only when the server does not have the file yet
    if (!Boolean.TRUE.equals(exists)) {
        request.setFileBase64(fileBase64);
    }

    // Call the server
    return fillAndCallServer(request, "api/file/uploadByHash", FileUploadResponse.class);
}
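The client code above leaves the server side of the hash check open. A minimal sketch of what it might look like is shown below, under several assumptions: the store is an in-memory map (a real service would persist content and counts), the class and method names are hypothetical, and SHA-256 over the raw file bytes is swapped in for the MD5-over-Base64 hash used above, since MD5 collisions can be produced deliberately.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store; a real service would persist content and counts.
final class DedupStoreSketch {

    private final Map<String, byte[]> contentByHash = new ConcurrentHashMap<>();
    private final Map<String, Integer> referenceCount = new ConcurrentHashMap<>();

    /** Hex-encoded SHA-256 of the raw file bytes. */
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Existence check the client calls before deciding whether to send the content. */
    boolean exists(String hash) {
        return contentByHash.containsKey(hash);
    }

    /** Instant path: only the hash travels; returns false if the content is unknown. */
    boolean addReference(String hash) {
        return referenceCount.computeIfPresent(hash, (key, count) -> count + 1) != null;
    }

    /** Slow path: stores the bytes the first time and counts the reference. */
    String upload(byte[] content) {
        String hash = sha256Hex(content);
        contentByHash.putIfAbsent(hash, content.clone());
        referenceCount.merge(hash, 1, Integer::sum);
        return hash;
    }

    int references(String hash) {
        return referenceCount.getOrDefault(hash, 0);
    }
}
```

Keeping a reference count rather than a plain flag means the stored bytes can be deleted safely once the last user removes the file.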
Concurrency
Another approach is to split a relatively large file into pieces.
For example, a 100MB file is cut into 10 sub-files that are uploaded concurrently, all tagged with the same unique batch number for that file.
On download, the pieces are fetched concurrently by batch number and spliced back into the complete file.
The pseudocode is as follows:
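public FileUploadResponse concurrentUpload(final String fileName,
                                           final String fileBase64) throws InterruptedException {
    // Split into at most 10 segments; round up so a remainder does not create an 11th segment
    int limitSize = Math.max(1, (fileBase64.length() + 9) / 10);
    final List<String> segments = StringUtil.splitByLength(fileBase64, limitSize);

    // Upload the segments concurrently
    int size = segments.size();
    final ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
    final CountDownLatch lock = new CountDownLatch(size);
    for (int i = 0; i < size; i++) {
        final int index = i;
        Thread t = new Thread() {
            @Override
            public void run() {
                try {
                    // Upload segments.get(index) here and record its result in map under index
                } finally {
                    // Count down even if the upload throws, so await() cannot hang
                    lock.countDown();
                }
            }
        };
        t.start();
    }

    // Wait for all segments to finish
    lock.await();

    // Process the per-segment results in map and build the response
    FileUploadResponse response = new FileUploadResponse();
    return response;
}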
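For a version that can actually be run, here is a self-contained sketch of the split, upload, and splice cycle on raw bytes. Everything in it is an assumption made for illustration: the class name ChunkedTransferSketch, the in-memory map that stands in for the remote file service, and the use of a random UUID as the batch number. For brevity the download side only splices the chunks in index order; a real client would also fetch them concurrently.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the in-memory map stands in for the remote file service.
final class ChunkedTransferSketch {

    // batch number -> (chunk index -> chunk bytes)
    private final Map<String, Map<Integer, byte[]>> storage = new ConcurrentHashMap<>();

    /** Splits the content into fixed-size chunks; the last one may be shorter. */
    static List<byte[]> split(byte[] content, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int from = 0; from < content.length; from += chunkSize) {
            int to = Math.min(content.length, from + chunkSize);
            chunks.add(Arrays.copyOfRange(content, from, to));
        }
        return chunks;
    }

    /** Uploads all chunks in parallel and returns the batch number. */
    String upload(byte[] content, int chunkSize, int threads) {
        List<byte[]> chunks = split(content, chunkSize);
        String batchNo = UUID.randomUUID().toString();
        Map<Integer, byte[]> parts = new ConcurrentHashMap<>();
        storage.put(batchNo, parts);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < chunks.size(); i++) {
                final int index = i;
                // A real client would send the chunk over the network here.
                pending.add(pool.submit(() -> {
                    parts.put(index, chunks.get(index));
                }));
            }
            for (Future<?> future : pending) {
                future.get(); // wait, and surface any failed chunk
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("upload interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("chunk upload failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return batchNo;
    }

    /** Splices the chunks of one batch back together in index order. */
    byte[] download(String batchNo) {
        Map<Integer, byte[]> parts = storage.get(batchNo);
        if (parts == null) {
            throw new IllegalArgumentException("unknown batch " + batchNo);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < parts.size(); i++) {
            byte[] part = parts.get(i);
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }
}
```

Using a bounded thread pool instead of one thread per chunk keeps the number of simultaneous connections under control when files are split into many pieces.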
Direct connection
Another strategy is for the client to talk to the file service directly, bypassing the back-end service.
The prerequisite is that the file service exposes an HTTP file upload interface.
Security also needs to be considered: ideally the front end first calls the back end to obtain an authorization token, then carries that token when uploading the file.
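One possible shape for such a token is an expiry time signed with a secret shared between the back end and the file service. The sketch below is an assumption for illustration only: the token format "userId:expiresAtMillis:signature", the class name UploadTokenSketch, and the choice of HMAC-SHA256 are not taken from any existing service.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical token format "userId:expiresAtMillis:signature"; not an existing API.
final class UploadTokenSketch {

    // Secret shared by the back end (issuer) and the file service (verifier).
    private final byte[] secret;

    UploadTokenSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Back end: issues a token that is valid until expiresAtMillis. */
    String issue(String userId, long expiresAtMillis) {
        String payload = userId + ":" + expiresAtMillis;
        return payload + ":" + sign(payload);
    }

    /** File service: checks signature and expiry before accepting the upload. */
    boolean verify(String token, long nowMillis) {
        int cut = token.lastIndexOf(':');
        if (cut < 0) {
            return false;
        }
        String payload = token.substring(0, cut);
        String signature = token.substring(cut + 1);
        // Constant-time comparison avoids leaking how many characters matched.
        boolean signatureOk = MessageDigest.isEqual(
                sign(payload).getBytes(StandardCharsets.UTF_8),
                signature.getBytes(StandardCharsets.UTF_8));
        if (!signatureOk) {
            return false;
        }
        try {
            long expiresAt = Long.parseLong(payload.substring(payload.lastIndexOf(':') + 1));
            return nowMillis < expiresAt;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    private String sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] raw = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the file service can verify the token with the shared secret alone, it does not need to call the back end on every upload.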
Summary
File upload is a very common business requirement, and upload performance is something that has to be considered and optimized.
The methods above can be flexibly combined to suit your own business.
I hope this article is helpful to you. If you like it, please like, bookmark, and share it.
I'm Lao Ma, and I look forward to seeing you next time.