Business needs

Product Manager: Xiao Ming, we need to build an attachment upload feature. The content may be pictures, PDFs, or videos.

Xiao Ming: It can be achieved, but the file size must be limited. It is best not to exceed 30MB. If it is too large, the upload will be slow and the server will be under great pressure.

Product Manager: We discussed it, and video is a must. Just limit files to 50MB or less.

Xiao Ming: Yes.

A FEW DAYS LATER

Tester: Uploading this file is too slow. I tried a 50MB file and it took a minute.

Xiao Ming: What's going on? Why is it so slow?

Product Manager: No, you are too slow, try to optimize it.

The road to optimization

Identify the problem

The overall file upload call link is as follows:

[Figure: file upload call chain]

Xiao Ming found that nearly 30 seconds passed before the front end even started sending the upload request to the back end, most likely because the browser was slow to parse the file.

The back-end service's request to the file service is also relatively slow.

Solution

Xiao Ming: Does the file service have an asynchronous interface?

File service: Not at the moment.

Xiao Ming: This upload is really slow. Do you have any optimization suggestions?

File service: No. We looked into it, and it really is just that slow.

Xiao Ming: ...

In the end, Xiao Ming decided to change the back end from a synchronous response to an asynchronous one, to reduce the time the user spends waiting.

The back-end implementation was adapted to the business flow: the front-end call immediately receives an asynchronous request identifier, and the back end later uses that identifier to look up the result that the file service returned synchronously.

The disadvantage is also obvious: if the asynchronous upload fails, the user is not notified.

However, given the time constraints and after weighing the pros and cons, this version went live as a temporary solution.

Xiao Ming has had some free time recently, so he thought about implementing a file service himself.

File service

Because the existing file service is very basic, Xiao Ming decided to implement his own and to optimize it in the following areas:

(1) Compression

(2) Asynchronous

(3) Instant upload ("second pass")

(4) Concurrency

(5) Direct connection

Compression

In daily development, communicate with the product team as clearly as possible, and where the business allows it, let users upload and download compressed files, because network transfer is very time consuming.

Another benefit of compressing files is to save storage space. Of course, we generally don't need to consider this cost.

Advantages: simple to implement, outstanding effect.

Disadvantages: It depends on the business scenario, and you need to convince the product team. If the product requires image preview or video playback, compression is not a good fit.
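
As a rough illustration of the idea, here is a minimal sketch that compresses a byte array with GZIP before transfer and restores it afterwards, using only the JDK's java.util.zip classes. The class name GzipSketch is hypothetical and not part of any file service described here, and Java 11 or later is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only.
final class GzipSketch {

    // Compress the raw bytes with GZIP before sending them over the network.
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed at this point, so the buffer is complete.
        return buffer.toByteArray();
    }

    // Restore the original bytes on the receiving side.
    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = "a line of highly repetitive text\n".repeat(10_000)
                .getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        System.out.println(raw.length + " bytes -> " + packed.length + " bytes");
    }
}
```

Keep in mind that formats such as JPEG or MP4 are already compressed, so the gain for pictures and videos is usually small; text-like content benefits the most.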

Asynchronous

For more time-consuming operations, we will naturally think of asynchronous execution to reduce the time for users to wait synchronously.

After receiving the file content, the server returns a request identifier and executes the processing logic asynchronously.

How to get the execution result?

There are generally two common approaches:

(1) Provide a result query interface

Relatively simple, but the client may poll before the result is ready, which produces wasted queries.

(2) Provide an asynchronous result callback

More work to implement, but the caller receives the result as soon as it is available.
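
Both approaches can be sketched in a few lines. The example below is only an in-memory illustration under stated assumptions: the class AsyncUploadSketch, its Status enum, and the store placeholder are all hypothetical names, and a real service would persist the status and call a remote file service instead.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Hypothetical in-memory sketch; all names are illustrative.
final class AsyncUploadSketch {
    enum Status { PROCESSING, SUCCESS, FAILED }

    private final Map<String, Status> results = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Returns a request identifier at once and processes the content in the background.
    // The optional callback (approach 2) fires as soon as processing ends.
    String submit(byte[] content, BiConsumer<String, Status> callback) {
        String requestId = UUID.randomUUID().toString();
        results.put(requestId, Status.PROCESSING);
        pool.submit(() -> {
            Status status;
            try {
                store(content);
                status = Status.SUCCESS;
            } catch (RuntimeException e) {
                status = Status.FAILED;
            }
            results.put(requestId, status);
            if (callback != null) {
                callback.accept(requestId, status);
            }
        });
        return requestId;
    }

    // Approach 1: result query interface. Returns null for an unknown identifier.
    Status query(String requestId) {
        return results.get(requestId);
    }

    // Placeholder for the slow work; here it only rejects empty content.
    private void store(byte[] content) {
        if (content.length == 0) {
            throw new IllegalArgumentException("empty file");
        }
    }

    // Stops accepting work and waits for tasks that were already submitted.
    void shutdownAndWait() {
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncUploadSketch service = new AsyncUploadSketch();
        String id = service.submit(new byte[] {1, 2, 3},
                (requestId, status) -> System.out.println("callback: " + requestId + " -> " + status));
        System.out.println("returned immediately with id " + id);
        service.shutdownAndWait();
        System.out.println("final status: " + service.query(id));
    }
}
```

Recording FAILED explicitly matters here: it is what lets the caller find out about a failed upload, whether by polling or by callback.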

Instant upload ("second pass")

You have probably used a cloud drive. Sometimes even a very large file finishes uploading to a cloud drive almost instantly.

How is this achieved?

Identical file content always produces the same hash value, and with a good hash function two different files are extremely unlikely to share one.

Before uploading, we can check whether the hash value already exists on the server. If it does, we can simply add a reference to the existing file and skip the file transfer step.

Of course, this only pays off when you store a large volume of user files and a meaningful share of them are duplicates.

The pseudo code is as follows:

public FileUploadResponse uploadByHash(final String fileName,
                                       final String fileBase64) {
    FileUploadResponse response = new FileUploadResponse();

    // Compute the content hash and ask the server whether the file already exists
    String fileHash = Md5Util.md5(fileBase64);
    FileInfoExistsResponse fileInfoExistsResponse = fileInfoExists(fileHash);
    if (!RespCodeConst.SUCCESS.equals(fileInfoExistsResponse.getRespCode())) {
        // The existence check itself failed, so pass the error back to the caller
        response.setRespCode(fileInfoExistsResponse.getRespCode());
        response.setRespMessage(fileInfoExistsResponse.getRespMessage());
        return response;
    }

    Boolean exists = fileInfoExistsResponse.getExists();
    FileUploadByHashRequest request = new FileUploadByHashRequest();
    request.setFileName(fileName);
    request.setFileHash(fileHash);
    // asyncFlag is not declared in this snippet; it is assumed to be a field of the enclosing client class
    request.setAsyncFlag(asyncFlag);
    // Only send the content when the server does not have the file yet
    if (!Boolean.TRUE.equals(exists)) {
        request.setFileBase64(fileBase64);
    }

    // Call the server
    return fillAndCallServer(request, "api/file/uploadByHash", FileUploadResponse.class);
}
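
The client code above relies on the server being able to answer "does this hash already exist?" and to add a reference without receiving the content again. A minimal server-side sketch of that bookkeeping might look like the following. Everything in it is an assumption for illustration: the class HashDedupStoreSketch and its methods are hypothetical, storage is in memory, and SHA-256 is used in place of the MD5 from the pseudo code because MD5 collisions can be constructed deliberately.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory store; a real service would use a database and object storage.
final class HashDedupStoreSketch {
    private final Map<String, byte[]> contentByHash = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> refCountByHash = new ConcurrentHashMap<>();
    private final Map<String, String> hashByFileName = new ConcurrentHashMap<>();

    // Hex-encoded SHA-256 of the content.
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    boolean exists(String hash) {
        return contentByHash.containsKey(hash);
    }

    // content may be null when the hash is already known (the instant upload case).
    // Returns true if bytes were stored, false if only a reference was added.
    boolean upload(String fileName, String hash, byte[] content) {
        boolean stored = false;
        if (!contentByHash.containsKey(hash)) {
            if (content == null) {
                throw new IllegalArgumentException("content is required for an unknown hash");
            }
            // Never trust the client's hash: recompute it before storing.
            if (!hash.equals(sha256Hex(content))) {
                throw new IllegalArgumentException("hash does not match content");
            }
            stored = contentByHash.putIfAbsent(hash, content) == null;
        }
        refCountByHash.computeIfAbsent(hash, h -> new AtomicInteger()).incrementAndGet();
        hashByFileName.put(fileName, hash);
        return stored;
    }

    int referenceCount(String hash) {
        AtomicInteger count = refCountByHash.get(hash);
        return count == null ? 0 : count.get();
    }

    public static void main(String[] args) {
        HashDedupStoreSketch store = new HashDedupStoreSketch();
        byte[] content = "same content".getBytes(StandardCharsets.UTF_8);
        String hash = sha256Hex(content);
        System.out.println("first upload stored bytes: " + store.upload("a.txt", hash, content));
        System.out.println("second upload stored bytes: " + store.upload("b.txt", hash, null));
        System.out.println("references: " + store.referenceCount(hash));
    }
}
```

The reference count is what makes deletion safe later: the stored bytes can only be removed once no file name points at the hash any more.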

Concurrency

Another way is to split a relatively large file.

For example, a 100MB file is cut into 10 segments, which are then uploaded concurrently. All segments of one file share a unique batch number.

When downloading, the segments are fetched concurrently by batch number and spliced back into the complete file.

The pseudo code is as follows:

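public FileUploadResponse concurrentUpload(final String fileName,
                                           final String fileBase64) throws InterruptedException {
    // Split the content into roughly 10 segments; guard against a segment length of zero
    int limitSize = Math.max(1, fileBase64.length() / 10);
    final List<String> segments = StringUtil.splitByLength(fileBase64, limitSize);

    // Upload the segments concurrently
    int size = segments.size();
    final ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
    final CountDownLatch lock = new CountDownLatch(size);

    for (int i = 0; i < size; i++) {
        final int index = i;
        Thread t = new Thread() {
            @Override
            public void run() {
                try {
                    // Upload segments.get(index) here and record its result,
                    // for example map.put(index, segmentId)
                } finally {
                    // Always count down, even if the upload throws, so await() cannot hang
                    lock.countDown();
                }
            }
        };
        t.start();
    }

    // Wait until every segment has finished
    lock.await();

    // Process the upload results collected in map (pseudo code: build the real response here)
    return new FileUploadResponse();
}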
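
For a complete round trip, here is a self-contained sketch that splits a byte array into segments, "uploads" them concurrently through a thread pool, and splices them back together by batch number. It is only an illustration: the class SegmentedUploadSketch is hypothetical, the in-memory map stands in for the remote file service, and the download reads the segments sequentially for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; the map below stands in for the remote file service.
final class SegmentedUploadSketch {
    // batch number -> (segment index -> segment bytes)
    private final Map<String, Map<Integer, byte[]>> storage = new ConcurrentHashMap<>();

    // Splits the content, stores all segments concurrently, and returns the batch number.
    String upload(byte[] content, int segmentSize, int threads) {
        if (segmentSize <= 0) {
            throw new IllegalArgumentException("segmentSize must be positive");
        }
        String batchNo = UUID.randomUUID().toString();
        // A sorted map keeps the segments in index order for reassembly.
        Map<Integer, byte[]> segments = new ConcurrentSkipListMap<>();
        storage.put(batchNo, segments);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            int index = 0;
            for (int offset = 0; offset < content.length; offset += segmentSize) {
                final int from = offset;
                final int to = Math.min(content.length, offset + segmentSize);
                final int segmentIndex = index++;
                pending.add(pool.submit(() -> {
                    segments.put(segmentIndex, Arrays.copyOfRange(content, from, to));
                }));
            }
            // Wait for every segment; a failed segment surfaces as an exception here.
            for (Future<?> future : pending) {
                future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return batchNo;
    }

    int segmentCount(String batchNo) {
        Map<Integer, byte[]> segments = storage.get(batchNo);
        return segments == null ? 0 : segments.size();
    }

    // Splices the segments of one batch back together in index order.
    byte[] download(String batchNo) {
        Map<Integer, byte[]> segments = storage.get(batchNo);
        if (segments == null) {
            throw new IllegalArgumentException("unknown batch number: " + batchNo);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] segment : segments.values()) {
            out.write(segment, 0, segment.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        SegmentedUploadSketch service = new SegmentedUploadSketch();
        byte[] content = new byte[1000];
        String batchNo = service.upload(content, 128, 4);
        System.out.println(service.segmentCount(batchNo) + " segments, "
                + service.download(batchNo).length + " bytes restored");
    }
}
```

Using a thread pool instead of one thread per segment keeps the number of concurrent connections bounded, which matters once files are split into many segments.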

Direct connection

Of course, there is another strategy: the client uploads directly to the file service and skips the back-end service.

[Figure: client connecting directly to the file service]

This requires the file service to provide an HTTP file upload interface.

Security issues also need to be considered. It is best for the front-end to call the back-end, obtain an authorization token, and then carry the token for file upload.
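
One possible shape for such a token is sketched below: the back end signs a user ID and an expiry time with a secret it shares with the file service, and the file service checks the signature and expiry before accepting the upload. This is an assumption for illustration, not the scheme of any particular file service; the class UploadTokenSketch and the token layout are hypothetical, and only the JDK's HMAC-SHA256 support is used.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical token helper; the secret is assumed to be shared by back end and file service.
final class UploadTokenSketch {
    private final byte[] secret;

    UploadTokenSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    // Called by the back end: token = "userId:expiry" + "." + signature.
    String issue(String userId, long expiresAtEpochSeconds) {
        String payload = userId + ":" + expiresAtEpochSeconds;
        return payload + "." + sign(payload);
    }

    // Called by the file service before it accepts an upload.
    boolean verify(String token, long nowEpochSeconds) {
        int dot = token.lastIndexOf('.');
        if (dot < 0) {
            return false;
        }
        String payload = token.substring(0, dot);
        byte[] expected = sign(payload).getBytes(StandardCharsets.UTF_8);
        byte[] actual = token.substring(dot + 1).getBytes(StandardCharsets.UTF_8);
        // MessageDigest.isEqual compares without an early exit on the first mismatch.
        if (!MessageDigest.isEqual(expected, actual)) {
            return false;
        }
        try {
            long expiresAt = Long.parseLong(payload.substring(payload.lastIndexOf(':') + 1));
            return nowEpochSeconds < expiresAt;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // URL-safe Base64 of HMAC-SHA256(secret, payload).
    private String sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] signature = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        UploadTokenSketch tokens = new UploadTokenSketch("demo-secret".getBytes(StandardCharsets.UTF_8));
        long now = System.currentTimeMillis() / 1000;
        String token = tokens.issue("user-42", now + 300);
        System.out.println("valid now: " + tokens.verify(token, now));
        System.out.println("valid after expiry: " + tokens.verify(token, now + 301));
    }
}
```

A short expiry keeps a leaked token from being useful for long, and because the file service can verify the signature on its own, the back end stays out of the upload path entirely.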

Summary

File upload is a very common business requirement, and upload performance is a problem that must be considered and optimized.

The methods above can be combined flexibly; choose the mix that fits your own business.

I hope this article is helpful to you. If you like it, please like, bookmark, and share it.

I'm Lao Ma, and I look forward to seeing you again next time.


老马啸西风