A large-file upload server with support for HTTP resumable upload



Recently, a product built by the author's R&D group needed high-performance HTTP upload of large files, including HTTP resumable upload. Here is a brief summary of the requirements for future reference:

  1. The server side is implemented in C rather than in a higher-level language such as Java or PHP;
  2. The server writes incoming data to the hard disk as it arrives, so there is no need to buffer the upload first and then call move_uploaded_file or read it through an InputStreamReader; this avoids high server memory usage and browser request timeouts;
  3. Both HTML5 and IFRAME (for old browsers) uploads are supported, and the upload progress can be queried.

To better suit today's mobile Internet, the upload service must support resumable upload and reconnection after a dropped connection. Mobile networks are not very stable, and the chance of a connection dropping abnormally while a large file is being uploaded is high. Resumable upload is therefore essential to avoid starting over.

The idea behind resumable upload is:

The client (usually a browser) uploads a file to the server, and the progress of the upload is recorded continuously. If the connection drops or some other error occurs, the client can ask the server how much of the file has already been uploaded, and then continue uploading from that position.

Some experts on the Internet upload large files in fragments instead. The file is cut into small pieces, for example 4 MB each; the server saves each piece it receives as a temporary file and, once all the pieces have arrived, merges them. The author believes this works if the original file is small enough, but once a file reaches hundreds of megabytes, or several to dozens of gigabytes, merging takes a long time and often leads to browser response timeouts or a blocked server.
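
For comparison, the fragment approach described above starts by splitting the file into fixed-size byte ranges. The following is a minimal sketch of that splitting step only; `chunkRanges` is a hypothetical helper, not part of any upload library, and the 4 MB size is simply the example figure mentioned above.

```javascript
// Hypothetical helper: split a file of fileSize bytes into [start, end) ranges
// of at most chunkSize bytes each, suitable for file.slice(start, end).
function chunkRanges(fileSize, chunkSize) {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  var ranges = [];
  for (var start = 0; start < fileSize; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize, fileSize)]);
  }
  return ranges;
}

// A 10 MB file cut into 4 MB fragments yields three ranges (4 MB, 4 MB, 2 MB).
var MB = 1024 * 1024;
console.log(chunkRanges(10 * MB, 4 * MB).length); // 3
```

Every range becomes a separate request and a separate temporary file on the server, which is where the merge cost discussed above comes from.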

If you write a standalone client (or a browser ActiveX plug-in) to upload files, supporting resumable upload is simple: you only need to record the upload status on the client. Supporting resumable upload in the browser itself (without installing third-party plug-ins) is somewhat harder than with a standalone client, but it is still not difficult. My approach is as follows:

1. When the browser uploads a file, it first generates a HASH value for the file; this value must be generated on the browser side.

Upload records cannot be looked up by file name alone, because file names collide very often. Combining the file name with the file size reduces collisions, adding the file's modification time reduces them further, and adding an ID for the browser reduces them further still. The ideal HASH value would be an MD5 of the file content, but that is expensive to compute (and in fact unnecessary here), and the time it takes would hurt the upload experience.

For these reasons, my HASH value is calculated as follows:

  1. First give the browser an ID, which is stored in a cookie;
  2. Concatenate browser ID + file modification time + file name + file size, and take the MD5 of the result as the file's HASH value;
  3. The browser ID is assigned automatically when the browser visits the file upload site.
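
function setCookie(cname, cvalue, exdays) {
  var d = new Date();
  // Expire the cookie exdays days from now.
  d.setTime(d.getTime() + exdays * 24 * 60 * 60 * 1000);
  var expires = "expires=" + d.toUTCString();
  document.cookie = cname + "=" + cvalue + "; " + expires;
}

function getCookie(cname) {
  var name = cname + "=";
  var ca = document.cookie.split(';');
  for (var i = 0; i < ca.length; i++) {
    var c = ca[i].trim();
    if (c.indexOf(name) == 0) return c.substring(name.length, c.length);
  }
  return "";
}

// md5() is assumed to come from a separate MD5 library loaded on the page.
function getFileId(file) {
  var clientid = getCookie("HUAYIUPLOAD");
  if (clientid == "") {
    // First visit: generate a browser ID and remember it in the cookie.
    var rand = parseInt(Math.random() * 1000);
    var t = (new Date()).getTime();
    clientid = rand + 'T' + t;
    setCookie("HUAYIUPLOAD", clientid, 365); // the 365-day lifetime is an assumed value
  }
  var info = clientid;
  if (file.lastModified)
    info += file.lastModified;
  if (file.name)
    info += file.name;
  if (file.size)
    info += file.size;
  var fileid = md5(info);
  return fileid;
}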

The author's view: there is no need to compute the HASH value from the file content, because that would be very slow. You would only have to do so to implement instant upload ("upload in seconds") over HTTP: if different people upload files with identical content, the server can skip the repeated upload and return the result directly.

Assigning an ID to each browser further avoids HASH collisions between files that have the same name and the same size on different computers.
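
One detail to watch when concatenating these fields: without a separator, different field values can produce the same string, for example name "a1" with size 23 and name "a12" with size 3. The sketch below joins the fields with a delimiter before hashing. `fingerprintSource` is a hypothetical helper and the "|" separator is an assumption, not part of the author's scheme; the resulting string would still be passed to md5() as in getFileId.

```javascript
// Hypothetical helper: build the string that gets hashed into the file ID.
// Fields are joined with "|" (an assumed separator) so that field boundaries
// are less likely to blur into each other.
function fingerprintSource(clientId, file) {
  return [clientId, file.lastModified || 0, file.name || "", file.size || 0].join("|");
}

var a = { name: "a1", size: 23, lastModified: 1700000000000 };
var b = { name: "a12", size: 3, lastModified: 1700000000000 };

// Plain concatenation collides: "a1" + 23 and "a12" + 3 are both "a123".
console.log(a.name + a.size === b.name + b.size); // true
// With a separator the two source strings differ.
console.log(fingerprintSource("7T1", a) === fingerprintSource("7T1", b)); // false
```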

2. Query the upload progress by the file's HASH value

Before uploading, first use the file's HASH value to query the upload server for the file's upload progress, and then start uploading from that position. The code is as follows:

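var fileObj = currentfile;
var fileid = getFileId(fileObj);
var t = (new Date()).getTime();
// The timestamp makes each query URL unique (a common cache-busting trick).
var url = resume_info_url + '?fileid=' + fileid + '&t=' + t;
var ajax = new XMLHttpRequest();
ajax.onreadystatechange = function () {
    if (this.readyState == 4) {
        if (this.status == 200) {
            var response = this.responseText;
            var result = JSON.parse(response);
            if (!result || !result.file) {
                // No usable progress record: report an error or start from 0.
                return;
            }
            var uploadedBytes = result.file.size;
            if (!result.file.finished && uploadedBytes < fileObj.size) {
                // Assumed handling: resume from the offset the server already has.
                upload_file(fileObj, uploadedBytes, fileid);
            } else {
                // The file has already been uploaded completely.
                //var progressBar = document.getElementById('progressbar');
                //progressBar.value = 100;
            }
        } else {
            // The progress query failed; report the error here.
        }
    }
};
ajax.open('GET', url, true);
ajax.send(null);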

The above is implemented with the jQuery-file-upload component. For an implementation in plain JavaScript, please refer to the h4resume.html sample code in the demos directory.

3. Perform the upload

After querying the file's resume information, if the file has indeed been partly uploaded before, the server returns the number of bytes already uploaded, and we can continue uploading from that offset.
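
The decision of where to resume can be isolated in a small function. The sketch below assumes the response shape used in the query code above (`result.file.size` and `result.file.finished`); `resumeOffset` itself is a hypothetical helper, and treating a missing record as "start from byte 0" is an assumption about how the server reports unknown files.

```javascript
// Hypothetical helper: decide where to resume, given the server's progress record.
// Returns the byte offset to continue from, or -1 if nothing is left to upload.
function resumeOffset(result, fileSize) {
  if (!result || !result.file) return 0; // assumed: no record means a fresh upload
  var uploaded = Number(result.file.size) || 0;
  if (result.file.finished || uploaded >= fileSize) return -1;
  return uploaded;
}

console.log(resumeOffset({ file: { size: 4096, finished: false } }, 10000)); // 4096
console.log(resumeOffset({ file: { size: 10000, finished: true } }, 10000)); // -1
console.log(resumeOffset(null, 10000));                                      // 0
```

A caller would then run upload_file(fileObj, offset, fileid) whenever the returned offset is not -1.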

The slice() method of the HTML5 File object can be used to cut a fragment out of the file for upload.

Definition and usage

The slice() method extracts part of a file (a File is a Blob) and returns the extracted part as a new Blob. The original file is not modified.

Parameter description

start  The byte offset at which the extracted segment begins. If it is negative, the position is counted from the end of the file: -1 refers to the last byte, -2 to the second-to-last byte, and so on.

end  The byte offset immediately after the end of the extracted segment. If this parameter is omitted, the segment runs from start to the end of the file. If it is negative, the position is counted from the end of the file.
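
A short illustration of slice() on a Blob, using a Blob built in memory in place of a real File (File inherits slice() from Blob). slice(start, end) returns a new Blob covering the bytes from start up to, but not including, end, and negative values count from the end. The example assumes a runtime with a global Blob, such as a current browser or Node.js 18 or later.

```javascript
// A 10-byte Blob stands in for a file selected by the user.
var blob = new Blob(["0123456789"]);

var rest = blob.slice(4);      // bytes 4..9: everything after a 4-byte resume offset
var middle = blob.slice(2, 5); // bytes 2, 3, 4 (end is exclusive)
var tail = blob.slice(-3);     // the last 3 bytes

console.log(rest.size, middle.size, tail.size); // 6 3 3
console.log(blob.size);                         // 10, the original is unchanged
```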

The code that uploads the remaining part of the file is as follows:

// fileObj: the HTML5 File object
// start_offset: starting position of the data to upload, relative to the beginning of the file
// fileid: the file's ID, obtained from the getFileId function above
function upload_file(fileObj, start_offset, fileid) {
  var xhr = new XMLHttpRequest();
  var formData = new FormData();

  if (start_offset >= fileObj.size) {
    return false;
  }

  var bitrateDiv = document.getElementById("bitrate");
  var finishDiv = document.getElementById("finish");
  var progressBar = document.getElementById('progressbar');
  var progressDiv = document.getElementById('percent-label');

  var oldTimestamp = 0;
  var oldLoadsize = 0;
  var totalFilesize = fileObj.size;
  if (totalFilesize == 0) return;

  var uploadProgress = function (evt) {
    if (evt.lengthComputable) {
      var uploadedSize = evt.loaded + start_offset;
      var percentComplete = Math.round(uploadedSize * 100 / totalFilesize);
      var timestamp = (new Date()).valueOf();
      var isFinish = evt.loaded == evt.total;
      if (timestamp > oldTimestamp || isFinish) {
        var duration = timestamp - oldTimestamp;
        // Refresh the display at most every 500 ms, and once more at the end.
        if (duration > 500 || isFinish) {
          var size = evt.loaded - oldLoadsize;
          var bitrate = (size * 8 / duration / 1024) * 1000; // kbps
          if (bitrate > 1000)
            bitrate = Math.round(bitrate / 1000) + 'Mbps';
          else
            bitrate = Math.round(bitrate) + 'Kbps';
          var finish = evt.loaded + start_offset;
          if (finish > 1048576)
            finish = (Math.round(finish / (1048576 / 100)) / 100).toString() + 'MB';
          else
            finish = (Math.round(finish / (1024 / 100)) / 100).toString() + 'KB';
          progressBar.value = percentComplete;
          progressDiv.innerHTML = percentComplete.toString() + '%';
          bitrateDiv.innerHTML = bitrate;
          finishDiv.innerHTML = finish;
          oldTimestamp = timestamp;
          oldLoadsize = evt.loaded;
        }
      }
    } else {
      progressDiv.innerHTML = 'N/A';
    }
  };

  xhr.onreadystatechange = function () {
    if (xhr.readyState == 4 && xhr.status == 200) {
      console.log(xhr.responseText);
    } else if (xhr.status == 400) {
      // The server rejected the request; handle the error here.
    }
  };

  var uploadComplete = function (evt) {
    progressDiv.innerHTML = '100%';
    var result = JSON.parse(evt.target.responseText);
    if (result.result == 'success') {
      // The upload succeeded.
    } else {
      // The server reported a failure.
    }
  };

  var uploadFailed = function (evt) {
    // Network error; the upload can be resumed later.
  };

  var uploadCanceled = function (evt) {
    // The upload was aborted.
  };

  //xhr.timeout = 20000;
  //xhr.ontimeout = function(event){
  //  alert('The upload took too long; the server did not respond within the allotted time!');
  //};

  // Send only the part of the file that the server does not have yet.
  var filesize = fileObj.size;
  var blob = fileObj.slice(start_offset, filesize);
  var fileOfBlob = new File([blob], fileObj.name);
  formData.append('fileid', fileid);
  formData.append('file', fileOfBlob);

  xhr.upload.addEventListener("progress", uploadProgress, false);
  xhr.addEventListener("load", uploadComplete, false);
  xhr.addEventListener("error", uploadFailed, false);
  xhr.addEventListener("abort", uploadCanceled, false);
  xhr.open('POST', upload_file_url);
  xhr.send(formData);
}

To verify resumable upload, the author built a simple page that displays status information while a file is being uploaded. The interface is as follows:


With HTML5 you can compute the upload progress, the amount of data already uploaded, the upload bit rate, and other information. If anything goes wrong during the upload, you can simply upload again; the part that has already been uploaded does not need to be sent again.
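
The arithmetic behind those figures is simple. The sketch below recomputes them as pure functions; the names `percentDone`, `formatBitrate`, and `formatSize` are hypothetical, and unlike the upload code above the bit rate uses decimal units (1 Kbps = 1000 bit/s), which is an assumption of this sketch.

```javascript
// Percentage of the whole file uploaded, counting bytes sent before the resume point.
function percentDone(loaded, startOffset, totalSize) {
  return Math.round((loaded + startOffset) * 100 / totalSize);
}

// Bit rate for `bytes` transferred in `ms` milliseconds (decimal units, assumed).
function formatBitrate(bytes, ms) {
  if (ms <= 0) return "N/A";
  var bps = bytes * 8 * 1000 / ms;
  return bps >= 1e6 ? Math.round(bps / 1e6) + " Mbps" : Math.round(bps / 1e3) + " Kbps";
}

// Human-readable size with up to two decimals, using 1 MB = 1048576 bytes.
function formatSize(bytes) {
  return bytes > 1048576
    ? (Math.round(bytes / 1048576 * 100) / 100) + " MB"
    : (Math.round(bytes / 1024 * 100) / 100) + " KB";
}

console.log(percentDone(250, 500, 1000));  // 75
console.log(formatBitrate(1000000, 1000)); // 8 Mbps
console.log(formatBitrate(50000, 1000));   // 400 Kbps
console.log(formatSize(1572864));          // 1.5 MB
console.log(formatSize(2048));             // 2 KB
```

Adding the resume offset to the loaded byte count keeps the progress bar from jumping back to 0% after a reconnect.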

To try out HTML5 resumable upload, you can download this file upload server from GitHub for testing.