I'm downloading a ~50MB file in 5 MB chunks using XMLHttpRequest and the Range header. Things work great, except for detecting when I've downloaded the last chunk.

Here's a screenshot of the request and response for the first chunk. Notice the Content-Length is 1024 * 1024 * 5 (5 MB). Also notice that the server responds correctly with the first 5 MB, and in the Content-Range header, properly specifies the size of the entire file (after the /):

When I copy the response body into a text editor (Sublime), I only get 5,242,736 characters instead of the expected 5,242,880 as indicated by Content-Length:

Why are 144 characters missing? This is true of every chunk that gets downloaded, though the exact difference varies a little bit.

However, what's especially strange is the last chunk. The server responds with the last ~2.9 MB of the file (instead of a whole 5 MB) and apparently properly indicates this in the response:

Notice that I am requesting the next 5 MB (even though it goes beyond the total file size). No biggie, the server responds with the last part of the file and the headers indicate the actual byte range returned.

But does it really?

When I call xhr.getResponseHeader("Content-Length") from JavaScript, I see a different story in Chrome:

The XMLHttpRequest object is telling me that another 5 MB was downloaded, beyond the end of the file. Is there something I don't understand about the xhr object?

What's even weirder is that it works in Firefox 30 as expected:

So between the xhr.responseText.length not matching the Content-Length and these headers not agreeing between the xhr object and the Network tools, I don't know what to do to fix this.

What's causing these discrepancies?

Update: I have confirmed that the server itself is properly sending the response, despite the overshot Range header in the request for the last chunk. This is the output from the raw HTTP request, thanks to good ol' telnet:

HTTP/1.1 206 Partial Content
Server: nginx/1.4.5
Date: Mon, 14 Jul 2014 21:50:06 GMT
Content-Type: application/octet-stream
Content-Length: 2987360
Last-Modified: Sun, 13 Jul 2014 22:05:10 GMT
Connection: keep-alive
ETag: "53c30296-2fd9560"
Content-Range: bytes 47185920-50173279/50173280

So it looks like Chrome is malfunctioning. Should this be filed as a bug? Where?

Asked Jul 14, 2014 at 20:12 by Matt, edited Jul 14, 2014 at 21:53
  • xhr.responseText.length is the # of chars in your response, not the # of bytes indicated in the Content-Length header. some unicode chars (or binary bits coerced into unicode) use more than one byte per char. chrome might 2nd-guess invalid range headers (like ones that overlap the file end), as may firefox, but only one approach (ff) seems to be working for your case. fix the REQUEST headers and try again. – dandavis Commented Jul 14, 2014 at 20:34
  • Thanks @dandavis. See my update. I ran the request in telnet directly and the raw output from the server is as expected, meaning (I think?) that Chrome must be malfunctioning when making the XMLHttpRequest or something... – Matt Commented Jul 14, 2014 at 21:54
  • i'm suggesting chrome might be doing something special, not shown here, by internally tying the request to the response. it appears the output from the server is fine, but chrome might also consider the input from the request (specifically range 0-52/50), which told it to expect more. sometimes being smart is dumb when you're a browser. – dandavis Commented Jul 14, 2014 at 22:04
  • @dandavis I think I see what you're saying, that Chrome might be assuming something it shouldn't. But... the input/request doesn't include the total file size though: only the response's Content-Range has that. – Matt Commented Jul 14, 2014 at 23:07
  • all i'm saying is that the Range request header (the 0-5242879 one) you show in your first screenshot might be fooling chrome. why you reply every time about the response i don't know, but i would try changing the request if i were you... – dandavis Commented Jul 15, 2014 at 15:16

1 Answer

The main issue is that you are reading binary data as text. Note that the server responds with Content-Type: application/octet-stream which doesn't specify the encoding explicitly - in that case the browser will typically assume that the data is encoded in UTF-8. While the length will mostly be unchanged (bytes with values 0 to 127 are interpreted as a single character in UTF-8 and bytes with higher values will usually be replaced by the replacement character �), your binary file will certainly contain a few valid multi-byte UTF-8 sequences - and these will be combined into one character. That explains why responseText.length doesn't match the number of bytes received from the server.
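To make the byte/character mismatch concrete, here is a minimal sketch that decodes a few hand-picked bytes as UTF-8. It uses the standard TextDecoder API rather than XMLHttpRequest itself, on the assumption that it decodes the same way responseText does when no charset is given; the sample bytes are made up for illustration.

```javascript
// Four raw bytes: "A", a valid two-byte UTF-8 sequence (0xC3 0xA9 = "é"),
// and 0xFF, which can never appear in valid UTF-8.
var bytes = new Uint8Array([0x41, 0xC3, 0xA9, 0xFF]);

// Decode as UTF-8. Invalid bytes become U+FFFD instead of throwing.
var text = new TextDecoder("utf-8").decode(bytes);

var byteCount = bytes.length; // what Content-Length counts
var charCount = text.length;  // what responseText.length counts

console.log(byteCount + " bytes decoded into " + charCount + " characters");
```

The two-byte sequence collapses into a single character, so the text is one unit shorter than the byte count. A few such sequences per 5 MB chunk would be enough to account for a difference like the 144 missing characters.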

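Now you could of course force a specific encoding using the request.overrideMimeType() method. ISO 8859-1 would make sense in particular because the first 256 Unicode code points are identical with ISO 8859-1 (though browsers that follow the Encoding Standard decode this label as windows-1252, which differs for bytes 0x80 to 0x9F while still mapping each byte to exactly one character):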

request.overrideMimeType("application/octet-stream; charset=iso-8859-1");

That should make sure that one byte will always be interpreted as one character. Still, a better approach would be storing the server response in an ArrayBuffer which is explicitly meant to deal with binary data.

var request = new XMLHttpRequest();
request.open(...);
request.responseType = "arraybuffer";
request.send();

...

var array = new Uint8Array(request.response);
alert("First byte has value " + array[0]);
alert("Array length is " + array.length);

According to MDN, responseType = "arraybuffer" is supported starting with Chrome 10, Firefox 6 and Internet Explorer 10. See also: Typed arrays.
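Building on the ArrayBuffer approach, the following is a rough sketch of a chunked download that counts the bytes actually received instead of trusting the headers, and that never asks for a range past the end of the file. The names nextRange and downloadInChunks are hypothetical helpers invented for this example, and totalSize is assumed to be known already (for instance from the Content-Range of an earlier response).

```javascript
// Hypothetical helper, not part of any library: given how many bytes have
// been received so far, return the next inclusive byte range to request,
// or null once the whole file has been fetched.
function nextRange(offset, chunkSize, totalSize) {
  if (offset >= totalSize) {
    return null;
  }
  // Clamp the end so the Range header never overshoots the file size.
  var end = Math.min(offset + chunkSize, totalSize) - 1;
  return { start: offset, end: end, header: "bytes=" + offset + "-" + end };
}

// Browser-only sketch of the download loop. `url` and `onDone` are
// placeholders supplied by the caller; onDone receives an array of
// ArrayBuffers on success or null on failure.
function downloadInChunks(url, chunkSize, totalSize, onDone) {
  var parts = [];
  var offset = 0;

  function requestNext() {
    var range = nextRange(offset, chunkSize, totalSize);
    if (range === null) {
      onDone(parts);
      return;
    }
    var request = new XMLHttpRequest();
    request.open("GET", url);
    request.setRequestHeader("Range", range.header);
    request.responseType = "arraybuffer";
    request.onload = function () {
      var received = request.response ? request.response.byteLength : 0;
      if (request.status !== 206 || received === 0) {
        onDone(null); // unexpected status or empty body: give up
        return;
      }
      parts.push(request.response);
      offset += received; // count real bytes, not header values
      requestNext();
    };
    request.onerror = function () {
      onDone(null);
    };
    request.send();
  }

  requestNext();
}
```

With the sizes from the question, nextRange(47185920, 5242880, 50173280) yields the header bytes=47185920-50173279, which is the range the server reported for the last chunk.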

Side-note: Firefox also supports responseType = "moz-chunked-text" and responseType = "moz-chunked-arraybuffer" starting with Firefox 9 which allow receiving data in chunks without resorting to ranged requests. It seems that Chrome doesn't plan to implement it, instead they are working on implementing the Streams API.

Edit: I was unable to reproduce your issue with Chrome lying to you about the response headers, at least not without your code. However, the code responsible should be this function in partial_data:

// We are making multiple requests to complete the range requested by the user.
// Just assume that everything is fine and say that we are returning what was
// requested.
void PartialData::FixResponseHeaders(HttpResponseHeaders* headers,
                                     bool success) {
  if (truncated_)
    return;

  if (byte_range_.IsValid() && success) {
    headers->UpdateWithNewRange(byte_range_, resource_size_, !sparse_entry_);
    return;
  }

This code will remove the Content-Length and Content-Range headers returned by the server and replace them with ones generated from your request parameters. Given that I cannot reproduce the issue myself, the following are only guesses:

  • This code path seems to be used only for requests that can be satisfied from cache, so I guess that things will work correctly if you clear your cache.
  • The resource_size_ variable must have a wrong value in your case, larger than the actual size of the requested file. This variable is determined from the Content-Range header in the first chunk requested, so maybe you have a cached server response there which indicates a larger file.
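One way to check whether rewritten headers are in play is to compare what Content-Range claims with the number of bytes that actually arrived. The sketch below is only an illustration: parseContentRange and headerMatchesBody are hypothetical helper names, and it assumes the response was read with responseType = "arraybuffer" so that byteLength is reliable.

```javascript
// Hypothetical helper: parse a Content-Range value such as
// "bytes 47185920-50173279/50173280". Returns null for anything else,
// including the "bytes */N" form sent with 416 responses.
function parseContentRange(value) {
  var match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(value || "");
  if (!match) {
    return null;
  }
  var start = Number(match[1]);
  var end = Number(match[2]);
  var total = match[3] === "*" ? null : Number(match[3]);
  return {
    start: start,
    end: end,
    total: total,
    length: end - start + 1, // number of bytes the header claims
    isLast: total !== null && end + 1 >= total
  };
}

// True when the header's claimed length agrees with the bytes received.
function headerMatchesBody(contentRangeValue, byteLength) {
  var parsed = parseContentRange(contentRangeValue);
  return parsed !== null && parsed.length === byteLength;
}
```

In the browser this could be called as headerMatchesBody(request.getResponseHeader("Content-Range"), request.response.byteLength). A false result for the last chunk would be consistent with the headers having been regenerated from the request rather than taken from the server.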

Tags: javascript · XMLHttpRequest and Chrome developer tools don't say the same thing · Stack Overflow