I'm creating a download link to an item that is stored on AWS / S3.

As I build this link, I've confirmed that the data is encoded in UTF-8, but when users try to download the file they get this error whenever the link contains non-ASCII characters:

<Error>
<Code>InvalidArgument</Code>
<Message>Header value cannot be represented using ISO-8859-1.</Message>
<ArgumentName>response-content-disposition</ArgumentName>
<ArgumentValue>attachment;filename="今日も僕は 用もないのに.mp3"</ArgumentValue>
<RequestId>12345</RequestId>
<HostId>12345</HostId>
</Error>
//first attempt - whatever encoding it IS, convert it to utf-8
$encoded_file = mb_convert_encoding($original_filename, "UTF-8", mb_detect_encoding($original_filename));

//second attempt - force filename to use html entities
$encoded_file = mb_convert_encoding($original_filename,'HTML-ENTITIES','UTF-8');

$obj_data['ResponseContentDisposition'] = 'attachment;filename="' . $encoded_file . '"';
$cmd = $s3->getCommand('GetObject', $obj_data);
$presign_url_request = $s3->createPresignedRequest($cmd, AWS_PRESIGNED_URL_EXPIRATION);

Forcing the attachment;filename to use htmlentities works - but it's really ugly. If I am converting the filename into UTF-8, why am I getting this error from AWS that the header value cannot use ISO-8859-1?

asked Nov 20, 2024 at 21:38 by Brian Powell

2 Answers

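HTTP header values are traditionally restricted to ISO-8859-1 (and in practice should be plain ASCII), so a string containing any other characters must be encoded according to the relevant spec.

In this case that is RFC 6266, which defines Content-Disposition and its filename* parameter; the value of filename* uses the percent-encoded charset'language'value form from RFC 8187 (formerly RFC 5987).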
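
To see why S3 rejects the value even though it is valid UTF-8, it can help to check whether every character fits in ISO-8859-1, that is, has a code point between U+0000 and U+00FF. The helper below is a hypothetical illustration (fits_iso_8859_1() is not part of PHP or the AWS SDK) and treats invalid UTF-8 as "does not fit":

```php
<?php
// Hypothetical helper: true when every character of a UTF-8 string has a
// code point in U+0000..U+00FF, i.e. could be represented in ISO-8859-1.
function fits_iso_8859_1(string $utf8): bool
{
    // The /u modifier makes PCRE read the subject as UTF-8 code points.
    // preg_match() returns false for malformed UTF-8, which also yields false here.
    return preg_match('/^[\x{00}-\x{FF}]*$/u', $utf8) === 1;
}

var_dump(fits_iso_8859_1('song.mp3'));                       // bool(true)
var_dump(fits_iso_8859_1('café.mp3'));                       // bool(true)
var_dump(fits_iso_8859_1('今日も僕は 用もないのに.mp3'));    // bool(false)
```

Converting the filename to UTF-8 does not change which characters it contains, so the Japanese name still cannot be expressed in ISO-8859-1 and has to be percent-encoded instead.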

function rfc6266_encode($string, $encoding) {
    // ext-value syntax (RFC 8187): charset'language'percent-encoded-bytes.
    // The language tag is optional and left empty here.
    // rawurlencode() escapes every byte except A-Z a-z 0-9 - _ . ~,
    // so spaces and all non-ASCII bytes are percent-encoded.
    return sprintf("%s''%s", $encoding, rawurlencode($string));
}

var_dump(rfc6266_encode('今日も僕は 用もないのに.mp3', 'UTF-8'));

Output:

string(113) "UTF-8''%E4%BB%8A%E6%97%A5%E3%82%82%E5%83%95%E3%81%AF%20%E7%94%A8%E3%82%82%E3%81%AA%E3%81%84%E3%81%AE%E3%81%AB.mp3"

And you would use it in your code like this (note filename*= and no surrounding quotes):

$obj_data['ResponseContentDisposition'] = 'attachment; filename*=' . rfc6266_encode($original_filename, $original_filename_encoding);
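
RFC 6266 also suggests sending a plain filename parameter alongside filename*, so that clients without filename* support still get something usable. A minimal sketch, assuming the filename is already valid UTF-8; the helper name content_disposition_attachment() and the underscore fallback are illustrative choices, not an established API:

```php
<?php
// Hypothetical helper: builds an ASCII-only Content-Disposition value with
// both a plain fallback filename and an RFC 8187 encoded filename*.
// Assumes $utf8_filename is valid UTF-8.
function content_disposition_attachment(string $utf8_filename): string
{
    // Fallback: replace each character outside printable ASCII, plus the two
    // characters that would need escaping in a quoted-string (" and \),
    // with an underscore.
    $fallback = preg_replace('/[^\x20-\x7E]|["\\\\]/u', '_', $utf8_filename);

    // rawurlencode() leaves only A-Z a-z 0-9 - _ . ~ unescaped.
    $extended = "UTF-8''" . rawurlencode($utf8_filename);

    return sprintf('attachment; filename="%s"; filename*=%s', $fallback, $extended);
}

echo content_disposition_attachment('今日も僕は 用もないのに.mp3'), "\n";
```

The returned string contains only printable ASCII, so it could be assigned to $obj_data['ResponseContentDisposition'] before calling getCommand(), as in the question.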

That said, do not rely on mb_detect_encoding(), as it only guesses what the encoding might be. String encoding is metadata that must be captured alongside the data itself and preserved.
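
One way to act on this, sketched under the assumption that filenames are meant to be stored as UTF-8, is to validate rather than detect: mb_check_encoding() (from the mbstring extension) answers a yes/no question about one specific encoding instead of guessing. The function name require_utf8() is hypothetical:

```php
<?php
// Hypothetical guard: accept a filename only if it is well-formed UTF-8,
// instead of guessing its encoding with mb_detect_encoding().
function require_utf8(string $filename): string
{
    if (!mb_check_encoding($filename, 'UTF-8')) {
        throw new InvalidArgumentException('Filename is not valid UTF-8');
    }
    return $filename;
}

// Valid UTF-8 is returned unchanged.
$ok = require_utf8('今日も僕は 用もないのに.mp3');

try {
    // "\xE9" is "é" in ISO-8859-1 but an incomplete sequence in UTF-8.
    require_utf8("caf\xE9.mp3");
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```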

Unfortunately, HTTP message headers are more restricted than the HTTP message body.

UTF-8 is supported in the message body but not in headers (for historical and technical reasons). PHP's urlencode() function is worth a try for the header value, but I'm not sure it will improve things.
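
For what it's worth, PHP's two URL-encoding functions differ in one way that matters here: urlencode() turns a space into +, while rawurlencode() emits %20. In a filename* value a + would be taken literally, so rawurlencode() is probably the safer choice. A quick comparison, using a made-up filename:

```php
<?php
$name = 'my song ü.mp3';

// urlencode() uses form encoding: a space becomes "+".
echo urlencode($name), "\n";     // my+song+%C3%BC.mp3

// rawurlencode() follows RFC 3986: a space becomes "%20".
echo rawurlencode($name), "\n";  // my%20song%20%C3%BC.mp3
```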

Allowed characters in HTTP header values https://stackoverflow/a/75998796/8199678
