Extracting gzip data in Javascript with Pako - encoding issues
I am trying to run what I expect is a very common use case:
I need to download a gzip file (of complex JSON datasets) from Amazon S3 and decompress (gunzip) it in Javascript. I have everything working correctly except the final 'inflate' step.
I am using Amazon API Gateway, and have confirmed that the Gateway is properly transferring the compressed file (used curl and 7-Zip to verify that the resulting data coming out of the API is correct). Unfortunately, when I try to inflate the data in Javascript with pako, I am getting errors.
Here is my code (note: response.data is the binary data transferred from AWS):
apigClient.dataGet(params, {}, {})
  .then((response) => {
    console.log(response); // shows response including header and data
    const result = pako.inflate(new Uint8Array(response.data), { to: 'string' });
    // ERROR HERE: 'buffer error'
  }).catch((itemGetError) => {
    console.log(itemGetError);
  });
I also tried a version that splits the binary data input into an array by adding the following before the inflate:
const charData = response.data.split('').map(function (x) { return x.charCodeAt(0); });
const binData = new Uint8Array(charData);
const result = pako.inflate(binData, { to: 'string' });
// ERROR: incorrect header check
I suspect I have some sort of issue with the encoding of the data, and that I am not getting it into a format that is meaningful to Uint8Array.
Can anyone point me in the right direction to get this working?
For clarity:
- With the code as listed above, I get a buffer error. If I drop the Uint8Array and just try to process 'response.data' directly, I get the error 'incorrect header check', which is what makes me suspect that the encoding/format of my data is the issue.
- The original file was compressed in Java using GZIPOutputStream with UTF-8 and then stored as a static file (e.g. randomname.gz).
- The file is transferred through the AWS Gateway as binary, so it comes out exactly the same as the original file: 'curl --output filename.gz {URLtoS3Gateway}' === downloaded file from S3.
- I had the same basic issue when I used the Gateway to encode the binary data as 'base64', but did not experiment much with that approach, as it seemed easier to work with the "real" binary data than to add a base64 encode/decode in the middle. If that is a needed step, I can add it back in.
- I have also tried some of the example processing found halfway through this issue: https://github.com/nodeca/pako/issues/15, but that didn't help (I might be misunderstanding binary format vs. array vs. base64).
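For reference, a quick sanity check on the binData from the snippet above (a sketch, not something I have verified): a valid gzip stream begins with the magic bytes 0x1f 0x8b, so if these don't match, the bytes were mangled before they ever reached pako.
// Sanity check (sketch): a valid gzip stream begins with 0x1f 0x8b
console.log(binData[0] === 0x1f && binData[1] === 0x8b); // if false, the bytes are already corrupted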
1 Answer
I was able to figure out my own problem. It was related to the format of the data being read in by Javascript (either Javascript itself or the Angular HttpClient implementation). I was reading in a "binary" format, but it was not the same as the one recognized/used by pako. When I read the data in as base64 and then converted it to binary with 'atob', I was able to get it working. Here is what I actually implemented (starting at fetching from the S3 file storage).
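A minimal sketch of my understanding of the failure, assuming the HTTP layer was decoding the body as UTF-8 text: a gzip stream begins with the magic bytes 0x1f 0x8b, and 0x8b is not a valid standalone byte in UTF-8, so a text decode replaces it with U+FFFD. charCodeAt() then returns the replacement character's code instead of the original byte, and pako sees a corrupted header.
// What a text-decoded gzip header can look like (sketch):
const textDecoded = '\u001f\ufffd'; // 0x1f survives; 0x8b becomes U+FFFD
console.log(textDecoded.charCodeAt(0)); // 31 (0x1f, correct)
console.log(textDecoded.charCodeAt(1)); // 65533, not 139 (the original 0x8b is unrecoverable)
// feeding these values to pako.inflate() fails with 'incorrect header check'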
1) Build an AWS API Gateway that will read a previously stored *.gz file from S3.
- Create a standard "get" API request to S3 that supports binary. (http://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-console.html)
- Make sure the Gateway will recognize the input type by setting 'Binary types' (application/gzip worked for me, but others like application/octet-stream and image/png should work for other types of files besides *.gz). NOTE: that setting is under the main API selections list on the left of the API config screen.
- Set the 'Content Handling' to "Convert to text (if needed)" by selecting the API Method/{GET} -> Integration Request box and updating the 'Content Handling' item. (NOTE: the example in the link above recommends "passthrough". DON'T use that, as it will pass through the unreadable binary format.) This is the step that actually converts from binary to base64.
At this point you should be able to download a base64 version of your binary file via the URL (test in a browser, or with curl as in the question above).
2) I then had the API Gateway generate the SDK and used the respective apigClient.{get} call.
3) Within the call, translate the base64 -> binary -> Uint8Array, and then decompress/inflate it. My code for that:
apigClient.myDataGet(params, {}, {})
  .then((response) => {
    // HttpClient result is in response.data
    // convert the incoming base64 -> binary string
    const strData = atob(response.data);
    // split it into an array of character codes rather than a "string"
    const charData = strData.split('').map(function (x) { return x.charCodeAt(0); });
    // convert to a byte array
    const binData = new Uint8Array(charData);
    // inflate
    const result = pako.inflate(binData, { to: 'string' });
    console.log(result);
  }).catch((itemGetError) => {
    console.log(itemGetError);
  });
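As an aside, the base64 -> byte array steps above can be collapsed into a single line with Uint8Array.from, which accepts a mapping function. A sketch using the same response.data (equivalent behavior, just terser):
// Equivalent, more compact base64 -> bytes conversion (sketch):
const binData = Uint8Array.from(atob(response.data), (c) => c.charCodeAt(0));
const result = pako.inflate(binData, { to: 'string' });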