admin管理员组文章数量:1340628
I have a data structure with 6000 elements and for each element I need to store 7 bits of info. If I naively store it as an array of 6000 elements filled with numbers, it takes up around 22 KB. I am trying to reduce the size of the page - what is the best way to store 6000*7 bits of info (should be around 5 KB). I want a "bitstream" like data structure. I thought about encoding it into a string or even an image but not exactly sure. The reason I did not encode as string because I cannot mathemtically guarantee that none of the characters won't be one of unprintable ASCII characters (e.g. ASCII 1-25)
I have a data structure with 6000 elements and for each element I need to store 7 bits of info. If I naively store it as an array of 6000 elements filled with numbers, it takes up around 22 KB. I am trying to reduce the size of the page - what is the best way to store 6000*7 bits of info (should be around 5 KB). I want a "bitstream" like data structure. I thought about encoding it into a string or even an image but not exactly sure. The reason I did not encode as string because I cannot mathemtically guarantee that none of the characters won't be one of unprintable ASCII characters (e.g. ASCII 1-25)
Share Improve this question edited Jun 21, 2013 at 5:13 pathikrit asked Jun 21, 2013 at 5:04 pathikritpathikrit 33.5k39 gold badges153 silver badges229 bronze badges 10- 1 String with bits shifted to some range that does not include 0 is probably easiest solution (something like charFromCode[value +32] to get all values in safe range) - easy access to each element... – Alexei Levenkov Commented Jun 21, 2013 at 5:08
- 1 May see this – Ishank Commented Jun 21, 2013 at 5:10
- @Ishank - I actually tried the library pointed at that page (pieroxy/blog/pages/lz-string/index.html) and funnily it is a really bad pression library (the author himself admits, it increases the size of the output if you have more than 750 characters which in my case is true). That solution is good for pressing small strings to be stored in localstorage. Any other client side pression scheme might work but I am looking for something purely bithacks based. – pathikrit Commented Jun 21, 2013 at 5:19
- Why do you want to reduce it? To save network bandwidth or memory footprint in browser, or for storage such as localStorage? – rlb Commented Jun 21, 2013 at 6:54
- 1 Do you have server pression enabled? I would have thought gzip would have conpressed a naive array pretty well? Is the 22k the observed network effect, or just the increase in server file size? – rlb Commented Jun 21, 2013 at 7:16
3 Answers
Reset to default 9 +50Let's consider two solutions.
Base 32
For fun, let's consider using base-32 numbers. Yes, you can do that in JavaScript.
First pack four 7-bit values into one integer:
function pack(a1,a2,a3,a4){
return ((a1 << 8 | a2) << 8 | a3) << 8 | a4;
}
Now, convert to base 32.
function encode(n){
var str = "000000" + n.toString(32);
str = str.slice(0,6);
return str;
}
That should be not more than six digits. We make sure it's exactly six.
Going the other direction:
function decode(s){
return parseInt(s, 32);
}
function unpack(x){
var a1 = x & 0xff0000>>24, a2 = x & 0x00ff0000>>16, a3 = x & 0x0000ff00>>8, a4 = x & 0x000000ff;
return [a1, a2, a3, a4];
}
All that remains is to wrap the logic around this to handle the 6000 elements. To press:
function press(elts){
var str = '';
for(var i = 0; i < elts.length; i+=4){
str += encode(pack(elts[i], elts[i+1], elts[i+2], elts[i+3])
}
return str;
}
And to unpress:
function unpress(str){
var elts = [];
for(var i = 0; i < str.length; i+=6){
elts = elts.concat(unpack(decode(str.slice(i, i+6)));
}
return elts;
}
If you concatenate the results for all 6,000 elements, you'll have 1,500 packed numbers, which at six characters each will turn into about 9K. It's about 1.5 bytes per 7-bit value. It's by no means the information-theoretic maximum pression, but it's not that bad. To decode simply reverse the process:
Unicode
First we'll pack two 7-bit values into one integer:
function pack(a1,a2){
return (a1 << 8 | a2) << 8;
}
We'll do this for all 6,000 inputs, then use our friend String.fromCharCode
to turn all 3,000 values into a 3,000-character Unicode string:
function press(elts){
var packeds = [];
for (var i = 0; i < elts.length; i+=2) {
packeds.push(pack(elts[i], elts[i+1]);
}
return String.fromCharCode.apply(0, packeds);
}
Coming back the other way, it's quite easy:
function unpress(str) {
var elts = [], code;
for (var i = 0; i < str.length; i++) {
code=str.charCodeAt(i);
elts.push(code>>8, code & 0xff);
}
return elts;
}
This will take up two bytes per two 7-bit values, so about 33% more efficient than the base 32 approach.
If the above string is going to be written out into a script tag as a Javascript assignment such as var data="HUGE UNICODE STRING";
, then quotation marks in string will need to be escaped:
javascript_assignment = 'var data = "' + press(elts).replace(/"/g,'\\"') + '";';
The above code is not meant to be production, and in particular does not handle edge cases where the the number of inputs is not a multiple of four or two.
As dandavis said it is ok to encode unprintable ASCII characters into JSON-string. But for random data it gave me 13KB (because many characters must be escaped). You can encode string into base64 and then into JSON-string. It gave me 7.9KB for random data.
var randint = function (from, to) {
return Math.floor(Math.random() * (to - from + 1)) + from;
}
var data = '';
for (var i = 0; i < 6000; ++i) {
data += String.fromCharCode(randint(0, 127));
}
// encoding `data` as JSON-string at this point gave me 13KB
var b64data = btoa(data);
// encoding `b64data` as JSON-string gave me 7.9KB
to decode it
var data = atob(b64data);
var adata = [];
for (var i = 0; i < data.length; ++i) {
adata.push(data.charCodeAt(i));
}
There definitely should be more efficient method to encode your data, but I believe this one is a promise on plexity and efficiency.
PS. In some browsers you might need to write atob
and btoa
by yourself.
actually, strings work fine if you use JSON to encode any potential nasties into a JS-escape code:
var codes=",Ñkqëgdß\u001f", // (10 chars JSON encoded to store all chars ranges)
mySet=codes[4].charCodeAt().toString(2).split("").map(Number).map(Boolean).reverse();
alert(mySet); // shows: [true,false,false,false,true,true,true]
/* broken down into bite-sized steps: (pseudo code)
char == "g" (codes[4])
"g".charCodeAt() == 103
(103).toString(2) == "1100111"
.split().map(Number) == [1,1,0,0,1,1,1]
.map(Boolean).reverse() == [true,true,true,false,false,true,true] */
and to fill the array, reverse the process:
var toStore= [true, false, true, false, true, false, true];
var char= String.fromCharCode(parseInt(toStore.map(Number).reverse().join(""),2));
codes+=char;
//verify (should===true):
codes[10].charCodeAt().toString(2).split("")
.map(Number).map(Boolean).reverse().toString() === toStore.toString();
to export the results to an ascii file, JSON.stringify(codes), or if saving to localStrorage, you can just save the raw string variable since browsers use two bytes per char of localStorage...
本文标签: javascriptEmbedding binary data in web pageStack Overflow
版权声明:本文标题:javascript - Embedding binary data in web page? - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1743646885a2515660.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论