admin管理员组文章数量:1289529
*"Efficient" here basically means in terms of smaller size (to reduce the IO waiting time), and speedy retrieval/deserialization times. Storing times are not as important.
I have to store a couple of dozen arrays of integers, each with 1800 values in the range 0-50, in the browser's localStorage -- that is, as a string.
Obviously, the simplest method is to just JSON.stringify
it, however, that adds a lot of unnecessary information, considering that the ranges of the data is well known. An average size for one of these arrays is then ~5500 bytes.
Here are some other methods I've tried (resultant size, and time to deserialize it 1000 times at the end)
zero-padding the numbers so each was 2 characters long, eg:
[5, 27, 7, 38] ==> "05270738"
base 50 encoding it:
[5, 11, 7, 38] ==> "5b7C"
just using the value as a character code (adding 32 to avoid the weird control characters at the start):
[5, 11, 7, 38] ==> "%+'F" (String.fromCharCode(37), String.fromCharCode(43) ...)
Here are my results:
size Chrome 18 Firefox 11
-------------------------------------------------
JSON.stringify 5286 60ms 99ms
zero-padded 3600 354ms 703ms
base 50 1800 315ms 400ms
charCodes 1800 21ms 178ms
My question is if there is an even better method I haven't yet considered?
Update
MДΓΓБДLL suggested using pression on the data. Combining this LZW implementation with the base 50 and charCode data. I also tested aroth's code (packing 4 integers into 3 bytes). I got these results:
size Chrome 18 Firefox 11
-------------------------------------------------
LZW base 50 1103 494ms 999ms
LZW charCodes 1103 194ms 882ms
bitpacking 1350 2395ms 331ms
*"Efficient" here basically means in terms of smaller size (to reduce the IO waiting time), and speedy retrieval/deserialization times. Storing times are not as important.
I have to store a couple of dozen arrays of integers, each with 1800 values in the range 0-50, in the browser's localStorage -- that is, as a string.
Obviously, the simplest method is to just JSON.stringify
it, however, that adds a lot of unnecessary information, considering that the ranges of the data is well known. An average size for one of these arrays is then ~5500 bytes.
Here are some other methods I've tried (resultant size, and time to deserialize it 1000 times at the end)
zero-padding the numbers so each was 2 characters long, eg:
[5, 27, 7, 38] ==> "05270738"
base 50 encoding it:
[5, 11, 7, 38] ==> "5b7C"
just using the value as a character code (adding 32 to avoid the weird control characters at the start):
[5, 11, 7, 38] ==> "%+'F" (String.fromCharCode(37), String.fromCharCode(43) ...)
Here are my results:
size Chrome 18 Firefox 11
-------------------------------------------------
JSON.stringify 5286 60ms 99ms
zero-padded 3600 354ms 703ms
base 50 1800 315ms 400ms
charCodes 1800 21ms 178ms
My question is if there is an even better method I haven't yet considered?
Update
MДΓΓБДLL suggested using pression on the data. Combining this LZW implementation with the base 50 and charCode data. I also tested aroth's code (packing 4 integers into 3 bytes). I got these results:
size Chrome 18 Firefox 11
-------------------------------------------------
LZW base 50 1103 494ms 999ms
LZW charCodes 1103 194ms 882ms
bitpacking 1350 2395ms 331ms
Share
Improve this question
edited May 23, 2017 at 11:58
CommunityBot
11 silver badge
asked Apr 12, 2012 at 0:12
nickfnickf
546k198 gold badges658 silver badges727 bronze badges
4
- 3 Try bining base 50 or charCodes method with deflate/gzip. I'm not sure that there's even enough data for pression to be worth the overhead, but it can't hurt to check. – Matt Ball Commented Apr 12, 2012 at 0:16
- @MДΓΓБДLL I just tried it with LZW pression... updating the results table now – nickf Commented Apr 12, 2012 at 0:21
- 61% reduction, I wouldn't have expected that - not bad! – Matt Ball Commented Apr 12, 2012 at 0:41
- 1 A single JavaScript character takes up two bytes (per the ecma specification.) Make sure to factor that in when you figure the size of the data. – gilly3 Commented Apr 13, 2012 at 7:55
3 Answers
Reset to default 4If your range is 0-50, then you can pack 4 numbers into 3 bytes (6 bits per number). This would allow you to store 1800 numbers using ~1350 bytes. This code should do it:
window._firstChar = 48;
window.decodeArray = function(encodedText) {
var result = [];
var temp = [];
for (var index = 0; index < encodedText.length; index += 3) {
//skipping bounds checking because the encoded text is assumed to be valid
var firstChar = encodedText.charAt(index).charCodeAt() - _firstChar;
var secondChar = encodedText.charAt(index + 1).charCodeAt() - _firstChar;
var thirdChar = encodedText.charAt(index + 2).charCodeAt() - _firstChar;
temp.push((firstChar >> 2) & 0x3F); //6 bits, 'a'
temp.push(((firstChar & 0x03) << 4) | ((secondChar >> 4) & 0xF)); //2 bits + 4 bits, 'b'
temp.push(((secondChar & 0x0F) << 2) | ((thirdChar >> 6) & 0x3)); //4 bits + 2 bits, 'c'
temp.push(thirdChar & 0x3F); //6 bits, 'd'
}
//filter out 'padding' numbers, if present; this is an extremely inefficient way to do it
for (var index = 0; index < temp.length; index++) {
if(temp[index] != 63) {
result.push(temp[index]);
}
}
return result;
};
window.encodeArray = function(array) {
var encodedData = [];
for (var index = 0; index < dataSet.length; index += 4) {
var num1 = dataSet[index];
var num2 = index + 1 < dataSet.length ? dataSet[index + 1] : 63;
var num3 = index + 2 < dataSet.length ? dataSet[index + 2] : 63;
var num4 = index + 3 < dataSet.length ? dataSet[index + 3] : 63;
encodeSet(num1, num2, num3, num4, encodedData);
}
return encodedData;
};
window.encodeSet = function(a, b, c, d, outArray) {
//we can encode 4 numbers in 3 bytes
var firstChar = ((a & 0x3F) << 2) | ((b >> 4) & 0x03); //6 bits for 'a', 2 from 'b'
var secondChar = ((b & 0x0F) << 4) | ((c >> 2) & 0x0F); //remaining 4 bits from 'b', 4 from 'c'
var thirdChar = ((c & 0x03) << 6) | (d & 0x3F); //remaining 2 bits from 'c', 6 bits for 'd'
//add _firstChar so that all values map to a printable character
outArray.push(String.fromCharCode(firstChar + _firstChar));
outArray.push(String.fromCharCode(secondChar + _firstChar));
outArray.push(String.fromCharCode(thirdChar + _firstChar));
};
Here's a quick example: http://jsfiddle/NWyBx/1
Note that storage size can likely be further reduced by applying gzip pression to the resulting string.
Alternately, if the ordering of your numbers is not significant, then you can simply do a bucket-sort using 51 buckets (assuming 0-50 includes both 0 and 50 as valid numbers) and store the counts for each bucket instead of the numbers themselves. That would likely give you better pression and efficiency than any other approach.
Assuming (as in your test) that pression takes more time than the size reduction saves you, your char encoding is the smallest you'll get without bitshifting. You're currently using one byte for each number, but if they're guaranteed to be small enough you could put two numbers in each byte. That would probably be an over-optimization, unless this is a very hot piece of your code.
You might want to consider using Uint8Array
or ArrayBuffer
. This blogpost shows how it's done. Copying his logic, here's an example, assuming you have an existing Uint8Array
named arr
.
function arrayBufferToBinaryString(buffer, cb) {
var blobBuilder = new BlobBuilder();
blobBuilder.append(buffer);
var blob = blobBuilder.getBlob();
var reader = new FileReader();
reader.onload = function (e) {
cb(reader.result);
};
reader.readAsBinaryString(blob);
}
arrayBufferToBinaryString(arr.buffer, function(s) {
// do something with s
});
版权声明:本文标题:html - Most efficient way to store large arrays of integers in localStorage with Javascript - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1741470779a2380592.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论