Go vs JavaScript JSON parsing

Recently, I needed to parse the JSON that the Chrome web browser produces when you record events in its dev tools, and get some timing data out of it. Chrome can produce a pretty large amount of data in a small amount of time, so the Ruby parser I originally built was quite slow.

Since I'm learning Go, I decided to write scripts in both Go and JavaScript/Node and compare them.

The simplest possible form of the JSON file is what I have in this Gist. It contains an event representing the request sent to fetch a page, and the event representing the response. Typically, there's a huge amount of extra data to sift through. That's its own problem, but not what I'm worried about in this question.

The JavaScript script that I wrote is here, and the Go program I wrote is here. This is the first useful thing I've written in Go, so I'm sure it's all sorts of bad. However, one thing I noticed is that it's much slower than JavaScript at parsing a large JSON file.

Time with a 119 MB JSON file in Go:

$ time ./parse data.json
= 22 Requests
  Min Time:      0.77
  Max Time:      0.77
  Average Time:  0.77
./parse data.json  4.54s user 0.16s system 99% cpu 4.705 total

Time with a 119 MB JSON file in JavaScript/Node:

$ time node parse.js data.json
= 22 Requests
  Min Time: 0.77
  Max Time: 0.77
  Avg Time: 0.77
node parse.js data.json  1.73s user 0.24s system 100% cpu 1.959 total

(The min/max/average times are all identical in this example because I duplicated JSON objects so as to have a very large data set, but that's irrelevant.)

I'm curious whether JavaScript/Node is just way faster at parsing JSON (which wouldn't be particularly surprising, I guess), or whether there's something I'm doing totally wrong in the Go program. I'm also curious what I'm doing wrong in the Go program in general, because I'm sure there's plenty wrong with it.

Note that while these two scripts do more than just parse, it's definitely json.Unmarshal() that accounts for the bulk of the Go program's runtime.
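
For reference, the hot path of the Go program boils down to roughly the following sketch (the Event struct and its fields are simplified stand-ins for the real trace-event structure, and the actual program goes on to compute the min/max/average times):

package main

import (
    "encoding/json"
    "fmt"
    "os"
)

// Event is a simplified stand-in for the real trace-event struct,
// which has more fields than shown here.
type Event struct {
    Method    string  `json:"method"`
    Timestamp float64 `json:"timestamp"`
}

func main() {
    data, err := os.ReadFile(os.Args[1])
    if err != nil {
        panic(err)
    }

    var events []Event
    // Nearly all of the program's runtime is spent in this call.
    if err := json.Unmarshal(data, &events); err != nil {
        panic(err)
    }

    fmt.Printf("= %d Events\n", len(events))
}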

Update

I added a Ruby script:

$ ruby parse.rb
= 22 Requests
  Min Time: 0.77
  Max Time: 0.77
  Avg Time: 0.77
ruby parse.rb  4.82s user 0.82s system 99% cpu 5.658 total

asked Jul 11, 2013 at 19:33 by clem (edited Jul 11, 2013 at 19:59)

Comments:
  • Just as a terminology note, "parsing" is what happens when you call JSON.parse() or the Go json.Unmarshal(). The rest of the work is just traversal of the resulting data structure. – Pointy Commented Jul 11, 2013 at 19:37
  • @Pointy I'm perfectly aware of this. It's json.Unmarshal() in Go that's much slower than JSON.parse() in JavaScript (or appears to be). You're right, I should have used a different verb in some places here. :) – clem Commented Jul 11, 2013 at 19:40
  • OK, yes, I've been reading over your code and those results, and now I think I see what you mean. That's pretty weird; parsing JSON should be crazy fast in any language, as the syntax is so dirt-simple. Maybe the differences involve the construction of the data structure (the "actions" of the parse process)? Or maybe nobody's spent much time optimizing the Go JSON parser :) – Pointy Commented Jul 11, 2013 at 19:42
  • @Pointy FWIW, I added a Ruby version. The Go version is only a second faster, most of the time. – clem Commented Jul 11, 2013 at 19:57
  • @Mostafa See: gist.github.com/jclem/5979042 – clem Commented Jul 11, 2013 at 20:41

2 Answers


With Go, you are parsing the JSON into statically-typed structures. With JS and Ruby, you are parsing it into hash tables.

In order to parse JSON into the structures that you defined, the json package needs to find out the names and types of their fields. To do this, it uses the reflect package, which is much slower than accessing those fields directly.

Depending on what you do with the data after you parse it, the extra parsing time may pay for itself. The Go data structures use less memory than hash tables, and they are much faster to access. So if you do a lot with the data, the savings on processing time may outweigh the extra parsing time.
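
To make the contrast concrete, here is a minimal sketch (not the asker's actual code) of the two decoding targets:

package main

import (
    "encoding/json"
    "fmt"
)

// Statically-typed target: encoding/json must use reflection to discover
// the field names and types, but the decoded values are cheap to access.
type Event struct {
    Method    string  `json:"method"`
    Timestamp float64 `json:"timestamp"`
}

func main() {
    data := []byte(`{"method":"Network.responseReceived","timestamp":0.77}`)

    // Decoding into a struct: direct field access afterwards.
    var e Event
    if err := json.Unmarshal(data, &e); err != nil {
        panic(err)
    }
    fmt.Println(e.Method, e.Timestamp)

    // Decoding into a map, which is roughly what JS and Ruby do:
    // every access is a hash lookup, and using a value needs a type assertion.
    var m map[string]interface{}
    if err := json.Unmarshal(data, &m); err != nil {
        panic(err)
    }
    fmt.Println(m["method"], m["timestamp"])
}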

Go's default JSON parsing is really slow. I benchmarked it with a 5 MB JSON file.

Go's encoding/json:

2024/01/07 21:58:47 UnMarshal: 54.60 ms
2024/01/07 21:58:47 Marshal: 13.70 ms

Node.js:

JSON.parse took 16.12 milliseconds.
JSON.stringify took 30.06 milliseconds.
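
Timings like the ones above can be taken with a small harness along these lines (the file name and the generic decode target here are my assumptions, not part of a standard benchmark):

package main

import (
    "encoding/json"
    "log"
    "os"
    "time"
)

func main() {
    data, err := os.ReadFile("data.json")
    if err != nil {
        log.Fatal(err)
    }

    var v interface{}

    // Time decoding the raw bytes into a generic value.
    start := time.Now()
    if err := json.Unmarshal(data, &v); err != nil {
        log.Fatal(err)
    }
    log.Printf("UnMarshal: %.2f ms", time.Since(start).Seconds()*1000)

    // Time encoding the value back to JSON.
    start = time.Now()
    if _, err := json.Marshal(v); err != nil {
        log.Fatal(err)
    }
    log.Printf("Marshal: %.2f ms", time.Since(start).Seconds()*1000)
}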

However, you can use alternative Go modules that provide performance on par with Node.js.

Go's https://github.com/goccy/go-json:

2024/01/07 22:06:27 UnMarshal: 17.45 ms
2024/01/07 22:06:27 Marshal: 10.80 ms
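
Since go-json is designed as a drop-in replacement for encoding/json, switching is usually just an import swap:

package main

import (
    "fmt"
    "log"

    json "github.com/goccy/go-json" // instead of "encoding/json"
)

type Point struct {
    X, Y int
}

func main() {
    // Same Marshal/Unmarshal signatures as the standard library.
    data, err := json.Marshal(Point{X: 1, Y: 2})
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(data)) // {"X":1,"Y":2}

    var p Point
    if err := json.Unmarshal(data, &p); err != nil {
        log.Fatal(err)
    }
    fmt.Println(p.X, p.Y) // 1 2
}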
