Skip to content

Tokenize the space value when stringify-ing #7

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Conversation

ericf
Copy link
Collaborator

@ericf ericf commented Aug 28, 2015

Using a token for the space value, the Node 0.10 CPU issue is avoid while also keeping the file size closer to the no-space size.

Using the test script in #3, the results of this method add a bit of CPU time, but its worth the reduction in size:

1 space:

[48885]      803 ms: Scavenge 109.4 (142.2) -> 106.3 (142.2) MB, 0 ms [Runtime::PerformGC].
memory usage: { rss: 140365824, heapTotal: 127084240, heapUsed: 117960976 }
memory usage: { rss: 140394496, heapTotal: 127084240, heapUsed: 118006704 }
str.replace takes 4 ms.

Tokenized space:

[48879]      880 ms: Mark-sweep 156.3 (190.6) -> 107.8 (148.2) MB, 41 ms [allocation failure] [GC in old space requested].
memory usage: { rss: 151355392, heapTotal: 160808928, heapUsed: 120760240 }
memory usage: { rss: 151359488, heapTotal: 160808928, heapUsed: 120818752 }
str.replace takes 2 ms.

/cc @redonkulus @kaesonho

Using a token for the `space` value, the Node 0.10 CPU issue is avoid
while also keeping the file size closer to the no-space size.

Using the test script in #3, the results of this method add a bit of
CPU time, but its worth the reduction in size:

**1 space:**
```
[48885]      803 ms: Scavenge 109.4 (142.2) -> 106.3 (142.2) MB, 0 ms [Runtime::PerformGC].
memory usage: { rss: 140365824, heapTotal: 127084240, heapUsed: 117960976 }
memory usage: { rss: 140394496, heapTotal: 127084240, heapUsed: 118006704 }
str.replace takes 4 ms.
```

**Tokenized space:**
```
[48879]      880 ms: Mark-sweep 156.3 (190.6) -> 107.8 (148.2) MB, 41 ms [allocation failure] [GC in old space requested].
memory usage: { rss: 151355392, heapTotal: 160808928, heapUsed: 120760240 }
memory usage: { rss: 151359488, heapTotal: 160808928, heapUsed: 120818752 }
str.replace takes 2 ms.
```
@@ -9,7 +9,7 @@ See the accompanying LICENSE file for terms.
var isRegExp = require('util').isRegExp;

// Generate an internal UID to make the regexp pattern harder to guess.
var UID = Math.floor(Math.random() * 0x10000000000).toString(16);
var UID = Math.floor(Math.random() * 0x10000000000).toString(36);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wat

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the max radix and the largest magnitude random number that produces an 8 char string. I need 8 chars so I can wrap it with @ special char to end up with a 10 char string which is the max length for space param to JSON.stringify().

@lingyan
Copy link
Member

lingyan commented Aug 28, 2015

Thanks for all the efforts of optimizing this library!

I would suggest to make these configurable, so that people can benchmark with their own data set and decide what they want to do. An additional string.replace() is a little scary :)

Some people might actually want to serialize to a human readable format. It would be nice to allow configure the number of space for indentation.

The other thing to make configurable is whether to use the replacer, maybe a boolean option like isSimpleData. I'm sure there is a better name ;) That way, if people know their data set does not have regexp or function, they can turn replacer off to be much more performant.

@ericf
Copy link
Collaborator Author

ericf commented Aug 29, 2015

This change was only made to get around the Node 0.10 V8 issue. I'm on board with making the space argument configurable, but adding isSimpleData goes against a primary reason this package exists and libraries or frameworks build on top of this would always set it to false because they don't know.

In the benchmarks the extra str.replace slows things down a bit but seemed within reason for the byte size savings.

@lingyan
Copy link
Member

lingyan commented Aug 31, 2015

Sure. Ok with not adding isSimplData. Making indent and whether to do token-then-replace configurable would be great.

@ericf ericf mentioned this pull request Sep 2, 2015
@ericf
Copy link
Collaborator Author

ericf commented Sep 9, 2015

Closing this because this hack is going too far and the new plan is to revert the changes released in v1.1.0.

@ericf ericf closed this Sep 9, 2015
@ericf ericf deleted the tokenize-space branch September 9, 2015 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants