Skip to content

add an initial radix trie implementation #5205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

add an initial radix trie implementation #5205

wants to merge 1 commit into from

Conversation

thestinger
Copy link
Contributor

This is an implementation of a map and set for integer keys. It's an ordered container (by byte order, which is sorted order for integers and byte strings when done in the right direction) with O(1) worst-case lookup, removal and insertion. There's no rebalancing or rehashing so it's actually O(1) without amortizing any costs.

The fanout can be adjusted in multiples of 2 from 2-ary through 256-ary, but it's hardcoded at 16-ary because there isn't a way to expose that in the type system yet. To keep things simple, it also only allows uint keys, but later I'll expand it to all the built-in integer types and byte arrays.

There's quite a bit of room for performance improvement, along with the boost that will come with dropping the headers on Owned ~ and getting rid of the overhead from the stack switches to the allocator. It currently does suffix compression for a single node and then splits into two n-ary trie nodes, which could be replaced with an array for at least 4-8 suffixes before splitting it. There's also the option of doing path compression, which may be a good or a bad idea and depends a lot on the data stored.

I want to share the test suite with the other maps so that's why I haven't duplicated all of the existing integer key tests in this file. I'll send in another pull request to deal with that.

Current benchmark numbers against the other map types:

TreeMap:
 Sequential integers:
  insert: 0.798295
  search: 0.188931
  remove: 0.435923
 Random integers:
  insert: 1.557661
  search: 0.758325
  remove: 1.720527

LinearMap:
 Sequential integers:
  insert: 0.272338
  search: 0.141179
  remove: 0.190273
 Random integers:
  insert: 0.293588
  search: 0.162677
  remove: 0.206142

TrieMap:
 Sequential integers:
  insert: 0.0901
  search: 0.012223
  remove: 0.084139
 Random integers:
  insert: 0.392719
  search: 0.261632
  remove: 0.470401

@graydon is using an earlier version of this for the garbage collection implementation, so that's why I added this to libcore. I left out the next and prev methods for now because I just wanted the essentials first.

bors added a commit that referenced this pull request Mar 4, 2013
This is an implementation of a map and set for integer keys. It's an ordered container (by byte order, which is sorted order for integers and byte strings when done in the right direction) with O(1) worst-case lookup, removal and insertion. There's no rebalancing or rehashing so it's actually O(1) without amortizing any costs.

The fanout can be adjusted in multiples of 2 from 2-ary through 256-ary, but it's hardcoded at 16-ary because there isn't a way to expose that in the type system yet. To keep things simple, it also only allows `uint` keys, but later I'll expand it to all the built-in integer types and byte arrays.

There's quite a bit of room for performance improvement, along with the boost that will come with dropping the headers on `Owned` `~` and getting rid of the overhead from the stack switches to the allocator. It currently does suffix compression for a single node and then splits into two n-ary trie nodes, which could be replaced with an array for at least 4-8 suffixes before splitting it. There's also the option of doing path compression, which may be a good or a bad idea and depends a lot on the data stored.

I want to share the test suite with the other maps so that's why I haven't duplicated all of the existing integer key tests in this file. I'll send in another pull request to deal with that.

Current benchmark numbers against the other map types:

    TreeMap:
     Sequential integers:
      insert: 0.798295
      search: 0.188931
      remove: 0.435923
     Random integers:
      insert: 1.557661
      search: 0.758325
      remove: 1.720527

    LinearMap:
     Sequential integers:
      insert: 0.272338
      search: 0.141179
      remove: 0.190273
     Random integers:
      insert: 0.293588
      search: 0.162677
      remove: 0.206142

    TrieMap:
     Sequential integers:
      insert: 0.0901
      search: 0.012223
      remove: 0.084139
     Random integers:
      insert: 0.392719
      search: 0.261632
      remove: 0.470401

@graydon is using an earlier version of this for the garbage collection implementation, so that's why I added this to libcore. I left out the `next` and `prev` methods *for now* because I just wanted the essentials first.
@bors bors closed this Mar 4, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants