Skip to content

Remove lifetime #22

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Sep 23, 2022
Merged

Remove lifetime #22

merged 7 commits into from
Sep 23, 2022

Conversation

Txuritan
Copy link
Collaborator

@Txuritan Txuritan commented Sep 22, 2022

A mentioned in #18 the added lifetime would limit use of the API as well as cause issue with PyO3 bindings. This removes it using Vecs and to_vecs.

I do plan on using this commit as the basis for my optimization experiments which may be included depending on if this is merged or not.

Benchmark (mimalloc):

matchit path-tree w/o lifetime
insert 59µs 73µs 84µs
find 45µs 40µs 40µs

Benchmark (Windows' allocator):

matchit path-tree w/o lifetime
insert 109µs 128µs 158µs
find 31µs 40µs 39µs

src/lib.rs Outdated
@@ -191,7 +192,7 @@ impl<'a, T> PathTree<'a, T> {
self
}

pub fn find<'b>(&'a self, path: &'b str) -> Option<Path<'a, 'b, T>> {
pub fn find<'a, 'b>(&'a self, path: &'b str) -> Option<Path<'a, 'b, T>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the lifetime for path would be fine to keep. For PyO3 we need to copy the data into Python objects to return it anyway, so lifetimes on methods are fine. The problematic lifetimes are on the PathTree. Also I imagine most use cases don’t care too much about insertion times (ideally that would happen once at app startup) so I would heavily bias toward optimizing the performance of find() over all else.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change better? I decided to make it an elided lifetime instead of explicit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, I’ll try updating the wrapper and confirm. By the way, if you have any input there feel free to open a PR, I’m still learning so I’m sure you could improve it.

@fundon
Copy link
Member

fundon commented Sep 23, 2022

The biggest performance problem when capturing parameters.
Currently, I haven't found a better way to update ranges.

@fundon
Copy link
Member

fundon commented Sep 23, 2022

matchit uses Params to improve performance.
We use smallvec.

@Txuritan
Copy link
Collaborator Author

From my profiling I did notice that each find call was an (almost) perfect triangle, means that its not taking too much time in any one check. I plan on taking a closes look using finer tracing than the initial setup I had.

I'll take a look into inserting paths, one of the ideas I had was to store the inserted paths in PathTree and have Nodes reference the path by index and use Ranges for slicing. Basically how the router works but for paths themselves.

What was the issue with capturing parameters. I did notice a second parameter check not sure if thats what you're talking about though.

@fundon
Copy link
Member

fundon commented Sep 23, 2022

What was the issue with capturing parameters.

Updates parameters array, https://github.com/viz-rs/path-tree/blob/main/src/node.rs#L284-L285, if this update operation is removed, the code will be greatly enhanced. So I'm wondering if there's a better way to update the array.

@Txuritan
Copy link
Collaborator Author

Ah those, I was wondering about those. My initial thoughts were to just place them into a single insert (eg: ranges.push(start..start + n);) using the Range enum directly instead of having to recreate it when creating the raw parameteres list. I did try it out during my first pass over the code but it didn't seem like it changed much in terms of performance.

My other idea was to have each 'leaf' Node hold all its parameters and only build the parameter list when it found a match, but it would have complicated the router ever further.

@fundon
Copy link
Member

fundon commented Sep 23, 2022

Using Range is very clean and simple.

My other idea was to have each 'leaf' Node hold all its parameters and only build the parameter list when it found a match, but it would have complicated the router ever further.

Yes we already have the path pieces that seem to do.

@fundon
Copy link
Member

fundon commented Sep 23, 2022

If it is possible to merge, assign to me. :)

@Txuritan
Copy link
Collaborator Author

I fixed the merge conflict but I am unable to assign you to the issue as I don't have the ability to do so.

@fundon
Copy link
Member

fundon commented Sep 23, 2022

@Txuritan I have invited you to role write.

@fundon fundon merged commit fde8ed2 into viz-rs:main Sep 23, 2022
@fundon
Copy link
Member

fundon commented Sep 23, 2022

Landed in v0.5.1. Thanks.

@adriangb
Copy link
Contributor

This worked out great for the Python wrapper: adriangb/routrie@faa4ad8#diff-b1a35a68f14e696205874893c07fd24fdb88882b47c23cc0e0c80a30c7d53759

It would be nice to figure out at some point how to lazily build the parameters instead of calling params() (just like you are doing in Rust) before returning the data to Python, but I suspect it would get pretty complicated again with lifetimes for probably little benefit to users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants