Skip to content

notifications: Switch to use make_links_absolute() from lxml library. #6906

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

HarshitOnGitHub
Copy link
Member

Instead of using custom regexes for converting relative URLs to
absolute URLs switch to using make_links_absolute() function
from lxml library.

@rht
Copy link
Collaborator

rht commented Oct 8, 2017

@HarshitOnGitHub is there any performance gain observed? Regardless, I prefer simpler codebase :> $$\therefore$$ LGTM.

@@ -26,6 +26,7 @@

import datetime
from email.utils import formataddr
import lxml.html as LH
Copy link
Member

@timabbott timabbott Oct 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just do import lxml.html or import lxml; no reason to use an abbreviation here.

@timabbott
Copy link
Member

Posted one comment, otherwise this looks good, since we have unit tests for that function so we can pretty confident there isn't something minor broken here (right?).

Instead of using custom regexes for converting relative URLs to
absolute URLs switch to using `make_links_absolute()` function
from lxml library.
@HarshitOnGitHub
Copy link
Member Author

@timabbott Agree. Updated! Yeah we have a good test suite there so it is very unlikely that this may have broken something minor.
@rht I didn't checked for performance gains but I think this new approach should be faster than previous one. Since this is not a performance critical path I would have replaced the older approach event if this approach was a little bit slower than the older one since this is cleaner and more accurate. :)

@timabbott
Copy link
Member

Yeah we should definitely not waste time trying to see performance impact. It probably is a bit slower in a completely irrelevant way.

@timabbott
Copy link
Member

Merged, thanks @HarshitOnGitHub!

@timabbott
Copy link
Member

Oh, one note; do you think there's a way to handle the file uploads part with lxml as well? Might be a good idea to convert that as well.

@HarshitOnGitHub
Copy link
Member Author

HarshitOnGitHub commented Oct 8, 2017

@timabbott I think we want abbreviation there because of this: https://stackoverflow.com/a/13093714/7516461 otherwise mypy will complain.
Which file uploads part are you referring to?

@timabbott
Copy link
Member

I meant the second half of relative_to_full_url.

This works fine:

In [1]: import lxml
In [2]: lxml.html.parse
Out[2]: <function lxml.html.parse>

but you're right that mypy's annotations for lxml are broken; I posted on python/typeshed#525 about the issue (and added some type: ignores to our codebase).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants