Skip to content

bpo-37953: Fix ForwardRef equality checks #15400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Sep 13, 2019
Merged

bpo-37953: Fix ForwardRef equality checks #15400

merged 11 commits into from
Sep 13, 2019

Conversation

plokmijnuhby
Copy link
Contributor

@plokmijnuhby plokmijnuhby commented Aug 22, 2019

Ideally if we stick a ForwardRef in a dictionary we would like to reliably be able to get it out again.

https://bugs.python.org/issue37953

Maybe this will cut down on hash collisions
Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable, but let's make sure @ilevkivskyi agrees.

Can you add a test (preferably one that shows what's not working without the fix)?

It's fairly unorthodox to have a PR without a bug in the tracker -- can you create a bug describing the problem and link this to it? Also please add a news item.

@ilevkivskyi
Copy link
Member

TBH I don't understand what problem is being solved. This definitely requires tests.

@gvanrossum
Copy link
Member

The problem seems clear enough to me: The __hash__() and __eq__() operators compute a value that depends on __forward_value__, but it's possible to use a ForwardRef as a dict key before the latter field is assigned (it is initialized to None). If then (through an alias) its _evaluate() is called, which assigns a non-None value to __forward_value__, the hash and equality change, which violates the rules for hashable objects and confuses the dict lookup operation so the lookup will fail.

I'm not clear however on why someone would use a ForwardRef as a dict key. Maybe @plokmijnuhby can explain their use case?

@plokmijnuhby plokmijnuhby changed the title Fix ForwardRef equality checks #37953 - Fix ForwardRef equality checks Aug 26, 2019
@plokmijnuhby plokmijnuhby changed the title #37953 - Fix ForwardRef equality checks Issue #37953 - Fix ForwardRef equality checks Aug 26, 2019
@ilevkivskyi ilevkivskyi changed the title Issue #37953 - Fix ForwardRef equality checks bpo-37953: Fix ForwardRef equality checks Aug 28, 2019
@ilevkivskyi
Copy link
Member

OK, I see how the current status is wrong. On the other hand, just removing the value for evaluated forward references will make two references referring to different classes with the same name equal.

I think that the first one is more important, so I am fine with just removing the __forward_value__ from __hash__() and __eq__().

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, just one question.

Please also add a NEWS item.

@hongweipeng
Copy link
Contributor

I try a test in this PR:

from typing import ForwardRef, get_type_hints

def namespace1():
    a = ForwardRef('A')

    class A: pass
    def fun(x: a): pass

    get_type_hints(fun, globals(), locals())
    return hash(a)

def namespace2():
    a = ForwardRef('A')

    class A: pass
    def fun(x: a): pass

    get_type_hints(fun, globals(), locals())
    return hash(a)

print(namespace1() == namespace2())

Output True in this PR and output False in master branch.

I'm afraid of removing the __forward_value__ can break a legitimate code.

@brandtbucher
Copy link
Member

brandtbucher commented Sep 1, 2019

[edited]

@hongweipeng, I see what you mean now. This is definitely tricky.

I think combining the __eq__ from #15628 with the __hash__ from this PR makes the most sense.

@hongweipeng
Copy link
Contributor

@brandtbucher Yeah, I want to use __eq__ to remove duplicate item instead of using __hash__.

Now I think that __eq__ is only used for ForwardRef, while others keep the original __hash__ way.

@ilevkivskyi
Copy link
Member

Now I think that __eq__ is only used for ForwardRef, while others keep the original __hash__ way.

TBH I don't really like making __hash__() and __eq__() inconsistent. There are however precedents, for example hash('abc') == hash(b'abc')is True while 'abc' == b'abc' is False. So a possible thing I could imagine is keeping only __forward_arg__, but also take into account __forward_value__ in __eq__() if it is not None for self and other.

IOW I am fine with

I think combining the __eq__ from #15628 with the __hash__ from this PR makes the most sense.

@hongweipeng
Copy link
Contributor

Removing __forward_value__ can solve the recursion problem incidentally issue37806.

@ilevkivskyi
Copy link
Member

Removing __forward_value__ can solve the recursion problem incidentally.

Do you mean removing it from __hash__ or from both places?

Generally, I would propose to keep only one PR open, because currently the discussion is scattered between two PRs. @plokmijnuhby Are you interested in continuing to work on this PR?

@hongweipeng
Copy link
Contributor

@ilevkivskyi Yes, we can remove __forward_value__ in __hash__ and keep it in __eq__. In this case, an actually recursive object will also throw an RecursionError normally.

@plokmijnuhby
Copy link
Contributor Author

plokmijnuhby commented Sep 2, 2019

@ilevkivskyi I'm happy to keep working on it, but I'm not quite sure how where we want to take this. It seems like there are still a few ideas bouncing around here. I mean, in theory we could remove the caching thing entirely, and just perform a lookup of the class name in a function's __globals__ whenever we need to evaluate a forward reference on it.

@aeros aeros added the type-feature A feature request or enhancement label Sep 2, 2019
@aeros
Copy link
Contributor

aeros commented Sep 2, 2019

I'm not entirely certain if this is something we'd consider to be a behavioral changing enhancement or if the current behavior is an actual bug. For now, I'll categorize it as enhancement but feel free to change it.

Alright, I don't really agree with this but I seem to be in the minority, so I'll change it.

Co-Authored-By: Brandt Bucher <[email protected]>
@plokmijnuhby
Copy link
Contributor Author

Okay, here's my thoughts on forward references. @hongweipeng's code example doesn't make much sense to me because users are unlikely to evaluate references right after creating them, i.e. in the same calling context, but much more likely to have someone else's code evaluate them a little later, in a completely different calling context. The trouble is that we have no good way of determining the correct context after we've left it.

There are several possible solutions to this:

  • WONTFIX the issue and ignore it.
  • Detect the calling context the moment a forward reference is created, through sys._getframe hacks. Not a brilliant idea, particularly since using the form def foo(a: 'Bar') to use strings as forward references means we'd also have to do some some sort of monkey patch to str, so let's avoid that one.
  • Changes to the parser itself. This would probably slow down the whole python system.
  • A change to how we think about forward references. Forward references could be considered to have no meaning unless they have an obvious context available, eg if globals and locals were passed to get_type_hints. This would mean that a forward reference is basically just a string that can be executed in a fancy way, so it makes sense to mark it as immutable. This is the approach I was leaning towards.

@ilevkivskyi
Copy link
Member

So we now have two competing almost identical PRs: this one and #15650. This one was first, but the other has more tests. I think the best way is to polish this one, so I am going to close the alternative one.

@@ -0,0 +1,3 @@
Forward references in the typing module now have different hash and eq methods. References will now compare equal properly if only one of them has been evaluated and the other hasn't. Hash values are now consistent across the lifetime of a forward reference.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use `__hash__` and `__eq__` instead of just hash and eq. Also this news item is too long, couple short sentences (or even one) should be enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it, is that better?

@aeros
Copy link
Contributor

aeros commented Sep 13, 2019

Also, thanks for the PR and welcome @plokmijnuhby. (:

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updates! LGTM now.

@ilevkivskyi
Copy link
Member

Thanks everyone for help!

@plokmijnuhby To make things consistent, you might also want to add this fix to the backport at https://github.com/python/typing.

@ilevkivskyi ilevkivskyi merged commit e082e7c into python:master Sep 13, 2019
@miss-islington
Copy link
Contributor

Thanks @plokmijnuhby for the PR, and @ilevkivskyi for merging it 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-16128 is a backport of this pull request to the 3.8 branch.

miss-islington added a commit that referenced this pull request Sep 13, 2019
Ideally if we stick a ForwardRef in a dictionary we would like to reliably be able to get it out again.

https://bugs.python.org/issue37953
(cherry picked from commit e082e7c)

Co-authored-by: plokmijnuhby <[email protected]>
ned-deily pushed a commit that referenced this pull request Mar 3, 2020
Ideally if we stick a ForwardRef in a dictionary we would like to reliably be able to get it out again.

https://bugs.python.org/issue37953
(cherry picked from commit e082e7c)

Co-authored-by: plokmijnuhby <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants