bpo-38530: Refactor AttributeError suggestions #25776
Python/suggestions.c
Outdated
    assert(d2 == d4);
}

static void
Please remove these functions, as we don't normally test things this way (even with the macros).
I would recommend either adapting these tests into some NameError examples in test_exceptions,
or exposing the functions so they can be used from the _testcapi module.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be put in the comfy chair!
static inline PyObject *
calculate_suggestions(PyObject *dir,
                      PyObject *name) {
                      PyObject *name)
{
Ditto here: please remove the tests or move them to the test suite
Python/suggestions.c
Outdated
@@ -90,38 +197,39 @@ calculate_suggestions(PyObject *dir,
    if (name_str == NULL) {
        return NULL;
    }
    size_t *work_buffer = PyMem_Calloc(name_size, sizeof(size_t));
Since we only work with strings up to a maximum length, I think we can statically allocate a fixed-size array here, which simplifies the code a lot.
Python/suggestions.c
Outdated
static inline int
substitution_cost(char a, char b)
{
    if ((a & 31) != (b & 31)) {
Please encapsulate the 31 in a macro or a static constant for readability.
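To illustrate the request above, here is a minimal Python sketch of the same check with the mask given a name. It is not the C code from the PR: `LOW_FIVE_BITS` is a hypothetical name, and the values of `MOVE_COST` and `CASE_COST` are assumptions chosen for the example. The idea is that an ASCII letter and its other-case form differ only in bit 5 (value 32), so comparing the low five bits cheaply rejects pairs that cannot be a case flip.

```python
LOW_FIVE_BITS = 31  # 0b11111; hypothetical name for the bare 31 in the diff
MOVE_COST = 2       # assumed value, for illustration only
CASE_COST = 1       # assumed value, for illustration only


def substitution_cost(a: str, b: str) -> int:
    """Cost of replacing ASCII character a with b (sketch, not the PR's code)."""
    if (ord(a) & LOW_FIVE_BITS) != (ord(b) & LOW_FIVE_BITS):
        # Low five bits differ, so this cannot be the same letter in another case.
        return MOVE_COST
    if a == b:
        return 0
    if a.lower() == b.lower():
        # Same letter, different case.
        return CASE_COST
    # Low bits collide (for example 'a' and '!') but the characters are unrelated.
    return MOVE_COST
```

Note that matching low bits are only a necessary condition: `'a'` (97) and `'!'` (33) share them, which is why the explicit case comparison still follows the mask test.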
@sweeneyde No rush if you don't have the bandwidth to do it, but if we get this ready before Monday we can get it into beta 1 :)
I tried modifying MAX_CANDIDATE_ITEMS and MAX_STRING_SIZE to be much larger, and I got these benchmark results:
Benchmark script:

import random
from test.support import captured_stderr
import sys
from time import perf_counter

STRING_LENGTH = 50
DIR_LENGTH = 8000

def rand_string(n):
    return ''.join(random.choices("abc", k=n))

def bench(repeat=100):
    rand_strings = [rand_string(STRING_LENGTH)
                    for _ in range(DIR_LENGTH)]

    class Class:
        locals().update({h: h for h in rand_strings})

    attr = rand_strings[DIR_LENGTH//2]
    attr = attr[:9] + "x" + attr[10:]

    t0 = perf_counter()
    for _ in range(repeat):
        try:
            getattr(Class, attr)
        except AttributeError as exc:
            with captured_stderr() as err:
                sys.__excepthook__(*sys.exc_info())
    t1 = perf_counter()

    assert "Did you mean" in err.getvalue(), err.getvalue()
    assert attr in err.getvalue()
    return (t1 - t0) / repeat

if __name__ == "__main__":
    print(f"{STRING_LENGTH = }")
    print(f"{DIR_LENGTH = }")
    for _ in range(5):
        print(f"{bench():.8f}", end=" ")
    print()
Thanks for the benchmarking, @sweeneyde. This is great work! Although performance is not critical here (because the interpreter is usually about to exit), it still matters in some edge cases, like a thread that is constantly raising exceptions (yeah, I know 😉), so I am pleased to see that this PR will make it faster. I guess that if we remove the call to the memory allocator and use stack allocations instead, it will be slightly faster as well, although for the average namespace this probably will not matter much. In any case, unless it is very expensive, I prefer to focus our efforts on usability and ensure that the suggestions are the best we can offer :)
If that is the priority, I can take a look in the next couple of hours at implementing the "transpositions" variant with a rotation of three matrix rows. Would that be good for this PR or should it be a later PR?
You mean the gcc version here https://github.com/gcc-mirror/gcc/blob/16e2427f50c208dfe07d07f18009969502c25dc8/gcc/spellcheck.c#L46-L142? I have implemented that and played with it a bit, but I didn't see any cases where it is obviously better than the ones we already cover. I was also a bit wary of performance, and the three calls to the allocator make it a bit messy. I changed them to static arrays of maximum length and saw a bit of a speedup in general, but I am not sure it was worth it. It would be great to have some "realistic" examples where that approach is better than the one in this PR or the one we already have, so we can justify the extra complexity.
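For readers unfamiliar with the "transpositions" variant discussed above, the following is a minimal Python sketch of an edit distance that also counts a swap of two adjacent characters as one edit, computed with three rotating rows instead of a full matrix. It is neither the gcc implementation nor code from this PR: `osa_distance` is a hypothetical name, and every edit costs 1 here rather than using the PR's weighted costs.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance with adjacent transpositions, using three rotating rows."""
    n = len(b)
    prev2 = [0] * (n + 1)      # row i-2 (unused until i == 2)
    prev = list(range(n + 1))  # row i-1; row 0 is the distance from "" to b[:j]
    curr = [0] * (n + 1)       # row i, overwritten on every pass
    for i in range(1, len(a) + 1):
        curr[0] = i  # distance from a[:i] to ""
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = min(prev[j] + 1,          # deletion
                       curr[j - 1] + 1,      # insertion
                       prev[j - 1] + cost)   # substitution or match
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                # Swap of two adjacent characters counts as a single edit.
                best = min(best, prev2[j - 2] + 1)
            curr[j] = best
        # Rotate: the oldest row becomes scratch space for the next pass.
        prev2, prev, curr = prev, curr, prev2
    return prev[n]
```

With this sketch, `osa_distance("ab", "ba")` is 1, where a plain Levenshtein distance would report 2. The rotation reuses the same three lists, which is what makes fixed-size buffers possible in a C version.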
I increased the range of possible
As this is still marked as WIP, tell me when it is ready to review and I will do another pass.
I think this should be ready to review again.
@@ -1565,21 +1565,87 @@ def test_name_error_bad_suggestions_do_not_trigger_for_small_names(self):
    def test_name_error_suggestions_do_not_trigger_for_too_many_locals(self):
We must find a way to make this test more bearable, maybe using compile(), as this is getting a bit wild. :(
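One way the compile() idea above could look is sketched here: instead of writing hundreds of local assignments by hand, the test builds the function source as a string and compiles it. This is only an illustration, not the test that ended up in the suite; `make_function_with_many_locals` is a hypothetical helper, and the local names `a0`, `a1`, ... are assumptions.

```python
def make_function_with_many_locals(count: int, missing_name: str):
    """Return a function that defines `count` locals, then reads `missing_name`."""
    assignments = "\n".join(f"    a{i} = None" for i in range(count))
    source = f"def f():\n{assignments}\n    {missing_name}\n"
    namespace = {}
    # compile() turns the generated source into a code object; exec() runs the
    # "def" statement so that the function object lands in `namespace`.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["f"]


f = make_function_with_many_locals(1000, "a0x")
try:
    f()
except NameError as exc:
    message = str(exc)  # a real test would inspect the printed traceback
```

A real test would then check whether the rendered traceback contains a suggestion, as the benchmark script earlier in the thread does with `sys.__excepthook__`.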
Oh, apparently we have some leaks:
We need to find the leaks before we merge
I fixed the refleaks in ce87e15
Checked for refleaks:
❯ ./python -m test test_capi -R :
0:00:00 load avg: 1.13 Run tests sequentially
0:00:00 load avg: 1.13 [1/1] test_capi
beginning 9 repetitions
123456789
.........
test_capi passed in 46.1 sec
== Tests result: SUCCESS ==
1 test OK.
Total duration: 46.1 sec
Tests result: SUCCESS
Thanks for the last-minute review and fix!
(name_size + item_size + 3) * MOVE_COST / 6
difflib.SequenceMatcher.ratio() >= 2/3
https://bugs.python.org/issue38530
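The two quantities named in the fragments above can be computed side by side with a short Python sketch. It does not claim that they are equivalent, only shows how each is evaluated. `max_distance` and `similarity_ratio` are hypothetical helper names, the value of `MOVE_COST` is an assumption, and integer division is assumed to mirror C arithmetic on sizes.

```python
from difflib import SequenceMatcher

MOVE_COST = 2  # assumed value; the thread names the constant but not its value


def max_distance(name: str, item: str) -> int:
    """Distance cutoff from the formula above, with integer division as in C."""
    return (len(name) + len(item) + 3) * MOVE_COST // 6


def similarity_ratio(name: str, item: str) -> float:
    """difflib's similarity measure: 2 * matches / total length, in [0, 1]."""
    return SequenceMatcher(None, name, item).ratio()
```

For example, `max_distance("abc", "abd")` evaluates `(3 + 3 + 3) * 2 // 6`, and `similarity_ratio("abcd", "abce")` counts three matching characters out of eight in total.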