-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
bpo-43179: Generalise alignment for optimised string routines #24624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA). CLA MissingOur records indicate the following people have not signed the CLA: For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day You can check yourself to see if the CLA has been received. Thanks again for the contribution, we look forward to reviewing it! |
@glaubitz Would be good if you could verify there are no regressions on m68k, please :) |
499dadf
to
62b132b
Compare
Thanks! Working on it. We should probably pull in @andreas-schwab as well. |
On m68k, alignments of primitives is more relaxed, with 4-byte and 8-byte types only requiring 2-byte alignment, thus using sizeof(size_t) does not work. Instead, use the portable alternative. Note that this is a minimal fix that only relaxes the assertion and the condition for when to use the optimised version remains overly strict. Such issues will be fixed tree-wide in the next commit. NB: In C11 we could use _Alignof(size_t) instead, but for compatibility we use autoconf.
62b132b
to
e046b24
Compare
@jrtc27 This triggers an assertion:
|
e046b24
to
31947b7
Compare
Don't know if I found the cause of that, but I had been a bit too aggressive with the changes and misunderstood one part of the code (which was particularly annoying to fix, so glad it's not needed). Try again now? Have you also verified that it was a regression compared with master? |
I applied the patches against the python3.10 package from experimental which builds fine: With the patches, the assertion fails. I'm trying git master with your patches now. |
Tried with git master - works fine, breaks with your changes, unfortunately:
|
31947b7
to
e30c139
Compare
C only requires that sizeof(x) is a multiple of alignof(x), not that the two are equal. Thus anywhere where we optimise based on alignment we should be using alignof(x) not sizeof(x). This is more annoying than it would be in C11 where we could just use _Alignof(x) (and alignof(x) in C++11), but since we still require only C99 we must plumb the information all the way from autoconf through the various typedefs and defines.
e30c139
to
e81d2ea
Compare
Changes are fine now and the assertion no longer fails on m68k. Great work and thank you! |
Also verified to work on 32-bit PowerPC and 64-bit SPARC. |
/* Help allocation */ | ||
const char *_p = p; | ||
while (_p < aligned_end) { | ||
while (_p + SIZEOF_SIZE_T <= end) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution. It looks is good change.
But I have questions about one particular change. Are you sure that there is no overflow here? Are you sure that all compilers optimize this code so that the loop does not contain superfluous addition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can rewrite those all as end - _p >= SIZEOF_SIZE_T
if you'd rather, as yes technically this can generate a pointer slightly out of bounds (and in theory overflow if you have a pointer to the last page of the address space, though in practice I hope that page is never given out as most software is unlikely to work if it's used...), though this is a common enough pattern for software that I think compilers end up having to support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But also, it's 2021, not the late 90s, any compiler worth your time is already rewriting this entire thing behind your back (e.g. the "help allocation" is completely meaningless for any compiler you'll have encountered in the past 10+ years, that's such a basic optimisation for it to do).
Ping? Any news on this? Would be great if this change could be integrated for 3.10 so that there is less arch-specific code in CPython. |
So, any chance we can get this further? |
Thanks for taking the time to implement this, Jessica. Thanks @glaubitz for the ping. |
Thanks a lot for merging this \o/. |
The recent "Move support of legacy platforms/architectures outside Python" thread pointed out that there is currently a gross hack for m68k. Such a hack is not needed, just correct code that doesn't rely on so much implementation-defined behaviour. Thus, introduce and correctly use ALIGNOF_FOO in place of SIZEOF_FOO for alignment-related code in optimised string routines.
The first commit is required for correctness on architectures where alignof(x) is less than sizeof(x) due to the hardware having weaker alignment requirements for certain types (early m68k processors had a 16-bit data bus, and so 32-bit loads/stores were permitted to be only 16-bit aligned, though more recent ones likely have a full 32-bit data bus and have to deal with unaligned accesses much like modern x86 and Arm cores) but makes no changes that are otherwise not strictly required.
The second much larger commit fixes the many variants of optimised string routines to condition themselves on alignof(x) not sizeof(x). This is not required for correctness, but is good practice (and ensures that asserts with sizeof(x) rather than alignof(x) don't start to creep back in); all that happens when the conditions are too strict is they fall back on less-optimised versions.
https://bugs.python.org/issue43179