-
Notifications
You must be signed in to change notification settings - Fork 26
Add combined opcodes, e.g. LOAD_FAST + BINARY_ADD = FAST_ADD #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In the compiler, collapse LOAD_FAST + BINARY_ADD into FAST_ADD, and ditto for LOAD_FAST + BINARY_SUBSC. Many others could follow (e.g. LOAD_CONST, and other binary and unary opcodes). But benchmarks are required first. Also, a slight mystery: "python s.py" produces optimized bytecode, but in the REPL typing "import s" does not. Why?
Solves the "slight mystery" I noted in the previous commit message.
- Comment out unnecessary (I believe) INCREF/DECREF pairs - Some cleanup
- For values of 1 in range(20, 256) - a+1 for a: int - a[1] for a: tuple | list
case TARGET(INT_ADD): { | ||
PyObject *left = TOP(); | ||
if (PyLong_CheckExact(left)) { | ||
PyLongObject *ll = (PyLongObject *)left; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this code into a PyObject *_PyLong_AddInt(PyLongObject *left, int right)
inline function in longobject.h
It will make life easier if we want to improve the implementation of PyLongObject
.
I don't think there is much room for super-instructions implemented in the compiler, because:
Unfortunately, FAST_ADD, FAST_SUBSCR, CONST_ADD, CONST_SUBSCR all break rule 1.
Various combinations of An an aside, I'd leave "dumb" super-instructions (where we just glue two instructions together) to a future interpreter generator. It will save a fair bit of copy-and-paste. |
|
Thanks for the feedback, I'll close this for now. It was a fun project to work on though. :-) |
…python#91466) Fix an uninitialized bool in exception print context. `struct exception_print_context.need_close` was uninitialized. Found by oss-fuzz in a test case running under the undefined behavior sanitizer. https://oss-fuzz.com/testcase-detail/6217746058182656 ``` Python/pythonrun.c:1241:28: runtime error: load of value 253, which is not a valid value for type 'bool' #0 0xbf2203 in print_chained cpython3/Python/pythonrun.c:1241:28 #1 0xbea4bb in print_exception_cause_and_context cpython3/Python/pythonrun.c:1320:19 #2 0xbea4bb in print_exception_recursive cpython3/Python/pythonrun.c:1470:13 #3 0xbe9e39 in _PyErr_Display cpython3/Python/pythonrun.c:1517:9 ``` Pretty obvious what the ommission was upon code inspection.
This PR adds some opcodes that intend to speed up some common opcode pairs.
The opcodes FAST_ADD, FAST_SUBSCR, CONST_ADD, CONST_SUBSCR combine LOAD_FAST/CONST with BINARY_ADD/SUBSCR. These are pretty common opcode pairs. [Citation needed]
The INT_ADD/SUBSCR opcodes use the oparg as an immediate constant (unsigned byte only).
The existing optimization pass in the bytecode compiler generates these opcodes whenever it detects an appropriate pair.
Benchmarks are somewhat inconclusive -- we saw some unexplained slowdowns for the three pickle benchmarks, but likely those were due to a bug in INT_ADD (which I've now fixed -- but we haven't run the benchmark again yet).
PS: To resolve the conflict I would have to rebase to HEAD of main and then regenerate the frozen .h files.