Skip to content

Add combined opcodes, e.g. LOAD_FAST + BINARY_ADD = FAST_ADD #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

gvanrossum
Copy link

@gvanrossum gvanrossum commented Mar 24, 2021

This PR adds some opcodes that intend to speed up some common opcode pairs.

The opcodes FAST_ADD, FAST_SUBSCR, CONST_ADD, CONST_SUBSCR combine LOAD_FAST/CONST with BINARY_ADD/SUBSCR. These are pretty common opcode pairs. [Citation needed]

The INT_ADD/SUBSCR opcodes use the oparg as an immediate constant (unsigned byte only).

The existing optimization pass in the bytecode compiler generates these opcodes whenever it detects an appropriate pair.

Benchmarks are somewhat inconclusive -- we saw some unexplained slowdowns for the three pickle benchmarks, but likely those were due to a bug in INT_ADD (which I've now fixed -- but we haven't run the benchmark again yet).

PS: To resolve the conflict I would have to rebase to HEAD of main and then regenerate the frozen .h files.

In the compiler, collapse LOAD_FAST + BINARY_ADD into FAST_ADD,
and ditto for LOAD_FAST + BINARY_SUBSC.

Many others could follow (e.g. LOAD_CONST, and other binary
and unary opcodes).  But benchmarks are required first.

Also, a slight mystery: "python s.py" produces optimized bytecode,
but in the REPL typing "import s" does not. Why?
Solves the "slight mystery" I noted in the previous commit message.
- Comment out unnecessary (I believe) INCREF/DECREF pairs
- Some cleanup
- For values of 1 in range(20, 256)
- a+1 for a: int
- a[1] for a: tuple | list
@gvanrossum gvanrossum requested a review from markshannon March 24, 2021 23:01
case TARGET(INT_ADD): {
PyObject *left = TOP();
if (PyLong_CheckExact(left)) {
PyLongObject *ll = (PyLongObject *)left;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move this code into a PyObject *_PyLong_AddInt(PyLongObject *left, int right) inline function in longobject.h
It will make life easier if we want to improve the implementation of PyLongObject.

@markshannon
Copy link
Member

markshannon commented Mar 25, 2021

I don't think there is much room for super-instructions implemented in the compiler, because:

  1. They shouldn't include a bytecode that we will want to specialize, because specialization gains much more than removing the dispatch.
  2. They can't cross line boundaries.
  3. They can't include more than one instruction that has an operand.

Unfortunately, FAST_ADD, FAST_SUBSCR, CONST_ADD, CONST_SUBSCR all break rule 1.

INT_ADD is good because it is already specialized, and will rarely cross a line boundary.

Various combinations of LOAD_CONST, IS_OP and POP_JUMP_IF_XXX are also promising.

An an aside, I'd leave "dumb" super-instructions (where we just glue two instructions together) to a future interpreter generator. It will save a fair bit of copy-and-paste.

@markshannon
Copy link
Member

INT_SUSBSCR also breaks rule 1 because, unlike INT_ADD there is no correlation between the type of the two operands.
In x + 1 x is almost certainly an int or a float. In x[1] x could be a list, a tuple, a str, a bytes, an array.array, etc.

@gvanrossum
Copy link
Author

Thanks for the feedback, I'll close this for now. It was a fun project to work on though. :-)

@gvanrossum gvanrossum closed this Mar 25, 2021
markshannon pushed a commit that referenced this pull request Apr 18, 2022
…python#91466)

Fix an uninitialized bool in exception print context.
    
`struct exception_print_context.need_close` was uninitialized.
    
Found by oss-fuzz in a test case running under the undefined behavior sanitizer.
    
https://oss-fuzz.com/testcase-detail/6217746058182656
    
```
Python/pythonrun.c:1241:28: runtime error: load of value 253, which is not a valid value for type 'bool'
    #0 0xbf2203 in print_chained cpython3/Python/pythonrun.c:1241:28
    #1 0xbea4bb in print_exception_cause_and_context cpython3/Python/pythonrun.c:1320:19
    #2 0xbea4bb in print_exception_recursive cpython3/Python/pythonrun.c:1470:13
    #3 0xbe9e39 in _PyErr_Display cpython3/Python/pythonrun.c:1517:9
```
    
Pretty obvious what the ommission was upon code inspection.
@gvanrossum gvanrossum deleted the add-opcodes branch December 21, 2022 22:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants