Skip to content

Faster dispatch #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Faster dispatch #3

wants to merge 2 commits into from

Conversation

markshannon
Copy link
Member

For review purposes. Please don't merge.

This should 🤞 result in MSVC producing a dispatch sequence as efficient as GCC with computed gotos.
This (in theory) works by:

  • Creating an exhaustive set of switch targets and limiting the opcode to a single byte, so there is no need for any range checking.
  • Generating unique code for each target, so the compiler can't merge them.
  • Removing all code between the fetch/decode and dispatch, allowing the compiler to merge the goto with the computed jump.

The dispatch code should end up looking something like:

        movzx   ecx, WORD PTR [edi]
        add     edi, 2
        mov     esi, ecx
        movzx   eax, cl
        shr     esi, 8
        jmp     TABLE[eax*4]

The important thing is that there is a computed jump at the end of the instruction code.

* Do fetch and decode at end of opocde then jump directly to switch.
  Should allow compilers, specifically MSVC to combine the two jumps.

* Remove LLTRACE macros. They are no use for debugging, and add a lot of clutter.
@markshannon
Copy link
Member Author

If this works, we can remove all the PREDICT macros as well.

@gvanrossum
Copy link

See email I sent with Disassembly.txt

@gvanrossum
Copy link

Here's the disassembly for all of _PyEval_EvalFrameDefault() with some extra stuff before and after it.

Disassembly.txt

@markshannon
Copy link
Member Author

There still seem to a lot of conditional jumps and even a call in the DISPATCH() code.
Do you have LLTRACE or DTRACE enabled?

@markshannon
Copy link
Member Author

Closing this.
I'll make another PR with just the extra opcodes, once https://bugs.python.org/issue43760 is done

@markshannon markshannon closed this Apr 7, 2021
@gvanrossum
Copy link

gvanrossum commented Apr 7, 2021 via email

@markshannon markshannon deleted the faster-dispatch branch April 14, 2021 13:26
markshannon pushed a commit that referenced this pull request Apr 18, 2022
…python#91466)

Fix an uninitialized bool in exception print context.
    
`struct exception_print_context.need_close` was uninitialized.
    
Found by oss-fuzz in a test case running under the undefined behavior sanitizer.
    
https://oss-fuzz.com/testcase-detail/6217746058182656
    
```
Python/pythonrun.c:1241:28: runtime error: load of value 253, which is not a valid value for type 'bool'
    #0 0xbf2203 in print_chained cpython3/Python/pythonrun.c:1241:28
    #1 0xbea4bb in print_exception_cause_and_context cpython3/Python/pythonrun.c:1320:19
    #2 0xbea4bb in print_exception_recursive cpython3/Python/pythonrun.c:1470:13
    #3 0xbe9e39 in _PyErr_Display cpython3/Python/pythonrun.c:1517:9
```
    
Pretty obvious what the ommission was upon code inspection.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants