Never have GET_ITER not followed by FOR_ITER #71314

serprex · 2016-05-26T01:44:25Z

BPO	27127
Nosy	@rhettinger, @DinoV, @serhiy-storchaka, @serprex
Files	gitfit.patch forbegin.patch: Replace GET_ITER/FOR_ITER with FOR_BEGIN/FOR_ITER forbegin2.patch forbegin3.patch gitfit2.patch: Pass test_genexps & test_dis

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2016-06-08.15:59:56.410>
created_at = <Date 2016-05-26.01:44:24.678>
labels = ['interpreter-core', 'performance']
title = 'Never have GET_ITER not followed by FOR_ITER'
updated_at = <Date 2016-06-08.15:59:56.409>
user = 'https://github.com/serprex'

bugs.python.org fields:

activity = <Date 2016-06-08.15:59:56.409>
actor = 'Demur Rumed'
assignee = 'none'
closed = True
closed_date = <Date 2016-06-08.15:59:56.410>
closer = 'Demur Rumed'
components = ['Interpreter Core']
creation = <Date 2016-05-26.01:44:24.678>
creator = 'Demur Rumed'
dependencies = []
files = ['43006', '43049', '43271', '43295', '43300']
hgrepos = []
issue_num = 27127
keywords = ['patch']
message_count = 19.0
messages = ['266400', '266453', '266609', '266615', '266617', '266618', '266622', '266624', '266631', '266657', '267468', '267476', '267583', '267584', '267722', '267768', '267769', '267787', '267789']
nosy_count = 4.0
nosy_names = ['rhettinger', 'dino.viehland', 'serhiy.storchaka', 'Demur Rumed']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'patch review'
status = 'closed'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue27127'
versions = ['Python 3.6']

serprex · 2016-05-26T01:44:22Z

This is a small change to comprehensions pass in their iterable rather than calling GET_ITER before CALL_FUNCTION. This makes it so that the compiler never generates GET_ITER without following it with FOR_ITER, nor does it generate FOR_ITER without preceding it by GET_ITER

This is the standalone portion of a more constructive patch I'm wanting to submit: Replace GET_ITER with a FOR_BEGIN which effectively acts as GET_ITER/FOR_ITER. Then modify FOR_ITER to replace the JUMP_ABSOLUTE by changing it to a JABS instruction which jumps if the iterator does not return null. This reduces the bytecode of by 2 bytes for every for loop & reduces the dispatch count per loop by 1 (As we merge GET_ITER/FOR_ITER & JUMP_ABSOLUTE/FOR_ITER)

DinoV · 2016-05-26T20:31:42Z

I like how this makes the loop opcodes more regular - it should make things like Pyjion (http://www.github.com/Microsoft/Pyjion) which translate bytecode into native code. We already have some code (currently commented out) checking that GET_ITER is followed by FOR_ITER to do some optimizations around loops.

Another issue we ran into was that this leaves the iterable on the stack while the loop is running... I'm just curious if your larger change alter that in anyway?

serprex · 2016-05-29T18:24:31Z

Attached is the larger potential change of replacing GET_ITER/FOR_ITER with FOR_BEGIN/FOR_ITER. I took awhile to getting this put together due to missing that CONTINUE_LOOP was jumping to the top of the loop, skipping past FOR_ITER

serhiy-storchaka · 2016-05-29T19:17:41Z

Interesting.

How it works with continue?

    for x in a:
        if x: continue
        f()

In this case we can use new FOR_ITER instead of JUMP_ABSOLUTE. I would rename FOR_ITER to something more appropriate (maybe containing NEXT or CONTINUE).

How it works with continue in the try block?

    for x in a:
        try:
            if x: continue
            f()
        except Error:
            e()

serprex · 2016-05-29T19:31:38Z

Currently it'll work since in an except it'll generate a CONTINUE_LOOP that'll jump to the end of the loop & either jump back to the start or to the end

Your example is incorrect. If the continue's JUMP_ABS were a FOR_ITER then if we were on the last iteration of the loop we would end up executing the loop body with an invalid stack. So you'd have to follow the FOR_ITER with a JUMP_ABS past the loop. Not sure if there's a speed advantage worth complexity/larger wordcode

serhiy-storchaka · 2016-05-29T19:37:34Z

Ah, sorry. You are right.

Do you have any microbenchmarks that shows the benefit?

serprex · 2016-05-29T19:59:10Z

Pretty small perf increase:

timeit("for a in range(256):pass", number=100000)
Without patch: ~2.1
With patch: ~1.84

pybench is 1-2% faster. Would prefer others' results. Especially a benchmark where it doesn't wrap payloads in for loops

serprex · 2016-05-29T20:03:32Z

Microbenchmark of continue regression: timeit("for a in range(256):\n\tif a:continue",number=100000)
Without patch: ~3.57
With patch: ~3.64

serprex · 2016-05-29T20:29:22Z

Summer heat is getting to me. Please ignore this issue until I've uploaded a fixed patch

serprex · 2016-05-30T01:13:31Z

Two issues:

One: a = (i for i in 6) does not throw until calling next(a). This applies to the first patch. If it's unacceptable to defer the throw then this whole issue should be closed unless there's interest in having a GET_ITER, FOR_ITER, FOR_BEGIN, FOR_NEXT set of opcodes between comprehensions & for loops. That seems excessive

Two: tracing is getting a bit confused by FOR_ITER being at the end of the loop body. The clearest example is that in test_pdb's test_pdb_until_command_generator it needs 3 steps instead of 2 to trace print(i) after the 'until 4' because whereas currently it traces the for i in test_gen() before calling into the iterator, it now calls into the iterator first & traces for i in test_gen() only after returning. Additionally it doesn't trace for i in test_gen() at the end of iteration, instead tracing the last line

It seems fragile to try fix this up by having the FOR_ITER tail receive a negative lnotab line offset to place it on the same line as the loop heading. My own research of the code base has been trying to determine if I can manually emit a trace in the FOR_ITER opcode, but I'd rather not muck with the frame to carry out this façade

serhiy-storchaka · 2016-06-05T19:53:25Z

It looks to me this idea is dead.

serprex · 2016-06-05T20:58:43Z

I've gotten most tests to past by having FOR_ITER be traced as if the instruction index is that of the corresponding FOR_BEGIN. test_sys_settrace still fails on test_15_loops because an empty loop body doesn't have the 'pass' line traced (ie when FOR_ITER starts the line) which I'm currently pondering ways around

The first patch, which only moved GET_ITER into the closure, would still be good for list/set/dict comprehensions (to help PREDICT & JITs)

If there's essentially a decision that all loops should have JUMP_ABSOLUTE to their beginning for the sake of tracing simplicity, then FOR_BEGIN/FOR_ITER are dead

serprex · 2016-06-07T03:55:36Z

Attached is a patch which fixes test_sys_settrace & test_pdb & test_trace. Still failing is test_genexps. I'm unsure the best way to fix this one:

Allow generator expressions to defer type errors about non iterables until the initial call to next
Create a TEST_ITER opcode entirely for generator expressions to eagerly throw
Generate instructions like
FOR_BEGIN, if empty, jump over generator & put an empty generator on the stack
-> CALL_FUNCTION 2, have generator produce code as LOAD_FAST 0, LOAD_FAST 1, <code>, FOR_ITER repeat <code> (ie translate stack of FOR_BEGIN from before call as header of generator)

Empty generator can be created with 'LOAD_CONST (None,), FOR_BEGIN, POP' but it seems rather bad to have generators require 3 more opcodes around their creation than currently

serhiy-storchaka · 2016-06-07T04:13:07Z

What if generate following code?

GET_ITER
JUMP_FORWARD L2

L1:
# ...
L2:
FOR_ITER L1

This saves one jump instruction per iteration.

Next, we can merge GET_ITER+JUMP_FORWARD in one FOR_BEGIN instruction for regular loops. Generators will start with JUMP_FORWARD.

rhettinger · 2016-06-07T18:48:08Z

I don't think this should go forward. The current FOR_ITER and JUMP_ABSOLUTE combination is very efficient and shouldn't be changed lightly. It is the inside of the loop that matters -- the GET_ITER step is outside of the loop and isn't expensive.

Also, I really like that GET_ITER and FOR_ITER correspond exactly to our high level understanding of how for-loops operate:

   for x in s:
       f(x)

corresponds to:

   it = iter(s)
   while True:
       try:
           x = next(it)
       except StopIteration:
           break
       f(x)

I really don't think this should be pursued further. It messes with my understanding of for-loops, it combines features that should remain decoupled, it focuses its efforts on the exterior of the loop, and it alters code that is very mature (having been examined and tweaked by hundreds of your predecessors).

serprex · 2016-06-08T03:38:08Z

Attaching forbegin3.patch. It reintroduces GET_ITER for the sole purpose of eagerly throwing. I decided to reuse GET_ITER over something like TEST_ITER as this way we can have GET_ITER flow into FOR_BEGIN & rely on the fast path of iter(iter(x))

GET_ITER/JUMP_FORWARD idea doesn't work because FOR_ITER is carefully setup currently to trace as existing on 2 separate lines. If we JUMP_FORWARD into FOR_ITER then that tracing triggers & our trace will say we executed the last line of the loop immediately before executing the iteration logic

serprex · 2016-06-08T03:42:18Z

Didn't see Raymond's response before posting, forbegin3 at least exists as a completion of the experiment to a passes-tests state. The tracing hacks to support an instruction corresponding to two separate lines support rejecting this idea

serprex · 2016-06-08T05:26:14Z

I should've kept gitfit & forbegin more separated as issues. Attached is gitfit2, which only folds the GET_ITER into the comprehension if it isn't a generator so to pass test_genexps

serhiy-storchaka · 2016-06-08T05:36:06Z

As Dino noted in msg266453, this leaves the iterable on the stack while the loop is running. I think opcode reworking shouldn't change behavior. You should call GET_ITER or FOR_BEGIN before calling generator code.

serprex mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels May 26, 2016

serprex mannequin closed this as completed Jun 8, 2016

ezio-melotti transferred this issue from another repository Apr 10, 2022

serhiy-storchaka mentioned this issue Apr 12, 2022

Optimizing FOR_ITER bytecode #91432

Closed

Uh oh!

Never have GET_ITER not followed by FOR_ITER #71314

Never have GET_ITER not followed by FOR_ITER #71314

Comments

serprex mannequin commented May 26, 2016

serprex mannequin commented May 26, 2016

Uh oh!

DinoV commented May 26, 2016

Uh oh!

serprex mannequin commented May 29, 2016

Uh oh!

serhiy-storchaka commented May 29, 2016

Uh oh!

serprex mannequin commented May 29, 2016

Uh oh!

serhiy-storchaka commented May 29, 2016

Uh oh!

serprex mannequin commented May 29, 2016

Uh oh!

serprex mannequin commented May 29, 2016

Uh oh!

serprex mannequin commented May 29, 2016

Uh oh!

serprex mannequin commented May 30, 2016

Uh oh!

serhiy-storchaka commented Jun 5, 2016

Uh oh!

serprex mannequin commented Jun 5, 2016

Uh oh!

serprex mannequin commented Jun 7, 2016

Uh oh!

serhiy-storchaka commented Jun 7, 2016

Uh oh!

rhettinger commented Jun 7, 2016

Uh oh!

serprex mannequin commented Jun 8, 2016

Uh oh!

serprex mannequin commented Jun 8, 2016

Uh oh!

serprex mannequin commented Jun 8, 2016

Uh oh!

serhiy-storchaka commented Jun 8, 2016

Uh oh!