PEP 649: Avoid creation of function objects for __annotate__ #124157

Open
@JelleZijlstra

Description

Currently, when a class, function, or module has any annotations, we always generate an __annotate__ function object at import time. A function object takes 168 bytes. But in most cases, all of the relevant fields on an __annotate__ function are predictable (there's no docstring and no defaults or kwdefaults, the name is __annotate__, etc.). So we could save significant memory by storing only a smaller object at import time and constructing the function on demand when somebody asks for it (by accessing __annotate__).

We need the following to create an __annotate__ function object:

  • The code object itself. That's inescapable.
  • The globals dict. For function annotations, we can reuse the function's globals. For module annotations, we can use the module dict. But for classes, the __annotate__ descriptor can't easily get to the globals dict. To make the globals available there, we may need a new bytecode instruction that just loads the current globals.
  • The closure tuple. Module annotations never have one; classes always have one (a reference to the classdict); functions often have one (always for methods, never for global functions, often for nested functions).
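
As a rough illustration of the three ingredients above, the following sketch rebuilds a function from a code object, a globals dict, and a closure tuple using `types.FunctionType`. The helper `make_annotate` and the nested `template` function are made-up names for this example, not anything CPython defines; the real materialization would happen in C.

```python
import types


def make_annotate(code, globals_dict, closure=None):
    """Build a function named __annotate__ from its three ingredients.

    Hypothetical helper for illustration only.
    """
    # Signature: FunctionType(code, globals, name=None, argdefs=None, closure=None)
    return types.FunctionType(code, globals_dict, "__annotate__", None, closure)


def outer():
    T = int  # captured by the nested function, so it is stored in a cell

    def template(format):
        return {"x": T}

    return template


template = outer()

# Take the function apart and put it back together from its pieces.
rebuilt = make_annotate(
    template.__code__,      # the code object
    template.__globals__,   # the globals dict
    template.__closure__,   # the closure tuple (one cell, holding T)
)
result = rebuilt(1)
```

Everything else on the rebuilt function (no defaults, no kwdefaults, a fixed name) is the same for every annotate function, which is why it does not need to be stored up front.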

I am thinking of a format where __annotate__ can be any of the following:

  • A function, like today
  • A bare code object
  • A tuple containing a code object at position 0, optionally a globals dict at position 1, plus any number of cell objects

__annotate__ getters would have to recognize the second and third cases and translate them into function objects on the fly. As a result, users accessing .__annotate__ would never see the tuple, though those who peek directly into a module or class's __dict__ might.

Other related opportunities for optimization:

  • Tools like functools.wraps would unnecessarily force materialization of the __annotate__ function. Not sure there's an elegant solution for this.
  • The function objects created for various PEP 695/696 objects (e.g., TypeVar bounds) work very similarly to annotate functions, and we could apply the same optimization to them.
  • A code object by itself is also pretty big (232 bytes), and many of its fields are not needed for an annotate function that may never get executed. We could internally create a more streamlined "mini-codeobject" and materialize the real code object only when necessary.
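
To check the sizes quoted in this issue on a particular build, something like the following can be used. The `annotate` function here is a hand-written stand-in for a compiler-generated one, and the numbers reported by `sys.getsizeof` vary by Python version and platform, so they may not match the figures above.

```python
import sys


def annotate(format):
    # Stand-in for a compiler-generated __annotate__ function.
    return {"x": int}


function_size = sys.getsizeof(annotate)
code_size = sys.getsizeof(annotate.__code__)

print(f"function object: {function_size} bytes")
print(f"code object:     {code_size} bytes")
```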

Labels

3.14 (bugs and security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), topic-typing
