Description
Currently, when a class, function, or module has any annotations, we always generate an __annotate__ function object at import time. A function object takes 168 bytes. But in most cases, all of the relevant fields on an __annotate__ function are predictable (there's no docstring, no defaults or kwdefaults, the name is __annotate__, etc.). So we could save significant memory by constructing only a smaller object and constructing the function on demand when somebody asks for it (by accessing __annotate__).
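For anyone who wants to check the numbers on their own build, here is a rough sketch that measures the shallow size of a function object and its code object. The names sample and shallow_sizes are made up for illustration, and the values depend on the Python version and platform, so they may not match the figures quoted in this issue.

```python
import sys


def sample(x: int) -> int:
    """Hypothetical annotated function used only as a measurement target."""
    return x


def shallow_sizes(func):
    """Return the shallow sizes (in bytes) of a function and its code object.

    sys.getsizeof does not follow references, so this counts only the
    objects themselves, not the globals dict, constants, or names they use.
    """
    return {
        "function": sys.getsizeof(func),
        "code": sys.getsizeof(func.__code__),
    }


sizes = shallow_sizes(sample)
print(sizes)
```

Note that the code object's reported size includes its bytecode, so it grows with the length of the function body.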
We need the following to create an __annotate__ function object:
- The code object itself. That's inescapable.
- The globals dict. For function annotations, we can reuse the function's globals. For module annotations, we can use the module dict. But for classes, the __annotate__ descriptor can't easily get to the globals dict. To do this, we may need a new bytecode that just loads the current globals.
- The closure tuple. Module annotations never have this, classes always have it (a reference to the classdict), functions often have it (always for methods, never for global functions, often for nested functions).
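To illustrate that these three pieces are enough, here is a minimal Python-level sketch that takes an existing function apart and rebuilds an equivalent one with types.FunctionType. The helper make_annotate_like is a hypothetical stand-in for a compiler-generated annotate function; the real work would happen in C inside the interpreter.

```python
import types


def make_annotate_like():
    """Hypothetical stand-in for a compiler-generated annotate function."""
    local_type = int  # captured by the inner function, so it gets a closure cell

    def __annotate__(format):
        return {"x": local_type}

    return __annotate__


original = make_annotate_like()

# Rebuild a function from just the code object, globals dict, and closure tuple.
rebuilt = types.FunctionType(
    original.__code__,     # the code object
    original.__globals__,  # the globals dict
    "__annotate__",        # the name is always predictable
    None,                  # no defaults
    original.__closure__,  # the closure tuple (None when there are no cells)
)

print(rebuilt(1))
```

The rebuilt function is a distinct object from the original but shares the same code, globals, and cells.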
I am thinking of a format where __annotate__ can be any of the following:
- A function, like today
- A bare code object
- A tuple containing a code object at position 0, optionally a globals dict at position 1, plus any number of cell objects
__annotate__ getters would have to recognize the second and third cases and translate them into function objects on the fly. As a result, users accessing .__annotate__ would never see the tuple, though those who peek directly into a module or class's __dict__ might.
Other related opportunities for optimization:
- Tools like functools.wraps would unnecessarily force materialization of the __annotate__ function. Not sure there's an elegant solution for this.
- The function objects created for various PEP 695/696 objects (e.g., TypeVar bounds) work very similarly to annotate functions, and we could apply the same optimization to them.
- A code object by itself is also pretty big (232 bytes), and many of its fields are not needed for an annotate function that may never get executed. We could internally create a more streamlined "mini-codeobject" and materialize the real code object only when necessary.