Maximizing the advantages of (deep) freezing. #466

markshannon · 2022-09-19T15:31:19Z

If we can declare data as const in the C code, it can be shared across multiple interpreters at zero cost, and across processes at the cost of a mmap (near zero).

Unfortunately we can't do that for Python objects, at least not unless we have immortal objects.

There is also the degree of mutability to consider. Some objects (mainly strings) are immutable except for the hash code (and maybe refcount). Once the hashcode has been initialized, then they are immutable (except for the refcount). I refer to these objects as idempotent immutable; they are mutable, but become immutable once their hash has been taken.

There is little point in freezing mutable objects, as they cannot be shared and we need extra code to clean them up, e.g. `_PyStaticCode_Dealloc`.

Definitions:

const: Fully immutable to both Python and C code. Can be stored as const data.
Semi constant: Constant at Python level, constant at C level except for reference count.
Idempotent constant: Constant at Python level, constant at C level once initialized.
Idempotent semi constant: Constant at Python level, constant at C level except for reference count once initialized.

Deep-freezing of objects

| | Immutable | Idempotent immutable | Mutable |
|---|---|---|---|
| Example | `None` | `"String"` | `list()` |
| Immortal objects | `const` | Idempotent constant | Do not freeze |
| No immortal objects | Semi constant | Idempotent semi constant | Do not freeze |

`const` objects can be marked as const in the C source, which is the ideal for startup and sharing.
Idempotent constants need to be initialized on process creation, and can be shared between multiple interpreters.
Semi constants can usefully be frozen, but can only be used by the main interpreter.
Idempotent semi constants need to be initialized on process creation, and can only be used by the main interpreter. They are probably worth deep freezing, but the benefit is less clear.
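As a rough illustration, the table and the four notes above can be written down as a small lookup. This is only a sketch of the classification described in this comment; the names `freeze_category`, `SHAREABLE`, and `NEEDS_INIT` are made up for the example and do not exist in CPython.

```python
# Hypothetical model of the classification table above; not CPython code.
CATEGORIES = {
    # (mutability, immortal objects available) -> category
    ("immutable", True): "const",
    ("idempotent immutable", True): "idempotent constant",
    ("immutable", False): "semi constant",
    ("idempotent immutable", False): "idempotent semi constant",
}

# Categories that can be shared between multiple interpreters.
SHAREABLE = {"const", "idempotent constant"}

# Categories that need to be initialized on process creation.
NEEDS_INIT = {"idempotent constant", "idempotent semi constant"}


def freeze_category(mutability: str, immortal: bool) -> str:
    """Return the freezing category for an object kind."""
    if mutability == "mutable":
        return "do not freeze"
    return CATEGORIES[(mutability, immortal)]


# A string-like object with immortal objects available:
category = freeze_category("idempotent immutable", True)
print(category, category in SHAREABLE, category in NEEDS_INIT)
```

Without immortal objects the same string-like object falls into "idempotent semi constant", which is neither shareable nor free of startup work.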

Changing object layout to better use deep-freezing

Some objects are mutable but have large immutable parts. For example, code objects.
These objects should be laid out so that as much as possible of the object can be deep-frozen. Keeping the mutable parts in the object, with pointer(s) to the immutable part(s) should work well.
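A minimal Python model of that layout, assuming a split into one immutable part and one small mutable wrapper. The class and field names (`ImmutableCodePart`, `CodeObject`, `call_count`) are hypothetical and only stand in for whatever the real C structs would contain.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class ImmutableCodePart:
    """Stand-in for the part of a code object that could be deep-frozen."""
    name: str
    bytecode: bytes
    constants: tuple


@dataclass
class CodeObject:
    """Stand-in for the mutable part, holding a pointer to the frozen part."""
    shared: ImmutableCodePart
    call_count: int = 0  # example of per-object mutable state


part = ImmutableCodePart("f", b"\x00\x01", (None, 1))
a = CodeObject(part)
b = CodeObject(part)

a.call_count += 1  # mutating one wrapper does not touch the shared part

try:
    part.name = "g"  # the frozen part rejects writes
    write_rejected = False
except FrozenInstanceError:
    write_rejected = True
```

Both wrappers point at the same frozen part, so only the small mutable wrapper would need to be allocated separately for each user.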


ericsnowcurrently · 2022-09-19T17:19:47Z

CC @kumaraditya303

ericsnowcurrently · 2022-09-19T17:29:53Z

There should be quite a few things we can make const once we have immortal objects (PEP 683).

CC @eduardo-elizondo

ericsnowcurrently · 2022-09-19T17:30:20Z

> Some objects are mutable but have large immutable parts. For example, code objects.
> These objects should be laid out so that as much as possible of the object can be deep-frozen. Keeping the mutable parts in the object, with pointer(s) to the immutable part(s) should work well.

That makes a lot of sense.

gvanrossum · 2022-09-19T18:21:20Z

> There should be quite a few things we can make const once we have immortal objects (PEP 683).

But what to do about string hashes? We can't precompute those because of hash randomization.
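To see the problem concretely, here is a small sketch that hashes the same string in child interpreters started with different `PYTHONHASHSEED` values. The helper name `hash_with_seed` and the sample string are made up for the example.

```python
import os
import subprocess
import sys


def hash_with_seed(seed: int, text: str = "deep-freeze") -> int:
    """Return hash(text) as computed by a child interpreter with a fixed seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# The same seed reproduces the hash; a different seed changes it.
print(hash_with_seed(1), hash_with_seed(1), hash_with_seed(2))
```

With randomization enabled and no fixed seed, the value differs from one process to the next, so a hash stored in the frozen data at build time would not match what a given process computes.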

kumaraditya303 · 2022-09-24T17:16:49Z

> But what to do about string hashes? We can't precompute those because of hash randomization.

They currently get computed implicitly when they are added to the global interned string table. This happens very early in interpreter startup, so it is safe to share them between interpreters. We effectively make the main interpreter responsible for managing the table, and since the main interpreter is destroyed last (or at least it should be once the per-interpreter GIL is implemented), this is safe.

gvanrossum · 2022-09-24T22:11:13Z

Not all strings get interned currently -- only strings that generally qualify for interning, i.e. strings whose value looks like an identifier. We can fix that of course.

But even if we fix this, string objects cannot be made part of the 'const' data segment in the linker. And until the ob_shash field from PyBytesObject is removed (it was deprecated in 3.11) neither can bytes objects.

eduardo-elizondo · 2022-10-03T15:56:13Z

@markshannon in the past, we've explored internally moving a couple of objects to the .rodata section, with great results. I have not personally tried this with immortal objects, but I agree that this could maximize the potential of having immortal objects.

Re: code objects, I've explored some of these ideas in the past for the Language Summit presentation. One of the optimization techniques was freezing some of the internal objects in the code object. Another option is to freeze the code object itself, which removes the frame's code object decref in _PyFrame_Clear, and to pair this with a way to manually clean up code objects at the end of execution. All of this improves perf.

However, we're not using any of these in the current immortal proposal, since it probably requires its own discussion around how to properly deal with code objects under the assumption that we can make them (or parts of them) immortal.

markshannon · 2023-03-17T16:44:45Z

See #566 for how to handle code objects and still deep freeze much of the data.

carljm · 2023-03-18T00:16:44Z

> Some objects are mutable but have large immutable parts. For example, code objects.
> These objects should be laid out so that as much as possible of the object can be deep-frozen. Keeping the mutable parts in the object, with pointer(s) to the immutable part(s) should work well.

Is anyone currently working on this? If not, we might be interested in getting it in for 3.12.

gvanrossum · 2023-03-18T00:28:06Z

I don't know if anyone is working on this, but I anticipate a small glitch: currently, nested code objects are included in the co_consts tuple. If the idea is to share the constant portion of a deep-frozen code object between subinterpreters, we'd have to make an exception if co_consts contains other frozen code objects -- which would be the case for code objects associated with modules, classes (if they have methods), and anything containing nested functions, lambdas, generator expressions, or (unless/until PEP 709 is accepted) comprehensions.

To get around this, we could just duplicate the co_consts tuple if it contains a code object (as indicated by co_flags). We'd also need to recurse down to apply the same treatment to the nested code objects.
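For illustration, the nesting can be seen from Python by walking `co_consts` recursively. This sketch finds nested code objects with an `isinstance` check rather than `co_flags`; the helper name `nested_code_objects` is made up for the example.

```python
import types


def nested_code_objects(code):
    """Yield every code object reachable through co_consts, depth first."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield const
            yield from nested_code_objects(const)


source = """
def outer():
    def inner():
        return lambda: 1
    return inner
"""

module_code = compile(source, "<example>", "exec")
names = [code.co_name for code in nested_code_objects(module_code)]
print(names)
```

Each level of nesting adds another code object to its parent's `co_consts`, which is why any duplication of the tuple would have to recurse.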

carljm · 2023-03-18T00:39:53Z

So I think there are two separable (but sequential) action items here:

1. Splitting mutable and immutable parts of a code object. (Along with immortalization) this has standalone benefits, e.g. for improving memory sharing and minimizing copy-on-write in a forking workload. This is actually our primary immediate interest for 3.12. For this purpose, it's no problem for the immutable part to contain (immutable) pointers to (mutable) objects, as is the case with code objects in co_consts.
2. Actually sharing the immutable parts of deepfrozen code objects between sub-interpreters. I think the handling of nested code objects is a question that can be deferred until someone tackles this; I don't anticipate trying to do this for 3.12 (but if we do (1) it would be a step towards allowing someone to tackle (2).)

carljm · 2023-03-18T00:55:31Z

I created python/cpython#102802 to track specifically (1) as a python/cpython issue.

markshannon · 2024-10-24T09:36:29Z

We've abandoned deep freezing.
Freezing gives us almost all of the performance benefits without the complexity.

markshannon mentioned this issue Sep 19, 2022

Improve memory use, sharing and start up with better code objects. #465

Open

4 tasks

carljm mentioned this issue Mar 18, 2023

split mutable and immutable parts of code objects python/cpython#102802

Open

markshannon closed this as not planned Oct 24, 2024
