Skip to content

[Enhancement] Speed up setting and deleting mutable attributes on non-dataclass subclasses of frozen dataclasses #102578

Closed
@XuehaiPan

Description

@XuehaiPan

Feature or enhancement

The dataclasses library provides an easy way to create classes. The library will automatically generate relevant methods for the users.

Creating dataclasses with argument frozen=True will automatically generate methods __setattr__ and __delattr__ in _frozen_get_del_attr.

This issue proposes to change the tuple-based lookup to set-based lookup. Reduce the time complexity from $O(n)$ to $O(1)$.

In [1]: # tuple-based

In [2]: %timeit 'a' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
9.91 ns ± 0.0982 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [3]: %timeit 'd' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
33.2 ns ± 0.701 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [4]: %timeit 'g' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
56.4 ns ± 0.818 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [5]: # set-based

In [6]: %timeit 'a' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11.3 ns ± 0.0723 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [7]: %timeit 'd' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11 ns ± 0.106 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [8]: %timeit 'g' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11.1 ns ± 0.126 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

A tiny benchmark script:

from contextlib import suppress
from dataclasses import FrozenInstanceError, dataclass

@dataclass(frozen=True)
class Foo2:
    a: int
    b: int

foo2 = Foo2(1, 2)

def bench2(inst):
    with suppress(FrozenInstanceError):
        inst.a = 0
    with suppress(FrozenInstanceError):
        inst.b = 0

@dataclass(frozen=True)
class Foo7:
    a: int
    b: int
    c: int
    d: int
    e: int
    f: int
    g: int

foo7 = Foo7(1, 2, 3, 4, 5, 6, 7)

def bench7(inst):
    with suppress(FrozenInstanceError):
        inst.a = 0
    with suppress(FrozenInstanceError):
        inst.b = 0
    with suppress(FrozenInstanceError):
        inst.c = 0
    with suppress(FrozenInstanceError):
        inst.d = 0
    with suppress(FrozenInstanceError):
        inst.e = 0
    with suppress(FrozenInstanceError):
        inst.f = 0
    with suppress(FrozenInstanceError):
        inst.g = 0

class Bar(Foo7):
    def __init__(self, a, b, c, d, e, f, g):
        super().__init__(a, b, c, d, e, f, g)
        self.baz = 0

def bench(inst):
    inst.baz = 1

Result:

set-based lookup:

In [2]: %timeit bench2(foo2)
1.08 µs ± 28.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit bench7(foo7)
3.81 µs ± 20.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit bench(bar)
249 ns ± 6.31 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

tuple-based lookup (original):

In [2]: %timeit bench2(foo2)
1.15 µs ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit bench7(foo7)
3.97 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit bench(bar)
269 ns ± 4.09 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Result:

`set`-based lookup:

```python
In [2]: %timeit bench2(foo2)
1.08 µs ± 28.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit bench7(foo7)
3.81 µs ± 20.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

tuple-based lookup (original):

In [2]: %timeit bench2(foo2)
1.15 µs ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit bench7(foo7)
3.97 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

The set-based is constantly faster than the old approach. And the theoretical time complexity is also smaller ($O(1)$ vs. $O(n)$).

Ref: #102573

Pitch

(Explain why this feature or enhancement should be implemented and how it would be used.
Add examples, if applicable.)

In the autogenerate __setattr__ and __delattr__, they have a sanity check at the beginning of the method. For example:

def __setattr__(self, name, value):
    if type(self) is {{UserType}} or name in ({{a tuple of field names}}):
        raise FrozenInstanceError(f"cannot assign to field {name!r}")
    super(cls, self).__setattr__(name, value)

If someone inherits the frozen dataclass, the sanity check will take $O(n)$ time on the tuple__contains__(...) and finally calls super().__setattr__(...). For example:

@dataclass(frozen=True)
class FrozenBase:
    x: int
    y: int
    ... # N_FIELDS

class Foo(FrozenBase):
    def __init__(self, x, y, somevalue, someothervalue):
        super().__init__(x, y)
        self.somevalue = somevalue            # takes O(N_FIELDS)
        self.someothervalue = someothervalue  # takes O(N_FIELDS) time again

foo = Foo(1, 2, 3, 4)
foo.extravalue = extravalue  # takes O(N_FIELDS) time again

Previous discussion

N/A.

Linked PRs

Metadata

Metadata

Assignees

Labels

3.12only security fixesperformancePerformance or resource usagestdlibPython modules in the Lib dirtype-featureA feature request or enhancement

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions