[Enhancement] Speed up setting and deleting mutable attributes on non-dataclass subclasses of frozen dataclasses

# Feature or enhancement

The `dataclasses` module provides an easy way to create classes and automatically generates the relevant methods for the user.

Creating a dataclass with `frozen=True` automatically generates the `__setattr__` and `__delattr__` methods in `_frozen_get_del_attr`. Both methods check whether the attribute name is in a `tuple` of field names.

This issue proposes changing that `tuple`-based lookup to a `set`-based lookup, which reduces the time complexity of the check from $O(n)$ to $O(1)$ on average, where $n$ is the number of fields.

```python
In [1]: # tuple-based

In [2]: %timeit 'a' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
9.91 ns ± 0.0982 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [3]: %timeit 'd' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
33.2 ns ± 0.701 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [4]: %timeit 'g' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
56.4 ns ± 0.818 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [5]: # set-based

In [6]: %timeit 'a' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11.3 ns ± 0.0723 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [7]: %timeit 'd' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11 ns ± 0.106 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [8]: %timeit 'g' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11.1 ns ± 0.126 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
```
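For readers without IPython, the membership micro-benchmark above can be approximated with the standard `timeit` module. This is a minimal sketch: the helper name `time_lookup` and the module-level constants are illustrative and not part of `dataclasses`, and absolute numbers will vary by machine and Python version.

```python
import timeit

FIELDS = ("a", "b", "c", "d", "e", "f", "g")
FIELDS_TUPLE = tuple(FIELDS)      # linear scan on lookup
FIELDS_SET = frozenset(FIELDS)    # hash-based lookup


def time_lookup(container, name, number=200_000):
    """Return the best-of-3 time, in nanoseconds, for one `name in container` test."""
    timer = timeit.Timer(
        "name in container",
        globals={"name": name, "container": container},
    )
    best = min(timer.repeat(repeat=3, number=number))
    return best / number * 1e9


if __name__ == "__main__":
    # "missing" models a non-field attribute, which forces a full tuple scan.
    for name in ("a", "d", "g", "missing"):
        t = time_lookup(FIELDS_TUPLE, name)
        s = time_lookup(FIELDS_SET, name)
        print(f"{name!r:>10}: tuple {t:6.1f} ns, set {s:6.1f} ns")
```

The `"missing"` case is the one that matters for subclasses that set their own attributes, because the name is never found among the field names.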

A tiny benchmark script:

```python
from contextlib import suppress
from dataclasses import FrozenInstanceError, dataclass

@dataclass(frozen=True)
class Foo2:
    a: int
    b: int

foo2 = Foo2(1, 2)

def bench2(inst):
    with suppress(FrozenInstanceError):
        inst.a = 0
    with suppress(FrozenInstanceError):
        inst.b = 0

@dataclass(frozen=True)
class Foo7:
    a: int
    b: int
    c: int
    d: int
    e: int
    f: int
    g: int

foo7 = Foo7(1, 2, 3, 4, 5, 6, 7)

def bench7(inst):
    with suppress(FrozenInstanceError):
        inst.a = 0
    with suppress(FrozenInstanceError):
        inst.b = 0
    with suppress(FrozenInstanceError):
        inst.c = 0
    with suppress(FrozenInstanceError):
        inst.d = 0
    with suppress(FrozenInstanceError):
        inst.e = 0
    with suppress(FrozenInstanceError):
        inst.f = 0
    with suppress(FrozenInstanceError):
        inst.g = 0

class Bar(Foo7):
    def __init__(self, a, b, c, d, e, f, g):
        super().__init__(a, b, c, d, e, f, g)
        self.baz = 0

bar = Bar(1, 2, 3, 4, 5, 6, 7)

def bench(inst):
    inst.baz = 1
```

Result:

`set`-based lookup:

```python
In [2]: %timeit bench2(foo2)
1.08 µs ± 28.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit bench7(foo7)
3.81 µs ± 20.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit bench(bar)
249 ns ± 6.31 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```

`tuple`-based lookup (original):

```python
In [2]: %timeit bench2(foo2)
1.15 µs ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit bench7(foo7)
3.97 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit bench(bar)
269 ns ± 4.09 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```

The `set`-based lookup is consistently faster than the original `tuple`-based approach, and its theoretical time complexity is also lower ($O(1)$ vs. $O(n)$).

Ref: #102573

# Pitch

The autogenerated `__setattr__` and `__delattr__` methods perform a sanity check at the beginning. For example:

```python
def __setattr__(self, name, value):
    if type(self) is {{UserType}} or name in ({{a tuple of field names}}):
        raise FrozenInstanceError(f"cannot assign to field {name!r}")
    super(cls, self).__setattr__(name, value)
```
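A minimal sketch of what a `set`-based version of this check could look like, written by hand rather than generated by `dataclasses`. The class `Point`, the subclass `LabeledPoint`, and the attribute `_FIELD_NAMES` are hypothetical names used only for illustration; the code actually generated by the library may differ.

```python
from dataclasses import FrozenInstanceError


class Point:
    """Hand-written stand-in for a frozen dataclass with fields x and y."""

    # Assumption: field names are stored in a frozenset instead of a tuple,
    # so the membership test below is a hash lookup.
    _FIELD_NAMES = frozenset({"x", "y"})

    def __init__(self, x, y):
        # Bypass our own __setattr__, as the generated __init__ has to.
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)

    def __setattr__(self, name, value):
        if type(self) is Point or name in Point._FIELD_NAMES:
            raise FrozenInstanceError(f"cannot assign to field {name!r}")
        super().__setattr__(name, value)

    def __delattr__(self, name):
        if type(self) is Point or name in Point._FIELD_NAMES:
            raise FrozenInstanceError(f"cannot delete field {name!r}")
        super().__delattr__(name)


class LabeledPoint(Point):
    """Non-dataclass subclass that adds a mutable attribute."""

    def __init__(self, x, y, label):
        super().__init__(x, y)
        self.label = label  # not a field, so the check lets it through
```

Only the container type changes; the order of the two conditions and the error messages stay the same as in the template above.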

If someone inherits from the frozen dataclass, the sanity check spends $O(n)$ time in `tuple.__contains__(...)` before finally calling `super().__setattr__(...)`. For example:

```python
@dataclass(frozen=True)
class FrozenBase:
    x: int
    y: int
    ... # N_FIELDS

class Foo(FrozenBase):
    def __init__(self, x, y, somevalue, someothervalue):
        super().__init__(x, y)
        self.somevalue = somevalue            # takes O(N_FIELDS) time
        self.someothervalue = someothervalue  # takes O(N_FIELDS) time again

foo = Foo(1, 2, 3, 4)
foo.extravalue = 5  # takes O(N_FIELDS) time again
```
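The code path described above can be observed directly. The sketch below uses a two-field base class and illustrative names (`Child`, `is_blocked`) that do not appear in `dataclasses`; it only shows which assignments reach `super().__setattr__(...)` and which are rejected.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FrozenBase:
    x: int
    y: int


class Child(FrozenBase):  # plain subclass, not decorated with @dataclass
    def __init__(self, x, y, somevalue):
        super().__init__(x, y)
        # Allowed: type(self) is not FrozenBase and the name is not a field,
        # so the generated __setattr__ falls through after the name lookup.
        self.somevalue = somevalue


def is_blocked(obj, name):
    """Return True if assigning to `name` on `obj` raises FrozenInstanceError."""
    try:
        setattr(obj, name, 0)
    except FrozenInstanceError:
        return True
    return False


child = Child(1, 2, 3)
print(is_blocked(child, "x"))                      # True: inherited field
print(is_blocked(child, "extravalue"))             # False: non-field attribute
print(is_blocked(FrozenBase(1, 2), "extravalue"))  # True: exact frozen type
```

Every assignment that prints `False` here has paid for a full scan of the field names before being accepted, which is the cost this issue aims to reduce.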

# Previous discussion

N/A.

### Linked PRs
* gh-102573
