gh-128118: Improve performance of copy.copy by using a fast lookup for atomic and container types #128119

eendebakpt · 2024-12-20T10:55:44Z

Similar to the approach used for copy.deepcopy in #114266, we can simplify the implementation of copy.copy and improve performance by checking the type of the argument using a lookup.
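To illustrate the idea, here is a minimal sketch of a type-based fast path. It is not the code in Lib/copy.py: the function name `sketch_copy` and the set `_builtin_container_types` are hypothetical, and the set contents are a small illustrative subset. Anything that is not handled by the two lookups is simply delegated to the standard `copy.copy`.

```python
import copy

# Illustrative subset only; the real set in Lib/copy.py contains more types.
# type(None) is the same object as types.NoneType on Python 3.10+.
_copy_atomic_types = {type(None), int, float, bool, complex, str, tuple}

# Hypothetical name: builtin containers that provide their own .copy() method.
_builtin_container_types = {list, dict, set, bytearray}


def sketch_copy(x):
    """Shallow-copy x, using set lookups on the exact type as a fast path."""
    cls = type(x)

    # Immutable ("atomic") objects can be returned as-is.
    if cls in _copy_atomic_types:
        return x

    # Builtin containers are copied with their own copy method.
    if cls in _builtin_container_types:
        return cls.copy(x)

    # Everything else (subclasses, user classes, ...) takes the general path.
    return copy.copy(x)
```

Because the lookup uses `type(x)` and not `isinstance`, subclasses of the builtin types never hit the fast path and still go through the general protocol (`__copy__`, `__reduce_ex__`, and so on).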

Results:

copy int: Mean +- std dev: [main] 159 ns +- 10 ns -> [pr_v2] 104 ns +- 7 ns: 1.54x faster
copy slice: Mean +- std dev: [main] 184 ns +- 44 ns -> [pr_v2] 109 ns +- 13 ns: 1.69x faster
copy dict: Mean +- std dev: [main] 196 ns +- 18 ns -> [pr_v2] 182 ns +- 16 ns: 1.07x faster
copy dataclass: Mean +- std dev: [main] 1.88 us +- 0.13 us -> [pr_v2] 1.82 us +- 0.11 us: 1.04x faster
copy small list: Mean +- std dev: [main] 179 ns +- 18 ns -> [pr_v2] 134 ns +- 7 ns: 1.33x faster
copy small tuple: Mean +- std dev: [main] 155 ns +- 12 ns -> [pr_v2] 80.3 ns +- 6.1 ns: 1.93x faster
copy list dataclasses: Mean +- std dev: [main] 151 ns +- 11 ns -> [pr_v2] 160 ns +- 25 ns: 1.05x slower

Geometric mean: 1.32x faster
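
As a sanity check on the summary line, the geometric mean can be recomputed from the rounded per-benchmark ratios listed above. This is only a sketch: pyperf computes the value from the unrounded means, so small differences in later digits are expected. A slowdown of 1.05x is entered as a speedup of 1/1.05.

```python
from statistics import geometric_mean

# Speedup factors copied from the results above (values > 1 mean faster).
speedups = {
    "copy int": 1.54,
    "copy slice": 1.69,
    "copy dict": 1.07,
    "copy dataclass": 1.04,
    "copy small list": 1.33,
    "copy small tuple": 1.93,
    "copy list dataclasses": 1 / 1.05,  # reported as 1.05x slower
}

overall = geometric_mean(speedups.values())
print(f"Geometric mean: {overall:.2f}x faster")
```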

Benchmark script:

import pyperf

runner = pyperf.Runner()

setup = """
import copy

a={'list': [1,2,3,43], 't': (1,2,3), 'str': 'hello', 'subdict': {'a': True}}

from dataclasses import dataclass

lst = [1, 's']
tpl  =('a', 'b', 3)

i = 123123123
sl = slice(1,2,3)

@dataclass
class A:
    a : int
    
dc = A(123)
list_dc = [A(1), A(2), A(3), A(4)]
"""

runner.timeit(name="copy int", stmt="b=copy.copy(i)", setup=setup)
runner.timeit(name="copy slice", stmt="b=copy.copy(sl)", setup=setup)
runner.timeit(name="copy dict", stmt="b=copy.copy(a)", setup=setup)
runner.timeit(name="copy dataclass", stmt="b=copy.copy(dc)", setup=setup)
runner.timeit(name="copy small list", stmt="b=copy.copy(lst)", setup=setup)
runner.timeit(name="copy small tuple", stmt="b=copy.copy(tpl)", setup=setup)
runner.timeit(name="copy list dataclasses", stmt="b=copy.copy(list_dc)", setup=setup)

Issue: Improve performance of copy.copy #128118

Misc/NEWS.d/next/Library/2024-12-20-10-57-10.gh-issue-128118.mYak8i.rst

picnixz · 2024-12-26T11:07:45Z

Lib/copy.py

-def _copy_immutable(x):
-    return x
-for t in (types.NoneType, int, float, bool, complex, str, tuple,
+_copy_atomic_types = {types.NoneType, int, float, bool, complex, str, tuple,


Out of curiosity, would performance be better if we use a frozenset instead of a set? (and is it possible?)

Good question, I'll benchmark a bit later. A frozenset should not require any locking, so perhaps there is a difference.

At this moment the set and frozenset have the same implementation for __contains__:

cpython/Objects/setobject.c

Lines 2529 to 2531 in 3bd7730

static PyMethodDef frozenset_methods[] = {
    SET___CONTAINS___METHODDEF
    FROZENSET_COPY_METHODDEF

cpython/Objects/setobject.c

Lines 2416 to 2420 in 3bd7730

static PyMethodDef set_methods[] = {
    SET_ADD_METHODDEF
    SET_CLEAR_METHODDEF
    SET___CONTAINS___METHODDEF
    SET_COPY_METHODDEF

so there is no performance difference. In the future however, for the free-threading build one could remove the critical section for the frozenset implementation here:

cpython/Objects/setobject.c

Lines 2198 to 2207 in 3bd7730

static int
set_contains(PyObject *self, PyObject *key)
{
    PySetObject *so = _PySet_CAST(self);
    return _PySet_Contains(so, key);
}

/*[clinic input]
@critical_section
@coexist

Using a frozenset is possible, but this would add a bit of time to the import. On my system, %timeit frozenset(_copy_atomic_types) takes about 300 ns.
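
For anyone who wants to repeat the comparison outside IPython, a rough sketch with the stdlib `timeit` module follows. The set contents are an illustrative subset and `time_membership` is a hypothetical helper; the numbers depend on the machine and build, and the lambda adds call overhead to every measurement, so only the relative difference between set and frozenset is meaningful.

```python
import timeit

# Illustrative subset of the atomic types; not the full set from Lib/copy.py.
atomic_set = {type(None), int, float, bool, complex, str, tuple}
atomic_frozenset = frozenset(atomic_set)


def time_membership(container, probe, number=100_000):
    """Return the average time in seconds of one `probe in container` test."""
    total = timeit.timeit(lambda: probe in container, number=number)
    return total / number


set_hit = time_membership(atomic_set, int)
frozenset_hit = time_membership(atomic_frozenset, int)

# One-time cost of building the frozenset, which would be paid at import.
build_cost = timeit.timeit(lambda: frozenset(atomic_set), number=100_000) / 100_000

print(f"set lookup:       {set_hit * 1e9:.1f} ns")
print(f"frozenset lookup: {frozenset_hit * 1e9:.1f} ns")
print(f"frozenset build:  {build_cost * 1e9:.1f} ns")
```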

Even faster than a set would be a data structure that looks only at the id of the objects involved (the set will use rich compare if no match is found, but that is not needed as all objects involved are singletons), but that is not available in cpython I believe.

but that is not available in cpython I believe.

That's right, it's not available.

Up to you if you want to make the free-threaded build faster in the future, but we should probably check the performance on this build. Let's keep the set for now (hopefully you'll remember this).
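
For reference, the semantics of an identity-only lookup can be emulated in pure Python by storing the `id()` of each type. This is only a sketch of the idea discussed above, with hypothetical names (`_atomic_type_ids`, `is_atomic_by_identity`); it is not proposed for Lib/copy.py, and the extra `id()` call means it is not expected to beat a plain set lookup in pure Python.

```python
# Keep references to the types so their ids stay valid for the process lifetime.
_atomic_types = (type(None), int, float, bool, complex, str, tuple)

# Membership is decided purely by object identity of the type,
# never by calling __eq__ on the type objects themselves.
_atomic_type_ids = frozenset(id(t) for t in _atomic_types)


def is_atomic_by_identity(obj):
    """Return True if the exact type of obj is one of the atomic types."""
    return id(type(obj)) in _atomic_type_ids
```

A native implementation of the same idea would compare pointers directly and skip the comparison fallback entirely.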

erlend-aasland · 2024-12-30T17:19:33Z

Thanks for the speed-up, Pieter! Thanks for the reviews, Bénédikt and Sergey!

eendebakpt added 2 commits December 20, 2024 09:45

Refactor copy.copy

d951b17

use set instead of tuple

742aa88

bedevere-app bot added the awaiting review label Dec 20, 2024

eendebakpt changed the title ~~Improve performance of copy.copy by using a fast lookup for atomic and container types~~ gh-128118: Improve performance of copy.copy by using a fast lookup for atomic and container types Dec 20, 2024

bedevere-app bot mentioned this pull request Dec 20, 2024

Improve performance of copy.copy #128118

Closed

📜🤖 Added by blurb_it.

c2e6a4a

skirpichev approved these changes Dec 20, 2024

bedevere-app bot added awaiting core review and removed awaiting review labels Dec 20, 2024

erlend-aasland approved these changes Dec 20, 2024

bedevere-app bot added awaiting merge and removed awaiting core review labels Dec 20, 2024

picnixz reviewed Dec 26, 2024

Update Misc/NEWS.d/next/Library/2024-12-20-10-57-10.gh-issue-128118.mYak8i.rst

9490a23

Co-authored-by: Bénédikt Tran

picnixz approved these changes Dec 26, 2024

erlend-aasland merged commit 34b85ef into python:main Dec 30, 2024
38 checks passed

bedevere-app bot removed the awaiting merge label Dec 30, 2024

srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Jan 8, 2025

pythongh-128118: Speed up copy.copy with fast lookup for atomic and container types (python#128119)

0803d9b

eendebakpt mentioned this pull request Apr 24, 2025

Understand the outlier benchmarks on 3.14 (main) vs. 3.13.0 faster-cpython/ideas#726

Open

25 tasks

eendebakpt commented Dec 20, 2024 (edited by hugovk)
