Make email.message.Message.__contains__ faster #100792

Closed
@sobolevn

Description

Right now the implementation of Message.__contains__ looks like this:

    def __contains__(self, name):
        return name.lower() in [k.lower() for k, v in self._headers]

There are two problems here:

  1. We build an intermediate structure (a list, in this case) on every call
  2. We then use that list for the in operation, which is a second linear scan over names we have already walked once

The fastest way to check whether the header is actually present is simply:

    def __contains__(self, name):
        name_lower = name.lower()
        for k, v in self._headers:
            if name_lower == k.lower():
                return True
        return False

We do not create any intermediate lists or sets, and we stop iterating as soon as a match is found.
This change makes the in check up to twice as fast.
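As a quick sanity check that the two versions behave the same, here is a minimal sketch that runs both against a plain list of (name, value) pairs. The `headers` list and the helper names `contains_old` / `contains_new` are hypothetical stand-ins for illustration; they are not part of the email package.

```python
# Hypothetical stand-in for Message._headers: a list of (name, value) pairs.
headers = [
    ("From", "a@example.com"),
    ("To", "b@example.com"),
    ("Subject", "hi"),
]


def contains_old(headers, name):
    # Current approach: build a full list of lowercased names, then scan it.
    return name.lower() in [k.lower() for k, v in headers]


def contains_new(headers, name):
    # Proposed approach: lowercase the needle once, stop at the first match.
    name_lower = name.lower()
    for k, v in headers:
        if name_lower == k.lower():
            return True
    return False


# Both versions should agree for present, absent, and mixed-case names.
for probe in ("from", "FROM", "subject", "missing", ""):
    assert contains_old(headers, probe) == contains_new(headers, probe)
```

The early return only helps when the header is present; for a missing name both versions still visit every header, and the saving comes from not building the list.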

Microbenchmark

Before

» pyperf timeit --setup 'import email; m = email.message_from_file(open("Lib/test/test_email/data/msg_01.txt"))' '"from" in m'
.....................
Mean +- std dev: 1.40 us +- 0.14 us
» pyperf timeit --setup 'import email; m = email.message_from_file(open("Lib/test/test_email/data/msg_01.txt"))' '"missing" in m'
.....................
Mean +- std dev: 1.42 us +- 0.06 us

After

» pyperf timeit --setup 'import email; m = email.message_from_file(open("Lib/test/test_email/data/msg_01.txt"))' '"missing" in m'
.....................
Mean +- std dev: 904 ns +- 55 ns
» pyperf timeit --setup 'import email; m = email.message_from_file(open("Lib/test/test_email/data/msg_01.txt"))' '"from" in m'
.....................
Mean +- std dev: 715 ns +- 24 ns

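The "from" lookup is now about twice as fast (1.40 us to 715 ns), and the "missing" lookup is about 1.6 times as fast (1.42 us to 904 ns).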
It probably also consumes less memory now, but I don't think it is very significant.
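For anyone who wants to reproduce the comparison without pyperf or a patched CPython checkout, here is a rough sketch using only the stdlib timeit module. It is an assumption-laden approximation: the subclasses `OldContains` and `NewContains`, the `bench` helper, and the short `RAW` message are all hypothetical, and the numbers will not match the pyperf results above, since the message is not msg_01.txt and timeit is less rigorous.

```python
import email
import timeit
from email.message import Message


class OldContains(Message):
    # Mirrors the current implementation quoted in this issue.
    def __contains__(self, name):
        return name.lower() in [k.lower() for k, v in self._headers]


class NewContains(Message):
    # Mirrors the proposed implementation.
    def __contains__(self, name):
        name_lower = name.lower()
        for k, v in self._headers:
            if name_lower == k.lower():
                return True
        return False


# Hypothetical sample message, much shorter than msg_01.txt.
RAW = "From: a@example.com\nTo: b@example.com\nSubject: test\n\nbody\n"


def bench(cls, needle, number=50_000):
    # Parse the message with the given Message subclass, then time the lookup.
    m = email.message_from_string(RAW, _class=cls)
    return timeit.timeit(lambda: needle in m, number=number) / number


for needle in ("from", "missing"):
    old = bench(OldContains, needle)
    new = bench(NewContains, needle)
    print(f"{needle!r}: old {old * 1e9:.0f} ns, new {new * 1e9:.0f} ns")
```

The lambda adds call overhead to both measurements, so only the relative difference is meaningful here.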

Importance

Since EmailMessage (a subclass of Message) is widely used, both directly and by third-party libraries, I think this change is worth including.

And since the patch is quite simple and pure Python, I think the risks are very low.
