Documentation on Performance considerations #16310

Closed
Dr-Irv opened this issue May 9, 2017 · 4 comments
Labels
Docs good first issue Performance Memory or execution speed performance

Comments

@Dr-Irv
Contributor

Dr-Irv commented May 9, 2017

Problem description

We could use a section in the docs about performance considerations. @TomAugspurger has a nice notebook with suggestions here: http://nbviewer.jupyter.org/github/TomAugspurger/pandas-head-to-tail/blob/master/06-Performance.ipynb

One item to add concerns .assign(): on a big DataFrame, using .assign() is less efficient in both time and memory than creating each new column directly, without method chaining, like this:

    df['newcol1'] = df.a + df.b
    df['newcol2'] = df.c + df.d
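
To make the comparison concrete, here is a minimal sketch of the two styles side by side. The example data and the names `chained` and `direct` are made up for illustration and are not from the notebook linked above. It only checks that both styles give the same result; the actual time and memory difference depends on the pandas version and the size of the DataFrame, so it should be measured (for example with `timeit`) rather than assumed.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names follow the snippet above.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((1000, 4)), columns=list("abcd"))

# Method chaining: .assign() returns a new DataFrame and leaves df unchanged.
chained = df.assign(newcol1=df.a + df.b, newcol2=df.c + df.d)

# Direct assignment: the new columns are added to an existing DataFrame.
# The copy here exists only so that df stays untouched for the comparison.
direct = df.copy()
direct["newcol1"] = direct.a + direct.b
direct["newcol2"] = direct.c + direct.d

# Both styles produce the same frame; this raises if they differ.
pd.testing.assert_frame_equal(chained, direct)
```

Because `.assign()` hands back a new object, the original `df` still has only its four columns after the chained call, which is the behavior the direct style avoids paying for.
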
@jreback
Contributor

jreback commented May 10, 2017

see also #3871 and #8178

@jreback jreback added this to the Next Major Release milestone May 10, 2017
@andymaheshw
Contributor

Working on it at pycon2017 :)

@jorisvandenbossche
Member

There was also a presentation at PyCon about optimizing pandas: https://github.com/sversh/pycon2017-optimizing-pandas. It can probably serve as inspiration as well.

@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@mroeschke
Member

Looks like we now have https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html, so closing for now.


7 participants