-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Cumulative examples #7152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cumulative examples #7152
Conversation
Very nice, this is something that's been on the TODO list! :) I believe we wanted to rename |
@Illviljan That is definitely something I could do. Are there any other methods I should be including in this? |
Right now, I think |
Great I'll start working on that. Shouldn't take too long |
Thanks @patrick-naylor ! Instead of using def cumsum(..., dim):
return xr.apply_ufunc(
np.cumsum if skipna else np.nancumsum,
obj,
input_core_dims=[dim],
output_core_dims=[dim],
kwargs={"axis": -1},
)
# now transpose dimensions back to input order to fix #6528. At the moment, this should also work on GroupBy objects quite nicely. |
Thanks @dcherian, |
Nope. Just wasn't added in :) |
I'm running into an issue with variables without the core dimensions. Would it be better to do a work around in cumsum or in apply_unfunc like you mentioned in #6391 |
While this is definitely worth an improvement, it is quite out of scope for this PR. |
…aset cumulative functions.
for more information, see https://pre-commit.ci
I've merged the cumulative and reduction files into generate_aggregations.py and _aggregations.py. This uses the original version of reductions with an additional statement on the dataset methods that adds the original coordinates back in. Using apply_ufunc and np.cumsum/cumprod has some issues as it only finds the cumulative across one axis which makes iterating through each dimension necessary. This makes it slower than the original functions and also causes some problems with the groupby method. Happy for any input on how the method using apply_ufunc might be usable or on any ways to change the current method. I'm getting a few issues I don't quite understand:
Thanks! |
I don't think you have |
Thanks for taking this on @patrick-naylor ! This is a decent-sized project!
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @patrick-naylor , this is a major step forward.
I think it might be better to take a step back and figure out Dataset.cumsum
using apply_ufunc
that passes tests. Then we can update the generators with that code, iterating using the generators is going to be frustrating ;)
xarray/util/generate_aggregations.py
Outdated
axis = ExtraKwarg( | ||
docs=_AXIS_DOCSTRING, | ||
kwarg="axis: int | Sequence[int] | None = None,", | ||
call="axis=axis,", | ||
example="", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't be supporting axis
, Xarray uses dim
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially thought this was just a documentation error with the old cumsum/prod docs where they had axis as a parameter but the current version of dataarray.cumsum/dataarray.cumprod does support axis input. I was worried that removing it may break some users' code. Is this something we should depreciate, leave in, or just outright remove?
I also just realized that I had axis as an additional kwarg for the dataset methods not dataarray methods which is my mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
old cumsum/prod docs where they had axis
We've been removing these so IMO its OK. I think it was an issue with the older reduction functions too
xarray/util/generate_aggregations.py
Outdated
|
||
# numpy_groupies & flox do not support median | ||
# https://github.com/ml31415/numpy-groupies/issues/43 | ||
if method.name == "median": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THis median bit should not be needed
xarray/util/generate_aggregations.py
Outdated
@@ -209,7 +277,8 @@ def {method}( | |||
example="""\n | |||
Specify ``min_count`` for finer control over when NaNs are ignored. | |||
|
|||
>>> {calculation}(skipna=True, min_count=2)""", | |||
>>> {calculation}(skipna=True, min_count=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, these strings were tweaked to make sure pre-commit was happy so its unlikely these changes are necessary.
Co-authored-by: Deepak Cherian <[email protected]>
xarray/util/generate_aggregations.py
Outdated
Method("cumsum", extra_kwargs=(skipna,)), | ||
Method("cumprod", extra_kwargs=(skipna,)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note we previously had numeric_only=True
though I guess it may not be needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it now, or should I revert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Let's open an issue to add tests when we relax it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @patrick-naylor and @Illviljan . This is an amazing improvement!
@patrick-naylor, feel free to try out a better default example if you want. |
Thanks @Illviljan, @dcherian, and @keewis so much for the help. |
Here I am trying to add docstring example to the cumulative functions. I did this by basically copying the method used for the reduction functions. Not sure at all if I did this correctly so i'm marking it a draft
whats-new.rst
api.rst