Cumulative examples #7152

patrick-naylor · 2022-10-10T19:24:32Z

Here I am trying to add docstring examples to the cumulative functions. I did this by basically copying the method used for the reduction functions. I'm not at all sure I did this correctly, so I'm marking it as a draft.

Tests added
User visible changes (including notable bug fixes) are documented in whats-new.rst
New functions/methods are listed in api.rst

Illviljan · 2022-10-10T21:14:28Z

Very nice, this is something that's been on the TODO list! :)

I believe we wanted to rename generate_reductions.py to generate_aggregations.py so cumsum et al. could be included and generated there as well. Would it be a lot of work for you to merge these into that one?

patrick-naylor · 2022-10-10T21:17:23Z

@Illviljan That is definitely something I could do. Are there any other methods I should be including in this?

Illviljan · 2022-10-10T21:23:17Z

Right now, I think cumsum and cumprod are enough. numpy-groupies has a few more examples that I suppose we could support in the future.

patrick-naylor · 2022-10-10T21:27:46Z

Great, I'll start working on that. It shouldn't take too long.

dcherian · 2022-10-11T16:27:15Z

Thanks @patrick-naylor !

Instead of using Dataset.reduce I think we want something like

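import numpy as np
import xarray as xr

def cumsum(obj, dim, skipna=True):
    result = xr.apply_ufunc(
        # nancumsum treats NaN as zero, which is what skipna=True asks for
        np.nancumsum if skipna else np.cumsum,
        obj,
        # one list of core dims per input/output; apply_ufunc moves them last
        input_core_dims=[[dim]],
        output_core_dims=[[dim]],
        kwargs={"axis": -1},
    )
    # now transpose dimensions back to input order
    return result.transpose(*obj.dims)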

to fix #6528.

At the moment, this should also work on GroupBy objects quite nicely.
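As a rough illustration of the apply_ufunc idea above, here is a minimal sketch on a small DataArray. The helper name cumsum_sketch and the sample data are made up for this example; this is not the code that ended up in the PR, only one way the suggestion could look, assuming a DataArray input whose variables all contain the core dimension.

```python
import numpy as np
import xarray as xr


def cumsum_sketch(obj: xr.DataArray, dim: str, skipna: bool = True) -> xr.DataArray:
    """Hypothetical helper: cumulative sum along ``dim`` via apply_ufunc."""
    func = np.nancumsum if skipna else np.cumsum
    result = xr.apply_ufunc(
        func,
        obj,
        input_core_dims=[[dim]],   # apply_ufunc moves ``dim`` to the last axis
        output_core_dims=[[dim]],  # the output keeps ``dim`` with the same size
        kwargs={"axis": -1},
    )
    # Restore the original dimension order.
    return result.transpose(*obj.dims)


da = xr.DataArray(
    [[1.0, np.nan], [3.0, 4.0]],
    dims=("x", "y"),
    coords={"x": [10, 20], "y": ["a", "b"]},
)
out = cumsum_sketch(da, "x")
# out.values -> [[1., 0.], [4., 4.]]; NaN counts as zero when skipna=True
```

In this sketch the index coordinate on x is still attached to the result, which is the behaviour the comment above is after.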

patrick-naylor · 2022-10-11T18:43:59Z

Thanks @dcherian,
I'll try to work that in.
Is there a particular reason why there is no cumprod for GroupBy objects?

dcherian · 2022-10-11T18:51:26Z

Is there a particular reason why there is no cumprod for GroupBy objects?

Nope. Just wasn't added in :)

patrick-naylor · 2022-10-11T22:12:13Z

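import numpy as np
import xarray as xr

def cumsum(obj, dim, skipna=True):
    result = xr.apply_ufunc(
        # nancumsum treats NaN as zero, which is what skipna=True asks for
        np.nancumsum if skipna else np.cumsum,
        obj,
        # one list of core dims per input/output; apply_ufunc moves them last
        input_core_dims=[[dim]],
        output_core_dims=[[dim]],
        kwargs={"axis": -1},
    )
    # now transpose dimensions back to input order
    return result.transpose(*obj.dims)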

I'm running into an issue with variables that lack the core dimension. Would it be better to do a workaround in cumsum or in apply_ufunc, like you mentioned in #6391?

headtr1ck · 2022-10-14T17:57:09Z

I'm running into an issue with variables that lack the core dimension. Would it be better to do a workaround in cumsum or in apply_ufunc, like you mentioned in #6391?

While this is definitely worth an improvement, it is quite out of scope for this PR.
Can you define Datasets such that this problem does not occur?
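For context, the problem discussed here arises because apply_ufunc expects every variable to carry the requested core dimension. One possible workaround, shown purely as a sketch and not as what this PR does, is to apply the function per variable and leave variables without the dimension untouched. The helper name dataset_cumsum_sketch and the sample Dataset are hypothetical.

```python
import numpy as np
import xarray as xr


def dataset_cumsum_sketch(ds: xr.Dataset, dim: str, skipna: bool = True) -> xr.Dataset:
    """Hypothetical helper: per-variable cumulative sum that skips variables lacking ``dim``."""
    func = np.nancumsum if skipna else np.cumsum

    def _one(var: xr.DataArray) -> xr.DataArray:
        if dim not in var.dims:
            # Nothing to accumulate over; return the variable unchanged.
            return var
        out = xr.apply_ufunc(
            func,
            var,
            input_core_dims=[[dim]],
            output_core_dims=[[dim]],
            kwargs={"axis": -1},
        )
        return out.transpose(*var.dims)

    # Dataset.map calls _one on each data variable.
    return ds.map(_one)


ds = xr.Dataset(
    {"a": ("x", [1, 2, 3]), "b": 5.0},  # "b" has no "x" dimension
    coords={"x": [10, 20, 30]},
)
result = dataset_cumsum_sketch(ds, "x")
```

Defining the example Datasets so that every variable has the dimension, as suggested above, avoids the need for this kind of branching altogether.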

patrick-naylor · 2022-10-19T23:52:47Z

I've merged the cumulative and reduction files into generate_aggregations.py and _aggregations.py. This uses the original version of reductions with an additional statement on the dataset methods that adds the original coordinates back in.

Using apply_ufunc with np.cumsum/np.cumprod has some issues: it only computes the cumulative result along one axis at a time, which makes iterating through each dimension necessary. This makes it slower than the original functions and also causes some problems with the groupby method.

Happy for any input on how the method using apply_ufunc might be usable or on any ways to change the current method.

I'm getting a few issues I don't quite understand:

When running pytest on my local repository I get no errors, but the checks here fail with a NotImplementedError.
Black is having an issue with some of the strings in generate_aggregations.py. It says it cannot parse what should be valid code.

Thanks!

Illviljan · 2022-10-21T08:51:12Z

I don't think you have flox installed. If it's not installed, the code will take the old path.
Run conda install flox and I think you'll get the NotImplementedError. You may then have to change the default settings in cumsum so that flox is not used.
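A quick way to check which situation applies locally might look like the sketch below. It only uses importlib to see whether flox is importable and xarray's use_flox option to force the non-flox path for an ordinary groupby reduction; the sample array is made up, and this is not the change that was eventually made to cumsum.

```python
import importlib.util

import numpy as np
import xarray as xr

# True if flox can be imported in this environment, False otherwise.
has_flox = importlib.util.find_spec("flox") is not None

da = xr.DataArray(
    np.arange(4.0),
    dims="x",
    coords={"g": ("x", [0, 0, 1, 1])},
)

# Force the non-flox code path regardless of whether flox is installed.
with xr.set_options(use_flox=False):
    old_path = da.groupby("g").sum()
# old_path.values -> [1., 5.]
```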

dcherian · 2022-10-21T15:42:12Z

Thanks for taking this on @patrick-naylor ! This is a decent-sized project!

Using apply_ufunc with np.cumsum/np.cumprod has some issues: it only computes the cumulative result along one axis at a time, which makes iterating through each dimension necessary.

np.cumsum only supports an integer axis so this is OK?

flox doesn't support cumsum at the moment (xarray-contrib/flox#91) so we can delete that bit and just have one code path.
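To make the single-axis point concrete, here is a small NumPy-only sketch of accumulating over several axes by looping, since np.cumsum takes one integer axis per call. The helper name cumsum_over_axes is hypothetical.

```python
import numpy as np


def cumsum_over_axes(arr, axes):
    """Hypothetical helper: apply np.cumsum once per axis, in order."""
    out = np.asarray(arr)
    for axis in axes:
        out = np.cumsum(out, axis=axis)
    return out


a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
both = cumsum_over_axes(a, axes=(0, 1))
# both -> [[0, 1, 3], [3, 8, 15]]
```

Each pass is a separate full sweep over the array, which is consistent with the slowdown reported above when several dimensions are requested.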

dcherian

Thanks @patrick-naylor , this is a major step forward.

I think it might be better to take a step back and figure out a Dataset.cumsum using apply_ufunc that passes tests. Then we can update the generators with that code. Iterating using the generators is going to be frustrating ;)

dcherian · 2022-10-21T15:47:55Z

xarray/util/generate_aggregations.py

+axis = ExtraKwarg(
+    docs=_AXIS_DOCSTRING,
+    kwarg="axis: int | Sequence[int] | None = None,",
+    call="axis=axis,",
+    example="",


We shouldn't be supporting axis; Xarray uses dim instead.

I initially thought this was just a documentation error with the old cumsum/prod docs where they had axis as a parameter, but the current version of DataArray.cumsum/DataArray.cumprod does support axis input. I was worried that removing it may break some users' code. Is this something we should deprecate, leave in, or just outright remove?

I also just realized that I had axis as an additional kwarg for the dataset methods not dataarray methods which is my mistake.

old cumsum/prod docs where they had axis

We've been removing these, so IMO it's OK. I think it was an issue with the older reduction functions too.
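For readers who currently pass axis, a minimal sketch of the dim-based equivalent is shown below. The sample array is made up; get_axis_num is used only to show how a dimension name maps back to a positional axis.

```python
import numpy as np
import xarray as xr

da = xr.DataArray([[1, 2], [3, 4]], dims=("x", "y"))

# Name the dimension instead of giving its position.
by_dim = da.cumsum(dim="y")
# by_dim.values -> [[1, 3], [3, 7]]

# The positional axis that "y" corresponds to, if it is ever needed.
axis_of_y = da.get_axis_num("y")
same_with_numpy = np.cumsum(da.values, axis=axis_of_y)
```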

dcherian · 2022-10-21T16:05:42Z

xarray/util/generate_aggregations.py

+
+        # numpy_groupies & flox do not support median
+        # https://github.com/ml31415/numpy-groupies/issues/43
+        if method.name == "median":


This median bit should not be needed.

dcherian · 2022-10-21T16:07:23Z

xarray/util/generate_aggregations.py

@@ -209,7 +277,8 @@ def {method}(
    example="""\n
        Specify ``min_count`` for finer control over when NaNs are ignored.

-        >>> {calculation}(skipna=True, min_count=2)""",
+        >>> {calculation}(skipna=True, min_count=2)


In general, these strings were tweaked to make sure pre-commit was happy, so it's unlikely these changes are necessary.
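As a simplified picture of why whitespace in these strings matters, the sketch below fills a {calculation} placeholder with str.format. The constant name EXAMPLE_TEMPLATE and the value "da.sum" are assumptions for illustration; the real generator script is more involved.

```python
# Hypothetical, simplified stand-in for an example template in the generator.
EXAMPLE_TEMPLATE = """\n
        Specify ``min_count`` for finer control over when NaNs are ignored.

        >>> {calculation}(skipna=True, min_count=2)"""

# Filling the placeholder yields the docstring text that ends up in the generated file.
rendered = EXAMPLE_TEMPLATE.format(calculation="da.sum")
last_line = rendered.splitlines()[-1]
```

Because the rendered text is written verbatim into a generated module, a stray newline or trailing space in the template shows up in every generated docstring, where formatting hooks will then complain.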

dcherian · 2022-10-24T18:30:06Z

xarray/util/generate_aggregations.py

+    Method("cumsum", extra_kwargs=(skipna,)),
+    Method("cumprod", extra_kwargs=(skipna,)),


Note we previously had numeric_only=True though I guess it may not be needed here.

I added it now, or should I revert?

LGTM. Let's open an issue to add tests when we relax it.

dcherian

Thanks @patrick-naylor and @Illviljan . This is an amazing improvement!

Illviljan · 2022-10-26T17:14:12Z

@patrick-naylor, feel free to try out a better default example if you want.

patrick-naylor · 2022-10-26T17:19:20Z

Thanks @Illviljan, @dcherian, and @keewis so much for the help.

patrick-naylor added 3 commits October 7, 2022 18:03

Added file to generate docs for cumulatives. generate_cumulatives.py …

7e2f12a

…creates _cumulatives.py

Fixed mypy issues

d3c05eb

Updated cumulatives to fix mypy issues

ae513d9

github-actions bot added the topic-groupby label Oct 10, 2022

patrick-naylor added 3 commits October 10, 2022 12:26

Merge branch 'main' into cumulative_examples

e9ba30e

Added keep_attrs to groupby funcs

b774ddb

commited merge

804d960

headtr1ck added the topic-documentation label Oct 11, 2022

dcherian mentioned this pull request Oct 12, 2022

DOC: Added examples to docstrings of DataArray methods (#7123) #7123

Merged

patrick-naylor and others added 8 commits October 14, 2022 14:28

Combined cumulatives and reductions and aggregations and modified dat…

7d95f1d

…aset cumulative functions.

Merged cumulatives and reductions into aggregations

330dce8

Removed test print from dataset.py

af5573c

Removed generate_cumulatives and generate_reductions

42c4990

Merge branch 'main' into cumulative_examples

d9e5267

[pre-commit.ci] auto fixes from pre-commit.com hooks

4e611cf

for more information, see https://pre-commit.ci

Updated _aggregations with docstring changes

441450b

Merged origin

2eac348

dcherian reviewed Oct 21, 2022

Illviljan added 4 commits October 22, 2022 22:49

fix mypy

29dd1eb

use _group_dim in resample?

b12dd5a

Update resample.py

b4ce34d

Manually fix docstring

a2d9e93

Illviljan marked this pull request as ready for review October 23, 2022 09:43

Merge branch 'main' into pr/7152

061ceaf

dcherian reviewed Oct 24, 2022

Illviljan and others added 4 commits October 24, 2022 18:13

Apply suggestions from code review

2134bcf

Co-authored-by: Deepak Cherian <[email protected]>

Use TEMPLATE_SEE_ALSO

f6e18cc

Merge branch 'cumulative_examples' of https://github.com/patrick-nayl…

23df2eb

…or/xarray into pr/7152

use default example

ec5062a

dcherian reviewed Oct 24, 2022

Illviljan added 4 commits October 24, 2022 22:57

add resample test

1a7ccf4

remove cumulative function in ops

66e7390

Revert "remove cumulative function in ops"

7bc95ec

This reverts commit 66e7390.

Add numeric_only=True

21e040c

dcherian approved these changes Oct 25, 2022

dcherian added the plan to merge Final call for comments label Oct 25, 2022

dcherian merged commit 076bd8e into pydata:main Oct 26, 2022

This was referenced Aug 24, 2023

cumsum drops index coordinates #6528

Open

rolling_exp loses coords #6870

Closed
