-
Notifications
You must be signed in to change notification settings - Fork 6
daily to dekadal functions #527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I mean this PR is really just to have acceptable dekadal data to feed the synthetic Maprooms. We can move the broader time dimension where it belongs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Interval class is interesting. But if you have to convert it to points in order to do anything useful with it, then it's of limited utility.
All the commits but last address the comments of this PR. The last one adds a suite of functions that can replace all others, including the older seasonal functions for which cases confirming tests have been made too. You'll have to make a concession somewhere. With Intervals, you'll have to convert them to something to make maths. Without, you need at least 2 coords to describe one time dimension... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed all but the last commit ("another suite of functions..."). Will have to continue with that next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having trouble figuring out where you're going with this last commit. I'm a little sleepy, so maybe I'll have better luck another time; but also some high-level guidance might help.
Is this meant to be a piece of the general-purpose time library you mentioned in the last paragraph of the PR description? If so, then before reviewing a detailed implementation I would like to see a high-level summary of the proposed design. What class of problem are you trying to solve, how do resampling and regrouping contribute to solving those problems, and then what is the resampling and regrouping API you propose. Perhaps do some example calculations, just on paper, using the API you wish you had.
The above address all but "another suite..." commit |
Yes, it's an experiment for a general-purpose time library. I still want to finish clean up the existing Python code we have (splitting data reading / apps etc.) before I engage in that (or actually I could work on both in parallels if I don't dive into servers logs analyses). But I got carried away with the idea and stuck that in here. So maybe see it as an experiment. I can still write high level considerations in here. |
So at minimum for this PR, the question to answer is whether it's a good/useful generalization of the groupby_dekads function written in the first part of the PR. |
I would prefer to have one PR with a minimal implementation of just what we need now to make synthetic data for testing, and a different branch where we experiment with generalizations. The short-term minimal stuff might belong in a testutils.py rather than calc.py so we don't build too much on top of it before we decide it's the right approach. |
508588b
to
6553ea5
Compare
Move the experimental "regroup" with pd.Interval to another branch. |
Functions to compute dekadal totals from daily data and their tests. Initially needed to generate synthetic dekadal data from synthetic daily data to test Maprooms.
In the process, I (eventually!) discovered how to make use of the Intervals objects that groupby functions return. It turns out they are pandas.Interval, but you need surgical striking to hit the pandas object with its attributes: too soon, it's still a DataArray and too late it's an ndarray... tells so much about my understanding of how xarray, numpy, pandas... intricate...
So there are 3 functions:
For the very short-term purpose of synthetic data into Maprooms, I will need a mean version for temperature. It's trivial.
I didn't deal with units since we might use pint some day.
For the medium term, swap_interval can be applied to any sort of dim (not just time). There could be simplification/generalization to make after checking what xarray ojects return Interval dims and whether they name it always the same or what-have-you. Right now I assumed some behavior.
For longer term, that could open the door for clean library to deal with time: the Intervals contain all the information we need and can be retrieved. I imagine pair-wise operations could be applied on the Interval dims directly, what would force the Intervals to be consistent throughout the pair (Ingrid only relies on centers). Time-wise operations could probably be automatized to start from Interval, move to point-wise for the sake of the operation, and returned to appropriate Interval accordingly. But we have other discussion on time elsewhere and maybe this is not new, or needs be completed by what was said there.