Problem description
When data mining with xarray, I keep running into the following issue with the resample method.
If I resample, e.g., a daily time series to monthly sums and the data are NA on every day of a month, I get zero as the result. That is a problem for a precipitation time series: a monthly precipitation of zero (zero precipitation on every day) is something quite different from a month in which precipitation was not measured because of problems with the device (NA on every day).
Data example
I have a dataset 'fcut' with hourly values for 5 months.
<xarray.Dataset>
Dimensions: (bnds: 2, time: 3672)
Coordinates:
rlon float32 22.06
rlat float32 5.06
* time (time) datetime64[ns] 2006-05-01 2006-05-01T01:00:00 ...
Dimensions without coordinates: bnds
Data variables:
rotated_pole int32 1
time_bnds (time, bnds) float64 1.304e+07 1.305e+07 1.305e+07 ...
TOT_PREC (time) float64 nan nan nan nan nan nan nan nan nan nan nan ...
Attributes:
Resampling to monthly sums gives only zero values for each month.
In [10]: fcut.resample(dim='time',freq='M',how='sum')
Out[10]:
<xarray.Dataset>
Dimensions: (bnds: 2, time: 5)
Coordinates:
* time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
Dimensions without coordinates: bnds
Data variables:
rotated_pole (time) int64 1 1 1 1 1
time_bnds (time, bnds) float64 1.07e+10 1.07e+10 1.225e+10 1.225e+10 ...
TOT_PREC (time) float64 0.0 0.0 0.0 0.0 0.0
But I expect NA for each month, as is the case for the 'mean' operator.
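To make the difference between the two operators concrete, here is a minimal pure-Python sketch of NaN-skipping reductions. It is only an illustration of the semantics, not xarray's actual implementation; the function names and the `min_count` parameter are hypothetical names chosen for this example.

```python
import math


def nan_skipping_sum(values, min_count=0):
    """Sum the non-NaN values; return NaN if fewer than min_count are valid.

    Hypothetical helper for illustration only.
    """
    valid = [v for v in values if not math.isnan(v)]
    if len(valid) < min_count:
        return math.nan
    # The sum of an empty list is 0.0, which is where the zeros come from.
    return math.fsum(valid)


def nan_skipping_mean(values):
    """Average the non-NaN values; return NaN if there are none."""
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        # A mean over zero values is undefined, so NaN is returned.
        return math.nan
    return math.fsum(valid) / len(valid)


all_missing = [math.nan] * 31  # device failure: no measurement on any day
dry_month = [0.0] * 31         # measured, but no precipitation on any day

print(nan_skipping_sum(all_missing))               # 0.0, same as a dry month
print(nan_skipping_sum(dry_month))                 # 0.0
print(nan_skipping_mean(all_missing))              # nan
print(nan_skipping_sum(all_missing, min_count=1))  # nan
```

With `min_count=1` the month without measurements and the dry month can be told apart, which is the behaviour I would expect for precipitation sums.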
I know that there is an ongoing discussion about that topic (see for example pandas-dev/pandas#9422).
For earth science it would be useful to have an option that tells xarray what to return when a sum is taken over values that are all NA. Do you see a chance for a quick fix for this issue in the code?
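As a sketch of what such an option could look like: the example below assumes a newer xarray release in which `sum` accepts a `min_count` argument and `resample` is called as `resample(time=...)`. Neither is part of the session shown above, and the small two-month series is made up for the example (it is not 'fcut').

```python
import numpy as np
import pandas as pd
import xarray as xr

# Made-up daily series: May is entirely missing, June has 1.0 on every day.
time = pd.date_range("2006-05-01", "2006-06-30", freq="D")
values = np.ones(time.size)
values[time.month == 5] = np.nan
prec = xr.DataArray(values, coords={"time": time}, dims="time", name="TOT_PREC")

# Default behaviour: NaN values are skipped, so the all-NaN month sums to 0.
monthly_default = prec.resample(time="MS").sum()

# Assumed option: require at least one valid value per month, otherwise NaN.
monthly_strict = prec.resample(time="MS").sum(min_count=1)

print(monthly_default.values)  # [ 0. 30.]
print(monthly_strict.values)   # [nan 30.]
```

Note that "MS" labels each month by its first day, whereas the 'M' frequency used above labels it by its last day.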