Skip to content

Splitcptread #541

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Splitcptread #541

wants to merge 4 commits into from

Conversation

remicousin
Copy link
Contributor

This was a bit tedious because while pyCPTv2 data files can be known and read entirely from their parent path, the tsv case needs starts and leads/targets informed by the config files, read one pair of S L/target at a time, and can take L/target as one or the other.
This creates a chicken and egg issue with respect to reading data and initializing the Start and Target menus. Given that we don't expect new tsv files being produced, I tried to address this following 2 principles:

  • make the least changes to the tsv case
  • make the tsv case "transparent" meaning that all of it can be removed without altering anything else

There are 2 commits only because I had to change branches to work on something else so they are arbitrary, better to see the thing as a whole.

Functions Notes give some more details of the management of the 2 cases. I am also squeezing comments in the PR for clarification.

On top of the tsv/nc considerations, there are 2 cases: single or multiple targets. It matters both for data reading and app design. If multiple, the L dimension is introduced in the dataset and a target menu is displayed in the app. If single, there's no L dimention, and no target menu (only start).

Tested both single/multiple target cases but only for nc files. I don't have tsv files for Senegal. We could generate some or write a config file for tsv files we do have. This could be done now, or if/when ever tsv files cases arise in the future.

@remicousin remicousin self-assigned this Apr 24, 2025
if 'obs.nc' in map(lambda x: str(x.name), children):
fcst_mu, fcst_var, obs = read_pycptv2dataset_single_target(data_path)
fcst_ds, obs = read_pycptv2dataset_single_target(data_path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to change the rationale throughout from outputting tuples of xr.DataArrays to xr.Datasets of xr.DataArrays because:

  1. so the outputs in the both nc/tsv cases be the same (tsv can also outputs hcst)
  2. because I need to make ds-wise manipulations in the nc case where the Ts can either remain coordinates (single target) or become variables (multiple targets)

)
fcst_ds, new_fcst_ds = xr.align(fcst_ds, new_fcst_ds, join="outer")
fcst_ds = fcst_ds.fillna(new_fcst_ds)
obs_slices.append(new_obs)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that we must never had run into the case of multiple lead times for multiple targets as the previous code was not covering that case. We only got the case of multiple targets with 1 same lead time covered... multiple targets and leads requite L to become a new dimension, not just another coordinate. Since we read and append the data target after target, it creates gappy S, L squares that must be either filled in or appended as we go. If we were reading and appending S after S, it would be more simple, but it would raise other problems (it's not how we organized the data files).

L = (((
ds.isel(S=[0])["Ti"].dt.month - ds.isel(S=[0])["S"].dt.month
) + 12) % 12).data
ds = ds.reset_coords(["T", "Ti", "Tf"]).expand_dims(dim={"L": L})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to decide at the root here (open_var) whether there will be a new L dim to append against or not


lead_time_control_style = dict(
lead_time_control_style, display=target_display
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important to remember that there is always a target menu even in the single target case: it's just not displayed. This allows not to have cases down the road for getting the target information

for option in targets :
if option["value"] == lead_time :
target = option["label"]
return f'{target} {config["variable"]} Forecast issued {start_date}'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to read data and compute target again: just take it from the menus, even in single target case.

# Forecast CDF
fcst_q, fcst_mu = xr.broadcast(quantiles, fcst_mu)
fcst_q, fcst_mu = xr.broadcast(quantiles, fcst_ds["deterministic"])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

until then, could work with the whole fcst_ds. From then, need to get the individual variables of interest

Copy link
Collaborator

@aaron-kaplan aaron-kaplan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment before making a detailed review: do we need to continue supporting the tsv case? Can we not just say that if you want to upgrade your maprooms you have to also upgrade PyCPT?

@remicousin
Copy link
Contributor Author

A comment before making a detailed review: do we need to continue supporting the tsv case? Can we not just say that if you want to upgrade your maprooms you have to also upgrade PyCPT?

I am very fine with that. In theory, there could be projects funding a new forecast Maprooms without updating the partner to upgraded PyCPT... In practice, they aren't going to be any (such) projects.

@remicousin
Copy link
Contributor Author

We might want to wrap this one in a way or another: we would need flex_fcst MR to port PepsiCo classic forecast MRs to that; and that would lay the ground for such MRs for IRI forecasts as well, if ever part of the shutdown mix.

@aaron-kaplan
Copy link
Collaborator

I vote to remove the tsv version. Ethiopia AA is now using the latest version of PyCPT. I don't know if there are other programs still using an older version, but if there are, they're not funded.

@aaron-kaplan
Copy link
Collaborator

Also note this branch now has conflicts.

@remicousin
Copy link
Contributor Author

I vote to remove the tsv version. Ethiopia AA is now using the latest version of PyCPT. I don't know if there are other programs still using an older version, but if there are, they're not funded.

done

@remicousin
Copy link
Contributor Author

Also note this branch now has conflicts.

rebased

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants