Splitcptread #541

remicousin · 2025-04-24T14:22:36Z

This was a bit tedious: pyCPTv2 data files can be discovered and read entirely from their parent path, whereas the tsv case needs the starts and leads/targets supplied by the config files, is read one (S, L/target) pair at a time, and can take either a lead or a target as input.
This creates a chicken-and-egg problem between reading the data and initializing the Start and Target menus. Given that we don't expect new tsv files to be produced, I tried to address this following 2 principles:

make the least changes to the tsv case
make the tsv case "transparent" meaning that all of it can be removed without altering anything else

There are 2 commits only because I had to change branches to work on something else, so the split is arbitrary; it is better to review the change as a whole.

The functions' Notes sections give more details on how the 2 cases are handled. I am also adding comments in the PR for clarification.

On top of the tsv/nc considerations, there are 2 cases: single or multiple targets. It matters both for data reading and app design. If multiple, the L dimension is introduced in the dataset and a target menu is displayed in the app. If single, there's no L dimension and no target menu is displayed (only start).

Tested both single/multiple target cases but only for nc files. I don't have tsv files for Senegal. We could generate some or write a config file for tsv files we do have. This could be done now, or if and when tsv cases arise in the future.

enacts/config-test-enacts.yaml

enacts/flex_fcst/cpt.py

remicousin · 2025-04-24T14:30:51Z

enacts/flex_fcst/cpt.py

    if 'obs.nc' in map(lambda x: str(x.name), children):
-        fcst_mu, fcst_var, obs = read_pycptv2dataset_single_target(data_path)
+        fcst_ds, obs = read_pycptv2dataset_single_target(data_path)


I had to change the approach throughout, from returning tuples of xr.DataArrays to returning xr.Datasets of xr.DataArrays, because:

the outputs should be the same in both the nc and tsv cases (tsv can also output hcst)

I need to make dataset-wide manipulations in the nc case, where the Ts can either remain coordinates (single target) or become variables (multiple targets)

remicousin · 2025-04-24T14:35:38Z

enacts/flex_fcst/cpt.py

+                )
+                fcst_ds, new_fcst_ds = xr.align(fcst_ds, new_fcst_ds, join="outer")
+                fcst_ds = fcst_ds.fillna(new_fcst_ds)
+                obs_slices.append(new_obs)


It turns out that we must never have run into the case of multiple lead times for multiple targets, as the previous code did not cover it. We only covered the case of multiple targets sharing one lead time. Multiple targets and leads require L to become a new dimension, not just another coordinate. Since we read and append the data target after target, this creates gappy (S, L) squares that must be either filled in or appended to as we go. Reading and appending S after S would be simpler, but it would raise other problems (it's not how we organized the data files).
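A minimal sketch of the outer-align-then-fill step on toy data may help picture the gappy (S, L) squares. The `toy_target` helper, the dates, leads and values are all made up for illustration; only the `xr.align(..., join="outer")` and `fillna` calls mirror the diff above.

```python
import numpy as np
import pandas as pd
import xarray as xr

S_all = pd.date_range("2025-01-01", periods=3, freq="MS")


def toy_target(starts, lead, value):
    """Hypothetical stand-in for one target's forecast on its own (S, L) points."""
    return xr.Dataset(
        {"deterministic": (("S", "L"), np.full((len(starts), 1), value))},
        coords={"S": starts, "L": [lead]},
    )


# Target A exists for the first two starts at lead 1,
# target B for the last two starts at lead 2.
fcst_ds = toy_target(S_all[:2], 1, 10.0)
new_fcst_ds = toy_target(S_all[1:], 2, 20.0)

# The outer join puts both datasets on the union of S and L, padding with NaN
fcst_ds, new_fcst_ds = xr.align(fcst_ds, new_fcst_ds, join="outer")
# Fill the gaps of the accumulated dataset with the newly read target
fcst_ds = fcst_ds.fillna(new_fcst_ds)
```

In this toy case the result is a 3 x 2 (S, L) square in which two corners remain NaN, because no target covers them.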

remicousin · 2025-04-24T14:37:52Z

enacts/flex_fcst/cpt.py

+        L = (((
+            ds.isel(S=[0])["Ti"].dt.month - ds.isel(S=[0])["S"].dt.month
+        ) + 12) % 12).data
+        ds = ds.reset_coords(["T", "Ti", "Tf"]).expand_dims(dim={"L": L})


We need to decide here at the root (open_var) whether or not there will be a new L dim to append against.
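As an illustration of the lead computation in the diff above, here is a sketch on a toy dataset. The variable name, dates and values are assumptions, and only `Ti` is carried here, whereas the real code also handles `T` and `Tf`.

```python
import pandas as pd
import xarray as xr

S = pd.to_datetime(["2024-11-01", "2024-12-01"])
Ti = pd.to_datetime(["2025-01-01", "2025-02-01"])  # first month of each target season
ds = xr.Dataset(
    {"deterministic": ("S", [1.0, 2.0])},
    coords={"S": S, "Ti": ("S", Ti)},
)

first = ds.isel(S=[0])
# Months from start to target start; +12 then %12 wraps across the year boundary
L = (((first["Ti"].dt.month - first["S"].dt.month) + 12) % 12).data
# Ti becomes a data variable and L a new length-1 dimension to append against
ds = ds.reset_coords(["Ti"]).expand_dims(dim={"L": L})
```

With a November start and a target season starting in January, the lead comes out as 2 months, and every variable gains a leading L dimension of size 1.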

remicousin · 2025-04-24T14:39:35Z

enacts/flex_fcst/maproom.py

-
+        lead_time_control_style = dict(
+            lead_time_control_style, display=target_display
+        )


Important to remember that there is always a target menu, even in the single target case: it's just not displayed. This avoids having to handle separate cases down the road when getting the target information.
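A small sketch of the always-present-but-hidden menu idea follows. The function name and the CSS `display` values are assumptions; the diff only shows that `display` is set from `target_display`.

```python
def target_menu_style(base_style, n_targets):
    """Return a copy of base_style that hides the menu when there is a single target."""
    # Assumed CSS values: the menu stays in the layout tree but is not rendered
    target_display = "inline-block" if n_targets > 1 else "none"
    # dict(mapping, key=value) copies the mapping and overrides one key
    return dict(base_style, display=target_display)


single = target_menu_style({"width": "10em"}, n_targets=1)
multiple = target_menu_style({"width": "10em"}, n_targets=3)
```

Because the component still exists when hidden, callbacks can read its value and options in both cases.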

remicousin · 2025-04-24T14:40:38Z

enacts/flex_fcst/maproom.py

+        for option in targets :
+            if option["value"] == lead_time :
+                target = option["label"]
+        return f'{target} {config["variable"]} Forecast issued {start_date}'


No need to read data and compute target again: just take it from the menus, even in single target case.
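For reference, the lookup can be sketched as a standalone function. `forecast_title`, the option labels and the variable name are hypothetical; the explicit error for a lead time with no matching option is an addition that the diff does not include.

```python
def forecast_title(targets, lead_time, variable, start_date):
    """Build the map title from the target menu options (list of label/value dicts)."""
    # next() with a default avoids an unbound name when no option matches
    target = next(
        (option["label"] for option in targets if option["value"] == lead_time),
        None,
    )
    if target is None:
        raise ValueError(f"no target option for lead time {lead_time!r}")
    return f"{target} {variable} Forecast issued {start_date}"


targets = [
    {"label": "Jan-Mar 2025", "value": 2},
    {"label": "Feb-Apr 2025", "value": 3},
]
title = forecast_title(targets, 3, "Precipitation", "Nov 2024")
```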

remicousin · 2025-04-24T14:44:01Z

enacts/flex_fcst/maproom.py

        # Forecast CDF
-        fcst_q, fcst_mu = xr.broadcast(quantiles, fcst_mu)
+        fcst_q, fcst_mu = xr.broadcast(quantiles, fcst_ds["deterministic"])


Until this point, the code could work with the whole fcst_ds. From here on, it needs the individual variables of interest.
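A sketch of that hand-off on toy data: the dataset travels whole until one variable has to be broadcast against the quantiles. Only the `"deterministic"` name comes from the diff; `"prediction_error_variance"`, the dimension names and the values are assumptions.

```python
import numpy as np
import xarray as xr

fcst_ds = xr.Dataset(
    {
        "deterministic": (("Y", "X"), np.array([[1.0, 2.0], [3.0, 4.0]])),
        # Hypothetical name for the second forecast variable
        "prediction_error_variance": (("Y", "X"), np.ones((2, 2))),
    },
    coords={"Y": [0, 1], "X": [10, 20]},
)
quantiles = xr.DataArray(
    [0.1, 0.5, 0.9], dims="percentile", coords={"percentile": [0.1, 0.5, 0.9]}
)

# Pull out the single variable needed; broadcast gives both arrays the same dims
fcst_q, fcst_mu = xr.broadcast(quantiles, fcst_ds["deterministic"])
```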

enacts/flex_fcst/maproom.py

aaron-kaplan

A comment before making a detailed review: do we need to continue supporting the tsv case? Can we not just say that if you want to upgrade your maprooms you have to also upgrade PyCPT?

enacts/config-test-enacts.yaml

remicousin · 2025-05-12T13:55:16Z

> A comment before making a detailed review: do we need to continue supporting the tsv case? Can we not just say that if you want to upgrade your maprooms you have to also upgrade PyCPT?

I am very fine with that. In theory, there could be projects funding new forecast Maprooms without upgrading the partner to the latest PyCPT... In practice, there aren't going to be any (such) projects.

remicousin · 2025-05-19T12:10:53Z

We might want to wrap this one up one way or another: we would need the flex_fcst MR to port the PepsiCo classic forecast MRs to it, and that would lay the groundwork for such MRs for IRI forecasts as well, if they ever become part of the shutdown mix.

aaron-kaplan · 2025-05-19T15:48:11Z

I vote to remove the tsv version. Ethiopia AA is now using the latest version of PyCPT. I don't know if there are other programs still using an older version, but if there are, they're not funded.

aaron-kaplan · 2025-05-19T15:48:34Z

Also note this branch now has conflicts.

remicousin · 2025-05-20T17:02:34Z

> I vote to remove the tsv version. Ethiopia AA is now using the latest version of PyCPT. I don't know if there are other programs still using an older version, but if there are, they're not funded.

done

remicousin · 2025-05-20T17:17:38Z

> Also note this branch now has conflicts.

rebased

remicousin added the forecast label Apr 24, 2025

remicousin self-assigned this Apr 24, 2025

remicousin requested review from aaron-kaplan and xchourio April 24, 2025 14:46

aaron-kaplan reviewed May 9, 2025

remicousin added 4 commits May 20, 2025 13:04

left to test single target cast and document functions

013af35

splitting CPT files reading from app

584e8fa

stripped out of all things cpt-tsv

424943d

keep forgetting saving file after solving conflicts and before git-add

4ac5623

remicousin force-pushed the splitcptread branch from 525c0a5 to 4ac5623 Compare May 20, 2025 17:17
