JohnHalleyGotway commented Aug 23, 2024

Expected Differences

@j-opatz or @mpm-meto recommend adding a METplus use case to demonstrate this new functionality. That use case could be set up to run Series-Analysis once for a list of times. For each time, only provide input data for that single timestep but use the -aggr command line option to point to the Series-Analysis output from the previous time step.
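A minimal dry-run sketch of what such a use case might do, for illustration only. It prints one `series_analysis` command per time and passes the previous step's output through `-aggr`. The file naming (`fcst_<time>.nc`, `obs_<time>.nc`, `sa_out_<time>.nc`), the config name `SeriesAnalysisConfig`, and the helper `print_aggr_cmds` are hypothetical placeholders, not part of this PR.

```shell
#!/bin/sh
# Dry run: print (do not execute) one series_analysis command per time.
# File names and the config name below are hypothetical placeholders.
print_aggr_cmds() {
    prev=""
    for t in "$@"; do
        out="sa_out_${t}.nc"
        cmd="series_analysis -fcst fcst_${t}.nc -obs obs_${t}.nc -config SeriesAnalysisConfig -out ${out}"
        # From the second time onward, aggregate with the previous output.
        if [ -n "$prev" ]; then
            cmd="${cmd} -aggr ${prev}"
        fi
        echo "$cmd"
        prev="$out"
    done
}

print_aggr_cmds 2024082300 2024082306 2024082312
```

The first printed command has no `-aggr` option because there is no earlier output to aggregate; each later command points `-aggr` at the output file written by the step before it.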

Note that Series-Analysis runs much slower with the aggregation logic, since aggregation requires many more output statistics to be written.
For example, in the diff output, the number of NetCDF variables is increased from 8 to 30:

ncdump -h climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG_TRUTH.nc | grep "float " | wc -l
8
ncdump -h climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG_AGGR_OUTPUT.nc | grep "float " | wc -l
30

And another run increases the count from 30 to 52 variables.

So while this is a nice feature, it does require more I/O and storage, which slows it down.
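The variable counts above can be reproduced with a small helper. This is a sketch that uses the same `"float "` pattern as the commands shown, with `grep -c` in place of `grep | wc -l`; the function name `count_float_vars` is made up for illustration.

```shell
#!/bin/sh
# Count lines containing "float " in an `ncdump -h` header read from stdin.
# Note: grep -c prints 0 but returns a non-zero status when nothing matches.
count_float_vars() {
    grep -c "float "
}

# Usage (requires ncdump and a real Series-Analysis output file):
#   ncdump -h climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG_TRUTH.nc | count_float_vars
```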

  • Do these changes introduce new tools, command line arguments, or configuration file options? [Yes]

    If yes, please describe:

    • Adds a new -aggr command line option to the Series-Analysis tool to provide previously generated Series-Analysis output data to be aggregated.
  • Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [No]

    If yes, please describe:

Pull Request Testing

  • Describe testing already performed for these changes:

My testing approach is described below:

  1. Run Series-Analysis with a full set of input data.
  2. Run Series-Analysis with the first half of the inputs.
  3. Run Series-Analysis with the second half of the inputs and the -aggr command line option pointing to the output from 2.
  4. Use ncview to visually confirm that the outputs from 1. and 3. match.
  • I used this approach to test the aggregation of CTC, MCTC, SL1L2, SAL1L2, and PCT line types.

  • Library updates were needed in order for anomaly correlation (CNT: ANOM_CORR) to be aggregated correctly.

  • Special logic is added to Series-Analysis to handle aggregating the PSTD BRIERCL and BSS columns.
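
The four-step check described above (full run, first half, second half with -aggr, then a visual comparison) can be sketched as a dry run. Everything named here is a hypothetical placeholder: the list files (`fcst_all.txt`, `fcst_half1.txt`, and so on), the output names, and the helpers `split_list` and `print_test_cmds`. The sketch assumes the inputs are passed to Series-Analysis as ASCII file lists with one path per line.

```shell
#!/bin/sh
# Split a file list (one path per line, newline-terminated) into two halves.
# $1 = input list, $2 = first-half output, $3 = second-half output.
# Assumes the list has at least two lines.
split_list() {
    total=$(wc -l < "$1" | tr -d ' ')
    half=$((total / 2))
    head -n "$half" "$1" > "$2"
    tail -n "+$((half + 1))" "$1" > "$3"
}

# Print the commands for steps 1 to 4 without running them.
# $1 = Series-Analysis config file name (supplied by the caller).
print_test_cmds() {
    echo "series_analysis -fcst fcst_all.txt -obs obs_all.txt -config $1 -out full.nc"
    echo "series_analysis -fcst fcst_half1.txt -obs obs_half1.txt -config $1 -out half1.nc"
    echo "series_analysis -fcst fcst_half2.txt -obs obs_half2.txt -config $1 -out aggr.nc -aggr half1.nc"
    echo "ncview full.nc"
    echo "ncview aggr.nc"
}

# Usage:
#   split_list fcst_all.txt fcst_half1.txt fcst_half2.txt
#   split_list obs_all.txt  obs_half1.txt  obs_half2.txt
#   print_test_cmds SeriesAnalysisConfig
```

If the aggregation is working, `full.nc` and `aggr.nc` should look the same when viewed side by side.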

  • Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

    • Review the code changes, review the docs updates, and review the differences caused by changes to the unit tests.
    • Use ncview to see the modified output and sanity check each of the outputs. Note that while the list of stats requested has changed slightly, generally speaking the TRUTH data should look the same as the AGGR OUTPUT data. So visualize them side-by-side to confirm.
  • Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]

  • Do these changes include sufficient testing updates? [Yes]
    Changes to unit_series_analysis.xml demonstrate the aggregation of CTC, MCTC, SL1L2, and PCT line types.
    Changes to unit_climatology_1.0deg.xml demonstrate the aggregation of anomaly statistics, including CNT:ANOM_CORR and PSTD:BRIERCL/BSS.

  • Will this PR result in changes to the MET test suite? [Yes]

    If yes, describe the new output and/or changes to the existing output:

    Modified output from Series-Analysis based on changes to unit tests.

  • Will this PR result in changes to existing METplus Use Cases? [Yes]

    If yes, create a new Update Truth METplus issue to describe them.

It did cause an unanticipated diff in METplus Use Case output, as noted in dtcenter/METplus#2667.

  • Do these changes introduce new SonarQube findings? [No]

    If yes, please describe:

  • Develop currently has 19,519 code smells and the scan for this PR reduces that overall count to 19,483 code smells.

  • Please complete this pull request review by [Fri 8/30/24].

Pull Request Checklist

See the METplus Workflow for details.

  • Review the source issue metadata (required labels, projects, and milestone).
  • Complete the PR definition above.
  • Ensure the PR title matches the feature or bugfix branch name.
  • Define the PR metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the version that will include these changes
    Select: Coordinated METplus-X.Y Support project for bugfix releases or MET-X.Y.Z Development project for official releases
  • After submitting the PR, select the ⚙️ icon in the Development section of the right hand sidebar. Search for the issue that this PR will close and select it, if it is not already selected.
  • After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
  • Close the linked issue and delete your feature or bugfix branch from GitHub.

…or the CTC, MCTC, SL1L2, and PCT line types.
…st to unit_series_analysis.xml to demonstrate.
…roperly using the old aggregate data and the new pair data.
…once instead of storing data value by value for each point.
…es are not present in the input -aggr file.
…1L2Info, and NBRCNTInfo. The metadata settings, like fthresh and othresh, were not being passed to the output.
Development

Successfully merging this pull request may close these issues.

Enhance Series-Analysis to read its own output and incrementally update output statistics over time