Fix integer overflow in Grid-Diag. #1886

@JohnHalleyGotway

Describe the Problem

@jwolff-ncar ran into an integer overflow problem in the Grid-Diag tool while comparing RRFS reflectivity data to MRMS. When running for a single time, the resulting 1-D and 2-D histograms are fine. However, when processing data for 400 days, the counts in the first bin change from positive values to negative ones, which is likely the result of integer overflow. As of MET version 10.0.0, grid_diag stores its counts as integers. This issue is to make the following changes:

  • Update Grid-Diag to switch from integer counts to an unsigned data type that can store larger values.
  • Ensure that the chosen data type can be stored well in the output NetCDF file.
  • Update Grid-Diag to print a clear log that states the verification domain actually being used (similar to what point_stat and grid_stat do).
  • Update Grid-Diag to check for the use of regrid.to_grid = FCST or OBS. If present, error out, because FCST and OBS are not well defined in this tool.
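
To make the choice of data type more concrete, here is a rough sketch that compares the maximum value of a few candidate integer types and estimates how many runs each could absorb before a bin overflows. The growth rate `COUNTS_PER_RUN` and the helper `runs_until_overflow` are assumptions made up for illustration; they are not taken from Grid-Diag.

```python
# Illustrative only: ranges of candidate count types and how long each would
# last at a hypothetical, constant growth rate per run.
TYPE_MAX = {
    "int32": 2**31 - 1,
    "uint32": 2**32 - 1,
    "int64": 2**63 - 1,
    "uint64": 2**64 - 1,
}

# Assumed growth of the busiest bin per run (not a measured value).
COUNTS_PER_RUN = 10**8


def runs_until_overflow(max_value, counts_per_run):
    """Number of whole runs whose accumulated count still fits in max_value."""
    return max_value // counts_per_run


if __name__ == "__main__":
    for name, max_value in TYPE_MAX.items():
        runs = runs_until_overflow(max_value, COUNTS_PER_RUN)
        print(f"{name:>6}: max = {max_value}, runs before overflow = {runs}")
```

Under this assumed rate, the 32-bit types run out after a few dozen runs, while the 64-bit types leave many orders of magnitude of headroom. Note that the 64-bit and unsigned integer types belong to the netCDF-4 data model rather than the classic one, which may matter for the output file format.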

Also, give this some thought. No matter what data type we choose, there will always be an upper limit. How should this be handled? Instead of storing integer counts, should we store doubles and increment by 1/n instead of 1, where n is the total number of grid points?
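
As a rough way to think about the trade-off raised above, the sketch below shows two properties of double-precision values that bear on it: integers are represented exactly only up to 2**53, and repeatedly adding 1/n accumulates rounding error. The function names are hypothetical, and `normalize_counts` is just one possible alternative (keep integer counts and divide once at the end), not something the issue prescribes.

```python
# Illustrative only: behavior of double-precision accumulation.

# Doubles represent every integer exactly only up to 2**53.
EXACT_LIMIT = 2.0**53


def accumulate_fraction(n_points):
    """Add 1/n_points to a running total n_points times."""
    total = 0.0
    step = 1.0 / n_points
    for _ in range(n_points):
        total += step
    return total


def normalize_counts(counts):
    """Convert integer counts to relative frequencies with a single division."""
    total = sum(counts)
    return [count / total for count in counts]


if __name__ == "__main__":
    # Beyond 2**53, incrementing a double by 1 no longer changes it.
    print(EXACT_LIMIT + 1.0 == EXACT_LIMIT)
    # Repeated addition of 1/n need not sum to exactly 1.
    print(accumulate_fraction(10))
    # Dividing integer counts once avoids the accumulated error.
    print(normalize_counts([3, 1]))
```

So a double-based count still has a practical ceiling, and incrementing by 1/n trades the overflow problem for small rounding errors in every bin.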

Here's the information from Jamie:

In hera:/scratch2/BMC/fv3lam/RRFS_baseline/expt_dirs/RRFS_baseline_summer/GridDiag.

Under there you will see a scripts/run_metplus_griddiag.ksh script that is driving what I am doing (you can also find the path to the config/conf files in that script).

To start, I ran one day (2019041500 for f12-36) and the resulting file is:
grid_diag_out_FV3_RRFS_v1alpha_3km_summer_2019041500-2019041500_f12-36.nc

I have plotted the output with some python scripts (which you can see under scripts/), but if you simply use ncview you can also see some of the behavior I am trying to understand. This run compared composite reflectivity from the model with MRMS observed composite reflectivity.

The histograms look reasonable, and the 2-D histogram has a significant amount of data in the lower left corner (with values in excess of 1e8).

However, when I run this same driver script and aggregate over the entire season, I am having trouble understanding the output. In this case, what seems very strange is that the histograms have negative numbers for the first several bins (for both observed and forecast). Why would that be? The 2-D histogram looks very odd too, as all of the values in the bottom left corner are very large negative numbers.
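
The symptom described here is consistent with a 32-bit signed counter wrapping around. The sketch below emulates that behavior, assuming two's-complement wraparound (what typically happens in practice, although signed overflow is formally undefined in C++) and assuming the busiest bin grows by a constant 1e8 per run. Both the constant rate and the helper names are hypothetical and are not taken from the Grid-Diag source.

```python
# Illustrative only: emulate a 32-bit signed counter that wraps on overflow.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Return value as a two's-complement 32-bit signed integer would hold it."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


def first_negative_run(counts_per_run, max_runs):
    """Return (run number, stored value) for the first run whose stored
    count is negative, or None if that never happens within max_runs."""
    total = 0
    for run in range(1, max_runs + 1):
        total += counts_per_run
        stored = wrap_int32(total)
        if stored < 0:
            return run, stored
    return None


if __name__ == "__main__":
    # Assumed: roughly 1e8 counts land in the busiest bin on every run.
    print(first_negative_run(10**8, 400))
```

With these assumptions the true count passes `INT32_MAX` (2147483647) on the 22nd run, and the stored value becomes a large negative number, much like the bins in the season-long output.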

Expected Behavior

The grid_diag tool should run on this data without integer overflow.

Environment

Describe your runtime environment:
1. Machine: (e.g. HPC name, Linux Workstation, Mac Laptop)
2. OS: (e.g. RedHat Linux, MacOS)
3. Software version number(s)

To Reproduce

Describe the steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
Post relevant sample data following these instructions:
https://dtcenter.org/community-code/model-evaluation-tools-met/met-help-desk#ftp

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required: John HG
  • Select scientist(s) or no scientist required: no scientist required but have @jwolff-ncar review the PR.

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Organization level Project for support of the current coordinated release
  • Select Repository level Project for development toward the next official release or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next bugfix version

Define Related Issue(s)

Consider the impact to the other METplus components.

Bugfix Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of main_<Version>.
    Branch name: bugfix_<Issue Number>_main_<Version>_<Description>
  • Fix the bug and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests: Chose not to add new unit tests for this edge case, which would take a long time to run.
  • Add/update documentation: The existing Grid-Diag documentation makes no mention of the output variable type, so there's nowhere to note the switch from ncInt to ncInt64.
  • Push local changes to GitHub.
  • Submit a pull request to merge into main_<Version>.
    Pull request: bugfix <Issue Number> main_<Version> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Linked issues
    Select: Organization level software support Project for the current coordinated release
    Select: Milestone as the next bugfix version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Complete the steps above to fix the bug on the develop branch.
    Branch name: bugfix_<Issue Number>_develop_<Description>
    Pull request: bugfix <Issue Number> develop <Description>
    Select: Reviewer(s) and Linked issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Close this issue.
