Skip to content

Conversation

JohnHalleyGotway
Copy link
Collaborator

@JohnHalleyGotway JohnHalleyGotway commented Apr 22, 2025

Expected Differences

This PR is the second of multiple ones to enhance MET by expanding its use of OpenMP. This PR focuses on changes in the vx_util library and modifies 17 files:

  • Some minor whitespace formatting tweaks that are self-explanatory.

  • In test_env_var.sh, stop hard-coding OMP_NUM_THREADS = 4 in case the GHA hardware improves to provide more threads.

  • In unit_met_test_scripts.xml, add a couple of missing expected output file names.

  • In data_cube.h/.cc, add parallelization and apply consistent formatting.

  • In data_plane.h/.cc...

    • Add parallelization, noting in particular the instances of #pragma omp for reduction(...) to allow for parallelization when computing sums, mins, and maxes.
    • Define a new DataPlane::set_times() function to avoid redundant code.
    • Call DataPlane::set_times() from vx_regrid.cc, vx_regrid_budget.cc, and gen_vx_mask.cc.
    • Reimplement the DataPlaneArray dynamically-allocated memory using STL::vectors to satisfy SonarQube.
    • Define a new DataPlaneArray::at() function to provide a non-const reference to the array values to solve compilation issues.
    • Call DataPlaneArray::at() from read_climo.cc, ensemble_stat.cc, and point_stat.cc.
  • In handle_openmp.cc, update the OMP_NUM_THREADS validation logic and refine log messages.

  • In interp_util.cc, no parallelization changes but avoid dynamically-allocated memory to satisfy SonarQube.

  • Do these changes introduce new tools, command line arguments, or configuration file options? [No]

    If yes, please describe:

  • Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [No]

    If yes, please describe:

Pull Request Testing

  • Describe testing already performed for these changes:

    Used GHA to verify that not changes to the output are introduced. However, I did not quantify runtime improvements.

  • Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

  • @jprestop please review the code changes in vx_util carefully. I focused on parallelizing loops over 2D grids of data (e.g. the DataPlane class). You could check to see if there's any loops I missed that you'd recommend be parallelized as well. Please double-check that the overall number of SonarQube code smells are reduced.

  • @briannen please do some performance testing, comparing to the develop branch AFTER feature_3132_openmp_vx_regrid was merged on 4/25/25. I'd recommend running Grid-Stat using RRFS input data to verify multiple fields and see if any runtime improvements are discernable.

  • Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]
    No additional doc updates needed.

  • Do these changes include sufficient testing updates? [Yes]
    No additional testing updates needed.

  • Will this PR result in changes to the MET test suite? [No]

    If yes, describe the new output and/or changes to the existing output:

    I confirmed through GHA runs that no diffs are introduced.

  • Will this PR result in changes to existing METplus Use Cases? [No]

    If yes, create a new Update Truth METplus issue to describe them.
    No METplus diffs expected.

  • Do these changes introduce new SonarQube findings? [No]

    If yes, please describe:
    While code smells in "new code" are flagged, the overall number are reduced by 100, from 17,312 in develop to 17,212 for this PR. And I already addressed the easy ones flagged in the new code.

Note that SonarQube does complain about these code blocks:

#pragma omp parallel default(none)

With Extract this nested code block into a separate function.
Recommend that we consider tweaking our SonarQube configuration to stop flagging #pragma lines.

  • Please complete this pull request review by [Wed 4/30/25].

Pull Request Checklist

See the METplus Workflow for details.

  • Review the source issue metadata (required labels, projects, and milestone).
  • Complete the PR definition above.
  • Ensure the PR title matches the feature or bugfix branch name.
  • Define the PR metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the version that will include these changes
    Select: METplus-X.Y Support project for bugfix releases or MET-X.Y Development project for the next coordinated release
  • After submitting the PR, select the ⚙️ icon in the Development section of the right hand sidebar. Search for the issue that this PR will close and select it, if it is not already selected.
  • After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
  • Close the linked issue and delete your feature or bugfix branch from GitHub.

… in the vx_regrid library to avoid duplication.
… using stl::vector's instead of manual memory management. This required introducing a DataPlaneArray::at(int) accessor function. Modify some library and application code to get things compiling.
…ike apply_mask(). Doing so produced a Grid-Stat runtime failure.
… parallelized regridding actually runs well and produces the expected output.
…runtime segfault in the unit_regrid.xml unit test.
… the maximum number of available threads. ci-run-unit
…nce that triggers a failure when ncdiff is used to compute differences: libgomp: Invalid value for environment variable OMP_NUM_THREADS ... ci-run-unit
…te unit_met_test_scripts.xml to add missing expected output file names. ci-run-unit
…uction(+:n) to avoid race conditions updating the counter. ci-run-unit
@JohnHalleyGotway JohnHalleyGotway added this to the MET-12.1.0 milestone Apr 22, 2025
@github-project-automation github-project-automation bot moved this to 🩺 Needs Triage in METplus-6.1 Development Apr 22, 2025
@JohnHalleyGotway JohnHalleyGotway moved this from 🩺 Needs Triage to 🔎 In review in METplus-6.1 Development Apr 22, 2025
@JohnHalleyGotway JohnHalleyGotway linked an issue Apr 22, 2025 that may be closed by this pull request
8 tasks
@JohnHalleyGotway JohnHalleyGotway marked this pull request as ready for review April 25, 2025 17:30
@github-project-automation github-project-automation bot moved this from 🔎 In review to 🏁 Done in METplus-6.1 Development Apr 25, 2025
@jprestop jprestop moved this from 🏁 Done to 🔎 In review in METplus-6.1 Development Apr 29, 2025
@JohnHalleyGotway
Copy link
Collaborator Author

@jprestop, FYI, I compiled this feature branch as the met_test user in seneca:/d1/projects/MET/MET_pull_requests/met-12.1.0/rc1/MET-feature_3132_openmp_vx_util.

cd /d1/projects/MET/MET_pull_requests/met-12.1.0/rc1
git clone https://github.com/dtcenter/MET MET-feature_3132_openmp_vx_util
cd MET-feature_3132_openmp_vx_util
git checkout feature_3132_openmp_vx_util
source internal/scripts/environment/development.seneca
./configure --prefix=`pwd` --enable-all
make -j
make install test

Copy link
Collaborator

@jprestop jprestop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohnHalleyGotway I have carefully reviewed the changes in vx_util. Thank you for meeting with me and answering my specific questions.

Upon a quick review for checking to see if there are any loops you missed that I'd recommend be parallelized, nothing jumped out at me.

I'll also note that with regard to "Recommend that we consider tweaking our SonarQube configuration to stop flagging #pragma lines.", we did just that in our meeting. We created a new Quality Profile "Sonar Way with Pragma" in order to deactivate "Nested code blocks should not be used" which is the rule (c:S1199) that "Extract this nested code block into a separate function" falls under. We set the MET Current Quality Profile to "Sonar Way with Pragma" and saw a reduction in code smells by the expected 17. I also confirmed that the overall number of SonarQube code smells was reduced by the number you indicated. Thanks for all of your work on this task.

@briannen
Copy link

briannen commented May 1, 2025

@JohnHalleyGotway I'm working on this today. Sorry it's late!

@briannen
Copy link

briannen commented May 1, 2025

I have tested these changes using grid_stat to verify RRFS output with MRMS radar reflectivity.
time grid_stat mpashn4nssl_2024051612f18.grib2 MergedReflectivityQCComposite_00.50_20240517-060000.grib2 GridStatConfig -outdir output_dir

For reference, I used /d1/projects/MET/MET_regression/develop/NB20250426/MET-develop/bin, both with and without OMP_NUM_THREADS set. I used OMP_NUM_THREADS of 1, 2, 4, 8, 16, 32, and 40.

I did the same using /d1/projects/MET/MET_pull_requests/met-12.1.0/rc1/MET-feature_3132_openmp_vx_util/bin to test this PR.

The results are as follows:

  • Reference run: 13.748s
  • OMP reference runs
    • 1 thread: 13.562s
    • 2 threads: 9.594s
    • 4 threads: 7.510s
    • 8 threads: 7.096s
    • 16 threads: 5.965s
    • 32 threads: 6.441s
    • 40 threads: 6.411s
  • PR test run: 13.477s
  • OMP PR test runs
    • 1 thread: 13.459s
    • 2 threads: 14.179s
    • 4 threads: 14.167s
    • 8 threads: 14.113s
    • 16 threads: 13.584s
    • 32 threads: 14.219s
    • 40 threads: 14.167s

Unfortunately, it doesn't seem that increasing OMP_NUM_THREADS improves the run time of grid_stat using this PR. It does increase the run time using the nightly build from 20250426. I haven't approved this yet due to these results.

@JohnHalleyGotway JohnHalleyGotway mentioned this pull request May 1, 2025
17 tasks
@JohnHalleyGotway
Copy link
Collaborator Author

JohnHalleyGotway commented May 2, 2025

Yikes! @briannen this is my fault. The code I asked you to test was ACTUALLY main_v12.0 and not the feature_3132_openmp_vx_util branch for my pull request. Sorry for that!

> runas met_test
> cd /d1/projects/MET/MET_pull_requests/met-12.1.0/rc1/MET-feature_3132_openmp_vx_util
> git branch
main_v12.0

That explains the lack of an improvement!

I just fixed that by checking out the correct branch and recompiling. Are you able to retest?

@briannen, FYI, I did find your test files on seneca and checked the timing of running grid_stat with OMP_NUM_THREADS = 40.

Running Grid-Stat from the nightly build area takes about 6.3 seconds, while running it for my pull request takes about 5.8 seconds. And that modest runtime improvement of about 0.5 seconds is about what I'd expect based on the proposed changes.

Copy link

@briannen briannen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohnHalleyGotway I reran the tests and can confirm that the run time does improve.

  • Reference run: 14.782s
    • OMP reference runs
      • 1 thread: 14.165s
      • 2 threads: 10.305s
      • 4 threads: 8.200s
      • 8 threads: 7.175s
      • 16 threads: 6.721s
      • 32 threads: 6.488s
      • 40 threads: 6.543s
  • PR test run: 14.523s
    • OMP PR test runs
      • 1 thread: 14.311s
      • 2 threads: 9.925s
      • 4 threads: 7.775s
      • 8 threads: 6.574s
      • 16 threads: 5.986s
      • 32 threads: 5.954s
      • 40 threads: 5.836s

@JohnHalleyGotway JohnHalleyGotway merged commit 4b5a083 into develop May 5, 2025
39 of 40 checks passed
@github-project-automation github-project-automation bot moved this from 🔎 In review to 🏁 Done in METplus-6.1 Development May 5, 2025
@JohnHalleyGotway JohnHalleyGotway changed the title Feature #3132 openmp_vx_util Feature #3132 openmp vx util May 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: 🏁 Done
Development

Successfully merging this pull request may close these issues.

Apply OpenMP to the vx_util library
3 participants