Feature #3132 openmp vx util #3136

JohnHalleyGotway · 2025-04-22T22:28:43Z

Expected Differences

This PR is the second of multiple ones to enhance MET by expanding its use of OpenMP. This PR focuses on changes in the vx_util library and modifies 17 files:

Some minor whitespace formatting tweaks that are self-explanatory.
In test_env_var.sh, stop hard-coding OMP_NUM_THREADS = 4 in case the GHA hardware improves to provide more threads.
In unit_met_test_scripts.xml, add a couple of missing expected output file names.
In data_cube.h/.cc, add parallelization and apply consistent formatting.
In data_plane.h/.cc...
- Add parallelization, noting in particular the instances of #pragma omp for reduction(...) to allow for parallelization when computing sums, mins, and maxes.
- Define a new DataPlane::set_times() function to avoid redundant code.
- Call DataPlane::set_times() from vx_regrid.cc, vx_regrid_budget.cc, and gen_vx_mask.cc.
- Reimplement the DataPlaneArray dynamically-allocated memory using STL::vectors to satisfy SonarQube.
- Define a new DataPlaneArray::at() function to provide a non-const reference to the array values to solve compilation issues.
- Call DataPlaneArray::at() from read_climo.cc, ensemble_stat.cc, and point_stat.cc.
In handle_openmp.cc, update the OMP_NUM_THREADS validation logic and refine log messages.
In interp_util.cc, no parallelization changes but avoid dynamically-allocated memory to satisfy SonarQube.
Do these changes introduce new tools, command line arguments, or configuration file options? [No]

If yes, please describe:
Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [No]

If yes, please describe:

Pull Request Testing

Describe testing already performed for these changes:

Used GHA to verify that not changes to the output are introduced. However, I did not quantify runtime improvements.
Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:
@jprestop please review the code changes in vx_util carefully. I focused on parallelizing loops over 2D grids of data (e.g. the DataPlane class). You could check to see if there's any loops I missed that you'd recommend be parallelized as well. Please double-check that the overall number of SonarQube code smells are reduced.
@briannen please do some performance testing, comparing to the develop branch AFTER feature_3132_openmp_vx_regrid was merged on 4/25/25. I'd recommend running Grid-Stat using RRFS input data to verify multiple fields and see if any runtime improvements are discernable.
Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]
No additional doc updates needed.
Do these changes include sufficient testing updates? [Yes]
No additional testing updates needed.
Will this PR result in changes to the MET test suite? [No]

If yes, describe the new output and/or changes to the existing output:

I confirmed through GHA runs that no diffs are introduced.
Will this PR result in changes to existing METplus Use Cases? [No]

If yes, create a new Update Truth METplus issue to describe them.
No METplus diffs expected.
Do these changes introduce new SonarQube findings? [No]

If yes, please describe:
While code smells in "new code" are flagged, the overall number are reduced by 100, from 17,312 in develop to 17,212 for this PR. And I already addressed the easy ones flagged in the new code.

Note that SonarQube does complain about these code blocks:

#pragma omp parallel default(none)

With Extract this nested code block into a separate function.
Recommend that we consider tweaking our SonarQube configuration to stop flagging #pragma lines.

Please complete this pull request review by [Wed 4/30/25].

Pull Request Checklist

See the METplus Workflow for details.

Review the source issue metadata (required labels, projects, and milestone).
Complete the PR definition above.
Ensure the PR title matches the feature or bugfix branch name.
Define the PR metadata, as permissions allow.
Select: Reviewer(s) and Development issue
Select: Milestone as the version that will include these changes
Select: METplus-X.Y Support project for bugfix releases or MET-X.Y Development project for the next coordinated release
After submitting the PR, select the ⚙️ icon in the Development section of the right hand sidebar. Search for the issue that this PR will close and select it, if it is not already selected.
After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
Close the linked issue and delete your feature or bugfix branch from GitHub.

…o most MET tools via regridding.

… threads that actually are available.

… in the vx_regrid library to avoid duplication.

… using stl::vector's instead of manual memory management. This required introducing a DataPlaneArray::at(int) accessor function. Modify some library and application code to get things compiling.

…ike apply_mask(). Doing so produced a Grid-Stat runtime failure.

…n all GHA tests more efficiently.

… parallelized regridding actually runs well and produces the expected output.

…nto feature_3132_openmp_vx_util

…runtime segfault in the unit_regrid.xml unit test.

…es that include break statements.

…he compilation.

…wav_1d().

… the maximum number of available threads. ci-run-unit

…nce that triggers a failure when ncdiff is used to compute differences: libgomp: Invalid value for environment variable OMP_NUM_THREADS ... ci-run-unit

…te unit_met_test_scripts.xml to add missing expected output file names. ci-run-unit

…uction(+:n) to avoid race conditions updating the counter. ci-run-unit

…p_vx_util

…rQube code smells.

JohnHalleyGotway · 2025-04-29T17:59:27Z

@jprestop, FYI, I compiled this feature branch as the met_test user in seneca:/d1/projects/MET/MET_pull_requests/met-12.1.0/rc1/MET-feature_3132_openmp_vx_util.

cd /d1/projects/MET/MET_pull_requests/met-12.1.0/rc1
git clone https://github.com/dtcenter/MET MET-feature_3132_openmp_vx_util
cd MET-feature_3132_openmp_vx_util
git checkout feature_3132_openmp_vx_util
source internal/scripts/environment/development.seneca
./configure --prefix=`pwd` --enable-all
make -j
make install test

jprestop

@JohnHalleyGotway I have carefully reviewed the changes in vx_util. Thank you for meeting with me and answering my specific questions.

Upon a quick review for checking to see if there are any loops you missed that I'd recommend be parallelized, nothing jumped out at me.

I'll also note that with regard to "Recommend that we consider tweaking our SonarQube configuration to stop flagging #pragma lines.", we did just that in our meeting. We created a new Quality Profile "Sonar Way with Pragma" in order to deactivate "Nested code blocks should not be used" which is the rule (c:S1199) that "Extract this nested code block into a separate function" falls under. We set the MET Current Quality Profile to "Sonar Way with Pragma" and saw a reduction in code smells by the expected 17. I also confirmed that the overall number of SonarQube code smells was reduced by the number you indicated. Thanks for all of your work on this task.

briannen · 2025-05-01T14:05:36Z

@JohnHalleyGotway I'm working on this today. Sorry it's late!

briannen · 2025-05-01T15:46:58Z

I have tested these changes using grid_stat to verify RRFS output with MRMS radar reflectivity.
time grid_stat mpashn4nssl_2024051612f18.grib2 MergedReflectivityQCComposite_00.50_20240517-060000.grib2 GridStatConfig -outdir output_dir

For reference, I used /d1/projects/MET/MET_regression/develop/NB20250426/MET-develop/bin, both with and without OMP_NUM_THREADS set. I used OMP_NUM_THREADS of 1, 2, 4, 8, 16, 32, and 40.

I did the same using /d1/projects/MET/MET_pull_requests/met-12.1.0/rc1/MET-feature_3132_openmp_vx_util/bin to test this PR.

The results are as follows:

Reference run: 13.748s
OMP reference runs
- 1 thread: 13.562s
- 2 threads: 9.594s
- 4 threads: 7.510s
- 8 threads: 7.096s
- 16 threads: 5.965s
- 32 threads: 6.441s
- 40 threads: 6.411s
PR test run: 13.477s
OMP PR test runs
- 1 thread: 13.459s
- 2 threads: 14.179s
- 4 threads: 14.167s
- 8 threads: 14.113s
- 16 threads: 13.584s
- 32 threads: 14.219s
- 40 threads: 14.167s

Unfortunately, it doesn't seem that increasing OMP_NUM_THREADS improves the run time of grid_stat using this PR. It does increase the run time using the nightly build from 20250426. I haven't approved this yet due to these results.

JohnHalleyGotway · 2025-05-02T16:21:40Z

Yikes! @briannen this is my fault. The code I asked you to test was ACTUALLY main_v12.0 and not the feature_3132_openmp_vx_util branch for my pull request. Sorry for that!

> runas met_test
> cd /d1/projects/MET/MET_pull_requests/met-12.1.0/rc1/MET-feature_3132_openmp_vx_util
> git branch
main_v12.0

That explains the lack of an improvement!

I just fixed that by checking out the correct branch and recompiling. Are you able to retest?

@briannen, FYI, I did find your test files on seneca and checked the timing of running grid_stat with OMP_NUM_THREADS = 40.

Running Grid-Stat from the nightly build area takes about 6.3 seconds, while running it for my pull request takes about 5.8 seconds. And that modest runtime improvement of about 0.5 seconds is about what I'd expect based on the proposed changes.

briannen

@JohnHalleyGotway I reran the tests and can confirm that the run time does improve.

Reference run: 14.782s
- OMP reference runs
  - 1 thread: 14.165s
  - 2 threads: 10.305s
  - 4 threads: 8.200s
  - 8 threads: 7.175s
  - 16 threads: 6.721s
  - 32 threads: 6.488s
  - 40 threads: 6.543s
PR test run: 14.523s
- OMP PR test runs
  - 1 thread: 14.311s
  - 2 threads: 9.925s
  - 4 threads: 7.775s
  - 8 threads: 6.574s
  - 16 threads: 5.986s
  - 32 threads: 5.954s
  - 40 threads: 5.836s

JohnHalleyGotway added 21 commits April 17, 2025 21:43

Per #3120, add OpenMP to the vx_regrid library.

ceca910

Per #3120, move call to init_openmp() into main.cc since it applies t…

d743c12

…o most MET tools via regridding.

Per #3131, enhance OMP_NUM_THREADS log message to print the number of…

1642d22

… threads that actually are available.

Per #3132, add new DataPlane::set_times() member function and call it…

a5c51e3

… in the vx_regrid library to avoid duplication.

Per #3132, one more call to Dao DataPlane::set_times()

270e857

Per #3132, add OpenMP to data_plane.cc and reimplement DataPlaneArray…

510e912

… using stl::vector's instead of manual memory management. This required introducing a DataPlaneArray::at(int) accessor function. Modify some library and application code to get things compiling.

Per #3132, update testing.yml workflow to run using 4 threads.

8ecfc5b

Per #3132, updates to vx_util to call OpenMP.

0932c42

Per #3132, do not parallelize for loops that add values to vectors, l…

b935726

…ike apply_mask(). Doing so produced a Grid-Stat runtime failure.

Per #3132, try setting OMP_NUM_THREADS to 4 in test_env_vars.sh to ru…

2a8ea42

…n all GHA tests more efficiently.

Per #3131, set OMP_NUM_THREADS to 4 for GHA testing to make sure that…

f002eec

… parallelized regridding actually runs well and produces the expected output.

Merge remote-tracking branch 'origin/feature_3131_openmp_vx_regrid' i…

8659dfb

…nto feature_3132_openmp_vx_util

Per #3132, remove OpenMP from DataPlane::convert() since it causes a …

160f941

…runtime segfault in the unit_regrid.xml unit test.

Per #3132, do not parallelize loops that compute min/max values or on…

1bd9e91

…es that include break statements.

Per #3132, add -j argument when compiling MET in Docker to speed up t…

bd58dda

…he compilation.

Per #3132, realized that -j was already being set in development.docker.

23f9f99

Per #3132, fix parallelization logic in rescale_probability() and fit…

2da4b5a

…wav_1d().

Per #3132, update the OMP_NUM_THREADS handling logic to reset <= 0 to…

f0cc3b5

… the maximum number of available threads. ci-run-unit

Per #3132, roll back the support for setting OMP_NUM_THREADS <= 0, si…

147167a

…nce that triggers a failure when ncdiff is used to compute differences: libgomp: Invalid value for environment variable OMP_NUM_THREADS ... ci-run-unit

Per #3132, fix parallelization of the distance_map() utility and upda…

4fdc08b

…te unit_met_test_scripts.xml to add missing expected output file names. ci-run-unit

Per #3132, update DataPlane::n_good_data() parallelization to use red…

b58503e

…uction(+:n) to avoid race conditions updating the counter. ci-run-unit

JohnHalleyGotway added this to the MET-12.1.0 milestone Apr 22, 2025

JohnHalleyGotway added this to METplus-6.1 Development Apr 22, 2025

github-project-automation bot moved this to 🩺 Needs Triage in METplus-6.1 Development Apr 22, 2025

JohnHalleyGotway moved this from 🩺 Needs Triage to 🔎 In review in METplus-6.1 Development Apr 22, 2025

JohnHalleyGotway linked an issue Apr 22, 2025 that may be closed by this pull request

Apply OpenMP to the vx_util library #3132

Closed

8 tasks

JohnHalleyGotway added 3 commits April 23, 2025 20:37

Per #3132, parallelize DataPlane::data_range() function

7c47246

Merge remote-tracking branch 'origin/develop' into feature_3132_openm…

2567819

…p_vx_util

Merging develop into feature_3132_openmp_vx_util

2339c08

JohnHalleyGotway requested a review from jprestop April 25, 2025 17:30

JohnHalleyGotway requested a review from briannen April 25, 2025 17:30

JohnHalleyGotway marked this pull request as ready for review April 25, 2025 17:30

JohnHalleyGotway closed this Apr 25, 2025

github-project-automation bot moved this from 🔎 In review to 🏁 Done in METplus-6.1 Development Apr 25, 2025

JohnHalleyGotway reopened this Apr 25, 2025

Per #3132, some easy changes to drive down the overall number of Sona…

c4e2031

…rQube code smells.

jprestop moved this from 🏁 Done to 🔎 In review in METplus-6.1 Development Apr 29, 2025

jprestop approved these changes Apr 29, 2025

View reviewed changes

JohnHalleyGotway mentioned this pull request May 1, 2025

Feature #3134 openmp libs #3141

Merged

17 tasks

briannen approved these changes May 5, 2025

View reviewed changes

JohnHalleyGotway merged commit 4b5a083 into develop May 5, 2025
39 of 40 checks passed

github-project-automation bot moved this from 🔎 In review to 🏁 Done in METplus-6.1 Development May 5, 2025

jprestop mentioned this pull request May 8, 2025

Feature #3140 openmp apps pcp genvxmask #3147

Merged

17 tasks

JohnHalleyGotway changed the title ~~Feature #3132 openmp_vx_util~~ Feature #3132 openmp vx util May 14, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Feature #3132 openmp vx util #3136

Feature #3132 openmp vx util #3136

Uh oh!

JohnHalleyGotway commented Apr 22, 2025 •

edited

Loading

Uh oh!

JohnHalleyGotway commented Apr 29, 2025

Uh oh!

jprestop left a comment

Uh oh!

briannen commented May 1, 2025

Uh oh!

briannen commented May 1, 2025

Uh oh!

JohnHalleyGotway commented May 2, 2025 •

edited

Loading

Uh oh!

briannen left a comment

Uh oh!

Uh oh!

Uh oh!

Feature #3132 openmp vx util #3136

Feature #3132 openmp vx util #3136

Uh oh!

Conversation

JohnHalleyGotway commented Apr 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Expected Differences

Pull Request Testing

Pull Request Checklist

Uh oh!

JohnHalleyGotway commented Apr 29, 2025

Uh oh!

jprestop left a comment

Choose a reason for hiding this comment

Uh oh!

briannen commented May 1, 2025

Uh oh!

briannen commented May 1, 2025

Uh oh!

JohnHalleyGotway commented May 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

briannen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

JohnHalleyGotway commented Apr 22, 2025 •

edited

Loading

JohnHalleyGotway commented May 2, 2025 •

edited

Loading