Feature #3065 benchmarking ensemble stat #3137

Conversation
… GridStatConfig_SEEPS config file needs to be updated with nc_pairs_flag.seeps = TRUE in order for the same output to be produced by the unit tests.
* Per #3032, add data type column to all of the output tables
* Per #3032, remove the first row from each output table since its info is repeated from the table name. Additional changes for consistency and accuracy in column names.
* Update docs/Users_Guide/gsi-tools.rst
Co-authored-by: Julie Prestopnik <jpresto@ucar.edu>
…o create and push an updated test output image.
* Per #3033, update version info, consolidate release notes, and add upgrade instructions.
* Per #3033, remove all instances of 'Bugfix: ' from the release notes since it's redundant with the dropdown name
* Per #3030, based on request from Randy Pierce, also add MTD header columns to met_header_columns_v12.0.txt to make it easier to parse the output from MET.
* Per #3033, fix typo and correct alignment in table
Removing reference to beta version
Remove references to beta version
Update paths for eckit and atlas
Remove beta references
… setting in the yaml config file
…er use case config files, add more assertion checks
…ve to MET_BASE (<install_loc>/share/met) and other files that are only in the MET repo are found relative to MET_TEST_BASE (MET/internal/test_unit). Also remove MET_BUILD_BASE env var (#3052)
…after finishing, format the information file
Assuming this is a first example of profiling to come, I'm wondering about putting the profiling in so many places in ensemble_stat. Not that I'm against that, but is there a good reason for it? In other words, what is the intent of profiling this particular application, and does the way it is set up help accomplish that?
I was taking a guess at what functions might be of interest to instrument since I'm not familiar with the code. If you don't think all these functions should be instrumented, feel free to undo or let me know which ones to undo.
Thinking out loud, I'd describe my thinking like this: start at a high level, look for bottlenecks, and drill down from there. That is, put only a few ctrack profiling lines initially, see where the app is spending all its time, then look deeper. If you put profiling everywhere, it might be harder to see where to investigate/modify. If that makes sense, I can give it a go as far as making it look like what I think would be a good initial setup. If that does NOT make sense, that is also ok. Let me know what you think.
That sounds good. Thanks Dave!
@bikegeek was there a command line option to turn on the profiling? I'd like to test out my changes.
See the Contributor's Guide:
https://metplus.readthedocs.io/projects/met/en/feature_3065_benchmarking_ensemble_stat/Contributors_Guide/code_profiling.html
Section 7.1.3.1, Step 2: Compile MET Code.
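In other words, profiling is enabled at build time rather than on the command line. As a minimal sketch of the build steps, assuming a standard MET autotools build (the install prefix is a placeholder):

    ./configure --prefix=/path/to/install --enable-profiler
    make
    make install
    make test    # CTRACK output appears in the test runs when profiling is enabled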
Hopefully the last question. Is there a way to look at these files in a way that formats them nicely? Just treating them as ASCII gives output that is hard to read.
detail_output.txt
summary_output.txt
The metrics files are consolidated by the Python code and stored in the directory indicated by the benchmark_output_path setting in the YAML config file. If you specified a filename, the final files are <filename>_timestamp.csv and <filename>_timestamp.txt. If no filename is specified in the filename setting, then the consolidated files are timestamp.csv and timestamp.txt. You can then view the csv file in a spreadsheet. I prefer to view it using JupyterLab, opening it into a dataframe view.
Here is the example config file:
https://metplus.readthedocs.io/projects/met/en/feature_3065_benchmarking_ensemble_stat/Contributors_Guide/code_profiling.html
Expand the dropdown in section 7.1.3.1, under 'Edit the benchmark.yaml configuration file'.
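For illustration, the relevant piece of benchmark.yaml might look like the sketch below. Only benchmark_output_path and the filename setting are named above; this excerpt is hypothetical, and the real layout should be checked against the example config file linked here:

    # Hypothetical excerpt of benchmark.yaml, based on the settings described above
    benchmark_output_path: /path/to/benchmark/output   # where the consolidated metrics files land
    filename: ensemble_stat_run   # optional; output becomes <filename>_<timestamp>.csv/.txt
                                  # if omitted, files are <timestamp>.csv/.txt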
I reduced the profiling to the two main function calls only: process_n_vld and process_vx. This shows that process_vx is where the bulk of the time is being spent, so that is where one might add more profiling to see where within that method it is getting bogged down. This is what I was after. If you're good with this, I'll commit and push my changes.
Screenshot.2025-05-05.at.1.17.23.PM.png
<https://github.com/user-attachments/assets/eb9d3ac0-1ca7-4e92-adae-e75b9a6847b4>
Sounds good, thanks Dave!
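For later readers, here is a minimal sketch of what that reduced, high-level instrumentation looks like. It follows the WITH_PROFILER / CTRACK pattern visible in the diff below; the function bodies are elided and the exact placement inside ensemble_stat.cc is illustrative:

    #ifdef WITH_PROFILER
       #include "ctrack.hpp"
    #endif

    // Only the two top-level calls are instrumented, so the summary
    // shows at a glance which one dominates the runtime.
    void process_n_vld() {
    #ifdef WITH_PROFILER
       CTRACK;   // records timing for this function
    #endif
       // ... existing logic ...
    }

    void process_vx() {
    #ifdef WITH_PROFILER
       CTRACK;   // the bulk of the runtime shows up here
    #endif
       // ... existing logic ...
    }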
I approve this pull request.
@@ -194,6 +198,11 @@ int met_main(int argc, char *argv[]) {
   // Perform verification
   process_vx();

   // Save the CTRACK metrics
#ifdef WITH_PROFILER
   ctrack::result_print();
@bikegeek if ctrack::result_print() is already being called every time in basic/vx_util/main.cc, do we even need to call it in the application code? Won't this cause the results to be printed twice?
If ctrack::result_print() is NOT in the application code, the benchmarking information for the application code does not get saved.
@bikegeek, apologies for the long delay in getting to this. I proposed a couple of minor edits in the docs.
I tested on seneca as the met_test user in /d1/projects/MET/MET_pull_requests/met-12.1.0/rc1/MET-feature_3065_benchmarking_ensemble_stat and compiled with --enable-profiler.
When I run make test, I do see all the tests being run, including CTRACK output, and that's good. But it fails with this error at the end:
make[1]: *** No rule to make target 'profiler', needed by 'all'. Stop.
make[1]: Leaving directory '/d1/projects/MET/MET_pull_requests/met-12.1.0/rc1/MET-feature_3065_benchmarking_ensemble_stat/scripts'
make: *** [Makefile:880: test] Error 2
I'll see if I can figure out why that's happening.
…ining which tests to run.
OK, I just pushed a fix to the make test error. It's caused by custom logic in scripts/Makefile that determines which test scripts to run. I patched it to ignore the --enable-profiler configuration setting.
I approve of this PR, but do recommend that you make those documentation tweaks.
I imagine as we start using this profiling option, we may want to modify the details. @georgemccabe mentioned one idea: push the CTRACK directives into a common library function, so that adding them to a function is a one-liner instead of a three-liner (see the sketch after this comment).
Thanks!
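As a rough sketch of that one-liner idea (hypothetical, not part of this PR), a shared header could fold the WITH_PROFILER guard into a single macro:

    // met_profile.h (hypothetical name): define the guard once...
    #ifdef WITH_PROFILER
       #include "ctrack.hpp"
       #define MET_PROFILE CTRACK
    #else
       #define MET_PROFILE
    #endif

    // ...so instrumenting a function in application code becomes a one-liner:
    void process_vx() {
       MET_PROFILE;   // expands to CTRACK; only when profiling is enabled
       // ...
    }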
No more error message during make test. Update compiler section accordingly
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
@davidalbo / @JohnHalleyGotway I updated the documentation and just need one of you to click 'approve'
I approve of these changes.
I'm sorry I missed the ping on this yesterday!
Expected Differences
Do these changes introduce new tools, command line arguments, or configuration file options? [Yes]
If yes, please describe:
Benchmarking using the CTRACK tool that runs when the --enable-profiler option is included in the configuration step of a MET build.
Instrumented code:
Configuration:
- benchmark.yaml:
  - for exercising MET code via command line options
  - for exercising MET code via METplus use case(s) (i.e. wrapper code)
Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [No]
Pull Request Testing
Describe testing already performed for these changes:
on host 'seneca':
- cloned MET code to /d1/projects/Benchmark_EnsembleStat
- built code without and with --enable-profiler to verify the option is working as expected
- ran the profiling/benchmarking tool CTRACK on the Ensemble-Stat code via (see the sketch after this list):
  - MET command line commands:
    - use the benchmark.yaml config file set up for running a MET command
    - use the setup.bash script to set the appropriate environment variables
    - use envs_for_met.bash to set the environment variables needed to run the MET command line command
  - METplus use case:
    - set the settings for METplus in the benchmark.yaml config file
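As a rough walk-through of the command line flow above (setup.bash, envs_for_met.bash, and benchmark.yaml are from this PR; the driver invocation at the end is an assumption):

    source setup.bash           # benchmarking environment variables
    source envs_for_met.bash    # variables needed by the MET command itself
    # edit benchmark.yaml, then launch the benchmarking driver, e.g.:
    #   python <benchmark_driver>.py benchmark.yaml   # driver name is hypothetical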
Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:
Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]
Do these changes include sufficient testing updates? [No]
Will this PR result in changes to the MET test suite? [No]
Will this PR result in changes to existing METplus Use Cases? [No]
If yes, create a new Update Truth METplus issue to describe them.
Do these changes introduce new SonarQube findings? [Yes]
If yes, please describe:
may result in lower code coverage percentage for new code
Please complete this pull request review in time for the RC1 release.
Pull Request Checklist
See the METplus Workflow for details.
Select: Reviewer(s) and Development issue
Select: Milestone as the version that will include these changes
Select: METplus-X.Y Support project for bugfix releases or MET-X.Y Development project for the next coordinated release