Resolve MPI issues #169

galapaegos · 2018-06-21T12:18:11Z

This PR fixes the issues that have kept the MPI version from working.

henryiii · 2018-06-21T12:23:30Z

Rebased and fixed style.

galapaegos · 2018-06-27T21:23:29Z

Just an update: MPI works for all examples except DP4. All tests pass except CorrGauss and the three tests that were already broken.

First priority is figuring out the problem with DP4, then CorrGauss.

henryiii · 2018-06-28T10:15:42Z

Rebased on master, and fixed (hopefully) the one usage of old C++ loop style that triggered the tidy checker. Please fetch and reset your checkout. (git fetch && git reset origin/fix-mpi)

galapaegos · 2018-06-28T13:58:14Z

In src/PDFs/physics/Amp4Body.cu, is there any reason to hand-evaluate rather than use evaluate_with_metric?

Here is the code in question:

    generation_no_norm = true; // we need no normalization for generation, but we do need to make sure that norm = 1;
    SigGenSetIndices();
    copyParams();
    normalize();
    setForceIntegrals();
    host_normalizations.sync(d_normalizations);

    thrust::device_vector<fptype> results(numEvents);
    thrust::constant_iterator<int> eventSize(6);
    thrust::constant_iterator<fptype *> arrayAddress(dev_event_array);
    thrust::counting_iterator<int> eventIndex(0);

    // MetricTaker evalor(this, getMetricPointer("ptr_to_Prob"));
    // we need to call evaluate_with_metric(); here.
    auto fc = fitControl;
    setFitControl(std::make_shared<ProbFit>());
    thrust::transform(thrust::make_zip_iterator(thrust::make_tuple(eventIndex, arrayAddress, eventSize)),
                      thrust::make_zip_iterator(thrust::make_tuple(eventIndex + numEvents, arrayAddress, eventSize)),
                      results.begin(),
                      *logger);
    cudaDeviceSynchronize();
    gooFree(dev_event_array);

I would prefer the evaluate_with_metric version if possible. I think GenerateSig would need a setData call beforehand.

I am also wondering about evaluate_with_metric itself. There are two versions of this function: one that lets you pass in a device_vector, and one that returns a host_vector. Is the difference needed, or should we always return a host_vector and let the user perform any appropriate conversion? Thinking about Python: what would a user do with a device_vector from GooFit? Evaluate the buffer with a second metric?

Currently, the transform happens on the device, the result is copied back to the host for MPI, then copied to the device again, and finally converted back to a host vector.
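To make the MPI step in that round trip concrete, here is a minimal sketch of how a set of events could be split into contiguous per-rank slices before each process evaluates its share. The names `RankSlice` and `partition_events` are hypothetical and are not taken from GooFit; how GooFit actually divides the events, and which MPI collective it uses to recombine them, is not stated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of GooFit: the contiguous slice of events
// that a single MPI rank would evaluate.
struct RankSlice {
    std::size_t offset; // index of the first event owned by this rank
    std::size_t count;  // number of events owned by this rank
};

// Split numEvents as evenly as possible across numRanks. The first
// (numEvents % numRanks) ranks each take one extra event.
std::vector<RankSlice> partition_events(std::size_t numEvents, std::size_t numRanks) {
    assert(numRanks > 0);

    std::vector<RankSlice> slices;
    slices.reserve(numRanks);

    const std::size_t base  = numEvents / numRanks;
    const std::size_t extra = numEvents % numRanks;

    std::size_t offset = 0;
    for(std::size_t rank = 0; rank < numRanks; ++rank) {
        const std::size_t count = base + (rank < extra ? 1 : 0);
        slices.push_back({offset, count});
        offset += count;
    }
    return slices;
}
```

With per-rank counts and offsets like these, a collective such as MPI_Allgatherv (which takes receive counts and displacements) could hand every process the full result array. That choice of collective is an assumption for illustration only.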

Thoughts?

henryiii · 2018-06-28T14:02:44Z

I would like to move all evaluation into the three methods in GooPdf, and remove all hand evaluations in subclasses (possibly with the exception of ones that return complex or multiple values, TBD).

There are two versions so that someone writing CUDA code can avoid the copy. Most users will probably want the output on the CPU, so the version that copies from GPU to CPU is provided. Python will (currently) only have access to the CPU-copied version. That was my intention, anyway. (Maybe in numba 0.29 a GPU version could eventually be interesting?)

I'm not sure a GPU version is useful/important, though, so would be willing to drop it for now if needed.
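As a rough illustration of the two-overload shape described here, the sketch below uses plain `std::vector` stand-ins so that it builds without CUDA. `DeviceBuffer`, `HostBuffer`, `SketchPdf`, and the doubling metric are all hypothetical, and the signatures are assumptions rather than GooFit's actual declarations; real code would use thrust::device_vector and thrust::host_vector.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-ins for thrust::device_vector / thrust::host_vector (hypothetical,
// used only so this sketch compiles without CUDA).
struct DeviceBuffer {
    std::vector<double> data;
};
using HostBuffer = std::vector<double>;

// Hypothetical PDF class showing the two-overload pattern.
class SketchPdf {
  public:
    explicit SketchPdf(std::vector<double> events)
        : events_(std::move(events)) {}

    // Overload 1: fill a caller-provided "device" buffer. No host copy is
    // made, so CUDA code can keep working on the results in place.
    void evaluate_with_metric(DeviceBuffer &results) const {
        results.data.resize(events_.size());
        for(std::size_t i = 0; i < events_.size(); ++i) {
            results.data[i] = metric(events_[i]);
        }
    }

    // Overload 2: convenience wrapper that evaluates into a temporary
    // "device" buffer and returns a host-side copy of it.
    HostBuffer evaluate_with_metric() const {
        DeviceBuffer results;
        evaluate_with_metric(results);
        assert(results.data.size() == events_.size()); // overload 1 sizes the buffer
        return HostBuffer(results.data.begin(), results.data.end());
    }

  private:
    // Placeholder metric for the sketch: doubles the input value.
    static double metric(double x) { return 2.0 * x; }

    std::vector<double> events_;
};
```

Written this way, the host-returning version is only a thin wrapper, so dropping or keeping the device-side overload would not change what a Python caller sees.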

henryiii · 2018-06-28T14:03:38Z

Eventual hope is something like this for the call structure: #155

henryiii · 2018-07-02T14:59:55Z

Cherry-picked and fixed style. Please use git fetch && git reset origin/fix-mpi. If the only remaining changes shown by git diff are whitespace, just do git reset --hard to clean them out.

galapaegos · 2018-07-02T15:43:18Z

Thanks for fixing up the style. This PR is ready to go!

henryiii force-pushed the fix-mpi branch from 59dc2b6 to c8eeea6 Compare June 21, 2018 12:23

henryiii force-pushed the fix-mpi branch from b58f0fb to 00bf7be Compare June 28, 2018 10:13

henryiii force-pushed the fix-mpi branch 2 times, most recently from 8593b7c to eb7d3c3 Compare July 2, 2018 14:57

galapaegos and others added 12 commits July 2, 2018 17:01

Fixed MPL to MPI, added mpiexec to all tests

9eab8c3

Using a main function for Catch2

2c0fa0e

observables is now observablesList

78652bc

Fixing style

a5aecd9

Passing GOOFIT_MPI for tests

3f1b1c0

Updated extern/modern_cmake

ad402fc

Added recursive setNumPerTask, Testing for isEventNumber, Added trace

315b40b

Typo with isEventNumber

bea9d9e

Ignoring CorrGaussianTest, causing infinite loop

63f9d0c

Fix for evaluate_with_metric to properly distribute all data to all processes

6566d6b

Fixed formatting

be7f202

Added MPI support for Amp4Body.

b70ba7f

henryiii force-pushed the fix-mpi branch from eb7d3c3 to b70ba7f Compare July 2, 2018 15:01

henryiii merged commit af872e5 into master Jul 2, 2018

henryiii deleted the fix-mpi branch July 2, 2018 16:01
