
Conversation

@mitzimorris (Member)

Submission Checklist

  • Run tests: ./runCmdStanTests.py src/test
  • Declare copyright holder and open-source license: see below

Summary:

In command.hpp, before invoking the service method fixed_param, add the line "# fixed_param = 1" to the Stan CSV header.

Intended Effect:

This will make it easier for wrapper interfaces to check Stan CSV files; see #1029.

How to Verify:

Added more unit tests

Side Effects:

The CSV header will change only when the fixed_param method is invoked; this should be harmless.

Documentation:

n/a

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@stan-buildbot (Contributor)


Name | Old Result | New Result | Ratio | Performance change (1 - new/old)
gp_pois_regr/gp_pois_regr.stan 3.05 3.19 0.96 -4.58% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.98 -2.49% slower
eight_schools/eight_schools.stan 0.11 0.1 1.03 2.79% faster
gp_regr/gp_regr.stan 0.16 0.16 0.96 -4.67% slower
irt_2pl/irt_2pl.stan 5.95 6.08 0.98 -2.3% slower
performance.compilation 89.76 87.14 1.03 2.91% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.55 8.85 0.97 -3.61% slower
pkpd/one_comp_mm_elim_abs.stan 29.95 29.84 1.0 0.37% faster
sir/sir.stan 127.79 123.41 1.04 3.43% faster
gp_regr/gen_gp_data.stan 0.03 0.04 0.97 -3.32% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.02 3.08 0.98 -2.21% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.42 0.38 1.08 7.37% faster
arK/arK.stan 1.87 1.87 1.0 -0.12% slower
arma/arma.stan 0.83 0.76 1.1 8.71% faster
garch/garch.stan 0.53 0.67 0.79 -26.63% slower
Mean result: 0.98917823181

Jenkins Console Log
Blue Ocean
Commit hash: ed2f282


Machine information:
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@rok-cesnovar (Member) left a comment


I think this should be changed so that the CSV is the same whether the user manually specifies fixed_param or CmdStan automatically switches to fixed_param. With the current implementation the CSV metadata states both algorithm = hmc and fixed_param = 1.

Manual fixed_param:

...
# method = sample (Default)
#   sample
#     num_samples = 1000 (Default)
#     num_warmup = 1000 (Default)
#     save_warmup = 0 (Default)
#     thin = 1 (Default)
#     adapt
#       engaged = 1 (Default)
#       gamma = 0.050000000000000003 (Default)
#       delta = 0.80000000000000004 (Default)
#       kappa = 0.75 (Default)
#       t0 = 10 (Default)
#       init_buffer = 75 (Default)
#       term_buffer = 50 (Default)
#       window = 25 (Default)
#     algorithm = fixed_param
# id = 0 (Default)
...
# stanc_version = stanc3 5f0258c1
# stancflags = 
# fixed_param = 1
lp__,accept_stat__,p

Automatic fixed_param:

...
# method = sample (Default)
#   sample
#     num_samples = 1000 (Default)
#     num_warmup = 1000 (Default)
#     save_warmup = 0 (Default)
#     thin = 1 (Default)
#     adapt
#       engaged = 1 (Default)
#       gamma = 0.050000000000000003 (Default)
#       delta = 0.80000000000000004 (Default)
#       kappa = 0.75 (Default)
#       t0 = 10 (Default)
#       init_buffer = 75 (Default)
#       term_buffer = 50 (Default)
#       window = 25 (Default)
#     algorithm = hmc (Default)
#       hmc
#         engine = nuts (Default)
#           nuts
#             max_depth = 10 (Default)
#         metric = diag_e (Default)
#         metric_file =  (Default)
#         stepsize = 1 (Default)
#         stepsize_jitter = 0 (Default)
# id = 0 (Default)
...
# stanc_version = stanc3 5f0258c1
# stancflags = 
# fixed_param = 1
lp__,accept_stat__,p

@mitzimorris (Member, Author)

I couldn't agree more. I looked at the classes in src/cmdstan/argument; the code is a bit opaque.
Are these objects mutable? Not sure how to change the algorithm argument. Just create a new one?

@SteveBronder (Contributor)

_value seems to be protected and value() just returns a copy of _value. One sort of weird way you could do this is to go through and manually call delete on the items you don't need for fixed_param from valid_arguments. If you delete them and set each pointer to nullptr, that should be safe.

@mitzimorris (Member, Author)

OK, will look into messing with valid_arguments. Since we've got a fix for this in CmdStanPy, it's not a huge priority. It's just another example of why we really need to change the way CmdStan handles command-line arguments and config.

@betanalpha (Contributor)

The problem here isn't with the CmdStan argument parser but rather with the recently introduced feature that tries to modify the CmdStan configuration based on the given Stan program.

The entire CmdStan ecosystem was designed around the arguments being immutable. A user specifies the configuration and CmdStan executes exactly that configuration, no matter the structure of the Stan program. The behavior of CmdStan is then entirely determined by how it is called, without the need to consult the associated Stan program, which is critical to reproducible analysis pipelines where different Stan programs are switched in and out. In other words, the immutability of the configuration ensures no surprises in what is executed.

Modifying the configuration based on the structure of the Stan program breaks this guarantee. Instead of returning an error that immediately communicates a conflict between the requested configuration and the given Stan program, the assumption of a misspecified configuration propagates through the entire analysis pipeline. If that assumption is incorrect, it would not be observed until later in the pipeline, complicating debugging.

The reason the argument parser does not have readily accessible mutators is that it was designed for this immutability paradigm. By breaking the immutability you've gone beyond the intended use of the code. That said, while mutators are not implemented, modifying a parsed argument is straightforward given that argument parsing is implemented recursively.

// Grab the algorithm argument node from the argument tree and
// cast it to a `list_argument` to ensure that the proper methods are called
list_argument* algo = dynamic_cast<list_argument*>(
    parser.arg("method")->arg("sample")->arg("algorithm"));

// Create a vector of strings with the new algorithm arguments
std::vector<std::string> modified_algo_args;
modified_algo_args.push_back("fixed_param");

// Update the algorithm node to use the new arguments
algo->parse_args(modified_algo_args, info, err, false);

Again, I don't actually think this fix should be implemented; rather, the automatic modification should be reverted.

@bob-carpenter (Member)

I agree with @betanalpha that it seems confusing to change what users requested to something else because what they requested is incoherent.

How about just generalizing the HMC algorithms to allow zero-size parameter vectors? If you look at the leapfrog algorithm with size-zero vectors, the boundary condition is a no-op, which is just what we want. Then users can choose any of our HMC algorithms when there are zero-size parameters and the boundary condition should do the right thing. Then we don't need to modify any of our output.

@betanalpha (Contributor)

betanalpha commented Sep 21, 2021 via email

@mitzimorris (Member, Author)

mitzimorris commented Sep 21, 2021

it seems confusing to change what users requested to something else because what they requested is incoherent.

The user's request was vague, but then defaults were applied, resulting in an incoherent request. It's a problem with the defaults, not the user. All the user wants is to run the program to generate a sample.

@bob-carpenter (Member)

the user's request was vague, but then defaults were applied, resulting in an incoherent request

That's a really good point. If a user only specifies "sample", then I don't see the problem with choosing fixed_param or adaptive, diagonal, Euclidean NUTS for them depending on whether the model has 0 or 1+ parameters.

@betanalpha (Contributor)

betanalpha commented Oct 1, 2021 via email

@bob-carpenter (Member)

This is a misinterpretation of what “sample” means.

I agree that fixed_param doesn't do MCMC sampling, but its main utility is for MC sampling using the _rng functions. Not to go all Humpty-Dumpty here, but we can define the keyword "sample" in an interface to mean whatever we want it to mean, as long as it's consistent with standard meanings of the word.

I can see two reasons to run fixed_param.

  1. to MC sample using pseudo-RNGs, or
  2. to evaluate a Stan function or test a data/transformed data block.

(2) feels like it should be done another way. Are there other reasons people run fixed_param?

What I tend to do in practice is just call the defaults. Then if I have zero-size parameters, I have to fumble through the doc to figure out how to call a different function. And if I ever want to use a single program for both, I have to write a conditional into the interface scripts. But I agree that it's awkward to be able to supply HMC parameters if HMC isn't being run. If we enforce that consistency by throwing errors whenever we call fixed_param with HMC parameters, I'd still have to write conditionals into the interface scripts.

@betanalpha (Contributor)

betanalpha commented Oct 1, 2021 via email

@nhuurre

nhuurre commented Oct 4, 2021

@betanalpha

The problem here isn't with the CmdStan argument parser but rather with the recently introduced feature that tries to modify the CmdStan configuration based on the given Stan program.

I don't actually think this fix should be implemented; rather, the automatic modification should be reverted.

@bob-carpenter

How about just generalizing the HMC algorithms to allow zero-size parameter vectors? If you look at the leapfrog algorithm with size-zero vectors, the boundary condition is a no-op, which is just what we want. Then users can choose any of our HMC algorithms when there are zero-size parameters and the boundary condition should do the right thing. Then we don't need to modify any of our output.

@betanalpha

while the Hamiltonian trajectories themselves are fine the local (trajectory length) and global (stepsize, inverse metric) adaptation are not.

In theory, HMC works perfectly fine in a zero-dimensional space. It's just arithmetic with size-zero vectors: trivial but well-behaved.
In practice, Stan's HMC algorithm works perfectly fine in a zero-dimensional space, but there are two other bugs that interfere.

First, disabling the zero-parameter check in CmdStan and running with defaults results in

Exception initializing step size.
Posterior is improper. Please check your model.

which is obviously false: the posterior is a point mass and necessarily proper. It's just that the step size is meaningless and the step size adapter misdiagnoses this.
But precisely because the step size is meaningless, adaptation shouldn't do anything anyway. Running with adapt engaged=0, warmup completes successfully but the program crashes when it tries to print the metric.

stan/lib/stan_math/lib/eigen_3.3.9/Eigen/src/Core/DenseCoeffsBase.h:425: Eigen::DenseCoeffsBase<type-parameter-0-0, 1>::Scalar &Eigen::DenseCoeffsBase<Eigen::Matrix<double, -1, 1, 0, -1, 1>, 1>::operator()(Eigen::Index) [Derived = Eigen::Matrix<double, -1, 1, 0, -1, 1>, Level = 1]: Assertion `index >= 0 && index < size()' failed.

Ok, a one-line fix in stan/mcmc/hmc/hamiltonians/diag_e_point.hpp, and with that it runs to completion.

                 Mean     MCSE   StdDev    5%       50%  95%    N_Eff  N_Eff/s    R_hat

lp__             0.00      nan     0.00  0.00   0.0e+00 0.00      nan      nan      nan
accept_stat__     1.0      nan  6.7e-16   1.0       1.0  1.0      nan      nan      nan
stepsize__        1.0      nan  6.7e-16   1.0       1.0  1.0      nan      nan      nan
treedepth__       1.0      nan  6.7e-16   1.0       1.0  1.0      nan      nan      nan
n_leapfrog__      1.0      nan  6.7e-16   1.0       1.0  1.0      nan      nan      nan
divergent__      0.00      nan  0.0e+00  0.00      0.00 0.00      nan      nan      nan
energy__         0.00      nan  0.0e+00  0.00      0.00 0.00      nan      nan      nan

All trajectories terminate after one proposal and always accept.
You could generate the same samples a bit more efficiently with the fixed_param sampler (which evaluates zero proposals per iteration) but, and here's the problem this PR is trying to work around, fixed_param has a different output format from all the other samplers.

Why does fixed_param have different output? As far as I can tell it's because fixed_param has no warmup and thus omits the adaptation info. On one hand that makes sense: "warming up" fixed_param would be just wasted computation. On the other hand, fixed_param still supports thinning, which is equally pointless, and CmdStan accepts (but silently ignores) a nonzero num_warmup argument even with algorithm=fixed_param.
In short, it's all inconsistent.

@betanalpha (Contributor)

betanalpha commented Oct 18, 2021 via email

@bob-carpenter (Member)

If we’re being formal then a zero-dimensional space is topologically discrete and cannot be written as a dense subset of the real numbers.

Right. R^0 has just a single element, the empty tuple, and any measure has to assign it 100% of the probability mass. But it's still a proper measure and I can't see where the rest of the math won't generalize properly if we get our code right.
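Spelled out (a small formal aside, not from the thread), the claim is:

```latex
\mathbb{R}^0 = \{\,()\,\}, \qquad
\mu(\mathbb{R}^0) = \mu(\{()\}) = 1
\;\Longrightarrow\; \mu = \delta_{()},
```

i.e. the unique probability measure on $\mathbb{R}^0$ is the point mass at the empty tuple, which is proper by construction.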

When the output Stan configuration specifies that HMC was run but the output doesn’t contain the assumed HMC output then there are problems.

That feels like a bug to me in the brittle way we pack things into and read things out of .csv format.

Running with adapt engaged=0 warmup completes successfully but the program crashes when it tries to print the metric.

This also just feels like a bug in our code not handling N = 0 boundary conditions.

But isn't this moot for the issue at hand, which is just to let the sampling method deal with size zero output by allowing the flexibility to select between HMC or fixed-param depending on the problem dimensionality?

@betanalpha (Contributor)

betanalpha commented Nov 1, 2021 via email

@nhuurre

nhuurre commented Nov 1, 2021

There are many zero-dimensional spaces and hence many different ways to extrapolate to the N = 0 boundary case of R^N.

There are indeed many zero-dimensional spaces (although, no, the empty set is not one of them), but R^N does not mean some arbitrary N-dimensional space. R^N means the N-dimensional real vector space. That is unique for every N; zero is not special.

The issue isn't about generalizing HMC to arbitrary zero dimensional spaces. The only space of interest is the one Stan chooses when a model has no parameters at all. And that particular space is quite well-behaved.

@betanalpha (Contributor)

betanalpha commented Nov 10, 2021 via email

@mitzimorris (Member, Author)

no longer relevant.

@mitzimorris mitzimorris deleted the feature/1029-record-fixed-param branch September 21, 2023 21:30
7 participants