Conversation

@martinjankowiak (Collaborator) commented Sep 6, 2021

joint work with @fehiepsi

instead of parameterizing a lower Cholesky factor as an unconstrained strictly lower triangular piece plus a positive diagonal, we parameterize it as

L = unit_scale_tril @ scale_diag

where unit_scale_tril is lower triangular with ones along the diagonal and scale_diag is a positive diagonal matrix.
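for concreteness, a minimal PyTorch-style sketch of the two parameterizations (illustrative only, not the actual code in this PR; the parameter names `tril_raw` / `diag_raw` and the use of softplus for positivity are assumptions):

```python
import torch
import torch.nn.functional as F

D = 4
tril_raw = torch.randn(D, D)   # unconstrained parameter (hypothetical name)
diag_raw = torch.randn(D)      # unconstrained parameter (hypothetical name)

# before this PR: strictly lower triangular piece + positive diagonal
L_old = torch.tril(tril_raw, diagonal=-1) + torch.diag(F.softplus(diag_raw))

# this PR: unit-diagonal lower triangular factor times a positive diagonal,
# i.e. L = unit_scale_tril @ scale_diag
unit_scale_tril = torch.tril(tril_raw, diagonal=-1) + torch.eye(D)
scale_diag = F.softplus(diag_raw)
# right-multiplying by a diagonal matrix rescales columns, so broadcasting suffices:
L_new = unit_scale_tril * scale_diag  # == unit_scale_tril @ torch.diag(scale_diag)

# in either case the covariance of the Gaussian guide is L @ L.T
cov = L_new @ L_new.T
```

note that `L_new` carries `scale_diag` along its diagonal, and multiplying by `scale_diag` rescales entire columns of `unit_scale_tril` rather than only the diagonal entries.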

not surprisingly (consider e.g. the analogous parameterization in AutoLowRankMultivariateNormal) this seems to lead to consistently better performance (all results use AutoMultivariateNormal):

logistic regression dataset with N=50k, D=28

  • elbo: -32 430.37 [this PR]
  • elbo: -32 590.92 [before this PR]

logistic regression dataset with N=50k, D=18

  • elbo: -25 112.28 [this PR]
  • elbo: -25 280.20 [before this PR]

logistic regression dataset with N=37k, D=15

  • elbo: -12 576.18 [this PR]
  • elbo: -12 592.29 [before this PR]

sparse FITC GP classifier with N = 37k, D = 14, and latent dim = 128 (inducing points)

  • elbo: 9 897.12 [this PR]
  • elbo: 9 765.09 [before this PR]

sparse FITC GP classifier with N = 37k, D = 14, and latent dim = 64 (inducing points)

  • elbo: 9 925.11 [this PR]
  • elbo: 9 902.51 [before this PR]

sparse FITC GP classifier with N = 37k, D = 14, and latent dim = 32 (inducing points)

  • elbo: 9 896.23 [this PR]
  • elbo: 9 884.37 [before this PR]

sparse FITC GP classifier with N = 37k, D = 14, and latent dim = 16 (inducing points)

  • elbo: 9 320.48 [this PR]
  • elbo: 9 311.96 [before this PR]

sparse FITC GP classifier with N = 100k, D = 27, and latent dim = 64 (inducing points)

  • elbo: -55 858.99 [this PR]
  • elbo: -55 947.08 [before this PR]

sparse FITC GP classifier with N = 100k, D = 17, and latent dim = 64 (inducing points)

  • elbo: -43 956.38 [this PR]
  • elbo: -44 015.16 [before this PR]

(note: inducing points have the same initialization for each comparison; not surprisingly i was getting noisy results before i made sure this was the case; also note the N=37k GP elbos are missing a log pi term that shifts them by large amounts)

there appears to be a clear winner, but do we need more experiments? @fehiepsi, what do you think?

(note: whether positivity is enforced via exponential or softplus tends to be much less important)

@fehiepsi removed the WIP label Sep 6, 2021
@fehiepsi (Member) commented Sep 6, 2021

LGTM - you might need to double-check if using softplus is still good for your experiments.

@martinjankowiak added the enhancement label Sep 6, 2021
@fehiepsi merged commit 062d822 into master Sep 6, 2021
@fehiepsi deleted the scaledchol branch September 6, 2021 18:43