Issue/873 embedded laplace #874

Open
wants to merge 26 commits into
base: master

Conversation

charlesm93
Contributor

Submission Checklist

  • Builds locally
  • New functions marked with <<{ since VERSION }>>
  • Declare copyright holder and open-source license: see below

Summary

Documentation for the suite of functions for the embedded Laplace approximation. Starting a PR to allow easy file comparison; I will fill in the details soon.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): (Figuring this out)

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@WardBrian WardBrian linked an issue Apr 22, 2025 that may be closed by this pull request
2 tasks
@avehtari avehtari self-requested a review April 24, 2025 16:28
The Laplace approximation is especially useful if $p(\theta)$ is
multivariate normal and $p(y \mid \phi, \theta)$ is
log-concave. Stan's embedded Laplace approximation is restricted to
have multivariate normal prior $p(\theta)$ and ... likelihood
Member

add here the restrictions for the likelihood

Contributor Author

@charlesm93 charlesm93 May 29, 2025

There are two kinds of restrictions:

  • what the user can do without breaking Stan, i.e. the operations in the likelihood need to support higher-order autodiff.
  • what the user should do to ensure the approximation is reliable.

I'll assume you have the first in mind.

Member

Yes, I was thinking of the first as a restriction. For the second one, we can say which kinds of likelihood are more likely to work, i.e., log-concave and maybe near log-concave.
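To make the log-concavity point concrete, here is a minimal sketch in plain Python (not Stan, and not the implementation under review). It assumes a scalar latent variable with prior `theta ~ N(0, k)` and a Poisson likelihood with log link, which is log-concave in `theta`. The function names `laplace_log_marginal` and `quadrature_log_marginal` are hypothetical and exist only for this illustration. The Newton iteration has no line search, so it is only suitable for small counts such as the ones used here.

```python
import math


def log_joint(theta, y, k):
    """log p(y | theta) + log p(theta) for y ~ Poisson(exp(theta)), theta ~ N(0, k)."""
    log_lik = y * theta - math.exp(theta) - math.lgamma(y + 1)
    log_prior = -0.5 * theta**2 / k - 0.5 * math.log(2 * math.pi * k)
    return log_lik + log_prior


def laplace_log_marginal(y, k, theta=0.0, tol=1e-10, max_steps=100):
    """Laplace approximation to log p(y), using plain Newton steps to find the mode."""
    for _ in range(max_steps):
        grad = y - math.exp(theta) - theta / k  # first derivative of the log joint
        hess = -math.exp(theta) - 1.0 / k       # second derivative, always negative
        step = grad / hess
        theta -= step
        if abs(step) < tol:
            break
    neg_hess = math.exp(theta) + 1.0 / k
    # Gaussian integral around the mode: exp(log_joint) * sqrt(2 * pi / neg_hess)
    return log_joint(theta, y, k) + 0.5 * math.log(2 * math.pi) - 0.5 * math.log(neg_hess)


def quadrature_log_marginal(y, k, lo=-10.0, hi=10.0, n=20001):
    """Reference value of log p(y) from the trapezoid rule on [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(log_joint(lo + i * h, y, k))
    return math.log(total * h)


approx = laplace_log_marginal(3, 1.0)
reference = quadrature_log_marginal(3, 1.0)
```

Because the second derivative of the log joint is negative everywhere, the mode is unique and the Gaussian approximation around it is well defined; comparing `approx` with `reference` gives a rough sense of the approximation error in this toy case.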

@avehtari
Member

I made some edits to use the statistical terms correctly. At the end of the first section, it would be good to state the constraints on the likelihood function; I left three dots there.

@WardBrian
Member

@charlesm93 I started to fill in some of the boilerplate we have in our functions reference. Those comments and things are actually useful for building the index page

@WardBrian WardBrian force-pushed the issue/873-embeddedLaplace branch from 4db967a to 7a07abd Compare May 29, 2025 14:30
@charlesm93
Contributor Author

@WardBrian In the doc, what are the lupmf suffixes for? Is this a typo?

@WardBrian
Member

The unnormalized versions, which correspond to propto=true in the C++. For these functions they may be equivalent, but for technical reasons they still need to exist. If they don’t do anything we could remove the documentation, but then it would be less consistent with the other functions.

<!-- real; laplace_marginal; (function ll_function, tuple(...), vector theta0, function K_function, tuple(...)); -->
\index{{\tt \bfseries laplace\_marginal }!{\tt (function ll\_function, tuple(...), vector theta0, function K\_function, tuple(...)): real}|hyperpage}

`real` **`laplace_marginal`**`(function ll_function, tuple(...), vector theta0, function K_function, tuple(...))`<br>\newline
Contributor

Can we call theta0 theta_init? I just think it sounds clearer.

* `hessian_block_size`: the size of the blocks, assuming the Hessian
$\partial^2 \log p(y \mid \theta, \phi) / \partial \theta \, \partial \theta^\top$ is block-diagonal.
The structure of the Hessian is determined by the dependence structure of $y$
on $\theta$. By default, the Hessian is treated as diagonal
Contributor

We should note that if the Hessian block size is not 1 or N, then the size of theta needs to be divisible by the Hessian block size.
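As an illustration of the divisibility requirement, the following plain-Python sketch splits an `n` by `n` Hessian into equal diagonal blocks and rejects block sizes that do not divide `n`. The helper name `hessian_block_ranges` is hypothetical and is not part of Stan; it only shows the kind of check the documentation could describe.

```python
def hessian_block_ranges(n, block_size):
    """Return the (start, stop) index ranges of the diagonal blocks of an n x n Hessian."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    if n % block_size != 0:
        # Equal-sized blocks cannot tile the diagonal unless block_size divides n.
        raise ValueError(f"size of theta ({n}) is not divisible by block size ({block_size})")
    return [(start, start + block_size) for start in range(0, n, block_size)]


blocks = hessian_block_ranges(6, 2)  # three 2 x 2 blocks along the diagonal
```

Block sizes of 1 (diagonal Hessian) and `n` (dense Hessian) always pass this check, which matches the two special cases mentioned in the comment above.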

* `solver`: choice of Newton solver. The optimizer used to compute the
Laplace approximation does one of three matrix decompositions to compute a
Newton step. The problem determines which decomposition is numerically stable.
By default (`solver=1`), the solver makes a Cholesky decomposition of the
Contributor

Make this a list

```
matrix K_function(...)
```
There are no type restrictions for the variadic arguments. The variables $\phi$
Contributor

What is phi here?

Contributor

I think it's more that, from the section, it's not clear how this is related to the K function.

The only restriction is that this function returns a positive-definite matrix
with size $n \times n$ where $n$ is the size of $\theta$. The signature is:
```
matrix K_function(...)
```
Contributor

Can we call this covariance_function?
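For readers unfamiliar with the requirement that the covariance function return a positive-definite $n \times n$ matrix, here is a small plain-Python sketch. It assumes a squared-exponential kernel with a diagonal jitter term, which is one common choice and not something this PR prescribes. The names `sq_exp_cov` and `cholesky` are hypothetical helpers written for this note. The Cholesky factorization doubles as a positive-definiteness check.

```python
import math


def sq_exp_cov(x, alpha, rho, jitter=1e-8):
    """Squared-exponential covariance matrix over the points x, with diagonal jitter."""
    n = len(x)
    K = [[alpha**2 * math.exp(-0.5 * ((x[i] - x[j]) / rho) ** 2) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        K[i][i] += jitter  # guards against numerical loss of positive definiteness
    return K


def cholesky(K):
    """Lower-triangular Cholesky factor of K; raises ValueError if K is not positive definite."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


x = [0.0, 1.0, 2.5]
K = sq_exp_cov(x, alpha=1.0, rho=1.0)  # 3 x 3, matching a latent vector of size 3
L = cholesky(K)                        # succeeds only if K is positive definite
```

The returned matrix has one row and one column per element of the latent vector, so its size must agree with the size of $\theta$.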

@mitzimorris
Member

does the pdf build?

\index{{\tt \bfseries laplace\_marginal\_tol }!{\tt (function ll\_function, tuple(...), vector theta\_init, function K\_function, tuple(...), real tol, int max\_steps, int hessian\_block\_size, int solver, int max\_steps\_linesearch): real}|hyperpage}

<!-- real; laplace_marginal_tol; (function likelihood_function, tuple(...), vector theta_init, function covariance_function, tuple(...), real tol, int max_steps, int hessian_block_size, int solver, int max_steps_linesearch); -->

I had to go through the file and change the theta_init in the \index directive to theta\_init ?

In the above procedure, neither the marginal posterior nor the conditional posterior
are typically available in closed form and so they must be approximated.
The marginal posterior can be written as $p(\phi \mid y) \propto p(y \mid \phi) p(\phi)$,
where $p(y \mid \phi) = \int p(y \mid \phi, \theta) p(\theta) \text{d}\theta$ $
Member

stray $ at end of line.
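For context on the integral in the quoted hunk, the standard Laplace approximation to the marginal likelihood can be written out as follows. This is a textbook sketch, not text from the PR. It assumes a zero-mean multivariate normal prior $\theta \sim \text{N}(0, K)$, where $K$ may depend on $\phi$, and it introduces the symbols $\theta^*$ and $W$, which do not appear in the quoted lines.

$$
\log p(y \mid \phi) \approx \log p(y \mid \phi, \theta^*) - \frac{1}{2} \theta^{*\top} K^{-1} \theta^* - \frac{1}{2} \log \det (I + K W),
$$

where $\theta^* = \text{argmax}_\theta \left[ \log p(y \mid \phi, \theta) + \log p(\theta) \right]$ is the mode of the conditional posterior and $W = - \nabla^2_\theta \log p(y \mid \phi, \theta) \big|_{\theta = \theta^*}$ is the negative Hessian of the log likelihood at that mode. The expression follows from a second-order Taylor expansion of the log integrand around $\theta^*$, after which the integral over $\theta$ is Gaussian.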

@charlesm93
Contributor Author

@avehtari thanks for the feedback. Changes implemented.

@charlesm93
Contributor Author

I edited the section on GPs to demonstrate how the embedded Laplace can be used. Would love for someone to take a look.

(Note to self: add example with control parameters.)

and allows the user to tune the control parameters of the approximation.
{{< since 2.37 >}}

## Built-in Laplace marginal likelihood functions
Member

Any chance that all these wrappers could have an argument for the mean? Now it's possible to define only the covariance, but there are many cases where the latent Gaussian model is used as a component and the mean is non-zero, and then these functions can't be used. Sorry for not noticing this before trying to use the release candidate.

Contributor Author

I agree this makes sense and I should've pursued that in the first place. This would be straightforward to implement but then there is all the work that goes into testing and documenting this properly.

So the question is whether we want to open an issue and add the custom functions in a future release or push for this to appear in the current release.

Contributor Author

I also like the idea of getting rid of poisson_log and poisson_log_2 and just having one (set of) function(s) for each likelihood and link, which requires the user to specify the mean, potentially setting it to a vector of 0s.

Member

It depends on if these would be new signatures or require breaking the existing signatures. I would prefer to do it in a future release, but that’s only possible if it would be backwards compatible

Member

I would very much like to have the mean as an argument in this release, and I can help with docs, but I'm not able to help with C++ and I understand if it's too much work to get it into this release. If it were left for a future release, it might be better not to make the current built-in Laplace marginal likelihood functions public?

Member

It's worth starting a new thread for this @avehtari, probably in the math library issues. We can discuss options and their feasibility more easily there.

@WardBrian
Member

@charlesm93 99661ef deleted some of the index metadata and it looks like the tol and non-tol arguments are a bit mixed up. What was the intended fix there?

@charlesm93
Contributor Author

@WardBrian are you talking about the commented-out lines? I thought they were left over from previous drafts and I deleted them. My bad if these actually serve a purpose!

@charlesm93
Contributor Author

I added a chapter in the reference manual describing the embedded Laplace approximation. This will also need someone to review.

@WardBrian
Member

Yes, the HTML comments are scraped to build the index on the website

@charlesm93
Contributor Author

@WardBrian should be fixed now.

@mitzimorris
Member

mitzimorris commented Jun 11, 2025

just pulled the latest version and can't build the html.
nevermind - in wrong conda environment.

@charlesm93
Contributor Author

charlesm93 commented Jun 11, 2025

how do I add the section on the embedded Laplace approximation in the reference manual? I made edits to reference-manual/_quarto.yml and added laplace_embedded.qmd as a chapter, but building the webpage doesn't render the new chapter.

@WardBrian
Member

@charlesm93 that is the correct place to add it for it to show up in the pdf. You should separately add it to the _quarto.yml in the folder above.

Successfully merging this pull request may close these issues.

Documentation for embedded Laplace approximation
5 participants