Bayesian Inference with Backfitting MCMC
This article is originally published at https://stablemarkets.wordpress.com
Previous posts in this series on MCMC samplers for Bayesian inference (in order of publication):
- Bayesian Simple Linear Regression with Gibbs Sampling in R
- Blocked Gibbs Sampling in R for Bayesian Multiple Linear Regression
- Metropolis-in-Gibbs Sampling and Runtime Analysis with Profvis
- Speeding up Metropolis-Hastings with Rcpp
All code for this post (and previous posts) is in my GitHub repo
Recently I’ve been reading about Bayesian additive regression trees (BART). For those interested, the paper is here. I use similar notation when describing the backfitting procedure in this post. I believe this MCMC method was first described here.
BART is a Bayesian nonparametric model that can be fit using backfitting MCMC, which is cool in itself. I thought I'd write a post describing the intuition behind backfitting, along with a toy example.
All code for this post can be found in this GitHub repo. I don’t use any packages … the MCMC is implemented from scratch.
Consider data on a covariate $x_i$ and an outcome $y_i$ for $n$ subjects, $i = 1, \dots, n$. In this toy example, the data looks something like this:

(Figure: scatter plot of the simulated data.)
We might consider the following probabilistic model:

$$y_i \sim N\big(\mu(x_i),\ \sigma^2\big)$$

where

$$\mu(x_i) = \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3$$
Essentially we model the conditional mean using a third-degree polynomial. Note that this is a special case of the more general additive model

$$\mu(x_i) = \sum_{j=1}^{p} f_j(x_i)$$

where in this case $p = 3$ and $f_j(x_i) = \beta_j x_i^j$. The model is completed with a flat prior on each element of the parameter vector $\beta = (\beta_1, \beta_2, \beta_3)'$ and an inverse-gamma prior, $\sigma^2 \sim IG(a, b)$, with shape $a$ and rate $b$, on the variance parameter.
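To make this concrete, here is a minimal sketch of how such data could be simulated. The true coefficient values, noise standard deviation, and sample size below are my own assumptions for illustration, not necessarily the values used in the repo:

```r
set.seed(1)

n <- 100
x <- runif(n, -2, 2)

# assumed true parameter values (for illustration only)
beta_true  <- c(1, -1, .5)   # (beta_1, beta_2, beta_3)
sigma_true <- 1              # residual standard deviation

# cubic conditional mean plus Gaussian noise
mu <- beta_true[1]*x + beta_true[2]*x^2 + beta_true[3]*x^3
y  <- rnorm(n, mean = mu, sd = sigma_true)

plot(x, y, main = "Simulated toy data")
```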
The conditional posterior for each $\beta_j$ is Gaussian (because of conjugacy). We can sample from it using conjugate Gibbs or Metropolis steps. We could sample the whole parameter vector as a block as well, but in this post we'll stick to backfitting, which is itself a Gibbs sampler. We still sample from the conditional posterior of each $\beta_j$ conditional on the other $\beta$s. However, we make use of the key insight that the conditional posterior of each $\beta_j$ depends on the other betas, denoted by $\beta_{-j}$, only through the residual

$$r_{ij} = y_i - \sum_{k \neq j} \beta_k x_i^k$$
Intuitively, $r_{ij}$ is the portion of the mean of $y_i$ left over after the portions explained by the other terms (all but the $j^{th}$) are subtracted off. It is also normally distributed,

$$r_{ij} \sim N\big(\beta_j x_i^j,\ \sigma^2\big)$$
With the flat prior on $\beta_j$ (the limiting case of a normal prior with infinite variance), the conditional posterior can be computed via conjugacy as

$$\beta_j \mid \beta_{-j}, \sigma^2, y \sim N\left( \frac{\sum_{i=1}^n x_i^j\, r_{ij}}{\sum_{i=1}^n x_i^{2j}},\ \frac{\sigma^2}{\sum_{i=1}^n x_i^{2j}} \right)$$
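As a sketch, a single backfitting update for $\beta_j$ might look like the function below. The function name and structure are mine, not necessarily what's in the repo; it assumes the flat-prior conditional posterior derived above:

```r
# Draw beta_j from its conditional posterior, given the other betas
# (via the residual r_ij) and the current draw of sigma^2.
draw_beta_j <- function(j, y, x, beta, sigma2) {
  X <- cbind(x, x^2, x^3)

  # residual: y minus the contributions of all terms except the j-th
  r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]

  # conjugate Gaussian conditional posterior under a flat prior
  post_var  <- sigma2 / sum(X[, j]^2)
  post_mean <- sum(X[, j] * r_j) / sum(X[, j]^2)

  rnorm(1, mean = post_mean, sd = sqrt(post_var))
}
```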
The backfitting MCMC proceeds as follows. First, initialize all betas except $\beta_1$. This is totally arbitrary; you can start with any parameter. Then, within each Gibbs iteration,
- Compute $r_{i1}$ with the values of $\beta_2$ and $\beta_3$ at the current iteration. Then sample $\beta_1$ from the posterior $p(\beta_1 \mid r_{i1}, \sigma^2)$, conditional on the current draw of $\sigma^2$.
- Compute $r_{i2}$ with the values of $\beta_1$ and $\beta_3$ at the current iteration. Note, for $\beta_1$ use the value drawn in step 1. Then sample $\beta_2$ from the posterior $p(\beta_2 \mid r_{i2}, \sigma^2)$.
- Continue this procedure for all beta parameters.
- Once all betas are drawn, sample $\sigma^2$ from its conditional posterior, which is another inverse gamma: $\sigma^2 \mid \beta, y \sim IG\left(a + \tfrac{n}{2},\ b + \tfrac{1}{2}\sum_{i=1}^n \big(y_i - \mu(x_i)\big)^2\right)$. A sketch of the full sampler follows this list.
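Putting the steps above together, here is a minimal sketch of the full backfitting Gibbs sampler. It reuses `draw_beta_j()` from above; the prior shape and rate (`a`, `b`), iteration count, and starting values are assumptions for illustration:

```r
backfit_gibbs <- function(y, x, n_iter = 5000, a = .001, b = .001) {
  n <- length(y)
  X <- cbind(x, x^2, x^3)

  beta   <- rep(0, 3)   # arbitrary initialization
  sigma2 <- 1
  draws  <- matrix(NA, n_iter, 4,
                   dimnames = list(NULL, c("b1", "b2", "b3", "sigma2")))

  for (g in 1:n_iter) {
    # steps 1-3: backfitting updates for each beta_j
    for (j in 1:3) beta[j] <- draw_beta_j(j, y, x, beta, sigma2)

    # step 4: conjugate inverse-gamma update for sigma^2
    resid  <- y - X %*% beta
    sigma2 <- 1 / rgamma(1, shape = a + n/2, rate = b + sum(resid^2)/2)

    draws[g, ] <- c(beta, sigma2)
  }

  draws
}
```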
The term backfitting seems appropriate because, at each iteration, we're "backing out" the distribution of the $\beta_j$ that we want to sample using the other betas.
To get a fitted regression line, we need to sample from the posterior predictive distribution. We do this after step 4 in each Gibbs iteration by drawing values from

$$\tilde{y}_i^{(g)} \sim N\left(\beta_1^{(g)} x_i + \beta_2^{(g)} x_i^2 + \beta_3^{(g)} x_i^3,\ \sigma^{2(g)}\right)$$
The superscript $(g)$ denotes parameter values from the $g^{th}$ Gibbs iteration.
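As a sketch, the posterior predictive draws (and a fitted line with interval) could be computed from the stored draws like so; the burn-in length and prediction grid are assumptions:

```r
draws <- backfit_gibbs(y, x)
keep  <- draws[-(1:1000), ]   # discard an assumed burn-in of 1000 draws
G     <- nrow(keep)

x_grid <- seq(min(x), max(x), length.out = 50)

# one posterior predictive draw of y at each grid point, per iteration
y_pred <- sapply(x_grid, function(xg) {
  mu_g <- keep[, "b1"]*xg + keep[, "b2"]*xg^2 + keep[, "b3"]*xg^3
  rnorm(G, mean = mu_g, sd = sqrt(keep[, "sigma2"]))
})

# posterior predictive median and 95% interval give the fitted line
fit <- apply(y_pred, 2, quantile, probs = c(.025, .5, .975))

plot(x, y)
lines(x_grid, fit[2, ])
lines(x_grid, fit[1, ], lty = 2)
lines(x_grid, fit[3, ], lty = 2)
```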
Here are the results from the simulated example. Again, all code can be found in my GitHub repo.

(Figure: fitted results from the simulated example.)