
Statistical physics vs Bayesian inference

by joe · November 29, 2018

This article is originally published at https://joewheatley.net

Statistical physics and Bayesian inference are closely related (see, for example, Andrew Gelman’s remarks). A good way to illustrate the relationship is to simulate a statistical physics model using the “state-of-the-art full Bayesian statistical inference platform” stan. stan uses Hamiltonian Monte Carlo (HMC) to sample the typical set of the posterior distribution efficiently (Neal 2012), but without the main drawback of plain HMC, namely the need to hand-tune the algorithm parameters (Betancourt 2017).

2D XY model

The 2D XY model has played a pivotal role in the physics of the past half-century. The model describes a collection of interacting rotors, or spins, that are free to rotate in the xy-plane. Each spin is associated with a site $\bf{r}$ on a $D \times D$ square lattice and is described by its angle parameter $\theta_{\bf{r}} \in [-\pi,\pi)$. Each spin interacts with its four nearest neighbours, located at $\bf{r}+\boldsymbol{\tau}$. The energy of a particular configuration $\{\theta_{\bf{r}}\}$ is

(1) $E = -\sum_{\bf{r} \boldsymbol{\tau}} \cos(\theta_{\bf{r}+\boldsymbol{\tau}}-\theta_{\bf{r}})$

The negative sign in Eqn (1) implies that parallel arrangements of spins have lowest energy (“ferromagnet”). The likelihood of a configuration $\{\theta_{\bf{r}}\}$ is given by the Boltzmann distribution of statistical physics. With uniform priors[*] the posterior joint probability distribution is

(2) $P\left( \{\theta_{\bf{r}}\} | \beta \right) \propto e^{-\beta E\left( \{\theta_{\bf{r}}\} \right)}$

where $\beta$ is the inverse temperature.
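As a rough sketch, Eqns (1) and (2) can be evaluated in R for any configuration stored as a $D \times D$ matrix of angles. The helper names `xy_energy` and `log_weight` are assumptions made here for illustration and do not appear in the original post. Periodic boundaries are assumed, and the sum runs over all four neighbours of every site, so each bond contributes twice.

```r
# Hypothetical helpers (not from the original post).
# Energy of Eqn (1) on a D x D lattice with periodic boundaries (D >= 2).
xy_energy <- function(theta) {
  D <- nrow(theta)
  up   <- c(2:D, 1)         # index i+1 with wrap-around
  down <- c(D, 1:(D - 1))   # index i-1 with wrap-around
  -sum(cos(theta[up, ] - theta) + cos(theta[down, ] - theta) +
       cos(theta[, up] - theta) + cos(theta[, down] - theta))
}

# Unnormalised log posterior of Eqn (2)
log_weight <- function(theta, beta) -beta * xy_energy(theta)

aligned <- matrix(0.3, 4, 4)     # all spins parallel
xy_energy(aligned)               # -64: four cosine terms of 1 at each of 16 sites
log_weight(aligned, beta = 0.9)  # 57.6
```

Any configuration with misaligned neighbours has a higher energy and therefore a lower weight at positive $\beta$.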

In statistical physics the number of internal parameters $N$ can be very large (e.g. the analogue of Avogadro’s number for a 2D system is $N \sim 7 \times 10^{15}$). On the other hand, $\beta$ is the sole external data value. This is an extreme limit of Bayesian inference in which the inference dataset is tiny but there is an enormous number of model parameters. In statistical physics, interesting posterior distributions arise from the explicit interactions between parameters, even in the absence of a rich inference dataset.
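The quoted figure can be checked quickly. The sketch below assumes that it comes from scaling Avogadro’s number from three dimensions down to two, i.e. raising it to the power 2/3; this reading is an interpretation, not something the post states explicitly.

```r
N_A  <- 6.02214076e23   # Avogadro's number (exact SI value)
N_2D <- N_A^(2/3)       # assumed scaling from a 3D to a 2D system
signif(N_2D, 2)         # about 7.1e15
```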

The statistical physics of the 2D XY model is far from trivial. When $\beta \rightarrow \infty$ all spins align along a common direction to minimise the energy [**]. For large but finite $\beta$ the spins are nearly parallel, and the cosine terms in Eqn (1) can be expanded to quadratic order in the small angular differences between neighbouring spins. This “spin-wave” model can be solved analytically. For large $N$ it gives “quasi long range order”, i.e. power law decay of spin correlations with distance, with exponent $\eta = \frac{1}{2 \pi \beta}$:

$\langle \cos\left(\theta_{\bf r^\prime}-\theta_{\bf r }\right) \rangle \sim | \bf{r}-\bf{r^\prime} |^{-\eta}$
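The small-angle expansion behind the spin-wave model can be written out explicitly. This is a sketch in the notation of Eqn (1); the shorthand $\Delta_{\bf{r}\boldsymbol{\tau}}$ is introduced here and is not used in the original post.

```latex
\Delta_{\bf{r}\boldsymbol{\tau}} = \theta_{\bf{r}+\boldsymbol{\tau}} - \theta_{\bf{r}},
\qquad
\cos(\Delta_{\bf{r}\boldsymbol{\tau}}) \approx 1 - \tfrac{1}{2}\Delta_{\bf{r}\boldsymbol{\tau}}^{2}

E \approx E_0 + \frac{1}{2}\sum_{\bf{r}\boldsymbol{\tau}}
  \left(\theta_{\bf{r}+\boldsymbol{\tau}} - \theta_{\bf{r}}\right)^{2},
\qquad
E_0 = -\sum_{\bf{r}\boldsymbol{\tau}} 1

P\left(\{\theta_{\bf{r}}\} | \beta\right) \propto
  \exp\left[-\frac{\beta}{2}\sum_{\bf{r}\boldsymbol{\tau}}
  \left(\theta_{\bf{r}+\boldsymbol{\tau}} - \theta_{\bf{r}}\right)^{2}\right]
```

In this approximation the posterior is Gaussian in the angles, which is why the model becomes analytically tractable.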

In the opposite, weak-coupling limit $\beta \ll 1$, entropy dominates and typical draws from the posterior distribution consist of nearly randomly oriented spins with short-range (exponentially decaying) correlations. This suggests that a phase transition exists between a weak-coupling disordered phase and a strong-coupling phase with quasi long range order. The transition is known to occur at $\beta_{BKT} \approx 1.11996$ for very large $N$. As the transition is approached from the ordered (large $\beta$) side, $\eta \rightarrow \frac{1}{4}$.
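The spin correlation function discussed above can be estimated from a single configuration. The following is a minimal sketch, assuming periodic boundaries and separations measured along the two lattice axes only; `spin_correlation` is a hypothetical helper name, not part of the original post.

```r
# Hypothetical helper: average of cos(theta_r' - theta_r) over all pairs of
# sites separated by r lattice units along either axis (periodic boundaries).
spin_correlation <- function(theta, r) {
  D <- nrow(theta)
  shifted <- ((seq_len(D) - 1 + r) %% D) + 1   # site index i + r, wrapped
  mean(c(cos(theta[shifted, ] - theta), cos(theta[, shifted] - theta)))
}

set.seed(1)
disordered <- matrix(runif(32 * 32, -pi, pi), 32, 32)   # independent random spins
ordered    <- matrix(1, 32, 32)                         # perfectly aligned spins

sapply(1:4, function(r) spin_correlation(disordered, r))   # all close to 0
sapply(1:4, function(r) spin_correlation(ordered, r))      # all equal to 1
```

Averaging this quantity over many posterior draws, for a range of separations, would give an estimate of how the correlations decay with distance.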

It is a remarkable fact that the phase transition in the 2D XY model is driven by a proliferation of vortices (topological defects). This discovery was made in the 1970s and was recognised with a Nobel Prize in 2016. Some vortex configurations on an ordered background are illustrated below. Note that the spins return to their undisturbed state far from a vortex-antivortex pair, but not in the other cases shown. [***]

stan

It is an easy task to simulate Equations (1) & (2) in stan. The stan code (uniform priors and periodic boundary conditions) is

xy_model <- '
functions {
  // f imposes periodic boundary conditions: it maps the
  // out-of-range indices 0 and L+1 back onto the lattice
  int f(int i, int L) {
    if (i == 0)
      return L;
    else if (i == L + 1)
      return 1;
    else
      return i;
  }
}

data {
  int D;      // linear dimension of the square lattice
  real beta;  // coupling or inverse temperature
}

parameters {
  matrix<lower=-pi(), upper=pi()>[D, D] theta;  // the spin parameters
}

model {
  // site energy: sum of cosines over the four neighbours of site (i, j)
  matrix[D, D] energy;
  for (i in 1:D)
    for (j in 1:D)
      energy[i, j] = cos(theta[f(i+1, D), f(j, D)] - theta[f(i, D), f(j, D)])
                   + cos(theta[f(i-1, D), f(j, D)] - theta[f(i, D), f(j, D)])
                   + cos(theta[f(i, D), f(j+1, D)] - theta[f(i, D), f(j, D)])
                   + cos(theta[f(i, D), f(j-1, D)] - theta[f(i, D), f(j, D)]);

  // exponential_lpdf(4 - e | beta) = log(beta) - beta * (4 - e), so each site
  // adds beta * e to the log density, up to a constant
  for (i in 1:D)
    for (j in 1:D) {
      target += exponential_lpdf(4 - energy[i, j] | beta);
    }
}
'
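The `exponential_lpdf` statement is a compact way of adding the Boltzmann weight to the log density. The identity it relies on can be checked in R, using `dexp()` as a stand-in for Stan’s `exponential_lpdf` (both use the rate parameterisation). This is a sketch with made-up site energies, not output from the model.

```r
beta <- 0.9
set.seed(42)
site_energy <- runif(5, -4, 4)   # stand-ins for the per-site sums of four cosines

# What the model block accumulates, one term per site
via_lpdf <- sum(dexp(4 - site_energy, rate = beta, log = TRUE))

# The same quantity written as beta * (sum of cosines) plus a constant
constant <- length(site_energy) * (log(beta) - 4 * beta)
direct   <- beta * sum(site_energy) + constant

all.equal(via_lpdf, direct)   # TRUE
```

Because every bond appears in the site energy of both of its end points, each bond enters this log density with weight $2\beta$. This is worth keeping in mind when comparing values of $\beta$ against conventions that count each bond once.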

To run the model in R,

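library(rstan)
xy_mod <- stan_model(model_code = xy_model)  # compile the Stan program defined above
Nchain <- 8
D <- 32
# runif(1, ...) draws a single angle per chain, so each chain starts from a fully
# aligned lattice pointing in a random direction; use runif(D * D, -pi, pi)
# instead for independently random spins
init <- lapply(1:Nchain, function(i) list(theta = matrix(runif(1, -pi, pi), D, D)))
xy_samples <- sampling(xy_mod, chains = Nchain, init = init, data = list(D = D, beta = 0.9), iter = 2000, warmup = 1000)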

Here are some posterior draws generated by the above code after warm-up for $D=32$ ( $N=1024$ ).

Firstly, the low-temperature “spin-wave” phase with power law correlations…

Secondly, the high-temperature disordered phase with unbound vortices…

Finally, close to the transition, weakly bound vortex-antivortex pairs…

It is straightforward to compute quantities of physical interest from the HMC sample draws. The graph below shows mean energy per site, mean site magnetisation and mean square vorticity averaged over ~ 10,000 draws. Vortices proliferate in the vicinity of the transition as expected.
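A minimal sketch of such a calculation follows. It assumes the draws are held in an array of dimension (number of draws) × D × D, which is the shape `rstan::extract()` returns for a matrix parameter. The helper names and the synthetic `draws` array are illustrative assumptions, and the energy follows the convention of Eqn (1), in which every bond is visited from both ends.

```r
# Hypothetical helpers for per-draw observables (periodic boundaries assumed)
energy_per_site <- function(theta) {
  D <- nrow(theta)
  nxt <- c(2:D, 1)   # periodic neighbour index
  # factor 2: Eqn (1) visits each bond from both of its end points
  -2 * sum(cos(theta[nxt, ] - theta) + cos(theta[, nxt] - theta)) / D^2
}
magnetisation <- function(theta) {
  sqrt(mean(cos(theta))^2 + mean(sin(theta))^2)   # length of the mean spin vector
}

# Synthetic stand-in for posterior draws; after sampling one would instead use
# draws <- rstan::extract(xy_samples, pars = "theta")$theta
set.seed(7)
draws <- array(runif(50 * 8 * 8, -pi, pi), dim = c(50, 8, 8))

mean(apply(draws, 1, energy_per_site))   # near 0 for random spins
mean(apply(draws, 1, magnetisation))     # well below 1 for random spins
```

For a perfectly aligned lattice the same functions give an energy of -4 per site and a magnetisation of 1.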

[*] Non-uniform priors can also be used in Eqn (2). A simple choice is the circular von Mises distribution

$p\left(\theta_{\bf r}\right) \propto e^{\kappa \cos \left(\theta_{\bf r}- \mu \right)}$

which favours spin direction $\mu$. This is mathematically equivalent to keeping a uniform prior and adding an external “magnetic field” term $-H \sum_{\bf{r}} \cos(\theta_{\bf{r}} - \mu)$ to the energy in Eqn (1), with $\kappa = \beta H$.
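The equivalence can be seen by multiplying the Boltzmann factor of Eqn (2) by one von Mises factor per site and collecting terms in the exponent. This is a short sketch in the notation used above.

```latex
P\left(\{\theta_{\bf{r}}\} | \beta\right)
  \propto e^{-\beta E} \prod_{\bf{r}} e^{\kappa \cos(\theta_{\bf{r}} - \mu)}
  = \exp\left[-\beta\left(E - \frac{\kappa}{\beta}
      \sum_{\bf{r}} \cos(\theta_{\bf{r}} - \mu)\right)\right]
  = \exp\left[-\beta\left(E - H \sum_{\bf{r}} \cos(\theta_{\bf{r}} - \mu)\right)\right],
\qquad H = \frac{\kappa}{\beta}
```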

[**] A very small external magnetic field can be used to select the direction.

[***] The net vorticity $N_v$ of a region is measured by summing angular differences around its perimeter $L$, $\sum_{i \in L} \Delta \theta_i = 2 \pi N_v$ (where each angular difference is wrapped to the interval $[-\pi,\pi)$). This formula can be applied to an individual plaquette to locate vortex cores, for example.
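Applied to every elementary plaquette, the formula above might be coded as follows. This is a sketch assuming periodic boundaries; `wrap_angle` and `plaquette_vorticity` are hypothetical helper names, and the single-vortex test configuration is constructed here purely for illustration.

```r
# Hypothetical helpers (not from the original post)
wrap_angle <- function(x) (x + pi) %% (2 * pi) - pi   # map to [-pi, pi)

plaquette_vorticity <- function(theta) {
  D <- nrow(theta)
  nxt <- c(2:D, 1)          # periodic neighbour index
  p00 <- theta              # corner (i, j)
  p10 <- theta[nxt, ]       # corner (i+1, j)
  p11 <- theta[nxt, nxt]    # corner (i+1, j+1)
  p01 <- theta[, nxt]       # corner (i, j+1)
  total <- wrap_angle(p10 - p00) + wrap_angle(p11 - p10) +
           wrap_angle(p01 - p11) + wrap_angle(p00 - p01)
  round(total / (2 * pi))   # integer N_v for each plaquette
}

# A single vortex centred in the plaquette with lower corner (4, 4)
vortex <- outer(1:8, 1:8, function(i, j) atan2(j - 4.5, i - 4.5))
v <- plaquette_vorticity(vortex)
v[4, 4]     # 1: the vortex core
sum(v)      # 0: on a periodic lattice the net vorticity always cancels
mean(v^2)   # one possible definition of a mean square vorticity
```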
