Saturday, May 22, 2010

Gibbs Sampler

Gibbs sampler is a special case of Metropolis-Hasting algorithm. The differences are that first, we always accepts a candidate point and second, we must know the full conditional distributions.

Let's assume that we have a joint distribution $f(x,y_1,...y_d)$, we would like to know more about the marginal density $f(x)$. So, we must integrate over all other variables.
\[f(x) = \int {...\int {f(x,{y_1},...,{y_d})d{y_1}...d{y_d}} } \]
As with other MCMC methods, Gibbs sampler generates $X_1,X_2,...,X_m$ from $f(x)$ and then use the sample to estimate the desired characteristic of $f(x)$.

For bivariate case, to generate samplers (variates) from $f(x,y)$ we alternately sampling  $x_i \sim f(x|y_{i - 1})$ and $y_i \sim f(y|x_i)$ until we got the number of samples we want.

Example. Consider the following joint distribution
\[f(x,y) \propto \left({\begin{array}{*{20}{c}}n\\x\\\end{array}} \right) y^{x+\alpha-1}(1 - y)^{n-x+\beta-1}\]
where $ x = 0,1,...,n $ and $0 \le y \le 1$.
This is the joint distribution of a binomial distribution
\[f(x|n,y)= \left({\begin{array}{*{20}{c}}n\\x\\\end{array}} \right) y^x(1 - y)^{n-x}\]
and a beta distribution
\[f(y|\alpha ,\beta ) = \frac{1}{B(\alpha ,\beta )}y^{\alpha-1}(1-y)^{\beta-1}\]
(confused? Try multiplying these two distributions together, you got the idea)

In short, \[X|y \sim Bino(n,y)\] and \[Y|x \sim Beta(x+\alpha,n-x+\beta)\]

The MATLAB code to generate samples using the Gibbs Sampler is:

% starting points;
x(1)=binornd(n,0.5,1,1);
y(1)=betarnd(x(1)+a,n-x(1)+b,1,1);
for i = 2:k
    x(i)=binornd(n,y(i-1),1,1);
    y(i)=betarnd(x(i)+a,x-x(i)+b,1,1);
end


The estimation from the Gibbs sampler is very close to the true marginal (Casella & George 1992):

\[f(x) = \left( {\begin{array}{*{20}{c}}n\\x\\\end{array}} \right)\frac{{\Gamma (\alpha+\beta)\Gamma (x+\alpha)\Gamma (n-x+\beta )}}{{\Gamma (\alpha )\Gamma (\beta )\Gamma (\alpha+\beta+n)}}\]

The code for plotting the left figure is just the histogram of $X$

[c,xi]=hist(x(m+1:end),17);
bar(xi,c/sum(c),1);

and on the right figure, we compute the true marginal density

for x = 0:16
    fhat(x+1)=nchoosek(n,x)*gamma(a+b)*gamma(x+a)*gamma(n-x+b)/...
        (gamma(a)*gamma(b)*gamma(a+b+n));
end
bar(xi,fhat,1);

No comments:

Post a Comment