BIE::GelmanRubinConverge Class Reference

Convergence class based on multiple chain analyses as desribed in Gelman and Rubin (1992). More...

#include <GelmanRubinConverge.h>

Inheritance diagram for BIE::GelmanRubinConverge:
Collaboration diagram for BIE::GelmanRubinConverge:

List of all members.

Public Member Functions

 GelmanRubinConverge (int m, Ensemble *d, string id)
 The parameter m determines the "burn-in" handling procedure.
virtual GelmanRubinConvergeNew (int m, Ensemble *d, string id)
 Factory method.
virtual bool Converged ()
 True if converged.
virtual bool IsParallel ()
 This is a parallel chain convergence test.
void setNskip (int n)
 Set number of iterations to skip between convergence tests.
void setNoutlier (int n)
 Set number of iterations before performing an outlier test.
void setMaxout (int n)
 Set number maximum number of outliers that can be masked.
void setNgood (int n)
 Set number of states to sample after convergence.
void Quiet ()
 Suppress the diagnostic output.
void setPoffset (double z)
 Set the minimum log probability offset from the maximally probably state for an outlier to be flagged in addition Grubbs' test.
int ConvergedIndex ()
 Return the number of pre-burn-in states to discard from Ensemble.
int BurnIn ()
 Return the number of pre-burn-in states to discard from Ensemble.
void setRhatMax (double r)
 Set correlation coefficient threshold (default: 1.2).
void setAlpha (double a)
 Set outlier detection confidence: 1 - alpha (default: 0, means off).
virtual bool AccumData (vector< double > &values, vector< State > &states)
 Cumulates data (overridden in SampleDistribution).
virtual bool GetLast (vector< double > &values, vector< State > &states)
 Return last state.
void ComputeDistribution ()
 Access to ComputDistribution class.
GelmanRubinConvergeNew ()
 Clone me.
double PDF (State &x)
 Required Distribution members.
double logPDF (State &x)
 Log of differential distribution function P(x).
vector< double > lower (void)
 Lower bound on distribution (in each dimension).
vector< double > upper (void)
 Upper bound on distribution (in each dimension).
vector< double > Mean ()
 Return mean of distribution (mulitvariate).
vector< double > StdDev ()
 Return standard deviation of distribution (mulitvariate).
vector< double > Moments (int m)
 Return specifided momemnt of distribution (mulitvariate).
vector< double > Sample ()
 Return random variate from distribution.

Static Public Member Functions

static void setMaxK (int n)
 Set the maximum number of states to retain if maxit=0 (default: 100000).

Static Public Attributes

Global variables
static unsigned ngood
 Number of iterations after convergence (default: int=1000).
static unsigned nskip
 Number of iterations to skip betwen tests (default: int=100).
static double alpha
 Outlier confidence value 1 - alpha (default: double=0.05).
static int maxoutlier
 Maximum number of allowed outliers (default: int=6).
static int noutlier
 Number of steps before outlier test (default: int=500).
static double rtol
 Tolerance for Rhat (default: double=1.2).
static bool verbose
 Diagnostic files (default: bool=true).
static double poffset
 Log of offset in peak probability for outliers (default: -30.0).
static bool debug
 Additional debug output (default: bool=false).
static int maxkp
 Maximum number of states to keep if maxit=0 (default: 100000).

Friends

class boost::serialization::access


Detailed Description

Convergence class based on multiple chain analyses as desribed in Gelman and Rubin (1992).

Intro and brief description
This implements the convergence metric developed by Gelman and Rubin for measuring convergence of multiple chains. The measure uses both the within- and between-chain variance estimate the total variance of the pooled chain variance. From this they compute the ratio of the pooled variance to in-chain variance as an estimate of the reduction in variance that might be expected from a convergenced simulation. They call this the "scale reduction". A score of 1 indicates convergence. Since a score of 1 is difficult to achieve, Gelman and Rubin recommend using 1.2 or 1.1 to declare convergence. We assume 1.2 by default but this value may be changed using the setRhatMax method.
Outlier detection
We use Grubbs' test (Grubbs 1969 and Stefansky 1972) to identify outliers in a univariate data set. It assumes that the underlying distribution is normal and this is appropriate for converged MCMC output. Grubbs' test detects one outlier at a time, and successive outliers are iteratively removed from the dataset.
Grubbs' test
The Grubbs' test defined the following:

\[ //! G = {\max|x_i - {\bar x}|\over \sigma_s} //! \]

where $x_i$ are the $i=1,\ldots,N$ chain samples for a particular parameter, ${\bar x}$ is the sample mean and $\sigma_s$ is the root of sample variance (standard deviation), respectively. In other words, the Grubbs test statistic is the largest absolute deviation from the sample mean in units of the sample standard deviation. Let the confidence level be $\alpha$. Then the hypothesis of no outliers is rejected if

\[ G > {N-1\over\sqrt{N}} //! \sqrt{t^2_{\alpha/(2N),N-2}\over N - 2 + t^2_{\alpha/(2N),N-2}} //! \]

where $t_{\alpha/(2N),N-2}$ denoting the critical value of the Student t-distribution with $N-2$ degrees of freedom and a significance level of $\alpha/(2N)$.

How to enable Grubbs' test
To diagnose aberrant chains, Grubbs' test may be used by setting the outlier confidence $\alpha$ using the setAlpha method. A value $\alpha=0$ implies no outlier testing. Grubbs' test will only be applied after a specified number of steps have been computed; this is set by the setNoutlier method. The default value is 500.
Other parameters
The remaining parameters are similar to the SubsampleConverge routine. In particular setNgood chooses the number of states to compute after convergence is obtained and and setNskip chooses the number of steps to skip between convergence checks.

Constructor & Destructor Documentation

BIE::GelmanRubinConverge::GelmanRubinConverge ( int  m,
Ensemble d,
string  id 
)

The parameter m determines the "burn-in" handling procedure.

If $m<0$, the most recently computed $|m|$ steps are used to determine convergence. If $m=0$, the most recent half of the steps are used and if $m>0$, the first $m$ interations are discarded before beginning to compute convergence and the most recently computed $m$ iterations are used to determine convergence, thereafter. In all cases, $N_{skip}$ retained steps are required before a convergence test is attempted. Therefore if $m>0$, the first test will be computed after $m+N_{skip}$ steps. d is an Ensemble instance used to accumulate statistics about the posterior simulation. id is used to tag diagnostic output files


Member Function Documentation

void BIE::GelmanRubinConverge::setPoffset ( double  z  )  [inline]

Set the minimum log probability offset from the maximally probably state for an outlier to be flagged in addition Grubbs' test.

In other words, outliers must also have $\log(P)<\max\{\log(P)\}+P_{offset}$. This additional requirement may be entirely disabled by setting $P_{offset}=0$.

void BIE::GelmanRubinConverge::ComputeDistribution ( void   )  [inline, virtual]

Access to ComputDistribution class.

Will be called by provided helper class as needed

Reimplemented from BIE::SampleDistribution.

Here is the call graph for this function:


The documentation for this class was generated from the following file:

Send suggestions, questions, and feedback to WEINBERG at ASTRO dot UMASS dot EDU.
Documentation generated at Fri Mar 26 00:35:12 2010 by doxygen