Quick Start

  1. What is the BIE?

    The BIE is, fundamentally, an object-oriented class library of tools, written in C++, that allows the user to put together a Markov Chain Monte Carlo simulation for Bayesian inference.

    If you do not know what object-oriented means, think of a plumbing kit that comes with different sized pipes, fittings, fixtures, etc. The different objects (in the code and in the plumbing kit) have to both fit together and maintain the flow in order to function.

    To facilitate "what if" exploration, we provide a command line interface (written with Bison and Flex) that will run input scripts. This is similar to gnuplot. Behind the scenes is a mechanism that knows about and enforces the object-oriented hierarchy (the fitting together of the different objects).

    The output of the code is a simulation of the Bayesian posterior distribution. In terms of a parameter estimate, the posterior distribution is the probability of some parameter vector $\theta$ given some pile of data $D$: $P(\theta|D)$. From this distribution, we can compute summary statistics, e.g. by taking moments, or determine confidence intervals and so forth. All of these quantities are fundamentally integrals over $P(\theta|D)$. The Markov Chain approach produces variates $\theta$ distributed according to $P(\theta|D)$, so moments are trivially obtained by summing over the ensemble of variates.
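
    To make the last point concrete, here is a minimal sketch of the idea, assuming a one-dimensional parameter and a plain random-walk Metropolis step. It is not one of the BIE's own classes or algorithms; the names metropolis, ensembleMean, and ensembleVariance are made up for this illustration. The posterior mean $\int \theta P(\theta|D) d\theta$ is estimated by the average $(1/N) \sum_i \theta_i$ over the chain.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Illustrative random-walk Metropolis sampler (hypothetical helper, not BIE code).
// logPost returns log P(theta|D) up to an additive constant.
std::vector<double> metropolis(const std::function<double(double)>& logPost,
                               double theta0, double stepWidth,
                               std::size_t nSteps, unsigned seed)
{
    assert(stepWidth > 0.0);
    std::mt19937 gen(seed);
    std::normal_distribution<double> proposal(0.0, stepWidth);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    std::vector<double> chain;
    chain.reserve(nSteps);
    double theta = theta0;
    double lp = logPost(theta);
    for (std::size_t i = 0; i < nSteps; ++i) {
        const double trial = theta + proposal(gen);
        const double lpTrial = logPost(trial);
        // u lies in (0, 1]; accept with probability min(1, P(trial|D)/P(theta|D)).
        const double u = 1.0 - unif(gen);
        if (std::log(u) < lpTrial - lp) {
            theta = trial;
            lp = lpTrial;
        }
        // The current state is recorded whether or not the trial was accepted.
        chain.push_back(theta);
    }
    return chain;
}

// First moment: the ensemble average of the variates.
double ensembleMean(const std::vector<double>& chain)
{
    assert(!chain.empty());
    double sum = 0.0;
    for (double t : chain) sum += t;
    return sum / static_cast<double>(chain.size());
}

// Second central moment: the ensemble average of (theta - mean)^2.
double ensembleVariance(const std::vector<double>& chain)
{
    const double mean = ensembleMean(chain);
    double sum = 0.0;
    for (double t : chain) sum += (t - mean) * (t - mean);
    return sum / static_cast<double>(chain.size());
}
```

    Calling metropolis with a log posterior such as -0.5*(theta - 3)^2 and then ensembleMean on the returned chain should give a value close to 3, with scatter that shrinks as the chain grows. Successive variates are correlated, so the effective number of independent samples is smaller than the chain length.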

    The BIE provides a number of options for a Bayesian simulation:

    • Prior distributions;
    • Markov Chain Monte Carlo algorithms with associated convergence statistics;
    • Likelihood functions (binned, point, and user-defined), for both serial and parallel use; and
    • General input and output streams for handling data.

    The BIE provides a Model class from which all of the models you wish to investigate will be derived. This is not as daunting as it sounds, and the base code contains a number of simple and more complex examples for you to study. If you can write some C code, you can incorporate your model.
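
    As a rough picture of what "deriving a model" means in C++, here is a small sketch. The base class SketchModel and its two methods are stand-ins invented for this illustration; they are not the interface of the BIE Model class, which you should take from the examples in the base code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Stand-in base class for illustration only. The real BIE Model class has
// its own interface, which this sketch does not attempt to reproduce.
class SketchModel {
public:
    virtual ~SketchModel() = default;
    // Set the parameter vector theta for subsequent evaluations.
    virtual void setParameters(const std::vector<double>& theta) = 0;
    // Normalized model density at the point x for the current parameters.
    virtual double density(double x) const = 0;
};

// A derived model: a one-dimensional Gaussian with theta = (mean, sigma).
class GaussianSketchModel : public SketchModel {
public:
    void setParameters(const std::vector<double>& theta) override
    {
        assert(theta.size() == 2 && theta[1] > 0.0);
        mean_ = theta[0];
        sigma_ = theta[1];
    }

    double density(double x) const override
    {
        const double pi = std::acos(-1.0);
        const double r = (x - mean_) / sigma_;
        return std::exp(-0.5 * r * r) / (sigma_ * std::sqrt(2.0 * pi));
    }

private:
    double mean_ = 0.0;   // default: zero mean
    double sigma_ = 1.0;  // default: unit width
};
```

    The point of the pattern is that code written against the base class (a likelihood function, say) can evaluate any derived model through the same calls without knowing which model it has been handed.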

    Alternatively, you may write your own likelihood function that incorporates both your model and your data. For data in a specialized form, this approach might be the easiest; see GaussTestLikelihoodFunction for a simple example.
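
    To show the kind of computation such a function carries out, here is a minimal stand-alone sketch of a Gaussian log likelihood that combines the data and the model in one place. The function name gaussLogLikelihood and its signature are assumptions made for this illustration; it does not use the BIE likelihood interface and is not the code of GaussTestLikelihoodFunction.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-alone function, not the BIE likelihood interface.
// Returns log L = sum_i log N(x_i | mean, sigma) for independent data x_i.
double gaussLogLikelihood(const std::vector<double>& data,
                          double mean, double sigma)
{
    assert(sigma > 0.0);
    const double twoPi = 2.0 * std::acos(-1.0);

    // Sum of squared, scaled residuals between the data and the model mean.
    double sumSq = 0.0;
    for (double x : data) {
        const double r = (x - mean) / sigma;
        sumSq += r * r;
    }

    const double n = static_cast<double>(data.size());
    return -0.5 * sumSq - n * std::log(sigma) - 0.5 * n * std::log(twoPi);
}
```

    Working with the logarithm rather than the likelihood itself avoids underflow when many data points are multiplied together, which is why the sketch returns log L.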

  2. Now, I have to compile the dang thing . . .
    For most people, this is the most frustrating part of getting started.
    • Begin by trying the autogen.sh script. You will probably want to pass in your desired install directory. E.g., I usually install in my top-level home directory, which may be done with the flag --prefix=/home/weinberg. So, the full command would be:

      	  ./autogen.sh --prefix=/home/weinberg
      	

      Hopefully, this will just work. NB: BIE uses a sophisticated persistence and checkpointing scheme that ensures that the full state of your running simulation may be saved and logged for future use. This can make the compilation slooooow. Good news: once it's compiled, it will stay compiled.

    • The configure script will complain to you if it can't find some important package. In particular, I usually install the latest version of the Boost library locally. So, I tell the configure script where to find it using:

      	  ./autogen.sh --prefix=/home/weinberg --with-boost=/usr/local/boost
      	

    • There are a number of --with-xxx=yyy flags that might be useful. Try configure --help for a summary.

    • Finally, I recommend a make install to install all of the executables and libraries in your previously specified location. You don't have to do this until you are convinced you would like to dig deeper; the code will run from its make directory without installation.

  3. Try an example!
    You may find sample scripts in the examples directory. This directory has three types of models:
    1. A Splat model -- a two-dimensional Gaussian density distribution. Each sample from this distribution may be associated with one or more auxiliary attributes. These examples attempt parameter inference for mixtures of such Splats.
    2. A Galaxy model -- we assume a distribution of standard candles sampled from a galaxy disk characterized by an exponential scale length and $\mathrm{sech}^2$ scale height. These examples attempt parameter inference for mixtures of such galaxy disks (e.g. "thick" and "thin" disks).
    3. One-dimensional Gaussian distributions defined using the user likelihood method mentioned above.
    Each model has three subdirectories (scalar, parallel, and data) containing, respectively, scripts designed to be run on a single processor, scripts designed to be run on multiple processors using MPI, and various data files.

    Begin with the examples in the Splat/scalar directory. We will assume that you installed the BIE in /home/user/BIE. Then:

        cd /home/user/BIE/examples/Splats/scalar
        /home/user/BIE/cli/cli -f scriptX
      
    will run an example simulation.

    What have you done??

    • Console output
    • Some description of the input classes

  4. Ok, maybe this will work for me. What should I do next to learn more?
    • More examples
    • Look at blurf to understand the class structure
    • Read the paper . . .


Send suggestions, questions, and feedback to WEINBERG at ASTRO dot UMASS dot EDU.
Documentation generated at Fri Mar 26 00:35:11 2010 by doxygen