
cronus is a Python tool designed to facilitate Markov Chain Monte Carlo (MCMC) and Nested Sampling (NS) on large supercomputing clusters. It relies on the powerful MCMC sampler zeus to do the heavy lifting and incorporates various MPI features along with a suite of automated Convergence Diagnostics. cronus is designed to be used via the terminal using parameter files, and it is particularly suited to Astrophysical and Cosmological applications.
Installation¶
Requirements¶
cronus is compatible with Python 3.6+. It requires numpy, scipy, mpi4py, iminuit, h5py and zeus to run. If you want to use cronus with either emcee or dynesty, please make sure that you have those installed too.
You can find information about how to install mpi4py and its prerequisites at https://mpi4py.readthedocs.io/en/stable/install.html
Install using pip¶
We recommend using pip to install the latest stable version of cronus:
pip install cronus-mcmc
Install from source¶
Alternatively, install the latest version of cronus from source:
git clone https://github.com/minaskar/cronus.git
cd cronus
pip install -r requirements.txt
pip install .
Making sure that cronus is installed properly¶
If everything went well, you should be able to import cronus in Python from anywhere in your directory structure:
$ python -c "import cronus"
If you get an error message, something went wrong. Double-check the instructions above, try again, or contact us.
cronus also installs some shell scripts. If everything went well, running cronus-run in the shell should produce a message asking you for an input file, instead of a command not found error.
Note
If you do get a command not found error, this means that the folder where your local scripts are installed has not been added to your PATH.
To solve this on Unix-based machines, look for the cronus-run script from your home and scratch folders with:
$ find `pwd` -iname cronus-run -printf %h\\n
in Linux, or:
$ which -a cronus-run
in Mac OS X.
This should print the location of the script, e.g. /home/you/.local/bin. Add:
$ export PATH="/home/you/.local/bin":$PATH
at the end of your ~/.bashrc file, and restart the terminal or do source ~/.bashrc. Alternatively, simply add that line to your cluster jobscripts just before calling cronus-run.
Quick Start¶
Overview¶
The main purpose of cronus is to facilitate large-scale Bayesian Inference (e.g. MCMC or NS) in modern super-computing environments. cronus utilises MPI to efficiently distribute the tasks to multiple nodes. Another important feature of cronus is its integrated and automated suite of Convergence Diagnostics.
Before we go into detail about how to use cronus, let us first discuss the way it works at a higher level.
cronus accepts as input a parameter file that specifies the following:
- The Python file that contains the definition of the Log Likelihood function,
- A set of priors and/or fixed values for the different parameters of the model that enter the Log Likelihood function.
Note
The parameter file can also be used to specify some additional optional information, like:
- A set of parameters that configure the MCMC/NS sampler (e.g. number of walkers); these are usually trivial to define,
- A few threshold values for the Convergence Diagnostics,
- The path/directory for the results to be saved in.
For more information about this please read the Advanced Use page.
Once a parameter file is provided, cronus efficiently distributes the sampling tasks to all available CPUs and runs until Convergence is reached. The results are saved iteratively so that the researcher can monitor the progress.

Let us present here a simple example that will help illustrate the basic features and capabilities of cronus.
Log Likelihood Function¶
The first thing we need to do is to create a Python file in which we define the Log Likelihood function. There is no real restriction to this. The model itself can be computed in any programming language (e.g. C, C++, Fortran) and the Log Likelihood can be a Python wrapper for this. In this example we will define a strongly-correlated 5-dimensional Normal distribution.
import os
os.environ["OMP_NUM_THREADS"] = "1"
import numpy as np
ndim = 5

C = np.identity(ndim)    # identity matrix (unit variances)
C[C==0] = 0.95           # set all off-diagonal elements to 0.95 (strong correlations)
Cinv = np.linalg.inv(C)  # precision (inverse covariance) matrix

def log_likelihood(x):
    return -0.5 * np.dot(x, np.dot(Cinv, x))
We then save the file as logprob.py.
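Before launching an MPI run, it can be useful to sanity-check the Log Likelihood locally; a minimal sketch, assuming the file above was saved as logprob.py in the current directory:
import numpy as np
from logprob import log_likelihood

# Evaluate the Log Likelihood at the origin (the mode of this zero-mean Gaussian)
print(log_likelihood(np.zeros(5)))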
Note
The important thing to note here is that the function accepts a single argument x. If your Log Likelihood requires more than one argument (e.g. data, covariance, etc.) we recommend making those global, like we did with the Cinv array in the example above.
Note
Some builds of NumPy (including the version included with Anaconda) will automatically parallelize some operations using something like the MKL linear algebra library. This can cause problems when used with the parallelization methods described here, so it can be good to turn that off (by setting the environment variable OMP_NUM_THREADS=1, for example).
import os
os.environ["OMP_NUM_THREADS"] = "1"
Parameter File¶
The next step is to create the parameter file that we will call file.yaml:
Likelihood:
    path: logprob.py
    function: log_likelihood

Parameters:
    a:
        prior:
            type: uniform
            min: -10.0
            max: 10.0
    b:
        fixed: 1.0
    c:
        prior:
            type: normal
            loc: 1.0
            scale: 1.0
    d:
        prior:
            type: normal
            loc: 0.0
            scale: 2.5
    e:
        prior:
            type: normal
            loc: -0.5
            scale: 1.0
You can see the following sections in the parameter file:
- The Likelihood section, which includes information about the path of the Log Likelihood function (i.e. both the directory/filename and the name of the function).
- The Parameters section, which includes the priors or fixed values for each parameter of the model.
For more information about these and additional options in the parameter file please see the Advanced Use page.
Run cronus¶
To run this example, go to the directory where you saved file.yaml and do:
$ mpiexec -n 8 cronus-run file.yaml
Here we used 8 CPUs.
Results¶
After a few seconds, an output directory will be created containing the following files:
chains/run1
├── chain_0.h5
├── chain_1.h5
├── IAT_0.dat
├── IAT_1.dat
├── GelmanRubin.dat
├── MAP.npy
├── hessian.npy
├── para.yaml
├── results.dat
└── varnames.dat
All but the results.dat file will be created shortly after the run starts, and they are updated every few iterations. Once the sampling is done, the results.dat file will be added to the list.
Let’s have a look at what each of those files contains:
- The chain_x.h5 files contain the actual MCMC samples.
- The IAT_x.dat files contain the estimated Integrated Autocorrelation Time (IAT) for each parameter. This is a measure of how independent the chain samples are (i.e. the lower the IAT the better).
- The GelmanRubin.dat file contains the Gelman-Rubin R_hat diagnostic for each parameter.
- The MAP.npy file contains the Maximum a Posteriori (MAP) estimate.
- The hessian.npy file contains the Hessian matrix evaluated at the MAP.
- The para.yaml file is a copy of the original parameter file with some extra information made explicit.
- The results.dat file includes a summary of the results (e.g. mean, std, 1-sigma, 2-sigma, etc.).
- The varnames.dat file contains a list of the parameter names.
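Some of these files can be inspected directly with NumPy; a minimal sketch for loading the MAP estimate and the Hessian (paths assume the run above):
import numpy as np

map_estimate = np.load('chains/run1/MAP.npy')    # Maximum a Posteriori estimate
hessian = np.load('chains/run1/hessian.npy')     # Hessian matrix evaluated at the MAP

print(map_estimate)
print(hessian.shape)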
Note
If we open the results.dat file using a text editor we will see the following:
| Name |      MAP |     mean |   median |      std |   -1 sigma |   +1 sigma |   -2 sigma |   +2 sigma |     IAT |     ESS |   R_hat |
|------+----------+----------+----------+----------+------------+------------+------------+------------+---------+---------+---------|
| a    | 0.885898 | 0.881579 | 0.879316 | 0.304584 |  -0.301652 |   0.308398 |  -0.609184 |   0.609584 | 6.82365 | 4044.76 |  1      |
| c    | 0.891147 | 0.879663 | 0.881513 | 0.298963 |  -0.301561 |   0.293607 |  -0.603484 |   0.59629  | 6.87625 | 4013.82 |  1.0003 |
| d    | 0.878582 | 0.880138 | 0.881647 | 0.307091 |  -0.311894 |   0.302304 |  -0.617898 |   0.611955 | 6.814   | 4050.48 |  1.0006 |
| e    | 0.818762 | 0.807181 | 0.807153 | 0.297321 |  -0.29532  |   0.294845 |  -0.593549 |   0.597654 | 6.5086  | 4240.54 |  1.0002 |
Now let’s see how we can easily access this information using cronus.
The first thing we want to do is read the chains using the read_chains module of cronus:
import cronus

results = cronus.read_chains('chains/run1')
print(results.Summary)
This will print the contents of the results.dat file.
We can easily create some plots by running:
cronus.traceplot(results)
to get the following traceplot:

Or, run the following to get a cornerplot:
fig, axes = cronus.cornerplot(results.trace, labels=results.varnames)
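The returned figure can also be saved to disk; a minimal sketch, assuming fig is a standard matplotlib Figure (the usual convention for corner plots) and an arbitrary output file name:
import cronus

results = cronus.read_chains('chains/run1')
fig, axes = cronus.cornerplot(results.trace, labels=results.varnames)
fig.savefig('run1_corner.png', dpi=300)   # assumes fig is a matplotlib Figure; file name is arbitrary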

Advanced Use¶
Log Likelihood Function¶
The Log Likelihood function needs to be defined in a separate .py file. It should be a function of one argument, either a numpy array or a dictionary.
If you need to pass more information (e.g. data, covariance matrix, precision matrix, etc.) to the Log Likelihood function you should declare those as global variables. This is the easiest and most consistent way to avoid problems with MPI; it is also the most computationally efficient method (i.e. passing the whole dataset to all processes every time you call the function can be slow).
Here is a short toy example demonstrating how to define such a function:
import numpy as np

ndim = 10
data = np.random.randn(ndim)   # Random data vector (global variable)
C = np.identity(ndim)          # Identity Covariance Matrix (global variable)
Cinv = np.linalg.inv(C)        # Inverse Covariance Matrix (global variable)

def log_likelihood(x):         # Normal distribution centred on the data vector
    return -0.5 * np.dot(x - data, np.dot(Cinv, x - data))
Parameter File¶
The parameter file can generally include more information than the options presented in the Quick Start page.
Likelihood¶
Usually the argument of the Log Likelihood function is a 1D numpy array, but we can also use a dictionary instead. To do so we need to add the dictionary: True option to the Likelihood block, for instance:
Likelihood:
    path: path/to/logprob.py
    function: log_likelihood
    dictionary: True
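In that case the Log Likelihood receives the parameters as a dictionary; a minimal sketch, assuming the keys follow the parameter names declared in the Parameters block (e.g. a, b, c):
import numpy as np

def log_likelihood(p):
    # p is assumed to be a dictionary keyed by the parameter names from the
    # parameter file; adapt the keys to your own Parameters block
    x = np.array([p['a'], p['b'], p['c']])
    return -0.5 * np.sum(x**2)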
Parameters¶
Every parameter needs to be either fixed or free:
- For fixed parameters we need to specify their value in the Parameters block (i.e. parameter a in the following example).
- For free parameters we need to specify a prior instead. So far, only uniform and normal priors are supported. For a uniform prior we need to specify the uniform interval (min, max) (i.e. parameter b in the following example). For a normal prior we need to specify the mean loc and standard deviation scale (i.e. parameter c in the following example).
Parameters:
    a:
        fixed: 1.0
    b:
        prior:
            type: uniform
            min: -1.0
            max: 1.0
    c:
        prior:
            type: normal
            loc: 0.0
            scale: 1.0
Sampler¶
cronus supports three different samplers: zeus (Default), emcee, and dynesty. The preferred sampler can be specified using the name option in the Sampler section of the parameter file, for instance:
Sampler:
    name: zeus
    ...
When either zeus or emcee is used as the preferred sampler, the following options are available:
- ndim is the total number of parameters/dimensions.
- nwalkers is the total number of walkers (i.e. internal parallel chains for zeus or emcee). This number needs to be at least twice the number of free parameters.
- nchains is the number of parallel chains; we recommend at least two, and preferably 4, to get a good estimate of the Gelman-Rubin diagnostic.
- ncheck specifies the number of steps after which the samples are saved and the Convergence Criteria are assessed. The default value is 100, which means that the samples are saved and convergence is diagnosed every 100 steps.
- maxiter specifies the maximum number of iterations (Default is inf).
- miniter specifies the minimum number of iterations (Default is 0).
- maxcall specifies the maximum number of Log Likelihood evaluations/calls (Default is inf).
- initial controls the initialization of the walker positions. The available options are: ellipse (a small ellipse around the Maximum a Posteriori estimate; this is the default and recommended choice), laplace (sample the initial positions of the walkers from the Laplace approximation of the posterior distribution), and prior (sample the initial positions from the prior distribution; not the best choice).
- thin is the thinning rate for the chains (i.e. if thin=5 then only every 5th element is saved to the chain). This can significantly reduce the size of the output files if the autocorrelation time of the chain is large. The default value is 1.
When dynesty is used as the preferred sampler, the following options are available:
- ndim is the total number of parameters/dimensions.
- bound
- dlogz
- maxiter specifies the maximum number of iterations (Default is inf).
- maxcall specifies the maximum number of Log Likelihood evaluations/calls (Default is inf).
- pfrac
Diagnostics¶
So far cronus includes two distinct convergence diagnostics: the Gelman-Rubin statistic and the Autocorrelation Time test. Their combination seems to work well in Astrophysical and Cosmological likelihoods.
Let's see how one can customize the thresholds of those criteria:
- Either of them can be turned off or on (Default) using the use argument.
- |R_hat - 1| < epsilon is the threshold for the Potential Scale Reduction Factor (PSRF). We recommend using a value of epsilon that is smaller than 0.05 (Default).
- In terms of the Integrated Autocorrelation Time (IAT) we provide two criteria: they are satisfied if the chain is longer than nact = 20 (Default) times the estimated IAT and the IAT has changed by less than dact = 3% (Default).

If both the Gelman-Rubin and IAT criteria are satisfied then sampling stops.
All of the diagnostic options can be seen here:
Diagnostics:
    Gelman-Rubin:
        use: True
        epsilon: 0.05
    Autocorrelation:
        use: True
        nact: 20
        dact: 0.03
Output¶
The only option of the Output block is a directory path in which the samples/results will be saved. If the provided directory doesn't exist, one will be created by cronus. The default directory is the current one.
Output: path/to/output/folder/chains
Running cronus¶
To run cronus, given a parameter file file.yaml, we execute the following command:
$ mpiexec -n [nprocesses] cronus-run file.yaml
where nprocesses is the number of available CPUs. Depending on the cluster you are using, you may need to use mpirun or srun instead of mpiexec.
Note
For better performance we recommend using a number of processes that is divisible by the number of chains nchains. Ideally, use nchains * (nwalkers/2 + 1) processes if available; for example, with nchains = 4 and nwalkers = 10 this is 4 * (10/2 + 1) = 24 processes. There is no real computational benefit in using more than this.
Results¶
zeus or emcee¶
When either zeus or emcee is used as the preferred sampler, the results are saved as h5 files. There are as many h5 files saved as the number of chains nchains. Each file contains two datasets: one called samples, which consists of the samples as the name suggests, and one named logprob, which includes the respective values of the Log Posterior Distribution.
After a few seconds of running, the following files will be created in the provided Output directory:
chains
├── chain_0.h5
├── chain_1.h5
├── ...
└── chain_[nchains].h5
The files are updated every few iterations.
Note
You can access those results by doing:
import numpy as np
import h5py

with h5py.File('chains/chain_0.h5', "r") as hf:
    samples = np.copy(hf['samples'])
    logprob = np.copy(hf['logprob'])
The shape of the samples array will be (Iteration, nwalkers, ndim) and the shape of the Log Posterior array will be (Iteration, nwalkers). You can easily flatten this, combining all the walkers into one chain and discarding the first half of the chain, by running:
nsamples, nwalkers, ndim_prime = np.shape(samples)

samples_flat = samples[nsamples//2:].reshape(-1, ndim_prime)
logprob_flat = logprob[nsamples//2:].reshape(-1, 1)
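Continuing from the flattened arrays above, simple posterior summaries can then be computed with NumPy, for example:
import numpy as np

# Posterior mean and standard deviation of each parameter from the flattened chain
print(np.mean(samples_flat, axis=0))
print(np.std(samples_flat, axis=0))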
dynesty¶
When dynesty is used as the sampler, the results are saved as a numpy npy format file.
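Such a file can be loaded directly with NumPy; a minimal sketch, using a hypothetical file name (check your Output directory for the actual one):
import numpy as np

# 'chains/results.npy' is a hypothetical path; point this at the .npy file in your Output directory.
# allow_pickle=True is only needed if the file stores a Python object rather than a plain array.
dynesty_results = np.load('chains/results.npy', allow_pickle=True)
print(dynesty_results)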
Troubleshooting¶
infiniband¶
There seem to be some issues with some mpi4py features when used in a computing cluster with infiniband. This can cause cronus to hang in an infiniband multi-node setting.
OpenMPI¶
If you are using OpenMPI you can try including the following command in your jobscript:
export OMPI_MCA_pml=ob1
This should disable the infiniband interface.
Intel MPI¶
The mpi4py package uses matching probes (MPI_Mprobe) for the receiving function recv() instead of regular MPI_Recv operations by default. These matching probes from the MPI 3.0 standard, however, are not supported for all fabrics, which may lead to a hang in the receiving function.
Therefore, users are recommended to leverage the OFI fabric instead of TMI for Omni-Path systems. For the Intel MPI Library, the configuration could look like the following environment variable setting:
export I_MPI_FABRICS=ofi
Getting Started¶
- See the Installation page for instructions on how to easily install cronus.
- See the Quick Start page for a simple example.
- See the Advanced Use page for more information about the ways cronus can be configured.
- See the FAQ page for frequently asked questions.
- See the Troubleshooting page for any problems with cronus.
- See the API page for detailed API documentation.
Attribution¶
- Please cite X if you find this code useful in your research.
Licence¶
Copyright 2020 Minas Karamanis and contributors.
cronus is free software made available under the GPL-3.0 License.