Bayesian Analysis in Sherpa
Sherpa Threads (CIAO 4.13 Sherpa v1)
Overview
Synopsis:
pyBLoCXS is a sophisticated Markov chain Monte Carlo (MCMC) based algorithm designed to carry out Bayesian LowCount Xray Spectral (BLoCXS) analysis in the Sherpa environment. The algorithm may be used to explore parameter space at a suspected minimum using a predefined Sherpa model to highenergy Xray spectral data.
Run this thread if:
Last Update: 21 Dec 2020  Updated for plot changes in CIAO 4.13 and to note that plot_cdf works again.
Contents
 Introduction to pyBLoCXS
 Getting Started
 Fitting a Model to the Spectrum
 Running the pyBLoCXS Routine
 Visualizing the Results of the pyBLoCXS Simulations
 Scripting It
 History

Images
 Figure 1: Starting point for the MCMC analysis
 Figure 2: Trace Plot of Fit Statistic Values
 Figure 3: Probability Distribution of a Selected Parameter
 Figure 4: Cumulative Distribution of a Selected Parameter
 Figure 5: Scatter plot of Two Correlated Model Parameters
 Figure 6: Comparing scatter to regionprojection
Introduction to pyBLoCXS
The pyBLoCXS code is a Python extension to Sherpa that explores model parameter space at a suspected minimum using a predefined Sherpa model to highenergy Xray spectral data. pyBLoCXS includes a flexible definition of priors and can efficiently probe the model parameter space to characterize the posterior distributions. It allows for variations in the calibration information, and can be used to compute posterior predictive pvalues for the likelihood ratio test (see Protassov, et al. 2002, ApJ 571, 542, Statistics, Handle with Care: Detecting Multiple Model Components with the Likelihood Ratio Test).
The Sherpa get_draws function runs a pyBLoCXS chain using fit information associated with the userspecified data set(s) and the currently set sampler and parameter priors, for a specified number of iterations. It returns an array of simulated fit statistic values, an array of True or False accepted/rejected Booleans, and a 2D array of associated model parameter values.
A general description of the MCMC techniques employed by the pyBLoCXS algorithm can be found in the following references:
 Appendices A.2A.4 of van Dyk, et al 2001, ApJ 548, 224, "Analysis of Energy Spectra with Low Photon Counts via Bayesian Posterior Simulation"
 Chapter 11 of Gelman, Carlin, Stern, and Rubin (Bayesian Data Analysis, 2nd Edition, 2004, Chapman & Hall/CRC)
 Lee, H., et al. 2011, ApJ 731, 126, Accounting for Calibration Uncertainties in Xray Analysis: Effective Areas in Spectral Fitting
The Sherpa pyBLoCXS functions include:
 get_draws  runs the pyBLoCXS MCMCbased algorithm using the current sampler
 set_sampler / get_sampler  sets/retrieves the current pyBLoCXS sampler
 set_sampler_opt / get_sampler_opt  sets/retrieves the sampler options for a pyBLoCXS chain
 list_samplers  lists all available pyBLoCXS samplers
 get_sampler_name  returns the name of the current sampler
 list_priors  lists thawed Sherpa model parameters with associated priors
 set_prior / get_prior  sets/retrieves a prior function for a particular Sherpa model parameter
Getting Started
Please follow the Obtaining Data Used in Sherpa thread in order to obtain the sample data used in this thread. The spectral data products for the PKS 1127145 quasar core are located in the pyblocxs subdirectory.
Fitting a Model to the Spectrum
The pyBLoCXS routine requires information about a fit to a data set in order to run; therefore, we begin by loading a source spectrum into Sherpa, fitting a model to it, and calculating confidence intervals on thawed parameters using the covariance method.
The source spectrum is loaded with load_pha, and the source model expression to be used for fitting is defined with set_model. We decide to restrict the analysis to the 0.57.0 keV energy range, using the ignore command.
sherpa> load_pha(1, "core.pi") sherpa> set_model(xsphabs.abs0 * powlaw1d.p1) sherpa> ignore(None, 0.5) sherpa> ignore(7., None)
We fit an absorbed powerlaw model to the data using the robust, NelderMead Simplex fit optimization method and Cash fit statistic in order to select a Poisson likelihood as appropriate for pyBLoCXS simulations.
sherpa> set_method("neldermead") sherpa> set_stat("cash")
Then, we fit and estimate confidence intervals on all thawed parameters with the Sherpa covariance command.
sherpa> fit() WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set Dataset = 1 Method = neldermead Statistic = cash Initial fit statistic = 1.01254e+08 Final fit statistic = 461303 at function evaluation 658 Data points = 444 Degrees of freedom = 441 Change in statistic = 1.01716e+08 abs0.nH 0.0721702 p1.gamma 1.24386 p1.ampl 0.000676179
Sherpa has recognized that the PHA data has an associated bakground data set but it isn't being used (it can not be subtracted when using the Cash statistic, and no model has been set up to fit it). For this example we will ignore the background contribution, but in an actual analysis you will have to decide how to handle the background.
sherpa> plot_fit_resid(yerrorbars=False) WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with wstat WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with wstat
Figure 1: Starting point for the MCMC analysis
sherpa> covar() WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set Dataset = 1 Confidence Method = covariance Iterative Fit Method = None Fitting Method = neldermead Statistic = cash covariance 1sigma (68.2689%) bounds: Param BestFit Lower Bound Upper Bound     abs0.nH 0.0721702 0.00456715 0.00456715 p1.gamma 1.24386 0.0126451 0.0126451 p1.ampl 0.000676179 9.22655e06 9.22655e06
Running the pyBLoCXS Routine
pyBLoCXS simulations are executed with the get_draws function, using the fit information associated with our specified data set(s)—and the current sampler and parameter priors—for a specified number of iterations. The simplest case uses the default MetropolisMH sampler, and flat priors for all parameters.
The list of all available samplers is returned by the list_samplers function, and the currently set sampler may be viewed with get_sampler_name, as demonstrated below.
sherpa> print(get_sampler_name()) MetropolisMH sherpa> print(list_samplers()) ['metropolismh', 'mh', 'pragbayes', 'fullbayes']
The sampler—the type of jumping rule to be used in MCMC—can be either MH, MetropolisMH, pragbayes, or fullbayes (note that fullbayes is untested, new functionality and not recommended at this time). Metropolis and MetropolisHastings are the two standard algorithms for sampling the model parameter space (the posterior). Metropolis jumps to the new set of parameters from the current parameter set, while MetropolisHastings jumps always from the best fit. The Sherpa sampler MH is a pure MetropolisHastings, which always jumps from the bestfit. MetropolisMH is a mixture of Metropolis and MetropolisHastings, selected with a probability p_M. PragBayes is used when the effective area calibration uncertainty is to be included in the calculation.
At each nominal MCMC iteration, a new calibration product is generated, and a series of N (option in set_sampler_opt) MCMC subiteration steps are carried out, choosing between Metropolis and MetropolisHastings types of samplers with probability p_M (option in set_sampler_opt). Only the last of these subiterations are kept in the chain.
The get_sampler function is available for accessing the default settings of the current sampler, and set_sampler_opt for modifying those settings.
sherpa> get_sampler() {'log': False, 'inv': False, 'defaultprior': True, 'priorshape': False, 'priors': (), 'originalscale': True, 'scale': 1, 'sigma_m': False, 'p_M': 0.5}
We accept the default settings for the initial, simplest case and run the sampler using the get_draws function, specifying the names of the output arrays and the number of iterations.
sherpa> stats, accept, params = get_draws(niter=1e4)
Here we use Python syntax to name the data arrays created by the simulations: stats, accept, and params. stats contains a 1D array of floatingpoint statistic values; accept contains a 1D array of True or False accepted/rejected Booleans for each iteration; and params contains a ND array of parameter values, where N is the number of free parameters in the model. In this example, N is equal to 3.
The get_draws command above produces the following screen output, showing which parameters are being sampled and what prior function is assumed for the simulation (the hexadecimal value will not be the same as shown below):
Using Priors: abs0.nH: <function flat at 0x16a43f50> p1.gamma: <function flat at 0x16a43f50> p1.ampl: <function flat at 0x16a43f50>
Visualizing the Results of the pyBLoCXS Simulations
There are several Sherpa plotting functions available for visualizing statistics and parameter values for each accepted iteration; these include:
 plot_pdf()
 plot_cdf()
 plot_trace()
 plot_scatter()
Fit Statistic Values
The plot_trace function can be used to plot by default the entire array of statistic values for each sample. We can use the associated get_trace_plot command to specify plot preferences, such as the desired value for line thickness, and then enter the name of the fit statistic array returned by get_draws into the plot_trace expression (stats) to create the plot. Finally, we can save the resulting plot to a PNG format image file with Matplotlib's savefig function.
sherpa> get_trace_plot().plot_prefs {'xerrorbars': False, 'yerrorbars': False, 'ecolor': None, 'capsize': None, 'barsabove': False, 'xlog': False, 'ylog': False, 'linestyle': '', 'linecolor': None, 'color': None, 'marker': 'None', 'markerfacecolor': None, 'markersize': None, 'xaxis': False, 'ratioline': False} sherpa> plot_trace(stats, name='Stats') sherpa> plt.savefig('trace_stats.png')
Figure 2: Trace Plot of Fit Statistic Values
Model Parameter Values
We may also examine the distribution of model parameters resulting from the simulations, stored in the params array returned by get_draws. Like the other data arrays output by get_draws, this array is a standard NumPy ndarray, which means we can obtain information about the array using Python syntax, for example:
sherpa> print(params.shape) (3, 10001) sherpa> print(params.dtype) float64 sherpa> print(params.size) 30003
By manipulating the params array in this way, we can access the simulation results for any of the three model parameters sampled. Below, we filter params to return just the values for iterations 1010 to 1010 of the first parameter, abs0.nH:
sherpa> print(params[0, 1010:1020]) [0.07032822 0.07532993 0.07532993 0.0707736 0.07340574 0.07153077 0.07153077 0.07153077 0.07153077 0.07153077]
Likewise, we can list the parameter values for the second parameter, p1.gamma, over this range:
sherpa> print(params[1, 1010:1020]) [1.24579021 1.25161845 1.25161845 1.2516977 1.26275949 1.24603482 1.24603482 1.24603482 1.24603482 1.24603482]
This range was chosen at random, and has no "theoretical" significance. Figure 5 shows how the parameter values vary together.
The probability distribution of parameter values may be plotted with the plot_pdf command, as shown below for the gamma parameter:
sherpa> plot_pdf(params[1, :], name=r'$\Gamma$')
Figure 3: Probability Distribution of a Selected Parameter
The plot_cdf function displays the cumulative distribution, with the median and 1σ bounds marked on the plot:
sherpa> plot_cdf(params[1, :], name=r'$\Gamma$')
Figure 4: Cumulative Distribution of a Selected Parameter
To access all information about the displayed distribution, the get_pdf_plot and get_cdf_plot functions may be used; for example:
sherpa> print(get_cdf_plot()) points = [1.1983,1.1983,1.1983,1.1983,1.2011,1.2011,1.2011,1.2011,1.2017,1.2018, 1.2019,1.2019,1.2024,1.203 ,1.2038,1.2038,1.2038,1.2041,1.2043,1.2055, ... 1.284 ,1.2854,1.2866,1.2866,1.2866,1.2881,1.2882,1.2883,1.2884,1.2929, 1.2974] x = [9.9990e05,1.9998e04,2.9997e04,3.9996e04,4.9995e04,5.9994e04, 6.9993e04,7.9992e04,8.9991e04,9.9990e04,1.0999e03,1.1999e03, ... 9.9960e01,9.9970e01,9.9980e01,9.9990e01,1.0000e+00] y = [1.2439,1.2476,1.2443,1.2315,1.2668,1.2668,1.2229,1.2295,1.2295,1.2295, 1.2258,1.2419,1.2419,1.2419,1.2419,1.2428,1.2339,1.2191,1.2499,1.2418, ... 1.2382,1.2483,1.2578,1.2588,1.2202,1.2463,1.2602,1.2468,1.2458,1.2261, 1.2261] median = 1.2440636273401948 lower = 1.231328083732741 upper = 1.2569451112600383 xlabel = x ylabel = p(<=x) title = CDF: $\Gamma$ plot_prefs = {'xerrorbars': False, 'yerrorbars': False, 'ecolor': None, 'capsize': None, 'barsabove': False, 'xlog': False, 'ylog': False, 'linestyle': '', 'linecolor': 'red', 'color': None, 'marker': 'None', 'markerfacecolor': None, 'markersize': None, 'xaxis': False, 'ratioline': False}
To access the values for the median of the distribution, and the lower and upper bounds, we can use the function directly, as follows:
sherpa> get_cdf_plot().median 1.2440636273401948 sherpa> get_cdf_plot().lower 1.231328083732741 sherpa> get_cdf_plot().upper 1.2569451112600383
We can also display the values of the parameters in a scatter plot, showing correlations between different parameters. Here we display abs0.nH and p1.gamma, showing parameter correlation typical of an Xray data set.
sherpa> plot_scatter(params[0, :], params[1, :], name=r'n$_H$ vs. $\Gamma$', xlabel='n$_H$', ylabel=r'$\Gamma', markersize=1)
Figure 5: Scatter plot of Two Correlated Model Parameters
Finally, we can access the data arrays and preferences defining the scatter plot with the get_scatter_plot function:
sherpa> print(get_scatter_plot()) x = [0.0722,0.0705,0.0747,0.0667,0.0805,0.0805,0.063 ,0.0618,0.0618,0.0618, 0.0688,0.0722,0.0722,0.0722,0.0722,0.0738,0.0656,0.0662,0.0707,0.0743, ... 0.0758,0.0804,0.0745,0.0762,0.0662,0.0729,0.0733,0.0692,0.0695,0.0667, 0.0667] y = [1.2439,1.2476,1.2443,1.2315,1.2668,1.2668,1.2229,1.2295,1.2295,1.2295, 1.2258,1.2419,1.2419,1.2419,1.2419,1.2428,1.2339,1.2191,1.2499,1.2418, ... 1.2382,1.2483,1.2578,1.2588,1.2202,1.2463,1.2602,1.2468,1.2458,1.2261, 1.2261] yerr = None xerr = None xlabel = n$_H$ ylabel = $\Gamma$ title = Scatter: n$_H$ vs. $\Gamma$ plot_prefs = {'xerrorbars': False, 'yerrorbars': True, 'ecolor': None, 'capsize': None, 'barsabove': False, 'xlog': False, 'ylog': False, 'linestyle': 'None', 'linecolor': None, 'color': None, 'marker': '.', 'markerfacecolor': None, 'markersize': None, 'xaxis': False, 'ratioline': False}
This can be compared to the standard error analysis; for example overplotting the reg_proj contours with:
sherpa> reg_proj(abs0.nh, p1.gamma, overplot=True) WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set
Figure 6: Comparing scatter to regionprojection
Scripting It
The file, fit.py , performs the primary commands used above. It can be executed by typing '%run i fit.py' on the Sherpa command line. The Sherpa script command may be used to save everything typed on the command line in a Sherpa session:
sherpa> script(filename="sherpa.log", clobber=False)
(Note that restoring a Sherpa session from such a file could be problematic since it may include syntax errors, unwanted fitting trials, et cetera.)
History
15 Dec 2011  new for CIAO 4.4 
11 Dec 2013  reviewed for CIAO 4.6: no changes 
19 Mar 2015  updated for CIAO 4.7, no content change. 
09 Dec 2015  reviewed for CIAO 4.8, no content change. 
01 Dec 2016  reviewed for CIAO 4.9, no content change. 
29 May 2018  reviewed for CIAO 4.10, no content change. 
13 Dec 2018  reviewed for CIAO 4.11, no content change. 
10 Dec 2019  Updated for CIAO 4.12, changing ChIPS to matplotlib; added a workaround for plot_cdf being broken in CIAO 4.12. 
21 Dec 2020  Updated for plot changes in CIAO 4.13 and to note that plot_cdf works again. 