What is the difference between Bayesian and frequentist statisticians?


The key difference between Bayesian and classical (frequentist) statistics lies in how each uses the concept of replication: classical inference fixes the parameter of interest and replicates the data, whereas Bayesian inference fixes the data and replicates the parameter of interest.

In both Bayesian and classical statistics, the replications are hypothetical and are assumed to follow a certain distribution. In frequentist inference, the sampling distribution of the observable data for a fixed parameter is the core “mechanic” of inference, since the observable data are hypothetically replicated in order to conduct inference on that fixed parameter. In Bayesian inference, the unobservable parameters are hypothetically replicated to represent the inferential uncertainty about them, so the (prior and posterior) distribution of the unobservable parameters is the “mechanic” of inference.
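The two kinds of replication described above can be sketched with a small simulation. This is only an illustration under stated assumptions: a Bernoulli model, a hypothetical true success probability `TRUE_P`, a hypothetical sample size `N`, a hypothetical observed count `observed_k`, and a uniform Beta(1, 1) prior. None of these come from the answer itself.

```python
import random
import statistics

random.seed(0)

TRUE_P = 0.3    # hypothetical fixed parameter (frequentist view)
N = 50          # hypothetical number of Bernoulli trials per data set
REPS = 10_000   # number of hypothetical replications


def simulate_successes(p, n):
    """Draw one replicated data set and return its number of successes."""
    return sum(random.random() < p for _ in range(n))


# Frequentist replication: the parameter is fixed, the data are replicated.
# The spread of the estimator across data sets is its sampling distribution.
estimates = [simulate_successes(TRUE_P, N) / N for _ in range(REPS)]
sampling_mean = statistics.mean(estimates)
sampling_sd = statistics.stdev(estimates)

# Bayesian replication: the data are fixed, the parameter is replicated.
# With a uniform Beta(1, 1) prior (an assumption), the posterior is
# Beta(1 + k, 1 + n - k).
observed_k = 15  # hypothetical observed number of successes
posterior_draws = [
    random.betavariate(1 + observed_k, 1 + N - observed_k) for _ in range(REPS)
]
posterior_mean = statistics.mean(posterior_draws)
posterior_sd = statistics.stdev(posterior_draws)

print(f"sampling distribution of the estimate: mean={sampling_mean:.3f}, sd={sampling_sd:.3f}")
print(f"posterior distribution of p:           mean={posterior_mean:.3f}, sd={posterior_sd:.3f}")
```

The first pair of numbers describes how an estimator varies over data sets that were never observed; the second pair describes uncertainty about the parameter given the one data set that was.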

In Bayesian inference, since the parameters are unobservable, probabilities attached to them are most naturally interpreted as “degrees of belief” in a particular state of the parameters, rather than in the frequentist sense of the relative frequency with which an observable event occurs. In particular, the prior distribution, which is needed in order to use Bayes’ theorem, encodes the (subjective) probability of the parameter before the data are taken into account. Bayes’ theorem tells us how to update those subjective probabilities in the light of the observed data (i.e., the “evidence”, which enters through the likelihood of the observed data) to obtain a posterior distribution. The posterior quantifies the current uncertainty about the unobservable parameter. In a nutshell, Bayesian inference makes probabilistic statements about the parameter. This is in contrast to frequentist inference, where a real, fixed value of the parameter is assumed to exist and no probability distribution is placed on that fixed value. Instead, the properties of statistical procedures, such as the coverage of a confidence interval or the error rates of a hypothesis test, are investigated for a fixed parameter. In other words, the parameters are fixed throughout a frequentist’s statistical investigation, and statistical estimators are evaluated with respect to those fixed parameters under (hypothetical) replications of the observable data.
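As a hedged illustration of the contrast above, the sketch below places a Bayesian probability statement about a parameter next to a frequentist coverage check. The Bernoulli model, the Beta(2, 2) prior, the data (7 successes in 10 trials), the fixed value `TRUE_P`, and the choice of the Wald interval are all assumptions made for the example, not part of the original answer.

```python
import math
import random

random.seed(1)


def update_beta(a, b, k, n):
    """Conjugate Bayes update: Beta(a, b) prior plus k successes in n trials."""
    return a + k, b + n - k


# Bayesian side: a direct probability statement about the parameter.
a_post, b_post = update_beta(2.0, 2.0, k=7, n=10)  # hypothetical prior and data
draws = [random.betavariate(a_post, b_post) for _ in range(20_000)]
prob_above_half = sum(d > 0.5 for d in draws) / len(draws)


def wald_interval(k, n, z=1.96):
    """Approximate 95% Wald confidence interval for a proportion."""
    p_hat = k / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Frequentist side: the parameter is fixed and the *procedure* is evaluated
# over hypothetical replications of the data.
TRUE_P, N, REPS = 0.4, 200, 5_000  # hypothetical settings
covered = 0
for _ in range(REPS):
    k = sum(random.random() < TRUE_P for _ in range(N))
    lo, hi = wald_interval(k, N)
    covered += lo <= TRUE_P <= hi
coverage = covered / REPS

print(f"P(p > 0.5 | data) is approximately {prob_above_half:.3f}")
print(f"coverage of the 95% interval over replications is approximately {coverage:.3f}")
```

The first number is a statement about the parameter given the data; the second is a statement about how often the interval procedure captures the fixed parameter across repeated data sets.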

The key strength of Bayesian statistics is the simplicity of its inferential strategy, which provides a unified probability framework for all the unknowns. Bayesian inference is about constructing the posterior distribution, from which probability statements about the unknown parameters of interest can be derived. This ability to summarize and quantify the uncertainty in the unknowns is particularly appealing in complex models, where the sampling distributions of frequentist estimators can easily become intractable because of the nuisance parameters that must also be estimated. Although the nuisance parameters may be profiled out through optimization, the associated statistical uncertainty is often difficult to quantify and summarize in subsequent inferential analyses. In Bayesian inference, on the other hand, if posterior samples are available (from Markov chain Monte Carlo posterior sampling, for example, in which case one needs to check the convergence of the chain), the nuisance parameters can often be marginalized out easily, and a coherent probabilistic inference on the parameters of interest can be conducted, assuming that the model is correct.
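To show what marginalizing out a nuisance parameter from posterior samples looks like in practice, here is a minimal sketch. It assumes a normal model with unknown mean `mu` (the parameter of interest) and unknown variance `sigma2` (the nuisance parameter), the standard noninformative prior proportional to 1/sigma2, and a small hypothetical data set `y`. For this simple model the joint posterior can be sampled directly, so direct sampling stands in for the MCMC mentioned above and no convergence check is needed.

```python
import math
import random
import statistics

random.seed(2)

y = [4.9, 5.3, 4.7, 5.1, 5.6, 4.8, 5.2, 5.0]  # hypothetical observations
n = len(y)
y_bar = statistics.mean(y)
s2 = statistics.variance(y)  # sample variance with n - 1 in the denominator


def draw_joint_posterior():
    """One draw of (mu, sigma2) from the joint posterior.

    Under the prior p(mu, sigma2) proportional to 1 / sigma2:
      sigma2 | y      ~ (n - 1) * s2 / chi-square(n - 1)
      mu | sigma2, y  ~ Normal(y_bar, sigma2 / n)
    """
    chi2 = random.gammavariate((n - 1) / 2, 2.0)  # chi-square with n - 1 d.o.f.
    sigma2 = (n - 1) * s2 / chi2
    mu = random.gauss(y_bar, math.sqrt(sigma2 / n))
    return mu, sigma2


joint_draws = [draw_joint_posterior() for _ in range(20_000)]

# Marginalizing out sigma2 amounts to ignoring that coordinate of each draw.
mu_draws = sorted(mu for mu, _ in joint_draws)
mu_mean = statistics.mean(mu_draws)
lower = mu_draws[int(0.025 * len(mu_draws))]
upper = mu_draws[int(0.975 * len(mu_draws))]

print(f"posterior mean of mu: {mu_mean:.3f}")
print(f"95% credible interval for mu: ({lower:.3f}, {upper:.3f})")
```

The interval for `mu` already accounts for the uncertainty in `sigma2`, because each `mu` draw was generated with its own `sigma2` draw. With real MCMC output the same "keep one column" step applies, but only after convergence has been assessed.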

The key strength of frequentist statistics is the simplicity of its model formulation: it does not require the specification of prior distributions (that is, Bayesian inference requires additional modeling assumptions), and it gains a degree of objectivity as a result. For a rigorous Bayesian analysis, one needs to assess the impact of choosing a particular prior. However, the ability to choose a prior is also a key asset in practical inference, as it provides a means of incorporating prior information into the model, and the fitted model, including the prior beliefs built into it, can then be checked against the data (for example, using posterior predictive checks).
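Both points above, assessing the impact of the prior and checking the model with a posterior predictive check, can be sketched as follows. The binary sequence `y`, the two Beta priors, and the choice of "number of switches" as the test statistic are assumptions chosen for illustration; the check asks whether data replicated from the fitted independent-trials model look like the observed data.

```python
import random

random.seed(3)

# Hypothetical binary sequence; the long runs suggest the trials may not be independent.
y = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
n, k = len(y), sum(y)


def posterior_params(a, b):
    """Beta(a, b) prior with k successes in n trials gives Beta(a + k, b + n - k)."""
    return a + k, b + n - k


def posterior_mean(a, b):
    a_post, b_post = posterior_params(a, b)
    return a_post / (a_post + b_post)


# Prior sensitivity: compare the posterior mean under two different priors.
mean_flat = posterior_mean(1, 1)           # uniform prior
mean_informative = posterior_mean(10, 10)  # prior concentrated near 0.5


def count_switches(seq):
    """Test statistic: number of changes between consecutive outcomes."""
    return sum(seq[i] != seq[i + 1] for i in range(len(seq) - 1))


# Posterior predictive check under the uniform prior: draw theta from the
# posterior, replicate the data, and compare the test statistic.
observed_switches = count_switches(y)
a_post, b_post = posterior_params(1, 1)
REPS = 5_000
at_most_observed = 0
for _ in range(REPS):
    theta = random.betavariate(a_post, b_post)
    y_rep = [int(random.random() < theta) for _ in range(n)]
    at_most_observed += count_switches(y_rep) <= observed_switches
ppc_p_value = at_most_observed / REPS

print(f"posterior mean: flat prior {mean_flat:.3f}, informative prior {mean_informative:.3f}")
print(f"observed switches: {observed_switches}, P(T(y_rep) <= T(y) | y) is approximately {ppc_p_value:.3f}")
```

Here the two posterior means are 8/22 and 17/40, so with only 20 observations the prior visibly matters. A small posterior predictive probability would indicate that the independence assumption, rather than the prior alone, is at odds with the data.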

(This can also be found at Quora.com.)