Often, Frequently, Only Once

There are two major competing schools of thought in modern statistics, divided most obviously by methodology. The underlying difference, however, can be considered one of philosophy.

The traditional school of thought (sometimes called Frequentism) holds that there are basic truths about the universe (the force of gravity, the average height of everyone on the planet) which we attempt to learn. Statistics, in this view, is how we make up for the imperfections in our knowledge of the universe. In Frequentism the data are variable; the truth is fixed.

The alternative school, Bayesianism, holds that it is simply impossible to know the truth, and thus our goal can never be to determine whether we are “right” or “wrong” in our conclusions. Because of this, the role of Bayesian statistics is to shape our knowledge, or beliefs, in accordance with the data. In Bayesianism the truth is variable.

This difference in philosophy carries through to how each school describes errors.

Frequentism has Type I and Type II errors, which are false positives and false negatives respectively. These error types reflect the fact that the goal of Frequentist testing is to avoid getting the wrong answer. Frequentist tests use p-values to evaluate the results.
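
To make the Type I error concrete, here is a minimal simulation sketch (the fair coin, the twenty flips, and the 10,000 replications are my own illustrative choices, not anything from the tests below): test many fair coins and see how often we falsely declare one unfair at the 0.05 level.

## Estimate the Type I error rate by testing thousands of fair coins.
## The long-run false-positive rate should sit near (in fact slightly
## below) 5%, because the exact binomial test is conservative.
p.values <- replicate(10000, binom.test(rbinom(1, 20, 0.5), 20)$p.value)
mean(p.values < 0.05)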

Bayesianism has Type S and Type M errors, which are incorrect sign and incorrect magnitude respectively. An appropriate Bayesian test cannot be wrong, because being wrong is meaningless in a Bayesian context; the test will correctly describe our knowledge. The errors reflect the fact that the goal of Bayesian testing is to come close to the population value. Bayesian tests use distributions of credible values to evaluate results.
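
As a rough sketch of Type S and Type M errors (the true frequency of 0.55, the flat Beta(1, 1) prior, and the simulation size below are all my own assumptions for illustration, and this simplifies the usual definitions a bit), we can simulate small experiments on a slightly biased coin and ask how often the posterior mean lands on the wrong side of 0.5, or further from 0.5 than the truth.

## Type S and Type M errors for a coin whose true frequency is 0.55.
## Posterior mean under a flat Beta(1, 1) prior after 20 flips.
true.p <- 0.55
est <- replicate(10000, {
  h <- rbinom(1, 20, true.p)
  (1 + h) / (2 + 20) # posterior mean of Beta(1 + h, 1 + 20 - h)
})
mean(est < 0.5)                          # Type S: estimate has the wrong sign
mean(abs(est - 0.5) > abs(true.p - 0.5)) # Type M: magnitude is exaggerated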

Let’s look at our examples of traditional and Bayesian binomial tests to see this difference. Here we will download Rasmus Baath’s bayes.binom.test() function.
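
If you want to follow along, bayes.binom.test() ships with the BayesianFirstAid package on GitHub; the devtools route below is one way to get it (treat the exact install commands as an assumption about your setup, and note the package also needs JAGS installed on your system).

## One way to obtain bayes.binom.test(): install BayesianFirstAid from
## GitHub (rasmusab/bayesian_first_aid). Requires a working JAGS install.
# install.packages("devtools")
# devtools::install_github("rasmusab/bayesian_first_aid")
library(BayesianFirstAid)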

set.seed(1771) # Set the seed to get the same pseudo-random numbers that I do.

## Simulate flipping a coin twenty times and count up the heads.
total <- 20
flips <- sample(c(0, 1), total, replace = TRUE)
heads <- sum(flips)

mod.freq <- binom.test(heads, total) # Frequentist
mod.bayes <- bayes.binom.test(heads, total) # Bayesian

mod.freq
        Exact binomial test

data:  heads and total
number of successes = 9, number of trials = 20, p-value = 0.8238
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.2305779 0.6847219
sample estimates:
probability of success 
                  0.45 

mod.bayes
        Bayesian First Aid binomial test

data: heads and total
number of successes = 9, number of trials = 20
Estimated relative frequency of success:
  0.45 
95% credible interval:
  0.26 0.66 
The relative frequency of success is more than 0.5 by a probability of 0.333 
and less than 0.5 by a probability of 0.667 

The traditional test tells us that if the coin were fair and we performed an identical test many times, we would end up with a result at least this extreme about 82% of the time, so we should not conclude that the coin is unfair. With a significance level of 0.05 we expect to mistakenly call a fair coin unfair only 5% of the time.
The Bayesian test tells us that, based on what we have seen, the credible values for the coin’s fairness include 0.5, but it is somewhat more likely that the coin is biased low.
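
As an aside, if (as I believe, though treat this as my assumption rather than the package’s documentation) bayes.binom.test() places a flat Beta(1, 1) prior on the relative frequency, then the posterior after 9 heads in 20 flips is the conjugate Beta(10, 12) and we can reproduce its numbers by hand; the exact p-value has a closed form for this outcome too.

## Check the outputs by hand, assuming a flat Beta(1, 1) prior so the
## posterior is Beta(1 + 9, 1 + 11) = Beta(10, 12).
pbeta(0.5, 10, 12)             # P(frequency < 0.5), about 0.667
1 - pbeta(0.5, 10, 12)         # P(frequency > 0.5), about 0.333
qbeta(c(0.025, 0.975), 10, 12) # 95% credible interval, about (0.26, 0.66)
## For 9 heads in 20, every outcome except the most likely one (10 heads)
## is at least as extreme, so the exact two-sided p-value is:
1 - dbinom(10, 20, 0.5)        # about 0.8238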

As we saw two weeks ago, these are very different pieces of information, even though they come from identical data and both are true. I would argue that the Bayesian result is a much more useful piece of information, but that is a discussion we’ll come back to at a later date.

On Friday we will take a look at corrections for multiple comparisons.