The Art of Statistics: How to Learn from Data
Author: David Spiegelhalter

If you briefly think about it, we are surrounded by epistemic uncertainty about things that are fixed but unknown to us. Gamblers bet on the next card to be dealt, we buy lottery scratch-cards, we discuss the possible gender of a baby, we puzzle over whodunnits, we argue over the numbers of tigers left in the wild, and we are told estimates of the possible number of migrants or the unemployed. All these are facts or quantities that exist out there in the world, but we just do not know what they are. To emphasise again, from a Bayesian perspective, it is fine to use probabilities to represent our personal ignorance about these facts and numbers. We might even think of putting probabilities on alternative scientific theories, but this is more contested.

These probabilities will of course depend on our current knowledge: remember from Chapter 8 how our probability of whether a coin has come up heads or tails depends on whether we have looked at it or not! So these Bayesian probabilities are necessarily subjective—they depend on our relationship with the outside world, and are not properties of the world itself. These probabilities should change as we receive new information.

Which brings us to Bayes’ second key contribution: a result in probability theory that allows us to continuously revise our current probabilities in the light of new evidence. This has become known as Bayes’ theorem, and essentially provides a formal mechanism for learning from experience, which is an extraordinary achievement for an obscure clergyman from a small English spa town. Bayes’ legacy is the fundamental insight that the data does not speak for itself—our external knowledge, and even our judgement, has a central role. This may seem to be incompatible with the scientific process, but of course background knowledge and understanding has always been an element in learning from data, and the difference is that in the Bayesian approach it is handled in a formal and mathematical way.

The implications of Bayes’ work have been deeply contested, with many statisticians and philosophers objecting to the idea that subjective judgement has any role in statistical science. So it is only fair that I make my personal position clear: I was introduced to a ‘subjectivist’ Bayesian school of statistical reasoning at the start of my career, and it remains for me the most satisfying approach.


You have three coins in your pocket: one has two heads, one is fair and one has two tails. You pick a coin at random and flip it, and it comes up heads. What should be your probability that the other side of the coin also shows heads?

 

This is a classic problem in epistemic uncertainty: there is no randomness left in the coin once it has been flipped, and any probability is simply an expression of your current personal ignorance about the other side of the coin.

Many people would jump to the conclusion that the answer is ½, since the coin must be either the fair or the two-headed coin, and each was equally likely to be picked. There are many ways to check whether this is correct, but the easiest is to use the idea of expected frequencies demonstrated in Chapter 8.

Figure 11.1 shows what you would expect to see if you carried out this exercise six times. On average, each coin would be chosen twice, and each side of each coin would land face-up once. Three of the flips come up heads, and in two of these the coin is two-headed. So your probability that the chosen coin is two-headed rather than fair should be 2⁄3, and not ½. Essentially, seeing a head makes it more likely that the two-headed coin has been chosen, since this coin provides two opportunities for a head to land face-up, whereas the fair coin only provides one.
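The expected-frequency argument can also be checked by listing the six equally likely outcomes directly. The following is only a minimal sketch in Python, and the names `COINS` and `expected_outcomes` are illustrative assumptions rather than anything defined in the text.

```python
from fractions import Fraction

# Each coin is described by its two sides; all three are equally likely to be picked.
COINS = {
    "two-headed": ("H", "H"),
    "fair": ("H", "T"),
    "two-tailed": ("T", "T"),
}

def expected_outcomes(coins):
    """List every equally likely (coin, face up, face down) outcome."""
    outcomes = []
    for name, (side_a, side_b) in coins.items():
        outcomes.append((name, side_a, side_b))  # first side lands face up
        outcomes.append((name, side_b, side_a))  # second side lands face up
    return outcomes

outcomes = expected_outcomes(COINS)
heads_up = [o for o in outcomes if o[1] == "H"]    # flips that show a head
both_heads = [o for o in heads_up if o[2] == "H"]  # ...whose hidden side is also a head

prob_other_side_heads = Fraction(len(both_heads), len(heads_up))
print(len(outcomes), len(heads_up), len(both_heads))  # 6 3 2
print(prob_other_side_heads)                          # 2/3
```

Counting outcomes rather than coins is what separates 2⁄3 from ½: the two-headed coin contributes two of the three heads-up outcomes.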

If this result seems unintuitive, then the next example might be even more surprising.

 

 

Figure 11.1

Expected frequency tree for three-coin problem, showing what we would expect to happen in six repetitions.

 

 

Suppose a screening test for doping in sports is claimed to be ‘95% accurate’, meaning that 95% of dopers, and 95% of non-dopers, will be correctly classified. Assume 1 in 50 athletes are truly doping at any time. If an athlete tests positive, what is the probability that they are truly doping?

 

This type of potentially challenging problem is again best dealt with using expected frequencies, similar to the analysis of breast screening in Chapter 8, and the claims in Chapter 10 that a high proportion of the published scientific literature is wrong.

The tree in Figure 11.2 starts with 1,000 athletes, of whom 20 are doping and 980 are not. All but one of the dopers are detected (95% of 20 = 19), but 49 non-dopers also test positive (5% of 980 = 49, since 95% of 980 = 931 are correctly cleared). We therefore expect a total of 19 + 49 = 68 positive tests, of which only 19 come from athletes who are truly doping. So if someone tests positive, there is only a 19/68 = 28% chance they are truly doping: the remaining 72% of positive tests are false accusations. Even though drug testing could be claimed to be ‘95% accurate’, the majority of people who test positive are in fact innocent. It does not require much imagination to see the problems this apparent paradox could cause in real life, with athletes being casually condemned because they failed a drug test.
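The arithmetic of the tree can be reproduced in a few lines. This is a rough sketch rather than a definitive implementation: the function name `doping_tree` and the parameter names `prevalence`, `sensitivity` and `specificity` are assumptions introduced here, standing for the 1 in 50 doping rate and the two ‘95% accurate’ figures.

```python
from fractions import Fraction

def doping_tree(n_athletes, prevalence, sensitivity, specificity):
    """Expected counts at the four leaves of the frequency tree."""
    dopers = n_athletes * prevalence
    non_dopers = n_athletes - dopers
    true_pos = dopers * sensitivity        # dopers who test positive
    false_neg = dopers - true_pos          # dopers who are missed
    true_neg = non_dopers * specificity    # non-dopers correctly cleared
    false_pos = non_dopers - true_neg      # non-dopers who test positive
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
    }

# Exact fractions avoid floating-point rounding in the leaf counts.
tree = doping_tree(1000, Fraction(1, 50), Fraction(95, 100), Fraction(95, 100))
positives = tree["true_pos"] + tree["false_pos"]
prob_doping_given_positive = tree["true_pos"] / positives

print(tree["true_pos"], tree["false_pos"], positives)  # 19 49 68
print(f"{float(prob_doping_given_positive):.0%}")      # 28%
```

Changing `prevalence` shows how strongly the answer depends on how common doping is to begin with, which is the point of the example.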

One way of thinking of this process is that we are ‘reversing the order’ of the tree to put testing first, followed by the revelation of the truth. This is shown explicitly in Figure 11.3. This ‘reversed tree’ arrives at exactly the same numbers for the final outcomes, but respects the temporal order in which we come to know things (testing and then the truth about doping), rather than the actual timeline of underlying causation (doping and then testing). This ‘reversal’ is exactly what Bayes’ theorem does—in fact Bayesian thinking was known as ‘inverse probability’ until the 1950s.
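The ‘reversal’ can be made concrete by grouping the same four expected counts in the two possible orders. The sketch below is illustrative only: the helper `build_tree` and the dictionary layout are assumptions, while the counts are those for 1,000 athletes.

```python
# Expected counts for 1,000 athletes, keyed by (true status, test result).
leaves = {
    ("doping", "positive"): 19,
    ("doping", "negative"): 1,
    ("not doping", "positive"): 49,
    ("not doping", "negative"): 931,
}

def build_tree(leaves, first):
    """Group the leaves by the first-level split: 0 = true status, 1 = test result."""
    tree = {}
    for key, count in leaves.items():
        branch = tree.setdefault(key[first], {})
        branch[key[1 - first]] = count
    return tree

forward = build_tree(leaves, first=0)  # doping status, then test result
reverse = build_tree(leaves, first=1)  # test result, then doping status

positives = sum(reverse["positive"].values())
print(forward["doping"])    # {'positive': 19, 'negative': 1}
print(reverse["positive"])  # {'doping': 19, 'not doping': 49}
print(reverse["positive"]["doping"], "out of", positives)  # 19 out of 68
```

Both trees contain exactly the same leaves. Only the grouping changes, and the reversed grouping is the one that answers the question asked after a positive test.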

 

 

Figure 11.2

Expected frequency tree for sports doping, showing what we expect to happen to 1,000 athletes when 1 in 50 are doping, and the screening test is ‘95% accurate’.

 

 

Figure 11.3

‘Reversed’ expected frequency tree for sports doping, restructured so that the test result comes first, followed by revealing the true activity of the athlete.

 

 

The sports doping example shows how easy it is to confuse the probability of doping, given a positive test (28%), with the probability of testing positive, given doping (95%). We have already seen other contexts in which the probability of ‘A given B’ is confused with the probability of ‘B given A’:

• the misinterpretation of P-values, in which the probability of the evidence given the null hypothesis is confused with the probability of the null hypothesis given the evidence.

• the prosecutor’s fallacy in court cases, in which the probability of the evidence given innocence is confused with the probability of innocence given the evidence.

 

A reasonable observer might think that formal Bayesian thinking would bring clarity and rigour to the handling of evidence in legal cases, and so may be surprised to hear that Bayes’ theorem is essentially prohibited in British courts. Before revealing the arguments behind this ban, we must first look at the statistical quantity that is allowed in court: the likelihood ratio.

 

 

Odds and Likelihood Ratios


The doping example lays out the logical steps necessary to get to the quantity that is really of interest when making decisions: out of people who test positive, the proportion who are really doping, which turns out to be 19/68. The expected frequency tree shows that this depends on three crucial numbers: the proportion of athletes who are doping (1/50, or 20/1,000 in the tree), the proportion of doping athletes who correctly test positive (95%, or 19/20 in the tree) and the proportion of non-doping athletes who incorrectly test positive (5%, or 49/980 in the tree).
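Written as a formula, the same calculation combines the three numbers in one step. The notation is an assumption introduced here for illustration: D stands for ‘doping’, D-bar for ‘not doping’, and + for a positive test.

```latex
\[
P(D \mid +)
  = \frac{P(D)\,P(+ \mid D)}{P(D)\,P(+ \mid D) + P(\bar{D})\,P(+ \mid \bar{D})}
  = \frac{\tfrac{1}{50} \times 0.95}{\tfrac{1}{50} \times 0.95 + \tfrac{49}{50} \times 0.05}
  = \frac{0.019}{0.068}
  = \frac{19}{68} \approx 0.28
\]
```

Multiplying the numerator and denominator by 1,000 recovers the counts in the tree: 19 true positives out of 19 + 49 = 68 positive tests.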
