The Art of Statistics How to Learn from Data(45)
Author: David Spiegelhalter

The analysis becomes (fairly) intuitive by using an expected frequency tree, although Bayes’ theorem can also be expressed in a convenient formula using probabilities. But first we need to return to the idea of odds introduced in Chapter 1, to which seasoned gamblers will need no introduction, at least if they are British. The odds of an event is the probability of it happening, divided by the probability of it not happening. So the odds of flipping a coin and getting a head is 1, which comes from ½ (the probability of a head) divided by ½ (the probability of getting a tail).* The odds of throwing a die and getting a six is 1/6 divided by 5/6, which comes to 1/5, popularly known as ‘1 to 5’, or ‘5 to 1 against’ if you use the British method of expressing gambling odds.
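
As a minimal sketch of the definition of odds given above, the two examples can be reproduced with exact fractions. The function name `odds` is chosen here purely for illustration.

```python
from fractions import Fraction


def odds(probability):
    """Odds of an event: P(it happens) divided by P(it does not happen)."""
    return probability / (1 - probability)


# Flipping a coin and getting a head: (1/2) / (1/2)
coin_head = odds(Fraction(1, 2))

# Throwing a die and getting a six: (1/6) / (5/6)
die_six = odds(Fraction(1, 6))

print(coin_head)  # 1
print(die_six)    # 1/5
```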

Next we need to introduce the idea of a likelihood ratio, a concept that has become critical in communicating the strength of forensic evidence in criminal court cases. Judges and lawyers are being increasingly trained to understand likelihood ratios, which essentially compare the relative support provided by a piece of evidence for two competing hypotheses, which we shall call A and B, but which would often represent guilt or innocence. Technically, the likelihood ratio is the probability of the evidence assuming hypothesis A, divided by the probability of the evidence assuming hypothesis B.

Let us see how this works in the doping case, where the forensic ‘evidence’ is the positive test result, hypothesis A is that the athlete is guilty of doping, and hypothesis B is that they are innocent. We are assuming that 95% of dopers test positive, so the probability of the evidence, given hypothesis A, is 0.95. We know that 5% of non-dopers test positive, so the probability of the evidence, given hypothesis B, is 0.05. So the likelihood ratio is 0.95/0.05 = 19: that is, the positive test result was 19 times more likely to happen were the athlete guilty rather than innocent. This may at first seem like quite strong evidence, but we shall later come to likelihood ratios in millions and billions.
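
The likelihood ratio for the doping test can be checked in a few lines. This is only a sketch: the function and variable names are illustrative assumptions, and the two probabilities are the 95% and 5% figures quoted above.

```python
from fractions import Fraction


def likelihood_ratio(p_evidence_given_a, p_evidence_given_b):
    """P(evidence | hypothesis A) divided by P(evidence | hypothesis B)."""
    return p_evidence_given_a / p_evidence_given_b


# Hypothesis A: the athlete is doping; 95% of dopers test positive.
p_positive_if_doping = Fraction(95, 100)

# Hypothesis B: the athlete is innocent; 5% of non-dopers test positive.
p_positive_if_innocent = Fraction(5, 100)

lr = likelihood_ratio(p_positive_if_doping, p_positive_if_innocent)
print(lr)  # 19
```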

So let’s put all this together in Bayes’ theorem, which simply says that

the initial odds for a hypothesis × the likelihood ratio = the final odds for the hypothesis

For the doping example, the initial odds for the hypothesis ‘the athlete is doping’ is 1/49, and the likelihood ratio is 19, so Bayes’ theorem says the final odds are given by

1/49 × 19 = 19/49

These odds of 19/49 can be transformed to a probability of 19/(19+49) = 19/68 = 28%. So this probability, which was obtained from the expected frequency tree in a rather simple way, can also be derived from the general equation for Bayes’ theorem.
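
The calculation above can be sketched in code, assuming the initial odds of 1/49 and the likelihood ratio of 19 from the doping example. The helper names `update_odds` and `odds_to_probability` are made up for this illustration.

```python
from fractions import Fraction


def update_odds(prior_odds, likelihood_ratio):
    """Bayes' theorem in odds form: initial odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds back to a probability: odds / (1 + odds)."""
    return odds / (1 + odds)


initial_odds = Fraction(1, 49)   # odds that the athlete is doping
lr = Fraction(19)                # likelihood ratio of a positive test

final_odds = update_odds(initial_odds, lr)
final_probability = odds_to_probability(final_odds)

print(final_odds)                          # 19/49
print(final_probability)                   # 19/68
print(round(float(final_probability), 2))  # 0.28
```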

In more technical language, the initial odds are known as the ‘prior’ odds, and the final odds are the ‘posterior’ odds. This formula can be repeatedly applied, with the posterior odds becoming the prior odds when introducing new, independent items of evidence. When combining all the evidence, this process is equivalent to multiplying the independent likelihood ratios together to form a composite likelihood ratio.
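
To illustrate why repeated updating is equivalent to using a composite likelihood ratio, the sketch below applies three likelihood ratios one at a time and then all at once. Only the value 19 comes from the doping example; the other two values are hypothetical, invented here to stand in for further independent items of evidence.

```python
from fractions import Fraction
from math import prod

prior_odds = Fraction(1, 49)

# 19 is the positive-test likelihood ratio from the text;
# 3 and 1/2 are hypothetical values used only for illustration.
likelihood_ratios = [Fraction(19), Fraction(3), Fraction(1, 2)]

# Sequential updating: each posterior becomes the next prior.
sequential_odds = prior_odds
for lr in likelihood_ratios:
    sequential_odds = sequential_odds * lr

# One-step updating with the composite likelihood ratio.
composite_lr = prod(likelihood_ratios)
one_step_odds = prior_odds * composite_lr

print(sequential_odds == one_step_odds)  # True
```

Note that a likelihood ratio below 1, such as the hypothetical 1/2 here, counts against the hypothesis and pulls the odds down.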

Bayes’ theorem looks deceptively basic, but turns out to encapsulate an immensely powerful way of learning from data.

 

 

Likelihood Ratios and Forensic Science

 

On Saturday 25 August 2012, archaeologists began an excavation for Richard III’s remains by digging in a car park in Leicester. Within a few hours they found their first skeleton. What is the probability that this was Richard III?

 

In popular legend, promoted by the Tudor apologist William Shakespeare, Richard III (the last king of the House of York) was an evil hunchback. While this is a highly contested view, it is a matter of historical record that he was killed at the Battle of Bosworth Field on 22 August 1485, aged 32, his death effectively ending the Wars of the Roses. His body was said to have been mutilated and brought for burial to Greyfriars Priory in Leicester, which was later demolished and eventually became covered by a car park.

Considering just the information provided, we might assume that this skeleton was the remains of Richard III if all the following were true:

• he had really been buried in Greyfriars;

• his body had not been dug up and moved or scattered in the intervening 527 years;

• the first skeleton found happened to be him.

 

Suppose we make rather pessimistic assumptions, and assume only a 50% probability that the stories of his burial were true, and a 50% probability that his skeleton is still where he was originally buried in Greyfriars. And imagine that up to 100 other bodies were also buried in the identified location (the archaeologists had a good idea where to dig, since Richard had been reported to have been buried in the choir of the friary). Then the probability of all the above events being true is

1/2 × 1/2 × 1/100 = 1/400

This is a fairly low chance that this skeleton is Richard III; the researchers who originally carried out this analysis assumed a ‘sceptical’ prior probability of 1/40, and so we are being considerably more sceptical.1
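
The pessimistic prior can be reproduced by multiplying the three assumed probabilities. This is a minimal sketch; the variable names are illustrative, and the 1-in-100 figure reflects the assumption of up to 100 other bodies in the identified location.

```python
from fractions import Fraction

p_buried_in_greyfriars = Fraction(1, 2)   # the burial stories are true
p_still_in_place = Fraction(1, 2)         # not dug up, moved or scattered
p_first_skeleton_is_him = Fraction(1, 100)

prior_probability = (
    p_buried_in_greyfriars * p_still_in_place * p_first_skeleton_is_him
)
print(prior_probability)  # 1/400

# Compare with the researchers' own 'sceptical' prior of 1/40.
researchers_prior = Fraction(1, 40)
print(researchers_prior / prior_probability)  # 10
```

On these assumptions the prior used here is ten times smaller than the researchers' own.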

But when the archaeologists examined the skeleton in detail they found a remarkable series of supporting forensic findings, which included radiocarbon dating of the bones (there was a 95% probability that they dated from AD 1456 to AD 1530), the fact that it was a male of around thirty, the skeleton displayed scoliosis (curvature of the spine), and evidence that the body had been mutilated after death. Genetic analysis involving known descendants of close relatives of Richard (he had no children himself) revealed shared mitochondrial DNA (through his mother). The male Y chromosome did not support a relationship, but this could easily be explained by breaks in the male line due to mistaken paternity.

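The evidential value of each item of evidence can be summarized by its likelihood ratio, which in this situation is defined as

likelihood ratio = (probability of the evidence, if the skeleton is Richard III) / (probability of the evidence, if the skeleton is not Richard III)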

Table 11.1 shows the individual likelihood ratios for each piece of evidence, revealing that none of them are individually very convincing, although the researchers were cautious and deliberately erred on the side of lower likelihood ratios that did not favour the skeleton being Richard III. But if we assume these are independent forensic findings, then we are entitled to multiply the likelihood ratios to get an overall assessment of the strength of the combined evidence, which comes to an ‘extremely strong’ value of 6.7 million. The verbal terms used in the table are taken from the scale shown in Table 11.2, which has been recommended for use in court.2

So is this evidence convincing? Remember we calculated a conservative initial probability of 1 in 400 that this skeleton was Richard III, before taking into account the detailed forensic findings. This corresponds to initial odds of around 1 to 400: Bayes’ theorem tells us to multiply this by the likelihood ratio to give final odds, which therefore come to 6.7 million / 400 = 16,750. So, even if we are extremely cautious indeed in assessing the prior odds and the likelihood ratios, we could say that the odds are around 17,000 to 1 that the skeleton is Richard III.
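
The final step can be sketched as follows, using the prior probability of 1/400 and the combined likelihood ratio of 6.7 million quoted in the text. The variable names are illustrative. The sketch also shows that using the exact odds of 1 to 399, rather than the rounded 1 to 400, barely changes the answer.

```python
from fractions import Fraction

prior_probability = Fraction(1, 400)
combined_lr = 6_700_000  # combined likelihood ratio quoted in the text

# Rounded prior odds of 1 to 400, as used in the text.
approx_final_odds = Fraction(1, 400) * combined_lr
print(approx_final_odds)  # 16750

# Exact prior odds: probability / (1 - probability) = 1/399.
exact_prior_odds = prior_probability / (1 - prior_probability)
exact_final_odds = exact_prior_odds * combined_lr
print(round(float(exact_final_odds)))  # 16792

# Odds of 16,750 to 1 expressed as a probability.
final_probability = approx_final_odds / (1 + approx_final_odds)
print(final_probability)  # 16750/16751
```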


Table 11.1

Likelihood ratios assessed for items of evidence from the skeleton found in Leicester, comparing hypotheses that the skeleton is, or is not, Richard III. The combined likelihood ratio of 6.5 million is obtained by multiplying together all the individual likelihood ratios.


Table 11.2

Recommended verbal interpretations of likelihood ratios when reporting forensic findings in court.
