You might think that with the increasing availability of data and user-friendly statistical software that does the mathematical heavy lifting for you, there is less need to be trained in statistical methods. In fact, the opposite is true.
Because data is becoming more accessible and easier to analyze, statistical numbers and graphs are increasingly used to support claims with supposedly objective evidence. Statistics now serve as evidence in many fields, including politics, advertising, and the media. As they become detached from their scientific basis, they are used more for persuasion than for information.
These claims are made by people who may or may not have formal training in statistical methods. There is a proliferation of data producers and disseminators, but little verification to ensure the accuracy of the numbers they publish.
Even when data are generated by scientists conducting research, errors and biases in statistical data can occur at any stage of the cycle, from problems with the research itself to misrepresentations in the media and to the public.
Therefore, in the modern world, data literacy is essential to accurately assess the credibility of the many news stories, social media posts, and arguments that use statistics as evidence. The information contained in The Art of Statistics by David Spiegelhalter will enable you to more accurately evaluate the statistics you encounter every day.
You may be wondering if you should read the book. This book summary will tell you what important lessons you can learn from this book so you can decide if it is worth your time.
At the end of this book summary, I’ll also tell you the best way to get rich by reading and writing.
Without further ado, let’s get started.
The Art of Statistics Book Summary
Lesson 1: Systematic bias is a common problem that reduces the trustworthiness of data.
Data is not a neutral, fixed body of knowledge. Like any other information, it is subject to bias and interpretation.
Human judgment is required from the outset. To begin data collection, we sometimes have to decide on the fly what to measure. If we want to estimate the number of trees in the world, we have to agree on what we mean when we say “tree.” For example, in most studies of this type, researchers ignore trees that are 4 inches in diameter or less.
Data can also mislead if the definition of what is being measured changes partway through the measurement process. The number of sexual offenses reported to the U.K. police nearly doubled between 2014 and 2017, from 64,000 to 121,000.
To the untrained eye, it may look like the crime rate increased dramatically. The real reason for the increase was that sexual offenses were taken more seriously and recorded more thoroughly after a report published in 2014 criticized police recording practices.
Therefore, it is important to keep in mind that data does not always reflect the truth. Remember that much data can be gleaned from surveys that ask about people’s emotions, such as how happy they are. No single data set can ever capture the full range of human experience, but we can at least try to get a sense of it with these questions. An additional problem is that respondents’ own biases and assumptions can skew the results.
Therefore, developing appropriate research questions is a major challenge in statistical analysis. The tone of a question can influence how people respond to it.
When British respondents were asked about “giving 16- and 17-year-olds the right to vote,” 52% were in favor, while 41% were opposed. But when the same question was asked about “lowering the voting age from 18 to 16,” only 37% were in favor, while 56% were opposed.
It’s not always the type of question that causes bias, but rather the answer choices available. In 2017, Ryanair boasted that 92% of passengers were satisfied with their flight. However, it was noted that the scale of responses to the survey questions was limited to “excellent,” “very good,” “good,” “average,” and “ok.”
This means that statisticians are already confronted with biased information before they examine the raw data.
Lesson 2: What we make of information depends on how we are presented with it.
Human interpretation is a problem in all aspects of data analysis, from collection to presentation.
In recent years, increasing attention has been paid to data visualization as a means of communicating statistical results. Visualization techniques are used to make data more accessible to the human eye through the medium of graphics. Visual representations of data are referred to by statisticians as “inter-ocular” because they allow the recognition of patterns in the data through direct visual inspection without the need for up-front mathematical processing.
When comparing the number of deaths following cardiac surgery at different hospitals, a bar chart is a useful tool. Any hospital whose results deviate significantly from the norm will stand out even before you consult the underlying numbers.
However, to be accurate and effective, the visual must be carefully planned. The meaning of the data can be altered by changing factors such as color, font, order, and language. For this reason, modern statistical analyses often involve psychologists.
Suppose, to continue our discussion of hospital mortality rates, a statistician presents the data in a table. She must determine the order in which she lists the hospitals. Listing them by mortality rate may seem obvious. However, such a list could be misread as a ranking by quality, even though the best facilities may have the highest mortality rates because they treat the sickest patients.
The effect of framing is another example of how presentation can affect interpretation and has been studied extensively. The persuasiveness of a statistical statement can be altered by the words used to describe it.
An advertising campaign on London’s subways several years ago made the bold claim that 99 percent of young Londoners were not involved in serious youth violence. The intent was probably to give Londoners a sense of security.
An equivalent statement would be that “1% of young Londoners engage in serious youth violence,” which would have the opposite effect on readers. That sounds a bit more threatening. The impact would be even greater if, instead of a percentage, we used a bare number: “London has 10,000 violent young offenders!”
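To make the framing effect concrete, here is a minimal Python sketch that expresses one underlying figure in three ways. It takes the 1% share and the 10,000 count from the text. The population of 1,000,000 young Londoners is an assumption implied by those two numbers, not a figure stated in this summary.

```python
# Same fact, three framings.
# ASSUMPTION: 1,000,000 young Londoners, implied by "1%" being "10,000".
young_londoners = 1_000_000
involved_share = 0.01  # 1% involved in serious youth violence (from the text)

involved = round(young_londoners * involved_share)

framings = [
    # Reassuring: emphasize the majority.
    f"{1 - involved_share:.0%} of young Londoners are not involved in serious youth violence",
    # Neutral-to-worrying: emphasize the minority.
    f"{involved_share:.0%} of young Londoners are involved in serious youth violence",
    # Alarming: a bare count with no denominator.
    f"London has {involved:,} violent young offenders",
]

for line in framings:
    print(line)
```

All three lines describe the same data. Only the choice of denominator and emphasis changes.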
Framing is a powerful tool that statisticians use to their advantage, whether their goal is to shock or reassure their audience. Researchers should use prudent design and precise language to prevent inappropriate initial reactions to the data.
Lesson 3: Due to selective reporting, the scientific literature is positively biased.
Despite spending their entire professional lives combing through data, few researchers ever make a breakthrough. Researchers are under a lot of pressure to publish their work, and that can lead to a bit of fiddling with the data.
Even scientists, despite their commitment to the truth, have been caught engaging in dubious research practices. One such practice is multiple testing, in which researchers run as many tests as it takes to reach their desired conclusion. The more tests researchers run, the more likely they are to obtain false positives, results that appear to confirm a hypothesis but are actually due to random chance.
A look at a 2009 study conducted by a group of highly respected researchers shows why this is a problem. Subjects underwent brain imaging while looking at a series of photos that showed people in various emotional states. The only problem was that the “subject” was a dead Atlantic salmon weighing four pounds.
Only 16 of the 8,064 brain areas studied showed activity in response to the images. Rather than concluding that the fish had exceptional abilities, the team concluded that there must have been some false positives among the more than 8,000 tests performed.
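A rough calculation shows why some false positives are almost guaranteed when this many tests are run. The sketch below takes the 8,064 tests from the text. The per-test false-positive rates are illustrative assumptions, since this summary does not state the threshold the researchers used, and the calculation treats the tests as independent.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives when no real effect exists.

    Assumes every test has the same false-positive rate `alpha`.
    """
    return n_tests * alpha


n_tests = 8064  # brain areas tested (from the text)

# ASSUMPTION: illustrative significance thresholds, not taken from the study.
for alpha in (0.05, 0.01, 0.001):
    expected = expected_false_positives(n_tests, alpha)
    print(f"alpha = {alpha}: about {expected:.0f} false positives expected")
```

Even at a strict threshold, thousands of tests on a subject with no brain activity at all should still produce a handful of apparent hits.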
While false positives are not necessarily problematic in themselves, they are often the only results reported. Even in scientific reporting, usually only the most promising or interesting results are made available to the public. As a result, a positive bias has developed in the scientific literature, as only those studies that appear to support a hypothesis are published. The consequences for the interpretation of results are obvious.
For example, you would be surprised if a study found that eating bacon sandwiches increased the risk of cancer. But if you knew that twenty other studies had looked for a link and found none, your surprise might be limited.
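The example above can be put into numbers. The following sketch assumes there is no real link at all, that the twenty-one studies (the one positive study plus the twenty others) are independent, and that each uses a 5% false-positive rate. The 5% threshold is a common convention and an assumption here, not a figure from the book summary.

```python
def chance_of_any_false_positive(n_studies: int, alpha: float) -> float:
    """Probability that at least one of n independent studies reports a
    spurious effect, given that no real effect exists."""
    return 1 - (1 - alpha) ** n_studies


n_studies = 21  # one positive study plus twenty that found nothing (from the text)
alpha = 0.05    # ASSUMPTION: conventional 5% significance threshold

p_any = chance_of_any_false_positive(n_studies, alpha)
print(f"Chance that at least one of {n_studies} studies finds a spurious link: {p_any:.0%}")
```

Under these assumptions, a lone positive result is more likely than not to turn up by chance, which is why knowing about the unpublished null results matters.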
Both excessive academic pressure and our penchant for sensational, groundbreaking stories contribute to this kind of biased reporting.
John Ioannidis, a professor at Stanford University, claimed that most published research findings are false because of widespread “publication bias.” While Ioannidis wanted to stir up controversy, his assertion is a warning against taking research results at face value just because they appear in a scientific journal.
Lesson 4: In the media, storytelling often takes precedence over fact-checking.
After studies are written, the media report on them. In doing so, the media often make use of artistic license.
Fortunately, data journalism is flourishing. Training journalists in data interpretation and communication is on the rise. Useful statistics can clarify and illuminate important issues and add depth and dimension to reports.
However, there is always the danger of twisting statistics in the name of a good story. Stories need an emotional power that most scientific journals lack. Institutions that care more about page views than reporting accurate research will always be tempted to avoid nuanced conclusions in favor of sensationalism.
The author has personal experience with this kind of sensationalism after a careless remark he made during a public speech. He was commenting on the results of a survey of British adults’ sexual preferences. The researchers found that the sexual activity of British teenagers had decreased by 20% compared to a decade earlier.
The author’s speculation that the growing availability of content like Netflix could be responsible for the decline sparked wild headlines like “Sex will be obsolete by 2030 because of Netflix, says a single scientist.”
One of the media’s most common methods of stirring emotions is to exaggerate statistical claims of risk, aside from outright fabrications like this one.
The media has heavily promoted the 18% increased risk of colorectal cancer after regular consumption of processed meat found in a World Health Organization report. This indeed conjures up some disturbing images. But if we are honest, how concerned should we be?
The 18% figure reported in the media, while accurate, is a relative risk, and the coverage failed to relate it to the absolute risk.
To put this in context, the risk of developing colorectal cancer in people who do not regularly consume processed meat is about 6%. Increasing that 6% by 18% yields 7.08%. In absolute terms, the risk for regular consumers of processed meat is therefore only about 1 percentage point higher than for non-consumers.
Presenting a relative risk without the absolute risk is a common way of making a risk look larger than it is. The book examines other examples of erroneous interpretation as well.
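The arithmetic behind this comparison can be checked with a short Python sketch. It uses the 6% baseline risk and the 18% relative increase reported above. The function and variable names are illustrative only.

```python
def risk_after_increase(baseline: float, relative_increase: float) -> float:
    """Apply a relative increase (e.g. 0.18 for 18%) to a baseline risk."""
    return baseline * (1 + relative_increase)


baseline = 0.06           # risk without regular processed meat (from the text)
relative_increase = 0.18  # relative increase reported by the media (from the text)

exposed = risk_after_increase(baseline, relative_increase)
absolute_increase = exposed - baseline

print(f"Risk for regular consumers: {exposed:.2%}")
print(f"Absolute increase: {absolute_increase * 100:.2f} percentage points")
```

An 18% relative increase sounds alarming, but applied to a 6% baseline it moves the risk by only about one percentage point, or roughly one extra case per 100 people.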
The Art of Statistics Book Review
The Art of Statistics is a great book I’d like to recommend to anyone who is interested in data science. If you spend some time digesting the ideas, it might make a positive impact on your life.
The study of statistical patterns is a powerful tool for finding the answers to many of the world’s puzzles. When properly reported, statistical studies can add depth to narratives and educate the public about pressing matters.
Unfortunately, research must pass through many biased filters before it reaches the public, such as peer-reviewed journals and the media. As the use of statistics becomes more widespread, everyone needs to improve their data literacy to properly evaluate the results.
Like your friends, statistics can provide you with entertaining anecdotes, but you should remember that they do not always tell the whole truth.
The same skepticism you apply to other types of claims, facts, and quotes should be applied to statistical information. Also, whenever possible, verify the credibility of reported statistics by locating the original sources.
How To Get Rich By Reading and Writing?
You must be an avid reader who is hungry for knowledge if you are reading this book review. Have you thought about making money using your reading and writing skills?
Thanks to the Internet, the world has undergone a massive change in recent years. Blogging has now become the best way to make money online.
Since no tech experience is required, as long as you’re good at writing, you can easily start a blog that generates cash flow for you while you sleep.
Warren Buffett said, “If you don’t find a way to make money while you sleep, you will work until you die.”
Instead of looking for a 9-5 job and staying in your comfort zone, it’s better if you become your own boss as soon as possible.
Find out how to build a blog and become a wealthy blogger today!