
Friday, February 07, 2020

Using Statistics to Support Your Research

Statistics can provide excellent evidence for your paper.  However, unless they are used appropriately, they can undermine your argument and even do harm. It’s also easy to reinforce cognitive biases with cherry-picked statistics without realizing what you’re doing.  The coupling of cognitive bias with flawed statistical reasoning was explored by Daniel Kahneman and Amos Tversky, and that work was recognized when Kahneman received the Nobel Prize in economics in 2002.

Here are a few guidelines for using statistics in your paper.

The key is to be aware of how statistical reasoning works and where it might be faulty.  Faulty statistical reasoning can be harmful.  It can lead to causal claims or conclusions that are unwarranted, inaccurate, or deceptive.  Even if the presentation of the statistics is compelling, and even if the source seems reliable, the numbers can be inaccurate. As you analyze data, keep in mind when and how you might be making errors.

The Manipulated and "Sanitized" Statistic.  Numbers can be manipulated to make the facts seem to conform to one’s agenda.  For example, the College Board manipulated the SAT scores in 1996, which made it appear that math and verbal scores had improved when, in reality, performance was about the same.

Numbers can also be needlessly precise and hard to read.  Put them in a form that is easier to decipher and compare.

The Meaningless Statistic.  Exact numbers can be used to quantify something so inexact, vaguely defined, or difficult to count that it could only be approximated.  The exact number looks impressive, but it can hide the fact that certain subjects (domestic abuse, eating habits, use of narcotics, shopping, sexual preference) cannot be quantified exactly. Respondents don't always tell the truth, whether out of denial or embarrassment or because they are merely guessing. Or they respond in ways they think the researcher expects.

The Vagueness of the Average.  The mean, median, and mode are three measures of central tendency (the intermediate, or middle, value in a set of numbers). They can be used in inconsistent and inappropriate ways.

How to say it’s the average:  The core of the problem is that there are three ways of reporting an "average": the mean, the median, and the mode.

Unethical uses of "averages".  People tend to report whichever average serves their purposes.
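To see how far apart the three "averages" can be, here is a minimal sketch in Python. The salary figures are made up for illustration; they are not from any real survey.

```python
import statistics

# Hypothetical annual salaries at a small firm (illustrative numbers only).
# One very high salary pulls the mean far above what most people earn.
salaries = [30_000, 30_000, 35_000, 40_000, 45_000, 50_000, 400_000]

mean_salary = statistics.mean(salaries)      # arithmetic mean
median_salary = statistics.median(salaries)  # middle value when sorted
mode_salary = statistics.mode(salaries)      # most frequent value

print(f"mean:   {mean_salary:,.0f}")    # 90,000
print(f"median: {median_salary:,.0f}")  # 40,000
print(f"mode:   {mode_salary:,.0f}")    # 30,000
```

All three numbers are a legitimate "average" of the same data. Someone who wants the firm to look generous could quote the mean, and someone who wants it to look stingy could quote the mode, which is why it helps to say which one you are reporting.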

The Distorted Percentage Figure.  Percentages are often reported without any explanation of the original numbers used in the calculation.  Another fallacy in reporting percentages occurs when the margin of error is ignored.  This is the margin within which the true figure is likely to lie, based on estimated sampling error in a survey.
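Both problems can be shown with a few lines of Python. This is a rough sketch, not a full treatment: the poll numbers are invented, the helper names `percent_change` and `margin_of_error` are my own, and the margin uses the common normal approximation for a simple random sample at about 95% confidence.

```python
import math

def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p from n respondents.

    Assumes a simple random sample; z=1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The same "100% increase" can hide very different base numbers.
small_base = percent_change(1, 2)          # 1 case becomes 2 cases
large_base = percent_change(1_000, 2_000)  # 1,000 cases become 2,000 cases
print(small_base, large_base)

# Hypothetical poll: 52% of 400 respondents favor a proposal.
support = 0.52
moe = margin_of_error(support, 400)
low, high = support - moe, support + moe
print(f"{support:.0%} plus or minus {moe:.1%} (range {low:.1%} to {high:.1%})")
```

In this made-up poll the margin of error is about five percentage points, so the range includes 50%. Reporting "a majority favors the proposal" without the margin would overstate what the sample can support.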

False Ranking.  This happens when items are compared on the basis of poorly defined criteria.  Unless we know how the ranked items were chosen and how they were compared (the criteria), a ranking can produce a scientific-seeming number based on completely unscientific methods.
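A small Python sketch shows how much a ranking depends on criteria that are rarely disclosed. Everything here is hypothetical: the three items, their scores on two criteria, and the weights are invented for illustration.

```python
def rank(scores, weights):
    """Return item names ordered from best to worst by weighted total score."""
    totals = {
        name: sum(w * s for w, s in zip(weights, values))
        for name, values in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)

# Hypothetical items scored 0-10 on two criteria: (affordability, safety).
scores = {
    "A": (9, 3),
    "B": (5, 6),
    "C": (2, 10),
}

mostly_affordability = rank(scores, weights=(0.8, 0.2))
mostly_safety = rank(scores, weights=(0.2, 0.8))

print(mostly_affordability)  # ['A', 'B', 'C']
print(mostly_safety)         # ['C', 'B', 'A']
```

The data never changed, yet the "best" item became the "worst" once the weights were swapped. A published ranking that does not state its criteria and weights gives the reader no way to tell which version they are looking at.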

Drawbacks of Data Mining.  Many highly publicized correlations are the product of data mining.  In this process, a software program searches databases and randomly compares one set of variables (say, buying habits) with another set.  From these countless comparisons, certain relationships, or associations, are revealed (perhaps between green tea Frappuccino drinking and pancreatic cancer risk).  At one retail company, a correlation was found between diaper sales and beer sales, presumably because young fathers go out at night to buy diapers.  The retailer then displayed the diapers next to the beer and reportedly sold more of both.
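The risk with countless comparisons is that some will look meaningful by chance alone. The sketch below, in Python, compares one column of pure random noise against 200 other columns of pure random noise and counts how many pairs look "significantly" correlated. The sample sizes and the cutoff are my own assumptions: 0.444 is roughly the two-sided 5% critical value of the correlation coefficient for 20 observations.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

random.seed(0)       # fixed seed so the run is repeatable
N_OBS = 20           # observations per variable (assumed)
N_VARS = 200         # number of unrelated variables to "mine" (assumed)
R_CRITICAL = 0.444   # approx. 5% two-sided cutoff for |r| with 20 observations

# Every series is independent noise, so no real relationship exists.
target = [random.gauss(0, 1) for _ in range(N_OBS)]
hits = 0
for _ in range(N_VARS):
    candidate = [random.gauss(0, 1) for _ in range(N_OBS)]
    if abs(pearson_r(target, candidate)) > R_CRITICAL:
        hits += 1

print(f"{hits} of {N_VARS} unrelated variables passed the 5% cutoff")
```

With a 5% cutoff, roughly one comparison in twenty should pass even though nothing is related to anything. A correlation found by mining needs to be checked against new data or backed by a plausible mechanism before it is treated as a finding.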

The Biased Meta-Analysis.  In a meta-analysis, researchers look at a whole range of studies that have been done on one topic (say, the role of high-fat diets in cancer risk).  The purpose of this "study of studies" is to decide on the overall meaning suggested by these collected findings.

These are just a few of the many areas of bias in the use of statistics. With new algorithms being developed and the quest for meaningful pattern recognition in machine learning and deep learning, it’s important to recognize that bias can creep in at any point, especially if you have a predetermined idea about the result or a vested interest in it.
