How can doing “too much” backfire?
First, few people want to do extra work, yet non-statisticians who are not confident in their analysis may run several statistical tests on the same data, thinking that this will verify their results. The problem is that if you do not pick the appropriate statistical analysis, or simply run many tests, you become increasingly likely to obtain at least one result that shows significance purely by chance. For this reason, many sources suggest using an ANOVA to compare several groups instead of running many separate t-tests.
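The growth of that risk can be sketched with a small calculation. The Python snippet below is only an illustration under stated assumptions: the tests are independent, every null hypothesis is actually true, and each test uses the same significance level. The function name `familywise_error_rate` is made up for this example and is not taken from any of the sources.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across `num_tests` tests.

    Assumes the tests are independent and every null hypothesis is true,
    so each test alone has probability `alpha` of a false positive.
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    # P(no false positive in any test) = (1 - alpha) ** num_tests
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 3, 10, 20):
        rate = familywise_error_rate(k)
        print(f"{k:>2} tests at alpha = 0.05: {rate:.3f}")
```

Under these assumptions, a single test carries a 5% chance of a false positive, while twenty tests carry a chance of roughly 64% that at least one of them looks significant by accident.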
Patrick Runkel challenges the assumption that a huge sample always gives better results by demonstrating that a huge sample can turn a practically insignificant difference into a statistically significant one. He designs two otherwise identical scenarios, one with a sample size of ten and the other with a sample size of a million, and calculates the p-value for each. The smaller sample gives a p-value above 0.05 and the larger sample gives a p-value below 0.05. Although a sample size of a million is unrealistic for most studies, the example shows that choosing the sample size is an important part of obtaining meaningful results.
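The effect of sample size on the p-value can be reproduced in a few lines. The sketch below does not use Runkel's actual data or method; it assumes a simple one-sample z-test with a known standard deviation, and the observed difference of 0.01 and standard deviation of 1.0 are hypothetical values chosen only to make the point.

```python
import math


def two_sided_p_value(mean_diff: float, sd: float, n: int) -> float:
    """Two-sided p-value for a one-sample z-test of H0: true difference = 0.

    Assumes the population standard deviation `sd` is known.
    """
    standard_error = sd / math.sqrt(n)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical numbers: the same tiny observed difference in both scenarios.
MEAN_DIFF = 0.01
SD = 1.0

p_small = two_sided_p_value(MEAN_DIFF, SD, 10)
p_large = two_sided_p_value(MEAN_DIFF, SD, 1_000_000)

if __name__ == "__main__":
    print(f"n = 10:        p = {p_small:.3g}")
    print(f"n = 1,000,000: p = {p_large:.3g}")
```

With these made-up inputs, the sample of ten gives a p-value well above 0.05 and the sample of a million gives one far below 0.05, even though the observed difference is identical in both cases.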
Large samples do not prevent hypothesis tests from working; they simply give a more accurate representation of the population. The problem is generally not false positives but true positives: a larger sample may produce significance where people do not want it, for differences too small to matter in practice. Often it is not the numerical values that need to be adjusted but the graphs. In most cases, a very large sample will not allow the naked eye to discriminate between points on a graph, and as a result will lead to misconceptions. As the sample size gets very large, even tiny differences from the situation specified in the null hypothesis may become detectable, which is okay, since this is how the test is supposed to work. So does that mean that a huge sample size is responsible for a false positive, or that a huge n allows a true positive to appear? The reasoning above points to the latter.
Uncertainty and certainty can be quantified in the form of “degrees.” These degrees can be expressed as probabilities, where a higher probability expresses a higher degree of certainty that something will happen. Some researchers, such as Dr. Xiao-Li Meng, have adopted the terms "randomness" and "variation" instead of uncertainty. Dr. W. Edwards Deming best describes the limitation of statistical techniques: they cannot eliminate uncertainty, but they can help us gain some knowledge despite it.
We can never expect certainty from a single study, but a high degree of certainty can be obtained from an accumulated body of evidence. Even so, researchers often write up their results in terms that suggest certainty. The wording of a conclusion has to be specific and based on what the results actually show. Researchers might say that a hypothesis is true, when it would be more accurate to say that the “evidence supports the hypothesis” or is “consistent with the hypothesis.”
1. Deming, W. Edwards, Walter A. Shewhart, 1891 - 1976, Amstat News, September, 2009, p. 19
2. Meng, Xiao-Li, Statistics: Your Chance for Happiness (or Misery), Amstat News, September, 2009, p. 43
3. Schoenfeld, Alan, Purposes and Methods of Research in Mathematics Education, Notices of the American Mathematical Society, v. 47, 2000, pp. 641 - 649.
4. Runkel, Patrick. "Large Samples: Too Much of a Good Thing?" Minitab. N.p., 04 June 1970. Web. 19 Apr. 2017.