Before drawing conclusions based on a non-significant hypothesis test, check its statistical power to detect important effects.
Statistical tests of null hypotheses are often summarized as a p-value with some indication of its statistical importance. However, when test results are not statistically significant, an equally important but less frequently reported statistic is the test’s statistical power: the probability that it would have detected an important effect, if one existed. Statistical power provides valuable additional information about the importance of non-significant results.

Classical Hypothesis Tests Evaluate a Null Hypothesis of ‘No Effect’

In classical statistics, hypothesis tests are constructed to evaluate the null hypothesis, a statement of ‘no effect’. For example, when comparing the mean values of two data sets using a two-sample t-test, the null hypothesis is that there is no difference between the means. Of course, even if the true means are identical, there will always be some difference between the sample means, due to measurement error if nothing else. Therefore the calculations of the hypothesis test evaluate not just the difference between the means, but also how much the distributions of the mean values, estimated from the sample data, overlap. This is summarized as a probabilistic statement, the familiar ‘p-value’.

The P-value is the Probability of Getting Data as Extreme or More Extreme Than Observed If the Null Hypothesis Were True

The p-value is not the probability that the null hypothesis is true; rather, it is the probability of obtaining data as extreme as, or more extreme than, those observed if the null hypothesis were true. The statistical importance of the p-value is determined by comparing it to a threshold of ‘statistical significance’.

α is the Probability of Making a Type I Error, or Rejecting the Null Hypothesis When It Is True

The level of statistical significance, symbolised by the Greek letter α, is an arbitrary threshold selected by the researcher against which the importance of the p-value is gauged.
It represents the acceptable rate of making a Type I error, or incorrectly concluding that an effect exists when it does not. It is usually set at 5% or 1%, and it is against this value that the p-value is compared to evaluate its ‘statistical significance’. For example, if α is set at 0.05 and the p-value is less than or equal to 0.05, the result would be considered ‘statistically significant’ and the test would have rejected the null hypothesis. Note that this does not prove the alternative hypothesis of ‘there is an effect’; it merely provides support for it.

On the other hand, if the p-value is greater than 0.05, the result is not statistically significant and the test has ‘failed to reject the null hypothesis’. However, this does not mean that an important difference does not exist, because statistical tests are vulnerable to the variability in the data: highly variable data can mask, or hide, important effects. This is why statistical power is so important to consider.

Statistical Power is the Probability of Detecting a True Effect, If It Exists

Statistical power is the probability that a test rejects the null hypothesis when a specified alternative hypothesis is true. It depends on the size of the effect, the variability in the data, the sample size, and α. It provides important additional information about the importance of non-significant test results that researchers should consider when drawing their conclusions. For example, a non-significant result coupled with high statistical power to detect an effect of interest to the researcher increases confidence that the effect was not missed. On the other hand, a non-significant result coupled with low statistical power to detect the effect of interest suggests that another experiment, or more sampling, is required before strong conclusions can be drawn.

An excellent resource with applied examples of calculating statistical power for t-tests and one-way analysis of variance is J. Cohen’s book, Statistical Power Analysis for the Behavioral Sciences (2nd ed., 1988, L. Erlbaum Associates, Hillsdale, NJ).
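
As a rough illustration of how power depends on sample size, the Python sketch below approximates the power of a two-sided, two-sample comparison of means with equal group sizes. It is only a sketch under stated assumptions: it uses a normal approximation rather than an exact t-based calculation, and the function name `approx_power_two_sample`, the standardized effect size of 0.5, and the sample sizes are hypothetical choices made for this example, not values taken from the article or from Cohen's book.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test with equal group sizes.

    effect_size is the standardized difference between the two means
    (difference divided by the common standard deviation).
    Assumption: a normal approximation is used in place of the t distribution,
    so power is slightly overstated when samples are small.
    """
    # Critical value for a two-sided test at significance level alpha
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic if the effect is real
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the statistic falls in either rejection region
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical scenario: the effect of interest is half a standard deviation
    for n in (10, 64):
        power = approx_power_two_sample(effect_size=0.5, n_per_group=n)
        print(f"n per group = {n:3d}: approximate power = {power:.2f}")
```

Under these assumptions, 10 observations per group give a power of roughly 0.20, while 64 per group give roughly 0.81. A non-significant result from the smaller study would therefore say little about whether an effect of that size exists, whereas the same result from the larger study would be more informative.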
The copyright of the article Think About Statistical Power in Scientific Inquiry is owned by Ian Parnell. Permission to republish Think About Statistical Power in print or online must be granted by the author in writing.