Comparisons of ten corrections methods for t-test in multiple comparisons via Monte Carlo study


Multiple comparisons of treatment means are common in several fields of knowledge. Student's t-test is one of the first procedures to have been used for multiple comparisons; however, the p-values associated with it are inaccurate, since there is no control of the family-wise Type I error. To solve this problem, several corrections have been developed. In this work, based on Monte Carlo simulations, we evaluated the t-test and the following corrections with respect to their power and Type I error rate: Bonferroni, Holm, Hochberg, Hommel, Holland, Rom, Finner, Benjamini–Hochberg (BH), Benjamini–Yekutieli and Li. The study was conducted by varying the sample size, the sample distribution and the degree of variability. In all instances we considered three balanced treatments, and the probability distributions considered were Gumbel, Logistic and Normal. Although the corrections behaved more alike as the sample size increased, our study reveals that the BH correction provides the best family-wise Type I error rate and is the second most powerful correction overall.
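To make the idea of a correction concrete, the following is a minimal sketch of how five of the ten procedures named above turn raw p-values into adjusted p-values. It uses the standard textbook definitions and is not taken from the study itself; the function names and the example p-values are illustrative assumptions, and the Hommel, Holland, Rom, Finner and Li procedures are left out for brevity.

```python
# Sketch only: textbook adjusted p-values, not the authors' implementation.
# All function names and the example p-values below are hypothetical.

def _step_adjust(pvals, factor, step_up):
    """Scale the sorted p-values by factor(rank, m), then enforce monotonicity."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    scaled = [min(1.0, factor(rank, m) * pvals[i])
              for rank, i in enumerate(order, start=1)]
    if step_up:
        # Step-up: running minimum, starting from the largest p-value.
        for k in range(m - 2, -1, -1):
            scaled[k] = min(scaled[k], scaled[k + 1])
    else:
        # Step-down: running maximum, starting from the smallest p-value.
        for k in range(1, m):
            scaled[k] = max(scaled[k], scaled[k - 1])
    adjusted = [0.0] * m
    for k, i in enumerate(order):
        adjusted[i] = scaled[k]  # restore the original order
    return adjusted

def bonferroni(pvals):
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]

def holm(pvals):
    return _step_adjust(pvals, lambda r, m: m - r + 1, step_up=False)

def hochberg(pvals):
    return _step_adjust(pvals, lambda r, m: m - r + 1, step_up=True)

def benjamini_hochberg(pvals):
    return _step_adjust(pvals, lambda r, m: m / r, step_up=True)

def benjamini_yekutieli(pvals):
    c = sum(1.0 / k for k in range(1, len(pvals) + 1))  # harmonic-sum penalty
    return _step_adjust(pvals, lambda r, m: c * m / r, step_up=True)

if __name__ == "__main__":
    # Three treatments give three pairwise comparisons; these values are made up.
    raw = [0.01, 0.04, 0.03]
    for name, fn in [("Bonferroni", bonferroni), ("Holm", holm),
                     ("Hochberg", hochberg), ("BH", benjamini_hochberg),
                     ("BY", benjamini_yekutieli)]:
        print(name, [round(p, 4) for p in fn(raw)])
```

A hypothesis is rejected when its adjusted p-value falls below the chosen significance level. In a Monte Carlo study of the kind described here, one would presumably repeat this over many simulated data sets and record how often each procedure rejects a true or a false null hypothesis.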

DOI Code: 10.1285/i20705948v11n1p74

Keywords: t-test; Monte Carlo simulation; Multiple comparison; Type I error rate; Power


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Italy License.