scikit_posthocs.posthoc_tukey

scikit_posthocs.posthoc_tukey(a, val_col: str = None, group_col: str = None, sort: bool = False) → pandas.core.frame.DataFrame

Performs Tukey’s all-pairs comparisons test for normally distributed data with equal group variances. The test applies to a one-factorial layout with normally distributed residuals and equal variances. For k groups, a total of m = k(k-1)/2 hypotheses can be tested. Each null hypothesis (the means of two groups are equal) is tested in a two-tailed test against the alternative hypothesis that the two means differ [1], [2].
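As a quick check of the count m = k(k-1)/2, the pairs can be enumerated directly. This is only an illustrative sketch: the group labels are hypothetical, and the code does not call scikit_posthocs.

```python
from itertools import combinations

# Hypothetical group labels, for illustration only.
groups = ["a", "b", "c", "d"]
k = len(groups)

# Every unordered pair of groups corresponds to one hypothesis.
pairs = list(combinations(groups, 2))
m = k * (k - 1) // 2

print(m)      # number of pairwise hypotheses
print(pairs)  # the pairs themselves
```

The count grows quadratically with the number of groups, which is why an all-pairs procedure is used rather than a series of unadjusted two-sample tests.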

Parameters:
  • a (array_like or pandas DataFrame object) – An array, any object exposing the array interface or a pandas DataFrame.
  • val_col (str, optional) – Name of a DataFrame column that contains dependent variable values (test or response variable). Values should have a non-nominal scale. Must be specified if a is a pandas DataFrame object.
  • group_col (str, optional) – Name of a DataFrame column that contains independent variable values (grouping or predictor variable). Values should have a nominal scale (categorical). Must be specified if a is a pandas DataFrame object.
  • sort (bool, optional) – If True, sort data by the group column.
Returns:

result – P values for all pairwise comparisons, indexed by group on both rows and columns.

Return type:

pandas DataFrame

Notes

The p values are computed from the Tukey distribution (the distribution of the studentized range).
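The following sketch shows one way such p values can be obtained from the studentized range distribution. It is not the library's implementation: the helper name tukey_pvalues is hypothetical, and the code assumes the standard Tukey-Kramer form of the statistic with a pooled within-group variance, using scipy.stats.studentized_range (available in SciPy 1.7 and later).

```python
from itertools import combinations

import numpy as np
from scipy.stats import studentized_range


def tukey_pvalues(groups):
    """Pairwise Tukey p values for a dict mapping group label -> sample values.

    Hypothetical helper for illustration; not part of scikit_posthocs.
    """
    labels = list(groups)
    samples = [np.asarray(groups[g], dtype=float) for g in labels]
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    df = n_total - k  # within-group degrees of freedom

    # Pooled within-group variance (mean squared error of the one-way ANOVA).
    mse = sum(((s - s.mean()) ** 2).sum() for s in samples) / df

    pvals = {}
    for i, j in combinations(range(k), 2):
        diff = abs(samples[i].mean() - samples[j].mean())
        # Tukey-Kramer standard error, which also handles unequal group sizes.
        se = np.sqrt(mse / 2.0 * (1.0 / len(samples[i]) + 1.0 / len(samples[j])))
        q = diff / se  # studentized range statistic
        # Upper-tail probability of the studentized range distribution.
        pvals[(labels[i], labels[j])] = float(studentized_range.sf(q, k, df))
    return pvals


# Same data as in the Examples section of this page.
data = {
    "a": [1, 2, 3, 5, 1],
    "b": [12, 31, 54, 62, 12],
    "c": [10, 12, 6, 74, 11],
}
for pair, p in tukey_pvalues(data).items():
    print(pair, round(p, 4))
```

With equal group sizes every pair shares the same standard error, so the pair with the largest difference in means has the smallest p value.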

References

[1] L. Sachs (1997) Angewandte Statistik, New York: Springer.
[2] J. Tukey (1949) Comparing Individual Means in the Analysis of Variance, Biometrics 5, 99-114.

Examples

>>> import scikit_posthocs as sp
>>> import pandas as pd
>>> x = pd.DataFrame({"a": [1,2,3,5,1], "b": [12,31,54,62,12], "c": [10,12,6,74,11]})
>>> x = x.melt(var_name='groups', value_name='values')
>>> sp.posthoc_tukey(x, val_col='values', group_col='groups')