scikit_posthocs.posthoc_tukey_hsd

scikit_posthocs.posthoc_tukey_hsd(a: list | ndarray | DataFrame, val_col: str | None = None, group_col: str | None = None, sort: bool = True) → DataFrame

Pairwise comparisons with Tukey HSD confidence intervals. This is a convenience wrapper around the statsmodels pairwise_tukeyhsd method that returns the results in a form that is easier to use in further analysis.

Parameters:

x (array_like or pandas Series object, 1d) – An array, or any object exposing the array interface, containing the dependent variable values (test or response variable). Values should be on a non-nominal scale. NaN values will cause an error and must be handled manually beforehand.
g (array_like or pandas Series object, 1d) – An array, or any object exposing the array interface, containing the independent variable values (grouping or predictor variable). Values should be on a nominal (categorical) scale.
alpha (float, optional) – Significance level for the test. Default is 0.05.

Returns:

result – DataFrame with 0, 1, and -1 values, where 0 is False (not significant), 1 is True (significant), and -1 is for diagonal elements.

Return type:

pandas.DataFrame
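
To make the 0 / 1 / -1 encoding concrete, the following is a minimal sketch of how such a matrix could be assembled directly from statsmodels. It is not the library's own implementation: the helper name tukey_matrix is hypothetical, and the sketch assumes that statsmodels reports the pairwise decisions in upper-triangle order of the sorted unique group labels.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd


def tukey_matrix(values, groups, alpha=0.05):
    """Hypothetical helper: encode Tukey HSD decisions as 1 / 0 / -1."""
    res = pairwise_tukeyhsd(
        np.asarray(values, dtype=float), np.asarray(groups), alpha=alpha
    )
    labels = list(res.groupsunique)
    k = len(labels)
    mat = np.zeros((k, k), dtype=int)
    # Assumption: res.reject follows the pair order (0,1), (0,2), ..., (k-2,k-1).
    rows, cols = np.triu_indices(k, 1)
    mat[rows, cols] = res.reject.astype(int)
    # Mirror the upper triangle so the matrix is symmetric.
    mat[cols, rows] = mat[rows, cols]
    # Mark the diagonal (a group compared with itself) with -1.
    np.fill_diagonal(mat, -1)
    return pd.DataFrame(mat, index=labels, columns=labels)


# Same data as in the Examples section.
values = [1, 2, 3, 4, 5, 35, 31, 75, 40, 21, 10, 6, 9, 6, 1]
groups = ["a"] * 5 + ["b"] * 5 + ["c"] * 5
matrix = tukey_matrix(values, groups)
print(matrix)
```

Each off-diagonal cell answers whether the null hypothesis of equal means is rejected for that pair of groups at the chosen alpha.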

Examples

>>> import numpy as np
>>> import scikit_posthocs as sp
>>> x = [[1,2,3,4,5], [35,31,75,40,21], [10,6,9,6,1]]
>>> g = [['a'] * 5, ['b'] * 5, ['c'] * 5]
>>> sp.posthoc_tukey_hsd(np.concatenate(x), np.concatenate(g))
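
Once a result matrix is available, a common next step is to list the pairs flagged as significant. The sketch below shows one way to do that using pandas only. The DataFrame here is a hand-written stand-in that follows the documented encoding, not output captured from the function, and significant_pairs is a hypothetical helper name.

```python
import pandas as pd

# Hand-written stand-in in the documented encoding:
# 1 = significant, 0 = not significant, -1 = diagonal.
result = pd.DataFrame(
    [[-1, 1, 0],
     [1, -1, 1],
     [0, 1, -1]],
    index=["a", "b", "c"],
    columns=["a", "b", "c"],
)


def significant_pairs(matrix):
    """Hypothetical helper: return each unordered pair flagged 1, once."""
    labels = list(matrix.index)
    pairs = []
    for i, first in enumerate(labels):
        # Only look above the diagonal so each pair is reported once.
        for second in labels[i + 1:]:
            if matrix.loc[first, second] == 1:
                pairs.append((first, second))
    return pairs


pairs = significant_pairs(result)
print(pairs)
```

Scanning only the cells above the diagonal avoids reporting the same pair twice and skips the -1 diagonal entries.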