Tests for More Than Two Independent Samples
Kruskal-Wallis H, Median, and Jonckheere-Terpstra Tests
The tests in this section assess whether one can reject the null hypothesis that two or more independent samples come from the same underlying population distribution. In SPSS, exact and Monte Carlo significance estimates for these tests, and the Jonckheere-Terpstra test itself, require that the Exact Tests add-on module be installed.
Key Concepts and Terms
- Independent samples. Samples are independent if the response of the nth person in the second sample is not a function of the response of the nth person in the first sample. Independent samples are also called uncorrelated samples and unrelated samples. Samples which are not independent include before-after and panel studies of the same people, or matched-pairs studies of similar people.
- Type of significance estimate. The Exact button in the SPSS dialog allows the researcher to select among asymptotic, exact, or Monte Carlo estimates of the significance of the particular test value. These three types of estimates are discussed separately in the section on significance testing. The Exact button is available only when the SPSS Exact Tests add-on module is installed.
- Kruskal-Wallis H Test. This extension of the Mann-Whitney U test to multiple samples is a nonparametric alternative to one-way analysis of variance. It tests the null hypothesis that the samples do not differ in mean rank for the criterion variable. Because it takes rank size into account rather than just the above-below dichotomy of the median test, discussed below, it is more powerful and preferable when its assumptions are met.
- In SPSS, select Analyze, Nonparametric Tests, K Independent Samples; select your variables; select the Grouping Variable; click Define Range and set the min and max; Continue; in the Test Type group, select Kruskal-Wallis H.
- Computation. Kruskal-Wallis H is calculated on the basis of sums of ranks for combined groups. Data from all samples are ordered as in the Wald-Wolfowitz or Mann-Whitney U tests. Let n_i be the number of observations in sample i and let N be the number of observations in all samples combined. The data are pooled and renumbered from 1 to N, with 1 corresponding to the lowest score. Using these rank scores, data for each sample are listed by rank. These rank scores are added up for each sample, and the sums are the T_i scores in the formula below. Also, let k be the number of samples. Then Kruskal-Wallis H is computed as:
$$H = \frac{12}{N(N + 1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(N + 1)$$
degrees of freedom = k - 1
The Kruskal-Wallis H test should not be used when the number of ties is large. For modest numbers of ties, H may be adjusted by a correction factor. This is done by dividing H by the factor shown below, where t_j is the number of observations tied at the j-th tied value in the pooled data and the sum in the numerator is taken over all groups of tied values:
$$H_{adjusted} = \frac{H}{1 - \frac{\sum_j (t_j^3 - t_j)}{N^3 - N}}$$
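The computation above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names and the example data are hypothetical, tied scores are given the average of the ranks they span, and tie groups are taken to be sets of equal values in the pooled data. The critical value 5.991 is the tabled chi-square value for 2 degrees of freedom at the .05 level.

```python
from collections import Counter


def average_ranks(values):
    """Rank pooled values from 1 to N; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(samples):
    """Return (H, tie-adjusted H, degrees of freedom) for a list of samples."""
    pooled = [x for sample in samples for x in sample]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Accumulate T_i^2 / n_i, where T_i is the rank sum of sample i.
    sum_term = 0.0
    start = 0
    for sample in samples:
        n_i = len(sample)
        t_i = sum(ranks[start:start + n_i])
        sum_term += t_i ** 2 / n_i
        start += n_i

    h = 12.0 / (n_total * (n_total + 1)) * sum_term - 3 * (n_total + 1)

    # Tie correction: t is the size of each group of equal pooled values.
    # Untied values contribute 1**3 - 1 = 0. If every value is identical the
    # correction factor is zero and the adjusted H is undefined.
    tie_sum = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_sum / (n_total ** 3 - n_total)
    return h, h / correction, len(samples) - 1


# Hypothetical scores for three independent samples.
h, h_adjusted, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
CRITICAL_CHI2_DF2_05 = 5.991  # tabled chi-square value, df = 2, alpha = .05
print(h, h_adjusted, df, h_adjusted >= CRITICAL_CHI2_DF2_05)
```

For the example data the rank sums are 6, 15, and 24, so H is approximately 7.2 with 2 degrees of freedom, which exceeds the tabled critical value.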
- Interpretation. H is distributed approximately as chi-square. The researcher calculates H as above and then consults a chi-square table with (k - 1) degrees of freedom. If the critical value of chi-square for the desired significance level (typically .05) is equal to or less than the computed H value, then the researcher rejects the null hypothesis that the samples do not differ on the criterion variable. SPSS prints the corresponding significance value directly.
- Median Test. Also called the Westenberg-Mood median test, this is a more general but less powerful alternative to the Kruskal-Wallis H test for testing if several independent samples come from the same population. It tests whether two or more independent samples differ in their median values for a criterion variable.
- In SPSS, select Analyze, Nonparametric Tests, K Independent Samples; select your variables; select the Grouping Variable; Check Define Range and set the min and max; Continue; in the Test Type group, select Median.
- Computation. The samples are combined temporarily to determine their pooled median value. A table can then be constructed in which the columns are the samples and the two rows reflect the sample counts above the pooled median value versus at or below it. The significance of the table is then calculated using chi-square.
- Jonckheere-Terpstra Test. This test for differences among several independent samples is more powerful than the Kruskal-Wallis H or median tests when an ordered alternative is appropriate. However, it requires that the independent samples be ordinally arranged on the criterion variable (e.g., city samples arranged by welfare caseload per 10,000 population, where this is the variable of interest). The J-T test tests the null hypothesis of no difference among samples against the ordered alternative that, as one moves from samples low on the criterion to samples high on the criterion, the within-sample magnitude of the criterion variable increases.
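The text above does not give the Jonckheere-Terpstra computation, so the following Python sketch relies on the standard textbook definition as an assumption. J counts, over every pair of samples taken in their hypothesized order, how often a score in the earlier sample falls below a score in the later sample, with ties counted as one half. The significance estimate is the large-sample normal approximation without a tie correction, not the exact estimate offered by SPSS. The function name and data are hypothetical.

```python
import math


def jonckheere_terpstra(ordered_samples):
    """Return (J, z, one-sided p) for samples listed in their hypothesized order."""
    k = len(ordered_samples)
    j_stat = 0.0
    # Sum the Mann-Whitney-style counts over all pairs of samples (a before b).
    for a in range(k - 1):
        for b in range(a + 1, k):
            for x in ordered_samples[a]:
                for y in ordered_samples[b]:
                    if x < y:
                        j_stat += 1.0
                    elif x == y:
                        j_stat += 0.5  # ties count as half a pair

    sizes = [len(s) for s in ordered_samples]
    n_total = sum(sizes)
    # Mean and variance of J under the null hypothesis (no tie correction).
    mean_j = (n_total ** 2 - sum(n ** 2 for n in sizes)) / 4
    var_j = (n_total ** 2 * (2 * n_total + 3)
             - sum(n ** 2 * (2 * n + 3) for n in sizes)) / 72
    z = (j_stat - mean_j) / math.sqrt(var_j)
    # Upper-tail probability of the standard normal distribution.
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))
    return j_stat, z, p_upper


# Hypothetical scores, with samples listed from low to high on the ordering.
j_stat, z, p_upper = jonckheere_terpstra([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(j_stat, z, p_upper)
```

For the example data every cross-sample pair is in the hypothesized order, so J is 27 against a null mean of 13.5, giving a z of 3.0. With samples this small an exact estimate would be preferable to the normal approximation.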
Assumptions
- Random sampling is assumed, as with all significance tests.
- Independent samples are also assumed.
- Independent observations. Within each sample, the response of the nth subject is not dependent on the response of previous subjects, for all cases.
- Data distribution. All tests in this section are nonparametric, not assuming the normal distribution or equal group variances. It is assumed that the distribution in each sample is similar in shape. If the researcher can assume a normal distribution, parametric tests (t-tests for two samples, one-way analysis of variance for more) are preferable since they can detect true differences between groups with a smaller sample size than the nonparametric tests in this section.
- Data level. These tests assume the data are ordinal or higher. The Jonckheere-Terpstra test also assumes the samples are ordinally arranged (ascending or descending) on the criterion variable. The Kruskal-Wallis H test assumes data have a continuous distribution in the population from which they were sampled.
- Adequate cell size. For the median test, authors differ on the minimum cell count: some require 5 or more, some require more than 5, and others require 10 or more. A common rule is a count of 5 or more in all cells of a 2-by-2 table, and 5 or more in 80% of cells in larger tables, with no cells having a zero count.
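The median test table and the cell-size rule can be illustrated with a short Python sketch. It is a sketch under stated assumptions: the function names and data are hypothetical, scores equal to the pooled median are counted in the "at or below" row, and the 5-or-more rule is applied to expected rather than observed counts, which is one common reading of the rule.

```python
import statistics


def median_test_table(samples):
    """Build the 2-by-k table: counts above vs. at or below the pooled median."""
    pooled = [x for sample in samples for x in sample]
    pooled_median = statistics.median(pooled)
    above = [sum(1 for x in s if x > pooled_median) for s in samples]
    at_or_below = [len(s) - a for s, a in zip(samples, above)]
    return [above, at_or_below]


def chi_square(table):
    """Return (chi-square, degrees of freedom, expected counts) for a table.

    Assumes no row or column total is zero; otherwise an expected count
    is zero and the statistic is undefined.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, expected


def share_of_cells_at_least(expected, minimum=5):
    """Proportion of cells whose expected count is at least `minimum`."""
    cells = [e for row in expected for e in row]
    return sum(1 for e in cells if e >= minimum) / len(cells)


# Hypothetical scores for three independent samples of ten cases each.
samples = [list(range(1, 11)), list(range(6, 16)), list(range(11, 21))]
table = median_test_table(samples)
stat, df, expected = chi_square(table)
print(table, stat, df, share_of_cells_at_least(expected))
```

The resulting chi-square value is compared with the tabled critical value for k - 1 degrees of freedom, where k is the number of samples. A share below 0.8 from the last function would signal that the common rule for larger tables is not met.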
Frequently Asked Questions
- Where are these tests located in SPSS?
From the SPSS menu, select Analyze (Statistics in older versions of SPSS), Nonparametric Tests, K Independent Samples. In the "Tests for Several Independent Samples" dialog box, select the "Test Type" you want: Kruskal-Wallis H, Median, or Jonckheere-Terpstra. From the variable picklist, enter the criterion variable in the "Test Variable List:" box. Then enter a grouping variable in the "Grouping Variable:" box and click on "Define Range" to enter the minimum and maximum values. Note that the J-T test is available only when the SPSS Exact Tests module is installed.
Bibliography
- Garson, G. David (1976). Political science methods. Boston: Holbrook Press. Or consult any statistics text which covers nonparametric tests.
- Levin, Irwin P. (1999). Relating statistics and experimental design. Thousand Oaks, CA: Sage Publications. Quantitative Applications in the Social Sciences series #125. Elementary introduction covers t-tests and various simple ANOVA designs, with some additional discussion of chi-square, significance tests for correlation and regression, and nonparametric tests such as the median test and the Mann-Whitney U test.