Hypothesis Testing


Hypothesis testing: making an inference about how the value of a parameter relates to a specific numerical value: Is it less than, equal to, or greater than the specified number?

Example: you want to determine whether the mean blood alcohol level exceeds the legal limit after two drinks.

Elements of hypothesis testing:

  1. Null hypothesis (H0): A statement about the values of one or more population parameters, based on observation and prior knowledge.
  2. Alternative hypothesis (HA): A statement that contradicts the null hypothesis, based on observation and prior knowledge.
  3. Test statistic: A sample statistic used to decide whether to reject the null hypothesis.
  4. Rejection region: The numerical values of the test statistic for which H0 will be rejected. The values of the test statistic making up the rejection region are those that are unlikely to occur if H0 is true, while the values making up the acceptance region are likely to occur if H0 is true.

  5. Level of significance (alpha): The area under the curve of the distribution of the test statistic that lies over the rejection region on the horizontal axis. Alpha is a probability: the probability of rejecting a true null hypothesis. Commonly used values are alpha = 0.01, 0.05, and 0.10.

     Rejecting a true null hypothesis is called a Type I error; its probability is alpha.

     Failing to reject (accepting) a false null hypothesis is called a Type II error; its probability is beta.

  6. Calculation of test statistic: the numerical value of the test statistic is determined.
  7. Conclusion: If the numerical value of the test statistic falls in the rejection region, H0 is rejected; otherwise, H0 is not rejected.
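The elements above can be sketched in a short simulation. This is only an illustration under stated assumptions: it uses a one-sided z test for a mean (a test these notes do not otherwise cover), and the values MU_0, SIGMA, and N are hypothetical, not real blood alcohol data. Because every sample is drawn with H0 true, the fraction of rejections estimates the Type I error rate, which should come out close to alpha.

```python
import math
import random
from statistics import NormalDist

ALPHA = 0.05   # level of significance
MU_0 = 0.08    # hypothetical value stated in H0 (mean = MU_0)
SIGMA = 0.02   # hypothetical population standard deviation, assumed known
N = 25         # hypothetical sample size

# Rejection region for HA: mean > MU_0 is z > Z_CRITICAL,
# where the upper-tail area beyond Z_CRITICAL equals ALPHA.
Z_CRITICAL = NormalDist().inv_cdf(1 - ALPHA)


def z_statistic(sample):
    """Test statistic: standardized distance of the sample mean from MU_0."""
    sample_mean = sum(sample) / len(sample)
    return (sample_mean - MU_0) / (SIGMA / math.sqrt(len(sample)))


def simulated_type_i_rate(trials=5000, seed=0):
    """Fraction of samples, drawn with H0 true, for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(MU_0, SIGMA) for _ in range(N)]
        if z_statistic(sample) > Z_CRITICAL:  # falls in the rejection region
            rejections += 1
    return rejections / trials


rate = simulated_type_i_rate()
print(f"critical z = {Z_CRITICAL:.3f}, simulated Type I error rate = {rate:.3f}")
```

Lowering ALPHA to 0.01 moves the critical value further into the tail, so a true H0 is rejected less often.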


CHI-SQUARED TEST



1. Test of goodness-of-fit

Objective: test agreement between the observed data and the theoretical (expected) data.

Hypothesis: H0: there is no difference between the observed and expected data.

HA: there is a difference between the observed and expected data.

Test statistic: X^2 = Sum((Observed - Expected)^2 / Expected)

d.f. = n - 1, n = number of classes (categories)

Decision rule: if X^2 calculated > X^2 critical, reject H0; otherwise, do not reject (accept) H0.
 
Example: Mendel’s pea plants experiment (monohybrid cross)

H0: there is no difference between the observed and expected data.

HA: there is a difference between the observed and expected data.
 
 
 
Phenotype (class)         Observed (O)   Expected (E)   O - E   (O - E)^2 / E
Purple flower (PP, Pp)    705            697            +8      0.09
White flower (pp)         224            232            -8      0.28
Total                     929            929             0      X^2 = 0.37

 

d.f. = n - 1 = 2 - 1 = 1 (n = 2 phenotype classes)

X^2 critical = X^2 (alpha = 0.05, d.f. = 1) = 3.84

X^2 = 0.37 < 3.84, so the null hypothesis is not rejected (there is a good fit between the observed and expected data).
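The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch, not part of the original notes: the function name chi_square_gof is made up for illustration, the critical value 3.84 is taken from the chi-squared table, and the p-value line relies on a shortcut that holds only when d.f. = 1.

```python
import math


def chi_square_gof(observed, expected):
    """Return (X^2, d.f.) for a goodness-of-fit test over matching classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per class")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of classes minus 1
    return stat, df


# Mendel's monohybrid cross: purple flowers, white flowers
observed = [705, 224]
expected = [697, 232]

stat, df = chi_square_gof(observed, expected)

CRITICAL_05_DF1 = 3.84  # table value for alpha = 0.05, d.f. = 1
reject = stat > CRITICAL_05_DF1

# With d.f. = 1 only, X^2 is a squared standard normal variable, so the
# upper-tail probability can be written with the complementary error function.
p_value = math.erfc(math.sqrt(stat / 2))

print(f"X^2 = {stat:.2f}, d.f. = {df}, p = {p_value:.2f}, reject H0: {reject}")
```

A p-value larger than alpha leads to the same conclusion as comparing X^2 with the critical value.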

Note: in this type of test, only one variable or factor is involved.
 
 

2. Test for independence

Objective: test the independence of two classification variables or factors.

Hypothesis: H0: there is no relationship between the two variables (independent, no association)

HA: there is a relationship between the two variables (dependent)

Test statistic: X^2 = Sum((Observed - Expected)^2 / Expected)

Expected count for each cell = (row total x column total) / grand total

d.f. = (r - 1)(c - 1)

r = no. of rows, c = no. of columns

Decision rule: if X^2 calculated > X^2 critical, reject H0; otherwise, do not reject (accept) H0.

Example: study the relationship between blood type and severity of a certain condition in a population of 1500 (Danel, 1974).
 
Condition   A     B     AB    O     Total
Absent      543   211   90    476   1320
Mild        44    22    8     31    105
Severe      28    9     7     31    75
Total       615   242   105   538   1500

 

Use StatView to calculate the X^2 value for this example.
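If StatView is not available, the same calculation can be sketched in Python. This is an illustrative sketch rather than the method used in the course: the function name chi_square_independence is made up, each expected count is assumed to be (row total x column total) / grand total, and the critical value 12.59 is the table value for alpha = 0.05 and d.f. = 6.

```python
def chi_square_independence(table):
    """Return (X^2, d.f.) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables are independent
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
    return stat, df


# Rows: Absent, Mild, Severe; columns: blood types A, B, AB, O
table = [
    [543, 211, 90, 476],
    [44, 22, 8, 31],
    [28, 9, 7, 31],
]

stat, df = chi_square_independence(table)

CRITICAL_05_DF6 = 12.59  # table value for alpha = 0.05, d.f. = 6
reject_h0 = stat > CRITICAL_05_DF6

print(f"X^2 = {stat:.2f}, d.f. = {df}, reject H0: {reject_h0}")
```

The totals row and column are left out of the input on purpose, since the function recomputes them from the cell counts.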

Written by Dr. Kate He, October 2001