OSH 637, Biostatistics & Probability


 1. Determine the number and the width of the intervals required
    to group 186 different observations with a range of 55.
 2. Given the following data:
     Class Intervals    Frequency
         10-19              8
         20-29             18
         30-39              9
         40-49             14
         50-59             10
         60-69              7
         70-79              5
         80-89              3
         90-99              1
    Calculate the cumulative frequency percent and the median.
    State your assumptions if any.
 3. Find the value of Y in the following equation:
    2,186,767,328.00 = 12(Y+5)
 4. The following data represent the concentration of CO in ppm
    collected from incomplete combustion in 10 different
    incinerators.  These concentrations are:
      [300, 340, 550, 128, 300, 440, 600, 550, 550, and 296]
    a. Find the following numerical summary measures,
        - mean
        - median
        - mode
        - range
        - standard deviation
        - geometric mean
    b. Assume that these incinerators were randomly selected from
       a normal population; estimate the percentage of all the
       incinerators that would have CO concentrations between
       250 and 560 ppm.
 5. Determine the odds in favor of an event having a probability of
    occurrence equal to 4/32.
 6. Prove that nCr = nPr/r!.
 7. Calculate 8P3,2,1,0.
 8. Determine:
    a) The probability of obtaining 4 heads and 2 tails on a
       single toss of 6 coins.
    b) The probability when each coin was tossed separately.
    c) The probability of obtaining exactly (H,H,T,T,H,H).
    d) What other combination/s of heads and tails would have the
       same probability calculated in (a) above?
 9. Let (A) represent the event that a particular individual is
    exposed to high levels of carbon monoxide and (B) the event
    that he or she is exposed to high levels of nitrogen dioxide.
    a) What is the event A ∩ B?
    b) What is the event A ∪ B?
    c) What is the complement of A?
    d) Are the events A and B mutually exclusive?
10. You were asked to count and size fibers from 100 slides.
    40% of all slides contain asbestos fibers, and the rest are
    ceramic fibers.  You also know that 50% of all the asbestos
    slides and 30% of all the ceramic slides are labeled as
    "respirable fibers".  All slides are placed face down.  What is
    the probability of randomly selecting an asbestos slide given
    that it is of respirable size?
11. Find the mean and the variance of a binomial distribution with
    n = 100 and a 50% probability.
12. Consider a group of seven people selected from a certain age
    group.  The number of persons in this sample who may suffer from
    diabetes is a binomial random variable with parameters, n = 7 and
    probability = 0.125.
    a) If you wish to make a list of all seven persons, in how many
       ways can they be ordered?
    b) Without regard to order, in how many ways can you select four
       individuals from this group of seven?
    c) What is the probability that four of them have diabetes?
    d) What is the probability that at least four of them have
       diabetes?
13. A diagnostic procedure is known to have an average of five
    failures per day.  What is the probability that, on any random
    day, there will be at least one failure?
14. In terms of μ, σ and probability (p), state the equations for
    the mean and the variance of a discrete random variable.
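
A minimal sketch of how problems 10, 12(c) and 13 in the list above might be checked numerically, assuming Python 3.8 or later. The function names are illustrative only, and treating problem 13 as a Poisson count is an assumption, since the problem does not name the distribution.

```python
from math import comb, exp


def bayes_posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) when the population splits into H and not-H."""
    joint_h = prior * p_e_given_h
    joint_not_h = (1 - prior) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)


def binomial_pmf(k, n, p):
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_at_least_one(rate):
    """Return P(X >= 1) for X ~ Poisson(rate); the Poisson model is assumed."""
    return 1 - exp(-rate)


# Problem 10: 40% asbestos slides; 50% of asbestos and 30% of ceramic
# slides are labeled respirable.
p_asbestos_given_respirable = bayes_posterior(0.40, 0.50, 0.30)

# Problem 12(c): exactly four of seven people, each with probability 0.125.
p_four_of_seven = binomial_pmf(4, 7, 0.125)

# Problem 13: at least one failure when the daily average is five.
p_any_failure = poisson_at_least_one(5)

print(p_asbestos_given_respirable, p_four_of_seven, p_any_failure)
```

The Bayes helper only handles a two-way split (asbestos versus ceramic), which is all problem 10 needs.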

 1. Consider the standard normal distribution with mean (μ) = 0
    and standard deviation (σ) = 1.
    a) What is the probability that z > 2.6?
    b) What is the probability that z < 1.3?
    c) What is the probability that z is between -1.7 and 3.1?
    d) What value of z cuts off the upper 15% of the standard
       normal distribution?
    e) What value of z marks the lower 20% of the distribution?
 2. The mean and the variance of the scores on an exam are 75 and
    16 respectively.  What is the probability of obtaining a grade
    between 70 and 82 inclusive?  State your assumptions.
 3. The diameters of respirable fibers range between 0.3 μm and
    3.5 μm with a mean = 2.6 and a variance = 1.  What proportion
    of these fibers would you expect to be greater than 2.0 μm?
 4. Assume a binomial distribution with 15 trials each having a 50%
    chance of occurring.  Calculate the probability that the number
    of successes is greater than 6 and less than or equal to 10.
 5. Given (p) = 0.5 and (n) = 20, calculate the probability of  
    randomly selecting a number that is greater than or equal to 8 
    and strictly less than 13.
 6. To use the normal approximation of the binomial, (n) must at
    least be 30.  Estimate (n) for an event that has a distribution
    probability neither smaller than 20% nor greater than 45% to
 7. Calculate the probability that the mean of a sample of 40
    observations, representing a population with μ = 20 and σ² = 100,
    lies between 18 and 22.
 8. A sample of 144 mice indicated that the average number of fibers
    retained in the lungs from a one-hour exposure to asbestos dust in
    a certain facility was 31,000 fibers.  The population standard
    deviation for such exposure is approximately 6,000 fibers.
    What is the 99% confidence interval estimate for μ, the unknown
    universal mean?
 9. Twelve companies indicated that the average numbers of
    alcohol-related accidents per year that occur at their plants are:
    10, 7, 12, 11, 10, 10, 8, 4, 8, 8, 9, and 7.
    Calculate the 95% confidence interval for μ, the population
    mean, for alcohol-related accidents.
10. A research laboratory wishes to purchase a centrifuge to separate
    red blood cells from white cells.  A sample of 300 centrifuged
    tubes indicated 75 faulty results.  What is the 95% confidence
    interval for the true proportion (p) of faulty results produced
    by this centrifuge?
11. A manufacturing company would like to design pumps with suction
    power of 37 l/min.  A sample of 12 pumps indicated the following:
    38, 44, 42, 33, 38, 38, 37, 37, 37, 44, 30, and 40 l/min. suction.
    Calculate the 95% confidence interval on the overall variance of
    the pumps' suction power.
12. Write the equation for estimating the sample size (n) given the
    length of the confidence interval, when (σ) is not known.  Define
    all the variables and state its shortcoming.
13. Thirty-five samples of carbon monoxide collected from an office
    building indicated the following concentrations in ppm.
         Concentration    Frequency
             00-110           5
           >110-120           4
           >120-130           8
           >130-140           6
           >140-150           5
           >150-160           3
           >160-170           2
           >170-180           1
             >180             1
    Find the mean, the standard deviation, the standard error of
    both the mean and the standard deviation, and construct the
    95% confidence envelope.
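
A minimal sketch, assuming Python 3.8 or later, of how the standard normal lookups in problem 1 and the known-σ confidence interval in problem 8 of the list above might be checked. The helper name `z_interval` is illustrative, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Problem 1(a): upper-tail probability P(z > 2.6).
p_z_above_2_6 = 1 - Z.cdf(2.6)

# Problem 1(d): the z value that cuts off the upper 15%.
z_upper_15 = Z.inv_cdf(0.85)


def z_interval(sample_mean, sigma, n, confidence):
    """Two-sided confidence interval for the mean when sigma is known."""
    z_crit = Z.inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Problem 8: n = 144, sample mean 31,000 fibers, sigma about 6,000 fibers.
low, high = z_interval(31_000, 6_000, 144, 0.99)

print(p_z_above_2_6, z_upper_15, low, high)
```

The interval helper assumes the sample mean is normally distributed with a known population standard deviation; problems with an unknown σ and a small sample would need the t distribution instead.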

 1. A diffusion cell is designed to deliver low molecular weight
    agents at a rate of 6 l/minute with a variance of 0.25.  To
    verify this statement, this cell was tested using ten different
    low molecular weight gases.  The average delivery was computed
    to be 5.5 l/minute.  With 95% confidence, can we conclude that
    this diffusion cell is biased & its performance is dependent on
    the molecular weight?
 2. A certain drug is claimed to reduce the sugar level in diabetics
    by more than 50 units with a variance of 16.  A randomly selected
    group of 30 people taking this drug was screened to measure their
    sugar levels.  It was found that the average drop in the sugar
    level was approximately 48 units.  Can we conclude that this drug
    is not as efficient as it was claimed to be, or this average of
    48 units was obtained just by chance?  Test this hypothesis with
    95% confidence.
 3. The manufacturers of the drug in the above problem were not very
    happy with this result.  To back-up their claim, they conducted
    their own study using a randomly selected group of ten people.
    Their results were identical to those obtained above.  With the
    same confidence, can they continue to claim the effectiveness of
    this drug and why?
 4. A safety group wishes to maintain the reputable standards of
    their on-site training procedures.  To do so, they normally
    demonstrate the efficiency of their fire extinguishers, claiming
    that their equipment, on the average, can put out a fire in a
    short period of time with a variance not exceeding 70 seconds².  Twenty
    of those extinguishers were purchased and tested, and the results
    indicated an average time of 4 minutes/fire for all extinguishers
    with a variance of 100 seconds².  Can we conclude that the obtained
    variance is significantly different from the variance claimed?
    Test this hypothesis at 0.05 level of significance.  State your
    assumptions if any.
 5. The average age of patients with mesothelioma, an asbestos-related
    cancer, is 62 years.  A recent study of 30 cases indicated that
    their average age was 60 years with a standard deviation of
    approximately 4.  Can we conclude that the information we had,
    prior to this recent study, was fairly accurate at the 0.05 level
    of significance?
 6. The average age of patients with mesothelioma was believed to be
    at least 62 years.  A recent study of 30 cases indicated that
    their average age was 60 years with a standard deviation of
    approximately 4.  Can we conclude that the information we had,
    prior to this recent study, was fairly accurate at the 0.05 level
    of significance?
 7. An industry with fairly strict safety & hygiene standards claims
    that accidents in its numerous plants never exceed 0.02 of all
    its employees in any one year.  To test this claim, the local
    department of health and safety monitored its records for one
    full year and documented 80 accidents among a total of 2500
    employees.  At α = 0.05, test the proper hypothesis.
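
A minimal sketch, assuming Python 3.8 or later, of the z statistics behind problems 1 and 7 in the list above. The function names are illustrative, and the choice of a two-sided test for problem 1 and an upper-tail test for problem 7 is an assumption about how the hypotheses are stated.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def z_test_mean(sample_mean, mu0, sigma, n):
    """Two-sided z test for a mean with known population sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_two_sided = 2 * Z.cdf(-abs(z))
    return z, p_two_sided


def z_test_proportion_upper(successes, n, p0):
    """Upper-tail z test of H0: p <= p0 using the normal approximation."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z, 1 - Z.cdf(z)


# Problem 1: claimed mean 6 l/minute, variance 0.25, n = 10, observed 5.5.
z_cell, p_cell = z_test_mean(5.5, 6.0, sqrt(0.25), 10)

# Problem 7: 80 accidents among 2500 employees against a claimed 0.02.
z_acc, p_acc = z_test_proportion_upper(80, 2500, 0.02)

print(z_cell, p_cell, z_acc, p_acc)
```

Comparing each p-value with the stated significance level (0.05 in both problems) gives the same decision as comparing z with the tabulated critical value.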

 1. Two samples representing two populations (X) & (Y) indicate the
    following results:
                  X          Y   
         n        16         24
         mean    575        550
         μ       590        570
         σ        05         06
         S        08         08
    At an (α) = 0.01 level of significance, determine if the two
    populations are significantly different.  State all necessary
    assumptions.
 2. Women are believed to observe & respect safety instructions more
    than men do.  Universal testing revealed that women, on the
    average, score at least 20 points higher than men.  The variances
    for both women and men are 12 and 15 respectively.  Samples from
    both sexes were selected and tested.  The results were in agreement
    with those universally observed; the women scored an average of
    84 while the men scored only 65.  Twenty were sampled from each
    sex.  Test the appropriate hypothesis with 98% confidence.  State
    all your assumptions.
 3. A company makes filters and filter cassettes for particulate
    sampling.  For best results, the diameters of both must be
    compatible with tolerance for very little or no difference.
    Twenty units were randomly selected from the cassettes with a known
    variance of 0.08 cm² and we obtained a mean diameter of 5 cm.
    Sampling 15 units from the filters' population, which has a variance
    of 0.07 cm², yielded an average diameter of 4.7 cm.
    Find the 99% confidence interval on the difference of the two means,
    and state all the appropriate hypotheses with their respective
    rejection regions.  State all necessary assumptions.
 4. A company claims that their new on-site training program cut the
    variance of accidents among confined space workers by 60% when
    compared to the old program.  Two groups, 21 each, were trained
    separately utilizing the new program for one and the old program
    for the other.  The variances observed were 100 and 380 for the
    new and old programs respectively.  Can we conclude that the new
    program did not cut the variance by 60 percent?  Test this
    hypothesis with 99% confidence.
 5. A company claims that the variance of accidents related to
    industrial heat welding is significantly lower than the variance
    obtained from accidents related to laser welding.  The variance
    obtained from a sample of 51 heat welding workers was 30 while the
    variance from 41 laser welding workers was 25.  Can we, at
    (α) = 0.05, conclude that the heat welding variance is indeed less
    than that of the laser welding?
 6. Two injection pumps are believed to be highly compatible and
    currently available on the market.  To properly decide on which
    type to purchase, we sampled a few of each type.  The data were as
    follows:
                                    Type (A)   Type (B)
      number of pumps sampled          10         10
      mean delivery in (microliters)   50         53
      standard deviation                3          4
    Can we conclude that type A and type B are different?
    State & test the appropriate hypothesis with 95% confidence.
    State your assumptions.
 7. Two tests are supposed to be compatible in measuring the IQ
    in kids.  The makers of one of them, known as IQ1, claim that
    their test is superior to the other test known as IQ2.  To test
    this claim we gave the IQ1-test to 45 kids of known IQs; 23
    results were exactly as predicted.  Later we gave the IQ2-test to
    30 of the same 45 kids tested earlier and 18 results were exactly
    as predicted.  At a significance level of 0.01, can we reject the
    claim?  State all necessary assumptions.
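
A minimal sketch, assuming Python 3.8 or later, of the confidence interval asked for in problem 3 of the list above, where both population variances are known. The function name is illustrative, and the calculation assumes independent samples with normally distributed sample means.

```python
from math import sqrt
from statistics import NormalDist


def diff_of_means_interval(mean1, var1, n1, mean2, var2, n2, confidence):
    """CI for mu1 - mu2 when both population variances are known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(var1 / n1 + var2 / n2)
    diff = mean1 - mean2
    return diff - z_crit * std_error, diff + z_crit * std_error


# Problem 3: cassettes (n = 20, variance 0.08, mean 5 cm) versus
# filters (n = 15, variance 0.07, mean 4.7 cm), 99% confidence.
low, high = diff_of_means_interval(5.0, 0.08, 20, 4.7, 0.07, 15, 0.99)

# If zero lies outside the interval, H0: mu1 = mu2 is rejected at the
# matching two-sided significance level (0.01 here).
zero_inside = low <= 0 <= high

print(low, high, zero_inside)
```

Problems that give sample standard deviations from small samples, such as problem 6, would need a t-based interval instead of this z-based one.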

 It was determined that the efficiency of collecting dust by impingers
 is directly related to the level of liquid, normally water, in the
 impinger.  To verify this statement, ten impingers having different
 volumes of water were hooked to the same pump and made ready to
 sample air with a known concentration of very fine, respirable
 dust.  After exactly 2 hours of sampling, the water from each impinger
 was analyzed and the following results were tabulated:
    Water level (ml)    Efficiency (%)
          3.0                55
          3.3                60
          3.6                65
          3.8                70
          4.0                80
          4.2                85
          4.5                90
          4.8                94
          5.0                96
          5.5                98
 1) Determine n, ΣX, ΣY, the sum of the squares of both the X & the Y
    data, ΣXY, the mean of the X-data, the mean of the Y-data, SX, SY,
    m, b, r, and r² (the coefficient of determination).
 2) Plot the data and graphically determine the association by three
    different ways.
 3) Interpret r & r2 and discuss their importance.
 4) Assume that the universal population's correlation coefficient = 0,
    determine whether or not the two variables are independent of each
    other.  Test the appropriate hypothesis with 95% confidence.
 5) Assume that the universal population's correlation coefficient is
    0.86, determine the same as in (4) above.
 6) Determine the variance of error (Se²) and explain its application.
 7) Determine the 95% confidence intervals on the entire population's
    Y-intercept, slope and the variance of error.
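
A minimal sketch of the sums, least-squares line and correlation coefficient requested in part 1) above, computed from the tabulated impinger data. The variable names are illustrative; the slope and intercept correspond to m and b in the problem statement.

```python
from math import sqrt

water_ml = [3.0, 3.3, 3.6, 3.8, 4.0, 4.2, 4.5, 4.8, 5.0, 5.5]
efficiency_pct = [55, 60, 65, 70, 80, 85, 90, 94, 96, 98]

n = len(water_ml)
sum_x = sum(water_ml)
sum_y = sum(efficiency_pct)
sum_xx = sum(x * x for x in water_ml)
sum_yy = sum(y * y for y in efficiency_pct)
sum_xy = sum(x * y for x, y in zip(water_ml, efficiency_pct))

# Corrected sums of squares and cross products.
s_xx = sum_xx - sum_x**2 / n
s_yy = sum_yy - sum_y**2 / n
s_xy = sum_xy - sum_x * sum_y / n

slope = s_xy / s_xx                            # m
intercept = sum_y / n - slope * sum_x / n      # b
r = s_xy / sqrt(s_xx * s_yy)                   # correlation coefficient
r_squared = r**2                               # coefficient of determination

print(n, sum_x, sum_y, slope, intercept, r, r_squared)
```

The hypothesis tests and confidence intervals in parts 4) through 7) build on these quantities but also need tabulated t and chi-square values, which this sketch does not attempt.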

 1. A psychologist claims that 15% of all children are born with IQs
    ranging between 120 and 140, 80% are considered normal and their
    IQs range between 100 and 119 and the rest of this population have
    IQs less than 100.  He also theorizes that this last segment of
    this population can boost their IQs to over 100 if given the chance,
    the proper education and other environmental stimulants.
    A sample of 400 children of both sexes were randomly selected to
    verify this claim.  One hundred children were determined to have an 
    IQ between 125 and 130; 275 children scored between 100 and 115;
    and the rest of these children had IQs less than 100.
    At the 0.05 significance level (95% confidence level), test the
    appropriate hypothesis.
 2. Two methods are described in the literature to be acceptable for
    collecting respirable fibers in the workplace.  NIOSH disagrees with
    this article claiming that one of them (using cyclones) is superior
    to the second method (using the impinger).  NIOSH believes that the 
    cyclone's collection efficiency is 80% and that for the impinger is 
    only 20%. Both units were used to sample for a known concentration
    of respirable dust (950 mg/l), and after one hour of sampling the
    following was observed; the amount of dust collected by the cyclone
    was 600 mg/l and that of the impinger was 350 mg/l.
    With 90% confidence level, test the null hypothesis that the ratio
    of the cyclone's collection efficiency to that of the impinger is
    4:1 as claimed by NIOSH.
 3. It was claimed that older women are more willing to breast-feed
    their babies than younger mothers.  To scientifically reject
    this claim, we decided to collect our own data and analyze it.
    Eighty (assumed to be older women) and seventy (assumed to be
    younger) were questioned for this study.  The results observed by
    the questionnaire are tabulated below:
         category        older women    younger women    total
       breast feed            40              20           60
       do not breast feed     40              50           90
       total                  80              70          150
    At the α = 0.005 significance level, test the null hypothesis that the
    willingness to breast feed among women is independent of their age.
 4. The statement "Education is the key to success" has been questioned
    by many.  To determine if the level of education has any apparent
    effect on success, we sampled 200 people and asked them about their
    views.  Their educational backgrounds and their reactions are
    tabulated as shown below:
          category        agree    disagree   no opinion    total
      with college degree  60         30          10         100
      no college degree    30         55          15         100
           total           90         85          25         200
    Test the appropriate hypothesis of independence between education
    and success at the α = 0.025 significance level.  State all your
    assumptions if any.
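
A minimal sketch, assuming Python 3.8 or later, of the chi-square test of independence for the 2 x 2 table in problem 3 of the list above. The function names are illustrative. No continuity correction is applied, which is an assumption, and the critical value helper is valid only for one degree of freedom, where a chi-square variable is the square of a standard normal variable.

```python
from statistics import NormalDist


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_critical_df1(alpha):
    """Upper-tail critical value for 1 degree of freedom only."""
    return NormalDist().inv_cdf(1 - alpha / 2) ** 2


# Problem 3: rows are breast feed / do not breast feed,
# columns are older / younger women.
observed = [[40, 20], [40, 50]]
stat = chi_square_statistic(observed)
critical = chi_square_critical_df1(0.005)
reject_independence = stat > critical

print(stat, critical, reject_independence)
```

The same statistic function works for the 2 x 3 table in problem 4, but its critical value (2 degrees of freedom) has to come from a chi-square table.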

 1. Three methods are currently available to reduce stress in the
    workplace.  These techniques were put to the test in three different
    work settings: industry, colleges and hospitals.  After one year,
    the results of this stress management program, given as the number
    of people who benefited from each method, were not conclusive.
    A summary of this study is tabulated below.
         settings     method 1      method 2      method 3
         industry        60            50            40
         colleges        50            65            70
         hospitals       40            30            25
    Determine whether or not all three methods are equally effective in
    reducing stress.  Test this hypothesis with 95% confidence level.
    State all your assumptions.
 2. Three nursing programs (A, B and C) each having twenty students
    were tested for competency in emergency response.  The scores for
    each program were analyzed.  The averages and the variances were
    calculated to be 80 and 25 respectively for the first program or
    (A), 85 and 16 for the second program or (B) and 78 and 16 for
    the third program or (C).  At α = 0.05, test the hypothesis that
    all three averages are equal or assumed to be basically the same.
 3. Solve the above problem knowing that the numbers of students in
    programs A, B and C are 20, 24 and 26 respectively.
 4. Assume 4 samples with the following observed data:
                  sample 1  sample 2   sample 3   sample 4
                     10         6         12         10
       observed       8        11         20          7
       data          15         9         10         12
                      9         8          9         11
    Test whether or not all four samples are equal with 95% confidence
    level (0.05 significance level).
 5. To test the collection efficiency of filters, impingers and
    cyclones, all three methods were used to collect dust in a
    laboratory experiment.  Five filters, three impingers and four
    cyclones were placed inside a dust chamber for a period of two
    hours.  The average retention by the filters, the impingers
    and the cyclones were 90%, 93% and 88%, respectively.  The
    variances of the collection efficiency were 15% for the filters,
    20% for the impingers and 12% for the cyclones.  Test the
    appropriate hypothesis with 95% confidence and then determine the
    same for each possible pair.
 6. Given the following data, test the hypotheses:
    a) all column means are equal.
    b) all row means are equal.
    Use α = 0.05.
                   1            2            3
          1       55           60           70
          2       45           50           65
          3       55           56           57
          4       75           45           55
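
A minimal sketch of the one-way analysis of variance F statistic for problem 4 in the list above, reading each column of the table as one sample. The function name is illustrative, and the calculation assumes independent samples from normal populations with equal variances.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the sample means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own sample mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


samples = [
    [10, 8, 15, 9],    # sample 1
    [6, 11, 9, 8],     # sample 2
    [12, 20, 10, 9],   # sample 3
    [10, 7, 12, 11],   # sample 4
]
f_stat, df_between, df_within = one_way_anova(samples)

print(f_stat, df_between, df_within)
```

The resulting F is compared with the tabulated critical value for the reported degrees of freedom at the 0.05 level; the hypothesis of equal means is rejected only if F exceeds it.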