
MA 2612 Test 1 B '99 Solutions







NAME:





1.
(15 points) A researcher is conducting a study to definitively establish the causes of childhood leukemia. To do so, she obtains a large random sample of children with leukemia and another of children without leukemia, and measures a large number of potential causal variables for each child. From these potential causal variables, she is able to identify ten variables that show a large association with leukemia. Finally, using the same data, she conducts an hypothesis test for each of the ten variables, obtains a p-value, and sends the results off for publication. Briefly critique her statistical methodology.

She has made the error of letting the data suggest the tests she is conducting: the ten variables were selected because they showed a large association in this sample, so p-values computed from the same data will tend to be misleadingly small. This is reasonable in an exploratory study, but not in a confirmatory study, which is the only kind that can "definitively establish" results.
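The selection effect behind this critique can be illustrated with a small simulation. What follows is a minimal sketch and not part of the original solution: every simulated variable is pure noise, the sizes (1000 variables, 50 children per group) and the function name `top_p_values` are hypothetical, and a two-sided z-test with known variance 1 stands in for whatever test the researcher actually used.

```python
import random
from statistics import NormalDist, fmean


def top_p_values(n_vars=1000, n_per_group=50, n_top=10, seed=0):
    """Simulate pure-noise variables, keep the n_top strongest, test them on the same data."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of a difference of two means when sigma = 1 is known (assumption).
    se = (2.0 / n_per_group) ** 0.5
    p_values = []
    for _ in range(n_vars):
        cases = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = (fmean(cases) - fmean(controls)) / se
        # Two-sided p-value for the null hypothesis of no difference.
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    # Picking the largest associations is the same as picking the smallest p-values.
    return sorted(p_values)[:n_top]


selected = top_p_values()
print(max(selected))
```

Even though none of the simulated variables is related to group membership, the ten selected p-values will typically all fall well below 0.05. This is why tests run on the data that suggested them cannot confirm anything.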

2.
Scientists have developed a new acrylic fiber, which they believe is stronger than the acrylic fiber presently used. In an experiment to compare the strength of the new fiber to that of the fiber presently used, measurements are taken of the force required to break each of 12 samples of yarn made with the new fiber and a like number of yarn samples made with the present fiber. The results are:

           Breaking Strength (lbs)
Fiber      Mean     Variance
New        59.3     12.1
Present    55.6     11.9

Conduct an hypothesis test, at the 0.01 level of significance, to compare the strengths of the two fibers. To do so, please answer the following.

(a)
(10 points) The scientific hypothesis.

The new fiber is stronger than the present fiber.

(b)
(10 points) The statistical model.

The two sample C+E model for independent populations.

(c)
(10 points) The statistical hypotheses being tested.

$H_0:\mu_1-\mu_2=0$ versus $H_a:\mu_1-\mu_2>0$, where $\mu_1$ is the mean breaking strength of the new fiber, and $\mu_2$ is the mean breaking strength of the present fiber.

(d)
(10 points) The standardized test statistic being used.

$T^{(p)}=\frac{\overline{Y}_1-\overline{Y}_2}{\hat{\sigma}_p(\overline{Y}_1-\overline{Y}_2)}$. The pooled variance test is chosen because the sample variances are very close.

(e)
(10 points) The p-value. If you cannot obtain the exact p-value, obtain two numbers $a$ and $b$, $0<a<b<1$, so that the p-value lies between $a$ and $b$.

First, we compute $\hat{\sigma}_p(\overline{Y}_1-\overline{Y}_2)=\sqrt{S_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}$. Now $s_p^2=\frac{(12-1)(12.1)+(12-1)(11.9)}{22}=12$, so that $\hat{\sigma}_p(\overline{Y}_1-\overline{Y}_2)=\sqrt{12(1/12+1/12)}=\sqrt{2}$.

The observed value of the test statistic is then $t^{(p)*}=(59.3-55.6)/\sqrt{2}=2.616$.

Using the t tables to compare this with the $t_{22}$ distribution, we get the p-value: $P(t_{22}\geq 2.616)$. From the table, we find that the p-value lies between 0.005 and 0.01.
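The arithmetic in this part can be checked numerically. The following is a minimal sketch using only the Python standard library. The helper names `pooled_t` and `t_upper_tail` are hypothetical, and the tail probability is obtained by Simpson's rule on the t density rather than from a table, so it is an approximation.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t statistic, degrees of freedom) for the pooled-variance two-sample test."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df  # pooled variance estimate
    se = math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))      # estimated SE of the difference
    return (mean1 - mean2) / se, df


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T_df >= t) for t >= 0 via Simpson's rule on [0, t] (steps must be even)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The density is symmetric, so the upper tail is 0.5 minus the area over [0, t].
    return 0.5 - total * h / 3.0


t_obs, df = pooled_t(59.3, 12.1, 12, 55.6, 11.9, 12)
p_value = t_upper_tail(t_obs, df)
print(round(t_obs, 3), df, p_value)
```

The computed p-value should fall inside the interval read from the table, between 0.005 and 0.01.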

(f)
(10 points) The assumptions made.

(1) Normality (2) Equal population variances

(g)
(10 points) Your conclusions.

Reject $H_0$ in favor of $H_a$ at the 0.01 level of significance. Conclude that the mean strength of the new fiber is greater than that of the present fiber.

3.
(15 points) In order to test $H_0:p=0.4$ versus $H_a:p>0.4$, where $p$ is the proportion of a population having a certain characteristic, a random sample of size 150 is obtained. The p-value, computed using the large sample approximation with continuity correction, equals 0.03, to two decimal places. How many of the 150 in the sample have the characteristic?

The p-value is $P(N(0,1)\geq z^*_u)$, where $z^*_u=(y^*-np_0-0.5)/\sqrt{np_0(1-p_0)}=(y^*-60.5)/6$. Since the p-value rounds to 0.03, it lies between 0.025 and 0.035, and from the normal table, we see that $z^*_u$ must lie between 1.82 and 1.95. Therefore, $y^*$ must lie between $71.42=6\times 1.82+60.5$ and $72.20=6\times 1.95+60.5$. Since $y^*$ must be an integer, $y^*=72$.
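This answer can be cross-checked by brute force: compute the continuity-corrected p-value for every possible count and keep the counts whose p-value rounds to 0.03. The following is a minimal sketch using only the Python standard library. The function name `p_value_upper` is hypothetical.

```python
from statistics import NormalDist


def p_value_upper(y, n=150, p0=0.4):
    """Normal-approximation p-value for H_a: p > p0, with continuity correction."""
    z = (y - n * p0 - 0.5) / (n * p0 * (1 - p0)) ** 0.5
    return 1.0 - NormalDist().cdf(z)


# Keep every count whose p-value rounds to 0.03, i.e. lies in [0.025, 0.035).
matches = [y for y in range(151) if 0.025 <= p_value_upper(y) < 0.035]
print(matches)
```

Only one count satisfies the condition, in agreement with the table-based argument.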



 
Joseph D Petruccelli
11/22/1999