Examiner: Linus Carlsson
2016-01-07, 3 hours
In English

Exam (TEN), Probability theory and statistical inference, MAA137

Aids: Collection of Formulas, Concepts and Tables; pocket calculator.

This exam consists of four problems. The maximum sum of points is 20. The marks 3, 4 and 5 require a minimum of 11, 15 and 18 points respectively. All events and random variables used in solutions should be carefully defined. All calculations and lines of reasoning should be written so that they are easy to follow and understand. Explain your calculations step by step.

1. Suppose that $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$ are independent random samples, with the variables $X_i$ normally distributed with mean $\mu_1$ and variance $\sigma_1^2$ and the variables $Y_i$ normally distributed with mean $\mu_2$ and variance $\sigma_2^2$. The difference between the sample means, $\bar{X} - \bar{Y}$, is then a linear combination of $m + n$ normally distributed random variables and hence is itself normally distributed.

(a) Find $E[\bar{X} - \bar{Y}]$.
(b) Find $V(\bar{X} - \bar{Y})$.
(c) Now suppose that $\sigma_1^2 = 2$, $\sigma_2^2 = 2.5$, and $m = n$. Find the sample sizes so that $(\bar{X} - \bar{Y})$ will be within 1 unit of $(\mu_1 - \mu_2)$ with probability 0.95.

Suggested solution:

(a) We use the linearity of the expectation operator. First we have that

$$E[\bar{X}] = E\left[\frac{X_1 + X_2 + \cdots + X_m}{m}\right] = \frac{1}{m} E[X_1 + X_2 + \cdots + X_m] = \frac{1}{m}\left(E[X_1] + E[X_2] + \cdots + E[X_m]\right) = \frac{1}{m}\left(\mu_1 + \mu_1 + \cdots + \mu_1\right) = \mu_1$$

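and in the same way we have $E[\bar{Y}] = \mu_2$, so we get

$$E[\bar{X} - \bar{Y}] = E[\bar{X}] - E[\bar{Y}] = \mu_1 - \mu_2.$$

Answer: $\mu_1 - \mu_2$

(b) Since $\bar{X}$ and $\bar{Y}$ are independent, we have

$$V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y}).$$

Because we have the identity

$$V(X) = E[X^2] - E[X]^2,$$

we see that a constant can be lifted out of the variance if it is squared, hence

$$V(\bar{X}) = V\left(\frac{X_1 + X_2 + \cdots + X_m}{m}\right) = \frac{1}{m^2} V(X_1 + X_2 + \cdots + X_m) = \frac{1}{m^2}\left(V(X_1) + V(X_2) + \cdots + V(X_m)\right) = \frac{m}{m^2}\sigma_1^2 = \frac{\sigma_1^2}{m},$$

where the third equality holds because $X_i$ and $X_j$ are independent for $i \neq j$. Similarly, $V(\bar{Y}) = \sigma_2^2/n$. It follows:

Answer: $V(\bar{X} - \bar{Y}) = \sigma_1^2/m + \sigma_2^2/n$

(c) It is required that $P\left(|\bar{X} - \bar{Y} - (\mu_1 - \mu_2)| \le 1\right) = 0.95$. Using the result in part (b) for standardization with $n = m$, $\sigma_1^2 = 2$ and $\sigma_2^2 = 2.5$, we obtain from the formulas (Statistics chapter, with $\theta = \mu_1 - \mu_2$)

$$z_{0.025}\sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}} = 1 \;\Longrightarrow\; 1.96\sqrt{\frac{2}{n} + \frac{2.5}{n}} = 1 \;\Longrightarrow\; n = 1.96^2 \cdot 4.5$$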
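The sample-size equation in part (c) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `required_sample_size` and its parameters are illustrative and not part of the exam. It computes the normal quantile with `statistics.NormalDist` instead of the tabulated 1.96, so the unrounded value may differ slightly in the last digits.

```python
import math
from statistics import NormalDist


def required_sample_size(var1, var2, half_width, confidence):
    """Common sample size n (= m) such that Xbar - Ybar lies within
    `half_width` of mu1 - mu2 with probability `confidence`.

    Solves z * sqrt(var1/n + var2/n) = half_width for n, where z is the
    two-sided standard normal quantile.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = z ** 2 * (var1 + var2) / half_width ** 2
    # The sample size must be an integer, so round up.
    return n_exact, math.ceil(n_exact)


# Values from part (c): sigma_1^2 = 2, sigma_2^2 = 2.5, 1 unit, 95 %.
n_exact, n_min = required_sample_size(2.0, 2.5, 1.0, 0.95)
print(f"exact n = {n_exact:.2f}, smallest integer n = {n_min}")
```

Rounding up rather than to the nearest integer matters here: any smaller integer would give a coverage probability below 0.95.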

That is, $n \approx 17.29$.

Answer: The two sample sizes should each be at least 18.

2. Data from a random sample reveal $\bar{x} = 10.7$, $s^2 = 4$, where the sample size is $n = 49$.

(a) Test $H_0: \mu = 10$ versus $H_1: \mu > 10$, with $\alpha$ set at 5%.
(b) Find the p-value for the test.
(c) Show that if $\alpha >$ p-value (assume the same p-value as in part (b)), the null hypothesis $H_0$ is rejected.

Suggested solution:

(a) The test is one-sided. The hypothesis assumes that the underlying distribution is normal, $N(10, s^2)$. The Z-test applies since $n = 49$ is relatively large. With $\sigma$ estimated by $s = 2$,

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{10.7 - 10}{2/\sqrt{49}} = 2.45 > 1.645 = z_{0.95} \;\Longrightarrow\; \text{Reject}$$

Here $z_{0.95} = 1.645$ is from the table at page 79.

Answer: The data suggest that we should reject $H_0$ with $\alpha$ set at 5%.

(b) Since the test is one-sided, the p-value (Table page 79) is the area under the standard normal curve from the computed $z = 2.45$ to infinity, that is, $1 - \Phi(2.45) = P(Z > 2.45) = 0.0071$.

Answer: The p-value is 0.0071.

(c) Suppose $\alpha$ is set so that $\alpha > 0.0071 =$ p-value. This means that the area, $\alpha$, under the standard normal curve from $z_{1-\alpha}$ to infinity is greater than the area, 0.0071, under the standard normal curve from 2.45 to infinity. That is, $z_{1-\alpha} < 2.45$. We get

$$z = \frac{10.7 - 10}{2/\sqrt{49}} = 2.45 > z_{1-\alpha},$$

which implies rejection. Q.E.D.
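The three parts of Problem 2 can be reproduced with a short script. This is a sketch only, using the Python standard library; the function name `one_sided_z_test` is illustrative, and the standard deviation 2 is the value used in the denominator of the solution's test statistic. The final loop illustrates part (c): for each chosen level, the critical-value rule and the p-value rule give the same decision.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, N(0, 1)


def one_sided_z_test(xbar, mu0, s, n):
    """Return the z statistic and the upper-tail p-value for H1: mu > mu0."""
    z = (xbar - mu0) / (s / math.sqrt(n))
    p_value = 1 - STD_NORMAL.cdf(z)
    return z, p_value


# Data from Problem 2: sample mean 10.7, standard deviation 2, n = 49.
z, p_value = one_sided_z_test(10.7, 10.0, 2.0, 49)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")

# Part (c): rejecting when z > z_{1-alpha} is the same as rejecting
# when alpha > p-value. Check this for a few levels.
decisions = {}
for alpha in (0.10, 0.05, 0.01, 0.005):
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    reject_by_critical_value = z > z_crit
    reject_by_p_value = alpha > p_value
    decisions[alpha] = (reject_by_critical_value, reject_by_p_value)
    print(f"alpha = {alpha}: z_crit = {z_crit:.3f}, "
          f"reject = {reject_by_critical_value}")
```

The level 0.005 is included because it lies below the p-value, so both rules then agree on not rejecting.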

3. Test at $\alpha = 1\%$ whether the first 1000 decimal digits of $\pi$ show homogeneity. Use Pearson's $\chi^2$-test.

Digit     0    1    2    3    4    5    6    7    8    9
Count    93  116  103  102   93   97   94   95  101  106

(5p)

Suggested solution: Under the hypothesis that the first 1000 digits of $\pi$ are homogeneous, the expected count for each of the 10 digits is $1000/10 = 100$. Then, with $m = 10$ categories, we use

$$\chi^2 = \sum_{i=1}^{m} \frac{(O_i - E_i)^2}{E_i} = \sum_{i=1}^{m} \frac{(O_i - 100)^2}{100} = \frac{1}{100}\left(7^2 + 16^2 + 3^2 + 2^2 + 7^2 + 3^2 + 6^2 + 5^2 + 1^2 + 6^2\right) = 4.74$$

The number of degrees of freedom is $\nu = 10 - 1 = 9$, and from the table, page 395, we get $\chi^2_{9,\,0.01} = 21.7$. Since $4.74 < 21.7$ we cannot reject homogeneity.

Answer: We cannot reject the hypothesis that the digits of $\pi$ show homogeneity.

4. From the data

x    1    2    3    4    5    6
y    3    5    6    9   10   12

a regression line $y = \beta_0 + \beta_1 x$ is drawn; here $\beta_0 = 1.2$ and $\beta_1 = 1.8$. The expected value for $Y$ at $x = 3.5$ is $y = 7.5$. The values $S_{XX} = 17.5$, $S_{XY} = 31.5$, $S_{YY} = 57.5$ are calculated.

(a) Find a 95% confidence interval for $E(Y)$ when $x = 3.5$,
(b) Find a 95% prediction interval for $Y$ when $x = 3.5$,
(c) Sketch a graph as below with $y = \beta_0 + \beta_1 x$, the point $(x, y) = (3.5, 7.5)$ and the intervals calculated in exercises (a) and (b),

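(d) Explain conceptually (not by referring to the mathematical formula) what you have calculated above and why the interval found in part (b) contains the interval in (a).

[Figure: empty coordinate grid for the sketch in (c); x-axis from 0 to 7, y-axis from 0 to 16.]

Suggested solution:

(a) Confidence interval for $E(Y)$ at $x = 3.5$ with $1 - \alpha = 0.95$:

$$(1.2 + 1.8 \cdot 3.5) \pm t_{\alpha/2,\nu}\, S\sqrt{A},$$

where

$$t_{\alpha/2,\nu} = t_{0.025,4} = 2.78,$$
$$A = \frac{1}{6} + \frac{(3.5 - 3.5)^2}{17.5} = \frac{1}{6},$$
$$S^2 = \frac{S_{YY} - \beta_1 S_{XY}}{n - 2} = \frac{1}{4} \cdot 0.8 = \frac{1}{5},$$
$$S\sqrt{A} = \sqrt{\frac{1}{30}} = \frac{1}{5.477}.$$

Confidence interval: $7.5 \pm 2.78 \cdot \frac{1}{5.477} = (6.99, 8.01)$

Answer: (6.99, 8.01)

(b) Prediction interval for $Y$ at $x = 3.5$ with $1 - \alpha = 0.95$:

$$(1.2 + 1.8 \cdot 3.5) \pm t_{\alpha/2,\nu}\, S\sqrt{B},$$

where

$$B = 1 + A = \frac{7}{6}, \qquad S\sqrt{B} = \sqrt{\frac{7}{6} \cdot \frac{1}{5}} = \sqrt{\frac{7}{30}} = 0.483.$$

Prediction interval: $7.5 \pm 2.78 \cdot 0.483 = (6.15, 8.85)$

Answer: (6.15, 8.85)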
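The arithmetic in the suggested solutions to Problems 3 and 4 can be re-derived from the raw data. The sketch below uses only the Python standard library. The table values 21.7 and 2.78 are copied from the solutions rather than computed, and all variable names are illustrative.

```python
import math

# --- Problem 3: Pearson chi-square statistic for the digit counts ---
observed = [93, 116, 103, 102, 93, 97, 94, 95, 101, 106]
expected = sum(observed) / len(observed)   # 1000 / 10 under homogeneity
chi2 = sum((o - expected) ** 2 / expected for o in observed)
CHI2_CRIT = 21.7                           # table value quoted in the solution
reject_homogeneity = chi2 > CHI2_CRIT
print(f"chi2 = {chi2:.2f}, reject homogeneity: {reject_homogeneity}")

# --- Problem 4: regression line and the two intervals at x0 = 3.5 ---
xs = [1, 2, 3, 4, 5, 6]
ys = [3, 5, 6, 9, 10, 12]
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Corrected sums of squares and cross products.
s_xx = sum((x - x_mean) ** 2 for x in xs)
s_yy = sum((y - y_mean) ** 2 for y in ys)
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

# Least-squares estimates of slope and intercept.
b1 = s_xy / s_xx
b0 = y_mean - b1 * x_mean

# Residual variance estimate with n - 2 degrees of freedom.
s2 = (s_yy - b1 * s_xy) / (n - 2)

T_CRIT = 2.78                              # t_{0.025, 4}, quoted in the solution
x0 = 3.5
y0 = b0 + b1 * x0
a = 1 / n + (x0 - x_mean) ** 2 / s_xx      # factor A in the solution

ci_half = T_CRIT * math.sqrt(s2 * a)        # half-width for E(Y) at x0
pi_half = T_CRIT * math.sqrt(s2 * (1 + a))  # half-width for a new Y at x0

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, fitted value at x0 = {y0:.2f}")
print(f"confidence interval: {y0 - ci_half:.3f} to {y0 + ci_half:.3f}")
print(f"prediction interval: {y0 - pi_half:.3f} to {y0 + pi_half:.3f}")
```

The only difference between the two half-widths is the extra 1 under the square root, which accounts for the variability of a single new observation around its mean. This is why the prediction interval always contains the confidence interval.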

Swedish version of the problems (translated to English)

1. Suppose that $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$ are independent random samples, where the variables $X_i$ are normally distributed with mean $\mu_1$ and variance $\sigma_1^2$ and the variables $Y_i$ are normally distributed with mean $\mu_2$ and variance $\sigma_2^2$. The difference between the sample means, $\bar{X} - \bar{Y}$, is a linear combination of $m + n$ normally distributed variables and is therefore itself a normally distributed variable.

(a) Determine $E[\bar{X} - \bar{Y}]$.
(b) Determine $V(\bar{X} - \bar{Y})$.
(c) Now we assume that $\sigma_1^2 = 2$, $\sigma_2^2 = 2.5$, and that $m = n$. Determine the sample size so that $(\bar{X} - \bar{Y})$ falls within 1 unit of $(\mu_1 - \mu_2)$ with probability 0.95.

2. Data from a random sample show $\bar{x} = 10.7$, $s^2 = 4$, where the sample size is $n = 49$.

(a) Test $H_0: \mu = 10$ versus $H_1: \mu > 10$, with $\alpha$ set to 5%.
(b) Determine the p-value for the test.
(c) Show that if $\alpha >$ p-value (assume the same p-value as in part (b)), the null hypothesis $H_0$ will be rejected.

3. Test, with $\alpha = 1\%$, whether the 1000 decimals of $\pi$ show homogeneity. Use Pearson's $\chi^2$-test.

Digit     0    1    2    3    4    5    6    7    8    9
Count    93  116  103  102   93   97   94   95  101  106

(5p)

4. From the data

x    1    2    3    4    5    6
y    3    5    6    9   10   12

a regression line, $y = \beta_0 + \beta_1 x$, is estimated, where $\beta_0 = 1.2$ and $\beta_1 = 1.8$. The expected value of $Y$ when $x = 3.5$ is $y = 7.5$. The values $S_{XX} = 17.5$, $S_{XY} = 31.5$, $S_{YY} = 57.5$ have been calculated.

(a) Determine a 95% confidence interval for $E(Y)$ when $x = 3.5$,
(b) Determine a 95% prediction interval for $Y$ when $x = 3.5$,
(c) Sketch a graph as below with $y = \beta_0 + \beta_1 x$, the point $(x, y) = (3.5, 7.5)$ and the intervals from (a) and (b),
(d) Explain conceptually (do not refer to mathematical formulas) what you have calculated above and why the interval from part (b) contains the interval in part (a).

[Figure: empty coordinate grid for the sketch in (c); x-axis from 0 to 7, y-axis from 0 to 16.]