Kurskod: TAMS24 (Statistisk teori / Provkod: TEN 206-0-04 (kl. 8-2 Examinator/Examiner: Xiangfeng Yang (Tel: 070 089666. Please answer in ENGLISH if you can. a. You are permitted to bring: a calculator; formel -och tabellsamling i matematisk statistik (from MAI; TAMS24: Notations and Formulas (by Xiangfeng Yang b. Scores rating: 8- points giving rate 3;.5-4.5 points giving rate 4; 5-8 points giving rate 5.. (3 points English English Version A baker bakes three large chocolate cakes every working day. If the cakes are not sold during the day, then they will be given to a so-called food Bank. The following results are obtained for one year: Number of sold cakes Number of days 0 6 2 55 3 228 Test the hypothesis that the number of sold cakes during one working day is a Binomial distribution Bin(3, p. Use significance level %. Solution. There is one unknown parameter p in the hypothesis Bin(3, p, so first we must estimate this unknown parameter using the given sample: ˆp = cakes to the food Bank total baked cakes Therefore, we can compute the theoretical probabilities as = (0 + 6 + 2 55 + 3 228 (3 + 3 6 + 3 55 + 3 228 = 80 900 = 0.9. p 0 = P (Bin(3, ˆp = 0,..., p 3 = P (Bin(3, ˆp = 3. But it turned out that np 0 < 5, thus we need to combine the two groups 0 and to get only three groups: 0 and : p 0 + p = 0.028; 2 : p 2 = 0.243; 3 : p 3 = 0.729. Thus the test statistics is (keep in mind n = 300, N = + 6 = 7, N 2 = 55, N 3 = 228 3 (N i np i 2 T S = = 3.6. np i On the other hand, the rejection region is C = (χ 2 α(3, + = (6.64, +. Since T S C, reject H 0, namely, we don t think that it is a Binomial distribution Bin(3, p. i= Page /4
2. (3 points English The following data are 40 observations from a Poisson distribution with a parameter µ (which is the mean: Value 0 2 3 4 Frequency 2 0 6 2 Find the Maximum-Likelihood estimate for µ given this information. Solution. The likelihood function is The logarithmic function is L(µ = (e µ 2 (µe µ 0 ( µ2 2 e µ ( µ3 6 e µ 6 ( µ4 24 e µ 2 = e 40µ u 48 constant. ln L(µ = 40µ + 48 ln µ + constant. By setting 0 = ln L(µ = 40 + 48/µ, we have µ =.2. The Maximum-Likelihood estimate for µ is ˆµ ML =.2. 3. (3 points English Let X, X 2, X 3 be random variables such that X X 2 N 4 4, X 3 4 3 2. 3 (3.. (p Find P (X 2 > X +. (3.2. (2p One wants to do a linear combination Y = a X + a 2 X 2 + a 3 X 3 such that E(Y = 4 and V (Y is minimized. Find a, a 2 and a 3. Solution. (3.. Using the An important theorem on the formula paper, one can easily see that So, (X 2 X N(, 7. P (X 2 > X + = P (X 2 X > 0 = P (N(, 7 > 0 = P (N(0, > / 7 = 0.352. (3.2. The first condition E(Y = 4 gives: To use the second condition, we first compute the variance: 4a + 4a 2 + 4a 3 = 4. ( V (Y =...use covariance matrix... = 3a 2 2a a 2 2a a 3 + 2a 2 2 2a 2 a 3 + 3a 2 3. Now we replace every a 3 in V (Y by (: a 3 = a a 2, and regard V (Y =: f(a, a 2 as a function of a and a 2. To get a minimal value, we must take the partial derivatives: as follows: 0 = f(a, a 2 = 6a + 8a 2 8; a (2 0 = f(a, a 2 = 8a + 4a 2 8. a 2 (3 Solving the above two equations gives a = 0.3 and a 2 = 0.4, which in turn yields a 3 = 0.3. Page 2/4
4. (3 points English The following two questions (4. and (4.2 are independent of each other. (4.. (.5p For a number of coal plants with similar purification techniques one has measured the sulfur dioxide emissions y (ppm and the power x (gigawatt. An analysis according to the model Y = β 0 + β x + β 2 x 2 + ε, ε N(0, σ, gave results in the following output. Find a 95% confidence interval for E(Y when the power is 0.5 gigawatt. i ˆβi d( ˆβ i Degrees of freedom Sum of squares 0 204.46 82.95 REGR 2 5049.3-638.0 298.5 RES 6 254.3 2 959.2 263.6 TOT 8 5303.6 62.38 58.90 508.70 (X X = 58.90 202.7 850.58. 508.70 850.58 639.75 (4.2. (.5p Consider the simple linear regression model Y j = β 0 + β x j + ε j, j =,..., n ( ˆB0 where ε j, j =,..., n, are independent N(0, σ. The least-square-method s estimator of the parameter (( ˆB normal vector N, σ 2 (X X. Prove that ˆB 0 and ˆB are independent if and only if n j= x j = 0. β Solution. (4.. The confidence interval is I µ = ˆµ t α/2 (n k s The values are as follows: ˆµ = ˆβ 0 + ˆβ 0.5 + ˆβ 2 0.5 2 ; t α/2 (n k = t 0.05/2 (9 2 ; s 2 = SS E /(n k = 5049.3/(9 2 ; x = (, 0.5, 0.5 2. (... (4.2. It is clear that X X = x... x n (X X = n j= x2 j ( n j= xj2 x.. x ( n n j= x2 j n n j= x j n cov( ˆB 0, ˆB = = j= x j x (X X x = (8.02, 32.52. ( n n j= x j n j= x j n j= x2 j. Thus,. The covariance of ˆB 0 and ˆB is σ 2 n j= x2 j ( n j= x j 2 which means that ˆB 0 and ˆB are independent if and only if n j= x j = 0. n x j, j= ( β is a 5. (3 points English In order to compare the efficacy of three different antihypertensive medications, one treated three groups (each group has ten patients with the different medicines. After a month, the blood pressure reductions are measured. Results are: Sample mean x i Sample standard deviation s i Medicine : 7.3 6.9 Medicine 2: 2. 7.26 Medicine 3: 0.8 5.23 Model: We take these three samples from N(µ i, σ, i =, 2, 3. Does it seem like µ 2 >.4µ 3? Answer the question by constructing an appropriate confidence interval with confidence level 0.95. Page 3/4
Solution. The unknown parameter σ 2 can be estimated as s 2 = (n s 2 + (n 2 s 2 2 + (n 3 s 2 3 n + n 2 + n 3 3 = 39.45887. Since σ is unknown, the help variable is (c i Xi + c j Xj (c i µ i + c j µ j t(n + n 2 + n 3 3. c S 2 i n i + c2 j n j (NOTE that the degrees of freedom involves all samples sizes n, n 2 and n 3, NOT just n i and n j. In order to investigate whether µ 2 >.4µ 3, we construct a 95% confidence interval of µ 2.4µ 3 of the form (a, +. To this end, (a, + = (( x 2.4 x 3 t α (n + n 2 + n 3 3 s 2 +.42, + = (0.7, +. n 2 n 3 Since 0.7 > 0, we conclude that µ 2 >.4µ 3. This problem can be also solved by constructing a two-sided confidence interval. 6. (3 points English For a sample with ten observations {x,..., x 0 } from {X,..., X 0 } where X j N(µ, σ, j =,..., 0, one has estimated the sample standard deviation and got s = 3.2. (6.. (p Test the hypothesis H 0 : σ = 2.5 against H : σ > 2.5 with a level α = 0.05. (6.2. (2p Find the σ-value for which the test in (6. has a power 0.9. Solution. (6.. It is easy to get T S = (n s2 σ 2 0 = 9 3.22 2.5 2 = 4.84, C = (χ 2 0.05(9, + = (6.99, +. Since T S / C, don t reject H 0. (6.2. 0.9 = P (reject H 0, given the real value σ = P ( 9 S2 > 6.99, given the real value σ 2.52 = P ( 9 S2 σ 2 > 6.99 2.5 2 /σ 2, given the real value σ = P (χ 2 (9 > 6.99 2.5 2 /σ 2. It is from the χ 2 table that thus 6.99 2.5 2 /σ 2 = 4.68. σ = 5.04. Page 4/4
. (3 poäng Svenska Svensk Version En bagare bakar varje arbetsdag tre stora chokladtårtor. De tårtor som inte säljs under dagen ges bort till en sk food bank. Följande resultat erhölls efter ett arbetsår: Antal sålda tårtor Antal dagar 0 6 2 55 3 228 Pröva hypotesen att antalet sålda tårtor under en dag är Binomialfördelat Bin(3, p. Använd signifikansnivå %. 2. (3 poäng Svenska Följande datamaterial består av 40 observationer från en Poissonfördelning med parameter µ (som är väntevärdet: Värde 0 2 3 4 Frekvens 2 0 6 2 Beräkna Maximum-Likelihood skattningen för µ givet denna information. 3. (3 poäng Svenska Låt X, X 2, X 3 vara stokastiska variabler sådana att X 4 X 2 N 4, X 3 4 3 2. 3 (3.. (p Beräkna P (X 2 > X +. (3.2. (2p Man vill göra en linjärkombination Y = a X + a 2 X 2 + a 3 X 3 sådan att E(Y = 4 och V (Y minimiseras. Bestäm a, a 2 och a 3. 4. (3 poäng Svenska Deluppgifterna (4. och (4.2 är fristående. (4.. (.5p För ett antal kolkraftverk med likartad reningsteknik har man mätt svaveldioxidutsläpp y (ppm samt effekten x (gigawatt. En analys enligt modellen Y = β 0 + β x + β 2 x 2 + ε, ε N(0, σ, gav ett resultat som finns i utskriften nedåt. Beräkna ett 95% konfidensintervall för E(Y då kraftverkets effekt är 0.5 gigawatt. i ˆβi d( ˆβ i Frihetsgrader Kvadratsumma 0 204.46 82.95 REGR 2 5049.3-638.0 298.5 RES 6 254.3 2 959.2 263.6 TOT 8 5303.6 62.38 58.90 508.70 (X X = 58.90 202.7 850.58. 508.70 850.58 639.75 Page /2
(4.2. (.5p Betrakta den enkla linjära regressionsmodellen Y j = β 0 + β x j + ε j, j =,..., n ( ( ˆB0 där ε j, j =,..., n, är oberonde N(0, σ. Minst-kvadrat-metodens skattningar av parametrarna är ˆB β (( normalfördelad N, σ β 2 (X X. Visa att ˆB 0 och ˆB är oberoende om och endast om n j= x j = 0. 5. (3 poäng Svenska För att jämföra effekten hos tre olika blodtryckssänkande mediciner behandlas tre grupper om vardera tio patienter med de olika medicinerna. Efter en månad mättes sänkningen av blodtrycket. Resultat: Medelvärde x i Stickprovsstand.avvik s i Medicin : 7.3 6.9 Medicin 2: 2. 7.26 Medicin 3: 0.8 5.23 Modell: Vi har tre stickprov från N(µ i, σ, i =, 2, 3. Förefaller det troligt att µ 2 >.4µ 3? Besvara frågan genom att konstruera ett lämpligt konfidensintervall med konfidensgraden 0.95. 6. (3 poäng Svenska För ett slumpmässigt stickprov med tio observationer {x,..., x 0 } från {X,..., X 0 } där X j N(µ, σ, j =,..., 0, har man beräknat stickprovsstandardavvikelsen och fått s = 3.2. (6.. (p Pröva H 0 : σ = 2.5 mot H : σ > 2.5 på nivån α = 0.05. (6.2. (2p Bestäm det σ-värde för vilket testet i (6. har styrkan 0.9. Page 2/2