Exam MVE65 Mathematical Statistics, 2016-05-31

The exam consists of eight exercises with a total of 50 points. You need at least 20 points to get a 3, at least 30 points for a 4 and at least 40 points for a 5. The answers can be given in English or in Swedish.

Examiner: Kaspar Stucki
Allowed aids: Calculator; Matematisk Statistik, Ulla Blomqvist; Lösningar till matematisk statistik, Ulla Blomqvist; Tabell- och formelsamling, Håkan Blomqvist; lecture notes.
Phone: 0731 47450 (visiting at ca. 9 am)

Good luck!

Exercise 1 (6 points). Let $\Omega = \{1, 2, 3, 4, 5\}$, $A = \{1, 2, 3\}$ and $B = \{3, 4\}$. Assume that $P(A) = 0.5$, $P(B) = 0.4$ and $P(A \mid B) = 0.5$.
a) What can you say about $P(\{1\})$?
b) Compute $P(\{3\})$ and $P(\{4, 5\})$.

Exercise 2 (6 points). Suppose there are two bowls of cookies. Bowl A contains 5 chocolate chip and 15 plain cookies, while bowl B has 10 of each. Our friend Eric first picks a bowl at random, and then picks a cookie at random from it. The cookie turns out to be a plain one. How probable is it that Eric picked it out of bowl A?

Exercise 3 (6 points). Let $\xi$ be a continuous random variable with density
\[
f(x) = \begin{cases} e^{2x}, & x < 0 \\ C(1 - x^2), & 0 \le x < 1 \\ 0, & x \ge 1. \end{cases}
\]
a) Compute the constant $C$.
b) Compute $P(|\xi| \le 0.5)$.
c) Compute $E(\xi)$ and $\mathrm{Var}(\xi)$.

Exercise 4 (6 points). Let $\xi_1 \sim \mathrm{Binomial}(4, 0.25)$ and $\xi_2 \sim \mathrm{Poisson}(3)$ be independent and let $\eta = \xi_1 + \xi_2$.
a) Compute $P(\eta = 0)$.
b) Compute $E(\eta)$ and $\mathrm{Var}(\eta)$.
c) We want to estimate the binomial probability $p = 0.25$. Consider the estimator $\hat p = (\eta - 3)/4$ and compute its bias and mean squared error.
Exercise 5 (6 points). Assume you have 10 i.i.d. normally distributed random variables taking the values

  2.65  6.25  4.48  2.46  3.81  2.99  6.50  4.38  3.75  3.30

a) Compute the sample mean and the sample standard deviation.
b) Compute a 99% one-sided confidence interval for the standard deviation.

Exercise 6 (6 points). Consider the following data:

  x:   -1     0     2     3     5
  y:   1.7    1     15    24    89

Fit an exponential function through these points.

Exercise 7 (6 points). A company produces batteries which have an average lifetime of 300 hours. After implementing a new procedure they got a sample of 25 batteries with an average lifetime of 305 hours and a sample standard deviation of 6 hours. The company wants to test whether the new batteries have a significantly longer lifetime than the old ones.
a) Which statistical test would you do?
b) Compute this test with $\alpha = 0.05$.

Exercise 8 (8 points). Eric claims he can toss a coin such that the probability of heads is higher than the probability of tails. He tosses a coin 100 times and indeed gets 65 heads. You are still not convinced and you want to do a test.
a) Which test would you do?
b) Perform this test and decide at a significance level of $\alpha = 0.05$.
c) Compute the p-value.
d) What is the power of this test? (For this question assume $H_1 : p = 0.65$.)
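Solutions

Exercise 1. We try to compute the elementary probabilities $P(\{i\})$ for $i = 1, \dots, 5$. Since $P(A \mid B) = P(A \cap B)/P(B)$ and $A \cap B = \{3\}$ we get
\[
P(\{3\}) = P(A \cap B) = P(A \mid B)\, P(B) = 0.5 \cdot 0.4 = 0.2.
\]
Furthermore
\[
P(\{1, 2\}) = P(\{1, 2, 3\}) - P(\{3\}) = 0.3,
\]
\[
P(\{4\}) = P(\{3, 4\}) - P(\{3\}) = 0.2,
\]
\[
P(\{5\}) = 1 - P(\{1, 2, 3\}) - P(\{4\}) = 0.3.
\]
a) We cannot compute $P(\{1\})$, but since $P(\{1\}) = P(\{1, 2\}) - P(\{2\})$ we get $0 \le P(\{1\}) \le 0.3$.
b) $P(\{3\}) = 0.2$ and $P(\{4, 5\}) = P(\{4\}) + P(\{5\}) = 0.5$.

Exercise 2. We use Bayes' formula
\[
P(\text{Bowl A} \mid \text{plain}) = \frac{P(\text{plain} \mid \text{Bowl A})\, P(\text{Bowl A})}{P(\text{plain})}.
\]
Furthermore
\[
P(\text{plain}) = P(\text{plain} \mid \text{Bowl A})\, P(\text{Bowl A}) + P(\text{plain} \mid \text{Bowl B})\, P(\text{Bowl B}).
\]
In total we get
\[
P(\text{Bowl A} \mid \text{plain}) = \frac{P(\text{plain} \mid \text{Bowl A})\, P(\text{Bowl A})}{P(\text{plain} \mid \text{Bowl A})\, P(\text{Bowl A}) + P(\text{plain} \mid \text{Bowl B})\, P(\text{Bowl B})} = \frac{15/20 \cdot 1/2}{15/20 \cdot 1/2 + 10/20 \cdot 1/2} = \frac{15}{25} = 0.6.
\]

Exercise 3. To solve this exercise we use the following integrals:
\[
\int_{-\infty}^{0} e^{2x}\, dx = \left[ \frac{e^{2x}}{2} \right]_{-\infty}^{0} = \frac{1}{2},
\]
\[
\int_{-\infty}^{0} x e^{2x}\, dx = \left[ \frac{x e^{2x}}{2} \right]_{-\infty}^{0} - \frac{1}{2} \int_{-\infty}^{0} e^{2x}\, dx = -\frac{1}{4},
\]
\[
\int_{-\infty}^{0} x^2 e^{2x}\, dx = \left[ \frac{x^2 e^{2x}}{2} \right]_{-\infty}^{0} - \int_{-\infty}^{0} x e^{2x}\, dx = \frac{1}{4},
\]
\[
\int_{0}^{1} x^k\, dx = \left[ \frac{x^{k+1}}{k+1} \right]_{0}^{1} = \frac{1}{k+1}, \quad \text{for } k = 0, 1, 2, \dots
\]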
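As a sanity check, the integrals listed above for Exercise 3 can be approximated numerically. The following is a minimal Python sketch using a composite Simpson rule; the helper name `simpson`, the cutoff -30 standing in for $-\infty$, and the number of subintervals are illustrative assumptions and not part of the original solution.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior nodes get weight 4 (odd index) or 2 (even index).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Assumption: -30 replaces -infinity; e^{2x} is negligible (about 1e-26) there.
LOWER = -30.0

i0 = simpson(lambda x: math.exp(2 * x), LOWER, 0.0)          # exact value: 1/2
i1 = simpson(lambda x: x * math.exp(2 * x), LOWER, 0.0)      # exact value: -1/4
i2 = simpson(lambda x: x * x * math.exp(2 * x), LOWER, 0.0)  # exact value: 1/4

# Integrals of x^k over [0, 1] for k = 0, ..., 4; exact value: 1/(k+1).
powers = [simpson(lambda x, k=k: x ** k, 0.0, 1.0) for k in range(5)]

print(i0, i1, i2)
print(powers)
```

The printed values should agree with the closed-form results up to the discretisation error of the rule.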
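a) The integral over the density should be one.
\[
\int_{-\infty}^{\infty} f(x)\, dx = \int_{-\infty}^{0} e^{2x}\, dx + \int_{0}^{1} C(1 - x^2)\, dx = \frac{1}{2} + C\left(1 - \frac{1}{3}\right).
\]
This is equal to one for $C = 3/4$.

b) We compute the distribution function $F(x) = \int_{-\infty}^{x} f(t)\, dt$. We get
\[
F(x) = \begin{cases} \frac{1}{2} e^{2x}, & x \le 0 \\ \frac{1}{2} + \frac{3}{4}\left(x - \frac{x^3}{3}\right), & 0 \le x \le 1 \\ 1, & 1 \le x. \end{cases}
\]
Therefore we get
\[
P(|\xi| \le 0.5) = P(-0.5 \le \xi \le 0.5) = F(0.5) - F(-0.5) = \frac{1}{2} + \frac{3}{4}\left(\frac{1}{2} - \frac{1}{3 \cdot 2^3}\right) - \frac{1}{2} e^{-1} \approx 0.66.
\]

c) Using the integral formulas above we get
\[
E(\xi) = \int_{-\infty}^{\infty} x f(x)\, dx = \int_{-\infty}^{0} x e^{2x}\, dx + \frac{3}{4} \int_{0}^{1} (x - x^3)\, dx = -\frac{1}{4} + \frac{3}{4}\left(\frac{1}{2} - \frac{1}{4}\right) = -\frac{1}{16}.
\]
Furthermore
\[
E(\xi^2) = \int_{-\infty}^{\infty} x^2 f(x)\, dx = \int_{-\infty}^{0} x^2 e^{2x}\, dx + \frac{3}{4} \int_{0}^{1} (x^2 - x^4)\, dx = \frac{1}{4} + \frac{3}{4}\left(\frac{1}{3} - \frac{1}{5}\right) = \frac{7}{20}.
\]
Thus $\mathrm{Var}(\xi) = E(\xi^2) - (E(\xi))^2 = 7/20 - (-1/16)^2 \approx 0.346$.

Exercise 4. For a binomial random variable $\xi_1 \sim \mathrm{Binomial}(n, p)$ we have
\[
P(\xi_1 = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \quad E(\xi_1) = np \quad \text{and} \quad \mathrm{Var}(\xi_1) = np(1 - p).
\]
And for a Poisson random variable $\xi_2 \sim \mathrm{Poisson}(\lambda)$ we have
\[
P(\xi_2 = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad E(\xi_2) = \lambda \quad \text{and} \quad \mathrm{Var}(\xi_2) = \lambda.
\]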
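The two probability mass functions above translate directly into code. The following minimal Python sketch evaluates them for the parameters of Exercise 4 ($n = 4$, $p = 0.25$, $\lambda = 3$) and obtains the distribution of $\eta = \xi_1 + \xi_2$ by convolution. The function names and the truncation point `K_MAX` for the infinite Poisson support are illustrative assumptions, not part of the original solution.

```python
import math


def binomial_pmf(k, n, p):
    """P(xi_1 = k) for xi_1 ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(xi_2 = k) for xi_2 ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


N_TRIALS, P_SUCCESS, LAM = 4, 0.25, 3.0
K_MAX = 60  # assumption: the probability mass of eta beyond 60 is negligible


def eta_pmf(m):
    """P(eta = m) for eta = xi_1 + xi_2, using independence (convolution)."""
    return sum(
        binomial_pmf(j, N_TRIALS, P_SUCCESS) * poisson_pmf(m - j, LAM)
        for j in range(min(m, N_TRIALS) + 1)
    )


support = range(K_MAX + 1)
p_eta_zero = eta_pmf(0)
mean_eta = sum(m * eta_pmf(m) for m in support)
var_eta = sum((m - mean_eta) ** 2 * eta_pmf(m) for m in support)

# Estimator p_hat = (eta - 3) / 4 of p = 0.25: bias and mean squared error.
bias = (mean_eta - 3) / 4 - P_SUCCESS
mse = var_eta / 16 + bias**2

print(p_eta_zero, mean_eta, var_eta, bias, mse)
```

Summing over a truncated support is a numerical shortcut; the closed-form expressions for the mean and variance give the same values without any truncation.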
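a) Note that $\eta = 0$ if and only if $\xi_1 = 0$ and $\xi_2 = 0$. Since $\xi_1$ and $\xi_2$ are independent
\[
P(\eta = 0) = P(\xi_1 = 0, \xi_2 = 0) = P(\xi_1 = 0)\, P(\xi_2 = 0) = \binom{4}{0} 0.25^0 \cdot 0.75^4 \cdot e^{-3} \frac{3^0}{0!} = 0.75^4 e^{-3} \approx 0.0158.
\]

b) Again since $\xi_1$ and $\xi_2$ are independent
\[
E(\eta) = E(\xi_1 + \xi_2) = E(\xi_1) + E(\xi_2) = 4 \cdot 0.25 + 3 = 4,
\]
\[
\mathrm{Var}(\eta) = \mathrm{Var}(\xi_1 + \xi_2) = \mathrm{Var}(\xi_1) + \mathrm{Var}(\xi_2) = 4 \cdot 0.25 \cdot 0.75 + 3 = 3.75.
\]

c) From the linearity of the expectation we get
\[
\mathrm{Bias}(\hat p) = E(\hat p) - p = E\left(\frac{\eta - 3}{4}\right) - 0.25 = \frac{E(\eta) - 3}{4} - 0.25 = 0.
\]
Since $\hat p$ is unbiased, the mean squared error of $\hat p$ is the variance of $\hat p$:
\[
\mathrm{MSE}(\hat p) = E((\hat p - p)^2) = \mathrm{Var}(\hat p) = \mathrm{Var}\left(\frac{\eta - 3}{4}\right) = \frac{1}{16} \mathrm{Var}(\eta - 3) = \frac{1}{16} \mathrm{Var}(\eta) \approx 0.234.
\]

Exercise 5.
a) $\bar\xi = \frac{1}{10} \sum_{i=1}^{10} \xi_i = 4.057$ and $S = \sqrt{\frac{1}{10 - 1} \sum_{i=1}^{10} (\xi_i - \bar\xi)^2} \approx 1.393$.

b) The scaled sample variance $(n - 1) S^2 / \sigma^2$ is chi-squared distributed with $n - 1 = 9$ degrees of freedom. The corresponding quantiles are $q_{0.01} = 2.088$ and $q_{0.99} = 21.666$. In the question it is not specified whether a lower or an upper one-sided confidence interval should be computed. The formulas for a 99% one-sided confidence interval for the standard deviation are
\[
\left[ 0, \sqrt{\frac{(n - 1) S^2}{q_{0.01}}} \right) = [0, 2.892)
\]
and
\[
\left( \sqrt{\frac{(n - 1) S^2}{q_{0.99}}}, \infty \right) = (0.898, \infty).
\]
Both intervals are correct solutions.

Exercise 6. First we look at the logarithmic problem

  x:        -1      0      2      3      5
  log(y):   0.531   0      2.708  3.178  4.489

and fit a line $\log(y) = a + bx$ through this data. The formulas for $a$ and $b$ are
\[
a = \frac{1}{5} \sum_{i=1}^{5} \log(y_i) - b\, \frac{1}{5} \sum_{i=1}^{5} x_i
\quad \text{and} \quad
b = \frac{\sum_{i=1}^{5} x_i \log(y_i) - \left( \sum_{i=1}^{5} x_i \right) \left( \sum_{i=1}^{5} \log(y_i) \right) / 5}{\sum_{i=1}^{5} x_i^2 - \left( \sum_{i=1}^{5} x_i \right)^2 / 5}.
\]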
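The formulas for $a$ and $b$ can be evaluated directly. The following minimal Python sketch does so for the data of Exercise 6, taken here as $x = (-1, 0, 2, 3, 5)$ and $y = (1.7, 1, 15, 24, 89)$; the variable names and the helper `predict` are illustrative assumptions, not part of the original solution.

```python
import math

# Data of Exercise 6 (assumed values, see the lead-in).
x = [-1, 0, 2, 3, 5]
y = [1.7, 1, 15, 24, 89]
n = len(x)

# Linearise the problem: log(y) = a + b * x.
log_y = [math.log(v) for v in y]

sum_x = sum(x)
sum_x2 = sum(v * v for v in x)
sum_log_y = sum(log_y)
sum_x_log_y = sum(xi * li for xi, li in zip(x, log_y))

# Least-squares slope and intercept of the line through (x_i, log y_i).
b = (sum_x_log_y - sum_x * sum_log_y / n) / (sum_x2 - sum_x**2 / n)
a = sum_log_y / n - b * sum_x / n


def predict(t):
    """Fitted exponential function y_hat = e^(a + b t) = e^a * (e^b)^t."""
    return math.exp(a + b * t)


print(a, b, math.exp(a), math.exp(b))
print([predict(t) for t in x])
```

Note that fitting a line to $\log(y)$ minimises the squared error on the logarithmic scale, which is not the same as minimising the squared error of $y$ itself.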
With $\sum x_i = 9$, $\sum x_i^2 = 39$, $\sum \log(y_i) = 10.905$ and $\sum x_i \log(y_i) = 36.863$ we get $a = 0.821$ and $b = 0.756$. Thus
\[
\hat y = e^{a + bx} = e^a (e^b)^x = 2.27 \cdot 2.129^x
\]
fits an exponential function through the data.

Exercise 7.
a) We do not know the distribution of the lifetime, but as $n = 25$ is reasonably large we can assume that the sample mean is approximately normally distributed. Since we do not know the true standard deviation, we perform a t-test.

b) We test the hypothesis $H_0 : \mu \le 300$ versus $H_1 : \mu > 300$. The test statistic is given by
\[
t = \frac{\bar\xi - \mu_0}{S / \sqrt{n}} = \frac{305 - 300}{6 / \sqrt{25}} = 4.167.
\]
This test statistic is t-distributed with $n - 1 = 24$ degrees of freedom, and the corresponding one-sided quantile is $q_{0.95} = 1.71$. Since $t = 4.167 > 1.71$ we reject $H_0$, which means the new batteries have a significantly longer lifetime than the old ones.

Exercise 8. The number of heads is $\mathrm{Binomial}(100, p)$ distributed with an unknown $p$. Our estimate is $\hat p = 65/100 = 0.65$.

a) We perform the test for a proportion.

b) We test $H_0 : p \le 0.5$ versus $H_1 : p > 0.5$. The test variable
\[
z = \frac{\hat p - p}{\sqrt{p(1 - p)/n}} = \frac{0.65 - 0.5}{\sqrt{1/400}} = 3
\]
is approximately standard normally distributed. Since $z = 3 > q_{0.95} = 1.645$ we accept $H_1$ and believe in Eric's coin tossing skills.

c) The p-value is given by $p = P(Z > 3) = 1 - \Phi(3) \approx 0.0013$.

d) The new hypotheses are $H_0 : p = 0.5$ versus $H_1 : p = 0.65$. Under $H_1$ our estimator $\hat p$ is approximately $N(0.65,\ 0.65 \cdot 0.35/100)$ distributed. The test statistic $z$ is a linear transformation of $\hat p$ and therefore
\[
z = \frac{\hat p - 0.5}{\sqrt{0.5 \cdot 0.5/100}} \sim N\left( \frac{0.15}{\sqrt{0.5 \cdot 0.5/100}},\ \frac{0.65 \cdot 0.35/100}{0.5 \cdot 0.5/100} \right) = N(3, 0.91).
\]
We reject $H_0$ if $z > 1.645$, therefore
\[
\text{power} = P(\text{reject } H_0 \mid H_1) = P(z > 1.645 \mid H_1) = P\left( \frac{z - 3}{\sqrt{0.91}} > \frac{1.645 - 3}{\sqrt{0.91}} \,\Big|\, H_1 \right) = 1 - \Phi\left( \frac{1.645 - 3}{\sqrt{0.91}} \right) = 1 - \Phi(-1.42) \approx 0.92.
\]
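The calculations of Exercise 8 can be reproduced numerically. The following minimal Python sketch computes the test statistic, the p-value and the power under the normal approximation; it expresses $\Phi$ through the error function `math.erf`. The names `phi`, `z_crit` and the other variables are illustrative assumptions, and the critical value 1.645 is taken from the solution rather than computed.

```python
import math


def phi(z):
    """Standard normal distribution function, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


n, heads = 100, 65
p0, p1 = 0.5, 0.65   # value of p under H0 and under the alternative H1
z_crit = 1.645       # tabulated 0.95 quantile of N(0, 1)

p_hat = heads / n
se0 = math.sqrt(p0 * (1 - p0) / n)   # standard error of p_hat under H0

# b) Test statistic and decision at level 0.05.
z = (p_hat - p0) / se0
reject_h0 = z > z_crit

# c) One-sided p-value.
p_value = 1 - phi(z)

# d) Under H1 the statistic z is approximately N(mean_z, sd_z^2).
mean_z = (p1 - p0) / se0
sd_z = math.sqrt((p1 * (1 - p1)) / (p0 * (1 - p0)))
power = 1 - phi((z_crit - mean_z) / sd_z)

print(z, reject_h0, p_value, power)
```

Because the binomial distribution is discrete, an exact binomial calculation would give slightly different numbers; the sketch follows the normal approximation used in the solution.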