LUND UNIVERSITY
DEPARTMENT OF STATISTICS
MATS HAGNELL
STA102:4

Examination in multivariate methods, Saturday 27 August 2005

In addition to Körner's collection of tables and formulas and a pocket calculator, the textbook Marcoulides-Hershberger, Multivariate Statistical Methods, is also a permitted aid.

1. Consider the following matrices:

$$A = \begin{pmatrix} 20 & 7 \\ 7 & 20 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 3 & 4 \\ 4 & 6 \\ 7 & 8 \\ 8 & 9 \end{pmatrix}.$$

a) Determine A*B and B*A!
b) Determine the eigenvalues and eigenvectors of A! Also show that the two eigenvectors are orthogonal!

2. We have 8 observations on the random vector X = (X1, X2, X3): (2, 4, 3), (2, 6, 4), (6, 5, 8), (7, 6, 10), (5, 6, 10), (7, 6, 10), (6, 5, 8), (7, 6, 10). Compute the mean vector, the covariance matrix and the correlation matrix!

3. The random vector X = (X1, X2, X3, X4, X5) is 5-dimensional N(μ, Σ), where μ = (5, 2, 6, 4, 3) and

$$\Sigma = \begin{pmatrix} 10 & 7 & 4 & 3 & 2 \\ 7 & 8 & 4 & 3 & 1 \\ 4 & 4 & 16 & 5 & 3 \\ 3 & 3 & 5 & 9 & 6 \\ 2 & 1 & 3 & 6 & 10 \end{pmatrix}.$$

What are the probability distributions of (X2, X3, X4, X5), (X2, X3, X4) and (X2, X3), respectively?
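For problem 3, recall that any sub-vector of a multivariate normal vector is again multivariate normal, with the matching elements of μ and the matching rows and columns of Σ. The following is a minimal Python sketch of that selection step; the helper name `marginal` is hypothetical and only serves the illustration.

```python
# Sketch: marginal distribution of a sub-vector of a multivariate normal vector.
# `marginal` is a hypothetical helper name, not part of any library.
mu = [5, 2, 6, 4, 3]
sigma = [
    [10, 7, 4, 3, 2],
    [7, 8, 4, 3, 1],
    [4, 4, 16, 5, 3],
    [3, 3, 5, 9, 6],
    [2, 1, 3, 6, 10],
]


def marginal(mu, sigma, keep):
    """Return (mean vector, covariance matrix) for the 0-based indices in `keep`."""
    sub_mu = [mu[i] for i in keep]
    sub_sigma = [[sigma[i][j] for j in keep] for i in keep]
    return sub_mu, sub_sigma


# (X2, X3, X4, X5), (X2, X3, X4) and (X2, X3) correspond to these index sets.
mu_2345, sigma_2345 = marginal(mu, sigma, [1, 2, 3, 4])
mu_234, sigma_234 = marginal(mu, sigma, [1, 2, 3])
mu_23, sigma_23 = marginal(mu, sigma, [1, 2])

print(mu_23, sigma_23)
```

No computation beyond selecting elements is needed, which is why the answer can be read directly off μ and Σ.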
4. From a sample of 60 observations on the random vector X1 = (X11, X12, X13), which is 3-dimensional N(μ1, Σ), we obtain

$$\bar{x}_1 = (9, 8, 4) \quad\text{and}\quad S_1 = \begin{pmatrix} 9 & 3 & 2 \\ 3 & 6 & 1 \\ 2 & 1 & 8 \end{pmatrix}.$$

From an independent sample of 40 observations on the random vector X2 = (X21, X22, X23), which is 3-dimensional N(μ2, Σ), we obtain

$$\bar{x}_2 = (13, 12, 10) \quad\text{and}\quad S_2 = \begin{pmatrix} 10 & 4 & 3 \\ 4 & 7 & 2 \\ 3 & 2 & 9 \end{pmatrix}.$$

From an independent sample of 40 observations on the random vector X3 = (X31, X32, X33), which is 3-dimensional N(μ3, Σ), we obtain

$$\bar{x}_3 = (3, 4, 10) \quad\text{and}\quad S_3 = \begin{pmatrix} 11 & 3 & 5 \\ 3 & 9 & 4 \\ 5 & 4 & 8 \end{pmatrix}.$$

Estimate μ1, μ2, μ3 and Σ!

5. For a sample of 784 individuals, the effects of certain work-related variables on various measures of satisfaction were measured. The variables are Y1 = satisfaction with the supervisor, Y2 = satisfaction with the career, Y3 = satisfaction with the pay, Y4 = satisfaction with the workload, Y5 = identification with the company, Y6 = satisfaction with the kind of work, Y7 = general satisfaction, and X1 = feedback, X2 = task significance, X3 = task variety, X4 = task identity and X5 = autonomy. Output from proc CANCORR is given in the Appendix.
a) How many pairs of canonical variables are significant?
b) Try to interpret the significant canonical variables!
6. 35 houses offered for sale are measured on three variables: PRICE = asking price in thousands of dollars, BEDROOMS = number of bedrooms and AREA = area in square feet. The houses were divided into three groups by the variable GROUP, which takes three values according to the community in which the house is located.
a) How well do all three variables discriminate between the three groups defined by the variable GROUP? Output from proc STEPDISC is given in the Appendix.
b) Determine how well the two variables PRICE and BEDROOMS discriminate between the three groups defined by the variable GROUP! Output from proc DISCRIM is given in the Appendix.

7. For a sample of 145 schoolchildren, 5 psychological tests were measured: comprehension, sentence completion, word meaning, addition and counting dots. An exploratory factor analysis is carried out on the 5 variables. Try to interpret its result! Output from proc FACTOR is given in the Appendix. Why does the two-factor model seem to be preferable?

8. Continuation of problem 7: We want to test the assumption that a two-factor model is appropriate, in which the first 3 tests load on one factor and the last two tests load on another factor. A confirmatory factor analysis is therefore carried out with LISREL.
a) Briefly explain the code in the LISREL program (in the Appendix) used to test the hypothesis!
b) Draw conclusions about the hypothesis from the output of the LISREL program (in the Appendix)!
c) Illustrate the estimated model in a path diagram!
d) Explain the number of degrees of freedom in the model!
Appendix

Problem 5.

The CANCORR Procedure

Canonical Correlation Analysis

                      Adjusted   Approximate       Squared
      Canonical      Canonical      Standard     Canonical
    Correlation    Correlation         Error   Correlation
1      0.553706       0.545622      0.024780      0.306591
2      0.236404       0.214497      0.033740      0.055887
3      0.119186        .            0.035229      0.014205
4      0.072228        .            0.035551      0.005217
5      0.057270        .            0.035620      0.003280

Eigenvalues of Inv(E)*H = CanRsq/(1-CanRsq)

    Eigenvalue   Difference   Proportion   Cumulative
1       0.4421       0.3830       0.8433       0.8433
2       0.0592       0.0448       0.1129       0.9562
3       0.0144       0.0092       0.0275       0.9837
4       0.0052       0.0020       0.0100       0.9937
5       0.0033                    0.0063       1.0000

Test of H0: The canonical correlations in the current row and all that follow are zero

    Likelihood   Approximate
         Ratio       F Value   Num DF   Den DF   Pr > F
1   0.63988477         10.40       35   3249.9   <.0001
2   0.92280941          2.62       24   2697.9   <.0001
3   0.97743541          1.18       15   2137.1   0.2776
4   0.99152030          0.83        8     1550   0.5790
5   0.99672015          0.85        3      776   0.4661

Multivariate Statistics and F Approximations

S=5   M=0.5   N=385

Statistic                      Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.63988477     10.40       35   3249.9   <.0001
Pillai's Trace            0.38517977      9.25       35     3880   <.0001
Hotelling-Lawley Trace    0.52428968     11.54       35   2184.6   <.0001
Roy's Greatest Root       0.44214935     49.02        7      776   <.0001

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
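The eigenvalue column follows from the relation printed in the output, CanRsq/(1-CanRsq), and each likelihood ratio (Wilks' Lambda) is the product of 1 - CanRsq over the current and all following canonical correlations. A minimal Python sketch that recomputes both from the five canonical correlations; the function name `likelihood_ratios` is hypothetical.

```python
import math

# Canonical correlations copied from the CANCORR output.
can_corr = [0.553706, 0.236404, 0.119186, 0.072228, 0.057270]
can_rsq = [r * r for r in can_corr]

# Eigenvalues of Inv(E)*H = CanRsq / (1 - CanRsq), as stated in the output.
eigenvalues = [r2 / (1.0 - r2) for r2 in can_rsq]


def likelihood_ratios(rsq):
    """Wilks' Lambda for row k: product of (1 - r_i^2) over i = k, ..., last."""
    return [math.prod(1.0 - r2 for r2 in rsq[k:]) for k in range(len(rsq))]


wilks = likelihood_ratios(can_rsq)

for k, (ev, lam) in enumerate(zip(eigenvalues, wilks), start=1):
    print(k, round(ev, 4), round(lam, 8))
```

The first likelihood ratio is also the overall Wilks' Lambda shown under the multivariate statistics.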
The CANCORR Procedure

Canonical Correlation Analysis

Raw Canonical Coefficients for the VAR Variables

                              work1          work2          work3          work4          work5
X1  Feedback           0.4217037005   0.3428520819   -0.857665357   -0.788408809   0.0308427228
X2  Task_signif        0.1951059282    -0.66829859   0.4434261983   -0.269128849   0.9832287828
X3  Task_variety       0.1676125195   -0.853155876   -0.259213388   0.4687568565   -0.914141074
X4  Task_iden          -0.022889257   0.3560702979    -0.42310623   1.0423235086   0.5243667708
X5  Autonomy           0.4596557262   0.7287240893   0.9799053152   -0.168165047   -0.439242006

Raw Canonical Coefficients for the WITH Variables

                      satisfaction1  satisfaction2  satisfaction3  satisfaction4  satisfaction5
Y1  Sup_satis          0.4251801045   -0.087992909   0.4918143621   -0.128429606   -0.482345255
Y2  Career_satis       0.2088846933    0.436262871    -0.78320006   -0.340530647   -0.749890932
Y3  Fin_satis          -0.035894963   -0.092909718    -0.47784774   -0.605914061   0.3457245048
Y4  Load_satis          0.023525029   0.9260127356    -0.00651068   0.4043753719   0.3115896198
Y5  Company_satis      0.2902803477   -0.101101165   0.2831089034   -0.446854955   0.7029742619
Y6  Kind_satis         0.5157248004   -0.554289712    -0.41249444   0.6875998752   0.1795657307
Y7  Gen_satis           -0.11014262   -0.031722209   0.9284585695   0.2738895655   -0.014489511
The CANCORR Procedure

Canonical Correlation Analysis

Standardized Canonical Coefficients for the VAR Variables

                        work1     work2     work3     work4     work5
X1  Feedback           0.4217    0.3429   -0.8577   -0.7884    0.0308
X2  Task_signif        0.1951   -0.6683    0.4434   -0.2691    0.9832
X3  Task_variety       0.1676   -0.8532   -0.2592    0.4688   -0.9141
X4  Task_iden         -0.0229    0.3561   -0.4231    1.0423    0.5244
X5  Autonomy           0.4597    0.7287    0.9799   -0.1682   -0.4392

Standardized Canonical Coefficients for the WITH Variables

                      satisfaction1  satisfaction2  satisfaction3  satisfaction4  satisfaction5
Y1  Sup_satis                0.4252        -0.0880         0.4918        -0.1284        -0.4823
Y2  Career_satis             0.2089         0.4363        -0.7832        -0.3405        -0.7499
Y3  Fin_satis               -0.0359        -0.0929        -0.4778        -0.6059         0.3457
Y4  Load_satis               0.0235         0.9260        -0.0065         0.4044         0.3116
Y5  Company_satis            0.2903        -0.1011         0.2831        -0.4469         0.7030
Y6  Kind_satis               0.5157        -0.5543        -0.4125         0.6876         0.1796
Y7  Gen_satis               -0.1101        -0.0317         0.9285         0.2739        -0.0145

Canonical Structure

Correlations Between the VAR Variables and Their Canonical Variables

                        work1     work2     work3     work4     work5
X1  Feedback           0.8293    0.1093   -0.4853   -0.2469    0.0611
X2  Task_signif        0.7304   -0.4366    0.2001    0.0021    0.4857
X3  Task_variety       0.7533   -0.4661   -0.1056    0.3020   -0.3360
X4  Task_iden          0.6160    0.2225   -0.2053    0.6614    0.3026
X5  Autonomy           0.8606    0.2660    0.3886    0.1484   -0.1246

Correlations Between the WITH Variables and Their Canonical Variables

                      satisfaction1  satisfaction2  satisfaction3  satisfaction4  satisfaction5
Y1  Sup_satis                0.7564         0.0446         0.3395        -0.1294        -0.3370
Y2  Career_satis             0.6439         0.3582        -0.1717        -0.3530        -0.3335
Y3  Fin_satis                0.3872         0.0373        -0.1767        -0.5348         0.4148
Y4  Load_satis               0.3772         0.7919        -0.0054         0.2886         0.3341
Y5  Company_satis            0.6532         0.1084         0.2092        -0.4376         0.4346
Y6  Kind_satis               0.8040        -0.2416        -0.2348         0.4052         0.1964
Y7  Gen_satis                0.5024         0.1628         0.4933        -0.1890         0.0678

Correlations Between the VAR Variables and the Canonical Variables of the WITH Variables

                      satisfaction1  satisfaction2  satisfaction3  satisfaction4  satisfaction5
X1  Feedback                 0.4592         0.0258        -0.0578        -0.0178         0.0035
X2  Task_signif              0.4044        -0.1032         0.0239         0.0002         0.0278
X3  Task_variety             0.4171        -0.1102        -0.0126         0.0218        -0.0192
X4  Task_iden                0.3411         0.0526        -0.0245         0.0478         0.0173
X5  Autonomy                 0.4765         0.0629         0.0463         0.0107        -0.0071

Correlations Between the WITH Variables and the Canonical Variables of the VAR Variables

                        work1     work2     work3     work4     work5
Y1  Sup_satis          0.4188    0.0105    0.0405   -0.0093   -0.0193
Y2  Career_satis       0.3565    0.0847   -0.0205   -0.0255   -0.0191
Y3  Fin_satis          0.2144    0.0088   -0.0211   -0.0386    0.0238
Y4  Load_satis         0.2088    0.1872   -0.0006    0.0208    0.0191
Y5  Company_satis      0.3617    0.0256    0.0249   -0.0316    0.0249
Y6  Kind_satis         0.4452   -0.0571   -0.0280    0.0293    0.0112
Y7  Gen_satis          0.2782    0.0385    0.0588   -0.0136    0.0039
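The two cross-correlation tables are linked to the structure correlations in a simple way: the correlation of a variable with the k-th canonical variable of the opposite set equals its correlation with its own k-th canonical variable times the k-th canonical correlation. A minimal Python sketch checking this for X1 (Feedback) and Y1 (Sup_satis); the function name `cross_loadings` is hypothetical, and the inputs are the rounded values from the output, so agreement is only expected to about four decimals.

```python
# Canonical correlations from the CANCORR output.
can_corr = [0.553706, 0.236404, 0.119186, 0.072228, 0.057270]

# Structure correlations (variable with its OWN canonical variables), rounded as printed.
structure_x1 = [0.8293, 0.1093, -0.4853, -0.2469, 0.0611]   # X1 Feedback
structure_y1 = [0.7564, 0.0446, 0.3395, -0.1294, -0.3370]   # Y1 Sup_satis


def cross_loadings(structure_row, can_corr):
    """Correlations with the OPPOSITE set's canonical variables."""
    return [s * r for s, r in zip(structure_row, can_corr)]


cross_x1 = cross_loadings(structure_x1, can_corr)
cross_y1 = cross_loadings(structure_y1, can_corr)

print([round(c, 4) for c in cross_x1])
print([round(c, 4) for c in cross_y1])
```

Because the later canonical correlations are small, the cross-correlations beyond the first pair shrink towards zero even where the structure correlations are sizeable.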
The CANCORR Procedure

Canonical Redundancy Analysis

Raw Variance of the VAR Variables Explained by

                   Their Own                              The Opposite
              Canonical Variables                      Canonical Variables
Canonical
Variable                  Cumulative    Canonical                  Cumulative
Number      Proportion    Proportion     R-Square    Proportion    Proportion
1               0.5818        0.5818       0.3066        0.1784        0.1784
2               0.1080        0.6898       0.0559        0.0060        0.1844
3               0.0960        0.7858       0.0142        0.0014        0.1858
4               0.1223        0.9081       0.0052        0.0006        0.1864
5               0.0919        1.0000       0.0033        0.0003        0.1867

Raw Variance of the WITH Variables Explained by

                   Their Own                              The Opposite
              Canonical Variables                      Canonical Variables
Canonical
Variable                  Cumulative    Canonical                  Cumulative
Number      Proportion    Proportion     R-Square    Proportion    Proportion
1               0.3721        0.3721       0.3066        0.1141        0.1141
2               0.1222        0.4943       0.0559        0.0068        0.1209
3               0.0740        0.5683       0.0142        0.0011        0.1220
4               0.1289        0.6972       0.0052        0.0007        0.1226
5               0.1058        0.8030       0.0033        0.0003        0.1230

Standardized Variance of the VAR Variables Explained by

                   Their Own                              The Opposite
              Canonical Variables                      Canonical Variables
Canonical
Variable                  Cumulative    Canonical                  Cumulative
Number      Proportion    Proportion     R-Square    Proportion    Proportion
1               0.5818        0.5818       0.3066        0.1784        0.1784
2               0.1080        0.6898       0.0559        0.0060        0.1844
3               0.0960        0.7858       0.0142        0.0014        0.1858
4               0.1223        0.9081       0.0052        0.0006        0.1864
5               0.0919        1.0000       0.0033        0.0003        0.1867

Standardized Variance of the WITH Variables Explained by

                   Their Own                              The Opposite
              Canonical Variables                      Canonical Variables
Canonical
Variable                  Cumulative    Canonical                  Cumulative
Number      Proportion    Proportion     R-Square    Proportion    Proportion
1               0.3721        0.3721       0.3066        0.1141        0.1141
2               0.1222        0.4943       0.0559        0.0068        0.1209
3               0.0740        0.5683       0.0142        0.0011        0.1220
4               0.1289        0.6972       0.0052        0.0007        0.1226
5               0.1058        0.8030       0.0033        0.0003        0.1230

Squared Multiple Correlations Between the VAR Variables and the First M Canonical Variables of the WITH Variables

M                           1         2         3         4         5
X1  Feedback           0.2109    0.2115    0.2149    0.2152    0.2152
X2  Task_signif        0.1635    0.1742    0.1748    0.1748    0.1755
X3  Task_variety       0.1740    0.1861    0.1863    0.1868    0.1871
X4  Task_iden          0.1163    0.1191    0.1197    0.1220    0.1223
X5  Autonomy           0.2271    0.2310    0.2332    0.2333    0.2333

Squared Multiple Correlations Between the WITH Variables and the First M Canonical Variables of the VAR Variables

M                           1         2         3         4         5
Y1  Sup_satis          0.1754    0.1755    0.1772    0.1773    0.1776
Y2  Career_satis       0.1271    0.1343    0.1347    0.1353    0.1357
Y3  Fin_satis          0.0460    0.0461    0.0465    0.0480    0.0486
Y4  Load_satis         0.0436    0.0787    0.0787    0.0791    0.0795
Y5  Company_satis      0.1308    0.1315    0.1321    0.1331    0.1337
Y6  Kind_satis         0.1982    0.2014    0.2022    0.2031    0.2032
Y7  Gen_satis          0.0774    0.0789    0.0823    0.0825    0.0825
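Since the canonical variables of one set are mutually uncorrelated, the squared multiple correlation of a variable with the first M canonical variables of the opposite set is the running sum of its squared correlations with those canonical variables. A minimal Python sketch for X1 (Feedback), using the rounded cross-correlations printed earlier in the output; the function name `cumulative_smc` is hypothetical.

```python
from itertools import accumulate

# Correlations between X1 (Feedback) and satisfaction1..satisfaction5,
# copied (rounded) from the canonical structure output.
cross_x1 = [0.4592, 0.0258, -0.0578, -0.0178, 0.0035]


def cumulative_smc(cross_row):
    """Running sum of squared cross-correlations, for M = 1, 2, ..."""
    return list(accumulate(c * c for c in cross_row))


smc_x1 = cumulative_smc(cross_x1)
print([round(v, 4) for v in smc_x1])
```

The first canonical variable accounts for almost all of the roughly 21.5 percent that is explained, which agrees with the redundancy tables.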
The STEPDISC Procedure

The Method for Selecting Variables is STEPWISE

Observations    35    Variable(s) in the Analysis       3
Class Levels     3    Variable(s) will be Included      0
                      Significance Level to Enter     0.5
                      Significance Level to Stay      0.5

Class Level Information

         Variable
GROUP    Name        Frequency     Weight    Proportion
1        _1                  9     9.0000      0.257143
2        _2                 13    13.0000      0.371429
3        _3                 13    13.0000      0.371429

The SAS System

The STEPDISC Procedure
Stepwise Selection: Step 1

Statistics for Entry, DF = 2, 32

Variable    R-Square    F Value    Pr > F    Tolerance
PRICE         0.1946       3.86    0.0314       1.0000
BEDROOMS      0.1309       2.41    0.1060       1.0000
AREA          0.4865      15.16    <.0001       1.0000

Variable AREA will be entered.

Variable(s) that have been Entered: AREA

Multivariate Statistics

Statistic                                    Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda                             0.513509     15.16        2       32   <.0001
Pillai's Trace                            0.486491     15.16        2       32   <.0001
Average Squared Canonical Correlation     0.243246

The STEPDISC Procedure
Stepwise Selection: Step 2

Statistics for Removal, DF = 2, 32

Variable    R-Square    F Value    Pr > F
AREA          0.4865      15.16    <.0001

No variables can be removed.

Statistics for Entry, DF = 2, 31

             Partial
Variable    R-Square    F Value    Pr > F    Tolerance
PRICE         0.0332       0.53    0.5928       0.7774
BEDROOMS      0.1363       2.45    0.1033       0.9974

Variable BEDROOMS will be entered.

Variable(s) that have been Entered: BEDROOMS AREA
Multivariate Statistics

Statistic                                    Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda                             0.443540      7.77        4       62   <.0001
Pillai's Trace                            0.607263      6.98        4       64   0.0001
Average Squared Canonical Correlation     0.303632

The STEPDISC Procedure
Stepwise Selection: Step 3

Statistics for Removal, DF = 2, 31

             Partial
Variable    R-Square    F Value    Pr > F
BEDROOMS      0.1363       2.45    0.1033
AREA          0.4897      14.87    <.0001

No variables can be removed.

Statistics for Entry, DF = 2, 30

             Partial
Variable    R-Square    F Value    Pr > F    Tolerance
PRICE         0.0410       0.64    0.5334       0.4801

No variables can be entered.

No further steps are possible.

The STEPDISC Procedure
Stepwise Selection Summary

                                                                                    Average
                                                                                    Squared
       Number                        Partial                       Wilks'    Pr <   Canonical     Pr >
Step       In  Entered   Removed    R-Square  F Value  Pr > F      Lambda  Lambda   Correlation   ASCC
1           1  AREA                   0.4865    15.16  <.0001  0.51350886  <.0001   0.24324557  <.0001
2           2  BEDROOMS               0.1363     2.45  0.1033  0.44354014  <.0001   0.30363167  0.0001

The DISCRIM Procedure

Observations    35    DF Total              34
Variables        2    DF Within Classes     32
Classes          3    DF Between Classes     2

Pooled Covariance Matrix Information

Covariance Matrix Rank                                            2
Natural Log of the Determinant of the Covariance Matrix     8.69823

The DISCRIM Procedure
Classification Summary for Calibration Data: WORK.FIRMS
Resubstitution Summary using Linear Discriminant Function
Number of Observations and Percent Classified into GROUP

From GROUP          1          2          3      Total
1                   6          2          1          9
                66.67      22.22      11.11     100.00
2                   2          7          4         13
                15.38      53.85      30.77     100.00
3                   4          3          6         13
                30.77      23.08      46.15     100.00
Total              12         12         11         35
                34.29      34.29      31.43     100.00
Priors        0.33333    0.33333    0.33333

Error Count Estimates for GROUP

                    1          2          3      Total
Rate           0.3333     0.4615     0.5385     0.4444
Priors         0.3333     0.3333     0.3333

The DISCRIM Procedure
Classification Summary for Calibration Data: WORK.FIRMS
Cross-validation Summary using Linear Discriminant Function

Number of Observations and Percent Classified into GROUP

From GROUP          1          2          3      Total
1                   6          2          1          9
                66.67      22.22      11.11     100.00
2                   2          6          5         13
                15.38      46.15      38.46     100.00
3                   4          4          5         13
                30.77      30.77      38.46     100.00
Total              12         12         11         35
                34.29      34.29      31.43     100.00
Priors        0.33333    0.33333    0.33333

Error Count Estimates for GROUP

                    1          2          3      Total
Rate           0.3333     0.5385     0.6154     0.4957
Priors         0.3333     0.3333     0.3333

The DISCRIM Procedure
Test of Homogeneity of Within Covariance Matrices

Notation:
K = Number of Groups
P = Number of Variables
N = Total Number of Observations - Number of Groups
N(i) = Number of Observations in the i'th Group - 1

$$V = \frac{\prod_i \left|\text{Within SS Matrix}(i)\right|^{N(i)/2}}{\left|\text{Pooled SS Matrix}\right|^{N/2}}$$

$$\mathrm{RHO} = 1.0 - \left[\sum_i \frac{1}{N(i)} - \frac{1}{N}\right] \frac{2P^2 + 3P - 1}{6(P+1)(K-1)}$$

$$\mathrm{DF} = 0.5\,(K-1)\,P\,(P+1)$$

Under the null hypothesis:

$$-2\,\mathrm{RHO}\,\ln\left[\frac{N^{PN/2}\,V}{\prod_i N(i)^{PN(i)/2}}\right]$$

is distributed approximately as Chi-Square(DF).

Chi-Square    DF    Pr > ChiSq
  3.754264     6        0.7099

Since the Chi-Square value is not significant at the 0.1 level, a pooled covariance matrix will be used in the discriminant function.
Reference: Morrison, D.F. (1976) Multivariate Statistical Methods p252.

The DISCRIM Procedure
Univariate Test Statistics

F Statistics, Num DF=2, Den DF=32

                Total      Pooled     Between
             Standard    Standard    Standard                R-Square
Variable    Deviation   Deviation   Deviation   R-Square   / (1-RSq)   F Value   Pr > F
PRICE        134.9003    124.7941     71.8275     0.1946      0.2416      3.86   0.0314
BEDROOMS       0.7886      0.7578      0.3443     0.1309      0.1506      2.41   0.1060

Average R-Square
Unweighted              0.1627108
Weighted by Variance    0.1945578

Multivariate Statistics and F Approximations

S=2   M=-0.5   N=14.5

Statistic                      Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.69334575      3.11        4       62   0.0212
Pillai's Trace            0.33280204      3.19        4       64   0.0187
Hotelling-Lawley Trace    0.40456939      3.10        4   36.185   0.0270
Roy's Greatest Root       0.25891168      4.14        2       32   0.0251

The DISCRIM Procedure
Classification Summary for Calibration Data: WORK.FIRMS
Resubstitution Summary using Linear Discriminant Function

Number of Observations and Percent Classified into GROUP
From GROUP          1          2          3      Total
1                   6          2          1          9
                66.67      22.22      11.11     100.00
2                   2          7          4         13
                15.38      53.85      30.77     100.00
3                   4          3          6         13
                30.77      23.08      46.15     100.00
Total              12         12         11         35
                34.29      34.29      31.43     100.00
Priors        0.33333    0.33333    0.33333

Error Count Estimates for GROUP

                    1          2          3      Total
Rate           0.3333     0.4615     0.5385     0.4444
Priors         0.3333     0.3333     0.3333

The DISCRIM Procedure
Classification Results for Calibration Data: WORK.FIRMS
Resubstitution Results using Linear Discriminant Function

Number of Observations and Average Posterior Probabilities Classified into GROUP

From GROUP          1          2          3
1                   6          2          1
               0.6154     0.5446     0.4177
2                   2          7          4
               0.5605     0.6154     0.5414
3                   4          3          6
               0.5137     0.4432     0.5243
Total              12         12         11
               0.5724     0.5606     0.5208
Priors        0.33333    0.33333    0.33333

Posterior Probability Error Rate Estimates for GROUP

Estimate            1          2          3      Total
Stratified     0.3454     0.4453     0.5450     0.4453
Unstratified   0.4113     0.4234     0.5089     0.4479
Priors         0.3333     0.3333     0.3333

The DISCRIM Procedure
Classification Summary for Calibration Data: WORK.FIRMS
Cross-validation Summary using Linear Discriminant Function

Number of Observations and Percent Classified into GROUP

From GROUP          1          2          3      Total
1                   6          2          1          9
                66.67      22.22      11.11     100.00
2                   2          6          5         13
                15.38      46.15      38.46     100.00
3                   4          4          5         13
                30.77      30.77      38.46     100.00
Total              12         12         11         35
                34.29      34.29      31.43     100.00
Priors        0.33333    0.33333    0.33333

Error Count Estimates for GROUP

                    1          2          3      Total
Rate           0.3333     0.5385     0.6154     0.4957
Priors         0.3333     0.3333     0.3333

The DISCRIM Procedure
Classification Results for Calibration Data: WORK.FIRMS
Cross-validation Results using Linear Discriminant Function

Number of Observations and Average Posterior Probabilities Classified into GROUP

From GROUP          1          2          3
1                   6          2          1
               0.5705     0.5867     0.4272
2                   2          6          5
               0.5920     0.6275     0.5470
3                   4          4          5
               0.5538     0.4970     0.5114
Total              12         12         11
               0.5685     0.5772     0.5199
Priors        0.33333    0.33333    0.33333

Posterior Probability Error Rate Estimates for GROUP

Estimate            1          2          3      Total
Stratified     0.3582     0.4271     0.5454     0.4436
Unstratified   0.4152     0.4063     0.5098     0.4438
Priors         0.3333     0.3333     0.3333
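The error count estimates in the classification summaries can be recomputed from the classification tables: the rate for a group is the share of its observations classified into another group, and the total is the prior-weighted average of the group rates. A minimal Python sketch using the cross-validation and resubstitution counts above; the function name `error_rates` is hypothetical.

```python
# Rows: true GROUP 1..3, columns: GROUP classified into (counts from the output).
cross_validation = [
    [6, 2, 1],
    [2, 6, 5],
    [4, 4, 5],
]
resubstitution = [
    [6, 2, 1],
    [2, 7, 4],
    [4, 3, 6],
]
priors = [1 / 3, 1 / 3, 1 / 3]  # equal priors, as printed in the output


def error_rates(confusion, priors):
    """Per-group misclassification rates and their prior-weighted total."""
    rates = [1 - row[g] / sum(row) for g, row in enumerate(confusion)]
    total = sum(p * r for p, r in zip(priors, rates))
    return rates, total


cv_rates, cv_total = error_rates(cross_validation, priors)
rs_rates, rs_total = error_rates(resubstitution, priors)

print([round(r, 4) for r in cv_rates], round(cv_total, 4))
print([round(r, 4) for r in rs_rates], round(rs_total, 4))
```

The cross-validated total is higher than the resubstitution total, as expected when each observation is classified by a rule that was fitted without it.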
The FACTOR Procedure
Initial Factor Method: Principal Components

Prior Communality Estimates: ONE

Eigenvalues of the Correlation Matrix: Total = 5  Average = 1

     Eigenvalue    Difference    Proportion    Cumulative
1    2.58746987    1.16575215        0.5175        0.5175
2    1.42171772    1.00652661        0.2843        0.8018
3    0.41519110    0.10409071        0.0830        0.8849
4    0.31110040    0.04657948        0.0622        0.9471
5    0.26452092                      0.0529        1.0000

2 factors will be retained by the MINEIGEN criterion.

[Scree Plot of Eigenvalues: Eigenvalues (vertical axis, 0.0 to 3.0) plotted against Number (horizontal axis, 0 to 5) for the five eigenvalues listed above]
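The Proportion and Cumulative columns are the eigenvalues divided by their total, and two eigenvalues exceed the average eigenvalue of 1. A minimal Python sketch of these computations; taking "greater than the average eigenvalue" as the MINEIGEN cut-off is an assumption made here for illustration, chosen because it agrees with the two factors retained in the output.

```python
from itertools import accumulate

# Eigenvalues of the correlation matrix, copied from the FACTOR output.
eigenvalues = [2.58746987, 1.42171772, 0.41519110, 0.31110040, 0.26452092]

total = sum(eigenvalues)                      # about 5, the number of variables
proportions = [ev / total for ev in eigenvalues]
cumulative = list(accumulate(proportions))

# Assumed retention rule: keep eigenvalues above the average eigenvalue (1 here).
average = total / len(eigenvalues)
n_retained = sum(1 for ev in eigenvalues if ev > average)

print([round(p, 4) for p in proportions])
print([round(c, 4) for c in cumulative])
print(n_retained)
```

The first two components together account for about 80 percent of the total variance, while each of the remaining three contributes less than 10 percent.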
The FACTOR Procedure
Initial Factor Method: Principal Components

Factor Pattern

          Factor1    Factor2
test1     0.85977   -0.29203
test2     0.87251   -0.19571
test3     0.84192   -0.29457
test4     0.47792    0.74734
test5     0.38698    0.80799

Variance Explained by Each Factor

  Factor1      Factor2
2.5874699    1.4217177

Final Communality Estimates: Total = 4.009188

     test1         test2         test3         test4         test5
0.82449215    0.79957537    0.79559768    0.78692640    0.80259599

The FACTOR Procedure
Initial Factor Method: Iterated Principal Factor Analysis

Prior Communality Estimates: ONE

Preliminary Eigenvalues: Total = 5  Average = 1

     Eigenvalue    Difference    Proportion    Cumulative
1    2.58746987    1.16575215        0.5175        0.5175
2    1.42171772    1.00652661        0.2843        0.8018
3    0.41519110    0.10409071        0.0830        0.8849
4    0.31110040    0.04657948        0.0622        0.9471
5    0.26452092                      0.0529        1.0000

2 factors will be retained by the NFACTOR criterion.

Iteration    Change    Communalities
 1           0.2131    0.82449   0.79958   0.79560   0.78693   0.80260
 2           0.1079    0.77053   0.73280   0.72295   0.67904   0.70591
 3           0.0546    0.75532   0.71087   0.69607   0.62448   0.65842
 4           0.0276    0.75209   0.70379   0.68539   0.59683   0.63515
 5           0.0141    0.75228   0.70158   0.68068   0.58274   0.62385
 6           0.0073    0.75321   0.70091   0.67833   0.57543   0.61851
 7           0.0039    0.75412   0.70072   0.67702   0.57152   0.61614
 8           0.0022    0.75484   0.70066   0.67622   0.56932   0.61524
 9           0.0013    0.75537   0.70064   0.67571   0.56797   0.61508
10           0.0009    0.75575   0.70062   0.67537   0.56705   0.61529

Convergence criterion satisfied.

Eigenvalues of the Reduced Correlation Matrix: Total = 3.31477718  Average = 0.66295544

     Eigenvalue    Difference    Proportion    Cumulative
1    2.28220070    1.25031114        0.6885        0.6885
2    1.03188956    1.00687378        0.3113        0.9998
3    0.02501578    0.02604204        0.0075        1.0073
4    -.00102626    0.02227632       -0.0003        1.0070
5    -.02330258                     -0.0070        1.0000
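For the principal-components factor pattern shown above, each final communality is the sum of squared loadings across the two factors for that test, and the variance explained by a factor is the sum of squared loadings down its column. A minimal Python sketch of both sums, using the loadings as printed (so agreement is to about four decimals); the variable names are illustrative only.

```python
# Principal-components factor pattern (rows: test1..test5; columns: Factor1, Factor2).
pattern = [
    [0.85977, -0.29203],
    [0.87251, -0.19571],
    [0.84192, -0.29457],
    [0.47792, 0.74734],
    [0.38698, 0.80799],
]

# Communality of a variable: row sum of squared loadings.
communalities = [sum(l * l for l in row) for row in pattern]

# Variance explained by a factor: column sum of squared loadings.
variance_explained = [sum(row[j] ** 2 for row in pattern) for j in range(2)]

print([round(c, 5) for c in communalities])
print([round(v, 4) for v in variance_explained], round(sum(communalities), 4))
```

The column sums reproduce the two largest eigenvalues of the correlation matrix, and the communalities add up to the reported total of about 4.009.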
The FACTOR Procedure
Initial Factor Method: Iterated Principal Factor Analysis

Factor Pattern

          Factor1    Factor2
test1     0.83498   -0.24200
test2     0.82533   -0.13946
test3     0.78992   -0.22671
test4     0.40982    0.63174
test5     0.33454    0.70949

Variance Explained by Each Factor

  Factor1      Factor2
2.2822007    1.0318896

Final Communality Estimates: Total = 3.314090

     test1         test2         test3         test4         test5
0.75574929    0.70062380    0.67537332    0.56705315    0.61529069

The FACTOR Procedure
Rotation Method: Varimax

Orthogonal Transformation Matrix

            1          2
1     0.93037    0.36663
2    -0.36663    0.93037

Rotated Factor Pattern

          Factor1    Factor2
test1     0.86556    0.08098
test2     0.81899    0.17284
test3     0.81804    0.07868
test4     0.14966    0.73801
test5     0.05112    0.78274

Variance Explained by Each Factor

  Factor1      Factor2
2.1141352    1.1999550

Final Communality Estimates: Total = 3.314090

     test1         test2         test3         test4         test5
0.75574929    0.70062380    0.67537332    0.56705315    0.61529069

The FACTOR Procedure
Initial Factor Method: Maximum Likelihood

Prior Communality Estimates: SMC

     test1         test2         test3         test4         test5
0.61579345    0.59136405    0.57010928    0.36722868    0.34927297

Preliminary Eigenvalues: Total = 5.49319767  Average = 1.09863953
     Eigenvalue    Difference    Proportion    Cumulative
1    5.12653289    3.81965936        0.9333        0.9333
2    1.30687354    1.56360271        0.2379        1.1712
3    -.25672917    0.03234826       -0.0467        1.1244
4    -.28907743    0.10532474       -0.0526        1.0718
5    -.39440217                     -0.0718        1.0000

2 factors will be retained by the NFACTOR criterion.

Convergence criterion satisfied.

Significance Tests Based on 145 Observations

                                                      Pr >
Test                               DF    Chi-Square    ChiSq
H0: No common factors              10      293.4632   <.0001
HA: At least one common factor
H0: 2 Factors are sufficient        1        0.5811   0.4459
HA: More factors are needed

Chi-Square without Bartlett's Correction       0.5969491
Akaike's Information Criterion                -1.4030509
Schwarz's Bayesian Criterion                  -4.3797847
Tucker and Lewis's Reliability Coefficient     1.0147794

The FACTOR Procedure
Initial Factor Method: Maximum Likelihood

Squared Canonical Correlations

   Factor1       Factor2
0.89238534    0.84344116

Eigenvalues of the Weighted Reduced Correlation Matrix: Total = 13.6797888  Average = 2.73595777

     Eigenvalue    Difference    Proportion    Cumulative
1    8.29241449    2.90503949        0.6062        0.6062
2    5.38737499    5.32474501        0.3938        1.0000
3    0.06262998    0.05955989        0.0046        1.0046
4    0.00307009    0.06877083        0.0002        1.0048
5    -.06570073                     -0.0048        1.0000

Factor Pattern

          Factor1    Factor2
test1     0.76774   -0.41003
test2     0.77906   -0.30551
test3     0.73240   -0.36927
test4     0.48293    0.43882
test5     0.52811    0.75191

Variance Explained by Each Factor

Factor       Weighted    Unweighted
Factor1    8.29241449    2.24489747
Factor2    5.38737499    1.15575055
Final Communality Estimates and Variable Weights
Total Communality: Weighted = 13.679789  Unweighted = 3.400648

Variable    Communality        Weight
test1        0.75754937    4.12452114
test2        0.70026770    3.33631726
test3        0.67277359    3.05600568
test4        0.42578560    1.74150981
test5        0.84427176    6.42143495

The FACTOR Procedure
Rotation Method: Varimax

Orthogonal Transformation Matrix

            1          2
1     0.83653    0.54792
2    -0.54792    0.83653

Rotated Factor Pattern

          Factor1    Factor2
test1     0.86690    0.07766
test2     0.81910    0.17129
test3     0.81501    0.09239
test4     0.16355    0.63169
test5     0.02980    0.91836

Variance Explained by Each Factor

Factor       Weighted    Unweighted
Factor1    7.42027967    2.11432067
Factor2    6.25950981    1.28632734

Final Communality Estimates and Variable Weights
Total Communality: Weighted = 13.679789  Unweighted = 3.400648

Variable    Communality        Weight
test1        0.75754937    4.12452114
test2        0.70026770    3.33631726
test3        0.67277359    3.05600568
test4        0.42578560    1.74150981
test5        0.84427176    6.42143495
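The rotated factor pattern is the unrotated maximum-likelihood pattern multiplied from the right by the orthogonal transformation matrix, and because that matrix is orthogonal the communalities are unchanged by the rotation. A minimal Python sketch verifying both facts with the printed values; the helper name `matmul` is hypothetical.

```python
# Unrotated maximum-likelihood factor pattern (rows: test1..test5).
pattern = [
    [0.76774, -0.41003],
    [0.77906, -0.30551],
    [0.73240, -0.36927],
    [0.48293, 0.43882],
    [0.52811, 0.75191],
]
# Orthogonal (varimax) transformation matrix from the output.
T = [
    [0.83653, 0.54792],
    [-0.54792, 0.83653],
]


def matmul(X, Y):
    """Plain-Python matrix product X * Y."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


rotated = matmul(pattern, T)

# Communalities (row sums of squares) before and after rotation.
comm_before = [sum(l * l for l in row) for row in pattern]
comm_after = [sum(l * l for l in row) for row in rotated]

for row in rotated:
    print([round(l, 5) for l in row])
```

After rotation the first three tests load almost only on Factor1 and the last two on Factor2, which is the simple structure the varimax criterion aims for.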
L I S R E L 8.70
BY Karl G. Jöreskog & Dag Sörbom

This program is published exclusively by Scientific Software International, Inc.

The following lines were read from file C:\data\SK\mult\05augupp8.dat:

    TI 05AUG, UPPGIFT 8 MED LISREL
    DA NI=5 NO=145
    LA
    test1 test2 test3 test4 test5
    KM
    1.00
    0.722 1.000
    0.714 0.685 1.000
    0.203 0.246 0.170 1.000
    0.095 0.181 0.113 0.585 1.000
    MO NX=5 NK=2 PH=SY,FI LX=FU,FI TD=SY,FI
    VA 1.00 PH(1,1) PH(2,2)
    FREE LX(1,1) LX(2,1) LX(3,1) LX(4,2) LX(5,2) PH(1,2)
    FREE TD(1,1) TD(2,2) TD(3,3) TD(4,4) TD(5,5)
    PATH DIAGRAM
    OU ND=4 MI

TI 05AUG, UPPGIFT 8 MED LISREL

Number of Input Variables    5
Number of Y - Variables      0
Number of X - Variables      5
Number of ETA - Variables    0
Number of KSI - Variables    2
Number of Observations     145

Covariance Matrix

            test1      test2      test3      test4      test5
         --------   --------   --------   --------   --------
test1      1.0000
test2      0.7220     1.0000
test3      0.7140     0.6850     1.0000
test4      0.2030     0.2460     0.1700     1.0000
test5      0.0950     0.1810     0.1130     0.5850     1.0000

Parameter Specifications

LAMBDA-X

            KSI 1      KSI 2
         --------   --------
test1           1          0
test2           2          0
test3           3          0
test4           0          4
test5           0          5
PHI

            KSI 1      KSI 2
         --------   --------
KSI 1           0
KSI 2           6          0

THETA-DELTA

            test1      test2      test3      test4      test5
         --------   --------   --------   --------   --------
                7          8          9         10         11

Number of Iterations = 7

LISREL Estimates (Maximum Likelihood)

LAMBDA-X

            KSI 1      KSI 2
         --------   --------
test1      0.8657        - -
         (0.0707)
          12.2539

test2      0.8364        - -
         (0.0716)
          11.6844

test3      0.8207        - -
         (0.0721)
          11.3868

test4         - -     0.9749
                    (0.2340)
                      4.1658

test5         - -     0.6000
                    (0.1583)
                      3.7903

PHI

            KSI 1      KSI 2
         --------   --------
KSI 1      1.0000

KSI 2      0.2520     1.0000
         (0.1019)
           2.4716

THETA-DELTA

            test1      test2      test3      test4      test5
         --------   --------   --------   --------   --------
           0.2505     0.3004     0.3265     0.0495     0.6400
         (0.0531)   (0.0544)   (0.0554)   (0.4409)   (0.1832)
           4.7180     5.5230     5.8969     0.1122     3.4924
Squared Multiple Correlations for X - Variables

            test1      test2      test3      test4      test5
         --------   --------   --------   --------   --------
           0.7495     0.6996     0.6735     0.9505     0.3600

Goodness of Fit Statistics

Degrees of Freedom = 4
Minimum Fit Function Chi-Square = 2.9306 (P = 0.5695)
Normal Theory Weighted Least Squares Chi-Square = 2.9298 (P = 0.5696)
Estimated Non-centrality Parameter (NCP) = 0.0
90 Percent Confidence Interval for NCP = (0.0 ; 6.8863)

Minimum Fit Function Value = 0.02035
Population Discrepancy Function Value (F0) = 0.0
90 Percent Confidence Interval for F0 = (0.0 ; 0.04782)
Root Mean Square Error of Approximation (RMSEA) = 0.0
90 Percent Confidence Interval for RMSEA = (0.0 ; 0.1093)
P-Value for Test of Close Fit (RMSEA < 0.05) = 0.7183

Expected Cross-Validation Index (ECVI) = 0.1806
90 Percent Confidence Interval for ECVI = (0.1806 ; 0.2284)
ECVI for Saturated Model = 0.2083
ECVI for Independence Model = 2.0972

Chi-Square for Independence Model with 10 Degrees of Freedom = 291.9902
Independence AIC = 301.9902
Model AIC = 24.9298
Saturated AIC = 30.0000
Independence CAIC = 321.8739
Model CAIC = 68.6739
Saturated CAIC = 89.6510

Normed Fit Index (NFI) = 0.9900
Non-Normed Fit Index (NNFI) = 1.0095
Parsimony Normed Fit Index (PNFI) = 0.3960
Comparative Fit Index (CFI) = 1.0000
Incremental Fit Index (IFI) = 1.0037
Relative Fit Index (RFI) = 0.9749

Critical N (CN) = 653.3788

Root Mean Square Residual (RMR) = 0.02182
Standardized RMR = 0.02182
Goodness of Fit Index (GFI) = 0.9919
Adjusted Goodness of Fit Index (AGFI) = 0.9697
Parsimony Goodness of Fit Index (PGFI) = 0.2645

Modification Indices and Expected Change

Modification Indices for LAMBDA-X

            KSI 1      KSI 2
         --------   --------
test1         - -     0.1300
test2         - -     1.3448
test3         - -     0.6500
test4         - -        - -
test5         - -        - -

Expected Change for LAMBDA-X

            KSI 1      KSI 2
         --------   --------
test1         - -    -0.0210
test2         - -     0.0688
test3         - -    -0.0485
test4         - -        - -
test5         - -        - -

No Non-Zero Modification Indices for PHI

Modification Indices for THETA-DELTA

            test1      test2      test3      test4      test5
         --------   --------   --------   --------   --------
test1         - -
test2      0.6500        - -
test3      1.3448     0.1300        - -
test4      0.1703     0.1114     0.6328        - -
test5      1.3448     1.0175     0.0503        - -        - -

Expected Change for THETA-DELTA

            test1      test2      test3      test4      test5
         --------   --------   --------   --------   --------
test1         - -
test2     -0.1591        - -
test3      0.2172    -0.0620        - -
test4      0.0179     0.0148    -0.0357        - -
test5     -0.0509     0.0452     0.0102        - -        - -

Maximum Modification Index is 1.34 for Element ( 2, 2) of LAMBDA-X

Time used: 0.047 Seconds
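The LISREL estimates imply a correlation matrix through the factor model, Sigma = LAMBDA-X * PHI * LAMBDA-X' + THETA-DELTA, and the residuals against the input matrix summarise the fit. A minimal Python sketch of that computation; taking RMR to be the root mean square of the residuals over the 15 distinct elements is an assumption made here, chosen because it agrees with the printed value to rounding.

```python
import math

# Estimates copied from the LISREL output (rows: test1..test5; columns: KSI 1, KSI 2).
lam = [[0.8657, 0.0], [0.8364, 0.0], [0.8207, 0.0], [0.0, 0.9749], [0.0, 0.6000]]
phi = [[1.0, 0.2520], [0.2520, 1.0]]
theta = [0.2505, 0.3004, 0.3265, 0.0495, 0.6400]  # diagonal of THETA-DELTA

# Input matrix (lower triangle), as read by the program.
lower = [
    [1.000],
    [0.722, 1.000],
    [0.714, 0.685, 1.000],
    [0.203, 0.246, 0.170, 1.000],
    [0.095, 0.181, 0.113, 0.585, 1.000],
]


def implied(lam, phi, theta):
    """Model-implied matrix: LAMBDA * PHI * LAMBDA' + diag(THETA)."""
    p, m = len(lam), len(phi)
    sigma = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            common = sum(lam[i][a] * phi[a][b] * lam[j][b]
                         for a in range(m) for b in range(m))
            sigma[i][j] = common + (theta[i] if i == j else 0.0)
    return sigma


sigma_hat = implied(lam, phi, theta)

# Residuals over the distinct (lower-triangular) elements, and their root mean square.
residuals = [lower[i][j] - sigma_hat[i][j] for i in range(5) for j in range(i + 1)]
rmr = math.sqrt(sum(r * r for r in residuals) / len(residuals))

print(round(sigma_hat[1][0], 4), round(rmr, 4))
```

The largest residuals belong to the correlations between test2 and the two arithmetic tests, but all of them are small, in line with the low modification indices.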
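Answers to the examination in multivariate methods, 27 August 2005:

1) a) A*B is not defined, since A is 2 x 2 and B is 4 x 2;

$$B A = \begin{pmatrix} 88 & 101 \\ 122 & 148 \\ 196 & 209 \\ 223 & 236 \end{pmatrix}.$$

b) The eigenvalues are 27 and 13. The eigenvectors are

$$\begin{pmatrix} 0.7071 \\ 0.7071 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0.7071 \\ -0.7071 \end{pmatrix},$$

respectively, which are orthogonal because their inner product is 0.5 - 0.5 = 0.

2) The mean vector is

$$\bar{x} = \begin{pmatrix} 5.25 \\ 5.50 \\ 7.875 \end{pmatrix},$$

the covariance matrix is

$$S = \begin{pmatrix} 4.5000 & 0.7143 & 5.6071 \\ 0.7143 & 0.5714 & 1.3571 \\ 5.6071 & 1.3571 & 8.1250 \end{pmatrix}$$

and the correlation matrix is

$$R = \begin{pmatrix} 1 & 0.445 & 0.927 \\ 0.445 & 1 & 0.630 \\ 0.927 & 0.630 & 1 \end{pmatrix}.$$

3) (X2, X3, X4, X5) is four-dimensional normal with mean vector and covariance matrix

$$\begin{pmatrix} 2 \\ 6 \\ 4 \\ 3 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 8 & 4 & 3 & 1 \\ 4 & 16 & 5 & 3 \\ 3 & 5 & 9 & 6 \\ 1 & 3 & 6 & 10 \end{pmatrix};$$

(X2, X3, X4) is trivariate (three-dimensional) normal with mean vector and covariance matrix

$$\begin{pmatrix} 2 \\ 6 \\ 4 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 8 & 4 & 3 \\ 4 & 16 & 5 \\ 3 & 5 & 9 \end{pmatrix};$$

(X2, X3) is bivariate (two-dimensional) normal with mean vector and covariance matrix

$$\begin{pmatrix} 2 \\ 6 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 8 & 4 \\ 4 & 16 \end{pmatrix}.$$

4) μ1, μ2 and μ3 are estimated by the sample mean vectors (9, 8, 4), (13, 12, 10) and (3, 4, 10), respectively, and Σ by the pooled covariance matrix (59 S1 + 39 S2 + 39 S3)/137:

$$\hat{\Sigma} = \begin{pmatrix} 9.85 & 3.29 & 3.14 \\ 3.29 & 7.14 & 2.14 \\ 3.14 & 2.14 & 8.29 \end{pmatrix}.$$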
5) a) Only the first two canonical correlations are significant (P < 0.001).

b) Standardized a1 = (0.42, 0.20, 0.17, -0.02, 0.46), structure coefficients = (0.83, 0.73, 0.75, 0.62, 0.86), so U1 appears to be an average of the work-related variables (with the exception of X4 = task identity, which is somewhat indeterminate).

Standardized a2 = (0.34, -0.67, -0.85, 0.35, 0.73), structure coefficients = (0.03, -0.10, -0.11, 0.05, 0.66), so U2 appears to be dominated by X2 = task significance and X3 = task variety.

Standardized b1 = (0.43, 0.21, -0.04, 0.02, 0.29, 0.51, -0.11), structure coefficients = (0.76, 0.64, 0.39, 0.38, 0.65, 0.80, 0.50), so V1 appears to be dominated by Y1 = satisfaction with the supervisor and Y6 = satisfaction with the kind of work.

Standardized b2 = (-0.09, 0.44, -0.09, 0.93, -0.10, -0.55, -0.03), structure coefficients = (0.04, 0.36, 0.04, 0.79, 0.11, -0.24, 0.16), so V2 appears to be dominated by Y4 = satisfaction with the workload and Y2 = satisfaction with the career.

6) a) AREA is entered first, then BEDROOMS. PRICE does not enter.

b) With He = the number of correct classifications expected by chance and Ho = the observed number of correct classifications, the LDA cross-validation gives He = 11.67, Ho = 17, so Z = 1.91, not significant, and IOCC = 0.23, so the classification is not successful. (For the LDA calibration data we get He = 11.67, Ho = 19, so Z = 2.63, significant, and IOCC = 0.31.) Since the test of equal covariance matrices gives P = 0.71, QDA is not carried out.

7) The scree plot indicates that the appropriate number of factors is two. The eigenvalue criterion also gives two factors. We also see that H0: 2 factors are sufficient has P-value 0.446, so everything suggests that the 2-factor model is appropriate. The factor loadings of this rotated two-factor solution are

$$\begin{pmatrix} 0.87 & 0.08 \\ 0.82 & 0.17 \\ 0.82 & 0.09 \\ 0.16 & 0.63 \\ 0.03 & 0.92 \end{pmatrix},$$

i.e. large positive loadings on the first three variables for factor 1 and large positive loadings on variables 4 and 5 for factor 2, as expected. The communalities are 0.76, 0.70, 0.67, 0.43 and 0.84, respectively, i.e. relatively high but none close to 1.
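The figures quoted in answer 6 b) can be reproduced from the number of correct classifications. The sketch below assumes equal prior probabilities, so that He = N/3, and assumes the formulas Z = (Ho - He)/sqrt(He(N - He)/N) and IOCC = (Ho - He)/(N - He); these formulas are not stated in the answer itself and are used here because they reproduce the quoted values. The function name `hit_statistics` is hypothetical.

```python
import math


def hit_statistics(observed_hits, n, n_groups):
    """Expected hits by chance, Z statistic and improvement-over-chance index.

    Assumes equal priors, so the chance hit count is n / n_groups.
    """
    expected_hits = n / n_groups
    z = (observed_hits - expected_hits) / math.sqrt(
        expected_hits * (n - expected_hits) / n
    )
    iocc = (observed_hits - expected_hits) / (n - expected_hits)
    return expected_hits, z, iocc


# Cross-validation: 6 + 6 + 5 = 17 correct out of 35; calibration: 6 + 7 + 6 = 19.
he_cv, z_cv, iocc_cv = hit_statistics(17, 35, 3)
he_cal, z_cal, iocc_cal = hit_statistics(19, 35, 3)

print(round(he_cv, 2), round(z_cv, 2), round(iocc_cv, 2))
print(round(he_cal, 2), round(z_cal, 2), round(iocc_cal, 2))
```

Comparing Z with the standard normal 5 percent point of 1.96 gives the "not significant" verdict for cross-validation and "significant" for the calibration data.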
8) a) First comes a title line, then the declaration of the number of input variables and the number of observations; the names of the 5 variables are given and the correlation matrix (KM) is read in. We then state that we have 5 x-variables and 2 ksi factors, with the pattern of factor loadings described in the problem; the factors are given unit variance but are allowed to correlate. The 5 factor loadings and the 5 specific variances are set as free parameters. Finally we request a path diagram and modification indices.

b) Chi-square = 2.93 with 4 degrees of freedom, which gives P = 0.57, so the hypothesis is not rejected. Other fit measures point the same way: RMSEA = 0.00 is excellent by the usual rules of thumb, and NFI = 0.99, GFI = 0.99 and AGFI = 0.97 look very good. The largest modification index is 1.34 (for LX(2,2)), which is small compared with a chi-square distribution with 1 df.

c) See the diagram in the textbook!

d) The number of distinct elements in S = p*(p+1)/2 = 5*6/2 = 15, and the number of parameters = number of factor loadings + number of specific variances + the factor covariance = 5 + 5 + 1 = 11, so the number of df = 15 - 11 = 4.
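The degrees-of-freedom count in d) and the P-value in b) can be verified with a few lines of Python. The sketch uses the closed-form upper-tail probability of the chi-square distribution that holds for an even number of degrees of freedom; the function name `chi2_sf_even_df` is hypothetical.

```python
import math

p = 5                                  # observed variables
n_moments = p * (p + 1) // 2           # distinct elements of S
n_params = 5 + 5 + 1                   # loadings + specific variances + factor covariance
df = n_moments - n_params


def chi2_sf_even_df(x, df):
    """P(chi-square_df > x) for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here needs a positive even df")
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= (x / 2) / k
        total += term
    return math.exp(-x / 2) * total


p_value = chi2_sf_even_df(2.9306, df)  # minimum fit function chi-square from the output

print(df, round(p_value, 4))
```

For 4 degrees of freedom the sum has only two terms, so the P-value is exp(-x/2)(1 + x/2).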