Questions
Question 1. (Multivariate Linear Modeling and Tests of a Covariance Matrix) [40 Marks]
In this question we are going to analyze the class data. This data set contains five quiz scores for each of 53 students, along with the gender and major of each student.
(a) Use the lm() function to model the relationship between "quiz4" and "quiz5", as response variables, and "quiz1", "quiz2", "quiz3", "gender" and "major", as predictor variables. We call this model Model 1. Interpret the results carefully. [10 Marks]
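As a rough illustration of how a multivariate response can be passed to lm(), the sketch below fits a model of this form. The data frame class_data is simulated here purely as a hypothetical stand-in for the class data (its values, the gender codes "F"/"M" and the major codes "A"/"B"/"C" are assumptions), so the numerical output says nothing about the real data.

```r
# Hypothetical stand-in for the class data (53 students); NOT the real data set.
set.seed(1)
n <- 53
class_data <- data.frame(
  quiz1  = rnorm(n, 70, 10),
  quiz2  = rnorm(n, 72, 10),
  quiz3  = rnorm(n, 68, 10),
  gender = factor(rep(c("F", "M"), length.out = n)),
  major  = factor(rep(c("A", "B", "C"), length.out = n))
)
class_data$quiz4 <- 0.5 * class_data$quiz1 + 0.3 * class_data$quiz2 + rnorm(n, sd = 5)
class_data$quiz5 <- 0.4 * class_data$quiz2 + 0.4 * class_data$quiz3 + rnorm(n, sd = 5)

# cbind() on the left-hand side makes lm() fit a multivariate linear model ("mlm").
model1 <- lm(cbind(quiz4, quiz5) ~ quiz1 + quiz2 + quiz3 + gender + major,
             data = class_data)

summary(model1)  # one univariate regression summary per response
anova(model1)    # multivariate test (Pillai's trace by default) for each term
```

The coefficient matrix returned by coef(model1) has one column per response, which is a convenient starting point for the interpretation asked for above.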
(b) Now consider the relationship between (quiz4, quiz5) and (quiz1, quiz2, quiz3). Repeat the analysis from Part (a) for these response and predictor variables. We call this model Model 2. [10 Marks]
(c) Compare Model 1 and Model 2. Which one do you prefer? Why? [10 Marks]
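One possible way to compare two nested multivariate models is a multivariate test via anova(). The sketch below assumes a simulated, hypothetical class_data (not the real class data) and is only meant to show the mechanics; other criteria could equally be used.

```r
# Hypothetical stand-in for the class data; NOT the real data set.
set.seed(1)
n <- 53
class_data <- data.frame(
  quiz1  = rnorm(n, 70, 10),
  quiz2  = rnorm(n, 72, 10),
  quiz3  = rnorm(n, 68, 10),
  gender = factor(rep(c("F", "M"), length.out = n)),
  major  = factor(rep(c("A", "B", "C"), length.out = n))
)
class_data$quiz4 <- 0.5 * class_data$quiz1 + 0.3 * class_data$quiz2 + rnorm(n, sd = 5)
class_data$quiz5 <- 0.4 * class_data$quiz2 + 0.4 * class_data$quiz3 + rnorm(n, sd = 5)

# Model 1: quizzes plus gender and major; Model 2: quizzes only (nested in Model 1).
model1 <- lm(cbind(quiz4, quiz5) ~ quiz1 + quiz2 + quiz3 + gender + major,
             data = class_data)
model2 <- lm(cbind(quiz4, quiz5) ~ quiz1 + quiz2 + quiz3, data = class_data)

# Multivariate test of whether gender and major jointly improve on Model 2.
comparison <- anova(model2, model1, test = "Pillai")
print(comparison)
```

A large p-value in the second row would suggest that the smaller model is adequate for these (simulated) data.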
(d) Assume that the random vector (quiz1, quiz2, quiz3, quiz4, quiz5)^T follows a multivariate normal distribution within each gender. Find the covariance matrices of (quiz1, quiz2, quiz3, quiz4, quiz5)^T for each level of "gender". Can we assume that the covariance matrices for the two groups are the same? [10 Marks]
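A common test for equality of covariance matrices under multivariate normality is Box's M test. It is not stated here which test the course expects, so the following is only a sketch: it computes the group covariance matrices and the chi-square approximation to Box's M in base R, on a simulated, hypothetical class_data rather than the real class data.

```r
# Hypothetical stand-in for the class data; NOT the real data set.
set.seed(1)
n <- 53
class_data <- data.frame(
  quiz1  = rnorm(n, 70, 10),
  quiz2  = rnorm(n, 72, 10),
  quiz3  = rnorm(n, 68, 10),
  gender = factor(rep(c("F", "M"), length.out = n))
)
class_data$quiz4 <- 0.5 * class_data$quiz1 + 0.3 * class_data$quiz2 + rnorm(n, sd = 5)
class_data$quiz5 <- 0.4 * class_data$quiz2 + 0.4 * class_data$quiz3 + rnorm(n, sd = 5)

quiz_cols <- paste0("quiz", 1:5)
groups    <- split(class_data[quiz_cols], class_data$gender)
cov_list  <- lapply(groups, cov)   # one 5 x 5 covariance matrix per gender
n_i       <- sapply(groups, nrow)  # group sizes

p <- length(quiz_cols)  # number of variables
g <- length(groups)     # number of groups
N <- sum(n_i)

# Pooled covariance matrix, weighted by group degrees of freedom.
S_pooled <- Reduce(`+`, Map(function(S, m) (m - 1) * S, cov_list, n_i)) / (N - g)

log_det <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)

# Box's M statistic and its chi-square approximation.
M      <- (N - g) * log_det(S_pooled) - sum((n_i - 1) * sapply(cov_list, log_det))
c_corr <- (sum(1 / (n_i - 1)) - 1 / (N - g)) *
          (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))
chi_sq  <- (1 - c_corr) * M
df      <- p * (p + 1) * (g - 1) / 2
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

cov_list
c(chi_sq = chi_sq, df = df, p_value = p_value)
```

A small p-value would be evidence against equal covariance matrices. Box's M is known to be sensitive to departures from normality, so the result should be read with that in mind.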
Part (e) is optional. You will not lose any marks if you do not answer this part. However, if you provide the correct solution you may earn extra marks. The purpose is to show that, although some functions may not be introduced in the tutorials, as long as you have the statistical knowledge about the context, you can perform the analysis and interpret the results.
(e) To find out whether quiz4 and quiz5 scores differ significantly across gender and major, use the manova() function in R. Interpret the result. Is it consistent with the results of the previous parts? [10 Marks]
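For readers who have not used manova() before, a minimal sketch of the call is given below. It assumes a simulated, hypothetical class_data (not the real class data) and an additive model in gender and major; whether an interaction term should be included is left open.

```r
# Hypothetical stand-in for the class data; NOT the real data set.
set.seed(1)
n <- 53
class_data <- data.frame(
  gender = factor(rep(c("F", "M"), length.out = n)),
  major  = factor(rep(c("A", "B", "C"), length.out = n)),
  quiz4  = rnorm(n, 70, 10),
  quiz5  = rnorm(n, 72, 10)
)

# Two responses bound together with cbind(); factors on the right-hand side.
fit <- manova(cbind(quiz4, quiz5) ~ gender + major, data = class_data)

pillai <- summary(fit, test = "Pillai")  # default multivariate test statistic
print(pillai)
summary(fit, test = "Wilks")             # alternative: Wilks' lambda
summary.aov(fit)                         # follow-up univariate ANOVA per response
```

The multivariate table has one row per term plus a residual row; the univariate follow-ups show which response, if any, drives a significant multivariate effect.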
Question 2. (Factor Analysis) [30 Marks]
The "Harman23.cor" object in the datasets package is a correlation matrix of eight physical measurements made on 305 girls between the ages of 7 and 17. You can find information about this correlation matrix using
help(Harman23.cor).
(a) Perform factor analysis for this correlation matrix using the command
factanal(factors = m, covmat = Harman23.cor),
for m = 1, 2, 3. Note that using the provided command you do not need to have access to the complete dataset and the factor analysis can be performed using covariance or correlation matrices, with the same reasoning as in PCA. [20 Marks]
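The three fits can be produced in one pass, for example as sketched below. The helper names fits and fit_table are illustrative only; the goodness-of-fit statistic is available because Harman23.cor stores the number of observations alongside the correlation matrix.

```r
# Fit maximum-likelihood factor models with m = 1, 2, 3 factors.
fits <- lapply(1:3, function(m) factanal(factors = m, covmat = Harman23.cor))
names(fits) <- paste0("m", 1:3)

# Collect the likelihood-ratio test of "m factors are sufficient" for each fit.
fit_table <- t(sapply(fits, function(f) {
  c(dof = unname(f$dof),
    statistic = unname(f$STATISTIC),
    p.value = unname(f$PVAL))
}))
print(fit_table)

# Full printed output (uniquenesses, loadings, variance explained) for one fit.
print(fits$m2)
```

Each row of fit_table corresponds to one value of m; the degrees of freedom shrink as more factors are added.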
(b) What is the best number of factors in this case? Interpret the resulting factor loadings for the best case. [10 Marks]
Hint: Note that you cannot rely on p-values alone to determine the number of factors. The factors you retain should have a meaningful interpretation. Therefore, to answer this question, you need to consider the factor loadings carefully (are they large enough?).
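To judge whether loadings are "large enough", it can help to suppress the small ones when printing. The sketch below uses m = 2 and a cutoff of 0.4 purely as illustrative assumptions; neither value is prescribed by the question.

```r
# Illustrative choice of m = 2; the same inspection applies to any m.
fa2 <- factanal(factors = 2, covmat = Harman23.cor)

# Hide loadings below the (assumed) cutoff and group variables by factor.
print(fa2$loadings, cutoff = 0.4, sort = TRUE)

# Communality = share of each variable's variance explained by the factors.
communalities <- 1 - fa2$uniquenesses
round(communalities, 2)
```

Variables that load strongly on the same factor are candidates for a common interpretation, while a variable with a low communality is poorly explained by the chosen factors.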