Multivariate Approach to Partial Correlation Analysis
Onyeneke Casmir Chidiebere
Mathematics and Statistics Department, University of Calabar, Calabar, Nigeria
To cite this article:
Onyeneke Casmir Chidiebere. Multivariate Approach to Partial Correlation Analysis. Science Journal of Applied Mathematics and Statistics. Vol. 3, No. 3, 2015, pp. 165-170. doi: 10.11648/j.sjams.20150303.20
Abstract: The multivariate approach to generating variance covariance matrices and partial correlation coefficients of one or more independent variables has long concerned advanced statisticians and users of statistical tools. This work tackled the problem by holding one or more variables constant and partitioning the variance covariance matrices to find multivariate partial correlations. Owing to the difficulty of analysing and computing complex variables, this research used matrices to ascertain the level of relationship that exists among these variables and obtained correlation coefficients from variance covariance matrices. It was proved that partial correlation coefficients form diagonal matrices that are normally distributed. (Word count: 101).
Keywords: Multivariate, Correlation, Partial, Normality, Coefficients, Variables, Matrices
1. Introduction
Multivariate statistics is the science of collecting and analyzing data on multiple variables. It involves observing, describing and presenting complex events and distributions that occur simultaneously. Multivariate partial correlation is the analysis made on many variables to ascertain the degrees of relationship existing when one or more of such variables are held constant (Francois, 2010). The main point here is to examine how these multiple variances, covariances and correlations are partly related. The analysis measures variance and correlation among the many variables and examines the relationship between several variables of interest, where the objective is to ascertain the interdependence existing between them, that is, how one variable relates to another. Neter, et al., (1996) suggested that cases with more than one independent variable are best computed with matrices. We therefore mostly adopted matrix principles to carry out the analysis in this work.
2. Multivariate Correlation
The statistical methodology used in this research is the multivariate approach to generating both variance covariance and correlation matrices. The correlation matrix was used to establish and prove the theories of first and second order partial correlation. Since multivariate correlation employs matrix methods to generate all the possible correlation coefficients among the variables under consideration, we examined the variance covariance matrix to obtain the multivariate relationship between one index and another. It was used to measure the strength of association among the variables, and the impact of the independent variables was studied simultaneously. For instance, with the standard of living indices in rural communities in a state given as income (X1), feeding (X2), accommodation (X3) and education (X4), the multivariate correlation coefficients were computed as:
When N = 25
These coefficients take values between -1 and +1 (Rao, 2008).
3. First Partial Correlation
According to Morison (2007), to calculate the partial relationship between the indices, we examined the cases where only one variable is kept constant while the others are varied. In situations where a system has multiple variables and factors influencing them, multivariate partial correlation analysis becomes very relevant. This is seen in the physical, agricultural, business and experimental sciences, where the aim is to study separately the effect and control of these variables. The approach is very significant in the design or analysis of experiments and in principal component analysis. Anderson (2003) stated the first partial correlation as
i, j, k =1, …, 4
Therefore, the first partial correlation matrix can be calculated by first computing the numerator as follows
Theorem one: Given variables X with variance covariance matrix V, both X and V can be partitioned to derive multivariate partial correlation coefficient matrices. We are required to prove that
Divide through by
This has proved that
Let us now apply the theorem to the numerical values to obtain the partial relationship.
Divide through by =
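The first-order partial correlation stated by Anderson (2003) can be sketched with the standard recursion formula, r_ij.k = (r_ij - r_ik r_jk) / sqrt((1 - r_ik^2)(1 - r_jk^2)). The 3x3 correlation matrix below uses hypothetical values, not the paper's standard of living data.

```python
import numpy as np

def partial_corr_1(R, i, j, k):
    """First-order partial correlation r_ij.k from a simple correlation
    matrix R, holding the k-th variable constant (standard recursion)."""
    num = R[i, j] - R[i, k] * R[j, k]
    den = np.sqrt((1.0 - R[i, k] ** 2) * (1.0 - R[j, k] ** 2))
    return num / den

# Illustrative correlation matrix (hypothetical values)
R = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

print(partial_corr_1(R, 0, 1, 2))  # r_12.3, X3 held constant
```

The numerator matches the "first computing the numerator" step in the text; the denominator rescales it back into the interval [-1, +1].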
4. Second Partial Correlation
This is to compute the partial relationship between two variables when the two other indices are held constant.
Theorem two: At this time, we prove that;
The above expression means . So given with the corresponding , the variance covariance of
Multiplying each value by the corresponding terms
Also applying the above proof to the standard of living indices, we obtain the following results. For instance,
Other results are seen in the second order partial correlation matrix of the correlation matrices as presented below.
Notice that in this case the index runs from 1 to 16.
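Theorem two's partition-based derivation can be sketched as follows: the conditional covariance of the retained variables given the held-constant ones is V11 - V12 V22^{-1} V21, which is then normalised into a partial correlation matrix. The function name and the 4x4 covariance values are illustrative assumptions, not the paper's data.

```python
import numpy as np

def partial_corr_matrix(V, keep, cond):
    """Partial correlation matrix of the variables in `keep`, holding the
    variables in `cond` constant, via the partitioned covariance matrix:
    V_cond = V11 - V12 V22^{-1} V21, then normalised to correlations."""
    keep, cond = list(keep), list(cond)
    V11 = V[np.ix_(keep, keep)]
    V12 = V[np.ix_(keep, cond)]
    V22 = V[np.ix_(cond, cond)]
    Vc = V11 - V12 @ np.linalg.solve(V22, V12.T)  # conditional covariance
    d = np.sqrt(np.diag(Vc))
    return Vc / np.outer(d, d)

# Hypothetical variance-covariance matrix for X1..X4 (illustrative only)
V = np.array([
    [4.0, 1.2, 0.8, -0.5],
    [1.2, 3.0, 0.6,  0.4],
    [0.8, 0.6, 2.5,  0.9],
    [-0.5, 0.4, 0.9, 2.0],
])

# Second-order partial correlation r_12.34 (X3 and X4 held constant)
print(partial_corr_matrix(V, [0, 1], [2, 3])[0, 1])
```

With a single conditioning variable the same function reproduces the first-order recursion formula, which is one way to check that the partitioning in theorem one and the formula in section 3 agree.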
5. Normality Test of Multivariate Correlation Coefficients
In this part, we prove that if the variables are normally distributed, then the relationships between them, in terms of correlation coefficients, are also normally distributed. Since the sample size for each index (variable) is 25, the four variables give 100 observations in all. Normality was already assumed by the law of large numbers; let us now test for the normality of the correlation coefficients.
Theorem three: If then .
In this test we recall that if with
Let be the correlation coefficients of then the joint probability density function is given by
From the above, we can get our test statistic to be
χ² = Σ (Oᵢ − Eᵢ)² / Eᵢ
Where Oᵢ = the corresponding correlation coefficient,
Eᵢ = the expected frequency of the correlation coefficient, which is one per outcome,
and μ and σ are the mean and standard deviation of the correlation coefficients respectively.
The above analysis is grounded on the following hypotheses:
Since the calculated statistic is less than the tabulated value (9.4880), we accept H0 and conclude that the correlation coefficients are normally distributed.
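The goodness-of-fit comparison above can be sketched as a chi-square computation. The observed and expected counts below are hypothetical placeholders, since the paper's Table 1 and its exact binning of the coefficients are not reproduced here; only the tabulated critical value 9.4880 (chi-square at the 5% level with 4 degrees of freedom) comes from the text.

```python
import numpy as np

# Hypothetical observed and expected frequencies of the correlation
# coefficients grouped into bins (illustrative only, not Table 1)
observed = np.array([3.0, 5.0, 6.0, 7.0, 4.0])
expected = np.array([4.0, 5.0, 7.0, 6.0, 3.0])

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E
chi2 = np.sum((observed - expected) ** 2 / expected)

# Tabulated critical value used in the paper (alpha = 0.05, 4 d.f.)
critical = 9.4880

# Accept H0 (normality) when the statistic falls below the critical value
print(chi2, chi2 < critical)
```

A statistic below the critical value leads to accepting H0, mirroring the conclusion drawn in the text.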
6. Discussion of Results
The numerical matrix of correlation coefficients displayed all the relationships between any selected pair of variables under consideration. If both variables increase at the same time, then a positive correlation exists between them, as seen in the case of and so on. However, a negative correlation coefficient exists where one variable tends to increase as the other tends to decrease, as seen in , , . The first partial correlation coefficients measured the relationship between any two variables when one other variable is held constant. In the second partial correlation coefficient matrix, which examines the correlation between any two variables while keeping the remaining two variables constant, it was discovered that is . This implies . It was confirmed that perfect diagonal matrices exist among the coefficients of the simple and the second partial correlations. Finally, Table 1 showcased what happened when the correlation coefficients were subjected to the normality test. The essence was to prove that the correlation coefficients obey the normal probability law when the sample variables are normal.
7. Conclusion
In this research, it was shown that the multivariate technique can be used to compute the correlation coefficient matrices easily through variance covariance matrices (Kendall et al., 1973). Theorem one proved that partial correlation coefficients can be derived by adequately partitioning the covariance matrix. Theorem two established the relationship which exists when two variables are kept constant. In such second partial correlation, it was verified that, irrespective of the covariance or correlation coefficient terms used, the second partial correlation (in a square matrix) must give a corresponding value that shows the diagonal attributes of the correlation coefficients. Finally, the normality of the multivariate correlation coefficients was attested: every normally distributed set of observations gives a correlation coefficient matrix which is equally normally distributed.