Science Journal of Applied Mathematics and Statistics
Volume 3, Issue 3, June 2015, Pages: 165-170

Multivariate Approach to Partial Correlation Analysis

Onyeneke Casmir Chidiebere

Mathematics and Statistics Department, University of Calabar, Calabar, Nigeria

To cite this article:

Onyeneke Casmir Chidiebere. Multivariate Approach to Partial Correlation Analysis. Science Journal of Applied Mathematics and Statistics. Vol. 3, No. 3, 2015, pp. 165-170. doi: 10.11648/j.sjams.20150303.20


Abstract: Generating variance-covariance and partial correlation coefficients for one or more independent variables with a multivariate approach has long concerned advanced statisticians and users of statistical tools. This work addresses the problem by holding one or more variables constant and partitioning the variance-covariance matrix to obtain multivariate partial correlations. Because analysing and computing many variables at once is challenging, this research uses matrix methods to ascertain the level of relationship that exists among the variables and obtains correlation coefficients from variance-covariance matrices. It was proved that partial correlation coefficients are diagonal matrices that are normally distributed.

Keywords: Multivariate, Correlation, Partial, Normality, Coefficients, Variables, Matrices


1. Introduction

Multivariate statistics is the science of collecting and analyzing data on several variables at once. It involves observing, describing and presenting complex events and distributions that occur simultaneously. Multivariate partial correlation analysis examines many variables to ascertain the degree of relationship that exists between them when one or more of the variables are held constant (Husson et al., 2010). The main point is to examine how the variances, covariances and correlations of these variables are partly related, where the objective is to ascertain the interdependence existing between several variables of interest, that is, how one variable relates to another. Neter et al. (1996) suggested that cases with more than one independent variable are best computed with matrices. We therefore adopted matrix principles for most of the analysis in this work.

2. Multivariate Correlation

The statistical methodology used in this research is a multivariate application that generates both variance-covariance and correlation matrices. The correlation matrix was used to establish and prove the theorems of first and second order partial correlation. Since multivariate correlation employs matrix methods to generate all the possible correlation coefficients among the variables under consideration, we examined the variance-covariance matrix to obtain the multivariate relationship between one index and another. It was used to measure the strength of association among variables, with the impact of the independent variables studied simultaneously. For instance, for the standard of living indices in rural communities in a state, namely income (X1), feeding (X2), accommodation (X3) and education (X4), the multivariate correlation coefficients were computed as:

 

When N = 25

= 

These coefficients take values between -1 and +1, that is, $-1 \le r_{ij} \le +1$ (Rao, 2008).
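The step from a variance-covariance matrix to a correlation matrix can be sketched as follows. This is a minimal illustration only: the matrix `V` is hypothetical and is not the standard of living data, and the helper name `cov_to_corr` is an assumption introduced here.

```python
import numpy as np

def cov_to_corr(V):
    """Convert a variance-covariance matrix V into a correlation matrix."""
    V = np.asarray(V, dtype=float)
    s = np.sqrt(np.diag(V))      # standard deviations of each variable
    return V / np.outer(s, s)    # r_ij = v_ij / (s_i * s_j)

# Hypothetical covariance matrix for four indices (NOT the paper's data).
V = np.array([
    [ 4.0,  1.2, -0.8,  0.5],
    [ 1.2,  9.0,  0.6, -1.5],
    [-0.8,  0.6,  1.0,  0.3],
    [ 0.5, -1.5,  0.3, 16.0],
])

R = cov_to_corr(V)
print(np.round(R, 4))
```

Every diagonal entry of `R` is 1 and every off-diagonal entry lies between -1 and +1, which is the property stated above.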

3. First Partial Correlation

According to Morison (2007), the partial relationship between the indices is calculated by examining the cases where only one variable is kept constant while the others vary. When a system has multiple variables and factors influencing them, multivariate partial correlation analysis becomes very relevant. This is seen in the physical, agricultural, business and experimental sciences, where the aim is to study separately the effect and control of these variables. The approach is particularly useful in the design and analysis of experiments and in principal component analysis. Anderson (2003) stated the first partial correlation as

i, j, k =1, …, 4
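For reference, the first-order partial correlation between variables $i$ and $j$ with variable $k$ held constant is usually written in one of two equivalent forms, one in terms of simple correlations and one in terms of partitioned covariances. The symbols $r_{ij}$ and $\sigma_{ij}$ are assumed notation for the entries of the correlation and variance-covariance matrices, and $i$, $j$, $k$ are taken to be distinct.

```latex
r_{ij \cdot k}
  = \frac{r_{ij} - r_{ik}\, r_{jk}}{\sqrt{(1 - r_{ik}^{2})(1 - r_{jk}^{2})}}
  = \frac{\sigma_{ij \cdot k}}{\sqrt{\sigma_{ii \cdot k}\, \sigma_{jj \cdot k}}},
\qquad
\sigma_{ij \cdot k} = \sigma_{ij} - \frac{\sigma_{ik}\, \sigma_{jk}}{\sigma_{kk}}
```

Dividing the numerator and denominator of the covariance form by $\sqrt{\sigma_{ii}\sigma_{jj}}$ gives the correlation form, so the two agree term by term.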

Therefore, the first partial correlation matrix can be calculated by first computing the numerator as follows

Theorem one: Given that  be variables with variance covariance matrix, X, V and  can be partitioned to derive multivariate partial correlation coefficient matrices. We are required to prove that

Let

 and ,

Divide through by

This has proved that

Let us now apply the theorem in the numbers to get the partial relationship.

Since

=

=

So

Divide through by  = 
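The partitioning idea in this section can be checked numerically. The sketch below computes a first-order partial correlation from the conditional covariance of a partitioned matrix and compares it with the usual formula in simple correlations. The matrix `R` is a hypothetical correlation matrix, not the paper's data, and both function names are assumptions introduced for illustration.

```python
import numpy as np

def partial_by_partition(V, i, j, k):
    """r_ij.k from the partitioned matrix: V11 - V12 V22^{-1} V21."""
    V = np.asarray(V, dtype=float)
    a = [i, j]
    V11 = V[np.ix_(a, a)]            # block for the two variables of interest
    V12 = V[np.ix_(a, [k])]          # their covariances with the fixed variable
    C = V11 - V12 @ V12.T / V[k, k]  # 2x2 conditional covariance
    return C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])

def partial_by_formula(R, i, j, k):
    """r_ij.k from simple correlations."""
    num = R[i, j] - R[i, k] * R[j, k]
    den = np.sqrt((1 - R[i, k] ** 2) * (1 - R[j, k] ** 2))
    return num / den

# Hypothetical correlation matrix for X1..X4 (NOT the paper's data).
R = np.array([
    [ 1.0,     0.2,   -0.4,   0.0625],
    [ 0.2,     1.0,    0.2,  -0.125 ],
    [-0.4,     0.2,    1.0,   0.075 ],
    [ 0.0625, -0.125,  0.075, 1.0   ],
])

a = partial_by_partition(R, 0, 1, 2)  # X1 and X2 with X3 held constant
b = partial_by_formula(R, 0, 1, 2)
print(a, b, np.isclose(a, b))
```

Because a correlation matrix is the covariance matrix of the standardized variables, either function can be applied to `R`; applying `partial_by_partition` to the unstandardized covariance matrix gives the same coefficient.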

4. Second Partial Correlation

This section computes the partial relationship between two variables when the two other indices are held constant.

Theorem two:  At this time, we prove that;

The above expression means . So given  with the corresponding , the variance covariance of

  =

==

Multiplying each value by the corresponding terms

This implies

Applying the above proof to the standard of living indices gives the following result. For instance,

Other results are seen in the second order partial correlation matrix of the correlation matrices as presented below.

Notice that in this case is from 1 to 16
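A second-order partial correlation can be sketched in two ways that should agree for four variables: by applying the first-order formula recursively, and by reading the coefficient off the inverse of the correlation matrix. The matrix `R` below is hypothetical, not the paper's data, and the function names are assumptions made for this illustration.

```python
import numpy as np

def first_order(R, i, j, k):
    """r_ij.k from simple correlations."""
    num = R[i, j] - R[i, k] * R[j, k]
    return num / np.sqrt((1 - R[i, k] ** 2) * (1 - R[j, k] ** 2))

def second_order(R, i, j, k, l):
    """r_ij.kl by recursion on first-order partials given k."""
    r_ij = first_order(R, i, j, k)
    r_il = first_order(R, i, l, k)
    r_jl = first_order(R, j, l, k)
    return (r_ij - r_il * r_jl) / np.sqrt((1 - r_il ** 2) * (1 - r_jl ** 2))

def partial_given_rest(R, i, j):
    """Partial correlation of i and j given all other variables,
    from the inverse correlation matrix P: -p_ij / sqrt(p_ii * p_jj)."""
    P = np.linalg.inv(R)
    return -P[i, j] / np.sqrt(P[i, i] * P[j, j])

# Hypothetical correlation matrix for X1..X4 (NOT the paper's data).
R = np.array([
    [ 1.0,     0.2,   -0.4,   0.0625],
    [ 0.2,     1.0,    0.2,  -0.125 ],
    [-0.4,     0.2,    1.0,   0.075 ],
    [ 0.0625, -0.125,  0.075, 1.0   ],
])

r_rec = second_order(R, 0, 1, 2, 3)   # X1, X2 with X3 and X4 held constant
r_inv = partial_given_rest(R, 0, 1)
print(r_rec, r_inv, np.isclose(r_rec, r_inv))
```

The recursion gives the same value whichever of the two fixed variables is removed first, and swapping the two variables of interest leaves the coefficient unchanged.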

5. Normality Test of Multivariate Correlation Coefficients

In this part, we prove that if the variables are normally distributed, then the correlation coefficients that relate them are also normally distributed. Since the sample size for each index (variable) is 25, the four variables give 100 observations in total. The data were already assumed to be normal by the law of large numbers; we now test the normality of the correlation coefficients.

Theorem three: If  then .

In this test we recall that if    with

Let  be the correlation coefficients of  then the joint probability density function is given by

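From the above, the test statistic is the chi-square goodness-of-fit statistic

$$\chi^2 = \sum_{i=1}^{6} \frac{(f_i - e_i)^2}{e_i},$$

where $f_i$ is the observed frequency and $e_i$ the expected frequency of the $i$-th correlation coefficient.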

Table 1. Test for normality of the correlation coefficients.

r_i      f_i  r_i - r̄    (r_i - r̄)^2   z_i      p_i     e_i     (1 - e_i)  (1 - e_i)^2  (1 - e_i)^2 / e_i
-0.4657  1    -0.4375    0.1911        -1.6349  0.0516  0.3096  0.6904     0.4767       1.5396
-0.1650  1    -0.1364    0.0189        -0.5101  0.2534  1.5204  -0.5204    0.2708       0.1781
-0.0420  1    -0.0137    0.0002        -0.0513  0.1751  1.0506  -0.0506    0.0026       0.0024
0.0661   1    0.0947     0.0090        0.3542   0.1567  0.9402  0.0598     0.0036       0.0038
0.1247   1    0.1533     0.0235        0.5734   0.0789  0.4734  0.5266     0.2773       0.5858
0.3107   1    0.3393     0.1151        1.2688   0.2843  1.7058  -0.7058    0.4982       0.2921
-0.1715  6               0.3575                 1       6                               2.6018

The final row gives the column totals, where

r_i = the corresponding correlation coefficient,

f_i = the frequency of the correlation coefficient, which is one per outcome,

r̄ and s are the mean and standard deviation of the correlation coefficients respectively,

z_i = (r_i - r̄)/s is the standardized coefficient, p_i is the normal probability of the i-th interval between successive z_i, and e_i = 6 p_i is the expected frequency.

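The above analysis is based on the following hypotheses:

$H_0$: the correlation coefficients are normally distributed

$H_1$: the correlation coefficients are not normally distributed

Since the calculated $\chi^2$ (2.6018) is less than the tabulated $\chi^2$ (9.4880), we accept $H_0$ and conclude that the correlation coefficients are normally distributed.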
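The arithmetic behind Table 1 can be retraced with a short script. This is a sketch under one assumption that the tabulated values are consistent with: the probability attached to each coefficient is the normal probability of the interval between successive standardized coefficients, with the last interval open-ended. The six coefficients and the critical value 9.488 are taken from the text; the variable names are introduced here.

```python
import math

# The six correlation coefficients listed in Table 1.
r = [-0.4657, -0.1650, -0.0420, 0.0661, 0.1247, 0.3107]
n = len(r)

mean = sum(r) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in r) / (n - 1))  # sample std. dev.
z = [(x - mean) / sd for x in r]                           # standardized values

def phi(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# Assumed interval probabilities: Phi(z_1), then successive differences,
# with the last interval extending to +infinity.
cdf = [phi(t) for t in z[:-1]] + [1.0]
p = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, n)]

e = [n * pi for pi in p]                      # expected frequencies
chi2 = sum((1 - ei) ** 2 / ei for ei in e)    # observed frequency is 1 each

critical = 9.488  # tabulated chi-square value quoted in the text
print(round(chi2, 4), chi2 < critical)
```

The statistic computed this way is close to, but not exactly, the tabulated total, because the table rounds the intermediate values; it remains well below the critical value.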

6. Interpretation

The numerical matrix of correlation coefficients displays the relationship between every pair of variables, covariance (), under consideration. If both variables increase at the same time, then a positive correlation exists between them. This was seen in the case of   and so on. Negative correlation coefficients, however, arise where one variable tends to increase as the other tends to decrease, as seen in , , . The first partial correlation coefficients measure the relationship between any two variables when one other variable is held constant. In the second partial correlation coefficient matrix, which examines the correlation between any two variables while keeping the remaining two variables constant, it was discovered that is . This implies, . It was confirmed that there exist perfect diagonal matrices among the coefficients of the simple and the second partial correlations. Finally, Table 1 shows what happened when the correlation coefficients were subjected to a normality test. The essence was to prove that correlation coefficients obey the normal probability law of  when the sample variable is normal .

7. Conclusion

In this research, it was shown that multivariate techniques can be used to compute correlation coefficient matrices easily from variance-covariance matrices (Kendall and Stuart, 1973). Theorem one proved that partial correlation coefficients can be derived by suitably partitioning the covariance matrix. Theorem two established the relationship that exists when two variables are kept constant. For such second partial correlations, it was verified that irrespective of the covariance or correlation coefficient terms used, the second partial correlation (in a square matrix) must give a corresponding value that shows the diagonal attributes of the correlation coefficients. Finally, the normality of the multivariate correlation coefficients was confirmed. In other words, every normally distributed set of observations gives a correlation coefficient matrix that is itself normally distributed.

Theorems

1.        

2.          

3.         If ,  then .


References

  1. Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd Edition. Wiley-Interscience, New York.
  2. Francois Husson, Sebastien Le and Jerome Pages (2010). Exploratory Multivariate Analysis. Chapman & Hall Books. Retrieved from CRC Press, www.crcpress.com.
  3. Kendall, M. G., and Stuart, A. (1973). The Advanced Theory of Statistics (Inference and Relationship), Vol. 2, 3rd Edition. Hafner Publishing Co., New York.
  4. Morison, D. F. (2007). Multivariate Statistical Methods. McGraw-Hill Book Company, New York. Retrieved from onlinelibrary.wiley.com.
  5. Neter, J., Kutner, M. H., Nachtsheim, C., and Wasserman, W. (1996). Applied Linear Statistical Models. McGraw Hill, Boston.
  6. Partial Correlation Analysis. Retrieved from https://explorable.com/partial-correlation-analysis
  7. Rao, C. R. (2008). Linear Statistical Inference and its Application, 2nd Edition. John Wiley & Sons Inc. Retrieved from onlinelibrary.wiley.com.
  8. Regression Tutorial Menu Dictionary, STATS @ MTSU. Retrieved from http://mtweb.mtsu.edu/stats/regression/level3/multicorrel/multicorrcoef.htm
  9. Steiger, J. H., and Browne, M. W. (1984). The Comparison of Interdependent Correlations between Optimal Linear Composites. Psychometrika, 49, 11-24.
  10. Stockwell, I. (2008). Introduction to Correlation and Regression. SAS Global Forum, Baltimore.
  11. Tabachnick, B., and Fidell, L. (1989). Using Multivariate Statistics. Harper & Row Publishers, New York.
