Acta Anthropologica Sinica ›› 2005, Vol. 24 ›› Issue (03): 221-231.


The methodology of principal component analysis based on the averaged covariance matrix for the analysis of human population genetic structures

XUE Fuzhong, WANG Jiezhen, GUO Yishou, HU Ping   

  • Online: 2005-09-15 Published: 2005-09-15

Abstract: Objective: To explore the applicability and rationale of principal component analysis based on the averaged covariance matrix for analyzing human population genetic structure. Methods: Based on the structure of the gene frequency matrix, we showed how the eigenvalues, the eigenvectors, and their effect on dimensionality reduction differ between principal component analysis of the standardized correlation matrix and principal component analysis of the averaged covariance matrix. To validate the two methods and compare their use and rationale in human population genetics, we applied both to the genetic structure of the HLA-A locus in 26 Chinese Han populations. Results: The principal components of the standardized correlation matrix do not represent the variance weight of the gene frequency matrix; instead, they represent the correlation weight between the genes. The principal components of the averaged covariance matrix not only reflect the variance weight of the gene frequency matrix but also identify the correlation weight between the genes in that matrix. In the analysis of the HLA-A locus in 26 Chinese Han populations, principal component analysis of the averaged covariance matrix reduced the dimensionality of the gene frequency matrix better than principal component analysis of the standardized correlation matrix. Using principal component analysis of the averaged covariance matrix, the genetic structure of the HLA-A locus in Chinese Han populations can be explained correctly. Conclusion: When carrying out principal component analysis of human population genetic structure, one should calculate the principal components from the averaged covariance matrix rather than from the standardized correlation matrix.

Key words: Human population; Genetic structure; Principal component analysis; Averaged covariance matrix; HLA-A
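
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not define the "averaged" covariance matrix, so the ordinary sample covariance of the gene frequencies is assumed here, and the frequency data are synthetic (hypothetical values, not HLA-A data). The function names `pca_eigen` and `compare_pca` are illustrative only. The sketch shows that the eigenvalues of the correlation matrix always sum to the number of alleles, whatever the actual variances are, while the eigenvalues of the covariance matrix sum to the total variance of the gene frequencies.

```python
import numpy as np


def pca_eigen(matrix):
    """Return eigenvalues (descending) and matching eigenvectors of a symmetric matrix."""
    values, vectors = np.linalg.eigh(matrix)
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]


def compare_pca(freqs):
    """Compare covariance-based and correlation-based PCA.

    freqs: array of shape (populations, alleles) holding gene frequencies.
    Assumption: the plain sample covariance stands in for the paper's
    'averaged covariance matrix', whose exact definition is not in the abstract.
    """
    cov = np.cov(freqs, rowvar=False)        # keeps each allele's variance
    corr = np.corrcoef(freqs, rowvar=False)  # rescales every allele to unit variance
    cov_vals, _ = pca_eigen(cov)
    corr_vals, _ = pca_eigen(corr)
    return {
        "cov_total": cov_vals.sum(),
        "corr_total": corr_vals.sum(),
        "cov_explained": cov_vals / cov_vals.sum(),
        "corr_explained": corr_vals / corr_vals.sum(),
    }


# Hypothetical synthetic data: 26 populations, 6 alleles, rows sum to 1.
rng = np.random.default_rng(0)
n_pop, n_alleles = 26, 6
base = np.array([0.30, 0.25, 0.20, 0.15, 0.07, 0.03])
freqs = rng.dirichlet(base * 200, size=n_pop)

result = compare_pca(freqs)
print("covariance PCA, share of variance per component:", result["cov_explained"].round(3))
print("correlation PCA, share of variance per component:", result["corr_explained"].round(3))
```

Because standardization gives rare and common alleles the same unit variance, the correlation-based components describe only how alleles co-vary, which is consistent with the distinction the abstract draws.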