Acta Anthropologica Sinica ›› 2007, Vol. 26 ›› Issue (04): 361-371.


Assessing the value of cluster analysis and principal component analysis in anthropological research

WU Xiujie; ZHANG Quanchao; LI Haijun

  • Publication date: 2007-12-15  Release date: 2007-12-15

An examination of cluster and principal component analysis on the study of anthropology

WU Xiujie, ZHANG Quanchao, LI Haijun

  • Online:2007-12-15 Published:2007-12-15

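Abstract (translated from the Chinese): This paper takes adult male skulls (668 specimens) from 9 population groups living in different regions as its main study material and, through cluster analysis and principal component analysis of 14 metric traits, examines the value of multivariate statistical methods in anthropological research. The results show that the Euclidean distance coefficient can give a preliminary indication of the relationships and differences among the population groups; that relationships among populations inferred from a cluster dendrogram are influenced by the author's subjective judgment, so credible conclusions should rest on agreement among the results of several clustering methods; and that the results of principal component analysis depend to some extent on the variables selected, so choosing a different set of variables affects the outcome. Compared with cluster analysis, principal component analysis reflects the relationships among populations relatively well. The results suggest that conclusions about relationships among populations drawn from multivariate statistical methods should be treated with caution.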

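Keywords (translated from the Chinese): cluster analysis; principal component analysis; Euclidean distance coefficient; skull; metric traits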
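The abstract's advice to trust a dendrogram only when several clustering methods agree can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the group labels, the three measurements, and the mean values are invented and are not the paper's data, and the two linkage rules (single and complete) merely stand in for whichever clustering methods the authors compared.

```python
from itertools import combinations

# Hypothetical group means (mm) for three cranial measurements.
# Invented for illustration; NOT data from the paper.
group_means = {
    "A": [180.0, 140.0, 135.0],
    "B": [182.0, 139.0, 137.0],
    "C": [175.0, 146.0, 130.0],
    "D": [188.0, 133.0, 128.0],
}

def euclidean(u, v):
    """Euclidean distance between two mean vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def city_block(u, v):
    """City block (Manhattan) distance between two mean vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def pairwise(metric):
    """Distance for every unordered pair of groups, keyed by the pair."""
    return {
        frozenset(pair): metric(group_means[pair[0]], group_means[pair[1]])
        for pair in combinations(group_means, 2)
    }

def agglomerate(dist, linkage):
    """Agglomerative clustering; returns the clusters in merge order.

    `linkage` is `min` for single linkage or `max` for complete linkage.
    """
    clusters = [frozenset([g]) for g in group_means]
    merges = []
    while len(clusters) > 1:
        def cluster_dist(c1, c2):
            return linkage(dist[frozenset([a, b])] for a in c1 for b in c2)
        c1, c2 = min(combinations(clusters, 2), key=lambda p: cluster_dist(*p))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        merges.append(merged)
    return merges

# Run every metric/linkage combination and check whether they all agree.
results = {
    (metric.__name__, linkage.__name__): agglomerate(pairwise(metric), linkage)
    for metric in (euclidean, city_block)
    for linkage in (min, max)
}
consistent = len({tuple(m) for m in results.values()}) == 1
for key, merges in results.items():
    print(key, [sorted(m) for m in merges])
print("all methods agree:", consistent)
```

When `consistent` is false, the merge orders differ between methods, which is the situation in which the abstract warns against reading relationships off a single dendrogram.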

Abstract: Multivariate analysis can summarize a data set and present its information directly, so more and more anthropologists use it to analyze relationships among populations. Because the method has seldom been tested, some researchers still doubt its results. To evaluate multivariate analysis in anthropological research, we examined adult male skulls (n = 668) from nine populations of different regions: Hebei, Inner Mongolia, Liaoning, Shaanxi, Shanxi, Xinjiang, Huabei, Yunnan and Europe. Fourteen standard linear measurements were selected for cluster analysis and principal components analysis. The Euclidean distance coefficient and the city block distance gave very similar pictures of the relationships and differences among the populations. These results indicate that the Euclidean distance coefficient is useful for a preliminary judgment of the relationships and differences among populations. The dendrograms drawn from the metric data of the nine populations varied with the cluster analysis method used. Relationships among populations cannot be reliably determined from a cluster dendrogram alone unless the results of the different clustering methods agree. Across four PCA score methods applied to the skull metric data, the distribution of the nine populations changed little. The results of principal components analysis depend on the variables: when the variables change, the component matrix and the total variance loadings change too. Compared with cluster analysis, principal components analysis better explains the relationships among the populations. Conclusions drawn from multivariate analysis should therefore be treated with caution.

Key words: Cluster analysis; Principal component analysis; Euclidean distance coefficient; Skull; Metric traits
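The abstract's observation that principal component results change with the chosen variables can be shown with a small sketch. It is written under stated assumptions: the measurement names and values are invented and are not the paper's data, and the largest eigenvalue of the correlation matrix is approximated by power iteration in plain Python, where a numerical library's eigen-decomposition would normally be used.

```python
import math

# Hypothetical group means (mm) for three measurements across five groups.
# Invented for illustration; NOT data from the paper.
data = {
    "length":  [178.0, 179.0, 180.0, 181.0, 182.0],
    "breadth": [137.0, 138.0, 140.0, 141.0, 144.0],
    "height":  [132.0, 129.0, 128.0, 129.0, 132.0],
}

def correlation(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(names):
    return [[correlation(data[a], data[b]) for b in names] for a in names]

def largest_eigenvalue(matrix, iterations=1000):
    """Approximate the dominant eigenvalue by power iteration.

    The uneven start vector avoids the common case of starting exactly
    orthogonal to the dominant eigenvector.
    """
    v = [float(i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(m * x for m, x in zip(row, v)) for row in matrix]
    return sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient

def pc1_share(names):
    """Share of total variance carried by the first principal component.

    The trace of a correlation matrix equals the number of variables,
    so the share is the largest eigenvalue divided by that count.
    """
    return largest_eigenvalue(correlation_matrix(names)) / len(names)

# Compare the first component's share for different variable sets.
for subset in (["length", "breadth"], ["length", "height"],
               ["length", "breadth", "height"]):
    print(subset, round(pc1_share(subset), 3))
```

In this toy data set, length and breadth are strongly correlated while length and height are uncorrelated, so swapping one variable changes how much variance the first component captures. That is the kind of variable dependence the abstract describes.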