Multivariate Analysis of Mixed Data: The R Package PCAmixdata
Abstract
Mixed data arise when observations are described by a mixture of numerical and categorical variables. The R package PCAmixdata extends standard multivariate analysis methods for the description, exploration and visualization of data to this type of data. The key methods included in the package are principal component analysis for mixed data (PCAmix), varimax-like orthogonal rotation for PCAmix, and multiple factor analysis for mixed multi-table data. This paper gives a unified mathematical presentation of the different methods with common notation, together with a summary of the three algorithms and details to help the user understand the graphical and numerical outputs of the corresponding R functions.
This then allows the user to easily provide relevant interpretations of the results obtained.
The three main methods are illustrated on a real dataset composed of four data tables characterizing living conditions in different municipalities in the Gironde region of southwest France.
DOI Code:
10.1285/i20705948v15n3p606
Keywords:
mixture of numerical and categorical data, PCA, multiple correspondence analysis, multiple factor analysis, varimax rotation, R
Full Text: pdf