CLUSTATIS: cluster analysis of blocks of variables
Abstract
The STATIS method is one of many strategies for the unsupervised analysis of multiblock data. A new optimization criterion that defines this method is introduced, and an extension to the cluster analysis of several blocks of variables is discussed. The extension consists of a hierarchical cluster analysis and a partitioning algorithm akin to the K-means algorithm. Moreover, to improve the clustering outcomes, an additional cluster, called the noise cluster, is introduced to hold atypical blocks of variables. The general strategy of analysis is illustrated by means of two case studies.
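As a rough illustration of the partitioning step and the noise cluster mentioned above, the following Python sketch clusters blocks in a K-means-like loop. It is not the criterion introduced in the paper, which the abstract does not spell out. It assumes that each block is summarized by a centered, normalized scalar product matrix, that a cluster is represented by the unweighted mean of its members, and that a block lying farther than a threshold from every cluster representative is set aside as noise. The names `scalar_product_matrix`, `partition_blocks`, and `rho` are hypothetical.

```python
import numpy as np

def scalar_product_matrix(X):
    """Centered scalar product matrix of one block, scaled to unit Frobenius norm."""
    Xc = X - X.mean(axis=0)
    W = Xc @ Xc.T
    return W / np.linalg.norm(W)

def partition_blocks(blocks, init_labels, n_iter=20, rho=None):
    """K-means-like partitioning of blocks (simplified, assumed scheme).

    Label -1 marks the noise cluster; it is only used when rho is given.
    """
    W = np.stack([scalar_product_matrix(X) for X in blocks])
    labels = np.asarray(init_labels).copy()
    K = labels.max() + 1
    centers = np.zeros((K,) + W.shape[1:])
    for _ in range(n_iter):
        # Update step: unweighted mean of the members (an assumption;
        # an empty cluster keeps its previous representative).
        for k in range(K):
            members = W[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
        # Assignment step: Frobenius distance of each block to each representative.
        diff = W[:, None] - centers[None]
        d = np.sqrt((diff ** 2).sum(axis=(2, 3)))
        new_labels = d.argmin(axis=1)
        if rho is not None:
            # Blocks far from every representative go to the noise cluster.
            new_labels[d.min(axis=1) > rho] = -1
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy data: three mutually orthogonal one-column configurations of six objects.
a = np.array([1, 1, 1, -1, -1, -1], dtype=float).reshape(-1, 1)
b = np.array([1, -1, 0, 0, 1, -1], dtype=float).reshape(-1, 1)
c = np.array([1, -1, 0, 0, -1, 1], dtype=float).reshape(-1, 1)
blocks = [a, 2 * a, 3 * a, b, 2 * b, 0.5 * b, c]  # the last block is atypical
init = [0, 1, 0, 1, 0, 1, 0]

labels_plain = partition_blocks(blocks, init)
labels_noise = partition_blocks(blocks, init, rho=1.0)
print(labels_plain, labels_noise)
```

In this toy setting the atypical block is forced into one of the two clusters when no threshold is given, and is set aside when `rho` is supplied, which is the behavior the noise cluster is meant to provide.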