Compositional Principal Component Analysis on Air Concentrations data from Kodungaiyur region of Chennai city


Abstract


This study applies Compositional Principal Component Analysis (CoDA PCA) to air pollutant concentrations in the Kodungaiyur region of Chennai, focusing on seven key pollutants: PM10, PM2.5, NO2, SO2, NH3, O3, and CO. Three log-ratio transformations, the Centered Log-Ratio (CLR), the Isometric Log-Ratio (ILR), and the Additive Log-Ratio (ALR), are used to handle the constant-sum constraint inherent in compositional data, whose parts sum to a fixed total. The first two principal components explain approximately 76% of the variance and provide insight into the relationships among pollutants and their potential sources. The CLR, ILR, and ALR biplots highlight different pollutant groupings, suggesting distinct emission sources and environmental impacts. The study underscores the importance of compositional data techniques in air quality research and offers a detailed picture of pollutant dynamics in Kodungaiyur that can support more targeted pollution control strategies.

Keywords: Air pollutant concentrations, compositional data, log-ratios, principal component analysis
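
As a rough illustration of the workflow summarised in the abstract, the Python sketch below applies the three log-ratio transformations and a PCA to a seven-part composition. It is not the authors' code. The data are randomly generated stand-ins (not Kodungaiyur measurements); the function names closure, clr, alr, ilr_basis, ilr, and pca are hypothetical; the use of CO as the ALR reference part and of a Helmert-type orthonormal basis for the ILR are assumptions; and only NumPy is used.

```python
import numpy as np

POLLUTANTS = ["PM10", "PM2.5", "NO2", "SO2", "NH3", "O3", "CO"]


def closure(x):
    """Rescale each row so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)


def clr(x):
    """Centered log-ratio: log of each part over the row geometric mean."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=1, keepdims=True)


def alr(x, ref=-1):
    """Additive log-ratio: log of each part over a reference part (assumed: last)."""
    logx = np.log(closure(x))
    return np.delete(logx - logx[:, [ref]], ref, axis=1)


def ilr_basis(d):
    """Orthonormal D x (D-1) basis of the CLR hyperplane (Helmert-type contrasts)."""
    v = np.zeros((d, d - 1))
    for j in range(1, d):
        v[:j, j - 1] = 1.0 / j
        v[j, j - 1] = -1.0
        v[:, j - 1] *= np.sqrt(j / (j + 1.0))
    return v


def ilr(x):
    """Isometric log-ratio: CLR coordinates projected onto the orthonormal basis."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ ilr_basis(x.shape[1])


def pca(z):
    """PCA via SVD of the column-centered data.

    Returns scores, loadings, and the share of variance per component.
    """
    zc = z - z.mean(axis=0)
    u, s, vt = np.linalg.svd(zc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u * s, vt.T, explained


# Synthetic, strictly positive "concentrations" (hypothetical, for illustration only).
rng = np.random.default_rng(0)
conc = rng.lognormal(mean=2.0, sigma=0.5, size=(200, len(POLLUTANTS)))

scores_clr, loadings_clr, explained_clr = pca(clr(conc))
scores_ilr, loadings_ilr, explained_ilr = pca(ilr(conc))
scores_alr, loadings_alr, explained_alr = pca(alr(conc))

# Share of variance carried by the first two components under each transformation.
for name, ev in [("CLR", explained_clr), ("ILR", explained_ilr), ("ALR", explained_alr)]:
    print(name, round(float(ev[:2].sum()), 3))
```

Because the ILR coordinates are an orthonormal rotation of the CLR coordinates, PCA on either yields the same explained-variance shares, whereas the ALR is not isometric and its PCA generally differs. A biplot would plot the first two columns of the scores together with the corresponding loadings.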

Electronic Journal of Applied Statistical Analysis 17



This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Italy License.