Corvinus

Network-based dimensionality reduction of high-dimensional, low-sample-size datasets

Kosztyán, Zsolt Tibor ORCID: https://orcid.org/0000-0001-7345-8336, Kurbucz, Marcell Tamás ORCID: https://orcid.org/0000-0002-0121-6781 and Katona, Attila Imre ORCID: https://orcid.org/0000-0001-7946-6265 (2022) Network-based dimensionality reduction of high-dimensional, low-sample-size datasets. Knowledge-Based Systems, 251. DOI 10.1016/j.knosys.2022.109180


Official URL: https://doi.org/10.1016/j.knosys.2022.109180


Abstract

In the field of data science, there are a variety of datasets that suffer from the high-dimensional, low-sample-size (HDLSS) problem; however, only a few dimensionality reduction methods exist that are applicable to this type of problem, and there is no nonparametric solution to date. The purpose of this work is to develop a novel network-based (nonparametric) dimensionality reduction analysis (NDA) method that can be effectively applied to HDLSS data. First, with the NDA method, the correlation graph of variables is specified. With a modularity-based community detection method, the set of modules is specified. Then, the linear combinations of variables weighted by their eigenvector centralities (EVCs), defined as latent variables (LVs), are determined. In the optional phase of variable selection, variables with low EVCs and low communality are ignored. Then, the set of LVs and the set of indicators belonging to the LVs are specified using the NDA method. NDA is applied to publicly available databases and compared with principal factoring with community analysis (PFA) methods. The results show that NDA can be effectively applied to HDLSS datasets, as it outperforms the existing methods in terms of interpretability. In addition, NDA is easier to apply: because it is nonparametric, there is no need to specify the number of latent variables.
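
The pipeline described in the abstract (correlation graph, modularity-based modules, EVC-weighted combinations) can be illustrated with a rough, standard-library-only Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: edges carry squared Pearson correlations with an arbitrary cut-off of 0.25, a naive greedy merge stands in for whichever modularity-based community detection method the paper uses, centralities are computed within each module, variables are z-scored before combining, the sign of correlations is ignored, and the optional variable-selection phase is left out. All function names (`nda_sketch` and its helpers) are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_graph(columns, min_r2):
    """Symmetric matrix of squared correlations; edges below min_r2 are dropped."""
    p = len(columns)
    w = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            r2 = pearson(columns[i], columns[j]) ** 2
            if r2 >= min_r2:  # assumed cut-off, not taken from the paper
                w[i][j] = w[j][i] = r2
    return w

def modularity(w, labels):
    """Newman modularity of a labelling on a weighted undirected graph."""
    two_m = sum(map(sum, w))
    if two_m == 0:
        return 0.0
    k = [sum(row) for row in w]
    idx = range(len(w))
    q = sum(w[i][j] - k[i] * k[j] / two_m
            for i in idx for j in idx if labels[i] == labels[j])
    return q / two_m

def greedy_communities(w):
    """Apply the best modularity-raising merge repeatedly until none remains."""
    labels = list(range(len(w)))
    best_q = modularity(w, labels)
    while True:
        best, ids = None, sorted(set(labels))
        for a in ids:
            for b in (c for c in ids if c > a):
                trial = [a if lab == b else lab for lab in labels]
                q = modularity(w, trial)
                if q > best_q + 1e-12:
                    best_q, best = q, trial
        if best is None:
            return labels
        labels = best

def eigenvector_centrality(w, nodes, iters=200):
    """Power iteration on A + I within one module, scaled so the maximum is 1."""
    c = {i: 1.0 for i in nodes}
    for _ in range(iters):
        new = {i: c[i] + sum(w[i][j] * c[j] for j in nodes) for i in nodes}
        top = max(new.values())
        c = {i: v / top for i, v in new.items()}
    return c

def zscore(x):
    """Standardise a sequence to zero mean and unit sample variance."""
    m = sum(x) / len(x)
    s = math.sqrt(sum((a - m) ** 2 for a in x) / (len(x) - 1))
    return [(a - m) / s for a in x]

def nda_sketch(columns, min_r2=0.25):
    """Return one (variable indices, centralities, latent variable) tuple per module."""
    w = correlation_graph(columns, min_r2)
    labels = greedy_communities(w)
    z = [zscore(col) for col in columns]
    modules = []
    for lab in sorted(set(labels)):
        nodes = [i for i, l in enumerate(labels) if l == lab]
        evc = eigenvector_centrality(w, nodes)
        total = sum(evc.values())
        lv = [sum(evc[i] * z[i][t] for i in nodes) / total
              for t in range(len(columns[0]))]
        modules.append((nodes, evc, lv))
    return modules

# Toy data: 6 samples, 8 variables driven by two uncorrelated factors plus small noise.
rng = random.Random(0)
f1 = [1, 2, 3, 4, 5, 6]
f2 = [2, -1, -1, -1, -1, 2]
columns = [[f + rng.uniform(-0.1, 0.1) for f in base]
           for base in (f1, f1, f1, f1, f2, f2, f2, f2)]
modules = nda_sketch(columns)
for nodes, evc, lv in modules:
    print(nodes, [round(v, 2) for v in lv])
```

On this toy input the two modules correspond to the two blocks of variables, and the number of latent variables falls out of the community structure rather than being supplied by the caller, which is the nonparametric property the abstract emphasises. The sketch does not handle constant columns (zero variance) and is not meant for datasets of realistic size.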

Item Type: Article
Uncontrolled Keywords: Nonparametric methods ; Dimensionality reduction ; Community detection ; Communality analysis
Divisions: Institute of Data Analytics and Information Systems
Subjects: Computer science
Funders: Thematic Excellence Programme by the National Research, Development and Innovation Fund of Hungary, Research Centre of the Faculty of Business and Economics
Projects: 2020-4.1.1-TKP2020, PE-GTK-GSKK A095000000-1
DOI: 10.1016/j.knosys.2022.109180
ID Code: 10435
Deposited By: MTMT SWORD
Deposited On: 17 Oct 2024 09:38
Last Modified: 17 Oct 2024 09:38
