Kosztyán, Zsolt Tibor ORCID: https://orcid.org/0000-0001-7345-8336, Kurbucz, Marcell Tamás ORCID: https://orcid.org/0000-0002-0121-6781 and Katona, Attila Imre ORCID: https://orcid.org/0000-0001-7946-6265 (2022) Network-based dimensionality reduction of high-dimensional, low-sample-size datasets. Knowledge-Based Systems, 251. DOI 10.1016/j.knosys.2022.109180
Official URL: https://doi.org/10.1016/j.knosys.2022.109180
Abstract
In the field of data science, there are a variety of datasets that suffer from the high-dimensional, low-sample-size (HDLSS) problem; however, only a few dimensionality reduction methods exist that are applicable to this type of problem, and there is no nonparametric solution to date. The purpose of this work is to develop a novel network-based (nonparametric) dimensionality reduction analysis (NDA) method that can be effectively applied to HDLSS data. First, with the NDA method, the correlation graph of variables is specified. With a modularity-based community detection method, the set of modules is specified. Then, the linear combination of variables weighted by their eigenvector centralities (EVCs), defined as LVs, is determined. In the optional phase of variable selection, variables with low EVCs and low communality are ignored. Then, the set of LVs and the set of indicators belonging to the LVs are specified using the NDA method. NDA is applied to publicly available databases and compared with principal factoring with community analysis (PFA) methods. The results show that NDA can be effectively applied to HDLSS datasets, as it outperforms the existing methods in terms of interpretability. In addition, the application of NDA is easier, since there is no need to specify the number of latent variables due to its nonparametric nature.
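The three main steps named in the abstract (correlation graph, modularity-based modules, EVC-weighted latent variables) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not fix the details, so the following are assumptions made only for illustration: edges are squared correlations cut at a hypothetical `threshold`, modules come from Newman's leading-eigenvector bisection (one of several modularity-based methods), and EVCs are computed within each module and scaled to sum to 1. All function names are hypothetical, and the optional variable selection phase is omitted.

```python
import numpy as np

def correlation_graph(X, threshold=0.3):
    """Weighted adjacency of variables: squared correlations, weak edges dropped.
    The squaring and the threshold value are assumptions of this sketch."""
    A = np.corrcoef(X, rowvar=False) ** 2
    np.fill_diagonal(A, 0.0)
    A[A < threshold] = 0.0
    return A

def modularity_modules(A):
    """Recursive bisection by the leading eigenvector of the modularity matrix."""
    n = A.shape[0]
    k = A.sum(axis=1)
    two_m = k.sum()
    if two_m == 0:  # no edges: every variable is its own module
        return [np.array([i]) for i in range(n)]
    B = A - np.outer(k, k) / two_m
    final, queue = [], [np.arange(n)]
    while queue:
        g = queue.pop()
        Bg = B[np.ix_(g, g)]
        Bg = Bg - np.diag(Bg.sum(axis=1))  # generalized modularity matrix
        vals, vecs = np.linalg.eigh(Bg)
        s = np.where(vecs[:, -1] > 0, 1, -1)
        gain = s @ Bg @ s  # proportional to the modularity change of the split
        if vals[-1] < 1e-10 or gain < 1e-10 or abs(s.sum()) == len(g):
            final.append(np.sort(g))  # indivisible: keep as one module
        else:
            queue.append(g[s > 0])
            queue.append(g[s < 0])
    return sorted(final, key=lambda m: m[0])

def latent_variables(X, A, modules):
    """One latent variable per module: EVC-weighted sum of standardized variables."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    lvs, weights = [], []
    for g in modules:
        _, vecs = np.linalg.eigh(A[np.ix_(g, g)])
        w = np.abs(vecs[:, -1])  # eigenvector centrality within the module
        w = w / w.sum()
        weights.append(w)
        lvs.append(Z[:, g] @ w)
    return np.column_stack(lvs), weights

# Synthetic HDLSS-style data: 20 samples, 30 variables driven by 2 hidden factors.
rng = np.random.default_rng(0)
f1 = rng.standard_normal(20)
f2 = rng.standard_normal(20)
f1 -= f1.mean()
f2 -= f2.mean()
f2 -= (f2 @ f1) / (f1 @ f1) * f1  # make the two factors uncorrelated
X = np.hstack([
    f1[:, None] + 0.1 * rng.standard_normal((20, 15)),
    f2[:, None] + 0.1 * rng.standard_normal((20, 15)),
])

A = correlation_graph(X)
modules = modularity_modules(A)
lv, weights = latent_variables(X, A, modules)
print(len(modules), lv.shape)
```

Note that the number of modules, and therefore the number of latent variables, is returned by the community detection step rather than supplied by the user, which mirrors the nonparametric property the abstract emphasizes.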
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Nonparametric methods ; Dimensionality reduction ; Community detection ; Communality analysis |
Divisions: | Institute of Data Analytics and Information Systems |
Subjects: | Computer science |
Funders: | Thematic Excellence Programme by the National Research, Development and Innovation Fund of Hungary, Research Centre of the Faculty of Business and Economics |
Projects: | 2020-4.1.1-TKP2020, PE-GTK-GSKK A095000000-1 |
DOI: | 10.1016/j.knosys.2022.109180 |
ID Code: | 10435 |
Deposited By: | MTMT SWORD |
Deposited On: | 17 Oct 2024 09:38 |
Last Modified: | 17 Oct 2024 09:38 |