Corvinus

Measuring research interest similarity with transition probabilities

Varga, Attila ORCID: https://orcid.org/0000-0002-8913-4616, Kojaku, Sadamori ORCID: https://orcid.org/0000-0002-9414-6814 and Filipi, Silva ORCID: https://orcid.org/0000-0002-9151-6517 (2025) Measuring research interest similarity with transition probabilities. Quantitative Science Studies, 6, pp. 922-939. DOI: 10.1162/qss.a.13

Official URL: https://doi.org/10.1162/qss.a.13


Abstract

We introduce a family of paper and author similarity measures based on the concept that papers are more similar if they are more likely to be retrieved during a literature search following backward and forward citations. As this browsing process resembles a walk in a citation network, we operationalize the concept using the transition probability (TP) of random walkers. The proposed measures are continuous and symmetric, and can be implemented on any citation network. We conduct validation tests of the TP concept and other extant alternatives to gauge which metric can classify papers and predict future coauthors most consistently across different scales of analysis (coauthorships, journals, and disciplines). Our results show that the proposed basic TP measure outperforms alternative metrics such as personalized PageRank and the node2vec machine-learning technique in classification tasks at various scales. Additionally, we discuss how publication-level data can be leveraged to approximate the research interest similarity of individual scientists. This paper is accompanied by a Python package that implements all the tested metrics.
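
To make the random-walk idea concrete, the following is a minimal sketch and not the measure defined in the paper or the API of its accompanying package. It assumes that a walker moves with equal probability along any backward or forward citation (so the citation network is treated as undirected), that similarity is read off the k-step transition probabilities, and that symmetry is obtained by averaging the two directions. The function names, the `steps` parameter, and the toy edge list are all hypothetical.

```python
import numpy as np


def transition_matrix(edges, n):
    """Row-stochastic matrix of a walk that follows citations in both directions.

    edges: iterable of (citing, cited) integer pairs (hypothetical toy input).
    n: number of papers.
    """
    A = np.zeros((n, n))
    for citing, cited in edges:
        # Assumption: backward and forward citations are equally likely steps.
        A[citing, cited] = 1.0
        A[cited, citing] = 1.0
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # isolated papers keep an all-zero row
    return A / deg


def tp_similarity(edges, n, steps=2):
    """Illustrative symmetric similarity from k-step transition probabilities."""
    P = transition_matrix(edges, n)
    Pk = np.linalg.matrix_power(P, steps)
    # Assumption: symmetrize by averaging the i->j and j->i probabilities.
    return (Pk + Pk.T) / 2.0


# Toy network: papers 1 and 2 cite paper 0, paper 3 cites 1 and 2, paper 4 cites 3.
edges = [(1, 0), (2, 0), (3, 1), (3, 2), (4, 3)]
S = tp_similarity(edges, n=5, steps=2)
print(np.round(S, 3))
```

In this toy network, papers 1 and 2 share both a reference and a citing paper, so the sketch scores them as more similar to each other than paper 1 is to paper 4, which is connected to it only through paper 3.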

Item Type: Article
Uncontrolled Keywords: citation networks, paper similarity, research problem choice, transition probability
Divisions: Corvinus Institute for Advanced Studies (CIAS)
Subjects: Mathematics, Econometrics
Funders: Air Force Office of Scientific Research, Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS), National Science Foundation
Projects: FA9550-19-1-0391, CIS230183, #2138259, #2138286, #2138307, #2137603, #2138296
DOI: 10.1162/qss.a.13
ID Code: 11909
Deposited By: MTMT SWORD
Deposited On: 10 Oct 2025 07:58
Last Modified: 10 Oct 2025 07:58