Abstract
Academic genealogy graphs capture information about the lineage of researchers, encode how knowledge flows from advisors to protégés, and shed light on the birth and evolution of disciplines. In this paper, we study the academic genealogy graph/network (AGN) in Shodhganga, the Indian Electronic Theses and Dissertations (ETD) database. We have disambiguated the names of the researchers in Shodhganga and constructed the Shodhganga-AGN, which we have analyzed with topological metrics proposed in the literature on general graphs as well as in the literature on genealogy networks. These metrics identify the institutes and researchers that have played a significant role in the development of the Indian higher education system. The largest connected component of Shodhganga-AGN consists of 1356 researchers and 1437 advisor–advisee relationships. The component is dominated by researchers from science, and its researchers are affiliated primarily with three institutions. We have also studied subgraphs in the genealogy network to identify supervision patterns, and found that most of the subgraph instances connect researchers within a single institution or subject. Thus, our study is a detailed analysis of the academic genealogy of researchers indexed in Shodhganga, and it captures the decades-old research ecosystem of India, as expressed through the formal advisor–advisee relationships in Indian universities.



















Notes
The dropout rate is not known, but we assumed it to be more than 2%.
In Algorithm 2, the threshold value is dynamic.
Structural variation refers to missing characters, longer or shorter versions of names, etc.
The May 2018 AFT data dump is used.
We assume that each researcher has only one doctoral degree.
This refers to the involvement of multiple advisors in advising a researcher in the AGN.
Note: We have not identified the subgraphs based on their frequency of occurrence in the random graphs and Shodhganga-AGN.
Note: Temporal information is not considered for detecting subgraphs.
The advisor–advisee connections in Shodhganga-AGN that follow the predefined subgraph structures are referred to as instances of the subgraph structures (shown in Fig. 13a).
Ethics declarations
Conflict of Interest
The authors did not receive support from any organization for the submitted work and have no competing financial interests to declare.
Appendix
Algorithms flowchart
Algorithm 2: In all the above cases, we impose the natural constraint that the thesis submission date of the advisor (who is potentially the same person as an advisee) must be earlier than the advisee's thesis submission date. When more than one candidate (advisee) could be merged with the advisor, we further compare the candidates across the available attributes to find the best one.
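The date constraint and the tie-breaking step described for Algorithm 2 can be illustrated with a minimal sketch. It is not the authors' implementation: the record fields (`student`, `institute`, `department`, `year`), the use of a submission year instead of a full date, and the simple attribute-match score are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ThesisRecord:
    """Hypothetical thesis record; the field names are illustrative only."""
    student: str
    institute: str
    department: str
    year: int  # assumed stand-in for the thesis submission date


def best_candidate(advised: ThesisRecord,
                   candidates: List[ThesisRecord]) -> Optional[ThesisRecord]:
    """Pick the candidate record most likely to be the advisor's own thesis.

    `advised` is a thesis supervised by the advisor; `candidates` are thesis
    records whose student name matches the advisor's name.
    """
    # Natural constraint: the advisor's own thesis must predate the advised thesis.
    eligible = [c for c in candidates if c.year < advised.year]
    if not eligible:
        return None
    if len(eligible) == 1:
        return eligible[0]

    # Several candidates remain: compare them across the available attributes.
    def score(c: ThesisRecord) -> int:
        return int(c.institute == advised.institute) + int(c.department == advised.department)

    # max() keeps the first candidate when scores tie.
    return max(eligible, key=score)


advised = ThesisRecord("A Student", "Institute X", "Physics", 2010)
candidates = [
    ThesisRecord("R Advisor", "Institute Y", "Chemistry", 1990),
    ThesisRecord("R Advisor", "Institute X", "Physics", 1995),
    ThesisRecord("R Advisor", "Institute X", "Physics", 2015),  # violates the date constraint
]
chosen = best_candidate(advised, candidates)
```

Here the 2015 record is discarded by the date constraint, and the 1995 record is preferred over the 1990 one because it matches on both institute and department.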
Algorithm 3: Initially, students with the same name, department, and institute are assigned the same index, even if they refer to different individuals. In that scenario, if advisor information (the advisor's thesis) is available, we use it to find the most likely student among all the students with the same name associated with that advisor. After that, the remaining students are grouped by thesis title and assigned different indices. (Assumption: it is very unlikely that two students with the same name, in the same institute and department, have different thesis titles associated with the same advisor.)
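The last step of Algorithm 3, grouping same-name students by thesis title and giving each group its own index, might look like the sketch below. It covers only that grouping step, not the advisor-based selection. The dictionary keys (`name`, `institute`, `department`, `title`) and the case and whitespace normalization are assumptions, not details taken from the paper.

```python
from typing import Dict, List, Tuple


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def assign_indices(records: List[Dict[str, str]]) -> List[int]:
    """Return one student index per record.

    Records that share name, institute, department, and thesis title get the
    same index; a different title yields a new index.
    """
    index_of: Dict[Tuple[str, str, str, str], int] = {}
    result: List[int] = []
    for rec in records:
        key = (
            normalize(rec["name"]),
            normalize(rec["institute"]),
            normalize(rec["department"]),
            normalize(rec["title"]),
        )
        if key not in index_of:
            index_of[key] = len(index_of)  # next unused index
        result.append(index_of[key])
    return result


records = [
    {"name": "S Rao", "institute": "Institute X", "department": "Botany",
     "title": "Studies on plant growth"},
    {"name": "S Rao", "institute": "Institute X", "department": "Botany",
     "title": "Studies on  Plant Growth"},  # same title, different spacing and case
    {"name": "S Rao", "institute": "Institute X", "department": "Botany",
     "title": "A survey of soil fungi"},
]
indices = assign_indices(records)
```

The first two records are treated as one student because their titles agree after normalization, while the third record receives a separate index.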
Algorithm 4: Applicable if multiple advisors advised the student (multiple records exist, and the same student name varies due to human error; see Appendix Table 2), if duplicate records exist, or if thesis titles were wrongly combined. The function merges students with similar names (with slight variation) who have the same thesis title. If the student names differ across the same thesis index (similarity value less than the threshold value), there is a chance that two different thesis indices were merged wrongly. In that case, we do not change the student indices.
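As a rough illustration of the merge rule in Algorithm 4, the sketch below merges student indices only when two records share a thesis title and their names are sufficiently similar. The source does not state which similarity measure or threshold is used; `difflib.SequenceMatcher` and the value `0.8` are assumptions, as are the record keys `name`, `title`, and `index`.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Union

Record = Dict[str, Union[str, int]]


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_same_title(records: List[Record], threshold: float = 0.8) -> List[int]:
    """Return updated student indices, one per record.

    Records with the same thesis title are compared with the first record of
    that title; the index is merged only if the names are similar enough.
    """
    indices = [int(r["index"]) for r in records]
    by_title: Dict[str, List[int]] = {}
    for pos, rec in enumerate(records):
        by_title.setdefault(str(rec["title"]).lower().strip(), []).append(pos)

    for positions in by_title.values():
        first = positions[0]
        for pos in positions[1:]:
            sim = name_similarity(str(records[first]["name"]), str(records[pos]["name"]))
            if sim >= threshold:
                indices[pos] = indices[first]  # slight name variation: same student
            # Otherwise the titles may have been merged wrongly, so the index is left unchanged.
    return indices


records = [
    {"name": "Ramesh Gupta", "title": "Thesis T", "index": 0},
    {"name": "Ramesh Gupt", "title": "Thesis T", "index": 1},   # spelling variation
    {"name": "Anita Sharma", "title": "Thesis T", "index": 2},  # clearly a different name
]
merged = merge_same_title(records)
```

With these inputs the misspelled record is merged into index 0, while the record with a clearly different name keeps its own index.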
Sample of researcher records with few attributes
Top researchers in Shodhganga-AGN based on different genealogical metrics
Institute abbreviation, NIRF ranking, and DDC subject code with subject name
Extracted subgraph structures from Shodhganga-AGN
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kumar, D., Bhowmick, P.K., Dey, S. et al. On the banks of Shodhganga: analysis of the academic genealogy graph of an Indian ETD repository. Scientometrics 128, 3879–3914 (2023). https://doi.org/10.1007/s11192-023-04728-z
DOI: https://doi.org/10.1007/s11192-023-04728-z