Abstract
A novel framework called Graph Diffusion & PCA (GDPCA) is proposed in the context of semi-supervised learning on graph-structured data. It combines a modified Principal Component Analysis with the classical supervised loss and Laplacian regularization, thus handling the case where the adjacency matrix is sparse and avoiding the curse of dimensionality. Our framework can also be applied to non-graph datasets, such as images, by constructing a similarity graph. GDPCA improves node classification by enriching the local graph structure with node covariance. We demonstrate the performance of GDPCA in experiments on citation networks and images, showing that it compares favourably with the best state-of-the-art algorithms while having significantly lower computational complexity.
Supported by the MyDataModels company.
Notes
- 3.
\(Z = \alpha \left( D^{\sigma -1} AD^{-\sigma } + \delta S D^{-2\sigma + 1} \right) Z + (1- \alpha )Y\) (a minimal iteration sketch based on this equation follows these notes).
- 5.
To simplify the labeling process, we replaced the long descriptions of masking by shorter versions (e.g. "Single Blind (Participant, Investigator)" by "Blind").
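For concreteness, the fixed-point equation in Note 3 can be iterated directly. The following NumPy sketch is only an illustration of that equation: the function name, array shapes and hyperparameter defaults are our assumptions, not the paper's implementation.

```python
import numpy as np

def gdpca_propagate(A, X_bar, Y, sigma=0.5, delta=0.1, alpha=0.9, n_iter=100):
    """Iterate the fixed point from Note 3:
    Z <- alpha * (D^{sigma-1} A D^{-sigma} + delta * S D^{-2sigma+1}) Z + (1-alpha) Y.

    A: (n, n) symmetric adjacency (every node assumed to have degree >= 1),
    X_bar: (d, n) centered feature matrix, Y: (n, c) one-hot labels with
    zero rows for unlabeled nodes. Defaults are illustrative only.
    """
    d = X_bar.shape[0]
    S = X_bar.T @ X_bar / (d - 1)            # S in R^{n x n}, as in Appendix A
    deg = A.sum(axis=1).astype(float)
    Dp = lambda p: np.diag(deg ** p)         # D^p for the diagonal degree matrix D
    M = Dp(sigma - 1) @ A @ Dp(-sigma) + delta * S @ Dp(-2 * sigma + 1)
    Z = Y.astype(float).copy()
    for _ in range(n_iter):                  # converges when rho(alpha * M) < 1
        Z = alpha * M @ Z + (1 - alpha) * Y
    return Z
```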
References
Avrachenkov, K., Mishenin, A., Gonçalves, P., Sokol, M.: Generalized optimization framework for graph-based semi-supervised learning. In: Proceedings of the 2012 SIAM International Conference on Data Mining, pp. 966–974. SIAM (2012)
Baeza-Yates, R., Navarro, G.: Block addressing indices for approximate text retrieval. J. Am. Soc. Inf. Sci. 51(1), 69–82 (2000)
Bair, E., Hastie, T., Paul, D., Tibshirani, R.: Prediction by supervised principal components. J. Am. Stat. Assoc. 101(473), 119–137 (2006)
Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7, 2399–2434 (2006)
Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)
Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods, vol. 23. Prentice Hall, Englewood Cliffs (1989)
Chapelle, O., Schölkopf, B., Zien, A. (eds.): Semi-Supervised Learning. MIT Press, Cambridge (2006)
Day, S.J., Altman, D.G.: Blinding in clinical trials and other studies. BMJ 321(7259), 504 (2000)
Ding, C., He, X.: K-means clustering via principal component analysis. In: Proceedings of the Twenty-first International Conference on Machine Learning, p. 29 (2004)
Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7(2), 179–188 (1936)
Fix, E.: Discriminatory analysis: nonparametric discrimination, consistency properties. USAF School of Aviation Medicine (1951)
Freund, R.M.: Quadratic functions, optimization, and quadratic forms (2004)
Grover, A., Zweig, A., Ermon, S.: Graphite: iterative generative modeling of graphs. In: International Conference on Machine Learning, pp. 2434–2444. PMLR (2019)
Halko, N., Martinsson, P.G., Tropp, J.A.: Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53(2), 217–288 (2011)
Harris, Z.S.: Distributional structure. Word 10(2–3), 146–162 (1954)
Joachims, T.: Transductive inference for text classification using support vector machines. In: ICML, vol. 99, pp. 200–209 (1999)
Johnson, R., Zhang, T.: Graph-based semi-supervised learning and spectral kernel design. IEEE Trans. Inf. Theory 54(1), 275–288 (2008)
Kamalov, M., Avrachenkov, K.: GenPR: generative PageRank framework for semi-supervised learning on citation graphs. In: Filchenkov, A., Kauttonen, J., Pivovarova, L. (eds.) AINL 2020. CCIS, vol. 1292, pp. 158–165. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59082-6_12
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: 5th International Conference on Learning Representations. ICLR (2017)
Klicpera, J., Bojchevski, A., Günnemann, S.: Predict then propagate: graph neural networks meet personalized PageRank. arXiv preprint arXiv:1810.05997 (2018)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Masic, I., Miokovic, M., Muhamedagic, B.: Evidence based medicine - new approaches and challenges. Acta Informatica Medica 16(4), 219 (2008)
Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710 (2014)
Rifai, S., Dauphin, Y.N., Vincent, P., Bengio, Y., Muller, X.: The manifold tangent classifier. Adv. Neural Inf. Process. Syst. 24, 2294–2302 (2011)
Ritchie, A., Scott, C., Balzano, L., Kessler, D., Sripada, C.S.: Supervised principal component analysis via manifold optimization. In: 2019 IEEE Data Science Workshop (DSW), pp. 6–10. IEEE (2019)
Rojo, O., Soto, R., Rojo, H.: Bounds for the spectral radius and the largest singular value. Comput. Math. Appl. 36(1), 41–50 (1998)
Roli, F., Marcialis, G.L.: Semi-supervised PCA-based face recognition using self-training. In: Yeung, D.-Y., Kwok, J.T., Fred, A., Roli, F., de Ridder, D. (eds.) SSPR/SPR 2006. LNCS, vol. 4109, pp. 560–568. Springer, Heidelberg (2006). https://doi.org/10.1007/11815921_61
Saad, Y., Schultz, M.H.: GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 7(3), 856–869 (1986)
Sen, P., Namata, G., Bilgic, M., Getoor, L., Galligher, B., Eliassi-Rad, T.: Collective classification in network data. AI Mag. 29(3), 93 (2008)
Walder, C., Henao, R., Mørup, M., Hansen, L.: Semi-Supervised Kernel PCA. IMM-Technical Report-2010-10, Technical University of Denmark, DTU Informatics, Building 321 (2010)
Weston, J., Ratle, F., Mobahi, H., Collobert, R.: Deep learning via semi-supervised embedding. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade. LNCS, vol. 7700, pp. 639–655. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_34
Yang, Z., Cohen, W., Salakhudinov, R.: Revisiting semi-supervised learning with graph embeddings. In: Proceedings of Machine Learning Research, vol. 48, pp. 40–48. PMLR, New York, 20–22 June 2016
Zhu, X., Ghahramani, Z.: Learning from labeled and unlabeled data with label propagation. Technical Report CMU-CALD-02-107, Carnegie Mellon University (2002)
Zhu, X., Ghahramani, Z., Lafferty, J.D.: Semi-supervised learning using Gaussian fields and harmonic functions. In: Proceedings of the 20th International Conference on Machine Learning (ICML 2003), pp. 912–919 (2003)
Zima, M.: A theorem on the spectral radius of the sum of two operators and its application. Bull. Aust. Math. Soc. 48(3), 427–434 (1993)
Appendices
A Proof of Proposition 1
Proof
This proof uses the same strategy as the proof of Proposition 2 in [1]. Rewriting Problem (4) in matrix form with the standard Laplacian \(L = D - A\) and with \(Z_{.i}, Y_{.i} \in \mathbb{R}^{n \times 1}\):
\[ Q(Z_{.i}) = 2 Z_{.i}^T D^{\sigma-1} L D^{\sigma-1} Z_{.i} - 2\delta\, Z_{.i}^T S Z_{.i} + \mu (Z_{.i} - Y_{.i})^T D^{2\sigma-1} (Z_{.i} - Y_{.i}), \]
where \(S = \bar{X}^T\bar{X}/(d-1) \in \mathbb{R}^{n \times n}\). Considering \(\frac{\partial Q(Z_{.i})}{\partial Z_{.i}} = 0\):
\[ 2 D^{\sigma-1} L D^{\sigma-1} Z_{.i} - 2\delta S Z_{.i} + \mu D^{2\sigma-1} (Z_{.i} - Y_{.i}) = 0. \]
Multiplying by \(D^{-2\sigma+1}\) and replacing \(L = D - A\) results in:
\[ (2 + \mu) Z_{.i} = 2 \left( D^{-\sigma} A D^{\sigma-1} + \delta D^{-2\sigma+1} S \right) Z_{.i} + \mu Y_{.i}. \]
Dividing by \(2 + \mu\) and transposing the equation (recall that \(A\) and \(S\) are symmetric):
\[ Z_{.i}^T = \frac{2}{2+\mu}\, Z_{.i}^T \left( D^{\sigma-1} A D^{-\sigma} + \delta S D^{-2\sigma+1} \right) + \frac{\mu}{2+\mu}\, Y_{.i}^T. \]
Finally, the desired result is obtained with \(\alpha = 2/(2+\mu)\). \(\square\)
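As a practical aside, the fixed point derived above can also be computed in closed form by solving the corresponding linear system rather than iterating. A minimal NumPy sketch, where M stands for the propagation matrix \(D^{\sigma-1}AD^{-\sigma} + \delta S D^{-2\sigma+1}\) (the function and argument names are illustrative assumptions):

```python
import numpy as np

def gdpca_solve(M, Y, alpha):
    # Solve (I - alpha * M) Z = (1 - alpha) * Y, i.e. the fixed point of
    # Z = alpha * M @ Z + (1 - alpha) * Y from Note 3, in one shot.
    n = M.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * M, (1 - alpha) * Y)
```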
B Proof of Proposition 2
Proof
Applying Theorem 1 on the spectral radius of a sum of two operators [35] yields:
\[ \rho\left( D^{\sigma-1} A D^{-\sigma} + \delta S D^{-2\sigma+1} \right) \le \rho\left( D^{\sigma-1} A D^{-\sigma} \right) + \delta\, \rho\left( S D^{-2\sigma+1} \right). \]
Since \(D^{\sigma-1} A D^{-\sigma}\) is similar to the stochastic matrix \(A D^{-1}\), its spectral radius is equal to 1 (Gershgorin bounds), so:
\[ \rho\left( D^{\sigma-1} A D^{-\sigma} + \delta S D^{-2\sigma+1} \right) \le 1 + \delta\, \rho\left( S D^{-2\sigma+1} \right). \]
Applying Theorem 7 of [26] to replace \(\rho(S D^{-2\sigma+1})\) by \(\gamma\), the maximum singular value of \(S D^{-2\sigma+1}\), we obtain the desired result in (6). \(\square\)
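To make the bound concrete, here is a small numerical check on a random graph (the shapes, constants and centering step are illustrative assumptions; \(\sigma = 0.5\) keeps \(D^{\sigma-1}AD^{-\sigma}\) symmetric, so the inequality is easy to verify):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma, delta = 50, 20, 0.5, 0.3
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.triu(A, 1); A = A + A.T                       # random undirected graph
A = np.maximum(A, np.eye(n, k=1) + np.eye(n, k=-1))  # ring edges: no isolated nodes
X = rng.standard_normal((d, n))
X_bar = X - X.mean(axis=0)                           # centered features
S = X_bar.T @ X_bar / (d - 1)
deg = A.sum(axis=1)
Dp = lambda p: np.diag(deg ** p)
M = Dp(sigma - 1) @ A @ Dp(-sigma) + delta * S @ Dp(-2 * sigma + 1)
rho = np.abs(np.linalg.eigvals(M)).max()
gamma = np.linalg.svd(S @ Dp(-2 * sigma + 1), compute_uv=False)[0]
print(rho, "<=", 1 + delta * gamma)                  # bound of Proposition 2
```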
C Generation of Synthetic Adjacency Matrix
To select the best synthetic adjacency matrix for GDPCA, we considered three standard distances (cosine, Minkowski and Dice) and numbers of neighbours from 1 to 14 for the KNN algorithm. The accuracy of GDPCA over these parameters on the validation set for the MNIST and CCT datasets is shown in Fig. 4. The best validation accuracy is obtained with 7 neighbours and the Dice distance for the CCT dataset, and with 7 neighbours and the cosine distance for the MNIST dataset.
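As an illustration, a KNN similarity graph of this kind can be built with scikit-learn. The sketch below is a minimal example under our own assumptions (function name, preprocessing and symmetrization are not taken from the paper); the neighbour count and metric are the values selected above.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def build_knn_adjacency(X, n_neighbors=7, metric="cosine"):
    """Build a symmetric KNN similarity graph from row-wise feature vectors X.

    Defaults follow the validation-set choice for MNIST (7 neighbours, cosine);
    metric="dice" (on binary features) would correspond to the CCT choice.
    """
    A = kneighbors_graph(X, n_neighbors=n_neighbors, metric=metric,
                         mode="connectivity").toarray()
    return np.maximum(A, A.T)  # symmetrize so the graph is undirected
```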
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Avrachenkov, K., Boisbunon, A., Kamalov, M. (2021). Graph Diffusion & PCA Framework for Semi-supervised Learning. In: Simos, D.E., Pardalos, P.M., Kotsireas, I.S. (eds.) Learning and Intelligent Optimization. LION 2021. Lecture Notes in Computer Science, vol. 12931. Springer, Cham. https://doi.org/10.1007/978-3-030-92121-7_3
DOI: https://doi.org/10.1007/978-3-030-92121-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92120-0
Online ISBN: 978-3-030-92121-7