Graph Diffusion & PCA Framework for Semi-supervised Learning

Conference paper

Learning and Intelligent Optimization (LION 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12931)

Abstract

A novel framework called Graph Diffusion & PCA (GDPCA) is proposed in the context of semi-supervised learning on graph-structured data. It combines a modified Principal Component Analysis with the classical supervised loss and Laplacian regularization, thus handling the case where the adjacency matrix is sparse and avoiding the curse of dimensionality. Our framework can also be applied to non-graph datasets, such as images, by constructing a similarity graph. GDPCA improves node classification by enriching the local graph structure with node covariance. We demonstrate the performance of GDPCA in experiments on citation networks and images, and we show that GDPCA compares favourably with the best state-of-the-art algorithms while having significantly lower computational complexity.

Supported by MyDataModels company.

Notes

  1. https://clinicaltrials.gov.

  2. https://github.com/KamalovMikhail/GDPCA.

  3. \(Z = \alpha \left( D^{\sigma -1} AD^{-\sigma } + \delta S D^{-2\sigma + 1} \right) Z + (1- \alpha )Y\) (see the iteration sketch after these notes).

  4. https://clinicaltrials.gov/ct2/resources/download#DownloadMultipleRecords.

  5. To simplify the labeling process, we replaced the long masking descriptions with shorter versions (e.g. "Single Blind (Participant, Investigator)" becomes "Blind").

  6. https://github.com/KamalovMikhail/GDPCA.

  7. https://github.com/tkipf/gcn.

  8. https://github.com/kimiyoung/planetoid.

  9. https://github.com/KamalovMikhail/GDPCA.
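The equation in note 3 defines a simple fixed-point iteration for \(Z\). Below is a minimal NumPy sketch of that update, assuming a dense symmetric adjacency matrix with no isolated nodes; the function name and default parameter values are illustrative and are not taken from the authors' released code:

```python
import numpy as np

def gdpca_iterate(A, S, Y, alpha=0.9, delta=0.1, sigma=0.5, n_iter=100):
    """Fixed-point iteration from note 3:
    Z <- alpha * (D^(sigma-1) A D^(-sigma) + delta * S D^(-2*sigma+1)) Z
         + (1 - alpha) * Y.
    Assumes a dense symmetric adjacency matrix A with no isolated nodes."""
    d = A.sum(axis=1).astype(float)      # node degrees
    D_pow = lambda p: np.diag(d ** p)    # degree matrix D raised to power p
    B = D_pow(sigma - 1) @ A @ D_pow(-sigma) + delta * S @ D_pow(-2 * sigma + 1)
    Z = Y.astype(float).copy()
    for _ in range(n_iter):              # converges when Proposition 2 holds
        Z = alpha * B @ Z + (1 - alpha) * Y
    return Z
```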

References

  1. Avrachenkov, K., Mishenin, A., Gonçalves, P., Sokol, M.: Generalized optimization framework for graph-based semi-supervised learning. In: Proceedings of the 2012 SIAM International Conference on Data Mining, pp. 966–974. SIAM (2012)

  2. Baeza-Yates, R., Navarro, G.: Block addressing indices for approximate text retrieval. J. Am. Soc. Inf. Sci. 51(1), 69–82 (2000)

  3. Bair, E., Hastie, T., Paul, D., Tibshirani, R.: Prediction by supervised principal components. J. Am. Stat. Assoc. 101(473), 119–137 (2006)

  4. Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7, 2399–2434 (2006)

  5. Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13(2) (2012)

  6. Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods, vol. 23. Prentice Hall, Englewood Cliffs (1989)

  7. Chapelle, O., Scholkopf, B., Zien, A.: Semi-supervised learning (Chapelle, O. et al., eds.; 2006) [book reviews]. IEEE Trans. Neural Netw. 20(3), 542 (2009)

  8. Day, S.J., Altman, D.G.: Blinding in clinical trials and other studies. BMJ 321(7259), 504 (2000)

  9. Ding, C., He, X.: K-means clustering via principal component analysis. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 29 (2004)

  10. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7(2), 179–188 (1936)

  11. Fix, E.: Discriminatory analysis: nonparametric discrimination, consistency properties. USAF School of Aviation Medicine (1951)

  12. Freund, R.M.: Quadratic functions, optimization, and quadratic forms (2004)

  13. Grover, A., Zweig, A., Ermon, S.: Graphite: iterative generative modeling of graphs. In: International Conference on Machine Learning, pp. 2434–2444. PMLR (2019)

  14. Halko, N., Martinsson, P.G., Tropp, J.A.: Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53(2), 217–288 (2011)

  15. Harris, Z.S.: Distributional structure. Word 10(2–3), 146–162 (1954)

  16. Joachims, T.: Transductive inference for text classification using support vector machines. In: ICML, vol. 99, pp. 200–209 (1999)

  17. Johnson, R., Zhang, T.: Graph-based semi-supervised learning and spectral kernel design. IEEE Trans. Inf. Theory 54(1), 275–288 (2008)

  18. Kamalov, M., Avrachenkov, K.: GenPR: generative PageRank framework for semi-supervised learning on citation graphs. In: Filchenkov, A., Kauttonen, J., Pivovarova, L. (eds.) AINL 2020. CCIS, vol. 1292, pp. 158–165. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59082-6_12

  19. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: 5th International Conference on Learning Representations, ICLR (2017)

  20. Klicpera, J., Bojchevski, A., Günnemann, S.: Predict then propagate: graph neural networks meet personalized PageRank. arXiv preprint arXiv:1810.05997 (2018)

  21. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

  22. Masic, I., Miokovic, M., Muhamedagic, B.: Evidence based medicine - new approaches and challenges. Acta Informatica Medica 16(4), 219 (2008)

  23. Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710 (2014)

  24. Rifai, S., Dauphin, Y.N., Vincent, P., Bengio, Y., Muller, X.: The manifold tangent classifier. Adv. Neural Inf. Process. Syst. 24, 2294–2302 (2011)

  25. Ritchie, A., Scott, C., Balzano, L., Kessler, D., Sripada, C.S.: Supervised principal component analysis via manifold optimization. In: 2019 IEEE Data Science Workshop (DSW), pp. 6–10. IEEE (2019)

  26. Rojo, O., Soto, R., Rojo, H.: Bounds for the spectral radius and the largest singular value. Comput. Math. Appl. 36(1), 41–50 (1998)

  27. Roli, F., Marcialis, G.L.: Semi-supervised PCA-based face recognition using self-training. In: Yeung, D.-Y., Kwok, J.T., Fred, A., Roli, F., de Ridder, D. (eds.) SSPR/SPR 2006. LNCS, vol. 4109, pp. 560–568. Springer, Heidelberg (2006). https://doi.org/10.1007/11815921_61

  28. Saad, Y., Schultz, M.H.: GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 7(3), 856–869 (1986)

  29. Sen, P., Namata, G., Bilgic, M., Getoor, L., Galligher, B., Eliassi-Rad, T.: Collective classification in network data. AI Mag. 29(3), 93 (2008)

  30. Walder, C., Henao, R., Mørup, M., Hansen, L.: Semi-supervised kernel PCA. IMM Technical Report 2010-10, Technical University of Denmark, DTU Informatics (2010)

  31. Weston, J., Ratle, F., Mobahi, H., Collobert, R.: Deep learning via semi-supervised embedding. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade. LNCS, vol. 7700, pp. 639–655. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_34

  32. Yang, Z., Cohen, W., Salakhudinov, R.: Revisiting semi-supervised learning with graph embeddings. In: Proceedings of Machine Learning Research, vol. 48, pp. 40–48. PMLR, New York, 20–22 June 2016

  33. Zhu, X., Ghahramani, Z.: Learning from labeled and unlabeled data with label propagation (2002)

  34. Zhu, X., Ghahramani, Z., Lafferty, J.D.: Semi-supervised learning using Gaussian fields and harmonic functions. In: Proceedings of the 20th International Conference on Machine Learning (ICML 2003), pp. 912–919 (2003)

  35. Zima, M.: A theorem on the spectral radius of the sum of two operators and its application. Bull. Aust. Math. Soc. 48(3), 427–434 (1993)

Author information

Correspondence to Mikhail Kamalov.

Appendices

A Proof of Proposition 1

Proof

This proof uses the same strategy as the proof of Proposition 2 in [1]. We rewrite Problem (4) in matrix form with the standard Laplacian \(L = D - A\), where \(Z_{.i}, Y_{.i} \in \mathbb {R}^{n \times 1}\) denote the \(i\)-th columns of \(Z\) and \(Y\):

$$\begin{aligned} Q(Z)&= 2 \sum _{i=1}^{k} Z^T_{.i} D^{\sigma - 1} L D^{\sigma - 1} Z_{.i} \\&\quad + \mu \sum _{i=1}^{k} (Z_{.i} - Y_{.i})^T D^{2\sigma - 1}(Z_{.i} - Y_{.i}) - \delta \sum _{i=1}^{k} Z_{.i}^T S Z_{.i} \end{aligned}$$

where \(S = \bar{X}^T\bar{X}/(d-1) \in \mathbb {R}^{n \times n}\). Setting \(\frac{\partial Q(Z)}{\partial Z} = 0\):

$$\begin{aligned} 2 Z^T \left( D^{\sigma - 1} L D^{\sigma - 1} + D^{\sigma - 1} L^T D^{\sigma - 1} \right) + 2\mu (Z - Y)^T D^{2\sigma - 1} - \delta Z^T (S + S^T) = 0 \end{aligned}$$

Multiplying by \(D^{-2\sigma + 1}\) on the right and replacing \(L = D - A\) results in:

$$\begin{aligned} Z^T \left( 2I - 2D^{\sigma - 1} AD^{-\sigma } + \mu I - 2 \delta SD^{-2\sigma + 1} \right) - \mu Y^T = 0 \end{aligned}$$

Factoring \((2 + \mu )\) out of the parentheses and transposing the equation:

$$\begin{aligned} Z = \frac{\mu }{2 + \mu } \left( I - \frac{2}{2 + \mu } \left( D^{\sigma -1} AD^{-\sigma } + \delta SD^{-2\sigma + 1} \right) \right) ^{-1} Y \end{aligned}$$

Finally, the desired result is obtained with \(\alpha = 2/(2+\mu )\).   \(\square \)
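For illustration, the closed form above can be evaluated by solving a linear system rather than forming the inverse explicitly. This is a sketch under the assumption of dense matrices and positive node degrees; the function name and default parameter values are hypothetical, not taken from the paper's repository:

```python
import numpy as np

def gdpca_closed_form(A, S, Y, alpha=0.9, delta=0.1, sigma=0.5):
    """Closed form of Proposition 1: Z = (1 - alpha) * (I - alpha * B)^(-1) Y,
    where B = D^(sigma-1) A D^(-sigma) + delta * S D^(-2*sigma+1)
    and alpha = 2 / (2 + mu), so 1 - alpha = mu / (2 + mu)."""
    n = A.shape[0]
    d = A.sum(axis=1).astype(float)
    D_pow = lambda p: np.diag(d ** p)
    B = D_pow(sigma - 1) @ A @ D_pow(-sigma) + delta * S @ D_pow(-2 * sigma + 1)
    # Solve (I - alpha * B) Z = (1 - alpha) * Y instead of inverting.
    return (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * B, Y)
```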

B Proof of Proposition 2

Proof

Applying Theorem 1 of [35] on the spectral radius of a sum of operators gives the inequality:

$$\begin{aligned} \rho ( D^{\sigma - 1}A D^{-\sigma }+ \delta SD^{-2\sigma + 1} ) \le \rho \left( D^{\sigma - 1}A D^{-\sigma }\right) + \rho (\delta SD^{-2\sigma + 1})< 1/\alpha \end{aligned}$$

Since \(D^{\sigma - 1}A D^{-\sigma }\) is similar to a stochastic matrix, its spectral radius equals 1 (by the Gershgorin bounds), and the condition becomes:

$$\begin{aligned} 1 + \delta \rho (SD^{-2\sigma + 1})< 1/\alpha \end{aligned}$$

Applying Theorem 7 of [26], we replace \(\rho (SD^{-2\sigma + 1})\) by \(\gamma \), the maximum singular value of \(SD^{-2\sigma + 1}\), which yields the desired result in (6).    \(\square \)
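As a numerical sanity check, the sufficient condition derived above can be verified directly. The sketch below assumes that condition (6), which is not reproduced on this page, states exactly \(1 + \delta \gamma < 1/\alpha \); the helper name is hypothetical:

```python
import numpy as np

def proposition2_condition(A, S, alpha, delta, sigma):
    """Check the sufficient convergence condition 1 + delta * gamma < 1/alpha,
    where gamma is the largest singular value of S D^(-2*sigma+1)."""
    d = A.sum(axis=1).astype(float)
    M = S @ np.diag(d ** (-2 * sigma + 1))
    gamma = np.linalg.norm(M, ord=2)   # spectral norm = largest singular value
    return 1 + delta * gamma < 1 / alpha
```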

C Generation of Synthetic Adjacency Matrix

To select the best synthetic adjacency matrix for GDPCA, we considered three standard distances (Cosine, Minkowski, and Dice) and numbers of neighbours from 1 to 14 for the KNN algorithm. The accuracy of GDPCA over these parameters on the validation sets of the MNIST and CCT datasets is shown in Fig. 4. Figure 4 shows that the best GDPCA validation accuracy is obtained with 7 neighbours and the Dice distance for the CCT dataset, and with 7 neighbours and the Cosine distance for the MNIST dataset (a construction sketch follows Fig. 4).

Fig. 4. Estimation of different adjacency matrices for GDPCA (validation accuracy by distance metric and number of neighbours).
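A KNN similarity graph of the kind evaluated here can be built with scikit-learn's kneighbors_graph. The sketch below is illustrative rather than the authors' code; the symmetrisation step is our assumption, since the text does not specify how the directed KNN graph is made symmetric (the Dice metric additionally expects boolean features):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def build_synthetic_adjacency(X, n_neighbors=7, metric="cosine"):
    """Build a symmetric KNN similarity graph for non-graph data,
    e.g. metric="cosine" for MNIST or metric="dice" for CCT,
    with 7 neighbours as selected in Appendix C."""
    A = kneighbors_graph(X, n_neighbors=n_neighbors, metric=metric,
                         mode="connectivity", include_self=False)
    return A.maximum(A.T).toarray()   # symmetrise the directed KNN graph
```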

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Avrachenkov, K., Boisbunon, A., Kamalov, M. (2021). Graph Diffusion & PCA Framework for Semi-supervised Learning. In: Simos, D.E., Pardalos, P.M., Kotsireas, I.S. (eds) Learning and Intelligent Optimization. LION 2021. Lecture Notes in Computer Science, vol. 12931. Springer, Cham. https://doi.org/10.1007/978-3-030-92121-7_3

  • DOI: https://doi.org/10.1007/978-3-030-92121-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92120-0

  • Online ISBN: 978-3-030-92121-7

  • eBook Packages: Computer Science, Computer Science (R0)
