Abstract
Attributed network embedding has attracted considerable interest in recent years. It aims to learn task-independent, low-dimensional, continuous vectors for nodes that preserve both topology and attribute information. Most existing methods, such as random-walk-based methods and GCNs, focus mainly on local information, i.e., the attributes of a node's neighbours. They have therefore been well studied on assortative networks (i.e., networks with communities) but largely ignore disassortative networks (i.e., networks with multipartite, hub, and hybrid structures), which are common in the real world. To model both assortative and disassortative networks, we propose a block-based generative model for attributed network embedding from a probabilistic perspective. Specifically, nodes are assigned to several blocks, and nodes in the same block share similar linkage patterns. These patterns can define assortative networks containing communities or disassortative networks with multipartite, hub, or hybrid structures. To preserve attribute information, we assume that each node has a hidden embedding related to its assigned block, and we use a neural network to characterize the nonlinear relationship between node embeddings and node attributes. We perform extensive experiments on real-world and synthetic attributed networks. The results show that the proposed method consistently outperforms state-of-the-art embedding methods on both clustering and classification tasks, especially on disassortative networks.
Notes
In this paper, the terms cluster, group, and block are used interchangeably.
Acknowledgments
This work was supported by the National Natural Science Foundation of China under grant number 61876069; the Jilin Province Key Scientific and Technological Research and Development project under grant numbers 20180201067GX and 20180201044GX; the Jilin Province Natural Science Foundation under grant number 20200201036JC; the China Scholarship Council under grant numbers 201906170205 and 201906170208; and the Australian Research Council under grant number DP190101087.
Ethics declarations
Competing interests
The authors declare that they have no conflict of interest.
Appendix
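In this section, we detail the derivation of the complete-data likelihood and of the update rules for the parameters of our model.
Derivation of the complete-data likelihood
According to the generative process of attributed networks, the joint probability, i.e., the complete-data likelihood, is the product of four factors: the probability of the block assignments, of the node embeddings given the assignments, of the node attributes given the embeddings, and of the links given the assignments.
Each factor is defined as follows.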
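First, the node assignment follows a multinomial distribution. The probability that node i belongs to block k is ωk, and the assignment of each node is independent. Thus, the probability of assigning all nodes, i.e., of obtaining the vector \(\pmb {c} = \langle c_{1}, c_{2}, \ldots , c_{n} \rangle \), is the product of the per-node probabilities, \({\prod }_{i=1}^{n} \omega _{c_{i}}\).
Then, given that node i belongs to block ci, the embedding of node i follows a Gaussian distribution with mean \(\pmb {\mu }_{c_{i}}\) and standard deviation \(\pmb {\sigma }_{c_{i}}\). Thus, the probability of all embeddings is the product of these Gaussian densities over the n nodes.
As for the probability of generating node attributes, if \(\pmb {X} \in \{0, 1\}^{n\times M}\), each entry follows a Bernoulli distribution, i.e., the probability that node i has the m-th attribute is υim. Thus, the probability of X is the product, over all nodes i and attributes m, of υim if node i has attribute m and of 1 − υim otherwise.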
Similarly, if \(\pmb {X} \in \mathbb {R}^{n\times M}\), we can obtain
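Finally, the link between each pair of nodes follows a Bernoulli distribution, and each pair is generated independently. Given the node assignment, the probability that node i connects to node j is \(\pi _{c_{i}c_{j}}\). Thus, the probability of generating the links is the product, over all node pairs, of \(\pi _{c_{i}c_{j}}\) for linked pairs and of \(1 - \pi _{c_{i}c_{j}}\) for unlinked pairs.
Using a network with binary attributes as an example, substituting (14)-(16) and (18) into (13) yields the complete-data likelihood as the product of the block-assignment, embedding, attribute, and link probabilities given above.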
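To make the factorization concrete, the following is a minimal Python sketch of the generative process and of the complete-data log-likelihood for binary attributes. It illustrates the description above under stated assumptions and is not the authors' implementation: the names `sample_network`, `complete_data_loglik`, and `decode` are hypothetical, links are drawn for every ordered pair i ≠ j, each embedding dimension has its own standard deviation, and `decode` stands in for the neural network that maps an embedding to the attribute probabilities υim.

```python
import math
import random


def bernoulli_logpmf(x, p, eps=1e-12):
    """log of p^x (1-p)^(1-x), with p clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return x * math.log(p) + (1 - x) * math.log(1.0 - p)


def gaussian_logpdf(z, mean, std):
    """log density of a univariate Gaussian with the given mean and std."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (z - mean) ** 2 / (2.0 * std * std)


def sample_network(n, omega, mu, sigma, pi, decode, rng):
    """Draw (c, Z, X, A) by following the four generative steps in order."""
    K = len(omega)
    # 1. Block assignment: node i joins block k with probability omega[k].
    c = [rng.choices(range(K), weights=omega)[0] for _ in range(n)]
    # 2. Embedding: Gaussian with the mean and std of the assigned block.
    Z = [[rng.gauss(m, s) for m, s in zip(mu[k], sigma[k])] for k in c]
    # 3. Binary attributes: Bernoulli with probabilities decode(z_i) (assumed decoder).
    X = [[int(rng.random() < u) for u in decode(z)] for z in Z]
    # 4. Links: Bernoulli with probability pi[c_i][c_j]; no self-links (assumption).
    A = [[int(i != j and rng.random() < pi[c[i]][c[j]]) for j in range(n)]
         for i in range(n)]
    return c, Z, X, A


def complete_data_loglik(c, Z, X, A, omega, mu, sigma, pi, decode):
    """Sum of the logs of the four factors of the complete-data likelihood."""
    n = len(c)
    ll = sum(math.log(omega[k]) for k in c)                      # assignments
    ll += sum(gaussian_logpdf(z, m, s)                           # embeddings
              for zi, k in zip(Z, c)
              for z, m, s in zip(zi, mu[k], sigma[k]))
    ll += sum(bernoulli_logpmf(x, u)                             # attributes
              for zi, xi in zip(Z, X)
              for x, u in zip(xi, decode(zi)))
    ll += sum(bernoulli_logpmf(A[i][j], pi[c[i]][c[j]])          # links
              for i in range(n) for j in range(n) if i != j)
    return ll
```

Choosing a block matrix `pi` with large off-diagonal and small diagonal entries makes the sampler produce a disassortative (multipartite-like) network, whereas a dominant diagonal produces communities.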
Derivation of the parameter update rules
First, the terms in (7) that involve τ are:
Setting \(\frac {\partial {\mathscr{L}}_{[\tau _{ik}]}}{\partial \tau _{ik}}=0\), we can update τik by
Then, we optimize πkl:
Setting \(\frac {\partial {\mathscr{L}}_{[\pi _{kl}]}}{\partial \pi _{kl}}=0\), we obtain
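As an illustration of this step, the sketch below computes a closed-form update for πkl. It assumes that the link-related part of the objective has the standard mean-field block-model form, i.e., the sum over ordered pairs i ≠ j and blocks k, l of τik τjl [aij log πkl + (1 − aij) log(1 − πkl)], where aij denotes the adjacency entry; this form, the symbol aij, and the function name `update_pi` are assumptions of the sketch. Under that assumption, setting the derivative to zero gives πkl as a τ-weighted fraction of linked pairs.

```python
def update_pi(tau, A):
    """Closed-form block-matrix update under the assumed mean-field objective.

    tau[i][k] is the (soft) probability that node i belongs to block k,
    A[i][j] is 1 if node i links to node j and 0 otherwise.
    """
    n, K = len(tau), len(tau[0])
    pi = [[0.0] * K for _ in range(K)]
    for k in range(K):
        for l in range(K):
            num = 0.0  # weighted count of linked pairs between blocks k and l
            den = 0.0  # weighted count of all pairs between blocks k and l
            for i in range(n):
                for j in range(n):
                    if i != j:
                        w = tau[i][k] * tau[j][l]
                        num += w * A[i][j]
                        den += w
            # Leave pi[k][l] at 0 when no pair carries weight for (k, l).
            pi[k][l] = num / den if den > 0 else 0.0
    return pi
```

For a complete bipartite network with hard assignments (two nodes per side), this update returns a block matrix with 0 on the diagonal and 1 off the diagonal, which is the kind of disassortative pattern the model is meant to capture.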
Next, the terms related to ωk are
Since \({\sum }_{k=1}^{K}\omega _{k}=1\), we take the derivative of \({\mathscr{L}}_{[\omega _{k}]} + \upbeta ({\sum }_{k}\omega _{k} - 1)\) with respect to ωk, where \(\upbeta \) is a Lagrange multiplier, and set the derivative to zero. Then, we can obtain the update formula for ωk as follows:
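The Lagrange-multiplier step can be worked through explicitly. The sketch below assumes that the terms of the objective involving ωk have the standard mean-field form \(\sum_i \tau_{ik} \log \omega_k\) and that each row of τ sums to one; both are assumptions of this sketch, so the result should be read as the usual form of such an update rather than as a quotation of the paper's formula.

```latex
\begin{align*}
\frac{\partial}{\partial \omega_k}
  \Big[\sum_{i=1}^{n}\sum_{k'=1}^{K}\tau_{ik'}\log\omega_{k'}
       + \beta\Big(\sum_{k'=1}^{K}\omega_{k'}-1\Big)\Big]
  &= \frac{1}{\omega_k}\sum_{i=1}^{n}\tau_{ik} + \beta = 0
  \;\Longrightarrow\;
  \omega_k = -\frac{1}{\beta}\sum_{i=1}^{n}\tau_{ik}, \\
\sum_{k=1}^{K}\omega_k = 1
  &\;\Longrightarrow\;
  \beta = -\sum_{i=1}^{n}\sum_{k=1}^{K}\tau_{ik} = -n, \\
\omega_k &= \frac{1}{n}\sum_{i=1}^{n}\tau_{ik}.
\end{align*}
```

Under these assumptions, ωk is simply the average soft membership of block k over all n nodes.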
In the same way, we can obtain the terms related to μkd and σkd as follows:
and
Setting \(\frac {\partial {\mathscr{L}}_{[\mu _{kd}]}}{\partial \mu _{kd}}=0\) and \(\frac {\partial {\mathscr{L}}_{[\sigma _{kd}]}}{\partial \sigma _{kd}}=0\), we derive the update rules for μkd and σkd as
and
respectively.
About this article
Cite this article
Liu, X., Yang, B., Song, W. et al. A block-based generative model for attributed network embedding. World Wide Web 24, 1439–1464 (2021). https://doi.org/10.1007/s11280-021-00918-y
DOI: https://doi.org/10.1007/s11280-021-00918-y