
Robust generalized canonical correlation analysis


Abstract

Generalized canonical correlation analysis (GCCA) has been widely used for classification and regression problems. The key idea of GCCA is to map data from different views into a common space with minimum reconstruction error. However, GCCA employs the squared Frobenius norm as its distance metric when seeking the latent correlated space and has no specific strategy for coping with outliers, which can misguide training in real-world applications and lead to suboptimal performance. This motivates a novel robust formulation of GCCA, namely GCCA with p-order (\(0<p\le 2\)) Frobenius-norm minimization (called RGCCA). RGCCA is difficult to solve because the p-order F-norm terms are nonsmooth and nonconvex; therefore, an efficient iterative algorithm is developed, and its convergence is analyzed theoretically. In addition, the parameters of RGCCA trade off accuracy against training time, a property that is especially useful for large sample sizes. Empirical experiments and theoretical analysis demonstrate the effectiveness and robustness of RGCCA on both noiseless and noisy datasets.
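For concreteness, the robust objective referred to above can be written down from the notation used in the appendix. The following display is a reconstruction of the paper's problem (8) inferred from the Lagrangian (A.13), not a verbatim quote of the paywalled body:

$$\min_{G,\,{U}_{1},\dots ,{U}_{J}}\;\sum_{j=1}^{J}{\Vert G-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{p}\quad \mathrm{s.t.}\;\; G{G}^{T}={I}_{r},\;\; 0<p\le 2,$$

where \({X}_{j}\) is the data matrix of the j-th view, \({U}_{j}\) its projection, and \(G\) the shared latent representation; setting \(p=2\) recovers the standard squared-F-norm GCCA objective.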




References

  1. Sun L, Ji S, Ye J (2010) Canonical correlation analysis for multilabel classification: a least-squares formulation, extensions, and analysis. IEEE Trans Pattern Anal Mach Intell 33:194–200
  2. Sarvestani RR, Boostani R (2017) FF-SKPCCA: kernel probabilistic canonical correlation analysis. Appl Intell 46:438–454
  3. Elmadany NED, He Y, Guan L (2018) Information fusion for human action recognition via biset/multiset globality locality preserving canonical correlation analysis. IEEE Trans Image Process 27:5275–5287
  4. Wong HS, Wang L, Chan R, Zeng T (2021) Deep tensor CCA for multi-view learning. IEEE Trans Big Data 8:1664–1677
  5. Chen Z, Liang K, Ding SX, Yang C, Peng T, Yuan X (2021) A comparative study of deep neural network-aided canonical correlation analysis-based process monitoring and fault detection methods. IEEE Trans Neural Netw Learn Syst 33:6158–6172
  6. Safayani M, Ahmadi SH, Afrabandpey H, Mirzaei A (2018) An EM based probabilistic two-dimensional CCA with application to face recognition. Appl Intell 48:755–770
  7. Sun S, Xie X, Yang M (2015) Multiview uncorrelated discriminant analysis. IEEE Trans Cybern 46:3272–3284
  8. Luo Y, Tao D, Ramamohanarao K, Xu C, Wen Y (2015) Tensor canonical correlation analysis for multi-view dimension reduction. IEEE Trans Knowl Data Eng 27:3111–3124
  9. Gao L, Qi L, Chen E, Guan L (2017) Discriminative multiple canonical correlation analysis for information fusion. IEEE Trans Image Process 27:1951–1965
  10. Chen H, Chen Z, Chai Z, Jiang B, Huang B (2021) A single-side neural network-aided canonical correlation analysis with applications to fault diagnosis. IEEE Trans Cybern 52:9454–9466
  11. Xiu X, Pan L, Yang Y, Liu W (2022) Efficient and fast joint sparse constrained canonical correlation analysis for fault detection. IEEE Trans Neural Netw Learn Syst 1–11. https://doi.org/10.1109/TNNLS.2022.3201881
  12. Wang Y, Cang S, Yu H (2019) Mutual information inspired feature selection using kernel canonical correlation analysis. Expert Syst Appl 4:1–9
  13. Chen L, Wang K, Li M, Wu M, Pedrycz W, Hirota K (2022) K-means clustering-based kernel canonical correlation analysis for multimodal emotion recognition in human-robot interaction. IEEE Trans Industr Electron 70:1016–1024
  14. Andrew G, Arora R, Bilmes J, Livescu K (2013) Deep canonical correlation analysis. In: Proceedings of the 30th International Conference on Machine Learning, pp 1247–1255
  15. Wang W, Arora R, Livescu K, Bilmes J (2015) On deep multi-view representation learning. In: Proceedings of the 32nd International Conference on Machine Learning, pp 1083–1092
  16. Xiu X, Miao Z, Yang Y, Liu W (2021) Deep canonical correlation analysis using sparsity constrained optimization for nonlinear process monitoring. IEEE Trans Industr Inf 18:6690–6699
  17. Yu Y, Tang S, Aizawa K, Aizawa A (2018) Category-based deep CCA for fine-grained venue discovery from multimodal data. IEEE Trans Neural Netw Learn Syst 30:1250–1258
  18. Horst P (1961) Generalized canonical correlations and their application to experimental data. J Clin Psychol 17:331–347
  19. Kanatsoulis CI, Fu X, Sidiropoulos ND, Hong M (2018) Structured SUMCOR multiview canonical correlation analysis for large-scale data. IEEE Trans Signal Process 67:306–319
  20. Fu X, Huang K, Hong M, Sidiropoulos ND, So AM-C (2017) Scalable and flexible multiview MAX-VAR canonical correlation analysis. IEEE Trans Signal Process 65:4150–4165
  21. Carroll JD (1968) Generalization of canonical correlation analysis to three or more sets of variables. In: Proceedings of the 76th Annual Convention of the American Psychological Association, pp 227–228
  22. Lu C, Feng J, Chen Y, Liu W, Lin Z, Yan S (2019) Tensor robust principal component analysis with a new tensor nuclear norm. IEEE Trans Pattern Anal Mach Intell 42:925–938
  23. Gao Y, Lin T, Zhang Y, Luo S, Nie F (2021) Robust principal component analysis based on discriminant information. IEEE Trans Knowl Data Eng 35:1991–2003
  24. Sørensen M, Kanatsoulis CI, Sidiropoulos ND (2021) Generalized canonical correlation analysis: a subspace intersection approach. IEEE Trans Signal Process 69:2452–2467
  25. Zheng T, Ge H, Li J, Wang L (2021) Unsupervised multi-view representation learning with proximity guided representation and generalized canonical correlation analysis. Appl Intell 51:248–264
  26. Gloaguen A, Philippe C, Frouin V, Gennari G, Dehaene-Lambertz G, Le Brusquet L, Tenenhaus A (2022) Multiway generalized canonical correlation analysis. Biostatistics 23:240–256
  27. Chu D, Liao LZ, Ng MK, Zhang X (2013) Sparse canonical correlation analysis: new formulation and algorithm. IEEE Trans Pattern Anal Mach Intell 35:3050–3065
  28. Hardoon DR, Shawe-Taylor J (2011) Sparse canonical correlation analysis. Mach Learn 83:331–353
  29. Xu M, Zhu Z, Zhang X, Zhao Y, Li X (2019) Canonical correlation analysis with L2,1-norm for multiview data representation. IEEE Trans Cybern 50:4772–4782
  30. Li Y, Yang M, Zhang Z (2018) A survey of multi-view representation learning. IEEE Trans Knowl Data Eng 31:1863–1883
  31. Yang X, Liu W, Liu W, Tao D (2019) A survey on canonical correlation analysis. IEEE Trans Knowl Data Eng 33:2349–2368
  32. Zhao H, Wang Z, Nie F (2018) A new formulation of linear discriminant analysis for robust dimensionality reduction. IEEE Trans Knowl Data Eng 31:629–640
  33. Yu Y, Xu G, Jiang M, Zhu H, Dai D, Yan H (2019) Joint transformation learning via the L2,1-norm metric for robust graph matching. IEEE Trans Cybern 51:521–533
  34. Nie F, Wang Z, Wang R, Wang Z, Li X (2019) Towards robust discriminative projections learning via non-greedy L2,1-norm minmax. IEEE Trans Pattern Anal Mach Intell 43:2086–2100
  35. Bala R, Dagar A, Singh RP (2021) A novel online sequential extreme learning machine with L2,1-norm regularization for prediction problems. Appl Intell 51:1669–1689
  36. Tenenhaus A, Tenenhaus M (2014) Regularized generalized canonical correlation analysis for multiblock or multigroup data analysis. Eur J Oper Res 238:391–403
  37. Tenenhaus A, Philippe C, Frouin V (2015) Kernel generalized canonical correlation analysis. Comput Stat Data Anal 90:114–131
  38. Tenenhaus M, Tenenhaus A, Groenen PJ (2017) Regularized generalized canonical correlation analysis: a framework for sequential multiblock component methods. Psychometrika 82:737–777
  39. Li X, Xiu X, Liu W, Miao Z (2021) An efficient Newton-based method for sparse generalized canonical correlation analysis. IEEE Signal Process Lett 29:125–129
  40. LeCun Y (1998) The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/
  41. Ionescu C, Papava D, Olaru V, Sminchisescu C (2013) Human3.6M: large scale datasets and predictive methods for 3D human sensing in natural environments. IEEE Trans Pattern Anal Mach Intell 36:1325–1339
  42. Martin N, Maes H (1979) Multivariate analysis. Academic Press, London
  43. Hardoon DR, Szedmak S, Shawe-Taylor J (2004) Canonical correlation analysis: an overview with application to learning methods. Neural Comput 16:2639–2664
  44. Chen J, Wang G, Shen Y, Giannakis GB (2018) Canonical correlation analysis of datasets with a common source graph. IEEE Trans Signal Process 66:4398–4408
  45. Wang Y, Shahrampour S (2021) ORCCA: optimal randomized canonical correlation analysis. IEEE Trans Neural Netw Learn Syst 1–13. https://doi.org/10.1109/TNNLS.2021.3124868
  46. Fu X, Huang K, Papalexakis E, Song HA, Talukdar PP, Faloutsos C, Sidiropoulos N, Mitchell T (2018) Efficient and distributed generalized canonical correlation analysis for big multiview data. IEEE Trans Knowl Data Eng 31:2304–2318
  47. Wang Q, Gao Q, Xie D, Gao X, Wang Y (2018) Robust DLPP with nongreedy L1-norm minimization and maximization. IEEE Trans Neural Netw Learn Syst 29:738–743
  48. Yan H, Ye Q, Zhang T, Yu D-J, Yuan X, Xu Y, Fu L (2018) Least squares twin bounded support vector machines based on L1-norm distance metric for classification. Pattern Recogn 74:434–447
  49. Jin J, Xiao R, Daly I, Miao Y, Wang X, Cichocki A (2020) Internal feature selection method of CSP based on L1-norm and Dempster-Shafer theory. IEEE Trans Neural Netw Learn Syst 32:4814–4825
  50. Li Y, Sun H, Yan W, Cui Q (2021) R-CTSVM+: robust capped L1-norm twin support vector machine with privileged information. Inf Sci 574:12–32
  51. Lai Z, Xu Y, Yang J, Shen L, Zhang D (2016) Rotational invariant dimensionality reduction algorithms. IEEE Trans Cybern 47:3733–3746
  52. Ye Q, Li Z, Fu L, Zhang Z, Yang W, Yang G (2019) Nonpeaked discriminant analysis for data representation. IEEE Trans Neural Netw Learn Syst 30:3818–3832
  53. Ye Q, Huang P, Zhang Z, Zheng Y, Fu L, Yang W (2021) Multiview learning with robust double-sided twin SVM. IEEE Trans Cybern 52:12745–12758
  54. Nakkala MR, Singh A, Rossi A (2021) Multi-start iterated local search, exact and matheuristic approaches for minimum capacitated dominating set problem. Appl Soft Comput 108:1–19
  55. Mao J, Pan Q, Miao Z, Gao L (2021) An effective multi-start iterated greedy algorithm to minimize makespan for the distributed permutation flowshop scheduling problem with preventive maintenance. Expert Syst Appl 169:1–11
  56. Ye Q, Zhao H, Li Z, Yang X, Gao S, Yin T, Ye N (2017) L1-norm distance minimization-based fast robust twin support vector k-plane clustering. IEEE Trans Neural Netw Learn Syst 29:4494–4503
  57. Kim C, Klabjan D (2019) A simple and fast algorithm for L1-norm kernel PCA. IEEE Trans Pattern Anal Mach Intell 42:1842–1855
  58. Li C, Ren P, Shao Y, Ye Y, Guo Y (2020) Generalized elastic net Lp-norm nonparallel support vector machine. Eng Appl Artif Intell 88:1–16
  59. Yan H, Fu L, Hu J, Ye Q, Qi Y, Yu D-J (2022) Robust distance metric optimization driven GEPSVM classifier for pattern classification. Pattern Recogn 129:1–14
  60. Zhang C, Fu H, Hu Q, Cao X, Xie Y, Tao D, Xu D (2018) Generalized latent multi-view subspace clustering. IEEE Trans Pattern Anal Mach Intell 42:86–99
  61. Fu L, Li Z, Ye Q, Yin H, Liu Q, Chen X, Fan X, Yang W, Yang G (2020) Learning robust discriminant subspace based on joint L2,p- and L2,s-norm distance metrics. IEEE Trans Neural Netw Learn Syst 33:130–144
  62. Ma J (2020) Capped L1-norm distance metric-based fast robust twin extreme learning machine. Appl Intell 50:3775–3787
  63. Liu Y, Jia R, Liu Q, Zhang X, Sun H (2021) Crowd counting method based on the self-attention residual network. Appl Intell 51:427–440
  64. Khouloud S, Ahlem M, Fadel T, Amel S (2022) W-net and inception residual network for skin lesion segmentation and classification. Appl Intell 52:3976–3994


Acknowledgements

This work was supported by the National Key Research and Development Program of China: Key Projects of International Scientific and Technological Innovation Cooperation between Governments (No. 2019YFE0123800), the National Natural Science Foundation of China (Nos. 62072243 and 62072246), the Natural Science Foundation of Jiangsu Province (BK20201304), the Foundation of National Defense Key Laboratory of Science and Technology (JZX7Y202001SY000901), the “333 Project” of Jiangsu Province under Project (BRA2020044) and the EU’s Horizon 2020 Program (LC-GV-05-2019).

Author information


Corresponding authors

Correspondence to Dong-Jun Yu or Yong Qi.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Convergence Analysis

This section proves that Algorithm 1 monotonically decreases the objective function value of (8) and converges to a local optimal solution. We first introduce Lemma 1.

Lemma 1 [61]: For any nonzero matrices \({V}^{\left(t+1\right)}\) and \({V}^{t}\), when \(0<p\le 2\), the following inequality holds:

$${\Vert {V}^{\left(t+1\right)}\Vert }_{F}^{p}-\frac{p}{2}{\Vert {V}^{t}\Vert }_{F}^{p-2}{\Vert {V}^{\left(t+1\right)}\Vert }_{F}^{2}\le {\Vert {V}^{t}\Vert }_{F}^{p}-\frac{p}{2}{\Vert {V}^{t}\Vert }_{F}^{p-2}{\Vert {V}^{t}\Vert }_{F}^{2}.$$
(A.1)
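As a quick numerical sanity check of Lemma 1 (our addition, not part of the original paper), the following NumPy snippet evaluates both sides of (A.1) for random matrices and random orders \(p\in (0,2]\); no random trial should violate the inequality:

```python
import numpy as np

# Sanity check of Lemma 1 / inequality (A.1):
#   ||V1||_F^p - (p/2) ||V0||_F^(p-2) ||V1||_F^2  <=  (1 - p/2) ||V0||_F^p
rng = np.random.default_rng(42)
for _ in range(10_000):
    p = rng.uniform(1e-3, 2.0)            # order 0 < p <= 2
    V0 = rng.standard_normal((4, 5))      # V^(t)   (nonzero with probability 1)
    V1 = rng.standard_normal((4, 5))      # V^(t+1)
    n0 = np.linalg.norm(V0, "fro")
    n1 = np.linalg.norm(V1, "fro")
    lhs = n1**p - (p / 2) * n0 ** (p - 2) * n1**2
    rhs = n0**p - (p / 2) * n0 ** (p - 2) * n0**2   # equals (1 - p/2) * n0**p
    assert lhs <= rhs + 1e-10, (p, lhs, rhs)
print("Lemma 1 held on all random trials")
```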

Theorem 1: In each iteration of Algorithm 1, we have

$$\sum_{j=1}^{J}{\Vert {G}^{\left(t+1\right)}-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{2}{d}_{j}^{\left(t+1\right)}\le \sum_{j=1}^{J}{\Vert {G}^{\left(t\right)}-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{2}{d}_{j}^{t}.$$
(A.2)

This means that Algorithm 1 monotonically decreases the objective function value of (8).

Proof: According to step 3 of Algorithm 1, in the \(\left(t+1\right)\)th iteration we have the following inequality:

$$tr\left(\sum_{j=1}^{J}{G}^{\left(t\text{+}1\right)}M{\left({G}^{\left(t\text{+}1\right)}\right)}^{T}{d}_{j}^{t}\right)\ge tr\left(\sum_{j=1}^{J}{G}^{t}M{\left({G}^{t}\right)}^{T}{d}_{j}^{t}\right).$$
(A.3)

Multiplying inequality (A.3) by \(-1\) and adding \(\sum_{j=1}^{J}tr\left({I}_{N}{d}_{j}^{t}\right)\) to both sides, we have

$$\sum_{j=1}^{J}tr\left({I}_{N}{d}_{j}^{t}\right)-\sum_{j=1}^{J}tr\left({G}^{\left(t\text{+}1\right)}M{\left({G}^{\left(t\text{+}1\right)}\right)}^{T}{d}_{j}^{t}\right)\le \sum_{j=1}^{J}tr\left({I}_{N}{d}_{j}^{t}\right)-\sum_{j=1}^{J}tr\left({G}^{t}M{\left({G}^{t}\right)}^{T}{d}_{j}^{t}\right).$$
(A.4)

By simple algebra, (A.4) becomes,

$$\sum_{j=1}^{J}tr\left({G}^{\left(t\text{+}1\right)}\left({I}_{N}-{P}_{j}\right){\left({G}^{\left(t\text{+}1\right)}\right)}^{T}{d}_{j}^{t}\right)\le \sum_{j=1}^{J}tr\left({G}^{t}\left({I}_{N}-{P}_{j}\right){\left({G}^{t}\right)}^{T}{d}_{j}^{t}\right).$$
(A.5)

Since \({\Vert G\left({I}_{N}-{P}_{j}\right)\Vert }_{F}^{2}=tr\left(G\left({I}_{N}-{P}_{j}\right){G}^{T}\right)\) holds for each \(j\), (A.5) is transformed into

$$\sum_{j=1}^{J}{\Vert {G}^{\left(t\text{+}1\right)}\left({I}_{N}-{P}_{j}\right)\Vert }_{F}^{2}{d}_{j}^{t}\le \sum_{j=1}^{J}{\Vert {G}^{t}\left({I}_{N}-{P}_{j}\right)\Vert }_{F}^{2}{d}_{j}^{t}.$$
(A.6)

From the definitions of \({U}_{j}\) and \({P}_{j}\), (A.6) becomes,

$$\sum_{j=1}^{J}{\Vert {G}^{\left(t\text{+}1\right)}-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{2}{d}_{j}^{t}\le \sum_{j=1}^{J}{\Vert {G}^{t}-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{2}{d}_{j}^{t}.$$
(A.7)

From the definition of \({d}_{j}\), denote \({V}_{j}^{\left(t+1\right)}={G}^{\left(t+1\right)}-{U}_{j}^{T}{X}_{j}\) and \({V}_{j}^{t}={G}^{t}-{U}_{j}^{T}{X}_{j}\). Substituting \({V}_{j}^{\left(t+1\right)}\) and \({V}_{j}^{t}\) into (A.7), by simple algebra we can rewrite (A.7) as

$$\sum_{j=1}^{J}\frac{{\Vert {V}_{j}^{\left(t+1\right)}\Vert }_{F}^{2}}{{\Vert {V}_{j}^{t}\Vert }_{F}^{2}}{\Vert {V}_{j}^{t}\Vert }_{F}^{p}\le \sum_{j=1}^{J}{\Vert {V}_{j}^{t}\Vert }_{F}^{p}.$$
(A.8)

According to Lemma 1, we have

$$\frac{p}{2}\frac{{\Vert {V}_{j}^{\left(t+1\right)}\Vert }_{F}^{2}}{{\Vert {V}_{j}^{t}\Vert }_{F}^{2}}{\Vert {V}_{j}^{t}\Vert }_{F}^{p}\ge {\Vert {V}_{j}^{\left(t+1\right)}\Vert }_{F}^{p}-\left(1-\frac{p}{2}\right){\Vert {V}_{j}^{t}\Vert }_{F}^{p}.$$
(A.9)

Since inequality (A.9) holds for every index \(j\), summing over \(j\) yields

$$\frac{p}{2}\sum_{j=1}^{J}\frac{{\Vert {V}_{j}^{\left(t+1\right)}\Vert }_{F}^{2}}{{\Vert {V}_{j}^{t}\Vert }_{F}^{2}}{\Vert {V}_{j}^{t}\Vert }_{F}^{p}\ge \sum_{j=1}^{J}{\Vert {V}_{j}^{\left(t+1\right)}\Vert }_{F}^{p}-\sum_{j=1}^{J}\left(1-\frac{p}{2}\right){\Vert {V}_{j}^{t}\Vert }_{F}^{p}.$$
(A.10)

Combining inequalities (A.8) and (A.10), by simple algebra, we have

$$\sum_{j=1}^{J}{\Vert {V}_{j}^{\left(t+1\right)}\Vert }_{F}^{p}\le \sum_{j=1}^{J}{\Vert {V}_{j}^{t}\Vert }_{F}^{p}.$$
(A.11)

According to the definitions of \({V}_{j}^{\left(t+1\right)}\) and \({V}_{j}^{t}\), (A.11) can be rewritten as

$$\sum_{j=1}^{J}{\Vert {G}^{\left(t\text{+}1\right)}-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{p}\le \sum_{j=1}^{J}{\Vert {G}^{t}-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{p}.$$
(A.12)

Inequality (A.12) shows that Algorithm 1 monotonically decreases the objective function value of (8) in each iteration, i.e., the iterate moves towards the optimal solution \({G}^{*}\). □
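Algorithm 1 itself is not reproduced in this excerpt, but the proof above pins down its structure: step 3 maximizes a weighted trace over the constraint \(G{G}^{T}={I}_{r}\) (solved by a top-\(r\) eigen-decomposition), after which the per-view weights \({d}_{j}\) are refreshed from the residuals. The following NumPy sketch is a minimal reconstruction under these assumptions, taking \({P}_{j}={X}_{j}^{T}{\left({X}_{j}{X}_{j}^{T}\right)}^{-1}{X}_{j}\) and \({d}_{j}\propto {\Vert G-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{p-2}\) (the constant \(p/2\) does not affect the eigenvector step). All variable names are ours; the printed objective values should be non-increasing, as Theorem 1 guarantees:

```python
import numpy as np

rng = np.random.default_rng(0)
N, r, p = 200, 3, 1.0                       # samples, latent dim, norm order
Xs = [rng.standard_normal((m, N)) for m in (10, 15, 20)]   # J = 3 toy views

# P_j = X_j^T (X_j X_j^T)^{-1} X_j  (tiny ridge added for numerical stability)
Ps = [X.T @ np.linalg.solve(X @ X.T + 1e-8 * np.eye(X.shape[0]), X) for X in Xs]

def objective(G):
    # sum_j ||G - U_j^T X_j||_F^p at the optimal U_j, for which U_j^T X_j = G P_j
    return sum(np.linalg.norm(G - G @ P, "fro") ** p for P in Ps)

G = np.linalg.qr(rng.standard_normal((N, r)))[0].T   # any G with G G^T = I_r
d = np.ones(len(Ps))
for t in range(20):
    # step 3: rows of G span the top-r eigenspace of sum_j d_j P_j
    _, V = np.linalg.eigh(sum(dj * P for dj, P in zip(d, Ps)))
    G = V[:, -r:].T
    # reweighting: d_j proportional to ||V_j||_F^(p-2)
    d = np.array([np.linalg.norm(G - G @ P, "fro") ** (p - 2) for P in Ps])
    print(f"iter {t:2d}  objective {objective(G):.6f}")  # non-increasing
```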

Theorem 2: Algorithm 1 converges to a local optimal solution of the objective function (8).

Proof: Problem (8) is a constrained extremum problem. We transform it into an unconstrained one by the Lagrange multiplier method. The Lagrange function of (8) is:

$${L}_{1}\left(G,\alpha \right)=\sum_{j=1}^{J}{\Vert G-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{p}-tr\left(\alpha \left(G{G}^{T}-{I}_{r}\right)\right),$$
(A.13)

where \(\alpha\) is the Lagrange multiplier. To satisfy the constraint \(G{G}^{T}={I}_{r}\), we set \(\alpha\) to be a diagonal matrix. According to the KKT condition for the optimal solution, we set \(\left(\partial {L}_{1}/\partial G\right)=0\); then,

$$\frac{\partial {L}_{1}}{\partial G}=p\sum_{j=1}^{J}\left({\Vert G-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{p-2}\right)G\overline{M }-2G\alpha =0,$$
(A.14)

where \(\overline{M }=\left({I}_{N}-2M\right)\). By simple algebra, the optimal solution satisfies

$$\sum_{j=1}^{J}\left({\Vert G-{U}_{j}^{T}{X}_{j}\Vert }_{F}^{p-2}\right)G\overline{M }=G\left(2\alpha /p\right).$$
(A.15)

According to the above analysis, the optimal solution of objective function (12) is obtained in step 3 of Algorithm 1. Therefore, the converged solution \({G}^{*}\) of Algorithm 1 satisfies the KKT condition of (12). The Lagrange function of objective function (12) is:

$${L}_{2}\left(G,\overline{\alpha }\right)=tr\left(GDM{G}^{T}\right)-tr\left(\overline{\alpha }\left(G{G}^{T}-{I}_{r}\right)\right),$$
(A.16)

where \(\overline{\alpha }\) is the Lagrange multiplier. Setting \(\left(\partial {L}_{2}/\partial G\right)=0\), we obtain the KKT condition of (12) as follows:

$$\frac{\partial {L}_{2}}{\partial G}=GMD-G\overline{\alpha }=0.$$
(A.17)

Equation (A.17) is formally similar to (A.15); the key difference is that the diagonal matrix \(D\) is known in each iteration. Suppose Algorithm 1 obtains the optimal solution \({G}^{*}\) in the \(\left(t+1\right)\)th iteration, so that \({G}^{\left(t+1\right)}={G}^{*}={G}^{t}\). Then, by the definition of \(D\), (A.17) coincides with (A.15), which shows that the converged solution of Algorithm 1 satisfies the KKT condition of objective function (8), i.e., \(\left(\partial {L}_{1}/\partial G\right)\left|{}_{G={G}^{*}}\right.=0\). Hence, the converged solution of Algorithm 1 is a local optimal solution of objective function (8). □
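Continuing the sketch after Theorem 1 (this fragment reuses G, d, and Ps from that snippet, so it is not standalone), one can verify stationarity numerically: at (approximate) convergence the rows of \(G\) are eigenvectors of \(S=\sum_{j}{d}_{j}{P}_{j}\), so \(GS=\overline{\alpha }G\) with \(\overline{\alpha }=GS{G}^{T}\) nearly diagonal, mirroring the structure of (A.15) and (A.17):

```python
# Stationarity check at the converged G (continues the sketch after Theorem 1).
S = sum(dj * P for dj, P in zip(d, Ps))
alpha = G @ S @ G.T                                  # ~ diagonal at convergence
kkt_residual = np.linalg.norm(G @ S - alpha @ G, "fro")
off_diag = np.linalg.norm(alpha - np.diag(np.diag(alpha)), "fro")
print(f"KKT residual {kkt_residual:.2e}, off-diagonal mass {off_diag:.2e}")
```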

According to Theorem 1, an iterative algorithm can be developed to search for the local optimal solution of objective function (8). However, the algorithm needs to compute the eigen-decomposition of the matrix \(DM\) in every iteration, which makes RGCCA computationally more expensive than the GCCA family of methods; this drawback is shared by other iterative algorithms [47, 48, 58, 61, 62]. In future work we will explore a strategy with a theoretical guarantee to reduce the cost of RGCCA. Fortunately, RGCCA's parameters can be adjusted to balance robustness against cost, especially for larger samples, so Algorithm 1 still scales to large datasets and the developed iterative algorithm is useful in practical applications. A toy illustration of the robustness mechanism is sketched below.
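To make the robustness claim concrete, the toy experiment below (again our construction, not the paper's) builds three views that share a latent signal, corrupts one view with gross outliers, and runs the same iteration for \(p=2\) and \(p=1\). For \(p=2\) the weights stay uniform (plain GCCA), whereas for \(p<2\) the corrupted view's large residual drives its weight \({\Vert {V}_{j}\Vert }_{F}^{p-2}\) down, so that view is automatically suppressed:

```python
import numpy as np

rng = np.random.default_rng(1)
N, r = 200, 3
Z = rng.standard_normal((r, N))                      # shared latent signal
Xs = [rng.standard_normal((m, r)) @ Z + 0.05 * rng.standard_normal((m, N))
      for m in (10, 15, 20)]
mask = rng.random(Xs[2].shape) < 0.05                # corrupt 5% of view 3
Xs[2] = Xs[2] + 50.0 * mask * rng.standard_normal(Xs[2].shape)
Ps = [X.T @ np.linalg.solve(X @ X.T + 1e-8 * np.eye(X.shape[0]), X) for X in Xs]

for p in (2.0, 1.0):
    G = np.linalg.qr(rng.standard_normal((N, r)))[0].T
    d = np.ones(len(Ps))
    for _ in range(20):
        _, V = np.linalg.eigh(sum(dj * P for dj, P in zip(d, Ps)))
        G = V[:, -r:].T
        res = [max(np.linalg.norm(G - G @ P, "fro"), 1e-12) for P in Ps]
        d = np.array([rj ** (p - 2) for rj in res])
    # expect the corrupted third view to receive a much smaller weight when p < 2
    print(f"p = {p}: normalized view weights {np.round(d / d.sum(), 4)}")
```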

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yan, H., Cheng, L., Ye, Q. et al. Robust generalized canonical correlation analysis. Appl Intell 53, 21140–21155 (2023). https://doi.org/10.1007/s10489-023-04666-6

