Abstract
Situations in which only a limited amount of labeled data and a large amount of unlabeled data are available to the learning algorithm are typical of many real-world problems. To make use of unlabeled data in preference learning problems, we propose a semisupervised algorithm based on the multi-view approach. Our algorithm, which we call Sparse Co-RankRLS, minimizes a least-squares approximation of the ranking error and is formulated within the co-regularization framework. It operates by constructing a ranker for each view and by choosing the ranking prediction functions that minimize the disagreement among all of the rankers on the unlabeled data. Our experiments, conducted on a real-world dataset, show that the inclusion of unlabeled data can improve the prediction performance significantly. Moreover, our semisupervised preference learning algorithm has linear complexity in the number of unlabeled data items, making it applicable to large datasets.
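The following is a minimal, self-contained sketch of the co-regularization idea summarized above: a linear (primal) least-squares ranker is fit for each of two views, with the pairwise ranking loss on labeled data and a disagreement penalty between the views' predictions on unlabeled data. It is not the chapter's Sparse Co-RankRLS algorithm, which works with kernels and a sparse approximation; the data, dimensions, and parameter values (lam, mu) below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: two feature "views" of the same objects.
n_labeled, n_unlabeled, d1, d2 = 40, 200, 5, 7
latent = rng.normal(size=(n_labeled + n_unlabeled, 3))
X1 = np.hstack([latent, rng.normal(size=(latent.shape[0], d1 - 3))])  # view 1
X2 = np.hstack([latent, rng.normal(size=(latent.shape[0], d2 - 3))])  # view 2
scores = latent @ np.array([1.0, -2.0, 0.5])   # true utilities
y = scores[:n_labeled]                         # scores known only for labeled items

X1L, X1U = X1[:n_labeled], X1[n_labeled:]
X2L, X2U = X2[:n_labeled], X2[n_labeled:]

lam, mu = 1.0, 0.5   # regularization and co-regularization weights (arbitrary)
m = n_labeled
# Laplacian of the complete preference graph over labeled items:
# (Xw - y)^T Lap (Xw - y) equals the sum of squared pairwise ranking errors.
Lap = m * np.eye(m) - np.ones((m, m))

# Block linear system obtained by setting the gradient of the joint
# objective (per-view ranking loss + ridge penalty + disagreement term) to zero.
A11 = X1L.T @ Lap @ X1L + lam * np.eye(d1) + mu * X1U.T @ X1U
A22 = X2L.T @ Lap @ X2L + lam * np.eye(d2) + mu * X2U.T @ X2U
A12 = -mu * X1U.T @ X2U
A = np.block([[A11, A12], [A12.T, A22]])
b = np.concatenate([X1L.T @ Lap @ y, X2L.T @ Lap @ y])

w = np.linalg.solve(A, b)
w1, w2 = w[:d1], w[d1:]

def predict(x1, x2):
    # Final ranker: average of the two views' prediction functions.
    return 0.5 * (x1 @ w1 + x2 @ w2)

pred_u = predict(X1U, X2U)
print(np.corrcoef(pred_u, scores[n_labeled:])[0, 1])  # agreement with true utilities
```

Because the unlabeled data enter only through the disagreement term, the cost of adding unlabeled items grows linearly in their number in this sketch, mirroring the complexity claim made in the abstract.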
Notes
- 1.
As described in [8], one can distinguish between weak preference (\(\succeq\)) and strict preference (\(\succ\)), where \(y \succ_{\vec{x}} y' \Leftrightarrow (y \succeq_{\vec{x}} y') \wedge (y' \not\succeq_{\vec{x}} y)\); furthermore, \(y \sim_{\vec{x}} y' \Leftrightarrow (y \succeq_{\vec{x}} y') \wedge (y' \succeq_{\vec{x}} y)\).
- 2.
Unless stated otherwise, we assume that a kernel matrix K is positive definite, i.e., \(B^{\mathrm{T}} K B > 0\) for all \(B \in \mathbb{R}^{n}, B \neq 0\). This can be ensured, for example, by performing a small diagonal shift (a small sketch of this follows these notes).
- 3.
Python implementation of the algorithm and the dataset are available on request.
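Note 2 above mentions making a kernel matrix strictly positive definite via a small diagonal shift. The snippet below is a minimal illustration of that shift on a hypothetical, rank-deficient linear kernel; the shift value is an arbitrary choice.

```python
import numpy as np

# Hypothetical kernel matrix that is only positive semidefinite (rank-deficient).
X = np.random.default_rng(1).normal(size=(10, 3))
K = X @ X.T                                  # linear kernel, rank <= 3, so K is singular

shift = 1e-6
K_shifted = K + shift * np.eye(K.shape[0])   # small diagonal shift

print(np.linalg.eigvalsh(K).min())           # ~0 (possibly slightly negative from round-off)
print(np.linalg.eigvalsh(K_shifted).min())   # strictly positive, roughly by the shift amount
```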
References
A. Blum, T. Mitchell, Combining labeled and unlabeled data with co-training, in Proceedings of the 11th Annual Conference on Computational Learning Theory (ACM, New York, NY, USA, 1998), pp. 92–100
U. Brefeld, T. Gärtner, T. Scheffer, S. Wrobel, Efficient co-regularised least squares regression, in Proceedings of the 23rd International Conference on Machine Learning (ACM, New York, NY, USA, 2006), pp. 137–144
U. Brefeld, T. Scheffer, Co-EM support vector learning, in Proceedings of the 21st International Conference on Machine Learning (ACM, New York, NY, USA, 2004), p. 16
R.A. Brualdi, H.J. Ryser, Combinatorial Matrix Theory (Cambridge University Press, 1991)
M. Collins, N. Duffy, New ranking algorithms for parsing and tagging: kernels over discrete structures, and the voted perceptron, in ACL ’02: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (Association for Computational Linguistics, Morristown, NJ, USA, 2002), pp. 263–270
O. Dekel, C.D. Manning, Y. Singer, Log-linear models for label ranking, in Advances in Neural Information Processing Systems, vol. 16, ed. by S. Thrun, L. Saul, B. Schölkopf (MIT Press, Cambridge, MA, 2004), pp. 497–504
J. Demšar, Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
J. Fürnkranz, E. Hüllermeier, Preference learning. Künstliche Intelligenz 19(1), 60–61 (2005)
G.H. Golub, C.F. Van Loan, Matrix Computations (Johns Hopkins University Press, 1996)
R. Herbrich, T. Graepel, K. Obermayer, Support vector learning for ordinal regression, in Proceedings of the Ninth International Conference on Artificial Neural Networks (Institute of Electrical Engineers, London, 1999), pp. 97–102
R. Horn, C.R. Johnson, Matrix Analysis (Cambridge University Press, Cambridge, 1985)
T. Pahikkala, E. Tsivtsivadze, A. Airola, J. Järvinen, J. Boberg, An efficient algorithm for learning to rank from preference graphs. Mach. Learn. 75(1), 129–165 (2009)
T. Pahikkala, E. Tsivtsivadze, J. Boberg, T. Salakoski, Graph kernels versus graph representations: a case study in parse ranking, in Proceedings of the ECML/PKDD’06 Workshop on Mining and Learning with Graphs, ed. by T. Gärtner, G.C. Garriga, T. Meinl (Berlin, Germany, 2006), pp. 181–188
T. Poggio, F. Girosi, Networks for approximation and learning. Proc. IEEE 78(9), 1481–1497 (1990)
A. Pothen, H.D. Simon, K.-P. Liou, Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl. 11(3), 430–452 (1990)
S. Pyysalo, F. Ginter, J. Heimonen, J. Björne, J. Boberg, J. Järvinen, T. Salakoski, BioInfer: a corpus for information extraction in the biomedical domain. BMC Bioinformatics 8, 50 (2007)
J. Quiñonero-Candela, C.E. Rasmussen, A unifying view of sparse approximate Gaussian process regression. J. Mach. Learn. Res. 6, 1939–1959 (2005)
D. Rosenberg, P.L. Bartlett, The Rademacher complexity of co-regularized kernel classes, in Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, ed. by M. Meila, X. Shen (2007), pp. 396–403
B. Schölkopf, R. Herbrich, A.J. Smola, A generalized representer theorem, in Proceedings of the 14th Annual Conference on Computational Learning Theory, ed. by D.P. Helmbold, B. Williamson (Springer, London, 2001), pp. 416–426
V. Sindhwani, P. Niyogi, M. Belkin, A co-regularization approach to semi-supervised learning with multiple views, in Proceedings of ICML Workshop on Learning with Multiple Views (2005)
V. Sindhwani, D. Rosenberg, An RKHS for multi-view learning and manifold co-regularization, in Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008), ed. by A. McCallum, S. Roweis (Omnipress, Helsinki, Finland, 2008), pp. 976–983
A.J. Smola, B. Schölkopf, Sparse greedy matrix approximation for machine learning, in Proceedings of the 17th International Conference on Machine Learning, ed. by P. Langley (Morgan Kaufmann, San Francisco, CA, USA, 2000), pp. 911–918
E. Tsivtsivadze, T. Pahikkala, A. Airola, J. Boberg, T. Salakoski, A sparse regularized least-squares preference learning algorithm, in 10th Scandinavian Conference on Artificial Intelligence (SCAI 2008), vol. 173, ed. by A. Holst, P. Kreuger, P. Funk (IOS, 2008), pp. 76–83
E. Tsivtsivadze, T. Pahikkala, S. Pyysalo, J. Boberg, A. Mylläri, T. Salakoski, Regularized least-squares for parse ranking, in Advances in Intelligent Data Analysis VI, ed. by A. Fazel Famili, J.N. Kok, J.M. Peña, A. Siebes, A.J. Feelders (Springer, 2005), pp. 464–474
P. Vincent, Y. Bengio, Kernel matching pursuit. Mach. Learn. 48(1–3), 165–187 (2002)
Acknowledgements
We acknowledge support from the Netherlands Organization for Scientific Research (NWO), in particular a Vici grant (639.023.604). We also thank CSC, the Finnish IT Center for Science, for providing us with computing resources.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Tsivtsivadze, E., Pahikkala, T., Boberg, J., Salakoski, T., Heskes, T. (2010). Co-Regularized Least-Squares for Label Ranking. In: Fürnkranz, J., Hüllermeier, E. (eds) Preference Learning. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14125-6_6
DOI: https://doi.org/10.1007/978-3-642-14125-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14124-9
Online ISBN: 978-3-642-14125-6
eBook Packages: Computer Science (R0)