
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4597)


Abstract

Clustering algorithms for multidimensional numerical data must overcome special difficulties caused by irregularities in the data distribution. We present a clustering algorithm for numerical data that combines ideas from random projection techniques and density-based clustering. The algorithm has two phases: the first uses random projections to detect clusters, and the second post-processes the clusters obtained from several random projections. Experiments were performed on synthetic data consisting of randomly generated points in ℝⁿ, synthetic images containing randomly distributed colored regions, and real images. Our results suggest the potential of our algorithm for image segmentation.
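The two-phase idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given here. It assumes a simple stand-in for each phase: dense regions on each random one-dimensional projection are found with a fixed-width histogram, and the per-projection labels are combined by grouping points that receive identical labels on every projection. All function names and parameters (`random_unit_vector`, `dense_runs_1d`, `cluster_by_random_projections`, `bins`, `min_count`) are hypothetical.

```python
import math
import random
from collections import Counter


def random_unit_vector(dim, rng):
    """Draw a random direction by normalising a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def dense_runs_1d(values, bins=20, min_count=5):
    """Label each value by the run of consecutive dense bins it falls in.

    A bin is "dense" when it holds at least min_count values (an assumed,
    simplified density criterion). Values in sparse bins get the label -1.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against all-equal values
    idx = [min(int((v - lo) / width), bins - 1) for v in values]
    counts = Counter(idx)
    run_of_bin, run = {}, -1
    for b in range(bins):
        if counts[b] >= min_count:
            # Start a new run whenever the previous bin was not dense.
            if b == 0 or counts[b - 1] < min_count:
                run += 1
            run_of_bin[b] = run
    return [run_of_bin.get(b, -1) for b in idx]


def cluster_by_random_projections(points, n_projections=8, bins=20,
                                  min_count=5, seed=0):
    """Phase 1: label points on several random 1-D projections.
    Phase 2 (simplified): merge by identical label signatures."""
    rng = random.Random(seed)
    dim = len(points[0])
    signatures = [[] for _ in points]
    for _ in range(n_projections):
        d = random_unit_vector(dim, rng)
        projected = [sum(x * c for x, c in zip(p, d)) for p in points]
        labels = dense_runs_1d(projected, bins, min_count)
        for sig, label in zip(signatures, labels):
            sig.append(label)
    ids = {}
    return [ids.setdefault(tuple(sig), len(ids)) for sig in signatures]


if __name__ == "__main__":
    rng = random.Random(1)
    blob_a = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
    blob_b = [(rng.gauss(50, 1), rng.gauss(50, 1), rng.gauss(50, 1)) for _ in range(100)]
    labels = cluster_by_random_projections(blob_a + blob_b)
    print(Counter(labels).most_common(5))
```

Requiring identical labels on every projection is a deliberately strict combination rule. It tends to split clusters at their sparse boundaries, which is one reason a real post-processing phase would need something more tolerant, such as a consensus over the per-projection partitions.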

Editor information

Petra Perner

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Urruty, T., Djeraba, C., Simovici, D.A. (2007). Clustering by Random Projections. In: Perner, P. (ed.) Advances in Data Mining. Theoretical Aspects and Applications. ICDM 2007. Lecture Notes in Computer Science (LNAI), vol 4597. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73435-2_9

  • DOI: https://doi.org/10.1007/978-3-540-73435-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73434-5

  • Online ISBN: 978-3-540-73435-2

  • eBook Packages: Computer Science, Computer Science (R0)
