Abstract
We present an algorithm for grouping multi-part symbols, dashed lines, and character strings for extraction from line drawings. The image first undergoes a lossless raster-to-vector conversion, which produces an undirected graph, the so-called run graph, as its vector representation. Next, the image elements of the run graph are extracted and classified probabilistically by a decision tree operating on their geometric features. An area Voronoi tessellation of the members of the resulting sets is then constructed, and from it a neighborhood graph is derived that is guaranteed to be minimal and complete. This graph is traversed to group the members of the various sets for extraction and input to different recognition modules. No a priori font or other domain-specific information is required for the grouping, and no special geometrical relationships among the elements are assumed. Results are presented for example images taken from those used by our Swiss cadastral map understanding system.
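The last two stages described in the abstract (area Voronoi tessellation, then neighborhood graph traversal) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the elements are already labeled on a small raster, approximates the area Voronoi tessellation with a multi-source breadth-first search under city-block distance, and uses a hypothetical grouping rule (neighbors with the same class label are merged). The function names `area_voronoi`, `neighborhood_graph`, and `group_elements` are illustrative only.

```python
from collections import deque


def area_voronoi(labels):
    """Assign every background cell (0) to the nearest labeled element.

    Assumption: city-block distance on a raster, computed by a
    multi-source breadth-first search, stands in for the area
    Voronoi tessellation named in the abstract.
    """
    height, width = len(labels), len(labels[0])
    owner = [row[:] for row in labels]
    queue = deque(
        (r, c) for r in range(height) for c in range(width) if labels[r][c]
    )
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and owner[nr][nc] == 0:
                owner[nr][nc] = owner[r][c]
                queue.append((nr, nc))
    return owner


def neighborhood_graph(owner):
    """Connect two elements when their regions share a cell edge."""
    height, width = len(owner), len(owner[0])
    edges = set()
    for r in range(height):
        for c in range(width):
            a = owner[r][c]
            for nr, nc in ((r + 1, c), (r, c + 1)):
                if nr < height and nc < width and owner[nr][nc] != a:
                    b = owner[nr][nc]
                    edges.add((min(a, b), max(a, b)))
    return edges


def group_elements(edges, classes):
    """Traverse the graph, merging neighbors that share a class label.

    Assumption: "same class" is a hypothetical grouping criterion;
    the abstract does not state the criteria actually used.
    """
    adjacency = {element: set() for element in classes}
    for a, b in edges:
        if classes[a] == classes[b]:
            adjacency[a].add(b)
            adjacency[b].add(a)
    groups, seen = [], set()
    for start in sorted(classes):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(component))
    return groups


# Toy raster: three one-pixel elements on the middle row.
labels = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 2, 0, 0, 3],
    [0, 0, 0, 0, 0, 0, 0],
]
# Hypothetical classifier output for the three elements.
classes = {1: "character", 2: "character", 3: "dash"}

owner = area_voronoi(labels)
edges = neighborhood_graph(owner)
groups = group_elements(edges, classes)
print(groups)
```

In this toy input, elements 1 and 2 are Voronoi neighbors with the same class and end up in one group, while element 3 stays alone. Elements 1 and 3 are never compared directly because their regions do not touch, which is the reason for deriving the neighborhood graph before grouping.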
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
Cite this paper
Burge, M., Monagan, G. (1995). Extracting words and multi-part symbols in graphics rich documents. In: Braccini, C., DeFloriani, L., Vernazza, G. (eds) Image Analysis and Processing. ICIAP 1995. Lecture Notes in Computer Science, vol 974. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60298-4_310
DOI: https://doi.org/10.1007/3-540-60298-4_310
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60298-9
Online ISBN: 978-3-540-44787-0
eBook Packages: Springer Book Archive