
Sketch4Match: Content-Based Image Retrieval System Using Sketches

ABSTRACT:
Content-based image retrieval (CBIR) is one of the most popular and fastest-growing research areas in digital image processing. Most available image search tools, such as Google Images and Yahoo! Image Search, are based on textual annotation: images are manually annotated with keywords and then retrieved using text-based search methods. The performance of these systems is often unsatisfactory. The goal of CBIR is to extract the visual content of an image automatically, such as color, texture, or shape. This paper introduces the problems and challenges involved in designing and building a CBIR system based on free-hand sketches (sketch-based image retrieval, SBIR). Building on existing methods, we describe a possible way to design and implement a task-specific descriptor that can bridge the informational gap between a sketch and a full-color image, thereby enabling efficient search. The descriptor is constructed after a special sequence of preprocessing steps, so that the transformed full-color image and the sketch become comparable. We have studied the EHD, HOG, and SIFT descriptors. Experimental results on two sample databases were good. Overall, the results show that a sketch-based system gives users intuitive access to search tools. SBIR technology can be used in several applications, such as digital libraries, crime prevention, and photo-sharing sites. Such a system has great value in forensics and law enforcement for apprehending suspects and identifying victims; one possible application is matching a forensic sketch against a gallery of mug-shot images. Research on retrieving images based on the visual content of a query picture has intensified recently, drawing on a wide spectrum of image-processing methods.

Existing System:

In earlier systems, images were retrieved from large image databases in the following ways, which we discuss briefly:
- Automatic image annotation and retrieval using cross-media relevance models
- Concept-based query expansion
- A query system bridging the semantic gap for large image databases
- An ontology-based query expansion widget for information retrieval
- Detecting image purpose in World-Wide Web documents

Proposed System: Relevance feedback is an interactive process that starts with normal CBIR. The user inputs a query; the system extracts the image features, measures the distance to the images in the database, and generates an initial retrieval list. The user can then mark relevant images to further refine the query, and this process can be iterated until the user finds the desired images.

[Flow diagram: Input Query → Feature Extraction → Similarity Measure → Retrieval Result → "Found all images?" → Final Retrieval Result, with a feedback loop from User Feedback back to Query Update.]
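The feedback loop described above can be sketched in Python. The feature extractor, the Rocchio-style query update, and all parameter values below are illustrative assumptions, not the system's actual descriptor:

```python
import numpy as np

def extract_feature(image):
    # Placeholder feature extractor (hypothetical): a normalized
    # grayscale histogram stands in for descriptors such as EHD/HOG/SIFT.
    hist, _ = np.histogram(image, bins=16, range=(0, 256))
    return hist / max(hist.sum(), 1)

def retrieve(query_feat, db_feats, top_k=5):
    # Rank database images by Euclidean distance to the query feature.
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    return np.argsort(dists)[:top_k]

def relevance_feedback(query_feat, relevant_feats, alpha=0.5):
    # Rocchio-style update (an assumed refinement rule): move the query
    # toward the mean of the features the user marked as relevant.
    return (1 - alpha) * query_feat + alpha * relevant_feats.mean(axis=0)
```

In use, `retrieve` produces the initial list, the user selects relevant results, and `relevance_feedback` produces the updated query for the next iteration.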

Main Modules:

4.5 Indexing
4.5.1 Introduction
The whole set of images is indexed using the k-means clustering algorithm. Indexing is done through an implementation of the DocumentBuilder interface. A simple approach is to use the DocumentBuilderFactory, which creates DocumentBuilder instances for all available features as well as popular combinations of features (e.g., all MPEG-7 features, or all available features).

In a content-based image retrieval system, target images are ranked by feature similarity with respect to the query. For indexing, we propose k-means clustering to classify the feature set obtained from the histogram. The histogram provides the feature set used for CBIR; the method further refines the histogram by splitting the pixels in a given bucket into several classes, and we compute the similarity for both 8-bin and 16-bin histograms. Standard histograms are widely used for content-based image retrieval because of their efficiency and insensitivity to small changes. Their main disadvantage is that many images of different appearance can have similar histograms, since a histogram gives only a coarse characterization of an image.

The k-means algorithm takes an input parameter, k, and partitions a set of n objects into k clusters so that the resulting intra-cluster similarity is high while the inter-cluster similarity is low. Cluster similarity is measured with respect to the mean value of the objects in a cluster, which can be viewed as the cluster's center of gravity. How does the k-means algorithm work? First, it randomly selects k of the objects, each of which initially represents a cluster mean, or center. Each remaining object is assigned to the cluster to which it is most similar, based on the distance between the object and the cluster mean. The algorithm then computes the new mean for each cluster and repeats. The k-means algorithm:

Algorithm: k-means, partitioning based on the mean value of the objects in each cluster.
Input: the number of clusters k and a database containing n objects.
Output: a set of k clusters that minimizes the squared-error criterion.
Method:
(1) Arbitrarily choose k objects as the initial cluster centers.
(2) Repeat:
(3) (Re)assign each object to the cluster to which it is most similar, based on the mean value of the objects in the cluster.
(4) Update the cluster means, i.e., calculate the mean value of the objects in each cluster.
(5) Until no change.
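The steps above translate directly into Python. This is a generic sketch of k-means over points represented as tuples; it is not tied to any particular image feature set:

```python
import random

def kmeans(points, k, max_iter=100):
    """Plain k-means following steps (1)-(5) above (squared-error criterion)."""
    # (1) arbitrarily choose k objects as the initial cluster centers
    centers = random.sample(points, k)
    for _ in range(max_iter):                       # (2) repeat
        # (3) assign each object to the cluster with the nearest mean
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        # (4) update the cluster means (keep the old center if a cluster empties)
        new_centers = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        # (5) until no change
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters
```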

4.6 Annotation
4.6.1 Introduction
The central part of annotation is the so-called semantic description panel. It allows the user to define semantic objects such as agents, places, events, and times, which are saved on exit so they can be reused the next time annotation is started. These semantic objects can also be imported from an existing MPEG-7 file, allowing users to exchange objects and to edit and create them in a tool of their choice. Semantic objects are used to build the description by dragging and dropping them onto the blue panel with the mouse. Once the objects exist, they can be reused whenever pictures or series share the same context; this is especially true for objects representing persons, animals, and places, such as relatives, colleagues, friends, favorite pets, or locations like home or work. After dropping all the needed objects onto the blue panel, the user can interconnect them by drawing relations (visualized as arrows) between them using the middle mouse button. The directed graph generated through these interactions with Caliph can be saved as part of an MPEG-7 description. The challenging problem of real-time image annotation was addressed in the project titled Annotating Images by Mining Image Search Results.

Different from recently published generative and discriminative modeling approaches to image annotation, that work represents a new direction, since it relies on searching a very large collection of images with textual descriptions. The approach has three main steps: a search process, a mining process, and a filtering process. A large number of real-world images were used to test the method, and promising results are reported. Since most images posted on the Web are not indexed semantically, e.g., by keywords, concept-based image retrieval has depended on low-level signatures. Automatic Semantic Annotation of Real-World Web Images, by R. C. F. Wong and C. H. C. Leung, addresses this semantic gap with a novel method for automatic semantic annotation aimed at retrieving appropriate images in response to user-generated queries about image content. The main idea is to cluster images based on embedded image-capture metadata, including acquisition parameters such as camera properties and GPS information. A learning framework using decision trees is built on components of the acquisition-parameter vector, and the method is validated on over 100,000 web images from flickr.com and elsewhere.

4.7 Color Layout
4.7.1 Introduction
Color is one of the most widely used features in image retrieval. It is robust to background complication and invariant to image size and orientation. As stated in chapter 1, three major properties are usually considered in color-image similarity: area of matching, color distance, and spatial distribution. Area of matching is the most commonly used, because its idea is very clear and it can be represented accurately by histograms. In most histogram representations, the histogram entries lie in the selected color space.

The Color Layout Descriptor (CLD) represents the spatial distribution of colors in an image. The extraction process of the CLD consists of the following four stages:
(1) The image array is partitioned into 8x8 blocks.
(2) Representative colors are selected and expressed in the YCbCr color space.
(3) Each of the three components (Y, Cb, and Cr) is transformed by an 8x8 DCT (Discrete Cosine Transform).
(4) The resulting sets of DCT coefficients are zigzag-scanned, and the first few coefficients are nonlinearly quantized to form the descriptor.
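The four stages can be sketched as follows. The mean color per block, the BT.601 conversion constants, and the choice of six coefficients per channel are illustrative assumptions; the normative CLD also applies a nonlinear quantization, which is omitted here:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix.
    c = np.array([[np.cos((2 * x + 1) * u * np.pi / (2 * n)) for x in range(n)]
                  for u in range(n)])
    c *= np.sqrt(2.0 / n)
    c[0] *= np.sqrt(0.5)
    return c

def zigzag_indices(n=8):
    # Traverse the n x n coefficient grid in zigzag order.
    return sorted(((x, y) for x in range(n) for y in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def color_layout_descriptor(rgb, n_coeff=6):
    """CLD sketch: 8x8 partition -> mean color -> YCbCr -> DCT -> zigzag."""
    h, w, _ = rgb.shape
    tiny = np.zeros((8, 8, 3))
    for i in range(8):                              # stage 1: 8x8 partition
        for j in range(8):
            block = rgb[i * h // 8:(i + 1) * h // 8, j * w // 8:(j + 1) * w // 8]
            tiny[i, j] = block.reshape(-1, 3).mean(axis=0)   # representative color
    r, g, b = tiny[..., 0], tiny[..., 1], tiny[..., 2]
    # stage 2: RGB -> YCbCr (BT.601 approximation, an assumption)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + 0.564 * (b - y)
    cr = 128 + 0.713 * (r - y)
    c = dct_matrix()
    zz = zigzag_indices()
    desc = []
    for ch in (y, cb, cr):
        coeffs = c @ ch @ c.T                       # stage 3: 8x8 DCT per channel
        desc.extend(coeffs[i, j] for i, j in zz[:n_coeff])   # stage 4: first few
    return np.array(desc)
```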

[Fig 4.2 Color Layout: input picture → partitioning → representative color selection → DCT → zigzag scanning and weighting → binary marks.]

Fig 4.2 above shows the color layout extraction. The CLD is thus a very compact representation of the color layout and allows very fast searches in databases.

4.7.2 Color Spaces
There are many color spaces designed for different systems and standards, but most of them can be converted into one another by a simple transformation.
i. RGB (Red-Green-Blue): Digital images are normally represented in the RGB color space; it is the most commonly used color space in computers. It is a device-dependent color space, used in CRT monitors.

ii. CMY (Cyan-Magenta-Yellow), CMYK (CMY-Black): A subtractive color space for printing; it models the effect of color ink on white paper. The black component is used to enhance the effect of the black color.

iii. HSB (Hue-Saturation-Brightness): Used to model the properties of human perception. However, it is inconvenient for calculating color distance because of the discontinuity of hue at 360 degrees.
iv. YIQ, YCbCr, YUV: Used in television broadcast standards. Y is the luminance component.

4.7.3 Histograms
In color-based image retrieval, the histogram is the most commonly used representation of color features. Statistically, it exploits the property that images with similar content should have similar color distributions. However, histograms are not limited to describing the area of each color: the entries of a histogram can also represent other features. For instance, texture features can be represented by histograms formed from coefficients in a transformed domain, where the value of each histogram entry represents the intensity of the corresponding coefficient.

4.7.4 Cumulative Color Histogram

Because most color histograms are very sparse and sensitive to noise and global color shifting, the cumulative color histogram has been proposed as an alternative. Used with histogram intersection, it avoids the shortcomings of the simple histogram, since the cumulative histogram does not suffer from the shifting problem.
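A minimal sketch of the plain, intersected, and cumulative histograms described above; the bin count and normalization are illustrative choices:

```python
import numpy as np

def color_histogram(channel, bins=8):
    # Normalized histogram of one 8-bit channel (values in 0..255).
    hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
    return hist / channel.size

def histogram_intersection(h1, h2):
    # Classic similarity: sum of bin-wise minima; 1.0 means identical.
    return np.minimum(h1, h2).sum()

def cumulative_histogram(hist):
    # Cumulative form is less sensitive to small global color shifts,
    # since a shifted color moves mass between adjacent bins smoothly.
    return np.cumsum(hist)
```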

4.7.5 Color Sets
To facilitate fast searching over large-scale image collections and to improve the compactness of the histogram, Smith and Chang proposed color sets as an approximation to color histograms. They first transformed the RGB color space into a perceptually uniform space, such as CIE, and then quantized the transformed color space into M bins. A color set is defined as a selection of colors from the quantized color space.

Color Descriptors
Color is among the most distinguishing visual features in image and video retrieval. It is robust to changes in the background colors and is independent of image size and orientation. Many forms of color distribution and representation are adopted in MPEG-7, including color-spatial descriptors such as Color Layout and Color Structure, and color-quantization-based descriptors such as Scalable Color and Dominant Color.

Color Space Descriptor

This descriptor does not extract features from images. It defines a set of normative color spaces for interoperability between the various color descriptors. There are six normative color spaces: RGB, YCbCr, HSV, HMMD, monochrome, and a linear transformation matrix with reference to RGB.

Dominant Color Descriptor (DCD)
The DCD is a compact color descriptor designed for small storage and high-speed retrieval. The image feature is formed by a small number of representative colors, normally obtained using clustering and color quantization. The descriptor consists of the representative colors, their percentages in a region, the spatial coherency of the colors, and the color variance.

Scalable Color Descriptor (SCD)
The SCD is a color histogram in a uniformly quantized HSV color space, encoded with a Haar transform. It is scalable because the precision of the histogram bin values can vary from 16 to 1000 bits per bin for different requirements; the number of bins can also be specified. Simple histogram intersection can be used as a similarity measure by matching the Haar coefficients.

Color Layout Descriptor (CLD)
The MPEG-7 CLD represents the spatial distribution of color in an image, and it is a compact descriptor. It is formed by dividing the image into an 8x8 grid and obtaining the representative color of each tile. A Discrete Cosine Transform (DCT) is performed on the 8x8 block, and the DCT coefficients are used as the descriptor. Similarity between CLDs can be measured by the root of the squared differences of the matched coefficients. It should be noted that the CLD is represented in the frequency domain: the coefficients represent the visual pattern in the three separate YCbCr channels, so information about particular colors cannot be accessed directly, and an inverse DCT is needed to recover the spatial information.

Color Structure Descriptor (CSD)
The main purpose of the CSD is to express local color features in images. To this end, an 8x8 structuring block scans the image in a sliding-window fashion. With each shift of the structuring element, the number of times each particular color is contained in the structuring element is counted, and a color histogram is constructed in this way. Simple histogram intersection can be used as a similarity measure.
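The CSD's sliding-window counting can be sketched as follows, assuming the image has already been quantized to integer color indices; the normative HMMD quantization and bin re-quantization are omitted:

```python
import numpy as np

def color_structure_descriptor(quantized, n_colors, block=8):
    """CSD sketch: slide an 8x8 structuring element over a color-index
    image and count, per color, the number of window positions in which
    that color appears at least once."""
    h, w = quantized.shape
    csd = np.zeros(n_colors)
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            window = quantized[y:y + block, x:x + block]
            # Each color present in the window contributes one count,
            # regardless of how many of its pixels the window contains.
            for color in np.unique(window):
                csd[color] += 1
    # Normalize by the number of window positions.
    return csd / ((h - block + 1) * (w - block + 1))
```

Unlike a plain histogram, this counts spatial structure: a color scattered across the image scores higher than the same number of pixels packed into one corner.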

YCbCr and Y'CbCr are a practical approximation to color processing and perceptual uniformity, in which the primary colors corresponding roughly to red, green, and blue are processed into perceptually meaningful information. Subsequent image/video processing, transmission, and storage can then perform operations and introduce errors in perceptually meaningful ways. Y'CbCr separates out a luma signal (Y') that can be stored at high resolution or transmitted at high bandwidth, and two chroma components (Cb and Cr) that can be bandwidth-reduced, subsampled, compressed, or otherwise treated separately for improved system efficiency.

One practical example would be decreasing the bandwidth or resolution allocated to "color" compared to "black and white", since humans are more sensitive to the black-and-white information.
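The conversion implied above can be sketched with the common full-range BT.601 constants; the exact constants depend on which standard is in use:

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion from R'G'B' (0-255) to Y'CbCr.
    Y' carries the luma ("black and white") signal; Cb and Cr carry
    the chroma, which can be subsampled with little perceived loss."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```

Neutral colors (black, white, grays) map to Cb = Cr = 128, which is why the chroma planes compress so well for largely achromatic content.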

4.8 Edge Histogram
The Edge Histogram Descriptor (EHD) represents the spatial distribution of edges in an image. The extraction process of the EHD consists of the following stages:
(1) The image array is divided into 4x4 sub-images.
(2) Each sub-image is further partitioned into non-overlapping square image blocks whose size depends on the resolution of the input image.
(3) The edges in each image block are categorized into one of six types: vertical, horizontal, 45-degree diagonal, 135-degree diagonal, non-directional edge, and no edge.
(4) A 5-bin edge histogram of each sub-image is then obtained; each bin value is normalized by the total number of image blocks in the sub-image.
(5) The normalized bin values are nonlinearly quantized.

The five types of edge histogram are: vertical edge, horizontal edge, 45-degree edge, 135-degree edge, and non-directional edge.
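The stages above can be sketched as follows. The 2x2 filter coefficients follow the commonly cited MPEG-7 EHD filters, while the block size, the threshold, and the quadrant averaging are illustrative assumptions (the nonlinear quantization of stage 5 is omitted):

```python
import numpy as np

# 2x2 edge filters of the MPEG-7 EHD: vertical, horizontal,
# 45-degree, 135-degree, and non-directional.
FILTERS = {
    "vertical":        np.array([[1.0, -1.0], [1.0, -1.0]]),
    "horizontal":      np.array([[1.0, 1.0], [-1.0, -1.0]]),
    "diag45":          np.array([[np.sqrt(2), 0.0], [0.0, -np.sqrt(2)]]),
    "diag135":         np.array([[0.0, np.sqrt(2)], [-np.sqrt(2), 0.0]]),
    "non_directional": np.array([[2.0, -2.0], [-2.0, 2.0]]),
}

def edge_histogram(gray, block=8, threshold=10.0):
    """EHD sketch: split the image into 4x4 sub-images, classify each
    image block by its strongest edge-filter response, and build a
    normalized 5-bin histogram per sub-image (16 x 5 = 80 bins)."""
    h, w = gray.shape
    hist = np.zeros((4, 4, 5))
    counts = np.zeros((4, 4))
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            blk = gray[y:y + block, x:x + block]
            # Average each quadrant of the block into a 2x2 "super pixel".
            half = block // 2
            quad = np.array([[blk[:half, :half].mean(), blk[:half, half:].mean()],
                             [blk[half:, :half].mean(), blk[half:, half:].mean()]])
            responses = [abs((quad * f).sum()) for f in FILTERS.values()]
            si, sj = min(4 * y // h, 3), min(4 * x // w, 3)
            counts[si, sj] += 1
            if max(responses) > threshold:      # below threshold: "no edge"
                hist[si, sj, int(np.argmax(responses))] += 1
    # Normalize each sub-image histogram by its number of image blocks.
    return (hist / np.maximum(counts[..., None], 1)).reshape(80)
```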

Fig 4.3 Five types of edge histograms

In fig 4.3, (a) shows the vertical edge, (b) the horizontal edge, (c) the 45-degree edge, (d) the 135-degree edge, and (e) the non-directional edge.

REFERENCE:
B. Szanto, P. Pozsegovics, Z. Vamossy, and Sz. Sergyan, "Sketch4Match: Content-Based Image Retrieval System Using Sketches," SAMI 2011, 9th IEEE International Symposium on Applied Machine Intelligence and Informatics, IEEE, January 2011.
