Unit III ROBOT SENSORS
Transducers and Sensors – Tactile sensor – Proximity and range sensors – Sensing joint forces – Robotic vision
system – Image Representation – Image Grabbing – Image processing and analysis – Edge Enhancement – Contrast
Stretching – Band Ratioing – Image segmentation – Pattern recognition – Training of vision system.
A transducer is a device connected to a sensor to convert the measured quantity into a standard electrical signal
such as 0-10 V DC, -10 to +10 V DC, 4-20 mA, 0-20 mA, 0-25 mA, etc. The output of the transducer can be used
directly by the system designer.
Transducers are used in electronic communication systems to convert signals of different physical forms into
electronic signals. In an audio system, for example, two transducers are used: a microphone as the first (input)
transducer and a loudspeaker as the second (output) transducer.
Various types of sensors and transducers are available to choose from, such as analog, digital, input and output.
The type of input or output transducer being used depends on the kind of signal being sensed or controlled, but in
general both a sensor and a transducer convert one physical quantity into another.
A device that performs an input function is called a sensor because it senses a physical change in some
characteristic in response to an excitation. A transducer is also a device that converts energy from one form to
another; examples of transducers are the microphone and the loudspeaker.
Sensor Characteristics
1. Static characteristics
2. Dynamic characteristics
Static Characteristics
Range, span, error, accuracy, sensitivity, linearity, non-linearity, repeatability, reproducibility, stability,
dead band/dead time, resolution, zero drift, output impedance.
Dynamic Characteristics
Response time, time constant, rise time, settling time.
Types of Sensors
1. Displacement sensors
1. Potentiometer displacement sensors
2. Strain gauge displacement sensors
3. Capacitive displacement sensors
4. Inductive displacement sensors
2. Position sensors
1. Potentiometer
2. Capacitive sensor
3. Inductive position sensor
4. Hall effect sensors
5. Photoelectric sensor
6. Optical encoders
3. Tactile sensors
1. Force/torque sensor
2. Dynamic sensor
3. Thermal sensor
4. Velocity and motion sensors
1. Incremental encoders
2. Tachogenerator
3. Pyroelectric sensors
5. Proximity sensors
1. Optical encoders
2. Capacitive sensor
3. Hall effect sensors
4. Inductive position sensor
5. Eddy current proximity sensors
6. Pneumatic proximity sensors
7. Proximity switches
6. Liquid flow sensors
1. Orifice meter
2. Venturi meter
3. Turbine flow meter
Tactile sensor:
A tactile sensor is a device that measures information arising from physical interaction with its environment.
The sense of touch in humans is generally modelled as having two components: the cutaneous sense and the
kinesthetic sense. Cutaneous touch can detect stimuli resulting from mechanical stimulation, pain and
temperature. Kinesthetic touch receives sensor inputs from receptors inside the muscles, tendons and joints.
Force/Torque Sensor
If the geometry of the contact is defined and a single-point contact is assumed, a force/torque sensor can give
information about the location of the contact and the forces and moments acting there; this is called intrinsic
tactile sensing. The image of the torque sensor is shown below.
Dynamic Sensor
Dynamic sensors are small accelerometers at the fingertips or in the skin of the robotic finger. They function much
like the Pacinian corpuscles in humans and have a similarly large receptive field; thus one or two skin
accelerometers are sufficient for an entire finger. These sensors effectively detect the making and breaking of
contact and the vibrations associated with sliding over textured surfaces.
A stress-rate sensor is the second type of dynamic tactile sensor. If the fingertip slides at a speed of a few cm/s
over small bumps or pits in a surface, the transient stress changes in the skin become important. A piezoelectric
polymer such as PVDF, which produces charge in response to strain, can be used to produce a current that is
directly proportional to the rate of change of stress.
Thermal Sensor
Thermal sensors are important to the human ability to identify the material an object is made of, and some are
used in robotics as well. Thermal sensing involves detecting thermal gradients in the skin, which correspond to
both the temperature and the thermal conductivity of an object. Robotic thermal sensors typically involve Peltier
junctions in combination with thermistors.
Proximity Sensors
A proximity sensor detects the presence of a nearby object without any physical contact. Proximity sensors are
also used in machine vibration monitoring to measure the variation in distance between a shaft and its support
bearing. This is common in large steam turbines, compressors, and motors that use sleeve-type bearings.
Types of Proximity Sensors
1. Optical encoders
2. Hall effect sensors
3. Capacitive sensors
4. Eddy current proximity sensors
5. Inductive proximity sensors
6. Pneumatic proximity sensors
7. Proximity switches
Optical encoders
Optical encoders are devices that convert a mechanical position into a representative electrical signal by means of
a patterned disk or scale, a light source and photosensitive elements. With proper interface electronics, position
and speed information can be derived. Encoders can be classified as rotary or linear for measurements of
respectively angular and linear displacements. Rotary encoders are available as housed units with shaft and ball-
bearings or as "modular" encoders which are usually mounted on a host shaft (e.g. at the end of a motor).
Incremental and Absolute Encoders
Incremental encoders
The incremental encoder, sometimes called a relative encoder, is simpler in design than the absolute
encoder. It consists of two tracks and two sensors whose outputs are called channels A and B. As the shaft rotates,
pulse trains occur on these channels at a frequency proportional to the shaft speed, and the phase relationship
between the signals yields the direction of rotation. The code disk pattern and output signals A and B are illustrated.
By counting the number of pulses and knowing the resolution of the disk, the angular motion can be
measured. The A and B channels are used to determine the direction of rotation by assessing which channel leads
the other. The signals from the two channels are a quarter of a cycle out of phase with each other and are known as
quadrature signals. Often a third output channel, called INDEX, yields one pulse per revolution, which is useful in
counting full revolutions. It is also useful as a reference to define a home base or zero position.
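As an illustration of how the quadrature relationship between channels A and B yields both count and direction, the following sketch shows a simple software decoder (hypothetical; real systems usually decode in hardware or firmware, and the counts-per-revolution figure is assumed):

```python
# Minimal software quadrature (x4) decoder - illustrative sketch only.
# Each valid transition of the (A, B) state adds +1 or -1 depending on
# which channel leads, giving four counts per quadrature cycle.
TRANSITIONS = {
    (0, 0): {(0, 1): +1, (1, 0): -1},
    (0, 1): {(1, 1): +1, (0, 0): -1},
    (1, 1): {(1, 0): +1, (0, 1): -1},
    (1, 0): {(0, 0): +1, (1, 1): -1},
}

def decode(samples, counts_per_rev=1024):
    """samples: sequence of (A, B) logic levels, sampled fast enough that
    at most one transition occurs between consecutive samples."""
    position = 0
    prev = samples[0]
    for state in samples[1:]:
        if state != prev:
            position += TRANSITIONS[prev].get(state, 0)  # 0 = illegal jump, ignored
            prev = state
    angle_deg = 360.0 * position / counts_per_rev
    return position, angle_deg

# One full quadrature cycle in the forward direction -> +4 counts
forward = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(decode(forward))   # (4, 1.40625)
```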
Absolute Encoders:
The optical disk of the absolute encoder is designed to produce a digital word that distinguishes 2^N distinct
positions of the shaft, where N is the number of tracks. For example, if there are 8 tracks, the encoder is capable of
producing 256 distinct positions, an angular resolution of 1.406 (360/256) degrees.
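A small sketch of the resolution arithmetic above (a plain binary code word is assumed here; practical absolute encoders usually use Gray code so that only one track changes at a time):

```python
def absolute_encoder_angle(code_word, tracks=8):
    """Convert an absolute encoder code word (integer read from the disk's
    photodetectors) into a shaft angle in degrees."""
    positions = 2 ** tracks            # 8 tracks -> 256 distinct positions
    resolution = 360.0 / positions     # 1.40625 degrees per position
    return code_word * resolution, resolution

angle, res = absolute_encoder_angle(64, tracks=8)
print(angle, res)   # 90.0 degrees, resolution 1.40625 degrees
```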
Capacitive Sensor
A capacitive proximity sensor detects the change in capacitance between its sensing electrode and the surroundings
as an object approaches; unlike inductive sensors, it can detect both metallic and non-metallic targets.
PROXIMITY SWITCHES
Proximity switches are used to detect the presence of an object; they produce an output that is either on or off
depending on whether an object is present. They may be of two types:
1. Contact type
2. Non-contact type
Non-contact proximity switches include the magnetic reed switch, the photoelectric sensor and the inductive
proximity sensor.
Range Sensors
Range sensors are devices that capture the 3D structure of the world from the viewpoint of the sensor, usually
measuring the depth to the nearest surfaces. These measurements could be at a single point, across a scanning
plane, or a full image with depth measurements at every point. The benefit of this range data is that a robot can be
relatively certain where the real world is relative to the sensor, allowing it to more reliably find navigable routes,
avoid obstacles, grasp objects and act on industrial parts.
Range sensing basics
1) the basic representations used forrange image data, 2) a brief introduction to the main 3Dsensors that are less
commonly used in robotics applica-tions and 3) a detailed presentation of the more commonlaser-baser range
image sensors
The distance between the object and the robot hand is measured using range sensors within their range of
operation. The distance is calculated by visual processing. Range sensors find use in robot navigation and in
avoiding obstacles in the path. Finding the location and general shape characteristics of a part in the work
envelope of the robot is a more specialized application of range sensors. There are several approaches, such as the
triangulation method, the structured-lighting approach and time-of-flight range finders. In these cases the source
of illumination can be a light source, a laser beam or an ultrasonic source.
Triangulation Method:
This is the simplest of the techniques and is easily demonstrated in the figure. The object is swept over by a
narrow beam of sharp light. A sensor focussed on a small spot of the object surface detects the reflected beam of
light. If θ is the angle made by the illuminating source and b is the distance between the source and the sensor,
the distance d of the object from the sensor is given as
d = b tan θ
The distance d can be easily transformed into 3D coordinates.
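The relation d = b tan θ can be evaluated directly; the baseline length and angle below are hypothetical values chosen only to illustrate the calculation:

```python
import math

def triangulation_distance(baseline_b, theta_deg):
    """Distance of the illuminated spot, given the source-to-sensor
    baseline b and the angle theta made by the illuminating source."""
    return baseline_b * math.tan(math.radians(theta_deg))

# Example: 0.5 m baseline, source inclined at 60 degrees
print(triangulation_distance(0.5, 60.0))   # ~0.866 m
```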
Specific range values are computed by first calibrating the system. One of the simplest arrangements is shown in
the figure, which represents a top view of the previous figure. In this arrangement, the light source and camera are
placed at the same height, and the sheet of light is perpendicular to the line joining the origin of the light sheet and
the center of the camera lens. We call the vertical plane containing this line the reference plane. Clearly, the
reference plane is perpendicular to the sheet of light, and any vertical flat surface that intersects the sheet will
produce a vertical stripe of light in which every point will have the same perpendicular distance to the reference
plane. The objective of the arrangement shown in the figure is to position the camera so that every such vertical
stripe also appears vertical in the image plane. In this way, every point in the same column of the image will be
known to have the same distance to the reference plane.
Image Processing and Analysis
Image data reduction: The objective is to reduce the volume of data. As a preliminary step in the data analysis,
the following two schemes have found common usage for data reduction:
1. Digital conversion
2. Windowing
Digital conversion reduces the number of gray levels used by the machine vision system. For example, an 8-bit
register used for each pixel would allow 2^8 = 256 gray levels. Depending on the requirements of the application,
digital conversion can be used to reduce the number of gray levels by using fewer bits to represent the pixel light
intensity. Four bits would reduce the number of gray levels to 16. This kind of conversion significantly reduces
the magnitude of the image-processing problem.
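A minimal sketch of digital conversion by gray-level reduction, assuming the image is already available as an 8-bit NumPy array (the random frame below merely stands in for a grabbed image):

```python
import numpy as np

def reduce_gray_levels(image, bits=4):
    """Requantize an 8-bit grayscale image to the given number of bits
    (4 bits -> 16 gray levels), discarding the least significant bits."""
    shift = 8 - bits
    return (image.astype(np.uint8) >> shift) << shift

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # stand-in image
reduced = reduce_gray_levels(frame, bits=4)
print(len(np.unique(reduced)))   # at most 16 distinct gray levels remain
```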
Windowing involves using only a portion of the total image stored in the frame buffer for image processing
and analysis. This portion is called the window. For example, for inspection of printed circuit boards, one may wish
to inspect and analyze only one component on the board. A rectangular window is selected to surround the
component of interest and only pixels within the window are analysed. The rationale for windowing is that proper
recognition of an object involves only certain portions of the total scene.
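Windowing amounts to processing only a sub-array of the stored frame; the window coordinates below are hypothetical, imagined to surround one component on a circuit-board image:

```python
import numpy as np

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)   # full stored image

row0, row1 = 100, 180     # top and bottom rows of the window
col0, col1 = 250, 360     # left and right columns of the window
window = frame[row0:row1, col0:col1]    # only these pixels are analysed further

print(frame.size, "pixels in the frame,", window.size, "pixels in the window")
```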
Segmentation: Segmentation is a general term which applies to various methods of data reduction. In
segmentation, the objective is to group areas of an image having similar characteristics or features into distinct
entities representing parts of the image. For example, boundaries (edges) or regions (areas) represent two natural
segments of an image. There are many ways to segment an image. Three important techniques are:
1. Threshold
2. Region growing
3. Edge detection
Threshold: Thresholding is a binary conversion technique in which each pixel is converted into a binary value,
either black or white. This is accomplished by utilizing a frequency histogram of the image and establishing what
intensity (gray level) is to be the border between black and white. Since it is necessary to differentiate between the
object and the background, the procedure is to establish a threshold and assign, for example, a binary bit 1 for the
object and 0 for the background. To improve the ability to differentiate, special lighting techniques must often be
applied to generate a high contrast.
When it is not possible to find a single threshold for an entire image (for example, if many different objects occupy
the same scene, each having a different level of intensity), one approach is to partition the total image into smaller
rectangular areas and determine the threshold for each window being analyzed.
Once a threshold is established for a particular image, the next step is to identify the particular areas associated
with objects within the image.
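A minimal thresholding sketch along the lines described above; the fixed threshold of 128 is an assumption, and a real system would pick the level from the frequency histogram of the scene:

```python
import numpy as np

def threshold_image(image, level=None):
    """Binary conversion: pixels brighter than `level` become 1 (object),
    the rest become 0 (background). If no level is given, a crude choice
    is the midpoint between the darkest and brightest pixel values."""
    if level is None:
        level = (int(image.min()) + int(image.max())) // 2
    return (image > level).astype(np.uint8)

img = np.random.randint(0, 256, (480, 640), dtype=np.uint8)   # stand-in image
binary = threshold_image(img, level=128)
print(binary.mean())   # fraction of pixels labelled as object
```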
Region growing: Region growing is a collection of segmentation techniques in which pixels are grouped into
regions called grid elements based on attribute similarities. Defined regions can then be examined as to whether
they are independent or can be merged with other regions by means of an analysis of the difference in their
average properties and spatial connectivity. To differentiate between the objects and the background, assign 1 to
any grid element occupied by an object and 0 to background elements. It is common practice to use a square
sampling grid with pixels spaced equally along each side of the grid. For the two-dimensional image of a key, this
would give the pattern indicated in the figure below. This technique of creating runs of 1s and 0s is often used as a
first-pass analysis to partition the image into identifiable segments or blobs. The region-growing segmentation
technique is applicable when images cannot be distinguished from each other by straight thresholding or edge
detection techniques.
Image segmentation a) Image pattern with grid b) Segmented image after runs test
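A minimal region-growing sketch over a binary grid such as the one pictured above: starting from a seed grid element, 4-connected neighbours with the same value are merged into one region (the tiny grid is illustrative only):

```python
def grow_region(grid, seed_row, seed_col):
    """Return the set of (row, col) grid elements that are 4-connected to
    the seed and share its value."""
    target = grid[seed_row][seed_col]
    region, frontier = set(), [(seed_row, seed_col)]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in region:
            continue
        region.add((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]) and grid[rr][cc] == target:
                frontier.append((rr, cc))
    return region

grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
print(sorted(grow_region(grid, 1, 1)))   # the three 1-elements form one blob
```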
Edge detection: This technique considers the intensity change that occurs in the pixels at the boundary or edges
of a part. Given that a region of similar attributes has been found but the boundary shape is unknown, the
boundary can be determined by a simple edge-following procedure. For the binary image shown in the figure, the
procedure is to scan the image until a pixel within the region is encountered. For a pixel within the region, turn
left and step; otherwise, turn right and step. The procedure is stopped when the boundary has been traversed and
the path has returned to the starting pixel.
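The turn-left/turn-right rule can be written down almost verbatim; the sketch below follows the boundary of a small binary image and stops when the starting pixel is re-entered with the original heading (a stopping rule assumed here to keep the loop finite):

```python
def trace_boundary(image, max_steps=10000):
    """Square-tracing edge follower for a binary image (list of lists of 0/1).
    From the first object pixel found by scanning: if the current pixel is
    inside the region, record it, turn left and step; otherwise turn right
    and step."""
    rows, cols = len(image), len(image[0])
    start = next((r, c) for r in range(rows) for c in range(cols) if image[r][c])
    dirs = [(-1, 0), (0, 1), (1, 0), (0, -1)]     # up, right, down, left
    r, c = start
    d = 0                                         # initial heading: up
    boundary = []
    for _ in range(max_steps):
        if 0 <= r < rows and 0 <= c < cols and image[r][c]:
            if (r, c) not in boundary:
                boundary.append((r, c))
            d = (d - 1) % 4                       # inside the region: turn left
        else:
            d = (d + 1) % 4                       # outside the region: turn right
        r, c = r + dirs[d][0], c + dirs[d][1]     # step forward
        if (r, c) == start and d == 0:            # back at the start, same heading
            break
    return boundary

img = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 0]]
print(trace_boundary(img))   # [(1, 1), (1, 2), (2, 2), (2, 1)]
```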
Feature Extraction: In machine vision applications, it is often necessary to distinguish one object from another.
This is usually accomplished by means of features that uniquely characterize the object. Some features of objects
that can be used in machine vision include area, diameter and perimeter. A feature, in the context of vision
systems, is a single parameter that permits ease of comparison and identification. The techniques available to
extract feature values for two dimensional cases can be roughly categorized as those that deal with boundary
features and those that deal with area features. The various features can be used to identify the object or part and
determine the part location and/or orientation.
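As a sketch of two such features, the function below computes the area and a simple perimeter count for a binary blob (4-connectivity is assumed; commercial systems use more refined measures):

```python
import numpy as np

def area_and_perimeter(binary):
    """binary: numpy array of 0/1. Area is the number of object pixels;
    perimeter is the number of object pixels with at least one
    4-connected background neighbour."""
    area = int(binary.sum())
    padded = np.pad(binary, 1)                              # zero border
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:]).astype(bool)
    perimeter = int((binary.astype(bool) & ~interior).sum())
    return area, perimeter

blob = np.zeros((6, 6), dtype=np.uint8)
blob[1:5, 1:5] = 1                                          # a 4x4 square blob
print(area_and_perimeter(blob))                             # (16, 12)
```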
Object Recognition: The recognition algorithm must be powerful enough to uniquely identify the object. Object
recognition techniques are classified into:
1. Template-matching technique
2. Structural technique.
The basic problem in template matching is to match the object with a stored pattern feature set defined as
a model template. The model template is obtained during the training procedure in which the vision system is
programmed for known prototype objects. The features of the object in the image (e.g., area, diameter, aspect ratio)
are compared to the corresponding stored values. These values constitute the stored template. When a match is
found, allowing for certain statistical variations in the comparison process, then the object has been properly
classified.
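A minimal template-matching sketch in the feature-comparison sense described above; the feature names, stored template values and tolerance are all hypothetical:

```python
def match_template(measured, templates, tolerance=0.10):
    """Compare measured feature values against stored model templates;
    the part is classified when every feature lies within the given
    relative tolerance of the stored value."""
    for name, stored in templates.items():
        if all(abs(measured[f] - v) <= tolerance * v for f, v in stored.items()):
            return name
    return "unknown"

# Model templates obtained during the training procedure (hypothetical values)
templates = {
    "nut":    {"area": 310.0, "perimeter": 72.0},
    "bolt":   {"area": 950.0, "perimeter": 210.0},
    "flange": {"area": 2400.0, "perimeter": 330.0},
}
print(match_template({"area": 930.0, "perimeter": 205.0}, templates))   # bolt
```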
Structural techniques of pattern recognition consider relationships between features or edges of an object.
For example, if the image of an object can be subdivided into four straight lines (called primitives) connected at
their end points, and the connected lines meet at right angles, then the object is a rectangle. The majority of
commercial robot vision systems make use of this approach for the recognition of two-dimensional objects. The
recognition algorithms are used to identify each segmented object in an image and assign it to a classification
(e.g., nut, bolt, flange, etc.).
Application:
The current applications of machine vision in robotics include inspection, part identification, location and
orientation.
Image representation:
The image representation is the format in which the data from a light sensor is stored for safekeeping or further
processing. There are hundreds of formats in use, either proprietary or public domain, each with advantages and
disadvantages with respect to storage size, pixel accessibility, abstraction level and usage. Generally, one can
distinguish two major representations: DCG or vector based, and pixel based. A DCG or vector-based
representation makes use of two- or three-dimensional abstract objects, e.g. lines, cubes and rectangles, to build
up the image of an object. Each feature is then stored with its drawing command and all its parameters and styles.
The drawing process is done by interpreting the commands and executing them. The storage space required
depends on the complexity of the image.
The size of a vector-based image is almost linearly dependent on the amount of detail represented. Each feature of
a represented object is usually represented by its edges.
Vector or DCG based objects are widely used for CAD and virtual reality. One very common type is the PostScript
image. Generally, this representation is very efficient for man-made objects.
The DCG representation allows for platform- and graphic-device-independent storage. This independence gives
the drawing process the ability to adjust the image to the available output format, e.g. colour optimization,
customization of resolution and viewport. The viewport is of special interest for 3D imaging: in this case, the
viewer may adjust the camera position and the viewport interactively. Some interpreters, e.g. PostScript, even
allow programs
to be stored into the image, which might be used for simulation and moving images. The format is very efficient in
storage size for man-made drawings and objects, in general for objects which can be represented by prototypes.
Since the drawing is a combined interpretation and displaying process, this system does require computing
resources. It also shifts control of the displaying process to the end device; the output image layout is therefore
less predictable. This format is less efficient for free-form drawings or non-man-made objects because the storage
size depends on the ability to represent the object algorithmically in drawing commands. This cannot be done
efficiently for handwriting, plants or content that is essentially pixel based.
The other format of representation is the pixel-based representation. For the pixel representation, the object is
sampled in either one or two dimensions. This is done by mapping it onto a plane, which is quantized into
rectangular or square samples of some known dimension. Once acquired, the image is fixed like a photograph and
can be stored as an array of numbers. The type of the numbers has to be defined at the acquisition stage, because
information lost here cannot be recovered later.
The storage size depends on the resolution required. Without compression, the image size depends only on the
resolution and the sampled area size, not on the image content; storage space not used by an object is used by the
background.
Simple images contain large monotone areas, usually background or surfaces. To reduce storage size, such images
can be compressed by the usual mathematical compression methods with some success. In this case, increasing
the image complexity does increase the image size, since the mathematical redundancy decreases.
For uncompressed images, increasing the complexity inside the image does not change the overall image size. This
representation is compatible with most output devices, so the displaying process requires no complex processing;
in most cases, resizing and colour adjustment are all that is required, which makes it a fast and efficient
displaying format.
Since the image is not an exact reproduction of the original object, there is no way of rescaling or zooming into
details: the resolution is fixed at the point of acquisition and can only get worse.
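The storage argument can be made concrete: for an uncompressed pixel representation the size follows directly from the resolution and the pixel depth, independent of what the image shows (the figures below are simple arithmetic, not benchmarks):

```python
def uncompressed_size_bytes(width, height, bits_per_pixel):
    """Storage needed for an uncompressed pixel-based image: it depends
    only on resolution and pixel depth, never on the image content."""
    return width * height * bits_per_pixel // 8

print(uncompressed_size_bytes(640, 480, 8))    # 307200 bytes, 8-bit grayscale
print(uncompressed_size_bytes(640, 480, 24))   # 921600 bytes, 24-bit colour
```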
Image Grabbing
A frame grabber is an electronic device that captures (i.e., "grabs") individual, digital still frames from an analog
video signal or a digital video stream. It is usually employed as a component of a computer vision system, in which
video frames are captured in digital form and then displayed, stored, transmitted, analyzed, or combinations of
these.
Historically, frame grabber expansion cards were the predominant way to interface cameras to PCs. Other interface
methods have emerged since then, with frame grabbers (and in some cases, cameras with built-in frame grabbers)
connecting to computers via interfaces such as USB, Ethernet and IEEE 1394 ("FireWire"). Early frame grabbers
typically had only enough memory to store a single digitized video frame, whereas many modern frame grabbers
can store multiple frames.
Modern frame grabbers often are able to perform functions beyond capturing a single video input. For example,
some devices capture audio in addition to video, and some can concurrently capture frames from multiple video
inputs. Other operations may be performed as well, such as deinterlacing, text or graphics overlay, image
transformations (e.g., resizing, rotation, mirroring), and conversion to JPEG or other compressed image
formats. To satisfy the technological demands of applications such as radar acquisition, manufacturing and remote
guidance, some frame grabbers can capture images at high frame rates, high resolutions, or both.
Frame Grabbers
Hundreds of frame grabbers are on the market to allow integration of digital and analog machine-vision cameras
with host computers. Varying features make each one unique for a particular application. When evaluating frame
grabbers for a specific application, developers must be aware of specific camera types (either digital or analog), the
sensor types they use (area or line scan), their systems requirements, and the cost of peripherals.
Digital Frame Grabbers
To guarantee low latency between image acquisition and processing, frame grabbers with digital camera interfaces
such as Camera Link cameras are often used, especially in high-speed semiconductor applications. The Camera Link
standard’s Full configuration allows a maximum of 680 Mbytes/s (64-bit at 85 MHz), currently the highest
bandwidth on the market. High-speed applications can also benefit from onboard tap reordering that can be
accomplished with many frame grabbers. This removes the burden of recomposing an entire image from complex
multi tap cameras (several simultaneous data channels) from the host computer. Other features, such as the
recently available power-over-Camera Link standard, offer simpler integration (a single cable for power and data)
of compatible Camera Link cameras when this feature is available on the frame grabber.
Analog Frame Grabbers
Even with established standards such as RS-170, NTSC, CCIR, and PAL, great differences exist among analog frame
grabbers. Differences appear through jitter, digitization quality, and color separation, all of which affect image
quality. However, because it is difficult to compare frame grabbers from datasheet specifications, many OEMs
benchmark several models before making a selection.
Some analog frame grabbers can handle multiple cameras either through multiple analog interfaces or multiplexing
techniques, thus reducing the number of frame grabbers used in multi camera systems. When multiplexing video
on a frame grabber, the resynchronization time required with each switching reduces the total frame rate. In the
case of a multiple simultaneous input configuration, onboard memory will guarantee that images are transferred
without loss of data.
Applications
Healthcare
Frame grabbers are used in medicine for many applications including telenursing and remote guidance. In
situations where an expert at another location needs to be consulted, frame grabbers capture the image or video
from the appropriate medical equipment so it can be sent digitally to the distant expert.
Manufacturing
"Pick and place" machines are often used to mount electronic components on circuit boards during the circuit board
assembly process. Such machines use one or more cameras to monitor the robotics that places the components.
Each camera is paired with a frame grabber that digitizes the analog video, thus converting the video to a form that
can be processed by the machine software.
Network Security
Frame grabbers may be used in security applications. For example, when a potential breach of security is detected,
a frame grabber captures an image or a sequence of images, and then the images are transmitted across a digital
network where they are recorded and viewed by security personnel.
Personal Use
In recent years, with the rise of personal video devices such as camcorders and mobile phones, video and photo
applications have gained increasing prominence, and frame grabbing has become very popular on these devices.
Astronomy & Astrophotography
Amateur astronomers and astrophotographers use frame grabbers when using analog "low light" cameras for live
image display and internet video broadcasting of celestial objects. Frame grabbers are essential to connect the
analog cameras used in this application to the computers that store or process the images.
Edge Enhancement
Edge enhancement is an image processing filter that enhances the edge contrast of an image or video in an attempt to
improve its acutance (apparent sharpness).
The filter works by identifying sharp edge boundaries in the image, such as the edge between a subject and a background
of a contrasting color, and increasing the image contrast in the area immediately around the edge. This has the effect of
creating subtle bright and dark highlights on either side of any edges in the image, called overshoot and undershoot, leading
the edge to look more defined when viewed from a typical viewing distance.
The process is prevalent in the video field, appearing to some degree in the majority of TV broadcasts and DVDs.
A modern television set's "sharpness" control is an example of edge enhancement. It is also widely used in
computer printers, especially for fonts and/or graphics, to obtain better printing quality. Most digital cameras also
perform some edge enhancement, which in some cases cannot be adjusted.
Edge enhancement can be either an analog or a digital process. Analog edge enhancement may be used, for example, in
all-analog video equipment such as modern CRT televisions.
Properties
Edge enhancement applied to an image can vary according to a number of properties; the most common algorithm is
unsharp masking, which has the following parameters:
Amount. This controls the extent to which contrast in the edge detected area is enhanced.
Radius or aperture. This affects the size of the edges to be detected or enhanced, and the size of the area surrounding
the edge that will be altered by the enhancement. A smaller radius will result in enhancement being applied only
to sharper, finer edges, and the enhancement being confined to a smaller area around the edge.
Threshold. Where available, this adjusts the sensitivity of the edge detection mechanism. A lower threshold results
in more subtle boundaries of colour being identified as edges. A threshold that is too low may result in some small
parts of surface textures, film grain or noise being incorrectly identified as being an edge.
In some cases, edge enhancement can be applied in the horizontal or vertical direction only, or to both directions in different
amounts. This may be useful, for example, when applying edge enhancement to images that were originally sourced from
analog video.
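A minimal unsharp-masking sketch using the three parameters above; a simple box blur is assumed here in place of the Gaussian blur that most real implementations use:

```python
import numpy as np

def unsharp_mask(image, amount=1.0, radius=1, threshold=0):
    """Edge enhancement of a grayscale image by unsharp masking.
    amount    - how strongly the detected edge detail is boosted
    radius    - half-width of the blur neighbourhood (box of 2*radius+1)
    threshold - minimum difference from the blurred image to count as an
                edge, so that noise and grain are left untouched."""
    img = image.astype(np.float64)
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    blurred = np.zeros_like(img)
    for dr in range(k):                       # box blur: average the window
        for dc in range(k):
            blurred += padded[dr:dr + img.shape[0], dc:dc + img.shape[1]]
    blurred /= k * k
    detail = img - blurred                    # high-frequency (edge) component
    detail[np.abs(detail) < threshold] = 0    # ignore sub-threshold differences
    return np.clip(img + amount * detail, 0, 255).astype(np.uint8)

frame = np.random.randint(0, 256, (100, 100), dtype=np.uint8)   # stand-in image
sharpened = unsharp_mask(frame, amount=1.5, radius=2, threshold=4)
```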
Effects of edge enhancement
Unlike some forms of image sharpening, edge enhancement does not enhance subtle detail which may appear in
more uniform areas of the image, such as texture or grain which appears in flat or smooth areas of the image. The benefit
to this is that imperfections in the image reproduction, such as grain or noise, or imperfections in the subject, such as natural
imperfections on a person's skin, are not made more obvious by the process. A drawback to this is that the image may
begin to look less natural, because the apparent sharpness of the overall image has increased but the level of detail in flat,
smooth areas has not.
As with other forms of image sharpening, edge enhancement is only capable of improving the perceived sharpness
or acutance of an image. The enhancement is not completely reversible, and as such some detail in the image is lost as a
result of filtering. Further sharpening operations on the resulting image compound the loss of detail, leading to artifacts
such as ringing. An example of this can be seen when an image that has already had edge enhancement applied, such as
the picture on a DVD video, has further edge enhancement applied by the DVD player it is played on, and possibly also
by the television it is displayed on. Essentially, the first edge enhancement filter creates new edges on either side of the
existing edges, which are then further enhanced.
Viewing conditions
The ideal amount of edge enhancement that is required to produce a pleasant and sharp-looking image, without
losing too much detail, varies according to several factors. An image that is to be viewed from a nearer distance, at a larger
display size, on a medium that is inherently more "sharp" or by a person with excellent eyesight will typically demand a
finer or lesser amount of edge enhancement than an image that is to be shown at a smaller display size, further viewing
distance, on a medium that is inherently softer or by a person with poorer eyesight.
For this reason, home cinema enthusiasts who invest in larger, higher quality screens often complain about the
amount of edge enhancement present in commercially produced DVD videos, claiming that such edge enhancement is
optimized for playback on smaller, poorer quality television screens, but the loss of detail as a result of the edge
enhancement is much more noticeable in their viewing conditions.
Contrast Stretching:
Contrast stretching (also called Normalization) attempts to improve an image by stretching the range of intensity
values it contains to make full use of possible values. Unlike histogram equalization, contrast stretching is restricted
to a linear mapping of input to output values. The result is less dramatic, but tends to avoid the sometimes artificial
appearance of equalized images.
The first step is to determine the limits over which image intensity values will be extended. These lower and upper
limits will be called a and b, respectively (for standard 8-bit grayscale pictures, these limits are usually 0 and 255).
Next, the histogram of the original image is examined to determine the value limits (lower = c, upper = d) in the
unmodified picture. If the original range covers the full possible set of values, straightforward contrast stretching
will achieve nothing, but even then sometimes most of the image data is contained within a restricted range; this
restricted range can be stretched linearly, with original values which lie outside the range being set to the
appropriate limit of the extended output range. Then, for each pixel, the original value r is mapped to the output
value s using the function:
s = (r - c) * (b - a) / (d - c) + a
One problem with this method is that outliers can reduce the effectiveness of the operation. This was already
mentioned above when it was suggested that sometimes a restricted range of the input values as determined by
inspecting the histogram of the original image might be used. Frequently it is advantageous to select c and d to be
at the 5th and 95th percentiles, respectively, of the input values. Alternatively, one can start at the histogram peak
and move up and down the value list until only a small (e.g., 1% to 3%) number of values are rejected and left
outside the chosen limits for c and d.
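A minimal contrast-stretching sketch following the description above, with c and d taken at the 5th and 95th percentiles to reduce the influence of outliers (the low-contrast test image is synthetic):

```python
import numpy as np

def contrast_stretch(image, low_pct=5, high_pct=95, a=0, b=255):
    """Linear contrast stretching: input values between the chosen
    percentiles c and d are mapped onto the full output range [a, b];
    values outside are clipped to the nearest limit."""
    img = image.astype(np.float64)
    c = np.percentile(img, low_pct)
    d = np.percentile(img, high_pct)
    stretched = (img - c) * (b - a) / (d - c) + a
    return np.clip(stretched, a, b).astype(np.uint8)

img = np.random.randint(80, 150, (100, 100), dtype=np.uint8)   # low-contrast image
out = contrast_stretch(img)
print(img.min(), img.max(), "->", out.min(), out.max())
```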
Band Ratioing
Band ratioing means dividing the pixel values in one band by the corresponding pixel values in a second band.
The reason for this is twofold. One is that differences between the spectral reflectance curves of surface types can
be brought out. The second is that although illumination, and consequently radiance, may vary across a scene, the
ratio between an illuminated and an unilluminated area of the same surface type remains the same. Thus, ratioing
aids image interpretation, particularly the near-infrared/red (NIR/R) band ratio.
From the general spectral reflectance curves, the following observation can be made: NIR/R images can serve as a
crude classifier of images, and indicate vegetated areas in particular. Therefore, this ratio has been developed into
a range of different vegetation indices.
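The NIR/R ratio (and the NDVI derived from the same two bands) is a pixel-wise division; the two single-row band arrays below are hypothetical values standing in for registered image bands:

```python
import numpy as np

def band_ratio(nir, red, eps=1e-6):
    """Pixel-wise NIR/R band ratio; eps avoids division by zero."""
    return nir.astype(np.float64) / (red.astype(np.float64) + eps)

def ndvi(nir, red, eps=1e-6):
    """Normalised difference vegetation index from the same two bands."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

nir = np.array([[180, 40]], dtype=np.uint8)   # vegetated pixel, bare-soil pixel
red = np.array([[60, 50]], dtype=np.uint8)
print(band_ratio(nir, red))   # high ratio over vegetation
print(ndvi(nir, red))
```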
Image segmentation
In computer vision, image segmentation is the process of partitioning a digital image into multiple segments
(sets of pixels, also known as super-pixels). The goal of segmentation is to simplify and/or change the
representation of an image into something that is more meaningful and easier to analyze.[1][2] Image segmentation
is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation
is the process of assigning a label to every pixel in an image such that pixels with the same label share certain
characteristics.
The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours
extracted from the image (see edge detection). Each of the pixels in a region are similar with respect to some
characteristic or computed property, such as color, intensity, or texture. Adjacent regions are significantly different
with respect to the same characteristic(s).[1] When applied to a stack of images, typical in medical imaging, the
resulting contours after image segmentation can be used to create 3D reconstructions with the help of interpolation
algorithms like Marching cubes.
Pattern recognition:
Pattern recognition is the automated recognition of patterns and regularities in data. Pattern recognition is closely
related to artificial intelligence and machine learning, together with applications such as data mining and
knowledge discovery in databases (KDD), and is often used interchangeably with these terms. However, these are
distinguished: machine learning is one approach to pattern recognition, while other approaches include hand-
crafted (not learned) rules or heuristics; and pattern recognition is one approach to artificial intelligence, while
other approaches include symbolic artificial intelligence.
The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of
computer algorithms and with the use of these regularities to take actions such as classifying the data into different
categories.
Pattern recognition systems are in many cases trained from labeled "training" data (supervised learning), but when
no labeled data are available other algorithms can be used to discover previously unknown patterns (unsupervised
learning). Machine learning is the common term for supervised learning methods and originates
from artificial intelligence, whereas KDD and data mining have a larger focus on unsupervised methods and
stronger connection to business use. Pattern recognition has its origins in engineering, and the term is popular in
the context of computer vision: a leading computer vision conference is named Conference on Computer Vision and
Pattern Recognition. In pattern recognition, there may be a higher interest to formalize, explain and visualize the
pattern, while machine learning traditionally focuses on maximizing the recognition rates. Yet, all of these domains
have evolved substantially from their roots in artificial intelligence, engineering and statistics, and they've become
increasingly similar by integrating developments and ideas from each other.
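To tie this back to training a vision system, the sketch below trains a nearest-centroid classifier from labelled feature vectors and then classifies a new measurement; the features and numbers are hypothetical, and real systems use richer feature sets and classifiers:

```python
import numpy as np

def train(samples):
    """samples: dict mapping class label -> list of feature vectors measured
    for known prototype parts during the training procedure."""
    return {label: np.mean(vectors, axis=0) for label, vectors in samples.items()}

def classify(features, centroids):
    """Assign a measured feature vector to the class with the nearest centroid."""
    features = np.asarray(features, dtype=float)
    return min(centroids, key=lambda label: np.linalg.norm(features - centroids[label]))

# Training data: (area, perimeter) measured for known parts (hypothetical values)
training = {
    "nut":  [(300, 70), (320, 74), (310, 72)],
    "bolt": [(940, 205), (960, 215), (950, 210)],
}
centroids = train(training)
print(classify((935, 208), centroids))   # -> bolt
```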