
Int J Adv Manuf Technol (2013) 69:1873–1886
DOI 10.1007/s00170-013-5138-z

ORIGINAL ARTICLE

Roboscan: a combined 2D and 3D vision system for improved speed and flexibility in pick-and-place operation

Paolo Bellandi · Franco Docchio · Giovanna Sansoni

Received: 31 December 2012 / Accepted: 17 June 2013 / Published online: 11 July 2013
© Springer-Verlag London 2013

Abstract We describe Roboscan, a Robot cell that combines 2D and 3D vision in a simple device, to aid a Robot manipulator in pick-and-place operations in a fast and accurate way. The optical head of Roboscan combines the two vision systems: the camera is used "stand-alone" in the 2D system, and combined with a laser slit projector in the 3D system, which operates in triangulation mode. The 2D system, using suitable libraries, provides preliminary 2D information to the 3D system, so that point cloud segmentation and fitting can be performed in a very fast, flexible and robust way. Roboscan is mounted onto an anthropomorphic, 6-DOF Robot manipulator. The most innovative part of the system is the use of robust 2D geometric template matching as a means of classifying 3D objects. In this way, we avoid time-consuming 3D point cloud segmentation and 3D object classification, using 3D data only for estimating the pose and orientation of the robot gripper. In addition, a novel approach to template definition in 2D geometric template matching is proposed, in which the influence of the surface reflectance and colour of the objects on the template geometry is minimized. We describe the 2D and 3D vision procedures of Roboscan, together with the calibration procedures that have been implemented. We also present a set of tests that show the performance of the system and its effectiveness in a number of pick-and-place operations.

Keywords Machine vision · Robotic manipulation · Pattern matching · 2D and 3D calibration · Blob analysis · 3D segmentation

P. Bellandi · F. Docchio · G. Sansoni (*)
Laboratory of Optoelectronics, University of Brescia, Via Branze 38, 25123 Brescia, BS, Italy
e-mail: giovanna.sansoni@ing.unibs.it
URL: www.optolab-bs.it

1 Introduction

The massive and ever-increasing use of Robots as versatile, robust and fast manipulators is well known and appreciated in a large variety of domains of automated manufacturing. Robots, indeed, replace humans in numerous repetitive and yet sophisticated operations. Among the most important uses of Robots (which include welding, painting and cutting), pick-and-place operations are relevant in modern industry [1]. Robots, in fact, may be very effective in collecting items from conveyors or bins, grasping them by means of suitable end effectors, and placing them in another conveyor or bin according to a set of rules. Pick-and-place is a key operation to feed production lines, to select items for quality control, and to help in packaging operations [2, 3].

Despite the fact that Robots can potentially perform pick-and-place operations at a faster rate than humans do, their use is obviously limited by the lack of vision ability of the large majority (to date) of commercial Robots. To compensate for the "blindness" of Robots, the items to be picked and manipulated usually must be neatly organized in the conveyor or in the pallet. Moreover, in general, "blind" Robots can only take care of a single type of object at a time [4].

In recent years, an increasing amount of R&D (both University- and Industry-based) has been devoted to the development of tools and systems for the integration of vision in Robotic arms. Mounting one or more vision sensors onto the arm can help to overcome the usage limitations described before. A major advantage of this approach is flexibility: whereas in humans the vision sensors are in a fixed position, in Robots they can be mounted anywhere along the arm (this allows the object to be "seen" from the right perspective). Vision tools can help the Robot to identify the item to be manipulated as well as its orientation in the working area, and to identify the region of space where the items should be placed after manipulation. This, in principle, reduces the amount of activity required to properly position the items before they reach the manipulation area.

Vision systems can be broadly subdivided into two major categories: 2D vision and 3D vision (today, 4D vision systems, where the fourth "dimension" can be a spectral property, colour, or any other measurable property, are also investigated, but their application to Robotics is still premature). 2D vision makes use of a single camera (with either a line or matrix organization). At present, 2D vision has reached a considerable degree of maturity in manufacturing and product inspection, and its application to Robots, to monitor the scene and to allow the manipulator to adapt to varying scenarios, is state of the art.

3D vision is accomplished by using either two cameras "seeing" the scene from two different angles (passive stereo vision), or a single camera with a special "information-carrying illuminator", in general a laser stripe or a set of fringes, which illuminates the scene at an angle with respect to that of the camera (active stereo vision) [5–9]. All 3D systems share, in relation to their use in Robotic picking and placing, the ability to give the Robot depth information that is obviously absent in 2D systems.

In general, handling 2D data sets (organized in a bi-dimensional data matrix) is straightforward, and the extraction of information about pose and orientation from the matrix is rather easy: a number of libraries for edge extraction, template matching, etc. are available in combination with standard cameras. "Intelligent" cameras (equipped with processing power) are able to perform a large number of operations in "quasi" real time [10].

With 3D systems, the elaboration of the data is more complex: the data set is the so-called three-dimensional point cloud, and segmentation of the object from the scene, object contouring, pose estimation and template matching (e.g. with respect to a CAD description of the object) can be time-consuming. 3D libraries to perform these operations are also available, and a number of novel techniques, including neuro-fuzzy functions, have been proposed. However, it remains true that 3D vision techniques do not allow fast pick-and-place operations of Robots, due to the time-consuming information retrieval operations [11–16].

A simple but very effective alternative, which can combine the ease and speed of operation of 2D systems with the completeness of information of 3D ones, is a suitable match of the two techniques. A 3D system inherently includes a 2D system: in fact, it is equipped with at least one camera. If this camera is suitably oriented with respect to the object to be sampled, it can very effectively be used alone, to build an additional 2D system together with its libraries. A number of operations can be performed by the 2D system, and the results can be passed to the 3D system to facilitate the information retrieval from the point cloud. This, in general, also increases the object measurement accuracy, since present 2D elaboration techniques are more accurate than 3D ones. Therefore, "tandem" 2D and 3D operation of the vision system could be beneficial in terms of the general efficacy of a pick-and-place operation.

To prove the above concept, in this paper we describe Roboscan, a Robot-guiding vision system that combines 2D and 3D vision techniques and integrates them into an anthropomorphic manipulator, for the optimization of pick-and-place operations. We demonstrate that the use of 2D vision for information retrieval greatly simplifies the extraction of some features of the scene that would be difficult to treat with 3D techniques alone. The system is composed of a 3D laser slit sensor mounted on the robot arm. The measurement is based on optical triangulation: the 3D information about the scene is acquired by scanning the working area [6]. Segmentation of objects in the 3D point cloud is performed by means of suitable 2D information, collected by the camera used in a "stand-alone" mode. Suitable geometric pattern matching is applied to the 2D scene, to count the objects and to estimate their position. This information is exploited to identify each object in the scene in a very flexible way: the robot is able to pick up objects randomly placed in the scene without particular constraints on the typology of the objects.

In this application, the LabVIEW platform has been used to develop the whole software architecture, using the "Vision Development Module" for the image processing algorithms and the "LabVIEW Robotics Library for DENSO" for the robot motion [17].

The system has been tested to evaluate its effectiveness and flexibility, with special attention to the measurement performance of the vision system and to the accuracy of object segmentation. To this aim, special scenarios have been prepared to evaluate the system's ability to cope with real situations, using some common objects to simulate pick-and-place operations.

The paper is organized as follows. In Section 2, the Roboscan system is presented. In Section 3, the system workflow is detailed, together with the basic features of both the 2D and 3D vision procedures. Section 4 shows the experimental results.

2 Description of the system components

The Roboscan system is presented in Fig. 1. It is composed of two subsystems: the Robot Subsystem and the Vision Subsystem.

The Robot Subsystem is composed of an anthropomorphic, 6-DOF manipulator (DENSO VP-6556G). The end effector is a pneumatic vacuum gripper, using a VSN1420 tool equipped with a silicone bellows vacuum cup (GIMATIC s.p.a., Italy).
The robot is cabled to a DENSO RC7M controller, which communicates with the supervisor PC (SPC) through a TCP/IP port. The communication between the SPC and the robot controller is implemented using the "LabVIEW Robotics Library for DENSO".

The Vision Subsystem is the 3D laser slit sensor (3DLS) shown in Fig. 1: a Lasiris 660-nm, 10-mW device (LSR in the figure) projects a laser blade onto the scene; a μEye 1.540-M 1,280×1,024 pixel CMOS digital USB camera (M_CAM in the figure) grabs the scene at an angle with respect to LSR. The 3D information is obtained by means of optical triangulation: the object-induced deformation of the laser blade is acquired, elaborated and mapped into 3D real-world coordinates by means of a suitable calibration. Coverage of the whole area is achieved by scanning.

In addition, M_CAM acquires the robot work area independently from the 3D optical head. The images are elaborated to produce suitable 2D information, which is combined with the 3D point cloud to obtain flexible and robust 3D segmentation.

Fig. 1 The Roboscan system

3 Description of the system workflow

The system workflow is shown in Fig. 2. It is based on seven Tasks, which implement a suitable combination of Vision and Motion procedures. Vision procedures perform image acquisition and elaboration, and give the robot information about the unknown scene. Motion procedures accomplish the robot motion.

Fig. 2 System workflow

All the procedures must share the same reference system [Global Reference System (GRS)]. GRS is defined in the first block in Fig. 2, during System Calibration: the result of this task is the definition of the coordinates (X, Y, Z) of GRS, common to the 3DLS, to M_CAM and to the Robot.

The second block in Fig. 2 is aimed at detecting which part of the work area must be scanned by means of the 3DLS system. Since the objects are supposed to be randomly placed, it is possible that there are empty regions in the work area, which should not be scanned. The knowledge of the so-called Region of Scanning (ROS) tells the robot where to start and stop the 3D acquisition. This information is captured by means of 2D acquisition and blob analysis.

Subsequently, 3D Scanning over the ROS area is performed (block 3 in Fig. 2). The scene is acquired line by line. Line elaboration and robot motion are synchronised to capture the whole 3D information in the X, Y, Z coordinates. Raw data are saved into a file for subsequent segmentation.

In the fourth block in Fig. 2, 2D object classification is performed. This is accomplished by using 2D geometric template matching, which returns the position and the orientation of each matched element in the image. The definition of suitable templates allows us to carry out the classification of objects differing in shape, texture, dimension and reflectance, in view of improved flexibility of operation.

The aim of the 3D Segmentation task (fifth block in Fig. 2) is to extract from the 3D point cloud the sub-clouds corresponding to each single object. It is well known that, in general, 3D segmentation is a very difficult task, since it requires rather sophisticated algorithms for the modelling and the matching of 3D sub-parts. In our system, this step is greatly simplified thanks to the 2D object classification previously performed. We exploit the information about the position and the orientation of the objects in the 2D space to find the matching between each object and the corresponding 3D sub-cloud. After this, each sub-cloud is saved in a single file.

The task in the sixth block in Fig. 2 is aimed at estimating the position and the orientation of the robot tool for correct object picking. This step is performed by the 3D object description procedure. Each sub-cloud is fitted to a suitable polynomial surface. Again, the information about the position of each object from the procedure in block 4 is used to calculate the equation of the plane tangent to the fitted surface. The normal to this plane is obtained, and the corresponding director cosines are estimated.

The last step is object picking. Optimal object picking is possible thanks to the pose coordinates from the previous task. The Roboscan system is able to handle situations where objects of different shapes are very close to each other, piled up or partially overlapped; hence, the number of objects that can be picked up at each cycle is maximized. Full object picking is achieved by iteratively performing the tasks from block 2 to block 7. The possibility of re-defining the ROS at each cycle is strategic in view of minimising the overall operation time. The developed procedures are detailed in the following sub-sections.
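
The overall control flow of blocks 2–7 can be summarized by the minimal Python sketch below. This is an illustration only: the actual implementation is built in LabVIEW with the Vision Development Module and the Robotics Library for DENSO, and every function name in the sketch is a hypothetical placeholder for the corresponding Vision or Motion procedure.

    from typing import Callable

    def roboscan_cycle(detect_ros: Callable, scan_3d: Callable, classify_2d: Callable,
                       segment_cloud: Callable, describe_object: Callable, pick: Callable,
                       max_cycles: int = 10) -> None:
        """Iterate blocks 2-7 of the workflow until the work area is empty."""
        for _ in range(max_cycles):
            ros = detect_ros()                          # block 2: 2D acquisition + blob analysis
            if ros is None:                             # no blobs found: nothing left to pick
                break
            cloud = scan_3d(ros)                        # block 3: laser-slit scan limited to the ROS
            matches = classify_2d()                     # block 4: 2D geometric template matching
            sub_clouds = segment_cloud(cloud, matches)  # block 5: 2D-guided cloud segmentation
            for match, sub in zip(matches, sub_clouds): # ordered by decreasing scale factor
                pose = describe_object(sub, match)      # block 6: surface fit, normal, director cosines
                pick(pose)                              # block 7: grasp along the estimated normal
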
3.1 System calibration

System calibration is based on three steps: first the Robot is calibrated, then M_CAM and finally the 3DLS sensor. The procedure is carried out by means of the calibration master shown in Fig. 3. The master is a 300×200-mm² plate with 27×18 circular markers, having a diameter of 4 mm except for the three central markers, which have a diameter of 8 mm. The distance between two adjacent markers is 10 mm. This master is used to define the origin O and the coordinates X, Y, Z of GRS. Point O is defined to correspond to the central point of the marker framed in Fig. 3. Axes X, Y and Z are oriented as shown in the figure.

Fig. 3 System calibration

3.1.1 Calibration of the robot

The contact tool shown in Fig. 3 is used to calibrate the robot. By means of the teach pendant, it is moved in turn to the centre of the framed marker, to the centre of one marker along the row and to the centre of one marker along the column that intersect at point O on the master. The corresponding positions are learned and define origin O and axes X and Y, respectively. Z is taken perpendicular to plane XY and oriented as shown in the figure. This procedure is simple and very fast; the positioning precision of the contact tool is not critical.

3.1.2 Calibration of M_CAM

The calibration of M_CAM is aimed at estimating the extrinsic parameters (pose and orientation of the camera with respect to GRS) as well as the intrinsic parameters (focal length and lens distortion coefficients) [18]. M_CAM is oriented perpendicularly to the calibration plate, as in Fig. 1, at a distance of 670 mm. First, the calibration plate is acquired. Then, each marker is segmented from the background, and the coordinates of its centre are detected (marker centroid). Suitable estimation algorithms from the LabVIEW 2D calibration library calculate both the extrinsic and the intrinsic parameters of the camera, using a priori knowledge of the geometry of the calibration plate and the measured values of the coordinates of the centroids.

The estimated camera parameters are returned and are subsequently applied to the 2D images acquired by the camera. The procedure is extremely efficient, since it requires the acquisition of a single image of the master.

3.1.3 Calibration of the 3DLS sensor

The calibration of the 3DLS sensor is aimed at estimating the pose and the orientation of the optical pair (i.e. camera M_CAM and laser LSR) with respect to GRS. To this aim, 3DLS is oriented as shown in Fig. 3, with LSR perpendicular to the calibration plate and M_CAM at 55° with respect to it. The system baseline is at Z0=350 mm from the plate. Firstly, two sets of images are acquired: the former is obtained by moving the sensor along −Z from Z0 at steps of 5 mm for N=10 times, and by acquiring the image of the calibration plate at each height Zh (h=1,…,N). The latter set results from the acquisition of the laser blade projected onto the calibration plate at each Zh position. Hence, the whole range along Z is equal to 45 mm. An example of the images from the two sets is shown in Fig. 4.

Fig. 4 Example of the images acquired to calibrate the 3DLS sensor. a An image of the master from the first set; b an image of the laser blade in the second set

Secondly, M_CAM is calibrated. The well-known pinhole camera model has been chosen to estimate the camera parameters. These are three rotation parameters, three translation parameters, the focal length, one scale factor and the image coordinates of the central image point [19]. We chose this model as the most appropriate for this application because of its simplicity and accuracy. It describes the camera behaviour by means of a system of two linear equations (the so-called perspective equations), which combine the 11 unknown camera parameters with the pixel coordinates of each imaged point. The unknowns are calculated by over-determining the equation system. This is done, on one side, by feeding it with the centroid coordinates from the images in the first set as the known terms and, on the other side, by using the a priori known coordinates of the corresponding markers on the calibration plate as the coefficients of the unknown camera parameters. A simple maximum likelihood algorithm solves for the unknowns.
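
One standard way to set up and solve such an over-determined linear system is the direct linear transform sketched below in Python/NumPy. It treats the 11 entries of a normalized 3×4 projection matrix as the unknowns; this parameterization is illustrative and not necessarily the exact one adopted in the LabVIEW implementation.

    import numpy as np

    def solve_camera_parameters(world_xyz, image_uv):
        """world_xyz: (N, 3) marker centres in GRS; image_uv: (N, 2) marker centroids in pixels.
        Returns the 3x4 projection matrix with its last entry fixed to 1 (11 unknowns)."""
        rows, rhs = [], []
        for (X, Y, Z), (u, v) in zip(world_xyz, image_uv):
            # two linear perspective equations per imaged marker
            rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z]); rhs.append(u)
            rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z]); rhs.append(v)
        A, b = np.asarray(rows, float), np.asarray(rhs, float)
        p, *_ = np.linalg.lstsq(A, b, rcond=None)   # least-squares solution for the 11 unknowns
        return np.append(p, 1.0).reshape(3, 4)
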
The last step is the calibration of the laser LSR. It basically consists in the estimation of the coefficients of the equation of the plane of light projected by the device. To this aim, the laser blades acquired in the second image set are elaborated to map the intensity values into image pixel coordinates, forming the so-called signal of Centres of Gravity (signal CG) [19]. These values are then inserted into the camera perspective equations, which are solved for coordinates X and Y, with Z equal to the value Zh of each image. The resulting point cloud represents the light plane in GRS: it is fitted to an equation of the following type:

z = ax + by + c    (1)

In Eq. 1, coefficients a, b and c express the orientation of LSR in GRS. The result of the calibration is that the 3DLS sensor is able to measure the coordinates of each point illuminated by the laser blade in the X, Y, Z reference. Although the whole process may seem somewhat complex, it is not: in experimental practice, it only requires the acquisition of the two sets of images, which is automatically carried out by the robot in a very short time.
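
As an illustration of this step, the following NumPy sketch fits Eq. 1 to the triangulated centre-of-gravity points by linear least squares. It assumes the points are already expressed in GRS; it is a minimal sketch, not the LabVIEW routine used in the system.

    import numpy as np

    def fit_light_plane(points):
        """points: (N, 3) array of X, Y, Z samples lying on the laser blade (in GRS).
        Returns the coefficients (a, b, c) of z = a*x + b*y + c (Eq. 1)."""
        A = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
        coeffs, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
        a, b, c = coeffs
        return float(a), float(b), float(c)
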

3.2 Detection of the region of scanning

In order to detect the ROS dimension, M_CAM is oriented perpendicularly to the work area, at a distance of 670 mm. The corresponding FOV is 370 mm along X and 300 mm along Y. An example of the scene acquired by M_CAM is shown in Fig. 5a. Here, some objects are adjacent to each other, some others are partially occluded and two of them are almost completely piled up.

First the image is binarized, and then blob analysis is performed [20]. The shape of the objects is immaterial, since one blob is defined whenever a group of pixels is detected within a closed contour. Each blob is numbered and is described by its centre of gravity and by its bounding box. As an example, in Fig. 5b seven blobs are detected: those numbered from I=1 to I=5 correspond to single elements in the scene; blobs 6 and 7 correspond to overlapped objects. Points BI (I=1,…,7) represent the positions of the centres of gravity, and the rectangles in the image overlay define the blob bounding boxes. Parameters HI and WI are the height and the width of the Ith box.

Fig. 5 Example of the ROS detection procedure. a Image acquired by M_CAM; b resulting image after blob analysis

The ROS area is defined by the distance XSTART−XSTOP, where XSTART and XSTOP are the coordinates along X at which the robot starts and stops the scanning, respectively. Values XSTART and XSTOP are determined as follows:

XSTART = XS − WS/2
XSTOP = XG + WG/2    (2)

In Eq. 2, XS and XG represent the smallest and the greatest values along X of the points BI, respectively; WS and WG are the widths of the bounding boxes labelled I=S=5 and I=G=7, respectively.
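
A minimal sketch of the ROS computation of Eq. 2 is given below. It uses scipy.ndimage connected-component labelling in place of the IMAQ blob tools, approximates the blob centres of gravity BI with the bounding-box centres, assumes that image columns map to the X axis of GRS through a pre-calibrated pixel pitch, and ignores the image-origin offset; all names and the threshold are illustrative.

    import numpy as np
    from scipy import ndimage

    def region_of_scanning(gray, threshold, px_to_mm_x):
        """gray: 2D image of the work area. Returns (x_start, x_stop) along X, or None if empty."""
        binary = gray > threshold                          # image binarization
        labels, n_blobs = ndimage.label(binary)            # one label per closed group of pixels
        if n_blobs == 0:
            return None                                    # empty work area: nothing to scan
        boxes = ndimage.find_objects(labels)               # bounding box (slices) of each blob
        x_centres = np.array([(s[1].start + s[1].stop) / 2.0 for s in boxes]) * px_to_mm_x
        widths = np.array([(s[1].stop - s[1].start) for s in boxes]) * px_to_mm_x
        i_s, i_g = int(np.argmin(x_centres)), int(np.argmax(x_centres))  # blobs I=S and I=G
        x_start = x_centres[i_s] - widths[i_s] / 2.0       # Eq. 2: XSTART = XS - WS/2
        x_stop = x_centres[i_g] + widths[i_g] / 2.0        # Eq. 2: XSTOP  = XG + WG/2
        return x_start, x_stop
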

3.3 3D scanning

In this task, the robot scans the ROS area along X. Figure 6a shows an example of the deformation induced by the object shape on the laser blade. The pattern is acquired by M_CAM and elaborated by means of a sub-pixel estimation algorithm that processes the image row by row and outputs the coordinates of the centres of gravity of the light profile along the image columns [19]. These coordinates are used as inputs to the system of three equations that carries out the 3D measurement of the 3DLS sensor. The output is the position in GRS of each centre of gravity.

Suitable synchronization is performed between image acquisition/elaboration and the robot motion, to order each profile along the X direction. The set of measured 3D profiles is then saved in a single file. The 3D point cloud corresponding to the scene in Fig. 5a is shown in Fig. 6b: the labels help to identify objects and corresponding sub-clouds in the two images.

Fig. 6 3D scanning procedure. a Deformation of the laser blade acquired by M_CAM and signal of the centres of gravity; b resulting 3D point cloud
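
A possible implementation of the row-wise, sub-pixel centre-of-gravity extraction is sketched below in NumPy. The noise threshold is an assumption, not a value taken from the paper.

    import numpy as np

    def stripe_centres_of_gravity(image, noise_level=10.0):
        """image: 2D grey-level frame containing one laser blade.
        Returns, for each image row, the sub-pixel column coordinate of the stripe (NaN if absent)."""
        img = image.astype(float)
        img[img < noise_level] = 0.0                 # suppress background noise
        cols = np.arange(img.shape[1], dtype=float)
        weights = img.sum(axis=1)                    # total intensity of each row
        cg = np.full(img.shape[0], np.nan)
        valid = weights > 0
        cg[valid] = (img[valid] * cols).sum(axis=1) / weights[valid]   # intensity-weighted mean
        return cg
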

3.4 2D object classification

The aim of this task is to classify each object in the ROS. Geometric Template Matching (GTM) is used to detect object templates in acquired images [21].
The technique is well known: it is based on the convolution between a template and the image. The template must retain information about the geometry (shape) of the searched element. Every time the template is found in the image, the corresponding element is framed by a bounding box. The coordinates of its centre point, the scale factor, the dimensions of the box and information about its orientation are also provided.

Geometric template matching has been preferred to grey-level template matching, to decrease the dependence of the detection on both the colour of the objects and the environmental illumination. To further increase the robustness of the geometric matching, the templates have been defined by means of a suitably developed image processing procedure.

The procedure is based on (1) brightness and contrast adjustment and (2) Laplacian edge detection. Task (1) is carried out to maximize the grey-level dynamics. Task (2) is aimed at retaining only the object's contours. A FIR high-pass filter is implemented by convolving the image with the following bi-dimensional kernel:

            | −1 −1 −1 −1 −1 −1 −1 |
            | −1 −1 −1 −1 −1 −1 −1 |
            | −1 −1 −1 −1 −1 −1 −1 |
Laplacian = | −1 −1 −1 48 −1 −1 −1 |    (3)
            | −1 −1 −1 −1 −1 −1 −1 |
            | −1 −1 −1 −1 −1 −1 −1 |
            | −1 −1 −1 −1 −1 −1 −1 |

The kernel coefficients are chosen so that image regions presenting constant or slow-to-medium variations of the grey levels are set to zero, and only the boundaries corresponding to the contours of the objects are retained. To prevent image noise amplification, a FIR low-pass filter is applied beforehand. It is based on the following kernel:

           | 1 2  4 2 1 |
           | 2 4  8 4 2 |
Gaussian = | 4 8 16 8 4 |    (4)
           | 2 4  8 4 2 |
           | 1 2  4 2 1 |

The kernel coefficients have been determined by sampling the bi-dimensional Gaussian impulse response, to minimize the Gibbs phenomenon [22]. Dedicated VIs (LabVIEW Virtual Instruments) belonging to the IMAQ Vision Development module have been used to implement the described procedure [17]: IMAQ BCGLookup.VI for Task (1), and IMAQ Convolute.VI for Task (2) and for the Gaussian pre-filtering.
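
The sketch below reproduces this pre-processing chain with the kernels of Eqs. 3 and 4, using scipy.ndimage.convolve in place of IMAQ Convolute.VI. The brightness and contrast adjustment of Task (1) is omitted, and the unit-gain normalization of the Gaussian kernel and the clipping of the edge response are illustrative choices, not taken from the paper.

    import numpy as np
    from scipy import ndimage

    LAPLACIAN = -np.ones((7, 7))
    LAPLACIAN[3, 3] = 48.0                                    # Eq. 3: coefficients sum to zero
    GAUSSIAN = np.array([[1, 2, 4, 2, 1],
                         [2, 4, 8, 4, 2],
                         [4, 8, 16, 8, 4],
                         [2, 4, 8, 4, 2],
                         [1, 2, 4, 2, 1]], dtype=float)       # Eq. 4

    def extract_template_contours(gray):
        """Return an edge image that keeps only the object contours."""
        img = gray.astype(float)
        smoothed = ndimage.convolve(img, GAUSSIAN / GAUSSIAN.sum())  # low-pass, normalized to unit gain
        edges = ndimage.convolve(smoothed, LAPLACIAN)                # high-pass: flat regions map to ~0
        return np.clip(edges, 0, None)                               # keep the positive contour response
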
As an example, Fig. 7a shows the effect of this processing: the template obtained from the object on the left is defined only by the object contour, and is not influenced by the surface reflectance. Figure 7b shows the performance of GTM on the image in Fig. 5a: the algorithm is able to detect all the objects, except for those labelled 12 and 13, since they are almost completely occluded. It is worth noting that objects 6, 9 and 11 are also detected, despite the fact that they are not in the foreground. In the figure, the centre of each box is denoted by point C. The coordinates of each centre are also visible. In the following, they will be denoted by Xc, Yc.

Fig. 7 Example of 2D object classification. a Effect of image pre-processing on a single object; b detection of the objects in Fig. 5a

3.5 3D cloud segmentation

The segmentation of the whole 3D cloud is performed in two steps. In the former, the order of the segmentation is set in a segmentation list. Objects are sorted by decreasing values of the scale factor parameter: this operation allows us to segment fully imaged objects first (i.e. objects that are not occluded), and thereafter increasingly occluded objects. In the latter step, sub-clouds are extracted from the whole point cloud (segmentation). This operation is performed starting from the first element in the segmentation list: the grid of X, Y coordinates that corresponds to this element is easily calculated from the coordinates (Xc, Yc) of point C and the dimensions of the bounding box. Then, the points in the whole 3D point cloud whose X, Y coordinates correspond to those in the grid are saved in a new file and are eliminated from the original cloud. This process loops until the segmentation list is empty. As a result, a number of sub-clouds are obtained, corresponding to the detected elements in the scene. The original point cloud becomes smaller at each loop and retains only the 3D points of unrecognized objects: these will be segmented afterwards, when a new, inherently smaller point cloud is acquired (see Section 3).
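
A compact sketch of this 2D-guided segmentation is shown below. The match-record fields, and the assumption that the bounding boxes are axis-aligned rectangles expressed in GRS millimetres, are illustrative simplifications of the actual implementation.

    import numpy as np

    def segment_point_cloud(cloud, matches):
        """cloud: (N, 3) array of X, Y, Z points in GRS.
        matches: list of dicts with keys 'xc', 'yc', 'width', 'height', 'scale' (from the GTM step).
        Returns one sub-cloud per match, sorted by decreasing scale factor (the picking order)."""
        remaining = cloud.copy()
        sub_clouds = []
        for m in sorted(matches, key=lambda m: m['scale'], reverse=True):
            half_w, half_h = m['width'] / 2.0, m['height'] / 2.0
            inside = ((np.abs(remaining[:, 0] - m['xc']) <= half_w) &
                      (np.abs(remaining[:, 1] - m['yc']) <= half_h))
            sub_clouds.append(remaining[inside])   # save the sub-cloud of this object
            remaining = remaining[~inside]         # eliminate it from the original cloud
        return sub_clouds
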
3.6 3D object description

The aim of this procedure is to provide information about the approaching direction and the final position that the robot end effector must have to correctly pick objects up. Firstly, for each sub-cloud a surface fitting algorithm is run, in order to find the equation of the surface that best represents the object. The system is able to handle objects with different types of surfaces (planar, spherical, cylindrical or pyramidal), by means of a dedicated procedure that fits the following surface:

z = f(x, y) = Dx² + Ey² + Fx + Gy + Hxy + J    (5)

where parameters D, E, F, G, H and J are the unknowns. The Levenberg-Marquardt algorithm [23] is used to iteratively estimate these parameters in the maximum likelihood sense.

Secondly, the equation of the plane tangent to the surface at point C(Xc, Yc) is calculated, and the normal vector to the plane is determined. The vector orientation is defined in GRS by using the cosine values of the angles α, β and γ, as shown in Fig. 8. These are the director cosines that allow the end effector to correctly approach the object.

Fig. 8 Definition of angles α, β and γ

Finally, the coordinate Zc along Z of the fitted surface is calculated by solving Eq. 5 for X=Xc and Y=Yc. Value Zc tells the robot the final position of the end effector. Figure 9 shows the performance of this procedure on the point cloud in Fig. 6b. Objects have different colours, indicating that they have been segmented. Red arrows graphically represent the normal vectors. Table 1 lists, for each object, the values of the origin coordinates (Xc, Yc, Zc) and the corresponding values of angles α, β and γ.

Fig. 9 Example of 3D segmented point cloud

Table 1 Results of the object description procedure on the scene represented in Fig. 9

Object   Xc (mm)    Yc (mm)   Zc (mm)   α (°)    β (°)   γ (°)
1        −14.907    5.843     15.135    90.06    90.06   0.00
2        −87.868    43.908    15.197    90.06    90.00   0.00
3        −149.907   52.843    15.135    90.06    90.06   0.00
4        42.615     57.029    15.136    89.94    90.06   0.00
5        −160.904   −15.540   23.773    107.58   82.82   19.09
6        −112.346   −36.428   15.066    90.11    90.23   0.00
7        −65.108    −21.736   30.236    90.74    90.11   0.00
8        5.651      −5.371    29.863    89.60    89.20   0.00
9        27.616     −40.069   15.003    90.06    90.00   0.00
10       93.730     19.280    36.459    97.35    77.06   14.98
11       127.560    −36.002   30.268    90.29    90.11   0.00

Object picking is carried out following the order set in the segmentation list. In this way, collisions are avoided.
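
Because Eq. 5 is linear in its parameters, the sketch below fits it by ordinary least squares instead of the Levenberg-Marquardt routine cited in the text, and then derives the tangent-plane normal and the director cosines analytically at point C; it is an illustrative reconstruction, not the authors' code.

    import numpy as np

    def describe_object(sub_cloud, xc, yc):
        """sub_cloud: (N, 3) points of one segmented object, in GRS.
        Returns (zc, (cos_alpha, cos_beta, cos_gamma)) used to define the grasping pose."""
        x, y, z = sub_cloud[:, 0], sub_cloud[:, 1], sub_cloud[:, 2]
        A = np.column_stack([x**2, y**2, x, y, x * y, np.ones_like(x)])
        D, E, F, G, H, J = np.linalg.lstsq(A, z, rcond=None)[0]   # coefficients of Eq. 5
        zc = D*xc**2 + E*yc**2 + F*xc + G*yc + H*xc*yc + J        # height of the surface at point C
        dz_dx = 2*D*xc + F + H*yc                                 # partial derivatives at C
        dz_dy = 2*E*yc + G + H*xc
        normal = np.array([-dz_dx, -dz_dy, 1.0])                  # normal to the tangent plane at C
        normal /= np.linalg.norm(normal)
        return zc, tuple(float(c) for c in normal)                # director cosines of angles α, β, γ
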
4 Experimental results

The Roboscan system has been tested to evaluate its ability to perform correct pick-and-place operations. The first series of tests was aimed at evaluating the performance of the 3DLS sensor. Then, specific tests were performed to assess the robustness of the vision algorithms, as well as their ability to handle different situations: a number of predefined scenarios were prepared, in order to evaluate the effectiveness of the developed algorithms.

4.1 Performance of the 3DLS sensor

The measurement performance of this device has been assessed as follows. A plane surface with a controlled planarity within 0.01 mm was placed in the work area. The optical sensor was moved to Z=0 mm and oriented as in Fig. 3. The laser blade was projected, acquired and elaborated to obtain signal CG. Image pixel coordinates were then mapped into real-world coordinates, representing the measured distances along Z of the plate for each point of the blade in the X, Y plane. The distance values were statistically elaborated, to estimate both the mean value E[Zm] and the standard deviation ST.DEV[Zm]. This elaboration sequence was performed at different known heights (hereafter called Zn), moving the sensor upwards at steps of 1 mm, until a whole depth range of 45 mm was covered.

Figure 10a shows the calibration curve of the sensor, with the values Zn along the horizontal axis and the corresponding mean values E[Zm] along the vertical one.

Fig. 10 Performance of the 3DLS sensor. a Calibration curve of the device. b Difference between values E[Zm] and values Zn along the whole range (left scale), and standard deviation of values Zm (right scale)

The quality of the measurement is well highlighted in Fig. 10b: the behaviour of the signal E[Zm]−Zn shows that the accuracy of the measurement is within −0.051 and 0.011 mm (left scale). Standard deviation ST.DEV[Zm] is also plotted (right scale): the maximum absolute value of this signal is 120 μm. These values definitely match the precision of the robot and the characteristics of the gripper.

4.2 Performance of the ROS detection procedure

The performance of this procedure can be easily appreciated by looking at the situation in Fig. 11. Here, both circular and rectangular objects have been placed on the work area. Three blobs are detected: their definition is not influenced by the shape of the objects, but rather depends on their disposition in the scene: adjacent objects lead to the definition of a single blob. The algorithm correctly detects values X1 and W1 of blob 1 and values X3 and W3 of blob 3 to estimate positions XSTART and XSTOP, respectively.

Fig. 11 Performance of the ROS detection procedure. a Image of the work area; b detection of positions XSTART and XSTOP that set the ROS

4.3 Performance of the 2D object classification procedure

2D object classification is the most critical among the 2D vision procedures we have developed. In fact, the GTM algorithm is required to detect objects differing in shape, surface reflectance and colour, as well as tilted objects, objects very close to each other and objects partially or even completely overlapped.

Hence, flexibility and robustness of GTM are major requirements. Figure 12a shows an example of how the work area could look.
There are three yoghurt jars, four plastic disks and four soap bars. Objects are tilted, placed upside-down and overlapped. In addition, most of them are adjacent to each other.

In order to handle such a situation, the templates in Fig. 12b have been defined. Yoghurt jars are detected by means of two templates (i.e. standing and tilted orientations), while disks and soap bars each require the definition of a single template.

Figure 12c shows how the elements in the scene are detected by GTM. Objects from '1' to '10' are framed by their bounding boxes, the coordinates of their centres are measured and their orientation is correctly detected. The only object that has not been detected is the one labelled '11' in Fig. 12a: this is by no means surprising, since this element is almost completely occluded by disk '10'.

Fig. 12 Performance of the 2D object classification procedure. The work area contains plastic disks, soap bars and yoghurt jars. a Image of the work area; b template definition; c detection of the elements in the scene

Another interesting example of this class of tests deals with the scene in Fig. 11a. The templates defined in this situation are shown in Fig. 13a, while the performance of GTM is presented in Fig. 13b. Here, although object '5' is partially occluded by object '4', the matching is positive, since most of the object shape is well visible in the image.

Fig. 13 Performance of the 2D object classification procedure when applied to the image in Fig. 11a. a Template definition; b detection of the elements

4.4 Performance of 3D scanning, 3D cloud segmentation and object description

The results of the experimental tests carried out to assess the performance of these procedures are presented in this section. Figure 14a shows the 3D point cloud acquired by the 3DLS sensor in correspondence with the scene in Fig. 12a, while Fig. 14b presents the effect of the segmentation and of the estimation of the director cosines. As expected, all the objects, with the exception of object '11', have been segmented. The tangent planes are graphically visualized, together with the normal vectors. Their director cosines define the grasping direction of the robot. After picking, only object '11' will remain in the work area. According to the workflow in Fig. 2, the system is expected to go back to block 2 and to go through the loop again to detect and pick it up.

Fig. 14 3D analysis of the scene. a 3D point cloud of the scene in Fig. 12a; b visualization of both tangent planes and normal vectors

Similar results have been found when the scene in Fig. 11 was tested. Figure 15a shows the corresponding point cloud, and Fig. 15b visualizes both the tangent planes and the normal vectors of each segmented object. In this case, all objects have been segmented, and their orientation is correctly detected.

Fig. 15 3D analysis of the scene. a 3D point cloud of the scene in Fig. 11a; b visualization of both tangent planes and normal vectors

Robot picking is straightforward: it is performed according to the order set in the segmentation list, which guarantees that the objects are grasped starting from those that are not occluded. The objects that lie on top of other objects are removed first (their scale factor is the highest in the list, since they are closer than the others to the video camera). Fully imaged objects that are placed directly on the plane of the work area are picked up next. The remaining objects are those only partially viewed; however, their normal vectors are known, and this information allows the robot to grasp them. Collisions between the gripper and the objects are avoided, since the robot knows the coordinate Zc of point C of each object and approaches the objects in order of decreasing height.

The developed procedures have been designed to pick up as many objects as possible from a single point cloud, in order to maximize the overall system efficiency. For this reason, the time between two subsequent 3D acquisitions depends on the number of objects and on their orientation in the work area. In our experiments, considering 10–15 objects and at most two scans, we observed average times of 20 s.

The 2D processing requires at most 200 ms; the maximum scanning speed is 500 mm/s, limited by the video camera frame rate (25 frames/s). However, this is not a serious limitation, since the video camera is a very low-cost device that could be replaced by a faster model at a very reasonable cost.

4.5 Director cosines accuracy evaluation

The evaluation of the accuracy in the determination of angles α, β and γ was considered mandatory, in order to assess the quality of the procedure described in Section 3.6. It has been carried out by comparing the values estimated by our procedure with those calculated by the PolyWorks IMEdit software (InnovMetric, Ottawa, Canada), a commercial package specifically designed for creating and elaborating 3D point clouds, for CAD, rapid prototyping and gauging applications [24]. Among the tools available in this software, we chose the fitting-plane procedure, which is designed to estimate the plane tangent to the 3D input cloud at a specific point: this point, hereafter called point C', is manually selected on the point cloud by the operator. The fitting-plane tool outputs the director cosines of the angles α', β' and γ' of the vector normal to the fitted plane. To provide a direct comparison between the values of the angles output by the PolyWorks software and those estimated by our procedure, we have paid attention to avoiding errors that could originate from inaccuracy in establishing the correspondence between point C' in the PolyWorks cloud and point C in our segmented point cloud. In practice, even a small inaccuracy in the location of point C' would result in markedly different tangent planes in the two point clouds, especially in correspondence with high local surface curvatures. In such cases, the evaluation of the differences (α−α'), (β−β') and (γ−γ') would become meaningless.

To overcome this problem, we set up a scene using eight planar objects like those in Fig. 5a, each one tilted by a certain angle. The scene was acquired by the 3DLS device: the corresponding 3D point cloud is shown in Fig. 16a. This cloud was segmented, and angles α, β and γ were estimated for each single sub-cloud using the procedure in Section 3.6. Then, we imported the sub-clouds into the PolyWorks environment and applied the fitting-plane tool to them: the selection of point C' was a minor concern, since each surface was a plane in itself. Figure 16b presents the sub-clouds in the PolyWorks reference system: the fitted planes and their normal vectors are overlaid onto each point cloud. Corresponding elements in Fig. 16a and Fig. 16b are identified by the same label.

Fig. 16 Evaluation of the accuracy in the estimate of the director cosines. a Point clouds of the objects used to perform the accuracy evaluation. b Corresponding planes fitted by PolyWorks. c Plot of the errors in the angle estimates

Table 2 shows the results. The first column identifies the objects. Angles α, β and γ are shown in the even columns; the corresponding values α', β' and γ' are listed in the odd columns. The errors (Δα=α−α'), (Δβ=β−β') and (Δγ=γ−γ') are plotted in Fig. 16c: the mean absolute values of Δα, Δβ and Δγ are 0.295°, 0.193° and 0.388°, respectively. These values can be considered negligible as far as the pick-and-place application is concerned.

Table 2 Accuracy of the estimate of the director cosines

Object   α (°)    α' (°)   β (°)     β' (°)    γ (°)    γ' (°)
1        71.94    71.64    87.30     87.38     18.19    18.46
2        109.02   108.59   89.88     89.75     19.09    18.54
3        70.12    69.77    91.71     91.07     19.94    20.21
4        68.34    68.68    85.69     85.80     22.18    21.70
5        82.64    82.41    102.94    102.74    14.98    15.56
6        72.66    72.19    97.00     96.97     18.73    19.21
7        58.66    58.74    91.83     91.70     31.35    31.30
8        68.22    68.38    92.35     92.57     21.87    22.31
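
For reference, the mean absolute errors quoted above can be recomputed directly from the values reported in Table 2, for example with the short NumPy snippet below; the small residual difference on Δγ (0.390° versus 0.388°) is due to rounding in the published table.

    import numpy as np

    ours = np.array([[71.94, 87.30, 18.19], [109.02, 89.88, 19.09], [70.12, 91.71, 19.94],
                     [68.34, 85.69, 22.18], [82.64, 102.94, 14.98], [72.66, 97.00, 18.73],
                     [58.66, 91.83, 31.35], [68.22, 92.35, 21.87]])       # alpha, beta, gamma
    polyworks = np.array([[71.64, 87.38, 18.46], [108.59, 89.75, 18.54], [69.77, 91.07, 20.21],
                          [68.68, 85.80, 21.70], [82.41, 102.74, 15.56], [72.19, 96.97, 19.21],
                          [58.74, 91.70, 31.30], [68.38, 92.57, 22.31]])  # alpha', beta', gamma'

    print(np.abs(ours - polyworks).mean(axis=0))   # -> approximately [0.295, 0.193, 0.390]
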
5 Conclusions

In this work, we presented an alternative to fully 3D procedures, with the aim of obtaining effective approaches to pick-and-place applications while keeping the complexity at a low level. In particular, the interaction between 2D and 3D image processing tasks has been studied and characterized to assess the feasibility of exploiting the system in real industrial environments.

The method shows good performance in the presence of objects characterized by simple shapes, like planes, cylinders, cones and spheres, and by free-form shapes with smooth local slopes. A great deal of work has been focused on allowing the system to handle object occlusions and to correctly estimate the pose of objects presenting different shapes, colours and textures in the work area. The use of a video camera in the visible range and of red laser illumination limits the elaboration to non-transparent objects.

We are aware that this system cannot be considered a final solution for bin-picking situations, but we also think that it is a quite simple and smart way to handle a rather large variety of situations. In order to appreciate the work performed by Roboscan, a video has been prepared; it can be downloaded from the website of our laboratory.

Acknowledgments The authors are grateful to Dr. Yosuke Sawada and to Mr. Gabriele Coffetti for their continuous support during the development of this project.

References

1. Brogardh T (2007) Present and future robot control development—an industrial perspective. Annu Rev Control 31:69–79
2. Xiong Y, Quek F (2002) Machine vision for 3D mechanical part recognition in intelligent manufacturing environments. In: Proceedings of the 3rd International Workshop on Robot Motion and Control (RoMoCo'02), pp 441–446
3. Sakakibara S (2006) The robot cell as a re-configurable machining system. In: Dashchenko AI (ed) Reconfigurable manufacturing systems and transformable factories. Springer, Berlin, pp 259–272
4. Tudorie CR (2010) Different approaches in feeding of a flexible manufacturing cell. In: Simulation, modeling, and programming for autonomous robots. Springer-Verlag, Heidelberg, pp 509–520
5. Steger C, Ulrich M, Wiedemann C (2008) Machine vision algorithms and applications. Wiley, Weinheim
6. Blais F (2004) A review of 20 years of range sensor development. J Electron Imaging 13(1):231–240
7. Sumi Y, Kawai Y, Yoshimi T, Tomita F (2002) 3D object recognition in cluttered environments by segment-based stereo vision. Int J Comput Vis 46(1):5–23
8. Rossi NV, Savino C (2010) A new real-time shape acquisition with a laser scanner: first test results. Robot Comput Integr Manuf 26:543–550
9. Rahayem M, Kjellander JAP (2011) Quadric segmentation and fitting of data captured by a laser profile scanner mounted on an industrial robot. Int J Adv Manuf Technol 52:155–169
10. Parker JR (2010) Algorithms for image processing and computer vision. Wiley, New York
11. Aqsense SAL3D, http://www.aqsense.com/products/sal3d. Accessed 1 July 2013
12. Rusu RB, Cousins S (2011) 3D is here: Point Cloud Library (PCL). Proceedings of the IEEE International Conference on Robotics and Automation 2011:305–309
13. MVTec Software GmbH (2009) Halcon—the power of machine vision—HDevelop user's guide. München, pp 185–188
14. Zhao D, Li S (2005) A 3D image processing method for manufacturing process automation. Comput Ind 56:975–985
15. Biegelbauer G, Vincze M, Wohlkinger W (2010) Model-based 3D object detection: efficient approach using superquadrics. Mach Vis Appl 21:497–516
16. Richtsfeld M, Vincze M (2009) Robotic grasping of unknown objects. In: Rodić AD (ed) Contemporary robotics—challenges and solutions. InTech. ISBN 978-953-307-038-4. doi:10.5772/7805. Available from: http://www.intechopen.com/books/contemporary-robotics-challenges-and-solutions/robotic-grasping-of-unknown-objects. Accessed 1 July 2013
17. Klinger T (2003) Image processing with LabVIEW and IMAQ Vision. Prentice-Hall, USA
18. Gruen A, Huang TS (2001) Calibration and orientation of cameras in computer vision. Springer, Berlin
19. Sansoni G, Bellandi P, Docchio F (2011) Design and development of a 3D system for the measurement of tube eccentricity. Meas Sci Technol 22:075302. doi:10.1088/0957-0233/22/7/075302
20. Moeslund TB (2012) Introduction to video and image processing. Springer, London
21. Sibiryakov A et al (2008) Statistical template matching under geometric transformations. In: Coeurjolly D (ed) Discrete geometry for computer imagery. Springer, Berlin, pp 225–237
22. Trucco E, Verri A (1998) Introductory techniques for 3-D computer vision. Prentice-Hall. ISBN 0-13-261108-2
23. Marquardt D (1963) An algorithm for least-squares estimation of nonlinear parameters. SIAM J Appl Math 11:431–441
24. InnovMetric Software (2011) PolyWorks Modeler & Inspector—user's guide. Ste-Foy, Québec
