3D Object Pose Tracking for Robotics Grasping
|✅ Paper Type: Free Essay||✅ Subject: Information Technology|
|✅ Wordcount: 2274 words||✅ Published: 8th Feb 2020|
I. INTRODUCTION My role in this project is to build the graphic scene and to handle lighting changes on the object. The process is iterative: the machine learning algorithm stores reliable data and discards less reliable cases, then repeats the cycle to improve the accuracy of the stored image data.
II. TARGET Unlike humans, robots do not have a sense of touch and rely on cameras for vision, so computer vision is an important part of designing robots that interact with the world on their own. Our task is part of an ongoing project that aims to improve a computer's understanding of a three-dimensional environment through machine learning and computer vision, developing algorithms that help a robotic arm determine an appropriate path to grasp an object. At the moment, the computer uses two stationary cameras placed at different angles facing the arm and the object, and analyzes their feeds to determine the object's position and type.
The current setup has a few problems. First, lighting changes can make it difficult to identify an object. When the hand covers the object, its shadow changes the light captured by the cameras. Likewise, if the ambient light intensity changes, the colors in the captured streams change. The computer expects the pixels representing the object to fall within a certain range of RGB values, and such changes can move the apparent colors out of that range, so the computer fails to recognize the object it is looking for. We could widen the expected color range, but then the computer might label parts of the image that are not the object, such as the background, as part of the object. Our goal is for the computer to recognize the object under different levels of light without that side effect.
The second problem arises when the robotic hand gets in the way of the cameras. Since the cameras are at fixed positions, the robotic arm blocks parts of the object when it moves in to pick it up. The apparent shape of the object in the image then changes, so the computer fails to identify it and cannot accurately determine its position and shape.
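The color-range problem described above can be illustrated with a small sketch. It is not the project's code: the pixel values, the "expected" range, and the helper names (`in_range`, `color_mask`, `scale_brightness`) are all made up for illustration, and a global brightness scale is only a crude stand-in for a real lighting change.

```python
def in_range(pixel, lower, upper):
    """Return True if every RGB channel of `pixel` lies within [lower, upper]."""
    return all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))


def color_mask(image, lower, upper):
    """Label each pixel 1 (object) or 0 (not object) using a fixed RGB range."""
    return [[1 if in_range(p, lower, upper) else 0 for p in row] for row in image]


def scale_brightness(image, percent):
    """Simulate a global lighting change by scaling every channel (integer math)."""
    return [[tuple(min(255, c * percent // 100) for c in p) for p in row]
            for row in image]


# Hypothetical 2x2 scene: two red object pixels on top, two grey background pixels below.
scene = [[(220, 30, 30), (210, 40, 35)],
         [(120, 120, 120), (100, 100, 100)]]

# Hypothetical expected range for the red object under the original lighting.
lower, upper = (180, 0, 0), (255, 80, 80)
bright_mask = color_mask(scene, lower, upper)          # object found

# Dim the scene to 60% brightness: the object leaves the expected range.
dim_scene = scale_brightness(scene, 60)
dim_mask = color_mask(dim_scene, lower, upper)         # object missed

# Widening the range recovers the object but also accepts the background.
wide_mask = color_mask(dim_scene, (60, 0, 0), (255, 80, 80))

print(bright_mask, dim_mask, wide_mask)
```

In this toy scene the fixed range finds the object only under the original lighting, and the widened range marks every pixel, background included, which is the trade-off the section describes.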
The purpose of this software is to improve a computer's ability to identify a known object in an image under different lighting conditions. The end product will be two helper functions for the computer vision project. The first will take an image of a scene and the initial position of the tracked object as input and determine the diffuse and specular terms that describe the lighting of the scene. These will be used to determine the color range that the computer should look for. The second function will then use the images, the diffuse and specular terms, and the approximate mask to determine a more accurate final mask. The primary goal is to get these functions to track a robotic arm properly; a stretch goal is to do the same for a set of objects that the arm can manipulate.
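As a rough illustration of what diffuse and specular terms can look like, the sketch below assumes the standard Phong reflection model, which the essay does not name: the diffuse term depends on the angle between the surface normal and the direction to the light, and the specular term on the angle between the mirror-reflection direction and the direction to the viewer (here, a camera). The function names, the `shininess` exponent, and the `spec_strength` weight are hypothetical choices, not the project's interface.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def phong_terms(normal, to_light, to_eye, shininess=16):
    """Return (diffuse, specular) factors in [0, 1] for one surface point.

    All three arguments are directions pointing away from the surface.
    """
    n, l, v = normalize(normal), normalize(to_light), normalize(to_eye)
    diffuse = max(0.0, dot(n, l))
    if diffuse == 0.0:
        return 0.0, 0.0  # light is behind the surface: no highlight either
    # Mirror reflection of the light direction about the normal: R = 2(N.L)N - L
    r = tuple(2 * dot(n, l) * nc - lc for nc, lc in zip(n, l))
    specular = max(0.0, dot(r, v)) ** shininess
    return diffuse, specular


def expected_rgb(albedo, diffuse, specular, spec_strength=0.3):
    """Predict the apparent color of a surface with base color `albedo`.

    `spec_strength` is an assumed weight for a white highlight.
    """
    highlight = 255 * spec_strength * specular
    return tuple(min(255, round(c * diffuse + highlight)) for c in albedo)


d, s = phong_terms(normal=(0, 0, 1), to_light=(3, 0, 4), to_eye=(0, 0, 1))
print(d, s, expected_rgb((200, 40, 40), d, s))
```

A predicted color of this kind could serve as the center of the color range the computer looks for, instead of a single range fixed in advance.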
III. DETAILED PIECES This section describes three key pieces of the project and, for each piece, considers alternative technologies that could solve the same problem.
A. Piece 1: Capture of the Data To track the grasping pose, the computer must compare the live camera streams with data stored in memory. The first task is therefore to capture each object's data by taking images from various viewpoints and building a three-dimensional scene.
1) Image Sets: This is the method used to capture object data in this project. We intend to set up a box that keeps environmental light out and photograph the object inside it under a fixed light intensity, so that the recorded RGB values of each object stay constant. We then change the viewpoint of the camera and repeat the capture. Finally, the 3D image data for all objects is stored on the computer. With this approach, we will probably need over 150 sets of images for every object to give the computer enough material to form a detailed 3D scene of it. We also need to handle lighting very carefully. Since this step creates the reference data on the computer, that data must not contain wrong information. Surface color is one of the cues the computer uses to distinguish object types, and colors are represented by RGB values: each of the three primary colors takes a value from 0 to 255 when stored as an integer. Even though red 250 and red 200 both make the object's surface look red, the computer will not put them into the same category. It is therefore necessary to capture images under precise, constant lighting, which leads to a consistent surface color.
2) 3D Printing: Another approach is 3D printing. The previous option captures data from physical objects; this one instead constructs the 3D models on the computer first. The object data is then in memory from the beginning, and there is no need to worry about converting a physical object's characteristics into data. A 3D printer can produce objects from the precise information given by the computer. The printed objects are then used to test the grasping pose tracking algorithm. One concern is how accurate a 3D printed object will be: compared to the original model built on the computer, a physical object may have flaws. This option therefore has an additional requirement, which is to measure the dimensions of each printed object. For a cube, for instance, the length of each edge must be measured. Developers might define the required accuracy as 0.01 centimeter and discard objects that fall outside that tolerance.
3) Panorama: Returning to image capture, a better camera or a more powerful device can help us obtain data from physical objects. A panoramic camera makes it possible to create a 3D scene of an object from a 360-degree panorama. This solution costs much more than the first option, since a capable panoramic camera is expensive. However, it saves the considerable time otherwise spent photographing hundreds of images of each object. It still depends on a controlled lighting setup: as mentioned previously, the object's surface colors must be kept constant.
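The acceptance check suggested for 3D printed objects can be written down in a few lines. This is a minimal sketch assuming a cube whose edges have all been measured by hand; the function names and sample measurements are hypothetical, and only the 0.01 cm tolerance comes from the text.

```python
def edge_within_tolerance(measured_cm, designed_cm, tolerance_cm=0.01):
    """True if one measured edge deviates from the design by at most the tolerance."""
    return abs(measured_cm - designed_cm) <= tolerance_cm


def accept_printed_cube(measured_edges_cm, designed_edge_cm, tolerance_cm=0.01):
    """Accept a printed cube only if all 12 measured edges are within tolerance."""
    if len(measured_edges_cm) != 12:
        raise ValueError("a cube has 12 edges; got %d measurements" % len(measured_edges_cm))
    return all(edge_within_tolerance(m, designed_edge_cm, tolerance_cm)
               for m in measured_edges_cm)


# Hypothetical measurements of two prints of a 5 cm cube.
good_print = [5.004, 4.997, 5.001, 5.000, 4.995, 5.003,
              5.002, 4.998, 5.000, 5.006, 4.996, 5.001]
bad_print = good_print[:11] + [5.03]   # one edge is 0.03 cm too long

print(accept_printed_cube(good_print, 5.0))
print(accept_printed_cube(bad_print, 5.0))
```

Measurements that sit exactly on the tolerance boundary are sensitive to floating-point rounding, so a real check would probably work in integer units such as hundredths of a millimeter.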
B. Piece 2: Color Changing As described in Section II, lighting changes can make it difficult to identify an object: shadows cast by the hand and changes in ambient light intensity shift the captured colors, potentially moving the object's pixels out of the expected RGB range.
1) Machine Learning: For the lighting and shading issue, we will attempt a machine learning approach as our primary option. We will take key frames from the camera recordings and, using Photoshop or similar software, label each pixel as part of the object, part of the arm, or part of the background. We will feed this data into a learning algorithm, then test how well the algorithm has learned to work around the shading issues. The machine learning algorithm will take an image of a scene and the initial position of the tracked object as input and determine the diffuse and specular terms that describe the lighting of the scene. These will be used to determine the color range that the computer should look for. The diffuse term accounts for the angle between the incoming light and the surface normal. The specular term accounts for the angle between the direction of perfect reflection and the eye position. In this setup, the two cameras that record the streams play the role of the eyes.
2) Color Sensor: This option uses a color sensor to detect the object's surface color. A color sensor is a device that compares an object's color with previously stored reference colors to improve color detection. When the two colors are within an acceptable range of error, the sensor reports a match. With several reference labels stored, the sensor can detect the object quickly even when the background differs only subtly in color. Other advantages include automatic adaptation to wavelengths, detection of tiny differences in gray value, and independence from the colors of the label and the background. This option could replace the function that deals with colors.
3) Gray Scale: Since working with color is a difficult task, an alternative is to compare only the gray levels of two images. This option lets us avoid complex color computations, including discrete algorithms. A gray level is a single value represented on a single channel: a pixel is completely black at gray level 0 and white at the maximum gray value. The first option uses the diffuse and specular terms to calculate the object's surface colors so that they can be compared with the reference data. In this option, the brightness of the object is compared instead, because the brighter the object is, the higher its gray value.
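The gray-level comparison can be sketched as follows. The essay does not say how RGB is reduced to a single channel; this example assumes the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114) and uses the mean absolute difference as one possible way to compare two images. Both choices, and all function names, are assumptions made for illustration.

```python
def gray_level(pixel):
    """Convert one RGB pixel to a gray level in 0..255 (BT.601 weights, integer math)."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def to_gray(image):
    """Convert an RGB image (rows of RGB tuples) to a single-channel image."""
    return [[gray_level(p) for p in row] for row in image]


def mean_abs_diff(gray_a, gray_b):
    """Average absolute gray-level difference between two same-sized images."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(gray_a, gray_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


# Hypothetical reference image and a slightly darker capture of the same object.
reference = [[(220, 30, 30), (210, 40, 35)]]
captured = [[(200, 28, 28), (190, 36, 32)]]

difference = mean_abs_diff(to_gray(reference), to_gray(captured))
print(to_gray(reference), to_gray(captured), difference)
```

Only one number per pixel is compared, which is cheaper than working with three channels, at the cost of losing the ability to tell apart colors that happen to share a brightness.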
C. Piece 3: Pose Tracking The purpose of the project, 3D object pose tracking for robotics grasping, is to track the position of the hand and the object successfully. Tracking is improved by handling lighting changes and shadows.
1) Tracking System: The object tracking system currently used by the University's robotics team relies on cameras placed beside the object that send video streams to a computer. By analyzing individual frames, the computer labels each pixel of each image as background, object, or robotic hand; this is how it distinguishes the object and then tracks the grasping pose. Currently, the system copes with lighting changes by treating a large range of colors as potentially part of the object. This, of course, results in many false positives (pixels labeled as part of the object when they actually are not). A more advanced approach will let the computer adjust the expected average color for the specific lighting and use a smaller acceptable range around it.
2) Motion Capture: This option will cost considerably more than our current tracking system. However, the technology affords more precise position tracking of various objects, and even of human movement. Some companies, including Sega Interactive, have launched a variety of commercial motion capture devices. A tracker is attached to a key part of the moving object, and the motion capture system records the position of the tracker. A computer then processes the captured data to reconstruct the motion in three-dimensional space. This technology is widely used in the film and game industries to capture human movement. One kind of sensor is the inertial navigation sensor, which measures the acceleration, azimuth, and tilt angle of the tracked subject's motion. This approach is not affected by environmental interference or occlusion. The capture accuracy is extremely high, and the sampling rate can reach 1000 samples per second or more.
3) OpenCV: OpenCV is an open source computer vision library. It is written in C++ and its main interface is also C++, but it still retains a large number of C interfaces. The library also has interfaces for Python, Java, MATLAB/OCTAVE (version 2.5), C#, Ruby and Go. OpenCV programs are fast, stable and highly portable. It offers a software alternative to specialized, hardware-dependent solutions in areas such as video surveillance, manufacturing control systems and medical equipment. OpenCV focuses on real-world, real-time applications, and its execution speed benefits greatly from optimized C code. Fields such as human-computer interaction, action recognition and robotics are benefiting from this technology. We can implement our own OpenCV program to track the object's position.
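The more advanced approach mentioned under the tracking system, adjusting the expected average color to the current lighting and using a narrower range around it, might look like the sketch below. It is written in plain Python rather than OpenCV so that it runs without extra packages, and everything in it is an assumption: the pixel data, the rough starting mask, the tolerance of 12, and the names `mean_color` and `refine_mask`.

```python
def mean_color(image, rough_mask):
    """Average RGB over the pixels the rough mask marks as object (integer mean)."""
    picked = [p for row, mask_row in zip(image, rough_mask)
              for p, m in zip(row, mask_row) if m]
    if not picked:
        raise ValueError("rough mask selects no pixels")
    return tuple(sum(p[i] for p in picked) // len(picked) for i in range(3))


def refine_mask(image, center, tolerance):
    """Label a pixel 1 if every channel is within `tolerance` of `center`, else 0."""
    return [[1 if all(abs(c - m) <= tolerance for c, m in zip(p, center)) else 0
             for p in row]
            for row in image]


# Hypothetical dimly lit frame: red object on the top row, grey background below.
frame = [[(132, 18, 18), (126, 24, 21)],
         [(72, 72, 72), (60, 60, 60)]]

# Rough mask from the object's known initial position (only one pixel is certain).
rough = [[1, 0],
         [0, 0]]

center = mean_color(frame, rough)          # expected color under this lighting
final = refine_mask(frame, center, tolerance=12)

print(center, final)
```

Because the range is centered on a color measured under the current lighting, it can stay narrow enough to exclude the grey background while still recovering the second object pixel that the rough mask missed.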
IV. CONCLUSION Based on this analysis of the technologies that could be used for grasping pose tracking, the machine learning algorithm is the most economical way to improve tracking accuracy. It needs no sensors or other additional hardware: the program itself determines which object the robotic hand has grabbed. Although the algorithm may not be precise enough at the beginning, its accuracy increases as it stores reliable data in memory. Over many iterations, more reliable data is stored and the computer has more samples to learn from. Note that this process is automatic: the computer feeds itself from its current database.
Colour sensors system description. https://www.sensopart.com/jdownloads/Systembeschreibungen/Colour-sensors contrast-sensors luminescence-sensors system description.pdf. (Accessed on 11/03/2018).
Gray-level transformation. https://spie.org/samples/PM197.pdf. (Accessed on 11/07/2018).
R. Fischer. Motion capture process and systems. https://pdfs.semanticscholar.org/e399/84b1e08f5a98e03e83f2e4d6bac3e997e0d8.pdf. (Accessed on 11/07/2018).
Walkabout Worlds. Create a basic 3d model of a room. https://www.youtube.com/watch?time continue=78&v=3IAK93U2QUI, March 2017. (Accessed on 11/02/2018).
Bob Yirka. A 3-d printer that can print data sets as physical objects. https://phys.org/news/2018-06-d-printer-physical.html, June 2018. (Accessed on 11/02/2018).