* It must be kept in mind that the depth sensor measures depth from the depth start point (the depth origin), which is located 4.2 mm behind the D435i camera's front cover glass; therefore, to get a precise depth measurement, 4.2 mm must be added to the depth value.
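As a minimal sketch of applying this correction with pyrealsense2 (the stream settings and the pixel queried are placeholders; the offset is applied per the note above):

```python
# Minimal sketch: read one depth sample from the D435i and apply the 4.2 mm
# depth-origin correction described above. Stream settings and the queried
# pixel are placeholders.
import pyrealsense2 as rs

DEPTH_ORIGIN_OFFSET_M = 0.0042  # 4.2 mm, per the note above

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()
    # Distance in metres at the centre pixel, measured from the depth origin.
    raw_depth_m = depth_frame.get_distance(320, 240)
    # Apply the 4.2 mm correction noted above.
    corrected_depth_m = raw_depth_m + DEPTH_ORIGIN_OFFSET_M
    print(f"raw: {raw_depth_m:.4f} m, corrected: {corrected_depth_m:.4f} m")
finally:
    pipeline.stop()
```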
I tried using the pre-trained model to detect the 3 objects, but the only one successfully detected was the ball. The others were misclassified (as a suitcase and a mouse).
Attempted to identify the objects via an image:
It appears that a custom dataset would have to be prepared.
15/7/2024
I went to university to test the camera (and other cameras) but encountered the same issues. It was then discovered that the issue was the USB lead (it needs to be USB 3.0). After obtaining the right lead, the camera worked.
After further discussion with the supervisor, we agreed to integrate my earlier context with the objective/aim, which is to have a robotic arm act as an assistant to the dentist/surgeon by passing surgical tools to that person. Therefore, the objects to be picked and placed have been changed to tools that resemble surgical tools, such as polymer clay tools. The tools have been procured.
Another suggestion was to change the fingertips of the RG2 gripper to make it easier to grip the "surgical tools". One option was to 3D print the fingertips using carbon-fibre-filled nylon. A 3D-printed fingertip more suitable for gripping small cylindrical objects was found and sent for printing:
My tools have arrived and are placed in the lab for collection. I will collect them tomorrow and commence taking pictures to create the necessary dataset for object recognition.
First, the YOLOv4 algorithm is used to identify and locate the individual objects. Second, the output frame is used as input to the GrabCut algorithm to segment the objects from the background for calculating the grasp orientation (a rough sketch of this step is given below).
Third, a conversion between coordinate systems is performed to obtain the positions of the objects, and the resulting information is sent to the robotic arm to complete the grasping task on the Robot Operating System (ROS).
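As a rough illustration of the second step, GrabCut can be seeded with the detector's bounding box. Everything below (the synthetic frame and the box) is a placeholder, not the paper's code:

```python
# Sketch: segment a detected object from the background with GrabCut, seeded
# by the bounding box produced by the detector (e.g. YOLOv4), then estimate
# the grasp orientation from the segmented region's principal axis.
import cv2
import numpy as np

# Synthetic stand-in for a camera frame: grey background with a red "object".
image = np.full((480, 640, 3), 60, np.uint8)
cv2.rectangle(image, (260, 180), (380, 330), (0, 0, 220), -1)

rect = (240, 160, 160, 190)  # placeholder (x, y, w, h) detection box

mask = np.zeros(image.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)  # internal GrabCut state
fgd_model = np.zeros((1, 65), np.float64)

# 5 iterations of GrabCut initialised from the detection rectangle.
cv2.grabCut(image, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels marked definite/probable foreground form the object mask.
object_mask = np.where(
    (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0
).astype(np.uint8)

# Grasp orientation from the principal axis of the segmented region.
m = cv2.moments(object_mask, binaryImage=True)
angle = 0.5 * np.arctan2(2 * m["mu11"], m["mu20"] - m["mu02"])
print(f"grasp orientation: {np.degrees(angle):.1f} degrees")
```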
Process flow:
1. The robot performs SLAM through lidar to reach the desired position.
2. The system turns on the camera to identify and locate the objects, and computes the pose of each object.
3. The user enters the object category to be grasped, and the system makes a judgment.
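A sketch of the coordinate conversion used when locating objects, assuming pyrealsense2 for deprojection into the camera frame and a hand-eye calibration matrix for the camera-to-robot transform (the identity matrix below is a placeholder for a real calibration result):

```python
# Sketch of the coordinate conversion: pixel + depth -> 3D point in the camera
# frame -> point in the robot base frame. Intrinsics come from the stream
# profile; the 4x4 camera-to-base matrix is a placeholder from calibration.
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
pipeline.start()
try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()
    intrinsics = depth_frame.profile.as_video_stream_profile().get_intrinsics()

    u, v = 320, 240                                  # pixel of interest
    depth_m = depth_frame.get_distance(u, v)
    # 3D point in the camera frame (metres).
    point_cam = rs.rs2_deproject_pixel_to_point(intrinsics, [u, v], depth_m)

    # Placeholder hand-eye calibration result (camera -> robot base).
    T_base_cam = np.eye(4)
    point_base = T_base_cam @ np.array([*point_cam, 1.0])
    print("object position in robot base frame:", point_base[:3])
finally:
    pipeline.stop()
```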
Computer System/Software:
The system is implemented on Ubuntu 16.04, on a Lenovo machine with a 2.6 GHz processor, an RTX 2060 GPU with 8 GB of video memory, and 16 GB of RAM.
Setup:
The robot is equipped with a Hokuyo lidar and a RealSense D435 camera.
Evaluation:
The experiment is done on 7 objects of different colours and shapes, placed in the scene at different positions and angles. Three measurements are obtained: the number of grasp attempts, the number of successful grasps, and the success rate.
Conclusion:
The results show that the proposed method can navigate the robot to the target position and grasp the specified object.
Future works/considerations:
Increase the accuracy of the system by improving the moving object extraction algorithm.
Title: Robotic Arm Grasping and Placing Using Edge Visual Detection System
10/7/2024
Today, we did an elevator pitch at the workshop, explaining what the objective and goals of my project are. We also listed down the hardware that we will need for our projects. Tentatively, mine are:
1. UR10
2. RGBD Camera
3. Raspberry Pi
We also talked about the Gantt chart today. This is the Gantt chart I came up with.
Gantt Chart
Rough sketch of how the setup would be:
11/7/2024
Installed the SDK file for the Intel RealSense D435 RGBD camera.
However, encountered issues with it:
Upon searching for solutions, similar reported issues seem to indicate that the above error is due to long-term storage:
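For reference, a quick sanity check that the camera enumerates at all (assuming the pyrealsense2 Python wrapper is installed); printing the firmware version may matter for a camera that has been in storage:

```python
# Quick check that the camera enumerates over USB at all.
import pyrealsense2 as rs

ctx = rs.context()
devices = ctx.query_devices()
if devices.size() == 0:
    print("No RealSense device found - check the USB 3.0 lead/port.")
for dev in devices:
    print(dev.get_info(rs.camera_info.name),
          dev.get_info(rs.camera_info.serial_number),
          dev.get_info(rs.camera_info.firmware_version))
```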
Objective:
To present a robotic system that can rearrange objects into a specific goal state, including reorientation and regrasping for final placement.
Method:
The system runs detection, pose estimation and motion planning to rearrange objects:
i) 6D pose estimation and volumetric reconstruction
ii) Motion waypoint selection that pairs start and end waypoints via learned filtering
iii) Trajectory generation by motion planning using the selected waypoints (a toy interpolation sketch follows below).
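As a toy sketch of step iii only: dense joint-space interpolation between one paired start/end waypoint. The paper uses proper motion planning, and the joint values below are placeholders:

```python
# Toy sketch of step (iii): generate a dense joint-space trajectory between a
# selected start/end waypoint pair by linear interpolation. A real system
# would use a motion planner; the joint values here are placeholders.
import numpy as np

start_waypoint = np.array([0.0, -0.6, 0.4, -1.2, 0.0, 1.0, 0.5])  # 7-DOF pose
end_waypoint = np.array([0.3, -0.2, 0.1, -1.6, 0.2, 1.4, 0.8])

num_steps = 50
trajectory = np.linspace(start_waypoint, end_waypoint, num_steps)  # (50, 7)

for q in trajectory:
    pass  # send q to the arm's joint controller at a fixed rate
```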
Computer System/Software:
ROS; PyTorch to implement the learned models; PyBullet as the physics engine to simulate the behaviour of objects.
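A minimal sketch of the PyBullet side, showing an object being simulated until it settles (the URDF is a stock pybullet_data asset, not one of the paper's objects):

```python
# Minimal sketch: simulate an object falling and settling in PyBullet so its
# behaviour under placement can be checked.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                      # headless; use p.GUI to visualise
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)

p.loadURDF("plane.urdf")
obj = p.loadURDF("duck_vhacd.urdf", basePosition=[0, 0, 0.2])

for _ in range(240):                     # one simulated second at 240 Hz
    p.stepSimulation()

pos, orn = p.getBasePositionAndOrientation(obj)
print("settled pose:", pos, orn)
p.disconnect()
```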
Setup:
Franka Emika Panda robot, with a RealSense D435 mounted on the robotic arm.
Two types of suction gripper:
I-shape
L-shape
Evaluation:
Done in both simulation and the real world. Six large/medium-sized objects were used in both the simulation and real-world experiments: drill, cracker box, sugar box, mustard bottle, pitcher and detergent.
Conclusion:
The authors' system improves both efficiency and success rate, and has shown it is capable of dynamic reorientation for significant rotations and of precise placement in various target configurations.
Future works/considerations:
Possible combinations of learning models with traditional motion planning.
Objective
By the end of this project, I will have investigated whether a robot system is able to pick an object based on spoken instruction through vision and then place the picked item onto the speaker's palm within 15 seconds.
Goals:
Investigate if:
Robot arm is able to identify an object based on speech and vision.
Robot is able to automatically grasp objects through vision.
Robot is able to grasp objects of different shapes and sizes.
Robot is able to grasp an object in different orientations.
Robot is able to place the object onto the palm of a person.
However, after discussing with the supervisors, I have decided to have a more focused objective and goals:
Objective
By the end of this project, I will have investigated whether a robot system is able to pick an object through vision and then place the picked item onto a person's palm within 15 seconds.
Goals:
Investigate if:
Robot arm is able to identify an object based on vision.
Robot is able to automatically grasp objects through vision.
Robot is able to grasp an object in different orientations.
Robot is able to place the object onto the palm of a person.
Literature Summary
Read through 3 papers and wrote a summary for each:
A Vision-Based Robot Grasping System
Robotic Object Recognition and Grasping with a Natural Background
Vision-Based Robotic Arm Control Algorithm using Deep Reinforcement Learning for Autonomous Objects Grasping
Objective:
Increasing the grasp pose detection accuracy for a variety of daily household objects from visual sensing only.
Setup:
Franka Panda robot arm equipped with a parallel gripper. An Intel RealSense D435 is attached to the arm, just above the gripper.
Method:
For the grasp detector, the system uses a densely connected Feature Pyramid Network (FPN) feature extractor.
For the robot system, the vision measurement algorithm generates the grasp pose directly from a single input modality: the depth image.
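To make the input/output modality concrete, here is a minimal Keras sketch of a depth-image-in, grasp-pose-out network. This is not the paper's densely connected FPN, just an illustration of the idea:

```python
# Minimal Keras sketch of the depth-only idea: a small CNN that maps a single
# depth image to grasp parameters (x, y, angle, width). NOT the paper's
# densely connected FPN - only an illustration of the input/output modality.
import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 1))          # single depth channel
x = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")(inputs)
x = tf.keras.layers.Conv2D(64, 3, strides=2, activation="relu")(x)
x = tf.keras.layers.Conv2D(128, 3, strides=2, activation="relu")(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
grasp = tf.keras.layers.Dense(4)(x)                   # (x, y, angle, width)

model = tf.keras.Model(inputs, grasp)
model.compile(optimizer="adam", loss="mse")
model.summary()
```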
Computer system/software:
TensorFlow deep learning library, written in Python. PC running Ubuntu 18.04, equipped with an NVIDIA GeForce 1080 Ti GPU, an Intel Core i7-6700K CPU @ 4.00 GHz × 8, and 32 GB of memory.
Evaluation:
Two public datasets were used for validation: Cornell Grasp Dataset and Jacquard Dataset.
Three types of experiments were done: i) grasping a single object at different poses; ii) grasping 51 different objects that are not included in the public datasets; iii) grasping multiple objects at one time.
Conclusion:
The three experiments showed that the model is able to grasp a wide range of daily objects in various poses.
Future works/considerations:
Incorporate tactile sensing in the grasping system to give a higher grasping success rate.
Incorporate an RGB-D visual servoing controller into the parallel-gripper grasp system to eliminate execution error.
Title: Vision-Based Robotic Arm Control Algorithm Using Deep Reinforcement Learning for Autonomous Objects Grasping
Computer system/software:
Anaconda Python packages: Jupyter, TensorFlow, Keras and Matplotlib.
Evaluation:
Evaluate the efficiency of the 5-DOF robot arm in grasping a determined object.
Train the model for 400 episodes and obtain accuracy and error results (a toy sketch of such a loop is given below).
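A self-contained toy stand-in for this loop, using a 2-link planar arm with naive hill-climbing rather than the paper's 5-DOF arm and DRL agent, just to show the per-episode error being recorded:

```python
# Toy stand-in for the evaluation above: a 2-joint planar arm and a naive
# hill-climbing "agent" run for 400 episodes, recording the end-effector
# error per episode (the paper uses a 5-DOF arm and deep RL).
import numpy as np

LINK1, LINK2 = 1.0, 1.0                # link lengths
target = np.array([1.2, 0.8])          # reachable target point

def end_effector(q):
    # Forward kinematics of the planar 2-link arm.
    x = LINK1 * np.cos(q[0]) + LINK2 * np.cos(q[0] + q[1])
    y = LINK1 * np.sin(q[0]) + LINK2 * np.sin(q[0] + q[1])
    return np.array([x, y])

rng = np.random.default_rng(0)
q = rng.uniform(-np.pi, np.pi, 2)      # initial joint angles
errors = []

for episode in range(400):
    candidate = q + rng.normal(0.0, 0.1, 2)          # perturb joint angles
    if (np.linalg.norm(end_effector(candidate) - target)
            < np.linalg.norm(end_effector(q) - target)):
        q = candidate                                 # keep the improvement
    errors.append(np.linalg.norm(end_effector(q) - target))

print(f"final error: {errors[-1]:.4f}")  # plot errors vs. episode to see the trend
```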
Conclusion:
Despite some error, every joint angle can be calculated and the end-effector can reach the determined location.
The decrease in the error range across episodes showed that the reinforcement learning algorithm can reach a targeted object using the robot arm's inverse kinematics.
Future works/considerations:
Expand the model by integrating the pick-and-place task.
Title: Robotic Object Recognition and Grasping with a Natural Background
Using the relative distance between the object centroid and the gripper, the algorithm guides the robot to move the gripper to the object and form a proper grasping posture to complete the task.
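A minimal sketch of this guidance rule, assuming a binary object mask is available; the mask, gripper position and gain below are placeholders:

```python
# Sketch of the guidance rule above: compute the object centroid from a
# binary mask and take a proportional step of the gripper toward it in image
# space. The mask source and the gain are placeholders.
import cv2
import numpy as np

mask = np.zeros((480, 640), np.uint8)
cv2.circle(mask, (400, 300), 40, 255, -1)   # stand-in for a segmented object

m = cv2.moments(mask, binaryImage=True)
centroid = np.array([m["m10"] / m["m00"], m["m01"] / m["m00"]])

gripper = np.array([320.0, 240.0])          # current gripper position in image
gain = 0.2                                  # proportional gain (placeholder)

error = centroid - gripper                  # relative distance, as described
gripper = gripper + gain * error            # move a fraction of the way per cycle
print("centroid:", centroid, "new gripper position:", gripper)
```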