Introduction
Task: The objective of our project is to stack blocks into a tower using the Sawyer robot arm. We wanted to give the user the flexibility to stack the blocks in any desired order.
Why Interesting: One thing that made this project interesting was identifying the correct block to pick up, since the blocks are not placed in the same location or orientation every time. Finding the blocks at all is also difficult: sometimes not all of the blocks are within the camera's field of view, so the robot must move itself until it can see them. Building the stack at an arbitrary location adds a further challenge, because the robot needs to reach the remaining blocks without knocking over the partially built stack.
Real-World Applications: There are many industries where a robot that can stack objects would be helpful. For instance, a robot that can autonomously stack heavy crates onto a train would be more time- and space-efficient. If the robot is precise enough, it could build houses by stacking bricks, or erect temporary housing very quickly after a disaster. In an assembly line, a robot could stack parts efficiently, making packing easier.
Design
Desired Functionality: The robot must be able to stack at least 5 blocks in a desired location in an amount of time comparable to a human. The robot should be able to stack the blocks reliably in any order determined by the user.
Chosen Design: We decided to use AR tags to identify each of the blocks so the robot knew which block to pick up. We used the camera on the Sawyer to identify the blocks and recognize the AR tags. We used ROS Noetic for the overall software architecture, MoveIt for inverse kinematics, motion planning, and control, and Alvar for AR tag recognition.
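To illustrate how a user-specified stacking order can be matched against the tags the camera currently sees, here is a minimal sketch. The `Detection` structure and the tag IDs are hypothetical stand-ins, not our exact ROS/Alvar message handling:

```python
# Sketch: map the user's desired stacking order onto detected AR tags.
# `Detection` is an illustrative structure, not the real AlvarMarkers message.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Detection:
    tag_id: int   # Alvar marker ID printed on the block
    x: float      # tag position in the robot base frame (meters)
    y: float

def next_block(order: List[int], stacked: List[int],
               detections: List[Detection]) -> Optional[Detection]:
    """Return the detection matching the next tag in the user's order,
    or None if that tag is not currently visible (triggering a search)."""
    remaining = [t for t in order if t not in stacked]
    if not remaining:
        return None                 # tower is complete
    target = remaining[0]
    for d in detections:
        if d.tag_id == target:
            return d
    return None                     # target block not in view -> keep sweeping

dets = [Detection(4, 0.5, 0.1), Detection(2, 0.6, -0.2)]
print(next_block([2, 4, 7], [], dets).tag_id)  # 2 is next in the order
```

Returning `None` when the target tag is out of view is what drives the sweeping search described below.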
Design Considerations and Challenges: We used the camera on the Sawyer because it was built in and convenient. We considered using a Kinect, but integration proved to be difficult because we could not install dependencies on the Sawyer-connected computers. The downside of not having a fixed external camera was that the Sawyer's built-in cameras were very low resolution, so in order to accurately localize the AR tags, we needed to move the camera very close to the table. This prevented us from seeing all the blocks at once, so we implemented searching functionality to adaptively look for the tags. The position of the stack is fixed so that the Sawyer arm wouldn't knock the stack over while stacking and finding the other blocks; the downside is that the position of the stack is not flexible. We also used an end effector without rubber tips. Finally, we made the robot interpolate its motions to move in straight lines in task space, ensuring that we could move precisely and reliably around our tower of blocks rather than at the whims of the inverse kinematics solver.
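The straight-line interpolation amounts to feeding the planner a series of closely spaced Cartesian waypoints instead of a single goal pose (with MoveIt this corresponds to a Cartesian path request). A minimal sketch of the waypoint generation itself, with an illustrative step size:

```python
# Sketch: straight-line interpolation in task space. Rather than letting the
# IK solver pick an arbitrary path to a single goal, we sample evenly spaced
# waypoints along the segment so the end effector tracks a straight line.
import math

def straight_line_waypoints(start, goal, step=0.01):
    """Evenly spaced 3D waypoints from start to goal, about `step` meters apart."""
    dist = math.dist(start, goal)
    n = max(math.ceil(dist / step), 1)   # number of segments
    return [tuple(s + (g - s) * i / n for s, g in zip(start, goal))
            for i in range(n + 1)]

# Descend 0.2 m straight down toward a block, 5 cm per waypoint:
pts = straight_line_waypoints((0.4, 0.0, 0.3), (0.4, 0.0, 0.1), step=0.05)
print(len(pts))  # 5 waypoints (4 segments plus the endpoint)
```

Each waypoint is then handed to the IK/planning layer, which only ever has to bridge a short, unambiguous gap.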
Implementation
Hardware and parts: The only hardware we used was the Sawyer robotic arm by Rethink Robotics.
Step-by-step breakdown:
- Sawyer goes to set position.
- Camera searches for AR tag as the Sawyer arm sweeps to find it.
- Within the software, waypoints are created for the end effector to follow.
- Once above the cube, the arm moves down and rotates to match the block's orientation.
- The Sawyer arm then goes to a designated position to stack the block.
- The process is repeated until all blocks are stacked.
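The steps above can be condensed into a single control loop. This is a sketch with perception and motion stubbed out as plain callables; the function names and the fixed stack pose are illustrative, not our actual code:

```python
# Sketch of the pick-and-place loop: search -> pick -> place, repeated per block.
STACK_POSE = (0.7, -0.3)          # fixed stack location (illustrative values)

def stack_blocks(order, find_tag, pick, place):
    """Stack blocks in the user's `order`. `find_tag(tag_id)` sweeps the
    camera and returns the tag's pose, or None if it never comes into view."""
    stacked = []
    for tag_id in order:
        pose = find_tag(tag_id)   # sweep left from the home position
        if pose is None:
            break                 # tag never entered the camera's view
        pick(pose)                # descend, match block orientation, grip
        place(STACK_POSE)         # move to the fixed stack position, release
        stacked.append(tag_id)
    return stacked

# Toy run with fake perception/motion, just to show the flow:
poses = {1: (0.5, 0.2), 2: (0.6, 0.1)}
done = stack_blocks([1, 2], poses.get, pick=lambda p: None, place=lambda p: None)
print(done)  # [1, 2]
```

The early `break` mirrors the real failure mode we observed: if the search never finds the next tag, stacking simply stops.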
Results
End Result: The project worked well! By the end of the project, our robot was able to reliably stack 9 blocks in a predetermined spot. Although the cubes were not perfectly aligned, the placement was accurate enough that a 9-cube tower stood reliably. The robot was also able to pick up cubes at any location and orientation, so long as the AR tags were facing upward and were within the scope of the Sawyer's camera. To find the correct cube, the Sawyer first tucked (went to its home position, which oriented the camera downward), and then performed a sweeping motion to the left, where the cubes were randomly placed. The only failures occurred when the Sawyer could not find the next cube it needed to stack, usually because the cube was out of the camera's scope or the camera was oriented in the wrong direction to see the AR tag on the cube.
Conclusion
Completion of Task: Overall, this project met most of the design requirements. The robot was reliably able to stack at least 5 blocks, and it moved at a reasonable speed. The robot was able to smoothly pick up and place the cubes without knocking over the stack, and it could stack the blocks in any order determined by the user. The only criterion that was not met was having an arbitrary stacking position, but that was a compromise we made from the beginning in our design choices. We also could have increased the speed of stacking, but with the camera we were using, doing so would likely have decreased the precision of the stacking.
Difficulties Encountered: Getting the camera to see all the AR tags was very difficult. The Sawyer camera had to be in the correct orientation to see all the AR tags, since the camera was mounted to the joint before the end effector and thus did not have a full range of motion. To get around this, we set all of the joint angles to a known safe initial position, then performed small, safe sideways movements that rotate the camera in a predictable way. We also had trouble with the camera parameters. The existing ones were incorrect, causing the robot to read a series of AR tags on the flat table as lying on a significantly curved surface. We could not figure out how to change these settings (largely due to the opaque existing software stack), so we compensated for the curvature as best we could using simple first-order correction factors.
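The curvature compensation can be illustrated as a first-order (linear) correction of measured depth against radial distance from the image center: with bad intrinsics, flat tags appear to bow away from the camera, and a linear term fitted on known-flat tags removes most of the error. The coefficient below is made up for illustration, not our measured value:

```python
# Sketch: first-order correction for apparent table "curvature" caused by
# incorrect camera parameters. The coefficient k is illustrative only.
def corrected_depth(z, r, k=0.08):
    """Subtract a depth error proportional to radial distance r (meters)
    from the image center, measured in the camera plane."""
    return z - k * r

# A tag near the image edge (r = 0.25 m) read as 2 cm too deep;
# the linear term removes that error:
print(round(corrected_depth(0.52, 0.25), 3))  # 0.5
```

In practice the coefficient would be fit by regressing measured tag depth against radial distance for tags known to lie on the flat table.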
Future Improvements: Currently, we have a fixed stack position. If we had additional time, we would implement PID control to further smooth the trajectory that the Sawyer takes. Additionally, we would account for more of the dynamic variables to better plan the motion trajectory, and we would choose an optimal path for the Sawyer, one that is short but still avoids the stack. Having one camera limited the potential of our robot, so we would add a stationary camera, which would widen the field of view and allow us to see all the AR tags at once. By seeing the positions of all the AR tags at the same time, we could also let the user choose the stack location, either with an AR tag or as input on the command line. A higher-quality camera would also allow the AR tags to be recognized from farther away.
Our Team
Contributions: Everyone contributed equally to brainstorming the task and design, planning the project timeline, accomplishing milestones, and experimenting with and debugging different code implementations. There was always more than one person present when working with the robot, not only for safety but also to discuss ideas while debugging code, to consider different approaches to a problem, and to reset the blocks for each test of the code.
Luke Millam is a mechanical engineering student with previous experience in process engineering and ML engineering.
Reid Dye is a mechanical engineering student with experience in optimal control and numerical computing.
Anoushka Chitnis is a mechanical engineer with experience working with CAD and mechatronics.
Shaolun Zhang is a CS student with experience in machine learning and deep learning.
Additional Materials
Our ROS code can be found here.