Note: This tutorial assumes that you have completed the previous tutorials: Writing a Simple Image Publisher (C++), Writing a Simple Image Subscriber (C++), Learning about tf and time (C++), Recording and playing back data.
Projecting a TF frame onto an image (C++)
Description: This tutorial shows how to use image_geometry together with tf to project a tf frame onto a live image stream.Tutorial Level: INTERMEDIATE
In this tutorial we will write a node that projects one or more tf frames to image coordinates and draws them on the image:
Although this is a mathematically simple operation, this tutorial is complex in that it integrates several different software components: OpenCV, tf, bag files, image_transport and of course(!) image_geometry. If you write vision code using ROS, you will be using these packages frequently. Make sure that you understand the recommended tutorials at the top before diving in.
Preliminaries
First, begin downloading the sample data: plug_on_base.bag (750 MB).
Create a sandbox package learning_image_geometry to work in and experiment with the example code, with the following dependencies:
For a rosbuild workspace:
$ roscreate-pkg learning_image_geometry image_geometry image_transport opencv2 cv_bridge tf image_view
For a catkin workspace:
$ catkin_create_pkg learning_image_geometry image_geometry image_transport opencv2 cv_bridge tf image_view
Make sure that the learning_image_geometry directory is included in your ROS_PACKAGE_PATH.
Writing the processing node
Here we'll create a node that subscribes to a camera image stream and tf information, draws the requested frames on each incoming image, and publishes the annotated images on an output topic.
Change to the directory you've created for these tutorials:
$ roscd learning_image_geometry
The Code
Open a new file draw_frames.cpp with your favorite editor and place the following inside it: (Code for ROS Groovy)
1 #include <ros/ros.h>
2 #include <image_transport/image_transport.h>
3 #include <opencv/cv.h>
4 #include <cv_bridge/cv_bridge.h>
5 #include <image_geometry/pinhole_camera_model.h>
6 #include <tf/transform_listener.h>
7 #include <boost/foreach.hpp>
8 #include <sensor_msgs/image_encodings.h>
9 
10 class FrameDrawer
11 {
12   ros::NodeHandle nh_;
13   image_transport::ImageTransport it_;
14   image_transport::CameraSubscriber sub_;
15   image_transport::Publisher pub_;
16   tf::TransformListener tf_listener_;
17   image_geometry::PinholeCameraModel cam_model_;
18   std::vector<std::string> frame_ids_;
19   CvFont font_;
20 
21 public:
22   FrameDrawer(const std::vector<std::string>& frame_ids)
23     : it_(nh_), frame_ids_(frame_ids)
24   {
25     std::string image_topic = nh_.resolveName("image");
26     sub_ = it_.subscribeCamera(image_topic, 1, &FrameDrawer::imageCb, this);
27     pub_ = it_.advertise("image_out", 1);
28     cvInitFont(&font_, CV_FONT_HERSHEY_SIMPLEX, 0.5, 0.5);
29   }
30 
31   void imageCb(const sensor_msgs::ImageConstPtr& image_msg,
32                const sensor_msgs::CameraInfoConstPtr& info_msg)
33   {
34     cv::Mat image;
35     cv_bridge::CvImagePtr input_bridge;
36     try {
37       input_bridge = cv_bridge::toCvCopy(image_msg, sensor_msgs::image_encodings::BGR8);
38       image = input_bridge->image;
39     }
40     catch (cv_bridge::Exception& ex) {
41       ROS_ERROR("[draw_frames] Failed to convert image");
42       return;
43     }
44 
45     cam_model_.fromCameraInfo(info_msg);
46 
47     BOOST_FOREACH(const std::string& frame_id, frame_ids_) {
48       tf::StampedTransform transform;
49       try {
50         ros::Time acquisition_time = info_msg->header.stamp;
51         ros::Duration timeout(1.0 / 30);
52         tf_listener_.waitForTransform(cam_model_.tfFrame(), frame_id,
53                                       acquisition_time, timeout);
54         tf_listener_.lookupTransform(cam_model_.tfFrame(), frame_id,
55                                      acquisition_time, transform);
56       }
57       catch (tf::TransformException& ex) {
58         ROS_WARN("[draw_frames] TF exception:\n%s", ex.what());
59         return;
60       }
61 
62       tf::Point pt = transform.getOrigin();
63       cv::Point3d pt_cv(pt.x(), pt.y(), pt.z());
64       cv::Point2d uv;
65       uv = cam_model_.project3dToPixel(pt_cv);
66 
67       static const int RADIUS = 3;
68       cv::circle(image, uv, RADIUS, CV_RGB(255,0,0), -1);
69       CvSize text_size;
70       int baseline;
71       cvGetTextSize(frame_id.c_str(), &font_, &text_size, &baseline);
72       CvPoint origin = cvPoint(uv.x - text_size.width / 2,
73                                uv.y - RADIUS - baseline - 3);
74       cv::putText(image, frame_id.c_str(), origin, cv::FONT_HERSHEY_SIMPLEX, 0.5, CV_RGB(255,0,0));
75     }
76 
77     pub_.publish(input_bridge->toImageMsg());
78   }
79 };
80 
81 int main(int argc, char** argv)
82 {
83   ros::init(argc, argv, "draw_frames");
84   std::vector<std::string> frame_ids(argv + 1, argv + argc);
85   FrameDrawer drawer(frame_ids);
86   ros::spin();
87 }
The Code Explained
Now, let's break down the code piece by piece.
Most of these headers should already be familiar to you. image_geometry/pinhole_camera_model.h defines the PinholeCameraModel class which we will use to project a 3d point into image coordinates.
Let's briefly detour to the main() function:
Here we do the usual ROS setup. ros::init() removes any ROS command-line remapping arguments. The remaining arguments are the frame IDs we want drawn; we copy those into a vector.
FrameDrawer is our workhorse class for this example. Let's see what it does.
These members are for all our ROS communication needs.
These two objects are our bridge from the ROS message types sensor_msgs/Image and sensor_msgs/CameraInfo into the world of OpenCV. We will use them in our image callback.
Finally we have the list of frame IDs to draw and a CvFont object we'll use for rendering the text.
The constructor. Our node handle nh_ is constructed first and initializes the node; we then pass it to the ImageTransport constructor. tf_listener_ automatically connects to the /tf topic.
We subscribe to the synchronized Image + CameraInfo stream. image_transport::CameraSubscriber subscribes to the image topic and its sibling camera_info topic. Review Writing a Simple Image Subscriber for more information on image_transport subscriptions.
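As an aside, subscribeCamera also accepts an optional image_transport::TransportHints argument if you want to request a particular transport. A minimal sketch, assuming the compressed transport plugin is available (this is not part of the tutorial code):

  // Sketch only: request the "compressed" transport instead of raw images.
  // This would replace the subscription line in the constructor above.
  sub_ = it_.subscribeCamera(image_topic, 1, &FrameDrawer::imageCb, this,
                             image_transport::TransportHints("compressed"));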
We advertise our annotated output image topic; see Writing a Simple Image Publisher. Finally, we initialize our font object.
Now we get to the image callback. When processing data from a calibrated camera, we need both the image and the CameraInfo message which contains the calibration parameters.
We convert the Image message to an OpenCV cv::Mat via cv_bridge. Review Using CvBridge to Convert Between ROS Images and OpenCV Images for details.
45 cam_model_.fromCameraInfo(info_msg);
Here we fill our PinholeCameraModel using the information from the CameraInfo message. Note that cam_model_ is a member variable rather than a local one. In the common case that the camera parameters do not change, updating a previously filled camera model is very cheap.
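Once the model has been filled you can also read the calibration back out of it. The following sketch (not part of the tutorial code) logs the intrinsics and the camera's tf frame once, which can be handy for debugging:

  // Sketch only: print the focal lengths, principal point and tf frame
  // reported by the camera model the first time an image arrives.
  ROS_INFO_ONCE("Camera frame '%s': fx=%.1f fy=%.1f cx=%.1f cy=%.1f",
                cam_model_.tfFrame().c_str(),
                cam_model_.fx(), cam_model_.fy(),
                cam_model_.cx(), cam_model_.cy());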
47     BOOST_FOREACH(const std::string& frame_id, frame_ids_) {
48       tf::StampedTransform transform;
49       try {
50         ros::Time acquisition_time = info_msg->header.stamp;
51         ros::Duration timeout(1.0 / 30);
52         tf_listener_.waitForTransform(cam_model_.tfFrame(), frame_id,
53                                       acquisition_time, timeout);
54         tf_listener_.lookupTransform(cam_model_.tfFrame(), frame_id,
55                                      acquisition_time, transform);
56       }
57       catch (tf::TransformException& ex) {
58         ROS_WARN("[draw_frames] TF exception:\n%s", ex.what());
59         return;
60       }
For each requested frame ID, we query tf for the position of that frame in the optical frame of the camera (cam_model_.tfFrame(), pulled out of the CameraInfo header). The operations here are described in Learning about tf and time.
Notice that for maximal accuracy, we look up the transform at the time the image was acquired by the camera rather than ros::Time::now() or ros::Time(0) (i.e. the most recent transform). For this example we assume the camera is streaming at 30fps, so we are willing to wait no more than 1/30s for the transform to become available.
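If no transform is available at exactly the acquisition time (for example, during the first few frames after startup), you could fall back to the most recent transform instead of skipping the image entirely. A rough sketch of that trade-off, not used in the tutorial code (ros::Time(0) means "latest available" and sacrifices some accuracy):

      // Sketch only: prefer the transform at the image timestamp, but fall
      // back to the latest available transform if tf cannot provide it.
      ros::Time lookup_time = acquisition_time;
      if (!tf_listener_.canTransform(cam_model_.tfFrame(), frame_id, acquisition_time))
        lookup_time = ros::Time(0);
      tf_listener_.lookupTransform(cam_model_.tfFrame(), frame_id, lookup_time, transform);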
We extract the origin of the requested frame in the camera optical frame and store it in a cv::Point3d. Then we use our image_geometry::PinholeCameraModel to project that point into image coordinates.
We assume here that we are listening on an undistorted image topic. The raw images from the camera driver are typically distorted, preventing a clean projective mapping from 3D world coordinates to 2D image coordinates; characterizing this distortion is an important part of camera calibration. image_geometry includes methods for undistorting images, but in ROS this is typically performed by an image_proc or stereo_image_proc node sitting between the camera driver and your vision processing node. See the image_pipeline for details.
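If you do need to draw on the raw (distorted) images instead, image_geometry can map the ideal pinhole projection back into raw pixel coordinates. A sketch of what the projection step could look like in that case (not used in this tutorial):

      // Sketch only: project into rectified image coordinates as before, then
      // apply the distortion model to obtain coordinates in the raw image.
      cv::Point2d uv_rect = cam_model_.project3dToPixel(pt_cv);
      cv::Point2d uv_raw = cam_model_.unrectifyPoint(uv_rect);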
67       static const int RADIUS = 3;
68       cv::circle(image, uv, RADIUS, CV_RGB(255,0,0), -1);
69       CvSize text_size;
70       int baseline;
71       cvGetTextSize(frame_id.c_str(), &font_, &text_size, &baseline);
72       CvPoint origin = cvPoint(uv.x - text_size.width / 2,
73                                uv.y - RADIUS - baseline - 3);
74       cv::putText(image, frame_id.c_str(), origin, cv::FONT_HERSHEY_SIMPLEX, 0.5, CV_RGB(255,0,0));
Finally, we draw a filled red circle at the position of the frame and print its ID centered above it. For more information see the OpenCV drawing functions.
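One caveat: project3dToPixel happily returns coordinates outside the image bounds, and gives meaningless results for points behind the camera, so in your own code you may want to check the result before drawing. The sketch below does that, and also uses the newer C++ text functions (cv::getTextSize / cv::putText) in place of the legacy CvFont calls; it is illustrative only and not part of the tutorial listing:

      // Sketch only: skip frames that do not project into the visible image,
      // then size and draw the label with the C++ API instead of CvFont.
      if (pt_cv.z <= 0 ||
          uv.x < 0 || uv.x >= image.cols || uv.y < 0 || uv.y >= image.rows)
        continue;  // inside the BOOST_FOREACH loop: ignore this frame
      int baseline = 0;
      cv::Size text_size = cv::getTextSize(frame_id, cv::FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseline);
      cv::circle(image, uv, RADIUS, CV_RGB(255,0,0), -1);
      cv::putText(image, frame_id,
                  cv::Point(uv.x - text_size.width / 2, uv.y - RADIUS - baseline - 3),
                  cv::FONT_HERSHEY_SIMPLEX, 0.5, CV_RGB(255,0,0));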
77 pub_.publish(input_bridge->toImageMsg());
Finally, we convert the annotated cv::Mat back to a ROS Image message and publish it. Since image shares its underlying data with input_bridge->image, everything we drew appears in the published message.
Writing the launch file
For convenience, we'll make a launch file to start our processing node and a couple of instances of image_view.
The launch file
PR2 Beta Workshop Participants
Make sure to use the PR2 launch file below to run off live data from your robot.
If you want to try out the bag provided in this tutorial, you must use a separate roscore from your robot's. Be careful!
Open a new file draw.launch and place inside it:
For use with the example bag file
1 <launch>
2   <param name="/use_sim_time" value="True" />
3 
4   <node name="draw_frames" pkg="learning_image_geometry" type="draw_frames" args="/base_link /base_laser_link /r_gripper_tool_frame" output="screen">
5     <remap from="image" to="forearm_camera_r/image_rect_color" />
6   </node>
7 
8   <node name="input_viewer" pkg="image_view" type="image_view">
9     <remap from="image" to="forearm_camera_r/image_rect_color" />
10     <param name="autosize" value="True" />
11   </node>
12 
13   <node name="output_viewer" pkg="image_view" type="image_view">
14     <remap from="image" to="image_out" />
15     <param name="autosize" value="True" />
16   </node>
17 </launch>
For use with a PR2 robot:

<launch>
  <node name="draw_frames" pkg="learning_image_geometry" type="draw_frames" args="/base_link /base_laser_link /r_gripper_tool_frame" output="screen">
    <remap from="image" to="r_forearm_cam/image_rect_color" />
  </node>

  <node name="input_viewer" pkg="image_view" type="image_view">
    <remap from="image" to="r_forearm_cam/image_rect_color" />
    <param name="autosize" value="True" />
  </node>

  <node name="output_viewer" pkg="image_view" type="image_view">
    <remap from="image" to="image_out" />
    <param name="autosize" value="True" />
  </node>
</launch>
The launch file explained
2 <param name="/use_sim_time" value="True" />
This is a tricky bit. On a live system, ROS time is considered the same as wall-clock time. For this test, however, we are running from a bag file; the timestamps of the bagged messages are far in the past relative to the current time. Therefore we need to tell all of our nodes to use simulated time. At startup, each node checks the /use_sim_time parameter; if it is set to true, they will listen on the /clock topic (which rosbag will publish) for the current ROS time. See Clock for details.
Especially when using tf, trying to run a node against a bag file without forcing simulated time is a common source of issues. If you do this, tf will complain about timestamps far in the past.
4 <node name="draw_frames" pkg="learning_image_geometry" type="draw_frames" args="/base_link /base_laser_link /r_gripper_tool_frame" output="screen">
5 <remap from="image" to="forearm_camera_r/image_rect_color" />
6 </node>
Launch the draw_frames node we just wrote, remapping image to the rectified color image stream from the right forearm camera of a PR2 robot. On a live system, the image_rect_color topic is published by image_proc from the image_raw topic published by the camera driver.
8 <node name="input_viewer" pkg="image_view" type="image_view">
9 <remap from="image" to="forearm_camera_r/image_rect_color" />
10 <param name="autosize" value="True" />
11 </node>
Launch a viewer for the unprocessed image stream.
13 <node name="output_viewer" pkg="image_view" type="image_view">
14 <remap from="image" to="image_out" />
15 <param name="autosize" value="True" />
16 </node>
Launch a viewer for the annotated output stream /image_out produced by our processing node.
Building your node
Open your CMakeLists.txt file and, for a rosbuild package, append the following line at the bottom:
rosbuild_add_executable(draw_frames draw_frames.cpp)
Then
$ rosmake learning_image_geometry
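The rosbuild_add_executable/rosmake steps above apply to a rosbuild package. If you created the package with catkin_create_pkg instead, the additions to CMakeLists.txt would look roughly like the sketch below; the exact component list must match your package.xml:

# Sketch of the catkin equivalent; adjust the components to your package.xml.
find_package(catkin REQUIRED COMPONENTS cv_bridge image_geometry image_transport roscpp tf)
find_package(OpenCV REQUIRED)
catkin_package()
include_directories(${catkin_INCLUDE_DIRS} ${OpenCV_INCLUDE_DIRS})
add_executable(draw_frames draw_frames.cpp)
target_link_libraries(draw_frames ${catkin_LIBRARIES} ${OpenCV_LIBRARIES})

With catkin, build with catkin_make from the root of your workspace instead of rosmake.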
Trying it out
Make sure you have a roscore running, then launch your processing and visualization nodes:
$ roslaunch draw.launch
Now play back the recorded data:
$ rosbag play plug_on_base.bag --clock
This bag file is not compatible with ROS Groovy, because the sensor_msgs/CameraInfo message definition has changed since the bag was recorded. You will get an error along the lines of "Client [/draw_frames] wants topic /forearm_camera_r/camera_info to have datatype/md5sum ...".
The --clock option tells rosbag to publish simulated time on the /clock topic. Recall that this is necessary to avoid confusing time-aware components like tf when using bag files.
You should now see the annotated image in the "/image_out" window.