MBPTrack Tutorial - SOTA 3D Point Cloud Object Tracking in 2023 for LiDAR & Radar
A complete guide to 3D single object tracking on custom point cloud episodes in the Supervisely LiDAR annotation toolbox.
In this tutorial you will learn about state-of-the-art methods for 3D point cloud object tracking in computer vision and how to use them in Supervisely's point cloud annotation toolbox.
The computer vision industry has developed rapidly over the past few years, which has led to new ways of representing and processing visual information. Three-dimensional (3D) representations of the surrounding environment are now widely used in promising areas such as robotics and autonomous driving.
Modern neural network architectures make it possible to process 3D data efficiently. One of the most in-demand methods for processing 3D LiDAR or Radar data is 3D single object tracking.
This video tutorial explains all the steps, from connecting your computer to running the MBPTrack model and using it inside the 3D point cloud annotation tool to significantly speed up and automate manual labeling.
What is 3D single object tracking?
3D single object tracking is a computer vision task: tracking a single object in three-dimensional space, starting from an initial 3D bounding box. The task is traditionally performed on point clouds, which are discrete sets of data points in 3D space. Point clouds are typically generated by LiDAR (light detection and ranging) sensors.
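To make these inputs concrete, here is a minimal sketch in plain Python of a point cloud, a 3D bounding box, and the selection of points that fall inside the box. The `Box3D` class, its field names and the yaw-only rotation are assumptions made for illustration, not the Supervisely or MBPTrack data format.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    # Hypothetical box layout: center, size and rotation around the vertical axis.
    cx: float
    cy: float
    cz: float
    length: float  # extent along the box's heading direction
    width: float   # extent across the heading direction
    height: float  # vertical extent
    yaw: float     # heading angle in radians, counter-clockwise from +x


def points_in_box(points, box):
    """Return the points (x, y, z) that lie inside the oriented box."""
    cos_y, sin_y = math.cos(box.yaw), math.sin(box.yaw)
    inside = []
    for x, y, z in points:
        # Move the point into the box frame: translate, then rotate by -yaw.
        dx, dy, dz = x - box.cx, y - box.cy, z - box.cz
        local_x = cos_y * dx + sin_y * dy
        local_y = -sin_y * dx + cos_y * dy
        if (abs(local_x) <= box.length / 2
                and abs(local_y) <= box.width / 2
                and abs(dz) <= box.height / 2):
            inside.append((x, y, z))
    return inside


# A point cloud is just a collection of 3D points.
cloud = [(10.0, 0.0, 0.5), (10.5, 0.4, 0.8), (14.0, 3.0, 0.2), (9.0, -0.3, 1.0)]
car = Box3D(cx=10.0, cy=0.0, cz=0.75, length=4.0, width=2.0, height=1.5, yaw=0.0)
target_points = points_in_box(cloud, car)
```

A tracker receives a box like `car` for the first frame only and has to estimate where that box moves in every following frame.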
Point clouds provide valuable information about the surrounding environment and the surface of the target object. At the same time, point clouds become increasingly sparse with distance and have an inhomogeneous structure, which makes 3D single object tracking a non-trivial task.
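The growth of sparsity with distance follows from simple geometry: a sensor with a fixed angular resolution places neighbouring points farther apart as the range increases. The sketch below estimates how many returns land on a flat target facing the sensor. The angular resolutions and the target size are assumed values chosen for illustration, not the specification of any particular LiDAR.

```python
import math


def estimated_returns(range_m, target_width_m, target_height_m,
                      horiz_res_deg=0.2, vert_res_deg=0.4):
    """Rough count of LiDAR returns on a flat target facing the sensor.

    Spacing between neighbouring points is about range * angular resolution
    (small-angle approximation), so the count falls roughly as 1 / range**2.
    """
    horiz_spacing = range_m * math.radians(horiz_res_deg)
    vert_spacing = range_m * math.radians(vert_res_deg)
    columns = int(target_width_m / horiz_spacing)
    rows = int(target_height_m / vert_spacing)
    return columns * rows


for distance in (10, 20, 40, 80):
    count = estimated_returns(distance, target_width_m=1.8, target_height_m=1.5)
    print(f"{distance:>3} m: about {count} points")
```

Doubling the distance cuts the number of points on the target to roughly a quarter, which is why distant objects are much harder to track.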
MBPTrack - current state-of-the-art 3D object tracking
Most earlier neural network architectures for 3D single object tracking were based on the Siamese paradigm: the network takes the search area in the current frame and a crop of the target template from the previous frame as input, and then localizes the target in the current frame using a localization network such as an RPN (Region Proposal Network).
In March 2023, a different approach to 3D object tracking was proposed: MBPTrack, later published at ICCV 2023. Unlike previous approaches, MBPTrack uses both temporal and spatial contextual information in the 3D single object tracking task with the help of a memory mechanism.
MBPTrack achieves state-of-the-art performance on such popular benchmarks for 3D single object tracking as KITTI, nuScenes and Waymo Open Dataset:
KITTI dataset leaderboard
nuScenes & Waymo datasets leaderboards
What is MBPTrack?
MBPTrack is a powerful neural network architecture for 3D single object tracking. It uses a memory mechanism to exploit past information and a localization head that predicts the bounding box in a coarse-to-fine manner. In this architecture, past frames with their targetness masks serve as an external memory, and a transformer-based module propagates tracked target cues from the memory to the current frame.
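The memory idea can be illustrated with a toy sketch: keep the last few frames together with their per-point targetness, and let each point of the current frame gather target cues from the memory through attention. This is a simplified stand-in written for this tutorial, not the MBPTrack implementation. The `FrameMemory` class, its capacity and the use of raw 2D vectors as features are all assumptions; the real model attends over learned features inside a transformer.

```python
import math
from collections import deque


class FrameMemory:
    """Toy first-in, first-out memory of past frames (hypothetical helper)."""

    def __init__(self, capacity=3):
        # Each entry is a list of (feature_vector, targetness) pairs.
        self.frames = deque(maxlen=capacity)

    def add(self, features, targetness):
        self.frames.append(list(zip(features, targetness)))

    def read(self, query):
        """Attention-weighted average of stored targetness for one query feature."""
        entries = [entry for frame in self.frames for entry in frame]
        if not entries:
            return 0.0
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key, _ in entries]
        # Softmax over all memory entries (shifted by the max for stability).
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        return sum(w * t for w, (_, t) in zip(weights, entries)) / total


memory = FrameMemory(capacity=2)
memory.add(features=[(1.0, 0.0), (0.0, 1.0)], targetness=[1.0, 0.0])
memory.add(features=[(0.9, 0.1), (0.1, 0.9)], targetness=[1.0, 0.0])

# A query that resembles stored target features reads a high targetness.
on_target = memory.read(query=(5.0, 0.0))
off_target = memory.read(query=(0.0, 5.0))
```

Because the memory holds several past frames rather than a single template, a frame in which the target is sparse or partly occluded does not erase what the tracker knows about it.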
For accurate object localization, MBPTrack uses a coarse-to-fine network named BPLocNet. It first predicts the bounding box center via Hough voting, a technique originally used in computer vision to detect shapes that can be described by parametric curves, in which many local observations each vote for the parameters of the object. Then, leveraging the box prior given in the first frame, it adaptively samples reference points around the target center so that they roughly cover targets of different sizes. Finally, point features are aggregated into the reference points to obtain a dense feature map, on which localization can be performed more effectively.
You can find more details about the MBPTrack architecture in the original paper.
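The two localization steps can be sketched in a few lines. In this toy version each point casts a vote for the box center by adding an offset, the votes are combined into a weighted mean, and a grid of reference points is laid out around the estimated center and scaled by the box size known from the first frame. In MBPTrack the offsets and weights are predicted by the network and the sampling is adaptive; here the offsets, the weights, the grid resolution and the function names `vote_center` and `box_prior_grid` are assumptions made for illustration. Box orientation is ignored for brevity.

```python
def vote_center(points, offsets, weights):
    """Weighted mean of per-point votes, where a vote is point + offset."""
    total = sum(weights)
    sums = [0.0, 0.0, 0.0]
    for point, offset, weight in zip(points, offsets, weights):
        for axis in range(3):
            sums[axis] += weight * (point[axis] + offset[axis])
    return tuple(s / total for s in sums)


def box_prior_grid(center, size, steps=3):
    """Axis-aligned grid of reference points covering a box of the given size.

    `steps` is the number of points per axis and must be at least 2.
    """
    def axis_values(mid, extent):
        # Evenly spaced values from mid - extent / 2 to mid + extent / 2.
        return [mid - extent / 2 + extent * i / (steps - 1) for i in range(steps)]

    xs, ys, zs = (axis_values(c, s) for c, s in zip(center, size))
    return [(x, y, z) for x in xs for y in ys for z in zs]


# Three points on a target plus one background point with zero weight.
points = [(9.0, 0.0, 0.5), (11.0, 0.5, 0.5), (10.0, -0.5, 1.0), (20.0, 5.0, 0.0)]
offsets = [(1.0, 0.0, 0.25), (-1.0, -0.5, 0.25), (0.0, 0.5, -0.25), (0.0, 0.0, 0.0)]
weights = [1.0, 1.0, 1.0, 0.0]

center = vote_center(points, offsets, weights)
reference_points = box_prior_grid(center, size=(4.0, 2.0, 1.5))
```

Scaling the grid by the known box size is what lets the same procedure cover a pedestrian and a truck: the number of reference points stays fixed while their spacing follows the target.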
How to run MBPTrack in Supervisely
Follow the steps below to run the model and interact with it during annotation of point cloud episodes with 3D cuboids.
Step 1. Connect your GPU
In Supervisely it is easy to connect your own GPU to the platform and then use it to run any neural network for free. To connect your computer with a GPU, please watch these videos for macOS, Ubuntu, any Unix OS or Windows.
Step 2. Run the app to deploy MBPTrack model
MBPTrack 3D Point Cloud Tracking
Deploy MBPTrack as REST API service
Run the MBPTrack 3D Point Cloud Tracking app on your computer, select one of the pre-trained checkpoints and press the Run button. There are checkpoints for tracking many different objects in point clouds, such as cars, pedestrians, trucks, vans, trailers, buses and cyclists.
Go to the Neural Networks page and find the 3D Point Cloud tracking category.
Step 3. Use MBPTrack in Supervisely 3D Point Cloud labeling tool
Open the 3D Point Cloud labeling tool and create an input cuboid. Do not forget to set the correct cuboid direction.
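The direction matters because a cuboid and its 180-degree flip occupy exactly the same volume but face opposite ways, so a tracker would start from the wrong heading. A minimal sketch, assuming the heading is a yaw angle measured counter-clockwise from the x axis (an illustrative convention, not necessarily the one used by the labeling tool):

```python
import math


def heading_vector(yaw):
    """Unit vector in the ground plane that the cuboid's front faces."""
    return (math.cos(yaw), math.sin(yaw))


def flip_direction(yaw):
    """Turn the cuboid around by 180 degrees, keeping the angle in [-pi, pi)."""
    return (yaw + 2 * math.pi) % (2 * math.pi) - math.pi


forward = heading_vector(0.0)
backward = heading_vector(flip_direction(0.0))
```

Here `forward` and `backward` point in opposite directions even though both cuboids would enclose the same points.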
After that, select the input cuboid for tracking, define the number of frames to track and start tracking.
Configure and run tracking from the point cloud sequence timeline
Supervisely Ecosystem provides modern ways of labeling any type of data for computer vision, including 3D point clouds. In this tutorial we learned about performing 3D single object tracking using state-of-the-art MBPTrack neural network architecture in Supervisely.
MBPTrack will be an excellent choice for improving speed and quality of data labeling in three dimensional space. Sign up and try to label your point clouds for free in Community Edition.
Supervisely for Computer Vision
Supervisely is an online and on-premise platform that helps researchers and companies build computer vision solutions. We cover the entire development pipeline, from labeling images, videos and 3D data to model training.
The big difference from other products is that Supervisely is built like an OS with countless Supervisely Apps: interactive web tools that run in your browser yet are powered by Python. This makes it possible to integrate all those awesome open-source machine learning tools and neural networks, enhance them with a user interface and let everyone run them with a single click.