MAP-NBV: Multi-agent Prediction-guided Next-Best-View Planning for Active 3D Object Reconstruction

University of Maryland, College Park
*Equal contribution (listed alphabetically)

Abstract

Next-Best-View (NBV) planning is the long-standing problem of determining where a robot observing an object should obtain its next view from. Most existing methods choose the NBV based only on the observed part of the object. In this paper, we investigate how predicting the unobserved part improves the efficiency of object reconstruction. We present Multi-Agent Prediction-Guided NBV (MAP-NBV), a decentralized coordination algorithm for active 3D reconstruction with multi-agent systems. Prediction-based approaches have shown great improvement in active perception tasks by learning cues about structures in the environment from data, but these methods primarily focus on single-agent systems. We design a next-best-view approach that uses geometric measures over the predictions and jointly optimizes the information gain and the control effort for efficient collaborative 3D reconstruction of the object. Our method achieves a 19% improvement over the non-predictive multi-agent approach.
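As a rough illustration of the joint objective described above, the sketch below scores a candidate viewpoint by an information-gain term discounted by control effort. The range-based visibility model, the voxel-based duplicate check, and the trade-off weight `tau` are illustrative assumptions, not the exact formulation used by MAP-NBV.

```python
import numpy as np

def new_points_from_view(candidate_pose, predicted_cloud, observed_cloud,
                         sensor_range=5.0, voxel=0.1):
    """Rough count of predicted points that a view at candidate_pose would add.

    Visibility is approximated as 'within sensor_range of the pose'; a real
    pipeline would ray-cast with the camera model and handle occlusions.
    Points already present in the observed cloud (on a coarse voxel grid)
    are not counted again. Clouds are (N, 3) numpy arrays.
    """
    seen = {tuple(np.floor(p / voxel).astype(int)) for p in observed_cloud}
    in_range = predicted_cloud[
        np.linalg.norm(predicted_cloud - np.asarray(candidate_pose), axis=1) < sensor_range]
    return sum(tuple(np.floor(p / voxel).astype(int)) not in seen for p in in_range)

def viewpoint_utility(current_pose, candidate_pose, predicted_cloud, observed_cloud,
                      tau=0.5):
    """Joint score: predicted information gain discounted by travel effort.

    tau is an assumed trade-off weight and the exponential discount is one
    common choice; the exact MAP-NBV objective may differ.
    """
    gain = new_points_from_view(candidate_pose, predicted_cloud, observed_cloud)
    effort = np.linalg.norm(np.asarray(candidate_pose) - np.asarray(current_pose))
    return gain * np.exp(-tau * effort)
```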



UAVs in C-17 Simulation Environment

Our method predicts the point cloud based on partial observations and helps the robots find an efficient path to observe the object.




PoinTr-C Prediction

Shape-completion prediction (right) for an input point cloud (left) using PoinTr-C.





Flight Path

UAV flight paths during the C-17 simulation. The white line shows the flight path taken by UAV 1, and the green line shows the flight path taken by UAV 2.





Comparison with Multi-agent Baseline

Comparison of MAP-NBV against the multi-agent baseline NBV algorithm.

Class | Model | Points Seen (MAP-NBV) | Points Seen (Pred-NBV) | Relative Improvement
--- | --- | --- | --- | ---
Airplane | 747 | 16140 | 13305 | 19.26%
Airplane | A340 | 10210 | 8156 | 22.37%
Airplane | C-17 | 13278 | 10150 | 26.70%
Airplane | C-130 | 6573 | 5961 | 9.77%
Airplane | Fokker 100 | 14986 | 13158 | 12.99%
Rocket | Atlas | 2085 | 1747 | 17.64%
Rocket | Maverick | 3625 | 2693 | 29.50%
Rocket | Saturn V | 1041 | 877 | 17.10%
Rocket | Sparrow | 1893 | 1664 | 12.88%
Rocket | V2 | 1255 | 919 | 30.91%
Tower | Big Ben | 4294 | 3493 | 20.57%
Tower | Church | 7884 | 6890 | 13.46%
Tower | Clock | 3163 | 2382 | 28.17%
Tower | Pylon | 2986 | 2870 | 3.96%
Tower | Silo | 5810 | 4296 | 29.96%
Train | Diesel | 4013 | 3233 | 21.53%
Train | Mountain | 5067 | 4215 | 18.36%
Watercraft | Cruise | 5021 | 3685 | 30.69%
Watercraft | Patrol | 4078 | 3683 | 10.18%
Watercraft | Yacht | 11678 | 10341 | 12.14%


Fraction of Points Observed over Iterations

We look at how much information MAP-NBV (referred to as CO(d)-CP(d)-Greedy below) can observe compared to a centralized, prediction-guided oracle (CO(c)-CP(c)-Optimal) and the frontier-based baseline.



Effect of Coordination

We compare three types of prediction-guided algorithms:

  1. CO(c)-CP(c)-Optimal: A centralized approach where, at each iteration, all the robots combine their observations, perform prediction on the combined point cloud, and find the robot-viewpoint assignment that results in the maximum number of new points (over the predicted point cloud) being observed. This algorithm represents the most coordination-oriented approach.
  2. CO(d)-CP(d)-Greedy or MAP-NBV: Our proposed approach, which is decentralized in nature. The robots that form a communication subgraph combine their observations and then perform a sequential greedy assignment of robots to candidate viewpoints to maximize the information gain (the number of new points to be seen); see the sketch after this list.
  3. IO-IP: A purely non-cooperative approach where the robots do not communicate at all, perform predictions on their own observations, and select the viewpoint that results in the maximum number of new points being observed.
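A minimal sketch of the sequential greedy robot-viewpoint assignment referenced in item 2 is shown below. The interface (a per-robot list of candidate viewpoints and a `gain_fn` that reports the new predicted points a view would add, given what earlier robots already cover, e.g. built on a utility like the one sketched after the abstract) is an assumption for illustration, not the exact MAP-NBV implementation.

```python
def sequential_greedy_assignment(robots, candidate_views, gain_fn):
    """Assign one viewpoint per robot by sequential greedy selection.

    robots:          list of robot IDs in a communication subgraph
    candidate_views: dict mapping robot ID -> list of candidate viewpoints
    gain_fn:         gain_fn(view, covered) -> (num_new_points, points_covered),
                     where `covered` is the set of predicted points already
                     claimed by previously assigned robots.

    Each robot, in turn, picks the view that adds the most new points given
    what earlier robots in the sequence are already expected to observe.
    """
    assignment = {}
    covered = set()
    for robot in robots:
        best_view, best_gain, best_points = None, -1, set()
        for view in candidate_views[robot]:
            num_new, points = gain_fn(view, covered)
            if num_new > best_gain:
                best_view, best_gain, best_points = view, num_new, points
        assignment[robot] = best_view
        covered |= best_points  # later robots do not recount these points
    return assignment
```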


Directional Chamfer Distance (Ground Truth to Observed Point Cloud): plots for 2, 4, and 6 robots.



Hausdorff Distance (Ground Truth to Observed Point Cloud): plots for 2, 4, and 6 robots.



Directional Chamfer Distance (Ground Truth to Predicted Point Cloud): plots for 2, 4, and 6 robots.
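For reference, the directional Chamfer and Hausdorff distances reported above can be computed from two point clouds roughly as follows. This is a generic sketch using SciPy's KD-tree; some Chamfer-distance definitions use squared distances, and the exact convention used in our evaluation may differ.

```python
import numpy as np
from scipy.spatial import cKDTree

def directional_chamfer(source, target):
    """Mean distance from each point in `source` to its nearest neighbor in `target`.

    With source = ground truth and target = observed (or predicted) cloud, this
    corresponds to the 'ground truth to observed/predicted' direction above.
    """
    dists, _ = cKDTree(target).query(source)
    return dists.mean()

def directional_hausdorff(source, target):
    """Worst-case nearest-neighbor distance from `source` to `target`."""
    dists, _ = cKDTree(target).query(source)
    return dists.max()

# Example usage with random stand-in clouds (N x 3 arrays).
if __name__ == "__main__":
    gt = np.random.rand(2048, 3)
    observed = np.random.rand(1024, 3)
    print(directional_chamfer(gt, observed), directional_hausdorff(gt, observed))
```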





Variants of MAP-NBV

We present three variants of MAP-NBV:

  1. CO(c)-CP(c)-Greedy: A centralized version of MAP-NBV where all the robots always collaborate and coordinate. They combine their observations before making predictions and perform a sequential greedy robot-viewpoint assignment at each iteration. This algorithm is not as good as CO(c)-CP(c)-Optimal, since it uses a greedy assignment, but it is faster.
  2. CO(d)-CP(d)-Greedy or MAP-NBV: Our proposed approach, which is decentralized in nature. The robots that form a communication subgraph combine their observations and then perform a sequential greedy assignment of robots to candidate viewpoints to maximize the information gain (the number of new points to be seen).
  3. CO(1)-CP(d)-Greedy: A bandwidth-friendly version of MAP-NBV where the robots share their observations with 1-hop neighbors only (MAP-NBV allows sharing over multiple hops). For the robot-viewpoint assignment, robots in a subgraph take precedence according to their IDs: the lowest-ID robot moves first, greedily selects a viewpoint, and shares the selected location with the others. The second-lowest-ID robot moves next; it removes the points that would be seen by the robots moving before it and then makes its viewpoint selection. This process is repeated for the rest of the robots in the subgraph; a sketch of this selection step follows the list. This is a work in progress and will be studied in more detail in our future work.
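A minimal sketch of the ID-ordered selection step in the CO(1)-CP(d)-Greedy variant (item 3) is given below. The `points_visible_fn` helper, which returns the set of predicted points expected to be visible from a view, is a hypothetical stand-in for the actual visibility computation.

```python
def id_precedence_selection(my_id, my_candidates, predicted_cloud,
                            neighbor_selections, points_visible_fn):
    """Viewpoint selection for the CO(1)-CP(d)-Greedy variant, for one robot.

    my_id:                this robot's ID (lower IDs choose first)
    my_candidates:        candidate viewpoints for this robot
    predicted_cloud:      set of predicted points (hashable, e.g. voxel keys)
    neighbor_selections:  {robot_id: chosen_view} received from 1-hop neighbors
    points_visible_fn:    points_visible_fn(view) -> set of predicted points
                          expected to be visible from `view` (assumed helper)

    The robot discounts every point that a lower-ID neighbor's chosen view is
    expected to cover, then greedily picks the view with the largest residual gain.
    """
    claimed = set()
    for robot_id, view in neighbor_selections.items():
        if robot_id < my_id and view is not None:
            claimed |= points_visible_fn(view)

    remaining = predicted_cloud - claimed
    return max(my_candidates,
               key=lambda view: len(points_visible_fn(view) & remaining))
```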


Directional Chamfer Distance (Ground Truth to Observed Point Cloud): plots for 2, 4, and 6 robots.



Hausdorff Distance (Ground Truth to Observed Point Cloud): plots for 2, 4, and 6 robots.



Directional Chamfer Distance (Ground Truth to Predicted Point Cloud): plots for 2, 4, and 6 robots.




Qualitative Real-World Experiment

We also conducted a real-world experiment to evaluate the feasibility of implementing MAP-NBV on hardware. A single iteration of the MAP-NBV pipeline for two robots was run using a ZED camera. Because of the ZED camera's lower quality, an iPhone 12 Pro Max with a built-in LiDAR was used to capture the data for a high-resolution reconstruction.

RGB Image of the Car

Observations, Predictions, and MAP-NBV poses

Initial, Drone 1, and Drone 2 points after MAP-NBV iteration

Reconstruction



Related Works

Pred-NBV: Prediction-guided Next-Best-View for 3D Object Reconstruction. Harnaik Dhami, Vishnu D. Sharma, Pratap Tokekar. IROS 2023.

PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers. Xumin Yu*, Yongming Rao*, Ziyi Wang, Zuyan Liu, Jiwen Lu, Jie Zhou. ICCV 2021.

Global registration of mid-range 3D observations and short range next best views. Jacopo Aleotti, Dario Lodi Rizzini, Riccardo Monica, Stefano Caselli. IROS 2014.