[CVPR 2024] GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

CVPR 2024
Xiao Chen, Quanyi Li, Tai Wang, Tianfan Xue, Jiangmiao Pang
Shanghai AI Laboratory | The Chinese University of Hong Kong

arXiv

📋 Contents

  1. About
  2. Getting Started
  3. Model and Benchmark
  4. Citation
  5. License

🏠 About

While recent advances in neural radiance fields enable realistic digitization of large-scale scenes, the image-capturing process is still time-consuming and labor-intensive. Previous works attempt to automate this process using a Next-Best-View (NBV) policy for active 3D reconstruction. However, existing NBV policies heavily rely on hand-crafted criteria, limited action spaces, or per-scene optimized representations. These constraints limit their cross-dataset generalizability. To overcome them, we propose GenNBV, an end-to-end generalizable NBV policy. Our policy adopts a reinforcement learning (RL)-based framework and extends the typical limited action space to 5D free space. It empowers our agent drone to scan from any viewpoint, and even to interact with geometries unseen during training. To boost cross-dataset generalizability, we also propose a novel multi-source state embedding, including geometric, semantic, and action representations. We establish a benchmark using the Isaac Gym simulator with the Houses3K and OmniObject3D datasets to evaluate this NBV policy. Experiments demonstrate that our policy achieves coverage ratios of 98.26% and 97.12% on unseen building-scale objects from these two datasets, respectively, outperforming prior solutions.

📚 Getting Started

Installation

We tested our code in the following environment:

  • Ubuntu 20.04
  • NVIDIA Driver: 545.29.02
  • CUDA 11.3
  • Python 3.8.12
  • PyTorch 1.11.0+cu113
  • PyTorch3D 0.7.5
  1. Clone this repository.
git clone https://github.com/zjwzcx/GenNBV
cd GenNBV
  2. Create an environment and install PyTorch.
conda create -n gennbv python=3.8 -y  # pytorch3d needs python>3.7
conda activate gennbv
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
  3. Install NVIDIA Isaac Gym (download from https://developer.nvidia.com/isaac-gym/download).
cd isaacgym/python
pip install -e .
  4. Install GenNBV.
pip install -r requirements.txt
pip install -e .
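After installation, a quick sanity check can confirm that the installed packages match the tested versions listed above. The helper below is a stdlib-only sketch (not part of the repository); it compares version strings such as those reported by `torch.__version__` and `pytorch3d.__version__`:

```python
def parse_version(version: str) -> tuple:
    """Turn a version string like '1.11.0+cu113' into (1, 11, 0)."""
    core = version.split("+")[0]  # drop the local build tag, e.g. '+cu113'
    return tuple(int(part) for part in core.split("."))

# Versions this repository was tested against (from the list above).
TESTED = {"torch": "1.11.0+cu113", "pytorch3d": "0.7.5"}

def matches_tested(name: str, installed: str) -> bool:
    """True when the installed (major, minor) matches the tested version."""
    return parse_version(installed)[:2] == parse_version(TESTED[name])[:2]

print(matches_tested("torch", "1.11.0+cu113"))  # True
```

In practice you would pass `torch.__version__` as the `installed` argument; an exact match is not required, but staying within the same minor series avoids ABI surprises with Isaac Gym.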

Data Preparation

We provide all the preprocessed data used in our work, including mesh files and ground-truth surface points. We recommend downloading it from the Google Drive link [HERE].

The directory structure should be as follows:

gennbv
├── active_reconstruction
├── data_gennbv
│   ├── houses3k
│   │   ├── gt
│   │   ├── obj
│   │   ├── urdf
│   ├── omniobject3d
│   ├── ...

Training

Run the following command to reproduce the training setup of GenNBV:

python active_reconstruction/train/train_gennbv_houses3k.py --sim_device=cuda:0 --num_envs=256 --stop_wandb=True

We recommend Weights & Biases (wandb) for analyzing the training logs. To use wandb with our codebase, paste your wandb API key into wandb_utils/wandb_api_key_file.txt, then launch training with:

python active_reconstruction/train/train_gennbv_houses3k.py --sim_device=cuda:0 --num_envs=256 --stop_wandb=False
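The API-key file mentioned above can be created in a few lines of stdlib Python; `"YOUR_WANDB_API_KEY"` is a placeholder to replace with your real key from the wandb account settings page:

```python
from pathlib import Path

# Write the wandb API key to the file the codebase reads.
key_file = Path("wandb_utils/wandb_api_key_file.txt")
key_file.parent.mkdir(parents=True, exist_ok=True)
key_file.write_text("YOUR_WANDB_API_KEY\n")  # placeholder: paste your real key
```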

Customized Training Environments

To customize a new training environment, create your environment and configuration files under active_reconstruction/env, then register the task in active_reconstruction/__init__.py.
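Since GenNBV builds on a Legged Gym-style codebase, task registration likely follows a name-to-(environment, config) registry pattern. The sketch below is illustrative only; the actual API in active_reconstruction/__init__.py may differ, and every class and task name here is hypothetical:

```python
# Minimal registry mapping a task name to an environment class and its config.
class TaskRegistry:
    def __init__(self):
        self._tasks = {}

    def register(self, name, env_cls, cfg):
        self._tasks[name] = (env_cls, cfg)

    def make(self, name):
        env_cls, cfg = self._tasks[name]
        return env_cls(cfg)

class MyReconEnv:
    """Stand-in for a new environment file under active_reconstruction/env."""
    def __init__(self, cfg):
        self.cfg = cfg

registry = TaskRegistry()
registry.register("my_recon_task", MyReconEnv, {"num_envs": 256})
env = registry.make("my_recon_task")
print(env.cfg["num_envs"])  # 256
```

The config dictionary here stands in for the configuration file the repository expects; keeping the environment class and its config registered under one task name lets the training script look both up by the task's string identifier.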

📝 TODO List

  • Release the paper and training code.
  • Release preprocessed dataset.
  • Release the evaluation scripts.

📦 Model and Benchmark

Model Overview

Benchmark Overview

🔗 Citation

If you find our work helpful, please cite it:

@inproceedings{chen2024gennbv,
  title={GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction},
  author={Chen, Xiao and Li, Quanyi and Wang, Tai and Xue, Tianfan and Pang, Jiangmiao},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024},
}

If you use the preprocessed datasets (Houses3K and OmniObject3D), please cite them:

@inproceedings{peralta2020next,
  title={Next-best view policy for 3d reconstruction},
  author={Peralta, Daryl and Casimiro, Joel and Nilles, Aldrin Michael and Aguilar, Justine Aletta and Atienza, Rowel and Cajote, Rhandley},
  booktitle={Computer Vision--ECCV 2020 Workshops: Glasgow, UK, August 23--28, 2020, Proceedings, Part IV 16},
  pages={558--573},
  year={2020},
  organization={Springer}
}
@inproceedings{wu2023omniobject3d,
  title={Omniobject3d: Large-vocabulary 3d object dataset for realistic perception, reconstruction and generation},
  author={Wu, Tong and Zhang, Jiarui and Fu, Xiao and Wang, Yuxin and Ren, Jiawei and Pan, Liang and Wu, Wayne and Yang, Lei and Wang, Jiaqi and Qian, Chen and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={803--814},
  year={2023}
}

We are grateful to the authors of Legged Gym (https://github.com/leggedrobotics/legged_gym) for their codebase.

📄 License

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
