ICCV 2025
Abstract
Neural reconstruction has advanced significantly in the past year, and dynamic models are becoming increasingly common. However, these models are limited to handling in-domain objects that closely follow their original trajectories. This demonstration presents a hybrid approach that integrates the advantages of neural reconstruction with physics-based rendering. First, we remove dynamic objects from the scene and reconstruct the static environment using a neural reconstruction model. Then, we populate the reconstructed environment with dynamic objects in aiSim. This approach effectively mitigates the drawbacks of both methods—such as domain gaps in traditional simulation and out-of-domain object rendering in neural reconstruction. In addition, our NeRF2GS method enables high-quality novel view synthesis even after extreme viewpoint changes.
Method
We train our 3D Gaussian Splatting (3DGS) and NeRF-based models using synchronized data collected from vehicles equipped with RGB cameras, GNSS devices, and LiDAR sensors. The reconstructed environment allows for the virtual placement of dynamic agents at arbitrary locations, adjustments to environmental conditions, and rendering from novel camera viewpoints. We have significantly improved novel view synthesis quality—particularly for road surfaces and lane markings—by incorporating surface normal regularization, which enhances its applicability for autonomous driving tasks. Additionally, our method supports multiple sensor modalities (LiDAR, radar target lists), different camera models (e.g., fisheye), and accounts for camera exposure mismatches. It can also predict segmentation masks, surface normals, and depth maps. Our simulator has been validated through downstream tasks, HiL experiments, and closed-loop ADAS/AD tests.
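The surface normal regularization mentioned above can be illustrated with a minimal sketch: penalize disagreement between rendered normals and normals derived from the rendered depth map. This is a generic consistency term under assumed pinhole intrinsics, not the authors' actual training code; all function names here are hypothetical.

```python
import numpy as np

def normals_from_depth(depth, fx=1.0, fy=1.0):
    """Estimate camera-space surface normals from a depth map via
    finite differences (assumes a pinhole camera with focals fx, fy)."""
    dz_dx = np.gradient(depth, axis=1) * fx
    dz_dy = np.gradient(depth, axis=0) * fy
    # The surface normal is orthogonal to both tangents: (-dz/dx, -dz/dy, 1)
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def normal_consistency_loss(rendered_normals, depth):
    """Mean (1 - cosine similarity) between rendered normals and
    normals recovered from the rendered depth map."""
    target = normals_from_depth(depth)
    cos = np.sum(rendered_normals * target, axis=-1)
    return float(np.mean(1.0 - cos))
```

On a planar region (e.g. a road surface) the depth-derived normals are constant, so this term pushes the rendered normals toward that plane, which is consistent with the improvement reported for road surfaces and lane markings.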
Qualitative results:
Our method works in different operational design domains. Urban environment (San Francisco, CA).
Parked cars and the static environment are provided by the neural reconstruction model, while dynamic objects are added by aiSim (Sunnyvale, CA).
Our method can be used to resimulate real-world recordings.
Our hybrid rendering approach can also be applied to public datasets such as the Waymo Open Dataset.
Our AD software driving in the neurally rendered environment.
Novel view synthesis using equirectangular camera model with 3DGS.
Rotating LiDAR sensor simulation within aiSim supported by our hybrid rendering method. Colors indicate LiDAR intensity.
Neural radar target reconstruction where the static environment is reconstructed using NeRF, and another neural network predicts the radar target list (colors indicate distance from the ego vehicle).
Novel view synthesis with 3DGS on a proving ground (ZalaZone, Hungary).
Novel view synthesis and dynamic object removal with NeRF on a 2 km-long highway section (M0, Hungary).
NeRF novel view synthesis from Waymo Open Dataset with camera model change in extreme conditions (top row: RGB/normal, bottom row: depth/segmentation).
NeRF novel view synthesis from Waymo Open Dataset with camera model change in urban environments (top row: RGB/normal, bottom row: depth/segmentation).
A third-party model’s detections on images rendered using our hybrid method.
Comparison of learned segmentation with the Mask2Former model after an extreme viewpoint change (3 meters away from the training trajectory). Green regions correspond to matches between the two models, while blue and orange regions are classified only by the learned segmentation or only by Mask2Former, respectively.
Diffusion model-enhanced novel view synthesis from extremely novel viewpoints (up to 9 meters away from the trajectory of the recording vehicle).