Written by Tamás Matuszka / Posted at 7/18/24
Applying neural reconstruction in simulation for automated driving
Validating automated driving software requires millions of test kilometers. This implies long system development cycles with continuously increasing complexity, and real-world testing is also resource intensive and can raise safety concerns. A virtual validation suite like aiSim can alleviate these burdens of real-world testing.
Automated driving (AD) and Advanced Driver Assistance Systems (ADAS) rely on closed-loop validation to ensure safety and performance. However, closed-loop evaluation requires a 3D environment that accurately represents real-world scenarios. While such 3D environments can be built manually by 3D artists, manual creation has limitations in scalability and in addressing the Sim2Real domain gap.
Neural Rendering – Bridging the Gap
Neural rendering can mitigate this issue: by leveraging deep learning techniques, it can realistically render static (and dynamic) environments from novel viewpoints. Let's explore the pros and cons of this approach:
PROS:
Photorealistic Quality: Neural rendering produces almost photorealistic scenes, enhancing realism.
Data-Driven and Scalable: The approach is data-driven and scalable, and variants such as 3D Gaussian Splatting are suitable for real-time applications (see the sketch after this list).
CONS:
Out-of-Distribution Objects: Neural rendering struggles with inserting out-of-distribution (i.e., previously unseen) objects into the 3D environment.
Artifact Impact on Dynamic Objects: Artifacts may affect the appearance of dynamic objects.
Geometric Inconsistencies: Geometric inconsistencies might arise, mostly in depth prediction.
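To make novel-view synthesis more tangible, below is a minimal, illustrative sketch in the spirit of 3D Gaussian Splatting (not aiSim's actual renderer): reconstructed 3D Gaussians are projected into a new camera and alpha-composited front to back. The function and parameter names are assumptions for illustration; real implementations project anisotropic covariances and rasterize on the GPU.

```python
import numpy as np

def render_gaussians(means_w, colors, opacities, sigma_px, K, R, t, hw=(120, 160)):
    """Toy novel-view renderer: project isotropic 3D Gaussians into a pinhole
    camera and alpha-composite them front to back (illustrative only)."""
    H, W = hw
    # World -> camera frame, then drop splats behind the camera.
    means_c = means_w @ R.T + t
    keep = means_c[:, 2] > 0.1
    means_c, colors = means_c[keep], colors[keep]
    opacities, sigma_px = opacities[keep], sigma_px[keep]

    # Perspective projection to pixel coordinates.
    uv = means_c @ K.T
    uv = uv[:, :2] / uv[:, 2:3]

    # Front-to-back ordering so nearer splats occlude farther ones.
    order = np.argsort(means_c[:, 2])

    img = np.zeros((H, W, 3))
    transmittance = np.ones((H, W))
    ys, xs = np.mgrid[0:H, 0:W]
    for i in order:
        # Isotropic 2D Gaussian footprint in pixel space (a simplification).
        d2 = (xs - uv[i, 0]) ** 2 + (ys - uv[i, 1]) ** 2
        alpha = opacities[i] * np.exp(-0.5 * d2 / sigma_px[i] ** 2)
        img += (transmittance * alpha)[..., None] * colors[i]
        transmittance *= 1.0 - alpha
    return img
```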
Challenges of Existing Generative Models
Current generative models are capable of creating highly realistic images and videos, but they fall short in several aspects, such as:
2D-Only Information: These models do not provide 3D information; they operate only in 2D image space.
Projective Geometry Ignorance: They have no built-in notion of projective geometry.
Limited Sensor Modalities: These models cannot be used to generate other sensor modalities (e.g., LiDAR).
In summary, current generative models are NOT suitable for automotive-grade validation.
Our Hybrid Solution: Integrating Neural Reconstruction
To address these limitations, we at aiMotive developed a hybrid approach. The integration of cutting-edge neural reconstruction techniques within a well-established physically based rendering pipeline allows us to virtually insert dynamic objects at arbitrary locations, adjust environmental conditions, and render previously unseen camera viewpoints.
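Conceptually, the core of such a hybrid pipeline is a per-pixel, depth-aware composite: the physically based rasterizer renders the inserted dynamic object together with its depth and alpha, while the neural reconstruction supplies the background color and depth, so occlusions are resolved pixel by pixel. The sketch below illustrates this idea under simplified assumptions (no relighting, shadows, or ambient occlusion); it is not aiSim's implementation, and all names are illustrative.

```python
import numpy as np

def hybrid_composite(bg_rgb, bg_depth, obj_rgb, obj_depth, obj_alpha):
    """Composite a PBR-rendered dynamic object over a neurally reconstructed
    background, resolving occlusion per pixel via depth (illustrative)."""
    # The object wins where it is visible (alpha > 0) and closer than the background.
    occludes = (obj_alpha > 0.0) & (obj_depth < bg_depth)
    blend = np.where(occludes, obj_alpha, 0.0)[..., None]
    rgb = blend * obj_rgb + (1.0 - blend) * bg_rgb
    depth = np.where(occludes, obj_depth, bg_depth)
    return rgb, depth
```

In practice, the hard part is not the composite itself but making the inserted object consistent with the reconstructed scene: lighting, shadows, and ambient occlusion, which the sketch above deliberately omits.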
This hybrid approach gives us the following features:
Virtual Dynamic Content Insertion:
– Add dynamic objects with realistic lighting and ambient occlusion.
– Simulate environmental effects like rain, snow, and fog for more diverse simulated scenarios.
Multi-Modality Renderings:
– Generate accurate RGB images, depth maps, and LiDAR intensity maps from arbitrary camera viewpoints (as seen below, with ground truth in the first row).
– Future work includes semantic segmentation masks and radar simulation.
Camera Virtualization:
– Simulate various virtual camera setups, including different camera alignments and models.
– The figure below shows simulated front fisheye (left), front wide angle (middle), and front long-range (right) camera renders from a model trained without direct front cameras.
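Camera virtualization amounts to re-projecting the same reconstructed scene through different camera models. The minimal sketch below contrasts a pinhole projection with an equidistant fisheye projection for camera-frame points; the function names and simplified intrinsics (no distortion polynomials) are assumptions for illustration, not aiSim's camera API.

```python
import numpy as np

def project_pinhole(p_cam, fx, fy, cx, cy):
    """Pinhole projection of camera-frame points (N, 3) to pixel coordinates."""
    u = fx * p_cam[:, 0] / p_cam[:, 2] + cx
    v = fy * p_cam[:, 1] / p_cam[:, 2] + cy
    return np.stack([u, v], axis=-1)

def project_fisheye_equidistant(p_cam, fx, fy, cx, cy):
    """Equidistant fisheye model: pixel radius grows with the angle between
    the ray and the optical axis (r = f * theta), covering very wide FoVs."""
    r_xy = np.linalg.norm(p_cam[:, :2], axis=-1)
    theta = np.arctan2(r_xy, p_cam[:, 2])          # angle from the optical axis
    scale = np.where(r_xy > 1e-9, theta / r_xy, 0.0)
    u = fx * p_cam[:, 0] * scale + cx
    v = fy * p_cam[:, 1] * scale + cy
    return np.stack([u, v], axis=-1)
```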
Subscribe for more
This is the first part of a blog series; in the following parts we'll go into more detail about the benefits and challenges of our solution. If you liked this blog and enjoy reading technical texts on AI, neural rendering, and simulation, subscribe to our LinkedIn newsletter. Until the next part, you can check out our peer-reviewed works on this topic:
CVPR 2024 Demo
SIGGRAPH 2024 Poster
For more, see our latest video: