Aditya Nisal
Email : anisal@wpi.edu
Description
Aim:
To render 3D scenes from 2D images using a fully connected deep neural network.
Methodology:
NeRF (Neural Radiance Fields):
INITIALIZATION:
- Define a fully connected neural network `f` with weights `W`.
NeRF uses a fully connected network because it represents the scene as a continuous volumetric function, unlike image-based methods that typically use convolutional networks.
`f` takes in a 5D coordinate (x, y, z, θ, φ), where (x, y, z) is the 3D position and (θ, φ) is the viewing direction.
Including the view direction lets the network model view-dependent effects such as specular reflections.
`f` outputs a color (RGB) and a volume density (σ) for the given 5D coordinate.
The color describes the light emitted or reflected at that point, while the density describes how much light is absorbed or blocked there.
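The network `f` described above can be sketched as a small NumPy MLP. This is a minimal illustration, not the actual NeRF architecture: the layer sizes are hypothetical, and the real model is much deeper and applies a positional encoding to the input before the first layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    # One (W, b) pair per layer; small random weights (illustrative init).
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def f(params, coord5d):
    # coord5d: (x, y, z, theta, phi)
    h = np.asarray(coord5d, dtype=float)
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)    # ReLU hidden layers
    W, b = params[-1]
    out = h @ W + b                       # 4 outputs: RGB + density
    rgb = 1.0 / (1.0 + np.exp(-out[:3]))  # sigmoid keeps color in [0, 1]
    sigma = np.maximum(out[3], 0.0)       # ReLU keeps density non-negative
    return rgb, sigma

params = init_mlp([5, 64, 64, 4])         # hypothetical layer sizes
rgb, sigma = f(params, (0.1, -0.2, 0.5, 0.3, 1.0))
```

The activation choices mirror the outputs' meaning: colors are bounded, so they pass through a sigmoid, while density only needs to be non-negative.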
FOR each training image `I` in the dataset:
Determine the camera position `C` and image resolution.
Each image in the dataset is associated with metadata that gives the camera's position and orientation.
FOR each pixel `p` in image `I`:
Compute the ray `R` originating from camera position `C` and passing through pixel `p`.
To simulate how light travels, we trace rays from the camera through each pixel into the scene.
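Computing the ray `R` for a pixel can be sketched as follows, assuming a simple pinhole camera with focal length `focal`, a camera-to-world rotation `R_c2w`, and camera position `C` (all names here are illustrative, not from the original):

```python
import numpy as np

def pixel_ray(i, j, H, W, focal, R_c2w, C):
    # Direction through pixel (i, j) in camera coordinates; by convention
    # here the camera looks down the -z axis.
    d_cam = np.array([(j - W / 2) / focal,
                      -(i - H / 2) / focal,
                      -1.0])
    d_world = R_c2w @ d_cam
    d_world /= np.linalg.norm(d_world)   # unit-length ray direction
    return C, d_world                    # origin and direction of ray R

# The center pixel of a 100x100 image maps straight down the optical axis.
origin, direction = pixel_ray(50, 50, H=100, W=100, focal=100.0,
                              R_c2w=np.eye(3), C=np.zeros(3))
```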
HIERARCHICAL SAMPLING along ray `R`:
Coarse Sample: Uniformly sample a few points `{P_coarse}` along ray `R`.
This initial sampling provides a rough estimate of where relevant scene information (like surfaces) might be located.
Query network `f` for colors and densities at `{P_coarse}`.
Fine Sample: Using the densities from the coarse pass, sample additional points `{P_fine}` concentrated where the density (and thus the contribution to the final color) is high.
Once we have a rough estimate from `{P_coarse}`, we can concentrate further samples in the regions that matter, which makes the process efficient.
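The coarse-then-fine scheme can be sketched like this. The fine pass uses inverse-CDF sampling over the coarse weights; as a simplification, fine samples here reuse coarse sample locations rather than interpolating within bins (the names and weights are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def coarse_samples(near, far, n):
    # Stratified sampling: one uniform sample inside each of n even bins.
    edges = np.linspace(near, far, n + 1)
    return edges[:-1] + rng.random(n) * (edges[1:] - edges[:-1])

def fine_samples(t_coarse, weights, n):
    # Inverse-CDF sampling: draw more points where the coarse pass found
    # large density-derived contributions.
    w = weights + 1e-5                      # avoid a zero-mass CDF
    cdf = np.cumsum(w) / np.sum(w)
    u = rng.random(n)
    idx = np.searchsorted(cdf, u)
    return t_coarse[np.clip(idx, 0, len(t_coarse) - 1)]

t_c = coarse_samples(2.0, 6.0, 8)
w = np.array([0, 0, 0.1, 0.9, 0.9, 0.1, 0, 0])  # pretend coarse weights
t_f = fine_samples(t_c, w, 16)                  # clusters near depth ~4
```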
VOLUME RENDERING along ray `R`:
Initialize accumulated color `ACC_COLOR` as [0, 0, 0] (black) and accumulated transparency `ACC_TRANSPARENCY` as 1 (no light blocked yet).
We're simulating how light accumulates color as it travels through the volume.
FOR each point `P` in `{P_coarse} U {P_fine}` (from near to far along the ray):
Query network `f` for color `COLOR_P` and density `DENSITY_P` at point `P`.
Compute `ALPHA_P` = 1 - exp(-DENSITY_P * distance_to_next_point).
This is the opacity of the ray segment at `P`: how much of the remaining light it absorbs or scatters.
`ACC_COLOR += COLOR_P * ALPHA_P * ACC_TRANSPARENCY`
This accumulates the color contribution of point `P`, weighted by its opacity and by how much light has survived to reach it.
`ACC_TRANSPARENCY *= (1 - ALPHA_P)`
Update the transmittance: light passing `P` is attenuated by the fraction the segment does not absorb.
Pixel `p`'s color is set to `ACC_COLOR`.
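The rendering loop above follows the standard NeRF compositing rule, `alpha_i = 1 - exp(-sigma_i * delta_i)` with transmittance `T_i = prod_{j<i} (1 - alpha_j)`. A minimal sketch for a single ray:

```python
import numpy as np

def render_ray(colors, densities, distances):
    # colors: per-sample RGB; densities: sigma per sample;
    # distances: spacing delta to the next sample along the ray.
    acc_color = np.zeros(3)
    acc_transparency = 1.0                     # light starts unattenuated
    for color, sigma, delta in zip(colors, densities, distances):
        alpha = 1.0 - np.exp(-sigma * delta)   # opacity of this segment
        acc_color += acc_transparency * alpha * color
        acc_transparency *= (1.0 - alpha)      # remaining transmittance
    return acc_color

# Sanity check: a single fully opaque red sample should render pure red.
c = render_ray(colors=[np.array([1.0, 0.0, 0.0])],
               densities=[1e3], distances=[0.1])
```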
COMPUTE LOSS:
`LOSS` = Mean Squared Error between the rendered color of pixel `p` and its true color in image `I`.
This loss ensures the rendered image closely matches the actual photo, driving the network to learn correct scene representations.
BACKPROPAGATE through the neural network using `LOSS`.
Update weights `W` of the network using an optimizer, e.g., Adam.
We adjust the network weights to reduce the error in subsequent iterations.
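The photometric loss over a batch of rendered pixels is a plain mean squared error; in practice its gradient with respect to the weights `W` is computed by the framework's autodiff and applied with Adam. A small sketch with made-up values:

```python
import numpy as np

def mse_loss(rendered_rgb, true_rgb):
    # Mean squared error over all pixels and color channels.
    return np.mean((rendered_rgb - true_rgb) ** 2)

# Hypothetical rendered vs. ground-truth colors for two pixels.
rendered = np.array([[0.9, 0.1, 0.1], [0.2, 0.8, 0.1]])
target   = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
loss = mse_loss(rendered, target)
```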
AFTER sufficient training:
The network `f` can now be used to render novel views.
Given a new viewpoint, repeat the ray-casting, hierarchical-sampling, and volume-rendering steps above to synthesize the image.
The power of NeRF lies in its ability to generate photorealistic images from novel viewpoints not seen during training.
Figure 1. NeRF Architecture
Figure 2. Original Lego Image
Figure 3. 3D Reconstructed Model