
Project:

Structure from Motion (SfM)


Skills Involved:

Python, OpenCV


Solo Project


Code

Description

Aim:

To recover the camera poses and reconstruct a sparse 3D point cloud from a set of 2D images using classical Computer Vision techniques


Key concepts:

  • Camera Extrinsic Matrix: Transforms points from the world coordinate system to the camera coordinate system.

  • Camera Intrinsic Matrix: Transforms points from the camera coordinate system to the pixel coordinate system.

  • Homography Matrix: Relates the 2D image points of a planar surface to their corresponding 3D world points. Each image has its own homography, determined by that view's rotation and translation.
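
As a quick illustration of how the intrinsic and extrinsic matrices compose into a projection, here is a minimal NumPy sketch (the values of K, R, and t below are made up for illustration):

```python
import numpy as np

# Assumed intrinsics: focal length 800 px, principal point (320, 240)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Extrinsics (world -> camera): identity rotation, camera 5 units back
R = np.eye(3)
t = np.array([[0.0], [0.0], [5.0]])

# Full projection matrix: P = K [R | t]
P = K @ np.hstack([R, t])

# Project a homogeneous 3D world point into pixel coordinates
X = np.array([1.0, 0.5, 2.0, 1.0])   # [X, Y, Z, 1]
x = P @ X
print(x[:2] / x[2])                  # -> [u, v] in pixels
```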


Methodology:

 

  1. Load the input images.

  2. Define the camera calibration matrix K (intrinsic parameters).

  3. Fix the first camera's pose as the world reference (identity rotation, zero translation).

  4. For each pair of images:

    1. Load the correspondence points between the two images from the given text file.

    2. Apply the RANSAC algorithm to obtain the best set of inliers and the Fundamental Matrix (F). This step removes outlier correspondences (see the first sketch after this list).

    3. Plot and show the correspondences between the two images.

    4. Compute the Essential Matrix (E) from the Fundamental Matrix. The Essential Matrix encodes the epipolar geometry between two calibrated cameras.

    5. Extract the four candidate camera poses (rotation and translation) from the Essential Matrix.

    6. Choose the correct camera pose via the cheirality check: the triangulated points must lie in front of both cameras, which ensures the relative orientation of the cameras matches the actual physical setup (see the pose-selection sketch after this list).

    7. Construct the projection matrix for the second camera and linearly triangulate the inlier correspondences into 3D points.

    8. Calculate the reprojection errors for the points using the current projection matrix. This measures how accurately the 3D points reproject onto the 2D images.

    9. Perform non-linear triangulation to refine the 3D point coordinates, optimizing them to best match the observed 2D points (see the refinement sketch after this list).

    10. Compare and plot the difference between the 3D points obtained with linear and non-linear triangulation.

    11. Store the camera poses and 3D points.

    12. Use the Perspective-n-Point (PnP) method for the subsequent images to find each new camera pose without recomputing the entire structure (see the PnP sketch after this list).

    13. For each new image:

      1. Get the 2D-3D point correspondences.

      2. Apply PnP RANSAC to get an initial estimate of the camera pose.

      3. Refine the pose using non-linear PnP.

      4. Triangulate 3D coordinates for the remaining 2D points in the image.

      5. Refine the 3D coordinates using non-linear triangulation.

      6. Store the refined camera pose and 3D points.

      7. Calculate and store the reprojection errors.

      8. Plot the camera poses and their 3D points.

  5. Perform Bundle Adjustment to further refine the camera poses and 3D points, optimizing all of them simultaneously to reduce the overall reprojection error (see the final sketch after this list).

  6. Visualize the final camera poses and 3D points using helper functions.
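
The sketches below illustrate the main steps of this pipeline. First, steps 4.2 and 4.4: a minimal sketch using OpenCV's built-in RANSAC estimator (the project applies its own RANSAC loop) with synthetic correspondences standing in for the matches files; K, the pose, and the 3D points are made-up illustrative values.

```python
import cv2
import numpy as np

# Synthetic two-view setup (illustration only; the real correspondences
# come from the provided matches text files)
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
X = np.random.uniform([-1, -1, 4], [1, 1, 8], (100, 3))  # points ahead of the cameras
R2, _ = cv2.Rodrigues(np.array([0.0, 0.1, 0.0]))         # camera 2: slight rotation
t2 = np.array([[-1.0], [0.0], [0.0]])                    # ... and a sideways baseline

def project(K, R, t, X):
    """Project Nx3 world points through K [R | t] to Nx2 pixel coordinates."""
    x = (K @ (R @ X.T + t)).T
    return (x[:, :2] / x[:, 2:]).astype(np.float32)

pts1 = project(K, np.eye(3), np.zeros((3, 1)), X)        # camera 1 = world pose
pts2 = project(K, R2, t2, X)

# Step 4.2: Fundamental Matrix via RANSAC; `mask` flags the inlier matches
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
in1, in2 = pts1[mask.ravel() == 1], pts2[mask.ravel() == 1]

# Step 4.4: Essential Matrix from F (same K for both cameras): E = K^T F K
E = K.T @ F @ K
```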
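
Pose selection (steps 4.5-4.6): the SVD of E yields four candidate poses, and the cheirality check (Figure 7) keeps the one that places the most triangulated points in front of both cameras. A minimal sketch, assuming pts1/pts2 are the inlier correspondences and K the intrinsics from the previous snippet:

```python
import cv2
import numpy as np

def decompose_essential(E):
    """Return the four candidate (R, t) poses encoded in an essential matrix."""
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1.0]])
    Rs = [U @ W @ Vt, U @ W @ Vt, U @ W.T @ Vt, U @ W.T @ Vt]
    ts = [U[:, 2:], -U[:, 2:], U[:, 2:], -U[:, 2:]]
    poses = []
    for R, t in zip(Rs, ts):
        if np.linalg.det(R) < 0:       # enforce a proper rotation (det = +1)
            R, t = -R, -t
        poses.append((R, t))
    return poses

def best_pose(E, K, pts1, pts2):
    """Cheirality check: keep the pose with the most points in front of both cameras."""
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    best, most = None, -1
    for R, t in decompose_essential(E):
        P2 = K @ np.hstack([R, t])
        Xh = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)  # linear triangulation
        Xp = (Xh[:3] / Xh[3]).T
        depth1 = Xp[:, 2]                    # camera 1 sits at the world origin
        depth2 = Xp @ R[2] + t[2, 0]         # z-coordinate in camera 2's frame
        count = np.sum((depth1 > 0) & (depth2 > 0))
        if count > most:
            best, most = (R, t), count
    return best

# Tying it to the previous snippet:
# R2_est, t2_est = best_pose(E, K, in1, in2)
```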
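
Non-linear refinement (step 4.9): a minimal sketch with scipy.optimize.least_squares, refining one point at a time; this is an assumed formulation, not necessarily the project's exact cost function.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(Xp, P1, P2, x1, x2):
    """Reprojection residuals of one 3D point against its two 2D observations."""
    Xh = np.append(Xp, 1.0)
    p1, p2 = P1 @ Xh, P2 @ Xh
    return np.hstack([p1[:2] / p1[2] - x1, p2[:2] / p2[2] - x2])

def nonlinear_triangulate(X_linear, P1, P2, pts1, pts2):
    """Refine each linearly triangulated point by minimizing reprojection error."""
    refined = [least_squares(residuals, X0, args=(P1, P2, x1, x2)).x
               for X0, x1, x2 in zip(X_linear, pts1, pts2)]
    return np.array(refined)
```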
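
Registering a new image (steps 4.13.2-4.13.3): OpenCV covers both the robust initial estimate and a Levenberg-Marquardt refinement. The project implements its own non-linear PnP; cv2.solvePnPRefineLM (available in OpenCV >= 4.1) is shown here as a stand-in, and the data are synthetic.

```python
import cv2
import numpy as np

# obj_pts: Nx3 points already reconstructed; img_pts: their Nx2 observations
# in the new image (generated from a made-up ground-truth pose here)
obj_pts = np.random.uniform([-1, -1, 4], [1, 1, 8], (30, 3)).astype(np.float32)
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float32)
rvec_true = np.array([0.0, 0.05, 0.0], dtype=np.float32)
tvec_true = np.array([[0.2], [0.0], [0.1]], dtype=np.float32)
img_pts, _ = cv2.projectPoints(obj_pts, rvec_true, tvec_true, K, None)
img_pts = img_pts.reshape(-1, 2)

# Step 4.13.2: initial pose with PnP RANSAC (robust to outlier correspondences)
ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, None,
                                             reprojectionError=2.0)

# Step 4.13.3: non-linear (Levenberg-Marquardt) refinement on the inlier set
rvec, tvec = cv2.solvePnPRefineLM(obj_pts[inliers.ravel()],
                                  img_pts[inliers.ravel()], K, None,
                                  rvec, tvec)
```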
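
Bundle adjustment (step 5): a minimal dense sketch of the joint least-squares problem over all poses and points. Real implementations exploit the sparsity of the Jacobian (for example via the jac_sparsity argument of least_squares); the variable layout below is an assumption for illustration.

```python
import cv2
import numpy as np
from scipy.optimize import least_squares

def ba_residuals(params, n_cams, n_pts, K, cam_idx, pt_idx, obs):
    """Reprojection residuals over every observation.

    `params` packs n_cams poses (rvec, tvec: 6 values each) followed by
    n_pts 3D points (3 values each); obs[k] is the 2D observation of
    point pt_idx[k] in camera cam_idx[k]; K is the 3x3 float intrinsics."""
    poses = params[:n_cams * 6].reshape(n_cams, 6)
    points = params[n_cams * 6:].reshape(n_pts, 3)
    res = []
    for ci, pi, uv in zip(cam_idx, pt_idx, obs):
        proj, _ = cv2.projectPoints(points[pi:pi + 1], poses[ci, :3],
                                    poses[ci, 3:], K, None)
        res.append(proj.ravel() - uv)
    return np.concatenate(res)

# x0 stacks the current pose and point estimates:
# result = least_squares(ba_residuals, x0,
#                        args=(n_cams, n_pts, K, cam_idx, pt_idx, obs))
```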

 



Figure 1. Raw Matches

Figure 2. Reprojected Points

Figure 3. Linear vs. Non-Linear Triangulation

Figure 4. Triangulation

Figure 5. Non-Linear PnP

Figure 6. Bundle Adjustment

Figure 7. Cheirality
