# Use Camera Intrinsics to Undistort Video and Gaze Data

Pupil Invisible's scene camera has a very large field of view and is therefore affected by noticeable lens distortion. However, every scene camera is calibrated during manufacturing to obtain its "intrinsic values", which allows us to remove the distortion if desired.

In this guide you will learn how to use the intrinsic values to apply undistortion to the scene video and gaze data using OpenCV.

An example of undistortion

# Dependencies for this How-To

In order to run the code in this How-To, you will need to install the following Python dependencies:

pip install av matplotlib numpy opencv-python pandas requests tqdm

You can download all the code of this guide here.

We will use the Anna_Standing_Downstairs recording from the demo workspace as an example for this guide. If you want to follow along, download it and unpack it in the data/demo_recording folder next to the code.

# Obtaining Camera Intrinsic Values

You can find the intrinsic values of the scene camera in the scene_camera.json file, which is included in every recording folder you download from Pupil Cloud. This also holds true for the demo recording! We can read the camera matrix and the distortion coefficients, which make up the camera intrinsics, as follows.

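import json
import numpy as np
with open("data/demo_recording/anna_standing_downstairs-b1eafeee/scene_camera.json", "r") as f:
    data = json.load(f)
# The camera matrix and the distortion coefficients together form the intrinsics
camera_matrix = np.array(data["camera_matrix"])
dist_coeffs = np.array(data["dist_coefs"])
print("Camera Matrix:")
print(camera_matrix)
print("Distortion Coefficients:")
print(dist_coeffs)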
Camera Matrix:
[[765.46908275   0.         567.09545418]
 [  0.         765.26730808 545.31770941]
 [  0.           0.           1.        ]]
Distortion Coefficients:
[[-0.12591202  0.10118179  0.00068196 -0.00048156  0.0189529   0.2053599
   0.00745978  0.06701928]]
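
To make the printed values easier to interpret, the following minimal sketch unpacks them into named parameters. It assumes the standard OpenCV conventions: the camera matrix holds the focal lengths and the principal point in pixels, and the first eight distortion coefficients are ordered `k1, k2, p1, p2, k3, k4, k5, k6`. The variable names are illustrative and the numbers are copied from the output above.

```python
import numpy as np

# Values copied from the scene_camera.json output shown above.
camera_matrix = np.array(
    [
        [765.46908275, 0.0, 567.09545418],
        [0.0, 765.26730808, 545.31770941],
        [0.0, 0.0, 1.0],
    ]
)
dist_coeffs = np.array(
    [[-0.12591202, 0.10118179, 0.00068196, -0.00048156,
      0.0189529, 0.2053599, 0.00745978, 0.06701928]]
)

# Pinhole parameters: focal lengths and principal point, all in pixels.
fx, fy = camera_matrix[0, 0], camera_matrix[1, 1]
cx, cy = camera_matrix[0, 2], camera_matrix[1, 2]

# Assumed OpenCV ordering: k* are radial terms, p* are tangential terms.
k1, k2, p1, p2, k3, k4, k5, k6 = dist_coeffs.ravel()

print(f"focal length:    fx={fx:.2f} px, fy={fy:.2f} px")
print(f"principal point: cx={cx:.2f} px, cy={cy:.2f} px")
print(f"radial:          {k1}, {k2}, {k3}, {k4}, {k5}, {k6}")
print(f"tangential:      {p1}, {p2}")
```

Eight coefficients indicate OpenCV's rational distortion model, in which the radial terms `k4` to `k6` appear in a denominator.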

# Undistorting Images and Points

With the camera matrix and distortion coefficients available, we are now able to undistort images and points. If we want to undistort the scene camera video, it is critical that we undistort the gaze data as well. The gaze data is in scene camera coordinates and if we warp the scene camera images via undistortion, the gaze would otherwise no longer be accurate.
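
To illustrate what the coefficients describe, here is a rough NumPy-only sketch of the rational lens model as documented by OpenCV, together with a simple fixed-point inversion. The helpers `distort_normalized` and `undistort_pixels` are hypothetical names made up for this illustration. They are not part of OpenCV and are not its actual implementation, so in practice you would use `cv2.undistortPoints` instead.

```python
import numpy as np

# Intrinsics copied from the output above.
FX, FY = 765.46908275, 765.26730808
CX, CY = 567.09545418, 545.31770941
K1, K2, P1, P2 = -0.12591202, 0.10118179, 0.00068196, -0.00048156
K3, K4, K5, K6 = 0.0189529, 0.2053599, 0.00745978, 0.06701928


def _radial(r2):
    """Radial scaling factor of the rational model for squared radius r2."""
    numerator = 1 + K1 * r2 + K2 * r2**2 + K3 * r2**3
    denominator = 1 + K4 * r2 + K5 * r2**2 + K6 * r2**3
    return numerator / denominator


def _tangential(x, y, r2):
    """Tangential offsets of the model."""
    dx = 2 * P1 * x * y + P2 * (r2 + 2 * x * x)
    dy = P1 * (r2 + 2 * y * y) + 2 * P2 * x * y
    return dx, dy


def distort_normalized(xy):
    """Map ideal normalized coordinates to distorted normalized coordinates."""
    x, y = xy[:, 0], xy[:, 1]
    r2 = x * x + y * y
    dx, dy = _tangential(x, y, r2)
    return np.column_stack([x * _radial(r2) + dx, y * _radial(r2) + dy])


def undistort_pixels(pixels, iterations=50):
    """Invert the model for pixel coordinates by fixed-point iteration."""
    pixels = np.asarray(pixels, dtype=float)
    x_d = (pixels[:, 0] - CX) / FX
    y_d = (pixels[:, 1] - CY) / FY
    x, y = x_d.copy(), y_d.copy()
    for _ in range(iterations):
        r2 = x * x + y * y
        dx, dy = _tangential(x, y, r2)
        x, y = (x_d - dx) / _radial(r2), (y_d - dy) / _radial(r2)
    # Re-project with the same camera matrix, like P=camera_matrix in OpenCV.
    return np.column_stack([x * FX + CX, y * FY + CY])


points = np.array([[100.0, 100.0], [800.0, 200.0], [567.0, 545.0]])
shift = np.linalg.norm(undistort_pixels(points) - points, axis=1)
print("Displacement in pixels:", shift)
```

Points near the principal point barely move, while points towards the image corners are displaced substantially. This is why gaze data that is left uncorrected would no longer line up with the undistorted video.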

For simplicity, we will first undistort only a single image and a few artificial example points. Afterwards, we will go through an example of undistorting an entire Pupil Invisible recording.

The original distorted data looks as follows. Note that some lines that are straight in real life appear slightly curved in the image due to the distortion, for example the rail of the lamp or the edges of the paintings.

import cv2
import matplotlib.pyplot as plt
import numpy as np
img = cv2.imread("data/example_frame.png")
example_points = np.array(
    [
        (100.0, 100.0),
        (800.0, 200.0),
        (200.0, 500.0),
        (900.0, 900.0),
    ]
)
# Plot the original distorted version
plt.figure(figsize=(10, 10))
# Convert color space for plotting with matplotlib
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.scatter(example_points[:, 0], example_points[:, 1], color="orange", s=80)
plt.title("Original Image - Still Distorted")


We can now use OpenCV to undistort the image and the example points. Note how all straight lines now appear straight in the undistorted image. The image is also slightly cropped: for example, the rail of the lamp is no longer visible and the top left example point now lies outside of the image.

img_undist = cv2.undistort(img, camera_matrix, dist_coeffs)
# Passing P=camera_matrix returns the points in pixel coordinates
example_points_undist = cv2.undistortPoints(
    example_points.reshape(-1, 2), camera_matrix, dist_coeffs, P=camera_matrix
)
example_points_undist = example_points_undist.reshape(-1, 2)
# Plot the undistorted version
plt.figure(figsize=(10, 10))
plt.imshow(img_undist)
plt.scatter(
    example_points_undist[:, 0], example_points_undist[:, 1], color="orange", s=80
)
plt.title("Undistorted Image and Points")
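
Because undistorted points can end up outside the visible frame, it may be useful to flag them before further analysis. The following is a small sketch under stated assumptions: `inside_frame` is a hypothetical helper, the point values are made up for illustration, and the frame size of 1088 x 1080 pixels is a placeholder that you would normally read from `img_undist.shape`.

```python
import numpy as np


def inside_frame(points, width, height):
    """Return a boolean mask that is True for points within the image bounds."""
    points = np.asarray(points, dtype=float)
    x, y = points[:, 0], points[:, 1]
    # Comparisons with NaN evaluate to False, so missing values count as outside.
    return (x >= 0) & (x < width) & (y >= 0) & (y < height)


# Made-up undistorted coordinates, the first one lies outside the frame.
points = np.array([[-70.0, -55.0], [810.0, 190.0], [200.0, 500.0]])

# Placeholder frame size; use img_undist.shape[1] and img_undist.shape[0] instead.
mask = inside_frame(points, width=1088, height=1080)
print(mask)
print("Points kept:", points[mask])
```

The same mask could be applied to undistorted gaze samples to exclude those that no longer fall onto the cropped video frame.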


# Undistorting a Full Recording

To undistort a full recording we simply need to apply the undistortion to every video frame and all gaze data. This requires additional code to handle the video correctly, but the undistortion works just the same.

Note that we are using PyAV rather than OpenCV to handle the video. Video handling with OpenCV can be unreliable, and PyAV additionally allows us to correctly handle the audio track of the video, if there is one.

import pathlib
import av
from tqdm import tqdm
import pandas as pd
def undistort_video(original_video_path, undistorted_video_path):
    # Assumption: the scene video timestamps are stored next to the video in
    # "world_timestamps.csv". Adjust the file name if your export differs.
    timestamps_path = pathlib.Path(original_video_path).with_name(
        "world_timestamps.csv"
    )
    num_frames = pd.read_csv(timestamps_path).shape[0]
    original_container = av.open(str(original_video_path))
    original_video_stream = original_container.streams.video[0]
    undistorted_container = av.open(str(undistorted_video_path), "w")
    # Prefer the NVIDIA hardware encoder and fall back to software encoding
    try:
        undistorted_video = undistorted_container.add_stream("h264_nvenc")
    except Exception as e:
        print("nvenc not available", e)
        undistorted_video = undistorted_container.add_stream("h264")
    undistorted_video.options["bf"] = "0"
    undistorted_video.options["movflags"] = "faststart"
    undistorted_video.gop_size = original_video_stream.gop_size
    undistorted_video.codec_context.height = original_video_stream.height
    undistorted_video.codec_context.width = original_video_stream.width
    undistorted_video.codec_context.time_base = original_video_stream.time_base
    undistorted_video.codec_context.bit_rate = original_video_stream.bit_rate
    # Only create an audio stream if the original recording contains audio
    has_audio = bool(original_container.streams.audio)
    if has_audio:
        audio_stream = original_container.streams.audio[0]
        output_audio_stream = undistorted_container.add_stream("aac")
        output_audio_stream.codec_context.layout = audio_stream.layout.name
        output_audio_stream.codec_context.time_base = audio_stream.time_base
        output_audio_stream.codec_context.bit_rate = audio_stream.bit_rate
        output_audio_stream.codec_context.sample_rate = audio_stream.sample_rate
    progress = tqdm(unit=" frames", total=num_frames)
    with undistorted_container:
        for packet in original_container.demux():
            frames = packet.decode()
            if packet.stream.type == "audio":
                # Re-encode audio frames unchanged
                for frame in frames:
                    packets = output_audio_stream.encode(frame)
                    undistorted_container.mux(packets)
            elif packet.stream.type == "video":
                for frame in frames:
                    img = frame.to_ndarray(format="bgr24")
                    undistorted_img = cv2.undistort(img, camera_matrix, dist_coeffs)
                    new_frame = frame.from_ndarray(undistorted_img, format="bgr24")
                    # Keep the original timing so video and gaze stay in sync
                    new_frame.pts = frame.pts
                    new_frame.time_base = original_video_stream.time_base
                    packets = undistorted_video.encode(new_frame)
                    progress.update()
                    undistorted_container.mux(packets)
        # encode and mux frames that have been queued internally by the encoders
        undistorted_container.mux(undistorted_video.encode())
        if has_audio:
            undistorted_container.mux(output_audio_stream.encode())
    progress.close()
def undistort_gaze(original_gaze_path, undistorted_gaze_path):
    original_gaze_df = pd.read_csv(original_gaze_path)
    original_gaze = original_gaze_df[["gaze x [px]", "gaze y [px]"]].values
    # Passing P=camera_matrix returns the gaze points in pixel coordinates
    undistorted_gaze = cv2.undistortPoints(
        original_gaze.reshape(-1, 2), camera_matrix, dist_coeffs, P=camera_matrix
    )
    undistorted_gaze_df = original_gaze_df.copy()
    undistorted_gaze_df[["gaze x [px]", "gaze y [px]"]] = undistorted_gaze.reshape(-1, 2)
    undistorted_gaze_df.to_csv(undistorted_gaze_path, index=False)
recording_folder = "data/demo_recording/anna_standing_downstairs-b1eafeee/"
original_video_path = recording_folder + "d4c97639_0.0-230.541.mp4"
undistorted_video_path = recording_folder + "d4c97639_0.0-230.541_undistorted.mp4"
undistort_video(original_video_path, undistorted_video_path)
original_gaze_path = recording_folder + "gaze.csv"
undistorted_gaze_path = recording_folder + "gaze_undist.csv"
undistort_gaze(original_gaze_path, undistorted_gaze_path)
nvenc not available h264_nvenc
100%|██████████| 6913/6913 [04:26<00:00, 25.96 frames/s]