# Overview

Welcome to the Pupil Core developer documentation. If you haven't already, we highly recommend reading the Getting Started section and the User Guide before continuing with the developer documentation.

Have a question? Get in touch with developers and other community members on the #pupil-software-dev channel on Discord.

# Where to start?

There are a number of ways you can interact with Pupil Core as a developer.

  • Use the realtime API: The Pupil Core Network API allows you to remotely control Pupil Core software and send/receive data in real time. It provides access to nearly all data generated by Pupil Core software: gaze, fixation, video data, and much more! The API can be used with any programming language that supports zeromq and msgpack.
  • Develop a plugin: Pupil Core software is designed with extensibility in mind. It provides a simple yet powerful Plugin API that is used by nearly all existing Pupil Core components. Develop a plugin when you need access to data that is not provided via the Network API.
  • Modify source code: Can't do what you need to do with the network-based API or a plugin? Then get ready to dive into the inner workings of Pupil, set up dependencies, and run from source!

In most cases you can simply download Pupil Core app bundles and extend the functionality via API or Plugin.

# Terminology

There are a lot of new terms that are specific to eye tracking and to Pupil Core. We have compiled a small list in the terminology section.

# Timing & Data Conventions

Pupil Capture is designed to work with multiple cameras that free-run at different frame rates that may not be in sync. World and eye images are timestamped and any resulting artifacts (detected pupil, markers, etc) inherit the source timestamp. Any correlation of these data streams is the responsibility of the functional part that needs the data to be correlated (e.g. calibration, visualization, analyses).

Example: Pupil Capture data format records the world video frames with their respective timestamps. Independent of this, the recorder saves the detected gaze and pupil positions at their own frame rate and with their timestamps. For details about the stored data, see the recording format section.
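
As a rough illustration of such a correlation, the sketch below assigns each gaze timestamp to the world frame whose timestamp is closest. The function name closest_frame_index and the sample timestamps are made up for this example. This is only one possible strategy and not necessarily the one used inside Pupil Core software.

```python
import bisect


def closest_frame_index(frame_timestamps, t):
    """Return the index of the frame timestamp closest to t.

    frame_timestamps must be non-empty and sorted in ascending order.
    """
    i = bisect.bisect_left(frame_timestamps, t)
    if i == 0:
        return 0
    if i == len(frame_timestamps):
        return len(frame_timestamps) - 1
    before, after = frame_timestamps[i - 1], frame_timestamps[i]
    # Ties go to the earlier frame
    return i - 1 if t - before <= after - t else i


# Hypothetical timestamps in seconds: world frames at ~30 Hz, gaze at its own rate
world_timestamps = [0.000, 0.033, 0.066]
gaze_timestamps = [0.010, 0.020, 0.050, 0.090]

frame_for_gaze = [closest_frame_index(world_timestamps, t) for t in gaze_timestamps]
print(frame_for_gaze)
```

Because both streams keep their own timestamps, this kind of matching can be done after the fact, at whatever point the data actually needs to be combined.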

Pupil Core software uses simple key-value structures to represent single data points. Key-value structures can easily be serialized and nearly all programming languages have an implementation for them.

Each data point should have at least two keys: topic and timestamp. Each data point should be uniquely identifiable by its topic and timestamp.

  1. topic: Identifies the type of the object. We recommend that you specify subtypes, separated by a period (.), e.g. pupil.0.
  2. timestamp: Pupil time at which the datum was generated.
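
A minimal sketch of such a data point in Python follows. The values and the extra confidence field are made up for illustration. The json module from the standard library is used only as a stand-in serializer to keep the example self-contained; as noted above, the Network API relies on msgpack.

```python
import json

# A hypothetical datum with the two recommended keys plus one extra field
datum = {
    "topic": "pupil.0",             # type "pupil", subtype "0"
    "timestamp": 535741.715303987,  # Pupil time, unit: seconds
    "confidence": 0.9,
}

# Key-value structures serialize easily (json is only a stand-in here)
payload = json.dumps(datum)
restored = json.loads(payload)

# The pair (topic, timestamp) identifies the data point
datum_key = (restored["topic"], restored["timestamp"])

# Subtypes are separated by a period
topic_parts = restored["topic"].split(".")
print(datum_key, topic_parts)
```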

# Pupil Data Matching

Because Pupil Capture receives data from free-running eye cameras with sampling rates that may not be in sync, it employs a pupil data matching algorithm to decide which video frames should be used for each gaze estimation.

# Eye Video Frame Timestamp Offset

When an eye video frame is received by Pupil Capture and assigned a pupil timestamp, a fixed offset of 5 ms is applied to it to account for transmission delay from camera to Pupil software. Note, this transmission delay is not guaranteed to be constant, and the variability in the delay causes most of the FPS variance seen, especially on Windows.

# Pupil and Gaze Datums

When raw eye video frames arrive at Pupil Capture, the pupil detection algorithm is run and pupil datums are generated. For binocular gaze, the pupil data matching algorithm tries to match two pupil datums to send to the gaze mapper for gaze estimation. In some cases (described below) this is not possible and gaze is mapped monocularly.

Each binocular gaze datum thus has a base_data field that refers to the two pupil datums that were used in its construction: timestamp-id timestamp-id.... The gaze datum is assigned the average of these timestamps. For monocular gaze, base_data refers to one pupil datum, and the gaze datum is assigned that pupil datum's timestamp.
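
The timestamp rule above can be sketched as follows. The helper names gaze_timestamp and base_data_label are hypothetical, the pupil datums are reduced to the two fields used here, and the exact number formatting of the timestamp-id pairs is an assumption.

```python
def gaze_timestamp(base_data):
    """Average the timestamps of the pupil datums a gaze datum is based on."""
    timestamps = [pupil["timestamp"] for pupil in base_data]
    return sum(timestamps) / len(timestamps)


def base_data_label(base_data):
    """Join 'timestamp-id' pairs with spaces, one pair per pupil datum."""
    return " ".join(f"{pupil['timestamp']}-{pupil['id']}" for pupil in base_data)


# Hypothetical pupil datums, reduced to the fields used in this sketch
right = {"id": 0, "timestamp": 100.000}
left = {"id": 1, "timestamp": 100.004}

binocular_ts = gaze_timestamp([right, left])  # mean of both timestamps
monocular_ts = gaze_timestamp([left])         # same as the single pupil timestamp
label = base_data_label([right, left])
print(binocular_ts, monocular_ts, label)
```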

# Matching algorithm

In each Pupil Capture eye process, the following occurs:

  1. Record the exposure of frame F_i at time T_i (software timestamped on arrival with 5 ms offset applied)
  2. Run pupil detection algorithm on F_i, thus generating a pupil datum P_i with timestamp T_i
  3. Send P_i to Pupil Capture World process

In the Pupil Capture world process, the following occurs:

  1. Receive and queue P_i (there are separate queues for each eye; queues are automatically sorted by time)
  2. Get oldest pupil datums from each queue; right eye: P_0 and left eye: P_1 (does not remove data from queue)
  3. Match P_0 and P_1 and remove oldest datum from its respective queue if the following conditions are met:
    a. Both pupil datums have a confidence of 0.6 or higher
    b. abs(T_0 - T_1) is smaller than a specific temporal cutoff (calculated dynamically based on the pupil data present in each queue; represents effective frame rate of each eye process)
  4. If conditions 3a and 3b are not both met, map the older P_i monocularly and remove it from its queue. Leave the newer datum in its queue
  5. Repeat at step 2 for each new pupil datum
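
The world-process steps above can be sketched roughly as follows. This is a simplified illustration, not the actual Pupil Capture implementation: the function match_once, the helper datum, and the sample data are hypothetical, the temporal cutoff is passed in as a fixed value instead of being calculated dynamically, and treating equal timestamps as "right eye is older" is an arbitrary choice.

```python
from collections import deque

CONFIDENCE_THRESHOLD = 0.6  # first matching condition


def match_once(right_queue, left_queue, temporal_cutoff):
    """Run steps 2-4 once.

    Returns ("binocular", [P_0, P_1]), ("monocular", [P_i]),
    or None if one of the queues is empty.
    """
    if not right_queue or not left_queue:
        return None
    p0, p1 = right_queue[0], left_queue[0]  # peek only, nothing is removed yet

    # In both branches, only the older datum leaves its queue
    older_queue = right_queue if p0["timestamp"] <= p1["timestamp"] else left_queue
    older = older_queue.popleft()

    confident = min(p0["confidence"], p1["confidence"]) >= CONFIDENCE_THRESHOLD
    close_in_time = abs(p0["timestamp"] - p1["timestamp"]) < temporal_cutoff
    if confident and close_in_time:
        return "binocular", [p0, p1]
    return "monocular", [older]


def datum(eye_id, timestamp, confidence):
    return {"id": eye_id, "timestamp": timestamp, "confidence": confidence}


# Hypothetical queue contents, already sorted by time
right_queue = deque([datum(0, 0.000, 0.9), datum(0, 0.008, 0.9)])
left_queue = deque([datum(1, 0.002, 0.9), datum(1, 0.020, 0.3)])

results = []
while (result := match_once(right_queue, left_queue, temporal_cutoff=0.005)) is not None:
    results.append(result)

for kind, datums in results:
    print(kind, [(d["id"], d["timestamp"]) for d in datums])
```

In this run, the left-eye datum at 0.002 is first used in a binocular match and then mapped monocularly, because only the older datum of a matched pair is removed. The last left-eye datum stays queued until new right-eye data arrives.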

# Important Points

  • The matching algorithm will always map at least the oldest pupil datum
  • Whether the datum is mapped monocularly or binocularly depends on the confidence of the pair as well as their temporal distance
  • Given sufficient computational resources, all data is mapped; no data is dropped in this process
  • The latency of base data to gaze time is variable, e.g. ~1-6 ms
  • Single pupil datums can be matched more than once (e.g. a sample from eye0 could be matched with 3 samples from eye1, with each match generating a gaze datum)
    • This effectively leads to super-sampling of the pupil data, which is why you can see more data points than one would expect in the gaze positions file – this is ultimately necessary since the algorithm needs to run in real-time and there is only limited knowledge about future data
    • In contrast, sub-sampling uses multiple values to build one new value

# Convert Pupil Time to System Time

Converting Pupil Time to System Time can be helpful if you have other data recorded using System Time on the same machine. However, be aware that the accuracy of System Time is variable and depends on the device's network time protocol. Before using System Time for synchronization purposes, read our Best Practices.

When a Pupil Core recording is started, the time from both clocks is stored. Therefore, we know how they relate to each other, i.e. by how much they are offset. You can use this information to convert Pupil Time to System Time. For this, you will need: 1) Pupil Time at recording start, and 2) System Time at recording start.

  1. At recording start, current Pupil Time is written into the info.player.json file of the recording under the start_time_synced_s key.
  2. The current System Time is also saved under the start_time_system_s key.
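
Both values can be read with the json module from the standard library. The sketch below is a minimal example: the function name load_time_offset is hypothetical, and the temporary folder only stands in for a real recording folder and contains just the two keys named above.

```python
import json
import tempfile
from pathlib import Path


def load_time_offset(recording_dir):
    """Return System Time minus Pupil Time at recording start, in seconds."""
    info_path = Path(recording_dir) / "info.player.json"
    with info_path.open(encoding="utf-8") as f:
        info = json.load(f)
    return info["start_time_system_s"] - info["start_time_synced_s"]


# Stand-in for a real recording folder, reduced to the two keys used here
with tempfile.TemporaryDirectory() as recording_dir:
    info = {
        "start_time_system_s": 1533197768.2805,
        "start_time_synced_s": 674439.5502,
    }
    info_file = Path(recording_dir) / "info.player.json"
    info_file.write_text(json.dumps(info), encoding="utf-8")
    offset = load_time_offset(recording_dir)

print(offset)
```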

Here is an example of how to implement the conversion with Python:

    import datetime

    start_time_system = 1533197768.2805  # System Time at recording start
    start_time_synced = 674439.5502      # Pupil Time at recording start

    # Calculate the fixed offset between System and Pupil Time
    offset = start_time_system - start_time_synced

    # Choose a Pupil timestamp that you want to convert to System Time
    # (this can be any or all timestamps of interest)
    pupil_timestamp = 674439.4695  # This is a random example of a Pupil timestamp

    # Add the fixed offset to the timestamp(s) we wish to convert
    pupiltime_in_systemtime = pupil_timestamp + offset

    # Using the datetime python module, we can convert timestamps
    # stored as seconds represented by floating point values to a
    # more readable datetime format. Note that fromtimestamp() returns
    # local time, so the hour depends on the time zone of your machine.
    pupil_datetime = datetime.datetime.fromtimestamp(pupiltime_in_systemtime).strftime("%Y-%m-%d %H:%M:%S.%f")

    # example output: '2018-08-02 15:16:08.199800'

    # Hint: you can also copy and paste timestamps into various websites that convert them
    # to the readable date time format!

Now that you know the basics, why not follow this in-depth tutorial that shows how to convert Pupil Time to System Time for different Pupil Core exported files?

# Pupil Datum Format

The pupil detector generates pupil data from eye images. In addition to the pupil topic and the timestamp (inherited from the eye image), the pupil detector adds further fields, most importantly:

  • norm_pos: Pupil location in normalized eye coordinates, and
  • confidence: Value indicating quality of the measurement

By default, the Pupil Core software uses the 3d detector for pupil detection. Since it is an extension of the 2d detector, its data contains keys that were inherited from the 2d detection, as well as 3d detector specific keys. The minimal set of keys needed in a valid pupil datum object is: id, topic, method, norm_pos, diameter, timestamp, and confidence. Below you can see the Python representation of a 3d pupil datum:

    {
        ### pupil datum required fields

        'id': 0,  # eye id, 0 or 1
        'topic': 'pupil.0',
        'method': '3d c++',
        'norm_pos': [0.5, 0.5],  # norm space, [0, 1]
        'diameter': 0.0,  # 2D image space, unit: pixel
        'timestamp': 535741.715303987,  # time, unit: seconds
        'confidence': 0.0,  # [0, 1]

        ### 2D model data

        # 2D ellipse of the pupil in image coordinates
        'ellipse': {  # image space, unit: pixel
            'angle': 90.0,  # unit: degrees
            'center': [320.0, 240.0],
            'axes': [0.0, 0.0],
        },

        ### 3D model data

        # Fixed to 1.0 in pye3d v0.0.4.
        'model_confidence': 1.0,

        # pupil polar coordinates on 3D eye model. The model assumes a fixed
        # eye ball size. Therefore there is no `radius` key
        'theta': 0,
        'phi': 0,

        # 3D pupil ellipse
        'circle_3d': {  # 3D space, unit: mm
            'normal': [0.0, -0.0, 0.0],
            'radius': 0.0,
            'center': [0.0, -0.0, 0.0],
        },
        'diameter_3d': 0.0,  # 3D space, unit: mm

        # 3D eye ball sphere
        'sphere': {  # 3D space, unit: mm
            'radius': 0.0,
            'center': [0.0, -0.0, 0.0],
        },
        'projected_sphere': {  # image space, unit: pixel
            'angle': 90.0,
            'center': [0, 0],
            'axes': [0, 0],
        },
    }
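
If you generate or receive pupil datums in your own code, a quick check against the minimal key set listed above can catch malformed data early. The following is a small sketch; the names REQUIRED_PUPIL_KEYS and missing_pupil_keys are hypothetical and not part of the Pupil Core API.

```python
REQUIRED_PUPIL_KEYS = {
    "id", "topic", "method", "norm_pos", "diameter", "timestamp", "confidence",
}


def missing_pupil_keys(datum):
    """Return the required keys that are absent from datum, sorted by name."""
    return sorted(REQUIRED_PUPIL_KEYS - set(datum))


complete = {
    "id": 0,
    "topic": "pupil.0",
    "method": "3d c++",
    "norm_pos": [0.5, 0.5],
    "diameter": 0.0,
    "timestamp": 535741.715303987,
    "confidence": 0.0,
}
incomplete = {"topic": "pupil.0", "timestamp": 535741.715303987}

print(missing_pupil_keys(complete))
print(missing_pupil_keys(incomplete))
```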

# Gaze Datum Format

Gaze data is based on one (monocular) or two (binocular) pupil positions. The gaze mapper is automatically set up after calibration and maps pupil positions into the world camera coordinate system. The pupil data on which the gaze datum is based can be accessed using the base_data key.

    # monocular gaze datum
    {
        'topic': 'gaze.3d.1.',
        'confidence': 1.0,  # [0, 1]
        'norm_pos': [x, y],  # norm space, [0, 1]
        'timestamp': ts,  # time, unit: seconds

        # 3D space, unit: mm
        'gaze_normal_3d': [x, y, z],
        'eye_center_3d': [x, y, z],
        'gaze_point_3d': [x, y, z],
        'base_data': [<pupil datum>]  # list of pupil data used to calculate gaze
    }

    # binocular gaze datum
    {
        'topic': 'gaze.3d.01.',
        'confidence': 1.0,  # [0, 1]
        'norm_pos': [x, y],  # norm space, [0, 1]
        'timestamp': ts,  # time, unit: seconds

        # 3D space, unit: mm
        'gaze_normals_3d': {
            '0': [x, y, z],
            '1': [x, y, z],
        },
        'eye_centers_3d': {
            '0': [x, y, z],
            '1': [x, y, z],
        },
        'gaze_point_3d': [x, y, z],
        'base_data': [<pupil datum>]  # list of pupil data used to calculate gaze
    }
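
Since monocular and binocular gaze datums use different keys for the gaze normals, code that handles both needs to branch. Below is one way this could look. The helper names and sample values are hypothetical, and the sketch assumes that the eye id of a monocular datum can be taken from the id field of its single pupil datum in base_data.

```python
def is_binocular(gaze):
    """A binocular gaze datum is based on two pupil datums."""
    return len(gaze["base_data"]) == 2


def gaze_normals_by_eye(gaze):
    """Return {eye_id: gaze normal} for a monocular or binocular gaze datum."""
    if "gaze_normals_3d" in gaze:  # binocular: one entry per eye id
        return {int(eye_id): normal for eye_id, normal in gaze["gaze_normals_3d"].items()}
    # monocular: take the eye id from the single pupil datum in base_data
    eye_id = gaze["base_data"][0]["id"]
    return {eye_id: gaze["gaze_normal_3d"]}


# Hypothetical gaze datums, reduced to the fields used in this sketch
monocular = {
    "topic": "gaze.3d.1.",
    "gaze_normal_3d": [0.1, 0.0, 0.99],
    "base_data": [{"id": 1, "timestamp": 100.004}],
}
binocular = {
    "topic": "gaze.3d.01.",
    "gaze_normals_3d": {"0": [-0.1, 0.0, 0.99], "1": [0.1, 0.0, 0.99]},
    "base_data": [{"id": 0, "timestamp": 100.000}, {"id": 1, "timestamp": 100.004}],
}

print(is_binocular(monocular), gaze_normals_by_eye(monocular))
print(is_binocular(binocular), gaze_normals_by_eye(binocular))
```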

# Surface Datum Format

Surface data is published when the surface tracker is able to detect a defined surface. It includes

  • the name of the detected surface,
  • the timestamp of the video frame in which it was detected,
  • the homographies to transform surface to image coordinates and vice versa,
  • gaze and fixation data that was mapped onto the surface.
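
The homographies are 3x3 matrices, so they can be applied to a 2D point by extending it with a homogeneous coordinate and dividing by the third component of the result. The sketch below shows this with the surf_to_img_trans values from the example datum in this section. The function name apply_homography is hypothetical, and which image coordinate space the result refers to is not specified here.

```python
def apply_homography(matrix, point):
    """Map a 2D point with a 3x3 homography given as nested sequences."""
    x, y = point
    mapped = [row[0] * x + row[1] * y + row[2] for row in matrix]
    # Divide by the third homogeneous coordinate to get back to 2D
    return mapped[0] / mapped[2], mapped[1] / mapped[2]


# Values copied from the example surface datum in this section
surf_to_img_trans = (
    (-394.2704714040225, 62.996680859974035, 833.0782341017057),
    (24.939461954010476, 264.1698344383364, 171.09768247735033),
    (-0.0031580300961504023, 0.07378146751738948, 1.0),
)

# The surface origin (0, 0) maps to the first two entries of the last column,
# because the third entry of that column is 1.0
origin_in_image = apply_homography(surf_to_img_trans, (0.0, 0.0))
print(origin_in_image)
```

The same function works with img_to_surf_trans to go in the opposite direction.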

The gaze and fixation norm_pos fields contain surface coordinates. The base_data field is a tuple of the original topic and its timestamp.

    {
        "topic": "surfaces.surface_name",
        "name": "surface_name",
        "surf_to_img_trans": (
            (-394.2704714040225, 62.996680859974035, 833.0782341017057),
            (24.939461954010476, 264.1698344383364, 171.09768247735033),
            (-0.0031580300961504023, 0.07378146751738948, 1.0),
        ),
        "img_to_surf_trans": (
            (-0.002552357406770253, 1.5534025217146223e-05, 2.1236555655143734),
            (0.00025853538051076233, 0.003973842600569134, -0.8952954577358644),
            (-2.71355412859636e-05, -0.00029314688183396006, 1.0727627809231568),
        ),
        # list of gaze data mapped onto the surface
        "gaze_on_surfaces": (
            {
                "topic": "gaze.3d.1._on_surface",
                "norm_pos": (-0.6709809899330139, 0.41052111983299255),
                "confidence": 0.5594810076623645,
                "on_surf": False,
                "base_data": ("gaze.3d.1.", 714040.132285),
                "timestamp": 714040.132285,
            },
        ),
        # list of fixations mapped onto the surface
        "fixations_on_surfaces": (
            {
                "topic": "fixations_on_surface",
                "norm_pos": (-0.9006409049034119, 0.7738968133926392),
                "confidence": 0.8663407531808505,
                "on_surf": False,
                "base_data": ("fixations", 714039.771958),
                "timestamp": 714039.771958,
                "id": 27,
                "duration": 306.62299995310605,  # in milliseconds
                "dispersion": 1.4730711610581475,  # in degrees
            },
        ),
        # timestamp of the world video frame in which the surface was detected
        "timestamp": 714040.103912,
    }

# Running From Source

Follow the setup instructions for your OS on the Pupil Core GitHub repo.

When running from source, the user settings are not placed in the user's home directory but in the root directory of the cloned repository.