ultrack

ultrack is general-purpose 2D/3D cell tracking software.

It can track from segmentation masks or directly from raw images, especially fluorescence microscopy images.

Four interfaces are provided, depending on your needs:

  • napari plugin

  • FIJI plugin

  • Python API

  • Command line interface for batch processing, including distributed computing.

See below for additional details on each interface.

ultrack was originally developed for terabyte-scale zebrafish embryo images for which we had few 3D annotations. Hence, a few key features of ultrack are:

  • Out-of-memory storage of intermediate results. You should not run out of memory even for large datasets. We have tracked a 3TB dataset on a laptop with 64GB of RAM.

  • It does not commit to a single segmentation. Instead it considers multiple segmentations per cell and it picks the best segmentation for each cell at each time point while tracking.

Zebrafish imaged using DaXi whole embryo tracking.


Quickstart

This quickstart guide is recommended for users who are already familiar with Python and image analysis. Otherwise, we recommend you read the Installation and Getting started sections.

Installation

If you already have a working Python environment, you can install ultrack using pip. We recommend using a conda environment to avoid conflicts with your existing packages. If you're using OSX, or for additional information on how to create a conda environment and install packages, see Installation.

pip install ultrack

Basic usage

The following example demonstrates how to use ultrack to track cells using its canonical input, a binary image of the foreground and a cells’ contours image.

import napari
from ultrack import MainConfig, Tracker

# the main guard is important to avoid multi-processing issues
if __name__ == "__main__":

   # Load your data
   foreground = ...
   contours = ...

   # Create a config
   config = MainConfig()

   # Run the tracking
   tracker = Tracker(config=config)
   tracker.track(foreground=foreground, contours=contours)

   # Visualize the results
   tracks, graph = tracker.to_tracks_layer()
   napari.view_tracks(tracks[["track_id", "t", "y", "x"]], graph=graph)
   napari.run()

If you already have segmentation labels, you can provide them directly to the tracker.

import napari
from ultrack import MainConfig, Tracker

# the main guard is important to avoid multi-processing issues
if __name__ == "__main__":

   # Load your data
   labels = ...

   # Create a config
   config = MainConfig()

   # this removes irrelevant segments from the image
   # see the configuration section for more details
   config.segmentation_config.min_frontier = 0.5

   # Run the tracking
   tracker = Tracker(config=config)
   tracker.track(labels=labels)

   # Visualize the results
   tracks, graph = tracker.to_tracks_layer()
   napari.view_tracks(tracks[["track_id", "t", "y", "x"]], graph=graph)
   napari.run()

Installation

The easiest way to install the package is to use the conda (or mamba) package manager. If you do not have conda installed, we recommend installing mamba first, which is a faster alternative to conda. You can find mamba installation instructions here.

Once you have conda (mamba) installed, you should create an environment for ultrack as follows:

conda create -n ultrack python=3.11 higra gurobi pytorch pyqt -c pytorch -c gurobi -c conda-forge

Then, you can activate the environment and install ultrack:

conda activate ultrack
pip install ultrack

You can check if the installation was successful by running:

ultrack --help

GPU acceleration

Ultrack makes use of GPU for image processing operations. You can install the additional packages required for GPU acceleration by running (Linux and Windows only):

conda install pytorch-cuda -c pytorch -c nvidia
conda install cupy -c conda-forge
# linux only
conda install cucim -c rapidsai
# for windows, you can install cucim using pip
pip install "git+https://github.com/rapidsai/cucim.git#egg=cucim&subdirectory=python/cucim"

See the PyTorch website for more information on how to install PyTorch with GPU support.

Gurobi setup

Gurobi is a commercial optimization solver that is used in the tracking module of ultrack. While it is not a requirement, it is recommended to install it for the best performance.

To use it, you need to obtain a license (free for academics) and activate it.

Install gurobi using conda

You can skip this step if you have already installed Gurobi.

In your existing Conda environment, install Gurobi with the following command:

conda install -c gurobi gurobi

Obtain and activate an academic license

Obtaining a license: Register for an account using your academic email at Gurobi's website. Navigate to Gurobi's named academic license page and follow the instructions to get your academic license key.

Activating license: In your Conda environment, run:

grbgetkey YOUR_LICENSE_KEY

Replace YOUR_LICENSE_KEY with the key you received. Follow the prompts to complete activation.

Test the installation

Verify Gurobi’s installation by running:

ultrack check_gurobi

Troubleshooting

Depending on the operating system, the gurobi library might be missing and you need to install it from here.


Napari plugin

We wrapped up most of the functionality in a napari widget. The widget is already installed by default, but you must have napari installed to use it.

To use it, open napari and select ultrack from the plugins menu, then Ultrack from the dropdown menu.

The plugin is built around the concept of a tracking workflow. Any workflow is a sequence of pre-processing steps, segmentation, (candidate segments) linking, and the tracking problem solver. We explain the different workflows in the following sections.

Workflows

The difference between the workflows is how the user provides the input and how the plugin processes it. The remaining steps (segmentation, linking, and solver) are the same for all workflows. For each step, the widget provides direct access to the parameters of the step, and the user can change them to adapt the workflow to the specific problem. We explain how these parameters behave in the Configuration docs, and, more specifically, in the Experiment, Linking, and Tracking sections. Every input requested by the plugin should be loaded beforehand as a layer in napari.

There are three workflows available in the plugin:

  • Automatic tracking from image: This workflow is designed to track cells in a sequence of images. It uses classical image processing techniques to detect the cells (foreground) and their possible contours. In this workflow, you can change the parameters of the image processing steps. Refer to the documentation of the functions used in the image processing steps:

  • Manual tracking: Since ultrack is designed to work with precomputed cell detection and contour detection, this workflow is designed for the situation where the user has already computed the cell detection and the contours of the cells. In this situation, no additional parameter is needed; you only need to provide the cell detection and the contours of the cells.

  • Automatic tracking from segmentation labels: This workflow is designed to track cells in a sequence of images where the user has already computed the segmentation of the cells. This workflow wraps the function ultrack.utils.labels_to_contours() to compute the foreground and contours of the cells from the segmentation labels; refer to its documentation for additional details.

Flow Field Estimation

Every workflow allows the use of a flow field to improve the tracking of dynamic cells. This method estimates the movement of the cells in the sequence of images through the function ultrack.imgproc.flow.timelapse_flow(). See the documentation of this function for additional details.


FIJI plugin

Ultrack is also available as a FIJI plugin.

Its usage and installation instructions are in FIJI’s ultrack documentation.


Getting started

Ultrack's tracking pipeline is divided into three main steps:

  • segment: Creating the candidate segmentation hypotheses;

  • link: Finding candidate links between segmentation hypotheses of adjacent frames;

  • solve: Solving the tracking problem by finding the best segmentation and trajectory for each cell.

Each of these three steps has a function of the same name and its own configuration. They are combined in the track function and the Tracker class, which are the main entry points for the tracking pipeline.

You'll notice that these functions do not return any results. Instead, they store the results in a database. This enables us to process datasets larger than memory and to use distributed or parallel computing. We provide auxiliary functions to export the results to a format of your choice.

MainConfig.data_config provides the interface to the database. When re-executing these functions, use the overwrite parameter to erase previous results; otherwise, the new results are built on top of the existing data.

If you want to go deep into the weeds of our backend, we recommend looking at the ultrack.core.database.py file.

Each of the main steps is explained in detail below. A detailed description of the parameters can be found in Configuration.

Segmentation

Ultrack's canonical inputs are a foreground map and a contours map. There are several ways to obtain these inputs, which will be explained below. For now, let's assume we are working with them directly.

Both foreground and contours maps must have the same shape, with the first dimension being time (T) and the remaining being the spatial dimensions (Z optional, and Y, X).

foreground is used with config.segmentation_config.threshold (0.5 by default) to create a binary mask indicating the presence of the object of interest. Values above the threshold are considered foreground, and values below are considered background.

contours indicates the confidence of each pixel (voxel) being a cell boundary (contour). The higher the value, the more likely it is a cell boundary. It is coupled with config.segmentation_config.min_frontier, which fuses segmentation candidates separated by a boundary whose average value is below this threshold. It is 0 by default, so no fusion is performed.
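The roles of threshold and min_frontier can be pictured with a small, self-contained sketch. It is an illustration only, not ultrack's implementation: binarize and should_fuse are hypothetical helpers operating on plain lists, and treating a value exactly equal to the threshold as background is an assumption.

```python
from statistics import mean


def binarize(foreground_values, threshold=0.5):
    """Mark values strictly above the threshold as foreground (hypothetical helper)."""
    return [value > threshold for value in foreground_values]


def should_fuse(boundary_values, min_frontier=0.0):
    """Decide whether two neighboring candidates would be fused (hypothetical helper).

    boundary_values are the contour values along the shared boundary;
    the candidates are fused when their average is below min_frontier.
    """
    return mean(boundary_values) < min_frontier


print(binarize([0.1, 0.5, 0.9]))                           # [False, False, True]
print(should_fuse([0.02, 0.05, 0.03], min_frontier=0.1))  # True: weak boundary
print(should_fuse([0.6, 0.8, 0.7], min_frontier=0.1))     # False: strong boundary
```

With the default min_frontier of 0 and non-negative contour values, should_fuse never returns True, which matches the statement that no fusion is performed by default.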

The segmentation is the most important step, as it defines the candidates for the tracking problem. If your cells of interest are not present in the foreground after the threshold, you won't be able to track them. If there isn't even a faint boundary in the contours, you won't be able to split the foreground into individual cells. That's why it's preferable to have many contours (more hypotheses), even incorrect ones, than to have none.

Linking

The linking step is responsible for finding candidate links between segmentation hypotheses of adjacent frames. Usually, not a lot of candidates are needed (config.linking_config.max_neighbors = 5), unless you have several segmentation hypotheses (contours with several gray levels).

The parameter config.linking_config.max_distance must be at least the maximum distance a cell moves between two consecutive frames. It's used to filter out candidates that are too far from each other. If this value is too small, you won't be able to link cells that moved farther than it.
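One way to get a feel for a suitable max_distance is to measure how far a few manually matched cells move between two frames. The sketch below assumes you already have such matched centroids; max_displacement is a hypothetical helper, not an ultrack function, and the coordinates are made-up values.

```python
import math


def max_displacement(centroids_t0, centroids_t1):
    """Largest centroid displacement among matched cells (hypothetical helper).

    centroids_t0[i] and centroids_t1[i] are the same cell in consecutive frames.
    """
    return max(math.dist(p, q) for p, q in zip(centroids_t0, centroids_t1))


# Made-up (y, x) centroids of two cells in frames t and t + 1
frame_0 = [(10.0, 10.0), (40.0, 22.0)]
frame_1 = [(13.0, 14.0), (40.0, 30.0)]

print(max_displacement(frame_0, frame_1))  # 8.0
```

In this toy case, config.linking_config.max_distance should be at least 8, plus some tolerance.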

Solving

The solving step is responsible for solving the tracking problem by finding the best segmentation and trajectory for each cell. The parameters for this step are harder to interpret, as they are related to the optimization problem. The most important ones are:

  • config.tracking_config.appear_weight: The penalization for a cell to appear, which means to start a new lineage;

  • config.tracking_config.division_weight: The penalization for a cell to divide, breaking a single tracklet into two;

  • config.tracking_config.disappear_weight: The penalization for a cell to disappear, which means to end a lineage;

These weights are negative or zero, as they try to balance the cost of including new lineages in the final solution. The connections (links) between segmentation hypotheses are positive and measure the quality of the tracks, so only lineages with a total linking weight higher than the penalizations are included in the final solution. At the same time, our optimization problem is finding the combination of connections that maximizes the sum of weights of all lineages.
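The balance between link weights and penalizations can be sketched with simple bookkeeping. This is a deliberately simplified illustration, not the solver's actual formulation: lineage_score is a hypothetical helper, it scores one lineage in isolation, and it assumes every branch pays one disappearance penalty, whereas the real problem selects segments and links jointly.

```python
def lineage_score(link_weights, n_divisions=0, appear_weight=-0.001,
                  disappear_weight=-0.001, division_weight=-0.001):
    """Total weight of one lineage under simplified accounting (hypothetical helper)."""
    n_endpoints = n_divisions + 1  # each division adds one more branch that must end
    penalties = (
        appear_weight
        + n_divisions * division_weight
        + n_endpoints * disappear_weight
    )
    return sum(link_weights) + penalties


# A lineage with strong links outweighs the penalties; a weak one does not.
strong = lineage_score([0.8, 0.7, 0.9], appear_weight=-0.5, disappear_weight=-0.5)
weak = lineage_score([0.2, 0.1], appear_weight=-0.5, disappear_weight=-0.5)
print(strong > 0, weak > 0)  # True False
```

Under this simplified view, making the weights more negative removes lineages whose links are weak, and bringing them closer to zero admits more of them.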

See the tracking configuration description for more information and Tuning tracking performance for details on how to select these parameters.

Exporting

Once the above steps have been applied, the tracking solutions are recorded in the database and can be exported to a format of your choice using functions such as to_networkx, to_trackmate, to_tracks_layer, tracks_to_zarr, and others.

See the export API reference for all available options and their parameters.

Example of exporting solutions to napari tracks layer:

# ... tracking computation

# Exporting to napari format using `Tracker` class
tracks, graph = tracker.to_tracks_layer()

# Exporting using config file
tracks, graph = to_tracks_layer(config)

Post-processing

We also provide some additional post-processing functions, to remove, join, or analyze your tracks. Most of them are available in ultrack.tracks. Some examples are:

  • close_tracks_gaps: Closes gaps by joining tracklets and interpolating the missing segments;

  • filter_short_sibling_tracks: Removes short tracklets generated by false divisions;

  • get_subgraph: Returns the whole lineage(s) of a given tracklet.

Other functionalities can be found in ultrack.utils or ultrack.imgproc. One notable example is:

  • tracks_properties: Computes statistics from the tracks, segmentation masks and images.

For additional information, please refer to the tracks post-processing API reference.

Image processing

Despite being presented here last, ultrack’s image processing module provides auxiliary functions to process your image before the segmentation step. It’s not mandatory to use it, but it might reduce the amount of code you need to write to preprocess your images.

Most of them are available in the ultrack.imgproc, ultrack.utils.array and ultrack.utils.cuda modules.

Refer to the image processing API reference for more information.


Examples

Here we provide some examples of how to use Ultrack for cell tracking.

Some examples are provided as Jupyter notebooks with additional documentation, but we do not recommend using Jupyter notebooks for your day-to-day analysis.

Other examples, provided as Python scripts, can be found here.

Additional packages might be required. Therefore, conda environment files are provided, which can be installed using:

conda env create -f <environment-file.yml>
conda activate <your new env>
pip install git+https://github.com/royerlab/ultrack

The existing examples are:


Tuning tracking performance

Once you have a working ultrack pipeline, the next step is optimizing the tracking performance. Here we describe our guidelines for doing so and how far you can expect the results to improve.

It will be divided into a few sections:

  • Pre-processing: How to make tracking easier by pre-processing the data;

  • Input verification: Guidelines to check if you have good labels or foreground and contours maps;

  • Hard constraints: Parameters that must be adjusted so the hypotheses include the correct solution;

  • Tracking tuning: Guidelines to adjust the weights to make the correct solution more likely.

Pre-processing

Registration

Before tracking, the first question to ask yourself is, are your frames correctly aligned?

If not, we recommend aligning them. To do that, we provide ultrack.imgproc.register_timelapse, which corrects translations; see the registration API.

If the movement is more complex, with cells moving in different directions, we recommend using the flow functionalities to align individual segments with distinct transforms, see the flow tutorial. See the flow estimation API for more information.

Deep learning

Some deep learning models are sensitive to the contrast of your data. We recommend adjusting the contrast and removing the background before applying them to improve their predictions. See the image processing utilities API for more information.

Input verification

At this point, we assume you already have a labels image or foreground and contours maps.

You should check if labels or foreground contains every cell you want to track. Any region that is not included in the labels or foreground will not be tracked and can only be fixed with post-processing.

If you are using foreground and contours maps, you should check if the contours induce hierarchies that lead to your desired segmentation.

This can be done by loading the contours in napari and viewing them over your original image with blending='additive'.

You want your contours image to have higher values at the boundaries of cells and lower values inside them. This indicates that these regions are more likely to be boundaries than the interior of cells. Notice that this notion is much more flexible than a real contour map, which is why we can use an intensity image or an inverted distance transform as a contours map.

In cells where this is not the case, it is less likely that ultrack will be able to separate them into individual segments.

If your cells (nuclei) are convex, it is worth trying ultrack.imgproc.inverted_edt for the contours.

If you still don't have successful results after going through the next steps, we suggest looking for specialized solutions once you have a working pipeline. Some of these solutions are PlantSeg for membranes or GoNuclear for nuclei.

Hard constraints

This section is about adjusting the parameters so we have hypotheses that include the correct solution.

Please refer to the Configuration docs as we refer to different parameters.

1. The expected cell size should be between segmentation_config.min_area and segmentation_config.max_area. Having a tight range assists in finding a good segmentation and significantly reduces the computation. Our rule of thumb is to set min_area to half the size of the expected cell or of the smallest cell, disregarding outliers, and max_area to 1.25 to 1.5 times the size of the largest cell. The choice of max_area is less critical than that of min_area.

2. linking_config.max_distance should be set to the maximum distance a cell can move between frames. We recommend setting some tolerance, for example, 1.5 times the expected movement.
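The two rules of thumb above amount to simple arithmetic, sketched below. suggest_hard_constraints is a hypothetical helper, not part of ultrack, and the measurements passed to it are made-up values; it uses the smallest cell for min_area and the upper end (1.5) of the suggested max_area range.

```python
def suggest_hard_constraints(cell_areas, expected_max_displacement,
                             max_area_factor=1.5, distance_tolerance=1.5):
    """Apply the rules of thumb to rough measurements (hypothetical helper).

    cell_areas: cell sizes in pixels (voxels), with outliers already removed.
    expected_max_displacement: largest expected movement between frames.
    """
    return {
        "min_area": min(cell_areas) / 2,                # half the smallest cell
        "max_area": max(cell_areas) * max_area_factor,  # 1.25 to 1.5 times the largest
        "max_distance": expected_max_displacement * distance_tolerance,
    }


suggested = suggest_hard_constraints([400, 520, 610, 800], expected_max_displacement=10)
print(suggested)
# {'min_area': 200.0, 'max_area': 1200.0, 'max_distance': 15.0}
```

The resulting values would then be assigned to segmentation_config.min_area, segmentation_config.max_area and linking_config.max_distance, rounding as needed.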

Tracking tuning

Once you have gone through the previous steps, you should have a working pipeline and now we can focus on the results and what can be done in each scenario.

  1. My cells are oversegmented (excessive splitting of cells):
    • Increase the segmentation_config.min_area to merge smaller cells;

    • Increase the segmentation_config.max_area to avoid splitting larger cells;

    • If you have clear boundaries and the oversegmentation is around weak boundaries, you can increase segmentation_config.min_frontier to merge them (steps of 0.05 recommended).

    • If you're using labels as input or to create your contours, you can also try increasing the sigma parameter to create a better surface for segmentation by avoiding flat regions (full of zeros or ones).

  2. My cells are undersegmented (cells are fused):
    • Decrease the segmentation_config.min_area to enable segmenting smaller cells;

    • Decrease the segmentation_config.max_area to remove larger segments that are likely to be fused cells;

    • Decrease the segmentation_config.min_frontier to avoid merging cells that have weak boundaries;

    • EXPERIMENTAL: Set segmentation_config.max_noise to a value greater than 0, to create more diverse hierarchies, the scale of this value should be proportional to the contours value, for example, if the contours is in the range of 0-1, the max_noise around 0-0.05 should be enough. Play with it. NOTE: the solve step will take longer because of the increased number of hypotheses.

  3. I have missing segments that are present on the labels or foreground:
    • Check if these cells are above the segmentation_config.threshold value, if not, decrease it;

    • Check if linking_config.max_distance is too low and increase it, when cells don’t have connections they are unlikely to be included in the solutions;

    • Your tracking_config.appear_weight, tracking_config.disappear_weight & tracking_config.division_weight penalization weights are too strong (too negative); try bringing them closer to 0.0. TIP: We recommend adjusting disappear_weight first, because when tuning appear_weight you should balance out division_weight so appearing cells don't become fake divisions. A rule of thumb is to keep division_weight equal to or more negative than appear_weight.

  4. I’m not detecting enough dividing cells:
    • Bring tracking_config.division_weight to a value closer to 0.

    • Depending on your time resolution and your cell type, it might be the case where dividing cells move further apart, in this case, you should tune the linking_config.max_distance accordingly.

  5. I’m detecting too many dividing cells:
    • Make tracking_config.division_weight more negative.

  6. My tracks are short and not continuous enough:
    • This is tricky, once you have tried the previous steps, you can try making the tracking_config.{appear, division, disappear}_weight more negative, but this will remove low-quality tracks.

    • Another option is to use ultrack.tracks.close_tracks_gaps to post process the tracks.

  7. I have many incorrect tracks connecting distant cells:
    • Decrease the linking_config.max_distance to avoid connecting distant cells. If that can't be done because you would lose correct connections, then you should set linking_config.distance_weight to a value higher than 0, increasing it in very small steps (0.01).
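While iterating on the weights, it can help to check them against the guidelines above before each run. The sketch below is one possible sanity check; check_weights is a hypothetical helper, not an ultrack function, and it only encodes two rules stated in this document: the weights should be negative or zero, and division_weight should be equal to or more negative than appear_weight.

```python
def check_weights(appear_weight, disappear_weight, division_weight):
    """Return a list of warnings for weights that break the guidelines (hypothetical helper)."""
    warnings = []
    named_weights = (
        ("appear_weight", appear_weight),
        ("disappear_weight", disappear_weight),
        ("division_weight", division_weight),
    )
    for name, value in named_weights:
        if value > 0:
            warnings.append(f"{name} should be negative or zero, got {value}")
    if division_weight > appear_weight:
        warnings.append(
            "division_weight is closer to zero than appear_weight; "
            "appearing cells may become fake divisions"
        )
    return warnings


print(check_weights(-0.1, -0.1, -0.5))  # []
print(len(check_weights(-0.5, -0.1, -0.1)))  # 1
```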


Configuration

The configuration is at the heart of ultrack. It defines the parameters for each step of the pipeline and where to store the intermediate results. MainConfig is the main configuration; it contains the configurations of the individual steps plus the data configuration.

The configurations are documented below. Parameters are ordered by importance, with the most important at the top of the list. Parameters that should not be changed in most cases are at the bottom of the list and carry a SPECIAL tag.

ultrack.config.MainConfig

ultrack.config.DataConfig

Configuration for intermediate data storage and retrieval.

ultrack.config.SegmentationConfig

Segmentation hypotheses creation configuration

ultrack.config.LinkingConfig

Candidate cell hypotheses linking configuration

ultrack.config.TrackingConfig

Tracking (segmentation & linking selection) configuration


class ultrack.config.MainConfig
field data_config [Optional] (alias 'data')

Configuration for intermediate data storage and retrieval.

field segmentation_config [Optional] (alias 'segmentation')

Segmentation hypotheses creation configuration

field linking_config [Optional] (alias 'linking')

Candidate cell hypotheses linking configuration

field tracking_config [Optional] (alias 'tracking')

Tracking (segmentation & linking selection) configuration


class ultrack.config.DataConfig

Configuration for intermediate data storage and retrieval.

field n_workers = 1

Number of workers for parallel processing

field working_dir = PosixPath('.')

Working directory for auxiliary files (e.g. sqlite database, metadata)

field database = 'sqlite'

SPECIAL: Database type; sqlite and postgresql are supported

field address = None

SPECIAL: Postgresql database path, for example, postgres@localhost:12345/example

property database_path

Returns database path given working directory and database type.

metadata_add(data)

Adds data content to metadata file.

property metadata

Returns metadata as dictionary.

dict(*args, **kwargs)

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.


class ultrack.config.SegmentationConfig

Segmentation hypotheses creation configuration

field min_area = 100

Minimum number of pixels in a segment; regions smaller than this value are merged, or removed when there is no neighboring region

field max_area = 1000000

Maximum number of pixels in a segment; regions larger than this value are removed

field n_workers = 1

Number of worker threads

field min_frontier = 0.0

Minimum average frontier value between candidate segmentations, regions sharing an average frontier value lower than this are merged

field threshold = 0.5

Threshold used to binarize the cell foreground map

field max_noise = 0.0

SPECIAL: Upper limit of uniform distribution for additive noise on contour map

field ws_hierarchy = <function watershed_hierarchy_by_area>

SPECIAL: Watershed hierarchy function from higra used to construct the hierarchy

field anisotropy_penalization = 0.0

SPECIAL: Image graph z-axis penalization, positive values will prioritize segmenting the xy-plane first, negative will do the opposite

dict(*args, **kwargs)

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.


class ultrack.config.LinkingConfig

Candidate cell hypotheses linking configuration

field max_distance = 15.0

Maximum distance between neighboring segments

field n_workers = 1

Number of worker threads

field max_neighbors = 5

Maximum number of neighbors per candidate segment

field distance_weight = 0.0

Penalization weight \(\gamma\) for distance between segment centroids, \(w_{pq} - \gamma \|c_p - c_q\|_2\), where \(c_p\) is region \(p\) center of mass

field z_score_threshold = 5.0

SPECIAL: z-score threshold between intensity values from within the segmentation masks of neighboring segments
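
The distance_weight formula above can be evaluated by hand for a single candidate link. The sketch below is only an illustration of the stated expression \(w_{pq} - \gamma \|c_p - c_q\|_2\); penalized_link_weight is a hypothetical helper, not ultrack's internal code, and the weight and centroids are made-up values.

```python
import math


def penalized_link_weight(weight, centroid_p, centroid_q, distance_weight):
    """Return w_pq - gamma * ||c_p - c_q||_2 for one candidate link (hypothetical helper)."""
    return weight - distance_weight * math.dist(centroid_p, centroid_q)


# Two centroids 5 pixels apart: the link loses 0.01 * 5 = 0.05 of its weight.
print(penalized_link_weight(0.9, (10.0, 10.0), (13.0, 14.0), distance_weight=0.01))

# With the default distance_weight of 0.0 the link weight is unchanged.
print(penalized_link_weight(0.9, (10.0, 10.0), (13.0, 14.0), distance_weight=0.0))
```

Because the penalty grows with distance, even small values of distance_weight favor links between nearby segments.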


class ultrack.config.TrackingConfig

Tracking (segmentation & linking selection) configuration

field appear_weight = -0.001

Penalization weight for appearing cell, should be negative

field disappear_weight = -0.001

Penalization for disappearing cell, should be negative

field division_weight = -0.001

Penalization for dividing cell, should be negative

field n_threads = -1

Number of worker threads

field window_size = None

Time window size for partially solving the tracking ILP. By default it solves the entire timelapse at once. Useful for large datasets.

field overlap_size = 1

Number of frames shared (overlap/pad) on each side when window_size is set. This improves the tracking quality at the edges of the windows and enforces continuity of the tracks.

field solution_gap = 0.001

Solver solution gap. This will speed up the solver when finding the optimal solution might take a long time, but may affect the quality of the solution.

field time_limit = 36000

Solver execution time limit in seconds

field method = 0

SPECIAL: Solver method, reference

SPECIAL: Function used to transform the edge and node weights, identity or power

field power = 4

SPECIAL: Exponent \(\eta\) of power transform, \(w_{pq}^\eta\)

field bias = -0.0

SPECIAL: Edge weights bias \(b\), \(w_{pq} + b\), should be negative
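
The effect of the power and bias parameters can be seen by applying the two stated expressions, \(w_{pq}^\eta\) and \(w_{pq} + b\), to a few weights. The sketch below applies them separately, since how the solver combines them is not described here; power_transform and add_bias are hypothetical helpers, not ultrack functions.

```python
def power_transform(weight, power=4):
    """Return w ** eta, the power transform of a weight (hypothetical helper)."""
    return weight ** power


def add_bias(weight, bias=0.0):
    """Return w + b, a weight shifted by the bias (hypothetical helper)."""
    return weight + bias


for w in (0.9, 0.5, 0.2):
    print(w, round(power_transform(w), 4))
# 0.9 0.6561
# 0.5 0.0625
# 0.2 0.0016
```

For weights between 0 and 1, the power transform shrinks weak weights much more than strong ones, so low-quality links contribute very little to the objective.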


FAQ

Q: What is each configuration parameter for?

A: See the configuration page.

Q: What to do when Qt platform plugin cannot be initialized?

A: Try installing pyqt using conda from the conda-forge channel.

Q: Why does my Python script get stuck when using ultrack?

A: You need to wrap your code in an if __name__ == '__main__': block to prevent the multiprocessing module from running the same code in each process. For example:

import ultrack

def main():
    # Your code here
    ...

if __name__ == '__main__':
    main()

Q: My results show strange segments with perfectly straight boundaries. What is happening and how can I fix it?

A: This is a hierarchical watershed artifact. Regions with “flat” intensities create arbitrary partitions that are, most of the time, a straight line.

You have three options to fix this:

  • increase min_area parameter, so these regions get removed. However, if you have objects with varying areas, this might be challenging and lead to undersegmentation.

  • increase min_frontier; this is the preferred option when you have a high-accuracy boundary network, as with plants.

    This merges regions whose average boundary intensity between them is below min_frontier. In this case, assuming your boundary map is between 0 and 1, min_frontier=0.1 or even 0.05 should work. Be careful not to increase this too much because it could merge regions where cells are separated by a "weak" boundary.

  • another option is to blur the boundary map so you avoid creating regions with “flat” intensities.

    This follows the same reasoning for using EDT to run watersheds. This works better for convex objects. And remember to renormalize the intensity values if using this with min_frontier.

Q: Ultrack is not working with my data. What can I do?

A: See the Tuning tracking performance page.

Q: My data is anisotropic. How can I take that into account?

A: Provide the Z, Y, X scaling factors in the scale parameter of the track or link functions.
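
To see why the scaling factors matter, consider how a distance changes once voxel coordinates are converted to physical units. The sketch below is an illustration only, not ultrack's internal code: scaled_distance is a hypothetical helper and the (Z, Y, X) scale values are made up.

```python
import math


def scaled_distance(p, q, scale):
    """Euclidean distance after multiplying each axis by its scale (hypothetical helper)."""
    scaled_p = [coord * s for coord, s in zip(p, scale)]
    scaled_q = [coord * s for coord, s in zip(q, scale)]
    return math.dist(scaled_p, scaled_q)


# Two points that are two slices apart along Z, in (Z, Y, X) voxel coordinates
p, q = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)

print(scaled_distance(p, q, scale=(1.0, 1.0, 1.0)))  # 2.0
print(scaled_distance(p, q, scale=(4.0, 1.0, 1.0)))  # 8.0
```

With a Z spacing four times larger than the Y and X spacing, two slices correspond to a distance of 8 rather than 2, which changes whether a candidate link falls within max_distance.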

Q: How can I use Ultrack distributed over a cluster?

A: Jacky Ko shared his setup and his documentation here.


Theory

See our algorithm description in our computer vision paper.


Citing

If you use ultrack in your research, please cite the following papers, which describe the algorithm and the biological applications and software, respectively.

@article{bragantini2023ucmtracking,
   title={Large-Scale Multi-Hypotheses Cell Tracking Using Ultrametric Contours Maps},
   author={Jordão Bragantini and Merlin Lange and Loïc Royer},
   year={2023},
   eprint={2308.04526},
   archivePrefix={arXiv},
   primaryClass={cs.CV}
}

@article{bragantini2024ultrack,
   title={Ultrack: pushing the limits of cell tracking across biological scales},
   author={Bragantini, Jordao and Theodoro, Ilan and Zhao, Xiang and Huijben, Teun APM and Hirata-Miyasaki, Eduardo and VijayKumar, Shruthi and Balasubramanian, Akilandeswari and Lao, Tiger and Agrawal, Richa and Xiao, Sheng and others},
   journal={bioRxiv},
   pages={2024--09},
   year={2024},
   publisher={Cold Spring Harbor Laboratory}
}

Please also cite the auxiliary methods you use (e.g. Cellpose, napari), depending on your usage.
