API¶

First, we provide a summary of the main functionalities of the package. Then we provide detailed documentation of every public function of ultrack.

Object Oriented API¶

ultrack.Tracker(config)

An Ultrack wrapper for its core functionalities.

Core functionalities¶

`ultrack.track`(config, *[, labels, sigma, ...])	All-in-one function for cell tracking. It accepts multiple inputs (labels or contours) and runs all intermediate steps: computing segmentation hypotheses, linking, and solving the ILP.
`ultrack.segment`(foreground, contours, config)	Add candidate segmentation (nodes) from foreground and edge to database.
`ultrack.link`(config[, images, scale, ...])	Links candidate segments (nodes) with their neighbors in the next time point.
`ultrack.solve`(config[, batch_index, ...])	Compute tracking by selecting nodes with maximum flow from database.
`ultrack.add_flow`(config, vector_field)	Adds vector field (coordinate shift) data into nodes.
`ultrack.load_config`(path)	Creates MainConfig from TOML file.
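
As a rough illustration of how the core functions compose, the sketch below wraps the all-in-one `ultrack.track` call and a tracks-layer export in one helper. The helper name `track_and_export` is hypothetical, the default `MainConfig()` is only a placeholder for a tuned configuration, and the import is deferred so the helper can be defined without ultrack installed.

```python
def track_and_export(foreground, contours):
    """Hypothetical helper: run the full pipeline and return the tracks."""
    # Deferred imports: ultrack is only required when the helper is called.
    import ultrack
    from ultrack.core.export import to_tracks_layer

    config = ultrack.MainConfig()  # placeholder; tune it for real data
    # Computes segmentation hypotheses, links them, and solves the ILP.
    ultrack.track(config, foreground=foreground, contours=contours)
    # Queries the solution as a tracks dataframe plus a lineage graph.
    tracks_df, graph = to_tracks_layer(config)
    return tracks_df, graph
```

Both arrays are expected to have shape (T, (Z), Y, X), as described for `ultrack.segment`.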

Image processing utilities¶

`ultrack.imgproc.PlantSeg`(model_name[, ...])	A class for performing boundary detection using the Plant-Seg model.
`ultrack.imgproc.detect_foreground`(image[, ...])	Detect foreground using morphological reconstruction by dilation and thresholding.
`ultrack.imgproc.inverted_edt`(mask[, ...])	Computes Euclidean distance transform (EDT), inverts and normalizes it.
`ultrack.imgproc.normalize`(image, gamma[, ...])	Normalize image to between [0, 1] and applies a gamma transform (x ^ gamma).
`ultrack.imgproc.robust_invert`(image[, ...])	Inverts an image robustly by first smoothing it with a gaussian filter and then normalizing it to [0, 1].
`ultrack.imgproc.tracks_properties`(segments)	Calculate properties of tracked regions from segmentation data.
`ultrack.imgproc.Cellpose`(**kwargs)
`ultrack.imgproc.sam.MicroSAM`([model_type, ...])	Using a SAM model, generates masks for the entire image.
`ultrack.imgproc.register_timelapse`(timelapse, *)	Register a timelapse sequence using phase cross correlation.
`ultrack.imgproc.flow.timelapse_flow`(images)	Compute vector field from timelapse.
`ultrack.utils.labels_to_contours`(labels[, ...])	Converts and merges a sequence of labels into ultrack input format (foreground and contours)

Exporting¶

`ultrack.core.export.to_ctc`(output_dir, config)	Exports tracking results to cell-tracking challenge (http://celltrackingchallenge.net) format.
`ultrack.core.export.to_trackmate`(config[, ...])	Exports tracking results to TrackMate XML format.
`ultrack.core.export.to_tracks_layer`(config)	Exports solution from database to napari tracks layer format.
`ultrack.core.export.tracks_to_zarr`(config, ...)	Exports segmentation masks to a zarr array; tracks_df assigns each track_id to its respective segments.

Core functionalities¶

class ultrack.MainConfig¶

field data_config [Optional] (alias 'data')¶: Configuration for intermediate data storage and retrieval.

field segmentation_config [Optional] (alias 'segmentation')¶: Segmentation hypotheses creation configuration

field linking_config [Optional] (alias 'linking')¶: Candidate cell hypotheses linking configuration

field tracking_config [Optional] (alias 'tracking')¶: Tracking (segmentation & linking selection) configuration

class ultrack.Tracker(config)¶

An Ultrack wrapper for its core functionalities.

Parameters:

config (MainConfig) – The configuration parameters.

Variables:

config (MainConfig) – The configuration parameters.
status (TrackerStatus) – The status of the tracking process.

Examples

>>> import ultrack
>>> from ultrack import MainConfig
>>> config = MainConfig()
>>> foreground = ...
>>> contours = ...
>>> vector_field = ...
>>> tracker = ultrack.Tracker(config)
>>> tracker.segment(foreground=foreground, contours=contours)
>>> tracker.add_flow(vector_field=vector_field)
>>> tracker.link()
>>> tracker.solve()

add_flow(vector_field)¶

Adds vector field (coordinate shift) data into nodes. If there are fewer vector fields than dimensions, the last dimensions from (z, y, x) have priority. For example, if 2 vector fields are provided for 3D data, only (y, x) are updated. The vector field shape, except t, can differ from the original image. When this happens, indexing is done by mapping the node position onto the vector field grid and rounding.

Parameters:

data_config (DataConfig) – Data configuration parameters.
vector_field (Sequence[ArrayLike]) – Vector field arrays. Each array per coordinate or a single (T, D, (Z), Y, X)-array.
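
The two rules above (trailing dimensions have priority, and positions are mapped and rounded when the field is smaller than the image) can be sketched in plain Python. This is an illustration only: the function names are hypothetical, and the exact scaling and rounding convention used by ultrack is an assumption here.

```python
def field_index(position, image_shape, field_shape):
    """Map a position in image space to the nearest index of a resized field."""
    index = []
    for p, n_image, n_field in zip(position, image_shape, field_shape):
        i = round(p * n_field / n_image)  # rescale, then round (assumed rule)
        index.append(min(max(i, 0), n_field - 1))  # stay inside the field
    return tuple(index)


def assign_shifts(n_dims, shifts):
    """Assign the provided shifts to the last dimensions of (z, y, x)."""
    names = ("z", "y", "x")[-n_dims:]
    assigned = {name: 0.0 for name in names}
    # Walk both sequences from the end so trailing dimensions are filled first.
    for name, shift in zip(reversed(names), reversed(shifts)):
        assigned[name] = shift
    return assigned


print(field_index((10, 20), image_shape=(100, 100), field_shape=(50, 50)))
print(assign_shifts(3, [1.5, -2.0]))  # z keeps a zero shift
```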

add_links(sources, targets, weights)¶

Adds user-defined links to the database.

Parameters:

config (MainConfig) – Configuration parameters.
sources (ArrayLike) – Sources (t) node id.
targets (ArrayLike) – Targets (t + 1) node id.
weights (ArrayLike) – Link weights, the higher the weight the more likely the link.

add_nodes_prob(indices, probs)¶

Add nodes’ probabilities to the segmentation/tracking database.

Parameters:

config (MainConfig) – Main configuration parameters.
indices (ArrayLike) – Nodes’ indices database index.
probs (ArrayLike) – Nodes’ probabilities.

export_by_extension(include_parents=True, include_node_ids=True)¶

Exports solution from database to napari tracks layer format.

Parameters:

config (MainConfig) – Configuration parameters.
include_parents (bool) – Flag to include parents track id for each track id.
include_node_ids (bool) – Flag to include node ids for each unit.

Returns:

Tracks dataframe and a lineage graph, mapping node_id -> parent_id.

Return type:

Tuple[pd.DataFrame, Dict[int, List[int]]]

fit_nodes_prob(ground_truth, classifier=None, remove_no_overlap=True, insert_prob=True, persistence_features=False, coord_features=False)¶

Fit a probabilistic classifier to the nodes’ features.

Parameters:

config (MainConfig) – Main configuration parameters.
ground_truth (Union[ArrayLike, pd.Series]) – Ground-truth labels, either:
* a timelapse array of labels (T, (Z), Y, X) with the same shape as the input data;
* a pandas Series indexed by the nodes’ indices.
classifier (ProbabilisticClassifier) – Probabilistic classifier object. Classifier is fit in-place. If not provided, it will use catboost.CatBoostClassifier.
remove_no_overlap (bool, optional) – Whether to remove NO_OVERLAP nodes (-1) from the ground-truth. Classification will compare matched (>0) vs unmatched nodes (0).
insert_prob (bool, optional) – Whether to insert the probabilities to the database, by default True.
persistence_features (bool, optional) – Whether to include persistence features, by default False.
coord_features (bool, optional) – Whether to include coordinate (t, (z), y, x) features, by default False.

Returns:

Fitted probabilistic classifier.

Return type:

ProbabilisticClassifier
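
A minimal sketch of how ground-truth matches could be turned into binary classification targets, following the matched (>0) versus unmatched (0) rule above. The function name is hypothetical, and treating NO_OVERLAP nodes as negatives when they are kept is an assumption, not confirmed behavior.

```python
NO_OVERLAP = -1  # node does not overlap any ground-truth label


def binary_targets(gt_track_ids, remove_no_overlap=True):
    """Build {node_id: is_cell} targets from ground-truth track ids."""
    targets = {}
    for node_id, gt_id in gt_track_ids.items():
        if gt_id == NO_OVERLAP and remove_no_overlap:
            continue  # ambiguous node, excluded from training
        # Matched nodes (>0) are positives; everything else is a negative.
        targets[node_id] = gt_id > 0
    return targets


print(binary_targets({10: 3, 11: 0, 12: -1}))
```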

get_nodes_features(indices=None, include_persistence=False)¶

Creates a pandas dataframe from nodes features defined during segmentation plus area and coordinates.

Parameters:

config (MainConfig) – Configuration parameters.
indices (Optional[ArrayLike], optional) – List of node indices, by default
include_persistence (bool, optional) – Include persistence features, by default False

Returns:

Dataframe with nodes features

Return type:

pd.DataFrame

link(images=(), scale=None, batch_index=None, overwrite=False)¶

Links candidate segments (nodes) with their neighbors in the next time point.

Parameters:

config (MainConfig) – Configuration parameters.
images (Sequence[ArrayLike]) – Optional sequence of images for color space filtering.
scale (Sequence[float]) – Optional scaling for nodes’ distances.
batch_index (Optional[int], optional) – Batch index for processing a subset of nodes, by default everything is processed.
overwrite (bool) – Cleans up linking database content before processing.

match_to_ground_truth(gt_labels, scale=None, track_id_graph=None, is_segmentation=True, optimize_config=False, batch_index=None)¶

Matches nodes to ground-truth labels returning additional features for automatic parameter tuning.

gt_track_id is the ground-truth track ID matched to the node. gt_parent_track_id is the parent ground-truth track ID matched to the node. gt_track_id can be one of:
* -1 means no overlap with ground-truth, therefore it could be a potential segmentation without annotation.
* 0 means blocked by overlap, so we are sure it is not a cell.
* >0 means it is a cell and the value is the ground-truth track ID.

Tolerances for optimal configuration based on ground-truth matches:
* max_distance + 1.0
* min_area * 0.95
* max_area * 1.025
* min_frontier - 0.025

Parameters:

config (MainConfig) – Configuration object.
gt_labels (ArrayLike) – Ground-truth labels.
scale (Optional[ArrayLike], optional) – Scale of the data for distance computation, by default None.
track_id_graph (Optional[Dict[int, int]], optional) – Ground-truth graph of track IDs, by default None.
is_segmentation (bool, optional) – Whether the ground-truth labels are segmentation masks or points, by default True.
optimize_config (bool, optional) – Whether to find optimal configuration based on the ground-truth matches, by default False. If True, it will return the configuration object with updated parameters.
batch_index (Optional[int], optional) – Batch index for processing a subset of frames, by default everything is processed.

Returns:

Data frame containing matched ground-truth labels to their respective nodes. If optimize_config is True, it will return a tuple with the data frame and the updated configuration object.

Return type:

Union[pd.DataFrame, Tuple[pd.DataFrame, MainConfig]]
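
The tolerances listed for optimize_config can be read as simple arithmetic on values observed in the ground-truth matches. The sketch below applies them to four hypothetical observed values; the function name and the dictionary layout are illustrative and not part of ultrack.

```python
def relaxed_parameters(max_distance, min_area, max_area, min_frontier):
    """Apply the documented tolerances to values observed in the matches."""
    return {
        "max_distance": max_distance + 1.0,   # allow slightly longer links
        "min_area": min_area * 0.95,          # accept slightly smaller cells
        "max_area": max_area * 1.025,         # accept slightly larger cells
        "min_frontier": min_frontier - 0.025, # accept slightly weaker contours
    }


print(relaxed_parameters(max_distance=10.0, min_area=100.0, max_area=1000.0, min_frontier=0.1))
```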

predict_nodes_prob(classifier, insert_prob=True, persistence_features=False, coord_features=False)¶

Predicts the probabilities of the nodes’ features.

Parameters:

config (MainConfig) – Main configuration parameters.
classifier (ProbabilisticClassifier) – Probabilistic classifier object.
insert_prob (bool, optional) – Whether to insert the probabilities to the database, by default True.
persistence_features (bool, optional) – Whether to include persistence features, by default False.
coord_features (bool, optional) – Whether to include coordinate (t, (z), y, x) features, by default False.

Returns:

Nodes’ probabilities.

Return type:

pd.Series

segment(contours, config, max_segments_per_time=1000000, batch_index=None, overwrite=False, insertion_throttle_rate=50, images=None, properties=None)¶

Add candidate segmentation (nodes) from foreground and edge to database.

Parameters:

foreground (ArrayLike) – Foreground probability array of shape (T, (Z), Y, X)
contours (ArrayLike) – Contours array of shape (T, (Z), Y, X)
config (MainConfig) – Configuration parameters.
max_segments_per_time (int) – Upper bound of segments per time point.
batch_index (Optional[int], optional) – Batch index for processing a subset of nodes, by default everything is processed.
overwrite (bool) – Cleans up segmentation, linking, and tracking database content before processing.
insertion_throttle_rate (int) – Throttling rate for insertion, by default 50. Only used with non-sqlite databases.
images (Optional[ArrayLike], optional) – Image array of shape (T, (Z), Y, X, (C)) for segments properties, by default None. Channel and Z dimensions are optional.
properties (Optional[List[str]], optional) – List of properties to compute for each segment, see skimage.measure.regionprops documentation.

solve(batch_index=None, overwrite=False, use_annotations=False, use_ground_truth_match=False)¶

Compute tracking by selecting nodes with maximum flow from database.

Parameters:

config (MainConfig) – Configuration parameters.
batch_index (Optional[int], optional) – Batch index for processing a subset of nodes, by default everything is processed.
overwrite (bool, optional) – Resets existing solution before processing.
use_annotations (bool, optional) – Use annotations to fix ILP variables, by default False
use_ground_truth_match (bool) – Fix ILP variables using ground truth matching data, by default False.

to_ctc(config, margin=0, scale=None, first_frame=None, dilation_iters=0, stitch_tracks=False, overwrite=False)¶

Exports tracking results to cell-tracking challenge (http://celltrackingchallenge.net) format.

Parameters:

output_dir (Path) – Output directory to save segmentation masks and lineage graph
config (DataConfig) – Configuration parameters.
scale (Optional[Tuple[float]], optional) – Optional scaling of output segmentation masks, by default None
margin (int) – Margin used to filter out nodes and splitting their tracklets
first_frame (Optional[ArrayLike], optional) – Optional first frame detection mask to select a subset of tracks (e.g. Fluo-N3DL-DRO), by default None
dilation_iters (int) – Iterations of radius 1 morphological dilations on labels, applied after scaling, by default 0.
stitch_tracks (bool, optional) – Stitches (connects) incomplete tracks to nearby tracks at the subsequent time point, by default False
overwrite (bool, optional) – Flag to overwrite existing output_dir content, by default False

to_geff(filename, overwrite=False)¶

Export tracks to a geff (Graph Exchange File Format) file.

Parameters:

config (MainConfig) – The configuration object.
filename (str or Path) – The name of the file to save the tracks to.
overwrite (bool, optional) – Whether to overwrite the file if it already exists, by default False.

Raises:

FileExistsError – If the file already exists and overwrite is False.

to_networkx(children_to_parent=False)¶

Convert napari tracks layer tracks dataframe to networkx directed graph. By default, the edges are the parent to child relationships.

Parameters:

config (MainConfig) – Configuration parameters.
children_to_parent (bool) – If set, edges encode child to parent relationships.

Returns:

Networkx graph.

Return type:

nx.DiGraph

to_pandas(include_parents=True, include_node_ids=True)¶

Exports solution from database to napari tracks layer format.

Parameters:

config (MainConfig) – Configuration parameters.
include_parents (bool) – Flag to include parents track id for each track id.
include_node_ids (bool) – Flag to include node ids for each unit.

Returns:

Tracks dataframe and a lineage graph, mapping node_id -> parent_id.

Return type:

Tuple[pd.DataFrame, Dict[int, List[int]]]

to_trackmate()¶

Convert a pandas DataFrame representation of a napari tracks layer to TrackMate XML format. <ImageData/> needs to be set manually in the output XML.

Parameters: tracks_df (pd.DataFrame) – A DataFrame with columns track_id, id, parent_id, t, z, y, x. Cells that belong to the same track have the same track_id.
Returns: A string representation of the XML in the TrackMate format.
Return type: str

Examples

>>> tracks_df = pd.DataFrame(
...     [[1,0,12.0,49.0,49.0,1000001,-1,-1],
...     [1,1,12.0,49.0,32.0,2000001,-1,1000001],
...     [2,1,12.0,49.0,66.0,2000002,-1,1000001]],
...     columns=['track_id','t','z','y','x','id','parent_track_id','parent_id']
... )
>>> print(tracks_df)
   track_id  t     z     y     x       id  parent_track_id  parent_id
0         1  0  12.0  49.0  49.0  1000001               -1         -1
1         1  1  12.0  49.0  32.0  2000001               -1    1000001
2         2  1  12.0  49.0  66.0  2000002               -1    1000001
>>> tracks_layer_to_trackmate(tracks_df)
<?xml version="1.0" ?>
<TrackMate version="7.11.1">
    <Model spatialunits="pixels" timeunits="frames">
        <AllTracks>
            <Track TRACK_ID="1" NUMBER_SPOTS="2" NUMBER_GAPS="0" TRACK_START="0" TRACK_STOP="1" name="Track_1">
                <Edge SPOT_SOURCE_ID="1000001" SPOT_TARGET_ID="2000001" EDGE_TIME="0.5"/>
            </Track>
            <Track TRACK_ID="2" NUMBER_SPOTS="1" NUMBER_GAPS="0" TRACK_START="1" TRACK_STOP="1" name="Track_2">
                <Edge SPOT_SOURCE_ID="1000001" SPOT_TARGET_ID="2000002" EDGE_TIME="0.5"/>
            </Track>
        </AllTracks>
        <FilteredTracks>
            <TrackID TRACK_ID="1"/>
            <TrackID TRACK_ID="2"/>
        </FilteredTracks>
        <AllSpots>
            <SpotsInFrame frame="0">
                <Spot ID="1000001" QUALITY="1.0" VISIBILITY="1" name="1000001" FRAME="0" RADIUS="5.0" POSITION_X="49.0" POSITION_Y="49.0" POSITION_Z="12.0"/>
            </SpotsInFrame>
            <SpotsInFrame frame="1">
                <Spot ID="2000001" QUALITY="1.0" VISIBILITY="1" name="2000001" FRAME="1" RADIUS="5.0" POSITION_X="32.0" POSITION_Y="49.0" POSITION_Z="12.0"/>
                <Spot ID="2000002" QUALITY="1.0" VISIBILITY="1" name="2000002" FRAME="1" RADIUS="5.0" POSITION_X="66.0" POSITION_Y="49.0" POSITION_Z="12.0"/>
            </SpotsInFrame>
        </AllSpots>
        <FeatureDeclarations>
            ...
        </FeatureDeclarations>
    </Model>
    <Settings>
        <InitialSpotFilter feature="QUALITY" value="0.0" isabove="true"/>
        <SpotFilterCollection/>
        <TrackFilterCollection/>
        <ImageData filename="None" folder="None" width="0" height="0" depth="0" nslices="1" nframes="2" pixelwidth="1.0" pixelheight="1.0" voxeldepth="1.0" timeinterval="1.0"/>
    </Settings>
</TrackMate>

to_tracks_layer(include_parents=True, include_node_ids=True)¶

Exports solution from database to napari tracks layer format.

Parameters:

config (MainConfig) – Configuration parameters.
include_parents (bool) – Flag to include parents track id for each track id.
include_node_ids (bool) – Flag to include node ids for each unit.

Returns:

Tracks dataframe and a lineage graph, mapping node_id -> parent_id.

Return type:

Tuple[pd.DataFrame, Dict[int, List[int]]]

to_zarr(tracks_df, store_or_path=None, chunks=None, overwrite=False)¶

Exports segmentation masks to a zarr array; tracks_df assigns each track_id to its respective segments. By changing the store, this function can be used to write zarr arrays to disk.

Parameters:

config (MainConfig) – Configuration parameters.
tracks_df (pd.DataFrame) – Tracks dataframe, must have track_id column and be indexed by node id.
store_or_path (Optional[StoreLike], optional) – Zarr storage or output path, if not provided a temporary store is used.
chunks (Optional[Tuple[int]], optional) – Chunk size, if not provided it chunks time with 1 and the spatial dimensions as big as possible.
overwrite (bool, optional) – If True, overwrites existing zarr array.

Returns:

Output zarr array.

Return type:

zarr.Array

track(*, labels=None, sigma=None, foreground=None, contours=None, images=(), scale=None, vector_field=None, overwrite='none', segment_kwargs={}, link_kwargs={}, solve_kwargs={})¶

All-in-one function for cell tracking. It accepts multiple inputs (labels or contours) and runs all intermediate steps: computing segmentation hypotheses, linking, and solving the ILP. The results must be queried using the export function of preference.

Note: Either labels or foreground and contours can be used as input, but not both.

Parameters:

config (MainConfig) – Tracking configuration parameters.
labels (Optional[ArrayLike], optional) – Segmentation labels of shape (T, (Z), Y, X), by default None
sigma (Optional[Union[Sequence[float], float]], optional) – Edge smoothing parameter (gaussian blur) for labels to contours conversion, by default None
foreground (Optional[ArrayLike], optional) – Foreground probability array of shape (T, (Z), Y, X), by default None
contours (Optional[ArrayLike], optional) – Contours array of shape (T, (Z), Y, X), by default None
images (Sequence[ArrayLike]) – Optional sequence of images (T, (Z), Y, X) for color space filtering.
scale (Sequence[float]) – Optional scaling for nodes’ distances.
vector_field (Union[ArrayLike, Sequence[ArrayLike]]) – Vector field arrays. Each array per coordinate or a single (T, D, (Z), Y, X)-array.
overwrite (Literal["all", "links", "solutions", "none"], optional) – Clears the corresponding data from the database; by default (“none”) nothing is overwritten. When not “none”, only the cleared and subsequent parts of the pipeline are executed.
segment_kwargs (Dict[str, Any]) – Optional keyword arguments for segmentation. See ultrack.segment for more details.
link_kwargs (Dict[str, Any]) – Optional keyword arguments for linking. See ultrack.link for more details.
solve_kwargs (Dict[str, Any]) – Optional keyword arguments for ILP solving. See ultrack.solve for more details.
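
Two rules from this entry can be sketched in plain Python: the inputs are either labels or foreground plus contours, and a non-"none" overwrite value reruns the cleared step and everything after it. The names below are hypothetical and the checks are an illustration, not ultrack's own validation. The "none" case is deliberately not modeled because the text does not say which steps run then.

```python
PIPELINE = ("segment", "link", "solve")
# First pipeline step whose data is cleared for each overwrite value (assumed).
FIRST_CLEARED_STEP = {"all": "segment", "links": "link", "solutions": "solve"}


def check_inputs(labels=None, foreground=None, contours=None):
    """Return which input mode is used, rejecting mixed or missing inputs."""
    has_labels = labels is not None
    has_maps = foreground is not None or contours is not None
    if has_labels and has_maps:
        raise ValueError("use `labels` or `foreground` and `contours`, not both")
    if not has_labels and (foreground is None or contours is None):
        raise ValueError("provide `labels`, or both `foreground` and `contours`")
    return "labels" if has_labels else "foreground+contours"


def steps_after_clearing(overwrite):
    """Steps executed when part of the database is cleared."""
    if overwrite not in FIRST_CLEARED_STEP:
        raise ValueError(f"unsupported overwrite value: {overwrite!r}")
    start = PIPELINE.index(FIRST_CLEARED_STEP[overwrite])
    return PIPELINE[start:]


print(steps_after_clearing("links"))
```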

ultrack.add_flow(config, vector_field)¶

Parameters:

data_config (DataConfig) – Data configuration parameters.
vector_field (Sequence[ArrayLike]) – Vector field arrays. Each array per coordinate or a single (T, D, (Z), Y, X)-array.

ultrack.add_new_node(config, time, mask, bbox=None, index=None, include_overlaps=True)¶

Adds a new node to the database.

NOTE: this is not taking node shifts or image features (color) into account.

Parameters:

config (MainConfig) – Ultrack configuration parameters.
time (int) – Time point of the node.
mask (ArrayLike) – Binary mask of the node.
bbox (Optional[ArrayLike], optional) – Bounding box of the node, (min_0, min_1, …, max_0, max_1, …). When provided it assumes the mask is a crop of the original image, by default None
index (Optional[int], optional) – Node index, otherwise it is automatically generated, and returned.
include_overlaps (bool, optional) – Include overlaps in the database, by default True. When False, it will allow occlusions between the new node and existing nodes.

Returns:

New node index.

Return type:

int

ultrack.export_tracks_by_extension(config, filename, overwrite=False)¶

Export tracks to a file given the file extension.

Supported file extensions are .xml, .csv, .zarr, .parquet, .dot, .json, and .geff:
- .xml exports to a TrackMate compatible XML file.
- .csv exports to a CSV file.
- .parquet exports to a Parquet file.
- .zarr exports the tracks to dense segments in a zarr array format.
- .geff exports the tracks to a zarr format using the geff standard.
- .dot exports to a Graphviz DOT file.
- .json exports to a networkx JSON file.

Parameters:

filename (str or Path) – The name of the file to save the tracks to.
config (MainConfig) – The configuration object.
overwrite (bool, optional) – Whether to overwrite the file if it already exists, by default False.
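
The extension-based dispatch described above can be pictured as a lookup on the file suffix. The sketch below only maps a filename to a description of the output format; the table, the function name, and the case-insensitive match are assumptions for illustration and do not call ultrack.

```python
from pathlib import Path

# Suffix -> description, copied from the list of supported extensions.
EXPORT_FORMATS = {
    ".xml": "TrackMate compatible XML file",
    ".csv": "CSV file",
    ".parquet": "Parquet file",
    ".zarr": "dense segments in a zarr array",
    ".geff": "zarr format using the geff standard",
    ".dot": "Graphviz DOT file",
    ".json": "networkx JSON file",
}


def describe_export(filename):
    """Return the export format selected by the filename's extension."""
    suffix = Path(filename).suffix.lower()  # case-insensitivity is assumed
    if suffix not in EXPORT_FORMATS:
        raise ValueError(f"unsupported extension: {suffix!r}")
    return EXPORT_FORMATS[suffix]


print(describe_export("results/tracks.xml"))
```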

Array utilities¶

ultrack.utils.array.array_apply(*in_arrays, func, out_array=None, axis=0, out_zarr_kwargs={}, **kwargs)¶

Apply a function over a given dimension of an array.

Parameters:

in_arrays (ArrayLike) – Arrays to apply function to.
func (function) – Function to apply over time.
out_array (ArrayLike, optional) – Array to store the result of the function; if not provided, a new array is created, by default None. See create_zarr for more information.
axis (Union[Tuple[int], int], optional) – Axis of data to apply func, by default 0.
args (tuple) – Positional arguments to pass to func.
out_zarr_kwargs (Dict[str, Any], optional) – Keyword arguments to pass to create_zarr. If dtype and shape are not provided, they are inferred from the first input array.
**kwargs – Keyword arguments to pass to func.

Returns:

out_array or new array with result of function.

Return type:

zarr.Array
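
Conceptually, applying a function over the leading axis means calling it once per time point on the matching slice of every input. The sketch below shows that idea with nested lists; it is a simplified stand-in with a hypothetical name that ignores the `axis`, `out_array`, and zarr handling of the real function.

```python
def apply_over_first_axis(*in_arrays, func, **kwargs):
    """Call func on matching slices along the first axis and collect results."""
    lengths = {len(array) for array in in_arrays}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length along axis 0")
    # zip(*in_arrays) yields one tuple of slices per time point.
    return [func(*frames, **kwargs) for frames in zip(*in_arrays)]


def scaled_sum(a, b, scale):
    """Example per-frame function: elementwise (a + b) * scale."""
    return [scale * (x + y) for x, y in zip(a, b)]


result = apply_over_first_axis(
    [[1, 2], [3, 4]], [[10, 20], [30, 40]], func=scaled_sum, scale=2
)
print(result)
```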

ultrack.utils.array.assert_same_length(**kwargs)¶: Validates that keyword arguments have the same length.

ultrack.utils.array.check_array_chunk(array)¶: Checks if chunked array has chunk size of 1 on time dimension.

ultrack.utils.array.create_zarr(shape, dtype, store_or_path=None, overwrite=False, default_store_type=None, chunks=None, **kwargs)¶

Create a zarr array of zeros.

Parameters:

shape (Tuple[int, ]) – Shape of the array.
dtype (np.dtype) – Data type of the array.
store_or_path (Optional[StoreLike], optional) – Path to store the array, if None a zarr.storage.MemoryStore is used, by default None
overwrite (bool, optional) – Overwrite existing file, by default False
chunks (Optional[Tuple[int]], optional) – Chunk size, if not provided it chunks time with 1 and the spatial dimensions as big as possible.

Returns:

Zarr array of zeros.

Return type:

zarr.Array

ultrack.utils.array.large_chunk_size(shape, dtype, max_size=2147483647)¶

Computes a large chunk size for a given shape and dtype. Large chunks improve performance on Elastic Storage Systems (ESS). The leading dimension (time) will always be chunked as 1.

Parameters:

shape (Tuple[int]) – Input data shape.
dtype (Union[str, np.dtype]) – Input data type.
max_size (int, optional) – Reference maximum size, by default 2147483647

Returns:

Suggested chunk size.

Return type:

Tuple[int]
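
One way to picture the chunking rule is to keep whole (Y, X) planes and stack as many Z slices as fit under the size budget, with time chunked as 1. The sketch below follows that idea. The strategy, the function name, and the use of a byte `itemsize` instead of a dtype are assumptions, so the actual function may choose different chunks.

```python
def sketch_large_chunk_size(shape, itemsize, max_size=2147483647):
    """Suggest chunks: time chunked as 1, spatial planes kept whole."""
    if len(shape) < 3:
        raise ValueError("expected at least (T, Y, X)")
    height, width = shape[-2:]
    if len(shape) == 3:
        return (1, height, width)
    plane_bytes = height * width * itemsize
    # As many Z slices as fit in the budget, at least 1, at most all of them.
    depth = min(shape[-3], max(1, max_size // plane_bytes))
    # Any remaining leading dimensions are chunked as 1 (assumed).
    return (1,) * (len(shape) - 3) + (depth, height, width)


# 16-bit volume with a deliberately small 1 MiB budget.
print(sketch_large_chunk_size((10, 64, 512, 512), itemsize=2, max_size=2**20))
```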

ultrack.utils.array.validate_and_overwrite_path(path, overwrite, msg_type)¶: Validates the path (or directory): raises an error if it already exists, or overwrites it if requested.

Image processing utilities¶

class ultrack.imgproc.PlantSeg(model_name, model_update=False, device=None, patch=(80, 160, 160), stride_ratio=0.75, batch_size=None, preprocess_sigma=None, postprocess_sigma=None, scale_factor=None)¶

A class for performing boundary detection using the Plant-Seg model. See the Plant-Seg documentation for more details: https://github.com/hci-unihd/plant-seg

Parameters:

model_name (str) – Name of the pre-trained segmentation model.
model_update (bool, optional) – Update the model if True. Default is False.
device ({str, torch.device}, optional) – Device for model execution. If None, the default device is used. Default is None.
patch (tuple[int], optional) – Patch size for model inference. Default is (80, 160, 160).
stride_ratio (float, optional) – Stride ratio for patch sampling. Default is 0.75.
batch_size (int, optional) – Batch size for inference. Default is None.
preprocess_sigma (float, optional) – Sigma value for Gaussian preprocessing filter. Default is None.
postprocess_sigma (float, optional) – Sigma value for Gaussian postprocessing filter. Default is None.
scale_factor (tuple[float], optional) – Scaling factors for input images. Default is None.

Flow¶

ultrack.imgproc.flow.add_flow(config, vector_field)¶

Parameters:

data_config (DataConfig) – Data configuration parameters.
vector_field (Sequence[ArrayLike]) – Vector field arrays. Each array per coordinate or a single (T, D, (Z), Y, X)-array.

ultrack.imgproc.flow.advenct_field(field, sources, shape=None, invert=True)¶

Advects points from sources through the provided field. Shape indicates the original shape (space) of the sources, which is useful when the field is downscaled from the original space.

Parameters:

field (ArrayLike) – Field array with shape T x D x (Z) x Y x X
sources (th.Tensor) – Array of sources N x D
shape (tuple[int, ]) – When provided scales field accordingly, D-dimensional tuple.
invert (bool) – When true flow is multiplied by -1, resulting in reversal of the flow.

Returns:

Trajectories of sources N x T x D

Return type:

th.Tensor
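
The idea of advecting points through a field can be sketched with nested lists in 2D: at each time point a source reads the shift stored at its nearest grid position and moves by it, with the sign flipped when `invert` is true. The nearest-neighbor lookup, the clamping at the borders, and keeping the starting position as the first trajectory entry are assumptions of this sketch. The real function works on tensors and may interpolate.

```python
def advect(field, sources, invert=True):
    """Advect 2D points; field[t][d][y][x] is the shift along axis d at time t."""
    sign = -1.0 if invert else 1.0
    n_times = len(field)
    height, width = len(field[0][0]), len(field[0][0][0])
    trajectories = []
    for y, x in sources:
        path = [(float(y), float(x))]  # trajectory starts at the source
        for t in range(n_times - 1):
            # Nearest grid position, clamped to the field bounds.
            iy = min(max(int(round(y)), 0), height - 1)
            ix = min(max(int(round(x)), 0), width - 1)
            y = y + sign * field[t][0][iy][ix]
            x = x + sign * field[t][1][iy][ix]
            path.append((float(y), float(x)))
        trajectories.append(path)
    return trajectories  # N x T x D nested list


# Constant field over 3 time points: shift of +1 along y, 0 along x.
field = [[[[1.0] * 4 for _ in range(4)], [[0.0] * 4 for _ in range(4)]] for _ in range(3)]
print(advect(field, [(0, 0)], invert=False))
```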

ultrack.imgproc.flow.advenct_field_from_labels(field, label, invert=True)¶

Advects points from the centroids of segmentation labels.

Parameters:

field (ArrayLike) – Field array with shape T x D x (Z) x Y x X
label (ArrayLike) – Label image.
invert (bool) – When true flow is multiplied by -1, resulting in reversal of the flow.

Returns:

Trajectories of sources N x T x D

Return type:

ArrayLike

ultrack.imgproc.flow.advenct_from_quasi_random(field, img_shape, n_samples, invert=True, device=None)¶

Advects points sampled from a quasi-random uniform distribution.

Parameters:

field (ArrayLike) – Field array with shape T x D x (Z) x Y x X
img_shape (Tuple[int, ]) – Must be D-dimensional.
n_samples (int) – Number of samples.
invert (bool) – When true flow is multiplied by -1, resulting in reversal of the flow.
device (Optional[th.device]) – Torch device, by default uses last GPU if available or mps.

Returns:

Trajectories of sources N x T x D

Return type:

ArrayLike

ultrack.imgproc.flow.apply_field(field, image)¶

Transform image using vector field. Image will be scaled to the field size.

Parameters:

field (th.Tensor) – Vector field (D, z, y, x)
image (th.Tensor) – Original image used to compute the vector field.

Returns:

Transformed image (z, y, x)

Return type:

th.Tensor

ultrack.imgproc.flow.flow_field(source, target, im_factor=4, grid_factor=4, num_iterations=1000, lr=0.01, n_scales=3, init_grid=None)¶

Compute the flow vector field T that minimizes the mean squared error between T(source) and target.

Parameters:

source (torch.Tensor) – Source image (C, Z, Y, X).
target (torch.Tensor) – Target image (C, Z, Y, X).
im_factor (int, optional) – Image space down scaling factor, by default 4.
grid_factor (int, optional) – Grid space down scaling factor, by default 4. Grid dimensions will be divided by both im_factor and grid_factor.
num_iterations (int, optional) – Number of gradient descent iterations, by default 1000.
lr (float, optional) – Learning rate (gradient descent step), by default 1e-2
n_scales (int, optional) – Number of scales used for multi-scale optimization, by default 3.

Returns:

Vector field array with shape (D, (Z / factor), Y / factor, X / factor)

Return type:

torch.Tensor

ultrack.imgproc.flow.identity_grid(shape)¶

Grid equivalent to an identity vector field (no flow).

Parameters: shape (tuple[int, ]) – Grid shape.
Returns: Tensor of shape (Z, Y, X, D)
Return type: th.Tensor

ultrack.imgproc.flow.timelapse_flow(images, store_or_path=None, chunks=None, channel_axis=None, im_factor=4, grid_factor=4, num_iterations=1000, lr=0.01, n_scales=3, device=None)¶

Compute vector field from timelapse.

Parameters:

images (ArrayLike) – Timelapse images shape as (T, …).
store_or_path (Optional[StoreLike], optional) – Zarr storage or output path, if not provided a temporary store is used.
chunks (Optional[Tuple[int]], optional) – Chunk size, if not provided it chunks time with 1 and the spatial dimensions as big as possible.
channel_axis (Optional[int], optional) – Channel axis EXCLUDING TIME (first axis), e.g (T, C, Y, X) would have channel_axis=0. If not provided assumes first axis after time.
im_factor (int, optional) – Image space down scaling factor, by default 4.
grid_factor (int, optional) – Grid space down scaling factor, by default 4. Grid dimensions will be divided by both im_factor and grid_factor.
num_iterations (int, optional) – Number of gradient descent iterations, by default 1000.
lr (float, optional) – Learning rate (gradient descent step), by default 1e-2
n_scales (int, optional) – Number of scales used for multi-scale optimization, by default 3.
device (Optional[th.device], optional) – Torch device, by default uses last GPU if available or mps.

Returns:

Vector field array with shape (T, D, (Z), Y, X).

Return type:

zarr.Array

ultrack.imgproc.flow.trajectories_to_tracks(trajectories)¶

Converts trajectories to napari tracks format.

Parameters: trajectories (th.Tensor) – Input N x T x D trajectories.
Returns: Napari tracks (N x T) x (2 + D) array.
Return type: np.ndarray
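
The shape change from N x T x D to (N x T) x (2 + D) amounts to flattening the trajectories into rows of the form [track id, time, coordinates]. A plain-Python sketch of that layout follows; the function name is hypothetical and numbering the track ids from 1 is an assumption.

```python
def trajectories_to_rows(trajectories):
    """Flatten N x T x D trajectories into (N * T) rows of [id, t, *coords]."""
    rows = []
    for track_id, path in enumerate(trajectories, start=1):  # ids from 1 (assumed)
        for t, point in enumerate(path):
            rows.append([track_id, t, *point])
    return rows


rows = trajectories_to_rows([[(0.0, 1.0), (2.0, 3.0)], [(5.0, 5.0), (6.0, 6.0)]])
print(rows)
```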

Tracks utilities¶

ultrack.tracks.add_track_ids_to_tracks_df(df)¶

Adds track_id and parent_track_id columns to forest df. Each maximal path receives a unique track_id.

Parameters: df (pd.DataFrame) – Forest defined by the parent_id column and the dataframe indices.
Returns: Inplace modified input dataframe with additional columns.
Return type: pd.DataFrame
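
To make "each maximal path receives a unique track_id" concrete, the sketch below labels a small forest given as a node -> parent mapping, with -1 marking roots. A node continues its parent's track when it is an only child, while roots and the children of a division start new tracks. This reading of "maximal path", the id numbering, and the function name are assumptions of the illustration.

```python
def assign_track_ids(parent_of):
    """Label a forest {node: parent} (-1 for roots) with track ids."""
    children = {}
    for node, parent in parent_of.items():
        children.setdefault(parent, []).append(node)

    track_id = {}         # node -> track id
    parent_track_id = {}  # track id -> parent track id (-1 for roots)
    next_id = 1
    pending = sorted(children.get(-1, []))  # breadth-first from the roots
    while pending:
        node = pending.pop(0)
        parent = parent_of[node]
        if parent != -1 and len(children[parent]) == 1:
            track_id[node] = track_id[parent]  # same path, same track
        else:
            track_id[node] = next_id  # root or division starts a new track
            parent_track_id[next_id] = track_id.get(parent, -1)
            next_id += 1
        pending.extend(sorted(children.get(node, [])))
    return track_id, parent_track_id


# Node 2 divides into nodes 3 and 4; node 5 continues node 4.
ids, parents = assign_track_ids({1: -1, 2: 1, 3: 2, 4: 2, 5: 4})
print(ids, parents)
```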

ultrack.tracks.close_tracks_gaps(tracks_df, max_gap, max_radius, spatial_columns=['z', 'y', 'x'], scale=None, segments=None, segments_store_or_path=None, overwrite=False)¶

Close gaps between tracklets in the given DataFrame.

Parameters:

tracks_df (pd.DataFrame) – The DataFrame containing the tracks information.
max_gap (int) – The maximum gap size to close.
max_radius (float) – The maximum distance between the end of one tracklet and the start of the next tracklet.
spatial_columns (List[str]) – The names of the columns containing the spatial information.
scale (Optional[ArrayLike]) – The scaling factors for the spatial columns.
segments (Optional[ArrayLike]) – When provided, the function will update the segments labels to match the tracks.
segments_store_or_path (Union[Store, Path, str, None]) – The store or path to save the updated segments, if not provided in memory store is used.
overwrite (bool) – If True, overwrites the segments store if it already exists.

Returns:

The DataFrame containing the tracks information with the gaps closed.

Return type:

Union[pd.DataFrame, Tuple[pd.DataFrame, ArrayLike]]
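
The roles of max_gap, max_radius and scale can be illustrated with a small pure-Python sketch that only finds which tracklets would be joined. It is not ultrack's algorithm. Several things are assumptions made for this example: the names `TrackletTip`, `scaled_distance` and `find_gap_links`, the greedy nearest-candidate matching, and the convention that a gap is the number of missing frames between the end of one tracklet and the start of the next.

```python
import math
from typing import Dict, NamedTuple, Optional, Sequence, Tuple


class TrackletTip(NamedTuple):
    """First or last detection of a tracklet (hypothetical helper type)."""
    track_id: int
    t: int
    pos: Tuple[float, ...]


def scaled_distance(a: Sequence[float], b: Sequence[float],
                    scale: Optional[Sequence[float]] = None) -> float:
    """Euclidean distance after multiplying each axis by its scale factor."""
    scale = scale or [1.0] * len(a)
    return math.dist([s * x for s, x in zip(scale, a)],
                     [s * x for s, x in zip(scale, b)])


def find_gap_links(ends, starts, max_gap, max_radius, scale=None) -> Dict[int, int]:
    """Greedily map each ending tracklet to the closest tracklet starting shortly after."""
    links: Dict[int, int] = {}
    taken = set()
    for end in ends:
        best, best_dist = None, max_radius
        for start in starts:
            if start.track_id == end.track_id or start.track_id in taken:
                continue
            missing_frames = start.t - end.t - 1  # assumed definition of the gap size
            if not 1 <= missing_frames <= max_gap:
                continue
            dist = scaled_distance(end.pos, start.pos, scale)
            if dist <= best_dist:
                best, best_dist = start, dist
        if best is not None:
            links[end.track_id] = best.track_id
            taken.add(best.track_id)
    return links


ends = [TrackletTip(1, 3, (0.0, 10.0, 10.0))]
starts = [
    TrackletTip(2, 5, (0.0, 11.0, 10.0)),  # one missing frame, distance 1
    TrackletTip(3, 5, (0.0, 30.0, 30.0)),  # too far away
    TrackletTip(4, 9, (0.0, 10.0, 10.0)),  # gap too long
]
links = find_gap_links(ends, starts, max_gap=2, max_radius=5.0)
print(links)
```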

ultrack.tracks.filter_short_sibling_tracks(tracks_df, min_length, segments=None, segments_store_or_path=None, overwrite=False)¶

Filters out short tracks created by spurious divisions, i.e. sibling tracks shorter than min_length.

This function traverses the graph bottom-up and, at each division, removes the track that is shorter than min_length, merging the remaining sibling track with its parent.

If both siblings are shorter than min_length, neither is removed.

Parameters:

tracks_df (pd.DataFrame) –

DataFrame containing track information with columns:
“track_id”: Unique identifier for each track.
“parent_track_id”: Identifier of the parent track in the forest.
(Other columns may be present in the DataFrame but are not used in this function.)
min_length (int) – Minimum track length, below this value the track is removed.
segments (Optional[ArrayLike]) – Segmentation array to update the tracks.
segments_store_or_path (Union[Store, Path, str, None]) – Store or path to save the new segments.
overwrite (bool) – If True, overwrite the existing segmentation array.

Returns:

If segments is None, returns the modified tracks dataframe. If segments is provided, returns the modified tracks dataframe and the updated segments.

Return type:

Union[pd.DataFrame, Tuple[pd.DataFrame, ArrayLike]]

ultrack.tracks.get_paths_to_roots(tracks_df, graph=None, *, node_index=None, track_index=None)¶

Returns paths from node_index or track_index to roots. If node_index and track_index are None, returns all paths to roots.

Parameters:

tracks_df (pd.DataFrame) –

DataFrame containing track information with columns:
“track_id”: Unique identifier for each track.
“parent_track_id”: Identifier of the parent track in the forest.
(Other columns may be present in the DataFrame but are not used in this function.)
graph (Optional[Dict[int, int]], optional) – Inverted forest graph, if not provided it will be computed from tracks_df.
node_index (Optional[int], optional) – Node (dataframe) index to compute path to roots.
track_index (Optional[int], optional) – Track index (track_id column value) to compute path to roots.

Returns:

DataFrame containing paths to roots.

Return type:

pd.DataFrame
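
The core of a path-to-root lookup on an inverted forest can be sketched as follows. This is a simplified illustration under stated assumptions, not ultrack's implementation: it returns a plain list of track ids instead of a dataframe, and `path_to_root` and `NO_PARENT` (value -1) are hypothetical names.

```python
from typing import Dict, List

NO_PARENT = -1  # assumed sentinel for "no parent"; not taken from ultrack


def path_to_root(track_id: int, inv_graph: Dict[int, int]) -> List[int]:
    """Walk child -> parent links until a track without a parent is reached."""
    path = [track_id]
    while inv_graph.get(path[-1], NO_PARENT) != NO_PARENT:
        path.append(inv_graph[path[-1]])
    return path


# Inverted forest: inv_graph[child_id] = parent_id
inv_graph = {2: 1, 3: 1, 4: 2, 5: 2}
path = path_to_root(5, inv_graph)
print(path)
```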

ultrack.tracks.get_subgraph(tracks_df, track_ids)¶

Get a subgraph from a forest of tracks represented as a DataFrame.

Parameters:

tracks_df (pd.DataFrame) –

DataFrame containing track information with columns:
“track_id”: Unique identifier for each track.
“parent_track_id”: Identifier of the parent track in the forest.
(Other columns may be present in the DataFrame but are not used in this function.)
track_ids (ArrayLike) – An array-like object containing the track IDs for which to extract the subgraph.

Returns:

A DataFrame containing the subgraph of tracks corresponding to the input track IDs.

Return type:

pd.DataFrame

Examples

>>> subgraph_df = get_subgraph(tracks_df, [3, 7, 10])

Notes

The input DataFrame ‘tracks_df’ should have at least two columns: “track_id” and “parent_track_id”, where “track_id” represents the unique identifier for each track, and “parent_track_id” represents the identifier of the parent track in the forest.

ultrack.tracks.inv_tracks_df_forest(df)¶

Returns the leaves-to-root inverted forest (set of trees) graph structure built from the track_id and parent_track_id columns.

Example: forest[child_id] = parent_id
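
A child-to-parent mapping of this shape could be built from a tracks dataframe roughly as follows. This is a sketch with pandas, not ultrack's code. The helper name `inverted_forest`, the `NO_PARENT = -1` sentinel, and the choice to leave root tracks out of the mapping are all assumptions.

```python
import pandas as pd

NO_PARENT = -1  # assumed sentinel for "no parent"; not taken from ultrack


def inverted_forest(df: pd.DataFrame) -> dict:
    """Build forest[child_id] = parent_id from the track_id / parent_track_id columns."""
    pairs = df[["track_id", "parent_track_id"]].drop_duplicates()
    pairs = pairs[pairs["parent_track_id"] != NO_PARENT]  # roots are skipped (assumption)
    return dict(zip(pairs["track_id"].tolist(), pairs["parent_track_id"].tolist()))


# One row per detection; a track spanning several time points repeats its ids.
df = pd.DataFrame({
    "track_id":        [1, 1, 2, 2, 3],
    "parent_track_id": [-1, -1, 1, 1, 1],
})
forest = inverted_forest(df)
print(forest)
```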

ultrack.tracks.left_first_search(track_id, graph)¶

Perform a left-first traversal on a binary tree represented as a graph and return a list of track IDs in the order they are visited during the traversal.

Parameters:

track_id (int) – The ID of the track to start the traversal from.
graph (Dict[int, List[int]]) – The graph representing the binary tree. It is a dictionary where the keys are track IDs, and the values are lists of two child track IDs. Each non-leaf node of the binary tree must have exactly two children.

Returns:

A list of track IDs visited during the left-first traversal of the binary tree, with track_id being the starting point.

Return type:

List[int]

Example

>>> graph = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: None, 5: None, 6: None, 7: None}
>>> result = left_first_search(1, graph)
>>> print(result)
[4, 2, 5, 1, 6, 3, 7]
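
The visiting order in this example is that of an in-order traversal: left subtree, then the node itself, then the right subtree. A minimal recursive sketch that reproduces it is shown below. The function name `in_order` is hypothetical and ultrack's implementation may differ, for instance by avoiding recursion.

```python
from typing import Dict, List, Optional


def in_order(track_id: int, graph: Dict[int, Optional[List[int]]]) -> List[int]:
    """Visit the left subtree, then the node itself, then the right subtree."""
    children = graph.get(track_id)
    if not children:  # leaf: missing key, None, or empty list
        return [track_id]
    left, right = children
    return in_order(left, graph) + [track_id] + in_order(right, graph)


graph = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: None, 5: None, 6: None, 7: None}
order = in_order(1, graph)
print(order)  # [4, 2, 5, 1, 6, 3, 7]
```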

ultrack.tracks.sort_track_ids(tracks_df)¶

Sort track IDs in a given DataFrame representing tracks in a way that maintains the left-first order of the binary tree formed by their parent-child relationships.

Parameters:

tracks_df (pd.DataFrame) – A DataFrame containing information about tracks, where each row represents a track and contains at least two columns: “track_id” and “parent_track_id”. The “track_id” column holds unique track IDs, and the “parent_track_id” column contains the parent track ID of each track. The DataFrame should have a consistent parent-child relationship, forming one or multiple binary trees.

Returns:

A NumPy array containing the sorted track IDs based on the left-first traversal of the binary trees formed by the parent-child relationships.

Return type:

np.ndarray

Example

>>> import pandas as pd
>>> import numpy as np
>>> data = {
...     "track_id": [1, 2, 3, 4, 5, 6, 7],
...     "parent_track_id": [None, 1, 1, 2, 2, 3, 3],
... }
>>> tracks_df = pd.DataFrame(data)
>>> sorted_track_ids = sort_track_ids(tracks_df)
>>> print(sorted_track_ids)
[4 2 5 1 6 3 7]

ultrack.tracks.sort_trees_by_length(df, graph=None)¶

Sorts trees from the track graph by length (deepest tree path).

Parameters:

df (pd.DataFrame) – Tracks dataframe.
graph (Dict[int, int], optional) – Child -> parent tracks graph. If not provided, it is computed from the dataframe, which must then have track_id and parent_track_id columns.

Returns:

Sorted list of tracks dataframes.

Return type:

List[pd.DataFrame]

ultrack.tracks.sort_trees_by_max_radius(df, scale=None, metric='euclidean')¶

Sorts trees from the track graph by radius (distance between nodes at the same time point).

Parameters:

df (pd.DataFrame) – tracks dataframe.
scale (ArrayLike, optional) – Spatial domain scale.
metric (Union[Callable, str], optional) – Distance metric, see scipy.spatial.distance.pdist for more information.

Returns:

Sorted list of tracks dataframes.

Return type:

List[pd.DataFrame]

ultrack.tracks.split_tracks_df_by_lineage(tracks_df)¶

Split tracks dataframe into a list of dataframes, one for each lineage, sorted by the root track id.

Parameters:

tracks_df (pd.DataFrame) –

Tracks dataframe with columns:
“track_id”: Unique identifier for each track.
“parent_track_id”: Identifier of the parent track in the forest.
(Other columns may be present in the DataFrame but are not used in this function.)

Returns:

List of dataframes, one for each lineage.

Return type:

List[pd.DataFrame]

ultrack.tracks.split_trees(tracks_df)¶

Split tracks forest into trees.

Parameters:

tracks_df (pd.DataFrame) –

DataFrame containing track information with columns:
“track_id”: Unique identifier for each track.
“parent_track_id”: Identifier of the parent track in the forest.
(Other columns may be present in the DataFrame but are not used in this function.)

Returns:

List of dataframes, each representing a tree.

Return type:

List[pd.DataFrame]
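
Conceptually, splitting a forest into trees amounts to grouping tracks by the root they descend from. The sketch below shows that grouping on a plain child-to-parent dictionary. It is an illustration only: the helper name `group_tracks_by_root` is hypothetical, and it returns lists of track ids rather than dataframes.

```python
from collections import defaultdict
from typing import Dict, List


def group_tracks_by_root(inv_graph: Dict[int, int], track_ids: List[int]) -> Dict[int, List[int]]:
    """Map each root track id to the track ids belonging to its tree."""
    trees: Dict[int, List[int]] = defaultdict(list)
    for track_id in track_ids:
        root = track_id
        while root in inv_graph:  # climb child -> parent links up to the root
            root = inv_graph[root]
        trees[root].append(track_id)
    return dict(trees)


# Two lineages: tracks 1, 2, 3 share root 1; tracks 4, 5 share root 4.
inv_graph = {2: 1, 3: 1, 5: 4}
trees = group_tracks_by_root(inv_graph, [1, 2, 3, 4, 5])
print(trees)
```

With a tracks dataframe at hand, each group could then be selected with something like `tracks_df[tracks_df["track_id"].isin(ids)]`.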

ultrack.tracks.tracks_df_forest(df, remove_roots=False, numba_dict=False)¶

Creates the forest graph of track lineages.

Example: forest[parent_id] = [child_id_0, child_id_1]

Parameters:

df (pd.DataFrame) – Tracks dataframe.
remove_roots (bool) – If True, removes root nodes (nodes with no parent).
numba_dict (bool) – If True, returns a numba typed dictionary.

Returns:

Forest graph where parent maps to their children (parent -> children)

Return type:

Dict[int, List[int]]
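
The parent-to-children structure is the inverse of a child-to-parent mapping. A minimal sketch of that inversion, using plain dictionaries and the hypothetical helper name `children_forest`, could look like this; it does not cover the remove_roots or numba_dict options.

```python
from typing import Dict, List


def children_forest(inv_graph: Dict[int, int]) -> Dict[int, List[int]]:
    """Invert child -> parent links into forest[parent_id] = [child_id_0, child_id_1]."""
    forest: Dict[int, List[int]] = {}
    for child, parent in sorted(inv_graph.items()):
        forest.setdefault(parent, []).append(child)
    return forest


forest = children_forest({2: 1, 3: 1, 4: 2, 5: 2})
print(forest)
```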

ultrack.tracks.tracks_df_movement(tracks_df, lag=1, cols=None)¶

Compute the displacement for track data across given time lags.

This function computes the displacement (difference) for track coordinates across the specified lag periods.

NOTE: this sorts the dataframe by [“track_id”, “t”].

Parameters:

tracks_df (pd.DataFrame) – Dataframe containing track data. It is expected to have columns [“track_id”, “t”] and any of [“z”, “y”, “x”] representing the 3D coordinates.
lag (int, optional) – Number of periods to compute the difference over. Default is 1.
cols (tuple[str, ], optional) – Columns to compute the displacement for. If not provided, it will try to find any of [“z”, “y”, “x”] columns in the dataframe and use them.

Returns:

Dataframe of the displacement (difference) of coordinates for the given lag. Displacements for the first row of each track_id will be set to zero.

Return type:

pd.DataFrame

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({
...     "track_id": [1, 1, 2, 2],
...     "t": [1, 2, 1, 2],
...     "z": [0, 1, 0, 2],
...     "y": [1, 2, 1, 2],
...     "x": [2, 3, 2, 2]
... })
>>> print(tracks_df_movement(df))
     z    y    x
0  0.0  0.0  0.0
1  1.0  1.0  1.0
2  0.0  0.0  0.0
3  2.0  1.0  0.0
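
The same numbers can be reproduced with a per-track difference in pandas. This is a sketch of the idea, not ultrack's implementation; the helper name `displacement` is hypothetical, and it sorts a copy rather than the input dataframe.

```python
import pandas as pd


def displacement(tracks_df: pd.DataFrame, lag: int = 1, cols=("z", "y", "x")) -> pd.DataFrame:
    """Per-track coordinate difference over `lag` rows; the first rows of each track become 0."""
    cols = [c for c in cols if c in tracks_df.columns]
    ordered = tracks_df.sort_values(["track_id", "t"])
    return ordered.groupby("track_id")[cols].diff(periods=lag).fillna(0.0)


df = pd.DataFrame({
    "track_id": [1, 1, 2, 2],
    "t": [1, 2, 1, 2],
    "z": [0, 1, 0, 2],
    "y": [1, 2, 1, 2],
    "x": [2, 3, 2, 2],
})
moved = displacement(df)
print(moved)
```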

ultrack.tracks.tracks_length(tracks_df, include_appearing=True, include_disappearing=True)¶

Compute the length of each track in a tracks dataframe.

Parameters:

tracks_df (pd.DataFrame) –

DataFrame containing track information with columns:
“t”: Time step index for each data point in the track.
“track_id”: Unique identifier for each track.
“parent_track_id”: Unique identifier for the parent track.
include_appearing (bool, optional) – Include tracks that appear outside the first time step, by default True.
include_disappearing (bool, optional) – Include tracks that disappear outside the last time step, by default True.

Returns:

Series containing the length of each track.

Return type:

pd.Series
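
A basic per-track length can be computed with a group-by over “track_id”. The sketch below assumes that length means the number of time steps spanned by a track (last t minus first t, plus one). It leaves out the include_appearing and include_disappearing filters, and the helper name `track_lengths` is hypothetical.

```python
import pandas as pd


def track_lengths(tracks_df: pd.DataFrame) -> pd.Series:
    """Number of time steps spanned by each track (assumed definition of length)."""
    spans = tracks_df.groupby("track_id")["t"].agg(["min", "max"])
    return (spans["max"] - spans["min"] + 1).rename("length")


df = pd.DataFrame({
    "track_id":        [1, 1, 1, 2, 2],
    "t":               [0, 1, 2, 3, 4],
    "parent_track_id": [-1, -1, -1, 1, 1],
})
lengths = track_lengths(df)
print(lengths)
```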

ultrack.tracks.tracks_profile_matrix(tracks_df, columns)¶

Construct a profile matrix from a pandas DataFrame containing tracks data.

Parameters:

tracks_df (pd.DataFrame) –

DataFrame containing track information with columns:
“track_id”: Unique identifier for each track.
“t”: Time step index for each data point in the track.
Other columns specified in the ‘columns’ parameter, representing track attributes.
columns (List[str]) – List of strings, specifying the columns of ‘tracks_df’ to use as attributes.

Returns:

A 3D NumPy array representing the profile matrix with shape (num_attributes, num_tracks, max_timesteps), where ‘num_attributes’ is the number of attributes specified in ‘columns’, ‘num_tracks’ is the number of unique tracks, and ‘max_timesteps’ is the maximum number of timesteps encountered among all tracks.

Return type:

np.ndarray
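
The (num_attributes, num_tracks, max_timesteps) layout can be sketched with NumPy indexing. This is an illustration under assumptions, not ultrack's implementation: the helper name `profile_matrix` is hypothetical, the time axis is indexed by absolute time step starting at the smallest t in the dataframe, and time points where a track does not exist are filled with zeros.

```python
import numpy as np
import pandas as pd


def profile_matrix(tracks_df: pd.DataFrame, columns) -> np.ndarray:
    """Arrange attributes as an array of shape (len(columns), n_tracks, n_timesteps)."""
    track_ids = np.sort(tracks_df["track_id"].unique())
    t_min = int(tracks_df["t"].min())
    n_t = int(tracks_df["t"].max()) - t_min + 1
    matrix = np.zeros((len(columns), len(track_ids), n_t), dtype=np.float32)
    rows = np.searchsorted(track_ids, tracks_df["track_id"].to_numpy())  # track -> row index
    cols_t = tracks_df["t"].to_numpy() - t_min                           # time -> column index
    for i, col in enumerate(columns):
        matrix[i, rows, cols_t] = tracks_df[col].to_numpy()
    return matrix


df = pd.DataFrame({
    "track_id": [1, 1, 2, 2],
    "t":        [0, 1, 1, 2],
    "x":        [1.0, 2.0, 3.0, 4.0],
    "y":        [5.0, 6.0, 7.0, 8.0],
})
matrix = profile_matrix(df, ["x", "y"])
print(matrix.shape)
```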

Exporting¶

ultrack.core.export.export_tracks_by_extension(config, filename, overwrite=False)¶

Export tracks to a file given the file extension.

Parameters:

filename (str or Path) – The name of the file to save the tracks to.
config (MainConfig) – The configuration object.
overwrite (bool, optional) – Whether to overwrite the file if it already exists, by default False.