Segmentation & tracking from cell boundaries prediction¶
This example shows how Ultrack can segment and track from cell boundaries. This is the most natural application of this library because, in other examples, we converted the segmentation labels into contours, and here they are already in a representation analogous to the contour space.
While you can directly track from the membrane channel, we used PlantSeg to enhance the cell boundaries. PlantSeg provides several edges (i.e., boundary) detection pre-trained models, which have shown to generalize well working on Biohub’s Zebrafish dataset while trained on Arabidopsis thaliana.
Additionally, ultrack.imgproc.plantseg
module provides a wrapper around their library to facilitate it’s usage on array data.
First we import the required packages.
[1]:
from typing import Optional
import napari
import numpy as np
import dask.array as da
from napari.utils.notebook_display import nbscreenshot
from rich.pretty import pprint
from numpy.typing import ArrayLike
from ultrack import MainConfig, track, to_tracks_layer, tracks_to_zarr
from ultrack.utils.array import array_apply, create_zarr
from ultrack.imgproc.segmentation import reconstruction_by_dilation
from ultrack.imgproc.plantseg import PlantSeg
from ultrack.utils.cuda import import_module, to_cpu, on_gpu
Next we open a Zebrafish Neuromast dataset from Adrian’s group using napari’s ome-zarr reader plugin.
[2]:
viewer = napari.Viewer()
viewer.window.resize(1800, 1000)
viewer.open(
"http://public.czbiohub.org/royerlab/ultrack/zebrafish_neuromast.ome.zarr",
plugin="napari-ome-zarr",
)
viewer.dims.set_current_step(1, 54)
nuclei = viewer.layers["488"].data
membranes = viewer.layers["561"].data
scale = viewer.layers["488"].scale
nbscreenshot(viewer)
WARNING: inotify_add_watch(/home/jordao/.config/ibus/bus/c7c857cdefee40528bfceae4d3c85191-unix-1) failed: (No space left on device)
WARNING:vispy:inotify_add_watch(/home/jordao/.config/ibus/bus/c7c857cdefee40528bfceae4d3c85191-unix-1) failed: (No space left on device)
[2]:
Below, we defined a helper function to identify our region of interest, the neuromast, from the background. This step should vary for each application and can be quite challenging in some domains, where more advanced methodologies based on machine learning might be preferred over classical image processing. Mistakes in detecting the region of interest might lead to non-existing objects, decreasing the tracking accuracy.
In the detect_neuromast
function, we detect the nuclei and expand into PlantSeg’s boundary prediction. However, regions with weak boundaries can let a bit of background be detected.
[3]:
# helper functions
def detect_neuromast(
nuclei: ArrayLike,
boundaries: ArrayLike,
sigma: float = 4.0,
threshold: Optional[float] = None,
iterations: int = 25,
) -> ArrayLike:
"""
Fill the boundaries of the binarized nuclei using morphological operations to detection neuromast region.
Parameters
----------
nuclei : ArrayLike
Input array representing the nuclei.
boundaries : ArrayLike
Input array indicating the cell boundaries.
sigma : float,
Gaussian blur sigma applied before detecting nuclei.
threshold : float, optional
Threshold value for detecting nuclei. If not provided,
Otsu's thresholding is applied by default.
iterations : int
Iterations used to expand binary nuclei using reconstruction by dilation.
Returns
-------
ArrayLike
Array with the filled nuclei regions.
Notes
-----
This function fills the boundaries of nuclei while preventing
the expansion into specified regions, creating a wall-like
barrier for the expansion.
Examples
--------
>>> nuclei = np.random.rand(256, 256)
>>> boundaries = np.random.rand(256, 256)
>>> filled_nuclei = fill_boundaries(nuclei, boundaries)
"""
filters = import_module("skimage", "filters", nuclei)
morph = import_module("skimage", "morphology", nuclei)
segm = import_module("skimage", "segmentation", nuclei)
ndi = import_module("scipy", "ndimage", nuclei)
# detecting nuclei with threshold
seeds = filters.gaussian(nuclei, sigma)
if threshold is None:
current_thold = filters.threshold_otsu(seeds)
else:
current_thold = threshold
seeds = seeds > current_thold
seeds = morph.binary_closing(seeds, morph.ball(3))
# closed boundaries mask
mask = boundaries > 0.25
mask = ndi.binary_fill_holes(mask, morph.disk(5)[None, ...])
# expanding nuclei detection to mask
mask = reconstruction_by_dilation(seeds & mask, mask, iterations)
mask |= seeds
mask = segm.morphological_geodesic_active_contour(
boundaries,
num_iter=10,
init_level_set=mask,
smoothing=3,
balloon=0.1,
)
# filtering thin objects / holes
mask = morph.binary_opening(mask, morph.disk(10)[None, ...])
mask = morph.binary_closing(mask, morph.disk(10)[None, ...])
mask = ndi.binary_fill_holes(mask, morph.disk(5)[None, ...])
# boundaries are too thick
# mask can be eroded to fit image better
mask = morph.binary_erosion(mask, morph.ball(8))
return to_cpu(mask)
Next, we allocate the PlantSeg predictor and use it to detect the cell boundaries. The PlantSeg class has post(pre)-processing functionalities that will be run on the gpu if the array data is on that device. To do that we wrap the predictor with the on_gpu
call.
[4]:
predictor = PlantSeg(
model_name="generic_light_sheet_3D_unet",
batch_size=3,
postprocess_sigma=25.0 * scale[-3:],
stride_ratio=0.5,
patch=(64, 256, 256),
scale_factor=(1, 0.65, 0.65),
)
boundaries = create_zarr(membranes.shape, np.float16, store_or_path="boundaries.zarr", overwrite=True)
array_apply(
membranes,
out_array=boundaries,
func=on_gpu(predictor),
)
layer = viewer.add_image(boundaries, visible=True, scale=scale)
nbscreenshot(viewer)
layer.visible = False
2024-01-08 12:11:25,072 [MainThread] INFO PlantSeg - Using batch size of 4 for prediction
INFO:PlantSeg:Using batch size of 4 for prediction
Applying PlantSeg ...: 100%|██████████████████| 100/100 [03:23<00:00, 2.04s/it]
Using the detected boundaries and the nuclei channel we detect the neuromast, also with the on_gpu
call.
[5]:
detection = create_zarr(nuclei.shape, bool, store_or_path="detection.zarr", overwrite=True)
array_apply(
nuclei,
boundaries,
out_array=detection,
func=on_gpu(detect_neuromast),
sigma=5.0 * scale[-3:],
)
viewer.add_labels(detection, visible=True, scale=scale)
nbscreenshot(viewer)
Applying detect_neuromast ...: 100%|██████████| 100/100 [06:17<00:00, 3.77s/it]
[5]:
Now that we have the required detection and boundary map inputs we start the tracking stage.
Ultrack’s processing parameters are configured using the MainConfig
class. The track
function wraps the three main segment
, link
, and solve
functions. Each step contains its own parameters in a subconfiguration (e.g., config.linking_config
).
In this example, the parameters were optimized through inspection.
The configuration documentation can be found here.
[6]:
cfg = MainConfig()
cfg.data_config.n_workers = 8
cfg.segmentation_config.n_workers = 6
cfg.segmentation_config.min_area = 20_000
cfg.segmentation_config.max_area = 200_000
cfg.segmentation_config.min_frontier = 0.35
cfg.linking_config.n_workers = 16
cfg.linking_config.max_neighbors = 5
cfg.linking_config.max_distance = 3.0 # microns
cfg.linking_config.distance_weight = 0.01
cfg.tracking_config.appear_weight = -1.0
cfg.tracking_config.disappear_weight = -1.0
cfg.tracking_config.division_weight -0.1
pprint(cfg)
MainConfig( │ data_config=DataConfig(working_dir=PosixPath('.'), database='sqlite', address=None, n_workers=8), │ segmentation_config=SegmentationConfig( │ │ threshold=0.5, │ │ min_area=20000, │ │ max_area=200000, │ │ min_frontier=0.35, │ │ anisotropy_penalization=0.0, │ │ max_noise=0.0, │ │ ws_hierarchy=<function watershed_hierarchy_by_area at 0x7f9c26576cb0>, │ │ n_workers=6 │ ), │ linking_config=LinkingConfig( │ │ n_workers=16, │ │ max_neighbors=5, │ │ max_distance=3.0, │ │ distance_weight=0.01, │ │ z_score_threshold=5.0 │ ), │ tracking_config=TrackingConfig( │ │ appear_weight=-1.0, │ │ disappear_weight=-1.0, │ │ division_weight=-0.001, │ │ dismiss_weight_guess=None, │ │ include_weight_guess=None, │ │ window_size=None, │ │ overlap_size=1, │ │ solution_gap=0.001, │ │ time_limit=36000, │ │ method=0, │ │ n_threads=-1, │ │ link_function='power', │ │ power=4, │ │ bias=-0.0 │ ) )
We track the cells using the boundary and detection maps, the scale
parameter let the linking distance operate on the physical space.
[10]:
track(
cfg,
detection=detection,
edges=boundaries,
overwrite=True,
scale=scale,
)
Adding nodes to database: 100%|███████████████| 100/100 [08:06<00:00, 4.87s/it]
Linking nodes.: 100%|███████████████████████████| 99/99 [00:10<00:00, 9.24it/s]
Academic license - for non-commercial use only - expires 2024-08-06
Using GRB solver
Solving ILP batch 0
Constructing ILP ...
Solving ILP ...
Gurobi Optimizer version 9.1.2 build v9.1.2rc0 (linux64)
Thread count: 8 physical cores, 16 logical processors, using up to 16 threads
Optimize a model with 29307 rows, 44453 columns and 95545 nonzeros
Model fingerprint: 0x518cd744
Variable types: 0 continuous, 44453 integer (44453 binary)
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [3e-18, 1e+00]
Bounds range [1e+00, 1e+00]
RHS range [1e+00, 1e+00]
Found heuristic solution: objective -0.0000000
Presolve removed 5464 rows and 9462 columns
Presolve time: 0.16s
Presolved: 23843 rows, 34991 columns, 78605 nonzeros
Found heuristic solution: objective 9.5551060
Variable types: 0 continuous, 34991 integer (34991 binary)
Concurrent LP optimizer: primal simplex, dual simplex, and barrier
Showing barrier log only...
Root barrier log...
Ordering time: 0.01s
Barrier statistics:
AA' NZ : 6.372e+04
Factor NZ : 3.588e+05 (roughly 27 MBytes of memory)
Factor Ops : 7.612e+06 (less than 1 second per iteration)
Threads : 6
Objective Residual
Iter Primal Dual Primal Dual Compl Time
0 1.31817494e+03 5.43138554e+04 1.14e+01 8.21e-01 3.21e+00 0s
1 -1.90984614e+03 2.96184632e+04 2.53e+00 2.33e-15 7.54e-01 0s
2 -9.65556643e+02 6.81388511e+03 1.77e-01 1.11e-15 1.04e-01 0s
3 1.89738069e+02 2.19723637e+03 1.09e-02 8.88e-16 2.36e-02 0s
4 6.97634519e+02 1.57864333e+03 2.19e-03 6.66e-16 1.02e-02 0s
5 8.55539960e+02 1.26267870e+03 8.29e-04 6.66e-16 4.71e-03 0s
6 9.31768806e+02 1.14771828e+03 3.77e-04 4.44e-16 2.50e-03 0s
7 9.73986228e+02 1.09704241e+03 1.72e-04 6.66e-16 1.42e-03 0s
8 9.93994772e+02 1.05645130e+03 8.52e-05 4.44e-16 7.21e-04 0s
Barrier performed 8 iterations in 0.36 seconds
Barrier solve interrupted - model solved by another algorithm
Solved with dual simplex
Root relaxation: objective 1.017585e+03, 9867 iterations, 0.18 seconds
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 1017.58478 0 195 9.55511 1017.58478 - - 0s
H 0 0 1010.2036652 1017.58478 0.73% - 0s
H 0 0 1011.5510900 1017.58478 0.60% - 0s
0 0 1015.90394 0 128 1011.55109 1015.90394 0.43% - 0s
H 0 0 1012.6621415 1015.90394 0.32% - 0s
0 0 1015.79736 0 149 1012.66214 1015.79736 0.31% - 0s
0 0 1015.22684 0 108 1012.66214 1015.22684 0.25% - 0s
H 0 0 1014.5756839 1015.22684 0.06% - 1s
Cutting planes:
Gomory: 43
Clique: 150
MIR: 1
Zero half: 18
RLT: 27
Explored 1 nodes (10106 simplex iterations) in 1.23 seconds
Thread count was 16 (of 16 available processors)
Solution count 6: 1014.58 1012.66 1011.55 ... -0
Optimal solution found (tolerance 1.00e-03)
Best objective 1.014575683931e+03, best bound 1.015226558010e+03, gap 0.0642%
Saving solution ...
Done!
Finally, export the tracking results and visualize them using the napari tracks layer.
[12]:
tracks_df, graph = to_tracks_layer(cfg)
tracks_df.to_csv("tracks.csv", index=False)
segments = tracks_to_zarr(
cfg,
tracks_df,
store_or_path="segments.zarr",
overwrite=True,
)
viewer.layers["detection"].visible = False
viewer.add_tracks(
tracks_df[["track_id", "t", "z", "y", "x"]],
name="tracks",
graph=graph,
visible=True,
scale=scale,
)
viewer.add_labels(da.from_zarr(segments), name="segments", scale=scale)
nbscreenshot(viewer)
Exporting segmentation masks: 100%|███████████| 100/100 [00:09<00:00, 10.11it/s]
[12]:
[ ]: