Implementing your own processing pipeline

Here we provide an example where we implement a custom image processing routine and apply it to an DEXP’s dataset.

To replicate this example you need to download our sample data here, it contains four different views of a zebra-fish embryo. In the previous tutorial we discuss it in more detail.

First, we import the required packages:

import cupy as cp
import numpy as np
from arbol import asection
from cucim.skimage import morphology

from dexp.datasets import ZDataset
from dexp.processing.morphology import area_white_top_hat

Next we implement our image processing operation to remove the background fluorescence, in this case it consists of 2 steps:

  • Filter out noisy spots using morphological opening operator with a small sphere — using GPU acceleration through cupy;

  • Subtract darker components with area larger than 10_000 using the area white top hat transform, we use downsampling of 4 to speedup the computation and the axis-0 slices are processed individually due to the data anisotropy.

def remove_background(stack: np.ndarray) -> np.ndarray:
    with asection("Computing opening ..."):
        cu_stack = cp.asarray(stack)
        cu_opened = morphology.opening(cu_stack, morphology.ball(np.sqrt(2)))
        opened = cu_opened.get()

    with asection("Computing white top hat ..."):
        wth = area_white_top_hat(opened, area_threshold=10_000, sampling=4, axis=0)

    return wth

To apply this operation to the demo_4views.zarr.zip dataset we open it and create a new storage processed.zarr using write mode w-.

We iterate over each existing channel, creating it respective processed channel in the output dataset, and iterating over their time points. At each iteration, the required stack is read from the input dataset, the processing function is called, and the processed data is written to the output dataset, thus avoiding accumulation of data in the computer memory.

in_ds = ZDataset("demo_4views.zarr.zip")
out_ds = ZDataset("processed.zarr", mode="w-")

for channel in in_ds.channels():
    new_channel = channel + "-processed"
    out_ds.add_channel(name=new_channel, shape=in_ds.shape(channel), dtype=np.int32)
    with asection(f"Processing channel {channel} ..."):
        for tp in range(in_ds.nb_timepoints(channel)):
            stack = in_ds.get_stack(channel=channel, time_point=tp)
            labels = remove_background(stack)
            out_ds.write_stack(channel=new_channel, time_point=tp, stack_array=labels)

in_ds.close()
out_ds.close()

The datasets are closed before finishing the program.

The whole script is presented below:

import cupy as cp
import numpy as np
from arbol import asection
from cucim.skimage import morphology

from dexp.datasets import ZDataset
from dexp.processing.morphology import area_white_top_hat


def remove_background(stack: np.ndarray) -> np.ndarray:
    with asection("Computing opening ..."):
        cu_stack = cp.asarray(stack)
        cu_opened = morphology.opening(cu_stack, morphology.ball(np.sqrt(2)))
        opened = cu_opened.get()

    with asection("Computing white top hat ..."):
        wth = area_white_top_hat(opened, area_threshold=10_000, sampling=4, axis=0)

    return wth


in_ds = ZDataset("demo_4views.zarr.zip")
out_ds = ZDataset("processed.zarr", mode="w-")

for channel in in_ds.channels():
    new_channel = channel + "-processed"
    out_ds.add_channel(name=new_channel, shape=in_ds.shape(channel), dtype=np.int32)
    with asection(f"Processing channel {channel} ..."):
        for tp in range(in_ds.nb_timepoints(channel)):
            stack = in_ds.get_stack(channel=channel, time_point=tp)
            labels = remove_background(stack)
            out_ds.write_stack(channel=new_channel, time_point=tp, stack_array=labels)

in_ds.close()
out_ds.close()