API

This page provides detailed API documentation for TrackCell.

Input/Output Functions

IO module for TrackCell package.

This module provides functions for reading and writing single-cell and spatial transcriptomics data.

trackcell.io.read_hd_bin(datapath: str | Path, sample: str | None = None, binsize: int = 16, matrix_file_h5: str = 'filtered_feature_bc_matrix.h5', matrix_file_dir: str = 'filtered_feature_bc_matrix', tissue_positions_file: str = 'spatial/tissue_positions.parquet', hires_image_file: str = 'spatial/tissue_hires_image.png', lowres_image_file: str = 'spatial/tissue_lowres_image.png', scalefactors_file: str = 'spatial/scalefactors_json.json') → AnnData

Read 10X HD SpaceRanger bin-level output (2um/8um/16um) and create an AnnData object with spatial information.

This function reads the output of the SpaceRanger pipeline and creates an AnnData object that includes spatial coordinates, tissue images, and scalefactors for bin-level data.

Parameters:
  • datapath (str or Path) – Path to the SpaceRanger output directory containing bin-level outputs.

  • sample (str, optional) – Sample name. If None, will be inferred from the path.

  • binsize (int, default 16) – Bin size in micrometers. Common values are 2, 8, or 16. This information will be stored in adata.uns["spatial"][sample]["binsize"].

  • matrix_file_h5 (str, default "filtered_feature_bc_matrix.h5") – Name of the H5 matrix file. Will be tried first.

  • matrix_file_dir (str, default "filtered_feature_bc_matrix") – Name of the matrix directory. Will be used if H5 file is not available.

  • tissue_positions_file (str, default "spatial/tissue_positions.parquet") – Path to tissue positions file (parquet or csv format).

  • hires_image_file (str, default "spatial/tissue_hires_image.png") – Path to the high-resolution tissue image relative to datapath.

  • lowres_image_file (str, default "spatial/tissue_lowres_image.png") – Path to the low-resolution tissue image relative to datapath.

  • scalefactors_file (str, default "spatial/scalefactors_json.json") – Name of the scalefactors JSON file.

Returns:

AnnData object containing:

  • Expression matrix in .X

  • Cell metadata in .obs

  • Gene metadata in .var

  • Spatial coordinates in .obsm["spatial"]

  • Tissue images in .uns["spatial"][sample]["images"]

  • Scalefactors in .uns["spatial"][sample]["scalefactors"]

  • Bin size in .uns["spatial"][sample]["binsize"]

Return type:

sc.AnnData

Examples

>>> import trackcell.io as tcio
>>> adata = tcio.read_hd_bin("SpaceRanger4.0/Case1/outs", sample="Case1")
>>> print(adata)
AnnData object with n_obs × n_vars = 10000 × 2000
    obs: 'barcode'
    obsm: 'spatial'
    uns: 'spatial'

Notes

This function expects the SpaceRanger output to have the following structure:

  • filtered_feature_bc_matrix.h5 or filtered_feature_bc_matrix/: Expression matrix

  • spatial/tissue_positions.parquet or spatial/tissue_positions.csv: Spatial coordinates

  • spatial/tissue_hires_image.png: High-resolution tissue image

  • spatial/tissue_lowres_image.png: Low-resolution tissue image

  • spatial/scalefactors_json.json: Image scaling factors
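
The expected layout above can be checked before calling the reader. The following is a minimal sketch, not part of TrackCell: missing_hd_bin_files and EXPECTED_HD_BIN_FILES are hypothetical names, and the relative paths are simply the default values listed in the Notes.

```python
from pathlib import Path

# Relative paths taken from the Notes above; each tuple lists acceptable alternatives.
EXPECTED_HD_BIN_FILES = [
    ("filtered_feature_bc_matrix.h5", "filtered_feature_bc_matrix"),
    ("spatial/tissue_positions.parquet", "spatial/tissue_positions.csv"),
    ("spatial/tissue_hires_image.png",),
    ("spatial/tissue_lowres_image.png",),
    ("spatial/scalefactors_json.json",),
]


def missing_hd_bin_files(datapath):
    """Hypothetical helper: list the entries for which no alternative exists on disk."""
    root = Path(datapath)
    missing = []
    for alternatives in EXPECTED_HD_BIN_FILES:
        if not any((root / rel).exists() for rel in alternatives):
            missing.append(" or ".join(alternatives))
    return missing
```

An empty list means every expected entry was found. If non-default file names are passed to read_hd_bin, the list of expected paths would need to be adjusted accordingly.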

trackcell.io.read_hd_cellseg(datapath: str | Path, sample: str | None = None, cell_segmentations_file: str = 'graphclust_annotated_cell_segmentations.geojson', matrix_file: str = 'filtered_feature_cell_matrix.h5', hires_image_file: str = 'spatial/tissue_hires_image.png', lowres_image_file: str = 'spatial/tissue_lowres_image.png', scalefactors_file: str = 'spatial/scalefactors_json.json') → AnnData

Read 10X HD SpaceRanger cell segmentation output and create an AnnData object with spatial information.

This function reads the output of the SpaceRanger pipeline and creates an AnnData object that includes spatial coordinates, cell segmentations, and tissue images.

Parameters:
  • datapath (str or Path) – Path to the SpaceRanger output directory containing segmented outputs.

  • sample (str, optional) – Sample name. If None, will be inferred from the path.

  • cell_segmentations_file (str, default "graphclust_annotated_cell_segmentations.geojson") – Name of the cell segmentations file.

  • matrix_file (str, default "filtered_feature_cell_matrix.h5") – Name of the filtered feature-cell matrix file.

  • hires_image_file (str, default "spatial/tissue_hires_image.png") – Path to the high-resolution tissue image relative to datapath.

  • lowres_image_file (str, default "spatial/tissue_lowres_image.png") – Path to the low-resolution tissue image relative to datapath.

  • scalefactors_file (str, default "spatial/scalefactors_json.json") – Name of the scalefactors JSON file.

Returns:

AnnData object containing:

  • Expression matrix in .X

  • Cell metadata in .obs

  • Gene metadata in .var

  • Spatial coordinates in .obsm["spatial"]

  • Tissue images in .uns["spatial"][sample]["images"]

  • Scalefactors in .uns["spatial"][sample]["scalefactors"]

  • Cell geometries in .uns["spatial"][sample]["geometries"] (GeoDataFrame)

  • Cell geometries in .obs["geometry"] (WKT strings for serialization)

Return type:

sc.AnnData

Examples

>>> import trackcell.io as tcio
>>> adata = tcio.read_hd_cellseg("SpaceRanger4.0/Case1/outs/segmented_outputs", sample="Case1")
>>> print(adata)
AnnData object with n_obs × n_vars = 1000 × 2000
    obs: 'cellid'
    obsm: 'spatial'
    uns: 'spatial'

Notes

This function expects the SpaceRanger output to have the following structure:

  • graphclust_annotated_cell_segmentations.geojson: Cell segmentation polygons

  • filtered_feature_cell_matrix.h5: Expression matrix

  • spatial/tissue_hires_image.png: High-resolution tissue image

  • spatial/tissue_lowres_image.png: Low-resolution tissue image

  • spatial/scalefactors_json.json: Image scaling factors

trackcell.io.sync_geometries_after_subset(adata, sample: str | None = None)

Synchronize geometries in adata.uns["spatial"][sample]["geometries"] after subsetting the AnnData object.

When you subset an AnnData object (e.g., adata[mask] or adata[adata.obs['col'] == 'value']), the adata.obs["geometry"] column is automatically subset, but the GeoDataFrame in adata.uns["spatial"][sample]["geometries"] is not automatically updated. This function ensures that the geometries GeoDataFrame matches the subsetted cells.

Parameters:
  • adata (sc.AnnData) – AnnData object that has been subsetted.

  • sample (str, optional) – Sample name. If None, will use the first available sample in adata.uns["spatial"].

Returns:

The same AnnData object with synchronized geometries (modified in place).

Return type:

sc.AnnData

Examples

>>> import trackcell.io as tcio
>>> adata = tcio.read_hd_cellseg("path/to/data", sample="sample1")
>>> # Subset the data
>>> adata_subset = adata[adata.obs['classification'] == 'Cluster-1'].copy()
>>> # Synchronize geometries
>>> tcio.sync_geometries_after_subset(adata_subset, sample="sample1")
>>> # Now adata_subset.uns["spatial"]["sample1"]["geometries"] only contains geometries for subsetted cells

trackcell.io.convert_annohdcell_to_trackcell(bin_h5ad_path: str | Path, output_h5ad_path: str | Path | None = None, sample: str | None = None, labels_key: str = 'labels_joint', bin_size_um: float = 2.0, create_polygons: bool = True, buffer_polygons: bool = True) → AnnData

Convert annohdcell 2μm bin h5ad to trackcell-compatible cell-level h5ad.

This function reads a bin-level h5ad file from annohdcell (with cell assignment labels) and converts it to a cell-level h5ad file compatible with trackcell visualization tools. It aggregates bin counts into cells and creates polygon geometries for each cell.

Parameters:
  • bin_h5ad_path (str or Path) – Path to the input 2μm bin h5ad file (e.g., b2c_2um.h5ad from annohdcell)

  • output_h5ad_path (str or Path, optional) – Path to save the output h5ad file. If None, will not save to disk.

  • sample (str, optional) – Sample name for spatial metadata. If None, inferred from filename.

  • labels_key (str, default "labels_joint") – Column name in .obs containing cell assignment labels (0 = unassigned)

  • bin_size_um (float, default 2.0) – Size of each bin in micrometers

  • create_polygons (bool, default True) – Whether to create polygon geometries from bin coordinates

  • buffer_polygons (bool, default True) – Whether to buffer polygons slightly to account for bin size

Returns:

Cell-level AnnData object with:

  • .X: Summed expression counts per cell

  • .obs: Cell metadata including "geometry" (WKT strings)

  • .obsm["spatial"]: Cell centroid coordinates

  • .uns["spatial"][sample]["geometries"]: GeoDataFrame with cell polygons

  • .uns["spatial"][sample]["images"]: Tissue images (if present in input)

  • .uns["spatial"][sample]["scalefactors"]: Scale factors

Return type:

sc.AnnData

Examples

>>> import trackcell.io as tcio
>>> adata = tcio.convert_annohdcell_to_trackcell(
...     "b2c_2um.h5ad",
...     output_h5ad_path="trackcell_format.h5ad",
...     sample="sample1"
... )
>>> print(adata)
>>> # Now can use trackcell visualization functions
>>> import trackcell.pl as tcpl
>>> tcpl.spatial_cell(adata, sample="sample1")

Notes

  • Input h5ad must have .obs[labels_key] with integer cell labels (0 = unassigned)

  • Input h5ad must have .obsm["spatial"] with bin coordinates

  • Bins with label 0 are excluded (unassigned bins)

  • Gene counts are summed across all bins in each cell

  • Cell coordinates are the mean of constituent bin coordinates

  • Polygon geometries are created using convex hull of bin coordinates
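
The aggregation rules in the notes above (drop label 0, sum counts per cell, average bin coordinates) can be illustrated with a small pure-Python sketch. This is not the TrackCell implementation: aggregate_bins is a hypothetical helper, and plain lists stand in for the expression matrix and .obsm["spatial"].

```python
from collections import defaultdict


def aggregate_bins(labels, counts, coords):
    """Hypothetical helper: sum per-bin counts and average coordinates per cell label.

    labels: one integer label per bin (0 = unassigned)
    counts: one list of per-gene counts per bin
    coords: one (x, y) pair per bin
    """
    summed = {}
    xs = defaultdict(list)
    ys = defaultdict(list)
    for label, row, (x, y) in zip(labels, counts, coords):
        if label == 0:  # unassigned bins are excluded
            continue
        if label not in summed:
            summed[label] = [0] * len(row)
        summed[label] = [a + b for a, b in zip(summed[label], row)]
        xs[label].append(x)
        ys[label].append(y)
    # Cell coordinate = mean of the coordinates of its constituent bins
    centroids = {
        label: (sum(xs[label]) / len(xs[label]), sum(ys[label]) / len(ys[label]))
        for label in summed
    }
    return summed, centroids


labels = [1, 1, 0, 2]
counts = [[1, 0], [2, 3], [9, 9], [0, 4]]
coords = [(0.0, 0.0), (2.0, 0.0), (50.0, 50.0), (10.0, 10.0)]
summed, centroids = aggregate_bins(labels, counts, coords)
print(summed)     # {1: [3, 3], 2: [0, 4]}
print(centroids)  # {1: (1.0, 0.0), 2: (10.0, 10.0)}
```

The third bin carries label 0, so its counts and coordinates do not contribute to any cell.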

trackcell.io.add_geometries_to_annohdcell_output(bin_h5ad_path: str | Path, cell_h5ad_path: str | Path, output_h5ad_path: str | Path | None = None, sample: str | None = None, labels_key: str = 'labels_joint', bin_size_um: float = 2.0, buffer_polygons: bool = True, n_jobs: int = -1) → AnnData

Add polygon geometries to annohdcell’s final cell h5ad using bin-level data.

This function reads both the 2μm bin h5ad (with cell labels) and the final cell-level h5ad from annohdcell, then adds polygon geometries to the cell h5ad by creating convex hulls from the constituent bins of each cell.

Parameters:
  • bin_h5ad_path (str or Path) – Path to the 2μm bin h5ad file (e.g., b2c_2um.h5ad) with cell labels

  • cell_h5ad_path (str or Path) – Path to the final cell h5ad file (e.g., b2c_cell.h5ad) from annohdcell

  • output_h5ad_path (str or Path, optional) – Path to save the output h5ad file. If None, will not save to disk.

  • sample (str, optional) – Sample name for spatial metadata. If None, inferred from filename.

  • labels_key (str, default "labels_joint") – Column name in bin h5ad .obs containing cell assignment labels

  • bin_size_um (float, default 2.0) – Size of each bin in micrometers

  • buffer_polygons (bool, default True) – Whether to buffer polygons slightly to account for bin size

  • n_jobs (int, default -1) – Number of parallel workers for polygon creation. -1 uses all available CPU cores, 1 disables parallelization.

Returns:

Cell-level AnnData object (copy of input cell h5ad) with added:

  • .obs["geometry"]: WKT strings representing cell boundaries

  • .uns["spatial"][sample]["geometries"]: GeoDataFrame with cell polygons

Return type:

sc.AnnData

Examples

>>> import trackcell.io as tcio
>>> adata = tcio.add_geometries_to_annohdcell_output(
...     bin_h5ad_path="b2c_2um.h5ad",
...     cell_h5ad_path="b2c_cell.h5ad",
...     output_h5ad_path="b2c_cell_with_geom.h5ad",
...     sample="sample1"
... )
>>> # Now can use trackcell visualization
>>> import trackcell.pl as tcpl
>>> tcpl.spatial_cell(adata, sample="sample1")

Notes

  • The cell h5ad must have .obs["object_id"] matching labels in bin h5ad

  • Bin h5ad must have .obs[labels_key] with integer cell labels

  • Bin h5ad must have .obsm["spatial"] with bin coordinates

  • The function preserves all data from the input cell h5ad

  • Only adds geometry information, does not modify counts or other data
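
As an illustration of how a cell outline can be derived from its constituent bins, the sketch below computes a convex hull with Andrew's monotone chain algorithm in pure Python. It only demonstrates the geometric idea; the function name convex_hull and the sample coordinates are assumptions, and TrackCell itself may rely on a geometry library for this step.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a counter-clockwise turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one
    return lower[:-1] + upper[:-1]


# Bin coordinates of one hypothetical cell; (1, 1) lies inside the outline
bins = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(bins)
print(hull)  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

If the bin coordinates are bin centres, a hull built this way stops at the centres of the outermost bins, which is presumably why buffer_polygons expands the polygon slightly.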

trackcell.io.restore_geometries(adata: AnnData, sample: str | None = None) → AnnData

Restore GeoDataFrame from WKT strings after reading h5ad file.

When h5ad files are saved with geometries, the GeoDataFrame is converted to a regular DataFrame with WKT strings for serialization. This function converts them back to a GeoDataFrame for spatial visualization.

Parameters:
  • adata (sc.AnnData) – AnnData object read from h5ad file

  • sample (str, optional) – Sample name in adata.uns["spatial"]. If None, processes all samples.

Returns:

AnnData object with GeoDataFrame restored in uns["spatial"][sample]["geometries"]

Return type:

sc.AnnData

Examples

>>> import scanpy as sc
>>> import trackcell as tcl
>>> adata = sc.read_h5ad("cell_with_geom.h5ad")
>>> adata = tcl.io.restore_geometries(adata)
>>> # Now adata.uns["spatial"][sample]["geometries"] is a GeoDataFrame
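
For readers unfamiliar with WKT, the sketch below shows what a serialized polygon string looks like and how it maps back to vertices. It is a format illustration only: ring_to_wkt and wkt_to_ring are hypothetical helpers that handle a single hole-free polygon, whereas real code would normally use a geometry library (for example shapely.wkt.loads).

```python
def ring_to_wkt(ring):
    """Format a list of (x, y) vertices as a WKT polygon with a closed outer ring."""
    closed = list(ring)
    if closed[0] != closed[-1]:
        closed.append(closed[0])  # WKT rings repeat the first vertex at the end
    body = ", ".join(f"{x} {y}" for x, y in closed)
    return f"POLYGON (({body}))"


def wkt_to_ring(text):
    """Parse the outer ring of a simple (hole-free) WKT polygon back into vertices."""
    inner = text.strip()[len("POLYGON"):].strip().lstrip("(").rstrip(")")
    return [tuple(float(v) for v in pair.split()) for pair in inner.split(",")]


wkt = ring_to_wkt([(0, 0), (2, 0), (2, 2), (0, 2)])
print(wkt)  # POLYGON ((0 0, 2 0, 2 2, 0 2, 0 0))
ring = wkt_to_ring(wkt)
print(ring[0], ring[-1])  # (0.0, 0.0) (0.0, 0.0)
```

Because the string form is plain text, it can be stored in an ordinary .obs column and written to h5ad, which is the motivation given above for the WKT representation.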

Plotting Functions

PL (Plotting) module for TrackCell package.

This module provides functions for plotting and visualizing single-cell and spatial transcriptomics data.

trackcell.pl.spatial_cell(adata, color: str | List[str] | None = None, groups: List[str] | None = None, groupby: str | None = None, library_id: str | None = None, size: float = 1.0, figsize: tuple | None = None, cmap: str = 'viridis', palette: dict | list | ndarray | None = None, vmin: float | None = None, vmax: float | None = None, img_key: str | None = None, basis: str = 'spatial', edges_width: float = 0.5, edges_color: str = 'black', alpha: float = 0.8, alpha_img: float = 0.5, show: bool = True, ax: Axes | None = None, legend: bool = True, xlabel: str | None = 'spatial 1', ylabel: str | None = 'spatial 2', show_ticks: bool = False, **kwargs)

Plot spatial transcriptomics data with cell polygons instead of points.

This function visualizes cells as polygons (from cell segmentation) rather than simple points, providing a more accurate representation of cell boundaries. Uses GeoDataFrame.plot() for efficient rendering and automatic legend generation.

Parameters:
  • adata (AnnData) – Annotated data object with spatial information and cell geometries.

  • color (str or list of str, optional) –

    Key or list of keys for the categorical or continuous variables used to color cells. A list produces one plot per key. Each key can be:

    • A column name in adata.obs (metadata)

    • A gene name in adata.var_names (gene expression)

    • None: Only display the H&E background image without cell polygons. In this case, axis ticks and labels are automatically shown.

  • groups (list of str, optional) – Subset of groups to plot. If None, plots all groups. Requires either color to be a categorical column in adata.obs or groupby to be specified.

  • groupby (str, optional) – Column name in adata.obs to use for filtering with groups parameter. If None and groups is specified, will use color if it’s a categorical column in adata.obs. This is useful when color is a continuous variable (e.g., gene expression) but you want to filter by a categorical column (e.g., 'classification').

  • library_id (str, optional) – Key in adata.uns["spatial"] containing spatial information. If None (the default), uses the first available library_id, similar to sc.pl.spatial.

  • size (float, default 1.0) – Size scaling factor for cells. Currently has no effect on polygon-based visualization; kept for API compatibility.

  • figsize (tuple, optional) – Figure size (width, height) in inches.

  • cmap (str, default "viridis") – Colormap for continuous values.

  • vmin (float, optional) – Minimum value for colormap normalization. If None, uses the minimum value in the data.

  • vmax (float, optional) – Maximum value for colormap normalization. If None, uses the maximum value in the data.

  • palette (dict, list, or array, optional) –

    Color palette for categorical variables. Can be:

    • A dictionary mapping category names to colors (e.g., {'A': 'red', 'B': 'blue'})

    • A list/array of colors that will be assigned to categories in sorted order (e.g., ['red', 'blue', 'green'] will assign colors to categories alphabetically)

  • img_key (str, optional) – Key in adata.uns["spatial"][library_id]["images"] for background image. If None, uses "hires" if available.

  • basis (str, default "spatial") – Key in adata.obsm containing spatial coordinates (for fallback).

  • edges_width (float, default 0.5) – Width of cell polygon edges.

  • edges_color (str, default "black") – Color of cell polygon edges.

  • alpha (float, default 0.8) – Transparency of cell polygons.

  • alpha_img (float, default 0.5) – Transparency of background image.

  • show (bool, default True) – Whether to display the plot.

  • ax (matplotlib.Axes, optional) – Axes object to plot on. If None, creates a new figure.

  • legend (bool, default True) – Whether to show legend for categorical values or colorbar for continuous values.

  • xlabel (str, optional, default "spatial 1") – Label for the x-axis. Set to None to hide the label.

  • ylabel (str, optional, default "spatial 2") – Label for the y-axis. Set to None to hide the label.

  • show_ticks (bool, default False) – Whether to show axis ticks and tick labels. If False, ticks are hidden. Note: When color=None, ticks are automatically shown regardless of this setting.

  • **kwargs – Additional arguments passed to GeoDataFrame.plot().

Returns:

Axes object(s) containing the plot.

Return type:

matplotlib.Axes or list of matplotlib.Axes

Examples

>>> import trackcell as tcl
>>> adata = tcl.io.read_hd_cellseg("path/to/data", sample="sample1")
>>> # Plot by metadata (categorical)
>>> tcl.pl.spatial_cell(adata, color="classification")
>>> # Plot by metadata (continuous)
>>> tcl.pl.spatial_cell(adata, color="Cluster-2_dist", cmap="Reds")
>>> # Plot by gene expression
>>> tcl.pl.spatial_cell(adata, color="PDPN", cmap="viridis")
>>> # Plot with groups filter
>>> tcl.pl.spatial_cell(adata, color="classification", groups=["Cluster-1", "Cluster-2"])
>>> # Plot only H&E image (no cell polygons)
>>> tcl.pl.spatial_cell(adata, color=None)
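
The palette parameter accepts either a dictionary or a list assigned in sorted category order. The sketch below shows how a list could be turned into the equivalent explicit dictionary. palette_from_list is a hypothetical helper, not a TrackCell function, and the length check is an added assumption rather than documented behavior.

```python
def palette_from_list(categories, colors):
    """Hypothetical helper: pair colors with categories in sorted order."""
    ordered = sorted(set(categories))
    if len(colors) < len(ordered):
        raise ValueError("need at least one color per category")
    # zip stops at the shorter sequence, so surplus colors are ignored
    return dict(zip(ordered, colors))


mapping = palette_from_list(
    ["Cluster-2", "Cluster-1", "Cluster-2"],
    ["red", "blue", "green"],
)
print(mapping)  # {'Cluster-1': 'red', 'Cluster-2': 'blue'}
```

Passing an explicit dictionary avoids surprises when the set of categories changes, for example after subsetting.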

trackcell.pl.mark_region(ax: Axes, xlim: tuple | None = None, ylim: tuple | None = None, edges_color: str = 'red', edges_width: float = 1.0)

Mark a rectangular region on a spatial plot by drawing a rectangle.

This function draws a rectangle on the given axes to highlight a specific spatial region. It can be used with any spatial plot.

Parameters:
  • ax (matplotlib.axes.Axes) – Axes object to draw the rectangle on.

  • xlim (tuple, optional) – Tuple of (x_min, x_max) to define the x-range of the region. If None, uses the current x-axis limits.

  • ylim (tuple, optional) – Tuple of (y_min, y_max) to define the y-range of the region. If None, uses the current y-axis limits.

  • edges_color (str, default 'red') – Color of the rectangle edges.

  • edges_width (float, default 1.0) – Width of the rectangle edges.

Returns:

The rectangle patch object that was added to the axes.

Return type:

matplotlib.patches.Rectangle

Examples

>>> import trackcell as tcl
>>> import matplotlib.pyplot as plt
>>>
>>> # Plot with spatial_cell and mark a region
>>> fig, ax = plt.subplots(figsize=(10, 10))
>>> tcl.pl.spatial_cell(adata, color="CellType", ax=ax)
>>> tcl.pl.mark_region(ax, xlim=(54500, 56000), ylim=(15000, 16000))
>>>
>>> # Mark a region on any plot
>>> fig, ax = plt.subplots(figsize=(10, 10))
>>> # ... create your plot on ax ...
>>> tcl.pl.mark_region(ax, xlim=(54500, 56000), ylim=(15000, 16000),
...                    edges_color='blue', edges_width=2.0)

Tools Functions

TL (Tools) module for TrackCell package.

This module provides utility and helper functions for single-cell and spatial transcriptomics data analysis.

trackcell.tl.hd_labeldist(adata, groupby: str, label: str, inplace: bool = True, method: str = 'kdtree')

Compute the distance from every cell to the nearest cell annotated with a specific label (10x HD data).

Distances are reported both in the pixel coordinate system stored in .obsm["spatial"] (SpaceRanger target/hires layer) and in microns using the scalefactors embedded in adata.uns["spatial"].

The function automatically detects whether coordinates are in hires or full-res resolution by comparing the computed tissue size with the expected chip size (6.5 mm for 10X HD). This ensures compatibility with both SpaceRanger output and bin2cell-processed data.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix with spatial coordinates and SpaceRanger scalefactors.

  • groupby (str) – Column name in adata.obs that contains the annotation labels.

  • label (str) – Target label within groupby for which distances will be computed.

  • inplace (bool, default True) – If True, the function adds two columns to adata.obs: {label}_px (pixel distance on the hires/registered image) and {label}_dist (physical distance in microns). If False, the function returns a dataframe with the two columns.

  • method (str, default "kdtree") –

    Method to use for distance computation:

    • "kdtree": Use KDTree spatial indexing (recommended, O(n log n) time, O(n) memory). Best for large datasets with many cells.

    • "cdist": Use scipy’s cdist function (O(n*m) time and memory, where m is the number of label cells). Faster for small datasets but memory-intensive for large ones.

Returns:

Returns a DataFrame with {label}_px and {label}_dist when inplace=False. Otherwise, modifies adata.obs in place and returns None.

Return type:

pandas.DataFrame | None
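
To make the two output columns concrete, the following pure-Python sketch computes the quantity that hd_labeldist reports: the distance from each cell to the nearest cell with the target label, in pixels and in microns. It uses a brute-force pairwise search, which is conceptually what the "cdist" option evaluates; a KD-tree returns the same distances with less work. nearest_label_distance and the microns_per_pixel argument are assumptions made for illustration, and the real function reads its scale factor from adata.uns["spatial"].

```python
import math


def nearest_label_distance(coords, labels, target, microns_per_pixel):
    """Hypothetical helper: brute-force distance from each cell to the nearest target cell.

    Returns (pixel_distances, micron_distances), one entry per cell.
    """
    targets = [xy for xy, lab in zip(coords, labels) if lab == target]
    if not targets:
        raise ValueError(f"no cells labelled {target!r}")
    # O(n*m): every cell is compared against every target cell
    px = [min(math.dist(xy, t) for t in targets) for xy in coords]
    um = [d * microns_per_pixel for d in px]
    return px, um


coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
labels = ["Tumor", "Stroma", "Stroma"]
px, um = nearest_label_distance(coords, labels, "Tumor", microns_per_pixel=0.5)
print(px)  # [0.0, 5.0, 10.0]
print(um)  # [0.0, 2.5, 5.0]
```

Cells that carry the target label themselves get a distance of zero, as the first entry shows.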