Cytoland
Version 2025-03 released 31 Mar 2025
- Ziwen Liu (Chan Zuckerberg Biohub San Francisco)
- Eduardo Hirata-Miyasaki (Chan Zuckerberg Biohub San Francisco)
- Shalin Mehta (Chan Zuckerberg Biohub San Francisco)
The Cytoland virtual staining models are a collection of models (VSCyto2D, VSCyto3D, and VSNeuromast) based on a convolutional architecture (UNeXt2) that predict cellular landmarks (nuclei and plasma membrane) from label-free microscopy images such as quantitative phase, Zernike phase contrast, and brightfield. The Cytoland models are robust to variations in imaging parameters, cell types, and developmental stages, demonstrating broad utility for downstream analysis. These models enable segmentation and tracking of cells from label-free images of diverse cell types and improve the throughput of dynamic imaging screens by eliminating the need for labeling nuclei and cell membranes.
Model Details
Model Architecture
The UNeXt2 family consists of asymmetric U-Net architectures built on the ConvNeXt v2 backbone, which operates on 2D intermediate feature maps. The 3D variants use projection stem and head blocks to map 3D input and output volumes to and from the 2D backbone.
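The 3D-to-2D mapping can be pictured with the minimal PyTorch sketch below. This is illustrative only: the layer sizes, module names, and the trivial placeholder 2D backbone are assumptions, not the actual UNeXt2 implementation in VisCy.

```python
# Illustrative sketch of a projection stem/head around a 2D backbone
# (not the actual UNeXt2 code; shapes and layers are placeholders).
import torch
import torch.nn as nn

class ProjectionStem(nn.Module):
    """Collapse the depth axis of a (B, C, D, H, W) stack into 2D channels."""
    def __init__(self, in_channels: int, depth: int, out_channels: int):
        super().__init__()
        self.proj = nn.Conv2d(in_channels * depth, out_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, d, h, w = x.shape
        return self.proj(x.reshape(b, c * d, h, w))

class ProjectionHead(nn.Module):
    """Expand 2D decoder features back into a (B, C, D, H, W) prediction."""
    def __init__(self, in_channels: int, out_channels: int, depth: int):
        super().__init__()
        self.depth, self.out_channels = depth, out_channels
        self.proj = nn.Conv2d(in_channels, out_channels * depth, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        return self.proj(x).reshape(b, self.out_channels, self.depth, h, w)

stem = ProjectionStem(in_channels=1, depth=5, out_channels=64)
head = ProjectionHead(in_channels=64, out_channels=2, depth=5)
backbone_2d = nn.Identity()  # stand-in for the 2D ConvNeXt-style encoder-decoder

phase_stack = torch.randn(1, 1, 5, 256, 256)          # label-free input (B, C, D, H, W)
virtual_stain = head(backbone_2d(stem(phase_stack)))  # nuclei + membrane channels
print(virtual_stain.shape)  # torch.Size([1, 2, 5, 256, 256])
```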
Parameters
32M
Citation
Liu, Hirata-Miyasaki, et al. (2025). Cytoland: robust virtual staining of landmark organelles. Accepted for publication in Nature Machine Intelligence. Preprint: DOI:10.1101/2024.05.31.596901
Primary Contact Email
Shalin Mehta shalin.mehta@czbiohub.org
Model Variants
| Model Variant Name | Task | Access URL |
|---|---|---|
| VSCyto2D | 2D virtual staining of cultured cells for high-content screening | |
| VSCyto3D | 3D virtual staining of cultured cells for studying subcellular dynamics | |
| VSNeuromast | 3D virtual staining of in vivo zebrafish neuromast for studying organogenesis | |
Intended Use
Primary Use Cases
Translate label-free microscopy images into images that represent landmark organelles (nuclei and plasma membrane), which can then be used for cell segmentation and tracking.
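As a simplified illustration of the downstream step (not the segmentation method used in the paper), a virtually stained nuclei channel can be converted into instance masks with standard tools; the array below is a placeholder for a model prediction.

```python
# Minimal downstream sketch: threshold a virtually stained nuclei image and
# label connected components (illustrative only).
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

virtual_nuclei = np.random.rand(512, 512)  # placeholder for a VSCyto2D prediction

mask = virtual_nuclei > threshold_otsu(virtual_nuclei)  # foreground/background split
instances = label(mask)                                 # one integer id per object
areas = [r.area for r in regionprops(instances)]
print(f"{instances.max()} objects, mean area {np.mean(areas):.1f} px")
```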
Out-of-Scope or Unauthorized Use Cases
Do not use the model for the following purposes:
- Use that violates applicable laws or regulations (including trade compliance laws).
- Any use that is prohibited by the Acceptable Use Policy or the BSD 3-Clause license.
Training Details
Training Data
Label-free and fluorescence light microscopy images of cultured human cell lines and zebrafish neuromasts were used for model training. The datasets are available from the following sources:
- Cytoland dataset: S-BIAD1702
- LiveCell dataset: LiveCell
- Allen Institute for Cell Science hiPSC dataset: AICS data
Training Procedure
Preprocessing
Brightfield and widefield fluorescence images are deconvolved with the waveorder package to retrieve phase and fluorescence density. Channels are registered with the biahub pipeline to ensure spatial correspondence.
Model training
During pre-training, label-free images are used to train masked autoencoders. During fine-tuning, the pre-trained encoders are paired with randomly-initialized decoders and optimized for a supervised image translation task. See the paper for full details.
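A schematic of this two-stage recipe is sketched below in generic PyTorch, not the actual VisCy training code; the masking scheme, loss choices, and module sizes are assumptions made for illustration.

```python
# Two-stage training sketch: masked-image pre-training, then supervised
# image translation with a fresh decoder (illustrative only).
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.GELU())
recon_decoder = nn.Conv2d(32, 1, 3, padding=1)   # reconstruction head for pre-training
stain_decoder = nn.Conv2d(32, 2, 3, padding=1)   # randomly initialized for fine-tuning

# Stage 1: self-supervised pre-training on label-free images only.
# Part of the input is hidden and the network reconstructs the full image.
label_free = torch.randn(4, 1, 128, 128)
mask = (torch.rand(4, 1, 128, 128) > 0.5).float()  # hide ~50% of the input
recon = recon_decoder(encoder(label_free * mask))
pretrain_loss = nn.functional.mse_loss(recon, label_free)

# Stage 2: supervised fine-tuning with paired fluorescence targets.
# The pre-trained encoder is reused; the translation decoder starts from scratch.
fluorescence = torch.randn(4, 2, 128, 128)         # nuclei + membrane targets
prediction = stain_decoder(encoder(label_free))
finetune_loss = nn.functional.mse_loss(prediction, fluorescence)
```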
Training Code
- Repo: https://github.com/mehta-lab/VisCy
- Scripts and Configurations: https://public.czbiohub.org/comp.micro/viscy/VS_models/
Speeds, Sizes, Times
Checkpoints are between 350 and 460 MB. Inference time varies with the size of the input image and the type of GPU used.
Training Hyperparameters
Training used fp32 and fp16 mixed precision; a detailed list of hyperparameters is specified in the published training configurations.
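For reference, fp16 mixed-precision training in PyTorch typically follows the autocast/GradScaler pattern shown below. This is a generic sketch, not the project's training configuration; the model, optimizer, and data are placeholders.

```python
# Generic fp16 mixed-precision training step with PyTorch AMP (illustrative only).
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Conv2d(1, 2, 3, padding=1).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(2, 1, 64, 64, device=device)
y = torch.randn(2, 2, 64, 64, device=device)

with torch.autocast(device_type=device, dtype=torch.float16, enabled=(device == "cuda")):
    loss = nn.functional.mse_loss(model(x), y)  # forward pass runs in fp16 where safe

scaler.scale(loss).backward()   # gradients are scaled to avoid fp16 underflow
scaler.step(optimizer)          # unscales gradients before the optimizer update
scaler.update()
```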
Data Sources
- Cytoland dataset: S-BIAD1702
- AICS hiPSC single cell image dataset (used for VSCyto3D)
- LiveCell dataset (used for VSCyto2D)
Performance Metrics
Metrics
The models were evaluated using a range of benchmarks to measure their performance. Key metrics include image regression metrics (Pearson Correlation Coefficient, Structural Similarity Index Measure, etc.), segmentation metrics (Intersection over Union, Average Precision, etc.), and application-specific metrics (cell area, cell count, etc.).
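The regression and segmentation metrics can be computed with standard libraries. The sketch below uses placeholder arrays and assumes SciPy and scikit-image; it is meant only to show how Pearson correlation, SSIM, and IoU are defined, not to reproduce the paper's evaluation pipeline.

```python
# Compute example evaluation metrics on placeholder images (illustrative only).
import numpy as np
from scipy.stats import pearsonr
from skimage.metrics import structural_similarity

target = np.random.rand(256, 256).astype(np.float32)      # fluorescence ground truth
prediction = target + 0.1 * np.random.rand(256, 256).astype(np.float32)  # virtual stain

# Image regression metrics
pcc, _ = pearsonr(target.ravel(), prediction.ravel())
ssim = structural_similarity(target, prediction, data_range=1.0)

# Segmentation metric: intersection over union of binary masks
gt_mask = target > 0.5
pred_mask = prediction > 0.5
iou = np.logical_and(gt_mask, pred_mask).sum() / np.logical_or(gt_mask, pred_mask).sum()

print(f"PCC={pcc:.3f}  SSIM={ssim:.3f}  IoU={iou:.3f}")
```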
Evaluation Datasets
The Cytoland models were evaluated using microscopy images from various cell types, including HEK293T, A549, BJ-5ta, and zebrafish neuromasts. The evaluation datasets for each model can be found in the BioImage Archive accession S-BIAD1702.
Evaluation Results
VSCyto2D
A. Self-supervised (FCMAE) pre-training with label-free data and supervised pre-training (end-to-end) with paired label-free and fluorescence data enable data-efficient virtual staining of BJ-5ta cells.
B. VSCyto2D achieves few-shot generalization on iPSC-derived neurons (iNeurons), enabling soma segmentation and neurite tracing.
C. Segmentation of iNeuron cells from virtual staining can provide accurate label-free quantification of neuronal differentiation.

VSCyto3D
A. VSCyto3D, trained with data acquired at the Biohub, achieves zero-shot generalization to images acquired independently at the Allen Institute for Cell Science (AICS), using different imaging systems and cell culture protocols.
B. Pre-training for VSCyto3D improves cell and nuclei segmentation accuracy from virtual staining.

VSNeuromast
A. VSNeuromast predicts cellular landmarks in the zebrafish neuromast in vivo, enabling label-free visualization of organ development.
B. Virtual staining performance is stable in long-term time-lapse imaging.
C. Unlike fluorescence microscopy, virtually stained cellular landmarks are not affected by photobleaching.
D. Virtual staining can be used for label-free cell tracking in living animals.

Biases, Risks, and Limitations
Potential Biases
- The models may exhibit biases present in the training data. They were trained with several cell types and by simulating diverse imaging conditions, but may not generalize to your cell type if it is distinct from the ones included in training.
Limitations
- The models' performance may degrade for imaging modalities and conditions that are significantly different from the training distribution (e.g., very low SNR or spatial resolution).
- The models do not report uncertainty and may produce confident-looking (unambiguous) output even in regions of high uncertainty.
Caveats and Recommendations
- Review and validate outputs generated by the model.
- Experiment with test-time transformations (e.g., zoom, intensity inversion, gamma correction) to better match the input data with the models' regime of validity. See the demo on Hugging Face for illustration and the short sketch after this list.
- We are committed to advancing the responsible development and use of artificial intelligence. Please follow our Acceptable Use Policy when using the model.
- Should you have any security or privacy issues or questions related to the models, please reach out to our team at security@chanzuckerberg.com or privacy@chanzuckerberg.com respectively.
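A minimal sketch of such test-time transformations with NumPy and SciPy is given below. The zoom factor and gamma value are arbitrary examples, not recommended settings, and this is not the code used in the Hugging Face demo.

```python
# Test-time input transformations to move an image toward the training regime
# (illustrative values; tune zoom, inversion, and gamma for your own data).
import numpy as np
from scipy.ndimage import zoom

image = np.random.rand(512, 512).astype(np.float32)  # placeholder label-free input

rescaled = zoom(image, 1.4, order=1)                 # match magnification / pixel size
inverted = rescaled.max() - rescaled                 # flip contrast polarity if needed
normalized = (inverted - inverted.min()) / (np.ptp(inverted) + 1e-8)
gamma_corrected = normalized ** 0.7                  # adjust the intensity distribution
```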
Acknowledgements
This work is supported by the intramural program of the Chan Zuckerberg Biohub San Francisco. The training and evaluation of the models were enabled by the Bruno high-performance computing system, maintained by the Scientific Compute Platform at CZ Biohub SF.
If you have recommendations for this model card please contact virtualcellmodels@chanzuckerberg.com.