CryoLens

Version v0.0.1 released 05 Nov 2025

License

Repository

Developed By

Kyle Harrington (Chan Zuckerberg Initiative)
Alan Lowe (Chan Zuckerberg Initiative)

CryoLens is a generative model trained on a large-scale synthetic dataset that uses interpretable 3D representations to support particle analysis in cryo-electron tomography.

Model Details

Model Architecture

CryoLens is a Variational Autoencoder (VAE) designed for 3D reconstruction and representation learning of molecular structures from cryoET data. The term "3D reconstruction" is used here in its computer-vision sense. In the cryoET field, by contrast, structural reconstructions are derived through sophisticated, well-validated methods that do more than generate interpretable 3D representations.

Architecture Components:

Encoder: 3D convolutional neural network with 4 layers (channels: 8→16→32→64), stride-2 convolutions with ReLU activations.
Latent Space: Configurable latent dimensions (default: 40 dimensions) with reparameterization for variational inference.
Pose Estimation: Dedicated linear layer predicting 4-channel pose representation (axis-angle format).
Decoder: Segmented Gaussian Splatting decoder with 768 splats, rendering 3D density volumes.
Latent Ratio: 0.8 (80% of latent dimensions used for rotation-invariant structure).
Output: 48³ voxel volumes.
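
The component list above implies some simple shape arithmetic, sketched below. The kernel size (3) and padding (1) are assumptions, since only the stride and channel counts are stated; the helper names are hypothetical and not part of the CryoLens code base.

```python
def conv_output_size(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Output length along one axis for a standard strided convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_shapes(input_size: int = 48, channels=(8, 16, 32, 64)):
    """Return (channels, depth, height, width) after each stride-2 layer."""
    shapes = []
    size = input_size
    for ch in channels:
        size = conv_output_size(size)  # assumed kernel=3, padding=1
        shapes.append((ch, size, size, size))
    return shapes


def split_latent(latent_dims: int = 40, latent_ratio: float = 0.8):
    """Split the latent vector into the rotation-invariant part and the rest."""
    invariant = round(latent_dims * latent_ratio)
    return invariant, latent_dims - invariant


shapes = encoder_shapes()
invariant_dims, remaining_dims = split_latent()
for shape in shapes:
    print(shape)
print(invariant_dims, remaining_dims)
```

Under these assumptions the spatial size halves at each layer (48, 24, 12, 6, 3), and the default latent ratio assigns 32 of the 40 latent dimensions to rotation-invariant structure.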

Input/Output:

Input: 48×48×48 voxel density volumes (subvolumes from cryoET tomograms).
Output: Reconstructed 48×48×48 density volumes + Gaussian splats + latent embeddings + pose estimates.
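
For readers who want to apply a predicted pose, the following sketch converts an axis-angle vector into a 3×3 rotation matrix using Rodrigues' formula. The channel ordering (angle, x, y, z) and the use of radians are assumptions; the model card only states that the pose has 4 channels in axis-angle format.

```python
import math


def axis_angle_to_matrix(pose):
    """Convert (angle, x, y, z) to a 3x3 rotation matrix (Rodrigues' formula).

    Assumed layout: angle in radians first, then the rotation axis.
    """
    angle, x, y, z = pose
    norm = math.sqrt(x * x + y * y + z * z)
    if norm < 1e-12:
        # Degenerate axis: fall back to the identity rotation.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def rotate(matrix, vector):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(row[i] * vector[i] for i in range(3)) for row in matrix]


# A quarter turn about the z axis maps the x axis onto the y axis.
quarter_turn = axis_angle_to_matrix((math.pi / 2, 0.0, 0.0, 1.0))
rotated = rotate(quarter_turn, [1.0, 0.0, 0.0])
```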

Parameters

15-20 million depending on hyperparameters

Model Card Authors

Kyle Harrington and Alan Lowe (CZI)

Primary Contact Email

Kyle Harrington kharrington@chanzuckerberg.com

To submit feature requests or report issues with the model, please open an issue on the GitHub repository.

System Requirements

GPU

Intended Use

Primary Use Cases

Molecular structure embedding generation: Extract latent representations from cryoET subtomograms for clustering and comparison.
Particle classification: Identify molecular structures in tomograms based on learned embeddings.
Interpretable 3D reconstruction: Generate 3D density volumes of the input structures.
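
As an illustration of the embedding-based use cases above, the sketch below compares a query embedding against labeled reference embeddings using cosine similarity and assigns the closest label. The vectors, labels, and function names are hypothetical toy values, and cosine similarity is one possible choice of metric rather than the method used in the CryoLens evaluation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(embedding, references):
    """Return the label whose reference embedding is most similar."""
    return max(references, key=lambda label: cosine_similarity(embedding, references[label]))


# Toy 3-dimensional embeddings; real CryoLens embeddings default to 40 dimensions.
references = {
    "structure_a": [1.0, 0.0, 0.2],
    "structure_b": [0.0, 1.0, 0.1],
}
query = [0.9, 0.1, 0.3]
predicted = classify(query, references)
```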

Out-of-Scope or Unauthorized Use Cases

Use on experimental cryoET data.
Clinical or diagnostic applications.
Critical infrastructure or safety-critical systems.
Any use prohibited by the MIT License or applicable laws.

Training Data

Dataset: TomoTwin synthetic dataset with simulated cryoET subtomograms.
Size: 103 protein structures from the Protein Data Bank (PDB), multiple particles per structure.
SNR level: 5.0
Volume size: 48×48×48 voxels (10 Å per voxel)
Features: Missing wedge artifacts, realistic noise models, uniform orientation sampling.
Software: Simulated data was produced with PolNet and managed with copick.
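
The volume size and voxel spacing above determine the physical extent of each subvolume and the finest resolution the sampling can represent. A small sketch of that arithmetic follows; the function names are illustrative, and the Nyquist limit (twice the voxel size) is a general sampling bound rather than a measured property of the model.

```python
VOXELS_PER_EDGE = 48
VOXEL_SIZE_ANGSTROM = 10.0


def box_edge_angstrom(voxels: int, voxel_size: float) -> float:
    """Physical edge length of a cubic subvolume."""
    return voxels * voxel_size


def nyquist_resolution_angstrom(voxel_size: float) -> float:
    """Finest resolvable spacing for a given sampling: two voxels."""
    return 2.0 * voxel_size


box_edge = box_edge_angstrom(VOXELS_PER_EDGE, VOXEL_SIZE_ANGSTROM)
nyquist = nyquist_resolution_angstrom(VOXEL_SIZE_ANGSTROM)
voxel_count = VOXELS_PER_EDGE ** 3
print(box_edge, nyquist, voxel_count)
```

Each subvolume therefore spans 480 Å per edge and contains 110,592 voxels, and the sampling cannot represent detail finer than 20 Å.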

Training Procedure

Curriculum Learning: Alternating between sequential and random structure sampling (10 structures per phase, 100 epochs per phase).
Loss Function: Missing wedge reconstruction loss + KL divergence + Contrastive affinity loss.
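
One possible reading of the curriculum described above is sketched below: phases alternate between taking the next 10 structures in order and drawing 10 structures at random, with 100 epochs per phase. This is an interpretation for illustration only; the actual schedule is defined in the training code, and the structure identifiers and function name here are hypothetical.

```python
import random


def curriculum_phases(structure_ids, num_phases, per_phase=10, epochs_per_phase=100, seed=0):
    """Build a schedule that alternates sequential and random structure sampling."""
    rng = random.Random(seed)
    cursor = 0  # position of the next sequential block
    phases = []
    for phase in range(num_phases):
        if phase % 2 == 0:
            # Sequential phase: next block of structures, wrapping around.
            chosen = [structure_ids[(cursor + i) % len(structure_ids)] for i in range(per_phase)]
            cursor = (cursor + per_phase) % len(structure_ids)
            mode = "sequential"
        else:
            # Random phase: sample without replacement from all structures.
            chosen = rng.sample(structure_ids, per_phase)
            mode = "random"
        phases.append({"mode": mode, "structures": chosen, "epochs": epochs_per_phase})
    return phases


# Placeholder identifiers for the 103 training structures.
structure_ids = [f"structure_{i:03d}" for i in range(103)]
phases = curriculum_phases(structure_ids, num_phases=4)
```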

Training Code

https://github.com/czi-ai/cryolens

Training Hyperparameters

Latent dimensions: 40
Number of Gaussian splats: 768
Batch size: 128
Learning rate: 1×10⁻⁴
Beta (KL weight): 0.001
Gamma (similarity weight): 10.0
Latent ratio: 0.8
Contrastive margin: 1.5
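
To show how the Beta and Gamma weights might enter the objective, the sketch below uses the standard VAE reparameterization and the closed-form KL divergence to a unit Gaussian, then combines the three loss terms as a weighted sum. The weighted-sum form is an assumption, and the reconstruction and contrastive terms are passed in as plain numbers because their exact definitions are not given in this model card.

```python
import math

BETA = 0.001   # KL weight from the hyperparameter list
GAMMA = 10.0   # similarity weight from the hyperparameter list


def reparameterize(mu, logvar, eps):
    """z = mu + sigma * eps, with sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * e for m, lv, e in zip(mu, logvar, eps)]


def kl_divergence(mu, logvar):
    """KL( N(mu, sigma^2) || N(0, 1) ) summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))


def total_loss(reconstruction, kl, contrastive, beta=BETA, gamma=GAMMA):
    """Assumed combination: reconstruction + beta * KL + gamma * contrastive."""
    return reconstruction + beta * kl + gamma * contrastive


z = reparameterize(mu=[1.0], logvar=[0.0], eps=[2.0])
example_kl = kl_divergence(mu=[1.0], logvar=[0.0])
example_total = total_loss(reconstruction=1.0, kl=2.0, contrastive=0.5)
```

With these weights the KL term contributes very little relative to the contrastive term, which is consistent with a small Beta being used to keep the latent space informative.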

Data Sources

The following dataset was used for training:

CryoLens v1.0.0

Performance Metrics

Metrics

Classification accuracy: Evaluated using 10-fold cross-validation on particle classification benchmark.
3D Reconstruction quality: Cross-correlation and Fourier shell correlation (FSC) curves.
Sample contamination detection: Detection of mislabeled particles from a set of candidates.
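
As a minimal illustration of the cross-correlation metric listed above, the sketch below computes a zero-mean normalized cross-correlation between two volumes that have been flattened to lists of voxel values. It is a generic formulation, not necessarily the exact variant used in the CryoLens evaluation, and it does not cover the Fourier shell analysis.

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two flattened volumes, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("volumes must contain the same number of voxels")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if denominator == 0.0:
        return 0.0  # a constant volume carries no correlation signal
    return numerator / denominator


# Toy example: a volume compared with a scaled and offset copy of itself.
volume = [0.0, 1.0, 4.0, 9.0, 16.0]
rescaled = [2.0 * v + 3.0 for v in volume]
score = normalized_cross_correlation(volume, rescaled)
```

Because the measure subtracts the mean and normalizes by the variance, it is insensitive to global intensity scaling and offset between the input and the reconstruction.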

Evaluation Datasets

Evaluation Results

Evaluation results are available in the GitHub repository.

Biases, Risks, and Limitations

Potential Biases

The synthetic cryoET training data does not have a realistic noise model and does not reflect typical cryoET conditions. Notably, training emphasized performance at SNR ~5.0.
Results may not generalize equally to all structure types.

Risks

The model has not been adequately tested on experimental data.
May introduce reconstruction artifacts, particularly at low SNR.
May produce overconfident predictions on ambiguous inputs.

Limitations

Fixed 48³ voxel input size.
Effective resolution constrained by ~10 Å voxel size.
Requires GPU resources for practical inference.

Caveats and Recommendations

Review and validate outputs generated by the model.
This is a research tool; biological conclusions should be drawn with appropriate validation.
Should you have any security or privacy issues or questions related to this model, please reach out to our team at security@chanzuckerberg.com or privacy@chanzuckerberg.com, respectively.

Acknowledgements

CryoLens team: Kyle Harrington, Utz Ermel, Ritvik Vasan, Ashley Anderson, Ryan Lim, Mikala Caton, Alan R. Lowe
Support: Chan Zuckerberg Initiative
Synthetic data: PolNet and TomoTwin teams for their cryoET simulator and pioneering work on synthetic data for training foundation models, respectively.