MonjuDetectHM

Version v0.1.0 released 01 Apr 2025

The MonjuDetectHM model is an ensemble of three 3D segmentation models trained to predict the presence and position of the following protein complexes: apoferritin, beta-galactosidase, ribosome, thyroglobulin, and virus-like particle. The models locate these protein complexes by predicting a heatmap over areas likely to contain the particles. MonjuDetectHM is the 7th-place solution of the CZII CryoET Object Identification Kaggle competition for evaluating cryoET particle-picking algorithms. It was trained on the simulated and experimental datasets hosted by the Kaggle competition.

Developed By

Koki Kobayashi¹

¹ Institute of Science Tokyo

Model Details

Finetuned From Model

Models 1 and 3 (ResNet50d-based models) use the pre-trained weights from timm (ra2_in1k), while Model 2 (EfficientNetv2-M) uses the pre-trained weights from timm_3d (in21k_ft_in1k).

Model Architecture

This model is an ensemble of three models, listed below.

  • Model 1: 3D U-Net with a 3D ResNet50d backbone. The first four encoder stages (output strides 2 through 16) are used as inputs to the decoder.
  • Model 2: 3D U-Net with a 3D EfficientNetV2-M backbone. Outputs from all five stages are passed to the decoder.
  • Model 3: 3D DeepLabV3+ with a 3D ResNet50d backbone. The last stride-2 convolution layer is replaced with a stride-1, dilation_rate=1 convolution layer, changing the output stride from 32 to 16.

Parameters

  • Model 1: 61M
  • Model 2: 77M
  • Model 3: 92M

Citation

Peck, A., et al., (2025) A Realistic Phantom Dataset for Benchmarking Cryo-ET Data Annotation. Nature Methods. DOI: 10.1101/2024.11.04.621686.

Primary Contact Email

kobayashi.k.f785@m.isct.ac.jp

To submit feature requests or report issues with the model, please open an issue on the GitHub repository.

System Requirements

A GPU with sufficient memory (16 GB suggested) is required. The model has been tested on a personal computer with a Ryzen 5 5600X and an RTX 4070 Ti Super, and on Kaggle Notebooks with GPU acceleration (P100 and 2x T4). Running this model on a CPU is likely impractical due to its size and complexity.

Intended Use

Primary Use Cases

  • Protein complex detection within cryoET tomograms (e.g., apoferritin, beta-galactosidase, ribosome, thyroglobulin, and virus-like particles).

Out-of-Scope or Unauthorized Use Cases

Do not use the model for the following purposes:

  • Use that violates applicable laws, regulations (including trade compliance laws), or third-party rights such as privacy or intellectual property rights.
  • Any use that is prohibited by the MIT License.
  • Any use that is prohibited by the Acceptable Use Policy.

Training Details

Training Data

Models were trained in two stages using simulated and experimental datasets provided by the Chan Zuckerberg Imaging Institute (CZII) for the Kaggle competition.

  • Stage 1: Models were trained on tomograms from 27 simulated runs of dimensions (630, 630, 200) with seven types of labelled particles.
  • Stage 2: Models were trained on tomograms from seven experimental runs of dimensions (630, 630, 184) with ground-truth annotations for six particle types.

Training Procedure

In both stages, the models were trained on patches of size (128, 128, 128) cropped from the original tomograms. Minimal preprocessing was applied, consisting only of percentile clipping and pixel-value rescaling. Heavy augmentation was employed to compensate for the scarcity of data.
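The percentile-clip-and-rescale preprocessing described above can be sketched as follows. The percentile bounds (0.5 / 99.5) are illustrative placeholders; the exact values used for MonjuDetectHM are not stated in this card.

```python
import numpy as np

def preprocess(tomogram, low=0.5, high=99.5):
    """Clip voxel intensities to a percentile range, then rescale to [0, 1].

    `low`/`high` are assumed percentile bounds, not the model's actual values.
    """
    lo, hi = np.percentile(tomogram, [low, high])
    clipped = np.clip(tomogram, lo, hi)
    # Small epsilon guards against division by zero on constant patches.
    return (clipped - lo) / (hi - lo + 1e-8)
```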

The target heatmaps were created by overlaying a Gaussian kernel at each particle location, with a sigma value proportional to the particle size. Beta-amylase annotations were not used to train the model.
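A minimal sketch of the Gaussian heatmap target described above. In practice `sigma` would be set proportional to each particle's radius; the proportionality constant and the use of a per-voxel maximum to merge overlapping kernels are assumptions, not details from the card.

```python
import numpy as np

def gaussian_heatmap(shape, centers, sigma):
    """Render a target heatmap by placing an isotropic 3D Gaussian at each
    particle center. Overlapping kernels are merged with a voxel-wise max."""
    zz, yy, xx = np.indices(shape)
    heatmap = np.zeros(shape, dtype=np.float32)
    for cz, cy, cx in centers:
        d2 = (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2
        heatmap = np.maximum(heatmap, np.exp(-d2 / (2.0 * sigma ** 2)))
    return heatmap
```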

Training Code

Training code is available in the MonjuDetectHM GitHub repository.

Speeds, Sizes, Times

All measurements were taken on a machine with a Ryzen 5 5600X and an RTX 4070 Ti Super. Training time includes all logging steps and validation runs between epochs. Inference was done on patches of size (192, 128, 128) with batch size 4. 4x test-time augmentation was employed, effectively quadrupling the latency.
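The 4x test-time augmentation could be implemented along these lines. The specific augmentations used by MonjuDetectHM are not listed in this card; axis flips are assumed here because they preserve patch shape and are trivially invertible.

```python
import numpy as np

def predict_with_tta(model, patch):
    """Average model predictions over four flip-based augmentations.

    `model` is any callable mapping an array to a same-shaped array.
    The flip axes below are an assumption, not the model's documented TTA.
    """
    flip_sets = [(), (1,), (2,), (1, 2)]  # () = no flip; z-axis kept fixed
    preds = []
    for axes in flip_sets:
        aug = np.flip(patch, axes) if axes else patch
        out = model(aug)
        # Undo the flip so all predictions are in the original orientation.
        preds.append(np.flip(out, axes) if axes else out)
    return np.mean(preds, axis=0)
```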

| Description | Pretraining time | Training time | Inference latency (fp16) |
| ----------- | ---------------- | ------------- | ------------------------ |
| Model 1     | 4.5 h            | 2.5 h         | -                        |
| Model 2     | 6.5 h            | 3.5 h         | -                        |
| Model 3     | 3.5 h            | 3.5 h         | -                        |
| Ensemble    | 14.5 h           | 9.5 h         | 66 s/tomogram            |

Training Hyperparameters

  • Optimizer: AdamW
  • Scheduler: CosineAnnealing with linear warmup
  • Loss: Weighted Binary Cross-Entropy
  • Precision: fp32
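The cosine-annealing-with-linear-warmup schedule listed above can be expressed as a step-to-learning-rate function. The step counts and learning rates here are placeholders, not the values used in training.

```python
import math

def lr_at_step(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Linear warmup to `base_lr`, then cosine annealing down to `min_lr`.

    All hyperparameter values passed in are assumed examples; the card does
    not state the actual warmup length or learning rates.
    """
    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

In PyTorch, the same shape is commonly obtained by combining a linear-warmup scheduler with `CosineAnnealingLR`.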

Data Sources

Training data are available at the CZII CryoET Object Identification Challenge deposition site.

Performance Metrics

Metrics

Models were evaluated using the F-beta metric with beta=4, which prioritizes recall over precision. The official metric code can be found in the Competition Metric Notebook. A prediction was counted as a true positive when it fell within a sphere of radius equal to half the target particle's radius. The MonjuDetectHM model scored 0.77708 on the final private leaderboard, ranking 7th.
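A simplified sketch of the evaluation: greedily match predictions to ground-truth particles within half the particle radius, then compute F-beta with beta=4. This is an illustration of the metric's structure, not the official competition implementation (which should be taken from the Competition Metric Notebook).

```python
import numpy as np

def fbeta_score(pred_points, gt_points, radius, beta=4.0):
    """Greedy distance-based matching followed by the F-beta formula.

    A prediction counts as a true positive if it lies within 0.5 * radius
    of an unmatched ground-truth point. Simplified, not the official metric.
    """
    threshold = 0.5 * radius
    matched = set()
    tp = 0
    for p in pred_points:
        for i, g in enumerate(gt_points):
            if i not in matched and np.linalg.norm(np.asarray(p, float) - np.asarray(g, float)) <= threshold:
                matched.add(i)
                tp += 1
                break
    fp = len(pred_points) - tp
    fn = len(gt_points) - tp
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0
```

With beta=4, a false negative costs sixteen times as much as a false positive, which is why the ensemble is tuned toward high recall.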

Evaluation Datasets

The evaluation datasets can be found at the CryoET Data Portal deposition site for the Kaggle competition.

Evaluation Results

| Description | Public Score | Private Score |
| ----------- | ------------ | ------------- |
| Model 1     | 0.77320      | 0.76665       |
| Model 2     | 0.77024      | 0.76478       |
| Model 3     | 0.76824      | 0.76057       |
| Ensemble    | 0.78351      | 0.77708       |

Biases, Risks, and Limitations

Potential Biases

  • The model ensemble may reflect biases present in the training data.
  • The model ensemble is trained to have a higher recall, at the cost of lower precision.
  • The model ensemble may produce less accurate results when particle distributions differ from those in the competition data.

Risks

Areas of risk may include, but are not limited to:

  • Inaccurate outputs or false positives in high contrast regions.
  • Potential misuse for incorrect biological interpretations.

Limitations

  • The model ensemble is not trained to detect particles not present in the training data.
  • The model ensemble architecture may not be suitable for detecting larger particles.

Caveats and Recommendations

  • Review and validate outputs generated by the model.
  • We are committed to advancing the responsible development and use of artificial intelligence. Please follow our Acceptable Use Policy when using the model.

Acknowledgements

We gratefully acknowledge the Chan Zuckerberg Initiative for providing the platform, dataset, and opportunity for this work. We also thank Kaggle for providing a platform and computational resources that aided in the development of this model.