Quickstart: Getting Started with the Rbio Model
Estimated time to complete: 15 minutes
Learning Goals
In this tutorial, you will learn how to:
- Load the Rbio model and its tokenizer from an S3 bucket.
- Query the model with a biological question using a system prompt to guide its reasoning.
- Interpret the model's structured output, including the reasoning trace (<think>) and the final answer (<answer>).
Prerequisites
To run this notebook, you will need the following:
Packages:
- transformers
- torch
- safetensors
- pandas
- boto3
- awscli
Hardware:
- A GPU is recommended for faster inference (this notebook can be run in Google Colab with a GPU runtime).
Introduction
Rbio is a conversational large language model designed for biological reasoning. It is built on the Qwen2.5-3B-Instruct model and has been post-trained to understand and respond to complex biological questions.
The model takes two main inputs:
- A system prompt that sets the context and persona for the model (e.g., acting as an expert biologist).
- A user question about a biological topic.
Its output is structured to reveal its reasoning process, providing a reasoning trace within <think> tags and the final answer within <answer> tags. This allows users to see how the model arrived at its conclusion, making it a powerful tool for hypothesis exploration and scientific inquiry. Optionally, users can experiment with different chain-of-thought techniques, asking the model to perform more intricate thinking actions, such as taking time to reflect on its answer and reporting what it learned from that reflection inside <reflect> tags.
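To make the structure concrete, here is an illustrative sketch of the two inputs and the tag-delimited response format described above (the prompt and question wording below are placeholders of our own, not taken from the model's documentation):
# Illustrative sketch: the two inputs the model expects and the shape of its output.
messages = [
    {"role": "system", "content": "You are an expert Biologist. Enclose your reasoning in "
                                  "<think> </think> tags and your final answer in <answer> </answer> tags."},
    {"role": "user", "content": "Does knocking down gene X in cell line Y change the expression of gene Z?"},
]
# A typical response has the form:
#   <think> step-by-step biological reasoning ... </think> <answer> yes / no ... </answer>
# and, if the system prompt asks for reflection, optionally also:
#   <reflect> re-evaluation of the reasoning, caveats, alternative hypotheses ... </reflect>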
1. Setup
First, let's install the necessary Python libraries.
%%capture --no-stderr
!pip install transformers==4.41.2 safetensors==0.4.3 torch boto3 pandas awscli
# Check for access to Rbio model weights
!aws s3 ls s3://czi-rbio/ --no-sign-request
Output:
PRE rbio_TF_ckpt/
PRE rbio_exp_data_ckpt/
# Explore the rbio_TF_ckpt checkpoint contents
!aws s3 ls s3://czi-rbio/rbio_TF_ckpt/ --no-sign-request
Output:
PRE global_step40000/
2025-08-14 23:45:19 605 added_tokens.json
2025-08-14 23:45:19 685 config.json
2025-08-14 23:45:19 243 generation_config.json
2025-08-14 23:48:01 16 latest
2025-08-14 23:48:01 1671853 merges.txt
2025-08-14 23:48:01 4957560304 model-00001-of-00002.safetensors
2025-08-14 23:48:03 1214366696 model-00002-of-00002.safetensors
2025-08-14 23:48:03 35581 model.safetensors.index.json
2025-08-14 23:48:03 16325 rng_state_0.pth
2025-08-14 23:48:03 16389 rng_state_1.pth
2025-08-14 23:48:04 16389 rng_state_2.pth
2025-08-14 23:48:04 16389 rng_state_3.pth
2025-08-14 23:48:04 16389 rng_state_4.pth
2025-08-14 23:48:04 16389 rng_state_5.pth
2025-08-14 23:48:04 16389 rng_state_6.pth
2025-08-14 23:48:04 16325 rng_state_7.pth
2025-08-14 23:48:05 1465 scheduler.pt
2025-08-14 23:48:05 613 special_tokens_map.json
2025-08-14 23:48:05 11422063 tokenizer.json
2025-08-14 23:48:06 7362 tokenizer_config.json
2025-08-14 23:48:06 69183 trainer_state.json
2025-08-14 23:48:06 8337 training_args.bin
2025-08-14 23:48:06 2776833 vocab.json
2025-08-14 23:48:06 33272 zero_to_fp32.py
Download Model Checkpoint
# Create the local directory first
!mkdir -p ./rbio_TF_ckpt/
# Define the local directory
local_dir = './rbio_TF_ckpt/'
# --- Model Weights ---
!aws s3 cp s3://czi-rbio/rbio_TF_ckpt/model.safetensors.index.json {local_dir} --no-sign-request
!aws s3 cp s3://czi-rbio/rbio_TF_ckpt/model-00001-of-00002.safetensors {local_dir} --no-sign-request
!aws s3 cp s3://czi-rbio/rbio_TF_ckpt/model-00002-of-00002.safetensors {local_dir} --no-sign-request
# --- Configuration & Tokenizer ---
!aws s3 cp s3://czi-rbio/rbio_TF_ckpt/config.json {local_dir} --no-sign-request
!aws s3 cp s3://czi-rbio/rbio_TF_ckpt/tokenizer.json {local_dir} --no-sign-request
!aws s3 cp s3://czi-rbio/rbio_TF_ckpt/tokenizer_config.json {local_dir} --no-sign-request
!aws s3 cp s3://czi-rbio/rbio_TF_ckpt/special_tokens_map.json {local_dir} --no-sign-request
Output:
download: s3://czi-rbio/rbio_TF_ckpt/model.safetensors.index.json to rbio_TF_ckpt/model.safetensors.index.json
download: s3://czi-rbio/rbio_TF_ckpt/model-00001-of-00002.safetensors to rbio_TF_ckpt/model-00001-of-00002.safetensors
download: s3://czi-rbio/rbio_TF_ckpt/model-00002-of-00002.safetensors to rbio_TF_ckpt/model-00002-of-00002.safetensors
download: s3://czi-rbio/rbio_TF_ckpt/config.json to rbio_TF_ckpt/config.json
download: s3://czi-rbio/rbio_TF_ckpt/tokenizer.json to rbio_TF_ckpt/tokenizer.json
download: s3://czi-rbio/rbio_TF_ckpt/tokenizer_config.json to rbio_TF_ckpt/tokenizer_config.json
download: s3://czi-rbio/rbio_TF_ckpt/special_tokens_map.json to rbio_TF_ckpt/special_tokens_map.json
Import Libraries
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.modeling_utils import load_sharded_checkpoint
import re
import boto3
import os
import pandas as pd
from getpass import getpass
import torch
import warnings
warnings.filterwarnings('ignore')
pd.set_option('display.max_colwidth', None)
2. Run Model Inference
Now we will define helper functions to download, load, and query the model.
def load_model_and_tokenizer(model_name, model_checkpoint_path, device):
    """Loads the base model and tokenizer, and applies the fine-tuned checkpoint."""
    # Load the base model from Hugging Face
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Load the fine-tuned weights from the local checkpoint directory
    if model_checkpoint_path is not None:
        print(f"Loading fine-tuned weights from {model_checkpoint_path}")
        load_sharded_checkpoint(model, model_checkpoint_path, strict=False)
    # Set tokenizer padding for batch processing
    tokenizer.padding_side = "left"
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    model.eval()
    model.to(device)
    return model, tokenizer
def ask_rbio(system_prompt, question, device, model, tokenizer):
    """Formats the prompt and generates a response from the model."""
    # Create the chat message format
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]
    # Apply the chat template
    texts = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    # Tokenize the input and move to the appropriate device (CPU/GPU)
    model_inputs = tokenizer(texts, return_tensors="pt", padding=True).to(device)
    # Generate a response
    generated_ids = model.generate(**model_inputs, max_new_tokens=1024, do_sample=False)
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    # Decode the response
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    # Parse the structured output using regex
    think_process = re.findall(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    reflection = re.findall(r"<reflect>(.*?)</reflect>", response, flags=re.DOTALL)
    answer = re.findall(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    # Return a dictionary with the parsed components
    parsed_response = {
        'think': think_process[0].strip() if think_process else "",
        'reflection': reflection[0].strip() if reflection else "",
        'answer': answer[0].strip() if answer else ""
    }
    return parsed_response
Download and Load the Model
Next, we'll set the model parameters and load the checkpoint we downloaded from S3 during setup. We are using the rbio_TF_ckpt checkpoint for this tutorial, which was fine-tuned with Transcriptformer as a soft verifier.
Loading the model may take a few minutes, as the model files are several gigabytes in size.
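Before loading, you can optionally verify that the files copied in the setup step are present locally (a minimal sanity check; the file list below is simply the subset we downloaded above):
import os

ckpt_dir = './rbio_TF_ckpt/'
expected_files = [
    'config.json', 'tokenizer.json', 'tokenizer_config.json', 'special_tokens_map.json',
    'model.safetensors.index.json',
    'model-00001-of-00002.safetensors', 'model-00002-of-00002.safetensors',
]
missing = [f for f in expected_files if not os.path.exists(os.path.join(ckpt_dir, f))]
print('Missing files:', missing if missing else 'none')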
# S3 bucket and model details
model_name = 'Qwen/Qwen2.5-3B-Instruct' # Base model from Hugging Face
model_checkpoint = 'rbio_TF_ckpt' # Fine-tuned checkpoint
local_dir = f'./{model_checkpoint}/'
# Set device to GPU if available, otherwise CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")
# Load the model and tokenizer
model, tokenizer = load_model_and_tokenizer(model_name, local_dir, device)
Ask Biological Questions
We'll now query the model with a couple of biological questions. We will use a standard system prompt that instructs the model to act as a biologist and provide a step-by-step reasoning trace.
system_prompt = ( "A conversation between a User and a Biologist. The user asks a question, "
"and the Biologist solves it. The biologist first thinks about the reasoning process in the mind and "
"then provides the user with the answer. The reasoning process and answer are enclosed within "
"<think> </think> and <answer> </answer> tags, respectively, i.e., <think> reasoning process here </think> <answer> answer here </answer>. "
"The Biologist will evaluate each step of this problem, using logical reasoning and evidence from the prompt.")
Example 1: Perturbation Prediction
#%%capture [--no-stderr]
question_pert = ("Is a knockdown of RNPS1 in rpe1 cells likely "
"to result in differential expression of SPARC? The answer is either yes or no.")
print("--- Perturbation Question ---")
print(f"Q: {question_pert}")
# Get the model's response
response_pert = ask_rbio(system_prompt, question_pert, device, model, tokenizer)
# Display the structured output
print("\n--- Reasoning Trace --- ")
print(response_pert['think'])
print("\n--- Final Answer ---")
print(response_pert['answer'])
Output:
--- Perturbation Question ---
Q: Is a knockdown of RNPS1 in rpe1 cells likely to result in differential expression of SPARC? The answer is either yes or no.
--- Reasoning Trace ---
When a gene is knocked down, its expression is reduced. If RNPS1 is knocked down, it means that the protein encoded by RNPS1 is less abundant. SPARC is a gene that encodes a protein, so if RNPS1 is knocked down, it could lead to a change in the expression of SPARC. However, without specific data on the regulatory relationship between RNPS1 and SPARC, we cannot definitively say whether the expression of SPARC will be increased, decreased, or remain unchanged. In some cases, the knockdown of one gene can lead to compensatory changes in the expression of other genes, but this is not always the case.
--- Final Answer ---
no
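As mentioned in the Introduction, you can also prompt the model to reflect on its answer. A minimal sketch, reusing the question above (the exact reflection instruction wording here is our own, not a prescribed prompt):
# Extend the system prompt to request an optional <reflect> block.
system_prompt_reflect = (
    system_prompt
    + " After giving the answer, the Biologist reflects on the reasoning and notes any caveats "
    "or alternative hypotheses inside <reflect> </reflect> tags."
)
response_reflect = ask_rbio(system_prompt_reflect, question_pert, device, model, tokenizer)
print("--- Reflection ---")
print(response_reflect['reflection'] or "(no reflection returned)")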
3. Model Outputs
The rbio model provides a structured response to facilitate interpretability. The key components are:
- <think>...</think>: This block contains the model's step-by-step reasoning process. It outlines the biological concepts, assumptions, and logical steps the model took to arrive at its conclusion. This is useful for understanding the 'why' behind the answer.
- <reflect>...</reflect>: (Optional) In some cases, particularly with more complex system prompts, the model may include a reflection block. This is where it re-evaluates its initial reasoning, considers alternative hypotheses, or incorporates additional biological context before settling on a final answer.
- <answer>...</answer>: This block contains the final, concise answer to the user's question, based on the preceding reasoning and reflection.
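For example, you can loop over the dictionary returned by ask_rbio and print whichever blocks are present (the reflection field stays empty unless the system prompt asks for it):
# Print each parsed component of the response, skipping empty blocks.
for key in ('think', 'reflection', 'answer'):
    if response_pert.get(key):
        print(f"--- {key} ---\n{response_pert[key]}\n")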
Contact and Acknowledgments
For issues with this quickstart, please contact virtualcellmodels@chanzuckerberg.com.
Responsible Use
We are committed to advancing the responsible development and use of artificial intelligence. Please follow our Acceptable Use Policy when engaging with our services.
Should you have any security or privacy issues or questions related to the services, please reach out to our team at security@chanzuckerberg.com or privacy@chanzuckerberg.com respectively.