NOTE
The GitHub repository with the implementation and requirements can be found here.
DPLM2
Synthyra DPLM2 checkpoints are HuggingFace AutoModel compatible and include FastPLMs embedding helpers.
Supported models
model_dict = {
    "Synthyra/DPLM2-150M": "airkingbd/dplm2_150m",
    "Synthyra/DPLM2-650M": "airkingbd/dplm2_650m",
    "Synthyra/DPLM2-3B": "airkingbd/dplm2_3b",
}
Use with transformers
import torch
from transformers import AutoModel, AutoModelForMaskedLM
model_path = "Synthyra/DPLM2-150M"
# Base model: returns per-token hidden states.
model = AutoModel.from_pretrained(model_path, trust_remote_code=True, dtype=torch.float16).eval()
tokenizer = model.tokenizer
batch = tokenizer(["MPRTEIN", "MSEQWENCE"], padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state
# Masked-LM variant: returns per-token logits.
mlm = AutoModelForMaskedLM.from_pretrained(model_path, trust_remote_code=True, dtype=torch.float16).eval()
with torch.no_grad():
    logits = mlm(**batch).logits
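A common next step is to turn the per-token hidden states into one vector per protein. The sketch below shows masked mean pooling on toy tensors that stand in for `model(**batch).last_hidden_state` and `batch["attention_mask"]`. The helper name `masked_mean_pool` is hypothetical and is not part of the DPLM2 or FastPLMs API; it also averages over any special tokens the tokenizer adds, which is a design choice you may want to change.

```python
import torch


def masked_mean_pool(hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, ignoring padded positions.

    Hypothetical helper for illustration; not part of the model's API.
    """
    hidden = hidden.float()                               # pool in float32 even if the model ran in float16
    mask = attention_mask.unsqueeze(-1).to(hidden.dtype)  # (batch, seq_len, 1)
    summed = (hidden * mask).sum(dim=1)                   # (batch, hidden_dim)
    counts = mask.sum(dim=1).clamp(min=1.0)               # guard against all-padding rows
    return summed / counts


# Toy stand-ins: 2 sequences, 3 positions, hidden size 2.
# The last position of the first sequence is padding and must not affect the mean.
hidden = torch.tensor(
    [[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
     [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]]
)
attention_mask = torch.tensor([[1, 1, 0], [1, 1, 1]])

pooled = masked_mean_pool(hidden, attention_mask)
print(pooled.shape)
```

With real model outputs you would pass `hidden` and `batch["attention_mask"]` from the snippet above in place of the toy tensors.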
Experimental test-time training
TTT is disabled by default. Normal DPLM2 inference, embeddings, logits, and
state_dict() keys are unchanged unless you explicitly call model.ttt(...).
The current implementation is experimental and trains only local LoRA adapters
on the PLM backbone with masked language modeling on the test protein. It can
help some difficult proteins, but it adds test-time compute and can degrade
already confident predictions.
metrics = mlm.ttt(
    seq="MSTNPKPQRKTKRNT",
    ttt_config={"steps": 3, "ags": 1, "batch_size": 1},
)
mlm.ttt_reset()
print(metrics["losses"])
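To make the "masked language modeling on the test protein" objective more concrete, here is a minimal, library-free sketch of how one masked training example could be built from a single sequence. It is not the implementation behind `model.ttt(...)`: the function name `mask_sequence`, the 15% mask rate, and the `"<mask>"` token string are all assumptions chosen for illustration, and the actual masking schedule may differ.

```python
import random


def mask_sequence(seq, mask_rate=0.15, mask_token="<mask>", seed=0):
    """Hide a random subset of residues and keep the originals as labels.

    Hypothetical illustration of a masked-LM training example; the mask rate
    and mask token are assumptions, not values taken from the DPLM2 code.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(seq) * mask_rate))  # always mask at least one residue
    positions = rng.sample(range(len(seq)), n_mask)

    tokens = list(seq)
    labels = [None] * len(seq)  # None means "not scored by the loss"
    for pos in positions:
        labels[pos] = tokens[pos]  # the model is asked to recover this residue
        tokens[pos] = mask_token
    return tokens, labels


tokens, labels = mask_sequence("MSTNPKPQRKTKRNT")
print(tokens)
print(labels)
```

Each test-time training step would then score the model only at the masked positions, which is why the procedure needs no labels beyond the test sequence itself.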
DPLM2 modality types
DPLM2 infers type_ids automatically from input_ids and attention_mask when they are not provided.
Attention backends
sdpa (PyTorch Scaled Dot Product Attention) is the default.
| Backend | Key | Notes |
|---|---|---|
| PyTorch SDPA | "sdpa" | Default. Exact numerics, stable on all hardware. |
| Flash Attention | "kernels_flash" | Fastest on Ampere/Hopper GPUs. Requires pip install kernels (pre-built, so no hours-long compilation). Outputs are not bitwise identical to SDPA due to online softmax reordering; differences are often small but not guaranteed to be inconsequential, so use "sdpa" if exact numerics matter. |
| Flex Attention | "flex" | Skips padding tokens via block mask, which is faster on variable-length batches. Near-exact numerics. First use compiles a Triton kernel (30–120 s). Best combined with torch.compile. |
| Auto | "auto" | Picks the best available: kernels_flash → flex → sdpa. |
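The fallback order for "auto" can be pictured as a simple priority lookup. The sketch below is only an illustration of that order, not the resolution code DPLM2 actually runs: `resolve_backend` and the `available` argument are hypothetical names, and how availability is detected in practice is not specified here.

```python
from typing import Iterable

# Priority order for "auto" as stated in the table: kernels_flash, then flex, then sdpa.
AUTO_PRIORITY = ("kernels_flash", "flex", "sdpa")


def resolve_backend(requested: str, available: Iterable[str]) -> str:
    """Return the backend key to use, given which backends are usable.

    Hypothetical helper for illustration; not part of the DPLM2 API.
    """
    usable = set(available)
    if requested == "auto":
        # Walk the priority list and take the first backend that is usable.
        for name in AUTO_PRIORITY:
            if name in usable:
                return name
        raise RuntimeError("no attention backend is available")
    if requested not in usable:
        raise ValueError(f"attention backend {requested!r} is not available")
    return requested


# Example: flash kernels are not installed, so "auto" falls back to flex.
print(resolve_backend("auto", {"flex", "sdpa"}))
```

Requesting a specific key bypasses the fallback, which matches the advice to pin "sdpa" when exact numerics matter.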
Set via config before loading, or change on the model after loading (DPLM2 propagates the change to all attention layers immediately):
from transformers import AutoConfig, AutoModel
# Option 1: set before loading
config = AutoConfig.from_pretrained("Synthyra/DPLM2-150M", trust_remote_code=True)
config.attn_backend = "flex"
model = AutoModel.from_pretrained("Synthyra/DPLM2-150M", config=config, trust_remote_code=True)
# Option 2: set after loading
model = AutoModel.from_pretrained("Synthyra/DPLM2-150M", trust_remote_code=True)
model.attn_backend = "flex" # propagates to all attention layers in-place
Embed datasets
All DPLM2 models inherit EmbeddingMixin, so you can call model.embed_dataset(...) directly.
Citations
@article{wang2024dplm2,
title={DPLM-2: A Multimodal Diffusion Protein Language Model},
author={Wang, Xinyou and Ye, Zaixiang and Huang, Fei and Cao, Dongyan and Liang, Shujian and Huang, Liang},
journal={arXiv preprint arXiv:2410.13782},
year={2024}
}
@misc{FastPLMs,
author={Hallee, Logan and Bichara, David and Gleghorn, Jason P.},
title={FastPLMs: Fast, efficient, protein language model inference from Huggingface AutoModel.},
year={2024},
url={https://huggingface.co/Synthyra/ESMplusplus_small},
DOI={10.57967/hf/3726},
publisher={Hugging Face}
}
@article{dong2024flexattention,
title={Flex Attention: A Programming Model for Generating Optimized Attention Kernels},
author={Dong, Juechu and Feng, Boyuan and Guessous, Driss and Liang, Yanbo and He, Horace},
journal={arXiv preprint arXiv:2412.05496},
year={2024}
}
@inproceedings{paszke2019pytorch,
title={PyTorch: An Imperative Style, High-Performance Deep Learning Library},
author={Paszke, Adam and Gross, Sam and Massa, Francisco and Lerer, Adam and Bradbury, James and Chanan, Gregory and Killeen, Trevor and Lin, Zeming and Gimelshein, Natalia and Antiga, Luca and Desmaison, Alban and K{\"o}pf, Andreas and Yang, Edward and DeVito, Zach and Raison, Martin and Tejani, Alykhan and Chilamkurthy, Sasank and Steiner, Benoit and Fang, Lu and Bai, Junjie and Chintala, Soumith},
booktitle={Advances in Neural Information Processing Systems 32},
year={2019}
}