Diffusion Language Models: Turning ModernBERT into an instruct-tuned Diffusion LLM

Video
Some early experiments in fine-tuning ModernBERT into a masked diffusion LLM, with plenty of room to explore further.
Published: June 16, 2025

Inference notebook: https://colab.research.google.com/drive/1hMV0OBpmJL7L5yIEtkeeUz-7rB1buFmg?usp=sharing
Training notebook: https://colab.research.google.com/drive/1D82ULU5dUyJKPnj2oUxtfJeWTB1sVds_?usp=sharing
Model on HF: https://huggingface.co/johnowhitaker/modernbert-diffusion/blob/main/README.md
LLaDA paper ('Large Language Diffusion Models'): https://arxiv.org/pdf/2502.09992