![](https://assets.nebius.ai/events/webinar-february-22/header-13.jpg)
MLOps Podcast: Handling Multi-Terabyte LLM Checkpoints
MLOps Community is the world’s largest community dedicated to addressing the unique technical and operational challenges of production machine learning systems.
The talk is a gentle introduction to LLM checkpointing: why it is hard and how large the checkpoints get. It covers tips and tricks for saving and loading multi-terabyte checkpoints, and how to choose a cloud storage option for checkpointing.
With Simon Karasik, machine learning engineer at Nebius AI, and Demetrios Brinkmann, chief happiness engineer at MLOps Community.
![](https://assets.nebius.ai/events/mlops-podkast/cover-podkast-handling-multi-terabyte-llm-checkpoints-1.jpg)