Nebius AI monthly digest, January 2024

This past month, we explored the possibilities of K8s and JupyterLab in ML, announced our first live webinar featuring Anna Veronika Dorogush, founder and CEO of Recraft, and updated the documentation with handy tutorials on testing GPU clusters.

Platform and products

  • How to optimize K8s for ML?
    Follow the steps of Boris, our Solutions Architect, as he sets up a GPU-enabled Kubernetes cluster with Terraform. This is perfect for DevOps engineers, data scientists, and anyone who’s into cloud computing or works with ML workloads.

  • Model fine-tuning with JupyterLab
    Explore how to quickly run model fine-tuning with the famous JupyterLab on our platform. The tutorial once again features Nikita, a Product Manager at Nebius AI.

  • Live webinar: introduction to the platform
    This will be our first live webinar, where Anna Veronika Dorogush, founder and CEO of Recraft, will share her company’s experience with heavy model training for generative AI on Nebius infrastructure. Also, Andrew and Levon from Nebius will introduce the platform’s capabilities. Join us on February 29th.

Docs and blog

  • Creating an InfiniBand GPU cluster
    If you use multiple Compute Cloud VMs with GPUs, you can group them into an InfiniBand GPU cluster for faster model training. To create one, refer to this guide. A cluster can be created via the console or CLI, either when creating its first VM or afterwards.

  • Testing InfiniBand connection between GPUs
    Efficient training on GPU clusters requires a fast and stable InfiniBand connection between the GPUs. See the guides on checking the connection state and running the NCCL performance tests developed by NVIDIA.

  • Which AI conferences to attend in 2024?
    Recently, we had an insightful discussion with our machine learning experts at Nebius AI. From their wealth of hands-on experience in developing LLMs, generative networks, and other cutting-edge tech, they’ve curated this list of must-attend conferences.

author
Nebius AI team