Graphics accelerators (GPUs)
Compute Cloud provides graphics accelerators (GPUs) for different VM configurations. GPUs outperform CPUs at processing certain types of data and are well suited for machine learning (ML), artificial intelligence (AI), and 3D rendering tasks. You can manage GPUs and RAM directly from your VM.
To handle tasks that are too large for a single GPU, you can create GPU clusters.
Available GPUs
In Compute Cloud, the following GPUs are available:
- NVIDIA A100 (NVIDIA Ampere architecture) with 80 GB of HBM2.
- NVIDIA H100 (NVIDIA Hopper architecture) with 80 GB of HBM3.
- NVIDIA L4 (NVIDIA Ada Lovelace architecture) with 24 GB of GDDR6.
- NVIDIA L40 (NVIDIA Ada Lovelace architecture) with 48 GB of GDDR6.
- NVIDIA L40S (NVIDIA Ada Lovelace architecture) with 48 GB of GDDR6.
- NVIDIA V100 (NVIDIA Volta architecture) in two form factors: SXM with 32 GB of HBM2 and PCIe with 32 GB of HBM2.
Warning
GPUs run in TCC driver mode.
By default, many quotas for creating VMs with GPUs are set to zero. You can generate a request for a quota increase if you belong to the editors or admins group.
NVIDIA A100
The NVIDIA A100 GPUs are based on the NVIDIA Ampere® architecture and offer 80 GB of HBM2 RAM.
NVIDIA H100
The NVIDIA H100 GPU is powered by the NVIDIA Hopper architecture. It features fourth-generation Tensor Cores and can apply mixed FP8 and FP16 precision to dramatically accelerate AI calculations for transformer models.
Nebius AI provides NVIDIA H100 GPUs in the NVIDIA H100 SXM form factor. It offers 80 GB of HBM3 RAM and up to 3.35 TB/s memory bandwidth. NVIDIA H100 SXM is available on multiple platforms with identical performance and capabilities:
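To put the 3.35 TB/s figure in perspective, here is a rough, back-of-envelope estimate (not a benchmark) of the minimum time needed to read the full 80 GB of HBM3 once at the peak rate; real workloads reach only a fraction of peak bandwidth:

```python
# Back-of-envelope estimate: time to stream the full HBM3 capacity of an
# NVIDIA H100 SXM once at its peak memory bandwidth. Figures are taken
# from the text above; this is a theoretical lower bound, not a benchmark.

VRAM_GB = 80           # HBM3 capacity, GB
BANDWIDTH_GBPS = 3350  # peak memory bandwidth, GB/s (3.35 TB/s)

def full_sweep_time_ms(vram_gb: float, bandwidth_gbps: float) -> float:
    """Lower bound on the time (ms) to read all of VRAM once."""
    return vram_gb / bandwidth_gbps * 1000

print(f"{full_sweep_time_ms(VRAM_GB, BANDWIDTH_GBPS):.1f} ms")  # ~23.9 ms
```

In other words, even touching every byte of GPU memory takes only about 24 ms at peak rate, which is why memory-bandwidth-bound workloads such as large-model inference benefit from HBM3.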
- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type A)
- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type B)
- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type C)
For more information, go to the NVIDIA website.
NVIDIA L4, L40, and L40S
NVIDIA L4, NVIDIA L40, and NVIDIA L40S are inference-optimized GPUs powered by the NVIDIA Ada Lovelace architecture, featuring fourth-generation Tensor Cores and FP8 precision support.
Nebius AI provides:
- NVIDIA® L4 PCIe with Intel Ice Lake (standard-v3-l4): designed as a universal GPU that can handle workloads of various types, such as generative AI, visual computing, and graphics. NVIDIA L4 excels in video processing, delivering 120x higher AI video performance than CPUs. It offers 24 GB of GDDR6 RAM and up to 300 GB/s memory bandwidth.
- NVIDIA® L40 PCIe with Intel Ice Lake (standard-v3-l40): NVIDIA L40 delivers state-of-the-art capabilities for graphics and AI-enabled 2D, 3D, and video generation. It offers 48 GB of GDDR6 RAM and up to 864 GB/s memory bandwidth.
- NVIDIA® L40S PCIe with Intel Ice Lake (standard-v3-l40s): NVIDIA L40S is a multi-workload GPU that combines powerful AI compute with best-in-class graphics and media acceleration. It offers 48 GB of GDDR6 RAM and up to 864 GB/s memory bandwidth.
For more information, see the NVIDIA website.
NVIDIA V100
The NVIDIA V100 GPU is based on the NVIDIA Volta architecture and contains 5,120 CUDA cores for high-performance computing. It offers 32 GB of HBM2 RAM.
Nebius AI provides NVIDIA V100 GPUs in two form factors:
- NVIDIA V100 SXM offers 300 GB/s GPU interconnect bandwidth and can be used for single-node training. It is available on the NVIDIA® V100 NVLink with Intel Cascade Lake platform.
- NVIDIA V100 PCIe offers 32 GB/s GPU interconnect bandwidth and is best suited for inference tasks.
For more information, go to the NVIDIA website.
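The gap between the two form factors is easiest to see with a simple idealized calculation. The sketch below (using only the bandwidth figures quoted above; real transfer times also depend on topology, message size, and protocol overhead) estimates how long it would take to move a full 32 GB of GPU memory between devices over each interconnect:

```python
# Rough illustration of the interconnect gap between the two NVIDIA V100
# form factors, using the bandwidth figures quoted above. This is an
# idealized estimate, not a measurement.

def transfer_seconds(data_gb: float, link_gbps: float) -> float:
    """Idealized time to move data_gb over a link sustaining link_gbps GB/s."""
    return data_gb / link_gbps

VRAM_GB = 32  # full V100 memory capacity

print(f"SXM (NVLink, 300 GB/s): {transfer_seconds(VRAM_GB, 300):.2f} s")
print(f"PCIe (32 GB/s):         {transfer_seconds(VRAM_GB, 32):.2f} s")
```

The roughly 9x difference is why the SXM form factor is recommended for multi-GPU training, where gradients are exchanged between devices on every step, while the PCIe form factor is adequate for inference.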
Configurations of VMs with GPUs
The configurations available for VMs with GPUs are listed below. The configurations that can be used in GPU clusters are marked with an asterisk *.
- NVIDIA® A100 NVLink with AMD Epyc Rome (gpu-standard-v3):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 80       | 28              | 119     |
  | 8              | 640      | 224             | 952     |

- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type A) (gpu-h100):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 80       | 20              | 160     |
  | 8*             | 640      | 160             | 1280    |

- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type B) (gpu-h100-b):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 80       | 20              | 160     |
  | 8*             | 640      | 160             | 1280    |

- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type C) (gpu-h100-c):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 80       | 20              | 160     |
  | 8*             | 640      | 160             | 1280    |

- NVIDIA® L4 PCIe with Intel Ice Lake (standard-v3-l4):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 24       | 4               | 16      |
  | 1              | 24       | 8               | 32      |
  | 1              | 24       | 12              | 48      |
  | 1              | 24       | 16              | 64      |
  | 1              | 24       | 24              | 96      |
  | 2              | 48       | 24              | 96      |
  | 2              | 48       | 48              | 192     |

- NVIDIA® L40 PCIe with Intel Ice Lake (standard-v3-l40):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 48       | 8               | 32      |
  | 1              | 48       | 12              | 48      |
  | 1              | 48       | 16              | 64      |
  | 1              | 48       | 24              | 96      |
  | 2              | 96       | 48              | 192     |

- NVIDIA® L40S PCIe with Intel Ice Lake (standard-v3-l40s):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 48       | 8               | 32      |
  | 1              | 48       | 12              | 48      |
  | 1              | 48       | 16              | 64      |
  | 1              | 48       | 24              | 96      |
  | 1              | 48       | 48              | 192     |
  | 1              | 48       | 96              | 384     |

- NVIDIA® V100 PCIe with Intel Broadwell (gpu-standard-v1):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 32       | 4               | 48      |
  | 2              | 64       | 8               | 96      |
  | 4              | 128      | 16              | 192     |
  | 8              | 256      | 32              | 384     |

- NVIDIA® V100 NVLink with Intel Cascade Lake (gpu-standard-v2):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 32       | 8               | 48      |
  | 2              | 64       | 16              | 96      |
  | 4              | 128      | 32              | 192     |
  | 8              | 256      | 64              | 384     |
* The configuration can be used in GPU clusters.
VM GPUs are provided in full. For example, if a configuration has eight GPUs specified, your VM will have eight full-featured GPU devices.
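When choosing a configuration programmatically, the tables above can be treated as plain data. The sketch below uses a hypothetical helper, pick_config, which is not part of any Compute Cloud SDK; the rows are copied from a few of the tables above to illustrate the idea:

```python
# Hypothetical helper (not part of any Compute Cloud SDK) that picks the
# smallest configuration satisfying GPU-count and RAM requirements.
# Rows are copied from some of the tables above: (gpus, vram_gb, vcpus, ram_gb).
CONFIGS = {
    "gpu-h100": [(1, 80, 20, 160), (8, 640, 160, 1280)],
    "standard-v3-l4": [(1, 24, 4, 16), (1, 24, 8, 32), (2, 48, 24, 96)],
    "gpu-standard-v2": [(1, 32, 8, 48), (2, 64, 16, 96),
                        (4, 128, 32, 192), (8, 256, 64, 384)],
}

def pick_config(platform: str, min_gpus: int, min_ram_gb: int):
    """Return the first (smallest) row meeting both minimums, or None."""
    for row in CONFIGS.get(platform, []):
        gpus, _vram, _vcpus, ram = row
        if gpus >= min_gpus and ram >= min_ram_gb:
            return row
    return None

print(pick_config("gpu-standard-v2", 2, 100))  # → (4, 128, 32, 192)
```

Because rows within each platform scale vCPUs and RAM together with GPU count, scanning in table order always yields the smallest configuration that satisfies the constraints.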
For more information about organizational and technical limitations for VMs, see Quotas and limits.
OS images
For VMs with GPUs, you can use products with pre-installed NVIDIA drivers available from Nebius AI Marketplace or install the drivers on another standard image yourself.
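After the drivers are installed, running `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` on the VM reports each GPU on its own line, which is a convenient way to confirm that all provisioned devices are visible. The sketch below parses such output; the sample string is illustrative, not captured from a real VM:

```python
import csv
import io

# Parse the CSV output of:
#   nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
# The sample below is illustrative, not captured from a real VM.
SAMPLE = """\
NVIDIA H100 80GB HBM3, 81559 MiB
NVIDIA H100 80GB HBM3, 81559 MiB
"""

def parse_gpus(text: str):
    """Return a list of (name, memory_mib) tuples from nvidia-smi CSV output."""
    gpus = []
    for row in csv.reader(io.StringIO(text)):
        if row:
            name, mem = row[0].strip(), row[1].strip()
            gpus.append((name, int(mem.split()[0])))  # "81559 MiB" -> 81559
    return gpus

print(parse_gpus(SAMPLE))
```

Checking that `len(parse_gpus(...))` matches the GPU count of your configuration is a quick sanity test that the drivers see every device.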