Graphics accelerators (GPUs)
Compute Cloud provides graphics accelerators (GPUs) for different VM configurations. GPUs outperform CPUs at processing certain types of data and are well suited for machine learning (ML), artificial intelligence (AI), and 3D rendering tasks. You can manage GPUs and RAM directly from your VM.
To handle tasks that are too large for a single GPU, you can create GPU clusters.
Available GPUs
In Compute Cloud, the following GPUs are available:
- NVIDIA A100 (NVIDIA Ampere architecture) with 80 GB of HBM2.
- NVIDIA H100 (NVIDIA Hopper architecture) with 80 GB of HBM3.
- NVIDIA L4 (NVIDIA Ada Lovelace architecture) with 24 GB of GDDR6.
- NVIDIA L40 (NVIDIA Ada Lovelace architecture) with 48 GB of GDDR6.
- NVIDIA L40S (NVIDIA Ada Lovelace architecture) with 48 GB of GDDR6.
- NVIDIA V100 (NVIDIA Volta architecture) in two form factors: SXM with 32 GB of HBM2 and PCIe with 32 GB of HBM2.
Warning
GPUs run in TCC driver mode.
By default, many quotas for creating VMs with GPUs are set to zero. You can generate a request for a quota increase if you belong to the editors or admins group.
NVIDIA A100
The NVIDIA A100 GPUs are based on the NVIDIA Ampere® architecture and offer 80 GB of HBM2 RAM.
NVIDIA H100
The NVIDIA H100 GPU is powered by the NVIDIA Hopper architecture. It features fourth-generation Tensor Cores and can apply mixed FP8 and FP16 precision to dramatically accelerate AI calculations for transformer models.
Nebius AI provides NVIDIA H100 GPUs in the NVIDIA H100 SXM form factor. It offers 80 GB of HBM3 RAM and up to 3.35 TB/s memory bandwidth. NVIDIA H100 SXM is available on multiple platforms with identical performance and capabilities:
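To put the 3.35 TB/s figure in perspective, here is a rough, back-of-envelope estimate (not a benchmark) of the minimum time needed to read the full 80 GB of HBM3 once at the peak rate; real workloads reach only a fraction of peak bandwidth:

```python
# Back-of-envelope estimate: time to stream the full HBM3 capacity of an
# NVIDIA H100 SXM once at its peak memory bandwidth. Figures are taken
# from the text above; this is a theoretical lower bound, not a benchmark.

VRAM_GB = 80           # HBM3 capacity, GB
BANDWIDTH_GBPS = 3350  # peak memory bandwidth, GB/s (3.35 TB/s)

def full_sweep_time_ms(vram_gb: float, bandwidth_gbps: float) -> float:
    """Lower bound on the time (ms) to read all of VRAM once."""
    return vram_gb / bandwidth_gbps * 1000

print(f"{full_sweep_time_ms(VRAM_GB, BANDWIDTH_GBPS):.1f} ms")  # ~23.9 ms
```

In other words, even touching every byte of GPU memory takes only about 24 ms at peak rate, which is why memory-bandwidth-bound workloads such as large-model inference benefit from HBM3.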
- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type A)
- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type B)
- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type C)
For more information, go to the NVIDIA website.
NVIDIA L4, L40, and L40S
NVIDIA L4, NVIDIA L40, and NVIDIA L40S are inference-optimized GPUs powered by the NVIDIA Ada Lovelace architecture, featuring fourth-generation Tensor Cores and FP8 precision support.
Nebius AI provides:
- NVIDIA® L4 PCIe with Intel Ice Lake (standard-v3-l4): designed as a universal GPU that can handle workloads of various types, such as generative AI, visual computing, and graphics. NVIDIA L4 excels in video processing, delivering 120x higher AI video performance than CPUs. It offers 24 GB of GDDR6 RAM and up to 300 GB/s memory bandwidth.
- NVIDIA® L40 PCIe with Intel Ice Lake (standard-v3-l40): NVIDIA L40 delivers state-of-the-art capabilities for graphics and AI-enabled 2D, 3D, and video generation. It offers 48 GB of GDDR6 RAM and up to 864 GB/s memory bandwidth.
- NVIDIA® L40S PCIe with Intel Ice Lake (standard-v3-l40s): NVIDIA L40S is a multi-workload GPU that combines powerful AI compute with best-in-class graphics and media acceleration. It offers 48 GB of GDDR6 RAM and up to 864 GB/s memory bandwidth.
For more information, see the NVIDIA website.
NVIDIA V100
The NVIDIA V100 GPU is based on the NVIDIA Volta architecture and contains 5,120 CUDA cores for high-performance computing. It offers 32 GB of HBM2 RAM.
Nebius AI provides NVIDIA V100 GPUs in two form factors:
- NVIDIA V100 SXM offers 300 GB/s GPU interconnect bandwidth and can be used for single-node training. It is available on the NVIDIA® V100 NVLink with Intel Cascade Lake platform.
- NVIDIA V100 PCIe offers 32 GB/s GPU interconnect bandwidth and is best suited for inference tasks.
For more information, go to the NVIDIA website.
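The gap between the two form factors is easiest to see with a simple idealized calculation. The sketch below (using only the bandwidth figures quoted above; real transfer times also depend on topology, message size, and protocol overhead) estimates how long it would take to move a full 32 GB of GPU memory between devices over each interconnect:

```python
# Rough illustration of the interconnect gap between the two NVIDIA V100
# form factors, using the bandwidth figures quoted above. This is an
# idealized estimate, not a measurement.

def transfer_seconds(data_gb: float, link_gbps: float) -> float:
    """Idealized time to move data_gb over a link sustaining link_gbps GB/s."""
    return data_gb / link_gbps

VRAM_GB = 32  # full V100 memory capacity

print(f"SXM (NVLink, 300 GB/s): {transfer_seconds(VRAM_GB, 300):.2f} s")
print(f"PCIe (32 GB/s):         {transfer_seconds(VRAM_GB, 32):.2f} s")
```

The roughly 9x difference is why the SXM form factor is recommended for multi-GPU training, where gradients are exchanged between devices on every step, while the PCIe form factor is adequate for inference.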
Configurations of VMs with GPUs
The configurations available for VMs with GPUs are listed below. The configurations that can be used in GPU clusters are marked with an asterisk *.
- NVIDIA® A100 NVLink with AMD Epyc Rome (gpu-standard-v3):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 80       | 28              | 119     |
  | 8              | 640      | 224             | 952     |

- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type A) (gpu-h100):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 80       | 20              | 160     |
  | 8*             | 640      | 160             | 1280    |

- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type B) (gpu-h100-b):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 80       | 20              | 160     |
  | 8*             | 640      | 160             | 1280    |

- NVIDIA® H100 NVLink with Intel Sapphire Rapids (Type C) (gpu-h100-c):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 80       | 20              | 160     |
  | 8*             | 640      | 160             | 1280    |

- NVIDIA® L4 PCIe with Intel Ice Lake (standard-v3-l4):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 24       | 4               | 16      |
  | 1              | 24       | 8               | 32      |
  | 1              | 24       | 12              | 48      |
  | 1              | 24       | 16              | 64      |
  | 1              | 24       | 24              | 96      |
  | 2              | 48       | 24              | 96      |
  | 2              | 48       | 48              | 192     |

- NVIDIA® L40 PCIe with Intel Ice Lake (standard-v3-l40):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 48       | 8               | 32      |
  | 1              | 48       | 12              | 48      |
  | 1              | 48       | 16              | 64      |
  | 1              | 48       | 24              | 96      |
  | 2              | 96       | 48              | 192     |

- NVIDIA® L40S PCIe with Intel Ice Lake (standard-v3-l40s):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 48       | 8               | 32      |
  | 1              | 48       | 12              | 48      |
  | 1              | 48       | 16              | 64      |
  | 1              | 48       | 24              | 96      |
  | 1              | 48       | 48              | 192     |
  | 1              | 48       | 96              | 384     |

- NVIDIA® V100 PCIe with Intel Broadwell (gpu-standard-v1):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 32       | 4               | 48      |
  | 2              | 64       | 8               | 96      |
  | 4              | 128      | 16              | 192     |
  | 8              | 256      | 32              | 384     |

- NVIDIA® V100 NVLink with Intel Cascade Lake (gpu-standard-v2):

  | Number of GPUs | VRAM, GB | Number of vCPUs | RAM, GB |
  |----------------|----------|-----------------|---------|
  | 1              | 32       | 8               | 48      |
  | 2              | 64       | 16              | 96      |
  | 4              | 128      | 32              | 192     |
  | 8              | 256      | 64              | 384     |
* The configuration can be used in GPU clusters.
VM GPUs are provided in full. For example, if a configuration has eight GPUs specified, your VM will have eight full-featured GPU devices.
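When choosing a configuration programmatically, the tables above can be treated as plain data. The sketch below uses a hypothetical helper, pick_config, which is not part of any Compute Cloud SDK; the rows are copied from a few of the tables above to illustrate the idea:

```python
# Hypothetical helper (not part of any Compute Cloud SDK) that picks the
# smallest configuration satisfying GPU-count and RAM requirements.
# Rows are copied from some of the tables above: (gpus, vram_gb, vcpus, ram_gb).
CONFIGS = {
    "gpu-h100": [(1, 80, 20, 160), (8, 640, 160, 1280)],
    "standard-v3-l4": [(1, 24, 4, 16), (1, 24, 8, 32), (2, 48, 24, 96)],
    "gpu-standard-v2": [(1, 32, 8, 48), (2, 64, 16, 96),
                        (4, 128, 32, 192), (8, 256, 64, 384)],
}

def pick_config(platform: str, min_gpus: int, min_ram_gb: int):
    """Return the first (smallest) row meeting both minimums, or None."""
    for row in CONFIGS.get(platform, []):
        gpus, _vram, _vcpus, ram = row
        if gpus >= min_gpus and ram >= min_ram_gb:
            return row
    return None

print(pick_config("gpu-standard-v2", 2, 100))  # → (4, 128, 32, 192)
```

Because rows within each platform scale vCPUs and RAM together with GPU count, scanning in table order always yields the smallest configuration that satisfies the constraints.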
For more information about organizational and technical limitations for VMs, see Quotas and limits.
OS images
For VMs with GPUs, you can use products with pre-installed NVIDIA drivers available from Nebius AI Marketplace or install the drivers on another standard image yourself.
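After the drivers are installed, running `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` on the VM reports each GPU on its own line, which is a convenient way to confirm that all provisioned devices are visible. The sketch below parses such output; the sample string is illustrative, not captured from a real VM:

```python
import csv
import io

# Parse the CSV output of:
#   nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
# The sample below is illustrative, not captured from a real VM.
SAMPLE = """\
NVIDIA H100 80GB HBM3, 81559 MiB
NVIDIA H100 80GB HBM3, 81559 MiB
"""

def parse_gpus(text: str):
    """Return a list of (name, memory_mib) tuples from nvidia-smi CSV output."""
    gpus = []
    for row in csv.reader(io.StringIO(text)):
        if row:
            name, mem = row[0].strip(), row[1].strip()
            gpus.append((name, int(mem.split()[0])))  # "81559 MiB" -> 81559
    return gpus

print(parse_gpus(SAMPLE))
```

Checking that `len(parse_gpus(...))` matches the GPU count of your configuration is a quick sanity test that the drivers see every device.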