Nebius AI services quotas and limits
Nebius AI services can be subject to quotas and limits:
- Quotas are organizational restrictions that can be changed by technical support on request.
- Limits are technical limitations due to Nebius AI architectural features. The limits cannot be changed.
When designing your infrastructure in Nebius AI, treat the limits as the maximum that Nebius AI can provide. Quotas can potentially be increased, but only up to the corresponding limit.
Why quotas are needed
Quotas serve as a soft restriction for requesting resources and enable Nebius AI to guarantee service stability, as new users cannot take up too many resources for testing purposes.
If you need more resources, generate a request for a quota increase. To do this, you must be a member of the editors or admins group.
Default quotas and limits for Nebius AI services
Cloud Organization
Quotas
Type of limit | Value |
---|---|
Maximum number of subjects per organization | 10,000 |
Maximum number of invitations to an organization | 100 |
Maximum number of federations per organization | 10 |
Maximum number of group members | 1,000 |
Maximum number of subjects per federation | 1,000 |
Limits
There are no hard limits for organizations.
Compute Cloud
Quotas
Type of limit | Value |
---|---|
Number of virtual machines | 12 |
Total number of vCPUs for all VMs | 170 |
Total virtual memory for all VMs | 1350 GB |
Total number of disks | 32 |
Total HDD storage capacity | 500 GB |
Total SSD storage capacity | 200 GB |
Total non-replicated SSD storage capacity | 558 GB |
Total high-performance SSD storage capacity | 186 GB |
Total number of disk snapshots | 32 |
Total storage capacity of all disk snapshots | 400 GB |
Number of disk snapshot schedules | 32 |
Number of images | 8 |
Number of images optimized for deployment¹ | 0 |
Number of NVIDIA® A100 NVLink GPUs¹ | 0 |
Number of NVIDIA® H100 NVLink (Type A) GPUs¹ | 0 |
Number of NVIDIA® H100 NVLink (Type B) GPUs¹ | 0 |
Number of NVIDIA® H100 NVLink (Type C) GPUs¹ | 8 |
Number of NVIDIA® L4, L40 PCIe GPUs¹ | 0 |
Number of NVIDIA® L40S PCIe GPUs¹ | 8 |
Number of NVIDIA® V100 PCIe or NVLink GPUs¹ | 8 |
Number of concurrent operations per folder | 15 |
Total number of file stores per cloud¹ | 32 |
Total HDD file storage capacity per cloud¹ | 500 GB |
Total SSD file storage capacity per cloud¹ | 500 GB |
¹ To increase quotas for deployment-optimized images or GPUs, contact technical support.
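As a planning aid, the default quotas above can be compared with a planned deployment before any resources are created. The following is a minimal sketch, not a Nebius AI API: the dictionary keys and the `quota_shortfalls` helper are hypothetical names chosen for illustration, and only a few of the default values from the table are included.

```python
# Default Compute Cloud quotas copied from the table above.
# The key names are hypothetical labels, not Nebius AI identifiers.
DEFAULT_QUOTAS = {
    "vms": 12,
    "vcpus": 170,
    "memory_gb": 1350,
    "disks": 32,
    "hdd_gb": 500,
    "ssd_gb": 200,
}


def quota_shortfalls(plan: dict[str, int]) -> dict[str, int]:
    """Return, per resource, how far the plan exceeds the default quota."""
    return {
        name: wanted - DEFAULT_QUOTAS[name]
        for name, wanted in plan.items()
        if wanted > DEFAULT_QUOTAS[name]
    }


plan = {"vms": 4, "vcpus": 256, "memory_gb": 1024, "ssd_gb": 400}
print(quota_shortfalls(plan))  # {'vcpus': 86, 'ssd_gb': 200}
```

Any resource that appears in the result would need a quota increase request before the plan can be deployed.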
VM limits
Limits per VM depend on the VM platform:
Type of limit | Value |
---|---|
Maximum number of vCPUs per VM | 224 |
Maximum virtual memory per VM | 952 GB |
Maximum number of GPUs connected to a single VM | 8 |
Type of limit | Value |
---|---|
Maximum number of vCPUs per VM | 160 |
Maximum virtual memory per VM | 1280 GB |
Maximum number of GPUs connected to a single VM | 8 |
Type of limit | Value |
---|---|
Maximum number of vCPUs per VM | 48 |
Maximum virtual memory per VM | 192 GB |
Maximum number of GPUs connected to a single VM | 2 |
Type of limit | Value |
---|---|
Maximum number of vCPUs per VM | 96 |
Maximum virtual memory per VM | 384 GB |
Maximum number of GPUs connected to a single VM | 1 |
Type of limit | Value |
---|---|
Maximum number of vCPUs per VM | 32 |
Maximum virtual memory per VM | 384 GB |
Maximum number of disks and file stores attached to a single VM² | 18 vCPUs or fewer: 8; more than 18 vCPUs: 16³ |
Maximum number of GPUs connected to a single VM | 8 |
Type of limit | Value |
---|---|
Maximum number of vCPUs per VM | 64 |
Maximum virtual memory per VM | 384 GB |
Maximum number of disks and file stores attached to a single VM² | 20 vCPUs or fewer: 8; more than 20 vCPUs: 16³ |
Maximum number of GPUs connected to a single VM | 8 |
Type of limit | Value |
---|---|
Maximum number of vCPUs per VM | 60 |
Maximum virtual memory per VM | 420 GB |
Maximum number of disks and file stores attached to a single VM² | 20 vCPUs or fewer: 8; more than 20 vCPUs: 16³ |
² Including the boot disk.
³ When a VM starts, a maximum of 14 devices, including the boot disk and a NIC, can be connected to it. Other devices must be connected when the VM is already running. Note that a VM restarted with more than 14 devices connected will not be able to boot.
VM limits on disk operations
VM platforms with Intel Broadwell:
Type of limit | Value |
---|---|
Maximum⁴ IOPS per vCPU | 10,500 |
Maximum⁵ bandwidth per vCPU | 130 MB/s |
VM platforms with Intel Cascade Lake:
Type of limit | Value |
---|---|
Maximum⁴ IOPS per vCPU | 6,300 |
Maximum⁵ bandwidth per vCPU | 80 MB/s |
VM platforms with Intel Ice Lake:
Type of limit | Value |
---|---|
Maximum⁴ IOPS per vCPU | 3,500 |
Maximum⁵ bandwidth per vCPU | 45 MB/s |
VM platforms with Intel Sapphire Rapids:
Type of limit | Value |
---|---|
Maximum⁴ IOPS per vCPU | 2,300 |
Maximum⁵ bandwidth per vCPU | 50 MB/s |
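Because these limits are stated per vCPU, a VM's overall ceiling for disk operations can be estimated from its vCPU count. The sketch below assumes the ceiling is simply the per-vCPU value multiplied by the number of vCPUs, which the source does not state explicitly; `vm_disk_ceiling` is a hypothetical helper name.

```python
# Per-vCPU ceilings copied from the tables above: (IOPS, bandwidth in MB/s).
PER_VCPU_LIMITS = {
    "Intel Broadwell": (10_500, 130),
    "Intel Cascade Lake": (6_300, 80),
    "Intel Ice Lake": (3_500, 45),
    "Intel Sapphire Rapids": (2_300, 50),
}


def vm_disk_ceiling(platform: str, vcpus: int) -> tuple[int, int]:
    """Return (max IOPS, max bandwidth in MB/s) for a VM.

    Assumption: the ceiling is the per-vCPU value times the vCPU count.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    iops_per_vcpu, mbps_per_vcpu = PER_VCPU_LIMITS[platform]
    return iops_per_vcpu * vcpus, mbps_per_vcpu * vcpus


print(vm_disk_ceiling("Intel Ice Lake", 8))  # (28000, 360)
```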
Disk limits
Type of limit | Value |
---|---|
Maximum disk size | 256 TB |
Allocation unit size | 32 GB |
Maximum⁴ IOPS for writes per disk | 40,000 |
Maximum⁴ IOPS for writes per allocation unit | 1,000 |
Maximum⁵ bandwidth for writes per disk | 450 MB/s |
Maximum⁵ bandwidth for writes per allocation unit | 15 MB/s |
Maximum⁴ IOPS for reads per disk | 20,000 |
Maximum⁴ IOPS for reads per allocation unit | 1,000 |
Maximum⁵ bandwidth for reads per disk | 450 MB/s |
Maximum⁵ bandwidth for reads per allocation unit | 15 MB/s |
Type of limit | Value |
---|---|
Maximum disk size | 256 TB |
Allocation unit size | 256 GB |
Maximum⁴ IOPS for writes per disk | 11,000 |
Maximum⁴ IOPS for writes per allocation unit | 300 |
Maximum⁵ bandwidth for writes per disk | 240 MB/s |
Maximum⁵ bandwidth for writes per allocation unit | 30 MB/s |
Maximum⁴ IOPS for reads per disk | 2,000 |
Maximum⁴ IOPS for reads per allocation unit | 300 |
Maximum⁵ bandwidth for reads per disk | 240 MB/s |
Maximum⁵ bandwidth for reads per allocation unit | 30 MB/s |
Type of limit | Value |
---|---|
Minimum non-replicated disk size | 93 GB |
Allocation unit size | 93 GB |
Maximum⁴ IOPS for writes per disk | 75,000 |
Maximum⁴ IOPS for writes per allocation unit | 5,600 |
Maximum⁵ bandwidth for writes per disk | 1 GB/s |
Maximum⁵ bandwidth for writes per allocation unit | 82 MB/s |
Maximum⁴ IOPS for reads per disk | 75,000 |
Maximum⁴ IOPS for reads per allocation unit | 28,000 |
Maximum⁵ bandwidth for reads per disk | 1 GB/s |
Maximum⁵ bandwidth for reads per allocation unit | 110 MB/s |
Type of limit | Value |
---|---|
Minimum size of high-performance disk | 93 GB |
Allocation unit size | 93 GB |
Maximum⁴ IOPS for writes per disk | 40,000 |
Maximum⁴ IOPS for writes per allocation unit | 5,600 |
Maximum⁵ bandwidth for writes per disk | 1 GB/s |
Maximum⁵ bandwidth for writes per allocation unit | 82 MB/s |
Maximum⁴ IOPS for reads per disk | 75,000 |
Maximum⁴ IOPS for reads per allocation unit | 28,000 |
Maximum⁵ bandwidth for reads per disk | 1 GB/s |
Maximum⁵ bandwidth for reads per allocation unit | 110 MB/s |
Type of limit | Value |
---|---|
Maximum storage size | 256 TB |
Allocation unit size | 32 GB |
Maximum number of files in storage | 4,294,967,294 |
Maximum size of file in storage | 300 GB |
Maximum filename length | 255 |
Maximum pathname length | 4095 |
Maximum number of links to a single file | 65,536 |
Type of limit | Value |
---|---|
Maximum storage size | 256 TB |
Allocation unit size | 256 GB |
Maximum number of files in storage | 4,294,967,294 |
Maximum size of file in storage | 300 GB |
Maximum filename length | 255 |
Maximum pathname length | 4095 |
Maximum number of links to a single file | 65,536 |
Read and write operations utilize the same disk resource. The more read operations you do, the fewer write operations you can do, and vice versa. For more information, see Read and write operations.
⁴ To achieve maximum IOPS, we recommend performing read and write operations whose size is close to that of the disk block (4 KB by default).
⁵ To achieve the maximum possible bandwidth, we recommend performing 4 MB reads and writes.
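The disk tables above give each performance limit twice: per allocation unit and per disk. One way to read them together is sketched below. It rests on an assumption the source does not spell out, namely that a disk's ceiling grows by the per-unit value for each allocation unit until it reaches the per-disk cap. `disk_ceiling` is a hypothetical helper, and the sample figures are the write IOPS values from the first table under Disk limits.

```python
import math


def disk_ceiling(size_gb: int, unit_gb: int, per_unit: int, per_disk: int) -> int:
    """Estimate a disk's ceiling for one metric (IOPS or MB/s).

    Assumption: the ceiling grows by `per_unit` for every allocation unit
    (partial units rounded up) and is capped at `per_disk`.
    """
    if size_gb <= 0 or unit_gb <= 0:
        raise ValueError("sizes must be positive")
    units = math.ceil(size_gb / unit_gb)
    return min(units * per_unit, per_disk)


# Write IOPS with 32 GB units, 1,000 IOPS per unit, and 40,000 IOPS per disk.
print(disk_ceiling(256, 32, 1_000, 40_000))    # 8000 (8 units)
print(disk_ceiling(2_048, 32, 1_000, 40_000))  # 40000 (64 units, capped per disk)
```

Under this reading, a small disk is bound by its allocation units and a large disk by the per-disk maximum, so growing a disk past the cap no longer raises its ceiling.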
Limits of disk snapshot schedules
Type of limit | Value |
---|---|
Number of disks included in a schedule | 1,000 |
Number of schedules to add a disk to | 1,000 |
Identity and Access Management
There are no quotas or limits for Identity and Access Management.
Managed Service for Kubernetes
Quotas
Type of limit | Value |
---|---|
Maximum number of Kubernetes clusters per cloud | 8 |
Limits
Type of limit | Value |
---|---|
Maximum number of volumes connected to a single node | 56 |
Object Storage
Quotas
Type of limit | Value |
---|---|
Storage volume per cloud | 10 TB |
Number of buckets per cloud | 25 |
Limits
Type of limit | Value |
---|---|
Maximum object size | 5 TB |
Total header size per request to HTTP API | 8 KB |
Size of user-defined metadata in an object | 2 KB |
Maximum size of data to upload per request | 5 GB |
Minimum size of data parts for multipart upload, except the last one | 5 MB |
Maximum number of parts in multipart upload | 10,000 |
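The multipart limits above interact: the part size must be large enough to keep the part count within 10,000, yet no larger than the per-request maximum. The sketch below picks the smallest part size that satisfies all of them. It assumes binary units (1 MB = 1024² bytes), which the source does not specify, and `plan_multipart` is a hypothetical helper rather than part of any SDK.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

# Limits copied from the table above (binary units are an assumption).
MAX_OBJECT = 5 * TIB   # maximum object size
MAX_REQUEST = 5 * GIB  # maximum size of data to upload per request
MIN_PART = 5 * MIB     # minimum part size, except the last part
MAX_PARTS = 10_000     # maximum number of parts in a multipart upload


def plan_multipart(object_size: int) -> tuple[int, int]:
    """Return (part size in bytes, number of parts) for an upload."""
    if not 0 < object_size <= MAX_OBJECT:
        raise ValueError("object size is outside the supported range")
    # Smallest part size that keeps the part count within MAX_PARTS.
    needed = (object_size + MAX_PARTS - 1) // MAX_PARTS
    part_size = max(MIN_PART, needed)
    if part_size > MAX_REQUEST:
        raise ValueError("no part size satisfies all limits")
    part_count = (object_size + part_size - 1) // part_size
    return part_size, part_count


print(plan_multipart(100 * MIB))  # (5242880, 20)
```

Even at the 5 TB object maximum, the required part size stays well below the 5 GB per-request limit, so the limits in the table are mutually consistent.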
Resource Manager
There are no quotas or limits for Resource Manager.
Virtual Private Cloud
Quotas
Type of limit | Value |
---|---|
Number of public IP addresses per cloud | 2 |
Limits
Type of limit | Value |
---|---|
Minimum CIDR size for a subnet* | /28 |
Maximum CIDR size for a subnet* | /16 |
Maximum number of TCP/UDP connections per VM, node, or host† | 50,000 |
Supported network and transport layer protocols | IP, ICMP, TCP, UDP, GRE, ESP, AH |
Maximum number of DNS queries to the subnet's DNS server | 1000 RPS |
* The limit applies to subnets created for Kubernetes services and pods in a Managed Service for Kubernetes cluster.
† All TCP/IP and UDP connections opened and half-opened within 180 seconds are taken into account. If there are no data or keep-alive packets in the connection during this time, it is forcibly closed.
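In CIDR notation a longer prefix means a smaller subnet, so the /28 minimum corresponds to 16 addresses and the /16 maximum to 65,536. The check below is a minimal sketch that uses Python's standard `ipaddress` module. `check_subnet` is a hypothetical helper, and the sample ranges are arbitrary private addresses.

```python
import ipaddress

MIN_PREFIX = 16  # largest allowed subnet (/16)
MAX_PREFIX = 28  # smallest allowed subnet (/28)


def check_subnet(cidr: str) -> int:
    """Return the total number of addresses in `cidr` if its size is allowed."""
    network = ipaddress.IPv4Network(cidr)  # raises ValueError if malformed
    if not MIN_PREFIX <= network.prefixlen <= MAX_PREFIX:
        raise ValueError(
            f"{cidr}: prefix must be between /{MIN_PREFIX} and /{MAX_PREFIX}"
        )
    return network.num_addresses


print(check_subnet("10.96.0.0/16"))   # 65536
print(check_subnet("10.112.0.0/28"))  # 16
```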
Outgoing traffic filtering
Nebius AI automatically blocks traffic sent from Virtual Private Cloud public IPs to TCP port 25 of any servers on the internet and Compute Cloud VMs.
Nebius AI can provide a special public IP address with TCP port 25 opened upon request to the support team.
For public IPs that are already in use, port 25 cannot be opened.