# Running workloads with GPUs
Managed Service for Kubernetes clusters let you run workloads on GPUs, which can be useful for tasks with special computing requirements.
To run workloads with GPUs on Managed Service for Kubernetes cluster pods:

1. Create a pod with a GPU.
2. Test the pod.

If you no longer need these resources, delete them.
## Before you begin
1. If you don't have the Nebius AI command line interface yet, install and initialize it.

   The folder specified in the CLI profile is used by default. You can specify a different folder using the `--folder-name` or `--folder-id` parameter.

2. Create a Managed Service for Kubernetes cluster with any suitable configuration.

3. Create a node group with the following settings:

   - **Platform**: Select a platform that supports GPUs. See the list of platforms and details about available GPUs.
   - **GPU**: Specify the desired number of GPUs.
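Once the node group is up, you can check that its GPUs have been registered with the cluster. This is an optional sanity check, assuming `kubectl` is already configured to connect to the cluster:

```shell
# List each node together with the number of GPUs it advertises to the scheduler.
# The dot in the resource name "nvidia.com/gpu" is escaped for the JSONPath parser.
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
```

Nodes from the GPU node group should report a non-empty GPU count; `<none>` means the GPU device plugin has not registered the GPUs on that node yet.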
## Create a pod with a GPU
1. Save the GPU pod creation specification to a YAML file named `cuda-vector-add.yaml`:

   ```yaml
   apiVersion: v1
   kind: Pod
   metadata:
     name: cuda-vector-add
   spec:
     restartPolicy: OnFailure
     containers:
       - name: cuda-vector-add
         # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
         image: "k8s.gcr.io/cuda-vector-add:v0.1"
         resources:
           limits:
             nvidia.com/gpu: 1 # Request for 1 GPU.
   ```
   To learn more about the pod creation specification, see the Kubernetes documentation.

2. Create a pod with a GPU:

   ```bash
   kubectl create -f cuda-vector-add.yaml
   ```
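If the workload needs more than one GPU, raise the limit in the specification. Note that `nvidia.com/gpu` limits must be whole numbers (fractional GPUs cannot be requested), and a single pod cannot request more GPUs than one node provides, since a pod cannot span nodes. A hypothetical fragment of the container spec for a two-GPU pod:

```yaml
# Hypothetical fragment: request two GPUs for the container.
# GPU limits must be integers and fit on a single node.
resources:
  limits:
    nvidia.com/gpu: 2
```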
## Test the pod
1. View information about the pod created:

   ```bash
   kubectl describe pod cuda-vector-add
   ```

   Result:

   ```text
   Name:         cuda-vector-add
   Namespace:    default
   Priority:     0
   ...
     Normal  Pulling  16m  kubelet, cl1i7hcbti99j6bbua6u-ebyq  Successfully pulled image "k8s.gcr.io/cuda-vector-add:v0.1"
     Normal  Created  16m  kubelet, cl1i7hcbti99j6bbua6u-ebyq  Created container cuda-vector-add
     Normal  Started  16m  kubelet, cl1i7hcbti99j6bbua6u-ebyq  Started container
   ```
2. View the pod logs:

   ```bash
   kubectl logs -f cuda-vector-add
   ```

   Result:

   ```text
   [Vector addition of 50000 elements]
   Copy input data from the host memory to the CUDA device
   CUDA kernel launch with 196 blocks of 256 threads
   Copy output data from the CUDA device to the host memory
   Test PASSED
   Done
   ```
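The two checks above can also be scripted. This is a minimal sketch, assuming the pod name from the specification above; it polls the pod until it reaches a terminal phase, then prints the logs:

```shell
# Poll the pod phase until the CUDA test pod finishes (or 5 minutes pass),
# then show its logs. A successful run ends in the "Succeeded" phase.
for i in $(seq 1 30); do
  phase=$(kubectl get pod cuda-vector-add -o jsonpath='{.status.phase}')
  if [ "$phase" = "Succeeded" ] || [ "$phase" = "Failed" ]; then
    break
  fi
  sleep 10
done
echo "Pod phase: $phase"
kubectl logs cuda-vector-add
```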
## Delete the resources you created
Delete the resources you no longer need to avoid paying for them.
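For example, the test pod itself can be removed with `kubectl` (the node group and the Managed Service for Kubernetes cluster are deleted separately, through the management console or the CLI):

```shell
# Delete the CUDA test pod created from the specification file.
kubectl delete -f cuda-vector-add.yaml
```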