# Running workloads with GPUs
Managed Service for Kubernetes clusters let you run workloads on GPUs, which can be useful for tasks with special computing requirements.
To run workloads with GPUs on Managed Service for Kubernetes cluster pods:

1. Create a pod with a GPU.
2. Test the pod.

If you no longer need these resources, delete them.
## Before you begin
1. If you don't have the Nebius AI command line interface yet, install and initialize it.

   The folder specified in the CLI profile is used by default. You can specify a different folder using the `--folder-name` or `--folder-id` parameter.

2. Create a Managed Service for Kubernetes cluster with any suitable configuration.

3. Create a node group with the following settings:

   - **Platform**: Select a platform that supports GPUs. See the list of platforms and details about available GPUs.
   - **GPU**: Specify the desired number of GPUs.
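Once the node group is up, you can check that its GPUs have been registered with the cluster. This is an optional sanity check, assuming `kubectl` is already configured to connect to the cluster:

```shell
# List each node together with the number of GPUs it advertises to the scheduler.
# The dot in the resource name "nvidia.com/gpu" is escaped for the JSONPath parser.
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
```

Nodes from the GPU node group should report a non-empty GPU count; `<none>` means the GPU device plugin has not registered the GPUs on that node yet.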
## Create a pod with a GPU
1. Save the GPU pod creation specification to a YAML file named `cuda-vector-add.yaml`:

   ```yaml
   apiVersion: v1
   kind: Pod
   metadata:
     name: cuda-vector-add
   spec:
     restartPolicy: OnFailure
     containers:
       - name: cuda-vector-add
         # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
         image: "k8s.gcr.io/cuda-vector-add:v0.1"
         resources:
           limits:
             nvidia.com/gpu: 1 # Request for 1 GPU.
   ```
   To learn more about the pod creation specification, see the Kubernetes documentation.

2. Create a pod with a GPU:

   ```bash
   kubectl create -f cuda-vector-add.yaml
   ```
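If the workload needs more than one GPU, raise the limit in the specification. Note that `nvidia.com/gpu` limits must be whole numbers (fractional GPUs cannot be requested), and a single pod cannot request more GPUs than one node provides, since a pod cannot span nodes. A hypothetical fragment of the container spec for a two-GPU pod:

```yaml
# Hypothetical fragment: request two GPUs for the container.
# GPU limits must be integers and fit on a single node.
resources:
  limits:
    nvidia.com/gpu: 2
```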
## Test the pod
1. View information about the pod created:

   ```bash
   kubectl describe pod cuda-vector-add
   ```

   Result:

   ```text
   Name:         cuda-vector-add
   Namespace:    default
   Priority:     0
   ...
     Normal  Pulling  16m  kubelet, cl1i7hcbti99j6bbua6u-ebyq  Successfully pulled image "k8s.gcr.io/cuda-vector-add:v0.1"
     Normal  Created  16m  kubelet, cl1i7hcbti99j6bbua6u-ebyq  Created container cuda-vector-add
     Normal  Started  16m  kubelet, cl1i7hcbti99j6bbua6u-ebyq  Started container
   ```
2. View the pod logs:

   ```bash
   kubectl logs -f cuda-vector-add
   ```

   Result:

   ```text
   [Vector addition of 50000 elements]
   Copy input data from the host memory to the CUDA device
   CUDA kernel launch with 196 blocks of 256 threads
   Copy output data from the CUDA device to the host memory
   Test PASSED
   Done
   ```
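The two checks above can also be scripted. This is a minimal sketch, assuming the pod name from the specification above; it polls the pod until it reaches a terminal phase, then prints the logs:

```shell
# Poll the pod phase until the CUDA test pod finishes (or 5 minutes pass),
# then show its logs. A successful run ends in the "Succeeded" phase.
for i in $(seq 1 30); do
  phase=$(kubectl get pod cuda-vector-add -o jsonpath='{.status.phase}')
  if [ "$phase" = "Succeeded" ] || [ "$phase" = "Failed" ]; then
    break
  fi
  sleep 10
done
echo "Pod phase: $phase"
kubectl logs cuda-vector-add
```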
## Delete the resources you created
Delete the resources you no longer need to avoid paying for them.
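For example, the test pod itself can be removed with `kubectl` (the node group and the Managed Service for Kubernetes cluster are deleted separately, through the management console or the CLI):

```shell
# Delete the CUDA test pod created from the specification file.
kubectl delete -f cuda-vector-add.yaml
```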