Using fGPUs With OKS
You can work with NVIDIA GPUs in your OKS clusters through the allocation of flexible GPUs (fGPUs). The appropriate node pool configuration allows worker nodes to allocate, attach, and use fGPUs. For more information, see About Flexible GPUs and Node Pool Manifest Reference.
OKS currently supports two GPU configuration models:
- The flavored model (recommended): The node image does not include GPU drivers; you must install them separately (for example, using the NVIDIA GPU Operator).
- The legacy model: GPU drivers are pre-installed in the node image and configured through the fgpu field.
The behavior of the fgpu parameters thus depends on the desired configuration model.
Enabling GPU Support Through Your Node Pool Manifest
Using the Flavored Model
The flavored model introduces GPU node pools based on flavored node images. These images do not include GPU drivers, which you must install after node pool creation.
To use this model, you need to apply a Kubernetes manifest to your node pool that specifies a compatible nodeType and the flavour field.
This model is required for newer GPU types (such as H200) and is the recommended approach. Use it if you need at least one of the following:
- to use H200 GPUs;
- to use inference7 VM types;
- to attach more than one GPU per node;
- to use a custom NVIDIA driver.
In this model, your node pool manifest looks like the following:
```yaml
apiVersion: oks.dev/v1beta2
kind: NodePool
metadata:
  name: fgpu-pool-b
spec:
  desiredNodes: 1
  nodeType: inference7-h200.4xlargeA
  flavour: oks.ubuntu22
  volumes:
    - device: root
      type: gp2
      size: 300
      dir: /
  zones:
    - cloudgouv-eu-west-1b
  taint: false
  fgpu:
    k8s-operator: false
  upgradeStrategy:
    autoUpgradeEnabled: false
  autoHealing: false
```
This sample includes the following key fields that you need to specify:

- nodeType: The VM type to use for the node pool. It determines the attached GPU model. For more information, see VM Types > OUTSCALE Type.
- flavour: The OKS OMI to use, which defines the operating system and compatibility with GPU drivers.
- k8s-operator: Ignored when the flavour is not legacy; always treated as false.
Installing the NVIDIA GPU Operator
After creating a node pool using the new model, you need to install the NVIDIA GPU Operator to deploy the required GPU drivers and components.
Before you begin: Install Helm on your machine.
1. Add the NVIDIA Helm repository:

   ```shell
   $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && helm repo update
   ```

2. Install the NVIDIA GPU Operator:

   ```shell
   $ helm install gpu-operator -n gpu-operator --create-namespace nvidia/gpu-operator \
     --set driver.enabled=true \
     --set toolkit.enabled=true \
     --version=v25.10.0 \
     --set driver.usePrecompiled=false
   ```

   - We recommend using version v25.10.0, as it supports all fGPU models currently supported by OKS.
   - Before installing another version, check its compatibility with the oks.ubuntu22 flavor and your target GPU model.

3. (Optional) If you enabled the taint option when creating your node pool, remove the taint to allow the GPU Operator pods to be scheduled on the nodes:

   ```shell
   $ kubectl taint node fgpu-pool-b gpu=true:NoSchedule-
   ```

   Note the trailing dash, which removes the taint rather than adding it. If taint was set to false when the node pool was created, you do not need to perform this step.
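Once the operator's pods are running, you can verify that the drivers work by scheduling a minimal pod that requests a GPU and runs nvidia-smi. The following is a sketch: the pod name and CUDA image tag are illustrative placeholders, not OKS-specific values.

```yaml
# Hypothetical smoke-test pod; the image tag is an assumption,
# use any CUDA-enabled image available to your cluster.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # request one fGPU
```

After applying the manifest, `kubectl logs gpu-smoke-test` should list the attached GPU if the drivers are installed correctly.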
Using the Legacy Model
The legacy model uses node images (OMIs) with pre-installed GPU drivers. GPU configuration is fully managed through the fgpu field in the node pool manifest.
You can use this model if you require a simpler and faster deployment process. However, you cannot use it if you need any of the capabilities listed for the flavored model above (H200 GPUs, inference7 VM types, more than one GPU per node, or a custom NVIDIA driver).
To use this model, you need to create a node pool using a Kubernetes manifest that includes the fgpu field.
In this model, your node pool manifest looks like the following:
```yaml
apiVersion: oks.dev/v1beta2
kind: NodePool
metadata:
  name: application-pool2-a
spec:
  desiredNodes: 2
  nodeType: tinav5.c2r4p1
  fgpu:
    model: "nvidia-p6"
    k8s-operator: true
  zones:
    - eu-west-2a
  upgradeStrategy:
    maxUnavailable: 1
    maxSurge: 0
    autoUpgradeEnabled: false
  autoHealing: true
```
You can configure GPU support by specifying the following characteristics under the spec section of your node pool manifest.
```yaml
spec:
  fgpu:
    model: "nvidia-p6"
    k8s-operator: true
```
This sample contains the following fields that you need to specify:

- model: The GPU model to allocate.
- k8s-operator: Whether the official NVIDIA GPU Operator is installed on the cluster in the gpu-operator namespace (true|false). Deleting the node pool does not uninstall the operator.
Make sure that the processor generation you select is compatible with the desired fGPU model. For more information about these models, see About Flexible GPUs > Models of fGPUs. fGPUs are attached by default to supported VM types. If multiple VM types support the same fGPU, you can select an alternative VM type to meet your requirements. For more information about VMs, see VM Types.
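With either model, once the drivers are in place, workloads request GPUs through the standard nvidia.com/gpu resource. The following is a minimal sketch; the deployment name and image are placeholders, and the toleration is only needed if you enabled the taint option on your node pool:

```yaml
# Hypothetical workload; the name and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-workload
  template:
    metadata:
      labels:
        app: gpu-workload
    spec:
      # Only needed if the node pool was created with taint: true.
      tolerations:
        - key: gpu
          value: "true"
          effect: NoSchedule
      containers:
        - name: app
          image: my-registry/my-gpu-app:latest   # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 1   # one fGPU per replica
```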
Supported fGPU Models
OKS supports the following fGPU models provided by 3DS OUTSCALE:
- nvidia-a100
- nvidia-a100-80
- nvidia-h100
- nvidia-l40
- nvidia-m60
- nvidia-p6
- nvidia-p100
- nvidia-v100
- nvidia-h200 (only supported in the flavored model)
For some fGPU models, additional VM types from the inference7 type are supported:
- nvidia-h200 (only supported in the flavored model):
  - inference7-h200.4xlargeA
- nvidia-h100:
  - inference7-h100.medium
  - inference7-h100.large
  - inference7-h100.xlarge
  - inference7-h100.2xlarge
- nvidia-l40:
  - inference7-l40.large
  - inference7-l40.medium
For more information about these models, see About Flexible GPUs > Models of fGPUs.
You must make sure that your chosen fGPU model is supported by the VM type that you defined when creating your node pool. If the fGPU model and VM type are incompatible, the allocated GPUs may fail to attach. After three unsuccessful attempts, the VM may fail to start as well.
Related Pages