Using fGPUs With OKS

You can work with NVIDIA GPUs in your OKS clusters through the allocation of flexible GPUs (fGPUs). The appropriate node pool configuration allows worker nodes to allocate, attach, and use fGPUs. For more information, see About Flexible GPUs and Node Pool Manifest Reference.

OKS currently supports two GPU configuration models:

  • The flavored model (recommended): The node image does not include GPU drivers and requires installing them separately (for example using the NVIDIA GPU Operator).

  • The legacy model: GPU drivers are pre-installed in the node image and configured through the fgpu field.

The behavior of the fgpu parameters therefore depends on the configuration model you choose.

Enabling GPU Support Through Your Node Pool Manifest

Using the Flavored Model

The flavored model introduces GPU node pools based on flavored node images. These images do not include GPU drivers; you must install the drivers yourself after node pool creation.

To use this model, you need to apply a Kubernetes manifest to your node pool that specifies a compatible nodeType and the flavour field.

This model is required for newer GPU types (such as H200) and is the recommended approach. Use it if you need at least one of the following:

  • to use H200 GPUs;

  • to use inference7 VM types;

  • to attach more than one GPU per node;

  • to use a custom NVIDIA driver.

In this model:

  • The node image does not include GPU drivers.

  • GPU capabilities are determined by the selected nodeType.

  • The flavour field defines the operating system and compatibility with GPU drivers.

  • You must install the GPU drivers manually (for example using the NVIDIA GPU Operator or a custom deployment method).

Manifest Sample
apiVersion: oks.dev/v1beta2
kind: NodePool
metadata:
  name: fgpu-pool-b
spec:
  desiredNodes: 1
  nodeType: inference7-h200.4xlargeA
  flavour: oks.ubuntu22
  volumes:
    - device: root
      type: gp2
      size: 300
      dir: /
  zones:
  - cloudgouv-eu-west-1b
  taint: false
  fgpu:
    k8s-operator: false
  upgradeStrategy:
    autoUpgradeEnabled: false
  autoHealing: false

This sample includes the following key fields that you need to specify:

  • nodeType: The VM type to use for the node pool. It determines the attached GPU model. For more information, see VM Types > OUTSCALE Type.

  • flavour: The OKS OMI to use, which defines the operating system and compatibility with GPU drivers.

  • k8s-operator: This field is ignored in the flavored model; always set it to false.
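
Once these fields are set, you can create the node pool by applying the manifest. The sketch below assumes you saved the sample above as fgpu-pool-b.yaml and that your kubeconfig points at the cluster that serves the oks.dev NodePool resource:

    Request sample
    ```shell
    # Create or update the node pool from the manifest.
    $ kubectl apply -f fgpu-pool-b.yaml

    # Check the node pool once it has been created.
    $ kubectl get nodepool fgpu-pool-b
    ```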

Installing the NVIDIA GPU Operator

After creating a node pool using the new model, you need to install the NVIDIA GPU Operator to deploy the required GPU drivers and components.

Before you begin: Install Helm on your machine.

  1. Add the NVIDIA Helm repository:

    Request sample
    $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && helm repo update
  2. Install the NVIDIA GPU Operator:

    Request sample
    $ helm install gpu-operator -n gpu-operator --create-namespace nvidia/gpu-operator \
      --set driver.enabled=true \
      --set toolkit.enabled=true \
      --version=v25.10.0 \
      --set driver.usePrecompiled=false
    • We recommend using version v25.10.0 because it supports all fGPU models currently supported by OKS.

    • Before installing another version, check its compatibility with the oks.ubuntu22 flavor and your target GPU model.

  3. (optional) If you enabled the taint option when creating your node pool, remove the taint to allow the GPU Operator pods to be scheduled on the nodes:

    Request sample
    $ kubectl taint node fgpu-pool-b gpu=true:NoSchedule-

    The trailing hyphen (-) removes the gpu=true:NoSchedule taint from the node. If taint was set to false when the node pool was created, you do not need to perform this step.
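
After the operator components are deployed, you can verify that the operator pods are healthy and that each GPU node advertises the nvidia.com/gpu resource. The following sketch is illustrative; pod names vary with the operator version:

    Request sample
    ```shell
    # List the GPU Operator components; all pods should be Running or Completed.
    $ kubectl get pods -n gpu-operator

    # Confirm that GPU nodes now expose the nvidia.com/gpu resource.
    $ kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.capacity.nvidia\.com/gpu'
    ```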

Using the Legacy Model

The legacy model uses node images (OMIs) with pre-installed GPU drivers. GPU configuration is fully managed through the fgpu field in the node pool manifest.

This model provides a simpler and faster deployment process, but you cannot use it if you want to:

  • use H200 GPUs;

  • use inference7 VM types;

  • attach more than one GPU per node;

  • use a custom NVIDIA driver.

To use this model, you need to create a node pool using a Kubernetes manifest that includes the fgpu field.

In this model:

  • You can attach only one GPU per node.

  • GPU drivers are pre-installed on the node image.

  • The GPU model must be explicitly specified.

  • The NVIDIA GPU Operator will be automatically installed during node pool creation.

Manifest Sample
apiVersion: oks.dev/v1beta2
kind: NodePool
metadata:
  name: application-pool2-a
spec:
  desiredNodes: 2
  nodeType: tinav5.c2r4p1
  fgpu:
    model: "nvidia-p6"
    k8s-operator: true
  zones:
    - eu-west-2a
  upgradeStrategy:
    maxUnavailable: 1
    maxSurge: 0
    autoUpgradeEnabled: false
  autoHealing: true

You can configure GPU support by specifying the following characteristics under the spec section of your node pool manifest.

fgpu spec Sample
spec:
  fgpu:
    model: "nvidia-p6"
    k8s-operator: true

This sample contains the following fields that you need to specify:

  • model: The GPU model to allocate.

  • k8s-operator: Whether to automatically install the official NVIDIA GPU Operator in the gpu-operator namespace on the cluster (true | false).

    Deleting the node pool does not uninstall the operator.

Make sure that the processor generation you select is compatible with the desired fGPU model. For more information about these models, see About Flexible GPUs > Models of fGPUs.

fGPUs are attached by default to supported VM types. If multiple VM types support the same fGPU, you can select an alternative VM type to meet your requirements. For more information about VMs, see VM Types.

Supported fGPU Models

OKS supports the following fGPU models provided by 3DS OUTSCALE:

  • nvidia-a100

  • nvidia-a100-80

  • nvidia-h100

  • nvidia-l40

  • nvidia-m60

  • nvidia-p6

  • nvidia-p100

  • nvidia-v100

  • nvidia-h200 (only supported in the flavored model)

The nvidia-h200 model is supported only if the following conditions are met:

  • You are using the flavored model;

  • You are using an inference7 VM type;

  • The VM type is available in the Subregion where the node pool is deployed.

For some fGPU models, additional VM types from the inference7 family are supported:

  • nvidia-h200 (only supported in the flavored model)

    • inference7-h200.4xlargeA

  • nvidia-h100

    • inference7-h100.medium

    • inference7-h100.large

    • inference7-h100.xlarge

    • inference7-h100.2xlarge

  • nvidia-l40

    • inference7-l40.large

    • inference7-l40.medium

For more information about these models, see About Flexible GPUs > Models of fGPUs.

You must make sure that your chosen fGPU model is supported by the VM type that you defined when creating your node pool. If the fGPU model and VM type are incompatible, the allocated GPUs may fail to attach. After 3 unsuccessful attempts, the VM may fail to start as well.
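
With either model, once the drivers are deployed, workloads consume the attached fGPU by requesting the nvidia.com/gpu resource in their pod specification. The manifest below is a minimal illustration; the image name and tag are examples, not OKS requirements, and should match your installed driver version:

    Manifest Sample
    ```yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-smoke-test
    spec:
      restartPolicy: Never
      containers:
        - name: cuda
          # Illustrative CUDA base image; pick one compatible with your driver.
          image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
          command: ["nvidia-smi"]
          resources:
            limits:
              nvidia.com/gpu: 1
    ```

If the pod completes and kubectl logs gpu-smoke-test shows the GPU, the driver stack is working.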

Related Pages