Assign Devices to Pods and Containers
1 - Set Up DRA in a Cluster
Kubernetes v1.35 [stable] (enabled by default)

This page shows you how to configure dynamic resource allocation (DRA) in a Kubernetes cluster by enabling API groups and configuring classes of devices. These instructions are for cluster administrators.
About DRA
Dynamic resource allocation is a Kubernetes feature that lets you request and share resources among Pods. These resources are often attached devices like hardware accelerators.
With DRA, device drivers and cluster admins define device classes that are available to claim in workloads. Kubernetes allocates matching devices to specific claims and places the corresponding Pods on nodes that can access the allocated devices.
Ensure that you're familiar with how DRA works and with DRA terminology like DeviceClasses, ResourceClaims, and ResourceClaimTemplates. For details, see Dynamic Resource Allocation (DRA).
Before you begin
You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube or you can use one of these Kubernetes playgrounds:
Your Kubernetes server must be at or later than version v1.34. To check the version, enter `kubectl version`.
- Directly or indirectly attach devices to your cluster. To avoid potential issues with drivers, wait until you set up the DRA feature for your cluster before you install drivers.
Optional: enable additional DRA API groups
DRA overall is a stable feature in Kubernetes; however, aspects of it may still be alpha or beta. If you want to use any aspect of DRA that is not yet stable, and the associated feature relies on a dedicated API kind, then you must enable the associated alpha or beta API groups.
Some older DRA drivers or workloads might still need the v1beta1 API from Kubernetes 1.30 or the v1beta2 API from Kubernetes 1.32. Enable the following API groups only if you need to support them:
* `resource.k8s.io/v1beta1`
* `resource.k8s.io/v1beta2`
Alpha features with separate API types need:

* `resource.k8s.io/v1alpha3`
For more information, see Enabling or disabling API groups.
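For example, if your control plane runs kube-apiserver directly or as a static Pod, enabling these groups comes down to extending the `--runtime-config` flag. The following is a sketch only; the exact flag layout and how you edit it depend on how your cluster was deployed:

```shell
# Sketch: serve the beta and alpha resource.k8s.io API groups in addition
# to the stable v1 group; keep the rest of your existing flags unchanged.
kube-apiserver \
  --runtime-config=resource.k8s.io/v1beta1=true,resource.k8s.io/v1beta2=true,resource.k8s.io/v1alpha3=true
```

Managed Kubernetes offerings may expose this setting differently or not at all; check your provider's documentation.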
Verify that DRA is enabled
To verify that the cluster is configured correctly, try to list DeviceClasses:
kubectl get deviceclasses
If the component configuration was correct, the output is similar to the following:
No resources found
If DRA isn't correctly configured, the output of the preceding command is similar to the following:
error: the server doesn't have a resource type "deviceclasses"
For example, this can occur when the resource.k8s.io API group was disabled. A similar check is applicable to alpha or beta quality top-level types.
Try the following troubleshooting steps:
- Reconfigure and restart the `kube-apiserver` component.
- If the complete `.spec.resourceClaims` field gets removed from Pods, or if Pods get scheduled without considering the ResourceClaims, then verify that the `DynamicResourceAllocation` feature gate is not turned off for kube-apiserver, kube-controller-manager, kube-scheduler, or the kubelet.
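As part of troubleshooting, it can also help to confirm which resource.k8s.io API versions the API server actually serves. A minimal check, assuming `kubectl` access to the cluster:

```shell
# Print the served resource.k8s.io API versions; fall back to a hint if none are served.
kubectl api-versions | grep '^resource.k8s.io/' || echo "resource.k8s.io is not served"
```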
Install device drivers
After you enable DRA for your cluster, you can install the drivers for your attached devices. For instructions, check the documentation of your device vendor or of the project that maintains the device drivers. The drivers that you install must be compatible with DRA.
To verify that your installed drivers are working as expected, list ResourceSlices in your cluster:
kubectl get resourceslices
The output is similar to the following:
NAME NODE DRIVER POOL AGE
00000-driver.example.com-cluster-1-node-1-abcde cluster-1-node-1 driver.example.com cluster-1-device-pool-1-r1gc 7s
00000-driver.example.com-cluster-1-node-2-fghij cluster-1-node-2 driver.example.com cluster-1-device-pool-2-446z 8s
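On clusters with many nodes this list can get long. Assuming your API server supports field selection on `spec.nodeName` for ResourceSlices, you can narrow the output to a single node (node name is the one from the example output above):

```shell
# List only the ResourceSlices published for one node.
kubectl get resourceslices --field-selector spec.nodeName=cluster-1-node-1
```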
Try the following troubleshooting steps:
- Check the health of the DRA driver and look for error messages about publishing ResourceSlices in its log output. The vendor of the driver may have further instructions about installation and troubleshooting.
Create DeviceClasses
You can define categories of devices that your application operators can claim in workloads by creating DeviceClasses. Some device driver providers might also instruct you to create DeviceClasses during driver installation.
The ResourceSlices that your driver publishes contain information about the devices that the driver manages, such as capacity, metadata, and attributes. You can use Common Expression Language to filter for properties in your DeviceClasses, which can make finding devices easier for your workload operators.
- To find the device properties that you can select in DeviceClasses by using CEL expressions, get the specification of a ResourceSlice:

  kubectl get resourceslice <resourceslice-name> -o yaml

  The output is similar to the following:

  apiVersion: resource.k8s.io/v1
  kind: ResourceSlice
  # lines omitted for clarity
  spec:
    devices:
    - attributes:
        type:
          string: gpu
      capacity:
        memory:
          value: 64Gi
      name: gpu-0
    - attributes:
        type:
          string: gpu
      capacity:
        memory:
          value: 64Gi
      name: gpu-1
    driver: driver.example.com
    nodeName: cluster-1-node-1
  # lines omitted for clarity

  You can also check the driver provider's documentation for available properties and values.
- Review the following example DeviceClass manifest, which selects any device that's managed by the `driver.example.com` device driver:

  apiVersion: resource.k8s.io/v1
  kind: DeviceClass
  metadata:
    name: example-device-class
  spec:
    selectors:
    - cel:
        expression: |-
          device.driver == "driver.example.com"
- Create the DeviceClass in your cluster:

  kubectl apply -f https://k8s.io/examples/dra/deviceclass.yaml
Clean up
To delete the DeviceClass that you created in this task, run the following command:
kubectl delete -f https://k8s.io/examples/dra/deviceclass.yaml
What's next
2 - Allocate Devices to Workloads with DRA
Kubernetes v1.35 [stable] (enabled by default)

This page shows you how to allocate devices to your Pods by using dynamic resource allocation (DRA). These instructions are for workload operators.

Before reading this page, familiarize yourself with how DRA works and with DRA terminology like ResourceClaims and ResourceClaimTemplates. For more information, see Dynamic Resource Allocation (DRA).
About device allocation with DRA
As a workload operator, you can claim devices for your workloads by creating ResourceClaims or ResourceClaimTemplates. When you deploy your workload, Kubernetes and the device drivers find available devices, allocate them to your Pods, and place the Pods on nodes that can access those devices.
Before you begin
You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube or you can use one of these Kubernetes playgrounds:
Your Kubernetes server must be at or later than version v1.34. To check the version, enter `kubectl version`.
- Ensure that your cluster admin has set up DRA, attached devices, and installed drivers. For more information, see Set Up DRA in a Cluster.
Identify devices to claim
Your cluster administrator or the device drivers create DeviceClasses that define categories of devices. You can claim devices by using Common Expression Language to filter for specific device properties.
Get a list of DeviceClasses in the cluster:
kubectl get deviceclasses
The output is similar to the following:
NAME AGE
driver.example.com 16m
If you get a permission error, you might not have access to get DeviceClasses. Check with your cluster administrator or with the driver provider for available device properties.
Claim resources
You can request resources from a DeviceClass by using ResourceClaims. To create a ResourceClaim, do one of the following:
- Manually create a ResourceClaim if you want multiple Pods to share access to the same devices, or if you want a claim to exist beyond the lifetime of a Pod.
- Use a ResourceClaimTemplate to let Kubernetes generate and manage per-Pod ResourceClaims. Create a ResourceClaimTemplate if you want every Pod to have access to separate devices that have similar configurations. For example, you might want simultaneous access to devices for Pods in a Job that uses parallel execution.
If you directly reference a specific ResourceClaim in a Pod, that ResourceClaim must already exist in the cluster. If a referenced ResourceClaim doesn't exist, the Pod remains in a pending state until the ResourceClaim is created. You can reference an auto-generated ResourceClaim in a Pod, but this isn't recommended because auto-generated ResourceClaims are bound to the lifetime of the Pod that triggered the generation.
To create a workload that claims resources, select one of the following options:
Review the following example manifest:
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: example-resource-claim-template
spec:
  spec:
    devices:
      requests:
      - name: gpu-claim
        exactly:
          deviceClassName: example-device-class
          selectors:
          - cel:
              expression: |-
                device.attributes["driver.example.com"].type == "gpu" &&
                device.capacity["driver.example.com"].memory == quantity("64Gi")
This manifest creates a ResourceClaimTemplate that requests devices in the `example-device-class` DeviceClass that match both of the following parameters:

- Devices that have a `driver.example.com/type` attribute with a value of `gpu`.
- Devices that have `64Gi` of memory capacity.
To create the ResourceClaimTemplate, run the following command:
kubectl apply -f https://k8s.io/examples/dra/resourceclaimtemplate.yaml
Review the following example manifest:
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: example-resource-claim
spec:
  devices:
    requests:
    - name: single-gpu-claim
      exactly:
        deviceClassName: example-device-class
        allocationMode: All
        selectors:
        - cel:
            expression: |-
              device.attributes["driver.example.com"].type == "gpu" &&
              device.capacity["driver.example.com"].memory == quantity("64Gi")
This manifest creates a ResourceClaim that requests devices in the `example-device-class` DeviceClass that match both of the following parameters:

- Devices that have a `driver.example.com/type` attribute with a value of `gpu`.
- Devices that have `64Gi` of memory capacity.
To create the ResourceClaim, run the following command:
kubectl apply -f https://k8s.io/examples/dra/resourceclaim.yaml
Request devices in workloads using DRA
To request device allocation, specify a ResourceClaim or a ResourceClaimTemplate
in the resourceClaims field of the Pod specification. Then, request a specific
claim by name in the resources.claims field of a container in that Pod.
You can specify multiple entries in the resourceClaims field and use specific
claims in different containers.
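As a minimal sketch, the wiring for a single Pod looks like the following. This assumes the `example-resource-claim` ResourceClaim from the previous section already exists in the cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-dra-pod
spec:
  resourceClaims:        # claims available to this Pod
  - name: gpu
    resourceClaimName: example-resource-claim
  containers:
  - name: app
    image: ubuntu:24.04
    command: ["sleep", "9999"]
    resources:
      claims:
      - name: gpu        # must match an entry in spec.resourceClaims
```

The name in `resources.claims` refers to the entry in `spec.resourceClaims`, not to the ResourceClaim object itself; that indirection is what lets several containers share one claim entry.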
- Review the following example Job:

  apiVersion: batch/v1
  kind: Job
  metadata:
    name: example-dra-job
  spec:
    completions: 10
    parallelism: 2
    template:
      spec:
        restartPolicy: Never
        containers:
        - name: container0
          image: ubuntu:24.04
          command: ["sleep", "9999"]
          resources:
            claims:
            - name: separate-gpu-claim
        - name: container1
          image: ubuntu:24.04
          command: ["sleep", "9999"]
          resources:
            claims:
            - name: shared-gpu-claim
        - name: container2
          image: ubuntu:24.04
          command: ["sleep", "9999"]
          resources:
            claims:
            - name: shared-gpu-claim
        resourceClaims:
        - name: separate-gpu-claim
          resourceClaimTemplateName: example-resource-claim-template
        - name: shared-gpu-claim
          resourceClaimName: example-resource-claim

  Each Pod in this Job has the following properties:

  - Makes a ResourceClaimTemplate named `separate-gpu-claim` and a ResourceClaim named `shared-gpu-claim` available to containers.
  - Runs the following containers:
    - `container0` requests the devices from the `separate-gpu-claim` ResourceClaimTemplate.
    - `container1` and `container2` share access to the devices from the `shared-gpu-claim` ResourceClaim.
- Create the Job:

  kubectl apply -f https://k8s.io/examples/dra/dra-example-job.yaml
Try the following troubleshooting steps:
- When the workload does not start as expected, drill down from the Job to its Pods to the ResourceClaims, and check the objects at each level with `kubectl describe` to see whether there are any status fields or events that might explain why the workload is not starting.
- When creating a Pod fails with `must specify one of: resourceClaimName, resourceClaimTemplateName`, check that all entries in `pod.spec.resourceClaims` have exactly one of those fields set. If they do, then it is possible that the cluster has a mutating Pod webhook installed which was built against APIs from Kubernetes < 1.32. Work with your cluster administrator to check this.
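The drill-down from Job to Pods to ResourceClaims can be sketched as the following command sequence. The Job name matches the earlier example; the names of generated ResourceClaims vary, so list them first:

```shell
# Inspect each level in turn: the Job, its Pods, then the ResourceClaims.
kubectl describe job example-dra-job
kubectl get pods -l job-name=example-dra-job
kubectl get resourceclaims
kubectl describe resourceclaim <claim-name>   # pick a name from the previous output
```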
Clean up
To delete the Kubernetes objects that you created in this task, follow these steps:
- Delete the example Job:

  kubectl delete -f https://k8s.io/examples/dra/dra-example-job.yaml

- To delete your resource claims, run one of the following commands:

  - Delete the ResourceClaimTemplate:

    kubectl delete -f https://k8s.io/examples/dra/resourceclaimtemplate.yaml

  - Delete the ResourceClaim:

    kubectl delete -f https://k8s.io/examples/dra/resourceclaim.yaml
What's next
3 - Access DRA Device Metadata
Kubernetes v1.36 [alpha]
This page shows you how to access device metadata from containers that use dynamic resource allocation (DRA). Device metadata lets workloads discover information about allocated devices, such as device attributes or network interface details, by reading JSON files at well-known paths inside the container.
Before reading this page, familiarize yourself with Dynamic Resource Allocation (DRA) and how to allocate devices to workloads.
Before you begin
You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube or you can use one of these Kubernetes playgrounds:
Your Kubernetes server must be version v1.36. To check the version, enter `kubectl version`.
- Ensure that your cluster admin has set up DRA, attached devices, and installed drivers. For more information, see Set Up DRA in a Cluster.
- Ensure that the DRA driver deployed in your cluster supports device metadata. Drivers that use the DRA kubelet plugin enable the `EnableDeviceMetadata` and `MetadataVersions` options when starting the plugin. Check the driver's documentation for details.
Access device metadata with a ResourceClaim
When you use a directly referenced ResourceClaim to allocate devices, the device metadata files appear inside the container at:
/var/run/kubernetes.io/dra-device-attributes/resourceclaims/<claimName>/<requestName>/<driverName>-metadata.json
- Review the following example manifest:

  apiVersion: resource.k8s.io/v1
  kind: ResourceClaim
  metadata:
    name: gpu-claim
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: gpu.example.com
  ---
  apiVersion: v1
  kind: Pod
  metadata:
    name: gpu-metadata-reader
  spec:
    resourceClaims:
    - name: my-gpu
      resourceClaimName: gpu-claim
    containers:
    - name: workload
      image: ubuntu:24.04
      resources:
        claims:
        - name: my-gpu
          request: gpu
      command:
      - sh
      - -c
      - |
        echo "=== DRA device metadata ==="
        find /var/run/kubernetes.io/dra-device-attributes -name '*-metadata.json' -print -exec cat {} \;
        sleep 3600
    restartPolicy: Never

  This manifest creates a ResourceClaim named `gpu-claim` that requests a device from the `gpu.example.com` DeviceClass, and a Pod that reads the device metadata.

- Create the ResourceClaim and Pod:

  kubectl apply -f https://k8s.io/examples/dra/dra-device-metadata-pod.yaml

- After the Pod is running, view the container logs to see the metadata:

  kubectl logs gpu-metadata-reader

  The output is similar to:

  === DRA device metadata ===
  /var/run/kubernetes.io/dra-device-attributes/resourceclaims/gpu-claim/gpu/gpu.example.com-metadata.json
  {
    "kind": "DeviceMetadata",
    "apiVersion": "metadata.resource.k8s.io/v1alpha1",
    ...
  }

- To inspect the full metadata file, exec into the container:

  kubectl exec gpu-metadata-reader -- \
    cat /var/run/kubernetes.io/dra-device-attributes/resourceclaims/gpu-claim/gpu/gpu.example.com-metadata.json

  The output is a JSON object containing device attributes like the model, driver version, and device UUID. See metadata schema for details on the JSON structure.
Access device metadata with a ResourceClaimTemplate
When you use a ResourceClaimTemplate, Kubernetes generates a ResourceClaim for each Pod. Because the generated claim name is not predictable, the metadata files appear at a path that uses the Pod's claim reference name instead:
/var/run/kubernetes.io/dra-device-attributes/resourceclaimtemplates/<podClaimName>/<requestName>/<driverName>-metadata.json
The <podClaimName> corresponds to the name field in the Pod's
spec.resourceClaims[] entry. The JSON metadata also includes a
podClaimName field that records this mapping.
- Review the following example manifest:

  apiVersion: resource.k8s.io/v1
  kind: ResourceClaimTemplate
  metadata:
    name: gpu-claim-template
  spec:
    spec:
      devices:
        requests:
        - name: gpu
          exactly:
            deviceClassName: gpu.example.com
  ---
  apiVersion: v1
  kind: Pod
  metadata:
    name: gpu-metadata-template-reader
  spec:
    resourceClaims:
    - name: my-gpu
      resourceClaimTemplateName: gpu-claim-template
    containers:
    - name: workload
      image: ubuntu:24.04
      resources:
        claims:
        - name: my-gpu
          request: gpu
      command:
      - sh
      - -c
      - |
        echo "=== DRA device metadata (from template) ==="
        find /var/run/kubernetes.io/dra-device-attributes -name '*-metadata.json' -print -exec cat {} \;
        sleep 3600
    restartPolicy: Never

  This manifest creates a ResourceClaimTemplate and a Pod. Each Pod gets its own generated ResourceClaim. The metadata path uses the Pod's claim reference name `my-gpu`.

- Create the ResourceClaimTemplate and Pod:

  kubectl apply -f https://k8s.io/examples/dra/dra-device-metadata-template-pod.yaml

- After the Pod is running, view the metadata:

  kubectl exec gpu-metadata-template-reader -- \
    cat /var/run/kubernetes.io/dra-device-attributes/resourceclaimtemplates/my-gpu/gpu/gpu.example.com-metadata.json
Read metadata in your application
Go applications
The k8s.io/dynamic-resource-allocation/devicemetadata package provides
ready-made functions for reading metadata files. These functions handle
version negotiation automatically, decoding the metadata stream and converting
it to internal types so your code works across schema versions without manual
version checks.
For a directly referenced ResourceClaim:
import "k8s.io/dynamic-resource-allocation/devicemetadata"
dm, err := devicemetadata.ReadResourceClaimMetadata("gpu-claim", "gpu")
For a template-generated claim (using the Pod's claim reference name):
dm, err := devicemetadata.ReadResourceClaimTemplateMetadata("my-gpu", "gpu")
If you know the specific driver name, you can read a single driver's metadata file:
dm, err := devicemetadata.ReadResourceClaimMetadataWithDriverName("gpu.example.com", "gpu-claim", "gpu")
The returned *metadata.DeviceMetadata contains the claim metadata, requests,
and per-device attributes.
Applications in other languages can read the JSON file directly and inspect
the apiVersion field to determine the schema version before parsing.
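For example, a shell-based workload could guard its parsing on the schema version as sketched below. The file content and temporary path are illustrative only; a real workload would read the `*-metadata.json` file from the well-known path, ideally with a JSON-aware tool such as `jq` rather than `sed`:

```shell
# Create an illustrative metadata file (stand-in for the real well-known path).
cat > /tmp/example-metadata.json <<'EOF'
{
  "kind": "DeviceMetadata",
  "apiVersion": "metadata.resource.k8s.io/v1alpha1",
  "podClaimName": "my-gpu"
}
EOF

# Extract apiVersion (sed keeps the sketch dependency-free; prefer jq in practice).
version=$(sed -n 's/.*"apiVersion": *"\([^"]*\)".*/\1/p' /tmp/example-metadata.json)
if [ "$version" = "metadata.resource.k8s.io/v1alpha1" ]; then
  echo "known schema: $version"
else
  echo "unknown schema: $version" >&2
fi
```

Checking the version before parsing means a workload can fail loudly, or fall back, when a driver starts publishing a newer schema.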
Clean up
Delete the resources that you created:
kubectl delete -f https://k8s.io/examples/dra/dra-device-metadata-pod.yaml
kubectl delete -f https://k8s.io/examples/dra/dra-device-metadata-template-pod.yaml
What's next
- Learn more about DRA device metadata
- Allocate devices to workloads with DRA
- For more information on the design, see KEP-5304.