Huge pages
This section explains setting up and using huge pages. For additional information, see the Kubernetes documentation.
Warning
To increase performance and efficiency, disable transparent huge pages before following the steps in this section, especially on nodes with high memory utilization.
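Before continuing, it can help to confirm the current transparent huge pages mode on a node. The following is a minimal sketch, assuming a Linux node that exposes the standard /sys/kernel/mm/transparent_hugepage/enabled file, in which the active mode is shown in square brackets. The helper name active_thp_mode is illustrative only.

```python
from pathlib import Path

# Standard Linux sysfs location of the transparent huge pages setting.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed (active) mode from text such as 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


if __name__ == "__main__":
    if THP_ENABLED.exists():
        print("transparent huge pages:", active_thp_mode(THP_ENABLED.read_text()))
    else:
        print("transparent huge pages are not available on this kernel")
```

A result of never means transparent huge pages are already disabled on that node.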
Set huge pages at the node level
Huge pages are configured by setting the kernel parameter vm.nr_hugepages. This parameter can be set using a DaemonSet:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: sysctl-hugepages
  namespace: kube-system
  labels:
    k8s-app: sysctl-hugepages
spec:
  selector:
    matchLabels:
      name: sysctl-hugepages
  template:
    metadata:
      labels:
        name: sysctl-hugepages
    spec:
      tolerations:
      # these tolerations allow the DaemonSet to run on control plane nodes
      # remove them if your control plane nodes should not run pods
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: sysctl
        image: busybox
        command: ["/bin/sh"]
        args: ["-c", "sysctl -w vm.hugetlb_shm_group=100; sysctl -w vm.nr_hugepages=1280; tail -f /dev/null"]
        securityContext:
          privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
      terminationGracePeriodSeconds: 30
      # this nodeSelector restricts the DaemonSet to the nodes hosting ML pods
      # adapt the value if necessary
      nodeSelector:
        role: ml-worker
The Docker image used is the standard busybox image.
Arguments
Huge pages are set with these arguments:
args: ["-c", "sysctl -w vm.hugetlb_shm_group=100; sysctl -w vm.nr_hugepages=1280; tail -f /dev/null"]
vm.hugetlb_shm_group=100 (gid of default ml user)
vm.nr_hugepages=1280
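After the DaemonSet has applied these values, the result can be checked on a node by reading /proc/meminfo. The sketch below is one way to do that, assuming it runs directly on the Linux node (or in a container that sees the node's /proc). The function name parse_meminfo is illustrative.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip lines that are not "Name: value [unit]"
        fields[name.strip()] = int(parts[0])
    return fields


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    # HugePages_* are page counts; Hugepagesize is reported in kB.
    print("HugePages_Total:", info.get("HugePages_Total"))
    print("HugePages_Free:", info.get("HugePages_Free"))
    print("Hugepagesize (kB):", info.get("Hugepagesize"))
```

If HugePages_Total is lower than the requested vm.nr_hugepages value, the kernel could not reserve all of the pages, which can happen when memory is already fragmented or in use.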
Note
The Linux huge page pool should be sized at 3/8 of the node's physical memory.
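As a worked example of the 3/8 guideline, the sketch below derives a vm.nr_hugepages value from a node's physical memory. It assumes that the guideline refers to the total size of the huge page pool and that the huge page size is 2 MiB (the usual default on x86_64; check Hugepagesize in /proc/meminfo for the actual value). The function name and parameters are illustrative.

```python
def hugepages_for_memory(mem_total_kib, hugepage_size_kib=2048):
    """Return a vm.nr_hugepages value covering 3/8 of physical memory.

    mem_total_kib: physical memory in KiB (for example MemTotal from /proc/meminfo).
    hugepage_size_kib: huge page size in KiB; the 2048 default is an assumption.
    """
    target_kib = mem_total_kib * 3 // 8      # 3/8 of physical memory, rounded down
    return target_kib // hugepage_size_kib   # whole pages only


if __name__ == "__main__":
    # A node with 16 GiB of RAM: 3/8 of it is 6 GiB, which is 3072 pages of 2 MiB.
    print(hugepages_for_memory(16 * 1024 * 1024))
```

Under the same 2 MiB assumption, the value 1280 used in the DaemonSet reserves 2560 MiB of huge pages.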
Set privileged to true
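Set the securityContext: privileged setting to true as shown below (and in Set huge pages at the node level). The container needs privileged mode because sysctl -w changes kernel parameters on the node itself.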
containers:
- name: sysctl
  image: busybox
  command: ["/bin/sh"]
  args: ["-c", "sysctl -w vm.hugetlb_shm_group=100; sysctl -w vm.nr_hugepages=1280; tail -f /dev/null"]
  securityContext:
    privileged: true