
nvidia-docker Installation Series

Overview

Following the official documentation, the installation can be done with the commands below.

# Add the libnvidia-container yum repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo

# Optionally enable the experimental packages
yum-config-manager --enable libnvidia-container-experimental

# Verify
nvidia-docker run --rm nvidia/cuda nvidia-smi
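
The commands above only register the repository. Per the official guide, the runtime package still has to be installed and Docker restarted before the verification command will succeed; a minimal sketch, assuming a CentOS/RHEL host:

# Refresh repo metadata, install the runtime package, and restart Docker
yum clean expire-cache
yum install -y nvidia-docker2
systemctl restart docker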

Installing the Kubernetes GPU Device Plugin

https://github.com/NVIDIA/k8s-device-plugin#deployment-via-helm
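
Roughly following the linked README, the device plugin can be deployed with Helm; the release name and namespace below are illustrative:

# Register the NVIDIA device plugin chart repository and install the plugin
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm install nvdp nvdp/nvidia-device-plugin --namespace kube-system

Once the plugin DaemonSet is running, nodes advertise the nvidia.com/gpu resource consumed by the example pods below.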

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: cuda-container
      image: nvcr.io/nvidia/cuda:9.0-devel
      resources:
        limits:
          nvidia.com/gpu: 2 # requesting 2 GPUs
    - name: digits-container
      image: nvcr.io/nvidia/digits:20.12-tensorflow-py3
      resources:
        limits:
          nvidia.com/gpu: 2 # requesting 2 GPUs
---
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
      image: "harbor.dev-prev.com/middleware/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
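
Assuming the manifests above are saved to a file (gpu-test.yaml is a placeholder name), they can be applied and verified as follows:

kubectl apply -f gpu-test.yaml
# The vector-add pod should run to completion; its log reports the test result
kubectl logs cuda-vector-add
# Confirm the node actually advertises GPU capacity
kubectl describe node <node-name> | grep nvidia.com/gpu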


# Download nvidia-docker2 and its runtime dependencies for an offline install
repotrack nvidia-docker2
repotrack slirp4netns fuse-overlayfs container-selinux
# Package the downloaded RPMs (assumed here to be under nv-docker2/) for transfer
tar zcvf nv.tar.gz nv-docker2
# Quick check that containers can see the GPU
docker run --rm --gpus all cuda-vector-add:v0.1 nvidia-smi
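
On the offline target host, the archive can then be unpacked and the RPMs installed locally; a sketch, assuming the packages land in the nv-docker2 directory:

tar zxvf nv.tar.gz
# Install all downloaded RPMs together so local dependencies resolve
yum localinstall -y nv-docker2/*.rpm
systemctl restart docker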

The scheduler section of the cluster configuration file is written as follows.

scheduler:
  extra_args:
    address: 0.0.0.0
    kubeconfig: /etc/kubernetes/ssl/kubecfg-kube-scheduler.yaml
    leader-elect: 'true'
    policy-config-file: /etc/kubernetes/ssl/scheduler-policy-config.json
    profiling: 'false'
    v: '2'
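
The content of /etc/kubernetes/ssl/scheduler-policy-config.json is not shown above. It normally registers a scheduler extender; the sketch below uses the kube-scheduler Policy format with a hypothetical extender endpoint and managed resource name:

{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    {
      "urlPrefix": "http://127.0.0.1:32766/gpu-scheduler",
      "filterVerb": "filter",
      "bindVerb": "bind",
      "enableHttps": false,
      "nodeCacheCapable": true,
      "managedResources": [
        { "name": "nvidia.com/gpu", "ignoredByScheduler": false }
      ],
      "ignorable": false
    }
  ]
}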