However, for Docker to support these CUDA-ready images, nvidia-container-toolkit has to be installed:
sudo apt install nvidia-container-toolkit
Next, configure runtime support:
sudo nvidia-ctk runtime configure --runtime=docker
Then restart the docker daemon:
sudo systemctl restart docker
After this, the docker command supports the --gpus all option.
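One way to confirm that the configure step took effect is to look for an nvidia entry under "runtimes" in Docker's daemon config. The sketch below assumes the config lives at the default /etc/docker/daemon.json; the helper name has_nvidia_runtime is made up for illustration and is not part of any NVIDIA or Docker tool.

```python
import json
from pathlib import Path

# Assumption: default Docker daemon config location on Linux.
DAEMON_JSON = Path("/etc/docker/daemon.json")


def has_nvidia_runtime(config_text: str) -> bool:
    """Return True if the daemon config text registers a runtime named 'nvidia'."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(config, dict):
        return False
    runtimes = config.get("runtimes", {})
    return isinstance(runtimes, dict) and "nvidia" in runtimes


if __name__ == "__main__":
    if DAEMON_JSON.exists():
        print("nvidia runtime registered:", has_nvidia_runtime(DAEMON_JSON.read_text()))
    else:
        print(f"{DAEMON_JSON} not found")
```

If this reports False after running nvidia-ctk, the config may have been written somewhere else, so treat the result as a hint rather than proof.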
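Also, the CUDA version in the image must not be newer than the CUDA version on the host. When I actually tested this with torch.rand(2000,128,device=torch.device('cuda')), it complained that a newer function was being used.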
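The version rule above can be checked before pulling an image by comparing the image tag's CUDA version with the "CUDA Version" field that nvidia-smi prints in its header. This is only a rough sketch: the function names are hypothetical, the sample header line is illustrative rather than output from the machine used in this post, and it compares major.minor only.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '11.7.1' into (11, 7, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def host_cuda_version(nvidia_smi_output: str) -> tuple[int, ...]:
    """Extract the 'CUDA Version: X.Y' field from the nvidia-smi header."""
    match = re.search(r"CUDA Version:\s*(\d+(?:\.\d+)*)", nvidia_smi_output)
    if match is None:
        raise ValueError("no 'CUDA Version' field found in nvidia-smi output")
    return parse_version(match.group(1))


def image_is_supported(image_cuda: str, nvidia_smi_output: str) -> bool:
    """True if the image's CUDA major.minor is not newer than the host's."""
    host = host_cuda_version(nvidia_smi_output)
    return parse_version(image_cuda)[:2] <= host[:2]


if __name__ == "__main__":
    # Illustrative header line only; paste the real `nvidia-smi` output here.
    sample = "| NVIDIA-SMI 535.104.05   Driver Version: 535.104.05   CUDA Version: 12.2     |"
    print(image_is_supported("11.7.1", sample))
```

For the Dockerfile's base image nvidia/cuda:11.7.1-cudnn8-devel-ubuntu18.04, the value to pass as image_cuda would be "11.7.1".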
Here is a Dockerfile that I came across in yolov5s_android:
FROM nvidia/cuda:11.7.1-cudnn8-devel-ubuntu18.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update --fix-missing
RUN apt-get install -y python3 python3-pip
RUN pip3 install --upgrade pip
RUN pip3 install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/cu110/torch_stable.html
# install openvino
RUN apt-get update && apt-get install -y --no-install-recommends \
wget \
cpio \
sudo \
lsb-release && \
rm -rf /var/lib/apt/lists/*
# Add a user that UID:GID will be updated by vscode
ARG USERNAME=developer
ARG GROUPNAME=developer
ARG UID=1000
ARG GID=1000
ARG PASSWORD=developer
RUN groupadd -g $GID $GROUPNAME && \
useradd -m -s /bin/bash -u $UID -g $GID -G sudo $USERNAME && \
echo $USERNAME:$PASSWORD | chpasswd && \
echo "$USERNAME ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
USER $USERNAME
ENV HOME /home/developer
The original instructions are:
git clone --recursive https://github.com/lp6m/yolov5s_android
cd yolov5s_android
docker build ./ -f ./docker/Dockerfile -t yolov5s_android
docker run -it --gpus all -v `pwd`:/workspace yolov5s_android bash
241122 update
ref
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
To test whether the installation succeeded, run nvidia-smi through the NVIDIA Docker runtime:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi