2023/9/21

Docker: run NVIDIA CUDA-ready images

ref: NVIDIA publishes a collection of prebuilt Docker images with CUDA support: nvidia in docker image

For Docker to support these CUDA-ready images, however, you need to install nvidia-container-toolkit:
sudo apt install nvidia-container-toolkit
Then configure runtime support:
sudo nvidia-ctk runtime configure --runtime=docker
Then restart the Docker daemon:
sudo systemctl restart docker
After that, the docker command supports the --gpus all option.
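
A quick way to confirm the setup is to run nvidia-smi inside a CUDA image. This is only a minimal sketch: the function name gpu_smoke_test is made up for illustration, and the default image tag (borrowed from the Dockerfile later in this post) is an assumption that may no longer be available on Docker Hub.

```shell
#!/bin/sh
# Hypothetical helper: run nvidia-smi inside a CUDA image to confirm
# that --gpus all works. Usage: gpu_smoke_test [image]
# The default image tag is an assumption; any nvidia/cuda tag should do.
gpu_smoke_test() {
    image="${1:-nvidia/cuda:11.7.1-cudnn8-devel-ubuntu18.04}"
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found" >&2
        return 127
    fi
    # --rm removes the container afterwards; --gpus all exposes every host GPU.
    docker run --rm --gpus all "$image" nvidia-smi
}
```

If the toolkit is set up correctly, this should print the same GPU table that nvidia-smi prints on the host.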

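Also, the image's CUDA version must not be newer than the host's CUDA version. When I actually tested this with torch.rand(2000,128,device=torch.device('cuda')), it complained that a newer function was being used.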
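
The version constraint can be checked before pulling an image. This is a rough sketch under several assumptions: "host CUDA version" is taken to mean the "CUDA Version: X.Y" field in the nvidia-smi header, the image tag starts with its CUDA version (as nvidia/cuda tags do), only major.minor is compared, and sort -V from GNU coreutils is available. All helper names are hypothetical.

```shell
#!/bin/sh
# Reduce a version such as 11.7.1 to major.minor (11.7).
major_minor() {
    echo "$1" | cut -d. -f1,2
}

# Succeed when version $1 is not newer than version $2 (major.minor only).
# Relies on sort -V (GNU coreutils), which is an assumption about the host.
version_le() {
    a="$(major_minor "$1")"
    b="$(major_minor "$2")"
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$a" ]
}

# CUDA version from an image name or tag,
# e.g. nvidia/cuda:11.7.1-cudnn8-devel-ubuntu18.04 -> 11.7.1
image_cuda_version() {
    tag="${1##*:}"
    echo "${tag%%-*}"
}

# CUDA version reported in the nvidia-smi header on the host.
host_cuda_version() {
    nvidia-smi | sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Example use (not executed here):
#   img=nvidia/cuda:11.7.1-cudnn8-devel-ubuntu18.04
#   version_le "$(image_cuda_version "$img")" "$(host_cuda_version)" \
#       || echo "image CUDA is newer than the host reports"
```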


Here is a Dockerfile I came across in yolov5s_android:
FROM nvidia/cuda:11.7.1-cudnn8-devel-ubuntu18.04

ENV DEBIAN_FRONTEND noninteractive

RUN apt-get update --fix-missing
RUN apt-get install -y python3 python3-pip
RUN pip3 install --upgrade pip
RUN pip3 install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/cu110/torch_stable.html

# install openvino
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    cpio \
    sudo \
    lsb-release && \
    rm -rf /var/lib/apt/lists/*
# Add a user that UID:GID will be updated by vscode
ARG USERNAME=developer
ARG GROUPNAME=developer
ARG UID=1000
ARG GID=1000
ARG PASSWORD=developer
RUN groupadd -g $GID $GROUPNAME && \
    useradd -m -s /bin/bash -u $UID -g $GID -G sudo $USERNAME && \
    echo $USERNAME:$PASSWORD | chpasswd && \
    echo "$USERNAME   ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
USER $USERNAME
ENV HOME /home/developer

The original instructions are:
git clone --recursive https://github.com/lp6m/yolov5s_android
cd yolov5s_android
docker build ./ -f ./docker/Dockerfile  -t yolov5s_android
docker run -it --gpus all -v `pwd`:/workspace yolov5s_android bash
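
Once the image is built, a quick check that PyTorch inside the container can see the GPU might look like the following. This is a sketch only: it assumes the image was tagged yolov5s_android as in the build command above, and the function name torch_gpu_check is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: ask PyTorch inside the container whether CUDA is usable.
# Usage: torch_gpu_check [image]
torch_gpu_check() {
    image="${1:-yolov5s_android}"
    # torch.cuda.is_available() returns True when PyTorch can reach a GPU.
    docker run --rm --gpus all "$image" \
        python3 -c 'import torch; print(torch.cuda.is_available())'
}
```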
