High Altitude Oolong: March 2022

2022/3/22

Bookmark : online gdb

onlinegdb

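An interesting site: an online IDE for many languages that can also run in debug mode.
A gdb command prompt appears at the bottom.
So it is handy for testing small pieces of code and syntax.
It can also be used to learn gdb commands.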

2022/3/21

Test Yolact

Notes on the steps, ref:

I only want to test inference.

So create an env and install the packages yolact needs, following the instructions in ref:1:

conda create --name yolact-3.7.9 python=3.7.9
conda activate yolact-3.7.9
pip install cython opencv-python pillow pycocotools matplotlib

Then clone the yolact project source:

git clone https://github.com/dbolya/yolact.git

Then download the trained weights; the Evaluation section of yolact has the download links.
Choose yolact_resnet50_54_800000.pth
mkdir weights and copy the .pth into it.

Also install pytorch, see ref:2

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

It is 10.2, so I need to use 10.2

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
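
The version check above can also be scripted. The following is a minimal sketch, assuming the `nvcc -V` output has the same shape as the one shown in this post; the helper names `cuda_release` and `conda_install_command` are made up for illustration.

```python
import re


def cuda_release(nvcc_output: str) -> str:
    """Return the CUDA release string (for example "10.2") from `nvcc -V` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    return match.group(1)


def conda_install_command(release: str) -> str:
    """Build the same conda command used in this post, with the version filled in."""
    return f"conda install pytorch torchvision cudatoolkit={release} -c pytorch"


# Sample text copied from the nvcc output in this post.
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89"""

print(conda_install_command(cuda_release(sample)))
```

In real use the text would come from running `nvcc -V` (for example through `subprocess.run`) instead of the hard-coded sample.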

After installing, test it:

python
>>> import torch
>>> torch.cuda.is_available()
True

OK. test infer:

python eval.py --trained_model=weights/yolact_resnet50_54_800000.pth --image=../test.jpg

The save/load in the source only handles the weights.
So to save the model together with the weights you have to modify it yourself.
yolact.py : torch.save( )

2022/3/8

GT 740M, ubuntu 20.04, nvidia driver and cuda

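If you install the driver downloaded from nvidia, the installer tells you to use the one provided by the distribution instead,
because this card is too old.
If you install it anyway, it complains that the gcc version used for the kernel differs from the one used for the driver.
After adding ignore_cc_mismatch, the build fails with a pile of function reference errors.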

This shows that this driver source has not been kept up to date with the kernel...

In the end I had to install with the local script. It still failed every time.
fail log: /var/log/cuda-install.log, which only says driver install failed

People online say /var/log/nvidia-installer.log has the log for a failed driver install,
and sure enough...
It says that alternate-install-available exists under /usr/lib/nvidia, so it aborts

After renaming /usr/lib/nvidia it still fails; /var/log/nvidia-installer.log says it is because the Nouveau driver is in use.
So it wrote nvidia-install-disable so that nouveau will not load.

After a reboot I ran it again and it still failed; nvidia-installer.log says..

NVRM: API mismatch: the client has the version 510.47.03, but this kernel module has version 470.57.02.
Please make sure that this kernel module and all NVIDIA driver components have the same version

So apt purge libnvidia-**

Some 510 utilities were removed, but the result was the same..
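
To spot this kind of leftover quickly, the two versions can be pulled out of the API mismatch message. This is only a sketch that assumes the message wording matches the line quoted above; `parse_api_mismatch` and `ApiMismatch` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class ApiMismatch(NamedTuple):
    client: str          # version of the user-space tools/libraries
    kernel_module: str   # version of the loaded nvidia kernel module


# Pattern written against the message quoted in this post; "the" before
# the second "version" is optional because wording may vary.
_PATTERN = re.compile(
    r"client has the version (\d+(?:\.\d+)*), but\s+"
    r"this kernel module has (?:the )?version (\d+(?:\.\d+)*)"
)


def parse_api_mismatch(log_text: str) -> Optional[ApiMismatch]:
    """Return both versions from an NVRM API mismatch line, or None if absent."""
    match = _PATTERN.search(log_text)
    if match is None:
        return None
    return ApiMismatch(client=match.group(1), kernel_module=match.group(2))


line = (
    "NVRM: API mismatch: the client has the version 510.47.03, "
    "but this kernel module has version 470.57.02."
)
result = parse_api_mismatch(line)
print(result)
```

Here the client side is 510 while the kernel module is 470, which matches the situation in this post: user-space pieces from the 510 install are still on the system.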

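Conclusion:

nvidia's script/deb cannot choose the version to install based on the graphics card; they always install the latest version.
But on nvidia's website you can see that the latest driver is different for each card.
For the GT 740M it only goes up to 470, while the latest driver version is now 510.

The cuda version is in turn tied to the driver version.
The latest cuda 11.6 needs driver version 510.

So the GT 740M cannot use cuda 11.6, because its driver version does not support it.
The newest I could pick (by test install) is cuda 11.4.1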
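
The reasoning above amounts to a small lookup. The sketch below uses only the two data points from this post (cuda 11.6 with driver 510, cuda 11.4 with driver 470); the table is an assumption for illustration and is not a complete compatibility list, so check nvidia's release notes for exact minimums.

```python
from typing import Dict, Optional, Tuple

# (cuda major, cuda minor) -> minimum driver major version.
# Only the two pairs mentioned in this post; not an official table.
MIN_DRIVER_MAJOR: Dict[Tuple[int, int], int] = {
    (11, 6): 510,
    (11, 4): 470,
}


def newest_supported_cuda(driver_major: int) -> Optional[Tuple[int, int]]:
    """Return the newest cuda release in the table that the driver can run."""
    usable = [
        cuda for cuda, needed in MIN_DRIVER_MAJOR.items()
        if driver_major >= needed
    ]
    return max(usable) if usable else None


# The GT 740M tops out at driver 470.
print(newest_supported_cuda(470))
```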

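--- Then after an apt upgrade it failed again; it looks like the tools picked up the 510 version.

That is because the 510 tools and libraries (from the cuda_11.6 install) stayed on the system and were never removed completely.
I had to reinstall the system, and after that: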

Open the "Additional Drivers" tab in Software & Updates and choose the nvidia proprietary driver version 470
Run the cuda 11.4 local shell script, but do not install the driver

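Done.
This way the nvidia driver is managed by dkms, and the initramfs is rebuilt automatically when the kernel is upgraded, so the problem above, where the nvidia driver does not load after a kernel update, does not happen.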

20.04 fails to boot very easily..

/dev/sda1: clean, 552599/6111232 files, 7119295/24414464 blocks

Follow the instructions on this page.
Hold shift at boot to bring up the grub menu, then enter recovery mode.
Then

apt-get purge nvidia*

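If you accidentally removed X, you get the same error; in that case you need to apt reinstall ubuntu-desktop
After booting into the desktop GUI this way, network-manager needs to be changed to add ethernet back. ref:networkmanager not show
The language input method also has to be installed again, then reboot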

After booting into the desktop GUI this way, bluetooth cannot connect to the AirPods Pro; this post says to enable br/edr mode.
Edit /etc/bluetooth/main.conf accordingly:

ControllerMode = bredr
or 
ControllerMode = dual

After that, it connects.
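
To confirm which mode is actually set after editing, the file can be read as an INI file. This is a rough sketch that assumes ControllerMode sits in the [General] section and that an unset value means dual; `controller_mode` is a hypothetical helper.

```python
import configparser


def controller_mode(conf_text: str) -> str:
    """Return the ControllerMode value from main.conf-style text.

    Assumption: the option lives in [General]; "dual" is returned when
    the option (or the section) is missing.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    return parser.get("General", "ControllerMode", fallback="dual")


# Sample text in the style of /etc/bluetooth/main.conf.
sample = """[General]
# Lines starting with '#' are comments and are ignored.
ControllerMode = bredr
"""

print(controller_mode(sample))
```

For the real file, pass the contents of /etc/bluetooth/main.conf read with `open(...).read()`.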

2022/3/7

At first the browser did not respond, which turned out to be caused by nginx; after stopping nginx, the browser showed 'ERR_UNSAFE_PORT'.
This page says 10080 is unsafe in chrome.
So I had to change the port to 20080
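
A quick check before picking a port can avoid this. The sketch below holds only a partial list of ports that chrome blocks (a few well-known ones plus 10080); it is an assumption for illustration, and the full list lives in the Chromium source.

```python
# Partial list only: ftp, ssh, telnet, smtp, pop3, imap, X11, irc, and
# 10080 from this post. Not the complete set that chrome blocks.
KNOWN_UNSAFE_PORTS = frozenset(
    {21, 22, 23, 25, 110, 143, 6000, 6665, 6666, 6667, 6668, 6669, 10080}
)


def is_known_unsafe(port: int) -> bool:
    """Return True if the port is in the partial blocked list above."""
    if not 0 < port < 65536:
        raise ValueError(f"not a valid TCP port: {port}")
    return port in KNOWN_UNSAFE_PORTS


for candidate in (10080, 20080):
    status = "unsafe" if is_known_unsafe(candidate) else "not in the list"
    print(candidate, status)
```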

2022/3/4

gitlab runner

ref:

from gitlab :

Download and Install Binary:

# Download the binary for your system
sudo curl -L --output /usr/local/bin/gitlab-runner https://gitlab-runner-downloads.s3.amazonaws.com/latest/binaries/gitlab-runner-linux-amd64

# Give it permission to execute
sudo chmod +x /usr/local/bin/gitlab-runner

# Create a GitLab Runner user
sudo useradd --comment 'GitLab Runner' --create-home gitlab-runner --shell /bin/bash

# Install and run as a service
sudo gitlab-runner install --user=gitlab-runner --working-directory=/home/gitlab-runner
sudo gitlab-runner start

Command to register a runner:

sudo gitlab-runner register --url http://1b10f96444f4/ --registration-token $REGISTRATION_TOKEN

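gitlab-runner is a standalone service started by systemd. It periodically checks the gitlab server and projects to see whether there is anything matching that needs to run.
When it finds something to run, it starts the executor.
The executor is the execution method you specify when you register gitlab-runner to a project after installing it; it can be shell, ssh, docker..
The most common one is shell: the shell executor runs the stage script you specify on the system where gitlab-runner lives, as the user gitlab-runner, under its HOME: /home/gitlab-runner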

So the system where gitlab-runner lives must already have the libraries and tools that the stage script needs installed,
otherwise the job fails
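
One way to catch this before a job fails is to check for the tools up front on the runner machine. A minimal sketch, assuming the stage script needs the commands listed in `needed` (a made-up list; replace it with whatever your script calls):

```python
import shutil
from typing import Iterable, List


def missing_tools(required: Iterable[str]) -> List[str]:
    """Return the commands from `required` that are not found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


# Hypothetical tool list for a build stage.
needed = ["git", "make", "gcc"]
missing = missing_tools(needed)
if missing:
    print("install these before running the job:", ", ".join(missing))
else:
    print("all required tools found")
```

Run it as the gitlab-runner user, since that is the account (and PATH) the shell executor uses.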

So gitlab ci/cd means installing a service, gitlab-runner, that does the build, test, and deploy work for you.
This gitlab-runner service can provide CI/CD for many gitlab servers / projects.
It only needs to be registered with them.

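So gitlab-runner register sets which gitlab it is going to serve.
The first parameter is the path of the gitlab server to serve.
The second is the registration token provided by that gitlab server, similar to a key; the runner uses this key to request access and notifications from that gitlab server.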

Once gitlab-runner register is done, that gitlab server is notified and shows this gitlab-runner under the corresponding project or server settings (shared runner).
After that, you can use the web interface of this gitlab server to configure the gitlab-runner service.

2022/3/1

A note on a blocked write, kernel error message..

[32625.118232] INFO: task systemd-udevd:530 blocked for more than 120 seconds.
[32625.118240]       Tainted: P           OE     5.0.0-23-generic #24~18.04.1-Ubuntu
[32625.118242] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
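
When there are many of these in dmesg, it helps to pull out the task name and pid. A small sketch written against the message format shown above; `parse_hung_task` and `HungTask` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class HungTask(NamedTuple):
    uptime_seconds: float   # kernel timestamp in the square brackets
    task: str               # task (process) name
    pid: int
    blocked_seconds: int    # threshold reported in the message


_HUNG = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+INFO: task (.+):(\d+) "
    r"blocked for more than (\d+) seconds"
)


def parse_hung_task(line: str) -> Optional[HungTask]:
    """Parse one hung-task line from the kernel log, or return None."""
    match = _HUNG.search(line)
    if match is None:
        return None
    uptime, task, pid, blocked = match.groups()
    return HungTask(float(uptime), task, int(pid), int(blocked))


line = "[32625.118232] INFO: task systemd-udevd:530 blocked for more than 120 seconds."
hung = parse_hung_task(line)
print(hung)
```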

A quick note.. pm2 startup log

ref:

how to setup a node js application for production

Start pm2 automatically at boot..

charles-chang@raspberrypi:~ $ sudo pm2 startup

                        -------------

__/\\\\\\\\\\\\\____/\\\\____________/\\\\____/\\\\\\\\\_____
 _\/\\\/////////\\\_\/\\\\\\________/\\\\\\__/\\\///////\\\___
  _\/\\\_______\/\\\_\/\\\//\\\____/\\\//\\\_\///______\//\\\__
   _\/\\\\\\\\\\\\\/__\/\\\\///\\\/\\\/_\/\\\___________/\\\/___
    _\/\\\/////////____\/\\\__\///\\\/___\/\\\________/\\\//_____
     _\/\\\_____________\/\\\____\///_____\/\\\_____/\\\//________
      _\/\\\_____________\/\\\_____________\/\\\___/\\\/___________
       _\/\\\_____________\/\\\_____________\/\\\__/\\\\\\\\\\\\\\\_
        _\///______________\///______________\///__\///////////////__


                          Runtime Edition

        PM2 is a Production Process Manager for Node.js applications
                     with a built-in Load Balancer.

                Start and Daemonize any application:
                $ pm2 start app.js

                Load Balance 4 instances of api.js:
                $ pm2 start api.js -i 4

                Monitor in production:
                $ pm2 monitor

                Make pm2 auto-boot at server restart:
                $ pm2 startup

                To go further checkout:
                http://pm2.io/


                        -------------

[PM2] Init System found: systemd
Platform systemd
Template
[Unit]
Description=PM2 process manager
Documentation=https://pm2.keymetrics.io/
After=network.target

[Service]
Type=forking
User=root
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Environment=PM2_HOME=/root/.pm2
PIDFile=/root/.pm2/pm2.pid
Restart=on-failure

ExecStart=/usr/lib/node_modules/pm2/bin/pm2 resurrect
ExecReload=/usr/lib/node_modules/pm2/bin/pm2 reload all
ExecStop=/usr/lib/node_modules/pm2/bin/pm2 kill

[Install]
WantedBy=multi-user.target

Target path
/etc/systemd/system/pm2-root.service
Command list
[ 'systemctl enable pm2-root' ]
[PM2] Writing init configuration in /etc/systemd/system/pm2-root.service
[PM2] Making script booting at startup...
[PM2] [-] Executing: systemctl enable pm2-root...
Created symlink /etc/systemd/system/multi-user.target.wants/pm2-root.service → /etc/systemd/system/pm2-root.service.
[PM2] [v] Command successfully executed.
+---------------------------------------+
[PM2] Freeze a process list on reboot via:
$ pm2 save

[PM2] Remove init script via:
$ pm2 unstartup systemd

The main thing is to follow this step by step..
