High Altitude Oolong: 4月 2026

2026/4/25

在 TITAN RTX 上使用 vLLM 部署 cyankiwi Qwen3.5-27B AWQ 模型的測試筆記

目標：將 Hermes Agent 連接到本地 vLLM 推理引擎

起因

有一張 NVIDIA TITAN RTX（24GB VRAM），希望能在 Hermes Agent（需 64K context window）上使用。聽cyankiwi 有提供 Qwen3.5 系列的 AWQ 4bit 量化版本號稱可以在較小的 VRAM 上運行，於是開始測試。

硬體與環境

項目	規格
GPU	NVIDIA TITAN RTX（Turing 架構，SM 7.5）
VRAM	24GB
CUDA	12.6
vLLM 版本	0.19.1
Python	3.12
主要測試機器	tsi7-14700（192.168.145.98）
Hermes Agent	tspi4ssd（192.168.145.x）

測試過程

1. 安裝 vLLM

pip install --break-system-packages transformers>=5.5.0
pip install --break-system-packages vllm

vLLM 安裝成功，磁碟空間從 94% 降到 77%（釋放約 17%）。

2. 下載模型

hf download cyankiwi/Qwen3.5-27B-AWQ-4bit

模型大小約 20.1GB，下載完成。

3. 嘗試啟動 vLLM

一開始用預設參數（262K context、90% GPU memory）啟動，失敗了。問題是 KV cache 初始化時 VRAM 不夠。

4. 參數調整過程

嘗試	參數	結果
1	--gpu-memory-utilization 0.90 --max-model-len 262144	OOM：只有 1.41GB 可用
2	--gpu-memory-utilization 0.75	OOM
3	--gpu-memory-utilization 0.70 --max-model-len 131072 --enforce-eager	OOM：KV cache -4.2GB
4	--gpu-memory-utilization 0.85 --max-model-len 65536 --enforce-eager	OOM：KV cache -0.51GB
5	--gpu-memory-utilization 0.90 --max-model-len 32768 --enforce-eager	OOM：估算最大長度 7840
6	--gpu-memory-utilization 0.90 --max-model-len 7840 --enforce-eager	✅ 啟動成功

終於啟動成功！但 max_model_len 只有 7840 tokens。

5. Hermes 連接設定

修改 tspi4ssd 上的 Hermes config.yaml：

model:
  default: cyankiwi/Qwen3.5-27B-AWQ-4bit
  provider: custom
  base_url: http://192.168.145.98:8000/v1
  context_length: 7840

重啟 hermes-gateway，API 測試成功。

6. 最終失敗

Hermes Agent 最低需要 64,000 tokens 的 context window，但 TITAN RTX 只能支援到 7,840 tokens。即使把模型降到 32K 的設定，KV cache 估算需要 2.15GB，但當時可用只有 0.67GB。

核心問題

TITAN RTX 的 24GB VRAM 在承載 Qwen3.5-27B（AWQ 4bit 約 18.65GB）之後，可用於 KV cache 的空間只剩下約 4-5GB。這對於 64K context 所需的 KV cache 來說遠遠不夠。

2026/4/16

Attempting to Bring Up Realtek RTL8922BE on Ubuntu 24.04 (Failed)

Hardware: Realtek Semiconductor Co., Ltd. RTL8922BE [10ec:8922] (rev 01)

Operating System: Ubuntu 24.04 LTS (Kernel 6.8.0-107-generic)

Step 1: Hardware Identification and Initial Setup

The system was found to have a Realtek RTL8922BE Wi-Fi 7 card, but no wireless interface was present. Initial checks showed the rtw89_8922be driver was missing from the default Ubuntu kernel.

Installed necessary build tools and headers:

sudo apt install git build-essential linux-headers-$(uname -r) network-manager iw

Step 2: Building the Driver from Source

We used the official backport repository for Realtek rtw89 drivers maintained by lwfinger:

git clone https://github.com/lwfinger/rtw89.git
cd rtw89
make
sudo make install

After installation, we encountered an Exec format error during modprobe due to conflicts with existing partial kernel modules. We successfully resolved this by unloading all related modules (rtw89_core, rtw89_pci, etc.) before loading the new rtw_8922ae driver.

Step 3: Firmware Issues and Troubleshooting

The driver successfully loaded but failed to initialize the hardware with the error: no suitable firmware found and failed to recognize firmware.

Discovery: The driver was looking for rtw8922a_fw.bin. We downloaded the latest versions (v1, v2, v3, v4) from the Linux Firmware repository.
Hardware Mismatch: We added debug logging to the driver and discovered the hardware is Cut C (CV=2). However, all publicly available firmware files (even the latest v4) only contain support for Cut A (CV=0) and Cut B (CV=1).

Step 4: Attempted Fallback and Final Failure

As a last-ditch effort, we patched the driver's firmware loading logic (fw.c) to force the hardware to accept the CV=1 firmware as a fallback for the CV=2 chip.

Result:

The driver successfully initialized and created the wlp7s0 interface.
The interface was visible in nmcli.
However, the hardware failed internal calibration with errors: failed to do RF RX_DCK result from state 4 and HW scan failed with status: -14.

Conclusion

The Realtek RTL8922BE Cut C (CV=2) revision found in this server is too new for the current publicly available drivers and firmware. While the driver can be "hacked" to load, the firmware for earlier revisions is physically incompatible with the RF calibration requirements of the newer chip revision.

Recommendation: Users with this hardware revision must wait for Realtek to release an updated firmware file (likely rtw8922a_fw-5.bin) that explicitly supports the CV=2 hardware revision.

2026/4/12

Kronos 技術深度分析：金融K線首個基礎模型

1. 為何需要專門的金融K線 Foundation Model？

時間序列基礎模型（TSFM）在電力、醫療等領域已有成功案例，但金融 K線資料面臨獨特挑戰：

K線具有多維結構（OHLCVA），難以直接套用一般 Transformer
市場具有高噪聲、非穩態特性，過去規律不一定適用未來
不同交易所的交易規則、結算制度、漲跌限制各異，跨市場泛化困難

Kronos（arXiv:2508.02739，AAAI 2026）針對這些問題提出系統性解決方案，是第一個開源的金融 K線基礎模型。

2. 技術架構：兩階段框架

Stage 1：K-line Tokenizer（BSQ）

使用 Binary Spherical Quantization 將連續多維的 OHLCVA 向量量化為離散 tokens。這個 tokenizer 並非一般 NLP 的 BPE，而是針對金融市場價格動態專門設計，能保留價格區間、相對變化等關鍵資訊。

Stage 2：Autoregressive Transformer

Decoder-only 架構，在量化後的 token 序列上做自回歸預測。預訓練目標為「預測下一個 K-line token」，與 GPT 系列語言模型的訓練邏輯一致。

3. 模型版本與選擇

模型	Tokenizer	Context Length	參數量	開源
Kronos-mini	Kronos-Tokenizer-2k	2048	4.1M	✅
Kronos-small	Kronos-Tokenizer-base	512	24.7M	✅
Kronos-base	Kronos-Tokenizer-base	512	102.3M	✅
Kronos-large	Kronos-Tokenizer-base	512	499.2M	❌

預訓練規模：120 億筆 K線記錄、45 個交易所、7 種時間粒度。
HuggingFace：NeoQuasar/Kronos-small

4. 台灣股市應用評估

優勢

預訓練涵蓋45個交易所、120億筆 K線，台灣屬亞洲新興市場，已有跨市場泛化能力
Zero-shot RankIC 領先其他 TSFM 93%
Kronos-mini（4.1M）可在一般 GPU 執行，適合個人或小型團隊研究
不需自行設計特徵工程或時間序列標記

限制

Context Length 限制（512 ~ 2048），日K約1.4年至8年
不看財報、總經、籌碼等非價格資訊
台灣特有的政策干預、權值股主導特性未被訓練進去
±10% 漲跌限制、T+2 結算等台灣制度細節，建議透過 Fine-tuning 針對優化

5. 建議流程

回測優先：以歷史日K做滾動窗口回測，觀察模型在多頭、空頭、震盪不同市場型態的表現
參考輔助：Kronos 預測結果作為技術面輔助參考，不作唯一進出场依據
長期評估 Fine-tuning：如欲針對台股特性微調，需較大 GPU 資源與乾淨歷史資料

參考連結

Kronos GitHub	github.com/shiyu-coder/Kronos
Kronos 論文	arxiv.org/abs/2508.02739
HuggingFace 模型	NeoQuasar/Kronos-small
twstock 套件	github.com/mlouielu/twstock

研究時間：2026-04-12

Kronos 安裝腳本與台灣股市預測實作

1. 安裝腳本

#!/bin/bash
# Kronos 安裝腳本（Python 需 3.10+）
set -e

KRONOS_DIR="/home/charles-chang/.openclaw/workspace/research/2026-04-12_kronos_taiwan_stock/kronos_repo"
ENV_DIR="/home/charles-chang/kronos_env"

# 建立虛擬環境
if [ ! -d "$ENV_DIR" ]; then
    python3 -m venv $ENV_DIR
fi
source $ENV_DIR/bin/activate

# 安裝依賴
pip install --upgrade pip -q
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121 -q
pip install yfinance pandas transformers einops -q

# 克隆 Kronos（如尚未存在）
if [ ! -d "$KRONOS_DIR" ]; then
    git clone https://github.com/shiyu-coder/Kronos.git $KRONOS_DIR
fi

# 驗證模型
python3 <<'PYEOF'
import sys
sys.path.insert(0, '/home/charles-chang/.openclaw/workspace/research/2026-04-12_kronos_taiwan_stock/kronos_repo')
from model import Kronos, KronosTokenizer, KronosPredictor
tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
model = Kronos.from_pretrained("NeoQuasar/Kronos-small")
predictor = KronosPredictor(model, tokenizer, max_context=512)
print("模型載入成功 ✓")
PYEOF

2. 台灣股市預測腳本

#!/usr/bin/env python3
"""
Kronos 台灣股市預測腳本
使用：python3 predict_taiwan_stock.py <股票代號> [--lookback N] [--pred_len N]
範例：python3 predict_taiwan_stock.py 2330 --pred_len 30
"""

import os, sys, argparse
from datetime import datetime, timedelta
import pandas as pd
import yfinance as yf

kronos_path = os.path.join(os.path.dirname(__file__), 'kronos_repo')
if os.path.exists(kronos_path):
    sys.path.insert(0, kronos_path)
from model import Kronos, KronosTokenizer, KronosPredictor


def fetch_taiwan_stock_data(stock_code: str, days: int = 800) -> pd.DataFrame:
    """從 Yahoo Finance 取得台股歷史 K線"""
    ticker = f"{stock_code}.TW"
    df = yf.download(ticker, period=f"{days}d", auto_adjust=False, progress=False)
    df = df.reset_index()
    if isinstance(df.columns, pd.MultiIndex):
        df.columns = [col[0] if isinstance(col, tuple) else col for col in df.columns]
    rename_map = {}
    for col in df.columns:
        cl = str(col).lower()
        if 'date' in cl: rename_map[col] = 'timestamps'
        elif 'open' in cl and 'open' not in rename_map.values(): rename_map[col] = 'open'
        elif 'high' in cl and 'high' not in rename_map.values(): rename_map[col] = 'high'
        elif 'low' in cl and 'low' not in rename_map.values(): rename_map[col] = 'low'
        elif 'close' in cl: rename_map[col] = 'close'
        elif 'volume' in cl and 'volume' not in rename_map.values(): rename_map[col] = 'volume'
    df = df.rename(columns=rename_map)
    df = df.loc[:, ~df.columns.duplicated()]
    df['amount'] = 0.0
    for col in ['open', 'high', 'low', 'close', 'volume', 'amount']:
        df[col] = pd.to_numeric(df[col], errors='coerce')
    df = df.dropna()
    return df[['timestamps', 'open', 'high', 'low', 'close', 'volume', 'amount']]


def predict(df: pd.DataFrame, lookback: int = 400,
             pred_len: int = 30, sample_count: int = 1) -> pd.DataFrame:
    """Kronos-small K線預測"""
    tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
    model = Kronos.from_pretrained("NeoQuasar/Kronos-small")
    predictor = KronosPredictor(model, tokenizer, max_context=512)
    lookback = min(lookback, 512)
    df_in = df.tail(lookback).reset_index(drop=True)
    x_ts = df_in['timestamps'].reset_index(drop=True)
    y_ts = pd.Series(pd.date_range(
        start=df_in['timestamps'].iloc[-1] + timedelta(days=1),
        periods=pred_len, freq='B'))
    return predictor.predict(
        df=df_in[['open', 'high', 'low', 'close', 'volume', 'amount']],
        x_timestamp=x_ts, y_timestamp=y_ts,
        pred_len=pred_len, T=1.0, top_p=0.9, sample_count=sample_count)


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument('stock_code', help='股票代號（例：2330）')
    ap.add_argument('--lookback', type=int, default=400)
    ap.add_argument('--pred_len', type=int, default=30)
    ap.add_argument('--sample_count', type=int, default=1)
    args = ap.parse_args()
    df = fetch_taiwan_stock_data(args.stock_code, days=int(args.lookback * 2))
    pred = predict(df, args.lookback, args.pred_len, args.sample_count)
    print(f"\n{'日期':<12} {'Open':>10} {'High':>10} {'Low':>10} {'Close':>10}")
    print("-" * 60)
    for idx, row in pred.iterrows():
        ds = idx.strftime('%Y-%m-%d') if hasattr(idx,'strftime') else str(idx)
        f = lambda k: f"{row.get(k,0):>10.2f}" if isinstance(row.get(k,0),(int,float)) else f"{str(row.get(k,'')):>10}"
        print(f"{ds:<12} {f('open')} {f('high')} {f('low')} {f('close')}")
    out = f"prediction_{args.stock_code}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv"
    pred.to_csv(out, index_label='timestamps')
    print(f"\n已儲存：{out}\n此為模型輸出結果，僅供參考，不構成投資建議。")


if __name__ == "__main__":
    main()

3. 台灣股市資料來源

Yahoo Finance（yfinance）	日K為主，即時性普通，3行代碼即完成
twstock 套件	台灣原生，直接串接證交所與櫃買中心
Fugle API	日內分鐘線最可靠
證交所開放資料 API	官方 REST，即時性高

# Yahoo Finance 最簡範例
import yfinance as yf
df = yf.download("2330.TW", start="2020-01-01")  # 台積電

4. 使用方式

# 安裝環境
bash setup_kronos.sh

# 預測台積電未來30天
/home/charles-chang/kronos_env/bin/python3 predict_taiwan_stock.py 2330

# 指定歷史窗口與預測長度
/home/charles-chang/kronos_env/bin/python3 predict_taiwan_stock.py 2330 --lookback 200 --pred_len 10

5. 台股特殊考量

漲跌停 ±10%：Fine-tuning 時不應過濾這些bars，是正常市場現象
成交金額（amount）：Yahoo Finance 無此欄位，以0填補即可
Context Length：Kronos-small 512 tokens（約1.4年日K）；日內分鐘線建議用 Kronos-mini（2048）
日內資料：Yahoo Finance 分鐘線在台股不可靠，建議改用 Fugle API

實作完成時間：2026-04-12

如何突破 Cloudflare 取得 Perplexity 內容

問題背景

在自動化研究場景中，常常需要透過 AI 搜尋引擎（如 Perplexity）抓取整理過的答案，並進一步餵給其他系統處理。然而 Perplexity 的分享連結受到 Cloudflare JS Challenge 保護，大多數程式化的存取方式都會被阻擋。

本文記錄完整的原因分析、所有嘗試過的失敗方法，以及最終如何透過 CloakBrowser 成功突破。

為什麼無法直接取得 Perplexity 內容？

根本原因：瀏覽器指紋偵測

Perplexity 使用的保護機制不是傳統的「封 IP」或「輸入驗證碼」，而是瀏覽器指紋偵測（Browser Fingerprinting）。

當請求抵達 Cloudflare 時，會檢測以下項目：

navigator.webdriver 是否為 true（自動化框架標記）
navigator.plugins.length 是否符合真實瀏覽器
window.chrome 物件是否存在
WebGL / Canvas / Audio 指紋是否正常
WebRTC 是否洩漏真實 IP
TLS 指紋（JA3/JA4）是否為常見瀏覽器

只要有任何一項不符，Cloudflare 就判定為機器人，發回 JS Challenge 頁面。

常見工具為何全部失效？

嘗試方法	失敗原因
curl + 多種 User-Agent	無法執行 JavaScript，只能拿到 Challenge 頁 HTML
web_fetch / cloudscraper	403，cloudscraper 對新版 Cloudflare 失效
Playwright（標準版）	navigator.webdriver=true，立即被偵測
Playwright + stealth 參數	仍有殘留信號，觸發指紋檢查
undetected-chromedriver	連線錯誤，新版 Chrome 不相容
curl-impersonate	TLS 指紋可能僥倖通過，但 JS Challenge 仍需瀏覽器執行
Xvfb + patchright	底層同樣缺乏完整指紋修補

這些方法的共同盲點是：都在試圖「欺騙」Cloudflare 的 JS 檢測層，而非真正讓檢測看到的都是正常值。

解決方案：CloakBrowser

工具介紹

CloakBrowser（GitHub: CloakHQ/CloakBrowser）是目前唯一在 C++ 層級修改 Chromium 原始碼的指紋修補方案。不是設定調整，不是 JS 注入，而是49 個指紋 patch 直接編進二進位，包括：

navigator.webdriver → false
navigator.plugins.length → 真實 plugin 清單
window.chrome → 正常瀏覽器物件
WebGL / Canvas / Audio 指紋
Font / Screen / Hardware 指紋
WebRTC IP leak 防護
TLS 指紋（JA3/JA4/Akamai）
Automation signals（CDP detection）

安裝方式

pip install cloakbrowser

Python 使用範例

from cloakbrowser import launch
import time

browser = launch(headless=False)
page = browser.new_page()

page.goto(
    "https://www.perplexity.ai/search/你的分享連結",
    wait_until="domcontentloaded"
)

# 等待 Cloudflare 挑戰自動通過
page.wait_for_function(
    "() => !document.title.includes('Just a moment')",
    timeout=40000
)
time.sleep(3)  # 額外等待確保渲染完成

html = page.content()
browser.close()

print(f"成功取得內容，HTML 長度: {len(html)} bytes")

關鍵實務細節

使用 headful 模式（headless=False）：相較 headless，headful 的指紋更難被偵測，搭配 Xvfb 虛擬顯示器即可在無頭伺服器上運行
等待 Challenge 完成：wait_for_function 監控標題是否離開「請稍候」頁，而非盲目 sleep
額外等待 3 秒：確保 Perplexity 的 SPA 完全渲染動態內容

失敗方法完整列表

方法	結果	瓶頸
curl + UA 輪換	❌	無法執行 JS
cloudscraper	❌	Cloudflare 版本過新
Playwright 標準版	❌	webdriver flag
Playwright + stealth	❌	仍有殘留信號
undetected-chromedriver	❌	Chrome 版本不相容
curl-impersonate	❌	JS Challenge 仍需瀏覽器
Xvfb + patchright	❌	指紋修補不完整
CloakBrowser	✅ 成功	C++ 層級完整修補

結論

突破 Cloudflare JS Challenge 的核心瓶頸不在於「如何繞過檢測」，而在於讓檢測看到的全部都是正常瀏覽器值。CloakBrowser 的價值在於它從 Chromium 原始碼層面解決了這個問題，而不是停留在 JS 層或參數層的半成品繞過。

對於需要自動化存取受保護網頁的系統，CloakBrowser 是目前已知最穩定、成功率最高的方案。

（本文同步發布於研究記錄 2026-04-12）

訂閱：文章 (Atom)