High Altitude Oolong: Problem and Solution: some sections are missing when a VLM OCRs a PDF file

2026/1/14

Problem and Solution: some sections are missing when a VLM OCRs a PDF file

Problem

When converting "5.1.3. Tutorial for Large Language Models.pdf" to Markdown, the shell script sections were missing from the output. These sections used light gray text on a light gray background (low contrast).

Root Cause

The VLM prompt didn't explicitly instruct the model to look for low-contrast code blocks. The VLM was able to recognize the text, but without that instruction it did so inconsistently.

Changes Made

1. Enhanced VLM Prompt (ollama_client.py)

Added explicit instructions to detect low-contrast code:

"Pay special attention to CODE BLOCKS and SHELL COMMANDS that may appear in LIGHT GRAY BOXES"
"These low-contrast code sections are VERY IMPORTANT and MUST NOT be skipped"
Specific examples: $ bash script.sh, $ ./compile.sh, $ adb push
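
As a rough sketch of how these instructions might be appended to the prompt: the post does not show the original prompt text, so `BASE_PROMPT` and `build_ocr_prompt` below are hypothetical names, and only the two quoted instructions and the example commands come from the change described above.

```python
# Hypothetical base instruction; the real prompt in ollama_client.py is not shown in the post.
BASE_PROMPT = "Convert this PDF page image to Markdown. Preserve headings, lists, and tables."

# The two instructions quoted in the post.
LOW_CONTRAST_INSTRUCTIONS = (
    "Pay special attention to CODE BLOCKS and SHELL COMMANDS that may appear in LIGHT GRAY BOXES.",
    "These low-contrast code sections are VERY IMPORTANT and MUST NOT be skipped.",
)

# The example commands listed in the post.
EXAMPLE_COMMANDS = ("$ bash script.sh", "$ ./compile.sh", "$ adb push")


def build_ocr_prompt(base: str = BASE_PROMPT) -> str:
    """Return the base prompt followed by the low-contrast code instructions."""
    lines = [base, ""]
    lines.extend(LOW_CONTRAST_INSTRUCTIONS)
    lines.append("Examples of commands to look for: " + ", ".join(EXAMPLE_COMMANDS))
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_ocr_prompt())
```

Keeping the instructions in a separate tuple makes it easy to add further hints later without touching the base prompt.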

2. Added VLM Output Cleanup (ollama_client.py)

The new _clean_vlm_output() method removes VLM thinking noise:

Patterns like "Wait, no...", "Let me think...", "So final Markdown:"
Markdown code block wrappers
Multiple consecutive blank lines
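
The three cleanup steps above can be sketched as a standalone function. This is a minimal illustration, not the actual `_clean_vlm_output()` implementation, which the post does not show; the function name `clean_vlm_output` and the exact matching rules (line-prefix matching, unwrapping only a fence tagged `markdown` or `md`) are assumptions.

```python
import re

# Lines that start with one of the phrases from the post are treated as noise.
# Matching on line prefixes is an assumption made for this sketch.
NOISE_LINE = re.compile(r"^\s*(Wait, no|Let me think|So final Markdown:)", re.IGNORECASE)

# A single ```markdown ... ``` (or ```md) fence wrapping the whole response.
# Untagged fences are left alone so real code blocks are not unwrapped by mistake.
OUTER_FENCE = re.compile(r"\A\s*```(?:markdown|md)[ \t]*\n(.*?)\n```\s*\Z", re.DOTALL)


def clean_vlm_output(text: str) -> str:
    """Strip thinking noise, an outer Markdown fence, and extra blank lines."""
    # 1. Drop noise lines first, so a leading "So final Markdown:" does not hide the wrapper.
    kept = [line for line in text.splitlines() if not NOISE_LINE.match(line)]
    text = "\n".join(kept)

    # 2. Unwrap the response if it is entirely enclosed in one Markdown fence.
    match = OUTER_FENCE.match(text)
    if match:
        text = match.group(1)

    # 3. Collapse three or more newlines (two or more blank lines) into one blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


if __name__ == "__main__":
    raw = "Let me think...\nSo final Markdown:\n```markdown\n# Title\n\n\n\n$ adb push\n```"
    print(clean_vlm_output(raw))
```

Prefix matching can also drop a legitimate document line that happens to begin with one of these phrases, so the pattern list is worth keeping narrow.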
