High Altitude Oolong: a casual memo ...

Trying out the yolo3 I never finished training, this time with the coco dataset.
-- convert coco to pascal_voc format:

pascal_voc/
├── VOC2012/
│   ├── Annotations/         # XML annotation files
│   ├── ImageSets/
│   │   └── Main/
│   │       └── train.txt    # Training image names
│   └── JPEGImages/         # Image files (.jpg)
│
└── VOC2007/
    ├── Annotations/         # XML annotation files
    ├── ImageSets/
    │   └── Main/
    │       └── test.txt     # Testing image names
    └── JPEGImages/         # Image files (.jpg)

train.txt is a list of filenames.
So for each name, look up filename.xml in Annotations
and filename.jpg in JPEGImages.
The list has to be split into train and validate.
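
A rough sketch of how the train/validate split could be written out as list files. The function name `write_split_lists`, the `val.txt` file name, and the 10% validation fraction are my own assumptions, not something any of the repos prescribes. It only assumes that the lists hold image names without the extension.

```python
import os
import random

def write_split_lists(image_dir, out_dir, val_fraction=0.1, seed=0):
    """Shuffle the .jpg names in image_dir and write train.txt / val.txt into out_dir."""
    # Keep only the stem (name without extension), one per line in the list files.
    stems = sorted(
        os.path.splitext(f)[0]
        for f in os.listdir(image_dir)
        if f.lower().endswith(".jpg")
    )
    random.Random(seed).shuffle(stems)  # fixed seed so the split is reproducible
    n_val = int(len(stems) * val_fraction)
    splits = {"val.txt": stems[:n_val], "train.txt": stems[n_val:]}

    os.makedirs(out_dir, exist_ok=True)
    for list_name, items in splits.items():
        with open(os.path.join(out_dir, list_name), "w") as f:
            for stem in items:
                f.write(stem + "\n")
    return splits
```

Calling it with something like `write_split_lists("JPEGImages", "ImageSets/Main")` would drop both lists next to each other.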

The coco dataset format:
one image folder is paired with one json, which holds all the images in the folder and their labels.
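
To make the coco-to-pascal_voc step a bit more concrete, here is a minimal sketch that turns an already-loaded COCO dict into one VOC-style XML string per image. The function name `coco_to_voc_xml` is made up, the XML only carries the few fields most loaders read (filename, size, object name, bndbox), and it ignores VOC's 1-based pixel convention, so treat it as a starting point rather than a faithful converter.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def coco_to_voc_xml(coco):
    """Return {file_name: xml_string}, one VOC-style annotation per image."""
    cat_names = {c["id"]: c["name"] for c in coco["categories"]}
    anns_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)

    out = {}
    for img in coco["images"]:
        root = ET.Element("annotation")
        ET.SubElement(root, "filename").text = img["file_name"]
        size = ET.SubElement(root, "size")
        ET.SubElement(size, "width").text = str(img["width"])
        ET.SubElement(size, "height").text = str(img["height"])
        for ann in anns_by_image[img["id"]]:
            x, y, w, h = ann["bbox"]  # COCO: top-left corner plus width/height
            obj = ET.SubElement(root, "object")
            ET.SubElement(obj, "name").text = cat_names[ann["category_id"]]
            box = ET.SubElement(obj, "bndbox")
            # VOC stores the two corners instead of corner + size.
            corners = (("xmin", x), ("ymin", y), ("xmax", x + w), ("ymax", y + h))
            for tag, value in corners:
                ET.SubElement(box, tag).text = str(int(round(value)))
        out[img["file_name"]] = ET.tostring(root, encoding="unicode")
    return out

# Tiny hand-made dict in the same shape as the COCO json (not real data).
sample = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
    "annotations": [{"image_id": 1, "category_id": 18, "bbox": [10.0, 20.0, 30.0, 40.0]}],
    "categories": [{"id": 18, "name": "dog"}],
}
xml_by_file = coco_to_voc_xml(sample)
```

For a real file, `json.load` on the annotation json would give the `coco` dict.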

There are roughly three dataset formats around right now:

pascal_voc
coco
Ultralytics YOLO

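dataloader( ) takes as parameters the path of the filename list txt file, the image folder path, and the label folder path.
So there is no need to follow the pascal_voc layout; it is enough to keep images and labels in separate folders.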
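
A minimal sketch of what a loader with those three parameters might look like. `ListFileDataset` is a hypothetical stand-in and not the real dataloader; it assumes `.jpg` images, labels as `class x y w h` lines in a `.txt` with the same stem, and it returns the image path instead of decoding the image to keep the example dependency-free.

```python
import os

class ListFileDataset:
    """Pairs each name in a list file with its image path and parsed label rows."""

    def __init__(self, list_path, image_dir, label_dir):
        with open(list_path) as f:
            # One name per line; blank lines are skipped.
            self.names = [line.strip() for line in f if line.strip()]
        self.image_dir = image_dir
        self.label_dir = label_dir

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        # Accept entries with or without a directory prefix / extension.
        stem = os.path.splitext(os.path.basename(self.names[idx]))[0]
        image_path = os.path.join(self.image_dir, stem + ".jpg")
        label_path = os.path.join(self.label_dir, stem + ".txt")
        boxes = []
        with open(label_path) as f:
            for line in f:
                parts = line.split()
                if len(parts) == 5:  # class x y w h
                    boxes.append((int(parts[0]), *map(float, parts[1:])))
        return image_path, boxes
```

The point is just that the folder layout does not matter as long as the list, the image folder, and the label folder agree on the file stems.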

This one:

YOLOv3 from Scratch Analaysis and Implementation

The "YOLODataset" it talks about is that space-separated text file (csv with space),
and the folders have to follow

.
├── images
│   ├── train
│   └── val
└── labels
    ├── train
    └── val

There also seems to be a filelist ..

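The pros really are better. I asked gemma3:27b to work out the format of train.csv from dataset.py, and it kept going in circles saying train.csv could not be found.
After switching to GitHub Copilot + Sonnet 3.5, the same prompt got an answer (I just don't know whether it is correct):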

Based on the code in dataset.py, I can explain how the train.csv file should be structured. The file is used to map image names to their corresponding label files.

The train.csv should contain two columns:
1. Image filename
2. Label filename

Here's an example of how train.csv should look:

```csv
image_name,label_name
image1.jpg,image1.txt
image2.jpg,image2.txt
dog_picture.jpg,dog_picture.txt
car_street.jpg,car_street.txt
```

Key points about the CSV structure:
- The image filenames should correspond to files in the `COCO/images/images/` directory
- The label filenames should correspond to files in the `COCO/labels/labels_new/` directory
- Each label file (.txt) contains the bounding box information in the format: `<class_label> <x> <y> <width> <height>`

You can create this CSV file using a simple Python script like this:

````python
import os
import pandas as pd

def create_train_csv(image_dir, label_dir, output_csv):
    image_files = [f for f in os.listdir(image_dir) if f.endswith(('.jpg', '.jpeg', '.png'))]
    data = []
    
    for img_file in image_files:
        # Assume label file has same name but .txt extension
        label_file = os.path.splitext(img_file)[0] + '.txt'
        if os.path.exists(os.path.join(label_dir, label_file)):
            data.append([img_file, label_file])
    
    df = pd.DataFrame(data, columns=['image_name', 'label_name'])
    df.to_csv(output_csv, index=False)

# Example usage
create_train_csv(
    'COCO/images/images/',
    'COCO/labels/labels_new/',
    'COCO/train.csv'
)
````

This script will:
1. Scan the images directory
2. Match image files with their corresponding label files
3. Create a CSV file with the correct format expected by the YOLODataset class

The dataset class then uses this CSV file to load the corresponding image and label pairs during training.

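Also, the class (category) ids in the COCO dataset annotations go up to 90, but only 80 are listed.
According to the original paper, this is because when it was released in 2014, object detection only had these 80; the other 10 are for segmentation.
Following the instructions, I made the changes here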
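
A small sketch of the kind of remapping this needs: squeeze the sparse category ids into contiguous 0-based class indices, then write a YOLO label line. Both function names are made up, and ordering the classes by sorted category id is my assumption; the repo may order them differently, so check against its class names file. The three categories below are just a toy subset to show the gap between ids 11 and 13.

```python
def build_category_index(categories):
    """Map sparse COCO category ids to contiguous 0-based class indices."""
    ids = sorted(cat["id"] for cat in categories)
    return {cat_id: idx for idx, cat_id in enumerate(ids)}

def coco_ann_to_yolo_line(ann, img_w, img_h, cat_index):
    """Turn one COCO annotation into a 'class cx cy w h' line, normalized to 0..1."""
    x, y, w, h = ann["bbox"]      # COCO: top-left corner plus size, in pixels
    cx = (x + w / 2) / img_w      # YOLO: box center, relative to image size
    cy = (y + h / 2) / img_h
    cls = cat_index[ann["category_id"]]
    return f"{cls} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"

# Toy subset of categories, not the full list.
categories = [
    {"id": 1, "name": "person"},
    {"id": 11, "name": "fire hydrant"},
    {"id": 13, "name": "stop sign"},
]
cat_index = build_category_index(categories)
ann = {"category_id": 13, "bbox": [50, 20, 100, 40]}
line = coco_ann_to_yolo_line(ann, img_w=200, img_h=100, cat_index=cat_index)
```

With the full category list from the json, the same function would give indices 0 to 79.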

PyTorch-YOLOv3

The 5k.part it ships is an ascii file containing a list of image files

...
/images/val2014/COCO_val2014_000000581827.jpg
/images/val2014/COCO_val2014_000000581829.jpg
...

Exactly 5000 files.

That labels.tgz is actually a plain tar, not gzipped.
Extracting it gives txt files in YOLO label format, one per image.
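
For the record, a sketch of how to check this and read the labels without caring about the extension. `is_gzip` and `read_yolo_labels` are my own helper names. `tarfile.open` with mode `"r:*"` detects the compression (or the lack of it) by itself, and the parsing assumes each label line is `class cx cy w h`.

```python
import tarfile

def is_gzip(path):
    """True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"

def read_yolo_labels(archive_path):
    """Return {member_name: [(class_id, cx, cy, w, h), ...]} for every .txt member."""
    labels = {}
    # "r:*" works for plain tar as well as gzip/bz2/xz compressed archives.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            if not (member.isfile() and member.name.endswith(".txt")):
                continue
            text = tar.extractfile(member).read().decode("ascii")
            rows = []
            for line in text.splitlines():
                parts = line.split()
                if len(parts) == 5:
                    rows.append((int(parts[0]), *(float(p) for p in parts[1:])))
            labels[member.name] = rows
    return labels
```

`is_gzip("labels.tgz")` coming back `False` while `read_yolo_labels` still works would match what I saw.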

2025/4/12
