This document describes how to use the Labelme annotation tool to annotate data for single-model image classification tasks. For installation instructions and detailed usage procedures, refer to the Labelme homepage documentation.
Labelme is a Python-based image annotation tool with a graphical interface. It can be used for tasks such as image classification, object detection, and image segmentation. In instance segmentation annotation tasks, labels are stored as JSON files.
To avoid environment conflicts, installing Labelme in a dedicated conda environment is recommended:
```bash
conda create -n labelme python=3.10
conda activate labelme
pip install pyqt5
pip install labelme
```
1. Create a root directory for the dataset, such as `pets`.
2. Create an `images` directory (it must be named `images`) within `pets` and store the images to be annotated in it.
3. Create a category label file `flags.txt` in the `pets` folder and write the categories of the dataset into it, one category per line. For a cat and dog classification dataset, for example, `flags.txt` lists each of the two categories on its own line.

Navigate to the root directory of the dataset to be annotated in the terminal and start the Labelme annotation tool:
```bash
cd path/to/pets
labelme images --nodata --autosave --output annotations --flags flags.txt
```
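The directory layout and `flags.txt` that this command expects can also be prepared with a short script. The following is a minimal sketch, not part of Labelme or PaddleX: the function name `prepare_dataset_root` is hypothetical, the category names passed to it are up to you, and copying the images into `images` is left out.

```python
from pathlib import Path


def prepare_dataset_root(root, categories):
    """Create <root>/images and write <root>/flags.txt, one category per line.

    Hypothetical helper for illustration; it only sets up the layout
    described in the steps above.
    """
    root_dir = Path(root)
    images_dir = root_dir / "images"  # the directory must be named "images"
    images_dir.mkdir(parents=True, exist_ok=True)

    flags_path = root_dir / "flags.txt"
    flags_path.write_text("\n".join(categories) + "\n", encoding="utf-8")
    return images_dir, flags_path
```

Calling `prepare_dataset_root("pets", ["cat", "dog"])`, for instance, would create `pets/images` and a two-line `pets/flags.txt`, assuming `cat` and `dog` are the category names you want.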
The options used in the `labelme` command are:

- `flags` creates classification labels for images; pass in the path to the label file.
- `nodata` stops image data from being stored in the JSON files.
- `autosave` enables automatic saving.
- `output` specifies the storage path for label files.

After starting `labelme`, the annotation interface is displayed. Annotate each image as follows:

1. Select the category in the `Flags` interface.
2. Save the annotation. (If `output` is not specified when starting `labelme`, it will prompt you to select a save path upon the first save. If `autosave` is specified, there is no need to click the `Save` button.)
3. Click `Next Image` to annotate the next image.

After annotating all images, use the `convert_to_imagenet.py` script to convert the annotated dataset to the ImageNet-1k dataset format, generating `train.txt`, `val.txt`, and `label.txt`:
```bash
python convert_to_imagenet.py --dataset_path /path/to/dataset
```

`dataset_path` is the path to the annotated classification dataset in Labelme format.
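To illustrate the kind of information a conversion step works from, the sketch below reads Labelme JSON files and collects an image path and class name from each. It is not the actual `convert_to_imagenet.py` script, whose internals are not shown here. The function names are hypothetical, and the sketch assumes each JSON file carries a `flags` mapping (class name to boolean) and an `imagePath` entry, with at most one flag checked per image.

```python
import json
from pathlib import Path


def read_flag_label(json_path):
    """Return (image_path, class_name) for one Labelme JSON file.

    Assumes a "flags" mapping of class name -> bool and an "imagePath"
    string. class_name is None when no flag is checked.
    """
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    checked = [name for name, is_set in data.get("flags", {}).items() if is_set]
    if len(checked) > 1:
        # A single-label classification image should have only one flag set.
        raise ValueError(f"{json_path}: more than one flag checked: {checked}")
    class_name = checked[0] if checked else None
    return data.get("imagePath"), class_name


def collect_labels(annotation_dir):
    """Read every *.json file in annotation_dir, sorted by file name."""
    json_files = sorted(Path(annotation_dir).glob("*.json"))
    return [read_flag_label(path) for path in json_files]
```

Raising an error on multiple checked flags is a design choice for this sketch; a multi-label workflow would need to handle that case differently.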
The dataset defined by PaddleX for image classification tasks is named ClsDataset, with the following organizational structure and annotation format:
```bash
dataset_dir    # Root directory of the dataset; the directory name can be changed
├── images     # Directory for saving images; the directory name can be changed, but it must correspond to the content of train.txt and val.txt
├── label.txt  # Correspondence between annotation IDs and category names; the file name cannot be changed
```

Original label.txt:

```bash
classname1
classname2
classname3
...
```

Modified label.txt:

```bash
0 classname1
1 classname2
2 classname3
...
```
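
The modified `label.txt` format pairs each category name with an integer ID. The following is a minimal sketch of that rewrite, assuming the IDs are zero-based and simply follow line order, as in the example above. The function names `add_class_ids` and `rewrite_label_file` are hypothetical and not part of PaddleX.

```python
from pathlib import Path


def add_class_ids(lines):
    """Prefix each non-empty class name with its zero-based index."""
    names = [line.strip() for line in lines if line.strip()]
    return [f"{index} {name}" for index, name in enumerate(names)]


def rewrite_label_file(path):
    """Rewrite a label.txt file in place as "<id> <classname>" lines."""
    label_path = Path(path)
    lines = label_path.read_text(encoding="utf-8").splitlines()
    numbered = add_class_ids(lines)
    label_path.write_text("\n".join(numbered) + "\n", encoding="utf-8")
    return numbered
```

Note that running `rewrite_label_file` twice on the same file would prefix the IDs again, so it should be applied only to a file that still holds bare category names.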