This section introduces how to use the Labelme and PaddleLabel annotation tools to complete data annotation for single-model object detection tasks. Click the links above to install the annotation tools and view detailed usage instructions.
Labelme is a Python-based image annotation software with a graphical user interface. It can be used for tasks such as image classification, object detection, and image segmentation. For object detection annotation tasks, labels are stored as JSON files.
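To make the stored format concrete, the sketch below writes a minimal Labelme-style JSON file for one rectangle annotation. The field layout follows Labelme's annotation schema; the file name, version string, and coordinates are illustrative assumptions.

```python
import json

# A minimal Labelme-style annotation for one bounding box: a rectangle
# shape is stored as two corner points. All concrete values here are
# illustrative, not taken from a real dataset.
annotation = {
    "version": "5.0.0",           # illustrative Labelme version string
    "flags": {},
    "shapes": [
        {
            "label": "helmet",
            "points": [[120.0, 80.0], [260.0, 210.0]],  # top-left, bottom-right
            "group_id": None,
            "shape_type": "rectangle",
            "flags": {},
        }
    ],
    "imagePath": "../images/0001.jpg",
    "imageData": None,            # stays None when Labelme runs with --nodata
    "imageHeight": 480,
    "imageWidth": 640,
}

with open("0001.json", "w") as f:
    json.dump(annotation, f, indent=2)
```

One such JSON file is produced per annotated image, which is why the conversion step later in this section is needed before training.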
To avoid environment conflicts, it is recommended to install in a conda environment.
conda create -n labelme python=3.10
conda activate labelme
pip install pyqt5
pip install labelme
Create a root directory for the dataset, such as `helmet`. Then:

1. Create an `images` directory (the directory must be named `images`) within `helmet` and store the images to be annotated in it.
2. Create a `label.txt` file in the `helmet` folder and write the categories of the dataset to be annotated into it, one category per line.
3. Navigate to the root directory of the dataset in the terminal and start the Labelme annotation tool:
cd path/to/helmet
labelme images --labels label.txt --nodata --autosave --output annotations
The command-line options mean the following:

- `--labels` creates classification labels for images, passing in the path to the label file.
- `--nodata` stops storing image data in the JSON file.
- `--autosave` enables automatic saving.
- `--output` specifies the path for storing label files.

If `--output` is not specified when starting Labelme, you will be prompted to select a save path upon the first save; if `--autosave` is enabled, there is no need to click Save. After finishing an image, click Next Image to annotate the next one. The resulting labels are stored in Labelme format.
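Putting the preparation steps together, the sketch below creates the dataset skeleton (`helmet/images` plus `label.txt`) described above. The two category names are illustrative assumptions, not fixed by this tutorial.

```python
from pathlib import Path

# Create the dataset skeleton described above: a root directory ("helmet"),
# an "images" subdirectory for the raw pictures, and a label.txt listing
# one category per line. The two category names are illustrative examples.
root = Path("helmet")
(root / "images").mkdir(parents=True, exist_ok=True)

categories = ["head", "helmet"]  # hypothetical categories for a helmet dataset
(root / "label.txt").write_text("\n".join(categories) + "\n")

print(sorted(p.name for p in root.iterdir()))
# → ['images', 'label.txt']
```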
Create two text files, `train_anno_list.txt` and `val_anno_list.txt`, in the root directory of the dataset. Write the paths of all JSON files in the `annotations` directory into `train_anno_list.txt` and `val_anno_list.txt` at a certain ratio, or write all of them into `train_anno_list.txt` and create an empty `val_anno_list.txt`, then use the data splitting function to re-split. Each line of `train_anno_list.txt` and `val_anno_list.txt` holds the path of one annotation JSON file.

After labeling with Labelme, the data format needs to be converted to coco format. Below is a code example for converting data labeled according to the above tutorial:
cd /path/to/paddlex
wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/det_labelme_examples.tar -P ./dataset
tar -xf ./dataset/det_labelme_examples.tar -C ./dataset/
python main.py -c paddlex/configs/object_detection/PicoDet-L.yaml \
-o Global.mode=check_dataset \
-o Global.dataset_dir=./dataset/det_labelme_examples \
-o CheckDataset.convert.enable=True \
-o CheckDataset.convert.src_dataset_type=LabelMe
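The `train_anno_list.txt` / `val_anno_list.txt` split described above can be sketched as follows. The 9:1 ratio and the root-relative path format are assumptions; adjust both to your setup.

```python
import random
from pathlib import Path

# Split the Labelme JSON files under helmet/annotations into
# train_anno_list.txt and val_anno_list.txt at an assumed 9:1 ratio,
# writing one path per line relative to the dataset root.
root = Path("helmet")
anno_dir = root / "annotations"
anno_dir.mkdir(parents=True, exist_ok=True)  # normally created by Labelme's --output

json_files = sorted(anno_dir.glob("*.json"))
random.seed(0)                               # reproducible split
random.shuffle(json_files)
split = int(len(json_files) * 0.9)

train, val = json_files[:split], json_files[split:]
(root / "train_anno_list.txt").write_text(
    "".join(f"annotations/{p.name}\n" for p in train))
(root / "val_anno_list.txt").write_text(
    "".join(f"annotations/{p.name}\n" for p in val))
```

Writing everything into `train_anno_list.txt` and leaving `val_anno_list.txt` empty also works if you rely on the data splitting function instead.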
To avoid environment conflicts, it is recommended to create a clean conda environment:
conda create -n paddlelabel python=3.11
conda activate paddlelabel
Alternatively, you can install it with pip, along with the pinned dependency versions it requires:
pip install --upgrade paddlelabel
pip install a2wsgi uvicorn==0.18.1
pip install connexion==2.14.1
pip install Flask==2.2.2
pip install Werkzeug==2.2.2
After successful installation, you can start PaddleLabel using one of the following commands in the terminal:
paddlelabel # Start paddlelabel
pdlabel # Abbreviation, identical to paddlelabel
PaddleLabel will automatically open a webpage in your browser after startup. You can then proceed with the annotation process based on your task.
After annotating with PaddleLabel, export the dataset in coco format, then adjust it into a standard coco format dataset for helmet detection. First, rename the exported JSON files and the image directory according to the following correspondence:

| Original File (Directory) Name | Renamed File (Directory) Name |
|---|---|
| train.json | instance_train.json |
| val.json | instance_val.json |
| test.json | instance_test.json |
| image | images |
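The renaming table above can be applied with a short script. The directory name `helmet_coco` and the simulated export files are illustrative; point `root` at your real export instead.

```python
from pathlib import Path

# Apply the renaming table to a coco export. "helmet_coco" and the
# stand-in files created here are illustrative only.
root = Path("helmet_coco")
root.mkdir(exist_ok=True)
for name in ("train.json", "val.json", "test.json"):
    (root / name).write_text("{}")       # stand-ins for the exported files
(root / "image").mkdir(exist_ok=True)

rename_map = {
    "train.json": "instance_train.json",
    "val.json": "instance_val.json",
    "test.json": "instance_test.json",
    "image": "images",
}
for old, new in rename_map.items():
    src = root / old
    if src.exists():                     # skip entries the export did not produce
        src.rename(root / new)

print(sorted(p.name for p in root.iterdir()))
# → ['images', 'instance_test.json', 'instance_train.json', 'instance_val.json']
```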
Next, create an `annotations` directory in the root directory of the dataset and move all JSON files into it; the final directory structure then matches the layout shown below. Finally, compress the `helmet` directory into a `.tar` or `.zip` package to obtain the standard coco format dataset for helmet detection.

The dataset defined by PaddleX for object detection tasks is named COCODetDataset, with the following organizational structure and annotation format:
dataset_dir # Root directory of the dataset, the directory name can be changed
├── annotations # Directory for saving annotation files, the directory name cannot be changed
│ ├── instance_train.json # Annotation file for the training set, the file name cannot be changed, using COCO annotation format
│ └── instance_val.json # Annotation file for the validation set, the file name cannot be changed, using COCO annotation format
└── images # Directory for saving images, the directory name cannot be changed
The annotation files use the COCO format. Please prepare your data according to the above specifications. Additionally, you can refer to the example dataset.
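To make the annotation format concrete, the sketch below writes a minimal `instance_train.json` in COCO detection format and reads it back. The file name follows the structure above, but the category names, image entry, and box values are illustrative assumptions.

```python
import json
from pathlib import Path

# A minimal COCO detection annotation file: "images", "annotations"
# (with [x, y, width, height] boxes), and "categories". All concrete
# values are illustrative; real files come from the conversion/export
# steps described earlier.
coco = {
    "images": [{"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480}],
    "annotations": [{
        "id": 1, "image_id": 1, "category_id": 2,
        "bbox": [120.0, 80.0, 140.0, 130.0],   # [x, y, width, height]
        "area": 140.0 * 130.0, "iscrowd": 0,
    }],
    "categories": [{"id": 1, "name": "head"}, {"id": 2, "name": "helmet"}],
}

out = Path("dataset_dir/annotations")
out.mkdir(parents=True, exist_ok=True)
(out / "instance_train.json").write_text(json.dumps(coco))

# Quick sanity check usable on any COCO annotation file:
data = json.loads((out / "instance_train.json").read_text())
print(len(data["images"]), len(data["annotations"]),
      [c["name"] for c in data["categories"]])
# → 1 1 ['head', 'helmet']
```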