简体中文 | English

PaddleX Model List (Huawei Ascend NPU)

PaddleX provides multiple pipelines. Each pipeline contains several modules, and each module offers several models. Use the benchmark data below to choose among them: if accuracy matters most, pick a model with higher accuracy; if storage is the constraint, pick a model with a smaller size.

Image Classification Module

| Model Name | Top-1 Accuracy (%) | Model Size |
|-|-|-|
| CLIP_vit_base_patch16_224 | 85.36 | 306.5 M |
| CLIP_vit_large_patch14_224 | 88.1 | 1.04 G |
| ConvNeXt_base_224 | 83.84 | 313.9 M |
| ConvNeXt_base_384 | 84.90 | 313.9 M |
| ConvNeXt_large_224 | 84.26 | 700.7 M |
| ConvNeXt_large_384 | 85.27 | 700.7 M |
| ConvNeXt_small | 83.13 | 178.0 M |
| ConvNeXt_tiny | 82.03 | 101.4 M |
| MobileNetV1_x0_75 | 68.8 | 9.3 M |
| MobileNetV1_x1_0 | 71.0 | 15.2 M |
| MobileNetV2_x0_5 | 65.0 | 7.1 M |
| MobileNetV2_x0_25 | 53.2 | 5.5 M |
| MobileNetV2_x1_0 | 72.2 | 12.6 M |
| MobileNetV2_x1_5 | 74.1 | 25.0 M |
| MobileNetV2_x2_0 | 75.2 | 41.2 M |
| MobileNetV3_large_x0_5 | 69.2 | 9.6 M |
| MobileNetV3_large_x0_35 | 64.3 | 7.5 M |
| MobileNetV3_large_x0_75 | 73.1 | 14.0 M |
| MobileNetV3_large_x1_0 | 75.3 | 19.5 M |
| MobileNetV3_large_x1_25 | 76.4 | 26.5 M |
| MobileNetV3_small_x0_5 | 59.2 | 6.8 M |
| MobileNetV3_small_x0_35 | 53.0 | 6.0 M |
| MobileNetV3_small_x0_75 | 66.0 | 8.5 M |
| MobileNetV3_small_x1_0 | 68.2 | 10.5 M |
| MobileNetV3_small_x1_25 | 70.7 | 13.0 M |
| PP-HGNet_base | 85.0 | 249.4 M |
| PP-HGNet_small | 81.51 | 86.5 M |
| PP-HGNet_tiny | 79.83 | 52.4 M |
| PP-HGNetV2-B0 | 77.77 | 21.4 M |
| PP-HGNetV2-B1 | 79.18 | 22.6 M |
| PP-HGNetV2-B2 | 81.74 | 39.9 M |
| PP-HGNetV2-B3 | 82.98 | 57.9 M |
| PP-HGNetV2-B4 | 83.57 | 70.4 M |
| PP-HGNetV2-B5 | 84.75 | 140.8 M |
| PP-HGNetV2-B6 | 86.30 | 268.4 M |
| PP-LCNet_x0_5 | 63.14 | 6.7 M |
| PP-LCNet_x0_25 | 51.86 | 5.5 M |
| PP-LCNet_x0_35 | 58.09 | 5.9 M |
| PP-LCNet_x0_75 | 68.18 | 8.4 M |
| PP-LCNet_x1_0 | 71.32 | 10.5 M |
| PP-LCNet_x1_5 | 73.71 | 16.0 M |
| PP-LCNet_x2_0 | 75.18 | 23.2 M |
| PP-LCNet_x2_5 | 76.60 | 32.1 M |
| PP-LCNetV2_base | 77.05 | 23.7 M |
| ResNet18_vd | 72.3 | 41.5 M |
| ResNet18 | 71.0 | 41.5 M |
| ResNet34_vd | 76.0 | 77.3 M |
| ResNet34 | 74.6 | 77.3 M |
| ResNet50_vd | 79.1 | 90.8 M |
| ResNet50 | 76.5 | 90.8 M |
| ResNet101_vd | 80.2 | 158.4 M |
| ResNet101 | 77.6 | 158.7 M |
| ResNet152_vd | 80.6 | 214.3 M |
| ResNet152 | 78.3 | 214.2 M |
| ResNet200_vd | 80.9 | 266.0 M |
| SwinTransformer_base_patch4_window7_224 | 83.37 | 310.5 M |
| SwinTransformer_small_patch4_window7_224 | 83.21 | 175.6 M |
| SwinTransformer_tiny_patch4_window7_224 | 81.10 | 100.1 M |

Note: The above accuracy metrics refer to Top-1 Accuracy on the ImageNet-1k validation set.
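The selection guidance at the top of this page (favor accuracy, or favor size) can be sketched in a few lines of Python. This is only an illustration: the `ModelEntry` type and both helper functions are hypothetical names introduced here, not part of PaddleX, and the data is a small hand-copied subset of the Image Classification table above.

```python
from typing import List, NamedTuple, Optional


class ModelEntry(NamedTuple):
    """One table row (hypothetical helper type, not a PaddleX API)."""
    name: str
    top1: float     # Top-1 accuracy (%) on ImageNet-1k validation
    size_mb: float  # model size in M, as listed in the table


# A few rows copied from the Image Classification table; extend as needed.
MODELS: List[ModelEntry] = [
    ModelEntry("PP-LCNet_x1_0", 71.32, 10.5),
    ModelEntry("MobileNetV3_large_x1_0", 75.3, 19.5),
    ModelEntry("PP-HGNetV2-B0", 77.77, 21.4),
    ModelEntry("ResNet50", 76.5, 90.8),
    ModelEntry("PP-HGNetV2-B6", 86.30, 268.4),
]


def most_accurate_within(models: List[ModelEntry],
                         max_size_mb: float) -> Optional[ModelEntry]:
    """Size-first choice: best accuracy among models that fit the budget."""
    candidates = [m for m in models if m.size_mb <= max_size_mb]
    return max(candidates, key=lambda m: m.top1, default=None)


def smallest_reaching(models: List[ModelEntry],
                      min_top1: float) -> Optional[ModelEntry]:
    """Accuracy-first choice: smallest model that meets the accuracy floor."""
    candidates = [m for m in models if m.top1 >= min_top1]
    return min(candidates, key=lambda m: m.size_mb, default=None)


if __name__ == "__main__":
    print(most_accurate_within(MODELS, max_size_mb=25.0))
    print(smallest_reaching(MODELS, min_top1=80.0))
```

Both helpers return `None` when no listed model satisfies the constraint, so the caller can decide whether to relax the budget or the accuracy floor. The same pattern applies to the other modules by swapping in their metric (mAP, mIoU, Hmean, and so on).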

Object Detection Module

| Model Name | mAP (%) | Model Size |
|-|-|-|
| CenterNet-DLA-34 | 37.6 | 75.4 M |
| CenterNet-ResNet50 | 38.9 | 319.7 M |
| DETR-R50 | 42.3 | 159.3 M |
| FasterRCNN-ResNet34-FPN | 37.8 | 137.5 M |
| FasterRCNN-ResNet50-FPN | 38.4 | 148.1 M |
| FasterRCNN-ResNet50-vd-FPN | 39.5 | 148.1 M |
| FasterRCNN-ResNet50-vd-SSLDv2-FPN | 41.4 | 148.1 M |
| FasterRCNN-ResNet101-FPN | 41.4 | 216.3 M |
| FCOS-ResNet50 | 39.6 | 124.2 M |
| PicoDet-L | 42.6 | 20.9 M |
| PicoDet-M | 37.5 | 16.8 M |
| PicoDet-S | 29.1 | 4.4 M |
| PicoDet-XS | 26.2 | 5.7 M |
| PP-YOLOE_plus-L | 52.9 | 185.3 M |
| PP-YOLOE_plus-M | 49.8 | 83.2 M |
| PP-YOLOE_plus-S | 43.7 | 28.3 M |
| PP-YOLOE_plus-X | 54.7 | 349.4 M |
| RT-DETR-H | 56.3 | 435.8 M |
| RT-DETR-L | 53.0 | 113.7 M |
| RT-DETR-R18 | 46.5 | 70.7 M |
| RT-DETR-R50 | 53.1 | 149.1 M |
| RT-DETR-X | 54.8 | 232.9 M |
| YOLOv3-DarkNet53 | 39.1 | 219.7 M |
| YOLOv3-MobileNetV3 | 31.4 | 83.8 M |
| YOLOv3-ResNet50_vd_DCN | 40.6 | 163.0 M |

Note: The above accuracy metrics are for COCO2017 validation set mAP(0.5:0.95).

Semantic Segmentation Module

| Model Name | mIoU (%) | Model Size |
|-|-|-|
| Deeplabv3_Plus-R50 | 80.36 | 94.9 M |
| Deeplabv3_Plus-R101 | 81.10 | 162.5 M |
| Deeplabv3-R50 | 79.90 | 138.3 M |
| Deeplabv3-R101 | 80.85 | 205.9 M |
| OCRNet_HRNet-W48 | 82.15 | 249.8 M |
| PP-LiteSeg-T | 73.10 | 28.5 M |

Note: The above accuracy metrics are for Cityscapes dataset mIoU.

Instance Segmentation Module

| Model Name | Mask AP | Model Size |
|-|-|-|
| Mask-RT-DETR-H | 50.6 | 449.9 M |
| Mask-RT-DETR-L | 45.7 | 113.6 M |
| Mask-RT-DETR-M | 42.7 | 66.6 M |
| Cascade-MaskRCNN-ResNet50-FPN | 36.3 | 254.8 M |
| Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN | 39.1 | 254.7 M |
| PP-YOLOE_seg-S | 32.5 | 31.5 M |

Note: The above accuracy metrics are for COCO2017 validation set Mask AP(0.5:0.95).

Text Detection Module

| Model Name | Detection Hmean (%) | Model Size |
|-|-|-|
| PP-OCRv4_mobile_det | 77.79 | 4.2 M |
| PP-OCRv4_server_det | 82.69 | 100.1 M |

Note: The above accuracy metrics are evaluated on PaddleOCR's self-built Chinese dataset, covering street scenes, web images, documents, and handwritten scenarios, with 500 images for detection.

Text Recognition Module

| Model Name | Recognition Avg Accuracy (%) | Model Size |
|-|-|-|
| PP-OCRv4_mobile_rec | 78.20 | 10.6 M |
| PP-OCRv4_server_rec | 79.20 | 71.2 M |

Note: The above accuracy metrics are evaluated on PaddleOCR's self-built Chinese dataset, covering street scenes, web images, documents, and handwritten scenarios, with 11,000 images for text recognition.

| Model Name | Recognition Avg Accuracy (%) | Model Size |
|-|-|-|
| ch_SVTRv2_rec | 68.81 | 73.9 M |

Note: The above accuracy metrics are evaluated on the PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition A-Rank.

| Model Name | Recognition Avg Accuracy (%) | Model Size |
|-|-|-|
| ch_RepSVTR_rec | 65.07 | 22.1 M |

Note: The above accuracy metrics are evaluated on the PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition B-Rank.

Table Structure Recognition Module

| Model Name | Accuracy (%) | Model Size |
|-|-|-|
| SLANet | 76.31 | 6.9 M |

Note: The above accuracy metrics are measured on the PubtabNet English table recognition dataset.

Layout Analysis Module

| Model Name | mAP (%) | Model Size |
|-|-|-|
| PicoDet_layout_1x | 86.8 | 7.4 M |

Note: The evaluation set for the above accuracy metrics is PaddleOCR's self-built layout analysis dataset, containing 10,000 images.

Time Series Forecasting Module

| Model Name | MSE | MAE | Model Size |
|-|-|-|-|
| DLinear | 0.382 | 0.394 | 72 K |
| NLinear | 0.386 | 0.392 | 40 K |
| Nonstationary | 0.600 | 0.515 | 55.5 M |
| PatchTST | 0.385 | 0.397 | 2.0 M |
| RLinear | 0.384 | 0.392 | 40 K |
| TiDE | 0.405 | 0.412 | 31.7 M |
| TimesNet | 0.417 | 0.431 | 4.9 M |

Note: The above accuracy metrics are measured on the ETTH1 dataset (evaluation results on the test set test.csv).

Time Series Anomaly Detection Module

| Model Name | Precision | Recall | F1-Score | Model Size |
|-|-|-|-|-|
| AutoEncoder_ad | 99.36 | 84.36 | 91.25 | 52 K |
| DLinear_ad | 98.98 | 93.96 | 96.41 | 112 K |
| Nonstationary_ad | 98.55 | 88.95 | 93.51 | 1.8 M |
| PatchTST_ad | 98.78 | 90.70 | 94.57 | 320 K |
| TimesNet_ad | 98.37 | 94.80 | 96.56 | 1.3 M |

Note: The above accuracy metrics are measured on the PSM dataset.

Time Series Classification Module

| Model Name | Acc (%) | Model Size |
|-|-|-|
| TimesNet_cls | 87.5 | 792 K |

Note: The above accuracy metrics are measured on the UWaveGestureLibrary dataset (training and evaluation sets).
