

---
comments: true
---

# PaddleX Model List (Huawei Ascend NPU)

PaddleX incorporates multiple pipelines, each containing several modules, and each module offers a choice of models. You can select models based on the benchmark data below: if you prioritize accuracy, choose models with higher accuracy; if you prioritize storage footprint, choose models with a smaller size.
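
As a quick illustration of how a model picked from these tables is typically exercised, the minimal sketch below assumes the PaddleX 3.x Python API (`create_model` / `predict`) and an Ascend environment where the NPU is addressed as `npu:0`; the model name `PP-LCNet_x1_0` comes from the image classification table, and the input image path is a placeholder.

```python
# Minimal sketch, assuming the PaddleX 3.x Python API and a machine where
# the Ascend NPU plugin for PaddlePaddle is installed and "npu:0" is visible.
from paddlex import create_model

# Any model name from the tables below can be substituted here.
model = create_model(model_name="PP-LCNet_x1_0", device="npu:0")

# "demo.jpg" is a placeholder input path.
for res in model.predict("demo.jpg", batch_size=1):
    res.print()                    # print the predicted label and score
    res.save_to_json("./output/")  # save the prediction result as JSON
```

Swapping in a different model from any module below follows the same pattern; only the `model_name` string changes.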

## Image Classification Module

| Model Name | Top-1 Accuracy (%) | Model Size |
|---|---|---|
| CLIP_vit_base_patch16_224 | 85.36 | 306.5 M |
| CLIP_vit_large_patch14_224 | 88.1 | 1.04 G |
| ConvNeXt_base_224 | 83.84 | 313.9 M |
| ConvNeXt_base_384 | 84.90 | 313.9 M |
| ConvNeXt_large_224 | 84.26 | 700.7 M |
| ConvNeXt_large_384 | 85.27 | 700.7 M |
| ConvNeXt_small | 83.13 | 178.0 M |
| ConvNeXt_tiny | 82.03 | 101.4 M |
| MobileNetV1_x0_75 | 68.8 | 9.3 M |
| MobileNetV1_x1_0 | 71.0 | 15.2 M |
| MobileNetV2_x0_5 | 65.0 | 7.1 M |
| MobileNetV2_x0_25 | 53.2 | 5.5 M |
| MobileNetV2_x1_0 | 72.2 | 12.6 M |
| MobileNetV2_x1_5 | 74.1 | 25.0 M |
| MobileNetV2_x2_0 | 75.2 | 41.2 M |
| MobileNetV3_large_x0_5 | 69.2 | 9.6 M |
| MobileNetV3_large_x0_35 | 64.3 | 7.5 M |
| MobileNetV3_large_x0_75 | 73.1 | 14.0 M |
| MobileNetV3_large_x1_0 | 75.3 | 19.5 M |
| MobileNetV3_large_x1_25 | 76.4 | 26.5 M |
| MobileNetV3_small_x0_5 | 59.2 | 6.8 M |
| MobileNetV3_small_x0_35 | 53.0 | 6.0 M |
| MobileNetV3_small_x0_75 | 66.0 | 8.5 M |
| MobileNetV3_small_x1_0 | 68.2 | 10.5 M |
| MobileNetV3_small_x1_25 | 70.7 | 13.0 M |
| PP-HGNet_base | 85.0 | 249.4 M |
| PP-HGNet_small | 81.51 | 86.5 M |
| PP-HGNet_tiny | 79.83 | 52.4 M |
| PP-HGNetV2-B0 | 77.77 | 21.4 M |
| PP-HGNetV2-B1 | 79.18 | 22.6 M |
| PP-HGNetV2-B2 | 81.74 | 39.9 M |
| PP-HGNetV2-B3 | 82.98 | 57.9 M |
| PP-HGNetV2-B4 | 83.57 | 70.4 M |
| PP-HGNetV2-B5 | 84.75 | 140.8 M |
| PP-HGNetV2-B6 | 86.30 | 268.4 M |
| PP-LCNet_x0_5 | 63.14 | 6.7 M |
| PP-LCNet_x0_25 | 51.86 | 5.5 M |
| PP-LCNet_x0_35 | 58.09 | 5.9 M |
| PP-LCNet_x0_75 | 68.18 | 8.4 M |
| PP-LCNet_x1_0 | 71.32 | 10.5 M |
| PP-LCNet_x1_5 | 73.71 | 16.0 M |
| PP-LCNet_x2_0 | 75.18 | 23.2 M |
| PP-LCNet_x2_5 | 76.60 | 32.1 M |
| PP-LCNetV2_base | 77.05 | 23.7 M |
| ResNet18_vd | 72.3 | 41.5 M |
| ResNet18 | 71.0 | 41.5 M |
| ResNet34_vd | 76.0 | 77.3 M |
| ResNet34 | 74.6 | 77.3 M |
| ResNet50_vd | 79.1 | 90.8 M |
| ResNet50 | 76.5 | 90.8 M |
| ResNet101_vd | 80.2 | 158.4 M |
| ResNet101 | 77.6 | 158.7 M |
| ResNet152_vd | 80.6 | 214.3 M |
| ResNet152 | 78.3 | 214.2 M |
| ResNet200_vd | 80.9 | 266.0 M |
| SwinTransformer_base_patch4_window7_224 | 83.37 | 310.5 M |
| SwinTransformer_small_patch4_window7_224 | 83.21 | 175.6 M |
| SwinTransformer_tiny_patch4_window7_224 | 81.10 | 100.1 M |

Note: The above accuracy metrics refer to Top-1 Accuracy on the ImageNet-1k validation set.

## Object Detection Module

| Model Name | mAP (%) | Model Size |
|---|---|---|
| CenterNet-DLA-34 | 37.6 | 75.4 M |
| CenterNet-ResNet50 | 38.9 | 319.7 M |
| DETR-R50 | 42.3 | 159.3 M |
| FasterRCNN-ResNet34-FPN | 37.8 | 137.5 M |
| FasterRCNN-ResNet50-FPN | 38.4 | 148.1 M |
| FasterRCNN-ResNet50-vd-FPN | 39.5 | 148.1 M |
| FasterRCNN-ResNet50-vd-SSLDv2-FPN | 41.4 | 148.1 M |
| FasterRCNN-ResNet101-FPN | 41.4 | 216.3 M |
| FCOS-ResNet50 | 39.6 | 124.2 M |
| PicoDet-L | 42.6 | 20.9 M |
| PicoDet-M | 37.5 | 16.8 M |
| PicoDet-S | 29.1 | 4.4 M |
| PicoDet-XS | 26.2 | 5.7 M |
| PP-YOLOE_plus-L | 52.9 | 185.3 M |
| PP-YOLOE_plus-M | 49.8 | 83.2 M |
| PP-YOLOE_plus-S | 43.7 | 28.3 M |
| PP-YOLOE_plus-X | 54.7 | 349.4 M |
| RT-DETR-H | 56.3 | 435.8 M |
| RT-DETR-L | 53.0 | 113.7 M |
| RT-DETR-R18 | 46.5 | 70.7 M |
| RT-DETR-R50 | 53.1 | 149.1 M |
| RT-DETR-X | 54.8 | 232.9 M |
| YOLOv3-DarkNet53 | 39.1 | 219.7 M |
| YOLOv3-MobileNetV3 | 31.4 | 83.8 M |
| YOLOv3-ResNet50_vd_DCN | 40.6 | 163.0 M |

Note: The above accuracy metrics refer to mAP(0.5:0.95) on the COCO2017 validation set.

## Semantic Segmentation Module

| Model Name | mIoU (%) | Model Size |
|---|---|---|
| Deeplabv3_Plus-R50 | 80.36 | 94.9 M |
| Deeplabv3_Plus-R101 | 81.10 | 162.5 M |
| Deeplabv3-R50 | 79.90 | 138.3 M |
| Deeplabv3-R101 | 80.85 | 205.9 M |
| OCRNet_HRNet-W48 | 82.15 | 249.8 M |
| PP-LiteSeg-T | 73.10 | 28.5 M |

Note: The above accuracy metrics refer to mIoU on the Cityscapes dataset.

## Instance Segmentation Module

| Model Name | Mask AP | Model Size |
|---|---|---|
| Mask-RT-DETR-H | 50.6 | 449.9 M |
| Mask-RT-DETR-L | 45.7 | 113.6 M |
| Mask-RT-DETR-M | 42.7 | 66.6 M |
| Cascade-MaskRCNN-ResNet50-FPN | 36.3 | 254.8 M |
| Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN | 39.1 | 254.7 M |
| PP-YOLOE_seg-S | 32.5 | 31.5 M |

Note: The above accuracy metrics refer to Mask AP(0.5:0.95) on the COCO2017 validation set.

## Text Detection Module

| Model Name | Detection Hmean (%) | Model Size |
|---|---|---|
| PP-OCRv4_mobile_det | 77.79 | 4.2 M |
| PP-OCRv4_server_det | 82.69 | 100.1 M |

Note: The above accuracy metrics are evaluated on PaddleOCR's self-built Chinese dataset, covering street scenes, web images, documents, and handwritten scenarios, with 500 images for detection.

## Text Recognition Module

| Model Name | Recognition Avg Accuracy (%) | Model Size |
|---|---|---|
| PP-OCRv4_mobile_rec | 78.20 | 10.6 M |
| PP-OCRv4_server_rec | 79.20 | 71.2 M |

Note: The above accuracy metrics are evaluated on PaddleOCR's self-built Chinese dataset, covering street scenes, web images, documents, and handwritten scenarios, with 11,000 images for text recognition.

| Model Name | Recognition Avg Accuracy (%) | Model Size |
|---|---|---|
| ch_SVTRv2_rec | 68.81 | 73.9 M |

Note: The above accuracy metrics are evaluated on the PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition A-Rank.

| Model Name | Recognition Avg Accuracy (%) | Model Size |
|---|---|---|
| ch_RepSVTR_rec | 65.07 | 22.1 M |

Note: The above accuracy metrics are evaluated on the PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition B-Rank.

## Table Structure Recognition Module

| Model Name | Accuracy (%) | Model Size |
|---|---|---|
| SLANet | 76.31 | 6.9 M |

Note: The above accuracy metrics are measured on the PubTabNet English table recognition dataset.

## Layout Analysis Module

| Model Name | mAP (%) | Model Size |
|---|---|---|
| PicoDet_layout_1x | 86.8 | 7.4 M |

Note: The evaluation set for the above accuracy metrics is PaddleOCR's self-built layout analysis dataset, containing 10,000 images.

## Time Series Forecasting Module

| Model Name | MSE | MAE | Model Size |
|---|---|---|---|
| DLinear | 0.382 | 0.394 | 72 K |
| NLinear | 0.386 | 0.392 | 40 K |
| Nonstationary | 0.600 | 0.515 | 55.5 M |
| PatchTST | 0.385 | 0.397 | 2.0 M |
| RLinear | 0.384 | 0.392 | 40 K |
| TiDE | 0.405 | 0.412 | 31.7 M |
| TimesNet | 0.417 | 0.431 | 4.9 M |

Note: The above accuracy metrics are measured on the ETTh1 dataset (evaluation results on the test set test.csv).

## Time Series Anomaly Detection Module

| Model Name | Precision | Recall | F1-Score | Model Size |
|---|---|---|---|---|
| AutoEncoder_ad | 99.36 | 84.36 | 91.25 | 52 K |
| DLinear_ad | 98.98 | 93.96 | 96.41 | 112 K |
| Nonstationary_ad | 98.55 | 88.95 | 93.51 | 1.8 M |
| PatchTST_ad | 98.78 | 90.70 | 94.57 | 320 K |
| TimesNet_ad | 98.37 | 94.80 | 96.56 | 1.3 M |

Note: The above accuracy metrics are measured on the PSM dataset.

## Time Series Classification Module

| Model Name | Acc (%) | Model Size |
|---|---|---|
| TimesNet_cls | 87.5 | 792 K |

Note: The above accuracy metrics are measured on the UWaveGestureLibrary dataset (Training and Evaluation splits).