---
comments: true
---

# PaddleX Model List (Huawei Ascend NPU)

PaddleX provides multiple pipelines. Each pipeline contains several modules, and each module offers a range of models. Use the benchmark data below to choose among them: if accuracy matters most, pick a model with higher accuracy; if storage footprint matters most, pick a model with a smaller size.

## Image Classification Module

| Model Name | Top-1 Accuracy (%) | Model Size |
| --- | --- | --- |
| CLIP_vit_base_patch16_224 | 85.36 | 306.5 M |
| CLIP_vit_large_patch14_224 | 88.1 | 1.04 G |
| ConvNeXt_base_224 | 83.84 | 313.9 M |
| ConvNeXt_base_384 | 84.90 | 313.9 M |
| ConvNeXt_large_224 | 84.26 | 700.7 M |
| ConvNeXt_large_384 | 85.27 | 700.7 M |
| ConvNeXt_small | 83.13 | 178.0 M |
| ConvNeXt_tiny | 82.03 | 101.4 M |
| MobileNetV1_x0_5 | 63.5 | 4.8 M |
| MobileNetV1_x0_25 | 51.4 | 1.8 M |
| MobileNetV1_x0_75 | 68.8 | 9.3 M |
| MobileNetV1_x1_0 | 71.0 | 15.2 M |
| MobileNetV2_x0_5 | 65.0 | 7.1 M |
| MobileNetV2_x0_25 | 53.2 | 5.5 M |
| MobileNetV2_x1_0 | 72.2 | 12.6 M |
| MobileNetV2_x1_5 | 74.1 | 25.0 M |
| MobileNetV2_x2_0 | 75.2 | 41.2 M |
| MobileNetV3_large_x0_5 | 69.2 | 9.6 M |
| MobileNetV3_large_x0_35 | 64.3 | 7.5 M |
| MobileNetV3_large_x0_75 | 73.1 | 14.0 M |
| MobileNetV3_large_x1_0 | 75.3 | 19.5 M |
| MobileNetV3_large_x1_25 | 76.4 | 26.5 M |
| MobileNetV3_small_x0_5 | 59.2 | 6.8 M |
| MobileNetV3_small_x0_35 | 53.0 | 6.0 M |
| MobileNetV3_small_x0_75 | 66.0 | 8.5 M |
| MobileNetV3_small_x1_0 | 68.2 | 10.5 M |
| MobileNetV3_small_x1_25 | 70.7 | 13.0 M |
| MobileNetV4_conv_large | 83.4 | 125.2 M |
| MobileNetV4_conv_medium | 79.9 | 37.6 M |
| MobileNetV4_conv_small | 74.6 | 14.7 M |
| PP-HGNet_base | 85.0 | 249.4 M |
| PP-HGNet_small | 81.51 | 86.5 M |
| PP-HGNet_tiny | 79.83 | 52.4 M |
| PP-HGNetV2-B0 | 77.77 | 21.4 M |
| PP-HGNetV2-B1 | 79.18 | 22.6 M |
| PP-HGNetV2-B2 | 81.74 | 39.9 M |
| PP-HGNetV2-B3 | 82.98 | 57.9 M |
| PP-HGNetV2-B4 | 83.57 | 70.4 M |
| PP-HGNetV2-B5 | 84.75 | 140.8 M |
| PP-HGNetV2-B6 | 86.30 | 268.4 M |
| PP-LCNet_x0_5 | 63.14 | 6.7 M |
| PP-LCNet_x0_25 | 51.86 | 5.5 M |
| PP-LCNet_x0_35 | 58.09 | 5.9 M |
| PP-LCNet_x0_75 | 68.18 | 8.4 M |
| PP-LCNet_x1_0 | 71.32 | 10.5 M |
| PP-LCNet_x1_5 | 73.71 | 16.0 M |
| PP-LCNet_x2_0 | 75.18 | 23.2 M |
| PP-LCNet_x2_5 | 76.60 | 32.1 M |
| PP-LCNetV2_base | 77.05 | 23.7 M |
| PP-LCNetV2_large | 78.51 | 37.3 M |
| PP-LCNetV2_small | 73.97 | 14.6 M |
| ResNet18_vd | 72.3 | 41.5 M |
| ResNet18 | 71.0 | 41.5 M |
| ResNet34_vd | 76.0 | 77.3 M |
| ResNet34 | 74.6 | 77.3 M |
| ResNet50_vd | 79.1 | 90.8 M |
| ResNet50 | 76.5 | 90.8 M |
| ResNet101_vd | 80.2 | 158.4 M |
| ResNet101 | 77.6 | 158.7 M |
| ResNet152_vd | 80.6 | 214.3 M |
| ResNet152 | 78.3 | 214.2 M |
| ResNet200_vd | 80.9 | 266.0 M |
| SwinTransformer_base_patch4_window7_224 | 83.37 | 310.5 M |
| SwinTransformer_base_patch4_window12_384 | 84.17 | 311.4 M |
| SwinTransformer_large_patch4_window7_224 | 86.19 | 694.8 M |
| SwinTransformer_large_patch4_window12_384 | 87.06 | 696.1 M |
| SwinTransformer_small_patch4_window7_224 | 83.21 | 175.6 M |
| SwinTransformer_tiny_patch4_window7_224 | 81.10 | 100.1 M |
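
The selection advice in the introduction (favor accuracy, or favor size) can be combined into one rule: take the most accurate model that fits a size budget. The sketch below is illustrative only. `MODELS` holds a handful of rows copied from the table above, and `pick_model` and `max_size_mb` are hypothetical names, not part of PaddleX.

```python
from typing import Optional

# A few (name, Top-1 accuracy in %, size in M) rows copied from the table above.
MODELS = [
    ("MobileNetV3_small_x1_0", 68.2, 10.5),
    ("PP-LCNet_x1_0", 71.32, 10.5),
    ("PP-HGNetV2-B0", 77.77, 21.4),
    ("ResNet50", 76.5, 90.8),
    ("PP-HGNetV2-B6", 86.30, 268.4),
]


def pick_model(max_size_mb: float) -> Optional[str]:
    """Return the most accurate model whose size fits the budget, or None."""
    candidates = [m for m in MODELS if m[2] <= max_size_mb]
    if not candidates:
        return None
    # Highest accuracy wins; on a tie, prefer the smaller model.
    best = max(candidates, key=lambda m: (m[1], -m[2]))
    return best[0]


print(pick_model(25.0))  # PP-HGNetV2-B0
print(pick_model(5.0))   # None
```

With a 25 M budget the rule skips both 10.5 M models in favor of PP-HGNetV2-B0, which is larger but noticeably more accurate.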

Note: The above accuracy metrics are Top-1 Accuracy on the [ImageNet-1k](https://www.image-net.org/index.php) validation set.

## [Image Multi-Label Classification Module](../module_usage/tutorials/cv_modules/image_multilabel_classification.md)

| Model Name | mAP (%) | Model Size |
| --- | --- | --- |
| CLIP_vit_base_patch16_448_ML | 89.15 | 325.6 M |
| PP-HGNetV2-B0_ML | 80.98 | 39.6 M |
| PP-HGNetV2-B4_ML | 87.96 | 88.5 M |
| PP-HGNetV2-B6_ML | 91.25 | 286.5 M |

Note: The above accuracy metrics are mAP on the [COCO2017](https://cocodataset.org/#home) multi-label classification task.

## Object Detection Module

| Model Name | mAP (%) | Model Size |
| --- | --- | --- |
| Cascade-FasterRCNN-ResNet50-FPN | 41.1 | 245.4 M |
| Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN | 45.0 | 246.2 M |
| CenterNet-DLA-34 | 37.6 | 75.4 M |
| CenterNet-ResNet50 | 38.9 | 319.7 M |
| DETR-R50 | 42.3 | 159.3 M |
| FasterRCNN-ResNet34-FPN | 37.8 | 137.5 M |
| FasterRCNN-ResNet50 | 36.7 | 120.2 M |
| FasterRCNN-ResNet50-FPN | 38.4 | 148.1 M |
| FasterRCNN-ResNet50-vd-FPN | 39.5 | 148.1 M |
| FasterRCNN-ResNet50-vd-SSLDv2-FPN | 41.4 | 148.1 M |
| FasterRCNN-ResNet101 | 39.0 | 188.1 M |
| FasterRCNN-ResNet101-FPN | 41.4 | 216.3 M |
| FasterRCNN-ResNeXt101-vd-FPN | 43.4 | 360.6 M |
| FasterRCNN-Swin-Tiny-FPN | 42.6 | 159.8 M |
| FCOS-ResNet50 | 39.6 | 124.2 M |
| PicoDet-L | 42.6 | 20.9 M |
| PicoDet-M | 37.5 | 16.8 M |
| PicoDet-S | 29.1 | 4.4 M |
| PicoDet-XS | 26.2 | 5.7 M |
| PP-YOLOE_plus-L | 52.9 | 185.3 M |
| PP-YOLOE_plus-M | 49.8 | 83.2 M |
| PP-YOLOE_plus-S | 43.7 | 28.3 M |
| PP-YOLOE_plus-X | 54.7 | 349.4 M |
| RT-DETR-H | 56.3 | 435.8 M |
| RT-DETR-L | 53.0 | 113.7 M |
| RT-DETR-R18 | 46.5 | 70.7 M |
| RT-DETR-R50 | 53.1 | 149.1 M |
| RT-DETR-X | 54.8 | 232.9 M |
| YOLOv3-DarkNet53 | 39.1 | 219.7 M |
| YOLOv3-MobileNetV3 | 31.4 | 83.8 M |
| YOLOv3-ResNet50_vd_DCN | 40.6 | 163.0 M |
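
The detection tables in this document report mAP(0.5:0.95). In COCO-style evaluation this means average precision is averaged over ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the IoU part of that definition, not a full mAP computation; `box_iou` and `thresholds_matched` are hypothetical helper names and the boxes are made-up values.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "0.5:0.95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(pred, gt):
    """Thresholds at which `pred` would count as a correct match for `gt`."""
    iou = box_iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if iou >= t]


gt = (0, 0, 10, 10)
pred = (1, 1, 10, 10)                      # IoU = 81 / 100 = 0.81
print(len(thresholds_matched(pred, gt)))   # 7 (0.50 through 0.80)
```

A prediction that is only roughly aligned passes the loose thresholds but fails the strict ones, which is why mAP(0.5:0.95) rewards tight localization more than mAP at a single 0.5 threshold.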

Note: The above accuracy metrics are mAP(0.5:0.95) on the [COCO2017](https://cocodataset.org/#home) validation set.

## [Small Object Detection Module](../module_usage/tutorials/cv_modules/small_object_detection.md)

| Model Name | mAP (%) | Model Size |
| --- | --- | --- |
| PP-YOLOE_plus_SOD-S | 25.1 | 77.3 M |
| PP-YOLOE_plus_SOD-L | 31.9 | 325.0 M |
| PP-YOLOE_plus_SOD-largesize-L | 42.7 | 340.5 M |

Note: The above accuracy metrics are mAP(0.5:0.95) on the [VisDrone-DET](https://github.com/VisDrone/VisDrone-Dataset) validation set.

## [Pedestrian Detection Module](../module_usage/tutorials/cv_modules/human_detection.md)

| Model Name | mAP (%) | Model Size |
| --- | --- | --- |
| PP-YOLOE-L_human | 48.0 | 196.1 M |
| PP-YOLOE-S_human | 42.5 | 28.8 M |

Note: The above accuracy metrics are mAP(0.5:0.95) on the [CrowdHuman](https://bj.bcebos.com/v1/paddledet/data/crowdhuman.zip) validation set.

## Semantic Segmentation Module

| Model Name | mIoU (%) | Model Size |
| --- | --- | --- |
| Deeplabv3_Plus-R50 | 80.36 | 94.9 M |
| Deeplabv3_Plus-R101 | 81.10 | 162.5 M |
| Deeplabv3-R50 | 79.90 | 138.3 M |
| Deeplabv3-R101 | 80.85 | 205.9 M |
| OCRNet_HRNet-W48 | 82.15 | 249.8 M |
| PP-LiteSeg-T | 73.10 | 28.5 M |
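
For readers unfamiliar with the mIoU column: mean IoU is the per-class intersection over union, averaged over classes. The sketch below computes it from a pixel-level confusion matrix. It is a minimal illustration on a made-up two-class matrix, not the evaluation code behind the table, and `mean_iou` is a hypothetical helper name.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    Rows are ground-truth classes, columns are predicted classes,
    and each entry is a pixel count.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(n)) - tp     # pixels wrongly labelled c
        denom = tp + fp + fn
        if denom > 0:   # skip classes absent from both ground truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: class 0 IoU = 8/11, class 1 IoU = 9/12.
confusion = [[8, 2],
             [1, 9]]
print(round(100 * mean_iou(confusion), 2))  # 73.86
```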

Note: The above accuracy metrics are mIoU on the [Cityscapes](https://www.cityscapes-dataset.com/) dataset.

## Instance Segmentation Module

| Model Name | Mask AP | Model Size |
| --- | --- | --- |
| Mask-RT-DETR-H | 50.6 | 449.9 M |
| Mask-RT-DETR-L | 45.7 | 113.6 M |
| Mask-RT-DETR-M | 42.7 | 66.6 M |
| Mask-RT-DETR-S | 41.0 | 51.8 M |
| Mask-RT-DETR-X | 47.5 | 237.5 M |
| Cascade-MaskRCNN-ResNet50-FPN | 36.3 | 254.8 M |
| Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN | 39.1 | 254.7 M |
| MaskRCNN-ResNet50-FPN | 35.6 | 157.5 M |
| MaskRCNN-ResNet50-vd-FPN | 36.4 | 157.5 M |
| MaskRCNN-ResNet50 | 32.8 | 127.8 M |
| MaskRCNN-ResNet101-FPN | 36.6 | 225.4 M |
| MaskRCNN-ResNet101-vd-FPN | 38.1 | 225.1 M |
| MaskRCNN-ResNeXt101-vd-FPN | 39.5 | 370.0 M |
| PP-YOLOE_seg-S | 32.5 | 31.5 M |

Note: The above accuracy metrics are Mask AP(0.5:0.95) on the [COCO2017](https://cocodataset.org/#home) validation set.

## [Image Feature Module](../module_usage/tutorials/cv_modules/image_feature.md)

| Model Name | recall@1 (%) | Model Size |
| --- | --- | --- |
| PP-ShiTuV2_rec_CLIP_vit_base | 88.69 | 306.6 M |
| PP-ShiTuV2_rec_CLIP_vit_large | 91.03 | 1.05 G |

Note: The above accuracy metrics are recall@1 on AliProducts.

## [Mainbody Detection Module](../module_usage/tutorials/cv_modules/mainbody_detection.md)

| Model Name | mAP (%) | Model Size |
| --- | --- | --- |
| PP-ShiTuV2_det | 41.5 | 27.6 M |

Note: The above accuracy metrics are mAP(0.5:0.95) on the [PaddleClas mainbody detection dataset](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/training/PP-ShiTu/mainbody_detection.md).

## [Vehicle Detection Module](../module_usage/tutorials/cv_modules/vehicle_detection.md)

| Model Name | mAP (%) | Model Size |
| --- | --- | --- |
| PP-YOLOE-L_vehicle | 63.9 | 196.1 M |
| PP-YOLOE-S_vehicle | 61.3 | 28.8 M |

Note: The above accuracy metrics are mAP(0.5:0.95) on the [PPVehicle](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/ppvehicle) validation set.

## [Anomaly Detection Module](../module_usage/tutorials/cv_modules/anomaly_detection.md)

| Model Name | Avg (%) | Model Size |
| --- | --- | --- |
| STFPM | 96.2 | 21.5 M |

Note: The above accuracy metric is the average anomaly score on the [MVTec AD](https://www.mvtec.com/company/research/datasets/mvtec-ad) validation set.

## Text Detection Module

| Model Name | Detection Hmean (%) | Model Size |
| --- | --- | --- |
| PP-OCRv4_mobile_det | 77.79 | 4.2 M |
| PP-OCRv4_server_det | 82.69 | 100.1 M |

Note: The above accuracy metrics are evaluated on PaddleOCR's self-built Chinese dataset, covering street scenes, web images, documents, and handwritten scenarios, with 500 images for detection.

## Text Recognition Module

| Model Name | Recognition Avg Accuracy (%) | Model Size |
| --- | --- | --- |
| PP-OCRv4_mobile_rec | 78.20 | 10.6 M |
| PP-OCRv4_server_rec | 79.20 | 71.2 M |

Note: The above accuracy metrics are evaluated on PaddleOCR's self-built Chinese dataset, covering street scenes, web images, documents, and handwritten scenarios, with 11,000 images for text recognition.

| Model Name | Recognition Avg Accuracy (%) | Model Size |
| --- | --- | --- |
| ch_SVTRv2_rec | 68.81 | 73.9 M |

Note: The above accuracy metrics are evaluated on the [PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition](https://aistudio.baidu.com/competition/detail/1131/0/introduction) A-Rank.

| Model Name | Recognition Avg Accuracy (%) | Model Size |
| --- | --- | --- |
| ch_RepSVTR_rec | 65.07 | 22.1 M |

## Table Structure Recognition Module

| Model Name | Accuracy (%) | Model Size |
| --- | --- | --- |
| SLANet | 76.31 | 6.9 M |

Note: The above accuracy metrics are measured on the PubtabNet English table recognition dataset.

## Layout Analysis Module

| Model Name | mAP (%) | Model Size |
| --- | --- | --- |
| PicoDet_layout_1x | 86.8 | 7.4 M |
| PicoDet-L_layout_3cls | 89.3 | 22.6 M |
| RT-DETR-H_layout_3cls | 95.9 | 470.1 M |
| RT-DETR-H_layout_17cls | 92.6 | 470.2 M |

Note: The evaluation set for the above accuracy metrics is PaddleOCR's self-built layout analysis dataset, containing 10,000 images.

## Time Series Forecasting Module

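| Model Name | MSE | MAE | Model Size |
| --- | --- | --- | --- |
| DLinear | 0.382 | 0.394 | 72 K |
| NLinear | 0.386 | 0.392 | 40 K |
| Nonstationary | 0.600 | 0.515 | 55.5 M |
| PatchTST | 0.385 | 0.397 | 2.0 M |
| RLinear | 0.384 | 0.392 | 40 K |
| TiDE | 0.405 | 0.412 | 31.7 M |
| TimesNet | 0.417 | 0.431 | 4.9 M |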
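
The MSE and MAE columns are the standard mean squared error and mean absolute error between forecast and ground truth, so lower is better for both. The sketch below shows the two definitions on toy values. It does not reproduce the benchmark numbers; the series and the helper names `mse` and `mae` are made up for illustration, and any preprocessing applied before scoring is not specified here.

```python
def mse(y_true, y_pred):
    """Mean squared error: penalizes large errors quadratically."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: penalizes all errors linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 5.0]   # errors: -0.5, 0, 1, -1

print(mse(y_true, y_pred))  # 0.5625
print(mae(y_true, y_pred))  # 0.625
```

Because MSE squares each error, a model with occasional large misses can rank worse on MSE than on MAE, which is worth keeping in mind when two rows in the table order differently on the two columns.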

Note: The above accuracy metrics are measured on the [ETTH1](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/Etth1.tar) dataset (evaluation results on the test set test.csv).

## Time Series Anomaly Detection Module

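| Model Name | Precision (%) | Recall (%) | F1-Score (%) | Model Size |
| --- | --- | --- | --- | --- |
| AutoEncoder_ad | 99.36 | 84.36 | 91.25 | 52 K |
| DLinear_ad | 98.98 | 93.96 | 96.41 | 112 K |
| Nonstationary_ad | 98.55 | 88.95 | 93.51 | 1.8 M |
| PatchTST_ad | 98.78 | 90.70 | 94.57 | 320 K |
| TimesNet_ad | 98.37 | 94.80 | 96.56 | 1.3 M |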
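
The F1-Score column is the harmonic mean of the Precision and Recall columns, which makes the table easy to sanity-check. The sketch below recomputes F1 from the published precision and recall values; `f1_score` here is a small local helper written for this example, not a PaddleX function. Because the published inputs are already rounded to two decimals, the recomputed values can differ from the table by about 0.01.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, precision, recall, published F1) copied from the table above.
ROWS = [
    ("AutoEncoder_ad", 99.36, 84.36, 91.25),
    ("DLinear_ad", 98.98, 93.96, 96.41),
    ("Nonstationary_ad", 98.55, 88.95, 93.51),
    ("PatchTST_ad", 98.78, 90.70, 94.57),
    ("TimesNet_ad", 98.37, 94.80, 96.56),
]

for name, p, r, published in ROWS:
    recomputed = f1_score(p, r)
    # Allow a small tolerance for rounding in the published inputs.
    print(name, round(recomputed, 2), abs(recomputed - published) < 0.02)
```

The harmonic mean is pulled toward the lower of the two inputs, which is why AutoEncoder_ad has the highest precision in the table but the lowest F1.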

Note: The above accuracy metrics are measured on the [PSM](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ts_anomaly_examples.tar) dataset.

## Time Series Classification Module

| Model Name | Acc (%) | Model Size |
| --- | --- | --- |
| TimesNet_cls | 87.5 | 792 K |

Note: The above accuracy metrics are measured on the UWaveGestureLibrary dataset ([Training](https://paddlets.bj.bcebos.com/classification/UWaveGestureLibrary_TRAIN.csv), [Evaluation](https://paddlets.bj.bcebos.com/classification/UWaveGestureLibrary_TEST.csv)).