---
comments: true
---

# PaddleX Model List (CPU/GPU)

PaddleX includes multiple pipelines; each pipeline contains several modules, and each module offers several models. Choose which models to use based on the benchmark data below: pick models with higher accuracy if accuracy matters most, models with faster inference if speed matters most, or models with smaller storage size if disk footprint matters most.

## [Image Classification Module](../module_usage/tutorials/cv_modules/image_classification.en.md)

| Model Name | Top1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| CLIP_vit_base_patch16_224 | 85.36 | 12.84 / 2.82 | 60.52 / 60.52 | 306.5 M | CLIP_vit_base_patch16_224.yaml | Inference Model/Training Model |
| CLIP_vit_large_patch14_224 | 88.1 | 51.72 / 11.13 | 238.07 / 238.07 | 1.04 G | CLIP_vit_large_patch14_224.yaml | Inference Model/Training Model |
| ConvNeXt_base_224 | 83.84 | 13.18 / 12.14 | 128.39 / 81.78 | 313.9 M | ConvNeXt_base_224.yaml | Inference Model/Training Model |
| ConvNeXt_base_384 | 84.90 | 32.15 / 30.52 | 279.36 / 220.35 | 313.9 M | ConvNeXt_base_384.yaml | Inference Model/Training Model |
| ConvNeXt_large_224 | 84.26 | 26.51 / 7.21 | 213.32 / 157.22 | 700.7 M | ConvNeXt_large_224.yaml | Inference Model/Training Model |
| ConvNeXt_large_384 | 85.27 | 67.07 / 65.26 | 494.04 / 438.97 | 700.7 M | ConvNeXt_large_384.yaml | Inference Model/Training Model |
| ConvNeXt_small | 83.13 | 9.05 / 8.21 | 97.94 / 55.29 | 178.0 M | ConvNeXt_small.yaml | Inference Model/Training Model |
| ConvNeXt_tiny | 82.03 | 5.12 / 2.06 | 63.96 / 29.77 | 101.4 M | ConvNeXt_tiny.yaml | Inference Model/Training Model |
| FasterNet-L | 83.5 | 15.67 / 3.10 | 52.24 / 52.24 | 357.1 M | FasterNet-L.yaml | Inference Model/Training Model |
| FasterNet-M | 83.0 | 9.72 / 2.30 | 35.29 / 35.29 | 204.6 M | FasterNet-M.yaml | Inference Model/Training Model |
| FasterNet-S | 81.3 | 5.46 / 1.27 | 20.46 / 18.03 | 119.3 M | FasterNet-S.yaml | Inference Model/Training Model |
| FasterNet-T0 | 71.9 | 4.18 / 0.60 | 6.34 / 3.44 | 15.1 M | FasterNet-T0.yaml | Inference Model/Training Model |
| FasterNet-T1 | 75.9 | 4.24 / 0.64 | 9.57 / 5.20 | 29.2 M | FasterNet-T1.yaml | Inference Model/Training Model |
| FasterNet-T2 | 79.1 | 3.87 / 0.78 | 11.14 / 9.98 | 57.4 M | FasterNet-T2.yaml | Inference Model/Training Model |
| MobileNetV1_x0_5 | 63.5 | 1.39 / 0.28 | 2.74 / 1.02 | 4.8 M | MobileNetV1_x0_5.yaml | Inference Model/Training Model |
| MobileNetV1_x0_25 | 51.4 | 1.32 / 0.30 | 2.04 / 0.58 | 1.8 M | MobileNetV1_x0_25.yaml | Inference Model/Training Model |
| MobileNetV1_x0_75 | 68.8 | 1.75 / 0.33 | 3.41 / 1.57 | 9.3 M | MobileNetV1_x0_75.yaml | Inference Model/Training Model |
| MobileNetV1_x1_0 | 71.0 | 1.89 / 0.34 | 4.01 / 2.17 | 15.2 M | MobileNetV1_x1_0.yaml | Inference Model/Training Model |
| MobileNetV2_x0_5 | 65.0 | 3.17 / 0.48 | 4.52 / 1.35 | 7.1 M | MobileNetV2_x0_5.yaml | Inference Model/Training Model |
| MobileNetV2_x0_25 | 53.2 | 2.80 / 0.46 | 3.92 / 0.98 | 5.5 M | MobileNetV2_x0_25.yaml | Inference Model/Training Model |
| MobileNetV2_x1_0 | 72.2 | 3.57 / 0.49 | 5.63 / 2.51 | 12.6 M | MobileNetV2_x1_0.yaml | Inference Model/Training Model |
| MobileNetV2_x1_5 | 74.1 | 3.58 / 0.62 | 8.02 / 4.49 | 25.0 M | MobileNetV2_x1_5.yaml | Inference Model/Training Model |
| MobileNetV2_x2_0 | 75.2 | 3.56 / 0.74 | 10.24 / 6.83 | 41.2 M | MobileNetV2_x2_0.yaml | Inference Model/Training Model |
| MobileNetV3_large_x0_5 | 69.2 | 3.79 / 0.62 | 6.76 / 1.61 | 9.6 M | MobileNetV3_large_x0_5.yaml | Inference Model/Training Model |
| MobileNetV3_large_x0_35 | 64.3 | 3.70 / 0.60 | 5.54 / 1.41 | 7.5 M | MobileNetV3_large_x0_35.yaml | Inference Model/Training Model |
| MobileNetV3_large_x0_75 | 73.1 | 4.82 / 0.66 | 7.45 / 2.00 | 14.0 M | MobileNetV3_large_x0_75.yaml | Inference Model/Training Model |
| MobileNetV3_large_x1_0 | 75.3 | 4.86 / 0.68 | 6.88 / 2.61 | 19.5 M | MobileNetV3_large_x1_0.yaml | Inference Model/Training Model |
| MobileNetV3_large_x1_25 | 76.4 | 5.08 / 0.71 | 7.37 / 3.58 | 26.5 M | MobileNetV3_large_x1_25.yaml | Inference Model/Training Model |
| MobileNetV3_small_x0_5 | 59.2 | 3.41 / 0.57 | 5.60 / 1.14 | 6.8 M | MobileNetV3_small_x0_5.yaml | Inference Model/Training Model |
| MobileNetV3_small_x0_35 | 53.0 | 3.49 / 0.60 | 4.63 / 1.07 | 6.0 M | MobileNetV3_small_x0_35.yaml | Inference Model/Training Model |
| MobileNetV3_small_x0_75 | 66.0 | 3.49 / 0.60 | 5.19 / 1.28 | 8.5 M | MobileNetV3_small_x0_75.yaml | Inference Model/Training Model |
| MobileNetV3_small_x1_0 | 68.2 | 3.76 / 0.53 | 5.11 / 1.43 | 10.5 M | MobileNetV3_small_x1_0.yaml | Inference Model/Training Model |
| MobileNetV3_small_x1_25 | 70.7 | 4.23 / 0.58 | 6.48 / 1.68 | 13.0 M | MobileNetV3_small_x1_25.yaml | Inference Model/Training Model |
| MobileNetV4_conv_large | 83.4 | 8.33 / 2.24 | 33.56 / 23.70 | 125.2 M | MobileNetV4_conv_large.yaml | Inference Model/Training Model |
| MobileNetV4_conv_medium | 79.9 | 6.81 / 0.92 | 12.47 / 6.27 | 37.6 M | MobileNetV4_conv_medium.yaml | Inference Model/Training Model |
| MobileNetV4_conv_small | 74.6 | 3.25 / 0.46 | 4.42 / 1.54 | 14.7 M | MobileNetV4_conv_small.yaml | Inference Model/Training Model |
| MobileNetV4_hybrid_large | 83.8 | 12.27 / 4.18 | 58.64 / 58.64 | 145.1 M | MobileNetV4_hybrid_large.yaml | Inference Model/Training Model |
| MobileNetV4_hybrid_medium | 80.5 | 12.08 / 1.34 | 24.69 / 8.10 | 42.9 M | MobileNetV4_hybrid_medium.yaml | Inference Model/Training Model |
| PP-HGNet_base | 85.0 | 14.10 / 4.19 | 68.92 / 68.92 | 249.4 M | PP-HGNet_base.yaml | Inference Model/Training Model |
| PP-HGNet_small | 81.51 | 5.12 / 1.73 | 25.01 / 25.01 | 86.5 M | PP-HGNet_small.yaml | Inference Model/Training Model |
| PP-HGNet_tiny | 79.83 | 3.28 / 1.29 | 16.40 / 15.97 | 52.4 M | PP-HGNet_tiny.yaml | Inference Model/Training Model |
| PP-HGNetV2-B0 | 77.77 | 3.83 / 0.57 | 9.95 / 2.37 | 21.4 M | PP-HGNetV2-B0.yaml | Inference Model/Training Model |
| PP-HGNetV2-B1 | 79.18 | 3.87 / 0.62 | 8.77 / 3.79 | 22.6 M | PP-HGNetV2-B1.yaml | Inference Model/Training Model |
| PP-HGNetV2-B2 | 81.74 | 5.73 / 0.86 | 15.11 / 7.05 | 39.9 M | PP-HGNetV2-B2.yaml | Inference Model/Training Model |
| PP-HGNetV2-B3 | 82.98 | 6.26 / 1.01 | 18.47 / 10.34 | 57.9 M | PP-HGNetV2-B3.yaml | Inference Model/Training Model |
| PP-HGNetV2-B4 | 83.57 | 5.47 / 1.10 | 14.42 / 9.89 | 70.4 M | PP-HGNetV2-B4.yaml | Inference Model/Training Model |
| PP-HGNetV2-B5 | 84.75 | 10.24 / 1.96 | 29.71 / 29.71 | 140.8 M | PP-HGNetV2-B5.yaml | Inference Model/Training Model |
| PP-HGNetV2-B6 | 86.30 | 12.25 / 3.76 | 62.29 / 62.29 | 268.4 M | PP-HGNetV2-B6.yaml | Inference Model/Training Model |
| PP-LCNet_x0_5 | 63.14 | 2.28 / 0.42 | 2.86 / 0.83 | 6.7 M | PP-LCNet_x0_5.yaml | Inference Model/Training Model |
| PP-LCNet_x0_25 | 51.86 | 1.89 / 0.45 | 2.49 / 0.68 | 5.5 M | PP-LCNet_x0_25.yaml | Inference Model/Training Model |
| PP-LCNet_x0_35 | 58.09 | 1.94 / 0.41 | 2.73 / 0.77 | 5.9 M | PP-LCNet_x0_35.yaml | Inference Model/Training Model |
| PP-LCNet_x0_75 | 68.18 | 2.30 / 0.41 | 2.95 / 1.07 | 8.4 M | PP-LCNet_x0_75.yaml | Inference Model/Training Model |
| PP-LCNet_x1_0 | 71.32 | 2.35 / 0.47 | 4.03 / 1.35 | 10.5 M | PP-LCNet_x1_0.yaml | Inference Model/Training Model |
| PP-LCNet_x1_5 | 73.71 | 2.33 / 0.53 | 4.17 / 2.29 | 16.0 M | PP-LCNet_x1_5.yaml | Inference Model/Training Model |
| PP-LCNet_x2_0 | 75.18 | 2.40 / 0.51 | 5.37 / 3.46 | 23.2 M | PP-LCNet_x2_0.yaml | Inference Model/Training Model |
| PP-LCNet_x2_5 | 76.60 | 2.36 / 0.61 | 6.29 / 5.05 | 32.1 M | PP-LCNet_x2_5.yaml | Inference Model/Training Model |
| PP-LCNetV2_base | 77.05 | 3.33 / 0.55 | 6.86 / 3.77 | 23.7 M | PP-LCNetV2_base.yaml | Inference Model/Training Model |
| PP-LCNetV2_large | 78.51 | 4.37 / 0.71 | 9.43 / 8.07 | 37.3 M | PP-LCNetV2_large.yaml | Inference Model/Training Model |
| PP-LCNetV2_small | 73.97 | 2.53 / 0.41 | 5.14 / 1.98 | 14.6 M | PP-LCNetV2_small.yaml | Inference Model/Training Model |
| ResNet18_vd | 72.3 | 2.47 / 0.61 | 6.97 / 5.15 | 41.5 M | ResNet18_vd.yaml | Inference Model/Training Model |
| ResNet18 | 71.0 | 2.35 / 0.67 | 6.35 / 4.61 | 41.5 M | ResNet18.yaml | Inference Model/Training Model |
| ResNet34_vd | 76.0 | 4.01 / 1.03 | 11.99 / 9.86 | 77.3 M | ResNet34_vd.yaml | Inference Model/Training Model |
| ResNet34 | 74.6 | 3.99 / 1.02 | 12.42 / 9.81 | 77.3 M | ResNet34.yaml | Inference Model/Training Model |
| ResNet50_vd | 79.1 | 6.04 / 1.16 | 16.08 / 12.07 | 90.8 M | ResNet50_vd.yaml | Inference Model/Training Model |
| ResNet50 | 76.5 | 6.44 / 1.16 | 15.04 / 11.63 | 90.8 M | ResNet50.yaml | Inference Model/Training Model |
| ResNet101_vd | 80.2 | 11.16 / 2.07 | 32.14 / 32.14 | 158.4 M | ResNet101_vd.yaml | Inference Model/Training Model |
| ResNet101 | 77.6 | 10.91 / 2.06 | 31.14 / 22.93 | 158.7 M | ResNet101.yaml | Inference Model/Training Model |
| ResNet152_vd | 80.6 | 15.96 / 2.99 | 49.33 / 49.33 | 214.3 M | ResNet152_vd.yaml | Inference Model/Training Model |
| ResNet152 | 78.3 | 15.61 / 2.90 | 47.33 / 36.60 | 214.2 M | ResNet152.yaml | Inference Model/Training Model |
| ResNet200_vd | 80.9 | 24.20 / 3.69 | 62.62 / 62.62 | 266.0 M | ResNet200_vd.yaml | Inference Model/Training Model |
| StarNet-S1 | 73.6 | 6.33 / 1.98 | 7.56 / 3.26 | 11.2 M | StarNet-S1.yaml | Inference Model/Training Model |
| StarNet-S2 | 74.8 | 4.49 / 1.55 | 7.38 / 3.38 | 14.3 M | StarNet-S2.yaml | Inference Model/Training Model |
| StarNet-S3 | 77.0 | 6.70 / 1.62 | 11.05 / 4.76 | 22.2 M | StarNet-S3.yaml | Inference Model/Training Model |
| StarNet-S4 | 79.0 | 8.50 / 2.86 | 15.40 / 6.76 | 28.9 M | StarNet-S4.yaml | Inference Model/Training Model |
| SwinTransformer_base_patch4_window7_224 | 83.37 | 14.29 / 5.13 | 130.89 / 130.89 | 310.5 M | SwinTransformer_base_patch4_window7_224.yaml | Inference Model/Training Model |
| SwinTransformer_base_patch4_window12_384 | 84.17 | 37.74 / 10.10 | 362.56 / 362.56 | 311.4 M | SwinTransformer_base_patch4_window12_384.yaml | Inference Model/Training Model |
| SwinTransformer_large_patch4_window7_224 | 86.19 | 26.48 / 7.94 | 228.23 / 228.23 | 694.8 M | SwinTransformer_large_patch4_window7_224.yaml | Inference Model/Training Model |
| SwinTransformer_large_patch4_window12_384 | 87.06 | 74.72 / 18.16 | 652.04 / 652.04 | 696.1 M | SwinTransformer_large_patch4_window12_384.yaml | Inference Model/Training Model |
| SwinTransformer_small_patch4_window7_224 | 83.21 | 10.37 / 3.90 | 94.20 / 94.20 | 175.6 M | SwinTransformer_small_patch4_window7_224.yaml | Inference Model/Training Model |
| SwinTransformer_tiny_patch4_window7_224 | 81.10 | 6.66 / 2.15 | 60.45 / 60.45 | 100.1 M | SwinTransformer_tiny_patch4_window7_224.yaml | Inference Model/Training Model |
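
Any model name in these tables can be loaded by name through the PaddleX single-model Python API. The following is a minimal inference sketch, assuming a PaddleX 3.x installation; `general_image.png` is a hypothetical local path, and `PP-LCNet_x1_0` is one of the names from the table above:

```python
# Minimal single-model inference sketch (assumes the PaddleX 3.x Python API).
# "general_image.png" is a hypothetical local file path.
from paddlex import create_model

model = create_model("PP-LCNet_x1_0")            # any name from the tables
output = model.predict("general_image.png", batch_size=1)

for res in output:
    res.print()                    # print the top-k labels and scores
    res.save_to_json("./output/")  # persist the prediction as JSON
```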

## Image Multi-Label Classification Module

| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| CLIP_vit_base_patch16_448_ML | 89.15 | 54.75 / 14.30 | 280.23 / 280.23 | 325.6 M | CLIP_vit_base_patch16_448_ML.yaml | Inference Model/Training Model |
| PP-HGNetV2-B0_ML | 80.98 | 6.47 / 1.38 | 21.56 / 13.69 | 39.6 M | PP-HGNetV2-B0_ML.yaml | Inference Model/Training Model |
| PP-HGNetV2-B4_ML | 87.96 | 9.63 / 2.79 | 43.98 / 36.63 | 88.5 M | PP-HGNetV2-B4_ML.yaml | Inference Model/Training Model |
| PP-HGNetV2-B6_ML | 91.06 | 37.07 / 9.43 | 188.58 / 188.58 | 286.5 M | PP-HGNetV2-B6_ML.yaml | Inference Model/Training Model |
| PP-LCNet_x1_0_ML | 77.96 | 4.04 / 1.15 | 11.76 / 8.32 | 29.4 M | PP-LCNet_x1_0_ML.yaml | Inference Model/Training Model |
| ResNet50_ML | 83.42 | 12.12 / 3.27 | 51.79 / 44.36 | 108.9 M | ResNet50_ML.yaml | Inference Model/Training Model |

## Pedestrian Attribute Recognition Module

| Model Name | mA (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_pedestrian_attribute | 92.2 | 2.35 / 0.49 | 3.17 / 1.25 | 6.7 M | PP-LCNet_x1_0_pedestrian_attribute.yaml | Inference Model/Training Model |

## Vehicle Attribute Recognition Module

| Model Name | mA (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_vehicle_attribute | 91.7 | 2.32 / 2.32 | 3.22 / 1.26 | 6.7 M | PP-LCNet_x1_0_vehicle_attribute.yaml | Inference Model/Training Model |

## Image Feature Module

| Model Name | recall@1 (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-ShiTuV2_rec | 84.2 | 3.48 / 0.55 | 8.04 / 4.04 | 16.3 M | PP-ShiTuV2_rec.yaml | Inference Model/Training Model |
| PP-ShiTuV2_rec_CLIP_vit_base | 88.69 | 12.94 / 2.88 | 58.36 / 58.36 | 306.6 M | PP-ShiTuV2_rec_CLIP_vit_base.yaml | Inference Model/Training Model |
| PP-ShiTuV2_rec_CLIP_vit_large | 91.03 | 51.65 / 11.18 | 255.78 / 255.78 | 1.05 G | PP-ShiTuV2_rec_CLIP_vit_large.yaml | Inference Model/Training Model |

## Document Orientation Classification Module

| Model Name | Top-1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | 99.06 | 2.31 / 0.43 | 3.37 / 1.27 | 7 M | PP-LCNet_x1_0_doc_ori.yaml | Inference Model/Training Model |

## Face Feature Module

| Model Name | Output Feature Dimension | Acc (%) AgeDB-30/CFP-FP/LFW | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|---|
| MobileFaceNet | 128 | 96.28/96.71/99.58 | 3.16 / 0.48 | 6.49 / 6.49 | 4.1 | MobileFaceNet.yaml | Inference Model/Training Model |
| ResNet50_face | 512 | 98.12/98.56/99.77 | 5.68 / 1.09 | 14.96 / 11.90 | 87.2 | ResNet50_face.yaml | Inference Model/Training Model |

## Main Body Detection Module

| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-ShiTuV2_det | 41.5 | 12.79 / 4.51 | 44.14 / 44.14 | 27.54 M | PP-ShiTuV2_det.yaml | Inference Model/Training Model |

## Object Detection Module

| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| Cascade-FasterRCNN-ResNet50-FPN | 41.1 | 135.92 / 135.92 | - | 245.4 M | Cascade-FasterRCNN-ResNet50-FPN.yaml | Inference Model/Training Model |
| Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN | 45.0 | 138.23 / 138.23 | - | 246.2 M | Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN.yaml | Inference Model/Training Model |
| CenterNet-DLA-34 | 37.6 | - | - | 75.4 M | CenterNet-DLA-34.yaml | Inference Model/Training Model |
| CenterNet-ResNet50 | 38.9 | - | - | 319.7 M | CenterNet-ResNet50.yaml | Inference Model/Training Model |
| DETR-R50 | 42.3 | 62.91 / 17.33 | 392.63 / 392.63 | 159.3 M | DETR-R50.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet34-FPN | 37.8 | 83.33 / 31.64 | - | 137.5 M | FasterRCNN-ResNet34-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet50-FPN | 38.4 | 107.08 / 35.40 | - | 148.1 M | FasterRCNN-ResNet50-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet50-vd-FPN | 39.5 | 109.36 / 36.00 | - | 148.1 M | FasterRCNN-ResNet50-vd-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet50-vd-SSLDv2-FPN | 41.4 | 109.06 / 36.19 | - | 148.1 M | FasterRCNN-ResNet50-vd-SSLDv2-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet50 | 36.7 | 496.33 / 109.12 | - | 120.2 M | FasterRCNN-ResNet50.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet101-FPN | 41.4 | 148.21 / 42.21 | - | 216.3 M | FasterRCNN-ResNet101-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet101 | 39.0 | 538.58 / 120.88 | - | 188.1 M | FasterRCNN-ResNet101.yaml | Inference Model/Training Model |
| FasterRCNN-ResNeXt101-vd-FPN | 43.4 | 258.01 / 58.25 | - | 360.6 M | FasterRCNN-ResNeXt101-vd-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-Swin-Tiny-FPN | 42.6 | - | - | 159.8 M | FasterRCNN-Swin-Tiny-FPN.yaml | Inference Model/Training Model |
| FCOS-ResNet50 | 39.6 | 106.13 / 28.32 | 721.79 / 721.79 | 124.2 M | FCOS-ResNet50.yaml | Inference Model/Training Model |
| PicoDet-L | 42.6 | 14.68 / 5.81 | 47.32 / 47.32 | 20.9 M | PicoDet-L.yaml | Inference Model/Training Model |
| PicoDet-M | 37.5 | 9.62 / 3.23 | 23.75 / 14.88 | 16.8 M | PicoDet-M.yaml | Inference Model/Training Model |
| PicoDet-S | 29.1 | 7.98 / 2.33 | 14.82 / 5.60 | 4.4 M | PicoDet-S.yaml | Inference Model/Training Model |
| PicoDet-XS | 26.2 | 9.66 / 2.75 | 19.15 / 7.24 | 5.7 M | PicoDet-XS.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-L | 52.9 | 33.55 / 10.46 | 189.05 / 189.05 | 185.3 M | PP-YOLOE_plus-L.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-M | 49.8 | 19.52 / 7.46 | 113.36 / 113.36 | 83.2 M | PP-YOLOE_plus-M.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-S | 43.7 | 12.16 / 4.58 | 73.86 / 52.90 | 28.3 M | PP-YOLOE_plus-S.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-X | 54.7 | 58.87 / 15.84 | 292.93 / 292.93 | 349.4 M | PP-YOLOE_plus-X.yaml | Inference Model/Training Model |
| RT-DETR-H | 56.3 | 115.92 / 28.16 | 971.32 / 971.32 | 435.8 M | RT-DETR-H.yaml | Inference Model/Training Model |
| RT-DETR-L | 53.0 | 35.00 / 10.45 | 495.51 / 495.51 | 113.7 M | RT-DETR-L.yaml | Inference Model/Training Model |
| RT-DETR-R18 | 46.5 | 20.21 / 6.23 | 266.01 / 266.01 | 70.7 M | RT-DETR-R18.yaml | Inference Model/Training Model |
| RT-DETR-R50 | 53.1 | 42.14 / 11.31 | 523.97 / 523.97 | 149.1 M | RT-DETR-R50.yaml | Inference Model/Training Model |
| RT-DETR-X | 54.8 | 61.24 / 15.83 | 647.08 / 647.08 | 232.9 M | RT-DETR-X.yaml | Inference Model/Training Model |
| YOLOv3-DarkNet53 | 39.1 | 41.58 / 10.10 | 158.78 / 158.78 | 219.7 M | YOLOv3-DarkNet53.yaml | Inference Model/Training Model |
| YOLOv3-MobileNetV3 | 31.4 | 16.53 / 5.70 | 60.44 / 60.44 | 83.8 M | YOLOv3-MobileNetV3.yaml | Inference Model/Training Model |
| YOLOv3-ResNet50_vd_DCN | 40.6 | 32.91 / 10.07 | 225.72 / 224.32 | 163.0 M | YOLOv3-ResNet50_vd_DCN.yaml | Inference Model/Training Model |
| YOLOX-L | 50.1 | 121.19 / 13.55 | 295.38 / 274.15 | 192.5 M | YOLOX-L.yaml | Inference Model/Training Model |
| YOLOX-M | 46.9 | 87.19 / 10.09 | 183.95 / 172.67 | 90.0 M | YOLOX-M.yaml | Inference Model/Training Model |
| YOLOX-N | 26.1 | 53.31 / 45.02 | 69.69 / 59.18 | 3.4 M | YOLOX-N.yaml | Inference Model/Training Model |
| YOLOX-S | 40.4 | 129.52 / 13.19 | 181.39 / 179.01 | 32.0 M | YOLOX-S.yaml | Inference Model/Training Model |
| YOLOX-T | 32.9 | 66.81 / 61.31 | 92.30 / 83.90 | 18.1 M | YOLOX-T.yaml | Inference Model/Training Model |
| YOLOX-X | 51.8 | 156.40 / 20.17 | 480.14 / 454.35 | 351.5 M | YOLOX-X.yaml | Inference Model/Training Model |
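
The same API applies to any detector in the table above; a short sketch under the same PaddleX 3.x assumption (`test_image.jpg` is a hypothetical path), this time saving a visualization of the predicted boxes:

```python
# Object detection sketch (assumes the PaddleX 3.x Python API).
# "test_image.jpg" is a hypothetical local file path.
from paddlex import create_model

model = create_model("PicoDet-S")                # any detector from the table
output = model.predict("test_image.jpg", batch_size=1)

for res in output:
    res.print()                   # boxes, labels, and scores
    res.save_to_img("./output/")  # save the image with drawn boxes
```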

## Small Object Detection Module

| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-YOLOE_plus_SOD-S | 25.1 | 135.68 / 122.94 | 188.09 / 107.74 | 77.3 M | PP-YOLOE_plus_SOD-S.yaml | Inference Model/Training Model |

## Open-Vocabulary Object Detection Module

| Model | mAP(0.5:0.95) | mAP(0.5) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | Model Download Link |
|---|---|---|---|---|---|---|
| GroundingDINO-T | 49.4 | 64.4 | 253.72 | 1807.4 | 658.3 | Inference Model |

## Open-Vocabulary Segmentation Module

| Model | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | Model Download Link |
|---|---|---|---|---|
| SAM-H_box | 144.9 | 33920.7 | 2433.7 | Inference Model |
| SAM-H_point | 144.9 | 33920.7 | 2433.7 | Inference Model |

## Rotated Object Detection Module

| Model | mAP(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-YOLOE-R-L | 78.14 | 20.7039 | 157.942 | 211.0 M | PP-YOLOE-R.yaml | Inference Model/Training Model |

Note: The above accuracy metrics are based on the DOTA validation set mAP(0.5:0.95). All model GPU inference times are based on an NVIDIA RTX 2080 Ti machine, with precision type FP16. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz, with 8 threads, and precision type FP32.

## [Pedestrian Detection Module](../module_usage/tutorials/cv_modules/human_detection.en.md)

| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-YOLOE-L_human | 48.0 | 33.27 / 9.19 | 173.72 / 173.72 | 196.1 M | PP-YOLOE-L_human.yaml | Inference Model/Training Model |
| PP-YOLOE-S_human | 42.5 | 9.94 / 3.42 | 54.48 / 46.52 | 28.8 M | PP-YOLOE-S_human.yaml | Inference Model/Training Model |

## Vehicle Detection Module

| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-YOLOE-L_vehicle | 63.9 | 32.84 / 9.03 | 176.60 / 176.60 | 196.1 M | PP-YOLOE-L_vehicle.yaml | Inference Model/Training Model |
| PP-YOLOE-S_vehicle | 61.3 | 9.79 / 3.48 | 54.14 / 46.69 | 28.8 M | PP-YOLOE-S_vehicle.yaml | Inference Model/Training Model |

## Face Detection Module

| Model Name | AP (%) Easy/Medium/Hard | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| BlazeFace | 77.7/73.4/49.5 | 60.34 / 54.76 | 84.18 / 84.18 | 0.447 M | BlazeFace.yaml | Inference Model/Training Model |
| BlazeFace-FPN-SSH | 83.2/80.5/60.5 | 69.29 / 63.42 | 86.96 / 86.96 | 0.606 M | BlazeFace-FPN-SSH.yaml | Inference Model/Training Model |
| PicoDet_LCNet_x2_5_face | 93.7/90.7/68.1 | 35.37 / 12.88 | 126.24 / 126.24 | 28.9 M | PicoDet_LCNet_x2_5_face.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-S_face | 93.9/91.8/79.8 | 22.54 / 8.33 | 138.67 / 138.67 | 26.5 M | PP-YOLOE_plus-S_face | Inference Model/Training Model |

## Unsupervised Anomaly Detection Module

| Model Name | mIoU | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| STFPM | 0.9901 | 2.97 / 1.57 | 38.86 / 13.24 | 22.5 M | STFPM.yaml | Inference Model/Training Model |

## Human Keypoint Detection Module

| Model | Scheme | Input Size | AP(0.5:0.95) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|---|---|
| PP-TinyPose_128x96 | Top-Down | 128*96 | 58.4 | - | - | 4.9 | PP-TinyPose_128x96.yaml | Inference Model/Training Model |
| PP-TinyPose_256x192 | Top-Down | 256*192 | 68.3 | - | - | 4.9 | PP-TinyPose_256x192.yaml | Inference Model/Training Model |

## 3D Multi-modal Fusion Detection Module

| Model | mAP(%) | NDS | yaml File | Model Download Link |
|---|---|---|---|---|
| BEVFusion | 53.9 | 60.9 | BEVFusion.yaml | Inference Model/Training Model |

Note: The above accuracy metrics are based on the nuScenes validation set, reporting mAP(0.5:0.95) and NDS, with FP32 precision.

## [Semantic Segmentation Module](../module_usage/tutorials/cv_modules/semantic_segmentation.en.md)

| Model Name | mIoU (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| Deeplabv3_Plus-R50 | 80.36 | 503.51 / 122.30 | 3543.91 / 3543.91 | 94.9 M | Deeplabv3_Plus-R50.yaml | Inference Model/Training Model |
| Deeplabv3_Plus-R101 | 81.10 | 803.79 / 175.45 | 5136.21 / 5136.21 | 162.5 M | Deeplabv3_Plus-R101.yaml | Inference Model/Training Model |
| Deeplabv3-R50 | 79.90 | 647.56 / 121.67 | 3803.09 / 3803.09 | 138.3 M | Deeplabv3-R50.yaml | Inference Model/Training Model |
| Deeplabv3-R101 | 80.85 | 950.43 / 178.50 | 5517.14 / 5517.14 | 205.9 M | Deeplabv3-R101.yaml | Inference Model/Training Model |
| OCRNet_HRNet-W18 | 80.67 | 286.12 / 80.76 | 1794.03 / 1794.03 | 43.1 M | OCRNet_HRNet-W18.yaml | Inference Model/Training Model |
| OCRNet_HRNet-W48 | 82.15 | 627.36 / 170.76 | 3531.61 / 3531.61 | 249.8 M | OCRNet_HRNet-W48.yaml | Inference Model/Training Model |
| PP-LiteSeg-T | 73.10 | 30.16 / 14.03 | 420.07 / 235.01 | 28.5 M | PP-LiteSeg-T.yaml | Inference Model/Training Model |
| PP-LiteSeg-B | 75.25 | 40.92 / 20.18 | 494.32 / 310.34 | 47.0 M | PP-LiteSeg-B.yaml | Inference Model/Training Model |
| SegFormer-B0 (slice) | 76.73 | 11.1946 | 268.929 | 13.2 M | SegFormer-B0.yaml | Inference Model/Training Model |
| SegFormer-B1 (slice) | 78.35 | 17.9998 | 403.393 | 48.5 M | SegFormer-B1.yaml | Inference Model/Training Model |
| SegFormer-B2 (slice) | 81.60 | 48.0371 | 1248.52 | 96.9 M | SegFormer-B2.yaml | Inference Model/Training Model |
| SegFormer-B3 (slice) | 82.47 | 64.341 | 1666.35 | 167.3 M | SegFormer-B3.yaml | Inference Model/Training Model |
| SegFormer-B4 (slice) | 82.38 | 82.4336 | 1995.42 | 226.7 M | SegFormer-B4.yaml | Inference Model/Training Model |
| SegFormer-B5 (slice) | 82.58 | 97.3717 | 2420.19 | 229.7 M | SegFormer-B5.yaml | Inference Model/Training Model |

| Model Name | mIoU (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| SeaFormer_base (slice) | 40.92 | 24.4073 | 397.574 | 30.8 M | SeaFormer_base.yaml | Inference Model/Training Model |
| SeaFormer_large (slice) | 43.66 | 27.8123 | 550.464 | 49.8 M | SeaFormer_large.yaml | Inference Model/Training Model |
| SeaFormer_small (slice) | 38.73 | 19.2295 | 358.343 | 14.3 M | SeaFormer_small.yaml | Inference Model/Training Model |
| SeaFormer_tiny (slice) | 34.58 | 13.9496 | 330.132 | 6.1 M | SeaFormer_tiny.yaml | Inference Model/Training Model |

## Instance Segmentation Module

| Model Name | Mask AP | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| Mask-RT-DETR-H | 50.6 | 172.36 / 172.36 | 1615.75 / 1615.75 | 449.9 M | Mask-RT-DETR-H.yaml | Inference Model/Training Model |
| Mask-RT-DETR-L | 45.7 | 88.18 / 88.18 | 1090.84 / 1090.84 | 113.6 M | Mask-RT-DETR-L.yaml | Inference Model/Training Model |
| Mask-RT-DETR-M | 42.7 | 78.69 / 78.69 | - | 66.6 M | Mask-RT-DETR-M.yaml | Inference Model/Training Model |
| Mask-RT-DETR-S | 41.0 | 33.5007 | - | 51.8 M | Mask-RT-DETR-S.yaml | Inference Model/Training Model |
| Mask-RT-DETR-X | 47.5 | 114.16 / 114.16 | 1240.92 / 1240.92 | 237.5 M | Mask-RT-DETR-X.yaml | Inference Model/Training Model |
| Cascade-MaskRCNN-ResNet50-FPN | 36.3 | 141.69 / 141.69 | - | 254.8 M | Cascade-MaskRCNN-ResNet50-FPN.yaml | Inference Model/Training Model |
| Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN | 39.1 | 147.62 / 147.62 | - | 254.7 M | Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet50-FPN | 35.6 | 118.30 / 118.30 | - | 157.5 M | MaskRCNN-ResNet50-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet50-vd-FPN | 36.4 | 118.34 / 118.34 | - | 157.5 M | MaskRCNN-ResNet50-vd-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet50 | 32.8 | 228.83 / 228.83 | - | 127.8 M | MaskRCNN-ResNet50.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet101-FPN | 36.6 | 148.14 / 148.14 | - | 225.4 M | MaskRCNN-ResNet101-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet101-vd-FPN | 38.1 | 151.12 / 151.12 | - | 225.1 M | MaskRCNN-ResNet101-vd-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNeXt101-vd-FPN | 39.5 | 237.55 / 237.55 | - | 370.0 M | MaskRCNN-ResNeXt101-vd-FPN.yaml | Inference Model/Training Model |
| PP-YOLOE_seg-S | 32.5 | - | - | 31.5 M | PP-YOLOE_seg-S.yaml | Inference Model/Training Model |
| SOLOv2 | 35.5 | - | - | 179.1 M | SOLOv2.yaml | Inference Model/Training Model |

## Text Detection Module

| Model | Detection Hmean (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-OCRv4_server_det | 82.56 | 83.34 / 80.91 | 442.58 / 442.58 | 109 | PP-OCRv4_server_det.yaml | Inference Model/Training Model |
| PP-OCRv4_mobile_det | 77.35 | 8.79 / 3.13 | 51.00 / 28.58 | 4.7 | PP-OCRv4_mobile_det.yaml | Inference Model/Training Model |
| PP-OCRv3_mobile_det | 78.68 | 8.44 / 2.91 | 27.87 / 27.87 | 2.1 | PP-OCRv3_mobile_det.yaml | Inference Model/Training Model |
| PP-OCRv3_server_det | 80.11 | 65.41 / 13.67 | 305.07 / 305.07 | 102.1 | PP-OCRv3_server_det.yaml | Inference Model/Training Model |

## Seal Text Detection Module

| Model Name | Detection Hmean (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-OCRv4_mobile_seal_det | 96.47 | 7.82 / 3.09 | 48.28 / 23.97 | 4.7 M | PP-OCRv4_mobile_seal_det.yaml | Inference Model/Training Model |
| PP-OCRv4_server_seal_det | 98.21 | 74.75 / 67.72 | 382.55 / 382.55 | 108.3 M | PP-OCRv4_server_seal_det.yaml | Inference Model/Training Model |

## Text Recognition Module

| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-OCRv4_server_rec_doc | 81.53 | 6.65 / 6.65 | 32.92 / 32.92 | 74.7 M | PP-OCRv4_server_rec_doc.yaml | Inference Model/Training Model |
| PP-OCRv4_mobile_rec | 78.74 | 4.82 / 4.82 | 16.74 / 4.64 | 10.6 M | PP-OCRv4_mobile_rec.yaml | Inference Model/Training Model |
| PP-OCRv4_server_rec | 80.61 | 6.58 / 6.58 | 33.17 / 33.17 | 71.2 M | PP-OCRv4_server_rec.yaml | Inference Model/Training Model |
| PP-OCRv3_mobile_rec | 72.96 | 5.87 / 5.87 | 9.07 / 4.28 | 9.2 M | PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
Note: The evaluation set for the above accuracy metrics is a Chinese dataset built by PaddleOCR, covering multiple scenarios such as street view, web images, documents, and handwriting, with 8367 images for text recognition. All models' GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision, while CPU inference speeds are based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.

| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| ch_SVTRv2_rec | 68.81 | 8.08 / 8.08 | 50.17 / 42.50 | 73.9 M | ch_SVTRv2_rec.yaml | Inference Model/Training Model |
Note: The evaluation dataset for the above accuracy metrics is the PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition Task Leaderboard A. All model GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.

| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| ch_RepSVTR_rec | 65.07 | 5.93 / 5.93 | 20.73 / 7.32 | 22.1 M | ch_RepSVTR_rec.yaml | Inference Model/Training Model |
Note: The evaluation dataset for the above accuracy metrics is the PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition Task Leaderboard B. All model GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.

English Recognition Model

| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| en_PP-OCRv4_mobile_rec | 70.39 | 4.81 / 4.81 | 16.10 / 5.31 | 6.8 M | en_PP-OCRv4_mobile_rec.yaml | Inference Model/Training Model |
| en_PP-OCRv3_mobile_rec | 70.69 | 5.44 / 5.44 | 8.65 / 5.57 | 7.8 M | en_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |

Multilingual Recognition Model

| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| korean_PP-OCRv3_mobile_rec | 60.21 | 5.40 / 5.40 | 9.11 / 4.05 | 8.6 M | korean_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| japan_PP-OCRv3_mobile_rec | 45.69 | 5.70 / 5.70 | 8.48 / 4.07 | 8.8 M | japan_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| chinese_cht_PP-OCRv3_mobile_rec | 82.06 | 5.90 / 5.90 | 9.28 / 4.34 | 9.7 M | chinese_cht_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| te_PP-OCRv3_mobile_rec | 95.88 | 5.42 / 5.42 | 8.10 / 6.91 | 7.8 M | te_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| ka_PP-OCRv3_mobile_rec | 96.96 | 5.25 / 5.25 | 9.09 / 3.86 | 8.0 M | ka_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| ta_PP-OCRv3_mobile_rec | 76.83 | 5.23 / 5.23 | 10.13 / 4.30 | 8.0 M | ta_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| latin_PP-OCRv3_mobile_rec | 76.93 | 5.20 / 5.20 | 8.83 / 7.15 | 7.8 M | latin_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| arabic_PP-OCRv3_mobile_rec | 73.55 | 5.35 / 5.35 | 8.80 / 4.56 | 7.8 M | arabic_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| cyrillic_PP-OCRv3_mobile_rec | 94.28 | 5.23 / 5.23 | 8.89 / 3.88 | 7.9 M | cyrillic_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| devanagari_PP-OCRv3_mobile_rec | 96.44 | 5.22 / 5.22 | 8.56 / 4.06 | 7.9 M | devanagari_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
Note: The evaluation set for the above accuracy metrics is a multi-language dataset built by PaddleX. All model GPU inference times are based on NVIDIA Tesla T4 machines, with precision type FP32. CPU inference speed is based on Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz, with 8 threads, and precision type FP32.

## [Formula Recognition Module](../module_usage/tutorials/ocr_modules/formula_recognition.en.md)

| Model | Avg-BLEU | GPU Inference Time (ms) | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|
| UniMERNet | 0.8613 | 2266.96 | 1.4 G | UniMERNet.yaml | Inference Model/Training Model |
| PP-FormulaNet-S | 0.8712 | 202.25 | 167.9 M | PP-FormulaNet-S.yaml | Inference Model/Training Model |
| PP-FormulaNet-L | 0.9213 | 1976.52 | 535.2 M | PP-FormulaNet-L.yaml | Inference Model/Training Model |
| LaTeX_OCR_rec | 0.7163 | - | 89.7 M | LaTeX_OCR_rec.yaml | Inference Model/Training Model |

## Table Structure Recognition Module

| Model | Accuracy (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| SLANet | 59.52 | 103.08 / 103.08 | 197.99 / 197.99 | 6.9 M | SLANet.yaml | Inference Model/Training Model |
| SLANet_plus | 63.69 | 140.29 / 140.29 | 195.39 / 195.39 | 6.9 M | SLANet_plus.yaml | Inference Model/Training Model |
| SLANeXt_wired | 69.65 | -- | -- | -- | SLANeXt_wired.yaml | Inference Model/Training Model |
| SLANeXt_wireless | -- | -- | -- | -- | SLANeXt_wireless.yaml | Inference Model/Training Model |

## Table Cell Detection Module

| Model | mAP(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| RT-DETR-L_wired_table_cell_det | -- | -- | -- | -- | RT-DETR-L_wired_table_cell_det.yaml | Inference Model/Training Model |
| RT-DETR-L_wireless_table_cell_det | -- | -- | -- | -- | RT-DETR-L_wireless_table_cell_det.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are measured from the internal table cell detection dataset of PaddleX. All model GPU inference times are based on an NVIDIA Tesla T4 machine, with precision type FP32. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz, with 8 threads, and precision type FP32.

## [Table Classification Module](../module_usage/tutorials/ocr_modules/table_classification.en.md)

| Model | Top1 Acc(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_table_cls | -- | -- | -- | -- | PP-LCNet_x1_0_table_cls.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are measured from the internal table classification dataset built by PaddleX. All model GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.

## [Text Image Unwarping Module](../module_usage/tutorials/ocr_modules/text_image_unwarping.en.md)

| Model Name | MS-SSIM (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| UVDoc | 54.40 | 16.27 / 7.76 | 176.97 / 80.60 | 30.3 M | UVDoc.yaml | Inference Model/Training Model |

## Layout Detection Module

Table Layout Detection Model

| Model | mAP(0.5) (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PicoDet_layout_1x_table | 97.5 | 8.02 / 3.09 | 23.70 / 20.41 | 7.4 M | PicoDet_layout_1x_table.yaml | Inference Model/Training Model |

3-Class Layout Detection Model

| Model | mAP(0.5) (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PicoDet-S_layout_3cls | 88.2 | 8.99 / 2.22 | 16.11 / 8.73 | 4.8 | PicoDet-S_layout_3cls.yaml | Inference Model/Training Model |
| PicoDet-L_layout_3cls | 89.0 | 13.05 / 4.50 | 41.30 / 41.30 | 22.6 | PicoDet-L_layout_3cls.yaml | Inference Model/Training Model |
| RT-DETR-H_layout_3cls | 95.8 | 114.93 / 27.71 | 947.56 / 947.56 | 470.1 | RT-DETR-H_layout_3cls.yaml | Inference Model/Training Model |

General Layout Detection Model

| Model | mAP(0.5) (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PicoDet_layout_1x | 97.8 | 9.03 / 3.10 | 25.82 / 20.70 | 7.4 | PicoDet_layout_1x.yaml | Inference Model/Training Model |

17-Class Layout Detection Model

| Model | mAP(0.5) (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PicoDet-S_layout_17cls | 87.4 | 9.11 / 2.12 | 15.42 / 9.12 | 4.8 | PicoDet-S_layout_17cls.yaml | Inference Model/Training Model |

## Document Image Orientation Classification Module

| Model | Top-1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | 99.06 | 2.31 / 0.43 | 3.37 / 1.27 | 7 | PP-LCNet_x1_0_doc_ori.yaml | Inference Model/Training Model |

## Time Series Forecasting Module

| Model Name | MSE | MAE | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|
| DLinear | 0.382 | 0.394 | 72 K | DLinear.yaml | Inference Model/Training Model |
| NLinear | 0.386 | 0.392 | 40 K | NLinear.yaml | Inference Model/Training Model |
| Nonstationary | 0.600 | 0.515 | 55.5 M | Nonstationary.yaml | Inference Model/Training Model |
| PatchTST | 0.379 | 0.391 | 2.0 M | PatchTST.yaml | Inference Model/Training Model |
| RLinear | 0.385 | 0.392 | 40 K | RLinear.yaml | Inference Model/Training Model |
| TiDE | 0.407 | 0.414 | 31.7 M | TiDE.yaml | Inference Model/Training Model |
| TimesNet | 0.416 | 0.429 | 4.9 M | TimesNet.yaml | Inference Model/Training Model |
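
The same API also covers the time series modules, with a CSV file as input instead of an image. Below is a sketch under the same PaddleX 3.x assumption; `ts_data.csv` is a hypothetical file following the schema the model was trained on:

```python
# Time series forecasting sketch (assumes the PaddleX 3.x Python API).
# "ts_data.csv" is a hypothetical input file.
from paddlex import create_model

model = create_model("DLinear")                  # any model from the table
output = model.predict("ts_data.csv", batch_size=1)

for res in output:
    res.print()                   # print the forecast values
    res.save_to_csv("./output/")  # write the forecast out as CSV
```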

## Time Series Anomaly Detection Module

| Model Name | Precision | Recall | F1 Score | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| AutoEncoder_ad | 99.36 | 84.36 | 91.25 | 52 K | AutoEncoder_ad.yaml | Inference Model/Training Model |
| DLinear_ad | 98.98 | 93.96 | 96.41 | 112 K | DLinear_ad.yaml | Inference Model/Training Model |
| Nonstationary_ad | 98.55 | 88.95 | 93.51 | 1.8 M | Nonstationary_ad.yaml | Inference Model/Training Model |
| PatchTST_ad | 98.78 | 90.70 | 94.57 | 320 K | PatchTST_ad.yaml | Inference Model/Training Model |

## Time Series Classification Module

| Model Name | acc(%) | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|
| TimesNet_cls | 87.5 | 792 K | TimesNet_cls.yaml | Inference Model/Training Model |

## Speech Recognition Module

| Model | Training Data | Model Size | Word Error Rate | yaml File | Model Download Link |
|---|---|---|---|---|---|
| whisper_large | 680k hours | 5.8 G | 2.7 (LibriSpeech) | whisper_large.yaml | Inference Model |
| whisper_medium | 680k hours | 2.9 G | - | whisper_medium.yaml | Inference Model |
| whisper_small | 680k hours | 923 M | - | whisper_small.yaml | Inference Model |
| whisper_base | 680k hours | 277 M | - | whisper_base.yaml | Inference Model |
| whisper_tiny | 680k hours | 145 M | - | whisper_tiny.yaml | Inference Model |

## Video Classification Module

| Model | Top1 Acc(%) | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|
| PP-TSM-R50_8frames_uniform | 74.36 | 93.4 M | PP-TSM-R50_8frames_uniform.yaml | Inference Model/Training Model |
| PP-TSMv2-LCNetV2_8frames_uniform | 71.71 | 22.5 M | PP-TSMv2-LCNetV2_8frames_uniform.yaml | Inference Model/Training Model |
| PP-TSMv2-LCNetV2_16frames_uniform | 73.11 | 22.5 M | PP-TSMv2-LCNetV2_16frames_uniform.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are based on the K400 validation set Top1 Acc.

## [Video Detection Module](../module_usage/tutorials/video_modules/video_detection.en.md)

| Model | Frame-mAP(@ IoU 0.5) | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|
| YOWO | 80.94 | 462.891 M | YOWO.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are based on the test dataset UCF101-24, using the Frame-mAP (@ IoU 0.5) metric. All model GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.