PaddleX includes multiple pipelines, each composed of several modules, and each module offers several models. Use the benchmark data below to choose a model: pick a higher-accuracy model if accuracy matters most, a faster model if inference speed matters most, or a smaller model if storage size matters most.
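Every model in the tables below comes with a yaml configuration file and downloadable inference/training weights. As a rough illustration of how a model picked from these tables can be tried out, the following sketch uses the PaddleX Python API; the model name is taken from the tables, while the input image path is a placeholder and the exact API surface may vary between PaddleX versions.

```python
# Minimal sketch: run single-model inference with a model chosen from the tables below.
# Assumes PaddleX 3.x is installed; "test_image.jpg" is a placeholder input path.
from paddlex import create_model

# Any model name from the benchmark tables can be passed here, e.g. a lightweight
# image classification model chosen for its small storage size and fast inference.
model = create_model(model_name="PP-LCNet_x1_0")

# predict() returns an iterable of per-image results.
output = model.predict("test_image.jpg", batch_size=1)
for res in output:
    res.print()                    # print the prediction to stdout
    res.save_to_img("./output/")   # save the visualized result
    res.save_to_json("./output/result.json")
```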
| Model Name | Top1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| CLIP_vit_base_patch16_224 | 85.36 | 12.84 / 2.82 | 60.52 / 60.52 | 306.5 M | CLIP_vit_base_patch16_224.yaml | Inference Model/Training Model |
| CLIP_vit_large_patch14_224 | 88.1 | 51.72 / 11.13 | 238.07 / 238.07 | 1.04 G | CLIP_vit_large_patch14_224.yaml | Inference Model/Training Model |
| ConvNeXt_base_224 | 83.84 | 13.18 / 12.14 | 128.39 / 81.78 | 313.9 M | ConvNeXt_base_224.yaml | Inference Model/Training Model |
| ConvNeXt_base_384 | 84.90 | 32.15 / 30.52 | 279.36 / 220.35 | 313.9 M | ConvNeXt_base_384.yaml | Inference Model/Training Model |
| ConvNeXt_large_224 | 84.26 | 26.51 / 7.21 | 213.32 / 157.22 | 700.7 M | ConvNeXt_large_224.yaml | Inference Model/Training Model |
| ConvNeXt_large_384 | 85.27 | 67.07 / 65.26 | 494.04 / 438.97 | 700.7 M | ConvNeXt_large_384.yaml | Inference Model/Training Model |
| ConvNeXt_small | 83.13 | 9.05 / 8.21 | 97.94 / 55.29 | 178.0 M | ConvNeXt_small.yaml | Inference Model/Training Model |
| ConvNeXt_tiny | 82.03 | 5.12 / 2.06 | 63.96 / 29.77 | 101.4 M | ConvNeXt_tiny.yaml | Inference Model/Training Model |
| FasterNet-L | 83.5 | 15.67 / 3.10 | 52.24 / 52.24 | 357.1 M | FasterNet-L.yaml | Inference Model/Training Model |
| FasterNet-M | 83.0 | 9.72 / 2.30 | 35.29 / 35.29 | 204.6 M | FasterNet-M.yaml | Inference Model/Training Model |
| FasterNet-S | 81.3 | 5.46 / 1.27 | 20.46 / 18.03 | 119.3 M | FasterNet-S.yaml | Inference Model/Training Model |
| FasterNet-T0 | 71.9 | 4.18 / 0.60 | 6.34 / 3.44 | 15.1 M | FasterNet-T0.yaml | Inference Model/Training Model |
| FasterNet-T1 | 75.9 | 4.24 / 0.64 | 9.57 / 5.20 | 29.2 M | FasterNet-T1.yaml | Inference Model/Training Model |
| FasterNet-T2 | 79.1 | 3.87 / 0.78 | 11.14 / 9.98 | 57.4 M | FasterNet-T2.yaml | Inference Model/Training Model |
| MobileNetV1_x0_5 | 63.5 | 1.39 / 0.28 | 2.74 / 1.02 | 4.8 M | MobileNetV1_x0_5.yaml | Inference Model/Training Model |
| MobileNetV1_x0_25 | 51.4 | 1.32 / 0.30 | 2.04 / 0.58 | 1.8 M | MobileNetV1_x0_25.yaml | Inference Model/Training Model |
| MobileNetV1_x0_75 | 68.8 | 1.75 / 0.33 | 3.41 / 1.57 | 9.3 M | MobileNetV1_x0_75.yaml | Inference Model/Training Model |
| MobileNetV1_x1_0 | 71.0 | 1.89 / 0.34 | 4.01 / 2.17 | 15.2 M | MobileNetV1_x1_0.yaml | Inference Model/Training Model |
| MobileNetV2_x0_5 | 65.0 | 3.17 / 0.48 | 4.52 / 1.35 | 7.1 M | MobileNetV2_x0_5.yaml | Inference Model/Training Model |
| MobileNetV2_x0_25 | 53.2 | 2.80 / 0.46 | 3.92 / 0.98 | 5.5 M | MobileNetV2_x0_25.yaml | Inference Model/Training Model |
| MobileNetV2_x1_0 | 72.2 | 3.57 / 0.49 | 5.63 / 2.51 | 12.6 M | MobileNetV2_x1_0.yaml | Inference Model/Training Model |
| MobileNetV2_x1_5 | 74.1 | 3.58 / 0.62 | 8.02 / 4.49 | 25.0 M | MobileNetV2_x1_5.yaml | Inference Model/Training Model |
| MobileNetV2_x2_0 | 75.2 | 3.56 / 0.74 | 10.24 / 6.83 | 41.2 M | MobileNetV2_x2_0.yaml | Inference Model/Training Model |
| MobileNetV3_large_x0_5 | 69.2 | 3.79 / 0.62 | 6.76 / 1.61 | 9.6 M | MobileNetV3_large_x0_5.yaml | Inference Model/Training Model |
| MobileNetV3_large_x0_35 | 64.3 | 3.70 / 0.60 | 5.54 / 1.41 | 7.5 M | MobileNetV3_large_x0_35.yaml | Inference Model/Training Model |
| MobileNetV3_large_x0_75 | 73.1 | 4.82 / 0.66 | 7.45 / 2.00 | 14.0 M | MobileNetV3_large_x0_75.yaml | Inference Model/Training Model |
| MobileNetV3_large_x1_0 | 75.3 | 4.86 / 0.68 | 6.88 / 2.61 | 19.5 M | MobileNetV3_large_x1_0.yaml | Inference Model/Training Model |
| MobileNetV3_large_x1_25 | 76.4 | 5.08 / 0.71 | 7.37 / 3.58 | 26.5 M | MobileNetV3_large_x1_25.yaml | Inference Model/Training Model |
| MobileNetV3_small_x0_5 | 59.2 | 3.41 / 0.57 | 5.60 / 1.14 | 6.8 M | MobileNetV3_small_x0_5.yaml | Inference Model/Training Model |
| MobileNetV3_small_x0_35 | 53.0 | 3.49 / 0.60 | 4.63 / 1.07 | 6.0 M | MobileNetV3_small_x0_35.yaml | Inference Model/Training Model |
| MobileNetV3_small_x0_75 | 66.0 | 3.49 / 0.60 | 5.19 / 1.28 | 8.5 M | MobileNetV3_small_x0_75.yaml | Inference Model/Training Model |
| MobileNetV3_small_x1_0 | 68.2 | 3.76 / 0.53 | 5.11 / 1.43 | 10.5 M | MobileNetV3_small_x1_0.yaml | Inference Model/Training Model |
| MobileNetV3_small_x1_25 | 70.7 | 4.23 / 0.58 | 6.48 / 1.68 | 13.0 M | MobileNetV3_small_x1_25.yaml | Inference Model/Training Model |
| MobileNetV4_conv_large | 83.4 | 8.33 / 2.24 | 33.56 / 23.70 | 125.2 M | MobileNetV4_conv_large.yaml | Inference Model/Training Model |
| MobileNetV4_conv_medium | 79.9 | 6.81 / 0.92 | 12.47 / 6.27 | 37.6 M | MobileNetV4_conv_medium.yaml | Inference Model/Training Model |
| MobileNetV4_conv_small | 74.6 | 3.25 / 0.46 | 4.42 / 1.54 | 14.7 M | MobileNetV4_conv_small.yaml | Inference Model/Training Model |
| MobileNetV4_hybrid_large | 83.8 | 12.27 / 4.18 | 58.64 / 58.64 | 145.1 M | MobileNetV4_hybrid_large.yaml | Inference Model/Training Model |
| MobileNetV4_hybrid_medium | 80.5 | 12.08 / 1.34 | 24.69 / 8.10 | 42.9 M | MobileNetV4_hybrid_medium.yaml | Inference Model/Training Model |
| PP-HGNet_base | 85.0 | 14.10 / 4.19 | 68.92 / 68.92 | 249.4 M | PP-HGNet_base.yaml | Inference Model/Training Model |
| PP-HGNet_small | 81.51 | 5.12 / 1.73 | 25.01 / 25.01 | 86.5 M | PP-HGNet_small.yaml | Inference Model/Training Model |
| PP-HGNet_tiny | 79.83 | 3.28 / 1.29 | 16.40 / 15.97 | 52.4 M | PP-HGNet_tiny.yaml | Inference Model/Training Model |
| PP-HGNetV2-B0 | 77.77 | 3.83 / 0.57 | 9.95 / 2.37 | 21.4 M | PP-HGNetV2-B0.yaml | Inference Model/Training Model |
| PP-HGNetV2-B1 | 79.18 | 3.87 / 0.62 | 8.77 / 3.79 | 22.6 M | PP-HGNetV2-B1.yaml | Inference Model/Training Model |
| PP-HGNetV2-B2 | 81.74 | 5.73 / 0.86 | 15.11 / 7.05 | 39.9 M | PP-HGNetV2-B2.yaml | Inference Model/Training Model |
| PP-HGNetV2-B3 | 82.98 | 6.26 / 1.01 | 18.47 / 10.34 | 57.9 M | PP-HGNetV2-B3.yaml | Inference Model/Training Model |
| PP-HGNetV2-B4 | 83.57 | 5.47 / 1.10 | 14.42 / 9.89 | 70.4 M | PP-HGNetV2-B4.yaml | Inference Model/Training Model |
| PP-HGNetV2-B5 | 84.75 | 10.24 / 1.96 | 29.71 / 29.71 | 140.8 M | PP-HGNetV2-B5.yaml | Inference Model/Training Model |
| PP-HGNetV2-B6 | 86.30 | 12.25 / 3.76 | 62.29 / 62.29 | 268.4 M | PP-HGNetV2-B6.yaml | Inference Model/Training Model |
| PP-LCNet_x0_5 | 63.14 | 2.28 / 0.42 | 2.86 / 0.83 | 6.7 M | PP-LCNet_x0_5.yaml | Inference Model/Training Model |
| PP-LCNet_x0_25 | 51.86 | 1.89 / 0.45 | 2.49 / 0.68 | 5.5 M | PP-LCNet_x0_25.yaml | Inference Model/Training Model |
| PP-LCNet_x0_35 | 58.09 | 1.94 / 0.41 | 2.73 / 0.77 | 5.9 M | PP-LCNet_x0_35.yaml | Inference Model/Training Model |
| PP-LCNet_x0_75 | 68.18 | 2.30 / 0.41 | 2.95 / 1.07 | 8.4 M | PP-LCNet_x0_75.yaml | Inference Model/Training Model |
| PP-LCNet_x1_0 | 71.32 | 2.35 / 0.47 | 4.03 / 1.35 | 10.5 M | PP-LCNet_x1_0.yaml | Inference Model/Training Model |
| PP-LCNet_x1_5 | 73.71 | 2.33 / 0.53 | 4.17 / 2.29 | 16.0 M | PP-LCNet_x1_5.yaml | Inference Model/Training Model |
| PP-LCNet_x2_0 | 75.18 | 2.40 / 0.51 | 5.37 / 3.46 | 23.2 M | PP-LCNet_x2_0.yaml | Inference Model/Training Model |
| PP-LCNet_x2_5 | 76.60 | 2.36 / 0.61 | 6.29 / 5.05 | 32.1 M | PP-LCNet_x2_5.yaml | Inference Model/Training Model |
| PP-LCNetV2_base | 77.05 | 3.33 / 0.55 | 6.86 / 3.77 | 23.7 M | PP-LCNetV2_base.yaml | Inference Model/Training Model |
| PP-LCNetV2_large | 78.51 | 4.37 / 0.71 | 9.43 / 8.07 | 37.3 M | PP-LCNetV2_large.yaml | Inference Model/Training Model |
| PP-LCNetV2_small | 73.97 | 2.53 / 0.41 | 5.14 / 1.98 | 14.6 M | PP-LCNetV2_small.yaml | Inference Model/Training Model |
| ResNet18_vd | 72.3 | 2.47 / 0.61 | 6.97 / 5.15 | 41.5 M | ResNet18_vd.yaml | Inference Model/Training Model |
| ResNet18 | 71.0 | 2.35 / 0.67 | 6.35 / 4.61 | 41.5 M | ResNet18.yaml | Inference Model/Training Model |
| ResNet34_vd | 76.0 | 4.01 / 1.03 | 11.99 / 9.86 | 77.3 M | ResNet34_vd.yaml | Inference Model/Training Model |
| ResNet34 | 74.6 | 3.99 / 1.02 | 12.42 / 9.81 | 77.3 M | ResNet34.yaml | Inference Model/Training Model |
| ResNet50_vd | 79.1 | 6.04 / 1.16 | 16.08 / 12.07 | 90.8 M | ResNet50_vd.yaml | Inference Model/Training Model |
| ResNet50 | 76.5 | 6.44 / 1.16 | 15.04 / 11.63 | 90.8 M | ResNet50.yaml | Inference Model/Training Model |
| ResNet101_vd | 80.2 | 11.16 / 2.07 | 32.14 / 32.14 | 158.4 M | ResNet101_vd.yaml | Inference Model/Training Model |
| ResNet101 | 77.6 | 10.91 / 2.06 | 31.14 / 22.93 | 158.7 M | ResNet101.yaml | Inference Model/Training Model |
| ResNet152_vd | 80.6 | 15.96 / 2.99 | 49.33 / 49.33 | 214.3 M | ResNet152_vd.yaml | Inference Model/Training Model |
| ResNet152 | 78.3 | 15.61 / 2.90 | 47.33 / 36.60 | 214.2 M | ResNet152.yaml | Inference Model/Training Model |
| ResNet200_vd | 80.9 | 24.20 / 3.69 | 62.62 / 62.62 | 266.0 M | ResNet200_vd.yaml | Inference Model/Training Model |
| StarNet-S1 | 73.6 | 6.33 / 1.98 | 7.56 / 3.26 | 11.2 M | StarNet-S1.yaml | Inference Model/Training Model |
| StarNet-S2 | 74.8 | 4.49 / 1.55 | 7.38 / 3.38 | 14.3 M | StarNet-S2.yaml | Inference Model/Training Model |
| StarNet-S3 | 77.0 | 6.70 / 1.62 | 11.05 / 4.76 | 22.2 M | StarNet-S3.yaml | Inference Model/Training Model |
| StarNet-S4 | 79.0 | 8.50 / 2.86 | 15.40 / 6.76 | 28.9 M | StarNet-S4.yaml | Inference Model/Training Model |
| SwinTransformer_base_patch4_window7_224 | 83.37 | 14.29 / 5.13 | 130.89 / 130.89 | 310.5 M | SwinTransformer_base_patch4_window7_224.yaml | Inference Model/Training Model |
| SwinTransformer_base_patch4_window12_384 | 84.17 | 37.74 / 10.10 | 362.56 / 362.56 | 311.4 M | SwinTransformer_base_patch4_window12_384.yaml | Inference Model/Training Model |
| SwinTransformer_large_patch4_window7_224 | 86.19 | 26.48 / 7.94 | 228.23 / 228.23 | 694.8 M | SwinTransformer_large_patch4_window7_224.yaml | Inference Model/Training Model |
| SwinTransformer_large_patch4_window12_384 | 87.06 | 74.72 / 18.16 | 652.04 / 652.04 | 696.1 M | SwinTransformer_large_patch4_window12_384.yaml | Inference Model/Training Model |
| SwinTransformer_small_patch4_window7_224 | 83.21 | 10.37 / 3.90 | 94.20 / 94.20 | 175.6 M | SwinTransformer_small_patch4_window7_224.yaml | Inference Model/Training Model |
| SwinTransformer_tiny_patch4_window7_224 | 81.10 | 6.66 / 2.15 | 60.45 / 60.45 | 100.1 M | SwinTransformer_tiny_patch4_window7_224.yaml | Inference Model/Training Model |
| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| CLIP_vit_base_patch16_448_ML | 89.15 | 54.75 / 14.30 | 280.23 / 280.23 | 325.6 M | CLIP_vit_base_patch16_448_ML.yaml | Inference Model/Training Model |
| PP-HGNetV2-B0_ML | 80.98 | 6.47 / 1.38 | 21.56 / 13.69 | 39.6 M | PP-HGNetV2-B0_ML.yaml | Inference Model/Training Model |
| PP-HGNetV2-B4_ML | 87.96 | 9.63 / 2.79 | 43.98 / 36.63 | 88.5 M | PP-HGNetV2-B4_ML.yaml | Inference Model/Training Model |
| PP-HGNetV2-B6_ML | 91.06 | 37.07 / 9.43 | 188.58 / 188.58 | 286.5 M | PP-HGNetV2-B6_ML.yaml | Inference Model/Training Model |
| PP-LCNet_x1_0_ML | 77.96 | 4.04 / 1.15 | 11.76 / 8.32 | 29.4 M | PP-LCNet_x1_0_ML.yaml | Inference Model/Training Model |
| ResNet50_ML | 83.42 | 12.12 / 3.27 | 51.79 / 44.36 | 108.9 M | ResNet50_ML.yaml | Inference Model/Training Model |
| Model Name | mA (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_pedestrian_attribute | 92.2 | 2.35 / 0.49 | 3.17 / 1.25 | 6.7 M | PP-LCNet_x1_0_pedestrian_attribute.yaml | Inference Model/Training Model |
| Model Name | mA (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_vehicle_attribute | 91.7 | 2.32 / 2.32 | 3.22 / 1.26 | 6.7 M | PP-LCNet_x1_0_vehicle_attribute.yaml | Inference Model/Training Model |
| Model Name | recall@1 (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-ShiTuV2_rec | 84.2 | 3.48 / 0.55 | 8.04 / 4.04 | 16.3 M | PP-ShiTuV2_rec.yaml | Inference Model/Training Model |
| PP-ShiTuV2_rec_CLIP_vit_base | 88.69 | 12.94 / 2.88 | 58.36 / 58.36 | 306.6 M | PP-ShiTuV2_rec_CLIP_vit_base.yaml | Inference Model/Training Model |
| PP-ShiTuV2_rec_CLIP_vit_large | 91.03 | 51.65 / 11.18 | 255.78 / 255.78 | 1.05 G | PP-ShiTuV2_rec_CLIP_vit_large.yaml | Inference Model/Training Model |
| Model Name | Top-1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | 99.06 | 2.31 / 0.43 | 3.37 / 1.27 | 7 | PP-LCNet_x1_0_doc_ori.yaml | Inference Model/Training Model |
| Model Name | Output Feature Dimension | Acc (%) AgeDB-30/CFP-FP/LFW | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|---|
| MobileFaceNet | 128 | 96.28/96.71/99.58 | 3.16 / 0.48 | 6.49 / 6.49 | 4.1 | MobileFaceNet.yaml | Inference Model/Training Model |
| ResNet50_face | 512 | 98.12/98.56/99.77 | 5.68 / 1.09 | 14.96 / 11.90 | 87.2 | ResNet50_face.yaml | Inference Model/Training Model |
| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-ShiTuV2_det | 41.5 | 12.79 / 4.51 | 44.14 / 44.14 | 27.54 | PP-ShiTuV2_det.yaml | Inference Model/Training Model |
| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| Cascade-FasterRCNN-ResNet50-FPN | 41.1 | 135.92 / 135.92 | - | 245.4 M | Cascade-FasterRCNN-ResNet50-FPN.yaml | Inference Model/Training Model |
| Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN | 45.0 | 138.23 / 138.23 | - | 246.2 M | Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN.yaml | Inference Model/Training Model |
| CenterNet-DLA-34 | 37.6 | - | - | 75.4 M | CenterNet-DLA-34.yaml | Inference Model/Training Model |
| CenterNet-ResNet50 | 38.9 | - | - | 319.7 M | CenterNet-ResNet50.yaml | Inference Model/Training Model |
| DETR-R50 | 42.3 | 62.91 / 17.33 | 392.63 / 392.63 | 159.3 M | DETR-R50.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet34-FPN | 37.8 | 83.33 / 31.64 | - | 137.5 M | FasterRCNN-ResNet34-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet50-FPN | 38.4 | 107.08 / 35.40 | - | 148.1 M | FasterRCNN-ResNet50-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet50-vd-FPN | 39.5 | 109.36 / 36.00 | - | 148.1 M | FasterRCNN-ResNet50-vd-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet50-vd-SSLDv2-FPN | 41.4 | 109.06 / 36.19 | - | 148.1 M | FasterRCNN-ResNet50-vd-SSLDv2-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet50 | 36.7 | 496.33 / 109.12 | - | 120.2 M | FasterRCNN-ResNet50.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet101-FPN | 41.4 | 148.21 / 42.21 | - | 216.3 M | FasterRCNN-ResNet101-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-ResNet101 | 39.0 | 538.58 / 120.88 | - | 188.1 M | FasterRCNN-ResNet101.yaml | Inference Model/Training Model |
| FasterRCNN-ResNeXt101-vd-FPN | 43.4 | 258.01 / 58.25 | - | 360.6 M | FasterRCNN-ResNeXt101-vd-FPN.yaml | Inference Model/Training Model |
| FasterRCNN-Swin-Tiny-FPN | 42.6 | - | - | 159.8 M | FasterRCNN-Swin-Tiny-FPN.yaml | Inference Model/Training Model |
| FCOS-ResNet50 | 39.6 | 106.13 / 28.32 | 721.79 / 721.79 | 124.2 M | FCOS-ResNet50.yaml | Inference Model/Training Model |
| PicoDet-L | 42.6 | 14.68 / 5.81 | 47.32 / 47.32 | 20.9 M | PicoDet-L.yaml | Inference Model/Training Model |
| PicoDet-M | 37.5 | 9.62 / 3.23 | 23.75 / 14.88 | 16.8 M | PicoDet-M.yaml | Inference Model/Training Model |
| PicoDet-S | 29.1 | 7.98 / 2.33 | 14.82 / 5.60 | 4.4 M | PicoDet-S.yaml | Inference Model/Training Model |
| PicoDet-XS | 26.2 | 9.66 / 2.75 | 19.15 / 7.24 | 5.7 M | PicoDet-XS.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-L | 52.9 | 33.55 / 10.46 | 189.05 / 189.05 | 185.3 M | PP-YOLOE_plus-L.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-M | 49.8 | 19.52 / 7.46 | 113.36 / 113.36 | 83.2 M | PP-YOLOE_plus-M.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-S | 43.7 | 12.16 / 4.58 | 73.86 / 52.90 | 28.3 M | PP-YOLOE_plus-S.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-X | 54.7 | 58.87 / 15.84 | 292.93 / 292.93 | 349.4 M | PP-YOLOE_plus-X.yaml | Inference Model/Training Model |
| RT-DETR-H | 56.3 | 115.92 / 28.16 | 971.32 / 971.32 | 435.8 M | RT-DETR-H.yaml | Inference Model/Training Model |
| RT-DETR-L | 53.0 | 35.00 / 10.45 | 495.51 / 495.51 | 113.7 M | RT-DETR-L.yaml | Inference Model/Training Model |
| RT-DETR-R18 | 46.5 | 20.21 / 6.23 | 266.01 / 266.01 | 70.7 M | RT-DETR-R18.yaml | Inference Model/Training Model |
| RT-DETR-R50 | 53.1 | 42.14 / 11.31 | 523.97 / 523.97 | 149.1 M | RT-DETR-R50.yaml | Inference Model/Training Model |
| RT-DETR-X | 54.8 | 61.24 / 15.83 | 647.08 / 647.08 | 232.9 M | RT-DETR-X.yaml | Inference Model/Training Model |
| YOLOv3-DarkNet53 | 39.1 | 41.58 / 10.10 | 158.78 / 158.78 | 219.7 M | YOLOv3-DarkNet53.yaml | Inference Model/Training Model |
| YOLOv3-MobileNetV3 | 31.4 | 16.53 / 5.70 | 60.44 / 60.44 | 83.8 M | YOLOv3-MobileNetV3.yaml | Inference Model/Training Model |
| YOLOv3-ResNet50_vd_DCN | 40.6 | 32.91 / 10.07 | 225.72 / 224.32 | 163.0 M | YOLOv3-ResNet50_vd_DCN.yaml | Inference Model/Training Model |
| YOLOX-L | 50.1 | 121.19 / 13.55 | 295.38 / 274.15 | 192.5 M | YOLOX-L.yaml | Inference Model/Training Model |
| YOLOX-M | 46.9 | 87.19 / 10.09 | 183.95 / 172.67 | 90.0 M | YOLOX-M.yaml | Inference Model/Training Model |
| YOLOX-N | 26.1 | 53.31 / 45.02 | 69.69 / 59.18 | 3.4M | YOLOX-N.yaml | Inference Model/Training Model |
| YOLOX-S | 40.4 | 129.52 / 13.19 | 181.39 / 179.01 | 32.0 M | YOLOX-S.yaml | Inference Model/Training Model |
| YOLOX-T | 32.9 | 66.81 / 61.31 | 92.30 / 83.90 | 18.1 M | YOLOX-T.yaml | Inference Model/Training Model |
| YOLOX-X | 51.8 | 156.40 / 20.17 | 480.14 / 454.35 | 351.5 M | YOLOX-X.yaml | Inference Model/Training Model |
| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-YOLOE_plus_SOD-S | 25.1 | 135.68 / 122.94 | 188.09 / 107.74 | 77.3 M | PP-YOLOE_plus_SOD-S.yaml | Inference Model/Training Model |
| Model | mAP(0.5:0.95) | mAP(0.5) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size (M) | Model Download Link |
|---|---|---|---|---|---|---|
| GroundingDINO-T | 49.4 | 64.4 | 253.72 | 1807.4 | 658.3 | Inference Model |
| YOLO-Worldv2-L | 44.4 | 59.8 | 24.32 | 374.89 | 421.4 | Inference Model |
| Model | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | Model Download Link |
|---|---|---|---|---|
| SAM-H_box | 144.9 | 33920.7 | 2433.7 | Inference Model |
| SAM-H_point | 144.9 | 33920.7 | 2433.7 | Inference Model |
| Model | mAP(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-YOLOE-R-L | 78.14 | 20.7039 | 157.942 | 211.0 M | PP-YOLOE-R.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are based on the DOTA validation set mAP(0.5:0.95).
## [Pedestrian Detection Module](../module_usage/tutorials/cv_modules/human_detection.en.md)

| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-YOLOE-L_human | 48.0 | 33.27 / 9.19 | 173.72 / 173.72 | 196.1 M | PP-YOLOE-L_human.yaml | Inference Model/Training Model |
| PP-YOLOE-S_human | 42.5 | 9.94 / 3.42 | 54.48 / 46.52 | 28.8 M | PP-YOLOE-S_human.yaml | Inference Model/Training Model |
| Model Name | mAP (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-YOLOE-L_vehicle | 63.9 | 32.84 / 9.03 | 176.60 / 176.60 | 196.1 M | PP-YOLOE-L_vehicle.yaml | Inference Model/Training Model |
| PP-YOLOE-S_vehicle | 61.3 | 9.79 / 3.48 | 54.14 / 46.69 | 28.8 M | PP-YOLOE-S_vehicle.yaml | Inference Model/Training Model |
| Model Name | AP (%) Easy/Medium/Hard | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| BlazeFace | 77.7/73.4/49.5 | 60.34 / 54.76 | 84.18 / 84.18 | 0.447 M | BlazeFace.yaml | Inference Model/Training Model |
| BlazeFace-FPN-SSH | 83.2/80.5/60.5 | 69.29 / 63.42 | 86.96 / 86.96 | 0.606 M | BlazeFace-FPN-SSH.yaml | Inference Model/Training Model |
| PicoDet_LCNet_x2_5_face | 93.7/90.7/68.1 | 35.37 / 12.88 | 126.24 / 126.24 | 28.9 M | PicoDet_LCNet_x2_5_face.yaml | Inference Model/Training Model |
| PP-YOLOE_plus-S_face | 93.9/91.8/79.8 | 22.54 / 8.33 | 138.67 / 138.67 | 26.5 M | PP-YOLOE_plus-S_face | Inference Model/Training Model |
Note: The above accuracy metrics are evaluated on the WIDER-FACE validation set with an input size of 640x640.
| Model Name | mIoU | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| STFPM | 0.9901 | 2.97 / 1.57 | 38.86 / 13.24 | 22.5 M | STFPM.yaml | Inference Model/Training Model |
| Model | Scheme | Input Size | AP(0.5:0.95) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|---|---|
| PP-TinyPose_128x96 | Top-Down | 128*96 | 58.4 | - | - | 4.9 | PP-TinyPose_128x96.yaml | Inference Model/Training Model |
| PP-TinyPose_256x192 | Top-Down | 256*192 | 68.3 | - | - | 4.9 | PP-TinyPose_256x192.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are based on the COCO dataset AP(0.5:0.95), with detection boxes obtained from ground truth annotations.
| Model | mAP(%) | NDS | yaml File | Model Download Link |
|---|---|---|---|---|
| BEVFusion | 53.9 | 60.9 | BEVFusion.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are based on the nuScenes validation set, reporting mAP(0.5:0.95) and NDS; the precision type is FP32.
## [Semantic Segmentation Module](../module_usage/tutorials/cv_modules/semantic_segmentation.en.md)

| Model Name | mIoU (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| Deeplabv3_Plus-R50 | 80.36 | 503.51 / 122.30 | 3543.91 / 3543.91 | 94.9 M | Deeplabv3_Plus-R50.yaml | Inference Model/Training Model |
| Deeplabv3_Plus-R101 | 81.10 | 803.79 / 175.45 | 5136.21 / 5136.21 | 162.5 M | Deeplabv3_Plus-R101.yaml | Inference Model/Training Model |
| Deeplabv3-R50 | 79.90 | 647.56 / 121.67 | 3803.09 / 3803.09 | 138.3 M | Deeplabv3-R50.yaml | Inference Model/Training Model |
| Deeplabv3-R101 | 80.85 | 950.43 / 178.50 | 5517.14 / 5517.14 | 205.9 M | Deeplabv3-R101.yaml | Inference Model/Training Model |
| OCRNet_HRNet-W18 | 80.67 | 286.12 / 80.76 | 1794.03 / 1794.03 | 43.1 M | OCRNet_HRNet-W18.yaml | Inference Model/Training Model |
| OCRNet_HRNet-W48 | 82.15 | 627.36 / 170.76 | 3531.61 / 3531.61 | 249.8 M | OCRNet_HRNet-W48.yaml | Inference Model/Training Model |
| PP-LiteSeg-T | 73.10 | 30.16 / 14.03 | 420.07 / 235.01 | 28.5 M | PP-LiteSeg-T.yaml | Inference Model/Training Model |
| PP-LiteSeg-B | 75.25 | 40.92 / 20.18 | 494.32 / 310.34 | 47.0 M | PP-LiteSeg-B.yaml | Inference Model/Training Model |
| SegFormer-B0 (slice) | 76.73 | 11.1946 | 268.929 | 13.2 M | SegFormer-B0.yaml | Inference Model/Training Model |
| SegFormer-B1 (slice) | 78.35 | 17.9998 | 403.393 | 48.5 M | SegFormer-B1.yaml | Inference Model/Training Model |
| SegFormer-B2 (slice) | 81.60 | 48.0371 | 1248.52 | 96.9 M | SegFormer-B2.yaml | Inference Model/Training Model |
| SegFormer-B3 (slice) | 82.47 | 64.341 | 1666.35 | 167.3 M | SegFormer-B3.yaml | Inference Model/Training Model |
| SegFormer-B4 (slice) | 82.38 | 82.4336 | 1995.42 | 226.7 M | SegFormer-B4.yaml | Inference Model/Training Model |
| SegFormer-B5 (slice) | 82.58 | 97.3717 | 2420.19 | 229.7 M | SegFormer-B5.yaml | Inference Model/Training Model |
| Model Name | mIoU (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| SeaFormer_base (slice) | 40.92 | 24.4073 | 397.574 | 30.8 M | SeaFormer_base.yaml | Inference Model/Training Model |
| SeaFormer_large (slice) | 43.66 | 27.8123 | 550.464 | 49.8 M | SeaFormer_large.yaml | Inference Model/Training Model |
| SeaFormer_small (slice) | 38.73 | 19.2295 | 358.343 | 14.3 M | SeaFormer_small.yaml | Inference Model/Training Model |
| SeaFormer_tiny (slice) | 34.58 | 13.9496 | 330.132 | 6.1 M | SeaFormer_tiny.yaml | Inference Model/Training Model |
| Model Name | Mask AP | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| Mask-RT-DETR-H | 50.6 | 172.36 / 172.36 | 1615.75 / 1615.75 | 449.9 M | Mask-RT-DETR-H.yaml | Inference Model/Training Model |
| Mask-RT-DETR-L | 45.7 | 88.18 / 88.18 | 1090.84 / 1090.84 | 113.6 M | Mask-RT-DETR-L.yaml | Inference Model/Training Model |
| Mask-RT-DETR-M | 42.7 | 78.69 / 78.69 | - | 66.6 M | Mask-RT-DETR-M.yaml | Inference Model/Training Model |
| Mask-RT-DETR-S | 41.0 | 33.5007 | - | 51.8 M | Mask-RT-DETR-S.yaml | Inference Model/Training Model |
| Mask-RT-DETR-X | 47.5 | 114.16 / 114.16 | 1240.92 / 1240.92 | 237.5 M | Mask-RT-DETR-X.yaml | Inference Model/Training Model |
| Cascade-MaskRCNN-ResNet50-FPN | 36.3 | 141.69 / 141.69 | - | 254.8 M | Cascade-MaskRCNN-ResNet50-FPN.yaml | Inference Model/Training Model |
| Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN | 39.1 | 147.62 / 147.62 | - | 254.7 M | Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet50-FPN | 35.6 | 118.30 / 118.30 | - | 157.5 M | MaskRCNN-ResNet50-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet50-vd-FPN | 36.4 | 118.34 / 118.34 | - | 157.5 M | MaskRCNN-ResNet50-vd-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet50 | 32.8 | 228.83 / 228.83 | - | 127.8 M | MaskRCNN-ResNet50.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet101-FPN | 36.6 | 148.14 / 148.14 | - | 225.4 M | MaskRCNN-ResNet101-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNet101-vd-FPN | 38.1 | 151.12 / 151.12 | - | 225.1 M | MaskRCNN-ResNet101-vd-FPN.yaml | Inference Model/Training Model |
| MaskRCNN-ResNeXt101-vd-FPN | 39.5 | 237.55 / 237.55 | - | 370.0 M | MaskRCNN-ResNeXt101-vd-FPN.yaml | Inference Model/Training Model |
| PP-YOLOE_seg-S | 32.5 | - | - | 31.5 M | PP-YOLOE_seg-S.yaml | Inference Model/Training Model |
| SOLOv2 | 35.5 | - | - | 179.1 M | SOLOv2.yaml | Inference Model/Training Model |
| Model | Detection Hmean (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-OCRv4_server_det | 82.56 | 83.34 / 80.91 | 442.58 / 442.58 | 109 | PP-OCRv4_server_det.yaml | Inference Model/Training Model |
| PP-OCRv4_mobile_det | 77.35 | 8.79 / 3.13 | 51.00 / 28.58 | 4.7 | PP-OCRv4_mobile_det.yaml | Inference Model/Training Model |
| PP-OCRv3_mobile_det | 78.68 | 8.44 / 2.91 | 27.87 / 27.87 | 2.1 | PP-OCRv3_mobile_det.yaml | Inference Model/Training Model |
| PP-OCRv3_server_det | 80.11 | 65.41 / 13.67 | 305.07 / 305.07 | 102.1 | PP-OCRv3_server_det.yaml | Inference Model/Training Model |
| Model Name | Detection Hmean (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-OCRv4_mobile_seal_det | 96.47 | 7.82 / 3.09 | 48.28 / 23.97 | 4.7M | PP-OCRv4_mobile_seal_det.yaml | Inference Model/Training Model |
| PP-OCRv4_server_seal_det | 98.21 | 74.75 / 67.72 | 382.55 / 382.55 | 108.3 M | PP-OCRv4_server_seal_det.yaml | Inference Model/Training Model |
| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-OCRv4_server_rec_doc | 81.53 | 6.65 / 2.38 | 32.92 / 32.92 | 74.7 M | PP-OCRv4_server_rec_doc.yaml | Inference Model/Training Model |
| PP-OCRv4_mobile_rec | 78.74 | 4.82 / 1.20 | 16.74 / 4.64 | 10.6 M | PP-OCRv4_mobile_rec.yaml | Inference Model/Training Model |
| PP-OCRv4_server_rec | 80.61 | 6.58 / 2.43 | 33.17 / 33.17 | 71.2 M | PP-OCRv4_server_rec.yaml | Inference Model/Training Model |
| PP-OCRv3_mobile_rec | 72.96 | 5.87 / 1.19 | 9.07 / 4.28 | 9.2 M | PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
Note: The evaluation set for the above accuracy metrics is a Chinese dataset built by PaddleOCR, covering multiple scenarios such as street view, web images, documents, and handwriting, with 8367 images for text recognition.
| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| ch_SVTRv2_rec | 68.81 | 8.08 / 2.74 | 50.17 / 42.50 | 73.9 M | ch_SVTRv2_rec.yaml | Inference Model/Training Model |
Note: The evaluation dataset for the above accuracy metrics is the PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition Task Leaderboard A.
| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| ch_RepSVTR_rec | 65.07 | 5.93 / 1.62 | 20.73 / 7.32 | 22.1 M | ch_RepSVTR_rec.yaml | Inference Model/Training Model |
Note: The evaluation dataset for the above accuracy metrics is the PaddleOCR Algorithm Model Challenge - Task 1: OCR End-to-End Recognition Task Leaderboard B.
English Recognition Model

| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| en_PP-OCRv4_mobile_rec | 70.39 | 4.81 / 0.75 | 16.10 / 5.31 | 6.8 M | en_PP-OCRv4_mobile_rec.yaml | Inference Model/Training Model |
| en_PP-OCRv3_mobile_rec | 70.69 | 5.44 / 0.75 | 8.65 / 5.57 | 7.8 M | en_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
Note: The evaluation set for the above accuracy metrics is an English dataset built by PaddleX.
Multilingual Recognition Model
| Model | Recognition Avg Accuracy(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| korean_PP-OCRv3_mobile_rec | 60.21 | 5.40 / 0.97 | 9.11 / 4.05 | 8.6 M | korean_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| japan_PP-OCRv3_mobile_rec | 45.69 | 5.70 / 1.02 | 8.48 / 4.07 | 8.8 M | japan_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| chinese_cht_PP-OCRv3_mobile_rec | 82.06 | 5.90 / 1.28 | 9.28 / 4.34 | 9.7 M | chinese_cht_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| te_PP-OCRv3_mobile_rec | 95.88 | 5.42 / 0.82 | 8.10 / 6.91 | 7.8 M | te_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| ka_PP-OCRv3_mobile_rec | 96.96 | 5.25 / 0.79 | 9.09 / 3.86 | 8.0 M | ka_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| ta_PP-OCRv3_mobile_rec | 76.83 | 5.23 / 0.75 | 10.13 / 4.30 | 8.0 M | ta_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| latin_PP-OCRv3_mobile_rec | 76.93 | 5.20 / 0.79 | 8.83 / 7.15 | 7.8 M | latin_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| arabic_PP-OCRv3_mobile_rec | 73.55 | 5.35 / 0.79 | 8.80 / 4.56 | 7.8 M | arabic_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| cyrillic_PP-OCRv3_mobile_rec | 94.28 | 5.23 / 0.76 | 8.89 / 3.88 | 7.9 M | cyrillic_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
| devanagari_PP-OCRv3_mobile_rec | 96.44 | 5.22 / 0.79 | 8.56 / 4.06 | 7.9 M | devanagari_PP-OCRv3_mobile_rec.yaml | Inference Model/Training Model |
Note: The evaluation set for the above accuracy metrics is a multi-language dataset built by PaddleX.
| Model | Avg-BLEU(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| UniMERNet | 86.13 | 2266.96/- | -/- | 1.4 G | UniMERNet.yaml | Inference Model/Training Model |
| PP-FormulaNet-S | 87.12 | 202.25/- | -/- | 167.9 M | PP-FormulaNet-S.yaml | Inference Model/Training Model |
| PP-FormulaNet-L | 92.13 | 1976.52/- | -/- | 535.2 M | PP-FormulaNet-L.yaml | Inference Model/Training Model |
| LaTeX_OCR_rec | 71.63 | -/- | -/- | 89.7 M | LaTeX_OCR_rec.yaml | Inference Model/Training Model |
| Model | Accuracy (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| SLANet | 59.52 | 103.08 / 103.08 | 197.99 / 197.99 | 6.9 M | SLANet.yaml | Inference Model/Training Model |
| SLANet_plus | 63.69 | 140.29 / 140.29 | 195.39 / 195.39 | 6.9 M | SLANet_plus.yaml | Inference Model/Training Model |
| SLANeXt_wired | 69.65 | -- | -- | -- | SLANeXt_wired.yaml | Inference Model/Training Model |
| SLANeXt_wireless | -- | -- | -- | -- | SLANeXt_wireless.yaml | Inference Model/Training Model |
| Model | Model Download Link | mAP(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | Introduction |
|---|---|---|---|---|---|---|
| RT-DETR-L_wired_table_cell_det | Inference Model/Training Model | 82.7 | 35.00 / 10.45 | 495.51 / 495.51 | 124M | RT-DETR is the first real-time end-to-end object detection model. The Baidu PaddlePaddle Vision Team, based on RT-DETR-L as the base model, has completed pretraining on a self-built table cell detection dataset, achieving good performance for both wired and wireless table cell detection. |
| RT-DETR-L_wireless_table_cell_det | Inference Model/Training Model | 82.7 | 35.00 / 10.45 | 495.51 / 495.51 | 124M | Same as above: the metrics and description are reported jointly for the wired and wireless table cell detection models. |
Note: The above accuracy metrics are measured from the internal table cell detection dataset of PaddleX.
## [Table Classification Module](../module_usage/tutorials/ocr_modules/table_classification.en.md)

| Model | Top1 Acc(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_table_cls | -- | -- | -- | -- | PP-LCNet_x1_0_table_cls.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are measured from the internal table classification dataset built by PaddleX.
## [Text Image Unwarping Module](../module_usage/tutorials/ocr_modules/text_image_unwarping.en.md)

| Model Name | MS-SSIM (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| UVDoc | 54.40 | 16.27 / 7.76 | 176.97 / 80.60 | 30.3 M | UVDoc.yaml | Inference Model/Training Model |
| Model | mAP(0.5) (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PicoDet_layout_1x_table | 97.5 | 8.02 / 3.09 | 23.70 / 20.41 | 7.4 M | PicoDet_layout_1x_table.yaml | Inference Model/Training Model |
| Model | mAP(0.5) (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PicoDet-S_layout_3cls | 88.2 | 8.99 / 2.22 | 16.11 / 8.73 | 4.8 | PicoDet-S_layout_3cls.yaml | Inference Model/Training Model |
| PicoDet-L_layout_3cls | 89.0 | 13.05 / 4.50 | 41.30 / 41.30 | 22.6 | PicoDet-L_layout_3cls.yaml | Inference Model/Training Model |
| RT-DETR-H_layout_3cls | 95.8 | 114.93 / 27.71 | 947.56 / 947.56 | 470.1 | RT-DETR-H_layout_3cls.yaml | Inference Model/Training Model |
| Model | mAP(0.5) (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PicoDet_layout_1x | 97.8 | 9.03 / 3.10 | 25.82 / 20.70 | 7.4 | PicoDet_layout_1x.yaml | Inference Model/Training Model |
| Model | mAP(0.5) (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PicoDet-S_layout_17cls | 87.4 | 9.11 / 2.12 | 15.42 / 9.12 | 4.8 | PicoDet-S_layout_17cls.yaml | Inference Model/Training Model |
| Model | Top-1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | 99.06 | 2.31 / 0.43 | 3.37 / 1.27 | 7 | PP-LCNet_x1_0_doc_ori.yaml | Inference Model/Training Model |
| Model | Top-1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | YAML File | Model Download Link |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | 99.06 | 2.31 / 0.43 | 3.37 / 1.27 | 7 | PP-LCNet_x0_25_textline_ori.yaml | Inference Model/Training Model |
Note: The evaluation dataset for the above accuracy metrics is a self-built dataset covering multiple scenarios such as certificates and documents, with 1,000 images.
| Model Name | MSE | MAE | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|---|
| DLinear | 0.382 | 0.394 | 72 K | DLinear.yaml | Inference Model/Training Model |
| NLinear | 0.386 | 0.392 | 40 K | NLinear.yaml | Inference Model/Training Model |
| Nonstationary | 0.600 | 0.515 | 55.5 M | Nonstationary.yaml | Inference Model/Training Model |
| PatchTST | 0.379 | 0.391 | 2.0 M | PatchTST.yaml | Inference Model/Training Model |
| RLinear | 0.385 | 0.392 | 40 K | RLinear.yaml | Inference Model/Training Model |
| TiDE | 0.407 | 0.414 | 31.7 M | TiDE.yaml | Inference Model/Training Model |
| TimesNet | 0.416 | 0.429 | 4.9 M | TimesNet.yaml | Inference Model/Training Model |
| Model Name | Precision | Recall | F1 Score | Model Storage Size | YAML File | Model Download Link |
|---|---|---|---|---|---|---|
| AutoEncoder_ad | 99.36 | 84.36 | 91.25 | 52 K | AutoEncoder_ad.yaml | Inference Model/Training Model |
| DLinear_ad | 98.98 | 93.96 | 96.41 | 112 K | DLinear_ad.yaml | Inference Model/Training Model |
| Nonstationary_ad | 98.55 | 88.95 | 93.51 | 1.8 M | Nonstationary_ad.yaml | Inference Model/Training Model |
| PatchTST_ad | 98.78 | 90.70 | 94.57 | 320 K | PatchTST_ad.yaml | Inference Model/Training Model |
| Model Name | acc(%) | Model Storage Size | yaml File | Model Download Link |
|---|---|---|---|---|
| TimesNet_cls | 87.5 | 792 K | TimesNet_cls.yaml | Inference Model/Training Model |
| Model | Training Data | Model Size | Word Error Rate | YAML File | Model Download Link |
|---|---|---|---|---|---|
| whisper_large | 680kh | 5.8G | 2.7 (Librispeech) | whisper_large.yaml | Inference Model |
| whisper_medium | 680kh | 2.9G | - | whisper_medium.yaml | Inference Model |
| whisper_small | 680kh | 923M | - | whisper_small.yaml | Inference Model |
| whisper_base | 680kh | 277M | - | whisper_base.yaml | Inference Model |
| whisper_tiny | 680kh | 145M | - | whisper_tiny.yaml | Inference Model |
| Model | Top1 Acc(%) | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|
| PP-TSM-R50_8frames_uniform | 74.36 | 93.4 M | PP-TSM-R50_8frames_uniform.yaml | Inference Model/Training Model |
| PP-TSMv2-LCNetV2_8frames_uniform | 71.71 | 22.5 M | PP-TSMv2-LCNetV2_8frames_uniform.yaml | Inference Model/Training Model |
| PP-TSMv2-LCNetV2_16frames_uniform | 73.11 | 22.5 M | PP-TSMv2-LCNetV2_16frames_uniform.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are based on the K400 validation set Top1 Acc.
## [Video Detection Module](../module_usage/tutorials/video_modules/video_detection.en.md)

| Model | Frame-mAP(@ IoU 0.5) | Model Storage Size (M) | yaml File | Model Download Link |
|---|---|---|---|---|
| YOWO | 80.94 | 462.891 M | YOWO.yaml | Inference Model/Training Model |
Note: The above accuracy metrics are based on the test dataset UCF101-24, using the Frame-mAP (@ IoU 0.5) metric.
## [Document Vision-Language Model Module](../module_usage/tutorials/vlm_modules/doc_vlm.en.md)

| Model | Model Storage Size (GB) | Model Download Link |
|---|---|---|
| PP-DocBee-2B | 4.2 | Inference Model |
| PP-DocBee-7B | 15.8 | Inference Model |
**Test Environment Description:**

**Performance Test Environment**

**Inference Mode Description**
| Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
|---|---|---|---|
| Normal Mode | FP32 Precision / No TRT Acceleration | FP32 Precision / 8 Threads | PaddleInference |
| High-Performance Mode | Optimal combination of pre-selected precision types and acceleration strategies | FP32 Precision / 8 Threads | Pre-selected optimal backend (Paddle/OpenVINO/TRT, etc.) |
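The High-Performance Mode referenced throughout the tables corresponds to PaddleX's high-performance inference plugin. As a minimal sketch only, assuming the plugin is installed and the pipeline name is valid for your PaddleX 3.x installation (parameter names may differ by version), it is typically enabled with a single flag when creating a pipeline:

```python
# Sketch: switching a pipeline from Normal Mode to High-Performance Mode.
# Assumes the PaddleX high-performance inference plugin (hpip) is installed;
# the pipeline name and the use_hpip flag follow the PaddleX 3.x API.
from paddlex import create_pipeline

pipeline = create_pipeline(
    pipeline="image_classification",  # any pipeline built from the modules above
    use_hpip=True,                    # enable the high-performance inference backend
)

# "test_image.jpg" is a placeholder input path.
for res in pipeline.predict("test_image.jpg"):
    res.print()
```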