Browse source code

Merge pull request #1844 from zhangyubo0722/add_models_1

add ResNet_vd, PP-LCNetV2, ConvNeXt models
cuicheng01 1 year ago
parent
commit
26f8fc73d1
33 changed files with 2870 additions and 6 deletions
  1. + 15 - 1  docs/tutorials/models/support_model_list.md
  2. + 38 - 0  paddlex/configs/image_classification/ConvNeXt_base_224.yaml
  3. + 38 - 0  paddlex/configs/image_classification/ConvNeXt_base_384.yaml
  4. + 38 - 0  paddlex/configs/image_classification/ConvNeXt_large_224.yaml
  5. + 38 - 0  paddlex/configs/image_classification/ConvNeXt_large_384.yaml
  6. + 38 - 0  paddlex/configs/image_classification/ConvNeXt_small.yaml
  7. + 38 - 0  paddlex/configs/image_classification/PP-LCNetV2_base.yaml
  8. + 38 - 0  paddlex/configs/image_classification/PP-LCNetV2_large.yaml
  9. + 38 - 0  paddlex/configs/image_classification/PP-LCNetV2_small.yaml
  10. + 38 - 0  paddlex/configs/image_classification/ResNet101_vd.yaml
  11. + 38 - 0  paddlex/configs/image_classification/ResNet152_vd.yaml
  12. + 38 - 0  paddlex/configs/image_classification/ResNet18_vd.yaml
  13. + 38 - 0  paddlex/configs/image_classification/ResNet200_vd.yaml
  14. + 38 - 0  paddlex/configs/image_classification/ResNet34_vd.yaml
  15. + 38 - 0  paddlex/configs/image_classification/ResNet50_vd.yaml
  16. + 28 - 1  paddlex/modules/base/predictor/utils/official_models.py
  17. + 14 - 1  paddlex/modules/image_classification/model_list.py
  18. + 5 - 2  paddlex/repo_apis/PaddleClas_api/cls/config.py
  19. + 104 - 1  paddlex/repo_apis/PaddleClas_api/cls/register.py
  20. + 177 - 0  paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_base_224.yaml
  21. + 177 - 0  paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_base_384.yaml
  22. + 177 - 0  paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_large_224.yaml
  23. + 177 - 0  paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_large_384.yaml
  24. + 177 - 0  paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_small.yaml
  25. + 145 - 0  paddlex/repo_apis/PaddleClas_api/configs/PP-LCNetV2_base.yaml
  26. + 145 - 0  paddlex/repo_apis/PaddleClas_api/configs/PP-LCNetV2_large.yaml
  27. + 145 - 0  paddlex/repo_apis/PaddleClas_api/configs/PP-LCNetV2_small.yaml
  28. + 142 - 0  paddlex/repo_apis/PaddleClas_api/configs/ResNet101_vd.yaml
  29. + 142 - 0  paddlex/repo_apis/PaddleClas_api/configs/ResNet152_vd.yaml
  30. + 142 - 0  paddlex/repo_apis/PaddleClas_api/configs/ResNet18_vd.yaml
  31. + 142 - 0  paddlex/repo_apis/PaddleClas_api/configs/ResNet200_vd.yaml
  32. + 142 - 0  paddlex/repo_apis/PaddleClas_api/configs/ResNet34_vd.yaml
  33. + 142 - 0  paddlex/repo_apis/PaddleClas_api/configs/ResNet50_vd.yaml

+ 15 - 1
docs/tutorials/models/support_model_list.md

@@ -5,11 +5,17 @@
 | 模型名称 | config |
 | :--- | :---: |
 | ResNet18 | [ResNet18.yaml](../../../paddlex/configs/image_classification/ResNet18.yaml)|
+| ResNet18_vd | [ResNet18_vd.yaml](../../../paddlex/configs/image_classification/ResNet18_vd.yaml)|
 | ResNet34 | [ResNet34.yaml](../../../paddlex/configs/image_classification/ResNet34.yaml)|
+| ResNet34_vd | [ResNet34_vd.yaml](../../../paddlex/configs/image_classification/ResNet34_vd.yaml)|
 | ResNet50 | [ResNet50.yaml](../../../paddlex/configs/image_classification/ResNet50.yaml)|
+| ResNet50_vd | [ResNet50_vd.yaml](../../../paddlex/configs/image_classification/ResNet50_vd.yaml)|
 | ResNet101 | [ResNet101.yaml](../../../paddlex/configs/image_classification/ResNet101.yaml)|
+| ResNet101_vd | [ResNet101_vd.yaml](../../../paddlex/configs/image_classification/ResNet101_vd.yaml)|
 | ResNet152 | [ResNet152.yaml](../../../paddlex/configs/image_classification/ResNet152.yaml)|
-### 2.PP-LCNet 系列
+| ResNet152_vd | [ResNet152_vd.yaml](../../../paddlex/configs/image_classification/ResNet152_vd.yaml)|
+| ResNet200_vd | [ResNet200_vd.yaml](../../../paddlex/configs/image_classification/ResNet200_vd.yaml)|
+### 2.PP-LCNet & PP-LCNetV2 系列
 | 模型名称 | config |
 | :--- | :---: |
 | PP-LCNet_x0_25 | [PP-LCNet_x0_25.yaml](../../../paddlex/configs/image_classification/PP-LCNet_x0_25.yaml)|
@@ -20,6 +26,9 @@
 | PP-LCNet_x1_5 | [PP-LCNet_x1_5.yaml](../../../paddlex/configs/image_classification/PP-LCNet_x1_5.yaml)|
 | PP-LCNet_x2_0 | [PP-LCNet_x2_0.yaml](../../../paddlex/configs/image_classification/PP-LCNet_x2_0.yaml)|
 | PP-LCNet_x2_5 | [PP-LCNet_x2_5.yaml](../../../paddlex/configs/image_classification/PP-LCNet_x2_5.yaml)|
+| PP-LCNetV2_small | [PP-LCNetV2_small.yaml](../../../paddlex/configs/image_classification/PP-LCNetV2_small.yaml)|
+| PP-LCNetV2_base | [PP-LCNetV2_base.yaml](../../../paddlex/configs/image_classification/PP-LCNetV2_base.yaml)|
+| PP-LCNetV2_large | [PP-LCNetV2_large.yaml](../../../paddlex/configs/image_classification/PP-LCNetV2_large.yaml)|
 ### 3.MobileNetV2 系列
 | 模型名称 | config |
 | :--- | :---: |
@@ -60,6 +69,11 @@
 | 模型名称 | config |
 | :--- | :---: |
 | ConvNeXt_tiny | [ConvNeXt_tiny.yaml](../../../paddlex/configs/image_classification/ConvNeXt_tiny.yaml)|
+| ConvNeXt_small | [ConvNeXt_small.yaml](../../../paddlex/configs/image_classification/ConvNeXt_small.yaml)|
+| ConvNeXt_base_224 | [ConvNeXt_base_224.yaml](../../../paddlex/configs/image_classification/ConvNeXt_base_224.yaml)|
+| ConvNeXt_base_384 | [ConvNeXt_base_384.yaml](../../../paddlex/configs/image_classification/ConvNeXt_base_384.yaml)|
+| ConvNeXt_large_224 | [ConvNeXt_large_224.yaml](../../../paddlex/configs/image_classification/ConvNeXt_large_224.yaml)|
+| ConvNeXt_large_384 | [ConvNeXt_large_384.yaml](../../../paddlex/configs/image_classification/ConvNeXt_large_384.yaml)|
 ### 9.SwinTransformer系列
 | 模型名称 | config |
 | :--- | :---: |

+ 38 - 0
paddlex/configs/image_classification/ConvNeXt_base_224.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ConvNeXt_base_224
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 128
+  learning_rate: 0.004
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ConvNeXt_base_384.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ConvNeXt_base_384
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 128
+  learning_rate: 0.004
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ConvNeXt_large_224.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ConvNeXt_large_224
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 128
+  learning_rate: 0.004
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ConvNeXt_large_384.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ConvNeXt_large_384
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 128
+  learning_rate: 0.004
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ConvNeXt_small.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ConvNeXt_small
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 128
+  learning_rate: 0.004
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/PP-LCNetV2_base.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: PP-LCNetV2_base
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 500
+  learning_rate: 0.8
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/PP-LCNetV2_large.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: PP-LCNetV2_large
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 250
+  learning_rate: 0.4
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/PP-LCNetV2_small.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: PP-LCNetV2_small
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 500
+  learning_rate: 0.8
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ResNet101_vd.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ResNet101_vd
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 64
+  learning_rate: 0.1
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ResNet152_vd.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ResNet152_vd
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 64
+  learning_rate: 0.1
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ResNet18_vd.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ResNet18_vd
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 64
+  learning_rate: 0.1
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ResNet200_vd.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ResNet200_vd
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 64
+  learning_rate: 0.1
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ResNet34_vd.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ResNet34_vd
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 64
+  learning_rate: 0.1
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 38 - 0
paddlex/configs/image_classification/ResNet50_vd.yaml

@@ -0,0 +1,38 @@
+Global:
+  model: ResNet50_vd
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/cls_flowers_examples"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert: 
+    enable: False
+    src_dataset_type: null
+  split: 
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 102
+  epochs_iters: 20
+  batch_size: 64
+  learning_rate: 0.1
+  pretrain_weight_path: null
+  warmup_steps: 5
+  resume_path: null
+  log_interval: 1
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model.pdparams"
+  log_interval: 1
+
+Predict:
+  model_dir: "output/best_model"
+  input_path: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg"
+  kernel_option:
+    run_mode: paddle
+    batch_size: 1

+ 28 - 1
paddlex/modules/base/predictor/utils/official_models.py

@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-
 from pathlib import Path
 
 from .....utils.cache import CACHE_DIR
@@ -21,14 +20,26 @@ from .....utils.download import download_and_extract
 OFFICIAL_MODELS = {
     "ResNet18":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet18_infer.tar",
+    "ResNet18_vd":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet18_vd_infer.tar",
     "ResNet34":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet34_infer.tar",
+    "ResNet34_vd":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet34_vd_infer.tar",
     "ResNet50":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet50_infer.tar",
+    "ResNet50_vd":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet50_vd_infer.tar",
     "ResNet101":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet101_infer.tar",
+    "ResNet101_vd":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet101_vd_infer.tar",
     "ResNet152":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet152_infer.tar",
+    "ResNet152_vd":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet152_vd_infer.tar",
+    "ResNet200_vd":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ResNet200_vd_infer.tar",
     "PP-LCNet_x0_25":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/PP-LCNet_x0_25_infer.tar",
     "PP-LCNet_x0_35":
@@ -45,6 +56,12 @@ OFFICIAL_MODELS = {
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/PP-LCNet_x2_5_infer.tar",
     "PP-LCNet_x2_0":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/PP-LCNet_x2_0_infer.tar",
+    "PP-LCNetV2_small":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/PP-LCNetV2_small_infer.tar",
+    "PP-LCNetV2_base":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/PP-LCNetV2_base_infer.tar",
+    "PP-LCNetV2_large":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/PP-LCNetV2_large_infer.tar",
     "MobileNetV3_large_x0_35":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/\
 MobileNetV3_large_x0_35_infer.tar",
@@ -77,6 +94,16 @@ MobileNetV3_small_x1_0_infer.tar",
 MobileNetV3_small_x1_25_infer.tar",
     "ConvNeXt_tiny":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ConvNeXt_tiny_infer.tar",
+    "ConvNeXt_small":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ConvNeXt_small_infer.tar",
+    "ConvNeXt_base_224":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ConvNeXt_base_224_infer.tar",
+    "ConvNeXt_base_384":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ConvNeXt_base_384_infer.tar",
+    "ConvNeXt_large_224":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ConvNeXt_large_224_infer.tar",
+    "ConvNeXt_large_384":
+    "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/ConvNeXt_large_384_infer.tar",
     "MobileNetV2_x0_25":
     "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0/\
 MobileNetV2_x0_25_infer.tar",
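The entries above extend a plain name→URL table that the predictor uses to fetch official inference archives. A minimal standalone sketch of the lookup side, assuming nothing beyond the dict itself (the `model_url` helper is hypothetical; the real module wires the table into `download_and_extract` with `CACHE_DIR`):

```python
# Subset of the mapping added in this PR.
OFFICIAL_MODELS = {
    "ResNet50_vd": "https://paddle-model-ecology.bj.bcebos.com/paddlex/"
                   "official_inference_model/paddle3.0/ResNet50_vd_infer.tar",
    "PP-LCNetV2_small": "https://paddle-model-ecology.bj.bcebos.com/paddlex/"
                        "official_inference_model/paddle3.0/"
                        "PP-LCNetV2_small_infer.tar",
}

def model_url(name: str) -> str:
    """Return the download URL for an official model, or raise KeyError."""
    if name not in OFFICIAL_MODELS:
        raise KeyError(f"{name!r} has no official inference model")
    return OFFICIAL_MODELS[name]

# The archive name follows the <model>_infer.tar convention visible above.
print(model_url("ResNet50_vd").rsplit("/", 1)[-1])
```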

+ 14 - 1
paddlex/modules/image_classification/model_list.py

@@ -12,11 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-
 MODELS = [
     'CLIP_vit_base_patch16_224',
     'CLIP_vit_large_patch14_224',
     'ConvNeXt_tiny',
+    'ConvNeXt_small',
+    'ConvNeXt_base_224',
+    'ConvNeXt_base_384',
+    'ConvNeXt_large_224',
+    'ConvNeXt_large_384',
     'MobileNetV2_x0_25',
     'MobileNetV2_x0_5',
     'MobileNetV2_x1_0',
@@ -44,10 +48,19 @@ MODELS = [
     'PP-LCNet_x1_5',
     'PP-LCNet_x2_0',
     'PP-LCNet_x2_5',
+    'PP-LCNetV2_small',
+    'PP-LCNetV2_base',
+    'PP-LCNetV2_large',
     'ResNet101',
     'ResNet152',
     'ResNet18',
     'ResNet34',
     'ResNet50',
+    'ResNet200_vd',
+    'ResNet101_vd',
+    'ResNet152_vd',
+    'ResNet18_vd',
+    'ResNet34_vd',
+    'ResNet50_vd',
     'SwinTransformer_base_patch4_window7_224',
 ]
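A quick sanity check worth running after edits like the one above is that `MODELS` stays free of duplicates; a sketch over just the names added in this PR:

```python
# Only the entries this PR appends to MODELS.
MODELS = [
    'ConvNeXt_small', 'ConvNeXt_base_224', 'ConvNeXt_base_384',
    'ConvNeXt_large_224', 'ConvNeXt_large_384',
    'PP-LCNetV2_small', 'PP-LCNetV2_base', 'PP-LCNetV2_large',
    'ResNet18_vd', 'ResNet34_vd', 'ResNet50_vd',
    'ResNet101_vd', 'ResNet152_vd', 'ResNet200_vd',
]

# Duplicate names would silently shadow each other downstream.
assert len(MODELS) == len(set(MODELS)), "duplicate model name"
print(len(MODELS))
```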

+ 5 - 2
paddlex/repo_apis/PaddleClas_api/cls/config.py

@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-
 import yaml
 from typing import Union
 from paddleclas.ppcls.utils.config import get_config, override_config
@@ -110,7 +109,11 @@ class ClsConfig(BaseConfig):
             ValueError: `mode` error.
         """
         if mode == 'train':
-            _cfg = [f'DataLoader.Train.sampler.batch_size={batch_size}']
+            if self.DataLoader["Train"]["sampler"].get("batch_size", False):
+                _cfg = [f'DataLoader.Train.sampler.batch_size={batch_size}']
+            else:
+                # Multi-scale samplers take the batch size via first_bs;
+                # both overrides must be emitted in one list (a second
+                # plain assignment would discard the first).
+                _cfg = [
+                    f'DataLoader.Train.sampler.first_bs={batch_size}',
+                    'DataLoader.Train.dataset.name=MultiScaleDataset',
+                ]
         elif mode == 'eval':
             _cfg = [f'DataLoader.Eval.sampler.batch_size={batch_size}']
         elif mode == 'test':
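The branch above picks between two sampler keys: conventional samplers are configured through `DataLoader.Train.sampler.batch_size`, while multi-scale training goes through `first_bs` plus a `MultiScaleDataset` dataset override, and both override strings must travel together. A standalone sketch of that logic over plain dicts (the `batch_size_overrides` helper is hypothetical; the key names come from the diff):

```python
def batch_size_overrides(cfg: dict, batch_size: int) -> list:
    """Build override strings for the train dataloader, choosing between a
    fixed-size sampler (batch_size) and a multi-scale sampler (first_bs)."""
    sampler = cfg["DataLoader"]["Train"]["sampler"]
    if "batch_size" in sampler:
        return [f"DataLoader.Train.sampler.batch_size={batch_size}"]
    # Multi-scale samplers take the first batch size via first_bs and
    # require the MultiScaleDataset dataset class; emit both together.
    return [
        f"DataLoader.Train.sampler.first_bs={batch_size}",
        "DataLoader.Train.dataset.name=MultiScaleDataset",
    ]

fixed = {"DataLoader": {"Train": {"sampler": {"batch_size": 128}}}}
multi = {"DataLoader": {"Train": {"sampler": {"first_bs": 64}}}}
print(batch_size_overrides(fixed, 256))
print(batch_size_overrides(multi, 256))
```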

+ 104 - 1
paddlex/repo_apis/PaddleClas_api/cls/register.py

@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-
 import os
 import os.path as osp
 
@@ -107,6 +106,30 @@ register_model_info({
 })
 
 register_model_info({
+    'model_name': 'PP-LCNetV2_small',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'PP-LCNetV2_small.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
+    'model_name': 'PP-LCNetV2_base',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'PP-LCNetV2_base.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
+    'model_name': 'PP-LCNetV2_large',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'PP-LCNetV2_large.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
     'model_name': 'CLIP_vit_base_patch16_224',
     'suite': 'Cls',
     'config_path': osp.join(PDX_CONFIG_DIR, 'CLIP_vit_base_patch16_224.yaml'),
@@ -163,6 +186,14 @@ register_model_info({
 })
 
 register_model_info({
+    'model_name': 'ResNet18_vd',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet18_vd.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
     'model_name': 'ResNet34',
     'suite': 'Cls',
     'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet34.yaml'),
@@ -171,6 +202,14 @@ register_model_info({
 })
 
 register_model_info({
+    'model_name': 'ResNet34_vd',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet34_vd.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
     'model_name': 'ResNet50',
     'suite': 'Cls',
     'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet50.yaml'),
@@ -179,6 +218,14 @@ register_model_info({
 })
 
 register_model_info({
+    'model_name': 'ResNet50_vd',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet50_vd.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
     'model_name': 'ResNet101',
     'suite': 'Cls',
     'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet101.yaml'),
@@ -187,6 +234,14 @@ register_model_info({
 })
 
 register_model_info({
+    'model_name': 'ResNet101_vd',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet101_vd.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
     'model_name': 'ResNet152',
     'suite': 'Cls',
     'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet152.yaml'),
@@ -195,6 +250,22 @@ register_model_info({
 })
 
 register_model_info({
+    'model_name': 'ResNet152_vd',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet152_vd.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
+    'model_name': 'ResNet200_vd',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ResNet200_vd.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
     'model_name': 'MobileNetV2_x0_25',
     'suite': 'Cls',
     'config_path': osp.join(PDX_CONFIG_DIR, 'MobileNetV2_x0_25.yaml'),
@@ -321,3 +392,35 @@ register_model_info({
     'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
     'infer_config': 'deploy/configs/inference_cls.yaml'
 })
+
+register_model_info({
+    'model_name': 'ConvNeXt_small',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ConvNeXt_small.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
+    'model_name': 'ConvNeXt_base_224',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ConvNeXt_base_224.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
+    'model_name': 'ConvNeXt_base_384',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ConvNeXt_base_384.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})
+
+register_model_info({
+    'model_name': 'ConvNeXt_large_224',
+    'suite': 'Cls',
+    'config_path': osp.join(PDX_CONFIG_DIR, 'ConvNeXt_large_224.yaml'),
+    'supported_apis': ['train', 'evaluate', 'predict', 'export', 'infer'],
+    'infer_config': 'deploy/configs/inference_cls.yaml'
+})

+ 177 - 0
paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_base_224.yaml

@@ -0,0 +1,177 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 300
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+  # training model under @to_static
+  to_static: False
+  update_freq: 4  # for 8 cards
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ConvNeXt_base_224
+  class_num: 102
+  drop_path_rate: 0.1
+  layer_scale_init_value: 1e-6
+  head_init_scale: 1.0
+
+
+# loss function config for traing/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  one_dim_param_no_weight_decay: True
+  lr:
+    # for 8 cards
+    name: Cosine
+    learning_rate: 4e-3  # lr 4e-3 for total_batch_size 4096
+    eta_min: 1e-6
+    warmup_epoch: 20
+    warmup_start_lr: 0
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+            interpolation: bicubic
+            backend: pil
+        - RandFlipImage:
+            flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: True
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+            interpolation: bicubic
+            backend: pil
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+        interpolation: bicubic
+        backend: pil
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+
+Metric:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]
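The learning-rate comment in this config (`lr 4e-3 for total_batch_size 4096`) is self-consistent with the other settings: per-card batch size 128, the "for 8 cards" comments, and gradient accumulation via `update_freq: 4`. A quick arithmetic check:

```python
# Verify the ConvNeXt config's comments are self-consistent:
# per-card batch x cards x gradient-accumulation steps = total batch size
# assumed by the "lr 4e-3 for total_batch_size 4096" comment.
per_card_batch = 128   # DataLoader.Train.sampler.batch_size
num_cards = 8          # "for 8 cards" comments
update_freq = 4        # Global.update_freq (gradient accumulation)

total_batch_size = per_card_batch * num_cards * update_freq
print(total_batch_size)  # 4096
```

Under the common linear-scaling convention, training with a different total batch size would call for scaling the learning rate proportionally.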

+ 177 - 0
paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_base_384.yaml

@@ -0,0 +1,177 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 300
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 384, 384]
+  save_inference_dir: ./inference
+  # train the model under @to_static
+  to_static: False
+  update_freq: 4  # for 8 cards
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ConvNeXt_base_384
+  class_num: 102
+  drop_path_rate: 0.1
+  layer_scale_init_value: 1e-6
+  head_init_scale: 1.0
+
+
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  one_dim_param_no_weight_decay: True
+  lr:
+    # for 8 cards
+    name: Cosine
+    learning_rate: 4e-3  # lr 4e-3 for total_batch_size 4096
+    eta_min: 1e-6
+    warmup_epoch: 20
+    warmup_start_lr: 0
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 384
+            interpolation: bicubic
+            backend: pil
+        - RandFlipImage:
+            flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 384
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: True
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 384
+            interpolation: bicubic
+            backend: pil
+        - CropImage:
+            size: 384
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 384
+        interpolation: bicubic
+        backend: pil
+    - CropImage:
+        size: 384
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+
+Metric:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]
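The ConvNeXt configs all use a cosine schedule with linear warmup (`warmup_epoch: 20`, `warmup_start_lr: 0`, `eta_min: 1e-6` over 300 epochs). A hedged sketch of the schedule math — illustrative only, not PaddlePaddle's scheduler implementation:

```python
# Sketch of Cosine-with-warmup as described by the ConvNeXt configs:
# linear warmup from warmup_start_lr for warmup_epoch epochs, then cosine
# decay from base_lr down to eta_min at the final epoch. Illustrative math,
# not PaddlePaddle's actual LRScheduler code.
import math

def lr_at_epoch(epoch, base_lr=4e-3, eta_min=1e-6, warmup_epoch=20,
                warmup_start_lr=0.0, total_epochs=300):
    if epoch < warmup_epoch:
        # linear warmup toward base_lr
        return warmup_start_lr + (base_lr - warmup_start_lr) * epoch / warmup_epoch
    # cosine decay over the remaining epochs
    progress = (epoch - warmup_epoch) / (total_epochs - warmup_epoch)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * progress))

print(lr_at_epoch(0), lr_at_epoch(20), lr_at_epoch(300))
```

At epoch 0 the rate is `warmup_start_lr` (0), at epoch 20 it peaks at 4e-3, and by epoch 300 it has decayed to `eta_min`.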

+ 177 - 0
paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_large_224.yaml

@@ -0,0 +1,177 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 300
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+  # train the model under @to_static
+  to_static: False
+  update_freq: 4  # for 8 cards
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ConvNeXt_large_224
+  class_num: 102
+  drop_path_rate: 0.1
+  layer_scale_init_value: 1e-6
+  head_init_scale: 1.0
+
+
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  one_dim_param_no_weight_decay: True
+  lr:
+    # for 8 cards
+    name: Cosine
+    learning_rate: 4e-3  # lr 4e-3 for total_batch_size 4096
+    eta_min: 1e-6
+    warmup_epoch: 20
+    warmup_start_lr: 0
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+            interpolation: bicubic
+            backend: pil
+        - RandFlipImage:
+            flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: True
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+            interpolation: bicubic
+            backend: pil
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+        interpolation: bicubic
+        backend: pil
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+
+Metric:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]

+ 177 - 0
paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_large_384.yaml

@@ -0,0 +1,177 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 300
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 384, 384]
+  save_inference_dir: ./inference
+  # train the model under @to_static
+  to_static: False
+  update_freq: 4  # for 8 cards
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ConvNeXt_large_384
+  class_num: 102
+  drop_path_rate: 0.1
+  layer_scale_init_value: 1e-6
+  head_init_scale: 1.0
+
+
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  one_dim_param_no_weight_decay: True
+  lr:
+    # for 8 cards
+    name: Cosine
+    learning_rate: 4e-3  # lr 4e-3 for total_batch_size 4096
+    eta_min: 1e-6
+    warmup_epoch: 20
+    warmup_start_lr: 0
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 384
+            interpolation: bicubic
+            backend: pil
+        - RandFlipImage:
+            flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 384
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: True
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 384
+            interpolation: bicubic
+            backend: pil
+        - CropImage:
+            size: 384
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 384
+        interpolation: bicubic
+        backend: pil
+    - CropImage:
+        size: 384
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+
+Metric:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]

+ 177 - 0
paddlex/repo_apis/PaddleClas_api/configs/ConvNeXt_small.yaml

@@ -0,0 +1,177 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 300
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+  # train the model under @to_static
+  to_static: False
+  update_freq: 4  # for 8 cards
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ConvNeXt_small
+  class_num: 102
+  drop_path_rate: 0.1
+  layer_scale_init_value: 1e-6
+  head_init_scale: 1.0
+
+
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  one_dim_param_no_weight_decay: True
+  lr:
+    # for 8 cards
+    name: Cosine
+    learning_rate: 4e-3  # lr 4e-3 for total_batch_size 4096
+    eta_min: 1e-6
+    warmup_epoch: 20
+    warmup_start_lr: 0
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+            interpolation: bicubic
+            backend: pil
+        - RandFlipImage:
+            flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: True
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+            interpolation: bicubic
+            backend: pil
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+        interpolation: bicubic
+        backend: pil
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+
+Metric:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]

+ 145 - 0
paddlex/repo_apis/PaddleClas_api/configs/PP-LCNetV2_base.yaml

@@ -0,0 +1,145 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 480
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: PPLCNetV2_base
+  class_num: 102
+
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.8
+    warmup_epoch: 5
+  regularizer:
+    name: 'L2'
+    coeff: 0.00004
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: MultiScaleDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+
+    # width and height can also be specified separately, e.g.:
+    # scales: [(160, 160), (192, 192), (224, 224), (288, 288), (320, 320)]
+    sampler:
+      name: MultiScaleSampler
+      scales: [160, 192, 224, 288, 320]
+      # first_bs: batch size for the first image resolution in the scales list
+      # divided_factor: ensures the width and height are divisible by the downsampling stride
+      first_bs: 500
+      divided_factor: 32
+      is_training: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+    - TopkAcc:
+        topk: [1, 5]
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]
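The PP-LCNetV2 configs train at multiple scales, giving a batch size (`first_bs`) only for the first scale. One common way such a sampler works is to keep the per-batch pixel count roughly constant, shrinking the batch as resolution grows, and to round spatial sizes to a multiple of `divided_factor`. A hedged sketch of that idea — not PaddleClas's exact `MultiScaleSampler` implementation:

```python
# Hedged sketch of per-scale batch sizing for a multi-scale sampler:
# hold batch * H * W roughly constant relative to the first scale, and
# round spatial sizes up to a multiple of divided_factor. Mirrors the
# intent of the config's first_bs/divided_factor knobs, not the exact
# PaddleClas MultiScaleSampler code.
import math

def make_divisible(size, divided_factor=32):
    """Round size up to the nearest multiple of divided_factor."""
    return int(math.ceil(size / divided_factor) * divided_factor)

def batch_size_for_scale(scale, first_scale=160, first_bs=500):
    """Scale the batch inversely with image area to keep memory roughly flat."""
    return max(1, int(first_bs * (first_scale * first_scale) / (scale * scale)))

for s in [160, 192, 224, 288, 320]:
    print(make_divisible(s), batch_size_for_scale(s))
```

With `first_bs: 500` at 160 px, this heuristic would use about 255 images per batch at 224 px and 125 at 320 px, which is why the large variant halves `first_bs` to 250 rather than the batch sizes at every scale.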

+ 145 - 0
paddlex/repo_apis/PaddleClas_api/configs/PP-LCNetV2_large.yaml

@@ -0,0 +1,145 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 480
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: PPLCNetV2_large
+  class_num: 102
+
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.4
+    warmup_epoch: 5
+  regularizer:
+    name: 'L2'
+    coeff: 0.00004
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: MultiScaleDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+
+    # width and height can also be specified separately, e.g.:
+    # scales: [(160, 160), (192, 192), (224, 224), (288, 288), (320, 320)]
+    sampler:
+      name: MultiScaleSampler
+      scales: [160, 192, 224, 288, 320]
+      # first_bs: batch size for the first image resolution in the scales list
+      # divided_factor: ensures the width and height are divisible by the downsampling stride
+      first_bs: 250
+      divided_factor: 32
+      is_training: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+    - TopkAcc:
+        topk: [1, 5]
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]

+ 145 - 0
paddlex/repo_apis/PaddleClas_api/configs/PP-LCNetV2_small.yaml

@@ -0,0 +1,145 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 480
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: PPLCNetV2_small
+  class_num: 102
+
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.8
+    warmup_epoch: 5
+  regularizer:
+    name: 'L2'
+    coeff: 0.00002
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: MultiScaleDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+
+    # width and height can also be specified separately, e.g.:
+    # scales: [(160, 160), (192, 192), (224, 224), (288, 288), (320, 320)]
+    sampler:
+      name: MultiScaleSampler
+      scales: [160, 192, 224, 288, 320]
+      # first_bs: batch size for the first image resolution in the scales list
+      # divided_factor: ensures the width and height are divisible by the downsampling stride
+      first_bs: 500
+      divided_factor: 32
+      is_training: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+    - TopkAcc:
+        topk: [1, 5]
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]

+ 142 - 0
paddlex/repo_apis/PaddleClas_api/configs/ResNet101_vd.yaml

@@ -0,0 +1,142 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 200
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ResNet101_vd
+  class_num: 102
+ 
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.1
+  regularizer:
+    name: 'L2'
+    coeff: 0.0001
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+      batch_transform_ops:
+        - MixupOperator:
+            alpha: 0.2
+
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]
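The `batch_transform_ops` entry above enables Mixup with `alpha: 0.2`. As a rough NumPy sketch of what that batch transform does (a hypothetical helper, not the PaddleClas `MixupOperator` implementation): sample a Beta(alpha, alpha) coefficient and blend each sample with a randomly paired one, for both images and one-hot labels.

```python
import numpy as np

def mixup_batch(images, labels, alpha=0.2, rng=None):
    """Sketch of mixup over one batch: blend images and one-hot
    labels with a Beta(alpha, alpha) coefficient."""
    rng = rng or np.random.default_rng(0)
    lam = float(rng.beta(alpha, alpha))   # mixing coefficient in [0, 1]
    idx = rng.permutation(len(images))    # pair each sample with another
    mixed_x = lam * images + (1 - lam) * images[idx]
    mixed_y = lam * labels + (1 - lam) * labels[idx]
    return mixed_x, mixed_y, lam

# toy batch: 4 "images" and one-hot labels over 3 classes
x = np.arange(4 * 2 * 2, dtype=np.float32).reshape(4, 2, 2)
y = np.eye(3, dtype=np.float32)[[0, 1, 2, 0]]
mx, my, lam = mixup_batch(x, y)
```

With a small `alpha` such as 0.2, `lam` is usually near 0 or 1, so most mixed samples stay close to one of the originals; note the mixed label rows still sum to 1.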

+ 142 - 0
paddlex/repo_apis/PaddleClas_api/configs/ResNet152_vd.yaml

@@ -0,0 +1,142 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 200
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ResNet152_vd
+  class_num: 102
+ 
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.1
+  regularizer:
+    name: 'L2'
+    coeff: 0.0001
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+      batch_transform_ops:
+        - MixupOperator:
+            alpha: 0.2
+
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]
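The `Optimizer.lr` block above (`name: Cosine`, `learning_rate: 0.1`) decays the learning rate over `Global.epochs: 200`. A simplified per-epoch sketch of that schedule (no warmup, and coarser than PaddleClas's per-step decay):

```python
import math

def cosine_lr(epoch, base_lr=0.1, total_epochs=200):
    """Cosine decay: base_lr at epoch 0, half at mid-training,
    approaching 0 at the final epoch."""
    return 0.5 * base_lr * (1 + math.cos(math.pi * epoch / total_epochs))

print(cosine_lr(0))    # 0.1
print(cosine_lr(100))  # 0.05
print(cosine_lr(200))  # 0.0
```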

+ 142 - 0
paddlex/repo_apis/PaddleClas_api/configs/ResNet18_vd.yaml

@@ -0,0 +1,142 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 200
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ResNet18_vd
+  class_num: 102
+ 
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.1
+  regularizer:
+    name: 'L2'
+    coeff: 0.00007
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+      batch_transform_ops:
+        - MixupOperator:
+            alpha: 0.2
+
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]
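The `NormalizeImage` step above first rescales pixel values by `1.0/255.0` and then standardizes each channel with the ImageNet mean/std. A minimal NumPy sketch of that arithmetic on an HWC `uint8` image (assumed layout; `order: ''` leaves HWC unchanged here):

```python
import numpy as np

# Per-channel ImageNet statistics from the config above.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(img_uint8):
    """scale by 1/255, then (x - mean) / std per channel."""
    x = img_uint8.astype(np.float32) / 255.0
    return (x - MEAN) / STD   # broadcasts over the last (channel) axis

img = np.full((224, 224, 3), 128, dtype=np.uint8)  # dummy gray image
out = normalize(img)
```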

+ 142 - 0
paddlex/repo_apis/PaddleClas_api/configs/ResNet200_vd.yaml

@@ -0,0 +1,142 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 200
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ResNet200_vd
+  class_num: 102
+ 
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.1
+  regularizer:
+    name: 'L2'
+    coeff: 0.0001
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+      batch_transform_ops:
+        - MixupOperator:
+            alpha: 0.2
+
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]
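The `PostProcess` block above (`name: Topk`, `topk: 5`) reports the k highest-scoring class ids and their scores; the mapping to label names comes from `class_id_map_file`. A minimal NumPy sketch of the selection step (not the PaddleClas `Topk` class itself):

```python
import numpy as np

def topk(scores, k=5):
    """Return the ids and scores of the k highest-scoring classes,
    highest first."""
    ids = np.argsort(scores)[::-1][:k]
    return ids.tolist(), scores[ids].tolist()

probs = np.array([0.05, 0.40, 0.10, 0.25, 0.15, 0.05])
ids, vals = topk(probs, k=3)
# ids → [1, 3, 4]
```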

+ 142 - 0
paddlex/repo_apis/PaddleClas_api/configs/ResNet34_vd.yaml

@@ -0,0 +1,142 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 200
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ResNet34_vd
+  class_num: 102
+ 
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.1
+  regularizer:
+    name: 'L2'
+    coeff: 0.00007
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+      batch_transform_ops:
+        - MixupOperator:
+            alpha: 0.2
+
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]

+ 142 - 0
paddlex/repo_apis/PaddleClas_api/configs/ResNet50_vd.yaml

@@ -0,0 +1,142 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  eval_interval: 1
+  epochs: 200
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 224, 224]
+  save_inference_dir: ./inference
+
+
+# mixed precision
+AMP:
+  use_amp: False
+  use_fp16_test: False
+  scale_loss: 128.0
+  use_dynamic_loss_scaling: True
+  use_promote: False
+  # O1: mixed fp16, O2: pure fp16
+  level: O1
+
+
+# model architecture
+Arch:
+  name: ResNet50_vd
+  class_num: 102
+ 
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+        epsilon: 0.1
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.1
+  regularizer:
+    name: 'L2'
+    coeff: 0.00007
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandCropImage:
+            size: 224
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+      batch_transform_ops:
+        - MixupOperator:
+            alpha: 0.2
+
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: True
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            resize_short: 256
+        - CropImage:
+            size: 224
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 64
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 4
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        resize_short: 256
+    - CropImage:
+        size: 224
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 5
+    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
+
+Metric:
+  Train:
+  Eval:
+    - TopkAcc:
+        topk: [1, 5]
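The eval/infer transforms in these configs resize so the short side is 256 (`ResizeImage: resize_short: 256`) and then take a centered 224x224 window (`CropImage: size: 224`). A small sketch of just the geometry (a hypothetical helper; actual resizing/cropping is done by the pipeline's image ops):

```python
def eval_resize_crop(h, w, resize_short=256, crop=224):
    """Scale (h, w) so the short side equals resize_short, then
    return the resized size and the top-left corner of the centered
    crop x crop window."""
    scale = resize_short / min(h, w)
    rh, rw = round(h * scale), round(w * scale)
    top, left = (rh - crop) // 2, (rw - crop) // 2
    return (rh, rw), (top, left)

print(eval_resize_crop(480, 640))  # ((256, 341), (16, 58))
```

Keeping the short side at 256 before cropping preserves aspect ratio, so the 224x224 crop always fits and covers the center of the image.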