
Merge branch 'develop' of https://github.com/PaddlePaddle/PaddleX into devellop_fix

FlyingQianMM 4 years ago
parent
commit
9bc5cd2c46
87 changed files with 1375 additions and 1659 deletions
  1. 1 1
      README.md
  2. 1 1
      docs/apis/visualize.md
  3. 20 9
      dygraph/deploy/cpp/model_deploy/ppseg/src/seg_postprocess.cpp
  4. 8 0
      dygraph/examples/README.md
  5. 184 0
      dygraph/examples/defect_detection/README.md
  6. 41 0
      dygraph/examples/defect_detection/code/infer.py
  7. 52 0
      dygraph/examples/defect_detection/code/train.py
  8. BIN
      dygraph/examples/defect_detection/images/labelme.png
  9. BIN
      dygraph/examples/defect_detection/images/lens.png
  10. BIN
      dygraph/examples/defect_detection/images/predict.jpg
  11. BIN
      dygraph/examples/defect_detection/images/process.png
  12. BIN
      dygraph/examples/defect_detection/images/robot.png
  13. BIN
      dygraph/examples/defect_detection/images/split_dataset.png
  14. BIN
      dygraph/examples/defect_detection/images/vdl.png
  15. BIN
      dygraph/examples/defect_detection/images/vdl2.png
  16. 288 0
      dygraph/examples/rebar_count/README.md
  17. 42 0
      dygraph/examples/rebar_count/code/infer.py
  18. 65 0
      dygraph/examples/rebar_count/code/prune.py
  19. 56 0
      dygraph/examples/rebar_count/code/train.py
  20. BIN
      dygraph/examples/rebar_count/images/0.png
  21. BIN
      dygraph/examples/rebar_count/images/2.png
  22. BIN
      dygraph/examples/rebar_count/images/3.png
  23. BIN
      dygraph/examples/rebar_count/images/5.png
  24. BIN
      dygraph/examples/rebar_count/images/7.png
  25. BIN
      dygraph/examples/rebar_count/images/8.png
  26. BIN
      dygraph/examples/rebar_count/images/phone_pic.jpg
  27. BIN
      dygraph/examples/rebar_count/images/predict.jpg
  28. BIN
      dygraph/examples/rebar_count/images/process.png
  29. BIN
      dygraph/examples/rebar_count/images/split_dataset.png
  30. BIN
      dygraph/examples/rebar_count/images/vdl.png
  31. BIN
      dygraph/examples/rebar_count/images/vdl2.png
  32. BIN
      dygraph/examples/rebar_count/images/worker.png
  33. 207 0
      dygraph/examples/robot_grab/README.md
  34. 40 0
      dygraph/examples/robot_grab/code/infer.py
  35. 50 0
      dygraph/examples/robot_grab/code/train.py
  36. BIN
      dygraph/examples/robot_grab/images/labelme.png
  37. BIN
      dygraph/examples/robot_grab/images/lens.png
  38. BIN
      dygraph/examples/robot_grab/images/predict.bmp
  39. BIN
      dygraph/examples/robot_grab/images/predict.jpg
  40. BIN
      dygraph/examples/robot_grab/images/process.png
  41. BIN
      dygraph/examples/robot_grab/images/robot.png
  42. BIN
      dygraph/examples/robot_grab/images/split_dataset.png
  43. BIN
      dygraph/examples/robot_grab/images/vdl.png
  44. BIN
      dygraph/examples/robot_grab/images/vdl2.png
  45. 0 1
      dygraph/paddlex/__init__.py
  46. 45 197
      dygraph/paddlex/cls.py
  47. 0 1
      dygraph/paddlex/cv/datasets/voc.py
  48. 20 12
      dygraph/paddlex/cv/models/classifier.py
  49. 80 25
      dygraph/paddlex/cv/models/detector.py
  50. 1 1
      dygraph/paddlex/cv/models/segmenter.py
  51. 10 4
      dygraph/paddlex/cv/models/utils/det_metrics/metrics.py
  52. 2 2
      dygraph/paddlex/cv/models/utils/visualize.py
  53. 0 164
      dygraph/paddlex/cv/transforms/cls_transforms.py
  54. 0 152
      dygraph/paddlex/cv/transforms/det_transforms.py
  55. 95 69
      dygraph/paddlex/cv/transforms/operators.py
  56. 0 536
      dygraph/paddlex/cv/transforms/seg_transforms.py
  57. 19 168
      dygraph/paddlex/det.py
  58. 0 78
      dygraph/paddlex/models.py
  59. 15 208
      dygraph/paddlex/seg.py
  60. 6 2
      dygraph/paddlex/utils/checkpoint.py
  61. 1 1
      dygraph/tutorials/slim/prune/image_classification/mobilenetv2_train.py
  62. 1 1
      dygraph/tutorials/slim/prune/object_detection/yolov3_train.py
  63. 1 1
      dygraph/tutorials/slim/prune/semantic_segmentation/unet_train.py
  64. 1 1
      dygraph/tutorials/slim/quantize/image_classification/mobilenetv2_train.py
  65. 1 1
      dygraph/tutorials/slim/quantize/object_detection/yolov3_train.py
  66. 1 1
      dygraph/tutorials/slim/quantize/semantic_segmentation/unet_train.py
  67. 1 1
      dygraph/tutorials/train/image_classification/alexnet.py
  68. 1 1
      dygraph/tutorials/train/image_classification/darknet53.py
  69. 1 1
      dygraph/tutorials/train/image_classification/densenet121.py
  70. 1 1
      dygraph/tutorials/train/image_classification/hrnet_w18_c.py
  71. 1 1
      dygraph/tutorials/train/image_classification/mobilenetv3_large_w_custom_optimizer.py
  72. 1 1
      dygraph/tutorials/train/image_classification/mobilenetv3_small.py
  73. 1 1
      dygraph/tutorials/train/image_classification/resnet50_vd_ssld.py
  74. 1 1
      dygraph/tutorials/train/image_classification/shufflenetv2.py
  75. 1 1
      dygraph/tutorials/train/image_classification/xception41.py
  76. 1 1
      dygraph/tutorials/train/instance_segmentation/mask_rcnn_r50_fpn.py
  77. 1 1
      dygraph/tutorials/train/object_detection/faster_rcnn_hrnet_w18.py
  78. 1 1
      dygraph/tutorials/train/object_detection/faster_rcnn_r50_fpn.py
  79. 1 1
      dygraph/tutorials/train/object_detection/ppyolo.py
  80. 1 1
      dygraph/tutorials/train/object_detection/ppyolotiny.py
  81. 1 2
      dygraph/tutorials/train/object_detection/ppyolov2.py
  82. 1 1
      dygraph/tutorials/train/object_detection/yolov3_darknet53.py
  83. 1 1
      dygraph/tutorials/train/semantic_segmentation/bisenetv2.py
  84. 1 1
      dygraph/tutorials/train/semantic_segmentation/deeplabv3p_resnet50_vd.py
  85. 1 1
      dygraph/tutorials/train/semantic_segmentation/fastscnn.py
  86. 1 1
      dygraph/tutorials/train/semantic_segmentation/hrnet.py
  87. 1 1
      dygraph/tutorials/train/semantic_segmentation/unet.py

+ 1 - 1
README.md

@@ -16,7 +16,7 @@
  ![QQGroup](https://img.shields.io/badge/QQ_Group-1045148026-52B6EF?style=social&logo=tencent-qq&logoColor=000&logoWidth=20)
 
 
-## PaddleX dygraph mode is ready! Static mode is set by default and dynamic graph code base is in [dygraph](https://github.com/PaddlePaddle/PaddleX/tree/develop/dygraph). If you want to use static mode, the version 1.3.10 can be installed by pip. The version 2.0.0rc0 corresponds to the dygraph mode.
+## PaddleX dynamic graph mode is ready! Static graph mode remains the default, and the dynamic graph code base is in [dygraph](https://github.com/PaddlePaddle/PaddleX/tree/develop/dygraph). To use static graph mode, install version 1.3.11 via pip; version 2.0.0rc0 corresponds to dynamic graph mode.
 
 
 :hugs:  PaddleX integrates the abilities of **Image classification**, **Object detection**, **Semantic segmentation**, and **Instance segmentation** from the Paddle CV toolkits, and covers the whole development process from **Data preparation** through **Model training and optimization** to **Multi-end deployment**. PaddleX also provides **Succinct APIs** and a **Graphical User Interface**, so developers can quickly complete end-to-end Paddle development in a **low-code** fashion without installing different libraries.

+ 1 - 1
docs/apis/visualize.md

@@ -45,7 +45,7 @@ paddlex.seg.visualize(image, result, weight=0.6, save_dir='./', color=None)
 import paddlex as pdx
 model = pdx.load_model('cityscape_deeplab')
 result = model.predict('city.png')
-pdx.det.visualize('city.png', result, save_dir='./')
+pdx.seg.visualize('city.png', result, save_dir='./')
 # The prediction result is saved to ./visualize_city.png
 ```
 

+ 20 - 9
dygraph/deploy/cpp/model_deploy/ppseg/src/seg_postprocess.cpp

@@ -59,29 +59,40 @@ void SegPostprocess::RestoreSegMap(const ShapeInfo& shape_info,
     score_mat->begin<float>(), score_mat->end<float>());
 }
 
+// ppseg version >= 2.1  shape = [b, w, h]
 bool SegPostprocess::RunV2(const DataBlob& output,
                            const std::vector<ShapeInfo>& shape_infos,
                            std::vector<Result>* results, int thread_num) {
   int batch_size = shape_infos.size();
-  std::vector<int> score_map_shape = output.shape;
-  int score_map_size = std::accumulate(output.shape.begin() + 1,
-                                       output.shape.end(), 1,
-                                       std::multiplies<int>());
-  const uint8_t* score_map_data =
-          reinterpret_cast<const uint8_t*>(output.data.data());
-  int num_map_pixels = output.shape[1] * output.shape[2];
+  int label_map_size = output.shape[1] * output.shape[2];
+  const uint8_t* label_data;
+  std::vector<uint8_t> label_vector;
+  if (output.dtype == INT64) {  // int64
+    const int64_t* output_data =
+          reinterpret_cast<const int64_t*>(output.data.data());
+    std::transform(output_data, output_data + label_map_size * batch_size,
+                   std::back_inserter(label_vector),
+                   [](int64_t x) { return (uint8_t)x;});
+    label_data = reinterpret_cast<const uint8_t*>(label_vector.data());
+  } else if (output.dtype == INT8) {  // uint8
+    label_data = reinterpret_cast<const uint8_t*>(output.data.data());
+  } else {
+    std::cerr << "Output dtype is not supported in seg postprocess "
+              << output.dtype << std::endl;
+    return false;
+  }
 
   for (int i = 0; i < batch_size; ++i) {
     (*results)[i].model_type = "seg";
     (*results)[i].seg_result = new SegResult();
-    const uint8_t* current_start_ptr = score_map_data + i * score_map_size;
+    const uint8_t* current_start_ptr = label_data + i * label_map_size;
     cv::Mat score_mat(output.shape[1], output.shape[2],
                       CV_32FC1, cv::Scalar(1.0));
     cv::Mat label_mat(output.shape[1], output.shape[2],
                       CV_8UC1, const_cast<uint8_t*>(current_start_ptr));
 
     RestoreSegMap(shape_infos[i], &label_mat,
-                &score_mat, (*results)[i].seg_result);
+                 &score_mat, (*results)[i].seg_result);
   }
   return true;
 }
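The core of the `RunV2` change above is normalizing the model output to a uint8 label map regardless of whether it arrives as int64 or int8 data. A minimal Python/NumPy sketch of the equivalent conversion (illustrative only; this is not part of the commit):

```python
import numpy as np

def to_uint8_labels(output: np.ndarray) -> np.ndarray:
    """Mirror the int64 -> uint8 narrowing cast added in RunV2 above."""
    if output.dtype == np.int64:
        # Element-wise cast, same as the std::transform into label_vector.
        return output.astype(np.uint8)
    if output.dtype == np.uint8:
        return output
    raise TypeError("Output dtype is not supported in seg postprocess: %s"
                    % output.dtype)
```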

+ 8 - 0
dygraph/examples/README.md

@@ -0,0 +1,8 @@
+# Overview
+This directory provides several real-world industrial application cases. Following each case's documentation, users can quickly learn how to apply PaddleX in practical project development.
+
+* [Rebar Counting](./rebar_count)
+
+* [Robotic Arm Grasping](./robot_grab)
+
+* [Defect Detection](./defect_detection)

+ 184 - 0
dygraph/examples/defect_detection/README.md

@@ -0,0 +1,184 @@
+# Lens Defect Detection
+### 1 Project Description
+The camera module is one of the most important components of a smartphone. With the rapid development of the smartphone industry, demand for camera modules keeps growing, and the advent of high-resolution cameras poses new challenges for module inspection accuracy.
+
+Taking phone lenses as an example, this project shows how to quickly perform defect detection with instance segmentation.
+
+
+
+### 2 Data Preparation
+The dataset contains 992 annotated images in MSCOCO instance segmentation format. [Click here to download the dataset](https://bj.bcebos.com/paddlex/examples2/defect_detection/dataset_lens_defect_detection.zip)
+
+<div align="center">
+<img src="./images/lens.png"  width = "1000" /
+>              </div>
+
+For more information on data formats, see the [data annotation documentation](https://paddlex.readthedocs.io/zh_CN/develop/data/annotation/index.html)
+* **Data splitting**
+Split the data into training, validation, and test sets at a 7:2:1 ratio (a sketch of the underlying split logic follows the directory layout below).
+``` shell
+paddlex --split_dataset --format COCO --dataset_dir dataset --val_value 0.2 --test_value 0.1
+```
+<div align="center">
+<img src="./images/split_dataset.png"  width = "1500" />              </div>
+The dataset folder before and after splitting:
+
+```bash
+  dataset/                      dataset/
+  ├── JPEGImages/       -->     ├── JPEGImages/
+  ├── annotations.json          ├── annotations.json
+                                ├── test.json
+                                ├── train.json
+                                ├── val.json
+  ```
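As referenced above, the `paddlex --split_dataset` command shuffles the annotated samples and slices them into train/val/test subsets by the given ratios. A minimal sketch of the underlying 7:2:1 split logic (illustrative only, not the actual PaddleX implementation):

```python
import random

def split_dataset(samples, val_ratio=0.2, test_ratio=0.1, seed=0):
    """Shuffle samples, then slice off val and test subsets by ratio."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_val = int(len(samples) * val_ratio)
    n_test = int(len(samples) * test_ratio)
    val = samples[:n_val]
    test = samples[n_val:n_val + n_test]
    train = samples[n_val + n_test:]  # remaining ~70%
    return train, val, test
```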
+
+
+### 3 Model Selection
+PaddleX provides a rich set of vision models, including the MaskRCNN family for instance segmentation. This project uses the Mask-RCNN algorithm.
+
+### 4 Model Training
+In this project we use Mask-RCNN as the lens defect detection model. See [train.py](./code/train.py) for the full code.
+Run the following to start training:
+``` shell
+python code/train.py
+```
+If you run the following instead, the training log is written to a log file:
+``` shell
+python code/train.py > log
+```
+* Training process overview
+<div align="center">
+<img src="./images/process.png"  width = "1000" />              </div>
+
+``` python
+# Define the transforms used for training and validation
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/transforms/operators.py
+train_transforms = T.Compose([
+    T.RandomResizeByShort(
+        short_sizes=[640, 672, 704, 736, 768, 800],
+        max_size=1333,
+        interp='CUBIC'), T.RandomHorizontalFlip(), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+eval_transforms = T.Compose([
+    T.ResizeByShort(
+        short_size=800, max_size=1333, interp='CUBIC'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+```
+
+```python
+# Define the training and validation datasets
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/develop/dygraph/paddlex/cv/datasets/coco.py#L26
+train_dataset = pdx.datasets.CocoDetection(
+    data_dir='dataset/JPEGImages',
+    ann_file='dataset/train.json',
+    # num_workers=0,  # Note: uncomment this line if a runtime error occurs
+    transforms=train_transforms,
+    shuffle=True)
+eval_dataset = pdx.datasets.CocoDetection(
+    data_dir='dataset/JPEGImages',
+    ann_file='dataset/val.json',
+    # num_workers=0,  # Note: uncomment this line if a runtime error occurs
+    transforms=eval_transforms)
+```
+``` python
+# Initialize the model and start training
+# Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
+num_classes = len(train_dataset.labels)
+model = pdx.models.MaskRCNN(
+    num_classes=num_classes, backbone='ResNet50', with_fpn=True)
+```
+``` python
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
+# Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html
+model.train(
+    num_epochs=12,
+    train_dataset=train_dataset,
+    train_batch_size=1,
+    eval_dataset=eval_dataset,
+    learning_rate=0.00125,
+    lr_decay_epochs=[8, 11],
+    warmup_steps=10,
+    warmup_start_lr=0.0,
+    save_dir='output/mask_rcnn_r50_fpn',
+    use_vdl=True)
+ ```
+
+### 5 Training Visualization
+
+During training, if `use_vdl` is set to True in the `train` function, the training logs are automatically written in VisualDL format to the `vdl_log` directory under `save_dir` (the user-specified path).
+
+Start the VisualDL service with the following command to view the visualized metrics:
+
+```
+visualdl --logdir output/mask_rcnn_r50_fpn/vdl_log --port 8001
+```
+
+<div align="center">
+<img src="./images/vdl.png"  width = "1000" />              </div>
+
+After the service starts, follow the command-line prompt and open http://localhost:8001/ in a browser.
+
+### 6 Model Export
+The trained model is saved in the output folder, but it is still a dynamic graph model; it must be exported to a static graph model before it can be deployed for prediction. Running the command below automatically creates an `inference_model` folder under output to hold the exported model.
+
+``` bash
+paddlex --export_inference --model_dir=output/mask_rcnn_r50_fpn/best_model --save_dir=output/inference_model
+```
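Once exported, the static graph model can be loaded for inference without the training code. A hedged sketch of using the exported model (this assumes the `paddlex.deploy.Predictor` API of PaddleX 2.0; the image path reuses the sample from the prediction script below):

```python
import paddlex as pdx

# Assumption: pdx.deploy.Predictor accepts the inference_model
# directory created by --export_inference.
predictor = pdx.deploy.Predictor('output/inference_model')
result = predictor.predict('dataset/JPEGImages/Image_370.jpg')
print(result)
```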
+### 7 Model Prediction
+
+Run the following:
+``` bash
+python code/infer.py
+```
+The file contents are as follows:
+``` python
+import glob
+import numpy as np
+import threading
+import time
+import random
+import os
+import base64
+import cv2
+import json
+import paddlex as pdx
+
+image_name = 'dataset/JPEGImages/Image_370.jpg'
+model = pdx.load_model('output/mask_rcnn_r50_fpn/best_model')
+
+
+img = cv2.imread(image_name)
+result = model.predict(img)
+
+keep_results = []
+areas = []
+f = open('result.txt','a')
+count = 0
+for dt in np.array(result):
+    cname, bbox, score = dt['category'], dt['bbox'], dt['score']
+    if score < 0.5:
+        continue
+    keep_results.append(dt)
+    count+=1
+    f.write(str(dt)+'\n')
+    f.write('\n')
+    areas.append(bbox[2] * bbox[3])
+areas = np.asarray(areas)
+sorted_idxs = np.argsort(-areas).tolist()
+keep_results = [keep_results[k]
+                for k in sorted_idxs] if len(keep_results) > 0 else []
+print(keep_results)
+print(count)
+f.write("the total number is :"+str(int(count)))
+f.close()
+
+pdx.det.visualize(image_name, result, threshold=0.5, save_dir='./output/mask_rcnn_r50_fpn')
+```
+This generates a result.txt file and displays the prediction image. result.txt lists the position, category, and confidence of every detection box in the image, along with the total number of boxes.
+
+The prediction result is as follows:
+<div align="center">
+<img src="./images/predict.jpg"  width = "1000" />              </div>

+ 41 - 0
dygraph/examples/defect_detection/code/infer.py

@@ -0,0 +1,41 @@
+import glob
+import numpy as np
+import threading
+import time
+import random
+import os
+import base64
+import cv2
+import json
+import paddlex as pdx
+
+image_name = 'dataset/JPEGImages/Image_370.jpg'
+model = pdx.load_model('output/mask_rcnn_r50_fpn/best_model')
+
+img = cv2.imread(image_name)
+result = model.predict(img)
+
+keep_results = []
+areas = []
+f = open('result.txt', 'a')
+count = 0
+for dt in np.array(result):
+    cname, bbox, score = dt['category'], dt['bbox'], dt['score']
+    if score < 0.5:
+        continue
+    keep_results.append(dt)
+    count += 1
+    f.write(str(dt) + '\n')
+    f.write('\n')
+    areas.append(bbox[2] * bbox[3])
+areas = np.asarray(areas)
+sorted_idxs = np.argsort(-areas).tolist()
+keep_results = [keep_results[k]
+                for k in sorted_idxs] if len(keep_results) > 0 else []
+print(keep_results)
+print(count)
+f.write("the total number is :" + str(int(count)))
+f.close()
+
+pdx.det.visualize(
+    image_name, result, threshold=0.5, save_dir='./output/mask_rcnn_r50_fpn')

+ 52 - 0
dygraph/examples/defect_detection/code/train.py

@@ -0,0 +1,52 @@
+import paddlex as pdx
+from paddlex import transforms as T
+
+# Define the transforms used for training and validation
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/transforms/operators.py
+train_transforms = T.Compose([
+    T.RandomResizeByShort(
+        short_sizes=[640, 672, 704, 736, 768, 800],
+        max_size=1333,
+        interp='CUBIC'), T.RandomHorizontalFlip(), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+eval_transforms = T.Compose([
+    T.ResizeByShort(
+        short_size=800, max_size=1333, interp='CUBIC'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+# Define the training and validation datasets
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/develop/dygraph/paddlex/cv/datasets/coco.py#L26
+train_dataset = pdx.datasets.CocoDetection(
+    data_dir='dataset/JPEGImages',
+    ann_file='dataset/train.json',
+    transforms=train_transforms,
+    shuffle=True,
+    num_workers=0)
+eval_dataset = pdx.datasets.CocoDetection(
+    data_dir='dataset/JPEGImages',
+    ann_file='dataset/val.json',
+    transforms=eval_transforms,
+    num_workers=0)
+
+# Initialize the model and start training
+# Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
+num_classes = len(train_dataset.labels)
+model = pdx.models.MaskRCNN(
+    num_classes=num_classes, backbone='ResNet50', with_fpn=True)
+
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
+# Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html
+model.train(
+    num_epochs=12,
+    train_dataset=train_dataset,
+    train_batch_size=1,
+    eval_dataset=eval_dataset,
+    learning_rate=0.00125,
+    lr_decay_epochs=[8, 11],
+    warmup_steps=10,
+    warmup_start_lr=0.0,
+    save_dir='output/mask_rcnn_r50_fpn',
+    use_vdl=True)

BIN
dygraph/examples/defect_detection/images/labelme.png


BIN
dygraph/examples/defect_detection/images/lens.png


BIN
dygraph/examples/defect_detection/images/predict.jpg


BIN
dygraph/examples/defect_detection/images/process.png


BIN
dygraph/examples/defect_detection/images/robot.png


BIN
dygraph/examples/defect_detection/images/split_dataset.png


BIN
dygraph/examples/defect_detection/images/vdl.png


BIN
dygraph/examples/defect_detection/images/vdl2.png


+ 288 - 0
dygraph/examples/rebar_count/README.md

@@ -0,0 +1,288 @@
+# Rebar Counting
+
+
+### 1 Project Description
+This project shows how to use object detection to count rebars. The code can also be used to count vehicles, nuts, logs, and so on.
+
+At construction sites, inspectors must manually count the rebars on each arriving truck; only after the count is confirmed can the truck be unloaded. This process is tedious, labor-intensive, and slow. To address it, we aim to complete the task intelligently and efficiently through phone photo -> object detection counting -> manual correction of a few false detections:
+<div align="center">
+<img src="./images/worker.png"  width = "500" />              </div>
+
+**Business challenges:**
+* **High accuracy requirements** Rebar is expensive and used in large quantities; false and missed detections must be picked out manually among many marked points, so very high accuracy is required for a good inspection experience. The detection algorithm needs to be specially optimized for such dense targets, and it must also cope with uncontrolled shooting angles and lighting, uneven rebar lengths, and possible occlusion.
+* **Varying rebar sizes** Rebar diameters vary widely, cross-sections are irregular, and colors differ; the shooting angle and distance are not fully controlled either, which makes traditional algorithms unstable in practice.
+* **Hard-to-separate boundaries** A single truck carries many bundles of rebar. Processing everything at once suffers from poor edge angles and occlusion, so the current workflow processes one bundle at a time and sums the results. This requires separating bundles or deduplicating the final result, both of which are difficult.
+
+<div align="center">
+<img src="./images/phone_pic.jpg"  width = "1000" />              </div>
+
+### 2 Data Preparation
+
+The dataset contains 250 annotated images; the original annotations are in csv format. This project uses object detection annotations, provided here in VOC format. [Click here to download the dataset]( https://bj.bcebos.com/paddlex/examples2/rebar_count/dataset_reinforcing_steel_bar_counting.zip)
+
+For more information on data formats, see the [data annotation documentation](https://paddlex.readthedocs.io/zh_CN/develop/data/annotation/index.html)
+
+* **Data splitting**
+Split the data into training, validation, and test sets at a 7:2:1 ratio. PaddleX provides a simple, easy-to-use API for this.
+``` shell
+paddlex --split_dataset --format VOC --dataset_dir dataset --val_value 0.2 --test_value 0.1
+```
+<div align="center">
+<img src="./images/split_dataset.png"  width = "1500" />              </div>
+The dataset folder before and after splitting:
+
+```bash
+  dataset/                          dataset/
+  ├── Annotations/      -->         ├── Annotations/
+  ├── JPEGImages/                   ├── JPEGImages/
+                                    ├── labels.txt
+                                    ├── test_list.txt
+                                    ├── train_list.txt
+                                    ├── val_list.txt
+  ```
+
+### 3 Model Selection
+PaddleX provides a rich set of vision models, including the RCNN and YOLO families for object detection. This project uses YOLOv3 as the detection model for rebar counting.
+
+### 4 Model Training
+In this project we use YOLOv3 as the rebar detection model. See [train.py](./code/train.py) for the full code.
+
+Run the following to start training:
+
+
+``` shell
+python code/train.py
+```
+
+If you run the following instead, the training log is written to a log file under the `code` directory:
+``` shell
+python code/train.py > log
+```
+
+* Training process overview
+<div align="center">
+<img src="./images/process.png"  width = "1000" />              </div>
+
+``` python
+# Define the transforms used for training and validation
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/transforms/operators.py
+train_transforms = T.Compose([
+    T.MixupImage(mixup_epoch=250), T.RandomDistort(),
+    T.RandomExpand(im_padding_value=[123.675, 116.28, 103.53]), T.RandomCrop(),
+    T.RandomHorizontalFlip(), T.BatchRandomResize(
+        target_sizes=[320, 352, 384, 416, 448, 480, 512, 544, 576, 608],
+        interp='RANDOM'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+eval_transforms = T.Compose([
+    T.Resize(
+        608, interp='CUBIC'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+```
+
+```python
+# Define the training and validation datasets
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/datasets/voc.py#L29
+train_dataset = pdx.datasets.VOCDetection(
+    data_dir='dataset',
+    file_list='dataset/train_list.txt',
+    label_list='dataset/labels.txt',
+    transforms=train_transforms,
+    shuffle=True,
+    num_worker=0)
+
+eval_dataset = pdx.datasets.VOCDetection(
+    data_dir='dataset',
+    file_list='dataset/val_list.txt',
+    label_list='dataset/labels.txt',
+    transforms=eval_transforms,
+    shuffle=False,
+    num_worker=0)
+```
+``` python
+# Initialize the model and start training
+# Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
+num_classes = len(train_dataset.labels)
+model = pdx.models.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
+```
+``` python
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
+# Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html
+model.train(
+    num_epochs=270,
+    train_dataset=train_dataset,
+    train_batch_size=2,
+    eval_dataset=eval_dataset,
+    learning_rate=0.001 / 8,
+    warmup_steps=1000,
+    warmup_start_lr=0.0,
+    save_interval_epochs=5,
+    lr_decay_epochs=[216, 243],
+    save_dir='output/yolov3_darknet53')
+ ```
+
+
+
+### 5 Training Visualization
+
+During training, if `use_vdl` is set to True in the `train` function, the training logs are automatically written in VisualDL format to the `vdl_log` directory under `save_dir` (the user-specified path).
+
+Start the VisualDL service with the following command to view the visualized metrics:
+
+```
+visualdl --logdir output/yolov3_darknet53/vdl_log --port 8001
+```
+
+<div align="center">
+<img src="./images/vdl.png"  width = "1000" />              </div>
+
+After the service starts, follow the command-line prompt and open http://localhost:8001/ in a browser.
+<div align="center">
+<img src="./images/vdl2.png"  width = "1000" />              </div>
+
+### 6 Model Export
+The trained model is saved in the output folder. To deploy with PaddleInference, it must be exported to a static graph model. Running the command below automatically creates an `inference_model` folder under output to hold the exported model.
+
+``` bash
+paddlex --export_inference --model_dir=output/yolov3_darknet53/best_model --save_dir=output/inference_model --fixed_input_shape=608,608
+```
+**Note**: The value given to fixed_input_shape must match the target_size set in eval_transforms.
+### 7 Model Prediction
+
+Run the following:
+``` bash
+python code/infer.py
+```
+The file contents are as follows:
+```python
+import glob
+import numpy as np
+import threading
+import time
+import random
+import os
+import base64
+import cv2
+import json
+import paddlex as pdx
+
+image_name = 'dataset/JPEGImages/6B898244.jpg'
+
+model = pdx.load_model('output/yolov3_darknet53/best_model')
+
+img = cv2.imread(image_name)
+result = model.predict(img)
+
+keep_results = []
+areas = []
+f = open('result.txt','a')
+count = 0
+for dt in np.array(result):
+    cname, bbox, score = dt['category'], dt['bbox'], dt['score']
+    if score < 0.5:
+        continue
+    keep_results.append(dt)
+    count+=1
+    f.write(str(dt)+'\n')
+    f.write('\n')
+    areas.append(bbox[2] * bbox[3])
+areas = np.asarray(areas)
+sorted_idxs = np.argsort(-areas).tolist()
+keep_results = [keep_results[k]
+                for k in sorted_idxs] if len(keep_results) > 0 else []
+print(keep_results)
+print(count)
+f.write("the total number is :"+str(int(count)))
+f.close()
+pdx.det.visualize(image_name, result, threshold=0.5, save_dir='./output/yolov3_darknet53')
+```
+
+This generates a result.txt file and displays the prediction image. result.txt lists the position, category, and confidence of every detection box and gives the total count, which realizes automatic rebar counting.
+
+The prediction result is as follows:
+<div align="center">
+<img src="./images/predict.jpg"  width = "1000" />              </div>
+
+### 8 Model Pruning
+
+Model pruning helps meet the performance requirements of edge and mobile deployment: it effectively reduces model size and computation and speeds up inference. PaddleX integrates PaddleSlim's sensitivity-based channel pruning algorithm, which can easily be used from PaddleX training code.
+
+Run the following:
+``` bash
+python code/prune.py
+```
+Pruning process overview:
+``` python
+# Define the transforms used for training and validation
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/transforms/operators.py
+train_transforms = T.Compose([
+    T.MixupImage(mixup_epoch=250), T.RandomDistort(),
+    T.RandomExpand(im_padding_value=[123.675, 116.28, 103.53]), T.RandomCrop(),
+    T.RandomHorizontalFlip(), T.BatchRandomResize(
+        target_sizes=[320, 352, 384, 416, 448, 480, 512, 544, 576, 608],
+        interp='RANDOM'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+eval_transforms = T.Compose([
+    T.Resize(
+        608, interp='CUBIC'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+```
+``` python
+# Define the training and validation datasets
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/datasets/voc.py#L29
+train_dataset = pdx.datasets.VOCDetection(
+    data_dir='dataset',
+    file_list='dataset/train_list.txt',
+    label_list='dataset/labels.txt',
+    transforms=train_transforms,
+    shuffle=True)
+
+eval_dataset = pdx.datasets.VOCDetection(
+    data_dir='dataset',
+    file_list='dataset/val_list.txt',
+    label_list='dataset/labels.txt',
+    transforms=eval_transforms,
+    shuffle=False)
+```
+``` python
+# Load the model
+model = pdx.load_model('output/yolov3_darknet53/best_model')
+```
+``` python
+# Step 1/3: Analyze the sensitivity of each layer's parameters under different pruning ratios
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/base.py#L352
+model.analyze_sensitivity(
+    dataset=eval_dataset,
+    batch_size=1,
+    save_dir='output/yolov3_darknet53/prune')
+```
+**Note**: If this step has been run before, the existing output/yolov3_darknet53/prune/model.sensi.data is loaded automatically on subsequent runs and the sensitivity analysis is not repeated.
+``` python
+# Step 2/3: Prune the model according to the chosen FLOPs reduction ratio
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/base.py#L394
+model.prune(pruned_flops=.2)
+```
+**Note**: To save the pruned model parameters directly, just set save_dir (a sketch follows below). However, we strongly recommend retraining the pruned model to keep the accuracy loss as small as possible.
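For illustration, saving the pruned parameters directly as the note describes might look like the line below (a hedged sketch; the save_dir keyword on prune is assumed from the note above, not verified against the API):

```python
# Assumption from the note above: prune() accepts save_dir to persist
# the pruned parameters directly. Retraining afterwards is still advised.
model.prune(pruned_flops=.2, save_dir='output/yolov3_darknet53/prune')
```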
+``` python
+# Step 3/3: Retrain the pruned model
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L154
+# Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html
+model.train(
+    num_epochs=270,
+    train_dataset=train_dataset,
+    train_batch_size=8,
+    eval_dataset=eval_dataset,
+    learning_rate=0.001 / 8,
+    warmup_steps=1000,
+    warmup_start_lr=0.0,
+    save_interval_epochs=5,
+    lr_decay_epochs=[216, 243],
+    pretrain_weights=None,
+    save_dir='output/yolov3_darknet53/prune')
+```
+**Note**: When retraining, pretrain_weights must be set to None; otherwise the model will load the pretrained parameters specified by pretrain_weights.

+ 42 - 0
dygraph/examples/rebar_count/code/infer.py

@@ -0,0 +1,42 @@
+import glob
+import numpy as np
+import threading
+import time
+import random
+import os
+import base64
+import cv2
+import json
+import paddlex as pdx
+
+image_name = 'dataset/JPEGImages/6B898244.jpg'
+
+model = pdx.load_model('output/ppyolo_r50vd_dcn/best_model')
+
+img = cv2.imread(image_name)
+result = model.predict(img)
+
+keep_results = []
+areas = []
+f = open('result.txt', 'a')
+count = 0
+for dt in np.array(result):
+    cname, bbox, score = dt['category'], dt['bbox'], dt['score']
+    if score < 0.5:
+        continue
+    keep_results.append(dt)
+    count += 1
+    f.write(str(dt) + '\n')
+    f.write('\n')
+    areas.append(bbox[2] * bbox[3])
+areas = np.asarray(areas)
+sorted_idxs = np.argsort(-areas).tolist()
+keep_results = [keep_results[k]
+                for k in sorted_idxs] if len(keep_results) > 0 else []
+print(keep_results)
+print(count)
+f.write("the total number is :" + str(int(count)))
+f.close()
+
+pdx.det.visualize(
+    image_name, result, threshold=0.5, save_dir='./output/ppyolo_r50vd_dcn')

+ 65 - 0
dygraph/examples/rebar_count/code/prune.py

@@ -0,0 +1,65 @@
+import paddlex as pdx
+from paddlex import transforms as T
+
+# Define the transforms used for training and validation
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/transforms/operators.py
+train_transforms = T.Compose([
+    T.MixupImage(mixup_epoch=250), T.RandomDistort(),
+    T.RandomExpand(im_padding_value=[123.675, 116.28, 103.53]), T.RandomCrop(),
+    T.RandomHorizontalFlip(), T.BatchRandomResize(
+        target_sizes=[320, 352, 384, 416, 448, 480, 512, 544, 576, 608],
+        interp='RANDOM'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+eval_transforms = T.Compose([
+    T.Resize(
+        608, interp='CUBIC'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+# Define the training and validation datasets
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/datasets/voc.py#L29
+train_dataset = pdx.datasets.VOCDetection(
+    data_dir='dataset',
+    file_list='dataset/train_list.txt',
+    label_list='dataset/labels.txt',
+    transforms=train_transforms,
+    shuffle=True)
+
+eval_dataset = pdx.datasets.VOCDetection(
+    data_dir='dataset',
+    file_list='dataset/val_list.txt',
+    label_list='dataset/labels.txt',
+    transforms=eval_transforms,
+    shuffle=False)
+
+# Load the model
+model = pdx.load_model('output/yolov3_darknet53/best_model')
+
+# Step 1/3: Analyze the sensitivity of each layer's parameters under different pruning ratios
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/base.py#L352
+model.analyze_sensitivity(
+    dataset=eval_dataset,
+    batch_size=1,
+    save_dir='output/yolov3_darknet53/prune')
+
+# Step 2/3: Prune the model according to the chosen FLOPs reduction ratio
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/base.py#L394
+model.prune(pruned_flops=.2)
+
+# Step 3/3: Retrain the pruned model
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L154
+# Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html
+model.train(
+    num_epochs=270,
+    train_dataset=train_dataset,
+    train_batch_size=8,
+    eval_dataset=eval_dataset,
+    learning_rate=0.001 / 8,
+    warmup_steps=1000,
+    warmup_start_lr=0.0,
+    save_interval_epochs=5,
+    lr_decay_epochs=[216, 243],
+    pretrain_weights=None,
+    save_dir='output/yolov3_darknet53/prune')

+ 56 - 0
dygraph/examples/rebar_count/code/train.py

@@ -0,0 +1,56 @@
+import paddlex as pdx
+from paddlex import transforms as T
+
+# Define the transforms used for training and validation
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/transforms/operators.py
+train_transforms = T.Compose([
+    T.MixupImage(mixup_epoch=250), T.RandomDistort(),
+    T.RandomExpand(im_padding_value=[123.675, 116.28, 103.53]), T.RandomCrop(),
+    T.RandomHorizontalFlip(), T.BatchRandomResize(
+        target_sizes=[320, 352, 384, 416, 448, 480, 512, 544, 576, 608],
+        interp='RANDOM'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+eval_transforms = T.Compose([
+    T.Resize(
+        608, interp='CUBIC'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+# Define the training and validation datasets
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/datasets/voc.py#L29
+train_dataset = pdx.datasets.VOCDetection(
+    data_dir='dataset',
+    file_list='dataset/train_list.txt',
+    label_list='dataset/labels.txt',
+    transforms=train_transforms,
+    shuffle=True,
+    num_worker=0)
+
+eval_dataset = pdx.datasets.VOCDetection(
+    data_dir='dataset',
+    file_list='dataset/val_list.txt',
+    label_list='dataset/labels.txt',
+    transforms=eval_transforms,
+    shuffle=False,
+    num_worker=0)
+
+# Initialize the model and start training
+# Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
+num_classes = len(train_dataset.labels)
+model = pdx.models.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
+
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
+# Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html
+model.train(
+    num_epochs=270,
+    train_dataset=train_dataset,
+    train_batch_size=2,
+    eval_dataset=eval_dataset,
+    learning_rate=0.001 / 8,
+    warmup_steps=1000,
+    warmup_start_lr=0.0,
+    save_interval_epochs=5,
+    lr_decay_epochs=[216, 243],
+    save_dir='output/yolov3_darknet53')

BIN
dygraph/examples/rebar_count/images/0.png


BIN
dygraph/examples/rebar_count/images/2.png


BIN
dygraph/examples/rebar_count/images/3.png


BIN
dygraph/examples/rebar_count/images/5.png


BIN
dygraph/examples/rebar_count/images/7.png


BIN
dygraph/examples/rebar_count/images/8.png


BIN
dygraph/examples/rebar_count/images/phone_pic.jpg


BIN
dygraph/examples/rebar_count/images/predict.jpg


BIN
dygraph/examples/rebar_count/images/process.png


BIN
dygraph/examples/rebar_count/images/split_dataset.png


BIN
dygraph/examples/rebar_count/images/vdl.png


BIN
dygraph/examples/rebar_count/images/vdl2.png


BIN
dygraph/examples/rebar_count/images/worker.png


+ 207 - 0
dygraph/examples/robot_grab/README.md

@@ -0,0 +1,207 @@
+# Robotic Arm Grasping
+### 1 Project Description
+In production, automated grasping devices or robotic arms are often used in place of manual operation to save labor and improve efficiency. Grasping accuracy depends largely on how accurately the vision system recognizes the target.
+
+In 2D visual grasping, obtaining a precise edge contour of the target object directly determines whether the target can be grasped accurately. In this project, instance segmentation is used to extract the edge contours of objects in a bin, guiding the robotic arm to grasp them accurately.
+
+<div align="center">
+<img src="./images/robot.png"  width = "500" /
+>              </div>
+
+### 2 Data Preparation
+The dataset provides 30 images annotated with polygons in labelme. [Click here to download the dataset](https://bj.bcebos.com/paddlex/examples2/robot_grab/dataset_manipulator_grab.zip)
+
+* **Preparation**
+
+First, change into the project folder:
+``` shell
+cd path_to_paddlexproject
+```
+Create a `dataset_labelme` folder containing `JPEGImages` and `Annotations` subfolders; store the images in `JPEGImages` and the annotation json files in `Annotations`.
+
+Open LabelMe and click the "Open Dir" button to open the folder containing the images to be annotated. The "File List" panel then shows the absolute path of every image, and you can go through the images one by one and annotate them.
+
+For more information on data formats, see the [data annotation documentation](https://paddlex.readthedocs.io/zh_CN/develop/data/annotation/index.html)
+* **Target edge annotation**
+
+Open the polygon tool (right-click menu -> Create Polygon) and click points around the target's contour, then enter the corresponding label in the pop-up dialog (click the label if it already exists; note that labels must not be in Chinese), as shown below. If a polygon is drawn incorrectly, click "Edit Polygons" on the left and then the polygon to adjust it by dragging, or click "Delete Polygon" to remove it.
+
+Click "Save" to store the annotation results in the `Annotations` folder created earlier.
+<div align="center">
+<img src="./images/labelme.png"  width = "1000" /
+>              </div>
+
+* **Format conversion**
+
+Data annotated with LabelMe must be converted to MSCOCO format before it can be used for instance segmentation training. Create a save directory `dataset`, install paddlex in your Python environment, and run:
+``` shell
+paddlex --data_conversion --source labelme --to COCO --pics dataset_labelme/JPEGImages --annotations dataset_labelme/Annotations --save_dir dataset
+```
+
+* **Data splitting**
+Split the data into training, validation, and test sets at a 7:2:1 ratio.
+``` shell
+paddlex --split_dataset --format COCO --dataset_dir dataset --val_value 0.2 --test_value 0.1
+```
+<div align="center">
+<img src="./images/split_dataset.png"  width = "1500" />              </div>
+The dataset folder before and after splitting:
+
+```bash
+  dataset/                      dataset/
+  ├── JPEGImages/       -->     ├── JPEGImages/
+  ├── annotations.json          ├── annotations.json
+                                ├── test.json
+                                ├── train.json
+                                ├── val.json
+  ```
+
+
+### 3 Model Selection
+PaddleX provides a rich set of vision models, including the MaskRCNN family for instance segmentation, so users can choose according to their needs.
+
+### 4 Model Training
+In this project we use MaskRCNN as the model for grasping wooden blocks. See [train.py](./code/train.py) for the full code.
+Run the following to start training:
+``` shell
+python code/train.py
+```
+If you run the following instead, the training log is written to a log file:
+``` shell
+python code/train.py > log
+```
+* Training process overview
+<div align="center">
+<img src="./images/process.png"  width = "1000" />              </div>
+
+``` python
+# Define the transforms used for training and validation
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/transforms/operators.py
+train_transforms = T.Compose([
+    T.RandomResizeByShort(
+        short_sizes=[640, 672, 704, 736, 768, 800],
+        max_size=1333,
+        interp='CUBIC'), T.RandomHorizontalFlip(), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+eval_transforms = T.Compose([
+    T.ResizeByShort(
+        short_size=800, max_size=1333, interp='CUBIC'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+```
+
+```python
+# Define the training and validation datasets
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/develop/dygraph/paddlex/cv/datasets/coco.py#L26
+train_dataset = pdx.datasets.CocoDetection(
+    data_dir='dataset/JPEGImages',
+    ann_file='dataset/train.json',
+    transforms=train_transforms,
+    shuffle=True)
+eval_dataset = pdx.datasets.CocoDetection(
+    data_dir='dataset/JPEGImages',
+    ann_file='dataset/val.json',
+    transforms=eval_transforms)
+```
+``` python
+# Initialize the model and start training
+# Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
+num_classes = len(train_dataset.labels)
+model = pdx.models.MaskRCNN(
+    num_classes=num_classes, backbone='ResNet50', with_fpn=True)
+```
+``` python
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
+# Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html
+model.train(
+    num_epochs=12,
+    train_dataset=train_dataset,
+    train_batch_size=1,
+    eval_dataset=eval_dataset,
+    learning_rate=0.00125,
+    lr_decay_epochs=[8, 11],
+    warmup_steps=10,
+    warmup_start_lr=0.0,
+    save_dir='output/mask_rcnn_r50_fpn',
+    use_vdl=True)
+ ```
+
+### 5 Training Visualization
+
+During training, if `use_vdl` is set to True in the `train` function, the training logs are automatically written in VisualDL format to the `vdl_log` directory under `save_dir` (the user-specified path).
+
+Start the VisualDL service with the following command to view the visualized metrics:
+
+```
+visualdl --logdir output/mask_rcnn_r50_fpn/vdl_log --port 8001
+```
+
+<div align="center">
+<img src="./images/vdl.png"  width = "1000" />              </div>
+
+After the service starts, follow the command-line prompt and open http://localhost:8001/ in a browser.
+
+### 6 Model Export
+The trained model is saved in the output folder. To deploy with PaddleInference, it must be exported to a static graph model. Running the command below automatically creates an `inference_model` folder under output to hold the exported model.
+
+``` bash
+paddlex --export_inference --model_dir=output/mask_rcnn_r50_fpn/best_model --save_dir=output/inference_model
+```
+
+### 7 Model Prediction
+
+Run the following:
+``` bash
+python code/infer.py
+```
+The file contents are as follows:
+``` python
+import glob
+import numpy as np
+import threading
+import time
+import random
+import os
+import base64
+import cv2
+import json
+import paddlex as pdx
+
+image_name = 'dataset/JPEGImages/Image_20210615204210757.bmp'
+model = pdx.load_model('output/mask_rcnn_r50_fpn/best_model')
+
+
+img = cv2.imread(image_name)
+result = model.predict(img)
+
+keep_results = []
+areas = []
+f = open('result.txt','a')
+count = 0
+for dt in np.array(result):
+    cname, bbox, score = dt['category'], dt['bbox'], dt['score']
+    if score < 0.5:
+        continue
+    keep_results.append(dt)
+    count+=1
+    f.write(str(dt)+'\n')
+    f.write('\n')
+    areas.append(bbox[2] * bbox[3])
+areas = np.asarray(areas)
+sorted_idxs = np.argsort(-areas).tolist()
+keep_results = [keep_results[k]
+                for k in sorted_idxs] if len(keep_results) > 0 else []
+print(keep_results)
+print(count)
+f.write("the total number is :"+str(int(count)))
+f.close()
+
+pdx.det.visualize(image_name, result, threshold=0.5, save_dir='./output/mask_rcnn_r50_fpn')
+```
+A result.txt file and the prediction image are generated in the directory. result.txt lists the position, category, and confidence of every detection box in the image, along with the total number of boxes.
+
+The prediction result is as follows:
+<div align="center">
+<img src="./images/predict.bmp"  width = "1000" />              </div>

+ 40 - 0
dygraph/examples/robot_grab/code/infer.py

@@ -0,0 +1,40 @@
+import glob
+import numpy as np
+import threading
+import time
+import random
+import os
+import base64
+import cv2
+import json
+import paddlex as pdx
+
+image_name = 'dataset/JPEGImages/Image_20210615204210757.bmp'
+model = pdx.load_model('output/mask_rcnn_r50_fpn/best_model')
+
+img = cv2.imread(image_name)
+result = model.predict(img)
+
+keep_results = []
+areas = []
+f = open('result.txt', 'a')
+count = 0
+for dt in np.array(result):
+    cname, bbox, score = dt['category'], dt['bbox'], dt['score']
+    if score < 0.5:
+        continue
+    keep_results.append(dt)
+    count += 1
+    f.write(str(dt) + '\n')
+    f.write('\n')
+    areas.append(bbox[2] * bbox[3])
+areas = np.asarray(areas)
+sorted_idxs = np.argsort(-areas).tolist()
+keep_results = [keep_results[k]
+                for k in sorted_idxs] if len(keep_results) > 0 else []
+print(keep_results)
+print(count)
+f.write("the total number is :" + str(int(count)))
+f.close()
+pdx.det.visualize(
+    image_name, result, threshold=0.5, save_dir='./output/mask_rcnn_r50_fpn')

+ 50 - 0
dygraph/examples/robot_grab/code/train.py

@@ -0,0 +1,50 @@
+import paddlex as pdx
+from paddlex import transforms as T
+
+# Define the transforms used for training and validation
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/transforms/operators.py
+train_transforms = T.Compose([
+    T.RandomResizeByShort(
+        short_sizes=[640, 672, 704, 736, 768, 800],
+        max_size=1333,
+        interp='CUBIC'), T.RandomHorizontalFlip(), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+eval_transforms = T.Compose([
+    T.ResizeByShort(
+        short_size=800, max_size=1333, interp='CUBIC'), T.Normalize(
+            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+])
+
+# Define the training and validation datasets
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/develop/dygraph/paddlex/cv/datasets/coco.py#L26
+train_dataset = pdx.datasets.CocoDetection(
+    data_dir='dataset/JPEGImages',
+    ann_file='dataset/train.json',
+    transforms=train_transforms,
+    shuffle=True)
+eval_dataset = pdx.datasets.CocoDetection(
+    data_dir='dataset/JPEGImages',
+    ann_file='dataset/val.json',
+    transforms=eval_transforms)
+
+# Initialize the model and start training
+# Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
+num_classes = len(train_dataset.labels)
+model = pdx.models.MaskRCNN(
+    num_classes=num_classes, backbone='ResNet50', with_fpn=True)
+
+# API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
+# Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html
+model.train(
+    num_epochs=12,
+    train_dataset=train_dataset,
+    train_batch_size=1,
+    eval_dataset=eval_dataset,
+    learning_rate=0.00125,
+    lr_decay_epochs=[8, 11],
+    warmup_steps=10,
+    warmup_start_lr=0.0,
+    save_dir='output/mask_rcnn_r50_fpn',
+    use_vdl=True)

BIN
dygraph/examples/robot_grab/images/labelme.png


BIN
dygraph/examples/robot_grab/images/lens.png


BIN
dygraph/examples/robot_grab/images/predict.bmp


BIN
dygraph/examples/robot_grab/images/predict.jpg


BIN
dygraph/examples/robot_grab/images/process.png


BIN
dygraph/examples/robot_grab/images/robot.png


BIN
dygraph/examples/robot_grab/images/split_dataset.png


BIN
dygraph/examples/robot_grab/images/vdl.png


BIN
dygraph/examples/robot_grab/images/vdl2.png


+ 0 - 1
dygraph/paddlex/__init__.py

@@ -21,7 +21,6 @@ from . import cv
 from . import seg
 from . import cls
 from . import det
-from . import models
 from . import tools
 
 from .cv.models.utils.visualize import visualize_detection

+ 45 - 197
dygraph/paddlex/cls.py

@@ -12,215 +12,63 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+import sys
 from . import cv
-from paddlex.cv.transforms import cls_transforms
-import paddlex.utils.logging as logging
 
-transforms = cls_transforms
+message = 'Your running script needs PaddleX<2.0.0, please refer to {} to solve this issue.'.format(
+    'https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#%E7%89%88%E6%9C%AC%E5%8D%87%E7%BA%A7'
+)
 
 
-class ResNet18(cv.models.ResNet18):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ResNet18, self).__init__(num_classes=num_classes)
+def __getattr__(attr):
+    if attr == 'transforms':
 
+        print("\033[1;31;40m{}\033[0m".format(message).encode("utf-8")
+              .decode("latin1"))
+        sys.exit(-1)
 
-class ResNet34(cv.models.ResNet34):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ResNet34, self).__init__(num_classes=num_classes)
 
+ResNet18 = cv.models.ResNet18
+ResNet34 = cv.models.ResNet34
+ResNet50 = cv.models.ResNet50
+ResNet101 = cv.models.ResNet101
+ResNet152 = cv.models.ResNet152
 
-class ResNet50(cv.models.ResNet50):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ResNet50, self).__init__(num_classes=num_classes)
+ResNet18_vd = cv.models.ResNet18_vd
+ResNet34_vd = cv.models.ResNet34_vd
+ResNet50_vd = cv.models.ResNet50_vd
+ResNet50_vd_ssld = cv.models.ResNet50_vd_ssld
+ResNet101_vd = cv.models.ResNet101_vd
+ResNet101_vd_ssld = cv.models.ResNet101_vd_ssld
+ResNet152_vd = cv.models.ResNet152_vd
+ResNet200_vd = cv.models.ResNet200_vd
 
+MobileNetV1 = cv.models.MobileNetV1
+MobileNetV2 = cv.models.MobileNetV2
+MobileNetV3_small = cv.models.MobileNetV3_small
+MobileNetV3_large = cv.models.MobileNetV3_large
 
-class ResNet101(cv.models.ResNet101):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ResNet101, self).__init__(num_classes=num_classes)
+AlexNet = cv.models.AlexNet
 
+DarkNet53 = cv.models.DarkNet53
 
-class ResNet50_vd(cv.models.ResNet50_vd):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ResNet50_vd, self).__init__(num_classes=num_classes)
+DenseNet121 = cv.models.DenseNet121
+DenseNet161 = cv.models.DenseNet161
+DenseNet169 = cv.models.DenseNet169
+DenseNet201 = cv.models.DenseNet201
+DenseNet264 = cv.models.DenseNet264
 
+HRNet_W18_C = cv.models.HRNet_W18_C
+HRNet_W30_C = cv.models.HRNet_W30_C
+HRNet_W32_C = cv.models.HRNet_W32_C
+HRNet_W40_C = cv.models.HRNet_W40_C
+HRNet_W44_C = cv.models.HRNet_W44_C
+HRNet_W48_C = cv.models.HRNet_W48_C
+HRNet_W64_C = cv.models.HRNet_W64_C
 
-class ResNet101_vd(cv.models.ResNet101_vd):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ResNet101_vd, self).__init__(num_classes=num_classes)
+Xception41 = cv.models.Xception41
+Xception65 = cv.models.Xception65
+Xception71 = cv.models.Xception71
 
-
-class ResNet50_vd_ssld(cv.models.ResNet50_vd_ssld):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ResNet50_vd_ssld, self).__init__(num_classes=num_classes)
-
-
-class ResNet101_vd_ssld(cv.models.ResNet101_vd_ssld):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ResNet101_vd_ssld, self).__init__(num_classes=num_classes)
-
-
-class DarkNet53(cv.models.DarkNet53):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(DarkNet53, self).__init__(num_classes=num_classes)
-
-
-class MobileNetV1(cv.models.MobileNetV1):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(MobileNetV1, self).__init__(num_classes=num_classes)
-
-
-class MobileNetV2(cv.models.MobileNetV2):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(MobileNetV2, self).__init__(num_classes=num_classes)
-
-
-class MobileNetV3_small(cv.models.MobileNetV3_small):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(MobileNetV3_small, self).__init__(num_classes=num_classes)
-
-
-class MobileNetV3_large(cv.models.MobileNetV3_large):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(MobileNetV3_large, self).__init__(num_classes=num_classes)
-
-
-class MobileNetV3_small_ssld(cv.models.MobileNetV3_small_ssld):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(MobileNetV3_small_ssld, self).__init__(num_classes=num_classes)
-
-
-class MobileNetV3_large_ssld(cv.models.MobileNetV3_large_ssld):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(MobileNetV3_large_ssld, self).__init__(num_classes=num_classes)
-
-
-class Xception41(cv.models.Xception41):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(Xception41, self).__init__(num_classes=num_classes)
-
-
-class Xception65(cv.models.Xception65):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(Xception65, self).__init__(num_classes=num_classes)
-
-
-class DenseNet121(cv.models.DenseNet121):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(DenseNet121, self).__init__(num_classes=num_classes)
-
-
-class DenseNet161(cv.models.DenseNet161):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(DenseNet161, self).__init__(num_classes=num_classes)
-
-
-class DenseNet201(cv.models.DenseNet201):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(DenseNet201, self).__init__(num_classes=num_classes)
-
-
-class ShuffleNetV2(cv.models.ShuffleNetV2):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(ShuffleNetV2, self).__init__(num_classes=num_classes)
-
-
-class HRNet_W18(cv.models.HRNet_W18_C):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(HRNet_W18, self).__init__(num_classes=num_classes)
-
-
-class AlexNet(cv.models.AlexNet):
-    def __init__(self, num_classes=1000, input_channel=None):
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(AlexNet, self).__init__(num_classes=num_classes)
+ShuffleNetV2 = cv.models.ShuffleNetV2
+ShuffleNetV2_swish = cv.models.ShuffleNetV2_swish

+ 0 - 1
dygraph/paddlex/cv/datasets/voc.py

@@ -14,7 +14,6 @@
 
 from __future__ import absolute_import
 import copy
-import os
 import os.path as osp
 import random
 import re

+ 20 - 12
dygraph/paddlex/cv/models/classifier.py

@@ -99,12 +99,12 @@ class BaseClassifier(BaseModel):
             outputs = OrderedDict([('prediction', softmax_out)])
 
         elif mode == 'eval':
-            labels = to_tensor(inputs[1].numpy().astype('int64').reshape(-1,
-                                                                         1))
+            pred = softmax_out
+            gt = inputs[1]
+            labels = inputs[1].reshape([-1, 1])
             acc1 = paddle.metric.accuracy(softmax_out, label=labels)
             k = min(5, self.num_classes)
             acck = paddle.metric.accuracy(softmax_out, label=labels, k=k)
-            prediction = softmax_out
             # multi cards eval
             if paddle.distributed.get_world_size() > 1:
                 acc1 = paddle.distributed.all_reduce(
@@ -113,17 +113,19 @@ class BaseClassifier(BaseModel):
                 acck = paddle.distributed.all_reduce(
                     acck, op=paddle.distributed.ReduceOp.
                     SUM) / paddle.distributed.get_world_size()
-                prediction = []
-                paddle.distributed.all_gather(prediction, softmax_out)
-                prediction = paddle.concat(prediction, axis=0)
+                pred = list()
+                gt = list()
+                paddle.distributed.all_gather(pred, softmax_out)
+                paddle.distributed.all_gather(gt, inputs[1])
+                pred = paddle.concat(pred, axis=0)
+                gt = paddle.concat(gt, axis=0)
 
             outputs = OrderedDict([('acc1', acc1), ('acc{}'.format(k), acck),
-                                   ('prediction', prediction)])
+                                   ('prediction', pred), ('labels', gt)])
 
         else:
             # mode == 'train'
-            labels = to_tensor(inputs[1].numpy().astype('int64').reshape(-1,
-                                                                         1))
+            labels = inputs[1].reshape([-1, 1])
             loss = CELoss(class_dim=self.num_classes)
             loss = loss(net_out, inputs[1])
             acc1 = paddle.metric.accuracy(softmax_out, label=labels, k=1)
@@ -358,9 +360,9 @@ class BaseClassifier(BaseModel):
         self.eval_data_loader = self.build_data_loader(
             eval_dataset, batch_size=batch_size, mode='eval')
         eval_metrics = TrainingStats()
-        eval_details = None
         if return_details:
-            eval_details = list()
+            true_labels = list()
+            pred_scores = list()
 
         logging.info(
             "Start to evaluate(total_samples={}, total_steps={})...".format(
@@ -370,10 +372,16 @@ class BaseClassifier(BaseModel):
             for step, data in enumerate(self.eval_data_loader()):
                 outputs = self.run(self.net, data, mode='eval')
                 if return_details:
-                    eval_details.append(outputs['prediction'].tolist())
+                    true_labels.extend(outputs['labels'].tolist())
+                    pred_scores.extend(outputs['prediction'].tolist())
                 outputs.pop('prediction')
+                outputs.pop('labels')
                 eval_metrics.update(outputs)
         if return_details:
+            eval_details = {
+                'true_labels': true_labels,
+                'pred_scores': pred_scores
+            }
             return eval_metrics.get(), eval_details
         else:
             return eval_metrics.get()
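
With this change, evaluate(..., return_details=True) returns a dict holding 'true_labels' and 'pred_scores' instead of a bare prediction list, so ground truth and scores stay aligned even under multi-card evaluation. A minimal consumption sketch (a trained model and an eval_dataset are assumed to exist; the top-1 check is illustrative, not part of the API):

import numpy as np

metrics, details = model.evaluate(eval_dataset, batch_size=8, return_details=True)
labels = np.array(details['true_labels'])    # one ground-truth label per sample
scores = np.array(details['pred_scores'])    # per-class softmax scores per sample
print('top-1 recomputed from details:',
      float((scores.argmax(axis=1) == labels).mean()))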

+ 80 - 25
dygraph/paddlex/cv/models/detector.py

@@ -60,18 +60,7 @@ class BaseDetector(BaseModel):
     def _fix_transforms_shape(self, image_shape):
         raise NotImplementedError("_fix_transforms_shape: not implemented!")
 
-    def _get_test_inputs(self, image_shape):
-        if image_shape is not None:
-            if len(image_shape) == 2:
-                image_shape = [None, 3] + image_shape
-            if image_shape[-2] % 32 > 0 or image_shape[-1] % 32 > 0:
-                raise Exception(
-                    "Height and width in fixed_input_shape must be a multiple of 32, but recieved is {}.".
-                    format(image_shape[-2:]))
-            self._fix_transforms_shape(image_shape[-2:])
-        else:
-            image_shape = [None, 3, -1, -1]
-
+    def _define_input_spec(self, image_shape):
         input_spec = [{
             "image": InputSpec(
                 shape=image_shape, name='image', dtype='float32'),
@@ -82,9 +71,26 @@ class BaseDetector(BaseModel):
                 name='scale_factor',
                 dtype='float32')
         }]
-
         return input_spec
 
+    def _check_image_shape(self, image_shape):
+        if len(image_shape) == 2:
+            image_shape = [None, 3] + image_shape
+        if image_shape[-2] % 32 > 0 or image_shape[-1] % 32 > 0:
+            raise Exception(
+                "Height and width in fixed_input_shape must be a multiple of 32, but received {}.".
+                format(image_shape[-2:]))
+        return image_shape
+
+    def _get_test_inputs(self, image_shape):
+        if image_shape is not None:
+            image_shape = self._check_image_shape(image_shape)
+            self._fix_transforms_shape(image_shape[-2:])
+        else:
+            image_shape = [None, 3, -1, -1]
+
+        return self._define_input_spec(image_shape)
+
     def _get_backbone(self, backbone_name, **params):
         backbone = getattr(ppdet.modeling, backbone_name)(**params)
         return backbone
@@ -455,10 +461,11 @@ class BaseDetector(BaseModel):
             If img_file is a string or np.array, the result is a list of dict with key-value pairs:
             {"category_id": `category_id`, "category": `category`, "bbox": `[x, y, w, h]`, "score": `score`}.
             If img_file is a list, the result is a list composed of dicts with the corresponding fields:
-            category_id(int): the predicted category ID
+            category_id(int): the predicted category ID. 0 represents the first category in the dataset, and so on.
             category(str): category name
             bbox(list): bounding box in [x, y, w, h] format
            score(float): confidence score
+            mask(dict): Mask of the object in RLE format. Only returned for instance segmentation tasks.
 
         """
         if transforms is None and not hasattr(self, 'test_transforms'):
@@ -512,7 +519,7 @@ class BaseDetector(BaseModel):
                     h = ymax - ymin
                     bbox = [xmin, ymin, w, h]
                     dt_res = {
-                        'category_id': int(num_id) + 1,
+                        'category_id': int(num_id),
                         'category': category,
                         'bbox': bbox,
                         'score': score
@@ -544,9 +551,9 @@ class BaseDetector(BaseModel):
                         if 'counts' in rle:
                             rle['counts'] = rle['counts'].decode("utf8")
                     sg_res = {
-                        'category_id': int(label) + 1,
+                        'category_id': int(label),
                         'category': category,
-                        'segmentation': rle,
+                        'mask': rle,
                         'score': score
                     }
                     seg_res.append(sg_res)
@@ -720,6 +727,7 @@ class FasterRCNN(BaseDetector):
                  num_classes=80,
                  backbone='ResNet50',
                  with_fpn=True,
+                 with_dcn=False,
                  aspect_ratios=[0.5, 1.0, 2.0],
                  anchor_sizes=[[32], [64], [128], [256], [512]],
                  keep_top_k=100,
@@ -740,12 +748,17 @@ class FasterRCNN(BaseDetector):
                 "('ResNet50', 'ResNet50_vd', 'ResNet50_vd_ssld', 'ResNet34', 'ResNet34_vd', "
                 "'ResNet101', 'ResNet101_vd', 'HRNet_W18')".format(backbone))
         self.backbone_name = backbone
+        dcn_v2_stages = [1, 2, 3] if with_dcn else [-1]
         if backbone == 'HRNet_W18':
             if not with_fpn:
                 logging.warning(
                     "Backbone {} should be used along with fpn enabled, 'with_fpn' is forcibly set to True".
                     format(backbone))
                 with_fpn = True
+            if with_dcn:
+                logging.warning(
+                    "Backbone {} should be used along with dcn disabled, 'with_dcn' is forcibly set to False".
+                    format(backbone))
             backbone = self._get_backbone(
                 'HRNet', width=18, freeze_at=0, return_idx=[0, 1, 2, 3])
         elif backbone == 'ResNet50_vd_ssld':
@@ -761,7 +774,8 @@ class FasterRCNN(BaseDetector):
                 freeze_at=0,
                 return_idx=[0, 1, 2, 3],
                 num_stages=4,
-                lr_mult_list=[0.05, 0.05, 0.1, 0.15])
+                lr_mult_list=[0.05, 0.05, 0.1, 0.15],
+                dcn_v2_stages=dcn_v2_stages)
         elif 'ResNet50' in backbone:
             if with_fpn:
                 backbone = self._get_backbone(
@@ -770,8 +784,13 @@ class FasterRCNN(BaseDetector):
                     norm_type='bn',
                     freeze_at=0,
                     return_idx=[0, 1, 2, 3],
-                    num_stages=4)
+                    num_stages=4,
+                    dcn_v2_stages=dcn_v2_stages)
             else:
+                if with_dcn:
+                    logging.warning(
+                        "Backbone {} without fpn should be used along with dcn disabled, 'with_dcn' is forcibly set to False".
+                        format(backbone))
                 backbone = self._get_backbone(
                     'ResNet',
                     variant='d' if '_vd' in backbone else 'b',
@@ -792,7 +811,8 @@ class FasterRCNN(BaseDetector):
                 norm_type='bn',
                 freeze_at=0,
                 return_idx=[0, 1, 2, 3],
-                num_stages=4)
+                num_stages=4,
+                dcn_v2_stages=dcn_v2_stages)
         else:
             if not with_fpn:
                 logging.warning(
@@ -806,7 +826,8 @@ class FasterRCNN(BaseDetector):
                 norm_type='bn',
                 freeze_at=0,
                 return_idx=[0, 1, 2, 3],
-                num_stages=4)
+                num_stages=4,
+                dcn_v2_stages=dcn_v2_stages)
 
         rpn_in_channel = backbone.out_shape[0].channels
 
@@ -848,7 +869,8 @@ class FasterRCNN(BaseDetector):
                 if test_pre_nms_top_n is None else test_pre_nms_top_n,
                 'post_nms_top_n': test_post_nms_top_n
             }
-            head = ppdet.modeling.TwoFCHead(out_channel=1024)
+            head = ppdet.modeling.TwoFCHead(
+                in_channel=neck.out_shape[0].channels, out_channel=1024)
             roi_extractor_cfg = {
                 'resolution': 7,
                 'spatial_scale': [1. / i.stride for i in neck.out_shape],
@@ -988,6 +1010,18 @@ class FasterRCNN(BaseDetector):
                 self.test_transforms.transforms.append(
                     Padding(im_padding_value=[0., 0., 0.]))
 
+    def _get_test_inputs(self, image_shape):
+        if image_shape is not None:
+            image_shape = self._check_image_shape(image_shape)
+            self._fix_transforms_shape(image_shape[-2:])
+        else:
+            image_shape = [None, 3, -1, -1]
+            if self.with_fpn:
+                self.test_transforms.transforms.append(
+                    Padding(im_padding_value=[0., 0., 0.]))
+
+        return self._define_input_spec(image_shape)
+
 
 class PPYOLO(YOLOv3):
     def __init__(self,
@@ -1416,6 +1450,7 @@ class MaskRCNN(BaseDetector):
                  num_classes=80,
                  backbone='ResNet50_vd',
                  with_fpn=True,
+                 with_dcn=False,
                  aspect_ratios=[0.5, 1.0, 2.0],
                  anchor_sizes=[[32], [64], [128], [256], [512]],
                  keep_top_k=100,
@@ -1437,6 +1472,7 @@ class MaskRCNN(BaseDetector):
                 format(backbone))
 
         self.backbone_name = backbone + '_fpn' if with_fpn else backbone
+        dcn_v2_stages = [1, 2, 3] if with_dcn else [-1]
 
         if backbone == 'ResNet50':
             if with_fpn:
@@ -1445,8 +1481,13 @@ class MaskRCNN(BaseDetector):
                     norm_type='bn',
                     freeze_at=0,
                     return_idx=[0, 1, 2, 3],
-                    num_stages=4)
+                    num_stages=4,
+                    dcn_v2_stages=dcn_v2_stages)
             else:
+                if with_dcn:
+                    logging.warning(
+                        "Backbone {} should be used along with dcn disabled, 'with_dcn' is forcibly set to False".
+                        format(backbone))
                 backbone = self._get_backbone(
                     'ResNet',
                     norm_type='bn',
@@ -1468,7 +1509,8 @@ class MaskRCNN(BaseDetector):
                 return_idx=[0, 1, 2, 3],
                 num_stages=4,
                 lr_mult_list=[0.05, 0.05, 0.1, 0.15]
-                if '_ssld' in backbone else [1.0, 1.0, 1.0, 1.0])
+                if '_ssld' in backbone else [1.0, 1.0, 1.0, 1.0],
+                dcn_v2_stages=dcn_v2_stages)
 
         else:
             if not with_fpn:
@@ -1483,7 +1525,8 @@ class MaskRCNN(BaseDetector):
                 norm_type='bn',
                 freeze_at=0,
                 return_idx=[0, 1, 2, 3],
-                num_stages=4)
+                num_stages=4,
+                dcn_v2_stages=dcn_v2_stages)
 
         rpn_in_channel = backbone.out_shape[0].channels
 
@@ -1688,3 +1731,15 @@ class MaskRCNN(BaseDetector):
                         interp='CUBIC')
                 self.test_transforms.transforms.append(
                     Padding(im_padding_value=[0., 0., 0.]))
+
+    def _get_test_inputs(self, image_shape):
+        if image_shape is not None:
+            image_shape = self._check_image_shape(image_shape)
+            self._fix_transforms_shape(image_shape[-2:])
+        else:
+            image_shape = [None, 3, -1, -1]
+            if self.with_fpn:
+                self.test_transforms.transforms.append(
+                    Padding(im_padding_value=[0., 0., 0.]))
+
+        return self._define_input_spec(image_shape)
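
Taken together, these changes add a with_dcn switch to the two R-CNN constructors and make predict() return 0-based category_ids plus, for Mask R-CNN, the RLE under 'mask'. A construction-and-prediction sketch (import path per this repo's layout; num_classes and the image path are placeholders):

from paddlex.cv.models import MaskRCNN

model = MaskRCNN(num_classes=2, backbone='ResNet50_vd',
                 with_fpn=True, with_dcn=True)
# ... after training or loading weights:
for det in model.predict('test.jpg'):
    # category_id is now 0-based: 0 is the first category in the dataset
    print(det['category_id'], det['category'], det['bbox'], det['score'])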

+ 1 - 1
dygraph/paddlex/cv/models/segmenter.py

@@ -208,7 +208,7 @@ class BaseSegmenter(BaseModel):
             log_interval_steps(int, optional): Step interval for printing training information. Defaults to 10.
             save_dir(str, optional): Directory to save the model. Defaults to 'output'.
             pretrain_weights(str or None, optional):
-                None or name/path of pretrained weights. If None, no pretrained weights will be loaded. Defaults to 'IMAGENET'.
+                None or name/path of pretrained weights. If None, no pretrained weights will be loaded. Defaults to 'CITYSCAPES'.
             learning_rate(float, optional): Learning rate for training. Defaults to .025.
             lr_decay_power(float, optional): Learning decay power. Defaults to .9.
             early_stop(bool, optional): Whether to adopt early stop strategy. Defaults to False.
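
The corrected docstring now matches the actual behavior: segmentation train() loads 'CITYSCAPES' weights unless told otherwise. A hedged sketch (UNet as an example wrapper from this package; datasets are assumed to be built elsewhere):

from paddlex.cv.models import UNet

model = UNet(num_classes=2)
model.train(
    num_epochs=40,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    pretrain_weights='CITYSCAPES',   # the documented default; pass None to skip
    save_dir='output/unet')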

+ 10 - 4
dygraph/paddlex/cv/models/utils/det_metrics/metrics.py

@@ -61,6 +61,11 @@ class VOCMetric(Metric):
                  classwise=False):
         self.cid2cname = {i: name for i, name in enumerate(labels)}
         self.coco_gt = coco_gt
+        self.clsid2catid = {
+            i: cat['id']
+            for i, cat in enumerate(
+                self.coco_gt.loadCats(self.coco_gt.getCatIds()))
+        }
         self.overlap_thresh = overlap_thresh
         self.map_type = map_type
         self.evaluate_difficult = evaluate_difficult
@@ -80,9 +85,10 @@ class VOCMetric(Metric):
         self.detection_map.reset()
 
     def update(self, inputs, outputs):
-        bboxes = outputs['bbox'][:, 2:].numpy()
-        scores = outputs['bbox'][:, 1].numpy()
-        labels = outputs['bbox'][:, 0].numpy()
+        bbox_np = outputs['bbox'].numpy()
+        bboxes = bbox_np[:, 2:]
+        scores = bbox_np[:, 1]
+        labels = bbox_np[:, 0]
         bbox_lengths = outputs['bbox_num'].numpy()
 
         if bboxes.shape == (1, 1) or bboxes is None:
@@ -121,7 +127,7 @@ class VOCMetric(Metric):
                 bbox = [xmin, ymin, w, h]
                 coco_res = {
                     'image_id': int(inputs['im_id']),
-                    'category_id': int(l + 1),
+                    'category_id': self.clsid2catid[int(l)],
                     'bbox': bbox,
                     'score': float(s)
                 }
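
The new clsid2catid table converts contiguous, 0-based prediction labels back to the (possibly non-contiguous) category ids stored in the COCO-format ground truth before results reach the evaluator. The same mapping, standalone (the annotation path is hypothetical):

from pycocotools.coco import COCO

coco_gt = COCO('annotations/val.json')   # hypothetical COCO-format file
clsid2catid = {
    i: cat['id']
    for i, cat in enumerate(coco_gt.loadCats(coco_gt.getCatIds()))
}
# a detection labeled 0 by the model is reported under the first category id:
print(clsid2catid[0])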

+ 2 - 2
dygraph/paddlex/cv/models/utils/visualize.py

@@ -252,8 +252,8 @@ def draw_bbox_mask(image, results, threshold=0.5, color_map=None):
                 linestyle="-", ))
 
         # draw mask
-        if 'segmentation' in dt:
-            mask = mask_util.decode(dt['segmentation'])
+        if 'mask' in dt:
+            mask = mask_util.decode(dt['mask'])
             mask = np.ascontiguousarray(mask)
             res = cv2.findContours(
                 mask.astype("uint8"), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)

+ 0 - 164
dygraph/paddlex/cv/transforms/cls_transforms.py

@@ -1,164 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""
-function:
-    transforms for classification in PaddleX<2.0
-"""
-
-import math
-import numpy as np
-import cv2
-from PIL import Image
-from .operators import Transform, Compose, RandomHorizontalFlip, RandomVerticalFlip, Normalize, \
-    ResizeByShort, CenterCrop, RandomDistort, ArrangeClassifier
-
-__all__ = [
-    'Compose', 'RandomHorizontalFlip', 'RandomVerticalFlip', 'Normalize',
-    'ResizeByShort', 'CenterCrop', 'RandomDistort', 'ArrangeClassifier',
-    'RandomCrop', 'RandomRotate', 'ComposedClsTransforms'
-]
-
-
-class RandomCrop(Transform):
-    """对图像进行随机剪裁,模型训练时的数据增强操作。
-    1. 根据lower_scale、lower_ratio、upper_ratio计算随机剪裁的高、宽。
-    2. 根据随机剪裁的高、宽随机选取剪裁的起始点。
-    3. 剪裁图像。
-    4. 调整剪裁后的图像的大小到crop_size*crop_size。
-    Args:
-        crop_size (int): 随机裁剪后重新调整的目标边长。默认为224。
-        lower_scale (float): 裁剪面积相对原面积比例的最小限制。默认为0.08。
-        lower_ratio (float): 宽变换比例的最小限制。默认为3. / 4。
-        upper_ratio (float): 宽变换比例的最大限制。默认为4. / 3。
-    """
-
-    def __init__(self,
-                 crop_size=224,
-                 lower_scale=0.08,
-                 lower_ratio=3. / 4,
-                 upper_ratio=4. / 3):
-        super(RandomCrop, self).__init__()
-        self.crop_size = crop_size
-        self.lower_scale = lower_scale
-        self.lower_ratio = lower_ratio
-        self.upper_ratio = upper_ratio
-
-    def apply_im(self, image):
-        scale = [self.lower_scale, 1.0]
-        ratio = [self.lower_ratio, self.upper_ratio]
-        aspect_ratio = math.sqrt(np.random.uniform(*ratio))
-        w = 1. * aspect_ratio
-        h = 1. / aspect_ratio
-        bound = min((float(image.shape[0]) / image.shape[1]) / (h**2),
-                    (float(image.shape[1]) / image.shape[0]) / (w**2))
-        scale_max = min(scale[1], bound)
-        scale_min = min(scale[0], bound)
-        target_area = image.shape[0] * image.shape[1] * np.random.uniform(
-            scale_min, scale_max)
-        target_size = math.sqrt(target_area)
-        w = int(target_size * w)
-        h = int(target_size * h)
-        i = np.random.randint(0, image.shape[0] - h + 1)
-        j = np.random.randint(0, image.shape[1] - w + 1)
-        image = image[i:i + h, j:j + w, :]
-        image = cv2.resize(image, (self.crop_size, self.crop_size))
-        return image
-
-    def apply(self, sample):
-        sample['image'] = self.apply_im(sample['image'])
-        return sample
-
-
-class RandomRotate(Transform):
-    def __init__(self, rotate_range=30, prob=.5):
-        """
-        Randomly rotate image(s) by an arbitrary angle between -rotate_range and rotate_range.
-        Args:
-            rotate_range(int, optional): Range of the rotation angle. Defaults to 30.
-            prob(float, optional): Probability of operating rotation. Defaults to .5.
-        """
-        self.rotate_range = rotate_range
-        self.prob = prob
-
-    def apply_im(self, image, angle):
-        image = image.astype('uint8')
-        image = Image.fromarray(image)
-        image = image.rotate(angle)
-        image = np.asarray(image).astype('float32')
-        return image
-
-    def apply(self, sample):
-        rotate_lower = -self.rotate_range
-        rotate_upper = self.rotate_range
-
-        if np.random.uniform(0, 1) < self.prob:
-            angle = np.random.uniform(rotate_lower, rotate_upper)
-            sample['image'] = self.apply_im(sample['image'], angle)
-
-        return sample
-
-
-class ComposedClsTransforms(Compose):
-    """ 分类模型的基础Transforms流程,具体如下
-        训练阶段:
-        1. 随机从图像中crop一块子图,并resize成crop_size大小
-        2. 将1的输出按0.5的概率随机进行水平翻转
-        3. 将图像进行归一化
-        验证/预测阶段:
-        1. 将图像按比例Resize,使得最小边长度为crop_size[0] * 1.14
-        2. 从图像中心crop出一个大小为crop_size的图像
-        3. 将图像进行归一化
-        Args:
-            mode(str): 图像处理流程所处阶段,训练/验证/预测,分别对应'train', 'eval', 'test'
-            crop_size(int|list): 输入模型里的图像大小
-            mean(list): 图像均值
-            std(list): 图像方差
-            random_horizontal_flip(bool): 是否以0.5的概率使用随机水平翻转增强,该仅在mode为`train`时生效,默认为True
-    """
-
-    def __init__(self,
-                 mode,
-                 crop_size=[224, 224],
-                 mean=[0.485, 0.456, 0.406],
-                 std=[0.229, 0.224, 0.225],
-                 random_horizontal_flip=True):
-        width = crop_size
-        if isinstance(crop_size, list):
-            if crop_size[0] != crop_size[1]:
-                raise Exception(
-                    "In classifier model, width and height should be equal, please modify your parameter `crop_size`"
-                )
-            width = crop_size[0]
-        if width % 32 != 0:
-            raise Exception(
-                "In classifier model, width and height should be multiple of 32, e.g 224、256、320...., please modify your parameter `crop_size`"
-            )
-
-        if mode == 'train':
-            # transforms for training, with data augmentation
-            transforms = [
-                RandomCrop(crop_size=width), Normalize(
-                    mean=mean, std=std)
-            ]
-            if random_horizontal_flip:
-                transforms.insert(0, RandomHorizontalFlip())
-        else:
-            # transforms for evaluation/prediction
-            transforms = [
-                ResizeByShort(short_size=int(width * 1.14)),
-                CenterCrop(crop_size=width), Normalize(
-                    mean=mean, std=std)
-            ]
-
-        super(ComposedClsTransforms, self).__init__(transforms)

+ 0 - 152
dygraph/paddlex/cv/transforms/det_transforms.py

@@ -1,152 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""
-function:
-    transforms for detection in PaddleX<2.0
-"""
-
-import numpy as np
-from .operators import Transform, Compose, ResizeByShort, Resize, RandomHorizontalFlip, Normalize
-from .operators import RandomExpand as dy_RandomExpand
-from .operators import RandomCrop as dy_RandomCrop
-from .functions import is_poly, expand_poly, expand_rle
-
-__all__ = [
-    'Compose', 'ResizeByShort', 'Resize', 'RandomHorizontalFlip', 'Normalize',
-    'Padding', 'RandomExpand', 'RandomCrop'
-]
-
-
-class Padding(Transform):
-    """1.将图像的长和宽padding至coarsest_stride的倍数。如输入图像为[300, 640],
-       `coarest_stride`为32,则由于300不为32的倍数,因此在图像最右和最下使用0值
-       进行padding,最终输出图像为[320, 640]。
-       2.或者,将图像的长和宽padding到target_size指定的shape,如输入的图像为[300,640],
-         a. `target_size` = 960,在图像最右和最下使用0值进行padding,最终输出
-            图像为[960, 960]。
-         b. `target_size` = [640, 960],在图像最右和最下使用0值进行padding,最终
-            输出图像为[640, 960]。
-    1. 如果coarsest_stride为1,target_size为None则直接返回。
-    2. 获取图像的高H、宽W。
-    3. 计算填充后图像的高H_new、宽W_new。
-    4. 构建大小为(H_new, W_new, 3)像素值为0的np.ndarray,
-       并将原图的np.ndarray粘贴于左上角。
-    Args:
-        coarsest_stride (int): 填充后的图像长、宽为该参数的倍数,默认为1。
-        target_size (int|list|tuple): 填充后的图像长、宽,默认为None,coarset_stride优先级更高。
-    Raises:
-        TypeError: 形参`target_size`数据类型不满足需求。
-        ValueError: 形参`target_size`为(list|tuple)时,长度不满足需求。
-    """
-
-    def __init__(self, coarsest_stride=1, target_size=None):
-        if target_size is not None:
-            if not isinstance(target_size, int):
-                if not isinstance(target_size, tuple) and not isinstance(
-                        target_size, list):
-                    raise TypeError(
-                        "Padding: Type of target_size must in (int|list|tuple)."
-                    )
-                elif len(target_size) != 2:
-                    raise ValueError(
-                        "Padding: Length of target_size must equal 2.")
-        super(Padding, self).__init__()
-        self.coarsest_stride = coarsest_stride
-        self.target_size = target_size
-
-    def apply_im(self, image, padding_im_h, padding_im_w):
-        im_h, im_w, im_c = image.shape
-        padding_im = np.zeros(
-            (padding_im_h, padding_im_w, im_c), dtype=np.float32)
-        padding_im[:im_h, :im_w, :] = image
-        return padding_im
-
-    def apply_bbox(self, bbox):
-        return bbox
-
-    def apply_segm(self, segms, im_h, im_w, padding_im_h, padding_im_w):
-        expanded_segms = []
-        for segm in segms:
-            if is_poly(segm):
-                # Polygon format
-                expanded_segms.append(
-                    [expand_poly(poly, 0, 0) for poly in segm])
-            else:
-                # RLE format
-                expanded_segms.append(
-                    expand_rle(segm, 0, 0, im_h, im_w, padding_im_h,
-                               padding_im_w))
-        return expanded_segms
-
-    def apply(self, sample):
-        im_h, im_w, im_c = sample['image'].shape[:]
-
-        if isinstance(self.target_size, int):
-            padding_im_h = self.target_size
-            padding_im_w = self.target_size
-        elif isinstance(self.target_size, list) or isinstance(self.target_size,
-                                                              tuple):
-            padding_im_w = self.target_size[0]
-            padding_im_h = self.target_size[1]
-        elif self.coarsest_stride > 0:
-            padding_im_h = int(
-                np.ceil(im_h / self.coarsest_stride) * self.coarsest_stride)
-            padding_im_w = int(
-                np.ceil(im_w / self.coarsest_stride) * self.coarsest_stride)
-        else:
-            raise ValueError(
-                "coarsest_stridei(>1) or target_size(list|int) need setting in Padding transform"
-            )
-        pad_height = padding_im_h - im_h
-        pad_width = padding_im_w - im_w
-        if pad_height < 0 or pad_width < 0:
-            raise ValueError(
-                'the size of image should be less than target_size, but the size of image ({}, {}), is larger than target_size ({}, {})'
-                .format(im_w, im_h, padding_im_w, padding_im_h))
-        sample['image'] = self.apply_im(sample['image'], padding_im_h,
-                                        padding_im_w)
-        if 'gt_bbox' in sample and len(sample['gt_bbox']) > 0:
-            sample['gt_bbox'] = self.apply_bbox(sample['gt_bbox'])
-        if 'gt_poly' in sample and len(sample['gt_poly']) > 0:
-            sample['gt_poly'] = self.apply_segm(sample['gt_poly'], im_h, im_w,
-                                                padding_im_h, padding_im_w)
-
-        return sample
-
-
-class RandomExpand(dy_RandomExpand):
-    def __init__(self,
-                 ratio=4.,
-                 prob=0.5,
-                 fill_value=[123.675, 116.28, 103.53]):
-        super(RandomExpand, self).__init__(
-            upper_ratio=ratio, prob=prob, im_padding_value=fill_value)
-
-
-class RandomCrop(dy_RandomCrop):
-    def __init__(self,
-                 aspect_ratio=[.5, 2.],
-                 thresholds=[.0, .1, .3, .5, .7, .9],
-                 scaling=[.3, 1.],
-                 num_attempts=50,
-                 allow_no_crop=True,
-                 cover_all_box=False):
-        super(RandomCrop, self).__init__(
-            crop_size=None,
-            aspect_ratio=aspect_ratio,
-            thresholds=thresholds,
-            scaling=scaling,
-            num_attempts=num_attempts,
-            allow_no_crop=allow_no_crop,
-            cover_all_box=cover_all_box)

+ 95 - 69
dygraph/paddlex/cv/transforms/operators.py

@@ -30,9 +30,10 @@ from .functions import normalize, horizontal_flip, permute, vertical_flip, cente
 
 __all__ = [
     "Compose", "Decode", "Resize", "RandomResize", "ResizeByShort",
-    "RandomResizeByShort", "RandomHorizontalFlip", "RandomVerticalFlip",
-    "Normalize", "CenterCrop", "RandomCrop", "RandomExpand", "Padding",
-    "MixupImage", "RandomDistort", "ArrangeSegmenter", "ArrangeClassifier",
+    "RandomResizeByShort", "ResizeByLong", "RandomHorizontalFlip",
+    "RandomVerticalFlip", "Normalize", "CenterCrop", "RandomCrop",
+    "RandomScaleAspect", "RandomExpand", "Padding", "MixupImage",
+    "RandomDistort", "RandomBlur", "ArrangeSegmenter", "ArrangeClassifier",
     "ArrangeDetector"
 ]
 
@@ -375,44 +376,7 @@ class ResizeByShort(Transform):
         self.max_size = max_size
         self.interp = interp
 
-    def apply_im(self, image, interp, target_size):
-        image = cv2.resize(image, target_size, interpolation=interp)
-        return image
-
-    def apply_mask(self, mask, target_size):
-        mask = cv2.resize(mask, target_size, interpolation=cv2.INTER_NEAREST)
-        return mask
-
-    def apply_bbox(self, bbox, scale, target_size):
-        im_scale_x, im_scale_y = scale
-        bbox[:, 0::2] *= im_scale_x
-        bbox[:, 1::2] *= im_scale_y
-        bbox[:, 0::2] = np.clip(bbox[:, 0::2], 0, target_size[1])
-        bbox[:, 1::2] = np.clip(bbox[:, 1::2], 0, target_size[0])
-        return bbox
-
-    def apply_segm(self, segms, im_size, scale):
-        im_h, im_w = im_size
-        im_scale_x, im_scale_y = scale
-        resized_segms = []
-        for segm in segms:
-            if is_poly(segm):
-                # Polygon format
-                resized_segms.append([
-                    resize_poly(poly, im_scale_x, im_scale_y) for poly in segm
-                ])
-            else:
-                # RLE format
-                resized_segms.append(
-                    resize_rle(segm, im_h, im_w, im_scale_x, im_scale_y))
-
-        return resized_segms
-
     def apply(self, sample):
-        if self.interp == "RANDOM":
-            interp = random.choice(list(interp_dict.values()))
-        else:
-            interp = interp_dict[self.interp]
         im_h, im_w = sample['image'].shape[:2]
         im_short_size = min(im_h, im_w)
         im_long_size = max(im_h, im_w)
@@ -422,25 +386,7 @@ class ResizeByShort(Transform):
         target_w = int(round(im_w * scale))
         target_h = int(round(im_h * scale))
         target_size = (target_w, target_h)
-
-        sample['image'] = self.apply_im(sample['image'], interp, target_size)
-        im_scale_y = target_h / im_h
-        im_scale_x = target_w / im_w
-        if 'mask' in sample:
-            sample['mask'] = self.apply_mask(sample['mask'], target_size)
-        if 'gt_bbox' in sample and len(sample['gt_bbox']) > 0:
-            sample['gt_bbox'] = self.apply_bbox(
-                sample['gt_bbox'], [im_scale_x, im_scale_y], target_size)
-        if 'gt_poly' in sample and len(sample['gt_poly']) > 0:
-            sample['gt_poly'] = self.apply_segm(
-                sample['gt_poly'], [im_h, im_w], [im_scale_x, im_scale_y])
-        sample['im_shape'] = np.asarray(
-            sample['image'].shape[:2], dtype=np.float32)
-        if 'scale_factor' in sample:
-            scale_factor = sample['scale_factor']
-            sample['scale_factor'] = np.asarray(
-                [scale_factor[0] * im_scale_y, scale_factor[1] * im_scale_x],
-                dtype=np.float32)
+        sample = Resize(target_size=target_size, interp=self.interp)(sample)
 
         return sample
 
@@ -484,6 +430,24 @@ class RandomResizeByShort(Transform):
         return sample
 
 
+class ResizeByLong(Transform):
+    def __init__(self, long_size=256, interp='LINEAR'):
+        super(ResizeByLong, self).__init__()
+        self.long_size = long_size
+        self.interp = interp
+
+    def apply(self, sample):
+        im_h, im_w = sample['image'].shape[:2]
+        im_long_size = max(im_h, im_w)
+        scale = float(self.long_size) / float(im_long_size)
+        target_h = int(round(im_h * scale))
+        target_w = int(round(im_w * scale))
+        target_size = (target_w, target_h)
+        sample = Resize(target_size=target_size, interp=self.interp)(sample)
+
+        return sample
+
+
 class RandomHorizontalFlip(Transform):
     """
     Randomly flip the input horizontally.
@@ -683,12 +647,13 @@ class RandomCrop(Transform):
     4. Resize the cropped area to crop_size by crop_size.
 
     Args:
-        crop_size(int or None, optional): Target size of the cropped area. If None, the cropped area will not be resized. Defaults to None.
-        aspect_ratio (List[float], optional): Aspect ratio of cropped region.
-            in [min, max] format. Defaults to [.5, .2].
-        thresholds (List[float], optional): Iou thresholds to decide a valid bbox crop. Defaults to [.0, .1, .3, .5, .7, .9].
-        scaling (List[float], optional): Ratio between the cropped region and the original image.
-             in [min, max] format, default [.3, 1.].
+        crop_size(int, List[int] or Tuple[int]): Target size of the cropped area. If None, the cropped area will not be
+            resized. Defaults to None.
+        aspect_ratio (List[float], optional): Aspect ratio of cropped region in [min, max] format. Defaults to [.5, 2.].
+        thresholds (List[float], optional): Iou thresholds to decide a valid bbox crop.
+            Defaults to [.0, .1, .3, .5, .7, .9].
+        scaling (List[float], optional): Ratio between the cropped region and the original image in [min, max] format.
+            Defaults to [.3, 1.].
         num_attempts (int, optional): The number of tries before giving up. Defaults to 50.
         allow_no_crop (bool, optional): Whether returning without doing crop is allowed. Defaults to True.
         cover_all_box (bool, optional): Whether to ensure all bboxes are covered in the final crop. Defaults to False.
@@ -860,11 +825,37 @@ class RandomCrop(Transform):
                 sample['mask'] = self.apply_mask(sample['mask'], crop_box)
 
         if self.crop_size is not None:
-            sample = Resize((self.crop_size, self.crop_size))(sample)
+            sample = Resize(self.crop_size)(sample)
 
         return sample
 
 
+class RandomScaleAspect(Transform):
+    """
+    Crop the input image(s) and resize back to the original size.
+    Args:
+        min_scale (float): Minimum ratio between the cropped region and the original image.
+            If 0, image(s) will not be cropped. Defaults to .5.
+        aspect_ratio (float): Lower bound of the aspect ratio of the cropped region; the ratio is
+            sampled from [aspect_ratio, 1/aspect_ratio]. If 0, image(s) will not be cropped.
+            Defaults to .33.
+    """
+
+    def __init__(self, min_scale=0.5, aspect_ratio=0.33):
+        super(RandomScaleAspect, self).__init__()
+        self.min_scale = min_scale
+        self.aspect_ratio = aspect_ratio
+
+    def apply(self, sample):
+        if self.min_scale != 0 and self.aspect_ratio != 0:
+            img_height, img_width = sample['image'].shape[:2]
+            sample = RandomCrop(
+                crop_size=(img_height, img_width),
+                aspect_ratio=[self.aspect_ratio, 1. / self.aspect_ratio],
+                scaling=[self.min_scale, 1.],
+                num_attempts=10,
+                allow_no_crop=False)(sample)
+        return sample
+
+
 class RandomExpand(Transform):
     """
     Randomly expand the input by padding according to random offsets.
@@ -934,7 +925,7 @@ class Padding(Transform):
                 if 0, only pad to right and bottom. If 1, pad according to center. If 2, only pad left and top. Defaults to 0.
             im_padding_value(Sequence[float]): RGB value of pad area. Defaults to (127.5, 127.5, 127.5).
             label_padding_value(int, optional): Filling value for the mask. Defaults to 255.
-            size_divisor(int): Image width and height after padding is a multiple of size_divisor
+            size_divisor(int): Image width and height after padding will be a multiple of size_divisor.
         """
         super(Padding, self).__init__()
         if isinstance(target_size, (list, tuple)):
@@ -952,7 +943,7 @@ class Padding(Transform):
             assert offsets, 'if pad_mode is -1, offsets should not be None'
 
         self.target_size = target_size
        self.size_divisor = size_divisor
         self.pad_mode = pad_mode
         self.offsets = offsets
         self.im_padding_value = im_padding_value
@@ -1004,7 +995,7 @@ class Padding(Transform):
             ), 'target size ({}, {}) cannot be less than image size ({}, {})'\
                 .format(h, w, im_h, im_w)
         else:
-            h = (np.ceil(im_h // self.size_divisor) *
+            h = (np.ceil(im_h / self.size_divisor) *
                  self.size_divisor).astype(int)
             w = (np.ceil(im_w / self.size_divisor) *
                  self.size_divisor).astype(int)
@@ -1235,6 +1226,41 @@ class RandomDistort(Transform):
         return sample
 
 
+class RandomBlur(Transform):
+    """
+    Randomly blur input image(s).
+
+    Args:
+        prob (float): Probability of blurring. Defaults to .1.
+    """
+
+    def __init__(self, prob=0.1):
+        super(RandomBlur, self).__init__()
+        self.prob = prob
+
+    def apply_im(self, image, radius):
+        image = cv2.GaussianBlur(image, (radius, radius), 0, 0)
+        return image
+
+    def apply(self, sample):
+        if self.prob <= 0:
+            n = 0
+        elif self.prob >= 1:
+            n = 1
+        else:
+            n = int(1.0 / self.prob)
+        if n > 0:
+            if np.random.randint(0, n) == 0:
+                radius = np.random.randint(3, 10)
+                if radius % 2 != 1:
+                    radius = radius + 1
+                if radius > 9:
+                    radius = 9
+                sample['image'] = self.apply_im(sample['image'], radius)
+
+        return sample
+
+
 class _PadBox(Transform):
     def __init__(self, num_max_boxes=50):
         """

+ 0 - 536
dygraph/paddlex/cv/transforms/seg_transforms.py

@@ -1,536 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""
-function:
-    transforms for segmentation in PaddleX<2.0
-"""
-
-import numpy as np
-import cv2
-import copy
-from .operators import Transform, Compose, RandomHorizontalFlip, RandomVerticalFlip, Resize, \
-    ResizeByShort, Normalize, RandomDistort, ArrangeSegmenter
-from .operators import Padding as dy_Padding
-
-__all__ = [
-    'Compose', 'RandomHorizontalFlip', 'RandomVerticalFlip', 'Resize',
-    'ResizeByShort', 'Normalize', 'RandomDistort', 'ArrangeSegmenter',
-    'ResizeByLong', 'ResizeRangeScaling', 'ResizeStepScaling', 'Padding',
-    'RandomPaddingCrop', 'RandomBlur', 'RandomRotate', 'RandomScaleAspect',
-    'Clip', 'ComposedSegTransforms'
-]
-
-
-class ResizeByLong(Transform):
-    """对图像长边resize到固定值,短边按比例进行缩放。当存在标注图像时,则同步进行处理。
-    Args:
-        long_size (int): resize后图像的长边大小。
-    """
-
-    def __init__(self, long_size=256):
-        super(ResizeByLong, self).__init__()
-        self.long_size = long_size
-
-    def apply_im(self, image):
-        image = _resize_long(image, long_size=self.long_size)
-        return image
-
-    def apply_mask(self, mask):
-        mask = _resize_long(
-            mask, long_size=self.long_size, interpolation=cv2.INTER_NEAREST)
-        return mask
-
-    def apply(self, sample):
-        sample['image'] = self.apply_im(sample['image'])
-        if 'mask' in sample:
-            sample['mask'] = self.apply_mask(sample['mask'])
-
-        return sample
-
-
-class ResizeRangeScaling(Transform):
-    """对图像长边随机resize到指定范围内,短边按比例进行缩放。当存在标注图像时,则同步进行处理。
-    Args:
-        min_value (int): 图像长边resize后的最小值。默认值400。
-        max_value (int): 图像长边resize后的最大值。默认值600。
-    Raises:
-        ValueError: min_value大于max_value
-    """
-
-    def __init__(self, min_value=400, max_value=600):
-        super(ResizeRangeScaling, self).__init__()
-        if min_value > max_value:
-            raise ValueError('min_value must be less than max_value, '
-                             'but they are {} and {}.'.format(min_value,
-                                                              max_value))
-        self.min_value = min_value
-        self.max_value = max_value
-
-    def apply_im(self, image, random_size):
-        image = _resize_long(image, long_size=random_size)
-        return image
-
-    def apply_mask(self, mask, random_size):
-        mask = _resize_long(
-            mask, long_size=random_size, interpolation=cv2.INTER_NEAREST)
-        return mask
-
-    def apply(self, sample):
-        if self.min_value == self.max_value:
-            random_size = self.max_value
-        else:
-            random_size = int(
-                np.random.uniform(self.min_value, self.max_value) + 0.5)
-        sample['image'] = self.apply_im(sample['image'], random_size)
-        if 'mask' in sample:
-            sample['mask'] = self.apply_mask(sample['mask'], random_size)
-
-        return sample
-
-
-class ResizeStepScaling(Transform):
-    """对图像按照某一个比例resize,这个比例以scale_step_size为步长
-    在[min_scale_factor, max_scale_factor]随机变动。当存在标注图像时,则同步进行处理。
-    Args:
-        min_scale_factor(float), resize最小尺度。默认值0.75。
-        max_scale_factor (float), resize最大尺度。默认值1.25。
-        scale_step_size (float), resize尺度范围间隔。默认值0.25。
-    Raises:
-        ValueError: min_scale_factor大于max_scale_factor
-    """
-
-    def __init__(self,
-                 min_scale_factor=0.75,
-                 max_scale_factor=1.25,
-                 scale_step_size=0.25):
-        if min_scale_factor > max_scale_factor:
-            raise ValueError(
-                'min_scale_factor must be less than max_scale_factor, '
-                'but they are {} and {}.'.format(min_scale_factor,
-                                                 max_scale_factor))
-        super(ResizeStepScaling, self).__init__()
-        self.min_scale_factor = min_scale_factor
-        self.max_scale_factor = max_scale_factor
-        self.scale_step_size = scale_step_size
-
-    def apply_im(self, image, scale_factor):
-        image = cv2.resize(
-            image, (0, 0),
-            fx=scale_factor,
-            fy=scale_factor,
-            interpolation=cv2.INTER_LINEAR)
-        if image.ndim < 3:
-            image = np.expand_dims(image, axis=-1)
-        return image
-
-    def apply_mask(self, mask, scale_factor):
-        mask = cv2.resize(
-            mask, (0, 0),
-            fx=scale_factor,
-            fy=scale_factor,
-            interpolation=cv2.INTER_NEAREST)
-        return mask
-
-    def apply(self, sample):
-        if self.min_scale_factor == self.max_scale_factor:
-            scale_factor = self.min_scale_factor
-
-        elif self.scale_step_size == 0:
-            scale_factor = np.random.uniform(self.min_scale_factor,
-                                             self.max_scale_factor)
-
-        else:
-            num_steps = int((self.max_scale_factor - self.min_scale_factor) /
-                            self.scale_step_size + 1)
-            scale_factors = np.linspace(self.min_scale_factor,
-                                        self.max_scale_factor,
-                                        num_steps).tolist()
-            np.random.shuffle(scale_factors)
-            scale_factor = scale_factors[0]
-
-        sample['image'] = self.apply_im(sample['image'], scale_factor)
-        if 'mask' in sample:
-            sample['mask'] = self.apply_mask(sample['mask'], scale_factor)
-
-        return sample
-
-
-class Padding(dy_Padding):
-    """对图像或标注图像进行padding,padding方向为右和下。
-    根据提供的值对图像或标注图像进行padding操作。
-    Args:
-        target_size (int|list|tuple): padding后图像的大小。
-        im_padding_value (list): 图像padding的值。默认为[127.5, 127.5, 127.5]。
-        label_padding_value (int): 标注图像padding的值。默认值为255。
-    Raises:
-        TypeError: target_size不是int|list|tuple。
-        ValueError:  target_size为list|tuple时元素个数不等于2。
-    """
-
-    def __init__(self,
-                 target_size,
-                 im_padding_value=[127.5, 127.5, 127.5],
-                 label_padding_value=255):
-        super(Padding, self).__init__(
-            target_size=target_size,
-            pad_mode=0,
-            offsets=None,
-            im_padding_value=im_padding_value,
-            label_padding_value=label_padding_value)
-
-
-class RandomPaddingCrop(Transform):
-    """对图像和标注图进行随机裁剪,当所需要的裁剪尺寸大于原图时,则进行padding操作。
-    Args:
-        crop_size (int|list|tuple): 裁剪图像大小。默认为512。
-        im_padding_value (list): 图像padding的值。默认为[127.5, 127.5, 127.5]。
-        label_padding_value (int): 标注图像padding的值。默认值为255。
-    Raises:
-        TypeError: crop_size不是int/list/tuple。
-        ValueError:  target_size为list/tuple时元素个数不等于2。
-    """
-
-    def __init__(self,
-                 crop_size=512,
-                 im_padding_value=[127.5, 127.5, 127.5],
-                 label_padding_value=255):
-        if isinstance(crop_size, list) or isinstance(crop_size, tuple):
-            if len(crop_size) != 2:
-                raise ValueError(
-                    'when crop_size is list or tuple, it should include 2 elements, but it is {}'
-                    .format(crop_size))
-        elif not isinstance(crop_size, int):
-            raise TypeError(
-                "Type of crop_size is invalid. Must be Integer or List or tuple, now is {}"
-                .format(type(crop_size)))
-        super(RandomPaddingCrop, self).__init__()
-        self.crop_size = crop_size
-        self.im_padding_value = im_padding_value
-        self.label_padding_value = label_padding_value
-
-    def apply_im(self, image, pad_h, pad_w):
-        im_h, im_w, im_c = image.shape
-        orig_im = copy.deepcopy(image)
-        image = np.zeros(
-            (im_h + pad_h, im_w + pad_w, im_c)).astype(orig_im.dtype)
-        for i in range(im_c):
-            image[:, :, i] = np.pad(orig_im[:, :, i],
-                                    pad_width=((0, pad_h), (0, pad_w)),
-                                    mode='constant',
-                                    constant_values=(self.im_padding_value[i],
-                                                     self.im_padding_value[i]))
-        return image
-
-    def apply_mask(self, mask, pad_h, pad_w):
-        mask = np.pad(mask,
-                      pad_width=((0, pad_h), (0, pad_w)),
-                      mode='constant',
-                      constant_values=(self.label_padding_value,
-                                       self.label_padding_value))
-        return mask
-
-    def apply(self, sample):
-        """
-        Args:
-            im (np.ndarray): 图像np.ndarray数据。
-            im_info (list): 存储图像reisze或padding前的shape信息,如
-                [('resize', [200, 300]), ('padding', [400, 600])]表示
-                图像在过resize前shape为(200, 300), 过padding前shape为
-                (400, 600)
-            label (np.ndarray): 标注图像np.ndarray数据。
-         Returns:
-            tuple: 当label为空时,返回的tuple为(im, im_info),分别对应图像np.ndarray数据、存储与图像相关信息的字典;
-                当label不为空时,返回的tuple为(im, im_info, label),分别对应图像np.ndarray数据、
-                存储与图像相关信息的字典和标注图像np.ndarray数据。
-        """
-        if isinstance(self.crop_size, int):
-            crop_width = self.crop_size
-            crop_height = self.crop_size
-        else:
-            crop_width = self.crop_size[0]
-            crop_height = self.crop_size[1]
-
-        im_h, im_w, im_c = sample['image'].shape
-
-        if im_h == crop_height and im_w == crop_width:
-            return sample
-        else:
-            pad_height = max(crop_height - im_h, 0)
-            pad_width = max(crop_width - im_w, 0)
-            if pad_height > 0 or pad_width > 0:
-                sample['image'] = self.apply_im(sample['image'], pad_height,
-                                                pad_width)
-
-                if 'mask' in sample:
-                    sample['mask'] = self.apply_mask(sample['mask'],
-                                                     pad_height, pad_width)
-
-                im_h = sample['image'].shape[0]
-                im_w = sample['image'].shape[1]
-
-            if crop_height > 0 and crop_width > 0:
-                h_off = np.random.randint(im_h - crop_height + 1)
-                w_off = np.random.randint(im_w - crop_width + 1)
-
-                sample['image'] = sample['image'][h_off:(
-                    crop_height + h_off), w_off:(w_off + crop_width), :]
-                if 'mask' in sample:
-                    sample['mask'] = sample['mask'][h_off:(
-                        crop_height + h_off), w_off:(w_off + crop_width)]
-        return sample
-
-
-class RandomBlur(Transform):
-    """以一定的概率对图像进行高斯模糊。
-    Args:
-        prob (float): 图像模糊概率。默认为0.1。
-    """
-
-    def __init__(self, prob=0.1):
-        super(RandomBlur, self).__init__()
-        self.prob = prob
-
-    def apply_im(self, image, radius):
-        image = cv2.GaussianBlur(image, (radius, radius), 0, 0)
-        return image
-
-    def apply(self, sample):
-        if self.prob <= 0:
-            n = 0
-        elif self.prob >= 1:
-            n = 1
-        else:
-            n = int(1.0 / self.prob)
-        if n > 0:
-            if np.random.randint(0, n) == 0:
-                radius = np.random.randint(3, 10)
-                if radius % 2 != 1:
-                    radius = radius + 1
-                if radius > 9:
-                    radius = 9
-                sample['image'] = self.apply_im(sample['image'], radius)
-
-        return sample
-
-
-class RandomRotate(Transform):
-    """对图像进行随机旋转, 模型训练时的数据增强操作。
-    在旋转区间[-rotate_range, rotate_range]内,对图像进行随机旋转,当存在标注图像时,同步进行,
-    并对旋转后的图像和标注图像进行相应的padding。
-    Args:
-        rotate_range (float): 最大旋转角度。默认为15度。
-        im_padding_value (list): 图像padding的值。默认为[127.5, 127.5, 127.5]。
-        label_padding_value (int): 标注图像padding的值。默认为255。
-    """
-
-    def __init__(self,
-                 rotate_range=15,
-                 im_padding_value=[127.5, 127.5, 127.5],
-                 label_padding_value=255):
-        super(RandomRotate, self).__init__()
-        self.rotate_range = rotate_range
-        self.im_padding_value = im_padding_value
-        self.label_padding_value = label_padding_value
-
-    def apply(self, sample):
-        if self.rotate_range > 0:
-            h, w, c = sample['image'].shape
-            do_rotation = np.random.uniform(-self.rotate_range,
-                                            self.rotate_range)
-            pc = (w // 2, h // 2)
-            r = cv2.getRotationMatrix2D(pc, do_rotation, 1.0)
-            cos = np.abs(r[0, 0])
-            sin = np.abs(r[0, 1])
-
-            nw = int((h * sin) + (w * cos))
-            nh = int((h * cos) + (w * sin))
-
-            (cx, cy) = pc
-            r[0, 2] += (nw / 2) - cx
-            r[1, 2] += (nh / 2) - cy
-            dsize = (nw, nh)
-            rot_ims = list()
-            for i in range(0, c, 3):
-                ori_im = sample['image'][:, :, i:i + 3]
-                rot_im = cv2.warpAffine(
-                    ori_im,
-                    r,
-                    dsize=dsize,
-                    flags=cv2.INTER_LINEAR,
-                    borderMode=cv2.BORDER_CONSTANT,
-                    borderValue=self.im_padding_value[i:i + 3])
-                rot_ims.append(rot_im)
-            sample['image'] = np.concatenate(rot_ims, axis=-1)
-            if 'mask' in sample:
-                sample['mask'] = cv2.warpAffine(
-                    sample['mask'],
-                    r,
-                    dsize=dsize,
-                    flags=cv2.INTER_NEAREST,
-                    borderMode=cv2.BORDER_CONSTANT,
-                    borderValue=self.label_padding_value)
-
-        return sample
-
-
-class RandomScaleAspect(Transform):
-    """裁剪并resize回原始尺寸的图像和标注图像。
-    按照一定的面积比和宽高比对图像进行裁剪,并reszie回原始图像的图像,当存在标注图时,同步进行。
-    Args:
-        min_scale (float):裁取图像占原始图像的面积比,取值[0,1],为0时则返回原图。默认为0.5。
-        aspect_ratio (float): 裁取图像的宽高比范围,非负值,为0时返回原图。默认为0.33。
-    """
-
-    def __init__(self, min_scale=0.5, aspect_ratio=0.33):
-        super(RandomScaleAspect, self).__init__()
-        self.min_scale = min_scale
-        self.aspect_ratio = aspect_ratio
-
-    def apply(self, sample):
-        if self.min_scale != 0 and self.aspect_ratio != 0:
-            img_height = sample['image'].shape[0]
-            img_width = sample['image'].shape[1]
-            for i in range(0, 10):
-                area = img_height * img_width
-                target_area = area * np.random.uniform(self.min_scale, 1.0)
-                aspectRatio = np.random.uniform(self.aspect_ratio,
-                                                1.0 / self.aspect_ratio)
-
-                dw = int(np.sqrt(target_area * 1.0 * aspectRatio))
-                dh = int(np.sqrt(target_area * 1.0 / aspectRatio))
-                if (np.random.randint(10) < 5):
-                    tmp = dw
-                    dw = dh
-                    dh = tmp
-
-                if (dh < img_height and dw < img_width):
-                    h1 = np.random.randint(0, img_height - dh)
-                    w1 = np.random.randint(0, img_width - dw)
-
-                    sample['image'] = sample['image'][h1:(h1 + dh), w1:(w1 + dw
-                                                                        ), :]
-                    sample['image'] = cv2.resize(
-                        sample['image'], (img_width, img_height),
-                        interpolation=cv2.INTER_LINEAR)
-                    if sample['image'].ndim < 3:
-                        sample['image'] = np.expand_dims(
-                            sample['image'], axis=-1)
-
-                    if 'mask' in sample:
-                        sample['mask'] = sample['mask'][h1:(h1 + dh), w1:(w1 +
-                                                                          dw)]
-                        sample['mask'] = cv2.resize(
-                            sample['mask'], (img_width, img_height),
-                            interpolation=cv2.INTER_NEAREST)
-                    break
-        return sample
-
-
-class Clip(Transform):
-    """
-    对图像上超出一定范围的数据进行截断。
-    Args:
-        min_val (list): 裁剪的下限,小于min_val的数值均设为min_val. 默认值0.
-        max_val (list): 裁剪的上限,大于max_val的数值均设为max_val. 默认值255.0.
-    """
-
-    def __init__(self, min_val=[0, 0, 0], max_val=[255.0, 255.0, 255.0]):
-        if not (isinstance(min_val, list) and isinstance(max_val, list)):
-            raise ValueError("{}: input type is invalid.".format(self))
-        super(Clip, self).__init__()
-        self.min_val = min_val
-        self.max_val = max_val
-
-    def apply_im(self, image):
-        for k in range(image.shape[2]):
-            np.clip(
-                image[:, :, k],
-                self.min_val[k],
-                self.max_val[k],
-                out=image[:, :, k])
-        return image
-
-    def apply(self, sample):
-        sample['image'] = self.apply_im(sample['image'])
-        return sample
-
-
-class ComposedSegTransforms(Compose):
-    """ 语义分割模型(UNet/DeepLabv3p)的图像处理流程,具体如下
-        训练阶段:
-        1. 随机对图像以0.5的概率水平翻转,若random_horizontal_flip为False,则跳过此步骤
-        2. 按不同的比例随机Resize原图, 处理方式参考[paddlex.seg.transforms.ResizeRangeScaling](#resizerangescaling)。若min_max_size为None,则跳过此步骤
-        3. 从原图中随机crop出大小为train_crop_size大小的子图,如若crop出来的图小于train_crop_size,则会将图padding到对应大小
-        4. 图像归一化
-       预测阶段:
-        1. 将图像的最长边resize至(min_max_size[0] + min_max_size[1])//2, 短边按比例resize。若min_max_size为None,则跳过此步骤
-        2. 图像归一化
-        Args:
-            mode(str): Transforms所处的阶段,包括`train', 'eval'或'test'
-            min_max_size(list): 用于对图像进行resize,具体作用参见上述步骤。
-            train_crop_size(list): 训练过程中随机裁剪原图用于训练,具体作用参见上述步骤。此参数仅在mode为`train`时生效。
-            mean(list): 图像均值, 默认为[0.485, 0.456, 0.406]。
-            std(list): 图像方差,默认为[0.229, 0.224, 0.225]。
-            random_horizontal_flip(bool): 数据增强,是否随机水平翻转图像,此参数仅在mode为`train`时生效。
-    """
-
-    def __init__(self,
-                 mode,
-                 min_max_size=[400, 600],
-                 train_crop_size=[512, 512],
-                 mean=[0.5, 0.5, 0.5],
-                 std=[0.5, 0.5, 0.5],
-                 random_horizontal_flip=True):
-        if mode == 'train':
-            # Transforms used at training time, including data augmentation
-            if min_max_size is None:
-                transforms = [
-                    RandomPaddingCrop(crop_size=train_crop_size), Normalize(
-                        mean=mean, std=std)
-                ]
-            else:
-                transforms = [
-                    ResizeRangeScaling(
-                        min_value=min(min_max_size),
-                        max_value=max(min_max_size)),
-                    RandomPaddingCrop(crop_size=train_crop_size), Normalize(
-                        mean=mean, std=std)
-                ]
-            if random_horizontal_flip:
-                transforms.insert(0, RandomHorizontalFlip())
-        else:
-            # Transforms used at evaluation/prediction time
-            if min_max_size is None:
-                transforms = [Normalize(mean=mean, std=std)]
-            else:
-                long_size = (min(min_max_size) + max(min_max_size)) // 2
-                transforms = [
-                    ResizeByLong(long_size=long_size), Normalize(
-                        mean=mean, std=std)
-                ]
-        super(ComposedSegTransforms, self).__init__(transforms)
-
-
-def _resize_long(im, long_size=224, interpolation=cv2.INTER_LINEAR):
-    value = max(im.shape[0], im.shape[1])
-    scale = float(long_size) / float(value)
-    resized_width = int(round(im.shape[1] * scale))
-    resized_height = int(round(im.shape[0] * scale))
-
-    im_dims = im.ndim
-    im = cv2.resize(
-        im, (resized_width, resized_height), interpolation=interpolation)
-    if im_dims >= 3 and im.ndim < 3:
-        im = np.expand_dims(im, axis=-1)
-    return im
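
The deleted ComposedSegTransforms bundled the whole preprocessing pipeline into one class. A minimal sketch of the equivalent explicit pipeline under the old defaults (min_max_size=[400, 600], train_crop_size=[512, 512]); the operator names come from the deleted module, and whether paddlex.transforms re-exports them unchanged in 2.0 is an assumption:

import paddlex.transforms as T

# mode='train': flip, random rescale, pad-and-crop, normalize
train_transforms = T.Compose([
    T.RandomHorizontalFlip(),
    T.ResizeRangeScaling(min_value=400, max_value=600),
    T.RandomPaddingCrop(crop_size=[512, 512]),
    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

# mode='eval'/'test': resize the long side to (400 + 600) // 2, then normalize
eval_transforms = T.Compose([
    T.ResizeByLong(long_size=500),
    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])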

+ 19 - 168
dygraph/paddlex/det.py

@@ -11,182 +11,33 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
 
+import sys
 from . import cv
 from .cv.models.utils.visualize import visualize_detection, draw_pr_curve
-from paddlex.cv.transforms import det_transforms
 
-transforms = det_transforms
+message = 'Your script requires PaddleX<2.0.0; please refer to {} to resolve this issue.'.format(
+    'https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#%E7%89%88%E6%9C%AC%E5%8D%87%E7%BA%A7'
+)
 
-visualize = visualize_detection
-draw_pr_curve = draw_pr_curve
 
+def __getattr__(attr):
+    if attr == 'transforms':
 
-class FasterRCNN(cv.models.FasterRCNN):
-    def __init__(self,
-                 num_classes=81,
-                 backbone='ResNet50',
-                 with_fpn=True,
-                 aspect_ratios=[0.5, 1.0, 2.0],
-                 anchor_sizes=[32, 64, 128, 256, 512],
-                 with_dcn=None,
-                 rpn_cls_loss=None,
-                 rpn_focal_loss_alpha=None,
-                 rpn_focal_loss_gamma=None,
-                 rcnn_bbox_loss=None,
-                 rcnn_nms=None,
-                 keep_top_k=100,
-                 nms_threshold=0.5,
-                 score_threshold=0.05,
-                 softnms_sigma=None,
-                 bbox_assigner=None,
-                 fpn_num_channels=256,
-                 input_channel=None,
-                 rpn_batch_size_per_im=256,
-                 rpn_fg_fraction=0.5,
-                 test_pre_nms_top_n=None,
-                 test_post_nms_top_n=1000):
-        if with_dcn is not None:
-            logging.warning(
-                "`with_dcn` is deprecated in PaddleX 2.0 and won't take effect. Defaults to False."
-            )
-        if rpn_cls_loss is not None:
-            logging.warning(
-                "`rpn_cls_loss` is deprecated in PaddleX 2.0 and won't take effect. "
-                "Defaults to 'SigmoidCrossEntropy'.")
-        if rpn_focal_loss_alpha is not None or rpn_focal_loss_gamma is not None:
-            logging.warning(
-                "Focal loss is deprecated in PaddleX 2.0."
-                " `rpn_focal_loss_alpha` and `rpn_focal_loss_gamma` won't take effect."
-            )
-        if rcnn_bbox_loss is not None:
-            logging.warning(
-                "`rcnn_bbox_loss` is deprecated in PaddleX 2.0 and won't take effect. "
-                "Defaults to 'SmoothL1Loss'")
-        if rcnn_nms is not None:
-            logging.warning(
-                "MultiClassSoftNMS is deprecated in PaddleX 2.0. "
-                "`rcnn_nms` and `softnms_sigma` won't take effect. MultiClassNMS will be used by default"
-            )
-        if bbox_assigner is not None:
-            logging.warning(
-                "`bbox_assigner` is deprecated in PaddleX 2.0 and won't take effect. "
-                "Defaults to 'BBoxAssigner'")
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(FasterRCNN, self).__init__(
-            num_classes=num_classes - 1,
-            backbone=backbone,
-            with_fpn=with_fpn,
-            aspect_ratios=aspect_ratios,
-            anchor_sizes=anchor_sizes,
-            keep_top_k=keep_top_k,
-            nms_threshold=nms_threshold,
-            score_threshold=score_threshold,
-            fpn_num_channels=fpn_num_channels,
-            rpn_batch_size_per_im=rpn_batch_size_per_im,
-            rpn_fg_fraction=rpn_fg_fraction,
-            test_pre_nms_top_n=test_pre_nms_top_n,
-            test_post_nms_top_n=test_post_nms_top_n)
+        print("\033[1;31;40m{}\033[0m".format(message).encode("utf-8")
+              .decode("latin1"))
+        sys.exit(-1)
 
 
-class YOLOv3(cv.models.YOLOv3):
-    def __init__(self,
-                 num_classes=80,
-                 backbone='MobileNetV1',
-                 anchors=None,
-                 anchor_masks=None,
-                 ignore_threshold=0.7,
-                 nms_score_threshold=0.01,
-                 nms_topk=1000,
-                 nms_keep_topk=100,
-                 nms_iou_threshold=0.45,
-                 label_smooth=False,
-                 train_random_shapes=None,
-                 input_channel=None):
-        if train_random_shapes is not None:
-            logging.warning(
-                "`train_random_shapes` is deprecated in PaddleX 2.0 and won't take effect. "
-                "To apply multi_scale training, please refer to paddlex.transforms.BatchRandomResize: "
-                "'https://github.com/PaddlePaddle/PaddleX/blob/develop/dygraph/paddlex/cv/transforms/batch_operators.py#L53'"
-            )
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(YOLOv3, self).__init__(
-            num_classes=num_classes,
-            backbone=backbone,
-            anchors=anchors,
-            anchor_masks=anchor_masks,
-            ignore_threshold=ignore_threshold,
-            nms_score_threshold=nms_score_threshold,
-            nms_topk=nms_topk,
-            nms_keep_topk=nms_keep_topk,
-            nms_iou_threshold=nms_iou_threshold,
-            label_smooth=label_smooth)
+visualize = visualize_detection
+draw_pr_curve = draw_pr_curve
 
+# object detection
+YOLOv3 = cv.models.YOLOv3
+FasterRCNN = cv.models.FasterRCNN
+PPYOLO = cv.models.PPYOLO
+PPYOLOTiny = cv.models.PPYOLOTiny
+PPYOLOv2 = cv.models.PPYOLOv2
 
-class PPYOLO(cv.models.PPYOLO):
-    def __init__(
-            self,
-            num_classes=80,
-            backbone='ResNet50_vd_ssld',
-            with_dcn_v2=None,
-            # YOLO Head
-            anchors=None,
-            anchor_masks=None,
-            use_coord_conv=True,
-            use_iou_aware=True,
-            use_spp=True,
-            use_drop_block=True,
-            scale_x_y=1.05,
-            # PPYOLO Loss
-            ignore_threshold=0.7,
-            label_smooth=False,
-            use_iou_loss=True,
-            # NMS
-            use_matrix_nms=True,
-            nms_score_threshold=0.01,
-            nms_topk=1000,
-            nms_keep_topk=100,
-            nms_iou_threshold=0.45,
-            train_random_shapes=None,
-            input_channel=None):
-        if with_dcn_v2 is not None:
-            logging.warning(
-                "`with_dcn_v2` is deprecated in PaddleX 2.0 and will not take effect. "
-                "To use backbone with deformable convolutional networks, "
-                "please specify in `backbone_name`. "
-                "Currently the only backbone with dcn is 'ResNet50_vd_dcn'.")
-        if train_random_shapes is not None:
-            logging.warning(
-                "`train_random_shapes` is deprecated in PaddleX 2.0 and won't take effect. "
-                "To apply multi_scale training, please refer to paddlex.transforms.BatchRandomResize: "
-                "'https://github.com/PaddlePaddle/PaddleX/blob/develop/dygraph/paddlex/cv/transforms/batch_operators.py#L53'"
-            )
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-        super(PPYOLO, self).__init__(
-            num_classes=num_classes,
-            backbone=backbone,
-            anchors=anchors,
-            anchor_masks=anchor_masks,
-            use_coord_conv=use_coord_conv,
-            use_iou_aware=use_iou_aware,
-            use_spp=use_spp,
-            use_drop_block=use_drop_block,
-            scale_x_y=scale_x_y,
-            ignore_threshold=ignore_threshold,
-            label_smooth=label_smooth,
-            use_iou_loss=use_iou_loss,
-            use_matrix_nms=use_matrix_nms,
-            nms_score_threshold=nms_score_threshold,
-            nms_topk=nms_topk,
-            nms_keep_topk=nms_keep_topk,
-            nms_iou_threshold=nms_iou_threshold)
+# instance segmentation
+MaskRCNN = cv.models.MaskRCNN
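
The rewritten det.py relies on module-level __getattr__ (PEP 562, Python 3.7+): since transforms is no longer a real attribute of the module, the hook intercepts the lookup and aborts with the migration message, while YOLOv3, FasterRCNN and the other aliases resolve normally. A stripped-down sketch of the mechanism, using a hypothetical module:

# legacy_api.py -- hypothetical module illustrating PEP 562, not PaddleX code
import sys

_REMOVED_ATTRS = {'transforms'}

def __getattr__(attr):
    # Called only when normal module attribute lookup fails.
    if attr in _REMOVED_ATTRS:
        print("'{}' was removed; see the migration guide.".format(attr))
        sys.exit(-1)
    raise AttributeError('module {!r} has no attribute {!r}'.format(__name__, attr))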

+ 0 - 78
dygraph/paddlex/models.py

@@ -1,78 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from . import cv
-
-# image classification
-ResNet18 = cv.models.ResNet18
-ResNet34 = cv.models.ResNet34
-ResNet50 = cv.models.ResNet50
-ResNet101 = cv.models.ResNet101
-ResNet152 = cv.models.ResNet152
-
-ResNet18_vd = cv.models.ResNet18_vd
-ResNet34_vd = cv.models.ResNet34_vd
-ResNet50_vd = cv.models.ResNet50_vd
-ResNet50_vd_ssld = cv.models.ResNet50_vd_ssld
-ResNet101_vd = cv.models.ResNet101_vd
-ResNet101_vd_ssld = cv.models.ResNet101_vd_ssld
-ResNet152_vd = cv.models.ResNet152_vd
-ResNet200_vd = cv.models.ResNet200_vd
-
-MobileNetV1 = cv.models.MobileNetV1
-MobileNetV2 = cv.models.MobileNetV2
-MobileNetV3_small = cv.models.MobileNetV3_small
-MobileNetV3_large = cv.models.MobileNetV3_large
-
-AlexNet = cv.models.AlexNet
-
-DarkNet53 = cv.models.DarkNet53
-
-DenseNet121 = cv.models.DenseNet121
-DenseNet161 = cv.models.DenseNet161
-DenseNet169 = cv.models.DenseNet169
-DenseNet201 = cv.models.DenseNet201
-DenseNet264 = cv.models.DenseNet264
-
-HRNet_W18_C = cv.models.HRNet_W18_C
-HRNet_W30_C = cv.models.HRNet_W30_C
-HRNet_W32_C = cv.models.HRNet_W32_C
-HRNet_W40_C = cv.models.HRNet_W40_C
-HRNet_W44_C = cv.models.HRNet_W44_C
-HRNet_W48_C = cv.models.HRNet_W48_C
-HRNet_W64_C = cv.models.HRNet_W64_C
-
-Xception41 = cv.models.Xception41
-Xception65 = cv.models.Xception65
-Xception71 = cv.models.Xception71
-
-ShuffleNetV2 = cv.models.ShuffleNetV2
-ShuffleNetV2_swish = cv.models.ShuffleNetV2_swish
-
-# object detection
-YOLOv3 = cv.models.YOLOv3
-FasterRCNN = cv.models.FasterRCNN
-PPYOLO = cv.models.PPYOLO
-PPYOLOTiny = cv.models.PPYOLOTiny
-PPYOLOv2 = cv.models.PPYOLOv2
-
-# instance segmentation
-MaskRCNN = cv.models.MaskRCNN
-
-# semantic segmentation
-UNet = cv.models.UNet
-DeepLabV3P = cv.models.DeepLabV3P
-FastSCNN = cv.models.FastSCNN
-HRNet = cv.models.HRNet
-BiSeNetV2 = cv.models.BiSeNetV2

+ 15 - 208
dygraph/paddlex/seg.py

@@ -11,221 +11,28 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
 
+import sys
 from . import cv
 from .cv.models.utils.visualize import visualize_segmentation
-from paddlex.cv.transforms import seg_transforms
 
-transforms = seg_transforms
+message = 'Your script requires PaddleX<2.0.0; please refer to {} to resolve this issue.'.format(
+    'https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#%E7%89%88%E6%9C%AC%E5%8D%87%E7%BA%A7'
+)
 
-visualize = visualize_segmentation
-
-
-class UNet(cv.models.UNet):
-    def __init__(self,
-                 num_classes=2,
-                 upsample_mode='bilinear',
-                 use_bce_loss=False,
-                 use_dice_loss=False,
-                 class_weight=None,
-                 ignore_index=None,
-                 input_channel=None):
-        if num_classes > 2 and (use_bce_loss or use_dice_loss):
-            raise ValueError(
-                "dice loss and bce loss is only applicable to binary classification"
-            )
-        elif num_classes == 2:
-            if use_bce_loss and use_dice_loss:
-                use_mixed_loss = [('CrossEntropyLoss', 1), ('DiceLoss', 1)]
-            elif use_bce_loss:
-                use_mixed_loss = [('CrossEntropyLoss', 1)]
-            elif use_dice_loss:
-                use_mixed_loss = [('DiceLoss', 1)]
-            else:
-                use_mixed_loss = False
-        else:
-            use_mixed_loss = False
-
-        if class_weight is not None:
-            logging.warning(
-                "`class_weight` is not supported in PaddleX 2.0 currently and is forcibly set to None."
-            )
-        if ignore_index is not None:
-            logging.warning(
-                "`ignore_index` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 255."
-            )
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-
-        if upsample_mode == 'bilinear':
-            use_deconv = False
-        else:
-            use_deconv = True
-        super(UNet, self).__init__(
-            num_classes=num_classes,
-            use_mixed_loss=use_mixed_loss,
-            use_deconv=use_deconv)
-
-
-class DeepLabv3p(cv.models.DeepLabV3P):
-    def __init__(self,
-                 num_classes=2,
-                 backbone='ResNet50_vd',
-                 output_stride=8,
-                 aspp_with_sep_conv=None,
-                 decoder_use_sep_conv=None,
-                 encoder_with_aspp=None,
-                 enable_decoder=None,
-                 use_bce_loss=False,
-                 use_dice_loss=False,
-                 class_weight=None,
-                 ignore_index=None,
-                 pooling_crop_size=None,
-                 input_channel=None):
-        if num_classes > 2 and (use_bce_loss or use_dice_loss):
-            raise ValueError(
-                "dice loss and bce loss is only applicable to binary classification"
-            )
-        elif num_classes == 2:
-            if use_bce_loss and use_dice_loss:
-                use_mixed_loss = [('CrossEntropyLoss', 1), ('DiceLoss', 1)]
-            elif use_bce_loss:
-                use_mixed_loss = [('CrossEntropyLoss', 1)]
-            elif use_dice_loss:
-                use_mixed_loss = [('DiceLoss', 1)]
-            else:
-                use_mixed_loss = False
-        else:
-            use_mixed_loss = False
 
-        if aspp_with_sep_conv is not None:
-            logging.warning(
-                "`aspp_with_sep_conv` is deprecated in PaddleX 2.0 and will not take effect. "
-                "Defaults to True")
-        if decoder_use_sep_conv is not None:
-            logging.warning(
-                "`decoder_use_sep_conv` is deprecated in PaddleX 2.0 and will not take effect. "
-                "Defaults to True")
-        if encoder_with_aspp is not None:
-            logging.warning(
-                "`encoder_with_aspp` is deprecated in PaddleX 2.0 and will not take effect. "
-                "Defaults to True")
-        if enable_decoder is not None:
-            logging.warning(
-                "`enable_decoder` is deprecated in PaddleX 2.0 and will not take effect. "
-                "Defaults to True")
-        if class_weight is not None:
-            logging.warning(
-                "`class_weight` is not supported in PaddleX 2.0 currently and is forcibly set to None."
-            )
-        if ignore_index is not None:
-            logging.warning(
-                "`ignore_index` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 255."
-            )
-        if pooling_crop_size is not None:
-            logging.warning(
-                "Backbone 'MobileNetV3_large_x1_0_ssld' is currently not supported in PaddleX 2.0. "
-                "`pooling_crop_size` will not take effect. Defaults to None")
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
+def __getattr__(attr):
+    if attr == 'transforms':
 
-        super(DeepLabv3p, self).__init__(
-            num_classes=num_classes,
-            backbone=backbone,
-            use_mixed_loss=use_mixed_loss,
-            output_stride=output_stride)
+        print("\033[1;31;40m{}\033[0m".format(message).encode("utf-8")
+              .decode("latin1"))
+        sys.exit(-1)
 
 
-class HRNet(cv.models.HRNet):
-    def __init__(self,
-                 num_classes=2,
-                 width=18,
-                 use_bce_loss=False,
-                 use_dice_loss=False,
-                 class_weight=None,
-                 ignore_index=None,
-                 input_channel=None):
-        if num_classes > 2 and (use_bce_loss or use_dice_loss):
-            raise ValueError(
-                "dice loss and bce loss is only applicable to binary classification"
-            )
-        elif num_classes == 2:
-            if use_bce_loss and use_dice_loss:
-                use_mixed_loss = [('CrossEntropyLoss', 1), ('DiceLoss', 1)]
-            elif use_bce_loss:
-                use_mixed_loss = [('CrossEntropyLoss', 1)]
-            elif use_dice_loss:
-                use_mixed_loss = [('DiceLoss', 1)]
-            else:
-                use_mixed_loss = False
-        else:
-            use_mixed_loss = False
-
-        if class_weight is not None:
-            logging.warning(
-                "`class_weight` is not supported in PaddleX 2.0 currently and is forcibly set to None."
-            )
-        if ignore_index is not None:
-            logging.warning(
-                "`ignore_index` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 255."
-            )
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
-
-        super(HRNet, self).__init__(
-            num_classes=num_classes,
-            width=width,
-            use_mixed_loss=use_mixed_loss)
-
-
-class FastSCNN(cv.models.FastSCNN):
-    def __init__(self,
-                 num_classes=2,
-                 use_bce_loss=False,
-                 use_dice_loss=False,
-                 class_weight=None,
-                 ignore_index=255,
-                 multi_loss_weight=None,
-                 input_channel=3):
-        if num_classes > 2 and (use_bce_loss or use_dice_loss):
-            raise ValueError(
-                "dice loss and bce loss is only applicable to binary classification"
-            )
-        elif num_classes == 2:
-            if use_bce_loss and use_dice_loss:
-                use_mixed_loss = [('CrossEntropyLoss', 1), ('DiceLoss', 1)]
-            elif use_bce_loss:
-                use_mixed_loss = [('CrossEntropyLoss', 1)]
-            elif use_dice_loss:
-                use_mixed_loss = [('DiceLoss', 1)]
-            else:
-                use_mixed_loss = False
-        else:
-            use_mixed_loss = False
-
-        if class_weight is not None:
-            logging.warning(
-                "`class_weight` is not supported in PaddleX 2.0 currently and is forcibly set to None."
-            )
-        if ignore_index is not None:
-            logging.warning(
-                "`ignore_index` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 255."
-            )
-        if multi_loss_weight is not None:
-            logging.warning(
-                "`multi_loss_weight` is deprecated in PaddleX 2.0 and will not take effect. "
-                "Defaults to [1.0, 0.4]")
-        if input_channel is not None:
-            logging.warning(
-                "`input_channel` is deprecated in PaddleX 2.0 and won't take effect. Defaults to 3."
-            )
+visualize = visualize_segmentation
 
-        super(FastSCNN, self).__init__(
-            num_classes=num_classes, use_mixed_loss=use_mixed_loss)
+UNet = cv.models.UNet
+DeepLabV3P = cv.models.DeepLabV3P
+FastSCNN = cv.models.FastSCNN
+HRNet = cv.models.HRNet
+BiSeNetV2 = cv.models.BiSeNetV2
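
Note the casing change: the removed 1.x wrapper was DeepLabv3p, while the 2.0 alias is DeepLabV3P. The deleted wrappers also document how the old boolean loss flags translate: use_bce_loss/use_dice_loss become a use_mixed_loss list of (loss, weight) pairs. A hedged migration sketch built from that mapping (the 2.0 keyword arguments are assumed from the super().__init__ calls above):

import paddlex as pdx

# 1.x (removed wrapper):
# model = pdx.seg.DeepLabv3p(num_classes=2, use_bce_loss=True)

# 2.0 equivalent, per the wrapper's own flag translation:
model = pdx.seg.DeepLabV3P(
    num_classes=2,
    backbone='ResNet50_vd',
    use_mixed_loss=[('CrossEntropyLoss', 1)])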

+ 6 - 2
dygraph/paddlex/utils/checkpoint.py

@@ -20,7 +20,7 @@ from .download import download_and_decompress
 
 seg_pretrain_weights_dict = {
     'UNet': ['CITYSCAPES'],
-    'DeepLabV3P': ['CITYSCAPES', 'PascalVOC'],
+    'DeepLabV3P': ['CITYSCAPES', 'PascalVOC', 'IMAGENET'],
     'FastSCNN': ['CITYSCAPES'],
     'HRNet': ['CITYSCAPES', 'PascalVOC'],
     'BiSeNetV2': ['CITYSCAPES']
@@ -254,7 +254,11 @@ imagenet_weights = {
     'MaskRCNN_ResNet101_fpn_IMAGENET':
     'https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_pretrained.pdparams',
     'MaskRCNN_ResNet101_vd_fpn_IMAGENET':
-    'https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams'
+    'https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams',
+    'DeepLabV3P_ResNet50_vd_IMAGENET':
+    'https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld.tar.gz',
+    'DeepLabV3P_ResNet101_vd_IMAGENET':
+    'https://bj.bcebos.com/paddleseg/dygraph/resnet101_vd_ssld.tar.gz'
 }
 
 pascalvoc_weights = {
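
The two new URL entries register ImageNet-pretrained backbones for DeepLabV3P, and the seg_pretrain_weights_dict change whitelists 'IMAGENET' as a valid pretrain_weights choice for that model. A self-contained sketch of how such a whitelist is typically consulted (hypothetical helper, not the actual checkpoint.py logic):

seg_pretrain_weights_dict = {
    'UNet': ['CITYSCAPES'],
    'DeepLabV3P': ['CITYSCAPES', 'PascalVOC', 'IMAGENET'],
    'FastSCNN': ['CITYSCAPES'],
    'HRNet': ['CITYSCAPES', 'PascalVOC'],
    'BiSeNetV2': ['CITYSCAPES'],
}

def check_pretrain_weights(model_name, weights):
    allowed = seg_pretrain_weights_dict.get(model_name, [])
    if weights not in allowed:
        raise ValueError('{!r} is not a valid pretrain_weights choice for {};'
                         ' pick one of {}'.format(weights, model_name, allowed))

check_pretrain_weights('DeepLabV3P', 'IMAGENET')  # passes after this change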

+ 1 - 1
dygraph/tutorials/slim/prune/image_classification/mobilenetv2_train.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.MobileNetV2(num_classes=num_classes)
+model = pdx.cls.MobileNetV2(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/slim/prune/object_detection/yolov3_train.py

@@ -41,7 +41,7 @@ eval_dataset = pdx.datasets.VOCDetection(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
+model = pdx.det.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L154
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/slim/prune/semantic_segmentation/unet_train.py

@@ -39,7 +39,7 @@ eval_dataset = pdx.datasets.SegDataset(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.UNet(num_classes=num_classes)
+model = pdx.seg.UNet(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/segmenter.py#L150
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/slim/quantize/image_classification/mobilenetv2_train.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.MobileNetV3_large(num_classes=num_classes)
+model = pdx.cls.MobileNetV3_large(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/slim/quantize/object_detection/yolov3_train.py

@@ -41,7 +41,7 @@ eval_dataset = pdx.datasets.VOCDetection(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
+model = pdx.det.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L154
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/slim/quantize/semantic_segmentation/unet_train.py

@@ -39,7 +39,7 @@ eval_dataset = pdx.datasets.SegDataset(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.UNet(num_classes=num_classes)
+model = pdx.seg.UNet(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/segmenter.py#L150
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/image_classification/alexnet.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.AlexNet(num_classes=num_classes)
+model = pdx.cls.AlexNet(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/image_classification/darknet53.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.DarkNet53(num_classes=num_classes)
+model = pdx.cls.DarkNet53(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/image_classification/densenet121.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.DenseNet121(num_classes=num_classes)
+model = pdx.cls.DenseNet121(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/image_classification/hrnet_w18_c.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.HRNet_W18_C(num_classes=num_classes)
+model = pdx.cls.HRNet_W18_C(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/image_classification/mobilenetv3_large_w_custom_optimizer.py

@@ -33,7 +33,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.MobileNetV3_large(num_classes=num_classes)
+model = pdx.cls.MobileNetV3_large(num_classes=num_classes)
 
 # Custom optimizer: use CosineAnnealingDecay
 train_batch_size = 64
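
The hunk above cuts off right where the custom optimizer is built. A hedged sketch of the CosineAnnealingDecay pattern the comment announces; the dataset size and epoch count are invented, and exposing the wrapped network as model.net is an assumption taken from the PaddleX tutorials:

import paddle
import paddlex as pdx

model = pdx.cls.MobileNetV3_large(num_classes=4)

train_batch_size = 64
num_steps_each_epoch = 1000 // train_batch_size  # assumes ~1000 training images
num_epochs = 10

lr_schedule = paddle.optimizer.lr.CosineAnnealingDecay(
    learning_rate=0.025, T_max=num_steps_each_epoch * num_epochs)
custom_optimizer = paddle.optimizer.Momentum(
    learning_rate=lr_schedule,
    momentum=0.9,
    weight_decay=paddle.regularizer.L2Decay(coeff=0.00002),
    parameters=model.net.parameters())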

+ 1 - 1
dygraph/tutorials/train/image_classification/mobilenetv3_small.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.MobileNetV3_small(num_classes=num_classes)
+model = pdx.cls.MobileNetV3_small(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/image_classification/resnet50_vd_ssld.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.ResNet50_vd_ssld(num_classes=num_classes)
+model = pdx.cls.ResNet50_vd_ssld(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/image_classification/shufflenetv2.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.ShuffleNetV2(num_classes=num_classes)
+model = pdx.cls.ShuffleNetV2(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/image_classification/xception41.py

@@ -32,7 +32,7 @@ eval_dataset = pdx.datasets.ImageNet(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.Xception41(num_classes=num_classes)
+model = pdx.cls.Xception41(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/95c53dec89ab0f3769330fa445c6d9213986ca5f/paddlex/cv/models/classifier.py#L153
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/instance_segmentation/mask_rcnn_r50_fpn.py

@@ -36,7 +36,7 @@ eval_dataset = pdx.datasets.CocoDetection(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.MaskRCNN(
+model = pdx.det.MaskRCNN(
     num_classes=num_classes, backbone='ResNet50', with_fpn=True)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155

+ 1 - 1
dygraph/tutorials/train/object_detection/faster_rcnn_hrnet_w18.py

@@ -40,7 +40,7 @@ eval_dataset = pdx.datasets.VOCDetection(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.FasterRCNN(num_classes=num_classes, backbone='HRNet_W18')
+model = pdx.det.FasterRCNN(num_classes=num_classes, backbone='HRNet_W18')
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/object_detection/faster_rcnn_r50_fpn.py

@@ -40,7 +40,7 @@ eval_dataset = pdx.datasets.VOCDetection(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.FasterRCNN(
+model = pdx.det.FasterRCNN(
     num_classes=num_classes, backbone='ResNet50', with_fpn=True)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155

+ 1 - 1
dygraph/tutorials/train/object_detection/ppyolo.py

@@ -41,7 +41,7 @@ eval_dataset = pdx.datasets.VOCDetection(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.PPYOLO(num_classes=num_classes, backbone='ResNet50_vd_dcn')
+model = pdx.det.PPYOLO(num_classes=num_classes, backbone='ResNet50_vd_dcn')
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/object_detection/ppyolotiny.py

@@ -41,7 +41,7 @@ eval_dataset = pdx.datasets.VOCDetection(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.PPYOLOTiny(num_classes=num_classes)
+model = pdx.det.PPYOLOTiny(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 2
dygraph/tutorials/train/object_detection/ppyolov2.py

@@ -44,8 +44,7 @@ eval_dataset = pdx.datasets.VOCDetection(
 # Initialize the model and start training
 # Training metrics can be inspected with VisualDL; see https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.PPYOLOv2(
-    num_classes=num_classes, backbone='ResNet50_vd_dcn')
+model = pdx.det.PPYOLOv2(num_classes=num_classes, backbone='ResNet50_vd_dcn')
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/object_detection/yolov3_darknet53.py

@@ -41,7 +41,7 @@ eval_dataset = pdx.datasets.VOCDetection(
 # 初始化模型,并进行训练
 # 可使用VisualDL查看训练指标,参考https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
+model = pdx.det.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
 
 # API说明:https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/detector.py#L155
 # 各参数介绍与调整说明:https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/semantic_segmentation/bisenetv2.py

@@ -39,7 +39,7 @@ eval_dataset = pdx.datasets.SegDataset(
 # 初始化模型,并进行训练
 # 可使用VisualDL查看训练指标,参考https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.BiSeNetV2(num_classes=num_classes)
+model = pdx.seg.BiSeNetV2(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/segmenter.py#L150
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/semantic_segmentation/deeplabv3p_resnet50_vd.py

@@ -39,7 +39,7 @@ eval_dataset = pdx.datasets.SegDataset(
 # 初始化模型,并进行训练
 # 可使用VisualDL查看训练指标,参考https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.DeepLabV3P(num_classes=num_classes, backbone='ResNet50_vd')
+model = pdx.seg.DeepLabV3P(num_classes=num_classes, backbone='ResNet50_vd')
 
 # API说明:https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/segmenter.py#L150
 # 各参数介绍与调整说明:https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/semantic_segmentation/fastscnn.py

@@ -39,7 +39,7 @@ eval_dataset = pdx.datasets.SegDataset(
 # 初始化模型,并进行训练
 # 可使用VisualDL查看训练指标,参考https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.FastSCNN(num_classes=num_classes)
+model = pdx.seg.FastSCNN(num_classes=num_classes)
 
 # API说明:https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/segmenter.py#L150
 # 各参数介绍与调整说明:https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/semantic_segmentation/hrnet.py

@@ -39,7 +39,7 @@ eval_dataset = pdx.datasets.SegDataset(
 # 初始化模型,并进行训练
 # 可使用VisualDL查看训练指标,参考https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.HRNet(num_classes=num_classes, width=48)
+model = pdx.seg.HRNet(num_classes=num_classes, width=48)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/segmenter.py#L150
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html

+ 1 - 1
dygraph/tutorials/train/semantic_segmentation/unet.py

@@ -39,7 +39,7 @@ eval_dataset = pdx.datasets.SegDataset(
 # 初始化模型,并进行训练
 # 可使用VisualDL查看训练指标,参考https://github.com/PaddlePaddle/PaddleX/tree/release/2.0-rc/tutorials/train#visualdl可视化训练指标
 num_classes = len(train_dataset.labels)
-model = pdx.models.UNet(num_classes=num_classes)
+model = pdx.seg.UNet(num_classes=num_classes)
 
 # API reference: https://github.com/PaddlePaddle/PaddleX/blob/release/2.0-rc/paddlex/cv/models/segmenter.py#L150
 # Parameter descriptions and tuning guide: https://paddlex.readthedocs.io/zh_CN/develop/appendix/parameters.html