
Doc update0211 (#3289)

* update docs

* update model list

* update docs

* add face rec tutorial

* fix readme
AmberC0209 · 9 months ago
Parent commit: f323eada17
61 changed files with 3084 additions and 1312 deletions
  1. README.md (+155 -6)
  2. README_en.md (+153 -0)
  3. docs/module_usage/tutorials/cv_modules/3d_bev_detection.en.md (+39 -10)
  4. docs/module_usage/tutorials/cv_modules/anomaly_detection.en.md (+3 -4)
  5. docs/module_usage/tutorials/cv_modules/face_detection.en.md (+2 -1)
  6. docs/module_usage/tutorials/cv_modules/face_feature.en.md (+49 -52)
  7. docs/module_usage/tutorials/cv_modules/human_detection.en.md (+4 -4)
  8. docs/module_usage/tutorials/cv_modules/human_keypoint_detection.en.md (+2 -3)
  9. docs/module_usage/tutorials/cv_modules/image_classification.en.md (+5 -5)
  10. docs/module_usage/tutorials/cv_modules/image_feature.en.md (+8 -9)
  11. docs/module_usage/tutorials/cv_modules/image_multilabel_classification.en.md (+60 -51)
  12. docs/module_usage/tutorials/cv_modules/image_multilabel_classification.md (+61 -54)
  13. docs/module_usage/tutorials/cv_modules/instance_segmentation.en.md (+67 -73)
  14. docs/module_usage/tutorials/cv_modules/mainbody_detection.en.md (+3 -3)
  15. docs/module_usage/tutorials/cv_modules/object_detection.en.md (+19 -19)
  16. docs/module_usage/tutorials/cv_modules/object_detection.md (+16 -16)
  17. docs/module_usage/tutorials/cv_modules/open_vocabulary_detection.en.md (+1 -1)
  18. docs/module_usage/tutorials/cv_modules/open_vocabulary_segmentation.en.md (+1 -1)
  19. docs/module_usage/tutorials/cv_modules/pedestrian_attribute_recognition.en.md (+233 -65)
  20. docs/module_usage/tutorials/cv_modules/rotated_object_detection.en.md (+2 -2)
  21. docs/module_usage/tutorials/cv_modules/semantic_segmentation.en.md (+55 -59)
  22. docs/module_usage/tutorials/cv_modules/small_object_detection.en.md (+3 -3)
  23. docs/module_usage/tutorials/cv_modules/small_object_detection.md (+1 -1)
  24. docs/module_usage/tutorials/cv_modules/vehicle_attribute_recognition.en.md (+197 -6)
  25. docs/module_usage/tutorials/cv_modules/vehicle_attribute_recognition.md (+1 -1)
  26. docs/module_usage/tutorials/cv_modules/vehicle_detection.en.md (+56 -60)
  27. docs/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.en.md (+2 -3)
  28. docs/module_usage/tutorials/ocr_modules/layout_detection.en.md (+2 -2)
  29. docs/module_usage/tutorials/ocr_modules/layout_detection.md (+2 -2)
  30. docs/module_usage/tutorials/ocr_modules/seal_text_detection.en.md (+6 -7)
  31. docs/module_usage/tutorials/ocr_modules/seal_text_detection.md (+1 -1)
  32. docs/module_usage/tutorials/ocr_modules/table_cells_detection.en.md (+1 -2)
  33. docs/module_usage/tutorials/ocr_modules/table_classification.en.md (+1 -2)
  34. docs/module_usage/tutorials/ocr_modules/table_structure_recognition.en.md (+6 -7)
  35. docs/module_usage/tutorials/ocr_modules/text_detection.en.md (+8 -8)
  36. docs/module_usage/tutorials/ocr_modules/text_image_unwarping.en.md (+1 -3)
  37. docs/module_usage/tutorials/ocr_modules/text_recognition.en.md (+6 -6)
  38. docs/module_usage/tutorials/ocr_modules/text_recognition.md (+41 -41)
  39. docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.en.md (+1 -2)
  40. docs/pipeline_usage/pipeline_develop_guide.en.md (+44 -0)
  41. docs/pipeline_usage/pipeline_develop_guide.md (+44 -0)
  42. docs/pipeline_usage/tutorials/cv_pipelines/face_recognition.en.md (+83 -91)
  43. docs/pipeline_usage/tutorials/cv_pipelines/general_image_recognition.en.md (+2 -2)
  44. docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.en.md (+1 -1)
  45. docs/pipeline_usage/tutorials/cv_pipelines/image_classification.en.md (+20 -5)
  46. docs/pipeline_usage/tutorials/cv_pipelines/object_detection.md (+1 -1)
  47. docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.en.md (+13 -6)
  48. docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.en.md (+1 -1)
  49. docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.en.md (+11 -5)
  50. docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.md (+67 -74)
  51. docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md (+1 -1)
  52. docs/pipeline_usage/tutorials/time_series_pipelines/time_series_forecasting.en.md (+1 -0)
  53. docs/practical_tutorials/anomaly_detection_tutorial.en.md (+75 -38)
  54. docs/practical_tutorials/anomaly_detection_tutorial.md (+1 -1)
  55. docs/practical_tutorials/face_recognition_tutorial.en.md (+850 -0)
  56. docs/practical_tutorials/ocr_det_license_tutorial.en.md (+107 -57)
  57. docs/practical_tutorials/ts_anomaly_detection.en.md (+32 -18)
  58. docs/practical_tutorials/ts_classification.en.md (+26 -6)
  59. docs/practical_tutorials/ts_forecast.en.md (+27 -7)
  60. docs/support_list/models_list.en.md (+205 -205)
  61. docs/support_list/models_list.md (+198 -198)

+ 155 - 6
README.md

@@ -206,6 +206,66 @@ PaddleX的各个产线均支持本地**快速推理**,部分模型支持在[AI
         <td>🚧</td>
     </tr>
     <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.html">人体关键点检测</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_detection.html">开放词汇检测</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>🚧</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_segmentation.html">开放词汇分割</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>🚧</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.html">旋转目标检测</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/3d_bev_detection.html">3D多模态融合检测</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html">通用表格识别v2</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html">通用版面解析</a></td>
         <td>🚧</td>
         <td>✅</td>
@@ -216,6 +276,16 @@ PaddleX的各个产线均支持本地**快速推理**,部分模型支持在[AI
         <td>🚧</td>
     </tr>
     <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/layout_parsing_v2.html">通用版面解析v2</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>🚧</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/formula_recognition.html">公式识别</a></td>
         <td>🚧</td>
         <td>✅</td>
@@ -236,6 +306,16 @@ PaddleX的各个产线均支持本地**快速推理**,部分模型支持在[AI
         <td>🚧</td>
     </tr>
     <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.html">文档图像预处理</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/general_image_recognition.html">通用图像识别</a></td>
         <td>🚧</td>
         <td>✅</td>
@@ -275,6 +355,36 @@ PaddleX的各个产线均支持本地**快速推理**,部分模型支持在[AI
         <td>✅</td>
         <td>🚧</td>
     </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/speech_pipelines/multilingual_speech_recognition.html">多语种语音识别</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>🚧</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/video_pipelines/video_classification.html">通用视频分类</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/video_pipelines/video_detection.html">通用视频检测</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
 
 
 </table>
@@ -534,12 +644,15 @@ for res in output:
 * <details open>
     <summary> <b> 🔍 OCR </b></summary>
 
-    * [📜 通用 OCR 产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/OCR.html)
-    * [📊 通用表格识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/table_recognition.html)
-    * [📄 通用版面解析产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html)
-    * [📐 公式识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/formula_recognition.html)
-    * [📝 印章文本识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/seal_recognition.html)
-  </details>
+  * [📜 通用 OCR 产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/OCR.html )
+  * [📊 通用表格识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/table_recognition.html )
+  * [🗂️ 通用表格识别v2产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html)
+  * [📰 通用版面解析产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html)
+  * [🗞️ 通用版面解析产线v2使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/layout_parsing_v2.html)
+  * [📐 公式识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/formula_recognition.html)
+  * [🖋️ 印章文本识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/seal_recognition.html)
+  * [🖌️ 文档图像预处理产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.html) 
+</details>
 
 * <details open>
     <summary> <b> 🎥 计算机视觉 </b></summary>
@@ -551,6 +664,11 @@ for res in output:
    * [🏷️ 图像多标签分类产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.html)
    * [🔍 小目标检测产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/small_object_detection.html)
    * [🖼️ 图像异常检测产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.html)
+   * [🔍 人体关键点检测产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.html)
+   * [📚 开放词汇检测产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_detection.html)
+   * [🎨 开放词汇分割产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_segmentation.html)
+   * [🔄 旋转目标检测产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.html)
+   * [🌐 3D多模态融合检测产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/3d_bev_detection.html)
    * [🖼️ 通用图像识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/general_image_recognition.html)
    * [🆔人脸识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/face_recognition.html)
    * [🚗 车辆属性识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.html)
@@ -565,7 +683,16 @@ for res in output:
    * [🕒 时序分类产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/time_series_pipelines/time_series_classification.html)
   </details>
 
+* <details open>
+    <summary> <b> 🎤 语音分析</b> </summary>
+
+    * [🌐 多语种语音识别产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/speech_pipelines/multilingual_speech_recognition.html)
+
+* <details open>
+    <summary> <b> 🎥 视频处理</b> </summary>
 
+    * [📈 通用视频分类产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/video_pipelines/video_classification.html)
+    * [🔍 通用视频检测产线使用教程](https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/video_pipelines/video_detection.html)
 
 * <details>
     <summary> <b>🔧 相关说明文件</b> </summary>
@@ -590,6 +717,9 @@ for res in output:
   * [📄 文档图像方向分类使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.html)
   * [🔧 文本图像矫正模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/text_image_unwarping.html)
   * [📐 公式识别模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/formula_recognition.html)
+  * [📊 表格单元格检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/table_cells_detection.html)
+  * [📈 表格分类模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/table_classification.html)
+  * [📝 文本行方向分类模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/textline_orientation_classification.html)
 
   </details>
 
@@ -619,6 +749,9 @@ for res in output:
   * [🔍 主体检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/mainbody_detection.html)
   * [🚶 行人检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/human_detection.html)
   * [🚗 车辆检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/vehicle_detection.html)
+  * [🚶‍♂️ 人体关键点检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/human_keypoint_detection.html)
+  * [🌐 开放词汇目标检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/open_vocabulary_detection.html)
+  * [🔄 旋转目标检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/rotated_object_detection.html)
 
   </details>
 
@@ -628,6 +761,7 @@ for res in output:
   * [🗺️ 语义分割模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/semantic_segmentation.html)
   * [🔍 实例分割模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/instance_segmentation.html)
   * [🚨 图像异常检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/anomaly_detection.html)
+  * [🌐 开放词汇分割模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/open_vocabulary_segmentation.html)
   </details>
 
 * <details open>
@@ -638,6 +772,21 @@ for res in output:
   * [🕒 时序分类模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/time_series_modules/time_series_classification.html)
   </details>
 
+* <details open>
+  <summary> <b> 🌐 3D </b></summary>
+  * [🚗 3D多模态融合检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/cv_modules/3d_bev_detection.html)
+
+* <details open>
+  <summary> <b> 🎤 语音 </b></summary>
+
+  * [🌐 多语种语音识别模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/speech_modules/multilingual_speech_recognition.html)
+
+* <details open>
+  <summary> <b> 🎥 视频 </b></summary>
+  
+  * [📈 视频分类模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/video_modules/video_classification.html)
+  * [🔍 视频检测模块使用教程](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/video_modules/video_detection.html)
+
 * <details>
   <summary> <b> 📄 相关说明文件 </b></summary>
 

+ 153 - 0
README_en.md

@@ -208,6 +208,66 @@ In addition, PaddleX provides developers with a full-process efficient model tra
         <td>🚧</td>
     </tr>
     <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.html">Human Keypoint Detection</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_detection.html">Open Vocabulary Detection</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>🚧</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_segmentation.html">Open Vocabulary Segmentation</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>🚧</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.html">Rotated Object Detection</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/3d_bev_detection.html">3D Bev Detection</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html">Table Recognition v2</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html">Layout Parsing</a></td>
         <td>🚧</td>
         <td>✅</td>
@@ -218,6 +278,16 @@ In addition, PaddleX provides developers with a full-process efficient model tra
         <td>🚧</td>
     </tr>
     <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/layout_parsing_v2.html">Layout Parsing v2</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>🚧</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/formula_recognition.html">Formula Recognition</a></td>
         <td>🚧</td>
         <td>✅</td>
@@ -238,6 +308,16 @@ In addition, PaddleX provides developers with a full-process efficient model tra
         <td>🚧</td>
     </tr>
     <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.html">Document Image Preprocessing</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/general_image_recognition.html">Image Recognition</a></td>
         <td>🚧</td>
         <td>✅</td>
@@ -277,6 +357,37 @@ In addition, PaddleX provides developers with a full-process efficient model tra
         <td>✅</td>
         <td>🚧</td>
     </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/speech_pipelines/multilingual_speech_recognition.html">Multilingual Speech Recognition</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>🚧</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/video_pipelines/video_classification.html">Video Classification</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+    <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/video_pipelines/video_detection.html">Video Detection</a></td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>🚧</td>
+    </tr>
+
 </table>
 
 > ❗Note: The above capabilities are implemented based on GPU/CPU. PaddleX can also perform local inference and custom development on mainstream hardware such as Kunlunxin, Ascend, Cambricon, and Haiguang. The table below details the support status of the pipelines. For specific supported model lists, please refer to the [Model List (Kunlunxin XPU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_xpu.html)/[Model List (Ascend NPU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_npu.html)/[Model List (Cambricon MLU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_mlu.html)/[Model List (Haiguang DCU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_dcu.html). We are continuously adapting more models and promoting the implementation of high-performance and serving on mainstream hardware.
@@ -534,9 +645,12 @@ For other pipelines in Python scripts, just adjust the `pipeline` parameter of t
 
     * [📜 OCR Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/OCR.html)
     * [📊 Table Recognition Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/table_recognition.html)
+    * [🗂️ Table Recognition v2 Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html)
     * [📄 Layout Parsing Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html)
+    * [🗞️ Layout Parsing v2 Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/layout_parsing_v2.html)
     * [📐 Formula Recognition Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/formula_recognition.html)
     * [📝 Seal Recognition Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/seal_recognition.html)
+    * [🖌️ Document Image Preprocessing](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.html)
   </details>
 
 * <details open>
@@ -549,6 +663,11 @@ For other pipelines in Python scripts, just adjust the `pipeline` parameter of t
    * [🏷️ Multi-label Image Classification Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.html)
    * [🔍 Small Object Detection Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/small_object_detection.html)
    * [🖼️ Image Anomaly Detection Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.html)
+   * [🔍 Human Keypoint Detection Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.html)
+   * [📚 Open Vocabulary Detection Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_detection.html)
+   * [🎨 Open Vocabulary Segmentation Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_segmentation.html)
+   * [🔄 Rotated Object Detection Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.html)
+   * [🌐 3D Bev Detection Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/3d_bev_detection.html)
    * [🖼️ Image Recognition Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/general_image_recognition.html)
    * [🆔 Face Recognition Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/face_recognition.html)
    * [🚗 Vehicle Attribute Recognition Pipeline Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute.html)
@@ -564,6 +683,17 @@ For other pipelines in Python scripts, just adjust the `pipeline` parameter of t
   </details>
 
 * <details open>
+    <summary> <b> 🎤 Speech Analysis</b> </summary>
+
+    * [🌐 Multilingual Speech Recognition Pipeline Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/speech_pipelines/multilingual_speech_recognition.html)
+
+* <details open>
+    <summary> <b> 🎥 Video Processing</b> </summary>
+
+    * [📈 General Video Classification Pipeline Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/video_pipelines/video_classification.html)
+    * [🔍 General Video Detection Pipeline Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/video_pipelines/video_detection.html)
+
+* <details open>
     <summary> <b>🔧 Related Instructions</b> </summary>
 
    * [🖥️ PaddleX pipeline Command Line Instruction](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/instructions/pipeline_CLI_usage.html)
@@ -586,6 +716,9 @@ For other pipelines in Python scripts, just adjust the `pipeline` parameter of t
   * [📄 Document Image Orientation Classification Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.html)
   * [🔧 Document Image Unwarp Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/text_image_unwarping.html)
   * [📐 Formula Recognition Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/formula_recognition.html)
+  * [📊 Table Cell Detection Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_cells_detection.html)
+  * [📈 Table Classification Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_classification.html)
+  * [📝 Text Line Orientation Classification Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/textline_orientation_classification.html)
   </details>
 
 * <details open>
@@ -615,6 +748,9 @@ For other pipelines in Python scripts, just adjust the `pipeline` parameter of t
   * [🔍 Mainbody Detection Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/mainbody_detection.html)
   * [🚶 Pedestrian Detection Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/human_detection.html)
   * [🚗 Vehicle Detection Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/vehicle_detection.html)
+  * [🚶‍♂️ Human Keypoint Detection Module Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/human_keypoint_detection.html)
+  * [🌐 Open-Vocabulary Object Detection Module Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/open_vocabulary_detection.html)
+  * [🔄 Rotated Object Detection Module Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/rotated_object_detection.html)
 
   </details>
 
@@ -624,6 +760,7 @@ For other pipelines in Python scripts, just adjust the `pipeline` parameter of t
   * [🗺️ Semantic Segmentation Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/semantic_segmentation.html)
   * [🔍 Instance Segmentation Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/instance_segmentation.html)
   * [🚨 Image Anomaly Detection Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/anomaly_detection.html)
+  * [🌐 Open-Vocabulary Segmentation Module Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/open_vocabulary_segmentation.html)
   </details>
 
 * <details open>
@@ -635,6 +772,22 @@ For other pipelines in Python scripts, just adjust the `pipeline` parameter of t
   </details>
 
 * <details open>
+  <summary> <b> 🌐 3D </b></summary>
+  
+  * [🚗 3D Multimodal Fusion Detection Module Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/cv_modules/3d_bev_detection.html)
+
+* <details open>
+  <summary> <b> 🎤 Speech </b></summary>
+
+  * [🌐 Multilingual Speech Recognition Module Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/speech_modules/multilingual_speech_recognition.html)
+
+* <details open>
+  <summary> <b> 🎥 Video </b></summary>
+
+  * [📈 Video Classification Module Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/video_modules/video_classification.html)
+  * [🔍 Video Detection Module Usage Tutorial](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/video_modules/video_detection.html)
+
+* <details open>
   <summary> <b> 📄 Related Instructions </b></summary>
 
   * [📝 PaddleX Single Model Python Script Instruction](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/instructions/model_python_API.html)

+ 39 - 10
docs/module_usage/tutorials/cv_modules/3d_bev_detection.en.md

@@ -31,23 +31,53 @@ The 3D multimodal fusion detection module is a key component in the fields of co
 ## III. Quick Integration
 > ❗ Before quick integration, please install the PaddleX wheel package first. For details, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
 
-After installing the wheel package, you can complete the inference of the object detection module with just a few lines of code. You can switch between models under this module at will, and you can also integrate the model inference of the 3D multimodal fusion detection module into your project. Before running the following code, please download the [sample input](https://paddle-model-ecology.bj.bcebos.com/paddlex/det_3d/demo_det_3d/nuscenes_infos_val.pkl) to your local machine. 
+After completing the installation of the wheel package, you can perform inference for the object detection module with just a few lines of code. You can switch models under this module at will, and you can also integrate the model inference of the 3D multimodal fusion detection module into your project. Before running the following code, please download the [sample input](https://paddle-model-ecology.bj.bcebos.com/paddlex/det_3d/demo_det_3d/nuscenes_demo_infer.tar) to your local machine.
+
 
 ```python
-from paddlex import create_model
-model = create_model(model_name="BEVFusion")
-output = model.predict(input="nuscenes_infos_val.pkl", batch_size=1)
+from paddlex import create_pipeline
+
+pipeline = create_pipeline(pipeline="3d_bev_detection")
+output = pipeline.predict("nuscenes_demo_infer.tar")
+
 for res in output:
-    res.print()
-    res.save_to_json(save_path="./output/res.json")
-```
+    res.print()  ## Print the structured output of the prediction
+    res.save_to_json("./output/")  ## Save the results to a JSON file
+```
 
 After running, the result obtained is:
 
 ```bash
 {"res":
   {
-    "input_path": "./data/nuscenes/samples/LIDAR_TOP/n008-2018-08-01-15-16-36-0400__LIDAR_TOP__1533151616947490.pcd.bin", "input_img_paths": ["./data/nuscenes/samples/CAM_FRONT_LEFT/n008-2018-08-01-15-16-36-0400__CAM_FRONT_LEFT__1533151616904806.jpg", "./data/nuscenes/samples/CAM_FRONT/n008-2018-08-01-15-16-36-0400__CAM_FRONT__1533151616912404.jpg", "./data/nuscenes/samples/CAM_FRONT_RIGHT/n008-2018-08-01-15-16-36-0400__CAM_FRONT_RIGHT__1533151616920482.jpg", "./data/nuscenes/samples/CAM_BACK_RIGHT/n008-2018-08-01-15-16-36-0400__CAM_BACK_RIGHT__1533151616928113.jpg", "./data/nuscenes/samples/CAM_BACK/n008-2018-08-01-15-16-36-0400__CAM_BACK__1533151616937558.jpg", "./data/nuscenes/samples/CAM_BACK_LEFT/n008-2018-08-01-15-16-36-0400__CAM_BACK_LEFT__1533151616947405.jpg"], "sample_id": "cc57c1ea80fe46a7abddfdb15654c872", "boxes_3d": [[-8.913962364196777, 13.30993366241455, -1.7353310585021973, 1.9886571168899536, 4.886075019836426, 1.877254605293274, 6.317165374755859, -0.00018131558317691088, 0.022375036031007767]], "labels_3d": [0], "scores_3d": [0.9951273202896118]
+    'input_path': 'samples/LIDAR_TOP/n015-2018-10-08-15-36-50+0800__LIDAR_TOP__1538984253447765.pcd.bin',
+    'sample_id': 'b4ff30109dd14c89b24789dc5713cf8c',
+    'input_img_paths': [
+      'samples/CAM_FRONT_LEFT/n015-2018-10-08-15-36-50+0800__CAM_FRONT_LEFT__1538984253404844.jpg',
+      'samples/CAM_FRONT/n015-2018-10-08-15-36-50+0800__CAM_FRONT__1538984253412460.jpg',
+      'samples/CAM_FRONT_RIGHT/n015-2018-10-08-15-36-50+0800__CAM_FRONT_RIGHT__1538984253420339.jpg',
+      'samples/CAM_BACK_RIGHT/n015-2018-10-08-15-36-50+0800__CAM_BACK_RIGHT__1538984253427893.jpg',
+      'samples/CAM_BACK/n015-2018-10-08-15-36-50+0800__CAM_BACK__1538984253437525.jpg',
+      'samples/CAM_BACK_LEFT/n015-2018-10-08-15-36-50+0800__CAM_BACK_LEFT__1538984253447423.jpg'
+    ],
+    "boxes_3d": [
+        [
+            14.5425386428833,
+            22.142045974731445,
+            -1.2903141975402832,
+            1.8441576957702637,
+            4.433370113372803,
+            1.7367216348648071,
+            6.367165565490723,
+            0.0036598597653210163,
+            -0.013568558730185032
+        ]
+    ],
+    "labels_3d": [
+        0
+    ],
+    "scores_3d": [
+        0.9920279383659363
+    ]
   }
 }
 ```
@@ -268,8 +298,7 @@ python main.py -c paddlex/configs/modules/3d_bev_detection/BEVFusion.yaml \
   &quot;analysis&quot;: {
     &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
   },
-  &quot;dataset_path&quot;: &quot;/workspace/bevfusion/Paddle3D/data/nuscenes&quot;,
-  &quot;show_type&quot;: &quot;path for images and lidar&quot;,
+  &quot;dataset_path&quot;: &quot;/workspace/bevfusion/Paddle3D/data/nuscenes&quot;,&quot;show_type&quot;: &quot;txt&quot;,
   &quot;dataset_type&quot;: &quot;NuscenesMMDataset&quot;
 }
 </code></pre>

+ 3 - 4
docs/module_usage/tutorials/cv_modules/anomaly_detection.en.md

@@ -106,15 +106,14 @@ Relevant methods, parameters, and explanations are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
   <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
   <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
   <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
   <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-  <li><b>List</b>, elements of the list must be the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+  <li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -262,7 +261,7 @@ After executing the above command, PaddleX will validate the dataset and collect
   &quot;analysis&quot;: {
     &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
   },
-  &quot;dataset_path&quot;: &quot;./dataset/example_data/mvtec_examples&quot;,
+  &quot;dataset_path&quot;: &quot;mvtec_examples&quot;,
   &quot;show_type&quot;: &quot;image&quot;,
   &quot;dataset_type&quot;: &quot;SegDataset&quot;
 }
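To make the `input` options described in the parameter table above concrete, here is a minimal sketch of module-level inference with a list of local image paths. The `STFPM` model name and the file paths are assumptions used only for illustration; the call pattern mirrors the `create_model`/`predict` snippets used elsewhere in these docs.

```python
from paddlex import create_model

# Sketch only: STFPM is assumed to be an available anomaly detection model name,
# and the image paths below are placeholders.
model = create_model(model_name="STFPM")

# `input` accepts a single path, a URL, a directory, a numpy.ndarray, or a list of these.
output = model.predict(input=["/root/data/img1.jpg", "/root/data/img2.jpg"], batch_size=2)
for res in output:
    res.print()                               # print the structured prediction
    res.save_to_json(save_path="./output/")   # save the result as JSON
```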

Changes are not shown because the file is too large.
+ 2 - 1
docs/module_usage/tutorials/cv_modules/face_detection.en.md


+ 49 - 52
docs/module_usage/tutorials/cv_modules/face_feature.en.md

@@ -15,8 +15,8 @@ Face feature models typically take standardized face images processed through de
 <th>Model</th><th>Model Download Link</th>
 <th>Output Feature Dimension</th>
 <th>Acc (%)<br/>AgeDB-30/CFP-FP/LFW</th>
-<th>GPU Inference Time (ms)</th>
-<th>CPU Inference Time (ms)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Size (M)</th>
 <th>Description</th>
 </tr>
@@ -26,8 +26,8 @@ Face feature models typically take standardized face images processed through de
 <td>MobileFaceNet</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MobileFaceNet_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MobileFaceNet_pretrained.pdparams">Trained Model</a></td>
 <td>128</td>
 <td>96.28/96.71/99.58</td>
-<td>5.7</td>
-<td>101.6</td>
+<td>3.16 / 0.48</td>
+<td>6.49 / 6.49</td>
 <td>4.1</td>
 <td>Face feature model trained on MobileFaceNet with MS1Mv3 dataset</td>
 </tr>
@@ -35,8 +35,8 @@ Face feature models typically take standardized face images processed through de
 <td>ResNet50_face</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/ResNet50_face_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ResNet50_face_pretrained.pdparams">Trained Model</a></td>
 <td>512</td>
 <td>98.12/98.56/99.77</td>
-<td>8.7</td>
-<td>200.7</td>
+<td>5.68 / 1.09</td>
+<td>14.96 / 11.90</td>
 <td>87.2</td>
 <td>Face feature model trained on ResNet50 with MS1Mv3 dataset</td>
 </tr>
@@ -45,7 +45,7 @@ Face feature models typically take standardized face images processed through de
 <p>Note: The above accuracy metrics are Accuracy scores measured on the AgeDB-30, CFP-FP, and LFW datasets, respectively. All model GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speeds are based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</p>
 
 ## III. Quick Integration
-> ❗ Before quick integration, please install the PaddleX wheel package. For details, refer to the [PaddleX Local Installation Tutorial](../../../installation/installation.en.md)
+&gt; ❗ Before quick integration, please install the PaddleX wheel package. For details, refer to the [PaddleX Local Installation Tutorial](../../../installation/installation.en.md)
 
 After installing the whl package, a few lines of code can complete the inference of the face feature module. You can switch models under this module freely, and you can also integrate the model inference of the face feature module into your project. Before running the following code, please download the [example image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/face_recognition_001.jpg) to your local machine.
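The code referenced above is not shown in this excerpt; a rough sketch of what the quick-integration call typically looks like, assuming the `MobileFaceNet` model listed in the table above and the demo image linked in the paragraph:

```python
from paddlex import create_model

# Sketch only: MobileFaceNet is one of the face feature models listed above.
model = create_model(model_name="MobileFaceNet")
output = model.predict("face_recognition_001.jpg", batch_size=1)
for res in output:
    res.print()                                       # structured output, including the extracted feature vector
    res.save_to_json(save_path="./output/res.json")   # persist the result as JSON
```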
 
@@ -126,15 +126,14 @@ The explanations for the methods, parameters, etc., are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-  <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-  <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-  <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
-  <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-  <li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL Link</b>, such as the web URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
+<li><b>Local Directory</b>, the directory should contain the data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -246,47 +245,46 @@ python main.py -c paddlex/configs/modules/face_feature/MobileFaceNet.yaml \
 After executing the above command, PaddleX will validate the dataset and collect its basic information. Upon successful execution, the log will print the message `Check dataset passed !`. The validation result file will be saved in `./output/check_dataset_result.json`, and related outputs will be saved in the `./output/check_dataset` directory of the current directory. The output directory includes visualized example images and histograms of sample distributions.
 
 <details><summary>👉 <b>Validation Result Details (Click to Expand)</b></summary>
-
 <p>The specific content of the validation result file is:</p>
 <pre><code class="language-bash">{
-  &quot;done_flag&quot;: true,
-  &quot;check_pass&quot;: true,
-  &quot;attributes&quot;: {
-    &quot;train_label_file&quot;: &quot;../../dataset/face_rec_examples/train/label.txt&quot;,
-    &quot;train_num_classes&quot;: 995,
-    &quot;train_samples&quot;: 1000,
-    &quot;train_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/01378592.jpg&quot;,
-      &quot;check_dataset/demo_img/04331410.jpg&quot;,
-      &quot;check_dataset/demo_img/03485713.jpg&quot;,
-      &quot;check_dataset/demo_img/02382123.jpg&quot;,
-      &quot;check_dataset/demo_img/01722397.jpg&quot;,
-      &quot;check_dataset/demo_img/02682349.jpg&quot;,
-      &quot;check_dataset/demo_img/00272794.jpg&quot;,
-      &quot;check_dataset/demo_img/03151987.jpg&quot;,
-      &quot;check_dataset/demo_img/01725764.jpg&quot;,
-      &quot;check_dataset/demo_img/02580369.jpg&quot;
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "train_label_file": "../../dataset/face_rec_examples/train/label.txt",
+    "train_num_classes": 995,
+    "train_samples": 1000,
+    "train_sample_paths": [
+      "check_dataset/demo_img/01378592.jpg",
+      "check_dataset/demo_img/04331410.jpg",
+      "check_dataset/demo_img/03485713.jpg",
+      "check_dataset/demo_img/02382123.jpg",
+      "check_dataset/demo_img/01722397.jpg",
+      "check_dataset/demo_img/02682349.jpg",
+      "check_dataset/demo_img/00272794.jpg",
+      "check_dataset/demo_img/03151987.jpg",
+      "check_dataset/demo_img/01725764.jpg",
+      "check_dataset/demo_img/02580369.jpg"
     ],
-    &quot;val_label_file&quot;: &quot;../../dataset/face_rec_examples/val/pair_label.txt&quot;,
-    &quot;val_num_classes&quot;: 2,
-    &quot;val_samples&quot;: 500,
-    &quot;val_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/Don_Carcieri_0001.jpg&quot;,
-      &quot;check_dataset/demo_img/Eric_Fehr_0001.jpg&quot;,
-      &quot;check_dataset/demo_img/Harry_Kalas_0001.jpg&quot;,
-      &quot;check_dataset/demo_img/Francis_Ford_Coppola_0001.jpg&quot;,
-      &quot;check_dataset/demo_img/Amer_al-Saadi_0001.jpg&quot;,
-      &quot;check_dataset/demo_img/Sergei_Ivanov_0001.jpg&quot;,
-      &quot;check_dataset/demo_img/Erin_Runnion_0003.jpg&quot;,
-      &quot;check_dataset/demo_img/Bill_Stapleton_0001.jpg&quot;,
-      &quot;check_dataset/demo_img/Daniel_Bruehl_0001.jpg&quot;,
-      &quot;check_dataset/demo_img/Clare_Short_0004.jpg&quot;
+    "val_label_file": "../../dataset/face_rec_examples/val/pair_label.txt",
+    "val_num_classes": 2,
+    "val_samples": 500,
+    "val_sample_paths": [
+      "check_dataset/demo_img/Don_Carcieri_0001.jpg",
+      "check_dataset/demo_img/Eric_Fehr_0001.jpg",
+      "check_dataset/demo_img/Harry_Kalas_0001.jpg",
+      "check_dataset/demo_img/Francis_Ford_Coppola_0001.jpg",
+      "check_dataset/demo_img/Amer_al-Saadi_0001.jpg",
+      "check_dataset/demo_img/Sergei_Ivanov_0001.jpg",
+      "check_dataset/demo_img/Erin_Runnion_0003.jpg",
+      "check_dataset/demo_img/Bill_Stapleton_0001.jpg",
+      "check_dataset/demo_img/Daniel_Bruehl_0001.jpg",
+      "check_dataset/demo_img/Clare_Short_0004.jpg"
     ]
   },
-  &quot;analysis&quot;: {},
-  &quot;dataset_path&quot;: &quot;./dataset/face_rec_examples&quot;,
-  &quot;show_type&quot;: &quot;image&quot;,
-  &quot;dataset_type&quot;: &quot;ClsDataset&quot;
+  "analysis": {},
+  "dataset_path": "./dataset/face_rec_examples",
+  "show_type": "image",
+  "dataset_type": "ClsDataset"
 }
 </code></pre>
 <p>The verification results mentioned above indicate that <code>check_pass</code> being <code>True</code> means the dataset format meets the requirements. Details of other indicators are as follows:</p>
@@ -302,7 +300,6 @@ After executing the above command, PaddleX will validate the dataset and collect
 After completing the data validation, you can convert the dataset format and re-split the training/validation ratio by <b>modifying the configuration file</b> or <b>adding hyperparameters</b>.
 
 <details><summary>👉 <b>Details on Format Conversion / Dataset Splitting (Click to Expand)</b></summary>
-
 <p>The face feature module does not support data format conversion or dataset splitting.</p></details>
 
 #### 4.1.4 Data Organization for Face Feature Module
@@ -352,7 +349,6 @@ The steps required are:
 Other related parameters can be set by modifying the `Global` and `Train` fields in the `.yaml` configuration file or by appending parameters in the command line. For example, to specify the first two GPUs for training: `-o Global.device=gpu:0,1`; to set the number of training epochs to 10: `-o Train.epochs_iters=10`. For more modifiable parameters and their detailed explanations, refer to the configuration file instructions for the corresponding task module [PaddleX Common Configuration Parameters for Model Tasks](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <ul>
 <li>During model training, PaddleX automatically saves model weight files, defaulting to <code>output</code>. To specify a save path, use the <code>-o Global.output</code> field in the configuration file.</li>
 <li>PaddleX shields you from the concepts of dynamic graph weights and static graph weights. During model training, both dynamic and static graph weights are produced, and static graph weights are selected by default for model inference.</li>
@@ -417,3 +413,4 @@ The face feature module can be integrated into the PaddleX pipeline for [<b>Face
 2. <b>Module Integration</b>
 
 The weights you produced can be directly integrated into the face feature module. You can refer to the Python example code in [Quick Integration](#III.-Quick-Integration) and only need to replace the model with the path to the model you trained.
+</details></details>

+ 4 - 4
docs/module_usage/tutorials/cv_modules/human_detection.en.md

@@ -42,7 +42,7 @@ Human detection is a subtask of object detection, which utilizes computer vision
 
 
 ## III. Quick Integration
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, you can perform human detection with just a few lines of code. You can easily switch between models in this module and integrate the human detection model inference into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/human_detection.jpg) to your local machine.
 
@@ -114,7 +114,7 @@ The explanations for the methods, parameters, etc., are as follows:
 </table>
 
 * The `model_name` must be specified. After specifying `model_name`, the default model parameters built into PaddleX are used. If `model_dir` is specified, the user-defined model is used.
-* `threshold` is the threshold for filtering low-confidence objects. The default is `None`, which means using the settings from the previous layer. The priority of parameter settings from highest to lowest is: `predict parameter &gt; create_model initialization &gt; yaml configuration file`. Currently, two types of threshold settings are supported:
+* `threshold` is the threshold for filtering low-confidence objects. The default is `None`, which means using the settings from the previous layer. The priority of parameter settings from highest to lowest is: `predict parameter > create_model initialization > yaml configuration file`. Currently, two types of threshold settings are supported:
   * `float`, using the same threshold for all classes.
   * `dict`, where the key is the class ID and the value is the threshold, allowing different thresholds for different classes. Since pedestrian detection is a single-class detection, this setting is not required.
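A minimal sketch of the two threshold forms described above; the `PP-YOLOE-S_human` model name and the demo image are assumptions used only for illustration:

```python
from paddlex import create_model

# Float form: one threshold applied to every class.
model = create_model(model_name="PP-YOLOE-S_human", threshold=0.5)

# Dict form (class id -> threshold); shown only for completeness, since
# pedestrian detection has a single class (id 0):
# model = create_model(model_name="PP-YOLOE-S_human", threshold={0: 0.5})

output = model.predict("human_detection.jpg", batch_size=1)  # a threshold passed here would override the one above
for res in output:
    res.print()
```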
 
@@ -140,7 +140,7 @@ The explanations for the methods, parameters, etc., are as follows:
 <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
 <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_instance_segmentation_004.png">Example</a></li>
 <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-<li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -158,7 +158,7 @@ The explanations for the methods, parameters, etc., are as follows:
 <td><code>float</code>/<code>dict</code>/<code>None</code></td>
 <td>
 <ul>
-<li><b>None</b>, indicating the use of settings from the previous layer. The priority of parameter settings from highest to lowest is: <code>predict parameter &gt; create_model initialization &gt; yaml configuration file</code></li>
+<li><b>None</b>, indicating the use of settings from the previous layer. The priority of parameter settings from highest to lowest is: <code>predict parameter > create_model initialization > yaml configuration file</code></li>
 <li><b>float</b>, such as 0.5, indicating the use of 0.5 as the threshold for filtering low-confidence objects during inference</li>
 <li><b>dict</b>, such as <code>{0: 0.5, 1: 0.35}</code>, indicating the use of 0.5 as the threshold for class 0 and 0.35 for class 1 during inference. Since pedestrian detection is a single-class detection, this setting is not required.</li>
 </ul>

+ 2 - 3
docs/module_usage/tutorials/cv_modules/human_keypoint_detection.en.md

@@ -139,15 +139,14 @@ The explanations for the methods, parameters, etc., are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
   <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
   <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
   <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
   <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-  <li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+  <li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 5 - 5
docs/module_usage/tutorials/cv_modules/image_classification.en.md

@@ -83,7 +83,7 @@ The image classification module is a crucial component in computer vision system
 </tr>
 </table>
 
-&gt; ❗ The above list features the <b>9 core models</b> that the image classification module primarily supports. In total, this module supports <b>80 models</b>. The complete list of models is as follows:
+> ❗ The above list features the <b>9 core models</b> that the image classification module primarily supports. In total, this module supports <b>80 models</b>. The complete list of models is as follows:
 
 <details><summary> 👉Details of Model List</summary>
 <table>
@@ -680,7 +680,7 @@ The image classification module is a crucial component in computer vision system
 <p><b>Note: The above accuracy metrics refer to Top-1 Accuracy on the <a href="https://www.image-net.org/index.php">ImageNet-1k</a> validation set. </b><b>All model GPU inference times are based on NVIDIA Tesla T4 machines, with precision type FP32. CPU inference speeds are based on Intel® Xeon® Gold 5117 CPU @ 2.00GHz, with 8 threads and precision type FP32.</b></p></details>
 
 ## <span id="lable">III. Quick Integration</span>
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
 
 After installing the wheel package, you can complete image classification module inference with just a few lines of code. You can switch between models in this module freely, and you can also integrate the model inference of the image classification module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg) to your local machine.
 
@@ -702,6 +702,7 @@ After running, the result obtained is:
 
 The meanings of the running results parameters are as follows:
 - `input_path`: Indicates the path of the input image.
+- `page_index`: If the input is a PDF file, it indicates which page of the PDF is currently being processed; otherwise, it is `None`.
 - `class_ids`: Indicates the class IDs of the prediction results.
 - `scores`: Indicates the confidence scores of the prediction results.
 - `label_names`: Indicates the class names of the prediction results.
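A hedged end-to-end sketch that produces a result carrying exactly these fields (the model name is assumed; any model from the list above can be substituted):

```python
# Hedged sketch: image classification inference whose result carries
# input_path, page_index, class_ids, scores and label_names.
from paddlex import create_model

model = create_model(model_name="PP-LCNet_x1_0")  # assumed model name
output = model.predict("general_image_classification_001.jpg", batch_size=1)
for res in output:
    res.print()                           # prints the fields described above
    res.save_to_img("./output/")          # visualization with the top labels
    res.save_to_json("./output/res.json")
# For a PDF input, each page yields one result and page_index identifies the page.
```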
@@ -758,15 +759,14 @@ Related methods, parameters, and explanations are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
 <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
 <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
 <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg">Example</a></li>
 <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-<li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-<li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 8 - 9
docs/module_usage/tutorials/cv_modules/image_feature.en.md

@@ -45,7 +45,7 @@ The image feature module is one of the important tasks in computer vision, prima
 <b>Note: The above accuracy metrics are Recall@1 from AliProducts. All GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speeds are based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b>
 
 ## III. Quick Integration
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, a few lines of code can complete the inference of the image feature module. You can switch between models under this module freely, and you can also integrate the model inference of the image feature module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_recognition_001.jpg) to your local machine.
 
@@ -116,15 +116,14 @@ Descriptions of related methods, parameters, etc., are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-<li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-<li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
-<li><b>Local directory</b>, which must contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-<li><b>Dictionary</b>, where the <code>key</code> must correspond to the specific task (e.g., <code>"img"</code> for image classification), and the <code>value</code> supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-<li><b>List</b>, elements of the list must be of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL Link</b>, such as the web URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
+<li><b>Local Directory</b>, the directory should contain the data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -442,7 +441,7 @@ Similar to model training and evaluation, the following steps are required:
 * Specify the input data path: `-o Predict.input="..."`.
 Other related parameters can be set by modifying the `Global` and `Predict` fields in the `.yaml` configuration file. For details, please refer to [PaddleX Common Model Configuration File Parameter Description](../../instructions/config_parameters_common.en.md).
 
-&gt; ❗ Note: The inference result of the recognition model is a set of vectors, which requires a retrieval module to complete image feature.
+> ❗ Note: The inference result of the recognition model is a set of feature vectors, which must be paired with a retrieval module to complete the image feature search.
 
 #### 4.4.2 Model Integration
 The model can be directly integrated into the PaddleX pipeline or directly into your own project.
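When integrating into your own project, the module's output is a set of feature vectors, so some retrieval logic is needed on top; the simplest stand-in is a cosine-similarity comparison. The sketch below uses plain numpy with placeholder vectors in place of vectors taken from the module's prediction results:

```python
# Hedged sketch: comparing two feature vectors from the image feature module.
import numpy as np

def cosine_similarity(vec_a: np.ndarray, vec_b: np.ndarray) -> float:
    # L2-normalize both vectors so the dot product falls in [-1, 1].
    a = vec_a / np.linalg.norm(vec_a)
    b = vec_b / np.linalg.norm(vec_b)
    return float(np.dot(a, b))

# Placeholder 512-dim vectors; in practice these come from the module's results.
vec_a, vec_b = np.random.rand(512), np.random.rand(512)
print(cosine_similarity(vec_a, vec_b))  # higher score means more similar images
```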

+ 60 - 51
docs/module_usage/tutorials/cv_modules/image_multilabel_classification.en.md

@@ -14,49 +14,62 @@ The image multi-label classification module is a crucial component in computer v
 <tr>
 <th>Model</th><th>Model Download Link</th>
 <th>mAP(%)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Size (M)</th>
 <th>Description</th>
 </tr>
 <tr>
 <td>CLIP_vit_base_patch16_448_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/CLIP_vit_base_patch16_448_ML_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/CLIP_vit_base_patch16_448_ML_pretrained.pdparams">Trained Model</a></td>
 <td>89.15</td>
+<td>54.75 / 14.30</td>
+<td>280.23 / 280.23</td>
 <td>325.6 M</td>
 <td>CLIP_ML is an image multi-label classification model based on CLIP, which significantly improves accuracy on multi-label classification tasks by incorporating an ML-Decoder.</td>
 </tr>
 <tr>
 <td>PP-HGNetV2-B0_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNetV2-B0_ML_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNetV2-B0_ML_pretrained.pdparams">Trained Model</a></td>
 <td>80.98</td>
+<td>6.47 / 1.38</td>
+<td>21.56 / 13.69</td>
 <td>39.6 M</td>
 <td rowspan="3">PP-HGNetV2_ML is an image multi-label classification model based on PP-HGNetV2, which significantly improves accuracy on multi-label classification tasks by incorporating an ML-Decoder.</td>
 </tr>
 <tr>
 <td>PP-HGNetV2-B4_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNetV2-B4_ML_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNetV2-B4_ML_pretrained.pdparams">Trained Model</a></td>
 <td>87.96</td>
+<td>9.63 / 2.79</td>
+<td>43.98 / 36.63</td>
 <td>88.5 M</td>
 </tr>
 <tr>
 <td>PP-HGNetV2-B6_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNetV2-B6_ML_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNetV2-B6_ML_pretrained.pdparams">Trained Model</a></td>
 <td>91.25</td>
+<td>37.07 / 9.43</td>
+<td>188.58 / 188.58</td>
 <td>286.5 M</td>
 </tr>
 <tr>
 <td>PP-LCNet_x1_0_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LCNet_x1_0_ML_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_ML_pretrained.pdparams">Trained Model</a></td>
 <td>77.96</td>
+<td>4.04 / 1.15</td>
+<td>11.76 / 8.32</td>
 <td>29.4 M</td>
 <td>PP-LCNet_ML is an image multi-label classification model based on PP-LCNet, which significantly improves accuracy on multi-label classification tasks by incorporating an ML-Decoder.</td>
 </tr>
 <tr>
 <td>ResNet50_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/ResNet50_ML_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ResNet50_ML_pretrained.pdparams">Trained Model</a></td>
 <td>83.50</td>
+<td>12.12 / 3.27</td>
+<td>51.79 / 44.36</td>
 <td>108.9 M</td>
 <td>ResNet50_ML is an image multi-label classification model based on ResNet50, which significantly improves accuracy on multi-label classification tasks by incorporating an ML-Decoder.</td>
 </tr>
 </table>
-
 <b>Note: The above accuracy metrics are mAP for the multi-label classification task on [COCO2017](https://cocodataset.org/#home).</b>
 
 ## III. Quick Integration
-> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, you can complete multi-label classification module inference with just a few lines of code. You can switch between models in this module freely, and you can also integrate the model inference of the multi-label classification module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png) to your local machine.
 
@@ -78,13 +91,14 @@ After running, the result obtained is:
 
 The meanings of the running results parameters are as follows:
 - `input_path`: Indicates the path of the input multi-class image to be predicted.
+- `page_index`: If the input is a PDF file, it indicates which page of the PDF is currently being processed; otherwise, it is `None`.
 - `class_ids`: Indicates the predicted label IDs of the multi-class image.
 - `scores`: Indicates the confidence scores of the predicted labels of the multi-class image.
 - `label_names`: Indicates the predicted label names of the multi-class image.
 
 The visualization image is as follows:
 
-<img src="https://github.com/user-attachments/assets/4bdd6999-637d-4c9b-aa47-8dd6f587f5a1">
+<img src="https://github.com/user-attachments/assets/4bdd6999-637d-4c9b-aa47-8dd6f587f5a1"/>
 
 **Note:** Due to network issues, the above URL may not be accessible. If you need to access this link, please check the validity of the URL and try again. If the problem persists, it may be related to the link itself or the network connection.
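A hedged sketch tying the result fields above to code, using the PP-LCNet_x1_0_ML model from the table and the demo image linked earlier:

```python
# Hedged sketch: multi-label classification inference producing
# class_ids, scores and label_names as described above.
from paddlex import create_model

model = create_model(model_name="PP-LCNet_x1_0_ML")
output = model.predict("multilabel_classification_005.png", batch_size=1)
for res in output:
    res.print()                           # the fields listed above
    res.save_to_img("./output/")          # image annotated with predicted labels
    res.save_to_json("./output/res.json")
```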
 
@@ -134,15 +148,14 @@ Related methods, parameters, and explanations are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-  <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-  <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-  <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png">Example</a></li>
-  <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-  <li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png">Example</a></li>
+<li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -215,8 +228,8 @@ Related methods, parameters, and explanations are as follows:
 <td><code>save_path</code></td>
 <td><code>str</code></td>
 <td>The path to save the file. If it is a directory, the saved file name will be consistent with the input file name</td>
-<td>None</vd>
-</tr>
+<td>None
+</td></tr>
 </table>
 
 * Additionally, it supports obtaining the visualization image with results and the prediction results through attributes, as follows:
@@ -264,47 +277,46 @@ python main.py -c paddlex/configs/modules/image_multilabel_classification/PP-LCN
 After executing the above command, PaddleX will validate the dataset and summarize its basic information. If the command runs successfully, it will print `Check dataset passed !` in the log. The validation results file is saved in `./output/check_dataset_result.json`, and related outputs are saved in the `./output/check_dataset` directory in the current directory, including visual examples of sample images and sample distribution histograms.
 
 <details><summary>👉 <b>Details of Validation Results (Click to Expand)</b></summary>
-
 <p>The specific content of the validation result file is:</p>
 <pre><code class="language-bash">{
-  &quot;done_flag&quot;: true,
-  &quot;check_pass&quot;: true,
-  &quot;attributes&quot;: {
-    &quot;label_file&quot;: &quot;../../dataset/mlcls_nus_examples/label.txt&quot;,
-    &quot;num_classes&quot;: 33,
-    &quot;train_samples&quot;: 17463,
-    &quot;train_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/0543_4338693.jpg&quot;,
-      &quot;check_dataset/demo_img/0272_347806939.jpg&quot;,
-      &quot;check_dataset/demo_img/0069_2291994812.jpg&quot;,
-      &quot;check_dataset/demo_img/0012_1222850604.jpg&quot;,
-      &quot;check_dataset/demo_img/0238_53773041.jpg&quot;,
-      &quot;check_dataset/demo_img/0373_541261977.jpg&quot;,
-      &quot;check_dataset/demo_img/0567_519506868.jpg&quot;,
-      &quot;check_dataset/demo_img/0023_289621557.jpg&quot;,
-      &quot;check_dataset/demo_img/0581_484524659.jpg&quot;,
-      &quot;check_dataset/demo_img/0325_120753036.jpg&quot;
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "label_file": "../../dataset/mlcls_nus_examples/label.txt",
+    "num_classes": 33,
+    "train_samples": 17463,
+    "train_sample_paths": [
+      "check_dataset/demo_img/0543_4338693.jpg",
+      "check_dataset/demo_img/0272_347806939.jpg",
+      "check_dataset/demo_img/0069_2291994812.jpg",
+      "check_dataset/demo_img/0012_1222850604.jpg",
+      "check_dataset/demo_img/0238_53773041.jpg",
+      "check_dataset/demo_img/0373_541261977.jpg",
+      "check_dataset/demo_img/0567_519506868.jpg",
+      "check_dataset/demo_img/0023_289621557.jpg",
+      "check_dataset/demo_img/0581_484524659.jpg",
+      "check_dataset/demo_img/0325_120753036.jpg"
     ],
-    &quot;val_samples&quot;: 17463,
-    &quot;val_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/0546_130758157.jpg&quot;,
-      &quot;check_dataset/demo_img/0284_2230710138.jpg&quot;,
-      &quot;check_dataset/demo_img/0090_1491261559.jpg&quot;,
-      &quot;check_dataset/demo_img/0013_392798436.jpg&quot;,
-      &quot;check_dataset/demo_img/0246_2248376356.jpg&quot;,
-      &quot;check_dataset/demo_img/0377_1349296474.jpg&quot;,
-      &quot;check_dataset/demo_img/0570_2457645006.jpg&quot;,
-      &quot;check_dataset/demo_img/0027_309333946.jpg&quot;,
-      &quot;check_dataset/demo_img/0584_132639537.jpg&quot;,
-      &quot;check_dataset/demo_img/0329_206031527.jpg&quot;
+    "val_samples": 17463,
+    "val_sample_paths": [
+      "check_dataset/demo_img/0546_130758157.jpg",
+      "check_dataset/demo_img/0284_2230710138.jpg",
+      "check_dataset/demo_img/0090_1491261559.jpg",
+      "check_dataset/demo_img/0013_392798436.jpg",
+      "check_dataset/demo_img/0246_2248376356.jpg",
+      "check_dataset/demo_img/0377_1349296474.jpg",
+      "check_dataset/demo_img/0570_2457645006.jpg",
+      "check_dataset/demo_img/0027_309333946.jpg",
+      "check_dataset/demo_img/0584_132639537.jpg",
+      "check_dataset/demo_img/0329_206031527.jpg"
     ]
   },
-  &quot;analysis&quot;: {
-    &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
+  "analysis": {
+    "histogram": "check_dataset/histogram.png"
   },
-  &quot;dataset_path&quot;: &quot;./dataset/mlcls_nus_examples&quot;,
-  &quot;show_type&quot;: &quot;image&quot;,
-  &quot;dataset_type&quot;: &quot;MLClsDataset&quot;
+  "dataset_path": "mlcls_nus_examples",
+  "show_type": "image",
+  "dataset_type": "MLClsDataset"
 }
 </code></pre>
 <p>In the above validation results, <code>check_pass</code> being True indicates that the dataset format meets the requirements. Explanations for other indicators are as follows:</p>
@@ -316,14 +328,13 @@ After executing the above command, PaddleX will validate the dataset and summari
 <li><code>attributes.val_sample_paths</code>: A list of relative paths to the visual samples in the validation set of this dataset;</li>
 </ul>
 <p>Additionally, the dataset validation analyzes the sample number distribution across all classes in the dataset and generates a distribution histogram (histogram.png):
-<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/ml_classification/01.png"></p></details>
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/ml_classification/01.png"/></p></details>
 
 #### 4.1.3 Dataset Format Conversion/Dataset Splitting (Optional)
 
 After completing data validation, you can convert the dataset format or re-split the training/validation ratio of the dataset by <b>modifying the configuration file</b> or <b>appending hyperparameters</b>.
 
 <details><summary>👉 <b>Dataset Format Conversion/Dataset Splitting Details (Click to Expand)</b></summary>
-
 <p><b>(1) Dataset Format Conversion</b></p>
 <p>The multi-label image classification supports the conversion of <code>COCO</code> format datasets to <code>MLClsDataset</code> format. The parameters for dataset format conversion can be set by modifying the fields under <code>CheckDataset</code> in the configuration file. Examples of some parameters in the configuration file are as follows:</p>
 <ul>
@@ -407,7 +418,6 @@ the following steps are required:
 
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <ul>
 <li>During model training, PaddleX automatically saves the model weight files, with the default being <code>output</code>. If you need to specify a save path, you can set it through the <code>-o Global.output</code> field in the configuration file.</li>
 <li>PaddleX shields you from the concepts of dynamic graph weights and static graph weights. During model training, both dynamic and static graph weights are produced, and static graph weights are selected by default for model inference.</li>
@@ -440,7 +450,6 @@ Similar to model training, the following steps are required:
 Other related parameters can be set by modifying the `Global` and `Evaluate` fields in the `.yaml` configuration file. For details, refer to [PaddleX Common Model Configuration File Parameter Description](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <p>When evaluating the model, you need to specify the model weights file path. Each configuration file has a default weight save path. If you need to change it, simply append the command line parameter to set it, such as <code>-o Evaluate.weight_path=./output/best_model/best_model.pdparams</code>.</p>
 <p>After completing the model evaluation, an <code>evaluate_result.json</code> file will be produced, which records the evaluation results, specifically, whether the evaluation task was completed successfully and the model's evaluation metrics, including MultiLabelMAP;</p></details>
 

+ 61 - 54
docs/module_usage/tutorials/cv_modules/image_multilabel_classification.md

@@ -14,50 +14,62 @@ comments: true
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>mAP(%)</th>
+<th>GPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
+<th>CPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
 <th>模型存储大小 (M)</th>
 <th>介绍</th>
 </tr>
 <tr>
 <td>CLIP_vit_base_patch16_448_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/CLIP_vit_base_patch16_448_ML_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/CLIP_vit_base_patch16_448_ML_pretrained.pdparams">训练模型</a></td>
 <td>89.15</td>
+<td>54.75 / 14.30</td>
+<td>280.23 / 280.23</td>
 <td>325.6 M</td>
 <td>CLIP_ML是一种基于CLIP的图像多标签分类模型,通过结合ML-Decoder,显著提升了模型在图像多标签分类任务上的准确性。</td>
 </tr>
 <tr>
 <td>PP-HGNetV2-B0_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNetV2-B0_ML_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNetV2-B0_ML_pretrained.pdparams">训练模型</a></td>
 <td>80.98</td>
+<td>6.47 / 1.38</td>
+<td>21.56 / 13.69</td>
 <td>39.6 M</td>
 <td rowspan="3">PP-HGNetV2_ML是一种基于PP-HGNetV2的图像多标签分类模型,通过结合ML-Decoder,显著提升了模型在图像多标签分类任务上的准确性。</td>
 </tr>
 <tr>
 <td>PP-HGNetV2-B4_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNetV2-B4_ML_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNetV2-B4_ML_pretrained.pdparams">训练模型</a></td>
 <td>87.96</td>
+<td>9.63 / 2.79</td>
+<td>43.98 / 36.63</td>
 <td>88.5 M</td>
 </tr>
 <tr>
 <td>PP-HGNetV2-B6_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNetV2-B6_ML_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNetV2-B6_ML_pretrained.pdparams">训练模型</a></td>
 <td>91.25</td>
+<td>37.07 / 9.43</td>
+<td>188.58 / 188.58</td>
 <td>286.5 M</td>
 </tr>
 <tr>
 <td>PP-LCNet_x1_0_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LCNet_x1_0_ML_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_ML_pretrained.pdparams">训练模型</a></td>
 <td>77.96</td>
+<td>4.04 / 1.15</td>
+<td>11.76 / 8.32</td>
 <td>29.4 M</td>
 <td>PP-LCNet_ML是一种基于PP-LCNet的图像多标签分类模型,通过结合ML-Decoder,显著提升了模型在图像多标签分类任务上的准确性。</td>
 </tr>
 <tr>
 <td>ResNet50_ML</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/ResNet50_ML_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ResNet50_ML_pretrained.pdparams">训练模型</a></td>
 <td>83.50</td>
+<td>12.12 / 3.27</td>
+<td>51.79 / 44.36</td>
 <td>108.9 M</td>
 <td>ResNet50_ML是一种基于ResNet50的图像多标签分类模型,通过结合ML-Decoder,显著提升了模型在图像多标签分类任务上的准确性。</td>
 </tr>
 </table>
-
-
 <b>注:以上精度指标为[COCO2017](https://cocodataset.org/#home)的多标签分类任务mAP。</b>
 
 ## 三、快速集成
-> ❗ 在快速集成前,请先安装 PaddleX 的 wheel 包,详细请参考 [PaddleX本地安装教程](../../../installation/installation.md)
+ > ❗ 在快速集成前,请先安装 PaddleX 的 wheel 包,详细请参考 [PaddleX本地安装教程](../../../installation/installation.md)
 
 wheel 包的安装后,几行代码即可完成图像多标签分类模块的推理,可以任意切换该模块下的模型,您也可以将图像多标签分类的模块中的模型推理集成到您的项目中。运行以下代码前,请您下载[示例图片](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png)到本地。
 
@@ -85,7 +97,7 @@ for res in output:
 
 可视化图片如下:
 
-<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/modules/image_multilabel_classification/multilabel_classification_005_result.png">
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/modules/image_multilabel_classification/multilabel_classification_005_result.png"/>
 
 相关方法、参数等说明如下:
 
@@ -156,11 +168,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png">示例</a></li>
-  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
+<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png">示例</a></li>
+<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>,<code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>无</td>
@@ -199,8 +211,8 @@ for res in output:
 </tr>
 </thead>
 <tr>
-<td rowspan = "3"><code>print()</code></td>
-<td rowspan = "3">打印结果到终端</td>
+<td rowspan="3"><code>print()</code></td>
+<td rowspan="3">打印结果到终端</td>
 <td><code>format_json</code></td>
 <td><code>bool</code></td>
 <td>是否对输出内容进行使用 <code>JSON</code> 缩进格式化</td>
@@ -219,8 +231,8 @@ for res in output:
 <td><code>False</code></td>
 </tr>
 <tr>
-<td rowspan = "3"><code>save_to_json()</code></td>
-<td rowspan = "3">将结果保存为json格式的文件</td>
+<td rowspan="3"><code>save_to_json()</code></td>
+<td rowspan="3">将结果保存为json格式的文件</td>
 <td><code>save_path</code></td>
 <td><code>str</code></td>
 <td>保存的文件路径,当为目录时,保存文件命名与输入文件类型命名一致</td>
@@ -258,14 +270,13 @@ for res in output:
 </tr>
 </thead>
 <tr>
-<td rowspan = "1"><code>json</code></td>
-<td rowspan = "1">获取预测的<code>json</code>格式的结果</td>
+<td rowspan="1"><code>json</code></td>
+<td rowspan="1">获取预测的<code>json</code>格式的结果</td>
 </tr>
 <tr>
-<td rowspan = "1"><code>img</code></td>
-<td rowspan = "1">获取格式为<code>dict</code>的可视化图像</td>
+<td rowspan="1"><code>img</code></td>
+<td rowspan="1">获取格式为<code>dict</code>的可视化图像</td>
 </tr>
-
 </table>
 
 
@@ -295,43 +306,42 @@ python main.py -c paddlex/configs/modules/image_multilabel_classification/PP-LCN
 执行上述命令后,PaddleX 会对数据集进行校验,并统计数据集的基本信息,命令运行成功后会在 log 中打印出`Check dataset passed !`信息。校验结果文件保存在`./output/check_dataset_result.json`,同时相关产出会保存在当前目录的`./output/check_dataset`目录下,产出目录中包括可视化的示例样本图片和样本分布直方图。
 
 <details><summary>👉 <b>校验结果详情(点击展开)</b></summary>
-
 <p>校验结果文件具体内容为:</p>
 <pre><code class="language-bash">{
-  &quot;done_flag&quot;: true,
-  &quot;check_pass&quot;: true,
-  &quot;attributes&quot;: {
-    &quot;label_file&quot;: &quot;../../dataset/mlcls_nus_examples/label.txt&quot;,
-    &quot;num_classes&quot;: 33,
-    &quot;train_samples&quot;: 17463,
-    &quot;train_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/0543_4338693.jpg&quot;,
-      &quot;check_dataset/demo_img/0272_347806939.jpg&quot;,
-      &quot;check_dataset/demo_img/0069_2291994812.jpg&quot;,
-      &quot;check_dataset/demo_img/0012_1222850604.jpg&quot;,
-      &quot;check_dataset/demo_img/0238_53773041.jpg&quot;,
-      &quot;check_dataset/demo_img/0373_541261977.jpg&quot;,
-      &quot;check_dataset/demo_img/0567_519506868.jpg&quot;,
-      &quot;check_dataset/demo_img/0023_289621557.jpg&quot;,
-      &quot;check_dataset/demo_img/0581_484524659.jpg&quot;,
-      &quot;check_dataset/demo_img/0325_120753036.jpg&quot;
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "label_file": "../../dataset/mlcls_nus_examples/label.txt",
+    "num_classes": 33,
+    "train_samples": 17463,
+    "train_sample_paths": [
+      "check_dataset/demo_img/0543_4338693.jpg",
+      "check_dataset/demo_img/0272_347806939.jpg",
+      "check_dataset/demo_img/0069_2291994812.jpg",
+      "check_dataset/demo_img/0012_1222850604.jpg",
+      "check_dataset/demo_img/0238_53773041.jpg",
+      "check_dataset/demo_img/0373_541261977.jpg",
+      "check_dataset/demo_img/0567_519506868.jpg",
+      "check_dataset/demo_img/0023_289621557.jpg",
+      "check_dataset/demo_img/0581_484524659.jpg",
+      "check_dataset/demo_img/0325_120753036.jpg"
     ],
-    &quot;val_samples&quot;: 17463,
-    &quot;val_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/0546_130758157.jpg&quot;,
-      &quot;check_dataset/demo_img/0284_2230710138.jpg&quot;,
-      &quot;check_dataset/demo_img/0090_1491261559.jpg&quot;,
-      &quot;check_dataset/demo_img/0013_392798436.jpg&quot;,
-      &quot;check_dataset/demo_img/0246_2248376356.jpg&quot;,
-      &quot;check_dataset/demo_img/0377_1349296474.jpg&quot;,
-      &quot;check_dataset/demo_img/0570_2457645006.jpg&quot;,
-      &quot;check_dataset/demo_img/0027_309333946.jpg&quot;,
-      &quot;check_dataset/demo_img/0584_132639537.jpg&quot;,
-      &quot;check_dataset/demo_img/0329_206031527.jpg&quot;
+    "val_samples": 17463,
+    "val_sample_paths": [
+      "check_dataset/demo_img/0546_130758157.jpg",
+      "check_dataset/demo_img/0284_2230710138.jpg",
+      "check_dataset/demo_img/0090_1491261559.jpg",
+      "check_dataset/demo_img/0013_392798436.jpg",
+      "check_dataset/demo_img/0246_2248376356.jpg",
+      "check_dataset/demo_img/0377_1349296474.jpg",
+      "check_dataset/demo_img/0570_2457645006.jpg",
+      "check_dataset/demo_img/0027_309333946.jpg",
+      "check_dataset/demo_img/0584_132639537.jpg",
+      "check_dataset/demo_img/0329_206031527.jpg"
     ]
   },
-  &quot;analysis&quot;: {
-    &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
+  "analysis": {
+    "histogram": "check_dataset/histogram.png"
   },
   &quot;dataset_path&quot;: &quot;mlcls_nus_examples&quot;,
   &quot;show_type&quot;: &quot;image&quot;,
@@ -347,13 +357,12 @@ python main.py -c paddlex/configs/modules/image_multilabel_classification/PP-LCN
 <li><code>attributes.val_sample_paths</code>:该数据集验证集样本可视化图片相对路径列表;</li>
 </ul>
 <p>另外,数据集校验还对数据集中所有类别的样本数量分布情况进行了分析,并绘制了分布直方图(histogram.png):</p>
-<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/ml_classification/01.png"></p></details>
+<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/ml_classification/01.png"/></p></details>
 
 #### 4.1.3 数据集格式转换/数据集划分(可选)
 在您完成数据校验之后,可以通过<b>修改配置文件</b>或是<b>追加超参数</b>的方式对数据集的格式进行转换,也可以对数据集的训练/验证比例进行重新划分。
 
 <details><summary>👉 <b>格式转换/数据集划分详情(点击展开)</b></summary>
-
 <p><b>(1)数据集格式转换</b></p>
 <p>图像多标签分类支持 <code>COCO</code>格式的数据集转换为 <code>MLClsDataset</code>格式,数据集格式转换的参数可以通过修改配置文件中 <code>CheckDataset</code> 下的字段进行设置,配置文件中部分参数的示例说明如下:</p>
 <ul>
@@ -438,7 +447,6 @@ python main.py -c paddlex/configs/modules/image_multilabel_classification/PP-LCN
 其他相关参数均可通过修改`.yaml`配置文件中的`Global`和`Train`下的字段来进行设置,也可以通过在命令行中追加参数来进行调整。如指定前 2 卡 gpu 训练:`-o Global.device=gpu:0,1`;设置训练轮次数为 10:`-o Train.epochs_iters=10`。更多可修改的参数及其详细解释,可以查阅模型对应任务模块的配置文件说明[PaddleX通用模型配置文件参数说明](../../instructions/config_parameters_common.md)。
 
 <details><summary>👉 <b>更多说明(点击展开)</b></summary>
-
 <ul>
 <li>模型训练过程中,PaddleX 会自动保存模型权重文件,默认为<code>output</code>,如需指定保存路径,可通过配置文件中 <code>-o Global.output</code> 字段进行设置。</li>
 <li>PaddleX 对您屏蔽了动态图权重和静态图权重的概念。在模型训练的过程中,会同时产出动态图和静态图的权重,在模型推理时,默认选择静态图权重推理。</li>
@@ -469,7 +477,6 @@ python main.py -c paddlex/configs/modules/image_multilabel_classification/PP-LCN
 其他相关参数均可通过修改`.yaml`配置文件中的`Global`和`Evaluate`下的字段来进行设置,详细请参考[PaddleX通用模型配置文件参数说明](../../instructions/config_parameters_common.md)。
 
 <details><summary>👉 <b>更多说明(点击展开)</b></summary>
-
 <p>在模型评估时,需要指定模型权重文件路径,每个配置文件中都内置了默认的权重保存路径,如需要改变,只需要通过追加命令行参数的形式进行设置即可,如<code>-o Evaluate.weight_path=./output/best_model/best_model.pdparams</code>。</p>
 <p>在完成模型评估后,会产出<code>evaluate_result.json</code>,其记录了评估的结果,具体来说,记录了评估任务是否正常完成,以及模型的评估指标,包括 MultiLabelMAP;</p></details>
 

+ 67 - 73
docs/module_usage/tutorials/cv_modules/instance_segmentation.en.md

@@ -13,76 +13,75 @@ The instance segmentation module is a crucial component in computer vision syste
 <tr>
 <th>Model</th><th>Model Download Link</th>
 <th>Mask AP</th>
-<th>GPU Inference Time (ms)</th>
-<th>CPU Inference Time (ms)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Size (M)</th>
 <th>Description</th>
 </tr>
 <tr>
 <td>Mask-RT-DETR-H</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Mask-RT-DETR-H_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Mask-RT-DETR-H_pretrained.pdparams">Trained Model</a></td>
 <td>50.6</td>
-<td>132.693</td>
-<td>4896.17</td>
+<td>172.36 / 172.36</td>
+<td>1615.75 / 1615.75</td>
 <td>449.9 M</td>
 <td rowspan="5">Mask-RT-DETR is an instance segmentation model based on RT-DETR. By adopting the high-performance PP-HGNetV2 as the backbone network and constructing a MaskHybridEncoder encoder, along with introducing IOU-aware Query Selection technology, it achieves state-of-the-art (SOTA) instance segmentation accuracy with the same inference time.</td>
 </tr>
 <tr>
 <td>Mask-RT-DETR-L</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Mask-RT-DETR-L_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Mask-RT-DETR-L_pretrained.pdparams">Trained Model</a></td>
 <td>45.7</td>
-<td>46.5059</td>
-<td>2575.92</td>
+<td>88.18 / 88.18</td>
+<td>1090.84 / 1090.84</td>
 <td>113.6 M</td>
 </tr>
 </table>
 
-> ❗ The above list features the <b>2 core models</b> that the image classification module primarily supports. In total, this module supports <b>15 models</b>. The complete list of models is as follows:
+&gt; ❗ The above list features the <b>2 core models</b> that the instance segmentation module primarily supports. In total, this module supports <b>15 models</b>. The complete list of models is as follows:
 
 <details><summary> 👉Model List Details</summary>
-
 <table>
 <tr>
 <th>Model</th><th>Model Download Link</th>
 <th>Mask AP</th>
-<th>GPU Inference Time (ms)</th>
-<th>CPU Inference Time (ms)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Size (M)</th>
 <th>Description</th>
 </tr>
 <tr>
 <td>Cascade-MaskRCNN-ResNet50-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Cascade-MaskRCNN-ResNet50-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Cascade-MaskRCNN-ResNet50-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>36.3</td>
-<td>-</td>
-<td>-</td>
+<td>141.69 / 141.69</td>
+<td>nan / nan</td>
 <td>254.8 M</td>
 <td rowspan="2">Cascade-MaskRCNN is an improved Mask RCNN instance segmentation model that utilizes multiple detectors in a cascade, optimizing segmentation results by leveraging different IOU thresholds to address the mismatch between detection and inference stages, thereby enhancing instance segmentation accuracy.</td>
 </tr>
 <tr>
 <td>Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>39.1</td>
-<td>-</td>
-<td>-</td>
+<td>147.62 / 147.62</td>
+<td>nan / nan</td>
 <td>254.7 M</td>
 </tr>
 <tr>
 <td>Mask-RT-DETR-H</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Mask-RT-DETR-H_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Mask-RT-DETR-H_pretrained.pdparams">Trained Model</a></td>
 <td>50.6</td>
-<td>132.693</td>
-<td>4896.17</td>
+<td>172.36 / 172.36</td>
+<td>1615.75 / 1615.75</td>
 <td>449.9 M</td>
 <td rowspan="5">Mask-RT-DETR is an instance segmentation model based on RT-DETR. By adopting the high-performance PP-HGNetV2 as the backbone network and constructing a MaskHybridEncoder encoder, along with introducing IOU-aware Query Selection technology, it achieves state-of-the-art (SOTA) instance segmentation accuracy with the same inference time.</td>
 </tr>
 <tr>
 <td>Mask-RT-DETR-L</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Mask-RT-DETR-L_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Mask-RT-DETR-L_pretrained.pdparams">Trained Model</a></td>
 <td>45.7</td>
-<td>46.5059</td>
-<td>2575.92</td>
+<td>88.18 / 88.18</td>
+<td>1090.84 / 1090.84</td>
 <td>113.6 M</td>
 </tr>
 <tr>
 <td>Mask-RT-DETR-M</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Mask-RT-DETR-M_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Mask-RT-DETR-M_pretrained.pdparams">Trained Model</a></td>
 <td>42.7</td>
-<td>36.8329</td>
-<td>-</td>
+<td>78.69 / 78.69</td>
+<td>nan / nan</td>
 <td>66.6 M</td>
 </tr>
 <tr>
@@ -95,51 +94,51 @@ The instance segmentation module is a crucial component in computer vision syste
 <tr>
 <td>Mask-RT-DETR-X</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Mask-RT-DETR-X_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Mask-RT-DETR-X_pretrained.pdparams">Trained Model</a></td>
 <td>47.5</td>
-<td>75.755</td>
-<td>3358.04</td>
+<td>114.16 / 114.16</td>
+<td>1240.92 / 1240.92</td>
 <td>237.5 M</td>
 </tr>
 <tr>
 <td>MaskRCNN-ResNet50-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MaskRCNN-ResNet50-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MaskRCNN-ResNet50-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>35.6</td>
-<td>-</td>
-<td>-</td>
+<td>118.30 / 118.30</td>
+<td>nan / nan</td>
 <td>157.5 M</td>
 <td rowspan="6">Mask R-CNN is a full-task deep learning model from Facebook AI Research (FAIR) that can perform object classification and localization in a single model, combined with image-level masks to complete segmentation tasks.</td>
 </tr>
 <tr>
 <td>MaskRCNN-ResNet50-vd-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MaskRCNN-ResNet50-vd-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MaskRCNN-ResNet50-vd-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>36.4</td>
-<td>-</td>
-<td>-</td>
+<td>118.34 / 118.34</td>
+<td>nan / nan</td>
 <td>157.5 M</td>
 </tr>
 <tr>
 <td>MaskRCNN-ResNet50</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MaskRCNN-ResNet50_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MaskRCNN-ResNet50_pretrained.pdparams">Trained Model</a></td>
 <td>32.8</td>
-<td>-</td>
-<td>-</td>
+<td>228.83 / 228.83</td>
+<td>nan / nan</td>
 <td>128.7 M</td>
 </tr>
 <tr>
 <td>MaskRCNN-ResNet101-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MaskRCNN-ResNet101-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MaskRCNN-ResNet101-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>36.6</td>
-<td>-</td>
-<td>-</td>
+<td>148.14 / 148.14</td>
+<td>nan / nan</td>
 <td>225.4 M</td>
 </tr>
 <tr>
 <td>MaskRCNN-ResNet101-vd-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MaskRCNN-ResNet101-vd-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MaskRCNN-ResNet101-vd-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>38.1</td>
-<td>-</td>
-<td>-</td>
+<td>151.12 / 151.12</td>
+<td>nan / nan</td>
 <td>225.1 M</td>
 </tr>
 <tr>
 <td>MaskRCNN-ResNeXt101-vd-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MaskRCNN-ResNeXt101-vd-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MaskRCNN-ResNeXt101-vd-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>39.5</td>
-<td>-</td>
-<td>-</td>
+<td>237.55 / 237.55</td>
+<td>nan / nan</td>
 <td>370.0 M</td>
 <td></td>
 </tr>
@@ -160,11 +159,10 @@ The instance segmentation module is a crucial component in computer vision syste
 <td> SOLOv2 is a real-time instance segmentation algorithm that segments objects by location. This model is an improved version of SOLO, achieving a good balance between accuracy and speed through the introduction of mask learning and mask NMS.</td>
 </tr>
 </table>
-
 <p><b>Note: The above accuracy metrics are based on the Mask AP of the <a href="https://cocodataset.org/#home">COCO2017</a> validation set. All GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speeds are based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b></p></details>
 
 ## <span id="lable">III. Quick Integration</span>
-> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Tutorial](../../../installation/installation.en.md)
+&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Tutorial](../../../installation/installation.en.md)
 
 After installing the wheel package, a few lines of code can complete the inference of the instance segmentation module. You can switch models under this module freely, and you can also integrate the model inference of the instance segmentation module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_instance_segmentation_004.png) to your local machine.
 
@@ -196,7 +194,7 @@ The meanings of the running results parameters are as follows:
 
 The visualization image is as follows:
 
-<img src="https://raw.githubusercontent.com/BluebirdStory/PaddleX_doc_images/main/images/modules/instance_segmentation/general_instance_segmentation_004_res.png">
+<img src="https://raw.githubusercontent.com/BluebirdStory/PaddleX_doc_images/main/images/modules/instance_segmentation/general_instance_segmentation_004_res.png"/>
 
 **Note:** Due to network issues, the above URL may not be accessible. If you need to access this link, please check the validity of the URL and try again. If the problem persists, it may be related to the link itself or the network connection.
 
@@ -237,7 +235,7 @@ Related methods, parameters, and explanations are as follows:
 </table>
 
 * The `model_name` must be specified. After specifying `model_name`, the default model parameters built into PaddleX are used. If `model_dir` is specified, the user-defined model is used.
-* `threshold` is the threshold for filtering low-confidence objects. The default is `None`, which means using the settings from the previous layer. The priority of parameter settings from highest to lowest is: `predict parameter > create_model initialization > yaml configuration file`.
+* `threshold` is the threshold for filtering low-confidence objects. The default is `None`, which means using the settings from the previous layer. The priority of parameter settings from highest to lowest is: `predict parameter &gt; create_model initialization &gt; yaml configuration file`.
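A hedged sketch of how this priority chain plays out in practice, using the Mask-RT-DETR-L model from the table above (image path and threshold values are placeholders):

```python
# Hedged sketch: a threshold passed to predict() overrides the one set at model creation.
from paddlex import create_model

model = create_model(model_name="Mask-RT-DETR-L", threshold=0.3)
# This call inherits threshold=0.3 from create_model:
for res in model.predict("general_instance_segmentation_004.png", batch_size=1):
    res.print()
# This call overrides it with 0.6 for stricter filtering:
for res in model.predict("general_instance_segmentation_004.png", batch_size=1, threshold=0.6):
    res.save_to_img("./output/")
```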
 
 * The `predict()` method of the general instance segmentation model is called for inference prediction. The `predict()` method has parameters `input`, `batch_size`, and `threshold`, which are explained as follows:
 
@@ -257,11 +255,11 @@ Related methods, parameters, and explanations are as follows:
 <td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-  <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-  <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-  <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_instance_segmentation_004.png">Example</a></li>
-  <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
+<li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_instance_segmentation_004.png">Example</a></li>
+<li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -279,8 +277,8 @@ Related methods, parameters, and explanations are as follows:
 <td><code>float</code>/<code>None</code></td>
 <td>
 <ul>
-  <li><b>None</b>, indicating the use of settings from the previous layer. The priority of parameter settings from highest to lowest is: <code>predict parameter > create_model initialization > yaml configuration file</code></li>
-  <li><b>float</b>, such as 0.5, indicating the use of <code>0.5</code> as the threshold for filtering low-confidence objects during inference</li>
+<li><b>None</b>, indicating the use of settings from the previous layer. The priority of parameter settings from highest to lowest is: <code>predict parameter &gt; create_model initialization &gt; yaml configuration file</code></li>
+<li><b>float</b>, such as 0.5, indicating the use of <code>0.5</code> as the threshold for filtering low-confidence objects during inference</li>
 </ul>
 </td>
 <td>None</td>
@@ -318,8 +316,8 @@ Related methods, parameters, and explanations are as follows:
 <td><code>ensure_ascii</code></td>
 <td><code>bool</code></td>
 <td>Control whether to escape non-<code>ASCII</code> characters to <code>Unicode</code>. If set to <code>True</code>, all non-<code>ASCII</code> characters will be escaped; <code>False</code> retains the original characters, only effective when <code>format_json</code> is <code>True</code></td>
-<td><code>False</code></vd>
-</tr>
+<td><code>False</code>
+</td></tr>
 <tr>
 <td rowspan="3"><code>save_to_json()</code></td>
 <td rowspan="3">Save the results as a JSON file</td>
@@ -338,16 +336,16 @@ Related methods, parameters, and explanations are as follows:
 <td><code>ensure_ascii</code></td>
 <td><code>bool</code></td>
 <td>Control whether to escape non-<code>ASCII</code> characters to <code>Unicode</code>. If set to <code>True</code>, all non-<code>ASCII</code> characters will be escaped; <code>False</code> retains the original characters, only effective when <code>format_json</code> is <code>True</code></td>
-<td><code>False</code></vd>
-</tr>
+<td><code>False</code>
+</td></tr>
 <tr>
 <td><code>save_to_img()</code></td>
 <td>Save the results as an image file</td>
 <td><code>save_path</code></td>
 <td><code>str</code></td>
 <td>The path to save the file. If it is a directory, the saved file name will be consistent with the input file name</td>
-<td>None</vd>
-</tr>
+<td>None
+</td></tr>
 </table>
 
 * Additionally, it supports obtaining the visualization image with results and the prediction results through attributes, as follows:
@@ -397,30 +395,29 @@ python main.py -c paddlex/configs/modules/instance_segmentation/Mask-RT-DETR-L.y
 
 After executing the above command, PaddleX will validate the dataset and summarize its basic information. If the command runs successfully, it will print `Check dataset passed !` in the log. The validation results file is saved in `./output/check_dataset_result.json`, and related outputs are saved in the `./output/check_dataset` directory in the current directory, including visual examples of sample images and sample distribution histograms.
 <details><summary>👉 <b>Details of Validation Results (Click to Expand)</b></summary>
-
 <p>The specific content of the validation result file is:</p>
 <pre><code class="language-bash">{
-  &quot;done_flag&quot;: true,
-  &quot;check_pass&quot;: true,
-  &quot;attributes&quot;: {
-    &quot;num_classes&quot;: 2,
-    &quot;train_samples&quot;: 79,
-    &quot;train_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/pexels-photo-634007.jpeg&quot;,
-      &quot;check_dataset/demo_img/pexels-photo-59576.png&quot;
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "num_classes": 2,
+    "train_samples": 79,
+    "train_sample_paths": [
+      "check_dataset/demo_img/pexels-photo-634007.jpeg",
+      "check_dataset/demo_img/pexels-photo-59576.png"
     ],
-    &quot;val_samples&quot;: 19,
-    &quot;val_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/peasant-farmer-farmer-romania-botiza-47862.jpeg&quot;,
-      &quot;check_dataset/demo_img/pexels-photo-715546.png&quot;
+    "val_samples": 19,
+    "val_sample_paths": [
+      "check_dataset/demo_img/peasant-farmer-farmer-romania-botiza-47862.jpeg",
+      "check_dataset/demo_img/pexels-photo-715546.png"
     ]
   },
-  &quot;analysis&quot;: {
-    &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
+  "analysis": {
+    "histogram": "check_dataset/histogram.png"
   },
-  &quot;dataset_path&quot;: &quot;instance_seg_coco_examples&quot;,
-  &quot;show_type&quot;: &quot;image&quot;,
-  &quot;dataset_type&quot;: &quot;COCOInstSegDataset&quot;
+  "dataset_path": "instance_seg_coco_examples",
+  "show_type": "image",
+  "dataset_type": "COCOInstSegDataset"
 }
 </code></pre>
 <p>In the above verification results, <code>check_pass</code> being <code>True</code> indicates that the dataset format meets the requirements. Explanations for other indicators are as follows:</p>
@@ -432,13 +429,12 @@ After executing the above command, PaddleX will validate the dataset and summari
 <li><code>attributes.val_sample_paths</code>: A list of relative paths to the visualized validation samples in this dataset;
 Additionally, the dataset verification also analyzes the distribution of sample numbers across all categories in the dataset and generates a distribution histogram (<code>histogram.png</code>):</li>
 </ul>
-<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/instanceseg/01.png"></p></details>
+<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/instanceseg/01.png"/></p></details>
 
 #### 4.1.3 Dataset Format Conversion/Dataset Splitting (Optional)
 After completing data verification, you can convert the dataset format or re-split the training/validation ratio by <b>modifying the configuration file</b> or <b>appending hyperparameters</b>.
 
 <details><summary>👉 <b>Details of Format Conversion/Dataset Splitting (Click to Expand)</b></summary>
-
 <p><b>(1) Dataset Format Conversion</b></p>
 <p>The instance segmentation task supports converting <code>LabelMe</code> format to <code>COCO</code> format. The parameters for dataset format conversion can be set by modifying the fields under <code>CheckDataset</code> in the configuration file. Below are some example explanations for some of the parameters in the configuration file:</p>
 <ul>
@@ -522,7 +518,6 @@ The following steps are required:
 Other related parameters can be set by modifying the fields under `Global` and `Train` in the `.yaml` configuration file, or adjusted by appending parameters in the command line. For example, to specify the first 2 GPUs for training: `-o Global.device=gpu:0,1`; to set the number of training epochs to 10: `-o Train.epochs_iters=10`. For more modifiable parameters and their detailed explanations, refer to the [PaddleX Common Configuration File Parameters Instructions](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <ul>
 <li>During model training, PaddleX automatically saves the model weight files, with the default being <code>output</code>. If you need to specify a save path, you can set it through the <code>-o Global.output</code> field in the configuration file.</li>
 <li>PaddleX shields you from the concepts of dynamic graph weights and static graph weights. During model training, both dynamic and static graph weights are produced, and static graph weights are selected by default for model inference.</li>
@@ -553,7 +548,6 @@ Similar to model training, the following steps are required:
 * Specify the path to the validation dataset: `-o Global.dataset_dir`. Other related parameters can be set by modifying the `Global` and `Evaluate` fields in the `.yaml` configuration file. For details, refer to [PaddleX Common Model Configuration File Parameter Description](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <p>When evaluating the model, you need to specify the model weights file path. Each configuration file has a default weight save path built-in. If you need to change it, simply set it by appending a command line parameter, such as <code>-o Evaluate.weight_path=./output/best_model/best_model.pdparams</code>.</p>
 <p>After completing the model evaluation, an <code>evaluate_result.json</code> file will be generated, which records the evaluation results, specifically whether the evaluation task was completed successfully and the model's evaluation metrics, including AP.</p></details>
 

+ 3 - 3
docs/module_usage/tutorials/cv_modules/mainbody_detection.en.md

@@ -33,7 +33,7 @@ Mainbody detection is a fundamental task in object detection, aiming to identify
 <b>Note: The evaluation set for the above accuracy metrics is the PaddleClas mainbody detection dataset, and the metric is mAP(0.5:0.95). GPU inference time is based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b>
 
 ## III. Quick Integration  <a id="quick"> </a>
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, you can perform mainbody detection inference with just a few lines of code. You can easily switch between models under this module, and integrate the mainbody detection model inference into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_object_detection_002.png) to your local machine.
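For readers of this hunk, a minimal hedged sketch of the kind of quick-start snippet this paragraph refers to (the actual snippet lives outside the diff). The model name `PP-ShiTuV2_det` is an assumption here; substitute whichever name appears in this module's model list.

```python
from paddlex import create_model

# Assumption: "PP-ShiTuV2_det" is the mainbody detection model name listed in this module;
# replace it with the name from the model table above if it differs.
model = create_model(model_name="PP-ShiTuV2_det")
output = model.predict("general_object_detection_002.png", batch_size=1)
for res in output:
    res.print()                             # print detection results to the terminal
    res.save_to_img("./output/")            # save the visualized detections
    res.save_to_json("./output/res.json")   # save the raw results as JSON
```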
 
@@ -106,7 +106,7 @@ Related methods, parameters, and explanations are as follows:
 </table>
 
 * The `model_name` must be specified. After specifying `model_name`, the default model parameters built into PaddleX are used. If `model_dir` is specified, the user-defined model is used.
-* `threshold` is the threshold for filtering low-confidence objects. The default is `None`, which means using the settings from the previous layer. The priority of parameter settings from highest to lowest is: `predict parameter &gt; create_model initialization &gt; yaml configuration file`. Currently, two types of threshold settings are supported:
+* `threshold` is the threshold for filtering low-confidence objects. The default is `None`, which means using the settings from the previous layer. The priority of parameter settings from highest to lowest is: `predict parameter > create_model initialization > yaml configuration file`. Currently, two types of threshold settings are supported:
   * `float`, using the same threshold for all classes.
   * `dict`, where the key is the class ID and the value is the threshold, allowing different thresholds for different classes. Since main body detection is a single-class detection, this setting is not required.
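To make the two threshold forms above concrete, here is a hedged sketch (not part of the commit) of passing `threshold` either as a float or as a class-ID dict; the model name is assumed as in the sketch above, and the dict form is shown only to illustrate the syntax, since mainbody detection has a single class.

```python
from paddlex import create_model

# Float form: one confidence threshold shared by all classes.
model = create_model(model_name="PP-ShiTuV2_det", threshold=0.5)

# Dict form: per-class thresholds keyed by class ID (not needed for this
# single-class detector, shown only for completeness).
model = create_model(model_name="PP-ShiTuV2_det", threshold={0: 0.45})

# A threshold passed to predict() takes priority over the value set at creation.
for res in model.predict("general_object_detection_002.png", threshold=0.6):
    res.print()
```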
 
@@ -150,7 +150,7 @@ Related methods, parameters, and explanations are as follows:
 <td><code>float</code>/<code>dict</code>/<code>None</code></td>
 <td>
 <ul>
-<li><b>None</b>, indicating the use of settings from the previous layer. The priority of parameter settings from highest to lowest is: <code>predict parameter &gt; create_model initialization &gt; yaml configuration file</code></li>
+<li><b>None</b>, indicating the use of settings from the previous layer. The priority of parameter settings from highest to lowest is: <code>predict parameter > create_model initialization > yaml configuration file</code></li>
 <li><b>float</b>, such as 0.5, indicating the use of <code>0.5</code> as the threshold for filtering low-confidence objects during inference</li>
 <li><b>dict</b>, such as <code>{0: 0.5, 1: 0.35}</code>, indicating the use of 0.5 as the threshold for class 0 and 0.35 for class 1 during inference. Since main body detection is a single-class detection, this setting is not required.</li>
 </ul>

+ 19 - 19
docs/module_usage/tutorials/cv_modules/object_detection.en.md

@@ -65,7 +65,7 @@ The object detection module is a crucial component in computer vision systems, r
 </tr>
 </table>
 
-&gt; ❗ The above list features the <b>6 core models</b> that the image classification module primarily supports. In total, this module supports <b>37 models</b>. The complete list of models is as follows:
+> ❗ The above list features the <b>6 core models</b> that the image classification module primarily supports. In total, this module supports <b>37 models</b>. The complete list of models is as follows:
 
 <details><summary> 👉Details of Model List</summary>
 <table>
@@ -81,7 +81,7 @@ The object detection module is a crucial component in computer vision systems, r
 <td>Cascade-FasterRCNN-ResNet50-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Cascade-FasterRCNN-ResNet50-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Cascade-FasterRCNN-ResNet50-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>41.1</td>
 <td>135.92 / 135.92</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>245.4 M</td>
 <td rowspan="2">Cascade-FasterRCNN is an improved version of the Faster R-CNN object detection model. By coupling multiple detectors and optimizing detection results using different IoU thresholds, it addresses the mismatch problem between training and prediction stages, enhancing the accuracy of object detection.</td>
 </tr>
@@ -89,22 +89,22 @@ The object detection module is a crucial component in computer vision systems, r
 <td>Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>45.0</td>
 <td>138.23 / 138.23</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>246.2 M</td>
 </tr>
 <tr>
 <td>CenterNet-DLA-34</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/CenterNet-DLA-34_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/CenterNet-DLA-34_pretrained.pdparams">Trained Model</a></td>
 <td>37.6</td>
-<td>nan / nan</td>
-<td>nan / nan</td>
+<td>-</td>
+<td>-</td>
 <td>75.4 M</td>
 <td rowspan="2">CenterNet is an anchor-free object detection model that treats the keypoints of the object to be detected as a single point—the center point of its bounding box, and performs regression through these keypoints.</td>
 </tr>
 <tr>
 <td>CenterNet-ResNet50</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/CenterNet-ResNet50_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/CenterNet-ResNet50_pretrained.pdparams">Trained Model</a></td>
 <td>38.9</td>
-<td>nan / nan</td>
-<td>nan / nan</td>
+<td>-</td>
+<td>-</td>
 <td>319.7 M</td>
 </tr>
 <tr>
@@ -119,7 +119,7 @@ The object detection module is a crucial component in computer vision systems, r
 <td>FasterRCNN-ResNet34-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet34-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet34-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>37.8</td>
 <td>83.33 / 31.64</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>137.5 M</td>
 <td rowspan="9">Faster R-CNN is a typical two-stage object detection model that first generates region proposals and then performs classification and regression on these proposals. Compared to its predecessors R-CNN and Fast R-CNN, Faster R-CNN's main improvement lies in the region proposal aspect, using a Region Proposal Network (RPN) to provide region proposals instead of traditional selective search. RPN is a Convolutional Neural Network (CNN) that shares convolutional features with the detection network, reducing the computational overhead of region proposals.</td>
 </tr>
@@ -127,56 +127,56 @@ The object detection module is a crucial component in computer vision systems, r
 <td>FasterRCNN-ResNet50-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet50-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet50-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>38.4</td>
 <td>107.08 / 35.40</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>148.1 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet50-vd-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet50-vd-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet50-vd-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>39.5</td>
 <td>109.36 / 36.00</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>148.1 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet50-vd-SSLDv2-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet50-vd-SSLDv2-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet50-vd-SSLDv2-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>41.4</td>
 <td>109.06 / 36.19</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>148.1 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet50</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet50_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet50_pretrained.pdparams">Trained Model</a></td>
 <td>36.7</td>
 <td>496.33 / 109.12</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>120.2 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet101-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet101-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet101-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>41.4</td>
 <td>148.21 / 42.21</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>216.3 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet101</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet101_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet101_pretrained.pdparams">Trained Model</a></td>
 <td>39.0</td>
 <td>538.58 / 120.88</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>188.1 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNeXt101-vd-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNeXt101-vd-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNeXt101-vd-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>43.4</td>
 <td>258.01 / 58.25</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>360.6 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-Swin-Tiny-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-Swin-Tiny-FPN_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-Swin-Tiny-FPN_pretrained.pdparams">Trained Model</a></td>
 <td>42.6</td>
-<td>nan / nan</td>
-<td>nan / nan</td>
+<td>-</td>
+<td>-</td>
 <td>159.8 M</td>
 </tr>
 <tr>
@@ -351,7 +351,7 @@ The object detection module is a crucial component in computer vision systems, r
 
 ## III. Quick Integration
 
-&gt; ❗ Before proceeding with quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
+> ❗ Before proceeding with quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
 
 After installing the wheel package, you can perform object detection inference with just a few lines of code. You can easily switch between models within the module and integrate the object detection inference into your projects. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_object_detection_002.png) to your local machine.
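As context for this hunk, a hedged sketch of the few-lines inference flow the paragraph describes (the document's own quick-start code sits outside the diff). `PicoDet-S` is used here only because it appears in the model list above; any other listed model name works the same way.

```python
from paddlex import create_model

# "PicoDet-S" is one of the models listed in this module; swap in any other
# name from the model table above as needed.
model = create_model(model_name="PicoDet-S")
output = model.predict("general_object_detection_002.png", batch_size=1)
for res in output:
    res.print()
    res.save_to_img("./output/")
    res.save_to_json("./output/res.json")
```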
 
@@ -630,7 +630,7 @@ After executing the above command, PaddleX will validate the dataset and summari
   "analysis": {
     "histogram": "check_dataset/histogram.png"
   },
-  "dataset_path": "./dataset/det_coco_examples",
+  "dataset_path": "det_coco_examples",
   "show_type": "image",
   "dataset_type": "COCODetDataset"
 }

+ 16 - 16
docs/module_usage/tutorials/cv_modules/object_detection.md

@@ -81,7 +81,7 @@ comments: true
 <td>Cascade-FasterRCNN-ResNet50-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Cascade-FasterRCNN-ResNet50-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Cascade-FasterRCNN-ResNet50-FPN_pretrained.pdparams">训练模型</a></td>
 <td>41.1</td>
 <td>135.92 / 135.92</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>245.4 M</td>
 <td rowspan="2">Cascade-FasterRCNN 是一种改进的Faster R-CNN目标检测模型,通过耦联多个检测器,利用不同IoU阈值优化检测结果,解决训练和预测阶段的mismatch问题,提高目标检测的准确性。</td>
 </tr>
@@ -89,22 +89,22 @@ comments: true
 <td>Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN_pretrained.pdparams">训练模型</a></td>
 <td>45.0</td>
 <td>138.23 / 138.23</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>246.2 M</td>
 </tr>
 <tr>
 <td>CenterNet-DLA-34</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/CenterNet-DLA-34_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/CenterNet-DLA-34_pretrained.pdparams">训练模型</a></td>
 <td>37.6</td>
-<td>nan / nan</td>
-<td>nan / nan</td>
+<td>-</td>
+<td>-</td>
 <td>75.4 M</td>
 <td rowspan="2">CenterNet是一种anchor-free目标检测模型,把待检测物体的关键点视为单一点-即其边界框的中心点,并通过关键点进行回归。</td>
 </tr>
 <tr>
 <td>CenterNet-ResNet50</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/CenterNet-ResNet50_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/CenterNet-ResNet50_pretrained.pdparams">训练模型</a></td>
 <td>38.9</td>
-<td>nan / nan</td>
-<td>nan / nan</td>
+<td>-</td>
+<td>-</td>
 <td>319.7 M</td>
 </tr>
 <tr>
@@ -119,7 +119,7 @@ comments: true
 <td>FasterRCNN-ResNet34-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet34-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet34-FPN_pretrained.pdparams">训练模型</a></td>
 <td>37.8</td>
 <td>83.33 / 31.64</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>137.5 M</td>
 <td rowspan="9">Faster R-CNN是典型的two-stage目标检测模型,即先生成区域建议(Region Proposal),然后在生成的Region Proposal上做分类和回归。相较于前代R-CNN和Fast R-CNN,Faster R-CNN的改进主要在于区域建议方面,使用区域建议网络(Region Proposal Network, RPN)提供区域建议,以取代传统选择性搜索。RPN是卷积神经网络,并与检测网络共享图像的卷积特征,减少了区域建议的计算开销。</td>
 </tr>
@@ -127,56 +127,56 @@ comments: true
 <td>FasterRCNN-ResNet50-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet50-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet50-FPN_pretrained.pdparams">训练模型</a></td>
 <td>38.4</td>
 <td>107.08 / 35.40</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>148.1 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet50-vd-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet50-vd-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet50-vd-FPN_pretrained.pdparams">训练模型</a></td>
 <td>39.5</td>
 <td>109.36 / 36.00</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>148.1 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet50-vd-SSLDv2-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet50-vd-SSLDv2-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet50-vd-SSLDv2-FPN_pretrained.pdparams">训练模型</a></td>
 <td>41.4</td>
 <td>109.06 / 36.19</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>148.1 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet50</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet50_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet50_pretrained.pdparams">训练模型</a></td>
 <td>36.7</td>
 <td>496.33 / 109.12</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>120.2 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet101-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet101-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet101-FPN_pretrained.pdparams">训练模型</a></td>
 <td>41.4</td>
 <td>148.21 / 42.21</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>216.3 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNet101</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNet101_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNet101_pretrained.pdparams">训练模型</a></td>
 <td>39.0</td>
 <td>538.58 / 120.88</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>188.1 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-ResNeXt101-vd-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-ResNeXt101-vd-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-ResNeXt101-vd-FPN_pretrained.pdparams">训练模型</a></td>
 <td>43.4</td>
 <td>258.01 / 58.25</td>
-<td>nan / nan</td>
+<td>-</td>
 <td>360.6 M</td>
 </tr>
 <tr>
 <td>FasterRCNN-Swin-Tiny-FPN</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/FasterRCNN-Swin-Tiny-FPN_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/FasterRCNN-Swin-Tiny-FPN_pretrained.pdparams">训练模型</a></td>
 <td>42.6</td>
-<td>nan / nan</td>
-<td>nan / nan</td>
+<td>-</td>
+<td>-</td>
 <td>159.8 M</td>
 </tr>
 <tr>

+ 1 - 1
docs/module_usage/tutorials/cv_modules/open_vocabulary_detection.en.md

@@ -123,7 +123,7 @@ Related methods, parameters, and explanations are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
   <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>

+ 1 - 1
docs/module_usage/tutorials/cv_modules/open_vocabulary_segmentation.en.md

@@ -124,7 +124,7 @@ Related methods and parameter explanations are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
   <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>

+ 233 - 65
docs/module_usage/tutorials/cv_modules/pedestrian_attribute_recognition.en.md

@@ -15,8 +15,8 @@ Pedestrian attribute recognition is a crucial component in computer vision syste
 <tr>
 <th>Model</th><th>Model Download Link</th>
 <th>mA (%)</th>
-<th>GPU Inference Time (ms)</th>
-<th>CPU Inference Time (ms)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Size (M)</th>
 <th>Description</th>
 </tr>
@@ -25,8 +25,8 @@ Pedestrian attribute recognition is a crucial component in computer vision syste
 <tr>
 <td>PP-LCNet_x1_0_pedestrian_attribute</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LCNet_x1_0_pedestrian_attribute_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_pedestrian_attribute_pretrained.pdparams">Trained Model</a></td>
 <td>92.2</td>
-<td>3.84845</td>
-<td>9.23735</td>
+<td>2.35 / 0.49</td>
+<td>3.17 / 1.25</td>
 <td>6.7 M</td>
 <td>PP-LCNet_x1_0_pedestrian_attribute is a lightweight pedestrian attribute recognition model based on PP-LCNet, covering 26 categories</td>
 </tr>
@@ -35,40 +35,212 @@ Pedestrian attribute recognition is a crucial component in computer vision syste
 <b>Note: The above accuracy metrics are mA on PaddleX's internal self-built dataset. GPU inference time is based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b>
 
 ## <span id="lable">III. Quick Integration</span>
-> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, a few lines of code can complete the inference of the pedestrian attribute recognition module. You can easily switch models under this module and integrate the model inference of pedestrian attribute recognition into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pedestrian_attribute_006.jpg) to your local machine.
 
-```bash
+```python
 from paddlex import create_model
-model = create_model("PP-LCNet_x1_0_pedestrian_attribute")
+model = create_model(model_name="PP-LCNet_x1_0_pedestrian_attribute")
 output = model.predict("pedestrian_attribute_006.jpg", batch_size=1)
 for res in output:
     res.print(json_format=False)
     res.save_to_img("./output/")
     res.save_to_json("./output/res.json")
 ```
-For more information on using PaddleX's single-model inference API, refer to the [PaddleX Single Model Python Script Usage Instructions](../../instructions/model_python_API.en.md).
 
-<b>Note</b>: The index of the `output` value represents the following attributes: index 0 indicates whether a hat is worn, index 1 indicates whether glasses are worn, indexes 2-7 represent the style of the upper garment, indexes 8-13 represent the style of the lower garment, index 14 indicates whether boots are worn, indexes 15-17 represent the type of bag carried, index 18 indicates whether an object is held in front, indexes 19-21 represent age, index 22 represents gender, and indexes 23-25 represent orientation. Specifically, the attributes include the following types:
+After running, the obtained result is:
 
+```bash
+{'res': {'input_path': 'pedestrian_attribute_006.jpg', 'page_index': None, 'class_ids': array([10, ..., 23]), 'scores': array([1.     , ..., 0.54777]), 'label_names': ['LongCoat(长外套)', 'Age18-60(年龄在18-60岁之间)', 'Trousers(长裤)', 'Front(面朝前)']}}
 ```
-- Gender: Male, Female
-- Age: Under 18, 18-60, Over 60
-- Orientation: Front, Back, Side
-- Accessories: Glasses, Hat, None
-- Holding Object in Front: Yes, No
-- Bag: Backpack, Shoulder Bag, Handbag
-- Upper Garment Style: Striped, Logo, Plaid, Patchwork
-- Lower Garment Style: Striped, Patterned
-- Short-sleeved Shirt: Yes, No
-- Long-sleeved Shirt: Yes, No
-- Long Coat: Yes, No
-- Pants: Yes, No
-- Shorts: Yes, No
-- Skirt: Yes, No
-- Boots: Yes, No
-```
+
+The meanings of the result parameters are as follows:
+- `input_path`: the path of the input image to be predicted
+- `page_index`: if the input is a PDF file, this indicates which page of the PDF it is; otherwise it is `None`
+- `class_ids`: the predicted label IDs for the pedestrian attribute image
+- `scores`: the confidence scores of the predicted labels for the pedestrian attribute image
+- `label_names`: the predicted label names for the pedestrian attribute image
+
+The visualization image is as follows:
+
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/modules/ped_attri/pedestrian_attribute_006_res.jpg" alt="Pedestrian Attribute Result">
+
+Related methods, parameters, and explanations are as follows:
+
+* `create_model` instantiates the pedestrian attribute recognition model (here, `PP-LCNet_x1_0_pedestrian_attribute` is used as an example). The details are as follows:
+<table>
+<thead>
+<tr>
+<th>Parameter</th>
+<th>Parameter Description</th>
+<th>Parameter Type</th>
+<th>Options</th>
+<th>Default Value</th>
+</tr>
+</thead>
+<tr>
+<td><code>model_name</code></td>
+<td>Model name</td>
+<td><code>str</code></td>
+<td>None</td>
+<td><code>PP-LCNet_x1_0_pedestrian_attribute</code></td>
+</tr>
+<tr>
+<td><code>model_dir</code></td>
+<td>Model storage path</td>
+<td><code>str</code></td>
+<td>None</td>
+<td>None</td>
+</tr>
+<tr>
+<td><code>threshold</code></td>
+<td>Threshold for pedestrian attribute recognition</td>
+<td><code>float/list/dict</code></td>
+<td>
+<ul>
+<li><b>float</b>, any floating-point number between [0-1]: <code>0.5</code></li>
+<li><b>list</b>, a list of multiple floating-point numbers between [0-1]: <code>[0.5,0.5,...]</code></li>
+<li><b>dict</b>, specifying different thresholds for different categories, where "default" is a required key: <code>{"default":0.5,1:0.1,...}</code></li>
+</ul>
+</td>
+<td>0.5</td>
+</tr>
+</table>
+
+* The `model_name` must be specified. After specifying `model_name`, the default model parameters built into PaddleX are used; if `model_dir` is also specified, the user-defined model is used instead.
+
+* The `threshold` parameter sets the multi-label classification threshold and defaults to 0.5. When set to a float, the same threshold is applied to all categories; when set to a list, different categories use different thresholds, and the list length must match the number of categories; when set to a dict, the key "default" is required and gives the default threshold for all categories, while the remaining keys set per-category thresholds, e.g. {"default":0.5,1:0.1}.
+
+* Call the `predict()` method of the pedestrian attribute recognition model for inference. The parameters of the `predict()` method are `input`, `batch_size`, and `threshold`, described below; a short usage sketch follows the parameter table.
+
+<table>
+<thead>
+<tr>
+<th>Parameter</th>
+<th>Parameter Description</th>
+<th>Parameter Type</th>
+<th>Optional</th>
+<th>Default Value</th>
+</tr>
+</thead>
+<tr>
+<td><code>input</code></td>
+<td>Data to be predicted, supporting multiple input types</td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
+<td>
+<ul>
+  <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+  <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+  <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png">Example</a></li>
+  <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
+  <li><b>List</b>, elements of the list should be data of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
+</ul>
+</td>
+<td>None</td>
+</tr>
+<tr>
+<td><code>batch_size</code></td>
+<td>Batch size</td>
+<td><code>int</code></td>
+<td>Any integer</td>
+<td>1</td>
+</tr>
+<tr>
+<td><code>threshold</code></td>
+<td>Threshold for pedestrian attribute recognition</td>
+<td><code>float/list/dict</code></td>
+<td>
+<ul>
+<li><b>Float variable</b>, any floating-point number between [0-1]: <code>0.5</code></li>
+<li><b>List variable</b>, a list composed of multiple floating-point numbers between [0-1]: <code>[0.5,0.5,...]</code></li>
+<li><b>Dict variable</b>, specifying different thresholds for different categories, where "default" is a required key: <code>{"default":0.5,1:0.1,...}</code></li>
+</ul>
+</td>
+<td>0.5</td>
+</tr>
+</table>
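As promised above, a hedged usage sketch of the threshold forms documented in the tables; the values are illustrative only, and the demo image is the one downloaded in the quick-start step.

```python
from paddlex import create_model

# Dict form of the threshold: "default" is the required key, other keys are class IDs.
model = create_model(
    model_name="PP-LCNet_x1_0_pedestrian_attribute",
    threshold={"default": 0.5, 1: 0.1},
)

# A float passed to predict() overrides the threshold set at creation time.
for res in model.predict("pedestrian_attribute_006.jpg", batch_size=1, threshold=0.6):
    res.print(json_format=False)
```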
+
+* Process the prediction results. Each sample's prediction result is a corresponding Result object, and it supports operations such as printing, saving as an image, and saving as a `json` file:
+
+<table>
+<thead>
+<tr>
+<th>Method</th>
+<th>Method Description</th>
+<th>Parameter</th>
+<th>Parameter Type</th>
+<th>Parameter Description</th>
+<th>Default Value</th>
+</tr>
+</thead>
+<tr>
+<td rowspan = "3"><code>print()</code></td>
+<td rowspan = "3">Print the result to the terminal</td>
+<td><code>format_json</code></td>
+<td><code>bool</code></td>
+<td>Whether to format the output content using <code>JSON</code> indentation</td>
+<td><code>True</code></td>
+</tr>
+<tr>
+<td><code>indent</code></td>
+<td><code>int</code></td>
+<td>Specify the indentation level to beautify the output <code>JSON</code> data, making it more readable, only effective when <code>format_json</code> is <code>True</code></td>
+<td>4</td>
+</tr>
+<tr>
+<td><code>ensure_ascii</code></td>
+<td><code>bool</code></td>
+<td>Control whether to escape non-<code>ASCII</code> characters to <code>Unicode</code>. When set to <code>True</code>, all non-<code>ASCII</code> characters will be escaped; <code>False</code> retains the original characters, only effective when <code>format_json</code> is <code>True</code></td>
+<td><code>False</code></td>
+</tr>
+<tr>
+<td rowspan = "3"><code>save_to_json()</code></td>
+<td rowspan = "3">Save the result as a JSON file</td>
+<td><code>save_path</code></td>
+<td><code>str</code></td>
+<td>The file path to save the result. When it is a directory, the saved file name will be consistent with the input file name</td>
+<td>None</td>
+</tr>
+<tr>
+<td><code>indent</code></td>
+<td><code>int</code></td>
+<td>Specify the indentation level to beautify the output <code>JSON</code> data, making it more readable, only effective when <code>format_json</code> is <code>True</code></td>
+<td>4</td>
+</tr>
+<tr>
+<td><code>ensure_ascii</code></td>
+<td><code>bool</code></td>
+<td>Control whether to escape non-<code>ASCII</code> characters to <code>Unicode</code>. When set to <code>True</code>, all non-<code>ASCII</code> characters will be escaped; <code>False</code> retains the original characters, only effective when <code>format_json</code> is <code>True</code></td>
+<td><code>False</code></td>
+</tr>
+<tr>
+<td><code>save_to_img()</code></td>
+<td>Save the result as an image file</td>
+<td><code>save_path</code></td>
+<td><code>str</code></td>
+<td>The file path to save the result. When it is a directory, the saved file name will be consistent with the input file name</td>
+<td>None</td>
+</tr>
+</table>
+
+* Additionally, it also supports obtaining the visualized image with results and the prediction results through attributes, as follows:
+
+<table>
+<thead>
+<tr>
+<th>Attribute</th>
+<th>Attribute Description</th>
+</tr>
+</thead>
+<tr>
+<td rowspan = "1"><code>json</code></td>
+<td rowspan = "1">Get the prediction result in <code>json</code> format</td>
+</tr>
+<tr>
+<td rowspan = "1"><code>img</code></td>
+<td rowspan = "1">Get the visualized image in <code>dict</code> format</td>
+</tr>
+</table>
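Putting the result-handling methods and attributes together, a short sketch assuming the `output` iterator from the quick-start snippet earlier in this file:

```python
# Each `res` yielded by predict() is a Result object.
for res in output:
    res.print(format_json=True, indent=4)     # pretty-print the result to the terminal
    res.save_to_json("./output/res.json")     # persist the prediction as JSON
    res.save_to_img("./output/")              # write the visualized image

    prediction = res.json   # prediction result in json format
    images = res.img        # visualized image(s) in dict format
```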
+
+For more information on the usage of PaddleX single-model inference APIs, you can refer to the [PaddleX Single-Model Python Script Usage Instructions](../../instructions/model_python_API.en.md).
 
 ## IV. Custom Development
 If you seek higher accuracy from existing models, you can leverage PaddleX's custom development capabilities to develop better pedestrian attribute recognition models. Before developing pedestrian attribute recognition with PaddleX, ensure you have installed the classification-related model training plugins for PaddleX.  The installation process can be found in the custom development section of the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
@@ -96,47 +268,46 @@ python main.py -c paddlex/configs/modules/pedestrian_attribute_recognition/PP-LC
 After executing the above command, PaddleX will validate the dataset and summarize its basic information. If the command runs successfully, it will print `Check dataset passed !` in the log. The validation results file is saved in `./output/check_dataset_result.json`, and related outputs are saved in the `./output/check_dataset` directory in the current directory, including visual examples of sample images and sample distribution histograms.
 
 <details><summary>👉 <b>Details of Validation Results (Click to Expand)</b></summary>
-
 <p>The specific content of the validation result file is:</p>
 <pre><code class="language-bash">{
-  &quot;done_flag&quot;: true,
-  &quot;check_pass&quot;: true,
-  &quot;attributes&quot;: {
-    &quot;label_file&quot;: &quot;../../dataset/pedestrian_attribute_examples/label.txt&quot;,
-    &quot;num_classes&quot;: 26,
-    &quot;train_samples&quot;: 1000,
-    &quot;train_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/020907.jpg&quot;,
-      &quot;check_dataset/demo_img/004274.jpg&quot;,
-      &quot;check_dataset/demo_img/009412.jpg&quot;,
-      &quot;check_dataset/demo_img/026873.jpg&quot;,
-      &quot;check_dataset/demo_img/030560.jpg&quot;,
-      &quot;check_dataset/demo_img/022846.jpg&quot;,
-      &quot;check_dataset/demo_img/009055.jpg&quot;,
-      &quot;check_dataset/demo_img/015399.jpg&quot;,
-      &quot;check_dataset/demo_img/006435.jpg&quot;,
-      &quot;check_dataset/demo_img/055307.jpg&quot;
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "label_file": "../../dataset/pedestrian_attribute_examples/label.txt",
+    "num_classes": 26,
+    "train_samples": 1000,
+    "train_sample_paths": [
+      "check_dataset/demo_img/020907.jpg",
+      "check_dataset/demo_img/004274.jpg",
+      "check_dataset/demo_img/009412.jpg",
+      "check_dataset/demo_img/026873.jpg",
+      "check_dataset/demo_img/030560.jpg",
+      "check_dataset/demo_img/022846.jpg",
+      "check_dataset/demo_img/009055.jpg",
+      "check_dataset/demo_img/015399.jpg",
+      "check_dataset/demo_img/006435.jpg",
+      "check_dataset/demo_img/055307.jpg"
     ],
-    &quot;val_samples&quot;: 500,
-    &quot;val_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/080381.jpg&quot;,
-      &quot;check_dataset/demo_img/080469.jpg&quot;,
-      &quot;check_dataset/demo_img/080146.jpg&quot;,
-      &quot;check_dataset/demo_img/080003.jpg&quot;,
-      &quot;check_dataset/demo_img/080283.jpg&quot;,
-      &quot;check_dataset/demo_img/080104.jpg&quot;,
-      &quot;check_dataset/demo_img/080149.jpg&quot;,
-      &quot;check_dataset/demo_img/080313.jpg&quot;,
-      &quot;check_dataset/demo_img/080131.jpg&quot;,
-      &quot;check_dataset/demo_img/080412.jpg&quot;
+    "val_samples": 500,
+    "val_sample_paths": [
+      "check_dataset/demo_img/080381.jpg",
+      "check_dataset/demo_img/080469.jpg",
+      "check_dataset/demo_img/080146.jpg",
+      "check_dataset/demo_img/080003.jpg",
+      "check_dataset/demo_img/080283.jpg",
+      "check_dataset/demo_img/080104.jpg",
+      "check_dataset/demo_img/080149.jpg",
+      "check_dataset/demo_img/080313.jpg",
+      "check_dataset/demo_img/080131.jpg",
+      "check_dataset/demo_img/080412.jpg"
     ]
   },
-  &quot;analysis&quot;: {
-    &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
+  "analysis": {
+    "histogram": "check_dataset/histogram.png"
   },
-  &quot;dataset_path&quot;: &quot;./dataset/pedestrian_attribute_examples&quot;,
-  &quot;show_type&quot;: &quot;image&quot;,
-  &quot;dataset_type&quot;: &quot;MLClsDataset&quot;
+  "dataset_path": "pedestrian_attribute_examples",
+  "show_type": "image",
+  "dataset_type": "MLClsDataset"
 }
 </code></pre>
 <p>In the above validation results, <code>check_pass</code> being True indicates that the dataset format meets the requirements. Explanations for other indicators are as follows:</p>
@@ -148,13 +319,12 @@ After executing the above command, PaddleX will validate the dataset and summari
 <li><code>attributes.val_sample_paths</code>: The list of relative paths to the visualization images of samples in the validation set of this dataset;</li>
 </ul>
 <p>Additionally, the dataset verification also analyzes the distribution of the length and width of all images in the dataset and plots a histogram (histogram.png):</p>
-<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/ped_attri/image.png"></p></details>
+<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/ped_attri/image.png"/></p></details>
 
 #### 4.1.3 Dataset Format Conversion/Dataset Splitting (Optional)
 After completing data validation, you can convert the dataset format or re-split the training/validation ratio of the dataset by <b>modifying the configuration file</b> or <b>appending hyperparameters</b>.
 
 <details><summary>👉 <b>Dataset Format Conversion/Dataset Splitting Details (Click to Expand)</b></summary>
-
 <p><b>(1) Dataset Format Conversion</b></p>
 <p>Pedestrian attribute recognition does not support data format conversion.</p>
 <p><b>(2) Dataset Splitting</b></p>
@@ -207,7 +377,6 @@ the following steps are required:
 
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <ul>
 <li>During model training, PaddleX automatically saves the model weight files, with the default being <code>output</code>. If you need to specify a save path, you can set it through the <code>-o Global.output</code> field in the configuration file.</li>
 <li>PaddleX shields you from the concepts of dynamic graph weights and static graph weights. During model training, both dynamic and static graph weights are produced, and static graph weights are selected by default for model inference.</li>
@@ -238,7 +407,6 @@ Similar to model training, the following steps are required:
 Other related parameters can be set by modifying the `Global` and `Evaluate` fields in the `.yaml` configuration file. For details, refer to [PaddleX Common Model Configuration File Parameter Description](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <p>When evaluating the model, you need to specify the model weights file path. Each configuration file has a default weight save path built-in. If you need to change it, simply set it by appending a command line parameter, such as <code>-o Evaluate.weight_path=./output/best_model/best_model.pdparams</code>.</p>
 <p>After completing the model evaluation, an <code>evaluate_result.json</code> file will be produced, which records the evaluation results, specifically, whether the evaluation task was completed successfully and the model's evaluation metrics, including MultiLabelMAP;</p></details>
 
@@ -267,7 +435,7 @@ The model can be directly integrated into the PaddleX pipeline or directly into
 
 1.<b>Pipeline Integration</b>
 
-The pedestrian attribute recognition module can be integrated into the [General Image Multi-label Classification Pipeline](../../../pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.en.md) of PaddleX. Simply replace the model path to update the pedestrian attribute recognition module of the relevant pipeline. In pipeline integration, you can use high-performance inference and service-oriented deployment to deploy your model.
+The pedestrian attribute recognition module can be integrated into the [Pedestrian Attribute Recognition Pipeline](../../../pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.en.md) of PaddleX. Simply replace the model path to update the pedestrian attribute recognition module of the relevant pipeline. In pipeline integration, you can use high-performance inference and service-oriented deployment to deploy your model.
 
 2.<b>Module Integration</b>
 

+ 2 - 2
docs/module_usage/tutorials/cv_modules/rotated_object_detection.en.md

@@ -132,7 +132,7 @@ Related methods and parameter explanations are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
   <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
@@ -323,7 +323,7 @@ After executing the above command, PaddleX will verify the dataset and count the
   &quot;analysis&quot;: {
     &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
   },
-  &quot;dataset_path&quot;: &quot;./dataset/DOTA-sampled200_crop1024_data&quot;,
+  &quot;dataset_path&quot;: &quot;rdet_dota_examples&quot;,
   &quot;show_type&quot;: &quot;image&quot;,
   &quot;dataset_type&quot;: &quot;COCODetDataset&quot;
 }

+ 55 - 59
docs/module_usage/tutorials/cv_modules/semantic_segmentation.en.md

@@ -14,8 +14,8 @@ Semantic segmentation is a technique in computer vision that classifies each pix
 <tr>
 <th>Model Name</th><th>Model Download Link</th>
 <th>mIoU (%)</th>
-<th>GPU Inference Time (ms)</th>
-<th>CPU Inference Time (ms)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Size (M)</th>
 </tr>
 </thead>
@@ -23,20 +23,20 @@ Semantic segmentation is a technique in computer vision that classifies each pix
 <tr>
 <td>OCRNet_HRNet-W48</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/OCRNet_HRNet-W48_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/OCRNet_HRNet-W48_pretrained.pdparams">Trained Model</a></td>
 <td>82.15</td>
-<td>78.9976</td>
-<td>2226.95</td>
+<td>627.36 / 170.76</td>
+<td>3531.61 / 3531.61</td>
 <td>249.8 M</td>
 </tr>
 <tr>
 <td>PP-LiteSeg-T</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LiteSeg-T_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LiteSeg-T_pretrained.pdparams">Trained Model</a></td>
 <td>73.10</td>
-<td>7.6827</td>
-<td>138.683</td>
+<td>30.16 / 14.03</td>
+<td>420.07 / 235.01</td>
 <td>28.5 M</td>
 </tr>
 </tbody>
 </table>
-> ❗ The above list features the <b>2 core models</b> that the image classification module primarily supports. In total, this module supports <b>18 models</b>. The complete list of models is as follows:
+&gt; ❗ The above list features the <b>2 core models</b> that the image classification module primarily supports. In total, this module supports <b>18 models</b>. The complete list of models is as follows:
 
 <details><summary> 👉Model List Details</summary>
 <table>
@@ -44,8 +44,8 @@ Semantic segmentation is a technique in computer vision that classifies each pix
 <tr>
 <th>Model Name</th><th>Model Download Link</th>
 <th>mIoU (%)</th>
-<th>GPU Inference Time (ms)</th>
-<th>CPU Inference Time (ms)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Size (M)</th>
 </tr>
 </thead>
@@ -53,57 +53,57 @@ Semantic segmentation is a technique in computer vision that classifies each pix
 <tr>
 <td>Deeplabv3_Plus-R50</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Deeplabv3_Plus-R50_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Deeplabv3_Plus-R50_pretrained.pdparams">Trained Model</a></td>
 <td>80.36</td>
-<td>61.0531</td>
-<td>1513.58</td>
+<td>503.51 / 122.30</td>
+<td>3543.91 / 3543.91</td>
 <td>94.9 M</td>
 </tr>
 <tr>
 <td>Deeplabv3_Plus-R101</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Deeplabv3_Plus-R101_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Deeplabv3_Plus-R101_pretrained.pdparams">Trained Model</a></td>
 <td>81.10</td>
-<td>100.026</td>
-<td>2460.71</td>
+<td>803.79 / 175.45</td>
+<td>5136.21 / 5136.21</td>
 <td>162.5 M</td>
 </tr>
 <tr>
 <td>Deeplabv3-R50</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Deeplabv3-R50_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Deeplabv3-R50_pretrained.pdparams">Trained Model</a></td>
 <td>79.90</td>
-<td>82.2631</td>
-<td>1735.83</td>
+<td>647.56 / 121.67</td>
+<td>3803.09 / 3803.09</td>
 <td>138.3 M</td>
 </tr>
 <tr>
 <td>Deeplabv3-R101</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/Deeplabv3-R101_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/Deeplabv3-R101_pretrained.pdparams">Trained Model</a></td>
 <td>80.85</td>
-<td>121.492</td>
-<td>2685.51</td>
+<td>950.43 / 178.50</td>
+<td>5517.14 / 5517.14</td>
 <td>205.9 M</td>
 </tr>
 <tr>
 <td>OCRNet_HRNet-W18</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/OCRNet_HRNet-W18_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/OCRNet_HRNet-W18_pretrained.pdparams">Trained Model</a></td>
 <td>80.67</td>
-<td>48.2335</td>
-<td>906.385</td>
+<td>286.12 / 80.76</td>
+<td>1794.03 / 1794.03</td>
 <td>43.1 M</td>
 </tr>
 <tr>
 <td>OCRNet_HRNet-W48</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/OCRNet_HRNet-W48_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/OCRNet_HRNet-W48_pretrained.pdparams">Trained Model</a></td>
 <td>82.15</td>
-<td>78.9976</td>
-<td>2226.95</td>
+<td>627.36 / 170.76</td>
+<td>3531.61 / 3531.61</td>
 <td>249.8 M</td>
 </tr>
 <tr>
 <td>PP-LiteSeg-T</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LiteSeg-T_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LiteSeg-T_pretrained.pdparams">Trained Model</a></td>
 <td>73.10</td>
-<td>7.6827</td>
-<td>138.683</td>
+<td>30.16 / 14.03</td>
+<td>420.07 / 235.01</td>
 <td>28.5 M</td>
 </tr>
 <tr>
 <td>PP-LiteSeg-B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LiteSeg-B_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LiteSeg-B_pretrained.pdparams">Trained Model</a></td>
 <td>75.25</td>
-<td>10.9935</td>
-<td>194.727</td>
+<td>40.92 / 20.18</td>
+<td>494.32 / 310.34</td>
 <td>47.0 M</td>
 </tr>
 <tr>
@@ -209,7 +209,7 @@ Semantic segmentation is a technique in computer vision that classifies each pix
 <p><b>The accuracy metrics of the above models are measured on the <a href="https://groups.csail.mit.edu/vision/datasets/ADE20K/">ADE20k</a> dataset. GPU inference time is based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b></p></details>
 
 ## III. Quick Integration
-> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 
 Just a few lines of code can complete the inference of the Semantic Segmentation module, allowing you to easily switch between models under this module. You can also integrate the model inference of the Semantic Segmentation module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_semantic_segmentation_002.png) to your local machine.
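For context while reading this hunk, a hedged sketch of the few-lines inference flow referred to above (the document's own snippet is outside the diff); `PP-LiteSeg-T` is one of the core models listed earlier and any other listed name can be substituted.

```python
from paddlex import create_model

# PP-LiteSeg-T is one of the core semantic segmentation models listed above.
model = create_model(model_name="PP-LiteSeg-T")
output = model.predict("general_semantic_segmentation_002.png", batch_size=1)
for res in output:
    res.print()
    res.save_to_img("./output/")
    res.save_to_json("./output/res.json")
```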
@@ -237,7 +237,7 @@ The meanings of the runtime parameters are as follows:
 
 The visualization image is as follows:
 
-<img src="https://raw.githubusercontent.com/BluebirdStory/PaddleX_doc_images/main/images/modules/semantic_segmentation/general_semantic_segmentation_002_res.png" alt="Visualization Image">
+<img alt="Visualization Image" src="https://raw.githubusercontent.com/BluebirdStory/PaddleX_doc_images/main/images/modules/semantic_segmentation/general_semantic_segmentation_002_res.png"/>
 
 Note: The image link may not be accessible due to network issues or problems with the link itself. If you need to access the image, please check the validity of the link and try again.
 
@@ -279,7 +279,7 @@ Related methods, parameters, and explanations are as follows:
 
 * The `model_name` must be specified. After specifying `model_name`, the built-in model parameters of PaddleX are used by default. If `model_dir` is specified, the user-defined model is used.
 
-* The `target_size` is specified during initialization to set the resolution for model inference. The default value is `None`. `-1` indicates that the original image size is used for inference, and `None` indicates that the settings from the previous layer are used. The priority order for parameter settings is: `predict parameter > create_model initialization > yaml configuration file`.
+* The `target_size` is specified during initialization to set the resolution for model inference. The default value is `None`. `-1` indicates that the original image size is used for inference, and `None` indicates that the settings from the previous layer are used. The priority order for parameter settings is: `predict parameter &gt; create_model initialization &gt; yaml configuration file`.
 
 * The `predict()` method of the general semantic segmentation model is called for inference and prediction. The parameters of the `predict()` method are `input`, `batch_size`, and `target_size`, with specific explanations as follows:
 
@@ -299,11 +299,11 @@ Related methods, parameters, and explanations are as follows:
 <td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-  <li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-  <li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-  <li><b>URL Link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_semantic_segmentation_001.png">Example</a></li>
-  <li><b>Local Directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>List</b>, elements of the list should be data of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
+<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL Link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_semantic_segmentation_001.png">Example</a></li>
+<li><b>Local Directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, elements of the list should be data of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -321,10 +321,10 @@ Related methods, parameters, and explanations are as follows:
 <td><code>int</code>/<code>tuple</code></td>
 <td>
 <ul>
-  <li><b>-1</b>, indicating inference using the original image size</li>
-  <li><b>None</b>, indicating the settings from the previous layer are used. The priority order for parameter settings is: <code>predict parameter > create_model initialization > yaml configuration file</code></li>
-  <li><b>int</b>, such as 512, indicating inference using a resolution of <code>(512, 512)</code></li>
-  <li><b>tuple</b>, such as (512, 1024), indicating inference using a resolution of <code>(512, 1024)</code></li>
+<li><b>-1</b>, indicating inference using the original image size</li>
+<li><b>None</b>, indicating the settings from the previous layer are used. The priority order for parameter settings is: <code>predict parameter &gt; create_model initialization &gt; yaml configuration file</code></li>
+<li><b>int</b>, such as 512, indicating inference using a resolution of <code>(512, 512)</code></li>
+<li><b>tuple</b>, such as (512, 1024), indicating inference using a resolution of <code>(512, 1024)</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -445,30 +445,29 @@ python main.py -c paddlex/configs/modules/semantic_segmentation/PP-LiteSeg-T.yam
 After executing the above command, PaddleX will verify the dataset and collect basic information about it. Once the command runs successfully, a message saying `Check dataset passed !` will be printed in the log. The verification results will be saved in `./output/check_dataset_result.json`, and related outputs will be stored in the `./output/check_dataset` directory, including visual examples of sample images and a histogram of sample distribution.
 
 <details><summary>👉 <b>Verification Result Details (click to expand)</b></summary>
-
 <p>The specific content of the verification result file is:</p>
 <pre><code class="language-bash">{
-  &quot;done_flag&quot;: true,
-  &quot;check_pass&quot;: true,
-  &quot;attributes&quot;: {
-    &quot;train_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/P0005.jpg&quot;,
-      &quot;check_dataset/demo_img/P0050.jpg&quot;
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "train_sample_paths": [
+      "check_dataset/demo_img/P0005.jpg",
+      "check_dataset/demo_img/P0050.jpg"
     ],
-    &quot;train_samples&quot;: 267,
-    &quot;val_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/N0139.jpg&quot;,
-      &quot;check_dataset/demo_img/P0137.jpg&quot;
+    "train_samples": 267,
+    "val_sample_paths": [
+      "check_dataset/demo_img/N0139.jpg",
+      "check_dataset/demo_img/P0137.jpg"
     ],
-    &quot;val_samples&quot;: 76,
-    &quot;num_classes&quot;: 2
+    "val_samples": 76,
+    "num_classes": 2
   },
-  &quot;analysis&quot;: {
-    &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
+  "analysis": {
+    "histogram": "check_dataset/histogram.png"
   },
-  &quot;dataset_path&quot;: &quot;seg_optic_examples&quot;,
-  &quot;show_type&quot;: &quot;image&quot;,
-  &quot;dataset_type&quot;: &quot;SegDataset&quot;
+  "dataset_path": "seg_optic_examples",
+  "show_type": "image",
+  "dataset_type": "SegDataset"
 }
 </code></pre>
 <p>The verification results above indicate that <code>check_pass</code> being <code>True</code> means the dataset format meets the requirements. Explanations for other indicators are as follows:</p>
@@ -480,11 +479,10 @@ After executing the above command, PaddleX will verify the dataset and collect b
 <li><code>attributes.val_sample_paths</code>: A list of relative paths to the visualization images of validation samples in this dataset;</li>
 </ul>
 <p>The dataset verification also analyzes the distribution of sample numbers across all classes and plots a histogram (histogram.png):</p>
-<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/semanticseg/01.png"></p></details>
+<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/semanticseg/01.png"/></p></details>
 
 #### 4.1.3 Dataset Format Conversion/Dataset Splitting (Optional) (Click to Expand)
 <details><summary>👉 <b>Details on Format Conversion/Dataset Splitting (Click to Expand)</b></summary>
-
 <p>After completing dataset verification, you can convert the dataset format or re-split the training/validation ratio by modifying the configuration file or appending hyperparameters.</p>
 <p><b>(1) Dataset Format Conversion</b></p>
 <p>Semantic segmentation supports converting <code>LabelMe</code> format datasets to the required format.</p>
@@ -572,7 +570,6 @@ You need to follow these steps:
 Other related parameters can be set by modifying the `Global` and `Train` fields in the `.yaml` configuration file, or adjusted by appending parameters in the command line. For example, to train using the first two GPUs: `-o Global.device=gpu:0,1`; to set the number of training epochs to 10: `-o Train.epochs_iters=10`. For more modifiable parameters and their detailed explanations, refer to the [PaddleX Common Configuration Parameters Documentation](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <ul>
 <li>During model training, PaddleX automatically saves model weight files, with the default path being <code>output</code>. To specify a different save path, use the <code>-o Global.output</code> field in the configuration file.</li>
 <li>PaddleX abstracts the concepts of dynamic graph weights and static graph weights from you. During model training, both dynamic and static graph weights are produced, and static graph weights are used by default for model inference.</li>
@@ -605,7 +602,6 @@ Similar to model training, follow these steps:
 Other related parameters can be set by modifying the `Global` and `Evaluate` fields in the `.yaml` configuration file. For more details, refer to the [PaddleX Common Configuration Parameters Documentation](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <p>When evaluating the model, you need to specify the model weight file path. Each configuration file has a default weight save path. If you need to change it, simply append the command line parameter, e.g., <code>-o Evaluate.weight_path=./output/best_model/best_model.pdparams</code>.</p>
 <p>After model evaluation, the following outputs are typically produced:</p>
 <ul>

+ 3 - 3
docs/module_usage/tutorials/cv_modules/small_object_detection.en.md

@@ -49,7 +49,7 @@ Small object detection typically refers to accurately detecting and locating sma
 
 
 ## III. Quick Integration  <a id="quick"> </a>
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, you can complete the inference of the small object detection module with just a few lines of code. You can switch models under this module freely, and you can also integrate the model inference of the small object detection module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/small_object_detection.jpg) to your local machine.
 
@@ -124,7 +124,7 @@ Related methods, parameters, and explanations are as follows:
 </table>
 
 * The `model_name` must be specified. After specifying `model_name`, the default model parameters built into PaddleX are used. If `model_dir` is specified, the user-defined model is used.
-* `threshold` is the threshold for filtering low-confidence objects. The default is `None`, which means using the settings from the previous layer. The priority of parameter settings from highest to lowest is: `predict parameter &gt; create_model initialization &gt; yaml configuration file`. Currently, two types of threshold settings are supported:
+* `threshold` is the threshold for filtering low-confidence objects. The default is `None`, which means using the settings from the previous layer. The priority of parameter settings from highest to lowest is: `predict parameter > create_model initialization > yaml configuration file`. Currently, two types of threshold settings are supported:
   * `float`, using the same threshold for all classes.
   * `dict`, where the key is the class ID and the value is the threshold, allowing different thresholds for different classes.
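+
+For example, both threshold styles can be passed as in the sketch below (the model name is only illustrative; substitute one of the models listed earlier on this page):
+
+```python
+from paddlex import create_model
+
+# illustrative model name; the demo image is the one referenced above
+model = create_model(model_name="PP-YOLOE_plus_SOD-S", threshold=0.5)
+
+# the threshold can also be overridden per call, here as a dict keyed by class ID
+output = model.predict("small_object_detection.jpg", batch_size=1, threshold={0: 0.5, 1: 0.35})
+for res in output:
+    res.print()
+```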
 
@@ -168,7 +168,7 @@ Related methods, parameters, and explanations are as follows:
 <td><code>float</code>/<code>dict</code>/<code>None</code></td>
 <td>
 <ul>
-<li><b>None</b>, indicating the use of settings from the previous layer. The priority of parameter settings from highest to lowest is: <code>predict parameter &gt; create_model initialization &gt; yaml configuration file</code></li>
+<li><b>None</b>, indicating the use of settings from the previous layer. The priority of parameter settings from highest to lowest is: <code>predict parameter > create_model initialization > yaml configuration file</code></li>
 <li><b>float</b>, such as 0.5, indicating the use of <code>0.5</code> as the threshold for filtering low-confidence objects during inference</li>
 <li><b>dict</b>, such as <code>{0: 0.5, 1: 0.35}</code>, indicating the use of 0.5 as the threshold for class 0 and 0.35 for class 1 during inference.</li>
 </ul>

+ 1 - 1
docs/module_usage/tutorials/cv_modules/small_object_detection.md

@@ -170,7 +170,7 @@ for res in output:
 <td><code>float</code>/<code>dict[int, float]</code>/<code>None</code></td>
 <td>
 <ul>
-<li><b>None</b>,表示沿用上一层设置, 参数设置优先级从高到低为: <code>predict参数传入 &gt; create_model初始化传入 > yaml配置文件设置</code></li>
+<li><b>None</b>,表示沿用上一层设置, 参数设置优先级从高到低为: <code>predict参数传入 > create_model初始化传入 > yaml配置文件设置</code></li>
 <li><b>float</b>,对于所有的类别使用同一个阈值。如0.5,表示推理时使用0.5作为所有类别的低分object过滤阈值</li>
 <li><b>dict[int, float]</b>,如<code>{0: 0.5, 1: 0.35}</code>,表示推理时对类别0使用0.5低分过滤阈值,对类别1使用0.35低分过滤阈值。</li>
 </ul>

+ 197 - 6
docs/module_usage/tutorials/cv_modules/vehicle_attribute_recognition.en.md

@@ -25,7 +25,7 @@ Vehicle attribute recognition is a crucial component in computer vision systems.
 <tr>
 <td>PP-LCNet_x1_0_vehicle_attribute</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LCNet_x1_0_vehicle_attribute_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_vehicle_attribute_pretrained.pdparams">Trained Model</a></td>
 <td>91.7</td>
-<td>2.32 / 2.32</td>
+<td>2.32 / 0.52</td>
 <td>3.22 / 1.26</td>
 <td>6.7 M</td>
 <td>PP-LCNet_x1_0_vehicle_attribute is a lightweight vehicle attribute recognition model based on PP-LCNet.</td>
@@ -37,19 +37,210 @@ Vehicle attribute recognition is a crucial component in computer vision systems.
 
 ## <span id="lable">III. Quick Integration</span>
 
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, a few lines of code can complete the inference of the vehicle attribute recognition module. You can easily switch models under this module, and you can also integrate the model inference of the vehicle attribute recognition module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_attribute_007.jpg) to your local machine.
 
-```bash
+```python
 from paddlex import create_model
-model = create_model("PP-LCNet_x1_0_vehicle_attribute")
+model = create_model(model_name="PP-LCNet_x1_0_vehicle_attribute")
 output = model.predict("vehicle_attribute_007.jpg", batch_size=1)
 for res in output:
     res.print(json_format=False)
     res.save_to_img("./output/")
     res.save_to_json("./output/res.json")
 ```
+
+After running, the obtained result is:
+
+```bash
+{'res': {'input_path': 'vehicle_attribute_007.jpg', 'page_index': None, 'class_ids': array([ 0, 13]), 'scores': array([0.98929, 0.97349]), 'label_names': ['yellow(黄色)', 'hatchback(掀背车)']}}
+```
+
+The meanings of the parameters in the running result are as follows:
+- `input_path`: Indicates the path of the input image to be predicted.
+- `page_index`: If the input is a PDF file, it indicates which page of the PDF is currently being processed; otherwise, it is `None`.
+- `class_ids`: Indicates the predicted label IDs of the vehicle attribute images.
+- `scores`: Indicates the confidence scores of the predicted labels of the vehicle attribute images.
+- `label_names`: Indicates the names of the predicted labels of the vehicle attribute images.
+
+The visualization image is as follows:
+
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/modules/vehicle_attri/vehicle_attribute_007_res.jpg" alt="Vehicle Attribute Result">
+
+Relevant methods, parameters, and explanations are as follows:
+
+* `create_model` instantiates the vehicle attribute recognition model (here, `PP-LCNet_x1_0_vehicle_attribute` is used as an example). The specific explanations are as follows:
+<table>
+<thead>
+<tr>
+<th>Parameter</th>
+<th>Parameter Description</th>
+<th>Parameter Type</th>
+<th>Options</th>
+<th>Default Value</th>
+</tr>
+</thead>
+<tr>
+<td><code>model_name</code></td>
+<td>The name of the model</td>
+<td><code>str</code></td>
+<td>None</td>
+<td><code>PP-LCNet_x1_0_vehicle_attribute</code></td>
+</tr>
+<tr>
+<td><code>model_dir</code></td>
+<td>The storage path of the model</td>
+<td><code>str</code></td>
+<td>None</td>
+<td>None</td>
+</tr>
+<tr>
+<td><code>threshold</code></td>
+<td>The threshold for vehicle attribute recognition</td>
+<td><code>float/list/dict</code></td>
+<td><ul><li><b>float variable</b>, any floating-point number between [0-1]: <code>0.5</code></li>
+<li><b>list variable</b>, a list composed of multiple floating-point numbers between [0-1]: <code>[0.5,0.5,...]</code></li>
+<li><b>dict variable</b>, specifying different thresholds for different categories, where <code>"default"</code> is a required key: <code>{"default":0.5,1:0.1,...}</code></li>
+</ul></td>
+<td>0.5</td>
+</tr>
+</table>
+
+* The `model_name` must be specified. After specifying `model_name`, PaddleX's built-in model parameters are used by default. If `model_dir` is specified, the user-defined model is used.
+
+* The `threshold` parameter sets the threshold for vehicle attribute recognition (a multi-label classification task), with a default value of 0.5. When set as a float, all categories use this threshold; when set as a list, different categories use different thresholds, and the list length must match the number of categories; when set as a dictionary, "default" is a required key giving the default threshold for all categories, while other keys override it for specific categories. For example: <code>{"default":0.5,1:0.1}</code>.
+
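+For example, the three accepted forms can be passed to `create_model` as in the following sketch (the attribute count of 19 is an assumption based on the 10 color and 9 vehicle-type attributes described in the note at the end of this section):
+
+```python
+from paddlex import create_model
+
+# one global threshold for every attribute
+model = create_model(model_name="PP-LCNet_x1_0_vehicle_attribute", threshold=0.5)
+
+# per-attribute thresholds via a list; the length must equal the number of attributes
+# (assumed to be 19 here: 10 colors + 9 vehicle types)
+model = create_model(model_name="PP-LCNet_x1_0_vehicle_attribute", threshold=[0.5] * 19)
+
+# per-attribute thresholds via a dict; the "default" key is required
+model = create_model(model_name="PP-LCNet_x1_0_vehicle_attribute", threshold={"default": 0.5, 1: 0.1})
+```
+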
+* The `predict()` method of the vehicle attribute recognition model is called for inference prediction. The parameters of the `predict()` method include `input`, `batch_size`, and `threshold`, with specific explanations as follows:
+
+<table>
+<thead>
+<tr>
+<th>Parameter</th>
+<th>Description</th>
+<th>Type</th>
+<th>Options</th>
+<th>Default Value</th>
+</tr>
+</thead>
+<tr>
+<td><code>input</code></td>
+<td>Data to be predicted, supporting multiple input types</td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
+<td>
+<ul>
+  <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+  <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+  <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png">Example</a></li>
+  <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
+  <li><b>List</b>, elements of the list should be of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
+</ul>
+</td>
+<td>None</td>
+</tr>
+<tr>
+<td><code>batch_size</code></td>
+<td>Batch size</td>
+<td><code>int</code></td>
+<td>Any integer</td>
+<td>1</td>
+</tr>
+<tr>
+<td><code>threshold</code></td>
+<td>Threshold for vehicle attribute recognition</td>
+<td><code>float/list/dict</code></td>
+<td><ul><li><b>float variable</b>, any floating-point number between [0-1]: <code>0.5</code></li>
+<li><b>list variable</b>, a list of multiple floating-point numbers between [0-1]: <code>[0.5,0.5,...]</code></li>
+<li><b>dict variable</b>, specifying different thresholds for different categories, where <code>"default"</code> is a required key: <code>{"default":0.5,1:0.1,...}</code></li>
+</ul></td>
+<td>0.5</td>
+</tr>
+</table>
+
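+For example, a local directory can be predicted in batches, with the threshold overridden per call (a minimal sketch; the directory path is illustrative):
+
+```python
+from paddlex import create_model
+
+model = create_model(model_name="PP-LCNet_x1_0_vehicle_attribute")
+# "./vehicle_imgs/" is an illustrative local directory containing the images to predict
+output = model.predict("./vehicle_imgs/", batch_size=4, threshold={"default": 0.5, 1: 0.1})
+for res in output:
+    res.print()
+```
+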
+* The prediction results are processed, with each sample's prediction result being a corresponding Result object, which supports operations such as printing, saving as an image, and saving as a <code>json</code> file:
+
+<table>
+<thead>
+<tr>
+<th>Method</th>
+<th>Description</th>
+<th>Parameter</th>
+<th>Type</th>
+<th>Description</th>
+<th>Default Value</th>
+</tr>
+</thead>
+<tr>
+<td rowspan = "3"><code>print()</code></td>
+<td rowspan = "3">Print the result to the terminal</td>
+<td><code>format_json</code></td>
+<td><code>bool</code></td>
+<td>Whether to format the output content using <code>JSON</code> indentation</td>
+<td><code>True</code></td>
+</tr>
+<tr>
+<td><code>indent</code></td>
+<td><code>int</code></td>
+<td>Specify the indentation level to beautify the output <code>JSON</code> data and make it more readable, effective only when <code>format_json</code> is <code>True</code></td>
+<td>4</td>
+</tr>
+<tr>
+<td><code>ensure_ascii</code></td>
+<td><code>bool</code></td>
+<td>Control whether non-<code>ASCII</code> characters are escaped to <code>Unicode</code>. If set to <code>True</code>, all non-<code>ASCII</code> characters will be escaped; <code>False</code> retains the original characters, effective only when <code>format_json</code> is <code>True</code></td>
+<td><code>False</code></td>
+</tr>
+<tr>
+<td rowspan = "3"><code>save_to_json()</code></td>
+<td rowspan = "3">Save the result as a <code>json</code> file</td>
+<td><code>save_path</code></td>
+<td><code>str</code></td>
+<td>The file path for saving; if it is a directory, the saved file will be named consistently with the input file type</td>
+<td>None</td>
+</tr>
+<tr>
+<td><code>indent</code></td>
+<td><code>int</code></td>
+<td>Specify the indentation level to beautify the output <code>JSON</code> data and make it more readable, effective only when <code>format_json</code> is <code>True</code></td>
+<td>4</td>
+</tr>
+<tr>
+<td><code>ensure_ascii</code></td>
+<td><code>bool</code></td>
+<td>Control whether non-<code>ASCII</code> characters are escaped to <code>Unicode</code>. If set to <code>True</code>, all non-<code>ASCII</code> characters will be escaped; <code>False</code> retains the original characters, effective only when <code>format_json</code> is <code>True</code></td>
+<td><code>False</code></td>
+</tr>
+<tr>
+<td><code>save_to_img()</code></td>
+<td>Save the result as an image file</td>
+<td><code>save_path</code></td>
+<td><code>str</code></td>
+<td>The file path for saving; if it is a directory, the saved file will be named consistently with the input file type</td>
+<td>None</td>
+</tr>
+</table>
+
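+A short usage sketch of these methods, following the quick-integration example above (saving to a directory names the output file after the input):
+
+```python
+from paddlex import create_model
+
+model = create_model(model_name="PP-LCNet_x1_0_vehicle_attribute")
+for res in model.predict("vehicle_attribute_007.jpg", batch_size=1):
+    res.print()                                       # terminal summary; formatting options as in the table above
+    res.save_to_json(save_path="./output/res.json")   # a directory path also works
+    res.save_to_img(save_path="./output/")
+```
+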
+* In addition, the visualized image with results and the prediction result can also be obtained through the following attributes:
+
+<table>
+<thead>
+<tr>
+<th>Attribute</th>
+<th>Description</th>
+</tr>
+</thead>
+<tr>
+<td rowspan = "1"><code>json</code></td>
+<td>Get the prediction result in <code>json</code> format</td>
+</tr>
+<tr>
+<td rowspan = "1"><code>img</code></td>
+<td>Get the visualized image in <code>dict</code> format</td>
+</tr>
+</table>
+
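+For example (a minimal sketch, assuming `res.json` mirrors the <code>{'res': {...}}</code> layout printed earlier; the keys of the `img` dict are not documented here, so it is simply iterated):
+
+```python
+from paddlex import create_model
+
+model = create_model(model_name="PP-LCNet_x1_0_vehicle_attribute")
+res = next(iter(model.predict("vehicle_attribute_007.jpg", batch_size=1)))
+print(res.json["res"]["label_names"])    # prediction result as a dict
+for name, image in res.img.items():      # visualization images, keyed by name
+    print(name, type(image))
+```
+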
 For more information on using PaddleX's single-model inference API, refer to [PaddleX Single Model Python Script Usage Instructions](../../instructions/model_python_API.en.md).
 
 <b>Note</b>: In the `output`, values indexed from 0-9 represent color attributes, corresponding to the following colors respectively: yellow, orange, green, gray, red, blue, white, golden, brown, black. Indices 10-18 represent vehicle type attributes, corresponding to the following vehicle types: sedan, suv, van, hatchback, mpv, pickup, bus, truck, estate.
@@ -117,7 +308,7 @@ After executing the above command, PaddleX will validate the dataset and summari
   "analysis": {
     "histogram": "check_dataset/histogram.png"
   },
-  "dataset_path": "./dataset/vehicle_attribute_examples",
+  "dataset_path": "vehicle_attribute_examples",
   "show_type": "image",
   "dataset_type": "MLClsDataset"
 }
@@ -249,7 +440,7 @@ The model can be directly integrated into the PaddleX pipeline or directly into
 
 1.<b>Pipeline Integration</b>
 
-The vehicle attribute recognition module can be integrated into the [General Image Multi-label Classification Pipeline](../../../pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.en.md) of PaddleX. Simply replace the model path to update the vehicle attribute recognition module of the relevant pipeline. In pipeline integration, you can use high-performance inference and service-oriented deployment to deploy your model.
+The vehicle attribute recognition module can be integrated into the [Vehicle Attribute Recognition Pipeline](../../../pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.en.md) of PaddleX. Simply replace the model path to update the vehicle attribute recognition module of the relevant pipeline. In pipeline integration, you can use high-performance inference and service-oriented deployment to deploy your model.
 
 2.<b>Module Integration</b>
 

+ 1 - 1
docs/module_usage/tutorials/cv_modules/vehicle_attribute_recognition.md

@@ -25,7 +25,7 @@ comments: true
 <tr>
 <td>PP-LCNet_x1_0_vehicle_attribute</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LCNet_x1_0_vehicle_attribute_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_vehicle_attribute_pretrained.pdparams">训练模型</a></td>
 <td>91.7</td>
-<td>2.32 / 2.32</td>
+<td>2.32 / 0.52</td>
 <td>3.22 / 1.26</td>
 <td>6.7 M</td>
 <td>PP-LCNet_x1_0_vehicle_attribute 是一种基于PP-LCNet的轻量级车辆属性识别模型。</td>

+ 56 - 60
docs/module_usage/tutorials/cv_modules/vehicle_detection.en.md

@@ -11,35 +11,34 @@ Vehicle detection is a subtask of object detection, specifically referring to th
 
 
 <table>
-  <tr>
-    <th>Model</th>
-    <th>mAP 0.5:0.95</th>
-    <th>GPU Inference Time (ms)</th>
-    <th>CPU Inference Time (ms)</th>
-    <th>Model Size (M)</th>
-    <th>Description</th>
-  </tr>
-  <tr>
-    <td>PP-YOLOE-S_vehicle</td>
-    <td>61.3</td>
-    <td>15.4</td>
-    <td>178.4</td>
-    <td>28.79</td>
-    <td rowspan="2">Vehicle detection model based on PP-YOLOE</td>
-  </tr>
-  <tr>
-    <td>PP-YOLOE-L_vehicle</td>
-    <td>63.9</td>
-    <td>32.6</td>
-    <td>775.6</td>
-    <td>196.02</td>
-  </tr>
-
+<tr>
+<th>Model</th>
+<th>mAP 0.5:0.95</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>Model Size (M)</th>
+<th>Description</th>
+</tr>
+<tr>
+<td>PP-YOLOE-S_vehicle</td>
+<td>61.3</td>
+<td>9.79 / 3.48</td>
+<td>54.14 / 46.69</td>
+<td>28.79</td>
+<td rowspan="2">Vehicle detection model based on PP-YOLOE</td>
+</tr>
+<tr>
+<td>PP-YOLOE-L_vehicle</td>
+<td>63.9</td>
+<td>32.84 / 9.03</td>
+<td>176.60 / 176.60</td>
+<td>196.02</td>
+</tr>
+</table>
 <b>Note: The evaluation set for the above accuracy metrics is PPVehicle dataset mAP(0.5:0.95). GPU inference time is based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b>
-</details>
+
 
 ## III. Quick Integration
-> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, you can complete the inference of the vehicle detection module with just a few lines of code. You can switch models under this module freely, and you can also integrate the model inference of the vehicle detection module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_detection.jpg) to your local machine.
 
@@ -71,7 +70,7 @@ The meanings of the runtime parameters are as follows:
 
 The visualization image is as follows:
 
-<img src="https://raw.githubusercontent.com/BluebirdStory/PaddleX_doc_images/main/images/modules/vehicle_detection/vehicle_detection_res.jpg" alt="Visualization Image">
+<img alt="Visualization Image" src="https://raw.githubusercontent.com/BluebirdStory/PaddleX_doc_images/main/images/modules/vehicle_detection/vehicle_detection_res.jpg"/>
 
 Related methods, parameters, and explanations are as follows:
 
@@ -111,7 +110,7 @@ Related methods, parameters, and explanations are as follows:
 
 * The `model_name` must be specified. After specifying `model_name`, the built-in model parameters of PaddleX are used by default. If `model_dir` is specified, the user-defined model is used.
 
-* The `threshold` is the threshold for filtering low-score objects. The default value is `None`, indicating that the settings from the previous layer are used. The priority order for parameter settings is: `predict parameter > create_model initialization > yaml configuration file`. Currently, two types of threshold settings are supported:
+* The `threshold` is the threshold for filtering low-score objects. The default value is `None`, indicating that the settings from the previous layer are used. The priority order for parameter settings is: `predict parameter &gt; create_model initialization &gt; yaml configuration file`. Currently, two types of threshold settings are supported:
   * `float`: Use the same threshold for all classes.
   * `dict`: The key is the class ID, and the value is the threshold. Different thresholds can be set for different classes. For vehicle detection, which is a single-class detection task, this setting is not required.
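+
+For example (a minimal sketch; the model name and demo image are the ones used elsewhere on this page):
+
+```python
+from paddlex import create_model
+
+# threshold passed at model creation; it can also be overridden per predict() call
+model = create_model(model_name="PP-YOLOE-S_vehicle", threshold=0.5)
+output = model.predict("vehicle_detection.jpg", batch_size=1)
+for res in output:
+    res.print()
+    res.save_to_img("./output/")
+```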
 
@@ -133,11 +132,11 @@ Related methods, parameters, and explanations are as follows:
 <td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-  <li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-  <li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-  <li><b>URL Link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_instance_segmentation_004.png">Example</a></li>
-  <li><b>Local Directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>List</b>, elements of the list should be data of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
+<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL Link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_instance_segmentation_004.png">Example</a></li>
+<li><b>Local Directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, elements of the list should be data of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -155,9 +154,9 @@ Related methods, parameters, and explanations are as follows:
 <td><code>float</code>/<code>dict</code>/<code>None</code></td>
 <td>
 <ul>
-  <li><b>None</b>, indicating the settings from the previous layer are used. The priority order for parameter settings is: <code>predict parameter > create_model initialization > yaml configuration file</code></li>
-  <li><b>float</b>, such as 0.5, indicating the threshold of 0.5 is used for filtering low-score objects during inference</li>
-  <li><b>dict</b>, such as <code>{0: 0.5, 1: 0.35}</code>, indicating a threshold of 0.5 for class 0 and 0.35 for class 1 during inference. Vehicle detection is a single-class detection task and does not require this setting.</li>
+<li><b>None</b>, indicating the settings from the previous layer are used. The priority order for parameter settings is: <code>predict parameter &gt; create_model initialization &gt; yaml configuration file</code></li>
+<li><b>float</b>, such as 0.5, indicating the threshold of 0.5 is used for filtering low-score objects during inference</li>
+<li><b>dict</b>, such as <code>{0: 0.5, 1: 0.35}</code>, indicating a threshold of 0.5 for class 0 and 0.35 for class 1 during inference. Vehicle detection is a single-class detection task and does not require this setting.</li>
 </ul>
 </td>
 <td>None</td>
@@ -274,32 +273,31 @@ python main.py -c paddlex/configs/modules/vehicle_detection/PP-YOLOE-S_vehicle.y
 After executing the above command, PaddleX will validate the dataset and collect its basic information. Upon successful execution, the log will print the message `Check dataset passed !`. The validation result file will be saved in `./output/check_dataset_result.json`, and related outputs will be saved in the `./output/check_dataset` directory of the current directory. The output directory includes visualized example images and histograms of sample distributions.
 
 <details><summary>👉 <b>Details of validation results (click to expand)</b></summary>
-
 <p>The specific content of the validation result file is:</p>
 <pre><code class="language-bash">{
-  &quot;done_flag&quot;: true,
-  &quot;check_pass&quot;: true,
-  &quot;attributes&quot;: {
-    &quot;num_classes&quot;: 4,
-    &quot;train_samples&quot;: 500,
-    &quot;train_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/MVI_20011__img00001.jpg&quot;,
-      &quot;check_dataset/demo_img/MVI_20011__img00005.jpg&quot;,
-      &quot;check_dataset/demo_img/MVI_20011__img00009.jpg&quot;
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "num_classes": 4,
+    "train_samples": 500,
+    "train_sample_paths": [
+      "check_dataset/demo_img/MVI_20011__img00001.jpg",
+      "check_dataset/demo_img/MVI_20011__img00005.jpg",
+      "check_dataset/demo_img/MVI_20011__img00009.jpg"
     ],
-    &quot;val_samples&quot;: 100,
-    &quot;val_sample_paths&quot;: [
-      &quot;check_dataset/demo_img/MVI_20032__img00401.jpg&quot;,
-      &quot;check_dataset/demo_img/MVI_20032__img00405.jpg&quot;,
-      &quot;check_dataset/demo_img/MVI_20032__img00409.jpg&quot;
+    "val_samples": 100,
+    "val_sample_paths": [
+      "check_dataset/demo_img/MVI_20032__img00401.jpg",
+      "check_dataset/demo_img/MVI_20032__img00405.jpg",
+      "check_dataset/demo_img/MVI_20032__img00409.jpg"
     ]
   },
-  &quot;analysis&quot;: {
-    &quot;histogram&quot;: &quot;check_dataset/histogram.png&quot;
+  "analysis": {
+    "histogram": "check_dataset/histogram.png"
   },
-  &quot;dataset_path&quot;: &quot;vehicle_coco_examples&quot;,
-  &quot;show_type&quot;: &quot;image&quot;,
-  &quot;dataset_type&quot;: &quot;COCODetDataset&quot;
+  "dataset_path": "vehicle_coco_examples",
+  "show_type": "image",
+  "dataset_type": "COCODetDataset"
 }
 </code></pre>
 <p>In the above validation results, <code>check_pass</code> being <code>True</code> indicates that the dataset format meets the requirements. The explanations for other indicators are as follows:</p>
@@ -311,13 +309,12 @@ After executing the above command, PaddleX will validate the dataset and collect
 <li><code>attributes.val_sample_paths</code>: A list of relative paths to the visualized images of samples in the validation set of this dataset.</li>
 </ul>
 <p>The dataset validation also analyzes the distribution of sample counts across all classes in the dataset and generates a histogram (histogram.png) to visualize this distribution. </p>
-<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/vehicle_det/01.png"></p></details>
+<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/vehicle_det/01.png"/></p></details>
 
 #### 4.1.3 Dataset Format Conversion / Dataset Splitting (Optional)
 After completing the dataset verification, you can convert the dataset format or re-split the training/validation ratio by <b>modifying the configuration file</b> or <b>appending hyperparameters</b>.
 
 <details><summary>👉 <b>Details on Format Conversion / Dataset Splitting (Click to Expand)</b></summary>
-
 <p><b>(1) Dataset Format Conversion</b></p>
 <p>Vehicle detection does not support data format conversion.</p>
 <p><b>(2) Dataset Splitting</b></p>
@@ -369,7 +366,6 @@ The steps required are:
 Other related parameters can be set by modifying the `Global` and `Train` fields in the `.yaml` configuration file, or adjusted by appending parameters in the command line. For example, to specify training on the first two GPUs: `-o Global.device=gpu:0,1`; to set the number of training epochs to 10: `-o Train.epochs_iters=10`. For more modifiable parameters and their detailed explanations, refer to the [PaddleX Common Configuration Parameters for Model Tasks](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <ul>
 <li>During model training, PaddleX automatically saves model weight files, defaulting to <code>output</code>. To specify a save path, use the <code>-o Global.output</code> field in the configuration file.</li>
 <li>PaddleX shields you from the concepts of dynamic graph weights and static graph weights. During model training, both dynamic and static graph weights are produced, and static graph weights are selected by default for model inference.</li>
@@ -400,7 +396,6 @@ Similar to model training, the process involves the following steps:
 Other related parameters can be configured by modifying the fields under `Global` and `Evaluate` in the `.yaml` configuration file. For detailed information, please refer to the [PaddleX Common Configuration Parameters for Models](../../instructions/config_parameters_common.en.md).
 
 <details><summary>👉 <b>More Details (Click to Expand)</b></summary>
-
 <p>When evaluating the model, you need to specify the model weights file path. Each configuration file has a default weight save path built-in. If you need to change it, simply set it by appending a command line parameter, such as <code>-o Evaluate.weight_path=./output/best_model/best_model/model.pdparams</code>.</p>
 <p>After completing the model evaluation, an <code>evaluate_result.json</code> file will be generated, which records the evaluation results, specifically whether the evaluation task was completed successfully, and the model's evaluation metrics, including AP.</p></details>
 
@@ -435,3 +430,4 @@ Other related parameters can be set by modifying the fields under `Global` and `
 
 #### 4.4.2 Model Integration
 The weights you produced can be directly integrated into the vehicle detection module. You can refer to the Python example code in [Quick Integration](#iii-quick-integration), simply replace the model with the path to your trained model.

+ 2 - 3
docs/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.en.md

@@ -36,7 +36,7 @@ The document image orientation classification module is aim to distinguish the o
 
 ## III. Quick Integration
 
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Tutorial](../../../installation/installation.en.md)
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Tutorial](../../../installation/installation.en.md)
 
 After completing the installation of the wheel package, you can perform inference on the document image orientation classification module with just a few lines of code. You can switch models under this module at will, and you can also integrate the model inference of the document image orientation classification module into your project. Before running the following code, please download the [example image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/img_rot180_demo.jpg) to your local machine.
 
@@ -120,8 +120,7 @@ Related methods, parameters, and other explanations are as follows:
 <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
 <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/img_rot180_demo.jpg">Example</a></li>
 <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-<li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-<li><b>List</b>, elements of the list must be the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>

Changes are not displayed because the file size is too large.
+ 2 - 2
docs/module_usage/tutorials/ocr_modules/layout_detection.en.md


+ 2 - 2
docs/module_usage/tutorials/ocr_modules/layout_detection.md

@@ -51,7 +51,7 @@ comments: true
 <b>注:以上精度指标的评估集是 PaddleOCR 自建的版面区域检测数据集,包含中英文论文、杂志、合同、书本、试卷和研报等常见的 500 张文档类型图片。GPU 推理耗时基于 NVIDIA Tesla T4 机器,精度类型为 FP32, CPU 推理速度基于 Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz,线程数为 8,精度类型为 FP32。</b>
 
 
-&gt; ❗ 以上列出的是版面检测模块重点支持的<b>3个核心模型</b>,该模块总共支持<b>11个全量模型</b>,包含多个预定义了不同类别的模型,完整的模型列表如下:
+> ❗ 以上列出的是版面检测模块重点支持的<b>3个核心模型</b>,该模块总共支持<b>11个全量模型</b>,包含多个预定义了不同类别的模型,完整的模型列表如下:
 
 <details><summary> 👉模型列表详情</summary>
 
@@ -186,7 +186,7 @@ comments: true
 </details>
 
 ## 三、快速集成
-&gt; ❗ 在快速集成前,请先安装 PaddleX 的 wheel 包,详细请参考 [PaddleX本地安装教程](../../../installation/installation.md)
+> ❗ 在快速集成前,请先安装 PaddleX 的 wheel 包,详细请参考 [PaddleX本地安装教程](../../../installation/installation.md)
 
 完成whl包的安装后,几行代码即可完成版面区域检测模块的推理,可以任意切换该模块下的模型,您也可以将版面区域检测模块中的模型推理集成到您的项目中。运行以下代码前,请您下载[示例图片](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/layout.jpg)到本地。
 

+ 6 - 7
docs/module_usage/tutorials/ocr_modules/seal_text_detection.en.md

@@ -44,7 +44,7 @@ The seal text detection module typically outputs multi-point bounding boxes arou
 
 
 ## III. Quick Integration
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 
 Just a few lines of code can complete the inference of the Seal Text Detection module, allowing you to easily switch between models under this module. You can also integrate the model inference of the Seal Text Detection module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/seal_text_det.png) to your local machine.
@@ -197,12 +197,11 @@ The explanations of related methods and parameters are as follows:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-<li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-<li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
-<li><b>Local directory</b>, the directory must contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-<li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks, and the <code>val</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-<li><b>List</b>, the elements of the list must be the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL Link</b>, such as the web URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
+<li><b>Local Directory</b>, the directory should contain the data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/seal_text_detection.md

@@ -44,7 +44,7 @@ comments: true
 
 
 ## 三、快速集成
-&gt; ❗ 在快速集成前,请先安装 PaddleX 的 wheel 包,详细请参考 [PaddleX本地安装教程](../../../installation/installation.md)
+> ❗ 在快速集成前,请先安装 PaddleX 的 wheel 包,详细请参考 [PaddleX本地安装教程](../../../installation/installation.md)
 
 完成 wheel 包的安装后,几行代码即可完成印章文本检测模块的推理,可以任意切换该模块下的模型,您也可以将印章文本检测的模块中的模型推理集成到您的项目中。运行以下代码前,请您下载[示例图片](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/seal_text_det.png)到本地。
 

+ 1 - 2
docs/module_usage/tutorials/ocr_modules/table_cells_detection.en.md

@@ -152,8 +152,7 @@ The following is the explanation of the methods, parameters, etc.:
   <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
   <li><b>URL link</b>, such as the network URL of an image file: <a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg">Example</a></li>
   <li><b>Local directory</b>, the directory must contain files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks, and the <code>val</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-  <li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+  <li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 1 - 2
docs/module_usage/tutorials/ocr_modules/table_classification.en.md

@@ -107,8 +107,7 @@ The descriptions of the related methods and parameters are as follows:
   <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
   <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg">Example</a></li>
   <li><b>Local directory</b>, this directory should contain the data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for the table classification task, and the <code>val</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-  <li><b>List</b>, the elements of the list must be the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+  <li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 6 - 7
docs/module_usage/tutorials/ocr_modules/table_structure_recognition.en.md

@@ -42,7 +42,7 @@ SLANet_plus is an enhanced version of SLANet, a table structure recognition mode
 
 
 ## III. Quick Integration
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Guide](../../../installation/installation.en.md)
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to [PaddleX Local Installation Guide](../../../installation/installation.en.md)
 
 After installing the wheel package, a few lines of code can complete the inference of the table structure recognition module. You can easily switch models within this module and integrate the model inference into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg) to your local machine.
 
@@ -119,12 +119,11 @@ Relevant methods, parameters, and explanations are as follows:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-<li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-<li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg">Example</a></li>
-<li><b>Local directory</b>, this directory must contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-<li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>val</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-<li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL Link</b>, such as the web URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
+<li><b>Local Directory</b>, the directory should contain the data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 8 - 8
docs/module_usage/tutorials/ocr_modules/text_detection.en.md

@@ -40,7 +40,7 @@ The text detection module is a crucial component in OCR (Optical Character Recog
 </table>
 
 ## III. Quick Integration
-&gt; ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
+> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
 
 Just a few lines of code can complete the inference of the text detection module, allowing you to easily switch between models under this module. You can also integrate the model inference of the text detection module into your project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_001.png) to your local machine.
 
@@ -57,11 +57,12 @@ for res in output:
 After running, the result obtained is:
 
 ```bash
-{'res': {'input_path': 'general_ocr_001.png', 'dt_polys': [[[73, 553], [443, 541], [444, 574], [74, 585]], [[17, 507], [515, 489], [517, 534], [19, 552]], [[191, 458], [398, 449], [400, 481], [193, 490]], [[41, 413], [483, 390], [485, 431], [43, 453]]], 'dt_scores': [0.7555687038101032, 0.701620896397861, 0.8839516283528792, 0.8123399529333318]}}
+{'res': {'input_path': 'general_ocr_001.png', 'page_index': None, 'dt_polys': [[[73, 552], [453, 542], [454, 575], [74, 585]], [[17, 506], [515, 486], [517, 535], [19, 555]], [[189, 457], [398, 449], [399, 482], [190, 490]], [[41, 412], [484, 387], [486, 433], [43, 457]]], 'dt_scores': [0.7555687038101032, 0.701620896397861, 0.8839516283528792, 0.8123399529333318]}}
 ```
 
 The meanings of the running result parameters are as follows:
 - `input_path`: Indicates the path of the input image to be predicted.
+- `page_index`: If the input is a PDF file, it indicates which page of the PDF it is; otherwise, it is `None`.
 - `dt_polys`: Indicates the predicted text detection boxes, where each text detection box contains four vertices of a quadrilateral. Each vertex is a tuple representing the x and y coordinates of the vertex.
 - `dt_scores`: Indicates the confidence scores of the predicted text detection boxes.
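+
+These fields can also be read back programmatically (a minimal sketch, assuming `res.json` mirrors the <code>{'res': {...}}</code> layout printed above; the model name is only illustrative, substitute one of the models listed earlier on this page):
+
+```python
+from paddlex import create_model
+
+# illustrative model name; the demo image is the one referenced above
+model = create_model(model_name="PP-OCRv4_mobile_det")
+for res in model.predict("general_ocr_001.png", batch_size=1):
+    data = res.json["res"]
+    for poly, score in zip(data["dt_polys"], data["dt_scores"]):
+        print(f"score={score:.3f}, box={poly}")
+```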
 
@@ -176,12 +177,11 @@ Relevant methods, parameters, and explanations are as follows:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-<li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-<li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
-<li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-<li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-<li><b>List</b>, elements of the list must be the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL Link</b>, such as the web URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
+<li><b>Local Directory</b>, the directory should contain the data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 1 - 3
docs/module_usage/tutorials/ocr_modules/text_image_unwarping.en.md

@@ -115,9 +115,7 @@ Relevant methods, parameters, and explanations are as follows:
   <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
   <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
   <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
-  <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-  <li><b>List</b>, elements of the list must be the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+  <li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>, <code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 6 - 6
docs/module_usage/tutorials/ocr_modules/text_recognition.en.md

@@ -53,7 +53,7 @@ The text recognition module is the core component of an OCR (Optical Character R
 </table>
 <b>Note: The evaluation set for the above accuracy indicators is the Chinese dataset built by PaddleOCR, covering multiple scenarios such as street view, web images, documents, and handwriting, with 11,000 images included in text recognition. All models' GPU inference time is based on NVIDIA Tesla T4 machine, with precision type of FP32. CPU inference speed is based on Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz, with 8 threads and precision type of FP32.</b>
 
-&gt; ❗ The above list features the <b>4 core models</b> that the text recognition module primarily supports. In total, this module supports <b>18 models</b>. The complete list of models is as follows:
+> ❗ The above list features the <b>4 core models</b> that the text recognition module primarily supports. In total, this module supports <b>18 models</b>. The complete list of models is as follows:
 
 <details><summary> 👉Model List Details</summary>
 
@@ -313,11 +313,11 @@ In the above Python script, the following steps are executed:
 <td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
-<li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
-<li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_001.png">Example</a></li>
-<li><b>Local directory</b>, this directory should contain the data files to be predicted, such as the local path: <code>/root/data/</code></li>
-<li><b>List</b>, the elements of the list should be the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
+<li><b>File Path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
+<li><b>URL Link</b>, such as the web URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">Example</a></li>
+<li><b>Local Directory</b>, the directory should contain the data files to be predicted, such as the local path: <code>/root/data/</code></li>
+<li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 41 - 41
docs/module_usage/tutorials/ocr_modules/text_recognition.md

@@ -109,16 +109,16 @@ PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ec
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>识别 Avg Accuracy(%)</th>
-<th>GPU推理耗时(ms)</th>
-<th>CPU推理耗时</th>
+<th>GPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
+<th>CPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
 <th>模型存储大小(M)</th>
 <th>介绍</th>
 </tr>
 <tr>
 <td>ch_SVTRv2_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/ch_SVTRv2_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ch_SVTRv2_rec_pretrained.pdparams">训练模型</a></td>
 <td>68.81</td>
-<td>8.36801</td>
-<td>165.706</td>
+<td>8.08 / 8.08</td>
+<td>50.17 / 42.50</td>
 <td>73.9 M</td>
 <td rowspan="1">
 SVTRv2 是一种由复旦大学视觉与学习实验室(FVL)的OpenOCR团队研发的服务端文本识别模型,其在PaddleOCR算法模型挑战赛 - 赛题一:OCR端到端识别任务中荣获一等奖,A榜端到端识别精度相比PP-OCRv4提升6%。
@@ -130,16 +130,16 @@ SVTRv2 是一种由复旦大学视觉与学习实验室(FVL)的OpenOCR团队
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>识别 Avg Accuracy(%)</th>
-<th>GPU推理耗时(ms)</th>
-<th>CPU推理耗时</th>
+<th>GPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
+<th>CPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
 <th>模型存储大小(M)</th>
 <th>介绍</th>
 </tr>
 <tr>
 <td>ch_RepSVTR_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/ch_RepSVTR_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ch_RepSVTR_rec_pretrained.pdparams">训练模型</a></td>
 <td>65.07</td>
-<td>10.5047</td>
-<td>51.5647</td>
+<td>5.93 / 5.93</td>
+<td>20.73 / 7.32</td>
 <td>22.1 M</td>
 <td rowspan="1">    RepSVTR 文本识别模型是一种基于SVTRv2 的移动端文本识别模型,其在PaddleOCR算法模型挑战赛 - 赛题一:OCR端到端识别任务中荣获一等奖,B榜端到端识别精度相比PP-OCRv4提升2.5%,推理速度持平。</td>
 </tr>
@@ -151,8 +151,8 @@ SVTRv2 是一种由复旦大学视觉与学习实验室(FVL)的OpenOCR团队
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>识别 Avg Accuracy(%)</th>
-<th>GPU推理耗时(ms)</th>
-<th>CPU推理耗时</th>
+<th>GPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
+<th>CPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
 <th>模型存储大小(M)</th>
 <th>介绍</th>
 </tr>
@@ -160,8 +160,8 @@ SVTRv2 是一种由复旦大学视觉与学习实验室(FVL)的OpenOCR团队
 <td>en_PP-OCRv4_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 en_PP-OCRv4_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/en_PP-OCRv4_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td> 70.39</td>
-<td></td>
-<td></td>
+<td>4.81 / 4.81</td>
+<td>16.10 / 5.31</td>
 <td>6.8 M</td>
 <td>基于PP-OCRv4识别模型训练得到的超轻量英文识别模型,支持英文、数字识别</td>
 </tr>
@@ -169,8 +169,8 @@ en_PP-OCRv4_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model
 <td>en_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 en_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/en_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>70.69</td>
-<td></td>
-<td></td>
+<td>5.44 / 5.44</td>
+<td>8.65 / 5.57</td>
 <td>7.8 M </td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量英文识别模型,支持英文、数字识别</td>
 </tr>
@@ -182,8 +182,8 @@ en_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>识别 Avg Accuracy(%)</th>
-<th>GPU推理耗时(ms)</th>
-<th>CPU推理耗时</th>
+<th>GPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
+<th>CPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
 <th>模型存储大小(M)</th>
 <th>介绍</th>
 </tr>
@@ -191,8 +191,8 @@ en_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model
 <td>korean_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 korean_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/korean_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>60.21</td>
-<td></td>
-<td></td>
+<td>5.40 / 5.40</td>
+<td>9.11 / 4.05</td>
 <td>8.6 M</td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量韩文识别模型,支持韩文、数字识别</td>
 </tr>
@@ -200,8 +200,8 @@ korean_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-m
 <td>japan_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 japan_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/japan_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>45.69</td>
-<td></td>
-<td></td>
+<td>5.70 / 5.70</td>
+<td>8.48 / 4.07</td>
 <td>8.8 M </td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量日文识别模型,支持日文、数字识别</td>
 </tr>
@@ -209,8 +209,8 @@ japan_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-mo
 <td>chinese_cht_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 chinese_cht_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/chinese_cht_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>82.06</td>
-<td></td>
-<td></td>
+<td>5.90 / 5.90</td>
+<td>9.28 / 4.34</td>
 <td>9.7 M </td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量繁体中文识别模型,支持繁体中文、数字识别</td>
 </tr>
@@ -218,8 +218,8 @@ chinese_cht_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://pad
 <td>te_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 te_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/te_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>95.88</td>
-<td></td>
-<td></td>
+<td>5.42 / 5.42</td>
+<td>8.10 / 6.91</td>
 <td>7.8 M </td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量泰卢固文识别模型,支持泰卢固文、数字识别</td>
 </tr>
@@ -227,8 +227,8 @@ te_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model
 <td>ka_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 ka_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ka_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>96.96</td>
-<td></td>
-<td></td>
+<td>5.25 / 5.25</td>
+<td>9.09 / 3.86</td>
 <td>8.0 M </td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量卡纳达文识别模型,支持卡纳达文、数字识别</td>
 </tr>
@@ -236,8 +236,8 @@ ka_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model
 <td>ta_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 ta_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ta_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>76.83</td>
-<td></td>
-<td></td>
+<td>5.23 / 5.23</td>
+<td>10.13 / 4.30</td>
 <td>8.0 M </td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量泰米尔文识别模型,支持泰米尔文、数字识别</td>
 </tr>
@@ -245,8 +245,8 @@ ta_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model
 <td>latin_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 latin_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/latin_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>76.93</td>
-<td></td>
-<td></td>
+<td>5.20 / 5.20</td>
+<td>8.83 / 7.15</td>
 <td>7.8 M</td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量拉丁文识别模型,支持拉丁文、数字识别</td>
 </tr>
@@ -254,8 +254,8 @@ latin_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-mo
 <td>arabic_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 arabic_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/arabic_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>73.55</td>
-<td></td>
-<td></td>
+<td>5.35 / 5.35</td>
+<td>8.80 / 4.56</td>
 <td>7.8 M</td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量阿拉伯字母识别模型,支持阿拉伯字母、数字识别</td>
 </tr>
@@ -263,8 +263,8 @@ arabic_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-m
 <td>cyrillic_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 cyrillic_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/cyrillic_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>94.28</td>
-<td></td>
-<td></td>
+<td>5.23 / 5.23</td>
+<td>8.89 / 3.88</td>
 <td>7.9 M  </td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量斯拉夫字母识别模型,支持斯拉夫字母、数字识别</td>
 </tr>
@@ -272,8 +272,8 @@ cyrillic_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle
 <td>devanagari_PP-OCRv3_mobile_rec</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/\
 devanagari_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/devanagari_PP-OCRv3_mobile_rec_pretrained.pdparams">训练模型</a></td>
 <td>96.44</td>
-<td></td>
-<td></td>
+<td>5.22 / 5.22</td>
+<td>8.56 / 4.06</td>
 <td>7.9 M</td>
 <td>基于PP-OCRv3识别模型训练得到的超轻量梵文字母识别模型,支持梵文字母、数字识别</td>
 </tr>
@@ -366,11 +366,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
-  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
+<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
+<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>,<code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 1 - 2
docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.en.md

@@ -119,8 +119,7 @@ The explanations for the methods, parameters, etc., are as follows:
   <li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li>
   <li><b>URL link</b>, such as the network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg">Example</a></li>
   <li><b>Local directory</b>, the directory should contain data files to be predicted, such as the local path: <code>/root/data/</code></li>
-  <li><b>Dictionary</b>, the <code>key</code> of the dictionary must correspond to the specific task, such as <code>"img"</code> for image classification tasks. The <code>value</code> of the dictionary supports the above types of data, for example: <code>{"img": "/root/data1"}</code></li>
-  <li><b>List</b>, elements of the list must be of the above types of data, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li>
+  <li><b>List</b>, the elements of the list should be of the above-mentioned data types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>None</td>

+ 44 - 0
docs/pipeline_usage/pipeline_develop_guide.en.md

@@ -233,6 +233,26 @@ Choose the appropriate deployment method for your model pipeline based on your n
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.html">Image Anomaly Detection Pipeline Usage Tutorial</a></td>
 </tr>
 <tr>
+<td>Human Keypoint Detection</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.html">Human Keypoint Detection Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
+<td>Open Vocabulary Detection</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_detection.html">Open Vocabulary Detection Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
+<td>Open Vocabulary Segmentation</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_segmentation.html">Open Vocabulary Segmentation Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
+<td>Rotated Object Detection</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.html">Rotated Object Detection Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
+<td>3D BEV Detection</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/cv_pipelines/3d_bev_detection.html">3D BEV Detection Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
 <td>OCR</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/OCR.html">OCR Pipeline Usage Tutorial</a></td>
 </tr>
@@ -241,10 +261,18 @@ Choose the appropriate deployment method for your model pipeline based on your n
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/table_recognition.html">Table Recognition Pipeline Usage Tutorial</a></td>
 </tr>
 <tr>
+<td>Table Recognition v2</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html">Table Recognition v2 Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
 <td>Layout Parsing</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html">Layout Parsing Pipeline Usage Tutorial</a></td>
 </tr>
 <tr>
+<td>Layout Parsing v2</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/layout_parsing_v2.html">Layout Parsing v2 Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
 <td>Formula Recognition</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/formula_recognition.html">Formula Recognition Pipeline Usage Tutorial</a></td>
 </tr>
@@ -253,6 +281,10 @@ Choose the appropriate deployment method for your model pipeline based on your n
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/seal_recognition.html">Seal Recognition Pipeline Usage Tutorial</a></td>
 </tr>
 <tr>
+<td>Document Image Preprocessing</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.html">Document Image Preprocessing Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
 <td>Time Series Forecasting</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/time_series_pipelines/time_series_forecasting.html">Time Series Forecasting Pipeline Usage Tutorial</a></td>
 </tr>
@@ -264,5 +296,17 @@ Choose the appropriate deployment method for your model pipeline based on your n
 <td>Time Series Classification</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/time_series_pipelines/time_series_classification.html">Time Series Classification Pipeline Usage Tutorial</a></td>
 </tr>
+<tr>
+<td>Multilingual Speech Recognition</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/speech_pipelines/multilingual_speech_recognition.html">Multilingual Speech Recognition Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
+<td>Video Classification</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/video_pipelines/video_classification.html">Video Classification Pipeline Usage Tutorial</a></td>
+</tr>
+<tr>
+<td>Video Detection</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/video_pipelines/video_detection.html">Video Detection Pipeline Usage Tutorial</a></td>
+</tr>
 </tbody>
 </table>
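As a quick orientation, each pipeline in the table above is created and run through the same Python entry point; only the registered pipeline name differs (refer to the linked tutorial for the exact name and parameters). A minimal sketch:

```python
# Minimal sketch: the pipeline name ("OCR" here) is the registered name from
# the corresponding tutorial; the input image is a placeholder.
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="OCR")
output = pipeline.predict("general_ocr_002.png")

for res in output:
    res.print()
    res.save_to_json("./output/")
```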

+ 44 - 0
docs/pipeline_usage/pipeline_develop_guide.md

@@ -235,6 +235,26 @@ Pipeline:
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.html">图像异常检测产线使用教程</a></td>
 </tr>
 <tr>
+<td>人体关键点检测</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.html">人体关键点检测产线使用教程</a></td>
+</tr>
+<tr>
+<td>开放词汇检测</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_detection.html">开放词汇检测产线使用教程</a></td>
+</tr>
+<tr>
+<td>开放词汇分割</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/open_vocabulary_segmentation.html">开放词汇分割产线使用教程</a></td>
+</tr>
+<tr>
+<td>旋转目标检测</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.html">旋转目标检测产线使用教程</a></td>
+</tr>
+<tr>
+<td>3D多模态融合检测</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/cv_pipelines/3d_bev_detection.html">3D多模态融合检测产线使用教程</a></td>
+</tr>
+<tr>
 <td>通用OCR</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/OCR.html">通用OCR产线使用教程</a></td>
 </tr>
@@ -243,10 +263,18 @@ Pipeline:
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/table_recognition.html">通用表格识别产线使用教程</a></td>
 </tr>
 <tr>
+<td>通用表格识别v2</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html">通用表格识别产线v2使用教程</a></td>
+</tr>
+<tr>
 <td>通用版面解析</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html">通用版面解析产线使用教程</a></td>
 </tr>
 <tr>
+<td>通用版面解析v2</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/layout_parsing_v2.html">通用版面解析v2产线使用教程</a></td>
+</tr>
+<tr>
 <td>公式识别</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/formula_recognition.html">公式识别产线使用教程</a></td>
 </tr>
@@ -255,6 +283,10 @@ Pipeline:
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/seal_recognition.html">印章文本识别产线使用教程</a></td>
 </tr>
 <tr>
+<td>文档图像预处理</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.html">文档图像预处理产线使用教程</a></td>
+</tr>
+<tr>
 <td>时序预测</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/time_series_pipelines/time_series_forecasting.html">通用时序预测产线使用教程</a></td>
 </tr>
@@ -266,5 +298,17 @@ Pipeline:
 <td>时序分类</td>
 <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/time_series_pipelines/time_series_classification.html">通用时序分类产线使用教程</a></td>
 </tr>
+<tr>
+<td>多语种语音识别</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/speech_pipelines/multilingual_speech_recognition.html">多语种语音识别产线使用教程</a></td>
+</tr>
+<tr>
+<td>通用视频分类</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/video_pipelines/video_classification.html">通用视频分类产线使用教程</a></td>
+</tr>
+<tr>
+<td>通用视频检测</td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/video_pipelines/video_detection.html">通用视频检测产线使用教程</a></td>
+</tr>
 </tbody>
 </table>

+ 83 - 91
docs/pipeline_usage/tutorials/cv_pipelines/face_recognition.en.md

@@ -9,8 +9,7 @@ Face recognition is a crucial component in the field of computer vision, aiming
 
 The face recognition pipeline is an end-to-end system dedicated to solving face detection and recognition tasks. It can quickly and accurately locate face regions in images, extract facial features, and retrieve and compare them with pre-established features in a feature database to confirm identity information.
 
-<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/face_recognition/01.png">
-
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/face_recognition/01.png"/>
 <b>The face recognition pipeline includes a face detection module and a face feature module</b>, with several models in each module. Which models to use can be selected based on the benchmark data below. <b>If you prioritize model accuracy, choose models with higher accuracy; if you prioritize inference speed, choose models with faster inference; if you prioritize model size, choose models with smaller storage requirements</b>.
 
 
@@ -20,8 +19,8 @@ The face recognition pipeline is an end-to-end system dedicated to solving face
 <tr>
 <th>Model</th><th>Model Download Link</th>
 <th>AP (%)<br/>Easy/Medium/Hard</th>
-<th>GPU Inference Time (ms)</th>
-<th>CPU Inference Time (ms)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Size (M)</th>
 <th>Description</th>
 </tr>
@@ -30,32 +29,32 @@ The face recognition pipeline is an end-to-end system dedicated to solving face
 <tr>
 <td>BlazeFace</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/BlazeFace_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/BlazeFace_pretrained.pdparams">Trained Model</a></td>
 <td>77.7/73.4/49.5</td>
-<td></td>
-<td></td>
+<td>60.34 / 54.76</td>
+<td>84.18 / 84.18</td>
 <td>0.447</td>
 <td>A lightweight and efficient face detection model</td>
 </tr>
 <tr>
 <td>BlazeFace-FPN-SSH</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/BlazeFace-FPN-SSH_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/BlazeFace-FPN-SSH_pretrained.pdparams">Trained Model</a></td>
 <td>83.2/80.5/60.5</td>
-<td>52.4</td>
-<td>73.2</td>
+<td>69.29 / 63.42</td>
+<td>86.96 / 86.96</td>
 <td>0.606</td>
 <td>Improved BlazeFace with FPN and SSH structures</td>
 </tr>
 <tr>
 <td>PicoDet_LCNet_x2_5_face</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet_LCNet_x2_5_face_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PicoDet_LCNet_x2_5_face_pretrained.pdparams">Trained Model</a></td>
 <td>93.7/90.7/68.1</td>
-<td>33.7</td>
-<td>185.1</td>
+<td>35.37 / 12.88</td>
+<td>126.24 / 126.24</td>
 <td>28.9</td>
 <td>Face detection model based on PicoDet_LCNet_x2_5</td>
 </tr>
 <tr>
 <td>PP-YOLOE_plus-S_face</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-YOLOE_plus-S_face_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-YOLOE_plus-S_face_pretrained.pdparams">Trained Model</a></td>
 <td>93.9/91.8/79.8</td>
-<td>25.8</td>
-<td>159.9</td>
+<td>22.54 / 8.33</td>
+<td>138.67 / 138.67</td>
 <td>26.5</td>
 <td>Face detection model based on PP-YOLOE_plus-S</td>
 </tr>
@@ -106,14 +105,14 @@ The pre-trained model pipelines provided by PaddleX can be quickly experienced.
Online experience is not supported at the moment.
 
 ### 2.2 Local Experience
-> ❗ Before using the face recognition pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the [PaddleX Installation Guide](../../../installation/installation.en.md).
+&gt; ❗ Before using the face recognition pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the [PaddleX Installation Guide](../../../installation/installation.en.md).
 
 #### 2.2.1 Command Line Experience
 
 Command line experience is not supported yet.
 
 #### 2.2.2 Python Script Integration
-Please download the [test image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/friends1.jpg) for testing.</url>
+Please download the [test image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/friends1.jpg) for testing.
In the example run of this pipeline, you need to pre-build a face feature library. You can use the official demo data below to build it.
Run the following command to download the demo dataset to a specified folder:
 
@@ -201,8 +200,8 @@ In the above Python script, the following steps are performed:
 <td><code>str</code>|<code>list</code></td>
 <td>
 <ul>
-  <li><b>str</b>: The root directory of the images, data organization method refers to <a href="#2.3-构建特征库的数据组织方式">Section 2.3 Data Organization Method for Building Feature Library</a></li>
-  <li><b>List[numpy.ndarray]</b>: List of numpy.array type base library image data</li>
+<li><b>str</b>: The root directory of the images, data organization method refers to <a href="#2.3-构建特征库的数据组织方式">Section 2.3 Data Organization Method for Building Feature Library</a></li>
+<li><b>List[numpy.ndarray]</b>: List of numpy.array type base library image data</li>
 </ul>
 </td>
 <td>None</td>
@@ -213,8 +212,8 @@ In the above Python script, the following steps are performed:
 <td><code>str|list</code></td>
 <td>
 <ul>
-  <li><b>str</b>: The path to the annotation file, the data organization method is the same as when building the feature library, refer to <a href="#2.3-构建特征库的数据组织方式">Section 2.3 Data Organization Method for Building Feature Library</a></li>
-  <li><b>List[str]</b>: List of str type base library image annotations</li>
+<li><b>str</b>: The path to the annotation file, the data organization method is the same as when building the feature library, refer to <a href="#2.3-构建特征库的数据组织方式">Section 2.3 Data Organization Method for Building Feature Library</a></li>
+<li><b>List[str]</b>: List of str type base library image annotations</li>
 </ul>
 </td>
 <td>None</td>
@@ -225,8 +224,8 @@ In the above Python script, the following steps are performed:
 <td><code>str</code></td>
 <td>
 <ul>
-  <li><code>"IP"</code>: Inner Product</li>
-  <li><code>"L2"</code>: Euclidean Distance</li>
+<li><code>"IP"</code>: Inner Product</li>
+<li><code>"L2"</code>: Euclidean Distance</li>
 </ul>
 </td>
 <td><code>"IP"</code></td>
@@ -237,9 +236,9 @@ In the above Python script, the following steps are performed:
 <td><code>str</code></td>
 <td>
 <ul>
-  <li><code>"HNSW32"</code>: Fast retrieval speed and high accuracy, but does not support <code>remove_index()</code> operation</li>
-  <li><code>"IVF"</code>: Fast retrieval speed but relatively low accuracy, supports <code>append_index()</code> and <code>remove_index()</code> operations</li>
-  <li><code>"Flat"</code>: Low retrieval speed and high accuracy, supports <code>append_index()</code> and <code>remove_index()</code> operations</li>
+<li><code>"HNSW32"</code>: Fast retrieval speed and high accuracy, but does not support <code>remove_index()</code> operation</li>
+<li><code>"IVF"</code>: Fast retrieval speed but relatively low accuracy, supports <code>append_index()</code> and <code>remove_index()</code> operations</li>
+<li><code>"Flat"</code>: Low retrieval speed and high accuracy, supports <code>append_index()</code> and <code>remove_index()</code> operations</li>
 </ul>
 </td>
 <td><code>"HNSW32"</code></td>
@@ -286,9 +285,9 @@ In the above Python script, the following steps are performed:
 <td><code>Python Var|str|list</code></td>
 <td>
 <ul>
-  <li><b>Python Var</b>: Image data represented by <code>numpy.ndarray</code></li>
-  <li><b>str</b>: Local path of an image file, such as <code>/root/data/img.jpg</code>; <b>URL link</b>, such as a network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png">Example</a>; <b>Local directory</b>, which should contain images to be predicted, such as <code>/root/data/</code></li>
-  <li><b>List</b>: Elements of the list must be of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
+<li><b>Python Var</b>: Image data represented by <code>numpy.ndarray</code></li>
+<li><b>str</b>: Local path of an image file, such as <code>/root/data/img.jpg</code>; <b>URL link</b>, such as a network URL of an image file: <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png">Example</a>; <b>Local directory</b>, which should contain images to be predicted, such as <code>/root/data/</code></li>
+<li><b>List</b>: Elements of the list must be of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code></li>
 </ul>
 </td>
 <td>None</td>
@@ -299,8 +298,8 @@ In the above Python script, the following steps are performed:
 <td><code>str|paddlex.inference.components.retrieval.faiss.IndexData|None</code></td>
 <td>
 <ul>
-    <li><b>str</b> type representing a directory (which should contain feature library files, including <code>vector.index</code> and <code>index_info.yaml</code>)</li>
-    <li><b>IndexData</b> object created by the <code>build_index</code> method</li>
+<li><b>str</b> type representing a directory (which should contain feature library files, including <code>vector.index</code> and <code>index_info.yaml</code>)</li>
+<li><b>IndexData</b> object created by the <code>build_index</code> method</li>
 </ul>
 </td>
 <td><code>None</code></td>
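Putting the index-building and prediction parameters above together, a minimal end-to-end sketch might look as follows. The parameter names (`gallery_imgs`, `gallery_label`, `index`) and the `index_data.save()` call are assumed to match this tutorial's example script, and the demo paths come from the dataset downloaded earlier:

```python
# Hedged sketch of building a face feature library and querying it; parameter
# names and paths are assumptions based on this tutorial's example script.
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="face_recognition")

index_data = pipeline.build_index(
    gallery_imgs="face_demo_gallery",
    gallery_label="face_demo_gallery/gallery.txt",
    metric_type="IP",
    index_type="HNSW32",  # prefer "Flat" on Windows, where HNSW32 may fail to build or load
)
index_data.save("face_index")

output = pipeline.predict("friends1.jpg", index=index_data)
for res in output:
    res.print()
    res.save_to_img("./output/")
```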
@@ -387,7 +386,7 @@ In the above Python script, the following steps are performed:
 - Calling the `save_to_json()` method will save the above content to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}_res.json`. If a file is specified, it will be saved directly to that file.
- Calling the `save_to_img()` method will save the visualization result to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}_res.{your_img_extension}`. If a file is specified, it will be saved directly to that file. (This pipeline usually produces multiple result images, so it is not recommended to specify a specific file path directly; otherwise later images will overwrite earlier ones and only the last one will be kept.) In the example above, the visualization result is as follows:
 
-<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/face_recognition/02.jpg">
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/face_recognition/02.jpg"/>
 
* In addition, the visualized image and the prediction results can also be obtained through the following attributes:
 
@@ -399,12 +398,12 @@ In the above Python script, the following steps are performed:
 </tr>
 </thead>
 <tr>
-<td rowspan = "1"><code>json</code></td>
-<td rowspan = "1">Get the prediction result in <code>json</code> format.</td>
+<td rowspan="1"><code>json</code></td>
+<td rowspan="1">Get the prediction result in <code>json</code> format.</td>
 </tr>
 <tr>
-<td rowspan = "2"><code>img</code></td>
-<td rowspan = "2">Get the visualized image in <code>dict</code> format.</td>
+<td rowspan="2"><code>img</code></td>
+<td rowspan="2">Get the visualized image in <code>dict</code> format.</td>
 </tr>
 </table>
 
@@ -467,8 +466,8 @@ The parameters of the above method are described as follows:
 <td><code>str</code>|<code>list</code></td>
 <td>
 <ul>
-  <li><b>str</b>: Root directory of images, data organization refers to <a href="#2.3-Data Organization for Building the Feature Library">Section 2.3 Data Organization for Building the Feature Library</a></li>
-  <li><b>List[numpy.ndarray]</b>: Gallery image data in the form of a list of numpy arrays</li>
+<li><b>str</b>: Root directory of images, data organization refers to <a href="#2.3-Data Organization for Building the Feature Library">Section 2.3 Data Organization for Building the Feature Library</a></li>
+<li><b>List[numpy.ndarray]</b>: Gallery image data in the form of a list of numpy arrays</li>
 </ul>
 </td>
 <td>None</td>
@@ -479,8 +478,8 @@ The parameters of the above method are described as follows:
 <td><code>str|list</code></td>
 <td>
 <ul>
-  <li><b>str</b>: Path to the label file, data organization is the same as when building the feature library, refer to <a href="#2.3-Data Organization for Building the Feature Library">Section 2.3 Data Organization for Building the Feature Library</a></li>
-  <li><b>List[str]</b>: Gallery image labels in the form of a list of strings</li>
+<li><b>str</b>: Path to the label file, data organization is the same as when building the feature library, refer to <a href="#2.3-Data Organization for Building the Feature Library">Section 2.3 Data Organization for Building the Feature Library</a></li>
+<li><b>List[str]</b>: Gallery image labels in the form of a list of strings</li>
 </ul>
 </td>
 <td>None</td>
@@ -491,8 +490,8 @@ The parameters of the above method are described as follows:
 <td><code>str</code></td>
 <td>
 <ul>
-  <li><code>"IP"</code>: Inner Product</li>
-  <li><code>"L2"</code>: Euclidean Distance</li>
+<li><code>"IP"</code>: Inner Product</li>
+<li><code>"L2"</code>: Euclidean Distance</li>
 </ul>
 </td>
 <td><code>"IP"</code></td>
@@ -503,9 +502,9 @@ The parameters of the above method are described as follows:
 <td><code>str</code></td>
 <td>
 <ul>
-  <li><code>"HNSW32"</code>: Faster search speed and higher accuracy, but does not support <code>remove_index()</code> operation</li>
-  <li><code>"IVF"</code>: Faster search speed but relatively lower accuracy, supports <code>append_index()</code> and <code>remove_index()</code> operations</li>
-  <li><code>"Flat"</code>: Slower search speed but higher accuracy, supports <code>append_index()</code> and <code>remove_index()</code> operations</li>
+<li><code>"HNSW32"</code>: Faster search speed and higher accuracy, but does not support <code>remove_index()</code> operation</li>
+<li><code>"IVF"</code>: Faster search speed but relatively lower accuracy, supports <code>append_index()</code> and <code>remove_index()</code> operations</li>
+<li><code>"Flat"</code>: Slower search speed but higher accuracy, supports <code>append_index()</code> and <code>remove_index()</code> operations</li>
 </ul>
 </td>
 <td><code>"HNSW32"</code></td>
@@ -527,15 +526,14 @@ The parameters of the above method are described as follows:
 <td><code>str|paddlex.inference.components.retrieval.faiss.IndexData</code></td>
 <td>
 <ul>
-    <li><b>str</b>: Directory (the directory should contain feature library files, including <code>vector.index</code> and <code>index_info.yaml</code>)</li>
-    <li><b>IndexData</b> object created by <code>build_index</code> method</li>
+<li><b>str</b>: Directory (the directory should contain feature library files, including <code>vector.index</code> and <code>index_info.yaml</code>)</li>
+<li><b>IndexData</b> object created by <code>build_index</code> method</li>
 </ul>
 </td>
 <td>None</td>
 </tr>
 </tbody>
 </table>
-
 <b>Note</b>: <code>HNSW32</code> has compatibility issues on the Windows platform, which may prevent the index library from being built or loaded.
 
 ### 2.3 Data Organization for Building the Feature Library
@@ -570,7 +568,6 @@ Additionally, PaddleX provides three other deployment methods, detailed as follo
 Below is the API reference for basic service deployment and multi-language service call examples:
 
 <details><summary>API Reference</summary>
-
 <p>For the main operations provided by the service:</p>
 <ul>
 <li>The HTTP request method is POST.</li>
@@ -710,7 +707,6 @@ Below is the API reference for basic service deployment and multi-language servi
 </tr>
 </tbody>
 </table>
-
 <ul>
 <li><b><code>addImagesToIndex</code></b></li>
 </ul>
@@ -941,89 +937,85 @@ Below is the API reference for basic service deployment and multi-language servi
 </tbody>
 </table>
 </details>
-
 <details><summary>Multi-language Service Call Example</summary>
-
 <details>
 <summary>Python</summary>
-
-
 <pre><code class="language-python">import base64
 import pprint
 import sys
 
 import requests
 
-API_BASE_URL = &quot;http://0.0.0.0:8080&quot;
+API_BASE_URL = "http://0.0.0.0:8080"
 
 base_image_label_pairs = [
-    {&quot;image&quot;: &quot;./demo0.jpg&quot;, &quot;label&quot;: &quot;ID0&quot;},
-    {&quot;image&quot;: &quot;./demo1.jpg&quot;, &quot;label&quot;: &quot;ID1&quot;},
-    {&quot;image&quot;: &quot;./demo2.jpg&quot;, &quot;label&quot;: &quot;ID2&quot;},
+    {"image": "./demo0.jpg", "label": "ID0"},
+    {"image": "./demo1.jpg", "label": "ID1"},
+    {"image": "./demo2.jpg", "label": "ID2"},
 ]
 image_label_pairs_to_add = [
-    {&quot;image&quot;: &quot;./demo3.jpg&quot;, &quot;label&quot;: &quot;ID2&quot;},
+    {"image": "./demo3.jpg", "label": "ID2"},
 ]
 ids_to_remove = [1]
-infer_image_path = &quot;./demo4.jpg&quot;
-output_image_path = &quot;./out.jpg&quot;
+infer_image_path = "./demo4.jpg"
+output_image_path = "./out.jpg"
 
 for pair in base_image_label_pairs:
-    with open(pair[&quot;image&quot;], &quot;rb&quot;) as file:
+    with open(pair["image"], "rb") as file:
         image_bytes = file.read()
-        image_data = base64.b64encode(image_bytes).decode(&quot;ascii&quot;)
-    pair[&quot;image&quot;] = image_data
+        image_data = base64.b64encode(image_bytes).decode("ascii")
+    pair["image"] = image_data
 
-payload = {&quot;imageLabelPairs&quot;: base_image_label_pairs}
-resp_index_build = requests.post(f&quot;{API_BASE_URL}/face-recognition-index-build&quot;, json=payload)
+payload = {"imageLabelPairs": base_image_label_pairs}
+resp_index_build = requests.post(f"{API_BASE_URL}/face-recognition-index-build", json=payload)
 if resp_index_build.status_code != 200:
-    print(f&quot;Request to face-recognition-index-build failed with status code {resp_index_build}.&quot;)
+    print(f"Request to face-recognition-index-build failed with status code {resp_index_build}.")
     pprint.pp(resp_index_build.json())
     sys.exit(1)
-result_index_build = resp_index_build.json()[&quot;result&quot;]
-print(f&quot;Number of images indexed: {len(result_index_build['idMap'])}&quot;)
+result_index_build = resp_index_build.json()["result"]
+print(f"Number of images indexed: {len(result_index_build['idMap'])}")
 
 for pair in image_label_pairs_to_add:
-    with open(pair[&quot;image&quot;], &quot;rb&quot;) as file:
+    with open(pair["image"], "rb") as file:
         image_bytes = file.read()
-        image_data = base64.b64encode(image_bytes).decode(&quot;ascii&quot;)
-    pair[&quot;image&quot;] = image_data
+        image_data = base64.b64encode(image_bytes).decode("ascii")
+    pair["image"] = image_data
 
-payload = {&quot;imageLabelPairs&quot;: image_label_pairs_to_add, &quot;indexKey&quot;: result_index_build[&quot;indexKey&quot;]}
-resp_index_add = requests.post(f&quot;{API_BASE_URL}/face-recognition-index-add&quot;, json=payload)
+payload = {"imageLabelPairs": image_label_pairs_to_add, "indexKey": result_index_build["indexKey"]}
+resp_index_add = requests.post(f"{API_BASE_URL}/face-recognition-index-add", json=payload)
 if resp_index_add.status_code != 200:
-    print(f&quot;Request to face-recognition-index-add failed with status code {resp_index_add}.&quot;)
+    print(f"Request to face-recognition-index-add failed with status code {resp_index_add}.")
     pprint.pp(resp_index_add.json())
     sys.exit(1)
-result_index_add = resp_index_add.json()[&quot;result&quot;]
-print(f&quot;Number of images indexed: {len(result_index_add['idMap'])}&quot;)
+result_index_add = resp_index_add.json()["result"]
+print(f"Number of images indexed: {len(result_index_add['idMap'])}")
 
-payload = {&quot;ids&quot;: ids_to_remove, &quot;indexKey&quot;: result_index_build[&quot;indexKey&quot;]}
-resp_index_remove = requests.post(f&quot;{API_BASE_URL}/face-recognition-index-remove&quot;, json=payload)
+payload = {"ids": ids_to_remove, "indexKey": result_index_build["indexKey"]}
+resp_index_remove = requests.post(f"{API_BASE_URL}/face-recognition-index-remove", json=payload)
 if resp_index_remove.status_code != 200:
-    print(f&quot;Request to face-recognition-index-remove failed with status code {resp_index_remove}.&quot;)
+    print(f"Request to face-recognition-index-remove failed with status code {resp_index_remove}.")
     pprint.pp(resp_index_remove.json())
     sys.exit(1)
-result_index_remove = resp_index_remove.json()[&quot;result&quot;]
-print(f&quot;Number of images indexed: {len(result_index_remove['idMap'])}&quot;)
+result_index_remove = resp_index_remove.json()["result"]
+print(f"Number of images indexed: {len(result_index_remove['idMap'])}")
 
-with open(infer_image_path, &quot;rb&quot;) as file:
+with open(infer_image_path, "rb") as file:
     image_bytes = file.read()
-    image_data = base64.b64encode(image_bytes).decode(&quot;ascii&quot;)
+    image_data = base64.b64encode(image_bytes).decode("ascii")
 
-payload = {&quot;image&quot;: image_data, &quot;indexKey&quot;: result_index_build[&quot;indexKey&quot;]}
-resp_infer = requests.post(f&quot;{API_BASE_URL}/face-recognition-infer&quot;, json=payload)
+payload = {"image": image_data, "indexKey": result_index_build["indexKey"]}
+resp_infer = requests.post(f"{API_BASE_URL}/face-recognition-infer", json=payload)
 if resp_infer.status_code != 200:
-    print(f&quot;Request to face-recogntion-infer failed with status code {resp_infer}.&quot;)
+    print(f"Request to face-recogntion-infer failed with status code {resp_infer}.")
     pprint.pp(resp_infer.json())
     sys.exit(1)
-result_infer = resp_infer.json()[&quot;result&quot;]
+result_infer = resp_infer.json()["result"]
 
-with open(output_image_path, &quot;wb&quot;) as file:
-    file.write(base64.b64decode(result_infer[&quot;image&quot;]))
-print(f&quot;Output image saved at {output_image_path}&quot;)
-print(&quot;\nDetected faces:&quot;)
-pprint.pp(result_infer[&quot;faces&quot;])
+with open(output_image_path, "wb") as file:
+    file.write(base64.b64decode(result_infer["image"]))
+print(f"Output image saved at {output_image_path}")
+print("\nDetected faces:")
+pprint.pp(result_infer["faces"])
 </code></pre>
 </details>
 </details>
@@ -1075,7 +1067,7 @@ from paddlex import create_pipeline
 
 pipeline = create_pipeline(
     pipeline="face_recognition",
-    device="npu:0" # gpu:0 --> npu:0
+    device="npu:0" # gpu:0 --&gt; npu:0
     )
 ```
 

+ 2 - 2
docs/pipeline_usage/tutorials/cv_pipelines/general_image_recognition.en.md

@@ -84,7 +84,7 @@ Not supported yet.
 
 ### 2.2 Local Experience
 
-&gt; ❗ Before using the general image recognition pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the [PaddleX Installation Guide](../../../installation/installation.en.md).
+> ❗ Before using the general image recognition pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the [PaddleX Installation Guide](../../../installation/installation.en.md).
 
 #### 2.2.1 Command Line Experience
 
@@ -1032,7 +1032,7 @@ from paddlex import create_pipeline
 
 pipeline = create_pipeline(
     pipeline="PP-ShiTuV2",
-    device="npu:0" # gpu:0 --&gt; npu:0
+    device="npu:0" # gpu:0 --> npu:0
     )
 ```
 

+ 1 - 1
docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.en.md

@@ -87,7 +87,7 @@ Not supported for online experience.
 
 ### 2.2 Local Experience
 
-&gt; ❗ Before using the Human Keypoint Detection Pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the [PaddleX Installation Guide](../../../installation/installation.en.md).
+> ❗ Before using the Human Keypoint Detection Pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the [PaddleX Installation Guide](../../../installation/installation.en.md).
 
 #### 2.2.1 Command Line Experience
 

+ 20 - 5
docs/pipeline_usage/tutorials/cv_pipelines/image_classification.en.md

@@ -5,7 +5,7 @@ comments: true
 # General Image Classification Pipeline Tutorial
 
 ## 1. Introduction to the General Image Classification Pipeline
-Image classification is a technique that assigns images to predefined categories. It is widely applied in object recognition, scene understanding, and automatic annotation. Image classification can identify various objects such as animals, plants, traffic signs, and categorize them based on their features. By leveraging deep learning models, image classification can automatically extract image features and perform accurate classification.
+Image classification is a technique that assigns images to predefined categories. It is widely applied in object recognition, scene understanding, and automatic annotation. Image classification can identify various objects such as animals, plants, and traffic signs, and categorize them based on their features. By leveraging deep learning models, image classification can automatically extract image features and perform accurate classification. This production line also offers a flexible service-oriented deployment approach, supporting the use of multiple programming languages on various hardware platforms. Moreover, this production line provides the capability for secondary development: you can train and optimize models on your own dataset based on this production line, and the trained models can be seamlessly integrated.
 
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/image_classification/01.png"/>
 <b>The General Image Classification Pipeline includes an image classification module. If you prioritize model accuracy, choose a model with higher accuracy. If you prioritize inference speed, select a model with faster inference. If you prioritize model storage size, choose a model with a smaller storage size.</b>
@@ -16,6 +16,7 @@ Image classification is a technique that assigns images to predefined categories
<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
 <th>Model Storage Size (M)</th>
+<th>Description</th>
 </tr>
 <tr>
 <td>CLIP_vit_base_patch16_224</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/CLIP_vit_base_patch16_224_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/CLIP_vit_base_patch16_224_pretrained.pdparams">Trained Model</a></td>
@@ -23,6 +24,7 @@ Image classification is a technique that assigns images to predefined categories
 <td>12.84 / 2.82</td>
 <td>60.52 / 60.52</td>
 <td>306.5 M</td>
+<td>A general high-precision image classification model obtained by fine-tuning CLIP, a large vision model, on the ImageNet1k dataset</td>
 </tr>
 <tr>
 <td>MobileNetV3_small_x1_0</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MobileNetV3_small_x1_0_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MobileNetV3_small_x1_0_pretrained.pdparams">Trained Model</a></td>
@@ -30,6 +32,7 @@ Image classification is a technique that assigns images to predefined categories
 <td>3.76 / 0.53</td>
 <td>5.11 / 1.43</td>
 <td>10.5 M</td>
+<td>MobileNetV3 is a new lightweight network based on NAS proposed by Google in 2019. To further improve performance, the relu and sigmoid activation functions are replaced with hard_swish and hard_sigmoid activation functions, respectively. Additionally, several strategies specifically aimed at reducing the computational load of the network have been introduced.</td>
 </tr>
 <tr>
 <td>PP-HGNet_small</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNet_small_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNet_small_pretrained.pdparams">Trained Model</a></td>
@@ -37,6 +40,7 @@ Image classification is a technique that assigns images to predefined categories
 <td>5.12 / 1.73</td>
 <td>25.01 / 25.01</td>
 <td>86.5 M</td>
+<td>PP-HGNet (High Performance GPU Net) is a high-performance backbone network developed by the Baidu PaddlePaddle Vision Team, specifically optimized for GPU platforms. This network builds upon VOVNet and incorporates a learnable downsampling layer (LDS Layer), integrating the advantages of models such as ResNet_vd and PPHGNet. The model achieves higher accuracy compared to other state-of-the-art (SOTA) models at the same speed on GPU platforms.</td>
 </tr>
 <tr>
 <td>PP-HGNetV2-B0</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNetV2-B0_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNetV2-B0_pretrained.pdparams">Trained Model</a></td>
@@ -44,6 +48,7 @@ Image classification is a technique that assigns images to predefined categories
 <td>3.83 / 0.57</td>
 <td>9.95 / 2.37</td>
 <td>21.4 M</td>
+<td rowspan="3">PP-HGNetV2 (High Performance GPU Network V2) is the next-generation version of PP-HGNet developed by the Baidu PaddlePaddle Vision Team. Building upon PP-HGNet, it has been further optimized and improved. Ultimately, on NVIDIA GPU devices, it achieves an extreme "Accuracy-Latency Balance," with accuracy significantly surpassing other models of the same inference speed.</td>
 </tr>
 <tr>
 <td>PP-HGNetV2-B4</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-HGNetV2-B4_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-HGNetV2-B4_pretrained.pdparams">Trained Model</a></td>
@@ -65,6 +70,7 @@ Image classification is a technique that assigns images to predefined categories
 <td>2.35 / 0.47</td>
 <td>4.03 / 1.35</td>
 <td>10.5 M</td>
+<td>PP-LCNet_x1_0 is designed with a specific backbone network for Intel CPU devices and their acceleration library MKLDNN. Compared to other lightweight state-of-the-art (SOTA) models, this backbone network can further enhance model performance without increasing inference time, ultimately significantly surpassing existing SOTA models.</td>
 </tr>
 <tr>
 <td>ResNet50</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/ResNet50_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ResNet50_pretrained.pdparams">Trained Model</a></td>
@@ -72,6 +78,7 @@ Image classification is a technique that assigns images to predefined categories
 <td>6.44 / 1.16</td>
 <td>15.04 / 11.63</td>
 <td>90.8 M</td>
+<td>The ResNet series of models was proposed in 2015 and won the championship in the ILSVRC2015 competition with a top-5 error rate of 3.57%. The network innovatively introduced the residual structure, and by stacking multiple residual structures, the ResNet network was constructed.</td>
 </tr>
 <tr>
 <td>SwinTransformer_tiny_patch4_window7_224</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/SwinTransformer_tiny_patch4_window7_224_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/SwinTransformer_tiny_patch4_window7_224_pretrained.pdparams">Trained Model</a></td>
@@ -79,6 +86,7 @@ Image classification is a technique that assigns images to predefined categories
 <td>6.66 / 2.15</td>
 <td>60.45 / 60.45</td>
 <td>100.1 M</td>
+<td>SwinTransformer is a new type of visual Transformer network that can be used as a general-purpose backbone network in the field of computer vision. SwinTransformer consists of a hierarchical Transformer structure represented by shifted windows. The shifted windows confine the self-attention computation to non-overlapping local windows while allowing cross-window connections, thereby enhancing the network's performance.</td>
 </tr>
 </table>
 
@@ -696,7 +704,7 @@ You can quickly experience the image classification pipeline with a single comma
 
 
 ```bash
-paddlex --pipeline image_classification --input general_image_classification_001.jpg --device gpu:0
+paddlex --pipeline image_classification --input general_image_classification_001.jpg --device gpu:0 --save_path ./output/
 ```
 
 The relevant parameter descriptions can be found in the parameter explanation section of [2.2.2 Python Script Integration](#222-integration-via-python-script).
@@ -749,6 +757,12 @@ In the above Python script, the following steps are executed:
 <td>None</td>
 </tr>
 <tr>
+<td><code>config</code></td>
+<td>Specific configuration information for the production line (if set together with <code>pipeline</code>, it takes precedence over <code>pipeline</code>, and the production line name must be consistent with <code>pipeline</code>).</td>
+<td><code>dict[str, Any]</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
 <td><code>device</code></td>
 <td>The device used for production line inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu".</td>
 <td><code>str</code></td>
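For illustration, a `config` dictionary would mirror the structure of the pipeline's YAML configuration file (the `SubModules` layout shown at the end of this tutorial); the exact keys below are therefore an assumption based on that file:

```python
# Hedged sketch: an in-memory config mirroring the pipeline YAML shown later in
# this tutorial; key names are assumed to match that file.
from paddlex import create_pipeline

config = {
    "pipeline_name": "image_classification",  # assumed to need to match the `pipeline` argument
    "SubModules": {
        "ImageClassification": {
            "module_name": "image_classification",
            "model_name": "PP-LCNet_x1_0",
            "model_dir": None,   # or the path to fine-tuned weights
            "batch_size": 4,
            "topk": 5,
        }
    },
}

pipeline = create_pipeline(pipeline="image_classification", config=config, device="gpu:0")
```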
@@ -1061,11 +1075,12 @@ In the above Python script, the following steps are executed:
 - Calling the `print()` method will print the results to the terminal, with the following explanations for the printed content:
 
     - `input_path`: `(str)` The input path of the image to be predicted.
+    - `page_index`: `(Union[int, None])` If the input is a PDF file, it indicates the current page number of the PDF; otherwise, it is `None`.
     - `class_ids`: `(List[numpy.ndarray])` The class IDs of the prediction results.
     - `scores`: `(List[numpy.ndarray])` The confidence scores of the prediction results.
     - `label_names`: `(List[str])` The names of the predicted classes.
 
-- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}.json`. If a file is specified, the results will be saved directly to that file. Since JSON files do not support saving NumPy arrays, `numpy.array` types will be converted to lists.
+- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}_res.json`. If a file is specified, the results will be saved directly to that file. Since JSON files do not support saving NumPy arrays, `numpy.array` types will be converted to lists.
 
 - Calling the `save_to_img()` method will save the visualized results to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}_res.{your_img_extension}`. If a file is specified, the results will be saved directly to that file. (It is not recommended to specify a specific file path directly, as multiple images will be overwritten, leaving only the last one.)
 
@@ -1096,7 +1111,7 @@ Additionally, you can obtain the configuration file for the image classification
 ```
 paddlex --get_pipeline_config image_classification --save_path ./my_path
 ```
-If you have obtained the configuration file, you can customize the settings for the OCR production line by simply modifying the `pipeline` parameter value in the `create_pipeline` method to the path of the configuration file. The example is as follows:
+If you have obtained the configuration file, you can customize the settings for the image classification production line by simply modifying the `pipeline` parameter value in the `create_pipeline` method to the path of the configuration file. The example is as follows:
 
 ```python
 from paddlex import create_pipeline
@@ -1699,7 +1714,7 @@ SubModules:
   ImageClassification:
     module_name: image_classification
     model_name: PP-LCNet_x0_5
-    model_dir: null
+    model_dir: null # Replace with the path to the fine-tuned image classification model weights
     batch_size: 4
     topk: 5
 ```

+ 1 - 1
docs/pipeline_usage/tutorials/cv_pipelines/object_detection.md

@@ -66,7 +66,7 @@ comments: true
 </tr>
 </table>
 
-&gt; ❗ 以上列出的是目标检测模块重点支持的<b>6个核心模型</b>,该模块总共支持<b>37个模型</b>,完整的模型列表如下:
+> ❗ 以上列出的是目标检测模块重点支持的<b>6个核心模型</b>,该模块总共支持<b>37个模型</b>,完整的模型列表如下:
 
 <details><summary> 👉模型列表详情</summary>
 <table>

+ 13 - 6
docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.en.md

@@ -5,7 +5,7 @@ comments: true
 # Pedestrian Attribute Recognition Pipeline Tutorial
 
 ## 1. Introduction to Pedestrian Attribute Recognition Pipeline
-Pedestrian attribute recognition is a key function in computer vision systems, used to locate and label specific characteristics of pedestrians in images or videos, such as gender, age, clothing color, and style. This task not only requires accurately detecting pedestrians but also identifying detailed attribute information for each pedestrian. The pedestrian attribute recognition pipeline is an end-to-end serial system for locating and recognizing pedestrian attributes, widely used in smart cities, security surveillance, and other fields, significantly enhancing the system's intelligence level and management efficiency.
+Pedestrian attribute recognition is a key function in computer vision systems, used to locate and label specific characteristics of pedestrians in images or videos, such as gender, age, clothing color, and style. This task not only requires accurately detecting pedestrians but also identifying detailed attribute information for each pedestrian. The pedestrian attribute recognition pipeline is an end-to-end serial system for locating and recognizing pedestrian attributes, widely used in smart cities, security surveillance, and other fields, significantly enhancing the system's intelligence level and management efficiency. This pipeline also offers a flexible service-oriented deployment option, supporting multiple programming languages on a variety of hardware platforms. In addition, it supports secondary development: you can train and fine-tune models on your own dataset based on this pipeline, and the trained models can be integrated seamlessly.
 
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/pedestrian_attribute_recognition/01.jpg"/>
 <b>The pedestrian attribute recognition pipeline includes a pedestrian detection module and a pedestrian attribute recognition module</b>, with several models in each module. Which models to use specifically can be selected based on the benchmark data below. <b>If you prioritize model accuracy, choose models with higher accuracy; if you prioritize inference speed, choose models with faster inference; if you prioritize model storage size, choose models with smaller storage</b>.
@@ -79,7 +79,7 @@ Before using the pedestrian attribute recognition pipeline locally, please ensur
 You can quickly experience the pedestrian attribute recognition pipeline with a single command. Use [the test image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pedestrian_attribute_002.jpg) and replace `--input` with your local path for prediction.
 
 ```bash
-paddlex --pipeline pedestrian_attribute_recognition --input pedestrian_attribute_002.jpg --device gpu:0
+paddlex --pipeline pedestrian_attribute_recognition --input pedestrian_attribute_002.jpg --device gpu:0 --save_path ./output/
 ```
 
 The relevant parameter descriptions can be found in the parameter explanation section of [2.2.2 Python Script Integration](#222-python脚本方式集成).
@@ -135,6 +135,12 @@ In the above Python script, the following steps are executed:
 <td>None</td>
 </tr>
 <tr>
+<td><code>config</code></td>
+<td>Specific configuration information for the production line (if set simultaneously with <code>pipeline</code>, it has higher priority than <code>pipeline</code>, and the production line name must be consistent with <code>pipeline</code>).</td>
+<td><code>dict[str, Any]</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
 <td><code>device</code></td>
 <td>The device used for production line inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu".</td>
 <td><code>str</code></td>
@@ -285,6 +291,7 @@ In the above Python script, the following steps are executed:
 - Calling the `print()` method will print the result to the terminal, and the content printed to the terminal is explained as follows:
 
     - `input_path`: `(str)` The input path of the image to be predicted.
+    - `page_index`: `(Union[int, None])` If the input is a PDF file, it indicates the current page number of the PDF; otherwise, it is `None`.
     - `boxes`: `(List[Dict])` Indicates the category ID of the prediction result.
     - `labels`: `(List[str])` Indicates the category name of the prediction result.
     - `cls_scores`: `(List[numpy.ndarray])` Indicates the confidence of the attribute prediction result.
@@ -322,7 +329,7 @@ paddlex --get_pipeline_config pedestrian_attribute_recognition --save_path ./my_
 ```
 
 
-If you have obtained the configuration file, you can customize the settings for the OCR pipeline. Simply modify the `pipeline` parameter value in the `create_pipeline` method to the path of the pipeline configuration file. An example is as follows:
+If you have obtained the configuration file, you can customize the settings for the pedestrian attribute recognition production line by simply modifying the value of the `pipeline` parameter in the `create_pipeline` method to the path of the production line configuration file. The example is as follows:
 
 ```python
 from paddlex import create_pipeline
@@ -597,13 +604,13 @@ SubModules:
   Detection:
     module_name: object_detection
     model_name: PP-YOLOE-L_human
-    model_dir: null
+    model_dir: null # Replace with the path to the fine-tuned pedestrian detection model weights
     batch_size: 1
     threshold: 0.5
   Classification:
     module_name: multilabel_classification
     model_name: PP-LCNet_x1_0_pedestrian_attribute
-    model_dir: null
+    model_dir: null # Replace with the path to the fine-tuned pedestrian attribute recognition model weights
     batch_size: 1
     threshold: 0.7
 ```
@@ -620,4 +627,4 @@ paddlex --pipeline pedestrian_attribute_recognition \
         --device npu:0
 ```
 
-If you want to use the general OCR production line on a wider range of hardware devices, please refer to the [PaddleX Multi-Hardware Usage Guide](../../../other_devices_support/multi_devices_use_guide.en.md).
+If you want to use the general Pedestrian Attribute Recognition pipeline on a wider range of hardware devices, please refer to the [PaddleX Multi-Hardware Usage Guide](../../../other_devices_support/multi_devices_use_guide.en.md).

+ 1 - 1
docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.en.md

@@ -51,7 +51,7 @@ PaddleX supports experiencing the small object detection pipeline's effects thro
 Before using the small object detection pipeline locally, ensure you have installed the PaddleX wheel package following the [PaddleX Local Installation Tutorial](../../../installation/installation.en.md).
 
 ### 2.1 Local Experience
-&gt; ❗ Before using the general small object detection pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
+> ❗ Before using the general small object detection pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
 
 #### 2.1.1 Command Line Experience
 * You can quickly experience the small object detection pipeline effect with a single command. Use the [test file](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/small_object_detection.jpg), and replace `--input` with the local path for prediction.

+ 11 - 5
docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.en.md

@@ -5,7 +5,7 @@ comments: true
 # Vehicle Attribute Recognition Pipeline Tutorial
 
 ## 1. Introduction to Vehicle Attribute Recognition Pipeline
-Vehicle attribute recognition is a crucial component in computer vision systems. Its primary task is to locate and label specific attributes of vehicles in images or videos, such as vehicle type, color, and license plate number. This task not only requires accurately detecting vehicles but also identifying detailed attribute information for each vehicle. The vehicle attribute recognition pipeline is an end-to-end serial system for locating and recognizing vehicle attributes, widely used in traffic management, intelligent parking, security surveillance, autonomous driving, and other fields. It significantly enhances system efficiency and intelligence levels, driving the development and innovation of related industries.
+Vehicle attribute recognition is a crucial component in computer vision systems. Its primary task is to locate and label specific attributes of vehicles in images or videos, such as vehicle type, color, and license plate number. This task not only requires accurately detecting vehicles but also identifying detailed attribute information for each vehicle. The vehicle attribute recognition pipeline is an end-to-end serial system for locating and recognizing vehicle attributes, widely used in traffic management, intelligent parking, security surveillance, autonomous driving, and other fields. It significantly enhances system efficiency and intelligence levels, driving the development and innovation of related industries. This pipeline also offers a flexible service-oriented deployment option, supporting multiple programming languages on a variety of hardware platforms. In addition, it supports secondary development: you can train and fine-tune models on your own dataset based on this pipeline, and the trained models can be integrated seamlessly.
 
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/vehicle_attribute_recognition/01.jpg"/>
 <b>The vehicle attribute recognition pipeline includes a vehicle detection module and a vehicle attribute recognition module</b>, with several models in each module. Which models to use can be selected based on the benchmark data below. <b>If you prioritize model accuracy, choose models with higher accuracy; if you prioritize inference speed, choose models with faster inference; if you prioritize model storage size, choose models with smaller storage</b>.
@@ -53,7 +53,7 @@ Vehicle attribute recognition is a crucial component in computer vision systems.
 <tr>
 <td>PP-LCNet_x1_0_vehicle_attribute</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LCNet_x1_0_vehicle_attribute_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_vehicle_attribute_pretrained.pdparams">Trained Model</a></td>
 <td>91.7</td>
-<td>2.32 / 2.32</td>
+<td>2.32 / 0.52</td>
 <td>3.22 / 1.26</td>
 <td>6.7 M</td>
 <td>PP-LCNet_x1_0_vehicle_attribute is a lightweight vehicle attribute recognition model based on PP-LCNet.</td>
@@ -75,7 +75,7 @@ Before using the vehicle attribute recognition pipeline locally, ensure you have
 You can quickly experience the vehicle attribute recognition pipeline with a single command. Use the [test file](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_attribute_002.jpg) and replace `--input` with the local path for prediction.
 
 ```bash
-paddlex --pipeline vehicle_attribute_recognition --input vehicle_attribute_002.jpg --device gpu:0
+paddlex --pipeline vehicle_attribute_recognition --input vehicle_attribute_002.jpg --device gpu:0 --save_path ./output/
 ```
 Parameter Description:
 
@@ -116,6 +116,12 @@ In the above Python script, the following steps are executed:
 <td>None</td>
 </tr>
 <tr>
+<td><code>config</code></td>
+<td>Specific configuration information for the production line (if set simultaneously with <code>pipeline</code>, it has higher priority than <code>pipeline</code>, and the production line name must be consistent with <code>pipeline</code>).</td>
+<td><code>dict[str, Any]</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
 <td><code>device</code></td>
 <td>The device used for production line inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu".</td>
 <td><code>str</code></td>
@@ -302,7 +308,7 @@ Additionally, you can obtain the vehicle attribute recognition pipeline configur
 paddlex --get_pipeline_config vehicle_attribute_recognition --save_path ./my_path
 ```
 
-If you have obtained the configuration file, you can customize the settings for the OCR production line by simply modifying the `pipeline` parameter value in the `create_pipeline` method to the path of the configuration file. The example is as follows:
+If you have obtained the configuration file, you can customize the settings for the Vehicle Attribute Recognition pipeline by simply modifying the `pipeline` parameter value in the `create_pipeline` method to the path of the configuration file. The example is as follows:
 
 ```python
 from paddlex import create_pipeline
@@ -612,4 +618,4 @@ paddlex --pipeline vehicle_attribute_recognition \
         --device npu:0
 ```
 
-If you want to use the general OCR production line on a wider range of hardware devices, please refer to the [PaddleX Multi-Hardware Usage Guide](../../../other_devices_support/multi_devices_use_guide.en.md).
+If you want to use the general Vehicle Attribute Recognition pipeline on a wider range of hardware devices, please refer to the [PaddleX Multi-Hardware Usage Guide](../../../other_devices_support/multi_devices_use_guide.en.md).

+ 67 - 74
docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.md

@@ -7,8 +7,7 @@ comments: true
 ## 1. 车辆属性识别产线介绍
 车辆属性识别是计算机视觉系统中的重要组成部分,其主要任务是在图像或视频中定位并标记出车辆的特定属性,如车辆类型、颜色、车牌号等。该任务不仅要求准确检测出车辆,还需识别每辆车的详细属性信息。车辆属性识别产线是定位并识别车辆属性的端到端串联系统,广泛应用于交通管理、智能停车、安防监控、自动驾驶等领域,显著提升了系统效率和智能化水平,并推动了相关行业的发展与创新。本产线同时提供了灵活的服务化部署方式,支持在多种硬件上使用多种编程语言调用。不仅如此,本产线也提供了二次开发的能力,您可以基于本产线在您自己的数据集上训练调优,训练后的模型也可以无缝集成。
 
-<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/vehicle_attribute_recognition/vehicle_attribute_1.jpg">
-
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/vehicle_attribute_recognition/vehicle_attribute_1.jpg"/>
 <b>车辆属性识别产线中包含了车辆检测模块和车辆属性识别模块</b>,每个模块中包含了若干模型,具体使用哪些模型,您可以根据下边的 benchmark 数据来选择。<b>如您更考虑模型精度,请选择精度较高的模型,如您更考虑模型推理速度,请选择推理速度较快的模型,如您更考虑模型存储大小,请选择存储大小较小的模型</b>。
 
 
@@ -17,28 +16,27 @@ comments: true
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>mAP 0.5:0.95</th>
-<th>GPU推理耗时 (ms)</th>
-<th>CPU推理耗时 (ms)</th>
+<th>GPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
+<th>CPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
 <th>模型存储大小(M)</th>
 <th>介绍</th>
 </tr>
 <tr>
 <td>PP-YOLOE-S_vehicle</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-YOLOE-S_vehicle_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-YOLOE-S_vehicle_pretrained.pdparams">训练模型</a></td>
 <td>61.3</td>
-<td>15.4</td>
-<td>178.4</td>
+<td>9.79 / 3.48</td>
+<td>54.14 / 46.69</td>
 <td>28.79</td>
 <td rowspan="2">基于PP-YOLOE的车辆检测模型</td>
 </tr>
 <tr>
 <td>PP-YOLOE-L_vehicle</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-YOLOE-L_vehicle_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-YOLOE-L_vehicle_pretrained.pdparams">训练模型</a></td>
 <td>63.9</td>
-<td>32.6</td>
-<td>775.6</td>
+<td>32.84 / 9.03</td>
+<td>176.60 / 176.60</td>
 <td>196.02</td>
 </tr>
 </table>
-
 <p><b>注:以上精度指标为PPVehicle 验证集 mAP(0.5:0.95)。所有模型 GPU 推理耗时基于 NVIDIA Tesla T4 机器,精度类型为 FP32, CPU 推理速度基于 Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz,线程数为8,精度类型为 FP32。</b></p>
 <p><b>车辆属性识别模块:</b></p>
 <table>
@@ -46,8 +44,8 @@ comments: true
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>mAP(%)</th>
-<th>GPU推理耗时 (ms)</th>
-<th>CPU推理耗时 (ms)</th>
+<th>GPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
+<th>CPU推理耗时(ms)<br/>[常规模式 / 高性能模式]</th>
 <th>模型存储大小(M)</th>
 <th>介绍</th>
 </tr>
@@ -56,8 +54,8 @@ comments: true
 <tr>
 <td>PP-LCNet_x1_0_vehicle_attribute</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LCNet_x1_0_vehicle_attribute_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_vehicle_attribute_pretrained.pdparams">训练模型</a></td>
 <td>91.7</td>
-<td>3.84845</td>
-<td>9.23735</td>
+<td>2.32 / 0.52</td>
+<td>3.22 / 1.26</td>
 <td>6.7 M</td>
 <td>PP-LCNet_x1_0_vehicle_attribute 是一种基于PP-LCNet的轻量级车辆属性识别模型。</td>
 </tr>
@@ -92,7 +90,7 @@ paddlex --pipeline vehicle_attribute_recognition --input vehicle_attribute_002.j
 
 可视化结果保存在`save_path`下,可视化结果如下:
 
-<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/vehicle_attribute_recognition/01.jpg">
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/vehicle_attribute_recognition/01.jpg"/>
 
 #### 2.2.2 Python脚本方式集成
 * 上述命令行是为了快速体验查看效果,一般来说,在项目中,往往需要通过代码集成,您可以通过几行代码即可完成产线的快速推理,推理代码如下:
@@ -168,9 +166,9 @@ for res in output:
 <td><code>Python Var|str|list</code></td>
 <td>
 <ul>
-  <li><b>Python Var</b>:如 <code>numpy.ndarray</code> 表示的图像数据</li>
-  <li><b>str</b>:如图像文件的本地路径:<code>/root/data/img.jpg</code>;<b>如URL链接</b>,如图像文件网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_attribute_002.jpg">示例</a>;<b>如本地目录</b>,该目录下需包含待预测图像,如本地路径:<code>/root/data/</code></li>
-  <li><b>List</b>:列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
+<li><b>Python Var</b>:如 <code>numpy.ndarray</code> 表示的图像数据</li>
+<li><b>str</b>:如图像文件的本地路径:<code>/root/data/img.jpg</code>;<b>如URL链接</b>,如图像文件网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_attribute_002.jpg">示例</a>;<b>如本地目录</b>,该目录下需包含待预测图像,如本地路径:<code>/root/data/</code></li>
+<li><b>List</b>:列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td><code>None</code></td>
@@ -181,13 +179,13 @@ for res in output:
 <td><code>str|None</code></td>
 <td>
 <ul>
-  <li><b>CPU</b>:如 <code>cpu</code> 表示使用 CPU 进行推理;</li>
-  <li><b>GPU</b>:如 <code>gpu:0</code> 表示使用第 1 块 GPU 进行推理;</li>
-  <li><b>NPU</b>:如 <code>npu:0</code> 表示使用第 1 块 NPU 进行推理;</li>
-  <li><b>XPU</b>:如 <code>xpu:0</code> 表示使用第 1 块 XPU 进行推理;</li>
-  <li><b>MLU</b>:如 <code>mlu:0</code> 表示使用第 1 块 MLU 进行推理;</li>
-  <li><b>DCU</b>:如 <code>dcu:0</code> 表示使用第 1 块 DCU 进行推理;</li>
-  <li><b>None</b>:如果设置为 <code>None</code>, 将默认使用产线初始化的该参数值,初始化时,会优先使用本地的 GPU 0号设备,如果没有,则使用 CPU 设备;</li>
+<li><b>CPU</b>:如 <code>cpu</code> 表示使用 CPU 进行推理;</li>
+<li><b>GPU</b>:如 <code>gpu:0</code> 表示使用第 1 块 GPU 进行推理;</li>
+<li><b>NPU</b>:如 <code>npu:0</code> 表示使用第 1 块 NPU 进行推理;</li>
+<li><b>XPU</b>:如 <code>xpu:0</code> 表示使用第 1 块 XPU 进行推理;</li>
+<li><b>MLU</b>:如 <code>mlu:0</code> 表示使用第 1 块 MLU 进行推理;</li>
+<li><b>DCU</b>:如 <code>dcu:0</code> 表示使用第 1 块 DCU 进行推理;</li>
+<li><b>None</b>:如果设置为 <code>None</code>, 将默认使用产线初始化的该参数值,初始化时,会优先使用本地的 GPU 0号设备,如果没有,则使用 CPU 设备;</li>
 </ul>
 </td>
 <td><code>None</code></td>
@@ -198,8 +196,8 @@ for res in output:
 <td><code>float | None</code></td>
 <td>
 <ul>
-  <li><b>float</b>:如<code>0.5</code>, 表示过滤掉所有阈值小于<code>0.5</code>的目标框;</li>
-  <li><b>None</b>:如果设置为<code>None</code>, 将默认使用产线初始化的该参数值,初始化为<code>0.5</code>;</li>
+<li><b>float</b>:如<code>0.5</code>, 表示过滤掉所有阈值小于<code>0.5</code>的目标框;</li>
+<li><b>None</b>:如果设置为<code>None</code>, 将默认使用产线初始化的该参数值,初始化为<code>0.5</code>;</li>
 </ul>
 </td>
 <td><code>0.5</code></td>
@@ -210,10 +208,10 @@ for res in output:
 <td><code>float | dict | list| None</code></td>
 <td>
 <ul>
-  <li><b>float</b>:表示属性识别的统一阈值;</li>
-  <li><b>list</b>:如<code>[0.5, 0.45, 0.48, 0.4]</code>,表示按照<code>label list</code>顺序的不同类别阈值;</code>;</li>
-  <li><b>dict</b>:字典的key 为 <code>default</code> 和 <code>int</code> 类型,val 为 <code>float</code> 类型阈值,如<code>{"default": 0.5, 0: 0.45, 2: 0.48, 7: 0.4}</code>,<code>default</code> 表示属性识别的统一阈值,其他 <code>int</code> 类型表示对 cls_id 为0的类别应用阈值 0.45、cls_id 为 1 的类别应用阈值 0.48、cls_id 为 7 的类别应用阈值 0.4;</li>
-  <li><b>None</b>:如果设置为<code>None</code>, 将默认使用产线初始化的该参数值,初始化为<code>0.7</code>;</li>
+<li><b>float</b>:表示属性识别的统一阈值;</li>
+<li><b>list</b>:如<code>[0.5, 0.45, 0.48, 0.4]</code>,表示按照<code>label list</code>顺序的不同类别阈值;</li>
+<li><b>dict</b>:字典的key 为 <code>default</code> 和 <code>int</code> 类型,val 为 <code>float</code> 类型阈值,如<code>{"default": 0.5, 0: 0.45, 2: 0.48, 7: 0.4}</code>,<code>default</code> 表示属性识别的统一阈值,其他 <code>int</code> 类型表示对 cls_id 为 0 的类别应用阈值 0.45、cls_id 为 2 的类别应用阈值 0.48、cls_id 为 7 的类别应用阈值 0.4;</li>
+<li><b>None</b>:如果设置为<code>None</code>, 将默认使用产线初始化的该参数值,初始化为<code>0.7</code>;</li>
 </ul>
 </td>
 <td><code>0.7</code></td>
@@ -233,8 +231,8 @@ for res in output:
 </tr>
 </thead>
 <tr>
-<td rowspan = "3"><code>print()</code></td>
-<td rowspan = "3">打印结果到终端</td>
+<td rowspan="3"><code>print()</code></td>
+<td rowspan="3">打印结果到终端</td>
 <td><code>format_json</code></td>
 <td><code>bool</code></td>
 <td>是否对输出内容进行使用 <code>JSON</code> 缩进格式化</td>
@@ -253,8 +251,8 @@ for res in output:
 <td><code>False</code></td>
 </tr>
 <tr>
-<td rowspan = "3"><code>save_to_json()</code></td>
-<td rowspan = "3">将结果保存为json格式的文件</td>
+<td rowspan="3"><code>save_to_json()</code></td>
+<td rowspan="3">将结果保存为json格式的文件</td>
 <td><code>save_path</code></td>
 <td><code>str</code></td>
 <td>保存的文件路径,当为目录时,保存文件命名与输入文件类型命名一致</td>
@@ -303,12 +301,12 @@ for res in output:
 </tr>
 </thead>
 <tr>
-<td rowspan = "1"><code>json</code></td>
-<td rowspan = "1">获取预测的 <code>json</code> 格式的结果</td>
+<td rowspan="1"><code>json</code></td>
+<td rowspan="1">获取预测的 <code>json</code> 格式的结果</td>
 </tr>
 <tr>
-<td rowspan = "2"><code>img</code></td>
-<td rowspan = "2">获取格式为 <code>dict</code> 的可视化图像</td>
+<td rowspan="2"><code>img</code></td>
+<td rowspan="2">获取格式为 <code>dict</code> 的可视化图像</td>
 </tr>
 </table>
 
@@ -355,7 +353,6 @@ for res in output:
 以下是基础服务化部署的API参考与多语言服务调用示例:
 
 <details><summary>API参考</summary>
-
 <p>对于服务提供的主要操作:</p>
 <ul>
 <li>HTTP请求方法为POST。</li>
@@ -535,38 +532,34 @@ for res in output:
 </tbody>
 </table>
 </details>
-
 <details><summary>多语言调用服务示例</summary>
-
 <details>
 <summary>Python</summary>
-
-
 <pre><code class="language-python">import base64
 import requests
 
-API_URL = &quot;http://localhost:8080/vehicle-attribute-recognition&quot; # 服务URL
-image_path = &quot;./demo.jpg&quot;
-output_image_path = &quot;./out.jpg&quot;
+API_URL = "http://localhost:8080/vehicle-attribute-recognition" # 服务URL
+image_path = "./demo.jpg"
+output_image_path = "./out.jpg"
 
 # 对本地图像进行Base64编码
-with open(image_path, &quot;rb&quot;) as file:
+with open(image_path, "rb") as file:
     image_bytes = file.read()
-    image_data = base64.b64encode(image_bytes).decode(&quot;ascii&quot;)
+    image_data = base64.b64encode(image_bytes).decode("ascii")
 
-payload = {&quot;image&quot;: image_data}  # Base64编码的文件内容或者图像URL
+payload = {"image": image_data}  # Base64编码的文件内容或者图像URL
 
 # 调用API
 response = requests.post(API_URL, json=payload)
 
 # 处理接口返回数据
 assert response.status_code == 200
-result = response.json()[&quot;result&quot;]
-with open(output_image_path, &quot;wb&quot;) as file:
-    file.write(base64.b64decode(result[&quot;image&quot;]))
-print(f&quot;Output image saved at {output_image_path}&quot;)
-print(&quot;\nDetected vehicles:&quot;)
-print(result[&quot;vehicles&quot;])
+result = response.json()["result"]
+with open(output_image_path, "wb") as file:
+    file.write(base64.b64decode(result["image"]))
+print(f"Output image saved at {output_image_path}")
+print("\nDetected vehicles:")
+print(result["vehicles"])
 </code></pre></details>
 </details>
 <br/>
@@ -581,25 +574,25 @@ print(result[&quot;vehicles&quot;])
 由于车辆属性识别产线包含车辆属性识别模块和车辆检测模块,如果模型产线的效果不及预期,可能来自于其中任何一个模块。您可以对识别效果差的图片进行分析,进而确定是哪个模块存在问题,并参考以下表格中对应的微调教程链接进行模型微调。
 
 <table>
-  <thead>
-    <tr>
-      <th>情形</th>
-      <th>微调模块</th>
-      <th>微调参考链接</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-      <td>车辆检测不准</td>
-      <td>车辆检测模块</td>
-      <td><a href="../../../module_usage/tutorials/cv_modules/vehicle_detection.md">链接</a></td>
-    </tr>
-    <tr>
-      <td>属性识别不准</td>
-      <td>车辆属性识别模块</td>
-      <td><a href="../../../module_usage/tutorials/cv_modules/vehicle_attribute_recognition.md">链接</a></td>
-    </tr>
-  </tbody>
+<thead>
+<tr>
+<th>情形</th>
+<th>微调模块</th>
+<th>微调参考链接</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>车辆检测不准</td>
+<td>车辆检测模块</td>
+<td><a href="../../../module_usage/tutorials/cv_modules/vehicle_detection.md">链接</a></td>
+</tr>
+<tr>
+<td>属性识别不准</td>
+<td>车辆属性识别模块</td>
+<td><a href="../../../module_usage/tutorials/cv_modules/vehicle_attribute_recognition.md">链接</a></td>
+</tr>
+</tbody>
 </table>
 
 ### 4.2 模型应用

+ 1 - 1
docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md

@@ -1351,7 +1351,7 @@ pipeline = create_pipeline(
     pipeline="PP-ChatOCRv3-doc",
     llm_name="ernie-3.5",
     llm_params={"api_type": "qianfan", "ak": "", "sk": ""},
-    device="npu:0" # gpu:0 --&gt; npu:0
+    device="npu:0" # gpu:0 --> npu:0
     )
 ```
 

+ 1 - 0
docs/pipeline_usage/tutorials/time_series_pipelines/time_series_forecasting.en.md

@@ -66,6 +66,7 @@ Time series forecasting is a technique that utilizes historical data to predict
 </tbody>
 </table>
 <p><b>Note: The above accuracy metrics are measured on <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/data/Etth1.tar">ETTH1</a>. All model GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speeds are based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b></p>
+
 ## 2. Quick Start
 The pre-trained model pipelines provided by PaddleX allow for quick experience of their effects. You can experience the effects of the General Time Series Forecasting Pipeline online or locally using command line or Python.
 

+ 75 - 38
docs/practical_tutorials/anomaly_detection_tutorial.en.md

@@ -35,9 +35,23 @@ After experiencing the pipeline, determine if it meets your expectations (includ
 
 PaddleX provides 1 end-to-end anomaly detection models. For details, refer to the [Model List](../support_list/models_list.en.md). Some model benchmarks are as follows:
 
-| Model List          | Avg (%) | GPU Inference Time (ms) | CPU Inference Time (ms) | Model Size (M) | yaml file |
-|-|-|-|-|-|-|
-|STFPM|96.2|-|-|21.5 M|[STFPM.yaml](../../paddlex/configs/modules/image_anomaly_detection/STFPM.yaml)|
+<table>
+<thead>
+<tr>
+<th>Model</th>
+<th>mIoU</th>
+<th>Model Storage Size (M)</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>STFPM</td>
+<td>0.9901</td>
+<td>22.5</td>
+</tr>
+</tbody>
+</table>
+<b>Note: The above model accuracy metrics are measured on the grid category of the MVTec AD dataset.</b>
 
 > **Note: The above accuracy metrics are measured on the [MVTec AD](https://www.mvtec.com/company/research/datasets/mvtec-ad) dataset.**
 
@@ -58,7 +72,7 @@ tar -xf ./dataset/anomaly_detection_hazelnut.tar -C ./dataset/
 To verify the dataset, simply use the following command:
 
 ```bash
-python main.py -c paddlex/configs/image_anomaly_detection/STFPM.yaml \
+python main.py -c paddlex/configs/modules/image_anomaly_detection/STFPM.yaml \
     -o Global.mode=check_dataset \
     -o Global.dataset_dir=./dataset/anomaly_detection_hazelnut
 ```
@@ -71,37 +85,37 @@ After executing the above command, PaddleX will verify the dataset and collect b
   "check_pass": true,
   "attributes": {
     "train_sample_paths": [
-      "check_dataset\/demo_img\/294.png",
-      "check_dataset\/demo_img\/260.png",
-      "check_dataset\/demo_img\/297.png",
-      "check_dataset\/demo_img\/170.png",
-      "check_dataset\/demo_img\/068.png",
-      "check_dataset\/demo_img\/212.png",
-      "check_dataset\/demo_img\/204.png",
-      "check_dataset\/demo_img\/233.png",
-      "check_dataset\/demo_img\/367.png",
-      "check_dataset\/demo_img\/383.png"
+      "check_dataset/demo_img/294.png",
+      "check_dataset/demo_img/260.png",
+      "check_dataset/demo_img/297.png",
+      "check_dataset/demo_img/170.png",
+      "check_dataset/demo_img/068.png",
+      "check_dataset/demo_img/212.png",
+      "check_dataset/demo_img/204.png",
+      "check_dataset/demo_img/233.png",
+      "check_dataset/demo_img/367.png",
+      "check_dataset/demo_img/383.png"
     ],
     "train_samples": 391,
     "val_sample_paths": [
-      "check_dataset\/demo_img\/012.png",
-      "check_dataset\/demo_img\/017.png",
-      "check_dataset\/demo_img\/006.png",
-      "check_dataset\/demo_img\/013.png",
-      "check_dataset\/demo_img\/014.png",
-      "check_dataset\/demo_img\/010.png",
-      "check_dataset\/demo_img\/007.png",
-      "check_dataset\/demo_img\/001.png",
-      "check_dataset\/demo_img\/002.png",
-      "check_dataset\/demo_img\/009.png"
+      "check_dataset/demo_img/012.png",
+      "check_dataset/demo_img/017.png",
+      "check_dataset/demo_img/006.png",
+      "check_dataset/demo_img/013.png",
+      "check_dataset/demo_img/014.png",
+      "check_dataset/demo_img/010.png",
+      "check_dataset/demo_img/007.png",
+      "check_dataset/demo_img/001.png",
+      "check_dataset/demo_img/002.png",
+      "check_dataset/demo_img/009.png"
     ],
     "val_samples": 70,
     "num_classes": 1
   },
   "analysis": {
-    "histogram": "check_dataset\/histogram.png"
+    "histogram": "check_dataset/histogram.png"
   },
-  "dataset_path": ".\/dataset\/hazelnut",
+  "dataset_path": "anomaly_detection_hazelnut",
   "show_type": "image",
   "dataset_type": "SegDataset"
 }
@@ -149,7 +163,7 @@ Data conversion and splitting can be enabled simultaneously. For data splitting,
 Before training, ensure that you have validated your dataset. To complete the training of a PaddleX model, simply use the following command:
 
 ```bash
-python main.py -c paddlex/configs/image_anomaly_detection/STFPM.yaml \
+python main.py -c paddlex/configs/modules/image_anomaly_detection/STFPM.yaml \
     -o Global.mode=train \
     -o Global.dataset_dir=./dataset/anomaly_detection_hazelnut \
     -o Train.epochs_iters=4000
@@ -160,7 +174,7 @@ PaddleX supports modifying training hyperparameters, single/multi-GPU training,
 Each model in PaddleX provides a configuration file for model development, which is used to set relevant parameters. Model training-related parameters can be set by modifying the `Train` fields in the configuration file. Some example parameter descriptions in the configuration file are as follows:
 
 * `Global`:
-    * `mode`: Mode, supports dataset validation (`check_dataset`), model training (`train`), and model evaluation (`evaluate`);
+    * `mode`: Mode, supports data verification (`check_dataset`), model training (`train`), model evaluation (`evaluate`), and model inference (`predict`);
     * `device`: Training device, options include `cpu`, `gpu`, `xpu`, `npu`, `mlu`. For multi-GPU training, specify card numbers, e.g., `gpu:0,1,2,3`;
 * `Train`: Training hyperparameter settings;
     * `epochs_iters`: Number of training iterations;
@@ -187,7 +201,7 @@ After completing model training, all outputs are saved in the specified output d
 After completing model training, you can evaluate the specified model weight file on the validation set to verify the model's accuracy. To evaluate a model using PaddleX, simply use the following command:
 
 ```bash
-python main.py -c paddlex/configs/image_anomaly_detection/STFPM.yaml \
+python main.py -c paddlex/configs/modules/image_anomaly_detection/STFPM.yaml \
     -o Global.mode=evaluate \
     -o Global.dataset_dir=./dataset/anomaly_detection_hazelnut
 ```
@@ -231,13 +245,35 @@ Changing Epoch Results:
 
 ## 6. Production Line Testing
 
-Replace the model in the production line with the fine-tuned model for testing, for example:
+Replace the model in the pipeline with the fine-tuned model for testing. You can obtain the anomaly_detection pipeline configuration file and load it for prediction. Execute the following command to save the configuration file to `my_path`:
 
-```bash
-python main.py -c paddlex/configs/image_anomaly_detection/STFPM.yaml \
-    -o Global.mode=predict \
-    -o Predict.model_dir="output/best_model/inference" \
-    -o Predict.input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/uad_hazelnut.png"
+```
+paddlex --get_pipeline_config anomaly_detection --save_path ./my_path
+```
+
+Modify the `SubModules.AnomalyDetection.model_dir` in the configuration file `my_path/anomaly_detection.yaml` to your model path `output/best_model/inference`:
+
+```yaml
+pipeline_name: anomaly_detection
+
+SubModules:
+  AnomalyDetection:
+    module_name: anomaly_detection
+    model_name: STFPM
+    model_dir: output/best_model/inference  # Replace with the fine-tuned image anomaly detection model weight path
+    batch_size: 1
+```
+
+Subsequently, in the Python code, you can call the production line as follows:
+
+```python
+from paddlex import create_pipeline
+pipeline = create_pipeline(pipeline="./my_path/anomaly_detection.yaml")
+output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/uad_hazelnut.png")
+for res in output:
+    res.print()  ## Print the structured output of the prediction
+    res.save_to_img("./output/")  ## Save the visualized result image
+    res.save_to_json("./output/")  ## Save the structured output of the prediction
 ```
 
 The prediction results will be generated under `./output`, where the prediction result for `uad_hazelnut.png` is shown below:
@@ -249,10 +285,11 @@ The prediction results will be generated under `./output`, where the prediction
 
 ## 7. Development Integration/Deployment
 If the anomaly detection pipeline meets your requirements for inference speed and accuracy in the production line, you can proceed directly with development integration/deployment.
-1. Directly apply the trained model in your Python project by referring to the following sample code, and modify the `Pipeline.model` in the `paddlex/pipelines/anomaly_detection.yaml` configuration file to your own model path `output/best_model/inference`:
+1. Directly apply the trained model in your Python project. You can refer to the following example:
+
 ```python
 from paddlex import create_pipeline
-pipeline = create_pipeline(pipeline="paddlex/pipelines/anomaly_detection.yaml")
+pipeline = create_pipeline(pipeline="my_path/anomaly_detection.yaml")
 output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/uad_hazelnut.png")
 for res in output:
     res.print() # Print the structured output of the prediction
@@ -261,7 +298,7 @@ for res in output:
 ```
 For more parameters, please refer to [Anomaly Detection Pipeline Usage Tutorial](../pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.en.md).
 
-2. Additionally, PaddleX offers three other deployment methods, detailed as follows:
+2. Additionally, PaddleX offers three other deployment methods, detailed as follows:
 
 * high-performance inference: In actual production environments, many applications have stringent standards for deployment strategy performance metrics (especially response speed) to ensure efficient system operation and smooth user experience. To this end, PaddleX provides high-performance inference plugins aimed at deeply optimizing model inference and pre/post-processing for significant end-to-end process acceleration. For detailed high-performance inference procedures, please refer to the [PaddleX High-Performance Inference Guide](../pipeline_deploy/high_performance_inference.en.md).
 * Service-Oriented Deployment: Service-oriented deployment is a common deployment form in actual production environments. By encapsulating inference functions as services, clients can access these services through network requests to obtain inference results. PaddleX supports users in achieving cost-effective service-oriented deployment of production lines. For detailed service-oriented deployment procedures, please refer to the [PaddleX Service-Oriented Deployment Guide](../pipeline_deploy/serving.en.md).

+ 1 - 1
docs/practical_tutorials/anomaly_detection_tutorial.md

@@ -72,7 +72,7 @@ tar -xf ./dataset/anomaly_detection_hazelnut.tar -C ./dataset/
 在对数据集校验时,只需一行命令:
 
 ```bash
-python main.py -c paddlex/configs/modules/modules/image_anomaly_detection/STFPM.yaml \
+python main.py -c paddlex/configs/modules/image_anomaly_detection/STFPM.yaml \
     -o Global.mode=check_dataset \
     -o Global.dataset_dir=./dataset/anomaly_detection_hazelnut
 ```

+ 850 - 0
docs/practical_tutorials/face_recognition_tutorial.en.md

@@ -0,0 +1,850 @@
+---
+comments: true
+---
+
+# PaddleX 3.0 Face Recognition Pipeline —— Cartoon Face Recognition Tutorial
+
+PaddleX provides a rich set of model pipelines, each composed of one or more models and designed to solve specific task problems in certain scenarios. All pipelines provided by PaddleX support quick experience; if the results do not meet expectations, you can also fine-tune the models with private data. Moreover, PaddleX provides Python APIs for easy integration into personal projects. Before using it, you need to install PaddleX; for installation instructions, please refer to [PaddleX Installation](../installation/installation.en.md). This tutorial takes a cartoon face recognition task as an example to introduce how to use the pipeline tools.
+
+## 1. Selecting the Pipeline
+
+First, you need to select the corresponding PaddleX pipeline based on your task scenario. For face recognition, the corresponding pipeline is the Face Recognition Pipeline. If you are unsure about the relationship between the task and the pipeline, you can refer to the [Pipeline List](../support_list/pipelines_list.en.md) supported by PaddleX to understand the capabilities of each pipeline.
+
+## 2. Quick Experience
+
+The official model weights provided in the PaddleX Face Recognition Pipeline are trained on real face data, so we first try the pipeline with a real-face demo dataset. PaddleX provides the following quick experience methods, which can be run locally through the Python API.
+
+* Local experience method for real face data:
+
+(1) Download the real face demonstration dataset and extract it to the local directory. The command is as follows:
+
+```bash
+wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/face_demo_gallery.tar
+tar -xf ./face_demo_gallery.tar
+```
+
+(2) Execute the Python script to perform image inference.
+
+```python
+from paddlex import create_pipeline
+# Create a face recognition pipeline
+pipeline = create_pipeline(pipeline="face_recognition")
+# Build a real face feature database
+index_data = pipeline.build_index(gallery_imgs="face_demo_gallery", gallery_label="face_demo_gallery/gallery.txt")
+# Image prediction
+output = pipeline.predict("face_demo_gallery/test_images/friends1.jpg", index=index_data)
+for res in output:
+    res.print()
+    res.save_to_img("./output/") # Save the visualized result image
+```
+
+Example inference result of the quick experience on the real-face demo data:
+
+<center>
+
+  <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/face_recognition/02.jpg" width=600>
+
+</center>
+
+* Local experience method for cartoon face data:
+
+(1) Download the cartoon face demo dataset and extract it to the local directory. The command is as follows:
+
+```bash
+wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/cartoonface_demo_gallery.tar
+tar -xf ./cartoonface_demo_gallery.tar
+```
+
+(2) Execute the Python script to perform image inference.
+
+```python
+from paddlex import create_pipeline
+# Create a face recognition pipeline
+pipeline = create_pipeline(pipeline="face_recognition")
+# Build a cartoon face feature index
+index_data = pipeline.build_index(gallery_imgs="cartoonface_demo_gallery", gallery_label="cartoonface_demo_gallery/gallery.txt")
+# Image prediction
+output = pipeline.predict("cartoonface_demo_gallery/test_images/cartoon_demo.jpg", index=index_data)
+for res in output:
+    res.print()
+    res.save_to_img("./output/") # Save the visualized result image
+```
+
+Example inference result of the quick experience on the cartoon-face demo data:
+
+<center>
+  <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/practical_tutorials/face_recognition/01.jpg" width=600>
+</center>
+
+After experiencing the pipeline, determine whether it meets expectations (including accuracy, speed, etc.) and whether the models it contains need further fine-tuning. If the speed or accuracy of a model does not meet expectations, try alternative models and check whether the results are satisfactory. If the final results are still unsatisfactory, the models need to be fine-tuned.
+
+This tutorial aims at cartoon face recognition. From the quick experience results above, the official default models perform well on real faces but fail to meet practical requirements on cartoon data, with missed detections of cartoon faces and misidentifications. Therefore, we need to carry out secondary development to train and fine-tune the face detection and face feature models.
+
+## 3. Construction of Face Feature Database
+
+### 3.1 Data Preparation
+This tutorial uses a real face demonstration dataset as an example. You can download the official real face demonstration dataset and extract it to your local directory. The command is as follows:
+
+```bash
+wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/face_demo_gallery.tar
+tar -xf ./face_demo_gallery.tar
+```
+
+If you wish to build a face feature database with private data, you need to organize the data in the following manner:
+
+```bash
+data_root             # Root directory of the dataset, the directory name can be changed
+├── images            # Directory for storing images, the directory name can be changed
+│   ├── ID0           # Identity ID name, it is better to use meaningful names, such as names of people
+│   │   ├── xxx.jpg   # Image, nested hierarchy is supported here
+│   │   ├── xxx.jpg   # Image, nested hierarchy is supported here
+│   │       ...
+│   ├── ID1           # Identity ID name, it is better to use meaningful names, such as names of people
+│   │   ├── xxx.jpg   # Image, nested hierarchy is supported here
+│   │   ├── xxx.jpg   # Image, nested hierarchy is supported here
+│   │       ...
+│       ...
+└── gallery.txt       # Annotation file for the feature library dataset, the file name can be changed. Each line provides the path and label of the image to be retrieved, separated by spaces. Example content: images/Chandler/Chandler00037.jpg Chandler
+```
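+
+For reference, each line of `gallery.txt` gives one image path and its label, separated by a space. A minimal example (file names other than `Chandler00037.jpg` are illustrative):
+
+```text
+images/Chandler/Chandler00037.jpg Chandler
+images/Chandler/Chandler00038.jpg Chandler
+images/Monica/Monica00001.jpg Monica
+```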
+
+### 3.2 Construction of Face Feature Database
+PaddleX provides a simple command for database construction. It only takes a few lines of code to build and save the face feature database:
+
+```python
+from paddlex import create_pipeline
+
+pipeline = create_pipeline(pipeline="face_recognition")
+# gallery_imgs: root directory of gallery images, gallery_label: annotation file
+index_data = pipeline.build_index(gallery_imgs="face_demo_gallery", gallery_label="face_demo_gallery/gallery.txt")
+# Save the face feature gallery
+index_data.save("face_index")
+```
+
+## 4. Adding and Removing from the Face Feature Database
+The quality of the face feature database is crucial for the results of face recognition. For cases where the recognition effect is poor, such as under specific lighting conditions or at specific shooting angles, it is necessary to collect and add corresponding images to the feature database. Additionally, when new identities are added, the corresponding face images need to be included in the face feature database. Conversely, for incorrect indices or identities that need to be removed, the corresponding indices should be deleted from the face feature database.
+
+PaddleX provides simple commands for adjusting the face feature database. To add images to the face feature database, you can call the `append_index` method; to remove indices, you can call the `remove_index` method. For the face recognition dataset in this tutorial, the commands for adjusting the face feature database are as follows:
+
+```python
+from paddlex import create_pipeline
+
+pipeline = create_pipeline("face_recognition")
+index_data = pipeline.build_index(gallery_imgs="face_demo_gallery", gallery_label="face_demo_gallery/gallery.txt", index_type="Flat")
+index_data.save("face_index_base")
+index_data = pipeline.remove_index(remove_ids="face_demo_gallery/remove_ids.txt", index="face_index_base")
+index_data.save("face_index_del")
+index_data = pipeline.append_index(gallery_imgs="face_demo_gallery", gallery_label="face_demo_gallery/gallery.txt", index="face_index_del")
+index_data.save("face_index_add")
+```
+
+Note:
+1. The default "HNSW32" index method does not support deleting indexes, so "Flat" is used here. For details on the differences between the index methods, please refer to the [Face Recognition Pipeline Tutorial](../pipeline_usage/tutorials/cv_pipelines/face_recognition.en.md#223-adding-and-deleting-operations-of-face-feature-library);
+2. For detailed descriptions of the parameters in the commands, please refer to the [Face Recognition Pipeline Tutorial](../pipeline_usage/tutorials/cv_pipelines/face_recognition.en.md#223-adding-and-deleting-operations-of-face-feature-library).
+
+After removing features from and adding features to the index with the methods above, test each of the generated indexes with the example image again:
+
+```python
+from paddlex import create_pipeline
+# Create a face recognition pipeline
+pipeline = create_pipeline(pipeline="face_recognition")
+# Pass the image to be predicted and the local feature index
+output = pipeline.predict("face_demo_gallery/test_images/friends1.jpg", index='./face_index_del')
+for res in output:
+    res.print()
+    res.save_to_img("./output/")
+```
+
+The visualization of the prediction results is as follows:
+<center>
+
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/practical_tutorials/face_recognition/02.jpg" width=600>
+
+</center>
+
+## 5. Training and Fine-Tuning a Face Detection Model with Cartoon Data
+
+### 5.1 Model Selection
+PaddleX provides 4 face detection models. For details, please refer to the [Model List](../support_list/models_list.en.md). Benchmarks for some of the face detection models are as follows:
+
+<table>
+<thead>
+<tr>
+<th>Model</th><th>Model Download Link</th>
+<th>AP(%)</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High-Performance Mode]</th>
+<th>Model Storage Size (M)</th>
+<th>Introduction</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>BlazeFace</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/BlazeFace_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/BlazeFace_pretrained.pdparams">Training Model</a></td>
+<td>15.4</td>
+<td>60.34 / 54.76</td>
+<td>84.18 / 84.18</td>
+<td>0.447</td>
+<td>A lightweight and efficient face detection model</td>
+</tr>
+<tr>
+<td>BlazeFace-FPN-SSH</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/BlazeFace-FPN-SSH_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/BlazeFace-FPN-SSH_pretrained.pdparams">Training Model</a></td>
+<td>18.7</td>
+<td>69.29 / 63.42</td>
+<td>86.96 / 86.96</td>
+<td>0.606</td>
+<td>An improved model of BlazeFace with added FPN and SSH structures</td>
+</tr>
+<tr>
+<td>PicoDet_LCNet_x2_5_face</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet_LCNet_x2_5_face_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PicoDet_LCNet_x2_5_face_pretrained.pdparams">Training Model</a></td>
+<td>31.4</td>
+<td>35.37 / 12.88</td>
+<td>126.24 / 126.24</td>
+<td>28.9</td>
+<td>A face detection model based on PicoDet_LCNet_x2_5</td>
+</tr>
+<tr>
+<td>PP-YOLOE_plus-S_face</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-YOLOE_plus-S_face_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-YOLOE_plus-S_face_pretrained.pdparams">Training Model</a></td>
+<td>36.1</td>
+<td>22.54 / 8.33</td>
+<td>138.67 / 138.67</td>
+<td>26.5</td>
+<td>A face detection model based on PP-YOLOE_plus-S</td>
+</tr>
+</tbody>
+</table>
+<p>Note: The above accuracy metrics are evaluated on the WIDER-FACE validation set in COCO format with an input size of 640*640. The GPU inference time for all models is based on an NVIDIA V100 machine with FP32 precision type, and the CPU inference speed is based on an Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz with FP32 precision type.</p>
+
+### 5.2 Data Preparation and Verification
+
+This tutorial uses the Cartoon Face Detection Dataset as an example dataset, which can be obtained with the following command. If you use your own annotated dataset, you need to adjust it to meet PaddleX's data format requirements. For details on the data format, please refer to the [PaddleX Object Detection Task Module Data Preparation Tutorial](../data_annotations/cv_modules/object_detection.en.md).
+
+Dataset acquisition command:
+
+```bash
+cd /path/to/paddlex
+wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/cartoonface_coco_examples.tar -P ./dataset
+tar -xf ./dataset/cartoonface_coco_examples.tar -C ./dataset/
+```
+
+Dataset verification requires only one command:
+
+```bash
+python main.py -c paddlex/configs/modules/face_detection/PP-YOLOE_plus-S_face.yaml \
+    -o Global.mode=check_dataset \
+    -o Global.dataset_dir=./dataset/cartoonface_coco_examples
+```
+
+After executing the above command, PaddleX will verify the dataset and collect basic statistics about it. After the command runs successfully, `Check dataset passed !` will be printed in the log, and the related outputs will be saved in the `./output/check_dataset` directory under the current directory. The output directory includes visualized sample images and a sample distribution histogram. The verification result file is saved in `./output/check_dataset_result.json`, and its content is as follows:
+
+```json
+{
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "num_classes": 1,
+    "train_samples": 2000,
+    "train_sample_paths": [
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_27140.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_23804.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_76484.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_18197.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_35260.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_00404.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_15455.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_10119.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_26397.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_17044.jpg"
+    ],
+    "val_samples": 500,
+    "val_sample_paths": [
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_77030.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_36265.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_63390.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_59167.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_82024.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_50449.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_71386.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_23112.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_69609.jpg",
+      "check_dataset\/demo_img\/personai_icartoonface_dettrain_39296.jpg"
+    ]
+  },
+  "analysis": {
+    "histogram": "check_dataset\/histogram.png"
+  },
+  "dataset_path": ".\/dataset\/cartoonface_coco_examples",
+  "show_type": "image",
+  "dataset_type": "COCODetDataset"
+}
+```
+
+In the above verification results, `check_pass` being `True` indicates that the dataset format meets the requirements. The explanations for the other metrics are as follows:
+* `attributes.num_classes`: This dataset contains only one category, which is human faces.
+* `attributes.train_samples`: The number of training samples in this dataset is 2000.
+* `attributes.val_samples`: The number of validation samples in this dataset is 500.
+* `attributes.train_sample_paths`: This is a list of relative paths to the visualized training samples.
+* `attributes.val_sample_paths`: This is a list of relative paths to the visualized validation samples.
+
+Additionally, the dataset verification has analyzed the distribution of annotation counts for all categories in the dataset and generated a histogram (histogram.png):
+<center>
+
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/practical_tutorials/face_recognition/03.png" width=600>
+
+</center>
+
+<b>Note</b>: Only datasets that pass the verification can be used for training and evaluation.
+
+### Dataset Format Conversion / Dataset Splitting (Optional)
+
+If you need to convert the dataset format or re-split the dataset, you can set this by modifying the configuration file or appending hyperparameters.
+
+Parameters related to dataset verification can be set by modifying the fields under `CheckDataset` in the configuration file. Some example explanations of the parameters in the configuration file are as follows:
+
+* `CheckDataset`:
+    * `convert`:
+        * `enable`: Whether to perform dataset format conversion. If set to `True`, the dataset format will be converted. The default is `False`.
+        * `src_dataset_type`: If dataset format conversion is performed, you need to specify the source dataset format. The available source formats are `LabelMe` and `VOC`.
+    * `split`:
+        * `enable`: Whether to re-split the dataset. If set to `True`, the dataset will be re-split. The default is `False`.
+        * `train_percent`: If re-splitting the dataset, you need to set the percentage of the training set. This should be an integer between 0 and 100, and it must sum to 100 with `val_percent`.
+        * `val_percent`: If re-splitting the dataset, you need to set the percentage of the validation set. This should be an integer between 0 and 100, and it must sum to 100 with `train_percent`.
+
+Both data conversion and data splitting can be enabled simultaneously. For data splitting, the original annotation file will be renamed to `xxx.bak` in the original path. These parameters can also be set by appending command-line arguments, for example, to re-split the dataset and set the training and validation set ratios: `-o CheckDataset.split.enable=True -o CheckDataset.split.train_percent=80 -o CheckDataset.split.val_percent=20`.
+
+### 5.3 Model Training and Evaluation
+
+#### Model Training
+
+Before training, please ensure that you have verified the dataset. To complete the training of a PaddleX model, you only need the following command:
+
+```bash
+python main.py -c paddlex/configs/modules/face_detection/PP-YOLOE_plus-S_face.yaml \
+    -o Global.mode=train \
+    -o Global.dataset_dir=./dataset/cartoonface_coco_examples \
+    -o Train.epochs_iters=10
+```
+
+In PaddleX, model training supports functions such as modifying training hyperparameters and single-machine single-card/multi-card training. You only need to modify the configuration file or append command-line parameters.
+
+Each model in PaddleX provides a configuration file for model development, which is used to set related parameters. Parameters related to model training can be set by modifying the fields under `Train` in the configuration file. Some examples of parameters in the configuration file are as follows:
+
+* `Global`:
+    * `mode`: Mode, supports data validation (`check_dataset`), model training (`train`), and model evaluation (`evaluate`);
+    * `device`: Training device, options include `cpu`, `gpu`, `xpu`, `npu`, `mlu`. Except for `cpu`, multi-card training can specify card numbers, such as: `gpu:0,1,2,3`;
+* `Train`: Training hyperparameter settings;
+    * `epochs_iters`: Setting for the number of training epochs;
+    * `learning_rate`: Setting for the training learning rate;
+
+For more information on hyperparameters, please refer to [PaddleX General Model Configuration File Parameter Description](../module_usage/instructions/config_parameters_common.md).
+
+<b>Note:</b>
+
+- The above parameters can be set by appending command-line parameters. For example, to specify the mode as model training: `-o Global.mode=train`; to specify training on the first 2 GPUs: `-o Global.device=gpu:0,1`; to set the number of training epochs to 10: `-o Train.epochs_iters=10`.
+- During model training, PaddleX automatically saves the model weight files, with the default directory being `output`. If you need to specify a save path, you can use the `-o Global.output` field in the configuration file.
+- PaddleX abstracts away the concepts of dynamic graph weights and static graph weights for you. During model training, both dynamic graph and static graph weights are generated. By default, static graph weights are used for model inference.
+
+<b>Training Output Explanation:</b>
+
+After completing model training, all outputs are saved in the specified output directory (default is `./output/`). The usual outputs include:
+
+* `train_result.json`: Training result record file, which records whether the training task was completed normally, as well as the weight metrics, related file paths, etc. (a sketch for inspecting this file follows this list);
+* `train.log`: Training log file, which records the changes in model metrics and loss during the training process;
+* `config.yaml`: Training configuration file, which records the hyperparameter settings for this training session;
+* `.pdparams`, `.pdema`, `.pdopt`, `.pdstates`, `.pdiparams`, `.pdmodel`: Model weight-related files, including network parameters, EMA, optimizer state, training states, static graph network parameters, static graph network structure, etc.;
+
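+After training finishes, a quick way to locate the produced weights and their metrics is to inspect `train_result.json` directly. The sketch below simply loads and pretty-prints the file; the exact keys may differ between PaddleX versions, so treat it as a starting point for finding the best-weight entry rather than a fixed schema.
+
+```python
+import json
+
+# Default output location; change it if you set -o Global.output to another directory
+with open("./output/train_result.json", "r", encoding="utf-8") as f:
+    train_result = json.load(f)
+
+# Dump the full record (completion status, weight metrics, related file paths)
+print(json.dumps(train_result, indent=2, ensure_ascii=False))
+```
+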
+#### Model Evaluation
+
+After completing model training, you can evaluate the specified model weight file on the validation set to verify the model's accuracy. To perform model evaluation using PaddleX, you only need one command:
+
+```bash
+python main.py -c paddlex/configs/modules/face_detection/PP-YOLOE_plus-S_face.yaml \
+    -o Global.mode=evaluate \
+    -o Global.dataset_dir=./dataset/cartoonface_coco_examples
+```
+
+<b>Note:</b> When evaluating the model, you need to specify the path to the model weight file. Each configuration file has a default weight save path built-in. If you need to change it, you can simply set it by adding a command-line parameter, such as `-o Evaluate.weight_path=./output/best_model/best_model.pdparams`.
+
+### 5.4 Model Tuning
+
+After learning about model training and evaluation, we can improve the model's accuracy by adjusting hyperparameters. Reasonably setting the number of training epochs controls how long the model trains, helping avoid overfitting or underfitting, while the learning rate affects the speed and stability of convergence. Therefore, when optimizing model performance, carefully consider the values of these two parameters and adjust them flexibly according to the actual situation to achieve the best training results.
+
+It is recommended to follow the control variable method when debugging parameters:
+1. First, fix the number of training epochs to 10 and the batch size to 4.
+2. Based on the PP-YOLOE_plus-S_face model, launch two experiments with learning rates of 0.001 and 0.0001, respectively.
+3. It can be observed that Experiment 2, with a learning rate of 0.0001, achieves the highest accuracy. Building on this set of training hyperparameters, increasing the number of training epochs to 20 yields even better accuracy.
+
+Learning rate exploration experiment results:
+<center>
+
+<table>
+<thead>
+<tr>
+<th>Experiment</th>
+<th>Epochs</th>
+<th>Learning Rate</th>
+<th>Batch Size</th>
+<th>Training Environment</th>
+<th>mAP@0.5</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>Experiment 1</td>
+<td>10</td>
+<td>0.001</td>
+<td>4</td>
+<td>4 GPUs</td>
+<td>0.323</td>
+</tr>
+<tr>
+<td>Experiment 2</td>
+<td>10</td>
+<td>0.0001</td>
+<td>4</td>
+<td>4 GPUs</td>
+<td><strong>0.334</strong></td>
+</tr>
+</tbody>
+</table>
+</center>
+
+Changing epoch experiment results:
+<center>
+
+<table>
+<thead>
+<tr>
+<th>Experiment</th>
+<th>Epochs</th>
+<th>Learning Rate</th>
+<th>Batch Size</th>
+<th>Training Environment</th>
+<th>mAP@0.5</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>Experiment 2</td>
+<td>10</td>
+<td>0.0001</td>
+<td>4</td>
+<td>4 GPUs</td>
+<td>0.334</td>
+</tr>
+<tr>
+<td>Experiment 2 with increased epochs</td>
+<td>20</td>
+<td>0.0001</td>
+<td>4</td>
+<td>4 GPUs</td>
+<td><strong>0.360</strong></td>
+</tr>
+</tbody>
+</table>
+</center>
+
+<b>Note: This tutorial is for a 4-GPU setup. If you only have one GPU, you can adjust the number of training GPUs to complete this experiment. However, the final metrics may not align with the above metrics, which is a normal situation.</b>
+
+### 5.5 Model Inference
+
+After completing model training, evaluation, and fine-tuning, you can use the model weights that meet your metric requirements for inference. To run inference from the command line, you only need the following command. Here we run inference on the cartoon face demo image from the earlier [Quick Experience](#2-Quick Experience), on which the official model weights performed poorly.
+
+```bash
+python main.py -c paddlex/configs/modules/face_detection/PP-YOLOE_plus-S_face.yaml \
+    -o Global.mode=predict \
+    -o Predict.model_dir="output/best_model/inference" \
+    -o Predict.input="cartoonface_demo_gallery/test_images/cartoon_demo.jpg"
+```
+
+Running the above command generates prediction results under `./output`; the prediction result for `cartoon_demo.jpg` is as follows:
+<center>
+
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/practical_tutorials/face_detection/04.jpg" width="600"/>
+
+</center>
+
+## 6. Fine-tuning Face Feature Model with Cartoon Data
+During PaddleX training, the best model is saved to `output/best_model/inference`. Before starting the experiments in this section, please copy the best weights from the previous experiments to another path so that they are not overwritten by the new runs.
+
+### 6.1 Model Selection
+PaddleX provides 2 face feature models. For details, please refer to the [Model List](../support_list/models_list.md). The benchmark of the face feature models is as follows:
+
+<table>
+<thead>
+<tr>
+<th>Model</th><th>Model Download Link</th>
+<th>Output Feature Dimension</th>
+<th>Acc (%)<br/>AgeDB-30/CFP-FP/LFW</th>
+<th>GPU Inference Time (ms)<br/>[Normal Mode / High Performance Mode]</th>
+<th>CPU Inference Time (ms)<br/>[Normal Mode / High Performance Mode]</th>
+<th>Model Storage Size (M)</th>
+<th>Introduction</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>MobileFaceNet</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/MobileFaceNet_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/MobileFaceNet_pretrained.pdparams">Training Model</a></td>
+<td>128</td>
+<td>96.28/96.71/99.58</td>
+<td>3.16 / 0.48</td>
+<td>6.49 / 6.49</td>
+<td>4.1</td>
+<td>A face feature extraction model trained on the MS1Mv3 dataset based on MobileFaceNet</td>
+</tr>
+<tr>
+<td>ResNet50_face</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/ResNet50_face_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/ResNet50_face_pretrained.pdparams">Training Model</a></td>
+<td>512</td>
+<td>98.12/98.56/99.77</td>
+<td>5.68 / 1.09</td>
+<td>14.96 / 11.90</td>
+<td>87.2</td>
+<td>A face feature extraction model trained on the MS1Mv3 dataset based on ResNet50</td>
+</tr>
+</tbody>
+</table>
+<p>Note: The above accuracy metrics are measured on the AgeDB-30, CFP-FP, and LFW datasets. The GPU inference time is based on NVIDIA Tesla T4, with precision type FP32. The CPU inference speed is based on Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz, with 8 threads and precision type FP32.</p>
+
+### 6.2 Data Preparation and Verification
+
+This tutorial uses the Cartoon Face Recognition Dataset as the example dataset, which can be obtained through the following command. If you use your own labeled dataset, you need to adjust it to meet PaddleX's data format requirements. For details on the data format, please refer to the [Face Feature Module Tutorial Document](../module_usage/tutorials/cv_modules/face_feature.md#414-Face Feature Module Dataset Organization).
+
+Dataset acquisition command:
+
+```bash
+cd /path/to/paddlex
+wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/cartoonface_rec_examples.tar -P ./dataset
+tar -xf ./dataset/cartoonface_rec_examples.tar -C ./dataset/
+```
+
+<b>Data Validation</b>
+
+When validating a dataset, only one command is needed:
+
+```bash
+python main.py -c paddlex/configs/modules/face_feature/ResNet50_face.yaml \
+    -o Global.mode=check_dataset \
+    -o Global.dataset_dir=./dataset/cartoonface_rec_examples
+```
+
+After executing the above command, PaddleX will verify the dataset and collect basic information about the dataset. Upon successful execution of the command, the message `Check dataset passed !` will be printed in the log. The related outputs will be saved in the `./output/check_dataset` directory under the current directory, which includes visualized example sample images. The verification result file is saved in `./output/check_dataset_result.json`, and the specific content of the verification result file is
+
+```json
+{
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "train_label_file": "..\/..\/dataset\/cartoonface_rec_examples\/train\/label.txt",
+    "train_num_classes": 5006,
+    "train_samples": 77934,
+    "train_sample_paths": [
+      "check_dataset\/demo_img\/0000259.jpg",
+      "check_dataset\/demo_img\/0000052.jpg",
+      "check_dataset\/demo_img\/0000009.jpg",
+      "check_dataset\/demo_img\/0000203.jpg",
+      "check_dataset\/demo_img\/0000007.jpg",
+      "check_dataset\/demo_img\/0000055.jpg",
+      "check_dataset\/demo_img\/0000273.jpg",
+      "check_dataset\/demo_img\/0000006.jpg",
+      "check_dataset\/demo_img\/0000155.jpg",
+      "check_dataset\/demo_img\/0000006.jpg"
+    ],
+    "val_label_file": "..\/..\/dataset\/cartoonface_rec_examples\/val\/pair_label.txt",
+    "val_num_classes": 2,
+    "val_samples": 8000,
+    "val_sample_paths": [
+      "check_dataset\/demo_img\/0000011.jpg",
+      "check_dataset\/demo_img\/0000121.jpg",
+      "check_dataset\/demo_img\/0000118.jpg",
+      "check_dataset\/demo_img\/0000034.jpg",
+      "check_dataset\/demo_img\/0000229.jpg",
+      "check_dataset\/demo_img\/0000070.jpg",
+      "check_dataset\/demo_img\/0000033.jpg",
+      "check_dataset\/demo_img\/0000089.jpg",
+      "check_dataset\/demo_img\/0000139.jpg",
+      "check_dataset\/demo_img\/0000081.jpg"
+    ]
+  },
+  "analysis": {},
+  "dataset_path": ".\/dataset\/cartoonface_rec_examples",
+  "show_type": "image",
+  "dataset_type": "ClsDataset"
+}
+```
+
+In the above verification results, `check_pass` being `True` indicates that the dataset format meets the requirements. The explanations for the other metrics are as follows:
+* `attributes.train_num_classes`: The training set of this dataset contains 5006 face classes; this class count is the value that must be passed to training via `Train.num_classes` (see the sketch below);
+* `attributes.val_num_classes`: The validation set of the face feature model has only two classes, 0 and 1, indicating that a pair of face images does not belong (0) or belongs (1) to the same identity;
+* `attributes.train_samples`: The number of samples in the training set of this dataset is 77934;
+* `attributes.val_samples`: The number of samples in the validation set of this dataset is 8000;
+* `attributes.train_sample_paths`: The list of relative paths for visualizing the samples in the training set of this dataset;
+* `attributes.val_sample_paths`: The list of relative paths for visualizing the samples in the validation set of this dataset;
+
+Note: Only data that has passed the data verification can be used for training and evaluation.
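+
+As a convenience, the verified class count can be read back from the verification result file instead of being copied by hand. The sketch below pulls `train_num_classes` out of `check_dataset_result.json` and prints the corresponding training flag; it relies only on the keys shown in the JSON above.
+
+```python
+import json
+
+with open("./output/check_dataset_result.json", "r", encoding="utf-8") as f:
+    result = json.load(f)
+
+# The training command needs to know how many identity classes the training set contains
+num_classes = result["attributes"]["train_num_classes"]
+print(f"Append this to the training command: -o Train.num_classes={num_classes}")
+```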
+
+### 6.3 Model Training and Model Evaluation
+
+#### Model Training
+
+Before training, please ensure that you have verified the dataset. To complete the training of the PaddleX model, simply use the following command:
+
+```bash
+python main.py -c paddlex/configs/modules/face_feature/ResNet50_face.yaml \
+    -o Global.mode=train \
+    -o Global.dataset_dir=./dataset/cartoonface_rec_examples \
+    -o Train.num_classes=5006 \
+    -o Train.log_interval=50
+```
+
+In PaddleX, model training supports features such as modifying training hyperparameters and single-machine single-card/multi-card training. You only need to modify the configuration file or append command-line parameters.
+
+Each model in PaddleX provides a configuration file for model development to set relevant parameters. Parameters related to model training can be set by modifying the fields under `Train` in the configuration file. Some examples of parameters in the configuration file are as follows:
+
+* `Global`:
+    * `mode`: Mode, supports data validation (`check_dataset`), model training (`train`), and model evaluation (`evaluate`);
+    * `device`: Training device, options include `cpu`, `gpu`, `xpu`, `npu`, `mlu`. Except for `cpu`, multi-card training can specify card numbers, such as: `gpu:0,1,2,3`;
+* `Train`: Training hyperparameter settings;
+    * `epochs_iters`: Setting for the number of training epochs;
+    * `learning_rate`: Setting for the training learning rate;
+
+For more information on hyperparameters, please refer to [PaddleX General Model Configuration File Parameter Description](../module_usage/instructions/config_parameters_common.md)
+
+<b>Note:</b>
+
+- The above parameters can be set by appending command-line parameters, such as specifying the mode as model training: `-o Global.mode=train`; specifying training on the first 2 GPUs: `-o Global.device=gpu:0,1`; setting the number of training epochs to 50: `-o Train.epochs_iters=50`.
+- During model training, PaddleX will automatically save the model weight files, with the default directory being `output`. If you need to specify a save path, you can use the `-o Global.output` field in the configuration file.
+- PaddleX hides the concept of dynamic graph weights and static graph weights from you. During model training, both dynamic graph and static graph weights will be generated. By default, static graph weights are used for model inference.
+
+<b>Training Output Explanation:</b>
+
+After completing model training, all outputs are saved in the specified output directory (default is ./output/), and typically include the following:
+* `train_result.json`: Training result record file, which records whether the training task was completed normally, as well as the metrics of the generated weights and relevant file paths;
+* `train.log`: Training log file, which records changes in model metrics and loss during the training process;
+* `config.yaml`: Training configuration file, which records the hyperparameter settings for this training session;
+* `.pdparams`, `.pdema`, `.pdopt`, `.pdstates`, `.pdiparams`, `.pdmodel`: Model weight-related files, including network parameters, EMA, optimizer state, training states, static graph network parameters, and static graph network structure;
+
+#### Model Evaluation
+After completing model training, you can evaluate the specified model weight file on the validation set to verify the model's accuracy. To perform model evaluation using PaddleX, you only need a single command:
+
+```bash
+python main.py -c paddlex/configs/modules/face_feature/ResNet50_face.yaml \
+    -o Global.mode=evaluate \
+    -o Global.dataset_dir=./dataset/cartoonface_rec_examples \
+    -o Evaluate.log_interval=50
+```
+
+### 6.4 Model Tuning
+
+The face feature model in PaddleX is trained based on the ArcFace loss function. In addition to learning rate and number of training epochs, the margin parameter in the ArcFace loss function is a key hyperparameter. It enhances the separability between classes by introducing a fixed angular margin in the angular space, thereby improving the model's discriminative ability. In our model tuning experiments, we mainly focus on experimenting with the margin parameter.
+
+The size of the margin has a significant impact on the training effect and final performance of the model. Generally speaking: increasing the margin makes the decision boundaries between different classes more distinct, forcing the model to separate the feature vectors of different classes more widely in the feature space, thereby improving the separability between classes. However, an overly large margin may increase the optimization difficulty during training, slow down the convergence speed, or even prevent convergence. Additionally, in cases where the training data is insufficient or the data noise is high, an overly large margin may cause the model to overfit the training data, reducing its generalization ability on unseen data.
+
+A smaller margin can reduce the training difficulty of the model and make it easier to converge. For training sets with small data volume or low data quality, reducing the margin can lower the risk of overfitting and improve the model's generalization ability. However, an overly small margin may not be sufficient to separate the features of different classes widely enough, and the model may fail to learn discriminative features, affecting recognition accuracy.
+
+In the data validation phase, we can see that the cartoon face recognition dataset we used has 77,934 training samples. Compared to general face recognition datasets (MS1Mv2 with 5.8 million, Glint360K with 17 million), it is a relatively small dataset. Therefore, when training the face feature model using this dataset, we recommend reducing the margin parameter to lower the risk of overfitting and improve the model's generalization ability. In PaddleX, you can specify the value of the margin parameter during face feature model training by adding the command-line parameter `-o Train.arcmargin_m=xx`. Example training command:
+
+```bash
+python main.py -c paddlex/configs/modules/face_feature/ResNet50_face.yaml \
+    -o Global.mode=train \
+    -o Global.dataset_dir=./dataset/cartoonface_rec_examples \
+    -o Train.num_classes=5006 \
+    -o Train.log_interval=50 \
+    -o Train.arcmargin_m=0.3
+```
+
+When debugging parameters, follow the control variable method:
+1. Fix the number of training epochs to 25 and the learning rate to 4e-4.
+2. Launch two experiments based on the ResNet50_face model with margins of 0.5 and 0.3, respectively.
+
+Margin exploration experiment results:
+
+<center>
+
+<table>
+<thead>
+<tr>
+<th>Experiment</th>
+<th>Epochs</th>
+<th>Learning Rate</th>
+<th>Margin</th>
+<th>Training Environment</th>
+<th>Acc</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>Experiment 1</td>
+<td>25</td>
+<td>4e-4</td>
+<td>0.5</td>
+<td>4 GPUs</td>
+<td>0.925</td>
+</tr>
+<tr>
+<td>Experiment 2</td>
+<td>25</td>
+<td>4e-4</td>
+<td>0.3</td>
+<td>4 GPUs</td>
+<td><strong>0.928</strong></td>
+</tr>
+</tbody>
+</table>
+</center>
+
+Note: This tutorial is designed for a 4-GPU setup. If you only have 1 GPU, you can adjust the number of training GPUs to complete the experiment. However, the final metrics may not align with the above metrics, which is a normal situation.
+
+## 7. Production Line Integration
+
+After fine-tuning the face detection model and face feature model with cartoon scene data, you can select the high-precision model weights to integrate into the PaddleX face recognition production line.
+
+First, obtain the face_recognition production line configuration file, which will later be loaded for prediction. You can execute the following command to save it to `./my_path`:
+
+```bash
+paddlex --get_pipeline_config face_recognition --save_path ./my_path
+```
+
+Modify `SubModules.Detection.model_dir` and `SubModules.Recognition.model_dir` in the configuration file to the paths of your fine-tuned face detection model and face feature model, respectively, as in the following example. You can then directly integrate the face recognition production line into your Python project using this configuration:
+
+```yaml
+pipeline_name: face_recognition
+
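+# The settings below are assumed to work as follows: index is left as None because the gallery
+# index is built and passed in at predict time; det_threshold filters face detection results;
+# rec_threshold is the minimum retrieval score before a face is labeled "Unknown"; rec_topk is
+# the number of gallery candidates retrieved per face.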
+index: None
+det_threshold: 0.6
+rec_threshold: 0.4
+rec_topk: 5
+
+SubModules:
+  Detection:
+    module_name: face_detection
+    model_name: PP-YOLOE_plus-S_face
+    model_dir: "path/to/your/det_model" # 使用卡通人脸数据微调的人脸检测模型
+    batch_size: 1
+  Recognition:
+    module_name: face_feature
+    model_name: ResNet50_face
+    model_dir: "path/to/your/rec_model" # 使用卡通人脸数据微调的人脸特征模型
+    batch_size: 1
+```
+
+Subsequently, in your Python code, you can use the production line as follows:
+
+```python
+from paddlex import create_pipeline
+# Create the face recognition production line
+pipeline = create_pipeline(pipeline="my_path/face_recognition.yaml")
+# Build the cartoon face feature gallery index
+index_data = pipeline.build_index(gallery_imgs="cartoonface_demo_gallery", gallery_label="cartoonface_demo_gallery/gallery.txt")
+# Run prediction on an image
+output = pipeline.predict("cartoonface_demo_gallery/test_images/cartoon_demo.jpg", index=index_data)
+for res in output:
+    res.print()
+    res.save_to_img("./output/") # Save the visualized result image
+```
+
+If a cartoon face is detected but recognized as "Unknown0.00", lower the `rec_threshold` retrieval threshold in the configuration file and try again. If faces are recognized incorrectly, replace the best weights with the weights from the last training epoch, or with recognition model weights trained with different hyperparameters, and try again.
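+
+If you prefer to adjust the threshold programmatically rather than editing the file by hand, the minimal sketch below can be used. It assumes PyYAML is installed and writes the modified configuration to a new file (comments in the original YAML are not preserved by this round trip); afterwards, rebuild the gallery index and run prediction exactly as in the example above.
+
+```python
+import yaml  # PyYAML
+
+from paddlex import create_pipeline
+
+# Load the production line configuration saved earlier and lower the retrieval threshold
+with open("my_path/face_recognition.yaml", "r", encoding="utf-8") as f:
+    cfg = yaml.safe_load(f)
+
+cfg["rec_threshold"] = 0.25  # hypothetical lower value; tune it for your own gallery
+
+# Save a copy and create the pipeline from the modified configuration
+with open("my_path/face_recognition_low_thr.yaml", "w", encoding="utf-8") as f:
+    yaml.safe_dump(cfg, f, allow_unicode=True)
+
+pipeline = create_pipeline(pipeline="my_path/face_recognition_low_thr.yaml")
+```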
+
+## 8. Production Line Service Deployment
+
+In addition to the Python API integration development method mentioned earlier, PaddleX also provides high-performance deployment and service-oriented deployment capabilities, which are detailed as follows:
+* High-performance deployment: In actual production environments, many applications have strict standards for the performance metrics of deployment strategies (especially response speed) to ensure the efficient operation of the system and the smoothness of user experience. To this end, PaddleX provides a high-performance inference plugin, which aims to deeply optimize the performance of model inference and pre- and post-processing, achieving significant acceleration of the end-to-end process. For detailed high-performance deployment procedures, please refer to the [PaddleX High-Performance Inference Guide](../pipeline_deploy/high_performance_inference.md).
+* Service-oriented deployment: Service-oriented deployment is a common form of deployment in actual production environments. By encapsulating the inference function as a service, clients can access these services through network requests to obtain inference results. PaddleX supports users in achieving service-oriented deployment of the production line at a low cost. For detailed service-oriented deployment procedures, please refer to the [PaddleX Service-Oriented Deployment Guide](../pipeline_deploy/serving.md).
+
+You can choose the appropriate method to deploy the production line according to your needs, and then proceed with subsequent AI application integration.
+This section takes service-oriented deployment as an example and guides you through the service-oriented deployment of the production line and API calls, which can be completed in just 5 simple steps:
+
+(1) Execute the following Python script to save the feature database of the cartoon face demonstration data.
+
+```python
+from paddlex import create_pipeline
+# Create the face recognition production line
+pipeline = create_pipeline(pipeline="face_recognition")
+# Build the cartoon face feature gallery index
+index_data = pipeline.build_index(gallery_imgs="cartoonface_demo_gallery", gallery_label="cartoonface_demo_gallery/gallery.txt")
+# Save the cartoon face feature gallery index
+index_data.save("cartoonface_index")
+```
+
+(2) Execute the following command in the command line to install the service-oriented deployment plugin
+
+```bash
+paddlex --install serving
+```
+
+(3) Obtain the face recognition production line configuration file
+
+```bash
+paddlex --get_pipeline_config face_recognition --save_path ./
+```
+
+(4) Modify the configuration file to set the directory for the feature database.
+
+```yaml
+pipeline_name: face_recognition
+
+index: ./cartoonface_index # Local feature gallery directory, using the index built in step (1)
+det_threshold: 0.6
+rec_threshold: 0.4
+rec_topk: 5
+
+SubModules:
+  Detection:
+    module_name: face_detection
+    model_name: PP-YOLOE_plus-S_face
+    model_dir: "path/to/your/det_model" # 使用卡通人脸数据微调的人脸检测模型
+    batch_size: 1
+  Recognition:
+    module_name: face_feature
+    model_name: ResNet50_face
+    model_dir: "path/to/your/rec_model" # 使用卡通人脸数据微调的人脸特征模型
+    batch_size: 1
+```
+
+(5) Start the server-side service
+
+```bash
+paddlex --serve --pipeline face_recognition.yaml
+```
+
+#### Client Invocation
+
+PaddleX provides simple and convenient invocation interfaces and example code. Here, we use a simple image inference as an example. For more detailed invocation interface support, please refer to the [Face Recognition Pipeline Usage Tutorial](../pipeline_usage/tutorials/cv_pipelines/face_recognition.md#3-Development-Integration-Deployment).
+
+Client invocation example code:
+
+```python
+import base64
+import pprint
+import sys
+
+import requests
+
+API_BASE_URL = "http://0.0.0.0:8080"
+
+infer_image_path = "cartoonface_demo_gallery/test_images/cartoon_demo.jpg" # 测试图片
+
+with open(infer_image_path, "rb") as file:
+    image_bytes = file.read()
+    image_data = base64.b64encode(image_bytes).decode("ascii")
+
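+# Send the base64-encoded image to the face-recognition-infer endpoint started in step (5)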
+payload = {"image": image_data}
+resp_infer = requests.post(f"{API_BASE_URL}/face-recognition-infer", json=payload)
+if resp_infer.status_code != 200:
+    print(f"Request to face-recognition-infer failed with status code {resp_infer}.")
+    pprint.pp(resp_infer.json())
+    sys.exit(1)
+result_infer = resp_infer.json()["result"]
+
+output_image_path = './out.jpg'
+with open(output_image_path, "wb") as file:
+    file.write(base64.b64decode(result_infer["image"]))
+print(f"Output image saved at {output_image_path}")
+print("\nDetected faces:")
+pprint.pp(result_infer["faces"])
+```
+
+After executing the example code, you can view the inference results of the service deployment in the output log and the saved inference images respectively.

+ 107 - 57
docs/practical_tutorials/ocr_det_license_tutorial.en.md

@@ -33,39 +33,54 @@ After experiencing the pipeline, determine if it meets your expectations (includ
 
 ## 3. Select a Model
 
-PaddleX provides two end-to-end text detection models. For details, refer to the [Model List](../support_list/models_list.en.md). The benchmarks of the models are as follows:
+PaddleX provides 4 end-to-end text detection models. For details, refer to the [Model List](../support_list/models_list.en.md). The benchmarks of the models are as follows:
 
 <table>
 <thead>
 <tr>
-<th>Model List</th>
-<th>Detection Hmean(%)</th>
-<th>Recognition Avg Accuracy(%)</th>
-<th>GPU Inference Time(ms)</th>
-<th>CPU Inference Time(ms)</th>
-<th>Model Size(M)</th>
+<th>Model</th>
+<th>Detection Hmean (%)</th>
+<th>GPU Inference Time (ms)</th>
+<th>CPU Inference Time (ms)</th>
+<th>Model Storage Size (M)</th>
+<th>Introduction</th>
 </tr>
 </thead>
 <tbody>
 <tr>
-<td>PP-OCRv4_server</td>
-<td>82.69</td>
-<td>79.20</td>
-<td>22.20346</td>
-<td>2662.158</td>
-<td>198</td>
+<td>PP-OCRv4_server_det</td>
+<td>82.56</td>
+<td>83.3501</td>
+<td>2434.01</td>
+<td>109</td>
+<td>The server-side text detection model of PP-OCRv4, with higher accuracy, suitable for deployment on high-performance servers</td>
+</tr>
+<tr>
+<td>PP-OCRv4_mobile_det</td>
+<td>77.35</td>
+<td>10.6923</td>
+<td>120.177</td>
+<td>4.7</td>
+<td>The mobile text detection model of PP-OCRv4, with higher efficiency, suitable for deployment on edge devices</td>
 </tr>
 <tr>
-<td>PP-OCRv4_mobile</td>
-<td>77.79</td>
-<td>78.20</td>
-<td>2.719474</td>
-<td>79.1097</td>
-<td>15</td>
+<td>PP-OCRv3_mobile_det</td>
+<td>78.68</td>
+<td>-</td>
+<td>-</td>
+<td>2.1</td>
+<td>The mobile text detection model of PP-OCRv3, with higher efficiency, suitable for deployment on edge devices</td>
+</tr>
+<tr>
+<td>PP-OCRv3_server_det</td>
+<td>80.11</td>
+<td>-</td>
+<td>-</td>
+<td>102.1</td>
+<td>The server-side text detection model of PP-OCRv3, with higher accuracy, suitable for deployment on high-performance servers</td>
 </tr>
 </tbody>
 </table>
-<b>Note: The above accuracy metrics are for the Detection Hmean and Recognition Avg Accuracy on PaddleOCR's self-built Chinese dataset validation set. GPU inference time is based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b>
 
 In short, the models listed from top to bottom have faster inference speeds, while from bottom to top, they have higher accuracy. This tutorial uses the `PP-OCRv4_server` model as an example to complete a full model development process. Depending on your actual usage scenario, choose a suitable model for training. After training, evaluate the appropriate model weights within the pipeline and use them in practical scenarios.
 
@@ -86,7 +101,7 @@ tar -xf ./dataset/ccpd_text_det.tar -C ./dataset/
 To validate the dataset, simply use the following command:
 
 ```bash
-python main.py -c paddlex/configs/text_detection/PP-OCRv4_server_det.yaml \
+python main.py -c paddlex/configs/modules/text_detection/PP-OCRv4_server_det.yaml \
     -o Global.mode=check_dataset \
     -o Global.dataset_dir=./dataset/ccpd_text_det
 ```
@@ -100,35 +115,35 @@ After executing the above command, PaddleX will validate the dataset and collect
   "attributes": {
     "train_samples": 5769,
     "train_sample_paths": [
-      "..\/..\/ccpd_text_det\/images\/0274305555556-90_266-204&460_520&548-516&548_209&547_204&464_520&460-0_0_3_25_24_24_24_26-63-89.jpg",
-      "..\/..\/ccpd_text_det\/images\/0126171875-90_267-294&424_498&486-498&486_296&485_294&425_496&424-0_0_3_24_33_32_30_31-157-29.jpg",
-      "..\/..\/ccpd_text_det\/images\/0371516927083-89_254-178&423_517&534-517&534_204&525_178&431_496&423-1_0_3_24_33_31_29_31-117-667.jpg",
-      "..\/..\/ccpd_text_det\/images\/03349609375-90_268-211&469_526&576-526&567_214&576_211&473_520&469-0_0_3_27_31_32_29_32-174-48.jpg",
-      "..\/..\/ccpd_text_det\/images\/0388454861111-90_269-138&409_496&518-496&518_138&517_139&410_491&409-0_0_3_24_27_26_26_30-174-148.jpg",
-      "..\/..\/ccpd_text_det\/images\/0198741319444-89_112-208&517_449&600-423&593_208&600_233&517_449&518-0_0_3_24_28_26_26_26-87-268.jpg",
-      "..\/..\/ccpd_text_det\/images\/3027782118055555555-91_92-186&493_532&574-529&574_199&565_186&497_532&493-0_0_3_27_26_30_33_32-73-336.jpg",
-      "..\/..\/ccpd_text_det\/images\/034375-90_258-168&449_528&546-528&542_186&546_168&449_525&449-0_0_3_26_30_30_26_33-94-221.jpg",
-      "..\/..\/ccpd_text_det\/images\/0286501736111-89_92-290&486_577&587-576&577_290&587_292&491_577&486-0_0_3_17_25_28_30_33-134-122.jpg",
-      "..\/..\/ccpd_text_det\/images\/02001953125-92_103-212&486_458&569-458&569_224&555_212&486_446&494-0_0_3_24_24_25_24_24-88-24.jpg"
+      "check_dataset\/demo_img\/0274305555556-90_266-204&460_520&548-516&548_209&547_204&464_520&460-0_0_3_25_24_24_24_26-63-89.jpg",
+      "check_dataset\/demo_img\/0126171875-90_267-294&424_498&486-498&486_296&485_294&425_496&424-0_0_3_24_33_32_30_31-157-29.jpg",
+      "check_dataset\/demo_img\/0371516927083-89_254-178&423_517&534-517&534_204&525_178&431_496&423-1_0_3_24_33_31_29_31-117-667.jpg",
+      "check_dataset\/demo_img\/03349609375-90_268-211&469_526&576-526&567_214&576_211&473_520&469-0_0_3_27_31_32_29_32-174-48.jpg",
+      "check_dataset\/demo_img\/0388454861111-90_269-138&409_496&518-496&518_138&517_139&410_491&409-0_0_3_24_27_26_26_30-174-148.jpg",
+      "check_dataset\/demo_img\/0198741319444-89_112-208&517_449&600-423&593_208&600_233&517_449&518-0_0_3_24_28_26_26_26-87-268.jpg",
+      "check_dataset\/demo_img\/3027782118055555555-91_92-186&493_532&574-529&574_199&565_186&497_532&493-0_0_3_27_26_30_33_32-73-336.jpg",
+      "check_dataset\/demo_img\/034375-90_258-168&449_528&546-528&542_186&546_168&449_525&449-0_0_3_26_30_30_26_33-94-221.jpg",
+      "check_dataset\/demo_img\/0286501736111-89_92-290&486_577&587-576&577_290&587_292&491_577&486-0_0_3_17_25_28_30_33-134-122.jpg",
+      "check_dataset\/demo_img\/02001953125-92_103-212&486_458&569-458&569_224&555_212&486_446&494-0_0_3_24_24_25_24_24-88-24.jpg"
     ],
     "val_samples": 1001,
     "val_sample_paths": [
-      "..\/..\/ccpd_text_det\/images\/3056141493055555554-88_93-205&455_603&597-603&575_207&597_205&468_595&455-0_0_3_24_32_27_31_33-90-213.jpg",
-      "..\/..\/ccpd_text_det\/images\/0680295138889-88_94-120&474_581&623-577&605_126&623_120&483_581&474-0_0_5_24_31_24_24_24-116-518.jpg",
-      "..\/..\/ccpd_text_det\/images\/0482421875-87_265-154&388_496&530-490&495_154&530_156&411_496&388-0_0_5_25_33_33_33_33-84-104.jpg",
-      "..\/..\/ccpd_text_det\/images\/0347504340278-105_106-235&443_474&589-474&589_240&518_235&443_473&503-0_0_3_25_30_33_27_30-162-4.jpg",
-      "..\/..\/ccpd_text_det\/images\/0205338541667-93_262-182&428_410&519-410&519_187&499_182&428_402&442-0_0_3_24_26_29_32_24-83-63.jpg",
-      "..\/..\/ccpd_text_det\/images\/0380913628472-97_250-234&403_529&534-529&534_250&480_234&403_528&446-0_0_3_25_25_24_25_25-185-85.jpg",
-      "..\/..\/ccpd_text_det\/images\/020598958333333334-93_267-256&471_482&563-478&563_256&546_262&471_482&484-0_0_3_26_24_25_32_24-102-115.jpg",
-      "..\/..\/ccpd_text_det\/images\/3030323350694444445-86_131-170&495_484&593-434&569_170&593_226&511_484&495-11_0_5_30_30_31_33_24-118-59.jpg",
-      "..\/..\/ccpd_text_det\/images\/3016158854166666667-86_97-243&471_462&546-462&527_245&546_243&479_453&471-0_0_3_24_30_27_24_29-98-40.jpg",
-      "..\/..\/ccpd_text_det\/images\/0340831163194-89_264-177&412_488&523-477&506_177&523_185&420_488&412-0_0_3_24_30_29_31_31-109-46.jpg"
+      "check_dataset\/demo_img\/3056141493055555554-88_93-205&455_603&597-603&575_207&597_205&468_595&455-0_0_3_24_32_27_31_33-90-213.jpg",
+      "check_dataset\/demo_img\/0680295138889-88_94-120&474_581&623-577&605_126&623_120&483_581&474-0_0_5_24_31_24_24_24-116-518.jpg",
+      "check_dataset\/demo_img\/0482421875-87_265-154&388_496&530-490&495_154&530_156&411_496&388-0_0_5_25_33_33_33_33-84-104.jpg",
+      "check_dataset\/demo_img\/0347504340278-105_106-235&443_474&589-474&589_240&518_235&443_473&503-0_0_3_25_30_33_27_30-162-4.jpg",
+      "check_dataset\/demo_img\/0205338541667-93_262-182&428_410&519-410&519_187&499_182&428_402&442-0_0_3_24_26_29_32_24-83-63.jpg",
+      "check_dataset\/demo_img\/0380913628472-97_250-234&403_529&534-529&534_250&480_234&403_528&446-0_0_3_25_25_24_25_25-185-85.jpg",
+      "check_dataset\/demo_img\/020598958333333334-93_267-256&471_482&563-478&563_256&546_262&471_482&484-0_0_3_26_24_25_32_24-102-115.jpg",
+      "check_dataset\/demo_img\/3030323350694444445-86_131-170&495_484&593-434&569_170&593_226&511_484&495-11_0_5_30_30_31_33_24-118-59.jpg",
+      "check_dataset\/demo_img\/3016158854166666667-86_97-243&471_462&546-462&527_245&546_243&479_453&471-0_0_3_24_30_27_24_29-98-40.jpg",
+      "check_dataset\/demo_img\/0340831163194-89_264-177&412_488&523-477&506_177&523_185&420_488&412-0_0_3_24_30_29_31_31-109-46.jpg"
     ]
   },
   "analysis": {
     "histogram": "check_dataset\/histogram.png"
   },
-  "dataset_path": "\/mnt\/liujiaxuan01\/new\/new2\/ccpd_text_det",
+  "dataset_path": "ccpd_text_det",
   "show_type": "image",
   "dataset_type": "TextDetDataset"
 }
@@ -171,7 +186,7 @@ During data splitting, the original annotation files will be renamed to `xxx.bak
 Before training, ensure you have verified the dataset. To complete PaddleX model training, simply use the following command:
 
 ```bash
-python main.py -c paddlex/configs/text_detection/PP-OCRv4_server_det.yaml \
+python main.py -c paddlex/configs/modules/text_detection/PP-OCRv4_server_det.yaml \
     -o Global.mode=train \
     -o Global.dataset_dir=./dataset/ccpd_text_det
 ```
@@ -208,7 +223,7 @@ After completing model training, all outputs are saved in the specified output d
 After completing model training, you can evaluate the specified model weights file on the validation set to verify the model's accuracy. Using PaddleX for model evaluation requires only one command:
 
 ```bash
-python main.py -c paddlex/configs/text_detection/PP-OCRv4_server_det.yaml \
+python main.py -c paddlex/configs/modules/text_detection/PP-OCRv4_server_det.yaml \
     -o Global.mode=evaluate \
     -o Global.dataset_dir=./dataset/ccpd_text_det
 ```
@@ -293,13 +308,41 @@ Next, based on a learning rate of 0.001, we can increase the number of training
 
 ## 6. Production Line Testing
 
-Replace the models in the production line with the fine-tuned models for testing, for example:
+Replace the model in the production line with the fine-tuned model for testing. You can obtain the OCR production line configuration file and load it for prediction. Execute the following command to save the configuration file to `./my_path`:
 
 ```bash
-paddlex --pipeline OCR \
-        --model PP-OCRv4_server_det PP-OCRv4_server_rec \
-        --model_dir output/best_accuracy/inference None \
-        --input https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/OCR/case1.jpg
+paddlex --get_pipeline_config OCR --save_path ./my_path
+```
+
+Modify the `SubModules.TextDetection.model_dir` in the configuration file to your model path.
+
+```yaml
+SubModules:
+  TextDetection:
+    module_name: text_detection
+    model_name: PP-OCRv4_mobile_det
+    model_dir: output/best_accuracy/inference # Replace with the path to the fine-tuned text detection model weights
+    ...
+```
+
+Subsequently, based on the Python script method, you can load the modified production configuration file:
+
+```python
+from paddlex import create_pipeline
+
+pipeline = create_pipeline(pipeline="my_path/OCR.yaml")
+
+output = pipeline.predict(
+    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/OCR/case1.jpg",
+    use_doc_orientation_classify=False,
+    use_doc_unwarping=False,
+    use_textline_orientation=False,
+)
+for res in output:
+    res.print()
+    res.save_to_img(save_path="./output/")
+    res.save_to_json(save_path="./output/")
+
 ```
 
 This will generate prediction results under `./output`, where the prediction result for `case1.jpg` is shown below:
@@ -311,19 +354,26 @@ This will generate prediction results under `./output`, where the prediction res
 
 ## 7. Development Integration/Deployment
 If the general OCR pipeline meets your requirements for inference speed and accuracy in the production line, you can proceed directly with development integration/deployment.
-1. Directly apply the trained model in your Python project by referring to the following sample code, and modify the `Pipeline.model` in the `paddlex/pipelines/OCR.yaml` configuration file to your own model path:
+1. Directly apply the trained model in your Python project. You can refer to the following example:
+
 ```python
 from paddlex import create_pipeline
-pipeline = create_pipeline(pipeline="paddlex/pipelines/OCR.yaml")
-output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/OCR/case1.jpg")
-for res in output:
-    res.print() # Print the structured output of the prediction
-    res.save_to_img("./output/") # Save the visualized image of the result
-    res.save_to_json("./output/") # Save the structured output of the prediction
+pipeline = create_pipeline(pipeline="paddlex/configs/pipelines/OCR.yaml")
+
+output = pipeline.predict(
+    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/OCR/case1.jpg",
+    use_doc_orientation_classify=False,
+    use_doc_unwarping=False,
+    use_textline_orientation=False,
+)
+for res in output:
+    res.print()
+    res.save_to_img(save_path="./output/")
+    res.save_to_json(save_path="./output/")
+
 ```
 For more parameters, please refer to the [General OCR Pipeline Usage Tutorial](../pipeline_usage/tutorials/ocr_pipelines/OCR.en.md).
 
-2. Additionally, PaddleX offers three other deployment methods, detailed as follows:
+1. Additionally, PaddleX offers three other deployment methods, detailed as follows:
 
 * high-performance inference: In actual production environments, many applications have stringent standards for deployment strategy performance metrics (especially response speed) to ensure efficient system operation and smooth user experience. To this end, PaddleX provides high-performance inference plugins aimed at deeply optimizing model inference and pre/post-processing for significant end-to-end process acceleration. For detailed high-performance inference procedures, please refer to the [PaddleX High-Performance Inference Guide](../pipeline_deploy/high_performance_inference.en.md).
 * Service-Oriented Deployment: Service-oriented deployment is a common deployment form in actual production environments. By encapsulating inference functions as services, clients can access these services through network requests to obtain inference results. PaddleX supports users in achieving cost-effective service-oriented deployment of production lines. For detailed service-oriented deployment procedures, please refer to the [PaddleX Service-Oriented Deployment Guide](../pipeline_deploy/serving.en.md).

+ 32 - 18
docs/practical_tutorials/ts_anomaly_detection.en.md

@@ -13,6 +13,7 @@ First, choose the corresponding PaddleX pipeline based on your task scenario. Th
 PaddleX offers two ways to experience its capabilities: locally on your machine or on the <b>Baidu AIStudio Community</b>.
 
 * Local Experience:
+
 ```python
 from paddlex import create_model
 model = create_model("PatchTST_ad")
@@ -25,7 +26,7 @@ for res in output:
 Note: Due to the tight correlation between time series data and scenarios, the online experience of official models for time series tasks is tailored to a specific scenario and is not a general solution. Therefore, the experience mode does not support using arbitrary files to evaluate the official model's performance. However, after training a model with your own scenario data, you can select your trained model and use data from the corresponding scenario for online experience.
 
 ## 3. Choose a Model
-PaddleX provides five end-to-end time series anomaly detection models. For details, refer to the [Model List](../support_list/models_list.en.md). The benchmarks of these models are as follows:
+PaddleX provides 4 end-to-end time series anomaly detection models. For details, refer to the [Model List](../support_list/models_list.en.md). The benchmarks of these models are as follows:
 
 <table>
 <thead>
@@ -71,14 +72,6 @@ PaddleX provides five end-to-end time series anomaly detection models. For detai
 <td>164K</td>
 <td>A high-precision time series anomaly detection model that balances local patterns and global dependencies</td>
 </tr>
-<tr>
-<td>TimesNet_ad</td>
-<td>0.9837</td>
-<td>0.9480</td>
-<td>0.9656</td>
-<td>732K</td>
-<td>A highly adaptive and high-precision time series anomaly detection model through multi-period analysis</td>
-</tr>
 </tbody>
 </table>
 > <b>Note: The above accuracy metrics are measured on the [PSM](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ts_anomaly_examples.tar) dataset with a time series length of 100.</b>
@@ -108,7 +101,7 @@ tar -xf ./dataset/msl.tar -C ./dataset/
 Data Validation can be completed with just one command:
 
 ```
-python main.py -c paddlex/configs/ts_anomaly_detection/PatchTST_ad.yaml \
+python main.py -c paddlex/configs/modules/ts_anomaly_detection/PatchTST_ad.yaml \
     -o Global.mode=check_dataset \
     -o Global.dataset_dir=./dataset/msl
 ```
@@ -158,7 +151,7 @@ If you need to convert the dataset format or re-split the dataset, refer to Sect
 Before training, ensure that you have verified the dataset. To complete PaddleX model training, simply use the following command:
 
 ```bash
-python main.py -c paddlex/configs/ts_anomaly_detection/PatchTST_ad.yaml \
+python main.py -c paddlex/configs/modules/ts_anomaly_detection/PatchTST_ad.yaml \
 -o Global.mode=train \
 -o Global.dataset_dir=./dataset/msl \
 -o Train.epochs_iters=5 \
@@ -208,7 +201,7 @@ For more introductions to hyperparameters, please refer to [PaddleX Time Series
 After completing model training, you can evaluate the specified model weights file on the validation set to verify the model's accuracy. Using PaddleX for model evaluation requires just one command:
 
 ```bash
-python main.py -c paddlex/configs/ts_anomaly_detection/PatchTST_ad.yaml \
+python main.py -c paddlex/configs/modules/ts_anomaly_detection/PatchTST_ad.yaml \
     -o Global.mode=evaluate \
     -o Global.dataset_dir=./dataset/msl
 ```
@@ -315,7 +308,7 @@ Increasing Training Epochs Results:
 Replace the model in the production line with the fine-tuned model for testing, using the [test file](https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/timeseries_anomaly_detection/test.csv) for prediction:
 
 ```
-python main.py -c paddlex/configs/ts_anomaly_detection/PatchTST_ad.yaml \
+python main.py -c paddlex/configs/modules/ts_anomaly_detection/PatchTST_ad.yaml \
     -o Global.mode=predict \
     -o Predict.model_dir="./output/inference" \
     -o Predict.input="https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/timeseries_anomaly_detection/test.csv"
@@ -330,18 +323,39 @@ Other related parameters can be set by modifying the `Global` and `Evaluate` fie
 ## 7. Integration/Deployment
 If the general-purpose time series anomaly detection pipeline meets your requirements for inference speed and accuracy, you can proceed directly with development integration/deployment.
 
-1. If you need to apply the general-purpose time series anomaly detection pipeline directly in your Python project, you can refer to the following sample code:
+1. If you need to use the fine-tuned model weights, you can obtain the ts_anomaly_detection production line configuration file and load it for prediction. Execute the following command to save the configuration file to `./my_path`:
+
+```
+paddlex --get_pipeline_config ts_anomaly_detection --save_path ./my_path
+```
+
+Fill in the local path of the fine-tuned model weights into the `model_dir` field in the production line configuration file. If you need to directly apply the general time series anomaly detection pipeline in your Python project, you can refer to the following example:
+
+```yaml
+pipeline_name: ts_anomaly_detection
+
+SubModules:
+  TSAnomalyDetection:
+    module_name: ts_anomaly_detection
+    model_name: PatchTST_ad
+    model_dir: ./output/inference  # Replace with the local path to your trained model weights
+    batch_size: 1
 ```
+
+Then, execute the following code to complete the prediction:
+
+```python
 from paddlex import create_pipeline
-pipeline = create_pipeline(pipeline="ts_anomaly_detection")
+pipeline = create_pipeline(pipeline="my_path/ts_anomaly_detection.yaml")
 output = pipeline.predict("pre_ts.csv")
 for res in output:
-    res.print()
-    res.save_to_csv("./output/")
+    res.print() # Print the structured prediction output
+    res.save_to_csv("./output/") # Save the results in CSV format
 ```
+
 For more parameters, please refer to the [Time Series Anomaly Detection Pipeline Usage Tutorial](../pipeline_usage/tutorials/time_series_pipelines/time_series_anomaly_detection.en.md)
 
-2. Additionally, PaddleX's time series anomaly detection pipeline also offers a service-oriented deployment method, detailed as follows:
+1. Additionally, PaddleX's time series anomaly detection pipeline also offers a service-oriented deployment method, detailed as follows:
 
 Service-Oriented Deployment: This is a common deployment form in actual production environments. By encapsulating the inference functionality as services, clients can access these services through network requests to obtain inference results. PaddleX supports users in achieving service-oriented deployment of pipelines at low cost. For detailed instructions on service-oriented deployment, please refer to the [PaddleX Service-Oriented Deployment Guide](../pipeline_deploy/serving.en.md).
 You can choose the appropriate method to deploy your model pipeline based on your needs, and proceed with subsequent AI application integration.

+ 26 - 6
docs/practical_tutorials/ts_classification.en.md

@@ -73,7 +73,7 @@ Missing Value Handling: To guarantee the quality and integrity of the data, miss
 Data Validation can be completed with just one command:
 
 ```
-python main.py -c paddlex/configs/ts_classification/TimesNet_cls.yaml \
+python main.py -c paddlex/configs/modules/ts_classification/TimesNet_cls.yaml \
     -o Global.mode=check_dataset \
     -o Global.dataset_dir=./dataset/ts_classify_examples
 ```
@@ -121,7 +121,7 @@ If you need to convert the dataset format or re-split the dataset, please refer
 Before training, ensure that you have validated the dataset. To complete PaddleX model training, simply use the following command:
 
 ```bash
-python main.py -c paddlex/configs/ts_classification/TimesNet_cls.yaml \
+python main.py -c paddlex/configs/modules/ts_classification/TimesNet_cls.yaml \
 -o Global.mode=train \
 -o Global.dataset_dir=./dataset/ts_classify_examples \
 -o Train.epochs_iters=5 \
@@ -175,7 +175,7 @@ For more hyperparameter introductions, please refer to [PaddleX Time Series Task
 After completing model training, you can evaluate the specified model weights file on the validation set to verify the model's accuracy. Using PaddleX for model evaluation requires just one command:
 
 ```
-    python main.py -c paddlex/configs/ts_classification/TimesNet_cls.yaml \
+    python main.py -c paddlex/configs/modules/ts_classification/TimesNet_cls.yaml \
     -o Global.mode=evaluate \
     -o Global.dataset_dir=./dataset/ts_classify_examples \
     -o Evaluate.weight_path=./output/best_model/model.pdparams
@@ -277,7 +277,7 @@ Results of Increasing Training Epochs:
 Set the model directory to the trained model for testing, using the [test file](https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/timeseries_classification/test.csv) to perform predictions:
 
 ```bash
-python main.py -c paddlex/configs/ts_classification/TimesNet_cls.yaml \
+python main.py -c paddlex/configs/modules/ts_classification/TimesNet_cls.yaml \
     -o Global.mode=predict \
     -o Predict.model_dir="./output/inference" \
     -o Predict.input="https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/timeseries_classification/test.csv"
@@ -294,11 +294,31 @@ Other related parameters can be set by modifying the fields under `Global` and `
 ## 7. Development Integration/Deployment
 If the general time series classification pipeline meets your requirements for inference speed and accuracy, you can directly proceed with development integration/deployment.
 
-1. If you need to directly apply the general time series classification pipeline in your Python project, you can refer to the following sample code:
+1. If you need to use the fine-tuned model weights, you can obtain the ts_classification production line configuration file and load it for prediction. Execute the following command to save the configuration file to `./my_path`:
 
 ```
+paddlex --get_pipeline_config ts_classification --save_path ./my_path
+```
+
+Fill in the local path of the fine-tuned model weights into the `model_dir` in the production configuration file. If you need to directly apply the general time-series classification pipeline in your Python project, you can refer to the following example:
+
+```yaml
+pipeline_name: ts_classification
+
+SubModules:
+  TSClassification:
+    module_name: ts_classification
+    model_name: TimesNet_cls
+    model_dir: ./output/inference  # Replace with the local path to your trained model weights
+    batch_size: 1
+```
+
+Subsequently, in your Python code, you can use the pipeline as follows:
+
+
+```python
 from paddlex import create_pipeline
-pipeline = create_pipeline(pipeline="ts_classification")
+pipeline = create_pipeline(pipeline="my_path/ts_classification.yaml")
 output = pipeline.predict("pre_ts.csv")
 for res in output:
     res.print() # 打印预测的结构化输出

+ 27 - 7
docs/practical_tutorials/ts_forecast.en.md

@@ -105,7 +105,7 @@ tar -xf ./dataset/electricity.tar -C ./dataset/
 Data Validation can be completed with just one command:
 
 ```
-python main.py -c paddlex/configs/ts_forecast/DLinear.yaml \
+python main.py -c paddlex/configs/modules/ts_forecast/DLinear.yaml \
     -o Global.mode=check_dataset \
     -o Global.dataset_dir=./dataset/electricity
 ```
@@ -240,7 +240,7 @@ If you need to convert the dataset format or re-split the dataset, you can modif
 Before training, ensure that you have validated the dataset. To complete PaddleX model training, simply use the following command:
 
 ```bash
-python main.py -c paddlex/configs/ts_forecast/DLinear.yaml \
+python main.py -c paddlex/configs/modules/ts_forecast/DLinear.yaml \
 -o Global.mode=train \
 -o Global.dataset_dir=./dataset/electricity \
 -o Train.epochs_iters=5 \
@@ -300,7 +300,7 @@ For more hyperparameter introductions, refer to [PaddleX Time Series Task Model
 After completing model training, you can evaluate the specified model weights file on the validation set to verify the model's accuracy. Using PaddleX for model evaluation requires just one command:
 
 ```
-    python main.py -c paddlex/configs/ts_forecast/DLinear.yaml \
+    python main.py -c paddlex/configs/modules/ts_forecast/DLinear.yaml \
     -o Global.mode=evaluate \
     -o Global.dataset_dir=./dataset/electricity \
 ```
@@ -459,7 +459,7 @@ After increasing the training epochs, Experiment 4 achieves the highest accuracy
 Replace the model in the production line with the fine-tuned model and test using [this power test data](https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/timeseries_forecast/test.csv) for prediction:
 
 ```bash
-python main.py -c paddlex/configs/ts_forecast/DLinear.yaml \
+python main.py -c paddlex/configs/modules/ts_forecast/DLinear.yaml \
     -o Global.mode=predict \
     -o Predict.model_dir="./output/inference" \
     -o Predict.input=https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/practical_tutorial/timeseries_forecast/test.csv
@@ -476,10 +476,30 @@ Other related parameters can be set by modifying the `Global` and `Evaluate` fie
 
 If the general-purpose time series forecast pipeline meets your requirements for inference speed and accuracy, you can proceed directly with development integration/deployment.
 
-1. If you need to apply the general-purpose time series forecast pipeline directly in your Python project, you can refer to the following sample code:
+1. If you need to use the fine-tuned model weights, you can obtain the ts_forecast pipeline configuration file and load it for prediction. Execute the following command to save the configuration file to `./my_path`:
+
+```
+paddlex --get_pipeline_config ts_forecast --save_path ./my_path
 ```
+
+Fill in the local path of the fine-tuned model weights under `model_dir` in the pipeline configuration file. If you need to directly apply the general time series forecast pipeline in your Python project, you can refer to the following example:
+
+```yaml
+pipeline_name: ts_forecast
+
+SubModules:
+  TSForecast:
+    module_name: ts_forecast
+    model_name: DLinear
+    model_dir: ./output/inference # Replace this with the local path to your fine-tuned model weights
+    batch_size: 1
+```
+
+Subsequently, in your Python code, you can use the pipeline as follows:
+
+```python
 from paddlex import create_pipeline
-pipeline = create_pipeline(pipeline="ts_forecast")
+pipeline = create_pipeline(pipeline="my_path/ts_forecast.yaml")
 output = pipeline.predict("pre_ts.csv")
 for res in output:
     res.print()
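    # Illustrative only (not in the original tutorial): assuming the result object
    # also exposes save_to_csv as in the pipeline usage tutorial, persist the forecast:
    res.save_to_csv(save_path="./output/")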
@@ -487,7 +507,7 @@ for res in output:
 ```
 For more parameters, please refer to the [Time Series Forecast Pipeline Usage Tutorial](../pipeline_usage/tutorials/time_series_pipelines/time_series_anomaly_detection.en.md)
 
-2. Additionally, PaddleX's time series forecast pipeline also offers a service-oriented deployment method, detailed as follows:
+1. Additionally, PaddleX's time series forecast pipeline also offers a service-oriented deployment method, detailed as follows:
 
 Service-Oriented Deployment: This is a common deployment form in actual production environments. By encapsulating the inference functionality as services, clients can access these services through network requests to obtain inference results. PaddleX supports users in achieving service-oriented deployment of pipelines at low cost. For detailed instructions on service-oriented deployment, please refer to the [PaddleX Service-Oriented Deployment Guide](../pipeline_deploy/serving.en.md).
 You can choose the appropriate method to deploy your model pipeline based on your needs, and proceed with subsequent AI application integration.
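
For readers who want a feel for how a client would call such a service, the sketch below is purely illustrative: the server URL, endpoint path, and request/response fields are assumptions rather than the documented protocol, which is defined in the PaddleX Service-Oriented Deployment Guide linked above.

```python
# Hypothetical client for a service-deployed time series forecast pipeline.
# The URL, endpoint path, and payload/response fields below are assumptions
# for illustration only; consult the serving guide for the real protocol.
import base64
import requests

with open("pre_ts.csv", "rb") as f:
    csv_b64 = base64.b64encode(f.read()).decode("ascii")

resp = requests.post(
    "http://localhost:8080/time-series-forecasting",  # assumed endpoint
    json={"csv": csv_b64},                            # assumed payload format
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # forecast returned by the service (assumed response shape)
```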

Changes are not shown because the file is too large.
+ 205 - 205
docs/support_list/models_list.en.md


Changes are not shown because the file is too large.
+ 198 - 198
docs/support_list/models_list.md


Too many files were changed in this changeset, so some files are not shown.