
fix docs (#3281)

* fix docs

* remove input dict option for module docs
cuicheng01 9 months ago
parent
commit
071389ff54
42 changed files with 147 additions and 410 deletions
  1. +1 -1  docs/CHANGLOG.en.md
  2. +1 -1  docs/CHANGLOG.md
  3. +3 -3  docs/index.en.md
  4. +3 -3  docs/index.md
  5. +1 -2  docs/module_usage/tutorials/cv_modules/anomaly_detection.md
  6. +5 -6  docs/module_usage/tutorials/cv_modules/face_detection.md
  7. +5 -6  docs/module_usage/tutorials/cv_modules/face_feature.md
  8. +1 -2  docs/module_usage/tutorials/cv_modules/human_keypoint_detection.md
  9. +5 -6  docs/module_usage/tutorials/cv_modules/image_feature.md
  10. +1 -2  docs/module_usage/tutorials/cv_modules/image_multilabel_classification.md
  11. +1 -1  docs/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.en.md
  12. +2 -3  docs/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.md
  13. +1 -1  docs/module_usage/tutorials/ocr_modules/formula_recognition.md
  14. +1 -1  docs/module_usage/tutorials/ocr_modules/layout_detection.en.md
  15. +6 -7  docs/module_usage/tutorials/ocr_modules/layout_detection.md
  16. +1 -1  docs/module_usage/tutorials/ocr_modules/seal_text_detection.en.md
  17. +6 -7  docs/module_usage/tutorials/ocr_modules/seal_text_detection.md
  18. +1 -2  docs/module_usage/tutorials/ocr_modules/table_cells_detection.md
  19. +1 -2  docs/module_usage/tutorials/ocr_modules/table_classification.md
  20. +1 -1  docs/module_usage/tutorials/ocr_modules/table_structure_recognition.en.md
  21. +6 -7  docs/module_usage/tutorials/ocr_modules/table_structure_recognition.md
  22. +1 -1  docs/module_usage/tutorials/ocr_modules/text_detection.en.md
  23. +6 -7  docs/module_usage/tutorials/ocr_modules/text_detection.md
  24. +1 -2  docs/module_usage/tutorials/ocr_modules/text_image_unwarping.md
  25. +1 -1  docs/module_usage/tutorials/ocr_modules/text_recognition.en.md
  26. +6 -6  docs/module_usage/tutorials/ocr_modules/text_recognition.md
  27. +1 -1  docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.en.md
  28. +2 -3  docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.md
  29. +1 -1  docs/pipeline_deploy/high_performance_inference.en.md
  30. +1 -1  docs/pipeline_usage/pipeline_develop_guide.en.md
  31. +1 -1  docs/pipeline_usage/pipeline_develop_guide.md
  32. +1 -1  docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md
  33. +25 -12  docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md
  34. +10 -4  docs/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.md
  35. +15 -75  docs/pipeline_usage/tutorials/ocr_pipelines/layout_parsing_v2.md
  36. +10 -222  docs/pipeline_usage/tutorials/ocr_pipelines/seal_recognition.md
  37. +1 -1  docs/practical_tutorials/document_scene_information_extraction(layout_detection)_tutorial.en.md
  38. +1 -1  docs/practical_tutorials/document_scene_information_extraction(layout_detection)_tutorial.md
  39. +1 -1  docs/practical_tutorials/document_scene_information_extraction(seal_recognition)_tutorial.en.md
  40. +1 -1  docs/practical_tutorials/document_scene_information_extraction(seal_recognition)_tutorial.md
  41. +7 -2  paddlex/configs/pipelines/layout_parsing_v2.yaml
  42. +1 -1  paddlex/inference/pipelines/layout_parsing/utils.py

+ 1 - 1
docs/CHANGLOG.en.md

@@ -36,7 +36,7 @@ PaddleX 3.0 Beta2 is fully compatible with the PaddlePaddle 3.0b2 version. <b>Th
 
 
 ### PaddleX v3.0.0beta1 (9.30/2024)
-PaddleX 3.0 Beta1 offers over 200 models accessible through a streamlined Python API for one-click deployment; realizes full-process model development based on unified commands, and opens source the foundational capabilities of the PP-ChatOCRv3 special model pipeline; supports high-performance inference and service-oriented deployment for over 100 models, as well as edge deployment for 7 key vision models; and fully adapts the development process of over 70 models to Huawei Ascend 910B, and over 15 models to XPU and MLU.
+PaddleX 3.0 Beta1 offers over 200 models accessible through a streamlined Python API for one-click deployment; realizes full-process model development based on unified commands, and opens source the foundational capabilities of the PP-ChatOCRv3-doc special model pipeline; supports high-performance inference and service-oriented deployment for over 100 models, as well as edge deployment for 7 key vision models; and fully adapts the development process of over 70 models to Huawei Ascend 910B, and over 15 models to XPU and MLU.
 
 - <b>Rich Models with One-click Deployment</b>: Integrates over 200 PaddlePaddle models across key domains such as document image intelligent analysis, OCR, object detection, and time series prediction into 13 model pipelines, enabling rapid model experience through a streamlined Python API. Additionally, supports over 20 individual functional modules for convenient model combination.
 - <b>Enhanced Efficiency and Lowered Thresholds</b>: Implements full-process model development based on a graphical interface and unified commands, creating 8 special model pipelines that combine large and small models, leverage large model semi-supervised learning, and multi-model fusion, significantly reducing iteration costs.

+ 1 - 1
docs/CHANGLOG.md

@@ -35,7 +35,7 @@ PaddleX 3.0 Beta2 全面适配 PaddlePaddle 3.0b2 版本。**新增通用图像
 
 
 ### PaddleX v3.0.0beta1(9.30/2024)
-PaddleX 3.0 Beta1 提供 200+ 模型通过极简的 Python API 一键调用;实现基于统一命令的模型全流程开发,并开源 PP-ChatOCRv3 特色模型产线基础能力;支持 100+ 模型高性能推理和服务化部署,7 类重点视觉模型端侧部署;70+ 模型开发全流程适配昇腾 910B,15+ 模型开发全流程适配昆仑芯和寒武纪。
+PaddleX 3.0 Beta1 提供 200+ 模型通过极简的 Python API 一键调用;实现基于统一命令的模型全流程开发,并开源 PP-ChatOCRv3-doc 特色模型产线基础能力;支持 100+ 模型高性能推理和服务化部署,7 类重点视觉模型端侧部署;70+ 模型开发全流程适配昇腾 910B,15+ 模型开发全流程适配昆仑芯和寒武纪。
 
 - <b>模型丰富一键调用:</b> 将覆盖文档图像智能分析、OCR、目标检测、时序预测等多个关键领域的 200+ 飞桨模型整合为 13 条模型产线,通过极简的 Python API 一键调用,快速体验模型效果。同时支持 20+ 单功能模块,方便开发者进行模型组合使用。
 - <b>提高效率降低门槛:</b> 实现基于图形界面和统一命令的模型全流程开发,打造大小模型结合、大模型半监督学习和多模型融合的8条特色模型产线,大幅度降低迭代模型的成本。

+ 3 - 3
docs/index.en.md

@@ -84,7 +84,7 @@ PaddleX 3.0 is a low-code development tool for AI models built on the PaddlePadd
             <td><img src="https://github.com/PaddlePaddle/PaddleX/assets/142379845/1e798e05-dee7-4b41-9cc4-6708b6014efa"></td>
         </tr>
         <tr>
-            <th><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.html"><strong>PP-ChatOCRv3-doc</strong></a></th>
+            <th><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.html"><strong>PP-ChatOCRv3-doc</strong></a></th>
             <th><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/time_series_pipelines/time_series_forecasting.html"><strong>Time Series Forecasting</strong></a></th>
             <th><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/time_series_pipelines/time_series_anomaly_detection.html"><strong>Time Series Anomaly Detection</strong></a></th>
             <th><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/time_series_pipelines/time_series_classification.html"><strong>Time Series Classification</strong></a></th>
@@ -1788,9 +1788,9 @@ The following steps were executed:
 
     ---
 
-    Document scene information extraction v3 (PP-ChatOCRv3) is a document and image intelligent analysis solution with PaddlePaddle features, combining LLM and OCR technologies to solve complex document information extraction challenges such as layout analysis, rare character recognition, multi-page PDF, table, and seal recognition in one stop.
+    Document scene information extraction v3 (PP-ChatOCRv3-doc) is a document and image intelligent analysis solution with PaddlePaddle features, combining LLM and OCR technologies to solve complex document information extraction challenges such as layout analysis, rare character recognition, multi-page PDF, table, and seal recognition in one stop.
 
-    [:octicons-arrow-right-24: Tutorial](pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md)
+    [:octicons-arrow-right-24: Tutorial](pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md)
 
 - **OCR**
 

+ 3 - 3
docs/index.md

@@ -83,7 +83,7 @@ PaddleX 3.0 是基于飞桨框架构建的低代码开发工具,它集成了
             <td><img src="https://github.com/PaddlePaddle/PaddleX/assets/142379845/1e798e05-dee7-4b41-9cc4-6708b6014efa"></td>
         </tr>
         <tr>
-            <th><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.html"><strong>文本图像智能分析</strong></a></th>
+            <th><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.html"><strong>文本图像智能分析</strong></a></th>
             <th><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/time_series_pipelines/time_series_forecasting.html"><strong>时序预测</strong></a></th>
             <th><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/time_series_pipelines/time_series_anomaly_detection.html"><strong>时序异常检测</strong></a></th>
             <th><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/time_series_pipelines/time_series_classification.html"><strong>时序分类</strong></a></th>
@@ -1787,9 +1787,9 @@ for res in output:
 
     ---
 
-    文档场景信息抽取v3(PP-ChatOCRv3)是飞桨特色的文档和图像智能分析解决方案,结合了 LLM 和 OCR 技术,一站式解决版面分析、生僻字、多页 pdf、表格、印章识别等常见的复杂文档信息抽取难点问题。
+    文档场景信息抽取v3(PP-ChatOCRv3-doc)是飞桨特色的文档和图像智能分析解决方案,结合了 LLM 和 OCR 技术,一站式解决版面分析、生僻字、多页 pdf、表格、印章识别等常见的复杂文档信息抽取难点问题。
 
-    [:octicons-arrow-right-24: 教程](pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md)
+    [:octicons-arrow-right-24: 教程](pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md)
 
 - **通用OCR**
 

+ 1 - 2
docs/module_usage/tutorials/cv_modules/anomaly_detection.md

@@ -117,8 +117,7 @@ for res in output:
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>
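The hunk above (and the many similar ones below) trims the module docs' `input` parameter table: dict inputs such as `{"img": ...}` are no longer documented, leaving array data, a path/URL/directory string, or a flat list of those. As a rough, hypothetical sketch of that contract (not actual PaddleX code; the function name is invented for illustration):

```python
def validate_predict_input(data):
    """Accept the input types the updated docs describe: an
    array-like Python variable (e.g. numpy.ndarray), a str
    (local file path, URL, or directory), or a list of those.
    Dict inputs like {"img": ...} are rejected, mirroring
    their removal from the module docs."""
    if isinstance(data, dict):
        raise TypeError("dict input (e.g. {'img': ...}) is no longer supported")
    if isinstance(data, list):
        # Lists may mix the scalar forms; validate each element.
        return [validate_predict_input(item) for item in data]
    if isinstance(data, str):
        return data  # file path, URL, or directory
    if hasattr(data, "shape"):
        return data  # numpy.ndarray-like image data
    raise TypeError(f"unsupported input type: {type(data).__name__}")
```

This is only a reading aid for the table, not a claim about PaddleX internals.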

+ 5 - 6
docs/module_usage/tutorials/cv_modules/face_detection.md

@@ -164,12 +164,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
-<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-<li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
+  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 5 - 6
docs/module_usage/tutorials/cv_modules/face_feature.md

@@ -154,12 +154,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
-<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-<li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
+  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 1 - 2
docs/module_usage/tutorials/cv_modules/human_keypoint_detection.md

@@ -157,8 +157,7 @@ for res in output:
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 5 - 6
docs/module_usage/tutorials/cv_modules/image_feature.md

@@ -118,12 +118,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
-<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-<li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
+  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 1 - 2
docs/module_usage/tutorials/cv_modules/image_multilabel_classification.md

@@ -152,8 +152,7 @@ for res in output:
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/multilabel_classification_005.png">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.en.md

@@ -435,7 +435,7 @@ The model can be directly integrated into the PaddleX pipeline or into your own
 
 1.<b>Pipeline Integration</b>
 
-The document image classification module can be integrated into PaddleX pipelines such as the [Document Scene Information Extraction Pipeline (PP-ChatOCRv3)](../../..//pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md). Simply replace the model path to update the The document image classification module's model.
+The document image classification module can be integrated into PaddleX pipelines such as the [Document Scene Information Extraction Pipeline (PP-ChatOCRv3-doc)](../../..//pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md). Simply replace the model path to update the The document image classification module's model.
 
 2.<b>Module Integration</b>
 

+ 2 - 3
docs/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.md

@@ -134,8 +134,7 @@ for res in output:
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/img_rot180_demo.jpg">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>
@@ -429,7 +428,7 @@ python main.py -c paddlex/configs/modules/doc_text_orientation/PP-LCNet_x1_0_doc
 
 1.<b>产线集成</b>
 
-文档图像分类模块可以集成的PaddleX产线有[文档场景信息抽取v3产线(PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md),只需要替换模型路径即可完成文档图像分类模块的模型更新。
+文档图像分类模块可以集成的PaddleX产线有[文档场景信息抽取v3产线(PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md),只需要替换模型路径即可完成文档图像分类模块的模型更新。
 
 2.<b>模块集成</b>
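Most hunks in this commit apply the same mechanical rewrite: links to `document_scene_information_extraction.md` gain a `_v3` suffix, and the pipeline name `PP-ChatOCRv3` becomes `PP-ChatOCRv3-doc`. A minimal illustrative helper for that kind of bulk rename might look like the following (hypothetical; the commit's actual tooling is unknown, and these substring rules are an assumption):

```python
# Ordered (old, new) substring rules; the .en.md rule is listed first
# so the plain .md rule cannot partially match an English-doc link.
RENAMES = [
    ("document_scene_information_extraction.en.md",
     "document_scene_information_extraction_v3.en.md"),
    ("document_scene_information_extraction.md",
     "document_scene_information_extraction_v3.md"),
    ("PP-ChatOCRv3)", "PP-ChatOCRv3-doc)"),  # name inside "(...)" mentions
]

def apply_renames(text: str) -> str:
    """Rewrite one document's text with every rename rule in order."""
    for old, new in RENAMES:
        text = text.replace(old, new)
    return text
```

One caveat of plain substring replacement: it must be run once per file, and rules must be ordered so no rule's output re-triggers a later rule, which is why the parenthesis is included in the pipeline-name pattern.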
 

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/formula_recognition.md

@@ -133,7 +133,7 @@ sudo apt-get install texlive texlive-latex-base texlive-latex-extra -y
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_formula_rec_001.png">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/layout_detection.en.md

@@ -606,7 +606,7 @@ Other related parameters can be set by modifying the fields under `Global` and `
 The model can be directly integrated into PaddleX pipelines or into your own projects.
 
 1. <b>Pipeline Integration</b>
-The structure analysis module can be integrated into PaddleX pipelines such as the [General Table Recognition Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md) and the [Document Scene Information Extraction Pipeline v3 (PP-ChatOCRv3)](../../..//pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md). Simply replace the model path to update the layout area localization module. In pipeline integration, you can use high-performance inference and service-oriented deployment to deploy your model.
+The structure analysis module can be integrated into PaddleX pipelines such as the [General Table Recognition Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md) and the [Document Scene Information Extraction Pipeline v3 (PP-ChatOCRv3-doc)](../../..//pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md). Simply replace the model path to update the layout area localization module. In pipeline integration, you can use high-performance inference and service-oriented deployment to deploy your model.
 
 1. <b>Module Integration</b>
 The weights you produce can be directly integrated into the layout area localization module. You can refer to the Python example code in the [Quick Integration](#quick) section, simply replacing the model with the path to your trained model.

+ 6 - 7
docs/module_usage/tutorials/ocr_modules/layout_detection.md

@@ -335,12 +335,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
-<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-<li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
+  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>
@@ -663,7 +662,7 @@ python main.py -c paddlex/configs/modules/layout_detection/PicoDet-L_layout_3cls
 模型可以直接集成到PaddleX产线中,也可以直接集成到您自己的项目中。
 
 1. <b>产线集成</b>
-版面区域检测模块可以集成的PaddleX产线有[通用表格识别产线](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.md)、[文档场景信息抽取v3产线(PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md),只需要替换模型路径即可完成版面区域检测模块的模型更新。在产线集成中,你可以使用高性能部署和服务化部署来部署你得到的模型。
+版面区域检测模块可以集成的PaddleX产线有[通用表格识别产线](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.md)、[文档场景信息抽取v3产线(PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md),只需要替换模型路径即可完成版面区域检测模块的模型更新。在产线集成中,你可以使用高性能部署和服务化部署来部署你得到的模型。
 
 1. <b>模块集成</b>
 您产出的权重可以直接集成到版面区域检测模块中,可以参考[快速集成](#三快速集成)的 Python 示例代码,只需要将模型替换为你训练的到的模型路径即可。

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/seal_text_detection.en.md

@@ -582,7 +582,7 @@ The model can be directly integrated into the PaddleX pipeline or into your own
 
 1. <b>Pipeline Integration</b>
 
-The document Seal Text Detection module can be integrated into PaddleX pipelines such as the [General OCR Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.en.md) and [Document Scene Information Extraction Pipeline v3 (PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md). Simply replace the model path to update the text detection module of the relevant pipeline.
+The document Seal Text Detection module can be integrated into PaddleX pipelines such as the [General OCR Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.en.md) and [Document Scene Information Extraction Pipeline v3 (PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md). Simply replace the model path to update the text detection module of the relevant pipeline.
 
 2. <b>Module Integration</b>
 

+ 6 - 7
docs/module_usage/tutorials/ocr_modules/seal_text_detection.md

@@ -197,12 +197,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
-<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-<li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
+  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>
@@ -563,7 +562,7 @@ python main.py -c paddlex/configs/modules/seal_text_detection/PP-OCRv4_server_se
 
 1.<b>产线集成</b>
 
-印章文本检测模块可以集成的PaddleX产线有[文档场景信息抽取v3产线(PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md),只需要替换模型路径即可完成印章文本检测模块的模型更新。在产线集成中,你可以使用高性能部署和服务化部署来部署你得到的模型。
+印章文本检测模块可以集成的PaddleX产线有[文档场景信息抽取v3产线(PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md),只需要替换模型路径即可完成印章文本检测模块的模型更新。在产线集成中,你可以使用高性能部署和服务化部署来部署你得到的模型。
 
 2.<b>模块集成</b>
 

+ 1 - 2
docs/module_usage/tutorials/ocr_modules/table_cells_detection.md

@@ -149,8 +149,7 @@ for res in output:
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 1 - 2
docs/module_usage/tutorials/ocr_modules/table_classification.md

@@ -107,8 +107,7 @@ for res in output:
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如表格分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/table_structure_recognition.en.md

@@ -394,7 +394,7 @@ The model can be directly integrated into the PaddleX pipeline or directly into
 
 1.<b>Pipeline Integration</b>
 
-The table structure recognition module can be integrated into PaddleX pipelines such as the [General Table Recognition Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md) and the [Document Scene Information Extraction Pipeline v3 (PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md). Simply replace the model path to update the table structure recognition module in the relevant pipelines. For pipeline integration, you can deploy your obtained model using high-performance inference and service-oriented deployment.
+The table structure recognition module can be integrated into PaddleX pipelines such as the [General Table Recognition Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md) and the [Document Scene Information Extraction Pipeline v3 (PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md). Simply replace the model path to update the table structure recognition module in the relevant pipelines. For pipeline integration, you can deploy your obtained model using high-performance inference and service-oriented deployment.
 
 2.<b>Module Integration</b>
 

+ 6 - 7
docs/module_usage/tutorials/ocr_modules/table_structure_recognition.md

@@ -127,12 +127,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg">示例</a></li>
-<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-<li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg">示例</a></li>
+  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>
@@ -404,7 +403,7 @@ python main.py -c paddlex/configs/modules/table_structure_recognition/SLANet.yam
 
 1.<b>产线集成</b>
 
-表格结构识别模块可以集成的PaddleX产线有[通用表格识别产线](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.md)、[文档场景信息抽取v3产线(PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md),只需要替换模型路径即可完成相关产线的表格结构识别模块的模型更新,具体对应关系详见产线文档。在产线集成中,你可以使用高性能部署和服务化部署来部署你得到的模型。
+表格结构识别模块可以集成的PaddleX产线有[通用表格识别产线](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.md)、[文档场景信息抽取v3产线(PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md),只需要替换模型路径即可完成相关产线的表格结构识别模块的模型更新,具体对应关系详见产线文档。在产线集成中,你可以使用高性能部署和服务化部署来部署你得到的模型。
 
 
 2.<b>模块集成</b>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/text_detection.en.md

@@ -517,7 +517,7 @@ Models can be directly integrated into PaddleX pipelines or into your own projec
 
 1.<b>Pipeline Integration</b>
 
-The text detection module can be integrated into PaddleX pipelines such as the [General OCR Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.en.md), [Table Recognition Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md), and [PP-ChatOCRv3-doc](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md). Simply replace the model path to update the text detection module of the relevant pipeline.
+The text detection module can be integrated into PaddleX pipelines such as the [General OCR Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.en.md), [Table Recognition Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md), and [PP-ChatOCRv3-doc](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md). Simply replace the model path to update the text detection module of the relevant pipeline.
 
 2.<b>Module Integration</b>
 

+ 6 - 7
docs/module_usage/tutorials/ocr_modules/text_detection.md

@@ -193,12 +193,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
-<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-<li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
+  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>
@@ -530,7 +529,7 @@ python main.py -c paddlex/configs/modules/text_detection/PP-OCRv4_mobile_det.yam
 
 1.<b>产线集成</b>
 
-文本检测模块可以集成的 PaddleX 产线有[通用 OCR 产线](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.md)、[表格识别产线](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.md)、[文档场景信息抽取v3产线(PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md),只需要替换模型路径即可完成相关产线的文本检测模块的模型更新。
+文本检测模块可以集成的 PaddleX 产线有[通用 OCR 产线](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.md)、[表格识别产线](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.md)、[文档场景信息抽取v3产线(PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md),只需要替换模型路径即可完成相关产线的文本检测模块的模型更新。
 
 2.<b>模块集成</b>
 

+ 1 - 2
docs/module_usage/tutorials/ocr_modules/text_image_unwarping.md

@@ -113,8 +113,7 @@ for res in output:
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/text_recognition.en.md

@@ -623,7 +623,7 @@ Models can be directly integrated into the PaddleX pipelines or into your own pr
 
 1.<b>Pipeline Integration</b>
 
-The text recognition module can be integrated into PaddleX pipelines such as the [General OCR Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.en.md), [General Table Recognition Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md), and [Document Scene Information Extraction Pipeline v3 (PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md). Simply replace the model path to update the text recognition module of the relevant pipeline.
+The text recognition module can be integrated into PaddleX pipelines such as the [General OCR Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.en.md), [General Table Recognition Pipeline](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md), and [Document Scene Information Extraction Pipeline v3 (PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md). Simply replace the model path to update the text recognition module of the relevant pipeline.
 
 2.<b>Module Integration</b>
 

+ 6 - 6
docs/module_usage/tutorials/ocr_modules/text_recognition.md

@@ -359,11 +359,11 @@ for res in output:
 <td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
-<li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
-<li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
-<li><b>URL链接</b>,如图像文件的网络URL:<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
-<li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-<li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>
+  <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
+  <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png">示例</a></li>
+  <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>
@@ -653,7 +653,7 @@ python main.py -c paddlex/configs/modules/text_recognition/PP-OCRv4_mobile_rec.y
 
 1.<b>产线集成</b>
 
-文本识别模块可以集成的PaddleX产线有[通用 OCR 产线](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.md)、[通用表格识别产线](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.md)、[文档场景信息抽取v3产线(PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md),只需要替换模型路径即可完成相关产线的文本识别模块的模型更新。
+文本识别模块可以集成的PaddleX产线有[通用 OCR 产线](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.md)、[通用表格识别产线](../../../pipeline_usage/tutorials/ocr_pipelines/table_recognition.md)、[文档场景信息抽取v3产线(PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md),只需要替换模型路径即可完成相关产线的文本识别模块的模型更新。
 
 2.<b>模块集成</b>
 

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.en.md

@@ -413,7 +413,7 @@ The model can be directly integrated into the PaddleX pipeline or into your own
 
 1. **Pipeline Integration**
 
-The text line orientation classification module can be integrated into the [Document Scene Information Extraction v3 Pipeline (PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md). Simply replace the model path to update the text line orientation classification module.
+The text line orientation classification module can be integrated into the [Document Scene Information Extraction v3 Pipeline (PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md). Simply replace the model path to update the text line orientation classification module.
 
 2. **Module Integration**
 

+ 2 - 3
docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.md

@@ -119,8 +119,7 @@ for res in output:
   <li><b>文件路径</b>,如图像文件的本地路径:<code>/root/data/img.jpg</code></li>
   <li><b>URL链接</b>,如图像文件的网络URL:<a href = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg">示例</a></li>
   <li><b>本地目录</b>,该目录下需包含待预测数据文件,如本地路径:<code>/root/data/</code></li>
-  <li><b>字典</b>,字典的<code>key</code>需与具体任务对应,如图像分类任务对应<code>\"img\"</code>,字典的<code>val</code>支持上述类型数据,例如:<code>{\"img\": \"/root/data1\"}</code></li>
-  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code>,<code>[{\"img\": \"/root/data1\"}, {\"img\": \"/root/data2/img.jpg\"}]</code></li>
+  <li><b>列表</b>,列表元素需为上述类型数据,如<code>[numpy.ndarray, numpy.ndarray]</code>,<code>[\"/root/data/img1.jpg\", \"/root/data/img2.jpg\"]</code>,<code>[\"/root/data1\", \"/root/data2\"]</code></li>
 </ul>
 </td>
 <td>无</td>
@@ -414,7 +413,7 @@ python main.py -c paddlex/configs/modules/textline_orientation/PP-LCNet_x0_25_te
 
 1.<b>产线集成</b>
 
-文本行方向分类模块可以集成的PaddleX产线有[通用OCR产线](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.md)和[文档场景信息抽取v3产线(PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md),只需要替换模型路径即可完成文本行方向分类模块的模型更新。
+文本行方向分类模块可以集成的PaddleX产线有[通用OCR产线](../../../pipeline_usage/tutorials/ocr_pipelines/OCR.md)和[文档场景信息抽取v3产线(PP-ChatOCRv3-doc)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md),只需要替换模型路径即可完成文本行方向分类模块的模型更新。
 
 2.<b>模块集成</b>
 

+ 1 - 1
docs/pipeline_deploy/high_performance_inference.en.md

@@ -194,7 +194,7 @@ PaddleX combines model information and runtime environment information to provid
   </tr>
 
   <tr>
-    <td rowspan="7">PP-ChatOCRv3</td>
+    <td rowspan="7">PP-ChatOCRv3-doc</td>
     <td>Table Recognition</td>
     <td>✅</td>
   </tr>

+ 1 - 1
docs/pipeline_usage/pipeline_develop_guide.en.md

@@ -186,7 +186,7 @@ Choose the appropriate deployment method for your model pipeline based on your n
 <tbody>
 <tr>
 <td>PP-ChatOCR-doc v3</td>
-<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.html">PP-ChatOCR-doc v3 Pipeline Usage Tutorial</a></td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.html">PP-ChatOCR-doc v3 Pipeline Usage Tutorial</a></td>
 </tr>
 <tr>
 <td>Image Classification</td>

+ 1 - 1
docs/pipeline_usage/pipeline_develop_guide.md

@@ -188,7 +188,7 @@ Pipeline:
 <tbody>
 <tr>
 <td>文档场景信息抽取v3</td>
-<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.html">文档场景信息抽取v3产线使用教程</a></td>
+<td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.html">文档场景信息抽取v3产线使用教程</a></td>
 </tr>
 <tr>
 <td>通用图像分类</td>

+ 1 - 1
docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md → docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md

@@ -311,7 +311,7 @@ You can [experience online](https://aistudio.baidu.com/community/app/182491/webU
 If you are satisfied with the pipeline's performance, you can directly integrate and deploy it. If not, you can also use your private data to <b>fine-tune the models in the pipeline online</b>.
 
 ### 2.2 Local Experience
-Before using the PP-ChatOCRv3 pipeline locally, please ensure you have installed the PaddleX wheel package following the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
+Before using the PP-ChatOCRv3-doc pipeline locally, please ensure you have installed the PaddleX wheel package following the [PaddleX Local Installation Guide](../../../installation/installation.en.md).
 
 A few lines of code are all you need to complete the quick inference of the pipeline. Using the [test file](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/contract.pdf), taking the PP-ChatOCRv3-doc pipeline as an example:
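Several hunks in this commit repoint relative links after the file rename to `document_scene_information_extraction_v3.md`; a renamed doc silently breaks every relative link that still targets the old name. A minimal sketch of collecting a page's relative link targets for checking (illustrative only, not part of the PaddleX tooling):

```python
import re

# Capture the target of a markdown link, stopping at ')' or a '#anchor'.
MD_LINK = re.compile(r"\[[^\]]*\]\(([^)#]+)")

def relative_links(markdown):
    """Collect relative link targets from a markdown snippet, skipping
    absolute URLs -- these are the links a file rename breaks."""
    return [t for t in MD_LINK.findall(markdown)
            if not t.startswith(("http://", "https://"))]

doc = ("[pipeline](../document_scene_information_extraction_v3.en.md) "
       "and [site](https://example.com)")
print(relative_links(doc))  # ['../document_scene_information_extraction_v3.en.md']
```

Running such a pass over the docs tree after a rename would flag exactly the stale targets this commit fixes by hand.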
 

+ 25 - 12
docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md → docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md

@@ -5,7 +5,7 @@ comments: true
 # 文档场景信息抽取v3产线使用教程
 
 ## 1. 文档场景信息抽取v3产线介绍
-文档场景信息抽取v3(PP-ChatOCRv3)是飞桨特色的文档和图像智能分析解决方案,结合了 LLM 和 OCR 技术,一站式解决版面分析、生僻字、多页 pdf、表格、印章识别等常见的复杂文档信息抽取难点问题,结合文心大模型将海量数据和知识相融合,准确率高且应用广泛。本产线同时提供了灵活的服务化部署方式,支持在多种硬件上部署。不仅如此,本产线也提供了二次开发的能力,您可以基于本产线在您自己的数据集上训练调优,训练后的模型也可以无缝集成。
+文档场景信息抽取v3(PP-ChatOCRv3-doc)是飞桨特色的文档和图像智能分析解决方案,结合了 LLM 和 OCR 技术,一站式解决版面分析、生僻字、多页 pdf、表格、印章识别等常见的复杂文档信息抽取难点问题,结合文心大模型将海量数据和知识相融合,准确率高且应用广泛。本产线同时提供了灵活的服务化部署方式,支持在多种硬件上部署。不仅如此,本产线也提供了二次开发的能力,您可以基于本产线在您自己的数据集上训练调优,训练后的模型也可以无缝集成。
 
 <img src="https://github.com/user-attachments/assets/90cb740b-7741-4383-bc4c-663f9d042d02"/>
 
@@ -314,7 +314,7 @@ PaddleX 所提供的预训练的模型产线均可以快速体验效果,你可
 
 首先需要配置获取 `PP-ChatOCRv3-doc` 产线的配置文件,可以通过以下命令获取:
 ```bash
-python -m paddlex --get_pipeline_config PP-ChatOCRv3-doc ./
+paddlex --get_pipeline_config PP-ChatOCRv3-doc ./
 ```
 
 执行上述命令后,配置文件会存储在当前路径下,打开配置文件,填写大语言模型的 ak/sk(access_token),如下所示:
@@ -338,7 +338,7 @@ SubModules:
 ......
 ```
 
-PP-ChatOCRv3 仅支持文心大模型,支持在[百度云千帆平台](https://console.bce.baidu.com/qianfan/ais/console/onlineService)或者[星河社区 AIStudio](https://aistudio.baidu.com/)上获取相关的 ak/sk(access_token)。如果使用百度云千帆平台,可以参考[AK和SK鉴权调用API流程](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Hlwerugt8) 获取ak/sk,如果使用星河社区 AIStudio,可以在[星河社区 AIStudio 访问令牌](https://aistudio.baidu.com/account/accessToken)中获取 access_token。
+PP-ChatOCRv3-doc 仅支持文心大模型,支持在[百度云千帆平台](https://console.bce.baidu.com/qianfan/ais/console/onlineService)或者[星河社区 AIStudio](https://aistudio.baidu.com/)上获取相关的 ak/sk(access_token)。如果使用百度云千帆平台,可以参考[AK和SK鉴权调用API流程](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Hlwerugt8) 获取ak/sk,如果使用星河社区 AIStudio,可以在[星河社区 AIStudio 访问令牌](https://aistudio.baidu.com/account/accessToken)中获取 access_token。
 
 更新配置文件后,即可使用几行Python代码完成快速推理,可以使用 [测试文件](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png)测试:
 
@@ -360,7 +360,7 @@ for res in visual_predict_res:
     layout_parsing_result = res["layout_parsing_result"]
 
 vector_info = pipeline.build_vector(visual_info_list, flag_save_bytes_vector=True)
-chat_result = pipeline.chat(key_list=["驾驶室准乘人数"], visual_info_list, vector_info=vector_info)
+chat_result = pipeline.chat(key_list=["驾驶室准乘人数"], visual_info=visual_info_list, vector_info=vector_info)
 print(chat_result)
 
 ```
@@ -371,7 +371,7 @@ print(chat_result)
 {'chat_res': {'驾驶室准乘人数': '2'}}
 ```
 
-PP-ChatOCRv3 预测的流程、API说明、产出说明如下:
+PP-ChatOCRv3-doc 预测的流程、API说明、产出说明如下:
 
 <details><summary>(1)调用 <code>create_pipeline</code> 方法实例化PP-ChatOCRv3产线对象。</summary>
 
@@ -391,7 +391,13 @@ PP-ChatOCRv3 预测的流程、API说明、产出说明如下:
 <td><code>pipeline</code></td>
 <td>产线名称或是产线配置文件路径。如为产线名称,则必须为 PaddleX 所支持的产线。</td>
 <td><code>str</code></td>
-<td>无</td>
+<td><code>None</code></td>
+</tr>
+<tr>
+<td><code>config</code></td>
+<td>产线具体的配置信息(如果和<code>pipeline</code>同时设置,优先级高于<code>pipeline</code>,且要求产线名和<code>pipeline</code>一致)。</td>
+<td><code>dict[str, Any]</code></td>
+<td><code>None</code></td>
 </tr>
 <tr>
 <td><code>device</code></td>
@@ -408,7 +414,8 @@ PP-ChatOCRv3 预测的流程、API说明、产出说明如下:
 </tbody>
 </table>
 </details>
-<details><summary>(2)调用 PP-ChatOCRv3 产线对象的 <code>visual_predict()</code> 方法获取视觉预测结果。 该方法将返回一个 generator。</summary>
+
+<details><summary>(2)调用 PP-ChatOCRv3-doc 产线对象的 <code>visual_predict()</code> 方法获取视觉预测结果。 该方法将返回一个 generator。</summary>
 
 以下是 `visual_predict()` 方法的参数及其说明:
 
@@ -870,8 +877,10 @@ for res in visual_predict_res:
             - `rec_scores`: `(List[float])` 单元格的识别置信度
             - `rec_boxes`: `(numpy.ndarray)` 检测框的矩形边界框数组,shape为(n, 4),dtype为int16。每一行表示一个矩形
 
-- 调用`save_to_json()` 方法会将上述内容保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}.json`,如果指定为文件,则直接保存到该文件中。由于json文件不支持保存numpy数组,因此会将其中的`numpy.array`类型转换为列表形式。
-- 调用`save_to_img()` 方法会将可视化结果保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}_ocr_res_img.{your_img_extension}`,如果指定为文件,则直接保存到该文件中。(产线通常包含较多结果图片,不建议直接指定为具体的文件路径,否则多张图会被覆盖,仅保留最后一张图)
+- 调用`save_to_json()` 方法会将上述内容保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}_res.json`,如果指定为文件,则直接保存到该文件中。由于json文件不支持保存numpy数组,因此会将其中的`numpy.array`类型转换为列表形式。
+- 调用`save_to_img()` 方法会将可视化结果保存到指定的`save_path`中,如果指定为目录,则会将版面区域检测可视化图像、全局OCR可视化图像、版面阅读顺序可视化图像等内容保存,如果指定为文件,则直接保存到该文件中。(产线通常包含较多结果图片,不建议直接指定为具体的文件路径,否则多张图会被覆盖,仅保留最后一张图)
+
+
 
 此外,也支持通过属性获取带结果的可视化图像和预测结果,具体如下:
 <table>
@@ -894,7 +903,8 @@ for res in visual_predict_res:
 - `json` 属性获取的预测结果为dict类型的数据,相关内容与调用 `save_to_json()` 方法保存的内容一致。
 - `img` 属性返回的预测结果是一个字典类型的数据。其中,键分别为 `layout_det_res`、`overall_ocr_res`、`text_paragraphs_ocr_res`、`formula_res_region1`、`table_cell_img` 和 `seal_res_region1`,对应的值是 `Image.Image` 对象:分别用于显示版面区域检测、OCR、OCR文本段落、公式、表格和印章结果的可视化图像。如果没有使用可选模块,则字典中只包含 `layout_det_res`。
 </details>
-<details><summary>(4)调用PP-ChatOCRv3的产线对象的 <code>build_vector()</code> 方法,对文本内容进行向量构建。</summary>
+
+<details><summary>(4)调用 PP-ChatOCRv3-doc 的产线对象的 <code>build_vector()</code> 方法,对文本内容进行向量构建。</summary>
 
 以下是 `build_vector()` 方法的参数及其说明:
 
@@ -944,7 +954,8 @@ for res in visual_predict_res:
 - `flag_too_short_text`:`(bool)`是否文本长度小于最小字符数量
 - `vector`: `(str|list)` 文本的二进制内容或者文本内容,取决于`flag_save_bytes_vector`和`min_characters`的值,如果`flag_save_bytes_vector=True`且文本长度大于等于最小字符数量,则返回二进制内容;否则返回原始的文本。
 </details>
-<details><summary>(5)调用PP-ChatOCRv3的产线对象的 <code>chat()</code> 方法,对关键信息进行抽取。</summary>
+
+<details><summary>(5)调用 PP-ChatOCRv3-doc 的产线对象的 <code>chat()</code> 方法,对关键信息进行抽取。</summary>
 
 以下是 `chat()` 方法的参数及其说明:
 
@@ -1690,12 +1701,14 @@ SubModules:
     TextRecognition:
     module_name: text_recognition
     model_name: PP-OCRv4_server_rec
-    model_dir: null # 替换为微调后的文本检测模型权重路径
+    model_dir: null # 替换为微调后的文本识别模型权重路径
     batch_size: 1
             score_thresh: 0
 ......
 ```
 
+<br>注:为了文档紧凑,上述只列举了两个模型,事实上,配置文件中的模型均可替换。
+
 随后, 参考[2.2 本地体验](#22-本地体验)中的命令行方式或Python脚本方式,加载修改后的产线配置文件即可。
 
 ##  5. 多硬件支持
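The hunk above also fixes the quick-start `chat()` call: the old snippet passed `visual_info_list` positionally after the keyword argument `key_list=...`, which Python rejects as a `SyntaxError`. A stand-alone illustration with a stub `chat()` (hypothetical signature, standing in for the pipeline method):

```python
def chat(key_list, visual_info, vector_info=None):
    # Stub for the pipeline's chat() method: look up each requested key.
    return {"chat_res": {k: visual_info.get(k) for k in key_list}}

# The old form placed a positional argument after a keyword argument,
# which Python rejects outright:
#     chat(key_list=[...], visual_info_list, vector_info=vector_info)
# Once one argument is named, every later argument must be named too:
result = chat(key_list=["capacity"], visual_info={"capacity": "2"})
print(result)  # {'chat_res': {'capacity': '2'}}
```

Naming every argument, as the corrected docs do, also guards the call against future reordering of the method's parameters.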

+ 10 - 4
docs/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.md

@@ -6,11 +6,11 @@ comments: true
 
 ## 1. 文档图像预处理产线介绍
 
-文档图像预处理产线集成了文档方向分类和形变矫正两大功能。文档方向分类可自动识别文档的四个方向(0°、90°、180°、270°),确保文档以正确的方向进行后续处理。几何形变矫正模型则用于修正文档拍摄或扫描过程中的几何扭曲,恢复文档的原始形状和比例。适用于数字化文档管理、doc_preprocessor识别前处理、以及任何需要提高文档图像质量的场景。通过自动化的方向校正与形变矫正,该模块显著提升了文档处理的准确性和效率,为用户提供更为可靠的图像分析基础。本产线同时提供了灵活的服务化部署方式,支持在多种硬件上使用多种编程语言调用。不仅如此,本产线也提供了二次开发的能力,您可以基于本产线在您自己的数据集上训练调优,训练后的模型也可以无缝集成。
+文档图像预处理产线集成了文档方向分类和形变矫正两大功能。文档方向分类可自动识别文档的四个方向(0°、90°、180°、270°),确保文档以正确的方向进行后续处理。文本图像矫正模型则用于修正文档拍摄或扫描过程中的几何扭曲,恢复文档的原始形状和比例。适用于数字化文档管理、OCR类任务前处理、以及任何需要提高文档图像质量的场景。通过自动化的方向校正与形变矫正,该模块显著提升了文档处理的准确性和效率,为用户提供更为可靠的图像分析基础。本产线同时提供了灵活的服务化部署方式,支持在多种硬件上使用多种编程语言调用。不仅如此,本产线也提供了二次开发的能力,您可以基于本产线在您自己的数据集上训练调优,训练后的模型也可以无缝集成。
 
-<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/doc_preprocessor/02.jpg"/>
-<b>通用文档图像预处理</b><b>产线中包含可选用的文档图像方向分类模块和文档图像矫正模块</b>包含的模型如下。
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/doc_preprocessor/02.jpg">
 
+<b>通用文档图像预处理</b><b>产线中包含可选用的文档图像方向分类模块和文本图像矫正模块</b>包含的模型如下。
 <p><b>文档图像方向分类模块(可选):</b></p>
 <table>
 <thead>
@@ -79,7 +79,7 @@ paddlex --pipeline doc_preprocessor \
 
 运行后,会将结果打印到终端上,结果如下:
 
-<pre><code>{'res': {'input_path': 'doc_test_rotated.jpg', 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': True}, 'angle': 180}}
+<pre><code>{'res': {'input_path': 'doc_test_rotated.jpg', 'page_index': None, 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': True}, 'angle': 180}}
 </code></pre>
 
 运行结果参数说明可以参考[2.1.2 Python脚本方式集成](#212-python脚本方式集成)中的结果解释。
@@ -129,6 +129,12 @@ for res in output:
 <td><code>None</code></td>
 </tr>
 <tr>
+<td><code>config</code></td>
+<td>产线具体的配置信息(如果和<code>pipeline</code>同时设置,优先级高于<code>pipeline</code>,且要求产线名和<code>pipeline</code>一致)。</td>
+<td><code>dict[str, Any]</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
 <td><code>device</code></td>
 <td>产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。</td>
 <td><code>str</code></td>
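The hunks above document a new `config` argument to `create_pipeline`: when set together with `pipeline`, it takes precedence but must name the same pipeline. A minimal sketch of that precedence rule (a hypothetical resolver, not the actual PaddleX logic):

```python
def resolve_pipeline(pipeline=None, config=None):
    """Pick the effective settings: an explicit `config` dict overrides the
    named `pipeline`, but the two must refer to the same pipeline."""
    if config is not None:
        if pipeline is not None and config.get("pipeline_name") != pipeline:
            raise ValueError("config and pipeline must name the same pipeline")
        return config
    if pipeline is None:
        raise ValueError("either pipeline or config must be provided")
    return {"pipeline_name": pipeline}  # fall back to the named defaults

print(resolve_pipeline(pipeline="doc_preprocessor"))
```

The name-match check mirrors the constraint stated in the new parameter row: `config` may refine a pipeline's settings but cannot quietly swap in a different pipeline.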

+ 15 - 75
docs/pipeline_usage/tutorials/ocr_pipelines/layout_parsing_v2.md

@@ -93,70 +93,6 @@ comments: true
 <td>4.834</td>
 <td>基于PicoDet-S在中英文论文、杂志、合同、书本、试卷和研报等场景上自建数据集训练的高效率版面区域定位模型</td>
 </tr>
-<tr>
-<td>PicoDet_layout_1x</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet_layout_1x_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PicoDet_layout_1x_pretrained.pdparams">训练模型</a></td>
-<td>86.8</td>
-<td>9.03 / 3.10</td>
-<td>25.82 / 20.70</td>
-<td>7.4</td>
-<td>基于PicoDet-1x在PubLayNet数据集训练的高效率版面区域定位模型,可定位包含文字、标题、表格、图片以及列表这5类区域</td>
-</tr>
-<tr>
-<td>PicoDet_layout_1x_table</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet_layout_1x_table_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PicoDet_layout_1x_table_pretrained.pdparams">训练模型</a></td>
-<td>95.7</td>
-<td>8.02 / 3.09</td>
-<td>23.70 / 20.41</td>
-<td>7.4 M</td>
-<td>基于PicoDet-1x在自建数据集训练的高效率版面区域定位模型,可定位包含表格这1类区域</td>
-</tr>
-<tr>
-<td>PicoDet-S_layout_3cls</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet-S_layout_3cls_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PicoDet-S_layout_3cls_pretrained.pdparams">训练模型</a></td>
-<td>87.1</td>
-<td>8.99 / 2.22</td>
-<td>16.11 / 8.73</td>
-<td>4.8</td>
-<td>基于PicoDet-S轻量模型在中英文论文、杂志和研报等场景上自建数据集训练的高效率版面区域定位模型,包含3个类别:表格,图像和印章</td>
-</tr>
-<tr>
-<td>PicoDet-S_layout_17cls</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet-S_layout_17cls_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PicoDet-S_layout_17cls_pretrained.pdparams">训练模型</a></td>
-<td>70.3</td>
-<td>9.11 / 2.12</td>
-<td>15.42 / 9.12</td>
-<td>4.8</td>
-<td>基于PicoDet-S轻量模型在中英文论文、杂志和研报等场景上自建数据集训练的高效率版面区域定位模型,包含17个版面常见类别,分别是:段落标题、图片、文本、数字、摘要、内容、图表标题、公式、表格、表格标题、参考文献、文档标题、脚注、页眉、算法、页脚、印章</td>
-</tr>
-<tr>
-<td>PicoDet-L_layout_3cls</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet-L_layout_3cls_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PicoDet-L_layout_3cls_pretrained.pdparams">训练模型</a></td>
-<td>89.3</td>
-<td>13.05 / 4.50</td>
-<td>41.30 / 41.30</td>
-<td>22.6</td>
-<td>基于PicoDet-L在中英文论文、杂志和研报等场景上自建数据集训练的高效率版面区域定位模型,包含3个类别:表格,图像和印章</td>
-</tr>
-<tr>
-<td>PicoDet-L_layout_17cls</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet-L_layout_17cls_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PicoDet-L_layout_17cls_pretrained.pdparams">训练模型</a></td>
-<td>79.9</td>
-<td>13.50 / 4.69</td>
-<td>43.32 / 43.32</td>
-<td>22.6</td>
-<td>基于PicoDet-L在中英文论文、杂志和研报等场景上自建数据集训练的高效率版面区域定位模型,包含17个版面常见类别,分别是:段落标题、图片、文本、数字、摘要、内容、图表标题、公式、表格、表格标题、参考文献、文档标题、脚注、页眉、算法、页脚、印章</td>
-</tr>
-<tr>
-<td>RT-DETR-H_layout_3cls</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/RT-DETR-H_layout_3cls_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/RT-DETR-H_layout_3cls_pretrained.pdparams">训练模型</a></td>
-<td>95.9</td>
-<td>114.93 / 27.71</td>
-<td>947.56 / 947.56</td>
-<td>470.1</td>
-<td>基于RT-DETR-H在中英文论文、杂志和研报等场景上自建数据集训练的高精度版面区域定位模型,包含3个类别:表格,图像和印章</td>
-</tr>
-<tr>
-<td>RT-DETR-H_layout_17cls</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/RT-DETR-H_layout_17cls_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/RT-DETR-H_layout_17cls_pretrained.pdparams">训练模型</a></td>
-<td>92.6</td>
-<td>115.29 / 104.09</td>
-<td>995.27 / 995.27</td>
-<td>470.2</td>
-<td>基于RT-DETR-H在中英文论文、杂志和研报等场景上自建数据集训练的高精度版面区域定位模型,包含17个版面常见类别,分别是:段落标题、图片、文本、数字、摘要、内容、图表标题、公式、表格、表格标题、参考文献、文档标题、脚注、页眉、算法、页脚、印章</td>
-</tr>
 </tbody>
 </table>
 <p><b>注:以上精度指标的评估集是 PaddleOCR 自建的版面区域分析数据集,包含中英文论文、杂志和研报等常见的 1w 张文档类型图片。GPU 推理耗时基于 NVIDIA Tesla T4 机器,精度类型为 FP32, CPU 推理速度基于 Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz,线程数为 8,精度类型为 FP32。</b></p>
@@ -540,6 +476,7 @@ devanagari_PP-OCRv3_mobile_rec_infer.tar">推理模型</a>/<a href="">训练模
 </tbody>
 </table>
 <p><b>注:以上精度指标的评估集是自建的数据集,包含500张圆形印章图像。GPU 推理耗时基于 NVIDIA Tesla T4 机器,精度类型为 FP32, CPU 推理速度基于 Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz,线程数为 8,精度类型为 FP32。</b></p>
+</b></p></details>
 <p><b>文本图像矫正模块模型:</b></p>
 <table>
 <thead>
@@ -1138,7 +1075,7 @@ for item in markdown_images:
 </table>
 
 - 调用`print()` 方法会将结果打印到终端,打印到终端的内容解释如下:
-    - `input_path`: `(str)` 待预测图像的输入路径
+    - `input_path`: `(str)` 待预测图像或者PDF的输入路径
 
     - `page_index`: `(Union[int, None])` 如果输入是PDF文件,则表示当前是PDF的第几页,否则为 `None`
 
@@ -1150,13 +1087,22 @@ for item in markdown_images:
         - `use_table_recognition`: `(bool)` 控制是否启用表格识别子产线
         - `use_formula_recognition`: `(bool)` 控制是否启用公式识别子产线
 
+    - `doc_preprocessor_res`: `(Dict[str, Union[List[float], str]])` 文档预处理结果字典,仅当`use_doc_preprocessor=True`时存在
+        - `input_path`: `(str)` 文档预处理子产线接受的图像路径,当输入为`numpy.ndarray`时,保存为`None`,此处为`None`
+        - `page_index`: `None`,此处的输入为`numpy.ndarray`,所以值为`None`
+        - `model_settings`: `(Dict[str, bool])` 文档预处理子产线的模型配置参数
+          - `use_doc_orientation_classify`: `(bool)` 控制是否启用文档图像方向分类子模块
+          - `use_doc_unwarping`: `(bool)` 控制是否启用文本图像扭曲矫正子模块
+        - `angle`: `(int)` 文档图像方向分类子模块的预测结果,启用时返回实际角度值
+
     - `parsing_res_list`: `(List[Dict])` 解析结果的列表,每个元素为一个字典,列表顺序为解析后的阅读顺序。
         - `layout_bbox`: `(np.ndarray)` 版面区域的边界框。
-        - `{label}`: `(str)` key 为版面区域的标签,例如`text`, `table`等,内容为版面区域内的内容。
+        - `label`: `(str)` key 为版面区域的标签,例如`text`, `table`等,内容为版面区域内的内容。
         - `layout`: `(str)` 版面排版类型,例如 `double`, `single` 等。
 
     - `overall_ocr_res`: `(Dict[str, Union[List[str], List[float], numpy.ndarray]])` 全局 OCR 结果的字典
-      -  `input_path`: `(Union[str, None])` 图像OCR子产线接受的图像路径,当输入为`numpy.ndarray`时,保存为`None`
+      - `input_path`: `(Union[str, None])` 图像OCR子产线接受的图像路径,当输入为`numpy.ndarray`时,保存为`None`
+      - `page_index`: `None`,此处的输入为`numpy.ndarray`,所以值为`None`
       - `model_settings`: `(Dict)` OCR子产线的模型配置参数
       - `dt_polys`: `(List[numpy.ndarray])` 文本检测的多边形框列表。每个检测框由4个顶点坐标构成的numpy数组表示,数组shape为(4, 2),数据类型为int16
       - `dt_scores`: `(List[float])` 文本检测框的置信度列表
@@ -1188,6 +1134,7 @@ for item in markdown_images:
 
     - `seal_res_list`: `(List[Dict[str, Union[numpy.ndarray, List[float], str]]])` 印章识别结果列表,每个元素为一个字典
         - `input_path`: `(str)` 印章图像的输入路径
+        - `page_index`: `None`,此处的输入为`numpy.ndarray`,所以值为`None`
         - `model_settings`: `(Dict)` 印章识别子产线的模型配置参数
         - `dt_polys`: `(List[numpy.ndarray])` 印章检测框列表,格式同`dt_polys`
         - `text_det_params`: `(Dict[str, Dict[str, int, float]])` 印章检测模块的配置参数, 具体参数含义同上
@@ -1598,13 +1545,7 @@ for res in output:
 <td>Markdown图片相对路径和base64编码图像的键值对。</td>
 </tr>
 </tbody>
-<<<<<<< HEAD
 </table></details>
-=======
-</table>
-</details>
-
->>>>>>> 6c84cdc9 (update)
 <details><summary>多语言调用服务示例</summary>
 <details>
 <summary>Python</summary>
@@ -1657,9 +1598,8 @@ for res in result["layoutParsingResults"]:
 如果通用版面解析v2产线提供的默认模型权重在您的场景中,精度或速度不满意,您可以尝试利用<b>您自己拥有的特定领域或应用场景的数据</b>对现有模型进行进一步的<b>微调</b>,以提升通用版面解析v2产线的在您的场景中的识别效果。
 
 ### 4.1 模型微调
-由于通用版面解析v2产线包含7个模块,模型产线的效果不及预期可能来自于其中任何一个模块。
 
-由于通用版面解析v2产线包含若干模块,模型产线的效果不及预期可能来自于其中任何一个模块。您可以对提取效果差的 case 进行分析,通过可视化图像,确定是哪个模块存在问题,并参考以下表格中对应的微调教程链接进行模型微调。
+由于通用版面解析v2产线包含若干模块,模型产线的效果不及预期可能来自于其中任何一个模块。您可以对提取效果差的 case 进行分析,通过可视化图像,确定是哪个模块存在问题,并参考以下表格中对应的微调教程链接进行模型微调。
 
 
 <table>
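The hunks above describe `parsing_res_list` as a reading-ordered list of region dicts carrying `layout_bbox`, `label`, and `layout` keys. A stand-alone sketch of consuming that structure (the sample data below is hypothetical, shaped after the documented fields):

```python
# Hypothetical entries mirroring the documented parsing_res_list structure:
# one dict per layout region, in reading order.
parsing_res_list = [
    {"layout_bbox": [12, 30, 480, 88], "label": "text", "layout": "single"},
    {"layout_bbox": [12, 100, 480, 320], "label": "table", "layout": "double"},
    {"layout_bbox": [12, 340, 480, 400], "label": "text", "layout": "single"},
]

def regions_by_label(results, label):
    """Select parsed layout regions with a given label, keeping reading order."""
    return [r for r in results if r["label"] == label]

print([r["layout_bbox"] for r in regions_by_label(parsing_res_list, "text")])
```

Filtering by `label` like this is the typical first step when post-processing a parse, e.g. extracting only tables or only body text from a page.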

+ 10 - 222
docs/pipeline_usage/tutorials/ocr_pipelines/seal_recognition.md

@@ -517,234 +517,22 @@ paddlex --pipeline seal_recognition \
 <details><summary> 👉Click to expand</summary>
 
 ```bash
-{'res': {'input_path': 'seal_text_det.png', 'model_settings': {'use_doc_preprocessor': False, 'use_layout_detection': True}, 'layout_det_res': {'input_path': None, 'page_index': None, 'boxes': [{'cls_id': 16, 'label': 'seal', 'score': 0.975529670715332, 'coordinate': [6.191284, 0.16680908, 634.39325, 628.85345]}]}, 'seal_res_list': [{'input_path': None, 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'dt_polys': [array([[320,  38],
-       [479,  92],
-       [483,  94],
-       [486,  97],
-       [579, 226],
-       [582, 230],
-       [582, 235],
-       [584, 383],
-       [584, 388],
-       [582, 392],
-       [578, 396],
-       [573, 398],
-       [566, 398],
-       [502, 380],
-       [497, 377],
-       [494, 374],
-       [491, 369],
-       [491, 366],
-       [488, 259],
-       [424, 172],
-       [318, 136],
-       [251, 154],
-       [200, 174],
-       [137, 260],
-       [133, 366],
-       [132, 370],
-       [130, 375],
-       [126, 378],
-       [123, 380],
-       [ 60, 398],
-       [ 55, 398],
-       [ 49, 397],
-       [ 45, 394],
-       [ 43, 390],
-       [ 41, 383],
-       [ 43, 236],
-       [ 44, 230],
-       [ 45, 227],
-       [141,  96],
-       [144,  93],
-       [148,  90],
-       [311,  38],
+{'res': {'input_path': 'seal_text_det.png', 'model_settings': {'use_doc_preprocessor': False, 'use_layout_detection': True}, 'layout_det_res': {'input_path': None, 'page_index': None, 'boxes': [{'cls_id': 16, 'label': 'seal', 'score': 0.975531280040741, 'coordinate': [6.195526, 0.1579895, 634.3982, 628.84595]}]}, 'seal_res_list': [{'input_path': None, 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'dt_polys': [array([[320,  38],
+       ...,
        [315,  38]]), array([[461, 347],
-       [465, 350],
-       [468, 354],
-       [470, 360],
-       [470, 425],
-       [469, 429],
-       [467, 433],
-       [462, 437],
-       [456, 439],
-       [169, 439],
-       [165, 439],
-       [160, 436],
-       [157, 432],
-       [155, 426],
-       [154, 360],
-       [155, 356],
-       [158, 352],
-       [161, 348],
-       [168, 346],
+       ...,
        [456, 346]]), array([[439, 445],
-       [441, 447],
-       [443, 451],
-       [444, 453],
-       [444, 497],
-       [443, 502],
-       [440, 504],
-       [437, 506],
-       [434, 507],
-       [189, 505],
-       [184, 504],
-       [182, 502],
-       [180, 498],
-       [179, 496],
-       [181, 453],
-       [182, 449],
-       [184, 446],
-       [188, 444],
+       ...,
        [434, 444]]), array([[158, 468],
-       [199, 502],
-       [242, 522],
-       [299, 534],
-       [339, 532],
-       [373, 526],
-       [417, 508],
-       [459, 475],
-       [462, 474],
-       [467, 474],
-       [472, 476],
-       [502, 507],
-       [503, 510],
-       [504, 515],
-       [503, 518],
-       [501, 521],
-       [452, 559],
-       [450, 560],
-       [391, 584],
-       [390, 584],
-       [372, 590],
-       [370, 590],
-       [305, 596],
-       [302, 596],
-       [224, 581],
-       [221, 580],
-       [164, 553],
-       [162, 551],
-       [114, 509],
-       [112, 507],
-       [111, 503],
-       [112, 498],
-       [114, 496],
-       [146, 468],
-       [149, 466],
-       [154, 466]])], 'text_det_params': {'limit_side_len': 736, 'limit_type': 'min', 'thresh': 0.2, 'box_thresh': 0.6, 'unclip_ratio': 0.5}, 'text_type': 'seal', 'textline_orientation_angles': [-1, -1, -1, -1], 'text_rec_score_thresh': 0, 'rec_texts': ['天津君和缘商贸有限公司', '发票专用章', '吗繁物', '5263647368706'], 'rec_scores': [0.9934046268463135, 0.9999403953552246, 0.998250424861908, 0.9913849234580994], 'rec_polys': [array([[320,  38],
-       [479,  92],
-       [483,  94],
-       [486,  97],
-       [579, 226],
-       [582, 230],
-       [582, 235],
-       [584, 383],
-       [584, 388],
-       [582, 392],
-       [578, 396],
-       [573, 398],
-       [566, 398],
-       [502, 380],
-       [497, 377],
-       [494, 374],
-       [491, 369],
-       [491, 366],
-       [488, 259],
-       [424, 172],
-       [318, 136],
-       [251, 154],
-       [200, 174],
-       [137, 260],
-       [133, 366],
-       [132, 370],
-       [130, 375],
-       [126, 378],
-       [123, 380],
-       [ 60, 398],
-       [ 55, 398],
-       [ 49, 397],
-       [ 45, 394],
-       [ 43, 390],
-       [ 41, 383],
-       [ 43, 236],
-       [ 44, 230],
-       [ 45, 227],
-       [141,  96],
-       [144,  93],
-       [148,  90],
-       [311,  38],
+       ...,
+       [154, 466]])], 'text_det_params': {'limit_side_len': 736, 'limit_type': 'min', 'thresh': 0.2, 'box_thresh': 0.6, 'unclip_ratio': 0.5}, 'text_type': 'seal', 'textline_orientation_angles': array([-1, ..., -1]), 'text_rec_score_thresh': 0, 'rec_texts': ['天津君和缘商贸有限公司', '发票专用章', '吗繁物', '5263647368706'], 'rec_scores': array([0.9934051 , ..., 0.99139398]), 'rec_polys': [array([[320,  38],
+       ...,
        [315,  38]]), array([[461, 347],
-       [465, 350],
-       [468, 354],
-       [470, 360],
-       [470, 425],
-       [469, 429],
-       [467, 433],
-       [462, 437],
-       [456, 439],
-       [169, 439],
-       [165, 439],
-       [160, 436],
-       [157, 432],
-       [155, 426],
-       [154, 360],
-       [155, 356],
-       [158, 352],
-       [161, 348],
-       [168, 346],
+       ...,
        [456, 346]]), array([[439, 445],
-       [441, 447],
-       [443, 451],
-       [444, 453],
-       [444, 497],
-       [443, 502],
-       [440, 504],
-       [437, 506],
-       [434, 507],
-       [189, 505],
-       [184, 504],
-       [182, 502],
-       [180, 498],
-       [179, 496],
-       [181, 453],
-       [182, 449],
-       [184, 446],
-       [188, 444],
+       ...,
        [434, 444]]), array([[158, 468],
-       [199, 502],
-       [242, 522],
-       [299, 534],
-       [339, 532],
-       [373, 526],
-       [417, 508],
-       [459, 475],
-       [462, 474],
-       [467, 474],
-       [472, 476],
-       [502, 507],
-       [503, 510],
-       [504, 515],
-       [503, 518],
-       [501, 521],
-       [452, 559],
-       [450, 560],
-       [391, 584],
-       [390, 584],
-       [372, 590],
-       [370, 590],
-       [305, 596],
-       [302, 596],
-       [224, 581],
-       [221, 580],
-       [164, 553],
-       [162, 551],
-       [114, 509],
-       [112, 507],
-       [111, 503],
-       [112, 498],
-       [114, 496],
-       [146, 468],
-       [149, 466],
+       ...,
        [154, 466]])], 'rec_boxes': array([], dtype=float64)}]}}
 ```
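
The result dictionary printed above can also be consumed programmatically. A minimal sketch over a hand-built dictionary that mirrors the structure shown (values abbreviated for illustration):

```python
# Hand-built fragment mirroring the printed result structure above
res = {
    "res": {
        "input_path": "seal_text_det.png",
        "seal_res_list": [
            {
                "rec_texts": ["天津君和缘商贸有限公司", "发票专用章"],
                "rec_scores": [0.9934, 0.9999],
            }
        ],
    }
}

# Pair each recognized seal text with its confidence score
for seal in res["res"]["seal_res_list"]:
    for text, score in zip(seal["rec_texts"], seal["rec_scores"]):
        print(f"{text}: {score:.4f}")
```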
 

+ 1 - 1
docs/practical_tutorials/document_scene_information_extraction(layout_detection)_tutorial.en.md

@@ -537,7 +537,7 @@ chat_result = pipeline.chat(
 chat_result.print()
 ```
 
-For more parameters, please refer to the [Document Scene Information Extraction Pipeline Usage Tutorial](../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md).
+For more parameters, please refer to the [Document Scene Information Extraction Pipeline Usage Tutorial](../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md).
 
 2. Additionally, PaddleX offers three other deployment methods, detailed as follows:
 

+ 1 - 1
docs/practical_tutorials/document_scene_information_extraction(layout_detection)_tutorial.md

@@ -539,7 +539,7 @@ chat_result = pipeline.chat(
 chat_result.print()
 ```
 
-For more parameters, please refer to the [Document Scene Information Extraction v3 Pipeline Usage Tutorial](../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md).
+For more parameters, please refer to the [Document Scene Information Extraction v3 Pipeline Usage Tutorial](../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md).
 
 2. Additionally, PaddleX also offers three other deployment methods, detailed as follows:
 

+ 1 - 1
docs/practical_tutorials/document_scene_information_extraction(seal_recognition)_tutorial.en.md

@@ -468,7 +468,7 @@ chat_result = pipeline.chat(
 chat_result.print()
 ```
 
-For more parameters, please refer to the [Document Scene Information Extraction Pipeline Usage Tutorial](../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md).
+For more parameters, please refer to the [Document Scene Information Extraction Pipeline Usage Tutorial](../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md).
 
 2. Additionally, PaddleX offers three other deployment methods, detailed as follows:
 

+ 1 - 1
docs/practical_tutorials/document_scene_information_extraction(seal_recognition)_tutorial.md

@@ -458,7 +458,7 @@ chat_result = pipeline.chat(
 chat_result.print()
 ```
 
-For more parameters, please refer to the [Document Scene Information Extraction v3 Pipeline Usage Tutorial](../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md).
+For more parameters, please refer to the [Document Scene Information Extraction v3 Pipeline Usage Tutorial](../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md).
 
 2. Additionally, PaddleX also offers three other deployment methods, detailed as follows:
 

+ 7 - 2
paddlex/configs/pipelines/layout_parsing_v2.yaml

@@ -37,7 +37,7 @@ SubPipelines:
     pipeline_name: OCR
     text_type: general
     use_doc_preprocessor: False
-    use_textline_orientation: False
+    use_textline_orientation: True
     SubModules:
       TextDetection:
         module_name: text_detection
@@ -48,13 +48,18 @@ SubPipelines:
         thresh: 0.3
         box_thresh: 0.6
         unclip_ratio: 2.0
-        
+      TextLineOrientation:
+        module_name: textline_orientation
+        model_name: PP-LCNet_x0_25_textline_ori
+        model_dir: null
+        batch_size: 1 
       TextRecognition:
         module_name: text_recognition
         model_name: PP-OCRv4_server_rec_doc
         model_dir: null
         batch_size: 1
         score_thresh: 0.0
+ 
 
   TableRecognition:
     pipeline_name: table_recognition_v2
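
The new `TextLineOrientation` entry sits alongside `TextDetection` and `TextRecognition` under the `SubModules` of the OCR sub-pipeline. A minimal sketch that parses an equivalent fragment with PyYAML (the fragment is reconstructed here for illustration, assuming PyYAML is installed):

```python
import yaml

# Fragment mirroring the OCR sub-pipeline configuration in the diff above
fragment = """
pipeline_name: OCR
use_textline_orientation: True
SubModules:
  TextLineOrientation:
    module_name: textline_orientation
    model_name: PP-LCNet_x0_25_textline_ori
    model_dir: null
    batch_size: 1
"""
cfg = yaml.safe_load(fragment)

assert cfg["use_textline_orientation"] is True
assert cfg["SubModules"]["TextLineOrientation"]["model_dir"] is None
```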

+ 1 - 1
paddlex/inference/pipelines/layout_parsing/utils.py

@@ -743,7 +743,7 @@ def _img_array2path(data: np.ndarray) -> str:
         # Generate a unique filename using UUID
         img_name = f"image_{uuid.uuid4().hex}.png"
 
-        return {f"imgs/{img_name}": Image.fromarray(data[:, :, ::-1])}
+        return {f"imgs/{img_name}": Image.fromarray(data)}
     else:
         raise ValueError(
             "Input data must be a 3-dimensional numpy array representing an image."
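
The fix above removes the `data[:, :, ::-1]` slice, which reverses the channel axis (BGR↔RGB) before handing the array to `PIL.Image.fromarray`. A minimal numpy sketch of what that slice does:

```python
import numpy as np

# A 1x1 "image" whose single pixel has channels (10, 20, 30)
img = np.array([[[10, 20, 30]]], dtype=np.uint8)

# [:, :, ::-1] reverses the last (channel) axis: (10, 20, 30) -> (30, 20, 10)
flipped = img[:, :, ::-1]

assert flipped[0, 0].tolist() == [30, 20, 10]
# After the fix the array is passed through unchanged, so the channel
# order the caller provides is the order that gets saved.
```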