cuicheng01 6 months ago
parent
commit
55be4b4f43

+ 24 - 5
README.md

@@ -35,6 +35,15 @@ PaddleX 3.0 是基于飞桨框架构建的低代码开发工具,它集成了
 
 ## 📣 近期更新
 
+🔥🔥 **2025.5.20,发布 PaddleX v3.0.0**,核心升级如下:
+- **重要能力发布:**
+  - **重磅发布文字识别模型 PP-OCRv5**:全场景 OCR 识别精度跃升13%,单模型同时支持 5 种文字类型(简体中文、繁体中文、中文拼音、英文和日文),在中英文手写字体、竖直文本、生僻字等提升非常明显。可在 [在线Demo](https://aistudio.baidu.com/community/app/91660/webUI?source=appCenter) 中立即体验。
+  - **重磅发布文档解析方案 PP-StructureV3**:强化了版面区域检测、表格识别、中英文公式识别、多栏阅读顺序的恢复能力,增加了图表理解能力,在 OmniDocBench 榜单上,PP-StructureV3 的整体中文和英文的编辑距离均达到 SOTA 水平。可在 [在线Demo](https://aistudio.baidu.com/community/app/518494/webUI?source=appCenter) 中立即体验。
+  - **优化PP-ChatOCRv4**:原生支持文心大模型4.5T,结合PP-DocBee2,关键信息抽取精度相比上一代提升15.7个百分点。可在 [在线Demo](https://aistudio.baidu.com/community/app/518493/webUI?source=appCenter) 中立即体验。
+- **推理能力优化:**
+  - 通用OCR、通用版面解析v3、公式识别、印章文本识别、文档图像预处理产线支持设置batch size>1,一次处理多个页面。
+  - 通用OCR、通用版面解析v3等17条产线支持多卡并行推理;新增产线多进程并行推理示例代码。
+
 🔥🔥 **2025.4.22,发布 PaddleX v3.0.0rc1 。** 本次版本全面适配 PaddlePaddle 3.0正式版,核心升级如下:
 
 - **全面适配飞桨框架3.0新特性**:支持编译器训练,训练命令通过追加 `-o Global.dy2st=True` 即可开启编译器训练,在 GPU 上,多数模型训练速度可提升 10% 以上,少部分模型训练速度可以提升 30% 以上。推理方面,模型整体适配飞桨 3.0 中间表示技术(PIR),拥有更加灵活的扩展能力和兼容性,静态图模型存储文件名由 `xxx.pdmodel` 改为 `xxx.json`。
@@ -111,7 +120,7 @@ PaddleX的各个产线均支持本地**快速推理**,部分模型支持在[AI
         <td>✅</td>
     </tr>
     <tr>
-        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.html">文档场景信息抽取v3</a></td>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.html">文档场景信息抽取v3</a></td>
         <td><a href = "https://aistudio.baidu.com/community/app/182491/webUI?source=appCenter">链接</a></td>
         <td>✅</td>
         <td>✅</td>
@@ -121,6 +130,16 @@ PaddleX的各个产线均支持本地**快速推理**,部分模型支持在[AI
         <td>✅</td>
     </tr>
     <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.html">文档场景信息抽取v4</a></td>
+        <td><a href = "https://aistudio.baidu.com/community/app/518493/webUI?source=appCenter">链接</a></td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+    </tr>
+    <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/table_recognition.html">通用表格识别</a></td>
         <td><a href = "https://aistudio.baidu.com/community/app/91661?source=appMineRecent">链接</a></td>
         <td>✅</td>
@@ -322,13 +341,13 @@ PaddleX的各个产线均支持本地**快速推理**,部分模型支持在[AI
     </tr>
     <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html">通用表格识别v2</a></td>
-        <td>🚧</td>
+        <td><a href = "https://aistudio.baidu.com/community/app/518495/webUI?source=appCenter">链接</a></td>
         <td>✅</td>
         <td>✅</td>
         <td>✅</td>
         <td>🚧</td>
         <td>✅</td>
-        <td>🚧</td>
+        <td></td>
     </tr>
     <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html">通用版面解析</a></td>
@@ -342,13 +361,13 @@ PaddleX的各个产线均支持本地**快速推理**,部分模型支持在[AI
     </tr>
     <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/PP-StructureV3.html">通用版面解析v3</a></td>
-        <td>🚧</td>
+        <td><a href = "https://aistudio.baidu.com/community/app/518494/webUI?source=appCenter">链接</a></td>
         <td>✅</td>
         <td>✅</td>
         <td>✅</td>
         <td>🚧</td>
         <td>🚧</td>
-        <td>🚧</td>
+        <td></td>
     </tr>
     <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.html">文档图像预处理</a></td>
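Note on the release entry above: it mentions batch size > 1 and multi-process, multi-card parallel inference, but the diff does not include the sample code itself. The following is only a minimal sketch of the general idea (shard pages across devices, then batch within each shard). Every name in it, including `placeholder_worker` and the device strings, is hypothetical and stands in for a real pipeline call; it does not use or describe the PaddleX API.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Callable, List, Sequence


def shard_round_robin(items: Sequence[str], num_shards: int) -> List[List[str]]:
    """Distribute items across shards so each device gets a near-equal share."""
    if num_shards < 1:
        raise ValueError("num_shards must be >= 1")
    return [list(items[i::num_shards]) for i in range(num_shards)]


def make_batches(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Group items into consecutive batches of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def placeholder_worker(device: str, pages: List[str], batch_size: int) -> List[str]:
    """Hypothetical stand-in for inference: tags each page with device and batch index."""
    results: List[str] = []
    for batch_idx, batch in enumerate(make_batches(pages, batch_size)):
        results.extend(f"{device}:batch{batch_idx}:{page}" for page in batch)
    return results


def run_parallel(
    pages: Sequence[str],
    devices: Sequence[str],
    batch_size: int,
    worker: Callable[[str, List[str], int], List[str]] = placeholder_worker,
) -> List[str]:
    """Run one worker process per device, each handling its own shard of pages."""
    shards = shard_round_robin(pages, len(devices))
    with ProcessPoolExecutor(max_workers=len(devices)) as pool:
        futures = [
            pool.submit(worker, device, shard, batch_size)
            for device, shard in zip(devices, shards)
        ]
        # Results come back grouped by device, not in the original page order.
        return [result for future in futures for result in future.result()]
```

In a real script, `run_parallel` would be called under an `if __name__ == "__main__":` guard, and the worker would create its pipeline once per process before looping over batches.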

+ 38 - 31
README_en.md

@@ -37,6 +37,24 @@ PaddleX 3.0 is a low-code development tool for AI models built on the PaddlePadd
 
 ## 📣 Recent Updates
 
+
+🔥🔥 **2025.5.20: PaddleX v3.0.0 Released**
+
+Core upgrades are as follows:
+
+- **Major Capability Releases:**
+  - **Launch of the groundbreaking text recognition model PP-OCRv5**: Achieves a 13% improvement in OCR accuracy across all scenarios. A single model now supports 5 types of text (Simplified Chinese, Traditional Chinese, Chinese Pinyin, English, and Japanese), with significant enhancements in recognizing handwritten fonts, vertical text, and rare characters in both Chinese and English. You can experience it immediately in the [online demo](https://aistudio.baidu.com/community/app/91660/webUI?source=appCenter).
+  
+  - **Launch of the groundbreaking document parsing solution PP-StructureV3**: Enhances layout region detection, table recognition, Chinese and English formula recognition, and multi-column reading-order restoration, and adds chart understanding. On the OmniDocBench leaderboard, PP-StructureV3 reaches state-of-the-art (SOTA) overall edit distance for both Chinese and English. Experience it in the [online demo](https://aistudio.baidu.com/community/app/518494/webUI?source=appCenter).
+  
+  - **Optimization of PP-ChatOCRv4**: Natively supports ERNIE 4.5T. Combined with PP-DocBee2, it improves key information extraction accuracy by 15.7 percentage points over the previous generation. Experience it in the [online demo](https://aistudio.baidu.com/community/app/518493/webUI?source=appCenter).
+
+- **Inference Capability Optimization:**
+  - The general OCR, PP-StructureV3, formula recognition, seal text recognition, and document image preprocessing pipelines support setting batch size >1, allowing multiple pages to be processed at once.
+  
+  - 17 pipelines, including general OCR and PP-StructureV3, now support multi-GPU parallel inference. Sample code for multi-process parallel inference has been added.
+
+
 🔥 **2025.4.22, PaddleX v3.0.0rc1 major upgrade.** This version fully adapts to PaddlePaddle 3.0.0, with the following core upgrades:
 
 - **Adapts to New Features of PaddlePaddle 3.0**: Supports compiler training, which can be enabled by appending `-o Global.dy2st=True` to the training command. On GPUs, the training speed of most models can be improved by over 10%, and for a few models, the improvement can exceed 30%. For inference, the models are fully adapted to PaddlePaddle 3.0's Intermediate Representation (PIR) technology, offering more flexible extensibility and compatibility. The file names for inference model have been changed from `xxx.pdmodel` to `xxx.json`.
@@ -51,29 +69,6 @@ PaddleX 3.0 is a low-code development tool for AI models built on the PaddlePadd
   - **NPU: The number of models fully validated on Ascend NPU has increased to 200. Additionally, common pipelines such as general OCR, image classification, and object detection support OM model format inference, with inference speed improvements ranging from 113.8% to 226.4%. Inference deployment is supported on Atlas 200 and Atlas 300 series products.**
   - **GCU: Enflame has been officially integrated into the PaddlePaddle regular release system, completing the adaptation of the PaddleX ecosystem. Supports the training and inference of 90 models.**
 
-🔥 **2025.2.14, PaddleX v3.0.0rc0 major upgrade.** This version fully adapts to PaddlePaddle 3.0.0rc and above, with the following core upgrades:
-
-- **Added 12 high-value pipelines, launching self-developed [PP-StructureV3 Pipeline](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/PP-StructureV3.html), [PP-ChatOCRv4-doc Pipeline](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.html), [Table Recognition v2 Pipeline](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html)**. Additionally, new pipelines for document processing, rotated box detection, open vocabulary detection/segmentation, video analysis, multilingual speech recognition, 3D, and other scenarios have been added.
-
-- **Expanded 48 cutting-edge models, including the major releases in the OCR field such as Document Layout Detection Model [PP-DocLayout](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/layout_detection.html), Formula Recognition Model [PP-FormulaNet](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/formula_recognition.html), Table Structure Recognition Model [SLANeXt](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_structure_recognition.html), text Recognition Model [PP-OCRv4_server_rec_doc](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/text_recognition.html)**. In the CV field, models for 3D detection, human keypoints, open vocabulary detection/segmentation, and in the speech recognition field, models from the Whisper series, among others.
-
-- **Optimized and upgraded the inference APIs for models and pipelines**, supporting more parameter configurations to enhance the flexibility of model and pipeline inference. [Details](docs/API_change_log/v3.0.0rc.en.md).
-
-- **Expanded hardware support:** added support for Suoyuan GCU (90+ models), and significantly increased the number of models for Ascend NPU/Kunlunxin XPU/Cambricon MLU/Hygon DCU.
-
-- **Upgraded full-scenario deployment capabilities:**
-  - High-performance inference supports one-click installation, Windows systems, and 220+ models, with the core library ultra-infer open-sourced;
-  - Serving deployment added a highly stable solution, supporting dynamic configuration optimization.
-
-- **Enhanced system compatibility:** adapted to Windows training/inference, fully supporting Python 3.11/3.12.
-
-🔥🔥 **11.15, 2024**, PaddleX 3.0 Beta2 open source version is officially released, PaddleX 3.0 Beta2 is fully compatible with the PaddlePaddle 3.0b2 version. <b>This update introduces new pipelines for general image recognition, face recognition, vehicle attribute recognition, and pedestrian attribute recognition. We have also developed 42 new models to fully support the Ascend 910B, with extensive documentation available on [GitHub Pages](https://paddlepaddle.github.io/PaddleX/latest/en/index.html).</b>
-
-🔥🔥 **9.30, 2024**, PaddleX 3.0 Beta1 open source version is officially released, providing **more than 200 models** that can be called with a simple Python API; achieve model full-process development based on unified commands, and open source the basic capabilities of the **PP-ChatOCRv3** pipeline; support **more than 100 models for high-performance inference and serving** (iterating continuously), **more than 7 key visual models for edge-deployment**; **more than 70 models have been adapted for the full development process of Ascend 910B**, **more than 15 models have been adapted for the full development process of Kunlun chips and Cambricon**
-
-🔥 **6.27, 2024**, PaddleX 3.0 Beta open source version is officially released, supporting the use of various mainstream hardware for pipeline and model development in a low-code manner on the local side.
-
-🔥 **3.25, 2024**, PaddleX 3.0 cloud release, supporting the creation of pipelines in the AI Studio  Community in a zero-code manner.
 
 ## 🔠 Explanation of Pipeline
 PaddleX is dedicated to achieving pipeline-level model training, inference, and deployment. A pipeline refers to a series of predefined development processes for specific AI tasks, which includes a combination of single models (single-function modules) capable of independently completing a certain type of task.
@@ -107,7 +102,7 @@ In addition, PaddleX provides developers with a full-process efficient model tra
         <td>✅</td>
     </tr>
     <tr>
-        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.html">PP-ChatOCRv3</a></td>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.html">PP-ChatOCRv3</a></td>
         <td><a href="https://aistudio.baidu.com/community/app/182491/webUI?source=appCenter">Link</a></td>
         <td>✅</td>
         <td>✅</td>
@@ -117,6 +112,16 @@ In addition, PaddleX provides developers with a full-process efficient model tra
         <td>✅</td>
     </tr>
     <tr>
+        <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.html">PP-ChatOCRv4</a></td>
+        <td><a href="https://aistudio.baidu.com/community/app/518493/webUI?source=appCenter">Link</a></td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>✅</td>
+        <td>🚧</td>
+        <td>✅</td>
+        <td>✅</td>
+    </tr>
+    <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/table_recognition.html">Table Recognition</a></td>
         <td><a href="https://aistudio.baidu.com/community/app/91661?source=appMineRecent">Link</a></td>
         <td>✅</td>
@@ -318,13 +323,13 @@ In addition, PaddleX provides developers with a full-process efficient model tra
     </tr>
     <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.html">Table Recognition v2</a></td>
-        <td>🚧</td>
+        <td><a href = "https://aistudio.baidu.com/community/app/518495/webUI?source=appCenter">Link</a></td>
         <td>✅</td>
         <td>✅</td>
         <td>✅</td>
         <td>🚧</td>
         <td>✅</td>
-        <td>🚧</td>
+        <td></td>
     </tr>
     <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/layout_parsing.html">Layout Parsing</a></td>
@@ -338,13 +343,13 @@ In addition, PaddleX provides developers with a full-process efficient model tra
     </tr>
     <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/PP-StructureV3.html">PP-StructureV3</a></td>
-        <td>🚧</td>
+        <td><a href = "https://aistudio.baidu.com/community/app/518494/webUI?source=appCenter">Link</a></td>
         <td>✅</td>
         <td>✅</td>
         <td>✅</td>
         <td>🚧</td>
         <td>🚧</td>
-        <td>🚧</td>
+        <td></td>
     </tr>
     <tr>
         <td><a href="https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_usage/tutorials/ocr_pipelines/doc_preprocessor.html">Document Image Preprocessing</a></td>
@@ -419,7 +424,7 @@ In addition, PaddleX provides developers with a full-process efficient model tra
 
 </table>
 
-> ❗Note: The above capabilities are implemented based on GPU/CPU. PaddleX can also perform local inference and custom development on mainstream hardware such as Kunlunxin, Ascend, Cambricon, and Haiguang. The table below details the support status of the pipelines. For specific supported model lists, please refer to the [Model List (Kunlunxin XPU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_xpu.html)/[Model List (Ascend NPU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_npu.html)/[Model List (Cambricon MLU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_mlu.html)/[Model List (Haiguang DCU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_dcu.html). We are continuously adapting more models and promoting the implementation of high-performance and serving on mainstream hardware.
+> ❗Note: The above capabilities are implemented based on GPU/CPU. PaddleX can also perform local inference and custom development on mainstream hardware such as Kunlunxin, Ascend, Cambricon, and HYGON. The table below details the support status of the pipelines. For specific supported model lists, please refer to the [Model List (Kunlunxin XPU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_xpu.html)/[Model List (Ascend NPU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_npu.html)/[Model List (Cambricon MLU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_mlu.html)/[Model List (HYGON DCU)](https://paddlepaddle.github.io/PaddleX/latest/en/support_list/model_list_dcu.html). We are continuously adapting more models and promoting the implementation of high-performance and serving on mainstream hardware.
 
 🔥🔥 **Support for Domestic Hardware Capabilities**
 
@@ -429,7 +434,7 @@ In addition, PaddleX provides developers with a full-process efficient model tra
     <th>Ascend 910B</th>
     <th>Kunlunxin R200/R300</th>
     <th>Cambricon MLU370X8</th>
-    <th>Haiguang Z100</th>
+    <th>HYGON Z100</th>
   </tr>
   <tr>
     <td>OCR</td>
@@ -569,7 +574,9 @@ python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/pac
 * **Installing PaddleX**
 
 ```bash
-pip install paddlex==3.0.0rc1
+pip install "paddlex[base]==3.0.0"
+# You can also install the extra for a specific pipeline, for example:
+# pip install "paddlex[ocr]==3.0.0"
 ```
 
 > ❗For more installation methods, refer to the [PaddleX Installation Guide](https://paddlepaddle.github.io/PaddleX/latest/en/installation/installation.html).
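Note on the install hunk above: it pins `paddlex` to 3.0.0. A quick way to confirm which version actually ended up in an environment is the standard-library `importlib.metadata` module. The sketch below is illustrative only; the helper names `installed_version` and `describe_install` are made up for this note and are not part of PaddleX.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe_install(dist_name: str, expected: str) -> str:
    """Summarize whether the installed version matches the expected pin."""
    found = installed_version(dist_name)
    if found is None:
        return f"{dist_name} is not installed"
    if found == expected:
        return f"{dist_name} {found} matches the expected version"
    return f"{dist_name} {found} is installed, expected {expected}"


if __name__ == "__main__":
    # The distribution name and version come from the install command in this diff.
    print(describe_install("paddlex", "3.0.0"))
```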

+ 1 - 1
docs/other_devices_support/paddlepaddle_install_DCU.en.md

@@ -4,7 +4,7 @@ comments: true
 
 # Hygon DCU PaddlePaddle Installation Tutorial
 
-Currently, PaddleX supports Haiguang Z100 series chips. Considering environmental differences, we recommend using the <b>officially released Haiguang DCU development image by PaddlePaddle</b>, which is pre-installed with the Haiguang DCU basic runtime library (DTK).
+Currently, PaddleX supports HYGON Z100 series chips. Considering environmental differences, we recommend using the <b>officially released HYGON DCU development image by PaddlePaddle</b>, which is pre-installed with the HYGON DCU basic runtime library (DTK).
 
 ## 1. Docker Environment Preparation
 Pull the image. Note that this image is only for development environments and does not include pre-compiled PaddlePaddle installation packages.

+ 1 - 1
mkdocs.yml

@@ -189,7 +189,7 @@ plugins:
             多硬件使用: Multi-Device Usage
             多硬件使用指南: Multi-Device Usage Guide
             飞桨多硬件安装: PaddlePaddle Installation on Multiple Devices
-            海光 DCU 飞桨安装教程: Haiguang DCU PaddlePaddle Installation Guide
+            海光 DCU 飞桨安装教程: HYGON DCU PaddlePaddle Installation Guide
             寒武纪 MLU 飞桨安装教程: Cambricon MLU PaddlePaddle Installation Guide
             昇腾 NPU 飞桨安装教程: Ascend NPU PaddlePaddle Installation Guide
             昆仑 XPU 飞桨安装教程: Kunlun XPU PaddlePaddle Installation Guide

+ 1 - 1
paddlex/.version

@@ -1 +1 @@
-3.0.0.rc1
+3.0.0
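Note on the version bump above: under PEP 440, `3.0.0.rc1` is an accepted spelling of the release candidate `3.0.0rc1`, and a release candidate sorts before the final `3.0.0`. The sketch below shows that ordering with a deliberately small, hand-written key function. It handles only plain releases with an optional `a`/`b`/`rc` suffix and is not a full PEP 440 parser; `version_key` is a hypothetical helper written for this note.

```python
import re
from typing import Tuple

# Release digits, then an optional pre-release tag such as ".rc1", "rc1", or "-b2".
_PATTERN = re.compile(r"^(\d+(?:\.\d+)*)(?:[.\-_]?(a|b|rc)(\d+))?$")
_STAGE_RANK = {"a": 0, "b": 1, "rc": 2}
_FINAL_RANK = 3  # a final release sorts after every pre-release of the same version


def version_key(text: str) -> Tuple[Tuple[int, ...], int, int]:
    """Build a sortable key for versions like '3.0.0', '3.0.0rc1', or '3.0.0.rc1'."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    parts = [int(p) for p in match.group(1).split(".")]
    # Drop trailing zeros so that '3.0' and '3.0.0' compare as equal releases.
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    stage, number = match.group(2), match.group(3)
    if stage is None:
        return tuple(parts), _FINAL_RANK, 0
    return tuple(parts), _STAGE_RANK[stage], int(number)
```

For real projects, the third-party `packaging` library implements the complete specification and is the safer choice.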

+ 1 - 1
paddlex/configs/modules/text_detection/PP-OCRv5_mobile_det.yaml

@@ -19,7 +19,7 @@ Train:
   epochs_iters: 100
   batch_size: 4
   learning_rate: 0.001
-  pretrain_weight_path: https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv5_mobile_det_pretrained.pdparams
+  pretrain_weight_path: "PP-OCRv5_mobile_det_pretrained.pdparams"
   resume_path: null
   log_interval: 10
   eval_interval: 1

+ 1 - 1
paddlex/configs/modules/text_recognition/PP-OCRv4_mobile_rec.yaml

@@ -18,7 +18,7 @@ Train:
   epochs_iters: 20
   batch_size: 8
   learning_rate: 0.001
-  pretrain_weight_path: https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv4_mobile_rec_pretrained.pdparams
+  pretrain_weight_path: PP-OCRv5_mobile_det_pretrained.pdparams
   resume_path: null
   log_interval: 20
   eval_interval: 1