
bugfix (#4086)

* fix RGB channel in table recognition result of PP-StructureV3

* update pdx version from 3.0rc1 to 3.0.x in readme

* update pdx installation cmd

* [TEMP] support an MKLDNN blocklist to avoid triggering errors in PP-StructureV3 when using MKLDNN
Tingquan Gao, 5 months ago
parent
commit
540ed5590e

+ 2 - 2
README.md

@@ -548,7 +548,7 @@ Every PaddleX pipeline supports local **quick inference**, and some models can be used on [AI
 
 ### 🛠️ Installation
 
-> ❗Before installing PaddleX, please ensure that you have a basic **Python runtime environment** (Note: currently supports Python 3.8 to Python 3.12). The PaddleX 3.0-rc1 version depends on PaddlePaddle version 3.0.0 and above. Please make sure the version compatibility is maintained before use.
+> ❗Before installing PaddleX, please ensure that you have a basic **Python runtime environment** (Note: currently supports Python 3.8 to Python 3.12). The PaddleX 3.0.x version depends on PaddlePaddle version 3.0.0 and above. Please make sure the version compatibility is maintained before use.
 
 * **Installing PaddlePaddle**
 ```bash
@@ -566,7 +566,7 @@ python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/pac
 * **Installing PaddleX**
 
 ```bash
-pip install paddlex[base]==3.0.0
+pip install "paddlex[base]==3.0.0"
 ```
 
 > ❗For more installation methods, refer to the [PaddleX Installation Guide](https://paddlepaddle.github.io/PaddleX/latest/installation/installation.html)

+ 22 - 22
README_en.md

@@ -42,30 +42,30 @@ PaddleX 3.0 is a low-code development tool for AI models built on the PaddlePadd
 
 Core upgrades are as follows:
 
-- **Rich Model Library:**  
-  - **Extensive Model Coverage:** PaddleX 3.0 includes **270+ models**, covering diverse scenarios such as image/video classification/detection/segmentation, OCR, speech recognition, time series analysis, and more.  
-  - **Mature Solutions:** Built on this robust model library, PaddleX 3.0 offers **critical and production-ready AI solutions**, including general document parsing, key information extraction, document understanding, table recognition, and general image recognition.  
+- **Rich Model Library:**
+  - **Extensive Model Coverage:** PaddleX 3.0 includes **270+ models**, covering diverse scenarios such as image/video classification/detection/segmentation, OCR, speech recognition, time series analysis, and more.
+  - **Mature Solutions:** Built on this robust model library, PaddleX 3.0 offers **critical and production-ready AI solutions**, including general document parsing, key information extraction, document understanding, table recognition, and general image recognition.
 
-- **Unified Inference API & Enhanced Deployment Capabilities:**  
-  - **Standardized Inference Interface:** Reduces API fragmentation across model types, lowering the learning curve for users and accelerating enterprise adoption.  
-  - **Multi-Model Composition:** Complex tasks can be efficiently tackled by combining different models, achieving synergistic performance (1+1>2).  
-  - **Upgraded Deployment:** Unified commands now manage deployments for diverse models, supporting **multi-GPU inference** and **multi-instance serving deployments**.  
+- **Unified Inference API & Enhanced Deployment Capabilities:**
+  - **Standardized Inference Interface:** Reduces API fragmentation across model types, lowering the learning curve for users and accelerating enterprise adoption.
+  - **Multi-Model Composition:** Complex tasks can be efficiently tackled by combining different models, achieving synergistic performance (1+1>2).
+  - **Upgraded Deployment:** Unified commands now manage deployments for diverse models, supporting **multi-GPU inference** and **multi-instance serving deployments**.
 
-- **Full Compatibility with PaddlePaddle Framework 3.0:**  
-  - **Leveraging New Paddle 3.0 Features:**  
-    - Compiler-accelerated training: Enable by appending `-o Global.dy2st=True` to training commands. **Most GPU-based models see >10% speed gains, with some exceeding 30%.**  
-    - Inference upgrades: Full adaptation to Paddle 3.0’s Program Intermediate Representation (PIR) enhances flexibility and compatibility. Static graph models now use `xxx.json` instead of `xxx.pdmodel`.  
-  - **ONNX Model Support:** Seamless format conversion via the Paddle2ONNX plugin.  
+- **Full Compatibility with PaddlePaddle Framework 3.0:**
+  - **Leveraging New Paddle 3.0 Features:**
+    - Compiler-accelerated training: Enable by appending `-o Global.dy2st=True` to training commands. **Most GPU-based models see >10% speed gains, with some exceeding 30%.**
+    - Inference upgrades: Full adaptation to Paddle 3.0’s Program Intermediate Representation (PIR) enhances flexibility and compatibility. Static graph models now use `xxx.json` instead of `xxx.pdmodel`.
+  - **ONNX Model Support:** Seamless format conversion via the Paddle2ONNX plugin.
 
-- **Flagship Capabilities:**  
-  - **PP-OCRv5:** Powers **multi-hardware inference, multi-backend support, and serving deployments** for this industry-leading OCR system.  
-  - **PP-StructureV3:** Orchestrates **15+ models** in hybrid (serial/parallel) pipelines, achieving **SOTA accuracy on OmniDocBench**.  
-  - **PP-ChatOCRv4:** Integrates with **PP-DocBee2 and ERNIE 4.5Turbo**, boosting key information extraction accuracy by **15.7 percentage points** over the previous generation.  
+- **Flagship Capabilities:**
+  - **PP-OCRv5:** Powers **multi-hardware inference, multi-backend support, and serving deployments** for this industry-leading OCR system.
+  - **PP-StructureV3:** Orchestrates **15+ models** in hybrid (serial/parallel) pipelines, achieving **SOTA accuracy on OmniDocBench**.
+  - **PP-ChatOCRv4:** Integrates with **PP-DocBee2 and ERNIE 4.5Turbo**, boosting key information extraction accuracy by **15.7 percentage points** over the previous generation.
 
-- **Multi-Hardware Support:**  
-  - **Broad Compatibility:** Training and inference supported on **NVIDIA, Intel, Apple M-series, Kunlunxin, Ascend, Cambricon, Hygon, Enflame**, and more.  
-  - **Ascend-Optimized:** **200+ fully adapted models**, including **21 OM-accelerated inference models**, plus key solutions like PP-OCRv5 and PP-StructureV3.  
-  - **Kunlunxin-Optimized:** Critical classification, detection, and OCR models (including PP-OCRv5) are fully supported.  
+- **Multi-Hardware Support:**
+  - **Broad Compatibility:** Training and inference supported on **NVIDIA, Intel, Apple M-series, Kunlunxin, Ascend, Cambricon, Hygon, Enflame**, and more.
+  - **Ascend-Optimized:** **200+ fully adapted models**, including **21 OM-accelerated inference models**, plus key solutions like PP-OCRv5 and PP-StructureV3.
+  - **Kunlunxin-Optimized:** Critical classification, detection, and OCR models (including PP-OCRv5) are fully supported.
 
 
 ## 🔠 Explanation of Pipeline
@@ -553,7 +553,7 @@ In addition, PaddleX provides developers with a full-process efficient model tra
 
 ### 🛠️ Installation
 
-> ❗Before installing PaddleX, please ensure that you have a basic **Python runtime environment** (Note: Currently supports Python 3.8 to Python 3.12). The PaddleX 3.0-rc1 version depends on PaddlePaddle version 3.0.0 and above. Please make sure the version compatibility is maintained before use.
+> ❗Before installing PaddleX, please ensure that you have a basic **Python runtime environment** (Note: Currently supports Python 3.8 to Python 3.12). The PaddleX 3.0.x version depends on PaddlePaddle version 3.0.0 and above. Please make sure the version compatibility is maintained before use.
 
 * **Installing PaddlePaddle**
 
@@ -572,7 +572,7 @@ python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/pac
 * **Installing PaddleX**
 
 ```bash
-pip install paddlex[base]==3.0.0
+pip install "paddlex[base]==3.0.0"
 ```
 
 > ❗For more installation methods, refer to the [PaddleX Installation Guide](https://paddlepaddle.github.io/PaddleX/latest/en/installation/installation.html).
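The added quotes in both READMEs are not cosmetic: in zsh (and in bash when a matching file exists), unquoted square brackets are interpreted as a glob pattern, so the extras specifier never reaches pip. A quick illustration:

```shell
# Unquoted, zsh tries to glob-expand the brackets and aborts:
#   $ pip install paddlex[base]==3.0.0
#   zsh: no matches found: paddlex[base]==3.0.0
# Quoting (or escaping the brackets) passes the spec to pip literally:
pip install "paddlex[base]==3.0.0"
```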

+ 9 - 3
paddlex/inference/pipelines/base.py

@@ -96,14 +96,20 @@ class BasePipeline(ABC, metaclass=AutoRegisterABCMetaClass):
 
         logging.info("Creating model: %s", (config["model_name"], model_dir))
 
+        # TODO(gaotingquan): support to specify pp_option by model in pipeline
+        if self.pp_option is not None:
+            pp_option = self.pp_option.copy()
+            pp_option.model_name = config["model_name"]
+            pp_option.run_mode = self.pp_option.run_mode
+        else:
+            pp_option = None
+
         model = create_predictor(
             model_name=config["model_name"],
             model_dir=model_dir,
             device=self.device,
             batch_size=config.get("batch_size", 1),
-            pp_option=(
-                self.pp_option.copy() if self.pp_option is not None else self.pp_option
-            ),
+            pp_option=pp_option,
             use_hpip=use_hpip,
             hpi_config=hpi_config,
             **kwargs,
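The refactor above gives each model its own copy of the pipeline-wide `pp_option` before stamping it with the model's name, so per-model adjustments (such as the blocklist-driven `run_mode` fallback that fires when `run_mode` is re-assigned) never leak back into the shared instance. The idea can be sketched with an illustrative stand-in class (not the real `PaddlePredictorOption` API):

```python
import copy


class PredictorOption:
    """Illustrative stand-in for PaddlePredictorOption (names are hypothetical)."""

    def __init__(self, model_name=None, run_mode="paddle"):
        self.model_name = model_name
        self.run_mode = run_mode

    def copy(self):
        return copy.deepcopy(self)


def option_for_model(shared_option, model_name):
    """Copy the pipeline-wide option so each model gets its own
    model_name without mutating the shared instance."""
    if shared_option is None:
        return None
    per_model = shared_option.copy()
    per_model.model_name = model_name
    return per_model


shared = PredictorOption(run_mode="mkldnn")
opt = option_for_model(shared, "SLANeXt_wired")
print(shared.model_name, opt.model_name)  # None SLANeXt_wired
```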

+ 1 - 1
paddlex/inference/pipelines/layout_parsing/result_v2.py

@@ -220,7 +220,7 @@ class LayoutParsingResultV2(BaseCVResult, HtmlMixin, XlsxMixin, MarkdownMixin):
 
         if model_settings["use_table_recognition"] and len(self["table_res_list"]) > 0:
             table_cell_img = Image.fromarray(
-                copy.deepcopy(self["doc_preprocessor_res"]["output_img"])
+                copy.deepcopy(self["doc_preprocessor_res"]["output_img"][:, :, ::-1])
             )
             table_draw = ImageDraw.Draw(table_cell_img)
             rectangle_color = (255, 0, 0)
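The one-line fix above reverses the last array axis because OpenCV-style image arrays store channels as BGR, while `PIL.Image.fromarray` interprets a 3-channel array as RGB; without the flip, red and blue are swapped in the rendered table-recognition result. A minimal NumPy-only sketch of the conversion:

```python
import numpy as np

# One pure-blue pixel in BGR channel order (as produced by OpenCV-style code).
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the channel axis swaps the first and third channels: BGR -> RGB.
rgb = bgr[:, :, ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255]: blue now sits in the B slot of RGB
```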

+ 25 - 0
paddlex/inference/utils/mkldnn_blocklist.py

@@ -0,0 +1,25 @@
+# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+MKLDNN_BLOCKLIST = [
+    "SLANeXt_wired",
+    "SLANeXt_wireless",
+    "LaTeX_OCR_rec",
+    "PP-FormulaNet-L",
+    "PP-FormulaNet-S",
+    "UniMERNet",
+    "PP-FormulaNet_plus-L",
+    "PP-FormulaNet_plus-M",
+    "PP-FormulaNet_plus-S",
+]

+ 16 - 7
paddlex/inference/utils/pp_option.py

@@ -24,6 +24,7 @@ from ...utils.device import (
     set_env_for_device_type,
 )
 from ...utils.flags import USE_PIR_TRT
+from .mkldnn_blocklist import MKLDNN_BLOCKLIST
 from .new_ir_blocklist import NEWIR_BLOCKLIST
 from .trt_blocklist import TRT_BLOCKLIST
 from .trt_config import TRT_CFG_SETTING, TRT_PRECISION_MAP
@@ -45,7 +46,7 @@ class PaddlePredictorOption(object):
     )
     SUPPORT_DEVICE = ("gpu", "cpu", "npu", "xpu", "mlu", "dcu", "gcu")
 
-    def __init__(self, model_name, **kwargs):
+    def __init__(self, model_name=None, **kwargs):
         super().__init__()
         self._model_name = model_name
         self._cfg = {}
@@ -137,12 +138,20 @@ class PaddlePredictorOption(object):
             raise ValueError(
                 f"`run_mode` must be {support_run_mode_str}, but received {repr(run_mode)}."
             )
-        # TRT Blocklist
-        if run_mode.startswith("trt") and self._model_name in TRT_BLOCKLIST:
-            logging.warning(
-                f"The model({self._model_name}) is not supported to run in trt mode! Using `paddle` instead!"
-            )
-            run_mode = "paddle"
+
+        if self._model_name is not None:
+            # TRT Blocklist
+            if run_mode.startswith("trt") and self._model_name in TRT_BLOCKLIST:
+                logging.warning(
+                    f"The model({self._model_name}) is not supported to run in trt mode! Using `paddle` instead!"
+                )
+                run_mode = "paddle"
+            # MKLDNN Blocklist
+            elif run_mode.startswith("mkldnn") and self._model_name in MKLDNN_BLOCKLIST:
+                logging.warning(
+                    f"The model({self._model_name}) is not supported to run in MKLDNN mode! Using `paddle` instead!"
+                )
+                run_mode = "paddle"
 
         self._update("run_mode", run_mode)
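With this change both blocklists gate the requested backend the same way, and the new `model_name=None` default simply skips the checks when no model is known yet. The fallback logic can be sketched standalone (blocklist contents abbreviated; names mirror the diff):

```python
import logging

# Abbreviated stand-ins for the blocklists kept in
# paddlex/inference/utils/trt_blocklist.py and mkldnn_blocklist.py.
TRT_BLOCKLIST = {"SomeTRTBlockedModel"}
MKLDNN_BLOCKLIST = {"SLANeXt_wired", "SLANeXt_wireless", "LaTeX_OCR_rec"}


def resolve_run_mode(run_mode, model_name):
    """Fall back to the plain 'paddle' executor when the requested
    backend is known to fail for this model; skip the check entirely
    when no model name is set."""
    if model_name is None:
        return run_mode
    if run_mode.startswith("trt") and model_name in TRT_BLOCKLIST:
        logging.warning("%s is not supported in TRT mode; using paddle", model_name)
        return "paddle"
    if run_mode.startswith("mkldnn") and model_name in MKLDNN_BLOCKLIST:
        logging.warning("%s is not supported in MKLDNN mode; using paddle", model_name)
        return "paddle"
    return run_mode


print(resolve_run_mode("mkldnn", "SLANeXt_wired"))  # paddle
print(resolve_run_mode("mkldnn", None))             # mkldnn
```

The `elif` in the real patch works because a run mode cannot start with both `trt` and `mkldnn`; the sketch uses two independent `if`s to the same effect.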