@@ -60,7 +60,7 @@ Currently, the supported processor architectures, operating systems, device type
</tr>
</table>

-#### (1) Installing the High-Performance Inference Plugin in a Docker Container (Highly Recommended):
+#### 1.1.1 Installing the High-Performance Inference Plugin in a Docker Container (Highly Recommended):

Refer to [Get PaddleX based on Docker](../installation/installation.en.md#21-obtaining-paddlex-based-on-docker) to start a PaddleX container using Docker. After starting the container, execute the following commands according to your device type to install the high-performance inference plugin:
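
For example, the device-specific commands described in this section reduce to a one-liner inside the started container:

```bash
# Run inside the PaddleX container; pick the variant matching your device
paddlex --install hpi-cpu   # CPU-only machines
paddlex --install hpi-gpu   # machines with a supported CUDA/cuDNN environment
```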

@@ -90,9 +90,9 @@ In the official PaddleX Docker image, TensorRT is installed by default. The high

**Please note that the aforementioned Docker image refers to the official PaddleX image described in [Get PaddleX via Docker](../installation/installation.en.md#21-get-paddlex-based-on-docker), rather than the PaddlePaddle official image described in [PaddlePaddle Local Installation Tutorial](../installation/paddlepaddle_install.en.md#installing-paddlepaddle-via-docker). For the latter, please refer to the local installation instructions for the high-performance inference plugin.**

-#### (2) Installing the High-Performance Inference Plugin Locally (Not Recommended):
+#### 1.1.2 Installing the High-Performance Inference Plugin Locally:

-##### To install the CPU version of the high-performance inference plugin:
+**To install the CPU version of the high-performance inference plugin:**

Run:

@@ -100,7 +100,7 @@ Run:
```bash
paddlex --install hpi-cpu
```

-##### To install the GPU version of the high-performance inference plugin:
+**To install the GPU version of the high-performance inference plugin:**

Before installation, please ensure that CUDA and cuDNN are installed in your environment. PaddleX currently provides official precompiled packages only for CUDA 11.8 + cuDNN 8.9, so please ensure that the installed CUDA and cuDNN versions match the versions the packages were compiled against. Below are the installation documentation links for CUDA 11.8 and cuDNN 8.9:

@@ -126,7 +126,7 @@ After confirming that the correct versions of CUDA, cuDNN, and TensorRT (optiona
```bash
paddlex --install hpi-gpu
```
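
After installation, a quick sanity check of the versions visible in your environment may help. This is a sketch; it assumes PaddlePaddle is already installed:

```bash
# CUDA toolkit version reported by the compiler driver
nvcc --version
# CUDA/cuDNN versions that the installed PaddlePaddle build targets
python -c "import paddle; print(paddle.version.cuda(), paddle.version.cudnn())"
```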

-##### To install the NPU version of the high-performance inference plugin:
+**To install the NPU version of the high-performance inference plugin:**

Please refer to the [Ascend NPU High-Performance Inference Tutorial](../practical_tutorials/high_performance_npu_tutorial.en.md).

@@ -206,11 +206,11 @@ This section introduces the advanced usage of the high-performance inference plu

The high-performance inference plugin supports two working modes, and you can switch between them by modifying the high-performance inference configuration.

-#### (1) Safe Auto-Configuration Mode
+#### 2.1.1 Safe Auto-Configuration Mode

In safe auto-configuration mode, a protective mechanism is enabled. By default, **the configuration with the best performance for the current environment is selected automatically**. In this mode, users may override the default configuration, but any configuration they provide is checked, and PaddleX rejects configurations that are known, based on prior knowledge, to be unavailable. This is the default working mode.

-#### (2) Unrestricted Manual Configuration Mode
+#### 2.1.2 Unrestricted Manual Configuration Mode

In unrestricted manual configuration mode, users have full configuration freedom: they can **freely choose the inference backend and modify its configuration**, but there is no guarantee that inference will always succeed. This mode is intended for experienced users with clear requirements for the inference backend and its configuration, and it is advisable to use it only when you are familiar with high-performance inference.
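
For illustration, the following sketch switches a CLI run into unrestricted manual configuration mode. The `auto_config` key and the input path are assumptions made for this example; verify the exact field name against the configuration reference in this document:

```bash
# Sketch: disable safe auto-configuration and choose the backend manually
# ("auto_config" is an assumed key name; "input.png" is a placeholder)
paddlex \
    --pipeline OCR \
    --input input.png \
    --device gpu:0 \
    --use_hpip \
    --hpi_config '{"auto_config": false, "backend": "onnxruntime"}'
```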

@@ -327,166 +327,112 @@ The available configuration items for `backend_config` vary for different backen

Due to the diversity of actual deployment environments and requirements, the default configuration might not meet all needs. In such cases, manual adjustment of the high-performance inference configuration may be necessary. Users can modify the configuration by editing the **pipeline/module configuration file**, or by passing the `hpi_config` field as a parameter via the **CLI** or **Python API**. **Parameters passed via the CLI or Python API override the settings in the pipeline/module configuration file.** The following examples illustrate how to modify the configuration.

-#### (1) Changing the Inference Backend
+**For the general OCR pipeline, use the `onnxruntime` backend for all models:**

- ##### For the general OCR pipeline, use the `onnxruntime` backend for all models:
+<details><summary>👉 Modify via Pipeline Configuration File (click to expand)</summary>

- <details><summary>👉 1. Modify via Pipeline Configuration File (click to expand)</summary>
-
- ```yaml
- pipeline_name: OCR
-
- hpi_config:
- backend: onnxruntime
-
- ...
- ```
+```yaml
+...
+hpi_config:
+  backend: onnxruntime
+```

- </details>
- <details><summary>👉 2. CLI Parameter Method (click to expand)</summary>
+</details>
+<details><summary>👉 CLI Parameter Method (click to expand)</summary>

- ```bash
- paddlex \
- --pipeline image_classification \
- --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg \
- --device gpu:0 \
- --use_hpip \
- --hpi_config '{"backend": "onnxruntime"}'
- ```
+```bash
+paddlex \
+    --pipeline OCR \
+    --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_001.png \
+    --device gpu:0 \
+    --use_hpip \
+    --hpi_config '{"backend": "onnxruntime"}'
+```

- </details>
- <details><summary>👉 3. Python API Parameter Method (click to expand)</summary>
+</details>
+<details><summary>👉 Python API Parameter Method (click to expand)</summary>

- ```python
- from paddlex import create_pipeline
+```python
+from paddlex import create_pipeline

- pipeline = create_pipeline(
- pipeline="OCR",
- device="gpu",
- use_hpip=True,
- hpi_config={"backend": "onnxruntime"}
- )
- ```
+pipeline = create_pipeline(
+    pipeline="OCR",
+    device="gpu",
+    use_hpip=True,
+    hpi_config={"backend": "onnxruntime"}
+)
+```

- </details>
+</details>

- ##### For the image classification module, use the `onnxruntime` backend:
+**For the image classification module, use the `onnxruntime` backend:**

- <details><summary>👉 1. Modify via Pipeline Configuration File (click to expand)</summary>
+<details><summary>👉 Modify via Module Configuration File (click to expand)</summary>

- ```yaml
- # paddlex/configs/modules/image_classification/ResNet18.yaml
+```yaml
+Predict:
...
- Predict:
- ...
- hpi_config:
- backend: onnxruntime
- ...
- ...
- ```
-
- </details>
- <details><summary>👉 2. CLI Parameter Method (click to expand)</summary>
-
- ```bash
- python main.py \
- -c paddlex/configs/modules/image_classification/ResNet18.yaml \
- -o Global.mode=predict \
- -o Predict.model_dir=None \
- -o Predict.input=https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg \
- -o Global.device=gpu:0 \
- -o Predict.use_hpip=True \
- -o Predict.hpi_config='{"backend": "onnxruntime"}'
- ```
-
- </details>
- <details><summary>👉 3. Python API Parameter Method (click to expand)</summary>
-
- ```python
- from paddlex import create_model
-
- model = create_model(
- model_name="ResNet18",
- device="gpu",
- use_hpip=True,
- hpi_config={"backend": "onnxruntime"}
- )
- ```
-
- </details>
-
- ##### For the general OCR pipeline, use the `onnxruntime` backend for the `text_detection` module and the `tensorrt` backend for the `text_recognition` module:
+  hpi_config:
+    backend: onnxruntime
+```

- <details><summary>👉 1. Modify via Pipeline Configuration File (click to expand)</summary>
+</details>
+<details><summary>👉 CLI Parameter Method (click to expand)</summary>

- ```yaml
- pipeline_name: OCR
+```bash
+python main.py \
+    -c paddlex/configs/modules/image_classification/ResNet18.yaml \
+    -o Global.mode=predict \
+    -o Predict.model_dir=None \
+    -o Predict.input=https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg \
+    -o Global.device=gpu:0 \
+    -o Predict.use_hpip=True \
+    -o Predict.hpi_config='{"backend": "onnxruntime"}'
+```

- ...
+</details>
+<details><summary>👉 Python API Parameter Method (click to expand)</summary>

- SubModules:
- TextDetection:
- module_name: text_detection
- model_name: PP-OCRv4_mobile_det
- model_dir: null
- limit_side_len: 960
- limit_type: max
- thresh: 0.3
- box_thresh: 0.6
- unclip_ratio: 2.0
- hpi_config:
- backend: onnxruntime
- TextLineOrientation:
- module_name: textline_orientation
- model_name: PP-LCNet_x0_25_textline_ori
- model_dir: null
- batch_size: 6
- TextRecognition:
- module_name: text_recognition
- model_name: PP-OCRv4_mobile_rec
- model_dir: null
- batch_size: 6
- score_thresh: 0.0
- hpi_config:
- backend: tensorrt
- ```
+```python
+from paddlex import create_model

- </details>
+model = create_model(
+    model_name="ResNet18",
+    device="gpu",
+    use_hpip=True,
+    hpi_config={"backend": "onnxruntime"}
+)
+```

-#### (2) Modifying TensorRT Dynamic Shape Configuration
+</details>

- ##### For the general image classification pipeline, modify dynamic shape configuration:
+**For the general OCR pipeline, use the `onnxruntime` backend for the `text_detection` module and the `tensorrt` backend for the `text_recognition` module:**

- <details><summary>👉 Click to expand</summary>
+<details><summary>👉 Modify via Pipeline Configuration File (click to expand)</summary>

- ```yaml
+```yaml
+SubModules:
+  TextDetection:
...
- SubModules:
- ImageClassification:
- ...
- hpi_config:
- backend: tensorrt
- backend_config:
- dynamic_shapes:
- x:
- - [1, 3, 300, 300]
- - [4, 3, 300, 300]
- - [32, 3, 1200, 1200]
- ...
+    hpi_config:
+      backend: onnxruntime
+  TextRecognition:
...
- ```
+    hpi_config:
+      backend: tensorrt
+```

- </details>
+</details>

- ##### For the image classification module, modify dynamic shape configuration:
+**For the general image classification pipeline, modify dynamic shape configuration:**

- <details><summary>👉 Click to expand</summary>
+<details><summary>👉 Modify via Pipeline Configuration File (click to expand)</summary>

- ```yaml
- ...
- Predict:
- ...
- hpi_config:
+```yaml
+SubModules:
+  ImageClassification:
+    hpi_config:
+      ...
backend: tensorrt
backend_config:
dynamic_shapes:
@@ -494,53 +440,52 @@ Due to the diversity of actual deployment environments and requirements, the def
- [1, 3, 300, 300]
- [4, 3, 300, 300]
- [32, 3, 1200, 1200]
+```
+
+</details>
+
+**For the image classification module, modify dynamic shape configuration:**
+
+<details><summary>👉 Modify via Module Configuration File (click to expand)</summary>
+
+```yaml
+Predict:
+  hpi_config:
...
- ...
- ```
+    backend: tensorrt
+    backend_config:
+      dynamic_shapes:
+        x:
+          - [1, 3, 300, 300]
+          - [4, 3, 300, 300]
+          - [32, 3, 1200, 1200]
+```

- </details>
+</details>

### 2.4 Enabling/Disabling the High-Performance Inference Plugin on Sub-pipelines/Submodules

The high-performance inference plugin can be enabled for only specific sub-pipelines/submodules by setting `use_hpip` at the sub-pipeline or submodule level. For example:

-##### In the general OCR pipeline, enable high-performance inference for the `text_detection` module, but not for the `text_recognition` module:
-
- <details><summary>👉 Click to expand</summary>
+**In the general OCR pipeline, enable high-performance inference for the `text_detection` module, but not for the `text_recognition` module:**

- ```yaml
- pipeline_name: OCR
+<details><summary>👉 Click to expand</summary>

- ...
+```yaml
+SubModules:
+  TextDetection:
+    ...
+    use_hpip: True # This submodule uses high-performance inference
+  TextLineOrientation:
+    ...
+    # This submodule does not have a specific configuration; it defaults to the global configuration
+    # (if neither the configuration file nor CLI/API parameters set it, high-performance inference will not be used)
+  TextRecognition:
+    ...
+    use_hpip: False # This submodule does not use high-performance inference
+```

- SubModules:
- TextDetection:
- module_name: text_detection
- model_name: PP-OCRv4_mobile_det
- model_dir: null
- limit_side_len: 960
- limit_type: max
- thresh: 0.3
- box_thresh: 0.6
- unclip_ratio: 2.0
- use_hpip: True # This submodule uses high-performance inference
- TextLineOrientation:
- module_name: textline_orientation
- model_name: PP-LCNet_x0_25_textline_ori
- model_dir: null
- batch_size: 6
- # This submodule does not have a specific configuration; it defaults to the global configuration
- # (if neither the configuration file nor CLI/API parameters set it, high-performance inference will not be used)
- TextRecognition:
- module_name: text_recognition
- model_name: PP-OCRv4_mobile_rec
- model_dir: null
- batch_size: 6
- score_thresh: 0.0
- use_hpip: False # This submodule does not use high-performance inference
- ```
-
- </details>
+</details>

**Note:**