Browse source code

[Feat] Add `hps` code (#4380)

* open source code

* update content

* update content for comment

* update content for comment

* update img url

* add translate version

* fix space

* fix space

* update content for comment

* update develop branch URL

* update latest URL
Poki Bai 3 months ago
parent
commit
6e5be5cb8e
100 changed files with 4597 additions and 0 deletions
  1. 4 0
      .pre-commit-config.yaml
  2. 161 0
      deploy/hps/README.md
  3. 164 0
      deploy/hps/README_en.md
  4. 1 0
      deploy/hps/sdk/.gitignore
  5. 22 0
      deploy/hps/sdk/common/config_cpu.pbtxt
  6. 23 0
      deploy/hps/sdk/common/config_gpu.pbtxt
  7. 26 0
      deploy/hps/sdk/common/server.sh
  8. 16 0
      deploy/hps/sdk/paddlex-hps-client/pyproject.toml
  9. 3 0
      deploy/hps/sdk/paddlex-hps-client/requirements.txt
  10. 3 0
      deploy/hps/sdk/paddlex-hps-client/scripts/build_wheel.sh
  11. 11 0
      deploy/hps/sdk/paddlex-hps-client/src/paddlex_hps_client/__init__.py
  12. 2 0
      deploy/hps/sdk/paddlex-hps-client/src/paddlex_hps_client/constants.py
  13. 31 0
      deploy/hps/sdk/paddlex-hps-client/src/paddlex_hps_client/request.py
  14. 43 0
      deploy/hps/sdk/paddlex-hps-client/src/paddlex_hps_client/utils.py
  15. 30 0
      deploy/hps/sdk/pipelines/3d_bev_detection/client/client.py
  16. 3 0
      deploy/hps/sdk/pipelines/3d_bev_detection/client/requirements.txt
  17. 42 0
      deploy/hps/sdk/pipelines/3d_bev_detection/server/model_repo/bev-3d-object-detection/1/model.py
  18. 9 0
      deploy/hps/sdk/pipelines/3d_bev_detection/server/pipeline_config.yaml
  19. 39 0
      deploy/hps/sdk/pipelines/OCR/client/client.py
  20. 3 0
      deploy/hps/sdk/pipelines/OCR/client/requirements.txt
  21. 134 0
      deploy/hps/sdk/pipelines/OCR/server/model_repo/ocr/1/model.py
  22. 45 0
      deploy/hps/sdk/pipelines/OCR/server/pipeline_config.yaml
  23. 67 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/client/client.py
  24. 3 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/client/requirements.txt
  25. 38 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-chat/1/model.py
  26. 22 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-chat/config.pbtxt
  27. 24 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-vector/1/model.py
  28. 22 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-vector/config.pbtxt
  29. 148 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-visual/1/model.py
  30. 151 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/pipeline_config.yaml
  31. 79 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/client/client.py
  32. 3 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/client/requirements.txt
  33. 40 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-chat/1/model.py
  34. 22 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-chat/config.pbtxt
  35. 27 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-mllm/1/model.py
  36. 22 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-mllm/config.pbtxt
  37. 24 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-vector/1/model.py
  38. 22 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-vector/config.pbtxt
  39. 149 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-visual/1/model.py
  40. 241 0
      deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/pipeline_config.yaml
  41. 74 0
      deploy/hps/sdk/pipelines/PP-DocTranslation/client/client.py
  42. 3 0
      deploy/hps/sdk/pipelines/PP-DocTranslation/client/requirements.txt
  43. 54 0
      deploy/hps/sdk/pipelines/PP-DocTranslation/server/model_repo/doctrans-translate/1/model.py
  44. 22 0
      deploy/hps/sdk/pipelines/PP-DocTranslation/server/model_repo/doctrans-translate/config.pbtxt
  45. 179 0
      deploy/hps/sdk/pipelines/PP-DocTranslation/server/model_repo/doctrans-visual/1/model.py
  46. 261 0
      deploy/hps/sdk/pipelines/PP-DocTranslation/server/pipeline_config.yaml
  47. 120 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/client/client.py
  48. 3 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/client/requirements.txt
  49. 3 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/.isort.cfg
  50. 35 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-index-add/1/model.py
  51. 42 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-index-build/1/model.py
  52. 26 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-index-remove/1/model.py
  53. 22 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-index-remove/config.pbtxt
  54. 66 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-infer/1/model.py
  55. 18 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/pipeline_config.yaml
  56. 0 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/shared_mods/common/__init__.py
  57. 19 0
      deploy/hps/sdk/pipelines/PP-ShiTuV2/server/shared_mods/common/base_model.py
  58. 49 0
      deploy/hps/sdk/pipelines/PP-StructureV3/client/client.py
  59. 3 0
      deploy/hps/sdk/pipelines/PP-StructureV3/client/requirements.txt
  60. 172 0
      deploy/hps/sdk/pipelines/PP-StructureV3/server/model_repo/layout-parsing/1/model.py
  61. 226 0
      deploy/hps/sdk/pipelines/PP-StructureV3/server/pipeline_config.yaml
  62. 36 0
      deploy/hps/sdk/pipelines/anomaly_detection/client/client.py
  63. 3 0
      deploy/hps/sdk/pipelines/anomaly_detection/client/requirements.txt
  64. 31 0
      deploy/hps/sdk/pipelines/anomaly_detection/server/model_repo/anomaly-detection/1/model.py
  65. 8 0
      deploy/hps/sdk/pipelines/anomaly_detection/server/pipeline_config.yaml
  66. 38 0
      deploy/hps/sdk/pipelines/doc_preprocessor/client/client.py
  67. 3 0
      deploy/hps/sdk/pipelines/doc_preprocessor/client/requirements.txt
  68. 132 0
      deploy/hps/sdk/pipelines/doc_preprocessor/server/model_repo/document-preprocessing/1/model.py
  69. 15 0
      deploy/hps/sdk/pipelines/doc_preprocessor/server/pipeline_config.yaml
  70. 49 0
      deploy/hps/sdk/pipelines/doc_understanding/client/client.py
  71. 3 0
      deploy/hps/sdk/pipelines/doc_understanding/client/requirements.txt
  72. 110 0
      deploy/hps/sdk/pipelines/doc_understanding/server/model_repo/document-understanding/1/model.py
  73. 9 0
      deploy/hps/sdk/pipelines/doc_understanding/server/pipeline_config.yaml
  74. 120 0
      deploy/hps/sdk/pipelines/face_recognition/client/client.py
  75. 3 0
      deploy/hps/sdk/pipelines/face_recognition/client/requirements.txt
  76. 3 0
      deploy/hps/sdk/pipelines/face_recognition/server/.isort.cfg
  77. 35 0
      deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-index-add/1/model.py
  78. 42 0
      deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-index-build/1/model.py
  79. 26 0
      deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-index-remove/1/model.py
  80. 22 0
      deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-index-remove/config.pbtxt
  81. 66 0
      deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-infer/1/model.py
  82. 18 0
      deploy/hps/sdk/pipelines/face_recognition/server/pipeline_config.yaml
  83. 0 0
      deploy/hps/sdk/pipelines/face_recognition/server/shared_mods/common/__init__.py
  84. 19 0
      deploy/hps/sdk/pipelines/face_recognition/server/shared_mods/common/base_model.py
  85. 40 0
      deploy/hps/sdk/pipelines/formula_recognition/client/client.py
  86. 3 0
      deploy/hps/sdk/pipelines/formula_recognition/client/requirements.txt
  87. 132 0
      deploy/hps/sdk/pipelines/formula_recognition/server/model_repo/formula-recognition/1/model.py
  88. 39 0
      deploy/hps/sdk/pipelines/formula_recognition/server/pipeline_config.yaml
  89. 38 0
      deploy/hps/sdk/pipelines/human_keypoint_detection/client/client.py
  90. 3 0
      deploy/hps/sdk/pipelines/human_keypoint_detection/client/requirements.txt
  91. 44 0
      deploy/hps/sdk/pipelines/human_keypoint_detection/server/model_repo/human-keypoint-detection/1/model.py
  92. 17 0
      deploy/hps/sdk/pipelines/human_keypoint_detection/server/pipeline_config.yaml
  93. 38 0
      deploy/hps/sdk/pipelines/image_classification/client/client.py
  94. 3 0
      deploy/hps/sdk/pipelines/image_classification/client/requirements.txt
  95. 36 0
      deploy/hps/sdk/pipelines/image_classification/server/model_repo/image-classification/1/model.py
  96. 10 0
      deploy/hps/sdk/pipelines/image_classification/server/pipeline_config.yaml
  97. 38 0
      deploy/hps/sdk/pipelines/image_multilabel_classification/client/client.py
  98. 3 0
      deploy/hps/sdk/pipelines/image_multilabel_classification/client/requirements.txt
  99. 37 0
      deploy/hps/sdk/pipelines/image_multilabel_classification/server/model_repo/multilabel-image-classification/1/model.py
  100. 9 0
      deploy/hps/sdk/pipelines/image_multilabel_classification/server/pipeline_config.yaml

+ 4 - 0
.pre-commit-config.yaml

@@ -9,6 +9,10 @@ repos:
     -   id: check-symlinks
     -   id: detect-private-key
     -   id: end-of-file-fixer
+        exclude: |
+            (?x)^(
+              deploy/hps/server_env/requirements/.*\.txt
+            )$
     -   id: trailing-whitespace
         files: \.(md|c|cc|cxx|cpp|cu|h|hpp|hxx|py)$
 -   repo: https://github.com/Lucas-C/pre-commit-hooks

+ 161 - 0
deploy/hps/README.md

@@ -0,0 +1,161 @@
+---
+comments: true
+---
+
+# PaddleX 高稳定性服务化部署
+
+本项目提供一套高稳定性服务化部署方案,它由 `server_env` 与 `sdk` 两个目录组成。`server_env` 部分用于构建包含 Triton Inference Server 的多种镜像,为后续模型产线 server 提供运行环境;`sdk` 部分用于打包产线 SDK,提供各模型产线的 server 和 client 代码。如下图所示:
+
+<img src="https://github.com/boomercat/PaddleX_doc_images/blob/main/images/hps/hps_workflow.png?raw=true" />
+
+**请注意,本项目依赖于如下环境配置:**
+
+- **操作系统**:Linux
+- **Docker 版本**:`>= 20.10.0`,用于镜像构建和部署
+- **CPU 架构**:x86-64 
+
+本文档主要介绍如何基于本项目提供的脚本完成高稳定性服务化部署环境搭建与物料打包。整体流程分为两个阶段:
+
+1. 镜像构建:构建包含 Triton Inference Server 的镜像。在这一阶段中,依赖版本被锁定以提升部署镜像构建的可重现性。
+2. 产线物料打包:将各模型产线的客户端和服务端代码进行打包,便于后续部署与集成使用。
+
+如需了解如何使用构建好的镜像与打包好的 SDK 启动服务器和调用服务,可参考 [PaddleX 服务化部署指南](https://paddlepaddle.github.io/PaddleX/latest/pipeline_deploy/serving.html)。
+
+## 1. 镜像构建
+
+本阶段主要介绍镜像构建的整体流程及关键步骤。
+
+镜像构建步骤:
+
+1. 构建依赖收集镜像。
+2. 锁定依赖版本,提升部署镜像构建的可重现性。
+3. 构建部署镜像:基于已锁定的依赖信息构建最终的部署镜像,为后续的产线运行提供镜像支持。
+
+### 1.1 构建依赖收集镜像
+
+执行 `server_env` 目录下的依赖收集脚本。
+
+```bash
+./scripts/prepare_rc_image.sh
+```
+
+该脚本会为每种设备类型构建一个用于依赖收集的镜像,镜像包含 Python 3.10 以及 [pip-tools](https://github.com/jazzband/pip-tools) 工具。[1.2 锁定依赖](./README.md#12-锁定依赖) 将基于该镜像完成。构建完成后,将分别生成 `paddlex-hps-rc:gpu` 和 `paddlex-hps-rc:cpu` 两个镜像。如果遇到网络问题,可以通过 `-p` 参数指定其他 pip 源。如果不指定,默认使用 https://pypi.org/simple。
+
+### 1.2 锁定依赖
+
+为了使构建结果的可重现性更强,本步骤将依赖锁定到精确版本。执行如下脚本:
+
+```bash
+./scripts/freeze_requirements.sh
+```
+
+该脚本调用 `pip-tools compile` 解析依赖源文件,并最终生成一系列 `.txt` 文件(如 `requirements/gpu.txt`、`requirements/cpu.txt` 等),这些文件将为 [1.3 镜像构建](./README.md#13-镜像构建) 提供依赖版本约束。
+
+### 1.3 镜像构建
+
+在完成 1.2 锁定依赖后,如需构建 GPU 镜像,需提前将 [cuDNN 8.9.7-CUDA 11.x 安装包](https://developer.nvidia.cn/rdp/cudnn-archive) 和 [TensorRT 8.6.1.6-Ubuntu 20.04 安装包](https://developer.nvidia.com/nvidia-tensorrt-8x-download) 放在 `server_env` 目录下。对于 Triton Server,项目使用预先编译好的版本,将在构建镜像时自动下载,无需手动下载。以构建 GPU 镜像为例,执行以下命令:
+
+```bash
+./scripts/build_deployment_image.sh -k gpu -t latest-gpu 
+```
+
+构建镜像的参数配置项包括:
+
+<table>
+<thead>
+<tr>
+<th>名称</th>
+<th>说明</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td><code>-k</code></td>
+<td>指定镜像的设备类型,可选值为 <code>gpu</code> 或 <code>cpu</code>。</td>
+</tr>
+<tr>
+<td><code>-t</code></td>
+<td>镜像标签,默认为 <code>latest-${DEVICE}</code>。</td>
+</tr>
+<tr>
+<td><code>-p</code></td>
+<td>Python 包索引 URL,如不指定默认为 <code>https://pypi.org/simple</code>。</td>
+</tr>
+</tbody>
+</table>
+
+执行成功后,命令行会输出以下提示信息:
+
+```text
+ => => exporting to image                                                         
+ => => exporting layers                                                      
+ => => writing image  sha256:ba3d0b2b079d63ee0239a99043fec7e25f17bf2a7772ec2fc80503c1582b3459   
+ => => naming to ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps:latest-gpu   
+```
+
+如需批量构建 GPU 和 CPU 镜像,可以执行以下命令:
+
+```bash
+./scripts/prepare_deployment_images.sh
+```
+
+## 2. 产线物料打包
+
+本阶段主要介绍 `sdk` 目录为多个模型产线提供的统一打包功能。同时,该目录为每个产线提供对应的 client 和 server 代码实现:
+
+- `client` 部分:用于调用模型服务。
+- `server` 部分:以 [1. 镜像构建](#1-镜像构建) 阶段构建的镜像作为运行环境,用于部署模型服务。
+
+打包可通过 `scripts/assemble.sh` 脚本执行,以打包通用 OCR 产线为例:
+
+```bash
+./scripts/assemble.sh OCR
+```
+
+打包脚本的参数说明如下:
+
+<table>
+<thead>
+<tr>
+<th>名称</th>
+<th>说明</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td><code>pipeline_names</code></td>
+<td>需要打包的产线名称,可以为空或一次指定多个,例如通用 OCR 产线为 <code>OCR</code>。</td>
+</tr>
+<tr>
+<td><code>--all</code></td>
+<td>打包全部产线,与 <code>pipeline_names</code> 不可共用。</td>
+</tr>
+<tr>
+<td><code>--no-server</code></td>
+<td>不打包产线中的 server 代码。</td>
+</tr>
+<tr>
+<td><code>--no-client</code></td>
+<td>不打包产线中的 client 代码。</td>
+</tr>
+</tbody>
+</table>
+
+打包完成后,产物将保存在当前目录下的 `output` 目录中。
+
+
+
+## 3. FAQ
+
+**1. 构建镜像时无法拉取 Docker 基础镜像**
+
+由于网络连接问题或镜像源访问限制,可能会导致从 Docker Hub 拉取基础镜像失败。可尝试在本地 Docker 配置文件 `/etc/docker/daemon.json` 中添加国内可信镜像仓库地址,以提升镜像下载速度和稳定性。如果上述方法仍无法解决,可尝试从官方或可信第三方渠道手动下载镜像文件。
+
+
+**2. 镜像构建过程中安装 Python 依赖超时?**
+
+可能是由于网络问题,pip 从官方源下载依赖速度过慢或连接失败。可在执行镜像构建脚本时使用 `-p` 参数指定国内 Python 包索引 URL。以构建依赖收集镜像时使用清华镜像源为例:
+
+```bash
+./scripts/prepare_rc_image.sh -p https://pypi.tuna.tsinghua.edu.cn/simple
+```

+ 164 - 0
deploy/hps/README_en.md

@@ -0,0 +1,164 @@
+---
+comments: true
+---
+
+# PaddleX High Stability Serving
+
+This project provides a high-stability serving solution consisting of two main components: `server_env` and `sdk`. `server_env` is responsible for building multiple Docker images that include Triton Inference Server, providing the runtime environment for pipeline servers. `sdk` is used to package the pipeline SDK, including both server and client code for various model pipelines, as shown in the following figure:
+
+<img src="https://github.com/cuicheng01/PaddleX_doc_images/blob/main/images/hps/hps_workflow_en.png?raw=true"/>
+
+
+**Note: This project relies on the following environment configurations:**
+
+- **Operating System**: Linux
+- **Docker Version**: `>= 20.10.0` (Used for image building and deployment)
+- **CPU Architecture**: x86-64
+
+This document mainly introduces how to set up a high-stability serving environment and package the related materials using the scripts provided by this project. The overall process consists of two main stages:
+
+1. Image Building: Build Docker images that include Triton Inference Server. In this stage, requirement versions are locked to ensure reproducibility and stability of the deployment images.
+
+2. Pipeline Material Packaging: Package the client and server code for each model pipeline, making it easier for subsequent deployment and integration.
+
+To learn how to start the server and invoke services using the built images and packaged SDK, please refer to the [PaddleX Serving Guide](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_deploy/serving.html) for detailed instructions.
+
+
+## 1. Image Building
+
+This stage mainly introduces the overall process and key steps of image building.
+
+Image Building Steps:
+
+1. Build a requirement collection image.
+2. Freeze requirement versions to improve the reproducibility of deployment image building. 
+3. Build the deployment image based on the frozen requirement information; this final image provides the runtime environment for subsequent pipeline execution.
+
+
+### 1.1 Build the Requirement Collection Image
+
+Run the requirement collection script located in the `server_env` directory:
+
+```bash
+./scripts/prepare_rc_image.sh
+```
+
+This script builds a requirement collection image for each device type. The image includes Python 3.10 and [pip-tools](https://github.com/jazzband/pip-tools); [1.2 Freeze Requirements](./README_en.md#12-freeze-requirements) is carried out on top of this image. After the build completes, two images, `paddlex-hps-rc:gpu` and `paddlex-hps-rc:cpu`, will be generated. If you encounter network issues, you can specify another pip source through the `-p` parameter; if not specified, the default source https://pypi.org/simple will be used.
+
+### 1.2 Freeze Requirements
+
+To enhance the reproducibility of the build, this step freezes the requirements to exact versions. Run the following script:
+
+```bash
+./scripts/freeze_requirements.sh
+```
+
+This script uses `pip-tools compile` to parse the source requirement files and generate a series of `.txt` files (such as `requirements/gpu.txt`, `requirements/cpu.txt`, etc.). These files serve as version constraints for [1.3 Building the Deployment Image](./README_en.md#13-building-the-deployment-image).
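+
+For illustration only, the freeze step is roughly equivalent to running `pip-compile` inside the requirement collection image built in 1.1; the exact options and source file names are defined by the script, and `requirements/gpu.in` below is a hypothetical input name:
+
+```bash
+# Sketch: run from the server_env directory (paths and file names are assumptions)
+docker run --rm -v "$PWD":/workspace -w /workspace paddlex-hps-rc:gpu \
+    pip-compile --output-file requirements/gpu.txt requirements/gpu.in
+```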
+
+### 1.3 Building the Deployment Image
+
+After completing 1.2 Freeze Requirements, if you need to build the GPU image, make sure to place the following installation packages in the `server_env` directory in advance: [cuDNN 8.9.7-CUDA 11.x Tar](https://developer.nvidia.cn/rdp/cudnn-archive) and [TensorRT 8.6.1.6-Ubuntu 20.04 Tar Package](https://developer.nvidia.com/nvidia-tensorrt-8x-download). For Triton Inference Server, a precompiled version will be automatically downloaded during the build process, so manual download is not required. To build a GPU image, run the following command:
+
+```bash
+./scripts/build_deployment_image.sh -k gpu -t latest-gpu
+```
+
+The image build script supports the following configuration options:
+
+<table>
+<thead>
+<tr>
+<th>Name</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td><code>-k</code></td>
+<td>Specifies the device type for the image. Supported values: <code>gpu</code> or <code>cpu</code>.</td>
+</tr>
+<tr>
+<td><code>-t</code></td>
+<td>Sets the image tag. Default: <code>latest-${DEVICE}</code>.</td>
+</tr>
+<tr>
+<td><code>-p</code></td>
+<td>Python package index URL. If not specified, defaults to <code>https://pypi.org/simple</code>.</td>
+</tr>
+</tbody>
+</table>
+
+After a successful build, the command line will display output like the following:
+
+```text
+ => => exporting to image                                                         
+ => => exporting layers                                                      
+ => => writing image  sha256:ba3d0b2b079d63ee0239a99043fec7e25f17bf2a7772ec2fc80503c1582b3459   
+ => => naming to ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps:latest-gpu   
+```
+
+To build both GPU and CPU images, run the following command:
+
+```bash
+./scripts/prepare_deployment_images.sh
+```
+
+## 2. Pipeline Material Packaging
+
+This stage introduces the unified packaging functionality that the `sdk` directory provides for multiple pipelines. In addition, this directory provides corresponding client and server code implementations for each pipeline:
+
+- `client`: Responsible for invoking the model services.
+- `server`: Runs on the images built in [1. Image Building](./README_en.md#1-image-building) and is used to deploy the model services.
+
+Packaging can be performed using the `scripts/assemble.sh` script. For example, to package the general OCR pipeline, run:
+
+```bash
+./scripts/assemble.sh OCR
+```
+
+The parameters for the packaging script are described as follows:
+
+<table>
+<thead>
+<tr>
+<th>Name</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td><code>pipeline_names</code></td>
+<td>Specifies the names of the pipelines to be packaged. Can be empty or include multiple names. For example, the general OCR pipeline is <code>OCR</code>.</td>
+</tr>
+<tr>
+<td><code>--all</code></td>
+<td>Packages all pipelines. Cannot be used together with <code>pipeline_names</code>.</td>
+</tr>
+<tr>
+<td><code>--no-server</code></td>
+<td>Excludes the server code from the package.</td>
+</tr>
+<tr>
+<td><code>--no-client</code></td>
+<td>Excludes the client code from the package.</td>
+</tr>
+</tbody>
+</table>
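+
+For example (the exact argument order is an assumption; see `scripts/assemble.sh` for the authoritative interface), several pipelines can be packaged in one call, or everything at once:
+
+```bash
+./scripts/assemble.sh OCR PP-StructureV3    # package two specific pipelines
+./scripts/assemble.sh --all --no-client     # package every pipeline, server code only
+```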
+
+After the script runs successfully, the packaged materials will be stored in the `output` directory under the current directory.
+
+## 3. FAQ
+
+**1. Failed to pull the base Docker image during build?**
+
+This issue may occur due to network connectivity problems or restricted access to Docker Hub. You can add trusted domestic mirror registry URLs to your local Docker configuration file at `/etc/docker/daemon.json` to improve download speed and stability. If this does not resolve the issue, consider manually downloading the base image from the official source or another trusted third-party source.
+
+
+**2. Timeout when installing Python requirements during image build?**
+
+Network issues may cause slow download speeds or connection failures when pip retrieves packages from the official source.
+When running the image build scripts, you can use the `-p` parameter to specify an alternative Python package index URL. For example, to use the Tsinghua mirror for the requirement collection image:
+
+```bash
+./scripts/prepare_rc_image.sh -p https://pypi.tuna.tsinghua.edu.cn/simple
+```

+ 1 - 0
deploy/hps/sdk/.gitignore

@@ -0,0 +1 @@
+/output/

+ 22 - 0
deploy/hps/sdk/common/config_cpu.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]

+ 23 - 0
deploy/hps/sdk/common/config_gpu.pbtxt

@@ -0,0 +1,23 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_GPU
+      gpus: [ 0 ]
+  }
+]

+ 26 - 0
deploy/hps/sdk/common/server.sh

@@ -0,0 +1,26 @@
+#!/usr/bin/env bash
+
+set -e
+
+export LANG='C.UTF-8'
+export PADDLEX_HPS_LOGGING_LEVEL='INFO'
+
+export PADDLEX_HPS_PIPELINE_CONFIG_PATH="${PADDLEX_HPS_PIPELINE_CONFIG_PATH:-$(realpath pipeline_config.yaml)}"
+
+readonly MODEL_REPO_DIR=/paddlex/var/paddlex_model_repo
+
+rm -rf "${MODEL_REPO_DIR}"
+
+cp -r model_repo "${MODEL_REPO_DIR}"
+
+find "${MODEL_REPO_DIR}" -mindepth 1 -maxdepth 1 -type d -print0 | while IFS= read -r -d '' dir_; do
+    if [ -f "${dir_}/config_${PADDLEX_HPS_DEVICE_TYPE}.pbtxt" ]; then
+        cp -f "${dir_}/config_${PADDLEX_HPS_DEVICE_TYPE}.pbtxt" "${dir_}/config.pbtxt"
+    fi
+done
+
+if [ -d shared_mods ]; then
+    export PYTHONPATH="$(realpath shared_mods):${PYTHONPATH}"
+fi
+
+exec tritonserver --model-repository="${MODEL_REPO_DIR}" --backend-config=python,shm-default-byte-size=104857600,shm-growth-byte-size=10485760 --log-info=1 --log-warning=1 --log-error=1
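
The script above stages `model_repo` into a Triton model repository, swaps in the device-specific `config_*.pbtxt` based on `PADDLEX_HPS_DEVICE_TYPE`, and then launches `tritonserver`. A minimal launch sketch, assuming it is run from a packaged pipeline's `server` directory (containing `server.sh`, `model_repo/`, and `pipeline_config.yaml`) inside a deployment image from the README's build step; the image tag, mounts, and port mapping are illustrative (8001 is Triton's default gRPC port, which the client scripts also assume):

```bash
docker run --rm --gpus all \
    -e PADDLEX_HPS_DEVICE_TYPE=gpu \
    -p 8001:8001 \
    -v "$PWD":/workspace -w /workspace \
    ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps:latest-gpu \
    ./server.sh
```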

+ 16 - 0
deploy/hps/sdk/paddlex-hps-client/pyproject.toml

@@ -0,0 +1,16 @@
+[build-system]
+requires = ["setuptools >= 69"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "paddlex-hps-client"
+version = "0.2.0"
+dependencies = [
+    "numpy >= 1.24",
+    # XXX: Not sure about the backward compatibility
+    "tritonclient [grpc] >= 2.15.0",
+]
+
+[tool.isort]
+profile = "black"
+src_paths = ["src"]

+ 3 - 0
deploy/hps/sdk/paddlex-hps-client/requirements.txt

@@ -0,0 +1,3 @@
+numpy >= 1.24
+# XXX: Not sure about the backward compatibility
+tritonclient [grpc] >= 2.15.0

+ 3 - 0
deploy/hps/sdk/paddlex-hps-client/scripts/build_wheel.sh

@@ -0,0 +1,3 @@
+#!/usr/bin/env bash
+
+python -m pip wheel -w wheels --no-deps .
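
A usage sketch for the script above, run from the `paddlex-hps-client` package root (a glob is used because the exact wheel filename depends on the build):

```bash
cd deploy/hps/sdk/paddlex-hps-client
./scripts/build_wheel.sh          # writes the wheel into ./wheels/
python -m pip install wheels/*.whl
```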

+ 11 - 0
deploy/hps/sdk/paddlex-hps-client/src/paddlex_hps_client/__init__.py

@@ -0,0 +1,11 @@
+from importlib import metadata as _metadata
+
+from .request import triton_request
+
+__all__ = ["__version__", "triton_request"]
+
+# Ref: https://github.com/langchain-ai/langchain/blob/493e474063817b9a4c2521586b2dbc34d20b4cf1/libs/core/langchain_core/__init__.py
+try:
+    __version__ = _metadata.version(__package__)
+except _metadata.PackageNotFoundError:
+    __version__ = ""

+ 2 - 0
deploy/hps/sdk/paddlex-hps-client/src/paddlex_hps_client/constants.py

@@ -0,0 +1,2 @@
+INPUT_NAME = "input"
+OUTPUT_NAME = "output"

+ 31 - 0
deploy/hps/sdk/paddlex-hps-client/src/paddlex_hps_client/request.py

@@ -0,0 +1,31 @@
+import json
+
+import numpy as np
+from tritonclient import grpc as triton_grpc
+
+from . import constants
+
+
+def _create_triton_input(data):
+    data = json.dumps(data, separators=(",", ":"))
+    data = data.encode("utf-8")
+    data = [[data]]
+    data = np.array(data, dtype=np.object_)
+    return data
+
+
+def _parse_triton_output(data):
+    data = data[0, 0]
+    data = data.decode("utf-8")
+    data = json.loads(data)
+    return data
+
+
+def triton_request(client, model_name, data, *, request_kwargs=None):
+    if request_kwargs is None:
+        request_kwargs = {}
+    input_ = triton_grpc.InferInput(constants.INPUT_NAME, [1, 1], "BYTES")
+    input_.set_data_from_numpy(_create_triton_input(data))
+    results = client.infer(model_name, inputs=[input_], **request_kwargs)
+    output = results.as_numpy(constants.OUTPUT_NAME)
+    return _parse_triton_output(output)

+ 43 - 0
deploy/hps/sdk/paddlex-hps-client/src/paddlex_hps_client/utils.py

@@ -0,0 +1,43 @@
+import base64
+import mimetypes
+import shutil
+from urllib.parse import urlparse
+from urllib.request import urlopen
+
+
+def is_url(s):
+    if not (s.startswith("http://") or s.startswith("https://")):
+        # Quick rejection
+        return False
+    result = urlparse(s)
+    return all([result.scheme, result.netloc]) and result.scheme in ("http", "https")
+
+
+def prepare_input_file(file, include_header=False):
+    if is_url(file):
+        return file
+    else:
+        with open(file, "rb") as f:
+            bytes_ = f.read()
+        encoded = base64.b64encode(bytes_).decode("ascii")
+        if include_header:
+            mime_type = mimetypes.guess_type(file)[0] or "application/octet-stream"
+            return f"data:{mime_type};base64,{encoded}"
+        return encoded
+
+
+def save_output_file(file, path, include_header=False):
+    if is_url(file):
+        with urlopen(file) as r:
+            with open(path, "wb") as f:
+                shutil.copyfileobj(r, f)
+    else:
+        if include_header:
+            header, encoded = file.split(",", 1)
+            if not (header.startswith("data:") and header.endswith(";base64")):
+                raise ValueError("Invalid data URI format")
+        else:
+            encoded = file
+        bytes_ = base64.b64decode(encoded)
+        with open(path, "wb") as f:
+            f.write(bytes_)

+ 30 - 0
deploy/hps/sdk/pipelines/3d_bev_detection/client/client.py

@@ -0,0 +1,30 @@
+#!/usr/bin/env python
+
+import argparse
+import pprint
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--tar", type=str, required=True)
+    parser.add_argument("--url", type=str, default="localhost:8001")
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"tar": utils.prepare_input_file(args.tar)}
+    output = triton_request(client, "bev-3d-object-detection", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    print("Detected objects:")
+    pprint.pp(result["detectedObjects"])
+
+
+if __name__ == "__main__":
+    main()
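
A minimal invocation sketch for this client, assuming the `bev-3d-object-detection` model from this SDK is being served at the default `localhost:8001` and that `paddlex-hps-client` plus the packages in `requirements.txt` are installed; the input archive name is illustrative:

```bash
python client.py --tar demo_input.tar --url localhost:8001
```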

+ 3 - 0
deploy/hps/sdk/pipelines/3d_bev_detection/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 42 - 0
deploy/hps/sdk/pipelines/3d_bev_detection/server/model_repo/bev-3d-object-detection/1/model.py

@@ -0,0 +1,42 @@
+import os
+from typing import Any, Dict, List
+
+from paddlex_hps_server import BaseTritonPythonModel, schemas, utils
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def get_input_model_type(self):
+        return schemas.m_3d_bev_detection.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.m_3d_bev_detection.InferResult
+
+    def run(self, input, log_id):
+        file_bytes = utils.get_raw_bytes(input.tar)
+        tar_path = utils.write_to_temp_file(
+            file_bytes,
+            suffix=".tar",
+        )
+
+        try:
+            result = list(
+                self.pipeline(
+                    tar_path,
+                )
+            )[0]
+        finally:
+            os.unlink(tar_path)
+
+        objects: List[Dict[str, Any]] = []
+        for box, label, score in zip(
+            result["boxes_3d"], result["labels_3d"], result["scores_3d"]
+        ):
+            objects.append(
+                dict(
+                    bbox=box,
+                    categoryId=label,
+                    score=score,
+                )
+            )
+
+        return schemas.m_3d_bev_detection.InferResult(detectedObjects=objects)

+ 9 - 0
deploy/hps/sdk/pipelines/3d_bev_detection/server/pipeline_config.yaml

@@ -0,0 +1,9 @@
+
+pipeline_name: 3d_bev_detection
+
+SubModules:
+  3DBEVDetection:
+    module_name: 3d_bev_detection
+    model_name: BEVFusion
+    model_dir: null
+    batch_size: 1

+ 39 - 0
deploy/hps/sdk/pipelines/OCR/client/client.py

@@ -0,0 +1,39 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--file", type=str, required=True)
+    parser.add_argument("--file-type", type=int, choices=[0, 1])
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"file": utils.prepare_input_file(args.file)}
+    if args.file_type is not None:
+        input_["fileType"] = args.file_type
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "ocr", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    for i, res in enumerate(result["ocrResults"]):
+        print(res["prunedResult"])
+        ocr_img_path = f"ocr_{i}.jpg"
+        utils.save_output_file(res["ocrImage"], ocr_img_path)
+        print(f"Output image saved at {ocr_img_path}")
+
+
+if __name__ == "__main__":
+    main()
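
A minimal invocation sketch for this client, assuming the `ocr` model is being served at the default `localhost:8001` and the client dependencies are installed; the input file name is illustrative (`--file-type 1` marks the input as an image, `0` as a PDF):

```bash
python client.py --file sample.jpg --file-type 1
```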

+ 3 - 0
deploy/hps/sdk/pipelines/OCR/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 134 - 0
deploy/hps/sdk/pipelines/OCR/server/model_repo/ocr/1/model.py

@@ -0,0 +1,134 @@
+from typing import Any, Dict, Final, List, Tuple
+
+from paddlex_hps_server import (
+    BaseTritonPythonModel,
+    app_common,
+    protocol,
+    schemas,
+    utils,
+)
+from paddlex_hps_server.storage import SupportsGetURL, create_storage
+
+_DEFAULT_MAX_NUM_INPUT_IMGS: Final[int] = 10
+_DEFAULT_MAX_OUTPUT_IMG_SIZE: Final[Tuple[int, int]] = (2000, 2000)
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+        self.context = {}
+        self.context["file_storage"] = None
+        self.context["return_img_urls"] = False
+        self.context["max_num_input_imgs"] = _DEFAULT_MAX_NUM_INPUT_IMGS
+        self.context["max_output_img_size"] = _DEFAULT_MAX_OUTPUT_IMG_SIZE
+        if self.app_config.extra:
+            if "file_storage" in self.app_config.extra:
+                self.context["file_storage"] = create_storage(
+                    self.app_config.extra["file_storage"]
+                )
+            if "return_img_urls" in self.app_config.extra:
+                self.context["return_img_urls"] = self.app_config.extra[
+                    "return_img_urls"
+                ]
+            if "max_num_input_imgs" in self.app_config.extra:
+                self.context["max_num_input_imgs"] = self.app_config.extra[
+                    "max_num_input_imgs"
+                ]
+            if "max_output_img_size" in self.app_config.extra:
+                self.context["max_output_img_size"] = self.app_config.extra[
+                    "max_output_img_size"
+                ]
+        if self.context["return_img_urls"]:
+            file_storage = self.context["file_storage"]
+            if not file_storage:
+                raise ValueError(
+                    "The file storage must be properly configured when URLs need to be returned."
+                )
+            if not isinstance(file_storage, SupportsGetURL):
+                raise TypeError(f"{type(file_storage)} does not support getting URLs.")
+
+    def get_input_model_type(self):
+        return schemas.ocr.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.ocr.InferResult
+
+    def run(self, input, log_id):
+        if input.fileType is None:
+            if utils.is_url(input.file):
+                maybe_file_type = utils.infer_file_type(input.file)
+                if maybe_file_type is None or not (
+                    maybe_file_type == "PDF" or maybe_file_type == "IMAGE"
+                ):
+                    return protocol.create_aistudio_output_without_result(
+                        422,
+                        "Unsupported file type",
+                        log_id=log_id,
+                    )
+                file_type = maybe_file_type
+            else:
+                return protocol.create_aistudio_output_without_result(
+                    422,
+                    "File type cannot be determined",
+                    log_id=log_id,
+                )
+        else:
+            file_type = "PDF" if input.fileType == 0 else "IMAGE"
+        visualize_enabled = input.visualize if input.visualize is not None else self.app_config.visualize
+
+        file_bytes = utils.get_raw_bytes(input.file)
+        images, data_info = utils.file_to_images(
+            file_bytes,
+            file_type,
+            max_num_imgs=self.context["max_num_input_imgs"],
+        )
+
+        result = list(
+            self.pipeline(
+                images,
+                use_doc_orientation_classify=input.useDocOrientationClassify,
+                use_doc_unwarping=input.useDocUnwarping,
+                use_textline_orientation=input.useTextlineOrientation,
+                text_det_limit_side_len=input.textDetLimitSideLen,
+                text_det_limit_type=input.textDetLimitType,
+                text_det_thresh=input.textDetThresh,
+                text_det_box_thresh=input.textDetBoxThresh,
+                text_det_unclip_ratio=input.textDetUnclipRatio,
+                text_rec_score_thresh=input.textRecScoreThresh,
+            )
+        )
+
+        ocr_results: List[Dict[str, Any]] = []
+        for i, (img, item) in enumerate(zip(images, result)):
+            pruned_res = app_common.prune_result(item.json["res"])
+            if visualize_enabled:
+                output_imgs = item.img
+                imgs = {
+                    "input_img": img,
+                    "ocr_img": output_imgs["ocr_res_img"],
+                }
+                if "preprocessed_img" in output_imgs:
+                    imgs["doc_preprocessing_img"] = output_imgs["preprocessed_img"]
+                imgs = app_common.postprocess_images(
+                    imgs,
+                    log_id,
+                    filename_template=f"{{key}}_{i}.jpg",
+                    file_storage=self.context["file_storage"],
+                    return_urls=self.context["return_img_urls"],
+                    max_img_size=self.context["max_output_img_size"],
+                )
+            else:
+                imgs = {}
+            ocr_results.append(
+                dict(
+                    prunedResult=pruned_res,
+                    ocrImage=imgs.get("ocr_img"),
+                    docPreprocessingImage=imgs.get("doc_preprocessing_img"),
+                    inputImage=imgs.get("input_img"),
+                )
+            )
+
+        return schemas.ocr.InferResult(
+            ocrResults=ocr_results,
+            dataInfo=data_info,
+        )

+ 45 - 0
deploy/hps/sdk/pipelines/OCR/server/pipeline_config.yaml

@@ -0,0 +1,45 @@
+
+pipeline_name: OCR
+
+text_type: general
+
+use_doc_preprocessor: True
+use_textline_orientation: True
+
+SubPipelines:
+  DocPreprocessor:
+    pipeline_name: doc_preprocessor
+    use_doc_orientation_classify: True
+    use_doc_unwarping: True
+    SubModules:
+      DocOrientationClassify:
+        module_name: doc_text_orientation
+        model_name: PP-LCNet_x1_0_doc_ori
+        model_dir: null
+      DocUnwarping:
+        module_name: image_unwarping
+        model_name: UVDoc
+        model_dir: null
+
+SubModules:
+  TextDetection:
+    module_name: text_detection
+    model_name: PP-OCRv4_mobile_det
+    model_dir: null
+    limit_side_len: 960
+    limit_type: max
+    max_side_limit: 4000
+    thresh: 0.3
+    box_thresh: 0.6
+    unclip_ratio: 1.5
+  TextLineOrientation:
+    module_name: textline_orientation
+    model_name: PP-LCNet_x0_25_textline_ori 
+    model_dir: null
+    batch_size: 6    
+  TextRecognition:
+    module_name: text_recognition
+    model_name: PP-OCRv4_mobile_rec 
+    model_dir: null
+    batch_size: 6
+    score_thresh: 0.0

+ 67 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/client/client.py

@@ -0,0 +1,67 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+
+def ensure_no_error(output, additional_msg):
+    if output["errorCode"] != 0:
+        print(additional_msg, file=sys.stderr)
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--file", type=str, required=True)
+    parser.add_argument("--key-list", type=str, nargs="+", required=True)
+    parser.add_argument("--file-type", type=int, choices=[0, 1])
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+
+    input_ = {"file": utils.prepare_input_file(args.file)}
+    if args.file_type is not None:
+        input_["fileType"] = args.file_type
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "chatocr-visual", input_)
+    ensure_no_error(output, "Failed to analyze the images")
+    result_visual = output["result"]
+
+    for i, res in enumerate(result_visual["layoutParsingResults"]):
+        print(res["prunedResult"])
+        for img_name, img in res["outputImages"].items():
+            img_path = f"{img_name}_{i}.jpg"
+            utils.save_output_file(img, img_path)
+            print(f"Output image saved at {img_path}")
+
+    input_ = {
+        "visualInfo": result_visual["visualInfo"],
+    }
+    output = triton_request(client, "chatocr-vector", input_)
+    ensure_no_error(output, "Failed to build a vector store")
+    result_vector = output["result"]
+
+    input_ = {
+        "keyList": args.key_list,
+        "visualInfo": result_visual["visualInfo"],
+        "useVectorRetrieval": True,
+        "vectorInfo": result_vector["vectorInfo"],
+    }
+    output = triton_request(client, "chatocr-chat", input_)
+    ensure_no_error(output, "Failed to chat with the LLM")
+    result_chat = output["result"]
+    print("Final result:")
+    print(result_chat["chatResult"])
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 38 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-chat/1/model.py

@@ -0,0 +1,38 @@
+from paddlex_hps_server import BaseTritonPythonModel, schemas
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    @property
+    def pipeline_creation_kwargs(self):
+        return {"initial_predictor": False}
+
+    def get_input_model_type(self):
+        return schemas.pp_chatocrv3_doc.ChatRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_chatocrv3_doc.ChatResult
+
+    def run(self, input, log_id):
+        result = self.pipeline.chat(
+            input.keyList,
+            input.visualInfo,
+            use_vector_retrieval=input.useVectorRetrieval,
+            vector_info=input.vectorInfo,
+            min_characters=input.minCharacters,
+            text_task_description=input.textTaskDescription,
+            text_output_format=input.textOutputFormat,
+            text_rules_str=input.textRulesStr,
+            text_few_shot_demo_text_content=input.textFewShotDemoTextContent,
+            text_few_shot_demo_key_value_list=input.textFewShotDemoKeyValueList,
+            table_task_description=input.tableTaskDescription,
+            table_output_format=input.tableOutputFormat,
+            table_rules_str=input.tableRulesStr,
+            table_few_shot_demo_text_content=input.tableFewShotDemoTextContent,
+            table_few_shot_demo_key_value_list=input.tableFewShotDemoKeyValueList,
+            chat_bot_config=input.chatBotConfig,
+            retriever_config=input.retrieverConfig,
+        )
+
+        return schemas.pp_chatocrv3_doc.ChatResult(
+            chatResult=result["chat_res"],
+        )

+ 22 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-chat/config.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]

+ 24 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-vector/1/model.py

@@ -0,0 +1,24 @@
+from paddlex_hps_server import BaseTritonPythonModel, schemas
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    @property
+    def pipeline_creation_kwargs(self):
+        return {"initial_predictor": False}
+
+    def get_input_model_type(self):
+        return schemas.pp_chatocrv3_doc.BuildVectorStoreRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_chatocrv3_doc.BuildVectorStoreResult
+
+    def run(self, input, log_id):
+        vector_info = self.pipeline.build_vector(
+            input.visualInfo,
+            min_characters=input.minCharacters,
+            block_size=input.blockSize,
+            flag_save_bytes_vector=True,
+            retriever_config=input.retrieverConfig,
+        )
+
+        return schemas.pp_chatocrv3_doc.BuildVectorStoreResult(vectorInfo=vector_info)

+ 22 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-vector/config.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]

+ 148 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/model_repo/chatocr-visual/1/model.py

@@ -0,0 +1,148 @@
+from typing import Any, Dict, Final, List, Tuple
+
+from paddlex_hps_server import (
+    BaseTritonPythonModel,
+    app_common,
+    protocol,
+    schemas,
+    utils,
+)
+from paddlex_hps_server.storage import SupportsGetURL, create_storage
+
+_DEFAULT_MAX_NUM_INPUT_IMGS: Final[int] = 10
+_DEFAULT_MAX_OUTPUT_IMG_SIZE: Final[Tuple[int, int]] = (2000, 2000)
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+        self.context = {}
+        self.context["file_storage"] = None
+        self.context["return_img_urls"] = False
+        self.context["max_num_input_imgs"] = _DEFAULT_MAX_NUM_INPUT_IMGS
+        self.context["max_output_img_size"] = _DEFAULT_MAX_OUTPUT_IMG_SIZE
+        if self.app_config.extra:
+            if "file_storage" in self.app_config.extra:
+                self.context["file_storage"] = create_storage(
+                    self.app_config.extra["file_storage"]
+                )
+            if "return_img_urls" in self.app_config.extra:
+                self.context["return_img_urls"] = self.app_config.extra[
+                    "return_img_urls"
+                ]
+            if "max_num_input_imgs" in self.app_config.extra:
+                self.context["max_num_input_imgs"] = self.app_config.extra[
+                    "max_num_input_imgs"
+                ]
+            if "max_output_img_size" in self.app_config.extra:
+                self.context["max_output_img_size"] = self.app_config.extra[
+                    "max_output_img_size"
+                ]
+        if self.context["return_img_urls"]:
+            file_storage = self.context["file_storage"]
+            if not file_storage:
+                raise ValueError(
+                    "The file storage must be properly configured when URLs need to be returned."
+                )
+            if not isinstance(file_storage, SupportsGetURL):
+                raise TypeError(f"{type(file_storage)} does not support getting URLs.")
+
+    def get_input_model_type(self):
+        return schemas.pp_chatocrv3_doc.AnalyzeImagesRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_chatocrv3_doc.AnalyzeImagesResult
+
+    def run(self, input, log_id):
+        if input.fileType is None:
+            if utils.is_url(input.file):
+                maybe_file_type = utils.infer_file_type(input.file)
+                if maybe_file_type is None or not (
+                    maybe_file_type == "PDF" or maybe_file_type == "IMAGE"
+                ):
+                    return protocol.create_aistudio_output_without_result(
+                        422,
+                        "Unsupported file type",
+                        log_id=log_id,
+                    )
+                file_type = maybe_file_type
+            else:
+                return protocol.create_aistudio_output_without_result(
+                    422,
+                    "File type cannot be determined",
+                    log_id=log_id,
+                )
+        else:
+            file_type = "PDF" if input.fileType == 0 else "IMAGE"
+        visualize_enabled = input.visualize if input.visualize is not None else self.app_config.visualize
+
+        file_bytes = utils.get_raw_bytes(input.file)
+        images, data_info = utils.file_to_images(
+            file_bytes,
+            file_type,
+            max_num_imgs=self.context["max_num_input_imgs"],
+        )
+
+        result = self.pipeline.visual_predict(
+            images,
+            use_doc_orientation_classify=input.useDocOrientationClassify,
+            use_doc_unwarping=input.useDocUnwarping,
+            use_seal_recognition=input.useSealRecognition,
+            use_table_recognition=input.useTableRecognition,
+            layout_threshold=input.layoutThreshold,
+            layout_nms=input.layoutNms,
+            layout_unclip_ratio=input.layoutUnclipRatio,
+            layout_merge_bboxes_mode=input.layoutMergeBboxesMode,
+            text_det_limit_side_len=input.textDetLimitSideLen,
+            text_det_limit_type=input.textDetLimitType,
+            text_det_thresh=input.textDetThresh,
+            text_det_box_thresh=input.textDetBoxThresh,
+            text_det_unclip_ratio=input.textDetUnclipRatio,
+            text_rec_score_thresh=input.textRecScoreThresh,
+            seal_det_limit_side_len=input.sealDetLimitSideLen,
+            seal_det_limit_type=input.sealDetLimitType,
+            seal_det_thresh=input.sealDetThresh,
+            seal_det_box_thresh=input.sealDetBoxThresh,
+            seal_det_unclip_ratio=input.sealDetUnclipRatio,
+            seal_rec_score_thresh=input.sealRecScoreThresh,
+        )
+
+        layout_parsing_results: List[Dict[str, Any]] = []
+        visual_info: List[dict] = []
+        for i, (img, item) in enumerate(zip(images, result)):
+            pruned_res = app_common.prune_result(
+                item["layout_parsing_result"].json["res"]
+            )
+            if visualize_enabled:
+                imgs = {
+                    "input_img": img,
+                    **item["layout_parsing_result"].img,
+                }
+                imgs = app_common.postprocess_images(
+                    imgs,
+                    log_id,
+                    filename_template=f"{{key}}_{i}.jpg",
+                    file_storage=self.context["file_storage"],
+                    return_urls=self.context["return_img_urls"],
+                    max_img_size=self.context["max_output_img_size"],
+                )
+            else:
+                imgs = {}
+            layout_parsing_results.append(
+                dict(
+                    prunedResult=pruned_res,
+                    outputImages=(
+                        {k: v for k, v in imgs.items() if k != "input_img"}
+                        if imgs
+                        else None
+                    ),
+                    inputImage=imgs.get("input_img"),
+                )
+            )
+            visual_info.append(item["visual_info"])
+
+        return schemas.pp_chatocrv3_doc.AnalyzeImagesResult(
+            layoutParsingResults=layout_parsing_results,
+            visualInfo=visual_info,
+            dataInfo=data_info,
+        )

+ 151 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv3-doc/server/pipeline_config.yaml

@@ -0,0 +1,151 @@
+
+pipeline_name: PP-ChatOCRv3-doc
+
+use_layout_parser: True
+
+SubModules:
+  LLM_Chat:
+    module_name: chat_bot
+    model_name: ernie-3.5-8k
+    base_url: "https://qianfan.baidubce.com/v2"
+    api_type: openai
+    api_key: "api_key" # Set this to a real API key
+
+  LLM_Retriever:
+    module_name: retriever
+    model_name: embedding-v1
+    base_url: "https://qianfan.baidubce.com/v2"
+    api_type: qianfan
+    api_key: "api_key" # Set this to a real API key
+
+
+  PromptEngneering:
+    KIE_CommonText:
+      module_name: prompt_engneering
+      task_type: text_kie_prompt_v1
+
+      task_description: '你现在的任务是从OCR文字识别的结果中提取关键词列表中每一项对应的关键信息。
+          OCR的文字识别结果使用```符号包围,包含所识别出来的文字,顺序在原始图片中从左至右、从上至下。
+          我指定的关键词列表使用[]符号包围。请注意OCR的文字识别结果可能存在长句子换行被切断、不合理的分词、
+          文字被错误合并等问题,你需要结合上下文语义进行综合判断,以抽取准确的关键信息。'
+
+      rules_str:
+
+      output_format: '在返回结果时使用JSON格式,包含多个key-value对,key值为我指定的问题,value值为该问题对应的答案。
+          如果认为OCR识别结果中,对于问题key,没有答案,则将value赋值为"未知"。请只输出json格式的结果,
+          并做json格式校验后返回,不要包含其它多余文字!'
+
+      few_shot_demo_text_content:
+
+      few_shot_demo_key_value_list:
+          
+    KIE_Table:
+      module_name: prompt_engneering
+      task_type: table_kie_prompt_v1
+
+      task_description: '你现在的任务是从输入的表格内容中提取关键词列表中每一项对应的关键信息,
+          表格内容用```符号包围,我指定的关键词列表使用[]符号包围。你需要结合上下文语义进行综合判断,以抽取准确的关键信息。'
+      
+      rules_str:
+
+      output_format: '在返回结果时使用JSON格式,包含多个key-value对,key值为我指定的关键词,value值为所抽取的结果。
+          如果认为表格识别结果中没有关键词key对应的value,则将value赋值为"未知"。请只输出json格式的结果,
+          并做json格式校验后返回,不要包含其它多余文字!'
+      
+      few_shot_demo_text_content:
+
+      few_shot_demo_key_value_list:
+
+SubPipelines:
+  LayoutParser:
+    pipeline_name: layout_parsing
+
+    use_doc_preprocessor: True
+    use_general_ocr: True
+    use_seal_recognition: True
+    use_table_recognition: True
+    use_formula_recognition: False
+
+    SubModules:
+      LayoutDetection:
+        module_name: layout_detection
+        model_name: RT-DETR-H_layout_3cls
+        model_dir: null
+
+    SubPipelines:
+      DocPreprocessor:
+        pipeline_name: doc_preprocessor
+        use_doc_orientation_classify: True
+        use_doc_unwarping: True
+        SubModules:
+          DocOrientationClassify:
+            module_name: doc_text_orientation
+            model_name: PP-LCNet_x1_0_doc_ori
+            model_dir: null
+          DocUnwarping:
+            module_name: image_unwarping
+            model_name: UVDoc
+            model_dir: null
+
+      GeneralOCR:
+        pipeline_name: OCR
+        text_type: general
+        use_doc_preprocessor: False
+        use_textline_orientation: False
+        SubModules:
+          TextDetection:
+            module_name: text_detection
+            model_name: PP-OCRv4_server_det
+            model_dir: null
+            limit_side_len: 960
+            limit_type: max
+            max_side_limit: 4000
+            thresh: 0.3
+            box_thresh: 0.6
+            unclip_ratio: 1.5
+            
+          TextRecognition:
+            module_name: text_recognition
+            model_name: PP-OCRv4_server_rec
+            model_dir: null
+            batch_size: 6
+            score_thresh: 0
+
+      TableRecognition:
+        pipeline_name: table_recognition
+        use_layout_detection: False
+        use_doc_preprocessor: False
+        use_ocr_model: False
+        SubModules:
+          TableStructureRecognition:
+            module_name: table_structure_recognition
+            model_name: SLANet_plus
+            model_dir: null
+
+      SealRecognition:
+        pipeline_name: seal_recognition
+        use_layout_detection: False
+        use_doc_preprocessor: False
+        SubPipelines:
+          SealOCR:
+            pipeline_name: OCR
+            text_type: seal
+            use_doc_preprocessor: False
+            use_textline_orientation: False
+            SubModules:
+              TextDetection:
+                module_name: seal_text_detection
+                model_name: PP-OCRv4_server_seal_det
+                model_dir: null
+                limit_side_len: 736
+                limit_type: min
+                max_side_limit: 4000
+                thresh: 0.2
+                box_thresh: 0.6
+                unclip_ratio: 0.5
+              TextRecognition:
+                module_name: text_recognition
+                model_name: PP-OCRv4_server_rec
+                model_dir: null
+                batch_size: 1
+                score_thresh: 0

+ 79 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/client/client.py

@@ -0,0 +1,79 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+
+def ensure_no_error(output, additional_msg):
+    if output["errorCode"] != 0:
+        print(additional_msg, file=sys.stderr)
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--file", type=str, required=True)
+    parser.add_argument("--key-list", type=str, nargs="+", required=True)
+    parser.add_argument("--file-type", type=int, choices=[0, 1])
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--invoke-mllm", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+
+    input_ = {"file": utils.prepare_input_file(args.file)}
+    if args.file_type is not None:
+        input_["fileType"] = args.file_type
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "chatocr-visual", input_)
+    ensure_no_error(output, "Failed to analyze the images")
+    result_visual = output["result"]
+
+    for i, res in enumerate(result_visual["layoutParsingResults"]):
+        print(res["prunedResult"])
+        for img_name, img in res["outputImages"].items():
+            img_path = f"{img_name}_{i}.jpg"
+            utils.save_output_file(img, img_path)
+            print(f"Output image saved at {img_path}")
+
+    input_ = {
+        "visualInfo": result_visual["visualInfo"],
+    }
+    output = triton_request(client, "chatocr-vector", input_)
+    ensure_no_error(output, "Failed to build a vector store")
+    result_vector = output["result"]
+
+    if args.invoke_mllm:
+        input_ = {
+            "image": utils.prepare_input_file(args.file),
+            "keyList": args.key_list,
+        }
+        output = triton_request(client, "chatocr-mllm", input_)
+        ensure_no_error(output, "Failed to invoke the MLLM")
+        result_mllm = output["result"]
+
+    input_ = {
+        "keyList": args.key_list,
+        "visualInfo": result_visual["visualInfo"],
+        "useVectorRetrieval": True,
+        "vectorInfo": result_vector["vectorInfo"],
+    }
+    if args.invoke_mllm:
+        input_["mllmPredictInfo"] = result_mllm["mllmPredictInfo"]
+    output = triton_request(client, "chatocr-chat", input_)
+    ensure_no_error(output, "Failed to chat with the LLM")
+    result_chat = output["result"]
+    print("Final result:")
+    print(result_chat["chatResult"])
+
+
+if __name__ == "__main__":
+    main()
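+
+# Example invocation (hypothetical file name and keys; the Triton server is assumed
+# to be reachable at the default gRPC address):
+#
+#   python client.py --file invoice.pdf --key-list "InvoiceNo" "TotalAmount" --invoke-mllm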

+ 3 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 40 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-chat/1/model.py

@@ -0,0 +1,40 @@
+from paddlex_hps_server import BaseTritonPythonModel, schemas
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    @property
+    def pipeline_creation_kwargs(self):
+        return {"initial_predictor": False}
+
+    def get_input_model_type(self):
+        return schemas.pp_chatocrv4_doc.ChatRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_chatocrv4_doc.ChatResult
+
+    def run(self, input, log_id):
+        result = self.pipeline.chat(
+            input.keyList,
+            input.visualInfo,
+            use_vector_retrieval=input.useVectorRetrieval,
+            vector_info=input.vectorInfo,
+            min_characters=input.minCharacters,
+            text_task_description=input.textTaskDescription,
+            text_output_format=input.textOutputFormat,
+            text_rules_str=input.textRulesStr,
+            text_few_shot_demo_text_content=input.textFewShotDemoTextContent,
+            text_few_shot_demo_key_value_list=input.textFewShotDemoKeyValueList,
+            table_task_description=input.tableTaskDescription,
+            table_output_format=input.tableOutputFormat,
+            table_rules_str=input.tableRulesStr,
+            table_few_shot_demo_text_content=input.tableFewShotDemoTextContent,
+            table_few_shot_demo_key_value_list=input.tableFewShotDemoKeyValueList,
+            mllm_predict_info=input.mllmPredictInfo,
+            mllm_integration_strategy=input.mllmIntegrationStrategy,
+            chat_bot_config=input.chatBotConfig,
+            retriever_config=input.retrieverConfig,
+        )
+
+        return schemas.pp_chatocrv4_doc.ChatResult(
+            chatResult=result["chat_res"],
+        )
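+
+# This model only drives the LLM chat step: it consumes the `visualInfo` produced by
+# `chatocr-visual` (plus, optionally, `vectorInfo` from `chatocr-vector` and
+# `mllmPredictInfo` from `chatocr-mllm`) and returns the final key-value answers.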

+ 22 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-chat/config.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]
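+
+# Each request and response is exchanged as a single serialized payload, hence the
+# one-element TYPE_STRING tensors; `count` in `instance_group` controls how many
+# instances of this Python model Triton launches.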

+ 27 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-mllm/1/model.py

@@ -0,0 +1,27 @@
+from paddlex_hps_server import BaseTritonPythonModel, schemas, utils
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    @property
+    def pipeline_creation_kwargs(self):
+        return {"initial_predictor": False}
+
+    def get_input_model_type(self):
+        return schemas.pp_chatocrv4_doc.InvokeMLLMRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_chatocrv4_doc.InvokeMLLMResult
+
+    def run(self, input, log_id):
+        file_bytes = utils.get_raw_bytes(input.image)
+        image = utils.image_bytes_to_array(file_bytes)
+
+        mllm_predict_info = self.pipeline.mllm_pred(
+            image,
+            input.keyList,
+            mllm_chat_bot_config=input.mllmChatBotConfig,
+        )
+
+        return schemas.pp_chatocrv4_doc.InvokeMLLMResult(
+            mllmPredictInfo=mllm_predict_info
+        )

+ 22 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-mllm/config.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]

+ 24 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-vector/1/model.py

@@ -0,0 +1,24 @@
+from paddlex_hps_server import BaseTritonPythonModel, schemas
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    @property
+    def pipeline_creation_kwargs(self):
+        return {"initial_predictor": False}
+
+    def get_input_model_type(self):
+        return schemas.pp_chatocrv4_doc.BuildVectorStoreRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_chatocrv4_doc.BuildVectorStoreResult
+
+    def run(self, input, log_id):
+        vector_info = self.pipeline.build_vector(
+            input.visualInfo,
+            min_characters=input.minCharacters,
+            block_size=input.blockSize,
+            flag_save_bytes_vector=True,
+            retriever_config=input.retrieverConfig,
+        )
+
+        return schemas.pp_chatocrv4_doc.BuildVectorStoreResult(vectorInfo=vector_info)

+ 22 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-vector/config.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]

+ 149 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/model_repo/chatocr-visual/1/model.py

@@ -0,0 +1,149 @@
+from typing import Any, Dict, Final, List, Tuple
+
+from paddlex_hps_server import (
+    BaseTritonPythonModel,
+    app_common,
+    protocol,
+    schemas,
+    utils,
+)
+from paddlex_hps_server.storage import SupportsGetURL, create_storage
+
+_DEFAULT_MAX_NUM_INPUT_IMGS: Final[int] = 10
+_DEFAULT_MAX_OUTPUT_IMG_SIZE: Final[Tuple[int, int]] = (2000, 2000)
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+        self.context = {}
+        self.context["file_storage"] = None
+        self.context["return_img_urls"] = False
+        self.context["max_num_input_imgs"] = _DEFAULT_MAX_NUM_INPUT_IMGS
+        self.context["max_output_img_size"] = _DEFAULT_MAX_OUTPUT_IMG_SIZE
+        if self.app_config.extra:
+            if "file_storage" in self.app_config.extra:
+                self.context["file_storage"] = create_storage(
+                    self.app_config.extra["file_storage"]
+                )
+            if "return_img_urls" in self.app_config.extra:
+                self.context["return_img_urls"] = self.app_config.extra[
+                    "return_img_urls"
+                ]
+            if "max_num_input_imgs" in self.app_config.extra:
+                self.context["max_num_input_imgs"] = self.app_config.extra[
+                    "max_num_input_imgs"
+                ]
+            if "max_output_img_size" in self.app_config.extra:
+                self.context["max_output_img_size"] = self.app_config.extra[
+                    "max_output_img_size"
+                ]
+        if self.context["return_img_urls"]:
+            file_storage = self.context["file_storage"]
+            if not file_storage:
+                raise ValueError(
+                    "The file storage must be properly configured when URLs need to be returned."
+                )
+            if not isinstance(file_storage, SupportsGetURL):
+                raise TypeError(f"{type(file_storage)} does not support getting URLs.")
+
+    def get_input_model_type(self):
+        return schemas.pp_chatocrv4_doc.AnalyzeImagesRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_chatocrv4_doc.AnalyzeImagesResult
+
+    def run(self, input, log_id):
+        if input.fileType is None:
+            if utils.is_url(input.file):
+                maybe_file_type = utils.infer_file_type(input.file)
+                if maybe_file_type is None or not (
+                    maybe_file_type == "PDF" or maybe_file_type == "IMAGE"
+                ):
+                    return protocol.create_aistudio_output_without_result(
+                        422,
+                        "Unsupported file type",
+                        log_id=log_id,
+                    )
+                file_type = maybe_file_type
+            else:
+                return protocol.create_aistudio_output_without_result(
+                    422,
+                    "File type cannot be determined",
+                    log_id=log_id,
+                )
+        else:
+            file_type = "PDF" if input.fileType == 0 else "IMAGE"
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        file_bytes = utils.get_raw_bytes(input.file)
+        images, data_info = utils.file_to_images(
+            file_bytes,
+            file_type,
+            max_num_imgs=self.context["max_num_input_imgs"],
+        )
+
+        result = self.pipeline.visual_predict(
+            images,
+            use_doc_orientation_classify=input.useDocOrientationClassify,
+            use_doc_unwarping=input.useDocUnwarping,
+            use_textline_orientation=input.useTextlineOrientation,
+            use_seal_recognition=input.useSealRecognition,
+            use_table_recognition=input.useTableRecognition,
+            layout_threshold=input.layoutThreshold,
+            layout_nms=input.layoutNms,
+            layout_unclip_ratio=input.layoutUnclipRatio,
+            layout_merge_bboxes_mode=input.layoutMergeBboxesMode,
+            text_det_limit_side_len=input.textDetLimitSideLen,
+            text_det_limit_type=input.textDetLimitType,
+            text_det_thresh=input.textDetThresh,
+            text_det_box_thresh=input.textDetBoxThresh,
+            text_det_unclip_ratio=input.textDetUnclipRatio,
+            text_rec_score_thresh=input.textRecScoreThresh,
+            seal_det_limit_side_len=input.sealDetLimitSideLen,
+            seal_det_limit_type=input.sealDetLimitType,
+            seal_det_thresh=input.sealDetThresh,
+            seal_det_box_thresh=input.sealDetBoxThresh,
+            seal_det_unclip_ratio=input.sealDetUnclipRatio,
+            seal_rec_score_thresh=input.sealRecScoreThresh,
+        )
+
+        layout_parsing_results: List[Dict[str, Any]] = []
+        visual_info: List[dict] = []
+        for i, (img, item) in enumerate(zip(images, result)):
+            pruned_res = app_common.prune_result(
+                item["layout_parsing_result"].json["res"]
+            )
+            if visualize_enabled:
+                imgs = {
+                    "input_img": img,
+                    **item["layout_parsing_result"].img,
+                }
+                imgs = app_common.postprocess_images(
+                    imgs,
+                    log_id,
+                    filename_template=f"{{key}}_{i}.jpg",
+                    file_storage=self.context["file_storage"],
+                    return_urls=self.context["return_img_urls"],
+                    max_img_size=self.context["max_output_img_size"],
+                )
+            else:
+                imgs = {}
+            layout_parsing_results.append(
+                dict(
+                    prunedResult=pruned_res,
+                    outputImages=(
+                        {k: v for k, v in imgs.items() if k != "input_img"}
+                        if imgs
+                        else None
+                    ),
+                    inputImage=imgs.get("input_img"),
+                )
+            )
+            visual_info.append(item["visual_info"])
+
+        return schemas.pp_chatocrv4_doc.AnalyzeImagesResult(
+            layoutParsingResults=layout_parsing_results,
+            visualInfo=visual_info,
+            dataInfo=data_info,
+        )
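+
+# Optional keys read from `app_config.extra` in `initialize`: "file_storage",
+# "return_img_urls", "max_num_input_imgs", and "max_output_img_size". When
+# "return_img_urls" is enabled, the configured file storage must also support URLs.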

+ 241 - 0
deploy/hps/sdk/pipelines/PP-ChatOCRv4-doc/server/pipeline_config.yaml

@@ -0,0 +1,241 @@
+
+pipeline_name: PP-ChatOCRv4-doc
+
+use_layout_parser: True
+
+use_mllm_predict: True
+
+SubModules:
+  LLM_Chat:
+    module_name: chat_bot
+    model_name: ernie-3.5-8k
+    base_url: "https://qianfan.baidubce.com/v2"
+    api_type: openai
+    api_key: "api_key" # Set this to a real API key
+
+  LLM_Retriever:
+    module_name: retriever
+    model_name: embedding-v1
+    base_url: "https://qianfan.baidubce.com/v2"
+    api_type: qianfan
+    api_key: "api_key" # Set this to a real API key
+
+  MLLM_Chat:
+    module_name: chat_bot
+    model_name: PP-DocBee
+    base_url: "http://127.0.0.1:8080/v1/chat/completions"
+    api_type: openai
+    api_key: "api_key"
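+    # `base_url` above should point at a locally deployed, OpenAI-compatible service
+    # hosting the multimodal model (PP-DocBee here); adjust the host, port, and API
+    # key to match your deployment.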
+
+  PromptEngneering:
+    KIE_CommonText:
+      module_name: prompt_engneering
+      task_type: text_kie_prompt_v2
+      
+      task_description: '你是一个信息提取助手,你的任务是从OCR结果中提取每一个问题的答案。
+  
+  请注意:问题可分为两类,你需要**根据问题的语义自动判断任务类型**,并分别使用以下规则:
+  
+  1. **摘录型问题(适用于法规、定义、条款、原句类问题)**
+     - 特征:问题中常出现“是什么”“内容是什么”“规定是什么”等;
+     - 要求:答案必须**逐字摘录自 OCR 文本**,不可改写或简化。
+
+  2. **问答型问题(适用于需要计算、推理、提取数字信息的问题)**
+     - 特征:问题中含“多少”“增长了多少”“是多少元”等;
+     - 要求:可以基于 OCR 文本进行**理解、补全和单位推理**,但需严格以 OCR 信息为依据,不得引入外部常识或主观猜测。
+
+  OCR识别结果使用```符号包围,按图片中从左至右、从上至下排序。
+  问题列表使用[]符号包围。'
+
+      output_format: '输出以 JSON 格式返回,key 为问题内容,value 为对应答案。
+  - 所有答案必须是完整、可理解的语句或片段;
+  - 若 OCR 中无法确定答案,请将 value 设为 "未知";
+  - 严格使用 "未知",不允许使用 null、空字符串、"-" 等其他表示形式。'
+
+      rules_str: '通用规则:
+              1. **内容来源与完整性**
+                - 所有问题的答案必须**完全依据表格中的内容**进行作答;
+                - 回答时应**尽可能详细和完整**,不得省略或自行补充未在表格中明确出现的信息;
+                - 保持**原文的格式、数字、正负号、单位、符号和标点符号**完全一致。
+
+              2. **标点规范**
+                - 如果原文答案句末带有标点符号(如句号、逗号、分号等),请**保留并添加在答案结尾**。
+              3. **单位补全要求**
+                  - 由于评测可能涉及单位识别,**答案中的所有数字后必须添加单位**;
+                  - 如果原文上下文中已明确提供单位,**请直接使用该单位**;
+                  - 如果上下文中未出现单位,**请你根据语义补充一个合理的常见单位**,如“个”“项”“次”“年”“元”等;
+                  - 对于比率或百分比,请**务必添加“%”符号**;
+                  - **禁止省略单位**,也不得以“无单位”或空字符串代替;
+                  - 添加单位时,**直接紧跟在数字后,不允许加括号、引号或任何额外标注符号**。
+
+              4. **上下文语义保持**
+                - 请严格遵循原文语义;
+                - 如果原文描述为“在30分钟内完成”,请**完整回答“在30分钟内完成”**,而**不是简化为“30分钟”**;
+                - 不可断章取义、丢失时间、条件或限制性描述。
+              5. 如无法确定答案,必须填“未知”。
+  摘录型补充规则:
+  - 答案必须逐字摘抄自 OCR 文本;
+  - 不得缩写、改写或断章取义。
+  问答型补充规则:
+  - 可以合理组合 OCR 中的多个片段;
+  - 可进行单位补全和数值归纳;
+  - 不得引入非文本来源内容。'
+      few_shot_demo_text_content:
+      few_shot_demo_key_value_list:
+          
+    KIE_Table:
+      module_name: prompt_engneering
+      task_type: table_kie_prompt_v2
+
+      task_description: '你现在的任务是从输入的html格式的表格内容中提取问题列表中每一个问题的答案。
+          表格内容使用```符号包围,我指定的问题列表使用[]符号包围。'
+
+      output_format: '在返回结果时使用JSON格式,包含多个key-value对,key值为我指定的问题,value值为该问题对应的答案。
+          如果认为表格内容中,对于问题key,没有答案,则将value赋值为"未知"。请只输出json格式的结果,
+          并做json格式校验后返回,不要包含其它多余文字!'
+        
+      rules_str: '通用规则:
+              1. **内容来源与完整性**
+                - 所有问题的答案必须**完全依据表格中的内容**进行作答;
+                - 回答时应**尽可能详细和完整**,不得省略或自行补充未在表格中明确出现的信息;
+                - 保持**原文的格式、数字、正负号、单位、符号和标点符号**完全一致。
+
+              2. **标点规范**
+                - 如果原文答案句末带有标点符号(如句号、逗号、分号等),请**保留并添加在答案结尾**。
+              3. **单位补全要求**
+                  - 由于评测可能涉及单位识别,**答案中的所有数字后必须添加单位**;
+                  - 如果原文上下文中已明确提供单位,**请直接使用该单位**;
+                  - 如果上下文中未出现单位,**请你根据语义补充一个合理的常见单位**,如“个”“项”“次”“年”“元”等;
+                  - 对于比率或百分比,请**务必添加“%”符号**;
+                  - **禁止省略单位**,也不得以“无单位”或空字符串代替;
+                  - 添加单位时,**直接紧跟在数字后,不允许加括号、引号或任何额外标注符号**。
+
+              4. **上下文语义保持**
+                - 请严格遵循原文语义;
+                - 如果原文描述为“在30分钟内完成”,请**完整回答“在30分钟内完成”**,而**不是简化为“30分钟”**;
+                - 不可断章取义、丢失时间、条件或限制性描述。
+              5. 如无法确定答案,必须填“未知”。
+
+              摘录型补充规则:
+              - 答案必须逐字摘抄自 OCR 文本;
+              - 不得缩写、改写或断章取义。
+              问答型补充规则:
+              - 可以合理组合 OCR 中的多个片段;
+              - 可进行单位补全和数值归纳;
+              - 不得引入非文本来源内容。
+  '
+          
+      few_shot_demo_text_content:
+      few_shot_demo_key_value_list:
+
+    Ensemble:
+      module_name: prompt_engneering
+      task_type: ensemble_prompt
+
+      task_description: '你现在的任务是,对于一个问题,对比方法A和方法B的结果,选择更准确的一个回答。
+        问题用```符号包围。'
+      output_format: '请返回JSON格式的结果,包含多个key-value对,key值为我指定的问题,
+        value值为`方法A`或`方法B`。如果对于问题key,没有找到答案,将value赋值为"未知"。
+        请只输出json格式的结果,并做json格式校验后返回,不要包含其它多余文字!'
+      rules_str: '对于涉及数字的问题,请返回与问题描述最相关且数字表述正确的答案。
+        请特别注意数字中的标点使用是否合理。'
+      few_shot_demo_text_content:
+      few_shot_demo_key_value_list:
+
+SubPipelines:
+  LayoutParser:
+    pipeline_name: layout_parsing
+
+    use_doc_preprocessor: True
+    use_general_ocr: True
+    use_seal_recognition: True
+    use_table_recognition: True
+    use_formula_recognition: False
+
+    SubModules:
+      LayoutDetection:
+        module_name: layout_detection
+        model_name: RT-DETR-H_layout_3cls
+        model_dir: null
+
+    SubPipelines:
+      DocPreprocessor:
+        pipeline_name: doc_preprocessor
+        use_doc_orientation_classify: True
+        use_doc_unwarping: True
+        SubModules:
+          DocOrientationClassify:
+            module_name: doc_text_orientation
+            model_name: PP-LCNet_x1_0_doc_ori
+            model_dir: null
+          DocUnwarping:
+            module_name: image_unwarping
+            model_name: UVDoc
+            model_dir: null
+
+      GeneralOCR:
+        pipeline_name: OCR
+        text_type: general
+        use_doc_preprocessor: False
+        use_textline_orientation: True
+        SubModules:
+          TextDetection:
+            module_name: text_detection
+            model_name: PP-OCRv4_server_det
+            model_dir: null
+            limit_side_len: 960
+            limit_type: max
+            max_side_limit: 4000
+            thresh: 0.3
+            box_thresh: 0.6
+            unclip_ratio: 1.5
+          TextLineOrientation:
+            module_name: textline_orientation
+            model_name: PP-LCNet_x0_25_textline_ori 
+            model_dir: null
+            batch_size: 6   
+          TextRecognition:
+            module_name: text_recognition
+            model_name: PP-OCRv4_server_rec_doc
+            model_dir: null
+            batch_size: 6
+            score_thresh: 0.0
+
+      TableRecognition:
+        pipeline_name: table_recognition
+        use_layout_detection: False
+        use_doc_preprocessor: False
+        use_ocr_model: False
+        SubModules:
+          TableStructureRecognition:
+            module_name: table_structure_recognition
+            model_name: SLANet_plus
+            model_dir: null
+
+      SealRecognition:
+        pipeline_name: seal_recognition
+        use_layout_detection: False
+        use_doc_preprocessor: False
+        SubPipelines:
+          SealOCR:
+            pipeline_name: OCR
+            text_type: seal
+            use_doc_preprocessor: False
+            use_textline_orientation: False
+            SubModules:
+              TextDetection:
+                module_name: seal_text_detection
+                model_name: PP-OCRv4_server_seal_det
+                model_dir: null
+                limit_side_len: 736
+                limit_type: min
+                max_side_limit: 4000
+                thresh: 0.2
+                box_thresh: 0.6
+                unclip_ratio: 0.5
+              TextRecognition:
+                module_name: text_recognition
+                model_name: PP-OCRv4_server_rec_doc
+                model_dir: null
+                batch_size: 1
+                score_thresh: 0

+ 74 - 0
deploy/hps/sdk/pipelines/PP-DocTranslation/client/client.py

@@ -0,0 +1,74 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+from pathlib import Path
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+
+def ensure_no_error(output, additional_msg):
+    if output["errorCode"] != 0:
+        print(additional_msg, file=sys.stderr)
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--file", type=str, required=True)
+    parser.add_argument("--target-language", type=str, default="zh")
+    parser.add_argument("--file-type", type=int, choices=[0, 1])
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+
+    input_ = {"file": utils.prepare_input_file(args.file)}
+    if args.file_type is not None:
+        input_["fileType"] = args.file_type
+    if args.no_visualization:
+        input_["visualize"] = False
+
+    output = triton_request(client, "doctrans-visual", input_)
+    ensure_no_error(output, "Failed to analyze the images")
+    result_visual = output["result"]
+
+    markdown_list = []
+    for i, res in enumerate(result_visual["layoutParsingResults"]):
+        print(res["prunedResult"])
+        md_dir = Path(f"markdown_{i}")
+        md_dir.mkdir(exist_ok=True)
+        (md_dir / "doc.md").write_text(res["markdown"]["text"])
+        for img_path, img in res["markdown"]["images"].items():
+            img_path = md_dir / img_path
+            img_path.parent.mkdir(parents=True, exist_ok=True)
+            utils.save_output_file(img, img_path)
+        print(f"Markdown document to be translated is saved at {md_dir / 'doc.md'}")
+        del res["markdown"]["images"]
+        markdown_list.append(res["markdown"])
+        for img_name, img in res["outputImages"].items():
+            img_path = f"{img_name}_{i}.jpg"
+            utils.save_output_file(img, img_path)
+            print(f"Output image saved at {img_path}")
+
+    input_ = {
+        "markdownList": markdown_list,
+        "targetLanguage": args.target_language,
+    }
+    output = triton_request(client, "doctrans-translate", input_)
+    ensure_no_error(output, "Failed to translate the markdown")
+    result_translate = output["result"]
+
+    for i, res in enumerate(result_translate["translationResults"]):
+        md_dir = Path(f"markdown_{i}")
+        (md_dir / "doc_translated.md").write_text(res["markdown"]["text"])
+        print(f"Translated markdown document saved at {md_dir / 'doc_translated.md'}")
+
+
+if __name__ == "__main__":
+    main()
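+
+# Example invocation (hypothetical file name; translates the parsed document into
+# English):
+#
+#   python client.py --file report.pdf --target-language en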

+ 3 - 0
deploy/hps/sdk/pipelines/PP-DocTranslation/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 54 - 0
deploy/hps/sdk/pipelines/PP-DocTranslation/server/model_repo/doctrans-translate/1/model.py

@@ -0,0 +1,54 @@
+from typing import Any, Dict, List
+
+from paddlex_hps_server import BaseTritonPythonModel, schemas
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def get_input_model_type(self):
+        return schemas.pp_doctranslation.TranslateRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_doctranslation.TranslateResult
+
+    def run(self, input, log_id):
+        ori_md_info_list: List[Dict[str, Any]] = []
+        for i, item in enumerate(input.markdownList):
+            ori_md_info_list.append(
+                {
+                    "input_path": None,
+                    "page_index": i,
+                    "markdown_texts": item.text,
+                    "page_continuation_flags": (item.isStart, item.isEnd),
+                }
+            )
+
+        result = self.pipeline.translate(
+            ori_md_info_list,
+            target_language=input.targetLanguage,
+            chunk_size=input.chunkSize,
+            task_description=input.taskDescription,
+            output_format=input.outputFormat,
+            rules_str=input.rulesStr,
+            few_shot_demo_text_content=input.fewShotDemoTextContent,
+            few_shot_demo_key_value_list=input.fewShotDemoKeyValueList,
+            glossary=input.glossary,
+            llm_request_interval=input.llmRequestInterval,
+            chat_bot_config=input.chatBotConfig,
+        )
+
+        translation_results: List[Dict[str, Any]] = []
+        for item in result:
+            translation_results.append(
+                dict(
+                    language=item["language"],
+                    markdown=dict(
+                        text=item["markdown_texts"],
+                        isStart=item["page_continuation_flags"][0],
+                        isEnd=item["page_continuation_flags"][1],
+                    ),
+                )
+            )
+
+        return schemas.pp_doctranslation.TranslateResult(
+            translationResults=translation_results,
+        )
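+
+# The translate step is LLM-only: it takes the markdown produced by `doctrans-visual`,
+# translates it into the requested target language, and returns markdown together
+# with page-continuation flags.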

+ 22 - 0
deploy/hps/sdk/pipelines/PP-DocTranslation/server/model_repo/doctrans-translate/config.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]

+ 179 - 0
deploy/hps/sdk/pipelines/PP-DocTranslation/server/model_repo/doctrans-visual/1/model.py

@@ -0,0 +1,179 @@
+from typing import Any, Dict, Final, List, Tuple
+
+from paddlex_hps_server import (
+    BaseTritonPythonModel,
+    app_common,
+    protocol,
+    schemas,
+    utils,
+)
+from paddlex_hps_server.storage import SupportsGetURL, create_storage
+
+_DEFAULT_MAX_NUM_INPUT_IMGS: Final[int] = 10
+_DEFAULT_MAX_OUTPUT_IMG_SIZE: Final[Tuple[int, int]] = (2000, 2000)
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+
+        self.pipeline.inintial_visual_predictor(self.pipeline.config)
+
+        self.context = {}
+        self.context["file_storage"] = None
+        self.context["return_img_urls"] = False
+        self.context["max_num_input_imgs"] = _DEFAULT_MAX_NUM_INPUT_IMGS
+        self.context["max_output_img_size"] = _DEFAULT_MAX_OUTPUT_IMG_SIZE
+        if self.app_config.extra:
+            if "file_storage" in self.app_config.extra:
+                self.context["file_storage"] = create_storage(
+                    self.app_config.extra["file_storage"]
+                )
+            if "return_img_urls" in self.app_config.extra:
+                self.context["return_img_urls"] = self.app_config.extra[
+                    "return_img_urls"
+                ]
+            if "max_num_input_imgs" in self.app_config.extra:
+                self.context["max_num_input_imgs"] = self.app_config.extra[
+                    "max_num_input_imgs"
+                ]
+            if "max_output_img_size" in self.app_config.extra:
+                self.context["max_output_img_size"] = self.app_config.extra[
+                    "max_output_img_size"
+                ]
+        if self.context["return_img_urls"]:
+            file_storage = self.context["file_storage"]
+            if not file_storage:
+                raise ValueError(
+                    "The file storage must be properly configured when URLs need to be returned."
+                )
+            if not isinstance(file_storage, SupportsGetURL):
+                raise TypeError(f"{type(file_storage)} does not support getting URLs.")
+
+    def get_input_model_type(self):
+        return schemas.pp_doctranslation.AnalyzeImagesRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_doctranslation.AnalyzeImagesResult
+
+    def run(self, input, log_id):
+        if input.fileType is None:
+            if utils.is_url(input.file):
+                maybe_file_type = utils.infer_file_type(input.file)
+                if maybe_file_type is None or not (
+                    maybe_file_type == "PDF" or maybe_file_type == "IMAGE"
+                ):
+                    return protocol.create_aistudio_output_without_result(
+                        422,
+                        "Unsupported file type",
+                        log_id=log_id,
+                    )
+                file_type = maybe_file_type
+            else:
+                return protocol.create_aistudio_output_without_result(
+                    422,
+                    "File type cannot be determined",
+                    log_id=log_id,
+                )
+        else:
+            file_type = "PDF" if input.fileType == 0 else "IMAGE"
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        file_bytes = utils.get_raw_bytes(input.file)
+        images, data_info = utils.file_to_images(
+            file_bytes,
+            file_type,
+            max_num_imgs=self.context["max_num_input_imgs"],
+        )
+
+        result = self.pipeline.visual_predict(
+            images,
+            use_doc_orientation_classify=input.useDocOrientationClassify,
+            use_doc_unwarping=input.useDocUnwarping,
+            use_textline_orientation=input.useTextlineOrientation,
+            use_seal_recognition=input.useSealRecognition,
+            use_table_recognition=input.useTableRecognition,
+            use_formula_recognition=input.useFormulaRecognition,
+            use_chart_recognition=input.useChartRecognition,
+            use_region_detection=input.useRegionDetection,
+            layout_threshold=input.layoutThreshold,
+            layout_nms=input.layoutNms,
+            layout_unclip_ratio=input.layoutUnclipRatio,
+            layout_merge_bboxes_mode=input.layoutMergeBboxesMode,
+            text_det_limit_side_len=input.textDetLimitSideLen,
+            text_det_limit_type=input.textDetLimitType,
+            text_det_thresh=input.textDetThresh,
+            text_det_box_thresh=input.textDetBoxThresh,
+            text_det_unclip_ratio=input.textDetUnclipRatio,
+            text_rec_score_thresh=input.textRecScoreThresh,
+            seal_det_limit_side_len=input.sealDetLimitSideLen,
+            seal_det_limit_type=input.sealDetLimitType,
+            seal_det_thresh=input.sealDetThresh,
+            seal_det_box_thresh=input.sealDetBoxThresh,
+            seal_det_unclip_ratio=input.sealDetUnclipRatio,
+            seal_rec_score_thresh=input.sealRecScoreThresh,
+            use_wired_table_cells_trans_to_html=input.useWiredTableCellsTransToHtml,
+            use_wireless_table_cells_trans_to_html=input.useWirelessTableCellsTransToHtml,
+            use_table_orientation_classify=input.useTableOrientationClassify,
+            use_ocr_results_with_table_cells=input.useOcrResultsWithTableCells,
+            use_e2e_wired_table_rec_model=input.useE2eWiredTableRecModel,
+            use_e2e_wireless_table_rec_model=input.useE2eWirelessTableRecModel,
+        )
+
+        layout_parsing_results: List[Dict[str, Any]] = []
+        for i, (img, item) in enumerate(zip(images, result)):
+            pruned_res = app_common.prune_result(
+                item["layout_parsing_result"].json["res"]
+            )
+            md_data = item["layout_parsing_result"].markdown
+            md_text = md_data["markdown_texts"]
+            md_imgs = app_common.postprocess_images(
+                md_data["markdown_images"],
+                log_id,
+                filename_template=f"markdown_{i}/{{key}}",
+                file_storage=self.context["file_storage"],
+                return_urls=self.context["return_img_urls"],
+                max_img_size=self.context["max_output_img_size"],
+            )
+            md_flags = md_data["page_continuation_flags"]
+            if visualize_enabled:
+                imgs = {
+                    "input_img": img,
+                    **item["layout_parsing_result"].img,
+                }
+                imgs = app_common.postprocess_images(
+                    imgs,
+                    log_id,
+                    filename_template=f"{{key}}_{i}.jpg",
+                    file_storage=self.context["file_storage"],
+                    return_urls=self.context["return_img_urls"],
+                    max_img_size=self.context["max_output_img_size"],
+                )
+            else:
+                imgs = {}
+            layout_parsing_results.append(
+                dict(
+                    prunedResult=pruned_res,
+                    markdown=dict(
+                        text=md_text,
+                        images=md_imgs,
+                        isStart=md_flags[0],
+                        isEnd=md_flags[1],
+                    ),
+                    outputImages=(
+                        {k: v for k, v in imgs.items() if k != "input_img"}
+                        if imgs
+                        else None
+                    ),
+                    inputImage=imgs.get("input_img"),
+                )
+            )
+
+        return schemas.pp_doctranslation.AnalyzeImagesResult(
+            layoutParsingResults=layout_parsing_results,
+            dataInfo=data_info,
+        )

+ 261 - 0
deploy/hps/sdk/pipelines/PP-DocTranslation/server/pipeline_config.yaml

@@ -0,0 +1,261 @@
+
+pipeline_name: PP-DocTranslation
+
+use_layout_parser: True
+
+SubModules:
+  LLM_Chat:
+    module_name: chat_bot
+    model_name: ernie-3.5-8k
+    base_url: "https://qianfan.baidubce.com/v2"
+    api_type: openai
+    api_key: "api_key" # Set this to a real API key
+
+  PromptEngneering:
+    Translate_CommonText:
+      module_name: prompt_engneering
+      task_type: translate_prompt
+      
+      task_description: '你是一位资深的多语种语言翻译专家,精通多种语言的语法、词汇、文化背景以及语言风格。你的任务是将文本从一种语言准确地转换为另一种语言,同时精准地保留原文的语义、风格和语调,确保翻译内容在目标语言中自然流畅且富有文化适应性。'
+
+      output_format: '输出应为翻译后的文本,并与原文保持格式一致,包括标点符号和段落结构。如果原文中包含特定的格式(如表格、公式、列表等),翻译后的文本也应保持相同的格式。'
+
+      rules_str: '通用规则:
+              1. 翻译应确保语义准确完整,并符合目标语言的表达习惯。
+              2. 保留原文的风格和语调,以传达相同的情感和意图。
+              3. 专有名词(如人名、地名、品牌名等)应保持不变,除非它们在目标语言中有公认的翻译。
+              4. 文化特定的表达或成语需根据目标语言的文化背景进行适当的转换或解释。
+              5. 避免使用机器翻译工具的简单直译,需根据上下文进行调整和优化。
+              6. 原文中可能包含的非文本元素(如HTML语法中的图片、表格、公式等)应保持不变。
+              7. 原文中可能包含的代码块,如编程语言代码等,应保持代码块的完整性,不要对代码进行调整。
+              8. 翻译完成后,应仔细校对,确保没有语法和拼写错误。'
+      few_shot_demo_text_content:
+      few_shot_demo_key_value_list:
+
+SubPipelines:
+  LayoutParser:
+    pipeline_name: PP-StructureV3
+
+    batch_size: 8
+
+    use_doc_preprocessor: True
+    use_seal_recognition: True
+    use_table_recognition: True
+    use_formula_recognition: True
+    use_chart_recognition: True
+    use_region_detection: True
+
+    SubModules:
+      LayoutDetection:
+        module_name: layout_detection
+        model_name: PP-DocLayout_plus-L
+        model_dir: null
+        batch_size: 8
+        threshold: 
+          0: 0.3  # paragraph_title
+          1: 0.5  # image
+          2: 0.4  # text
+          3: 0.5  # number
+          4: 0.5  # abstract
+          5: 0.5  # content
+          6: 0.5  # figure_table_chart_title
+          7: 0.3  # formula
+          8: 0.5  # table
+          9: 0.5  # reference
+          10: 0.5 # doc_title
+          11: 0.5 # footnote
+          12: 0.5 # header
+          13: 0.5 # algorithm
+          14: 0.5 # footer
+          15: 0.45 # seal
+          16: 0.5 # chart
+          17: 0.5 # formula_number
+          18: 0.5 # aside_text
+          19: 0.5 # reference_content
+        layout_nms: True
+        layout_unclip_ratio: [1.0, 1.0] 
+        layout_merge_bboxes_mode: 
+          0: "large"  # paragraph_title
+          1: "large"  # image
+          2: "union"  # text
+          3: "union"  # number
+          4: "union"  # abstract
+          5: "union"  # content
+          6: "union"  # figure_table_chart_title
+          7: "large"  # formula
+          8: "union"  # table
+          9: "union"  # reference
+          10: "union" # doc_title
+          11: "union" # footnote
+          12: "union" # header
+          13: "union" # algorithm
+          14: "union" # footer
+          15: "union" # seal
+          16: "large" # chart
+          17: "union" # formula_number
+          18: "union" # aside_text
+          19: "union" # reference_content
+      ChartRecognition:
+        module_name: chart_recognition
+        model_name: PP-Chart2Table
+        model_dir: null
+        batch_size: 1 
+      RegionDetection:
+        module_name: layout_detection
+        model_name: PP-DocBlockLayout
+        model_dir: null
+        layout_nms: True
+        layout_merge_bboxes_mode: "small"
+
+    SubPipelines:
+      DocPreprocessor:
+        pipeline_name: doc_preprocessor
+        batch_size: 8
+        use_doc_orientation_classify: True
+        use_doc_unwarping: True
+        SubModules:
+          DocOrientationClassify:
+            module_name: doc_text_orientation
+            model_name: PP-LCNet_x1_0_doc_ori
+            model_dir: null
+            batch_size: 8
+          DocUnwarping:
+            module_name: image_unwarping
+            model_name: UVDoc
+            model_dir: null
+
+      GeneralOCR:
+        pipeline_name: OCR
+        batch_size: 8
+        text_type: general
+        use_doc_preprocessor: False
+        use_textline_orientation: True
+        SubModules:
+          TextDetection:
+            module_name: text_detection
+            model_name: PP-OCRv5_server_det
+            model_dir: null
+            limit_side_len: 736
+            limit_type: min
+            max_side_limit: 4000
+            thresh: 0.3
+            box_thresh: 0.6
+            unclip_ratio: 1.5
+          TextLineOrientation:
+            module_name: textline_orientation
+            model_name: PP-LCNet_x1_0_textline_ori
+            model_dir: null
+            batch_size: 8
+          TextRecognition:
+            module_name: text_recognition
+            model_name: PP-OCRv5_server_rec
+            model_dir: null
+            batch_size: 8
+            score_thresh: 0.0
+    
+
+      TableRecognition:
+        pipeline_name: table_recognition_v2
+        use_layout_detection: False
+        use_doc_preprocessor: False
+        use_ocr_model: False
+        SubModules:  
+          TableClassification:
+            module_name: table_classification
+            model_name: PP-LCNet_x1_0_table_cls
+            model_dir: null
+
+          WiredTableStructureRecognition:
+            module_name: table_structure_recognition
+            model_name: SLANeXt_wired
+            model_dir: null
+          
+          WirelessTableStructureRecognition:
+            module_name: table_structure_recognition
+            model_name: SLANet_plus
+            model_dir: null
+          
+          WiredTableCellsDetection:
+            module_name: table_cells_detection
+            model_name: RT-DETR-L_wired_table_cell_det
+            model_dir: null
+          
+          WirelessTableCellsDetection:
+            module_name: table_cells_detection
+            model_name: RT-DETR-L_wireless_table_cell_det
+            model_dir: null
+
+          TableOrientationClassify:
+            module_name: doc_text_orientation
+            model_name: PP-LCNet_x1_0_doc_ori
+            model_dir: null
+        SubPipelines:
+          GeneralOCR:
+            pipeline_name: OCR
+            text_type: general
+            use_doc_preprocessor: False
+            use_textline_orientation: True
+            SubModules:
+              TextDetection:
+                module_name: text_detection
+                model_name: PP-OCRv5_server_det
+                model_dir: null
+                limit_side_len: 736
+                limit_type: min
+                max_side_limit: 4000
+                thresh: 0.3
+                box_thresh: 0.4
+                unclip_ratio: 1.5
+              TextLineOrientation:
+                module_name: textline_orientation
+                model_name: PP-LCNet_x1_0_textline_ori
+                model_dir: null
+                batch_size: 8
+              TextRecognition:
+                module_name: text_recognition
+                model_name: PP-OCRv5_server_rec
+                model_dir: null
+                batch_size: 8
+                score_thresh: 0.0
+
+      SealRecognition:
+        pipeline_name: seal_recognition
+        batch_size: 8
+        use_layout_detection: False
+        use_doc_preprocessor: False
+        SubPipelines:
+          SealOCR:
+            pipeline_name: OCR
+            batch_size: 8
+            text_type: seal
+            use_doc_preprocessor: False
+            use_textline_orientation: False
+            SubModules:
+              TextDetection:
+                module_name: seal_text_detection
+                model_name: PP-OCRv4_server_seal_det
+                model_dir: null
+                limit_side_len: 736
+                limit_type: min
+                max_side_limit: 4000
+                thresh: 0.2
+                box_thresh: 0.6
+                unclip_ratio: 0.5
+              TextRecognition:
+                module_name: text_recognition
+                model_name: PP-OCRv5_server_rec
+                model_dir: null
+                batch_size: 8
+                score_thresh: 0
+        
+      FormulaRecognition:
+        pipeline_name: formula_recognition
+        batch_size: 8
+        use_layout_detection: False
+        use_doc_preprocessor: False
+        SubModules:
+          FormulaRecognition:
+            module_name: formula_recognition
+            model_name: PP-FormulaNet_plus-L
+            model_dir: null
+            batch_size: 8

+ 120 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/client/client.py

@@ -0,0 +1,120 @@
+#!/usr/bin/env python
+
+import argparse
+import pprint
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+OUTPUT_IMAGE_PATH = "out.jpg"
+
+
+def parse_image_label_pairs(image_label_pairs):
+    if len(image_label_pairs) % 2 != 0:
+        raise ValueError("The number of image-label pairs must be even.")
+    return [
+        {"image": utils.prepare_input_file(img), "label": lab}
+        for img, lab in zip(image_label_pairs[0::2], image_label_pairs[1::2])
+    ]
+
+
+def create_triton_client(url):
+    return triton_grpc.InferenceServerClient(url)
+
+
+def ensure_no_error(output):
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+
+
+def do_index_build(args):
+    client = create_triton_client(args.url)
+    if args.image_label_pairs:
+        image_label_pairs = parse_image_label_pairs(args.image_label_pairs)
+    else:
+        image_label_pairs = []
+    input_ = {"imageLabelPairs": image_label_pairs}
+    output = triton_request(client, "shitu-index-build", input_)
+    ensure_no_error(output)
+    result = output["result"]
+    pprint.pp(result)
+
+
+def do_index_add(args):
+    client = create_triton_client(args.url)
+    image_label_pairs = parse_image_label_pairs(args.image_label_pairs)
+    input_ = {"imageLabelPairs": image_label_pairs}
+    if args.index_key is not None:
+        input_["indexKey"] = args.index_key
+    output = triton_request(client, "shitu-index-add", input_)
+    ensure_no_error(output)
+    result = output["result"]
+    pprint.pp(result)
+
+
+def do_index_remove(args):
+    client = create_triton_client(args.url)
+    input_ = {"ids": args.ids}
+    if args.index_key is not None:
+        input_["indexKey"] = args.index_key
+    output = triton_request(client, "shitu-index-remove", input_)
+    ensure_no_error(output)
+    result = output["result"]
+    pprint.pp(result)
+
+
+def do_infer(args):
+    client = create_triton_client(args.url)
+    input_ = {"image": utils.prepare_input_file(args.image)}
+    if args.index_key is not None:
+        input_["indexKey"] = args.index_key
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "shitu-infer", input_)
+    ensure_no_error(output)
+    result = output["result"]
+    utils.save_output_file(result["image"], OUTPUT_IMAGE_PATH)
+    print(f"Output image saved at {OUTPUT_IMAGE_PATH}")
+    print("\nDetected objects:")
+    pprint.pp(result["detectedObjects"])
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    subparsers = parser.add_subparsers(dest="cmd")
+
+    parser_index_build = subparsers.add_parser("index-build")
+    parser_index_build.add_argument("--image-label-pairs", type=str, nargs="+")
+    parser_index_build.set_defaults(func=do_index_build)
+
+    parser_index_add = subparsers.add_parser("index-add")
+    parser_index_add.add_argument(
+        "--image-label-pairs", type=str, nargs="+", required=True
+    )
+    parser_index_add.add_argument("--index-key", type=str, required=True)
+    parser_index_add.set_defaults(func=do_index_add)
+
+    parser_index_remove = subparsers.add_parser("index-remove")
+    parser_index_remove.add_argument("--ids", type=int, nargs="+", required=True)
+    parser_index_remove.add_argument("--index-key", type=str, required=True)
+    parser_index_remove.set_defaults(func=do_index_remove)
+
+    parser_infer = subparsers.add_parser("infer")
+    parser_infer.add_argument("--image", type=str, required=True)
+    parser_infer.add_argument("--index-key", type=str)
+    parser_infer.add_argument("--no-visualization", action="store_true")
+    parser_infer.set_defaults(func=do_infer)
+
+    args = parser.parse_args()
+
+    args.func(args)
+
+
+if __name__ == "__main__":
+    main()
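+
+# Example workflow (hypothetical image paths and labels; the index key printed by
+# `index-build` is reused in the later commands):
+#
+#   python client.py index-build --image-label-pairs cat_1.jpg cat dog_1.jpg dog
+#   python client.py index-add --index-key <index-key> --image-label-pairs cat_2.jpg cat
+#   python client.py infer --image query.jpg --index-key <index-key>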

+ 3 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 3 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/server/.isort.cfg

@@ -0,0 +1,3 @@
+[settings]
+profile=black
+known_first_party=common

+ 35 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-index-add/1/model.py

@@ -0,0 +1,35 @@
+from operator import attrgetter
+
+from paddlex.inference.pipelines.components import IndexData
+from paddlex_hps_server import schemas, utils
+
+from common.base_model import BaseShiTuModel
+
+
+class TritonPythonModel(BaseShiTuModel):
+    def get_input_model_type(self):
+        return schemas.pp_shituv2.AddImagesToIndexRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_shituv2.AddImagesToIndexResult
+
+    def run(self, input, log_id):
+        file_bytes_list = [
+            utils.get_raw_bytes(img)
+            for img in map(attrgetter("image"), input.imageLabelPairs)
+        ]
+        images = [utils.image_bytes_to_array(item) for item in file_bytes_list]
+        labels = [pair.label for pair in input.imageLabelPairs]
+
+        index_storage = self.context["index_storage"]
+        index_data_bytes = index_storage.get(input.indexKey)
+        index_data = IndexData.from_bytes(index_data_bytes)
+
+        index_data = self.pipeline.append_index(images, labels, index_data)
+
+        index_data_bytes = index_data.to_bytes()
+        index_storage.set(input.indexKey, index_data_bytes)
+
+        return schemas.pp_shituv2.AddImagesToIndexResult(
+            imageCount=len(index_data.id_map)
+        )

+ 42 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-index-build/1/model.py

@@ -0,0 +1,42 @@
+import uuid
+from operator import attrgetter
+
+from paddlex_hps_server import schemas, utils
+
+from common.base_model import BaseShiTuModel
+
+
+def _generate_index_key():
+    return str(uuid.uuid4())
+
+
+class TritonPythonModel(BaseShiTuModel):
+    def get_input_model_type(self):
+        return schemas.pp_shituv2.BuildIndexRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_shituv2.BuildIndexResult
+
+    def run(self, input, log_id):
+        file_bytes_list = [
+            utils.get_raw_bytes(img)
+            for img in map(attrgetter("image"), input.imageLabelPairs)
+        ]
+        images = [utils.image_bytes_to_array(item) for item in file_bytes_list]
+        labels = [pair.label for pair in input.imageLabelPairs]
+
+        index_data = self.pipeline.build_index(
+            images,
+            labels,
+            index_type="Flat",
+            metric_type="IP",
+        )
+
+        index_storage = self.context["index_storage"]
+        index_key = _generate_index_key()
+        index_data_bytes = index_data.to_bytes()
+        index_storage.set(index_key, index_data_bytes)
+
+        return schemas.pp_shituv2.BuildIndexResult(
+            indexKey=index_key, imageCount=len(index_data.id_map)
+        )
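+
+# The returned `indexKey` (a random UUID) identifies the stored index; clients pass it
+# back to the index-add, index-remove, and infer endpoints to select this index.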

+ 26 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-index-remove/1/model.py

@@ -0,0 +1,26 @@
+from paddlex.inference.pipelines.components import IndexData
+from paddlex_hps_server import schemas
+
+from common.base_model import BaseShiTuModel
+
+
+class TritonPythonModel(BaseShiTuModel):
+    def get_input_model_type(self):
+        return schemas.pp_shituv2.RemoveImagesFromIndexRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_shituv2.RemoveImagesFromIndexResult
+
+    def run(self, input, log_id):
+        index_storage = self.context["index_storage"]
+        index_data_bytes = index_storage.get(input.indexKey)
+        index_data = IndexData.from_bytes(index_data_bytes)
+
+        index_data = self.pipeline.remove_index(input.ids, index_data)
+
+        index_data_bytes = index_data.to_bytes()
+        index_storage.set(input.indexKey, index_data_bytes)
+
+        return schemas.pp_shituv2.RemoveImagesFromIndexResult(
+            imageCount=len(index_data.id_map)
+        )

+ 22 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-index-remove/config.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]

+ 66 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/server/model_repo/shitu-infer/1/model.py

@@ -0,0 +1,66 @@
+from typing import Any, Dict, List
+
+from paddlex.inference.pipelines.components import IndexData
+from paddlex_hps_server import schemas, utils
+
+from common.base_model import BaseShiTuModel
+
+
+class TritonPythonModel(BaseShiTuModel):
+    def get_input_model_type(self):
+        return schemas.pp_shituv2.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_shituv2.InferResult
+
+    def run(self, input, log_id):
+        image_bytes = utils.get_raw_bytes(input.image)
+        image = utils.image_bytes_to_array(image_bytes)
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        if input.indexKey is not None:
+            index_storage = self.context["index_storage"]
+            index_data_bytes = index_storage.get(input.indexKey)
+            index_data = IndexData.from_bytes(index_data_bytes)
+        else:
+            index_data = None
+
+        result = list(
+            self.pipeline(
+                image,
+                index=index_data,
+                det_threshold=input.detThreshold,
+                rec_threshold=input.recThreshold,
+                hamming_radius=input.hammingRadius,
+                topk=input.topk,
+            )
+        )[0]
+
+        objs: List[Dict[str, Any]] = []
+        for obj in result["boxes"]:
+            rec_results: List[Dict[str, Any]] = []
+            if obj["rec_scores"] != [None]:
+                for label, score in zip(obj["labels"], obj["rec_scores"]):
+                    rec_results.append(
+                        dict(
+                            label=label,
+                            score=score,
+                        )
+                    )
+            objs.append(
+                dict(
+                    bbox=obj["coordinate"],
+                    recResults=rec_results,
+                    score=obj["det_score"],
+                )
+            )
+        if visualize_enabled:
+            output_image_base64 = utils.base64_encode(
+                utils.image_to_bytes(result.img["res"])
+            )
+        else:
+            output_image_base64 = None
+
+        return schemas.pp_shituv2.InferResult(
+            detectedObjects=objs, image=output_image_base64
+        )

+ 18 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/server/pipeline_config.yaml

@@ -0,0 +1,18 @@
+pipeline_name: PP-ShiTuV2
+
+index: None
+det_threshold: 0.5
+rec_threshold: 0.5
+rec_topk: 5
+
+SubModules:
+  Detection:
+    module_name: text_detection
+    model_name: PP-ShiTuV2_det
+    model_dir: null
+    batch_size: 1    
+  Recognition:
+    module_name: text_recognition
+    model_name: PP-ShiTuV2_rec
+    model_dir: null
+    batch_size: 1

+ 0 - 0
libs/ultra-infer/ultra_infer/CMakeLists.txt → deploy/hps/sdk/pipelines/PP-ShiTuV2/server/shared_mods/common/__init__.py


+ 19 - 0
deploy/hps/sdk/pipelines/PP-ShiTuV2/server/shared_mods/common/base_model.py

@@ -0,0 +1,19 @@
+from paddlex_hps_server import BaseTritonPythonModel
+from paddlex_hps_server.storage import create_storage
+
+# Do we need a lock?
+DEFAULT_INDEX_DIR = ".indexes"
+
+
+class BaseShiTuModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+        self.context = {}
+        if self.app_config.extra and "index_storage" in self.app_config.extra:
+            self.context["index_storage"] = create_storage(
+                self.app_config.extra["index_storage"]
+            )
+        else:
+            self.context["index_storage"] = create_storage(
+                {"type": "file_system", "directory": DEFAULT_INDEX_DIR}
+            )
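+
+# If the app config does not define "index_storage", index data is persisted with a
+# file-system backend under the local `.indexes` directory.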

+ 49 - 0
deploy/hps/sdk/pipelines/PP-StructureV3/client/client.py

@@ -0,0 +1,49 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+from pathlib import Path
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--file", type=str, required=True)
+    parser.add_argument("--file-type", type=int, choices=[0, 1])
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"file": utils.prepare_input_file(args.file)}
+    if args.file_type is not None:
+        input_["fileType"] = args.file_type
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "layout-parsing", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    for i, res in enumerate(result["layoutParsingResults"]):
+        print(res["prunedResult"])
+        md_dir = Path(f"markdown_{i}")
+        md_dir.mkdir(exist_ok=True)
+        (md_dir / "doc.md").write_text(res["markdown"]["text"])
+        for img_path, img in res["markdown"]["images"].items():
+            img_path = md_dir / img_path
+            img_path.parent.mkdir(parents=True, exist_ok=True)
+            utils.save_output_file(img, img_path)
+        print(f"Markdown document saved at {md_dir / 'doc.md'}")
+        for img_name, img in res["outputImages"].items():
+            img_path = f"{img_name}_{i}.jpg"
+            utils.save_output_file(img, img_path)
+            print(f"Output image saved at {img_path}")
+
+
+if __name__ == "__main__":
+    main()
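+
+# Example invocation (hypothetical file name; the server is assumed to be reachable
+# at the default gRPC address):
+#
+#   python client.py --file paper.pdf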

+ 3 - 0
deploy/hps/sdk/pipelines/PP-StructureV3/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 172 - 0
deploy/hps/sdk/pipelines/PP-StructureV3/server/model_repo/layout-parsing/1/model.py

@@ -0,0 +1,172 @@
+from typing import Any, Dict, Final, List, Tuple
+
+from paddlex_hps_server import (
+    BaseTritonPythonModel,
+    app_common,
+    protocol,
+    schemas,
+    utils,
+)
+from paddlex_hps_server.storage import SupportsGetURL, create_storage
+
+_DEFAULT_MAX_NUM_INPUT_IMGS: Final[int] = 10
+_DEFAULT_MAX_OUTPUT_IMG_SIZE: Final[Tuple[int, int]] = (2000, 2000)
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+        self.context = {}
+        self.context["file_storage"] = None
+        self.context["return_img_urls"] = False
+        self.context["max_num_input_imgs"] = _DEFAULT_MAX_NUM_INPUT_IMGS
+        self.context["max_output_img_size"] = _DEFAULT_MAX_OUTPUT_IMG_SIZE
+        if self.app_config.extra:
+            if "file_storage" in self.app_config.extra:
+                self.context["file_storage"] = create_storage(
+                    self.app_config.extra["file_storage"]
+                )
+            if "return_img_urls" in self.app_config.extra:
+                self.context["return_img_urls"] = self.app_config.extra[
+                    "return_img_urls"
+                ]
+            if "max_num_input_imgs" in self.app_config.extra:
+                self.context["max_num_input_imgs"] = self.app_config.extra[
+                    "max_num_input_imgs"
+                ]
+            if "max_output_img_size" in self.app_config.extra:
+                self.context["max_output_img_size"] = self.app_config.extra[
+                    "max_output_img_size"
+                ]
+        if self.context["return_img_urls"]:
+            file_storage = self.context["file_storage"]
+            if not file_storage:
+                raise ValueError(
+                    "The file storage must be properly configured when URLs need to be returned."
+                )
+            if not isinstance(file_storage, SupportsGetURL):
+                raise TypeError(f"{type(file_storage)} does not support getting URLs.")
+
+    def get_input_model_type(self):
+        return schemas.pp_structurev3.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.pp_structurev3.InferResult
+
+    def run(self, input, log_id):
+        if input.fileType is None:
+            if utils.is_url(input.file):
+                maybe_file_type = utils.infer_file_type(input.file)
+                if maybe_file_type is None or not (
+                    maybe_file_type == "PDF" or maybe_file_type == "IMAGE"
+                ):
+                    return protocol.create_aistudio_output_without_result(
+                        422,
+                        "Unsupported file type",
+                        log_id=log_id,
+                    )
+                file_type = maybe_file_type
+            else:
+                return protocol.create_aistudio_output_without_result(
+                    422,
+                    "File type cannot be determined",
+                    log_id=log_id,
+                )
+        else:
+            file_type = "PDF" if input.fileType == 0 else "IMAGE"
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        file_bytes = utils.get_raw_bytes(input.file)
+        images, data_info = utils.file_to_images(
+            file_bytes,
+            file_type,
+            max_num_imgs=self.context["max_num_input_imgs"],
+        )
+
+        result = list(
+            self.pipeline(
+                images,
+                use_doc_orientation_classify=input.useDocOrientationClassify,
+                use_doc_unwarping=input.useDocUnwarping,
+                use_textline_orientation=input.useTextlineOrientation,
+                use_seal_recognition=input.useSealRecognition,
+                use_table_recognition=input.useTableRecognition,
+                use_formula_recognition=input.useFormulaRecognition,
+                use_chart_recognition=input.useChartRecognition,
+                use_region_detection=input.useRegionDetection,
+                layout_threshold=input.layoutThreshold,
+                layout_nms=input.layoutNms,
+                layout_unclip_ratio=input.layoutUnclipRatio,
+                layout_merge_bboxes_mode=input.layoutMergeBboxesMode,
+                text_det_limit_side_len=input.textDetLimitSideLen,
+                text_det_limit_type=input.textDetLimitType,
+                text_det_thresh=input.textDetThresh,
+                text_det_box_thresh=input.textDetBoxThresh,
+                text_det_unclip_ratio=input.textDetUnclipRatio,
+                text_rec_score_thresh=input.textRecScoreThresh,
+                seal_det_limit_side_len=input.sealDetLimitSideLen,
+                seal_det_limit_type=input.sealDetLimitType,
+                seal_det_thresh=input.sealDetThresh,
+                seal_det_box_thresh=input.sealDetBoxThresh,
+                seal_det_unclip_ratio=input.sealDetUnclipRatio,
+                seal_rec_score_thresh=input.sealRecScoreThresh,
+                use_wired_table_cells_trans_to_html=input.useWiredTableCellsTransToHtml,
+                use_wireless_table_cells_trans_to_html=input.useWirelessTableCellsTransToHtml,
+                use_table_orientation_classify=input.useTableOrientationClassify,
+                use_ocr_results_with_table_cells=input.useOcrResultsWithTableCells,
+                use_e2e_wired_table_rec_model=input.useE2eWiredTableRecModel,
+                use_e2e_wireless_table_rec_model=input.useE2eWirelessTableRecModel,
+            )
+        )
+
+        layout_parsing_results: List[Dict[str, Any]] = []
+        for i, (img, item) in enumerate(zip(images, result)):
+            pruned_res = app_common.prune_result(item.json["res"])
+            md_data = item.markdown
+            md_text = md_data["markdown_texts"]
+            md_imgs = app_common.postprocess_images(
+                md_data["markdown_images"],
+                log_id,
+                filename_template=f"markdown_{i}/{{key}}",
+                file_storage=self.context["file_storage"],
+                return_urls=self.context["return_img_urls"],
+                max_img_size=self.context["max_output_img_size"],
+            )
+            md_flags = md_data["page_continuation_flags"]
+            if visualize_enabled:
+                imgs = {
+                    "input_img": img,
+                    **item.img,
+                }
+                imgs = app_common.postprocess_images(
+                    imgs,
+                    log_id,
+                    filename_template=f"{{key}}_{i}.jpg",
+                    file_storage=self.context["file_storage"],
+                    return_urls=self.context["return_img_urls"],
+                    max_img_size=self.context["max_output_img_size"],
+                )
+            else:
+                imgs = {}
+            layout_parsing_results.append(
+                dict(
+                    prunedResult=pruned_res,
+                    markdown=dict(
+                        text=md_text,
+                        images=md_imgs,
+                        isStart=md_flags[0],
+                        isEnd=md_flags[1],
+                    ),
+                    outputImages=(
+                        {k: v for k, v in imgs.items() if k != "input_img"}
+                        if imgs
+                        else None
+                    ),
+                    inputImage=imgs.get("input_img"),
+                )
+            )
+
+        return schemas.pp_structurev3.InferResult(
+            layoutParsingResults=layout_parsing_results,
+            dataInfo=data_info,
+        )

+ 226 - 0
deploy/hps/sdk/pipelines/PP-StructureV3/server/pipeline_config.yaml

@@ -0,0 +1,226 @@
+
+pipeline_name: PP-StructureV3
+
+batch_size: 8
+
+use_doc_preprocessor: True
+use_seal_recognition: True
+use_table_recognition: True
+use_formula_recognition: True
+use_chart_recognition: True
+use_region_detection: True
+
+SubModules:
+  LayoutDetection:
+    module_name: layout_detection
+    model_name: PP-DocLayout_plus-L
+    model_dir: null
+    batch_size: 8
+    threshold: 
+      0: 0.3  # paragraph_title
+      1: 0.5  # image
+      2: 0.4  # text
+      3: 0.5  # number
+      4: 0.5  # abstract
+      5: 0.5  # content
+      6: 0.5  # figure_table_chart_title
+      7: 0.3  # formula
+      8: 0.5  # table
+      9: 0.5  # reference
+      10: 0.5 # doc_title
+      11: 0.5 # footnote
+      12: 0.5 # header
+      13: 0.5 # algorithm
+      14: 0.5 # footer
+      15: 0.45 # seal
+      16: 0.5 # chart
+      17: 0.5 # formula_number
+      18: 0.5 # aside_text
+      19: 0.5 # reference_content
+    layout_nms: True
+    layout_unclip_ratio: [1.0, 1.0] 
+    layout_merge_bboxes_mode: 
+      0: "large"  # paragraph_title
+      1: "large"  # image
+      2: "union"  # text
+      3: "union"  # number
+      4: "union"  # abstract
+      5: "union"  # content
+      6: "union"  # figure_table_chart_title
+      7: "large"  # formula
+      8: "union"  # table
+      9: "union"  # reference
+      10: "union" # doc_title
+      11: "union" # footnote
+      12: "union" # header
+      13: "union" # algorithm
+      14: "union" # footer
+      15: "union" # seal
+      16: "large" # chart
+      17: "union" # formula_number
+      18: "union" # aside_text
+      19: "union" # reference_content
+  ChartRecognition:
+    module_name: chart_recognition
+    model_name: PP-Chart2Table
+    model_dir: null
+    batch_size: 1 
+  RegionDetection:
+    module_name: layout_detection
+    model_name: PP-DocBlockLayout
+    model_dir: null
+    layout_nms: True
+    layout_merge_bboxes_mode: "small"
+
+SubPipelines:
+  DocPreprocessor:
+    pipeline_name: doc_preprocessor
+    batch_size: 8
+    use_doc_orientation_classify: True
+    use_doc_unwarping: True
+    SubModules:
+      DocOrientationClassify:
+        module_name: doc_text_orientation
+        model_name: PP-LCNet_x1_0_doc_ori
+        model_dir: null
+        batch_size: 8
+      DocUnwarping:
+        module_name: image_unwarping
+        model_name: UVDoc
+        model_dir: null
+
+  GeneralOCR:
+    pipeline_name: OCR
+    batch_size: 8
+    text_type: general
+    use_doc_preprocessor: False
+    use_textline_orientation: True
+    SubModules:
+      TextDetection:
+        module_name: text_detection
+        model_name: PP-OCRv5_server_det
+        model_dir: null
+        limit_side_len: 736
+        limit_type: min
+        max_side_limit: 4000
+        thresh: 0.3
+        box_thresh: 0.6
+        unclip_ratio: 1.5
+      TextLineOrientation:
+        module_name: textline_orientation
+        model_name: PP-LCNet_x0_25_textline_ori
+        model_dir: null
+        batch_size: 8
+      TextRecognition:
+        module_name: text_recognition
+        model_name: PP-OCRv5_server_rec
+        model_dir: null
+        batch_size: 8
+        score_thresh: 0.0
+ 
+
+  TableRecognition:
+    pipeline_name: table_recognition_v2
+    use_layout_detection: False
+    use_doc_preprocessor: False
+    use_ocr_model: False
+    SubModules:  
+      TableClassification:
+        module_name: table_classification
+        model_name: PP-LCNet_x1_0_table_cls
+        model_dir: null
+
+      WiredTableStructureRecognition:
+        module_name: table_structure_recognition
+        model_name: SLANeXt_wired
+        model_dir: null
+      
+      WirelessTableStructureRecognition:
+        module_name: table_structure_recognition
+        model_name: SLANet_plus
+        model_dir: null
+      
+      WiredTableCellsDetection:
+        module_name: table_cells_detection
+        model_name: RT-DETR-L_wired_table_cell_det
+        model_dir: null
+      
+      WirelessTableCellsDetection:
+        module_name: table_cells_detection
+        model_name: RT-DETR-L_wireless_table_cell_det
+        model_dir: null
+
+      TableOrientationClassify:
+        module_name: doc_text_orientation
+        model_name: PP-LCNet_x1_0_doc_ori
+        model_dir: null
+    SubPipelines:
+      GeneralOCR:
+        pipeline_name: OCR
+        text_type: general
+        use_doc_preprocessor: False
+        use_textline_orientation: True
+        SubModules:
+          TextDetection:
+            module_name: text_detection
+            model_name: PP-OCRv5_server_det
+            model_dir: null
+            limit_side_len: 736
+            limit_type: min
+            max_side_limit: 4000
+            thresh: 0.3
+            box_thresh: 0.4
+            unclip_ratio: 1.5
+          TextLineOrientation:
+            module_name: textline_orientation
+            model_name: PP-LCNet_x0_25_textline_ori
+            model_dir: null
+            batch_size: 8
+          TextRecognition:
+            module_name: text_recognition
+            model_name: PP-OCRv5_server_rec
+            model_dir: null
+            batch_size: 8
+            score_thresh: 0.0
+
+  SealRecognition:
+    pipeline_name: seal_recognition
+    batch_size: 8
+    use_layout_detection: False
+    use_doc_preprocessor: False
+    SubPipelines:
+      SealOCR:
+        pipeline_name: OCR
+        batch_size: 8
+        text_type: seal
+        use_doc_preprocessor: False
+        use_textline_orientation: False
+        SubModules:
+          TextDetection:
+            module_name: seal_text_detection
+            model_name: PP-OCRv4_server_seal_det
+            model_dir: null
+            limit_side_len: 736
+            limit_type: min
+            max_side_limit: 4000
+            thresh: 0.2
+            box_thresh: 0.6
+            unclip_ratio: 0.5
+          TextRecognition:
+            module_name: text_recognition
+            model_name: PP-OCRv5_server_rec
+            model_dir: null
+            batch_size: 8
+            score_thresh: 0
+    
+  FormulaRecognition:
+    pipeline_name: formula_recognition
+    batch_size: 8
+    use_layout_detection: False
+    use_doc_preprocessor: False
+    SubModules:
+      FormulaRecognition:
+        module_name: formula_recognition
+        model_name: PP-FormulaNet_plus-L
+        model_dir: null
+        batch_size: 8

+ 36 - 0
deploy/hps/sdk/pipelines/anomaly_detection/client/client.py

@@ -0,0 +1,36 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+OUTPUT_IMAGE_PATH = "out.jpg"
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--image", type=str, required=True)
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"image": utils.prepare_input_file(args.image)}
+
+    if args.no_visualization:
+        input_["visualize"] = False
+
+    output = triton_request(client, "anomaly-detection", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    utils.save_output_file(result["image"], OUTPUT_IMAGE_PATH)
+    print(f"Output image saved at {OUTPUT_IMAGE_PATH}")
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/anomaly_detection/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 31 - 0
deploy/hps/sdk/pipelines/anomaly_detection/server/model_repo/anomaly-detection/1/model.py

@@ -0,0 +1,31 @@
+from paddlex_hps_server import BaseTritonPythonModel, schemas, utils
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def get_input_model_type(self):
+        return schemas.anomaly_detection.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.anomaly_detection.InferResult
+
+    def run(self, input, log_id):
+        file_bytes = utils.get_raw_bytes(input.image)
+        image = utils.image_bytes_to_array(file_bytes)
+
+        result = list(self.pipeline.predict(image))[0]
+
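+        # Flatten the 2-D anomaly mask into a row-major label map and record
+        # its [height, width] so that clients can reshape it.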
+        pred = result["pred"][0].tolist()
+        size = [len(pred), len(pred[0])]
+        label_map = [item for sublist in pred for item in sublist]
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        if visualize_enabled:
+            output_image_base64 = utils.base64_encode(
+                utils.image_to_bytes(result.img["res"].convert("RGB"))
+            )
+        else:
+            output_image_base64 = None
+
+        return schemas.anomaly_detection.InferResult(
+            labelMap=label_map, size=size, image=output_image_base64
+        )

+ 8 - 0
deploy/hps/sdk/pipelines/anomaly_detection/server/pipeline_config.yaml

@@ -0,0 +1,8 @@
+pipeline_name: anomaly_detection
+
+SubModules:
+  AnomalyDetection:
+    module_name: anomaly_detection
+    model_name: STFPM
+    model_dir: null
+    batch_size: 1   

+ 38 - 0
deploy/hps/sdk/pipelines/doc_preprocessor/client/client.py

@@ -0,0 +1,38 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--file", type=str, required=True)
+    parser.add_argument("--file-type", type=int, choices=[0, 1])
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"file": utils.prepare_input_file(args.file)}
+    if args.file_type is not None:
+        input_["fileType"] = args.file_type
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "document-preprocessing", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    for i, res in enumerate(result["docPreprocessingResults"]):
+        output_img_path = f"out_{i}.png"
+        utils.save_output_file(res["outputImage"], output_img_path)
+        print(f"Output image saved at {output_img_path}")
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/doc_preprocessor/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 132 - 0
deploy/hps/sdk/pipelines/doc_preprocessor/server/model_repo/document-preprocessing/1/model.py

@@ -0,0 +1,132 @@
+from typing import Any, Dict, Final, List, Tuple
+
+from paddlex_hps_server import (
+    BaseTritonPythonModel,
+    app_common,
+    protocol,
+    schemas,
+    utils,
+)
+from paddlex_hps_server.storage import SupportsGetURL, create_storage
+
+_DEFAULT_MAX_NUM_INPUT_IMGS: Final[int] = 10
+_DEFAULT_MAX_OUTPUT_IMG_SIZE: Final[Tuple[int, int]] = (2000, 2000)
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+        self.context = {}
+        self.context["file_storage"] = None
+        self.context["return_img_urls"] = False
+        self.context["max_num_input_imgs"] = _DEFAULT_MAX_NUM_INPUT_IMGS
+        self.context["max_output_img_size"] = _DEFAULT_MAX_OUTPUT_IMG_SIZE
+        if self.app_config.extra:
+            if "file_storage" in self.app_config.extra:
+                self.context["file_storage"] = create_storage(
+                    self.app_config.extra["file_storage"]
+                )
+            if "return_img_urls" in self.app_config.extra:
+                self.context["return_img_urls"] = self.app_config.extra[
+                    "return_img_urls"
+                ]
+            if "max_num_input_imgs" in self.app_config.extra:
+                self.context["max_num_input_imgs"] = self.app_config.extra[
+                    "max_num_input_imgs"
+                ]
+            if "max_output_img_size" in self.app_config.extra:
+                self.context["max_output_img_size"] = self.app_config.extra[
+                    "max_output_img_size"
+                ]
+        if self.context["return_img_urls"]:
+            file_storage = self.context["file_storage"]
+            if not file_storage:
+                raise ValueError(
+                    "The file storage must be properly configured when URLs need to be returned."
+                )
+            if not isinstance(file_storage, SupportsGetURL):
+                raise TypeError(f"{type(file_storage)} does not support getting URLs.")
+
+    def get_input_model_type(self):
+        return schemas.doc_preprocessor.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.doc_preprocessor.InferResult
+
+    def run(self, input, log_id):
+        if input.fileType is None:
+            if utils.is_url(input.file):
+                maybe_file_type = utils.infer_file_type(input.file)
+                if maybe_file_type is None or not (
+                    maybe_file_type == "PDF" or maybe_file_type == "IMAGE"
+                ):
+                    return protocol.create_aistudio_output_without_result(
+                        422,
+                        "Unsupported file type",
+                        log_id=log_id,
+                    )
+                file_type = maybe_file_type
+            else:
+                return protocol.create_aistudio_output_without_result(
+                    422,
+                    "File type cannot be determined",
+                    log_id=log_id,
+                )
+        else:
+            file_type = "PDF" if input.fileType == 0 else "IMAGE"
+
+        file_bytes = utils.get_raw_bytes(input.file)
+        images, data_info = utils.file_to_images(
+            file_bytes,
+            file_type,
+            max_num_imgs=self.context["max_num_input_imgs"],
+        )
+
+        result = list(
+            self.pipeline(
+                images,
+                use_doc_orientation_classify=input.useDocOrientationClassify,
+                use_doc_unwarping=input.useDocUnwarping,
+            )
+        )
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        doc_pp_results: List[Dict[str, Any]] = []
+        for i, (img, item) in enumerate(zip(images, result)):
+            pruned_res = app_common.prune_result(item.json["res"])
+            output_img = app_common.postprocess_image(
+                item["output_img"],
+                log_id,
+                "output_img.png",
+                file_storage=self.context["file_storage"],
+                return_url=self.context["return_img_urls"],
+                max_img_size=self.context["max_output_img_size"],
+            )
+            if visualize_enabled:
+                vis_imgs = {
+                    "input_img": img,
+                    "doc_preprocessing_img": item.img["preprocessed_img"],
+                }
+                vis_imgs = app_common.postprocess_images(
+                    vis_imgs,
+                    log_id,
+                    filename_template=f"{{key}}_{i}.jpg",
+                    file_storage=self.context["file_storage"],
+                    return_urls=self.context["return_img_urls"],
+                    max_img_size=self.context["max_output_img_size"],
+                )
+            else:
+                vis_imgs = {}
+            doc_pp_results.append(
+                dict(
+                    outputImage=output_img,
+                    prunedResult=pruned_res,
+                    docPreprocessingImage=vis_imgs.get("doc_preprocessing_img"),
+                    inputImage=vis_imgs.get("input_img"),
+                )
+            )
+
+        return schemas.doc_preprocessor.InferResult(
+            docPreprocessingResults=doc_pp_results,
+            dataInfo=data_info,
+        )

+ 15 - 0
deploy/hps/sdk/pipelines/doc_preprocessor/server/pipeline_config.yaml

@@ -0,0 +1,15 @@
+
+pipeline_name: doc_preprocessor
+
+use_doc_orientation_classify: True
+use_doc_unwarping: True
+
+SubModules:
+  DocOrientationClassify:
+    module_name: doc_text_orientation
+    model_name: PP-LCNet_x1_0_doc_ori
+    model_dir: null
+  DocUnwarping:
+    module_name: image_unwarping
+    model_name: UVDoc
+    model_dir: null

+ 49 - 0
deploy/hps/sdk/pipelines/doc_understanding/client/client.py

@@ -0,0 +1,49 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
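+# Example invocation (image path and query are placeholders):
+#   python client.py --image page.jpg --query "What is the title of this document?"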
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--image", type=str, required=True)
+    parser.add_argument("--query", type=str, required=True)
+    parser.add_argument("--max-image-tokens", type=int, default=None)
+    parser.add_argument("--url", type=str, default="localhost:8001")
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(
+        args.url,
+        # HACK
+        keepalive_options=triton_grpc.KeepAliveOptions(keepalive_timeout_ms=1000000),
+    )
+    image = utils.prepare_input_file(args.image, include_header=True)
+    input_ = {
+        "model": "pp-docbee",
+        "messages": [
+            {"role": "system", "content": "You are a helpful assistant."},
+            {
+                "role": "user",
+                "content": [
+                    {"type": "text", "text": args.query},
+                    {"type": "image_url", "image_url": {"url": image}},
+                ],
+            },
+        ],
+        "max_image_tokens": args.max_image_tokens,
+    }
+    output = triton_request(client, "document-understanding", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    print("Final result:")
+    print(result["choices"][0]["message"]["content"])
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/doc_understanding/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 110 - 0
deploy/hps/sdk/pipelines/doc_understanding/server/model_repo/document-understanding/1/model.py

@@ -0,0 +1,110 @@
+import math
+import time
+from typing import List
+
+from openai.types.chat import ChatCompletion
+from openai.types.chat.chat_completion import Choice as ChatCompletionChoice
+from openai.types.chat.chat_completion_message import ChatCompletionMessage
+from paddlex_hps_server import BaseTritonPythonModel, logging, schemas, utils
+from PIL import Image
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def get_input_model_type(self):
+        return schemas.doc_understanding.InferRequest
+
+    def get_result_model_type(self):
+        return ChatCompletion
+
+    @staticmethod
+    def _resize_image_with_token_limit(image, max_token_num=2200, tile_size=28):
+        image = Image.fromarray(image)
+        w0, h0 = image.width, image.height
+        tokens = math.ceil(w0 / tile_size) * math.ceil(h0 / tile_size)
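+        # Worked example: a 2000x1600 image yields ceil(2000/28) * ceil(1600/28)
+        # = 72 * 58 = 4176 tokens; with the default limit of 2200 it would be
+        # scaled by k = sqrt(2200/4176) ~= 0.73 to roughly 1451x1161 (~2184 tokens).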
+        if tokens <= max_token_num:
+            return image
+
+        k = math.sqrt(
+            max_token_num / (math.ceil(w0 / tile_size) * math.ceil(h0 / tile_size))
+        )
+        k = min(1.0, k)
+        w_new = max(int(w0 * k), tile_size)
+        h_new = max(int(h0 * k), tile_size)
+        new_size = (w_new, h_new)
+        resized_image = image.resize(new_size)
+        tokens_new = math.ceil(w_new / tile_size) * math.ceil(h_new / tile_size)
+        logging.info(
+            f"Resizing image from {w0}x{h0} to {w_new}x{h_new}, "
+            f"which will reduce the image tokens from {tokens} to {tokens_new}."
+        )
+
+        return resized_image
+
+    def run(self, input, log_id):
+        def _process_messages(messages: List[schemas.doc_understanding.Message]):
+            system_message = ""
+            user_message = ""
+            image_url = ""
+
+            for msg in messages:
+                if msg.role == schemas.doc_understanding.RoleType.SYSTEM:
+                    if isinstance(msg.content, list):
+                        for content in msg.content:
+                            if isinstance(
+                                content, schemas.doc_understanding.TextContent
+                            ):
+                                system_message = content.text
+                                break
+                    else:
+                        system_message = msg.content
+
+                elif msg.role == schemas.doc_understanding.RoleType.USER:
+                    if isinstance(msg.content, list):
+                        for content in msg.content:
+                            if isinstance(content, str):
+                                user_message = content
+                            else:
+                                if isinstance(
+                                    content, schemas.doc_understanding.TextContent
+                                ):
+                                    user_message = content.text
+                                elif isinstance(
+                                    content, schemas.doc_understanding.ImageContent
+                                ):
+                                    image_url = content.image_url
+                                    if isinstance(
+                                        image_url, schemas.doc_understanding.ImageUrl
+                                    ):
+                                        image_url = image_url.url
+                    else:
+                        user_message = msg.content
+            return system_message, user_message, image_url
+
+        system_message, user_message, image_url = _process_messages(input.messages)
+        if input.max_image_tokens is not None:
+            if image_url.startswith("data:image"):
+                _, image_url = image_url.split(",", 1)
+            img_bytes = utils.get_raw_bytes(image_url)
+            image = utils.image_bytes_to_array(img_bytes)
+            image = self._resize_image_with_token_limit(image, input.max_image_tokens)
+        else:
+            image = image_url
+
+        result = list(self.pipeline({"image": image, "query": user_message}))[0]
+
+        return ChatCompletion(
+            id=log_id,
+            model=input.model,
+            choices=[
+                ChatCompletionChoice(
+                    index=0,
+                    finish_reason="stop",
+                    message=ChatCompletionMessage(
+                        role="assistant",
+                        content=result["result"],
+                    ),
+                )
+            ],
+            created=int(time.time()),
+            object="chat.completion",
+        )

+ 9 - 0
deploy/hps/sdk/pipelines/doc_understanding/server/pipeline_config.yaml

@@ -0,0 +1,9 @@
+
+pipeline_name: doc_understanding
+
+SubModules:
+  DocUnderstanding:
+    module_name: doc_vlm
+    model_name: PP-DocBee2-3B
+    model_dir: null
+    batch_size: 1

+ 120 - 0
deploy/hps/sdk/pipelines/face_recognition/client/client.py

@@ -0,0 +1,120 @@
+#!/usr/bin/env python
+
+import argparse
+import pprint
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+OUTPUT_IMAGE_PATH = "out.jpg"
+
+
+def parse_image_label_pairs(image_label_pairs):
+    if len(image_label_pairs) % 2 != 0:
+        raise ValueError("The number of image-label pairs must be even.")
+    return [
+        {"image": utils.prepare_input_file(img), "label": lab}
+        for img, lab in zip(image_label_pairs[0::2], image_label_pairs[1::2])
+    ]
+
+
+def create_triton_client(url):
+    return triton_grpc.InferenceServerClient(url)
+
+
+def ensure_no_error(output):
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+
+
+def do_index_build(args):
+    client = create_triton_client(args.url)
+    if args.image_label_pairs:
+        image_label_pairs = parse_image_label_pairs(args.image_label_pairs)
+    else:
+        image_label_pairs = []
+    input_ = {"imageLabelPairs": image_label_pairs}
+    output = triton_request(client, "face-recognition-index-build", input_)
+    ensure_no_error(output)
+    result = output["result"]
+    pprint.pp(result)
+
+
+def do_index_add(args):
+    client = create_triton_client(args.url)
+    image_label_pairs = parse_image_label_pairs(args.image_label_pairs)
+    input_ = {"imageLabelPairs": image_label_pairs}
+    if args.index_key is not None:
+        input_["indexKey"] = args.index_key
+    output = triton_request(client, "face-recognition-index-add", input_)
+    ensure_no_error(output)
+    result = output["result"]
+    pprint.pp(result)
+
+
+def do_index_remove(args):
+    client = create_triton_client(args.url)
+    input_ = {"ids": args.ids}
+    if args.index_key is not None:
+        input_["indexKey"] = args.index_key
+    output = triton_request(client, "face-recognition-index-remove", input_)
+    ensure_no_error(output)
+    result = output["result"]
+    pprint.pp(result)
+
+
+def do_infer(args):
+    client = create_triton_client(args.url)
+    input_ = {"image": utils.prepare_input_file(args.image)}
+    if args.index_key is not None:
+        input_["indexKey"] = args.index_key
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "face-recognition-infer", input_)
+    ensure_no_error(output)
+    result = output["result"]
+    utils.save_output_file(result["image"], OUTPUT_IMAGE_PATH)
+    print(f"Output image saved at {OUTPUT_IMAGE_PATH}")
+    print("\nDetected faces:")
+    pprint.pp(result["faces"])
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    subparsers = parser.add_subparsers(dest="cmd")
+
+    parser_index_build = subparsers.add_parser("index-build")
+    parser_index_build.add_argument("--image-label-pairs", type=str, nargs="+")
+    parser_index_build.set_defaults(func=do_index_build)
+
+    parser_index_add = subparsers.add_parser("index-add")
+    parser_index_add.add_argument(
+        "--image-label-pairs", type=str, nargs="+", required=True
+    )
+    parser_index_add.add_argument("--index-key", type=str, required=True)
+    parser_index_add.set_defaults(func=do_index_add)
+
+    parser_index_remove = subparsers.add_parser("index-remove")
+    parser_index_remove.add_argument("--ids", type=int, nargs="+", required=True)
+    parser_index_remove.add_argument("--index-key", type=str, required=True)
+    parser_index_remove.set_defaults(func=do_index_remove)
+
+    parser_infer = subparsers.add_parser("infer")
+    parser_infer.add_argument("--image", type=str, required=True)
+    parser_infer.add_argument("--index-key", type=str)
+    parser.add_argument("--no-visualization", action="store_true")
+
+    parser_infer.set_defaults(func=do_infer)
+
+    args = parser.parse_args()
+
+    args.func(args)
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/face_recognition/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 3 - 0
deploy/hps/sdk/pipelines/face_recognition/server/.isort.cfg

@@ -0,0 +1,3 @@
+[settings]
+profile=black
+known_first_party=common

+ 35 - 0
deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-index-add/1/model.py

@@ -0,0 +1,35 @@
+from operator import attrgetter
+
+from paddlex.inference.pipelines.components import IndexData
+from paddlex_hps_server import schemas, utils
+
+from common.base_model import BaseFaceRecognitionModel
+
+
+class TritonPythonModel(BaseFaceRecognitionModel):
+    def get_input_model_type(self):
+        return schemas.face_recognition.AddImagesToIndexRequest
+
+    def get_result_model_type(self):
+        return schemas.face_recognition.AddImagesToIndexResult
+
+    def run(self, input, log_id):
+        file_bytes_list = [
+            utils.get_raw_bytes(img)
+            for img in map(attrgetter("image"), input.imageLabelPairs)
+        ]
+        images = [utils.image_bytes_to_array(item) for item in file_bytes_list]
+        labels = [pair.label for pair in input.imageLabelPairs]
+
+        index_storage = self.context["index_storage"]
+        index_data_bytes = index_storage.get(input.indexKey)
+        index_data = IndexData.from_bytes(index_data_bytes)
+
+        index_data = self.pipeline.append_index(images, labels, index_data)
+
+        index_data_bytes = index_data.to_bytes()
+        index_storage.set(input.indexKey, index_data_bytes)
+
+        return schemas.face_recognition.AddImagesToIndexResult(
+            imageCount=len(index_data.id_map)
+        )

+ 42 - 0
deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-index-build/1/model.py

@@ -0,0 +1,42 @@
+import uuid
+from operator import attrgetter
+
+from paddlex_hps_server import schemas, utils
+
+from common.base_model import BaseFaceRecognitionModel
+
+
+def _generate_index_key():
+    return str(uuid.uuid4())
+
+
+class TritonPythonModel(BaseFaceRecognitionModel):
+    def get_input_model_type(self):
+        return schemas.face_recognition.BuildIndexRequest
+
+    def get_result_model_type(self):
+        return schemas.face_recognition.BuildIndexResult
+
+    def run(self, input, log_id):
+        file_bytes_list = [
+            utils.get_raw_bytes(img)
+            for img in map(attrgetter("image"), input.imageLabelPairs)
+        ]
+        images = [utils.image_bytes_to_array(item) for item in file_bytes_list]
+        labels = [pair.label for pair in input.imageLabelPairs]
+
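+        # Build a brute-force ("Flat") index using inner-product ("IP")
+        # similarity; these are the defaults chosen here, not the only options
+        # the pipeline may support.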
+        index_data = self.pipeline.build_index(
+            images,
+            labels,
+            index_type="Flat",
+            metric_type="IP",
+        )
+
+        index_storage = self.context["index_storage"]
+        index_key = _generate_index_key()
+        index_data_bytes = index_data.to_bytes()
+        index_storage.set(index_key, index_data_bytes)
+
+        return schemas.face_recognition.BuildIndexResult(
+            indexKey=index_key, imageCount=len(index_data.id_map)
+        )

+ 26 - 0
deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-index-remove/1/model.py

@@ -0,0 +1,26 @@
+from paddlex.inference.pipelines.components import IndexData
+from paddlex_hps_server import schemas
+
+from common.base_model import BaseFaceRecognitionModel
+
+
+class TritonPythonModel(BaseFaceRecognitionModel):
+    def get_input_model_type(self):
+        return schemas.face_recognition.RemoveImagesFromIndexRequest
+
+    def get_result_model_type(self):
+        return schemas.face_recognition.RemoveImagesFromIndexResult
+
+    def run(self, input, log_id):
+        index_storage = self.context["index_storage"]
+        index_data_bytes = index_storage.get(input.indexKey)
+        index_data = IndexData.from_bytes(index_data_bytes)
+
+        index_data = self.pipeline.remove_index(input.ids, index_data)
+
+        index_data_bytes = index_data.to_bytes()
+        index_storage.set(input.indexKey, index_data_bytes)
+
+        return schemas.face_recognition.RemoveImagesFromIndexResult(
+            imageCount=len(index_data.id_map)
+        )

+ 22 - 0
deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-index-remove/config.pbtxt

@@ -0,0 +1,22 @@
+backend: "python"
+max_batch_size: 1
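+# A single string tensor in and out: it is expected to carry the serialized
+# request/response payload handled by the Python backend model.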
+input [
+  {
+    name: "input"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+output [
+  {
+    name: "output"
+    data_type: TYPE_STRING
+    dims: [ 1 ]
+  }
+]
+instance_group [
+  {
+      count: 1
+      kind: KIND_CPU
+  }
+]

+ 66 - 0
deploy/hps/sdk/pipelines/face_recognition/server/model_repo/face-recognition-infer/1/model.py

@@ -0,0 +1,66 @@
+from typing import Any, Dict, List
+
+from paddlex.inference.pipelines.components import IndexData
+from paddlex_hps_server import schemas, utils
+
+from common.base_model import BaseFaceRecognitionModel
+
+
+class TritonPythonModel(BaseFaceRecognitionModel):
+    def get_input_model_type(self):
+        return schemas.face_recognition.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.face_recognition.InferResult
+
+    def run(self, input, log_id):
+        image_bytes = utils.get_raw_bytes(input.image)
+        image = utils.image_bytes_to_array(image_bytes)
+
+        if input.indexKey is not None:
+            index_storage = self.context["index_storage"]
+            index_data_bytes = index_storage.get(input.indexKey)
+            index_data = IndexData.from_bytes(index_data_bytes)
+        else:
+            index_data = None
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        result = list(
+            self.pipeline(
+                image,
+                index=index_data,
+                det_threshold=input.detThreshold,
+                rec_threshold=input.recThreshold,
+                hamming_radius=input.hammingRadius,
+                topk=input.topk,
+            )
+        )[0]
+
+        objs: List[Dict[str, Any]] = []
+        for obj in result["boxes"]:
+            rec_results: List[Dict[str, Any]] = []
+            if obj["rec_scores"] is not None:
+                for label, score in zip(obj["labels"], obj["rec_scores"]):
+                    rec_results.append(
+                        dict(
+                            label=label,
+                            score=score,
+                        )
+                    )
+            objs.append(
+                dict(
+                    bbox=obj["coordinate"],
+                    recResults=rec_results,
+                    score=obj["det_score"],
+                )
+            )
+        if visualize_enabled:
+            output_image_base64 = utils.base64_encode(
+                utils.image_to_bytes(result.img["res"])
+            )
+        else:
+            output_image_base64 = None
+
+        return schemas.face_recognition.InferResult(
+            faces=objs, image=output_image_base64
+        )

+ 18 - 0
deploy/hps/sdk/pipelines/face_recognition/server/pipeline_config.yaml

@@ -0,0 +1,18 @@
+pipeline_name: face_recognition
+
+index: None
+det_threshold: 0.6
+rec_threshold: 0.4
+rec_topk: 5
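+# det_threshold and rec_threshold are pipeline defaults; the infer endpoint can
+# override them per request via detThreshold / recThreshold.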
+
+SubModules:
+  Detection:
+    module_name: face_detection
+    model_name: PP-YOLOE_plus-S_face
+    model_dir: null
+    batch_size: 1 
+  Recognition:
+    module_name: face_feature
+    model_name: ResNet50_face
+    model_dir: null
+    batch_size: 1

+ 0 - 0
deploy/hps/sdk/pipelines/face_recognition/server/shared_mods/common/__init__.py


+ 19 - 0
deploy/hps/sdk/pipelines/face_recognition/server/shared_mods/common/base_model.py

@@ -0,0 +1,19 @@
+from paddlex_hps_server import BaseTritonPythonModel
+from paddlex_hps_server.storage import create_storage
+
+# Do we need a lock?
+DEFAULT_INDEX_DIR = ".indexes"
+
+
+class BaseFaceRecognitionModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+        self.context = {}
+        if self.app_config.extra and "index_storage" in self.app_config.extra:
+            self.context["index_storage"] = create_storage(
+                self.app_config.extra["index_storage"]
+            )
+        else:
+            self.context["index_storage"] = create_storage(
+                {"type": "file_system", "directory": DEFAULT_INDEX_DIR}
+            )

+ 40 - 0
deploy/hps/sdk/pipelines/formula_recognition/client/client.py

@@ -0,0 +1,40 @@
+#!/usr/bin/env python
+
+import argparse
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--file", type=str, required=True)
+    parser.add_argument("--file-type", type=int, choices=[0, 1])
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"file": utils.prepare_input_file(args.file)}
+    if args.file_type is not None:
+        input_["fileType"] = args.file_type
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "formula-recognition", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    for i, res in enumerate(result["formulaRecResults"]):
+        print(res["prunedResult"])
+        for img_name, img in res["outputImages"].items():
+            img_path = f"{img_name}_{i}.jpg"
+            utils.save_output_file(img, img_path)
+            print(f"Output image saved at {img_path}")
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/formula_recognition/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 132 - 0
deploy/hps/sdk/pipelines/formula_recognition/server/model_repo/formula-recognition/1/model.py

@@ -0,0 +1,132 @@
+from typing import Any, Dict, Final, List, Tuple
+
+from paddlex_hps_server import (
+    BaseTritonPythonModel,
+    app_common,
+    protocol,
+    schemas,
+    utils,
+)
+from paddlex_hps_server.storage import SupportsGetURL, create_storage
+
+_DEFAULT_MAX_NUM_INPUT_IMGS: Final[int] = 10
+_DEFAULT_MAX_OUTPUT_IMG_SIZE: Final[Tuple[int, int]] = (2000, 2000)
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def initialize(self, args):
+        super().initialize(args)
+        self.context = {}
+        self.context["file_storage"] = None
+        self.context["return_img_urls"] = False
+        self.context["max_num_input_imgs"] = _DEFAULT_MAX_NUM_INPUT_IMGS
+        self.context["max_output_img_size"] = _DEFAULT_MAX_OUTPUT_IMG_SIZE
+        if self.app_config.extra:
+            if "file_storage" in self.app_config.extra:
+                self.context["file_storage"] = create_storage(
+                    self.app_config.extra["file_storage"]
+                )
+            if "return_img_urls" in self.app_config.extra:
+                self.context["return_img_urls"] = self.app_config.extra[
+                    "return_img_urls"
+                ]
+            if "max_num_input_imgs" in self.app_config.extra:
+                self.context["max_num_input_imgs"] = self.app_config.extra[
+                    "max_num_input_imgs"
+                ]
+            if "max_output_img_size" in self.app_config.extra:
+                self.context["max_output_img_size"] = self.app_config.extra[
+                    "max_output_img_size"
+                ]
+        if self.context["return_img_urls"]:
+            file_storage = self.context["file_storage"]
+            if not file_storage:
+                raise ValueError(
+                    "The file storage must be properly configured when URLs need to be returned."
+                )
+            if not isinstance(file_storage, SupportsGetURL):
+                raise TypeError(f"{type(file_storage)} does not support getting URLs.")
+
+    def get_input_model_type(self):
+        return schemas.formula_recognition.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.formula_recognition.InferResult
+
+    def run(self, input, log_id):
+        if input.fileType is None:
+            if utils.is_url(input.file):
+                maybe_file_type = utils.infer_file_type(input.file)
+                if maybe_file_type is None or not (
+                    maybe_file_type == "PDF" or maybe_file_type == "IMAGE"
+                ):
+                    return protocol.create_aistudio_output_without_result(
+                        422,
+                        "Unsupported file type",
+                        log_id=log_id,
+                    )
+                file_type = maybe_file_type
+            else:
+                return protocol.create_aistudio_output_without_result(
+                    422,
+                    "File type cannot be determined",
+                    log_id=log_id,
+                )
+        else:
+            file_type = "PDF" if input.fileType == 0 else "IMAGE"
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        file_bytes = utils.get_raw_bytes(input.file)
+        images, data_info = utils.file_to_images(
+            file_bytes,
+            file_type,
+            max_num_imgs=self.context["max_num_input_imgs"],
+        )
+
+        result = list(
+            self.pipeline(
+                images,
+                use_layout_detection=input.useLayoutDetection,
+                use_doc_orientation_classify=input.useDocOrientationClassify,
+                use_doc_unwarping=input.useDocUnwarping,
+                layout_threshold=input.layoutThreshold,
+                layout_nms=input.layoutNms,
+                layout_unclip_ratio=input.layoutUnclipRatio,
+                layout_merge_bboxes_mode=input.layoutMergeBboxesMode,
+            )
+        )
+
+        formula_rec_results: List[Dict[str, Any]] = []
+        for i, (img, item) in enumerate(zip(images, result)):
+            pruned_res = app_common.prune_result(item.json["res"])
+            if visualize_enabled:
+                imgs = {
+                    "input_img": img,
+                    **item.img,
+                }
+                imgs = app_common.postprocess_images(
+                    imgs,
+                    log_id,
+                    filename_template=f"{{key}}_{i}.jpg",
+                    file_storage=self.context["file_storage"],
+                    return_urls=self.context["return_img_urls"],
+                    max_img_size=self.context["max_output_img_size"],
+                )
+            else:
+                imgs = {}
+            formula_rec_results.append(
+                dict(
+                    prunedResult=pruned_res,
+                    outputImages=(
+                        {k: v for k, v in imgs.items() if k != "input_img"}
+                        if imgs
+                        else None
+                    ),
+                    inputImage=imgs.get("input_img"),
+                )
+            )
+
+        return schemas.formula_recognition.InferResult(
+            formulaRecResults=formula_rec_results,
+            dataInfo=data_info,
+        )

+ 39 - 0
deploy/hps/sdk/pipelines/formula_recognition/server/pipeline_config.yaml

@@ -0,0 +1,39 @@
+
+pipeline_name: formula_recognition
+
+use_layout_detection: True
+use_doc_preprocessor: True
+
+SubModules:
+  LayoutDetection:
+    module_name: layout_detection
+    model_name: PP-DocLayout_plus-L
+    model_dir: null
+    threshold: 0.5
+    layout_nms: True
+    layout_unclip_ratio: 1.0
+    layout_merge_bboxes_mode: "large"
+    batch_size: 1
+
+  FormulaRecognition:
+    module_name: formula_recognition
+    model_name: PP-FormulaNet_plus-M
+    model_dir: null
+    batch_size: 5
+
+SubPipelines:
+  DocPreprocessor:
+    pipeline_name: doc_preprocessor
+    use_doc_orientation_classify: True
+    use_doc_unwarping: True
+    SubModules:
+      DocOrientationClassify:
+        module_name: doc_text_orientation
+        model_name: PP-LCNet_x1_0_doc_ori
+        model_dir: null
+        batch_size: 1
+      DocUnwarping:
+        module_name: image_unwarping
+        model_name: UVDoc
+        model_dir: null
+        batch_size: 1

+ 38 - 0
deploy/hps/sdk/pipelines/human_keypoint_detection/client/client.py

@@ -0,0 +1,38 @@
+#!/usr/bin/env python
+
+import argparse
+import pprint
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+OUTPUT_IMAGE_PATH = "out.jpg"
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--image", type=str, required=True)
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"image": utils.prepare_input_file(args.image)}
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "human-keypoint-detection", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    utils.save_output_file(result["image"], OUTPUT_IMAGE_PATH)
+    print(f"Output image saved at {OUTPUT_IMAGE_PATH}")
+    print("\nDetected persons:")
+    pprint.pp(result["persons"])
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/human_keypoint_detection/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 44 - 0
deploy/hps/sdk/pipelines/human_keypoint_detection/server/model_repo/human-keypoint-detection/1/model.py

@@ -0,0 +1,44 @@
+from typing import Any, Dict, List
+
+from paddlex_hps_server import BaseTritonPythonModel, schemas, utils
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def get_input_model_type(self):
+        return schemas.human_keypoint_detection.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.human_keypoint_detection.InferResult
+
+    def run(self, input, log_id):
+        file_bytes = utils.get_raw_bytes(input.image)
+        image = utils.image_bytes_to_array(file_bytes)
+        visualize_enabled = (
+            input.visualize
+            if input.visualize is not None
+            else self.app_config.visualize
+        )
+
+        result = list(
+            self.pipeline.predict(
+                image,
+                det_threshold=input.detThreshold,
+            )
+        )[0]
+
+        persons: List[Dict[str, Any]] = []
+        for obj in result["boxes"]:
+            persons.append(
+                dict(
+                    bbox=obj["coordinate"],
+                    kpts=obj["keypoints"].tolist(),
+                    detScore=obj["det_score"],
+                    kptScore=obj["kpt_score"],
+                )
+            )
+        if visualize_enabled:
+            output_image_base64 = utils.base64_encode(
+                utils.image_to_bytes(result.img["res"])
+            )
+        else:
+            output_image_base64 = None
+
+        return schemas.human_keypoint_detection.InferResult(
+            persons=persons, image=output_image_base64
+        )

+ 17 - 0
deploy/hps/sdk/pipelines/human_keypoint_detection/server/pipeline_config.yaml

@@ -0,0 +1,17 @@
+pipeline_name: human_keypoint_detection
+
+SubModules:
+  ObjectDetection:
+    module_name: object_detection
+    model_name: PP-YOLOE-S_human
+    model_dir: null
+    batch_size: 1
+    threshold: null
+    img_size: null
+  KeypointDetection:
+    module_name: keypoint_detection
+    model_name: PP-TinyPose_128x96
+    model_dir: null
+    batch_size: 1
+    flip: False
+    use_udp: null

+ 38 - 0
deploy/hps/sdk/pipelines/image_classification/client/client.py

@@ -0,0 +1,38 @@
+#!/usr/bin/env python
+
+import argparse
+import pprint
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+OUTPUT_IMAGE_PATH = "out.jpg"
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--image", type=str, required=True)
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"image": utils.prepare_input_file(args.image)}
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "image-classification", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    utils.save_output_file(result["image"], OUTPUT_IMAGE_PATH)
+    print(f"Output image saved at {OUTPUT_IMAGE_PATH}")
+    print("\nCategories")
+    pprint.pp(result["categories"])
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/image_classification/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 36 - 0
deploy/hps/sdk/pipelines/image_classification/server/model_repo/image-classification/1/model.py

@@ -0,0 +1,36 @@
+from typing import Any, Dict, List
+
+from paddlex_hps_server import BaseTritonPythonModel, schemas, utils
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def get_input_model_type(self):
+        return schemas.image_classification.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.image_classification.InferResult
+
+    def run(self, input, log_id):
+        file_bytes = utils.get_raw_bytes(input.image)
+        image = utils.image_bytes_to_array(file_bytes)
+        visualize_enabled = input.visualize if input.visualize is not None else self.app_config.visualize
+
+        result = list(self.pipeline.predict(image, topk=input.topk))[0]
+        if "label_names" in result:
+            cat_names = result["label_names"]
+        else:
+            cat_names = [str(id_) for id_ in result["class_ids"]]
+
+        categories: List[Dict[str, Any]] = []
+        for id_, name, score in zip(result["class_ids"], cat_names, result["scores"]):
+            categories.append(dict(id=id_, name=name, score=score))
+        if visualize_enabled:
+            output_image_base64 = utils.base64_encode(
+                utils.image_to_bytes(result.img["res"])
+            )
+        else:
+            output_image_base64 = None
+
+        return schemas.image_classification.InferResult(
+            categories=categories, image=output_image_base64
+        )
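
On the client side (see image_classification/client/client.py above), output["result"]["categories"] holds one entry per predicted class, built from class_ids, scores, and label_names (or the stringified class id when no label names are available). A hedged sketch with placeholder values:

# Hypothetical client-side result for the image-classification endpoint;
# keys mirror the InferResult built in run() above, values are illustrative only.
result = {
    "categories": [
        {"id": 283, "name": "Persian cat", "score": 0.83},
        {"id": 282, "name": "tabby cat", "score": 0.07},
    ],
    "image": None,  # base64-encoded visualization when visualize is enabled, else null
}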

+ 10 - 0
deploy/hps/sdk/pipelines/image_classification/server/pipeline_config.yaml

@@ -0,0 +1,10 @@
+
+pipeline_name: image_classification
+
+SubModules:
+  ImageClassification:
+    module_name: image_classification
+    model_name: PP-LCNet_x0_5
+    model_dir: null
+    batch_size: 4
+    topk: 5

+ 38 - 0
deploy/hps/sdk/pipelines/image_multilabel_classification/client/client.py

@@ -0,0 +1,38 @@
+#!/usr/bin/env python
+
+import argparse
+import pprint
+import sys
+
+from paddlex_hps_client import triton_request, utils
+from tritonclient import grpc as triton_grpc
+
+OUTPUT_IMAGE_PATH = "out.jpg"
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--image", type=str, required=True)
+    parser.add_argument("--no-visualization", action="store_true")
+    parser.add_argument("--url", type=str, default="localhost:8001")
+
+    args = parser.parse_args()
+
+    client = triton_grpc.InferenceServerClient(args.url)
+    input_ = {"image": utils.prepare_input_file(args.image)}
+    if args.no_visualization:
+        input_["visualize"] = False
+    output = triton_request(client, "multilabel-image-classification", input_)
+    if output["errorCode"] != 0:
+        print(f"Error code: {output['errorCode']}", file=sys.stderr)
+        print(f"Error message: {output['errorMsg']}", file=sys.stderr)
+        sys.exit(1)
+    result = output["result"]
+    utils.save_output_file(result["image"], OUTPUT_IMAGE_PATH)
+    print(f"Output image saved at {OUTPUT_IMAGE_PATH}")
+    print("\nCategories")
+    pprint.pp(result["categories"])
+
+
+if __name__ == "__main__":
+    main()

+ 3 - 0
deploy/hps/sdk/pipelines/image_multilabel_classification/client/requirements.txt

@@ -0,0 +1,3 @@
+# paddlex-hps-client
+protobuf == 3.19.6
+tritonclient [grpc] == 2.15

+ 37 - 0
deploy/hps/sdk/pipelines/image_multilabel_classification/server/model_repo/multilabel-image-classification/1/model.py

@@ -0,0 +1,37 @@
+from typing import Any, Dict, List
+
+from paddlex_hps_server import BaseTritonPythonModel, schemas, utils
+
+
+class TritonPythonModel(BaseTritonPythonModel):
+    def get_input_model_type(self):
+        return schemas.image_multilabel_classification.InferRequest
+
+    def get_result_model_type(self):
+        return schemas.image_multilabel_classification.InferResult
+
+    def run(self, input, log_id):
+        file_bytes = utils.get_raw_bytes(input.image)
+        image = utils.image_bytes_to_array(file_bytes)
+        visualize_enabled = input.visualize if input.visualize is not None else self.app_config.visualize
+
+        result = list(self.pipeline.predict(image, threshold=input.threshold))[0]
+
+        if "label_names" in result:
+            cat_names = result["label_names"]
+        else:
+            cat_names = [str(id_) for id_ in result["class_ids"]]
+
+        categories: List[Dict[str, Any]] = []
+        for id_, name, score in zip(result["class_ids"], cat_names, result["scores"]):
+            categories.append(dict(id=id_, name=name, score=score))
+        if visualize_enabled:
+            output_image_base64 = utils.base64_encode(
+                utils.image_to_bytes(result.img["res"])
+            )
+        else:
+            output_image_base64 = None
+
+        return schemas.image_multilabel_classification.InferResult(
+            categories=categories, image=output_image_base64
+        )
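
run() above forwards input.threshold to pipeline.predict(), so a client can override the score cutoff per request even though the sample client earlier does not set it. A minimal sketch reusing the client helpers shown in image_multilabel_classification/client/client.py; the input file name and the 0.7 value are placeholders:

# Hypothetical per-request threshold override for the multilabel endpoint.
from paddlex_hps_client import triton_request, utils
from tritonclient import grpc as triton_grpc

client = triton_grpc.InferenceServerClient("localhost:8001")
input_ = {
    "image": utils.prepare_input_file("demo.jpg"),  # placeholder input image
    "threshold": 0.7,  # forwarded to pipeline.predict(..., threshold=...) by run()
}
output = triton_request(client, "multilabel-image-classification", input_)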

+ 9 - 0
deploy/hps/sdk/pipelines/image_multilabel_classification/server/pipeline_config.yaml

@@ -0,0 +1,9 @@
+
+pipeline_name: image_multilabel_classification
+
+SubModules:
+  ImageMultiLabelClassification:
+    module_name: image_multilabel_classification
+    model_name: PP-HGNetV2-B6_ML
+    model_dir: null
+    batch_size: 4

Some files were not shown because too many files changed in this diff