This project provides a high-stability serving solution consisting of two main components: server_env and sdk. server_env is responsible for building multiple Docker images that include Triton Inference Server, providing the runtime environment for pipeline servers. sdk is used to package the pipeline SDK, including both the server and client code for the various model pipelines, as shown in the following figure:

Note: This project relies on the following environment configurations:

- Docker >= 20.10.0 (used for image building and deployment)

This document mainly introduces how to set up a high-stability serving environment and package the related materials using the scripts provided by this project. The overall process consists of two main stages:
1. Image Building: Build Docker images that include Triton Inference Server. In this stage, requirement versions are locked to ensure the reproducibility and stability of the deployment images.
2. Pipeline Material Packaging: Package the client and server code for each model pipeline, making subsequent deployment and integration easier.
To learn how to start the server and invoke services using the built images and packaged SDK, please refer to the PaddleX Serving Guide for detailed instructions.
This section introduces the overall process and key steps of the image building stage.
Image Building Steps:
If you do not need to modify requirement-related information, you can skip to 1.3 Building Image to build the deployment image using cached requirement information.
Navigate to the server_env directory and run the following script in that directory to build the requirement collection image:
./scripts/prepare_rc_image.sh
This script builds a requirement collection image for each device type. The image includes Python 3.10 and pip-tools; 1.2 Freeze Requirement (Optional) is based on this image. After the build completes, two images are generated: paddlex-hps-rc:gpu and paddlex-hps-rc:cpu. If you encounter network issues, you can specify an alternative pip source through the -p parameter; if not specified, the default source https://pypi.org/simple is used. If you encounter issues pulling the base image during the build process, please refer to the relevant solutions in the FAQ.
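You can confirm that both images were created by listing them by repository name (using the image names given above):

```bash
docker images paddlex-hps-rc
```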
To enhance the reproducibility of the build, this step freezes the requirements to exact versions. Switch to the server_env directory and run the following script:
./scripts/freeze_requirements.sh
This script uses pip-compile (from pip-tools) to resolve the source requirement files and generate a series of pinned .txt files (such as requirements/gpu.txt, requirements/cpu.txt, etc.). These files serve as the version requirements for 1.3 Building Image.
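Conceptually, the freezing step is equivalent to running pip-compile on each source requirements file. The input file names below are illustrative; the actual paths are determined by freeze_requirements.sh:

```bash
# Illustrative sketch: pin each source requirements file to exact versions.
# Real input/output paths are handled inside freeze_requirements.sh.
pip-compile requirements/gpu.in --output-file requirements/gpu.txt
pip-compile requirements/cpu.in --output-file requirements/cpu.txt
```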
If you need to build the GPU image, make sure to place the following two installation packages in the server_env directory in advance:
| Package Name | Version | Download URL | File Name |
|---|---|---|---|
| cuDNN | v8.9.7-CUDA 11.x | NVIDIA cuDNN Archive | Local Installer for Linux x86_64 (Tar) |
| TensorRT | 8.6-GA | TensorRT 8.x Download Page | TensorRT 8.6 GA for Linux x86_64 and CUDA 11.x TAR Package |
For Triton Inference Server, a precompiled version will be automatically downloaded during the build process, so manual download is not required. To build a GPU image, run the following command:
./scripts/build_deployment_image.sh -k gpu -t latest-gpu
The image build script supports the following options:

| Name | Description |
|---|---|
| -k | Specifies the device type for the image. Supported values: gpu or cpu. |
| -t | Sets the image tag. Default: latest-${DEVICE}. |
| -p | Python package index URL. If not specified, defaults to https://pypi.org/simple. |
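Similarly, to build a CPU image (the tag below simply mirrors the GPU example):

```bash
./scripts/build_deployment_image.sh -k cpu -t latest-cpu
```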
If the base image cannot be pulled, please refer to the solutions in the FAQ.
After the build completes successfully, the command line displays a message like the following:
=> => exporting to image
=> => exporting layers
=> => writing image sha256:ba3d0b2b079d63ee0239a99043fec7e25f17bf2a7772ec2fc80503c1582b3459
=> => naming to ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps:latest-gpu
To build both the GPU and CPU images, run the following command:
./scripts/prepare_deployment_images.sh
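After both builds complete, you can verify that the images exist locally; the repository name below matches the naming shown in the build output above:

```bash
docker images ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps
```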
This section introduces how to package the pipeline materials. This functionality is provided in the sdk directory, which offers corresponding client and server code implementations for each pipeline:

- client: Responsible for invoking the model services.
- server: Deployed using the images built in 1. Image Building, serving as the runtime environment for the model services.

Before packaging the pipeline materials, switch to the sdk directory and run the scripts/assemble.sh script in that directory. For example, to package the general OCR pipeline, run:
./scripts/assemble.sh OCR
The parameters for the packaging script are described as follows:
| Name | Description |
|---|---|
| pipeline_names | Specifies the names of the pipelines to be packaged. Can be empty or include multiple names. For example, the general OCR pipeline is OCR. |
| --all | Packages all pipelines. Cannot be used together with pipeline_names. |
| --no-server | Excludes the server code from the package. |
| --no-client | Excludes the client code from the package. |
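For example, assuming the flags combine with pipeline names as described above, you could package everything, or only the client code for the general OCR pipeline:

```bash
# Package all pipelines (server and client code).
./scripts/assemble.sh --all

# Package only the client code for the general OCR pipeline.
./scripts/assemble.sh --no-server OCR
```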
After the script runs successfully, the packaged materials will be stored in the output directory.
1. Failed to pull the base Docker image during build?
This issue may occur due to network connectivity problems or restricted access to Docker Hub. You can add trusted domestic mirror registry URLs to your local Docker configuration file at /etc/docker/daemon.json to improve download speed and stability. If this does not resolve the issue, consider manually downloading the base image from the official source or other trusted third-party source.
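A minimal sketch of such a configuration, using Docker's standard registry-mirrors key in /etc/docker/daemon.json (the mirror URL is a placeholder; substitute a registry mirror you trust):

```json
{
  "registry-mirrors": ["https://your-mirror.example.com"]
}
```

After editing the file, restart the Docker daemon (for example, sudo systemctl restart docker) for the change to take effect.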
2. Timeout when installing Python requirements during the image build?
Network issues may cause slow download speeds or connection failures when pip retrieves packages from the official source.
When running the image build scripts, you can use the -p parameter to specify an alternative Python package index URL. For example, to use the Tsinghua mirror for the requirement collection image:
./scripts/prepare_rc_image.sh -p https://pypi.tuna.tsinghua.edu.cn/simple
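The -p parameter works the same way for the deployment image build script:

```bash
./scripts/build_deployment_image.sh -k gpu -p https://pypi.tuna.tsinghua.edu.cn/simple
```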