---
comments: true
---
# Document Image Preprocessing Pipeline Tutorial
## 1. Introduction to the Document Image Preprocessing Pipeline
The document image preprocessing pipeline integrates two functions: document orientation classification and geometric distortion correction. Orientation classification automatically identifies which of four orientations a document is in (0°, 90°, 180°, 270°), so that subsequent tasks process it in the correct direction. The geometric distortion correction model corrects distortions introduced when a document is photographed or scanned, restoring its original shape and proportions. The pipeline is suited to digital document management, preprocessing ahead of OCR, and any scenario where document image quality needs to be improved. By automating orientation and distortion correction, it improves the accuracy and efficiency of document processing and gives downstream image analysis a more reliable starting point. The pipeline also offers flexible service deployment options and can be invoked from various programming languages on multiple hardware platforms. In addition, it supports further development: you can train and fine-tune models on your own dataset, and the trained models integrate seamlessly back into the pipeline.
**The general document image preprocessing pipeline includes an optional document image orientation classification module and an optional document image correction module.** Each module contains multiple models, and you can choose among them based on the benchmark data below.
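To make the role of the orientation classifier concrete, here is a minimal sketch of how a predicted angle could be used to turn a page upright. It assumes the predicted angle describes a clockwise rotation of the page and represents the image as a nested list. `rotate_cw` and `restore_upright` are hypothetical helpers for illustration, not part of the pipeline.

```python
def rotate_cw(grid):
    """Rotate a 2-D list of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def restore_upright(grid, angle):
    """Undo a rotation of `angle` degrees.

    Assumption: `angle` is the clockwise rotation that was applied to the
    upright page, so we keep rotating clockwise by the remaining
    (360 - angle) degrees to complete a full turn.
    """
    if angle not in (0, 90, 180, 270):
        raise ValueError(f"unsupported angle: {angle}")
    for _ in range(((360 - angle) // 90) % 4):
        grid = rotate_cw(grid)
    return grid


page = [[1, 2], [3, 4]]
rotated = rotate_cw(rotate_cw(page))  # the page turned by 180 degrees
print(restore_upright(rotated, 180) == page)
```

The real pipeline may use the opposite rotation convention, so check the direction against a known sample before relying on it.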
### 1.1 Model benchmark data
> The inference time only includes the model inference time and does not include the time for pre- or post-processing.
Document Image Orientation Classification Module (Optional):
| Model | Model Download Link | Top-1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (MB) | Introduction |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | Inference Model/Training Model | 99.06 | 2.62 / 0.59 | 3.24 / 1.19 | 7 | A document image classification model based on PP-LCNet_x1_0, with four categories: 0 degrees, 90 degrees, 180 degrees, and 270 degrees. |
Text Image Correction Module (Optional):
| Model | Model Download Link | CER | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (MB) | Description |
|---|---|---|---|---|---|---|
| UVDoc | Inference Model/Training Model | 0.179 | 19.05 / 19.05 | - / 869.82 | 30.3 | High-accuracy text image rectification model |
The inference modes referenced in the tables above are configured as follows:

| Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
|---|---|---|---|
| Normal Mode | FP32 Precision / No TRT Acceleration | FP32 Precision / 8 Threads | PaddleInference |
| High-Performance Mode | Optimal combination of pre-selected precision types and acceleration strategies | FP32 Precision / 8 Threads | Pre-selected optimal backend (Paddle/OpenVINO/TRT, etc.) |
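As a quick way to read the benchmark figures, the sketch below computes the speedup of High-Performance Mode over Normal Mode from the timings listed in the model tables. The dictionary layout and the `speedup` helper are illustrative assumptions. The UVDoc CPU entry is left out because its Normal Mode time is not reported.

```python
# (normal-mode ms, high-performance-mode ms), copied from the tables above.
TIMINGS_MS = {
    ("PP-LCNet_x1_0_doc_ori", "GPU"): (2.62, 0.59),
    ("PP-LCNet_x1_0_doc_ori", "CPU"): (3.24, 1.19),
    ("UVDoc", "GPU"): (19.05, 19.05),
}


def speedup(normal_ms, high_perf_ms):
    """Return how many times faster high-performance mode is."""
    return normal_ms / high_perf_ms


for (model, device), (normal_ms, hp_ms) in TIMINGS_MS.items():
    print(f"{model} on {device}: {speedup(normal_ms, hp_ms):.2f}x")
```

These ratios cover model inference only, since the reported times exclude pre- and post-processing.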
Pipeline-level benchmark results for each configuration and hardware platform:

| Pipeline configuration | Hardware | Avg. inference time (s) | Peak CPU utilization (%) | Avg. CPU utilization (%) | Peak host memory (MB) | Avg. host memory (MB) | Peak GPU utilization (%) | Avg. GPU utilization (%) | Peak device memory (MB) | Avg. device memory (MB) |
|---|---|---|---|---|---|---|---|---|---|---|
| doc_preprocessor-default | Intel 6271C | 1.13 | 1012.50 | 789.07 | 1913.89 | 1650.30 | N/A | N/A | N/A | N/A |
| | Intel 8350C | 0.76 | 1003.70 | 782.77 | 1916.81 | 1655.66 | N/A | N/A | N/A | N/A |
| | Hygon 7490 + P800 | 0.10 | 174.70 | 135.91 | 2023.54 | 1963.53 | N/A | N/A | N/A | N/A |
| | Intel 8350C + A100 | 0.08 | 148.90 | 125.73 | 1748.06 | 1683.35 | 11 | 5.86 | 828.00 | 828.00 |
| | Intel 6271C + V100 | 0.11 | 168.60 | 131.05 | 1837.66 | 1707.50 | 26 | 11.65 | 694.00 | 694.00 |
| | Intel 8563C + H20 | 0.07 | 140.90 | 122.80 | 1992.86 | 1911.34 | 11 | 7.75 | 890.00 | 890.00 |
| | Intel 8350C + A10 | 0.08 | 146.40 | 125.37 | 1896.46 | 1825.98 | 14 | 5.93 | 586.00 | 586.00 |
| | M4 | 0.37 | 118.20 | 103.40 | 1973.23 | 1756.00 | N/A | N/A | N/A | N/A |
| | Intel 6271C + T4 | 0.12 | 156.90 | 123.75 | 1874.29 | 1727.01 | 73 | 23.81 | 450.00 | 450.00 |
| doc_preprocessor-clsonly | Intel 6271C | 0.06 | 1085.60 | 1057.12 | 1254.27 | 1203.36 | N/A | N/A | N/A | N/A |
| | Intel 8350C | 0.06 | 1051.60 | 1043.39 | 1263.88 | 1206.67 | N/A | N/A | N/A | N/A |
| | Hygon 7490 + P800 | 0.05 | 202.70 | 173.75 | 1868.11 | 1810.80 | N/A | N/A | N/A | N/A |
| | Intel 8350C + A100 | 0.04 | 167.80 | 150.56 | 1347.92 | 1325.32 | 2 | 1.25 | 514.00 | 514.00 |
| | Intel 6271C + V100 | 0.05 | 202.70 | 170.72 | 1404.16 | 1352.74 | 2 | 1.20 | 394.00 | 394.00 |
| | Intel 8563C + H20 | 0.04 | 153.80 | 141.46 | 1590.48 | 1540.25 | 2 | 1.12 | 578.00 | 578.00 |
| | Intel 8350C + A10 | 0.04 | 179.80 | 156.96 | 1569.85 | 1514.64 | 2 | 0.88 | 302.00 | 302.00 |
| | M4 | 0.03 | 127.60 | 123.24 | 1459.06 | 1386.33 | N/A | N/A | N/A | N/A |
| | Intel 6271C + T4 | 0.05 | 180.80 | 154.34 | 1515.43 | 1465.99 | 4 | 2.89 | 160.00 | 160.00 |
| doc_preprocessor-unwarponly | Intel 6271C | 1.09 | 1002.10 | 749.82 | 1875.45 | 1680.20 | N/A | N/A | N/A | N/A |
| | Intel 8350C | 0.75 | 951.30 | 754.01 | 1843.54 | 1645.73 | N/A | N/A | N/A | N/A |
| | Hygon 7490 + P800 | 0.10 | 101.20 | 100.08 | 1960.48 | 1863.72 | N/A | N/A | N/A | N/A |
| | Intel 8350C + A100 | 0.08 | 100.90 | 100.15 | 1570.25 | 1446.01 | 9 | 5.57 | 788.00 | 788.00 |
| | Intel 6271C + V100 | 0.13 | 100.90 | 100.25 | 1561.52 | 1459.73 | 24 | 11.35 | 656.00 | 656.00 |
| | Intel 8563C + H20 | 0.07 | 106.90 | 100.62 | 1808.00 | 1715.35 | 6 | 4.83 | 684.00 | 684.00 |
| | Intel 8350C + A10 | 0.08 | 101.90 | 100.14 | 1848.25 | 1719.24 | 19 | 7.73 | 544.00 | 544.00 |
| | M4 | 0.30 | 102.40 | 100.76 | 1957.42 | 1734.67 | N/A | N/A | N/A | N/A |
| | Intel 6271C + T4 | 0.12 | 101.80 | 100.36 | 1674.25 | 1605.14 | 48 | 26.45 | 412.00 | 412.00 |
The pipeline configurations are defined as follows:

| Pipeline configuration | Description |
|---|---|
| doc_preprocessor-default | Default configuration |
| doc_preprocessor-clsonly | Based on the default configuration, only document image orientation classification is enabled |
| doc_preprocessor-unwarponly | Based on the default configuration, only text image rectification is enabled |
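One way to think about these three configurations is as combinations of the two module switches that `predict()` accepts. The mapping below is only an illustrative assumption (the benchmark may have set these through the configuration file instead), and `predict_kwargs` is a hypothetical helper.

```python
# Assumed mapping from benchmark configuration names to predict() switches.
CONFIG_FLAGS = {
    "doc_preprocessor-default": {
        "use_doc_orientation_classify": True,
        "use_doc_unwarping": True,
    },
    "doc_preprocessor-clsonly": {
        "use_doc_orientation_classify": True,
        "use_doc_unwarping": False,
    },
    "doc_preprocessor-unwarponly": {
        "use_doc_orientation_classify": False,
        "use_doc_unwarping": True,
    },
}


def predict_kwargs(config_name):
    """Return a copy of the module switches for a named configuration."""
    return dict(CONFIG_FLAGS[config_name])


print(predict_kwargs("doc_preprocessor-clsonly"))
```

Under this reading, the keyword arguments could be passed straight to `pipeline.predict(input=..., **predict_kwargs(name))`.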
After running, the result is printed to the terminal:

```
{'res': {'input_path': 'doc_test_rotated.jpg', 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': True}, 'angle': 180}}
```
You can refer to the results explanation in [2.1.2 Python Script Integration](#212-python-script-integration) for a description of the output parameters.
The visualized results are saved under `save_path`.
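The printed result above is a nested dictionary, so it can be inspected with ordinary dictionary access. The sketch below pulls out the fields shown in that output. `summarize` is a hypothetical helper, and reading `angle` as the detected orientation in degrees is an assumption based on the orientation classes described earlier.

```python
# The result dictionary exactly as printed above.
result = {
    "res": {
        "input_path": "doc_test_rotated.jpg",
        "model_settings": {
            "use_doc_orientation_classify": True,
            "use_doc_unwarping": True,
        },
        "angle": 180,
    }
}


def summarize(result):
    """Collect the input path, angle, and enabled modules of one result."""
    res = result["res"]
    enabled = [name for name, on in res["model_settings"].items() if on]
    return {"input": res["input_path"], "angle": res["angle"], "enabled": enabled}


print(summarize(result))
```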
#### 2.1.2 Python Script Integration
The command line above is meant for a quick trial of the pipeline. In a project, you will usually integrate it through code, and a few lines are enough to run inference:
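```python
from paddlex import create_pipeline

# Instantiate the document image preprocessing pipeline by name.
pipeline = create_pipeline(pipeline="doc_preprocessor")

# Run inference with both optional modules enabled.
output = pipeline.predict(
    input="doc_test_rotated.jpg",
    use_doc_orientation_classify=True,
    use_doc_unwarping=True,
)

# Print each result, then save its visualization and JSON representation.
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/")
```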
In the above Python script, the following steps were executed:
(1) Instantiate the `doc_preprocessor` pipeline object using `create_pipeline()`. The specific parameter descriptions are as follows:
| Parameter | Description | Type | Default |
|---|---|---|---|
| `pipeline` | The pipeline name or the path to the pipeline configuration file. If it is a pipeline name, it must be a pipeline supported by PaddleX. | `str` | `None` |
| `device` | Inference device for the pipeline. Supports specifying the GPU card number, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPU as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference. | `str` | `gpu:0` |
| `use_hpip` | Whether to enable the high-performance inference plugin. If set to `None`, the setting from the configuration file or `config` will be used. | `bool` | `None` |
| `hpi_config` | High-performance inference configuration. | `dict` \| `None` | `None` |
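The `device` strings described above follow a simple `kind:card` pattern. The sketch below shows one way a caller might validate such a string before passing it to `create_pipeline()`. `parse_device` is a hypothetical helper, and the comma-separated form for multiple cards (`"gpu:0,1"`) is an assumption, since the text only says that multiple devices can be specified.

```python
def parse_device(device):
    """Split a device string such as "gpu:0" into (kind, [card ids])."""
    kind, _, ids = device.partition(":")
    if not kind:
        raise ValueError(f"missing device kind in {device!r}")
    # Assumed syntax: card numbers after the colon, separated by commas.
    card_ids = [int(part) for part in ids.split(",")] if ids else []
    return kind, card_ids


for spec in ("gpu:0", "npu:0", "cpu", "gpu:0,1"):
    print(spec, "->", parse_device(spec))
```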
(2) Call the `predict()` method of the pipeline object to run inference. Its parameters are as follows:

| Parameter | Description | Type | Default |
|---|---|---|---|
| `input` | Data to be predicted. Multiple input types are supported. Required. | `Python Var` \| `str` \| `list` | `None` |
| `use_doc_orientation_classify` | Whether to use the document orientation classification module. | `bool` \| `None` | `None` |
| `use_doc_unwarping` | Whether to use the document unwarping correction module. | `bool` \| `None` | `None` |
(3) Process the prediction results. Each result can be printed, saved as an image, or saved as a JSON file:

| Method | Method Description | Parameter | Type | Parameter Description | Default |
|---|---|---|---|---|---|
| `print()` | Prints the results to the terminal | `format_json` | `bool` | Whether to format the output using JSON indentation | `True` |
| | | `indent` | `int` | Specifies the indentation level to beautify the output JSON data for better readability, effective only when `format_json` is `True` | 4 |
| | | `ensure_ascii` | `bool` | Controls whether to escape non-ASCII characters as Unicode. When set to `True`, all non-ASCII characters will be escaped; `False` retains the original characters, effective only when `format_json` is `True` | `False` |
| `save_to_json()` | Saves the results as a JSON format file | `save_path` | `str` | The file path to save to. When it is a directory, the saved file is named consistently with the input file | `None` |
| | | `indent` | `int` | Specifies the indentation level to beautify the output JSON data for better readability, effective only when `format_json` is `True` | 4 |
| | | `ensure_ascii` | `bool` | Controls whether to escape non-ASCII characters as Unicode. When set to `True`, all non-ASCII characters will be escaped; `False` retains the original characters, effective only when `format_json` is `True` | `False` |
| `save_to_img()` | Saves the results as an image format file | `save_path` | `str` | The file path to save to. Either a directory or a file path is supported | `None` |
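The `indent` and `ensure_ascii` parameters are easiest to understand by looking at the same-named arguments of Python's standard `json.dumps`. The sketch below assumes the result methods behave the same way, which this document does not confirm. The non-ASCII file name is made up for the demonstration.

```python
import json

# Hypothetical result with a non-ASCII input path.
result = {"res": {"input_path": "文档.jpg", "angle": 180}}

# ensure_ascii=True escapes non-ASCII characters as \uXXXX sequences.
escaped = json.dumps(result, indent=4, ensure_ascii=True)

# ensure_ascii=False keeps the original characters.
readable = json.dumps(result, indent=4, ensure_ascii=False)

print(escaped)
print(readable)
```

With `indent=4`, each nesting level is indented by four spaces, which matches the default listed in the table.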
The results can also be retrieved through the following attributes:

| Attribute | Description |
|---|---|
| `json` | Retrieves the prediction results in JSON format |
| `img` | Retrieves the visualized images in `dict` format |
For the main operations provided by the service, when a request is processed successfully, the response status code is `200` and the response body has the following attributes:

| Name | Type | Meaning |
|---|---|---|
| `logId` | `string` | The UUID of the request. |
| `errorCode` | `integer` | Error code. Fixed as `0`. |
| `errorMsg` | `string` | Error message. Fixed as `"Success"`. |
| `result` | `object` | The result of the operation. |
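When a request is not processed successfully, the response body has the following attributes:

| Name | Type | Meaning |
|---|---|---|
| `logId` | `string` | The UUID of the request. |
| `errorCode` | `integer` | Error code. Same as the response status code. |
| `errorMsg` | `string` | Error message. |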
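A client can use the two response shapes above to separate success from failure. The following is a minimal sketch under the assumption that the body has already been parsed into a dictionary. `ServiceError` and `unwrap` are hypothetical names, and the status code 422 in the demonstration is just an example value.

```python
class ServiceError(Exception):
    """Raised when the response body reports a non-zero errorCode."""

    def __init__(self, log_id, error_code, error_msg):
        super().__init__(f"[{log_id}] {error_code}: {error_msg}")
        self.log_id = log_id
        self.error_code = error_code
        self.error_msg = error_msg


def unwrap(body):
    """Return body["result"] on success, raise ServiceError otherwise."""
    if body.get("errorCode") == 0:
        return body["result"]
    raise ServiceError(body.get("logId"), body.get("errorCode"), body.get("errorMsg"))


ok = {"logId": "abc", "errorCode": 0, "errorMsg": "Success", "result": {"x": 1}}
print(unwrap(ok))

try:
    unwrap({"logId": "def", "errorCode": 422, "errorMsg": "Invalid input"})
except ServiceError as exc:
    print("request failed:", exc)
```

Keeping `logId` on the exception makes it easier to correlate a client-side failure with the server logs.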
The main operations provided by the service are as follows:

`infer`: Obtain the document image preprocessing results.

`POST /document-preprocessing`
The request body has the following attributes:

| Name | Type | Meaning | Required |
|---|---|---|---|
| `file` | `string` | The URL of an image or PDF file accessible by the server, or the Base64-encoded content of the file. By default, for PDF files exceeding 10 pages, only the first 10 pages are processed. The page limit can be removed through the pipeline configuration file. | Yes |
| `fileType` | `integer` \| `null` | The type of the file. `0` for PDF files, `1` for image files. If this attribute is missing, the file type is inferred from the URL. | No |
| `useDocOrientationClassify` | `boolean` \| `null` | Please refer to the description of the `use_doc_orientation_classify` parameter of the pipeline object's `predict` method. | No |
| `useDocUnwarping` | `boolean` \| `null` | Please refer to the description of the `use_doc_unwarping` parameter of the pipeline object's `predict` method. | No |
| `visualize` | `boolean` \| `null` | Whether to return the final visualization image and the intermediate images produced during processing. Image return can be disabled by default through a setting in the pipeline configuration file, and that default can be overridden by explicitly setting `visualize` in the request. If `visualize` is `null` or omitted in the request and the configuration file does not define it, images are returned by default. | No |
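The request attributes above can be assembled with a small helper. This is a sketch rather than an official client: `build_payload` is a hypothetical function, and leaving unset options out of the payload (instead of sending `null`) is a design choice made here so that the server-side defaults apply.

```python
import base64


def build_payload(file_bytes, file_type=None, use_doc_orientation_classify=None,
                  use_doc_unwarping=None, visualize=None):
    """Build the JSON body for POST /document-preprocessing."""
    # The file content is sent Base64-encoded, as described in the table.
    payload = {"file": base64.b64encode(file_bytes).decode("ascii")}
    optional = {
        "fileType": file_type,
        "useDocOrientationClassify": use_doc_orientation_classify,
        "useDocUnwarping": use_doc_unwarping,
        "visualize": visualize,
    }
    # Only include options the caller actually set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


print(build_payload(b"abc", file_type=1, visualize=False))
```

Note that `visualize=False` is kept in the payload because only `None` values are filtered out.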
The `result` in the response body has the following attributes:

| Name | Type | Meaning |
|---|---|---|
| `docPreprocessingResults` | `array` | Document image preprocessing results. The array length is 1 (for image input) or the actual number of document pages processed (for PDF input). For PDF input, each element in the array represents the result of one page actually processed in the PDF file. |
| `dataInfo` | `object` | Information about the input data. |
Each element in `docPreprocessingResults` is an `object` with the following attributes:

| Name | Type | Meaning |
|---|---|---|
| `outputImage` | `string` | The preprocessed image. The image is in PNG format and is Base64-encoded. |
| `prunedResult` | `object` | A simplified version of the `res` field in the JSON representation of the result generated by the pipeline object's `predict` method, excluding the `input_path` and the `page_index` fields. |
| `docPreprocessingImage` | `string` \| `null` | The visualization result image. The image is in JPEG format and is Base64-encoded. |
| `inputImage` | `string` \| `null` | The input image. The image is in JPEG format and is Base64-encoded. |
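Because `docPreprocessingImage` and `inputImage` may be `null`, a client should check each image field before decoding it. The sketch below shows one way to do that for a single element of `docPreprocessingResults`. `decode_images` is a hypothetical helper, and the sample element uses placeholder bytes rather than a real image.

```python
import base64

IMAGE_FIELDS = ("outputImage", "docPreprocessingImage", "inputImage")


def decode_images(item):
    """Decode every Base64 image field of one result that is present and not null."""
    images = {}
    for field in IMAGE_FIELDS:
        data = item.get(field)
        if data is not None:
            images[field] = base64.b64decode(data)
    return images


# Placeholder element: only outputImage carries data.
item = {
    "outputImage": base64.b64encode(b"png-bytes").decode("ascii"),
    "docPreprocessingImage": None,
    "prunedResult": {"angle": 180},
}
print(sorted(decode_images(item)))
```

The decoded bytes can then be written to `.png` (for `outputImage`) or `.jpg` files (for the other two fields), following the formats given in the table.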
Python:

```python
import base64
import requests

API_URL = "http://localhost:8080/document-preprocessing"
file_path = "./demo.jpg"

# Base64-encode the local image for the JSON request body.
with open(file_path, "rb") as file:
    file_bytes = file.read()
    file_data = base64.b64encode(file_bytes).decode("ascii")

payload = {"file": file_data, "fileType": 1}  # fileType 1 = image file

response = requests.post(API_URL, json=payload)
assert response.status_code == 200

# Print each result and save the preprocessed image.
result = response.json()["result"]
for i, res in enumerate(result["docPreprocessingResults"]):
    print(res["prunedResult"])
    output_img_path = f"out_{i}.png"
    with open(output_img_path, "wb") as f:
        f.write(base64.b64decode(res["outputImage"]))
    print(f"Output image saved at {output_img_path}")
```
C++:

```cpp
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
#include "base64.hpp" // https://github.com/tobiaslocker/base64

int main() {
    httplib::Client client("localhost", 8080);
    const std::string filePath = "./demo.jpg";

    // Open at the end of the file so tellg() reports its size.
    std::ifstream file(filePath, std::ios::binary | std::ios::ate);
    if (!file) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return 1;
    }

    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);
    std::vector<char> buffer(size);
    if (!file.read(buffer.data(), size)) {
        std::cerr << "Error reading file." << std::endl;
        return 1;
    }

    std::string bufferStr(buffer.data(), static_cast<size_t>(size));
    std::string encodedFile = base64::to_base64(bufferStr);

    nlohmann::json jsonObj;
    jsonObj["file"] = encodedFile;
    jsonObj["fileType"] = 1;

    auto response = client.Post("/document-preprocessing", jsonObj.dump(), "application/json");

    if (response && response->status == 200) {
        nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
        auto result = jsonResponse["result"];

        if (!result.is_object() || !result["docPreprocessingResults"].is_array()) {
            std::cerr << "Unexpected response format." << std::endl;
            return 1;
        }

        for (size_t i = 0; i < result["docPreprocessingResults"].size(); ++i) {
            auto res = result["docPreprocessingResults"][i];

            if (res.contains("prunedResult")) {
                std::cout << "Preprocessed result: " << res["prunedResult"].dump() << std::endl;
            }

            if (res.contains("outputImage")) {
                std::string outputImgPath = "out_" + std::to_string(i) + ".png";
                std::string decodedImage = base64::from_base64(res["outputImage"].get<std::string>());

                std::ofstream outFile(outputImgPath, std::ios::binary);
                if (outFile.is_open()) {
                    outFile.write(decodedImage.c_str(), decodedImage.size());
                    outFile.close();
                    std::cout << "Saved image: " << outputImgPath << std::endl;
                } else {
                    std::cerr << "Failed to write image: " << outputImgPath << std::endl;
                }
            }
        }
    } else {
        std::cerr << "Request failed." << std::endl;
        if (response) {
            std::cerr << "HTTP status: " << response->status << std::endl;
            std::cerr << "Response body: " << response->body << std::endl;
        }
        return 1;
    }

    return 0;
}
```
Java:

```java
import okhttp3.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;

public class Main {
    public static void main(String[] args) throws IOException {
        String API_URL = "http://localhost:8080/document-preprocessing";
        String imagePath = "./demo.jpg";

        File file = new File(imagePath);
        byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
        String base64Image = Base64.getEncoder().encodeToString(fileContent);

        ObjectMapper objectMapper = new ObjectMapper();
        ObjectNode payload = objectMapper.createObjectNode();
        payload.put("file", base64Image);
        payload.put("fileType", 1);

        OkHttpClient client = new OkHttpClient();
        MediaType JSON = MediaType.get("application/json; charset=utf-8");
        RequestBody body = RequestBody.create(JSON, payload.toString());

        Request request = new Request.Builder()
                .url(API_URL)
                .post(body)
                .build();

        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful()) {
                String responseBody = response.body().string();
                JsonNode root = objectMapper.readTree(responseBody);
                JsonNode result = root.get("result");

                JsonNode docPreprocessingResults = result.get("docPreprocessingResults");
                for (int i = 0; i < docPreprocessingResults.size(); i++) {
                    JsonNode item = docPreprocessingResults.get(i);
                    int finalI = i;

                    JsonNode prunedResult = item.get("prunedResult");
                    System.out.println("Pruned Result [" + i + "]: " + prunedResult.toString());

                    String outputImgBase64 = item.get("outputImage").asText();
                    byte[] outputImgBytes = Base64.getDecoder().decode(outputImgBase64);
                    String outputImgPath = "out_" + finalI + ".png";
                    try (FileOutputStream fos = new FileOutputStream(outputImgPath)) {
                        fos.write(outputImgBytes);
                        System.out.println("Saved output image: " + outputImgPath);
                    }

                    // inputImage may be null, so check before decoding.
                    JsonNode inputImageNode = item.get("inputImage");
                    if (inputImageNode != null && !inputImageNode.isNull()) {
                        String inputImageBase64 = inputImageNode.asText();
                        byte[] inputImageBytes = Base64.getDecoder().decode(inputImageBase64);
                        String inputImgPath = "inputImage_" + i + ".jpg";
                        try (FileOutputStream fos = new FileOutputStream(inputImgPath)) {
                            fos.write(inputImageBytes);
                            System.out.println("Saved input image to: " + inputImgPath);
                        }
                    }
                }
            } else {
                System.err.println("Request failed with HTTP code: " + response.code());
            }
        }
    }
}
```
Go:

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	API_URL := "http://localhost:8080/document-preprocessing"
	filePath := "./demo.jpg"

	// os.ReadFile and io.ReadAll replace the deprecated io/ioutil helpers.
	fileBytes, err := os.ReadFile(filePath)
	if err != nil {
		fmt.Printf("Error reading file: %v\n", err)
		return
	}
	fileData := base64.StdEncoding.EncodeToString(fileBytes)

	payload := map[string]interface{}{
		"file":     fileData,
		"fileType": 1,
	}
	payloadBytes, err := json.Marshal(payload)
	if err != nil {
		fmt.Printf("Error marshaling payload: %v\n", err)
		return
	}

	client := &http.Client{}
	req, err := http.NewRequest("POST", API_URL, bytes.NewBuffer(payloadBytes))
	if err != nil {
		fmt.Printf("Error creating request: %v\n", err)
		return
	}
	req.Header.Set("Content-Type", "application/json")

	res, err := client.Do(req)
	if err != nil {
		fmt.Printf("Error sending request: %v\n", err)
		return
	}
	defer res.Body.Close()

	if res.StatusCode != http.StatusOK {
		fmt.Printf("Unexpected status code: %d\n", res.StatusCode)
		return
	}

	body, err := io.ReadAll(res.Body)
	if err != nil {
		fmt.Printf("Error reading response body: %v\n", err)
		return
	}

	type DocPreprocessingResult struct {
		PrunedResult          map[string]interface{} `json:"prunedResult"`
		OutputImage           string                 `json:"outputImage"`
		DocPreprocessingImage *string                `json:"docPreprocessingImage"`
		InputImage            *string                `json:"inputImage"`
	}

	type Response struct {
		Result struct {
			DocPreprocessingResults []DocPreprocessingResult `json:"docPreprocessingResults"`
			DataInfo                interface{}              `json:"dataInfo"`
		} `json:"result"`
	}

	var respData Response
	if err := json.Unmarshal(body, &respData); err != nil {
		fmt.Printf("Error unmarshaling response: %v\n", err)
		return
	}

	for i, res := range respData.Result.DocPreprocessingResults {
		fmt.Printf("Result %d - prunedResult: %+v\n", i, res.PrunedResult)

		imgBytes, err := base64.StdEncoding.DecodeString(res.OutputImage)
		if err != nil {
			fmt.Printf("Error decoding outputImage at index %d: %v\n", i, err)
			continue
		}

		filename := fmt.Sprintf("out_%d.png", i)
		if err := os.WriteFile(filename, imgBytes, 0644); err != nil {
			fmt.Printf("Error saving image %s: %v\n", filename, err)
			continue
		}
		fmt.Printf("Saved output image to %s\n", filename)
	}
}
```
C#:

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class Program
{
    static readonly string API_URL = "http://localhost:8080/document-preprocessing";
    static readonly string inputFilePath = "./demo.jpg";

    static async Task Main(string[] args)
    {
        var httpClient = new HttpClient();

        byte[] fileBytes = File.ReadAllBytes(inputFilePath);
        string fileData = Convert.ToBase64String(fileBytes);

        var payload = new JObject
        {
            { "file", fileData },
            { "fileType", 1 }
        };
        var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");

        HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
        response.EnsureSuccessStatusCode();

        string responseBody = await response.Content.ReadAsStringAsync();
        JObject jsonResponse = JObject.Parse(responseBody);

        JArray docPreResults = (JArray)jsonResponse["result"]["docPreprocessingResults"];
        for (int i = 0; i < docPreResults.Count; i++)
        {
            var res = docPreResults[i];
            Console.WriteLine($"[{i}] prunedResult:\n{res["prunedResult"]}");

            string base64Image = res["outputImage"]?.ToString();
            if (!string.IsNullOrEmpty(base64Image))
            {
                string outputPath = $"out_{i}.png";
                byte[] imageBytes = Convert.FromBase64String(base64Image);
                File.WriteAllBytes(outputPath, imageBytes);
                Console.WriteLine($"Output image saved at {outputPath}");
            }
            else
            {
                Console.WriteLine($"outputImage at index {i} is null.");
            }
        }
    }
}
```
Node.js:

```js
const axios = require('axios');
const fs = require('fs');

const API_URL = 'http://localhost:8080/document-preprocessing';
const imagePath = './demo.jpg';

// Read the image from disk and return its Base64 encoding.
function encodeImageToBase64(filePath) {
  const bitmap = fs.readFileSync(filePath);
  return Buffer.from(bitmap).toString('base64');
}

const payload = {
  file: encodeImageToBase64(imagePath),
  fileType: 1
};

axios.post(API_URL, payload, {
  headers: {
    'Content-Type': 'application/json'
  },
  maxBodyLength: Infinity
})
  .then((response) => {
    const results = response.data.result.docPreprocessingResults;

    results.forEach((res, index) => {
      console.log(`\n[${index}] prunedResult:`);
      console.log(res.prunedResult);

      const base64Image = res.outputImage;
      if (base64Image) {
        const outputImagePath = `out_${index}.png`;
        const imageBuffer = Buffer.from(base64Image, 'base64');
        fs.writeFileSync(outputImagePath, imageBuffer);
        console.log(`Output image saved at ${outputImagePath}`);
      } else {
        console.log(`outputImage at index ${index} is null.`);
      }
    });
  })
  .catch((error) => {
    console.error('API error:', error.message);
  });
```
PHP:

```php
<?php

$API_URL = "http://localhost:8080/document-preprocessing";
$image_path = "./demo.jpg";
$output_image_path = "./out_0.png";

$image_data = base64_encode(file_get_contents($image_path));
$payload = array("file" => $image_data, "fileType" => 1);

$ch = curl_init($API_URL);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($payload));
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

$result = json_decode($response, true)["result"]["docPreprocessingResults"];

foreach ($result as $i => $item) {
    echo "[$i] prunedResult:\n";
    print_r($item["prunedResult"]);

    if (!empty($item["outputImage"])) {
        $output_image_path = "out_" . $i . ".png";
        file_put_contents($output_image_path, base64_decode($item["outputImage"]));
        echo "Output image saved at $output_image_path\n";
    } else {
        echo "No outputImage found for item $i\n";
    }
}

?>
```
If the pipeline does not perform well enough in your scenario, the table below indicates which module to fine-tune:

| Situation | Fine-tuning model | Fine-tuning reference link |
|---|---|---|
| The overall image rotation correction is inaccurate. | Image orientation classification module | Link |
| The image distortion correction is inaccurate. | Image Unwarping | Fine-tuning is not supported at the moment. |