---
comments: true
---

# Text Image Unwarping Module Development Tutorial

## I. Overview

The primary purpose of Text Image Unwarping is to apply geometric transformations to images in order to correct issues such as document distortion, tilt, and perspective deformation, enabling more accurate recognition by subsequent text recognition modules.

## II. Supported Model List
| Model Name | Model Download Link | MS-SSIM (%) | Model Size (M) | Description |
|---|---|---|---|---|
| UVDoc | Inference Model / Trained Model | 54.40 | 30.3 | High-precision Text Image Unwarping model |
The accuracy metrics of the above model are measured on the [DocUNet benchmark](https://www3.cs.stonybrook.edu/~cvl/docunet.html) dataset.

## III. Quick Integration

> ❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the [PaddleX Local Installation Guide](../../../installation/installation.en.md).

Just a few lines of code are enough to run inference with the Text Image Unwarping module, and you can easily switch between the models under this module. You can also integrate the model inference of the Text Image Unwarping module into your own project. Before running the following code, please download the [demo image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/doc_test.jpg) to your local machine.

```python
from paddlex import create_model

model = create_model(model_name="UVDoc")
output = model.predict("doc_test.jpg", batch_size=1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/res.json")
```

After running, the result obtained is:

```bash
{'res': "{'input_path': 'doc_test.jpg', 'doctr_img': '...'}"}
```

The meanings of the result parameters are as follows:

- `input_path`: The path of the input image to be corrected.
- `doctr_img`: The corrected image result. Since there is too much data to print directly, `...` is used here as a placeholder.

The prediction result can be saved as an image through `res.save_to_img()` and as a JSON file through `res.save_to_json()`. The visualization image is as follows:

Relevant methods, parameters, and explanations are as follows:

* `create_model` instantiates an image correction model (here using `UVDoc` as an example). The specific explanation is as follows:
| Parameter | Parameter Description | Parameter Type | Options | Default Value |
|---|---|---|---|---|
| `model_name` | Name of the model | `str` | All model names supported by PaddleX | None |
| `model_dir` | Path to store the model | `str` | None | None |
* The `model_name` must be specified. After specifying `model_name`, the default model parameters built into PaddleX will be used. If `model_dir` is specified, the user-defined model is used instead, as in the sketch below.
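A minimal sketch of both ways to instantiate the model is shown below; the `./my_uvdoc_weights` directory is a hypothetical placeholder, not a path shipped with PaddleX.

```python
from paddlex import create_model

# Use the built-in UVDoc weights that ship with PaddleX.
model = create_model(model_name="UVDoc")

# Or load user-defined weights from a local directory.
# "./my_uvdoc_weights" is a placeholder; point it at a real weight
# directory before uncommenting.
# custom_model = create_model(model_name="UVDoc", model_dir="./my_uvdoc_weights")
```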
* The `predict()` method of the image correction model is called for inference prediction. The parameters of the `predict()` method are `input` and `batch_size`, with specific explanations as follows (a usage sketch follows the table):

| Parameter | Parameter Description | Parameter Type | Options | Default Value |
|---|---|---|---|---|
| `input` | Data to be predicted, supporting multiple input types | `Python Var`/`str`/`dict`/`list` | <ul><li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li><li><b>File path</b>, such as the local path of an image file: <code>/root/data/img.jpg</code></li><li><b>URL link</b>, such as the network URL of an image file</li><li><b>Local directory</b>, which should contain the data files to be predicted, such as the local path <code>/root/data/</code></li><li><b>Dictionary</b>, whose key must correspond to the specific task, such as <code>"img"</code> for image classification tasks; the value supports the above types of data, for example <code>{"img": "/root/data1"}</code></li><li><b>List</b>, whose elements must be of the above types, such as <code>[numpy.ndarray, numpy.ndarray]</code>, <code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>, <code>["/root/data1", "/root/data2"]</code>, <code>[{"img": "/root/data1"}, {"img": "/root/data2/img.jpg"}]</code></li></ul> | None |
| `batch_size` | Batch size | `int` | Any integer | `1` |
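The following minimal sketch illustrates a few of these input types; `doc_test.jpg` is the demo image from the quick start, the other file names are hypothetical placeholders, and OpenCV is only one possible way to obtain a `numpy.ndarray` image.

```python
import cv2
from paddlex import create_model

model = create_model(model_name="UVDoc")

# 1) A single local file path (the demo image from the quick start)
output = model.predict("doc_test.jpg", batch_size=1)

# 2) A Python variable: image data as a numpy.ndarray, read here with OpenCV
img = cv2.imread("doc_test.jpg")
output = model.predict(img, batch_size=1)

# 3) A list of file paths, processed two at a time
#    (these file names are placeholders)
output = model.predict(["img1.jpg", "img2.jpg"], batch_size=2)

for res in output:
    res.print()
```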
* The prediction result for each sample is of type `dict`, and supports operations such as printing, saving as an image, and saving as a `json` file (see the sketch after the table):
| Method | Method Description | Parameter | Parameter Type | Parameter Description | Default Value |
|---|---|---|---|---|---|
| `print()` | Print the result to the terminal | `format_json` | `bool` | Whether to format the output content using `JSON` indentation | `True` |
| | | `indent` | `int` | Specify the indentation level to beautify the output `JSON` data, making it more readable. Only effective when `format_json` is `True` | 4 |
| | | `ensure_ascii` | `bool` | Control whether non-`ASCII` characters are escaped to `Unicode`. When set to `True`, all non-`ASCII` characters will be escaped; `False` retains the original characters. Only effective when `format_json` is `True` | `False` |
| `save_to_json()` | Save the result as a JSON file | `save_path` | `str` | The file path for saving. When it is a directory, the saved file name will match the input file name | None |
| | | `indent` | `int` | Specify the indentation level to beautify the output `JSON` data, making it more readable. Only effective when `format_json` is `True` | 4 |
| | | `ensure_ascii` | `bool` | Control whether non-`ASCII` characters are escaped to `Unicode`. When set to `True`, all non-`ASCII` characters will be escaped; `False` retains the original characters. Only effective when `format_json` is `True` | `False` |
| `save_to_img()` | Save the result as an image file | `save_path` | `str` | The file path for saving. When it is a directory, the saved file name will match the input file name | None |
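Continuing from the quick-start example, the following sketch shows these methods with explicit formatting parameters; `output` is the iterator returned by `model.predict()` above, and `./output/` is just an illustrative save location.

```python
for res in output:
    # Pretty-print the result with 2-space indentation, keeping non-ASCII characters
    res.print(format_json=True, indent=2, ensure_ascii=False)
    # Save the result to an explicit JSON file path
    res.save_to_json(save_path="./output/res.json", indent=4, ensure_ascii=False)
    # Save the unwarped image; passing a directory reuses the input file name
    res.save_to_img(save_path="./output/")
```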
* In addition, the visualized image with results and the prediction result can be obtained through the following attributes (see the sketch after the table):
| Attribute | Attribute Description |
|---|---|
| `json` | Get the prediction result in `json` format |
| `img` | Get the visualized image in `dict` format |
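As a brief sketch (again continuing from the quick-start example), the attributes can be read like this; the exact contents of the returned values depend on the model:

```python
for res in output:
    json_result = res.json  # prediction result in json format
    img_result = res.img    # visualized image(s), returned as a dict
    print(type(json_result), type(img_result))
```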
For more information on using PaddleX's single-model inference API, refer to the [PaddleX Single Model Python Script Usage Instructions](../../instructions/model_python_API.en.md).

## IV. Custom Development

The current module does not yet support fine-tuning training and only supports inference integration. Support for fine-tuning training of this module is planned for the future.