add textline_orientation (#2477)

zhangyubo0722 11 months ago
parent commit 8a64bb1a92

+ 252 - 0
docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.en.md

@@ -0,0 +1,252 @@
+---
+comments: true
+---
+
+# Tutorial for Text Line Orientation Classification Module
+
+## I. Overview
+The text line orientation classification module distinguishes the orientation of text lines and corrects it in post-processing. In processes such as document scanning and license/certificate photography, the capture device is sometimes rotated to obtain a clearer shot, leaving the resulting text lines in various orientations, which standard OCR pipelines cannot handle well. By applying image classification, the orientation of a text line can be predicted in advance and corrected, improving the accuracy of OCR processing.
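+
+Since the module only distinguishes 0 from 180 degrees (see the model list below), the post-processing correction amounts to a conditional rotation. A minimal sketch using Pillow; the label string `"180"` is an assumption for illustration, not the module's documented output format:
+
+```python
+from PIL import Image
+
+def correct_textline(img_path: str, predicted_label: str) -> Image.Image:
+    """Rotate a text-line crop back upright given the predicted class."""
+    img = Image.open(img_path)
+    # Two classes: "0" needs no change, "180" is flipped upside down.
+    return img.rotate(180) if predicted_label == "180" else img
+
+# Hypothetical usage with a label produced by the classifier:
+correct_textline("textline_rot180_demo.jpg", "180").save("./output/upright.jpg")
+```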
+
+## II. Supported Model List
+
+<table>
+<thead>
+<tr>
+<th>Model</th>
+<th>Top-1 Accuracy (%)</th>
+<th>GPU Inference Time (ms)</th>
+<th>CPU Inference Time (ms)</th>
+<th>Model Size (M)</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>PP-LCNet_x0_25_textline_ori</td>
+<td>95.54</td>
+<td>-</td>
+<td>-</td>
+<td>0.32</td>
+<td>Text line classification model based on PP-LCNet_x0_25, with two classes: 0 degrees and 180 degrees</td>
+</tr>
+</tbody>
+</table>
+<b>Note: The above accuracy metrics are evaluated on a self-built dataset covering multiple scenarios such as documents and licenses, containing 1000 images. GPU inference time is based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b>
+
+## III. Quick Integration
+
+> ❗ Before quick integration, please install the PaddleX wheel package. For details, refer to the [PaddleX Local Installation Tutorial](../../../installation/installation.en.md).
+
+After installing the wheel package, a few lines of code can complete the inference of the text line orientation classification module. You can switch models under this module freely, and you can also integrate the model inference of the text line orientation classification module into your project. Before running the following code, please download the [example image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg) locally.
+
+```python
+from paddlex import create_model
+model = create_model("PP-LCNet_x0_25_textline_ori")
+output = model.predict("textline_rot180_demo.jpg", batch_size=1)
+for res in output:
+    res.print(json_format=False)
+    res.save_to_img("./output/demo.png")
+    res.save_to_json("./output/res.json")
+```
+For more information on using the PaddleX single-model inference API, refer to the [PaddleX Single Model Python Script Usage Instructions](../../instructions/model_python_API.en.md).
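+
+The `predict` method is not limited to a single image. A minimal sketch, assuming (per the single-model API instructions linked above) that `predict()` also accepts a list of image paths and streams back one result per input:
+
+```python
+from paddlex import create_model
+
+model = create_model("PP-LCNet_x0_25_textline_ori")
+
+# Assumption: list input is supported; results are yielded per image.
+images = ["textline_rot180_demo.jpg", "textline_rot180_demo.jpg"]
+for res in model.predict(images, batch_size=2):
+    res.print(json_format=False)
+```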
+
+## IV. Custom Development
+If you seek higher accuracy than the existing models provide, you can leverage PaddleX's custom development capabilities to develop a better text line orientation classification model. Before developing a text line orientation classification model with PaddleX, ensure that you have installed PaddleX's classification-related model training capabilities. The installation process can be found in the [PaddleX Local Installation Tutorial](../../../installation/installation.en.md).
+
+### 4.1 Data Preparation
+Before model training, you need to prepare a dataset for the corresponding task module. PaddleX provides data validation functionality for each module, and <b>only data that passes validation can be used for model training</b>. Additionally, PaddleX provides demo datasets for each module, allowing you to complete subsequent development based on the official demo data. If you wish to use a private dataset for subsequent model training, refer to the [PaddleX Image Classification Task Module Data Preparation Tutorial](../../../data_annotations/cv_modules/image_classification.en.md).
+
+#### 4.1.1 Demo Data Download
+You can download the demo dataset to a specified folder using the following command:
+
+```bash
+wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/textline_orientation_example_data.tar -P ./dataset
+tar -xf ./dataset/textline_orientation_example_data.tar -C ./dataset/
+```
+#### 4.1.2 Data Validation
+You can complete data validation with a single command:
+
+```bash
+python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=check_dataset \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data
+```
+After executing the above command, PaddleX will validate the dataset and collect basic information about it. Upon successful execution, the log will print the message `Check dataset passed !`. The validation result file is saved in `./output/check_dataset_result.json`, and related outputs are saved in the `./output/check_dataset` directory under the current directory, including visualized sample images and sample distribution histograms.
+
+<details><summary>👉 <b>Details of Validation Results (Click to Expand)</b></summary>
+
+<p>The specific content of the validation result file is as follows:</p>
+<pre><code class="language-bash">{
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "label_file": "../../dataset/textline_orientation_example_data/label.txt",
+    "num_classes": 2,
+    "train_samples": 1000,
+    "train_sample_paths": [
+      "check_dataset/demo_img/ILSVRC2012_val_00019234_4284.jpg",
+      "check_dataset/demo_img/lsvt_train_images_4655.jpg",
+      "check_dataset/demo_img/lsvt_train_images_60562.jpg",
+      "check_dataset/demo_img/lsvt_train_images_14013.jpg",
+      "check_dataset/demo_img/ILSVRC2012_val_00011156_12950.jpg",
+      "check_dataset/demo_img/ILSVRC2012_val_00016578_10192.jpg",
+      "check_dataset/demo_img/26920921_2341381071.jpg",
+      "check_dataset/demo_img/31979250_3394569384.jpg",
+      "check_dataset/demo_img/25959328_518853598.jpg",
+      "check_dataset/demo_img/ILSVRC2012_val_00018420_14077.jpg"
+    ],
+    "val_samples": 200,
+    "val_sample_paths": [
+      "check_dataset/demo_img/lsvt_train_images_79109.jpg",
+      "check_dataset/demo_img/lsvt_train_images_131133.jpg",
+      "check_dataset/demo_img/mtwi_train_images_65423.jpg",
+      "check_dataset/demo_img/lsvt_train_images_120718.jpg",
+      "check_dataset/demo_img/mtwi_train_images_58098.jpg",
+      "check_dataset/demo_img/rctw_train_images_25817.jpg",
+      "check_dataset/demo_img/lsvt_val_images_6336.jpg",
+      "check_dataset/demo_img/lsvt_train_images_71775.jpg",
+      "check_dataset/demo_img/mtwi_train_images_78064.jpg",
+      "check_dataset/demo_img/mtwi_train_images_52578.jpg"
+    ]
+  },
+  "analysis": {
+    "histogram": "check_dataset/histogram.png"
+  },
+  "dataset_path": "./dataset/textline_orientation_example_data",
+  "show_type": "image",
+  "dataset_type": "ClsDataset"
+}
+</code></pre>
+<p>In the above validation results, <code>check_pass</code> being <code>true</code> indicates that the dataset format meets the requirements. Explanations for other indicators are as follows:</p>
+<ul>
+<li><code>attributes.num_classes</code>: The number of classes in this dataset is 2;</li>
+<li><code>attributes.train_samples</code>: The number of training samples in this dataset is 1000;</li>
+<li><code>attributes.val_samples</code>: The number of validation samples in this dataset is 200;</li>
+<li><code>attributes.train_sample_paths</code>: The list of relative paths to the visualization images of the training samples in this dataset;</li>
+<li><code>attributes.val_sample_paths</code>: The list of relative paths to the visualization images of the validation samples in this dataset;</li>
+</ul>
+<p>The dataset validation also analyzes the distribution of sample numbers across all classes in the dataset and generates a distribution histogram (histogram.png):</p>
+<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/textline_ori_classification/01.png"></p></details>
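+
+Because the validation report is plain JSON, a training script can gate on it programmatically. A minimal sketch using the fields shown above:
+
+```python
+import json
+
+# Load the report written by the check_dataset command.
+with open("./output/check_dataset_result.json", "r", encoding="utf-8") as f:
+    report = json.load(f)
+
+# done_flag and check_pass are the top-level status fields shown above.
+assert report["done_flag"] and report["check_pass"], "dataset validation failed"
+
+attrs = report["attributes"]
+print(f"classes: {attrs['num_classes']}, "
+      f"train: {attrs['train_samples']}, val: {attrs['val_samples']}")
+```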
+
+#### 4.1.3 Dataset Format Conversion / Dataset Splitting (Optional)
+After completing data validation, you can convert the dataset format and re-split the training/validation ratio of the dataset by <b>modifying the configuration file</b> or <b>appending hyperparameters</b>.
+
+<details><summary>👉 <b>Details on Format Conversion / Dataset Splitting (Click to Expand)</b></summary>
+
+<p><b>(1) Dataset Format Conversion</b></p>
+<p>Text line orientation classification does not currently support data format conversion.</p>
+<p><b>(2) Dataset Splitting</b></p>
+<p>Parameters for dataset splitting can be set by modifying the fields under <code>CheckDataset</code> in the configuration file. Examples of some parameters in the configuration file are as follows:</p>
+<ul>
+<li><code>CheckDataset</code>:</li>
+<li><code>split</code>:</li>
+<li><code>enable</code>: Whether to re-split the dataset. When set to <code>True</code>, dataset splitting is performed, with a default of <code>False</code>;</li>
+<li><code>train_percent</code>: If re-splitting the dataset, you need to set the percentage of the training set, which is any integer between 0 and 100, and must sum to 100 with the value of <code>val_percent</code>;</li>
+</ul>
+<p>For example, if you want to re-split the dataset with 90% for the training set and 10% for the validation set, you need to modify the configuration file as follows:</p>
+<pre><code class="language-bash">......
+CheckDataset:
+  ......
+  split:
+    enable: True
+    train_percent: 90
+    val_percent: 10
+  ......
+</code></pre>
+<p>Then execute the command:</p>
+<pre><code class="language-bash">python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=check_dataset \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data
+</code></pre>
+<p>After the data splitting is executed, the original annotation files will be renamed to <code>xxx.bak</code> in the original path.</p>
+<p>The above parameters can also be set by appending command-line arguments:</p>
+<pre><code class="language-bash">python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=check_dataset \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data \
+    -o CheckDataset.split.enable=True \
+    -o CheckDataset.split.train_percent=90 \
+    -o CheckDataset.split.val_percent=10
+</code></pre></details>
+
+### 4.2 Model Training
+Model training can be completed with a single command. Here, the training of the text line orientation classification model (PP-LCNet_x0_25_textline_ori) is taken as an example:
+
+```bash
+python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=train \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data
+```
+The following steps are required:
+
+* Specify the path to the `.yaml` configuration file for the model (here it is `PP-LCNet_x0_25_textline_ori.yaml`. When training other models, you need to specify the corresponding configuration file. The correspondence between models and configuration files can be found in the [PaddleX Model List (CPU/GPU)](../../../support_list/models_list.en.md)).
+* Specify the mode as model training: `-o Global.mode=train`
+* Specify the path to the training dataset: `-o Global.dataset_dir`
+Other related parameters can be set by modifying the fields under `Global` and `Train` in the `.yaml` configuration file or by appending parameters in the command line. For example, to specify the first two GPUs for training: `-o Global.device=gpu:0,1`; to set the number of training epochs to 10: `-o Train.epochs_iters=10`. For more modifiable parameters and their detailed explanations, refer to the configuration file description for the corresponding task module of the model [PaddleX Common Model Configuration Parameters](../../instructions/config_parameters_common.en.md).
+
+<details><summary>👉 <b>More Details (Click to Expand)</b></summary>
+
+<ul>
+<li>During model training, PaddleX automatically saves the model weights to the default directory <code>output</code>. If you need to specify a save path, you can set it through the <code>-o Global.output</code> field in the configuration file.</li>
+<li>PaddleX shields you from the concepts of dynamic graph weights and static graph weights. During model training, both dynamic and static graph weights are produced, and static graph weights are selected by default for model inference.</li>
+<li>
+<p>After completing model training, all outputs are saved in the specified output directory (default is <code>./output/</code>), typically including the following:</p>
+</li>
+<li>
+<p><code>train_result.json</code>: Training result record file, recording whether the training task was completed normally, as well as the output weight metrics, related file paths, etc.;</p>
+</li>
+<li><code>train.log</code>: Training log file, recording changes in model metrics, loss, etc., during training;</li>
+<li><code>config.yaml</code>: Training configuration file, recording the hyperparameter configurations for this training;</li>
+<li><code>.pdparams</code>, <code>.pdema</code>, <code>.pdopt</code>, <code>.pdstates</code>, <code>.pdiparams</code>, <code>.pdmodel</code>: Model weight-related files, including network parameters, optimizer, EMA, static graph network parameters, static graph network structure, etc.;</li>
+</ul></details>
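+
+After a run, the weight-related files listed above land under the output directory. A minimal sketch that locates them before pointing evaluation at a specific weights file (see section 4.3); which suffixes actually exist depends on the run:
+
+```python
+from pathlib import Path
+
+# Scan the default output directory for weight-related artifacts,
+# using the suffixes listed above.
+output_dir = Path("./output")
+for suffix in (".pdparams", ".pdema", ".pdopt", ".pdstates", ".pdiparams", ".pdmodel"):
+    for path in sorted(output_dir.rglob(f"*{suffix}")):
+        print(path)
+```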
+
+### 4.3 Model Evaluation
+After completing model training, you can evaluate the specified model weights on the validation set to verify the model's accuracy. Using PaddleX for model evaluation can be done with a single command:
+
+```bash
+python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=evaluate \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data
+```
+Similar to model training, the following steps are required:
+
+* Specify the path to the model's `.yaml` configuration file (here it is `PP-LCNet_x0_25_textline_ori.yaml`)
+* Specify the mode as model evaluation: `-o Global.mode=evaluate`
+* Specify the path to the validation dataset: `-o Global.dataset_dir`
+Other related parameters can be set by modifying the fields under `Global` and `Evaluate` in the `.yaml` configuration file. For details, please refer to [PaddleX Common Model Configuration File Parameter Description](../../instructions/config_parameters_common.en.md).
+
+<details><summary>👉 <b>More Details (Click to Expand)</b></summary>
+
+<p>When evaluating the model, you need to specify the path to the model weights file. Each configuration file has a default weight save path built in. If you need to change it, you can set it by appending a command-line parameter, such as <code>-o Evaluate.weight_path="./output/best_model/best_model.pdparams"</code>.</p>
+<p>After the evaluation completes, an <code>evaluate_result.json</code> file is produced, which records the evaluation results: whether the evaluation task was completed normally, and the model's evaluation metrics, including Top-1 Accuracy.</p></details>
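+
+To consume the metrics in a script, the file can be read directly. A minimal sketch; the exact key names are not documented here, so the sketch prints the whole document instead of assuming a schema:
+
+```python
+import json
+
+# evaluate_result.json records completion status and metrics
+# (including Top-1 Accuracy); key names vary by PaddleX version.
+with open("./output/evaluate_result.json", "r", encoding="utf-8") as f:
+    print(json.dumps(json.load(f), indent=2, ensure_ascii=False))
+```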
+
+### 4.4 Model Inference and Model Integration
+After completing model training and evaluation, you can use the trained model weights for inference predictions or Python integration.
+
+#### 4.4.1 Model Inference
+Performing inference predictions through the command line requires only the following single command. Before running the following code, please download the [example image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg) locally.
+
+```bash
+python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=predict \
+    -o Predict.model_dir="./output/best_model/inference" \
+    -o Predict.input="textline_rot180_demo.jpg"
+```
+Similar to model training and evaluation, the following steps are required:
+
+* Specify the path to the model's `.yaml` configuration file (here it is `PP-LCNet_x0_25_textline_ori.yaml`)
+* Specify the mode as model inference prediction: `-o Global.mode=predict`
+* Specify the path to the model weights: `-o Predict.model_dir="./output/best_model/inference"`
+* Specify the path to the input data: `-o Predict.input="..."`
+Other related parameters can be set by modifying the fields under `Global` and `Predict` in the `.yaml` configuration file. For details, please refer to [PaddleX Common Model Configuration File Parameter Description](../../instructions/config_parameters_common.en.md).
+
+#### 4.4.2 Model Integration
+The model can be directly integrated into the PaddleX pipeline or into your own project.
+
+1. **Pipeline Integration**
+
+The text line orientation classification module can be integrated into the [Document Scene Information Extraction v3 Pipeline (PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.en.md). Simply replace the model path to update the text line orientation classification module.
+
+2. **Module Integration**
+
+The weights you produce can be directly integrated into the text line orientation classification module. You can refer to the Python example code in [Quick Integration](#iii-quick-integration), and only need to replace the model with the path to your trained model.
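+
+For example, the Quick Integration snippet only changes in one place. A minimal sketch, assuming (as stated above) that `create_model` also accepts the path to a trained inference model directory such as the one used in section 4.4.1:
+
+```python
+from paddlex import create_model
+
+# Assumption: a local inference-model directory can be passed in place
+# of an official model name.
+model = create_model("./output/best_model/inference")
+
+for res in model.predict("textline_rot180_demo.jpg", batch_size=1):
+    res.print(json_format=False)
+```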

+ 253 - 0
docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.md

@@ -0,0 +1,253 @@
+---
+comments: true
+---
+
+# Tutorial for Text Line Orientation Classification Module
+
+## I. Overview
+The text line orientation classification module distinguishes the orientation of text lines and corrects it in post-processing. In processes such as document scanning and license/certificate photography, the capture device is sometimes rotated to obtain a clearer shot, leaving the resulting text lines in various orientations, which standard OCR pipelines cannot handle well. By applying image classification, the orientation of a text line can be predicted in advance and corrected, improving the accuracy of OCR processing.
+
+## II. Supported Model List
+
+<table>
+<thead>
+<tr>
+<th>Model</th>
+<th>Top-1 Acc (%)</th>
+<th>GPU Inference Time (ms)</th>
+<th>CPU Inference Time (ms)</th>
+<th>Model Size (M)</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>PP-LCNet_x0_25_textline_ori</td>
+<td>95.54</td>
+<td>-</td>
+<td>-</td>
+<td>0.32</td>
+<td>Text line classification model based on PP-LCNet_x0_25, with two classes: 0 degrees and 180 degrees</td>
+</tr>
+</tbody>
+</table>
+<b>Note: The above accuracy metrics are evaluated on a self-built dataset covering multiple scenarios such as documents and licenses, containing 1000 images. GPU inference time is based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speed is based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b>
+
+## III. Quick Integration
+
+> ❗ Before quick integration, please install the PaddleX wheel package. For details, refer to the [PaddleX Local Installation Tutorial](../../../installation/installation.md).
+
+After installing the wheel package, a few lines of code can complete the inference of the text line orientation classification module. You can freely switch models under this module, and you can also integrate the module's model inference into your project. Before running the following code, please download the [example image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg) locally.
+
+```python
+from paddlex import create_model
+model = create_model("PP-LCNet_x0_25_textline_ori")
+output = model.predict("textline_rot180_demo.jpg", batch_size=1)
+for res in output:
+    res.print(json_format=False)
+    res.save_to_img("./output/demo.png")
+    res.save_to_json("./output/res.json")
+```
+For more information on using the PaddleX single-model inference API, refer to the [PaddleX Single Model Python Script Usage Instructions](../../instructions/model_python_API.md).
+
+## IV. Custom Development
+If you seek higher accuracy than the existing models provide, you can leverage PaddleX's custom development capabilities to develop a better text line orientation classification model. Before developing a text line orientation classification model with PaddleX, ensure that you have installed PaddleX's classification-related model training capabilities. The installation process can be found in the [PaddleX Local Installation Tutorial](../../../installation/installation.md).
+
+### 4.1 Data Preparation
+Before model training, you need to prepare a dataset for the corresponding task module. PaddleX provides data validation functionality for each module, and <b>only data that passes validation can be used for model training</b>. Additionally, PaddleX provides demo datasets for each module, allowing you to complete subsequent development based on the official demo data. If you wish to use a private dataset for subsequent model training, refer to the [PaddleX Image Classification Task Module Data Preparation Tutorial](../../../data_annotations/cv_modules/image_classification.md).
+
+#### 4.1.1 Demo Data Download
+You can download the demo dataset to a specified folder using the following command:
+
+```bash
+wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/textline_orientation_example_data.tar -P ./dataset
+tar -xf ./dataset/textline_orientation_example_data.tar -C ./dataset/
+```
+#### 4.1.2 Data Validation
+You can complete data validation with a single command:
+
+```bash
+python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=check_dataset \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data
+```
+After executing the above command, PaddleX validates the dataset and collects basic information about it. Upon success, the log prints the message `Check dataset passed !`. The validation result file is saved in `./output/check_dataset_result.json`, and related outputs are saved in the `./output/check_dataset` directory under the current directory, including visualized sample images and a sample distribution histogram.
+
+<details><summary>👉 <b>Details of Validation Results (Click to Expand)</b></summary>
+
+<p>The specific content of the validation result file is as follows:</p>
+<pre><code class="language-bash">{
+  "done_flag": true,
+  "check_pass": true,
+  "attributes": {
+    "label_file": "../../dataset/textline_orientation_example_data/label.txt",
+    "num_classes": 2,
+    "train_samples": 1000,
+    "train_sample_paths": [
+      "check_dataset/demo_img/ILSVRC2012_val_00019234_4284.jpg",
+      "check_dataset/demo_img/lsvt_train_images_4655.jpg",
+      "check_dataset/demo_img/lsvt_train_images_60562.jpg",
+      "check_dataset/demo_img/lsvt_train_images_14013.jpg",
+      "check_dataset/demo_img/ILSVRC2012_val_00011156_12950.jpg",
+      "check_dataset/demo_img/ILSVRC2012_val_00016578_10192.jpg",
+      "check_dataset/demo_img/26920921_2341381071.jpg",
+      "check_dataset/demo_img/31979250_3394569384.jpg",
+      "check_dataset/demo_img/25959328_518853598.jpg",
+      "check_dataset/demo_img/ILSVRC2012_val_00018420_14077.jpg"
+    ],
+    "val_samples": 200,
+    "val_sample_paths": [
+      "check_dataset/demo_img/lsvt_train_images_79109.jpg",
+      "check_dataset/demo_img/lsvt_train_images_131133.jpg",
+      "check_dataset/demo_img/mtwi_train_images_65423.jpg",
+      "check_dataset/demo_img/lsvt_train_images_120718.jpg",
+      "check_dataset/demo_img/mtwi_train_images_58098.jpg",
+      "check_dataset/demo_img/rctw_train_images_25817.jpg",
+      "check_dataset/demo_img/lsvt_val_images_6336.jpg",
+      "check_dataset/demo_img/lsvt_train_images_71775.jpg",
+      "check_dataset/demo_img/mtwi_train_images_78064.jpg",
+      "check_dataset/demo_img/mtwi_train_images_52578.jpg"
+    ]
+  },
+  "analysis": {
+    "histogram": "check_dataset/histogram.png"
+  },
+  "dataset_path": "./dataset/textline_orientation_example_data",
+  "show_type": "image",
+  "dataset_type": "ClsDataset"
+}
+</code></pre>
+<p>In the above validation results, <code>check_pass</code> being <code>true</code> indicates that the dataset format meets the requirements. Explanations for other indicators are as follows:</p>
+<ul>
+<li><code>attributes.num_classes</code>: The number of classes in this dataset is 2;</li>
+<li><code>attributes.train_samples</code>: The number of training samples in this dataset is 1000;</li>
+<li><code>attributes.val_samples</code>: The number of validation samples in this dataset is 200;</li>
+<li><code>attributes.train_sample_paths</code>: The list of relative paths to the visualization images of the training samples in this dataset;</li>
+<li><code>attributes.val_sample_paths</code>: The list of relative paths to the visualization images of the validation samples in this dataset;</li>
+</ul>
+<p>The dataset validation also analyzes the distribution of sample numbers across all classes in the dataset and generates a distribution histogram (histogram.png):</p>
+<p><img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/modules/textline_ori_classification/01.png"></p></details>
+
+#### 4.1.3 Dataset Format Conversion / Dataset Splitting (Optional)
+After completing data validation, you can convert the dataset format and re-split the training/validation ratio of the dataset by <b>modifying the configuration file</b> or <b>appending hyperparameters</b>.
+
+<details><summary>👉 <b>Details on Format Conversion / Dataset Splitting (Click to Expand)</b></summary>
+
+<p><b>(1) Dataset Format Conversion</b></p>
+<p>Text line orientation classification does not currently support data format conversion.</p>
+<p><b>(2) Dataset Splitting</b></p>
+<p>Parameters for dataset splitting can be set by modifying the fields under <code>CheckDataset</code> in the configuration file. Examples of some parameters in the configuration file are as follows:</p>
+<ul>
+<li><code>CheckDataset</code>:</li>
+<li><code>split</code>:</li>
+<li><code>enable</code>: Whether to re-split the dataset. When set to <code>True</code>, the dataset is re-split; the default is <code>False</code>;</li>
+<li><code>train_percent</code>: If re-splitting the dataset, you need to set the percentage of the training set, which is any integer between 0 and 100, and must sum to 100 with the value of <code>val_percent</code>;</li>
+</ul>
+<p>For example, if you want to re-split the dataset with 90% for the training set and 10% for the validation set, modify the configuration file as follows:</p>
+<pre><code class="language-bash">......
+CheckDataset:
+  ......
+  split:
+    enable: True
+    train_percent: 90
+    val_percent: 10
+  ......
+</code></pre>
+<p>Then execute the command:</p>
+<pre><code class="language-bash">python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=check_dataset \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data
+</code></pre>
+<p>After the data splitting is executed, the original annotation files will be renamed to <code>xxx.bak</code> in the original path.</p>
+<p>The above parameters can also be set by appending command-line arguments:</p>
+<pre><code class="language-bash">python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=check_dataset \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data \
+    -o CheckDataset.split.enable=True \
+    -o CheckDataset.split.train_percent=90 \
+    -o CheckDataset.split.val_percent=10
+</code></pre></details>
+
+### 4.2 Model Training
+Model training can be completed with a single command. Here, the training of the text line orientation classification model (PP-LCNet_x0_25_textline_ori) is taken as an example:
+
+```bash
+python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=train \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data
+```
+The following steps are required:
+
+* Specify the path to the model's `.yaml` configuration file (here it is `PP-LCNet_x0_25_textline_ori.yaml`. When training other models, you need to specify the corresponding configuration file. The correspondence between models and configuration files can be found in the [PaddleX Model List (CPU/GPU)](../../../support_list/models_list.md))
+* Specify the mode as model training: `-o Global.mode=train`
+* Specify the path to the training dataset: `-o Global.dataset_dir`
+Other related parameters can be set by modifying the fields under `Global` and `Train` in the `.yaml` configuration file, or adjusted by appending arguments on the command line. For example, to train on the first two GPUs: `-o Global.device=gpu:0,1`; to set the number of training epochs to 10: `-o Train.epochs_iters=10`. For more modifiable parameters and their detailed explanations, refer to the configuration file description for the corresponding task module in the [PaddleX Common Model Configuration Parameters](../../instructions/config_parameters_common.md).
+
+<details><summary>👉 <b>More Details (Click to Expand)</b></summary>
+
+<ul>
+<li>During model training, PaddleX automatically saves the model weights, by default to the <code>output</code> directory. To specify a save path, set the <code>-o Global.output</code> field of the configuration file.</li>
+<li>PaddleX shields you from the concepts of dynamic graph weights and static graph weights. During model training, both dynamic and static graph weights are produced, and static graph weights are selected by default for model inference.</li>
+<li>
+<p>After model training completes, all outputs are saved in the specified output directory (default is <code>./output/</code>), typically including the following:</p>
+</li>
+<li>
+<p><code>train_result.json</code>: Training result record file, recording whether the training task completed normally, as well as the produced weight metrics, related file paths, etc.;</p>
+</li>
+<li><code>train.log</code>: Training log file, recording changes in model metrics, loss, etc., during training;</li>
+<li><code>config.yaml</code>: Training configuration file, recording the hyperparameter configuration of this training run;</li>
+<li><code>.pdparams</code>, <code>.pdema</code>, <code>.pdopt</code>, <code>.pdstates</code>, <code>.pdiparams</code>, <code>.pdmodel</code>: Model weight-related files, including network parameters, optimizer, EMA, static graph network parameters, static graph network structure, etc.;</li>
+</ul></details>
+
+### 4.3 Model Evaluation
+After completing model training, you can evaluate the specified model weights on the validation set to verify the model's accuracy. With PaddleX, model evaluation takes a single command:
+
+```bash
+python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=evaluate \
+    -o Global.dataset_dir=./dataset/textline_orientation_example_data
+```
+Similar to model training, the following steps are required:
+
+* Specify the path to the model's `.yaml` configuration file (here it is `PP-LCNet_x0_25_textline_ori.yaml`)
+* Specify the mode as model evaluation: `-o Global.mode=evaluate`
+* Specify the path to the validation dataset: `-o Global.dataset_dir`
+Other related parameters can be set by modifying the fields under `Global` and `Evaluate` in the `.yaml` configuration file. For details, refer to the [PaddleX Common Model Configuration Parameters](../../instructions/config_parameters_common.md).
+
+<details><summary>👉 <b>More Details (Click to Expand)</b></summary>
+
+<p>When evaluating the model, you need to specify the path to the model weights file. Each configuration file has a built-in default weight save path. To change it, simply append a command-line argument, such as <code>-o Evaluate.weight_path="./output/best_model/best_model.pdparams"</code>.</p>
+<p>After the evaluation completes, an <code>evaluate_result.json</code> file is produced, which records the evaluation results: whether the evaluation task was completed normally, and the model's evaluation metrics, including Top-1 Acc.</p></details>
+
+### 4.4 Model Inference and Model Integration
+After completing model training and evaluation, you can use the trained model weights for inference prediction or Python integration.
+
+#### 4.4.1 Model Inference
+Inference prediction via the command line takes just one command. Before running the following code, please download the [example image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg) locally.
+
+```bash
+python main.py -c paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml \
+    -o Global.mode=predict \
+    -o Predict.model_dir="./output/best_model/inference" \
+    -o Predict.input="textline_rot180_demo.jpg"
+```
+Similar to model training and evaluation, the following steps are required:
+
+* Specify the path to the model's `.yaml` configuration file (here it is `PP-LCNet_x0_25_textline_ori.yaml`)
+* Specify the mode as model inference prediction: `-o Global.mode=predict`
+* Specify the path to the model weights: `-o Predict.model_dir="./output/best_model/inference"`
+* Specify the path to the input data: `-o Predict.input="..."`
+Other related parameters can be set by modifying the fields under `Global` and `Predict` in the `.yaml` configuration file. For details, refer to the [PaddleX Common Model Configuration Parameters](../../instructions/config_parameters_common.md).
+
+#### 4.4.2 Model Integration
+The model can be directly integrated into the PaddleX pipeline or into your own project.
+
+1. <b>Pipeline Integration</b>
+
+The text line orientation classification module can be integrated into the [Document Scene Information Extraction v3 Pipeline (PP-ChatOCRv3)](../../../pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction.md). Simply replace the model path to update the text line orientation classification module.
+
+2. <b>Module Integration</b>
+
+The weights you produce can be directly integrated into the text line orientation classification module. You can refer to the Python example code in [Quick Integration](#iii-quick-integration), and only need to replace the model with the path to your trained model.

+ 41 - 0
paddlex/configs/textline_orientation/PP-LCNet_x0_25_textline_ori.yaml

@@ -0,0 +1,41 @@
+Global:
+  model: PP-LCNet_x0_25_textline_ori
+  mode: check_dataset # check_dataset/train/evaluate/predict
+  dataset_dir: "/paddle/dataset/paddlex/cls/textline_orientation_example_data"
+  device: gpu:0,1,2,3
+  output: "output"
+
+CheckDataset:
+  convert:
+    enable: False
+    src_dataset_type: null
+  split:
+    enable: False
+    train_percent: null
+    val_percent: null
+
+Train:
+  num_classes: 2
+  epochs_iters: 20
+  batch_size: 32
+  learning_rate: 0.8
+  pretrain_weight_path: https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x0_25_textline_ori_pretrained.pdparams
+  warmup_steps: 100
+  resume_path: null
+  log_interval: 10
+  eval_interval: 1
+  save_interval: 1
+
+Evaluate:
+  weight_path: "output/best_model/best_model.pdparams"
+  log_interval: 10
+
+Export:
+  weight_path: https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x0_25_textline_ori_pretrained.pdparams
+
+Predict:
+  batch_size: 1
+  model_dir: "output/best_model/inference"
+  input: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/img_textline180_demo.jpg"
+  kernel_option:
+    run_mode: paddle

+ 1 - 0
paddlex/inference/utils/official_models.py

@@ -32,6 +32,7 @@ OFFICIAL_MODELS = {
     "ResNet152_vd": "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0b2/ResNet152_vd_infer.tar",
     "ResNet200_vd": "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0b2/ResNet200_vd_infer.tar",
     "PP-LCNet_x0_25": "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0b2/PP-LCNet_x0_25_infer.tar",
+    "PP-LCNet_x0_25_textline_ori": "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0b2/PP-LCNet_x0_25_textline_ori_infer.tar",
     "PP-LCNet_x0_35": "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0b2/PP-LCNet_x0_35_infer.tar",
     "PP-LCNet_x0_5": "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0b2/PP-LCNet_x0_5_infer.tar",
     "PP-LCNet_x0_75": "https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0b2/PP-LCNet_x0_75_infer.tar",

+ 1 - 0
paddlex/modules/image_classification/model_list.py

@@ -56,6 +56,7 @@ MODELS = [
     "PP-HGNetV2-B5",
     "PP-HGNetV2-B6",
     "PP-LCNet_x0_25",
+    "PP-LCNet_x0_25_textline_ori",
     "PP-LCNet_x0_35",
     "PP-LCNet_x0_5",
     "PP-LCNet_x0_75",

+ 10 - 2
paddlex/repo_apis/PaddleClas_api/cls/register.py

@@ -874,7 +874,6 @@ register_model_info(
         "config_path": osp.join(PDX_CONFIG_DIR, "MobileFaceNet.yaml"),
         "supported_apis": ["train", "evaluate", "predict", "export", "infer"],
         "infer_config": "deploy/configs/inference_cls.yaml",
-        "hpi_config_path": None,
     }
 )
 
@@ -885,6 +884,15 @@ register_model_info(
         "config_path": osp.join(PDX_CONFIG_DIR, "ResNet50_face.yaml"),
         "supported_apis": ["train", "evaluate", "predict", "export", "infer"],
         "infer_config": "deploy/configs/inference_cls.yaml",
-        "hpi_config_path": None,
+    }
+)
+
+register_model_info(
+    {
+        "model_name": "PP-LCNet_x0_25_textline_ori",
+        "suite": "Cls",
+        "config_path": osp.join(PDX_CONFIG_DIR, "PP-LCNet_x0_25_textline_ori.yaml"),
+        "supported_apis": ["train", "evaluate", "predict", "export", "infer"],
+        "infer_config": "deploy/configs/inference_cls.yaml",
     }
 )

+ 144 - 0
paddlex/repo_apis/PaddleClas_api/configs/PP-LCNet_x0_25_textline_ori.yaml

@@ -0,0 +1,144 @@
+# global configs
+Global:
+  checkpoints: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  eval_during_train: True
+  start_eval_epoch: 1
+  eval_interval: 1
+  epochs: 20
+  print_batch_step: 10
+  use_visualdl: False
+  # used for static mode and model export
+  image_shape: [3, 80, 160]
+  save_inference_dir: ./inference
+  # training model under @to_static
+  to_static: False
+  use_dali: False
+
+# model architecture
+Arch:
+  name: PPLCNet_x0_25
+  class_num: 2
+  pretrained: True
+  use_ssld: True
+  use_last_conv: False
+  stride_list: [[1, 2], [2, 1], [2, 1], [2, 1], [2, 1]]
+ 
+# loss function config for training/eval process
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+  Eval:
+    - CELoss:
+        weight: 1.0
+
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  lr:
+    name: Cosine
+    learning_rate: 0.8
+    warmup_epoch: 5
+  regularizer:
+    name: 'L2'
+    coeff: 0.00004
+
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/textline_orientation/
+      cls_label_path: ./dataset/textline_orientation/train_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            size: [160, 80]
+        - TimmAutoAugment:
+            prob: 1.0
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: [160, 80]
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+        - RandomErasing:
+            EPSILON: 0.0
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 256
+      drop_last: False
+      shuffle: True
+    loader:
+      num_workers: 16
+      use_shared_memory: True
+
+  Eval:
+    dataset: 
+      name: ImageNetDataset
+      image_root: ./dataset/textline_orientation/
+      cls_label_path: ./dataset/textline_orientation/val_list.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - ResizeImage:
+            size: [160, 80]
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.485, 0.456, 0.406]
+            std: [0.229, 0.224, 0.225]
+            order: ''
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 32
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 8
+      use_shared_memory: True
+
+Infer:
+  infer_imgs: deploy/images/PULC/textline_orientation/textline_orientation_test_0_0.png
+  batch_size: 10
+  transforms:
+    - DecodeImage:
+        to_rgb: True
+        channel_first: False
+    - ResizeImage:
+        size: [160, 80]
+    - NormalizeImage:
+        scale: 1.0/255.0
+        mean: [0.485, 0.456, 0.406]
+        std: [0.229, 0.224, 0.225]
+        order: ''
+    - ToCHWImage:
+  PostProcess:
+    name: Topk
+    topk: 1
+    class_id_map_file: ppcls/utils/PULC_label_list/textline_orientation_label_list.txt
+
+Metric:
+  Train:
+    - TopkAcc:
+        topk: [1, 2]
+  Eval:
+    - TopkAcc:
+        topk: [1, 2]