
fix attr rec docs (#3300)

* fix attr rec docs

* fix cls en docs
zhangyubo0722 9 months ago
parent
commit
b68b74b049

+ 1 - 1
docs/module_usage/tutorials/cv_modules/image_classification.en.md

@@ -908,7 +908,7 @@ After executing the above command, PaddleX will validate the dataset and summari
   "analysis": {
     "histogram": "check_dataset/histogram.png"
   },
-  "dataset_path": "./dataset/cls_flowers_examples",
+  "dataset_path": "cls_flowers_examples",
   "show_type": "image",
   "dataset_type": "ClsDataset"
 }

+ 46 - 26
docs/module_usage/tutorials/cv_modules/pedestrian_attribute_recognition.en.md

@@ -55,61 +55,81 @@ After running, the obtained result is:
 {'res': {'input_path': 'pedestrian_attribute_006.jpg', 'page_index': None, 'class_ids': array([10, ..., 23]), 'scores': array([1.     , ..., 0.54777]), 'label_names': ['LongCoat(长外套)', 'Age18-60(年龄在18-60岁之间)', 'Trousers(长裤)', 'Front(面朝前)']}}
 ```
 
-运行结果参数含义如下:
-- `input_path`:表示输入待预测多类别图像的路径
-- `page_index`:如果输入是PDF文件,则表示当前是PDF的第几页,否则为 `None`
-- `class_ids`:表示行人属性图像的预测标签ID
-- `scores`:表示行人属性图像的预测标签置信度
-- `label_names`:表示行人属性图像的预测标签名称
+<b>Note</b>: The index of the `class_ids` value represents the following attributes: index 0 indicates whether a hat is worn, index 1 indicates whether glasses are worn, indexes 2-7 represent the style of the upper garment, indexes 8-13 represent the style of the lower garment, index 14 indicates whether boots are worn, indexes 15-17 represent the type of bag carried, index 18 indicates whether an object is held in front, indexes 19-21 represent age, index 22 represents gender, and indexes 23-25 represent orientation. Specifically, the attributes include the following types:
 
-可视化图片如下:
+```
+- Gender: Male, Female
+- Age: Under 18, 18-60, Over 60
+- Orientation: Front, Back, Side
+- Accessories: Glasses, Hat, None
+- Holding Object in Front: Yes, No
+- Bag: Backpack, Shoulder Bag, Handbag
+- Upper Garment Style: Striped, Logo, Plaid, Patchwork
+- Lower Garment Style: Striped, Patterned
+- Short-sleeved Shirt: Yes, No
+- Long-sleeved Shirt: Yes, No
+- Long Coat: Yes, No
+- Pants: Yes, No
+- Shorts: Yes, No
+- Skirt: Yes, No
+- Boots: Yes, No
+```
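The index ranges in the note above can be sketched as a small lookup table. This is an illustrative helper only; the group names and ranges are assumptions taken from the documentation text, not from PaddleX code:

```python
# Illustrative sketch of the class_id index ranges described in the note above.
# Group names and ranges come from the documentation text, not PaddleX source.
ATTRIBUTE_GROUPS = [
    ("hat", range(0, 1)),
    ("glasses", range(1, 2)),
    ("upper_garment_style", range(2, 8)),
    ("lower_garment_style", range(8, 14)),
    ("boots", range(14, 15)),
    ("bag", range(15, 18)),
    ("holding_object", range(18, 19)),
    ("age", range(19, 22)),
    ("gender", range(22, 23)),
    ("orientation", range(23, 26)),
]

def attribute_group(class_id: int) -> str:
    """Map a predicted class_id (0-25) to its attribute group."""
    for name, ids in ATTRIBUTE_GROUPS:
        if class_id in ids:
            return name
    raise ValueError(f"class_id {class_id} is outside the 0-25 range")
```

For example, the `class_ids` value `10` from the sample output above falls in the lower-garment range, and `23` in the orientation range.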
+
+The meanings of the parameters in the running result are as follows:
+- `input_path`: Indicates the path of the input multi-category image to be predicted.
+- `page_index`: If the input is a PDF file, it indicates which page of the PDF is currently being processed; otherwise, it is `None`.
+- `class_ids`: Indicates the predicted label IDs of the pedestrian attribute images.
+- `scores`: Indicates the confidence scores of the predicted labels of the pedestrian attribute images.
+- `label_names`: Indicates the names of the predicted labels of the pedestrian attribute images.
+
+The visualization image is as follows:
 
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/modules/ped_attri/pedestrian_attribute_006_res.jpg" alt="Pedestrian Attribute Result">
 
-相关方法、参数等说明如下:
+Relevant methods, parameters, and explanations are as follows:
 
-* `create_model`实例化行人属性识别模型(此处以`PP-LCNet_x1_0_pedestrian_attribute`为例),具体说明如下:
+* `create_model` instantiates the pedestrian attribute recognition model (here, `PP-LCNet_x1_0_pedestrian_attribute` is used as an example). The specific explanations are as follows:
 <table>
 <thead>
 <tr>
-<th>参数</th>
-<th>参数说明</th>
-<th>参数类型</th>
-<th>可选项</th>
-<th>默认值</th>
+<th>Parameter</th>
+<th>Parameter Description</th>
+<th>Parameter Type</th>
+<th>Options</th>
+<th>Default Value</th>
 </tr>
 </thead>
 <tr>
 <td><code>model_name</code></td>
-<td>模型名称</td>
+<td>The name of the model</td>
 <td><code>str</code></td>
-<td>无</td>
+<td>None</td>
 <td><code>PP-LCNet_x1_0_pedestrian_attribute</code></td>
 </tr>
 <tr>
 <td><code>model_dir</code></td>
-<td>模型存储路径</td>
+<td>The storage path of the model</td>
 <td><code>str</code></td>
-<td></td>
-<td></td>
+<td>None</td>
+<td>None</td>
 </tr>
 <tr>
 <td><code>threshold</code></td>
-<td>行人属性识别阈值</td>
+<td>The threshold for pedestrian attribute recognition</td>
 <td><code>float/list/dict</code></td>
-<td><li><b>float类型变量</b>,任意[0-1]之间浮点数:<code>0.5</code></li>
-<li><b>list类型变量</b>,由多个[0-1]之间浮点数组成的列表:<code>[0.5,0.5,...]</code></li>
-<li><b>dict类型变量</b>,指定不同类别使用不同的阈值,其中"default"为必须包含的键:<code>{"default":0.5,1:0.1,...}</code></li>
+<td><li><b>float variable</b>, any floating-point number between [0-1]: <code>0.5</code></li>
+<li><b>list variable</b>, a list composed of multiple floating-point numbers between [0-1]: <code>[0.5,0.5,...]</code></li>
+<li><b>dict variable</b>, specifying different thresholds for different categories, where "default" is a required key: <code>{"default":0.5,1:0.1,...}</code></li>
 </td>
 <td>0.5</td>
 </tr>
 </table>
 
-* 其中,`model_name` 必须指定,指定 `model_name` 后,默认使用 PaddleX 内置的模型参数,在此基础上,指定 `model_dir` 时,使用用户自定义的模型。
+* The `model_name` must be specified. After specifying `model_name`, PaddleX's built-in model parameters are used by default. If `model_dir` is specified, the user-defined model is used.
 
-* 其中,`threshold` 参数用于设置多标签分类的阈值,默认为0.7。当设置为浮点数时,表示所有类别均使用该阈值;当设置为列表时,表示不同类别使用不同的阈值,此时需保持列表长度与类别数量一致;当设置为字典时,`default` 为必须包含的键, 表示所有类别的默认阈值,其它类别使用各自的阈值。例如:{"default":0.5,1:0.1}。
+* The `threshold` parameter is used to set the threshold for multi-label classification, with a default value of 0.7. When set as a float, it means all categories use this threshold; when set as a list, different categories use different thresholds, and the list length must match the number of categories; when set as a dictionary, "default" is a required key, indicating the default threshold for all categories, while other categories use their respective thresholds. For example: <code>{"default":0.5,1:0.1}</code>.
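The three accepted `threshold` forms can be sketched as a small resolution function. This is illustrative only, not PaddleX internals:

```python
# Illustrative sketch (not PaddleX internals): resolving a float/list/dict
# threshold setting to one threshold per class, as described above.
def resolve_threshold(threshold, num_classes):
    if isinstance(threshold, float):
        # A single threshold shared by all classes.
        return [threshold] * num_classes
    if isinstance(threshold, list):
        # Per-class thresholds; length must match the number of classes.
        if len(threshold) != num_classes:
            raise ValueError("list length must equal the number of classes")
        return list(threshold)
    if isinstance(threshold, dict):
        # "default" is a required key; integer keys override single classes.
        default = threshold["default"]
        return [threshold.get(i, default) for i in range(num_classes)]
    raise TypeError("threshold must be a float, list, or dict")
```

With this reading, `{"default": 0.5, 1: 0.1}` gives class 1 a threshold of 0.1 and every other class 0.5.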
 
-* 调用多标签分类模型的 `predict()` 方法进行推理预测,`predict()` 方法参数有 `input` , `batch_size` 和  `threshold`,具体说明如下:
+* The `predict()` method of the multi-label classification model is called for inference prediction. The parameters of the `predict()` method include `input`, `batch_size`, and `threshold`, with specific explanations as follows:
 
 <table>
 <thead>

+ 21 - 7
docs/module_usage/tutorials/cv_modules/pedestrian_attribute_recognition.md

@@ -61,6 +61,26 @@ for res in output:
 - `scores`:表示行人属性图像的预测标签置信度
 - `label_names`:表示行人属性图像的预测标签名称
 
+<b>备注</b>:其中 `class_ids` 的值索引为0表示是否佩戴帽子,索引值为1表示是否佩戴眼镜,索引值2-7表示上衣风格,索引值8-13表示下装风格,索引值14表示是否穿靴子,索引值15-17表示背的包的类型,索引值18表示正面是否持物,索引值19-21表示年龄,索引值22表示性别,索引值23-25表示朝向。具体地,属性包含以下类型:
+
+```
+- 性别:男、女
+- 年龄:小于18、18-60、大于60
+- 朝向:朝前、朝后、侧面
+- 配饰:眼镜、帽子、无
+- 正面持物:是、否
+- 包:双肩包、单肩包、手提包
+- 上衣风格:带条纹、带logo、带格子、拼接风格
+- 下装风格:带条纹、带图案
+- 短袖上衣:是、否
+- 长袖上衣:是、否
+- 长外套:是、否
+- 长裤:是、否
+- 短裤:是、否
+- 短裙&裙子:是、否
+- 穿靴:是、否
+```
+
 可视化图片如下:
 
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/modules/ped_attri/pedestrian_attribute_006_res.jpg">
@@ -310,15 +330,9 @@ python main.py -c paddlex/configs/modules/pedestrian_attribute_recognition/PP-LC
   "analysis": {
     "histogram": "check_dataset/histogram.png"
   },
-<<<<<<< HEAD
-  "dataset_path": "./dataset/pedestrian_attribute_examples",
+  "dataset_path": "pedestrian_attribute_examples",
   "show_type": "image",
   "dataset_type": "MLClsDataset"
-=======
-  &quot;dataset_path&quot;: &quot;pedestrian_attribute_examples&quot;,
-  &quot;show_type&quot;: &quot;image&quot;,
-  &quot;dataset_type&quot;: &quot;MLClsDataset&quot;
->>>>>>> modify_pipeline_and_module_docs
 }
 </code></pre>
 <p>上述校验结果中,check_pass 为 true 表示数据集格式符合要求,其他部分指标的说明如下:</p>

+ 0 - 2
docs/module_usage/tutorials/cv_modules/vehicle_attribute_recognition.en.md

@@ -68,8 +68,6 @@ The visualization image is as follows:
 
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/modules/vehicle_attri/vehicle_attribute_007_res.jpg" alt="Vehicle Attribute Result">
 
-Please note that due to network issues, the above image link may not be successfully parsed. This issue might be related to the link itself or the network connection. If you need the content of this link, please check the validity of the link and try again. If the problem persists, you may need to access the link directly through a browser.
-
 Relevant methods, parameters, and explanations are as follows:
 
 * `create_model` instantiates the vehicle attribute recognition model (here, `PP-LCNet_x1_0_vehicle_attribute` is used as an example). The specific explanations are as follows:

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.en.md

@@ -113,7 +113,7 @@ Related methods, parameters, and other explanations are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
 <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.md

@@ -134,7 +134,7 @@ for res in output:
 <tr>
 <td><code>input</code></td>
 <td>待预测数据,支持多种输入类型</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
   <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.en.md

@@ -112,7 +112,7 @@ The explanations for the methods, parameters, etc., are as follows:
 <tr>
 <td><code>input</code></td>
 <td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
   <li><b>Python variable</b>, such as image data represented by <code>numpy.ndarray</code></li>

+ 1 - 1
docs/module_usage/tutorials/ocr_modules/textline_orientation_classification.md

@@ -112,7 +112,7 @@ for res in output:
 <tr>
 <td><code>input</code></td>
 <td>待预测数据,支持多种输入类型</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
 <td>
 <ul>
   <li><b>Python变量</b>,如<code>numpy.ndarray</code>表示的图像数据</li>

+ 11 - 4
docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.en.md

@@ -5,7 +5,7 @@ comments: true
 # General Image Multi-Label Classification Pipeline Tutorial
 
 ## 1. Introduction to the General Image Multi-Label Classification Pipeline
-Image multi-label classification is a technique that assigns multiple relevant categories to a single image simultaneously, widely used in image annotation, content recommendation, and social media analysis. It can identify multiple objects or features present in an image, for example, an image containing both "dog" and "outdoor" labels. By leveraging deep learning models, image multi-label classification automatically extracts image features and performs accurate classification, providing users with more comprehensive information. This technology is of great significance in applications such as intelligent search engines and automatic content generation.
+Image multi-label classification is a technique that assigns multiple relevant categories to a single image simultaneously, widely used in image annotation, content recommendation, and social media analysis. It can identify multiple objects or features present in an image, for example, an image containing both "dog" and "outdoor" labels. By leveraging deep learning models, image multi-label classification automatically extracts image features and performs accurate classification, providing users with more comprehensive information. This technology is of great significance in applications such as intelligent search engines and automatic content generation. This pipeline also offers a flexible service-oriented deployment approach, supporting invocation in multiple programming languages on various hardware platforms. Moreover, this pipeline supports secondary development: you can train and fine-tune models on your own dataset based on this pipeline, and the trained models can be seamlessly integrated.
 
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/image_multi_label_classification/01.png">
 
@@ -118,6 +118,12 @@ In the above Python script, the following steps are performed:
 <td>None</td>
 </tr>
 <tr>
+<td><code>config</code></td>
+<td>Specific configuration information for the pipeline (if set simultaneously with <code>pipeline</code>, it takes precedence over <code>pipeline</code>, and the pipeline name must be consistent with <code>pipeline</code>).</td>
+<td><code>dict[str, Any]</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
 <td><code>device</code></td>
 <td>Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu".</td>
 <td><code>str</code></td>
@@ -256,6 +262,7 @@ In the above Python script, the following steps are performed:
 - Calling the `print()` method will print the result to the terminal. The content printed to the terminal is explained as follows:
 
     - `input_path`: `(str)` Input path of the image to be predicted.
+    - `page_index`: `(Union[int, None])` If the input is a PDF file, it indicates the current page number of the PDF; otherwise, it is `None`.
     - `class_ids`: `(List[numpy.ndarray])` Indicates the class IDs of the prediction results.
     - `scores`: `(List[numpy.ndarray])` Indicates the confidence scores of the prediction results.
     - `label_names`: `(List[str])` Indicates the class names of the prediction results.
@@ -291,7 +298,7 @@ In addition, you can obtain the general image multi-label classification pipelin
 paddlex --get_pipeline_config image_multilabel_classification --save_path ./my_path
 ```
 
-If you have obtained the configuration file, you can customize the settings for the OCR production line by simply modifying the `pipeline` parameter value in the `create_pipeline` method to the path of the configuration file. An example is as follows:
+If you have obtained the configuration file, you can customize the settings for the image multi-label classification production line by simply modifying the `pipeline` parameter value in the `create_pipeline` method to the path of the configuration file. An example is as follows:
 
 ```python
 from paddlex import create_pipeline
@@ -908,7 +915,7 @@ SubModules:
   ImageMultiLabelClassification:
     module_name: image_multilabel_classification
     model_name: PP-HGNetV2-B6_ML
-    model_dir: null
+    model_dir: null # Modify this path to the local fine-tuned model weight file
     batch_size: 4
 ```
 
@@ -925,4 +932,4 @@ paddlex --pipeline image_multilabel_classification \
         --device npu:0
 ```
 
-If you want to use the general OCR pipeline on more types of hardware, please refer to the [PaddleX Multi-Hardware Usage Guide](../../../other_devices_support/multi_devices_use_guide.en.md).
+If you want to use the general image multi-label classification pipeline on more types of hardware, please refer to the [PaddleX Multi-Hardware Usage Guide](../../../other_devices_support/multi_devices_use_guide.en.md).

+ 12 - 5
docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.md

@@ -5,7 +5,7 @@ comments: true
 # 通用图像多标签分类产线使用教程
 
 ## 1. 通用图像多标签分类产线介绍
-图像多标签分类是一种将一张图像同时分配到多个相关类别的技术,广泛应用于图像标注、内容推荐和社交媒体分析等领域。它能够识别图像中存在的多个物体或特征,例如一张图片中同时包含“狗”和“户外”这两个标签。通过使用深度学习模型,图像多标签分类能够自动提取图像特征并进行准确分类,以便为用户提供更加全面的信息。这项技术在智能搜索引擎和自动内容生成等应用中具有重要意义。
+图像多标签分类是一种将一张图像同时分配到多个相关类别的技术,广泛应用于图像标注、内容推荐和社交媒体分析等领域。它能够识别图像中存在的多个物体或特征,例如一张图片中同时包含“狗”和“户外”这两个标签。通过使用深度学习模型,图像多标签分类能够自动提取图像特征并进行准确分类,以便为用户提供更加全面的信息。这项技术在智能搜索引擎和自动内容生成等应用中具有重要意义。本产线同时提供了灵活的服务化部署方式,支持在多种硬件上使用多种编程语言调用。不仅如此,本产线也提供了二次开发的能力,您可以基于本产线在您自己的数据集上训练调优,训练后的模型也可以无缝集成。
 
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/image_multi_label_classification/01.png">
 
@@ -117,6 +117,12 @@ for res in output:
 <td>None</td>
 </tr>
 <tr>
+<td><code>config</code></td>
+<td>产线具体的配置信息(如果和<code>pipeline</code>同时设置,优先级高于<code>pipeline</code>,且要求产线名和<code>pipeline</code>一致)。</td>
+<td><code>dict[str, Any]</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
 <td><code>device</code></td>
 <td>产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。</td>
 <td><code>str</code></td>
@@ -253,11 +259,12 @@ for res in output:
 - 调用`print()` 方法会将结果打印到终端,打印到终端的内容解释如下:
 
     - `input_path`: `(str)` 待预测图像的输入路径。
+    - `page_index`: `(Union[int, None])` 如果输入是PDF文件,则表示当前是PDF的第几页,否则为 `None`。
     - `class_ids`: `(List[numpy.ndarray])` 表示预测结果的类别id。
     - `scores`: `(List[numpy.ndarray])` 表示预测结果的置信度。
     - `label_names`: `(List[str])` 表示预测结果的类别名称。
 
-- 调用`save_to_json()` 方法会将上述内容保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}.json`,如果指定为文件,则直接保存到该文件中。由于json文件不支持保存numpy数组,因此会将其中的`numpy.array`类型转换为列表形式。
+- 调用`save_to_json()` 方法会将上述内容保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}_res.json`,如果指定为文件,则直接保存到该文件中。由于json文件不支持保存numpy数组,因此会将其中的`numpy.array`类型转换为列表形式。
 - 调用`save_to_img()` 方法会将可视化结果保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}_res.{your_img_extension}`,如果指定为文件,则直接保存到该文件中。(产线通常包含较多结果图片,不建议直接指定为具体的文件路径,否则多张图会被覆盖,仅保留最后一张图)
 
 * 此外,也支持通过属性获取带结果的可视化图像和预测结果,具体如下:
@@ -288,7 +295,7 @@ for res in output:
 paddlex --get_pipeline_config image_multilabel_classification --save_path ./my_path
 ```
 
-若您获取了配置文件,即可对OCR产线各项配置进行自定义,只需要修改 `create_pipeline` 方法中的 `pipeline` 参数值为产线配置文件路径即可。示例如下:
+若您获取了配置文件,即可对通用图像多标签分类产线各项配置进行自定义,只需要修改 `create_pipeline` 方法中的 `pipeline` 参数值为产线配置文件路径即可。示例如下:
 
 ```python
 from paddlex import create_pipeline
@@ -885,7 +892,7 @@ SubModules:
   ImageMultiLabelClassification:
     module_name: image_multilabel_classification
     model_name: PP-HGNetV2-B6_ML
-    model_dir: null
+    model_dir: null # 替换为微调后的多标签分类模型权重路径
     batch_size: 4
 ```
 随后, 参考本地体验中的命令行方式或 Python 脚本方式,加载修改后的产线配置文件即可。
@@ -903,4 +910,4 @@ paddlex --pipeline image_multilabel_classification \
 
 当然,您也可以在 Python 脚本中 `create_pipeline()` 时或者 `predict()` 时指定硬件设备。
 
-若您想在更多种类的硬件上使用通用OCR产线,请参考[PaddleX多硬件使用指南](../../../other_devices_support/multi_devices_use_guide.md)。
+若您想在更多种类的硬件上使用通用图像多标签分类产线,请参考[PaddleX多硬件使用指南](../../../other_devices_support/multi_devices_use_guide.md)。

+ 1 - 1
docs/pipeline_usage/tutorials/cv_pipelines/object_detection.en.md

@@ -551,7 +551,7 @@ In the above Python script, the following steps are executed:
         - `score`: The confidence score of the bounding box, a floating-point number.
         - `coordinate`: The coordinates of the bounding box, a list of floating-point numbers in the format <code>[xmin, ymin, xmax, ymax]</code>.
 
-- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If specified as a directory, the saved path will be `save_path/{your_img_basename}.json`. If specified as a file, it will be saved directly to that file. Since JSON files do not support saving NumPy arrays, any `numpy.array` type will be converted to a list format.
+- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If specified as a directory, the saved path will be `save_path/{your_img_basename}_res.json`. If specified as a file, it will be saved directly to that file. Since JSON files do not support saving NumPy arrays, any `numpy.array` type will be converted to a list format.
 - Calling the `save_to_img()` method will save the visualization results to the specified `save_path`. If specified as a directory, the saved path will be `save_path/{your_img_basename}_res.{your_img_extension}`. If specified as a file, it will be saved directly to that file. (The pipeline usually contains many result images, so it is not recommended to specify a specific file path directly, otherwise multiple images will be overwritten and only the last image will be retained.)
 
 * Additionally, it also supports obtaining the visualization image with results and prediction results through attributes, as follows:

+ 1 - 1
docs/pipeline_usage/tutorials/cv_pipelines/object_detection.md

@@ -566,7 +566,7 @@ for res in output:
         - `score`:目标框置信度,一个浮点数
         - `coordinate`:目标框坐标,一个浮点数列表,格式为<code>[xmin, ymin, xmax, ymax]</code>
 
-- 调用`save_to_json()` 方法会将上述内容保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}.json`,如果指定为文件,则直接保存到该文件中。由于json文件不支持保存numpy数组,因此会将其中的`numpy.array`类型转换为列表形式。
+- 调用`save_to_json()` 方法会将上述内容保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}_res.json`,如果指定为文件,则直接保存到该文件中。由于json文件不支持保存numpy数组,因此会将其中的`numpy.array`类型转换为列表形式。
 - 调用`save_to_img()` 方法会将可视化结果保存到指定的`save_path`中,如果指定为目录,则保存的路径为`save_path/{your_img_basename}_res.{your_img_extension}`,如果指定为文件,则直接保存到该文件中。(产线通常包含较多结果图片,不建议直接指定为具体的文件路径,否则多张图会被覆盖,仅保留最后一张图)
 
 * 此外,也支持通过属性获取带结果的可视化图像和预测结果,具体如下:

+ 3 - 3
docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.en.md

@@ -297,7 +297,7 @@ In the above Python script, the following steps are executed:
     - `cls_scores`: `(List[numpy.ndarray])` Indicates the confidence of the attribute prediction result.
     - `det_scores`: `(float)` Indicates the confidence of the pedestrian detection box.
 
-- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}.json`. If a file is specified, it will be saved directly to that file. Since JSON files do not support saving numpy arrays, the `numpy.array` type will be converted to a list format.
+- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}_res.json`. If a file is specified, it will be saved directly to that file. Since JSON files do not support saving numpy arrays, the `numpy.array` type will be converted to a list format.
 - Calling the `save_to_img()` method will save the visualization result to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}_res.{your_img_extension}`. If a file is specified, it will be saved directly to that file. (The production line usually contains many result images, so it is not recommended to specify a specific file path directly, otherwise multiple images will be overwritten, and only the last image will be retained.)
 
 * Additionally, it also supports obtaining visualized images with results and prediction results through attributes, as follows:
@@ -604,13 +604,13 @@ SubModules:
   Detection:
     module_name: object_detection
     model_name: PP-YOLOE-L_human
-    model_dir: null # Replace with the path to the fine-tuned image classification model weights
+    model_dir: null # Replace with the path to the fine-tuned pedestrian detection model weights
     batch_size: 1
     threshold: 0.5
   Classification:
     module_name: multilabel_classification
     model_name: PP-LCNet_x1_0_pedestrian_attribute
-    model_dir: null # Replace with the path to the fine-tuned image classification model weights
+    model_dir: null # Replace with the path to the fine-tuned pedestrian attribute recognition model weights
     batch_size: 1
     threshold: 0.7
 ```

+ 6 - 5
docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.en.md

@@ -272,13 +272,14 @@ In the above Python script, the following steps are executed:
 - When calling the <code>print()</code> method, the result will be printed to the terminal. The printed content is explained as follows:
 
     - `input_path`: `(str)` The input path of the image to be predicted.
+    - `page_index`: `(Union[int, None])` If the input is a PDF file, it indicates the current page number of the PDF; otherwise, it is `None`.
     - `boxes`: `(List[Dict])` The category IDs of the prediction results.
     - `labels`: `(List[str])` The category names of the prediction results.
     - `cls_scores`: `(List[numpy.ndarray])` The confidence scores of the attribute prediction results.
     - `det_scores`: `(float)` The confidence scores of the vehicle detection boxes.
 
-- When calling the <code>save_to_json()</code> method, the above content will be saved to the specified <code>save_path</code>. If a directory is specified, the saved path will be <code>save_path/{your_img_basename}.json</code>. If a file is specified, it will be saved directly to that file. Since JSON files do not support saving numpy arrays, the <code>numpy.array</code> type will be converted to a list format.
-- When calling the <code>save_to_img()</code> method, the visualization result will be saved to the specified <code>save_path</code>. If a directory is specified, the saved path will be <code>save_path/{your_img_basename}_res.{your_img_extension}</code>. If a file is specified, it will be saved directly to that file. (In production, there are usually many result images, so it is not recommended to specify a specific file path directly; otherwise, multiple images will be overwritten, and only the last image will be retained.)
+- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}_res.json`. If a file is specified, it will be saved directly to that file. Since JSON files do not support saving numpy arrays, the `numpy.array` type will be converted to a list format.
+- Calling the `save_to_img()` method will save the visualization result to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_img_basename}_res.{your_img_extension}`. If a file is specified, it will be saved directly to that file. (The production line usually contains many result images, so it is not recommended to specify a specific file path directly, otherwise multiple images will be overwritten, and only the last image will be retained.)
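The `{your_img_basename}_res` naming rule described above can be sketched as follows. This is an illustrative helper under the stated naming convention, not part of the PaddleX API:

```python
import os

# Illustrative sketch (not the PaddleX implementation): deriving the save
# locations described above when save_path is a directory.
def derive_save_paths(save_path: str, input_image: str) -> dict:
    base, ext = os.path.splitext(os.path.basename(input_image))
    return {
        "json": os.path.join(save_path, f"{base}_res.json"),
        "img": os.path.join(save_path, f"{base}_res{ext}"),
    }
```

For instance, an input of `vehicle_attribute_007.jpg` with `save_path="out"` would be saved as `out/vehicle_attribute_007_res.json` and `out/vehicle_attribute_007_res.jpg`.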
 
 * Additionally, it also supports obtaining visualized images with results and prediction results through attributes, as follows:
 
@@ -587,20 +588,20 @@ In addition, PaddleX also provides three other deployment methods, which are det
 
 Below are the API references for basic service-oriented deployment and multi-language service invocation examples:
 
-```bash
+```yaml
 pipeline_name: vehicle_attribute_recognition
 
 SubModules:
   Detection:
     module_name: object_detection
     model_name: PP-YOLOE-L_vehicle
-    model_dir: null
+    model_dir: null # Replace with the path to the fine-tuned vehicle detection model weights
     batch_size: 1
     threshold: 0.5
   Classification:
     module_name: multilabel_classification
     model_name: PP-LCNet_x1_0_vehicle_attribute
-    model_dir: null
+    model_dir: null # Replace with the path to the fine-tuned vehicle attribute recognition model weights
     batch_size: 1
     threshold: 0.7
 ```