
add formula visualization code (#2320)

* add formula visualization

* modify doc

* modify doc2
liuhongen1234567 1 year ago
parent
commit
aa5bd3d32f

+ 6 - 6
docs/module_usage/tutorials/ocr_modules/formula_recognition.md

@@ -132,14 +132,14 @@ python main.py -c paddlex/configs/formula_recognition/LaTeX_OCR_rec.yaml \
 
 **(1)数据集格式转换**
 
-公式识别支持 `PKL`格式的数据集转换为 `LaTeXOCRDataset`格式,数据集格式转换的参数可以通过修改配置文件中 `CheckDataset` 下的字段进行设置,配置文件中部分参数的示例说明如下:
+公式识别支持 `MSTextRecDataset`格式的数据集转换为 `LaTeXOCRDataset`格式(`PKL`格式),数据集格式转换的参数可以通过修改配置文件中 `CheckDataset` 下的字段进行设置,配置文件中部分参数的示例说明如下:
 
 * `CheckDataset`:
   * `convert`:
-    * `enable`: 是否进行数据集格式转换,公式识别支持 `PKL`格式的数据集转换为 `LaTeXOCRDataset`格式,默认为 `True`;
-    * `src_dataset_type`: 如果进行数据集格式转换,则需设置源数据集格式,默认为 `PKL`,可选值为 `PKL` 
+    * `enable`: 是否进行数据集格式转换,公式识别支持 `MSTextRecDataset`格式的数据集转换为 `LaTeXOCRDataset`格式,默认为 `True`;
+    * `src_dataset_type`: 如果进行数据集格式转换,则需设置源数据集格式,默认为 `MSTextRecDataset`
 
-例如,您想将 `PKL`格式的数据集转换为 `LaTeXOCRDataset`格式,则需将配置文件修改为:
+例如,您想将 `MSTextRecDataset`格式的数据集转换为 `LaTeXOCRDataset`格式,则需将配置文件修改为:
 
 ```bash
 ......
@@ -147,7 +147,7 @@ CheckDataset:
   ......
   convert:
     enable: True
-    src_dataset_type: PKL
+    src_dataset_type: MSTextRecDataset
   ......
 ```
 随后执行命令:
@@ -166,7 +166,7 @@ python main.py -c  paddlex/configs/formula_recognition/LaTeX_OCR_rec.yaml \
     -o Global.mode=check_dataset \
     -o Global.dataset_dir=./dataset/ocr_rec_latexocr_dataset_example \
     -o CheckDataset.convert.enable=True \
-    -o CheckDataset.convert.src_dataset_type=PKL
+    -o CheckDataset.convert.src_dataset_type=MSTextRecDataset
 ```
 **(2)数据集划分**
 
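> Editor's note: for readers who drive PaddleX from scripts, the conversion step documented in the hunk above can also be set up from Python instead of editing the YAML by hand. The sketch below is illustrative only: it assumes PyYAML is available and simply reproduces the `check_dataset` command shown in the tutorial, with the config and dataset paths taken from this diff.

```python
# Minimal sketch (not part of the PaddleX API): enable the
# MSTextRecDataset -> LaTeXOCRDataset conversion in the config, then run the
# same check_dataset command that the tutorial shows.
import subprocess
import yaml

cfg_path = "paddlex/configs/formula_recognition/LaTeX_OCR_rec.yaml"
with open(cfg_path, "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

cfg["CheckDataset"]["convert"]["enable"] = True
cfg["CheckDataset"]["convert"]["src_dataset_type"] = "MSTextRecDataset"

with open(cfg_path, "w", encoding="utf-8") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)

subprocess.check_call([
    "python", "main.py",
    "-c", cfg_path,
    "-o", "Global.mode=check_dataset",
    "-o", "Global.dataset_dir=./dataset/ocr_rec_latexocr_dataset_example",
])
```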

+ 11 - 11
docs/module_usage/tutorials/ocr_modules/formula_recognition_en.md

@@ -25,7 +25,7 @@ The formula recognition module is a crucial component of OCR (Optical Character
     <td>89.7 M</td>
     <td>LaTeX-OCR is a formula recognition algorithm based on an autoregressive large model. By adopting Hybrid ViT as the backbone network and transformer as the decoder, it significantly improves the accuracy of formula recognition.</td>
   </tr>
-  
+
 </table>
 
 **Note: The above accuracy metrics are measured on the LaTeX-OCR formula recognition test set.**
@@ -132,22 +132,22 @@ After completing the data verification, you can convert the dataset format and r
 
 **(1) Dataset Format Conversion**
 
-The formula recognition supports converting `PKL` format datasets to `LaTeXOCRDataset` format. The parameters for dataset format conversion can be set by modifying the fields under `CheckDataset` in the configuration file. Examples of some parameters in the configuration file are as follows:
+Formula recognition supports converting `MSTextRecDataset` format datasets to the `LaTeXOCRDataset` format (`PKL` format). The parameters for dataset format conversion can be set by modifying the fields under `CheckDataset` in the configuration file. Examples of some parameters in the configuration file are as follows:
 
 * `CheckDataset`:
   * `convert`:
-    * `enable`: Whether to perform dataset format conversion. Formula recognition supports converting `PKL` format datasets to `LaTeXOCRDataset` format, default is `True`;
-    * `src_dataset_type`: If dataset format conversion is performed, the source dataset format needs to be set, default is `PKL`, optional value is `PKL`;
+    * `enable`: Whether to perform dataset format conversion. Formula recognition supports converting `MSTextRecDataset` format datasets to `LaTeXOCRDataset` format, default is `True`;
+    * `src_dataset_type`: If dataset format conversion is performed, the source dataset format needs to be set, default is `MSTextRecDataset`;
 
-For example, if you want to convert a `PKL` format dataset to `LaTeXOCRDataset` format, you need to modify the configuration file as follows:
+For example, if you want to convert an `MSTextRecDataset` format dataset to `LaTeXOCRDataset` format, you need to modify the configuration file as follows:
 
 ```bash
 ......
 CheckDataset:
   ......
-  convert: 
+  convert:
     enable: True
-    src_dataset_type: PKL
+    src_dataset_type: MSTextRecDataset
   ......
 ```
 Then execute the command:
@@ -166,7 +166,7 @@ python main.py -c  paddlex/configs/formula_recognition/LaTeX_OCR_rec.yaml \
     -o Global.mode=check_dataset \
     -o Global.dataset_dir=./dataset/ocr_rec_latexocr_dataset_example \
     -o CheckDataset.convert.enable=True \
-    -o CheckDataset.convert.src_dataset_type=PKL
+    -o CheckDataset.convert.src_dataset_type=MSTextRecDataset
 ```
 **(2) Dataset Splitting**
 
@@ -222,7 +222,7 @@ The following steps are required:
 
 * Specify the `.yaml` configuration file path for the model (here it is `LaTeX_OCR_rec.yaml`,When training other models, you need to specify the corresponding configuration files. The relationship between the model and configuration files can be found in the [PaddleX Model List (CPU/GPU)](../../../support_list/models_list_en.md))
 * Set the mode to model training: `-o Global.mode=train`
-* Specify the path to the training dataset: `-o Global.dataset_dir`. 
+* Specify the path to the training dataset: `-o Global.dataset_dir`.
 Other related parameters can be set by modifying the `Global` and `Train` fields in the `.yaml` configuration file, or adjusted by appending parameters in the command line. For example, to specify training on the first two GPUs: `-o Global.device=gpu:0,1`; to set the number of training epochs to 10: `-o Train.epochs_iters=10`. For more modifiable parameters and their detailed explanations, refer to the configuration file instructions for the corresponding task module of the model [PaddleX Common Configuration File Parameters](../../instructions/config_parameters_common_en.md).
 
 <details>
@@ -251,7 +251,7 @@ Similar to model training, the following steps are required:
 
 * Specify the `.yaml` configuration file path for the model (here it is `LaTeX_OCR_rec.yaml`)
 * Set the mode to model evaluation: `-o Global.mode=evaluate`
-* Specify the path to the validation dataset: `-o Global.dataset_dir`. 
+* Specify the path to the validation dataset: `-o Global.dataset_dir`.
 Other related parameters can be set by modifying the `Global` and `Evaluate` fields in the `.yaml` configuration file, detailed instructions can be found in [PaddleX Common Configuration File Parameters](../../instructions/config_parameters_common_en.md).
 
 <details>
@@ -282,7 +282,7 @@ Similar to model training and evaluation, the following steps are required:
 * Specify the `.yaml` configuration file path for the model (here it is `LaTeX_OCR_rec.yaml`)
 * Set the mode to model inference prediction: `-o Global.mode=predict`
 * Specify the model weights path: `-o Predict.model_dir="./output/best_accuracy/inference"`
-* Specify the input data path: `-o Predict.input="..."`. 
+* Specify the input data path: `-o Predict.input="..."`.
 Other related parameters can be set by modifying the `Global` and `Predict` fields in the `.yaml` configuration file. For details, please refer to [PaddleX Common Model Configuration File Parameter Description](../../instructions/config_parameters_common_en.md).
 
 
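> Editor's note: the hunks above only strip trailing whitespace, but they land in the train/evaluate/predict walkthrough, so a compact end-to-end sketch may help. It is illustrative only and chains the documented CLI modes via `subprocess`; the weights directory and the prediction input are placeholders.

```python
# Illustrative sketch of the documented train -> evaluate -> predict loop.
# The weights path and input image below are placeholders for your own run.
import subprocess

CFG = "paddlex/configs/formula_recognition/LaTeX_OCR_rec.yaml"
DATASET = "./dataset/ocr_rec_latexocr_dataset_example"

def run(mode, *extra):
    """Run main.py in the given Global.mode with optional extra -o overrides."""
    subprocess.check_call([
        "python", "main.py", "-c", CFG,
        "-o", f"Global.mode={mode}",
        "-o", f"Global.dataset_dir={DATASET}",
        *extra,
    ])

run("train", "-o", "Train.epochs_iters=10")  # short run, as in the tutorial example
run("evaluate")
run("predict",
    "-o", "Predict.model_dir=./output/best_accuracy/inference",
    "-o", "Predict.input=your_formula_image.png")  # placeholder input path
```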

+ 22 - 2
docs/pipeline_usage/tutorials/ocr_pipelines/formula_recognition.md

@@ -106,7 +106,7 @@ paddlex --pipeline ./formula_recognition.yaml --input general_formula_recognitio
 
 ![](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/formula_recognition/02.jpg)
 
-可视化图片默认不进行保存,您可以通过 `--save_path` 自定义保存路径,随后所有结果将被保存在指定路径下。此外,您可以通过网站 [https://www.lddgo.net/math/latex-to-image](https://www.lddgo.net/math/latex-to-image) 对识别出来的LaTeX代码进行可视化
+可视化图片默认不进行保存,您可以通过 `--save_path` 自定义保存路径,随后所有结果将被保存在指定路径下。公式识别可视化需要单独配置环境,请您参考[2.3 公式识别产线可视化](#23-公式识别产线可视化) 对LaTeX渲染引擎进行安装
 
 ### 2.2 Python脚本方式集成
 几行代码即可完成产线的快速推理,以公式识别产线为例:
@@ -119,7 +119,6 @@ pipeline = create_pipeline(pipeline="formula_recognition")
 output = pipeline.predict("general_formula_recognition.png")
 for res in output:
     res.print()
-    res.save_to_img("./output/")
 ```
 
 > ❗ Python脚本运行得到的结果与命令行方式相同。
@@ -165,8 +164,29 @@ pipeline = create_pipeline(pipeline="./my_path/formula_recognition.yaml")
 output = pipeline.predict("general_formula_recognition.png")
 for res in output:
     res.print()
+```
+### 2.3 公式识别产线可视化
+如果您需要对公式识别产线进行可视化,需要运行如下命令来对LaTeX渲染环境进行安装:
+```bash
+apt-get install sudo
+sudo apt-get update
+sudo apt-get install texlive
+sudo apt-get install texlive-latex-base
+sudo apt-get install texlive-latex-extra
+```
+之后,使用 `save_to_img` 方法对可视化图片进行保存。具体命令如下:
+```python
+from paddlex import create_pipeline
+
+pipeline = create_pipeline(pipeline="formula_recognition")
+
+output = pipeline.predict("general_formula_recognition.png")
+for res in output:
+    res.print()
     res.save_to_img("./output/")
 ```
+**备注**: 由于公式识别可视化过程中需要对每张公式图片进行渲染,因此耗时较长,请您耐心等待。
+
 ## 3. 开发集成/部署
 如果公式识别产线可以达到您对产线推理速度和精度的要求,您可以直接进行开发集成/部署。
 

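> Editor's note: the new section 2.3 ultimately shells out to `pdflatex` (see the rendering code added in `paddlex/inference/results/formula_rec.py` below), so a small guard before calling `save_to_img` makes the failure mode explicit. This is an illustrative sketch using only the public pipeline API shown in the tutorial:

```python
# Illustrative sketch: skip visualization when the LaTeX toolchain is missing.
import shutil
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="formula_recognition")
can_render = shutil.which("pdflatex") is not None
if not can_render:
    print("pdflatex not found; install texlive as described in section 2.3")

for res in pipeline.predict("general_formula_recognition.png"):
    res.print()
    if can_render:
        res.save_to_img("./output/")  # renders every formula, so this can be slow
```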
+ 22 - 2
docs/pipeline_usage/tutorials/ocr_pipelines/formula_recognition_en.md

@@ -105,7 +105,7 @@ Where `dt_polys` represents the coordinates of the detected formula area, and `r
 The visualization result is as follows:
 ![](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/formula_recognition/02.jpg)
 
-The visualized image not saved by default. You can customize the save path through `--save_path`, and then all results will be saved in the specified path. Additionally, you can visualize the recognized LaTeX code through the website [https://www.lddgo.net/math/latex-to-image](https://www.lddgo.net/math/latex-to-image).
+The visualized image is not saved by default. You can customize the save path through `--save_path`, and then all results will be saved in the specified path. Formula recognition visualization requires a separate environment configuration. Please refer to [2.3 Formula Recognition Pipeline Visualization](#23-formula-recognition-pipeline-visualization) to install the LaTeX rendering engine.
 
 
 #### 2.2 Python Script Integration
@@ -119,7 +119,6 @@ pipeline = create_pipeline(pipeline="formula_recognition")
 output = pipeline.predict("general_formula_recognition.png")
 for res in output:
     res.print()
-    res.save_to_img("./output/")
 ```
 > ❗ The results obtained from running the Python script are the same as those from the command line.
 
@@ -164,8 +163,29 @@ pipeline = create_pipeline(pipeline="./my_path/formula_recognition.yaml")
 output = pipeline.predict("general_formula_recognition.png")
 for res in output:
     res.print()
+```
+
+### 2.3 Formula Recognition Pipeline Visualization
+To visualize the formula recognition pipeline results, run the following commands to install the LaTeX rendering environment:
+```bash
+apt-get install sudo
+sudo apt-get update
+sudo apt-get install texlive
+sudo apt-get install texlive-latex-base
+sudo apt-get install texlive-latex-extra
+```
+After that, use the `save_to_img` method to save the visualization image. A complete script is shown below:
+```python
+from paddlex import create_pipeline
+
+pipeline = create_pipeline(pipeline="formula_recognition")
+
+output = pipeline.predict("general_formula_recognition.png")
+for res in output:
+    res.print()
     res.save_to_img("./output/")
 ```
+**Note**: Since the formula recognition visualization process requires rendering each formula image, it may take a relatively long time. Please be patient.
 
 ## 3. Development Integration/Deployment
 If the formula recognition pipeline meets your requirements for inference speed and accuracy, you can proceed directly with development integration/deployment.

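> Editor's note: since the note above warns that visualization renders every formula and can take a while, a quick way to quantify the cost is to time each `save_to_img` call. This is purely illustrative and uses only the public API from the tutorial:

```python
# Illustrative sketch: time the LaTeX-based visualization per input image.
import time
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="formula_recognition")
for res in pipeline.predict("general_formula_recognition.png"):
    res.print()
    start = time.perf_counter()
    res.save_to_img("./output/")
    print(f"visualization took {time.perf_counter() - start:.1f}s")
```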
+ 3 - 3
paddlex/configs/formula_recognition/LaTeX_OCR_rec.yaml

@@ -2,13 +2,13 @@ Global:
   model: LaTeX_OCR_rec
   mode: check_dataset # check_dataset/train/evaluate/predict
   dataset_dir: "./dataset/ocr_rec_latexocr_dataset_example"
-  device: gpu:0
+  device: gpu:0,1,2,3
   output: "output"
 
 CheckDataset:
   convert: 
     enable: True
-    src_dataset_type: PKL
+    src_dataset_type: MSTextRecDataset
   split: 
     enable: False
     train_percent: null
@@ -16,7 +16,7 @@ CheckDataset:
 
 Train:
   epochs_iters: 20
-  batch_size_train: 40
+  batch_size_train: 30
   batch_size_val: 10
   learning_rate: 0.0001
   pretrain_weight_path: null

+ 2 - 2
paddlex/inference/pipelines/formula_recognition.py

@@ -14,7 +14,7 @@
 
 import numpy as np
 from ..components import CropByBoxes
-from ..results import FormulaRecResult
+from ..results import FormulaResult
 from .base import BasePipeline
 from ...utils import logging
 
@@ -89,7 +89,7 @@ class FormulaRecognitionPipeline(BasePipeline):
                         single_img_res["rec_formula"].append(
                             str(formula_res["rec_text"])
                         )
-            yield FormulaRecResult(single_img_res)
+            yield FormulaResult(single_img_res)
 
     def sorted_formula_box(self, x):
         coordinate = x["coordinate"]

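> Editor's note: with this change the pipeline yields the new `FormulaResult` (detected boxes plus recognized LaTeX) instead of the module-level `FormulaRecResult`. A minimal consumer, assuming the result objects expose the dict keys used in this diff (`dt_polys` and `rec_formula`), might look like:

```python
# Illustrative sketch: read boxes and LaTeX strings from the pipeline results.
# Assumes each FormulaResult behaves like the dict used in formula_rec.py,
# with "dt_polys" (formula boxes) and "rec_formula" (LaTeX strings).
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="formula_recognition")
for res in pipeline.predict("general_formula_recognition.png"):
    for box, latex in zip(res["dt_polys"], res["rec_formula"]):
        print(box, latex)
```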
+ 1 - 1
paddlex/inference/results/__init__.py

@@ -21,7 +21,7 @@ from .seal_rec import SealOCRResult
 from .ocr import OCRResult
 from .det import DetResult
 from .seg import SegResult
-from .formula_rec import FormulaRecResult
+from .formula_rec import FormulaRecResult, FormulaResult
 from .instance_seg import InstanceSegResult
 from .ts import TSFcResult, TSAdResult, TSClsResult
 from .warp import DocTrResult

+ 265 - 1
paddlex/inference/results/formula_rec.py

@@ -12,15 +12,65 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+import os
+import sys
+import cv2
+import math
 import random
+import tempfile
+import subprocess
 import numpy as np
 from PIL import Image, ImageDraw
 
 from .base import CVResult
 from ...utils import logging
+from .ocr import draw_box_txt_fine
+from ...utils.fonts import PINGFANG_FONT_FILE_PATH
 
 
 class FormulaRecResult(CVResult):
+    def _to_str(self, *args, **kwargs):
+        return super()._to_str(*args, **kwargs).replace("\\\\", "\\")
+
+    def _to_img(
+        self,
+    ):
+        """Draw formula on image"""
+        image = self._img_reader.read(self["input_path"])
+        rec_formula = str(self["rec_text"])
+        image = np.array(image.convert("RGB"))
+        xywh = crop_white_area(image)
+        if xywh is not None:
+            x, y, w, h = xywh
+            image = image[y : y + h, x : x + w]
+        image = Image.fromarray(image)
+        image_width, image_height = image.size
+        box = [[0, 0], [image_width, 0], [image_width, image_height], [0, image_height]]
+        try:
+            img_formula = draw_formula_module(
+                image.size, box, rec_formula, is_debug=False
+            )
+            img_formula = Image.fromarray(img_formula)
+            render_width, render_height = img_formula.size
+            resize_height = render_height
+            resize_width = int(resize_height * image_width / image_height)
+            image = image.resize((resize_width, resize_height), Image.LANCZOS)
+
+            new_image_width = image.width + int(render_width) + 10
+            new_image = Image.new(
+                "RGB", (new_image_width, render_height), (255, 255, 255)
+            )
+            new_image.paste(image, (0, 0))
+            new_image.paste(img_formula, (image.width + 10, 0))
+            return new_image
+        except subprocess.CalledProcessError as e:
+            logging.warning(
+                "Please refer to 2.3 Formula Recognition Pipeline Visualization in Formula Recognition Pipeline Tutorial to install the LaTeX rendering engine at first."
+            )
+            return None
+
+
+class FormulaResult(CVResult):
     _HARD_FLAG = False
 
     def _to_str(self, *args, **kwargs):
@@ -30,5 +80,219 @@ class FormulaRecResult(CVResult):
         self,
     ):
         """draw formula result"""
-        logging.warning("FormulaRecResult don't support save to img!")
+
+        boxes = self["dt_polys"]
+        formulas = self["rec_formula"]
+        image = self._img_reader.read(self["input_path"])
+        if self._HARD_FLAG:
+            image_np = np.array(image)
+            image = Image.fromarray(image_np[:, :, ::-1])
+        h, w = image.height, image.width
+        img_left = image.copy()
+        img_right = np.ones((h, w, 3), dtype=np.uint8) * 255
+        random.seed(0)
+        draw_left = ImageDraw.Draw(img_left)
+
+        if formulas is None or len(formulas) != len(boxes):
+            formulas = [None] * len(boxes)
+        for idx, (box, formula) in enumerate(zip(boxes, formulas)):
+            try:
+                color = (
+                    random.randint(0, 255),
+                    random.randint(0, 255),
+                    random.randint(0, 255),
+                )
+                box = np.array(box)
+                pts = [(x, y) for x, y in box.tolist()]
+                draw_left.polygon(pts, outline=color, width=8)
+                draw_left.polygon(box, fill=color)
+                img_right_text = draw_box_formula_fine(
+                    (w, h),
+                    box,
+                    formula,
+                    is_debug=False,
+                )
+                pts = np.array(box, np.int32).reshape((-1, 1, 2))
+                cv2.polylines(img_right_text, [pts], True, color, 1)
+                img_right = cv2.bitwise_and(img_right, img_right_text)
+            except subprocess.CalledProcessError as e:
+                logging.warning(
+                    "Please refer to 2.3 Formula Recognition Pipeline Visualization in Formula Recognition Pipeline Tutorial to install the LaTeX rendering engine at first."
+                )
+                return None
+
+        img_left = Image.blend(image, img_left, 0.5)
+        img_show = Image.new("RGB", (int(w * 2), h), (255, 255, 255))
+        img_show.paste(img_left, (0, 0, w, h))
+        img_show.paste(Image.fromarray(img_right), (w, 0, w * 2, h))
+        return img_show
+
+
+def get_align_equation(equation):
+    is_align = False
+    equation = str(equation) + "\n"
+    begin_dict = [
+        r"begin{align}",
+        r"begin{align*}",
+    ]
+    for begin_sym in begin_dict:
+        if begin_sym in equation:
+            is_align = True
+            break
+    if not is_align:
+        equation = (
+            r"\begin{equation}"
+            + "\n"
+            + equation.strip()
+            + r"\nonumber"
+            + "\n"
+            + r"\end{equation}"
+            + "\n"
+        )
+    return equation
+
+
+def generate_tex_file(tex_file_path, equation):
+    with open(tex_file_path, "w") as fp:
+        start_template = (
+            r"\documentclass{article}" + "\n"
+            r"\usepackage{cite}" + "\n"
+            r"\usepackage{amsmath,amssymb,amsfonts}" + "\n"
+            r"\usepackage{graphicx}" + "\n"
+            r"\usepackage{textcomp}" + "\n"
+            r"\DeclareMathSizes{14}{14}{9.8}{7}" + "\n"
+            r"\pagestyle{empty}" + "\n"
+            r"\begin{document}" + "\n"
+            r"\begin{large}" + "\n"
+        )
+        fp.write(start_template)
+        equation = get_align_equation(equation)
+        fp.write(equation)
+        end_template = r"\end{large}" + "\n" r"\end{document}" + "\n"
+        fp.write(end_template)
+
+
+def generate_pdf_file(tex_path, pdf_dir, is_debug=False):
+    if os.path.exists(tex_path):
+        command = "pdflatex -halt-on-error -output-directory={} {}".format(
+            pdf_dir, tex_path
+        )
+        if is_debug:
+            subprocess.check_call(command, shell=True)
+        else:
+            devNull = open(os.devnull, "w")
+            subprocess.check_call(
+                command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True
+            )
+
+
+def crop_white_area(image):
+    image = np.array(image).astype("uint8")
+    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+    _, thresh = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)
+    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+    if len(contours) > 0:
+        x, y, w, h = cv2.boundingRect(np.concatenate(contours))
+        return [x, y, w, h]
+    else:
         return None
+
+
+def pdf2img(pdf_path, img_path, is_padding=False):
+    import fitz
+
+    pdfDoc = fitz.open(pdf_path)
+    if pdfDoc.page_count != 1:
+        return None
+    for pg in range(pdfDoc.page_count):
+        page = pdfDoc[pg]
+        rotate = int(0)
+        zoom_x = 2
+        zoom_y = 2
+        mat = fitz.Matrix(zoom_x, zoom_y).prerotate(rotate)
+        pix = page.get_pixmap(matrix=mat, alpha=False)
+        if not os.path.exists(os.path.dirname(img_path)):
+            os.makedirs(os.path.dirname(img_path))
+
+        pix._writeIMG(img_path, 7, 100)
+        img = cv2.imread(img_path)
+        xywh = crop_white_area(img)
+
+        if xywh is not None:
+            x, y, w, h = xywh
+            img = img[y : y + h, x : x + w]
+            if is_padding:
+                img = cv2.copyMakeBorder(
+                    img, 30, 30, 30, 30, cv2.BORDER_CONSTANT, value=(255, 255, 255)
+                )
+            return img
+    return None
+
+
+def draw_formula_module(img_size, box, formula, is_debug=False):
+    """draw box formula for module"""
+    box_width, box_height = img_size
+    with tempfile.TemporaryDirectory() as td:
+        tex_file_path = os.path.join(td, "temp.tex")
+        pdf_file_path = os.path.join(td, "temp.pdf")
+        img_file_path = os.path.join(td, "temp.jpg")
+        generate_tex_file(tex_file_path, formula)
+        if os.path.exists(tex_file_path):
+            generate_pdf_file(tex_file_path, td, is_debug)
+        formula_img = None
+        if os.path.exists(pdf_file_path):
+            formula_img = pdf2img(pdf_file_path, img_file_path, is_padding=False)
+        if formula_img is not None:
+            return formula_img
+        else:
+            img_right_text = draw_box_txt_fine(
+                img_size, box, "Rendering Failed", PINGFANG_FONT_FILE_PATH
+            )
+        return img_right_text
+
+
+def draw_box_formula_fine(img_size, box, formula, is_debug=False):
+    """draw box formula for pipeline"""
+    box_height = int(
+        math.sqrt((box[0][0] - box[3][0]) ** 2 + (box[0][1] - box[3][1]) ** 2)
+    )
+    box_width = int(
+        math.sqrt((box[0][0] - box[1][0]) ** 2 + (box[0][1] - box[1][1]) ** 2)
+    )
+    with tempfile.TemporaryDirectory() as td:
+        tex_file_path = os.path.join(td, "temp.tex")
+        pdf_file_path = os.path.join(td, "temp.pdf")
+        img_file_path = os.path.join(td, "temp.jpg")
+        generate_tex_file(tex_file_path, formula)
+        if os.path.exists(tex_file_path):
+            generate_pdf_file(tex_file_path, td, is_debug)
+        formula_img = None
+        if os.path.exists(pdf_file_path):
+            formula_img = pdf2img(pdf_file_path, img_file_path, is_padding=False)
+        if formula_img is not None:
+            formula_h, formula_w = formula_img.shape[:-1]
+            resize_height = box_height
+            resize_width = formula_w * resize_height / formula_h
+            formula_img = cv2.resize(
+                formula_img, (int(resize_width), int(resize_height))
+            )
+            formula_h, formula_w = formula_img.shape[:-1]
+            pts1 = np.float32(
+                [[0, 0], [box_width, 0], [box_width, box_height], [0, box_height]]
+            )
+            pts2 = np.array(box, dtype=np.float32)
+            M = cv2.getPerspectiveTransform(pts1, pts2)
+            formula_img = np.array(formula_img, dtype=np.uint8)
+            img_right_text = cv2.warpPerspective(
+                formula_img,
+                M,
+                img_size,
+                flags=cv2.INTER_NEAREST,
+                borderMode=cv2.BORDER_CONSTANT,
+                borderValue=(255, 255, 255),
+            )
+        else:
+            img_right_text = draw_box_txt_fine(
+                img_size, box, "Rendering Failed", PINGFANG_FONT_FILE_PATH
+            )
+        return img_right_text

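> Editor's note: the bulk of this commit is the LaTeX rendering path above (`generate_tex_file` → `generate_pdf_file` → `pdf2img`). To sanity-check that toolchain in isolation, a minimal sketch such as the following can be used; it assumes texlive (`pdflatex`) and PyMuPDF are installed and that the helpers keep the signatures shown in this diff.

```python
# Minimal sketch (not a supported API): render one LaTeX formula through the
# helpers added above. Requires pdflatex (texlive) and PyMuPDF.
import os
import subprocess
import tempfile

import cv2
from paddlex.inference.results.formula_rec import (
    generate_pdf_file,
    generate_tex_file,
    pdf2img,
)

formula = r"a^{2} + b^{2} = c^{2}"
with tempfile.TemporaryDirectory() as td:
    tex_path = os.path.join(td, "demo.tex")
    generate_tex_file(tex_path, formula)  # wraps the formula in a small LaTeX document
    try:
        generate_pdf_file(tex_path, td, is_debug=False)  # shells out to pdflatex
    except subprocess.CalledProcessError:
        raise SystemExit("pdflatex failed; install texlive as described in section 2.3")
    img = pdf2img(os.path.join(td, "demo.pdf"), os.path.join(td, "demo.jpg"))

if img is not None:
    cv2.imwrite("formula_render_demo.png", img)  # cropped rendering of the formula
else:
    print("rendering produced no image; check the texlive installation")
```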
+ 5 - 5
paddlex/modules/text_recognition/dataset_checker/dataset_src/convert_dataset.py

@@ -28,11 +28,11 @@ from .....utils.logging import info, warning
 
 def check_src_dataset(root_dir, dataset_type):
     """check src dataset format validity"""
-    if dataset_type in ("PKL"):
-        anno_suffix = ".pkl"
+    if dataset_type in ("MSTextRecDataset"):
+        anno_suffix = ".txt"
     else:
         raise ConvertFailedError(
-            message=f"数据格式转换失败!不支持{dataset_type}格式数据集。当前仅支持 PKL 格式。"
+            message=f"数据格式转换失败!不支持{dataset_type}格式数据集。当前仅支持 MSTextRecDataset 格式。"
         )
 
     err_msg_prefix = f"数据格式转换失败!请参考上述`{dataset_type}格式数据集示例`检查待转换数据集格式。"
@@ -50,11 +50,11 @@ def convert(dataset_type, input_dir):
     """convert dataset to pkl format"""
     # check format validity
     check_src_dataset(input_dir, dataset_type)
-    if dataset_type in ("PKL"):
+    if dataset_type in ("MSTextRecDataset"):
         convert_pkl_dataset(input_dir)
     else:
         raise ConvertFailedError(
-            message=f"数据格式转换失败!不支持{dataset_type}格式数据集。当前仅支持 PKL 格式。"
+            message=f"数据格式转换失败!不支持{dataset_type}格式数据集。当前仅支持 MSTextRecDataset 格式。"
         )