fix(ocr_mkcontent): expand para_to_standard_format_v2 to handle list and index blocks

- Modified the condition to include the List and Index block types
- List and Index paragraphs are now converted to 'text' content in the same way as Text paragraphs
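
A minimal sketch of the widened check, to illustrate what the commit message describes. The `BlockType` constants, the simplified `merge_para_with_text`, and the helper name `to_text_content` are stand-ins assumed for illustration; the real definitions live in the project and may differ, and the real content dict may carry more keys than the two shown here.

```python
class BlockType:
    # Stand-in constants (assumed values, not the project's real ones).
    Text = 'text'
    List = 'list'
    Index = 'index'
    Title = 'title'


# Block types that are treated as plain text after this change.
TEXT_LIKE_TYPES = (BlockType.Text, BlockType.List, BlockType.Index)


def merge_para_with_text(para_block):
    # Hypothetical simplification: join already-extracted line strings.
    return ' '.join(para_block.get('lines', []))


def to_text_content(para_block):
    """Return a 'text' content dict for text-like blocks, else an empty dict."""
    if para_block['type'] in TEXT_LIKE_TYPES:
        return {
            'type': 'text',
            'text': merge_para_with_text(para_block),
        }
    return {}


list_block = {'type': BlockType.List, 'lines': ['first item', 'second item']}
title_block = {'type': BlockType.Title, 'lines': ['Heading']}
list_result = to_text_content(list_block)
title_result = to_text_content(title_block)
```

Under these assumptions a List block yields a `'text'` entry, while a block type outside the tuple still falls through to the other branches (represented here by an empty dict).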
myhloli, 1 year ago
Parent
Current commit
644085760b
1 file changed, with 1 insertion and 1 deletion
+ 1 - 1
magic_pdf/dict2md/ocr_mkcontent.py

@@ -162,7 +162,7 @@ def merge_para_with_text(para_block):
 def para_to_standard_format_v2(para_block, img_buket_path, page_idx, drop_reason=None):
     para_type = para_block['type']
     para_content = {}
-    if para_type == BlockType.Text:
+    if para_type in [BlockType.Text, BlockType.List, BlockType.Index]:
         para_content = {
             'type': 'text',
             'text': merge_para_with_text(para_block),