Merge pull request #64 from papayalove/master

Fixed content loss in para_split
myhloli, 1 year ago
Commit 737e803d91
1 file changed, 2 insertions and 2 deletions
+ 2 - 2
magic_pdf/para/para_split_v2.py

@@ -246,11 +246,11 @@ def __group_line_by_layout(blocks, layout_bboxes, lang="en"):
     for lyout in layout_bboxes:
         lines = [line for block in blocks if block["type"] == BlockType.Text and is_in_layout(block['bbox'], lyout['layout_bbox']) for line in
                  block['lines']]
-        blocks = [block for block in blocks if is_in_layout(block['bbox'], lyout['layout_bbox'])]
+        blocks_in_layout = [block for block in blocks if is_in_layout(block['bbox'], lyout['layout_bbox'])]
 
 
         lines_group.append(lines)
-        blocks_group.append(blocks)
+        blocks_group.append(blocks_in_layout)
     return lines_group, blocks_group
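
The content loss comes from rebinding `blocks` inside the loop: after the first layout, `blocks` holds only the blocks that fell inside that layout, so every later layout (including its `lines` comprehension) filters an already reduced list. A minimal sketch of the effect follows. Note that `is_in_layout` here is a hypothetical stand-in based on simple bbox containment, and `group_blocks_buggy` / `group_blocks_fixed` are illustrative names, not functions from `magic_pdf`.

```python
def is_in_layout(bbox, layout_bbox):
    """Hypothetical stand-in: True if bbox lies fully inside layout_bbox."""
    x0, y0, x1, y1 = bbox
    lx0, ly0, lx1, ly1 = layout_bbox
    return lx0 <= x0 and ly0 <= y0 and x1 <= lx1 and y1 <= ly1


def group_blocks_buggy(blocks, layout_bboxes):
    """Mirrors the old code: `blocks` is rebound on every iteration."""
    blocks_group = []
    for layout in layout_bboxes:
        # Rebinding `blocks` shrinks the list that later iterations filter.
        blocks = [b for b in blocks if is_in_layout(b["bbox"], layout["layout_bbox"])]
        blocks_group.append(blocks)
    return blocks_group


def group_blocks_fixed(blocks, layout_bboxes):
    """Mirrors the fix: the per-layout result gets its own name."""
    blocks_group = []
    for layout in layout_bboxes:
        blocks_in_layout = [b for b in blocks if is_in_layout(b["bbox"], layout["layout_bbox"])]
        blocks_group.append(blocks_in_layout)
    return blocks_group


# Two side-by-side layouts, one block in each (made-up coordinates).
layouts = [{"layout_bbox": (0, 0, 100, 100)}, {"layout_bbox": (100, 0, 200, 100)}]
blocks = [{"bbox": (10, 10, 90, 90)}, {"bbox": (110, 10, 190, 90)}]

buggy = group_blocks_buggy(blocks, layouts)
fixed = group_blocks_fixed(blocks, layouts)
print([len(g) for g in buggy])  # [1, 0]: the second layout lost its block
print([len(g) for g in fixed])  # [1, 1]
```

With the separate name `blocks_in_layout`, each layout is filtered against the full input list, which matches the change in this commit.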