fix(magic_pdf): include List and Index block types in processing

Add BlockType.List and BlockType.Index to the block types that draw_span_bbox checks for in magic_pdf/libs/draw_bbox.py, so the lines and spans of List and Index blocks are processed the same way as those of Text, Title, and InterlineEquation blocks.
myhloli 1 year ago
parent
commit
0a9a6d3e53
1 changed file with 2 additions and 0 deletions

+ 2 - 0
magic_pdf/libs/draw_bbox.py

@@ -237,6 +237,8 @@ def draw_span_bbox(pdf_info, pdf_bytes, out_path, filename):
                 BlockType.Text,
                 BlockType.Title,
                 BlockType.InterlineEquation,
+                BlockType.List,
+                BlockType.Index,
             ]:
                 for line in block['lines']:
                     for span in line['spans']:
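
The effect of the change can be illustrated with a minimal, self-contained sketch. It is not the code from draw_bbox.py: the `BlockType` class below is a hypothetical stand-in with assumed string values, the `'type'` and `'bbox'` keys are assumptions about the block layout, and `collect_span_bboxes` is an illustrative helper name. Only the membership check and the `block['lines']` / `line['spans']` loops mirror the diff.

```python
class BlockType:
    # Hypothetical stand-in for the project's BlockType; values are assumed.
    Text = "text"
    Title = "title"
    InterlineEquation = "interline_equation"
    List = "list"
    Index = "index"
    Image = "image"


# Block types whose lines and spans are walked, as in the patched hunk.
SPAN_BLOCK_TYPES = [
    BlockType.Text,
    BlockType.Title,
    BlockType.InterlineEquation,
    BlockType.List,   # added by this commit
    BlockType.Index,  # added by this commit
]


def collect_span_bboxes(blocks):
    """Return the bbox of every span in blocks whose type is in SPAN_BLOCK_TYPES."""
    bboxes = []
    for block in blocks:
        # Assumption: each block carries its type under the 'type' key.
        if block["type"] in SPAN_BLOCK_TYPES:
            for line in block["lines"]:
                for span in line["spans"]:
                    # Assumption: each span carries its box under the 'bbox' key.
                    bboxes.append(span["bbox"])
    return bboxes


if __name__ == "__main__":
    sample_blocks = [
        {"type": BlockType.List,
         "lines": [{"spans": [{"bbox": [0, 0, 10, 10]}]}]},
        {"type": BlockType.Image},  # not in the list, so it is not walked
    ]
    print(collect_span_bboxes(sample_blocks))
```

In this sketch, a block whose type is missing from the list contributes no spans, which is the situation the commit changes for List and Index blocks.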