Merge pull request #56 from myhloli/master

Some OCR text boxes differ too much from their block boxes, so lower the fill threshold to 0.7
myhloli, 1 year ago
Commit 1e69067047
1 file changed, 1 insertion and 1 deletion
+ 1 - 1
magic_pdf/pre_proc/ocr_dict_merge.py

@@ -156,7 +156,7 @@ def fill_spans_in_blocks(blocks, spans):
         block_spans = []
         for span in spans:
             span_bbox = span['bbox']
-            if calculate_overlap_area_in_bbox1_area_ratio(span_bbox, block_bbox) > 0.8:
+            if calculate_overlap_area_in_bbox1_area_ratio(span_bbox, block_bbox) > 0.7:
                 block_spans.append(span)
 
         '''Inline formula adjustment: match the height to the text on the same line (left side first, then right side)'''
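
To illustrate what the threshold change does, here is a minimal sketch. It assumes that `calculate_overlap_area_in_bbox1_area_ratio(bbox1, bbox2)` returns the intersection area divided by the area of `bbox1`, and that boxes are `(x0, y0, x1, y1)` tuples; the diff does not show either detail, so both are assumptions. `overlap_ratio_in_bbox1`, `fill_spans`, and the sample boxes are hypothetical stand-ins, not the project's code.

```python
def overlap_ratio_in_bbox1(bbox1, bbox2):
    """Hypothetical stand-in: intersection area divided by bbox1's area."""
    # Corners of the intersection rectangle.
    x0 = max(bbox1[0], bbox2[0])
    y0 = max(bbox1[1], bbox2[1])
    x1 = min(bbox1[2], bbox2[2])
    y1 = min(bbox1[3], bbox2[3])
    if x1 <= x0 or y1 <= y0:
        return 0.0  # the boxes do not overlap
    bbox1_area = (bbox1[2] - bbox1[0]) * (bbox1[3] - bbox1[1])
    if bbox1_area <= 0:
        return 0.0  # degenerate box
    return (x1 - x0) * (y1 - y0) / bbox1_area


def fill_spans(block_bbox, spans, threshold):
    """Keep the spans whose overlap with the block exceeds the threshold."""
    return [
        span for span in spans
        if overlap_ratio_in_bbox1(span['bbox'], block_bbox) > threshold
    ]


# Made-up sample data: one block and three OCR spans.
block_bbox = (0, 0, 100, 20)
spans = [
    {'bbox': (10, 0, 50, 20), 'content': 'inside'},    # fully inside the block
    {'bbox': (0, -5, 100, 15), 'content': 'shifted'},  # 75% of the span overlaps
    {'bbox': (0, 10, 100, 30), 'content': 'outside'},  # 50% of the span overlaps
]

kept_old = [s['content'] for s in fill_spans(block_bbox, spans, 0.8)]
kept_new = [s['content'] for s in fill_spans(block_bbox, spans, 0.7)]
print(kept_old)
print(kept_new)
```

Under these assumptions, a span whose OCR box is offset from the block by a quarter of its height is dropped at 0.8 but kept at 0.7, which is the kind of mismatch the commit message describes.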