
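feat(add SealOCR recognizer support): introduce SealOCRRecognizer in the adapters module and update the BaseLayoutDetector class to handle overlaps involving the seal category, streamlining the seal recognition flow.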

zhch158_admin 1 month ago
Parent
Commit
49a0fefc0e
2 changed files with 8 additions and 0 deletions
  1. 2 0
      ocr_tools/universal_doc_parser/models/adapters/__init__.py
  2. 6 0
      ocr_tools/universal_doc_parser/models/adapters/base.py

+ 2 - 0
ocr_tools/universal_doc_parser/models/adapters/__init__.py

@@ -40,6 +40,7 @@ try:
         MinerUOCRRecognizer
     )
     from .mineru_wired_table import MinerUWiredTableRecognizer
+    from .seal_ocr_adapter import SealOCRRecognizer
     MINERU_AVAILABLE = True
 except ImportError:
     MINERU_AVAILABLE = False
@@ -78,6 +79,7 @@ if MINERU_AVAILABLE:
         'MinerUVLRecognizer',
         'MinerUOCRRecognizer',
         'MinerUWiredTableRecognizer',
+        'SealOCRRecognizer',
     ])
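
The hunks above place the new import inside the existing `try` block, so the names in that block become available or unavailable together. A minimal sketch of that guarded-import pattern follows, using stdlib stand-ins so it runs anywhere. The helper `load_optional_group`, the `exported` list, and the module name `hypothetical_seal_ocr_adapter_module` are illustrative assumptions, not part of the repository.

```python
import importlib


def load_optional_group(specs):
    """Import every (module, name) pair, or report the whole group as unavailable.

    Mirrors a single try/except ImportError block: one failing import
    disables all names in the group. (Hypothetical helper for illustration.)
    """
    loaded = {}
    try:
        for module_name, attr in specs:
            module = importlib.import_module(module_name)
            loaded[attr] = getattr(module, attr)
    except (ImportError, AttributeError):
        return False, {}
    return True, loaded


exported = ['BaseAdapter']  # stand-in for the package's always-available exports

# Stdlib stand-in plus a module that is assumed not to exist.
available, names = load_optional_group([
    ('json', 'JSONDecoder'),
    ('hypothetical_seal_ocr_adapter_module', 'SealOCRRecognizer'),
])
if available:
    # Only extend the export list when every optional import succeeded.
    exported.extend(names)
```

In this sketch the missing module makes the whole group unavailable, so `exported` stays unchanged. If the same holds for the real package, a failure to import `seal_ocr_adapter` would also set `MINERU_AVAILABLE` to `False`.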
 
 

+ 6 - 0
ocr_tools/universal_doc_parser/models/adapters/base.py

@@ -620,6 +620,12 @@ class BaseLayoutDetector(BaseAdapter):
                 bbox1, bbox2 = results[i].get('bbox', []), results[j].get('bbox', [])
                 if len(bbox1) < 4 or len(bbox2) < 4:
                     continue
+
+                cat_i = results[i].get('category', '')
+                cat_j = results[j].get('category', '')
+                # Seals are often stamped over tables/text, so overlapping a large region is normal; keep both
+                if cat_i == 'seal' or cat_j == 'seal':
+                    continue
                 
                 # Compute overlap metrics
                 iou = coordinate_utils.calculate_iou(bbox1, bbox2)
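
The effect of the seal check can be illustrated with a small, self-contained sketch. The diff shows only the start of the pairwise loop, so everything after the IoU call here is an assumption: the `calculate_iou` below is a stand-in for `coordinate_utils.calculate_iou`, and the threshold, the `score` field, and the rule of dropping the lower-scoring box are hypothetical choices, not the repository's logic.

```python
def calculate_iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes (stand-in for coordinate_utils.calculate_iou)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def remove_overlaps(results, iou_threshold=0.8):
    """Drop the lower-scoring box of each heavily overlapping pair,
    except for pairs that involve a seal (both boxes are kept)."""
    removed = set()
    for i in range(len(results)):
        if i in removed:
            continue
        for j in range(i + 1, len(results)):
            if j in removed:
                continue
            bbox1, bbox2 = results[i].get('bbox', []), results[j].get('bbox', [])
            if len(bbox1) < 4 or len(bbox2) < 4:
                continue
            # Same check as in the diff: never resolve overlaps involving a seal.
            if results[i].get('category', '') == 'seal' or results[j].get('category', '') == 'seal':
                continue
            if calculate_iou(bbox1, bbox2) > iou_threshold:
                # Assumed tie-break: the box with the lower 'score' loses.
                loser = i if results[i].get('score', 0) < results[j].get('score', 0) else j
                removed.add(loser)
                if loser == i:
                    break
    return [r for k, r in enumerate(results) if k not in removed]


detections = [
    {'category': 'table', 'bbox': [0, 0, 100, 100], 'score': 0.9},
    {'category': 'text', 'bbox': [0, 0, 100, 95], 'score': 0.5},
    {'category': 'seal', 'bbox': [0, 0, 100, 98], 'score': 0.3},
]
kept = remove_overlaps(detections)
```

With these inputs the text box is dropped because it overlaps the table almost entirely, while the seal is kept even though it overlaps the table even more.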