
feat(restore seal recognition + fix lost table text): re-sync from main the feature code wrongly deleted by sync commit 6472578

Seal recognition (8 files):
- Add seal_ocr_adapter.py: seal-specific OCR adapter built on MinerU PytorchPaddleOCR(lang=seal)
- layout_model_router.py: manually merge main's seal_supplement supplementary detection with dev's filter_layout_models_for_router hook
- element_processors.py: process_seal_element prefers SealOCRRecognizer and falls back to VLM
- pipeline_manager_v2.py: integrate seal_ocr_recognizer; add chart to IMAGE_BODY_CATEGORIES
- model_factory.py: add the create_seal_ocr_recognizer factory method
- adapters/__init__.py: register SealOCRRecognizer
- bank_statement_yusys_v4.yaml: add the seal_supplement and seal_recognition config sections (keeping dev's cuda production config)
- docs/mineru/印章识别-seal处理流程.md: technical documentation for seal recognition

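Restored accidental deletions (3 files):
- base.py: restore the seal dedup guard (skip the seal class when IoU is high, so seals are not removed by mistake)
- _mineru_vl_patches.py: restore the OTSL patch (fixes the missing <fcel> in the first table cell from PaddleOCR-VL, which caused the text of every cell to be lost)
- mineru_adapter.py: manually merge main's patches call with dev's production device-strategy adaptation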
hankal 9 hours ago
Parent
Commit
7d2d878b38

+ 436 - 0
docs/mineru/印章识别-seal处理流程.md

@@ -0,0 +1,436 @@
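+# MinerU Seal Recognition Pipeline
+
+## Overview
+
+MinerU supports seal recognition in the pipeline backend starting with v2.5. It does not rely on a separate, dedicated detection model; recognition is done in two steps:
+
+1. **Layout detection**: the PP-DocLayoutV2 model includes `seal` among its 25 layout classes (index 20) and locates seal regions in a single pass.
+2. **Seal text OCR**: a PaddleOCR variant tuned for seals runs text detection and recognition on each detected seal region.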
+
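+## Models used
+
+| Step | Model | Notes |
+|------|------|------|
+| **Step 1: seal region detection** | PP-DocLayoutV2 (RT-DETR based) | Locates seal regions on the page and outputs bbox coordinates |
+| **Step 2: seal text detection** | `seal_PP-OCRv4_det_server_infer.pth` | Dedicated DB text detector with a PP-HGNet_small backbone and polygon detection boxes |
+| **Step 2: seal text recognition** | `ch_PP-OCRv4_rec_server_infer.pth` | General-purpose Chinese text recognition model |
+| **Lite variant** | `seal_PP-OCRv4_det_infer.pth` + `ch_PP-OCRv4_rec_infer.pth` | Lightweight models selected automatically in CPU environments |
+
+**Key point**: no separate "seal detection model" is needed. Seal detection is built into PP-DocLayoutV2 and shares the same visual encoder.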
+
+---
+
+## 1. Step 1: PP-DocLayoutV2 layout detection
+
+### 1.1 Label definitions
+
+PP-DocLayoutV2 defines 25 layout classes; `seal` has index 20 (the 21st entry):
+
+```python
+# mineru/model/layout/pp_doclayoutv2.py
+PP_DOCLAYOUT_V2_LABELS = [
+    "abstract",           # 0
+    "algorithm",          # 1
+    "aside_text",         # 2
+    "chart",              # 3
+    "content",            # 4
+    "display_formula",    # 5
+    "doc_title",          # 6
+    "figure_title",       # 7
+    "footer",             # 8
+    "footer_image",       # 9
+    "footnote",           # 10
+    "formula_number",     # 11
+    "header",             # 12
+    "header_image",       # 13
+    "image",              # 14
+    "inline_formula",     # 15
+    "number",             # 16
+    "paragraph_title",    # 17
+    "reference",          # 18
+    "reference_content",  # 19
+    "seal",               # 20 seal ←
+    "table",              # 21
+    "text",               # 22
+    "vertical_text",      # 23
+    "vision_footnote",    # 24
+]
+```
+
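+### 1.2 Confidence threshold
+
+The confidence threshold for seals is set to **0.45**, lower than for most other classes, so that seals are less likely to be missed: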
+
+```python
+DEFAULT_CLASS_THRESHOLDS = [
+    # ... other classes ...
+    0.45,  # 20 seal  ← lower threshold
+    # ...
+]
+```
+
+### 1.3 Mapping in MagicModel
+
+Once the layout results enter `MagicModel`, the `seal` label is mapped to `BlockType.SEAL`:
+
+```python
+# mineru/backend/pipeline/pipeline_magic_model.py
+PP_DOCLAYOUT_V2_LABELS_TO_BLOCK_TYPES = {
+    # ...
+    "seal": BlockType.SEAL,
+    # ...
+}
+```
+
+`BlockType.SEAL` is a type defined in MinerU's internal enum class:
+
+```python
+# mineru/utils/enum_class.py
+class BlockType:
+    # ...
+    SEAL = "seal"
+    # ...
+```
+
+### 1.4 Span construction for seal blocks
+
+In `MagicModel.__build_page_blocks()`, blocks of type `BlockType.SEAL` are built as image-only spans (like image/table/chart):
+
+```python
+# mineru/backend/pipeline/pipeline_magic_model.py (__build_page_blocks)
+elif block["type"] in [BlockType.SEAL]:
+    span_type = ContentType.SEAL
+
+if span_type in [
+    ContentType.IMAGE,
+    ContentType.TABLE,
+    ContentType.CHART,
+    ContentType.INTERLINE_EQUATION,
+    ContentType.SEAL      # ← seals take the image-only path
+]:
+    span = {
+        "bbox": block["bbox"],
+        "type": span_type,
+    }
+    if span_type == ContentType.SEAL:
+        span["content"] = block.get("text")  # text recognized by seal OCR
+```
+
+Seal blocks are handled as visual blocks and are not sent through the regular text OCR flow.
+
+---
+
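+## 2. Step 2: Seal text OCR
+
+### 2.1 Entry point: batch_analyze.py
+
+In `batch_analyze.py`, once layout detection has finished for all pages, a dedicated block of code handles seal OCR (around lines 873-918).
+
+#### Flow:
+
+**a) Collect seal blocks**
+
+Iterate over the layout results of every page and collect the blocks with `label == "seal"`: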
+
+```python
+# mineru/backend/pipeline/batch_analyze.py
+seal_ocr_items = []
+for ocr_res_list_dict in ocr_res_list_all_page:
+    for layout_res_item in ocr_res_list_dict['layout_res']:
+        if layout_res_item.get("label") == "seal":
+            seal_ocr_items.append((ocr_res_list_dict, layout_res_item))
+```
+
+**b) Crop the seal sub-image**
+
+Crop the seal region from the original image using the bbox from layout detection:
+
+```python
+seal_bbox = normalize_to_int_bbox(
+    layout_res_item.get("bbox"),
+    image_size=(image_h, image_w),
+)
+x0, y0, x1, y1 = seal_bbox
+seal_crop_rgb = np_img[y0:y1, x0:x1]  # crop the seal sub-image
+```
+
+**c) Load the seal OCR model**
+
+Obtain the `PytorchPaddleOCR` instance for `lang="seal"` via `atom_model_manager.get_atom_model()`:
+
+```python
+if seal_ocr_model is None:
+    seal_ocr_model = atom_model_manager.get_atom_model(
+        atom_model_name=AtomicModel.OCR,
+        lang="seal",       # ← key: selects seal mode
+    )
+```
+
+**d) Run OCR**
+
+Run detection and recognition on the cropped seal image:
+
+```python
+seal_crop_bgr = cv2.cvtColor(seal_crop_rgb, cv2.COLOR_RGB2BGR)
+seal_ocr_res = seal_ocr_model.ocr(seal_crop_bgr, det=True, rec=True)[0]
+```
+
+**e) Aggregate the text**
+
+Collect all recognized text segments into a list and store it in `layout_res_item["text"]`:
+
+```python
+seal_texts = []
+for seal_item in seal_ocr_res:
+    rec_result = seal_item[1]       # (text, score) tuple
+    rec_text = rec_result[0]
+    if rec_text:
+        seal_texts.append(rec_text)
+layout_res_item["text"] = seal_texts
+```
+
+### 2.2 Seal OCR model configuration
+
+#### Model files
+
+Defined in `mineru/model/utils/pytorchocr/utils/resources/models_config.yml`:
+
+```yaml
+seal:
+    det: seal_PP-OCRv4_det_server_infer.pth    # seal text detection model
+    rec: ch_PP-OCRv4_rec_server_infer.pth      # Chinese text recognition model
+    dict: ppocr_keys_v1.txt                    # character dictionary
+
+seal_lite:                                     # CPU environments fall back to the lite variant automatically
+    det: seal_PP-OCRv4_det_infer.pth
+    rec: ch_PP-OCRv4_rec_infer.pth
+    dict: ppocr_keys_v1.txt
+```
+
+#### Detection model architecture
+
+`seal_PP-OCRv4_det_server_infer` uses a PP-HGNet_small backbone with a DB head (`mineru/model/utils/pytorchocr/utils/resources/arch_config.yaml`):
+
+```yaml
+seal_PP-OCRv4_det_server_infer:
+  model_type: det
+  algorithm: DB
+  Backbone:
+    name: PPHGNet_small
+  Head:
+    name: DBHead
+    k: 50
+```
+
+#### Special parameters (tuned for seals)
+
+In `PytorchPaddleOCR.__init__()`, the default OCR parameters are overridden when `lang == "seal"`:
+
+```python
+# mineru/model/ocr/pytorch_paddle.py
+if self.is_seal:
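+    kwargs['det_limit_side_len'] = 736       # minimum side length (keeps small seals from being scaled down too far)
+    kwargs['det_limit_type'] = 'min'
+    kwargs['det_max_side_limit'] = 4000      # maximum side length
+    kwargs['det_db_thresh'] = 0.2            # very low detection threshold (fine, curved text is less likely to be missed)
+    kwargs['det_db_box_thresh'] = 0.6        # detection box threshold
+    kwargs['det_db_unclip_ratio'] = 0.5      # text box expansion ratio
+    kwargs['det_box_type'] = 'poly'          # polygon detection boxes (seal text often follows an arc)
+    kwargs['use_dilation'] = False           # no dilation
+    kwargs['enable_merge_det_boxes'] = False # do not merge detection boxes
+    kwargs['drop_score'] = 0                 # keep every result, however low its confidence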
+```
+
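+Summary of the key differences:
+
+| Parameter | seal mode | Regular OCR mode | Reason |
+|------|-----------|---------------|------|
+| `det_box_type` | `'poly'` (polygon) | `'quad'` (quadrilateral) | Seal text follows an arc, so rectangular boxes fit poorly |
+| `det_db_thresh` | `0.2` | `0.3` | Seal text is fine and small and needs a lower threshold |
+| `drop_score` | `0` | `0.5` | Seal text is often blurry, so low-confidence results are kept |
+| `enable_merge_det_boxes` | `False` | `True` | Seal text is sparsely arranged, so detection boxes are not merged |
+| `det_limit_side_len` | `736` (`min` mode) | `960` (`max` mode) | Ensures small seals get enough resolution |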
+
+### 2.3 Polygon cropping and rectification
+
+After seal text detection, a dedicated polygon cropping and rectification pipeline is applied:
+
+```python
+# mineru/model/ocr/pytorch_paddle.py (ocr method)
+if self.is_seal:
+    dt_boxes = self._seal_sort_boxes(dt_boxes)              # sort detection boxes
+    img_crop_list = self._seal_crop_by_polys(ori_im, dt_boxes)  # polygon cropping
+```
+
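+The `CropByPolys` class (`mineru/model/ocr/seal_crop.py`) is responsible for:
+- Sorting the detected polygon boxes
+- Rotated cropping with `cv2.minAreaRect` plus a perspective transform
+- Rectifying irregular polygon regions through `AutoRectifier` (affine/homography), flattening curved text into horizontal text
+- Falling back to a bounding-rectangle crop when polygon simplification fails, so the pipeline is not interrupted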
+
+---
+
+## 3. Output formats
+
+Seal recognition results appear in several output files:
+
+### 3.1 content_list.json
+
+Seals are emitted as their own content type, `seal`:
+
+```json
+{
+    "type": "seal",
+    "img_path": "images/xxx.jpg",
+    "text": ["recognized seal text"],
+    "bbox": [x0, y0, x1, y1],
+    "page_idx": 0
+}
+```
+
+The generating code is in the `_get_seal_span` and `_get_seal_text` functions of `mineru/backend/pipeline/pipeline_middle_json_mkcontent.py`.
+
+### 3.2 content_list_v2.json (3.0+)
+
+Uses the newer structured format:
+
+```json
+{
+    "type": "seal",
+    "content": {
+        "image_source": {
+            "path": "images/xxx.jpg"
+        },
+        "seal_content": [
+            {"type": "text", "content": "recognized seal text"}
+        ]
+    },
+    "bbox": [x0, y0, x1, y1]
+}
+```
+
+### 3.3 middle.json
+
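+Seal blocks appear in `preproc_blocks` with type `seal`; they contain a `lines[].spans[]` structure whose spans have type `seal`.
+
+### 3.4 model.json
+
+Seal detections are emitted at the layout detection stage and include `cls_id: 20`, `label: "seal"`, `score`, and `bbox`.
+
+### 3.5 layout.pdf
+
+Seal regions are drawn in their own color in the layout visualization PDF, alongside the other layout classes.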
+
+---
+
+## 4. Enum definitions for the seal content type
+
+```python
+# mineru/utils/enum_class.py
+
+class BlockType:
+    SEAL = "seal"           # block-level type
+
+class ContentType:
+    SEAL = 'seal'           # type used in content_list.json
+
+class ContentTypeV2:
+    SEAL = 'seal'           # type used in content_list_v2.json
+```
+
+---
+
+## 5. End-to-end flow diagram
+
+```
+Input PDF page
+                 │
+                 ▼
+┌──────────────────────────────────┐
+│  PP-DocLayoutV2 layout detection │
+│  (RT-DETR based)                 │
+│  25 classes, including seal      │
+│  cls_id=20, threshold=0.45       │
+└────────────────┬─────────────────┘
+                 │
+                 ▼
+┌──────────────────────────────────┐
+│  Keep layout results with        │
+│  label == "seal"                 │
+└────────────────┬─────────────────┘
+                 │
+                 ▼
+┌──────────────────────────────────┐
+│  Crop from the page image by bbox│
+│  seal_crop = img[y0:y1, x0:x1]   │
+└────────────────┬─────────────────┘
+                 │
+                 ▼
+┌──────────────────────────────────┐
+│  PytorchPaddleOCR(lang="seal")   │
+│                                  │
+│  1. Detection (det):             │
+│     seal_PP-OCRv4_det            │
+│     DB + PP-HGNet_small          │
+│     polygon boxes                │
+│                                  │
+│  2. CropByPolys + AutoRectifier  │
+│     polygon crop + rectification │
+│     curved text -> horizontal    │
+│                                  │
+│  3. Recognition (rec):           │
+│     ch_PP-OCRv4_rec              │
+│     Chinese text recognition     │
+└────────────────┬─────────────────┘
+                 │
+                 ▼
+┌──────────────────────────────────┐
+│  Output                          │
+│  • seal_text (list of strings)   │
+│  • seal_image (crop)             │
+│  → content_list.json             │
+│  → middle.json                   │
+│  → model.json                    │
+│  → layout.pdf                    │
+└──────────────────────────────────┘
+```
+
+---
+
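+## 6. Debugging
+
+Seal OCR supports debug output controlled by environment variables:
+
+| Environment variable | Description |
+|----------|------|
+| `MINERU_SEAL_OCR_DEBUG=1` | Enable debug mode |
+| `MINERU_SEAL_OCR_DEBUG_DIR=/path/to/dir` | Debug output directory (default: `output_images/seal_ocr_debug/`) |
+
+In debug mode, each seal recognition sample writes:
+- `input.png`: the input image
+- `det_vis.png`: visualization of the detection boxes
+- `crop_NN.png`: each cropped text region
+- `meta.json`: metadata (text, confidence)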
+
+---
+
+## 7. Key file index
+
+| File | Purpose |
+|------|------|
+| `mineru/model/layout/pp_doclayoutv2.py` | PP-DocLayoutV2 model definition, including the seal class (index 20) |
+| `mineru/backend/pipeline/batch_analyze.py` | Main seal OCR flow (lines 873-918) |
+| `mineru/backend/pipeline/pipeline_magic_model.py` | seal block mapping and span construction in MagicModel |
+| `mineru/backend/pipeline/model_json_to_middle_json.py` | Middle JSON assembly and seal image cropping |
+| `mineru/model/ocr/pytorch_paddle.py` | PytorchPaddleOCR class: seal-mode parameters and OCR flow |
+| `mineru/model/ocr/seal_crop.py` | Seal polygon cropping (`CropByPolys`) and affine rectification |
+| `mineru/model/ocr/seal_det_warp.py` | Affine rectifier for seal detection boxes (`AutoRectifier`) |
+| `mineru/model/utils/pytorchocr/utils/resources/models_config.yml` | seal/seal_lite model file configuration |
+| `mineru/model/utils/pytorchocr/utils/resources/arch_config.yaml` | seal detection model architecture configuration |
+| `mineru/utils/enum_class.py` | BlockType.SEAL / ContentType.SEAL enum definitions |
+| `mineru/backend/pipeline/pipeline_middle_json_mkcontent.py` | seal output formatting for content_list.json / content_list_v2.json |
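
Reviewer sketch (not part of the commit): step 2.1(e) of the document above reduces an OCR result to a list of strings. The snippet below is a minimal, standalone illustration of that aggregation. The helper name `collect_seal_texts` and the sample data are hypothetical, and the item shape `[polygon, (text, score)]` is assumed from the `seal_item[1]` access shown in the document.

```python
from typing import Any, List, Optional, Sequence


def collect_seal_texts(ocr_result: Optional[Sequence[Any]]) -> List[str]:
    """Return the non-empty recognized strings from one seal OCR result.

    Assumes each item looks like [polygon, (text, score)].
    """
    texts: List[str] = []
    for item in ocr_result or []:
        # Skip malformed items instead of raising, so one bad box
        # does not discard the rest of the seal text.
        if not item or len(item) < 2 or not item[1]:
            continue
        text = item[1][0]
        if text:
            texts.append(text)
    return texts


# Hypothetical OCR output: three detected polygons, one with empty text.
sample = [
    [[(0, 0), (10, 0), (10, 5), (0, 5)], ("Example Co., Ltd.", 0.93)],
    [[(0, 6), (10, 6), (10, 9), (0, 9)], ("", 0.10)],
    [[(0, 10), (10, 10), (10, 14), (0, 14)], ("Contract Seal", 0.71)],
]
print(collect_seal_texts(sample))
```

Because `drop_score` is 0 in seal mode, low-confidence strings are kept; only empty strings are filtered out here.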

+ 23 - 0
ocr_tools/universal_doc_parser/config/bank_statement_yusys_v4.yaml

@@ -82,6 +82,22 @@ layout_detection:
     min_text_width_ratio: 0.4         # minimum width ratio (40%)
     min_text_height_ratio: 0.3        # minimum height ratio (30%)
 
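+  # Supplementary seal detection: use PP-DocLayoutV3 to pick up seal regions that docling fails to detect
+  seal_supplement:
+    enabled: true                # enable supplementary seal detection
+    replace_existing: false      # false = incremental merge; true = replace every seal already in the primary result
+    replace_overlapping_image: true   # when a seal has high IoU with image_body/image etc., replace that box with the seal (instead of dropping the seal)
+    replace_iou_threshold: 0.7        # minimum IoU that triggers replacement
+    duplicate_iou_threshold: 0.3      # if not replaced, a seal whose IoU with any box exceeds this is treated as a duplicate
+    # model config used by seal_detector; by default reuses the paddle_ppdoclayoutv3 config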
+    model_config:
+      module: "paddle"
+      model_name: "PP-DocLayoutV3"
+      model_dir: "PaddlePaddle/PP-DocLayoutV3_safetensors"
+      device: "cpu"
+      conf: 0.3
+      num_threads: 4
+
   # Debug visualization (the base image is inference_image, the same as the layout detection input)
   debug_options:
     enabled: false              # controlled by the command-line flags --debug / --debug-layout
@@ -263,6 +279,13 @@ vl_recognition:
   table_recognition:
 
 # ============================================================
+# Seal OCR recognition config - based on MinerU PytorchPaddleOCR(lang="seal")
+# ============================================================
+seal_recognition:
+  enabled: true                # enable the seal-specific OCR; when disabled, falls back to VLM recognition
+  module: "mineru"             # use the MinerU seal OCR model
+
+# ============================================================
 # Output config
 # ============================================================
 output:
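
Reviewer sketch (not part of the commit): the defaults that `layout_model_router.py` applies when keys of `seal_supplement` are missing can be summarized as a small typed view. The class `SealSupplementSettings` and its method name are hypothetical; the default values are taken from the `.get(...)` calls in the router diff.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class SealSupplementSettings:
    """Typed view of the seal_supplement section (hypothetical helper)."""

    enabled: bool = False
    replace_existing: bool = False
    replace_overlapping_image: bool = True
    replace_iou_threshold: float = 0.7
    duplicate_iou_threshold: float = 0.3

    @classmethod
    def from_layout_config(cls, layout_cfg: Dict[str, Any]) -> "SealSupplementSettings":
        # A missing or empty section behaves like "feature disabled".
        raw = layout_cfg.get("seal_supplement") or {}
        return cls(
            enabled=bool(raw.get("enabled", False)),
            replace_existing=bool(raw.get("replace_existing", False)),
            replace_overlapping_image=bool(raw.get("replace_overlapping_image", True)),
            replace_iou_threshold=float(raw.get("replace_iou_threshold", 0.7)),
            duplicate_iou_threshold=float(raw.get("duplicate_iou_threshold", 0.3)),
        )


# Example: only two keys are set; the rest fall back to the defaults.
layout_cfg = {"seal_supplement": {"enabled": True, "duplicate_iou_threshold": 0.25}}
settings = SealSupplementSettings.from_layout_config(layout_cfg)
print(settings)
```

Note that the feature is off unless `enabled: true` is set, so a config without the section keeps the previous behavior.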

+ 41 - 13
ocr_tools/universal_doc_parser/core/element_processors.py

@@ -48,6 +48,7 @@ class ElementProcessors:
         wired_table_recognizer: Optional[Any] = None,
         table_classifier: Optional[Any] = None,
         vl_recognizer_lazy_loader: Optional[Any] = None,  # 🆕 lazy-loading callback
+        seal_ocr_recognizer: Optional[Any] = None,  # 🆕 seal OCR recognizer
     ):
         """
         Initialize the element processors
@@ -60,6 +61,7 @@ class ElementProcessors:
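             wired_table_recognizer: wired table recognizer (optional)
             table_classifier: table classifier (distinguishes wired from wireless tables, optional)
             vl_recognizer_lazy_loader: lazy-loading callback for the VL recognizer (optional)
+            seal_ocr_recognizer: seal OCR recognizer (optional; falls back to VLM when absent)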
         """
         self.preprocessor = preprocessor
         self.ocr_recognizer = ocr_recognizer
@@ -67,6 +69,7 @@ class ElementProcessors:
         self.table_cell_matcher = table_cell_matcher
         self.wired_table_recognizer = wired_table_recognizer
         self.table_classifier = table_classifier
+        self.seal_ocr_recognizer = seal_ocr_recognizer
         
         # Lazy-loading support for the VL recognizer
         self._vl_recognizer_lazy_loader = vl_recognizer_lazy_loader
@@ -729,23 +732,47 @@ class ElementProcessors:
         layout_item: Dict[str, Any]
     ) -> Dict[str, Any]:
         """
-        Process a seal element - recognize with the VLM
-        
+        Process a seal element - prefer SealOCRRecognizer, fall back to VLM
+
         Args:
             image: page image
             layout_item: layout detection item
-            
+
         Returns:
             processed element dict
         """
         bbox = layout_item.get('bbox', [0, 0, 0, 0])
         category = layout_item.get('category', 'seal')
         cropped_region = CoordinateUtils.crop_region(image, bbox)
-        
+
         content = {'text': '', 'confidence': 0.0}
-        
+
+        # Prefer SealOCRRecognizer (MinerU's seal-specific OCR)
+        if self.seal_ocr_recognizer is not None:
+            try:
+                seal_result = self.seal_ocr_recognizer.recognize(cropped_region)
+                if seal_result.get('text', '').strip():
+                    content = {
+                        'text': seal_result['text'],
+                        'confidence': seal_result.get('confidence', 0.0),
+                        'texts': seal_result.get('texts', []),
+                        'details': seal_result.get('details', []),
+                        'recognition_method': 'seal_ocr',
+                    }
+                    logger.info(f"🔖 Seal recognized (OCR): {content['text'][:50]}..."
+                                if len(content['text']) > 50
+                                else f"🔖 Seal recognized (OCR): {content['text']}")
+                    return {
+                        'type': category,
+                        'bbox': bbox,
+                        'confidence': layout_item.get('confidence', 0.0),
+                        'content': content
+                    }
+            except Exception as e:
+                logger.warning(f"SealOCRRecognizer failed, falling back to VLM: {e}")
+
+        # Fallback: recognize with the VLM
         try:
-            # Lazy-load the VL recognizer
             vl_recognizer = self._ensure_vl_recognizer()
             if vl_recognizer is None:
                 logger.error("❌ VL recognizer not available for seal recognition")
@@ -754,19 +781,20 @@ class ElementProcessors:
                     'bbox': bbox,
                     'content': content
                 }
-            
-            # Use the recognize_text method with element_type='seal'
-            # The GLM-OCR adapter picks the matching prompt based on element_type
+
             seal_result = vl_recognizer.recognize_text(cropped_region, element_type='seal')
             content = {
                 'text': seal_result.get('text', ''),
-                'confidence': seal_result.get('confidence', 0.0)
+                'confidence': seal_result.get('confidence', 0.0),
+                'recognition_method': 'vlm',
             }
-            
-            logger.info(f"🔖 Seal recognized: {content['text'][:50]}..." if len(content['text']) > 50 else f"🔖 Seal recognized: {content['text']}")
+
+            logger.info(f"🔖 Seal recognized (VLM): {content['text'][:50]}..."
+                        if len(content['text']) > 50
+                        else f"🔖 Seal recognized (VLM): {content['text']}")
         except Exception as e:
             logger.warning(f"Seal recognition failed: {e}")
-        
+
         return {
             'type': category,
             'bbox': bbox,
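
Reviewer sketch (not part of the commit): the control flow of `process_seal_element` is "try the dedicated OCR, and fall back to the VLM when it raises or returns blank text". The standalone snippet below illustrates that pattern with stub recognizers. `recognize_seal`, `failing_ocr` and `stub_vlm` are hypothetical names, and the recognizers are modelled as plain callables rather than the adapter classes used in the diff.

```python
from typing import Any, Callable, Dict, Optional

Recognizer = Callable[[Any], Dict[str, Any]]


def recognize_seal(
    crop: Any,
    seal_ocr: Optional[Recognizer],
    vlm: Optional[Recognizer],
) -> Dict[str, Any]:
    """Try the seal OCR first; fall back to the VLM on failure or blank text."""
    if seal_ocr is not None:
        try:
            result = seal_ocr(crop)
            text = result.get("text") or ""
            if text.strip():
                return {
                    "text": text,
                    "confidence": result.get("confidence", 0.0),
                    "recognition_method": "seal_ocr",
                }
        except Exception:
            pass  # any OCR error falls through to the VLM path
    if vlm is not None:
        try:
            result = vlm(crop)
            return {
                "text": result.get("text", ""),
                "confidence": result.get("confidence", 0.0),
                "recognition_method": "vlm",
            }
        except Exception:
            pass  # both paths failed; return the empty default below
    return {"text": "", "confidence": 0.0}


def failing_ocr(_crop: Any) -> Dict[str, Any]:
    raise RuntimeError("seal OCR unavailable")


def stub_vlm(_crop: Any) -> Dict[str, Any]:
    return {"text": "Example Co., Ltd.", "confidence": 0.8}


print(recognize_seal(object(), failing_ocr, stub_vlm)["recognition_method"])
```

Recording `recognition_method` in the result, as the commit does, makes it possible to tell afterwards which path produced a given seal text.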

+ 218 - 3
ocr_tools/universal_doc_parser/core/layout_model_router.py

@@ -57,6 +57,9 @@ class SmartLayoutRouter(BaseLayoutDetector):
         self.page_name = None  # set in the detect method
         # Score-gap threshold: when the score gap between models is below this value, prefer docling
         self.score_diff_threshold = config.get('score_diff_threshold', 0.05)
+        # seal supplementary detection config
+        self.seal_supplement_config = config.get('seal_supplement', {})
+        self.seal_detector = None  # PP-DocLayoutV3, used for supplementary seal detection
         
     def initialize(self):
         """Initialize all models"""
@@ -107,6 +110,28 @@ class SmartLayoutRouter(BaseLayoutDetector):
         
         if not self.models:
             raise RuntimeError("No layout models available")
+        
+        # Initialize the seal supplementary detector (PP-DocLayoutV3)
+        if self.seal_supplement_config.get('enabled', False):
+            try:
+                seal_model_config = self.seal_supplement_config.get('model_config', {})
+                if not seal_model_config:
+                    # Try to find PP-DocLayoutV3 in model_configs
+                    for model_name, model_config in self.model_configs.items():
+                        if model_config.get('model_name') == 'PP-DocLayoutV3':
+                            seal_model_config = model_config
+                            break
+                if seal_model_config:
+                    logger.info(f"🔧 Initializing seal supplement detector: PP-DocLayoutV3")
+                    self.seal_detector = ModelFactory.create_layout_detector(
+                        _merge_child_model_config(seal_model_config)
+                    )
+                    logger.info(f"✅ Seal supplement detector initialized")
+                else:
+                    logger.warning(f"⚠️ Seal supplement enabled but no PP-DocLayoutV3 model config found")
+            except Exception as e:
+                logger.warning(f"⚠️ Failed to initialize seal supplement detector: {e}")
+                self.seal_detector = None
     
     def cleanup(self):
         """Clean up all model resources"""
@@ -116,6 +141,12 @@ class SmartLayoutRouter(BaseLayoutDetector):
             except Exception as e:
                 logger.warning(f"⚠️ Failed to cleanup {model_name}: {e}")
         self.models.clear()
+        if self.seal_detector is not None:
+            try:
+                self.seal_detector.cleanup()
+            except Exception as e:
+                logger.warning(f"⚠️ Failed to cleanup seal detector: {e}")
+            self.seal_detector = None
     
     def set_ocr_recognizer(self, ocr_recognizer):
         """Set the OCR recognizer (used by the ocr_eval strategy)"""
@@ -180,14 +211,31 @@ class SmartLayoutRouter(BaseLayoutDetector):
         if page_name is not None:
             self.page_name = page_name
         
+        results = []
         if self.strategy == 'ocr_eval':
-            return self._ocr_eval_detect(image, ocr_spans)
+            results = self._ocr_eval_detect(image, ocr_spans)
         elif self.strategy == 'auto':
-            return self._auto_select_detect(image)
+            results = self._auto_select_detect(image)
         elif self.strategy == 'scene':
-            return self._scene_select_detect(image)
+            results = self._scene_select_detect(image)
         else:
             raise ValueError(f"Unknown strategy: {self.strategy}")
+        
+        # Supplement the results with seal detections (if enabled)
+        seal_supplement_applied = (
+            self.seal_supplement_config.get('enabled', False)
+            and self.seal_detector is not None
+        )
+        if seal_supplement_applied:
+            primary_results = results
+            results = self._supplement_seal_detections(image, results)
+            # The sub-model's detect() already wrote layout_post before the supplement (primary model only, no seal)
+            if self._is_layout_debug_enabled():
+                self._save_router_layout_debug(image, primary_results, suffix='post_primary')
+            # Overwrite layout_post so it matches the layout the pipeline actually uses (including supplemented seals)
+            self._save_router_layout_debug(image, results, suffix='post')
+
+        return results
 
     def _scene_select_detect(
         self,
@@ -376,6 +424,173 @@ class SmartLayoutRouter(BaseLayoutDetector):
             results = first_model.detect(image)
         
         return results
+
+    def _save_router_layout_debug(
+        self,
+        image: Union[np.ndarray, Image.Image],
+        layout_results: List[Dict[str, Any]],
+        suffix: str,
+    ) -> None:
+        """Write layout debug output at the SmartLayoutRouter level (final result, including supplemented seals)."""
+        if not self._is_layout_debug_enabled() or not layout_results:
+            return
+        output_dir, page_name = self._resolve_layout_debug_paths()
+        dbg_opts = self._layout_debug_options()
+        if output_dir and dbg_opts.get('save_post_processed', True):
+            self._visualize_layout_results(
+                image, layout_results, output_dir, page_name, suffix=suffix
+            )
+
+    def _run_seal_detector(self, image: Union[np.ndarray, Image.Image]) -> List[Dict[str, Any]]:
+        """Run the seal supplementary detector without writing the sub-model's layout debug output."""
+        seal_det = self.seal_detector
+        if seal_det is None:
+            return []
+
+        prev_debug_mode = getattr(seal_det, 'debug_mode', None)
+        prev_debug_opts: Optional[Dict[str, Any]] = None
+        if hasattr(seal_det, 'config') and isinstance(seal_det.config, dict):
+            opts = seal_det.config.get('debug_options')
+            if isinstance(opts, dict):
+                prev_debug_opts = opts.copy()
+                opts['enabled'] = False
+        seal_det.debug_mode = False  # type: ignore[attr-defined]
+
+        try:
+            if hasattr(seal_det, '_detect_raw'):
+                raw = seal_det._detect_raw(image)
+                pp_config = (
+                    seal_det.config.get('post_process', {})
+                    if hasattr(seal_det, 'config')
+                    else {}
+                )
+                return seal_det.post_process(raw, image, pp_config)
+            return seal_det.detect(image)
+        finally:
+            seal_det.debug_mode = prev_debug_mode  # type: ignore[attr-defined]
+            if prev_debug_opts is not None and hasattr(seal_det, 'config'):
+                seal_det.config['debug_options'] = prev_debug_opts
+    
+    # The primary model often mislabels seals as image; a supplementary seal with high-IoU overlap should replace that box rather than be dropped
+    _SEAL_REPLACEABLE_CATEGORIES = frozenset({
+        'image_body', 'image', 'figure', 'abandon', 'discarded',
+    })
+
+    @staticmethod
+    def _bbox_iou(box_a: List[float], box_b: List[float]) -> float:
+        xa = max(box_a[0], box_b[0])
+        ya = max(box_a[1], box_b[1])
+        xb = min(box_a[2], box_b[2])
+        yb = min(box_a[3], box_b[3])
+        inter = max(0, xb - xa) * max(0, yb - ya)
+        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
+        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
+        union = area_a + area_b - inter
+        return inter / union if union > 0 else 0.0
+
+    def _supplement_seal_detections(
+        self,
+        image: Union[np.ndarray, Image.Image],
+        existing_results: List[Dict[str, Any]]
+    ) -> List[Dict[str, Any]]:
+        """
+        Use PP-DocLayoutV3 for supplementary seal detection and merge the seal results into the primary model's output
+        
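+        Strategy:
+        1. Run PP-DocLayoutV3 and keep only category == 'seal'
+        2. replace_existing=true: drop the seals already in the primary result and use the supplementary model's seals instead
+        3. Default: if a seal's IoU with an image_body/image etc. box in the primary result is >= replace_iou_threshold,
+           **replace** that box with the seal (fixes the case where the primary model labels a seal as image and the supplementary seal is dropped by dedup)
+        4. Otherwise, IoU > duplicate_iou_threshold counts as a duplicate and is skipped; anything else is appended as a new seal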
+        
+        Args:
+            image: input image
+            existing_results: layout detection results from the primary model
+            
+        Returns:
+            result list with the seal detections merged in
+        """
+        try:
+            seal_results = self._run_seal_detector(image)
+            seal_only_items = [item for item in seal_results if item.get('category') == 'seal']
+            
+            if not seal_only_items:
+                logger.info("🔖 Seal supplement: no seal detected by PP-DocLayoutV3")
+                return existing_results
+
+            if self.seal_supplement_config.get('replace_existing', False):
+                logger.info("🔖 Seal supplement: replacing existing seal detections with PP-DocLayoutV3 results")
+                result = [item for item in existing_results if item.get('category') != 'seal']
+                result.extend(seal_only_items)
+                return result
+
+            replace_image = self.seal_supplement_config.get('replace_overlapping_image', True)
+            replace_iou_threshold = float(
+                self.seal_supplement_config.get('replace_iou_threshold', 0.7)
+            )
+            duplicate_iou_threshold = float(
+                self.seal_supplement_config.get('duplicate_iou_threshold', 0.3)
+            )
+
+            merged = list(existing_results)
+            replaced_count = 0
+            added_count = 0
+            skipped_duplicate = 0
+
+            for seal_item in seal_only_items:
+                seal_bbox = seal_item.get('bbox', [])
+                if not seal_bbox or len(seal_bbox) < 4:
+                    continue
+                seal_bbox = seal_bbox[:4]
+
+                if replace_image:
+                    best_idx = -1
+                    best_iou = 0.0
+                    for idx, existing in enumerate(merged):
+                        if existing.get('category') not in self._SEAL_REPLACEABLE_CATEGORIES:
+                            continue
+                        existing_bbox = existing.get('bbox', [])
+                        if not existing_bbox or len(existing_bbox) < 4:
+                            continue
+                        overlap = self._bbox_iou(seal_bbox, existing_bbox[:4])
+                        if overlap >= replace_iou_threshold and overlap > best_iou:
+                            best_iou = overlap
+                            best_idx = idx
+                    if best_idx >= 0:
+                        old_cat = merged[best_idx].get('category', '')
+                        new_item = dict(seal_item)
+                        new_item['category'] = 'seal'
+                        merged[best_idx] = new_item
+                        replaced_count += 1
+                        logger.debug(
+                            f"🔖 Seal supplement: replaced {old_cat} with seal "
+                            f"(IoU={best_iou:.3f}, bbox={seal_bbox})"
+                        )
+                        continue
+
+                is_duplicate = False
+                for existing in merged:
+                    existing_bbox = existing.get('bbox', [])
+                    if existing_bbox and len(existing_bbox) >= 4:
+                        if self._bbox_iou(seal_bbox, existing_bbox[:4]) > duplicate_iou_threshold:
+                            is_duplicate = True
+                            break
+                if is_duplicate:
+                    skipped_duplicate += 1
+                else:
+                    merged.append(dict(seal_item))
+                    added_count += 1
+
+            logger.info(
+                f"🔖 Seal supplement: PP-DocLayoutV3 seal={len(seal_only_items)}, "
+                f"replaced={replaced_count}, added={added_count}, "
+                f"skipped_duplicate={skipped_duplicate}"
+            )
+            return merged
+            
+        except Exception as e:
+            logger.warning(f"⚠️ Seal supplement failed: {e}")
+            return existing_results
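
Reviewer sketch (not part of the commit): the merge rule in `_supplement_seal_detections` can be exercised on its own. The standalone snippet below is a simplified re-statement of the replace / skip / append logic, with the same default thresholds (0.7 and 0.3) and the same set of replaceable categories. `bbox_iou`, `merge_seals` and the sample boxes are illustrative names and data; the `replace_existing` and `replace_overlapping_image` switches and the logging are left out.

```python
from typing import Any, Dict, List, Sequence

REPLACEABLE = frozenset({"image_body", "image", "figure", "abandon", "discarded"})


def bbox_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two [x0, y0, x1, y1] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_seals(
    existing: List[Dict[str, Any]],
    seals: List[Dict[str, Any]],
    replace_iou: float = 0.7,
    duplicate_iou: float = 0.3,
) -> List[Dict[str, Any]]:
    merged = [dict(item) for item in existing]
    for seal in seals:
        box = seal["bbox"][:4]
        # 1) Replace the best-overlapping image-like box, if any.
        best_idx, best = -1, 0.0
        for idx, item in enumerate(merged):
            if item.get("category") not in REPLACEABLE:
                continue
            iou = bbox_iou(box, item["bbox"][:4])
            if iou >= replace_iou and iou > best:
                best_idx, best = idx, iou
        if best_idx >= 0:
            merged[best_idx] = {**seal, "category": "seal"}
            continue
        # 2) Skip seals that overlap any remaining box too much.
        if any(bbox_iou(box, item["bbox"][:4]) > duplicate_iou for item in merged):
            continue
        # 3) Otherwise the seal is new.
        merged.append(dict(seal))
    return merged


existing = [
    {"category": "image", "bbox": [0, 0, 100, 100]},
    {"category": "text", "bbox": [200, 0, 300, 50]},
]
seals = [
    {"category": "seal", "bbox": [0, 0, 100, 90]},       # IoU 0.9 with the image box
    {"category": "seal", "bbox": [200, 0, 300, 40]},     # IoU 0.8 with the text box
    {"category": "seal", "bbox": [400, 400, 450, 450]},  # overlaps nothing
]
merged = merge_seals(existing, seals)
print([item["category"] for item in merged])
```

In this example the first seal replaces the `image` box, the second is skipped as a duplicate of the `text` box (text is not a replaceable category), and the third is appended.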
     
     def _get_ocr_spans(self, image: Union[np.ndarray, Image.Image]) -> List[Dict[str, Any]]:
         """

+ 8 - 0
ocr_tools/universal_doc_parser/core/model_factory.py

@@ -115,6 +115,14 @@ class ModelFactory:
             raise ValueError(f"Unknown table classifier module: {module_name}")
     
     @classmethod
+    def create_seal_ocr_recognizer(cls, config: Dict[str, Any]):
+        """Create the seal OCR recognizer (based on MinerU PytorchPaddleOCR lang=seal)"""
+        from models.adapters import SealOCRRecognizer
+        recognizer = SealOCRRecognizer(config)
+        recognizer.initialize()
+        return recognizer
+    
+    @classmethod
     def cleanup_all(cls):
         """Clean up all model resources"""
         # In a real application, a list of active models could be maintained for cleanup

+ 17 - 3
ocr_tools/universal_doc_parser/core/pipeline_manager_v2.py

@@ -16,7 +16,6 @@
 - pdf_utils.py: PDF utilities
 - element_processors.py: element processors
 """
-import os
 import sys
 from typing import Dict, List, Any, Optional
 from pathlib import Path
@@ -82,7 +81,7 @@ class EnhancedDocPipeline:
     TABLE_TEXT_CATEGORIES = ['table_caption', 'table_footnote']
     
     # Image-related elements
-    IMAGE_BODY_CATEGORIES = ['image', 'image_body', 'figure']
+    IMAGE_BODY_CATEGORIES = ['image', 'image_body', 'figure', 'chart']
     IMAGE_TEXT_CATEGORIES = ['image_caption', 'image_footnote']
     
     # Formula elements
@@ -199,6 +198,18 @@ class EnhancedDocPipeline:
                 self.layout_detector.set_ocr_recognizer(self.ocr_recognizer)
                 logger.info("✅ OCR recognizer set for smart router")
 
+            # 4b. Seal OCR recognizer (optional, based on MinerU PytorchPaddleOCR lang=seal)
+            self.seal_ocr_recognizer = None
+            seal_recognition_config = self.config.get('seal_recognition', {})
+            if seal_recognition_config.get('enabled', False):
+                try:
+                    self.seal_ocr_recognizer = ModelFactory.create_seal_ocr_recognizer(
+                        seal_recognition_config
+                    )
+                    logger.info("✅ Seal OCR recognizer initialized")
+                except Exception as e:
+                    logger.warning(f"⚠️ Seal OCR recognizer init failed, will fallback to VLM: {e}")
+
             # 5. Table classifier (optional)
             self.table_classifier = None
             table_cls_config = self.config.get('table_classification', {})
@@ -250,6 +261,7 @@ class EnhancedDocPipeline:
             wired_table_recognizer=getattr(self, 'wired_table_recognizer', None),
             table_classifier=getattr(self, 'table_classifier', None),
             vl_recognizer_lazy_loader=self._ensure_vl_recognizer,  # 🎯 pass in the lazy-loading callback
+            seal_ocr_recognizer=getattr(self, 'seal_ocr_recognizer', None),  # 🆕 seal OCR recognizer
         )
     
     # ==================== Main processing flow ====================
@@ -1085,7 +1097,7 @@ class EnhancedDocPipeline:
                 logger.warning(f"⚠️ Equation processing failed: {e}")
                 processed_elements.append(ElementProcessors.create_error_element(item, str(e)))
         
-        # 🔧 Process seal elements - recognize with the VLM
+        # Process seal elements - prefer SealOCRRecognizer, fall back to the VLM
         for item in classified_elements['seal']:
             try:
                 element = self.element_processors.process_seal_element(
@@ -1146,6 +1158,8 @@ class EnhancedDocPipeline:
                 self.vl_recognizer.cleanup()
             if hasattr(self, 'ocr_recognizer'):
                 self.ocr_recognizer.cleanup()
+            if hasattr(self, 'seal_ocr_recognizer') and self.seal_ocr_recognizer is not None:
+                self.seal_ocr_recognizer.cleanup()
             logger.info("✅ Pipeline cleanup completed")
         except Exception as e:
             logger.warning(f"⚠️ Cleanup failed: {e}")

+ 2 - 0
ocr_tools/universal_doc_parser/models/adapters/__init__.py

@@ -40,6 +40,7 @@ try:
         MinerUOCRRecognizer
     )
     from .mineru_wired_table import MinerUWiredTableRecognizer
+    from .seal_ocr_adapter import SealOCRRecognizer
     MINERU_AVAILABLE = True
 except ImportError:
     MINERU_AVAILABLE = False
@@ -78,6 +79,7 @@ if MINERU_AVAILABLE:
         'MinerUVLRecognizer',
         'MinerUOCRRecognizer',
         'MinerUWiredTableRecognizer',
+        'SealOCRRecognizer',
     ])
 
 

+ 92 - 0
ocr_tools/universal_doc_parser/models/adapters/_mineru_vl_patches.py

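@@ -0,0 +1,92 @@
+"""Runtime patches for mineru_vl_utils.
+
+This module collects the runtime patches (monkey-patches) applied to the third-party
+library ``mineru_vl_utils``. They fix its compatibility problems with the PaddleOCR-VL model
+**without modifying the third-party source**, and keep the patches versioned with this
+repository, easy to switch on or off, and intact after the third-party library is upgraded.
+
+Patches currently included:
+
+1. ``_patch_convert_otsl_to_html``
+   Fixes OTSL output from PaddleOCR-VL in which the first cell of the table lacks its
+   leading ``<fcel>`` token, which misaligns the text in ``otsl_parse_texts`` and drops the text of every cell.
+
+All patches are applied through :func:`apply_once`, which is idempotent and only takes effect on the first call.
+"""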
+
+from __future__ import annotations
+
+from loguru import logger
+
+# OTSL structural tokens; keep in sync with the internal definition in mineru_vl_utils.post_process
+_OTSL_STRUCT_TOKENS = ("<nl>", "<fcel>", "<ecel>", "<lcel>", "<ucel>", "<xcel>")
+
+_applied = False
+
+
+def _make_otsl_normalizer(orig_convert):
+    """Build a wrapper that normalizes the OTSL before calling the original convert_otsl_to_html."""
+
+    def _normalize_then_convert(otsl_content):
+        if isinstance(otsl_content, str):
+            stripped = otsl_content.lstrip()
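+            # When the first cell of the table has no leading structural token (as with PaddleOCR-VL), prepend <fcel>;
+            # otherwise text_idx in otsl_parse_texts stays misaligned for good and every cell loses its text.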
+            if (
+                stripped
+                and not stripped.startswith("<table")
+                and not stripped.startswith(_OTSL_STRUCT_TOKENS)
+            ):
+                otsl_content = "<fcel>" + stripped
+        return orig_convert(otsl_content)
+
+    # Keep a reference to the original function for debugging and restoring
+    _normalize_then_convert.__wrapped__ = orig_convert
+    return _normalize_then_convert
+
+
+def _patch_convert_otsl_to_html():
+    """Replace convert_otsl_to_html in the post_process namespace.
+
+    mineru_vl_utils.post_process.__init__ imports the function via
+    ``from .otsl2html import convert_otsl_to_html``, and its internal
+    simple_process / _convert_pure_table_content_to_html look the name up at call time
+    as a post_process module global, so overriding that namespace intercepts every internal call.
+    """
+    import mineru_vl_utils.post_process as pp
+    from mineru_vl_utils.post_process import otsl2html
+
+    orig = getattr(otsl2html, "convert_otsl_to_html", None)
+    if orig is None:
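+        # Fail loudly when the upstream interface changes, so the patch cannot silently stop working and text loss return
+        raise RuntimeError(
+            "mineru_vl_utils interface changed: otsl2html.convert_otsl_to_html not found; "
+            "check the third-party library version and update the patch."
+        )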
+
+    wrapped = _make_otsl_normalizer(orig)
+    # Key step: internal post_process calls resolve the name in this namespace
+    pp.convert_otsl_to_html = wrapped
+    # Fallback: in case some code accesses otsl2html.convert_otsl_to_html directly
+    otsl2html.convert_otsl_to_html = wrapped
+
+
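+def apply_once() -> bool:
+    """Apply all mineru_vl_utils runtime patches; idempotent.
+
+    Call this once before any ``content_extract`` / ``batch_content_extract`` call
+    (for example in a VL recognizer/detector's ``__init__`` or ``initialize()``, before the model is fetched).
+
+    Returns:
+        bool: whether this call actually applied the patches (True on the first call, False afterwards).
+    """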
+    global _applied
+    if _applied:
+        return False
+    try:
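+        _patch_convert_otsl_to_html()
+        _applied = True
+        logger.info("Applied mineru_vl_utils patch: leading <fcel> normalization for the first OTSL cell")
+        return True
+    except Exception as e:  # Log loudly, then re-raise; the caller decides whether to continue without the patch
+        logger.error(f"Failed to apply mineru_vl_utils patches: {e}")
+        raise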
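
Review note: the normalization rule in `_make_otsl_normalizer` can be exercised without `mineru_vl_utils` installed. The sketch below is a minimal stand-alone version of the same wrapper. `fake_convert` is a hypothetical stand-in for `convert_otsl_to_html` that simply echoes its input, and `normalize_otsl` / `wrap` are illustrative names that do not exist in the patch module.

```python
from typing import Callable

# Structural OTSL tokens, copied from the patch module above.
OTSL_STRUCT_TOKENS = ("<nl>", "<fcel>", "<ecel>", "<lcel>", "<ucel>", "<xcel>")


def normalize_otsl(otsl: str) -> str:
    """Prepend <fcel> when the content starts with bare cell text."""
    stripped = otsl.lstrip()
    if (
        stripped
        and not stripped.startswith("<table")
        and not stripped.startswith(OTSL_STRUCT_TOKENS)  # startswith accepts a tuple
    ):
        return "<fcel>" + stripped
    return otsl


def wrap(convert: Callable[[str], str]) -> Callable[[str], str]:
    """Return a function that normalizes its input before calling convert."""

    def wrapper(otsl):
        if isinstance(otsl, str):
            otsl = normalize_otsl(otsl)
        return convert(otsl)

    wrapper.__wrapped__ = convert  # keep the original reachable
    return wrapper


def fake_convert(otsl: str) -> str:
    # Hypothetical stand-in for convert_otsl_to_html: returns the input unchanged.
    return otsl


patched = wrap(fake_convert)
```

Input that already starts with a structural token or with `<table` passes through untouched, so applying the wrapper to well-formed OTSL should not change its behaviour.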

+ 6 - 0
ocr_tools/universal_doc_parser/models/adapters/base.py

@@ -620,6 +620,12 @@ class BaseLayoutDetector(BaseAdapter):
                 bbox1, bbox2 = results[i].get('bbox', []), results[j].get('bbox', [])
                 if len(bbox1) < 4 or len(bbox2) < 4:
                     continue
+
+                cat_i = results[i].get('category', '')
+                cat_j = results[j].get('category', '')
+                # Seals often sit on top of tables/text, so overlapping a large region is normal; keep both
+                if cat_i == 'seal' or cat_j == 'seal':
+                    continue
                 
                 # Compute the overlap metrics
                 iou = coordinate_utils.calculate_iou(bbox1, bbox2)
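
Review note: the effect of the seal guard can be illustrated with a stand-alone de-duplication loop. This is only a sketch under stated assumptions: the hunk does not show the rest of the method, so the IoU threshold, the `confidence` key, and the "keep the higher score" rule below are hypothetical, as are the names `iou` and `dedupe_layout`.

```python
from typing import Any, Dict, List, Sequence


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def dedupe_layout(results: List[Dict[str, Any]], iou_threshold: float = 0.8) -> List[Dict[str, Any]]:
    """Drop the lower-scoring box of each heavily overlapping pair, never touching seals."""
    keep = [True] * len(results)
    for i in range(len(results)):
        if not keep[i]:
            continue
        for j in range(i + 1, len(results)):
            if not keep[j]:
                continue
            b1, b2 = results[i].get("bbox", []), results[j].get("bbox", [])
            if len(b1) < 4 or len(b2) < 4:
                continue
            # Same guard as in the diff: a pair involving a seal is never de-duplicated.
            if results[i].get("category") == "seal" or results[j].get("category") == "seal":
                continue
            if iou(b1, b2) >= iou_threshold:
                # Assumed tie-break: keep the box with the higher confidence.
                if results[i].get("confidence", 0.0) >= results[j].get("confidence", 0.0):
                    keep[j] = False
                else:
                    keep[i] = False
                    break
    return [r for r, k in zip(results, keep) if k]


boxes = [
    {"category": "table", "bbox": [0, 0, 100, 100], "confidence": 0.9},
    {"category": "seal", "bbox": [0, 0, 100, 100], "confidence": 0.5},
    {"category": "text", "bbox": [0, 0, 100, 100], "confidence": 0.4},
]
kept = dedupe_layout(boxes)
```

In this example the seal survives although it fully overlaps the table, while the duplicate text box is still removed.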

+ 12 - 1
ocr_tools/universal_doc_parser/models/adapters/mineru_adapter.py

@@ -370,7 +370,18 @@ class MinerUVLRecognizer(BaseVLRecognizer):
         # 🔧 Image size limit configuration
         self.max_image_size = config.get('max_image_size', 1568)  # maximum size accepted by the VLM
         self.resize_mode = config.get('resize_mode', 'max')  # 'max' or 'fixed'
-        
+
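+        # Apply the mineru_vl_utils runtime patches (fixes table text loss caused by the missing leading <fcel> in PaddleOCR-VL OTSL output).
+        # Doing this in __init__ covers both the mineru and paddle paths:
+        # PaddleVLRecognizer overrides initialize(), but its __init__ reaches this point through super().__init__.
+        # The patch only replaces functions inside mineru_vl_utils.post_process, needs no loaded model, and is idempotent.
+        try:
+            from ._mineru_vl_patches import apply_once as _apply_mineru_vl_patches
+            _apply_mineru_vl_patches()
+        except Exception as e:
+            # A patch failure must not block recognizer creation: fall back to the default behaviour, but warn clearly
+            logger.warning(f"Failed to apply mineru_vl_utils patches (falling back to default behaviour; tables may lose text): {e}")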
+
     def initialize(self):
         """Initialize the VL model."""
         try:

+ 175 - 0
ocr_tools/universal_doc_parser/models/adapters/seal_ocr_adapter.py

@@ -0,0 +1,175 @@
+"""Seal OCR adapter that wraps MinerU's PytorchPaddleOCR(lang="seal")."""
+
+from typing import Dict, Any, List, Union
+import numpy as np
+import cv2
+from PIL import Image
+from loguru import logger
+
+from .base import BaseOCRRecognizer
+
+try:
+    from mineru.backend.pipeline.model_init import AtomModelSingleton
+    from mineru.backend.pipeline.model_list import AtomicModel
+    MINERU_AVAILABLE = True
+except ImportError as e:
+    logger.warning(f"MinerU components not available for seal OCR: {e}")
+    MINERU_AVAILABLE = False
+
+
+class SealOCRRecognizer(BaseOCRRecognizer):
+    """Seal OCR adapter that reuses MinerU's seal-specific OCR models.
+
+    Uses PytorchPaddleOCR(lang="seal"), which is tuned for seal text:
+    - Detection model: seal_PP-OCRv4_det_server_infer.pth
+    - Recognition model: ch_PP-OCRv4_rec_server_infer.pth
+    - Polygon bounding boxes with low thresholds (db_thresh=0.2, box_thresh=0.6)
+    - Detection boxes are not merged (enable_merge_det_boxes=False)
+    - drop_score=0 so that low-confidence results are kept
+    """
+
+    def __init__(self, config: Dict[str, Any]):
+        super().__init__(config)
+        if not MINERU_AVAILABLE:
+            raise ImportError("MinerU components not available")
+        self.atom_model_manager = AtomModelSingleton()
+        self.seal_model = None
+
+    def initialize(self):
+        """Initialize the seal OCR model."""
+        try:
+            self.seal_model = self.atom_model_manager.get_atom_model(
+                atom_model_name=AtomicModel.OCR,
+                lang="seal",
+            )
+            logger.info("SealOCRRecognizer initialized with lang=seal")
+        except Exception as e:
+            logger.error(f"Failed to initialize SealOCRRecognizer: {e}")
+            raise
+
+    def cleanup(self):
+        """Release resources."""
+        self.seal_model = None
+
+    def recognize(self, image: Union[np.ndarray, Image.Image]) -> Dict[str, Any]:
+        """Recognize the text in a seal image.
+
+        Mirrors the seal OCR logic in MinerU's batch_analyze.py:
+        1. Convert the RGB image to BGR
+        2. Call seal_ocr_model.ocr(bgr_img, det=True, rec=True)
+        3. Extract the list of recognized texts
+
+        Args:
+            image: cropped seal image (RGB numpy array or PIL Image)
+
+        Returns:
+            {
+                'text': str,              # merged text (joined with spaces)
+                'texts': List[str],       # text recognized in each text box
+                'confidence': float,      # average confidence
+                'details': List[Dict]     # per-box results (poly, text, confidence)
+            }
+        """
+        if self.seal_model is None:
+            raise RuntimeError("Seal OCR model not initialized")
+
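+        # Get an RGB array first; the conversion to BGR follows below
+        if isinstance(image, Image.Image):
+            # Force 3-channel RGB so grayscale (L) or palette (P) images do not make cvtColor fail
+            img_rgb = np.array(image.convert('RGB'))
+        else:
+            img_rgb = image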
+
+        if img_rgb.size == 0:
+            return {'text': '', 'texts': [], 'confidence': 0.0, 'details': []}
+
+        img_bgr = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2BGR)
+
+        try:
+            seal_ocr_res = self.seal_model.ocr(img_bgr, det=True, rec=True)
+            if not seal_ocr_res or not seal_ocr_res[0]:
+                return {'text': '', 'texts': [], 'confidence': 0.0, 'details': []}
+
+            seal_texts: List[str] = []
+            details: List[Dict[str, Any]] = []
+            confidences: List[float] = []
+
+            for seal_item in seal_ocr_res[0]:
+                if not seal_item or len(seal_item) != 2:
+                    continue
+                poly = seal_item[0]  # polygon coordinates
+                rec_result = seal_item[1]
+                if not rec_result or len(rec_result) < 1:
+                    continue
+                rec_text = rec_result[0]
+                rec_conf = rec_result[1] if len(rec_result) >= 2 else 0.0
+                if rec_text:
+                    seal_texts.append(rec_text)
+                    confidences.append(rec_conf)
+                    details.append({
+                        'poly': poly,
+                        'text': rec_text,
+                        'confidence': rec_conf,
+                    })
+
+            avg_confidence = sum(confidences) / len(confidences) if confidences else 0.0
+            combined_text = " ".join(seal_texts)
+
+            logger.debug(
+                f"Seal OCR: '{combined_text[:50]}...' (avg conf: {avg_confidence:.3f})"
+            )
+
+            return {
+                'text': combined_text,
+                'texts': seal_texts,
+                'confidence': avg_confidence,
+                'details': details,
+            }
+
+        except Exception as e:
+            logger.warning(f"Seal OCR recognition failed: {e}")
+            return {'text': '', 'texts': [], 'confidence': 0.0, 'details': []}
+
+    def recognize_text(self, image: Union[np.ndarray, Image.Image]) -> List[Dict[str, Any]]:
+        """Implement the BaseOCRRecognizer interface by converting the recognize() result to the standard OCR list format."""
+        result = self.recognize(image)
+        formatted: List[Dict[str, Any]] = []
+        for detail in result.get('details', []):
+            poly = detail.get('poly')
+            if not poly:
+                continue
+            formatted.append({
+                'poly': poly,
+                'text': detail.get('text', ''),
+                'confidence': detail.get('confidence', 0.0),
+            })
+        return formatted
+
+    def detect_text_boxes(self, image: Union[np.ndarray, Image.Image]) -> List[Dict[str, Any]]:
+        """Detect seal text boxes only (no text recognition)."""
+        if self.seal_model is None:
+            raise RuntimeError("Seal OCR model not initialized")
+
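+        if isinstance(image, Image.Image):
+            # Force 3-channel RGB so grayscale (L) or palette (P) images do not make cvtColor fail
+            img_rgb = np.array(image.convert('RGB'))
+        else:
+            img_rgb = image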
+
+        if img_rgb.size == 0:
+            return []
+
+        img_bgr = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2BGR)
+
+        try:
+            ocr_results = self.seal_model.ocr(img_bgr, det=True, rec=False)
+            formatted: List[Dict[str, Any]] = []
+            if ocr_results and ocr_results[0]:
+                for poly in ocr_results[0]:
+                    if poly and len(poly) >= 4:
+                        formatted.append({
+                            'poly': poly,
+                            'confidence': 1.0,
+                        })
+            return formatted
+        except Exception as e:
+            logger.warning(f"Seal text box detection failed: {e}")
+            return []
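
Review note: the result-parsing part of `recognize()` can be checked without loading any model. The sketch below repeats that parsing logic as a free function and feeds it a hand-written result in the nested shape the adapter expects (`[[ [poly, (text, confidence)], ... ]]`). The function name `parse_seal_ocr` and the sample data are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def parse_seal_ocr(ocr_res: Any) -> Dict[str, Any]:
    """Turn a raw [[ [poly, (text, conf)], ... ]] result into the adapter's dict."""
    empty: Dict[str, Any] = {"text": "", "texts": [], "confidence": 0.0, "details": []}
    if not ocr_res or not ocr_res[0]:
        return empty

    texts: List[str] = []
    confidences: List[float] = []
    details: List[Dict[str, Any]] = []
    for item in ocr_res[0]:
        if not item or len(item) != 2:
            continue  # malformed entry
        poly, rec = item
        if not rec:
            continue
        text = rec[0]
        conf = float(rec[1]) if len(rec) >= 2 else 0.0
        if text:  # boxes with empty text are dropped
            texts.append(text)
            confidences.append(conf)
            details.append({"poly": poly, "text": text, "confidence": conf})

    avg = sum(confidences) / len(confidences) if confidences else 0.0
    return {"text": " ".join(texts), "texts": texts, "confidence": avg, "details": details}


# Hand-written sample in the assumed output shape (not real model output).
sample = [[
    [[[0, 0], [10, 0], [10, 5], [0, 5]], ("ABC Bank", 1.0)],
    [[[0, 6], [10, 6], [10, 11], [0, 11]], ("Finance Seal", 0.5)],
    [[[0, 12], [10, 12], [10, 17], [0, 17]], ("", 0.99)],
]]
parsed = parse_seal_ocr(sample)
```

Because the texts are joined with a single space, callers that need the reading order of a circular seal should probably use `details` and the polygons rather than the merged `text`.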