SHA1
--- a/docs/ocr_tools/universal_doc_parser/水印去除技术文档.md
+++ b/docs/ocr_tools/universal_doc_parser/水印去除技术文档.md
@@ -2,12 +2,26 @@
 
				 
			
 
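				 ## Overview
				 
				-The watermark removal module (`ocr_utils/watermark_utils.py`) provides **two independent layers of watermark removal**, each tuned for different document types and scenarios:
				+Watermark removal lives in the `ocr_utils/watermark/` package; `ocr_utils/watermark_utils.py` remains as a backward-compatible entry point (re-export). The core orchestration class is **`WatermarkProcessor`**, which supports two presets defined in `presets.py`: **page-level (page)** and **cell-level (cell)**.
				 
				-| Layer | Target | Use case | Characteristics |
				-|------|---------|---------|------|
				-| **PDF level** | XObjects of text-based PDFs | Text-based PDFs such as bank statements | Keeps text searchable; lossless |
				-| **Image level** | Pixels of scans / rendered images | Scans, images | Pixel-level processing, used as pre-OCR preprocessing |
				+Beyond PDF-level and page-level preprocessing, scenarios such as bank statements can run watermark removal again on each cropped cell image during the **second-pass OCR of wired tables** (`text_filling.cell_preprocess`).
				+
				+| Layer | Target | Config location | Use case |
				+|------|---------|---------|---------|
				+| **PDF level** | XObjects of text-based PDFs | `input.txt_pdf_watermark_removal` | Text-based PDFs, before rendering |
				+| **Page-level image** | Full-page rendered image | `preprocessor.watermark_removal` | Before page-level OCR of scans (optional) |
				+| **Cell-level image** | Cell crop | `table_recognition_wired.second_pass_ocr.cell_preprocess.watermark` | Before second-pass OCR (cell-first recommended) |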
			
 
				+
			
 
				+**Implementation modules:**
				+
				+| Path | Responsibility |
				+|------|------|
				+| `ocr_utils/watermark/presets.py` | Page-level / cell-level presets, `merge_watermark_config` |
				+| `ocr_utils/watermark/removal.py` | `threshold` / `masked_adaptive` watermark removal |
				+| `ocr_utils/watermark/processor.py` | `WatermarkProcessor` facade |
				+| `ocr_utils/watermark/pdf.py` | XObject watermark removal for text-based PDFs |
				+| `models/adapters/wired_table/text_filling.py` | Cell-level preprocessing + second-pass OCR |
				+| `ocr_tools/cell_preprocess_lab/cell_sweep.py` | Per-cell parameter grid sweep (tuning) |
			
 
				 
			
 
				 ---
			
 
				 
			
@@ -35,13 +49,17 @@ graph TB
 
				     F[Image input] --> J[Stage 2: image-level watermark removal]
				     
				     J --> K{watermark_removal enabled?}
				-    K -->|Yes| L[Detect light diagonal watermark]
				+    K -->|Yes| L[WatermarkProcessor page]
				     K -->|No| N
				-    L --> M[Remove watermark by thresholding]
				+    L --> M[method: threshold / masked / masked_adaptive]
				     M --> N[Orientation correction]
				     
				     N --> O[Layout detection]
				-    O --> P[OCR recognition]
				+    O --> P[Table OCR]
				+    P --> Q{Second-pass OCR cell-level wm?}
				+    Q -->|Yes| R[WatermarkProcessor cell + upscale]
				+    Q -->|No| S
				+    R --> S[In-cell OCR]
			
 
				     
			
 
				     style C fill:#e1f5ff
			
 
				     style E fill:#e1f5ff
			
@@ -186,17 +204,81 @@ def scan_pdf_watermark_xobjs(pdf_bytes: bytes, sample_pages: int = 3) -> bool:
 
				 
			
 
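				 ### Applicable scenarios
				 
				-**Scans / images (`pdf_type='ocr'`)**: these cannot be handled through the PDF's internal structure; only pixel-level processing of the rendered image is possible.
				+- **Page level**: scans / images (`pdf_type='ocr'`), invoked in `MinerUPreprocessor` via `WatermarkProcessor(scope="page")`.
				+- **Cell level**: before the **second-pass OCR** of wired tables, invoked on `raw_crop` via `WatermarkProcessor(scope="cell")` (configured independently of the page level).
				 
				-### Principle
				+The currently recommended strategy for bank statements is **cell-first**: set page-level `watermark_removal.enabled: false` and focus tuning on cell-level `cell_preprocess.watermark`.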
			
 
				+
			
 
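				+### Grayscale value convention (required reading)
				+
				+**8-bit grayscale images** in OpenCV / PIL follow one convention:
				+
				+| Gray value | Appearance |
				+|--------|------|
				+| **0** | Black (dark strokes, ink) |
				+| **255** | White (background, paper) |
				+| Intermediate values (e.g. 100-220) | Gray (light watermarks, faint strokes, scan noise) |
				+
				+On a typical scanned bank statement:
				+
				+- **Chinese character strokes**: low gray values (dark, usually far below threshold)
				+- **Paper background / light diagonal watermark**: high gray values (light, usually above threshold)
				+
				+> **Note**: some UIs or Photoshop may display values as "0 = white", but this repository's code follows OpenCV: **0 = black, 255 = white**.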
			
 
				+
			
 
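				+### Watermark removal methods (`method`)
				+
				+| method | Description | What to write in YAML |
				+|--------|------|-----------|
				+| `threshold` | Global `gray > threshold → 255`; simple and fast | `method` + optional `threshold` |
				+| `masked` | Locate the watermark region with a mask, then process it | `method` only (detailed parameters come from the preset) |
				+| `masked_adaptive` | Mask + adaptive threshold inside the mask | `method` only (detailed parameters come from the preset) |
				+
				+Preset defaults (`presets.py`, used when the YAML does not override them):
				+
				+| scope | Default `threshold` | Default `contrast_enhancement` |
				+|-------|------------------|----------------------------|
				+| **page** | 175 | enabled |
				+| **cell** | 155 | disabled |
				+
				+`merge_watermark_config(scope, user_cfg)` merges the user's YAML with the presets in the table above; detailed parameters such as `mask` / `hough` / `adaptive` do not need to appear in the scenario YAML.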
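The preset-over-YAML merge described above can be pictured as a recursive dictionary overlay. The sketch below is illustrative only: `PRESETS` and `merge_preset` are hypothetical stand-ins, not the real contents of `presets.py` or the actual `merge_watermark_config`; only the default values (175 / 155 and the contrast defaults) are taken from the table above.

```python
from copy import deepcopy
from typing import Any, Dict

# Hypothetical preset table. Only these values come from the document;
# the real presets carry more keys (mask / hough / adaptive ...).
PRESETS: Dict[str, Dict[str, Any]] = {
    "page": {"threshold": 175, "contrast_enhancement": {"enabled": True}},
    "cell": {"threshold": 155, "contrast_enhancement": {"enabled": False}},
}


def merge_preset(scope: str, user_cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay user YAML values on the preset for `scope`; nested dicts merge key by key."""

    def _merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
        out = deepcopy(base)
        for key, value in override.items():
            if isinstance(value, dict) and isinstance(out.get(key), dict):
                out[key] = _merge(out[key], value)
            else:
                out[key] = deepcopy(value)
        return out

    return _merge(PRESETS[scope], user_cfg or {})


# The user only switches things on; threshold falls back to the cell preset.
cfg = merge_preset("cell", {"enabled": True, "contrast_enhancement": {"enabled": True}})
print(cfg["threshold"], cfg["contrast_enhancement"]["enabled"])
```

Under this assumption, a scenario YAML only needs the keys it wants to change, which is why the detailed parameters can stay out of it.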
			
 
				+
			
 
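				+### How the `threshold` method works
				 
				 Watermark characteristics of financial documents such as bank statements:
				 
				-- **Light color**: gray values usually between 160 and 220 (between body text and background)
				-- **Diagonal**: usually laid out at 45°
				-- **Sparse text**: watermark text covers a small share of the page
				+- **Light color**: gray values mostly 160-220 (between body text and white paper)
				+- **Diagonal**: commonly repeated text at 45°
				+- **Small share**: a sparse, light texture relative to the whole page / cell
				+
				+Core code (`removal.py`):
				+
				+```python
				+cleaned = gray.copy()
				+cleaned[gray > threshold] = 255   # pixels brighter than the threshold → white
				+```
				+
				+**Semantics**: pixels with **gray ≤ threshold** (dark text) are kept, and **brighter** pixels are painted white like paper, which weakens light watermarks.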
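To make the semantics concrete, here is a small self-contained sketch of the same masking expression applied to a synthetic row of pixels. The helper name `whiten_above` and the sample values are assumptions chosen for illustration; only the `gray > threshold → 255` rule comes from the snippet above.

```python
import numpy as np


def whiten_above(gray: np.ndarray, threshold: int) -> np.ndarray:
    """Return a copy of `gray` with every pixel brighter than `threshold` set to 255."""
    cleaned = gray.copy()
    cleaned[gray > threshold] = 255
    return cleaned


# Synthetic 8-bit row: dark stroke, faint stroke, watermark gray, near-white paper.
row = np.array([[30, 150, 190, 250]], dtype=np.uint8)
result = whiten_above(row, threshold=155)
print(result.tolist())  # the two values above 155 become 255, the others are untouched
```

Note that the input array is left unchanged; the function works on a copy, as the original snippet does.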
			
 
				+
			
 
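				+### What raising / lowering `threshold` actually does
				+
				+Rule: a pixel turns white only when `gray > threshold` → **threshold is the cutoff for "how bright counts as background"**.
				 
				-Based on these characteristics, **thresholding** is used: pixels with gray values above the threshold are set to white, and dark body text is kept.
				+| Action | Whitening strength | Pixels that get whitened | Effect on watermark | Effect on body text / OCR |
				+|------|----------|------------------|--------|----------------|
				+| **Lower** threshold (e.g. 175→155) | **Stronger** | More mid-gray pixels (e.g. 156-175) also turn white | Removed more cleanly | Faint strokes and edges washed out by the watermark may be eaten away; with a cleaner background, det sometimes finds a whole line more easily |
				+| **Raise** threshold (e.g. 155→175) | **Weaker** | Only brighter pixels turn white | Diagonal stripes and light-gray noise tend to remain | More strokes are kept; leftover interference can cause fragmented det boxes and short, wrong text with high scores |
				+
				+Rule of thumb:
				+
				+- **threshold ↓** → removes light tones more aggressively → whiter background, **faint text is easily damaged**
				+- **threshold ↑** → more conservative → **watermark tends to remain**, but dark text is safer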
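The trade-off can be checked on a handful of synthetic gray levels. This is a minimal sketch with made-up pixel values (dark text, a faint stroke, two watermark grays, paper); `kept_values` is a hypothetical helper that lists which levels survive the `gray > threshold → 255` rule.

```python
from typing import List

# Synthetic gray levels: dark text (40), faint stroke (160),
# watermark (170, 200), paper (245). Values are illustrative only.
GRAY_LEVELS: List[int] = [40, 160, 170, 200, 245]


def kept_values(levels: List[int], threshold: int) -> List[int]:
    """Gray levels that are NOT whitened, i.e. those with gray <= threshold."""
    return [g for g in levels if g <= threshold]


kept_at_175 = kept_values(GRAY_LEVELS, 175)  # conservative: watermark gray 170 survives
kept_at_155 = kept_values(GRAY_LEVELS, 155)  # aggressive: faint stroke 160 is lost too
print(kept_at_175, kept_at_155)
```

With these sample values, 175 leaves part of the watermark behind while 155 also wipes out the faint stroke, which is why the two directions have to be weighed against the OCR result rather than against the image alone.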
			
 
				+
			
 
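				+Tuning advice (a single cell can be swept with `cell_sweep.py` on the **original `*_raw.png` image**; do not run watermark removal a second time on an already preprocessed debug image):
				+
				+1. Sweep **155-175** first, checking whether the OCR text is complete and the det boxes are stable.
				+2. **Do not look at the rec score alone**: when threshold is too high, short but wrong text can get a high score (e.g. only "折取款").
				+3. Cell-level and page-level **thresholds may differ** (presets: page=175, cell=155); write each into YAML according to the sweep results.
				 
				 ### Watermark detection (`detect_watermark`)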
			
 
				 
			
@@ -232,79 +314,136 @@ def detect_watermark(image, midtone_low=100, midtone_high=220, ratio_threshold=0
 
				     return diagonal_count >= 2
			
 
				 ```
			
 
				 
			
 
				-### Watermark removal (`remove_watermark_from_image`)
				+### Watermark removal API
				+
				+**Recommended (same for page level and cell level):**
			
 
				 
			
 
				 ```python
			
 
				-def remove_watermark_from_image(image, threshold=160, morph_close_kernel=0):
			
 
				-    """
			
 
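				-    Remove light diagonal text watermarks from an image
				-    
				-    Principle:
				-    - Body text is dark black (gray < threshold)
				-    - Watermark is light gray (gray > threshold)
				-    - Pixels above the threshold are set to white (255)
				-    
				-    Args:
				-        threshold: gray threshold, recommended 140-180, default 160
				-        morph_close_kernel: kernel size for morphological closing; 0 skips it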
			
 
				-    """
			
 
				-    gray = to_grayscale(image)
			
 
				-    
			
 
				-    # Thresholding: keep dark body text
			
 
				-    cleaned = gray.copy()
			
 
				-    cleaned[gray > threshold] = 255
			
 
				-    
			
 
				-    # Optional: morphological closing to fill broken character strokes
			
 
				-    if morph_close_kernel > 0:
			
 
				-        kernel = np.ones((morph_close_kernel, morph_close_kernel), np.uint8)
			
 
				-        cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_CLOSE, kernel)
			
 
				-    
			
 
				-    return cleaned
			
 
				+from ocr_utils.watermark import WatermarkProcessor, merge_watermark_config
			
 
				+
			
 
				+processor = WatermarkProcessor.from_user_config(
			
 
				+    {"enabled": True, "method": "threshold", "threshold": 155},
			
 
				+    scope="cell",  # 或 "page"
			
 
				+)
			
 
				+cleaned_bgr, stages = processor.process(cell_bgr_image, force=True)
			
 
				+# stages may contain "wm", "contrast", etc., for use in the debug JSON
			
 
				+```
			
 
				+
			
 
				+**Low-level (compatible with legacy code):**
			
 
				+
			
 
				+```python
			
 
				+from ocr_utils.watermark import merge_watermark_config, remove_watermark_from_image_rgb
			
 
				+
			
 
				+out = remove_watermark_from_image_rgb(
			
 
				+    image,
			
 
				+    threshold=175,
			
 
				+    watermark_removal_cfg=merge_watermark_config("page", {"method": "threshold"}),
			
 
				+)
			
 
				 ```
			
 
				 
			
 
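				-### Parameters
				+### Parameters (`method: threshold`)
				 
				-| Parameter | Default | Description | Tuning advice |
				-|------|--------|------|---------|
				-| `threshold` | 160 | Gray threshold | 140-180; larger is more conservative (watermark may remain) |
				-| `morph_close_kernel` | 0 | Morphological kernel size | Set to 0 for non-binary images (closing is counterproductive) |
				+| Parameter | page preset | cell preset | Description |
				+|------|-----------|-----------|------|
				+| `threshold` | 175 | 155 | See "raising / lowering" above; **larger is more conservative**, smaller whitens more |
				+| `morph_close_kernel` | 0 | 0 | Closing kernel; **0 = off** (recommended; closing on non-binary images tends to add noise) |
				+| `detect_before_remove` | true | false | Page level can detect before removing; cell level usually processes directly with `force=True` |
				+| `contrast_enhancement` | on by default | off by default | `text_restore` after watermark removal; off by default at cell level, enable when needed |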
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## Stage 3: cell-level preprocessing for second-pass OCR
				+
				+### Flow
				+
				+```
				+Table image raw_crop
				+  → WatermarkProcessor(cell)     # wm
				+  → optional denoise / contrast  # YAML switches
				+  → upscale (light.upscale_min_side, e.g. 192)
				+  → det line splitting / whole-cell fallback OCR
				+```
				+
				+Debug output (`tablecell_ocr/`):
				+
				+| File | Meaning |
				+|------|------|
				+| `cellNNN_*_*.png` | Preprocessed image fed to OCR |
				+| `cellNNN_*_*_raw.png` | Original crop without watermark removal (for `cell_sweep` tuning) |
				+| `cellNNN_*_*.json` | Contains `preprocess_stages`, `debug_images`, `lines`/`whole`, etc. |
			
 
				+
			
 
				+### Parameter exploration tool
				+
				+```bash
				+cd ocr_tools/cell_preprocess_lab
				+
				+# Sweep a single cell (*_raw.png is preferred automatically)
				+python cell_sweep.py /path/to/cell219_empty_empty_raw.png \
				+  -o ./output/cell219_sweep -t "ATM存折取款"
				+
				+# Batch over a tablecell_ocr directory
				+python cell_sweep.py /path/to/tablecell_ocr/ -o ./sweep_out --quick
				+```
				+
				+The report `sweep_report.json` contains, for each combination, `text`, `score` (weighted recognition score), and `boxes[]` (per-box scores).
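The "weighted recognition score" can be read as a character-count-weighted mean of the per-box scores, which is the weighting used by `_aggregate_rec_score` in `cell_sweep.py`. The sketch below is a stand-alone illustration rather than the tool's code: the function name `weighted_rec_score` and the sample boxes are hypothetical, and the final division is an assumption; the `text` / `score` keys follow the report fields listed above.

```python
from typing import Any, Dict, List


def weighted_rec_score(boxes: List[Dict[str, Any]]) -> float:
    """Mean of per-box scores, weighted by the number of characters in each box."""
    total_len = sum(len(b.get("text") or "") for b in boxes)
    if total_len <= 0:
        return 0.0  # no recognized characters, so no meaningful score
    weighted = sum(
        len(b.get("text") or "") * float(b.get("score") or 0.0) for b in boxes
    )
    return weighted / total_len


# Hypothetical boxes: a 3-character box and a 4-character box.
sample_boxes = [
    {"text": "ATM", "score": 0.9},
    {"text": "存折取款", "score": 0.6},
]
sample_score = weighted_rec_score(sample_boxes)
print(round(sample_score, 4))
```

Weighting by length keeps one short, confidently wrong fragment from dominating the aggregate, although a lone short box still scores high on its own, so the text itself should be compared against the expected value.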
			
 
				 
			
 
				 ---
			
 
				 
			
 
				 ## Configuration
				 
				-### Full configuration example
				+### Full configuration example (`bank_statement_yusys_local.yaml`)
			
 
				 
			
 
				 ```yaml
			
 
				-# Input config - PDF-level watermark removal
				 input:
				-  dpi: 200
				   txt_pdf_watermark_removal:
				-    enabled: true        # whether to enable PDF-level watermark removal
				-    sample_pages: 3      # number of pages in the quick pre-scan
			
 
				+    enabled: true
			
 
				+    sample_pages: 3
			
 
				 
			
 
				-# Preprocessing config - image-level watermark removal
			
 
				 preprocessor:
			
 
				-  module: "mineru"
			
 
				-  orientation_classifier:
			
 
				-    enabled: true
			
 
				+  order: orient_first
			
 
				   watermark_removal:
			
 
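				-    enabled: true           # whether to enable image-level watermark removal
				-    threshold: 160          # gray threshold
				-    morph_close_kernel: 0   # morphological kernel size (0 recommended)
				+    enabled: false              # cell-first: page level can stay off
				+    detect_before_remove: true
				+    method: threshold
				+    threshold: 175              # page-level preset default is 175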
			
 
				+    contrast_enhancement:
			
 
				+      enabled: false
			
 
				+
			
 
				+table_recognition_wired:
			
 
				+  second_pass_ocr:
			
 
				+    suspicious_short_min_chars: 4
			
 
				+    cell_preprocess:
			
 
				+      watermark:
			
 
				+        enabled: true
			
 
				+        method: threshold
			
 
				+        threshold: 155          # best written explicitly; falls back to the cell preset (155) if omitted
			
 
				+      denoise:
			
 
				+        enabled: false
			
 
				+      contrast:
			
 
				+        enabled: false          # Pass1: optional text_restore
			
 
				+      light:
			
 
				+        upscale_min_side: 192
			
 
				+    enhance_retry:
			
 
				+      enabled: false            # Pass2 enhanced retry (sibling of cell_preprocess)
			
 
				 ```
			
 
				 
			
 
				 ### Configuration reference
				 
				-| Config path | Type | Default | Description |
				-|---------|------|--------|------|
				-| `input.txt_pdf_watermark_removal.enabled` | bool | `false` | PDF-level watermark removal switch |
				-| `input.txt_pdf_watermark_removal.sample_pages` | int | 3 | Number of pages to pre-scan |
				-| `preprocessor.watermark_removal.enabled` | bool | `false` | Image-level watermark removal switch |
				-| `preprocessor.watermark_removal.threshold` | int | 160 | Gray threshold |
				-| `preprocessor.watermark_removal.morph_close_kernel` | int | 0 | Morphological kernel size |
				+| Config path | Description |
				+|---------|------|
				+| `input.txt_pdf_watermark_removal.*` | PDF XObject watermark removal |
				+| `preprocessor.watermark_removal.*` | Page-level `WatermarkProcessor(scope=page)` |
				+| `preprocessor.watermark_removal.method` | `threshold` \| `masked` \| `masked_adaptive` |
				+| `preprocessor.watermark_removal.threshold` | `threshold` method only; see "raising / lowering" |
				+| `second_pass_ocr.cell_preprocess.watermark.*` | Cell-level `WatermarkProcessor(scope=cell)` |
				+| `second_pass_ocr.cell_preprocess.light.upscale_min_side` | Upscale the shortest side after watermark removal |
				+| `second_pass_ocr.enhance_retry` | Pass2 preprocessing (sibling of `cell_preprocess`, not a child of it) |
				+
				+**Notes:**
				 
				-**Note**: neither setting has a default; `enabled: true` must be set explicitly in YAML to trigger it.
				+- `morph_close_kernel` is already `0` in the preset, so it generally **does not need to be written in YAML**.
				+- Cell-level `threshold` **should be configured explicitly after a sweep**; do not assume it equals the page-level value.
				+- Processing runs only when `enabled: true`; the page-level and cell-level switches are independent.
			
 
				 
			
 
				 ---
			
 
				 
			
@@ -337,26 +476,29 @@ if pdf_type == 'ocr':  # Condition ①: scans only
 
				 
			
 
				 # mineru_adapter.py: MinerUPreprocessor.process()
			
 
				 
			
 
				-if config.get('watermark_removal', {}).get('enabled', False):  # Condition ②
			
 
				-    image = remove_watermark_from_image_rgb(image, threshold=160)
			
 
				+processor = WatermarkProcessor.from_user_config(wm_cfg, scope="page")
			
 
				+if processor.enabled:
			
 
				+    image, _ = processor.process(image)  # applies detect_before_remove + method internally
			
 
				 ```
			
 
				 
			
 
				 **Trigger conditions**:
				 1. PDF type is `ocr` (scan)
				 2. `preprocessor.watermark_removal.enabled: true`
				 
				+**Cell-level second-pass OCR** (`text_filling.py`): when the table body triggers second-pass OCR, `_preprocess_cell_for_ocr` → `WatermarkProcessor(scope="cell")` is called on `raw_crop`.
			
 
				+
			
 
				 ---
			
 
				 
			
 
				-## Comparison of the two stages
				+## Comparison of layers
				 
				-| Dimension | Stage 1 (PDF level) | Stage 2 (image level) |
				-|------|------------------|-----------------|
				-| **Target** | Text-based PDF | Scans / images |
				-| **Processing level** | PDF XObject | Image pixels |
				-| **Keeps text searchable** | ✅ Yes | ❌ No |
				-| **Lossless** | ✅ Yes | ❌ No (pixels modified) |
				-| **When** | Before rendering | After rendering, before detection |
				-| **Dependencies** | PyMuPDF (fitz) | OpenCV, NumPy |
				+| Dimension | PDF level | Page-level image | Cell-level image |
				+|------|----------|----------|----------|
				+| **Target** | Text-based PDF XObject | Full-page rendered image | Cell crop |
				+| **Config** | `input.txt_pdf_*` | `preprocessor.watermark_removal` | `second_pass_ocr.cell_preprocess.watermark` |
				+| **Default threshold preset** | — | 175 | 155 |
				+| **Keeps PDF text layer** | ✅ | — | — |
				+| **When** | Before rendering | Before Layout/OCR | Before in-cell second-pass OCR |
				+| **Dependencies** | PyMuPDF | OpenCV | OpenCV |
			
 
				 
			
 
				 ---
			
 
				 
			
@@ -367,9 +509,9 @@ if config.get('watermark_removal', {}).get('enabled', False):  # Condition ②
 
				 ```python
			
 
				 # pipeline_manager_v2.py
			
 
				 
			
 
				-from ocr_utils.watermark_utils import (
			
 
				+from ocr_utils.watermark import (
			
 
				     scan_pdf_watermark_xobjs,
			
 
				-    remove_txt_pdf_watermark
			
 
				+    remove_txt_pdf_watermark,
			
 
				 )
			
 
				 
			
 
				 class EnhancedDocPipeline:
			
@@ -400,23 +542,29 @@ class EnhancedDocPipeline:
 
				 ```python
			
 
				 # models/adapters/mineru_adapter.py
			
 
				 
			
 
				-from ocr_utils.watermark_utils import remove_watermark_from_image_rgb
			
 
				+from ocr_utils.watermark import WatermarkProcessor
			
 
				 
			
 
				 class MinerUPreprocessor:
			
 
				     def process(self, image):
			
 
				-        # Image-level watermark removal (before orientation correction)
			
 
				-        if self.config.get('watermark_removal', {}).get('enabled', False):
			
 
				-            threshold = self.config.get('watermark_removal', {}).get('threshold', 160)
			
 
				-            image = remove_watermark_from_image_rgb(image, threshold=threshold)
			
 
				-        
			
 
				-        # Orientation correction
			
 
				-        if self.orientation_classifier:
			
 
				-            angle = self.orientation_classifier.predict(image)
			
 
				-            image = self._apply_rotation(image, angle)
			
 
				-        
			
 
				+        wm_cfg = self.config.get("watermark_removal") or {}
			
 
				+        processor = WatermarkProcessor.from_user_config(wm_cfg, scope="page")
			
 
				+        if processor.enabled:
			
 
				+            image, _ = processor.process(image)
			
 
				+        # Orientation correction ...
			
 
				         return image, angle
			
 
				 ```
			
 
				 
			
 
				+### Cell-level second-pass OCR integration
			
 
				+
			
 
				+```python
			
 
				+# models/adapters/wired_table/text_filling.py
			
 
				+
			
 
				+self._cell_wm_processor = WatermarkProcessor.from_user_config(wm_user, scope="cell")
			
 
				+
			
 
				+cell_img, stages = self._preprocess_cell_for_ocr(raw_crop, mode="light")
			
 
				+# example stages: ["wm", "upscale"] or ["wm", "contrast", "upscale"]
			
 
				+```
			
 
				+
			
 
				 ---
			
 
				 
			
 
				 ## 使用示例
			
@@ -475,17 +623,21 @@ python main_v2.py -i doc.pdf -c config.yaml --scene bank_statement --debug
 
				 
			
 
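				 ## Notes
				 
				-1. **The two stages are complementary**: stage 1 handles text-based PDFs and stage 2 handles scans, so in practice they never both run
				-2. **Threshold choice**: `threshold=160` suits most bank statements; raise it slightly if light text is removed by mistake
				-3. **Morphology**: `morph_close_kernel=0` is the recommended value; closing on non-binary images can introduce noise
				-4. **Large-file optimization**: `sample_pages=3` does a quick pre-scan, avoiding full processing of large files that have no watermark
				-5. **Dependencies**: PDF-level watermark removal needs `PyMuPDF`; image-level needs `OpenCV`
				+1. **Three complementary layers**: PDF level, page level, and cell level can be switched independently; **cell-first** is recommended for bank statements (page-level wm off, cell-level wm on).
				+2. **Grayscale direction**: **0 = black, 255 = white**; `gray > threshold → 255` means pixels "brighter than the threshold" are painted white.
				+3. **Threshold direction**: **raising** it is more conservative (watermark tends to remain, text is damaged less); **lowering** it is more aggressive (cleaner background, faint strokes are easily eaten away). Tune page level and cell level separately.
				+4. **Do not confuse it with the det threshold**: `ocr_recognition.det_threshold` filters OCR detection boxes and is unrelated to the watermark-removal `threshold`.
				+5. **Tuning input**: run `cell_sweep.py` on `*_raw.png` (the original crop); do not sweep an already preprocessed `cell*_empty_empty.png` (that amounts to removing the watermark twice and distorts the conclusions).
				+6. **Morphology**: the preset has `morph_close_kernel=0`; enabling closing on non-binary images is not recommended.
				+7. **Dependencies**: PDF level needs `PyMuPDF`; image level needs `OpenCV`.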
			
 
				 
			
 
				 ---
			
 
				 
			
 
				 ## References
				 
				-- `ocr_utils/watermark_utils.py` - watermark utility function implementation
				-- `core/pipeline_manager_v2.py` - pipeline integration
				-- `models/adapters/mineru_adapter.py` - preprocessor integration
				-- `config/bank_statement_*.yaml` - configuration examples
				+- `ocr_utils/watermark/` - implementation package (presets / removal / processor / pdf)
				+- `ocr_utils/watermark_utils.py` - backward-compatible re-export
				+- `ocr_tools/cell_preprocess_lab/cell_sweep.py` - cell-level parameter sweep
				+- `models/adapters/mineru_adapter.py` - page-level preprocessing
				+- `models/adapters/wired_table/text_filling.py` - cell-level second-pass OCR
				+- `config/bank_statement_yusys_local.yaml` - scenario configuration example
			
--- a/ocr_tools/cell_preprocess_lab/cell121_sweep.py
+++ b/ocr_tools/cell_preprocess_lab/cell121_sweep.py
@@ -1,194 +0,0 @@
 
				-#!/usr/bin/env python3
			
 
				-"""cell121 参数扫描：去水印方式 / threshold / contrast / upscale / det 阈值 / 整格 rec。"""
			
 
				-from __future__ import annotations
			
 
				-
			
 
				-import json
			
 
				-import os
			
 
				-import sys
			
 
				-from itertools import product
			
 
				-from pathlib import Path
			
 
				-from typing import Any, Dict, List, Optional, Tuple
			
 
				-
			
 
				-import cv2
			
 
				-import numpy as np
			
 
				-
			
 
				-_repo_root = Path(__file__).resolve().parents[2]
			
 
				-if str(_repo_root) not in sys.path:
			
 
				-    sys.path.insert(0, str(_repo_root))
			
 
				-
			
 
				-from ocr_utils.watermark import WatermarkProcessor, merge_watermark_config
			
 
				-from ocr_utils.watermark.contrast import apply_contrast_enhancement_config
			
 
				-
			
 
				-CELL121 = Path(
			
 
				-    "/Users/zhch158/workspace/data/流水分析/彭_广东兴宁农村商业银行/"
			
 
				-    "bank_statement_yusys_local/debug/table_recognition_wired/tablecell_ocr/"
			
 
				-    "彭_广东兴宁农村商业银行_page_002_0/cell121_empty_empty.png"
			
 
				-)
			
 
				-OUT_DIR = Path(__file__).parent / "output/彭_广东兴宁农村商业银行/cell121_sweep"
			
 
				-MODEL_DIR = Path(
			
 
				-    "/Users/zhch158/models/modelscope_cache/models/OpenDataLab/"
			
 
				-    "PDF-Extract-Kit-1___0/models/OCR/paddleocr_torch"
			
 
				-)
			
 
				-
			
 
				-TARGET = "20240927"
			
 
				-
			
 
				-
			
 
				-def _upscale(img: np.ndarray, min_side: int) -> np.ndarray:
			
 
				-    h, w = img.shape[:2]
			
 
				-    if h >= min_side and w >= min_side:
			
 
				-        return img
			
 
				-    s = max(min_side / max(h, 1), min_side / max(w, 1), 1.0)
			
 
				-    return cv2.resize(img, None, fx=s, fy=s, interpolation=cv2.INTER_CUBIC)
			
 
				-
			
 
				-
			
 
				-def _preprocess(
			
 
				-    raw: np.ndarray,
			
 
				-    *,
			
 
				-    method: str,
			
 
				-    thresh: Optional[int],
			
 
				-    contrast: bool,
			
 
				-    upscale: int,
			
 
				-) -> np.ndarray:
			
 
				-    user: Dict[str, Any] = {"enabled": True, "method": method}
			
 
				-    if method == "threshold" and thresh is not None:
			
 
				-        user["threshold"] = thresh
			
 
				-    cfg = merge_watermark_config("cell", user)
			
 
				-    img, _ = WatermarkProcessor(cfg, scope="cell").process(raw, force=True)
			
 
				-    if contrast:
			
 
				-        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
			
 
				-        ce = dict(cfg.get("contrast_enhancement") or {})
			
 
				-        ce["enabled"] = True
			
 
				-        ce["text_black_target"] = 88
			
 
				-        gray = apply_contrast_enhancement_config(gray, ce)
			
 
				-        img = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
			
 
				-    return _upscale(img, upscale)
			
 
				-
			
 
				-
			
 
				-def _ocr(engine: Any, img: np.ndarray, *, det: bool, rec: bool) -> Dict[str, Any]:
			
 
				-    try:
			
 
				-        res = engine.ocr(img, det=det, rec=rec)
			
 
				-        texts: List[str] = []
			
 
				-        if res and res[0]:
			
 
				-            if det:
			
 
				-                for item in res[0]:
			
 
				-                    if item and len(item) >= 2 and item[1]:
			
 
				-                        texts.append(str(item[1][0] or ""))
			
 
				-            else:
			
 
				-                for item in res[0]:
			
 
				-                    if isinstance(item, (list, tuple)) and len(item) >= 1:
			
 
				-                        texts.append(str(item[0] or ""))
			
 
				-        text = "".join(texts).strip()
			
 
				-        return {
			
 
				-            "text": text,
			
 
				-            "det": det,
			
 
				-            "rec": rec,
			
 
				-            "n_boxes": len(res[0]) if res and res[0] else 0,
			
 
				-        }
			
 
				-    except Exception as e:
			
 
				-        return {"text": "", "error": str(e), "det": det, "rec": rec}
			
 
				-
			
 
				-
			
 
				-def _make_engine(det_thresh: float) -> Any:
			
 
				-    from ocr_tools.pytorch_models.pytorch_paddle import PytorchPaddleOCR
			
 
				-
			
 
				-    return PytorchPaddleOCR(
			
 
				-        lang="ch",
			
 
				-        det_model_path=str(MODEL_DIR / "ch_PP-OCRv5_det_infer.pth"),
			
 
				-        rec_model_path=str(MODEL_DIR / "ch_PP-OCRv4_rec_server_doc_infer.pth"),
			
 
				-        det_db_box_thresh=det_thresh,
			
 
				-    )
			
 
				-
			
 
				-
			
 
				-def main() -> None:
			
 
				-    if not CELL121.is_file():
			
 
				-        raise FileNotFoundError(CELL121)
			
 
				-    raw = cv2.imread(str(CELL121))
			
 
				-    OUT_DIR.mkdir(parents=True, exist_ok=True)
			
 
				-
			
 
				-    methods = ["threshold", "masked_adaptive"]
			
 
				-    thresholds = [155, 165, 170, 175, 180, None]
			
 
				-    contrasts = [False, True]
			
 
				-    upscales = [64, 96, 128, 192]
			
 
				-    det_threshs = [0.2, 0.3, 0.4, 0.5]
			
 
				-    ocr_modes = [("det_rec", True, True), ("whole_rec", False, True)]
			
 
				-
			
 
				-    results: List[Dict[str, Any]] = []
			
 
				-    hits: List[Dict[str, Any]] = []
			
 
				-    engines: Dict[float, Any] = {}
			
 
				-
			
 
				-    total = 0
			
 
				-    for method, thresh, contrast, upscale, det_th in product(
			
 
				-        methods, thresholds, contrasts, upscales, det_threshs
			
 
				-    ):
			
 
				-        if method != "threshold" and thresh is not None:
			
 
				-            continue
			
 
				-        if det_th not in engines:
			
 
				-            print(f"加载 OCR det_db_box_thresh={det_th} ...")
			
 
				-            engines[det_th] = _make_engine(det_th)
			
 
				-
			
 
				-        img = _preprocess(
			
 
				-            raw, method=method, thresh=thresh, contrast=contrast, upscale=upscale
			
 
				-        )
			
 
				-        tag = (
			
 
				-            f"{method}_t{thresh or 'd'}_c{int(contrast)}_u{upscale}_det{det_th}"
			
 
				-        )
			
 
				-        cv2.imwrite(str(OUT_DIR / f"{tag}.png"), img)
			
 
				-
			
 
				-        for mode_name, det, rec in ocr_modes:
			
 
				-            total += 1
			
 
				-            ocr = _ocr(engines[det_th], img, det=det, rec=rec)
			
 
				-            row = {
			
 
				-                "tag": tag,
			
 
				-                "method": method,
			
 
				-                "threshold": thresh,
			
 
				-                "contrast": contrast,
			
 
				-                "upscale": upscale,
			
 
				-                "det_db_box_thresh": det_th,
			
 
				-                "ocr_mode": mode_name,
			
 
				-                **ocr,
			
 
				-            }
			
 
				-            results.append(row)
			
 
				-            t = row.get("text", "")
			
 
				-            if TARGET in t or (len(t) >= 6 and t.isdigit()):
			
 
				-                row["match"] = "full" if TARGET in t else "partial"
			
 
				-                hits.append(row)
			
 
				-                print(f"HIT [{row['match']}] {mode_name} {tag} -> {t!r}")
			
 
				-
			
 
				-    # Baseline run on the original image
			
 
				-    for det_th in [0.3, 0.5]:
			
 
				-        if det_th not in engines:
			
 
				-            engines[det_th] = _make_engine(det_th)
			
 
				-        for mode_name, det, rec in ocr_modes:
			
 
				-            ocr = _ocr(engines[det_th], _upscale(raw, 128), det=det, rec=rec)
			
 
				-            row = {
			
 
				-                "tag": "raw_upscale128",
			
 
				-                "det_db_box_thresh": det_th,
			
 
				-                "ocr_mode": mode_name,
			
 
				-                **ocr,
			
 
				-            }
			
 
				-            results.append(row)
			
 
				-            if TARGET in (row.get("text") or ""):
			
 
				-                hits.append(row)
			
 
				-
			
 
				-    report = {
			
 
				-        "input": str(CELL121),
			
 
				-        "target": TARGET,
			
 
				-        "total_trials": total,
			
 
				-        "hits": hits,
			
 
				-        "all_results": results,
			
 
				-    }
			
 
				-    out_json = OUT_DIR / "cell121_sweep_report.json"
			
 
				-    out_json.write_text(json.dumps(report, ensure_ascii=False, indent=2), encoding="utf-8")
			
 
				-
			
 
				-    print(f"\n完成 {total} 次 OCR 试验，命中 {len(hits)} 条")
			
 
				-    print(f"报告: {out_json}")
			
 
				-    if hits:
			
 
				-        print("\n最佳命中:")
			
 
				-        for h in hits[:10]:
			
 
				-            print(f"  {h.get('ocr_mode')} {h.get('tag')}: {h.get('text')!r}")
			
 
				-    else:
			
 
				-        print("未出现完整 20240927，请查看 cell121_sweep/*.png 与 report 中 partial 结果")
			
 
				-
			
 
				-
			
 
				-if __name__ == "__main__":
			
 
				-    main()
			
--- a/ocr_tools/cell_preprocess_lab/cell_preprocess_lab.py
+++ b/ocr_tools/cell_preprocess_lab/cell_preprocess_lab.py
@@ -8,6 +8,9 @@
 
				     python cell_preprocess_lab.py cell219.png -o /tmp/cell_lab
			
 
				     python cell_preprocess_lab.py /path/to/tablecell_ocr/ -o /tmp/batch --compare-methods
			
 
				     python cell_preprocess_lab.py cell217.png -o /tmp/out --denoise --contrast
			
 
				+
			
 
				+For parameter grid sweeps, see cell_sweep.py:
			
 
				+    python cell_sweep.py cell219_empty_empty_raw.png -o ./out -t "ATM存折取款"
			
 
				 """
			
 
				 from __future__ import annotations
			
 
				 
			
--- a/ocr_tools/cell_preprocess_lab/cell_sweep.py
+++ b/ocr_tools/cell_preprocess_lab/cell_sweep.py
@@ -0,0 +1,554 @@
 
				+#!/usr/bin/env python3
			
 
				+"""
			
 
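				+Parameter sweep for cropped cell image preprocessing: watermark removal / threshold / contrast / upscale / det threshold / OCR mode.
				+
				+Starts from the **original image** (`*_raw.png`) by default, matching the pipeline's second-pass OCR, to avoid removing the watermark a second time on an already preprocessed debug image.
				+
				+Usage: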
			
 
				+    python cell_sweep.py cell219_empty_empty_raw.png -o ./out -t "ATM存折取款"
			
 
				+    python cell_sweep.py /path/to/tablecell_ocr/ -o ./out
			
 
				+    python cell_sweep.py cell.png --quick --no-save-images
			
 
				+    OCR_DET_MODEL_PATH=... OCR_REC_MODEL_PATH=... python cell_sweep.py cell.png
			
 
				+"""
			
 
				+from __future__ import annotations
			
 
				+
			
 
				+import argparse
			
 
				+import json
			
 
				+import os
			
 
				+import sys
			
 
				+from itertools import product
			
 
				+from pathlib import Path
			
 
				+from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple
			
 
				+
			
 
				+import cv2
			
 
				+import numpy as np
			
 
				+
			
 
				+_repo_root = Path(__file__).resolve().parents[2]
			
 
				+if str(_repo_root) not in sys.path:
			
 
				+    sys.path.insert(0, str(_repo_root))
			
 
				+
			
 
				+from ocr_utils.watermark import WatermarkProcessor, merge_watermark_config
			
 
				+from ocr_utils.watermark.contrast import apply_contrast_enhancement_config
			
 
				+
			
 
				+_IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff", ".webp"}
			
 
				+_DEFAULT_MODEL_DIR = Path(
			
 
				+    "/Users/zhch158/models/modelscope_cache/models/OpenDataLab/"
			
 
				+    "PDF-Extract-Kit-1___0/models/OCR/paddleocr_torch"
			
 
				+)
			
 
				+
			
 
				+
			
 
				+def _parse_csv_ints(s: str) -> List[Optional[int]]:
			
 
				+    out: List[Optional[int]] = []
			
 
				+    for part in s.split(","):
			
 
				+        part = part.strip()
			
 
				+        if not part or part.lower() in ("none", "d", "default"):
			
 
				+            out.append(None)
			
 
				+        else:
			
 
				+            out.append(int(part))
			
 
				+    return out
			
 
				+
			
 
				+
			
 
				+def _parse_csv_floats(s: str) -> List[float]:
			
 
				+    return [float(x.strip()) for x in s.split(",") if x.strip()]
			
 
				+
			
 
				+
			
 
				+def _parse_csv_bools(s: str) -> List[bool]:
			
 
				+    out: List[bool] = []
			
 
				+    for part in s.split(","):
			
 
				+        p = part.strip().lower()
			
 
				+        if p in ("1", "true", "yes", "on"):
			
 
				+            out.append(True)
			
 
				+        elif p in ("0", "false", "no", "off"):
			
 
				+            out.append(False)
			
 
				+        else:
			
 
				+            raise ValueError(f"invalid bool value: {part!r}")
			
 
				+    return out
			
 
				+
			
 
				+
			
 
				+def _default_model_dir() -> Path:
			
 
				+    det = os.environ.get("OCR_DET_MODEL_PATH")
			
 
				+    if det:
			
 
				+        return Path(det).parent
			
 
				+    return _DEFAULT_MODEL_DIR
			
 
				+
			
 
				+
			
 
				+def _upscale(img: np.ndarray, min_side: int) -> np.ndarray:
			
 
				+    h, w = img.shape[:2]
			
 
				+    if h >= min_side and w >= min_side:
			
 
				+        return img
			
 
				+    s = max(min_side / max(h, 1), min_side / max(w, 1), 1.0)
			
 
				+    return cv2.resize(img, None, fx=s, fy=s, interpolation=cv2.INTER_CUBIC)
			
 
				+
			
 
				+
			
 
				+def _preprocess(
			
 
				+    raw: np.ndarray,
			
 
				+    *,
			
 
				+    method: str,
			
 
				+    thresh: Optional[int],
			
 
				+    contrast: bool,
			
 
				+    upscale: int,
			
 
				+    text_black_target: int,
			
 
				+) -> np.ndarray:
			
 
				+    user: Dict[str, Any] = {"enabled": True, "method": method}
			
 
				+    if method == "threshold" and thresh is not None:
			
 
				+        user["threshold"] = thresh
			
 
				+    cfg = merge_watermark_config("cell", user)
			
 
				+    img, _ = WatermarkProcessor(cfg, scope="cell").process(raw, force=True)
			
 
				+    if contrast:
			
 
				+        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
			
 
				+        ce = dict(cfg.get("contrast_enhancement") or {})
			
 
				+        ce["enabled"] = True
			
 
				+        ce["text_black_target"] = text_black_target
			
 
				+        gray = apply_contrast_enhancement_config(gray, ce)
			
 
				+        img = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
			
 
				+    return _upscale(img, upscale)
			
 
				+
			
 
				+
			
 
				+def _parse_rec_pair(rec_part: Any) -> Tuple[str, float]:
			
 
				+    """Parse the recognition result from an OCR (text, score) pair or nested structure."""
			
 
				+    if rec_part is None:
			
 
				+        return "", 0.0
			
 
				+    if isinstance(rec_part, (list, tuple)) and len(rec_part) >= 2:
			
 
				+        if isinstance(rec_part[0], (list, tuple, dict)):
			
 
				+            return "", 0.0
			
 
				+        txt = str(rec_part[0] or "").strip()
			
 
				+        try:
			
 
				+            sc = float(rec_part[1] or 0.0)
			
 
				+        except (TypeError, ValueError):
			
 
				+            sc = 0.0
			
 
				+        return txt, sc if txt else 0.0
			
 
				+    if isinstance(rec_part, (list, tuple)) and len(rec_part) == 1:
			
 
				+        txt = str(rec_part[0] or "").strip()
			
 
				+        return txt, 0.0
			
 
				+    return "", 0.0
			
 
				+
			
 
				+
			
 
				+def _aggregate_rec_score(boxes: List[Dict[str, Any]]) -> float:
			
 
				+    """Character-count-weighted mean recognition score (matches the pipeline's aggregate_line_ocr)."""
			
 
				+    total_len = sum(len(b.get("text") or "") for b in boxes)
			
 
				+    if total_len <= 0:
			
 
				+        return 0.0
			
 
				+    weighted = sum(
			
 
				+        len(b.get("text") or "") * float(b.get("score") or 0.0) for b in boxes
			
 
				+    )
			
 
				+    return weighted / total_len
			
 
				+
			
 
				+
			
 
				+def _ocr(engine: Any, img: np.ndarray, *, det: bool, rec: bool) -> Dict[str, Any]:
			
 
				+    empty: Dict[str, Any] = {
			
 
				+        "text": "",
			
 
				+        "score": 0.0,
			
 
				+        "boxes": [],
			
 
				+        "det": det,
			
 
				+        "rec": rec,
			
 
				+        "n_boxes": 0,
			
 
				+    }
			
 
				+    try:
			
 
				+        res = engine.ocr(img, det=det, rec=rec)
			
 
				+        items = res[0] if res and res[0] is not None else []
			
 
				+        boxes_out: List[Dict[str, Any]] = []
			
 
				+
			
 
				+        if det:
			
 
				+            for item in items:
			
 
				+                if not item or len(item) < 2:
			
 
				+                    continue
			
 
				+                text, score = _parse_rec_pair(item[1])
			
 
				+                bbox = item[0]
			
 
				+                if hasattr(bbox, "tolist"):
			
 
				+                    bbox = bbox.tolist()
			
 
				+                entry: Dict[str, Any] = {
			
 
				+                    "text": text,
			
 
				+                    "score": round(score, 6),
			
 
				+                }
			
 
				+                if bbox is not None:
			
 
				+                    entry["det_bbox"] = bbox
			
 
				+                boxes_out.append(entry)
			
 
				+        else:
			
 
				+            for item in items:
			
 
				+                text, score = _parse_rec_pair(item)
			
 
				+                if not text and isinstance(item, (list, tuple)) and len(item) >= 1:
			
 
				+                    text, score = _parse_rec_pair(item[0])
			
 
				+                boxes_out.append({"text": text, "score": round(score, 6)})
			
 
				+
			
 
				+        text = "".join(b["text"] for b in boxes_out if b.get("text")).strip()
			
 
				+        agg_score = _aggregate_rec_score(boxes_out)
			
 
				+        return {
			
 
				+            "text": text,
			
 
				+            "score": round(agg_score, 6),
			
 
				+            "boxes": boxes_out,
			
 
				+            "det": det,
			
 
				+            "rec": rec,
			
 
				+            "n_boxes": len(boxes_out),
			
 
				+        }
			
 
				+    except Exception as e:
			
 
				+        out = dict(empty)
			
 
				+        out["error"] = str(e)
			
 
				+        return out
			
 
				+
			
 
				+
			
 
				+def _make_engine(det_thresh: float, model_dir: Path) -> Any:
			
 
				+    from ocr_tools.pytorch_models.pytorch_paddle import PytorchPaddleOCR
			
 
				+
			
 
				+    det_path = os.environ.get("OCR_DET_MODEL_PATH") or str(
			
 
				+        model_dir / "ch_PP-OCRv5_det_infer.pth"
			
 
				+    )
			
 
				+    rec_path = os.environ.get("OCR_REC_MODEL_PATH") or str(
			
 
				+        model_dir / "ch_PP-OCRv4_rec_server_doc_infer.pth"
			
 
				+    )
			
 
				+    return PytorchPaddleOCR(
			
 
				+        lang="ch",
			
 
				+        det_model_path=det_path,
			
 
				+        rec_model_path=rec_path,
			
 
				+        det_db_box_thresh=det_thresh,
			
 
				+    )
			
 
				+
			
 
				+
			
 
				+def resolve_input_image(path: Path, *, prefer_raw: bool) -> Path:
			
 
				+    """Prefer the *_raw.png that accompanies the pipeline debug output."""
			
 
				+    if not prefer_raw or path.stem.endswith("_raw"):
			
 
				+        return path
			
 
				+    raw_path = path.parent / f"{path.stem}_raw{path.suffix}"
			
 
				+    if raw_path.is_file():
			
 
				+        print(f"  使用原图: {raw_path.name}（跳过 {path.name}）")
			
 
				+        return raw_path
			
 
				+    return path
			
 
				+
			
 
				+
			
 
				+def collect_inputs(path: Path, *, prefer_raw: bool) -> List[Path]:
			
 
				+    if path.is_file():
			
 
				+        if path.suffix.lower() not in _IMAGE_SUFFIXES:
			
 
				+            raise ValueError(f"不支持的图像格式: {path}")
			
 
				+        return [resolve_input_image(path, prefer_raw=prefer_raw)]
			
 
				+
			
 
				+    if not path.is_dir():
			
 
				+        raise FileNotFoundError(path)
			
 
				+
			
 
				+    all_images = sorted(
			
 
				+        p
			
 
				+        for p in path.iterdir()
			
 
				+        if p.is_file() and p.suffix.lower() in _IMAGE_SUFFIXES
			
 
				+    )
			
 
				+    if not all_images:
			
 
				+        raise FileNotFoundError(f"目录内无图像: {path}")
			
 
				+
			
 
				+    if prefer_raw:
			
 
				+        raws = [p for p in all_images if p.stem.endswith("_raw")]
			
 
				+        if raws:
			
 
				+            return raws
			
 
				+
			
 
				+    chosen: List[Path] = []
			
 
				+    for p in all_images:
			
 
				+        if p.stem.endswith("_raw"):
			
 
				+            continue
			
 
				+        raw_sibling = p.parent / f"{p.stem}_raw{p.suffix}"
			
 
				+        if prefer_raw and raw_sibling.is_file():
			
 
				+            continue
			
 
				+        chosen.append(p)
			
 
				+    return chosen or all_images
			
 
				+
			
 
				+
			
 
				+def _match_hit(text: str, target: Optional[str]) -> Optional[str]:
			
 
				+    """Classify a hit: "full" if target is a substring of text, "partial" if both are
			
 
				+    digit strings of length >= 6 (the digits are not compared), "nonempty" if no target."""
			
 
				+    if not text:
			
 
				+        return None
			
 
				+    if not target:
			
 
				+        return "nonempty"
			
 
				+    if target in text:
			
 
				+        return "full"
			
 
				+    if len(target) >= 6 and target.isdigit() and len(text) >= 6 and text.isdigit():
			
 
				+        return "partial"
			
 
				+    return None
			
 
				+
			
 
				+
			
 
				+def run_sweep(
			
 
				+    input_path: Path,
			
 
				+    out_dir: Path,
			
 
				+    *,
			
 
				+    prefer_raw: bool,
			
 
				+    target: Optional[str],
			
 
				+    model_dir: Path,
			
 
				+    methods: Sequence[str],
			
 
				+    thresholds: Sequence[Optional[int]],
			
 
				+    contrasts: Sequence[bool],
			
 
				+    upscales: Sequence[int],
			
 
				+    det_threshs: Sequence[float],
			
 
				+    text_black_target: int,
			
 
				+    save_images: bool,
			
 
				+    run_baseline: bool,
			
 
				+    baseline_upscale: int,
			
 
				+) -> Dict[str, Any]:
			
 
				+    resolved = resolve_input_image(input_path, prefer_raw=prefer_raw)
			
 
				+    raw = cv2.imread(str(resolved))
			
 
				+    if raw is None:
			
 
				+        raise RuntimeError(f"无法读取图像: {resolved}")
			
 
				+
			
 
				+    stem = resolved.stem.removesuffix("_raw")
			
 
				+    cell_out = out_dir / stem
			
 
				+    cell_out.mkdir(parents=True, exist_ok=True)
			
 
				+
			
 
				+    ocr_modes: List[Tuple[str, bool, bool]] = [
			
 
				+        ("det_rec", True, True),
			
 
				+        ("whole_rec", False, True),
			
 
				+    ]
			
 
				+
			
 
				+    results: List[Dict[str, Any]] = []
			
 
				+    hits: List[Dict[str, Any]] = []
			
 
				+    engines: Dict[float, Any] = {}
			
 
				+    total = 0
			
 
				+
			
 
				+    for method, thresh, contrast, upscale, det_th in product(
			
 
				+        methods, thresholds, contrasts, upscales, det_threshs
			
 
				+    ):
			
 
				+        if method != "threshold" and thresh is not None:
			
 
				+            continue
			
 
				+        if det_th not in engines:
			
 
				+            print(f"  [{stem}] 加载 OCR det_db_box_thresh={det_th} ...")
			
 
				+            engines[det_th] = _make_engine(det_th, model_dir)
			
 
				+
			
 
				+        img = _preprocess(
			
 
				+            raw,
			
 
				+            method=method,
			
 
				+            thresh=thresh,
			
 
				+            contrast=contrast,
			
 
				+            upscale=upscale,
			
 
				+            text_black_target=text_black_target,
			
 
				+        )
			
 
				+        tag = f"{method}_t{'d' if thresh is None else thresh}_c{int(contrast)}_u{upscale}_det{det_th}"
			
 
				+        if save_images:
			
 
				+            cv2.imwrite(str(cell_out / f"{tag}.png"), img)
			
 
				+
			
 
				+        for mode_name, det, rec in ocr_modes:
			
 
				+            total += 1
			
 
				+            ocr = _ocr(engines[det_th], img, det=det, rec=rec)
			
 
				+            row: Dict[str, Any] = {
			
 
				+                "tag": tag,
			
 
				+                "method": method,
			
 
				+                "threshold": thresh,
			
 
				+                "contrast": contrast,
			
 
				+                "upscale": upscale,
			
 
				+                "det_db_box_thresh": det_th,
			
 
				+                "ocr_mode": mode_name,
			
 
				+                **ocr,
			
 
				+            }
			
 
				+            results.append(row)
			
 
				+            m = _match_hit(row.get("text", ""), target)
			
 
				+            if m:
			
 
				+                row["match"] = m
			
 
				+                hits.append(row)
			
 
				+                print(
			
 
				+                    f"  HIT [{m}] {mode_name} {tag} "
			
 
				+                    f"score={row.get('score')} -> {row.get('text')!r}"
			
 
				+                )
			
 
				+
			
 
				+    if run_baseline:
			
 
				+        for det_th in det_threshs:
			
 
				+            if det_th not in engines:
			
 
				+                engines[det_th] = _make_engine(det_th, model_dir)
			
 
				+            base_img = _upscale(raw, baseline_upscale)
			
 
				+            if save_images:
			
 
				+                cv2.imwrite(str(cell_out / f"baseline_upscale{baseline_upscale}.png"), base_img)
			
 
				+            for mode_name, det, rec in ocr_modes:
			
 
				+                ocr = _ocr(engines[det_th], base_img, det=det, rec=rec)
			
 
				+                row = {
			
 
				+                    "tag": f"baseline_upscale{baseline_upscale}",
			
 
				+                    "det_db_box_thresh": det_th,
			
 
				+                    "ocr_mode": mode_name,
			
 
				+                    **ocr,
			
 
				+                }
			
 
				+                results.append(row)
			
 
				+                m = _match_hit(row.get("text", ""), target)
			
 
				+                if m:
			
 
				+                    row["match"] = m
			
 
				+                    hits.append(row)
			
 
				+
			
 
				+    report = {
			
 
				+        "input": str(resolved),
			
 
				+        "input_requested": str(input_path),
			
 
				+        "output_dir": str(cell_out),
			
 
				+        "target": target,
			
 
				+        "total_trials": total,
			
 
				+        "hits": hits,
			
 
				+        "all_results": results,
			
 
				+    }
			
 
				+    report_path = cell_out / "sweep_report.json"
			
 
				+    report_path.write_text(
			
 
				+        json.dumps(report, ensure_ascii=False, indent=2), encoding="utf-8"
			
 
				+    )
			
 
				+    return report
			
 
				+
			
 
				+
			
 
				+def _build_arg_parser() -> argparse.ArgumentParser:
			
 
				+    p = argparse.ArgumentParser(
			
 
				+        description="单元格图预处理 + OCR 参数网格扫描（对齐 pipeline 格级二次 OCR）",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "input",
			
 
				+        type=Path,
			
 
				+        help="单元格裁剪图路径，或 tablecell_ocr 目录（批量扫描）",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "-o",
			
 
				+        "--output",
			
 
				+        type=Path,
			
 
				+        default=None,
			
 
				+        help="输出目录，默认 <input_dir|input_parent>/sweep_out/<stem>",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "-t",
			
 
				+        "--target",
			
 
				+        default=None,
			
 
				+        help="期望 OCR 文本；用于标记 HIT（子串匹配）。省略则任意非空为 HIT",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--model-dir",
			
 
				+        type=Path,
			
 
				+        default=None,
			
 
				+        help="PaddleOCR torch 模型目录（含 det/rec .pth），也可用 OCR_*_MODEL_PATH",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--no-prefer-raw",
			
 
				+        action="store_true",
			
 
				+        help="不自动选用同名的 *_raw.png",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--quick",
			
 
				+        action="store_true",
			
 
				+        help="缩小网格（threshold 170,175 × upscale 128,192 × det 0.3,0.5）",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--methods",
			
 
				+        default="threshold,masked_adaptive",
			
 
				+        help="去水印方式，逗号分隔",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--thresholds",
			
 
				+        default="155,165,170,175,180,none",
			
 
				+        help="threshold 法的阈值；none=预设默认",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--contrasts",
			
 
				+        default="false,true",
			
 
				+        help="是否 contrast，逗号分隔 false,true",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--upscales",
			
 
				+        default="64,96,128,192",
			
 
				+        help="最短边放大目标，逗号分隔整数",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--det-threshs",
			
 
				+        default="0.2,0.3,0.4,0.5",
			
 
				+        help="det_db_box_thresh，逗号分隔",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--text-black-target",
			
 
				+        type=int,
			
 
				+        default=88,
			
 
				+        help="contrast text_restore 目标黑度",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--no-save-images",
			
 
				+        action="store_true",
			
 
				+        help="不写出中间预处理 png（仅报告）",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--no-baseline",
			
 
				+        action="store_true",
			
 
				+        help="跳过「仅放大、不去水印」对照组",
			
 
				+    )
			
 
				+    p.add_argument(
			
 
				+        "--baseline-upscale",
			
 
				+        type=int,
			
 
				+        default=128,
			
 
				+        help="baseline 对照组的最短边放大",
			
 
				+    )
			
 
				+    return p
			
 
				+
			
 
				+
			
 
				+def main(argv: Optional[Sequence[str]] = None) -> None:
			
 
				+    args = _build_arg_parser().parse_args(argv)
			
 
				+    inputs = collect_inputs(args.input, prefer_raw=not args.no_prefer_raw)
			
 
				+    if not inputs:
			
 
				+        raise SystemExit("未找到可扫描的图像")
			
 
				+
			
 
				+    if args.output is not None:
			
 
				+        out_root = args.output
			
 
				+    elif args.input.is_file():
			
 
				+        out_root = args.input.parent / "sweep_out"
			
 
				+    else:
			
 
				+        out_root = args.input / "sweep_out"
			
 
				+    out_root.mkdir(parents=True, exist_ok=True)
			
 
				+
			
 
				+    model_dir = args.model_dir or _default_model_dir()
			
 
				+    methods = [m.strip() for m in args.methods.split(",") if m.strip()]
			
 
				+
			
 
				+    if args.quick:
			
 
				+        thresholds = [170, 175]
			
 
				+        upscales = [128, 192]
			
 
				+        det_threshs = [0.3, 0.5]
			
 
				+        contrasts = [False, True]
			
 
				+    else:
			
 
				+        thresholds = _parse_csv_ints(args.thresholds)
			
 
				+        upscales = [int(x) for x in args.upscales.split(",") if x.strip()]
			
 
				+        det_threshs = _parse_csv_floats(args.det_threshs)
			
 
				+        contrasts = _parse_csv_bools(args.contrasts)
			
 
				+
			
 
				+    print(f"扫描 {len(inputs)} 张图 -> {out_root}")
			
 
				+    print(f"  methods={methods} thresholds={thresholds} upscales={upscales}")
			
 
				+    if args.target:
			
 
				+        print(f"  target={args.target!r}")
			
 
				+
			
 
				+    summary: List[Dict[str, Any]] = []
			
 
				+    for img_path in inputs:
			
 
				+        print(f"\n=== {img_path.name} ===")
			
 
				+        report = run_sweep(
			
 
				+            img_path,
			
 
				+            out_root,
			
 
				+            prefer_raw=not args.no_prefer_raw,
			
 
				+            target=args.target,
			
 
				+            model_dir=model_dir,
			
 
				+            methods=methods,
			
 
				+            thresholds=thresholds,
			
 
				+            contrasts=contrasts,
			
 
				+            upscales=upscales,
			
 
				+            det_threshs=det_threshs,
			
 
				+            text_black_target=args.text_black_target,
			
 
				+            save_images=not args.no_save_images,
			
 
				+            run_baseline=not args.no_baseline,
			
 
				+            baseline_upscale=args.baseline_upscale,
			
 
				+        )
			
 
				+        summary.append(
			
 
				+            {
			
 
				+                "input": report["input"],
			
 
				+                "hits": len(report["hits"]),
			
 
				+                "report": str(Path(report["output_dir"]) / "sweep_report.json"),
			
 
				+            }
			
 
				+        )
			
 
				+
			
 
				+    index_path = out_root / "sweep_index.json"
			
 
				+    index_path.write_text(
			
 
				+        json.dumps(summary, ensure_ascii=False, indent=2), encoding="utf-8"
			
 
				+    )
			
 
				+    print(f"\n全部完成，索引: {index_path}")
			
 
				+    for s in summary:
			
 
				+        print(f"  {s['input']}: {s['hits']} hits -> {s['report']}")
			
 
				+
			
 
				+
			
 
				+if __name__ == "__main__":
			
 
				+    if len(sys.argv) == 1:
			
 
				+        print("ℹ️  未提供命令行参数，使用默认配置运行...")
			
 
				+        default_config = {
			
 
				+            "input": "/Users/zhch158/workspace/data/流水分析/彭_广东兴宁农村商业银行/bank_statement_yusys_local/debug/table_recognition_wired/tablecell_ocr/彭_广东兴宁农村商业银行_page_002_0/cell219_empty_empty_raw.png",
			
 
				+            "output": "./output/彭_广东兴宁农村商业银行/cell219_sweep",
			
 
				+            "target": "ATM存折取款",
			
 
				+        }
			
 
				+        sys.argv = [sys.argv[0], default_config["input"]]
			
 
				+        for key, value in default_config.items():
			
 
				+            if key == "input":
			
 
				+                continue
			
 
				+            flag = f"--{key.replace('_', '-')}"
			
 
				+            if isinstance(value, bool) and value:
			
 
				+                sys.argv.append(flag)
			
 
				+            elif not isinstance(value, bool):
			
 
				+                sys.argv.extend([flag, str(value)])
			
 
				+
			
 
				+    sys.exit(main())
			
--- a/ocr_tools/gan_experiments_lab/evaluate.py
+++ b/ocr_tools/gan_experiments_lab/evaluate.py
@@ -0,0 +1,439 @@
 
				+"""
			
 
				+Watermark-removal evaluation script: compares the baseline (masked_adaptive) with the LaMa GAN method.
			
 
				+
			
 
				+Usage:
			
 
				+    cd ocr_platform/ocr_tools/gan_experiments_lab
			
 
				+
			
 
				+    # Compare all images under test_images/input/
			
 
				+    python evaluate.py
			
 
				+
			
 
				+    # Specify input/output directories
			
 
				+    python evaluate.py --input ./test_images/synthetic/ --output ./output/synthetic_compare
			
 
				+
			
 
				+    # Compute PSNR/SSIM when clean reference images are available
			
 
				+    python evaluate.py --input ./test_images/synthetic/ --clean-dir ./test_images/clean/
			
 
				+
			
 
				+Outputs:
			
 
				+    output/compare/     - side-by-side comparison (original | baseline | GAN, plus a mask overlay panel when a mask is detected)
			
 
				+    output/inpainted/   - GAN inpainting results
			
 
				+    output/mask_debug/  - mask visualizations
			
 
				+    output/metrics/     - evaluation metrics JSON
			
 
				+"""
			
 
				+from __future__ import annotations
			
 
				+
			
 
				+import argparse
			
 
				+import json
			
 
				+import sys
			
 
				+import time
			
 
				+from pathlib import Path
			
 
				+from typing import Any, Dict, List, Optional, Tuple
			
 
				+
			
 
				+import cv2
			
 
				+import numpy as np
			
 
				+
			
 
				+# Add the ocr_platform root to sys.path so that ocr_utils can be imported
			
 
				+_repo_root = Path(__file__).parents[2]
			
 
				+if str(_repo_root) not in sys.path:
			
 
				+    sys.path.insert(0, str(_repo_root))
			
 
				+
			
 
				+from loguru import logger
			
 
				+from PIL import Image
			
 
				+
			
 
				+from ocr_utils.watermark import (
			
 
				+    WatermarkProcessor,
			
 
				+    build_watermark_mask,
			
 
				+    detect_watermark,
			
 
				+    merge_watermark_config,
			
 
				+    render_watermark_mask_overlay,
			
 
				+)
			
 
				+from lama_inpaint import LamaInpainter
			
 
				+
			
 
				+# ── Evaluation metrics ──────────────────────────────────────────
			
 
				+
			
 
				+
			
 
				+def _to_gray(img: np.ndarray) -> np.ndarray:
			
 
				+    if img.ndim == 3:
			
 
				+        return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).astype(np.float64)
			
 
				+    return img.astype(np.float64)
			
 
				+
			
 
				+
			
 
				+def compute_psnr(img1: np.ndarray, img2: np.ndarray) -> float:
			
 
				+    g1, g2 = _to_gray(img1), _to_gray(img2)
			
 
				+    mse = np.mean((g1 - g2) ** 2)
			
 
				+    if mse < 1e-10:
			
 
				+        return 100.0
			
 
				+    return float(20 * np.log10(255.0 / np.sqrt(mse)))
			
 
				+
			
 
				+
			
 
				+def compute_ssim(img1: np.ndarray, img2: np.ndarray) -> float:
			
 
				+    """Simplified SSIM (grayscale, 11x11 Gaussian window, sigma 1.5)."""
			
 
				+    g1, g2 = _to_gray(img1), _to_gray(img2)
			
 
				+    k1, k2 = 0.01, 0.03
			
 
				+    data_range = 255.0  # dynamic range of 8-bit images
			
 
				+    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
			
 
				+
			
 
				+    kernel = cv2.getGaussianKernel(11, 1.5)
			
 
				+    window = np.outer(kernel, kernel)
			
 
				+    window /= window.sum()
			
 
				+
			
 
				+    mu1 = cv2.filter2D(g1, -1, window, borderType=cv2.BORDER_REFLECT)
			
 
				+    mu2 = cv2.filter2D(g2, -1, window, borderType=cv2.BORDER_REFLECT)
			
 
				+    mu1_sq = mu1 * mu1
			
 
				+    mu2_sq = mu2 * mu2
			
 
				+    mu1_mu2 = mu1 * mu2
			
 
				+    sigma1_sq = cv2.filter2D(g1 * g1, -1, window, borderType=cv2.BORDER_REFLECT) - mu1_sq
			
 
				+    sigma2_sq = cv2.filter2D(g2 * g2, -1, window, borderType=cv2.BORDER_REFLECT) - mu2_sq
			
 
				+    sigma12 = cv2.filter2D(g1 * g2, -1, window, borderType=cv2.BORDER_REFLECT) - mu1_mu2
			
 
				+
			
 
				+    num = (2 * mu1_mu2 + c1) * (2 * sigma12 + c2)
			
 
				+    denom = (mu1_sq + mu2_sq + c1) * (sigma1_sq + sigma2_sq + c2)
			
 
				+    ssim_map = num / (denom + 1e-10)
			
 
				+    return float(ssim_map.mean())
			
 
				+
			
 
				+
			
 
				+# ── Watermark configuration ─────────────────────────────────────
			
 
				+
			
 
				+
			
 
				+def _baseline_config() -> Dict[str, Any]:
			
 
				+    return merge_watermark_config("page", {
			
 
				+        "method": "masked_adaptive",
			
 
				+        "threshold": 175,
			
 
				+        "contrast_enhancement": {"enabled": True, "method": "text_restore", "text_black_target": 85},
			
 
				+    })
			
 
				+
			
 
				+
			
 
				+def _gan_wm_config() -> Dict[str, Any]:
			
 
				+    return merge_watermark_config("page", {"method": "masked_adaptive", "threshold": 175})
			
 
				+
			
 
				+
			
 
				+# ── Single-image processing ─────────────────────────────────────
			
 
				+
			
 
				+
			
 
				+def _load_image(path: Path) -> np.ndarray:
			
 
				+    """Load an image as a BGR ndarray."""
			
 
				+    pil = Image.open(str(path)).convert("RGB")
			
 
				+    np_img = np.array(pil)
			
 
				+    return cv2.cvtColor(np_img, cv2.COLOR_RGB2BGR)
			
 
				+
			
 
				+
			
 
				+def _run_baseline(bgr: np.ndarray, cfg: Dict[str, Any]) -> Tuple[np.ndarray, Dict[str, Any]]:
			
 
				+    """Run the masked_adaptive method."""
			
 
				+    proc = WatermarkProcessor(cfg, scope="page")
			
 
				+    debug: Dict[str, Any] = {}
			
 
				+    result, stages = proc.process(bgr, apply_removal=True, removal_debug=debug)
			
 
				+    return np.asarray(result), debug
			
 
				+
			
 
				+
			
 
				+def _run_gan(
			
 
				+    bgr: np.ndarray,
			
 
				+    wm_cfg: Dict[str, Any],
			
 
				+    inpainter: LamaInpainter,
			
 
				+) -> Tuple[np.ndarray, Dict[str, Any]]:
			
 
				+    """
			
 
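				+    Inpaint the watermark region with the GAN.
			
 
				+
			
 
				+    1. Detect the watermark region with build_watermark_mask
			
 
				+    2. Inpaint it with LaMa
			
 
				+    3. Fall back to the baseline if inpainting fails (the baseline is also used when the mask is empty)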
			
 
				+    """
			
 
				+    debug: Dict[str, Any] = {"mode": "gan"}
			
 
				+
			
 
				+    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
			
 
				+    mask_cfg = wm_cfg.get("mask", {})
			
 
				+    wm_mask, mask_debug = build_watermark_mask(gray, bgr=bgr, **mask_cfg)
			
 
				+
			
 
				+    debug.update({k: v for k, v in mask_debug.items()
			
 
				+                  if not isinstance(v, np.ndarray)})
			
 
				+    debug["wm_mask"] = wm_mask
			
 
				+
			
 
				+    if not np.any(wm_mask):
			
 
				+        logger.info("  未检测到水印区域，跳过GAN")
			
 
				+        debug["mode"] = "gan_no_mask"
			
 
				+        clean_gray, _ = _run_baseline(bgr, wm_cfg)
			
 
				+        return clean_gray, debug
			
 
				+
			
 
				+    n_wm = int(np.count_nonzero(wm_mask))  # correct for both bool and 0/255 masks
			
 
				+    logger.info(f"  水印区域: {n_wm} 像素 ({100 * n_wm / wm_mask.size:.2f}%)")
			
 
				+
			
 
				+    t0 = time.perf_counter()
			
 
				+    result = inpainter.inpaint(bgr, wm_mask)
			
 
				+    elapsed = time.perf_counter() - t0
			
 
				+
			
 
				+    if result is not None:
			
 
				+        debug["mode"] = "gan"
			
 
				+        debug["gan_success"] = True
			
 
				+        debug["gan_inference_time_s"] = round(elapsed, 2)
			
 
				+        logger.info(f"  GAN修复成功 ({elapsed:.1f}s)")
			
 
				+        # Apply contrast enhancement to the inpainted result
			
 
				+        from ocr_utils.watermark.contrast import apply_contrast_enhancement_config
			
 
				+        ce_cfg = wm_cfg.get("contrast_enhancement")
			
 
				+        result_gray = cv2.cvtColor(result, cv2.COLOR_BGR2GRAY)
			
 
				+        result_gray = apply_contrast_enhancement_config(result_gray, ce_cfg)
			
 
				+        return result_gray, debug
			
 
				+
			
 
				+    # GAN inpainting failed; fall back to the baseline
			
 
				+    logger.warning("  GAN修复失败，回退 baseline")
			
 
				+    debug["mode"] = "gan_fallback"
			
 
				+    debug["fallback_reason"] = "gan_inference_failed"
			
 
				+    clean_gray, fallback_debug = _run_baseline(bgr, wm_cfg)
			
 
				+    debug["fallback_debug"] = fallback_debug
			
 
				+    return clean_gray, debug
			
 
				+
			
 
				+
			
 
				+# ── Output ────────────────────────────────────────────────────────
			
 
				+
			
 
				+
			
 
				+def _make_compare_image(
			
 
				+    bgr: np.ndarray,
			
 
				+    baseline_gray: np.ndarray,
			
 
				+    gan_result: np.ndarray,
			
 
				+    wm_mask: Optional[np.ndarray] = None,
			
 
				+) -> np.ndarray:
			
 
				+    """Build a side-by-side comparison image (3 panels, or 4 when a non-empty mask is given)."""
			
 
				+
			
 
				+    def _to_bgr(arr: np.ndarray) -> np.ndarray:
			
 
				+        if arr.ndim == 2:
			
 
				+            return cv2.cvtColor(arr, cv2.COLOR_GRAY2BGR)
			
 
				+        return arr
			
 
				+
			
 
				+    def _resize(arr: np.ndarray, target_h: int, target_w: int) -> np.ndarray:
			
 
				+        if arr.shape[0] != target_h or arr.shape[1] != target_w:
			
 
				+            return cv2.resize(arr, (target_w, target_h))
			
 
				+        return arr
			
 
				+
			
 
				+    # The GAN result may be grayscale or BGR; _to_bgr handles both cases
			
 
				+    gan_bgr = _to_bgr(gan_result)
			
 
				+
			
 
				+    # Resize all panels to the original image size
			
 
				+    ref_h, ref_w = bgr.shape[:2]
			
 
				+    baseline_bgr = _to_bgr(baseline_gray)
			
 
				+    baseline_bgr = _resize(baseline_bgr, ref_h, ref_w)
			
 
				+    gan_bgr = _resize(gan_bgr, ref_h, ref_w)
			
 
				+
			
 
				+    panels = [bgr, baseline_bgr, gan_bgr]
			
 
				+
			
 
				+    # If a mask is available, overlay it on the original image as a fourth panel
			
 
				+    if wm_mask is not None and np.any(wm_mask):
			
 
				+        mask_overlay = render_watermark_mask_overlay(bgr, wm_mask)
			
 
				+        panels.append(mask_overlay)
			
 
				+
			
 
				+    # Add labels
			
 
				+    labels = ["Original", "Baseline (masked_adaptive)", "GAN (LaMa)"]
			
 
				+    if len(panels) == 4:
			
 
				+        labels.append("Watermark Mask")
			
 
				+
			
 
				+    labeled = []
			
 
				+    for panel, label in zip(panels, labels):
			
 
				+        # Append a label bar below the panel
			
 
				+        bar = np.ones((36, panel.shape[1], 3), dtype=np.uint8) * 240
			
 
				+        cv2.putText(bar, label, (12, 24), cv2.FONT_HERSHEY_SIMPLEX, 0.55, (0, 0, 0), 1)
			
 
				+        labeled.append(np.vstack([panel, bar]))
			
 
				+
			
 
				+    # Pad to a common height, then concatenate horizontally
			
 
				+    max_h = max(p.shape[0] for p in labeled)
			
 
				+    for i in range(len(labeled)):
			
 
				+        if labeled[i].shape[0] < max_h:
			
 
				+            pad = np.ones((max_h - labeled[i].shape[0], labeled[i].shape[1], 3), dtype=np.uint8) * 255
			
 
				+            labeled[i] = np.vstack([labeled[i], pad])
			
 
				+
			
 
				+    return np.hstack(labeled)
			
 
				+
			
 
				+
			
 
				+def _save_result(
			
 
				+    stem: str,
			
 
				+    result: np.ndarray,
			
 
				+    output_dir: Path,
			
 
				+    prefix: str = "",
			
 
				+) -> Path:
			
 
				+    """Save a result image."""
			
 
				+    p = output_dir / f"{stem}_{prefix}.png"
			
 
				+    # cv2.imwrite accepts both single-channel and BGR arrays, so no branching is needed
			
 
				+    cv2.imwrite(str(p), result)
			
 
				+    return p
			
 
				+
			
 
				+
			
 
				+def _save_metrics_json(
			
 
				+    metrics_list: List[Dict[str, Any]],
			
 
				+    output_dir: Path,
			
 
				+) -> None:
			
 
				+    output_dir.mkdir(parents=True, exist_ok=True)
			
 
				+    p = output_dir / "metrics.json"
			
 
				+    p.write_text(json.dumps(metrics_list, ensure_ascii=False, indent=2), encoding="utf-8")
			
 
				+    logger.info(f"评估指标: {p}")
			
 
				+
			
 
				+
			
 
				+# ── Main ──────────────────────────────────────────────────────────
			
 
				+
			
 
				+
			
 
				+def evaluate(
			
 
				+    input_dir: Path,
			
 
				+    output_root: Path,
			
 
				+    *,
			
 
				+    clean_dir: Optional[Path] = None,
			
 
				+    device: str = "cpu",
			
 
				+    gan_only: bool = False,
			
 
				+) -> None:
			
 
				+    """Batch evaluation."""
			
 
				+    img_files = sorted([
			
 
				+        f for f in input_dir.iterdir()
			
 
				+        if f.suffix.lower() in {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff", ".webp"}
			
 
				+    ])
			
 
				+    if not img_files:
			
 
				+        logger.error(f"{input_dir} 下没有图片文件")
			
 
				+        return
			
 
				+
			
 
				+    # 输出目录
			
 
				+    out_compare = output_root / "compare"
			
 
				+    out_inpainted = output_root / "inpainted"
			
 
				+    out_mask = output_root / "mask_debug"
			
 
				+    out_metrics = output_root / "metrics"
			
 
				+    for d in [out_compare, out_inpainted, out_mask, out_metrics]:
			
 
				+        d.mkdir(parents=True, exist_ok=True)
			
 
				+
			
 
				+    baseline_cfg = _baseline_config()
			
 
				+    wm_cfg = _gan_wm_config()
			
 
				+
			
 
				+    inpainter = LamaInpainter(device=device)
			
 
				+    available = inpainter.is_available
			
 
				+    logger.info(f"LaMa 可用: {available}, backend: {inpainter._backend or '未加载'}")
			
 
				+
			
 
				+    if not available and not gan_only:
			
 
				+        logger.warning("LaMa backend 不可用，GAN将回退到OpenCV inpaint")
			
 
				+
			
 
				+    all_metrics: List[Dict[str, Any]] = []
			
 
				+
			
 
				+    for f in img_files:
			
 
				+        logger.info(f"\n处理: {f.name}")
			
 
				+        stem = f.stem
			
 
				+        bgr = _load_image(f)
			
 
				+
			
 
				+        # 检查是否有对应 clean 参考图
			
 
				+        clean_img: Optional[np.ndarray] = None
			
 
				+        if clean_dir:
			
 
				+            for ext in (".png", ".jpg", ".jpeg"):
			
 
				+                clean_path = clean_dir / f"{stem}{ext}"
			
 
				+                if clean_path.exists():
			
 
				+                    clean_img = _load_image(clean_path)
			
 
				+                    break
			
 
				+            if clean_img is None:
			
 
				+                # 尝试移除 _watermarked 后缀
			
 
				+                clean_name = stem.replace("_watermarked", "")
			
 
				+                for ext in (".png", ".jpg", ".jpeg"):
			
 
				+                    clean_path = clean_dir / f"{clean_name}{ext}"
			
 
				+                    if clean_path.exists():
			
 
				+                        clean_img = _load_image(clean_path)
			
 
				+                        break
			
 
				+
			
 
				+        # 检测水印
			
 
				+        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
			
 
				+        has_wm = detect_watermark(gray, ratio_threshold=0.025)
			
 
				+        logger.info(f"  水印检测: {'有水印' if has_wm else '无水印'}")
			
 
				+
			
 
				+        # ── Baseline ──
			
 
				+        logger.info("  运行 baseline (masked_adaptive)...")
			
 
				+        t0 = time.perf_counter()
			
 
				+        baseline_result, baseline_debug = _run_baseline(bgr, baseline_cfg)
			
 
				+        baseline_time = time.perf_counter() - t0
			
 
				+        logger.info(f"  baseline 耗时: {baseline_time:.1f}s")
			
 
				+
			
 
				+        # ── GAN ──
			
 
				+        t0 = time.perf_counter()
			
 
				+        gan_result, gan_debug = _run_gan(bgr, wm_cfg, inpainter)
			
 
				+        gan_time = time.perf_counter() - t0
			
 
				+
			
 
				+        # ── 保存结果 ──
			
 
				+        _save_result(stem, baseline_result, out_inpainted, "baseline")
			
 
				+        if gan_result.ndim == 2:
			
 
				+            gan_save_bgr = cv2.cvtColor(gan_result, cv2.COLOR_GRAY2BGR)
			
 
				+        else:
			
 
				+            gan_save_bgr = gan_result
			
 
				+        _save_result(stem, gan_save_bgr, out_inpainted, "gan")
			
 
				+
			
 
				+        # ── 掩膜可视化 ──
			
 
				+        wm_mask = gan_debug.get("wm_mask")
			
 
				+        if wm_mask is not None and np.any(wm_mask):
			
 
				+            mask_overlay = render_watermark_mask_overlay(bgr, wm_mask)
			
 
				+            _save_result(stem, mask_overlay, out_mask, "mask_overlay")
			
 
				+
			
 
				+        # ── 对比图 ──
			
 
				+        compare_img = _make_compare_image(bgr, baseline_result, gan_save_bgr, wm_mask)
			
 
				+        _save_result(stem, compare_img, out_compare, "compare")
			
 
				+
			
 
				+        # ── 评估指标 ──
			
 
				+        metrics: Dict[str, Any] = {
			
 
				+            "file": f.name,
			
 
				+            "has_watermark": has_wm,
			
 
				+            "baseline_time_s": round(baseline_time, 2),
			
 
				+            "gan_time_s": round(gan_time, 2),
			
 
				+            "gan_mode": gan_debug.get("mode", "unknown"),
			
 
				+        }
			
 
				+        if clean_img is not None:
			
 
				+            # baseline vs clean
			
 
				+            metrics["baseline_psnr"] = round(compute_psnr(baseline_result, clean_img), 2)
			
 
				+            metrics["baseline_ssim"] = round(compute_ssim(baseline_result, clean_img), 4)
			
 
				+            # gan vs clean
			
 
				+            metrics["gan_psnr"] = round(compute_psnr(gan_save_bgr, clean_img), 2)
			
 
				+            metrics["gan_ssim"] = round(compute_ssim(gan_save_bgr, clean_img), 4)
			
 
				+            logger.info(
			
 
				+                f"  PSNR: baseline={metrics['baseline_psnr']}dB, "
			
 
				+                f"GAN={metrics['gan_psnr']}dB"
			
 
				+            )
			
 
				+        all_metrics.append(metrics)
			
 
				+
			
 
				+    _save_metrics_json(all_metrics, out_metrics)
			
 
				+
			
 
				+    # 汇总
			
 
				+    logger.info(f"\n{'='*50}")
			
 
				+    logger.info(f"评估完成，共 {len(img_files)} 张图")
			
 
				+    logger.info(f"  对比图:   {out_compare}")
			
 
				+    logger.info(f"  修复结果: {out_inpainted}")
			
 
				+    logger.info(f"  掩膜:     {out_mask}")
			
 
				+    logger.info(f"  指标:     {out_metrics}")
			
 
				+
			
 
				+    # Average only over images that had a clean reference image
			
 
				+    scored = [m for m in all_metrics if "baseline_psnr" in m]
			
 
				+    if scored:
			
 
				+        avg_baseline_psnr = np.mean([m["baseline_psnr"] for m in scored])
			
 
				+        avg_gan_psnr = np.mean([m["gan_psnr"] for m in scored])
			
 
				+        logger.info(f"  平均 PSNR: baseline={avg_baseline_psnr:.1f}dB, GAN={avg_gan_psnr:.1f}dB")
			
 
				+
			
 
				+
			
 
				+def main():
			
 
				+    root = Path(__file__).parent
			
 
				+
			
 
				+    parser = argparse.ArgumentParser(
			
 
				+        description="去水印评估：baseline vs GAN",
			
 
				+        formatter_class=argparse.RawDescriptionHelpFormatter,
			
 
				+        epilog=__doc__,
			
 
				+    )
			
 
				+    parser.add_argument("--input", type=Path, default=root / "test_images" / "input",
			
 
				+                        help="输入图片目录")
			
 
				+    parser.add_argument("--output", type=Path, default=root / "output",
			
 
				+                        help="输出根目录")
			
 
				+    parser.add_argument("--clean-dir", type=Path, default=None,
			
 
				+                        help="clean参考图目录（用于计算PSNR/SSIM）")
			
 
				+    parser.add_argument("--device", type=str, default="cpu",
			
 
				+                        choices=["cpu", "cuda", "mps"],
			
 
				+                        help="推理设备")
			
 
				+    parser.add_argument("--gan-only", action="store_true",
			
 
				+                        help="仅运行GAN（跳过baseline）")
			
 
				+    args = parser.parse_args()
			
 
				+
			
 
				+    evaluate(
			
 
				+        args.input,
			
 
				+        args.output,
			
 
				+        clean_dir=args.clean_dir,
			
 
				+        device=args.device,
			
 
				+        gan_only=args.gan_only,
			
 
				+    )
			
 
				+
			
 
				+
			
 
				+if __name__ == "__main__":
			
 
				+    main()
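
The summary step of `evaluate` averages per-image PSNR values, but only images with a clean reference carry those keys. The following is a minimal sketch of that aggregation, assuming records shaped like the entries written to `metrics.json`; the helper name `mean_metric` is hypothetical and not part of the diff.

```python
from statistics import mean
from typing import Any, Dict, List, Optional


def mean_metric(records: List[Dict[str, Any]], key: str) -> Optional[float]:
    """Average `key` over the records that actually contain it.

    Records without the key (images with no clean reference) are skipped
    rather than counted as 0, so they do not drag the average down.
    Returns None when no record carries the key.
    """
    values = [r[key] for r in records if key in r]
    return mean(values) if values else None


if __name__ == "__main__":
    sample = [
        {"file": "a.png", "baseline_psnr": 30.0},
        {"file": "b.png"},  # no clean reference for this image
        {"file": "c.png", "baseline_psnr": 20.0},
    ]
    print(mean_metric(sample, "baseline_psnr"))
```

Returning `None` instead of `0` lets the caller decide whether to log the average at all.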
			
--- a/ocr_tools/gan_experiments_lab/lama_inpaint.py
+++ b/ocr_tools/gan_experiments_lab/lama_inpaint.py
@@ -0,0 +1,245 @@
 
				+"""
			
 
				+LaMa (Large Mask Inpainting) 推理模块。
			
 
				+
			
 
				+封装预训练LaMa模型的加载与推理，方案选择（按优先级）:
			
 
				+1. simple_lama_inpainting  pip包（最简）
			
 
				+2. 本地 lama 仓库代码（big-lama checkpoint）
			
 
				+3. OpenCV inpainting（终极回退，不用GAN）
			
 
				+
			
 
				+用法:
			
 
				+    from gan_experiments_lab.lama_inpaint import LamaInpainter
			
 
				+    inpaint = LamaInpainter(device="cpu")
			
 
				+    result = inpaint.inpaint(bgr_image, mask_bool)
			
 
				+"""
			
 
				+from __future__ import annotations
			
 
				+
			
 
				+import sys
			
 
				+from pathlib import Path
			
 
				+from typing import Optional
			
 
				+
			
 
				+import cv2
			
 
				+import numpy as np
			
 
				+from loguru import logger
			
 
				+
			
 
				+
			
 
				+def _check_simple_lama() -> bool:
			
 
				+    try:
			
 
				+        import simple_lama_inpainting  # noqa: F401
			
 
				+        return True
			
 
				+    except ImportError:
			
 
				+        return False
			
 
				+
			
 
				+
			
 
				+def _check_lama_repo() -> Optional[Path]:
			
 
				+    """Return the path of a local lama repo checkout, or None if no candidate contains saicinpainting."""
			
 
				+    candidates = [
			
 
				+        Path(__file__).parent / "lama",
			
 
				+        Path(__file__).parents[2] / "lama",
			
 
				+        Path.home() / "lama",
			
 
				+        Path("/tmp/lama"),
			
 
				+    ]
			
 
				+    for p in candidates:
			
 
				+        if (p / "saicinpainting" / "__init__.py").exists():
			
 
				+            return p
			
 
				+    return None
			
 
				+
			
 
				+
			
 
				+class LamaInpainter:
			
 
				+    """LaMa inpainting 门面，自动选择可用后端。"""
			
 
				+
			
 
				+    def __init__(
			
 
				+        self,
			
 
				+        *,
			
 
				+        device: str = "cpu",
			
 
				+        inference_size: Optional[int] = None,
			
 
				+        pad_to_multiple: int = 8,
			
 
				+    ):
			
 
				+        self._device = device
			
 
				+        self._inference_size = inference_size  # None = 保持原尺寸
			
 
				+        self._pad_to_multiple = pad_to_multiple
			
 
				+        self._model = None
			
 
				+        self._backend = None  # "simple_lama" | "lama_repo" | "opencv"
			
 
				+        self._lama_repo_path: Optional[Path] = None
			
 
				+
			
 
				+    @property
			
 
				+    def is_available(self) -> bool:
			
 
				+        if self._backend is not None:
			
 
				+            return self._backend != "opencv"
			
 
				+        if _check_simple_lama():
			
 
				+            self._backend = "simple_lama"
			
 
				+            return True
			
 
				+        if _check_lama_repo():
			
 
				+            self._backend = "lama_repo"
			
 
				+            return True
			
 
				+        return False
			
 
				+
			
 
				+    def load(self) -> bool:
			
 
				+        """加载模型，返回是否成功。"""
			
 
				+        if self._model is not None:
			
 
				+            return True
			
 
				+
			
 
				+        if _check_simple_lama():
			
 
				+            return self._load_simple_lama()
			
 
				+        repo = _check_lama_repo()
			
 
				+        if repo:
			
 
				+            return self._load_lama_repo(repo)
			
 
				+
			
 
				+        logger.warning("LaMa backends 都不可用，将回退 OpenCV inpainting")
			
 
				+        self._backend = "opencv"
			
 
				+        return False
			
 
				+
			
 
				+    def _load_simple_lama(self) -> bool:
			
 
				+        try:
			
 
				+            from simple_lama_inpainting import SimpleLama
			
 
				+            self._model = SimpleLama(device=self._device)
			
 
				+            self._backend = "simple_lama"
			
 
				+            logger.info(f"LaMa (simple_lama_inpainting) 已加载, device={self._device}")
			
 
				+            return True
			
 
				+        except Exception as e:
			
 
				+            logger.warning(f"simple_lama_inpainting 加载失败: {e}")
			
 
				+            return False
			
 
				+
			
 
				+    def _load_lama_repo(self, repo_path: Path) -> bool:
			
 
				+        try:
			
 
				+            if str(repo_path) not in sys.path:
			
 
				+                sys.path.insert(0, str(repo_path))
			
 
				+
			
 
				+            from omegaconf import OmegaConf
			
 
				+            from saicinpainting.training.trainers import load_checkpoint
			
 
				+
			
 
				+            config_path = repo_path / "big-lama" / "config.yaml"
			
 
				+            ckpt_path = repo_path / "big-lama" / "models" / "best.ckpt"
			
 
				+
			
 
				+            if not config_path.exists() or not ckpt_path.exists():
			
 
				+                logger.warning(
			
 
				+                    f"lama 模型文件缺失。请下载: "
			
 
				+                    f"wget https://github.com/Sanster/models/releases/download/add_big_lama/big-lama.zip && "
			
 
				+                    f"unzip big-lama.zip -d {repo_path}"
			
 
				+                )
			
 
				+                return False
			
 
				+
			
 
				+            conf = OmegaConf.load(str(config_path))
			
 
				+            conf.training_model.predict_only = True
			
 
				+            conf.visualizer.kind = "noop"
			
 
				+
			
 
				+            model = load_checkpoint(conf, str(ckpt_path), strict=False, map_location="cpu")
			
 
				+            model.eval()
			
 
				+            if self._device != "cpu":
			
 
				+                model.to(self._device)
			
 
				+            self._model = model
			
 
				+            self._lama_repo_path = repo_path
			
 
				+            self._backend = "lama_repo"
			
 
				+            logger.info(f"LaMa (lama_repo) 已加载, device={self._device}")
			
 
				+            return True
			
 
				+        except Exception as e:
			
 
				+            logger.warning(f"lama_repo 加载失败: {e}")
			
 
				+            return False
			
 
				+
			
 
				+    def inpaint(self, image: np.ndarray, mask: np.ndarray) -> Optional[np.ndarray]:
			
 
				+        """
			
 
				+        修复图像。
			
 
				+
			
 
				+        Args:
			
 
				+            image: BGR ndarray (H, W, 3), uint8
			
 
				+            mask: bool ndarray (H, W), True=需要修复的水印区域
			
 
				+
			
 
				+        Returns:
			
 
				+            BGR ndarray (H, W, 3), uint8, or None
			
 
				+        """
			
 
				+        if not self._model:
			
 
				+            if not self.load():
			
 
				+                return self._opencv_inpaint(image, mask)
			
 
				+
			
 
				+        if self._backend == "simple_lama":
			
 
				+            return self._inpaint_simple_lama(image, mask)
			
 
				+        elif self._backend == "lama_repo":
			
 
				+            return self._inpaint_lama_repo(image, mask)
			
 
				+        else:
			
 
				+            return self._opencv_inpaint(image, mask)
			
 
				+
			
 
				+    def _inpaint_simple_lama(self, image: np.ndarray, mask: np.ndarray) -> Optional[np.ndarray]:
			
 
				+        try:
			
 
				+            rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
			
 
				+            mask_u8 = mask.astype(np.uint8) * 255
			
 
				+            # 按需 resize
			
 
				+            if self._inference_size:
			
 
				+                rgb, mask_u8, orig_size = self._resize_to_inference(rgb, mask_u8)
			
 
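				+            # np.array accepts either a PIL image or an ndarray from the backend
			
 
				+            result_rgb = np.array(self._model(rgb, mask_u8))
			
 
				+            # Crop in case the backend padded the input
			
 
				+            result_rgb = result_rgb[: rgb.shape[0], : rgb.shape[1]]
			
 
				+            if self._inference_size:
			
 
				+                # orig_size is (w, h), the dsize order cv2.resize expects
			
 
				+                result_rgb = cv2.resize(result_rgb, orig_size)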
			
 
				+            return cv2.cvtColor(result_rgb, cv2.COLOR_RGB2BGR)
			
 
				+        except Exception as e:
			
 
				+            logger.warning(f"simple_lama 推理失败: {e}")
			
 
				+            return None
			
 
				+
			
 
				+    def _inpaint_lama_repo(self, image: np.ndarray, mask: np.ndarray) -> Optional[np.ndarray]:
			
 
				+        try:
			
 
				+            import torch
			
 
				+            import torch.nn.functional as F
			
 
				+            from saicinpainting.evaluation.data import pad_tensor_to_modulo
			
 
				+
			
 
				+            rgb = cv2.cvtColor(image.astype(np.float32) / 255.0, cv2.COLOR_BGR2RGB)
			
 
				+            mask_f = mask.astype(np.float32)
			
 
				+            orig_h, orig_w = rgb.shape[:2]
			
 
				+
			
 
				+            # resize
			
 
				+            if self._inference_size:
			
 
				+                rgb, mask_f, (orig_w, orig_h) = self._resize_image_mask(rgb, mask_f)
			
 
				+
			
 
				+            img_t = torch.from_numpy(rgb).permute(2, 0, 1).unsqueeze(0)
			
 
				+            mask_t = torch.from_numpy(mask_f).unsqueeze(0).unsqueeze(0)
			
 
				+
			
 
				+            img_t = pad_tensor_to_modulo(img_t, self._pad_to_multiple)
			
 
				+            mask_t = pad_tensor_to_modulo(mask_t, self._pad_to_multiple)
			
 
				+
			
 
				+            if self._device != "cpu":
			
 
				+                img_t = img_t.to(self._device)
			
 
				+                mask_t = mask_t.to(self._device)
			
 
				+
			
 
				+            with torch.no_grad():
			
 
				+                output = self._model(img_t, mask_t)
			
 
				+                # output shape: (B, C, H, W)
			
 
				+                result = output[0].permute(1, 2, 0).cpu().numpy()
			
 
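				+                # Crop the padding: rgb is the (possibly resized) pre-pad input
			
 
				+                in_h, in_w = rgb.shape[:2]
			
 
				+                result = result[:in_h, :in_w, :]
			
 
				+
			
 
				+            if (in_h, in_w) != (orig_h, orig_w):
			
 
				+                # Scale back to the original resolution after inference-size resizing
			
 
				+                result = cv2.resize(np.ascontiguousarray(result), (orig_w, orig_h), interpolation=cv2.INTER_CUBIC)
			
 
				+            result = np.clip(result, 0, 1)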
			
 
				+            result_u8 = (result * 255).astype(np.uint8)
			
 
				+            return cv2.cvtColor(result_u8, cv2.COLOR_RGB2BGR)
			
 
				+        except Exception as e:
			
 
				+            logger.warning(f"lama_repo 推理失败: {e}")
			
 
				+            return None
			
 
				+
			
 
				+    def _resize_to_inference(self, rgb: np.ndarray, mask: np.ndarray) -> tuple:
			
 
				+        h, w = rgb.shape[:2]
			
 
				+        size = self._inference_size or min(h, w)
			
 
				+        scale = size / min(h, w)
			
 
				+        new_w, new_h = int(w * scale), int(h * scale)
			
 
				+        rgb_rs = cv2.resize(rgb, (new_w, new_h), interpolation=cv2.INTER_CUBIC)
			
 
				+        mask_rs = cv2.resize(mask, (new_w, new_h), interpolation=cv2.INTER_NEAREST)
			
 
				+        return rgb_rs, mask_rs, (w, h)
			
 
				+
			
 
				+    def _resize_image_mask(self, rgb: np.ndarray, mask: np.ndarray) -> tuple:
			
 
				+        h, w = rgb.shape[:2]
			
 
				+        size = self._inference_size or min(h, w)
			
 
				+        scale = size / min(h, w)
			
 
				+        new_w, new_h = int(w * scale), int(h * scale)
			
 
				+        rgb_rs = cv2.resize(rgb, (new_w, new_h), interpolation=cv2.INTER_CUBIC)
			
 
				+        mask_rs = cv2.resize(mask, (new_w, new_h), interpolation=cv2.INTER_NEAREST)
			
 
				+        return rgb_rs, mask_rs, (w, h)
			
 
				+
			
 
				+    def _opencv_inpaint(self, image: np.ndarray, mask: np.ndarray) -> np.ndarray:
			
 
				+        """OpenCV Telea inpainting 回退（非GAN）。"""
			
 
				+        logger.info("使用 OpenCV inpainting 回退")
			
 
				+        mask_u8 = mask.astype(np.uint8) * 255
			
 
				+        return cv2.inpaint(image, mask_u8, inpaintRadius=5, flags=cv2.INPAINT_TELEA)
			
 
				+
			
 
				+
			
 
				+if __name__ == "__main__":
			
 
				+    # 快速功能测试
			
 
				+    print("LaMa 后端检测:")
			
 
				+    print(f"  simple_lama_inpainting: {_check_simple_lama()}")
			
 
				+    repo = _check_lama_repo()
			
 
				+    print(f"  lama_repo:              {repo}")
			
 
				+    inpaint = LamaInpainter(device="cpu")
			
 
				+    print(f"  is_available:           {inpaint.is_available}")
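
`LamaInpainter` tries its backends in priority order (`simple_lama`, then `lama_repo`) and falls back to OpenCV when none is usable. The following is a minimal sketch of that selection pattern, assuming each backend exposes a cheap availability probe; `pick_backend` and the probe callables are hypothetical names, not part of the module.

```python
from typing import Callable, Sequence, Tuple

Probe = Tuple[str, Callable[[], bool]]


def pick_backend(probes: Sequence[Probe], fallback: str = "opencv") -> str:
    """Return the name of the first backend whose probe succeeds.

    Probes are tried in the order given, so the list order encodes priority.
    A probe that raises is treated as "unavailable" instead of aborting,
    which keeps an optional dependency from breaking the whole pipeline.
    """
    for name, probe in probes:
        try:
            if probe():
                return name
        except Exception:
            continue
    return fallback


if __name__ == "__main__":
    chosen = pick_backend([
        ("simple_lama", lambda: False),  # e.g. pip package not installed
        ("lama_repo", lambda: True),     # e.g. local checkout found
    ])
    print(chosen)
```

Keeping the probes separate from model loading means availability can be reported without paying the cost of loading weights.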
			
--- a/ocr_tools/gan_experiments_lab/test_images/input/彭_广东兴宁农村商业银行_page_002.png
+++ b/ocr_tools/gan_experiments_lab/test_images/input/彭_广东兴宁农村商业银行_page_002.png
--- a/ocr_tools/gan_experiments_lab/watermark_synthesis.py
+++ b/ocr_tools/gan_experiments_lab/watermark_synthesis.py
@@ -0,0 +1,222 @@
 
				+"""
			
 
				+水印合成脚本：在clean图片上叠加斜向浅色文字水印，输出带水印图 + 精确mask。
			
 
				+
			
 
				+用法:
			
 
				+    python watermark_synthesis.py                          # 默认参数演示
			
 
				+    python watermark_synthesis.py --input ./test_images/clean/   # 指定输入目录
			
 
				+    python watermark_synthesis.py --text "SAMPLE" --opacity 0.15 --angle 45
			
 
				+"""
			
 
				+from __future__ import annotations
			
 
				+
			
 
				+import argparse
			
 
				+import math
			
 
				+from pathlib import Path
			
 
				+from typing import Optional
			
 
				+
			
 
				+import cv2
			
 
				+import numpy as np
			
 
				+from loguru import logger
			
 
				+from PIL import Image, ImageDraw, ImageFont
			
 
				+
			
 
				+
			
 
				+def _find_font() -> str:
			
 
				+    """查找可用中文字体，找不到返回默认字体。"""
			
 
				+    candidates = [
			
 
				+        "/System/Library/Fonts/PingFang.ttc",
			
 
				+        "/System/Library/Fonts/STHeiti Light.ttc",
			
 
				+        "/System/Library/Fonts/Hiragino Sans GB.ttc",
			
 
				+        "/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc",
			
 
				+        "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",
			
 
				+    ]
			
 
				+    for fp in candidates:
			
 
				+        if Path(fp).exists():
			
 
				+            return fp
			
 
				+    logger.warning("未找到中文字体，使用PIL默认字体")
			
 
				+    return ""
			
 
				+
			
 
				+
			
 
				+def _text_size_to_font_size(text_height_px: int) -> int:
			
 
				+    """根据目标文字像素高度估算 font_size。"""
			
 
				+    return int(text_height_px * 1.15)
			
 
				+
			
 
				+
			
 
				+def _render_watermark_tile(
			
 
				+    pil_img: Image.Image,
			
 
				+    text: str,
			
 
				+    font_path: str,
			
 
				+    font_size: int,
			
 
				+    opacity: float,
			
 
				+    angle_deg: float,
			
 
				+    spacing_x: int,
			
 
				+    spacing_y: int,
			
 
				+) -> tuple[np.ndarray, np.ndarray]:
			
 
				+    """
			
 
				+    在图上平铺斜向水印文字，返回 (watermarked_np, mask_np)。
			
 
				+
			
 
				+    mask_np: H×W bool, True=水印像素位置。
			
 
				+    """
			
 
				+    w, h = pil_img.size
			
 
				+    text_height = int(font_size / 1.15)
			
 
				+    gray_value = int(255 * (1 - opacity))
			
 
				+
			
 
				+    # 创建水印文字mask（稍大画布以覆盖旋转后区域）
			
 
				+    diag = int(math.sqrt(w * w + h * h)) + text_height * 4
			
 
				+    tile_w = diag
			
 
				+    tile_h = diag
			
 
				+
			
 
				+    tile = Image.new("L", (tile_w, tile_h), 0)
			
 
				+    draw = ImageDraw.Draw(tile)
			
 
				+    font = ImageFont.truetype(font_path, font_size) if font_path else ImageFont.load_default()
			
 
				+
			
 
				+    # 步长取spacing + 文字大小，确保均匀分布
			
 
				+    step_x = text_height + spacing_x
			
 
				+    step_y = text_height + spacing_y
			
 
				+
			
 
				+    for y in range(0, tile_h, step_y):
			
 
				+        for x in range(0, tile_w, step_x):
			
 
				+            draw.text((x, y), text, fill=255, font=font)
			
 
				+
			
 
				+    # 旋转
			
 
				+    tile_rot = tile.rotate(angle_deg, expand=False, fillcolor=0)
			
 
				+
			
 
				+    # 裁剪到原图大小（中心对齐）
			
 
				+    cx, cy = tile_rot.size[0] // 2, tile_rot.size[1] // 2
			
 
				+    left = cx - w // 2
			
 
				+    top = cy - h // 2
			
 
				+    watermark_tile = tile_rot.crop((left, top, left + w, top + h))
			
 
				+
			
 
				+    mask_np = np.array(watermark_tile) > 0
			
 
				+
			
 
				+    # 叠加到原图
			
 
				+    base = np.array(pil_img.convert("RGB"))
			
 
				+    alpha = opacity
			
 
				+    result = base.copy()
			
 
				+    result[mask_np] = (
			
 
				+        base[mask_np].astype(np.float32) * (1 - alpha)
			
 
				+        + np.array([gray_value, gray_value, gray_value], dtype=np.float32) * alpha
			
 
				+    ).astype(np.uint8)
			
 
				+
			
 
				+    return result, mask_np
			
 
				+
			
 
				+
			
 
				+def synthesize_watermark(
			
 
				+    input_path: Path,
			
 
				+    output_dir: Path,
			
 
				+    *,
			
 
				+    text: str = "SAMPLE",
			
 
				+    font_path: str = "",
			
 
				+    text_height_px: int = 36,
			
 
				+    opacity: float = 0.12,
			
 
				+    angle_deg: float = 45.0,
			
 
				+    spacing_x: int = 180,
			
 
				+    spacing_y: int = 180,
			
 
				+    save_mask: bool = True,
			
 
				+) -> Path:
			
 
				+    """
			
 
				+    在输入图片上合成水印，输出到 output_dir。
			
 
				+
			
 
				+    Returns:
			
 
				+        合成后的图片路径
			
 
				+    """
			
 
				+    output_dir.mkdir(parents=True, exist_ok=True)
			
 
				+    pil_img = Image.open(str(input_path)).convert("RGB")
			
 
				+
			
 
				+    fp = font_path or _find_font()
			
 
				+    font_size = _text_size_to_font_size(text_height_px)
			
 
				+
			
 
				+    logger.info(
			
 
				+        f"合成水印: {input_path.name} | "
			
 
				+        f"text='{text}' font_size={font_size} opacity={opacity} angle={angle_deg}"
			
 
				+    )
			
 
				+
			
 
				+    result_np, mask_np = _render_watermark_tile(
			
 
				+        pil_img, text, fp, font_size, opacity, angle_deg, spacing_x, spacing_y
			
 
				+    )
			
 
				+
			
 
				+    out_name = f"{input_path.stem}_watermarked{input_path.suffix}"
			
 
				+    out_path = output_dir / out_name
			
 
				+    Image.fromarray(result_np).save(str(out_path))
			
 
				+    logger.info(f"  水印图: {out_path}")
			
 
				+
			
 
				+    if save_mask:
			
 
				+        mask_path = output_dir / f"{input_path.stem}_mask.png"
			
 
				+        cv2.imwrite(str(mask_path), (mask_np.astype(np.uint8) * 255))
			
 
				+        logger.info(f"  mask:   {mask_path}")
			
 
				+
			
 
				+    return out_path
			
 
				+
			
 
				+
			
 
				+def main():
			
 
				+    parser = argparse.ArgumentParser(description="水印合成工具")
			
 
				+    parser.add_argument("--input", type=Path, default=None,
			
 
				+                        help="输入图片或目录（默认: test_images/clean/）")
			
 
				+    parser.add_argument("--output", type=Path, default=None,
			
 
				+                        help="输出目录（默认: test_images/synthetic/）")
			
 
				+    parser.add_argument("--text", type=str, default="行内内部使用",
			
 
				+                        help="水印文字内容")
			
 
				+    parser.add_argument("--text-height", type=int, default=48,
			
 
				+                        help="文字像素高度（默认48）")
			
 
				+    parser.add_argument("--opacity", type=float, default=0.10,
			
 
				+                        help="水印透明度 0~1（默认0.10）")
			
 
				+    parser.add_argument("--angle", type=float, default=45.0,
			
 
				+                        help="水印倾斜角度（默认45°）")
			
 
				+    parser.add_argument("--spacing-x", type=int, default=250,
			
 
				+                        help="水印文字水平间距（默认250px）")
			
 
				+    parser.add_argument("--spacing-y", type=int, default=250,
			
 
				+                        help="水印文字垂直间距（默认250px）")
			
 
				+    parser.add_argument("--font", type=str, default="",
			
 
				+                        help="字体文件路径")
			
 
				+    parser.add_argument("--no-mask", action="store_true",
			
 
				+                        help="不保存mask")
			
 
				+    parser.add_argument("--demo", action="store_true",
			
 
				+                        help="使用input目录下第一张测试图生成演示图")
			
 
				+    args = parser.parse_args()
			
 
				+
			
 
				+    root = Path(__file__).parent
			
 
				+    input_dir = args.input or (root / "test_images" / "clean")
			
 
				+    output_dir = args.output or (root / "test_images" / "synthetic")
			
 
				+
			
 
				+    if args.demo:
			
 
				+        # 无clean图时，直接用input目录的水印图再加一层合成水印做演示
			
 
				+        img_files = sorted(root.glob("test_images/input/*"))
			
 
				+        if not img_files:
			
 
				+            logger.error("test_images/input/ 下没有测试图片，请放入图片后重试")
			
 
				+            return
			
 
				+        input_dir = root / "test_images" / "input"
			
 
				+        output_dir = root / "test_images" / "synthetic"
			
 
				+
			
 
				+    input_dir = Path(input_dir)
			
 
				+    output_dir = Path(output_dir)
			
 
				+    output_dir.mkdir(parents=True, exist_ok=True)
			
 
				+
			
 
				+    if input_dir.is_dir():
			
 
				+        img_files = sorted([
			
 
				+            f for f in input_dir.iterdir()
			
 
				+            if f.suffix.lower() in {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff", ".webp"}
			
 
				+        ])
			
 
				+    elif input_dir.is_file():
			
 
				+        img_files = [input_dir]
			
 
				+    else:
			
 
				+        logger.error(f"输入路径不存在: {input_dir}")
			
 
				+        return
			
 
				+
			
 
				+    if not img_files:
			
 
				+        logger.warning(f"{input_dir} 下没有图片文件")
			
 
				+        return
			
 
				+
			
 
				+    for f in img_files:
			
 
				+        synthesize_watermark(
			
 
				+            f, output_dir,
			
 
				+            text=args.text,
			
 
				+            font_path=args.font,
			
 
				+            text_height_px=args.text_height,
			
 
				+            opacity=args.opacity,
			
 
				+            angle_deg=args.angle,
			
 
				+            spacing_x=args.spacing_x,
			
 
				+            spacing_y=args.spacing_y,
			
 
				+            save_mask=not args.no_mask,
			
 
				+        )
			
 
				+
			
 
				+
			
 
				+if __name__ == "__main__":
			
 
				+    main()
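
`_render_watermark_tile` blends each masked pixel as `base * (1 - alpha) + gray_value * alpha`, where `gray_value` is itself derived from `opacity`. The following is a minimal single-channel sketch of that arithmetic, assuming 8-bit values and truncation on conversion (as `astype(np.uint8)` does for in-range values); `blend_pixel` is a hypothetical helper for illustration.

```python
def blend_pixel(base: int, opacity: float) -> int:
    """Blend one 8-bit channel value with the synthetic watermark gray.

    Mirrors the per-pixel formula in _render_watermark_tile:
    the watermark gray is 255 * (1 - opacity), and it is mixed into the
    base pixel with weight `opacity`.
    """
    gray_value = int(255 * (1 - opacity))
    blended = base * (1 - opacity) + gray_value * opacity
    return int(blended)  # truncate toward zero, like astype(np.uint8)


if __name__ == "__main__":
    print(blend_pixel(255, 0.10))  # white background pixel
    print(blend_pixel(0, 0.10))    # black text pixel
```

Because `opacity` enters twice (once in the gray level, once as the blend weight), a white background pixel at `opacity=0.10` only moves from 255 to 252, so low opacity values produce very faint synthetic watermarks.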
			
--- a/ocr_tools/universal_doc_parser/config/bank_statement_glm_vl.yaml
+++ b/ocr_tools/universal_doc_parser/config/bank_statement_glm_vl.yaml
@@ -19,46 +19,57 @@ preprocessor:
 
				     model_dir: null  # 使用默认路径
			
 
				   unwarping:
			
 
				     enabled: false
			
 
				-  # -------------------------------------------------------
			
 
				-  # 水印去除配置（适用于银行流水浅色斜向文字水印）
			
 
				-  # -------------------------------------------------------
			
 
				+  # Page-level watermark removal (for detailed parameters see PAGE_WATERMARK_PRESETS in ocr_utils/watermark/presets.py)
			
 
				   watermark_removal:
			
 
				-    enabled: false           # 是否启用水印去除
			
 
				-    method: threshold # threshold | masked | masked_adaptive
			
 
				-    threshold: 175          # 全局阈值或掩膜失败时的回退阈值（140-180）
			
 
				-    morph_close_kernel: 0   # 去水印后灰度图闭运算，0 跳过
			
 
				-    # 去水印后对比度增强（text_restore 将笔画拉深，比全局 gamma 更接近原图）
			
 
				+    enabled: false
			
 
				+    detect_before_remove: true
			
 
				+    method: threshold   # threshold | masked | masked_adaptive
			
 
				+    threshold: 175
			
 
				     contrast_enhancement:
			
 
				-      enabled: true
			
 
				-      method: text_restore   # text_restore | clahe | gamma | linear
			
 
				-      text_black_target: 85  # 略提高，减轻去水印后笔画被拉花（原 75 过深）
			
 
				-      background_threshold: 248
			
 
				-      text_lo_percentile: 1.0
			
 
				-      text_hi_percentile: 99.0
			
 
				-      gamma: 0.75            # method=gamma 时生效
			
 
				-      clip_limit: 2.0        # method=clahe
			
 
				-      tile_grid_size: 8
			
 
				-      black_percentile: 2.0  # method=linear
			
 
				-      white_percentile: 98.0
			
 
				+      enabled: false
			
 
				+      method: text_restore
			
 
				+      text_black_target: 85
			
 
				     debug_options:
			
 
				-      enabled: false              # 由命令行 --debug / --debug-layout 统一控制
			
 
				-      output_dir: null            # null 时使用 pipeline 输出目录
			
 
				-      prefix: ""                  # 文件名前缀（运行时注入 page_name）
			
 
				-      subdir: watermark_removal   # 输出至 debug/watermark_removal/
			
 
				-      save_compare: true          # 保存左右对比图 *_watermark_compare.*
			
 
				-      image_format: "png"         # jpg / png
			
 
				+      enabled: false
			
 
				+      output_dir: null
			
 
				+      prefix: ""
			
 
				+      subdir: watermark_removal
			
 
				+      save_compare: true
			
 
				+      image_format: "png"
			
 
				 
			
 
				 # ============================================================
			
 
				-# Layout 检测配置 - 使用 PP-DocLayoutV3
			
 
				+# Layout 检测配置 - 智能路由器（按场景直接选择模型）
			
 
				 # ============================================================
			
 
				 layout_detection:
			
 
				-  module: "paddle"
			
 
				-  model_name: "PP-DocLayoutV3"
			
 
				-  model_dir: "PaddlePaddle/PP-DocLayoutV3_safetensors"
			
 
				-  device: "cpu"
			
 
				-  conf: 0.3
			
 
				-  num_threads: 4
			
 
				-  batch_size: 1
			
 
				+  module: "smart_router"
			
 
				+  strategy: "scene"  # 按场景直接选择模型，不走ocr_eval
			
 
				+
			
 
				+  # 场景策略：指定场景直接选用的布局模型
			
 
				+  scene_strategy:
			
 
				+    bank_statement:
			
 
				+      model: "docling"
			
 
				+    financial_report:
			
 
				+      model: "paddle_ppdoclayoutv3"
			
 
				+  default_model: "docling"
			
 
				+
			
 
				+  # 配置多个模型
			
 
				+  models:
			
 
				+    docling:
			
 
				+      module: "docling"
			
 
				+      model_name: "docling-layout-old"
			
 
				+      model_dir: "ds4sd/docling-layout-old"
			
 
				+      device: "cpu"
			
 
				+      conf: 0.3
			
 
				+      num_threads: 4
			
 
				+
			
 
				+    paddle_ppdoclayoutv3:
			
 
				+      module: "paddle"
			
 
				+      model_name: "PP-DocLayoutV3"
			
 
				+      model_dir: "PaddlePaddle/PP-DocLayoutV3_safetensors"
			
 
				+      device: "cpu"
			
 
				+      conf: 0.3
			
 
				+      num_threads: 4
			
 
				+      batch_size: 1
			
 
				   
			
 
				   # 后处理配置
			
 
				   post_process:
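A minimal sketch of how the `scene_strategy` / `default_model` / `models` keys in this hunk could be resolved at runtime. The function name `select_layout_model` and the fallback behaviour for unknown scenes are assumptions made for illustration; the real smart router may resolve models differently.

```python
from typing import Any, Dict, Mapping


def select_layout_model(scene: str, layout_cfg: Mapping[str, Any]) -> Dict[str, Any]:
    """Return the model config for a scene (hypothetical helper, not the project's API).

    Looks the scene up in scene_strategy; scenes that are not listed
    fall back to default_model (assumed behaviour).
    """
    strategy = layout_cfg.get("scene_strategy", {})
    model_key = strategy.get(scene, {}).get("model", layout_cfg["default_model"])
    # Copy the model entry and record which key was chosen.
    return dict(layout_cfg["models"][model_key], key=model_key)


# Trimmed copy of the values added in this hunk.
LAYOUT_CFG = {
    "scene_strategy": {
        "bank_statement": {"model": "docling"},
        "financial_report": {"model": "paddle_ppdoclayoutv3"},
    },
    "default_model": "docling",
    "models": {
        "docling": {"module": "docling", "model_name": "docling-layout-old"},
        "paddle_ppdoclayoutv3": {"module": "paddle", "model_name": "PP-DocLayoutV3"},
    },
}

chosen = select_layout_model("financial_report", LAYOUT_CFG)
```

With the values above, `financial_report` resolves to the `paddle` module, while a scene that is not listed resolves to the `docling` entry through `default_model`.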
			
@@ -70,7 +81,7 @@ layout_detection:
 
				 
			
 
				   # Debug 可视化（底图为 inference_image，与 Layout 检测输入一致）
			
 
				   debug_options:
			
 
				-    enabled: true              # 由命令行 --debug / --debug-layout 控制
			
 
				+    enabled: false              # 由命令行 --debug / --debug-layout 控制
			
 
				     output_dir: null            # null 时由 pipeline 按页注入
			
 
				     prefix: ""
			
 
				     subdir: layout_detection    # 输出至 debug/layout_detection/
			
@@ -80,7 +91,123 @@ layout_detection:
 
				     image_format: "png"
			
 
				 
			
 
				 # ============================================================
			
 
				-# VL识别配置 - 使用 GLM-OCR
			
 
				+# OCR recognition config
			
 
				+# ============================================================
			
 
				+ocr_recognition:
			
 
				+  module: "mineru"
			
 
				+  language: "ch"
			
 
				+  det_threshold: 0.5
			
 
				+  unclip_ratio: 1.5
			
 
				+  enable_merge_det_boxes: false
			
 
				+  batch_size: 8
			
 
				+  device: "cpu"
			
 
				+
			
 
				+  # Debug visualization (base image is inference_image, the same as the full-page OCR input)
			
 
				+  debug_options:
			
 
				+    enabled: false              # controlled by the command-line flags --debug / --debug-ocr
			
 
				+    output_dir: null
			
 
				+    prefix: ""
			
 
				+    subdir: ocr_recognition     # written to debug/ocr_recognition/
			
 
				+    save_json: true
			
 
				+    image_format: png
			
 
				+
			
 
				+# ============================================================
			
 
				+# Table classification config (automatically distinguishes wired/wireless tables)
			
 
				+# ============================================================
			
 
				+table_classification:
			
 
				+  enabled: true               # enable automatic table classification
			
 
				+  module: "paddle"            # classification model: paddle (MinerU PaddleTableClsModel)
			
 
				+  confidence_threshold: 0.5   # classification confidence threshold
			
 
				+  batch_size: 16              # batch size
			
 
				+
			
 
				+  # Debug visualization config
			
 
				+  debug_options:
			
 
				+    enabled: false              # controlled uniformly by the command-line flags --debug / --debug-table
			
 
				+    output_dir: null            # when null, injected per page by the pipeline
			
 
				+    prefix: ""
			
 
				+    subdir: table_classification  # written to debug/table_classification/
			
 
				+    save_table_lines: true      # paddle line-detection overlay image
			
 
				+    image_format: "png"
			
 
				+
			
 
				+# ============================================================
			
 
				+# Wired-table recognition config (MinerU UNet)
			
 
				+# ============================================================
			
 
				+table_recognition_wired:
			
 
				+  use_wired_unet: false      # do not use wired-table recognition
			
 
				+  upscale_ratio: 3.333
			
 
				+  need_ocr: true
			
 
				+  row_threshold: 10
			
 
				+  col_threshold: 15
			
 
				+  ocr_conf_threshold: 0.9       # cell OCR confidence threshold
			
 
				+  cell_crop_margin: 2
			
 
				+  use_custom_postprocess: true  # whether to use custom post-processing (enabled by default)
			
 
				+
			
 
				+  # Whether to enable deskewing
			
 
				+  enable_deskew: true
			
 
				+
			
 
				+  # 🆕 Enable multi-source cell fusion
			
 
				+  use_cell_fusion: true
			
 
				+  
			
 
				+  # Fusion engine config
			
 
				+  cell_fusion:
			
 
				+    # RT-DETR model path (required)
			
 
				+    rtdetr_model_path: "/Users/zhch158/models/pytorch_models/Table/RT-DETR-L_wired_table_cell_det.onnx"
			
 
				+    
			
 
				+    # Fusion weights
			
 
				+    unet_weight: 0.6        # UNet weight (strong on structure)
			
 
				+    rtdetr_weight: 0.4      # RT-DETR weight (strong on robustness)
			
 
				+    
			
 
				+    # Thresholds
			
 
				+    iou_merge_threshold: 0.7    # high-IoU merge threshold (weighted average when IoU > 0.7)
			
 
				+    iou_nms_threshold: 0.5      # NMS deduplication threshold
			
 
				+    rtdetr_conf_threshold: 0.5  # RT-DETR confidence threshold
			
 
				+    
			
 
				+    # Feature switches
			
 
				+    enable_ocr_compensation: true      # enable OCR edge compensation
			
 
				+
			
 
				+  # Cell second-pass OCR (det line splitting + whole-cell/strip fallback + stroke-enhancement retry for low scores)
			
 
				+  second_pass_ocr:
			
 
				+    reocr_mode: bank_statement       # empty body cells always rerun; empty cells also rerun when most cells in the same row are non-empty
			
 
				+    header_row: 0                    # header row index (0 = first row)
			
 
				+    row_peer_min_nonempty: 5         # when at least N cells in the same row are non-empty, an empty cell also triggers second-pass OCR
			
 
				+    line_min_score: 0.8              # lines scoring below this are dropped from both the text and the score
			
 
				+    drop_low_score_blocks: true
			
 
				+    whole_cell_fallback: true        # whole-cell fallback with det=False, plus strip scanning
			
 
				+    prefer_whole_on_tie: true
			
 
				+    whole_longer_min_extra_chars: 2  # prefer the whole-cell/strip text when it is at least N characters longer than the per-line text
			
 
				+    strip_fallback_aspect_ratio: 1.8 # when height/width >= this value and at most 1 line is detected, split lines with a sliding strip
			
 
				+    suspicious_short_min_chars: 4    # high-scoring but too-short text still runs the whole-cell/strip fallback (independent of enhance_retry)
			
 
				+    cell_preprocess:
			
 
				+      watermark:
			
 
				+        enabled: true
			
 
				+        method: threshold
			
 
				+      denoise:
			
 
				+        enabled: false   # median filtering tends to blur strokes in small cells; compare with --denoise in the lab tool
			
 
				+      contrast:
			
 
				+        enabled: false   # optional after Pass1 watermark removal; compare text_restore in the lab tool
			
 
				+        method: text_restore
			
 
				+        text_black_target: 88
			
 
				+      light:
			
 
				+        upscale_min_side: 192  # 128, 192: for hard-case date columns
			
 
				+    enhance_retry:
			
 
				+      enabled: false
			
 
				+      # when enabled: true, Pass2 preprocessing runs; see the code for defaults (cell_preprocess.enhance_retry is deprecated)
			
 
				+
			
 
				+  # Debug visualization config
			
 
				+  debug_options:
			
 
				+    enabled: false              # controlled uniformly by the command-line flags --debug / --debug-table
			
 
				+    output_dir: null            # when null, injected per page by the pipeline
			
 
				+    prefix: ""
			
 
				+    subdir: table_recognition_wired  # written to debug/table_recognition_wired/
			
 
				+    save_table_lines: true
			
 
				+    save_connected_components: true
			
 
				+    save_grid_structure: true
			
 
				+    save_text_overlay: true
			
 
				+    image_format: "png"
			
 
				+    # Cell second-pass OCR crops: debug/table_recognition_wired/tablecell_ocr/
			
 
				+
			
 
				+# ============================================================
			
 
				+# VL recognition config - uses GLM-OCR (wireless tables + seal recognition)
			
 
				 # ============================================================
			
 
				 vl_recognition:
			
 
				   module: "glmocr"
			
@@ -116,29 +243,6 @@ vl_recognition:
 
				   
			
 
				   # 场景特定配置
			
 
				   table_recognition:
			
 
				-    bank_statement_mode: true
			
 
				-
			
 
				-# ============================================================
			
 
				-# OCR识别配置
			
 
				-# ============================================================
			
 
				-ocr_recognition:
			
 
				-  module: "mineru" 
			
 
				-  language: "ch"
			
 
				-  det_threshold: 0.6
			
 
				-  unclip_ratio: 1.5
			
 
				-  enable_merge_det_boxes: false
			
 
				-  batch_size: 8
			
 
				-  device: "cpu"
			
 
				-
			
 
				-
			
 
				-  # Debug 可视化（底图为 inference_image，与整页 OCR 输入一致）
			
 
				-  debug_options:
			
 
				-    enabled: false              # 由命令行 --debug / --debug-ocr 控制
			
 
				-    output_dir: null
			
 
				-    prefix: ""
			
 
				-    subdir: ocr_recognition     # 输出至 debug/ocr_recognition/
			
 
				-    save_json: true
			
 
				-    image_format: png
			
 
				 
			
 
				 # ============================================================
			
 
				 # 输出配置
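The `cell_fusion` settings added in this file (`unet_weight: 0.6`, `rtdetr_weight: 0.4`, `iou_merge_threshold: 0.7`) describe a weighted average of two detections of the same cell when their IoU is high. The sketch below illustrates that idea only; the function names and the `(x1, y1, x2, y2)` box layout are assumptions, and the project's fusion engine may handle unmatched boxes, NMS, and OCR compensation differently.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_if_overlapping(
    unet_box: Box,
    rtdetr_box: Box,
    unet_weight: float = 0.6,
    rtdetr_weight: float = 0.4,
    iou_merge_threshold: float = 0.7,
) -> Optional[Box]:
    """Weighted-average the two boxes when IoU exceeds the threshold, else None."""
    if iou(unet_box, rtdetr_box) <= iou_merge_threshold:
        return None  # treated as different cells in this sketch
    total = unet_weight + rtdetr_weight
    merged = tuple(
        (unet_weight * u + rtdetr_weight * r) / total
        for u, r in zip(unet_box, rtdetr_box)
    )
    return merged  # type: ignore[return-value]


merged = merge_if_overlapping((0, 0, 100, 50), (0, 0, 110, 50))
```

In this example the two boxes overlap with an IoU of about 0.91, so the right edge is pulled to roughly 104, closer to the UNet box because of its larger weight.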
			
--- a/ocr_tools/universal_doc_parser/config/bank_statement_mineru_vl.yaml
+++ b/ocr_tools/universal_doc_parser/config/bank_statement_mineru_vl.yaml
@@ -19,35 +19,27 @@ preprocessor:
 
				     model_dir: null  # 使用默认路径
			
 
				   unwarping:
			
 
				     enabled: false
			
 
				-  # -------------------------------------------------------
			
 
				-  # 水印去除配置（适用于银行流水浅色斜向文字水印）
			
 
				-  # -------------------------------------------------------
			
 
				+  # Page-level watermark (for detailed parameters see ocr_utils/watermark/presets.py PAGE_WATERMARK_PRESETS)
			
 
				   watermark_removal:
			
 
				-    enabled: false           # 是否启用水印去除
			
 
				-    method: threshold # threshold | masked | masked_adaptive
			
 
				-    threshold: 175          # 全局阈值或掩膜失败时的回退阈值（140-180）
			
 
				-    morph_close_kernel: 0   # 去水印后灰度图闭运算，0 跳过
			
 
				-    # 去水印后对比度增强（text_restore 将笔画拉深，比全局 gamma 更接近原图）
			
 
				+    enabled: false
			
 
				+    detect_before_remove: true
			
 
				+    method: threshold   # threshold | masked | masked_adaptive
			
 
				+    threshold: 175
			
 
				     contrast_enhancement:
			
 
				-      enabled: true
			
 
				-      method: text_restore   # text_restore | clahe | gamma | linear
			
 
				-      text_black_target: 85  # 略提高，减轻去水印后笔画被拉花（原 75 过深）
			
 
				-      background_threshold: 248
			
 
				-      text_lo_percentile: 1.0
			
 
				-      text_hi_percentile: 99.0
			
 
				-      gamma: 0.75            # method=gamma 时生效
			
 
				-      clip_limit: 2.0        # method=clahe
			
 
				-      tile_grid_size: 8
			
 
				-      black_percentile: 2.0  # method=linear
			
 
				-      white_percentile: 98.0
			
 
				+      enabled: false
			
 
				+      method: text_restore
			
 
				+      text_black_target: 85
			
 
				     debug_options:
			
 
				-      enabled: false              # 由命令行 --debug / --debug-layout 统一控制
			
 
				-      output_dir: null            # null 时使用 pipeline 输出目录
			
 
				-      prefix: ""                  # 文件名前缀（运行时注入 page_name）
			
 
				-      subdir: watermark_removal   # 输出至 debug/watermark_removal/
			
 
				-      save_compare: true          # 保存左右对比图 *_watermark_compare.*
			
 
				-      image_format: "png"         # jpg / png
			
 
				+      enabled: false
			
 
				+      output_dir: null
			
 
				+      prefix: ""
			
 
				+      subdir: watermark_removal
			
 
				+      save_compare: true
			
 
				+      image_format: "png"
			
 
				 
			
 
				+# ============================================================
			
 
				+# Layout 检测配置 - 智能路由器（按场景直接选择模型）
			
 
				+# ============================================================
			
 
				 layout_detection:
			
 
				   # MinerU-VL layout（通过 VLM 服务做版式检测）
			
 
				   module: "mineru_vl"
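For reference, the removed `bank_statement_yusys_v2.yaml` comment describes `threshold` as a gray-level cutoff: pixels brighter than the value are treated as watermark and turned white. A minimal pure-Python sketch of that rule follows, assuming an 8-bit grayscale image stored as nested lists; the function name is hypothetical and the real `threshold` method in `removal.py` may do more (for example detection beforehand or contrast restoration afterwards).

```python
from typing import List


def remove_watermark_threshold(gray: List[List[int]], threshold: int = 175) -> List[List[int]]:
    """Whiten every pixel brighter than `threshold`; darker pixels are kept unchanged.

    A higher threshold is more conservative (watermark may remain);
    a lower one is more aggressive (light body text may be lost).
    """
    return [[255 if px > threshold else px for px in row] for row in gray]


# 30 and 90 stand for dark text strokes, 180 and 200 for a light watermark.
page = [
    [30, 200, 175],
    [180, 90, 255],
]
cleaned = remove_watermark_threshold(page, threshold=175)
```

Pixels equal to the threshold are kept in this sketch; whether the project uses a strict or inclusive comparison is not stated in the diff.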
			
--- a/ocr_tools/universal_doc_parser/config/bank_statement_paddle_vl.yaml
+++ b/ocr_tools/universal_doc_parser/config/bank_statement_paddle_vl.yaml
@@ -19,35 +19,27 @@ preprocessor:
 
				     model_dir: null  # 使用默认路径
			
 
				   unwarping:
			
 
				     enabled: false
			
 
				-  # -------------------------------------------------------
			
 
				-  # 水印去除配置（适用于银行流水浅色斜向文字水印）
			
 
				-  # -------------------------------------------------------
			
 
				+  # Page-level watermark (for detailed parameters see ocr_utils/watermark/presets.py PAGE_WATERMARK_PRESETS)
			
 
				   watermark_removal:
			
 
				-    enabled: false           # 是否启用水印去除
			
 
				-    method: threshold # threshold | masked | masked_adaptive
			
 
				-    threshold: 175          # 全局阈值或掩膜失败时的回退阈值（140-180）
			
 
				-    morph_close_kernel: 0   # 去水印后灰度图闭运算，0 跳过
			
 
				-    # 去水印后对比度增强（text_restore 将笔画拉深，比全局 gamma 更接近原图）
			
 
				+    enabled: false
			
 
				+    detect_before_remove: true
			
 
				+    method: threshold   # threshold | masked | masked_adaptive
			
 
				+    threshold: 175
			
 
				     contrast_enhancement:
			
 
				-      enabled: true
			
 
				-      method: text_restore   # text_restore | clahe | gamma | linear
			
 
				-      text_black_target: 85  # 略提高，减轻去水印后笔画被拉花（原 75 过深）
			
 
				-      background_threshold: 248
			
 
				-      text_lo_percentile: 1.0
			
 
				-      text_hi_percentile: 99.0
			
 
				-      gamma: 0.75            # method=gamma 时生效
			
 
				-      clip_limit: 2.0        # method=clahe
			
 
				-      tile_grid_size: 8
			
 
				-      black_percentile: 2.0  # method=linear
			
 
				-      white_percentile: 98.0
			
 
				+      enabled: false
			
 
				+      method: text_restore
			
 
				+      text_black_target: 85
			
 
				     debug_options:
			
 
				-      enabled: false              # 由命令行 --debug / --debug-layout 统一控制
			
 
				-      output_dir: null            # null 时使用 pipeline 输出目录
			
 
				-      prefix: ""                  # 文件名前缀（运行时注入 page_name）
			
 
				-      subdir: watermark_removal   # 输出至 debug/watermark_removal/
			
 
				-      save_compare: true          # 保存左右对比图 *_watermark_compare.*
			
 
				-      image_format: "png"         # jpg / png
			
 
				+      enabled: false
			
 
				+      output_dir: null
			
 
				+      prefix: ""
			
 
				+      subdir: watermark_removal
			
 
				+      save_compare: true
			
 
				+      image_format: "png"
			
 
				 
			
 
				+# ============================================================
			
 
				+# Layout 检测配置 - 智能路由器（按场景直接选择模型）
			
 
				+# ============================================================
			
 
				 layout_detection:
			
 
				   # module: "paddle"
			
 
				   # model_name: "RT-DETR-H_layout_17cls"
			
@@ -104,7 +96,7 @@ vl_recognition:
 
				 ocr_recognition:
			
 
				   module: "mineru" 
			
 
				   language: "ch"
			
 
				-  det_threshold: 0.6
			
 
				+  det_threshold: 0.5
			
 
				   unclip_ratio: 1.5
			
 
				   enable_merge_det_boxes: false
			
 
				   batch_size: 8
			
--- a/ocr_tools/universal_doc_parser/config/bank_statement_paddle_vl_local.yaml
+++ b/ocr_tools/universal_doc_parser/config/bank_statement_paddle_vl_local.yaml
@@ -22,34 +22,23 @@ preprocessor:
 
				     model_dir: null  # 使用默认路径
			
 
				   unwarping:
			
 
				     enabled: false
			
 
				-  # -------------------------------------------------------
			
 
				-  # 水印去除配置（适用于银行流水浅色斜向文字水印）
			
 
				-  # -------------------------------------------------------
			
 
				+  # Page-level watermark (for detailed parameters see ocr_utils/watermark/presets.py PAGE_WATERMARK_PRESETS)
			
 
				   watermark_removal:
			
 
				-    enabled: false           # 是否启用水印去除
			
 
				-    method: threshold # threshold | masked | masked_adaptive
			
 
				-    threshold: 175          # 全局阈值或掩膜失败时的回退阈值（140-180）
			
 
				-    morph_close_kernel: 0   # 去水印后灰度图闭运算，0 跳过
			
 
				-    # 去水印后对比度增强（text_restore 将笔画拉深，比全局 gamma 更接近原图）
			
 
				+    enabled: false
			
 
				+    detect_before_remove: true
			
 
				+    method: threshold   # threshold | masked | masked_adaptive
			
 
				+    threshold: 175
			
 
				     contrast_enhancement:
			
 
				-      enabled: true
			
 
				-      method: text_restore   # text_restore | clahe | gamma | linear
			
 
				-      text_black_target: 85  # 略提高，减轻去水印后笔画被拉花（原 75 过深）
			
 
				-      background_threshold: 248
			
 
				-      text_lo_percentile: 1.0
			
 
				-      text_hi_percentile: 99.0
			
 
				-      gamma: 0.75            # method=gamma 时生效
			
 
				-      clip_limit: 2.0        # method=clahe
			
 
				-      tile_grid_size: 8
			
 
				-      black_percentile: 2.0  # method=linear
			
 
				-      white_percentile: 98.0
			
 
				+      enabled: false
			
 
				+      method: text_restore
			
 
				+      text_black_target: 85
			
 
				     debug_options:
			
 
				-      enabled: false              # 由命令行 --debug / --debug-layout 统一控制
			
 
				-      output_dir: null            # null 时使用 pipeline 输出目录
			
 
				-      prefix: ""                  # 文件名前缀（运行时注入 page_name）
			
 
				-      subdir: watermark_removal   # 输出至 debug/watermark_removal/
			
 
				-      save_compare: true          # 保存左右对比图 *_watermark_compare.*
			
 
				-      image_format: "png"         # jpg / png
			
 
				+      enabled: false
			
 
				+      output_dir: null
			
 
				+      prefix: ""
			
 
				+      subdir: watermark_removal
			
 
				+      save_compare: true
			
 
				+      image_format: "png"
			
 
				 
			
 
				 # ============================================================
			
 
				 # Layout 检测配置 - 智能路由器（按场景直接选择模型）
			
@@ -180,13 +169,33 @@ table_recognition_wired:
 
				     # 功能开关
			
 
				     enable_ocr_compensation: true      # 启用OCR边缘补偿
			
 
				 
			
 
				-
			
 
				-  # 单元格二次 OCR（det 分行 + 整格兜底 + 低分块过滤）
			
 
				+  # 单元格二次 OCR（det 分行 + 整格/条带兜底 + 低分笔画增强重试）
			
 
				   second_pass_ocr:
			
 
				-    line_min_score: 0.8
			
 
				+    reocr_mode: bank_statement       # 表体空单元必跑 + 同行多数非空则空格也跑
			
 
				+    header_row: 0                    # 表头行号（0=首行）
			
 
				+    row_peer_min_nonempty: 5         # 同行至少 N 个非空格时，本格空也触发二次 OCR
			
 
				+    line_min_score: 0.8              # 低于此分的分行从文本与计分中丢弃
			
 
				     drop_low_score_blocks: true
			
 
				-    whole_cell_fallback: true
			
 
				+    whole_cell_fallback: true        # 整格 det=False 兜底 + 条带扫描
			
 
				     prefer_whole_on_tie: true
			
 
				+    whole_longer_min_extra_chars: 2  # 整格/条带文本比分行多长至少 N 字则优先
			
 
				+    strip_fallback_aspect_ratio: 1.8 # 高/宽>=该值且仅检出<=1行时滑动条带分行
			
 
				+    suspicious_short_min_chars: 4    # 高分但过短仍跑整格/条带兜底（与 enhance_retry 无关）
			
 
				+    cell_preprocess:
			
 
				+      watermark:
			
 
				+        enabled: true
			
 
				+        method: threshold
			
 
				+      denoise:
			
 
				+        enabled: false   # 小格 median 易糊笔画；lab 用 --denoise 对比
			
 
				+      contrast:
			
 
				+        enabled: false   # Pass1 去水印后可选；lab 对比 text_restore
			
 
				+        method: text_restore
			
 
				+        text_black_target: 88
			
 
				+      light:
			
 
				+        upscale_min_side: 192  # 128, 192 用于难例日期列
			
 
				+    enhance_retry:
			
 
				+      enabled: false
			
 
				+      # enabled: true 时 Pass2 预处理，默认见代码（cell_preprocess.enhance_retry 已废弃）
			
 
				 
			
 
				   # Debug 可视化配置
			
 
				   debug_options:
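The `line_min_score` comment in this hunk says that lines scoring below the value are dropped from both the text and the score. One way to read that is sketched below. The `(text, score)` tuple layout, the function name, and the use of a plain mean as the cell score are assumptions for illustration; the actual aggregation in `text_filling.py` is not shown in this diff.

```python
from typing import List, Tuple


def filter_cell_lines(
    lines: List[Tuple[str, float]], line_min_score: float = 0.8
) -> Tuple[str, float]:
    """Drop low-score lines, then build the cell text and score from what remains."""
    kept = [(text, score) for text, score in lines if score >= line_min_score]
    cell_text = "".join(text for text, _ in kept)
    # Assumed aggregation: mean of the kept line scores, 0.0 when nothing is kept.
    cell_score = sum(score for _, score in kept) / len(kept) if kept else 0.0
    return cell_text, cell_score


ocr_lines = [("2024-01-05", 0.97), ("~", 0.31), ("转账", 0.88)]
text, score = filter_cell_lines(ocr_lines)
```

Here the stray `~` line (score 0.31) is removed, so it neither pollutes the cell text nor drags the cell score below `ocr_conf_threshold`.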
			
--- a/ocr_tools/universal_doc_parser/config/bank_statement_smart_router.yaml
+++ b/ocr_tools/universal_doc_parser/config/bank_statement_smart_router.yaml
@@ -21,35 +21,27 @@ preprocessor:
 
				     model_dir: null  # 使用默认路径
			
 
				   unwarping:
			
 
				     enabled: false
			
 
				-  # -------------------------------------------------------
			
 
				-  # 水印去除配置（适用于银行流水浅色斜向文字水印）
			
 
				-  # -------------------------------------------------------
			
 
				+  # Page-level watermark (for detailed parameters see ocr_utils/watermark/presets.py PAGE_WATERMARK_PRESETS)
			
 
				   watermark_removal:
			
 
				-    enabled: false           # 是否启用水印去除
			
 
				-    method: threshold # threshold | masked | masked_adaptive
			
 
				-    threshold: 175          # 全局阈值或掩膜失败时的回退阈值（140-180）
			
 
				-    morph_close_kernel: 0   # 去水印后灰度图闭运算，0 跳过
			
 
				-    # 去水印后对比度增强（text_restore 将笔画拉深，比全局 gamma 更接近原图）
			
 
				+    enabled: false
			
 
				+    detect_before_remove: true
			
 
				+    method: threshold   # threshold | masked | masked_adaptive
			
 
				+    threshold: 175
			
 
				     contrast_enhancement:
			
 
				-      enabled: true
			
 
				-      method: text_restore   # text_restore | clahe | gamma | linear
			
 
				-      text_black_target: 85  # 略提高，减轻去水印后笔画被拉花（原 75 过深）
			
 
				-      background_threshold: 248
			
 
				-      text_lo_percentile: 1.0
			
 
				-      text_hi_percentile: 99.0
			
 
				-      gamma: 0.75            # method=gamma 时生效
			
 
				-      clip_limit: 2.0        # method=clahe
			
 
				-      tile_grid_size: 8
			
 
				-      black_percentile: 2.0  # method=linear
			
 
				-      white_percentile: 98.0
			
 
				+      enabled: false
			
 
				+      method: text_restore
			
 
				+      text_black_target: 85
			
 
				     debug_options:
			
 
				-      enabled: false              # 由命令行 --debug / --debug-layout 统一控制
			
 
				-      output_dir: null            # null 时使用 pipeline 输出目录
			
 
				-      prefix: ""                  # 文件名前缀（运行时注入 page_name）
			
 
				-      subdir: watermark_removal   # 输出至 debug/watermark_removal/
			
 
				-      save_compare: true          # 保存左右对比图 *_watermark_compare.*
			
 
				-      image_format: "png"         # jpg / png
			
 
				-
			
 
				+      enabled: false
			
 
				+      output_dir: null
			
 
				+      prefix: ""
			
 
				+      subdir: watermark_removal
			
 
				+      save_compare: true
			
 
				+      image_format: "png"
			
 
				+
			
 
				+# ============================================================
			
 
				+# Layout 检测配置 - 智能路由器（按场景直接选择模型）
			
 
				+# ============================================================
			
 
				 layout_detection:
			
 
				   module: "smart_router"
			
 
				   strategy: "ocr_eval"  # ocr_eval（推荐，基于OCR评估选择最佳）, auto（快速模式，基于文档特征）
			
@@ -73,14 +65,6 @@ layout_detection:
 
				       model_dir: null  # 使用默认路径
			
 
				       device: "cpu"
			
 
				     
			
 
				-  # Debug 可视化配置（与 MinerUWiredTableRecognizer.DebugOptions 对齐）
			
 
				-  # 默认关闭。开启后将保存：layout检测结果
			
 
				-  debug_options:
			
 
				-    enabled: true               # 是否开启调试可视化输出
			
 
				-    output_dir: null             # 调试输出目录；null不输出
			
 
				-    prefix: ""                  # 保存文件名前缀（如设置为页码）
			
 
				-
			
 
				-  
			
 
				   # 可选：回退模型（当所有模型都失败时使用）
			
 
				   fallback_model:
			
 
				     module: "mineru"
			
@@ -90,11 +74,25 @@ layout_detection:
 
				   # 后处理配置
			
 
				   post_process:
			
 
				     # 将大面积文本块转换为表格（后处理）
			
 
				-    convert_large_text_to_table: true
			
 
				-    min_text_area_ratio: 0.25
			
 
				-    min_text_width_ratio: 0.4
			
 
				-    min_text_height_ratio: 0.3
			
 
				+    convert_large_text_to_table: true  # 是否启用
			
 
				+    min_text_area_ratio: 0.25         # 最小面积占比（25%）
			
 
				+    min_text_width_ratio: 0.4         # 最小宽度占比（40%）
			
 
				+    min_text_height_ratio: 0.3        # 最小高度占比（30%）
			
 
				+
			
 
				+  # Debug 可视化（底图为 inference_image，与 Layout 检测输入一致）
			
 
				+  debug_options:
			
 
				+    enabled: false              # 由命令行 --debug / --debug-layout 控制
			
 
				+    output_dir: null            # null 时由 pipeline 按页注入
			
 
				+    prefix: ""
			
 
				+    subdir: layout_detection    # 输出至 debug/layout_detection/
			
 
				+    save_raw: true              # 后处理前
			
 
				+    save_post_processed: true   # 后处理后
			
 
				+    save_json: true
			
 
				+    image_format: "png"
			
 
				 
			
 
				+# ============================================================
			
 
				+# OCR 识别配置
			
 
				+# ============================================================
			
 
				 ocr_recognition:
			
 
				   module: "mineru"
			
 
				   language: "ch"
			
@@ -104,7 +102,6 @@ ocr_recognition:
 
				   batch_size: 8
			
 
				   device: "cpu"
			
 
				 
			
 
				-
			
 
				   # Debug 可视化（底图为 inference_image，与整页 OCR 输入一致）
			
 
				   debug_options:
			
 
				     enabled: true              # 由命令行 --debug / --debug-ocr 控制
			
@@ -114,56 +111,100 @@ ocr_recognition:
 
				     save_json: true
			
 
				     image_format: png
			
 
				 
			
 
				+# ============================================================
			
 
				 # 表格分类配置（自动区分有线/无线表格）
			
 
				+# ============================================================
			
 
				 table_classification:
			
 
				   enabled: true               # whether to enable automatic table classification (off by default with manual config; enabled here)
			
 
				   module: "paddle"            # 分类模型：paddle（MinerU PaddleTableClsModel）
			
 
				   confidence_threshold: 0.5   # 分类置信度阈值
			
 
				   batch_size: 16              # 批处理大小
			
 
				 
			
 
				-
			
 
				-
			
 
				-  # Debug 可视化（底图为 inference_image，与 Layout 检测输入一致）
			
 
				+  # Debug 可视化配置
			
 
				   debug_options:
			
 
				-    enabled: true              # 由命令行 --debug / --debug-layout 控制
			
 
				+    enabled: false              # 由命令行 --debug / --debug-table 统一控制
			
 
				     output_dir: null            # null 时由 pipeline 按页注入
			
 
				     prefix: ""
			
 
				-    subdir: layout_detection    # 输出至 debug/layout_detection/
			
 
				-    save_raw: true              # 后处理前
			
 
				-    save_post_processed: true   # 后处理后
			
 
				-    save_json: true
			
 
				+    subdir: table_classification  # 输出至 debug/table_classification/
			
 
				+    save_table_lines: true      # paddle 线条检测叠加图
			
 
				     image_format: "png"
			
 
				 
			
 
				-# 有线表格识别专用配置
			
 
				+# ============================================================
			
 
				+# 有线表格识别专用配置（MinerU UNet）
			
 
				+# ============================================================
			
 
				 table_recognition_wired:
			
 
				   use_wired_unet: true
			
 
				   upscale_ratio: 3.333
			
 
				   need_ocr: true
			
 
				   row_threshold: 10
			
 
				   col_threshold: 15
			
 
				-  ocr_conf_threshold: 0.8
			
 
				+  ocr_conf_threshold: 0.9       # 单元格 OCR 置信度阈值
			
 
				   cell_crop_margin: 2
			
 
				   use_custom_postprocess: true  # 是否使用自定义后处理（默认启用）
			
 
				 
			
 
				   # 是否启用倾斜矫正
			
 
				   enable_deskew: true
			
 
				 
			
 
				+  # 🆕 启用多源单元格融合
			
 
				+  use_cell_fusion: true
			
 
				+  
			
 
				+  # 融合引擎配置
			
 
				+  cell_fusion:
			
 
				+    # RT-DETR 模型路径（必需）
			
 
				+    rtdetr_model_path: "/Users/zhch158/models/pytorch_models/Table/RT-DETR-L_wired_table_cell_det.onnx"
			
 
				+    
			
 
				+    # 融合权重
			
 
				+    unet_weight: 0.6        # UNet 权重（结构性强）
			
 
				+    rtdetr_weight: 0.4      # RT-DETR 权重（鲁棒性强）
			
 
				+    
			
 
				+    # 阈值配置
			
 
				+    iou_merge_threshold: 0.7    # 高IoU合并阈值（>0.7则加权平均）
			
 
				+    iou_nms_threshold: 0.5      # NMS去重阈值
			
 
				+    rtdetr_conf_threshold: 0.5  # RT-DETR置信度阈值
			
 
				+    
			
 
				+    # 功能开关
			
 
				+    enable_ocr_compensation: true      # 启用OCR边缘补偿
			
 
				 
			
 
				-  # 单元格二次 OCR（det 分行 + 整格兜底 + 低分块过滤）
			
 
				+  # 单元格二次 OCR（det 分行 + 整格/条带兜底 + 低分笔画增强重试）
			
 
				   second_pass_ocr:
			
 
				-    line_min_score: 0.8
			
 
				+    reocr_mode: bank_statement       # 表体空单元必跑 + 同行多数非空则空格也跑
			
 
				+    header_row: 0                    # 表头行号（0=首行）
			
 
				+    row_peer_min_nonempty: 5         # 同行至少 N 个非空格时，本格空也触发二次 OCR
			
 
				+    line_min_score: 0.8              # 低于此分的分行从文本与计分中丢弃
			
 
				     drop_low_score_blocks: true
			
 
				-    whole_cell_fallback: true
			
 
				+    whole_cell_fallback: true        # 整格 det=False 兜底 + 条带扫描
			
 
				     prefer_whole_on_tie: true
			
 
				+    whole_longer_min_extra_chars: 2  # 整格/条带文本比分行多长至少 N 字则优先
			
 
				+    strip_fallback_aspect_ratio: 1.8 # 高/宽>=该值且仅检出<=1行时滑动条带分行
			
 
				+    suspicious_short_min_chars: 4    # 高分但过短仍跑整格/条带兜底（与 enhance_retry 无关）
			
 
				+    cell_preprocess:
			
 
				+      watermark:
			
 
				+        enabled: true
			
 
				+        method: threshold
			
 
				+      denoise:
			
 
				+        enabled: false   # 小格 median 易糊笔画；lab 用 --denoise 对比
			
 
				+      contrast:
			
 
				+        enabled: false   # Pass1 去水印后可选；lab 对比 text_restore
			
 
				+        method: text_restore
			
 
				+        text_black_target: 88
			
 
				+      light:
			
 
				+        upscale_min_side: 192  # 128, 192 用于难例日期列
			
 
				+    enhance_retry:
			
 
				+      enabled: false
			
 
				+      # enabled: true 时 Pass2 预处理，默认见代码（cell_preprocess.enhance_retry 已废弃）
			
 
				 
			
 
				   # Debug 可视化配置
			
 
				   debug_options:
			
 
				-    enabled: true              # 由命令行 --debug / --debug-table 统一控制
			
 
				+    enabled: false              # 由命令行 --debug / --debug-table 统一控制
			
 
				     output_dir: null            # null 时由 pipeline 按页注入
			
 
				     prefix: ""
			
 
				-    subdir: table_classification  # 输出至 debug/table_classification/
			
 
				-    save_table_lines: true      # paddle 线条检测叠加图
			
 
				+    subdir: table_recognition_wired  # 输出至 debug/table_recognition_wired/
			
 
				+    save_table_lines: true
			
 
				+    save_connected_components: true
			
 
				+    save_grid_structure: true
			
 
				+    save_text_overlay: true
			
 
				     image_format: "png"
			
 
				+    # 单元格二次 OCR 裁剪图：debug/table_recognition_wired/tablecell_ocr/
			
 
				 
			
 
				 # VLM 表格识别配置（当分类为 'wireless' 时使用）
			
 
				 vl_recognition:
			
@@ -187,6 +228,9 @@ vl_recognition:
 
				   # 表格识别特定配置
			
 
				   table_recognition:
			
 
				 
			
 
				+# ============================================================
			
 
				+# 输出配置
			
 
				+# ============================================================
			
 
				 output:
			
 
				   create_subdir: false
			
 
				   save_pdf_images: true
			
--- a/ocr_tools/universal_doc_parser/config/bank_statement_yusys_local.yaml
+++ b/ocr_tools/universal_doc_parser/config/bank_statement_yusys_local.yaml
@@ -26,11 +26,10 @@ preprocessor:
 
				   watermark_removal:
			
 
				     enabled: false
			
 
				     detect_before_remove: true
			
 
				-    method: masked_adaptive   # threshold | masked | masked_adaptive
			
 
				+    method: threshold   # threshold | masked | masked_adaptive
			
 
				     threshold: 175
			
 
				-    morph_close_kernel: 0
			
 
				     contrast_enhancement:
			
 
				-      enabled: true
			
 
				+      enabled: false
			
 
				       method: text_restore
			
 
				       text_black_target: 85
			
 
				     debug_options:
			
@@ -180,31 +179,22 @@ table_recognition_wired:
 
				     prefer_whole_on_tie: true
			
 
				     whole_longer_min_extra_chars: 2  # 整格/条带文本比分行多长至少 N 字则优先
			
 
				     strip_fallback_aspect_ratio: 1.8 # 高/宽>=该值且仅检出<=1行时滑动条带分行
			
 
				+    suspicious_short_min_chars: 4    # 高分但过短仍跑整格/条带兜底（与 enhance_retry 无关）
			
 
				     cell_preprocess:
			
 
				       watermark:
			
 
				         enabled: true
			
 
				-        method: masked_adaptive
			
 
				+        method: threshold
			
 
				       denoise:
			
 
				         enabled: false   # 小格 median 易糊笔画；lab 用 --denoise 对比
			
 
				-        method: median
			
 
				       contrast:
			
 
				-        enabled: false
			
 
				+        enabled: false   # Pass1 去水印后可选；lab 对比 text_restore
			
 
				         method: text_restore
			
 
				         text_black_target: 88
			
 
				       light:
			
 
				-        upscale_min_side: 64
			
 
				-      enhance_retry:
			
 
				-        enabled: false
			
 
				-        score_below: 0.90
			
 
				-        min_chars: 4
			
 
				-        short_text_in_tall_cell: true
			
 
				-        contrast:
			
 
				-          enabled: true
			
 
				-          method: text_restore
			
 
				-          text_black_target: 75
			
 
				-        sharpen:
			
 
				-          enabled: false
			
 
				-          amount: 0.3
			
 
				+        upscale_min_side: 192  # 128, 192 用于难例日期列
			
 
				+    enhance_retry:
			
 
				+      enabled: false
			
 
				+      # enabled: true 时 Pass2 预处理，默认见代码（cell_preprocess.enhance_retry 已废弃）
			
 
				 
			
 
				   # Debug 可视化配置
			
 
				   debug_options:
			
--- a/ocr_tools/universal_doc_parser/config/bank_statement_yusys_v2.yaml
+++ b/ocr_tools/universal_doc_parser/config/bank_statement_yusys_v2.yaml
@@ -1,158 +0,0 @@
 
				-# 银行交易流水场景配置 v2
			
 
				-# 支持完整的处理流程：PDF分类 → 方向识别 → Layout检测 → OCR/VLM并行处理 → 坐标匹配
			
 
				-
			
 
				-scene_name: "bank_statement"
			
 
				-description: "银行交易流水、对账单等场景 - 增强版"
			
 
				-
			
 
				-# ============================================================
			
 
				-# 输入配置
			
 
				-# ============================================================
			
 
				-input:
			
 
				-  supported_formats: [".pdf", ".png", ".jpg", ".jpeg", ".bmp", ".tiff"]
			
 
				-  dpi: 200  # PDF转图片的DPI
			
 
				-  txt_pdf_watermark_removal:
			
 
				-    enabled: true   # 文字型PDF渲染前去除水印XObject（保留文字可搜索性）
			
 
				-    sample_pages: 3  # 扫描前N页快速预检
			
 
				-
			
 
				-# ============================================================
			
 
				-# 预处理配置（方向识别）
			
 
				-# ============================================================
			
 
				-preprocessor:
			
 
				-  module: "mineru"
			
 
				-  orientation_classifier:
			
 
				-    enabled: true  # 扫描件自动开启，数字PDF自动跳过
			
 
				-    model_name: "paddle_orientation_classification"
			
 
				-    model_dir: null  # 使用默认路径
			
 
				-  unwarping:
			
 
				-    enabled: false  # 图像矫正（可选）
			
 
				-  # -------------------------------------------------------
			
 
				-  # 水印去除配置（适用于银行流水浅色斜向文字水印）
			
 
				-  # -------------------------------------------------------
			
 
				-  watermark_removal:
			
 
				-    enabled: true           # 是否启用水印去除
			
 
				-    threshold: 160          # 灰度阈值（140-180）：高于此值视为水印变白
			
 
				-                            # 值越大保守（残留水印），值越小激进（损失浅色正文）
			
 
				-    morph_close_kernel: 0   # 形态学闭运算核大小（像素），默认的 morph_kernel 改为 0（非二值图像时形态学闭运算会适得其反）
			
 
				-
			
 
				-# ============================================================
			
 
				-# 版式检测配置
			
 
				-# ============================================================
			
 
				-layout_detection:
			
 
				-  module: "docling"
			
 
				-  model_name: "docling-layout-old"
			
 
				-  model_dir: ds4sd/docling-layout-old  # 使用默认路径，自动下载 doclayout_yolo_docstructbench_imgsz1280_2501.pt
			
 
				-  device: "cpu"  # 可选: "cpu", "cuda", "mps"
			
 
				-  conf: 0.3
			
 
				-  num_threads: 4
			
 
				-  
			
 
				-  # 后处理配置
			
 
				-  post_process:
			
 
				-    # 将大面积文本块转换为表格（后处理）
			
 
				-    convert_large_text_to_table: true  # 是否启用
			
 
				-    min_text_area_ratio: 0.25         # 最小面积占比（25%）
			
 
				-    min_text_width_ratio: 0.4         # 最小宽度占比（40%）
			
 
				-    min_text_height_ratio: 0.3        # 最小高度占比（30%）
			
 
				-
			
 
				-# ============================================================
			
 
				-# VL识别配置（表格、公式）
			
 
				-# ============================================================
			
 
				-vl_recognition:
			
 
				-  # 可选: "mineru" (MinerU VLM) 或 "paddle" (PaddleOCR-VL)
			
 
				-  module: "paddle"
			
 
				-  model_name: "PaddleOCR-VL-0.9B"
			
 
				-  
			
 
				-  # 后端配置
			
 
				-  backend: "http-client"  # 可选: "http-client", "vllm-engine", "transformers"
			
 
				-  server_url: "http://10.192.72.11:20016"  # PaddleOCR-VL 服务地址
			
 
				-  
			
 
				-  # 图片尺寸限制（避免序列长度超限）
			
 
				-  max_image_size: 4096
			
 
				-  resize_mode: 'max'  # 'max' 保持宽高比, 'fixed' 固定尺寸
			
 
				-  
			
 
				-  device: "cpu"
			
 
				-  batch_size: 1
			
 
				-  
			
 
				-  model_params:
			
 
				-    max_concurrency: 10
			
 
				-    http_timeout: 600
			
 
				-  
			
 
				-  # 表格识别特定配置
			
 
				-  table_recognition:
			
 
				-    bank_statement_mode: true      # 银行流水优化模式
			
 
				-
			
 
				-# ============================================================
			
 
				-# OCR识别配置（文本检测+识别）
			
 
				-# ============================================================
			
 
				-ocr_recognition:
			
 
				-  module: "mineru"
			
 
				-  language: "ch"  # 语言: ch, ch_lite, en, japan 等
			
 
				-  det_threshold: 0.6  # 检测阈值
			
 
				-  unclip_ratio: 1.5   # 文本框扩展比例
			
 
				-  enable_merge_det_boxes: false  # 不合并框
			
 
				-  batch_size: 8
			
 
				-  device: "cpu"
			
 
				-
			
 
				-# ============================================================
			
 
				-# 输出配置
			
 
				-# ============================================================
			
 
				-output:
			
 
				-  # 基础输出
			
 
				-  create_subdir: false       # 创建子目录
			
 
				-  save_json: true           # 保存 middle.json（MinerU标准格式）
			
 
				-  save_markdown: true       # 保存 Markdown 文件
			
 
				-  save_html: true           # 保存表格 HTML 文件
			
 
				-  
			
 
				-  # Debug 输出（通过命令行 --debug 开启）
			
 
				-  save_layout_image: true  # 保存 layout 可视化图片
			
 
				-  save_ocr_image: true     # 保存 OCR 可视化图片
			
 
				-  draw_type_label: true     # 在可视化图片上标注类型
			
 
				-  draw_bbox_number: true    # 在可视化图片上标注序号
			
 
				-  
			
 
				-  # 增强输出
			
 
				-  save_enhanced_json: true  # 保存增强版 JSON（包含单元格坐标）
			
 
				-
			
 
				-  normalize_numbers: true  # 金额数字标准化（全角→半角）
			
 
				-
			
 
				-# ============================================================
			
 
				-# 场景特定配置
			
 
				-# ============================================================
			
 
				-scene_config:
			
 
				-  bank_statement:
			
 
				-    # 表格结构特征
			
 
				-    table_structure: "single_column_list"  # 单栏列表形式
			
 
				-    merged_cells: false                     # 无合并单元格
			
 
				-    
			
 
				-    # 预期列名（用于验证）
			
 
				-    expected_columns: ["日期", "摘要", "收入", "支出", "余额"]
			
 
				-    
			
 
				-    # 验证规则
			
 
				-    amount_validation: true   # 金额格式验证
			
 
				-    date_validation: true     # 日期格式验证
			
 
				-    balance_validation: true  # 余额一致性验证
			
 
				-    
			
 
				-  processing_rules:
			
 
				-    # 表格处理规则
			
 
				-    table_rules:
			
 
				-      - detect_table_type: ["wired", "wireless"]  # 检测有线/无线表格
			
 
				-      - extract_header_automatically: true         # 自动提取表头
			
 
				-      - validate_amount_format: true               # 验证金额格式
			
 
				-      - merge_continuation_rows: true              # 合并续行
			
 
				-      
			
 
				-    # OCR后处理规则
			
 
				-    ocr_rules:
			
 
				-      - filter_low_confidence: 0.7      # 过滤低置信度结果
			
 
				-      - merge_adjacent_text: true       # 合并相邻文本
			
 
				-      - number_format_normalization: true  # 数字格式标准化
			
 
				-
			
 
				-# ============================================================
			
 
				-# 跨页表格合并配置
			
 
				-# ============================================================
			
 
				-cross_page_merge:
			
 
				-  enabled: true
			
 
				-  # 判断表格是否跨页的条件
			
 
				-  conditions:
			
 
				-    - table_at_page_bottom: true    # 表格位于页面底部
			
 
				-    - table_at_page_top: true       # 下一页表格位于顶部
			
 
				-    - similar_column_count: true    # 列数相似
			
 
				-    - header_match: false           # 表头匹配（跨页表格通常没有重复表头）
			
 
				-
			
--- a/ocr_tools/universal_doc_parser/config/bank_statement_yusys_v4.yaml
+++ b/ocr_tools/universal_doc_parser/config/bank_statement_yusys_v4.yaml
@@ -21,34 +21,23 @@ preprocessor:
 
				     model_dir: null  # 使用默认路径
			
 
				   unwarping:
			
 
				     enabled: false
			
 
				-  # -------------------------------------------------------
			
 
				-  # 水印去除配置（适用于银行流水浅色斜向文字水印）
			
 
				-  # -------------------------------------------------------
			
 
				+  # 页级水印（细参见 ocr_utils/watermark/presets.py PAGE_WATERMARK_PRESETS）
			
 
				   watermark_removal:
			
 
				-    enabled: false           # 是否启用水印去除
			
 
				-    method: threshold # threshold | masked | masked_adaptive
			
 
				-    threshold: 175          # 全局阈值或掩膜失败时的回退阈值（140-180）
			
 
				-    morph_close_kernel: 0   # 去水印后灰度图闭运算，0 跳过
			
 
				-    # 去水印后对比度增强（text_restore 将笔画拉深，比全局 gamma 更接近原图）
			
 
				+    enabled: false
			
 
				+    detect_before_remove: true
			
 
				+    method: threshold   # threshold | masked | masked_adaptive
			
 
				+    threshold: 175
			
 
				     contrast_enhancement:
			
 
				-      enabled: true
			
 
				-      method: text_restore   # text_restore | clahe | gamma | linear
			
 
				-      text_black_target: 85  # 略提高，减轻去水印后笔画被拉花（原 75 过深）
			
 
				-      background_threshold: 248
			
 
				-      text_lo_percentile: 1.0
			
 
				-      text_hi_percentile: 99.0
			
 
				-      gamma: 0.75            # method=gamma 时生效
			
 
				-      clip_limit: 2.0        # method=clahe
			
 
				-      tile_grid_size: 8
			
 
				-      black_percentile: 2.0  # method=linear
			
 
				-      white_percentile: 98.0
			
 
				+      enabled: false
			
 
				+      method: text_restore
			
 
				+      text_black_target: 85
			
 
				     debug_options:
			
 
				-      enabled: false              # 由命令行 --debug / --debug-layout 统一控制
			
 
				-      output_dir: null            # null 时使用 pipeline 输出目录
			
 
				-      prefix: ""                  # 文件名前缀（运行时注入 page_name）
			
 
				-      subdir: watermark_removal   # 输出至 debug/watermark_removal/
			
 
				-      save_compare: true          # 保存左右对比图 *_watermark_compare.*
			
 
				-      image_format: "png"         # jpg / png
			
 
				+      enabled: false
			
 
				+      output_dir: null
			
 
				+      prefix: ""
			
 
				+      subdir: watermark_removal
			
 
				+      save_compare: true
			
 
				+      image_format: "png"
			
 
				 
			
 
				 # ============================================================
			
 
				 # Layout 检测配置 - 智能路由器（按场景直接选择模型）
			
@@ -115,7 +104,6 @@ ocr_recognition:
 
				   batch_size: 8
			
 
				   device: "cpu"
			
 
				 
			
 
				-
			
 
				   # Debug 可视化（底图为 inference_image，与整页 OCR 输入一致）
			
 
				   debug_options:
			
 
				     enabled: false              # 由命令行 --debug / --debug-ocr 控制
			
@@ -179,13 +167,33 @@ table_recognition_wired:
 
				     # 功能开关
			
 
				     enable_ocr_compensation: true      # 启用OCR边缘补偿
			
 
				 
			
 
				-
			
 
				-  # 单元格二次 OCR（det 分行 + 整格兜底 + 低分块过滤）
			
 
				+  # 单元格二次 OCR（det 分行 + 整格/条带兜底 + 低分笔画增强重试）
			
 
				   second_pass_ocr:
			
 
				-    line_min_score: 0.8
			
 
				+    reocr_mode: bank_statement       # 表体空单元必跑 + 同行多数非空则空格也跑
			
 
				+    header_row: 0                    # 表头行号（0=首行）
			
 
				+    row_peer_min_nonempty: 5         # 同行至少 N 个非空格时，本格空也触发二次 OCR
			
 
				+    line_min_score: 0.8              # 低于此分的分行从文本与计分中丢弃
			
 
				     drop_low_score_blocks: true
			
 
				-    whole_cell_fallback: true
			
 
				+    whole_cell_fallback: true        # 整格 det=False 兜底 + 条带扫描
			
 
				     prefer_whole_on_tie: true
			
 
				+    whole_longer_min_extra_chars: 2  # 整格/条带文本比分行多长至少 N 字则优先
			
 
				+    strip_fallback_aspect_ratio: 1.8 # 高/宽>=该值且仅检出<=1行时滑动条带分行
			
 
				+    suspicious_short_min_chars: 4    # 高分但过短仍跑整格/条带兜底（与 enhance_retry 无关）
			
 
				+    cell_preprocess:
			
 
				+      watermark:
			
 
				+        enabled: true
			
 
				+        method: threshold
			
 
				+      denoise:
			
 
				+        enabled: false   # 小格 median 易糊笔画；lab 用 --denoise 对比
			
 
				+      contrast:
			
 
				+        enabled: false   # Pass1 去水印后可选；lab 对比 text_restore
			
 
				+        method: text_restore
			
 
				+        text_black_target: 88
			
 
				+      light:
			
 
				+        upscale_min_side: 192  # 128, 192 用于难例日期列
			
 
				+    enhance_retry:
			
 
				+      enabled: false
			
 
				+      # enabled: true 时 Pass2 预处理，默认见代码（cell_preprocess.enhance_retry 已废弃）
			
 
				 
			
 
				   # Debug 可视化配置
			
 
				   debug_options:
			
--- a/ocr_tools/universal_doc_parser/models/adapters/wired_table/text_filling.py
+++ b/ocr_tools/universal_doc_parser/models/adapters/wired_table/text_filling.py
@@ -63,6 +63,7 @@ class TextFiller:
 
				         self.second_pass_row_peer_min_nonempty: int = int(
			
 
				             sp_cfg.get("row_peer_min_nonempty", 5)
			
 
				         )
			
 
				+        _short_min = sp_cfg.get("suspicious_short_min_chars")
			
 
				         cpp = sp_cfg.get("cell_preprocess") or {}
			
 
				         if not isinstance(cpp, dict):
			
 
				             cpp = {}
			
@@ -70,16 +71,18 @@ class TextFiller:
 
				         if not isinstance(light, dict):
			
 
				             light = {}
			
 
				         self.second_pass_light_upscale_min: int = int(
			
 
				-            light.get("upscale_min_side", 64)
			
 
				+            light.get("upscale_min_side", 192)
			
 
				         )
			
 
				-        er = cpp.get("enhance_retry") or {}
			
 
				+        er = sp_cfg.get("enhance_retry") or cpp.get("enhance_retry") or {}
			
 
				         if not isinstance(er, dict):
			
 
				             er = {}
			
 
				+        if _short_min is None:
			
 
				+            _short_min = er.get("min_chars", 4)
			
 
				+        self.second_pass_suspicious_short_min_chars: int = int(_short_min)
			
 
				         self.second_pass_enhance_retry_enabled: bool = bool(er.get("enabled", True))
			
 
				         self.second_pass_enhance_score_below: float = float(
			
 
				             er.get("score_below", 0.90)
			
 
				         )
			
 
				-        self.second_pass_enhance_min_chars: int = int(er.get("min_chars", 4))
			
 
				         self.second_pass_enhance_short_tall: bool = bool(
			
 
				             er.get("short_text_in_tall_cell", True)
			
 
				         )
			
@@ -101,7 +104,7 @@ class TextFiller:
 
				         denoise = cpp.get("denoise") or {}
			
 
				         if not isinstance(denoise, dict):
			
 
				             denoise = {}
			
 
				-        self._cell_denoise_enabled: bool = bool(denoise.get("enabled", True))
			
 
				+        self._cell_denoise_enabled: bool = bool(denoise.get("enabled", False))
			
 
				         self._cell_denoise_method: str = str(denoise.get("method", "median"))
			
 
				         cell_contrast = cpp.get("contrast") or {}
			
 
				         if not isinstance(cell_contrast, dict):
			
@@ -245,12 +248,40 @@ class TextFiller:
 
				         return x1, y1, x2, y2
			
 
				 
			
 
				     @staticmethod
			
 
				+    def _normalize_rec_score(score: float) -> float:
			
 
				+        """识别分归一化到 [0,1]；部分引擎返回 0～100。"""
			
 
				+        try:
			
 
				+            sc = float(score)
			
 
				+        except (TypeError, ValueError):
			
 
				+            return 0.0
			
 
				+        if sc != sc:  # NaN
			
 
				+            return 0.0
			
 
				+        if sc > 1.0:
			
 
				+            if sc <= 100.0:
			
 
				+                return sc / 100.0
			
 
				+            return 0.0
			
 
				+        if sc < 0.0:
			
 
				+            return 0.0
			
 
				+        return sc
			
 
				+
			
 
				+    @staticmethod
			
 
				+    def _parse_det_rec_item(item: Any) -> Tuple[str, float]:
			
 
				+        """解析 det+rec 一体结果的一项：[[box], (text, score)]。"""
			
 
				+        if item is None:
			
 
				+            return "", 0.0
			
 
				+        if isinstance(item, (list, tuple)) and len(item) >= 2:
			
 
				+            head = item[0]
			
 
				+            if isinstance(head, (list, tuple)) and len(head) >= 4:
			
 
				+                return TextFiller._parse_single_rec_item(item[1])
			
 
				+        return TextFiller._parse_single_rec_item(item)
			
 
				+
			
 
				+    @staticmethod
			
 
				     def _parse_single_rec_item(rec_item: Any) -> Tuple[str, float]:
			
 
				         if rec_item is None:
			
 
				             return "", 0.0
			
 
				         if isinstance(rec_item, tuple) and len(rec_item) >= 2:
			
 
				             txt = str(rec_item[0] or "").strip()
			
 
				-            sc = float(rec_item[1] or 0.0)
			
 
				+            sc = TextFiller._normalize_rec_score(float(rec_item[1] or 0.0))
			
 
				             return txt, 0.0 if not txt else sc
			
 
				         if isinstance(rec_item, list) and len(rec_item) >= 2:
			
 
				             if isinstance(rec_item[0], (list, tuple, dict)):
			
@@ -266,15 +297,19 @@ class TextFiller:
 
				                     total_len = sum(len(t) for t in texts_list)
			
 
				                     if total_len > 0:
			
 
				                         weighted = sum(len(t) * s for t, s in zip(texts_list, scores_list)) / total_len
			
 
				-                        return combined, weighted
			
 
				-                    return combined, sum(scores_list) / len(scores_list)
			
 
				+                        return combined, TextFiller._normalize_rec_score(weighted)
			
 
				+                    return combined, TextFiller._normalize_rec_score(
			
 
				+                        sum(scores_list) / len(scores_list)
			
 
				+                    )
			
 
				                 return "", 0.0
			
 
				             txt = str(rec_item[0] or "").strip()
			
 
				-            sc = float(rec_item[1] or 0.0)
			
 
				+            sc = TextFiller._normalize_rec_score(float(rec_item[1] or 0.0))
			
 
				             return txt, 0.0 if not txt else sc
			
 
				         if isinstance(rec_item, dict):
			
 
				             txt = str(rec_item.get("text") or rec_item.get("label") or "").strip()
			
 
				-            sc = float(rec_item.get("score") or rec_item.get("confidence") or 0.0)
			
 
				+            sc = TextFiller._normalize_rec_score(
			
 
				+                float(rec_item.get("score") or rec_item.get("confidence") or 0.0)
			
 
				+            )
			
 
				             return txt, 0.0 if not txt else sc
			
 
				         return "", 0.0
			
 
				 
			
@@ -293,7 +328,18 @@ class TextFiller:
 
				             items = self._extract_ocr_batch_results(rec_res)
			
 
				             if not items:
			
 
				                 return "", 0.0
			
 
				-            return self._parse_single_rec_item(items[0] if len(items) == 1 else items)
			
 
				+            blocks: List[Tuple[str, float]] = []
			
 
				+            for item in items:
			
 
				+                text, score = self._parse_det_rec_item(item)
			
 
				+                if text:
			
 
				+                    blocks.append((text, score))
			
 
				+            if not blocks:
			
 
				+                return "", 0.0
			
 
				+            return self.aggregate_line_ocr(
			
 
				+                blocks,
			
 
				+                line_min_score=0.0,
			
 
				+                drop_low_score_blocks=False,
			
 
				+            )
			
 
				         except Exception as e:
			
 
				             logger.warning(f"整格 OCR 失败: {e}")
			
 
				             return "", 0.0
			
@@ -418,7 +464,11 @@ class TextFiller:
 
				         return cell_img
			
 
				 
			
 
				     def _apply_cell_contrast(
			
 
				-        self, cell_img: np.ndarray, contrast_cfg: Dict[str, Any]
			
 
				+        self,
			
 
				+        cell_img: np.ndarray,
			
 
				+        contrast_cfg: Dict[str, Any],
			
 
				+        *,
			
 
				+        sharpen_cfg: Optional[Dict[str, Any]] = None,
			
 
				     ) -> np.ndarray:
			
 
				         from ocr_utils.watermark.contrast import apply_contrast_enhancement_config
			
 
				 
			
@@ -429,8 +479,9 @@ class TextFiller:
 
				         else:
			
 
				             gray = cell_img
			
 
				         gray = apply_contrast_enhancement_config(gray, contrast_cfg)
			
 
				-        if self.second_pass_enhance_sharpen.get("enabled", False):
			
 
				-            amount = float(self.second_pass_enhance_sharpen.get("amount", 0.3))
			
 
				+        sharpen = sharpen_cfg or {}
			
 
				+        if sharpen.get("enabled", False):
			
 
				+            amount = float(sharpen.get("amount", 0.3))
			
 
				             blurred = cv2.GaussianBlur(gray, (0, 0), 1.0)
			
 
				             gray = cv2.addWeighted(gray, 1.0 + amount, blurred, -amount, 0)
			
 
				         if cell_img.ndim == 3:
			
@@ -451,12 +502,18 @@ class TextFiller:
 
				             img = self._denoise_cell(img)
			
 
				             stages.append("denoise")
			
 
				 
			
 
				-        if mode == "enhance":
			
 
				+        if mode == "light":
			
 
				+            if self._cell_contrast_cfg.get("enabled", False) and "wm" in stages:
			
 
				+                img = self._apply_cell_contrast(img, self._cell_contrast_cfg)
			
 
				+                stages.append("contrast")
			
 
				+        elif mode == "enhance":
			
 
				             contrast_cfg = self.second_pass_enhance_contrast
			
 
				             if self._cell_contrast_cfg.get("enabled", False):
			
 
				                 contrast_cfg = self._cell_contrast_cfg
			
 
				             if contrast_cfg.get("enabled", False) and "wm" in stages:
			
 
				-                img = self._apply_cell_contrast(img, contrast_cfg)
			
 
				+                img = self._apply_cell_contrast(
			
 
				+                    img, contrast_cfg, sharpen_cfg=self.second_pass_enhance_sharpen
			
 
				+                )
			
 
				                 stages.append("contrast")
			
 
				 
			
 
				         img = self._upscale_cell_if_small(img)
			
@@ -473,10 +530,18 @@ class TextFiller:
 
				         strip_score: float = 0.0,
			
 
				     ) -> Tuple[str, float, str]:
			
 
				         """返回 (text, score, strategy)。"""
			
 
				+        line_score = self._normalize_rec_score(line_score)
			
 
				+        whole_score = self._normalize_rec_score(whole_score)
			
 
				+        strip_score = self._normalize_rec_score(strip_score)
			
 
				+
			
 
				         candidates: List[Tuple[str, float, str]] = []
			
 
				         if line_text:
			
 
				             candidates.append((line_text, line_score, "lines"))
			
 
				-        if whole_text and self.second_pass_whole_fallback:
			
 
				+        if (
			
 
				+            whole_text
			
 
				+            and self.second_pass_whole_fallback
			
 
				+            and 0.0 < whole_score <= 1.0
			
 
				+        ):
			
 
				             candidates.append((whole_text, whole_score, "whole"))
			
 
				         if strip_text:
			
 
				             candidates.append((strip_text, strip_score, "strip"))
			
@@ -487,6 +552,7 @@ class TextFiller:
 
				         if (
			
 
				             whole_text
			
 
				             and line_text
			
 
				+            and 0.0 < whole_score <= 1.0
			
 
				             and line_score > whole_score
			
 
				             and len(whole_text) >= len(line_text) + self.second_pass_whole_longer_extra
			
 
				             and len(whole_text) > len(line_text)
			
@@ -567,7 +633,7 @@ class TextFiller:
 
				         if (
			
 
				             line_text
			
 
				             and line_score >= base_conf_th
			
 
				-            and len(line_text) < self.second_pass_enhance_min_chars
			
 
				+            and len(line_text) < self.second_pass_suspicious_short_min_chars
			
 
				         ):
			
 
				             return True
			
 
				         return False
			
@@ -587,7 +653,7 @@ class TextFiller:
 
				             reasons.append("not_accepted")
			
 
				         if score < self.second_pass_enhance_score_below:
			
 
				             reasons.append("score_below_threshold")
			
 
				-        if text and len(text) < self.second_pass_enhance_min_chars:
			
 
				+        if text and len(text) < self.second_pass_suspicious_short_min_chars:
			
 
				             reasons.append("suspicious_short_text")
			
 
				         h, w = cell_img.shape[:2]
			
 
				         if (
			
@@ -595,7 +661,7 @@ class TextFiller:
 
				             and w > 0
			
 
				             and h / w >= self.second_pass_strip_aspect
			
 
				             and len(result.get("lines") or []) <= 1
			
 
				-            and len(text) < self.second_pass_enhance_min_chars + 2
			
 
				+            and len(text) < self.second_pass_suspicious_short_min_chars + 2
			
 
				         ):
			
 
				             reasons.append("tall_cell_single_line")
			
 
				         return bool(reasons), reasons
			
@@ -620,7 +686,7 @@ class TextFiller:
 
				             whole_text, whole_score = self._recognize_whole_cell(cell_img)
			
 
				             whole_skipped = None
			
 
				         elif line_text and line_score >= base_conf_th:
			
 
				-            if len(line_text) < self.second_pass_enhance_min_chars:
			
 
				+            if len(line_text) < self.second_pass_suspicious_short_min_chars:
			
 
				                 whole_skipped = "short_text_high_score"
			
 
				             else:
			
 
				                 whole_skipped = "line_score>=%.2f" % base_conf_th
			
@@ -757,6 +823,7 @@ class TextFiller:
 
				         debug_img: np.ndarray,
			
 
				         result: Dict[str, Any],
			
 
				         *,
			
 
				+        raw_img: Optional[np.ndarray] = None,
			
 
				         first_pass_text: str = "",
			
 
				         first_pass_score: float = 0.0,
			
 
				         trigger_reasons: Optional[List[str]] = None,
			
@@ -769,15 +836,31 @@ class TextFiller:
 
				         if pass_label:
			
 
				             stem += f"_{pass_label}"
			
 
				         stem += f"_{strategy}_{tag}"
			
 
				-        png_path = os.path.join(cell_ocr_dir, f"{stem}.png")
			
 
				+        preprocessed_name = f"{stem}.png"
			
 
				+        preprocessed_path = os.path.join(cell_ocr_dir, preprocessed_name)
			
 
				         try:
			
 
				-            cv2.imwrite(png_path, debug_img)
			
 
				+            cv2.imwrite(preprocessed_path, debug_img)
			
 
				         except Exception as e:
			
 
				             logger.warning(f"保存单元格OCR图片失败 (cell {cell_idx}): {e}")
			
 
				             return
			
 
				+
			
 
				+        raw_name: Optional[str] = None
			
 
				+        if raw_img is not None and raw_img.size > 0:
			
 
				+            raw_name = f"{stem}_raw.png"
			
 
				+            raw_path = os.path.join(cell_ocr_dir, raw_name)
			
 
				+            try:
			
 
				+                cv2.imwrite(raw_path, raw_img)
			
 
				+            except Exception as e:
			
 
				+                logger.warning(f"保存单元格原图失败 (cell {cell_idx}): {e}")
			
 
				+                raw_name = None
			
 
				+
			
 
				         payload = {
			
 
				             "cell_idx": cell_idx,
			
 
				             "bbox": bbox,
			
 
				+            "debug_images": {
			
 
				+                "raw": raw_name,
			
 
				+                "preprocessed": preprocessed_name,
			
 
				+            },
			
 
				             "first_pass": {"text": first_pass_text, "score": first_pass_score},
			
 
				             "trigger_reason": trigger_reasons or [],
			
 
				             "lines": result.get("lines") or [],
			
@@ -828,7 +911,7 @@ class TextFiller:
 
				         
			
 
				         if text_len == 1:
			
 
				             # 单字符：提高阈值 +0.05
			
 
				-            return min(0.95, base_threshold + 0.1)
			
 
				+            return min(0.92, base_threshold + 0.1)
			
 
				         elif text_len <= 3:
			
 
				             # 2-3字符：轻微提高阈值 +0.02
			
 
				             return min(0.92, base_threshold + 0.02)
			
@@ -1456,6 +1539,7 @@ class TextFiller:
 
				                         cell_idx,
			
 
				                         debug_img,
			
 
				                         result,
			
 
				+                        raw_img=raw_crop,
			
 
				                         first_pass_text=fp_text,
			
 
				                         first_pass_score=fp_score,
			
 
				                         trigger_reasons=trigger_reasons,
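The score handling added to `text_filling.py` above can be read in isolation. The following is a minimal standalone sketch of the same rule, assuming the behaviour shown in `_normalize_rec_score`: values in (1, 100] are treated as percentages, and NaN, negative, or larger values collapse to 0. The free-function name `normalize_rec_score` is illustrative and not part of the repository.

```python
import math


def normalize_rec_score(score) -> float:
    """Map a recognition score onto [0, 1]; some engines report 0-100."""
    try:
        sc = float(score)
    except (TypeError, ValueError):
        # Non-numeric input (e.g. None) is treated as "no confidence".
        return 0.0
    if math.isnan(sc) or sc < 0.0:
        return 0.0
    if sc > 1.0:
        # (1, 100] is read as a percentage; anything above 100 is invalid.
        return sc / 100.0 if sc <= 100.0 else 0.0
    return sc
```

An out-of-range value such as `999.0` maps to `0.0` under this rule, which is the case `test_pick_line_when_whole_score_invalid` uses when it expects the whole-cell candidate to be rejected.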
			
--- a/ocr_tools/universal_doc_parser/tests/test_second_pass_ocr_aggregate.py
+++ b/ocr_tools/universal_doc_parser/tests/test_second_pass_ocr_aggregate.py
@@ -72,7 +72,7 @@ class TestShouldRunWholeFallback:
 
				             config={
			
 
				                 "second_pass_ocr": {
			
 
				                     "whole_cell_fallback": True,
			
 
				-                    "enhance_retry": {"min_chars": 4},
			
 
				+                    "suspicious_short_min_chars": 4,
			
 
				                 }
			
 
				             },
			
 
				         )
			
@@ -94,6 +94,61 @@ class TestShouldRunWholeFallback:
 
				         assert f._should_run_whole_fallback("", 0.0, cell, [], 0.9)
			
 
				 
			
 
				 
			
 
				+class TestCellPreprocessConfig:
			
 
				+    def test_suspicious_short_from_top_level(self):
			
 
				+        f = TextFiller(
			
 
				+            ocr_engine=None,
			
 
				+            config={"second_pass_ocr": {"suspicious_short_min_chars": 6}},
			
 
				+        )
			
 
				+        assert f.second_pass_suspicious_short_min_chars == 6
			
 
				+
			
 
				+    def test_light_contrast_stage_when_enabled(self):
			
 
				+        import numpy as np
			
 
				+
			
 
				+        f = TextFiller(
			
 
				+            ocr_engine=None,
			
 
				+            config={
			
 
				+                "second_pass_ocr": {
			
 
				+                    "cell_preprocess": {
			
 
				+                        "watermark": {"enabled": True, "method": "threshold"},
			
 
				+                        "contrast": {
			
 
				+                            "enabled": True,
			
 
				+                            "method": "text_restore",
			
 
				+                            "text_black_target": 88,
			
 
				+                        },
			
 
				+                    }
			
 
				+                }
			
 
				+            },
			
 
				+        )
			
 
				+        cell = np.ones((40, 80, 3), dtype=np.uint8) * 200
			
 
				+        _, stages = f._preprocess_cell_for_ocr(cell, mode="light")
			
 
				+        assert "wm" in stages
			
 
				+        assert "contrast" in stages
			
 
				+
			
 
				+
			
 
				+class TestWholeCellParse:
			
 
				+    def test_parse_det_rec_item_uses_rec_not_box(self):
			
 
				+        item = [
			
 
				+            [[146.0, 15.0], [199.0, 15.0], [199.0, 85.0], [146.0, 85.0]],
			
 
				+            ("/", 0.9213118553161621),
			
 
				+        ]
			
 
				+        t, s = TextFiller._parse_det_rec_item(item)
			
 
				+        assert t == "/"
			
 
				+        assert abs(s - 0.9213118553161621) < 1e-6
			
 
				+
			
 
				+    def test_normalize_rec_score_percent(self):
			
 
				+        assert abs(TextFiller._normalize_rec_score(92.5) - 0.925) < 1e-6
			
 
				+        assert TextFiller._normalize_rec_score(0.921) == 0.921
			
 
				+        assert TextFiller._normalize_rec_score(999) == 0.0
			
 
				+
			
 
				+    def test_pick_line_when_whole_score_invalid(self):
			
 
				+        f = TextFiller(ocr_engine=None, config={"second_pass_ocr": {}})
			
 
				+        t, s, strat = f._pick_line_vs_whole("/", 0.92, "146.0199.0146.0/", 999.0)
			
 
				+        assert t == "/"
			
 
				+        assert strat == "lines"
			
 
				+        assert abs(s - 0.92) < 1e-6
			
 
				+
			
 
				+
			
 
				 class TestPickBetterOcrResult:
			
 
				     def test_reject_invalid_pass2_score(self):
			
 
				         pass1 = {"final_text": "取款", "final_score": 0.99, "accepted": True}
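The config tests in this file rely on a lookup order for `suspicious_short_min_chars`. A minimal sketch of that order, assuming it matches the `TextFiller.__init__` hunk earlier in this diff (top-level key first, then `min_chars` from `enhance_retry`, which itself falls back to the deprecated `cell_preprocess.enhance_retry` location, then 4). The helper name `resolve_short_min_chars` is hypothetical.

```python
from typing import Any, Dict


def resolve_short_min_chars(sp_cfg: Dict[str, Any]) -> int:
    """Resolve the short-text cutoff from a `second_pass_ocr` config dict."""
    cpp = sp_cfg.get("cell_preprocess") or {}
    if not isinstance(cpp, dict):
        cpp = {}
    # New location first, deprecated nested location second.
    er = sp_cfg.get("enhance_retry") or cpp.get("enhance_retry") or {}
    if not isinstance(er, dict):
        er = {}
    value = sp_cfg.get("suspicious_short_min_chars")
    if value is None:
        # Legacy key kept for backward compatibility; default is 4.
        value = er.get("min_chars", 4)
    return int(value)
```

Under this reading, an old config that only sets `enhance_retry.min_chars` keeps working, while new configs can set the cutoff without enabling `enhance_retry` at all.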
			
--- a/ocr_utils/watermark/presets.py
+++ b/ocr_utils/watermark/presets.py
@@ -113,7 +113,7 @@ def _base_preset(scope: Scope, method: Method) -> Dict[str, Any]:
 
				         if scope == "cell"
			
 
				         else copy.deepcopy(_CONTRAST_PAGE_DEFAULT)
			
 
				     )
			
 
				-    threshold = 175 if scope == "page" else 170
			
 
				+    threshold = 175 if scope == "page" else 155
			
 
				     cfg: Dict[str, Any] = {
			
 
				         "enabled": True,
			
 
				         "detect_before_remove": scope == "page",
			
--- a/ocr_utils/watermark/processor.py
+++ b/ocr_utils/watermark/processor.py
@@ -45,7 +45,7 @@ class WatermarkProcessor:
 
				 
			
 
				     @property
			
 
				     def threshold(self) -> int:
			
 
				-        return int(self.config.get("threshold", 175))
			
 
				+        return int(self.config.get("threshold", 155))
			
 
				 
			
 
				     @property
			
 
				     def morph_close_kernel(self) -> int:
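To make the effect of the lower cell threshold concrete, here is a rough sketch of a `threshold`-style removal, assuming the rule stated in the config comments (pixels brighter than the threshold are treated as watermark and set to white). The actual `removal.py` implementation is not shown in this diff, so the function `whiten_above_threshold` and the `DEFAULT_THRESHOLD` table are illustrative only; the values 175 and 155 are the page and cell defaults from `presets.py` above.

```python
import numpy as np

# Scope defaults as they appear in presets.py after this change.
DEFAULT_THRESHOLD = {"page": 175, "cell": 155}


def whiten_above_threshold(gray: np.ndarray, scope: str = "cell") -> np.ndarray:
    """Return a copy of a grayscale image with light pixels forced to white."""
    threshold = DEFAULT_THRESHOLD[scope]
    out = gray.copy()
    # Light watermark strokes sit above the threshold; dark text stays untouched.
    out[gray > threshold] = 255
    return out


sample = np.array([[40, 160, 200]], dtype=np.uint8)
cell_result = whiten_above_threshold(sample, "cell")
page_result = whiten_above_threshold(sample, "page")
```

With these defaults a pixel of value 160 is removed at cell scope but kept at page scope, which is the trade-off the config comment describes: a lower threshold is more aggressive and may also erase light body text.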
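Author	SHA1 Message	Date
zhch158_admin	b210ab056b fix(optimize watermark handling and layout detection config): update several bank_statement config files, adjust watermark removal settings, enable detection before removal, optimize the layout detection module, and add OCR recognition and table classification to improve overall OCR accuracy and flexibility.	1 month ago
zhch158_admin	70f36c0904 fix(adjust watermark handling and cell preprocessing config): update the watermark method and contrast enhancement settings in bank_statement_yusys_local.yaml, adjusting thresholds and enabled flags to improve OCR results and flexibility.	1 month ago
zhch158_admin	b11fe5592e fix(adjust threshold to improve watermark handling): change the threshold setting in the watermark module, lowering the cell-level threshold from 170 to 155 to improve OCR accuracy and flexibility.	1 month ago
zhch158_admin	a2311846f1 feat(enhance second-pass OCR and cell preprocessing): add test classes and cases in test_second_pass_ocr_aggregate.py that verify the short-text minimum-character setting, contrast adjustment in cell preprocessing, and watermark handling logic, improving OCR accuracy and flexibility.	1 month ago
zhch158_admin	df98998bd5 feat(improve text filling and OCR recognition logic): update the TextFiller class, add the short-text minimum-character setting, refactor recognition to support more flexible text parsing and score normalization, and improve cell contrast adjustment and enhancement, improving OCR accuracy and flexibility.	1 month ago
zhch158_admin	eb694a01bb feat(add watermark evaluation and synthesis modules): add evaluate.py to compare watermark removal between the baseline and the LaMa GAN method, add lama_inpaint.py for LaMa model inference, and add watermark_synthesis.py to synthesize watermarks and generate the corresponding masks, improving watermark evaluation and synthesis capabilities.	1 month ago
zhch158_admin	d25c465024 feat(add cell preprocessing parameter sweep): add a parameter grid sweep example to cell_preprocess_lab.py and a new cell_sweep.py that sweeps preprocessing parameters over cell crops, supporting watermark removal, contrast adjustment, and other settings to improve OCR flexibility and accuracy; also delete the unused cell121_sweep.py.	1 month ago
zhch158_admin	95bfd4baed feat(update watermark removal module docs): expand the watermark removal documentation with details on capabilities, applicable scenarios, and parameter configuration, add descriptions of page-level and cell-level processing, and improve user understanding and experience.	1 month ago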