```
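@workspace Based on zhch/PP-StructureV3-zhch.yaml, generate a Markdown mermaid flowchart of the pipeline execution; the chart should label the models that are called and the data passed between them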
```


```mermaid
graph TD
    A[Input document image/PDF] --> B{use_doc_preprocessor: True}
    
    B -->|Yes| C[DocPreprocessor sub-pipeline]
    B -->|No| D[LayoutDetection]
    
    C --> C1[DocOrientationClassify<br/>PP-LCNet_x1_0_doc_ori]
    C1 -->|Orientation-corrected image| C2[DocUnwarping<br/>UVDoc]
    C2 -->|Preprocessed image| D
    
    D[LayoutDetection<br/>PP-DocLayout_plus-L] -->|Layout detection results<br/>bounding box per region| E[GeneralOCR sub-pipeline]
    
    E --> E1[TextDetection<br/>PP-OCRv5_server_det]
    E1 -->|Text detection boxes| E2[TextLineOrientation<br/>PP-LCNet_x1_0_textline_ori]
    E2 -->|Text line orientation| E3[TextRecognition<br/>PP-OCRv5_server_rec]
    E3 -->|OCR results<br/>text + coordinates| F{Process sub-modules in parallel}
    
    D -->|Layout regions| F
    
    F -->|Table regions| G[TableRecognition sub-pipeline]
    F -->|Seal regions| H[SealRecognition sub-pipeline]
    F -->|Formula regions| I[FormulaRecognition sub-pipeline]
    F -->|Chart regions| J[ChartRecognition]
    F -->|Region detection| K[RegionDetection]
    
    %% TableRecognition detailed flow
    G --> G1[TableClassification<br/>PP-LCNet_x1_0_table_cls]
    G1 -->|Wired table| G2[WiredTableStructureRecognition<br/>SLANeXt_wired]
    G1 -->|Wireless table| G3[WirelessTableStructureRecognition<br/>SLANet_plus]
    G2 --> G4[WiredTableCellsDetection<br/>RT-DETR-L_wired_table_cell_det]
    G3 --> G5[WirelessTableCellsDetection<br/>RT-DETR-L_wireless_table_cell_det]
    G4 -->|Table structure + cells| G6[TableOrientationClassify<br/>PP-LCNet_x1_0_doc_ori]
    G5 -->|Table structure + cells| G6
    G6 --> G7[Internal GeneralOCR<br/>box_thresh: 0.4]
    G7 -->|Table OCR results| G8[Match text to cells<br/>generate HTML table]
    
    %% SealRecognition detailed flow
    H --> H1[SealOCR sub-pipeline]
    H1 --> H2[SealTextDetection<br/>PP-OCRv4_server_seal_det]
    H2 -->|Seal text detection boxes| H3[TextRecognition<br/>PP-OCRv5_server_rec]
    H3 -->|Seal text recognition results| H4[Seal recognition results]
    
    %% FormulaRecognition detailed flow
    I --> I1[FormulaRecognition<br/>PP-FormulaNet_plus-L]
    I1 -->|Formula recognition results| I2[Formulas in LaTeX format]
    
    %% ChartRecognition detailed flow
    J --> J1[ChartRecognition<br/>PP-Chart2Table]
    J1 -->|Chart-to-table results| J2[Tabular chart data]
    
    %% RegionDetection detailed flow
    K --> K1[RegionDetection<br/>PP-DocBlockLayout]
    K1 -->|Fine-grained region detection| K2[Region detection results]
    
    %% Result aggregation
    G8 --> L[Result aggregation and integration]
    H4 --> L
    I2 --> L
    J2 --> L
    K2 --> L
    E3 --> L
    
    L --> M[Generate structured output]
    M --> N[Save as JSON/Markdown]
    
    %% Data flow annotations
    classDef inputData fill:#e1f5fe
    classDef model fill:#fff3e0
    classDef output fill:#e8f5e8
    classDef pipeline fill:#f3e5f5
    
    class A,C2 inputData
    class D,C1,C2,E1,E2,E3,G1,G2,G3,G4,G5,G6,G7,H2,H3,I1,J1,K1 model
    class G8,H4,I2,J2,K2,M,N output
    class C,E,G,H,I pipeline
```

## Key data flow notes:

### 1. **Main data flow**
- **Input**: document image/PDF
- **DocPreprocessor**: preprocessed, normalized image
- **LayoutDetection**: layout region bounding boxes `[x1,y1,x2,y2,class_id,confidence]`
- **GeneralOCR**: OCR results `{text, bbox, confidence}`
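
The two record shapes listed above can be sketched as plain Python types. This is only an illustration under stated assumptions: the field order follows the bullets, while the names `LayoutBox`, `OcrLine`, `parse_layout_row`, and `lines_in_region` are hypothetical and are not part of PaddleX; the objects the pipeline actually returns may be structured differently.

```python
from dataclasses import dataclass


@dataclass
class LayoutBox:
    """One layout region, following [x1, y1, x2, y2, class_id, confidence]."""
    x1: float
    y1: float
    x2: float
    y2: float
    class_id: int
    confidence: float


@dataclass
class OcrLine:
    """One OCR line, following {text, bbox, confidence}."""
    text: str
    bbox: tuple  # (x1, y1, x2, y2), assumed axis-aligned for this sketch
    confidence: float


def parse_layout_row(row):
    """Convert a raw 6-element row into a LayoutBox (hypothetical helper)."""
    x1, y1, x2, y2, class_id, confidence = row
    return LayoutBox(float(x1), float(y1), float(x2), float(y2),
                     int(class_id), float(confidence))


def lines_in_region(region, lines):
    """Return the OCR lines whose box centre falls inside the layout region."""
    selected = []
    for line in lines:
        cx = (line.bbox[0] + line.bbox[2]) / 2
        cy = (line.bbox[1] + line.bbox[3]) / 2
        if region.x1 <= cx <= region.x2 and region.y1 <= cy <= region.y2:
            selected.append(line)
    return selected
```

Selecting lines by box centre is one simple way to attach page-level OCR results to a layout region; an overlap-based rule would work as well.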

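### 2. **Table recognition data flow**
- **Input**: table region image + page-level OCR results
- **TableClassification**: table type label `"wired_table"/"wireless_table"`
- **StructureRecognition**: HTML structure token sequence `["<tr>", "<td>", "</td>", ...]`
- **CellsDetection**: list of cell bounding boxes
- **Internal OCR**: OCR results dedicated to the table region (with a lower box_thresh=0.4)
- **Output**: table in HTML format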
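
The final matching step (OCR text into cells, then cells into HTML) can be sketched as follows. This is a minimal illustration, not the PaddleX implementation. It assumes that cell boxes arrive in the same order as the `<td>` tokens, that every cell opens with a plain `<td>` token (no `colspan`/`rowspan` attributes), and that OCR lines are dicts shaped like `{text, bbox, confidence}`. The function names and the `min_overlap` parameter are hypothetical.

```python
from html import escape


def overlap_ratio(inner, outer):
    """Fraction of the `inner` box area that lies inside the `outer` box."""
    ix1, iy1 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix2, iy2 = min(inner[2], outer[2]), min(inner[3], outer[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return inter / area if area > 0 else 0.0


def assign_text_to_cells(cells, ocr_lines, min_overlap=0.5):
    """Attach each OCR line to the cell that contains most of its box."""
    texts = [[] for _ in cells]
    if not cells:
        return []
    for line in ocr_lines:
        scores = [overlap_ratio(line["bbox"], cell) for cell in cells]
        best = max(range(len(cells)), key=scores.__getitem__)
        if scores[best] >= min_overlap:
            texts[best].append(line["text"])
    return [" ".join(parts) for parts in texts]


def build_html(structure_tokens, cell_texts):
    """Insert the i-th cell text right after the i-th `<td>` token."""
    parts = ["<table>"]
    remaining = iter(cell_texts)
    for token in structure_tokens:
        parts.append(token)
        if token == "<td>":
            parts.append(escape(next(remaining, "")))
    parts.append("</table>")
    return "".join(parts)
```

Lines that do not overlap any cell by at least `min_overlap` are dropped in this sketch; a real pipeline would more likely keep them and attach them to the nearest cell.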

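### 3. **Parallel processing characteristics**
- Each sub-module processes its own regions **in parallel**, based on the layout detection results
- **Data sharing**: the main GeneralOCR results are reused by multiple sub-modules
- **Parameter differences**: different modules use differently tuned OCR parameters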
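
The routing described above can be sketched as a label-to-handler dispatch over a thread pool, with one read-only OCR result shared by every handler. Everything here is an assumption for illustration: the label strings, the handler bodies, and `process_regions` are hypothetical, and the sketch does not show how PaddleX actually schedules its sub-pipelines.

```python
from concurrent.futures import ThreadPoolExecutor


def make_handler(kind):
    """Build a stand-in handler; a real one would call the matching sub-pipeline."""
    def handler(region, shared_ocr):
        # Handlers only read the shared OCR results and never modify them.
        return {"type": kind, "bbox": region["bbox"], "ocr_lines_seen": len(shared_ocr)}
    return handler


# Hypothetical mapping from layout label to sub-module handler.
HANDLERS = {kind: make_handler(kind) for kind in ("table", "seal", "formula", "chart")}


def process_regions(regions, shared_ocr, max_workers=4):
    """Run the handler for each routable region concurrently, keeping input order."""
    routed = [(HANDLERS[r["label"]], r) for r in regions if r["label"] in HANDLERS]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(handler, region, shared_ocr) for handler, region in routed]
        return [future.result() for future in futures]
```

Collecting results in submission order keeps the output deterministic even though the handlers may finish in any order.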

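### 4. **Final output**
- **JSON format**: structured recognition results
- **Markdown format**: a human-readable document format
- **Image saving**: visualizations of intermediate processing results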
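
To make the JSON and Markdown outputs concrete, here is a small sketch that serializes a list of recognized blocks both ways using only the standard library. The block schema (`type`, `text`, `html`, `latex`) and the helpers `to_markdown` and `save_outputs` are assumptions made for this example; they do not reproduce the result format PaddleX writes.

```python
import json
from pathlib import Path


def to_markdown(blocks):
    """Render blocks in reading order: tables as HTML, formulas as LaTeX, rest as text."""
    parts = []
    for block in blocks:
        if block["type"] == "table":
            parts.append(block["html"])
        elif block["type"] == "formula":
            parts.append("$$\n" + block["latex"] + "\n$$")
        else:
            parts.append(block["text"])
    return "\n\n".join(parts) + "\n"


def save_outputs(blocks, out_dir, stem):
    """Write <stem>.json and <stem>.md into out_dir and return both paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    json_path = out / (stem + ".json")
    md_path = out / (stem + ".md")
    # ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable in the JSON file.
    json_path.write_text(
        json.dumps({"blocks": blocks}, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    md_path.write_text(to_markdown(blocks), encoding="utf-8")
    return json_path, md_path
```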

This flowchart shows the data flow and model calls in PP-StructureV3, and in particular highlights how TableRecognition reuses OCR results and matches them to cells.