myhloli 2 months ago
Commit
34fab4f5b8
2 files changed, 19 additions, 22 deletions
  1. docs/en/reference/output_files.md (+9, -14)
  2. docs/zh/reference/output_files.md (+10, -8)

+ 9 - 14
docs/en/reference/output_files.md

@@ -519,15 +519,10 @@ Text levels are distinguished through the `text_level` field:
 
 Structure is broadly similar to the pipeline backend, but with these differences:
 
-- 1. `list` becomes a second‑level block; a new field `sub_type` distinguishes list categories:
-  - `text`: ordinary list
-  - `ref_text`: reference / bibliography style list
-- 2. New `code` block type with `sub_type`:
-  - `code`
-  - `algorithm`
-  A code block always has at least a `code_body`; it may optionally have a `code_caption`.
-- 3. `discarded_blocks` may contain additional types: `header`, `footer`, `page_number`, `aside_text`, `page_footnote`.
-- 4. All blocks include an `angle` field indicating rotation (one of `0, 90, 180, 270`).
+- `list` becomes a second‑level block; a new field `sub_type` distinguishes list categories: `text` (ordinary list) or `ref_text` (reference / bibliography style list).
+- New `code` block type with `sub_type` `code` or `algorithm`. A code block always has at least a `code_body`; it may optionally have a `code_caption`.
+- `discarded_blocks` may contain additional types: `header`, `footer`, `page_number`, `aside_text`, `page_footnote`.
+- All blocks include an `angle` field indicating rotation (one of `0, 90, 180, 270`).
 
 ##### Examples
 - Example: list block
@@ -633,13 +628,13 @@ Structure is broadly similar to the pipeline backend, but with these differences
 
 Based on the pipeline format, with these VLM-specific extensions:
 
-- 1. New `code` type with `sub_type` (`code` | `algorithm`):
+- New `code` type with `sub_type` (`code` | `algorithm`):
   - Fields: `code_body` (string), optional `code_caption` (list of strings)
-- 2. New `list` type with `sub_type` (`text` | `ref_text`):
+- New `list` type with `sub_type` (`text` | `ref_text`):
   - Field: `list_items` (array of strings)
-- 3. All `discarded_blocks` entries are also output (e.g., headers, footers, page numbers, margin notes, page footnotes).
-- 4. Existing types (`image`, `table`, `text`, `equation`) remain unchanged.
-- 5. `bbox` still uses the 0–1000 normalized coordinate mapping.
+- All `discarded_blocks` entries are also output (e.g., headers, footers, page numbers, margin notes, page footnotes).
+- Existing types (`image`, `table`, `text`, `equation`) remain unchanged.
+- `bbox` still uses the 0–1000 normalized coordinate mapping.
 
 
 ##### Examples
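
Reviewer note: the `content_list.json` extensions listed in the hunk above can be consumed roughly as sketched below. This is only an illustration under stated assumptions, not code from the repository. The field names `type`, `sub_type`, `code_body`, `code_caption`, and `list_items` come from the documented text; the helper names `render_entry` and `bbox_to_pixels`, the fallback `text` key for other entry types, and the `[x0, y0, x1, y1]` ordering of `bbox` are assumptions made for the example.

```python
from typing import Any


def render_entry(entry: dict[str, Any]) -> str:
    """Render one content_list entry as plain text (hypothetical helper)."""
    kind = entry.get("type")
    if kind == "code":
        # `sub_type` is documented as `code` or `algorithm`.
        label = entry.get("sub_type", "code")
        # `code_caption` is optional and documented as a list of strings.
        caption = " ".join(entry.get("code_caption", []))
        header = f"[{label}] {caption}".rstrip()
        return f"{header}\n{entry['code_body']}"
    if kind == "list":
        # `list_items` is documented as an array of strings.
        return "\n".join(f"- {item}" for item in entry.get("list_items", []))
    # Assumption: other entry types expose their content under a `text` key.
    return str(entry.get("text", ""))


def bbox_to_pixels(bbox: list[float], page_width: float, page_height: float) -> list[float]:
    """Map a 0-1000 normalized bbox to pixels (assumes [x0, y0, x1, y1] order)."""
    x0, y0, x1, y1 = bbox
    return [
        x0 * page_width / 1000,
        y0 * page_height / 1000,
        x1 * page_width / 1000,
        y1 * page_height / 1000,
    ]
```

For example, `render_entry({"type": "list", "sub_type": "text", "list_items": ["a", "b"]})` yields two lines, one per list item, each prefixed with `- `.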

+ 10 - 8
docs/zh/reference/output_files.md

@@ -535,10 +535,11 @@ inference_result: list[PageInferenceResults] = []
 
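 ##### File format description
 The middle.json file structure of the vlm backend is similar to that of the pipeline backend, with the following differences:
-- 1. `list` becomes a second-level block, with a new "sub_type" field to distinguish list types; "sub_type" can be "text" (text type) or "ref_text" (reference type)
-- 2. Adds a code-type block; the code type has two "sub_type" values, "code" and "algorithm", with at least a code_body and an optional code_caption
-- 3. Elements in `discarded_blocks` gain the types "header", "footer", "page_number", "aside_text", and "page_footnote"
-- 4. All blocks gain an `angle` field indicating the rotation angle: 0, 90, 180, or 270
+
+- `list` becomes a second-level block, with a new "sub_type" field to distinguish list types; "sub_type" can be "text" (text type) or "ref_text" (reference type)
+- Adds a code-type block; the code type has two "sub_type" values, "code" and "algorithm", with at least a code_body and an optional code_caption
+- Elements in `discarded_blocks` gain the types "header", "footer", "page_number", "aside_text", and "page_footnote"
+- All blocks gain an `angle` field indicating the rotation angle: 0, 90, 180, or 270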
 
 
 ##### Example data
@@ -713,10 +714,11 @@ The middle.json file structure of the vlm backend is similar to that of the pipeline backend, with the following
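 **File naming format**: `{original_filename}_content_list.json`
 
 ##### File format description
-The content_list.json file structure of the vlm backend is similar to that of the pipeline backend; following the changes to middle.json, these adjustments were made:
-- 1. New `code` type; the code type has two "sub_type" values, "code" and "algorithm", with at least a code_body and an optional code_caption
-- 2. New `list` type; the list type has two "sub_type" values, "text" and "ref_text"
-- 3. Added output of all `discarded_blocks` content
+The content_list.json file structure of the vlm backend is similar to that of the pipeline backend; following the changes to middle.json, these adjustments were made:
+
+- New `code` type; the code type has two "sub_type" values, "code" and "algorithm", with at least a code_body and an optional code_caption
+- New `list` type; the list type has two "sub_type" values, "text" and "ref_text"
+- Added output of all `discarded_blocks` content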
 
 ##### Example data
 - code type content