myhloli 2 months ago
parent
commit
4864a086ce

+ 23 - 9
docs/en/reference/output_files.md

@@ -519,9 +519,19 @@ Text levels are distinguished through the `text_level` field:
 
 Structure is broadly similar to the pipeline backend, but with these differences:
 
-- `list` becomes a second‑level block; a new field `sub_type` distinguishes list categories:`text`: ordinary list; `ref_text`: reference / bibliography style list
-- New `code` block type with `sub_type`:`code`、`algorithm`, a code block always has at least a `code_body`; it may optionally have a `code_caption`.
-- `discarded_blocks` may contain additional types: `header`, `footer`, `page_number`, `aside_text`, `page_footnote`.
+- `list` becomes a second‑level block; a new field `sub_type` distinguishes list categories:
+  * `text`: ordinary list
+  * `ref_text`: reference / bibliography style list
+- New `code` block type with `sub_type`:
+  * `code`
+  * `algorithm`
+- A code block always has a `code_body` and may optionally have a `code_caption`.
+- `discarded_blocks` may contain additional types:
+  * `header`
+  * `footer`
+  * `page_number`
+  * `aside_text`
+  * `page_footnote`
 - All blocks include an `angle` field indicating rotation (one of `0, 90, 180, 270`).
 
 ##### Examples
@@ -629,9 +639,9 @@ Structure is broadly similar to the pipeline backend, but with these differences
 Based on the pipeline format, with these VLM-specific extensions:
 
 - New `code` type with `sub_type` (`code` | `algorithm`):
-  - Fields: `code_body` (string), optional `code_caption` (list of strings)
+  * Fields: `code_body` (string), optional `code_caption` (list of strings)
 - New `list` type with `sub_type` (`text` | `ref_text`):
-  - Field: `list_items` (array of strings)
+  * Field: `list_items` (array of strings)
 - All `discarded_blocks` entries are also output (e.g., headers, footers, page numbers, margin notes, page footnotes).
 - Existing types (`image`, `table`, `text`, `equation`) remain unchanged.
 - `bbox` still uses the 0–1000 normalized coordinate mapping.
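The VLM-specific extensions listed above can be consumed with a small dispatcher. The following is a minimal sketch, assuming each `content_list.json` entry is a dict with a `type` key; the function name `render_entry` and the plain-text output layout are hypothetical and not part of MinerU.

```python
from typing import Any


def render_entry(entry: dict[str, Any]) -> str:
    """Render one content_list.json entry as plain text (illustrative only)."""
    kind = entry.get("type")
    if kind == "code":
        # `code_caption` is optional and documented as a list of strings.
        caption = " ".join(entry.get("code_caption", []))
        body = entry.get("code_body", "")
        label = entry.get("sub_type", "code")  # `code` or `algorithm`
        header = f"[{label}] {caption}".rstrip()
        return f"{header}\n{body}"
    if kind == "list":
        items = entry.get("list_items", [])
        # Assumption: number bibliography-style lists, bullet ordinary ones.
        if entry.get("sub_type") == "ref_text":
            return "\n".join(f"[{i}] {item}" for i, item in enumerate(items, 1))
        return "\n".join(f"- {item}" for item in items)
    # Existing types (image, table, text, equation) are left to the caller.
    return ""
```

Entries of other types return an empty string here, so a caller can fall back to whatever handling it already has for the pipeline format.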
@@ -688,7 +698,11 @@ Example: discarded blocks output
 
 The above files constitute MinerU's complete output results. Users can choose appropriate files for subsequent processing based on their needs:
 
-- **Model outputs**: Use raw outputs (model.json, model_output.txt)
-- **Debugging and verification**: Use visualization files (layout.pdf, spans.pdf) 
-- **Content extraction**: Use simplified files (*.md, content_list.json)
-- **Secondary development**: Use structured files (middle.json)
+- **Model outputs**: 
+  * Use raw outputs (model.json, model_output.txt)
+- **Debugging and verification**:
+  * Use visualization files (layout.pdf, spans.pdf) 
+- **Content extraction**: 
+  * Use simplified files (*.md, content_list.json)
+- **Secondary development**: 
+  * Use structured files (middle.json)

+ 20 - 6
docs/en/usage/cli_tools.md

@@ -65,9 +65,23 @@ Options:
 Some parameters of the MinerU command-line tools have equivalent environment variables. Environment variables generally take precedence over command-line parameters and apply to all command-line tools.
 Here are the environment variables and their descriptions:
 
-- `MINERU_DEVICE_MODE`: Used to specify inference device, supports device types like `cpu/cuda/cuda:0/npu/mps`, only effective for `pipeline` backend.
-- `MINERU_VIRTUAL_VRAM_SIZE`: Used to specify maximum GPU VRAM usage per process (GB), only effective for `pipeline` backend.
-- `MINERU_MODEL_SOURCE`: Used to specify model source, supports `huggingface/modelscope/local`, defaults to `huggingface`, can be switched to `modelscope` or local models through environment variables.
-- `MINERU_TOOLS_CONFIG_JSON`: Used to specify configuration file path, defaults to `mineru.json` in user directory, can specify other configuration file paths through environment variables.
-- `MINERU_FORMULA_ENABLE`: Used to enable formula parsing, defaults to `true`, can be set to `false` through environment variables to disable formula parsing.
-- `MINERU_TABLE_ENABLE`: Used to enable table parsing, defaults to `true`, can be set to `false` through environment variables to disable table parsing.
+- `MINERU_DEVICE_MODE`:
+  * Specifies the inference device.
+  * Supports device types such as `cpu/cuda/cuda:0/npu/mps`.
+  * Only effective for the `pipeline` backend.
+- `MINERU_VIRTUAL_VRAM_SIZE`:
+  * Specifies the maximum GPU VRAM usage per process (GB).
+  * Only effective for the `pipeline` backend.
+- `MINERU_MODEL_SOURCE`:
+  * Specifies the model source.
+  * Supports `huggingface/modelscope/local`.
+  * Defaults to `huggingface`; set it to `modelscope` or `local` to switch sources.
+- `MINERU_TOOLS_CONFIG_JSON`:
+  * Specifies the configuration file path.
+  * Defaults to `mineru.json` in the user directory; set it to point at a different configuration file.
+- `MINERU_FORMULA_ENABLE`:
+  * Enables formula parsing.
+  * Defaults to `true`; set it to `false` to disable formula parsing.
+- `MINERU_TABLE_ENABLE`:
+  * Enables table parsing.
+  * Defaults to `true`; set it to `false` to disable table parsing.
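As an illustration of how these variables might be set from a script, here is a minimal sketch that builds an environment for a `mineru` subprocess. The helper names `build_mineru_env` and `build_mineru_command` are hypothetical; only the variable names and the `mineru -p <input_path> -o <output_path>` form come from the documentation.

```python
import os
from typing import Mapping, Optional


def build_mineru_env(
    device: Optional[str] = None,
    vram_gb: Optional[int] = None,
    model_source: Optional[str] = None,
    formula: Optional[bool] = None,
    table: Optional[bool] = None,
    base: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Return a copy of the environment with the requested MinerU variables set."""
    env = dict(os.environ if base is None else base)
    if device is not None:
        env["MINERU_DEVICE_MODE"] = device  # e.g. "cpu" or "cuda:0"
    if vram_gb is not None:
        env["MINERU_VIRTUAL_VRAM_SIZE"] = str(vram_gb)
    if model_source is not None:
        env["MINERU_MODEL_SOURCE"] = model_source
    # The docs describe these switches as `true` / `false` strings.
    if formula is not None:
        env["MINERU_FORMULA_ENABLE"] = "true" if formula else "false"
    if table is not None:
        env["MINERU_TABLE_ENABLE"] = "true" if table else "false"
    return env


def build_mineru_command(input_path: str, output_path: str) -> list[str]:
    """Build the argument list for `mineru -p <input_path> -o <output_path>`."""
    return ["mineru", "-p", input_path, "-o", output_path]


# Usage (not executed here):
# import subprocess
# subprocess.run(build_mineru_command("in.pdf", "out"),
#                env=build_mineru_env(device="cpu", table=False), check=True)
```

Passing the environment explicitly keeps the settings scoped to one invocation instead of changing the parent process.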

+ 11 - 3
docs/en/usage/quick_usage.md

@@ -77,7 +77,15 @@ MinerU is now ready to use out of the box, but also supports extending functiona
 
 Here are some available configuration options:  
 
-- `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to `$` symbol, can be modified to other symbols or strings as needed.
-- `llm-aided-config`: Used to configure parameters for LLM-assisted title hierarchy, compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
-- `models-dir`: Used to specify local model storage directory, please specify model directories for `pipeline` and `vlm` backends separately. After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
+- `latex-delimiter-config`:
+  * Configures the LaTeX formula delimiters.
+  * Defaults to the `$` symbol; can be changed to other symbols or strings as needed.
+- `llm-aided-config`:
+  * Configures parameters for LLM-assisted title hierarchy.
+  * Compatible with any LLM that supports the `openai protocol`; defaults to Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model.
+  * You need to configure your own API key and set `enable` to `true` to enable this feature.
+- `models-dir`:
+  * Specifies the local model storage directory.
+  * Specify model directories for the `pipeline` and `vlm` backends separately.
+  * After specifying the directories, you can use local models by setting the environment variable: `export MINERU_MODEL_SOURCE=local`.
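To check which of these options are active, a script can load the configuration file itself. The sketch below assumes the file is JSON located at the path in `MINERU_TOOLS_CONFIG_JSON`, or `mineru.json` in the user directory when that variable is unset. The nesting inside `llm-aided-config` is not described here, so the hypothetical helper `any_enabled` searches the section recursively for `enable: true`.

```python
import json
import os
from pathlib import Path
from typing import Any, Mapping, Optional


def config_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the config file path: the env override if set, else ~/mineru.json."""
    env = os.environ if env is None else env
    override = env.get("MINERU_TOOLS_CONFIG_JSON")
    return Path(override) if override else Path.home() / "mineru.json"


def any_enabled(section: Any) -> bool:
    """Return True if `enable` is true anywhere inside a config section."""
    if isinstance(section, dict):
        if section.get("enable") is True:
            return True
        return any(any_enabled(value) for value in section.values())
    if isinstance(section, list):
        return any(any_enabled(value) for value in section)
    return False


def load_config(path: Path) -> dict:
    """Read the JSON config; a missing file is treated as an empty config."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


# Usage (not executed here):
# config = load_config(config_path())
# print("LLM-aided titles enabled:", any_enabled(config.get("llm-aided-config")))
```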
 

+ 31 - 9
docs/zh/reference/output_files.md

@@ -536,9 +536,18 @@ inference_result: list[PageInferenceResults] = []
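 ##### File format description
 The vlm backend's middle.json structure is similar to the pipeline backend's, with the following differences: 
 
-- `list` becomes a second-level block; a new "sub_type" field distinguishes list types, and "sub_type" can be "text" (text type) or "ref_text" (reference type)
-- A new code block type is added; the code type has two "sub_type" values, "code" and "algorithm", always has a code_body, and optionally has a code_caption
-- Element types inside `discarded_blocks` now also include "header", "footer", "page_number", "aside_text", and "page_footnote"
+- `list` becomes a second-level block; a new `sub_type` field distinguishes list types:
+  * `text` (text type)
+  * `ref_text` (reference type)
+- A new code block type is added; the code type has two "sub_type" values:
+  * "code" and "algorithm"
+  * Always has a code_body, optionally a code_caption
+- Element types inside `discarded_blocks` now also include:
+  * `header` (page header)
+  * `footer` (page footer)
+  * `page_number` (page number)
+  * `aside_text` (margin text)
+  * `page_footnote` (footnote)
 - All blocks gain an `angle` field indicating the rotation angle: 0, 90, 180, or 270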
 
 
@@ -716,9 +725,18 @@ vlm 后端的 middle.json 文件结构与 pipeline 后端类似,但存在以
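 ##### File format description
 The vlm backend's content_list.json structure is similar to the pipeline backend's; to follow the middle.json changes, the following adjustments were made: 
 
-- New `code` type; the code type has two "sub_type" values, "code" and "algorithm", always has a code_body, and optionally has a code_caption
-- New `list` type; the list type has two "sub_type" values, "text" and "ref_text" 
+- New `code` type; the code type has two "sub_type" values:
+  * "code" and "algorithm"
+  * Always has a code_body, optionally a code_caption
+- New `list` type; the list type has two "sub_type" values:
+  * `text`
+  * `ref_text` 
 - The content of all `discarded_blocks` is now also output
+  * `header`
+  * `footer`
+  * `page_number`
+  * `aside_text`
+  * `page_footnote`
 
 ##### Example data
 - code type content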
@@ -790,7 +808,11 @@ vlm 后端的 content_list.json 文件结构与 pipeline 后端类似,伴随
 
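 The files above make up MinerU's complete output. Users can choose the appropriate files for further processing as needed:
 
-- **Model outputs**: Use raw outputs (model.json, model_output.txt)
-- **Debugging and verification**: Use visualization files (layout.pdf, spans.pdf) 
-- **Content extraction**: Use simplified files (*.md, content_list.json)
-- **Secondary development**: Use structured files (middle.json)
+- **Model outputs**:
+  * Use raw outputs (model.json, model_output.txt)
+- **Debugging and verification**:
+  * Use visualization files (layout.pdf, spans.pdf) 
+- **Content extraction**:
+  * Use simplified files (*.md, content_list.json)
+- **Secondary development**:
+  * Use structured files (middle.json)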

+ 20 - 6
docs/zh/usage/cli_tools.md

@@ -60,9 +60,23 @@ Options:
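 Some parameters of the MinerU command-line tools have equivalent environment variables. Environment variables generally take precedence over command-line parameters and apply to all command-line tools.
 Here are the commonly used environment variables and their descriptions: 
 
-- `MINERU_DEVICE_MODE`: Specifies the inference device; supports device types such as `cpu/cuda/cuda:0/npu/mps`; only effective for the `pipeline` backend.
-- `MINERU_VIRTUAL_VRAM_SIZE`: Specifies the maximum GPU VRAM usage per process (GB); only effective for the `pipeline` backend.
-- `MINERU_MODEL_SOURCE`: Specifies the model source; supports `huggingface/modelscope/local`; defaults to `huggingface`, and can be switched to `modelscope` or local models through the environment variable.
-- `MINERU_TOOLS_CONFIG_JSON`: Specifies the configuration file path; defaults to `mineru.json` in the user directory, and another configuration file path can be specified through the environment variable.
-- `MINERU_FORMULA_ENABLE`: Enables formula parsing; defaults to `true`, and can be set to `false` through the environment variable to disable formula parsing.
-- `MINERU_TABLE_ENABLE`: Enables table parsing; defaults to `true`, and can be set to `false` through the environment variable to disable table parsing.
+- `MINERU_DEVICE_MODE`:
+  * Specifies the inference device
+  * Supports device types such as `cpu/cuda/cuda:0/npu/mps`
+  * Only effective for the `pipeline` backend.
+- `MINERU_VIRTUAL_VRAM_SIZE`:
+  * Specifies the maximum GPU VRAM usage per process (GB)
+  * Only effective for the `pipeline` backend.
+- `MINERU_MODEL_SOURCE`:
+  * Specifies the model source
+  * Supports `huggingface/modelscope/local`
+  * Defaults to `huggingface`; can be switched to `modelscope` or local models through the environment variable.
+- `MINERU_TOOLS_CONFIG_JSON`:
+  * Specifies the configuration file path
+  * Defaults to `mineru.json` in the user directory; another configuration file path can be specified through the environment variable.
+- `MINERU_FORMULA_ENABLE`:
+  * Enables formula parsing
+  * Defaults to `true`; can be set to `false` through the environment variable to disable formula parsing.
+- `MINERU_TABLE_ENABLE`:
+  * Enables table parsing
+  * Defaults to `true`; can be set to `false` through the environment variable to disable table parsing.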

+ 11 - 4
docs/zh/usage/quick_usage.md

@@ -64,7 +64,7 @@ mineru -p <input_path> -o <output_path> -b vlm-transformers
   > ```
 
 > [!NOTE]
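-> All parameters officially supported by vllm can be passed to MinerU as command-line arguments, including for the following commands: `mineru`, `mineru-vllm -server`, `mineru-gradio`, `mineru-api`.
+> All parameters officially supported by vllm can be passed to MinerU as command-line arguments, including for the following commands: `mineru`, `mineru-vllm-server`, `mineru-gradio`, `mineru-api`.
 > We have compiled some commonly used `vllm` parameters and usage notes, available in the [Advanced CLI parameters](./advanced_cli_parameters.md) document.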
 
 ## Extending MinerU functionality with a configuration file
@@ -76,6 +76,13 @@ MinerU 现已实现开箱即用,但也支持通过配置文件扩展功能。
 
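 Here are some available configuration options: 
 
-- `latex-delimiter-config`: Configures the LaTeX formula delimiters; defaults to the `$` symbol, and can be changed to other symbols or strings as needed.
-- `llm-aided-config`: Configures parameters for LLM-assisted title hierarchy; compatible with any LLM that supports the `openai protocol`; defaults to Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
-- `models-dir`: Specifies the local model storage directory; specify model directories for the `pipeline` and `vlm` backends separately. After specifying the directories, you can use local models by setting the environment variable `export MINERU_MODEL_SOURCE=local`.
+- `latex-delimiter-config`:
+  * Configures the LaTeX formula delimiters
+  * Defaults to the `$` symbol; can be changed to other symbols or strings as needed.
+- `llm-aided-config`:
+  * Configures parameters for LLM-assisted title hierarchy; compatible with any LLM that supports the `openai protocol`
+  * Defaults to Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model
+  * You need to configure your own API key and set `enable` to `true` to enable this feature.
+- `models-dir`:
+  * Specifies the local model storage directory; specify model directories for the `pipeline` and `vlm` backends separately.
+  * After specifying the directories, you can use local models by setting the environment variable `export MINERU_MODEL_SOURCE=local`.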