Merge pull request #3505 from myhloli/dev

update docs
Xiaomeng Zhao, 2 months ago
Parent
Current commit
9d5568a9cb

+ 7 - 0
docs/en/reference/output_files.md

@@ -699,10 +699,17 @@ Example: discarded blocks output
 The above files constitute MinerU's complete output results. Users can choose appropriate files for subsequent processing based on their needs:
 
 - **Model outputs**: 
+
   * Use raw outputs (model.json, model_output.txt)
+  
 - **Debugging and verification**:
+
   * Use visualization files (layout.pdf, spans.pdf) 
+  
 - **Content extraction**: 
+
   * Use simplified files (*.md, content_list.json)
+  
 - **Secondary development**: 
+
   * Use structured files (middle.json)
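
The purpose-to-file mapping in this hunk can be sketched as a small lookup helper. This is only an illustration, not part of MinerU: the helper name `select_outputs`, the purpose keys, and the leading `*` wildcard (which allows for a per-document filename prefix such as the hypothetical `demo_`) are all assumptions.

```python
from fnmatch import fnmatch

# Purpose -> filename patterns, taken from the list in the hunk above.
# The leading "*" is an assumption: it tolerates a per-document prefix.
FILES_BY_PURPOSE = {
    "model_outputs": ["*model.json", "*model_output.txt"],
    "debugging": ["*layout.pdf", "*spans.pdf"],
    "content_extraction": ["*.md", "*content_list.json"],
    "secondary_development": ["*middle.json"],
}


def select_outputs(filenames, purpose):
    """Return the filenames that match the patterns for the given purpose.

    The input order is preserved. An unknown purpose raises KeyError.
    """
    patterns = FILES_BY_PURPOSE[purpose]
    return [
        name
        for name in filenames
        if any(fnmatch(name, pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    # Hypothetical output directory listing for a document named "demo".
    names = [
        "demo.md",
        "demo_content_list.json",
        "demo_middle.json",
        "demo_model.json",
        "demo_layout.pdf",
        "demo_spans.pdf",
    ]
    print(select_outputs(names, "content_extraction"))
```

Matching on names rather than reading the files keeps the sketch independent of any particular output schema.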

+ 11 - 0
docs/en/usage/cli_tools.md

@@ -66,22 +66,33 @@ Some parameters of MinerU command line tools have equivalent environment variabl
 Here are the environment variables and their descriptions:
 
 - `MINERU_DEVICE_MODE`:
+
   * Used to specify inference device
   * supports device types like `cpu/cuda/cuda:0/npu/mps`
   * only effective for `pipeline` backend.
+  
 - `MINERU_VIRTUAL_VRAM_SIZE`: 
+
   * Used to specify maximum GPU VRAM usage per process (GB)
   * only effective for `pipeline` backend.
+  
 - `MINERU_MODEL_SOURCE`: 
+
   * Used to specify model source
   * supports `huggingface/modelscope/local`
   * defaults to `huggingface`, can be switched to `modelscope` or local models through environment variables.
+  
 - `MINERU_TOOLS_CONFIG_JSON`: 
+
   * Used to specify configuration file path
   * defaults to `mineru.json` in user directory, can specify other configuration file paths through environment variables.
+  
 - `MINERU_FORMULA_ENABLE`:
+
   * Used to enable formula parsing
   * defaults to `true`, can be set to `false` through environment variables to disable formula parsing.
+  
 - `MINERU_TABLE_ENABLE`: 
+
   * Used to enable table parsing
   * defaults to `true`, can be set to `false` through environment variables to disable table parsing.
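
As a rough sketch of how the variables in this hunk might be set from a script, the helper below builds an environment mapping for a child process. The variable names come from the documentation above; the function name `build_mineru_env` and its parameters are hypothetical, and the way MinerU parses each value is not verified here.

```python
import os


def build_mineru_env(device_mode=None, virtual_vram_gb=None, model_source=None,
                     config_json=None, formula_enable=None, table_enable=None,
                     base_env=None):
    """Return a copy of the environment with the requested MinerU variables set.

    Arguments left as None are not written, so MinerU's documented defaults
    (for example `huggingface` as model source) still apply.
    """
    env = dict(os.environ if base_env is None else base_env)
    settings = {
        "MINERU_DEVICE_MODE": device_mode,            # e.g. "cpu", "cuda:0", "mps"
        "MINERU_VIRTUAL_VRAM_SIZE": virtual_vram_gb,  # GB per process
        "MINERU_MODEL_SOURCE": model_source,          # huggingface/modelscope/local
        "MINERU_TOOLS_CONFIG_JSON": config_json,      # path to a config file
        "MINERU_FORMULA_ENABLE": formula_enable,
        "MINERU_TABLE_ENABLE": table_enable,
    }
    for name, value in settings.items():
        if value is None:
            continue  # leave unset so the default applies
        if isinstance(value, bool):
            # The docs spell the switches as `true` / `false`.
            value = "true" if value else "false"
        env[name] = str(value)
    return env


if __name__ == "__main__":
    env = build_mineru_env(device_mode="cuda:0", virtual_vram_gb=8,
                           table_enable=False, base_env={})
    for key in sorted(env):
        print(f"{key}={env[key]}")
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching the command line tool, which avoids mutating the parent process environment.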

+ 5 - 0
docs/en/usage/quick_usage.md

@@ -78,13 +78,18 @@ MinerU is now ready to use out of the box, but also supports extending functiona
 Here are some available configuration options:  
 
 - `latex-delimiter-config`: 
+
   * Used to configure LaTeX formula delimiters
   * Defaults to `$` symbol, can be modified to other symbols or strings as needed.
+  
 - `llm-aided-config`:
+
   * Used to configure parameters for LLM-assisted title hierarchy
   * Compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. 
   * You need to configure your own API key and set `enable` to `true` to enable this feature.
+  
 - `models-dir`: 
+
   * Used to specify local model storage directory
   * Please specify model directories for `pipeline` and `vlm` backends separately.
   * After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
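
To make the three options in this hunk concrete, here is a minimal sketch that assembles a configuration dictionary and serializes it as JSON. Only the top-level keys, the `$` default, the model name, the `enable` switch, and the separate `pipeline` and `vlm` directories come from the text above. The nested key names (`inline`, `left`, `right`, `title_aided`, `api_key`, `model`) and the helper `build_config` are assumptions and should be checked against the template configuration file shipped with MinerU.

```python
import json


def build_config(pipeline_dir, vlm_dir, api_key=None):
    """Build a config dict covering the three documented options.

    Nested key names are assumptions, not a confirmed schema.
    """
    return {
        "latex-delimiter-config": {
            # Assumed layout: one delimiter pair, defaulting to `$`.
            "inline": {"left": "$", "right": "$"},
        },
        "llm-aided-config": {
            # Assumed sub-section name for LLM-assisted title hierarchy.
            "title_aided": {
                "api_key": api_key or "",
                "model": "qwen2.5-32b-instruct",
                # The feature is enabled only when a key is supplied.
                "enable": bool(api_key),
            },
        },
        "models-dir": {
            # The docs ask for separate directories per backend.
            "pipeline": pipeline_dir,
            "vlm": vlm_dir,
        },
    }


if __name__ == "__main__":
    # Hypothetical local model paths.
    config = build_config("/models/pipeline", "/models/vlm")
    print(json.dumps(config, indent=2))
```

Writing the result to the path named by `MINERU_TOOLS_CONFIG_JSON`, and setting `MINERU_MODEL_SOURCE=local`, would then follow the steps described in the list above.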

+ 7 - 0
docs/zh/reference/output_files.md

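@@ -809,10 +809,17 @@ The content_list.json file structure of the vlm backend is similar to that of the pipeline backend, along with
 The files above make up MinerU's complete output. Users can choose the appropriate files for subsequent processing based on their needs:
 
 - **Model outputs**:
+
   * Use raw outputs (model.json, model_output.txt)
+  
 - **Debugging and verification**:
+
   * Use visualization files (layout.pdf, spans.pdf) 
+  
 - **Content extraction**:
+
   * Use simplified files (*.md, content_list.json)
+  
 - **Secondary development**:
+
   * Use structured files (middle.json)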

+ 11 - 0
docs/zh/usage/cli_tools.md

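@@ -61,22 +61,33 @@ Some parameters of MinerU command line tools have equivalent environment variable configurations,
 Here are the commonly used environment variables and their descriptions: 
 
 - `MINERU_DEVICE_MODE`:
+
   * Used to specify the inference device
   * Supports device types such as `cpu/cuda/cuda:0/npu/mps`
   * Only effective for the `pipeline` backend.
+  
 - `MINERU_VIRTUAL_VRAM_SIZE`:
+
   * Used to specify the maximum GPU VRAM usage per process (GB)
   * Only effective for the `pipeline` backend.
+  
 - `MINERU_MODEL_SOURCE`:
+
   * Used to specify the model source
   * Supports `huggingface/modelscope/local`
   * Defaults to `huggingface`; can be switched to `modelscope` or local models through environment variables.
+  
 - `MINERU_TOOLS_CONFIG_JSON`:
+
   * Used to specify the configuration file path
   * Defaults to `mineru.json` in the user directory; other configuration file paths can be specified through environment variables.
+  
 - `MINERU_FORMULA_ENABLE`:
+
   * Used to enable formula parsing
   * Defaults to `true`; can be set to `false` through environment variables to disable formula parsing.
+  
 - `MINERU_TABLE_ENABLE`:
+
   * Used to enable table parsing
   * Defaults to `true`; can be set to `false` through environment variables to disable table parsing.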

+ 5 - 0
docs/zh/usage/quick_usage.md

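@@ -77,12 +77,17 @@ MinerU is now ready to use out of the box, but also supports extending functionality through configuration files.
 Here are some available configuration options: 
 
 - `latex-delimiter-config`:
+
   * Used to configure LaTeX formula delimiters
   * Defaults to the `$` symbol; can be changed to other symbols or strings as needed.
+  
 - `llm-aided-config`:
+
   * Used to configure parameters for LLM-assisted title hierarchy; compatible with all LLM models supporting the `openai protocol`
   * Defaults to `Alibaba Cloud Bailian`'s `qwen2.5-32b-instruct` model
   * You need to configure your own API key and set `enable` to `true` to enable this feature.
+  
 - `models-dir`:
+
   * Used to specify the local model storage directory; please specify model directories for the `pipeline` and `vlm` backends separately,
   * After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.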