Merge remote-tracking branch 'origin/dev' into dev

Sidney233 4 months ago
Parent
Current commit
2d23d70e7b

+ 16 - 3
README.md

@@ -44,6 +44,14 @@
 
 # Changelog
 
+- 2025/07/16 2.1.1 Released
+  - Bug fixes
+    - Fixed text block content loss issue that could occur in certain `pipeline` scenarios #3005
+    - Fixed issue where `sglang-client` required unnecessary packages like `torch` #2968
+    - Updated `dockerfile` to fix incomplete text content parsing due to missing fonts in Linux #2915
+  - Usability improvements
+    - Updated `compose.yaml` to facilitate direct startup of `sglang-server`, `mineru-api`, and `mineru-gradio` services
+    - Launched brand new [online documentation site](https://opendatalab.github.io/MinerU/), simplified readme, providing better documentation experience
 - 2025/07/05 Version 2.1.0 Released
   - This is the first major update of MinerU 2, which includes a large number of new features and improvements, covering significant performance optimizations, user experience enhancements, and bug fixes. The detailed update contents are as follows:
   - **Performance Optimizations:**
@@ -51,10 +59,10 @@
     - Greatly enhanced post-processing speed when the `pipeline` backend handles batch processing of documents with fewer pages (<10 pages).
     - Layout analysis speed of the `pipeline` backend has been increased by approximately 20%.
   - **Experience Enhancements:**
-    - Built-in ready-to-use `fastapi service` and `gradio webui`. For detailed usage instructions, please refer to [Documentation](#3-api-calls-or-visual-invocation).
+    - Built-in ready-to-use `fastapi service` and `gradio webui`. For detailed usage instructions, please refer to [Documentation](https://opendatalab.github.io/MinerU/usage/quick_usage/#advanced-usage-via-api-webui-sglang-clientserver).
     - Adapted to `sglang` version `0.4.8`, significantly reducing the GPU memory requirements for the `vlm-sglang` backend. It can now run on graphics cards with as little as `8GB GPU memory` (Turing architecture or newer).
     - Added transparent parameter passing for all commands related to `sglang`, allowing the `sglang-engine` backend to receive all `sglang` parameters consistently with the `sglang-server`.
-    - Supports feature extensions based on configuration files, including `custom formula delimiters`, `enabling heading classification`, and `customizing local model directories`. For detailed usage instructions, please refer to [Documentation](#4-extending-mineru-functionality-through-configuration-files).
+    - Supports feature extensions based on configuration files, including `custom formula delimiters`, `enabling heading classification`, and `customizing local model directories`. For detailed usage instructions, please refer to [Documentation](https://opendatalab.github.io/MinerU/usage/quick_usage/#extending-mineru-functionality-with-configuration-files).
   - **New Features:**
     - Updated the `pipeline` backend with the PP-OCRv5 multilingual text recognition model, supporting text recognition in 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. [Details](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
     - Introduced limited support for vertical text layout in the `pipeline` backend.
@@ -517,6 +525,11 @@ You can get the [Docker Deployment Instructions](https://opendatalab.github.io/M
 
 ### Using MinerU
 
+The simplest command line invocation is:
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
 You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the [Usage Guide](https://opendatalab.github.io/MinerU/usage/).
 
 # TODO
@@ -617,4 +630,4 @@ Currently, some models in this project are trained based on YOLO. However, since
 - [PDF-Extract-Kit (A Comprehensive Toolkit for High-Quality PDF Content Extraction)](https://github.com/opendatalab/PDF-Extract-Kit)
 - [OmniDocBench (A Comprehensive Benchmark for Document Parsing and Evaluation)](https://github.com/opendatalab/OmniDocBench)
 - [Magic-HTML (Mixed web page extraction tool)](https://github.com/opendatalab/magic-html)
-- [Magic-Doc (Fast speed ppt/pptx/doc/docx/pdf extraction tool)](https://github.com/InternLM/magic-doc) 
+- [Magic-Doc (Fast speed ppt/pptx/doc/docx/pdf extraction tool)](https://github.com/InternLM/magic-doc) 

+ 17 - 3
README_zh-CN.md

@@ -43,17 +43,25 @@
 </div>

 # 更新记录
+- 2025/07/16 2.1.1发布
+  - bug修复 
+    - 修复`pipeline`在某些情况可能发生的文本块内容丢失问题 #3005
+    - 修复`sglang-client`需要安装`torch`等不必要的包的问题 #2968
+    - 更新`dockerfile`以修复linux字体缺失导致的解析文本内容不完整问题 #2915
+  - 易用性更新
+    - 更新`compose.yaml`,便于用户直接启动`sglang-server`、`mineru-api`、`mineru-gradio`服务
+    - 启用全新的[在线文档站点](https://opendatalab.github.io/MinerU/zh/),简化readme,提供更好的文档体验
 - 2025/07/05 2.1.0发布
   - 这是 MinerU 2 的第一个大版本更新,包含了大量新功能和改进,包含众多性能优化、体验优化和bug修复,具体更新内容如下:
   - 性能优化:
     - 大幅提升某些特定分辨率(长边2000像素左右)文档的预处理速度
     - 大幅提升`pipeline`后端批量处理大量页数较少(<10)文档时的后处理速度
-    - `pipline`后端的layout分析速度提升约20%
+    - `pipeline`后端的layout分析速度提升约20%
   - 体验优化:
-    - 内置开箱即用的`fastapi服务`和`gradio webui`,详细使用方法请参考[文档](#3-api-调用-或-可视化调用)
+    - 内置开箱即用的`fastapi服务`和`gradio webui`,详细使用方法请参考[文档](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#apiwebuisglang-clientserver)
     - `sglang`适配`0.4.8`版本,大幅降低`vlm-sglang`后端的显存要求,最低可在`8G显存`(Turing及以后架构)的显卡上运行
     - 对所有命令增加`sglang`的参数透传,使得`sglang-engine`后端可以与`sglang-server`一致,接收`sglang`的所有参数
-    - 支持基于配置文件的功能扩展,包含`自定义公式标识符`、`开启标题分级功能`、`自定义本地模型目录`,详细使用方法请参考[文档](#4-基于配置文件扩展-mineru-功能)
+    - 支持基于配置文件的功能扩展,包含`自定义公式标识符`、`开启标题分级功能`、`自定义本地模型目录`,详细使用方法请参考[文档](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#mineru_1)
   - 新特性:
     - `pipeline`后端更新 PP-OCRv5 多语种文本识别模型,支持法语、西班牙语、葡萄牙语、俄语、韩语等 37 种语言的文字识别,平均精度涨幅超30%。[详情](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
     - `pipeline`后端增加对竖排文本的有限支持
@@ -503,6 +511,12 @@ MinerU提供了便捷的docker部署方式,这有助于快速搭建环境并
 ---

 ### 使用 MinerU
+
+最简单的命令行调用方式:
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
 您可以通过命令行、API、WebUI等多种方式使用MinerU进行PDF解析,具体使用方法请参考[使用指南](https://opendatalab.github.io/MinerU/zh/usage/)。
 
 # TODO

+ 8 - 4
docker/china/Dockerfile

@@ -3,14 +3,18 @@ FROM lmsysorg/sglang:v0.4.8.post1-cu126
 
 # Install libgl for opencv support & Noto fonts for Chinese characters
 RUN apt-get update && \
-    apt-get install -y fonts-noto-core fonts-noto-cjk && \
-    apt-get install -y libgl1 && \
-    apt-get clean && \
+    apt-get install -y \
+        fonts-noto-core \
+        fonts-noto-cjk \
+        fontconfig \
+        libgl1 && \
     fc-cache -fv && \
+    apt-get clean && \
     rm -rf /var/lib/apt/lists/*

 # Install mineru latest
-RUN python3 -m pip install -U 'mineru[core]' -i https://mirrors.aliyun.com/pypi/simple --break-system-packages
+RUN python3 -m pip install -U 'mineru[core]' -i https://mirrors.aliyun.com/pypi/simple --break-system-packages && \
+    python3 -m pip cache purge
 
 # Download models and update the configuration file
 RUN /bin/bash -c "mineru-models-download -s modelscope -m all"

+ 9 - 5
docker/global/Dockerfile

@@ -1,16 +1,20 @@
 # Use the official sglang image
 FROM lmsysorg/sglang:v0.4.8.post1-cu126
 
-# Install libgl for opencv support
+# Install libgl for opencv support & Noto fonts for Chinese characters
 RUN apt-get update && \
-    apt-get install -y fonts-noto-core fonts-noto-cjk && \
-    apt-get install -y libgl1 && \
-    apt-get clean && \
+    apt-get install -y \
+        fonts-noto-core \
+        fonts-noto-cjk \
+        fontconfig \
+        libgl1 && \
     fc-cache -fv && \
+    apt-get clean && \
     rm -rf /var/lib/apt/lists/*

 # Install mineru latest
-RUN python3 -m pip install -U 'mineru[core]' --break-system-packages
+RUN python3 -m pip install -U 'mineru[core]' --break-system-packages && \
+    python3 -m pip cache purge
 
 # Download models and update the configuration file
 RUN /bin/bash -c "mineru-models-download -s huggingface -m all"
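Both updated Dockerfiles install `fontconfig` and run `fc-cache -fv` before cleaning the apt lists, and purge the pip cache after installing `mineru[core]` to keep the image smaller. A minimal build-and-check sketch, assuming it is run from the directory containing the Dockerfile (the tag matches the deployment docs; the `fc-list` check is illustrative, not from this commit):

```bash
# Build the image as the deployment docs describe
docker build -t mineru-sglang:latest -f Dockerfile .

# Illustrative check that the Noto fonts were registered by fc-cache
docker run --rm mineru-sglang:latest fc-list | grep -i noto
```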

+ 2 - 2
docs/en/faq/index.md

@@ -1,8 +1,8 @@
 # Frequently Asked Questions
 
-If your question is not listed, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to communicate with the AI assistant, which can solve most common problems.
+If your question is not listed, try using [DeepWiki](https://deepwiki.com/opendatalab/MinerU)'s AI assistant for common issues.
 
-If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
+For unresolved problems, join our [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) community for support.
 
 ??? question "Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2"
 

+ 5 - 11
docs/en/quick_start/docker_deployment.md

@@ -13,8 +13,6 @@ docker build -t mineru-sglang:latest -f Dockerfile .
 > The [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/global/Dockerfile) uses `lmsysorg/sglang:v0.4.8.post1-cu126` as the base image by default, supporting Turing/Ampere/Ada Lovelace/Hopper platforms.
 > If you are using the newer `Blackwell` platform, please modify the base image to `lmsysorg/sglang:v0.4.8.post1-cu128-b200` before executing the build operation.
 
----
-
 ## Docker Description

 MinerU's Docker uses `lmsysorg/sglang` as the base image, so it includes the `sglang` inference acceleration framework and necessary dependencies by default. Therefore, on compatible devices, you can directly use `sglang` to accelerate VLM model inference.
@@ -28,9 +26,7 @@ MinerU's Docker uses `lmsysorg/sglang` as the base image, so it includes the `sg
 >
 > If your device doesn't meet the above requirements, you can still use other features of MinerU, but cannot use `sglang` to accelerate VLM model inference, meaning you cannot use the `vlm-sglang-engine` backend or start the `vlm-sglang-server` service.
 
----
-
-## Start Docker Container:
+## Start Docker Container
 
 ```bash
 docker run --gpus all \
@@ -42,9 +38,7 @@ docker run --gpus all \
 ```

 After executing this command, you will enter the Docker container's interactive terminal with some ports mapped for potential services. You can directly run MinerU-related commands within the container to use MinerU's features.
-You can also directly start MinerU services by replacing `/bin/bash` with service startup commands. For detailed instructions, please refer to the [MinerU Usage Documentation](../usage/index.md).
-
----
+You can also directly start MinerU services by replacing `/bin/bash` with service startup commands. For detailed instructions, please refer to the [Start the service via command](https://opendatalab.github.io/MinerU/usage/quick_usage/#advanced-usage-via-api-webui-sglang-clientserver).
 
 ## Start Services Directly with Docker Compose
 
@@ -66,7 +60,7 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 ### Start sglang-server service
 connect to `sglang-server` via `vlm-sglang-client` backend
   ```bash
-  docker compose -f compose.yaml --profile mineru-sglang-server up -d
+  docker compose -f compose.yaml --profile sglang-server up -d
   ```
   >[!TIP]
   >In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
@@ -78,7 +72,7 @@ connect to `sglang-server` via `vlm-sglang-client` backend
 
 ### Start Web API service
   ```bash
-  docker compose -f compose.yaml --profile mineru-api up -d
+  docker compose -f compose.yaml --profile api up -d
   ```
   >[!TIP]
   >Access `http://<server_ip>:8000/docs` in your browser to view the API documentation.
@@ -87,7 +81,7 @@ connect to `sglang-server` via `vlm-sglang-client` backend
 
 ### Start Gradio WebUI service
   ```bash
-  docker compose -f compose.yaml --profile mineru-gradio up -d
+  docker compose -f compose.yaml --profile gradio up -d
   ```
   >[!TIP]
   >
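Note that the compose profile names above drop the former `mineru-` prefix. A combined sketch using only commands that appear in these docs (`<input_path>`, `<output_path>`, and `<server_ip>` are placeholders):

```bash
# Start the sglang server through the renamed compose profile
docker compose -f compose.yaml --profile sglang-server up -d

# From another machine or terminal (CPU and network only, no sglang
# environment needed), parse documents via the vlm-sglang-client backend
mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<server_ip>:30000
```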

+ 6 - 1
docs/en/quick_start/index.md

@@ -1,6 +1,6 @@
 # Quick Start
 
-If you encounter any installation issues, please check the [FAQ](../FAQ/index.md) first.
+If you encounter any installation issues, please check the [FAQ](../faq/index.md) first.
 
 ## Online Experience
 
@@ -93,4 +93,9 @@ You can get the [Docker Deployment Instructions](./docker_deployment.md) in the
 
 ### Using MinerU
 
+The simplest command line invocation is:
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
 You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the [Usage Guide](../usage/index.md).

+ 0 - 12
docs/en/usage/advanced_cli_parameters.md

@@ -1,7 +1,5 @@
 # Advanced Command Line Parameters
 
----
-
 ## SGLang Acceleration Parameter Optimization

 ### Memory Optimization Parameters
@@ -11,8 +9,6 @@
 > - If you encounter insufficient VRAM when using a single graphics card, you may need to reduce the KV cache size with `--mem-fraction-static 0.5`. If VRAM issues persist, try reducing it further to `0.4` or lower.
 > - If you have two or more graphics cards, you can try using tensor parallelism (TP) mode to simply expand available VRAM: `--tp-size 2`
 
----
-
 ### Performance Optimization Parameters
 > [!TIP]
 > If you can already use SGLang normally for accelerated VLM model inference but still want to further improve inference speed, you can try the following parameters:
@@ -20,15 +16,11 @@
 > - If you have multiple graphics cards, you can use SGLang's multi-card parallel mode to increase throughput: `--dp-size 2`
 > - You can also enable `torch.compile` to accelerate inference speed by approximately 15%: `--enable-torch-compile`
 
----
-
 ### Parameter Passing Instructions
 > [!TIP]
 > - All officially supported SGLang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
 > - If you want to learn more about `sglang` parameter usage, please refer to the [SGLang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
 
----
-
 ## GPU Device Selection and Configuration

 ### CUDA_VISIBLE_DEVICES Basic Usage
@@ -39,8 +31,6 @@
 >   ```
 > - This specification method is effective for all command line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.
 
----
-
 ### Common Device Configuration Examples
 > [!TIP]
 > Here are some common `CUDA_VISIBLE_DEVICES` setting examples:
@@ -52,8 +42,6 @@
 >   CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
 >   ```
 
----
-
 ## Practical Application Scenarios
 > [!TIP]
 > Here are some possible usage scenarios:

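The flags this page documents can be combined in one invocation. A hypothetical sketch for a two-GPU machine, assembled only from parameters described above:

```bash
# Expose the first two GPUs, shrink the KV cache for low-VRAM cards,
# and split the model across both cards with tensor parallelism
CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 \
    --mem-fraction-static 0.5 \
    --tp-size 2
```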
+ 10 - 83
docs/en/usage/index.md

@@ -1,89 +1,16 @@
-# Using MinerU
+# Usage Guide
 
-## Quick Model Source Configuration
-MinerU uses `huggingface` as the default model source. If users cannot access `huggingface` due to network restrictions, they can conveniently switch the model source to `modelscope` through environment variables:
-```bash
-export MINERU_MODEL_SOURCE=modelscope
-```
-For more information about model source configuration and custom local model paths, please refer to the [Model Source Documentation](./model_source.md) in the documentation.
+This section provides comprehensive usage instructions for the project. We will help you progressively master the project's usage from basic to advanced through the following sections:
 
----
+## Table of Contents
 
-## Quick Usage via Command Line
-MinerU has built-in command line tools that allow users to quickly use MinerU for PDF parsing through the command line:
-```bash
-# Default parsing using pipeline backend
-mineru -p <input_path> -o <output_path>
-```
-> [!TIP]
->- `<input_path>`: Local PDF/image file or directory
->- `<output_path>`: Output directory
->
-> For more information about output files, please refer to [Output File Documentation](../output_files.md).
+- [Quick Usage](./quick_usage.md) - Quick setup and basic usage
+- [Model Source Configuration](./model_source.md) - Detailed configuration instructions for model sources
+- [Command Line Tools](./cli_tools.md) - Detailed parameter descriptions for command line tools
+- [Advanced Optimization Parameters](./advanced_cli_parameters.md) - Advanced parameter descriptions for command line tool adaptation
 
-> [!NOTE]
-> The command line tool will automatically attempt cuda/mps acceleration on Linux and macOS systems. 
-> Windows users who need cuda acceleration should visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to select the appropriate command for their cuda version to install acceleration-enabled `torch` and `torchvision`.
+## Getting Started
 
+We recommend reading the documentation in the order listed above, which will help you better understand and use the project features.
 
-```bash
-# Or specify vlm backend for parsing
-mineru -p <input_path> -o <output_path> -b vlm-transformers
-```
-> [!TIP]
-> The vlm backend additionally supports `sglang` acceleration. Compared to the `transformers` backend, `sglang` can achieve 20-30x speedup. You can check the installation method for the complete package supporting `sglang` acceleration in the [Extension Modules Installation Guide](../quick_start/extension_modules.md).
-
-If you need to adjust parsing options through custom parameters, you can also check the more detailed [Command Line Tools Usage Instructions](./cli_tools.md) in the documentation.
-
----
-
-## Advanced Usage via API, WebUI, sglang-client/server
-
-- Direct Python API calls: [Python Usage Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
-- FastAPI calls:
-  ```bash
-  mineru-api --host 127.0.0.1 --port 8000
-  ```
-  >[!TIP]
-  >Access `http://127.0.0.1:8000/docs` in your browser to view the API documentation.
-- Start Gradio WebUI visual frontend:
-  ```bash
-  # Using pipeline/vlm-transformers/vlm-sglang-client backends
-  mineru-gradio --server-name 127.0.0.1 --server-port 7860
-  # Or using vlm-sglang-engine/pipeline backends (requires sglang environment)
-  mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
-  ```
-  >[!TIP]
-  >
-  >- Access `http://127.0.0.1:7860` in your browser to use the Gradio WebUI.
-  >- Access `http://127.0.0.1:7860/?view=api` to use the Gradio API.
-- Using `sglang-client/server` method:
-  ```bash
-  # Start sglang server (requires sglang environment)
-  mineru-sglang-server --port 30000
-  ``` 
-  >[!TIP]
-  >In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
-  > ```bash
-  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
-  > ```
-
-> [!TIP]
-> All officially supported sglang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`.
-> We have compiled some commonly used parameters and usage methods for `sglang`, which can be found in the documentation [Advanced Command Line Parameters](./advanced_cli_parameters.md).
-
----
-
-## Extending MinerU Functionality with Configuration Files
-
-MinerU is now ready to use out of the box, but also supports extending functionality through configuration files. You can edit `mineru.json` file in your user directory to add custom configurations.  
-
->[!TIP]
->The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`, or you can create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.  
-
-Here are some available configuration options:  
-
-- `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to `$` symbol, can be modified to other symbols or strings as needed.
-- `llm-aided-config`: Used to configure parameters for LLM-assisted title hierarchy, compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
-- `models-dir`: Used to specify local model storage directory, please specify model directories for `pipeline` and `vlm` backends separately. After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
-
+If you encounter issues during usage, please check the [FAQ](../faq/index.md)

+ 1 - 1
docs/en/usage/model_source.md

@@ -36,7 +36,7 @@ or use the interactive command line tool to select model downloads:
 ```bash
 mineru-models-download
 ```
->[!TIP]
+> [!NOTE]
 >- After download completion, the model path will be output in the current terminal window and automatically written to `mineru.json` in the user directory.
 >- You can also create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.
 >- After downloading models locally, you can freely move the model folder to other locations while updating the model path in `mineru.json`.

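Putting the pieces together, switching sources and pre-fetching models non-interactively might look like the sketch below; the `-s`/`-m` flags are taken from the Dockerfiles in this same commit:

```bash
# Use modelscope instead of the default huggingface source
export MINERU_MODEL_SOURCE=modelscope

# Download all models from the chosen source; per the note above, the
# resulting model paths are written to mineru.json in the user directory
mineru-models-download -s modelscope -m all
```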
+ 83 - 0
docs/en/usage/quick_usage.md

@@ -0,0 +1,83 @@
+# Using MinerU
+
+## Quick Model Source Configuration
+MinerU uses `huggingface` as the default model source. If users cannot access `huggingface` due to network restrictions, they can conveniently switch the model source to `modelscope` through environment variables:
+```bash
+export MINERU_MODEL_SOURCE=modelscope
+```
+For more information about model source configuration and custom local model paths, please refer to the [Model Source Documentation](./model_source.md) in the documentation.
+
+## Quick Usage via Command Line
+MinerU has built-in command line tools that allow users to quickly use MinerU for PDF parsing through the command line:
+```bash
+# Default parsing using pipeline backend
+mineru -p <input_path> -o <output_path>
+```
+> [!TIP]
+>- `<input_path>`: Local PDF/image file or directory
+>- `<output_path>`: Output directory
+>
+> For more information about output files, please refer to [Output File Documentation](../reference/output_files.md).
+
+> [!NOTE]
+> The command line tool will automatically attempt cuda/mps acceleration on Linux and macOS systems. 
+> Windows users who need cuda acceleration should visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to select the appropriate command for their cuda version to install acceleration-enabled `torch` and `torchvision`.
+
+
+```bash
+# Or specify vlm backend for parsing
+mineru -p <input_path> -o <output_path> -b vlm-transformers
+```
+> [!TIP]
+> The vlm backend additionally supports `sglang` acceleration. Compared to the `transformers` backend, `sglang` can achieve 20-30x speedup. You can check the installation method for the complete package supporting `sglang` acceleration in the [Extension Modules Installation Guide](../quick_start/extension_modules.md).
+
+If you need to adjust parsing options through custom parameters, you can also check the more detailed [Command Line Tools Usage Instructions](./cli_tools.md) in the documentation.
+
+## Advanced Usage via API, WebUI, sglang-client/server
+
+- Direct Python API calls: [Python Usage Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
+- FastAPI calls:
+  ```bash
+  mineru-api --host 0.0.0.0 --port 8000
+  ```
+  >[!TIP]
+  >Access `http://127.0.0.1:8000/docs` in your browser to view the API documentation.
+- Start Gradio WebUI visual frontend:
+  ```bash
+  # Using pipeline/vlm-transformers/vlm-sglang-client backends
+  mineru-gradio --server-name 0.0.0.0 --server-port 7860
+  # Or using vlm-sglang-engine/pipeline backends (requires sglang environment)
+  mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-sglang-engine true
+  ```
+  >[!TIP]
+  >
+  >- Access `http://127.0.0.1:7860` in your browser to use the Gradio WebUI.
+  >- Access `http://127.0.0.1:7860/?view=api` to use the Gradio API.
+- Using `sglang-client/server` method:
+  ```bash
+  # Start sglang server (requires sglang environment)
+  mineru-sglang-server --port 30000
+  ``` 
+  >[!TIP]
+  >In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
+  > ```bash
+  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
+  > ```
+
+> [!NOTE]
+> All officially supported sglang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`.
+> We have compiled some commonly used parameters and usage methods for `sglang`, which can be found in the documentation [Advanced Command Line Parameters](./advanced_cli_parameters.md).
+
+## Extending MinerU Functionality with Configuration Files
+
+MinerU is now ready to use out of the box, but also supports extending functionality through configuration files. You can edit `mineru.json` file in your user directory to add custom configurations.  
+
+>[!IMPORTANT]
+>The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`, or you can create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.  
+
+Here are some available configuration options:  
+
+- `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to `$` symbol, can be modified to other symbols or strings as needed.
+- `llm-aided-config`: Used to configure parameters for LLM-assisted title hierarchy, compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
+- `models-dir`: Used to specify local model storage directory, please specify model directories for `pipeline` and `vlm` backends separately. After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
+

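Since the commands above forward all sglang parameters, a parse run can tune the engine inline. A hedged sketch that combines the `vlm-sglang-engine` backend named on this page with the `--mem-fraction-static` flag from the advanced parameters page (assuming a low-VRAM GPU; treat the combination as illustrative):

```bash
# Parse with the local sglang engine while lowering the KV cache size
# (--mem-fraction-static is an sglang flag forwarded by mineru)
mineru -p <input_path> -o <output_path> -b vlm-sglang-engine --mem-fraction-static 0.5
```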
+ 5 - 11
docs/zh/quick_start/docker_deployment.md

@@ -13,8 +13,6 @@ docker build -t mineru-sglang:latest -f Dockerfile .
 > [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/china/Dockerfile)默认使用`lmsysorg/sglang:v0.4.8.post1-cu126`作为基础镜像,支持Turing/Ampere/Ada Lovelace/Hopper平台,
 > 如您使用较新的`Blackwell`平台,请将基础镜像修改为`lmsysorg/sglang:v0.4.8.post1-cu128-b200` 再执行build操作。
 
----
-
 ## Docker说明

 Mineru的docker使用了`lmsysorg/sglang`作为基础镜像,因此在docker中默认集成了`sglang`推理加速框架和必需的依赖环境。因此在满足条件的设备上,您可以直接使用`sglang`加速VLM模型推理。
@@ -27,9 +25,7 @@ Mineru的docker使用了`lmsysorg/sglang`作为基础镜像,因此在docker中
 >
 > 如果您的设备不满足上述条件,您仍然可以使用MinerU的其他功能,但无法使用`sglang`加速VLM模型推理,即无法使用`vlm-sglang-engine`后端和启动`vlm-sglang-server`服务。
 
----
-
-## 启动 Docker 容器:
+## 启动 Docker 容器
 
 ```bash
 docker run --gpus all \
@@ -41,9 +37,7 @@ docker run --gpus all \
 ```

 执行该命令后,您将进入到Docker容器的交互式终端,并映射了一些端口用于可能会使用的服务,您可以直接在容器内运行MinerU相关命令来使用MinerU的功能。
-您也可以直接通过替换`/bin/bash`为服务启动命令来启动MinerU服务,详细说明请参考[MinerU使用文档](../usage/index.md)。
-
----
+您也可以直接通过替换`/bin/bash`为服务启动命令来启动MinerU服务,详细说明请参考[通过命令启动服务](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#apiwebuisglang-clientserver)。
 
 ## 通过 Docker Compose 直接启动服务
 
@@ -64,7 +58,7 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 ### 启动 sglang-server 服务
 并通过`vlm-sglang-client`后端连接`sglang-server`
   ```bash
-  docker compose -f compose.yaml --profile mineru-sglang-server up -d
+  docker compose -f compose.yaml --profile sglang-server up -d
   ```
   >[!TIP]
   >在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
@@ -76,7 +70,7 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 
 ### 启动 Web API 服务
   ```bash
-  docker compose -f compose.yaml --profile mineru-api up -d
+  docker compose -f compose.yaml --profile api up -d
   ```
   >[!TIP]
   >在浏览器中访问 `http://<server_ip>:8000/docs` 查看API文档。
@@ -85,7 +79,7 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 
 ### 启动 Gradio WebUI 服务
   ```bash
-  docker compose -f compose.yaml --profile mineru-gradio up -d
+  docker compose -f compose.yaml --profile gradio up -d
   ```
   >[!TIP]
   >

+ 6 - 1
docs/zh/quick_start/index.md

@@ -1,6 +1,6 @@
 # 快速开始
 
-如果遇到任何安装问题,请先查询 [FAQ](../FAQ/index.md) 
+如果遇到任何安装问题,请先查询 [FAQ](../faq/index.md) 
 
 ## 在线体验
 
@@ -93,4 +93,9 @@ MinerU提供了便捷的docker部署方式,这有助于快速搭建环境并
 
 ### 使用 MinerU
 
+最简单的命令行调用方式:
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
 您可以通过命令行、API、WebUI等多种方式使用MinerU进行PDF解析,具体使用方法请参考[使用指南](../usage/index.md)。

+ 1 - 13
docs/zh/usage/advanced_cli_parameters.md

@@ -1,6 +1,4 @@
-# 命令行参数进阶技巧
-
----
+# 命令行参数进阶
 
 ## SGLang 加速参数优化
 
@@ -11,8 +9,6 @@
 > - 如果您使用单张显卡遇到显存不足的情况时,可能需要调低KV缓存大小,`--mem-fraction-static 0.5`,如仍出现显存不足问题,可尝试进一步降低到`0.4`或更低
 > - 如您有两张以上显卡,可尝试通过张量并行(TP)模式简单扩充可用显存:`--tp-size 2`
 
----
-
 ### 性能优化参数
 > [!TIP]
 > 如果您已经可以正常使用sglang对vlm模型进行加速推理,但仍然希望进一步提升推理速度,可以尝试以下参数:
@@ -20,15 +16,11 @@
 > - 如果您有超过多张显卡,可以使用sglang的多卡并行模式来增加吞吐量:`--dp-size 2`
 > - 同时您可以启用`torch.compile`来将推理速度加速约15%:`--enable-torch-compile`
 
----
-
 ### 参数传递说明
 > [!TIP]
 > - 所有sglang官方支持的参数都可用通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`
 > - 如果您想了解更多有关`sglang`的参数使用方法,请参考 [sglang官方文档](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
 
----
-
 ## GPU 设备选择与配置

 ### CUDA_VISIBLE_DEVICES 基本用法
@@ -39,8 +31,6 @@
 >   ```
 > - 这种指定方式对所有的命令行调用都有效,包括 `mineru`、`mineru-sglang-server`、`mineru-gradio` 和 `mineru-api`,且对`pipeline`、`vlm`后端均适用。
 
----
-
 ### 常见设备配置示例
 > [!TIP]
 > 以下是一些常见的 `CUDA_VISIBLE_DEVICES` 设置示例:
@@ -52,8 +42,6 @@
 >   CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
 >   ```
 
----
-
 ## 实际应用场景

 > [!TIP]

+ 15 - 22
docs/zh/usage/cli_tools.md

@@ -31,33 +31,28 @@ mineru-api --help
 Usage: mineru-api [OPTIONS]

 Options:
-  --host TEXT     Server host (default: 127.0.0.1)
-  --port INTEGER  Server port (default: 8000)
-  --reload        Enable auto-reload (development mode)
-  --help          Show this message and exit.
+  --host TEXT     服务器主机地址(默认:127.0.0.1)
+  --port INTEGER  服务器端口(默认:8000)
+  --reload        启用自动重载(开发模式)
+  --help          显示此帮助信息并退出
 ```
 ```bash
 mineru-gradio --help
 Usage: mineru-gradio [OPTIONS]

 Options:
-  --enable-example BOOLEAN        Enable example files for input.The example
-                                  files to be input need to be placed in the
-                                  `example` folder within the directory where
-                                  the command is currently executed.
-  --enable-sglang-engine BOOLEAN  Enable SgLang engine backend for faster
-                                  processing.
-  --enable-api BOOLEAN            Enable gradio API for serving the
-                                  application.
-  --max-convert-pages INTEGER     Set the maximum number of pages to convert
-                                  from PDF to Markdown.
-  --server-name TEXT              Set the server name for the Gradio app.
-  --server-port INTEGER           Set the server port for the Gradio app.
+  --enable-example BOOLEAN        启用示例文件输入(需要将示例文件放置在当前
+                                  执行命令目录下的 `example` 文件夹中)
+  --enable-sglang-engine BOOLEAN  启用 SgLang 引擎后端以提高处理速度
+  --enable-api BOOLEAN            启用 Gradio API 以提供应用程序服务
+  --max-convert-pages INTEGER     设置从 PDF 转换为 Markdown 的最大页数
+  --server-name TEXT              设置 Gradio 应用程序的服务器主机名
+  --server-port INTEGER           设置 Gradio 应用程序的服务器端口
   --latex-delimiters-type [a|b|all]
-                                  Set the type of LaTeX delimiters to use in
-                                  Markdown rendering:'a' for type '$', 'b' for
-                                  type '()[]', 'all' for both types.
-  --help                          Show this message and exit.
+                                  设置在 Markdown 渲染中使用的 LaTeX 分隔符类型
+                                  ('a' 表示 '$' 类型,'b' 表示 '()[]' 类型,
+                                  'all' 表示两种类型都使用)
+  --help                          显示此帮助信息并退出
 ```

 ## 环境变量说明
@@ -71,5 +66,3 @@ MinerU命令行工具的某些参数存在相同功能的环境变量配置,
 - `MINERU_TOOLS_CONFIG_JSON`:用于指定配置文件路径,默认为用户目录下的`mineru.json`,可通过环境变量指定其他配置文件路径。
 - `MINERU_FORMULA_ENABLE`:用于启用公式解析,默认为`true`,可通过环境变量设置为`false`来禁用公式解析。
 - `MINERU_TABLE_ENABLE`:用于启用表格解析,默认为`true`,可通过环境变量设置为`false`来禁用表格解析。
-
-

+ 10 - 82
docs/zh/usage/index.md

@@ -1,88 +1,16 @@
-# 使用 MinerU
+# 使用指南
 
-## 快速配置模型源
-MinerU默认使用`huggingface`作为模型源,若用户网络无法访问`huggingface`,可以通过环境变量便捷地切换模型源为`modelscope`:
-```bash
-export MINERU_MODEL_SOURCE=modelscope
-```
-有关模型源配置和自定义本地模型路径的更多信息,请参考文档中的[模型源说明](./model_source.md)。
+本章节提供了项目的完整使用说明。我们将通过以下几个部分,帮助您从基础到进阶逐步掌握项目的使用方法:
 
----
+## 目录
 
-## 通过命令行快速使用
-MinerU内置了命令行工具,用户可以通过命令行快速使用MinerU进行PDF解析:
-```bash
-# 默认使用pipeline后端解析
-mineru -p <input_path> -o <output_path>
-```
-> [!TIP]
-> - `<input_path>`:本地 PDF/图片 文件或目录
-> - `<output_path>`:输出目录
-> 
-> 更多关于输出文件的信息,请参考[输出文件说明](../output_files.md)。
+- [快速使用](./quick_usage.md) - 快速上手和基本使用
+- [模型源配置](./model_source.md) - 模型源的详细配置说明  
+- [命令行工具](./cli_tools.md) - 命令行工具的详细参数说明
+- [进阶优化参数](./advanced_cli_parameters.md) - 一些适配命令行工具的进阶参数说明
 
-> [!NOTE]
-> 命令行工具会在Linux和macOS系统自动尝试cuda/mps加速。Windows用户如需使用cuda加速,
-> 请前往 [Pytorch官网](https://pytorch.org/get-started/locally/) 选择适合自己cuda版本的命令安装支持加速的`torch`和`torchvision`。
+## 开始使用
 
+建议按照上述顺序阅读文档,这样可以帮助您更好地理解和使用项目功能。
 
-```bash
-# 或指定vlm后端解析
-mineru -p <input_path> -o <output_path> -b vlm-transformers
-```
-> [!TIP]
-> vlm后端另外支持`sglang`加速,与`transformers`后端相比,`sglang`的加速比可达20~30倍,可以在[扩展模块安装指南](../quick_start/extension_modules.md)中查看支持`sglang`加速的完整包安装方法。
-
-如果需要通过自定义参数调整解析选项,您也可以在文档中查看更详细的[命令行工具使用说明](./cli_tools.md)。
-
----
-
-## 通过api、webui、sglang-client/server进阶使用
-
-- 通过python api直接调用:[Python 调用示例](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
-- 通过fast api方式调用:
-  ```bash
-  mineru-api --host 127.0.0.1 --port 8000
-  ```
-  >[!TIP]
-  >在浏览器中访问 `http://127.0.0.1:8000/docs` 查看API文档。
-- 启动gradio webui 可视化前端:
-  ```bash
-  # 使用 pipeline/vlm-transformers/vlm-sglang-client 后端
-  mineru-gradio --server-name 127.0.0.1 --server-port 7860
-  # 或使用 vlm-sglang-engine/pipeline 后端(需安装sglang环境)
-  mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
-  ```
-  >[!TIP]
-  > 
-  >- 在浏览器中访问 `http://127.0.0.1:7860` 使用 Gradio WebUI。
-  >- 访问 `http://127.0.0.1:7860/?view=api` 使用 Gradio API。
-- 使用`sglang-client/server`方式调用:
-  ```bash
-  # 启动sglang server(需要安装sglang环境)
-  mineru-sglang-server --port 30000
-  ``` 
-  >[!TIP]
-  >在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
-  > ```bash
-  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
-  > ```
-
-> [!TIP]
-> 所有sglang官方支持的参数都可用通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`,
-> 我们整理了一些`sglang`使用中的常用参数和使用方法,可以在文档[命令行进阶参数](./advanced_cli_parameters.md)中获取。
-
----
-
-## 基于配置文件扩展 MinerU 功能
-
-MinerU 现已实现开箱即用,但也支持通过配置文件扩展功能。您可通过编辑用户目录下的 `mineru.json` 文件,添加自定义配置。
-
->[!TIP]
->`mineru.json` 文件会在您使用内置模型下载命令 `mineru-models-download` 时自动生成,也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建。  
-
-以下是一些可用的配置选项: 
-
-- `latex-delimiter-config`:用于配置 LaTeX 公式的分隔符,默认为`$`符号,可根据需要修改为其他符号或字符串。
-- `llm-aided-config`:用于配置 LLM 辅助标题分级的相关参数,兼容所有支持`openai协议`的 LLM 模型,默认使用`阿里云百炼`的`qwen2.5-32b-instruct`模型,您需要自行配置 API 密钥并将`enable`设置为`true`来启用此功能。
-- `models-dir`:用于指定本地模型存储目录,请为`pipeline`和`vlm`后端分别指定模型目录,指定目录后您可通过配置环境变量`export MINERU_MODEL_SOURCE=local`来使用本地模型。
+如果您在使用过程中遇到问题,请查看 [FAQ](../faq/index.md)

+ 1 - 1
docs/zh/usage/model_source.md

@@ -37,7 +37,7 @@ mineru-models-download --help
 ```bash
 mineru-models-download
 ```
->[!TIP]
+> [!NOTE]
 >- 下载完成后,模型路径会在当前终端窗口输出,并自动写入用户目录下的 `mineru.json`。
 >- 您也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建配置文件。
 >- 模型下载到本地后,您可以自由移动模型文件夹到其他位置,同时需要在 `mineru.json` 中更新模型路径。

+ 81 - 0
docs/zh/usage/quick_usage.md

@@ -0,0 +1,81 @@
+# 使用 MinerU
+
+## 快速配置模型源
+MinerU默认使用`huggingface`作为模型源,若用户网络无法访问`huggingface`,可以通过环境变量便捷地切换模型源为`modelscope`:
+```bash
+export MINERU_MODEL_SOURCE=modelscope
+```
+有关模型源配置和自定义本地模型路径的更多信息,请参考文档中的[模型源说明](./model_source.md)。
+
+## 通过命令行快速使用
+MinerU内置了命令行工具,用户可以通过命令行快速使用MinerU进行PDF解析:
+```bash
+# 默认使用pipeline后端解析
+mineru -p <input_path> -o <output_path>
+```
+> [!TIP]
+> - `<input_path>`:本地 PDF/图片 文件或目录
+> - `<output_path>`:输出目录
+> 
+> 更多关于输出文件的信息,请参考[输出文件说明](../reference/output_files.md)。
+
+> [!NOTE]
+> 命令行工具会在Linux和macOS系统自动尝试cuda/mps加速。Windows用户如需使用cuda加速,
+> 请前往 [Pytorch官网](https://pytorch.org/get-started/locally/) 选择适合自己cuda版本的命令安装支持加速的`torch`和`torchvision`。
+
+```bash
+# 或指定vlm后端解析
+mineru -p <input_path> -o <output_path> -b vlm-transformers
+```
+> [!TIP]
+> vlm后端另外支持`sglang`加速,与`transformers`后端相比,`sglang`的加速比可达20~30倍,可以在[扩展模块安装指南](../quick_start/extension_modules.md)中查看支持`sglang`加速的完整包安装方法。
+
+如果需要通过自定义参数调整解析选项,您也可以在文档中查看更详细的[命令行工具使用说明](./cli_tools.md)。
+
+## 通过api、webui、sglang-client/server进阶使用
+
+- 通过python api直接调用:[Python 调用示例](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
+- 通过fast api方式调用:
+  ```bash
+  mineru-api --host 0.0.0.0 --port 8000
+  ```
+  >[!TIP]
+  >在浏览器中访问 `http://127.0.0.1:8000/docs` 查看API文档。
+- 启动gradio webui 可视化前端:
+  ```bash
+  # 使用 pipeline/vlm-transformers/vlm-sglang-client 后端
+  mineru-gradio --server-name 0.0.0.0 --server-port 7860
+  # 或使用 vlm-sglang-engine/pipeline 后端(需安装sglang环境)
+  mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-sglang-engine true
+  ```
+  >[!TIP]
+  > 
+  >- 在浏览器中访问 `http://127.0.0.1:7860` 使用 Gradio WebUI。
+  >- 访问 `http://127.0.0.1:7860/?view=api` 使用 Gradio API。
+- 使用`sglang-client/server`方式调用:
+  ```bash
+  # 启动sglang server(需要安装sglang环境)
+  mineru-sglang-server --port 30000
+  ``` 
+  >[!TIP]
+  >在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
+  > ```bash
+  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
+  > ```
+
+> [!NOTE]
+> 所有sglang官方支持的参数都可用通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`,
+> 我们整理了一些`sglang`使用中的常用参数和使用方法,可以在文档[命令行进阶参数](./advanced_cli_parameters.md)中获取。
+
+## 基于配置文件扩展 MinerU 功能
+
+MinerU 现已实现开箱即用,但也支持通过配置文件扩展功能。您可通过编辑用户目录下的 `mineru.json` 文件,添加自定义配置。
+
+>[!IMPORTANT]
+>`mineru.json` 文件会在您使用内置模型下载命令 `mineru-models-download` 时自动生成,也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建。  
+
+以下是一些可用的配置选项: 
+
+- `latex-delimiter-config`:用于配置 LaTeX 公式的分隔符,默认为`$`符号,可根据需要修改为其他符号或字符串。
+- `llm-aided-config`:用于配置 LLM 辅助标题分级的相关参数,兼容所有支持`openai协议`的 LLM 模型,默认使用`阿里云百炼`的`qwen2.5-32b-instruct`模型,您需要自行配置 API 密钥并将`enable`设置为`true`来启用此功能。
+- `models-dir`:用于指定本地模型存储目录,请为`pipeline`和`vlm`后端分别指定模型目录,指定目录后您可通过配置环境变量`export MINERU_MODEL_SOURCE=local`来使用本地模型。

+ 1 - 1
mineru/version.py

@@ -1 +1 @@
-__version__ = "2.1.0"
+__version__ = "2.1.1"

+ 5 - 3
mkdocs.yml

@@ -56,12 +56,12 @@ extra:
       name: GitHub
     - icon: fontawesome/brands/x-twitter
       link: https://x.com/OpenDataLab_AI
-      name: Twitter
+      name: X-Twitter
     - icon: fontawesome/brands/discord
       link: https://discord.gg/Tdedn9GTXq
       name: Discord
     - icon: fontawesome/brands/weixin
-      link: https://mineru.space/common/qun/?qid=362634
+      link: http://mineru.space/s/V85Yl
       name: WeChat
     - icon: material/email
       link: mailto:OpenDataLab@pjlab.org.cn
@@ -78,8 +78,9 @@ nav:
       - Docker Deployment: quick_start/docker_deployment.md
     - Usage:
       - Usage: usage/index.md
-      - CLI Tools: usage/cli_tools.md
+      - Quick Usage: usage/quick_usage.md
       - Model Source: usage/model_source.md
+      - CLI Tools: usage/cli_tools.md
       - Advanced CLI Parameters: usage/advanced_cli_parameters.md
     - Reference:
       - Output File Format: reference/output_files.md
@@ -117,6 +118,7 @@ plugins:
             Extension Modules: 扩展模块安装
             Docker Deployment: Docker部署
             Usage: 使用方法
+            Quick Usage: 快速使用
             CLI Tools: 命令行工具
             Model Source: 模型源
             Advanced CLI Parameters: 命令行进阶参数

+ 8 - 0
signatures/version1/cla.json

@@ -383,6 +383,14 @@
       "created_at": "2025-06-30T05:44:13Z",
       "created_at": "2025-06-30T05:44:13Z",
       "repoId": 765083837,
       "repoId": 765083837,
       "pullRequestNo": 2831
       "pullRequestNo": 2831
+    },
+    {
+      "name": "Tuyohai",
+      "id": 98230804,
+      "comment_id": 3077606100,
+      "created_at": "2025-07-16T08:53:24Z",
+      "repoId": 765083837,
+      "pullRequestNo": 3070
     }
   ]
 }