Merge remote-tracking branch 'origin/dev' into dev

Sidney233 4 months ago
Parent
Current commit
2d23d70e7b

+ 16 - 3
README.md

@@ -44,6 +44,14 @@
 
 # Changelog
 
+- 2025/07/16 2.1.1 Released
+  - Bug fixes
+    - Fixed text block content loss issue that could occur in certain `pipeline` scenarios #3005
+    - Fixed issue where `sglang-client` required unnecessary packages like `torch` #2968
+    - Updated `dockerfile` to fix incomplete text content parsing due to missing fonts in Linux #2915
+  - Usability improvements
+    - Updated `compose.yaml` to facilitate direct startup of `sglang-server`, `mineru-api`, and `mineru-gradio` services
+    - Launched brand new [online documentation site](https://opendatalab.github.io/MinerU/), simplified readme, providing better documentation experience
 - 2025/07/05 Version 2.1.0 Released
   - This is the first major update of MinerU 2, which includes a large number of new features and improvements, covering significant performance optimizations, user experience enhancements, and bug fixes. The detailed update contents are as follows:
   - **Performance Optimizations:**
@@ -51,10 +59,10 @@
     - Greatly enhanced post-processing speed when the `pipeline` backend handles batch processing of documents with fewer pages (<10 pages).
     - Layout analysis speed of the `pipeline` backend has been increased by approximately 20%.
   - **Experience Enhancements:**
-    - Built-in ready-to-use `fastapi service` and `gradio webui`. For detailed usage instructions, please refer to [Documentation](#3-api-calls-or-visual-invocation).
+    - Built-in ready-to-use `fastapi service` and `gradio webui`. For detailed usage instructions, please refer to [Documentation](https://opendatalab.github.io/MinerU/usage/quick_usage/#advanced-usage-via-api-webui-sglang-clientserver).
     - Adapted to `sglang` version `0.4.8`, significantly reducing the GPU memory requirements for the `vlm-sglang` backend. It can now run on graphics cards with as little as `8GB GPU memory` (Turing architecture or newer).
     - Added transparent parameter passing for all commands related to `sglang`, allowing the `sglang-engine` backend to receive all `sglang` parameters consistently with the `sglang-server`.
-    - Supports feature extensions based on configuration files, including `custom formula delimiters`, `enabling heading classification`, and `customizing local model directories`. For detailed usage instructions, please refer to [Documentation](#4-extending-mineru-functionality-through-configuration-files).
+    - Supports feature extensions based on configuration files, including `custom formula delimiters`, `enabling heading classification`, and `customizing local model directories`. For detailed usage instructions, please refer to [Documentation](https://opendatalab.github.io/MinerU/usage/quick_usage/#extending-mineru-functionality-with-configuration-files).
   - **New Features:**
     - Updated the `pipeline` backend with the PP-OCRv5 multilingual text recognition model, supporting text recognition in 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. [Details](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
     - Introduced limited support for vertical text layout in the `pipeline` backend.
@@ -517,6 +525,11 @@ You can get the [Docker Deployment Instructions](https://opendatalab.github.io/M
 
 ### Using MinerU
 
+The simplest command line invocation is:
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
 You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the [Usage Guide](https://opendatalab.github.io/MinerU/usage/).
 
 # TODO
@@ -617,4 +630,4 @@ Currently, some models in this project are trained based on YOLO. However, since
 - [PDF-Extract-Kit (A Comprehensive Toolkit for High-Quality PDF Content Extraction)](https://github.com/opendatalab/PDF-Extract-Kit)
 - [OmniDocBench (A Comprehensive Benchmark for Document Parsing and Evaluation)](https://github.com/opendatalab/OmniDocBench)
 - [Magic-HTML (Mixed web page extraction tool)](https://github.com/opendatalab/magic-html)
-- [Magic-Doc (Fast speed ppt/pptx/doc/docx/pdf extraction tool)](https://github.com/InternLM/magic-doc) 
+- [Magic-Doc (Fast speed ppt/pptx/doc/docx/pdf extraction tool)](https://github.com/InternLM/magic-doc) 

+ 17 - 3
README_zh-CN.md

@@ -43,17 +43,25 @@
 </div>

 # 更新记录
+- 2025/07/16 2.1.1发布
+  - bug修复 
+    - 修复`pipeline`在某些情况可能发生的文本块内容丢失问题 #3005
+    - 修复`sglang-client`需要安装`torch`等不必要的包的问题 #2968
+    - 更新`dockerfile`以修复linux字体缺失导致的解析文本内容不完整问题 #2915
+  - 易用性更新
+    - 更新`compose.yaml`,便于用户直接启动`sglang-server`、`mineru-api`、`mineru-gradio`服务
+    - 启用全新的[在线文档站点](https://opendatalab.github.io/MinerU/zh/),简化readme,提供更好的文档体验
 - 2025/07/05 2.1.0发布
   - 这是 MinerU 2 的第一个大版本更新,包含了大量新功能和改进,包含众多性能优化、体验优化和bug修复,具体更新内容如下:
   - 性能优化:
     - 大幅提升某些特定分辨率(长边2000像素左右)文档的预处理速度
     - 大幅提升`pipeline`后端批量处理大量页数较少(<10)文档时的后处理速度
-    - `pipline`后端的layout分析速度提升约20%
+    - `pipeline`后端的layout分析速度提升约20%
   - 体验优化:
-    - 内置开箱即用的`fastapi服务`和`gradio webui`,详细使用方法请参考[文档](#3-api-调用-或-可视化调用)
+    - 内置开箱即用的`fastapi服务`和`gradio webui`,详细使用方法请参考[文档](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#apiwebuisglang-clientserver)
     - `sglang`适配`0.4.8`版本,大幅降低`vlm-sglang`后端的显存要求,最低可在`8G显存`(Turing及以后架构)的显卡上运行
     - 对所有命令增加`sglang`的参数透传,使得`sglang-engine`后端可以与`sglang-server`一致,接收`sglang`的所有参数
-    - 支持基于配置文件的功能扩展,包含`自定义公式标识符`、`开启标题分级功能`、`自定义本地模型目录`,详细使用方法请参考[文档](#4-基于配置文件扩展-mineru-功能)
+    - 支持基于配置文件的功能扩展,包含`自定义公式标识符`、`开启标题分级功能`、`自定义本地模型目录`,详细使用方法请参考[文档](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#mineru_1)
   - 新特性:
     - `pipeline`后端更新 PP-OCRv5 多语种文本识别模型,支持法语、西班牙语、葡萄牙语、俄语、韩语等 37 种语言的文字识别,平均精度涨幅超30%。[详情](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
     - `pipeline`后端增加对竖排文本的有限支持
@@ -503,6 +511,12 @@ MinerU提供了便捷的docker部署方式,这有助于快速搭建环境并
 ---

 ### 使用 MinerU
+
+最简单的命令行调用方式:
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
 您可以通过命令行、API、WebUI等多种方式使用MinerU进行PDF解析,具体使用方法请参考[使用指南](https://opendatalab.github.io/MinerU/zh/usage/)。
 
 # TODO

+ 8 - 4
docker/china/Dockerfile

@@ -3,14 +3,18 @@ FROM lmsysorg/sglang:v0.4.8.post1-cu126
 
 # Install libgl for opencv support & Noto fonts for Chinese characters
 RUN apt-get update && \
-    apt-get install -y fonts-noto-core fonts-noto-cjk && \
-    apt-get install -y libgl1 && \
-    apt-get clean && \
+    apt-get install -y \
+        fonts-noto-core \
+        fonts-noto-cjk \
+        fontconfig \
+        libgl1 && \
     fc-cache -fv && \
+    apt-get clean && \
     rm -rf /var/lib/apt/lists/*

 # Install mineru latest
-RUN python3 -m pip install -U 'mineru[core]' -i https://mirrors.aliyun.com/pypi/simple --break-system-packages
+RUN python3 -m pip install -U 'mineru[core]' -i https://mirrors.aliyun.com/pypi/simple --break-system-packages && \
+    python3 -m pip cache purge
 
 # Download models and update the configuration file
 RUN /bin/bash -c "mineru-models-download -s modelscope -m all"

+ 9 - 5
docker/global/Dockerfile

@@ -1,16 +1,20 @@
 # Use the official sglang image
 FROM lmsysorg/sglang:v0.4.8.post1-cu126
 
-# Install libgl for opencv support
+# Install libgl for opencv support & Noto fonts for Chinese characters
 RUN apt-get update && \
-    apt-get install -y fonts-noto-core fonts-noto-cjk && \
-    apt-get install -y libgl1 && \
-    apt-get clean && \
+    apt-get install -y \
+        fonts-noto-core \
+        fonts-noto-cjk \
+        fontconfig \
+        libgl1 && \
     fc-cache -fv && \
+    apt-get clean && \
     rm -rf /var/lib/apt/lists/*

 # Install mineru latest
-RUN python3 -m pip install -U 'mineru[core]' --break-system-packages
+RUN python3 -m pip install -U 'mineru[core]' --break-system-packages && \
+    python3 -m pip cache purge
 
 # Download models and update the configuration file
 RUN /bin/bash -c "mineru-models-download -s huggingface -m all"
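Both updated Dockerfiles install `fontconfig` and run `fc-cache -fv` before cleaning the apt lists, and purge the pip cache after installing `mineru[core]` to keep the image smaller. A minimal build-and-check sketch, assuming it is run from the directory containing the Dockerfile (the tag matches the deployment docs; the `fc-list` check is illustrative, not from this commit):

```bash
# Build the image as the deployment docs describe
docker build -t mineru-sglang:latest -f Dockerfile .

# Illustrative check that the Noto fonts were registered by fc-cache
docker run --rm mineru-sglang:latest fc-list | grep -i noto
```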

+ 2 - 2
docs/en/faq/index.md

@@ -1,8 +1,8 @@
 # Frequently Asked Questions
 
-If your question is not listed, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to communicate with the AI assistant, which can solve most common problems.
+If your question is not listed, try using [DeepWiki](https://deepwiki.com/opendatalab/MinerU)'s AI assistant for common issues.
 
-If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
+For unresolved problems, join our [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) community for support.
 
 ??? question "Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2"
 

+ 5 - 11
docs/en/quick_start/docker_deployment.md

@@ -13,8 +13,6 @@ docker build -t mineru-sglang:latest -f Dockerfile .
 > The [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/global/Dockerfile) uses `lmsysorg/sglang:v0.4.8.post1-cu126` as the base image by default, supporting Turing/Ampere/Ada Lovelace/Hopper platforms.
 > If you are using the newer `Blackwell` platform, please modify the base image to `lmsysorg/sglang:v0.4.8.post1-cu128-b200` before executing the build operation.
 
----
-
 ## Docker Description

 MinerU's Docker uses `lmsysorg/sglang` as the base image, so it includes the `sglang` inference acceleration framework and necessary dependencies by default. Therefore, on compatible devices, you can directly use `sglang` to accelerate VLM model inference.
@@ -28,9 +26,7 @@ MinerU's Docker uses `lmsysorg/sglang` as the base image, so it includes the `sg
 >
 > If your device doesn't meet the above requirements, you can still use other features of MinerU, but cannot use `sglang` to accelerate VLM model inference, meaning you cannot use the `vlm-sglang-engine` backend or start the `vlm-sglang-server` service.
 
----
-
-## Start Docker Container:
+## Start Docker Container
 
 ```bash
 docker run --gpus all \
@@ -42,9 +38,7 @@ docker run --gpus all \
 ```

 After executing this command, you will enter the Docker container's interactive terminal with some ports mapped for potential services. You can directly run MinerU-related commands within the container to use MinerU's features.
-You can also directly start MinerU services by replacing `/bin/bash` with service startup commands. For detailed instructions, please refer to the [MinerU Usage Documentation](../usage/index.md).
-
----
+You can also directly start MinerU services by replacing `/bin/bash` with service startup commands. For detailed instructions, please refer to the [Start the service via command](https://opendatalab.github.io/MinerU/usage/quick_usage/#advanced-usage-via-api-webui-sglang-clientserver).
 
 ## Start Services Directly with Docker Compose
 
@@ -66,7 +60,7 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 ### Start sglang-server service
 connect to `sglang-server` via `vlm-sglang-client` backend
   ```bash
-  docker compose -f compose.yaml --profile mineru-sglang-server up -d
+  docker compose -f compose.yaml --profile sglang-server up -d
   ```
   >[!TIP]
   >In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
@@ -78,7 +72,7 @@ connect to `sglang-server` via `vlm-sglang-client` backend
 
 ### Start Web API service
   ```bash
-  docker compose -f compose.yaml --profile mineru-api up -d
+  docker compose -f compose.yaml --profile api up -d
   ```
   >[!TIP]
   >Access `http://<server_ip>:8000/docs` in your browser to view the API documentation.
@@ -87,7 +81,7 @@ connect to `sglang-server` via `vlm-sglang-client` backend
 
 ### Start Gradio WebUI service
   ```bash
-  docker compose -f compose.yaml --profile mineru-gradio up -d
+  docker compose -f compose.yaml --profile gradio up -d
   ```
   >[!TIP]
   >
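Note that the compose profile names above drop the former `mineru-` prefix. A combined sketch using only commands that appear in these docs (`<input_path>`, `<output_path>`, and `<server_ip>` are placeholders):

```bash
# Start the sglang server through the renamed compose profile
docker compose -f compose.yaml --profile sglang-server up -d

# From another machine or terminal (CPU and network only, no sglang
# environment needed), parse documents via the vlm-sglang-client backend
mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<server_ip>:30000
```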

+ 6 - 1
docs/en/quick_start/index.md

@@ -1,6 +1,6 @@
 # Quick Start
 
-If you encounter any installation issues, please check the [FAQ](../FAQ/index.md) first.
+If you encounter any installation issues, please check the [FAQ](../faq/index.md) first.
 
 ## Online Experience
 
@@ -93,4 +93,9 @@ You can get the [Docker Deployment Instructions](./docker_deployment.md) in the
 
 ### Using MinerU
 
+The simplest command line invocation is:
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
 You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the [Usage Guide](../usage/index.md).

+ 0 - 12
docs/en/usage/advanced_cli_parameters.md

@@ -1,7 +1,5 @@
 # Advanced Command Line Parameters
 
----
-
 ## SGLang Acceleration Parameter Optimization

 ### Memory Optimization Parameters
@@ -11,8 +9,6 @@
 > - If you encounter insufficient VRAM when using a single graphics card, you may need to reduce the KV cache size with `--mem-fraction-static 0.5`. If VRAM issues persist, try reducing it further to `0.4` or lower.
 > - If you have two or more graphics cards, you can try using tensor parallelism (TP) mode to simply expand available VRAM: `--tp-size 2`
 
----
-
 ### Performance Optimization Parameters
 > [!TIP]
 > If you can already use SGLang normally for accelerated VLM model inference but still want to further improve inference speed, you can try the following parameters:
@@ -20,15 +16,11 @@
 > - If you have multiple graphics cards, you can use SGLang's multi-card parallel mode to increase throughput: `--dp-size 2`
 > - You can also enable `torch.compile` to accelerate inference speed by approximately 15%: `--enable-torch-compile`
 
----
-
 ### Parameter Passing Instructions
 > [!TIP]
 > - All officially supported SGLang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
 > - If you want to learn more about `sglang` parameter usage, please refer to the [SGLang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
 
----
-
 ## GPU Device Selection and Configuration

 ### CUDA_VISIBLE_DEVICES Basic Usage
@@ -39,8 +31,6 @@
 >   ```
 > - This specification method is effective for all command line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.
 
----
-
 ### Common Device Configuration Examples
 > [!TIP]
 > Here are some common `CUDA_VISIBLE_DEVICES` setting examples:
@@ -52,8 +42,6 @@
 >   CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
 >   ```
 
----
-
 ## Practical Application Scenarios
 > [!TIP]
 > Here are some possible usage scenarios:

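The flags this page documents can be combined in one invocation. A hypothetical sketch for a two-GPU machine, assembled only from parameters described above:

```bash
# Expose the first two GPUs, shrink the KV cache for low-VRAM cards,
# and split the model across both cards with tensor parallelism
CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 \
    --mem-fraction-static 0.5 \
    --tp-size 2
```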
+ 10 - 83
docs/en/usage/index.md

@@ -1,89 +1,16 @@
-# Using MinerU
+# Usage Guide
 
-## Quick Model Source Configuration
-MinerU uses `huggingface` as the default model source. If users cannot access `huggingface` due to network restrictions, they can conveniently switch the model source to `modelscope` through environment variables:
-```bash
-export MINERU_MODEL_SOURCE=modelscope
-```
-For more information about model source configuration and custom local model paths, please refer to the [Model Source Documentation](./model_source.md) in the documentation.
+This section provides comprehensive usage instructions for the project. We will help you progressively master the project's usage from basic to advanced through the following sections:
 
----
+## Table of Contents
 
-## Quick Usage via Command Line
-MinerU has built-in command line tools that allow users to quickly use MinerU for PDF parsing through the command line:
-```bash
-# Default parsing using pipeline backend
-mineru -p <input_path> -o <output_path>
-```
-> [!TIP]
->- `<input_path>`: Local PDF/image file or directory
->- `<output_path>`: Output directory
->
-> For more information about output files, please refer to [Output File Documentation](../output_files.md).
+- [Quick Usage](./quick_usage.md) - Quick setup and basic usage
+- [Model Source Configuration](./model_source.md) - Detailed configuration instructions for model sources
+- [Command Line Tools](./cli_tools.md) - Detailed parameter descriptions for command line tools
+- [Advanced Optimization Parameters](./advanced_cli_parameters.md) - Advanced parameter descriptions for command line tool adaptation
 
-> [!NOTE]
-> The command line tool will automatically attempt cuda/mps acceleration on Linux and macOS systems. 
-> Windows users who need cuda acceleration should visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to select the appropriate command for their cuda version to install acceleration-enabled `torch` and `torchvision`.
+## Getting Started
 
+We recommend reading the documentation in the order listed above, which will help you better understand and use the project features.
 
-```bash
-# Or specify vlm backend for parsing
-mineru -p <input_path> -o <output_path> -b vlm-transformers
-```
-> [!TIP]
-> The vlm backend additionally supports `sglang` acceleration. Compared to the `transformers` backend, `sglang` can achieve 20-30x speedup. You can check the installation method for the complete package supporting `sglang` acceleration in the [Extension Modules Installation Guide](../quick_start/extension_modules.md).
-
-If you need to adjust parsing options through custom parameters, you can also check the more detailed [Command Line Tools Usage Instructions](./cli_tools.md) in the documentation.
-
----
-
-## Advanced Usage via API, WebUI, sglang-client/server
-
-- Direct Python API calls: [Python Usage Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
-- FastAPI calls:
-  ```bash
-  mineru-api --host 127.0.0.1 --port 8000
-  ```
-  >[!TIP]
-  >Access `http://127.0.0.1:8000/docs` in your browser to view the API documentation.
-- Start Gradio WebUI visual frontend:
-  ```bash
-  # Using pipeline/vlm-transformers/vlm-sglang-client backends
-  mineru-gradio --server-name 127.0.0.1 --server-port 7860
-  # Or using vlm-sglang-engine/pipeline backends (requires sglang environment)
-  mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
-  ```
-  >[!TIP]
-  >
-  >- Access `http://127.0.0.1:7860` in your browser to use the Gradio WebUI.
-  >- Access `http://127.0.0.1:7860/?view=api` to use the Gradio API.
-- Using `sglang-client/server` method:
-  ```bash
-  # Start sglang server (requires sglang environment)
-  mineru-sglang-server --port 30000
-  ``` 
-  >[!TIP]
-  >In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
-  > ```bash
-  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
-  > ```
-
-> [!TIP]
-> All officially supported sglang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`.
-> We have compiled some commonly used parameters and usage methods for `sglang`, which can be found in the documentation [Advanced Command Line Parameters](./advanced_cli_parameters.md).
-
----
-
-## Extending MinerU Functionality with Configuration Files
-
-MinerU is now ready to use out of the box, but also supports extending functionality through configuration files. You can edit `mineru.json` file in your user directory to add custom configurations.  
-
->[!TIP]
->The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`, or you can create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.  
-
-Here are some available configuration options:  
-
-- `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to `$` symbol, can be modified to other symbols or strings as needed.
-- `llm-aided-config`: Used to configure parameters for LLM-assisted title hierarchy, compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
-- `models-dir`: Used to specify local model storage directory, please specify model directories for `pipeline` and `vlm` backends separately. After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
-
+If you encounter issues during usage, please check the [FAQ](../faq/index.md)

+ 1 - 1
docs/en/usage/model_source.md

@@ -36,7 +36,7 @@ or use the interactive command line tool to select model downloads:
 ```bash
 mineru-models-download
 ```
->[!TIP]
+> [!NOTE]
 >- After download completion, the model path will be output in the current terminal window and automatically written to `mineru.json` in the user directory.
 >- You can also create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.
 >- After downloading models locally, you can freely move the model folder to other locations while updating the model path in `mineru.json`.

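Putting the pieces together, switching sources and pre-fetching models non-interactively might look like the sketch below; the `-s`/`-m` flags are taken from the Dockerfiles in this same commit:

```bash
# Use modelscope instead of the default huggingface source
export MINERU_MODEL_SOURCE=modelscope

# Download all models from the chosen source; per the note above, the
# resulting model paths are written to mineru.json in the user directory
mineru-models-download -s modelscope -m all
```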
+ 83 - 0
docs/en/usage/quick_usage.md

@@ -0,0 +1,83 @@
+# Using MinerU
+
+## Quick Model Source Configuration
+MinerU uses `huggingface` as the default model source. If users cannot access `huggingface` due to network restrictions, they can conveniently switch the model source to `modelscope` through environment variables:
+```bash
+export MINERU_MODEL_SOURCE=modelscope
+```
+For more information about model source configuration and custom local model paths, please refer to the [Model Source Documentation](./model_source.md) in the documentation.
+
+## Quick Usage via Command Line
+MinerU has built-in command line tools that allow users to quickly use MinerU for PDF parsing through the command line:
+```bash
+# Default parsing using pipeline backend
+mineru -p <input_path> -o <output_path>
+```
+> [!TIP]
+>- `<input_path>`: Local PDF/image file or directory
+>- `<output_path>`: Output directory
+>
+> For more information about output files, please refer to [Output File Documentation](../reference/output_files.md).
+
+> [!NOTE]
+> The command line tool will automatically attempt cuda/mps acceleration on Linux and macOS systems. 
+> Windows users who need cuda acceleration should visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to select the appropriate command for their cuda version to install acceleration-enabled `torch` and `torchvision`.
+
+
+```bash
+# Or specify vlm backend for parsing
+mineru -p <input_path> -o <output_path> -b vlm-transformers
+```
+> [!TIP]
+> The vlm backend additionally supports `sglang` acceleration. Compared to the `transformers` backend, `sglang` can achieve 20-30x speedup. You can check the installation method for the complete package supporting `sglang` acceleration in the [Extension Modules Installation Guide](../quick_start/extension_modules.md).
+
+If you need to adjust parsing options through custom parameters, you can also check the more detailed [Command Line Tools Usage Instructions](./cli_tools.md) in the documentation.
+
+## Advanced Usage via API, WebUI, sglang-client/server
+
+- Direct Python API calls: [Python Usage Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
+- FastAPI calls:
+  ```bash
+  mineru-api --host 0.0.0.0 --port 8000
+  ```
+  >[!TIP]
+  >Access `http://127.0.0.1:8000/docs` in your browser to view the API documentation.
+- Start Gradio WebUI visual frontend:
+  ```bash
+  # Using pipeline/vlm-transformers/vlm-sglang-client backends
+  mineru-gradio --server-name 0.0.0.0 --server-port 7860
+  # Or using vlm-sglang-engine/pipeline backends (requires sglang environment)
+  mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-sglang-engine true
+  ```
+  >[!TIP]
+  >
+  >- Access `http://127.0.0.1:7860` in your browser to use the Gradio WebUI.
+  >- Access `http://127.0.0.1:7860/?view=api` to use the Gradio API.
+- Using `sglang-client/server` method:
+  ```bash
+  # Start sglang server (requires sglang environment)
+  mineru-sglang-server --port 30000
+  ``` 
+  >[!TIP]
+  >In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
+  > ```bash
+  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
+  > ```
+
+> [!NOTE]
+> All officially supported sglang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`.
+> We have compiled some commonly used parameters and usage methods for `sglang`, which can be found in the documentation [Advanced Command Line Parameters](./advanced_cli_parameters.md).
+
+## Extending MinerU Functionality with Configuration Files
+
+MinerU is now ready to use out of the box, but also supports extending functionality through configuration files. You can edit `mineru.json` file in your user directory to add custom configurations.  
+
+>[!IMPORTANT]
+>The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`, or you can create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.  
+
+Here are some available configuration options:  
+
+- `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to `$` symbol, can be modified to other symbols or strings as needed.
+- `llm-aided-config`: Used to configure parameters for LLM-assisted title hierarchy, compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
+- `models-dir`: Used to specify local model storage directory, please specify model directories for `pipeline` and `vlm` backends separately. After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
+

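Since the commands above forward all sglang parameters, a parse run can tune the engine inline. A hedged sketch that combines the `vlm-sglang-engine` backend named on this page with the `--mem-fraction-static` flag from the advanced parameters page (assuming a low-VRAM GPU; treat the combination as illustrative):

```bash
# Parse with the local sglang engine while lowering the KV cache size
# (--mem-fraction-static is an sglang flag forwarded by mineru)
mineru -p <input_path> -o <output_path> -b vlm-sglang-engine --mem-fraction-static 0.5
```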
+ 5 - 11
docs/zh/quick_start/docker_deployment.md

@@ -13,8 +13,6 @@ docker build -t mineru-sglang:latest -f Dockerfile .
 > [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/china/Dockerfile)默认使用`lmsysorg/sglang:v0.4.8.post1-cu126`作为基础镜像,支持Turing/Ampere/Ada Lovelace/Hopper平台,
 > 如您使用较新的`Blackwell`平台,请将基础镜像修改为`lmsysorg/sglang:v0.4.8.post1-cu128-b200` 再执行build操作。
 
----
-
 ## Docker说明

 Mineru的docker使用了`lmsysorg/sglang`作为基础镜像,因此在docker中默认集成了`sglang`推理加速框架和必需的依赖环境。因此在满足条件的设备上,您可以直接使用`sglang`加速VLM模型推理。
@@ -27,9 +25,7 @@ Mineru的docker使用了`lmsysorg/sglang`作为基础镜像,因此在docker中
 >
 > 如果您的设备不满足上述条件,您仍然可以使用MinerU的其他功能,但无法使用`sglang`加速VLM模型推理,即无法使用`vlm-sglang-engine`后端和启动`vlm-sglang-server`服务。
 
----
-
-## 启动 Docker 容器:
+## 启动 Docker 容器
 
 ```bash
 docker run --gpus all \
@@ -41,9 +37,7 @@ docker run --gpus all \
 ```

 执行该命令后,您将进入到Docker容器的交互式终端,并映射了一些端口用于可能会使用的服务,您可以直接在容器内运行MinerU相关命令来使用MinerU的功能。
-您也可以直接通过替换`/bin/bash`为服务启动命令来启动MinerU服务,详细说明请参考[MinerU使用文档](../usage/index.md)。
-
----
+您也可以直接通过替换`/bin/bash`为服务启动命令来启动MinerU服务,详细说明请参考[通过命令启动服务](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#apiwebuisglang-clientserver)。
 
 ## 通过 Docker Compose 直接启动服务
 
@@ -64,7 +58,7 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 ### 启动 sglang-server 服务
 并通过`vlm-sglang-client`后端连接`sglang-server`
   ```bash
-  docker compose -f compose.yaml --profile mineru-sglang-server up -d
+  docker compose -f compose.yaml --profile sglang-server up -d
   ```
   >[!TIP]
   >在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
@@ -76,7 +70,7 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 
 ### 启动 Web API 服务
   ```bash
-  docker compose -f compose.yaml --profile mineru-api up -d
+  docker compose -f compose.yaml --profile api up -d
   ```
   >[!TIP]
   >在浏览器中访问 `http://<server_ip>:8000/docs` 查看API文档。
@@ -85,7 +79,7 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 
 ### 启动 Gradio WebUI 服务
   ```bash
-  docker compose -f compose.yaml --profile mineru-gradio up -d
+  docker compose -f compose.yaml --profile gradio up -d
   ```
   >[!TIP]
   >

+ 6 - 1
docs/zh/quick_start/index.md

@@ -1,6 +1,6 @@
 # 快速开始
 
-如果遇到任何安装问题,请先查询 [FAQ](../FAQ/index.md) 
+如果遇到任何安装问题,请先查询 [FAQ](../faq/index.md) 
 
 ## 在线体验
 
@@ -93,4 +93,9 @@ MinerU提供了便捷的docker部署方式,这有助于快速搭建环境并
 
 ### 使用 MinerU
 
+最简单的命令行调用方式:
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
 您可以通过命令行、API、WebUI等多种方式使用MinerU进行PDF解析,具体使用方法请参考[使用指南](../usage/index.md)。

+ 1 - 13
docs/zh/usage/advanced_cli_parameters.md

@@ -1,6 +1,4 @@
-# 命令行参数进阶技巧
-
----
+# 命令行参数进阶
 
 ## SGLang 加速参数优化
 
@@ -11,8 +9,6 @@
 > - 如果您使用单张显卡遇到显存不足的情况时,可能需要调低KV缓存大小,`--mem-fraction-static 0.5`,如仍出现显存不足问题,可尝试进一步降低到`0.4`或更低
 > - 如您有两张以上显卡,可尝试通过张量并行(TP)模式简单扩充可用显存:`--tp-size 2`
 
----
-
 ### 性能优化参数
 > [!TIP]
 > 如果您已经可以正常使用sglang对vlm模型进行加速推理,但仍然希望进一步提升推理速度,可以尝试以下参数:
@@ -20,15 +16,11 @@
 > - 如果您有超过多张显卡,可以使用sglang的多卡并行模式来增加吞吐量:`--dp-size 2`
 > - 同时您可以启用`torch.compile`来将推理速度加速约15%:`--enable-torch-compile`
 
----
-
 ### 参数传递说明
 > [!TIP]
 > - 所有sglang官方支持的参数都可用通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`
 > - 如果您想了解更多有关`sglang`的参数使用方法,请参考 [sglang官方文档](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
 
----
-
 ## GPU 设备选择与配置

 ### CUDA_VISIBLE_DEVICES 基本用法
@@ -39,8 +31,6 @@
 >   ```
 > - 这种指定方式对所有的命令行调用都有效,包括 `mineru`、`mineru-sglang-server`、`mineru-gradio` 和 `mineru-api`,且对`pipeline`、`vlm`后端均适用。
 
----
-
 ### 常见设备配置示例
 > [!TIP]
 > 以下是一些常见的 `CUDA_VISIBLE_DEVICES` 设置示例:
@@ -52,8 +42,6 @@
 >   CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
 >   ```
 
----
-
 ## 实际应用场景

 > [!TIP]

+ 15 - 22
docs/zh/usage/cli_tools.md

@@ -31,33 +31,28 @@ mineru-api --help
 Usage: mineru-api [OPTIONS]

 Options:
-  --host TEXT     Server host (default: 127.0.0.1)
-  --port INTEGER  Server port (default: 8000)
-  --reload        Enable auto-reload (development mode)
-  --help          Show this message and exit.
+  --host TEXT     服务器主机地址(默认:127.0.0.1)
+  --port INTEGER  服务器端口(默认:8000)
+  --reload        启用自动重载(开发模式)
+  --help          显示此帮助信息并退出
 ```
 ```bash
 mineru-gradio --help
 Usage: mineru-gradio [OPTIONS]

 Options:
-  --enable-example BOOLEAN        Enable example files for input.The example
-                                  files to be input need to be placed in the
-                                  `example` folder within the directory where
-                                  the command is currently executed.
-  --enable-sglang-engine BOOLEAN  Enable SgLang engine backend for faster
-                                  processing.
-  --enable-api BOOLEAN            Enable gradio API for serving the
-                                  application.
-  --max-convert-pages INTEGER     Set the maximum number of pages to convert
-                                  from PDF to Markdown.
-  --server-name TEXT              Set the server name for the Gradio app.
-  --server-port INTEGER           Set the server port for the Gradio app.
+  --enable-example BOOLEAN        启用示例文件输入(需要将示例文件放置在当前
+                                  执行命令目录下的 `example` 文件夹中)
+  --enable-sglang-engine BOOLEAN  启用 SgLang 引擎后端以提高处理速度
+  --enable-api BOOLEAN            启用 Gradio API 以提供应用程序服务
+  --max-convert-pages INTEGER     设置从 PDF 转换为 Markdown 的最大页数
+  --server-name TEXT              设置 Gradio 应用程序的服务器主机名
+  --server-port INTEGER           设置 Gradio 应用程序的服务器端口
   --latex-delimiters-type [a|b|all]
-                                  Set the type of LaTeX delimiters to use in
-                                  Markdown rendering:'a' for type '$', 'b' for
-                                  type '()[]', 'all' for both types.
-  --help                          Show this message and exit.
+                                  设置在 Markdown 渲染中使用的 LaTeX 分隔符类型
+                                  ('a' 表示 '$' 类型,'b' 表示 '()[]' 类型,
+                                  'all' 表示两种类型都使用)
+  --help                          显示此帮助信息并退出
 ```

 ## 环境变量说明
@@ -71,5 +66,3 @@ MinerU命令行工具的某些参数存在相同功能的环境变量配置,
 - `MINERU_TOOLS_CONFIG_JSON`:用于指定配置文件路径,默认为用户目录下的`mineru.json`,可通过环境变量指定其他配置文件路径。
 - `MINERU_FORMULA_ENABLE`:用于启用公式解析,默认为`true`,可通过环境变量设置为`false`来禁用公式解析。
 - `MINERU_TABLE_ENABLE`:用于启用表格解析,默认为`true`,可通过环境变量设置为`false`来禁用表格解析。
-
-

+ 10 - 82
docs/zh/usage/index.md

@@ -1,88 +1,16 @@
-# 使用 MinerU
+# 使用指南
 
-## 快速配置模型源
-MinerU默认使用`huggingface`作为模型源,若用户网络无法访问`huggingface`,可以通过环境变量便捷地切换模型源为`modelscope`:
-```bash
-export MINERU_MODEL_SOURCE=modelscope
-```
-有关模型源配置和自定义本地模型路径的更多信息,请参考文档中的[模型源说明](./model_source.md)。
+本章节提供了项目的完整使用说明。我们将通过以下几个部分,帮助您从基础到进阶逐步掌握项目的使用方法:
 
----
+## 目录
 
-## 通过命令行快速使用
-MinerU内置了命令行工具,用户可以通过命令行快速使用MinerU进行PDF解析:
-```bash
-# 默认使用pipeline后端解析
-mineru -p <input_path> -o <output_path>
-```
-> [!TIP]
-> - `<input_path>`:本地 PDF/图片 文件或目录
-> - `<output_path>`:输出目录
-> 
-> 更多关于输出文件的信息,请参考[输出文件说明](../output_files.md)。
+- [快速使用](./quick_usage.md) - 快速上手和基本使用
+- [模型源配置](./model_source.md) - 模型源的详细配置说明  
+- [命令行工具](./cli_tools.md) - 命令行工具的详细参数说明
+- [进阶优化参数](./advanced_cli_parameters.md) - 一些适配命令行工具的进阶参数说明
 
-> [!NOTE]
-> 命令行工具会在Linux和macOS系统自动尝试cuda/mps加速。Windows用户如需使用cuda加速,
-> 请前往 [Pytorch官网](https://pytorch.org/get-started/locally/) 选择适合自己cuda版本的命令安装支持加速的`torch`和`torchvision`。
+## 开始使用
 
+建议按照上述顺序阅读文档,这样可以帮助您更好地理解和使用项目功能。
 
-```bash
-# 或指定vlm后端解析
-mineru -p <input_path> -o <output_path> -b vlm-transformers
-```
-> [!TIP]
-> vlm后端另外支持`sglang`加速,与`transformers`后端相比,`sglang`的加速比可达20~30倍,可以在[扩展模块安装指南](../quick_start/extension_modules.md)中查看支持`sglang`加速的完整包安装方法。
-
-如果需要通过自定义参数调整解析选项,您也可以在文档中查看更详细的[命令行工具使用说明](./cli_tools.md)。
-
----
-
-## 通过api、webui、sglang-client/server进阶使用
-
-- 通过python api直接调用:[Python 调用示例](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
-- 通过fast api方式调用:
-  ```bash
-  mineru-api --host 127.0.0.1 --port 8000
-  ```
-  >[!TIP]
-  >在浏览器中访问 `http://127.0.0.1:8000/docs` 查看API文档。
-- 启动gradio webui 可视化前端:
-  ```bash
-  # 使用 pipeline/vlm-transformers/vlm-sglang-client 后端
-  mineru-gradio --server-name 127.0.0.1 --server-port 7860
-  # 或使用 vlm-sglang-engine/pipeline 后端(需安装sglang环境)
-  mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
-  ```
-  >[!TIP]
-  > 
-  >- 在浏览器中访问 `http://127.0.0.1:7860` 使用 Gradio WebUI。
-  >- 访问 `http://127.0.0.1:7860/?view=api` 使用 Gradio API。
-- 使用`sglang-client/server`方式调用:
-  ```bash
-  # 启动sglang server(需要安装sglang环境)
-  mineru-sglang-server --port 30000
-  ``` 
-  >[!TIP]
-  >在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
-  > ```bash
-  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
-  > ```
-
-> [!TIP]
-> 所有sglang官方支持的参数都可用通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`,
-> 我们整理了一些`sglang`使用中的常用参数和使用方法,可以在文档[命令行进阶参数](./advanced_cli_parameters.md)中获取。
-
----
-
-## 基于配置文件扩展 MinerU 功能
-
-MinerU 现已实现开箱即用,但也支持通过配置文件扩展功能。您可通过编辑用户目录下的 `mineru.json` 文件,添加自定义配置。
-
->[!TIP]
->`mineru.json` 文件会在您使用内置模型下载命令 `mineru-models-download` 时自动生成,也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建。  
-
-以下是一些可用的配置选项: 
-
-- `latex-delimiter-config`:用于配置 LaTeX 公式的分隔符,默认为`$`符号,可根据需要修改为其他符号或字符串。
-- `llm-aided-config`:用于配置 LLM 辅助标题分级的相关参数,兼容所有支持`openai协议`的 LLM 模型,默认使用`阿里云百炼`的`qwen2.5-32b-instruct`模型,您需要自行配置 API 密钥并将`enable`设置为`true`来启用此功能。
-- `models-dir`:用于指定本地模型存储目录,请为`pipeline`和`vlm`后端分别指定模型目录,指定目录后您可通过配置环境变量`export MINERU_MODEL_SOURCE=local`来使用本地模型。
+如果您在使用过程中遇到问题,请查看 [FAQ](../faq/index.md)

+ 1 - 1
docs/zh/usage/model_source.md

@@ -37,7 +37,7 @@ mineru-models-download --help
 ```bash
 mineru-models-download
 ```
->[!TIP]
+> [!NOTE]
 >- 下载完成后,模型路径会在当前终端窗口输出,并自动写入用户目录下的 `mineru.json`。
 >- 您也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建配置文件。
 >- 模型下载到本地后,您可以自由移动模型文件夹到其他位置,同时需要在 `mineru.json` 中更新模型路径。

+ 81 - 0
docs/zh/usage/quick_usage.md

@@ -0,0 +1,81 @@
+# 使用 MinerU
+
+## 快速配置模型源
+MinerU默认使用`huggingface`作为模型源,若用户网络无法访问`huggingface`,可以通过环境变量便捷地切换模型源为`modelscope`:
+```bash
+export MINERU_MODEL_SOURCE=modelscope
+```
+有关模型源配置和自定义本地模型路径的更多信息,请参考文档中的[模型源说明](./model_source.md)。
+
+## 通过命令行快速使用
+MinerU内置了命令行工具,用户可以通过命令行快速使用MinerU进行PDF解析:
+```bash
+# 默认使用pipeline后端解析
+mineru -p <input_path> -o <output_path>
+```
+> [!TIP]
+> - `<input_path>`:本地 PDF/图片 文件或目录
+> - `<output_path>`:输出目录
+> 
+> 更多关于输出文件的信息,请参考[输出文件说明](../reference/output_files.md)。
+
+> [!NOTE]
+> 命令行工具会在Linux和macOS系统自动尝试cuda/mps加速。Windows用户如需使用cuda加速,
+> 请前往 [Pytorch官网](https://pytorch.org/get-started/locally/) 选择适合自己cuda版本的命令安装支持加速的`torch`和`torchvision`。
+
+```bash
+# 或指定vlm后端解析
+mineru -p <input_path> -o <output_path> -b vlm-transformers
+```
+> [!TIP]
+> vlm后端另外支持`sglang`加速,与`transformers`后端相比,`sglang`的加速比可达20~30倍,可以在[扩展模块安装指南](../quick_start/extension_modules.md)中查看支持`sglang`加速的完整包安装方法。
+
+如果需要通过自定义参数调整解析选项,您也可以在文档中查看更详细的[命令行工具使用说明](./cli_tools.md)。
+
+## 通过api、webui、sglang-client/server进阶使用
+
+- 通过python api直接调用:[Python 调用示例](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
+- 通过fast api方式调用:
+  ```bash
+  mineru-api --host 0.0.0.0 --port 8000
+  ```
+  >[!TIP]
+  >在浏览器中访问 `http://127.0.0.1:8000/docs` 查看API文档。
+- 启动gradio webui 可视化前端:
+  ```bash
+  # 使用 pipeline/vlm-transformers/vlm-sglang-client 后端
+  mineru-gradio --server-name 0.0.0.0 --server-port 7860
+  # 或使用 vlm-sglang-engine/pipeline 后端(需安装sglang环境)
+  mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-sglang-engine true
+  ```
+  >[!TIP]
+  > 
+  >- 在浏览器中访问 `http://127.0.0.1:7860` 使用 Gradio WebUI。
+  >- 访问 `http://127.0.0.1:7860/?view=api` 使用 Gradio API。
+- 使用`sglang-client/server`方式调用:
+  ```bash
+  # 启动sglang server(需要安装sglang环境)
+  mineru-sglang-server --port 30000
+  ``` 
+  >[!TIP]
+  >在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
+  > ```bash
+  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
+  > ```
+
+> [!NOTE]
+> 所有sglang官方支持的参数都可用通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`,
+> 我们整理了一些`sglang`使用中的常用参数和使用方法,可以在文档[命令行进阶参数](./advanced_cli_parameters.md)中获取。
+
+## 基于配置文件扩展 MinerU 功能
+
+MinerU 现已实现开箱即用,但也支持通过配置文件扩展功能。您可通过编辑用户目录下的 `mineru.json` 文件,添加自定义配置。
+
+>[!IMPORTANT]
+>`mineru.json` 文件会在您使用内置模型下载命令 `mineru-models-download` 时自动生成,也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建。  
+
+以下是一些可用的配置选项: 
+
+- `latex-delimiter-config`:用于配置 LaTeX 公式的分隔符,默认为`$`符号,可根据需要修改为其他符号或字符串。
+- `llm-aided-config`:用于配置 LLM 辅助标题分级的相关参数,兼容所有支持`openai协议`的 LLM 模型,默认使用`阿里云百炼`的`qwen2.5-32b-instruct`模型,您需要自行配置 API 密钥并将`enable`设置为`true`来启用此功能。
+- `models-dir`:用于指定本地模型存储目录,请为`pipeline`和`vlm`后端分别指定模型目录,指定目录后您可通过配置环境变量`export MINERU_MODEL_SOURCE=local`来使用本地模型。

+ 1 - 1
mineru/version.py

@@ -1 +1 @@
-__version__ = "2.1.0"
+__version__ = "2.1.1"

+ 5 - 3
mkdocs.yml

@@ -56,12 +56,12 @@ extra:
       name: GitHub
     - icon: fontawesome/brands/x-twitter
       link: https://x.com/OpenDataLab_AI
-      name: Twitter
+      name: X-Twitter
     - icon: fontawesome/brands/discord
       link: https://discord.gg/Tdedn9GTXq
       name: Discord
     - icon: fontawesome/brands/weixin
-      link: https://mineru.space/common/qun/?qid=362634
+      link: http://mineru.space/s/V85Yl
       name: WeChat
     - icon: material/email
       link: mailto:OpenDataLab@pjlab.org.cn
@@ -78,8 +78,9 @@ nav:
       - Docker Deployment: quick_start/docker_deployment.md
     - Usage:
       - Usage: usage/index.md
-      - CLI Tools: usage/cli_tools.md
+      - Quick Usage: usage/quick_usage.md
       - Model Source: usage/model_source.md
+      - CLI Tools: usage/cli_tools.md
       - Advanced CLI Parameters: usage/advanced_cli_parameters.md
     - Reference:
       - Output File Format: reference/output_files.md
@@ -117,6 +118,7 @@ plugins:
             Extension Modules: 扩展模块安装
             Docker Deployment: Docker部署
             Usage: 使用方法
+            Quick Usage: 快速使用
             CLI Tools: 命令行工具
             Model Source: 模型源
             Advanced CLI Parameters: 命令行进阶参数

+ 8 - 0
signatures/version1/cla.json

@@ -383,6 +383,14 @@
       "created_at": "2025-06-30T05:44:13Z",
       "created_at": "2025-06-30T05:44:13Z",
       "repoId": 765083837,
       "repoId": 765083837,
       "pullRequestNo": 2831
       "pullRequestNo": 2831
+    },
+    {
+      "name": "Tuyohai",
+      "id": 98230804,
+      "comment_id": 3077606100,
+      "created_at": "2025-07-16T08:53:24Z",
+      "repoId": 765083837,
+      "pullRequestNo": 3070
     }
   ]
 }