
Merge branch 'opendatalab:dev' into dev

Sidney233 4 months ago
Parent
Commit
da048cbf76

File diff suppressed because it is too large
+ 0 - 0
README.md


File diff suppressed because it is too large
+ 0 - 0
README_zh-CN.md


+ 0 - 42
docs/en/FAQ/index.md

@@ -1,42 +0,0 @@
-# Frequently Asked Questions
-
-If your question is not listed, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to communicate with the AI assistant, which can solve most common problems.
-
-If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
-
-## 1. Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2
-
-The `libgl` library is missing in Ubuntu 22.04 on WSL2. You can install the `libgl` library with the following command to resolve the issue:
-
-```bash
-sudo apt-get install libgl1-mesa-glx
-```
-
-Reference: https://github.com/opendatalab/MinerU/issues/388
-
-
-## 2. Error when installing MinerU on CentOS 7 or Ubuntu 18: `ERROR: Failed building wheel for simsimd`
-
-The new version of albumentations (1.4.21) introduces a dependency on simsimd. Since the pre-built package of simsimd for Linux requires a glibc version greater than or equal to 2.28, this causes installation issues on some Linux distributions released before 2019. You can resolve this issue by using the following command:
-```
-conda create -n mineru python=3.11 -y
-conda activate mineru
-pip install -U "mineru[pipeline_old_linux]"
-```
-
-Reference: https://github.com/opendatalab/MinerU/issues/1004
-
-
-## 3. Missing text information in parsing results when installing and using on Linux systems.
-
-MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, due to missing CJK fonts, some text may be lost during the process of rendering PDFs to images.
-To solve this problem, you can install the noto font package with the following commands, which are effective on Ubuntu/Debian systems:
-```bash
-sudo apt update
-sudo apt install fonts-noto-core
-sudo apt install fonts-noto-cjk
-fc-cache -fv
-```
-You can also directly use our [Docker deployment](../quick_start/docker_deployment.md) method to build the image, which includes the above font packages by default.
-
-Reference: https://github.com/opendatalab/MinerU/issues/2915

+ 2 - 0
docs/en/demo/index.md

@@ -0,0 +1,2 @@
+<script type="module" src="https://gradio.s3-us-west-2.amazonaws.com/5.35.0/gradio.js"></script>
+<gradio-app src="https://opendatalab-mineru.hf.space"></gradio-app>

+ 42 - 0
docs/en/faq/index.md

@@ -0,0 +1,42 @@
+# Frequently Asked Questions
+
+If your question is not listed, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to communicate with the AI assistant, which can solve most common problems.
+
+If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
+
+??? question "Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2"
+
+    The `libgl` library is missing in Ubuntu 22.04 on WSL2. You can install the `libgl` library with the following command to resolve the issue:
+    
+    ```bash
+    sudo apt-get install libgl1-mesa-glx
+    ```
+    
+    Reference: [#388](https://github.com/opendatalab/MinerU/issues/388)
+
+
+??? question "Error when installing MinerU on CentOS 7 or Ubuntu 18: `ERROR: Failed building wheel for simsimd`"
+
+    The new version of albumentations (1.4.21) introduces a dependency on simsimd. Since the pre-built package of simsimd for Linux requires a glibc version greater than or equal to 2.28, this causes installation issues on some Linux distributions released before 2019. You can resolve this issue by using the following commands:
+    ```bash
+    conda create -n mineru python=3.11 -y
+    conda activate mineru
+    pip install -U "mineru[pipeline_old_linux]"
+    ```
+    
+    Reference: [#1004](https://github.com/opendatalab/MinerU/issues/1004)
+
+
+??? question "Missing text information in parsing results when installing and using on Linux systems."
+
+    MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, due to missing CJK fonts, some text may be lost during the process of rendering PDFs to images.
+    To solve this problem, you can install the noto font package with the following commands, which are effective on Ubuntu/Debian systems:
+    ```bash
+    sudo apt update
+    sudo apt install fonts-noto-core
+    sudo apt install fonts-noto-cjk
+    fc-cache -fv
+    ```
+    You can also directly use our [Docker deployment](../quick_start/docker_deployment.md) method to build the image, which includes the above font packages by default.
+    
+    Reference: [#2915](https://github.com/opendatalab/MinerU/issues/2915)
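
The simsimd entry in the FAQ above depends on the host's glibc version (2.28 or newer for the pre-built package). The following is a minimal sketch, not part of the commit, of how that condition could be checked before choosing the `pipeline_old_linux` extra. The helper names `parse_version` and `needs_old_linux_extra` are hypothetical; only `platform.libc_ver()` is a standard-library call.

```python
import platform

# Minimum glibc needed by the pre-built simsimd package, per the FAQ entry above.
MIN_GLIBC = (2, 28)


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '2.17' into a comparable tuple (2, 17)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def needs_old_linux_extra(lib: str, version: str) -> bool:
    """Return True when the host reports a glibc older than MIN_GLIBC (hypothetical helper)."""
    if lib != "glibc" or not version:
        return False  # unknown or non-glibc libc: this check cannot decide
    return parse_version(version) < MIN_GLIBC


# platform.libc_ver() returns a (lib, version) pair; both are empty strings
# when the libc cannot be detected (for example on non-Linux hosts).
lib, version = platform.libc_ver()
print("detected libc:", lib or "unknown", version)
print("consider mineru[pipeline_old_linux]:", needs_old_linux_extra(lib, version))
```

A `True` result only suggests that the alternative install command from the FAQ entry is worth trying; it does not verify that the installation will succeed.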

File diff suppressed because it is too large
+ 2 - 1
docs/en/index.md


+ 21 - 7
docs/en/quick_start/docker_deployment.md

@@ -5,12 +5,12 @@ MinerU provides a convenient Docker deployment method, which helps quickly set u
 ## Build Docker Image using Dockerfile:
 
 ```bash
-wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/Dockerfile
+wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/global/Dockerfile
 docker build -t mineru-sglang:latest -f Dockerfile .
 ```
 
 > [!TIP]
-> The [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/china/Dockerfile) uses `lmsysorg/sglang:v0.4.8.post1-cu126` as the base image by default, supporting Turing/Ampere/Ada Lovelace/Hopper platforms.
+> The [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/global/Dockerfile) uses `lmsysorg/sglang:v0.4.8.post1-cu126` as the base image by default, supporting Turing/Ampere/Ada Lovelace/Hopper platforms.
 > If you are using the newer `Blackwell` platform, please modify the base image to `lmsysorg/sglang:v0.4.8.post1-cu128-b200` before executing the build operation.
 
 ## Docker Description
@@ -19,6 +19,7 @@ MinerU's Docker uses `lmsysorg/sglang` as the base image, so it includes the `sg
 
 > [!NOTE]
 > Requirements for using `sglang` to accelerate VLM model inference:
+> 
 > - Device must have Turing architecture or later graphics cards with 8GB+ available VRAM.
 > - The host machine's graphics driver should support CUDA 12.6 or higher; `Blackwell` platform should support CUDA 12.8 or higher. You can check the driver version using the `nvidia-smi` command.
 > - Docker container must have access to the host machine's graphics devices.
@@ -41,28 +42,41 @@ You can also directly start MinerU services by replacing `/bin/bash` with servic
 
 ## Start Services Directly with Docker Compose
 
-We provide a `compose.yml` file that you can use to quickly start MinerU services.
+We provide a [compose.yaml](https://github.com/opendatalab/MinerU/blob/master/docker/compose.yaml) file that you can use to quickly start MinerU services.
 
 ```bash
 # Download compose.yaml file
 wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 ```
 
+>[!NOTE]
+>
+>- The `compose.yaml` file contains configurations for multiple MinerU services; you can choose to start specific services as needed.
+>- Different services might have additional parameter configurations, which you can view and edit in the `compose.yaml` file.
+>- Due to the pre-allocation of GPU memory by the `sglang` inference acceleration framework, you may not be able to run multiple `sglang` services simultaneously on the same machine. Therefore, ensure that other services that might use GPU memory have been stopped before starting the `vlm-sglang-server` service or using the `vlm-sglang-engine` backend.
+
 - Start `sglang-server` service and connect to `sglang-server` via `vlm-sglang-client` backend:
   ```bash
   docker compose -f compose.yaml --profile mineru-sglang-server up -d
-  # In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
-  mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<server_ip>:30000
   ```
+  >[!TIP]
+  >In another terminal, connect to the sglang server via the sglang client (this only requires CPU and network access; no sglang environment is needed):
+  > ```bash
+  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<server_ip>:30000
+  > ```
 
 - Start API service:
   ```bash
   docker compose -f compose.yaml --profile mineru-api up -d
   ```
-  Access `http://<server_ip>:8000/docs` in your browser to view the API documentation.
+  >[!TIP]
+  >Access `http://<server_ip>:8000/docs` in your browser to view the API documentation.
 
 - Start Gradio WebUI service:
   ```bash
   docker compose -f compose.yaml --profile mineru-gradio up -d
   ```
-  Access `http://<server_ip>:7860` in your browser to use the Gradio WebUI or access `http://<server_ip>:7860/?view=api` to use the Gradio API.
+  >[!TIP]
+  >
+  >- Access `http://<server_ip>:7860` in your browser to use the Gradio WebUI.
+  >- Access `http://<server_ip>:7860/?view=api` to use the Gradio API.
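
The three Compose profiles in this file expose ports 30000 (sglang server), 8000 (API), and 7860 (Gradio). As a rough sketch that is not part of the commit, the following checks whether those ports accept TCP connections after `docker compose ... up -d`. The names `MINERU_PORTS`, `port_open`, and `check_services` are hypothetical, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Ports taken from the URLs in the documentation above; the service labels
# are illustrative names, not identifiers defined by MinerU.
MINERU_PORTS = {"sglang-server": 30000, "api": 8000, "gradio": 7860}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str, ports: dict[str, int] = MINERU_PORTS) -> dict[str, bool]:
    """Map each service label to whether its port currently accepts connections."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

For example, `check_services("127.0.0.1")` would report which of the three services are reachable on the local machine; replace the address with the value used for `<server_ip>`.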

File diff suppressed because it is too large
+ 0 - 1
docs/en/quick_start/index.md


+ 214 - 163
docs/en/usage/output_files.md → docs/en/reference/output_files.md

@@ -1,69 +1,118 @@
-# Overview
+# MinerU Output Files Documentation
 
-After executing the `mineru` command, in addition to outputting files related to markdown, several other files unrelated to markdown will also be generated. These files will be introduced one by one.
+## Overview
 
-## some_pdf_layout.pdf
+After executing the `mineru` command, in addition to the main markdown file output, multiple auxiliary files are generated for debugging, quality inspection, and further processing. These files include:
 
-Each page's layout consists of one or more bounding boxes. The number in the top-right corner of each box indicates the reading order. Additionally, different content blocks are highlighted with distinct background colors within the layout.pdf.
-![layout example](../images/layout_example.png)
+- **Visual debugging files**: Help users intuitively understand the document parsing process and results
+- **Structured data files**: Contain detailed parsing data for secondary development
 
-## some_pdf_spans.pdf(Applicable only to the pipeline backend)
+The following sections provide detailed descriptions of each file's purpose and format.
 
-All spans on the page are drawn with different colored line frames according to the span type. This file can be used for quality control, allowing for quick identification of issues such as missing text or unrecognized inline formulas.
+## Visual Debugging Files
 
-![spans example](../images/spans_example.png)
+### Layout Analysis File (layout.pdf)
 
-## some_pdf_model.json(Applicable only to the pipeline backend)
+**File naming format**: `{original_filename}_layout.pdf`
 
-### Structure Definition
+**Functionality**:
+
+- Visualizes layout analysis results for each page
+- Numbers in the top-right corner of each detection box indicate reading order
+- Different background colors distinguish different types of content blocks
+
+**Use cases**:
+
+- Check if layout analysis is correct
+- Verify if reading order is reasonable
+- Debug layout-related issues
+
+![layout page example](../images/layout_example.png)
+
+### Text Spans File (spans.pdf)
+
+> [!NOTE]
+> Only applicable to the pipeline backend
+
+**File naming format**: `{original_filename}_spans.pdf`
+
+**Functionality**:
+
+- Uses different colored line boxes to annotate page content based on span type
+- Used for quality inspection and issue troubleshooting
+
+**Use cases**:
+
+- Quickly troubleshoot text loss issues
+- Check inline formula recognition
+- Verify text segmentation accuracy
+
+![span page example](../images/spans_example.png)
+
+## Structured Data Files
+
+### Model Inference Results (model.json)
+
+> [!NOTE]
+> Only applicable to the pipeline backend
+
+**File naming format**: `{original_filename}_model.json`
+
+#### Data Structure Definition
 
 ```python
 from pydantic import BaseModel, Field
 from enum import IntEnum
 
 class CategoryType(IntEnum):
-     title = 0               # Title
-     plain_text = 1          # Text
-     abandon = 2             # Includes headers, footers, page numbers, and page annotations
-     figure = 3              # Image
-     figure_caption = 4      # Image description
-     table = 5               # Table
-     table_caption = 6       # Table description
-     table_footnote = 7      # Table footnote
-     isolate_formula = 8     # Block formula
-     formula_caption = 9     # Formula label
-
-     embedding = 13          # Inline formula
-     isolated = 14           # Block formula
-     text = 15               # OCR recognition result
-
+    """Content category enumeration"""
+    title = 0               # Title
+    plain_text = 1          # Text
+    abandon = 2             # Including headers, footers, page numbers, and page annotations
+    figure = 3              # Image
+    figure_caption = 4      # Image caption
+    table = 5               # Table
+    table_caption = 6       # Table caption
+    table_footnote = 7      # Table footnote
+    isolate_formula = 8     # Interline formula
+    formula_caption = 9     # Interline formula number
+    embedding = 13          # Inline formula
+    isolated = 14           # Interline formula
+    text = 15               # OCR recognition result
 
 class PageInfo(BaseModel):
-    page_no: int = Field(description="Page number, the first page is 0", ge=0)
+    """Page information"""
+    page_no: int = Field(description="Page number, first page is 0", ge=0)
     height: int = Field(description="Page height", gt=0)
     width: int = Field(description="Page width", ge=0)
 
 class ObjectInferenceResult(BaseModel):
+    """Object recognition result"""
     category_id: CategoryType = Field(description="Category", ge=0)
-    poly: list[float] = Field(description="Quadrilateral coordinates, representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively")
-    score: float = Field(description="Confidence of the inference result")
+    poly: list[float] = Field(description="Quadrilateral coordinates, format: [x0,y0,x1,y1,x2,y2,x3,y3]")
+    score: float = Field(description="Confidence score of inference result")
     latex: str | None = Field(description="LaTeX parsing result", default=None)
     html: str | None = Field(description="HTML parsing result", default=None)
 
 class PageInferenceResults(BaseModel):
-     layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results", ge=0)
-     page_info: PageInfo = Field(description="Page metadata")
+    """Page inference results"""
+    layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results")
+    page_info: PageInfo = Field(description="Page metadata")
 
-
-# The inference results of all pages, ordered by page number, are stored in a list as the inference results of MinerU
+# Complete inference results
 inference_result: list[PageInferenceResults] = []
-
 ```
 
-The format of the poly coordinates is \[x0, y0, x1, y1, x2, y2, x3, y3\], representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively.
-![Poly Coordinate Diagram](../images/poly.png)
+#### Coordinate System Description
 
-### example
+`poly` coordinate format: `[x0, y0, x1, y1, x2, y2, x3, y3]`
+
+- Represents coordinates of top-left, top-right, bottom-right, bottom-left points respectively
+- Coordinate origin is at the top-left corner of the page
+
+![poly coordinate diagram](../images/poly.png)
+
+#### Sample Data
 
 ```json
 [
@@ -116,142 +165,127 @@ The format of the poly coordinates is \[x0, y0, x1, y1, x2, y2, x3, y3\], repres
 ]
 ```
 
-## some_pdf_model_output.txt (Applicable only to the VLM backend)
-
-This file contains the output of the VLM model, with each page's output separated by `----`.  
-Each page's output consists of text blocks starting with `<|box_start|>` and ending with `<|md_end|>`.  
-The meaning of each field is as follows:  
-- `<|box_start|>x0 y0 x1 y1<|box_end|>`  
-  x0 y0 x1 y1 represent the coordinates of a quadrilateral, indicating the top-left and bottom-right points. The values are based on a normalized page size of 1000x1000.
-- `<|ref_start|>type<|ref_end|>`  
-  `type` indicates the block type. Possible values are:
-  ```json
-  {
-      "text": "Text",
-      "title": "Title",
-      "image": "Image",
-      "image_caption": "Image Caption",
-      "image_footnote": "Image Footnote",
-      "table": "Table",
-      "table_caption": "Table Caption",
-      "table_footnote": "Table Footnote",
-      "equation": "Interline Equation"
-  }
-  ```
-- `<|md_start|>Markdown content<|md_end|>`  
-  This field contains the Markdown content of the block. If `type` is `text`, the end of the text may contain the `<|txt_contd|>` tag, indicating that this block can be connected with the following `text` block(s).
-  If `type` is `table`, the content is in `otsl` format and needs to be converted into HTML for rendering in Markdown.
-
-## some_pdf_middle.json
-
-| Field Name     | Description                                                                                                    |
-|:---------------| :------------------------------------------------------------------------------------------------------------- |
-| pdf_info       | list, each element is a dict representing the parsing result of each PDF page, see the table below for details |
-| \_backend      | pipeline \| vlm, used to indicate the mode used in this intermediate parsing state                                  |
-| \_version_name | string, indicates the version of mineru used in this parsing                                                |
-
-<br>
-
-**pdf_info**
+### VLM Output Results (model_output.txt)
 
-Field structure description
+> [!NOTE]
+> Only applicable to the VLM backend
 
-| Field Name          | Description                                                                                                        |
-| :------------------ | :----------------------------------------------------------------------------------------------------------------- |
-| preproc_blocks      | Intermediate result after PDF preprocessing, not yet segmented                                                     |
-| layout_bboxes       | Layout segmentation results, containing layout direction (vertical, horizontal), and bbox, sorted by reading order |
-| page_idx            | Page number, starting from 0                                                                                       |
-| page_size           | Page width and height                                                                                              |
-| \_layout_tree       | Layout tree structure                                                                                              |
-| images              | list, each element is a dict representing an img_block                                                             |
-| tables              | list, each element is a dict representing a table_block                                                            |
-| interline_equations | list, each element is a dict representing an interline_equation_block                                              |
-| discarded_blocks    | List, block information returned by the model that needs to be dropped                                             |
-| para_blocks         | Result after segmenting preproc_blocks                                                                             |
+**File naming format**: `{original_filename}_model_output.txt`
 
-In the above table, `para_blocks` is an array of dicts, each dict representing a block structure. A block can support up to one level of nesting.
+#### File Format Description
 
-<br>
+- Uses `----` to separate output results for each page
+- Each page contains multiple text blocks starting with `<|box_start|>` and ending with `<|md_end|>`
 
-**block**
+#### Field Meanings
 
-The outer block is referred to as a first-level block, and the fields in the first-level block include:
+| Tag | Format | Description |
+|-----|--------|-------------|
+| Bounding box | `<\|box_start\|>x0 y0 x1 y1<\|box_end\|>` | Rectangle coordinates (top-left and bottom-right points), with values scaled to a 1000×1000 page |
+| Type tag | `<\|ref_start\|>type<\|ref_end\|>` | Content block type identifier |
+| Content | `<\|md_start\|>markdown content<\|md_end\|>` | Markdown content of the block |
 
-| Field Name | Description                                                    |
-| :--------- | :------------------------------------------------------------- |
-| type       | Block type (table\|image)                                      |
-| bbox       | Block bounding box coordinates                                 |
-| blocks     | list, each element is a dict representing a second-level block |
+#### Supported Content Types
+
+```json
+{
+    "text": "Text",
+    "title": "Title", 
+    "image": "Image",
+    "image_caption": "Image caption",
+    "image_footnote": "Image footnote",
+    "table": "Table",
+    "table_caption": "Table caption", 
+    "table_footnote": "Table footnote",
+    "equation": "Interline formula"
+}
+```
 
-<br>
-There are only two types of first-level blocks: "table" and "image". All other blocks are second-level blocks.
+#### Special Tags
 
-The fields in a second-level block include:
+- `<|txt_contd|>`: Appears at the end of text, indicating that this text block can be connected with subsequent text blocks
+- Table content uses `otsl` format and needs to be converted to HTML for rendering in Markdown
 
-| Field Name | Description                                                                                                 |
-| :--------- | :---------------------------------------------------------------------------------------------------------- |
-| type       | Block type                                                                                                  |
-| bbox       | Block bounding box coordinates                                                                              |
-| lines      | list, each element is a dict representing a line, used to describe the composition of a line of information |
+### Intermediate Processing Results (middle.json)
 
-Detailed explanation of second-level block types
+**File naming format**: `{original_filename}_middle.json`
 
-| type               | Description            |
-| :----------------- | :--------------------- |
-| image_body         | Main body of the image |
-| image_caption      | Image description text |
-| image_footnote     | Image footnote         |
-| table_body         | Main body of the table |
-| table_caption      | Table description text |
-| table_footnote     | Table footnote         |
-| text               | Text block             |
-| title              | Title block            |
-| index              | Index block            |
-| list               | List block             |
-| interline_equation | Block formula          |
+#### Top-level Structure
 
-<br>
+| Field Name | Type | Description |
+|------------|------|-------------|
+| `pdf_info` | `list[dict]` | Array of parsing results for each page |
+| `_backend` | `string` | Parsing mode: `pipeline` or `vlm` |
+| `_version_name` | `string` | MinerU version number |
 
-**line**
+#### Page Information Structure (pdf_info)
 
-The field format of a line is as follows:
+| Field Name | Description |
+|------------|-------------|
+| `preproc_blocks` | Unsegmented intermediate results after PDF preprocessing |
+| `layout_bboxes` | Layout segmentation results, including layout direction and bounding boxes, sorted by reading order |
+| `page_idx` | Page number, starting from 0 |
+| `page_size` | Page width and height `[width, height]` |
+| `_layout_tree` | Layout tree structure |
+| `images` | Image block information list |
+| `tables` | Table block information list |
+| `interline_equations` | Interline formula block information list |
+| `discarded_blocks` | Block information to be discarded |
+| `para_blocks` | Content block results after segmentation |
 
-| Field Name | Description                                                                                             |
-| :--------- | :------------------------------------------------------------------------------------------------------ |
-| bbox       | Bounding box coordinates of the line                                                                    |
-| spans      | list, each element is a dict representing a span, used to describe the composition of the smallest unit |
-
-<br>
+#### Block Structure Hierarchy
 
-**span**
+```
+Level 1 blocks (table | image)
+└── Level 2 blocks
+    └── Lines
+        └── Spans
+```
 
-| Field Name          | Description                                                                                              |
-| :------------------ | :------------------------------------------------------------------------------------------------------- |
-| bbox                | Bounding box coordinates of the span                                                                     |
-| type                | Type of the span                                                                                         |
-| content \| img_path | Text spans use content, chart spans use img_path to store the actual text or screenshot path information |
+#### Level 1 Block Fields
 
-The types of spans are as follows:
+| Field Name | Description |
+|------------|-------------|
+| `type` | Block type: `table` or `image` |
+| `bbox` | Rectangular box coordinates of the block `[x0, y0, x1, y1]` |
+| `blocks` | List of contained level 2 blocks |
 
-| type               | Description    |
-| :----------------- | :------------- |
-| image              | Image          |
-| table              | Table          |
-| text               | Text           |
-| inline_equation    | Inline formula |
-| interline_equation | Block formula  |
+#### Level 2 Block Fields
 
-**Summary**
+| Field Name | Description |
+|------------|-------------|
+| `type` | Block type (see table below) |
+| `bbox` | Rectangular box coordinates of the block |
+| `lines` | List of contained line information |
 
-A span is the smallest storage unit for all elements.
+#### Level 2 Block Types
 
-The elements stored within para_blocks are block information.
+| Type | Description |
+|------|-------------|
+| `image_body` | Image body |
+| `image_caption` | Image caption text |
+| `image_footnote` | Image footnote |
+| `table_body` | Table body |
+| `table_caption` | Table caption text |
+| `table_footnote` | Table footnote |
+| `text` | Text block |
+| `title` | Title block |
+| `index` | Index block |
+| `list` | List block |
+| `interline_equation` | Interline formula block |
 
-The block structure is as follows:
+#### Line and Span Structure
 
-First-level block (if any) -> Second-level block -> Line -> Span
+**Line fields**:
+- `bbox`: Rectangular box coordinates of the line
+- `spans`: List of contained spans
 
-### example
+**Span fields**:
+- `bbox`: Rectangular box coordinates of the span
+- `type`: Span type (`image`, `table`, `text`, `inline_equation`, `interline_equation`)
+- `content` | `img_path`: Text content or image path
+
+#### Sample Data
 
 ```json
 {
@@ -354,29 +388,37 @@ First-level block (if any) -> Second-level block -> Line -> Span
 }
 ```
 
+### Content List (content_list.json)
+
+**File naming format**: `{original_filename}_content_list.json`
+
+#### Functionality
+
+This is a simplified version of `middle.json` that stores all readable content blocks in reading order as a flat structure, removing complex layout information for easier subsequent processing.
+
+#### Content Types
 
-## some_pdf_content_list.json
+| Type | Description |
+|------|-------------|
+| `image` | Image |
+| `table` | Table |
+| `text` | Text/Title |
+| `equation` | Interline formula |
 
-This file is a JSON array where each element is a dict storing all readable content blocks in the document in reading order.  
-`content_list` can be viewed as a simplified version of `middle.json`. The content block types are mostly consistent with those in `middle.json`, but layout information is not included.  
+#### Text Level Identification
 
-The content has the following types:
+Text levels are distinguished through the `text_level` field:
 
-| type     | desc          |
-|:---------|:--------------|
-| image    | Image         |
-| table    | Table         |
-| text     | Text / Title  |
-| equation | Block formula |
+- No `text_level` or `text_level: 0`: Body text
+- `text_level: 1`: Level 1 heading
+- `text_level: 2`: Level 2 heading
+- And so on...
 
-Please note that both `title` and text blocks in `content_list` are uniformly represented using the text type. The `text_level` field is used to distinguish the hierarchy of text blocks:
-- A block without the `text_level` field or with `text_level=0` represents body text.
-- A block with `text_level=1` represents a level-1 heading.
-- A block with `text_level=2` represents a level-2 heading, and so on.
+#### Common Fields
 
-Each content contains the `page_idx` field, indicating the page number (starting from 0) where the content block resides.
+All content blocks include a `page_idx` field indicating the page number (starting from 0).
 
-### example
+#### Sample Data
 
 ```json
 [
@@ -437,4 +479,13 @@ Each content contains the `page_idx` field, indicating the page number (starting
         "page_idx": 5
     }
 ]
-```
+```
+
+## Summary
+
+The above files constitute MinerU's complete output results. Users can choose appropriate files for subsequent processing based on their needs:
+
+- **Model outputs**: Use raw outputs (`model.json`, `model_output.txt`)
+- **Debugging and verification**: Use visualization files (`layout.pdf`, `spans.pdf`)
+- **Content extraction**: Use simplified files (`*.md`, `content_list.json`)
+- **Secondary development**: Use structured files (`middle.json`)
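
The `model_output.txt` format described in this file (pages separated by `----`, blocks made of box, type, and markdown tags, coordinates on a 1000×1000 page) can be read with a short parser. The following is a sketch under stated assumptions, not MinerU's own code: it assumes the three tags of a block appear in the documented order with at most whitespace between them, and that `----` never occurs inside block content. `VlmBlock`, `parse_model_output`, and `scale_bbox` are hypothetical names.

```python
import re
from dataclasses import dataclass

# Assumed block layout: box tag, then type tag, then markdown tag.
BLOCK_RE = re.compile(
    r"<\|box_start\|>\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*<\|box_end\|>\s*"
    r"<\|ref_start\|>(.*?)<\|ref_end\|>\s*"
    r"<\|md_start\|>(.*?)<\|md_end\|>",
    re.DOTALL,
)
CONTD_TAG = "<|txt_contd|>"


@dataclass
class VlmBlock:
    page_idx: int                     # 0-based position of the page in the file
    bbox: tuple[int, int, int, int]   # x0, y0, x1, y1 on the 1000x1000 page
    type: str                         # e.g. "text", "title", "table"
    content: str                      # markdown (or otsl for tables), tag stripped
    continues: bool                   # True if the block ended with <|txt_contd|>


def parse_model_output(raw: str) -> list[VlmBlock]:
    """Split the file into pages on '----' and extract every tagged block."""
    blocks = []
    for page_idx, page in enumerate(raw.split("----")):
        for match in BLOCK_RE.finditer(page):
            x0, y0, x1, y1 = (int(value) for value in match.groups()[:4])
            content = match.group(6)
            continues = content.rstrip().endswith(CONTD_TAG)
            if continues:
                content = content.rstrip()[: -len(CONTD_TAG)]
            blocks.append(
                VlmBlock(page_idx, (x0, y0, x1, y1), match.group(5).strip(), content, continues)
            )
    return blocks


def scale_bbox(bbox, page_width, page_height):
    """Convert a 1000x1000-normalized box to the page's own units."""
    x0, y0, x1, y1 = bbox
    return (
        x0 * page_width / 1000,
        y0 * page_height / 1000,
        x1 * page_width / 1000,
        y1 * page_height / 1000,
    )
```

Table blocks are returned as raw `otsl` text; converting them to HTML, as the documentation notes, is a separate step that this sketch does not attempt.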

+ 18 - 9
docs/en/usage/advanced_cli_parameters.md

@@ -5,19 +5,21 @@
 ### Memory Optimization Parameters
 > [!TIP]
 > SGLang acceleration mode currently supports running on Turing architecture graphics cards with a minimum of 8GB VRAM, but graphics cards with <24GB VRAM may encounter insufficient memory issues. You can optimize memory usage with the following parameters:
+> 
 > - If you encounter insufficient VRAM when using a single graphics card, you may need to reduce the KV cache size with `--mem-fraction-static 0.5`. If VRAM issues persist, try reducing it further to `0.4` or lower.
 > - If you have two or more graphics cards, you can try using tensor parallelism (TP) mode to simply expand available VRAM: `--tp-size 2`
 
 ### Performance Optimization Parameters
 > [!TIP]
 > If you can already use SGLang normally for accelerated VLM model inference but still want to further improve inference speed, you can try the following parameters:
+> 
 > - If you have multiple graphics cards, you can use SGLang's multi-card parallel mode to increase throughput: `--dp-size 2`
 > - You can also enable `torch.compile` to accelerate inference speed by approximately 15%: `--enable-torch-compile`
 
 ### Parameter Passing Instructions
 > [!TIP]
-> - If you want to learn more about `sglang` parameter usage, please refer to the [SGLang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
 > - All officially supported SGLang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
+> - If you want to learn more about `sglang` parameter usage, please refer to the [SGLang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
 
 ## GPU Device Selection and Configuration
 
@@ -31,22 +33,29 @@
 
 ### Common Device Configuration Examples
 > [!TIP]
-> - Here are some common `CUDA_VISIBLE_DEVICES` setting examples:
+> Here are some common `CUDA_VISIBLE_DEVICES` setting examples:
 >   ```bash
->   CUDA_VISIBLE_DEVICES=1 Only device 1 will be seen
->   CUDA_VISIBLE_DEVICES=0,1 Devices 0 and 1 will be visible
->   CUDA_VISIBLE_DEVICES="0,1" Same as above, quotation marks are optional
->   CUDA_VISIBLE_DEVICES=0,2,3 Devices 0, 2, 3 will be visible; device 1 is masked
->   CUDA_VISIBLE_DEVICES="" No GPU will be visible
+>   CUDA_VISIBLE_DEVICES=1  # Only device 1 will be seen
+>   CUDA_VISIBLE_DEVICES=0,1  # Devices 0 and 1 will be visible
+>   CUDA_VISIBLE_DEVICES="0,1"  # Same as above, quotation marks are optional
+>   CUDA_VISIBLE_DEVICES=0,2,3  # Devices 0, 2, 3 will be visible; device 1 is masked
+>   CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
 >   ```
 
-### Practical Application Scenarios
+## Practical Application Scenarios
 > [!TIP]
 > Here are some possible usage scenarios:
-> - If you have multiple graphics cards and need to specify cards 0 and 1, using multi-card parallelism to start 'sglang-server', you can use the following command:
+> 
+> - If you have multiple graphics cards and need to specify cards 0 and 1, using multi-card parallelism to start `sglang-server`, you can use the following command:
 >   ```bash
 >   CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
 >   ```
+> 
+> - If you have multiple graphics cards and need to use cards 0–3, starting `sglang-server` with both data parallelism and tensor parallelism, you can use the following command:
+>   ```bash
+>   CUDA_VISIBLE_DEVICES=0,1,2,3 mineru-sglang-server --port 30000 --dp-size 2 --tp-size 2
+>   ```
+> 
 > - If you have multiple graphics cards and need to start two `fastapi` services on cards 0 and 1, listening on different ports respectively, you can use the following commands:
 >   ```bash
 >   # In terminal 1

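The combined example above exposes four devices and pairs them with `--dp-size 2 --tp-size 2`. A minimal sketch of a pre-flight check follows, assuming (as that example suggests) that the product of the two sizes should equal the number of visible devices. The helper names `count_visible_gpus` and `check_parallel_layout` are hypothetical and are not part of MinerU or SGLang.

```bash
# Hypothetical helper: count the device IDs in a CUDA_VISIBLE_DEVICES-style
# string such as "0,1,2,3". An empty string means no visible GPU.
count_visible_gpus() {
  if [ -z "$1" ]; then
    echo 0
    return
  fi
  old_ifs=$IFS
  IFS=','
  # Split the argument on commas into this function's positional parameters.
  set -- $1
  IFS=$old_ifs
  echo "$#"
}

# Hypothetical check: succeeds only when dp_size * tp_size equals the number
# of visible devices (an assumption drawn from the example above).
check_parallel_layout() {
  visible=$(count_visible_gpus "$1")
  [ "$visible" -eq $(( $2 * $3 )) ]
}

if check_parallel_layout "0,1,2,3" 2 2; then
  echo "layout ok: dp-size 2 x tp-size 2 matches the visible devices"
fi
```

Running the check before launching `mineru-sglang-server` is optional; it only catches a mismatch between the device list and the requested parallel sizes early.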
+ 2 - 0
docs/en/usage/cli_tools.md

@@ -63,6 +63,8 @@ Options:
 ## Environment Variables Description
 
 Some parameters of MinerU command line tools have equivalent environment variable configurations. Generally, environment variable configurations have higher priority than command line parameters and take effect across all command line tools.
+Here are the environment variables and their descriptions:
+
 - `MINERU_DEVICE_MODE`: Used to specify inference device, supports device types like `cpu/cuda/cuda:0/npu/mps`, only effective for `pipeline` backend.
 - `MINERU_VIRTUAL_VRAM_SIZE`: Used to specify maximum GPU VRAM usage per process (GB), only effective for `pipeline` backend.
 - `MINERU_MODEL_SOURCE`: Used to specify model source, supports `huggingface/modelscope/local`, defaults to `huggingface`, can be switched to `modelscope` or local models through environment variables.

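As a rough illustration of the variables listed above, the sketch below exports them in the current shell before a MinerU command is run. The values are illustrative assumptions only (in particular the `8` GB limit) and should be adjusted to your hardware.

```bash
# Illustrative values only; adjust them to your environment.
export MINERU_DEVICE_MODE="cuda:0"       # inference device, pipeline backend only
export MINERU_VIRTUAL_VRAM_SIZE="8"      # max GPU VRAM per process in GB, pipeline backend only
export MINERU_MODEL_SOURCE="modelscope"  # defaults to huggingface when unset

# Commands started from this shell inherit the settings, for example:
#   mineru -p <input_path> -o <output_path>
```

Exporting the variables keeps the configuration in one place for every MinerU command launched from the same shell session.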
+ 27 - 15
docs/en/usage/index.md

@@ -15,14 +15,16 @@ MinerU has built-in command line tools that allow users to quickly use MinerU fo
 # Default parsing using pipeline backend
 mineru -p <input_path> -o <output_path>
 ```
-- `<input_path>`: Local PDF/image file or directory
-- `<output_path>`: Output directory
+> [!TIP]
+>- `<input_path>`: Local PDF/image file or directory
+>- `<output_path>`: Output directory
+>
+> For more information about output files, please refer to [Output File Documentation](../output_files.md).
 
 > [!NOTE]
-> The command line tool will automatically attempt cuda/mps acceleration on Linux and macOS systems. Windows users who need cuda acceleration should visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to select the appropriate command for their cuda version to install acceleration-enabled `torch` and `torchvision`.
+> The command line tool will automatically attempt cuda/mps acceleration on Linux and macOS systems. 
+> Windows users who need cuda acceleration should visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to select the appropriate command for their cuda version to install acceleration-enabled `torch` and `torchvision`.
 
-> [!TIP]
-> For more information about output files, please refer to [Output File Documentation](./output_file.md).
 
 ```bash
 # Or specify vlm backend for parsing
@@ -42,7 +44,8 @@ If you need to adjust parsing options through custom parameters, you can also ch
   ```bash
   mineru-api --host 127.0.0.1 --port 8000
   ```
-  Access http://127.0.0.1:8000/docs in your browser to view the API documentation.
+  >[!TIP]
+  >Access `http://127.0.0.1:8000/docs` in your browser to view the API documentation.
 - Start Gradio WebUI visual frontend:
   ```bash
   # Using pipeline/vlm-transformers/vlm-sglang-client backends
@@ -50,23 +53,32 @@ If you need to adjust parsing options through custom parameters, you can also ch
   # Or using vlm-sglang-engine/pipeline backends (requires sglang environment)
   mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
   ```
-  Access http://127.0.0.1:7860 in your browser to use Gradio WebUI or access http://127.0.0.1:7860/?view=api to use the Gradio API.
+  >[!TIP]
+  >
+  >- Access `http://127.0.0.1:7860` in your browser to use the Gradio WebUI.
+  >- Access `http://127.0.0.1:7860/?view=api` to use the Gradio API.
 - Using `sglang-client/server` method:
   ```bash
   # Start sglang server (requires sglang environment)
   mineru-sglang-server --port 30000
-  # In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
-  mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
   ``` 
+  >[!TIP]
+  >In another terminal, connect to the sglang server through the sglang client (this only requires CPU and network access; no sglang environment is needed):
+  > ```bash
+  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
+  > ```
+
 > [!TIP]
 > All officially supported sglang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`.
 > We have compiled some commonly used parameters and usage methods for `sglang`, which can be found in the documentation [Advanced Command Line Parameters](./advanced_cli_parameters.md).
 
 ## Extending MinerU Functionality with Configuration Files
 
-- MinerU is now ready to use out of the box, but also supports extending functionality through configuration files. You can create a `mineru.json` file in your user directory to add custom configurations.
-- The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`, or you can create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.
-- Here are some available configuration options:
-  - `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to `$` symbol, can be modified to other symbols or strings as needed.
-  - `llm-aided-config`: Used to configure parameters for LLM-assisted title hierarchy, compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
-  - `models-dir`: Used to specify local model storage directory, please specify model directories for `pipeline` and `vlm` backends separately. After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
+MinerU is now ready to use out of the box, but also supports extending functionality through configuration files. You can create a `mineru.json` file in your user directory to add custom configurations.  
+The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`, or you can create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.  
+Here are some available configuration options:  
+
+- `latex-delimiter-config`: Configures the LaTeX formula delimiters. Defaults to the `$` symbol and can be changed to other symbols or strings as needed.
+- `llm-aided-config`: Configures parameters for LLM-assisted title hierarchy. It is compatible with all LLM models supporting the `openai protocol` and defaults to Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. To enable this feature, configure your own API key and set `enable` to `true`.
+- `models-dir`: Specifies the local model storage directory. Specify the model directories for the `pipeline` and `vlm` backends separately. After specifying the directories, you can use local models by setting the environment variable with `export MINERU_MODEL_SOURCE=local`.
+

+ 1 - 0
docs/en/usage/model_source.md

@@ -38,6 +38,7 @@ mineru-models-download
 ```
 >[!TIP]
 >- After download completion, the model path will be output in the current terminal window and automatically written to `mineru.json` in the user directory.
+>- You can also create `mineru.json` by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.
 >- After downloading models locally, you can freely move the model folder to other locations while updating the model path in `mineru.json`.
 >- If you deploy the model folder to another server, please ensure you move the `mineru.json` file to the user directory of the new device and configure the model path correctly.
 >- If you need to update model files, you can run the `mineru-models-download` command again. Model updates do not support custom paths currently - if you haven't moved the local model folder, model files will be incrementally updated; if you have moved the model folder, model files will be re-downloaded to the default location and `mineru.json` will be updated.

+ 21 - 227
docs/images/logo.svg

@@ -1,228 +1,22 @@
-<?xml version="1.0" encoding="UTF-8" standalone="no"?>
-<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
-<svg version="1.1" id="Layer_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px" width="516px" height="516px" viewBox="0 0 516 516" enable-background="new 0 0 516 516" xml:space="preserve">  <image id="image0" width="516" height="516" x="0" y="0"
-    xlink:href="
-(base64-encoded PNG image data omitted)" />
+<svg width="24" height="24" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg">
+<path d="M19.7238 3.86898C19.7238 4.57597 19.1502 5.1491 18.4427 5.1491C17.7352 5.1491 17.1616 4.57597 17.1616 3.86898C17.1616 3.16199 17.7352 2.58887 18.4427 2.58887C19.1502 2.58887 19.7238 3.16199 19.7238 3.86898Z" fill="url(#paint0_linear_8609_1645)"/>
+<path d="M19.7238 3.86898C19.7238 4.57597 19.1502 5.1491 18.4427 5.1491C17.7352 5.1491 17.1616 4.57597 17.1616 3.86898C17.1616 3.16199 17.7352 2.58887 18.4427 2.58887C19.1502 2.58887 19.7238 3.16199 19.7238 3.86898Z" fill="#010101"/>
+<path d="M15.3681 5.1491C15.3681 5.85609 14.7945 6.42921 14.087 6.42921C13.3794 6.42921 12.8059 5.85609 12.8059 5.1491C12.8059 4.44211 13.3794 3.86898 14.087 3.86898C14.7945 3.86898 15.3681 4.44211 15.3681 5.1491Z" fill="url(#paint1_linear_8609_1645)"/>
+<path d="M15.3681 5.1491C15.3681 5.85609 14.7945 6.42921 14.087 6.42921C13.3794 6.42921 12.8059 5.85609 12.8059 5.1491C12.8059 4.44211 13.3794 3.86898 14.087 3.86898C14.7945 3.86898 15.3681 4.44211 15.3681 5.1491Z" fill="#010101"/>
+<path fill-rule="evenodd" clip-rule="evenodd" d="M8.05175 11.2368C8.05175 13.4605 9.14375 15.4293 10.8211 16.6371C11.8241 15.7389 12.4551 14.4345 12.4551 12.9828V9.39673C12.4551 8.85661 12.8197 8.38448 13.3426 8.24757L19.8924 6.53265C20.6459 6.33534 21.3826 6.90341 21.3826 7.6818L21.3826 12.0452C21.3826 17.2179 17.1861 21.4111 12.0095 21.4111L11.9942 21.4111C6.81758 21.4111 2.62109 17.2179 2.62109 12.0452V9.03388C2.62109 8.49175 2.9884 8.01839 3.51385 7.88336L6.56677 7.09882C7.31904 6.9055 8.05175 7.47318 8.05175 8.24934V11.2368ZM3.9798 12.0452C3.9798 13.8476 4.57565 15.5108 5.58124 16.849C6.04996 17.4728 6.7655 17.8884 7.54573 17.8884V17.8884C8.28848 17.8884 8.9927 17.7236 9.62376 17.4286C7.83439 15.9596 6.69304 13.7314 6.69304 11.2368V8.46821L3.9798 9.16546V12.0452Z" fill="url(#paint2_linear_8609_1645)"/>
+<path fill-rule="evenodd" clip-rule="evenodd" d="M8.05175 11.2368C8.05175 13.4605 9.14375 15.4293 10.8211 16.6371C11.8241 15.7389 12.4551 14.4345 12.4551 12.9828V9.39673C12.4551 8.85661 12.8197 8.38448 13.3426 8.24757L19.8924 6.53265C20.6459 6.33534 21.3826 6.90341 21.3826 7.6818L21.3826 12.0452C21.3826 17.2179 17.1861 21.4111 12.0095 21.4111L11.9942 21.4111C6.81758 21.4111 2.62109 17.2179 2.62109 12.0452V9.03388C2.62109 8.49175 2.9884 8.01839 3.51385 7.88336L6.56677 7.09882C7.31904 6.9055 8.05175 7.47318 8.05175 8.24934V11.2368ZM3.9798 12.0452C3.9798 13.8476 4.57565 15.5108 5.58124 16.849C6.04996 17.4728 6.7655 17.8884 7.54573 17.8884V17.8884C8.28848 17.8884 8.9927 17.7236 9.62376 17.4286C7.83439 15.9596 6.69304 13.7314 6.69304 11.2368V8.46821L3.9798 9.16546V12.0452Z" fill="#010101"/>
+<defs>
+<linearGradient id="paint0_linear_8609_1645" x1="14.3898" y1="8.36821" x2="13.1876" y2="19.4461" gradientUnits="userSpaceOnUse">
+<stop stop-color="white"/>
+<stop offset="1" stop-color="#2E2E2E"/>
+</linearGradient>
+<linearGradient id="paint1_linear_8609_1645" x1="14.3898" y1="8.36821" x2="13.1876" y2="19.4461" gradientUnits="userSpaceOnUse">
+<stop stop-color="white"/>
+<stop offset="1" stop-color="#2E2E2E"/>
+</linearGradient>
+<linearGradient id="paint2_linear_8609_1645" x1="14.3898" y1="8.36821" x2="13.1876" y2="19.4461" gradientUnits="userSpaceOnUse">
+<stop stop-color="white"/>
+<stop offset="1" stop-color="#2E2E2E"/>
+</linearGradient>
+</defs>
 </svg>

+ 0 - 41
docs/zh/FAQ/index.md

@@ -1,41 +0,0 @@
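-# Frequently Asked Questions
-
-If your question is not listed, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to communicate with the AI assistant, which can solve most common problems.
-
-If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
-
-### 1. Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2
-
-The `libgl` library is missing in Ubuntu 22.04 on WSL2. You can install it with the following command to resolve the issue:
-
-```bash
-sudo apt-get install libgl1-mesa-glx
-```
-
-Reference: https://github.com/opendatalab/MinerU/issues/388
-
-
-### 2. Error when installing MinerU on CentOS 7 or Ubuntu 18: `ERROR: Failed building wheel for simsimd`
-
-The new version of albumentations (1.4.21) introduces a dependency on simsimd. Because the pre-built simsimd package for Linux requires glibc 2.28 or later, installation fails on some Linux distributions released before 2019. You can install MinerU with the following commands instead:
-```
-conda create -n mineru python=3.11 -y
-conda activate mineru
-pip install -U "mineru[pipeline_old_linux]"
-```
-
-Reference: https://github.com/opendatalab/MinerU/issues/1004
-
-### 3. Missing text information in parsing results when installing and using on Linux systems.
-
-MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, due to missing CJK fonts, some text may be lost while rendering PDFs to images.
-To solve this problem, you can install the noto font packages with the following commands, which work on Ubuntu/Debian systems:
-```bash
-sudo apt update
-sudo apt install fonts-noto-core
-sudo apt install fonts-noto-cjk
-fc-cache -fv
-```
-You can also build the image directly with our [Docker deployment](../quick_start/docker_deployment.md) method, which includes the above font packages by default.
-
-Reference: https://github.com/opendatalab/MinerU/issues/2915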

+ 1 - 0
docs/zh/demo/index.md

@@ -0,0 +1 @@
+<iframe src="https://opendatalab-mineru.ms.show" frameborder="0" width="850" height="850"></iframe>

+ 41 - 0
docs/zh/faq/index.md

@@ -0,0 +1,41 @@
+# 常见问题解答
+
+如果未能列出您的问题,您也可以使用[DeepWiki](https://deepwiki.com/opendatalab/MinerU)与AI助手交流,这可以解决大部分常见问题。
+
+如果您仍然无法解决问题,您可通过[Discord](https://discord.gg/Tdedn9GTXq)或[WeChat](http://mineru.space/s/V85Yl)加入社区,与其他用户和开发者交流。
+
+??? question "在WSL2的Ubuntu22.04中遇到报错`ImportError: libGL.so.1: cannot open shared object file: No such file or directory`"
+
+    WSL2的Ubuntu22.04中缺少`libgl`库,可通过以下命令安装`libgl`库解决:
+    
+    ```bash
+    sudo apt-get install libgl1-mesa-glx
+    ```
+    
+    参考:[#388](https://github.com/opendatalab/MinerU/issues/388)
+
+
+??? question "在 CentOS 7 或 Ubuntu 18 系统安装MinerU时报错`ERROR: Failed building wheel for simsimd`"
+
+    新版本albumentations(1.4.21)引入了依赖simsimd,由于simsimd在linux的预编译包要求glibc的版本大于等于2.28,导致部分2019年之前发布的Linux发行版无法正常安装,可通过如下命令安装:
+    ```
+    conda create -n mineru python=3.11 -y
+    conda activate mineru
+    pip install -U "mineru[pipeline_old_linux]"
+    ```
+    
+    参考:[#1004](https://github.com/opendatalab/MinerU/issues/1004)
+
+??? question "在 Linux 系统安装并使用时,解析结果缺失部份文字信息。"
+
+    MinerU在>=2.0的版本中使用`pypdfium2`代替`pymupdf`作为PDF页面的渲染引擎,以解决AGPLv3的许可证问题,在某些Linux发行版,由于缺少CJK字体,可能会在将PDF渲染成图片的过程中丢失部份文字。
+    为了解决这个问题,您可以通过以下命令安装noto字体包,这在Ubuntu/debian系统中有效:
+    ```bash
+    sudo apt update
+    sudo apt install fonts-noto-core
+    sudo apt install fonts-noto-cjk
+    fc-cache -fv
+    ```
+    也可以直接使用我们的[Docker部署](../quick_start/docker_deployment.md)方式构建镜像,镜像中默认包含以上字体包。
+    
+    参考:[#2915](https://github.com/opendatalab/MinerU/issues/2915)

File diff suppressed because it is too large
+ 1 - 1
docs/zh/index.md


+ 17 - 5
docs/zh/quick_start/docker_deployment.md

@@ -37,33 +37,45 @@ docker run --gpus all \
 ```
 
 执行该命令后,您将进入到Docker容器的交互式终端,并映射了一些端口用于可能会使用的服务,您可以直接在容器内运行MinerU相关命令来使用MinerU的功能。
-您也可以直接通过替换`/bin/bash`为服务启动命令来启动MinerU服务,详细说明请参考[MinerU使用文档](../usage/index_back.md)。
+您也可以直接通过替换`/bin/bash`为服务启动命令来启动MinerU服务,详细说明请参考[MinerU使用文档](../usage/index.md)。
 
 
 ## 通过 Docker Compose 直接启动服务
 
-我们提供了`compose.yml`文件,您可以通过它来快速启动MinerU服务。
+我们提供了[compose.yml](https://github.com/opendatalab/MinerU/blob/master/docker/compose.yaml)文件,您可以通过它来快速启动MinerU服务。
 
 ```bash
 # 下载 compose.yaml 文件
 wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
 ```
+>[!NOTE]
+>  
+>- `compose.yaml`文件中包含了MinerU的多个服务配置,您可以根据需要选择启动特定的服务。
+>- 不同的服务可能会有额外的参数配置,您可以在`compose.yaml`文件中查看并编辑。
+>- 由于`sglang`推理加速框架预分配显存的特性,您可能无法在同一台机器上同时运行多个`sglang`服务,因此请确保在启动`vlm-sglang-server`服务或使用`vlm-sglang-engine`后端时,其他可能使用显存的服务已停止。
 
 - 启动`sglang-server`服务,并通过`vlm-sglang-client`后端连接`sglang-server`:
   ```bash
   docker compose -f compose.yaml --profile mineru-sglang-server up -d
-  # 在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
-  mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<server_ip>:30000
   ```
+  >[!TIP]
+  >在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
+  > ```bash
+  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<server_ip>:30000
+  > ```
+
 - 启动 API 服务:
   ```bash
   docker compose -f compose.yaml --profile mineru-api up -d
   ```
   >[!TIP]
   >在浏览器中访问 `http://<server_ip>:8000/docs` 查看API文档。
+
 - 启动 Gradio WebUI 服务:
   ```bash
   docker compose -f compose.yaml --profile mineru-gradio up -d
   ```
   >[!TIP]
-  >在浏览器中访问 `http://<server_ip>:7860` 使用 Gradio WebUI 或访问 `http://<server_ip>:7860/?view=api` 使用 Gradio API。
+  > 
+  >- 在浏览器中访问 `http://<server_ip>:7860` 使用 Gradio WebUI。
+  >- 访问 `http://<server_ip>:7860/?view=api` 使用 Gradio API。

File diff suppressed because it is too large
+ 0 - 1
docs/zh/quick_start/index.md


File diff suppressed because it is too large
+ 109 - 15
docs/zh/reference/output_files.md


+ 13 - 8
docs/zh/usage/advanced_cli_parameters.md

@@ -18,8 +18,8 @@
 
 ### 参数传递说明
 > [!TIP]
-> - 如果您想了解更多有关`sglang`的参数使用方法,请参考 [sglang官方文档](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
 > - 所有sglang官方支持的参数都可用通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`
+> - 如果您想了解更多有关`sglang`的参数使用方法,请参考 [sglang官方文档](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
 
 ## GPU 设备选择与配置
 
@@ -35,23 +35,28 @@
 > [!TIP]
 > 以下是一些常见的 `CUDA_VISIBLE_DEVICES` 设置示例:
 >   ```bash
->   CUDA_VISIBLE_DEVICES=1 Only device 1 will be seen
->   CUDA_VISIBLE_DEVICES=0,1 Devices 0 and 1 will be visible
->   CUDA_VISIBLE_DEVICES="0,1" Same as above, quotation marks are optional
->   CUDA_VISIBLE_DEVICES=0,2,3 Devices 0, 2, 3 will be visible; device 1 is masked
->   CUDA_VISIBLE_DEVICES="" No GPU will be visible
+>   CUDA_VISIBLE_DEVICES=1  # Only device 1 will be seen
+>   CUDA_VISIBLE_DEVICES=0,1  # Devices 0 and 1 will be visible
+>   CUDA_VISIBLE_DEVICES="0,1"  # Same as above, quotation marks are optional
+>   CUDA_VISIBLE_DEVICES=0,2,3  # Devices 0, 2, 3 will be visible; device 1 is masked
+>   CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
 >   ```
 
-### 实际应用场景
+## 实际应用场景
 
 > [!TIP]
 > 以下是一些可能的使用场景:
 > 
-> - 如果您有多张显卡,需要指定卡0和卡1,并使用多卡并行来启动'sglang-server',可以使用以下命令: 
+> - 如果您有多张显卡,需要指定卡0和卡1,并使用多卡并行来启动`sglang-server`,可以使用以下命令: 
 >   ```bash
 >   CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
 >   ```
 >   
+> - 如果您有多张显卡,需要指定卡0-3,并使用多卡数据并行和张量并行来启动`sglang-server`,可以使用以下命令: 
+>   ```bash
+>   CUDA_VISIBLE_DEVICES=0,1,2,3 mineru-sglang-server --port 30000 --dp-size 2 --tp-size 2
+>   ```
+>   
 > - 如果您有多张显卡,需要在卡0和卡1上启动两个`fastapi`服务,并分别监听不同的端口,可以使用以下命令: 
 >   ```bash
 >   # 在终端1中

+ 20 - 11
docs/zh/usage/index.md

@@ -19,7 +19,7 @@ mineru -p <input_path> -o <output_path>
 > - `<input_path>`:本地 PDF/图片 文件或目录
 > - `<output_path>`:输出目录
 > 
-> 更多关于输出文件的信息,请参考[输出文件说明](./output_file.md)。
+> 更多关于输出文件的信息,请参考[输出文件说明](../output_files.md)。
 
 > [!NOTE]
 > 命令行工具会在Linux和macOS系统自动尝试cuda/mps加速。Windows用户如需使用cuda加速,
@@ -44,7 +44,8 @@ mineru -p <input_path> -o <output_path> -b vlm-transformers
   ```bash
   mineru-api --host 127.0.0.1 --port 8000
   ```
-  在浏览器中访问 http://127.0.0.1:8000/docs 查看API文档。
+  >[!TIP]
+  >在浏览器中访问 `http://127.0.0.1:8000/docs` 查看API文档。
 - 启动gradio webui 可视化前端:
   ```bash
   # 使用 pipeline/vlm-transformers/vlm-sglang-client 后端
@@ -52,14 +53,21 @@ mineru -p <input_path> -o <output_path> -b vlm-transformers
   # 或使用 vlm-sglang-engine/pipeline 后端(需安装sglang环境)
   mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
   ```
-  在浏览器中访问 http://127.0.0.1:7860 使用 Gradio WebUI 或访问 http://127.0.0.1:7860/?view=api 使用 Gradio API。
+  >[!TIP]
+  > 
+  >- 在浏览器中访问 `http://127.0.0.1:7860` 使用 Gradio WebUI。
+  >- 访问 `http://127.0.0.1:7860/?view=api` 使用 Gradio API。
 - 使用`sglang-client/server`方式调用:
   ```bash
   # 启动sglang server(需要安装sglang环境)
   mineru-sglang-server --port 30000
-  # 在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
-  mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
   ``` 
+  >[!TIP]
+  >在另一个终端中通过sglang client连接sglang server(只需cpu与网络,不需要sglang环境)
+  > ```bash
+  > mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
+  > ```
+
 > [!TIP]
 > 所有sglang官方支持的参数都可用通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`,
 > 我们整理了一些`sglang`使用中的常用参数和使用方法,可以在文档[命令行参数进阶技巧](./advanced_cli_parameters.md)中获取。
@@ -67,9 +75,10 @@ mineru -p <input_path> -o <output_path> -b vlm-transformers
 
 ## 基于配置文件扩展 MinerU 功能
 
-- MinerU 现已实现开箱即用,但也支持通过配置文件扩展功能。您可以在用户目录下创建 `mineru.json` 文件,添加自定义配置。
-- `mineru.json` 文件会在您使用内置模型下载命令 `mineru-models-download` 时自动生成,也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建。
-- 以下是一些可用的配置选项:
-  - `latex-delimiter-config`:用于配置 LaTeX 公式的分隔符,默认为`$`符号,可根据需要修改为其他符号或字符串。
-  - `llm-aided-config`:用于配置 LLM 辅助标题分级的相关参数,兼容所有支持`openai协议`的 LLM 模型,默认使用`阿里云百炼`的`qwen2.5-32b-instruct`模型,您需要自行配置 API 密钥并将`enable`设置为`true`来启用此功能。
-  - `models-dir`:用于指定本地模型存储目录,请为`pipeline`和`vlm`后端分别指定模型目录,指定目录后您可通过配置环境变量`export MINERU_MODEL_SOURCE=local`来使用本地模型。
+MinerU 现已实现开箱即用,但也支持通过配置文件扩展功能。您可以在用户目录下创建 `mineru.json` 文件,添加自定义配置。  
+`mineru.json` 文件会在您使用内置模型下载命令 `mineru-models-download` 时自动生成,也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建。  
+以下是一些可用的配置选项: 
+
+- `latex-delimiter-config`:用于配置 LaTeX 公式的分隔符,默认为`$`符号,可根据需要修改为其他符号或字符串。
+- `llm-aided-config`:用于配置 LLM 辅助标题分级的相关参数,兼容所有支持`openai协议`的 LLM 模型,默认使用`阿里云百炼`的`qwen2.5-32b-instruct`模型,您需要自行配置 API 密钥并将`enable`设置为`true`来启用此功能。
+- `models-dir`:用于指定本地模型存储目录,请为`pipeline`和`vlm`后端分别指定模型目录,指定目录后您可通过配置环境变量`export MINERU_MODEL_SOURCE=local`来使用本地模型。
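
Note on the hunk above: it names three top-level options in `mineru.json` (`latex-delimiter-config`, `llm-aided-config`, `models-dir`) and says the file lives in the user's home directory. The sketch below only reports which of those options are present in a given config. It deliberately does not assume anything about the structure inside each option, since the diff does not show it. The function names are hypothetical and are not part of MinerU.

```python
import json
from pathlib import Path

# Top-level option names taken from the documentation text in the hunk above.
DOCUMENTED_OPTIONS = ("latex-delimiter-config", "llm-aided-config", "models-dir")


def load_user_config(path=None):
    """Load mineru.json (default: the user's home directory); {} if it is absent."""
    path = Path(path) if path is not None else Path.home() / "mineru.json"
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def documented_options_present(config):
    """Map each documented option name to whether it appears in the config."""
    return {name: name in config for name in DOCUMENTED_OPTIONS}
```

Calling `documented_options_present(load_user_config())` gives a quick overview of which options a local `mineru.json` already defines. For the actual field layout, see the `mineru.template.json` file linked in the hunk.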

+ 1 - 0
docs/zh/usage/model_source.md

@@ -39,6 +39,7 @@ mineru-models-download
 ```
 >[!TIP]
 >- 下载完成后,模型路径会在当前终端窗口输出,并自动写入用户目录下的 `mineru.json`。
+>- 您也可以通过将[配置模板文件](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建配置文件。
 >- 模型下载到本地后,您可以自由移动模型文件夹到其他位置,同时需要在 `mineru.json` 中更新模型路径。
 >- 如您将模型文件夹部署到其他服务器上,请确保将 `mineru.json`文件一同移动到新设备的用户目录中并正确配置模型路径。
 >- 如您需要更新模型文件,可以再次运行 `mineru-models-download` 命令,模型更新暂不支持自定义路径,如您没有移动本地模型文件夹,模型文件会增量更新;如您移动了模型文件夹,模型文件会重新下载到默认位置并更新`mineru.json`。

+ 28 - 10
mkdocs.yml

@@ -30,35 +30,49 @@ theme:
         icon: material/brightness-4
         name: Switch to system preference
   logo: images/logo.png
-  favicon: images/logo.svg
+  favicon: images/logo.png
   features:
     - content.tabs.link
     - content.code.annotate
     - content.code.copy
+    - navigation.footer
+    - navigation.tabs
     - navigation.instant
+    - navigation.instant.prefetch
     - navigation.instant.progress
-    - navigation.tabs
-    - navigation.tabs.sticky
     - navigation.sections
     - navigation.path
     - navigation.indexes
+    - navigation.top
+    - navigation.tracking
     - search.suggest
+    - toc.follow
 
 nav:
   - Home:
     - "MinerU": index.md
     - Quick Start:
-      - quick_start/index.md
+      - Quick Start: quick_start/index.md
       - Extension Modules: quick_start/extension_modules.md
       - Docker Deployment: quick_start/docker_deployment.md
     - Usage:
-      - usage/index.md
+      - Usage: usage/index.md
       - CLI Tools: usage/cli_tools.md
       - Model Source: usage/model_source.md
       - Advanced CLI Parameters: usage/advanced_cli_parameters.md
-      - Output File Format: usage/output_files.md
+    - Reference:
+      - Output File Format: reference/output_files.md
+    - FAQ:
+      - FAQ: faq/index.md
+    - Demo:
+      - Demo: demo/index.md
+  - Reference:
+    - Output File Format: reference/output_files.md
   - FAQ:
-      - FAQ: FAQ/index.md
+    - FAQ: faq/index.md
+  - Demo:
+    - Demo: demo/index.md
+
 
 plugins:
   - search
@@ -75,17 +89,21 @@ plugins:
           nav_translations:
             Home: 主页
             Quick Start: 快速开始
-            Extension Modules: 扩展模块
+            Extension Modules: 扩展模块安装
             Docker Deployment: Docker部署
             Usage: 使用方法
             CLI Tools: 命令行工具
             Model Source: 模型源
-            Advanced CLI Parameters: 命令行参数进阶技巧
-            FAQ: FAQ
+            Advanced CLI Parameters: 命令行进阶参数
+            FAQ: 常见问题解答
+            Reference: 参考资料
             Output File Format: 输出文件格式
   - mkdocs-video
 
 markdown_extensions:
+  - admonition
+  - pymdownx.details
+  - attr_list
   - gfm_admonition
   - pymdownx.highlight:
       use_pygments: true

Some files were not shown because too many files changed in this diff