Browse Source

Merge branch 'dev'

Sidney233 4 months ago
parent
commit
a9c139fc78

+ 9 - 0
.github/ISSUE_TEMPLATE/bug_report.yml

@@ -24,6 +24,15 @@ body:
         - label: I have searched the MinerU [Discussions](https://github.com/opendatalab/MinerU/discussions) and found no similar bug report.
           required: true
 
+  - type: checkboxes
+    attributes:
+      label: 🤖 Consult the online AI assistant for assistance | 在线 AI 助手咨询
+      description: >
+        This [online AI assistant](https://deepwiki.com/opendatalab/MinerU) is specifically trained to help with MinerU and related topics! It's available 24/7 and ready to provide insights.
+      options:
+        - label: I have consulted the [online AI assistant](https://deepwiki.com/opendatalab/MinerU) but was unable to obtain a solution to the issue.
+          required: true
+
   - type: textarea
     id: description
     attributes:

+ 2 - 2
docs/FAQ_en_us.md → docs/en/FAQ/index.md

@@ -1,6 +1,6 @@
 # Frequently Asked Questions
 
-### 1. Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2
+## 1. Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2
 
 The `libgl` library is missing in Ubuntu 22.04 on WSL2. You can install the `libgl` library with the following command to resolve the issue:
 
@@ -11,7 +11,7 @@ sudo apt-get install libgl1-mesa-glx
 Reference: https://github.com/opendatalab/MinerU/issues/388
 
 
-### 2. Error when installing MinerU on CentOS 7 or Ubuntu 18: `ERROR: Failed building wheel for simsimd`
+## 2. Error when installing MinerU on CentOS 7 or Ubuntu 18: `ERROR: Failed building wheel for simsimd`
 
 The new version of albumentations (1.4.21) introduces a dependency on simsimd. Since the pre-built package of simsimd for Linux requires a glibc version greater than or equal to 2.28, this causes installation issues on some Linux distributions released before 2019. You can resolve this issue by using the following command:
 ```

File diff suppressed because it is too large
+ 16 - 0
docs/en/index.md


+ 10 - 0
docs/en/known_issues.md

@@ -0,0 +1,10 @@
+# Known Issues
+
+- Reading order is determined by the model based on the spatial distribution of readable content, and may be out of order in some areas under extremely complex layouts.
+- Limited support for vertical text.
+- Tables of contents and lists are recognized through rules, and some uncommon list formats may not be recognized.
+- Code blocks are not yet supported in the layout model.
+- Comic books, art albums, primary school textbooks, and exercises cannot be parsed well.
+- Table recognition may result in row/column recognition errors in complex tables.
+- OCR recognition may produce inaccurate characters in PDFs of lesser-known languages (e.g., diacritical marks in Latin script, easily confused characters in Arabic script).
+- Some formulas may not render correctly in Markdown.

+ 14 - 14
docs/output_file_en_us.md → docs/en/output_file.md

@@ -1,21 +1,21 @@
-## Overview
+# Overview
 
 After executing the `mineru` command, in addition to outputting files related to markdown, several other files unrelated to markdown will also be generated. These files will be introduced one by one.
 
-### some_pdf_layout.pdf
+## some_pdf_layout.pdf
 
 Each page's layout consists of one or more bounding boxes. The number in the top-right corner of each box indicates the reading order. Additionally, different content blocks are highlighted with distinct background colors within the layout.pdf.
-![layout example](images/layout_example.png)
+![layout example](../images/layout_example.png)
 
-### some_pdf_spans.pdf(Applicable only to the pipeline backend)
+## some_pdf_spans.pdf (Applicable only to the pipeline backend)
 
 All spans on the page are drawn with different colored line frames according to the span type. This file can be used for quality control, allowing for quick identification of issues such as missing text or unrecognized inline formulas.
 
-![spans example](images/spans_example.png)
+![spans example](../images/spans_example.png)
 
-### some_pdf_model.json(Applicable only to the pipeline backend)
+## some_pdf_model.json (Applicable only to the pipeline backend)
 
-#### Structure Definition
+### Structure Definition
 
 ```python
 from pydantic import BaseModel, Field
@@ -61,9 +61,9 @@ inference_result: list[PageInferenceResults] = []
 ```
 
 The format of the poly coordinates is \[x0, y0, x1, y1, x2, y2, x3, y3\], representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively.
-![Poly Coordinate Diagram](images/poly.png)
+![Poly Coordinate Diagram](../images/poly.png)
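
The poly-to-bbox relationship described above can be sketched in a few lines of Python (an illustrative helper, not MinerU's own code):

```python
# Illustrative sketch (not MinerU code): collapse an 8-value poly
# [x0, y0, x1, y1, x2, y2, x3, y3] (top-left, top-right,
# bottom-right, bottom-left) into an axis-aligned [xmin, ymin, xmax, ymax].
def poly_to_bbox(poly):
    xs, ys = poly[0::2], poly[1::2]
    return [min(xs), min(ys), max(xs), max(ys)]

print(poly_to_bbox([10, 20, 110, 20, 110, 80, 10, 80]))  # [10, 20, 110, 80]
```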
 
-#### example
+### Example
 
 ```json
 [
@@ -116,7 +116,7 @@ The format of the poly coordinates is \[x0, y0, x1, y1, x2, y2, x3, y3\], repres
 ]
 ```
 
-### some_pdf_model_output.txt (Applicable only to the VLM backend)
+## some_pdf_model_output.txt (Applicable only to the VLM backend)
 
 This file contains the output of the VLM model, with each page's output separated by `----`.  
 Each page's output consists of text blocks starting with `<|box_start|>` and ending with `<|md_end|>`.  
@@ -142,7 +142,7 @@ The meaning of each field is as follows:
   This field contains the Markdown content of the block. If `type` is `text`, the end of the text may contain the `<|txt_contd|>` tag, indicating that this block can be connected with the following `text` block(s).
   If `type` is `table`, the content is in `otsl` format and needs to be converted into HTML for rendering in Markdown.
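
The `<|txt_contd|>` joining rule above can be sketched as follows; the `(type, markdown)` tuple shape is an assumption for illustration, not MinerU's actual data structure:

```python
# Minimal sketch of the <|txt_contd|> continuation rule: a text block
# ending with the tag is joined with the following text block.
TXT_CONTD = "<|txt_contd|>"

def merge_text_blocks(blocks):
    """blocks: list of (type, markdown) tuples in reading order."""
    merged = []
    for btype, md in blocks:
        if (btype == "text" and merged and merged[-1][0] == "text"
                and merged[-1][1].endswith(TXT_CONTD)):
            merged[-1] = ("text", merged[-1][1][:-len(TXT_CONTD)] + md)
        else:
            merged.append((btype, md))
    return merged

print(merge_text_blocks([("text", "Paragraph split <|txt_contd|>"),
                         ("text", "across pages.")]))
# [('text', 'Paragraph split across pages.')]
```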
 
-### some_pdf_middle.json
+## some_pdf_middle.json
 
 | Field Name     | Description                                                                                                    |
 |:---------------| :------------------------------------------------------------------------------------------------------------- |
@@ -251,7 +251,7 @@ The block structure is as follows:
 
 First-level block (if any) -> Second-level block -> Line -> Span
 
-#### example
+### Example
 
 ```json
 {
@@ -355,7 +355,7 @@ First-level block (if any) -> Second-level block -> Line -> Span
 ```
 
 
-### some_pdf_content_list.json
+## some_pdf_content_list.json
 
 This file is a JSON array where each element is a dict storing all readable content blocks in the document in reading order.  
 `content_list` can be viewed as a simplified version of `middle.json`. The content block types are mostly consistent with those in `middle.json`, but layout information is not included.  
@@ -376,7 +376,7 @@ Please note that both `title` and text blocks in `content_list` are uniformly re
 
 Each content contains the `page_idx` field, indicating the page number (starting from 0) where the content block resides.
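
As a minimal sketch of working with `page_idx`, the entries can be grouped by page like so (only the `page_idx` field is taken from the description above; the sample entries themselves are made up for demonstration):

```python
# Illustrative sketch: group content_list entries by their page_idx field.
from collections import defaultdict

content_list = [
    {"type": "text", "text": "Title", "page_idx": 0},
    {"type": "text", "text": "Body paragraph.", "page_idx": 0},
    {"type": "text", "text": "Second page text.", "page_idx": 1},
]

by_page = defaultdict(list)
for block in content_list:
    by_page[block["page_idx"]].append(block)

print(len(by_page[0]))  # 2
```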
 
-#### example
+### Example
 
 ```json
 [

+ 59 - 0
docs/en/quick_start/index.md

@@ -0,0 +1,59 @@
+# Quick Start
+
+If you encounter any installation issues, please first consult the [FAQ](../FAQ/index.md).
+
+
+If the parsing results are not as expected, refer to the [Known Issues](../known_issues.md).
+
+
+There are two different ways to experience MinerU:
+
+- [Online Demo](online_demo.md)
+- [Local Deployment](local_deployment.md)
+
+
+> [!WARNING]
+> **Pre-installation Notice: Hardware and Software Environment Support**
+>
+> To ensure the stability and reliability of the project, we only optimize and test for specific hardware and software environments during development. This ensures that users deploying and running the project on recommended system configurations will get the best performance with the fewest compatibility issues.
+>
+> By focusing resources on the mainline environment, our team can more efficiently resolve potential bugs and develop new features.
+>
+> In non-mainline environments, due to the diversity of hardware and software configurations, as well as third-party dependency compatibility issues, we cannot guarantee 100% project availability. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first. Most issues already have corresponding solutions in the FAQ. We also encourage community feedback to help us gradually expand support.
+
+<table>
+    <tr>
+        <td>Parsing Backend</td>
+        <td>pipeline</td>
+        <td>vlm-transformers</td>
+        <td>vlm-sglang</td>
+    </tr>
+    <tr>
+        <td>Operating System</td>
+        <td>windows/linux/mac</td>
+        <td>windows/linux</td>
+        <td>windows(wsl2)/linux</td>
+    </tr>
+    <tr>
+        <td>CPU Inference Support</td>
+        <td>✅</td>
+        <td colspan="2">❌</td>
+    </tr>
+    <tr>
+        <td>GPU Requirements</td>
+        <td>Turing architecture or later, 6GB+ VRAM or Apple Silicon</td>
+        <td colspan="2">Ampere architecture or later, 8GB+ VRAM</td>
+    </tr>
+    <tr>
+        <td>Memory Requirements</td>
+        <td colspan="3">Minimum 16GB+, 32GB+ recommended</td>
+    </tr>
+    <tr>
+        <td>Disk Space Requirements</td>
+        <td colspan="3">20GB+, SSD recommended</td>
+    </tr>
+    <tr>
+        <td>Python Version</td>
+        <td colspan="3">3.10-3.13</td>
+    </tr>
+</table>

+ 72 - 0
docs/en/quick_start/local_deployment.md

@@ -0,0 +1,72 @@
+# Local Deployment
+
+## Install MinerU
+
+### Install via pip or uv
+
+```bash
+pip install --upgrade pip
+pip install uv
+uv pip install -U "mineru[core]"
+```
+
+### Install from source
+
+```bash
+git clone https://github.com/opendatalab/MinerU.git
+cd MinerU
+uv pip install -e .[core]
+```
+
+> [!NOTE]  
+> Linux and macOS systems automatically support CUDA/MPS acceleration after installation. For Windows users who want to use CUDA acceleration, 
+> please visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to install PyTorch with the appropriate CUDA version.
+
+### Install Full Version (supports sglang acceleration; requires a Turing or newer GPU with at least 8GB VRAM)
+
+If you need to use **sglang to accelerate VLM model inference**, you can choose any of the following methods to install the full version:
+
+- Install using uv or pip:
+  ```bash
+  uv pip install -U "mineru[all]"
+  ```
+- Install from source:
+  ```bash
+  uv pip install -e .[all]
+  ```
+
+> [!TIP]  
+> If any exceptions occur during the installation of `sglang`, please refer to the [official sglang documentation](https://docs.sglang.ai/start/install.html) for troubleshooting and solutions, or directly use Docker-based installation.
+
+- Build image using Dockerfile:
+  ```bash
+  wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/global/Dockerfile
+  docker build -t mineru-sglang:latest -f Dockerfile .
+  ```
+  Start Docker container:
+  ```bash
+  docker run --gpus all \
+    --shm-size 32g \
+    -p 30000:30000 \
+    --ipc=host \
+    mineru-sglang:latest \
+    mineru-sglang-server --host 0.0.0.0 --port 30000
+  ```
+  Or start using Docker Compose:
+  ```bash
+  wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
+  docker compose -f compose.yaml up -d
+  ```
+  
+> [!TIP]
+> The Dockerfile uses `lmsysorg/sglang:v0.4.8.post1-cu126` as the default base image, which supports the Turing/Ampere/Ada Lovelace/Hopper platforms.  
+> If you are using the newer Blackwell platform, please change the base image to `lmsysorg/sglang:v0.4.8.post1-cu128-b200`.
+
+### Install Client (for connecting to sglang-server on edge devices that require only CPU and network connectivity)
+
+```bash
+uv pip install -U mineru
+mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<host_ip>:<port>
+```
+
+---

File diff suppressed because it is too large
+ 2 - 0
docs/en/quick_start/online_demo.md


+ 9 - 0
docs/en/todo.md

@@ -0,0 +1,9 @@
+# TODO
+
+- [x] Reading order based on the model  
+- [x] Recognition of `index` and `list` in the main text  
+- [x] Table recognition
+- [x] Heading Classification
+- [ ] Code block recognition in the main text
+- [ ] [Chemical formula recognition](../chemical_knowledge_introduction/introduction.pdf)
+- [ ] Geometric shape recognition

+ 58 - 0
docs/en/usage/api.md

@@ -0,0 +1,58 @@
+# API Calls or Visual Invocation
+
+1. Directly invoke using Python API: [Python Invocation Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
+2. Invoke using FastAPI:
+   ```bash
+   mineru-api --host 127.0.0.1 --port 8000
+   ```
+   Visit http://127.0.0.1:8000/docs in your browser to view the API documentation.
+
+3. Use Gradio WebUI or Gradio API:
+   ```bash
+   # Using pipeline/vlm-transformers/vlm-sglang-client backend
+   mineru-gradio --server-name 127.0.0.1 --server-port 7860
+   # Or using vlm-sglang-engine/pipeline backend
+   mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
+   ```
+   Access http://127.0.0.1:7860 in your browser to use the Gradio WebUI, or visit http://127.0.0.1:7860/?view=api to use the Gradio API.
+
+
+> [!TIP]  
+> - Below are some suggestions and notes for using the sglang acceleration mode:  
+> - The sglang acceleration mode currently supports operation on Turing architecture GPUs with a minimum of 8GB VRAM, but you may encounter VRAM shortages on GPUs with less than 24GB VRAM. You can optimize VRAM usage with the following parameters:  
+>   - If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by setting `--mem-fraction-static 0.5`. If VRAM issues persist, try lowering it further to `0.4` or below.  
+>   - If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode: `--tp-size 2`  
+> - If you are already successfully using sglang to accelerate VLM inference but wish to further improve inference speed, consider the following parameters:  
+>   - If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode: `--dp-size 2`  
+>   - You can also enable `torch.compile` to accelerate inference speed by about 15%: `--enable-torch-compile`  
+> - For more information on using sglang parameters, please refer to the [sglang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)  
+> - All sglang-supported parameters can be passed to MinerU via command-line arguments, including those used with the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
+
+> [!TIP]  
+> - In any case, you can specify visible GPU devices at the start of a command line by adding the `CUDA_VISIBLE_DEVICES` environment variable. For example:  
+>   ```bash
+>   CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
+>   ```
+> - This method works for all command-line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.  
+> - Below are some common `CUDA_VISIBLE_DEVICES` settings:  
+>   ```bash
>   CUDA_VISIBLE_DEVICES=1        # only device 1 will be seen
>   CUDA_VISIBLE_DEVICES=0,1      # devices 0 and 1 will be visible
>   CUDA_VISIBLE_DEVICES="0,1"    # same as above, quotation marks are optional
>   CUDA_VISIBLE_DEVICES=0,2,3    # devices 0, 2, 3 will be visible; device 1 is masked
>   CUDA_VISIBLE_DEVICES=""       # no GPU will be visible
+>   ```
+> - Below are some possible use cases:  
>   - If you have multiple GPUs and need to specify GPU 0 and GPU 1 to launch `sglang-server` in multi-GPU mode, you can use the following command:  
+>   ```bash
+>   CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
+>   ```
+>   - If you have multiple GPUs and need to launch two `fastapi` services on GPU 0 and GPU 1 respectively, listening on different ports, you can use the following commands:  
+>   ```bash
+>   # In terminal 1
+>   CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
+>   # In terminal 2
+>   CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
+>   ```
+
+---

+ 10 - 0
docs/en/usage/config.md

@@ -0,0 +1,10 @@
+# Extending MinerU Functionality Through Configuration Files
+
+- MinerU is designed to work out-of-the-box, but also supports extending functionality through configuration files. You can create a `mineru.json` file in your home directory and add custom configurations.
+- The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`. Alternatively, you can create it by copying the [configuration template file](../../mineru.template.json) to your home directory and renaming it to `mineru.json`.
+- Below are some available configuration options:
+  - `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to the `$` symbol, and can be modified to other symbols or strings as needed.
+  - `llm-aided-config`: Used to configure related parameters for LLM-assisted heading level detection, compatible with all LLM models supporting the `OpenAI protocol`. It defaults to Alibaba Cloud Qwen's `qwen2.5-32b-instruct` model. You need to configure an API key yourself and set `enable` to `true` to activate this feature.
+  - `models-dir`: Used to specify local model storage directories. Please specify separate model directories for the `pipeline` and `vlm` backends. After specifying these directories, you can use local models by setting the environment variable `export MINERU_MODEL_SOURCE=local`.
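+
+As a rough illustration of how these options might sit together in `mineru.json` (the exact schema is defined by the template file mentioned above; every value shape below is an assumption, not the authoritative format):
+
+  ```json
+  {
+    "latex-delimiter-config": "$",
+    "llm-aided-config": {
+      "enable": true,
+      "model": "qwen2.5-32b-instruct",
+      "api_key": "your-api-key"
+    },
+    "models-dir": {
+      "pipeline": "/path/to/pipeline-models",
+      "vlm": "/path/to/vlm-models"
+    }
+  }
+  ```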
+
+---

+ 125 - 0
docs/en/usage/index.md

@@ -0,0 +1,125 @@
+# Using MinerU
+
+## Command Line Usage
+
+### Basic Usage
+
+The simplest command line invocation is:
+
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
+- `<input_path>`: Local PDF/Image file or directory (supports pdf/png/jpg/jpeg/webp/gif)
+- `<output_path>`: Output directory
+
+### View Help Information
+
+Get all available parameter descriptions:
+
+```bash
+mineru --help
+```
+
+### Parameter Details
+
+```text
+Usage: mineru [OPTIONS]
+
+Options:
+  -v, --version                   Show version and exit
+  -p, --path PATH                 Input file path or directory (required)
+  -o, --output PATH               Output directory (required)
+  -m, --method [auto|txt|ocr]     Parsing method: auto (default), txt, ocr (pipeline backend only)
+  -b, --backend [pipeline|vlm-transformers|vlm-sglang-engine|vlm-sglang-client]
+                                  Parsing backend (default: pipeline)
+  -l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|latin|arabic|east_slavic|cyrillic|devanagari]
+                                  Specify document language (improves OCR accuracy, pipeline backend only)
+  -u, --url TEXT                  Service address when using sglang-client
+  -s, --start INTEGER             Starting page number (0-based)
+  -e, --end INTEGER               Ending page number (0-based)
+  -f, --formula BOOLEAN           Enable formula parsing (default: on)
+  -t, --table BOOLEAN             Enable table parsing (default: on)
+  -d, --device TEXT               Inference device (e.g., cpu/cuda/cuda:0/npu/mps, pipeline backend only)
+  --vram INTEGER                  Maximum GPU VRAM usage per process (GB) (pipeline backend only)
+  --source [huggingface|modelscope|local]
+                                  Model source, default: huggingface
+  --help                          Show help information
+```
+
+---
+
+## Model Source Configuration
+
+MinerU automatically downloads required models from HuggingFace on first run. If HuggingFace is inaccessible, you can switch model sources:
+
+### Switch to ModelScope Source
+
+```bash
+mineru -p <input_path> -o <output_path> --source modelscope
+```
+
+Or set environment variable:
+
+```bash
+export MINERU_MODEL_SOURCE=modelscope
+mineru -p <input_path> -o <output_path>
+```
+
+### Using Local Models
+
+#### 1. Download Models Locally
+
+```bash
+mineru-models-download --help
+```
+
+Or use interactive command-line tool to select models:
+
+```bash
+mineru-models-download
+```
+
+After download, model paths will be displayed in the current terminal and automatically written to `mineru.json` in the user directory.
+
+#### 2. Parse Using Local Models
+
+```bash
+mineru -p <input_path> -o <output_path> --source local
+```
+
+Or enable via environment variable:
+
+```bash
+export MINERU_MODEL_SOURCE=local
+mineru -p <input_path> -o <output_path>
+```
+
+---
+
+## Using sglang to Accelerate VLM Model Inference
+
+### Through the sglang-engine Mode
+
+```bash
+mineru -p <input_path> -o <output_path> -b vlm-sglang-engine
+```
+
+### Through the sglang-server/client Mode
+
+1. Start the server:
+
+```bash
+mineru-sglang-server --port 30000
+```
+
+2. Use the client in another terminal:
+
+```bash
+mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
+```
+
+> [!TIP]
+> For more information about output files, please refer to [Output File Documentation](../output_file.md)
+
+---

BIN
docs/images/logo.png


+ 228 - 0
docs/images/logo.svg

@@ -0,0 +1,228 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg version="1.1" id="Layer_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px" width="516px" height="516px" viewBox="0 0 516 516" enable-background="new 0 0 516 516" xml:space="preserve">  <image id="image0" width="516" height="516" x="0" y="0"
+    xlink:href="data:image/png;base64,[base64-encoded PNG payload omitted]" />
+</svg>
+a/hRAmsZ4C5MZxIPj2sJP7A0RqUZ425w59Ia2NZuUAxk1A8KjYFQsrtHAD240yVwN4B9lZ8XU2/L
+ewRu81eL2n6t+ZU72r9nTVT9a7vNfcNmpLFu8WXz+B/WFLxUkNqyHARwgLs57/vnmrfEPs2/C683
+nyCo+ZeCt2KX3G/sAO+eSSlZYwcUMxBJpR8aAPwxkTnh+riLXe/0iizqIGGVfSXRR5Ek+27IznhE
+jWU7CMy+7f1YMsAuaY/MU/gchPa27e85IbyUkvfsh7FLaEm2gwDmJ3IZEfd1Ppt2qQXvGbxjh7Iw
+7aokG+wSXoldQ0uyfY4AoKd7gu0SKOAtt2dVXjMs1txIlO85gvr/XuAx3Vh3e6pTnUo2PGvT+ai4
+RXWOoNpSOyuRWYMGc2UCJx6T9rwdzPOxi5CUrbZvFBsDoWQ/COD3ycwaZDNdQvMU5yfPfaDn7UDm
+h6xLQrObuDt2DVtTCkGwkR8kND/c6XzZfyX5y/OswYvaKyhrb3BxoP17D6UQBPBfrkiou+e4aS1/
+R9LPHLp81rrQDua5hBuWrLikFGI++ycLq3Vyf2Z2EnXYc3yypYeBa08S+p4srLtkbuunOUa6G5i2
+tW+SknOP7eO3Ap0srGuNnVn8A8V1ufHuEtqF78BW9wxe4iCX2AzKkhGr7dTYJeSnVIIAnrDvJ7Sm
+A90vkh91JD8tnjVYxFF2ZZy6JB12Pi/FriE/pRMEcClPJLSmY3KnREoCoIX7ENdxkv0kYmGSKLsv
+qXNb6SulIFhsZ7E8kTU5/tcOjN2dJic538g3OS8rU2WLl2V8J6sPHTdWSkEAD5LQsI/Wy13JJ2J3
+B4wcDc4crLfv2GezNCueFMXsxwnOhZ26UrlqUKOju49dEyrpRduPlxvXmc5Vg7ozEtXMS1QdBFXk
+MKg/S9FcdyN9E+qnxPA324v1SaxIVw2astZOSWxO+VHuOvrH7lB9W34cD9hu3M2G2PVIkZbbkcnE
+QCilFgTwH0tuxL8Z7lZ6xu5QQw5wuNfc/QqCUmXf4I3YNRSm1A4NADpyi/tUYmu7277A0rp1xjw0
+yFGFw/q649mLaemMtihps5s5KrkTvhqqrHlj3J8YkVhll3NyzeF5VewgaFc1xp3AfgxOcnNJUE2e
+eyqegqCltR3F1bRKbG2XcEb1bnjUIBhTNT13hM2mbaKbSsLaYAfy5yRXGOYd2jpIK4mz6914zkxs
+bae4FXw74hNiw5nCnhzkemb+ITXZmvOSjYFQSnSPAOjEn10ijyEBsNH9H9+NsEcwIjfLdmcqY5Le
+PBKDPcTuia8zSOWlGwQwzt2X4OW/Kvd/XFC1KUgQtLJ2Nsrtxz4Mp1ep7pVJQzafzyR5dmDzWoPU
+XspBAIe53yT5NnK/rfq9e8ktrlrGx6kEQcdcL+ttE3K720yGlOClW2nJGjuMu5JfrYIgn7VexFkJ
+r7LK/bdqIQvdy+6tqrdY4jZPWeoRBJ3omxtio2wg43PjbFQqG0Li+2Ziz8fWoyDIR6fcjanNGPQB
+7/IBS3mVxbxpb/MeK1mWRxC0ogP9GOQGsQ0j6U0/ejOQbilVKZlg13NqInNwNF5zkPpLPQhgsLuH
+samX/xEfsY4NrONDVvE+uOWsrHMB06wtPehOd7rTiVa0pwMddSGwYsy3XdOJAQVB/nZ28+gaqBsi
+jb1pB/N4WivXQ0f5etROLq0HPKSsrLcT04uBUMohCOC3dmEik6CIFGqTncu9sYvwVw6HBgBt3HV8
+LlBXRLawCzkn3T9COkdQmK7uLmYG6oxItXn2P6xJtwmdIyjMSjs8scFNRfLxTzsu7RgIpXz2CAB2
+cHcxMFCHpNI9Z3vzTvrNaI+gcE/ZiSyLXYRUhGV2RogYCKW8ggD+ZCdlfQJqKQOr7aRyuFZQq9yC
+AG6xU9za2EVIWVtlX+KW2EUkq/yCAK62i2KXIGVsg53FzbGLSFo5BgF2YelMNSUl5+dcHruE5JXX
+VYNabdyVHBW2SakIl9opYRvUDUV+2rlfcWToRqXM/dqOZ13YJnX50M96O96ujV2ElBO7NnwMhFK+
+QQDrOJHrYhchZeM6TizXGCjnQ4Nqbdyv+GKcpqWsXGdfjjMFnQ4NkrDBTkAHCOLrWjuhvGeiLPcg
+gPV2ol0duwgpZXa1nVjuQ9+UfxDAek7mmthFSMm6hgoYAasSggDW2/E6QJCi/NxOKP8YqJQggI12
+ApfFLkJKzDousq+W97mBGuV+1aC+8903NMGY5GmdnZaFW9V1Z2EajnEX0yV2EVICltlXuCl2EaAg
+SKkKO8RdpVkQZCs+sBO5PXYR1XQfQTpusy/yWuwiJNNetP/JSgyEUnl7BAATuMLtErsWySZ7mC/z
+Yuwq6tQTpJXKDALo425iTuxqJIP+bF/kg9hF1KVDgzQtsQO4nE2xy5BM2WQ/sUOzFQOhVGoQwGpO
+tLP4OHYZkhkr7UzOKJd5CgpVqYcGNZ8f7C5keOyqJANesVP5S+wimqJzBKlUYQ0/n+RuY2TsuiQu
+e4STeCp2Fc3UFqSVyj00qPG07W7Xxy5CYrJr+WRWYyAUBQEs4gS7kFWxy5AoVts5HM/K2GXEpkOD
+Gp9xP2Pb2PVJWLaQM7J5ZqBOjUFa0R5BjTvZ2+6JXYSEZL9j36zHQCgKgloLOcjOK9/hKaWeNXY2
+h+tm8xo6NGjw2h3MRQyLXaekbIF9jXmxi8iPDg3iuN0OsN/GLkLSZFfZ/qUSA6Foj6DBa4dBK77i
+vs02sauVFHzIWXYNVbHLyJ9uKEqliryCAGCW+x4zYtcrCbvfznDPhPqdT4aCIJUq8g4C6M5p7lR6
+xK5ZEvKu/Zifs8EFemslRUGQShUFBAHAHHeOHlcuC3fZeTwOjX/uWacgSKWKAoMAevBV91W6x65c
+PHxo3+bamucKFQRNURA0eN3kr8kMznPaLyhNVXYPZ/NM7RcUBE1REDR43cyvSc6dzkkMjV2/FOg1
+fmKXs7HulxQETVEQNHjd3K+Jw8a6MziCdrH7IHm7wn7mFjT8eSoImqIgaPC6hSAA2M+dy9TYvZA8
+PGQXcH9TP08FQVMUBA1ebyUIoAdfdOfQM3ZPpAVv24+5pvrRYgVBfhQEDV5vNQgABvEt90UdJGTS
+B/YHzueNmpcKgvwoCBq8zisIAHcIh3Ng7P5IA7fbpTxc9wsKgvwoCBq8zjsIsLYczGlup9h9kmr2
+CD/kzw3HpVYQ5EdB0OB1AUEA0J3Pu//HoNj9qnjP2neZx4rG/6AgyI+CoMHrAoMAoCsnuyMYF7tv
+FesFu45fNjfqoIIgPwqCBq+LCAKAXu5YjmNE7P5VnDe5zX7Koua/QUGQHwVBg9dFBgHAKA7nGKcB
+UAOxJVzOrcxv+bsUBPlREDR47REEAH05yH1DoyGn7m27kutrLxI2T0GQHwVBg9eeQQDQhRPdUYyN
+3dcyZfzbbueKfOehUBDkR0HQ4HUCQQCwDV9yc9kzdn/LzgP2W37DhvwXUBDkR0HQ4HVCQQDQ1e1s
+x7uDYve5PNhabna/t783dYmwJQqC/CgIGrxOMAhwWAc31g5zx2ggVC9v2q3cwHy3sfDfVgVBfhQE
+DV4nHATV39efY92+TNHTCQVbzfN2B5dVnxEo5i2sIMiPgqDB61SCAHBt7CC3N/vSN/Y2KBmvcq/d
+7f7Y/E8sHwqC/CgIGrxOLQiq/7+jm2wHun1jb4eM22B/cXfZv3i+5Z9YPhQE+VEQNHidchDgsM6M
+Z08+54bqUKGRj+0V7uQmXnFr62+3hp/nS0GQHwVBg9cBgqD6H9uxs/syOzGMVrG3SiZ8zMv2MHfw
+YPUsRI22FwqCNCkIGrwOFgTVn/fmU25Xdq3o24828DRP2N+5k7W1X1QQ1FAQpFJFxoKg+pPhboJN
+Zp/KG9vAHudu/s6Cxo8NKQi2bKMgrSgIGryOEgTVn/VkNLPZ302gY0Y2V1qqWG+Pcjf38RZLm/4W
+BUENBUEqVWQ4CKrl3ADbn93dePqW4byLH/KePcZD7q+2qOUZiRUENRQEqVSR+SCo+bwTO7lPMI4x
+bF8GVxfWs4CF9jSP8Sgf5bP1FAQ1FASpVFEyQVDz2SA3ysYwlulMzcY2LIStYj7z+Rev8yJvF7L1
+FAQ1FASpVFFyQVD9X0cvt42NZmd2YozrQ5uMbNCmVLHRVvFvnuQpnmUZy+tPOZbf1lMQ1FAQpFJF
+iQZB7f9ztGUEk5nGaDeILnSjS+ytCsBK1vC+vc18nuZpXmFdSxtIQZA/BUEqVZR8ENT9vrYMZSTD
+3HAG051+DKBb0M35Ph+wlEV8aM/zMm/zbL5vMgVB/hQEqVRRVkFQ97PO9KU/3V0v68cgejGYPnRx
+iT7iZCt4iw9ZwmLedu/Yct5lKcv4sOl+5PtzaPk7FARhqm0du5uSkNWs5pXNnzva0YkOtKED7Wwo
+PWhPT3rTlS70xtGOHrRik+tG5/rrsBXkyLGSFaxlNatZyQes5V1WuEWsZz1rWM86Piqx95JsVbA9
+AhHJrlzsAkQkPgWBiCgIRERBICIoCEQEBYGIoCAQERQEIoKCQERQEIgICgIRQUEgIigIRAQFgYig
+IBARFAQigoJARFAQiAgKAhFBQSAiKAhEBAWBiKAgEBEUBCKCgkBEgP8PCPTCCyMAfxEAAAAldEVY
+dGRhdGU6Y3JlYXRlADIwMjUtMDctMDhUMDI6Mjc6NTgrMDA6MDDf29LGAAAAJXRFWHRkYXRlOm1v
+ZGlmeQAyMDI1LTA3LTA4VDAyOjI3OjU4KzAwOjAwroZqegAAACh0RVh0ZGF0ZTp0aW1lc3RhbXAA
+MjAyNS0wNy0wOFQwMjoyNzo1OCswMDowMPmTS6UAAAAASUVORK5CYII=" />
+</svg>

+ 29 - 0
docs/mineru.template.json

@@ -0,0 +1,29 @@
+{
+    "bucket_info":{
+        "bucket-name-1":["ak", "sk", "endpoint"],
+        "bucket-name-2":["ak", "sk", "endpoint"]
+    },
+    "latex-delimiter-config": {
+        "display": {
+            "left": "$$",
+            "right": "$$"
+        },
+        "inline": {
+            "left": "$",
+            "right": "$"
+        }
+    },
+    "llm-aided-config": {
+        "title_aided": {
+            "api_key": "your_api_key",
+            "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
+            "model": "qwen2.5-32b-instruct",
+            "enable": false
+        }
+    },
+    "models-dir": {
+        "pipeline": "",
+        "vlm": ""
+    },
+    "config_version": "1.3.0"
+}
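作为参考,下面是一段读取并合并上述模板配置的 Python 示意代码(`load_config` 与 `DEFAULTS` 均为示例假设,实际加载逻辑以 MinerU 源码为准):

```python
import json
from pathlib import Path

# 示意:默认配置仅摘录模板中的部分字段,实际默认值以 MinerU 源码为准
DEFAULTS = {
    "latex-delimiter-config": {
        "display": {"left": "$$", "right": "$$"},
        "inline": {"left": "$", "right": "$"},
    },
    "config_version": "1.3.0",
}

def load_config(path: Path = Path.home() / "mineru.json") -> dict:
    """读取用户目录下的 mineru.json 并做浅合并;文件不存在时返回默认配置。"""
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config

if __name__ == "__main__":
    print(load_config()["latex-delimiter-config"]["inline"]["left"])
```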

+ 3 - 0
docs/requirements.txt

@@ -0,0 +1,3 @@
+mkdocs
+markdown-gfm-admonition
+mkdocs-video

+ 2 - 2
docs/FAQ_zh_cn.md → docs/zh/FAQ/index.md

@@ -1,6 +1,6 @@
 # 常见问题解答
 
-### 1.在WSL2的Ubuntu22.04中遇到报错`ImportError: libGL.so.1: cannot open shared object file: No such file or directory`
+## 1.在WSL2的Ubuntu22.04中遇到报错`ImportError: libGL.so.1: cannot open shared object file: No such file or directory`
 
 WSL2的Ubuntu22.04中缺少`libgl`库,可通过以下命令安装`libgl`库解决:
 
@@ -11,7 +11,7 @@ sudo apt-get install libgl1-mesa-glx
 参考:https://github.com/opendatalab/MinerU/issues/388
 
 
-### 2.在 CentOS 7 或 Ubuntu 18 系统安装MinerU时报错`ERROR: Failed building wheel for simsimd`
+## 2.在 CentOS 7 或 Ubuntu 18 系统安装MinerU时报错`ERROR: Failed building wheel for simsimd`
 
 新版本albumentations(1.4.21)引入了依赖simsimd,由于simsimd在linux的预编译包要求glibc的版本大于等于2.28,导致部分2019年之前发布的Linux发行版无法正常安装,可通过如下命令安装:
 ```

File diff suppressed because it is too large
+ 16 - 0
docs/zh/index.md


+ 10 - 0
docs/zh/known_issues.md

@@ -0,0 +1,10 @@
+# Known Issues
+
+- 阅读顺序基于模型对可阅读内容在空间中的分布进行排序,在极端复杂的排版下可能会部分区域乱序
+- 对竖排文字的支持较为有限
+- 目录和列表通过规则进行识别,少部分不常见的列表形式可能无法识别
+- 代码块在layout模型里还没有支持
+- 漫画书、艺术图册、小学教材、习题尚不能很好解析
+- 表格识别在复杂表格上可能会出现行/列识别错误
+- 在小语种PDF上,OCR识别可能会出现字符不准确的情况(如拉丁文的重音符号、阿拉伯文易混淆字符等)
+- 部分公式可能会无法在markdown中渲染

+ 14 - 14
docs/output_file_zh_cn.md → docs/zh/output_file.md

@@ -1,22 +1,22 @@
-## 概览
+# 概览
 
 `mineru` 命令执行后除了输出 markdown 文件以外,还可能会生成若干个和 markdown 无关的文件。现在将一一介绍这些文件
 
-### some_pdf_layout.pdf
+## some_pdf_layout.pdf
 
 每一页的 layout 均由一个或多个框组成。 每个框右上角的数字表明它们的阅读顺序。此外 layout.pdf 框内用不同的背景色块圈定不同的内容块。
 
-![layout 页面示例](images/layout_example.png)
+![layout 页面示例](../images/layout_example.png)
 
-### some_pdf_spans.pdf(仅适用于pipeline后端)
+## some_pdf_spans.pdf(仅适用于pipeline后端)
 
 根据 span 类型的不同,采用不同颜色线框绘制页面上所有 span。该文件可以用于质检,可以快速排查出文本丢失、行内公式未识别等问题。
 
-![span 页面示例](images/spans_example.png)
+![span 页面示例](../images/spans_example.png)
 
-### some_pdf_model.json(仅适用于pipeline后端)
+## some_pdf_model.json(仅适用于pipeline后端)
 
-#### 结构定义
+### 结构定义
 
 ```python
 from pydantic import BaseModel, Field
@@ -62,9 +62,9 @@ inference_result: list[PageInferenceResults] = []
 ```
 
 poly 坐标的格式 \[x0, y0, x1, y1, x2, y2, x3, y3\], 分别表示左上、右上、右下、左下四点的坐标
-![poly 坐标示意图](images/poly.png)
+![poly 坐标示意图](../images/poly.png)
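作为参考,上述 poly 坐标可按如下示意代码转换为轴对齐的 bbox(`poly_to_bbox` 为示例函数,并非项目内实现):

```python
def poly_to_bbox(poly):
    """将 [x0, y0, x1, y1, x2, y2, x3, y3](左上、右上、右下、左下四点)
    转换为轴对齐的 [xmin, ymin, xmax, ymax]。"""
    xs = poly[0::2]  # 偶数位为 x 坐标
    ys = poly[1::2]  # 奇数位为 y 坐标
    return [min(xs), min(ys), max(xs), max(ys)]

print(poly_to_bbox([136, 781, 340, 781, 340, 806, 136, 806]))  # → [136, 781, 340, 806]
```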
 
-#### 示例数据
+### 示例数据
 
 ```json
 [
@@ -117,7 +117,7 @@ poly 坐标的格式 \[x0, y0, x1, y1, x2, y2, x3, y3\], 分别表示左上、
 ]
 ```
 
-### some_pdf_model_output.txt(仅适用于vlm后端)
+## some_pdf_model_output.txt(仅适用于vlm后端)
 
 该文件是vlm模型的输出结果,使用`----`分割每一页的输出结果。  
 每一页的输出结果是一些以`<|box_start|>`开头、以`<|md_end|>`结尾的文本块。  
@@ -143,7 +143,7 @@ poly 坐标的格式 \[x0, y0, x1, y1, x2, y2, x3, y3\], 分别表示左上、
     该字段是该block的markdown内容,如type为text,文本末尾可能存在`<|txt_contd|>`标记,表示该文本块可与后续text块连接。
     如type为table,内容为`otsl`格式表示的表格内容,需要转换为html格式才能在markdown中渲染。
 
-### some_pdf_middle.json
+## some_pdf_middle.json
 
 | 字段名            | 解释                                        |
 |:---------------|:------------------------------------------|
@@ -251,7 +251,7 @@ para_blocks内存储的元素为区块信息
 
 一级block(如有)->二级block->line->span
 
-#### 示例数据
+### 示例数据
 
 ```json
 {
@@ -354,7 +354,7 @@ para_blocks内存储的元素为区块信息
 }
 ```
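上述层级结构(一级block(如有)->二级block->line->span)可用如下示意代码递归提取 span 内容(字段名依据本文档描述,`collect_spans` 为示例函数,非项目提供的 API):

```python
def collect_spans(block):
    """按 block ->(嵌套 blocks)-> lines -> spans 层级递归收集 span 内容(示意)。"""
    spans = []
    for sub in block.get("blocks", []):   # 一级 block 可能嵌套二级 block
        spans.extend(collect_spans(sub))
    for line in block.get("lines", []):
        for span in line.get("spans", []):
            spans.append(span.get("content", ""))
    return spans

demo_block = {
    "type": "table",
    "blocks": [
        {"type": "table_caption",
         "lines": [{"spans": [{"type": "text", "content": "Table 1"}]}]},
    ],
}
print(collect_spans(demo_block))  # → ['Table 1']
```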
 
-### some_pdf_content_list.json
+## some_pdf_content_list.json
 
 该文件是一个json数组,每个元素是一个dict,按阅读顺序平铺存储文档中所有可阅读的内容块。  
 content_list可以看成简化后的middle.json,内容块的类型基本和middle.json一致,但不包含布局信息。  
@@ -370,7 +370,7 @@ content的类型有如下几种:
 需要注意的是,content_list中的title和text块统一使用text类型表示,通过`text_level`字段来区分文本块的层级,不含`text_level`字段或`text_level`为0的文本块表示正文文本,`text_level`为1的文本块表示一级标题,`text_level`为2的文本块表示二级标题,以此类推。  
 每个content包含`page_idx`字段,表示该内容块所在的页码,从0开始。
 
-#### 示例数据
+### 示例数据
 
 ```json
 [
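作为参考,下面的示意代码按前述 `text_level` 约定将 content_list 中的 text 块还原为 markdown(`content_list_to_markdown` 为示例函数,非项目提供的 API):

```python
def content_list_to_markdown(content_list):
    """依据 text_level 约定渲染文本块:缺失或为 0 表示正文,1 为一级标题,以此类推(示意)。"""
    lines = []
    for item in content_list:
        if item.get("type") != "text":
            continue
        level = item.get("text_level", 0)
        prefix = "#" * level + " " if level else ""
        lines.append(prefix + item["text"])
    return "\n\n".join(lines)

demo = [
    {"type": "text", "text": "引言", "text_level": 1, "page_idx": 0},
    {"type": "text", "text": "这是正文。", "page_idx": 0},
]
print(content_list_to_markdown(demo))
```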

+ 59 - 0
docs/zh/quick_start/index.md

@@ -0,0 +1,59 @@
+# 快速开始
+
+如果遇到任何安装问题,请先查询 [FAQ](../FAQ/index.md) 
+
+
+如果遇到解析效果不及预期,参考 [Known Issues](../known_issues.md) 
+
+
+有2种不同方式可以体验MinerU的效果:
+
+- [在线体验](online_demo.md)
+- [本地部署](local_deployment.md)
+
+
+> [!WARNING]
+> **安装前必看——软硬件环境支持说明**
+> 
+> 为了确保项目的稳定性和可靠性,我们在开发过程中仅对特定的软硬件环境进行优化和测试。这样当用户在推荐的系统配置上部署和运行项目时,能够获得最佳的性能表现和最少的兼容性问题。
+>
+> 通过集中资源和精力于主线环境,我们团队能够更高效地解决潜在的BUG,及时开发新功能。
+>
+> 在非主线环境中,由于硬件、软件配置的多样性,以及第三方依赖项的兼容性问题,我们无法100%保证项目的完全可用性。因此,对于希望在非推荐环境中使用本项目的用户,我们建议先仔细阅读文档以及FAQ,大多数问题已经在FAQ中有对应的解决方案,除此之外我们鼓励社区反馈问题,以便我们能够逐步扩大支持范围。
+
+<table>
+    <tr>
+        <td>解析后端</td>
+        <td>pipeline</td>
+        <td>vlm-transformers</td>
+        <td>vlm-sglang</td>
+    </tr>
+    <tr>
+        <td>操作系统</td>
+        <td>windows/linux/mac</td>
+        <td>windows/linux</td>
+        <td>windows(wsl2)/linux</td>
+    </tr>
+    <tr>
+        <td>CPU推理支持</td>
+        <td>✅</td>
+        <td colspan="2">❌</td>
+    </tr>
+    <tr>
+        <td>GPU要求</td>
+        <td>Turing及以后架构,6G显存以上或Apple Silicon</td>
+        <td colspan="2">Ampere及以后架构,8G显存以上</td>
+    </tr>
+    <tr>
+        <td>内存要求</td>
+        <td colspan="3">最低16G以上,推荐32G以上</td>
+    </tr>
+    <tr>
+        <td>磁盘空间要求</td>
+        <td colspan="3">20G以上,推荐使用SSD</td>
+    </tr>
+    <tr>
+        <td>python版本</td>
+        <td colspan="3">3.10-3.13</td>
+    </tr>
+</table>

+ 72 - 0
docs/zh/quick_start/local_deployment.md

@@ -0,0 +1,72 @@
+# 本地部署
+
+## 安装 MinerU
+
+### 使用 pip 或 uv 安装
+
+```bash
+pip install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple
+pip install uv -i https://mirrors.aliyun.com/pypi/simple
+uv pip install -U "mineru[core]" -i https://mirrors.aliyun.com/pypi/simple 
+```
+
+### 源码安装
+
+```bash
+git clone https://github.com/opendatalab/MinerU.git
+cd MinerU
+uv pip install -e .[core] -i https://mirrors.aliyun.com/pypi/simple
+```
+
+> [!NOTE]
+> Linux和macOS系统安装后自动支持cuda/mps加速,Windows用户如需使用cuda加速,
+> 请前往 [Pytorch官网](https://pytorch.org/get-started/locally/) 选择合适的cuda版本安装pytorch。
+
+### 安装完整版(支持 sglang 加速)(需确保设备有Turing及以后架构,8G显存及以上显卡)
+
+如需使用 **sglang 加速 VLM 模型推理**,请选择合适的方式安装完整版本:
+
+- 使用uv或pip安装
+  ```bash
+  uv pip install -U "mineru[all]" -i https://mirrors.aliyun.com/pypi/simple
+  ```
+- 从源码安装:
+  ```bash
+  uv pip install -e .[all] -i https://mirrors.aliyun.com/pypi/simple
+  ```
+  
+> [!TIP]
+> sglang安装过程中如发生异常,请参考[sglang官方文档](https://docs.sglang.ai/start/install.html)尝试解决或直接使用docker方式安装。
+
+- 使用 Dockerfile 构建镜像:
+  ```bash
+  wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/Dockerfile
+  docker build -t mineru-sglang:latest -f Dockerfile .
+  ```
+  启动 Docker 容器:
+  ```bash
+  docker run --gpus all \
+    --shm-size 32g \
+    -p 30000:30000 \
+    --ipc=host \
+    mineru-sglang:latest \
+    mineru-sglang-server --host 0.0.0.0 --port 30000
+  ```
+  或使用 Docker Compose 启动:
+  ```bash
+    wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
+    docker compose -f compose.yaml up -d
+  ```
+  
+> [!TIP]
+> Dockerfile默认使用`lmsysorg/sglang:v0.4.8.post1-cu126`作为基础镜像,支持Turing/Ampere/Ada Lovelace/Hopper平台,
+> 如您使用较新的`Blackwell`平台,请将基础镜像修改为`lmsysorg/sglang:v0.4.8.post1-cu128-b200`。
+
+### 安装client(用于在仅需 CPU 和网络连接的边缘设备上连接 sglang-server)
+
+```bash
+uv pip install -U mineru -i https://mirrors.aliyun.com/pypi/simple
+mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<host_ip>:<port>
+```
+
+---

File diff suppressed because it is too large
+ 2 - 0
docs/zh/quick_start/online_demo.md


+ 9 - 0
docs/zh/todo.md

@@ -0,0 +1,9 @@
+# TODO
+
+- [x] 基于模型的阅读顺序  
+- [x] 正文中目录、列表识别  
+- [x] 表格识别
+- [x] 标题分级
+- [ ] 正文中代码块识别
+- [ ] [化学式识别](../chemical_knowledge_introduction/introduction.pdf)
+- [ ] 几何图形识别

+ 57 - 0
docs/zh/usage/api.md

@@ -0,0 +1,57 @@
+# API 调用 或 可视化调用
+
+1. 使用python api直接调用:[Python 调用示例](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
+2. 使用fast api方式调用:
+    ```bash
+    mineru-api --host 127.0.0.1 --port 8000
+    ```
+    在浏览器中访问 http://127.0.0.1:8000/docs 查看API文档。
+
+3. 使用gradio webui 或 gradio api调用
+    ```bash
+    # 使用 pipeline/vlm-transformers/vlm-sglang-client 后端
+    mineru-gradio --server-name 127.0.0.1 --server-port 7860
+    # 或使用 vlm-sglang-engine/pipeline 后端
+    mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
+    ```
+    在浏览器中访问 http://127.0.0.1:7860 使用 Gradio WebUI 或访问 http://127.0.0.1:7860/?view=api 使用 Gradio API。
+
+> [!TIP]
+> - 以下是一些使用sglang加速模式的建议和注意事项:
+> - sglang加速模式目前支持在最低8G显存的Turing架构显卡上运行,但在显存<24G的显卡上可能会遇到显存不足的问题,可以通过以下参数优化显存使用:
+>   - 如果您使用单张显卡时遇到显存不足,可能需要调低KV缓存大小:`--mem-fraction-static 0.5`;如仍出现显存不足问题,可尝试进一步降低到`0.4`或更低。
+>   - 如您有两张以上显卡,可尝试通过张量并行(TP)模式简单扩充可用显存:`--tp-size 2`
+> - 如果您已经可以正常使用sglang对vlm模型进行加速推理,但仍然希望进一步提升推理速度,可以尝试以下参数:
+>   - 如果您有多张显卡,可以使用sglang的多卡并行模式来增加吞吐量:`--dp-size 2`
+>   - 同时您可以启用`torch.compile`来将推理速度加速约15%:`--enable-torch-compile`
+> - 如果您想了解更多有关`sglang`的参数使用方法,请参考 [sglang官方文档](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
+> - 所有sglang官方支持的参数都可通过命令行参数传递给 MinerU,包括以下命令:`mineru`、`mineru-sglang-server`、`mineru-gradio`、`mineru-api`
+
+> [!TIP]
+> - 任何情况下,您都可以通过在命令行的开头添加`CUDA_VISIBLE_DEVICES` 环境变量来指定可见的 GPU 设备。例如:
+>   ```bash
+>   CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
+>   ```
+> - 这种指定方式对所有的命令行调用都有效,包括 `mineru`、`mineru-sglang-server`、`mineru-gradio` 和 `mineru-api`,且对`pipeline`、`vlm`后端均适用。
+> - 以下是一些常见的 `CUDA_VISIBLE_DEVICES` 设置示例:
+>   ```bash
+>   CUDA_VISIBLE_DEVICES=1       # Only device 1 will be seen
+>   CUDA_VISIBLE_DEVICES=0,1     # Devices 0 and 1 will be visible
+>   CUDA_VISIBLE_DEVICES="0,1"   # Same as above, quotation marks are optional
+>   CUDA_VISIBLE_DEVICES=0,2,3   # Devices 0, 2, 3 will be visible; device 1 is masked
+>   CUDA_VISIBLE_DEVICES=""      # No GPU will be visible
+>   ```
+> - 以下是一些可能的使用场景:
+>   - 如果您有多张显卡,需要指定卡0和卡1,并使用多卡并行来启动`sglang-server`,可以使用以下命令:
+>   ```bash
+>   CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
+>   ```
+>   - 如果您有多张显卡,需要在卡0和卡1上启动两个`fastapi`服务,并分别监听不同的端口,可以使用以下命令:
+>   ```bash
+>   # 在终端1中
+>   CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
+>   # 在终端2中
+>   CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
+>   ```
+
+---

+ 11 - 0
docs/zh/usage/config.md

@@ -0,0 +1,11 @@
+
+# 基于配置文件扩展 MinerU 功能
+
+- MinerU 现已实现开箱即用,但也支持通过配置文件扩展功能。您可以在用户目录下创建 `mineru.json` 文件,添加自定义配置。
+- `mineru.json` 文件会在您使用内置模型下载命令 `mineru-models-download` 时自动生成,也可以通过将[配置模板文件](../../mineru.template.json)复制到用户目录下并重命名为 `mineru.json` 来创建。
+- 以下是一些可用的配置选项:
+  - `latex-delimiter-config`:用于配置 LaTeX 公式的分隔符,默认为`$`符号,可根据需要修改为其他符号或字符串。
+  - `llm-aided-config`:用于配置 LLM 辅助标题分级的相关参数,兼容所有支持`openai协议`的 LLM 模型,默认使用`阿里云百炼`的`qwen2.5-32b-instruct`模型,您需要自行配置 API 密钥并将`enable`设置为`true`来启用此功能。
+  - `models-dir`:用于指定本地模型存储目录,请为`pipeline`和`vlm`后端分别指定模型目录,指定目录后您可通过配置环境变量`export MINERU_MODEL_SOURCE=local`来使用本地模型。
+
+---

+ 125 - 0
docs/zh/usage/index.md

@@ -0,0 +1,125 @@
+# 使用 MinerU
+
+## 命令行使用方式
+
+### 基础用法
+
+最简单的命令行调用方式如下:
+
+```bash
+mineru -p <input_path> -o <output_path>
+```
+
+- `<input_path>`:本地 PDF/图片 文件或目录(支持 pdf/png/jpg/jpeg/webp/gif)
+- `<output_path>`:输出目录
+
+### 查看帮助信息
+
+获取所有可用参数说明:
+
+```bash
+mineru --help
+```
+
+### 参数详解
+
+```text
+Usage: mineru [OPTIONS]
+
+Options:
+  -v, --version                   显示版本并退出
+  -p, --path PATH                 输入文件路径或目录(必填)
+  -o, --output PATH               输出目录(必填)
+  -m, --method [auto|txt|ocr]     解析方法:auto(默认)、txt、ocr(仅用于 pipeline 后端)
+  -b, --backend [pipeline|vlm-transformers|vlm-sglang-engine|vlm-sglang-client]
+                                  解析后端(默认为 pipeline)
+  -l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|latin|arabic|east_slavic|cyrillic|devanagari]
+                                  指定文档语言(可提升 OCR 准确率,仅用于 pipeline 后端)
+  -u, --url TEXT                  当使用 sglang-client 时,需指定服务地址
+  -s, --start INTEGER             开始解析的页码(从 0 开始)
+  -e, --end INTEGER               结束解析的页码(从 0 开始)
+  -f, --formula BOOLEAN           是否启用公式解析(默认开启)
+  -t, --table BOOLEAN             是否启用表格解析(默认开启)
+  -d, --device TEXT               推理设备(如 cpu/cuda/cuda:0/npu/mps,仅 pipeline 后端)
+  --vram INTEGER                  单进程最大 GPU 显存占用(GB)(仅 pipeline 后端)
+  --source [huggingface|modelscope|local]
+                                  模型来源,默认 huggingface
+  --help                          显示帮助信息
+```
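作为补充,下面是一段按上述参数说明拼装 `mineru` 命令行的 Python 示意代码(`build_mineru_cmd` 为示例函数名,并非项目提供的 API;实际参数以 `mineru --help` 输出为准):

```python
def build_mineru_cmd(input_path, output_path, backend="pipeline",
                     method="auto", lang=None, start=None, end=None):
    """按上文参数说明拼装 mineru 命令行参数列表(示意)。"""
    cmd = ["mineru", "-p", str(input_path), "-o", str(output_path),
           "-b", backend, "-m", method]
    if lang is not None:
        cmd += ["-l", lang]        # 仅 pipeline 后端使用
    if start is not None:
        cmd += ["-s", str(start)]  # 页码从 0 开始
    if end is not None:
        cmd += ["-e", str(end)]
    return cmd

# 例:解析 demo.pdf 的第 0-9 页
print(" ".join(build_mineru_cmd("demo.pdf", "output", lang="ch", start=0, end=9)))
```

拼装后的命令可交给 `subprocess.run(cmd, check=True)` 执行。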
+
+---
+
+## 模型源配置
+
+MinerU 默认在首次运行时自动从 HuggingFace 下载所需模型。若无法访问 HuggingFace,可通过以下方式切换模型源:
+
+### 切换至 ModelScope 源
+
+```bash
+mineru -p <input_path> -o <output_path> --source modelscope
+```
+
+或设置环境变量:
+
+```bash
+export MINERU_MODEL_SOURCE=modelscope
+mineru -p <input_path> -o <output_path>
+```
+
+### 使用本地模型
+
+#### 1. 下载模型到本地
+
+```bash
+mineru-models-download --help
+```
+
+或使用交互式命令行工具选择模型下载:
+
+```bash
+mineru-models-download
+```
+
+下载完成后,模型路径会在当前终端窗口输出,并自动写入用户目录下的 `mineru.json`。
+
+#### 2. 使用本地模型进行解析
+
+```bash
+mineru -p <input_path> -o <output_path> --source local
+```
+
+或通过环境变量启用:
+
+```bash
+export MINERU_MODEL_SOURCE=local
+mineru -p <input_path> -o <output_path>
+```
+
+---
+
+## 使用 sglang 加速 VLM 模型推理
+
+### 通过 sglang-engine 模式
+
+```bash
+mineru -p <input_path> -o <output_path> -b vlm-sglang-engine
+```
+
+### 通过 sglang-server/client 模式
+
+1. 启动 Server:
+
+```bash
+mineru-sglang-server --port 30000
+```
+
+2. 在另一个终端中使用 Client 调用:
+
+```bash
+mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
+```
+
+> [!TIP]
+> 更多关于输出文件的信息,请参考 [输出文件说明](../output_file.md)
+
+---

+ 11 - 8
mineru/model/table/rapid_table.py

@@ -76,11 +76,14 @@ class RapidTableModel(object):
 
 
         if ocr_result:
-            table_results = self.table_model(np.asarray(image), ocr_result)
-            html_code = table_results.pred_html
-            table_cell_bboxes = table_results.cell_bboxes
-            logic_points = table_results.logic_points
-            elapse = table_results.elapse
-            return html_code, table_cell_bboxes, logic_points, elapse
-        else:
-            return None, None, None, None
+            try:
+                table_results = self.table_model(np.asarray(image), ocr_result)
+                html_code = table_results.pred_html
+                table_cell_bboxes = table_results.cell_bboxes
+                logic_points = table_results.logic_points
+                elapse = table_results.elapse
+                return html_code, table_cell_bboxes, logic_points, elapse
+            except Exception as e:
+                logger.exception(e)
+
+        return None, None, None, None
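经此修改,OCR 结果为空或推理抛出异常时都会返回 `(None, None, None, None)`,调用方需要统一处理这一降级路径;下面是一段调用侧的示意代码(`render_table` 为示例函数,非项目内实际代码):

```python
def render_table(predict_fn, image, ocr_result):
    """调用表格识别,失败时(返回 None 元组)降级为空字符串(示意)。"""
    html_code, cell_bboxes, logic_points, elapse = predict_fn(image, ocr_result)
    if html_code is None:
        # 对应上面 OCR 结果为空或推理抛异常的情况
        return ""
    return html_code

# 模拟失败与成功两条路径
print(render_table(lambda *_: (None, None, None, None), None, []))
print(render_table(lambda *_: ("<table></table>", [], [], 0.1), None, ["cell"]))
```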

+ 96 - 0
mkdocs.yml

@@ -0,0 +1,96 @@
+site_name: MinerU
+site_url: https://sidney233.github.io/MinerU/
+repo_name: opendatalab/MinerU
+repo_url: https://github.com/opendatalab/MinerU
+
+theme:
+  name: material
+  palette:
+    # Palette toggle for automatic mode
+    - media: "(prefers-color-scheme)"
+      scheme: default
+      primary: black
+      toggle:
+        icon: material/brightness-auto
+        name: Switch to light mode
+
+    # Palette toggle for light mode
+    - media: "(prefers-color-scheme: light)"
+      scheme: default
+      primary: black
+      toggle:
+        icon: material/brightness-7
+        name: Switch to dark mode
+
+    # Palette toggle for dark mode
+    - media: "(prefers-color-scheme: dark)"
+      scheme: slate
+      primary: black
+      toggle:
+        icon: material/brightness-4
+        name: Switch to system preference
+  logo: images/logo.png
+  favicon: images/logo.svg
+  features:
+    - content.tabs.link
+    - content.code.annotate
+    - content.code.copy
+    - navigation.instant
+    - navigation.instant.progress
+    - navigation.tabs
+    - navigation.tabs.sticky
+    - navigation.sections
+    - navigation.path
+    - navigation.indexes
+    - search.suggest
+
+nav:
+  - Home:
+    - "MinerU": index.md
+    - Quick Start:
+      - quick_start/index.md
+      - Online Demo: quick_start/online_demo.md
+      - Local Deployment: quick_start/local_deployment.md
+    - Usage:
+      - usage/index.md
+      - API Calls or Visual Invocation: usage/api.md
+      - Extending MinerU Functionality Through Configuration Files: usage/config.md
+  - FAQ:
+      - FAQ: FAQ/index.md
+  - Output File Format: output_file.md
+  - Known Issues: known_issues.md
+  - TODO: todo.md
+
+plugins:
+  - search
+  - i18n:
+      docs_structure: folder
+      languages:
+        - locale: en
+          default: true
+          name: English
+          build: true
+        - locale: zh
+          name: 中文
+          build: true
+          nav_translations:
+            Home: 主页
+            Quick Start: 快速开始
+            Online Demo: 在线体验
+            Local Deployment: 本地部署
+            Usage: 使用方法
+            API Calls or Visual Invocation: API 调用 或 可视化调用
+            Extending MinerU Functionality Through Configuration Files: 基于配置文件扩展 MinerU 功能
+            FAQ: FAQ
+            Output File Format: 输出文件格式
+            Known Issues: Known Issues
+            TODO: TODO
+  - mkdocs-video
+
+markdown_extensions:
+  - gfm_admonition
+  - pymdownx.highlight:
+      use_pygments: true
+  - pymdownx.superfences
+  - pymdownx.tasklist:
+      custom_checkbox: true

Some files were not shown because too many files changed in this diff