
Merge pull request #906 from icecraft/feat/add_en_docs

Feat/add en docs
Xiaomeng Zhao 1 year ago
parent
commit
784c61a219
100 files changed, 1686 additions, 602 deletions. Each entry below lists additions, then deletions, then the file path.
  1. 3 0
      .gitignore
  2. 2 2
      .readthedocs.yaml
  3. 1 177
      README.md
  4. 1 176
      README_zh-CN.md
  5. 2 2
      docs/en/.readthedocs.yaml
  6. 0 0
      docs/en/Makefile
  7. 0 0
      docs/en/_static/image/MinerU-logo-hq.png
  8. 0 0
      docs/en/_static/image/MinerU-logo.png
  9. 0 0
      docs/en/_static/image/datalab_logo.png
  10. 0 0
      docs/en/_static/image/flowchart_en.png
  11. 0 0
      docs/en/_static/image/flowchart_zh_cn.png
  12. 0 0
      docs/en/_static/image/layout_example.png
  13. 0 0
      docs/en/_static/image/logo.png
  14. 0 0
      docs/en/_static/image/poly.png
  15. 0 0
      docs/en/_static/image/project_panorama_en.png
  16. 0 0
      docs/en/_static/image/project_panorama_zh_cn.png
  17. 0 0
      docs/en/_static/image/spans_example.png
  18. 0 0
      docs/en/_static/image/web_demo_1.png
  19. 12 0
      docs/en/additional_notes/faq.rst
  20. 0 0
      docs/en/additional_notes/glossary.rst
  21. 20 0
      docs/en/additional_notes/known_issues.rst
  22. 0 0
      docs/en/api.rst
  23. 0 0
      docs/en/api/classes.rst
  24. 0 0
      docs/en/api/data_reader_writer.rst
  25. 0 0
      docs/en/api/dataset.rst
  26. 0 0
      docs/en/api/io.rst
  27. 0 0
      docs/en/api/read_api.rst
  28. 0 0
      docs/en/api/schemas.rst
  29. 0 0
      docs/en/conf.py
  30. 23 22
      docs/en/index.rst
  31. 0 0
      docs/en/make.bat
  32. 0 0
      docs/en/user_guide.rst
  33. 0 0
      docs/en/user_guide/data.rst
  34. 0 0
      docs/en/user_guide/data/data_reader_writer.rst
  35. 0 0
      docs/en/user_guide/data/dataset.rst
  36. 0 0
      docs/en/user_guide/data/io.rst
  37. 0 0
      docs/en/user_guide/data/read_api.rst
  38. 0 0
      docs/en/user_guide/install.rst
  39. 8 15
      docs/en/user_guide/install/boost_with_cuda.rst
  40. 0 0
      docs/en/user_guide/install/download_model_weight_files.rst
  41. 107 0
      docs/en/user_guide/install/install.rst
  42. 0 0
      docs/en/user_guide/quick_start.rst
  43. 4 1
      docs/en/user_guide/quick_start/command_line.rst
  44. 0 0
      docs/en/user_guide/quick_start/to_markdown.rst
  45. 0 0
      docs/en/user_guide/tutorial.rst
  46. 0 0
      docs/en/user_guide/tutorial/output_file_description.rst
  47. 0 0
      docs/requirements.txt
  48. 2 2
      docs/zh_cn/.readthedocs.yaml
  49. 0 0
      docs/zh_cn/Makefile
  50. 0 0
      docs/zh_cn/_static/image/MinerU-logo-hq.png
  51. 0 0
      docs/zh_cn/_static/image/MinerU-logo.png
  52. 0 0
      docs/zh_cn/_static/image/datalab_logo.png
  53. 0 0
      docs/zh_cn/_static/image/flowchart_en.png
  54. 0 0
      docs/zh_cn/_static/image/flowchart_zh_cn.png
  55. 0 0
      docs/zh_cn/_static/image/layout_example.png
  56. 0 0
      docs/zh_cn/_static/image/logo.png
  57. 0 0
      docs/zh_cn/_static/image/poly.png
  58. 0 0
      docs/zh_cn/_static/image/project_panorama_en.png
  59. 0 0
      docs/zh_cn/_static/image/project_panorama_zh_cn.png
  60. 0 0
      docs/zh_cn/_static/image/spans_example.png
  61. 0 0
      docs/zh_cn/_static/image/web_demo_1.png
  62. 78 0
      docs/zh_cn/additional_notes/faq.rst
  63. 11 0
      docs/zh_cn/additional_notes/glossary.rst
  64. 13 0
      docs/zh_cn/additional_notes/known_issues.rst
  65. 32 3
      docs/zh_cn/conf.py
  66. 81 0
      docs/zh_cn/index.rst
  67. 0 0
      docs/zh_cn/make.bat
  68. 10 0
      docs/zh_cn/user_guide.rst
  69. 20 0
      docs/zh_cn/user_guide/data.rst
  70. 186 0
      docs/zh_cn/user_guide/data/data_reader_writer.rst
  71. 31 0
      docs/zh_cn/user_guide/data/dataset.rst
  72. 21 0
      docs/zh_cn/user_guide/data/io.rst
  73. 48 0
      docs/zh_cn/user_guide/data/read_api.rst
  74. 13 0
      docs/zh_cn/user_guide/install.rst
  75. 293 0
      docs/zh_cn/user_guide/install/boost_with_cuda.rst
  76. 36 0
      docs/zh_cn/user_guide/install/download_model_weight_files.rst
  77. 96 0
      docs/zh_cn/user_guide/install/install.rst
  78. 13 0
      docs/zh_cn/user_guide/quick_start.rst
  79. 61 0
      docs/zh_cn/user_guide/quick_start/command_line.rst
  80. 53 0
      docs/zh_cn/user_guide/quick_start/to_markdown.rst
  81. 11 0
      docs/zh_cn/user_guide/tutorial.rst
  82. 394 0
      docs/zh_cn/user_guide/tutorial/output_file_description.rst
  83. 0 26
      next_docs/en/additional_notes/changelog.rst
  84. 0 19
      next_docs/en/additional_notes/known_issues.rst
  85. 0 1
      next_docs/en/api/utils.rst
  86. 0 13
      next_docs/en/projects.rst
  87. 0 107
      next_docs/en/user_guide/install/install.rst
  88. 0 10
      next_docs/en/user_guide/quick_start/extract_text.rst
  89. 0 26
      next_docs/zh_cn/index.rst
  90. 0 0
      old_docs/FAQ_en_us.md
  91. 0 0
      old_docs/FAQ_zh_cn.md
  92. 0 0
      old_docs/README_Ubuntu_CUDA_Acceleration_en_US.md
  93. 0 0
      old_docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md
  94. 0 0
      old_docs/README_Windows_CUDA_Acceleration_en_US.md
  95. 0 0
      old_docs/README_Windows_CUDA_Acceleration_zh_CN.md
  96. 0 0
      old_docs/chemical_knowledge_introduction/introduction.pdf
  97. 0 0
      old_docs/chemical_knowledge_introduction/introduction.xmind
  98. 0 0
      old_docs/download_models.py
  99. 0 0
      old_docs/download_models_hf.py
  100. 0 0
      old_docs/how_to_download_models_en.md

+ 3 - 0
.gitignore

@@ -48,3 +48,6 @@ debug_utils/
 
 # sphinx docs
 _build/
+
+
+output/

+ 2 - 2
.readthedocs.yaml

@@ -10,7 +10,7 @@ formats:
 
 python:
   install:
-    - requirements: next_docs/zh_cn/requirements.txt
+    - requirements: docs/zh_cn/requirements.txt
 
 sphinx:
-  configuration: next_docs/zh_cn/conf.py
+  configuration: docs/zh_cn/conf.py

+ 1 - 177
README.md

@@ -75,12 +75,10 @@
             <ul>
             <li><a href="#online-demo">Online Demo</a></li>
             <li><a href="#quick-cpu-demo">Quick CPU Demo</a></li>
-            <li><a href="#using-gpu">Using GPU</a></li>
             </ul>
         </li>
         <li><a href="#usage">Usage</a>
             <ul>
-            <li><a href="#command-line">Command Line</a></li>
             <li><a href="#api">API</a></li>
             <li><a href="#deploy-derived-projects">Deploy Derived Projects</a></li>
             <li><a href="#development-guide">Development Guide</a></li>
@@ -89,8 +87,6 @@
       </ul>
     </li>
     <li><a href="#todo">TODO</a></li>
-    <li><a href="#known-issues">Known Issues</a></li>
-    <li><a href="#faq">FAQ</a></li>
     <li><a href="#all-thanks-to-our-contributors">All Thanks To Our Contributors</a></li>
     <li><a href="#license-information">License Information</a></li>
     <li><a href="#acknowledgments">Acknowledgments</a></li>
@@ -112,89 +108,12 @@ Compared to well-known commercial products, MinerU is still young. If you encoun
 
 https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c
 
-## Key Features
-
-- Remove headers, footers, footnotes, page numbers, etc., to ensure semantic coherence.
-- Output text in human-readable order, suitable for single-column, multi-column, and complex layouts.
-- Preserve the structure of the original document, including headings, paragraphs, lists, etc.
-- Extract images, image descriptions, tables, table titles, and footnotes.
-- Automatically recognize and convert formulas in the document to LaTeX format.
-- Automatically recognize and convert tables in the document to LaTeX or HTML format.
-- Automatically detect scanned PDFs and garbled PDFs and enable OCR functionality.
-- OCR supports detection and recognition of 84 languages.
-- Supports multiple output formats, such as multimodal and NLP Markdown, JSON sorted by reading order, and rich intermediate formats.
-- Supports various visualization results, including layout visualization and span visualization, for efficient confirmation of output quality.
-- Supports both CPU and GPU environments.
-- Compatible with Windows, Linux, and Mac platforms.
-
 ## Quick Start
 
-If you encounter any installation issues, please first consult the <a href="#faq">FAQ</a>. </br>
-If the parsing results are not as expected, refer to the <a href="#known-issues">Known Issues</a>. </br>
-There are three different ways to experience MinerU:
+There are multiple different ways to experience MinerU:
 
 - [Online Demo (No Installation Required)](#online-demo)
 - [Quick CPU Demo (Windows, Linux, Mac)](#quick-cpu-demo)
-- [Linux/Windows + CUDA](#Using-GPU)
-
-> [!WARNING]
-> **Pre-installation Notice—Hardware and Software Environment Support**
->
-> To ensure the stability and reliability of the project, we only optimize and test for specific hardware and software environments during development. This ensures that users deploying and running the project on recommended system configurations will get the best performance with the fewest compatibility issues.
->
-> By focusing resources on the mainline environment, our team can more efficiently resolve potential bugs and develop new features.
->
-> In non-mainline environments, due to the diversity of hardware and software configurations, as well as third-party dependency compatibility issues, we cannot guarantee 100% project availability. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first. Most issues already have corresponding solutions in the FAQ. We also encourage community feedback to help us gradually expand support.
-
-<table>
-    <tr>
-        <td colspan="3" rowspan="2">Operating System</td>
-    </tr>
-    <tr>
-        <td>Ubuntu 22.04 LTS</td>
-        <td>Windows 10 / 11</td>
-        <td>macOS 11+</td>
-    </tr>
-    <tr>
-        <td colspan="3">CPU</td>
-        <td>x86_64(unsupported ARM Linux)</td>
-        <td>x86_64(unsupported ARM Windows)</td>
-        <td>x86_64 / arm64</td>
-    </tr>
-    <tr>
-        <td colspan="3">Memory</td>
-        <td colspan="3">16GB or more, recommended 32GB+</td>
-    </tr>
-    <tr>
-        <td colspan="3">Python Version</td>
-        <td colspan="3">3.10(Please make sure to create a Python 3.10 virtual environment using conda)</td>
-    </tr>
-    <tr>
-        <td colspan="3">Nvidia Driver Version</td>
-        <td>latest (Proprietary Driver)</td>
-        <td>latest</td>
-        <td>None</td>
-    </tr>
-    <tr>
-        <td colspan="3">CUDA Environment</td>
-        <td>Automatic installation [12.1 (pytorch) + 11.8 (paddle)]</td>
-        <td>11.8 (manual installation) + cuDNN v8.7.0 (manual installation)</td>
-        <td>None</td>
-    </tr>
-    <tr>
-        <td rowspan="2">GPU Hardware Support List</td>
-        <td colspan="2">Minimum Requirement 8G+ VRAM</td>
-        <td colspan="2">3060ti/3070/4060<br>
-        8G VRAM enables layout, formula recognition acceleration and OCR acceleration</td>
-        <td rowspan="2">None</td>
-    </tr>
-    <tr>
-        <td colspan="2">Recommended Configuration 10G+ VRAM</td>
-        <td colspan="2">3080/3080ti/3090/3090ti/4070/4070ti/4070tisuper/4080/4090<br>
-        10G VRAM or more can enable layout, formula recognition, OCR acceleration and table recognition acceleration simultaneously
-        </td>
-    </tr>
-</table>
 
 ### Online Demo
 
@@ -251,85 +170,8 @@ You can modify certain configurations in this file to enable or disable features
 }
 ```
 
-### Using GPU
-
-If your device supports CUDA and meets the GPU requirements of the mainline environment, you can use GPU acceleration. Please select the appropriate guide based on your system:
-
-- [Ubuntu 22.04 LTS + GPU](docs/README_Ubuntu_CUDA_Acceleration_en_US.md)
-- [Windows 10/11 + GPU](docs/README_Windows_CUDA_Acceleration_en_US.md)
-- Quick Deployment with Docker
-> [!IMPORTANT]
-> Docker requires a GPU with at least 16GB of VRAM, and all acceleration features are enabled by default.
->
-> Before running this Docker, you can use the following command to check if your device supports CUDA acceleration on Docker.
-> 
-> ```bash
-> docker run --rm --gpus=all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
-> ```
-  ```bash
-  wget https://github.com/opendatalab/MinerU/raw/master/Dockerfile
-  docker build -t mineru:latest .
-  docker run --rm -it --gpus=all mineru:latest /bin/bash
-  magic-pdf --help
-  ```
-
 ## Usage
 
-### Command Line
-
-```bash
-magic-pdf --help
-Usage: magic-pdf [OPTIONS]
-
-Options:
-  -v, --version                display the version and exit
-  -p, --path PATH              local pdf filepath or directory  [required]
-  -o, --output-dir PATH        output local directory  [required]
-  -m, --method [ocr|txt|auto]  the method for parsing pdf. ocr: using ocr
-                               technique to extract information from pdf. txt:
-                               suitable for the text-based pdf only and
-                               outperform ocr. auto: automatically choose the
-                               best method for parsing pdf from ocr and txt.
-                               without method specified, auto will be used by
-                               default.
-  -l, --lang TEXT              Input the languages in the pdf (if known) to
-                               improve OCR accuracy.  Optional. You should
-                               input "Abbreviation" with language form url: ht
-                               tps://paddlepaddle.github.io/PaddleOCR/latest/en
-                               /ppocr/blog/multi_languages.html#5-support-languages-
-                               and-abbreviations
-  -d, --debug BOOLEAN          Enables detailed debugging information during
-                               the execution of the CLI commands.
-  -s, --start INTEGER          The starting page for PDF parsing, beginning
-                               from 0.
-  -e, --end INTEGER            The ending page for PDF parsing, beginning from
-                               0.
-  --help                       Show this message and exit.
-
-
-## show version
-magic-pdf -v
-
-## command line example
-magic-pdf -p {some_pdf} -o {some_output_dir} -m auto
-```
-
-`{some_pdf}` can be a single PDF file or a directory containing multiple PDFs.
-The results will be saved in the `{some_output_dir}` directory. The output file list is as follows:
-
-```text
-├── some_pdf.md                          # markdown file
-├── images                               # directory for storing images
-├── some_pdf_layout.pdf                  # layout diagram (Include layout reading order)
-├── some_pdf_middle.json                 # MinerU intermediate processing result
-├── some_pdf_model.json                  # model inference result
-├── some_pdf_origin.pdf                  # original PDF file
-├── some_pdf_spans.pdf                   # smallest granularity bbox position information diagram
-└── some_pdf_content_list.json           # Rich text JSON arranged in reading order
-```
-> [!TIP]
-> For more information about the output files, please refer to the [Output File Description](docs/output_file_en_us.md).
-
 ### API
 
 Processing files from local disk
@@ -386,24 +228,6 @@ TODO
 - [ ] [Chemical formula recognition](docs/chemical_knowledge_introduction/introduction.pdf)
 - [ ] Geometric shape recognition
 
-# Known Issues
-
-- Reading order is determined by the model based on the spatial distribution of readable content, and may be out of order in some areas under extremely complex layouts.
-- Vertical text is not supported.
-- Tables of contents and lists are recognized through rules, and some uncommon list formats may not be recognized.
-- Only one level of headings is supported; hierarchical headings are not currently supported.
-- Code blocks are not yet supported in the layout model.
-- Comic books, art albums, primary school textbooks, and exercises cannot be parsed well.
-- Table recognition may result in row/column recognition errors in complex tables.
-- OCR recognition may produce inaccurate characters in PDFs of lesser-known languages (e.g., diacritical marks in Latin script, easily confused characters in Arabic script).
-- Some formulas may not render correctly in Markdown.
-
-# FAQ
-
-[FAQ in Chinese](docs/FAQ_zh_cn.md)
-
-[FAQ in English](docs/FAQ_en_us.md)
-
 # All Thanks To Our Contributors
 
 <a href="https://github.com/opendatalab/MinerU/graphs/contributors">

+ 1 - 176
README_zh-CN.md

@@ -76,12 +76,10 @@
             <ul>
             <li><a href="#在线体验">在线体验</a></li>
             <li><a href="#使用CPU快速体验">使用CPU快速体验</a></li>
-            <li><a href="#使用GPU">使用GPU</a></li>
             </ul>
         </li>
         <li><a href="#使用">使用方式</a>
             <ul>
-            <li><a href="#命令行">命令行</a></li>
             <li><a href="#api">API</a></li>
             <li><a href="#部署衍生项目">部署衍生项目</a></li>
             <li><a href="#二次开发">二次开发</a></li>
@@ -113,90 +111,13 @@ MinerU诞生于[书生-浦语](https://github.com/InternLM/InternLM)的预训练
 
 https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c
 
-## 主要功能
-
-- 删除页眉、页脚、脚注、页码等元素,确保语义连贯
-- 输出符合人类阅读顺序的文本,适用于单栏、多栏及复杂排版
-- 保留原文档的结构,包括标题、段落、列表等
-- 提取图像、图片描述、表格、表格标题及脚注
-- 自动识别并转换文档中的公式为LaTeX格式
-- 自动识别并转换文档中的表格为LaTeX或HTML格式
-- 自动检测扫描版PDF和乱码PDF,并启用OCR功能
-- OCR支持84种语言的检测与识别
-- 支持多种输出格式,如多模态与NLP的Markdown、按阅读顺序排序的JSON、含有丰富信息的中间格式等
-- 支持多种可视化结果,包括layout可视化、span可视化等,便于高效确认输出效果与质检
-- 支持CPU和GPU环境
-- 兼容Windows、Linux和Mac平台
 
 ## 快速开始
 
-如果遇到任何安装问题,请先查询 <a href="#faq">FAQ</a> </br>
-如果遇到解析效果不及预期,参考 <a href="#known-issues">Known Issues</a></br>
-有3种不同方式可以体验MinerU的效果:
+有多种不同方式可以体验MinerU的效果:
 
 - [在线体验(无需任何安装)](#在线体验)
 - [使用CPU快速体验(Windows,Linux,Mac)](#使用cpu快速体验)
-- [Linux/Windows + CUDA](#使用gpu)
-
-
-> [!WARNING]
-> **安装前必看——软硬件环境支持说明**
-> 
-> 为了确保项目的稳定性和可靠性,我们在开发过程中仅对特定的软硬件环境进行优化和测试。这样当用户在推荐的系统配置上部署和运行项目时,能够获得最佳的性能表现和最少的兼容性问题。
->
-> 通过集中资源和精力于主线环境,我们团队能够更高效地解决潜在的BUG,及时开发新功能。
->
-> 在非主线环境中,由于硬件、软件配置的多样性,以及第三方依赖项的兼容性问题,我们无法100%保证项目的完全可用性。因此,对于希望在非推荐环境中使用本项目的用户,我们建议先仔细阅读文档以及FAQ,大多数问题已经在FAQ中有对应的解决方案,除此之外我们鼓励社区反馈问题,以便我们能够逐步扩大支持范围。
-
-<table>
-    <tr>
-        <td colspan="3" rowspan="2">操作系统</td>
-    </tr>
-    <tr>
-        <td>Ubuntu 22.04 LTS</td>
-        <td>Windows 10 / 11</td>
-        <td>macOS 11+</td>
-    </tr>
-    <tr>
-        <td colspan="3">CPU</td>
-        <td>x86_64(暂不支持ARM Linux)</td>
-        <td>x86_64(暂不支持ARM Windows)</td>
-        <td>x86_64 / arm64</td>
-    </tr>
-    <tr>
-        <td colspan="3">内存</td>
-        <td colspan="3">大于等于16GB,推荐32G以上</td>
-    </tr>
-    <tr>
-        <td colspan="3">python版本</td>
-        <td colspan="3">3.10 (请务必通过conda创建3.10虚拟环境)</td>
-    </tr>
-    <tr>
-        <td colspan="3">Nvidia Driver 版本</td>
-        <td>latest(专有驱动)</td>
-        <td>latest</td>
-        <td>None</td>
-    </tr>
-    <tr>
-        <td colspan="3">CUDA环境</td>
-        <td>自动安装[12.1(pytorch)+11.8(paddle)]</td>
-        <td>11.8(手动安装)+cuDNN v8.7.0(手动安装)</td>
-        <td>None</td>
-    </tr>
-    <tr>
-        <td rowspan="2">GPU硬件支持列表</td>
-        <td colspan="2">最低要求 8G+显存</td>
-        <td colspan="2">3060ti/3070/4060<br>
-        8G显存可开启layout、公式识别和ocr加速</td>
-        <td rowspan="2">None</td>
-    </tr>
-    <tr>
-        <td colspan="2">推荐配置 10G+显存</td>
-        <td colspan="2">3080/3080ti/3090/3090ti/4070/4070ti/4070tisuper/4080/4090<br>
-        10G显存及以上可以同时开启layout、公式识别和ocr加速和表格识别加速<br>
-        </td>
-    </tr>
-</table>
 
 ### 在线体验
 稳定版(经过QA验证的稳定版本):
@@ -257,87 +178,9 @@ pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com -i h
 }
 ```
 
-### 使用GPU
-
-如果您的设备支持CUDA,且满足主线环境中的显卡要求,则可以使用GPU加速,请根据自己的系统选择适合的教程:
-
-- [Ubuntu22.04LTS + GPU](docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md)
-- [Windows10/11 + GPU](docs/README_Windows_CUDA_Acceleration_zh_CN.md)
-- 使用Docker快速部署
-> [!IMPORTANT]
-> Docker 需设备gpu显存大于等于16GB,默认开启所有加速功能
-> 
-> 运行本docker前可以通过以下命令检测自己的设备是否支持在docker上使用CUDA加速
-> 
-> ```bash
-> docker run --rm --gpus=all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
-> ```
-  ```bash
-  wget https://github.com/opendatalab/MinerU/raw/master/Dockerfile
-  docker build -t mineru:latest .
-  docker run --rm -it --gpus=all mineru:latest /bin/bash
-  magic-pdf --help
-  ```
-    
 
 ## 使用
 
-### 命令行
-
-```bash
-magic-pdf --help
-Usage: magic-pdf [OPTIONS]
-
-Options:
-  -v, --version                display the version and exit
-  -p, --path PATH              local pdf filepath or directory  [required]
-  -o, --output-dir PATH        output local directory  [required]
-  -m, --method [ocr|txt|auto]  the method for parsing pdf. ocr: using ocr
-                               technique to extract information from pdf. txt:
-                               suitable for the text-based pdf only and
-                               outperform ocr. auto: automatically choose the
-                               best method for parsing pdf from ocr and txt.
-                               without method specified, auto will be used by
-                               default.
-  -l, --lang TEXT              Input the languages in the pdf (if known) to
-                               improve OCR accuracy.  Optional. You should
-                               input "Abbreviation" with language form url: ht
-                               tps://paddlepaddle.github.io/PaddleOCR/latest/en
-                               /ppocr/blog/multi_languages.html#5-support-languages-
-                               and-abbreviations
-  -d, --debug BOOLEAN          Enables detailed debugging information during
-                               the execution of the CLI commands.
-  -s, --start INTEGER          The starting page for PDF parsing, beginning
-                               from 0.
-  -e, --end INTEGER            The ending page for PDF parsing, beginning from
-                               0.
-  --help                       Show this message and exit.
-
-
-## show version
-magic-pdf -v
-
-## command line example
-magic-pdf -p {some_pdf} -o {some_output_dir} -m auto
-```
-
-其中 `{some_pdf}` 可以是单个pdf文件,也可以是一个包含多个pdf文件的目录。
-运行完命令后输出的结果会保存在`{some_output_dir}`目录下, 输出的文件列表如下
-
-```text
-├── some_pdf.md                          # markdown 文件
-├── images                               # 存放图片目录
-├── some_pdf_layout.pdf                  # layout 绘图 (包含layout阅读顺序)
-├── some_pdf_middle.json                 # minerU 中间处理结果
-├── some_pdf_model.json                  # 模型推理结果
-├── some_pdf_origin.pdf                  # 原 pdf 文件
-├── some_pdf_spans.pdf                   # 最小粒度的bbox位置信息绘图
-└── some_pdf_content_list.json           # 按阅读顺序排列的富文本json
-```
-
-> [!TIP]
-> 更多有关输出文件的信息,请参考[输出文件说明](docs/output_file_zh_cn.md)
-
 ### API
 
 处理本地磁盘上的文件
@@ -394,24 +237,6 @@ TODO
 - [ ] [化学式识别](docs/chemical_knowledge_introduction/introduction.pdf)
 - [ ] 几何图形识别
 
-# Known Issues
-
-- 阅读顺序基于模型对可阅读内容在空间中的分布进行排序,在极端复杂的排版下可能会部分区域乱序
-- 不支持竖排文字
-- 目录和列表通过规则进行识别,少部分不常见的列表形式可能无法识别
-- 标题只有一级,目前不支持标题分级
-- 代码块在layout模型里还没有支持
-- 漫画书、艺术图册、小学教材、习题尚不能很好解析
-- 表格识别在复杂表格上可能会出现行/列识别错误
-- 在小语种PDF上,OCR识别可能会出现字符不准确的情况(如拉丁文的重音符号、阿拉伯文易混淆字符等)
-- 部分公式可能会无法在markdown中渲染
-
-# FAQ
-
-[常见问题](docs/FAQ_zh_cn.md)
-
-
-[FAQ](docs/FAQ_en_us.md)
 
 # All Thanks To Our Contributors
 

+ 2 - 2
next_docs/en/.readthedocs.yaml → docs/en/.readthedocs.yaml

@@ -10,7 +10,7 @@ formats:
 
 python:
   install:
-    - requirements: next_docs/requirements.txt
+    - requirements: docs/requirements.txt
 
 sphinx:
-  configuration: next_docs/en/conf.py
+  configuration: docs/en/conf.py

+ 0 - 0
next_docs/en/Makefile → docs/en/Makefile


+ 0 - 0
docs/images/MinerU-logo-hq.png → docs/en/_static/image/MinerU-logo-hq.png


+ 0 - 0
docs/images/MinerU-logo.png → docs/en/_static/image/MinerU-logo.png


+ 0 - 0
docs/images/datalab_logo.png → docs/en/_static/image/datalab_logo.png


+ 0 - 0
docs/images/flowchart_en.png → docs/en/_static/image/flowchart_en.png


+ 0 - 0
docs/images/flowchart_zh_cn.png → docs/en/_static/image/flowchart_zh_cn.png


+ 0 - 0
docs/images/layout_example.png → docs/en/_static/image/layout_example.png


+ 0 - 0
next_docs/en/_static/image/logo.png → docs/en/_static/image/logo.png


+ 0 - 0
docs/images/poly.png → docs/en/_static/image/poly.png


+ 0 - 0
docs/images/project_panorama_en.png → docs/en/_static/image/project_panorama_en.png


+ 0 - 0
docs/images/project_panorama_zh_cn.png → docs/en/_static/image/project_panorama_zh_cn.png


+ 0 - 0
docs/images/spans_example.png → docs/en/_static/image/spans_example.png


+ 0 - 0
docs/images/web_demo_1.png → docs/en/_static/image/web_demo_1.png


+ 12 - 0
next_docs/en/additional_notes/faq.rst → docs/en/additional_notes/faq.rst

@@ -74,3 +74,15 @@ CUDA version used by Paddle needs to be upgraded.
    pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/
 
 Reference: https://github.com/opendatalab/MinerU/issues/558
+
+
+7. On some Linux servers, the program immediately reports an error ``Illegal instruction (core dumped)``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This might be because the server's CPU does not support the AVX/AVX2
+instruction set, or the CPU itself supports it but has been disabled by
+the system administrator. You can try contacting the system
+administrator to remove the restriction or change to a different server.
+
+References: https://github.com/opendatalab/MinerU/issues/591 ,
+https://github.com/opendatalab/MinerU/issues/736
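
One way to check the first cause described in this FAQ entry is to look at the CPU flags the kernel reports. The sketch below is not part of MinerU. It assumes a Linux x86_64 host where `/proc/cpuinfo` contains a `flags` line, and the helper name `cpu_simd_flags` is hypothetical. An empty result suggests that AVX/AVX2 is not available to the process, which is consistent with the `Illegal instruction` error.

```python
from pathlib import Path


def cpu_simd_flags(cpuinfo_text: str) -> set[str]:
    """Return the AVX-family flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 Linux lists CPU features on lines whose key is "flags".
        if key.strip() == "flags":
            found.update(f for f in value.split() if f.startswith("avx"))
    return found


if __name__ == "__main__":
    # Assumption: the host is Linux and exposes CPU flags at this path.
    cpuinfo = Path("/proc/cpuinfo")
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    flags = cpu_simd_flags(text)
    print("AVX-family flags:", sorted(flags) or "none reported")
```

If the list is empty on hardware that should support AVX, the restriction is probably applied at the system or virtualization level, which matches the advice above to contact the administrator.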

+ 0 - 0
next_docs/en/additional_notes/glossary.rst → docs/en/additional_notes/glossary.rst


+ 20 - 0
docs/en/additional_notes/known_issues.rst

@@ -0,0 +1,20 @@
+Known Issues
+============
+
+-  Reading order is determined by the model based on the spatial
+   distribution of readable content, and may be out of order in some
+   areas under extremely complex layouts.
+-  Vertical text is not supported.
+-  Tables of contents and lists are recognized through rules, and some
+   uncommon list formats may not be recognized.
+-  Only one level of headings is supported; hierarchical headings are
+   not currently supported.
+-  Code blocks are not yet supported in the layout model.
+-  Comic books, art albums, primary school textbooks, and exercises
+   cannot be parsed well.
+-  Table recognition may result in row/column recognition errors in
+   complex tables.
+-  OCR recognition may produce inaccurate characters in PDFs of
+   lesser-known languages (e.g., diacritical marks in Latin script,
+   easily confused characters in Arabic script).
+-  Some formulas may not render correctly in Markdown.

+ 0 - 0
next_docs/en/api.rst → docs/en/api.rst


+ 0 - 0
next_docs/en/api/classes.rst → docs/en/api/classes.rst


+ 0 - 0
next_docs/en/api/data_reader_writer.rst → docs/en/api/data_reader_writer.rst


+ 0 - 0
next_docs/en/api/dataset.rst → docs/en/api/dataset.rst


+ 0 - 0
next_docs/en/api/io.rst → docs/en/api/io.rst


+ 0 - 0
next_docs/en/api/read_api.rst → docs/en/api/read_api.rst


+ 0 - 0
next_docs/en/api/schemas.rst → docs/en/api/schemas.rst


+ 0 - 0
next_docs/en/conf.py → docs/en/conf.py


+ 23 - 22
next_docs/en/index.rst → docs/en/index.rst

@@ -46,20 +46,29 @@ the relevant PDF**.
 Key Features
 ------------
 
--  Removes elements such as headers, footers, footnotes, and page
-   numbers while maintaining semantic continuity
--  Outputs text in a human-readable order from multi-column documents
--  Retains the original structure of the document, including titles,
-   paragraphs, and lists
--  Extracts images, image captions, tables, and table captions
--  Automatically recognizes formulas in the document and converts them
-   to LaTeX
--  Automatically recognizes tables in the document and converts them to
-   LaTeX
--  Automatically detects and enables OCR for corrupted PDFs
--  Supports both CPU and GPU environments
--  Supports Windows, Linux, and Mac platforms
-
+-  Remove headers, footers, footnotes, page numbers, etc., to ensure
+   semantic coherence.
+-  Output text in human-readable order, suitable for single-column,
+   multi-column, and complex layouts.
+-  Preserve the structure of the original document, including headings,
+   paragraphs, lists, etc.
+-  Extract images, image descriptions, tables, table titles, and
+   footnotes.
+-  Automatically recognize and convert formulas in the document to LaTeX
+   format.
+-  Automatically recognize and convert tables in the document to LaTeX
+   or HTML format.
+-  Automatically detect scanned PDFs and garbled PDFs and enable OCR
+   functionality.
+-  OCR supports detection and recognition of 84 languages.
+-  Supports multiple output formats, such as multimodal and NLP
+   Markdown, JSON sorted by reading order, and rich intermediate
+   formats.
+-  Supports various visualization results, including layout
+   visualization and span visualization, for efficient confirmation of
+   output quality.
+-  Supports both CPU and GPU environments.
+-  Compatible with Windows, Linux, and Mac platforms.
 
 User Guide
 -------------
@@ -91,14 +100,6 @@ Additional Notes
 
    additional_notes/known_issues
    additional_notes/faq
-   additional_notes/changelog
    additional_notes/glossary
 
 
-Projects 
----------
-.. toctree::
-   :maxdepth: 1
-   :caption: Projects
-
-   projects

+ 0 - 0
next_docs/en/make.bat → docs/en/make.bat


+ 0 - 0
next_docs/en/user_guide.rst → docs/en/user_guide.rst


+ 0 - 0
next_docs/en/user_guide/data.rst → docs/en/user_guide/data.rst


+ 0 - 0
next_docs/en/user_guide/data/data_reader_writer.rst → docs/en/user_guide/data/data_reader_writer.rst


+ 0 - 0
next_docs/en/user_guide/data/dataset.rst → docs/en/user_guide/data/dataset.rst


+ 0 - 0
next_docs/en/user_guide/data/io.rst → docs/en/user_guide/data/io.rst


+ 0 - 0
next_docs/en/user_guide/data/read_api.rst → docs/en/user_guide/data/read_api.rst


+ 0 - 0
next_docs/en/user_guide/install.rst → docs/en/user_guide/install.rst


+ 8 - 15
next_docs/en/user_guide/install/boost_with_cuda.rst → docs/en/user_guide/install/boost_with_cuda.rst

@@ -137,7 +137,7 @@ Download a sample file from the repository and test it.
 .. code:: sh
 
    wget https://github.com/opendatalab/MinerU/raw/master/demo/small_ocr.pdf
-   magic-pdf -p small_ocr.pdf
+   magic-pdf -p small_ocr.pdf -o ./output
 
 9. Test CUDA Acceleration
 ~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -145,10 +145,6 @@ Download a sample file from the repository and test it.
 If your graphics card has at least **8GB** of VRAM, follow these steps
 to test CUDA acceleration:
 
-   ❗ Due to the extremely limited nature of 8GB VRAM for running this
-   application, you need to close all other programs using VRAM to
-   ensure that 8GB of VRAM is available when running this application.
-
 1. Modify the value of ``"device-mode"`` in the ``magic-pdf.json``
    configuration file located in your home directory.
 
@@ -162,7 +158,7 @@ to test CUDA acceleration:
 
    .. code:: sh
 
-      magic-pdf -p small_ocr.pdf
+      magic-pdf -p small_ocr.pdf -o ./output
 
 10. Enable CUDA Acceleration for OCR
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -178,7 +174,9 @@ to test CUDA acceleration:
 
    .. code:: sh
 
-      magic-pdf -p small_ocr.pdf
+      magic-pdf -p small_ocr.pdf -o ./output
+
+
 
 .. _windows_10_or_11_section:
 
@@ -252,7 +250,7 @@ Download a sample file from the repository and test it.
 .. code:: powershell
 
      wget https://github.com/opendatalab/MinerU/raw/master/demo/small_ocr.pdf -O small_ocr.pdf
-     magic-pdf -p small_ocr.pdf
+     magic-pdf -p small_ocr.pdf -o ./output
 
 8. Test CUDA Acceleration
 ~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -260,10 +258,6 @@ Download a sample file from the repository and test it.
 If your graphics card has at least 8GB of VRAM, follow these steps to
 test CUDA-accelerated parsing performance.
 
-   ❗ Due to the extremely limited nature of 8GB VRAM for running this
-   application, you need to close all other programs using VRAM to
-   ensure that 8GB of VRAM is available when running this application.
-
 1. **Overwrite the installation of torch and torchvision** supporting
    CUDA.
 
@@ -295,7 +289,7 @@ test CUDA-accelerated parsing performance.
 
    ::
 
-      magic-pdf -p small_ocr.pdf
+      magic-pdf -p small_ocr.pdf -o ./output
 
 9. Enable CUDA Acceleration for OCR
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -311,5 +305,4 @@ test CUDA-accelerated parsing performance.
 
    ::
 
-      magic-pdf -p small_ocr.pdf
-
+      magic-pdf -p small_ocr.pdf -o ./output
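
The ``"device-mode"`` change described in the CUDA test steps can also be scripted. The following is a minimal sketch, not part of magic-pdf: `set_device_mode` is a hypothetical helper, and it assumes the configuration is a plain JSON object whose other keys should be left untouched.

```python
import json
from pathlib import Path


def set_device_mode(config_path: Path, mode: str = "cuda") -> dict:
    """Set "device-mode" in a magic-pdf.json style file and keep all other keys.

    Hypothetical helper for illustration; it is not shipped with magic-pdf.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config["device-mode"] = mode
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return config
```

Under these assumptions, calling `set_device_mode(Path.home() / "magic-pdf.json")` would switch the file in the home directory to CUDA mode.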

+ 0 - 0
next_docs/en/user_guide/install/download_model_weight_files.rst → docs/en/user_guide/install/download_model_weight_files.rst


+ 107 - 0
docs/en/user_guide/install/install.rst

@@ -0,0 +1,107 @@
+
+Install 
+===============================================================
+If you encounter any installation issues, please first consult the :doc:`../../additional_notes/faq`.
+If the parsing results are not as expected, refer to the :doc:`../../additional_notes/known_issues`.
+
+Pre-installation Notice—Hardware and Software Environment Support
+------------------------------------------------------------------
+
+To ensure the stability and reliability of the project, we only optimize
+and test for specific hardware and software environments during
+development. This ensures that users deploying and running the project
+on recommended system configurations will get the best performance with
+the fewest compatibility issues.
+
+By focusing resources on the mainline environment, our team can more
+efficiently resolve potential bugs and develop new features.
+
+In non-mainline environments, due to the diversity of hardware and
+software configurations, as well as third-party dependency compatibility
+issues, we cannot guarantee 100% project availability. Therefore, for
+users who wish to use this project in non-recommended environments, we
+suggest carefully reading the documentation and FAQ first. Most issues
+already have corresponding solutions in the FAQ. We also encourage
+community feedback to help us gradually expand support.
+
+.. raw:: html
+
+    <style>
+        table, th, td {
+        border: 1px solid black;
+        border-collapse: collapse;
+        }
+    </style>
+    <table>
+        <tr>
+            <td colspan="3" rowspan="2">Operating System</td>
+        </tr>
+        <tr>
+            <td>Ubuntu 22.04 LTS</td>
+            <td>Windows 10 / 11</td>
+            <td>macOS 11+</td>
+        </tr>
+        <tr>
+            <td colspan="3">CPU</td>
+            <td>x86_64 (ARM Linux is not supported)</td>
+            <td>x86_64 (ARM Windows is not supported)</td>
+            <td>x86_64 / arm64</td>
+        </tr>
+        <tr>
+            <td colspan="3">Memory</td>
+            <td colspan="3">16GB or more, recommended 32GB+</td>
+        </tr>
+        <tr>
+            <td colspan="3">Python Version</td>
+            <td colspan="3">3.10 (be sure to create a Python 3.10 virtual environment with conda)</td>
+        </tr>
+        <tr>
+            <td colspan="3">Nvidia Driver Version</td>
+            <td>latest (Proprietary Driver)</td>
+            <td>latest</td>
+            <td>None</td>
+        </tr>
+        <tr>
+            <td colspan="3">CUDA Environment</td>
+            <td>Automatic installation [12.1 (pytorch) + 11.8 (paddle)]</td>
+            <td>11.8 (manual installation) + cuDNN v8.7.0 (manual installation)</td>
+            <td>None</td>
+        </tr>
+        <tr>
+            <td rowspan="2">GPU Hardware Support List</td>
+            <td colspan="2">Minimum Requirement 8G+ VRAM</td>
+            <td colspan="2">3060ti/3070/4060<br>
+            8G VRAM enables layout, formula recognition acceleration and OCR acceleration</td>
+            <td rowspan="2">None</td>
+        </tr>
+        <tr>
+            <td colspan="2">Recommended Configuration 10G+ VRAM</td>
+            <td colspan="2">3080/3080ti/3090/3090ti/4070/4070ti/4070tisuper/4080/4090<br>
+            10G VRAM or more can enable layout, formula recognition, OCR acceleration and table recognition acceleration simultaneously
+            </td>
+        </tr>
+    </table>
+
+
+
+Create an environment
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: shell
+
+    conda create -n MinerU python=3.10
+    conda activate MinerU
+    pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com
+
+
+Download model weight files
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: shell
+
+    pip install huggingface_hub
+    wget https://github.com/opendatalab/MinerU/raw/master/scripts/download_models_hf.py -O download_models_hf.py
+    python download_models_hf.py    
+
+
+MinerU is now installed. Check out :doc:`../quick_start`, or read :doc:`boost_with_cuda` to accelerate inference.
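
The GPU rows of the support table above amount to a simple threshold rule. Below is a minimal sketch of that rule; `features_for_vram` is a hypothetical helper (not part of magic-pdf), and the thresholds are copied from the table rather than queried from any library.

```python
def features_for_vram(vram_gb: float) -> list:
    """Return the acceleration features the support table lists for a VRAM size.

    Hypothetical helper; the 8 GB and 10 GB thresholds come from the table above.
    """
    if vram_gb < 8:
        # Below the documented minimum, the table lists no GPU acceleration.
        return []
    features = ["layout", "formula recognition", "OCR"]
    if vram_gb >= 10:
        # 10 GB or more additionally allows table recognition acceleration.
        features.append("table recognition")
    return features
```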

+ 0 - 0
next_docs/en/user_guide/quick_start.rst → docs/en/user_guide/quick_start.rst


+ 4 - 1
next_docs/en/user_guide/quick_start/command_line.rst → docs/en/user_guide/quick_start/command_line.rst

@@ -55,5 +55,8 @@ directory. The output file list is as follows:
    ├── some_pdf_spans.pdf                   # smallest granularity bbox position information diagram
    └── some_pdf_content_list.json           # Rich text JSON arranged in reading order
 
-For more information about the output files, please refer to the :doc:`../tutorial/output_file_description`
+.. admonition:: Tip
+   :class: tip
+
+   For more information about the output files, please refer to the :doc:`../tutorial/output_file_description`
 

+ 0 - 0
next_docs/en/user_guide/quick_start/to_markdown.rst → docs/en/user_guide/quick_start/to_markdown.rst


+ 0 - 0
next_docs/en/user_guide/tutorial.rst → docs/en/user_guide/tutorial.rst


+ 0 - 0
next_docs/en/user_guide/tutorial/output_file_description.rst → docs/en/user_guide/tutorial/output_file_description.rst


+ 0 - 0
next_docs/requirements.txt → docs/requirements.txt


+ 2 - 2
next_docs/zh_cn/.readthedocs.yaml → docs/zh_cn/.readthedocs.yaml

@@ -10,7 +10,7 @@ formats:
 
 python:
   install:
-    - requirements: next_docs/requirements.txt
+    - requirements: docs/requirements.txt
 
 sphinx:
-  configuration: next_docs/zh_cn/conf.py
+  configuration: docs/zh_cn/conf.py

+ 0 - 0
next_docs/zh_cn/Makefile → docs/zh_cn/Makefile


+ 0 - 0
next_docs/en/_static/image/MinerU-logo-hq.png → docs/zh_cn/_static/image/MinerU-logo-hq.png


+ 0 - 0
next_docs/en/_static/image/MinerU-logo.png → docs/zh_cn/_static/image/MinerU-logo.png


+ 0 - 0
next_docs/en/_static/image/datalab_logo.png → docs/zh_cn/_static/image/datalab_logo.png


+ 0 - 0
next_docs/en/_static/image/flowchart_en.png → docs/zh_cn/_static/image/flowchart_en.png


+ 0 - 0
next_docs/en/_static/image/flowchart_zh_cn.png → docs/zh_cn/_static/image/flowchart_zh_cn.png


+ 0 - 0
next_docs/en/_static/image/layout_example.png → docs/zh_cn/_static/image/layout_example.png


+ 0 - 0
next_docs/zh_cn/_static/image/logo.png → docs/zh_cn/_static/image/logo.png


+ 0 - 0
next_docs/en/_static/image/poly.png → docs/zh_cn/_static/image/poly.png


+ 0 - 0
next_docs/en/_static/image/project_panorama_en.png → docs/zh_cn/_static/image/project_panorama_en.png


+ 0 - 0
next_docs/en/_static/image/project_panorama_zh_cn.png → docs/zh_cn/_static/image/project_panorama_zh_cn.png


+ 0 - 0
next_docs/en/_static/image/spans_example.png → docs/zh_cn/_static/image/spans_example.png


+ 0 - 0
next_docs/en/_static/image/web_demo_1.png → docs/zh_cn/_static/image/web_demo_1.png


+ 78 - 0
docs/zh_cn/additional_notes/faq.rst

@@ -0,0 +1,78 @@
+Frequently Asked Questions
+==========================
+
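+1. On newer versions of macOS, installing with pip install magic-pdf[full] fails with zsh: no matches found: magic-pdf[full]
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+On macOS, the default shell changed from Bash to Z shell, which applies
+special handling to certain string patterns, and this can cause the no matches
+found error. Try disabling the globbing feature on the command line and then run the install command again: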
+
+.. code:: bash
+
+   setopt no_nomatch
+   pip install magic-pdf[full]
+
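+2. Encountering the error _pickle.UnpicklingError: invalid load key, 'v'. during use
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This is probably caused by incompletely downloaded model files. Try downloading the model files again and retry.
+Reference: https://github.com/opendatalab/MinerU/issues/143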
+
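+3. Where should the model files be downloaded to, and how should models-dir be configured
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The path to the model files is configured in ``magic-pdf.json`` through the following entry:
+
+.. code:: json
+
+   {
+     "models-dir": "/tmp/models"
+   }
+
+This path must be an absolute path, not a relative one. You can obtain the
+absolute path by running the ``pwd`` command inside the models directory.
+Reference: https://github.com/opendatalab/MinerU/issues/155#issuecomment-2230216874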
+
+4. Error in Ubuntu 22.04 on WSL2: ``ImportError: libGL.so.1: cannot open shared object file: No such file or directory``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Ubuntu 22.04 on WSL2 lacks the ``libgl`` library. Install it with the following command:
+
+.. code:: bash
+
+   sudo apt-get install libgl1-mesa-glx
+
+Reference: https://github.com/opendatalab/MinerU/issues/388
+
+5. Error ``ModuleNotFoundError: No module named 'fairscale'``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The module needs to be uninstalled and reinstalled:
+
+.. code:: bash
+
+   pip uninstall fairscale
+   pip install fairscale
+
+Reference: https://github.com/opendatalab/MinerU/issues/411
+
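+6. On some newer devices such as the H100, text parsed with CUDA-accelerated OCR is garbled
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+CUDA 11 has poor compatibility with newer GPUs, so the CUDA version used by paddle needs to be upgraded:
+
+.. code:: bash
+
+   pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/
+
+Reference: https://github.com/opendatalab/MinerU/issues/558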
+
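+7. On some Linux servers, the program fails as soon as it runs with ``非法指令 (核心已转储)`` or ``Illegal instruction (core dumped)``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This may be because the server's CPU does not support the AVX/AVX2 instruction sets, or because the CPU supports them but they have been disabled by the operations team. Try contacting the operations team to lift the restriction, or switch to a different server.
+
+Reference: https://github.com/opendatalab/MinerU/issues/591 ,
+https://github.com/opendatalab/MinerU/issues/736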
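
For item 7, one way to see whether a Linux x86 CPU advertises AVX/AVX2 is to inspect the ``flags`` line of ``/proc/cpuinfo``. The sketch below assumes that file layout; `cpu_flags` and `has_avx` are hypothetical helper names and are not part of MinerU.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough: all cores normally report the same set.
            return set(value.split())
    return set()


def has_avx(cpuinfo_text: str) -> bool:
    """Return True when the CPU reports the avx flag."""
    return "avx" in cpu_flags(cpuinfo_text)
```

On a Linux host this could be called as `has_avx(open("/proc/cpuinfo").read())`. A missing flag only indicates that the CPU or a virtualization layer does not expose AVX; it does not say why.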

+ 11 - 0
docs/zh_cn/additional_notes/glossary.rst

@@ -0,0 +1,11 @@
+
+
+Glossary
+===========
+
+1. jsonl 
+    TODO: add description
+
+2. magic-pdf.json
+    TODO: add description
+

+ 13 - 0
docs/zh_cn/additional_notes/known_issues.rst

@@ -0,0 +1,13 @@
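+Known Issues
+============
+
+-  Reading order is determined by a model that sorts readable content by its spatial distribution, so some regions may be out of order in extremely complex layouts
+-  Vertical text is not supported
+-  Tables of contents and lists are recognized by rules, so a few uncommon list formats may not be recognized
+-  Only one heading level is supported; heading hierarchy is not supported yet
+-  Code blocks are not yet supported by the layout model
+-  Comic books, art albums, primary school textbooks, and exercise books cannot be parsed well yet
+-  Table recognition may produce row/column errors on complex tables
+-  OCR may produce inaccurate characters on PDFs in less common languages (for example, diacritics in Latin script or easily confused characters in Arabic script)
+-  Some formulas may not render in markdown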
+

+ 32 - 3
next_docs/zh_cn/conf.py → docs/zh_cn/conf.py

@@ -15,7 +15,8 @@ import subprocess
 import sys
 
 from sphinx.ext import autodoc
-
+from docutils import nodes
+from docutils.parsers.rst import Directive
 
 def install(package):
     subprocess.check_call([sys.executable, '-m', 'pip', 'install', package])
@@ -33,8 +34,8 @@ sys.path.insert(0, os.path.abspath('../..'))
 # -- Project information -----------------------------------------------------
 
 project = 'MinerU'
-copyright = '2024, OpenDataLab'
-author = 'MinerU Contributors'
+copyright = '2024, MinerU Contributors'
+author = 'OpenDataLab'
 
 # The full version, including alpha/beta/rc tags
 version_file = '../../magic_pdf/libs/version.py'
@@ -58,10 +59,20 @@ extensions = [
     'sphinx_copybutton',
     'sphinx.ext.autodoc',
     'sphinx.ext.autosummary',
+    'sphinx.ext.inheritance_diagram',
     'myst_parser',
     'sphinxarg.ext',
+    'sphinxcontrib.autodoc_pydantic',
 ]
 
+# class hierarchy diagram
+inheritance_graph_attrs = dict(rankdir="LR", size='"8.0, 12.0"', fontsize=14, ratio='compress')
+inheritance_node_attrs = dict(shape='ellipse', fontsize=14, height=0.75)
+inheritance_edge_attrs = dict(arrow='vee')
+
+autodoc_pydantic_model_show_json = True
+autodoc_pydantic_model_show_config_summary = False
+
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
 
@@ -120,3 +131,21 @@ class MockedClassDocumenter(autodoc.ClassDocumenter):
 autodoc.ClassDocumenter = MockedClassDocumenter
 
 navigation_with_keys = False
+
+
+# add custom directive 
+
+
+class VideoDirective(Directive):
+    required_arguments = 1
+    optional_arguments = 0
+    final_argument_whitespace = True
+    option_spec = {}
+
+    def run(self):
+        url = self.arguments[0]
+        video_node = nodes.raw('', f'<iframe width="560" height="315" src="{url}" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>', format='html')
+        return [video_node]
+
+def setup(app):
+    app.add_directive('video', VideoDirective)
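
The `VideoDirective` above interpolates the URL directly into the `<iframe>` markup. The sketch below shows the same string construction with attribute escaping added; it is kept independent of docutils so it can run standalone. `build_video_iframe` is a hypothetical helper and is not part of this commit, and the attribute list is shortened for brevity.

```python
import html


def build_video_iframe(url: str, width: int = 560, height: int = 315) -> str:
    """Return iframe markup for url, escaping characters unsafe in an attribute."""
    safe_url = html.escape(url, quote=True)
    return (
        f'<iframe width="{width}" height="{height}" src="{safe_url}" '
        'frameborder="0" allowfullscreen></iframe>'
    )
```

Inside a directive, the returned string could be wrapped in `nodes.raw('', markup, format='html')` in the same way the original `run` method does.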

+ 81 - 0
docs/zh_cn/index.rst

@@ -0,0 +1,81 @@
+.. MinerU documentation master file, created by
+   sphinx-quickstart on Tue Jan  9 16:33:06 2024.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Welcome to the MinerU Documentation
+==============================================
+
+.. figure:: ./_static/image/logo.png
+  :align: center
+  :alt: mineru
+  :class: no-scaled-link
+
+.. raw:: html
+
+   <p style="text-align:center">
+   <strong> A one-stop, high-quality, open-source document extraction tool
+   </strong>
+   </p>
+
+   <p style="text-align:center">
+   <script async defer src="https://buttons.github.io/buttons.js"></script>
+   <a class="github-button" href="https://github.com/opendatalab/MinerU" data-show-count="true" data-size="large" aria-label="Star">Star</a>
+   <a class="github-button" href="https://github.com/opendatalab/MinerU/subscription" data-icon="octicon-eye" data-size="large" aria-label="Watch">Watch</a>
+   <a class="github-button" href="https://github.com/opendatalab/MinerU/fork" data-icon="octicon-repo-forked" data-size="large" aria-label="Fork">Fork</a>
+   </p>
+
+
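+Project Introduction
+--------------------
+
+MinerU is a tool that converts PDFs into machine-readable formats (such as markdown and json), so the content can easily be extracted into any format.
+MinerU was born during the pre-training process of `InternLM <https://github.com/InternLM/InternLM>`__. We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models.
+Compared with well-known commercial products, MinerU is still young. If you run into problems or the results are not as expected, please submit an issue on the `issue <https://github.com/opendatalab/MinerU/issues>`__ page and **attach the relevant PDF**.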
+
+.. video:: https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c
+
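+Key Features
+------------
+
+-  Removes headers, footers, footnotes, page numbers, and similar elements to keep the text semantically coherent
+-  Outputs text in human reading order, suitable for single-column, multi-column, and complex layouts
+-  Preserves the structure of the original document, including headings, paragraphs, and lists
+-  Extracts images, image descriptions, tables, table titles, and footnotes
+-  Automatically recognizes formulas in the document and converts them to LaTeX
+-  Automatically recognizes tables in the document and converts them to LaTeX or HTML
+-  Automatically detects scanned PDFs and garbled PDFs and enables OCR
+-  OCR supports detection and recognition of 84 languages
+-  Supports multiple output formats, such as Markdown for multimodal and NLP use, JSON sorted by reading order, and an information-rich intermediate format
+-  Supports multiple visualization results, including layout visualization and span visualization, for efficient checking of output quality
+-  Supports both CPU and GPU environments
+-  Compatible with Windows, Linux, and Mac platforms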
+
+
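+User Guide
+-------------
+.. toctree::
+   :maxdepth: 2
+   :caption: User Guide
+
+   user_guide
+
+
+API Reference
+-------------
+This chapter describes the details of functions, classes, and class methods.
+
+The API documentation is currently available only in English. Please switch to the English version of the documentation.
+
+
+Appendix
+------------------
+.. toctree::
+   :maxdepth: 1
+   :caption: Appendix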
+
+   additional_notes/known_issues
+   additional_notes/faq
+   additional_notes/glossary
+
+

+ 0 - 0
next_docs/zh_cn/make.bat → docs/zh_cn/make.bat


+ 10 - 0
docs/zh_cn/user_guide.rst

@@ -0,0 +1,10 @@
+
+
+.. toctree::
+    :maxdepth: 2
+
+    user_guide/install
+    user_guide/quick_start
+    user_guide/tutorial
+    user_guide/data
+    

+ 20 - 0
docs/zh_cn/user_guide/data.rst

@@ -0,0 +1,20 @@
+
+
+Data
+=========
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Data
+
+   data/dataset
+
+   data/read_api
+
+   data/data_reader_writer 
+
+   data/io
+
+
+
+

+ 186 - 0
docs/zh_cn/user_guide/data/data_reader_writer.rst

@@ -0,0 +1,186 @@
+
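+Data Reader and Writer Classes
+==============================
+
+These classes read bytes from, or write bytes to, different media. If MinerU does not provide a suitable class, you can implement new classes to meet the needs of your own scenario. Implementing a new class is easy; the only requirement is to inherit from DataReader or DataWriter.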
+
+.. code:: python
+
+    class SomeReader(DataReader):
+        def read(self, path: str) -> bytes:
+            pass
+
+        def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
+            pass
+
+
+    class SomeWriter(DataWriter):
+        def write(self, path: str, data: bytes) -> None:
+            pass
+
+        def write_string(self, path: str, data: str) -> None:
+            pass
+
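+Readers may wonder how io differs from this section. At first glance the two look very similar. io provides the basic functionality, while this section focuses on the application level. Users can build their own classes to meet specific application needs, and those classes may share the same basic IO functionality. That is why io exists.
+
+Important Classes
+-----------------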
+.. code:: python
+
+    class FileBasedDataReader(DataReader):
+        def __init__(self, parent_dir: str = ''):
+            pass
+
+
+    class FileBasedDataWriter(DataWriter):
+        def __init__(self, parent_dir: str = '') -> None:
+            pass
+
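+The class FileBasedDataReader is initialized with a single parameter, parent_dir. This means every method provided by FileBasedDataReader behaves as follows:
+
+#. When reading from an absolute path, parent_dir is ignored.
+#. When reading from a relative path, the path is first joined with parent_dir, and the content is then read from the joined path.
+
+.. note::
+
+    `FileBasedDataWriter` behaves the same way as `FileBasedDataReader`.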
+
+.. code:: python
+
+    class MultiS3Mixin:
+        def __init__(self, default_prefix: str, s3_configs: list[S3Config]):
+            pass
+
+    class MultiBucketS3DataReader(DataReader, MultiS3Mixin):
+        pass
+
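+All read-related methods provided by MultiBucketS3DataReader behave as follows:
+
+#. When reading an object from a full S3-format path, for example s3://test_bucket/test_object, default_prefix is ignored.
+#. When reading an object from a relative path, the path is first joined with default_prefix and the bucket_name is stripped, then the content is read. bucket_name is the first element obtained by splitting default_prefix on the delimiter ``/``.
+
+.. note::
+    MultiBucketS3DataWriter behaves similarly to MultiBucketS3DataReader.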
+
+.. code:: python
+
+    class S3DataReader(MultiBucketS3DataReader):
+        pass
+
+S3DataReader is built on MultiBucketS3DataReader but supports only a single bucket. The same is true of S3DataWriter.
+
+Read Examples
+-------------
+.. code:: python
+
+    # File-based
+    file_based_reader1 = FileBasedDataReader('')
+
+    ## will read the file abc
+    file_based_reader1.read('abc')
+
+    file_based_reader2 = FileBasedDataReader('/tmp')
+
+    ## will read /tmp/abc
+    file_based_reader2.read('abc')
+
+    ## will read /var/logs/message.txt
+    file_based_reader2.read('/var/logs/message.txt')
+
+    # Multi-bucket S3
+    # ak, sk, endpoint_url, ak_2, sk_2 and endpoint_url_2 are placeholders for your own credentials
+    multi_bucket_s3_reader1 = MultiBucketS3DataReader("test_bucket1/test_prefix", [
+        S3Config(
+            bucket_name="test_bucket1", access_key=ak, secret_key=sk, endpoint_url=endpoint_url
+        ),
+        S3Config(
+            bucket_name="test_bucket2",
+            access_key=ak_2,
+            secret_key=sk_2,
+            endpoint_url=endpoint_url_2,
+        )])
+
+    ## will read s3://test_bucket1/test_prefix/abc
+    multi_bucket_s3_reader1.read('abc')
+
+    ## will read s3://test_bucket1/efg
+    multi_bucket_s3_reader1.read('s3://test_bucket1/efg')
+
+    ## will read s3://test_bucket2/abc
+    multi_bucket_s3_reader1.read('s3://test_bucket2/abc')
+
+    # Single-bucket S3
+    s3_reader1 = S3DataReader(
+        default_prefix_without_bucket="test_prefix",
+        bucket="test_bucket",
+        ak="ak",
+        sk="sk",
+        endpoint_url="localhost"
+    )
+
+    ## will read s3://test_bucket/test_prefix/abc
+    s3_reader1.read('abc')
+
+    ## will read s3://test_bucket/efg
+    s3_reader1.read('s3://test_bucket/efg')
+
+Write Examples
+--------------
+.. code:: python
+
+    # File-based
+    file_based_writer1 = FileBasedDataWriter('')
+
+    ## will write 123 to abc
+    file_based_writer1.write('abc', '123'.encode())
+
+    ## will write 123 to abc
+    file_based_writer1.write_string('abc', '123')
+
+    file_based_writer2 = FileBasedDataWriter('/tmp')
+
+    ## will write 123 to /tmp/abc
+    file_based_writer2.write_string('abc', '123')
+
+    ## will write 123 to /var/logs/message.txt
+    file_based_writer2.write_string('/var/logs/message.txt', '123')
+
+    # Multi-bucket S3
+    # ak, sk, endpoint_url, ak_2, sk_2 and endpoint_url_2 are placeholders for your own credentials
+    multi_bucket_s3_writer1 = MultiBucketS3DataWriter("test_bucket1/test_prefix", [
+        S3Config(
+            bucket_name="test_bucket1", access_key=ak, secret_key=sk, endpoint_url=endpoint_url
+        ),
+        S3Config(
+            bucket_name="test_bucket2",
+            access_key=ak_2,
+            secret_key=sk_2,
+            endpoint_url=endpoint_url_2,
+        )])
+
+    ## will write 123 to s3://test_bucket1/test_prefix/abc
+    multi_bucket_s3_writer1.write_string('abc', '123')
+
+    ## will write 123 to s3://test_bucket1/test_prefix/abc
+    multi_bucket_s3_writer1.write('abc', '123'.encode())
+
+    ## will write 123 to s3://test_bucket1/efg
+    multi_bucket_s3_writer1.write('s3://test_bucket1/efg', '123'.encode())
+
+    ## will write 123 to s3://test_bucket2/abc
+    multi_bucket_s3_writer1.write('s3://test_bucket2/abc', '123'.encode())
+
+    # Single-bucket S3
+    s3_writer1 = S3DataWriter(
+        default_prefix_without_bucket="test_prefix",
+        bucket="test_bucket",
+        ak="ak",
+        sk="sk",
+        endpoint_url="localhost"
+    )
+
+    ## will write 123 to s3://test_bucket/test_prefix/abc
+    s3_writer1.write('abc', '123'.encode())
+
+    ## will write 123 to s3://test_bucket/test_prefix/abc
+    s3_writer1.write_string('abc', '123')
+
+    ## will write 123 to s3://test_bucket/efg
+    s3_writer1.write('s3://test_bucket/efg', '123'.encode())
+
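
The path rules described for the file-based and S3 readers can be illustrated without touching any storage. This is a minimal sketch under stated assumptions: `resolve_local_path` and `resolve_s3_path` are hypothetical helpers that mirror the documented behavior, they are not the magic_pdf implementation, and `/` is assumed to be the delimiter used to split default_prefix.

```python
import posixpath


def resolve_local_path(parent_dir: str, path: str) -> str:
    """Absolute paths are used as-is; relative paths are joined with parent_dir."""
    if posixpath.isabs(path):
        return path
    return posixpath.join(parent_dir, path)


def resolve_s3_path(default_prefix: str, path: str) -> tuple:
    """Return (bucket, key) following the rules described for the S3 readers."""
    if path.startswith("s3://"):
        # Full S3 path: default_prefix is ignored.
        full = path[len("s3://"):]
    else:
        # Relative path: join with default_prefix first.
        full = default_prefix.rstrip("/") + "/" + path
    # The bucket is the first "/"-separated element; the rest is the object key.
    bucket, _, key = full.partition("/")
    return bucket, key
```

The expected results line up with the comments in the read and write examples: a relative `abc` under `test_bucket1/test_prefix` resolves to bucket `test_bucket1` and key `test_prefix/abc`.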

+ 31 - 0
docs/zh_cn/user_guide/data/dataset.rst

@@ -0,0 +1,31 @@
+
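+Dataset
+=======
+
+Import Data Classes
+-------------------
+
+Dataset
+^^^^^^^^
+
+Each PDF or image forms one Dataset. PDFs fall into two categories, :ref:`TXT <digital_method_section>` and :ref:`OCR <ocr_method_section>`, described in the parsing method sections. An image yields an ImageDataset, which is a subclass of Dataset; a PDF file yields a PymuDocDataset. The difference between the two is that ImageDataset supports only the OCR parsing method, while PymuDocDataset supports both the OCR and TXT methods.
+
+.. note::
+
+    In practice, some PDFs are generated from images, which means they cannot be parsed with the `TXT` method. Currently, it is up to the user to make sure the `TXT` method is not called on image-generated PDFs.
+
+PDF Parsing Methods
+-------------------
+
+.. _ocr_method_section:
+
+OCR
+^^^^
+Extracts characters using optical character recognition.
+
+.. _digital_method_section:
+
+TXT
+^^^^^^^^
+Extracts characters using a third-party library; we currently use pymupdf.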
+

+ 21 - 0
docs/zh_cn/user_guide/data/io.rst

@@ -0,0 +1,21 @@
+
+
+IO
+====
+
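+These classes read bytes from, or write bytes to, different media. Currently we provide S3Reader and S3Writer for AWS S3 compatible media, and HttpReader and HttpWriter for remote HTTP files. If MinerU does not provide a suitable class, you can implement new classes to meet the needs of your own scenario. Implementing a new class is easy; the only requirement is to inherit from IOReader or IOWriter.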
+
+.. code:: python
+
+    class SomeReader(IOReader):
+        def read(self, path: str) -> bytes:
+            pass
+
+        def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
+            pass
+
+
+    class SomeWriter(IOWriter):
+        def write(self, path: str, data: bytes) -> None:
+            pass
+        
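
To make the interface above concrete, here is a minimal in-memory sketch. `IOReader` and `IOWriter` are redefined locally as stand-ins so the block runs on its own (they are not imported from magic_pdf), `MemoryIO` is a hypothetical class, and treating `limit=-1` as "read to the end" is an assumption rather than documented behavior.

```python
from abc import ABC, abstractmethod


class IOReader(ABC):
    """Local stand-in for the reader interface shown above."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes: ...


class IOWriter(ABC):
    """Local stand-in for the writer interface shown above."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class MemoryIO(IOReader, IOWriter):
    """Keeps objects in a dict keyed by path; useful for tests."""

    def __init__(self) -> None:
        self._store = {}

    def write(self, path: str, data: bytes) -> None:
        self._store[path] = bytes(data)

    def read(self, path: str) -> bytes:
        return self.read_at(path)

    def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
        data = self._store[path]
        # Assumption: limit == -1 means "read from offset to the end".
        end = len(data) if limit == -1 else offset + limit
        return data[offset:end]
```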

+ 48 - 0
docs/zh_cn/user_guide/data/read_api.rst

@@ -0,0 +1,48 @@
+
+
+read_api
+=========
+
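+Read content from a file or directory to create a Dataset. Currently we provide several functions that cover some scenarios. If you have a new scenario that most users would run into, you can post a detailed description on the official GitHub issues page. It is also easy to implement your own read-related functions.
+
+Important Functions
+-------------------
+
+read_jsonl
+^^^^^^^^^^^^^^^^
+
+Read content from a JSONL file on the local machine or on remote S3. To learn more about JSONL, see :doc:`../../additional_notes/glossary`.
+
+.. code:: python
+
+    # read JSONL from the local machine
+    datasets = read_jsonl("tt.jsonl", None)
+
+    # read JSONL from remote S3
+    datasets = read_jsonl("s3://bucket_1/tt.jsonl", s3_reader)
+
+read_local_pdfs
+^^^^^^^^^^^^^^^^
+
+Read PDF files from a path or directory.
+
+.. code:: python
+
+    # read a single PDF path
+    datasets = read_local_pdfs("tt.pdf")
+
+    # read the PDF files under a directory
+    datasets = read_local_pdfs("pdfs/")
+
+read_local_images
+^^^^^^^^^^^^^^^^^^^
+
+Read images from a path or directory.
+
+.. code:: python
+
+    # read from an image path
+    datasets = read_local_images("tt.png")
+
+    # read files from a directory whose names end with a suffix listed in the suffixes array
+    datasets = read_local_images("images/", suffixes=["png", "jpg"])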
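
Writing your own read-related function usually starts with selecting the files to load. The sketch below shows suffix-based selection similar in spirit to the `suffixes` argument above; `list_files_with_suffixes` is a hypothetical helper, and it is only an assumption that the library filters in a comparable way.

```python
from pathlib import Path


def list_files_with_suffixes(directory: str, suffixes: list) -> list:
    """Return files directly under directory whose extension is in suffixes, sorted."""
    wanted = {s.lower().lstrip(".") for s in suffixes}
    return sorted(
        p
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower().lstrip(".") in wanted
    )
```

Each returned path could then be passed to a per-file loader of your choice.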

+ 13 - 0
docs/zh_cn/user_guide/install.rst

@@ -0,0 +1,13 @@
+
+Installation
+==============
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Installation Docs
+
+   install/install
+   install/boost_with_cuda
+   install/download_model_weight_files
+
+

+ 293 - 0
docs/zh_cn/user_guide/install/boost_with_cuda.rst

@@ -0,0 +1,293 @@
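+Using CUDA Acceleration
+=======================
+
+If your device supports CUDA and meets the GPU requirements of the mainline environment, you can use GPU acceleration. Choose the guide that matches your system:
+
+-  :ref:`ubuntu_22_04_lts_section`
+-  :ref:`windows_10_or_11_section`
+
+.. admonition:: Important
+   :class: warning
+
+   Quick deployment with Docker: Docker requires a GPU with at least 16GB of VRAM, and all acceleration features are enabled by default.
+   Before running this Docker container, you can use the following command to check whether your device supports CUDA acceleration on Docker.
+
+   .. code-block:: sh
+
+      docker run --rm --gpus=all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi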
+
+.. code:: sh
+
+   wget https://github.com/opendatalab/MinerU/raw/master/Dockerfile
+   docker build -t mineru:latest .
+   docker run --rm -it --gpus=all mineru:latest /bin/bash
+   magic-pdf --help
+
+
+.. _ubuntu_22_04_lts_section:
+
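+Ubuntu 22.04 LTS
+----------------
+
+1. Check whether the NVIDIA driver is installed
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. code:: sh
+
+   nvidia-smi
+
+If you see output similar to the following, the NVIDIA driver is already installed and you can skip step 2.
+
+Note: ``CUDA Version`` should be >= 12.1. If the displayed version is lower than 12.1, please upgrade the driver.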
+
+.. code:: text
+
+   +---------------------------------------------------------------------------------------+
+   | NVIDIA-SMI 537.34                 Driver Version: 537.34       CUDA Version: 12.2     |
+   |-----------------------------------------+----------------------+----------------------+
+   | GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
+   | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
+   |                                         |                      |               MIG M. |
+   |=========================================+======================+======================|
+   |   0  NVIDIA GeForce RTX 3060 Ti   WDDM  | 00000000:01:00.0  On |                  N/A |
+   |  0%   51C    P8              12W / 200W |   1489MiB /  8192MiB |      5%      Default |
+   |                                         |                      |                  N/A |
+   +-----------------------------------------+----------------------+----------------------+
+
+
+2. Install the driver
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the driver is not installed, use the following commands:
+
+.. code:: sh
+
+   sudo apt-get update
+   sudo apt-get install nvidia-driver-545
+
+Install the proprietary driver and restart the computer after installation.
+
+.. code:: sh
+
+   reboot
+
+3. Install Anaconda
+~~~~~~~~~~~~~~~~~~~~~~
+
+If Anaconda is already installed, skip this step.
+
+.. code:: sh
+
+   wget https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh
+   bash Anaconda3-2024.06-1-Linux-x86_64.sh
+
+Enter ``yes`` at the final step, then close and reopen the terminal.
+
+4. Create an environment with Conda
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Specify Python version 3.10.
+
+.. code:: sh
+
+   conda create -n MinerU python=3.10
+   conda activate MinerU
+
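+5. Install the application
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code:: sh
+
+   pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com
+
+❗ After installation, be sure to check the version of ``magic-pdf`` with the following command:
+
+.. code:: sh
+
+   magic-pdf --version
+
+If the version number is lower than 0.7.0, please report the issue.
+
+6. Download models
+~~~~~~~~~~~~~~~~~~~~~
+
+See :doc:`Download Model Weight Files <download_model_weight_files>` for detailed instructions.
+
+7. Understand the location of the configuration file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+After completing step `6. Download models <#6-download-models>`__, the script automatically generates a ``magic-pdf.json`` file in the user directory and configures the default model path. You can find the ``magic-pdf.json`` file in your user directory.
+
+   The Linux user directory is "/home/username".
+
+8. First run
+~~~~~~~~~~~~~~~
+
+Download a sample file from the repository and test it.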
+
+.. code:: sh
+
+   wget https://github.com/opendatalab/MinerU/raw/master/demo/small_ocr.pdf
+   magic-pdf -p small_ocr.pdf -o ./output
+
+9. Test CUDA acceleration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If your graphics card has at least **8GB** of VRAM, follow these steps to test CUDA acceleration:
+
+1. Modify the value of ``"device-mode"`` in the ``magic-pdf.json`` configuration file located in your user directory.
+
+   .. code:: json
+
+      {
+        "device-mode": "cuda"
+      }
+
+2. Test CUDA acceleration with the following command:
+
+   .. code:: sh
+
+      magic-pdf -p small_ocr.pdf -o ./output
+
+10. Enable CUDA acceleration for OCR
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+1. Download ``paddlepaddle-gpu``. Installation automatically enables OCR acceleration.
+
+   .. code:: sh
+
+      python -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
+
+2. Test OCR acceleration with the following command:
+
+   .. code:: sh
+
+      magic-pdf -p small_ocr.pdf -o ./output
+
+.. _windows_10_or_11_section:
+
+Windows 10/11
+--------------
+
+1. Install CUDA and cuDNN
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Required versions: CUDA 11.8 + cuDNN 8.7.0
+
+-  CUDA 11.8: https://developer.nvidia.com/cuda-11-8-0-download-archive
+-  cuDNN v8.7.0 (released November 28, 2022), for CUDA 11.x:
+   https://developer.nvidia.com/rdp/cudnn-archive
+
+2. Install Anaconda
+~~~~~~~~~~~~~~~~~~~~~~
+
+If Anaconda is already installed, you can skip this step.
+
+Download link: https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Windows-x86_64.exe
+
+3. Create an environment with Conda
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The Python version must be 3.10.
+
+::
+
+   conda create -n MinerU python=3.10
+   conda activate MinerU
+
+4. Install the application
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code:: bash
+
+   pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com
+
+
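+❗️ After installation, verify the version of ``magic-pdf``:
+
+.. code:: bash
+
+      magic-pdf --version
+
+If the version number is lower than 0.7.0, please report it in the issues section.
+
+5. Download models
+~~~~~~~~~~~~~~~~~~~~~
+
+See :doc:`Download Model Weight Files <download_model_weight_files>` for detailed instructions.
+
+6. Understand the location of the configuration file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+After completing step `5. Download models <#5-download-models>`__, the script automatically generates a ``magic-pdf.json`` file in the user directory and configures the default model path. You can find the ``magic-pdf.json`` file in your user directory.
+
+The Windows user directory is "C:/Users/username".
+
+7. First run
+~~~~~~~~~~~~~~~
+
+Download a sample file from the repository and test it.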
+
+.. code:: powershell
+
+     wget https://github.com/opendatalab/MinerU/raw/master/demo/small_ocr.pdf -O small_ocr.pdf
+     magic-pdf -p small_ocr.pdf -o ./output
+
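+8. Test CUDA acceleration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If your graphics card has at least **8GB** of VRAM, follow these steps
+to test CUDA-accelerated parsing performance.
+
+**1. Overwrite the installation of torch and torchvision with versions that support CUDA**
+
+.. code:: bash
+
+   pip install --force-reinstall torch==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cu118
+
+..
+
+   ❗️ Be sure to specify the following versions in the command
+
+   .. code:: bash
+
+      torch==2.3.1 torchvision==0.18.1
+
+   These are the highest versions we support. If no version is specified, a higher version is installed automatically, which prevents the program from running.
+
+**2. Modify the value of "device-mode" in the magic-pdf.json configuration file in your user directory**
+
+.. code:: json
+
+   {
+     "device-mode":"cuda"
+   }
+
+**3. Run the following command to test CUDA acceleration**
+
+.. code:: bash
+
+   magic-pdf -p small_ocr.pdf -o ./output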
+
+..
+
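+   Tip: Whether CUDA acceleration is working can be roughly judged from the time taken by each stage in the log output. Normally, ``layout detection time``
+   and ``mfr time`` should be more than 10 times faster.
+
+9. Enable CUDA acceleration for OCR
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**1. Download paddlepaddle-gpu; OCR acceleration is enabled automatically once installation completes**
+
+.. code:: bash
+
+   pip install paddlepaddle-gpu==2.6.1
+
+**2. Run the following command to test OCR acceleration**
+
+.. code:: bash
+
+   magic-pdf -p small_ocr.pdf -o ./output
+
+..
+
+Tip: Whether CUDA acceleration is working can be roughly judged from the time taken by each stage in the log output. Normally, ``ocr time`` should be more than 10 times faster.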
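
The driver check in the Ubuntu guide compares the ``CUDA Version`` field printed by `nvidia-smi` against 12.1. The sketch below shows that comparison on captured output; the function names are hypothetical, and the regular expression assumes the header format shown in the sample output earlier in this guide.

```python
import re
from typing import Optional, Tuple


def cuda_version_from_nvidia_smi(output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'CUDA Version: X.Y' field, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(output: str, minimum: Tuple[int, int] = (12, 1)) -> bool:
    """Return True when the reported CUDA version is at least `minimum`."""
    version = cuda_version_from_nvidia_smi(output)
    return version is not None and version >= minimum
```

The text passed in could come from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout` on a machine where the driver is installed.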

+ 36 - 0
docs/zh_cn/user_guide/install/download_model_weight_files.rst

@@ -0,0 +1,36 @@
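+Download Model Weight Files
+===========================
+
+Model downloads are divided into the initial download and updating the model directory. Refer to the corresponding section below for instructions.
+
+Initial download of model files
+-------------------------------
+Download models from Hugging Face
+
+
+Use a Python script to download the model files from Hugging Face
+
+.. code:: bash
+
+   pip install huggingface_hub
+   wget https://github.com/opendatalab/MinerU/raw/master/scripts/download_models_hf.py -O download_models_hf.py
+   python download_models_hf.py
+
+The Python script automatically downloads the model files and configures the model directory in the configuration file.
+
+The configuration file can be found in the user directory and is named ``magic-pdf.json``.
+
+How to update previously downloaded models
+------------------------------------------
+
+1. Models downloaded via Git LFS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+   Because some users reported that downloading model files with git lfs was incomplete or corrupted the files, this method is no longer recommended.
+
+If you previously downloaded the model files via git lfs, you can navigate to the previous download directory and use the ``git pull`` command to update the models.
+
+2. Models downloaded via Hugging Face or ModelScope
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you previously downloaded the models via Hugging Face or ModelScope, you can rerun the Python script used for the initial download. This automatically updates the model directory to the latest version.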
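
After the download script has written the configuration file, it can be useful to confirm that the model directory entry is present and absolute. This is a minimal sketch: `check_models_dir` is a hypothetical helper, and it assumes the file is a JSON object with a ``models-dir`` key, as shown in the FAQ.

```python
import json
import os
from pathlib import Path


def check_models_dir(config_path: Path) -> str:
    """Return the configured models-dir, raising ValueError if missing or relative."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    models_dir = config.get("models-dir")
    if not models_dir:
        raise ValueError('"models-dir" is missing from the configuration file')
    if not os.path.isabs(models_dir):
        raise ValueError(f'"models-dir" must be an absolute path, got {models_dir!r}')
    return models_dir
```

A typical call would be `check_models_dir(Path.home() / "magic-pdf.json")`.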

+ 96 - 0
docs/zh_cn/user_guide/install/install.rst

@@ -0,0 +1,96 @@
+
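+Installation
+============
+
+If you encounter any installation issues, please first consult the :doc:`../../additional_notes/faq`. If the parsing results are not as expected, refer to the :doc:`../../additional_notes/known_issues`.
+
+Pre-installation Notice: Hardware and Software Environment Support
+----------------------------------------------------------------------
+
+To ensure the stability and reliability of the project, we only optimize and test for specific hardware and software environments during development. This ensures that users deploying and running the project on recommended system configurations get the best performance with the fewest compatibility issues.
+
+By focusing resources on the mainline environment, our team can resolve potential bugs and develop new features more efficiently.
+
+In non-mainline environments, due to the diversity of hardware and software configurations, as well as third-party dependency compatibility issues, we cannot guarantee 100% project availability. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first. Most issues already have corresponding solutions in the FAQ. We also encourage community feedback to help us gradually expand support.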
+
+
+.. raw:: html
+
+    <style>
+        table, th, td {
+        border: 1px solid black;
+        border-collapse: collapse;
+        }
+    </style>
+    <table>
+        <tr>
+            <td colspan="3" rowspan="2">Operating System</td>
+        </tr>
+        <tr>
+            <td>Ubuntu 22.04 LTS</td>
+            <td>Windows 10 / 11</td>
+            <td>macOS 11+</td>
+        </tr>
+        <tr>
+            <td colspan="3">CPU</td>
+            <td>x86_64 (ARM Linux is not supported yet)</td>
+            <td>x86_64 (ARM Windows is not supported yet)</td>
+            <td>x86_64 / arm64</td>
+        </tr>
+        <tr>
+            <td colspan="3">Memory</td>
+            <td colspan="3">16GB or more, 32GB+ recommended</td>
+        </tr>
+        <tr>
+            <td colspan="3">Python Version</td>
+            <td colspan="3">3.10 (be sure to create a Python 3.10 virtual environment with conda)</td>
+        </tr>
+        <tr>
+            <td colspan="3">Nvidia Driver Version</td>
+            <td>latest (proprietary driver)</td>
+            <td>latest</td>
+            <td>None</td>
+        </tr>
+        <tr>
+            <td colspan="3">CUDA Environment</td>
+            <td>Automatic installation [12.1 (pytorch) + 11.8 (paddle)]</td>
+            <td>11.8 (manual installation) + cuDNN v8.7.0 (manual installation)</td>
+            <td>None</td>
+        </tr>
+        <tr>
+            <td rowspan="2">GPU Hardware Support List</td>
+            <td colspan="2">Minimum requirement: 8G+ VRAM</td>
+            <td colspan="2">3060ti/3070/4060<br>
+            8G VRAM enables layout, formula recognition, and OCR acceleration</td>
+            <td rowspan="2">None</td>
+        </tr>
+        <tr>
+            <td colspan="2">Recommended configuration: 10G+ VRAM</td>
+            <td colspan="2">3080/3080ti/3090/3090ti/4070/4070ti/4070tisuper/4080/4090<br>
+            10G VRAM or more can enable layout, formula recognition, OCR acceleration, and table recognition acceleration simultaneously<br>
+            </td>
+        </tr>
+    </table>
+
+
+Create an environment
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: shell
+
+    conda create -n MinerU python=3.10
+    conda activate MinerU
+    pip install -U "magic-pdf[full]" --extra-index-url https://wheels.myhloli.com
+
+
+Download model weight files
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: shell
+
+    pip install huggingface_hub
+    wget https://github.com/opendatalab/MinerU/raw/master/scripts/download_models_hf.py -O download_models_hf.py
+    python download_models_hf.py
+
+MinerU is now installed. See :doc:`../quick_start` to get started, or read :doc:`boost_with_cuda` to accelerate inference.
+
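The requirements table in install.rst above lists Python 3.10 as the only supported interpreter. As a minimal sketch, a script could check this before importing the package. The helper name `is_supported_python` is hypothetical and not part of magic-pdf.

```python
import sys


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.10, the only version the table lists."""
    # Hypothetical helper: compares only the major and minor components.
    major, minor = version_info[0], version_info[1]
    return (major, minor) == (3, 10)


if __name__ == "__main__":
    if not is_supported_python():
        print("Warning: the install guide lists Python 3.10 only; found",
              sys.version.split()[0])
```

Creating the environment with `conda create -n MinerU python=3.10`, as the guide does, makes this check pass by construction.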

+ 13 - 0
docs/zh_cn/user_guide/quick_start.rst

@@ -0,0 +1,13 @@
+
+Quick Start
+==============
+
+Start here to learn the basics of using MinerU. If you have not installed it yet, follow the installation guide first.
+
+.. toctree::
+    :maxdepth: 1
+    :caption: Quick Start
+
+    quick_start/command_line
+    quick_start/to_markdown
+

+ 61 - 0
docs/zh_cn/user_guide/quick_start/command_line.rst

@@ -0,0 +1,61 @@
+
+
+Command Line
+============
+
+.. code:: bash
+
+   magic-pdf --help
+   Usage: magic-pdf [OPTIONS]
+
+   Options:
+     -v, --version                display the version and exit
+     -p, --path PATH              local pdf filepath or directory  [required]
+     -o, --output-dir PATH        output local directory  [required]
+     -m, --method [ocr|txt|auto]  the method for parsing pdf. ocr: using ocr
+                                  technique to extract information from pdf. txt:
+                                  suitable for the text-based pdf only and
+                                  outperform ocr. auto: automatically choose the
+                                  best method for parsing pdf from ocr and txt.
+                                  without method specified, auto will be used by
+                                  default.
+     -l, --lang TEXT              Input the languages in the pdf (if known) to
+                                  improve OCR accuracy.  Optional. You should
+                                  input "Abbreviation" with language form url: ht
+                                  tps://paddlepaddle.github.io/PaddleOCR/en/ppocr
+                                  /blog/multi_languages.html#5-support-languages-
+                                  and-abbreviations
+     -d, --debug BOOLEAN          Enables detailed debugging information during
+                                  the execution of the CLI commands.
+     -s, --start INTEGER          The starting page for PDF parsing, beginning
+                                  from 0.
+     -e, --end INTEGER            The ending page for PDF parsing, beginning from
+                                  0.
+     --help                       Show this message and exit.
+
+
+   ## show version
+   magic-pdf -v
+
+   ## command line example
+   magic-pdf -p {some_pdf} -o {some_output_dir} -m auto
+
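+``{some_pdf}`` can be a single PDF file or a directory containing multiple PDF files. The parsed results are written to ``{some_output_dir}``. The generated files are listed below:
+
+.. code:: text
+
+   ├── some_pdf.md                          # markdown file
+   ├── images                               # directory for extracted images
+   ├── some_pdf_layout.pdf                  # layout visualization (includes the layout reading order)
+   ├── some_pdf_middle.json                 # MinerU intermediate processing result
+   ├── some_pdf_model.json                  # model inference result
+   ├── some_pdf_origin.pdf                  # original PDF file
+   ├── some_pdf_spans.pdf                   # visualization of the finest-grained bbox positions
+   └── some_pdf_content_list.json           # rich-text JSON in reading order
+
+
+.. admonition:: Tip
+   :class: tip
+
+   For more information about the result files, see :doc:`../tutorial/output_file_description`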
+
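The result-file listing in command_line.rst above can be turned into a quick completeness check after a run. This is a sketch under one assumption: the files sit directly in the directory passed as `result_dir`. Depending on the MinerU version they may be nested in a subdirectory, in which case pass that subdirectory. Both function names are hypothetical.

```python
from pathlib import Path

# Suffixes copied from the result-file listing; "images" is a directory, not a suffix.
RESULT_SUFFIXES = [
    ".md",
    "_layout.pdf",
    "_middle.json",
    "_model.json",
    "_origin.pdf",
    "_spans.pdf",
    "_content_list.json",
]


def expected_outputs(pdf_path: str) -> list[str]:
    """Names the listing says should exist for one input PDF."""
    stem = Path(pdf_path).stem
    return [f"{stem}{suffix}" for suffix in RESULT_SUFFIXES] + ["images"]


def missing_outputs(result_dir: str, pdf_path: str) -> list[str]:
    """Expected names that are absent from result_dir."""
    root = Path(result_dir)
    return [name for name in expected_outputs(pdf_path) if not (root / name).exists()]
```

An empty list from `missing_outputs` means every file in the listing was produced for that PDF.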

+ 53 - 0
docs/zh_cn/user_guide/quick_start/to_markdown.rst

@@ -0,0 +1,53 @@
+
+
+Convert to Markdown
+========================
+
+.. code:: python
+
+    import os
+
+    from magic_pdf.data.data_reader_writer import FileBasedDataWriter, FileBasedDataReader
+    from magic_pdf.libs.MakeContentConfig import DropMode, MakeMode
+    from magic_pdf.pipe.OCRPipe import OCRPipe
+
+
+    ## args
+    model_list = []
+    pdf_file_name = "abc.pdf"  # replace with the real pdf path
+
+
+    ## prepare env
+    local_image_dir, local_md_dir = "output/images", "output"
+    os.makedirs(local_image_dir, exist_ok=True)
+
+    image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(
+        local_md_dir
+    )  # writers for the image and Markdown output directories
+    image_dir = str(os.path.basename(local_image_dir))
+
+    reader1 = FileBasedDataReader("")
+    pdf_bytes = reader1.read(pdf_file_name)   # read the pdf content
+
+
+    pipe = OCRPipe(pdf_bytes, model_list, image_writer)
+
+    pipe.pipe_classify()
+    pipe.pipe_analyze()
+    pipe.pipe_parse()
+
+    pdf_info = pipe.pdf_mid_data["pdf_info"]
+
+
+    md_content = pipe.pipe_mk_markdown(
+        image_dir, drop_mode=DropMode.NONE, md_make_mode=MakeMode.MM_MD
+    )
+
+    if isinstance(md_content, list):
+        md_writer.write_string(f"{pdf_file_name}.md", "\n".join(md_content))
+    else:
+        md_writer.write_string(f"{pdf_file_name}.md", md_content)
+
+
+See :doc:`../data/data_reader_writer` for more **read and write** examples.
+

+ 11 - 0
docs/zh_cn/user_guide/tutorial.rst

@@ -0,0 +1,11 @@
+
+Tutorial
+===========
+
+Let's learn MinerU by building a minimal project.
+
+.. toctree::
+    :maxdepth: 1
+    :caption: Tutorial
+
+    tutorial/output_file_description

+ 394 - 0
docs/zh_cn/user_guide/tutorial/output_file_description.rst

@@ -0,0 +1,394 @@
+
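+Output File Formats
+===================
+
+Besides the Markdown-related files, the ``magic-pdf`` command also
+generates several files that are unrelated to Markdown.
+Each of these files is described below.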
+
+some_pdf_layout.pdf
+~~~~~~~~~~~~~~~~~~~
+
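+The layout of each page consists of one or more boxes.
+The number in the top-left corner of each box is its sequence number. In addition, layout.pdf
+marks different content blocks with different background colors inside the boxes.
+
+.. figure:: ../../_static/image/layout_example.png
+   :alt: layout page example
+
+   layout page example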
+
+some_pdf_spans.pdf
+~~~~~~~~~~~~~~~~~~
+
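+All spans on the page are drawn with outlines whose color depends on the span type.
+This file is useful for quality checks: it makes problems such as missing text or unrecognized interline formulas quick to spot.
+
+.. figure:: ../../_static/image/spans_example.png
+   :alt: span page example
+
+   span page example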
+
+some_pdf_model.json
+~~~~~~~~~~~~~~~~~~~
+
+Structure Definition
+^^^^^^^^^^^^^^^^^^^^
+
+.. code:: python
+
+   from pydantic import BaseModel, Field
+   from enum import IntEnum
+
+   class CategoryType(IntEnum):
+       title = 0               # title
+       plain_text = 1          # body text
+       abandon = 2             # headers, footers, page numbers, and page annotations
+       figure = 3              # figure
+       figure_caption = 4      # figure caption
+       table = 5               # table
+       table_caption = 6       # table caption
+       table_footnote = 7      # table footnote
+       isolate_formula = 8     # interline formula
+       formula_caption = 9     # label of an interline formula
+
+       embedding = 13          # inline formula
+       isolated = 14           # interline formula
+       text = 15               # OCR recognition result
+
+
+   class PageInfo(BaseModel):
+       page_no: int = Field(description="Page index; the first page is 0", ge=0)
+       height: int = Field(description="Page height", gt=0)
+       width: int = Field(description="Page width", ge=0)
+
+   class ObjectInferenceResult(BaseModel):
+       category_id: CategoryType = Field(description="Category", ge=0)
+       poly: list[float] = Field(description="Quadrilateral coordinates: the top-left, top-right, bottom-right, and bottom-left points, in that order")
+       score: float = Field(description="Confidence of the inference result")
+       latex: str | None = Field(description="LaTeX parsing result", default=None)
+       html: str | None = Field(description="HTML parsing result", default=None)
+
+   class PageInferenceResults(BaseModel):
+       # A numeric bound such as ge=0 does not apply to a list field, so none is set here
+       layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results")
+       page_info: PageInfo = Field(description="Page metadata")
+
+
+   # The MinerU inference result is the list of per-page inference results, in page order
+   inference_result: list[PageInferenceResults] = []
+
+The poly coordinates have the format [x0, y0, x1, y1, x2, y2, x3, y3],
+giving the top-left, top-right, bottom-right, and bottom-left points in that order. |poly 坐标示意图|
+
+Example Data
+^^^^^^^^^^^^
+
+.. code:: json
+
+   [
+       {
+           "layout_dets": [
+               {
+                   "category_id": 2,
+                   "poly": [
+                       99.1906967163086,
+                       100.3119125366211,
+                       730.3707885742188,
+                       100.3119125366211,
+                       730.3707885742188,
+                       245.81326293945312,
+                       99.1906967163086,
+                       245.81326293945312
+                   ],
+                   "score": 0.9999997615814209
+               }
+           ],
+           "page_info": {
+               "page_no": 0,
+               "height": 2339,
+               "width": 1654
+           }
+       },
+       {
+           "layout_dets": [
+               {
+                   "category_id": 5,
+                   "poly": [
+                       99.13092803955078,
+                       2210.680419921875,
+                       497.3183898925781,
+                       2210.680419921875,
+                       497.3183898925781,
+                       2264.78076171875,
+                       99.13092803955078,
+                       2264.78076171875
+                   ],
+                   "score": 0.9999997019767761
+               }
+           ],
+           "page_info": {
+               "page_no": 1,
+               "height": 2339,
+               "width": 1654
+           }
+       }
+   ]
+
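The poly layout and the sample ``some_pdf_model.json`` data above can be read with a few lines of standard-library Python. The sketch below collapses each 8-value poly into an axis-aligned ``[x0, y0, x1, y1]`` box and groups detections by page. The function names and the `min_score` threshold are illustrative assumptions, not part of MinerU.

```python
import json


def poly_to_bbox(poly):
    """Collapse [x0, y0, x1, y1, x2, y2, x3, y3] into [xmin, ymin, xmax, ymax]."""
    xs, ys = poly[0::2], poly[1::2]  # even indices are x values, odd indices are y values
    return [min(xs), min(ys), max(xs), max(ys)]


def detections_by_page(inference_result, min_score=0.5):
    """Map page_no -> list of (category_id, bbox), keeping detections with score >= min_score.

    min_score is a hypothetical filter added for illustration.
    """
    pages = {}
    for page in inference_result:
        page_no = page["page_info"]["page_no"]
        pages[page_no] = [
            (det["category_id"], poly_to_bbox(det["poly"]))
            for det in page["layout_dets"]
            if det["score"] >= min_score
        ]
    return pages


# Trimmed copy of the example data shown above.
sample = json.loads("""
[
  {"layout_dets": [{"category_id": 2,
                    "poly": [99.1906967163086, 100.3119125366211,
                             730.3707885742188, 100.3119125366211,
                             730.3707885742188, 245.81326293945312,
                             99.1906967163086, 245.81326293945312],
                    "score": 0.9999997615814209}],
   "page_info": {"page_no": 0, "height": 2339, "width": 1654}},
  {"layout_dets": [{"category_id": 5,
                    "poly": [99.13092803955078, 2210.680419921875,
                             497.3183898925781, 2210.680419921875,
                             497.3183898925781, 2264.78076171875,
                             99.13092803955078, 2264.78076171875],
                    "score": 0.9999997019767761}],
   "page_info": {"page_no": 1, "height": 2339, "width": 1654}}
]
""")
```

For a real run, replace `sample` with `json.load(open("some_pdf_model.json"))`.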
+some_pdf_middle.json
+~~~~~~~~~~~~~~~~~~~~
+
++-----------------+----------------------------------------------------------+
+| Field           | Description                                              |
++=================+==========================================================+
+| pdf_info        | list; each element is a dict holding the parse result    |
+|                 | of one PDF page (see the table below)                    |
++-----------------+----------------------------------------------------------+
+| \_parse_type    | ocr \| txt; identifies the mode used for the             |
+|                 | intermediate state of this parse                         |
++-----------------+----------------------------------------------------------+
+| \_version_name  | string; the version of magic-pdf used for this parse     |
++-----------------+----------------------------------------------------------+
+
+Structure of the **pdf_info** field
+
++---------------------+----------------------------------------------------------+
+| Field               | Description                                              |
++=====================+==========================================================+
+| preproc_blocks      | intermediate result after PDF preprocessing, before      |
+|                     | paragraph segmentation                                   |
++---------------------+----------------------------------------------------------+
+| layout_bboxes       | layout segmentation result: layout direction (vertical   |
+|                     | or horizontal) and bbox, sorted in reading order         |
++---------------------+----------------------------------------------------------+
+| page_idx            | page number, starting from 0                             |
++---------------------+----------------------------------------------------------+
+| page_size           | width and height of the page                             |
++---------------------+----------------------------------------------------------+
+| \_layout_tree       | layout tree structure                                    |
++---------------------+----------------------------------------------------------+
+| images              | list; each element is a dict representing an img_block   |
++---------------------+----------------------------------------------------------+
+| tables              | list; each element is a dict representing a table_block  |
++---------------------+----------------------------------------------------------+
+| interline_equations | list; each element is a dict representing an             |
+|                     | interline_equation_block                                 |
++---------------------+----------------------------------------------------------+
+| discarded_blocks    | list; block information that the model marked to be      |
+|                     | dropped                                                  |
++---------------------+----------------------------------------------------------+
+| para_blocks         | result of splitting preproc_blocks into paragraphs       |
++---------------------+----------------------------------------------------------+
+
+``para_blocks`` in the table above
+is an array of dicts. Each dict is a block structure, and a block supports at most one level of nesting.
+
+**block**
+
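+The outer block is called a first-level block. Its fields are:
+
+====== ============================================================
+Field  Description
+====== ============================================================
+type   block type (table|image)
+bbox   bounding-box coordinates of the block
+blocks list; each element is a second-level block in dict form
+====== ============================================================
+
+First-level blocks have only two types, "table" and "image"; all other blocks are second-level blocks.
+
+The fields of a second-level block are: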
+
++-------+----------------------------------------------------------+
+| Field | Description                                              |
++=======+==========================================================+
+| type  | block type                                               |
++-------+----------------------------------------------------------+
+| bbox  | bounding-box coordinates of the block                    |
++-------+----------------------------------------------------------+
+| lines | list; each element is a dict representing a line, which  |
+|       | describes how one line is composed                       |
++-------+----------------------------------------------------------+
+
+Second-level block types in detail
+
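+================== ==============================
+type               desc
+================== ==============================
+image_body         image body
+image_caption      image caption text
+image_footnote     image footnote
+table_body         table body
+table_caption      table caption text
+table_footnote     table footnote
+text               text block
+title              title block
+index              table-of-contents block
+list               list block
+interline_equation interline formula block
+================== ==============================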
+
+**line**
+
+The fields of a line are as follows:
+
++-------+----------------------------------------------------------+
+| Field | Description                                              |
++=======+==========================================================+
+| bbox  | bounding-box coordinates of the line                     |
++-------+----------------------------------------------------------+
+| spans | list; each element is a dict representing a span, which  |
+|       | describes one smallest building unit                     |
++-------+----------------------------------------------------------+
+
+**span**
+
++------------+---------------------------------------------------------+
+| Field      | Description                                             |
++============+=========================================================+
+| bbox       | bounding-box coordinates of the span                    |
++------------+---------------------------------------------------------+
+| type       | type of the span                                        |
++------------+---------------------------------------------------------+
+| content \| | text spans use content; image/table spans use img_path. |
+| img_path   | Holds the actual text or the screenshot path.           |
++------------+---------------------------------------------------------+
+
+A span has one of the following types:
+
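+================== ==================
+type               desc
+================== ==================
+image              image
+table              table
+text               text
+inline_equation    inline formula
+interline_equation interline formula
+================== ==================
+
+**Summary**
+
+A span is the smallest storage unit for all elements.
+
+The elements stored in para_blocks are block information.
+
+The block structure is:
+
+first-level block (if any) -> second-level block -> line -> span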
+
+.. _示例数据-1:
+
+Example Data
+^^^^^^^^^^^^
+
+.. code:: json
+
+   {
+       "pdf_info": [
+           {
+               "preproc_blocks": [
+                   {
+                       "type": "text",
+                       "bbox": [
+                           52,
+                           61.956024169921875,
+                           294,
+                           82.99800872802734
+                       ],
+                       "lines": [
+                           {
+                               "bbox": [
+                                   52,
+                                   61.956024169921875,
+                                   294,
+                                   72.0000228881836
+                               ],
+                               "spans": [
+                                   {
+                                       "bbox": [
+                                           54.0,
+                                           61.956024169921875,
+                                           296.2261657714844,
+                                           72.0000228881836
+                                       ],
+                                       "content": "dependent on the service headway and the reliability of the departure ",
+                                       "type": "text",
+                                       "score": 1.0
+                                   }
+                               ]
+                           }
+                       ]
+                   }
+               ],
+               "layout_bboxes": [
+                   {
+                       "layout_bbox": [
+                           52,
+                           61,
+                           294,
+                           731
+                       ],
+                       "layout_label": "V",
+                       "sub_layout": []
+                   }
+               ],
+               "page_idx": 0,
+               "page_size": [
+                   612.0,
+                   792.0
+               ],
+               "_layout_tree": [],
+               "images": [],
+               "tables": [],
+               "interline_equations": [],
+               "discarded_blocks": [],
+               "para_blocks": [
+                   {
+                       "type": "text",
+                       "bbox": [
+                           52,
+                           61.956024169921875,
+                           294,
+                           82.99800872802734
+                       ],
+                       "lines": [
+                           {
+                               "bbox": [
+                                   52,
+                                   61.956024169921875,
+                                   294,
+                                   72.0000228881836
+                               ],
+                               "spans": [
+                                   {
+                                       "bbox": [
+                                           54.0,
+                                           61.956024169921875,
+                                           296.2261657714844,
+                                           72.0000228881836
+                                       ],
+                                       "content": "dependent on the service headway and the reliability of the departure ",
+                                       "type": "text",
+                                       "score": 1.0
+                                   }
+                               ]
+                           }
+                       ]
+                   }
+               ]
+           }
+       ],
+       "_parse_type": "txt",
+       "_version_name": "0.6.1"
+   }
+
+.. |poly 坐标示意图| image:: ../../_static/image/poly.png
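The nesting described in output_file_description.rst above (first-level block, if any, then second-level block, line, span) can be walked with a short recursive helper. The sketch below collects the text of each entry in ``para_blocks`` from a parsed ``some_pdf_middle.json``. The helper names are hypothetical, and the sample is a trimmed copy of the example data.

```python
def iter_spans(block):
    """Yield every span of a block, descending through nested "blocks" when present."""
    for sub_block in block.get("blocks", []):  # first-level table/image blocks nest here
        yield from iter_spans(sub_block)
    for line in block.get("lines", []):        # second-level blocks hold lines directly
        yield from line.get("spans", [])


def page_text(page_info):
    """Return one string per para_block, joining the "content" of its text-like spans."""
    parts = []
    for block in page_info.get("para_blocks", []):
        # Spans that carry img_path instead of content (images, tables) are skipped.
        text = "".join(span["content"] for span in iter_spans(block) if "content" in span)
        if text:
            parts.append(text)
    return parts


# Trimmed copy of the middle.json example above.
middle = {
    "pdf_info": [
        {
            "page_idx": 0,
            "para_blocks": [
                {
                    "type": "text",
                    "bbox": [52, 61.956024169921875, 294, 82.99800872802734],
                    "lines": [
                        {
                            "bbox": [52, 61.956024169921875, 294, 72.0000228881836],
                            "spans": [
                                {
                                    "bbox": [54.0, 61.956024169921875,
                                             296.2261657714844, 72.0000228881836],
                                    "content": "dependent on the service headway and the reliability of the departure ",
                                    "type": "text",
                                    "score": 1.0,
                                }
                            ],
                        }
                    ],
                }
            ],
        }
    ],
    "_parse_type": "txt",
    "_version_name": "0.6.1",
}
```

Calling `page_text(page)` for each `page` in `middle["pdf_info"]` gives the paragraph text in the stored order.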

+ 0 - 26
next_docs/en/additional_notes/changelog.rst

@@ -1,26 +0,0 @@
-
-
-Changelog
-=========
-
--  2024/09/27 Version 0.8.1 released, Fixed some bugs, and providing a
-   `localized deployment version <projects/web_demo/README.md>`__ of the
-   `online
-   demo <https://opendatalab.com/OpenSourceTools/Extractor/PDF/>`__ and
-   the `front-end interface <projects/web/README.md>`__.
--  2024/09/09: Version 0.8.0 released, supporting fast deployment with
-   Dockerfile, and launching demos on Huggingface and Modelscope.
--  2024/08/30: Version 0.7.1 released, add paddle tablemaster table
-   recognition option
--  2024/08/09: Version 0.7.0b1 released, simplified installation
-   process, added table recognition functionality
--  2024/08/01: Version 0.6.2b1 released, optimized dependency conflict
-   issues and installation documentation
--  2024/07/05: Initial open-source release
-
-
-.. warning::
-
-   fix ``localized deployment version`` and ``front-end interface``
-
-

+ 0 - 19
next_docs/en/additional_notes/known_issues.rst

@@ -1,19 +0,0 @@
-Known Issues
-============
-
--  Reading order is based on the model’s sorting of text distribution in
-   space, which may become disordered under extremely complex layouts.
--  Vertical text is not supported.
--  Tables of contents and lists are recognized through rules; a few
-   uncommon list formats may not be identified.
--  Only one level of headings is supported; hierarchical heading levels
-   are currently not supported.
--  Code blocks are not yet supported in the layout model.
--  Comic books, art books, elementary school textbooks, and exercise
-   books are not well-parsed yet
--  Enabling OCR may produce better results in PDFs with a high density
-   of formulas
--  If you are processing PDFs with a large number of formulas, it is
-   strongly recommended to enable the OCR function. When using PyMuPDF
-   to extract text, overlapping text lines can occur, leading to
-   inaccurate formula insertion positions.

+ 0 - 1
next_docs/en/api/utils.rst

@@ -1 +0,0 @@
-

+ 0 - 13
next_docs/en/projects.rst

@@ -1,13 +0,0 @@
-
-
-
-llama_index_rag 
-===============
-
-
-gradio_app
-============
-
-
-other projects
-===============

+ 0 - 107
next_docs/en/user_guide/install/install.rst

@@ -1,107 +0,0 @@
-
-Install 
-===============================================================
-If you encounter any installation issues, please first consult the FAQ.
-If the parsing results are not as expected, refer to the Known Issues.
-There are three different ways to experience MinerU
-
-Pre-installation Notice—Hardware and Software Environment Support
-------------------------------------------------------------------
-
-To ensure the stability and reliability of the project, we only optimize
-and test for specific hardware and software environments during
-development. This ensures that users deploying and running the project
-on recommended system configurations will get the best performance with
-the fewest compatibility issues.
-
-By focusing resources on the mainline environment, our team can more
-efficiently resolve potential bugs and develop new features.
-
-In non-mainline environments, due to the diversity of hardware and
-software configurations, as well as third-party dependency compatibility
-issues, we cannot guarantee 100% project availability. Therefore, for
-users who wish to use this project in non-recommended environments, we
-suggest carefully reading the documentation and FAQ first. Most issues
-already have corresponding solutions in the FAQ. We also encourage
-community feedback to help us gradually expand support.
-
-.. raw:: html
-
-   <style>
-      table, th, td {
-      border: 1px solid black;
-      border-collapse: collapse;
-      }
-   </style>
-   <table>
-    <tr>
-        <td colspan="3" rowspan="2">Operating System</td>
-    </tr>
-    <tr>
-        <td>Ubuntu 22.04 LTS</td>
-        <td>Windows 10 / 11</td>
-        <td>macOS 11+</td>
-    </tr>
-    <tr>
-        <td colspan="3">CPU</td>
-        <td>x86_64</td>
-        <td>x86_64</td>
-        <td>x86_64 / arm64</td>
-    </tr>
-    <tr>
-        <td colspan="3">Memory</td>
-        <td colspan="3">16GB or more, recommended 32GB+</td>
-    </tr>
-    <tr>
-        <td colspan="3">Python Version</td>
-        <td colspan="3">3.10</td>
-    </tr>
-    <tr>
-        <td colspan="3">Nvidia Driver Version</td>
-        <td>latest (Proprietary Driver)</td>
-        <td>latest</td>
-        <td>None</td>
-    </tr>
-    <tr>
-        <td colspan="3">CUDA Environment</td>
-        <td>Automatic installation [12.1 (pytorch) + 11.8 (paddle)]</td>
-        <td>11.8 (manual installation) + cuDNN v8.7.0 (manual installation)</td>
-        <td>None</td>
-    </tr>
-    <tr>
-        <td rowspan="2">GPU Hardware Support List</td>
-        <td colspan="2">Minimum Requirement 8G+ VRAM</td>
-        <td colspan="2">3060ti/3070/3080/3080ti/4060/4070/4070ti<br>
-        8G VRAM enables layout, formula recognition acceleration and OCR acceleration</td>
-        <td rowspan="2">None</td>
-    </tr>
-    <tr>
-        <td colspan="2">Recommended Configuration 16G+ VRAM</td>
-        <td colspan="2">3090/3090ti/4070ti super/4080/4090<br>
-        16G VRAM or more can enable layout, formula recognition, OCR acceleration and table recognition acceleration simultaneously
-        </td>
-    </tr>
-   </table>
-
-
-Create an environment
-~~~~~~~~~~~~~~~~~~~~~
-
-.. code-block:: shell
-
-    conda create -n MinerU python=3.10
-    conda activate MinerU
-    pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com
-
-
-Download model weight files
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. code-block:: shell
-
-    pip install huggingface_hub
-    wget https://github.com/opendatalab/MinerU/raw/master/scripts/download_models_hf.py -O download_models_hf.py
-    python download_models_hf.py    
-
-
-The MinerU is installed, Check out :doc:`../quick_start` or reading :doc:`boost_with_cuda` for accelerate inference

+ 0 - 10
next_docs/en/user_guide/quick_start/extract_text.rst

@@ -1,10 +0,0 @@
-
-
-Extract Content from Pdf
-========================
-
-.. code:: python
-
-    from magic_pdf.data.read_api import read_local_pdfs
-    from magic_pdf.pdf_parse_union_core_v2 import pdf_parse_union
-    from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze

+ 0 - 26
next_docs/zh_cn/index.rst

@@ -1,26 +0,0 @@
-.. xtuner documentation master file, created by
-   sphinx-quickstart on Tue Jan  9 16:33:06 2024.
-   You can adapt this file completely to your liking, but it should at least
-   contain the root `toctree` directive.
-
-欢迎来到 MinerU 的中文文档
-==============================================
-
-.. figure:: ./_static/image/logo.png
-  :align: center
-  :alt: mineru
-  :class: no-scaled-link
-
-.. raw:: html
-
-   <p style="text-align:center">
-   <strong> 一站式开源高质量数据提取工具
-   </strong>
-   </p>
-
-   <p style="text-align:center">
-   <script async defer src="https://buttons.github.io/buttons.js"></script>
-   <a class="github-button" href="https://github.com/opendatalab/MinerU" data-show-count="true" data-size="large" aria-label="Star">Star</a>
-   <a class="github-button" href="https://github.com/opendatalab/MinerU/subscription" data-icon="octicon-eye" data-size="large" aria-label="Watch">Watch</a>
-   <a class="github-button" href="https://github.com/opendatalab/MinerU/fork" data-icon="octicon-repo-forked" data-size="large" aria-label="Fork">Fork</a>
-   </p>

+ 0 - 0
docs/FAQ_en_us.md → old_docs/FAQ_en_us.md


+ 0 - 0
docs/FAQ_zh_cn.md → old_docs/FAQ_zh_cn.md


+ 0 - 0
docs/README_Ubuntu_CUDA_Acceleration_en_US.md → old_docs/README_Ubuntu_CUDA_Acceleration_en_US.md


+ 0 - 0
docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md → old_docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md


+ 0 - 0
docs/README_Windows_CUDA_Acceleration_en_US.md → old_docs/README_Windows_CUDA_Acceleration_en_US.md


+ 0 - 0
docs/README_Windows_CUDA_Acceleration_zh_CN.md → old_docs/README_Windows_CUDA_Acceleration_zh_CN.md


+ 0 - 0
docs/chemical_knowledge_introduction/introduction.pdf → old_docs/chemical_knowledge_introduction/introduction.pdf


+ 0 - 0
docs/chemical_knowledge_introduction/introduction.xmind → old_docs/chemical_knowledge_introduction/introduction.xmind


+ 0 - 0
docs/download_models.py → old_docs/download_models.py


+ 0 - 0
docs/download_models_hf.py → old_docs/download_models_hf.py


+ 0 - 0
docs/how_to_download_models_en.md → old_docs/how_to_download_models_en.md


Some files are not shown because too many files changed in this diff