
docs: update URLs to gitee for Windows CUDA acceleration guides

Update the URLs for downloading the `magic-pdf.template.json` and `small_ocr.pdf`
files in the Windows CUDA acceleration guides. The links now point to the Gitee repository instead of GitHub, ensuring users have access to the necessary files
from the correct source.
myhloli 1 year ago
Parent
Current commit
d3e42e0845

+ 1 - 1
docs/README_Ubuntu_CUDA_Acceleration_en_US.md

@@ -57,7 +57,7 @@
    If the version number is less than 0.6.2, please report the issue.
 
 ### 6. Download Models
-   Refer to detailed instructions on how to download model files.
+   Refer to detailed instructions on [how to download model files](how_to_download_models_en.md).  
    After downloading, move the `models` directory to an SSD with more space.
    
 ❗ After downloading the models, ensure they are complete:
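
One way to script the completeness check is to compare SHA256 digests, as the Windows guide in this commit suggests. The following is a minimal sketch, not part of the project: the helper names `sha256_of` and `verify` are hypothetical, and the expected digest must come from whatever source publishes the model files.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large model files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `verify(Path("models/<file>"), "<published digest>")` returns `False` for a truncated or corrupted download; both arguments here are placeholders.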

+ 103 - 0
docs/README_Windows_CUDA_Acceleration_en_US.md

@@ -0,0 +1,103 @@
+# Windows 10/11
+
+### 1. Install CUDA and cuDNN
+Required versions: CUDA 11.8 + cuDNN 8.7.0
+   - CUDA 11.8: https://developer.nvidia.com/cuda-11-8-0-download-archive
+   - cuDNN v8.7.0 (November 28th, 2022), for CUDA 11.x: https://developer.nvidia.com/rdp/cudnn-archive
+   
+### 2. Install Anaconda
+   If Anaconda is already installed, you can skip this step.
+   Download link: https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Windows-x86_64.exe
+
+### 3. Create an Environment Using Conda
+   Python version must be 3.10.
+   ```
+   conda create -n MinerU python=3.10
+   conda activate MinerU
+   ```
+
+### 4. Install Applications
+   ```
+   pip install magic-pdf[full]==0.6.2b1 detectron2 --extra-index-url https://wheels.myhloli.com
+   ```
+   >❗️After installation, verify the version of `magic-pdf`:
+   >  ```bash
+   >  magic-pdf --version
+   >  ```
+   > If the version number is less than 0.6.2, please report it in the issues section.
+   
+### 5. Download Models
+   Refer to detailed instructions on [how to download model files](how_to_download_models_en.md).  
+   After downloading, move the `models` directory to an SSD with more space.
+   
+   >❗ After downloading the models, ensure they are complete:
+   >- Check that the file sizes match the description on the website.
+   >- If possible, verify the integrity using SHA256.
+
+### 6. Configuration Before the First Run
+   Obtain the configuration template file `magic-pdf.template.json` from the repository root directory.
+    
+   >❗️Execute the following command to copy the configuration file to your user directory, or the program will not run.
+   >   
+   > On Windows, the user directory is "C:\Users\username"
+   
+   ```
+     (New-Object System.Net.WebClient).DownloadFile('https://github.com/opendatalab/MinerU/raw/master/magic-pdf.template.json', 'magic-pdf.template.json')
+     cp magic-pdf.template.json ~/magic-pdf.json
+   ```
+
+   Find the `magic-pdf.json` file in your user directory and configure `"models-dir"` to point to the directory where the model weights from step 5 were downloaded.
+   
+   > ❗️Ensure the absolute path of the model weights directory is correctly configured, or the program will fail to run due to not finding the model files.
+   >    
+   > On Windows, this path should include the drive letter, and all backslashes (`"\"`) should be replaced with forward slashes (`"/"`).
+   >   
+   > Example: If the models are placed in the root directory of drive D, the value for `models-dir` should be `"D:/models"`.
+   
+   ```
+   {
+     "models-dir": "D:/models"
+   }
+   ```
+
+### 7. First Run
+   Download a sample file from the repository and test it.
+   ```
+     (New-Object System.Net.WebClient).DownloadFile('https://github.com/opendatalab/MinerU/raw/master/demo/small_ocr.pdf', 'small_ocr.pdf')
+     magic-pdf pdf-command --pdf small_ocr.pdf
+   ```
+
+### 8. Test CUDA Acceleration
+   If your graphics card has at least 8GB of VRAM, follow these steps to test CUDA-accelerated parsing performance.
+   1. **Overwrite the installed torch and torchvision** with CUDA-enabled builds.
+      ```
+      pip install --force-reinstall torch==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cu118
+      ```
+      >❗️Ensure the following versions are specified in the command:
+      >```
+      > torch==2.3.1 torchvision==0.18.1
+      >```
+      >These are the highest versions we support. If the versions are not pinned, pip may install a newer release, which will cause the program to fail.
+   2. **Modify the value of `"device-mode"`** in the `magic-pdf.json` configuration file located in your user directory.
+     
+      ```json
+      {
+        "device-mode": "cuda"
+      }
+      ```
+   3. **Run the following command to test CUDA acceleration**:
+
+      ```
+      magic-pdf pdf-command --pdf small_ocr.pdf
+      ```
+
+### 9. Enable CUDA Acceleration for OCR
+   >❗️This operation requires at least 16GB of VRAM on your graphics card; with less, the program may crash or slow down.
+   1. **Install paddlepaddle-gpu**, which automatically enables OCR acceleration once installed.
+      ```
+      pip install paddlepaddle-gpu==2.6.1
+      ```
+   2. **Run the following command to test OCR acceleration**:
+      ```
+      magic-pdf pdf-command --pdf small_ocr.pdf
+      ```
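
The path rule in step 6 (drive letter kept, backslashes replaced with forward slashes) can be applied programmatically instead of by hand. The sketch below is an illustration only: the helper names `to_json_safe_path` and `set_models_dir` are hypothetical, and it assumes the config is a flat JSON object with a `"models-dir"` key, as shown in the guide.

```python
import json
from pathlib import Path, PureWindowsPath


def to_json_safe_path(windows_path: str) -> str:
    """Convert a Windows path to forward slashes, keeping the drive letter."""
    return PureWindowsPath(windows_path).as_posix()


def set_models_dir(config_file: Path, models_dir: str) -> dict:
    """Store models_dir under "models-dir", preserving any other keys."""
    config = {}
    if config_file.exists():
        config = json.loads(config_file.read_text(encoding="utf-8"))
    config["models-dir"] = to_json_safe_path(models_dir)
    config_file.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

For the guide's example, `set_models_dir(Path.home() / "magic-pdf.json", r"D:\models")` would write `"D:/models"`. Using forward slashes avoids having to escape each backslash inside the JSON string.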

+ 3 - 3
docs/README_Windows_CUDA_Acceleration_zh_CN.md

@@ -42,14 +42,14 @@ pip install magic-pdf[full]==0.6.2b1 detectron2 --extra-index-url https://wheels
 >  
 > The Windows user directory is "C:\Users\username"
 ```powershell
-(New-Object System.Net.WebClient).DownloadFile('https://github.com/opendatalab/MinerU/raw/master/magic-pdf.template.json', 'magic-pdf.template.json')
+(New-Object System.Net.WebClient).DownloadFile('https://gitee.com/myhloli/MinerU/raw/master/magic-pdf.template.json', 'magic-pdf.template.json')
 cp magic-pdf.template.json ~/magic-pdf.json
 ```
 
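 Find the magic-pdf.json file in your user directory and set "models-dir" to the directory containing the model weights downloaded in [5. Download Models](#5-下载模型).
 > ❗️Be sure to configure the absolute path of the model weights directory correctly; otherwise the program will fail to run because it cannot find the model files.
 > 
-> On Windows, this path should include the drive letter, and every "\" in the path must be replaced with "/"; otherwise escaping will cause a syntax error in the JSON file.
+> On Windows, this path should include the drive letter, and every `"\"` in the path must be replaced with `"/"`; otherwise escaping will cause a syntax error in the JSON file.
 > 
 > Example: if the models are in the models directory at the root of drive D, the value of models-dir should be "D:/models"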
 ```json
@@ -61,7 +61,7 @@ cp magic-pdf.template.json ~/magic-pdf.json
 ## 7. First Run
 Download a sample file from the repository and test it.
 ```powershell
-(New-Object System.Net.WebClient).DownloadFile('https://github.com/opendatalab/MinerU/raw/master/demo/small_ocr.pdf', 'small_ocr.pdf')
+(New-Object System.Net.WebClient).DownloadFile('https://gitee.com/myhloli/MinerU/raw/master/demo/small_ocr.pdf', 'small_ocr.pdf')
 magic-pdf pdf-command --pdf small_ocr.pdf
 ```
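
This commit swaps the GitHub download URLs for Gitee ones. For readers who can reach only one of the two hosts, a download helper could try both in turn. The sketch below is an assumption-laden illustration: the base URLs are the two that appear in this diff, while the ordering (Gitee first) and the names `MIRRORS`, `candidate_urls`, and `download` are hypothetical.

```python
import urllib.request

# Both base URLs appear in this commit; trying Gitee first is an assumption.
MIRRORS = [
    "https://gitee.com/myhloli/MinerU/raw/master/",
    "https://github.com/opendatalab/MinerU/raw/master/",
]


def candidate_urls(relative_path: str) -> list[str]:
    """Build one URL per mirror for a repository-relative file path."""
    return [base + relative_path.lstrip("/") for base in MIRRORS]


def download(relative_path: str, destination: str) -> str:
    """Try each mirror in order and return the URL that succeeded."""
    last_error = None
    for url in candidate_urls(relative_path):
        try:
            urllib.request.urlretrieve(url, destination)
            return url
        except OSError as error:  # URLError and HTTPError are OSError subclasses
            last_error = error
    raise RuntimeError(f"all mirrors failed for {relative_path}") from last_error
```

For example, `download("demo/small_ocr.pdf", "small_ocr.pdf")` would fetch the sample file used in step 7 from the first reachable host.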