You need to install a CUDA version that is compatible with torch's requirements. Currently, torch supports CUDA 11.8/12.4/12.6.
If Anaconda is already installed, you can skip this step.
Download link: https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Windows-x86_64.exe
```
conda create -n mineru "python<3.13" -y
conda activate mineru
```

```
pip install -U magic-pdf[full]
```
> [!IMPORTANT]
> After installation, verify the version of `magic-pdf`:
>
> ```
> magic-pdf --version
> ```
>
> If the version number is less than 1.3.0, please report it in the issues section.

### 5. Download Models

Refer to detailed instructions on [how to download model files](how_to_download_models_en.md).

### 6. Understand the Location of the Configuration File

After completing the [5. Download Models](#5-download-models) step, the script will automatically generate a `magic-pdf.json` file in the user directory and configure the default model path.
You can find the `magic-pdf.json` file in your user directory.

> [!TIP]
> The user directory for Windows is "C:/Users/username".

### 7. First Run

Download a sample file from the repository and test it.

```powershell
wget https://github.com/opendatalab/MinerU/raw/master/demo/pdfs/small_ocr.pdf -O small_ocr.pdf
magic-pdf -p small_ocr.pdf -o ./output
```
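
If the first run fails, it can help to confirm the two prerequisites from the earlier steps: that the installed `magic-pdf` is at least 1.3.0, and that `magic-pdf.json` exists in the user directory. The Python sketch below is one way to check both. It assumes the package is registered under the distribution name `magic-pdf` (as in the `pip install` command) and that the version string starts with plain numeric components. The helper names `parse_version` and `preflight` are illustrative and not part of magic-pdf.

```python
from importlib import metadata
from pathlib import Path

MIN_VERSION = (1, 3, 0)  # minimum version named in the installation note


def parse_version(text):
    """Turn '1.3.10' into (1, 3, 10); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad so that '1.3' compares equal to '1.3.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def preflight(home=None):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    home = Path(home) if home is not None else Path.home()

    try:
        installed = metadata.version("magic-pdf")  # assumed distribution name
    except metadata.PackageNotFoundError:
        problems.append("magic-pdf is not installed in this environment")
    else:
        if parse_version(installed) < MIN_VERSION:
            problems.append(f"magic-pdf {installed} is older than 1.3.0")

    # The text places magic-pdf.json directly in the user directory.
    config = home / "magic-pdf.json"
    if not config.is_file():
        problems.append(f"{config} not found; run the model download step first")
    return problems


if __name__ == "__main__":
    for line in preflight() or ["version and config file look fine"]:
        print(line)
```

Run it inside the activated `mineru` environment so that it inspects the same Python installation that `magic-pdf` uses.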
### 8. Test CUDA Acceleration

If your graphics card has at least 6GB of VRAM, follow these steps to test CUDA-accelerated parsing performance.

1. **Overwrite the installed torch and torchvision** with builds that support CUDA. (Select the appropriate index-url based on your CUDA version. For more details, refer to the [PyTorch official website](https://pytorch.org/get-started/locally/).)

   ```
   pip install --force-reinstall torch==2.6.0 torchvision==0.21.0 "numpy<2.0.0" --index-url https://download.pytorch.org/whl/cu124
   ```

2. **Modify the value of `"device-mode"`** in the `magic-pdf.json` configuration file located in your user directory.

   ```json
   {
     "device-mode": "cuda"
   }
   ```

3. **Run the following command to test CUDA acceleration**:

   ```
   magic-pdf -p small_ocr.pdf -o ./output
   ```
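
Step 2 can also be scripted, and it is useful to confirm that torch sees the GPU before running step 3. The Python sketch below updates only the `"device-mode"` key in `magic-pdf.json`, leaving every other setting untouched, and reports the CUDA device that torch detects along with its memory. `set_device_mode` and `cuda_status` are hypothetical helper names, not part of magic-pdf. The default path assumes the file sits directly in the user directory, as described above.

```python
import json
from pathlib import Path


def set_device_mode(mode, config_path=None):
    """Set "device-mode" in magic-pdf.json, keeping all other keys."""
    # Assumed default location: magic-pdf.json in the user directory.
    path = Path(config_path) if config_path else Path.home() / "magic-pdf.json"
    config = json.loads(path.read_text(encoding="utf-8"))
    config["device-mode"] = mode
    path.write_text(
        json.dumps(config, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return config


def cuda_status():
    """Describe whether the installed torch can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "torch is installed but CUDA is not available"
    props = torch.cuda.get_device_properties(0)
    # total_memory is in bytes; the reported value can be slightly below
    # the nominal card size.
    vram_gb = props.total_memory / 1024**3
    return f"{props.name}: {vram_gb:.1f} GB VRAM"


if __name__ == "__main__":
    print(cuda_status())
    # Uncomment to switch the config to CUDA once the check above looks good:
    # set_device_mode("cuda")
```

If the script reports that CUDA is not available, recheck the index-url used in step 1 against your installed CUDA version.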