# Advanced Command Line Parameters
## SGLang Acceleration Parameter Optimization

### Memory Optimization Parameters
> - If you encounter insufficient VRAM when using a single graphics card, you may need to reduce the KV cache size with `--mem-fraction-static 0.5`. If VRAM issues persist, try reducing it further to `0.4` or lower.
> - If you have two or more graphics cards, you can use tensor parallelism (TP) mode as a straightforward way to expand available VRAM: `--tp-size 2`
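As a concrete sketch, the two memory flags above could be passed when launching the SGLang server. The port value is an illustrative assumption, not a required setting; tune the memory fraction to your hardware:

```shell
# Single card with limited VRAM: cap the KV cache at 50% of GPU memory
mineru-sglang-server --port 30000 --mem-fraction-static 0.5

# Two cards: shard weights and KV cache across both GPUs
mineru-sglang-server --port 30000 --tp-size 2
```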
### Performance Optimization Parameters

> [!TIP]
> If you can already use SGLang normally for accelerated VLM model inference but still want to further improve inference speed, you can try the following parameters:
> - If you have multiple graphics cards, you can use SGLang's multi-card parallel mode to increase throughput: `--dp-size 2`
> - You can also enable `torch.compile` to accelerate inference speed by approximately 15%: `--enable-torch-compile`
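Both throughput flags can be combined in a single launch. This is a sketch, not a mandated configuration; `--dp-size 2` assumes you have enough GPUs and VRAM to hold the extra model replica:

```shell
# Two data-parallel replicas plus torch.compile for roughly 15% faster inference
mineru-sglang-server --port 30000 --dp-size 2 --enable-torch-compile
```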
### Parameter Passing Instructions

> [!TIP]
> - All officially supported SGLang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
> - If you want to learn more about `sglang` parameter usage, please refer to the [SGLang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
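For instance, the same pass-through works across entry points. The host and port values below are illustrative assumptions, not values from the original text:

```shell
# SGLang flags appended to a MinerU entry point are forwarded to SGLang unchanged
mineru-api --host 127.0.0.1 --port 8000 --mem-fraction-static 0.5 --tp-size 2
```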
## GPU Device Selection and Configuration

### CUDA_VISIBLE_DEVICES Basic Usage
> - This specification method is effective for all command line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.
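In practice the variable is set inline before the command. The input and output paths below are placeholders, not values from the original text:

```shell
# Restrict MinerU to physical GPU 1; inside the process it is renumbered as device 0
CUDA_VISIBLE_DEVICES=1 mineru -p demo.pdf -o ./output

# The same prefix works for the server entry point
CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --tp-size 2
```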
### Common Device Configuration Examples

> [!TIP]
> Here are some common `CUDA_VISIBLE_DEVICES` setting examples:
> ```
> CUDA_VISIBLE_DEVICES="" # No GPU will be visible
> ```
## Practical Application Scenarios

> [!TIP]
> Here are some possible usage scenarios: