Config
=========

The **magic-pdf.json** file is typically located in the **${HOME}** directory on Linux or in the **C:\Users\{username}** directory on Windows.

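
As a rough illustration of the location rule above, the following sketch resolves the per-user path and reads the file. The helper names ``default_config_path`` and ``load_config`` are hypothetical and are not part of MinerU.

.. code:: python

   import json
   from pathlib import Path
   from typing import Optional


   def default_config_path() -> Path:
       # Path.home() is the user's home directory on both Linux and Windows.
       return Path.home() / "magic-pdf.json"


   def load_config(path: Optional[Path] = None) -> dict:
       """Read magic-pdf.json and return its contents as a dict."""
       config_path = path if path is not None else default_config_path()
       with config_path.open(encoding="utf-8") as f:
           return json.load(f)

Calling ``load_config()`` with no argument reads the file from the home directory; passing an explicit path is convenient for testing.
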
magic-pdf.json
----------------

.. code:: json

   {
     "bucket_info": {
       "bucket-name-1": ["ak", "sk", "endpoint"],
       "bucket-name-2": ["ak", "sk", "endpoint"]
     },
     "models-dir": "/tmp/models",
     "layoutreader-model-dir": "/tmp/layoutreader",
     "device-mode": "cpu",
     "layout-config": {
       "model": "layoutlmv3"
     },
     "formula-config": {
       "mfd_model": "yolo_v8_mfd",
       "mfr_model": "unimernet_small",
       "enable": true
     },
     "table-config": {
       "model": "rapid_table",
       "enable": false,
       "max_time": 400
     },
     "config_version": "1.0.0"
   }

bucket_info
^^^^^^^^^^^^^^

Stores the access_key, secret_key, and endpoint for each bucket on AWS S3-compatible storage. Each key is a bucket name, and its value is a three-element list in that order.

Example:

.. code:: text

   {
     "image_bucket": [{access_key}, {secret_key}, {endpoint}],
     "video_bucket": [{access_key}, {secret_key}, {endpoint}]
   }

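
The sketch below shows one way a caller might unpack an entry of this mapping into named fields. ``BucketCredentials`` and ``get_bucket_credentials`` are hypothetical names chosen for illustration, and the error handling is an assumption rather than MinerU's behavior.

.. code:: python

   from typing import NamedTuple


   class BucketCredentials(NamedTuple):
       access_key: str
       secret_key: str
       endpoint: str


   def get_bucket_credentials(config: dict, bucket_name: str) -> BucketCredentials:
       """Look up one bucket in ``bucket_info`` and unpack its three fields."""
       buckets = config.get("bucket_info", {})
       if bucket_name not in buckets:
           raise KeyError(f"bucket {bucket_name!r} is not listed in bucket_info")
       entry = buckets[bucket_name]
       if len(entry) != 3:
           raise ValueError(
               f"expected [access_key, secret_key, endpoint], got {len(entry)} items"
           )
       return BucketCredentials(*entry)
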
models-dir
^^^^^^^^^^^^

The directory that stores the models downloaded from **Hugging Face** or **ModelScope**. You do not need to modify this field if you download the models using the scripts shipped with **MinerU**.

layoutreader-model-dir
^^^^^^^^^^^^^^^^^^^^^^^

The directory that stores the layoutreader model downloaded from **Hugging Face** or **ModelScope**. You do not need to modify this field if you download the model using the scripts shipped with **MinerU**.

device-mode
^^^^^^^^^^^^^^

This field has two options, **cpu** or **cuda**.

**cpu**: run inference on the CPU

**cuda**: use CUDA to accelerate inference

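
A minimal validation sketch for this field follows. ``read_device_mode`` is a hypothetical helper, and falling back to ``cpu`` when the key is absent is an assumption made here, not documented behavior.

.. code:: python

   ALLOWED_DEVICE_MODES = ("cpu", "cuda")


   def read_device_mode(config: dict) -> str:
       """Return the configured device mode after checking it is a known option."""
       # Assumption: treat a missing key as "cpu".
       mode = config.get("device-mode", "cpu")
       if mode not in ALLOWED_DEVICE_MODES:
           raise ValueError(
               f"device-mode must be one of {ALLOWED_DEVICE_MODES}, got {mode!r}"
           )
       return mode
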
layout-config
^^^^^^^^^^^^^^^

.. code:: json

   {
     "model": "layoutlmv3"
   }

The layout model cannot be disabled at present, and only one kind of layout model is currently available.

formula-config
^^^^^^^^^^^^^^^^

.. code:: json

   {
     "mfd_model": "yolo_v8_mfd",
     "mfr_model": "unimernet_small",
     "enable": true
   }

mfd_model
""""""""""

Specifies the formula detection model. Options are ['yolo_v8_mfd'].

mfr_model
""""""""""

Specifies the formula recognition model. Options are ['unimernet_small'].

Check `UniMERNet <https://github.com/opendatalab/UniMERNet>`_ for more details.

enable
""""""""

On-off flag. Options are [true, false]. **true** enables formula inference; **false** disables it.

table-config
^^^^^^^^^^^^^^^^

.. code:: json

   {
     "model": "rapid_table",
     "enable": false,
     "max_time": 400
   }

model
""""""""

Specifies the table inference model. Options are ['rapid_table', 'tablemaster', 'struct_eqtable'].

max_time
"""""""""

Table recognition is time-consuming, so a timeout is applied. If recognition runs longer than **max_time**, it is terminated.

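
To illustrate the timeout idea in general terms, here is a sketch of a time-limited call built on the standard library. It is not MinerU's implementation: ``run_with_timeout`` is a hypothetical helper, and treating ``max_time`` as a number of seconds is an assumption.

.. code:: python

   from concurrent.futures import ThreadPoolExecutor
   from concurrent.futures import TimeoutError as FutureTimeout


   def run_with_timeout(func, max_time, *args):
       """Return func(*args), or None if it takes longer than max_time seconds."""
       executor = ThreadPoolExecutor(max_workers=1)
       future = executor.submit(func, *args)
       try:
           return future.result(timeout=max_time)
       except FutureTimeout:
           # The caller stops waiting; the result is treated as unavailable.
           return None
       finally:
           # Do not block on the worker: a Python thread cannot be force-killed,
           # so the abandoned call keeps running in the background until it ends.
           executor.shutdown(wait=False)

Because a thread cannot be interrupted, a production version would more likely run the work in a separate process that can actually be terminated.
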
enable
"""""""

On-off flag. Options are [true, false]. **true** enables table inference; **false** disables it.

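
The ``enable`` flags of **formula-config** and **table-config** can be read in the same way. The sketch below uses a hypothetical helper, ``is_enabled``; treating a missing flag as disabled is an assumption made for the example.

.. code:: python

   def is_enabled(config: dict, section: str, default: bool = False) -> bool:
       """Return the ``enable`` flag of a config section such as ``table-config``."""
       # Assumption: a missing section or flag falls back to ``default``.
       value = config.get(section, {}).get("enable", default)
       if not isinstance(value, bool):
           raise TypeError(f"{section}.enable must be true or false, got {value!r}")
       return value


   sample = {
       "formula-config": {"enable": True},
       "table-config": {"enable": False, "max_time": 400},
   }
   formula_on = is_enabled(sample, "formula-config")
   table_on = is_enabled(sample, "table-config")
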
config_version
^^^^^^^^^^^^^^^^

The version of the config schema.

.. admonition:: Tip
   :class: tip

   Check `Config Schema <https://github.com/opendatalab/MinerU/blob/master/magic-pdf.template.json>`_ for the latest details.