Inference Result
==================

.. admonition:: Tip
    :class: tip

    Please read :doc:`tutorial/pipeline` first to get an initial understanding of how the pipeline works; it will make the content of this section easier to follow.

The **InferenceResult** class is a container for model inference results. It implements a series of methods that operate on these results, such as ``draw_model`` and ``dump_model``.
See :doc:`../api/model_operators` for more details about **InferenceResult**.

Model Inference Result
-----------------------

Structure Definition
^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: python

    from enum import IntEnum

    from pydantic import BaseModel, Field


    class CategoryType(IntEnum):
        title = 0            # Title
        plain_text = 1       # Text
        abandon = 2          # Includes headers, footers, page numbers, and page annotations
        figure = 3           # Image
        figure_caption = 4   # Image description
        table = 5            # Table
        table_caption = 6    # Table description
        table_footnote = 7   # Table footnote
        isolate_formula = 8  # Block formula
        formula_caption = 9  # Formula label

        embedding = 13       # Inline formula
        isolated = 14        # Block formula
        text = 15            # OCR recognition result


    class PageInfo(BaseModel):
        page_no: int = Field(description="Page number, the first page is 0", ge=0)
        height: int = Field(description="Page height", gt=0)
        width: int = Field(description="Page width", ge=0)


    class ObjectInferenceResult(BaseModel):
        category_id: CategoryType = Field(description="Category", ge=0)
        poly: list[float] = Field(description="Quadrilateral coordinates, representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively")
        score: float = Field(description="Confidence of the inference result")
        latex: str | None = Field(description="LaTeX parsing result", default=None)
        html: str | None = Field(description="HTML parsing result", default=None)


    class PageInferenceResults(BaseModel):
        # A numeric bound such as ge=0 does not apply to a list, so none is set here
        layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results")
        page_info: PageInfo = Field(description="Page metadata")
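
The numeric ``category_id`` values can be mapped back to readable names through the enum. The following is a minimal sketch that uses only the standard library: ``CategoryType`` is re-declared so the snippet is self-contained, and ``count_categories`` is a hypothetical helper written for illustration, not part of ``magic_pdf``.

.. code:: python

    from collections import Counter
    from enum import IntEnum


    class CategoryType(IntEnum):
        # Copied from the structure definition so this sketch runs on its own
        title = 0
        plain_text = 1
        abandon = 2
        figure = 3
        figure_caption = 4
        table = 5
        table_caption = 6
        table_footnote = 7
        isolate_formula = 8
        formula_caption = 9
        embedding = 13
        isolated = 14
        text = 15


    def count_categories(layout_dets):
        """Hypothetical helper: count detections per category name on one page."""
        return Counter(CategoryType(det["category_id"]).name for det in layout_dets)


    # Plain dicts shaped like the entries of ``layout_dets``
    layout_dets = [
        {"category_id": 2, "score": 0.99},
        {"category_id": 5, "score": 0.98},
        {"category_id": 5, "score": 0.97},
    ]
    counts = count_categories(layout_dets)

Note that ``CategoryType(...)`` raises ``ValueError`` for an id that the enum does not define (for example 10, 11, or 12), so callers that may see such ids would need to handle that case.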

Example
^^^^^^^^^^^

.. code:: json

    [
        {
            "layout_dets": [
                {
                    "category_id": 2,
                    "poly": [
                        99.1906967163086,
                        100.3119125366211,
                        730.3707885742188,
                        100.3119125366211,
                        730.3707885742188,
                        245.81326293945312,
                        99.1906967163086,
                        245.81326293945312
                    ],
                    "score": 0.9999997615814209
                }
            ],
            "page_info": {
                "page_no": 0,
                "height": 2339,
                "width": 1654
            }
        },
        {
            "layout_dets": [
                {
                    "category_id": 5,
                    "poly": [
                        99.13092803955078,
                        2210.680419921875,
                        497.3183898925781,
                        2210.680419921875,
                        497.3183898925781,
                        2264.78076171875,
                        99.13092803955078,
                        2264.78076171875
                    ],
                    "score": 0.9999997019767761
                }
            ],
            "page_info": {
                "page_no": 1,
                "height": 2339,
                "width": 1654
            }
        }
    ]

The ``poly`` coordinates have the format ``[x0, y0, x1, y1, x2, y2, x3, y3]``,
representing the top-left, top-right, bottom-right,
and bottom-left points respectively. |Poly Coordinate Diagram|
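
Given that layout, the flat ``poly`` list can be reduced to an axis-aligned bounding box by taking the extremes of the x and y values. The sketch below assumes nothing beyond the eight-value format described above; ``poly_to_bbox`` is a hypothetical helper name, not a function provided by ``magic_pdf``.

.. code:: python

    def poly_to_bbox(poly):
        """Hypothetical helper: return (x_min, y_min, x_max, y_max) for an 8-value poly."""
        if len(poly) != 8:
            raise ValueError(f"expected 8 coordinates, got {len(poly)}")
        xs = poly[0::2]  # x0, x1, x2, x3
        ys = poly[1::2]  # y0, y1, y2, y3
        return min(xs), min(ys), max(xs), max(ys)


    # The first detection from the JSON example
    poly = [
        99.1906967163086, 100.3119125366211,
        730.3707885742188, 100.3119125366211,
        730.3707885742188, 245.81326293945312,
        99.1906967163086, 245.81326293945312,
    ]
    bbox = poly_to_bbox(poly)

Using ``min`` and ``max`` rather than reading ``x0, y0, x2, y2`` directly keeps the result correct even if a quadrilateral is slightly rotated.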

Inference Result
-------------------------

.. code:: python

    from magic_pdf.operators.models import InferenceResult
    from magic_pdf.data.dataset import Dataset

    dataset: Dataset = some_data_set  # placeholder, not a real dataset

    # MinerU's inference results: one entry per page, stored in a list ordered by page number
    model_inference_result: list[PageInferenceResults] = []

    inference_result = InferenceResult(model_inference_result, dataset)
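
Because the list is expected to be ordered by page number, it may be worth normalizing per-page results before handing them over. The following is a rough sketch on plain dicts using only the standard library. ``order_pages`` is a hypothetical helper, and the check that page numbers run contiguously from 0 is an assumption made for illustration, not a documented requirement.

.. code:: python

    import json


    def order_pages(pages):
        """Hypothetical helper: sort page results by page_no and check they start at 0."""
        ordered = sorted(pages, key=lambda page: page["page_info"]["page_no"])
        for expected_no, page in enumerate(ordered):
            actual_no = page["page_info"]["page_no"]
            # Assumption: page numbers are contiguous, with the first page numbered 0
            if actual_no != expected_no:
                raise ValueError(f"expected page {expected_no}, found page {actual_no}")
        return ordered


    # Two pages supplied out of order, shaped like the JSON example
    raw = """
    [
        {"layout_dets": [], "page_info": {"page_no": 1, "height": 2339, "width": 1654}},
        {"layout_dets": [], "page_info": {"page_no": 0, "height": 2339, "width": 1654}}
    ]
    """
    pages = order_pages(json.loads(raw))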

some_model.pdf
^^^^^^^^^^^^^^^^^^^^

.. figure:: ../_static/image/inference_result.png

.. |Poly Coordinate Diagram| image:: ../_static/image/poly.png