Commit History

Author SHA1 Message Date
  icecraft b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
  myhloli 08f46125a0 refactor(model): rename and restructure model modules 1 year ago
  myhloli 5936684fd8 refactor(pdf_parse): adjust line count threshold for layoutreader 1 year ago
  myhloli 5468e56fba refactor(pdf_parse): adjust line count limit for layoutreader 1 year ago
  myhloli 7d5850e3ce feat(model): add xycut algorithm for block sorting 1 year ago
  myhloli 149132d608 feat(pdf_parse): improve span filtering and add new block types 1 year ago
  myhloli ad0d06b6a0 fix(pdf_parse): improve span removal logic for all content types 1 year ago
  myhloli 509128d505 fix(pdf_parse): improve span removal logic for all content types 1 year ago
  myhloli eeda90af31 fix(pdf_parse): improve span removal logic for all content types 1 year ago
  myhloli 6b9f816f9e fix(pdf_parse): optimize span processing by removing outside spans 1 year ago
  myhloli 4cf7e9a224 refactor(pdf_parse): adjust block splitting logic for wide blocks 1 year ago
  myhloli c34c9d21ef refactor(ocr): improve image and table block handling 1 year ago
  icecraft 283b597a6e feat: add [figure | table] match [caption | footnote] match algorithm v2 1 year ago
  myhloli 7e301b849b refactor(pdf): adjust span filling threshold in block construction 1 year ago
  myhloli 6f63e70e94 feat(pdf_parse_union_core_v2): reintegrate para_split_v3 and add page range support 1 year ago
  myhloli ded2818ac2 feat(layoutreader): support local model directory and improve model loading 1 year ago
  myhloli a71db70314 feat: add arXiv paper link to header and adjust PDF parsing logic - Add arXiv paper link to the header template for easy access to the latest research paper. 1 year ago
  myhloli 564c4ce1e3 refactor(magic_pdf): improve line sorting and block indexing 1 year ago
  myhloli 4c9bf8abd5 refactor(memory management): remove unused clean_memory function 1 year ago
  myhloli 42a7d792c3 refactor(magic_pdf): import model helpers directly for clarity 1 year ago
  myhloli 5522d0a36c refactor(pdf_parse_union_core_v2): update import paths to use new package structure 1 year ago
  myhloli 2145a8b6d2 fix(pdf_parse): handle blocks without lines and enable bf16 on compatible devices 1 year ago
  myhloli 177ab08e9f refactor(pdf_parse): remove redundant sorting and optimize block indexing 1 year ago
  myhloli b9dfdea3cb refactor(pdf_parse_union_core_v2): implement model initialization within class. Refactored model initialization to be handled by a singleton class to ensure that model 1 year ago
  myhloli b2790f6f45 refactor(drawing): simplify draw bbox functions and adjust debug config 1 year ago
  myhloli 34f8965007 refactor(draw_bbox): add line sorting visualization 1 year ago
  myhloli 1efebe421c refactor(pdf_parse_union): integrate LayoutLMv3 for block ordering. Replace the heuristic-based block ordering algorithm with LayoutLMv3 model predictions to improve the accuracy of block ordering on PDF pages. Additionally, refactor the span 1 year ago