Commit History

Author SHA1 Message Date
  myhloli 21fa78195e refactor(pre_proc): remove unused functions and simplify code 11 months ago
  myhloli ecdaa49aee refactor(magic_pdf): remove unused functions and simplify code 11 months ago
  myhloli 8163506295 feat(pdf_parse): improve text extraction for vertical spans 11 months ago
  myhloli 7d4dfca253 feat(pdf_parse): add OCR score to span data 11 months ago
  myhloli 14656085f5 refactor(pdf_parse): improve text content extraction from PDF spans 11 months ago
  myhloli 7964ae45d2 refactor(pdf_parse): improve code readability and maintainability 11 months ago
  myhloli 97bcc8b23b refactor(pdf_parse): improve code readability and maintainability 11 months ago
  myhloli 034c59a887 refactor(txt_spans_extract_v2): optimize span processing and OCR logic 11 months ago
  myhloli 0d3ef89fb9 fix(pdf_parse): Move the logic for filling text content into spans before the discarded_block recognition to fix the issue of empty text blocks in discarded_block. 11 months ago
  myhloli 6b296ee2b5 fix(pdf_parse): improve OCR result handling 1 year ago
  myhloli 5d6cbcb123 refactor(para): improve line stop flag and remove unused debug mode 1 year ago
  myhloli ae3b0a1e60 fix(pdf_parse): improve line stop flag detection accuracy 1 year ago
  myhloli 309be741e8 refactor(txt_parse): improve text extraction accuracy with new algorithm 1 year ago
  icecraft b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
  myhloli 08f46125a0 refactor(model): rename and restructure model modules 1 year ago
  myhloli 5936684fd8 refactor(pdf_parse): adjust line count threshold for layoutreader 1 year ago
  myhloli 5468e56fba refactor(pdf_parse): adjust line count limit for layoutreader 1 year ago
  myhloli 7d5850e3ce feat(model): add xycut algorithm for block sorting 1 year ago
  myhloli 149132d608 feat(pdf_parse): improve span filtering and add new block types 1 year ago
  myhloli ad0d06b6a0 fix(pdf_parse): improve span removal logic for all content types 1 year ago
  myhloli 509128d505 fix(pdf_parse): improve span removal logic for all content types 1 year ago
  myhloli eeda90af31 fix(pdf_parse): improve span removal logic for all content types 1 year ago
  myhloli 6b9f816f9e fix(pdf_parse): optimize span processing by removing outside spans 1 year ago
  myhloli 4cf7e9a224 refactor(pdf_parse): adjust block splitting logic for wide blocks 1 year ago
  myhloli c34c9d21ef refactor(ocr): improve image and table block handling 1 year ago
  icecraft 283b597a6e feat: add [figure | table] match [caption | footnote] match algorithm v2 1 year ago
  myhloli 7e301b849b refactor(pdf): adjust span filling threshold in block construction 1 year ago
  myhloli 6f63e70e94 feat(pdf_parse_union_core_v2): reintegrate para_split_v3 and add page range support 1 year ago
  myhloli ded2818ac2 feat(layoutreader): support local model directory and improve model loading 1 year ago
  myhloli a71db70314 feat: add arXiv paper link to header and adjust PDF parsing logic- Add arXiv paper link to the header template for easy access to the latest research paper. 1 year ago