Commit History

Author SHA1 Message Date
  myhloli 9951a17026 style(pdf_parse_union_core_v2): remove unnecessary spaces and improve code formatting - Remove extra space in conditional statement for character spacing logic 10 months ago
  myhloli 7c5cdcd4d7 refactor(pdf_parse): improve character spacing handling in PDF text extraction 10 months ago
  myhloli 2684e7753b fix(npu): correct module name for NPU operations 10 months ago
  myhloli 50f4841716 refactor(device): optimize memory cleaning and device selection 11 months ago
  myhloli 7990e7dfbb feat(model): add npu support and optimize table model 11 months ago
  myhloli 0a468eca6e feat(llm_aided): add title optimization feature 11 months ago
  Xiaomeng Zhao da3257a631 Merge pull request #1352 from myhloli/add-llm-aided 11 months ago
  myhloli c660fdc8f0 feat(llm): add LLM-aided formula and text correction 11 months ago
  myhloli 15e876677d refactor(pre_proc): improve character overlap handling in spans 11 months ago
  myhloli 2f4d4b0c80 feat(pre_proc): add function to remove overlapping characters in spans 11 months ago
  pangguosheng 51b8c57df2 fix: skip the char corresponding to invalid bounding boxes 11 months ago
  myhloli 9e4ebea939 refactor(magic_pdf): remove YOLO_VERBOSE setting and update YOLOv8 prediction verbosity 11 months ago
  myhloli c638fc5d1f fix(pdf): improve ligature handling and text extraction 11 months ago
  myhloli 9efc35ecaa refactor(magic_pdf): remove unused import in pdf_parse_union_core_v2.py 11 months ago
  Xiaomeng Zhao fa113b5750 Merge pull request #1178 from icecraft/refactor/add_user_api 11 months ago
  myhloli 012a46e07d refactor(magic-pdf): optimize model initialization and concurrency control 11 months ago
  myhloli 47a83d28f5 refactor(ocr): replace AtomModelSingleton with ocr_model_init for OCR model instantiation 11 months ago
  myhloli f2a92d5782 refactor(model): implement thread-safe OCR model initialization 11 months ago
  myhloli a1744b770f refactor(magic_pdf): remove unused threading lock and model initialization code 11 months ago
  myhloli 30220233ab refactor(magic_pdf): replace AtomModelSingleton with ocr_model_init for OCR model instantiation 11 months ago
  xu rui f6bd47de6a docs: add dataset method description 11 months ago
  icecraft 4a82d6a07a feat: add function definitions 1 year ago
  icecraft a3a720ea87 refactor: isolate inference and pipeline 1 year ago
  myhloli d4345b6e39 refactor(pdf_parse): adjust character-axis alignment algorithm 1 year ago
  myhloli 949d0867fb feat(pdf_parse): add line start flag detection and optimize line stop flag logic 1 year ago
  myhloli ac88815620 refactor(pdf_check): improve character detection using PyMuPDF 1 year ago
  myhloli 88c0854a65 refactor(ocr): improve text processing and span handling 1 year ago
  myhloli 37da8c44c4 feat(pdf_parse): filter out skewed text lines 1 year ago
  myhloli 08392d63a0 fix(Hybrid OCR): Enable Hybrid OCR for Empty Spans That Contain a Certain Number of Placeholders but No Actual Text 1 year ago
  myhloli 1d2eb70aa0 refactor(pdf_parse_union_core_v2): optimize page processing time logging 1 year ago