myhloli ef78819aa9 refactor(draw_bbox): remove redundant '_line_sort' suffix from output filename 11 months ago
config 87af738ab1 fix: 1. ocr txt mode error 2. lose pdf_parse_type field 11 months ago
data 113448903a fix: unicode decode error 11 months ago
dict2md 74ee428bbb fix(dict2md): add space for inline equations in CJK contexts 11 months ago
filter e1be7da644 refactor(magic_pdf): switch to pdfminer for invalid character detection 11 months ago
integrations b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
libs ef78819aa9 refactor(draw_bbox): remove redundant '_line_sort' suffix from output filename 11 months ago
model f5d812b313 feat(layout): improve layout detection for DocLayout_YOLO model 11 months ago
para 41545a13c6 refactor(para): adjust line height multiplier for block splitting 11 months ago
pipe f6bd47de6a docs: add dataset method description 11 months ago
pre_proc 7f8dc353b0 fix(pre_proc): prevent errors when imageWriter is None 11 months ago
resources 240fe99e3c feat(table): integrate RapidTable model for table recognition 1 year ago
rw 2db3c26374 refactor(libs): remove unused imports and functions 1 year ago
spark b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
tools 4e7511fb86 fix: dup classify pdf type 11 months ago
utils 9cda7051c6 add init to magic_pdf.utils 1 year ago
__init__.py d5dbed7325 directory restructuring 1 year ago
pdf_parse_by_ocr.py a3a720ea87 refactor: isolate inference and pipeline 11 months ago
pdf_parse_by_txt.py a3a720ea87 refactor: isolate inference and pipeline 11 months ago
pdf_parse_union_core_v2.py 9efc35ecaa refactor(magic_pdf): remove unused import in pdf_parse_union_core_v2.py 11 months ago
user_api.py 87af738ab1 fix: 1. ocr txt mode error 2. lose pdf_parse_type field 11 months ago