icecraft 440fd0c75b fix: projects 11 months ago
config 87af738ab1 fix: 1. ocr txt mode error 2. lose pdf_parse_type field 1 year ago
data b04867f90a docs: check links in doc 11 months ago
dict2md 74ee428bbb fix(dict2md): add space for inline equations in CJK contexts 1 year ago
filter e1be7da644 refactor(magic_pdf): switch to pdfminer for invalid character detection 11 months ago
integrations b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
libs 391a99860d Update version.py with new version 11 months ago
model 6a75d7dce5 perf(layout): optimize layout detection for PDF extraction 11 months ago
para 41545a13c6 refactor(para): adjust line height multiplier for block splitting 1 year ago
pipe 440fd0c75b fix: projects 11 months ago
post_proc 6a75d7dce5 perf(layout): optimize layout detection for PDF extraction 11 months ago
pre_proc 7f8dc353b0 fix(pre_proc): prevent errors when imageWriter is None 1 year ago
resources 240fe99e3c feat(table): integrate RapidTable model for table recognition 1 year ago
rw 2db3c26374 refactor(libs): remove unused imports and functions 1 year ago
spark b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
tools 712d7d4a8d fix: classif pdf type 11 months ago
utils f6af67eb11 feat: support convert ppt/pptx/doc/docx 11 months ago
__init__.py d5dbed7325 directory restructuring 1 year ago
pdf_parse_by_ocr.py a3a720ea87 refactor: isolate inference and pipeline 1 year ago
pdf_parse_by_txt.py a3a720ea87 refactor: isolate inference and pipeline 1 year ago
pdf_parse_union_core_v2.py 9efc35ecaa refactor(magic_pdf): remove unused import in pdf_parse_union_core_v2.py 11 months ago
user_api.py 87af738ab1 fix: 1. ocr txt mode error 2. lose pdf_parse_type field 1 year ago