myhloli 04febf52d0 refactor(ocr): comment out unnecessary log statement 11 months ago
..
config 20438bd2b7 feat(language-detection): add YOLOv11 language detection model 11 months ago
data cd11ddcd6b docs: make sure the generate process of docs work properly 11 months ago
dict2md 0a468eca6e feat(llm_aided): add title optimization feature 11 months ago
filter e1be7da644 refactor(magic_pdf): switch to pdfminer for invalid character detection 1 year ago
integrations b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
libs 2684e7753b fix(npu): correct module name for NPU operations 11 months ago
model 04febf52d0 refactor(ocr): comment out unnecessary log statement 11 months ago
operators 489f70e91d refactor(magic_pdf): move model config variables 11 months ago
pipe b2887ca0aa refactor: refactor code 11 months ago
post_proc 512adb6701 feat(model): add onnxruntime support for paddleocr on cpu 11 months ago
pre_proc 15e876677d refactor(pre_proc): improve character overlap handling in spans 11 months ago
resources 20438bd2b7 feat(language-detection): add YOLOv11 language detection model 11 months ago
rw 2db3c26374 refactor(libs): remove unused imports and functions 1 year ago
spark b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
tools bf2ff5a241 feat(gradio-app): improve PDF conversion and UI functionalities 11 months ago
utils f6af67eb11 feat: support convert ppt/pptx/doc/docx 1 year ago
__init__.py d5dbed7325 directory restructuring 1 year ago
pdf_parse_by_ocr.py a3a720ea87 refactor: isolate inference and pipeline 1 year ago
pdf_parse_by_txt.py a3a720ea87 refactor: isolate inference and pipeline 1 year ago
pdf_parse_union_core_v2.py 7c5cdcd4d7 refactor(pdf_parse): improve character spacing handling in PDF text extraction 11 months ago
user_api.py 87af738ab1 fix: 1. ocr txt mode error 2. lose pdf_parse_type field 1 year ago