pangguosheng 1e6de549a8 fix: drop reason append error 11 months ago
config 20438bd2b7 feat(language-detection): add YOLOv11 language detection model 11 months ago
data cd11ddcd6b docs: make sure the generate process of docs work properly 11 months ago
dict2md c638fc5d1f fix(pdf): improve ligature handling and text extraction 11 months ago
filter e1be7da644 refactor(magic_pdf): switch to pdfminer for invalid character detection 11 months ago
integrations b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
libs 391a99860d Update version.py with new version 11 months ago
model 489f70e91d refactor(magic_pdf): move model config variables 11 months ago
operators 489f70e91d refactor(magic_pdf): move model config variables 11 months ago
para 41545a13c6 refactor(para): adjust line height multiplier for block splitting 11 months ago
pipe b2887ca0aa refactor: refactor code 11 months ago
post_proc 6a75d7dce5 perf(layout): optimize layout detection for PDF extraction 11 months ago
pre_proc 1e6de549a8 fix: drop reason append error 11 months ago
resources 20438bd2b7 feat(language-detection): add YOLOv11 language detection model 11 months ago
rw 2db3c26374 refactor(libs): remove unused imports and functions 11 months ago
spark b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
tools bf2ff5a241 feat(gradio-app): improve PDF conversion and UI functionalities 11 months ago
utils f6af67eb11 feat: support convert ppt/pptx/doc/docx 11 months ago
__init__.py d5dbed7325 directory restructuring 1 year ago
pdf_parse_by_ocr.py a3a720ea87 refactor: isolate inference and pipeline 11 months ago
pdf_parse_by_txt.py a3a720ea87 refactor: isolate inference and pipeline 11 months ago
pdf_parse_union_core_v2.py 51b8c57df2 fix: skip the char corresponding to invalid bounding boxes 11 months ago
user_api.py 87af738ab1 fix: 1. ocr txt mode error 2. lose pdf_parse_type field 11 months ago