myhloli f35a6c08a8 refactor(filter): remove unused text layout analysis for PDF classification 9 months ago
config 20438bd2b7 feat(language-detection): add YOLOv11 language detection model 11 months ago
data 3271cf75d3 refactor(langdetect): simplify language detection model and improve logging 10 months ago
dict2md 0a468eca6e feat(llm_aided): add title optimization feature 11 months ago
filter f35a6c08a8 refactor(filter): remove unused text layout analysis for PDF classification 9 months ago
integrations b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
libs 4211c74c9d Update version.py with new version 10 months ago
model b1ac7afdaa perf(model): optimize batch ratio for different GPU memory sizes 9 months ago
operators 52efe94da8 feat(api): simplify markdown and content list generation 10 months ago
post_proc d986e39313 feat(llm_aided): add reasonability check and fine-tuning guidelines 10 months ago
pre_proc f37b14bc83 refactor(pre_proc): adjust IOU threshold for character overlap detection 10 months ago
resources 2a3a006f4d fix(models): update unimernet_small model path 10 months ago
spark b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
tools f911a102ab feat(tools): add character bounding box drawing functionality 10 months ago
utils f6af67eb11 feat: support convert ppt/pptx/doc/docx 11 months ago
__init__.py d5dbed7325 Directory restructuring 1 year ago
pdf_parse_union_core_v2.py 9f18ca2019 feat(pdf_parse): improve OCR processing and contrast filtering 9 months ago