| .. |
| config | 20438bd2b7 feat(language-detection): add YOLOv11 language detection model | 11 months ago |
| data | 3271cf75d3 refactor(langdetect): simplify language detection model and improve logging | 10 months ago |
| dict2md | 0a468eca6e feat(llm_aided): add title optimization feature | 11 months ago |
| filter | e1be7da644 refactor(magic_pdf): switch to pdfminer for invalid character detection | 11 months ago |
| integrations | b492c19c4c refactor: move some constants or enums defs to config folder | 1 year ago |
| libs | 1a549a0e4b fix(language): remove invalid UTF-16 surrogate pairs from input text | 10 months ago |
| model | 052a4d72ed perf(magic_pdf): optimize batch ratio calculation for GPU | 10 months ago |
| operators | 52efe94da8 feat(api): simplify markdown and content list generation | 10 months ago |
| post_proc | d986e39313 feat(llm_aided): add reasonability check and fine-tuning guidelines | 10 months ago |
| pre_proc | f37b14bc83 refactor(pre_proc): adjust IOU threshold for character overlap detection | 10 months ago |
| resources | 2a3a006f4d fix(models): update unimernet_small model path | 10 months ago |
| spark | b492c19c4c refactor: move some constants or enums defs to config folder | 1 year ago |
| tools | f911a102ab feat(tools): add character bounding box drawing functionality | 10 months ago |
| utils | f6af67eb11 feat: support convert ppt/pptx/doc/docx | 11 months ago |
| __init__.py | d5dbed7325 Directory restructuring | 1 year ago |
| pdf_parse_union_core_v2.py | ba6c17a9d9 feat(pdf_parse): remove tilted lines for better text extraction | 10 months ago |