|
config
|
20438bd2b7
feat(language-detection): add YOLOv11 language detection model
|
11 months ago |
|
data
|
cd11ddcd6b
docs: make sure the generate process of docs work properly
|
11 months ago |
|
dict2md
|
c638fc5d1f
fix(pdf): improve ligature handling and text extraction
|
11 months ago |
|
filter
|
e1be7da644
refactor(magic_pdf): switch to pdfminer for invalid character detection
|
11 months ago |
|
integrations
|
b492c19c4c
refactor: move some constants or enums defs to config folder
|
1 year ago |
|
libs
|
391a99860d
Update version.py with new version
|
11 months ago |
|
model
|
489f70e91d
refactor(magic_pdf): move model config variables
|
11 months ago |
|
operators
|
489f70e91d
refactor(magic_pdf): move model config variables
|
11 months ago |
|
para
|
41545a13c6
refactor(para): adjust line height multiplier for block splitting
|
11 months ago |
|
pipe
|
b2887ca0aa
refactor: refactor code
|
11 months ago |
|
post_proc
|
6a75d7dce5
perf(layout): optimize layout detection for PDF extraction
|
11 months ago |
|
pre_proc
|
1e6de549a8
fix: drop reason append error
|
11 months ago |
|
resources
|
20438bd2b7
feat(language-detection): add YOLOv11 language detection model
|
11 months ago |
|
rw
|
2db3c26374
refactor(libs): remove unused imports and functions
|
11 months ago |
|
spark
|
b492c19c4c
refactor: move some constants or enums defs to config folder
|
1 year ago |
|
tools
|
bf2ff5a241
feat(gradio-app): improve PDF conversion and UI functionalities
|
11 months ago |
|
utils
|
f6af67eb11
feat: support convert ppt/pptx/doc/docx
|
11 months ago |
|
__init__.py
|
d5dbed7325
Restructure directories
|
1 year ago |
|
pdf_parse_by_ocr.py
|
a3a720ea87
refactor: isolate inference and pipeline
|
11 months ago |
|
pdf_parse_by_txt.py
|
a3a720ea87
refactor: isolate inference and pipeline
|
11 months ago |
|
pdf_parse_union_core_v2.py
|
51b8c57df2
fix: skip the char corresponding to invalid bounding boxes
|
11 months ago |
|
user_api.py
|
87af738ab1
fix: 1. ocr txt mode error 2. lose pdf_parse_type field
|
11 months ago |