|
config
|
87af738ab1
fix: 1. ocr txt mode error 2. lose pdf_parse_type field
|
11 months ago |
|
data
|
b04867f90a
docs: check links in doc
|
11 months ago |
|
dict2md
|
74ee428bbb
fix(dict2md): add space for inline equations in CJK contexts
|
11 months ago |
|
filter
|
e1be7da644
refactor(magic_pdf): switch to pdfminer for invalid character detection
|
11 months ago |
|
integrations
|
b492c19c4c
refactor: move some constants or enums defs to config folder
|
1 year ago |
|
libs
|
391a99860d
Update version.py with new version
|
11 months ago |
|
model
|
bdacf29179
Merge pull request #1257 from icecraft/docs/refactor_en_docs
|
11 months ago |
|
para
|
41545a13c6
refactor(para): adjust line height multiplier for block splitting
|
11 months ago |
|
pipe
|
f6bd47de6a
docs: add dataset method description
|
11 months ago |
|
pre_proc
|
7f8dc353b0
fix(pre_proc): prevent errors when imageWriter is None
|
11 months ago |
|
resources
|
240fe99e3c
feat(table): integrate RapidTable model for table recognition
|
1 year ago |
|
rw
|
2db3c26374
refactor(libs): remove unused imports and functions
|
11 months ago |
|
spark
|
b492c19c4c
refactor: move some constants or enums defs to config folder
|
1 year ago |
|
tools
|
712d7d4a8d
fix: classif pdf type
|
11 months ago |
|
utils
|
f6af67eb11
feat: support convert ppt/pptx/doc/docx
|
11 months ago |
|
__init__.py
|
d5dbed7325
Directory restructuring
|
1 year ago |
|
pdf_parse_by_ocr.py
|
a3a720ea87
refactor: isolate inference and pipeline
|
11 months ago |
|
pdf_parse_by_txt.py
|
a3a720ea87
refactor: isolate inference and pipeline
|
11 months ago |
|
pdf_parse_union_core_v2.py
|
9efc35ecaa
refactor(magic_pdf): remove unused import in pdf_parse_union_core_v2.py
|
11 months ago |
|
user_api.py
|
87af738ab1
fix: 1. ocr txt mode error 2. lose pdf_parse_type field
|
11 months ago |