| Name | Commit | Commit message | Updated |
| --- | --- | --- | --- |
| cli | 30ac6f227c | fix(magic-pdf): add default values and improve warning logs for config options. Ensure that 'temp-output-dir', 'models-dir', and 'device-mode' have sensible default | 1 year ago |
| dict2md | ff13c8e115 | fix(mkmarkdown): add 2 space after image and table URLs | 1 year ago |
| filter | df14c61f6f | update: Enhance the capability to detect garbled document issues | 1 year ago |
| layout | d5dbed7325 | Restructure directories | 1 year ago |
| libs | d244a1c1a7 | fix(config_reader): add utf-8 encoding when reading config file | 1 year ago |
| model | 7ecc82da63 | fix(magic_pdf): remove unused import from pdf_extract_kit | 1 year ago |
| para | 7dcf63e69c | fix: close some log output if not in debug mode | 1 year ago |
| pipe | f8f6ba6fd3 | update: Add md make mode config in do_parse. You can control whether the produced md is for NLP or MM by changing the value of f_make_md_mode | 1 year ago |
| post_proc | 1b9d65b3d3 | 1. Add a leading underscore to the keys of the Trace class | 1 year ago |
| pre_proc | e831df807a | fix(magic_pdf): use interline_equations instead of interline_equation_blocks | 1 year ago |
| resources | 57380cbed5 | feat(language): add FT LANG cache directory setup | 1 year ago |
| rw | 5db8911daa | add errors="replace" in write mode MODE_TXT | 1 year ago |
| spark | c9af3457f5 | delete useless files | 1 year ago |
| train_utils | efed5faa53 | feat: modify foot note bbox tmp | 1 year ago |
| __init__.py | d5dbed7325 | Restructure directories | 1 year ago |
| pdf_parse_by_ocr.py | 959b8d82d8 | renamed pipeline file name | 1 year ago |
| pdf_parse_by_txt.py | 959b8d82d8 | renamed pipeline file name | 1 year ago |
| pdf_parse_for_train.py | d438b97a0a | Refactor image-cropping logic | 1 year ago |
| pdf_parse_union_core.py | e831df807a | fix(magic_pdf): use interline_equations instead of interline_equation_blocks | 1 year ago |
| user_api.py | 959b8d82d8 | renamed pipeline file name | 1 year ago |