| Name | Latest commit | Commit message | Last updated |
| --- | --- | --- | --- |
| `config` | b492c19c4c | refactor: move some constants or enums defs to config folder | 1 year ago |
| `data` | b0529b6fbd | fix: reduce maximum image size | 11 months ago |
| `dict2md` | b80befe9cf | refactor(mkcontent): optimize paragraph text merging and language detection | 11 months ago |
| `filter` | ac88815620 | refactor(pdf_check): improve character detection using PyMuPDF | 11 months ago |
| `integrations` | b492c19c4c | refactor: move some constants or enums defs to config folder | 1 year ago |
| `libs` | b9f3435cb7 | Update version.py with new version | 11 months ago |
| `model` | 7f2f2c0f28 | refactor(ocr): Fix the error of paddleocr failing to initialize in a multi-threaded environment | 11 months ago |
| `para` | 41545a13c6 | refactor(para): adjust line height multiplier for block splitting | 11 months ago |
| `pipe` | b492c19c4c | refactor: move some constants or enums defs to config folder | 1 year ago |
| `pre_proc` | 7f8dc353b0 | fix(pre_proc): prevent errors when imageWriter is None | 11 months ago |
| `resources` | 240fe99e3c | feat(table): integrate RapidTable model for table recognition | 1 year ago |
| `rw` | 2db3c26374 | refactor(libs): remove unused imports and functions | 11 months ago |
| `spark` | b492c19c4c | refactor: move some constants or enums defs to config folder | 1 year ago |
| `tools` | 9c8d995ed2 | Merge pull request #1045 from myhloli/dev | 1 year ago |
| `utils` | 9cda7051c6 | add init to magic_pdf.utils | 1 year ago |
| `__init__.py` | d5dbed7325 | Directory restructuring | 1 year ago |
| `pdf_parse_by_ocr.py` | 309be741e8 | refactor(txt_parse): improve text extraction accuracy with new algorithm | 1 year ago |
| `pdf_parse_by_txt.py` | 309be741e8 | refactor(txt_parse): improve text extraction accuracy with new algorithm | 1 year ago |
| `pdf_parse_union_core_v2.py` | d4345b6e39 | refactor(pdf_parse): adjust character-axis alignment algorithm | 11 months ago |
| `user_api.py` | 309be741e8 | refactor(txt_parse): improve text extraction accuracy with new algorithm | 1 year ago |