myhloli
|
9efc35ecaa
refactor(magic_pdf): remove unused import in pdf_parse_union_core_v2.py
|
11 months ago |
Xiaomeng Zhao
|
fa113b5750
Merge pull request #1178 from icecraft/refactor/add_user_api
|
11 months ago |
myhloli
|
012a46e07d
refactor(magic-pdf): optimize model initialization and concurrency control
|
11 months ago |
myhloli
|
47a83d28f5
refactor(ocr): replace AtomModelSingleton with ocr_model_init for OCR model instantiation
|
11 months ago |
myhloli
|
f2a92d5782
refactor(model): implement thread-safe OCR model initialization
|
11 months ago |
myhloli
|
a1744b770f
refactor(magic_pdf): remove unused threading lock and model initialization code
|
11 months ago |
myhloli
|
30220233ab
refactor(magic_pdf): replace AtomModelSingleton with ocr_model_init for OCR model instantiation
|
11 months ago |
xu rui
|
f6bd47de6a
docs: add dataset method description
|
11 months ago |
icecraft
|
4a82d6a07a
feat: add function definitions
|
11 months ago |
icecraft
|
a3a720ea87
refactor: isolate inference and pipeline
|
11 months ago |
myhloli
|
d4345b6e39
refactor(pdf_parse): adjust character-axis alignment algorithm
|
11 months ago |
myhloli
|
949d0867fb
feat(pdf_parse): add line start flag detection and optimize line stop flag logic
|
11 months ago |
myhloli
|
ac88815620
refactor(pdf_check): improve character detection using PyMuPDF
|
11 months ago |
myhloli
|
88c0854a65
refactor(ocr): improve text processing and span handling
|
11 months ago |
myhloli
|
37da8c44c4
feat(pdf_parse): filter out skewed text lines
|
11 months ago |
myhloli
|
08392d63a0
fix(Hybrid OCR):Enable Hybrid OCR for Empty Spans That Contain a Certain Number of Placeholders but No Actual Text
|
11 months ago |
myhloli
|
1d2eb70aa0
refactor(pdf_parse_union_core_v2): optimize page processing time logging
|
11 months ago |
myhloli
|
2db3c26374
refactor(libs): remove unused imports and functions
|
11 months ago |
myhloli
|
21fa78195e
refactor(pre_proc): remove unused functions and simplify code
|
11 months ago |
myhloli
|
ecdaa49aee
refactor(magic_pdf): remove unused functions and simplify code
|
11 months ago |
myhloli
|
8163506295
feat(pdf_parse): improve text extraction for vertical spans
|
11 months ago |
myhloli
|
7d4dfca253
feat(pdf_parse): add OCR score to span data
|
11 months ago |
myhloli
|
14656085f5
refactor(pdf_parse): improve text content extraction from PDF spans
|
11 months ago |
myhloli
|
7964ae45d2
refactor(pdf_parse): improve code readability and maintainability
|
11 months ago |
myhloli
|
97bcc8b23b
refactor(pdf_parse): improve code readability and maintainability
|
11 months ago |
myhloli
|
034c59a887
refactor(txt_spans_extract_v2): optimize span processing and OCR logic
|
11 months ago |
myhloli
|
0d3ef89fb9
fix(pdf_parse): Move the logic for filling text content into spans before the discarded_block recognition to fix the issue of empty text blocks in discarded_block.
|
11 months ago |
myhloli
|
6b296ee2b5
fix(pdf_parse): improve OCR result handling
|
1 year ago |
myhloli
|
5d6cbcb123
refactor(para): improve line stop flag and remove unused debug mode
|
1 year ago |
myhloli
|
ae3b0a1e60
fix(pdf_parse): improve line stop flag detection accuracy
|
1 year ago |