myhloli
|
058d318491
feat(pdf_parse): add footnote block handling in layout split
|
7 months ago |
myhloli
|
ea730ae2e9
refactor(ocr): improve OCR score precision to three decimal places
|
7 months ago |
myhloli
|
795233d1bb
refactor(magic_pdf): remove OCR timing measurement code
|
7 months ago |
myhloli
|
553f250fc7
refactor(magic_pdf): optimize code and improve logging
|
7 months ago |
myhloli
|
a024c30fc4
feat(ocr): implement dynamic OCR processing for text spans with low contrast
|
7 months ago |
myhloli
|
3cb156f549
fix(pdf_parse_union_core_v2): suppress FutureWarning from transformers
|
7 months ago |
myhloli
|
59d6b195b0
refactor(model): integrate AtomModelSingleton for OCR and improve OCR result handling
|
7 months ago |
myhloli
|
a330651d64
feat(ocr): implement separate detection and recognition processes
|
7 months ago |
myhloli
|
72e66c2d1e
refactor(pdf_parse): adjust line calculation for block height
|
7 months ago |
myhloli
|
71efb101dc
refactor(pdf_parse): adjust line calculation for block height
|
7 months ago |
myhloli
|
3f2bafa88f
feat(pre_proc): add function to remove x-overlapping characters in spans
|
8 months ago |
myhloli
|
7210f7a65a
perf(model): enable bfloat16 for layoutreader on supported devices
|
8 months ago |
myhloli
|
cf4ea78dac
refactor: remove torchtext deprecation warning handling
|
8 months ago |
myhloli
|
af27c0cc81
refactor(magic_pdf): support mps device and optimize image processing
|
8 months ago |
myhloli
|
6bfc17119d
refactor(pdf_parse): comment out performance measurement and logging
|
8 months ago |
myhloli
|
e516cf535c
feat(performance): add performance monitoring and optimization
|
8 months ago |
myhloli
|
6ec440d6f1
feat(pdf_parse): implement multi-threaded page processing
|
8 months ago |
myhloli
|
0a246f0f40
refactor(magic_pdf): simplify device selection in model initialization
|
8 months ago |
myhloli
|
9b00f988ac
refactor(magic_pdf): remove bfloat16 support checks and usage
|
8 months ago |
myhloli
|
30bd3a83c7
fix(pdf_parse): Fixed the issue where some headings were missing in certain complex layouts.
|
9 months ago |
myhloli
|
5561ac9555
fix(pdf_parse): improve image processing and OCR accuracy
|
9 months ago |
myhloli
|
9f18ca2019
feat(pdf_parse): improve OCR processing and contrast filtering
|
9 months ago |
myhloli
|
10e848b39d
feat(pdf_parse_union_core_v2): add timing log for LLM aided processes
|
9 months ago |
myhloli
|
1d08865f4a
refactor(pdf_parse): uncomment char bbox validation logic
|
9 months ago |
myhloli
|
ba6c17a9d9
feat(pdf_parse): remove tilted lines for better text extraction
|
10 months ago |
myhloli
|
8570e006f8
refactor(magic_pdf): improve title block merging logic
|
10 months ago |
Xiaomeng Zhao
|
206fcb3900
Merge pull request #1537 from myhloli/doclayoutyolo-fix
|
10 months ago |
myhloli
|
c20e9a1e84
feat(layout): improve title block handling and layout detection
|
10 months ago |
Xiaomeng Zhao
|
9f12c39817
Update pdf_parse_union_core_v2.py
|
10 months ago |
myhloli
|
aaff1a2616
fix(llm_aided): add enable flag check for LLM aided optimizations
|
10 months ago |