myhloli
|
058d318491
feat(pdf_parse): add footnote block handling in layout split
|
7 months ago |
myhloli
|
8caf59f7cb
refactor(footnote_detection): adjust footnote detection threshold
|
7 months ago |
myhloli
|
b3127233f0
refactor: modify bbox processing for layout separation
|
11 months ago |
myhloli
|
a46b12e967
refactor(pre_proc): clean up OCR processing code
|
11 months ago |
myhloli
|
21fa78195e
refactor(pre_proc): remove unused functions and simplify code
|
11 months ago |
icecraft
|
b492c19c4c
refactor: move some constants or enums defs to config folder
|
1 year ago |
myhloli
|
c34c9d21ef
refactor(ocr): improve image and table block handling
|
1 year ago |
myhloli
|
1279f2cd0f
feat(model): add support for DocLayout-YOLO model
|
1 year ago |
myhloli
|
1f1dd3538d
feat(list&index block): detect and merge list and index blocks
|
1 year ago |
myhloli
|
34f8965007
refactor(draw_bbox): add line sorting visualization
|
1 year ago |
myhloli
|
1efebe421c
refactor(pdf_parse_union): integrate LayoutLMv3 for block ordering. Replace the heuristic-based block ordering algorithm with LayoutLMv3 model predictions to improve the accuracy of block ordering on PDF pages. Additionally, refactor the span
|
1 year ago |
Xiaomeng Zhao
|
9067cd31ca
fix(detect_all_bboxes): remove small overlapping blocks by merging (#501)
|
1 year ago |
myhloli
|
e831df807a
fix(magic_pdf): use interline_equations instead of interline_equation_blocks
|
1 year ago |
赵小蒙
|
e92de75844
add todo about interline_equation
|
1 year ago |
赵小蒙
|
2f13b3a87c
add new drop scene
|
1 year ago |
赵小蒙
|
3ec3a38456
fix: all_bboxes with score
|
1 year ago |
赵小蒙
|
deb98fd0b1
fix footnote overlap error
|
1 year ago |
赵小蒙
|
eebd976715
remove overlap between with all blocks
|
1 year ago |
赵小蒙
|
a817075b3c
update discarded block and spans build logic
|
1 year ago |
赵小蒙
|
f70289f99e
fix remove error
|
1 year ago |
赵小蒙
|
1936703b71
fix remove error
|
1 year ago |
赵小蒙
|
91ee991150
change some remove logic
|
1 year ago |
赵小蒙
|
83641d3d97
when a text box and a title box overlap, prefer the text box
|
1 year ago |
赵小蒙
|
55f358d1c5
fix block overlap and nesting issues
|
1 year ago |
赵小蒙
|
45ce99bf87
fix block type field name
|
1 year ago |
赵小蒙
|
f5341e162f
refactor parse_by_ocr_v2.py
|
1 year ago |
赵小蒙
|
7e8e9cabee
refactor parse_by_ocr_v2
|
1 year ago |