myhloli
|
c34c9d21ef
refactor(ocr): improve image and table block handling
|
1 year ago |
myhloli
|
644085760b
fix(ocr_mkcontent): expand para_to_standard_format_v2 to handle list and index blocks
|
1 year ago |
myhloli
|
fc49f5c446
refactor(magic_pdf): remove unused parameters and simplify functions
|
1 year ago |
myhloli
|
011a1b973b
refactor(ocr): increase the dilation factor in OCR to address the issue of word concatenation
|
1 year ago |
myhloli
|
1f1dd3538d
feat(list&index block): detect and merge list and index blocks
|
1 year ago |
Xiaomeng Zhao
|
98313d4a25
Merge branch 'dev' into content-list-not-drop
|
1 year ago |
myhloli
|
16699a9a70
fix(ocr_mkcontent): streamline drop reason handling
|
1 year ago |
myhloli
|
196de029a3
fix(ocr_mkcontent): correct drop mode handling for pages with drop reasons
|
1 year ago |
myhloli
|
37fbe998ac
feat(ocr_mkcontent): support drop reason in none_with_reason mode. Enable the `NONE_WITH_REASON` drop mode in `para_to_standard_format_v2` by updating the
|
1 year ago |
myhloli
|
6062862c96
feat(pipeline): pass language parameter for parsing and markdown conversion
|
1 year ago |
icecraft
|
03469909bb
Feat/support footnote in figure (#532)
|
1 year ago |
yyy
|
d714ac8b76
Release: Release 0.7.1 version, update dev (#527)
|
1 year ago |
drunkpig
|
18e65be489
fix: delete hyphen at end of line
|
1 year ago |
drunkpig
|
83e0d55a34
fix: replace \u0002, \u0003 in common text (#521)
|
1 year ago |
Xiaomeng Zhao
|
dd19f59eb6
fix(ocr_mkcontent): revise table caption output (#397)
|
1 year ago |
Xiaomeng Zhao
|
66e3ce9c4a
fix(ocr_mkcontent): improve language detection and content formatting (#458)
|
1 year ago |
liukaiwen
|
ec7271faee
fix table recognition bug#321
|
1 year ago |
myhloli
|
0998d22a32
fix(ocr_mkcontent): add spaces around inline equation in content
|
1 year ago |
Kaiwen Liu
|
37925f36d9
feat(model inference): add table recognition and conversion to LaTeX (#284)
|
1 year ago |
myhloli
|
a5c35165ee
feat(dict2md): add page index to para content for standard format v2
|
1 year ago |
myhloli
|
ff13c8e115
fix(mkmarkdown): add 2 space after image and table URLs
|
1 year ago |
赵小蒙
|
5de013e6d5
fix:use line_lang instead of content_lang to concatenate para
|
1 year ago |
赵小蒙
|
6199e608d4
add union_make logic
|
1 year ago |
liukaiwen
|
503b9fad3e
fix missing space after titles
|
1 year ago |
赵小蒙
|
f01cb89f01
fix lost image or table bug
|
1 year ago |
赵小蒙
|
e980d2efa0
fix UNIPipe and spans space with language
|
1 year ago |
赵小蒙
|
d3542f6a71
add para_to_standard_format logic
|
1 year ago |
赵小蒙
|
7631907f49
fix interline_equations block
|
1 year ago |
赵小蒙
|
81f73a3d9d
avoid errors caused by empty para
|
1 year ago |
赵小蒙
|
52777b224a
fix ocr_mk_markdown_with_para_core_v2
|
1 year ago |