docs(README): update release notes for version 1.2.0

- Update English and Chinese README files with the changelog for version 1.2.0
- Include details on performance optimizations, parsing improvements, and bug fixes
- Highlight specific enhancements for PDF document classification, watermark handling, and layout matching
myhloli 8 months ago
parent
commit
2a466e0308
2 changed files with 20 additions and 0 deletions
  1. README.md (+10, -0)
  2. README_zh-CN.md (+10, -0)

+ 10 - 0
README.md

@@ -47,6 +47,16 @@ Easier to use: Just grab MinerU Desktop. No coding, no login, just a simple inte
 </div>
 
 # Changelog
+
+- 2025/02/24 Release 1.2.0: This version includes several fixes and improvements to enhance parsing efficiency and accuracy:
+  - Performance Optimization
+    - Increased classification speed for PDF documents in auto mode.
+  - Parsing Optimization
+    - Improved the parsing logic for documents containing watermarks, yielding significantly better results for such documents.
+    - Enhanced the matching logic for multiple images/tables and captions within a single page, improving the accuracy of image-text matching in complex layouts.
+  - Bug Fixes
+    - Fixed an issue where image/table spans were incorrectly filled into text blocks under certain conditions.
+    - Resolved an issue where title blocks were empty in some cases.
 - 2025/01/22 1.1.0 released. In this version we have focused on improving parsing accuracy and efficiency:
   - Model capability upgrade (requires re-executing the [model download process](docs/how_to_download_models_en.md) to obtain incremental updates of model files)
     - The layout recognition model has been upgraded to the latest `doclayout_yolo(2501)` model, improving layout recognition accuracy.

+ 10 - 0
README_zh-CN.md

@@ -46,6 +46,16 @@
 </div>
 
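 # Changelog
+- 2025/02/24 1.2.0 released. In this version we fixed several issues and improved parsing efficiency and accuracy:
+  - Performance Optimization
+    - Increased classification speed for PDF documents in auto mode
+    - Added high-performance plugin support for Huawei Ascend NPU acceleration mode, with end-to-end speedups of up to 300% in common scenarios [Application link](https://aicarrier.feishu.cn/share/base/form/shrcnb10VaoNQB8kQPA8DEfZC6d)
+  - Parsing Optimization
+    - Optimized the parsing logic for documents containing watermarks, significantly improving parsing results for such documents
+    - Improved the matching logic between multiple images/tables and captions within a single page, improving the accuracy of image-text matching in complex layouts
+  - Bug Fixes
+    - Fixed an exception caused by image/table spans being filled into text blocks under certain conditions
+    - Fixed an issue where title blocks were empty in some cases
 - 2025/01/22 1.1.0 released. In this version we focused on improving parsing accuracy and efficiency:
   - Model capability upgrade (requires re-executing the [model download process](docs/how_to_download_models_zh_cn.md) to obtain incremental updates of model files)
     - The layout recognition model has been upgraded to the latest `doclayout_yolo(2501)` model, improving layout recognition accuracy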