|
@@ -47,6 +47,11 @@ Easier to use: Just grab MinerU Desktop. No coding, no login, just a simple inte
|
|
|
</div>
|
|
</div>
|
|
|
|
|
|
|
|
# Changelog
|
|
|
|
|
+- 2025/01/22 1.1.0 released. In this version we have focused on improving parsing accuracy and efficiency:
|
|
|
|
|
+ - Upgraded to the latest doclayout_yolo(2501) model, enhancing layout recognition accuracy.
|
|
|
|
|
+ - Upgraded to the latest unimernet(2501) model, improving formula recognition accuracy.
|
|
|
|
|
+ - On devices that meet the configuration requirements (16GB+ VRAM), overall parsing speed has increased by more than 50% through optimized resource usage and a restructured processing pipeline.
|
|
|
|
|
+ - Added a heading classification feature (testing version, enabled by default) to the online demo. It classifies headings hierarchically, which improves document structure.
|
|
|
- 2025/01/10 1.0.1 released. This is our first official release. It introduces a completely new API and improved compatibility through extensive refactoring, along with a new automatic language identification feature:
|
|
|
- New API Interface
|
|
|
- For the data-side API, we have introduced the Dataset class, designed to provide a robust and flexible data processing framework. This framework currently supports a variety of document formats, including images (.jpg and .png), PDFs, Word documents (.doc and .docx), and PowerPoint presentations (.ppt and .pptx). It ensures effective support for data processing tasks ranging from simple to complex.
|
|
@@ -356,6 +361,7 @@ TODO
|
|
|
- [x] Reading order based on the model
|
|
|
- [x] Recognition of `index` and `list` in the main text
|
|
|
- [x] Table recognition
|
|
|
|
|
+- [x] Heading classification
|
|
|
- [ ] Code block recognition in the main text
|
|
|
- [ ] [Chemical formula recognition](docs/chemical_knowledge_introduction/introduction.pdf)
|
|
|
- [ ] Geometric shape recognition
|
|
@@ -365,7 +371,6 @@ TODO
|
|
|
- Reading order is determined by the model based on the spatial distribution of readable content, and may be out of order in some areas under extremely complex layouts.
|
|
|
- Vertical text is not supported.
|
|
|
- Tables of contents and lists are recognized through rules, and some uncommon list formats may not be recognized.
|
|
|
-- Only one level of headings is supported; hierarchical headings are not currently supported.
|
|
|
|
|
- Code blocks are not yet supported in the layout model.
|
|
|
- Comic books, art albums, primary school textbooks, and exercises cannot be parsed well.
|
|
|
- Table recognition may produce row/column errors in complex tables.
|