fix(magic_pdf): correct range for images in document analysis

- Update the range used to generate images_with_extra_info to match the number of images
- This fixes a potential IndexError when fewer images are collected than len(dataset)
myhloli 7 months ago
Parent
commit
67b31a78d0
1 file changed, 1 addition and 1 deletion

+ 1 - 1
magic_pdf/model/doc_analyze_by_custom_model.py

@@ -147,7 +147,7 @@ def doc_analyze(
             images.append(img_dict['img'])
             page_wh_list.append((img_dict['width'], img_dict['height']))
 
-    images_with_extra_info = [(images[index], ocr, dataset._lang) for index in range(len(dataset))]
+    images_with_extra_info = [(images[index], ocr, dataset._lang) for index in range(len(images))]
 
     if len(images) >= MIN_BATCH_INFERENCE_SIZE:
         batch_size = MIN_BATCH_INFERENCE_SIZE
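
The effect of the one-line change can be reproduced in isolation. The following is a minimal sketch, not code from the repository: `build_old`, `build_new`, `dataset_len`, and the sample values are hypothetical stand-ins for the surrounding `doc_analyze` state, assuming only that `images` can end up shorter than `len(dataset)`.

```python
# Hypothetical stand-ins for the state inside doc_analyze:
# `images` holds only the pages that were actually collected,
# `dataset_len` plays the role of len(dataset).

def build_old(images, dataset_len, ocr, lang):
    # Pre-fix behaviour: indexes `images` with positions taken from the dataset.
    return [(images[index], ocr, lang) for index in range(dataset_len)]


def build_new(images, ocr, lang):
    # Post-fix behaviour: the range is bounded by the list being indexed.
    return [(images[index], ocr, lang) for index in range(len(images))]


images = ["page0", "page1"]  # two images collected
dataset_len = 5              # but the dataset reports five entries

try:
    build_old(images, dataset_len, True, "en")
    old_raised = False
except IndexError:
    # images[2] does not exist, so the old comprehension fails here.
    old_raised = True

result = build_new(images, True, "en")
print(old_raised, result)
```

Iterating directly over the list, as in `[(img, ocr, lang) for img in images]`, would give the same result without any index arithmetic; the commit keeps the indexed form, so the sketch does too.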