Ticket #631 (closed defect: worksforme)

Opened 8 years ago

Last modified 7 years ago

incremental classifiers

Reported by: kjdon Owned by: kjdon
Priority: moderate Milestone: 2.84 Release
Component: Collection Building Severity: major
Keywords: Cc:

Description

When incrementally updating classifiers, we first reconstruct the old document, then classify the new version with edit mode delete/update. In some cases this leaves the document in the classifier twice. Can we skip reconstructing a doc when we know it will be deleted or updated?

part of the output:

Adding reconstructed HASH30aa188f4d8ddaef558fc9 into classify structures
Adding reconstructed HASH163120a9a8b21602ebb36c into classify structures
ArchivesInfPlugin: processing /research/kjdon/home/testing/2.83/gsdl/collect/lucenedemo/archives/archiveinf-doc.gdb
GreenstoneXMLPlugin: processing HASH1631.dir/doc.xml
Deleting old HASH163120a9a8b21602ebb36c for List
Deleting old HASH163120a9a8b21602ebb36c for Hierarchy
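
The suggestion above (don't reconstruct a doc that is about to be deleted or updated) could look roughly like the sketch below. It is only an illustration in Python, not the actual Greenstone code: the function name docs_to_reconstruct, the pending_edits mapping and the edit mode strings are all hypothetical, and it assumes the set of pending edits is known before reconstruction starts. The two doc IDs are taken from the output above, with the second one assumed to be marked for update.

```python
def docs_to_reconstruct(archived_ids, pending_edits):
    """Return the archived doc IDs that should be re-added to the classifiers.

    archived_ids:  doc IDs found in the old archives database (hypothetical input).
    pending_edits: dict mapping doc ID -> edit mode, e.g. "update" or "delete"
                   (hypothetical; mode names are assumptions for this sketch).
    """
    skip_modes = {"update", "delete"}
    # Docs with a pending update/delete are skipped here, so the classifier
    # only ever sees the new version (or nothing at all, for a delete).
    return [doc_id for doc_id in archived_ids
            if pending_edits.get(doc_id) not in skip_modes]


archived = ["HASH30aa188f4d8ddaef558fc9", "HASH163120a9a8b21602ebb36c"]
edits = {"HASH163120a9a8b21602ebb36c": "update"}  # assumed edit mode
kept = docs_to_reconstruct(archived, edits)
print(kept)
```

With this filter in place, only the untouched doc would be reconstructed, and the updated one would enter the classify structures once, when its new version is processed.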

Change History

Changed 8 years ago by kjdon

  • owner changed from nobody to kjdon
  • status changed from new to assigned

Changed 7 years ago by kjdon

  • status changed from assigned to closed
  • resolution set to worksforme

I couldn't reproduce this. Maybe the code has changed since I found this problem originally?

Will close the ticket for now. It can be reopened if we find a collection where we can reproduce the problem.

I tried with some demo documents, using the List and Hierarchy classifiers, and also AZList and AZSectionList.
