Changeset 36794 for main


Timestamp:
2022-10-13T19:21:02+13:00
Author:
anupama
Message:

Dr Bainbridge fixed a diffcol issue with the ordering of a classifier's subsidiary documents: for identical titles under an author bookshelf of the AZCompactList classifier on Creators in the Word-PDF-Basic tutorial model collection, the documents would appear in a different order each time.

The solution was 2-fold. First, besides the PERL_PERTURB_KEYS environment variable, which we already set to 0, there is also PERL_HASH_SEED (see https://www.perlmonks.org/?node_id=1167787 ). Both need to be set to 0 to get consistent ordering when calling Perl's 'keys' function on a hash.

Second, AZCompactList's sort property is now initialised to 'nosort', which makes it use an array (which preserves ordering) instead of AZCompactList's default behaviour of using a hash (which does not guarantee any ordering). On its own, setting the sort property to nosort gave a consistent order for identically titled documents within a single build, but not between builds. Consistent ordering between builds is what PERL_PERTURB_KEYS in conjunction with PERL_HASH_SEED ensures.
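
As a rough illustration of the between-builds part (not part of this changeset), the POSIX shell sketch below starts perl twice with both variables set to 0 and compares the order that 'keys' returns for the same hash. The function name key_order and the ten-key sample hash are made up for the example, and it assumes perl is on the PATH. Perl reads both variables when the interpreter starts, so they have to be in the environment before perl is launched.

```shell
#!/bin/sh
# Hypothetical helper: print the key order of a small sample hash
# from a freshly started perl process, with hash randomisation pinned.
key_order() {
    PERL_HASH_SEED=0 PERL_PERTURB_KEYS=0 \
        perl -e 'my %h = map { $_ => 1 } ("a" .. "j"); print join(",", keys %h), "\n";'
}

# Two separate perl processes, standing in for two separate builds.
first=$(key_order)
second=$(key_order)

if [ "$first" = "$second" ]; then
    echo "same key order in both runs"
else
    echo "key order differed between runs"
fi
```

With both variables set as above, the two runs should report the same order. Dropping the two assignments from key_order on Perl 5.18 or later would typically make the order vary from run to run.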

Location:
main/trunk
Files:
4 edited

  • main/trunk/greenstone2/common-src/cgi-bin/gsdlCGI.pm

    r36096 r36794  
    @@ -704,7 +704,8 @@
         }
     
    -    # If perl_perturb_keys isn't set, then search results with remote GS
    -    # return different documents from the ones that should be returned
    +    # If perl_perturb_keys and related perl_hash_seed aren't set, then search results
    +    # with remote GS return different documents from the ones that should be returned
         $ENV{'PERL_PERTURB_KEYS'}=0;
    +    $ENV{'PERL_HASH_SEED'}=0;
         $ENV{'WGETRC'}=&FileUtils::filenameConcatenate($gsdlhome,"bin",$gsdlos,"wgetrc");
     }
  • main/trunk/greenstone2/perllib/classify/AZCompactList.pm

    r33902 r36794  
    @@ -65,4 +65,5 @@
         'type' => "metadata",
     #   'deft' => "Title",
    +    'deft' => "nosort",
         'reqd' => "no" },
           { 'name' => "removeprefix",
  • main/trunk/greenstone2/setup.bat

    r34111 r36794  
    @@ -292,4 +292,5 @@
     :: Perl >= v5.18.* randomises map iteration order within a process
     set PERL_PERTURB_KEYS=0
    +set PERL_HASH_SEED=0
     
     :: The user can customise wget flags like number of retries and setting timeouts in the Wgetrc file.
  • main/trunk/greenstone3/src/java/org/greenstone/gsdl3/build/GS2PerlConstructor.java

    r33051 r36794  
    @@ -397,4 +397,5 @@
         args.add("GSDL-RUN-SETUP=true");
         args.add("PERL_PERTURB_KEYS=0");
    +    args.add("PERL_HASH_SEED=0");
     
         if(envvars != null) {