Changeset 26158

Show
Ignore:
Timestamp:
07.09.2012 17:06:09 (7 years ago)
Author:
ak19
Message:

Bugfix. Many thanks to Anthony Blake. The phind browse classifier used to fail on certain particular combinations of documents on 64 bit linux. This was because of a bug in suffix.exe that caused a seg fault: all pointers are 64 bits on 64 bit machines. However, suffixCompare() and prefixCompare() functions used to try to convert their arguments, which were void* pointers to pointers, by first casting the void* pointers to symbol* and then dereference these to get their values (which are ultimately 32 bit unsigned int) but which are meant to be addresses (pointers themselves) and then converting these back to pointers objects which are 64 bit values. During this process of multiple conversion, 64 bit pointers were truncated to 32 bits before being given 64 bits for storage again. This used to work on 32 bit machines before since pointers there were 32 bits, but showed up here. The solution was a double pointer cast operation: since the data in the void* arguments are actually pointers to pointers, convert them directly back to the specific pointer data types we need. Many thanks to Anthony Blake who initially came to help with working out why gdb wasn't loading the symbols despite suffix.cpp being compiled up with the minus-g flag (when attempting to debug by printing out the values of the dereferenced pointers where things appeared to go wrong or at least step through the code). He ended up finding out what the bug actually was and that a conversion to double pointers was the solution necessary to get things working on 64 bit machines as well as still working on 32 bit machines. Now the phind classifier's been tested on 64 bit and 32 bit machines. And Anthony confirmed that this issue has indeed been resolved. Though it struck randomly, as only certain document combinations in collections show up the underlying error via a seg fault, he inspected the assembly instructions in gdb to see that the move operations were now correct, which they hadn't been before.

Files:
1 modified

Legend:

Unmodified
Added
Removed
  • main/trunk/greenstone2/build-src/src/phind/generate/suffix.cpp

    r26135 r26158  
    180180int suffixCompare(const void *cpa, const void *cpb) { 
    181181 
    182   // Cast then dereference pointers to suffix array elements 
    183   symbol *pa = (symbol *) cpa; 
    184   symbol *pb = (symbol *) cpb; 
    185   pa = (symbol *) *pa; 
    186   pb = (symbol *) *pb; 
     182  // The following is a bug and so at times results in a segmentation fault on 64 bit linux machines. 
     183  // The bug is in the 3rd and 4th lines, which perform identical operations. Considering the 3rd line: 
     184  // The *pa part of this statement truncates pa, the 64 bit pointer (symbol*), with the 32 bit value  
     185  // symbol (unsigned int) that it's being dereferenced to, before trying to reconvert it to a pointer 
     186  // which is always 64 bit on 64 bit machines. The 4th line has the same problem and prefixCompare() too. 
     187 
     188  // WRONG: Cast then dereference pointers to suffix array elements 
     189  //symbol *pa = (symbol *) cpa; 
     190  //symbol *pb = (symbol *) cpb; 
     191  //pa = (symbol *) *pa;  
     192  //pb = (symbol *) *pb;  
     193 
     194  // This is the correct way that will work on both 64 bit and 32 bit machines: 
     195  // the 2 void* arguments to this function, cpa and cpb, are "pointers to elements" as per  
     196  // http://www.cplusplus.com/reference/clibrary/cstdlib/qsort/. The array elements themselves are also  
     197  // *pointers* themselves: to symbol values. As the comment to this function says: cpa and cpb are  
     198  // pointers to pointers, making them double pointers. When we cast the void* cpa and cpb arguments to  
     199  // the types we need to work with here, they therefore need to be cast as double pointers to symbol. 
     200  // We declare pa and pb as symbol pointers (symbol*) to the actual values being compared, therefore pa  
     201  // and pb need to be instantiated as the address of the values referenced by cpa and cpb. 
     202  symbol *pa = *((symbol **) cpa);  
     203  symbol *pb = *((symbol **) cpb); 
     204   
    187205 
    188206  // If the two elements are the same, examine the next one 
     
    210228int prefixCompare(const void *cpa, const void *cpb) { 
    211229 
    212   // Cast then dereference pointers to prefix array elements 
    213   symbol *pa = (symbol *) cpa; 
    214   symbol *pb = (symbol *) cpb; 
    215   pa = (symbol *) *pa; 
    216   pb = (symbol *) *pb; 
     230  // Cast and dereference double pointers (pointers to prefix array elements) 
     231  // See comments in suffixCompare() 
     232  symbol *pa = *((symbol **) cpa);  
     233  symbol *pb = *((symbol **) cpb); 
    217234 
    218235  // If the two elements are the same, examine the next one 
     
    222239  } 
    223240 
    224   // Make the copmparison and return 
     241  // Make the comparison and return 
    225242  if ( *pa > *pb) { 
    226243    return -1;