source: other-projects/nightly-tasks/diffcol/trunk/diffcol/gdbdiff.pm@ 27604

Last change on this file since 27604 was 27604, checked in by ak19, 11 years ago

Fixing up the diffcol process so it works better. The current state finds no errors in the Small-HTML model collection.
1. Better handling of gdb databases (.idh files are ignored): fields that are expected to differ, such as dates, are filtered out before doing the diff. Handles archiveinf-doc.gdb and archiveinf-src.gdb files. With the sort flag Dr Bainbridge added to db2txt and the sorting of keys in perllib/dbutil/gdbmtxtgz, the ordering of keys in the database no longer affects the outcome.
2. Better handling of doc.xml files. Once again, date fields that will differ are filtered out before performing the diff. The EarliestDatestamp file is ignored.
3. The task script now ensures that model-collect is up to date with the svn version before performing the diffcol testing.
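The filtering step described in point 1 can be illustrated with a minimal sketch. The helper name `strip_volatile_fields` and the sample records are hypothetical and not part of gdbdiff.pm; only the list of field names is taken from the module. The idea is that two dumps differing only in date fields compare equal once those lines are removed:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of gdbdiff.pm): remove whole lines for fields
# that are expected to differ between two builds of the same collection.
sub strip_volatile_fields {
    my ($text) = @_;
    $text =~ s/\n<(?:lastmodified|lastmodifieddate|oailastmodified|oailastmodifieddate)>[^\n]*//g;
    return $text;
}

# Made-up sample records in the "<field>value" line format.
my $model = "[doc1]\n<Title>Hello\n<lastmodified>1370000000\n<oailastmodifieddate>20130601\n";
my $test  = "[doc1]\n<Title>Hello\n<lastmodified>1380000000\n<oailastmodifieddate>20130922\n";

print strip_volatile_fields($model) eq strip_volatile_fields($test)
    ? "identical after filtering\n"
    : "still different\n";
```

Because the pattern starts with a newline, a volatile field on the very first line of the text would not be removed; this mirrors the pattern used in the module.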

File size: 2.7 KB
package gdbdiff;

BEGIN {
    die "GSDLHOME not set\n" unless defined $ENV{'GSDLHOME'};
    die "GSDLOS not set\n" unless defined $ENV{'GSDLOS'};
    unshift (@INC, "$ENV{'GSDLHOME'}/perllib");
    unshift (@INC, "$ENV{'GSDLHOME'}/perllib/cpan");
}

use util;
use diffutil;
use Text::Diff;

# Run $cmd through a pipe and return everything it prints as a single string.
sub readin_gdb
{
    my ($cmd) = @_;

    # Lexical filehandle, so the pipe cannot clash with a global PIN handle.
    open(my $pin, "$cmd|")
        || die "Unable to open pipe to $cmd: $!\n";

    my $text_content = "";

    while (defined (my $line = <$pin>)) {
        $text_content .= $line;
    }

    close($pin);
    return $text_content;
}


sub test_gdb
{
    my ($full_modeldb, $full_testdb, $strColName) = @_;


    # print "Now is testing database\n";

    # need to sort text output of both test and model col database files, to normalise them for the comparison
    # the -sort option to db2txt was added specifically to support diffcol
    my $model_cmd = "db2txt -sort $full_modeldb 2>&1";
    my $test_cmd = "db2txt -sort $full_testdb 2>&1";

    my $model_text = readin_gdb($model_cmd);
    my $test_text = readin_gdb($test_cmd);


    # filter out the fields that can be ignored in the two database files
    my $ignore_line_re = "\n<(lastmodified|lastmodifieddate|oailastmodified|oailastmodifieddate)>([^\n])*";
    $model_text =~ s/$ignore_line_re//g;
    $test_text =~ s/$ignore_line_re//g;


    # ignore absolute path prefixes in modelcol and testcol (necessary for archiveinf-doc and -src.gdb files)

    # Remember the original model col on SVN could have been built anywhere,
    # and in the gdb files, absolute paths are stored to the collection location.
    # Crop these paths to the collect/<colname> point.

    # Entries are of the form [Entry] or <Entry>. In order to do a sensible diff,
    # need to remove the prefix to the collect/colname folder in any (absolute) path that occurs in Entry
    # E.g. [/full/path/collect/colname/import/file.ext] should become [collect/colname/import/file.ext]
    # Better regex is of the form /BEGIN((?:(?!BEGIN).)*)END/, see http://docstore.mik.ua/orelly/perl/cookbook/ch06_16.htm

    $model_text =~ s@^([^\\//]*).*(\\|/)(collect(\\|/)$strColName)(.*)$@$1$3$5@mg;
    $test_text =~ s@^([^\\//]*).*(\\|/)(collect(\\|/)$strColName)(.*)$@$1$3$5@mg;


    my $report_type = "OldStyle"; # Cannot change this type.
    my $diff_gdb = diff \$model_text, \$test_text, { STYLE => $report_type };

    # leaving the ignore regex as it used to be in the following, in case it helps with single line comparisons
    $diff_gdb = &diffutil::GenerateOutput($diff_gdb,"^<(lastmodified|lastmodifieddate|oailastmodified|oailastmodifieddate)>.*");

    if($diff_gdb eq "")
    {
        return "";
    }
    else
    {
        return "Difference Report: Differences found in the Database file: \n$diff_gdb";
    }
    # Call diff?
}

1;
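
The path-cropping substitution in test_gdb can be tried in isolation with a small sketch. The helper name `crop_collect_prefix` and the sample text are hypothetical; the pattern is a simplified rewrite of the one in the module (character classes instead of alternations, and `\Q...\E` around the collection name, which is an addition here rather than something the module does):

```perl
use strict;
use warnings;

# Hypothetical helper (not part of gdbdiff.pm): on each line, keep the leading
# bracket text, drop everything up to the last "collect/<colname>" and keep the
# rest of the line. Lines without such a path are left untouched.
sub crop_collect_prefix {
    my ($text, $col_name) = @_;
    $text =~ s{^([^\\/]*).*[\\/](collect[\\/]\Q$col_name\E)(.*)$}{$1$2$3}mg;
    return $text;
}

# Made-up sample: one entry holding an absolute path, one ordinary field.
my $sample = "[/full/path/collect/colname/import/file.ext]\n<Title>No path here\n";
print crop_collect_prefix($sample, "colname");
```

For the first line this yields `[collect/colname/import/file.ext]`, the form given in the module's comments, so collections built under different parent directories produce the same text.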