source: other-projects/nightly-tasks/diffcol/trunk/model-collect/Enhanced-PDF/log/build_log.1372915500033.txt@ 27958

Last change on this file since 27958 was 27958, checked in by ak19, 11 years ago

Adding in the Enhanced-PDF model collection

File size: 18.1 KB
1s
2Command: perl -S /research/ak19/GS286bin_26Jun2013/bin/script/full-import.pl -gli -language en -collectdir /research/ak19/GS286bin_26Jun2013/collect -verbosity 5 Enhanced-PDF
3import.pl> Removing current contents of the archives directory...
4import.pl> Removing contents of the collection "tmp" directory...
5import.pl> Global file scan checking directory: /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/import
6import.pl> DirectoryPlugin block: getting directory /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/import
7import.pl> DirectoryPlugin block recurring: metadata.xml
8import.pl> DirectoryPlugin block recurring: pdf01.pdf
9import.pl> DirectoryPlugin block recurring: pdf03.pdf
10import.pl> DirectoryPlugin block recurring: pdf05-notext.pdf
11import.pl> DirectoryPlugin block recurring: pdf06-weirdchars.pdf
12import.pl> DirectoryPlugin read: getting directory /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/import
13import.pl> DirectoryPlugin metadata recurring: metadata.xml
14import.pl> MetadataXMLPlugin: processing metadata.xml
15import.pl> DirectoryPlugin metadata recurring: pdf01.pdf
16import.pl> EmbeddedMetadataPlugin: processing pdf01.pdf
17import.pl> Extracted 15 pieces of metadata from /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/import/pdf01.pdf EXIF block
18import.pl> DirectoryPlugin metadata recurring: pdf03.pdf
19import.pl> EmbeddedMetadataPlugin: processing pdf03.pdf
20import.pl> Extracted 16 pieces of metadata from /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/import/pdf03.pdf EXIF block
21import.pl> DirectoryPlugin metadata recurring: pdf05-notext.pdf
22import.pl> EmbeddedMetadataPlugin: processing pdf05-notext.pdf
23import.pl> Extracted 34 pieces of metadata from /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/import/pdf05-notext.pdf EXIF block
24import.pl> DirectoryPlugin metadata recurring: pdf06-weirdchars.pdf
25import.pl> EmbeddedMetadataPlugin: processing pdf06-weirdchars.pdf
26import.pl> Extracted 16 pieces of metadata from /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/import/pdf06-weirdchars.pdf EXIF block
27import.pl> DirectoryPlugin: file /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/import/metadata.xml was blocked for read
28import.pl> DirectoryPlugin: preparing metadata for pdf01.pdf
29import.pl> File "pdf01.pdf" matches filespec "pdf01\.pdf"
30import.pl> DirectoryPlugin recurring: pdf01.pdf
31import.pl> Converting pdf01.pdf to pagedimg_jpg format
32import.pl> calling cmd "/usr/bin/perl" -S gsConvert.pl -verbose 5 -pdf_zoom 2 -errlog "/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915502/err.log" -output pagedimg_jpg "/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915502/pdf01.pdf"
33import.pl> Error executing pdftoimg.pl
34import.pl> pdfpstoimg error log:
35import.pl> util::mk_dir() is deprecated, using FileUtils::makeDirectory() instead at /research/ak19/GS286bin_26Jun2013/bin/script/pdfpstoimg.pl line 76
36import.pl> convert: memory allocation failed `/tmp/magick-14441lxXiryZUEMx61' @ error/png.c/ReadOnePNGImage/2160.
37import.pl> convert: corrupt image `/tmp/magick-14441lxXiryZUEMx61' @ error/png.c/ReadPNGImage/3794.
38import.pl> convert: Postscript delegate failed `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915502/pdf01.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/681.
39import.pl> convert: no images defined `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915502/pdf01/pdf01.jpg' @ error/convert.c/ConvertImageCommand/3068.
40import.pl> Convert error for /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915502/pdf01.pdf
41import.pl> Could not convert pdf01.pdf to pagedimg_jpg format
42import.pl> util::mk_dir() is deprecated, using FileUtils::makeDirectory() instead at /research/ak19/GS286bin_26Jun2013/bin/script/pdfpstoimg.pl line 76
43import.pl> convert: memory allocation failed `/tmp/magick-14441lxXiryZUEMx61' @ error/png.c/ReadOnePNGImage/2160.
44import.pl> convert: corrupt image `/tmp/magick-14441lxXiryZUEMx61' @ error/png.c/ReadPNGImage/3794.
45import.pl> convert: Postscript delegate failed `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915502/pdf01.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/681.
46import.pl> convert: no images defined `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915502/pdf01/pdf01.jpg' @ error/convert.c/ConvertImageCommand/3068.
47import.pl> Convert error for /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915502/pdf01.pdf
48import.pl> WARNING: No plugin could process pdf01.pdf
49import.pl> DirectoryPlugin: preparing metadata for pdf03.pdf
50import.pl> File "pdf03.pdf" matches filespec "pdf03\.pdf"
51import.pl> DirectoryPlugin recurring: pdf03.pdf
52import.pl> Converting pdf03.pdf to pagedimg_jpg format
53import.pl> calling cmd "/usr/bin/perl" -S gsConvert.pl -verbose 5 -pdf_zoom 2 -errlog "/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915503/err.log" -output pagedimg_jpg "/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915503/pdf03.pdf"
54import.pl> Error executing pdftoimg.pl
55import.pl> pdfpstoimg error log:
56import.pl> util::mk_dir() is deprecated, using FileUtils::makeDirectory() instead at /research/ak19/GS286bin_26Jun2013/bin/script/pdfpstoimg.pl line 76
57import.pl> convert: memory allocation failed `/tmp/magick-14451W38ut5Mb3Xfh1' @ error/png.c/ReadOnePNGImage/2160.
58import.pl> convert: corrupt image `/tmp/magick-14451W38ut5Mb3Xfh1' @ error/png.c/ReadPNGImage/3794.
59import.pl> convert: Postscript delegate failed `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915503/pdf03.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/681.
60import.pl> convert: no images defined `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915503/pdf03/pdf03.jpg' @ error/convert.c/ConvertImageCommand/3068.
61import.pl> Convert error for /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915503/pdf03.pdf
62import.pl> Could not convert pdf03.pdf to pagedimg_jpg format
63import.pl> util::mk_dir() is deprecated, using FileUtils::makeDirectory() instead at /research/ak19/GS286bin_26Jun2013/bin/script/pdfpstoimg.pl line 76
64import.pl> convert: memory allocation failed `/tmp/magick-14451W38ut5Mb3Xfh1' @ error/png.c/ReadOnePNGImage/2160.
65import.pl> convert: corrupt image `/tmp/magick-14451W38ut5Mb3Xfh1' @ error/png.c/ReadPNGImage/3794.
66import.pl> convert: Postscript delegate failed `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915503/pdf03.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/681.
67import.pl> convert: no images defined `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915503/pdf03/pdf03.jpg' @ error/convert.c/ConvertImageCommand/3068.
68import.pl> Convert error for /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915503/pdf03.pdf
69import.pl> WARNING: No plugin could process pdf03.pdf
70import.pl> DirectoryPlugin: preparing metadata for pdf05-notext.pdf
71import.pl> File "pdf05-notext.pdf" matches filespec "pdf05-notext\.pdf"
72import.pl> DirectoryPlugin recurring: pdf05-notext.pdf
73import.pl> Converting pdf05-notext.pdf to pagedimg_jpg format
74import.pl> calling cmd "/usr/bin/perl" -S gsConvert.pl -verbose 5 -pdf_zoom 2 -errlog "/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915504/err.log" -output pagedimg_jpg "/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915504/pdf05-notext.pdf"
75import.pl> Error executing pdftoimg.pl
76import.pl> pdfpstoimg error log:
77import.pl> util::mk_dir() is deprecated, using FileUtils::makeDirectory() instead at /research/ak19/GS286bin_26Jun2013/bin/script/pdfpstoimg.pl line 76
78import.pl> convert: memory allocation failed `/tmp/magick-144612bFmsycgPWmz1' @ error/png.c/ReadOnePNGImage/2160.
79import.pl> convert: corrupt image `/tmp/magick-144612bFmsycgPWmz1' @ error/png.c/ReadPNGImage/3794.
80import.pl> convert: Postscript delegate failed `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915504/pdf05-notext.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/681.
81import.pl> convert: no images defined `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915504/pdf05-notext/pdf05-notext.jpg' @ error/convert.c/ConvertImageCommand/3068.
82import.pl> Convert error for /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915504/pdf05-notext.pdf
83import.pl> Could not convert pdf05-notext.pdf to pagedimg_jpg format
84import.pl> util::mk_dir() is deprecated, using FileUtils::makeDirectory() instead at /research/ak19/GS286bin_26Jun2013/bin/script/pdfpstoimg.pl line 76
85import.pl> convert: memory allocation failed `/tmp/magick-144612bFmsycgPWmz1' @ error/png.c/ReadOnePNGImage/2160.
86import.pl> convert: corrupt image `/tmp/magick-144612bFmsycgPWmz1' @ error/png.c/ReadPNGImage/3794.
87import.pl> convert: Postscript delegate failed `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915504/pdf05-notext.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/681.
88import.pl> convert: no images defined `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915504/pdf05-notext/pdf05-notext.jpg' @ error/convert.c/ConvertImageCommand/3068.
89import.pl> Convert error for /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915504/pdf05-notext.pdf
90import.pl> WARNING: No plugin could process pdf05-notext.pdf
91import.pl> DirectoryPlugin: preparing metadata for pdf06-weirdchars.pdf
92import.pl> File "pdf06-weirdchars.pdf" matches filespec "pdf06-weirdchars\.pdf"
93import.pl> DirectoryPlugin recurring: pdf06-weirdchars.pdf
94import.pl> Converting pdf06-weirdchars.pdf to pagedimg_jpg format
95import.pl> calling cmd "/usr/bin/perl" -S gsConvert.pl -verbose 5 -pdf_zoom 2 -errlog "/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915505/err.log" -output pagedimg_jpg "/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915505/pdf06-weirdchars.pdf"
96import.pl> Error executing pdftoimg.pl
97import.pl> pdfpstoimg error log:
98import.pl> util::mk_dir() is deprecated, using FileUtils::makeDirectory() instead at /research/ak19/GS286bin_26Jun2013/bin/script/pdfpstoimg.pl line 76
99import.pl> convert: memory allocation failed `/tmp/magick-14471n99dVBnKQH7R1' @ error/png.c/ReadOnePNGImage/2160.
100import.pl> convert: corrupt image `/tmp/magick-14471n99dVBnKQH7R1' @ error/png.c/ReadPNGImage/3794.
101import.pl> convert: Postscript delegate failed `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915505/pdf06-weirdchars.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/681.
102import.pl> convert: no images defined `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915505/pdf06-weirdchars/pdf06-weirdchars.jpg' @ error/convert.c/ConvertImageCommand/3068.
103import.pl> Convert error for /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915505/pdf06-weirdchars.pdf
104import.pl> Could not convert pdf06-weirdchars.pdf to pagedimg_jpg format
105import.pl> util::mk_dir() is deprecated, using FileUtils::makeDirectory() instead at /research/ak19/GS286bin_26Jun2013/bin/script/pdfpstoimg.pl line 76
106import.pl> convert: memory allocation failed `/tmp/magick-14471n99dVBnKQH7R1' @ error/png.c/ReadOnePNGImage/2160.
107import.pl> convert: corrupt image `/tmp/magick-14471n99dVBnKQH7R1' @ error/png.c/ReadPNGImage/3794.
108import.pl> convert: Postscript delegate failed `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915505/pdf06-weirdchars.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/681.
109import.pl> convert: no images defined `/research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915505/pdf06-weirdchars/pdf06-weirdchars.jpg' @ error/convert.c/ConvertImageCommand/3068.
110import.pl> Convert error for /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/tmp/1372915505/pdf06-weirdchars.pdf
111import.pl> WARNING: No plugin could process pdf06-weirdchars.pdf
112import.pl> *********************************************
113import.pl> Import complete
114import.pl> *********************************************
115import.pl> * 4 documents were considered for processing
116import.pl> * 0 were processed and included in the collection
117import.pl> * 4 were rejected
118import.pl> See /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/etc/fail.log for a list of unrecognised and/or rejected documents
119import.pl> Command complete.
120import.pl> Extracting new metadata from archive files.
121import.pl> Archived metadata extraction complete.
122Command: perl -S /research/ak19/GS286bin_26Jun2013/bin/script/full-buildcol.pl -gli -language en -collectdir /research/ak19/GS286bin_26Jun2013/collect -verbosity 5 Enhanced-PDF
123buildcol.pl> *** creating the compressed text
124buildcol.pl> collecting text statistics (mgpp_passes -T1)
125buildcol.pl> DirectoryPlugin read: getting directory /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/archives
126buildcol.pl> DirectoryPlugin metadata recurring: archiveinf-src.gdb
127buildcol.pl> DirectoryPlugin metadata recurring: earliestDatestamp
128buildcol.pl> DirectoryPlugin: preparing metadata for archiveinf-src.gdb
129buildcol.pl> DirectoryPlugin recurring: archiveinf-src.gdb
130buildcol.pl> WARNING: No plugin could recognise archiveinf-src.gdb
131buildcol.pl> DirectoryPlugin: preparing metadata for earliestDatestamp
132buildcol.pl> DirectoryPlugin recurring: earliestDatestamp
133buildcol.pl> WARNING: No plugin could recognise earliestDatestamp
134buildcol.pl> Stats (Compressing text from text)
135buildcol.pl> Total bytes in collection: 0
136buildcol.pl> Total bytes in text: 0
137buildcol.pl> ***************
138buildcol.pl> WARNING: There is very little or no text to compress
139buildcol.pl> Was this your intention?
140buildcol.pl> ***************
141buildcol.pl> creating the compression dictionary
142buildcol.pl> compressing the text (mgpp_passes -T2)
143buildcol.pl> DirectoryPlugin read: getting directory /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/archives
144buildcol.pl> DirectoryPlugin metadata recurring: archiveinf-src.gdb
145buildcol.pl> DirectoryPlugin metadata recurring: earliestDatestamp
146buildcol.pl> DirectoryPlugin: preparing metadata for archiveinf-src.gdb
147buildcol.pl> DirectoryPlugin recurring: archiveinf-src.gdb
148buildcol.pl> WARNING: No plugin could recognise archiveinf-src.gdb
149buildcol.pl> DirectoryPlugin: preparing metadata for earliestDatestamp
150buildcol.pl> DirectoryPlugin recurring: earliestDatestamp
151buildcol.pl> WARNING: No plugin could recognise earliestDatestamp
152buildcol.pl> Stats (Compressing text from text)
153buildcol.pl> Total bytes in collection: 0
154buildcol.pl> Total bytes in text: 0
155buildcol.pl> ***************
156buildcol.pl> WARNING: There is very little or no text to compress
157buildcol.pl> Was this your intention?
158buildcol.pl> ***************
159buildcol.pl> *** building index text;dc.Title,ex.dc.Title,Title;Source; in subdirectory idx
160buildcol.pl> creating index dictionary (mgpp_passes -I1)
161buildcol.pl> DirectoryPlugin read: getting directory /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/archives
162buildcol.pl> DirectoryPlugin metadata recurring: archiveinf-src.gdb
163buildcol.pl> DirectoryPlugin metadata recurring: earliestDatestamp
164buildcol.pl> DirectoryPlugin: preparing metadata for archiveinf-src.gdb
165buildcol.pl> DirectoryPlugin recurring: archiveinf-src.gdb
166buildcol.pl> WARNING: No plugin could recognise archiveinf-src.gdb
167buildcol.pl> DirectoryPlugin: preparing metadata for earliestDatestamp
168buildcol.pl> DirectoryPlugin recurring: earliestDatestamp
169buildcol.pl> WARNING: No plugin could recognise earliestDatestamp
170buildcol.pl> Stats (Creating index text;dc.Title,ex.dc.Title,Title;Source;)
171buildcol.pl> Total bytes in collection: 0
172buildcol.pl> Total bytes in text;dc.Title,ex.dc.Title,Title;Source;: 0
173buildcol.pl> ***************
174buildcol.pl> WARNING: There is very little or no text to process for text;dc.Title,ex.dc.Title,Title;Source;
175buildcol.pl> Was this your intention?
176buildcol.pl> ***************
177buildcol.pl> inverting the text (mgpp_passes -I2)
178buildcol.pl> DirectoryPlugin read: getting directory /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/archives
179buildcol.pl> DirectoryPlugin metadata recurring: archiveinf-src.gdb
180buildcol.pl> DirectoryPlugin metadata recurring: earliestDatestamp
181buildcol.pl> DirectoryPlugin: preparing metadata for archiveinf-src.gdb
182buildcol.pl> DirectoryPlugin recurring: archiveinf-src.gdb
183buildcol.pl> WARNING: No plugin could recognise archiveinf-src.gdb
184buildcol.pl> DirectoryPlugin: preparing metadata for earliestDatestamp
185buildcol.pl> DirectoryPlugin recurring: earliestDatestamp
186buildcol.pl> WARNING: No plugin could recognise earliestDatestamp
187buildcol.pl> Stats (Creating index text;dc.Title,ex.dc.Title,Title;Source;)
188buildcol.pl> Total bytes in collection: 0
189buildcol.pl> Total bytes in text;dc.Title,ex.dc.Title,Title;Source;: 0
190buildcol.pl> ***************
191buildcol.pl> WARNING: There is very little or no text to process for text;dc.Title,ex.dc.Title,Title;Source;
192buildcol.pl> Was this your intention?
193buildcol.pl> ***************
194buildcol.pl> create the weights file
195buildcol.pl> creating 'on-disk' stemmed dictionary
196buildcol.pl> creating stem indexes
197buildcol.pl> deleting Enhanced-PDF.ict
198buildcol.pl> deleting Enhanced-PDF.idh
199buildcol.pl> deleting Enhanced-PDF.ii
200buildcol.pl> deleting Enhanced-PDF.ic
201buildcol.pl> deleting Enhanced-PDF.id
202buildcol.pl> BuildDir: /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/building
203buildcol.pl> *** creating the info database and processing associated files
204buildcol.pl> DirectoryPlugin read: getting directory /research/ak19/GS286bin_26Jun2013/collect/Enhanced-PDF/archives
205buildcol.pl> DirectoryPlugin metadata recurring: archiveinf-src.gdb
206buildcol.pl> DirectoryPlugin metadata recurring: earliestDatestamp
207buildcol.pl> DirectoryPlugin: preparing metadata for archiveinf-src.gdb
208buildcol.pl> DirectoryPlugin recurring: archiveinf-src.gdb
209buildcol.pl> WARNING: No plugin could recognise archiveinf-src.gdb
210buildcol.pl> DirectoryPlugin: preparing metadata for earliestDatestamp
211buildcol.pl> DirectoryPlugin recurring: earliestDatestamp
212buildcol.pl> WARNING: No plugin could recognise earliestDatestamp
213buildcol.pl> Warning: No metadata values assigned to dc.Title;ex.Title.
214buildcol.pl> *** outputting information for classifier: CL1
215buildcol.pl> Warning: No metadata values assigned to ex.Source.
216buildcol.pl> *** outputting information for classifier: CL2
217buildcol.pl> *** outputting information for classifier: oai
218buildcol.pl> *** creating auxiliary files
219buildcol.pl> Command complete.
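
The import stage above ends with a summary (4 documents considered, 0 processed, 4 rejected) preceded by one "No plugin could process" warning per file. When comparing logs like this one between nightly runs, a small script can pull out those figures. The following is only a minimal sketch: the function name summarise_import and the regular expressions are assumptions written against the exact wording seen in this log, not part of Greenstone, and other wordings (for example a singular "1 document was") are not handled.

```python
import re

# Patterns written against the wording that appears in this log only.
SUMMARY_PATTERNS = {
    "considered": re.compile(r"\* (\d+) documents were considered for processing"),
    "processed": re.compile(r"\* (\d+) were processed and included in the collection"),
    "rejected": re.compile(r"\* (\d+) were rejected"),
}
UNPROCESSED = re.compile(r"WARNING: No plugin could process (\S+)")


def summarise_import(log_text):
    """Return (counts, unprocessed_files) for the import.pl part of a log.

    Hypothetical helper for illustration; it ignores buildcol.pl lines.
    """
    counts = {}
    unprocessed = []
    for line in log_text.splitlines():
        if "import.pl>" not in line:
            continue  # skip buildcol.pl output and page furniture
        for name, pattern in SUMMARY_PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] = int(match.group(1))
        match = UNPROCESSED.search(line)
        if match:
            unprocessed.append(match.group(1))
    return counts, unprocessed


# A few lines copied from the log above (leading line numbers kept as shown).
sample = """\
48import.pl> WARNING: No plugin could process pdf01.pdf
69import.pl> WARNING: No plugin could process pdf03.pdf
115import.pl> * 4 documents were considered for processing
116import.pl> * 0 were processed and included in the collection
117import.pl> * 4 were rejected
130buildcol.pl> WARNING: No plugin could recognise archiveinf-src.gdb
"""

counts, unprocessed = summarise_import(sample)
print(counts)
print(unprocessed)
```

The patterns use search rather than match, so the line numbers the repository browser fuses onto each line do not need to be stripped first.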