Changeset 24166
- Timestamp: 2011-06-16T19:16:15+12:00
- Location: main/trunk
- Files: 3 edited
main/trunk/binaries/windows/bin/docx2html.vbs
--- main/trunk/binaries/windows/bin/docx2html.vbs (r24164)
+++ main/trunk/binaries/windows/bin/docx2html.vbs (r24166)
@@ -1,18 +1,46 @@
 Option Explicit
 
-'args = WScript.Arguments.Count
-'If args < 2 then
-' WScript.Echo usage: args.vbs argument [input docx path] [output html path]
-' WScript.Quit
-'end If
-'WScript.Echo WScript.Arguments.Item(0)
-'WScript.Echo WScript.Arguments.Item(1)
+' http://www.robvanderwoude.com/vbstech_automation_word.php
+' http://www.nilpo.com/2008/06/windows-scripting/reading-word-documents-in-wsh/ - for grabbing just the text (cleaned of Word mark-up) from a doc(x)
+' http://msdn.microsoft.com/en-us/library/3ca8tfek%28v=VS.85%29.aspx - VBScript Functions (CreateObject etc)
 
-Doc2HTML WScript.Arguments.Item(0),WScript.Arguments.Item(1)
-' In terminal, run as: > docx2html.vbs C:\fullpath\to\input.docx C:\fullpath\to\output.html
+' Error Handling:
+' http://blogs.msdn.com/b/ericlippert/archive/2004/08/19/error-handling-in-vbscript-part-one.aspx
+' http://msdn.microsoft.com/en-us/library/53f3k80h%28v=VS.85%29.aspx
 
 
-' http://www.robvanderwoude.com/vbstech_automation_word.php
-' http://www.nilpo.com/2008/06/windows-scripting/reading-word-documents-in-wsh/
+' To Do:
+' +1. error output on bad input to this file. And commit.
+' +1b. Active X error msg when trying to convert normal *.doc: only when windows scripting is on and Word not installed.
+' +1c. Make docx accepted by default as well. Changed WordPlugin.
+' 2. Try converting from other office types (xlsx, pptx) to html. They may use other constants for conversion filetypes
+' 3. gsConvert.pl's any_to_txt can be implemented for docx by getting all the text contents. Use a separate subroutine for this. Or use wdFormatUnicodeText as outputformat.
+' 4. Try out this script on Windows 7 to see whether WSH is active by default, as it is on XP and Vista.
+' 5. What kind of error occurs if any when user tries to convert docx on a machine with an old version of Word (pre-docx/pre-Word 2007)?
+' 6. Ask Dr Bainbridge whether this script can or shouldn't replace word2html, since this launches all version of word as well I think.
+
+
+' gsConvert.pl expects error output to go to the console's STDERR
+' for which we need to launch this vbs with "CScript //Nologo" '(cannot use WScript if using StdErr
+' and //Nologo is needed to repress Microsoft logo text output which messes up error reporting)
+' http://www.devguru.com/technologies/wsh/quickref/wscript_StdErr.html
+Dim objStdErr, args
+Set objStdErr = WScript.StdErr
+
+args = WScript.Arguments.Count
+If args < 2 then
+    'WScript.Echo Usage: args.vbs argument [input docx path] [output html path]
+    objStdErr.Write ("ERROR. Usage: CScript //Nologo " & WScript.ScriptName & " [input office doc path] [output html path]" & vbCrLf)
+    WScript.Quit
+end If
+
+' Now run the conversion subroutine
+Doc2HTML WScript.Arguments.Item(0),WScript.Arguments.Item(1)
+' In terminal, run as: > docx2html.vbs C:\fullpath\to\input.docx C:\fullpath\to\output.html
+' In terminal, run as: > CScript //Nologo docx2html.vbs C:\fullpath\to\input.docx C:\fullpath\to\output.html
+' if you want echoed error output to go to console (instead of creating a popup) and to avoid 2 lines of MS logo.
+' Will be using WScript.StdErr object to make error output go to stderr of CScript console (can't launch with WScript).
+' http://www.devguru.com/technologies/wsh/quickref/wscript_StdErr.html
+
 
 Sub Doc2HTML( inFile, outHTML )
@@ -55,10 +83,19 @@
     Const wdFormatXMLTemplateMacroEnabled = 15
     Const wdFormatXPS = 18
 
     ' Create a File System object
     Set objFSO = CreateObject( "Scripting.FileSystemObject" )
 
-    ' Create a Word object
+    ' Create a Word object. Exit with error msg if not possible (such as when Word is not installed)
+    On Error Resume Next
     Set objWord = CreateObject( "Word.Application" )
+    If CStr(Err.Number) = 429 Then ' 429 is the error code for "ActiveX component can't create object"
+        ' http://msdn.microsoft.com/en-us/library/xe43cc8d%28v=VS.85%29.aspx
+        'WScript.Echo "Microsoft Word cannot be found -- document conversion cannot take place. Error #" & CStr(Err.Number) & ": " & Err.Description & "." & vbCrLf
+        objStdErr.Write ("ERROR: Windows-scripting failed. Document conversion cannot take place:" & vbCrLf)
+        objStdErr.Write (" Microsoft Word cannot be found or cannot be launched. (Error #" & CStr(Err.Number) & ": " & Err.Description & "). " & vbCrLf)
+        objStdErr.Write (" For converting the latest Office documents, install OpenOffice and Greenstone's OpenOffice extension. (Turn it on and turn off windows-scripting.)" & vbCrLf)
+        Exit Sub
+    End If
 
     With objWord
@@ -71,5 +108,6 @@
             strFile = objFile.Path
         Else
-            WScript.Echo "FILE OPEN ERROR: The file does not exist" & vbCrLf
+            'WScript.Echo "FILE OPEN ERROR: The file does not exist" & vbCrLf
+            objStdErr.Write ("ERROR: Windows-scripting failed. Cannot open " & inFile & ". The file does not exist. ")
             ' Close Word
             .Quit
@@ -79,5 +117,4 @@
         'outHTML = objFSO.BuildPath( objFile.ParentFolder, _
         ' objFSO.GetBaseName( objFile ) & ".html" )
-        'outHTML = outFile
 
         ' Open the Word document
main/trunk/greenstone2/bin/script/gsConvert.pl
--- main/trunk/greenstone2/bin/script/gsConvert.pl (r24164)
+++ main/trunk/greenstone2/bin/script/gsConvert.pl (r24166)
@@ -786,15 +786,25 @@
     my ($input_filename, $output_filestem) = @_;
 
-    my $vbScript = &util::filename_cat($ENV{'GSDLHOME'}, "bin",
-                                       $ENV{'GSDLOS'}, "word2html");
+    # build up the path to the doc-to-html conversion tool we're going to use
+    my $vbScript = &util::filename_cat($ENV{'GSDLHOME'}, "bin", $ENV{'GSDLOS'});
 
     if ($ENV{'GSDLOS'} =~ m/^windows$/i) {
-        # if windows scripting with docx input, use new VBscript
-        if($input_filename =~ m/docx$/i) {
-            $vbScript = "docx2html.vbs";
-        } else { # use the usual word2html conversion VB executable
-            $vbScript = "word2html";
-        }
-    }
+        # if windows scripting with docx input, use new VBscript to get the local Word install (if
+        # any) to do the conversion, since docX can't be processed by word2html's windows_scripting
+
+        if($input_filename =~ m/docx$/i) { # need to use full path to docx2html script,
+            # else script launch fails when there are error msgs
+            $vbScript = &util::filename_cat($vbScript, "docx2html.vbs");
+            $vbScript = "CScript //Nologo \"$vbScript\""; # launche with CScript for error output in STDERR
+            # //Nologo flag avoids Microsoft's opening/logo msgs
+        }
+        else { # old doc versions. use the usual VB executable word2html for the
+            # conversion. Doesn't need full path, since bin\windows is on PATH
+            $vbScript = "word2html"; #$vbScript = "\"".&util::filename_cat($vbScript, "word2html")."\"";
+        }
+    }
+    else { # not windows
+        $vbScript = "\"".&util::filename_cat($vbScript, "word2html")."\"";
+    }
 
     if (-e "$output_filestem.html") {
@@ -812,7 +822,9 @@
 
     # redirecting STDERR
-    $cmd .= " 2> \"$output_filestem.err\""
-        if ($ENV {'GSDLOS'} !~ m/^windows$/i || $is_winnt_2000);
-
+
+    $cmd .= " 2> \"$output_filestem.err\""
+        if ($ENV {'GSDLOS'} !~ m/^windows$/i || $is_winnt_2000);
+    #print STDERR "@@@@@@@@@ cmd=$cmd\n";
+
     # execute the command
     $!=0;
@@ -822,5 +834,5 @@
     if (-s "$output_filestem.err") {
         open (ERRFILE, "<$output_filestem.err");
-
+
         my $write_to_fail_log=0;
         if ($faillogfile ne "" && defined(open(FAILLOG,">>$faillogfile")))
main/trunk/greenstone2/perllib/plugins/WordPlugin.pm
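--- main/trunk/greenstone2/perllib/plugins/WordPlugin.pm (r23751)
+++ main/trunk/greenstone2/perllib/plugins/WordPlugin.pm (r24166)
@@ -239,5 +239,5 @@
         return q^(?i)\.(doc|dot|docx|odt|wpd)$^;
     }
-    return q^(?i)\.(doc|dot)$^;
+    return q^(?i)\.(docx?|dot)$^;
 }
 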