Changeset 24166 for main/trunk/binaries

Timestamp: 16.06.2011 19:16:15 (8 years ago)
Author: ak19
Message:

2nd and tentatively final set of changes to get the new docx2html functionality to work on docx files. The changes have to do with error reporting when Word is not installed/can't be found/can't be instantiated, when the script is launched with the wrong number of args, and when the input file does not exist. WordPlugin now has docx as part of the default process_expression (even when OO is not installed).

Files:
1 modified

  • main/trunk/binaries/windows/bin/docx2html.vbs

    r24164 r24166 (first column: line number in r24164; second column: line number in r24166; a line with only the first number was removed, a line with only the second was added)

       1    1  Option Explicit
       2    2
       3       'args = WScript.Arguments.Count
       4       'If args < 2 then
       5       '  WScript.Echo usage: args.vbs argument [input docx path] [output html path]
       6       '  WScript.Quit
       7       'end If
       8       'WScript.Echo WScript.Arguments.Item(0)
       9       'WScript.Echo WScript.Arguments.Item(1)
            3  ' http://www.robvanderwoude.com/vbstech_automation_word.php
            4  ' http://www.nilpo.com/2008/06/windows-scripting/reading-word-documents-in-wsh/ - for grabbing just the text (cleaned of Word mark-up) from a doc(x)
            5  ' http://msdn.microsoft.com/en-us/library/3ca8tfek%28v=VS.85%29.aspx - VBScript Functions (CreateObject etc)
      10    6
      11       Doc2HTML WScript.Arguments.Item(0),WScript.Arguments.Item(1)
      12       ' In terminal, run as: > docx2html.vbs C:\fullpath\to\input.docx C:\fullpath\to\output.html
            7  ' Error Handling:
            8  ' http://blogs.msdn.com/b/ericlippert/archive/2004/08/19/error-handling-in-vbscript-part-one.aspx
            9  ' http://msdn.microsoft.com/en-us/library/53f3k80h%28v=VS.85%29.aspx
      13   10
      14   11
      15       ' http://www.robvanderwoude.com/vbstech_automation_word.php
      16       ' http://www.nilpo.com/2008/06/windows-scripting/reading-word-documents-in-wsh/
           12  ' To Do:
           13  ' +1. error output on bad input to this file. And commit.
           14  ' +1b. Active X error msg when trying to convert normal *.doc: only when windows scripting is on and Word not installed.
           15  ' +1c. Make docx accepted by default as well. Changed WordPlugin.
           16  ' 2. Try converting from other office types (xlsx, pptx) to html. They may use other constants for conversion filetypes
           17  ' 3. gsConvert.pl's any_to_txt can be implemented for docx by getting all the text contents. Use a separate subroutine for this. Or use wdFormatUnicodeText as outputformat.
           18  ' 4. Try out this script on Windows 7 to see whether WSH is active by default, as it is on XP and Vista.
           19  ' 5. What kind of error occurs if any when user tries to convert docx on a machine with an old version of Word (pre-docx/pre-Word 2007)?
           20  ' 6. Ask Dr Bainbridge whether this script can or shouldn't replace word2html, since this launches all version of word as well I think.
           21
           22
           23  ' gsConvert.pl expects error output to go to the console's STDERR
           24  ' for which we need to launch this vbs with "CScript //Nologo" '(cannot use WScript if using StdErr
           25  ' and //Nologo is needed to repress Microsoft logo text output which messes up error reporting)
           26  ' http://www.devguru.com/technologies/wsh/quickref/wscript_StdErr.html
           27  Dim objStdErr, args
           28  Set objStdErr = WScript.StdErr
           29
           30  args = WScript.Arguments.Count
           31  If args < 2 then
           32    'WScript.Echo Usage: args.vbs argument [input docx path] [output html path]
           33    objStdErr.Write ("ERROR. Usage: CScript //Nologo " & WScript.ScriptName & " [input office doc path] [output html path]" & vbCrLf)
           34    WScript.Quit
           35  end If
           36
           37  ' Now run the conversion subroutine
           38  Doc2HTML WScript.Arguments.Item(0),WScript.Arguments.Item(1)
           39      ' In terminal, run as: > docx2html.vbs C:\fullpath\to\input.docx C:\fullpath\to\output.html
           40      ' In terminal, run as: > CScript //Nologo docx2html.vbs C:\fullpath\to\input.docx C:\fullpath\to\output.html
           41      ' if you want echoed error output to go to console (instead of creating a popup) and to avoid 2 lines of MS logo.
           42      ' Will be using WScript.StdErr object to make error output go to stderr of CScript console (can't launch with WScript).
           43      ' http://www.devguru.com/technologies/wsh/quickref/wscript_StdErr.html
           44
      17   45
      18   46  Sub Doc2HTML( inFile, outHTML )
       …    …
      55   83      Const wdFormatXMLTemplateMacroEnabled     = 15
      56   84      Const wdFormatXPS                         = 18
      57
           85
      58   86      ' Create a File System object
      59   87      Set objFSO = CreateObject( "Scripting.FileSystemObject" )
      60   88
      61           ' Create a Word object
           89      ' Create a Word object. Exit with error msg if not possible (such as when Word is not installed)
           90      On Error Resume Next
      62   91      Set objWord = CreateObject( "Word.Application" )
           92      If CStr(Err.Number) = 429 Then  ' 429 is the error code for "ActiveX component can't create object"
           93                                      ' http://msdn.microsoft.com/en-us/library/xe43cc8d%28v=VS.85%29.aspx
           94          'WScript.Echo "Microsoft Word cannot be found -- document conversion cannot take place. Error #" & CStr(Err.Number) & ": " & Err.Description & "." & vbCrLf
           95          objStdErr.Write ("ERROR: Windows-scripting failed. Document conversion cannot take place:" & vbCrLf)
           96          objStdErr.Write ("   Microsoft Word cannot be found or cannot be launched. (Error #" & CStr(Err.Number) & ": " & Err.Description & "). " & vbCrLf)
           97          objStdErr.Write ("   For converting the latest Office documents, install OpenOffice and Greenstone's OpenOffice extension. (Turn it on and turn off windows-scripting.)" & vbCrLf)
           98          Exit Sub
           99      End If
      63  100
      64  101      With objWord
       …    …
      71  108              strFile = objFile.Path
      72  109          Else
      73                   WScript.Echo "FILE OPEN ERROR: The file does not exist" & vbCrLf
          110              'WScript.Echo "FILE OPEN ERROR: The file does not exist" & vbCrLf
          111              objStdErr.Write ("ERROR: Windows-scripting failed. Cannot open " & inFile & ". The file does not exist. ")
      74  112              ' Close Word
      75  113              .Quit
       …    …
      79  117          'outHTML = objFSO.BuildPath( objFile.ParentFolder, _
      80  118          '          objFSO.GetBaseName( objFile ) & ".html" )
      81           'outHTML = outFile
      82  119
      83  120          ' Open the Word document