source: main/trunk/greenstone2/build-src/packages/wget/README@ 32922

Last change on this file since 32922 was 32922, checked in by ak19, 3 years ago

Updating wget README to document changes made in revision 32908

  • Property svn:keywords set to Author Date Id Revision
File size: 15.6 KB
wget - web site mirroring software

This directory contains a version of the GNU wget program that was written by
Hrvoje Niksic and others and distributed under the GNU General Public Licence.

The GNU wget homepage is http://www.cg.tuwien.ac.at/~prikryl/wget.html

1) A couple of very small changes were made to this package by Stefan Boddie
(sjboddie@cs.waikato.ac.nz). The USE_STDARG preprocessor definition was
added to the VC++ makefile as it appeared to need it to compile on either
VC++ 4.2 or VC++ 6.0. A change was also made to prevent backslashes from
mistakenly being converted to "@5c" on Windows.

2) 2003/11/27 - sjboddie@cs.waikato.ac.nz made a few more small changes to fix
an h_errno bug that prevented the code from compiling on the latest versions of GCC.

3) 2004/06/30 - kjdon@cs.waikato.ac.nz replaced version 1.5.3 with version 1.9.
It compiled as is under Windows XP using VC++ 6.0, and under Linux using GCC 3.2.3
and GCC 2.95.3, so I haven't made any changes.
Under Windows, I haven't compiled it with OpenSSL, so it doesn't support
https://. OpenSSL comes with a warning about it being illegal in some countries
to distribute cryptographic technology, so I haven't used it. Look at
wget-1.9/windows/README for Windows compiling instructions if you want to add
it in.

4) 2008/07/03 - oranfry@cs.waikato.ac.nz replaced version 1.9 with version 1.11.4.
Version 1.9 didn't compile statically with the new libraries that come with
Fedora, hence the upgrade. Had to apply the patch at
http://marc.info/?l=wget&m=115131792408685&w=2 to get it to compile under
Windows (VC++ 6.0), and those changes are in the tar.gz file in the repository.
Again, no SSL in the Windows wget. The packages/configure and packages/Makefile
files perform the same operations on wget as before; only the directory name
has changed, from wget/wget-1.9 to wget/wget-1.11.4.


5) 2013/01/09 - davidb@cs.waikato.ac.nz made a minor change to src/Makefile.in so the code compiles under the MinGW cross-compiler running on Linux:
diff -r wget-1.13.4/src/Makefile.in wget-1.13.4.MINGW/src/Makefile.in
1136a1137,1143
>
> ifeq ($(GSDLOS),windows)
> # Compiling wget with MinGW
> CFLAGS += -DIN6_ARE_ADDR_EQUAL=IN6_ADDR_EQUAL
> LIBS += -lws2_32
> endif
>

These newly added lines should come after the line:
> CLEANFILES = *~ *.bak core core.[0-9]* build_info.c version.c

and before the rule for the 'all' target:
> all: config.h
> $(MAKE) $(AM_MAKEFLAGS) all-am
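
Since change 5 has to be re-applied by hand to every new wget version, a quick check that it actually made it into the re-tarred source can help. The following is only a sketch: the function name has_mingw_patch and the default path are made up for illustration and are not part of the Greenstone build scripts.

```shell
# Sketch: report whether the MinGW block from change 5 is present in a Makefile.in.
# has_mingw_patch and its default path are hypothetical names for illustration.
has_mingw_patch() {
    makefile_in="${1:-src/Makefile.in}"
    # Both added lines must be present; -F treats the patterns as fixed strings
    # and -e lets a pattern start with a dash.
    grep -qF -e '-DIN6_ARE_ADDR_EQUAL=IN6_ADDR_EQUAL' "$makefile_in" &&
        grep -qF -e '-lws2_32' "$makefile_in"
}
```

Usage would be along the lines of: has_mingw_patch wget-1.17.1/src/Makefile.in && echo "change 5 applied"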


6) 2013/01/29 - davidb@cs.waikato.ac.nz added some #ifdef __ANDROID__ blocks into src/utils.c to cope with the fact that this OS lacks localeconv()

diff wget-1.13.4-orig/src/utils.c wget-1.13.4-gs/src/utils.c

1466a1467,1470
> #ifdef __ANDROID__
> cached_sep =",";
> cached_grouping = "\x03";
> #else
1489a1494
> #endif

[Now the #endif comes around line 1441 or 1459, immediately before:
> initialized = true;
> }
> *sep = cached_sep;
> *grouping = cached_grouping;]


7) 2014/10/13 - ak19
Moved to wget version 1.15.
Only the changes numbered 5 and 6 above have been ported into it, following Dr Bainbridge's instructions, as they were both changes made to the previous wget version Greenstone used (1.13.4).
The wget tar file name (wget-1.15-gs) now indicates the version number and that it has been modified for Greenstone, so the Makefile and configure file in build-src/packages/ have been updated to reflect this.

8) 2017/07/27 - ak19 (anupama.krishnan@waikato.ac.nz) - still using wget version 1.15, but now compiling wget with OpenSSL support. Wget needs SSL support in order to access pages over HTTPS. In future, the web will be using HTTPS.

Downloaded OpenSSL version 1.0.2l from https://www.openssl.org/source/
We now compile OpenSSL during the configuration phase, since wget needs it to exist during its own configure phase. We build OpenSSL statically, by setting the no-shared flag. Because OpenSSL is built statically, we don't need to ship its lib, bin and include folders with GS: wget is compiled and linked against our static OpenSSL, producing a wget binary that is independent of OpenSSL. The built OpenSSL now just gets put into gs2build/build-src/packages/openssl, along with the other OpenSSL subfolders generated when compiling OpenSSL.

When configuring wget, we build it against our OpenSSL, and make and make install proceed as normal. Refer to gs2build/build-src/packages/configure.
We weren't compiling wget statically before either, so we're still not doing so. But if that becomes necessary in future, see the section TO COMPILE WGET STATICALLY further below.

To compile wget (statically or not) with OpenSSL, a helpful page was
https://stackoverflow.com/questions/9817337/compiling-wget-with-static-linking-self-compiled-openssl-library-linking-issu
Note, however, that since CPPFLAGS and LDFLAGS are now set to point to our OpenSSL during the configure stage, the make command needn't set them again, contrary to the instruction for make on the stackoverflow page. So we just need to do the usual make and make install once configure has been run against OpenSSL.
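
The two stages described above could be sketched roughly as follows. This is not a copy of build-src/packages/configure: the function names, the run/DRY_RUN helper, the default value of PACKAGES and the exact wget configure options are assumptions made for illustration. Only the no-shared flag and the idea of setting CPPFLAGS and LDFLAGS at configure time come from the notes above.

```shell
# Sketch of the two build stages. Paths and function names are assumptions.
PACKAGES="${PACKAGES:-$PWD/build-src/packages}"

# run: print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stage 1: static OpenSSL (no-shared), installed under $PACKAGES/openssl.
# Expects to be called from inside the unpacked OpenSSL source directory.
build_openssl_static() {
    run ./config no-shared --prefix="$PACKAGES/openssl" &&
        run make &&
        run make install
}

# Stage 2: wget configured against that OpenSSL, then the usual make steps.
# CPPFLAGS and LDFLAGS are only set for configure, not again for make.
# Expects to be called from inside the unpacked wget source directory.
build_wget_with_openssl() {
    run env CPPFLAGS="-I$PACKAGES/openssl/include" \
            LDFLAGS="-L$PACKAGES/openssl/lib" \
        ./configure --with-ssl=openssl --prefix="$PACKAGES/wget" &&
        run make &&
        run make install
}
```

Setting DRY_RUN=1 before calling either function prints the commands instead of running them, which is a cheap way to review the flags first.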

The existing version of wget, 1.15, works with HTTPS when compiled against OpenSSL. However, this build of the binary needs to be run with the --no-check-certificate flag in order to access https pages without a security certificate.

e.g. ./wget --no-check-certificate http://englishhistory.net/tudor/citizens/

The system wget on Ubuntu 16.04 is version 1.17.1 and does not require this flag. The wget 1.11.4 we have on Windows so far, compiled without SSL, does not work on https pages.
* http://nebm.ist.utl.pt/~glopes/wget/
Prebuilt Windows wget binaries (for 32 and 64 bit) version 1.11.4, described as including SSL support
* http://gnuwin32.sourceforge.net/packages/wget.htm
GNU's prebuilt Windows binaries of wget v 1.11.4. May not have been built with SSL support.

However, the wget 1.11.4 Windows binary from http://nebm.ist.utl.pt/~glopes/wget/ does not work on HTTPS pages either, even with the --no-check-certificate flag included in the wget command. So we'll need to upgrade the wget binary on Windows to a later version too. Pre-compiled Windows binaries are available for versions newer than 1.11.4 (see the section WINDOWS WGET BINARIES WITH OPENSSL SUPPORT). The Windows wget 1.17.1 binary does not require the --no-check-certificate flag, just like the system wget 1.17.1 on Ubuntu 16.04, but unlike the version we compile ourselves, which does require the flag. Ideally, we'd like Unix and Windows to behave the same way. However, no matter which version of wget we compile on Unix (1.15, 1.17 or 1.19), and no matter which version of OpenSSL (1.0.2x or 1.1.0x) we build it against, the wget binary we generate on Unix always requires --no-check-certificate. So this will indeed differ from the wget 1.17+ binary we've downloaded for Windows.
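
When comparing the different binaries it helps to see at a glance which version a given wget is and whether it reports SSL support. The helper below is a sketch only. It assumes that the first line of wget --version looks like "GNU Wget 1.17.1 built on linux-gnu." and that SSL-enabled builds list a "+ssl/..." or "+https" feature; older versions such as 1.11.4 may not print a feature list at all, in which case it reports ssl=no regardless. The name wget_summary is made up.

```shell
# Sketch: summarise `wget --version` output read from stdin.
# wget_summary is a hypothetical helper; the output layout it relies on
# (version as third word of line 1, "+ssl"/"+https" feature tags) is an assumption.
wget_summary() {
    awk '
        NR == 1          { version = $3 }
        /\+ssl|\+https/  { ssl = "yes" }
        END { printf "%s ssl=%s\n", version, (ssl == "" ? "no" : ssl) }
    '
}
```

Usage would be: ./wget --version | wget_summary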


9) We're now shifting to wget-1.17.1, which is the version installed on Ubuntu 16.04 and for which a Windows binary compiled with OpenSSL is available. The 32 bit version of the Windows wget 1.17.1 binary has been downloaded from https://eternallybored.org/misc/wget/; instructions are in the section WINDOWS WGET BINARIES WITH OPENSSL SUPPORT.

Both the Linux system version and the Windows binary work on https URLs without needing the --no-check-certificate flag. However, the Linux version we compile ourselves still needs this flag; see PROBLEM AND SOLUTION WITH WGETRC.

This way our Perl code can launch wget as before, without always passing that additional flag. Hopefully the output in the Download pane will be the same, so that the download parsing still works.


10) 2019/03/15
Dr Bainbridge discovered and fixed an issue between wget and OpenSSL that appeared on an svn checkout on CentOS. When wget was being compiled, it expected a folder called "lib64" to exist in the OpenSSL that had been compiled. But the folder created was called "lib". Dr Bainbridge found by experiment that copying the OpenSSL "lib" folder across as "lib64" got wget to compile successfully. Upon further investigation he found that setting OPENSSL_CFLAGS and OPENSSL_LIBS conflicted with passing the --with-libssl-prefix="$PACKAGES/openssl" flag.

The final changes that Dr Bainbridge made to get OpenSSL and wget to compile successfully were:
1. Remove the flag: --with-libssl-prefix="$PACKAGES/openssl"
2. Change
 OPENSSL_LIBS="-L$PACKAGES/openssl/lib -lssl -lcrypto"
to
 OPENSSL_LIBS="-L$PACKAGES/openssl/lib -lssl -lcrypto -lz -ldl"
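
Put together, the configure environment after these two changes might look like the sketch below. The placeholder default for PACKAGES, the function name wget_configure_cmd and the --prefix value are assumptions; the OPENSSL_LIBS value is the one given above, and the OPENSSL_CFLAGS value follows the form used in the configure command further down this README.

```shell
# Sketch of the configure environment after the changes in (10).
# PACKAGES defaults to a placeholder path here; that default is an assumption.
PACKAGES="${PACKAGES:-/path/to/gs2build/build-src/packages}"

OPENSSL_CFLAGS="-I$PACKAGES/openssl/include"
# -lz and -ldl are the additions that fixed the CentOS build.
OPENSSL_LIBS="-L$PACKAGES/openssl/lib -lssl -lcrypto -lz -ldl"
export OPENSSL_CFLAGS OPENSSL_LIBS

# Print the configure command rather than running it, so it can be reviewed first.
# There is deliberately no --with-libssl-prefix: it conflicted with the two
# variables above.
wget_configure_cmd() {
    echo "./configure --with-ssl=openssl --prefix=$PACKAGES/wget"
}
```

Running wget_configure_cmd only prints the command line; the real invocation lives in build-src/packages/configure.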



LINUX CHANGES:
- Grabbed wget-1.17.1.tar.gz from https://ftp.gnu.org/gnu/wget/
- Made the modifications in steps 5 and 6 above
- To be compatible with the version of the pod2man doc generation tool installed on the LSB release-kit machines, we can't pass the new --utf8 flag to it. Changed doc/Makefile.in and doc/Makefile.am from:
 $(POD2MAN) --center="GNU Wget" --release="GNU Wget @VERSION@" --utf8 $? > $@
to
 $(POD2MAN) --center="GNU Wget" --release="GNU Wget @VERSION@" $? > $@
- Then re-tarred as wget-1.17.1-gs.tar.gz, with corresponding changes in build-src/packages' configure, Makefile.in and Makefile
- The configure step has now changed for wget v 1.17.1. Refer to build-src/packages/configure

The configure step requires setting
* --with-libssl-prefix to $bindir/openssl, so the wget build process can find OpenSSL's include and lib folders. (Whereas --with-ssl indicates what type of SSL we're using, which is OpenSSL in our case.) Note that this flag was later removed again, see (10) above.
* OPENSSL_CFLAGS and OPENSSL_LIBS. Configuring had initially failed, reporting that these need to be set if we don't want to use whatever pkg-config may find. To set LIBS variables, use one of these forms: LIBS="-L/path/to/lib" or LIBS="/path/to/lib/lib.a" or LIBS="-lssl". To combine all three, separate them with spaces. See http://trac.greenstone.org/changeset/30948 and https://github.com/tatsuhiro-t/spdylay/issues/43


________________________________
 FURTHER INFO
________________________________

SHIFTING TO WGET 1.19
Wget 1.19 doesn't yet compile on Mac, but it works on Linux.
Shifting to wget-1.19 requires only steps 5 and 6 to be applied to get wget-1.19-gs. This new version of wget takes care of the pod2man issue on our LSB release-kit environment without requiring the additional edits seen in (9) above (under LINUX CHANGES).

PROBLEM AND SOLUTION WITH WGETRC
The certificate check for https URLs can be turned off in the wgetrc conf file, as explained here:
https://superuser.com/questions/508696/wget-without-no-check-certificate
The location of wgetrc is described at https://www.gnu.org/software/wget/manual/html_node/Wgetrc-Location.html (for now we set the env var WGETRC in $GSDLHOME/setup.bash, which points to $GSDLHOME/bin/linux). With that in place the locally built Linux wget 1.17.1 works, and so would v 1.15.
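
A minimal sketch of that workaround follows. The check_certificate setting is the wgetrc counterpart of --no-check-certificate; the file name "wgetrc" inside bin/linux and the temporary-directory fallback for GSDLHOME are assumptions made so the sketch can be tried outside a Greenstone installation. Note that it overwrites any wgetrc already at that location.

```shell
# Sketch: create a wgetrc that turns the certificate check off and point WGETRC at it.
# GSDLHOME falls back to a temp dir here, and the file name "wgetrc" is an assumption.
GSDLHOME="${GSDLHOME:-$(mktemp -d)}"
mkdir -p "$GSDLHOME/bin/linux"

cat > "$GSDLHOME/bin/linux/wgetrc" <<'EOF'
# Same effect as passing --no-check-certificate on every call
check_certificate = off
EOF

# This export is the part that would live in setup.bash
export WGETRC="$GSDLHOME/bin/linux/wgetrc"
```

With WGETRC exported like this, a plain ./wget call on an https URL should behave as if --no-check-certificate had been given.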

However, the Linux system wget and the Windows wget binary v 1.17.1 do not set this variable in their wgetrc files, so how is the certificate check off for them by default? The Windows wget binary v 1.15 does require the --no-check-certificate flag, but v 1.17.1 works out of the box. So why does the 1.17.1 version I built on Linux behave like the 1.15 Windows binary (and the 1.15 built on Linux), which required the flag?


WINDOWS WGET BINARIES WITH OPENSSL SUPPORT
Windows binaries for wget 1.17.1 and other versions, built with OpenSSL support, are at:
https://eternallybored.org/misc/wget/

I downloaded the 32 bit version "wget-1.17.1-win32.zip" from there (at https://eternallybored.org/misc/wget/releases/old/wget-1.17.1-win32.zip)

Neither unzipping the archive nor copying the zip's wget.exe out directly would succeed; both produced a Windows error message.
To extract successfully: use 7zip to view the contents of the zip, rename wget.exe to wget.not, then copy out the wget.not file and rename it back.

This version of wget worked successfully on 64 bit Windows to retrieve the https page of the Tudors site, without the --no-check-certificate flag.

# http://osxdaily.com/2012/05/22/install-wget-mac-os-x/
# https://lists.gnu.org/archive/html/bug-wget/2014-12/msg00104.html

Alternatives for Windows:
Source:
- https://soliloquyforthefallen.net/?p=238
- https://github.com/wertarbyte/wget/tree/master/windows (README at end)
Binaries:
- https://stackoverflow.com/questions/14344921/wget-for-windows-7-trusted-source


COMBINING GREENSTONE'S GPL WITH OpenSSL LICENSES
OpenSSL is under a dual license, see https://www.openssl.org/source/license.html
The GPL and the OpenSSL licenses are incompatible, see https://www.gnu.org/licenses/license-list.en.html#OpenSSL
but they can be combined by way of an exception, as described here: https://opensource.stackexchange.com/questions/2233/gpl-v3-with-openssl-exception?rq=1
which is what we've done for GS2 and GS3.


TO COMPILE WGET STATICALLY
First refer to https://stackoverflow.com/questions/9817337/compiling-wget-with-static-linking-self-compiled-openssl-library-linking-issu

If compiling wget statically, append -static to the LDFLAGS that are prepended to wget's configure command. Further, the gcc command that gets run needs to have -lpthread in its library listing at the end. The order of the libraries listed also needs to change for static compilation to succeed:
-lpcre -lpthread -ldl <remaining -llibs>
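
As a rough sketch, the flags for a static attempt could be assembled like this. PACKAGES and REMAINING_LIBS are placeholders, not values taken from the real build, and whether passing LIBS to configure is enough to get this order into the final gcc line is not confirmed by the notes above; the gcc command may still need editing by hand.

```shell
# Sketch: assemble the flags for a static wget build.
# PACKAGES and REMAINING_LIBS are placeholders, not values from the real build.
PACKAGES="${PACKAGES:-/path/to/gs2build/build-src/packages}"
REMAINING_LIBS="${REMAINING_LIBS:--lssl -lcrypto -lz}"

# -static is appended to the LDFLAGS passed to wget's configure
STATIC_LDFLAGS="-L$PACKAGES/openssl/lib -static"

# Library order that static linking needs: pcre, pthread, dl, then the rest
STATIC_LIBS="-lpcre -lpthread -ldl $REMAINING_LIBS"

# Only print the resulting command line; nothing is configured or compiled here.
echo "LDFLAGS=\"$STATIC_LDFLAGS\" LIBS=\"$STATIC_LIBS\" ./configure --with-ssl=openssl"
```

The printed line can then be compared against what build-src/packages/configure currently passes before trying it for real.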

However, warnings appear when compiling wget statically. It does not make sense to build some programs statically, since the static binary can end up tied to the local context of the build machine (e.g. the DNS-related warnings seen when compiling a previous component statically). Linking against certain libraries to create a static binary may not make sense either: -ldl, the dynamic loading library, is one example. This seems to imply that wget makes more sense as a dynamically linked binary than as a static one.


TO COMPILE WGET WITH OPENSSL v 1.1.0f
At present, we're compiling Wget 1.17 with OpenSSL v1.0.2l.

To compile with OpenSSL 1.1.0x, you'll need
* Wget v. 1.19
* -lpthread prepended to $LIBS.

Note: Also need to update the distclean target in build-src/packages/Makefile.in to remove the extra folder "share" and file "openssl.cnf.dist" generated when building OpenSSL v 1.1.0f.

So the wget configure command will look like:

LIBS="-lpthread $LIBS" OPENSSL_CFLAGS="-I/Scratch/ak19/gs3-svn-13July2017/gs2build/build-src/packages/openssl/include" OPENSSL_LIBS="-L/Scratch/ak19/gs3-svn-13July2017/gs2build/build-src/packages/openssl/lib -lssl -lcrypto" ./configure --prefix=/Scratch/ak19/gs3-svn-13July2017/gs2build/build-src/packages/wget --with-ssl=openssl --with-openssl=auto --with-libssl-prefix="/Scratch/ak19/gs3-svn-13July2017/gs2build/build-src/packages/openssl" --bindir="/Scratch/ak19/gs3-svn-13July2017/gs2build/bin/linux" --disable-nls


TO DO:
+ If I delete the gs2build/bin/linux/openssl folder, the built wget still works fine without it. Dr Bainbridge confirmed that this is because wget is built against OpenSSL's static libraries and therefore no longer needs the OpenSSL stuff we build and have been putting into gs2build/bin/linux/openssl. So we no longer need to put the built OpenSSL there.

- Add a tick box in GLI > File > Preferences for turning on No Check Certificate over https. This should then replace our wgetrc file and the env variable set in GS2's setup.bash. By default leave this box unticked, so downloading won't work over https. Need to store this user setting in GLI's config.xml. Ensure that when a download over https fails, it results in an error.

- If the downloading error count > 0:
At the bottom of GLI > Download Pane > View Log > download error log - when we get errors:
You have the option of adjusting your proxy server settings (go through the Configure Proxy button)
For https certificate authentication, you have the option of turning off checking the certificate in the Connections tab of File > Preferences

Check the warnings on Windows. If it's no longer always warning, then do the stuff above on warning too, not just on error.