[ILUG] Re: meta http-equiv useless??
Brian Foster
blf at blf.utvinternet.ie
Sun Aug 21 14:09:58 IST 2005
| Date: Sun, 21 Aug 2005 01:13:36 -0500
| From: greg wm <ilug at nvpf.org>
|[ ... ]
| > wget -ENKkrl19 -nH -w2 -owget.log http://nonviolentpeaceforce.org
|
| my locale is en_IE.UTF-8, so why did wget save in latin-1 format?
whoa! slow down here ....
I suspect the answer is “because wget(1) does not alter
the charset (used for the page's contents)”. I suspect
this for three reasons:
◆ no evidence — e.g., a diff(1) listing — has been
posted that shows wget did so.
◆ the wget(1) man page fails to mention any such change.
◆ no such change was observed in an experiment (below).
the page was served up as Latin1 (née ISO-8859-1),
and that was what wget saved. the only(?) changes
wget made were to the URLs.
my experiment (I use a UTF-8 locale):
using Opera, I saved a copy of what the URL
http://nonviolentpeaceforce.org/spanish/welcome.asp
was served as. its meta http-equiv was iso-8859-1, and
it used a mixture of &...; entities and literal Latin1
characters. and the page (file) really was Latin1.
everything was consistent, and as expected and reported.
then I used the above wget options (sans -r) to fetch
that same URL. result? the `.orig' file was _identical_,
and the only apparent changes in the `.html' file were the
URLs (not exhaustively checked). more to the point, the
literal Latin1 characters were _identical_.
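
for anyone wanting to repeat the byte-level check without eyeballing
a hex dump, here is a minimal sketch in Python 3. it is only an
illustration: the function names are made up, the classification is a
heuristic, and none of it reflects how wget works internally. it
guesses whether a file's bytes are ASCII, UTF-8, or some 8-bit charset
such as Latin1, and compares only the non-ASCII bytes of two copies,
so rewritten (ASCII) URLs do not count as a difference.

```python
def classify_encoding(data: bytes) -> str:
    """Heuristically guess how the text in `data` is encoded."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so some 8-bit charset; Latin1 is one candidate.
        return "8-bit (e.g. latin-1)"


def non_ascii_bytes(data: bytes) -> bytes:
    """Return only the bytes >= 0x80, in order, dropping all ASCII."""
    return bytes(b for b in data if b >= 0x80)


def same_literal_chars(a: bytes, b: bytes) -> bool:
    """True if both byte strings carry the same non-ASCII bytes."""
    return non_ascii_bytes(a) == non_ascii_bytes(b)
```

usage would be along the lines of reading both saved files with
open(path, "rb").read() and passing the results to
same_literal_chars(); a True result together with an "8-bit"
classification for both files is what the experiment above reported.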
the conclusion? wget does not alter the charset used for
the page's contents. hence, lacking any diff(1) listing
to the contrary, I'll claim this did _not_ happen in the
original situation. that is, any theory that wget changed
the charset/encoding of the page's contents is incorrect.
(I am open to correction, provided evidence is supplied.)
| the wget manual page mentions nothing at all about character sets.
broadly, yes, it does not. why should it? presuming
my experiment above is equivalent to what was done,
wget does not alter the charset of the page contents.
the Apache/server “default” charset answer is interesting.
an authoritative override is not what I call a default?!
FWIW, the http-equiv _is_ used when / useful for viewing
local HTML files (i.e., not served up by a server).
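
the precedence at issue can be sketched in Python 3 as follows. this
is a simplification, not what any particular browser implements:
effective_charset is a hypothetical helper, the regular expression
only handles the common form of the meta tag, and BOM sniffing and
other fallbacks are ignored. it shows a charset in the HTTP
Content-Type header winning over the meta http-equiv, and the meta
tag being consulted only when the header gives no charset, as with a
local file.

```python
import re
from email.message import Message
from typing import Optional

# Matches the usual <meta http-equiv="Content-Type" ... charset=...> form.
META_CHARSET = re.compile(
    rb'<meta[^>]+http-equiv=["\']?content-type["\']?[^>]*'
    rb'charset=([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def effective_charset(content_type: Optional[str], html: bytes) -> Optional[str]:
    """Return the charset a reader would use, or None if none is declared.

    `content_type` is the HTTP Content-Type header value, or None when
    the page is read from a local file with no server involved.
    """
    if content_type is not None:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()  # lowercased, or None
        if charset:
            return charset  # the server's header overrides the document
    match = META_CHARSET.search(html)
    if match:
        return match.group(1).decode("ascii").lower()
    return None
```

with a page declaring iso-8859-1 in its meta tag, a header of
"text/html; charset=UTF-8" yields "utf-8", while passing None for the
header (the local-file case) yields "iso-8859-1".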
cheers!
-blf-
--
Experienced (20+ yrs) kernel/software Eng: | Brian Foster Montpellier,
• Unix, embedded, &tc; • Linux; • doc; | blf at utvinternet.ie FRANCE
• IDL, automated testing, process, &tc. | Stop E$$o (ExxonMobile)!
Résumé (CV) http://www.blf.utvinternet.ie | http://www.stopesso.com