[ILUG] Stripping html in mutt
ilug at tracking.wunsch.org
Wed Sep 5 16:23:23 IST 2001
On Tue, 04-Sep-2001 at 16:42:54 -0700, Rick Moen wrote:
> Incidentally, one good resource for converters, including the in-line,
> MSWord-to-something-reasonable kind, is this site:
Very handy for those pesky Word document senders.
In case anybody's interested, I have the following in my mailcap:
text/html; html2text; copiousoutput
text/rtf; rtf2text %s; copiousoutput
application/rtf; rtf2text %s; copiousoutput
application/msword; word2text %s; copiousoutput
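For these entries to be rendered inline in the pager, Mutt also has to be told to auto-view the types. What follows is a minimal sketch of the matching muttrc lines, not something taken from the original post: the file location (~/.muttrc) and the choice to prefer text/plain over text/html in multipart/alternative messages are assumptions.

```
# ~/.muttrc (assumed location)
# Render these types inline through their copiousoutput mailcap entries.
auto_view text/html text/rtf application/rtf application/msword
# Assumption: prefer the plain-text part when a message offers both.
alternative_order text/plain text/html
```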
The script html2text contains the following:
/usr/bin/w3m -dump -T text/html | perl -0777 -pe 's/\n\s*\n/\n\n/gs; s/\xa0/ /gs;'
I find that w3m produces nicer text output than Lynx does, especially when
dealing with tables.
The script rtf2text is from the Perl package RTF::Parser. The script
word2text contains the following:
wvWare -x /usr/local/share/wv/wvHtml.xml "$1" 2>/dev/null | perl -0777 -p \
-e 's|<img .*?>||gs;' | html2text
It uses the wvWare package referenced above, and my simple html2text script.
The end result of all this is that I can read just about anything people
send me without having to leave the comfort of Mutt.
... A conclusion is simply the place where you got tired of thinking.