[ILUG] kmail and character sets
Brian Foster
blf at blf.utvinternet.ie
Fri Apr 22 00:02:56 IST 2005
| Date: Thu, 21 Apr 2005 16:50:32 +0100
| From: Seán Mac Suibhne (Lists) <seanmac2004 at eircom.net>
|
| I wonder is there a universally accepted character set for all email so that all
| mail reader programmes can read equally well?
Yes & no (pedantically no, but in practice, for this
audience, possibly yes).
| Would this be ISO-8859-15?
NO. None of the CJK scripts, or Greek, or Cyrillic,
or indeed most of the world's scripts (languages),
are encodable in ISO-8859-15. you have to use a
full UCS encoding, and in practice, that means UTF-8.
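a rough way to check this for yourself, assuming Python 3 and its
standard codecs. the sample strings and the helper name `can_encode`
are made up for illustration only:

```python
# Illustrative sample strings (assumptions, not from the original mail).
samples = {
    "Irish": "Déardaoin",
    "Greek": "Ελληνικά",
    "Cyrillic": "Русский",
    "CJK": "日本語",
}

def can_encode(text, charset):
    """Return True if every character of text is encodable in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

# For each sample, record whether each charset can represent it.
results = {
    name: {cs: can_encode(text, cs) for cs in ("iso-8859-15", "utf-8")}
    for name, text in samples.items()
}

for name, row in results.items():
    print(name, row)
```

only the Irish sample should survive ISO-8859-15; all four should
encode in UTF-8.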
| It seems that Eudora and webmail readers like Yahoo and webmail.u.tv do not
| like UTF-8 and I have not been able to send the Euro sign in ISO-8859-1
( I know nothing about Eudora or Yahoo: _if_ they cannot
handle UTF-8, they are *broken*! SO DO NOT USE THEM! )
the Euro sign (€) is not one of the UCS characters
encodable in ISO-8859-1, so of course it does not work.
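a minimal sketch of the same point, again assuming Python 3's
standard codecs (the helper `encode_or_none` is a hypothetical name
used only here):

```python
euro = "\u20ac"  # EURO SIGN

def encode_or_none(text, charset):
    """Encode text in charset, or return None if it cannot be represented."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None

latin1 = encode_or_none(euro, "iso-8859-1")   # no Euro sign in this set
latin9 = encode_or_none(euro, "iso-8859-15")  # Euro sign sits at byte 0xA4
utf8 = encode_or_none(euro, "utf-8")          # three-byte sequence

print(latin1, latin9, utf8)

# The byte that means "Euro" in ISO-8859-15 is the generic currency
# sign in ISO-8859-1, so mislabelling the charset changes the character.
as_latin1 = latin9.decode("iso-8859-1")
print(as_latin1)
```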
| however I can send all the accented characters in Irish and the Euro sign in
| ISO-8859-15
Yes. you _should_ be able to send the Western European
Latin scripts in ISO-8859-15, but that still excludes most
of the world's languages, plus non-language symbols
such as mathematics (∑, ∩, ∃, ∫, &tc), boxes (┏━┓┣┫ &tc),
and indeed most of the nearly 100,000 characters defined
in the UCS (a.k.a. Unicode): ISO-8859-15 encodes only 256
of them (not all printable).
| Basically I want to send áéíóú ÁÉÍÓÚ and € in emails without them coming out
| like the one at this link:
|
| http://groups.yahoo.com/group/eolas-ibi/message/704
_if_ you view that page forcing a character encoding
of “UTF-8”, it looks fine. the problem _seems_ to be
that no character encoding (or charset) is specified
by the HTML(/CSS), and so if (e.g.) yer browser is set
to default to ISO-8859-(1 or 15) you see garbage, such
as « DÃ©ardaoin » instead of the (presumably correct)
« Déardaoin ». so complain to Yahoo?
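the same idea applies on the sending side: label the text with its
charset so the reader does not have to guess. a minimal sketch,
assuming Python 3's standard `email` package (this is only an
illustration, not how kmail or Yahoo do it):

```python
from email.message import EmailMessage

body = "áéíóú ÁÉÍÓÚ €"

msg = EmailMessage()
# Explicitly declare the charset; it ends up in the Content-Type header.
msg.set_content(body, charset="utf-8")

declared = msg.get_content_charset()
print(msg["Content-Type"])
print(declared)

# A reader that honours the declared charset gets the original text back.
round_trip = msg.get_content()
```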
here's a hint: whenever you see strange nonsense
which renders as Ã... (or similar), try decoding
as UTF-8. for reasons left as an exercise to the
reader, UTF-8–encoded non–US-ASCII Latin characters
tend to render similar to that when (mis)read as
ISO-8859-(1 or 15).
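a small sketch of the hint, assuming Python 3 (the variable names are
illustrative). the accented Latin letters in the range U+00C0 to U+00FF
all get a UTF-8 lead byte of 0xC3, which is « Ã » in both ISO-8859-1
and ISO-8859-15:

```python
original = "Déardaoin"

# What actually travels: the UTF-8 bytes.
wire_bytes = original.encode("utf-8")

# What a reader sees if it wrongly assumes ISO-8859-1 (or -15).
garbled = wire_bytes.decode("iso-8859-1")
garbled_latin9 = wire_bytes.decode("iso-8859-15")

# The repair: undo the wrong decoding, then decode as UTF-8.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(garbled)
print(repaired)
```

the repair only works when the wrong decoding was lossless, which is
the case for ISO-8859-1 since it assigns a character to every byte
value.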
| GRMA
| Seán
cheers!
-blf-
--
Experienced (20+ yrs) kernel/software Eng: | Brian Foster Montpellier,
• Unix, embedded, &tc; • Linux; • doc; | blf at utvinternet.ie FRANCE
• IDL, automated testing, process, &tc. | Stop E$$o (ExxonMobile)!
Résumé (CV) http://www.blf.utvinternet.ie | http://www.stopesso.com