[ILUG] UTF-8, how to type ``Latin Small Letter e With Acute''?
Brian Foster
blf at utvinternet.ie
Thu May 30 20:50:23 IST 2002
| Date: Thu, 30 May 2002 13:53:36 +0200
| From: David Neary <dneary at wanadoo.fr>
| References: <200205300957.g4U9vaX04925 at linux.local>
|
| Brian Foster wrote:
| > I've been playing with using UTF-8 encoding (rather than
| > ISO-8859-15) with the Xfree86 v4.<something> supplied with
| > my SuSE 7.3 system. whilst there are numerous niggles,
| > the issue currently driving me batty is I cannot figure out
| > how to input (type in) characters such as é (Latin Small
| > Letter e with Acute; that is, ISO-8859-15 code E9 hex, or
| > UTF-8 octet stream C3 A9 hex) from my UK+€ (English) QWERTY
| > keyboard.
|
| Either use an editor which supports UTF-8 (such as emacs or vim
| with the necessary extensions, which I've never managed to get
| right, or gedit-2, or Mozilla composer), and do the compose-e-'
| thing, followed by an explicit "save encoded as UTF-8",
sorry, I wasn't clear in my original posting.... been there,
done that, doesn't work. regardless of the application I use,
xterm(1), vim(1), and on and on, COMPOSE isn't composing but
eating the next two keys (whether or not it's a valid compose
sequence!). but I have managed to start gvim(1)s which
almost work, albeit I don't understand how/why yet .... !?
| or use
| some kind of token to replace it (I ended up replacing all the
| és with &eacute; in my html CV).
as a general rule always use entities (e.g., ``&eacute;'')
in HTML and XML --- in fact, some HTML-editors auto-generate
the entities. never use explicit values (e.g., ``&#xxxx;'')
which are a priori charset-dependent, even if the charset is
declared in the META Content-Type tag. but this HTML/XML
side-issue is neither here nor there ....
a cute(-ish) application for selecting individual Unicode
glyphs is ucm(1).
| > I'm using ``xterm -u8'' with an ISO-10646-1 font that has the
| > necessary é glyph, and (AFAIK) the locale is consistent (and
| > not relevant?). sans ``-u8'' COMPOSE works, so it's not an
| > xterm(1) problem per se (unlike, or so it seems, Eterm(1)).
|
| Afraid I don't know anything about utf-8 xterm :) Sorry. You could
| use some kind of utility to utf-8ise your iso-8859-15 document
iconv(1) seems to do a good job of converting to/from UTF-8.
I never said anything about any document in any encoding.
instead, I'm toying with the idea of running the whole system
as UTF-8, à la Plan 9 ... as my normal working environment.
(mad or what? ...don't answer!) you are correct, for dealing
with individual UTF-8 documents, xterm+iconv+vim looks like it
should work fairly well.
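for illustration only, the two byte sequences mentioned earlier (E9 hex in ISO-8859-15, C3 A9 hex in UTF-8) and the sort of recoding iconv(1) performs can be sketched in a few lines of Python. this is not how iconv(1) is implemented, and the helper name recode() is made up for the example.

```python
def recode(data: bytes, from_enc: str, to_enc: str) -> bytes:
    """Decode bytes from one charset and re-encode them in another.

    Hypothetical helper; roughly what `iconv -f from_enc -t to_enc` does.
    """
    return data.decode(from_enc).encode(to_enc)


# U+00E9 is Latin Small Letter e With Acute.
latin9 = "\u00e9".encode("iso-8859-15")
utf8 = "\u00e9".encode("utf-8")

print(latin9.hex(), utf8.hex())
print(recode(latin9, "iso-8859-15", "utf-8") == utf8)
```

the same character is one octet in ISO-8859-15 and two in UTF-8, which is why a terminal, font, and editor that disagree about the encoding show garbage instead of é.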
to get UTF-8 vim to behave sensibly (with X11R6 on my system,
dunno about the console &tc):
1. :set encoding=utf-8
2. :set fileencoding=... as appropriate
3(vim). use ``xterm -u8'' with a fixed-width ISO-10646-1 font
3(gvim). :set guifont=... also a fixed-width ISO-10646-1 font
if the UTF-8 locale is valid, vim/gvim does the 1st one Ok,
and somehow tries the 2nd (doesn't seem to always work?).
it's gvim's guifont which annoys me. I just threw together
a bash(1) script which, after Very Limited testing, starts
up a UTF-8 vim or gvim from any locale (UTF-8 or not).
but I'm still mystified as to COMPOSE ?!?? any ideas?
( bizarrely, some gvims started by the above-mentioned
script do handle COMPOSE --- not 100% but close .... )
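the script itself isn't reproduced here. what follows is only a rough Python sketch of the same idea, built from steps 1 to 3 above. assumptions: the function name utf8_vim_command() and the font are placeholders (substitute any fixed-width ISO-10646-1 font installed locally), ``encoding'' is passed with vim's --cmd so it is set before any file is read, and the guifont value is an X11 font name, which may not suit every gvim build. it only builds and prints the command line; it launches nothing and does not touch the locale or COMPOSE.

```python
import shlex

# Assumption: placeholder font; any fixed-width ISO-10646-1 font should do.
DEFAULT_FONT = "-misc-fixed-medium-r-normal--18-120-100-100-c-90-iso10646-1"


def utf8_vim_command(gui=False, font=DEFAULT_FONT, fileencoding=None, files=()):
    """Build the argv list for a UTF-8 vim (inside xterm -u8) or gvim."""
    # Step 1: set the internal encoding before vimrc and files are processed.
    vim_args = ["--cmd", "set encoding=utf-8"]
    # Step 2: optionally set the encoding used when the file is written.
    if fileencoding:
        vim_args += ["-c", "set fileencoding=" + fileencoding]
    if gui:
        # Step 3 (gvim): the font is chosen inside gvim itself.
        argv = ["gvim"] + vim_args + ["-c", "set guifont=" + font]
    else:
        # Step 3 (vim): the terminal supplies UTF-8 mode and the font.
        argv = ["xterm", "-u8", "-fn", font, "-e", "vim"] + vim_args
    return argv + list(files)


# Print the commands rather than running them.
print(shlex.join(utf8_vim_command(files=["notes.txt"])))
print(shlex.join(utf8_vim_command(gui=True, fileencoding="iso-8859-15")))
```

to actually start the editor, the returned list could be handed to something like subprocess.run(); that part is left out on purpose.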
cheers!
-blf-
--
Innovative, very experienced, Unix and | Brian Foster Dublin, Ireland
Chorus (embedded RTOS) kernel internals | e-mail: blf at utvinternet.ie
expert looking for a new position ... | mobile: (+353 or 0)86 854 9268
For a résumé, contact me, or see my website http://www.blf.utvinternet.ie
Stop E$$o (ExxonMobile): «Whatever you do, don't buy Esso --- they
don't give a damn about global warming.» http://www.stopesso.com
Supported by Greenpeace, Friends of the Earth, and numerous others...