[ILUG] PHP plus Celtic languages
kevin
kevin at cybercolloids.net
Thu Aug 19 09:07:17 IST 2004
Yes, I specify
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Language" content="kw"/>
Much to my surprise the W3C lists a content language for Cornish: kw = Kernewek.
It seems to work OK in Mozilla and Konqueror. To continue the pedantic
note... what is the correct code to use for "small t with cedilla"? If not
&#x163;
Kevin.
On Wednesday 18 August 2004 22:38, Brian Foster wrote:
| From: kevin <kevin at cybercolloids.net>
| Date: Wed, 18 Aug 2004 11:21:50 +0100
|[ ... ]
| Cornish uses some accents including t-cedilla in words such as
|
| conveţhaz - Verb, to understand
|
| I can write this using codes in UTF-8 like conve&#x163;haz [ ... ]
uh, not exactly. “&#x163;” does not (cannot)
represent literal UTF-8 per se. (it _is_ the
UCS codepoint value for U+0163, which is
“LATIN SMALL LETTER T WITH CEDILLA”, which
apparently is the character you want.)
I cannot recall if the “&#<dec>;” and “&#X<hex>;”
HTML/XML entities specify UCS codepoints (i.e.,
independent of the document's charset/encoding),
or character values specific to the document's
encoding.
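On the question of what the numeric references mean: as far as the HTML 4.01 and XML specifications go, a numeric character reference names a UCS code point, not a byte value in the document's encoding. A minimal sketch in Python (an assumption, since the thread itself is about PHP) using the standard-library `html.unescape`; the variable names are only illustrative:

```python
import html

# Decimal and hexadecimal numeric character references for U+0163.
decimal_ref = "&#355;"
hex_ref = "&#x163;"

decoded_decimal = html.unescape(decimal_ref)
decoded_hex = html.unescape(hex_ref)

# Both references resolve to the same code point; no charset is involved
# at this stage, only the code point number.
codepoints = [f"U+{ord(ch):04X}" for ch in (decoded_decimal, decoded_hex)]

# The same reference embedded in the example word from the thread.
word = html.unescape("conve&#355;haz")

print(codepoints, word)
```

The decoding step never consults an encoding, which is why the same reference should work whether the page is served as UTF-8 or as something else.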
I presume yer document effectively specifies
its encoding is UTF-8, in which case my bad
memory matters less than usual: the 163 hex
(355 decimal) UCS value is turned into the
correct UTF-8 byte sequence (which is the
two hex bytes C5 A3).
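The code point to byte sequence step described above can be checked by hand. A small sketch, assuming Python rather than PHP; `utf8_two_byte` is a hypothetical helper written for this note and only covers the two-byte range, and the result is compared against Python's own encoder:

```python
def utf8_two_byte(codepoint: int) -> bytes:
    """Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= codepoint <= 0x7FF:
        raise ValueError("only the two-byte UTF-8 range is handled here")
    lead = 0b11000000 | (codepoint >> 6)           # 110xxxxx: top five bits
    trail = 0b10000000 | (codepoint & 0b00111111)  # 10xxxxxx: low six bits
    return bytes([lead, trail])


T_CEDILLA = 0x163  # 355 decimal, LATIN SMALL LETTER T WITH CEDILLA

manual = utf8_two_byte(T_CEDILLA)
builtin = chr(T_CEDILLA).encode("utf-8")

print(manual.hex(" "), builtin.hex(" "))
```

Both values should come out as the bytes C5 A3, matching the sequence quoted above.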
pedantically cheers!
-blf-
--
«How many surrealists does it take to | Brian Foster Montpellier,
change a lightbulb? Three. One calms | blf at utvinternet.ie FRANCE
the warthog, and two fill the bathtub | Stop E$$o (ExxonMobile)!
with brightly-colored machine tools.» | http://www.stopesso.com