[ILUG] regexps for domains
bigbro at skynet.ie
Thu Aug 11 17:39:37 IST 2005
Kae Verens wrote:
| Ruairí Newman wrote:
|>> I was asked why that happened - it happened because that is, of course,
|>> a perfectly valid domain name.
Actually, going by the old-school rules, it's NOT a valid RFC domain
name, because the first character of each label must be a letter.
From RFC 1035:
"The labels must follow the rules for ARPANET host names. They must
start with a letter, end with a letter or digit, and have as interior
characters only letters, digits, and hyphen. There are also some
restrictions on the length. Labels must be 63 characters or less."
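
As a rough sketch of that quoted rule (not anything taken from the RFC
itself), a strict per-label check in Python could look like the
following. The names LABEL_RE and is_strict_label are made up for
illustration.

```python
import re

# RFC 1035 label as quoted above: starts with a letter, ends with a
# letter or digit, interior is letters, digits or hyphens, and the
# whole label is at most 63 characters (1 + up to 61 + 1).
LABEL_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_strict_label(label: str) -> bool:
    """Return True if `label` satisfies the strict RFC 1035 label rule."""
    return LABEL_RE.fullmatch(label) is not None
```

Under this check "skynet" passes, while "3com" (leading digit) and
"foo-" (trailing hyphen) do not.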
A number of people (perhaps most notably 3com.com) merrily broke this
rule and BIND generally didn't care, so the RFC has since been
superseded by vaguer rules as to what's allowed. My last reading seems
to imply that basically anything goes now (I'm open to correction).
Some software is unable to resolve domains such as 3com.com and 888.com
for this very reason (they don't comply with the original DNS RFC). In
fairness, I've mostly come across strictly RFC-compliant resolver and
server products in the telco arena rather than the internet arena -
though there is a similar disagreement in the ISP / Internet arena as to
whether underscores should be allowed in DNS names...
The regex you're looking for might use [A-Za-z][A-Za-z0-9-]+\. to match
each label, which should neatly solve your problem.
It will, of course, NOT pick up and highlight domains such as 3com.com,
NOR will it highlight IP addresses (which are valid as URL strings) - so
you may wish to pick them up some other way.
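
To make that concrete, here is a minimal Python sketch built around the
per-label pattern above. The lookbehind/lookahead guards, the
alphabetic final label and the names DOMAIN_RE and find_domains are my
own assumptions, added so the match can't start halfway through
something like 3com.com.

```python
import re

# Per-label pattern from the post: a letter, then one or more
# letters, digits or hyphens.
LABEL = r"[A-Za-z][A-Za-z0-9-]+"

DOMAIN_RE = re.compile(
    r"(?<![A-Za-z0-9.-])"      # assumption: must not start mid-name
    r"(?:" + LABEL + r"\.)+"   # one or more labels, each followed by a dot
    r"[A-Za-z]{2,}"            # assumption: final label is alphabetic
    r"(?![A-Za-z0-9-])"        # assumption: must not stop mid-label
)


def find_domains(text):
    """Return every strictly-named domain found in `text`."""
    return DOMAIN_RE.findall(text)


sample = "See www.skynet.ie, 3com.com, 888.com and 192.168.0.1 for details."
print(find_domains(sample))
```

Note that, as written, the per-label pattern needs at least two
characters per label and tolerates a trailing hyphen, so it is a little
off from the strict RFC wording in both directions.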
Hope this helps.