[ILUG] Gawk query
Brian Foster
blf at utvinternet.ie
Fri Aug 24 00:44:29 IST 2007
| From: Brendan Halpin <brendan.halpin at ul.ie>
| Date: Thu, 23 Aug 2007 20:31:03 +0100
|
| Brian Foster <blf at utvinternet.ie> writes:
| > after a bit of head-scratching, the easiest approach
| > seems to be a bit of pre-processing; that is, make
| > the two types of spaces unique.
|
| Frankly, the easiest approach is to bite the bullet and go for a
| regexp approach. [ ... ]
it's some of this and some of that: unless yer an RE guru,
the size/number of REs involved can be daunting and difficult
to debug, and (I suspect) yer left wondering what odd cases
were missed (i.e., it could be difficult for an RE non-guru to
confidently grok when the RE “fails”). OTOH, an RE can be
the easiest and quickest (in most senses) approach ....
w.r.t. my speculation that it ought to be possible to generalise
the OP's case, here's a possible general solution (this may
only work with GNU sed(1), since it leans on \+ and on \n in
the replacement text, which other seds may not accept; yer
kiloage could vary!):
# each input line consists of zero or more FIELDs.
# each FIELD is printed on a separate output line.
# a FIELD is [BTXT] or "QTXT" or STXT where:
# - BTXT does not contain ] but may contain [, ", and space anywhere.
# - QTXT does not contain " but may contain [, ], and space anywhere.
# - STXT does not contain space, [, or ", but may contain ] anywhere.
# spaces at the beginning and end of an input line are discarded.
# spaces not in BTXT or QTXT separate FIELDs (and are discarded).
# both [BTXT (no ]) and "QTXT (no terminal ") may cause chaos.
sed -e ':again
s/^ \+//
/^$/d
/^\[/{
s/^\[\([^]]*\)\]/\1\n/
bprint
}
/^"/{
s/^"\([^"]*\)"/\1\n/
bprint
}
s/^\([^[" ]*\)/\1\n/
:print
P
s/^.*\n//
tagain'
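a quick way to try the script above is to wrap it in a shell function and feed it one line. this is only a sketch: the function name split_fields and the sample input are made up for illustration, and it assumes GNU sed is the sed on your PATH.

```shell
#!/bin/sh
# split_fields (hypothetical name): the sed script from this mail, unchanged,
# wrapped in a function so it can be used as a filter.  Assumes GNU sed.
split_fields() {
    sed -e ':again
s/^ \+//
/^$/d
/^\[/{
s/^\[\([^]]*\)\]/\1\n/
bprint
}
/^"/{
s/^"\([^"]*\)"/\1\n/
bprint
}
s/^\([^[" ]*\)/\1\n/
:print
P
s/^.*\n//
tagain'
}

# One made-up input line holding one STXT, one [BTXT], one "QTXT", and an
# STXT containing ], with leading and trailing spaces.
printf '%s\n' '  foo [bar baz] "qu ux" a]b  ' | split_fields
# prints:
#   foo
#   bar baz
#   qu ux
#   a]b
```

each pass through the loop strips leading spaces, peels one FIELD off the front, puts a newline after it, prints up to that newline with P, and deletes what was printed; the d on an empty pattern space is what ends the loop.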
other solutions are possible.
both [BTXT and "QTXT malformed FIELDs should be handled better. ;-\
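one possible way to handle the malformed FIELDs, as a sketch only: check for an unterminated [ or " before trying to peel off a FIELD, and if one is found, print the rest of the line with a marker and stop working on that line. the function name split_fields_strict, the MALFORMED: marker, and the policy itself are assumptions made up for illustration; GNU sed is assumed as before.

```shell
#!/bin/sh
# split_fields_strict (hypothetical name): same loop as the original script,
# plus two guards for an opening [ or " that is never closed.  Assumes GNU sed.
split_fields_strict() {
    sed -e ':again
s/^ \+//
/^$/d
# guard: [ with no ] anywhere after it, or " with no second "
/^\[[^]]*$/bbad
/^"[^"]*$/bbad
/^\[/{
s/^\[\([^]]*\)\]/\1\n/
bprint
}
/^"/{
s/^"\([^"]*\)"/\1\n/
bprint
}
s/^\([^[" ]*\)/\1\n/
:print
P
s/^.*\n//
tagain
# safety net: never fall through into the error path
b
:bad
# made-up policy: mark the remainder and let sed auto-print it
s/^/MALFORMED: /'
}

printf '%s\n' 'ok "fine one" [oops no close' | split_fields_strict
# prints:
#   ok
#   fine one
#   MALFORMED: [oops no close
```

the marker goes to stdout along with the good FIELDs, so whatever reads the output has to look for it; dropping the malformed remainder silently (a d in place of the final s) would be just as easy, which is a design choice rather than anything the original script dictates.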
cheers!
-blf-
--
▶ ▶ I AM CURRENTLY LOOKING FOR A JOB! ◀ ◀ | Brian Foster
Experienced (>25 yrs) software engineer: | Montpellier, FRANCE
• Unix, Linux, embedded, design-for-test; | Stop E$$o (ExxonMobile)!
• Software/hardware co-design, debugging; | http://www.stopesso.com
• Kernels, drivers, filesystems, &tc; Résumé (CV) & contact details:
• IDL, automated testing, process, &tc. http://www.blf.utvinternet.ie