[ILUG] Multi line regular expression help
Marcus Furlong
furlongm at gmail.com
Fri Dec 2 07:55:55 GMT 2011
On Fri, Dec 2, 2011 at 07:19, Kingsley G. Morse Jr. <kingsley at loaner.com> wrote:
> Hi Marcus,
>
> You're very welcome.
>
> I'll try to help with your other questions, before
> giving you an updated script.
>
> 1.) I'm not certain that I know exactly which
> extra spaces and dashes you alluded to, but my
> guess is that appending
>
> | sed 's/[ -]*">/">/'
>
> to the end of the pipe gets rid of them.
Yep, it did.
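
For anyone reading this in the archive, here is a minimal sketch of what that extra filter does. The sample line is made up for illustration and is not taken from the real data:

```shell
# Hypothetical sample: the attribute value ends in a stray " - ".
sample='<tag language="FR" attribute="othertext - ">'

# [ -]* matches any run of spaces and dashes sitting directly before the
# closing "> and the replacement puts back only the "> itself.
cleaned=$(printf '%s\n' "$sample" | sed 's/[ -]*">/">/')

printf '%s\n' "$cleaned"
```

A line with nothing to trim before the closing "> passes through unchanged, because [ -]* is also allowed to match zero characters.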
> 2.) You can specify only two of the [A-Z] characters
> with
>
> [A-Z]{2}
I didn't know this syntax before; it'll definitely come in handy. In the
end I opted to list all the valid languages explicitly, like so:
(BG|CS|DA|DE|EL|EN|ES|ET|FI|FR|GA|HU|IT|LT|LV|MT|NL|PL|PT|RO|SK|SL|SV)
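
Roughly how that alternation slots into an extended-regex sed call, as a sketch only. The helper name extract_lang and the two test lines are invented for this example, and it assumes GNU sed with -r:

```shell
# The language list from above, without the outer parentheses.
langs='BG|CS|DA|DE|EL|EN|ES|ET|FI|FR|GA|HU|IT|LT|LV|MT|NL|PL|PT|RO|SK|SL|SV'

# extract_lang is a hypothetical helper: it prints the language code found
# inside literal parentheses, or nothing if the code is not in the list.
extract_lang() {
    # \( and \) are literal parentheses in extended mode; the bare ( )
    # around the alternation form capture group 1.
    printf '%s\n' "$1" | sed -r -n "s/.*\((${langs})\).*/\1/p"
}

extract_lang '<tag>(FR) text'   # FR is in the list
extract_lang '<tag>(XX) text'   # XX is not, so nothing is printed
```

Compared with [A-Z]{2}, the explicit list will not match two-letter strings that are not valid language codes.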
> 3.) My understanding is that the -E option is an
> undocumented alternative to -r. Both let sed
> use extended regular expressions.
Yep, using -r on the older version of sed worked a treat.
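
As a side note, a small sketch of what -r changes, assuming GNU sed. Without it, sed uses basic regular expressions, where the grouping parentheses and the {2} interval need backslashes. The input string here is made up:

```shell
# Extended syntax (-r): ( ) group, {2} is an interval, \( \) are literal.
ere=$(printf '%s\n' 'x(FR)y' | sed -r 's/\(([A-Z]{2})\)/<\1>/')

# Basic syntax (no -r): \( \) group, \{2\} is an interval, ( ) are literal.
bre=$(printf '%s\n' 'x(FR)y' | sed 's/(\([A-Z]\{2\}\))/<\1>/')

printf '%s\n%s\n' "$ere" "$bre"
```

Both commands do the same substitution; only the escaping differs.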
> Here's an updated script...
>
> #!/bin/bash
>
> echo "<tag>(FR) text
>
> <tag> - (FR) text
>
> <tag> (FR)
> text
>
> <tag>
> (FR) text
>
> <tag>
> - (FR) text
>
> <tag>
> othertext - (FR) text
>
> <tag>othertext - (FR) text
>
> <tag>othertext -
> (FR) text" | sed -r -n '/>/{N; s/\n//; s/>(.*)\(([A-Z]{2})\)/ language="\2" attribute="\1">/g;p;}' | sed -e 's/> \?/>\n/g' | sed 's/[ -]*">/">/'
>
>
> OK?
Perfect! With a few minor modifications I've got it doing exactly what
it should. Thanks again for all your help!
Marcus.
--
Marcus Furlong