[ILUG] Multi line regular expression help

Kingsley G. Morse Jr. kingsley at loaner.com
Thu Dec 1 20:19:13 GMT 2011


Hi Marcus,

You're very welcome.

I'll try to answer your other questions before
giving you an updated script.

1.) I'm not certain which extra spaces and dashes
    you meant, but my guess is that appending

        | sed 's/[ -]*">/">/'

    to the end of the pipeline gets rid of them. It
    removes any run of spaces and dashes that comes
    right before a closing ">.

2.) You can match exactly two of the [A-Z]
    characters with

        [A-Z]{2}

    The {2} form needs extended regular expressions
    (see 3); in basic ones it's written \{2\}.

3.) My understanding is that, in GNU sed, the -E
    option is an undocumented alternative to -r.
    Both let sed use extended regular expressions.
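
As a quick check of the three points above, here's a
minimal sketch. It assumes GNU sed, and the sample
strings and variable names are made up for illustration.

```shell
#!/bin/bash
# Assumes GNU sed. Sample strings and variable names are illustrative only.

# 1.) Strip the run of spaces and dashes that sits just before ">.
cleaned=$(echo '<tag language="FR" attribute="othertext - ">' | sed 's/[ -]*">/">/')
echo "$cleaned"    # <tag language="FR" attribute="othertext">

# 2.) With -r, [A-Z]{2} matches exactly two capital letters,
#     so (FR) is rewritten but (FRA) is left alone.
two=$(echo '(FR) (FRA)' | sed -r 's/\(([A-Z]{2})\)/[\1]/g')
echo "$two"        # [FR] (FRA)

# 3.) -E should give the same result as -r.
same=$(echo '(FR) (FRA)' | sed -E 's/\(([A-Z]{2})\)/[\1]/g')
echo "$same"
```

If your sed doesn't accept -E, stick with -r.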


Here's an updated script...

#!/bin/bash

echo "<tag>(FR) text

<tag> - (FR) text

<tag> (FR)
text

<tag>
(FR) text

<tag>
 - (FR) text

<tag>
othertext - (FR) text

<tag>othertext - (FR) text

<tag>othertext -
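(FR) text" |
    # Join each line containing ">" with the line after it, then turn
    # the two-letter code and the text before it into attributes.
    sed -r -n '/>/{N; s/\n//; s/>(.*)\(([A-Z]{2})\)/ language="\2" attribute="\1">/g;p;}' |
    # Break the line after each ">", dropping one following space.
    sed -e 's/> \?/>\n/g' |
    # Strip spaces and dashes left in front of the closing ">.
    sed 's/[ -]*">/">/'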
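
To see what each stage of the pipe contributes, here's
a sketch that feeds one of the sample records through
the three sed commands one at a time. It assumes GNU
sed (for -r, and for \? and \n in the second command);
the variable names are mine.

```shell
#!/bin/bash
# Assumes GNU sed; record and stage1..stage3 are illustrative names.

record='<tag>
othertext - (FR) text'

# Stage 1: join the line after a ">" onto it, then move the two-letter
# code and the text before it into attributes.
stage1=$(echo "$record" | sed -r -n '/>/{N; s/\n//; s/>(.*)\(([A-Z]{2})\)/ language="\2" attribute="\1">/g;p;}')
echo "$stage1"
# <tag language="FR" attribute="othertext - "> text

# Stage 2: break the line after every ">", dropping one following space.
stage2=$(echo "$stage1" | sed -e 's/> \?/>\n/g')
echo "$stage2"
# <tag language="FR" attribute="othertext - ">
# text

# Stage 3: trim the trailing spaces and dashes inside the attribute value.
stage3=$(echo "$stage2" | sed 's/[ -]*">/">/')
echo "$stage3"
# <tag language="FR" attribute="othertext">
# text
```

Note that stage 3 trims everything when the text before
the code is only spaces and dashes, so a record like
"<tag> - (FR) text" ends up with attribute="".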


OK?
Kingsley




