[ILUG] sed question

Marcus Furlong furlongm at hotmail.com
Thu Aug 14 12:52:51 IST 2008


Hi,

I'm having trouble using sed to do replacements on some badly tagged
XML. I have a large number of files that are tagged as follows:

<first id="34">
blah blah
<second id="56" name="xyz1">hello hello</second>
<second name="xyz4">hello hello</second>
<second id="16" name="xyz5">hello hello</second>
<first id="3">
blah blah blah
<second>hello hello</second>
<second id="12" name="xyz5">hello hello</second>

The "first" tags have no closing tags at all, and may or may not have
text between the tag and the next tag. What I want to do is remove each
"first" tag and any text up to, but not including, the next "second" tag.

I've got to the following stage, but don't know how to get it to _not_
delete the line containing the "second" tag:

sed -e '/<first/,/<second/ s/.*//' file.xml
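
One variant I have been sketching, checked only against the sample above,
keeps the same address range but deletes only the lines inside it that do
not themselves match "<second". The file name sample.xml and the temporary
directory are placeholders for illustration, and it assumes a "first" tag
never shares a line with a "second" tag:

```shell
# Recreate the sample input from above in a scratch directory
# (sample.xml is a hypothetical name, not one of the real files).
tmp=$(mktemp -d)
cat > "$tmp/sample.xml" <<'EOF'
<first id="34">
blah blah
<second id="56" name="xyz1">hello hello</second>
<second name="xyz4">hello hello</second>
<second id="16" name="xyz5">hello hello</second>
<first id="3">
blah blah blah
<second>hello hello</second>
<second id="12" name="xyz5">hello hello</second>
EOF

# For every range running from a "<first" line to the next "<second" line,
# delete each line in the range unless it contains "<second" itself.
result=$(sed -e '/<first/,/<second/{/<second/!d;}' "$tmp/sample.xml")
printf '%s\n' "$result"

rm -r "$tmp"
```

If a "first" tag and a "second" tag can appear on the same line, the range
would run on to the following "second" line, so this would need more work.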

Any suggestions?
Thanks,
-- 
Marcus Furlong


