[ILUG] Remove duplicate lines from a file?
Kenn Humborg
kenn at bluetree.ie
Fri Jun 30 15:23:27 IST 2000
> > or
> >
> > [duplicate_filter]
> > file_containing_masses_of_duplicate_lines_distributed_at_random
> > file_containg_unique_lines
> >
> > Any ideas?
>
> man uniq. Next question.
My man uniq says "uniq requires sorted input...", so on its own it
will only collapse duplicates that happen to be adjacent.
I'll let you work out the options for these commands
but here's the principle:
cat -n (to add line numbers)
| sort on the second and subsequent fields (to ignore the line numbers)
| uniq -s (skip N characters) or -f (skip N fields) to drop duplicates while ignoring the line numbers
| sort numerically on field 1 (to restore the original order)
| awk/sed/cut to trim off the line numbers.
The details are left as an exercise for the reader.
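For anyone who wants to check their answer, here is one way the
details might be filled in. It is only a sketch, assuming a cat that
supports -n and writes the number followed by a tab (GNU and BSD cat
both do), and the function name dedupe_keep_order is made up for
illustration. It uses uniq -f 1 (skip one field) rather than -s so that
the width of the line numbers doesn't matter, and adds a secondary
numeric sort key so that the first occurrence of each line is the one
that survives.

```shell
#!/bin/sh
# dedupe_keep_order FILE
# Hypothetical helper (not from the original post): prints FILE with
# duplicate lines removed, keeping the first occurrence of each line
# and preserving the original order.
dedupe_keep_order() {
    tab=$(printf '\t')

    # 1. Number every line; assumes cat -n emits "<number><TAB><line>".
    cat -n -- "$1" |
    # 2. Group identical lines together: sort on everything after the
    #    first tab, then by line number so the earliest copy comes first.
    #    LC_ALL=C forces plain byte comparison, so sort and uniq agree
    #    on what counts as "equal".
    LC_ALL=C sort -t "$tab" -k2 -k1,1n |
    # 3. Drop adjacent duplicates, ignoring the first (line number) field.
    LC_ALL=C uniq -f 1 |
    # 4. Put the survivors back in their original order.
    LC_ALL=C sort -n |
    # 5. Strip the line numbers (cut splits on tabs by default).
    cut -f 2-
}
```

Something like "dedupe_keep_order infile > outfile" should then do the
job. Lines that differ only in whitespace are treated as different.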
Later,
Kenn