[ILUG] Remove duplicate lines from a file?
kenn at bluetree.ie
Fri Jun 30 15:23:27 IST 2000
> > or
> > [duplicate_filter]
> > file_containing_masses_of_duplicate_lines_distributed_at_random
> > file_containing_unique_lines
> > Any ideas?
> man uniq. Next question.
My man uniq says "uniq requires sorted input..."
I'll let you work out the options for these commands
but here's the principle:
cat -n (to add line numbers)
| sort on the second and subsequent fields, with the line number
  as a tiebreak (so the first occurrence of each duplicate sorts first)
| uniq -f 1 to uniqueify while skipping the line-number field
  (-f skips fields; -s skips characters)
| sort numerically on field 1 (to restore the original order)
| awk/sed/cut to trim off the line numbers.
The details are left as an exercise for the reader.
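For the impatient reader, here is one way the above might look as a single pipeline. This is a sketch, not tested on every uniq/sort flavour; it assumes cat -n's tab-separated output and POSIX sort/uniq/cut options, and `infile` is a placeholder name:

```shell
# Number lines, sort by content with the line number as a numeric
# tiebreak (so the earliest occurrence of each duplicate comes first),
# drop adjacent duplicates while skipping the number field, restore the
# original order, then cut the numbers back off.
cat -n infile | sort -k2 -k1,1n | uniq -f 1 | sort -k1,1n | cut -f2-
```

(If you don't care about the mechanics, a one-liner like awk '!seen[$0]++' infile does the same job in one pass, at the cost of holding every unique line in memory.)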