[ILUG] Remove duplicate lines from a file?

Kenn Humborg kenn at bluetree.ie
Fri Jun 30 15:23:27 IST 2000


> > or
> > 
> > [duplicate_filter]
> > file_containing_masses_of_duplicate_lines_distributed_at_random
> > file_containing_unique_lines
> > 
> > Any ideas?
>  
>  man uniq. Next question.

My man uniq says "uniq requires sorted input...", since it only
compares adjacent lines, and a plain sort loses the original order.

I'll let you work out the options for these commands
but here's the principle:

cat -n (to add line numbers)
 | sort on the second and subsequent fields (to ignore the line numbers)
 | uniq -s or -f to uniquify while skipping the line numbers
 | sort numerically on field 1 (to restore the original order)
 | awk/sed/cut to trim off the line numbers.

The details are left as an exercise for the reader.
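
For anyone who wants a starting point, here is one way the details
might be filled in. It is a sketch, not the only answer: the function
name dedupe_keep_order is made up for illustration, it assumes cat -n
separates the number from the text with a tab (GNU and BSD cat both
do), and it adds a tie-break on the line number so that uniq keeps the
first occurrence of each line.

```shell
# Hypothetical helper: read lines on stdin, print the first occurrence
# of each line in its original order.
dedupe_keep_order() {
    tab=$(printf '\t')   # cat -n puts a tab between number and text

    # 1. number the lines
    # 2. sort on the text (field 2 onward); break ties by line number
    #    so the earliest copy of each line comes first
    # 3. drop repeats, ignoring the leading line-number field
    # 4. sort numerically to restore the original order
    # 5. cut away the line numbers
    cat -n |
        LC_ALL=C sort -t "$tab" -k2 -k1,1n |
        LC_ALL=C uniq -f 1 |
        sort -n |
        cut -f2-
}
```

It would be used along the lines of

    dedupe_keep_order < file_with_duplicates > file_with_unique_lines

where both file names are placeholders. LC_ALL=C is there so that sort
and uniq agree byte-for-byte on which lines are equal.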

Later,
Kenn




