[ILUG] Deleting duplicate photos

Brendan Kehoe brendan at zen.org
Mon Sep 29 20:11:05 IST 2008


Timothy Murphy wrote:
> What is the best way of eliminating duplicate photos
> on a number of machines, all running Linux (Fedora or CentOS)?
>
> I suppose one could ask the same question about files generally;
> how to tag or delete duplicates.
>
> Any suggestions gratefully received.
>   

I don't have the files handy at the moment, but my approach was to run
md5sum on every file, then feed the output to a Perl script that uses an
associative array keyed on the checksum to emit rm commands for every
copy after the first.  Something like

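   # Assumes this runs once per line of md5sum output (e.g. under
   # "perl -n"), each line being "<32 hex digits>  <filename>".
   /^([0-9a-f]{32}) [ *](.*)$/ && do {
       my ($sum, $name) = ($1, $2);
       if (exists $files{$sum}) {
           # Digest seen before: quote the name and emit an rm command.
           $name =~ s/'/'\\''/g;
           print "rm -vf -- '$name'\n";
       } else {
           # First file with this digest is the one we keep.
           $files{$sum} = $name;
       }
   };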


A lot more sanity-checking needs to be done, but that was the idea.
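
As one possible sanity check, here is a rough shell sketch of the md5sum
step that only lists the duplicates and deletes nothing.  It is not the
script I used: the function name list_duplicates is made up for
illustration, and it assumes GNU find, md5sum, sort and awk, plus
filenames without newlines or backslashes (md5sum escapes those and
changes the line format).

```shell
#!/bin/sh
# list_duplicates DIR (hypothetical helper, for illustration only)
# Prints the path of every file under DIR whose MD5 digest matches an
# earlier file in sorted order.  It never deletes anything.
list_duplicates() {
    # 1. Hash every regular file under the directory.
    # 2. Sort so identical digests sit next to each other; within a
    #    group, the lexically smallest path comes first and is kept.
    # 3. For each digest already seen, print the filename, which
    #    starts at column 35 (32 hex digits plus a two-character
    #    separator).
    find "$1" -type f -exec md5sum {} + |
        LC_ALL=C sort |
        awk 'seen[$1]++ { print substr($0, 35) }'
}
```

Running something like "list_duplicates ~/Pictures" prints one path per
line, so the list can be reviewed by eye before any rm command is run.
Matching digests are strong evidence of identical files, but comparing
each pair with cmp before deleting would be a cheap extra check.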

Hope this helps,
B



