10-21-2009, 7:36 AM
Is there a program that will locate duplicate files by content, not name? E.g., I took 50 pictures and the kids copied and renamed them. Now I'm trying to clean up the duplicates. I would do it by sight, but it's about 12k pictures total, and maybe 500 have been renamed or copied and renamed. I don't want to lose my only copy if they just renamed it, but if it is a copy, at 5 MB each, they are taking up space.


10-21-2009, 7:46 AM
Maybe try something like iPhoto, Aperture, Photoshop Elements, or Lightroom. I'm not sure whether they can catalog by the actual image checksum, though.

You could write a script that catalogs all the images by their EXIF data and then tells you when two images have the same EXIF. Then delete the extras by hand (or write that into your script). Perl has EXIF modules (Image::EXIF, iirc).

Just a tip for the future: never alter your originals. Keep the files the same. If you want a method of sorting that will be program independent, try storing photos in directories by date, e.g. /Volume/<folder>/YYYY-MM-DD, or /Volume/<folder>/YYYY/YYYY-MM-DD, etc. If you want to make a change to one, copy it and make the changes to the copy. Of course, if you shoot RAW, you'd never alter your original.
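
If it helps, here is a rough sketch of building those date-based folder names in Python. The function name dated_folder, the nest_by_year flag, and the example root are made up for illustration; it only computes the path and doesn't move or create anything.

```python
from datetime import date
from pathlib import PurePosixPath


def dated_folder(root, shot_on, nest_by_year=False):
    """Build <root>/YYYY-MM-DD or <root>/YYYY/YYYY-MM-DD for a given date.

    Hypothetical helper: it only builds the path, it does not touch the disk.
    """
    day = shot_on.strftime("%Y-%m-%d")
    base = PurePosixPath(root)
    if nest_by_year:
        # Extra year level keeps the top folder from filling up with days.
        return base / shot_on.strftime("%Y") / day
    return base / day
```

For example, dated_folder("/Volume/photos", date(2009, 10, 21), nest_by_year=True) gives /Volume/photos/2009/2009-10-21.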

10-21-2009, 7:57 AM
Hmm, I use Elements, so I'll have to play around with the library function. I'm normally pretty good at it, but this time the kids got hold of the originals before I did, and it got messed up before I realized what happened. Thanks for the tips.


This is my favorite. It compares file content by CRC, then shows you the dupes. Lots of options.
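
If you'd rather roll your own, the same idea (compare by content, ignore names) can be sketched in a few lines of Python. This is not how that program works internally, just a minimal sketch under my own assumptions: it uses SHA-256 from the standard library instead of CRC because accidental collisions are far less likely, and it only hashes files that share a size with another file. The names file_digest and find_duplicates are made up here. It only reports groups; it never deletes anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under root with identical content."""
    # Pass 1: bucket by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Call find_duplicates on your picture folder and print each group; every group is a set of files with the same bytes under different names, so you can keep one and remove the rest by hand. A photo that was only renamed shows up once and won't be listed.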

Here's some other stuff, though: http://lifehacker.com/395896/de+duplicate-your-data-with-free-tools