SOLUTION: see the EDIT at the bottom of this post.
PROBLEM: I have a directory with a heap of images, named something like below:
- image001.nef
- image002.nef
- image003.nef
- image003 - 20170609.jpg
- image004.nef
- image005.nef
- image006 - 20170609.nef
- image007.nef
- image007 - 20170609.jpg
- image008.jpg
- image008 - 20170609.nef
I want to find all images that share a base name (like imageXXX) with another file AND have a JPG extension.
So from my list above, only three items match the criteria for deletion: image003 - 20170609.jpg, image007 - 20170609.jpg, and image008.jpg.
I have 2,500 images, so a Pythonic approach is preferable to going through them manually.
I am having a hard time finding an example script to use. All the ones I have found compare file hashes or something similar, which I don't believe is useful here, as the images are similar but not identical.
Cheers
EDIT: Thanks to dawg, I was able to get the output I wanted. Here is the final code that worked for me:
import os
directory = r'C:\temp'
out_directory = r'C:\temp\temp_usa_photos'
fns = os.listdir(directory)
# Base names (first 15 characters) of every NEF file
ref_nef = {fn[0:15] for fn in fns if fn.upper().endswith('.NEF')}
print(ref_nef)
# JPG files whose first 15 characters match one of the NEF base names
out_list = [fn for fn in fns if fn.upper().endswith('.JPG') and fn[0:15] in ref_nef]
print(out_list)
# Move the matching JPGs to out_directory (which must already exist)
for f in out_list:
    input_file = os.path.join(directory, f)
    output_file = os.path.join(out_directory, f)
    os.rename(input_file, output_file)
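Note that the slice fn[0:15] ties the match to the first 15 characters of each file name. For names like the sample list in the question, a rough sketch of an alternative is below. It assumes the base name is everything before the first ' - ' separator, or before the extension when there is no separator. The helpers base_name and jpgs_with_nef are hypothetical names made up for this illustration, and the sketch only lists the matches without moving anything:

```python
import os

def base_name(filename):
    """Return the name without its extension and without any ' - <suffix>' part.

    Assumption: ' - ' separates the base name from a date suffix.
    """
    stem, _ext = os.path.splitext(filename)
    return stem.split(' - ')[0].strip().lower()

def jpgs_with_nef(filenames):
    """Return the JPG names whose base name also belongs to a NEF file."""
    nef_bases = {base_name(fn) for fn in filenames if fn.lower().endswith('.nef')}
    return sorted(fn for fn in filenames
                  if fn.lower().endswith('.jpg') and base_name(fn) in nef_bases)

# The sample listing from the question
names = [
    'image001.nef', 'image002.nef', 'image003.nef', 'image003 - 20170609.jpg',
    'image004.nef', 'image005.nef', 'image006 - 20170609.nef', 'image007.nef',
    'image007 - 20170609.jpg', 'image008.jpg', 'image008 - 20170609.nef',
]
print(jpgs_with_nef(names))
# ['image003 - 20170609.jpg', 'image007 - 20170609.jpg', 'image008.jpg']
```

To run it against a real folder, pass os.listdir(directory) instead of names, and check the printed list before moving or deleting anything.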