6

Here's the scenario:

I have a long list of time-stamped file names with characters before and after the time-stamp.

Something like this: prefix_20160817_suffix

What I want is a list (which will ultimately be a subset of the original list) that contains file names with specific prefixes, suffixes, and parts of the timestamp. These specific strings are already given in a list. Note: this "contains" list might vary in size.

For example: ['prefix1', '2016', 'suffix'] or ['201608', 'suffix']

How can I easily get a list of file names that contain every element in the "contains" array?

Here's some pseudo code to demonstrate what I want:

for each fileName in the master list:
    if the fileName contains EVERY element in the "contains" array:
        add fileName to filtered list of filenames
  • filtered_list = [fn for fn in master_list if all(item in fn for item in contains_list)] Commented Aug 17, 2016 at 17:04
  • all(element in fileName for element in contains)? Commented Aug 17, 2016 at 17:05
  • Out of curiosity, why the downvote? What should I have done differently? Commented Aug 17, 2016 at 17:16

5 Answers

6

I'd compile the list into a fnmatch pattern:

import fnmatch

# fnmatch matches the whole name, so wrap the joined fragments in wildcards
pattern = '*' + '*'.join(contains) + '*'
filtered_filenames = fnmatch.filter(master_list, pattern)

This basically concatenates all strings in contains into a glob pattern with * wildcards in between. This assumes the order of contains is significant. Given that you are looking for prefixes, suffixes and (parts of) dates in between, that's not that much of a stretch.

Note that fnmatch.filter() normalizes case with os.path.normcase(), so on Windows the matching is case-insensitive. This is usually exactly what you'd want there. If you need case-sensitive matching on every platform, use fnmatch.fnmatchcase() instead.
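
As a rough sketch of how the glob approach behaves, assuming the fragments may sit anywhere in the name and contain no glob metacharacters (`*`, `?`, `[`): the helper name `filter_in_order` and the sample file names below are made up for illustration. It uses fnmatch.fnmatchcase() so the result does not depend on the platform.

```python
import fnmatch

def filter_in_order(names, contains):
    """Keep names that contain every fragment, in the given order."""
    # Assumption: fragments hold no glob metacharacters (*, ?, [).
    # Leading and trailing '*' let the fragments match anywhere in the name.
    pattern = '*' + '*'.join(contains) + '*'
    # fnmatchcase() compares case-sensitively on every platform.
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]

# Hypothetical sample data
master_list = [
    'prefix1_20160817_suffix',
    'prefix2_20150817_suffix',
    'other_20160801_tail',
]

print(filter_in_order(master_list, ['201608', 'suffix']))
print(filter_in_order(master_list, ['suffix', '201608']))
```

The first call keeps only 'prefix1_20160817_suffix'. The second call returns an empty list, because the pattern requires 'suffix' to appear before '201608'; that is the order sensitivity mentioned above.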


2 Comments

Thanks, this is an awesome answer. I can get my "contains" array as a string with asterisks via user input, so it is even smoother this way.
@LukeH: if you are applying this to os.listdir() output, you may want to check out the glob module too and skip calling os.listdir() yourself.
5

You're looking for something like this (using a list comprehension and all()):

>>> files = ["prefix_20160817_suffix", "some_other_file_with_suffix"]
>>> contains = ['prefix', '2016', 'suffix']
>>> [ f for f in files if all(c in f for c in contains) ]
['prefix_20160817_suffix']
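
If the filter is needed in more than one place, the comprehension can be wrapped in a small function. This is only a sketch: the name `filter_names` and the `ignore_case` flag are assumptions added for illustration, not part of the question.

```python
def filter_names(names, contains, ignore_case=False):
    """Return the names that contain every string in `contains`, in any order."""
    if ignore_case:
        # Lower-case both sides so 'PREFIX' matches 'prefix'.
        lowered = [c.lower() for c in contains]
        return [n for n in names if all(c in n.lower() for c in lowered)]
    return [n for n in names if all(c in n for c in contains)]

# Hypothetical sample data
files = [
    "prefix_20160817_suffix",
    "some_other_file_with_suffix",
    "PREFIX_20160901_SUFFIX",
]

print(filter_names(files, ['prefix', '2016', 'suffix']))
print(filter_names(files, ['prefix', '2016', 'suffix'], ignore_case=True))
```

One thing to keep in mind: all() over an empty sequence is True, so an empty `contains` list returns every name unchanged.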

2

Given:

>>> cond1=['prefix1', '2016', 'suffix']
>>> cond2=['201608', 'suffix']
>>> fn="prefix_20160817_suffix"

You can test the existence of each substring in the list of conditions with in and (in the interim example) a list comprehension:

>>> [e in fn for e in cond1]
[False, True, True]
>>> [e in fn for e in cond2]
[True, True]

The same test, written as a generator expression, can then be passed to a single all() call to check all the substrings:

>>> all(e in fn for e in cond1)
False
>>> all(e in fn for e in cond2)
True
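
Because all() is given a generator expression here, it stops at the first substring that is missing instead of testing the rest. A small sketch that records which substrings were actually tested (the helper `contains_all` and its `checked` list are made up for this illustration):

```python
def contains_all(filename, fragments, checked):
    """Return True if every fragment occurs in filename; record each test made."""
    def check(fragment):
        checked.append(fragment)      # remember that this fragment was tested
        return fragment in filename
    # The generator is consumed lazily, so all() stops at the first False.
    return all(check(f) for f in fragments)

checked = []
result = contains_all("prefix_20160817_suffix", ['prefix1', '2016', 'suffix'], checked)
print(result, checked)  # False ['prefix1']
```

Only 'prefix1' is tested: it is missing, so the remaining substrings are never looked at.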

Then you can combine with filter (or use a list comprehension or a loop) to filter the list:

>>> fns=["prefix_20160817_suffix", "prefix1_20160817_suffix"]
>>> list(filter(lambda fn: all(e in fn for e in cond1), fns))
['prefix1_20160817_suffix']
>>> list(filter(lambda fn: all(e in fn for e in cond2), fns))
['prefix_20160817_suffix', 'prefix1_20160817_suffix']
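
One caveat when running this on Python 3: filter() returns a lazy iterator rather than a list, and that iterator can only be consumed once. A short sketch with made-up sample names:

```python
cond2 = ['201608', 'suffix']
# Hypothetical sample data
fns = [
    "prefix_20160817_suffix",
    "prefix1_20160817_suffix",
    "prefix1_20150817_suffix",
]

matches = filter(lambda fn: all(e in fn for e in cond2), fns)

first_pass = list(matches)   # consumes the iterator
second_pass = list(matches)  # nothing left to yield

print(first_pass)
print(second_pass)
```

The first list holds the two names containing '201608'; the second is empty. Converting to a list once and reusing that list avoids the surprise.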

1

Your pseudocode was not far from a usable implementation, as you can see:

masterList=["prefix_20160817_suffix"]
containsArray=['prefix1', '2016', 'suffix']
filteredListOfFilenames=[]

for fileName in masterList:
    if all((element in fileName) for element in containsArray):
        filteredListOfFilenames.append(fileName)

I would suggest taking a deeper look at the really good official tutorial; it contains many useful things.

0

This should work for you.

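filtered_list = []

for file_name in master_list:
    for element in contains_array:
        if element not in file_name:
            # One required element is missing: skip this file name
            break
    else:
        # The inner loop finished without break: every element was found
        filtered_list.append(file_name)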
