1

Need an opinion.

I have a function that defines some data. My idea is that the user can tell it to read the data from a file:

acquire_data('read_from_file',filename)

or the user can supply data directly:

acquire_data('use_this_list',datalist)

So the function would have a form something like

def acquire_data(mode,arg2):
    if mode == 'read_from_file':
        inputs=open(arg2)
        data = #etc.
    else:
        data = arg2  #or deepcopy(arg2) or whatever

Well, this works, but it seems kind of hackneyed. In particular, "arg2" has very different functions depending on the value of "mode". So: Is this good code? Is this "pythonic"? Anyone see a better way to code this? Thanks.

1
  • hi check my second approach :) Commented Feb 27, 2013 at 20:02

6 Answers 6

5
def acquire_data(list_or_filename):
    # assuming py3 here, for py2 use 'isinstance(list_or_filename, basestring)
    if isinstance(list_or_filename, str):
        with open(list_or_filename,"r") as f:
            return acquire_data_from_file(f)
    else:
        return acquire_data_from_list(list_or_filename)
Sign up to request clarification or add additional context in comments.

6 Comments

If you want to type switch on strings in 2.x, don't use str, use basestring. Users (especially on Windows) will often have filenames in unicode objects instead of str.
More importantly, you don't need the type-switching here at all. Why not treat anything that works as a parameter to open—a str, a unicode, a pathlib.Path instance (which may be part of the stdlib in 3.4 or later), etc.—as a pathname? This is Python, after all.
@arbarnert: Sounds very interesting, but I am too much of a noob to really know you mean. If you have time to explain further, would be much obliged.
Well, let me say, I know waht you mean on further reflection, but how to generalize isinstance(arg,str)? Thanks.
@bob.sacamento: Well, you can generalize it by passing a tuple of types to isinstance… but that's the wrong way to think about it. If you want "anything that can act as a filename to the open function", the way to check that is to pass it to the open function. If you don't get a TypeError, it was a filename. See my answer for details.
|
4

Instead of letting acquire_data open and read the entire content of the file, it would be more pythonic to pass a fileObj to acquire_data.

The Reason this is a better design

  1. Uniform interface for sequence, generator and file
  2. You do not need to read the entire data in the function
  3. You can control the lifetime of the file, even when an exception happens

A skeleton code would be

def acquire_data(iterable):
    for line in iterable:
        # Do what ever you want

with open("whatever") as fin:
    acquire_data(fin)

acquire_data(some_seq)

acquire_data(some_gen)

Comments

2

How about just using different functions for different tasks: acquire_data_from_file, acquire_data_from_list, etc? A lot clearer and simpler.

7 Comments

+1: Example JSON. It has different interfaces to read from string and fileObj.
In a sense, yes. But my very non-expert programming users are not going to want to choose between functions.
@bob.sacamento: And you trust them to know what string to pass to the function, never make a typo, etc.?
Not particularly. Guess I should have said that's another reason I was asking for alternative coding.
@bob.sacamento: They should, instead of passing such strings. But the isinstance-stuff shown in other posts is a viable option, of course.
|
1

If you really want a single function that can take either filenames or lists of data (which you probably don't, but you seem resistant to the more pythonic alternatives that oseiskar and Abhijit suggested), you definitely don't want to do it this way.

In general, if you find yourself doing type switching, you're doing something wrong. But doing fake type switching based on a string, and relying on the user to match the string to the type, is even more wrong.

One alternative is to just try to open the filename, and, if that fails, assume it was a sequence instead of a filename instead. (It's easier to ask forgiveness than permission.) For example:

def acquire_data(filename_or_list):
    try:
        with open(filename_or_list) as f:
            data = list(f)
    except TypeError:
        data = list(filename_or_list)
    # ... now the rest of your code uses data

This works even if the user passes a unicode filename instead of an str, or a tuple instead of a list, or even some class you've never heard of that works with the open or list function. That's the essence of duck typing.

Of course it's a bit of an abuse of duck typing, but that's inherent in your problem: You're trying to take a parameter that's of one of two different types, and any solution that makes that work will be abusing some feature.

The only worry is that someone might pass something that works with both open and list. In fact, that's true with a plain old str. So you need a generic decision for what to do with such cases—it seems like trying it as a pathname first, then as a sequence, is better than the other way around. That's certainly what you want from str, but you have to think through whether that would be true for every possible type that works both ways.

2 Comments

I particularly like this answer because it distinguishes carefully between duck typing and type switching, avoiding a pet peeve of mine. They aren't the same thing!
@senderle: Agreed (obviously—I'm not going to disagree with you liking my answer:)), but as jdotdot's answer shows, sometimes you can use type switching at a lower level to implement duck typing at a higher level, so they aren't entirely separate.
1

This kind of work I would to divide further and use dict as switch case :

def read_file(arg):
 # code

def read_data(arg):
 # code

def default_f(arg):
 # code

def acquire_data(mode, arg2):

    fun = {
    'read_from_file': read_file, 
    'use_this_list': read_data
    }.get(mode, default_f)

    fun(arg2)

EDIT:
Second approach: combined my and @möter.

def acquire_data(arg):
   fun = {
     True: read_file,
     False: read_data
     }.get(
         isinstance(arg, str),
         default_f
         )
   return fun(arg);

call this:

acquire_data('read_from_file',filename)
acquire_data('use_this_list',datalist)

4 Comments

OK, the jury says moter's answer below. But I am for now accepting Grijesh's answer. Why? Well, they provide basically the same operations, but Grijesh's answers avoids an if-block. And I think it's generally cleaner to do things without if-blocks. But I'm still learning and am open to being educated. Anyone want to try? Thanks.
@bob.sacamento I am also new to Python. But I have practice from C to avoid nested if/loops. But cation I knows every few constructs in Python
I take it back. I think it is better to get rid of that "mode" string. Sorry Grijesh. But thanks.
@bob.sacamento No problem its correct. I also advice wait some more time. you may get more answers. Even Martijn Pieters is online
1

Due to Python's duck-typing, having functions perform different behaviors depending on the datatype of a variable or argument, this type of practice is relatively common.

You could leave your function as is, though I'd consider doing

if mode == 'read_from_file':
        inputs=open(arg2)
        data = #etc.
elif mode == 'use_this_list':
        data = arg2  #or deepcopy(arg2) or whatever
else:
        raise InputError # or something like this

Jut to make sure that your function is extensible and that you don't have any improper arguments passed to the second part of the function.

Another way might be simply to accept an open file and/or filename in addition to the data itself:

def acquire_data(arg):
    if isinstance(arg, file):
        data = arg.read() # make sure to parse the data
    elif isinstance(arg, basestring):
        data = open(arg, 'r').read() # make sure to parse
    else:
        data = arg

2 Comments

The second suggestion is definitely better (yes, type-switching is bad, but faking type-switching by making the user pass an idiosyncratic string representing the type is much worse), except for one thing: If this is 2.x, you should use basestring, not str, for filenames; if this is 3.x, you cannot use file, and even in 2.6-2.7, you should use the new io interfaces instead. But really, avoiding type switching is even better.
sorry, you're right on basestring--meant to put that but str came out instead

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.