1

I have the following CSV file:

id;area;zz;nc
1;35.66;2490.8;1
2;65.35;2414.93;1
3;79.05;2269.33;1
4;24.5;2807.68;1
5;19.31;2528.59;1
6;25.51;2596.44;1

where each row represents a so-called Cell object with its id, area, zz, and nc.

Consequently, I have created the following class:

class cells():
    # Initializer / Instance Attributes
    def __init__(self, idm, area, zz, nc):
        self.idm  = idm
        self.area = area
        self.zz   = zz
        self.nc   = nc

The idea is to create one object per cell and to assign its attributes from the data in the file.

The first idea that I have is to read the csv file as a DataFrame and then populate a list of objects in a cycle.

As far as I know, Python is very inefficient with cycles, and I would like to know if there is another (smarter) way to do that.

Thanks, Diego

4
  • What is your expected output? Commented Nov 16, 2019 at 0:44
  • Is there any particular reason why you need them to be objects of specific class? Would you be alright with using namedtuples instead? Commented Nov 16, 2019 at 0:48
  • Also, your class should probably be named Cell instead of cells, since Python classes follow the CapWords naming convention, and each object represents a single cell. Commented Nov 16, 2019 at 0:54
  • Do you find any of the current answers satisfactory, are you hoping for new ones? Commented Nov 18, 2019 at 1:59

4 Answers

2

I don't quite understand what you mean by cycle, but this will create a list of Cell objects, one for each row, given the format your data is in.

A list comprehension over pandas Series is a reasonable option, see https://stackoverflow.com/a/55557758/7582537

Try this:

import pandas as pd 


class Cell():
    # Initializer / Instance Attributes
    def __init__(self, idm, area, zz, nc):
        self.idm  = idm
        self.area = area


def create_cells(row):
    newcell = Cell(row[0], row[1], row[2], row[3])
    return newcell


file = pd.read_table("your_file.csv", sep=';')
zipp = zip(file['id'], file['area'], file['zz'], file['nc'])
cells = [create_cells(row) for row in zipp]

print(cells)
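As a small illustration of what zip does in the snippet above, here is a minimal sketch that uses plain lists as stand-ins for the DataFrame columns. The list names are hypothetical, and the values are the first three rows from the question.

```python
# Column-wise data standing in for file['id'], file['area'], file['zz'], file['nc'].
ids = [1, 2, 3]
areas = [35.66, 65.35, 79.05]
zzs = [2490.8, 2414.93, 2269.33]
ncs = [1, 1, 1]

# zip() pairs up the i-th element of every column, yielding one tuple per row.
rows = list(zip(ids, areas, zzs, ncs))

print(rows[0])  # (1, 35.66, 2490.8, 1)
```

Each tuple can then be passed to the constructor, either through a helper like create_cells or directly with argument unpacking, e.g. Cell(*row).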

7 Comments

Thanks for sharing that post, it was quite informative!
I hesitated for a while. I do agree that using a list comprehension or some other kind of plain Python iteration might make sense here, but I’m not a fan of using pd.read_table() and creating an entire function for what is just tuple unpacking in a constructor. In retrospect, I should have commented on those rather than just downvoting, sorry. For what it’s worth I have upvoted you now, since the solution is ultimately correct and generally well-written :)
I think that a key function is "zip". I have to understand it properly.
@diedro zip() is a great, extremely useful function. It's built in, and I think the docs do a good job of explaining it.
@AlexanderCécile Yeah, I changed it after researching pd.read_table - as that is suitable for reading non-csv files as opposed to read_csv. But yes I agree, the premise still holds and there is no reason to use pandas here.
1

I don't think you need pandas in this case. pandas is overkill if you only need to read a csv file.

Either read it directly:

objects = []
with open('your_file', 'r') as f:
    next(f)  # skip header row (must happen after the file is opened)
    for row in f:
        # note: every field is still a string at this point
        objects.append(cells(*row.strip().split(';')))

or use the csv module.
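For the csv-module route, a minimal sketch could look like the following. It assumes a class named Cell that stores all four fields and a hypothetical helper read_cells; an in-memory string stands in for the real file so the example is self-contained, and the type conversions (int for id and nc, float for area and zz) are an assumption based on the sample data.

```python
import csv
import io


class Cell:
    def __init__(self, idm, area, zz, nc):
        self.idm = idm
        self.area = area
        self.zz = zz
        self.nc = nc


# Stand-in for open('your_file', newline=''); first two rows from the question.
SAMPLE = "id;area;zz;nc\n1;35.66;2490.8;1\n2;65.35;2414.93;1\n"


def read_cells(f):
    """Build one Cell per data row, converting the string fields to numbers."""
    reader = csv.reader(f, delimiter=';')
    next(reader)  # skip the header row
    return [Cell(int(i), float(a), float(z), int(n)) for i, a, z, n in reader]


cells = read_cells(io.StringIO(SAMPLE))
print(cells[0].idm, cells[0].area)  # 1 35.66
```

With a real file, the call would be wrapped in a with open(...) block and the file object passed to read_cells instead of the StringIO.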

1

uMdRupert shared a link to an interesting post in his answer, I would recommend checking it out!


I like his idea of using a list comprehension, so I wanted to share a similar method:

import pandas as pd


class Cell:
    def __init__(self, idm, area, zz, nc):
        self.idm = idm
        self.area = area


cell_df = pd.read_csv('../resources/test_cell_data.csv', delimiter=';')
cell_df = cell_df.rename({'id': 'idm'}, axis='columns')

# ** unpacks the row as keyword arguments; a single * would pass only the column names
cell_objs_lst = [Cell(**curr_tuple._asdict()) for curr_tuple in cell_df.itertuples(index=False)]

Pandas might be overkill for this task, so here is a dead-simple method which uses the csv module:

import csv


class Cell:
    def __init__(self, idm, area, zz, nc):
        self.idm = idm
        self.area = area


with open('../resources/test_cell_data.csv', newline='') as in_file:
    next(in_file)
    reader = csv.DictReader(in_file, fieldnames=['idm', 'area', 'zz', 'nc'], delimiter=';')
    cells_lst = [Cell(**curr_row) for curr_row in reader]

0

I don't know why you need a Cells object for each row of df. However, I think you can achieve it with df.agg and keep every object in a Series:

class Cells():
    # Initializer / Instance Attributes
    def __init__(self, idm, area, zz, nc):
        self.idm  = idm
        self.area = area
        self.zz = zz
        self.nc = nc

s = df.agg(lambda x: Cells(*x), axis=1)
print(s)

Output:
0    <__main__.Cells object at 0x09FA38D0>
1    <__main__.Cells object at 0x09FA3510>
2    <__main__.Cells object at 0x09FA3870>
3    <__main__.Cells object at 0x09FA3AF0>
4    <__main__.Cells object at 0x09B27790>
5    <__main__.Cells object at 0x09B27770>
dtype: object

After that, you can access each object by indexing s:

In [303]: s[0].__dict__
Out[303]: {'idm': 1.0, 'area': 35.66, 'zz': 2490.8, 'nc': 1.0}

In [304]: s[1].__dict__
Out[304]: {'idm': 2.0, 'area': 65.35, 'zz': 2414.93, 'nc': 1.0}
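
The Series printout above shows only memory addresses because the class defines no __repr__. As a minimal sketch, assuming a dataclass is acceptable in place of the hand-written class (this is not part of the answer above, and the field types are guessed from the sample data), the objects become readable when printed:

```python
from dataclasses import dataclass


@dataclass
class Cells:
    # Field types are assumptions based on the sample CSV.
    idm: int
    area: float
    zz: float
    nc: int


c = Cells(1, 35.66, 2490.8, 1)
print(c)  # Cells(idm=1, area=35.66, zz=2490.8, nc=1)
```

The dataclass decorator generates __init__, __repr__, and __eq__, so the constructor call stays the same as with the original class.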

3 Comments

Where does df come from?
@AlexanderCécile: in the pandas world, df is always known as the working dataframe. OP says The first idea that I have is to read the csv file as a DataFrame, so I assume he already knew the way to read csv to dataframe. If he doesn't, a simple google would yield him the instruction to read_csv.
Right, yes, I’m familiar with the convention. I guess I was caught off guard because I only glanced at your mention of df at the beginning of your post. *facepalm*
