I've been struggling to add sorting to this code:

import glob
import pandas as pd

all_files = glob.glob(path + "/*.CSV") # All CSV files under path, in no particular order
all_csv = [pd.read_csv(f, sep=',') for f in all_files] # List of dataframes

# I want to sort each dataframe in the all_csv list by the values of its first column.

for f in all_csv:
    goal = pd.DataFrame.sort_values(by=(f.iloc[:,0])) #Maybe something like this??

Does anyone have an idea how I can do this? I've looked at other posts, but they don't apply to an undefined column name (i.e. f.iloc[:,0]) or to a list of dataframes. (I also thought of using dictionaries, but I'd like to see whether it is possible with lists.)

Thank you :)

These ideas may be useful: link, link

2 Answers

1

This uses enke's code to sort each dataframe by the first column, but returns all dataframes in a list as you requested:

all_csv = [df.sort_values(by=df.columns[0]) for df in all_csv]
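
To try the one-liner without any CSV files on disk, here is a minimal sketch. The two small dataframes are hypothetical stand-ins for the frames read by pd.read_csv, and the column names are made up; the only assumption is that each first column holds values that can be compared with each other.

```python
import pandas as pd

# Hypothetical stand-ins for the dataframes loaded from the CSV files.
all_csv = [
    pd.DataFrame({"a": [2, 3, 1], "b": [1, 2, 3]}),
    pd.DataFrame({"x": [9, 7, 8], "y": ["p", "q", "r"]}),
]

# Sort each dataframe by its first column, whatever that column is called.
# sort_values returns a new dataframe, so the list is rebuilt rather than
# modified in place.
all_csv = [df.sort_values(by=df.columns[0]) for df in all_csv]

for df in all_csv:
    print(df)
```

Because df.columns[0] is looked up separately for each dataframe, the files do not need to share a header for the first column.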

0

You can index df.columns for individual dataframes:

goal = df.sort_values(by=df.columns[0])

For the entire list of dataframes, you can use list comprehension:

all_csv = [df.sort_values(by=df.columns[0]) for df in all_csv]

Suppose you had a dataframe that looked like:

   a  b
0  2  1
1  3  2
2  1  3

Then when you run:

df = df.sort_values(by=df.columns[0])

df becomes:

   a  b
2  1  3
0  2  1
1  3  2
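
Note that the sorted rows keep their original index labels (2, 0, 1). If you would rather have them renumbered from 0, a possible variation is the ignore_index parameter of sort_values, which is available in pandas 1.0 and later. This is a sketch built on the example dataframe above, not something the question requires:

```python
import pandas as pd

# The example dataframe from above.
df = pd.DataFrame({"a": [2, 3, 1], "b": [1, 2, 3]})

# Default behaviour: rows are reordered but keep their index labels.
sorted_df = df.sort_values(by=df.columns[0])

# With ignore_index=True (pandas 1.0+), the result is relabelled 0, 1, 2.
renumbered = df.sort_values(by=df.columns[0], ignore_index=True)

print(sorted_df)
print(renumbered)
```

On older pandas versions, chaining .reset_index(drop=True) after sort_values should give the same result.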
