0

I have about 200 CSV files with the same five columns: A, B, C, D, E. I want to sort all of them by column B and then by column A. Can I do this in Python?

7
  • Yes, it is possible. Did you try searching the site for similar/identical questions? There are many solutions. Commented Mar 13, 2014 at 1:03
  • I edited your post to replace "excel" with "csv" because it seems you are not working with Excel files but with CSV files. Right? Commented Mar 13, 2014 at 1:04
  • @beroe, I wish you had given me a link to that identical question. Commented Mar 14, 2014 at 20:30
  • @Lynch, I created those files using Python's csv library and read them using Excel. Commented Mar 14, 2014 at 20:33
  • Thanks everyone, the problem is solved. Please see my comment under the first answer. Commented Mar 14, 2014 at 20:34

2 Answers

2

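I wrote a program that sorts a CSV file on two keys and writes the result to a new CSV file. To sort on two keys, first sort by the secondary key and then by the primary key. Because Python's sort is stable, rows with equal primary keys keep their secondary-key order.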
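To illustrate the two-pass idea on its own, here is a minimal sketch. It assumes Python 3, numeric values, and a few made-up in-memory rows with the columns A and B from the question; the function names are hypothetical. It relies on the documented fact that `sorted()` and `list.sort()` are stable.

```python
# Hypothetical rows standing in for records parsed from a CSV file.
rows = [
    {"A": "2", "B": "1"},
    {"A": "1", "B": "2"},
    {"A": "1", "B": "1"},
    {"A": "3", "B": "2"},
]


def two_pass_sort(records, primary, secondary):
    """Sort by `primary`, breaking ties with `secondary`, in two stable passes."""
    # Pass 1: order by the secondary key.
    result = sorted(records, key=lambda r: float(r[secondary]))
    # Pass 2: order by the primary key; ties keep their pass-1 order.
    result.sort(key=lambda r: float(r[primary]))
    return result


def tuple_key_sort(records, primary, secondary):
    """Same ordering in a single pass, using a tuple as the sort key."""
    return sorted(records, key=lambda r: (float(r[primary]), float(r[secondary])))


by_b_then_a = two_pass_sort(rows, primary="B", secondary="A")
```

Both functions should give the same ordering here; the tuple-key form is usually shorter, while the two-pass form makes it easy to give each key its own sort direction.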

To sort multiple files, loop over all the input files, building up the array of rows as you go, and sort the result afterward.

I only had one input file, so I did not have to do that. Here is what I did for one file. Where I have infile, you would substitute the result of the input loop.

import csv

descending = False  # set to True to sort in descending order

# Python 3: open in text mode with newline='' as the csv module expects
with open('file.csv', newline='') as ifile:
    # DictReader takes the column names from the first row of the file,
    # so the header is never passed to float() and needs no special handling
    infile = csv.DictReader(ifile)
    infields = infile.fieldnames
    # Sort by the secondary key first.
    sortedlist = sorted(infile, key=lambda d: float(d['statistic2']), reverse=descending)

# Now do the primary key. The sort is stable, so rows with equal
# primary keys keep the order produced by the first sort.
sortedlist.sort(key=lambda d: float(d['statistic1']), reverse=descending)

Now open the output file using csv.DictWriter, write the header and output the data from sortedlist.
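The output step could look roughly like the sketch below. It assumes Python 3; `write_sorted` and the output file name are hypothetical, and the field names and rows are the ones obtained while reading the input.

```python
import csv


def write_sorted(path, fieldnames, rows):
    """Write `rows` (a list of dicts) to `path`, preceded by a header row."""
    # newline='' lets the csv module control the line endings itself.
    with open(path, "w", newline="") as ofile:
        writer = csv.DictWriter(ofile, fieldnames=fieldnames)
        writer.writeheader()    # column names first
        writer.writerows(rows)  # then the sorted records
```

With the names from the code above, the call would be something like `write_sorted('sorted.csv', infields, sortedlist)`.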


1 Comment

@Navid Wu I edited the answer to correct it. I had accidentally copied and pasted the wrong code for the second sort.
1

A CSV file is a plain text file (not an Excel file), and Python can certainly process these files. The standard library includes a module called csv that is designed for just this type of work: http://docs.python.org/2/library/csv.html

Assuming the file sizes are manageable, you should be able to simply load them all into memory and then sort.
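As a rough sketch of that approach, the code below loads each file into memory, sorts it by column B and then column A, and writes a sorted copy. It assumes Python 3, that every file has a header row naming the columns, and that string comparison is acceptable; the function names and the directory layout are hypothetical.

```python
import csv
import glob
import os


def sort_csv_file(src, dst, keys=("B", "A")):
    """Load one CSV file into memory and write it to `dst` sorted by `keys`."""
    with open(src, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)
    # Values are compared as strings; wrap them in float() for numeric order.
    rows.sort(key=lambda r: tuple(r[k] for k in keys))
    with open(dst, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def sort_all(in_dir, out_dir):
    """Sort every *.csv file in `in_dir`, writing the results to `out_dir`."""
    os.makedirs(out_dir, exist_ok=True)
    for src in sorted(glob.glob(os.path.join(in_dir, "*.csv"))):
        sort_csv_file(src, os.path.join(out_dir, os.path.basename(src)))
```

Writing to a separate output directory keeps the original files untouched, which makes it easy to check the results before replacing anything.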

What have you tried so far?

1 Comment

I solved the problem by using pandas to sort the data before writing it to the CSV files, so the files are already sorted now.
