I am trying to sort a CSV file on column 3. Python sorts the CSV correctly except for two rows, and I am really confused.

Here is the code I am using:

import csv
import operator
import numpy

sample = open('data.csv','rU')
csv1 = csv.reader(sample,delimiter=',')
sort=sorted(csv1,key=lambda x:x[3])
for eachline in sort:
    print eachline

And here is the output. From the third row on, the output looks good. Any ideas?

['6/23/02', 'Julian Jaynes', '618057072', '12.5']
['7/15/98', 'Timothy "The Parser" Campbell', '968411304', '18.99']
['10/4/04', 'Randel Helms', '879755725', '4.5']
['9/30/03', 'Scott Adams', '740721909', '4.95']
['10/4/04', 'Benjamin Radcliff', '804818088', '4.95']
['1/21/85', 'Douglas Adams', '345391802', '5.95']
['12/3/99', 'Richard Friedman', '60630353', '5.95']
['1/12/90', 'Douglas Hofstadter', '465026567', '9.95']
['9/19/01', 'Karen Armstrong', '345384563', '9.95']
['6/23/02', 'David Jones', '198504691', '9.95']
['REVIEW_DATE', 'AUTHOR', 'ISBN', 'DISCOUNTED_PRICE']
  • What is the expected output? Commented Jun 1, 2015 at 15:33
  • x[3] is the 4th column; for the third column you need x[2]. Commented Jun 1, 2015 at 15:41

1 Answer

You are sorting strings, so the prices are compared as text rather than as numbers. To sort numerically, convert each value with float(x[3]):

sort=sorted(csv1,key=lambda x:float(x[3]))

If you want to sort by the third column, that is x[2], which you can cast to int:

sort=sorted(csv1,key=lambda x:int(x[2]))

You will also need to skip the header row; otherwise float() or int() raises a ValueError on the column name:

csv1 = csv.reader(sample,delimiter=',')
header = next(csv1)
sort=sorted(csv1,key=lambda x:int(x[2]))
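Putting the pieces together, here is a minimal Python 3 sketch of the whole approach. The in-memory SAMPLE string is a hypothetical stand-in for data.csv built from a few rows shown in the question, and sort_rows is an illustrative helper name, not something from the original code.

```python
import csv
import io

# Hypothetical stand-in for data.csv, using a few rows from the question.
SAMPLE = '''REVIEW_DATE,AUTHOR,ISBN,DISCOUNTED_PRICE
6/23/02,Julian Jaynes,618057072,12.5
10/4/04,Randel Helms,879755725,4.5
1/21/85,Douglas Adams,345391802,5.95
7/15/98,"Timothy ""The Parser"" Campbell",968411304,18.99
'''


def sort_rows(lines, column, convert):
    """Return (header, rows) with rows sorted by convert(row[column])."""
    reader = csv.reader(lines)
    header = next(reader)  # pull the header out so it is not sorted or converted
    rows = sorted(reader, key=lambda row: convert(row[column]))
    return header, rows


# Sort by the 4th column (index 3) as floats, and by the 3rd (index 2) as ints.
header, by_price = sort_rows(io.StringIO(SAMPLE), 3, float)
_, by_isbn = sort_rows(io.StringIO(SAMPLE), 2, int)

print(header)
for row in by_price:
    print(row)
```

With a real file, you would pass the open file object instead of io.StringIO(SAMPLE); in Python 3 the csv module documentation recommends opening it with newline=''.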

Python compares strings character by character, so "2" sorts after "12" unless you cast to int:

In [82]: "2" < "12"
Out[82]: False

In [83]: int("2") < int("12")
Out[83]: True
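The same effect explains the output in the question. As a small illustration (Python 3, using the distinct price strings from the question's output; the variable names are only for this sketch), sorting the strings puts '12.5' and '18.99' first because they start with '1', and the header text lands last because letters compare greater than digits:

```python
# Distinct price strings taken from the output in the question.
prices = ['12.5', '18.99', '4.5', '4.95', '5.95', '9.95']

# Plain string sort: compared character by character.
as_strings = sorted(prices)

# Numeric sort: each string is converted with float() only for comparison.
as_numbers = sorted(prices, key=float)

# The header cell sorts after every price because 'D' > '9' as characters.
with_header = sorted(prices + ['DISCOUNTED_PRICE'])

print(as_strings)
print(as_numbers)
print(with_header[-1])
```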