I am importing a CSV file using both csv.reader and pandas. However, the two methods report different row counts for the same file.

import csv

# Read every record; csv.reader yields one list per line of the file
with open("reviews.csv", newline="") as openfile:
    reviews = list(csv.reader(openfile))
print(len(reviews))

The result is 10,000 (which is the correct value). However, pandas returns a different value.

import pandas as pd

df = pd.read_csv("reviews.csv", header=None)
df.info()

This returns 9,985.

Does anyone know why there is a difference between the two methods of importing data?

I just tried this:

reviews_df = pd.DataFrame(reviews)
reviews_df.info()

This returns 10,000.

  • I cannot reproduce this problem. When I run this code on a CSV file, the difference is only one row, and that's the header. Commented Apr 29, 2016 at 3:28
  • Can you give us a small sample input file which demonstrates the problem? This should be possible if you first figure out which rows are missing. Commented Apr 29, 2016 at 3:30
  • That is supposed to be. I have never seen this issue before. Commented Apr 29, 2016 at 3:30
  • Is there a way to figure out which rows are missing? That knowledge would be helpful at this point. Commented Apr 29, 2016 at 3:33
  • @kevin: Sure, you could write the table back out to a new CSV and diff them. Commented Apr 29, 2016 at 3:39

1 Answer

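According to the pandas.read_csv documentation, there is an argument named skip_blank_lines whose default value is True. Unless you set it to False, pandas does not read blank lines, whereas csv.reader returns each blank line as an empty row. That accounts for the difference in row counts.

Consider the following example, which contains two blank rows: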

A,B,C,D
0.07,-0.71,1.42,-0.37

0.08,0.36,0.99,0.11
1.06,1.55,-0.93,-0.90
-0.33,0.13,-0.11,0.89
1.91,-0.74,0.69,0.83
-0.28,0.14,1.28,-0.40
0.35,1.75,-1.10,1.23

-0.09,0.32,0.91,-0.08

Read it with skip_blank_lines=False:

df = pd.read_csv('test_data.csv', skip_blank_lines=False)
len(df)
10

Read it with skip_blank_lines=True:

df = pd.read_csv('test_data.csv', skip_blank_lines=True)
len(df)
8
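
To see which rows pandas is dropping, one option is to list the records that csv.reader parses as empty rows, since those correspond to blank lines. The following is a minimal sketch using only the standard library. The helper name blank_row_numbers and the shortened sample data are illustrative assumptions, not part of the original question.

```python
import csv
import io


def blank_row_numbers(lines):
    """Return 1-based record numbers that csv.reader parses as empty rows.

    `lines` is any iterable of text lines, such as an open file.
    (Hypothetical helper, named here for illustration only.)
    """
    return [n for n, row in enumerate(csv.reader(lines), start=1) if not row]


# Shortened stand-in for test_data.csv: a header, three data rows, two blank lines
sample = io.StringIO(
    "A,B,C,D\n"
    "0.07,-0.71,1.42,-0.37\n"
    "\n"
    "0.08,0.36,0.99,0.11\n"
    "\n"
    "-0.09,0.32,0.91,-0.08\n"
)

blank = blank_row_numbers(sample)
print(blank)  # [3, 5]
```

For a real file, pass open("reviews.csv", newline="") instead of the in-memory sample. If blank lines are the only cause of the mismatch in the question, this should list 15 records (10,000 minus 9,985).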