We want to delete the first 34 rows of our CSV file because they contain only useless header text. We are trying to do this with the following Python 3 code:

with open("test.csv",'r') as f, open("temp.csv",'w') as f1:
    next(f) # skip header line
    for line in f:
        f1.write(line)

The code above should only 'delete' the first line, but we imagine we could wrap next(f) in a for loop over range(0, 34). That doesn't seem like the Pythonic way to solve this, though.

Our data is in test.csv, and we have an empty CSV file called temp.csv. The code above is supposed to skip the first line of test.csv and then copy the rest into temp.csv.

Unfortunately we receive this error:

Traceback (most recent call last):
  File "delete.py", line 2, in <module>
    next(f) # skip header line
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 2625: invalid start byte

Why does this occur? And what is the Pythonic way of deleting the first 34 rows of a CSV file?

1 Answer

I know you want to skip rows using the with context manager when opening and reading the CSV file, but I would suggest using an excellent library called pandas to read the CSV file and skip rows, as shown below. You can also save the resulting data frame df to another CSV file very easily.

Without pandas:

import csv

with open('my_csv_file.csv', 'r', newline='') as file:
    # Skip the first 34 lines by advancing the file iterator (very memory efficient)
    for _ in range(34):
        next(file)

    # Only now start parsing as CSV
    csv_reader = csv.reader(file)
    # Process one row at a time instead of loading all to memory
    for row in csv_reader:
        # Process row here
        pass
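
As for the UnicodeDecodeError: byte 0xf8 can never start a UTF-8 sequence, so test.csv is most likely not UTF-8 encoded. Below is a minimal sketch that skips the header lines and copies the rest to a second file while passing an explicit encoding. The function name drop_leading_lines is hypothetical, and "latin-1" is only an assumption (0xf8 is "ø" in that encoding); substitute whatever encoding the file was actually written with.

```python
from itertools import islice


def drop_leading_lines(src_path, dst_path, n_skip=34, encoding="latin-1"):
    """Copy src_path to dst_path, leaving out the first n_skip lines.

    The default encoding is an assumption, not something known about
    the original file; pass the real encoding if it differs.
    """
    # newline="" leaves the original line endings untouched on both sides
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        # islice lazily discards the first n_skip lines and yields the rest,
        # so the file is never loaded into memory as a whole
        dst.writelines(islice(src, n_skip, None))
```

For the files in the question, this would be called as drop_leading_lines("test.csv", "temp.csv"). Using islice avoids the explicit loop around next() and does not raise StopIteration if the file has fewer than 34 lines.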

With pandas:

import pandas as pd
# skiprows=34 skips the first 34 lines, so reading starts at line 35
df = pd.read_csv('my_csv_file.csv', skiprows=34)
# print the data frame
print(df)

4 Comments

This works perfectly! We wanted to clean our CSV file so we could use it in another program we have. We had no idea we could simply use 'skiprows' instead. Thanks a lot!
Isn’t Pandas overkill for this?
I wouldn't install pandas just for this.
OK, I updated my answer to include a version without pandas.
