Delete first 2 rows of multiple csv files with python

Question

I have a directory (with several subfolders) of csv files. I want to delete the first 2 rows of every csv file before I upload the files to a database (SQL Server). I started with the following Python script on a small subset of csv files located in one folder (no subfolders). The script runs successfully, but no rows are deleted from the files. What am I missing?

import glob
import csv

myfiles = glob.glob("C:\Data\*.csv")
for file in myfiles:
    lines = open(file).readlines()
    open(file, 'w').writelines(lines[1:])

Here is my sample data:

"Title: Distribution of Nonelderly Population by Household Employment Status | The Henry J. Kaiser Family Foundation"
"Timeframe: 2015"
"Location","At Least 1 Full Time Worker","Part Time Workers","Non Workers","Total"
"United States","0.82","0.08","0.10","1.00"
"Alabama","0.79","0.06","0.15","1.00"
"Alaska","0.85","0.06","0.09","1.00"
"Arizona","0.80","0.08","0.12","1.00"
"Arkansas","0.78","0.07","0.15","1.00"
"California","0.81","0.08","0.10","1.00"

I want to maintain the same directory structure with the edited output csv files. Any help will be highly appreciated.

I tried it on a small subset of csv files in one folder (no subfolders); the script above runs successfully but doesn't delete any rows.
– user7717771, Commented Apr 20, 2017 at 20:26
Shouldn't you close the file before opening it in write mode?
– DevLounge, Commented Apr 20, 2017 at 20:27
Describe your actual problem in your question, not the comments. You probably also want to look at os.walk.
– pvg, Commented Apr 20, 2017 at 20:28
If all the csv files are in the same flat directory, there is no need to use os.walk.
– DevLounge, Commented Apr 20, 2017 at 20:33

score 2 · Accepted Answer · 2017-04-20 20:59:28Z

Try this:

import os

# Change this to your CSV file base directory
base_directory = 'C:\\Data'    
for dir_path, dir_name_list, file_name_list in os.walk(base_directory):
    for file_name in file_name_list:
        # If this is not a CSV file
        if not file_name.endswith('.csv'):
            # Skip it
            continue
        file_path = os.path.join(dir_path, file_name)
        with open(file_path, 'r') as ifile:
            line_list = ifile.readlines()
        with open(file_path, 'w') as ofile:
            ofile.writelines(line_list[2:])

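Note: avoid using file as a variable name. In Python 2 it shadows the built-in file type; Python 3 has no such built-in, but a more specific name such as file_path is still clearer.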
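
If the originals should stay untouched, a variation is to write the trimmed copies into a separate output tree that mirrors the subfolder layout. The following is a minimal sketch rather than part of the answer above: the function name strip_leading_lines and its parameters are made up for illustration. It assumes each of the two header rows occupies exactly one physical line (no quoted fields containing line breaks) and that dst_root lies outside src_root.

```python
from itertools import islice
from pathlib import Path


def strip_leading_lines(src_root, dst_root, n_skip=2):
    """Copy every .csv under src_root into dst_root without its first n_skip lines.

    The subfolder layout of src_root is mirrored under dst_root and the
    original files are not modified. Returns the list of files written.
    Assumption: dst_root is not located inside src_root.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    written = []
    # sorted() collects the matches up front, before any output is written
    for src in sorted(src_root.rglob("*.csv")):
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # newline="" passes the existing line endings through unchanged
        with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
            # islice skips the first n_skip lines without loading the whole file
            fout.writelines(islice(fin, n_skip, None))
        written.append(dst)
    return written
```

For example, strip_leading_lines(r"C:\Data", r"C:\Data_clean") would write the trimmed files under C:\Data_clean, where C:\Data_clean is a hypothetical output folder.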

edited Apr 20, 2017 at 20:59

answered Apr 20, 2017 at 20:39

user3657941

2 Comments

user7717771 Over a year ago

No luck with the above

user3657941 Over a year ago

I updated the answer to handle an arbitrary directory structure using os.walk.
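
If the files still look unchanged after running the script, one way to narrow it down is to list which files the walk actually matches and what their first lines are, before and after the run. This is a small sketch with a hypothetical helper name, preview_csv_heads. Unlike the check in the answer, it matches the extension case-insensitively, so files named like DATA.CSV show up too.

```python
import os
from itertools import islice


def preview_csv_heads(base_directory, n_lines=3):
    """Return {file_path: first n_lines lines} for every csv file under base_directory."""
    heads = {}
    for dir_path, _dir_names, file_names in os.walk(base_directory):
        for file_name in file_names:
            # Case-insensitive match, so ".CSV" files are included as well
            if not file_name.lower().endswith(".csv"):
                continue
            file_path = os.path.join(dir_path, file_name)
            with open(file_path, "r") as ifile:
                heads[file_path] = [line.rstrip("\n") for line in islice(ifile, n_lines)]
    return heads
```

An empty result means the base directory or the extension filter matches nothing, which would explain a script that runs without errors but changes no files.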
