I have a directory (with several subfolders) of CSV files. I want to delete the first 2 rows of every CSV file before I upload the files to a database (SQL Server). I started with the following Python script on a small subset of CSV files located in one folder (no subfolders). The script runs without errors, but no rows are deleted from the files. What am I missing?

import glob
import csv

myfiles = glob.glob("C:\Data\*.csv")
for file in myfiles:
    lines = open(file).readlines()
    open(file, 'w').writelines(lines[1:])

Here is my sample data:

"Title: Distribution of Nonelderly Population by Household Employment Status | The Henry J. Kaiser Family Foundation"
"Timeframe: 2015"
"Location","At Least 1 Full Time Worker","Part Time Workers","Non Workers","Total"
"United States","0.82","0.08","0.10","1.00"
"Alabama","0.79","0.06","0.15","1.00"
"Alaska","0.85","0.06","0.09","1.00"
"Arizona","0.80","0.08","0.12","1.00"
"Arkansas","0.78","0.07","0.15","1.00"
"California","0.81","0.08","0.10","1.00"

I want to maintain the same directory structure with the edited output csv files. Any help will be highly appreciated.

  • So, what's the question? What problems did you run into? Commented Apr 20, 2017 at 20:24
  • I tried it on a small subset of CSV files in one folder (no subfolders). The above script runs successfully but doesn't delete any rows. Commented Apr 20, 2017 at 20:26
  • Shouldn't you close the file before opening it in write mode? Commented Apr 20, 2017 at 20:27
  • Describe your actual problem in your question, not the comments. You probably also want to look at os.walk. Commented Apr 20, 2017 at 20:28
  • If all the CSV files are in the same flat directory, there is no need to use os.walk. Commented Apr 20, 2017 at 20:33

1 Answer

Try this:

import os

# Change this to your CSV file base directory
base_directory = 'C:\\Data'    
for dir_path, dir_name_list, file_name_list in os.walk(base_directory):
    for file_name in file_name_list:
        # If this is not a CSV file
        if not file_name.endswith('.csv'):
            # Skip it
            continue
        file_path = os.path.join(dir_path, file_name)
        with open(file_path, 'r') as ifile:
            line_list = ifile.readlines()
        with open(file_path, 'w') as ofile:
            ofile.writelines(line_list[2:])

Note: avoid using file as a variable name; in Python 2 it shadows the built-in file type.
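
If you would rather keep the original files untouched, the same os.walk loop can write the trimmed copies into a separate folder that mirrors the source layout. The following is only a sketch under a few assumptions: the function name strip_leading_lines and its parameters are made up for illustration, the output folder is assumed to lie outside the base folder (otherwise os.walk would also visit the copies), and each of the two header rows is assumed to occupy exactly one physical line, as in the sample data.

```python
import os


def strip_leading_lines(base_directory, output_directory, n_lines=2):
    """Copy every .csv under base_directory into output_directory,
    dropping the first n_lines of each file and keeping the subfolder layout.

    Hypothetical helper for illustration; returns the list of files written.
    """
    written = []
    for dir_path, _dir_names, file_names in os.walk(base_directory):
        # Folder of the current files relative to the base ('.' for the base itself)
        relative_dir = os.path.relpath(dir_path, base_directory)
        target_dir = os.path.normpath(os.path.join(output_directory, relative_dir))
        for file_name in file_names:
            # Skip anything that is not a CSV file (case-insensitive check)
            if not file_name.lower().endswith('.csv'):
                continue
            os.makedirs(target_dir, exist_ok=True)
            source_path = os.path.join(dir_path, file_name)
            target_path = os.path.join(target_dir, file_name)
            # newline='' keeps the original line endings untranslated
            with open(source_path, 'r', newline='') as ifile:
                line_list = ifile.readlines()
            with open(target_path, 'w', newline='') as ofile:
                ofile.writelines(line_list[n_lines:])
            written.append(target_path)
    return written
```

It could be called as strip_leading_lines('C:\\Data', 'C:\\Data_clean'), where C:\Data_clean is just a placeholder name for the output folder. Passing the same path for both arguments should reproduce the in-place behaviour of the loop above, since each file is fully read before it is reopened for writing.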

2 Comments

No luck with the above
I updated the answer to handle an arbitrary directory structure using os.walk.
