I have two scripts, a mapper and a reducer. Both read their input from stdin with csv.reader. The mapper should take its input from a tab-delimited text file, dataset.csv, and the reducer's input should be the mapper's output. I want to save the reducer's output to a text file, output.txt. What is the correct chain of commands to do this?

mapper:

#!/usr/bin/python

import sys, csv

reader = csv.reader(sys.stdin, delimiter='\t')
writer = csv.writer(sys.stdout, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)

for line in reader:
    if len(line) > 5: # parse only lines in the forum_node.tsv file
        if line[5] == 'question':
            _id = line[0]
            student = line[3] # author_id
        elif line[5] != 'node_type':
            _id = line[7]
            student = line[3] # author_id
        else:
            continue # ignore header

        # emit "thread id <TAB> author id"; print(...) with one argument runs on Python 2 and 3
        print('{0}\t{1}'.format(_id, student))

reducer:

#!/usr/bin/python

import sys, csv

reader = csv.reader(sys.stdin, delimiter='\t')
writer = csv.writer(sys.stdout, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)

oldID = None
students = []

for line in reader:
    if len(line) != 2:
        continue # skip malformed lines

    thisID, thisStudent = line

    # the key changed, so the previous thread is complete: emit it
    if oldID and oldID != thisID:
        print('Thread: {0}, students: {1}'.format(oldID, ', '.join(students)))
        students = []

    oldID = thisID
    students.append(thisStudent)

# emit the last thread
if oldID is not None:
    print('Thread: {0}, students: {1}'.format(oldID, ', '.join(students)))

1 Answer

Pipe the files together:

python mapper.py < dataset.csv | python reducer.py > output.txt

The < dataset.csv gives mapper.py the CSV file on stdin, and the | pipes the mapper's stdout into the stdin of the next command. That next command is python reducer.py, and > output.txt redirects the stdout of that script to output.txt.
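
One caveat: the reducer only compares each key with the previous one, so it groups correctly only when lines with the same thread ID arrive next to each other. Hadoop streaming sorts the mapper output by key before the reducer sees it; in a plain shell pipe you would typically put sort between the two scripts (python mapper.py < dataset.csv | sort | python reducer.py > output.txt). The following is a minimal in-memory sketch of that effect, not the asker's scripts themselves: the sample rows, the column names in the header, and the helper names run_mapper and run_reducer are all made up for illustration, and the sketch skips any row with fewer than 8 fields so that row[7] is always safe.

```python
import csv
import io
from itertools import groupby

# Hypothetical sample rows (assumed layout): index 0 = node id,
# 3 = author_id, 5 = node_type, 7 = id of the thread the node belongs to.
SAMPLE_TSV = (
    "id\ttitle\ttags\tauthor_id\tbody\tnode_type\tparent_id\tabs_parent_id\n"
    "10\tq1\tt\t100\tb\tquestion\t0\t0\n"
    "20\tq2\tt\t200\tb\tquestion\t0\t0\n"
    "11\ta1\tt\t101\tb\tanswer\t10\t10\n"
    "21\tc1\tt\t201\tb\tcomment\t20\t20\n"
    "12\tc2\tt\t102\tb\tcomment\t11\t10\n"
)


def run_mapper(text):
    """Return (thread_id, author_id) pairs, mirroring the mapper's branches."""
    pairs = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) <= 7 or row[5] == "node_type":
            continue  # skip short lines and the header
        thread_id = row[0] if row[5] == "question" else row[7]
        pairs.append((thread_id, row[3]))
    return pairs


def run_reducer(pairs):
    """Group *consecutive* pairs with the same key, like the reducer loop."""
    return [
        "Thread: {0}, students: {1}".format(key, ", ".join(s for _, s in group))
        for key, group in groupby(pairs, key=lambda p: p[0])
    ]


mapped = run_mapper(SAMPLE_TSV)
unsorted_result = run_reducer(mapped)        # keys interleaved: threads get split
sorted_result = run_reducer(sorted(mapped))  # same keys adjacent: one line per thread

print(len(unsorted_result), "lines without sorting")
print("\n".join(sorted_result))
```

With this sample data the unsorted run reports the same thread several times, while the sorted run produces one line per thread. If dataset.csv is already ordered so that each thread's rows come out of the mapper together, the sort step changes nothing.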
