
I need to parse a CSV file for certain fields, and based on matching a pattern I need to add fields together. I've succeeded in setting variables, but need help figuring out how to add them when there may be 1-20 variables. (Or possibly another, simpler way to approach this.)

Source file contents example:

Server-Name,Volume-Name,Vol-Size,Logical-Space-In-Use
FTWTRAQNETSQL01,FTWTRAQNETSQL01_e,2008,1989
FTWTRAQNETSQL01,FTWTRAQNETSQL01_f,106,63.698
FTWTRAQNETSQL02,FTWTRAQNETSQL02_e,2008,1989
FTWTRAQNETSQL02,FTWTRAQNETSQL02_f,106,4.155
ftwvocmpsqln01,ftwvocmpsqln01_1,1002,21.047
ftwvocmpsqln01,ftwvocmpsqln01_2,104,55.379
ftwspsqln02,ftwspsqln02_H,501,0
ftwvocmpsqln02,ftwvocmpsqln02_1,1002,20.732
ftwvocmpsqln02,ftwvocmpsqln02_2,104,55.380

Output should be one line for each unique server name, summing all the field 3 values and all the field 4 values. Servers can have many volumes, some with as many as 20. Desired file output would be:

Server-Name,Vol-Size,Logical-Space-In-Use
FTWTRAQNETSQL01,2114,2052.698
FTWTRAQNETSQL02,2114,1993.155
ftwvocmpsqln01,1106,76.426
ftwspsqln02,501,0
ftwvocmpsqln02,1106,76.112

I can do this in about 7 seconds in Excel, but so far I haven't figured out a way to automate it with bash (or another shell).

This is the code I have so far, looking only at field 3. It correctly sets variables for each unique server, but I can't figure out how to do the addition with a variable number of variables.

for i in $( awk -F , '{print $1}' $REPORT | grep -v Server-Name | uniq )
do
    c=0
    for num in $( grep $i $REPORT | awk -F , '{print $3}' )
    do
        eval "var$c=$num"
        c=$((c+1))
    done
done
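
The addition itself can be done with a running total rather than one numbered variable per value. A rough sketch of that idea (assuming the same $REPORT variable, looking only at field 3 as above, and using bc since some fields contain decimals):

for i in $( awk -F, 'NR>1 {print $1}' "$REPORT" | sort -u )
do
    sum=0
    # accumulate field 3 for every line whose first field is this server
    for num in $( awk -F, -v srv="$i" '$1 == srv {print $3}' "$REPORT" )
    do
        sum=$( echo "$sum + $num" | bc )
    done
    echo "$i,$sum"
done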

3 Answers

With GNU datamash:

$ datamash -t, --header-in groupby 1 sum 3,4 < file.csv
FTWTRAQNETSQL01,2114,2052.698
FTWTRAQNETSQL02,2114,1993.155
ftwvocmpsqln01,1106,76.426
ftwspsqln02,501,0
ftwvocmpsqln02,1106,76.112
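
The command above prints only the data rows, with no header line; datamash's -H option (shorthand for --header-in together with --header-out) adds one, although the column names are generated from the operations, something like:

$ datamash -t, -H groupby 1 sum 3,4 < file.csv
GroupBy(Server-Name),sum(Vol-Size),sum(Logical-Space-In-Use)

followed by the same data rows as above.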

Not a shell, but the "Unix way":

awk -F',' 'NR==1; NR>1{s3[$1]+=$3; s4[$1]+=$4} END { for(i in s3){printf("%s,%s,%s\n",i,s3[i],s4[i])} }' file

The order of the output will (probably) not match the input order.

Description:

awk                   # use awk.
-F','                 # set the field separator as comma (,)
'                                         # start an awk script.
   NR==1;                                 # print first line (header)
   NR>1{                                  # for lines other than first
         s3[$1]+=$3;                      # add values on third field
         s4[$1]+=$4                       # add values on fourth field
       }                                  # close the previous {
         END {                            # after all lines have been read
               for(i in s3){              # for each index of the array
                                          # (all unique values of field $1)
                             printf("%s,%s,%s\n",i,s3[i],s4[i])   # print values.
                           }              # close the for loop.
             }                            # close the END block.
' file                                    # end script code and name the file.
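
If the input order needs to be preserved, one way (not part of the answer above) is to remember the order in which each server name is first seen and loop over that in END, e.g.:

awk -F',' 'NR==1;
           NR>1 && !($1 in s3){order[++n]=$1}
           NR>1{s3[$1]+=$3; s4[$1]+=$4}
           END{for(k=1;k<=n;k++){i=order[k]; printf("%s,%s,%s\n",i,s3[i],s4[i])}}' file
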
  • That seems to work! Thanks! (Now I just have to take it apart so I understand what you're doing.) Commented Jun 20, 2019 at 19:00
  • @TeresaM Description added .... Commented Jun 20, 2019 at 19:47
  • Thank you, @Isaac. Much appreciated, so I don't have to ask the next time I need this type of function. Commented Jun 20, 2019 at 21:31

Using Miller (mlr) to compute the sums of the two columns Vol-Size and Logical-Space-In-Use, grouping by the Server-Name field:

$ mlr --csv stats1 -a sum -f Vol-Size,Logical-Space-In-Use -g Server-Name file
Server-Name,Vol-Size_sum,Logical-Space-In-Use_sum
FTWTRAQNETSQL01,2114,2052.698000
FTWTRAQNETSQL02,2114,1993.155000
ftwvocmpsqln01,1106,76.426000
ftwspsqln02,501,0
ftwvocmpsqln02,1106,76.112000
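
The summed columns come back with a _sum suffix and trailing zeros in the decimals. If the headers should match the desired output exactly, Miller verbs can be chained; a sketch using the label verb, which renames fields by position:

$ mlr --csv stats1 -a sum -f Vol-Size,Logical-Space-In-Use -g Server-Name \
      then label Server-Name,Vol-Size,Logical-Space-In-Use file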
