Good day everyone! My goal is to make the CSV reader skip blank lines while parsing a file, so that I only get the rows with at least one value. At the moment I have two methods: the first reads all rows as a List of String arrays and returns it; the second converts the result into a List of Lists of Strings. Both are below:

private List<String[]> readCSVFile(File filename) throws IOException {

    // try-with-resources closes the reader (and the underlying FileReader) when done
    try (CSVReader reader = new CSVReader(new FileReader(filename))) {
        return reader.readAll();
    }

}

public List<List<String>> readFile(File filename) throws IOException {

        List<String[]> allRows = readCSVFile(filename);     
        List<List<String>> allRowsAsLists = new ArrayList<List<String>>();      
        for (String[] rowItemsArray :  allRows) {
            List<String> rowItems = new ArrayList<String>();
            rowItems.addAll(Arrays.asList(rowItemsArray));
            allRowsAsLists.add(rowItems);

        }
    return allRowsAsLists;

}

My first thought was to check (in the second method) the length of the array and skip the row if it is 0, which would be something like this:

for (String[] rowItemsArray :  allRows) {
            if (rowItemsArray.length == 0) continue;
            List<String> rowItems = new ArrayList<String>();
            rowItems.addAll(Arrays.asList(rowItemsArray));
            allRowsAsLists.add(rowItems);

}  

Unfortunately that didn't work, because even a blank row still comes back as an array of elements (empty Strings, in fact). Checking each String by hand is not an option, as there are 100+ columns and the number varies. Please suggest the best way to achieve this. Thanks.

Sorted it out this way:

    public List<List<String>> readFile(File filename) throws IOException {

            List<String[]> allRows = readCSVFile(filename, includeHeaders, trimWhitespacesInFieldValues);       
            List<List<String>> allRowsAsLists = new ArrayList<List<String>>();      
            for (String[] rowItemsArray :  allRows) {
                if (allValuesInRowAreEmpty(rowItemsArray)) continue;
                List<String> rowItems = new ArrayList<String>();
                rowItems.addAll(Arrays.asList(rowItemsArray));
                allRowsAsLists.add(rowItems);

            }
            return allRowsAsLists;

        }

    private boolean allValuesInRowAreEmpty(String[] row) {
        boolean returnValue = true;
        for (String s : row) {
            if (s.length() != 0) {
                returnValue = false;
            }
        }
        return returnValue;
    }
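
The same idea can be sketched with streams, folding the blank-row check and the array-to-list conversion into one pass. This is only an illustration: the class and method names (BlankRowSkipper, isBlankRow, toNonBlankLists) are made up for this example, and it assumes a blank row arrives as an array of empty (or null) Strings, as described in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of opencsv.
final class BlankRowSkipper {

    // True when every cell is null or the empty string.
    static boolean isBlankRow(String[] row) {
        return Arrays.stream(row).allMatch(cell -> cell == null || cell.isEmpty());
    }

    // Converts parsed rows to mutable lists, dropping rows without any value.
    static List<List<String>> toNonBlankLists(List<String[]> allRows) {
        return allRows.stream()
                .filter(row -> !isBlankRow(row))
                .<List<String>>map(row -> new ArrayList<>(Arrays.asList(row)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample data standing in for the result of reader.readAll().
        List<String[]> parsed = Arrays.asList(
                new String[] {"id", "name"},
                new String[] {"", ""},
                new String[] {"1", ""});
        System.out.println(toNonBlankLists(parsed)); // prints [[id, name], [1, ]]
    }
}
```

Because allMatch stops at the first non-empty cell, rows with a value in an early column are not scanned to the end.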
1
  • You should avoid reading the empty lines at all. That means, the returned List of String arrays should not contain empty entries. Therefore, update your readCSVFile method. Commented Jan 22, 2015 at 12:13

8 Answers

9

You could check the length and the first element. A line that contains only a field separator yields an array with length > 1, and a line that contains a single space character yields a first element that is not empty, so neither of those is skipped.

if (rowItemsArray.length == 1 && rowItemsArray[0].isEmpty()) {
    continue;
}
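
To make the cases concrete, here is a small sketch that runs the same check against hand-built arrays. The arrays are assumptions about what the parser hands back for each kind of line (a single empty String for an empty line, two empty Strings for a line holding only a comma); the class and method names are invented for the example.

```java
// Hypothetical demo class, not part of opencsv.
final class SingleEmptyFieldCheck {

    // Same condition as above: exactly one field, and that field is empty.
    static boolean isEmptyLine(String[] row) {
        return row.length == 1 && row[0].isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isEmptyLine(new String[] {""}));     // true: empty line
        System.out.println(isEmptyLine(new String[] {"", ""})); // false: line with only a separator
        System.out.println(isEmptyLine(new String[] {" "}));    // false: line with a single space
    }
}
```

Note that a row of 100 empty columns separated by commas would not be skipped by this check, which is the case the question describes.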

8

For opencsv 5.0 there is an API-option to read CSV lines directly into a Bean.

For people who prefer the "CsvToBean" feature, the following solution uses the (sadly deprecated) #withFilter(..) method on CsvToBeanBuilder to skip blank lines in the InputStream:

InputStream inputStream; // provided
List<MyBean> data = new CsvToBeanBuilder<MyBean>(new BufferedReader(new InputStreamReader(inputStream)))
    .withType(MyBean.class)
    .withFilter(new CsvToBeanFilter() {
        /*
         * This filter ignores empty lines from the input
         */
        @Override
        public boolean allowLine(String[] strings) {
            for (String one : strings) {
                if (one != null && one.length() > 0) {
                    return true;
                }
            }
            return false;
        }
    }).build().parse();

Update: With opencsv Release 5.1 (dated 2/2/2020), CsvToBeanFilter got undeprecated as per feature request #120.

3 Comments

What would you do with opencsv ver 5.1, @Martin?
@logbasex CsvToBeanFilter was only marked as deprecated in 5.0, and that deprecation mark was removed again in 5.1. You can use the above code with both opencsv 5.0 and opencsv 5.1.
As of opencsv 5.1, there is a withIgnoreEmptyLines method on CsvToBeanBuilder.
2

Here is an updated solution with lambdas based on @Martin's solution:

InputStream inputStream; // provided
List<MyBean> data = new CsvToBeanBuilder<MyBean>(new BufferedReader(new InputStreamReader(inputStream)))
    .withType(MyBean.class)
    // This filter ignores empty lines from the input
    .withFilter(stringValues -> Arrays.stream(stringValues)
        .anyMatch(value -> value != null && value.length() > 0))
    .build()
    .parse();


1

You can use a filter with a lambda, like below:

CsvToBean<T> csvToBean = new CsvToBeanBuilder<T>(new StringReader(CSV_HEADER + "\n" + lines))
    .withType(clazz)
    .withFieldAsNull(CSVReaderNullFieldIndicator.EMPTY_SEPARATORS)
    .withSeparator(delimiter)
    .withSkipLines(skipLines)
    .withIgnoreLeadingWhiteSpace(true).withFilter(strings -> {
      for (String r : strings) {
        if (r != null && r.length() > 0) {
          return true;
        }
      }
      return false;
    }).build();

Your lambda filter:

.withFilter(strings -> {
      for (String r : strings) {
        if (r != null && r.length() > 0) {
          return true;
        }
      }
      return false;
    })


0

You could concatenate all string values in a row after trimming them. If the resulting string is empty, no cell holds a value, and the line can be ignored.
Something like this:

private boolean onlyEmptyCells(ArrayList<String> check) {
    StringBuilder sb = new StringBuilder();
    for (String s : check) {
        sb.append(s.trim());
    }
    return sb.toString().isEmpty(); //<- ignore 'check' if this returns true
}
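
A variant of the same check, sketched here under two assumptions: cells may be null (for example when the reader is configured to return empty fields as null), and it is fine to stop at the first cell that holds a value instead of concatenating everything. The class name EmptyCellCheck is invented for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of opencsv.
final class EmptyCellCheck {

    // True when every cell is null, empty, or whitespace only.
    static boolean onlyEmptyCells(List<String> cells) {
        for (String cell : cells) {
            if (cell != null && !cell.trim().isEmpty()) {
                return false; // found a value, no need to look further
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(onlyEmptyCells(Arrays.asList(" ", "\t", null))); // true
        System.out.println(onlyEmptyCells(Arrays.asList("", " x ")));       // false
    }
}
```

Accepting List<String> rather than ArrayList<String> lets callers pass the result of Arrays.asList directly.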


0

If you do not parse into a Bean, you can use the Java Streams API to filter out invalid CSV rows. My approach is like this (where is is a java.io.InputStream instance with the CSV data and YourBean map(String[] row) is your mapping method that maps a CSV row to your Java object):

CSVParser csvp = new CSVParserBuilder()
    .withSeparator(';')
    .withFieldAsNull(CSVReaderNullFieldIndicator.BOTH)
    .build();
CSVReader csvr = new CSVReaderBuilder(new InputStreamReader(is))
    .withCSVParser(csvp)
    .build();
List<YourBean> result = StreamSupport.stream(csvr.spliterator(), false)
    .filter(Objects::nonNull)
    .filter(row -> row.length > 0)
    .map(row -> map(row))
    .collect(Collectors.toList());


0

The JavaDoc for CsvToBeanFilter states "Here's an example showing how to use CsvToBean that removes empty lines. Since the parser returns an array with a single empty string for a blank line that is what it is checking." and lists an example of how to do this:

private class EmptyLineFilter implements CsvToBeanFilter {

    private final MappingStrategy strategy;

    public EmptyLineFilter(MappingStrategy strategy) {
        this.strategy = strategy;
    }

    public boolean allowLine(String[] line) {
        boolean blankLine = line.length == 1 && line[0].isEmpty();
        return !blankLine;
    }

}

public List<Feature> parseCsv(InputStreamReader streamReader) {
    HeaderColumnNameTranslateMappingStrategy<Feature> strategy = new HeaderColumnNameTranslateMappingStrategy();
    Map<String, String> columnMap = new HashMap();
    columnMap.put("FEATURE_NAME", "name");
    columnMap.put("STATE", "state");
    strategy.setColumnMapping(columnMap);
    strategy.setType(Feature.class);
    CSVReader reader = new CSVReader(streamReader);
    CsvToBeanFilter filter = new EmptyLineFilter(strategy);
    return new CsvToBean().parse(strategy, reader, filter);
}


0

Another way to do it is to implement the allowLine() method of the CsvToBeanFilter interface and pass an instance to .withFilter() when building the bean (example in Kotlin; the same should work in Java):

import com.opencsv.bean.CsvToBeanFilter
import com.opencsv.bean.CsvToBeanBuilder
import java.io.FileReader

class NonBlankLinesFilter : CsvToBeanFilter {
    override fun allowLine(line: Array<String?>?): Boolean {
        if (line == null) {
            return false
        }
        for (value in line) {
            if (!value.isNullOrBlank()) {
                return true
            }
        }
        return false
    }
}

fun main() {
  val reader = FileReader("path/to/csv/file.csv")

  val csvToBean = CsvToBeanBuilder<MyBean>(reader)
        .withType(MyBean::class.java)
        .withSeparator(',')
        .withFilter(NonBlankLinesFilter())
        .build()

  val uploadList = csvToBean.parse()

  println(uploadList)
}
