Class Tokenizer

All Implemented Interfaces:
Closeable, AutoCloseable, ITokenizer

public class Tokenizer extends AbstractTokenizer
Reads the CSV file line by line. To reuse the line-reading functionality of this class with your own implementation of readColumns(List), write your own tokenizer by extending AbstractTokenizer.
  • Field Details

    • NEWLINE

      private static final char NEWLINE
    • SPACE

      private static final char SPACE
    • currentColumn

      private final StringBuilder currentColumn
    • currentRow

      private final StringBuilder currentRow
    • quoteChar

      private final int quoteChar
    • delimeterChar

      private final int delimeterChar
    • surroundingSpacesNeedQuotes

      private final boolean surroundingSpacesNeedQuotes
    • ignoreEmptyLines

      private final boolean ignoreEmptyLines
    • commentMatcher

      private final CommentMatcher commentMatcher
    • maxLinesPerRow

      private final int maxLinesPerRow
  • Constructor Details

    • Tokenizer

      public Tokenizer(Reader reader, CsvPreference preferences)
      Constructs a new Tokenizer, which reads the CSV file line by line.
      Parameters:
      reader - the reader to read the CSV file from
      preferences - the CSV preferences
      Throws:
      NullPointerException - if reader or preferences is null
  • Method Details

    • readColumns

      public boolean readColumns(List<String> columns) throws IOException
      Reads a CSV row (which can potentially span multiple lines in the file) into the supplied List of columns. The columns list is cleared as the first operation in the method. Any empty columns ("") are added to the list as null.
      Parameters:
      columns - the List of columns to read into
      Returns:
      true if something was read, or false if EOF
      Throws:
      IOException - if an I/O error occurs while reading from the underlying reader
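The contract described above (list cleared first, empty columns stored as null, one row possibly spanning several lines, false at EOF) can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: RowReaderSketch is a hypothetical stand-in that uses only the JDK, hard-codes a comma delimiter and a double-quote quote character, and ignores surroundingSpacesNeedQuotes, ignoreEmptyLines, commentMatcher and maxLinesPerRow. Throwing IOException for an unterminated quote is also a choice made for this sketch only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for illustration; not the library's Tokenizer. */
class RowReaderSketch {
    private final BufferedReader in;
    private final StringBuilder currentRow = new StringBuilder();

    RowReaderSketch(Reader reader) {
        if (reader == null) {
            throw new NullPointerException("reader should not be null");
        }
        this.in = new BufferedReader(reader);
    }

    /** Reads one CSV row into columns; returns false at EOF. */
    boolean readColumns(List<String> columns) throws IOException {
        columns.clear();            // the list is cleared before anything else
        currentRow.setLength(0);

        String line = in.readLine();
        if (line == null) {
            return false;           // EOF: nothing was read
        }
        currentRow.append(line);

        StringBuilder column = new StringBuilder();
        boolean inQuotes = false;
        int i = 0;
        while (true) {
            if (i == line.length()) {
                if (!inQuotes) {
                    addColumn(columns, column);   // last column of the row
                    return true;
                }
                // A quoted column continues on the next line of the file.
                // readLine() strips the terminator, so '\n' is re-inserted here.
                column.append('\n');
                line = in.readLine();
                if (line == null) {
                    throw new IOException("unexpected end of file inside a quoted column");
                }
                currentRow.append('\n').append(line);
                i = 0;
                continue;
            }
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    column.append('"');           // doubled quote = escaped quote
                    i++;
                } else {
                    inQuotes = !inQuotes;         // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                addColumn(columns, column);       // delimiter ends the column
            } else {
                column.append(c);
            }
            i++;
        }
    }

    /** Returns the raw text of the row that was just read. */
    String getUntokenizedRow() {
        return currentRow.toString();
    }

    // Empty columns are added as null; the buffer is reset for the next column.
    private static void addColumn(List<String> columns, StringBuilder column) {
        columns.add(column.length() == 0 ? null : column.toString());
        column.setLength(0);
    }

    public static void main(String[] args) throws IOException {
        RowReaderSketch tokenizer =
                new RowReaderSketch(new StringReader("a,,\"x\ny\"\nb\n"));
        List<String> columns = new ArrayList<>();
        while (tokenizer.readColumns(columns)) {
            System.out.println(columns.size() + " column(s): " + columns);
        }
    }
}
```

Because the same list is reused and cleared on every call, a caller that wants to keep a row has to copy it (for example with new ArrayList<>(columns)) before calling readColumns again.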
    • appendSpaces

      private static void appendSpaces(StringBuilder sb, int spaces)
      Appends the required number of spaces to the StringBuilder.
      Parameters:
      sb - the StringBuilder
      spaces - the required number of spaces to append
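As a rough illustration of what such a helper does, the sketch below appends a given number of space characters to a StringBuilder. The class name AppendSpacesSketch is hypothetical and the body is an assumption based only on the description above, not the library's source. A helper like this is presumably useful when spaces are counted first and only written once it is known whether they belong to the column.

```java
/** Hypothetical illustration of a space-padding helper; not the library's code. */
class AppendSpacesSketch {

    // Appends exactly `spaces` space characters to sb (nothing if spaces <= 0).
    static void appendSpaces(StringBuilder sb, int spaces) {
        for (int i = 0; i < spaces; i++) {
            sb.append(' ');
        }
    }

    public static void main(String[] args) {
        StringBuilder column = new StringBuilder("a");
        appendSpaces(column, 3);   // spaces that turned out to be inside the column
        column.append('b');
        System.out.println("[" + column + "]");
    }
}
```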
    • getUntokenizedRow

      public String getUntokenizedRow()
      Returns the raw (untokenized) CSV row that was just read (which can potentially span multiple lines in the file).
      Returns:
      the raw (untokenized) CSV row that was just read