Vince's CSV Parser
Main class for parsing CSVs from files and in-memory sources.
#include <csv_reader.hpp>
Classes | |
| class | iterator |
| An input iterator capable of handling large files. | |
Public Member Functions | |
| CSVReader (const CSVReader &)=delete | |
| Not copyable. | |
| CSVReader (CSVReader &&)=delete | |
| Not movable: contains std::mutex. | |
| CSVReader & | operator= (const CSVReader &)=delete |
| Not copyable. | |
| CSVReader & | operator= (CSVReader &&)=delete |
| Not movable: contains std::mutex. | |
Constructors | |
Constructors for iterating over large files and parsing in-memory sources. | |
| CSVReader (csv::string_view filename, CSVFormat format=CSVFormat::guess_csv()) | |
| Construct CSVReader from filename using memory-mapped I/O. | |
| template<typename TStream , csv::enable_if_t< std::is_base_of< std::istream, TStream >::value, int > = 0> | |
| CSVReader (TStream &source, CSVFormat format=CSVFormat::guess_csv()) | |
| Construct CSVReader from std::istream. | |
Retrieving CSV Rows | |
| bool | read_row (CSVRow &row) |
| Retrieve rows as CSVRow objects, returning true if more rows are available. | |
| iterator | begin () |
| Return an iterator to the first row in the reader. | |
| CSV_CONST iterator | end () const noexcept |
| A placeholder for the imaginary past-the-end row in a CSV. | |
| bool | eof () const noexcept |
| Returns true if we have reached end of file. | |
CSV Metadata | |
| CSVFormat | get_format () const |
| Return the format of the original raw CSV. | |
| std::vector< std::string > | get_col_names () const |
| Return the CSV's column names as a vector of strings. | |
| int | index_of (csv::string_view col_name) const |
| Return the index of the column name if found or csv::CSV_NOT_FOUND otherwise. | |
CSV Metadata: Attributes | |
| CONSTEXPR bool | empty () const noexcept |
| Returns true if the file or stream contains no valid CSV rows, not counting the header. | |
| CONSTEXPR size_t | n_rows () const noexcept |
| Retrieves the number of rows that have been read so far. | |
| bool | utf8_bom () const noexcept |
| Whether the CSV was prefixed with a UTF-8 BOM. | |
Protected Member Functions | |
| void | set_col_names (const std::vector< std::string > &) |
| Sets this reader's column names and associated data. | |
Multi-Threaded File Reading Functions | |
| bool | read_csv (size_t bytes=internals::ITERATION_CHUNK_SIZE) |
| Read a chunk of CSV data. | |
Protected Attributes | |
CSV Settings | |
| CSVFormat | _format |
Parser State | |
| internals::ColNamesPtr | col_names = std::make_shared<internals::ColNames>() |
| Pointer to an object containing column information. | |
| std::unique_ptr< internals::IBasicCSVParser > | parser = nullptr |
| Helper class which actually does the parsing. | |
| std::unique_ptr< RowCollection > | records {new RowCollection(100)} |
| Queue of parsed CSV rows. | |
| size_t | n_cols = 0 |
| The number of columns in this CSV. | |
| size_t | _n_rows = 0 |
| How many rows (minus header) have been read so far. | |
Main class for parsing CSVs from files and in-memory sources.
Every row is compared against the column names for length consistency.
Definition at line 77 of file csv_reader.hpp.
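Because the copy and move operations are deleted (the class contains a std::mutex), a reader cannot be stored directly in a container that relocates its elements. One common workaround is to heap-allocate the object and hand around a std::unique_ptr, or simply to borrow it by reference. The sketch below illustrates the idea with a hypothetical stand-in class, LockedReader. It is not part of the library, and open_reader and name_length are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in: like CSVReader, it owns a std::mutex and
// explicitly deletes its copy and move operations.
class LockedReader {
public:
    explicit LockedReader(std::string name) : name_(std::move(name)) {}
    LockedReader(const LockedReader&) = delete;
    LockedReader(LockedReader&&) = delete;
    LockedReader& operator=(const LockedReader&) = delete;
    LockedReader& operator=(LockedReader&&) = delete;

    const std::string& name() const { return name_; }

private:
    std::string name_;
    std::mutex guard_;
};

static_assert(!std::is_copy_constructible<LockedReader>::value, "not copyable");
static_assert(!std::is_move_constructible<LockedReader>::value, "not movable");

// Heap-allocate so that ownership, not the object itself, is what moves.
// (Plain `new` keeps the sketch valid in C++11.)
std::unique_ptr<LockedReader> open_reader(const std::string& name) {
    return std::unique_ptr<LockedReader>(new LockedReader(name));
}

// Borrowing by reference needs no copy or move at all.
std::size_t name_length(const LockedReader& reader) {
    return reader.name().size();
}
```

A std::vector<std::unique_ptr<LockedReader>> can then hold several readers, since only the pointers are moved when the vector grows.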
| csv::CSVReader::CSVReader | ( | csv::string_view | filename, |
| CSVFormat | format = CSVFormat::guess_csv() |
||
| ) |
Construct CSVReader from filename using memory-mapped I/O.
Reads an arbitrarily large CSV file using memory-mapped I/O.
CODE PATH 1 of 2: Uses MmapParser with mio library for maximum performance. This is fundamentally different from the stream-based constructor below.
Details: Reads the first block of a CSV file synchronously to get information such as column names and delimiting character.
| [in] | filename | Path to CSV file |
| [in] | format | Format of the CSV file |
Guess delimiter and header row
Definition at line 167 of file csv_reader.cpp.
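To make the idea of inspecting the first block concrete, here is a deliberately naive sketch of delimiter guessing. It is not the library's algorithm: the helper guess_delimiter_naive, its candidate list, and its tie-breaking rule are all assumptions made for illustration, and it ignores quoted fields entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: picks the candidate that occurs most often in the
// first line of `head` (the first block of the input). Ties keep the
// earlier candidate; a line with no candidate at all falls back to ','.
char guess_delimiter_naive(const std::string& head,
                           const std::vector<char>& candidates = {',', '\t', ';', '|'}) {
    // substr(0, npos) yields the whole string when there is no newline.
    const std::string first_line = head.substr(0, head.find('\n'));

    char best = ',';
    std::ptrdiff_t best_count = 0;
    for (char c : candidates) {
        const std::ptrdiff_t n = std::count(first_line.begin(), first_line.end(), c);
        if (n > best_count) {
            best = c;
            best_count = n;
        }
    }
    return best;
}
```

A production-quality guess would also have to respect quoting and compare several rows, which is why this stays a sketch.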
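| template<typename TStream , csv::enable_if_t< std::is_base_of< std::istream, TStream >::value, int > = 0> |
| csv::CSVReader::CSVReader | ( | TStream & | source, |
| CSVFormat | format = CSVFormat::guess_csv() |
||
| ) |
inline |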
Construct CSVReader from std::istream.
CODE PATH 2 of 2: Uses StreamParser with different internal implementation than the memory-mapped constructor above. Issue #281 was specific to THIS path only.
| TStream | An input stream deriving from std::istream |
Definition at line 183 of file csv_reader.hpp.
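The template constraint on this constructor restricts TStream to types derived from std::istream. The sketch below reproduces the same pattern on a hypothetical stand-in class, StreamOnlyReader, using std::enable_if from the standard library in place of csv::enable_if_t. The class and its first_line member are illustrative and not part of the library.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical stand-in with a constructor constrained the same way.
class StreamOnlyReader {
public:
    // Participates in overload resolution only when TStream derives
    // from (or is) std::istream.
    template<typename TStream,
             typename std::enable_if<
                 std::is_base_of<std::istream, TStream>::value, int>::type = 0>
    explicit StreamOnlyReader(TStream& source) : source_(&source) {}

    // Reads one line from the underlying stream (illustration only).
    std::string first_line() {
        std::string line;
        std::getline(*source_, line);
        return line;
    }

private:
    std::istream* source_;
};

// Stream types are accepted...
static_assert(std::is_constructible<StreamOnlyReader, std::stringstream&>::value,
              "std::stringstream derives from std::istream");
static_assert(std::is_constructible<StreamOnlyReader, std::ifstream&>::value,
              "std::ifstream derives from std::istream");
// ...while unrelated types are rejected at compile time.
static_assert(!std::is_constructible<StreamOnlyReader, std::string&>::value,
              "std::string is not a stream");
```

Note that the parameter is a non-const lvalue reference, so the stream must be a named object that outlives the reader.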
|
inline |
Definition at line 215 of file csv_reader.hpp.
| CSVReader::iterator csv::CSVReader::begin | ( | ) |
Return an iterator to the first row in the reader.
Definition at line 9 of file csv_reader_iterator.cpp.
| CONSTEXPR bool csv::CSVReader::empty | ( | ) | const |
inline noexcept |
Returns true if the file or stream contains no valid CSV rows, not counting the header.
Definition at line 246 of file csv_reader.hpp.
| CSV_CONST CSVReader::iterator csv::CSVReader::end | ( | ) | const |
noexcept |
A placeholder for the imaginary past-the-end row in a CSV.
Attempting to dereference this iterator is an error, because it does not refer to a real row.
Definition at line 20 of file csv_reader_iterator.cpp.
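The begin()/end() pair follows the usual input-iterator contract: a single forward pass, with the end value used only for comparison. As an analogy from the standard library (not the reader's own iterator), std::istream_iterator behaves the same way. The helper name count_tokens below is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Counts whitespace-separated tokens in a single forward pass.
std::size_t count_tokens(std::istream& in) {
    std::istream_iterator<std::string> it(in);  // plays the role of begin()
    std::istream_iterator<std::string> last;    // plays the role of end()

    std::size_t n = 0;
    for (; it != last; ++it) {
        ++n;  // *it may be read here; `last` is only ever compared against
    }
    return n;
}
```

Calling count_tokens a second time on the same stream returns 0, because an input iterator consumes its source as it advances.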
| bool csv::CSVReader::eof | ( | ) | const |
inline noexcept |
Returns true if we have reached end of file.
Definition at line 228 of file csv_reader.hpp.
| std::vector< std::string > csv::CSVReader::get_col_names | ( | ) | const |
Return the CSV's column names as a vector of strings.
Definition at line 207 of file csv_reader.cpp.
| CSVFormat csv::CSVReader::get_format | ( | ) | const |
Return the format of the original raw CSV.
Definition at line 194 of file csv_reader.cpp.
| int csv::CSVReader::index_of | ( | csv::string_view | col_name | ) | const |
Return the index of the column name if found or csv::CSV_NOT_FOUND otherwise.
Definition at line 218 of file csv_reader.cpp.
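Since a failed lookup is reported through a sentinel value rather than an exception, callers need to check the result before using it as an index. The sketch below shows that check on plain vectors. The value of csv::CSV_NOT_FOUND is not stated on this page, so the constant NOT_FOUND = -1 is an assumption, and find_column and field_or are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: a negative sentinel standing in for csv::CSV_NOT_FOUND.
constexpr int NOT_FOUND = -1;

// Linear search over the column names; returns the position or NOT_FOUND.
int find_column(const std::vector<std::string>& col_names,
                const std::string& col_name) {
    for (std::size_t i = 0; i < col_names.size(); ++i) {
        if (col_names[i] == col_name) return static_cast<int>(i);
    }
    return NOT_FOUND;
}

// Returns the named field of `row`, or `fallback` when the column is
// absent or the row is too short.
std::string field_or(const std::vector<std::string>& col_names,
                     const std::vector<std::string>& row,
                     const std::string& col_name,
                     const std::string& fallback) {
    const int pos = find_column(col_names, col_name);
    if (pos == NOT_FOUND || static_cast<std::size_t>(pos) >= row.size()) {
        return fallback;
    }
    return row[static_cast<std::size_t>(pos)];
}
```

Comparing against the named constant, rather than a literal, keeps the caller correct even if the sentinel's value differs from the one assumed here.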
| CONSTEXPR size_t csv::CSVReader::n_rows | ( | ) | const |
inline noexcept |
Retrieves the number of rows that have been read so far.
Definition at line 249 of file csv_reader.hpp.
| bool csv::CSVReader::read_row | ( | CSVRow & | row | ) |
Retrieve rows as CSVRow objects, returning true if more rows are available.
| [out] | row | The variable where the parsed row will be stored |
Example:
Definition at line 310 of file csv_reader.cpp.
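The out-parameter signature of read_row lends itself to a while loop that reuses one row object. The following sketch shows that loop shape with a hypothetical stand-in, LineReader, which splits lines on commas and ignores quoting. It is not the library's parser. With the real class, the loop would call read_row on a CSVRow in the same position.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the reader: naive comma splitting, no quoting.
class LineReader {
public:
    explicit LineReader(std::istream& in) : in_(in) {}

    // Fills `row` and returns true, or returns false once input is exhausted.
    bool read_row(std::vector<std::string>& row) {
        std::string line;
        if (!std::getline(in_, line)) return false;

        row.clear();
        std::stringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ',')) row.push_back(field);
        return true;
    }

private:
    std::istream& in_;
};

// Typical consumer: one row object, overwritten on every iteration.
std::size_t count_fields(std::istream& in) {
    LineReader reader(in);
    std::vector<std::string> row;
    std::size_t total = 0;
    while (reader.read_row(row)) {
        total += row.size();
    }
    return total;
}
```

Reusing a single row object avoids allocating a fresh container per row, which matters when the input is large.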
| bool csv::CSVReader::utf8_bom | ( | ) | const |
inline noexcept |
Whether the CSV was prefixed with a UTF-8 BOM.
Definition at line 252 of file csv_reader.hpp.
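For context, the UTF-8 byte order mark is the three-byte sequence EF BB BF at the very start of the data. The sketch below shows how such a prefix can be detected and stripped. The helpers has_utf8_bom and strip_utf8_bom are hypothetical and make no claim about how the library implements its check.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when `data` starts with the bytes EF BB BF.
bool has_utf8_bom(const std::string& data) {
    return data.size() >= 3
        && static_cast<unsigned char>(data[0]) == 0xEF
        && static_cast<unsigned char>(data[1]) == 0xBB
        && static_cast<unsigned char>(data[2]) == 0xBF;
}

// Returns `data` without the marker, so that the first column name is
// not silently prefixed by three invisible bytes.
std::string strip_utf8_bom(const std::string& data) {
    return has_utf8_bom(data) ? data.substr(3) : data;
}
```

The casts to unsigned char matter because plain char may be signed, in which case comparing it directly against 0xEF would never succeed.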