Vince's CSV Parser
Stores information about how to parse a CSV file.
#include <csv_format.hpp>
Public Member Functions | |
| CSVFormat ()=default | |
| Settings for parsing an RFC 4180 CSV file. | |
| CSVFormat & | delimiter (char delim) |
| Sets the delimiter of the CSV file. | |
| CSVFormat & | delimiter (const std::vector< char > &delim) |
| Sets a list of potential delimiters. | |
| CSVFormat & | trim (const std::vector< char > &ws) |
| Sets the whitespace characters to be trimmed. | |
| CSVFormat & | quote (char quote) |
| Sets the quote character. | |
| CSVFormat & | column_names (const std::vector< std::string > &names) |
| Sets the column names. | |
| CSVFormat & | header_row (int row) |
| Sets the header row. | |
| CSVFormat & | no_header () |
| Tells the parser that this CSV has no header row. | |
| CSVFormat & | quote (bool use_quote) |
| Turn quoting on or off. | |
| CONSTEXPR_14 CSVFormat & | variable_columns (VariableColumnPolicy policy=VariableColumnPolicy::IGNORE_ROW) |
| Tells the parser how to handle rows whose column count differs from the others. | |
| CONSTEXPR_14 CSVFormat & | variable_columns (bool policy) |
| Tells the parser how to handle rows whose column count differs from the others. | |
| CONSTEXPR_14 CSVFormat & | column_names_policy (ColumnNamePolicy policy) |
| Sets the column name lookup policy. | |
| CSVFormat & | chunk_size (size_t size) |
| Sets the chunk size used when reading the CSV. | |
| CONSTEXPR_14 CSVFormat & | threading (bool enabled=true) |
| Enable or disable parser threading at runtime. | |
| CONSTEXPR_14 CSVFormat & | speculative_parallel_threads (size_t n_threads) |
| Set the worker count used by speculative parallel parsing. | |
| CONSTEXPR_14 CSVFormat & | speculative_parallel_min_bytes (size_t bytes) |
| Set the minimum source size required for speculative parallel parsing. | |
| CONSTEXPR_14 CSVFormat & | eager_field_classification (bool enabled=true) |
| Enable parser-time scalar classification for typed consumers. | |
| bool | guess_delim () const |
Static Public Member Functions | |
| static CSVFormat | guess_csv () |
| CSVFormat preset for delimiter inference with header/n_cols inference enabled. | |
Public Attributes | |
| friend | CSVReader |
Friends | |
| template<typename RowSink , typename ParsePolicy , typename FieldPolicy , typename RowPolicy > | |
| class | internals::CSVParserCore |
Stores information about how to parse a CSV file.
Can be used to construct a csv::CSVReader.
Definition at line 49 of file csv_format.hpp.
| CSVFormat & csv::CSVFormat::chunk_size | ( | size_t | size | ) |
Sets the chunk size used when reading the CSV.
| [in] | size | Chunk size in bytes (minimum: CSV_CHUNK_SIZE_FLOOR) |
| `std::invalid_argument` | thrown if size < CSV_CHUNK_SIZE_FLOOR or size > UINT32_MAX |
Use this when constructing a CSVReader from a filename if individual rows may exceed the default 10 MB chunk size. The value is passed to CSVReader at construction time, before any data is read.
Definition at line 55 of file csv_format.cpp.
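The range check described for chunk_size() can be sketched as a standalone function. This is an illustration only, not the library's implementation: `checked_chunk_size` and `kAssumedChunkFloor` are hypothetical names, and the 64 KiB floor is a placeholder because the real value of CSV_CHUNK_SIZE_FLOOR is not stated on this page.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical floor for illustration only; the real CSV_CHUNK_SIZE_FLOOR
// value is not given on this page.
constexpr std::size_t kAssumedChunkFloor = 64 * 1024;

// Returns `size` unchanged when it lies in [floor, UINT32_MAX];
// otherwise throws std::invalid_argument, matching the documented contract.
inline std::size_t checked_chunk_size(std::size_t size) {
    if (size < kAssumedChunkFloor) {
        throw std::invalid_argument("chunk size is below the floor");
    }
    if (size > std::numeric_limits<std::uint32_t>::max()) {
        throw std::invalid_argument("chunk size exceeds UINT32_MAX");
    }
    return size;
}
```

Under these assumptions a 32 MiB request passes through unchanged, while a 1-byte request is rejected before any data would be read.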
| CSVFormat & csv::CSVFormat::column_names | ( | const std::vector< std::string > & | names | ) |
Sets the column names.
Definition at line 38 of file csv_format.cpp.
| CONSTEXPR_14 CSVFormat & csv::CSVFormat::column_names_policy | ( | ColumnNamePolicy | policy | ) | inline |
Sets the column name lookup policy.
| [in] | policy | Use ColumnNamePolicy::CASE_INSENSITIVE to allow case-insensitive column lookups via CSVRow::operator[] and CSVReader::index_of(). |
Definition at line 131 of file csv_format.hpp.
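To show what a case-insensitive lookup policy for column_names_policy() implies, here is a minimal standalone sketch. `find_column` and `to_lower_ascii` are hypothetical helpers, not library functions. The sketch folds ASCII letters only, and returning -1 for a missing column is an assumption of this example.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lowercases ASCII letters; a simplification that ignores locale and Unicode.
inline std::string to_lower_ascii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical stand-in for a column lookup: returns the index of `name`
// in `columns`, or -1 when no column matches.
inline int find_column(const std::vector<std::string>& columns,
                       const std::string& name, bool case_insensitive) {
    const std::string wanted = case_insensitive ? to_lower_ascii(name) : name;
    for (std::size_t i = 0; i < columns.size(); ++i) {
        const std::string candidate =
            case_insensitive ? to_lower_ascii(columns[i]) : columns[i];
        if (candidate == wanted) return static_cast<int>(i);
    }
    return -1;
}
```

With columns `Id` and `Name`, looking up `name` succeeds only when the case-insensitive flag is set.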
| CSVFormat & csv::CSVFormat::delimiter | ( | char | delim | ) |
Sets the delimiter of the CSV file.
Passing a single delimiter disables delimiter inference. Header-row inference still runs unless header_row()/no_header() was set explicitly or column_names() was provided.
| `std::runtime_error` | thrown if trim, quote, or possible delimiting characters overlap |
Definition at line 13 of file csv_format.cpp.
| CSVFormat & csv::CSVFormat::delimiter | ( | const std::vector< char > & | delim | ) |
Sets a list of potential delimiters.
Passing multiple delimiters enables delimiter inference.
| `std::runtime_error` | thrown if trim, quote, or possible delimiting characters overlap |
Definition at line 19 of file csv_format.cpp.
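The overlap rule shared by delimiter(), quote(), and trim(), together with the chained-setter style, can be sketched with a small stand-in class. `MiniFormat` is hypothetical and is not the library's CSVFormat. Its defaults (comma delimiter, double-quote character) and its rule that more than one candidate delimiter means inference are assumptions that follow the descriptions above.

```cpp
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical miniature of a chained format builder; not the library's class.
class MiniFormat {
public:
    MiniFormat& delimiter(char d) { delims_ = {d}; check(); return *this; }
    MiniFormat& delimiter(const std::vector<char>& ds) { delims_ = ds; check(); return *this; }
    MiniFormat& quote(char q) { quote_ = q; check(); return *this; }
    MiniFormat& trim(const std::vector<char>& ws) { trim_ = ws; check(); return *this; }

    // More than one candidate means the delimiter still has to be inferred.
    bool guess_delim() const { return delims_.size() > 1; }

private:
    // Throws std::runtime_error if a character plays more than one role.
    void check() const {
        const std::set<char> delims(delims_.begin(), delims_.end());
        const std::set<char> trims(trim_.begin(), trim_.end());
        if (delims.count(quote_) || trims.count(quote_)) {
            throw std::runtime_error("quote character overlaps with another role");
        }
        for (char c : trims) {
            if (delims.count(c)) {
                throw std::runtime_error("trim and delimiter characters overlap");
            }
        }
    }

    std::vector<char> delims_{','};  // assumed default
    char quote_ = '"';               // assumed default
    std::vector<char> trim_;
};
```

Each setter returns `*this`, so calls such as `MiniFormat().delimiter('\t').trim({' '})` chain naturally. Asking to trim tabs and then selecting tab as the delimiter throws.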
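| CONSTEXPR_14 CSVFormat & csv::CSVFormat::eager_field_classification | ( | bool | enabled = true | ) | inline |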
Enable parser-time scalar classification for typed consumers.
Disabled by default so normal string-only parsing keeps the historical lazy classification cost model.
Definition at line 181 of file csv_format.hpp.
| static CSVFormat csv::CSVFormat::guess_csv | ( | ) | inline static |
CSVFormat preset for delimiter inference with header/n_cols inference enabled.
Definition at line 229 of file csv_format.hpp.
| bool csv::CSVFormat::guess_delim | ( | ) const | inline |
Definition at line 241 of file csv_format.hpp.
| CSVFormat & csv::CSVFormat::header_row | ( | int | row | ) |
Sets the header row.
| [in] | row | Row index containing column names; negative means no header row. |
Definition at line 45 of file csv_format.cpp.
| CSVFormat & csv::CSVFormat::no_header | ( | ) | inline |
Tells the parser that this CSV has no header row.
Equivalent to header_row(-1).
Definition at line 102 of file csv_format.hpp.
| CSVFormat & csv::CSVFormat::quote | ( | bool | use_quote | ) | inline |
Turn quoting on or off.
Definition at line 108 of file csv_format.hpp.
| CSVFormat & csv::CSVFormat::quote | ( | char | quote | ) |
Sets the quote character.
| `std::runtime_error` | thrown if trim, quote, or possible delimiting characters overlap |
Definition at line 25 of file csv_format.cpp.
| CONSTEXPR_14 CSVFormat & csv::CSVFormat::speculative_parallel_min_bytes | ( | size_t | bytes | ) | inline |
Set the minimum source size required for speculative parallel parsing.
Definition at line 171 of file csv_format.hpp.
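| CONSTEXPR_14 CSVFormat & csv::CSVFormat::speculative_parallel_threads | ( | size_t | n_threads | ) | inline |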
Set the worker count used by speculative parallel parsing.
A value of 0 means "choose automatically" when the reader is created.
Definition at line 165 of file csv_format.hpp.
| CONSTEXPR_14 CSVFormat & csv::CSVFormat::threading | ( | bool | enabled = true | ) | inline |
Enable or disable parser threading at runtime.
Threading is enabled by default when the library is compiled with CSV_ENABLE_THREADS=1. Disable it for workloads with many small CSVs where a background parser thread costs more than it helps.
When disabled, CSVReader parses synchronously on the caller thread and speculative parallel parsing is also disabled.
Definition at line 156 of file csv_format.hpp.
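How threading(), speculative_parallel_min_bytes(), and speculative_parallel_threads() might combine can be sketched as two small decision functions. Both are hypothetical and do not reproduce the library's logic. In particular, treating the threshold as inclusive and using `std::thread::hardware_concurrency()` for the automatic worker count are assumptions of this example.

```cpp
#include <cstddef>
#include <thread>

// Speculative parallel parsing is considered only when threading is enabled
// and the source is at least `min_bytes` long (inclusive bound assumed).
inline bool parallel_parse_eligible(bool threading_enabled,
                                    std::size_t source_bytes,
                                    std::size_t min_bytes) {
    return threading_enabled && source_bytes >= min_bytes;
}

// A request of 0 means "choose automatically". This sketch uses the hardware
// thread count and falls back to 1 when the count is unknown.
inline std::size_t resolve_worker_count(std::size_t requested) {
    if (requested != 0) return requested;
    const unsigned detected = std::thread::hardware_concurrency();
    return detected == 0 ? 1 : static_cast<std::size_t>(detected);
}
```

Disabling threading makes the first function return false regardless of source size, which mirrors the statement above that speculative parallel parsing is disabled along with threading.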
| CSVFormat & csv::CSVFormat::trim | ( | const std::vector< char > & | ws | ) |
Sets the whitespace characters to be trimmed.
| `std::runtime_error` | thrown if trim, quote, or possible delimiting characters overlap |
Definition at line 32 of file csv_format.cpp.
| CONSTEXPR_14 CSVFormat & csv::CSVFormat::variable_columns | ( | bool | policy | ) | inline |
Tells the parser how to handle rows whose column count differs from the others.
Definition at line 120 of file csv_format.hpp.
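| CONSTEXPR_14 CSVFormat & csv::CSVFormat::variable_columns | ( | VariableColumnPolicy | policy = VariableColumnPolicy::IGNORE_ROW | ) | inline |
Tells the parser how to handle rows whose column count differs from the others.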
Definition at line 114 of file csv_format.hpp.
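The effect of a row-length policy such as the one set by variable_columns() can be illustrated with a standalone filter. `RowLengthPolicy` and `apply_row_length_policy` are hypothetical names. Only IGNORE_ROW is named on this page, so the THROW and KEEP alternatives below are assumed for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical policy enum for this sketch. Only IGNORE_ROW appears on this
// page; THROW and KEEP are assumptions.
enum class RowLengthPolicy { THROW, IGNORE_ROW, KEEP };

using Row = std::vector<std::string>;

// Returns the rows that survive the policy, given the expected column count.
inline std::vector<Row> apply_row_length_policy(const std::vector<Row>& rows,
                                                std::size_t expected_cols,
                                                RowLengthPolicy policy) {
    std::vector<Row> kept;
    for (const Row& row : rows) {
        if (row.size() == expected_cols || policy == RowLengthPolicy::KEEP) {
            kept.push_back(row);
        } else if (policy == RowLengthPolicy::THROW) {
            throw std::runtime_error("row length differs from the expected column count");
        }
        // IGNORE_ROW: the mismatched row is dropped silently.
    }
    return kept;
}
```

Given three rows where the middle one is short, IGNORE_ROW yields two rows, KEEP yields all three, and THROW raises on the short row.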
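| template<typename RowSink , typename ParsePolicy , typename FieldPolicy , typename RowPolicy > |
| friend class internals::CSVParserCore | friend |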
Definition at line 247 of file csv_format.hpp.
| friend csv::CSVFormat::CSVReader |
Definition at line 245 of file csv_format.hpp.