Vince's CSV Parser
csv::CSVFormat Class Reference

Stores information about how to parse a CSV file.

#include <csv_format.hpp>

Public Member Functions

 CSVFormat ()=default
 Settings for parsing a RFC 4180 CSV file.
 
CSVFormat & delimiter (char delim)
 Sets the delimiter of the CSV file.
 
CSVFormat & delimiter (const std::vector< char > &delim)
 Sets a list of potential delimiters.
 
CSVFormat & trim (const std::vector< char > &ws)
 Sets the whitespace characters to be trimmed.
 
CSVFormat & quote (char quote)
 Sets the quote character.
 
CSVFormat & column_names (const std::vector< std::string > &names)
 Sets the column names.
 
CSVFormat & header_row (int row)
 Sets the header row.
 
CSVFormat & no_header ()
 Tells the parser that this CSV has no header row.
 
CSVFormat & quote (bool use_quote)
 Turn quoting on or off.
 
CONSTEXPR_14 CSVFormat & variable_columns (VariableColumnPolicy policy=VariableColumnPolicy::IGNORE_ROW)
 Tells the parser how to handle rows with a different number of columns than the others.
 
CONSTEXPR_14 CSVFormat & variable_columns (bool policy)
 Tells the parser how to handle rows with a different number of columns than the others.
 
CONSTEXPR_14 CSVFormat & column_names_policy (ColumnNamePolicy policy)
 Sets the column name lookup policy.
 
CSVFormat & chunk_size (size_t size)
 Sets the chunk size used when reading the CSV.
 
CONSTEXPR_14 CSVFormat & threading (bool enabled=true)
 Enable or disable parser threading at runtime.
 
CONSTEXPR_14 CSVFormat & speculative_parallel_threads (size_t n_threads)
 Set the worker count used by speculative parallel parsing.
 
CONSTEXPR_14 CSVFormat & speculative_parallel_min_bytes (size_t bytes)
 Set the minimum source size required for speculative parallel parsing.
 
CONSTEXPR_14 CSVFormat & eager_field_classification (bool enabled=true)
 Enable parser-time scalar classification for typed consumers.
 
bool guess_delim () const
 

Static Public Member Functions

static CSVFormat guess_csv ()
 CSVFormat preset for delimiter inference with header/n_cols inference enabled.
 

Public Attributes

friend CSVReader
 

Friends

template<typename RowSink , typename ParsePolicy , typename FieldPolicy , typename RowPolicy >
class internals::CSVParserCore
 

Detailed Description

Stores information about how to parse a CSV file.

Can be used to construct a csv::CSVReader.

Definition at line 49 of file csv_format.hpp.

Member Function Documentation

◆ chunk_size()

CSVFormat & csv::CSVFormat::chunk_size ( size_t  size)

Sets the chunk size used when reading the CSV.

Parameters
[in] size  Chunk size in bytes (minimum: CSV_CHUNK_SIZE_FLOOR)
Exceptions
`std::invalid_argument` if size < CSV_CHUNK_SIZE_FLOOR or size > UINT32_MAX

Use this when constructing a CSVReader from a filename and individual rows may exceed the default 10MB chunk size. The value is passed to CSVReader at construction time, before any data is read.

Definition at line 55 of file csv_format.cpp.
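The range check described above can be sketched in isolation. This is a minimal stand-in, not the library's code: `checked_chunk_size`, `chunk_size_accepted`, and the floor value `kAssumedChunkFloor` are hypothetical names chosen for illustration, and the real value of CSV_CHUNK_SIZE_FLOOR is not stated on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for CSV_CHUNK_SIZE_FLOOR; the library's real value may differ.
constexpr std::size_t kAssumedChunkFloor = 1024;

// Returns `size` unchanged when it lies in [floor, UINT32_MAX],
// otherwise throws std::invalid_argument, mirroring the documented contract.
inline std::size_t checked_chunk_size(std::size_t size,
                                      std::size_t floor = kAssumedChunkFloor) {
    assert(floor <= UINT32_MAX);  // the floor itself must be a representable size
    if (size < floor || static_cast<std::uint64_t>(size) > UINT32_MAX) {
        throw std::invalid_argument("chunk size out of range");
    }
    return size;
}

// Convenience wrapper for callers that prefer a yes/no answer to an exception.
inline bool chunk_size_accepted(std::size_t size) {
    try {
        checked_chunk_size(size);
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

Under these assumptions, a 64 MB request such as `checked_chunk_size(64 * 1024 * 1024)` passes, while a value below the floor or above UINT32_MAX is rejected before any data would be read.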

◆ column_names()

CSVFormat & csv::CSVFormat::column_names ( const std::vector< std::string > &  names)

Sets the column names.

Note
Unsets any values set by header_row()

Definition at line 38 of file csv_format.cpp.

◆ column_names_policy()

CONSTEXPR_14 CSVFormat & csv::CSVFormat::column_names_policy ( ColumnNamePolicy  policy)
inline

Sets the column name lookup policy.

Parameters
[in] policy  Use ColumnNamePolicy::CASE_INSENSITIVE to allow case-insensitive column lookups via CSVRow::operator[] and CSVReader::index_of().

Definition at line 131 of file csv_format.hpp.
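To illustrate what a case-insensitive lookup policy means for callers, here is a small self-contained sketch. It does not use the library: `find_column` and `to_lower_ascii` are hypothetical helpers, the ASCII-only lowercasing is an assumption, and returning -1 for a missing column is this sketch's own convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lowercases ASCII letters only (assumption: column names are ASCII).
inline std::string to_lower_ascii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Returns the index of `wanted` in `names`, or -1 if it is absent.
// With `case_insensitive` set, both sides are normalized before comparing.
inline int find_column(const std::vector<std::string>& names,
                       const std::string& wanted,
                       bool case_insensitive) {
    const std::string key = case_insensitive ? to_lower_ascii(wanted) : wanted;
    for (std::size_t i = 0; i < names.size(); ++i) {
        const std::string candidate =
            case_insensitive ? to_lower_ascii(names[i]) : names[i];
        if (candidate == key) return static_cast<int>(i);
    }
    return -1;
}
```

With column names `{"Id", "Name"}`, looking up `"name"` finds index 1 only when the case-insensitive flag is set; an exact-match lookup misses it.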

◆ delimiter() [1/2]

CSVFormat & csv::CSVFormat::delimiter ( char  delim)

Sets the delimiter of the CSV file.

Passing a single delimiter disables delimiter inference. Header-row inference still runs unless header_row()/no_header() was set explicitly or column_names() was provided.

Exceptions
`std::runtime_error` thrown if trim, quote, or possible delimiting characters overlap

Definition at line 13 of file csv_format.cpp.
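The overlap rule in the Exceptions entry above says that no character may serve as more than one of delimiter, quote, or trim character. A minimal sketch of such a check follows; `require_disjoint_roles` and `roles_are_disjoint` are hypothetical names, and the library's own validation may be structured differently.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <vector>

// Throws std::runtime_error if any character plays more than one role.
inline void require_disjoint_roles(const std::vector<char>& delims,
                                   const std::vector<char>& trim_chars,
                                   char quote_char) {
    std::set<char> seen(delims.begin(), delims.end());  // delimiter candidates
    if (seen.count(quote_char)) {
        throw std::runtime_error("quote character is also a delimiter");
    }
    seen.insert(quote_char);
    for (char c : trim_chars) {
        if (seen.count(c)) {
            throw std::runtime_error(
                "trim character is also a delimiter or the quote character");
        }
    }
}

// Same check, reported as a boolean instead of an exception.
inline bool roles_are_disjoint(const std::vector<char>& delims,
                               const std::vector<char>& trim_chars,
                               char quote_char) {
    try {
        require_disjoint_roles(delims, trim_chars, quote_char);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

For example, trimming tabs while also offering tab as a delimiter candidate fails this check, which is the kind of configuration the documented exception guards against.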

◆ delimiter() [2/2]

CSVFormat & csv::CSVFormat::delimiter ( const std::vector< char > &  delim)

Sets a list of potential delimiters.

Passing multiple delimiters enables delimiter inference.

Exceptions
`std::runtime_error` thrown if trim, quote, or possible delimiting characters overlap

Definition at line 19 of file csv_format.cpp.
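The two delimiter() overloads differ in whether delimiter inference stays on. The stand-in below models only that behavior; `DelimiterChoice` is a hypothetical class, and tying guess_delim() to "more than one candidate" is an assumption drawn from the descriptions above rather than from the library source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in, not the library's class: it only tracks delimiter candidates.
class DelimiterChoice {
public:
    // A single delimiter leaves nothing to infer.
    DelimiterChoice& delimiter(char delim) {
        candidates_ = {delim};
        return *this;
    }

    // Several candidates mean the parser has to pick one from the data.
    DelimiterChoice& delimiter(const std::vector<char>& delims) {
        candidates_ = delims;
        return *this;
    }

    // Assumption: inference is needed exactly when more than one candidate remains.
    bool guess_delim() const { return candidates_.size() > 1; }

private:
    std::vector<char> candidates_{','};
};
```

Because each setter returns a reference to the object, calls chain in the same fluent style as the documented signatures, for example `DelimiterChoice().delimiter(';').guess_delim()`.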

◆ eager_field_classification()

CONSTEXPR_14 CSVFormat & csv::CSVFormat::eager_field_classification ( bool  enabled = true)
inline

Enable parser-time scalar classification for typed consumers.

Disabled by default so normal string-only parsing keeps the historical lazy classification cost model.

Definition at line 181 of file csv_format.hpp.

◆ guess_csv()

static CSVFormat csv::CSVFormat::guess_csv ( )
inline static

CSVFormat preset for delimiter inference with header/n_cols inference enabled.

Definition at line 229 of file csv_format.hpp.

◆ guess_delim()

bool csv::CSVFormat::guess_delim ( ) const
inline

Definition at line 241 of file csv_format.hpp.

◆ header_row()

CSVFormat & csv::CSVFormat::header_row ( int  row)

Sets the header row.

Parameters
[in] row  Row index containing column names; negative means no header row.
Note
Unsets any values set by column_names()

Definition at line 45 of file csv_format.cpp.

◆ no_header()

CSVFormat & csv::CSVFormat::no_header ( )
inline

Tells the parser that this CSV has no header row.

Note
Equivalent to header_row(-1)

Definition at line 102 of file csv_format.hpp.
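The notes on column_names(), header_row(), and no_header() describe settings that override one another: whichever is called last wins. The sketch below models that interaction with a hypothetical `HeaderConfig` struct; representing "unset" as a header index of -1 and an empty name list is an assumption of this sketch, not a statement about the library's internals.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in, not the library's class.
struct HeaderConfig {
    int header = 0;                  // row index of the header; negative means none
    std::vector<std::string> names;  // explicit column names, if any

    // Supplying names discards any header row chosen earlier.
    HeaderConfig& column_names(const std::vector<std::string>& n) {
        names = n;
        header = -1;  // assumption: "unset" is modeled as "no header row"
        return *this;
    }

    // Choosing a header row discards any names supplied earlier.
    HeaderConfig& header_row(int row) {
        header = row;
        names.clear();
        return *this;
    }

    // Documented as equivalent to header_row(-1).
    HeaderConfig& no_header() { return header_row(-1); }
};
```

In this model, `cfg.column_names({"id", "name"}).header_row(2)` ends with no explicit names and header row 2, while the reverse order ends with two names and no header row.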

◆ quote() [1/2]

CSVFormat & csv::CSVFormat::quote ( bool  use_quote)
inline

Turn quoting on or off.

Definition at line 108 of file csv_format.hpp.

◆ quote() [2/2]

CSVFormat & csv::CSVFormat::quote ( char  quote)

Sets the quote character.

Exceptions
`std::runtime_error` thrown if trim, quote, or possible delimiting characters overlap

Definition at line 25 of file csv_format.cpp.

◆ speculative_parallel_min_bytes()

CONSTEXPR_14 CSVFormat & csv::CSVFormat::speculative_parallel_min_bytes ( size_t  bytes)
inline

Set the minimum source size required for speculative parallel parsing.

Definition at line 171 of file csv_format.hpp.

◆ speculative_parallel_threads()

CONSTEXPR_14 CSVFormat & csv::CSVFormat::speculative_parallel_threads ( size_t  n_threads)
inline

Set the worker count used by speculative parallel parsing.

A value of 0 means "choose automatically" when the reader is created.

Definition at line 165 of file csv_format.hpp.

◆ threading()

CONSTEXPR_14 CSVFormat & csv::CSVFormat::threading ( bool  enabled = true)
inline

Enable or disable parser threading at runtime.

Threading is enabled by default when the library is compiled with CSV_ENABLE_THREADS=1. Disable it for workloads with many small CSVs where a background parser thread costs more than it helps.

When disabled, CSVReader parses synchronously on the caller thread and speculative parallel parsing is also disabled.

Definition at line 156 of file csv_format.hpp.

◆ trim()

CSVFormat & csv::CSVFormat::trim ( const std::vector< char > &  ws)

Sets the whitespace characters to be trimmed.

Exceptions
`std::runtime_error` thrown if trim, quote, or possible delimiting characters overlap

Definition at line 32 of file csv_format.cpp.

◆ variable_columns() [1/2]

CONSTEXPR_14 CSVFormat & csv::CSVFormat::variable_columns ( bool  policy)
inline

Tells the parser how to handle rows with a different number of columns than the others.

Definition at line 120 of file csv_format.hpp.

◆ variable_columns() [2/2]

CONSTEXPR_14 CSVFormat & csv::CSVFormat::variable_columns ( VariableColumnPolicy  policy = VariableColumnPolicy::IGNORE_ROW)
inline

Tells the parser how to handle rows with a different number of columns than the others.

Definition at line 114 of file csv_format.hpp.

Friends And Related Symbol Documentation

◆ internals::CSVParserCore

template<typename RowSink , typename ParsePolicy , typename FieldPolicy , typename RowPolicy >
friend class internals::CSVParserCore
friend

Definition at line 247 of file csv_format.hpp.

Member Data Documentation

◆ CSVReader

friend csv::CSVFormat::CSVReader

Definition at line 245 of file csv_format.hpp.


The documentation for this class was generated from the following files: