Vince's CSV Parser
Classes | |
| class | const_iterator |
| Row-wise const iterator over DataFrameRow entries. | |
| class | iterator |
| Row-wise iterator over DataFrameRow entries. | |
Public Types | |
| using | row_entry = std::pair< KeyType, CSVRow > |
| Type alias for internal row storage: pair of key and CSVRow. | |
| using | DuplicateKeyPolicy = DataFrameOptions::DuplicateKeyPolicy |
Public Member Functions | |
| DataFrame ()=default | |
| Construct an empty DataFrame. | |
| DataFrame (CSVReader &reader) | |
| Construct an unkeyed DataFrame from a CSV reader. | |
| DataFrame (CSVReader &reader, const DataFrameOptions &options) | |
| Construct a keyed DataFrame from a CSV reader with options. | |
| DataFrame (csv::string_view filename, const DataFrameOptions &options, CSVFormat format=CSVFormat::guess_csv()) | |
| Construct a keyed DataFrame directly from a CSV file. | |
| DataFrame (CSVReader &reader, const std::string &_key_column, DuplicateKeyPolicy policy=DuplicateKeyPolicy::OVERWRITE, bool throw_on_missing_key=true) | |
| Construct a keyed DataFrame using a column name as the key. | |
| template<typename KeyFunc , typename ResultType = invoke_result_t<KeyFunc, const CSVRow&>, csv::enable_if_t< std::is_convertible< ResultType, KeyType >::value, int > = 0> | |
| DataFrame (CSVReader &reader, KeyFunc key_func, DuplicateKeyPolicy policy=DuplicateKeyPolicy::OVERWRITE) | |
| Construct a keyed DataFrame using a custom key function. | |
| template<typename KeyFunc , typename ResultType = invoke_result_t<KeyFunc, const CSVRow&>, csv::enable_if_t< std::is_convertible< ResultType, KeyType >::value, int > = 0> | |
| DataFrame (CSVReader &reader, KeyFunc key_func, const DataFrameOptions &options) | |
| Construct a keyed DataFrame using a custom key function with options. | |
| size_t | size () const noexcept |
| Get the number of rows in the DataFrame. | |
| bool | empty () const noexcept |
| Check if the DataFrame is empty (has no rows). | |
| size_t | n_rows () const noexcept |
| Get the number of rows in the DataFrame. | |
| size_t | n_cols () const noexcept |
| Get the number of columns in the DataFrame. | |
| bool | has_column (const std::string &name) const |
| Check if a column exists in the DataFrame. | |
| int | index_of (const std::string &name) const |
| Get the index of a column by name. | |
| const std::vector< std::string > & | columns () const noexcept |
| Get the column names in order. | |
| const std::string & | key_name () const noexcept |
| Get the name of the key column (empty string if unkeyed). | |
| template<typename K = KeyType, csv::enable_if_t<!std::is_integral< K >::value, int > = 0> | |
| DataFrameRow< KeyType > | operator[] (size_t i) |
| Access a row by position (unchecked). | |
| template<typename K = KeyType, csv::enable_if_t<!std::is_integral< K >::value, int > = 0> | |
| DataFrameRow< KeyType > | operator[] (size_t i) const |
| Access a row by position (unchecked, const version). | |
| DataFrameRow< KeyType > | at (size_t i) |
| Access a row by position with bounds checking. | |
| DataFrameRow< KeyType > | at (size_t i) const |
| Access a row by position with bounds checking (const version). | |
| DataFrameRow< KeyType > | operator[] (const KeyType &key) |
| Access a row by its key. | |
| DataFrameRow< KeyType > | operator[] (const KeyType &key) const |
| Access a row by its key (const version). | |
| DataFrameRow< KeyType > | iloc (size_t i) |
| Access a row by position (iloc-style, pandas naming). | |
| DataFrameRow< KeyType > | iloc (size_t i) const |
| Access a row by position (const version). | |
| bool | try_get (size_t i, DataFrameRow< KeyType > &out) |
| Attempt to access a row by position without throwing. | |
| bool | try_get (size_t i, DataFrameRow< KeyType > &out) const |
| Attempt to access a row by position without throwing (const version). | |
| const KeyType & | key_at (size_t i) const |
| Get the key for a row at a given position. | |
| bool | contains (const KeyType &key) const |
| Check if a key exists in the DataFrame. | |
| DataFrameRow< KeyType > | at (const KeyType &key) |
| Access a row by its key, throwing if the key is not found. | |
| DataFrameRow< KeyType > | at (const KeyType &key) const |
| Access a row by its key, throwing if the key is not found (const version). | |
| bool | try_get (const KeyType &key, DataFrameRow< KeyType > &out) |
| Attempt to access a row by key without throwing. | |
| bool | try_get (const KeyType &key, DataFrameRow< KeyType > &out) const |
| Attempt to access a row by key without throwing (const version). | |
| std::string | get (const KeyType &key, const std::string &column) const |
| Get a cell value as a string, accounting for edits. | |
| void | set (const KeyType &key, const std::string &column, const std::string &value) |
| Set a cell value (stored in edit overlay). | |
| bool | erase_row (const KeyType &key) |
| Remove a row by its key. | |
| bool | erase_row_at (size_t i) |
| Remove a row by its position. | |
| void | set_at (size_t i, const std::string &column, const std::string &value) |
| Set a cell value by position (stored in edit overlay). | |
| template<typename T = std::string> | |
| std::vector< T > | column (const std::string &name) const |
| Extract all values from a column with type conversion. | |
| template<typename GroupFunc , typename GroupKey = invoke_result_t<GroupFunc, const CSVRow&>, csv::enable_if_t< internals::is_hashable< GroupKey >::value &&internals::is_equality_comparable< GroupKey >::value, int > = 0> | |
| std::unordered_map< GroupKey, std::vector< size_t > > | group_by (GroupFunc group_func) const |
| Group row positions using an arbitrary grouping function. | |
| std::unordered_map< std::string, std::vector< size_t > > | group_by (const std::string &name, bool use_edits=true) const |
| Group row positions by the value of a column. | |
| iterator | begin () |
| Get iterator to the first row. | |
| iterator | end () |
| Get iterator past the last row. | |
| const_iterator | begin () const |
| Get const iterator to the first row. | |
| const_iterator | end () const |
| Get const iterator past the last row. | |
| const_iterator | cbegin () const |
| Get const iterator to the first row (explicit). | |
| const_iterator | cend () const |
| Get const iterator past the last row (explicit). | |
Definition at line 190 of file data_frame.hpp.
| using csv::DataFrame< KeyType >::DuplicateKeyPolicy = DataFrameOptions::DuplicateKeyPolicy |
Definition at line 312 of file data_frame.hpp.
| using csv::DataFrame< KeyType >::row_entry = std::pair<KeyType, CSVRow> |
Type alias for internal row storage: pair of key and CSVRow.
Definition at line 193 of file data_frame.hpp.
| csv::DataFrame< KeyType >::DataFrame (CSVReader &reader) | inline explicit |
Construct an unkeyed DataFrame from a CSV reader.
Rows are accessible by position only.
Definition at line 321 of file data_frame.hpp.
| csv::DataFrame< KeyType >::DataFrame (CSVReader &reader, const DataFrameOptions &options) | inline explicit |
Construct a keyed DataFrame from a CSV reader with options.
Parameters
| reader | CSV reader to consume |
| options | Configuration including key column and duplicate policies |
Exceptions
| std::runtime_error | if key column is empty or not found |
Definition at line 332 of file data_frame.hpp.
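| csv::DataFrame< KeyType >::DataFrame (csv::string_view filename, const DataFrameOptions &options, CSVFormat format=CSVFormat::guess_csv()) | inline |
Construct a keyed DataFrame directly from a CSV file.
Parameters
| filename | Path to the CSV file |
| options | Configuration including key column and duplicate policies |
| format | CSV format specification (defaults to auto-detection) |
Exceptions
| std::runtime_error | if key column is empty or not found |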
Definition at line 344 of file data_frame.hpp.
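| csv::DataFrame< KeyType >::DataFrame (CSVReader &reader, const std::string &_key_column, DuplicateKeyPolicy policy=DuplicateKeyPolicy::OVERWRITE, bool throw_on_missing_key=true) | inline |
Construct a keyed DataFrame using a column name as the key.
Parameters
| reader | CSV reader to consume |
| _key_column | Name of the column to use as the key |
| policy | How to handle duplicate keys (default: OVERWRITE) |
| throw_on_missing_key | Whether to throw if a key value cannot be parsed (default: true) |
Exceptions
| std::runtime_error | if key column is empty or not found |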
Definition at line 362 of file data_frame.hpp.
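The effect of the duplicate-key policy can be illustrated with a small standalone model. This sketch does not use the library: `Policy`, `Entry`, and `build_keyed` are hypothetical names, only the two policies named on this page (OVERWRITE and THROW) are modeled, and the choice to keep the first row's position on overwrite is an assumption of the sketch rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the two policies named on this page.
enum class Policy { OVERWRITE, THROW };

using Row = std::vector<std::string>;
using Entry = std::pair<std::string, Row>;  // loosely mirrors row_entry: key + row

// Build keyed storage from rows, using column `key_col` as the key.
std::vector<Entry> build_keyed(const std::vector<Row>& rows,
                               std::size_t key_col, Policy policy) {
    std::vector<Entry> out;
    std::unordered_map<std::string, std::size_t> pos;  // key -> index in `out`
    for (const Row& r : rows) {
        const std::string& key = r.at(key_col);
        auto it = pos.find(key);
        if (it == pos.end()) {
            pos.emplace(key, out.size());
            out.emplace_back(key, r);
        } else if (policy == Policy::THROW) {
            throw std::runtime_error("duplicate key: " + key);
        } else {
            out[it->second].second = r;  // OVERWRITE: the later row wins
        }
    }
    return out;
}
```

With rows keyed `a`, `b`, `a`, the OVERWRITE branch of this model keeps two entries and the THROW branch raises `std::runtime_error` on the third row.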
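| csv::DataFrame< KeyType >::DataFrame (CSVReader &reader, KeyFunc key_func, DuplicateKeyPolicy policy=DuplicateKeyPolicy::OVERWRITE) | inline |
Construct a keyed DataFrame using a custom key function.
Parameters
| reader | CSV reader to consume |
| key_func | Function that extracts a key from each row |
| policy | How to handle duplicate keys (default: OVERWRITE) |
Exceptions
| std::runtime_error | if policy is THROW and duplicate keys are encountered |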
Definition at line 388 of file data_frame.hpp.
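| csv::DataFrame< KeyType >::DataFrame (CSVReader &reader, KeyFunc key_func, const DataFrameOptions &options) | inline |
Construct a keyed DataFrame using a custom key function with options.
Parameters
| reader | CSV reader to consume |
| key_func | Function that extracts a key from each row |
| options | Configuration for duplicate key policy |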
Definition at line 409 of file data_frame.hpp.
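| DataFrameRow< KeyType > csv::DataFrame< KeyType >::at (const KeyType &key) | inline |
Access a row by its key, throwing if the key is not found.
Parameters
| key | The row key to look up |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
| std::out_of_range | if the key is not found |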
Definition at line 640 of file data_frame.hpp.
| DataFrameRow< KeyType > csv::DataFrame< KeyType >::at (const KeyType &key) const | inline |
Access a row by its key, throwing if the key is not found (const version).
Definition at line 650 of file data_frame.hpp.
| DataFrameRow< KeyType > csv::DataFrame< KeyType >::at (size_t i) | inline |
Access a row by position with bounds checking.
Parameters
| i | Row index (0-based) |
Exceptions
| std::out_of_range | if index is out of bounds |
Definition at line 500 of file data_frame.hpp.
| DataFrameRow< KeyType > csv::DataFrame< KeyType >::at (size_t i) const | inline |
Access a row by position with bounds checking (const version).
Definition at line 510 of file data_frame.hpp.
| iterator csv::DataFrame< KeyType >::begin () | inline |
Get iterator to the first row.
Definition at line 897 of file data_frame.hpp.
| const_iterator csv::DataFrame< KeyType >::begin () const | inline |
Get const iterator to the first row.
Definition at line 903 of file data_frame.hpp.
| const_iterator csv::DataFrame< KeyType >::cbegin () const | inline |
Get const iterator to the first row (explicit).
Definition at line 909 of file data_frame.hpp.
| const_iterator csv::DataFrame< KeyType >::cend () const | inline |
Get const iterator past the last row (explicit).
Definition at line 912 of file data_frame.hpp.
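| std::vector< T > csv::DataFrame< KeyType >::column (const std::string &name) const | inline |
Extract all values from a column with type conversion.
Accounts for edited values in the overlay.
Template Parameters
| T | Target type for conversion (default: std::string) |
Parameters
| name | Column name |
Exceptions
| std::runtime_error | if column is not found |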
Definition at line 800 of file data_frame.hpp.
| const std::vector< std::string > & csv::DataFrame< KeyType >::columns () const | inline noexcept |
Get the column names in order.
Definition at line 455 of file data_frame.hpp.
| bool csv::DataFrame< KeyType >::contains (const KeyType &key) const | inline |
Check if a key exists in the DataFrame.
Parameters
| key | The key to check |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
Definition at line 626 of file data_frame.hpp.
| bool csv::DataFrame< KeyType >::empty () const | inline noexcept |
Check if the DataFrame is empty (has no rows).
Definition at line 421 of file data_frame.hpp.
| iterator csv::DataFrame< KeyType >::end () | inline |
Get iterator past the last row.
Definition at line 900 of file data_frame.hpp.
| const_iterator csv::DataFrame< KeyType >::end () const | inline |
Get const iterator past the last row.
Definition at line 906 of file data_frame.hpp.
| bool csv::DataFrame< KeyType >::erase_row (const KeyType &key) | inline |
Remove a row by its key.
Parameters
| key | The row key to remove |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
Definition at line 741 of file data_frame.hpp.
| bool csv::DataFrame< KeyType >::erase_row_at (size_t i) | inline |
Remove a row by its position.
Parameters
| i | Row index (0-based) |
Definition at line 762 of file data_frame.hpp.
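| std::string csv::DataFrame< KeyType >::get (const KeyType &key, const std::string &column) const | inline |
Get a cell value as a string, accounting for edits.
Parameters
| key | The row key |
| column | The column name |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
| std::out_of_range | if the key is not found |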
Definition at line 705 of file data_frame.hpp.
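| std::unordered_map< std::string, std::vector< size_t > > csv::DataFrame< KeyType >::group_by (const std::string &name, bool use_edits=true) const | inline |
Group row positions by the value of a column.
Parameters
| name | Column to group by |
| use_edits | If true, use edited values when present (default: true) |
Exceptions
| std::runtime_error | if column is not found |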
Definition at line 861 of file data_frame.hpp.
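| std::unordered_map< GroupKey, std::vector< size_t > > csv::DataFrame< KeyType >::group_by (GroupFunc group_func) const | inline |
Group row positions using an arbitrary grouping function.
Template Parameters
| GroupFunc | Callable that takes a CSVRow and returns a hashable key |
Parameters
| group_func | Function to extract group key from each row |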
Definition at line 842 of file data_frame.hpp.
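The shape of the grouping result (group key mapped to a list of row positions) can be sketched without the library. In the model below, `Row` and `group_positions` are hypothetical names and rows are plain string vectors rather than CSVRow objects; it is meant only to show what "group row positions" produces, not how the library computes it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>
#include <vector>

using Row = std::vector<std::string>;  // stand-in for a parsed CSV row

// Group row positions by whatever key `group_func` returns for each row.
// GroupKey must be hashable and equality comparable to serve as a map key.
template <typename GroupFunc,
          typename GroupKey = typename std::decay<
              decltype(std::declval<GroupFunc&>()(std::declval<const Row&>()))>::type>
std::unordered_map<GroupKey, std::vector<std::size_t>>
group_positions(const std::vector<Row>& rows, GroupFunc group_func) {
    std::unordered_map<GroupKey, std::vector<std::size_t>> groups;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        groups[group_func(rows[i])].push_back(i);  // positions stay in row order
    }
    return groups;
}
```

Calling `group_positions(rows, [](const Row& r) { return r[0]; })` groups by the first field; each mapped vector holds positions that could then be passed to a positional accessor such as iloc().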
| bool csv::DataFrame< KeyType >::has_column (const std::string &name) const | inline |
Check if a column exists in the DataFrame.
Parameters
| name | Column name to check |
Definition at line 437 of file data_frame.hpp.
| DataFrameRow< KeyType > csv::DataFrame< KeyType >::iloc (size_t i) | inline |
Access a row by position (iloc-style, pandas naming).
Parameters
| i | Row index (0-based) |
Exceptions
| std::out_of_range | if index is out of bounds |
Definition at line 553 of file data_frame.hpp.
| DataFrameRow< KeyType > csv::DataFrame< KeyType >::iloc (size_t i) const | inline |
Access a row by position (const version).
Definition at line 563 of file data_frame.hpp.
| int csv::DataFrame< KeyType >::index_of (const std::string &name) const | inline |
Get the index of a column by name.
Parameters
| name | Column name to find |
Definition at line 447 of file data_frame.hpp.
| const KeyType & csv::DataFrame< KeyType >::key_at (size_t i) const | inline |
Get the key for a row at a given position.
Parameters
| i | Row index (0-based) |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
| std::out_of_range | if index is out of bounds |
Definition at line 614 of file data_frame.hpp.
| const std::string & csv::DataFrame< KeyType >::key_name () const | inline noexcept |
Get the name of the key column (empty string if unkeyed).
Definition at line 460 of file data_frame.hpp.
| size_t csv::DataFrame< KeyType >::n_cols () const | inline noexcept |
Get the number of columns in the DataFrame.
Definition at line 429 of file data_frame.hpp.
| size_t csv::DataFrame< KeyType >::n_rows () const | inline noexcept |
Get the number of rows in the DataFrame.
Alias for size().
Definition at line 426 of file data_frame.hpp.
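| DataFrameRow< KeyType > csv::DataFrame< KeyType >::operator[] (const KeyType &key) | inline |
Access a row by its key.
Parameters
| key | The row key to look up |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
| std::out_of_range | if the key is not found |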
Definition at line 527 of file data_frame.hpp.
| DataFrameRow< KeyType > csv::DataFrame< KeyType >::operator[] (const KeyType &key) const | inline |
Access a row by its key (const version).
Definition at line 537 of file data_frame.hpp.
| DataFrameRow< KeyType > csv::DataFrame< KeyType >::operator[] (size_t i) | inline |
Access a row by position (unchecked).
Parameters
| i | Row index (0-based) |
Exceptions
| std::out_of_range | if index is out of bounds (via std::vector::at) |
Definition at line 477 of file data_frame.hpp.
| DataFrameRow< KeyType > csv::DataFrame< KeyType >::operator[] (size_t i) const | inline |
Access a row by position (unchecked, const version).
Disabled when KeyType is an integral type; use iloc() instead.
Definition at line 487 of file data_frame.hpp.
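Why the positional operator[] is disabled for integral key types can be shown with a reduced model. With an integral KeyType, an expression like `frame[0]` would otherwise have to choose between a position and a key. The sketch below applies the same enable_if constraint listed in the member summary to a hypothetical `Frame` type whose operators merely report which overload was selected; it is not the library's class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical reduced model of the two operator[] overloads.
template <typename KeyType>
struct Frame {
    // Positional overload: removed from overload resolution when KeyType is
    // integral, so an integer subscript always means "key".
    template <typename K = KeyType,
              typename std::enable_if<!std::is_integral<K>::value, int>::type = 0>
    std::string operator[](std::size_t) const { return "position"; }

    // Key overload: always available.
    std::string operator[](const KeyType&) const { return "key"; }
};
```

In this model, `Frame<std::string>` accepts both a `std::size_t` position and a string key, while `Frame<int>` resolves `frame[0]` to the key overload only, which is why positional access on such a frame goes through iloc() or at(size_t).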
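| void csv::DataFrame< KeyType >::set (const KeyType &key, const std::string &column, const std::string &value) | inline |
Set a cell value (stored in edit overlay).
Parameters
| key | The row key |
| column | The column name |
| value | The new value as a string |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
| std::out_of_range | if the key is not found |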
Definition at line 728 of file data_frame.hpp.
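The phrase "stored in edit overlay" suggests that edits are kept apart from the parsed row data and consulted on read. A minimal standalone model of that idea follows. `OverlayTable` and its members are hypothetical, and the storage layout (a map keyed by row and column index) is an assumption of this sketch, not a description of the library's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: parsed rows stay untouched, edits live in a side map.
class OverlayTable {
public:
    OverlayTable(std::vector<std::string> cols,
                 std::vector<std::vector<std::string>> rows)
        : cols_(std::move(cols)), rows_(std::move(rows)) {}

    // Record an edit without modifying the original row data.
    void set_at(std::size_t i, const std::string& column, const std::string& value) {
        if (i >= rows_.size()) throw std::out_of_range("row index out of bounds");
        edits_[std::make_pair(i, index_of(column))] = value;
    }

    // Read a cell, preferring an edited value when one exists.
    std::string get_at(std::size_t i, const std::string& column) const {
        const std::size_t c = index_of(column);
        auto it = edits_.find(std::make_pair(i, c));
        if (it != edits_.end()) return it->second;
        return rows_.at(i).at(c);
    }

    // Read the value as originally parsed, ignoring edits.
    const std::string& original_at(std::size_t i, const std::string& column) const {
        return rows_.at(i).at(index_of(column));
    }

private:
    std::size_t index_of(const std::string& column) const {
        for (std::size_t c = 0; c < cols_.size(); ++c)
            if (cols_[c] == column) return c;
        throw std::runtime_error("column not found: " + column);
    }

    std::vector<std::string> cols_;
    std::vector<std::vector<std::string>> rows_;
    std::map<std::pair<std::size_t, std::size_t>, std::string> edits_;
};
```

The separation is what makes a flag like `use_edits` in group_by() meaningful: a reader can choose between the edited view and the originally parsed values.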
| void csv::DataFrame< KeyType >::set_at (size_t i, const std::string &column, const std::string &value) | inline |
Set a cell value by position (stored in edit overlay).
Parameters
| i | Row index (0-based) |
| column | The column name |
| value | The new value as a string |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
| std::out_of_range | if index is out of bounds |
Definition at line 780 of file data_frame.hpp.
| size_t csv::DataFrame< KeyType >::size () const | inline noexcept |
Get the number of rows in the DataFrame.
Definition at line 416 of file data_frame.hpp.
| bool csv::DataFrame< KeyType >::try_get (const KeyType &key, DataFrameRow< KeyType > &out) | inline |
Attempt to access a row by key without throwing.
Parameters
| key | The row key to look up |
| out | Output parameter that receives the DataFrameRow if found |
Exceptions
| std::runtime_error | if the DataFrame was not created with a key column |
Definition at line 667 of file data_frame.hpp.
| bool csv::DataFrame< KeyType >::try_get (const KeyType &key, DataFrameRow< KeyType > &out) const | inline |
Attempt to access a row by key without throwing (const version).
Definition at line 682 of file data_frame.hpp.
| bool csv::DataFrame< KeyType >::try_get (size_t i, DataFrameRow< KeyType > &out) | inline |
Attempt to access a row by position without throwing.
Parameters
| i | Row index (0-based) |
| out | Output parameter that receives the DataFrameRow if found |
Definition at line 579 of file data_frame.hpp.
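The difference between the throwing accessors (at, iloc) and try_get can be sketched as two free functions over a plain vector. `row_at` and `try_row_at` are hypothetical names, and leaving `out` untouched on failure is an assumption of this sketch; the page only states that `out` receives the row if found.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

using Row = std::vector<std::string>;  // stand-in for a row object

// Throwing access: an out-of-range index is reported as std::out_of_range.
const Row& row_at(const std::vector<Row>& rows, std::size_t i) {
    if (i >= rows.size()) throw std::out_of_range("row index out of bounds");
    return rows[i];
}

// Non-throwing access: reports success through the return value and
// writes to `out` only when the index is valid.
bool try_row_at(const std::vector<Row>& rows, std::size_t i, Row& out) {
    if (i >= rows.size()) return false;
    out = rows[i];
    return true;
}
```

The non-throwing form suits lookups where a miss is an expected outcome, since the caller branches on the returned bool instead of wrapping the call in a try block.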
| bool csv::DataFrame< KeyType >::try_get (size_t i, DataFrameRow< KeyType > &out) const | inline |
Attempt to access a row by position without throwing (const version).
Definition at line 593 of file data_frame.hpp.