Vince's CSV Parser
Vince's CSV Library

This is the detailed documentation for Vince's CSV library. For quick examples, go to this project's GitHub page.

Python users should start with the fastpycsv Python bindings.

Outline

CSV Reading

See also: CSV Tuning

Working with parsed data

See also: Scalar Conversion Reference

DataFrame

An in-memory keyed table built from a csv::CSVReader. Supports O(1) key lookup, column extraction, editing, and grouping.

  • csv::DataFrame: Main container class (template parameter is the key type, default std::string)
    • Construction
    • Inspection
      • csv::DataFrame::n_rows()
      • csv::DataFrame::n_cols()
      • csv::DataFrame::size()
      • csv::DataFrame::empty()
      • csv::DataFrame::has_column()
      • csv::DataFrame::get_col_names()
    • Row access by key
      • csv::DataFrame::operator[](): Access by key value (O(1))
      • csv::DataFrame::contains(): Check if a key exists
      • csv::DataFrame::try_get(): Non-throwing keyed lookup
    • Row access by position
      • csv::DataFrame::iloc(): Positional access by index (use instead of operator[](size_t) for integer-keyed DataFrames)
      • csv::DataFrame::try_get(): Non-throwing positional lookup (overloaded on size_t)
    • Column extraction
      • csv::DataFrame::column(): Extract all values from a named column as std::vector<T>
    • Editing
      • csv::DataFrame::set(): Edit a cell by key and column name
      • csv::DataFrame::set_at(): Edit a cell by position and column name
      • csv::DataFrame::erase_row(): Remove a row by key
      • csv::DataFrame::erase_row_at(): Remove a row by position
    • Grouping
      • csv::DataFrame::group_by(): Group row indices by an arbitrary key function or column name
    • Iteration
      • csv::DataFrame::begin()
      • csv::DataFrame::end()
  • csv::DataFrameRow: Proxy row object returned by DataFrame access methods
    • csv::DataFrameRow::operator[](): Access a field by column name
    • csv::DataFrameRow::size()
    • csv::DataFrameRow::empty()
    • csv::DataFrameRow::get_col_names()
  • csv::DataFrameOptions: Configuration for DataFrame construction
    • Duplicate key policy (THROW, OVERWRITE, or KEEP_FIRST)
    • set_key_column(): Specify which column to use as the key
    • set_throw_on_missing_key(): Control exception behavior for missing keys
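
To make the ideas in the outline above concrete, here is a minimal, self-contained sketch of a keyed table with O(1) average key lookup, positional access, a duplicate key policy, and grouping of row indices. It is not the library's implementation: MiniFrame, Row, and add() are hypothetical names invented for illustration, and only the method names n_rows(), contains(), try_get(), iloc(), and group_by() are borrowed from the outline. Their real signatures may differ, so check the csv::DataFrame reference before relying on any of this.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration only: none of these types belong to namespace csv.
enum class DuplicatePolicy { THROW, OVERWRITE, KEEP_FIRST };

// One row: column name -> field text.
using Row = std::unordered_map<std::string, std::string>;

class MiniFrame {
public:
    explicit MiniFrame(DuplicatePolicy policy) : policy_(policy) {}

    // Insert a row under `key`, resolving duplicate keys per the policy.
    void add(const std::string& key, Row row) {
        auto it = index_.find(key);
        if (it == index_.end()) {
            index_.emplace(key, rows_.size());
            rows_.push_back(std::move(row));
            return;
        }
        assert(it->second < rows_.size());  // index_ and rows_ stay in sync
        switch (policy_) {
            case DuplicatePolicy::THROW:
                throw std::runtime_error("duplicate key: " + key);
            case DuplicatePolicy::OVERWRITE:
                rows_[it->second] = std::move(row);
                break;
            case DuplicatePolicy::KEEP_FIRST:
                break;  // silently keep the row that arrived first
        }
    }

    std::size_t n_rows() const { return rows_.size(); }

    bool contains(const std::string& key) const {
        return index_.find(key) != index_.end();
    }

    // Keyed lookup: one hash probe, then a vector index. Throws if missing.
    const Row& at(const std::string& key) const {
        return rows_.at(index_.at(key));
    }

    // Non-throwing keyed lookup: nullptr when the key is absent.
    const Row* try_get(const std::string& key) const {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : &rows_[it->second];
    }

    // Positional lookup, independent of the key type.
    const Row& iloc(std::size_t i) const { return rows_.at(i); }

    // Group row positions by the value found in `column`.
    std::map<std::string, std::vector<std::size_t>>
    group_by(const std::string& column) const {
        std::map<std::string, std::vector<std::size_t>> groups;
        for (std::size_t i = 0; i < rows_.size(); ++i) {
            auto it = rows_[i].find(column);
            if (it != rows_[i].end()) groups[it->second].push_back(i);
        }
        return groups;
    }

private:
    DuplicatePolicy policy_;
    std::vector<Row> rows_;                               // insertion order
    std::unordered_map<std::string, std::size_t> index_;  // key -> position
};
```

Keeping rows in a vector and storing only positions in the hash map is what lets keyed and positional access coexist. It also shows why a separate positional accessor such as iloc() is useful when the key type is itself an integer: a plain index operator could not tell a key from a position.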

ETL Utilities

See also: High Performance ETL

Deep Dives

Implementation notes and internal architecture documents are collected under Internal / Deep Dives.

CSV Writing

The CSV Writing Guide contains a high-level overview of writing CSVs.

Frequently Asked Questions

How does automatic starting row detection work?

See "How does automatic delimiter detection work?"

How does automatic delimiter detection work?

See the implementation in csv::internals::_guess_format(). The source is the authoritative reference and is kept up to date.
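
For readers who want some intuition before opening the source, the sketch below shows one plausible heuristic for guessing both the delimiter and the starting row from a sample of text. It is an assumption for illustration, not a transcription of csv::internals::_guess_format(), and the names FormatGuess, field_counts(), and guess_format_sketch() are hypothetical. It also ignores quoting, which a real parser cannot do.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type; not part of the library.
struct FormatGuess {
    char delimiter;
    std::size_t header_row;
};

// Count fields on each line for one candidate delimiter.
// Naive on purpose: quoted fields containing the delimiter are miscounted.
static std::vector<std::size_t> field_counts(const std::string& text, char delim) {
    std::vector<std::size_t> counts;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::size_t n = 1;
        for (char c : line) {
            if (c == delim) ++n;
        }
        counts.push_back(n);
    }
    return counts;
}

// For each candidate, find the most common row width and score the candidate
// as (width * number of rows with that width). The best score wins, and the
// header is taken to be the first row that has the winning width.
FormatGuess guess_format_sketch(const std::string& sample,
                                const std::vector<char>& candidates) {
    FormatGuess best{',', 0};  // fallback if no candidate ever splits a line
    std::size_t best_score = 0;

    for (char delim : candidates) {
        const std::vector<std::size_t> counts = field_counts(sample, delim);

        std::map<std::size_t, std::size_t> freq;  // width -> rows with that width
        for (std::size_t n : counts) ++freq[n];

        std::size_t mode_width = 0, mode_rows = 0;
        for (const auto& kv : freq) {
            if (kv.second > mode_rows) {
                mode_width = kv.first;
                mode_rows = kv.second;
            }
        }

        // Width 1 means the delimiter never split the typical line.
        const std::size_t score = mode_width * mode_rows;
        if (mode_width > 1 && score > best_score) {
            std::size_t header = 0;
            while (header < counts.size() && counts[header] != mode_width) ++header;
            best = FormatGuess{delim, header};
            best_score = score;
        }
    }
    return best;
}
```

Under this heuristic the two questions share one answer, because the starting row falls out of the same row-width tally that selects the delimiter: a leading comment line has a different width from the data rows, so it is skipped.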

Is the CSV parser thread-safe?

The library already uses threads internally to improve parsing throughput. Users who want to add their own concurrency on top of that should follow these guidelines: