This is the detailed documentation for Vince's CSV library. For quick examples, go to this project's GitHub page.
Python users should start with the fastpycsv Python bindings.
Outline
CSV Reading
- csv::CSVFormat: Stores information about how to parse a CSV file.
- csv::CSVReader
- Convenience Functions
- File Utilities
See also
CSV Tuning
Working with parsed data
- csv::CSVRow: Data structure for representing CSV rows.
- csv::CSVRow::operator std::vector<std::string>()
- csv::CSVRow::iterator
- csv::CSVRow::to_json()
- csv::CSVRow::to_json_array()
- csv::CSVField
See also
Scalar Conversion Reference
DataFrame
An in-memory keyed table built from a csv::CSVReader. Supports O(1) key lookup, column extraction, editing, and grouping.
- csv::DataFrame: Main container class (template parameter is the key type, default std::string)
- Construction
- Inspection
- Row access by key
- Row access by position
- csv::DataFrame::iloc(): Positional access by index (use instead of operator[](size_t) for integer-keyed DataFrames)
- csv::DataFrame::try_get(): Non-throwing positional lookup (overloaded on size_t)
- Column extraction
- Editing
- csv::DataFrame::set(): Edit a cell by key and column name
- csv::DataFrame::set_at(): Edit a cell by position and column name
- csv::DataFrame::erase_row(): Remove a row by key
- csv::DataFrame::erase_row_at(): Remove a row by position
- Grouping
- Iteration
- csv::DataFrameRow: Proxy row object returned by DataFrame access methods
- csv::DataFrameOptions: Configuration for DataFrame construction
- Duplicate key policy (THROW, OVERWRITE, or KEEP_FIRST)
- set_key_column(): Specify which column to use as the key
- set_throw_on_missing_key(): Control exception behavior for missing keys
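The three duplicate key policies are easiest to compare side by side. The sketch below is not the library's implementation: `DuplicatePolicy`, `Row`, and `build_index()` are hypothetical stand-ins built on std::unordered_map, and the behavior shown is the usual meaning of the policy names (an assumption to confirm against the csv::DataFrameOptions reference).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the library's types.
enum class DuplicatePolicy { THROW, OVERWRITE, KEEP_FIRST };
using Row = std::vector<std::string>;

// Build a key -> row table from (key, row) pairs, applying the chosen policy
// whenever a key is seen for the second time.
inline std::unordered_map<std::string, Row> build_index(
    const std::vector<std::pair<std::string, Row>>& records,
    DuplicatePolicy policy) {
    std::unordered_map<std::string, Row> index;
    for (const auto& rec : records) {
        auto it = index.find(rec.first);
        if (it == index.end()) {  // first time this key appears
            index.emplace(rec.first, rec.second);
            continue;
        }
        switch (policy) {
        case DuplicatePolicy::THROW:       // reject the input outright
            throw std::runtime_error("duplicate key: " + rec.first);
        case DuplicatePolicy::OVERWRITE:   // the last row seen wins
            it->second = rec.second;
            break;
        case DuplicatePolicy::KEEP_FIRST:  // the first row seen wins
            break;
        }
    }
    return index;
}
```

Under these assumptions, feeding the keys a, b, a into `build_index()` leaves the third row under "a" with OVERWRITE, the first row with KEEP_FIRST, and throws std::runtime_error with THROW.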
ETL Utilities
See also
High Performance ETL
Deep Dives
Implementation notes and internal architecture documents are collected under Internal / Deep Dives.
CSV Writing
The CSV Writing Guide contains a high-level overview of writing CSVs.
Frequently Asked Questions
How does automatic starting row detection work?
See "How does automatic delimiter detection work?"
How does automatic delimiter detection work?
See the implementation in csv::internals::_guess_format() — the source is the authoritative reference and is kept up to date.
Is the CSV parser thread-safe?
The library already uses threads internally to squeeze performance from your CPU. Users who want to experiment with their own threading on top of it should follow these guidelines:
- csv::CSVReader::iterator should only be used from one thread
- A workaround is to chunk blocks of CSVRow objects together and create separate threads to process each column
- csv::CSVRow may be safely processed from multiple threads
- csv::CSVField objects should only be read from one thread at a time
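The guidelines above can be sketched as follows. To keep the example self-contained, `Row` is a stand-in (a vector of strings) for csv::CSVRow, and `sum_columns_in_block()` is a hypothetical helper, not part of the library. It assumes a single reader thread has already collected a block of rows; one worker thread is then created per column, so each field is only ever read by one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Stand-in for a parsed row; with the real library this role would be played
// by csv::CSVRow objects collected from a single reader thread.
using Row = std::vector<std::string>;

// Sum every integer column of a block of rows, using one thread per column.
// Each worker touches only the fields of its own column and writes only to
// its own slot in `sums`, so no locking is needed.
inline std::vector<long long> sum_columns_in_block(const std::vector<Row>& block,
                                                   std::size_t n_columns) {
    // Check row widths up front: an exception escaping a worker thread
    // would terminate the program.
    for (const Row& row : block) assert(row.size() >= n_columns);

    std::vector<long long> sums(n_columns, 0);
    std::vector<std::thread> workers;
    workers.reserve(n_columns);

    for (std::size_t col = 0; col < n_columns; ++col) {
        workers.emplace_back([&block, &sums, col] {
            long long s = 0;
            for (const Row& row : block) s += std::stoll(row[col]);
            sums[col] = s;
        });
    }
    for (auto& w : workers) w.join();  // wait before the block goes out of scope
    return sums;
}
```

In a real program the block would be refilled from the reader after each call, keeping all use of csv::CSVReader::iterator on the one thread that owns it. The sketch also assumes every field parses as an integer; std::stoll throws otherwise.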