# Quickstart

`fastpycsv` is centered around four operations:

- `reader()` for lazy row iteration
- `read_numpy()` for eager selected-column NumPy export
- `read_numpy_batches()` for bounded-memory NumPy export
- `write_csv()` for streaming CSV output

## Read Rows

`fastpycsv.reader()` returns lazy, list-like row objects. By default, the first row is consumed as column names, so ordinary ETL code can use string indexing without building a dictionary for every row.

```python
import fastpycsv

for row in fastpycsv.reader("vehicles.csv"):
    if row["region"] == "el paso":
        print(row["price"])
```

Rows can be indexed by position or column name:

```python
row = next(fastpycsv.reader("vehicles.csv"))
row[0]
row["price"]
len(row)
```

Use explicit materialization only when the downstream API needs normal Python objects:

```python
reader = fastpycsv.reader("vehicles.csv")
rows = reader.lists(["id", "price", "year"]).all()
```

For bounded memory, stream materialized batches:

```python
for rows in fastpycsv.reader("vehicles.csv").dicts(["id", "price"]).chunks(50_000):
    send_to_api(rows)
```

## Export NumPy Arrays

Use `read_numpy()` when the target is pandas, NumPy, or another column-oriented consumer:

```python
import fastpycsv
import pandas as pd

arrays = fastpycsv.read_numpy("vehicles.csv", columns=["price", "year", "odometer"])
frame = pd.DataFrame(arrays)
```

Use `read_numpy_batches()` when the file is large enough that peak memory matters:

```python
for arrays in fastpycsv.read_numpy_batches(
    "vehicles.csv",
    columns=["price", "year", "odometer"],
    schema="sample",
):
    process(arrays)
```
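
Since `pd.DataFrame(arrays)` works above, each batch is presumably a mapping from column name to NumPy array. Under that assumption, here is a minimal sketch of a bounded-memory aggregate (the mean `price`) computed one batch at a time:

```python
import numpy as np

import fastpycsv

# Keep only running totals so peak memory stays at one batch.
# Assumes each batch maps column names to NumPy arrays and that
# "price" comes back with a numeric dtype.
total = 0.0
count = 0
for arrays in fastpycsv.read_numpy_batches(
    "vehicles.csv",
    columns=["price"],
    schema="sample",
):
    price = arrays["price"]
    total += float(np.sum(price))
    count += price.shape[0]

print(total / count if count else float("nan"))
```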

## Filter With Native Predicates

Python predicates are fine for flexible business logic. For common comparisons, native predicates avoid repeated Python callbacks:

```python
predicate = fastpycsv.all_of(
    fastpycsv.equal("manufacturer", "ford", case_sensitive=False),
    fastpycsv.less("price", 10_000),
)

arrays = fastpycsv.read_numpy(
    "vehicles.csv",
    columns=["region", "price", "year", "odometer"],
    predicate=predicate,
)
```

Chaining `reader.filter(...)` combines native predicates with `all_of()` by default. Use `append=False` when a later filter should replace the earlier one.
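
A minimal sketch of that chaining behavior, assuming `filter()` returns the reader so calls can be chained:

```python
import fastpycsv

reader = fastpycsv.reader("vehicles.csv")

# These two filters combine as all_of(): cheap Fords only.
reader = reader.filter(fastpycsv.equal("manufacturer", "ford", case_sensitive=False))
reader = reader.filter(fastpycsv.less("price", 10_000))

# append=False discards the earlier filters and keeps only this one.
reader = reader.filter(fastpycsv.less("odometer", 100_000), append=False)
```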

## Write CSV Output

`write_csv()` accepts a path or text file-like object, plus lazy rows, dictionaries, lists, tuples, and other Python iterables. Fields are stringified before writing; `None` becomes an empty CSV field.

```python
reader = fastpycsv.reader("vehicles.csv")

fastpycsv.write_csv(
    "cheap_el_paso_fords.csv",
    (row for row in reader if row["region"] == "el paso" and row["manufacturer"] == "ford"),
    fieldnames=["id", "price", "year", "region"],
)

with open("cheap_el_paso_fords.csv", "w", newline="", encoding="utf-8") as out:
    fastpycsv.write_csv(out, [["id", "price"], [1, 9000]], write_header=False)
```
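
Because `write_csv()` accepts text file-like objects, the `None`-to-empty-field behavior is easy to check in memory with `io.StringIO`. A small sketch, assuming the header row is written by default and using made-up column names:

```python
import io

import fastpycsv

buf = io.StringIO()
fastpycsv.write_csv(buf, [[1, None, "x"]], fieldnames=["id", "price", "note"])

# After the header row, None shows up as an empty field: "1,,x"
print(buf.getvalue())
```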

## Installation And Local Builds

Install from the repository root while developing:

```powershell
python -m pip install -e E:\GitHub\csv-parser
```

Or build the native extension directly:

```powershell
cmake -S . -B build/fastpycsv -DBUILD_PYTHON=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build/fastpycsv --target fastpycsv --config Release
```

For an existing top-level build, the `fastpycsv` target bootstraps its own Python build tree:

```powershell
cmake --build build/x64-Release --target fastpycsv --config Release
```