Read GeoParquet with DuckDB
Query Land Records .parquet exports with DuckDB's spatial extension — including row-group pruning, columnar projection, and direct reads from object storage.
GeoParquet is the right choice for analytical workloads: columnar storage, row-group statistics for predicate pushdown, and dramatically smaller files than GeoPackage at equivalent fidelity. DuckDB reads it natively when the spatial extension is loaded.
Install / load extensions
INSTALL spatial;
LOAD spatial;parquet is built in; spatial adds GeoParquet metadata handling and the ST_* functions.
1. Inspect
DuckDB recognizes GeoParquet via the geo metadata block embedded in the file:
SELECT * FROM parquet_metadata('parcels-2026.2.parquet') LIMIT 5;
SELECT * FROM parquet_schema('parcels-2026.2.parquet');You'll see one row per row-group. Each row-group carries min/max statistics for every column — DuckDB uses these to skip entire row-groups when your WHERE clause is selective enough.
2. Query directly
SELECT lrid, ownername, parceladdr, ST_AsText(geom) AS geom_wkt
FROM read_parquet('parcels-2026.2.parquet')
WHERE statefp = '01' AND countyfp = '003'
LIMIT 10;The Land Records publisher writes files sorted by (statefp, countyfp) and tunes the row-group size so a single-county filter typically reads only a few row-groups. State-level filters read a few hundred.
3. Aggregations (no geometry needed)
When you don't reference geom, DuckDB only reads the columns you ask for — orders of magnitude faster than loading the same data from GeoPackage:
SELECT
statefp,
parcelstate,
COUNT(*) AS parcels,
SUM(calcarea) / 1e6 AS total_km2
FROM read_parquet('parcels-2026.2.parquet')
GROUP BY 1, 2
ORDER BY parcels DESC;This typically completes in seconds on the nationwide file because no row-group containing only irrelevant columns is fetched at all.
4. Spatial filters
geom is stored as WKB. ST_GeomFromWKB reconstructs it; with LOAD spatial DuckDB also recognizes the GeoParquet geo metadata, so ST_* functions accept the column directly:
SELECT lrid, ownername, calcarea
FROM read_parquet('parcels-2026.2.parquet')
WHERE ST_DWithin_Spheroid(
geom,
ST_Point(-86.81, 33.52),
5 * 1609.344
);For a single-point lookup against the nationwide file, combine a bounding-box prefilter (which prunes row-groups via the bbox column statistics) with the exact geographic predicate:
WITH near AS (
SELECT lrid, ownername, calcarea, geom
FROM read_parquet('parcels-2026.2.parquet')
WHERE centroidx BETWEEN -86.9 AND -86.7
AND centroidy BETWEEN 33.4 AND 33.6
)
SELECT lrid, ownername, calcarea
FROM near
WHERE ST_DWithin_Spheroid(geom, ST_Point(-86.81, 33.52), 5 * 1609.344);centroidx and centroidy are stored as scalars in the file, so DuckDB can prune row-groups using their per-row-group min/max stats before it ever decodes any geometry.
5. Reading from object storage
DuckDB can query Parquet directly from GCS or S3 — no download step:
INSTALL httpfs;
LOAD httpfs;
SELECT count(*)
FROM read_parquet('s3://my-bucket/landrecords/parcels-2026.2.parquet')
WHERE statefp = '06'; -- CaliforniaFor GCS, point httpfs at gs://... with a configured SECRET:
CREATE SECRET (
TYPE GCS,
KEY_ID '<access-key>',
SECRET '<secret-key>'
);
SELECT count(*) FROM read_parquet('gs://my-bucket/landrecords/parcels-2026.2.parquet');DuckDB issues Range HTTP requests for only the row-groups that pass the WHERE clause — querying a single state from the nationwide file transfers a few hundred MB rather than the full file.
6. Materialize for repeated heavy use
ATTACH 'landrecords.duckdb' AS lr;
CREATE TABLE lr.parcels AS
SELECT * FROM read_parquet('parcels-2026.2.parquet');
CREATE INDEX parcels_geom_rtree ON lr.parcels USING RTREE (geom);For purely analytical workloads (no point-in-polygon lookups), the raw Parquet file is often faster than a materialized table because it stays compressed on disk and DuckDB streams through the column projections you ask for.
When to pick GeoParquet over GeoPackage
| Workload | Pick GeoParquet | Pick GeoPackage |
|---|---|---|
| Aggregations across many parcels | ✓ | |
| Reading from S3 / GCS without download | ✓ | |
| Loading into BigQuery / Snowflake / Athena | ✓ | |
| Point-in-polygon, repeated geo queries | ✓ | |
| Editing parcels in QGIS | ✓ | |
| Joining to multiple in-file attribute tables | ✓ |