Loading the Data

Read GeoPackage with DuckDB

Query a Land Records .gpkg file directly with DuckDB's spatial extension — no import step, no separate database.

DuckDB's spatial extension reads GeoPackage in-place via GDAL. For interactive analysis on a workstation, this is often the fastest path to results: no database server, no import step, and DuckDB's vectorized engine pushes predicates down through the GDAL driver where it can.

Install / load the extension

INSTALL spatial;
LOAD spatial;

The spatial extension ships everything you need — ST_Read, ST_* functions, and the GDAL driver bundle.

1. List layers

SELECT * FROM ST_Read_Meta('parcels-2026.2.gpkg');

You'll see one row for the lr_parcel_us layer with its feature count and bounding box.

2. Query without importing

ST_Read reads the GeoPackage layer like a table:

SELECT lrid, ownername, parceladdr
FROM ST_Read('parcels-2026.2.gpkg', layer = 'lr_parcel_us')
WHERE statefp = '01' AND countyfp = '003'
LIMIT 10;

DuckDB streams rows through GDAL; this is essentially a sequential scan of the layer with the WHERE clause applied in DuckDB after the read. For wide queries this is fine; for narrow queries materialize first (step 4).

3. Aggregations

SELECT
    statefp,
    parcelstate,
    COUNT(*) AS parcels,
    SUM(calcarea) / 1e6 AS total_km2
FROM ST_Read('parcels-2026.2.gpkg', layer = 'lr_parcel_us')
GROUP BY 1, 2
ORDER BY parcels DESC
LIMIT 10;

4. Materialize into a DuckDB table

For repeated queries, copy the layer (or a slice) into a native DuckDB table. This is dramatically faster and lets you build indexes:

CREATE TABLE parcels AS
SELECT * FROM ST_Read('parcels-2026.2.gpkg', layer = 'lr_parcel_us');

-- Optional: an R-tree for spatial predicates.
CREATE INDEX parcels_geom_rtree ON parcels USING RTREE (geom);
CREATE INDEX parcels_fips_idx   ON parcels (statefp, countyfp);

Alternatively, write the table to a DuckDB file once and reopen later:

ATTACH 'landrecords.duckdb' AS lr;
CREATE TABLE lr.parcels AS
SELECT * FROM ST_Read('parcels-2026.2.gpkg', layer = 'lr_parcel_us');

5. Spatial queries

Parcels within 5 miles of a point:

SELECT lrid, ownername, calcarea
FROM parcels
WHERE ST_DWithin_Spheroid(
    geom,
    ST_Point(-86.81, 33.52),
    5 * 1609.344        -- meters
);

Intersect against an arbitrary polygon:

SELECT lrid, calcarea
FROM parcels
WHERE ST_Intersects(
    geom,
    ST_GeomFromText('POLYGON((-87 33, -86.8 33, -86.8 33.2, -87 33.2, -87 33))')
);

6. Export to GeoParquet

The same spatial extension can write GeoParquet for downstream tools — useful as a one-time conversion of the entire dataset:

COPY (
    SELECT * FROM ST_Read('parcels-2026.2.gpkg', layer = 'lr_parcel_us')
) TO 'parcels-2026.2.parquet' (FORMAT PARQUET);

See Read GeoParquet with DuckDB for follow-on patterns.

Performance notes

  • ST_Read reads via GDAL; it does not push predicates into the GeoPackage's SQLite query planner. If you're going to query the same slice repeatedly, materialize (step 4).
  • For nationwide aggregations that don't need geometry, project columns explicitly: SELECT statefp, calcarea FROM ST_Read(...). DuckDB still has to read each feature off disk, but it avoids materializing the WKB.
  • The R-tree index (USING RTREE) accelerates ST_Intersects / ST_DWithin predicates on the materialized table. It is built once and lives with the table.

On this page