Loading the Data

Read GeoPackage with GeoPandas

Load a Land Records .gpkg file into a GeoPandas GeoDataFrame, inspect attributes, and run typical filter / area / plotting workflows.

GeoPandas reads GeoPackage natively via Fiona/PyOGRIO. A nationwide Land Records .gpkg is too large to hold entirely in memory on most workstations, so the patterns below show both bounded reads (a single state or county) and streaming reads (a fraction at a time).

A runnable notebook is attached: geopackage-geopandas.ipynb.

Install

# Python ≥ 3.10. Pyogrio is dramatically faster than Fiona for big files.
pip install "geopandas>=1.0" pyogrio shapely matplotlib

1. Inspect the layers

A GeoPackage is a SQLite database; there can be more than one layer. List them:

import pyogrio

pyogrio.list_layers("parcels-2026.2.gpkg")
# [('lr_parcel_us', 'MultiPolygon')]

2. Read a single state (bounded)

The recommended workflow on a laptop. Pass a SQL WHERE clause that touches an indexed column (statefp is indexed in every Land Records .gpkg):

import geopandas as gpd

al = gpd.read_file(
    "parcels-2026.2.gpkg",
    layer="lr_parcel_us",
    where="statefp = '01'",            # Alabama
)
print(al.shape)            # (≈2.2M, 94)
print(al.crs)              # EPSG:4326
al.head()

3. Stream the entire file (chunked)

For nationwide aggregations, iterate so you never hold more than one chunk at a time:

from pyogrio import read_dataframe

chunks = []
for chunk in read_dataframe(
    "parcels-2026.2.gpkg",
    layer="lr_parcel_us",
    columns=["statefp", "countyfp", "calcarea"],  # drop geometry for speed
    read_geometry=False,
    skip_features=0,
    max_features=500_000,
    use_arrow=True,
):
    chunks.append(chunk.groupby(["statefp", "countyfp"], as_index=False)["calcarea"].sum())

import pandas as pd
totals = pd.concat(chunks).groupby(["statefp", "countyfp"], as_index=False).sum()

4. Filter by attribute

The 2026.2 schema includes ownership/access classification — you can pull every publicly-owned federal parcel in one line:

federal = gpd.read_file(
    "parcels-2026.2.gpkg",
    layer="lr_parcel_us",
    where="parceltype = 'GOVT FEDERAL'",
)

5. Filter by bounding box

bbox accepts a (minx, miny, maxx, maxy) tuple in the layer's CRS — fast because pyogrio pushes the predicate down to the GDAL spatial index:

denver_bbox = (-105.11, 39.61, -104.85, 39.91)
denver = gpd.read_file(
    "parcels-2026.2.gpkg",
    layer="lr_parcel_us",
    bbox=denver_bbox,
)

6. Project for accurate area / distance

calcarea is already in square meters (computed from the polygon during publishing). If you want to recompute or compute new metrics, reproject to an equal-area CRS first — never compute area in EPSG:4326.

denver_alb = denver.to_crs("EPSG:5070")  # NAD83 / Conus Albers, m²
denver_alb["sqmi"] = denver_alb.geometry.area / 2_589_988.11
denver_alb.sort_values("sqmi", ascending=False).head()

7. Plot a county

import matplotlib.pyplot as plt

ax = denver.plot(figsize=(10, 10), column="parceltype", legend=True)
ax.set_axis_off()
plt.savefig("denver-parcels.png", dpi=150, bbox_inches="tight")

8. Write a derivative GeoPackage

denver.to_file("denver-parcels.gpkg", driver="GPKG", layer="parcels")

Performance notes

  • Always pass where= and/or bbox= to avoid materializing the entire layer.
  • pyogrio.read_dataframe(..., use_arrow=True) is roughly 5× faster than Fiona at the same memory footprint.
  • For multi-state aggregations that don't need geometry, set read_geometry=False.
  • Spatial indexes already exist on lr_parcel_us — you do not need to rebuild them.

On this page