STAC Generation
The gpio publish stac command generates STAC (SpatioTemporal Asset Catalog) metadata from GeoParquet files, making your data discoverable and interoperable.
Quick Start
# Single file -> STAC Item
gpio publish stac data.parquet output.json --bucket s3://my-bucket/data/
# Partitioned directory -> STAC Collection + Items
gpio publish stac partitions/ . --bucket s3://my-bucket/dataset/
# Python API: same single-file case as the first command above
from geoparquet_io import Table
table = Table("data.parquet")
# Generate STAC Item
table.to_stac("output.json", bucket="s3://my-bucket/data/")
Output Modes
Single File
For a single GeoParquet file, gpio generates a STAC Item:
gpio publish stac roads.parquet roads_item.json --bucket s3://my-bucket/roads/
This creates a STAC Item JSON with:
- Bounding box from the geometry column
- Asset link to the parquet file
- Properties from file metadata
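To make the three ingredients above concrete, here is a minimal sketch of what a STAC Item looks like as plain JSON, built with the Python standard library only. It follows the general STAC Item layout (a GeoJSON Feature with `bbox`, `properties`, `links`, and `assets`). The helper names `bbox_to_polygon` and `make_item`, the asset key `data`, the media type, and the sample values are illustrative assumptions, not the exact output of gpio.

```python
import json


def bbox_to_polygon(bbox):
    """Return a GeoJSON Polygon covering [min_x, min_y, max_x, max_y]."""
    min_x, min_y, max_x, max_y = bbox
    ring = [
        [min_x, min_y],
        [max_x, min_y],
        [max_x, max_y],
        [min_x, max_y],
        [min_x, min_y],  # GeoJSON rings repeat the first point to close
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def make_item(item_id, bbox, href, datetime_str):
    """Build a minimal STAC Item dict (hypothetical helper, not a gpio API)."""
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "geometry": bbox_to_polygon(bbox),
        "bbox": list(bbox),
        "properties": {"datetime": datetime_str},
        "links": [],
        "assets": {
            # Asset key and media type are assumptions for illustration.
            "data": {
                "href": href,
                "type": "application/vnd.apache.parquet",
                "roles": ["data"],
            }
        },
    }


item = make_item(
    "roads",
    (-10.0, 35.0, 5.0, 45.0),
    "s3://my-bucket/roads/roads.parquet",
    "2024-01-01T00:00:00Z",
)
print(json.dumps(item, indent=2))
```

Comparing a generated `roads_item.json` against this shape is a quick way to check that the bounding box and asset href look as expected.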
Partitioned Dataset
For partitioned directories, gpio generates a STAC Collection with Items:
gpio publish stac partitions/ . --bucket s3://my-bucket/dataset/
This creates:
- collection.json in the output directory
- Individual Item JSONs co-located with each parquet file
- Links between Collection and Items
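The Collection-to-Item links can be followed with a few lines of standard-library Python. The sketch below assumes that the Collection lists its Items as links with `rel` set to `item` and with hrefs relative to `collection.json`, which is a common STAC layout; the hrefs gpio actually writes may differ (for example, absolute URLs). The function name `load_items` and the demo directory layout are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_items(collection_path):
    """Load every Item referenced by rel="item" links in a Collection file."""
    collection_path = Path(collection_path)
    collection = json.loads(collection_path.read_text())
    items = []
    for link in collection.get("links", []):
        if link.get("rel") == "item":
            # Assumption: hrefs are relative to the Collection file.
            item_path = (collection_path.parent / link["href"]).resolve()
            items.append(json.loads(item_path.read_text()))
    return items


# Demo on a throwaway directory that mimics a partitioned layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "FR").mkdir()
    (root / "FR" / "FR.json").write_text(
        json.dumps({"type": "Feature", "id": "FR"})
    )
    (root / "collection.json").write_text(
        json.dumps(
            {
                "type": "Collection",
                "id": "demo",
                "links": [{"rel": "item", "href": "./FR/FR.json"}],
            }
        )
    )
    item_ids = [item["id"] for item in load_items(root / "collection.json")]

print(item_ids)
```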
Asset URL Configuration
S3 Bucket Prefix
The --bucket option sets the S3 prefix for asset hrefs:
gpio publish stac data.parquet output.json --bucket s3://source.coop/org/dataset/
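As an illustration of how a prefix turns into an asset href, the following sketch joins a bucket prefix and a file name while tolerating a missing or doubled slash. This is an assumption about the general idea, not a description of gpio's internal rule; `asset_href` is a hypothetical helper.

```python
def asset_href(bucket_prefix, relative_path):
    """Join an S3 prefix and a relative path with exactly one slash between."""
    return bucket_prefix.rstrip("/") + "/" + relative_path.lstrip("/")


href = asset_href("s3://source.coop/org/dataset/", "data.parquet")
print(href)
```

Passing the prefix with a trailing slash, as in the command above, keeps the intent unambiguous either way.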
Public URL Mapping
For publicly accessible data, add a public URL:
gpio publish stac data.parquet output.json \
--bucket s3://my-bucket/roads/ \
--public-url https://data.example.com/roads/
This adds alternate links with public HTTPS URLs.
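The mapping from an S3 href to its public counterpart amounts to swapping one prefix for another. A minimal sketch, assuming the public URL mirrors the bucket prefix one-to-one; the helper name `to_public_url` is hypothetical and is not part of gpio:

```python
def to_public_url(href, bucket_prefix, public_url):
    """Rewrite an s3:// href under bucket_prefix to the matching public URL."""
    prefix = bucket_prefix.rstrip("/") + "/"
    if not href.startswith(prefix):
        raise ValueError(f"{href!r} is not under {prefix!r}")
    # Keep the part after the prefix and attach it to the public base URL.
    return public_url.rstrip("/") + "/" + href[len(prefix):]


public = to_public_url(
    "s3://my-bucket/roads/roads.parquet",
    "s3://my-bucket/roads/",
    "https://data.example.com/roads/",
)
print(public)
```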
PMTiles Overview Support
gpio automatically detects PMTiles overview files and includes them as additional assets:
# If data.pmtiles exists alongside data.parquet
gpio publish stac data.parquet output.json --bucket s3://bucket/data/
The STAC Item will include both the parquet and pmtiles assets.
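Before publishing, it can be useful to check which parquet files have a sibling overview. The sketch below assumes the "alongside" convention means the same directory and the same file stem with a `.pmtiles` extension, as in the comment above; how gpio performs its own detection is not specified here, and `find_overview` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def find_overview(parquet_path):
    """Return the sibling .pmtiles path if it exists, otherwise None."""
    candidate = Path(parquet_path).with_suffix(".pmtiles")
    return candidate if candidate.is_file() else None


# Demo on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    parquet = Path(tmp) / "data.parquet"
    parquet.touch()
    missing = find_overview(parquet)  # no overview yet

    (Path(tmp) / "data.pmtiles").touch()
    found = find_overview(parquet)  # overview now present
    found_name = found.name if found else None

print(missing, found_name)
```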
Custom IDs
Item ID
gpio publish stac data.parquet output.json \
--bucket s3://bucket/data/ \
--item-id my-custom-item-id
Collection ID
gpio publish stac partitions/ . \
--bucket s3://bucket/dataset/ \
--collection-id my-dataset-collection
Overwriting Existing Files
Use --overwrite to replace existing STAC files:
gpio publish stac data.parquet output.json --bucket s3://bucket/data/ --overwrite
Example Workflow
A complete workflow, from partitioning to publishing STAC metadata:
# 1. Partition by admin boundaries
gpio partition admin roads.parquet by_country/ --levels country
# 2. Generate STAC Collection with Items
gpio publish stac by_country/ . \
--bucket s3://source.coop/my-org/roads/ \
--public-url https://data.source.coop/my-org/roads/ \
--collection-id global-roads
# 3. Upload everything including STAC metadata
gpio publish upload by_country/ s3://source.coop/my-org/roads/
CLI Reference
See the CLI Reference for complete options.