# Performance Benchmarks
gpio includes a benchmark suite for measuring performance and detecting regressions across versions.
## Quick Start
Run benchmarks comparing current version against a previous release:
```bash
# Run benchmarks on current version
python scripts/version_benchmark.py --version-label "current" -o results_current.json

# Compare against previous results
python scripts/version_benchmark.py --compare results_baseline.json results_current.json
```
## Benchmark Operations
The suite tests the following operations, covering most gpio capabilities:
### Extract Operations

| Operation | Description |
|---|---|
| `inspect` | Read and display file metadata |
| `extract-limit` | Extract the first 100 rows |
| `extract-columns` | Extract specific columns (includes geometry) |
| `extract-bbox` | Spatial bounding box filtering |
### Add Column Operations

| Operation | Description |
|---|---|
| `add-bbox` | Add a bounding box column |
| `add-quadkey` | Add a quadkey column (resolution 12) |
| `add-h3` | Add an H3 cell ID column (resolution 8) |
### Sort Operations

| Operation | Description |
|---|---|
| `sort-hilbert` | Sort by Hilbert curve for spatial locality |
| `sort-quadkey` | Sort by quadkey spatial index |
### Transform Operations

| Operation | Description |
|---|---|
| `reproject` | Reproject to Web Mercator (EPSG:3857) |
### Partition Operations

| Operation | Description |
|---|---|
| `partition-quadkey` | Partition by quadkey (resolution 4) |
| `partition-h3` | Partition by H3 cells (resolution 4) |
### Convert/Export Operations

| Operation | Description |
|---|---|
| `convert-geojson` | Convert to GeoJSON format |
| `convert-flatgeobuf` | Convert to FlatGeobuf format |
| `convert-geopackage` | Convert to GeoPackage format |
### Import Operations

| Operation | Description |
|---|---|
| `import-geojson` | Import from GeoJSON to GeoParquet |
| `import-geopackage` | Import from GeoPackage to GeoParquet |
Note: Import operations run only on the tiny and small file sizes, since source-format files are only available in those tiers.
### Chain Operations (Multi-step Workflows)

| Operation | Description |
|---|---|
| `chain-extract-bbox-sort` | Extract columns → Add bbox → Hilbert sort |
| `chain-filter-sort` | Bbox filter → Hilbert sort |
### Operation Presets

| Preset | Operations |
|---|---|
| `quick` | inspect, extract-limit, add-bbox |
| `standard` | inspect, extract-limit, extract-columns, add-bbox, sort-hilbert |
| `full` | All 19 operations, including imports and chains |
## Test Data
Benchmark files are hosted on source.coop with different size tiers:
| Tier | Rows | Geometry | CRS | Source |
|---|---|---|---|---|
| tiny | 1,000 | Polygon | EPSG:4326 | Overture Buildings (Singapore) |
| small | 10,000 | Polygon | EPSG:4326 | Overture Buildings (Singapore) |
| medium | 100,000 | Polygon | EPSG:4326 | Overture Buildings (Singapore) |
| large | 809,000 | Polygon | EPSG:3794 | fiboa field boundaries (Slovenia) |
| points-tiny | 1,000 | Point | EPSG:3857 | Building centroids (Web Mercator) |
| points-small | 10,000 | Point | EPSG:3857 | Building centroids (Web Mercator) |
The points files provide variation in geometry type and CRS for regression testing.
### File Presets
| Preset | Files |
|---|---|
| `quick` | tiny, small |
| `standard` | small, medium |
| `full` | tiny, small, medium, large, points-tiny, points-small |
Files are automatically downloaded and cached locally in `/tmp/gpio-benchmark-cache/`.
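The download-and-cache behavior can be sketched with the standard library alone. Note that `cache_target` and `fetch` below are illustrative helpers under that assumption, not the suite's actual API:

```python
import urllib.request
from pathlib import Path

# Cache location used by the benchmark suite
CACHE_DIR = Path("/tmp/gpio-benchmark-cache")

def cache_target(url: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Map a remote benchmark file URL to its local cache path."""
    return cache_dir / url.rsplit("/", 1)[-1]

def fetch(url: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Download a benchmark file only if it is not already cached."""
    target = cache_target(url, cache_dir)
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, target)
    return target
```

Because the cache key is just the file name, re-running a benchmark with the same files skips the download entirely; `--no-cache` exists precisely to bypass this and measure remote-read performance.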
## Running Benchmarks Locally
### Version Comparison Script
The `scripts/version_benchmark.py` script works with any gpio version:
```bash
# Run full benchmarks (all files, all operations)
python scripts/version_benchmark.py --version-label "v0.9.0" -o results.json

# Run quick benchmarks (smaller file set, fewer operations)
python scripts/version_benchmark.py --version-label "v0.9.0" -o results.json --files quick --ops quick

# Run benchmarks with more iterations for accuracy
python scripts/version_benchmark.py --version-label "main" -o results.json -n 5

# Run specific operations on specific files
python scripts/version_benchmark.py --version-label "test" --files small,medium --ops add-bbox,sort-hilbert

# Compare two result files
python scripts/version_benchmark.py --compare results_baseline.json results_current.json

# Analyze trends across multiple baselines (oldest to newest)
python scripts/version_benchmark.py --trend results_v0.7.0.json results_v0.8.0.json results_v0.9.0.json

# Customize degradation threshold for trend detection (default: 0.05 = 5%)
python scripts/version_benchmark.py --trend baseline1.json baseline2.json baseline3.json --trend-threshold 0.10

# Skip local caching (test remote file performance)
python scripts/version_benchmark.py --version-label "remote-test" --no-cache
```
### Managing Historical Baselines
Use the `scripts/manage_baselines.py` tool to work with baselines stored in GitHub artifacts:
```bash
# List available baselines
uv run python scripts/manage_baselines.py list

# Download specific baseline versions
uv run python scripts/manage_baselines.py download v0.9.0 v0.8.0

# Compare specific baselines (downloads if needed)
uv run python scripts/manage_baselines.py compare v0.8.0 v0.9.0

# Analyze trends across multiple versions
uv run python scripts/manage_baselines.py trends v0.7.0 v0.8.0 v0.9.0

# Use a custom degradation threshold
uv run python scripts/manage_baselines.py trends v0.7.0 v0.8.0 v0.9.0 --threshold 0.10
```
**Authentication:**

- Requires a GitHub token: set `GITHUB_TOKEN` or authenticate with `gh auth login`
- Auto-detects the repository from the git remote
- Downloads baselines to the `baselines/` directory by default
### Sample Output
**Point-in-time comparison:**
```
Operation       File    v0.9.0    main      Delta
inspect         tiny    0.468s    0.440s    -5.8% faster
extract-limit   tiny    0.543s    0.540s    -0.5% faster
add-bbox        large   0.378s    0.408s    +8.1% slower
sort-hilbert    large   27.366s   26.946s   -1.5% faster
```
**Trend analysis across releases:**
```
Overall Statistics:
  Average change:  +1.23%
  Max regression:  +12.5%
  Max improvement: -8.3%

⚠️ Gradual Degradation Detected (2 operations):
  • extract-limit (small): 7.2% avg degradation over last 2 releases
  • partition-quadkey (medium): 6.1% avg degradation over last 2 releases

🚀 Consistent Improvements (3 operations):
  • add-bbox (large): 8.5% avg improvement over last 2 releases
  • sort-hilbert (small): 5.9% avg improvement over last 2 releases
  • inspect (tiny): 5.2% avg improvement over last 2 releases
```
### CLI Benchmark Commands
gpio also includes built-in benchmark commands:
```bash
# Run benchmark suite on specific files
gpio benchmark suite --files path/to/file.parquet --operations core

# Run quick benchmark (single operation, timing only)
gpio benchmark run inspect path/to/file.parquet
```
## Profiling Integration
When benchmarks identify performance regressions, profiling helps diagnose which code paths are responsible.
### Enabling Profiling
Add the `--profile` flag to enable cProfile integration:
```bash
# Run benchmarks with profiling enabled
gpio benchmark suite \
    --files path/to/file.parquet \
    --operations core \
    --profile \
    --profile-dir ./profiles

# Profile specific operations
gpio benchmark suite \
    --files large.parquet \
    --operations add-bbox,sort-hilbert \
    --profile
```
The same suite can be driven from the Python API:

```python
from pathlib import Path

from geoparquet_io.core.benchmark_suite import run_benchmark_suite

# Run benchmarks with profiling enabled
result = run_benchmark_suite(
    input_files=[Path('path/to/file.parquet')],
    operations=['add-bbox', 'extract', 'inspect'],
    iterations=3,
    profile=True,
    profile_dir=Path('./profiles'),
    verbose=True,
)

# Profile files are saved in ./profiles/
print(f"Generated {len(result.results)} benchmark results")
```
This generates `.prof` files in the specified directory (default: `./profiles/`).
### Analyzing Profile Data
View profile interactively:
uv run python -m pstats profiles/add-bbox_large_1.prof
# Then use commands like:
# - stats 20 (show top 20 functions)
# - sort cumtime (sort by cumulative time)
# - callers duckdb (show callers of duckdb functions)
Or summarize a profile programmatically:

```python
from geoparquet_io.benchmarks.profile_report import format_profile_stats

# Show the top 20 slowest functions
summary = format_profile_stats('profiles/add-bbox_large_1.prof', top_n=20)
print(summary)
```
Sample profile output:
```
Profile: add-bbox_large_1.prof
================================================================================
Top 20 functions by cumulative time:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002   12.456   12.456 geoparquet_io/core/add_column.py:45(add_bbox_column)
        1    0.001    0.001   11.234   11.234 duckdb.py:123(execute)
      100    5.678    0.057    9.876    0.099 duckdb.py:234(_fetch_arrow)
    10000    2.345    0.000    3.456    0.000 pyarrow.lib:456(cast)
```
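Since the `.prof` files are standard cProfile dumps, they can also be inspected with nothing but the standard library. The snippet below generates and reads a dump from a toy workload, standing in for a real `profiles/add-bbox_large_1.prof`:

```python
import cProfile
import pstats
import tempfile
from pathlib import Path

def busy_work(n: int) -> int:
    """Toy workload standing in for a benchmark operation."""
    return sum(i * i for i in range(n))

# Profile the workload and dump stats, mimicking a benchmark-generated .prof file
prof_path = Path(tempfile.mkdtemp()) / "example.prof"
profiler = cProfile.Profile()
profiler.enable()
busy_work(100_000)
profiler.disable()
profiler.dump_stats(str(prof_path))

# Load the dump with the stdlib pstats module, exactly as you would for a
# real profiles/<operation>_<file>_<iteration>.prof artifact
stats = pstats.Stats(str(prof_path))
stats.sort_stats("cumulative").print_stats(5)  # top 5 by cumulative time
```

This is handy when you want to diff two profiles or script a report without installing anything beyond Python itself.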
### Profiling Overhead
- Profiling adds ~5-15% overhead to benchmark timing
- Profile files are typically 50-500KB each
- Disabled by default to keep benchmarks fast
### CI Integration
The benchmark workflow automatically suggests profiling when regressions are detected:
```
⚠️ Performance regression detected (+25% slower on sort-hilbert)

💡 To diagnose, run locally with profiling:
   gpio benchmark suite --files large.parquet --operations sort-hilbert --profile
```
Profile artifacts are uploaded with 30-day retention when profiling is enabled.
## GitHub Actions Workflows
### PR Benchmarks (Opt-in)
Benchmarks run on PRs only when the `benchmark` label is added:
1. Add the `benchmark` label to your PR
2. The workflow runs automatically
3. Results are posted as a comment on the PR
### Manual Benchmark Run
Run benchmarks manually from the Actions tab:
1. Go to **Actions** → **Benchmark Suite**
2. Click **Run workflow**
3. Configure options:
    - `iterations`: Number of runs per operation (default: 3)
    - `files`: File preset or comma-separated list (default: full)
    - `ops`: Operation preset or comma-separated list (default: full)
    - `compare_version`: Optional version to compare against (e.g., `v0.9.0`)
4. View results in the workflow summary
### Release Benchmarks
When a release is created, benchmarks automatically:
- Run on the new release version
- Compare against the previous release tag
- Detect regressions (>25% slower)
- Fetch historical baselines from up to 5 previous releases
- Analyze performance trends across releases
- Append results to the release notes
Results include:

- Point-in-time comparison table showing the performance delta
- Performance trends across multiple releases
- Warning for significant regressions (>25% in a single release)
- Warning for gradual degradation (>5% per release for 2+ consecutive releases)
- Detailed benchmark data in a collapsible section
**Baseline Storage:**

- Baselines are stored as GitHub Actions artifacts
- Retention: 400 days (covers ~5-10 releases)
- Artifact naming: `release-benchmark-{version}`
- Contents: benchmark results JSON, comparison text, trend analysis
### Where Results Are Published
| Trigger | Results Location |
|---|---|
| PR with `benchmark` label | Comment on PR |
| Manual workflow run | Workflow summary + artifacts |
| Release | Appended to release notes |
All runs also upload JSON artifacts for historical tracking.
## Interpreting Results
### Regression Thresholds
**Point-in-time** (single-release comparison):
| Severity | Threshold | Action |
|---|---|---|
| Normal variance | ±10% | No action needed |
| Warning | +10-25% | Investigate cause |
| Regression | >+25% | Flagged in release notes |
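The point-in-time thresholds above can be expressed as a tiny classifier. This is an illustrative sketch; `classify_delta` is not a function from the gpio codebase:

```python
def classify_delta(baseline_s: float, current_s: float) -> str:
    """Classify a timing change using the point-in-time thresholds:
    within ±10% is normal variance, +10-25% is a warning, >+25% is a regression."""
    delta = (current_s - baseline_s) / baseline_s  # relative change
    if delta > 0.25:
        return "regression"
    if delta > 0.10:
        return "warning"
    return "normal"
```

For example, `add-bbox` going from 0.378s to 0.408s is a +8.1% change, which falls inside normal variance and needs no action.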
**Trend analysis** (across multiple releases):
| Pattern | Threshold | Action |
|---|---|---|
| Gradual degradation | >5% per release for 2+ consecutive releases | Warning flagged |
| Consistent improvement | >5% per release for 2+ consecutive releases | Highlighted |
| Single spike | One-time regression/improvement | Ignored (not a trend) |
Trend analysis helps detect gradual performance drift that might be missed when comparing only adjacent releases.
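As a sketch of the trend rule (and of what `--trend-threshold` controls), gradual degradation can be detected by checking whether each of the last few release-over-release changes exceeds the threshold. The helper below is illustrative, not the suite's actual implementation:

```python
def gradual_degradation(times_s: list[float],
                        threshold: float = 0.05,
                        min_releases: int = 2) -> bool:
    """Flag an operation when each of its last `min_releases`
    release-over-release changes exceeds `threshold` (default 5%,
    matching --trend-threshold).

    `times_s` holds one operation's timing per release, oldest to newest.
    """
    # Relative change between each pair of adjacent releases
    deltas = [(b - a) / a for a, b in zip(times_s, times_s[1:])]
    recent = deltas[-min_releases:]
    # A one-time spike does not qualify: every recent delta must degrade
    return len(recent) >= min_releases and all(d > threshold for d in recent)
```

Requiring every recent delta to exceed the threshold is what separates a trend from a single spike: one bad release followed by a recovery returns `False`.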
### Expected Variance
- Small files (<10K rows): High variance (±20%) due to startup overhead
- Large files (>100K rows): Low variance (±5%), most reliable for comparison
- CI environment: May differ from local; compare CI-to-CI results
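One way to check whether a local run is stable enough to compare is the relative spread of the iteration timings (an illustrative stdlib helper, not part of the suite):

```python
import statistics

def relative_spread(samples_s: list[float]) -> float:
    """Coefficient of variation of iteration timings: stdev / mean.
    Against the variance figures above, a spread well past ~0.05 (5%)
    on a large file suggests a noisy run worth repeating with -n 5."""
    return statistics.stdev(samples_s) / statistics.mean(samples_s)
```
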
### Known Performance Characteristics
| Operation | Notes |
|---|---|
| `inspect` | Slower since v0.6.0 due to geometry type detection |
| `add-bbox` | 75x faster since v0.6.0 for large files |
| `extract` with geometry | Slow due to WKB serialization; use `--exclude-cols geometry` if not needed |
| `sort-hilbert` | Scales linearly with row count |
## Pre-Release Checklist
Before releasing a new version:
- [ ] Run benchmarks locally against the previous release:

  ```bash
  # Install previous version
  git checkout v0.9.0 && pip install -e .
  python scripts/version_benchmark.py --version-label "v0.9.0" -o baseline.json -n 5

  # Install new version
  git checkout main && pip install -e .
  python scripts/version_benchmark.py --version-label "new" -o current.json -n 5

  # Compare
  python scripts/version_benchmark.py --compare baseline.json current.json
  ```

- [ ] Check for regressions (>25% slower on large files)
- [ ] Document known changes in the release notes if performance differs intentionally
- [ ] Create the release - the release-benchmark workflow will automatically verify and append results