I ran into this while trying to disaggregate Mexico's rooftop PV potential on a decently sized laptop.
Attempting the disaggregation consumes a large amount of memory, which leads to crashes.
The input files themselves are not that large (the GeoTIFF is ~30 MB, the GeoParquet is ~40 MB).
What sets this case apart is the number of regions (2000+).
I suspect this may be caused by rioxarray expanding the data along an extra dimension, with one entry per shape, during processing.
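To make that suspicion concrete, here is a rough back-of-the-envelope sketch of what a dense array with one layer per shape would cost. The raster size used below (10,000 x 10,000 pixels) and the float64 item size are placeholders for illustration only, not the measured values of the Mexican raster, and I have not confirmed that this is what actually happens inside rioxarray or gregor.

```python
def stacked_bytes(n_shapes: int, ny: int, nx: int, itemsize: int = 8) -> int:
    """Bytes needed for a dense array of shape (n_shapes, ny, nx)."""
    return n_shapes * ny * nx * itemsize


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


# Hypothetical numbers: 2000 regions on a 10,000 x 10,000 float64 raster.
n_shapes, ny, nx = 2000, 10_000, 10_000

single_layer = stacked_bytes(1, ny, nx)
all_layers = stacked_bytes(n_shapes, ny, nx)

print(f"one layer:  {to_gib(single_layer):.2f} GiB")
print(f"all layers: {to_gib(all_layers):.2f} GiB")
```

Even if the real raster is much smaller than this placeholder, the cost grows linearly with the number of shapes, which would explain why a case with few regions runs fine while this one does not.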
Specs
- OS: Fedora 42
- RAM: 16 GB DDR5 SODIMM
- Swap: 8 GB
- CPU: i7 13k w/ 20 CPU threads
Reproduction
Input files
The input files can be found here:
https://surfdrive.surf.nl/files/index.php/s/Z4bLHF38tc87T6J
Script
# %%
import math
import geopandas as gpd
import gregor
import pandas as pd
import rioxarray as rxr
from matplotlib import pyplot as plt
case = "MEX"
year = 2023
# %%
shapes_df = gpd.read_parquet(f"downloads/{case}/{case}.parquet")
countries_df = shapes_df[["country_id", "geometry"]].dissolve("country_id").reset_index()
# Copy the filtered rows so that adding a column later does not write to a view
case_df = countries_df[countries_df["country_id"] == case].copy()
case_df.plot()
# %%
area_potential = rxr.open_rasterio(
    f"downloads/{case}/{case}.tif",
    chunks={"x": 1024, "y": 1024},
).squeeze()
area_potential = area_potential.rio.write_crs("EPSG:4326")
# %%
# Decide on a maximum number of pixels in the final plot
max_pixels = 50_000_000 # tweak this to taste
# Compute needed coarsening factor
nx, ny = area_potential.sizes["x"], area_potential.sizes["y"]
factor = math.ceil(math.sqrt((nx * ny) / max_pixels))
pixel_count = (nx // factor) * (ny // factor)
print(
    f"Downsampling factor: {factor} (output will be ~{pixel_count} pixels)"
)
# Coarsen (block-average) the data
coarse = area_potential.coarsen(x=factor, y=factor, boundary="trim").mean()
# Set up plot
fig, ax = plt.subplots(figsize=(8, 6), layout="constrained")
# Plot full extent of the coarsened raster
case_df.to_crs(area_potential.rio.crs).geometry.boundary.plot(
    ax=ax, color="black", aspect=None, linewidth=0.3, alpha=0.2
)
coarse.plot.imshow(
    ax=ax,
    cmap="Oranges",
    vmax=500,
    add_colorbar=True,
    cbar_kwargs={"location": "bottom", "label": "Area potential for PV"},
    alpha=1,
)
ax.set_aspect("equal")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title(
    f"{case} Rooftop PV Potential used for aggregation\n"
    f"(figure coarsened to ~{pixel_count:.1e} pixels)"
)
plt.show()
# %%
# WARNING: crash below!
case_df["to_disaggregate"] = 1282 # dummy value
aggregated_pv = gregor.disaggregate.disaggregate_polygon_to_raster(
    case_df, column="to_disaggregate", proxy=area_potential
)
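For comparison, here is a minimal NumPy sketch of a proxy-weighted disaggregation that keeps memory at the size of a single raster layer, no matter how many regions there are. It is not gregor's implementation, just an illustration of an alternative under stated assumptions: `disaggregate_by_labels` is a hypothetical helper, and it assumes an integer label raster already exists where each pixel holds the index of the region it belongs to (`-1` for pixels outside all regions). Such a raster could presumably be produced with `rasterio.features.rasterize`, but that step is not shown here.

```python
import numpy as np


def disaggregate_by_labels(labels, proxy, totals):
    """Spread each region's total over its pixels, proportional to the proxy.

    labels: int array (ny, nx); region index per pixel, -1 outside all regions
    proxy:  float array (ny, nx); weights (e.g. rooftop area potential)
    totals: float array (n_regions,); value to distribute in each region
    """
    labels = np.asarray(labels)
    totals = np.asarray(totals, dtype=float)
    inside = labels >= 0

    # Treat NaN/inf proxy values and pixels outside all regions as zero weight.
    weights = np.where(inside & np.isfinite(proxy), proxy, 0.0).astype(float)

    # Sum of weights per region in one pass, without one mask per region.
    sums = np.bincount(
        labels[inside], weights=weights[inside], minlength=len(totals)
    )

    # Per-region scale factor; regions with zero total weight get nothing.
    scale = np.divide(totals, sums, out=np.zeros_like(sums), where=sums > 0)

    out = np.zeros(labels.shape, dtype=float)
    out[inside] = weights[inside] * scale[labels[inside]]
    return out


# Tiny example: two regions on a 2 x 3 grid, one pixel outside both.
labels = np.array([[0, 0, 1], [0, -1, 1]])
proxy = np.array([[1.0, 1.0, 2.0], [2.0, 5.0, 6.0]])
result = disaggregate_by_labels(labels, proxy, totals=[8.0, 16.0])
print(result)
```

The point of the sketch is that the only full-size arrays are the label raster, the weights, and the output, so the number of regions only affects the small `sums` and `scale` vectors.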
Dependencies
❯ pixi list --explicit
Package Version Build Size Kind Source
cartopy 0.24.0 py312hf9745cd_0 1.5 MiB conda https://conda.anaconda.org/conda-forge/
click 8.2.1 pyh707e725_0 85.7 KiB conda https://conda.anaconda.org/conda-forge/
clio-tools 2025.03.03 pyhd8ed1ab_0 14.2 KiB conda https://conda.anaconda.org/conda-forge/
conda 25.3.1 py312h7900ff3_0 1.1 MiB conda https://conda.anaconda.org/conda-forge/
contextily 1.6.2 pyhd8ed1ab_1 20.3 KiB conda https://conda.anaconda.org/conda-forge/
dask 2025.5.1 pyhe01879c_1 11.1 KiB conda https://conda.anaconda.org/conda-forge/
gdal 3.10.3 py312hf1b357c_11 1.7 MiB conda https://conda.anaconda.org/conda-forge/
geopandas 1.0.1 pyhd8ed1ab_3 7.4 KiB conda https://conda.anaconda.org/conda-forge/
gregor 0.0.3.dev0 pypi git+https://github.com/jnnr/gregor.git?rev=4d54d11#4d54d1167ebb78de553c0439374ab936c03923ad
ipdb 0.13.13 pyhd8ed1ab_1 18.3 KiB conda https://conda.anaconda.org/conda-forge/
ipykernel 6.29.5 pyh3099207_0 116.3 KiB conda https://conda.anaconda.org/conda-forge/
libgdal-arrow-parquet 3.10.3 h8ae71d8_11 807.9 KiB conda https://conda.anaconda.org/conda-forge/
libgdal-core 3.10.3 hcac4edf_11 10.3 MiB conda https://conda.anaconda.org/conda-forge/
mypy 1.15.0 py312h66e93f0_0 17.8 MiB conda https://conda.anaconda.org/conda-forge/
pandera-geopandas 0.24.0 hd8ed1ab_2 7.3 KiB conda https://conda.anaconda.org/conda-forge/
pandera-pandas 0.24.0 hd8ed1ab_2 7.3 KiB conda https://conda.anaconda.org/conda-forge/
powerplantmatching 0.7.1 pyhd8ed1ab_0 661.1 KiB conda https://conda.anaconda.org/conda-forge/
pyarrow 19.0.1 py312h7900ff3_0 24.7 KiB conda https://conda.anaconda.org/conda-forge/
pycountry 24.6.1 pyhd8ed1ab_0 3 MiB conda https://conda.anaconda.org/conda-forge/
pystac-client 0.8.6 pyhd8ed1ab_0 35 KiB conda https://conda.anaconda.org/conda-forge/
pytest 8.3.5 pyhd8ed1ab_0 253.7 KiB conda https://conda.anaconda.org/conda-forge/
python 3.12.9 h9e4cc4f_1_cpython 30.2 MiB conda https://conda.anaconda.org/conda-forge/
rasterio 1.4.3 py312h021bea1_1 7.6 MiB conda https://conda.anaconda.org/conda-forge/
richdem 2.3.0 py312h546fd74_12 5.1 MiB conda https://conda.anaconda.org/conda-forge/
ruff 0.11.4 py312h286b59f_0 8.6 MiB conda https://conda.anaconda.org/conda-forge/
snakefmt 0.11.0 pyhdfd78af_0 31.2 KiB conda https://conda.anaconda.org/bioconda/
snakemake-minimal 9.1.9 pyhdfd78af_0 848.4 KiB conda https://conda.anaconda.org/bioconda/