⚠️ Our 0.1 release refactored several early-development functions for long-term stability. To update your code, see here. ⚠️
Future changes will come with deprecation warnings! 🙂

Implicit lazy loading

Lazy loading, also known as “call-by-need”, delays loading or evaluating a dataset until it is actually needed.

In GeoUtils, we implicitly load and pass only metadata until the data is actually needed. We are also working to implement lazy analysis tools that rely on other packages.
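The call-by-need idea can be illustrated with a minimal sketch in plain Python. This is not how GeoUtils is implemented: the `LazyDataset` class, its `is_loaded` flag and the `expensive_read` function are hypothetical names used only for illustration. Metadata is available immediately, while the data is read on first access and then cached.

```python
class LazyDataset:
    """Hypothetical toy container: metadata is eager, data is read on first access."""

    def __init__(self, metadata, reader):
        self.metadata = metadata  # cheap to hold, always available
        self._reader = reader     # callable performing the expensive read
        self._data = None         # nothing is loaded at construction
        self.is_loaded = False

    @property
    def data(self):
        # Read only once, on first access, then reuse the cached result
        if not self.is_loaded:
            self._data = self._reader()
            self.is_loaded = True
        return self._data


read_calls = []  # records how many times the expensive read runs


def expensive_read():
    """Hypothetical stand-in for reading an array from disk."""
    read_calls.append(1)
    return [[1, 2], [3, 4]]


ds = LazyDataset({"shape": (2, 2)}, expensive_read)
shape = ds.metadata["shape"]   # metadata access does not trigger a read
loaded_before = ds.is_loaded   # still False at this point
first = ds.data                # first access triggers the read
second = ds.data               # second access reuses the cached data
```

In this sketch the read runs exactly once, however many times `data` is accessed afterwards.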

Lazy instantiation of Rasters

By default, GeoUtils instantiates a Raster from an on-disk file without loading its geoutils.Raster.data array. It loads only the metadata (transform, crs, nodata and derivatives, as well as name and driver).

import geoutils as gu

# Instantiate a raster from a filename on disk
filename_rast = gu.examples.get_path("everest_landsat_b4")
rast = gu.Raster(filename_rast)

# This raster is not loaded
rast
Raster(
  data=not_loaded; shape on disk (1, 655, 800); will load (655, 800)
  transform=| 30.00, 0.00, 478000.00|
            | 0.00,-30.00, 3108140.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:32645
  nodata=None)

To load the data explicitly at instantiation, pass load_data=True to Raster. Alternatively, call the load() method afterwards. The two are equivalent.

# Instantiate another raster just for the purpose of loading
rast_to_load = gu.Raster(gu.examples.get_path("everest_landsat_b4"))
rast_to_load.load()

# This raster is loaded
rast_to_load
Raster(
  data=[[255 255 255 ... 255 255 255]
        [255 255 255 ... 255 255 255]
        [255 255 255 ... 255 255 255]
        ...
        [ 74  76  79 ... 121 119 141]
        [ 75  83  70 ... 112 130 150]
        [ 64  86  68 ... 124 131 130]]
  transform=| 30.00, 0.00, 478000.00|
            | 0.00,-30.00, 3108140.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:32645
  nodata=None)

Lazy passing of georeferencing metadata

Operations that rely on the georeferencing metadata of Rasters or Vectors always respect lazy loading: an object that is not yet loaded stays unloaded.

For instance, using any Raster or Vector as a match-reference for a geospatial operation (see Match-reference functionality) will always preserve the lazy state of that match-reference object.

# Use a smaller Raster as reference to crop the initial one
smaller_rast = gu.Raster(gu.examples.get_path("everest_landsat_b4_cropped"))
rast.crop(smaller_rast)

# The reference raster is not loaded
smaller_rast
/home/docs/checkouts/readthedocs.org/user_builds/geoutils/checkouts/latest/geoutils/raster/raster.py:388: UserWarning: One raster has a pixel interpretation "Area" and the other "Point". To silence this warning, either correct the pixel interpretation of one raster, or deactivate warnings of pixel interpretation with geoutils.config["warn_area_or_point"]=False.
  warnings.warn(message=msg, category=UserWarning)
Raster(
  data=not_loaded; shape on disk (1, 315, 492); will load (315, 492)
  transform=| 30.00, 0.00, 483430.00|
            | 0.00,-30.00, 3102710.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:32645
  nodata=None)
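To see why a match-reference never needs its array, here is a minimal sketch that derives a crop window from metadata alone. It is not GeoUtils's implementation: `GridMeta` and `crop_window` are hypothetical names, and the sketch assumes north-up grids with square pixels and a reference lying fully inside the source. It also ignores the pixel interpretation ("Area" or "Point") mentioned in the warning above. The numbers are taken from the two transforms and shapes printed on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridMeta:
    """Hypothetical metadata-only description of a north-up raster grid."""
    x_min: float  # x coordinate of the upper-left corner
    y_max: float  # y coordinate of the upper-left corner
    res: float    # pixel size (square pixels assumed)
    height: int   # number of rows
    width: int    # number of columns


def crop_window(source: GridMeta, reference: GridMeta):
    """Return (row_off, col_off, height, width) of the reference extent in the source grid.

    Assumes the reference lies fully inside the source, so no clipping is done.
    """
    col_off = round((reference.x_min - source.x_min) / source.res)
    row_off = round((source.y_max - reference.y_max) / source.res)
    height = round(reference.height * reference.res / source.res)
    width = round(reference.width * reference.res / source.res)
    return row_off, col_off, height, width


# Values read from the transforms and shapes shown in the outputs above
source = GridMeta(x_min=478000.0, y_max=3108140.0, res=30.0, height=655, width=800)
reference = GridMeta(x_min=483430.0, y_max=3102710.0, res=30.0, height=315, width=492)

window = crop_window(source, reference)  # computed without touching any pixel value
```

Only the transform and shape of each grid enter the computation, which is why the reference object can stay unloaded.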

Optimized geospatial subsetting

Important

These features are a work in progress. We aim to make GeoUtils more lazy-friendly through Dask in future versions of the package!

Some georeferencing operations can be done without loading the entire array. Right now, relying directly on Rasterio, GeoUtils supports optimized subsetting through the crop() method.

# The previously cropped Raster was loaded without accessing the entire array
rast
Raster(
  data=not_loaded; shape on disk (1, 655, 800); will load (655, 800)
  transform=| 30.00, 0.00, 478000.00|
            | 0.00,-30.00, 3108140.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:32645
  nodata=None)
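The principle behind subsetting without loading the entire array can be sketched with the standard library alone. This toy example is an assumption-laden illustration, not what GeoUtils or Rasterio do internally: it uses a hypothetical headerless, uncompressed, row-major file of one byte per pixel, whereas real formats such as GeoTIFF may be tiled or compressed. The `read_window` helper is a hypothetical name. Rasterio exposes the same idea as windowed reading.

```python
import os
import tempfile


def read_window(path, full_width, row_off, col_off, height, width):
    """Read a window from a headerless row-major uint8 file (hypothetical toy format).

    Only the requested slice of each requested row is read from disk.
    """
    rows = []
    with open(path, "rb") as f:
        for r in range(row_off, row_off + height):
            f.seek(r * full_width + col_off)  # jump to the start of the row slice
            rows.append(list(f.read(width)))  # read only `width` bytes
    return rows


# Write a small 6 x 8 "raster" where each pixel stores its own flat index
full_height, full_width = 6, 8
pixels = bytes(r * full_width + c for r in range(full_height) for c in range(full_width))
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(pixels)
    path = tmp.name

# Read a 2 x 3 window starting at row 2, column 3, then clean up the file
window = read_window(path, full_width, row_off=2, col_off=3, height=2, width=3)
os.remove(path)
```

Here 6 of the 48 bytes on disk are read. The saving grows with the ratio between the full array and the requested window.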