Implicit lazy loading#
Lazy loading, also known as “call-by-need”, defers loading or evaluating a dataset until its values are actually required.
In GeoUtils, only metadata is loaded and passed around implicitly until the data itself is needed. We are also working to implement lazy analysis tools that rely on other packages.
Lazy instantiation of Rasters#
By default, GeoUtils instantiates a Raster from an on-disk file without loading its geoutils.Raster.data array. It only loads its metadata (transform, crs, nodata and derivatives, as well as name and driver).
import geoutils as gu
# Instantiate a raster from a filename on disk
filename_rast = gu.examples.get_path("everest_landsat_b4")
rast = gu.Raster(filename_rast)
# This raster is not loaded
rast
Raster(
data=not_loaded; shape on disk (1, 655, 800); will load (655, 800)
transform=| 30.00, 0.00, 478000.00|
| 0.00,-30.00, 3108140.00|
| 0.00, 0.00, 1.00|
crs=EPSG:32645
nodata=None)
To load the data explicitly at instantiation, load_data=True can be passed to Raster. Alternatively, the load() method can be called afterwards. The two are equivalent.
# Instantiate another raster just for the purpose of loading
rast_to_load = gu.Raster(gu.examples.get_path("everest_landsat_b4"))
rast_to_load.load()
# This raster is loaded
rast_to_load
Raster(
data=[[255 255 255 ... 255 255 255]
[255 255 255 ... 255 255 255]
[255 255 255 ... 255 255 255]
...
[ 74 76 79 ... 121 119 141]
[ 75 83 70 ... 112 130 150]
[ 64 86 68 ... 124 131 130]]
transform=| 30.00, 0.00, 478000.00|
| 0.00,-30.00, 3108140.00|
| 0.00, 0.00, 1.00|
crs=EPSG:32645
nodata=None)
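The loading pattern described above can be sketched independently of GeoUtils. The class below is a hypothetical stand-in (LazyRaster, reader and fake_read are illustrative names, not part of the GeoUtils API): metadata is stored at instantiation, while pixel values are only read when load() is called, when load_data=True is passed, or when data is first accessed.

```python
class LazyRaster:
    """Hypothetical stand-in for a lazily loaded raster (not the GeoUtils class)."""

    def __init__(self, shape, reader, load_data=False):
        self.shape = shape        # metadata, available immediately
        self._reader = reader     # callable that reads the pixel values
        self._data = None         # pixel values, not read yet
        if load_data:
            self.load()

    @property
    def is_loaded(self):
        return self._data is not None

    def load(self):
        """Read the pixel values now (in this sketch, a no-op if already loaded)."""
        if self._data is None:
            self._data = self._reader()

    @property
    def data(self):
        """Return the pixel values, reading them on first access."""
        self.load()
        return self._data


reads = []  # records every call to the (fake) reader


def fake_read():
    """Pretend to read a 655 x 800 array from disk."""
    reads.append("read")
    return [[255] * 800 for _ in range(655)]


lazy = LazyRaster((655, 800), fake_read)                    # nothing read yet
eager = LazyRaster((655, 800), fake_read, load_data=True)   # read at instantiation
print(lazy.is_loaded, eager.is_loaded)

lazy.load()                                                 # explicit load afterwards
print(lazy.data == eager.data)
```

Both routes end in the same state. The only difference is when the read happens, which is why keeping a raster unloaded costs nothing as long as only its metadata is used.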
Lazy passing of georeferencing metadata#
Operations relying on the georeferencing metadata of Rasters or Vectors always respect the lazy-loading state of the objects.
For instance, using any Raster or Vector as a match-reference for a geospatial operation (see Match-reference functionality) will always preserve the lazy loading of that match-reference object.
# Use a smaller Raster as reference to crop the initial one
smaller_rast = gu.Raster(gu.examples.get_path("everest_landsat_b4_cropped"))
rast.crop(smaller_rast)
# The reference raster is not loaded
smaller_rast
/home/docs/checkouts/readthedocs.org/user_builds/geoutils/checkouts/latest/geoutils/raster/raster.py:388: UserWarning: One raster has a pixel interpretation "Area" and the other "Point". To silence this warning, either correct the pixel interpretation of one raster, or deactivate warnings of pixel interpretation with geoutils.config["warn_area_or_point"]=False.
warnings.warn(message=msg, category=UserWarning)
Raster(
data=not_loaded; shape on disk (1, 315, 492); will load (315, 492)
transform=| 30.00, 0.00, 483430.00|
| 0.00,-30.00, 3102710.00|
| 0.00, 0.00, 1.00|
crs=EPSG:32645
nodata=None)
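To see why a match-reference never needs its pixel values, note that its extent follows from the transform and shape alone. The sketch below is plain arithmetic rather than GeoUtils code: bounds_from_metadata is a hypothetical helper, and it assumes a north-up raster with no rotation. The numbers are copied from the printed metadata of the reference raster.

```python
def bounds_from_metadata(transform, shape):
    """Return (left, bottom, right, top) for a north-up raster.

    `transform` is (pixel_width, x_origin, pixel_height, y_origin), with a
    negative pixel_height; `shape` is (rows, cols). No pixel values are used.
    """
    pixel_width, x_origin, pixel_height, y_origin = transform
    rows, cols = shape
    left = x_origin
    top = y_origin
    right = x_origin + cols * pixel_width
    bottom = y_origin + rows * pixel_height
    return left, bottom, right, top


# Metadata of the cropped reference raster, as printed in its representation
ref_transform = (30.0, 483430.0, -30.0, 3102710.0)
ref_shape = (315, 492)

ref_bounds = bounds_from_metadata(ref_transform, ref_shape)
print(ref_bounds)  # (483430.0, 3093260.0, 498190.0, 3102710.0)
```

The bounds are everything a crop needs from its reference, so the reference can stay unloaded.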
Optimized geospatial subsetting#
Important
These features are a work in progress: we aim to make GeoUtils more lazy-friendly through Dask in future versions of the package!
Some georeferencing operations can be done without loading the entire array. Right now, relying directly on Rasterio, GeoUtils supports optimized subsetting through the crop() method.
# The crop above read only the requested subset: the original Raster is still not loaded
rast
Raster(
data=not_loaded; shape on disk (1, 655, 800); will load (655, 800)
transform=| 30.00, 0.00, 478000.00|
| 0.00,-30.00, 3108140.00|
| 0.00, 0.00, 1.00|
crs=EPSG:32645
nodata=None)
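As a rough illustration of what such subsetting saves, the pixel window covered by the reference can be worked out from the two transforms printed on this page. This is an arithmetic sketch, not the GeoUtils or Rasterio implementation: window_from_reference is a hypothetical helper, and it assumes both grids are north-up, share the same CRS and pixel size, and are aligned.

```python
def window_from_reference(src_transform, ref_transform, ref_shape):
    """Return (row_off, col_off, height, width) of the reference grid in the source grid.

    Transforms are (pixel_width, x_origin, pixel_height, y_origin) with a
    negative pixel_height. Assumes aligned grids with identical pixel size.
    """
    pixel_width, src_x0, pixel_height, src_y0 = src_transform
    _, ref_x0, _, ref_y0 = ref_transform
    col_off = round((ref_x0 - src_x0) / pixel_width)
    row_off = round((ref_y0 - src_y0) / pixel_height)
    height, width = ref_shape
    return row_off, col_off, height, width


# Metadata copied from the two raster representations shown on this page
src_transform = (30.0, 478000.0, -30.0, 3108140.0)
src_shape = (655, 800)
ref_transform = (30.0, 483430.0, -30.0, 3102710.0)
ref_shape = (315, 492)

window = window_from_reference(src_transform, ref_transform, ref_shape)
print(window)  # (181, 181, 315, 492)

# Share of the source pixels that fall inside the window
fraction = (window[2] * window[3]) / (src_shape[0] * src_shape[1])
print(f"{fraction:.1%} of the source pixels are inside the window")
```

Under these assumptions, the window starts 181 rows and 181 columns into the source grid and covers roughly 30% of its pixels, which is the part a windowed read would need to touch.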