⚠️ Our 0.1 release refactored several early-development functions for long-term stability, to update your code see here. ⚠️
Future changes will come with deprecation warnings! 🙂

The georeferenced raster (Raster)#

Below, a summary of the Raster object and its methods.

Object definition and attributes#

A Raster contains four main attributes:

  1. a numpy.ma.MaskedArray as data, of either integer or floating dtype,

  2. an affine.Affine as transform,

  3. a pyproj.crs.CRS as crs, and

  4. a float or int as nodata.

For more details on Raster class composition, see Composition from Rasterio and GeoPandas.

A Raster also contains many derivative attributes, with naming generally consistent with that of a rasterio.io.DatasetReader.

A first category includes georeferencing attributes directly derived from transform, namely: shape, height, width, res, bounds.

A second category concerns the attributes derived from the raster array shape and type: count, bands and dtype. The two former refer to the number of bands loaded in a Raster, and the band indexes.

Important

The bands of rasterio.io.DatasetReader start from 1 and not 0, be careful when instantiating or loading from a multi-band raster!

Finally, the remaining attributes are only relevant when instantiating from a on-disk file: name, driver, count_on_disk, bands_on_disk, is_loaded and is_modified.

Note

The count and bands attributes always exist, while count_on_disk and bands_on_disk only refers to the number of bands on the on-disk dataset, if it exists.

For example, count and count_on_disk will differ when a single band is loaded from a 3-band on-disk file, by passing a single index to the bands argument in Raster or load().

The complete list of Raster attributes with description is available in dedicated sections of the API.

Open and save#

A Raster is opened by instantiating with either a str, a pathlib.Path, a rasterio.io.DatasetReader or a rasterio.io.MemoryFile.

import geoutils as gu

# Instantiate a raster from a filename on disk
filename_rast = gu.examples.get_path("exploradores_aster_dem")
rast = gu.Raster(filename_rast)
rast
Raster(
  data=not_loaded; shape on disk (1, 618, 539); will load (618, 539)
  transform=| 30.00, 0.00, 627175.00|
            | 0.00,-30.00, 4852085.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:32718
  nodata=-9999.0)

Detailed information on the Raster is printed using info(), along with basic statistics using stats=True:

# Print details of raster
print(rast.info(stats=True))
Driver:               GTiff 
Opened from file:     /home/docs/checkouts/readthedocs.org/user_builds/geoutils/checkouts/latest/examples/data/Exploradores_ASTER/AST_L1A_00303182012144228_Z.tif 
Filename:             /home/docs/checkouts/readthedocs.org/user_builds/geoutils/checkouts/latest/examples/data/Exploradores_ASTER/AST_L1A_00303182012144228_Z.tif 
Loaded?               False 
Modified since load?  False 
Grid size:                 539, 618
Number of bands:      1
Data types:           float32
Coordinate system:    ['EPSG:32718']
Nodata value:         -9999.0
Pixel interpretation: Area
Pixel size:           30.0, 30.0
Upper left corner:    627175.0, 4833545.0
Lower right corner:   643345.0, 4852085.0
[MAXIMUM]:          3959.90
[MINIMUM]:          317.71
[MEDIAN]:           1276.06
[MEAN]:             1497.26
[STD DEV]:          586.92

None

Note

Calling info() with stats=True automatically loads the array in-memory, like any other operation calling data.

A Raster is saved to file by calling save() with a str or a pathlib.Path.

# Save raster to disk
rast.save("myraster.tif")

Create from ndarray#

A Raster is created from an array by calling the class method from_array() and passing the four main attributes.

import rasterio as rio
import pyproj
import numpy as np

# Create a random 3 x 3 masked array
np.random.seed(42)
arr = np.random.randint(0, 255, size=(3, 3), dtype="uint8")
mask = np.random.randint(0, 2, size=(3, 3), dtype="bool")
ma = np.ma.masked_array(data=arr, mask=mask)

# Create a raster from array
rast = gu.Raster.from_array(
       data = ma,
       transform = rio.transform.from_bounds(0, 0, 1, 1, 3, 3),
       crs = pyproj.CRS.from_epsg(4326),
       nodata = 255
    )
rast
Raster(
  data=[[102 -- --]
        [-- 179 61]
        [234 203 --]]
  transform=| 0.33, 0.00, 0.00|
            | 0.00,-0.33, 1.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:4326
  nodata=255)

Important

The data attribute can be passed as an unmasked ndarray. That array will be converted to a MaskedArray with a mask on all nan and inf in the data, and on all values matching the nodata value passed to from_array().

Get array#

The array of a Raster is available in data as a MaskedArray.

# Get raster's masked-array
rast.data
masked_array(
  data=[[102, --, --],
        [--, 179, 61],
        [234, 203, --]],
  mask=[[False,  True,  True],
        [ True, False, False],
        [False, False,  True]],
  fill_value=255,
  dtype=uint8)

For those less familiar with MaskedArrays and the associated functions in NumPy, an unmasked ndarray filled with nan on masked values can be extracted using get_nanarray().

# Get raster's nan-array
rast.get_nanarray()
array([[102.,  nan,  nan],
       [ nan, 179.,  61.],
       [234., 203.,  nan]], dtype=float32)

Important

Getting a ndarray filled with nan will automatically cast the dtype to numpy.float32. This might result in larger memory usage than in the original Raster (if of int type).

Thanks to the Masked-array NumPy interface, NumPy functions applied directly to a Raster will respect nodata values as well as if computing with the MaskedArray or an unmasked ndarray filled with nan.

Additionally, the Raster will automatically cast between different dtype, and possibly re-define missing nodatas.

Arithmetic#

A Raster can be applied any pythonic arithmetic operation (+, -, /, //, *, **, %) with another Raster, ndarray or number. It will output one or two Rasters. NumPy coercion rules apply for dtype.

# Add 1 and divide raster by 2
(rast + 1)/2
Raster(
  data=[[51.5 -- --]
        [-- 90.0 31.0]
        [117.5 102.0 --]]
  transform=| 0.33, 0.00, 0.00|
            | 0.00,-0.33, 1.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:4326
  nodata=255)

A Raster can also be applied any pythonic logical comparison operation (==, !=, >=, >, <=, <) with another Raster, ndarray or number. It will cast to a Mask.

# What raster pixels are less than 100?
rast < 100
Mask(
  data=[[False -- --]
        [-- False True]
        [False False --]]
  transform=| 0.33, 0.00, 0.00|
            | 0.00,-0.33, 1.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:4326)

See Support of pythonic operators for more details.

Array interface#

A Raster can be applied any NumPy universal functions and most mathematical, logical or masked-array functions with another Raster, ndarray or number.

# Compute the element-wise square-root
np.sqrt(rast)
Raster(
  data=[[10.1015625 -- --]
        [-- 13.3828125 7.80859375]
        [15.296875 14.25 --]]
  transform=| 0.33, 0.00, 0.00|
            | 0.00,-0.33, 1.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:4326
  nodata=255)

Logical comparison functions will cast to a Mask.

# Is the raster close to another within tolerance?

np.isclose(rast, rast+0.05, atol=0.1)
Mask(
  data=[[True -- --]
        [-- True True]
        [True True --]]
  transform=| 0.33, 0.00, 0.00|
            | 0.00,-0.33, 1.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:4326)

See Masked-array NumPy interface for more details.

Reproject#

Reprojecting a Raster is done through the reproject() function, which enforces new transform and/or crs.

Important

As with all geospatial handling methods, the reproject() function can be passed a Raster or Vector as a reference to match. In that case, no other argument is necessary.

A Raster reference will enforce to match its transform and crs. A Vector reference will enforce to match its bounds and crs.

See Match-reference functionality for more details.

The reproject() function can also be passed any individual arguments such as dst_bounds, to enforce specific georeferencing attributes. For more details, see the specific section and function descriptions in the API.

# Original bounds and resolution
print(rast.res)
print(rast.bounds)
(0.3333333333333333, 0.3333333333333333)
BoundingBox(left=0.0, bottom=0.0, right=1.0, top=1.0)
# Reproject to smaller bounds and higher resolution
rast_reproj = rast.reproject(
    res=0.1,
    bounds={"left": 0, "bottom": 0, "right": 0.75, "top": 0.75},
    resampling="cubic")
rast_reproj
Raster(
  data=[[102 102 116 -- -- -- -- --]
        [-- -- -- 159 174 161 126 90]
        [-- -- -- 179 179 161 126 91]
        [-- -- -- 196 189 172 141 101]
        [234 234 220 207 197 185 161 --]
        [234 234 225 215 205 198 190 --]
        [234 234 226 217 208 203 203 --]
        [-- -- -- -- -- -- -- --]]
  transform=| 0.10, 0.00, 0.00|
            | 0.00,-0.10, 0.75|
            | 0.00, 0.00, 1.00|
  crs=EPSG:4326
  nodata=255)
# New bounds and resolution
print(rast_reproj.res)
print(rast_reproj.bounds)
(0.1, 0.1)
BoundingBox(left=0.0, bottom=-0.050000000000000044, right=0.8, top=0.75)

Note

In GeoUtils, "bilinear" is the default resampling method. A simple str matching the naming of a rasterio.enums.Resampling method can be passed.

Resampling methods are listed in the dedicated section of Rasterio’s API.

Crop#

Cropping a Raster is done through the crop() function, which enforces new bounds.

Important

As with all geospatial handling methods, the crop() function can be passed only a Raster or Vector as a reference to match. In that case, no other argument is necessary.

See Match-reference functionality for more details.

The crop() function can also be passed a list or tuple of bounds (xmin, ymin, xmax, ymax). By default, crop() returns a new Raster. For more details, see the specific section and function descriptions in the API.

# Crop raster to smaller bounds
rast_crop = rast.crop(crop_geom=(0.3, 0.3, 1, 1))
print(rast_crop.bounds)
BoundingBox(left=0.0, bottom=0.33333333333333337, right=0.6666666666666666, top=1.0)

Polygonize#

Polygonizing a Raster is done through the polygonize() function, which converts target pixels into a multi-polygon Vector.

Note

For a Mask, polygonize() implicitly targets True values and thus does not require target pixels. See Polygonize (overloaded from Raster).

# Polygonize all values lower than 100
vect_lt_100 = (rast < 100).polygonize()
vect_lt_100
Vector(
  ds=   New_ID                                           geometry  raster_value
       0       0  POLYGON ((0.66667 0.66667, 0.66667 0.33333, 1....           1.0
  crs=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
  bounds=BoundingBox(left=0.6666666666666666, bottom=0.33333333333333337, right=1.0, top=0.6666666666666667))

Proximity#

Computing proximity from a Raster is done through by the proximity() function, which computes the closest distance to any target pixels in the Raster.

Note

For a Mask, proximity() implicitly targets True values and thus does not require target pixels. See Proximity (overloaded from Raster).

# Compute proximity from mask for all values lower than 100
prox_lt_100 = (rast < 100).proximity()
prox_lt_100
Raster(
  data=[[0.74535599 0.47140452 0.33333333]
        [0.66666667 0.33333333 0.        ]
        [0.74535599 0.47140452 0.33333333]]
  transform=| 0.33, 0.00, 0.00|
            | 0.00,-0.33, 1.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:4326
  nodata=None)

Optionally, instead of target pixel values, a Vector can be passed to compute the proximity from the geometry.

# Compute proximity from mask for all values lower than 100
prox_lt_100_from_vect = rast.proximity(vector=vect_lt_100)
prox_lt_100_from_vect
Raster(
  data=[[0.74535599 0.47140452 0.33333333]
        [0.66666667 0.33333333 0.        ]
        [0.66666667 0.33333333 0.        ]]
  transform=| 0.33, 0.00, 0.00|
            | 0.00,-0.33, 1.00|
            | 0.00, 0.00, 1.00|
  crs=EPSG:4326
  nodata=255)

Interpolate or extract to point#

Interpolating or extracting Raster values at specific points can be done through:

  • the value_at_coords() function, that extracts the single closest pixel or a surrounding window for each coordinate, on which can be applied reducing any function (numpy.ma.mean() by default), or

  • the interp_points() function, that interpolates the Raster’s regular grid to each coordinate using a resampling algorithm.

# Extract median value in a 3 x 3 pixel window
rast_reproj.value_at_coords(x=0.5, y=0.5, window=3, reducer_function=np.ma.median)
161.0
# Interpolate coordinate value with quintic algorithm
rast_reproj.interp_points([(0.5, 0.5)], method="quintic")
array([nan], dtype=float32)

Note

Both value_at_coords() and interp_points() can be passed a single coordinate as floats, or a list of coordinates.

Export#

A Raster can be exported to different formats, to facilitate inter-compatibility with different packages and code versions.

Those include exporting to:

# Export to rasterio dataset-reader through a memoryfile
rast_reproj.to_rio_dataset()
<open DatasetReader name='/vsimem/6dd82ca5-0275-435f-8c08-ef830f5e4ebe/6dd82ca5-0275-435f-8c08-ef830f5e4ebe.tif' mode='r'>
# Export to geopandas dataframe
rast_reproj.to_pointcloud()
Vector(
  ds=       b1                 geometry
       0   102.0  POINT (0.00000 0.75000)
       1   102.0  POINT (0.10000 0.75000)
       2   116.0  POINT (0.20000 0.75000)
       3   159.0  POINT (0.30000 0.65000)
       4   174.0  POINT (0.40000 0.65000)
       5   161.0  POINT (0.50000 0.65000)
       6   126.0  POINT (0.60000 0.65000)
       7    90.0  POINT (0.70000 0.65000)
       8   179.0  POINT (0.30000 0.55000)
       9   179.0  POINT (0.40000 0.55000)
       10  161.0  POINT (0.50000 0.55000)
       11  126.0  POINT (0.60000 0.55000)
       12   91.0  POINT (0.70000 0.55000)
       13  196.0  POINT (0.30000 0.45000)
       14  189.0  POINT (0.40000 0.45000)
       15  172.0  POINT (0.50000 0.45000)
       16  141.0  POINT (0.60000 0.45000)
       17  101.0  POINT (0.70000 0.45000)
       18  234.0  POINT (0.00000 0.35000)
       19  234.0  POINT (0.10000 0.35000)
       20  220.0  POINT (0.20000 0.35000)
       21  207.0  POINT (0.30000 0.35000)
       22  197.0  POINT (0.40000 0.35000)
       23  185.0  POINT (0.50000 0.35000)
       24  161.0  POINT (0.60000 0.35000)
       25  234.0  POINT (0.00000 0.25000)
       26  234.0  POINT (0.10000 0.25000)
       27  225.0  POINT (0.20000 0.25000)
       28  215.0  POINT (0.30000 0.25000)
       29  205.0  POINT (0.40000 0.25000)
       30  198.0  POINT (0.50000 0.25000)
       31  190.0  POINT (0.60000 0.25000)
       32  234.0  POINT (0.00000 0.15000)
       33  234.0  POINT (0.10000 0.15000)
       34  226.0  POINT (0.20000 0.15000)
       35  217.0  POINT (0.30000 0.15000)
       36  208.0  POINT (0.40000 0.15000)
       37  203.0  POINT (0.50000 0.15000)
       38  203.0  POINT (0.60000 0.15000)
  crs=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
  bounds=BoundingBox(left=0.0, bottom=0.1499999999999999, right=0.7000000000000001, top=0.75))
# Export to xarray data array
rast_reproj.to_xarray()
<xarray.DataArray (band: 1, y: 8, x: 8)> Size: 256B
[64 values with dtype=float32]
Coordinates:
  * band         (band) int64 8B 1
  * x            (x) float64 64B 0.05 0.15 0.25 0.35 0.45 0.55 0.65 0.75
  * y            (y) float64 64B 0.7 0.6 0.5 0.4 0.3 0.2 0.1 -1.11e-16
    spatial_ref  int64 8B 0
Attributes:
    AREA_OR_POINT:  Area
    scale_factor:   1.0
    add_offset:     0.0