Masked-array NumPy interface#
NumPy possesses an array interface that allows to properly map their functions on objects
that depend on ndarrays
.
GeoUtils utilizes this interface to work with all Rasters
and their subclasses.
Universal functions#
A first category of NumPy functions supported by Rasters
through the array interface is that of
universal functions, which operate on ndarrays
in an element-by-element
fashion. Examples of such functions are add()
, absolute()
, isnan()
or sin()
, and they number at more
than 90.
Universal functions can take one or two inputs, and return one or two outputs. Through GeoUtils, as long as one of the two inputs is a Rasters
,
the output will be a Raster
. If there is a second input, it can be a Raster
or ndarray
with
matching georeferencing or shape, respectively.
These functions inherently support the casting of different dtype
and values masked by nodata
in the
MaskedArray
.
Below, we re-use the same example created in Support of pythonic operators.
Show code cell source
import geoutils as gu
import rasterio as rio
import pyproj
import numpy as np
# Create a random 3 x 3 masked array
np.random.seed(42)
arr = np.random.randint(0, 255, size=(3, 3), dtype="uint8")
mask = np.random.randint(0, 2, size=(3, 3), dtype="bool")
ma = np.ma.masked_array(data=arr, mask=mask)
# Create an example raster
rast = gu.Raster.from_array(
data = ma,
transform = rio.transform.from_bounds(0, 0, 1, 1, 3, 3),
crs = pyproj.CRS.from_epsg(4326),
nodata = 255
)
rast
Show code cell output
Raster(
data=[[102 -- --]
[-- 179 61]
[234 203 --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255)
# Universal function with a single input and output
np.sin(rast)
Raster(
data=[[0.99462890625 -- --]
[-- 0.07073974609375 -0.96630859375]
[0.9990234375 0.93310546875 --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255)
# Universal function with a two inputs and single output
np.add(arr, rast)
Raster(
data=[[204 -- --]
[-- 102 122]
[212 150 --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255)
# Universal function with a single input and two outputs
np.modf(rast)
(Raster(
data=[[0.0 -- --]
[-- 0.0 0.0]
[0.0 0.0 --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255),
Raster(
data=[[102.0 -- --]
[-- 179.0 61.0]
[234.0 203.0 --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255))
Similar to with Python operators, NumPy’s logical comparison functions cast
Rasters
to a Mask
.
# Universal function with a single input and two outputs
np.greater(rast, rast + np.random.normal(size=np.shape(arr)))
Mask(
data=[[False -- --]
[-- False False]
[True False --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326)
Array functions#
The second and last category of NumPy array functions supported by Rasters
through the array interface is that of array functions,
which are all other non-universal functions that can be applied to an array. Those function always modify the dimensionality of the output, such as
mean()
, count_nonzero()
or nanmax()
. Consequently, the output is the same as it would be with ndarrays
.
# Traditional mathematical function
np.max(rast)
234
# Expliciting an axis for reduction
np.count_nonzero(rast, axis=1)
masked_array(data=[1, 2, 2],
mask=[False, False, False],
fill_value=999999)
Not all array functions are supported, however. GeoUtils supports nearly all mathematical functions,
masked-array functions and logical functions.
A full list of supported array function is available in geoutils.raster.handled_array_funcs
.
Respecting masked values#
There are two ways to compute statistics on Rasters
while respecting masked values:
Use any NumPy core function (
np.func
) directly on theRaster
(this includes NaN functionsnp.nanfunc
),Use any NumPy masked-array function (
np.ma.func
) onRaster.data
.
# Numpy core function applied to the raster
np.median(rast)
179.0
# Numpy NaN function applied to the raster
np.nanmedian(rast)
179.0
# Masked-array function on the data
np.ma.median(rast.data)
179.0
If a NumPy core function raises an error (e.g., numpy.percentile()
), nodata
values might not be respected. In this case, use the NaN
function on the Raster
.
Note
Unfortunately, masked-array functions np.ma.func
cannot be recognized yet if applied directly to a Raster
, but this should come
soon as related interfacing is in the works in NumPy!