The georeferenced raster (Raster
)#
Below, a summary of the Raster
object and its methods.
Object definition and attributes#
A Raster
contains four main attributes:
a
numpy.ma.MaskedArray
asdata
, of eitherinteger
orfloating
dtype
,an
affine.Affine
astransform
,a
pyproj.crs.CRS
ascrs
, and
For more details on Raster
class composition, see Composition from Rasterio and GeoPandas.
A Raster
also contains many derivative attributes, with naming generally consistent with that of a rasterio.io.DatasetReader
.
A first category includes georeferencing attributes directly derived from transform
, namely: shape
,
height
, width
, res
, bounds
.
A second category concerns the attributes derived from the raster array shape and type: count
, bands
and
dtype
. The two former refer to the number of bands loaded in a Raster
, and the band indexes.
Important
The bands
of rasterio.io.DatasetReader
start from 1 and not 0, be careful when instantiating or loading from a
multi-band raster!
Finally, the remaining attributes are only relevant when instantiating from a on-disk file: name
, driver
,
count_on_disk
, bands_on_disk
, is_loaded
and is_modified
.
Note
The count
and bands
attributes always exist, while count_on_disk
and
bands_on_disk
only refers to the number of bands on the on-disk dataset, if it exists.
For example, count
and count_on_disk
will differ when a single band is loaded from a
3-band on-disk file, by passing a single index to the bands
argument in Raster
or load()
.
The complete list of Raster
attributes with description is available in dedicated sections of the API.
Open and save#
A Raster
is opened by instantiating with either a str
, a pathlib.Path
, a rasterio.io.DatasetReader
or a
rasterio.io.MemoryFile
.
import geoutils as gu
# Instantiate a raster from a filename on disk
filename_rast = gu.examples.get_path("exploradores_aster_dem")
rast = gu.Raster(filename_rast)
rast
Raster(
data=not_loaded; shape on disk (1, 618, 539); will load (618, 539)
transform=| 30.00, 0.00, 627175.00|
| 0.00,-30.00, 4852085.00|
| 0.00, 0.00, 1.00|
crs=EPSG:32718
nodata=-9999.0)
Detailed information on the Raster
is printed using info()
, along with basic statistics using stats=True
:
# Print details of raster
print(rast.info(stats=True))
Driver: GTiff
Opened from file: /home/docs/checkouts/readthedocs.org/user_builds/geoutils/checkouts/stable/examples/data/Exploradores_ASTER/AST_L1A_00303182012144228_Z.tif
Filename: /home/docs/checkouts/readthedocs.org/user_builds/geoutils/checkouts/stable/examples/data/Exploradores_ASTER/AST_L1A_00303182012144228_Z.tif
Loaded? False
Modified since load? False
Grid size: 539, 618
Number of bands: 1
Data types: float32
Coordinate system: ['EPSG:32718']
Nodata value: -9999.0
Pixel interpretation: Area
Pixel size: 30.0, 30.0
Upper left corner: 627175.0, 4833545.0
Lower right corner: 643345.0, 4852085.0
[MAXIMUM]: 3959.90
[MINIMUM]: 317.71
[MEDIAN]: 1276.06
[MEAN]: 1497.26
[STD DEV]: 586.92
None
Note
Calling info()
with stats=True
automatically loads the array in-memory, like any other operation calling data
.
A Raster
is saved to file by calling save()
with a str
or a pathlib.Path
.
# Save raster to disk
rast.save("myraster.tif")
Create from ndarray
#
A Raster
is created from an array by calling the class method from_array()
and passing the
four main attributes.
import rasterio as rio
import pyproj
import numpy as np
# Create a random 3 x 3 masked array
np.random.seed(42)
arr = np.random.randint(0, 255, size=(3, 3), dtype="uint8")
mask = np.random.randint(0, 2, size=(3, 3), dtype="bool")
ma = np.ma.masked_array(data=arr, mask=mask)
# Create a raster from array
rast = gu.Raster.from_array(
data = ma,
transform = rio.transform.from_bounds(0, 0, 1, 1, 3, 3),
crs = pyproj.CRS.from_epsg(4326),
nodata = 255
)
rast
Raster(
data=[[102 -- --]
[-- 179 61]
[234 203 --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255)
Important
The data
attribute can be passed as an unmasked ndarray
. That array will be converted to a MaskedArray
with a mask
on all nan
and inf
in the data
, and on all values
matching the nodata
value passed to from_array()
.
Get array#
The array of a Raster
is available in data
as a MaskedArray
.
# Get raster's masked-array
rast.data
masked_array(
data=[[102, --, --],
[--, 179, 61],
[234, 203, --]],
mask=[[False, True, True],
[ True, False, False],
[False, False, True]],
fill_value=255,
dtype=uint8)
For those less familiar with MaskedArrays
and the associated functions in NumPy, an unmasked ndarray
filled with
nan
on masked values can be extracted using get_nanarray()
.
# Get raster's nan-array
rast.get_nanarray()
array([[102., nan, nan],
[ nan, 179., 61.],
[234., 203., nan]], dtype=float32)
Important
Getting a ndarray
filled with nan
will automatically cast the dtype
to numpy.float32
. This
might result in larger memory usage than in the original Raster
(if of int
type).
Thanks to the Masked-array NumPy interface, NumPy functions applied directly to a Raster
will respect nodata
values as well as if computing with the MaskedArray
or an unmasked ndarray
filled with nan
.
Additionally, the Raster
will automatically cast between different dtype
, and possibly re-define missing
nodatas
.
Arithmetic#
A Raster
can be applied any pythonic arithmetic operation (+
, -
, /
, //
, *
,
**
, %
) with another Raster
, ndarray
or number. It will output one or two Rasters
. NumPy coercion rules apply for dtype
.
# Add 1 and divide raster by 2
(rast + 1)/2
Raster(
data=[[51.5 -- --]
[-- 90.0 31.0]
[117.5 102.0 --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255)
A Raster
can also be applied any pythonic logical comparison operation (==
, !=
, >=
, >
, <=
,
<
) with another Raster
, ndarray
or number. It will cast to a Mask
.
# What raster pixels are less than 100?
rast < 100
Mask(
data=[[False -- --]
[-- False True]
[False False --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326)
See Support of pythonic operators for more details.
Array interface#
A Raster
can be applied any NumPy universal functions and most mathematical, logical or masked-array functions with another
Raster
, ndarray
or number.
# Compute the element-wise square-root
np.sqrt(rast)
Raster(
data=[[10.1015625 -- --]
[-- 13.3828125 7.80859375]
[15.296875 14.25 --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255)
Logical comparison functions will cast to a Mask
.
# Is the raster close to another within tolerance?
np.isclose(rast, rast+0.05, atol=0.1)
Mask(
data=[[True -- --]
[-- True True]
[True True --]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326)
See Masked-array NumPy interface for more details.
Reproject#
Reprojecting a Raster
is done through the reproject()
function, which enforces new transform
and/or
crs
.
Important
As with all geospatial handling methods, the reproject()
function can be passed a Raster
or
Vector
as a reference to match. In that case, no other argument is necessary.
A Raster
reference will enforce to match its transform
and crs
.
A Vector
reference will enforce to match its bounds
and crs
.
See Match-reference functionality for more details.
The reproject()
function can also be passed any individual arguments such as dst_bounds
, to enforce specific georeferencing
attributes. For more details, see the specific section and function descriptions in the API.
# Original bounds and resolution
print(rast.res)
print(rast.bounds)
(0.3333333333333333, 0.3333333333333333)
BoundingBox(left=0.0, bottom=0.0, right=1.0, top=1.0)
# Reproject to smaller bounds and higher resolution
rast_reproj = rast.reproject(
res=0.1,
bounds={"left": 0, "bottom": 0, "right": 0.75, "top": 0.75},
resampling="cubic")
rast_reproj
Raster(
data=[[102 102 116 -- -- -- -- --]
[-- -- -- 159 174 161 126 90]
[-- -- -- 179 179 161 126 91]
[-- -- -- 196 189 172 141 101]
[234 234 220 207 197 185 161 --]
[234 234 225 215 205 198 190 --]
[234 234 226 217 208 203 203 --]
[-- -- -- -- -- -- -- --]]
transform=| 0.10, 0.00, 0.00|
| 0.00,-0.10, 0.75|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=255)
# New bounds and resolution
print(rast_reproj.res)
print(rast_reproj.bounds)
(0.1, 0.1)
BoundingBox(left=0.0, bottom=-0.050000000000000044, right=0.8, top=0.75)
Note
In GeoUtils, "bilinear"
is the default resampling method. A simple str
matching the naming of a rasterio.enums.Resampling
method can be
passed.
Resampling methods are listed in the dedicated section of Rasterio’s API.
Crop#
Cropping a Raster
is done through the crop()
function, which enforces new bounds
.
Important
As with all geospatial handling methods, the crop()
function can be passed only a Raster
or Vector
as a reference to match. In that case, no other argument is necessary.
See Match-reference functionality for more details.
The crop()
function can also be passed a list
or tuple
of bounds (xmin
, ymin
, xmax
, ymax
). By default,
crop()
returns a new Raster.
For more details, see the specific section and function descriptions in the API.
# Crop raster to smaller bounds
rast_crop = rast.crop(crop_geom=(0.3, 0.3, 1, 1))
print(rast_crop.bounds)
BoundingBox(left=0.0, bottom=0.33333333333333337, right=0.6666666666666666, top=1.0)
Polygonize#
Polygonizing a Raster
is done through the polygonize()
function, which converts target pixels into a multi-polygon
Vector
.
Note
For a Mask
, polygonize()
implicitly targets True
values and thus does not require target pixels. See
Polygonize (overloaded from Raster).
# Polygonize all values lower than 100
vect_lt_100 = (rast < 100).polygonize()
vect_lt_100
Vector(
ds= id geometry raster_value
0 0 POLYGON ((0.66667 0.66667, 0.66667 0.33333, 1 ... 1.0
crs=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
bounds=BoundingBox(left=0.6666666666666666, bottom=0.33333333333333337, right=1.0, top=0.6666666666666667))
Proximity#
Computing proximity from a Raster
is done through by the proximity()
function, which computes the closest distance
to any target pixels in the Raster
.
Note
For a Mask
, proximity()
implicitly targets True
values and thus does not require target pixels. See
Proximity (overloaded from Raster).
# Compute proximity from mask for all values lower than 100
prox_lt_100 = (rast < 100).proximity()
prox_lt_100
Raster(
data=[[0.74535599 0.47140452 0.33333333]
[0.66666667 0.33333333 0. ]
[0.74535599 0.47140452 0.33333333]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=-99999)
Optionally, instead of target pixel values, a Vector
can be passed to compute the proximity from the geometry.
# Compute proximity from mask for all values lower than 100
prox_lt_100_from_vect = rast.proximity(vector=vect_lt_100)
prox_lt_100_from_vect
Raster(
data=[[0.74535599 0.47140452 0.33333333]
[0.66666667 0.33333333 0. ]
[0.66666667 0.33333333 0. ]]
transform=| 0.33, 0.00, 0.00|
| 0.00,-0.33, 1.00|
| 0.00, 0.00, 1.00|
crs=EPSG:4326
nodata=-99999)
Interpolate or reduce to point#
Interpolating or extracting Raster
values at specific points can be done through:
the
reduce_points()
function, that applies a reductor function (numpy.ma.mean()
by default) to a surrounding window for each coordinate, orthe
interp_points()
function, that interpolates theRaster
’s regular grid to each coordinate using a resampling algorithm.
# Extract median value in a 3 x 3 pixel window
rast_reproj.reduce_points((0.5, 0.5), window=3, reducer_function=np.ma.median)
161.0
# Interpolate coordinate value with quintic algorithm
rast_reproj.interp_points((0.5, 0.5), method="quintic")
array([nan], dtype=float32)
Note
Both reduce_points()
and interp_points()
can be passed a single coordinate as floats
, or a
list
of coordinates.
Export#
A Raster
can be exported to different formats, to facilitate inter-compatibility with different packages and code versions.
Those include exporting to:
a
xarray.Dataset
withto_xarray
,a
numpy.ndarray
orgeoutils.Vector
as a point cloud withto_pointcloud
.
# Export to rasterio dataset-reader through a memoryfile
rast_reproj.to_rio_dataset()
<open DatasetReader name='/vsimem/0234e2e4-32c2-4cca-af50-1455cd77a862/0234e2e4-32c2-4cca-af50-1455cd77a862.tif' mode='r'>
# Export to geopandas dataframe
rast_reproj.to_pointcloud()
Vector(
ds= b1 geometry
0 102 POINT (0 0.75)
1 102 POINT (0.1 0.75)
2 116 POINT (0.2 0.75)
3 159 POINT (0.3 0.65)
4 174 POINT (0.4 0.65)
5 161 POINT (0.5 0.65)
6 126 POINT (0.6 0.65)
7 90 POINT (0.7 0.65)
8 179 POINT (0.3 0.55)
9 179 POINT (0.4 0.55)
10 161 POINT (0.5 0.55)
11 126 POINT (0.6 0.55)
12 91 POINT (0.7 0.55)
13 196 POINT (0.3 0.45)
14 189 POINT (0.4 0.45)
15 172 POINT (0.5 0.45)
16 141 POINT (0.6 0.45)
17 101 POINT (0.7 0.45)
18 234 POINT (0 0.35)
19 234 POINT (0.1 0.35)
20 220 POINT (0.2 0.35)
21 207 POINT (0.3 0.35)
22 197 POINT (0.4 0.35)
23 185 POINT (0.5 0.35)
24 161 POINT (0.6 0.35)
25 234 POINT (0 0.25)
26 234 POINT (0.1 0.25)
27 225 POINT (0.2 0.25)
28 215 POINT (0.3 0.25)
29 205 POINT (0.4 0.25)
30 198 POINT (0.5 0.25)
31 190 POINT (0.6 0.25)
32 234 POINT (0 0.15)
33 234 POINT (0.1 0.15)
34 226 POINT (0.2 0.15)
35 217 POINT (0.3 0.15)
36 208 POINT (0.4 0.15)
37 203 POINT (0.5 0.15)
38 203 POINT (0.6 0.15)
crs=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
bounds=BoundingBox(left=0.0, bottom=0.1499999999999999, right=0.7000000000000001, top=0.75))
# Export to xarray data array
rast_reproj.to_xarray()
<xarray.DataArray (band: 1, y: 8, x: 8)> Size: 256B [64 values with dtype=float32] Coordinates: * band (band) int64 8B 1 * x (x) float64 64B 0.05 0.15 0.25 0.35 0.45 0.55 0.65 0.75 * y (y) float64 64B 0.7 0.6 0.5 0.4 0.3 0.2 0.1 -1.11e-16 spatial_ref int64 8B 0 Attributes: AREA_OR_POINT: Area scale_factor: 1.0 add_offset: 0.0