Raster Analysis

Raster analysis in WbW-R covers the end-to-end workflow of reading, transforming, and writing gridded data — from simple cell-value arithmetic through multi-layer overlays, proximity operations, interpolation, and statistical testing. All heavy computation runs in the Whitebox backend.


Core Concepts

Raster analysis requires understanding these fundamental terms:

  • Cell (pixel): The smallest unit of a raster. Each cell stores a single value (integer or floating-point) and has a uniform spatial extent (e.g. 10 m × 10 m).
  • Data type: Integer (Int8, Int16, Int32) for categorical data; Float32 or Float64 for continuous measurements. Data type affects precision, file size, and computation speed.
  • NoData value: Sentinel value representing missing or invalid data (e.g. -9999 or NaN). Critical for masking water, clouds, or invalid measurements in focal operations.
  • Spatial reference (CRS): Coordinate system and projection. Mismatched CRS between rasters causes silent misalignment; always verify before overlay operations.
  • Extent: The bounding box (xmin, ymin, xmax, ymax) of the raster in real-world coordinates.
  • Cell size (resolution): Cell width and height in map units. Coarser resolution is faster but loses detail; finer resolution requires more memory and computation.
  • Focal operation: Uses neighbourhood values (e.g. 3×3 kernel) to compute output. Examples: moving average, Sobel edge detection, local extrema.
  • Zonal operation: Aggregates grid values by zone (polygon or categorical layer). Examples: mean by land-cover class, sum by administrative boundary.
  • Reclassification: Reassigning cell values according to lookup rules. Common for categorizing continuous data (e.g. slope classes) or remapping land-cover codes.
  • Resampling: Changing cell size or alignment. Methods: nearest-neighbour (preserves categories), bilinear (smooth), cubic (smoother).

Session Setup

library(whiteboxworkflows)

s <- wbw_session()
setwd('/data/raster')

Reading and Inspecting Rasters

r <- wbw_read_raster('input.tif')
meta <- r$metadata()

cat('Rows:', meta$rows, '  Cols:', meta$columns, '\n')
cat('Cell size:', meta$resolution_x, 'x', meta$resolution_y, '\n')
cat('EPSG:', meta$epsg, '\n')
cat('NoData:', meta$nodata, '\n')
cat('Data type:', meta$data_type, '\n')

Raster Calculator

raster_calculator evaluates an algebraic expression that combines one or more raster inputs:

# Single-raster expression — multiply by constant
wbw_raster_calculator(statement = "'elevation.tif' * 3.28084",
  output    = 'elevation_ft.tif')

# Multi-raster NDVI expression
wbw_raster_calculator(statement = "('nir.tif' - 'red.tif') / ('nir.tif' + 'red.tif')",
  output    = 'ndvi.tif')

# Conditional expression using isnull() and nodata()
wbw_raster_calculator(statement = "if(isnull('input.tif'), nodata(), 'input.tif' + 100.0)",
  output    = 'result.tif')

Special tokens available in statements: nodata(), isnull(), if(), abs(), sqrt(), log(), log2(), exp(), min(), max(), pi, integer constants, and floating-point constants.


Reclassification

# Reclassify using from-to-becomes triplets
# Format: "from;to;new;from;to;new;..."
wbw_reclass(i         = 'slope.tif',
  output    = 'slope_class.tif',
  reclass_vals = '0;5;1;5;15;2;15;30;3;30;45;4;45;90;5',
  assign_mode = FALSE)

# Equal-interval reclassification
wbw_reclass_equal_interval(i         = 'ndvi.tif',
  output    = 'ndvi_class.tif',
  interval  = 0.1,
  start_val = -1.0,
  end_val   = 1.0)

# Reclassify from a CSV lookup table
wbw_reclass_from_file(i          = 'landcover.tif',
  reclass_file = 'reclass_table.txt',
  output     = 'landcover_reclassed.tif')

Focal Statistics (Moving Windows)

# Gaussian filter
wbw_gaussian_filter(i = 'dem.tif', output = 'dem_gauss.tif', sigma = 1.5)

# Median filter (feature-preserving)
wbw_median_filter(i = 'dem.tif', output = 'dem_median.tif', filterx = 5, filtery = 5,
  sig_digits = 2)

# Feature-preserving smoothing (Zhang et al.)
wbw_feature_preserving_smoothing(dem = 'dem.tif', output = 'dem_fps.tif', filter = 11,
  norm_diff = 8.0, num_iter = 3, max_diff = 0.5, zfactor = 1.0)

# Standard deviation in a window
wbw_standard_dev_filter(i = 'dem.tif', output = 'dem_sd.tif', filterx = 11, filtery = 11)

# Percentile filter
wbw_percentile_filter(i = 'dem.tif', output = 'dem_pct75.tif', filterx = 11, filtery = 11,
  sig_digits = 2)

Morphological Operations

# Morphological closing (fills gaps in foreground)
wbw_closing(i = 'binary.tif', output = 'binary_close.tif', filterx = 3, filtery = 3)

# Morphological opening (removes small foreground blobs)
wbw_opening(i = 'binary.tif', output = 'binary_open.tif', filterx = 3, filtery = 3)

# Top-hat transform (white)
wbw_tophat_transform(i = 'raster.tif', output = 'tophat.tif', filterx = 11, filtery = 11,
  variant = 'white')

Global and Zonal Statistics

# Global statistics (summary of entire raster)
wbw_raster_histogram(i = 'dem.tif', output = 'dem_histogram.html')

# Zonal statistics — mean elevation per watershed zone
wbw_zonal_statistics(i         = 'dem.tif',
  features  = 'watersheds.tif',
  output    = 'watershed_stats.html',
  stat      = 'mean',
  out_raster = 'watershed_mean_elev.tif')

Raster Overlay Operations

# Weighted sum (multi-criteria suitability)
wbw_weighted_sum(inputs  = 'soil.tif;slope.tif;distance.tif',
  weights = '0.3;0.5;0.2',
  output  = 'suitability.tif')

# Weighted overlay (MCE) with constraint
wbw_weighted_overlay(inputs      = 'factor1.tif;factor2.tif;factor3.tif',
  weights     = '0.4;0.4;0.2',
  output      = 'suitability_mce.tif',
  scale_max   = 5.0,
  scale_min   = 0.0,
  scale_factor  = 1.0)

Resampling and Aggregation

wbw_resample(inputs      = 'dem.tif',
  output      = 'dem_10m.tif',
  cell_size   = 10.0,
  method      = 'bilinear')

wbw_aggregate_raster(i = 'dem.tif', output = 'dem_agg.tif', agg_factor = 5,
  type = 'mean')

Proximity Analysis

# Euclidean distance
wbw_euclidean_distance(i = 'sources.tif', output = 'euclidean_dist.tif')

# Cost-distance accumulation
wbw_cost_distance(source = 'sources.tif',
  cost   = 'friction.tif',
  out_accum = 'cost_accum.tif',
  out_backlink = 'cost_backlink.tif')

# Least-cost path
wbw_cost_pathway(destination = 'destinations.tif',
  backlink    = 'cost_backlink.tif',
  output      = 'least_cost_path.tif',
  zero_background = FALSE)

# Raster buffer
wbw_buffer_raster(i = 'features.tif', output = 'buffered.tif',
  size = 250.0, gridcells = FALSE)

Raster Object Analysis

# Label connected patches (foreground = non-zero)
wbw_clump(i = 'binary.tif', output = 'patches.tif',
  diag = TRUE, zero_back = TRUE)

# Remove small patches below area threshold (10 000 m²)
wbw_remove_spurs(i = 'patches.tif', output = 'patches_clean.tif',
  iterations = 10)

# Raster area of each patch value
wbw_raster_area(i = 'patches.tif', output = 'patch_area.tif',
  out_text = FALSE, units = 'map units', zero_back = TRUE)

Interpolation from Points

pts <- wbw_read_vector('sample_points.shp')

# IDW
wbw_idw_interpolation(i         = pts$file_path(),
  field     = 'ELEV',
  output    = 'idw_surf.tif',
  use_z     = FALSE,
  weight    = 2.0,
  radius    = 2.5,
  min_points = 2,
  cell_size  = 5.0)

# Natural Neighbour
wbw_natural_neighbour_interpolation(i        = pts$file_path(),
  field    = 'ELEV',
  output   = 'nn_surf.tif',
  use_z    = FALSE,
  cell_size = 5.0)

# Radial Basis Function
wbw_radial_basis_function_interpolation(i         = pts$file_path(),
  field     = 'ELEV',
  output    = 'rbf_surf.tif',
  num_points = 8,
  cell_size  = 5.0,
  func_type  = 'ThinPlateSpline',
  poly_order = 1,
  weight     = 0.1)

Statistical Tests

# Kolmogorov-Smirnov normality test
wbw_ks_test_for_normality(i = 'dem.tif', output = 'ks_normality.html')

# Two-raster paired samples t-test
wbw_paired_sample_t_test(input1 = 'dem_2010.tif', input2 = 'dem_2020.tif',
  output = 'ttest.html', num_samples = 1000)

Contour Generation

wbw_contours_from_raster(i          = 'dem.tif',
  output     = 'contours.shp',
  interval   = 10.0,
  base       = 0.0,
  smooth     = 5,
  tolerance  = 10.0)

WbW-Pro Spotlight: Field Trafficability and Operation Planning

  • Problem: Plan equipment timing and field access under variable moisture and weather conditions.
  • Tool: field_trafficability_and_operation_planning
  • Typical inputs: DEM, normalized soil-moisture raster, optional rainfall-risk raster.
  • Typical outputs: Trafficability score raster, operation-class raster, and summary outputs.
result <- s$field_trafficability_and_operation_planning(
  dem               = 'field_dem.tif',
  soil_moisture     = 'soil_moisture_norm.tif',
  rainfall_forecast = 'rainfall_risk_norm.tif',
  output_prefix     = 'field_12_trafficability'
)

print(result)

Note: This workflow requires a session initialized with a valid Pro licence.


Complete Raster Analysis Workflow

library(whiteboxworkflows)

s <- wbw_session()
setwd('/data/raster_workflow')

dem <- wbw_read_raster('dem.tif')

# 1. Smooth DEM
wbw_feature_preserving_smoothing(dem = dem$file_path(), output = 'dem_smooth.tif', filter = 11,
  norm_diff = 8.0, num_iter = 3, max_diff = 0.5)

# 2. Slope
wbw_slope(dem = 'dem_smooth.tif', output = 'slope.tif', units = 'degrees')

# 3. Reclassify slope into erosion risk classes
wbw_reclass(i = 'slope.tif', output = 'slope_risk.tif',
  reclass_vals = '0;5;1;5;15;2;15;30;3;30;90;4')

# 4. Euclidean distance to water
wbw_euclidean_distance(i = 'water_bodies.tif', output = 'dist_water.tif')

# 5. Multi-criteria suitability overlay
wbw_weighted_sum(inputs  = paste('slope_risk.tif', 'dist_water.tif', 'soil_type.tif', sep=';'),
  weights = '0.5;0.3;0.2',
  output  = 'suitability.tif')

# 6. Reclassify to binary mask (suitability > threshold)
wbw_raster_calculator(statement = "if('suitability.tif' >= 3.0, 1.0, 0.0)",
  output    = 'suitable_areas.tif')

# 7. Generate contours
wbw_contours_from_raster(i = dem$file_path(), output = 'contours_10m.shp',
  interval = 10.0, base = 0.0, smooth = 3)

cat('Raster analysis complete.\n')

Tips

  • Choose your data type: Use integers for categorical data (land cover, classifications) to minimize file size and computation time. Use floating-point (Float32 or Float64) only for continuous measurements (elevation, temperature, probability).
  • Set NoData explicitly: Ensure your source rasters carry a valid NoData value. Missing NoData declarations can corrupt statistics and focal operations by including invalid pixels as zeros or false elevations.
  • Compress carefully: LZW and DEFLATE compression work well for most data; avoid if you need rapid random access to interior tiles. Use COMPRESS=JPEG for photographic data only (lossy, unsuitable for analysis).
  • Focal operations require buffering: Cells at raster edges cannot compute full neighbourhoods. Use expand() or accept edge effects; never assume borders are valid in derivative rasters.
  • Zonal statistics are only as good as your zones: Ensure zone boundaries are topologically clean (no overlaps, no gaps). Overlapping zones cause double-counting; gaps cause NoData regions in output.
  • Reclassification is fast but risky: Always validate output distributions (histogram) after reclassification. Off-by-one errors in class boundaries can silently produce wrong land-cover or suitability classes.
  • Memory is the constraint for large rasters: Tiles > 2 GB require out-of-core or streaming processing. Use read_by_block() for large files; avoid loading entire rasters into memory if they exceed available RAM.
  • Upsampling introduces artifacts: Never upsample (finer resolution) without a valid interpolation method. Nearest-neighbour upsampling creates blocky artefacts; bilinear is smoother but may violate data range (e.g. probability values > 1.0).