Overview
This manual is the long-form user guide for WbW-QGIS.
WbW-QGIS (Whitebox Workflows for QGIS) is the QGIS frontend for Whitebox Next Gen. It provides a QGIS-native way to discover, configure, and run Whitebox tools through the Processing framework and plugin UI.
Whitebox Next Gen uses a layered architecture:
- backend geospatial engines and tools in Rust,
- frontend runtimes for Python, R, and QGIS,
- shared tool taxonomy and capability metadata.
Whitebox Next Gen is intentionally full-stack: core geospatial capabilities that are often delegated to external C/C++ dependencies in other GIS platforms (for example raster I/O, projections, geometry/topology operations, and lidar handling) are implemented in the Whitebox codebase itself. This architecture is unusual in GIS and provides practical benefits for users: consistent behavior across platforms, tighter control over correctness and performance, fewer system-level dependency issues during installation, and faster iteration when fixing bugs or introducing new capabilities.
Within this model, WbW-QGIS is intentionally a thin integration layer. It handles QGIS presentation and orchestration while computation remains in the Whitebox backend runtime.
What This Manual Covers
This guide focuses on practical use of WbW-QGIS:
- setting up the plugin and runtime correctly,
- understanding discovery and provider refresh,
- running tools through QGIS Processing,
- handling output and troubleshooting common issues.
The manual is written for both analysts and developers who use QGIS as the primary working environment.
Goals
- Provide a stable onboarding path for local installation.
- Document the operational behavior of plugin discovery and execution.
- Clarify tier and licensing behavior in QGIS.
- Reduce setup friction and runtime ambiguity.
How to Use This Manual
For first-time setup, read chapters in order:
- Installation and Setup
- Build and Preview
- Quick Start
- Runtime and Discovery
After setup is stable, use the remaining chapters as reference material.
Installation and Setup
This chapter covers installing the Whitebox Workflows QGIS plugin and its Python backend.
System Requirements
- QGIS 4.0 or later (3.28+ may work but is not officially supported)
- Internet connection (required for backend installation via pip)
- macOS, Linux, or Windows
Install the QGIS Plugin
From QGIS Plugin Repository (Recommended)
- Open QGIS
- Go to Plugins → Manage and Install Plugins
- Search for "Whitebox Workflows"
- Click Install Plugin
- Restart QGIS
From File (Manual Installation)
If you have a plugin .zip file:
- Download or obtain the whitebox_workflows_for_qgis-*.zip file
- Extract the zip to your QGIS plugins directory:
  - macOS: ~/Library/Application Support/QGIS/QGIS4/profiles/default/python/plugins/
  - Linux: ~/.local/share/QGIS/QGIS4/profiles/default/python/plugins/
  - Windows: %APPDATA%\QGIS\QGIS4\profiles\default\python\plugins\
- Ensure the directory structure is plugins/whitebox_workflows_qgis/ with __init__.py and metadata.txt directly inside
- Restart QGIS
Install the Whitebox Workflows Backend
After installing the plugin, restart QGIS. The plugin will check for the
whitebox-workflows Python package.
Option A: Install via Plugin Dialog (Easiest)
- Open QGIS
- If the backend is not installed, a dialog appears: "⚠️ Action Required — Install Whitebox Workflows Backend"
- Read the installation instructions (or copy the command to clipboard)
exec("import runpy,sys\nsys.argv=['pip','install','--user','whitebox-workflows']\ntry:\n runpy.run_module('pip',run_name='__main__',alter_sys=True)\nexcept SystemExit:\n pass\nimport qgis.utils\nqgis.utils.unloadPlugin('whitebox_workflows_qgis')\nqgis.utils.loadPlugin('whitebox_workflows_qgis')\nqgis.utils.startPlugin('whitebox_workflows_qgis')\nprint('Whitebox Workflows backend installed and plugin reloaded.')")
- Click Install to automatically download and install the backend using QGIS's bundled Python and pip
- Wait for the installation to complete
- The plugin automatically reloads with full access to all tools
Option B: Install via Command Line
If you prefer manual installation, run:
pip install whitebox-workflows
Or, if you have the QGIS Python environment:
# macOS/Linux: run pip through the python3 found on PATH (confirm this is the interpreter QGIS uses)
$(which python3) -m pip install whitebox-workflows
# Or explicitly use QGIS's Python if installed locally:
/Applications/QGIS.app/Contents/MacOS/Python/bin/python3 -m pip install whitebox-workflows
After installation, restart QGIS.
Verify Installation
Once the backend is installed, you should see:
- The Processing Toolbox populates with 700+ Whitebox tools
- No error messages in the QGIS message log
- Whitebox tools appear in Processing → Toolbox → Whitebox Workflows
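A quick check from the QGIS Python Console can confirm that the backend module is importable before you look for tools in the toolbox. The sketch below uses only the standard library. The helper name backend_available is illustrative and is not part of the plugin.

```python
import importlib.util


def backend_available(module_name: str = "whitebox_workflows") -> bool:
    """Return True if the named top-level module can be found by this interpreter."""
    # find_spec locates a top-level module without executing it.
    return importlib.util.find_spec(module_name) is not None


if backend_available():
    print("whitebox_workflows is importable in this Python environment.")
else:
    print("whitebox_workflows was not found; install the backend first.")
```

Run it inside QGIS rather than in a system terminal, so that the result reflects the interpreter QGIS actually uses.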
For Development and Local Testing
If you are a developer working with the source repository:
- Check out the whitebox_next_gen repository
- Install the plugin locally by symlinking it into your QGIS plugins folder:
export QGIS_PLUGIN_DIR="<QGIS settings dir>/python/plugins"
mkdir -p "$QGIS_PLUGIN_DIR"
ln -snf "$PWD/crates/wbw_qgis/plugin/whitebox_workflows_qgis" \
  "$QGIS_PLUGIN_DIR/whitebox_workflows_qgis"
- Install the backend using the automated installer or pip install whitebox-workflows
- Changes to the plugin source take effect the next time QGIS restarts
Troubleshooting
"Cannot find __init__.py or metadata.txt"
This error means the plugin zip was extracted incorrectly. Ensure your plugins directory contains:
plugins/
  whitebox_workflows_qgis/
    __init__.py
    metadata.txt
    bootstrap.py
    plugin.py
    ... (other files)
Not:
plugins/
  whitebox_workflows_qgis/
    whitebox_workflows_qgis/   ← Extra nesting
      __init__.py
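The two layouts above can be told apart with a short script. This is a minimal sketch using only the standard library; the function name check_plugin_layout and its return labels are illustrative, and the path you pass in is whatever plugins directory applies to your profile.

```python
from pathlib import Path

REQUIRED_FILES = ("__init__.py", "metadata.txt")
PLUGIN_DIR_NAME = "whitebox_workflows_qgis"


def check_plugin_layout(plugins_dir) -> str:
    """Classify the plugin folder as 'ok', 'nested', or 'missing'."""
    plugin_dir = Path(plugins_dir) / PLUGIN_DIR_NAME
    # Correct layout: required files sit directly inside the plugin folder.
    if all((plugin_dir / name).is_file() for name in REQUIRED_FILES):
        return "ok"
    # Extra nesting: the files sit one level too deep.
    nested = plugin_dir / PLUGIN_DIR_NAME
    if all((nested / name).is_file() for name in REQUIRED_FILES):
        return "nested"
    return "missing"
```

A result of "nested" means the inner whitebox_workflows_qgis folder should be moved up one level.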
Backend Installation Fails
- "pip: command not found": use python3 -m pip instead
- "Permission denied": try pip install --user whitebox-workflows
- Network issues: check your internet connection and try again
- For help: see Troubleshooting or contact support@whiteboxgeo.com
Build and Preview
The WbW-QGIS manual uses mdBook, matching the WbW-Python and WbW-R manuals.
Build the Manual
From the manual directory:
cd crates/wbw_qgis/manual
mdbook build
Generated output will be written to:
- crates/wbw_qgis/manual/book
Live Preview
For a local preview server:
cd crates/wbw_qgis/manual
mdbook serve --open
This starts a local server and opens the manual in your browser.
Writing Conventions
- Keep examples task-oriented and reproducible.
- Prefer short, complete QGIS workflows over abstract API descriptions.
- Document expected outputs and validation checks where possible.
Quick Start
This walkthrough verifies that WbW-QGIS is running and can execute tools.
1. Enable Plugin
- Start QGIS.
- Open Plugin Manager.
- Enable Whitebox Workflows.
2. Confirm Provider Availability
- Open the Processing Toolbox.
- Confirm Whitebox provider entries appear.
- If tools are missing, trigger a discovery refresh from the plugin panel.
3. Run a First Tool
A common smoke test is a simple raster analysis tool with a small input file.
Recommended pattern:
- Choose a small test raster.
- Run a lightweight tool from the Whitebox provider.
- Write output to a temporary file.
- Load and inspect the output layer in QGIS.
4. Validate Results
- Confirm the output file exists.
- Confirm layer metadata and CRS are as expected.
- Confirm visual result is plausible for the input.
If any step fails, continue to the Troubleshooting chapter.
Runtime and Discovery
WbW-QGIS discovers tool availability at runtime using the active whitebox_workflows environment.
Discovery Flow
At a high level:
- Import whitebox_workflows in the QGIS Python environment.
- Create a runtime session.
- Read runtime capability metadata.
- Read tool catalog metadata.
- Partition available vs locked tools.
- Refresh Processing provider algorithms.
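The "partition available vs locked tools" step can be pictured with a small example. This is only a sketch: the catalog entry shape (dictionaries with id and tier), the tier ordering, and the tool ID example_pro_tool are assumptions made for illustration, not the backend's actual metadata format.

```python
TIER_RANK = {"open": 0, "pro": 1, "enterprise": 2}  # assumed ordering


def partition_tools(catalog, effective_tier):
    """Split catalog entries into (available, locked) tool IDs for a tier."""
    limit = TIER_RANK[effective_tier]
    available, locked = [], []
    for tool in catalog:
        # Entries without a tier are treated as open in this sketch.
        rank = TIER_RANK[tool.get("tier", "open")]
        (available if rank <= limit else locked).append(tool["id"])
    return available, locked


catalog = [
    {"id": "slope", "tier": "open"},
    {"id": "hillshade", "tier": "open"},
    {"id": "example_pro_tool", "tier": "pro"},  # hypothetical tool ID
]
available, locked = partition_tools(catalog, "open")
print(available)  # ['slope', 'hillshade']
print(locked)     # ['example_pro_tool']
```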
When to Refresh
Refresh discovery when:
- plugin settings change,
- runtime tier/entitlement changes,
- whitebox_workflows is rebuilt or reinstalled,
- tool taxonomy updates are introduced.
Common Discovery Symptoms
- Provider missing entirely: plugin import/runtime bootstrap failure.
- Provider appears but tools are absent: catalog read failure or stale cache.
- Tools show as locked unexpectedly: runtime capability/tier mismatch.
In all cases, validate environment alignment first.
Tool Execution in QGIS
WbW-QGIS executes tools through the QGIS Processing framework.
Typical Execution Path
- Select a Whitebox algorithm in Processing Toolbox.
- Fill parameters in the algorithm dialog.
- Execute and monitor progress/messages.
- Load or inspect output artifacts.
Recommended Execution Practices
- Use explicit output paths for reproducibility.
- Start with small representative datasets before full runs.
- Validate intermediate outputs for CRS, schema, and metadata.
- Keep task logs for long workflows and batch operations.
Output Handling
Whitebox tools may produce:
- raster outputs,
- vector outputs,
- lidar outputs,
- text/report sidecar artifacts.
Confirm output type and format before chaining into downstream steps.
Progress and Messaging
Execution status and warnings should be treated as part of result validation. If a tool completes with warnings, inspect outputs before continuing.
Recipes
Recipes in WbW-QGIS are guided workflow entries that help users launch common multi-tool patterns faster.
A recipe is not a new backend algorithm. It is a curated sequence of existing tools with summary guidance, launch defaults, and tier-aware visibility.
What Recipes Provide
Recipes provide:
- A short purpose statement.
- A launch tool (the first tool dialog opened when you run the recipe).
- A step list (ordered tool IDs for the workflow).
- Optional input and output hints.
- Tier gating (Open, Pro, or Enterprise).
Where Recipes Appear in QGIS
Recipes are available in the Whitebox Workflows dock panel under Workflow Recipes.
The panel includes:
- Open Recipe
- Copy Recipe Steps
- Why Is This Locked?
- Open Recipe File
- Reload Recipe File
- Validate Recipe File
- Include locked recipes toggle
Built-in and User Recipes
WbW-QGIS merges two sources:
- Built-in recipes shipped with the plugin.
- User-defined recipes loaded from a local JSON file.
If a user recipe has the same id as a built-in recipe, the user recipe overrides the built-in entry.
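The override rule can be sketched as a dictionary merge keyed on id. The function name merge_recipes is illustrative; the plugin's own merge code is not shown in this manual and may differ in detail.

```python
def merge_recipes(builtin, user):
    """Merge two recipe lists; a user recipe replaces a built-in with the same id."""
    merged = {recipe["id"]: recipe for recipe in builtin}
    # Later entries win, so user recipes override built-ins on id collisions.
    merged.update({recipe["id"]: recipe for recipe in user})
    return list(merged.values())
```

For example, a user recipe with id "terrain_basics" would replace a built-in recipe carrying the same id, while all other built-in recipes stay in place.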
User Recipe File Location
Default file path:
- ~/.whitebox_workflows_qgis/recipes.json
Override path with environment variable:
- WBW_QGIS_USER_RECIPES
When you press Open Recipe File, the plugin creates the file from a template if it does not exist.
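To see which recipe file applies in a given environment, the lookup can be reproduced in a few lines. This sketch assumes an empty environment variable is treated the same as an unset one; the helper name user_recipes_path is illustrative.

```python
import os
from pathlib import Path

DEFAULT_RECIPES = Path("~/.whitebox_workflows_qgis/recipes.json")


def user_recipes_path(environ=None) -> Path:
    """Return the recipe file path, honouring WBW_QGIS_USER_RECIPES if set."""
    env = os.environ if environ is None else environ
    override = env.get("WBW_QGIS_USER_RECIPES")
    # Assumption: an empty override falls back to the default path.
    chosen = Path(override) if override else DEFAULT_RECIPES
    return chosen.expanduser()


print(user_recipes_path())
```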
User Recipe File Format
The file may be either:
- An object with a recipes array.
- A direct array of recipes.
Each recipe should include:
- id (required)
- tools array (required)
Optional fields:
- title
- summary
- tier (open, pro, enterprise)
- launch_tool
- input_hint
- output_hint
Example:
{
  "recipes": [
    {
      "id": "my_custom_terrain_recipe",
      "title": "My Custom Terrain Recipe",
      "summary": "User-defined recipe example.",
      "tier": "open",
      "launch_tool": "slope",
      "tools": ["slope", "aspect", "hillshade"],
      "input_hint": "Set a DEM raster as the primary input.",
      "output_hint": "Write outputs to a dedicated project output folder."
    }
  ]
}
Validation and Error Reporting
Use Validate Recipe File in the panel to run validation and see a full report.
Validation checks include:
- Entry structure is a JSON object.
- id exists and is non-empty.
- tools exists and is a non-empty array.
- tier value is valid when supplied.
Invalid entries are skipped, while valid entries continue to load.
Warnings include recipe index and, when available, recipe id to speed up fixes.
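The checks listed above can be approximated offline, for example in a pre-commit hook for a shared recipe file. The following is a sketch of those rules under the documented file format, not the plugin's validator; the function name and warning wording are illustrative.

```python
import json

VALID_TIERS = ("open", "pro", "enterprise")


def validate_recipes(text):
    """Return (valid_recipes, warnings) for the JSON text of a recipe file."""
    data = json.loads(text)
    # Accept either {"recipes": [...]} or a bare array of recipes.
    entries = data.get("recipes", []) if isinstance(data, dict) else data
    if not isinstance(entries, list):
        return [], ["top level must be an object with 'recipes' or an array"]
    valid, warnings = [], []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            warnings.append(f"recipe {index}: entry is not a JSON object")
            continue
        recipe_id = entry.get("id")
        label = f"recipe {index}" + (f" ({recipe_id})" if recipe_id else "")
        if not isinstance(recipe_id, str) or not recipe_id.strip():
            warnings.append(f"{label}: 'id' is missing or empty")
        elif not isinstance(entry.get("tools"), list) or not entry["tools"]:
            warnings.append(f"{label}: 'tools' must be a non-empty array")
        elif "tier" in entry and entry["tier"] not in VALID_TIERS:
            warnings.append(f"{label}: invalid tier {entry['tier']!r}")
        else:
            valid.append(entry)
    return valid, warnings
```

As in the plugin, invalid entries are skipped and reported while the remaining entries are returned as valid.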
Recipe Visibility and Tier Behavior
Recipes are filtered by:
- Runtime tier entitlement.
- Tool availability in the current runtime catalog.
- Include locked recipes panel setting.
When Include locked recipes is enabled, recipes that are not runnable in the current runtime remain visible with lock messaging for discovery.
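The three filters combine roughly as follows. This is a sketch under assumptions: the tier ordering, the treatment of a missing tier as open, and the function name visible_recipes are all illustrative rather than taken from the plugin source.

```python
TIER_RANK = {"open": 0, "pro": 1, "enterprise": 2}  # assumed ordering


def visible_recipes(recipes, effective_tier, available_tools, include_locked=False):
    """Return (recipe_id, runnable) pairs for the recipes a panel would list."""
    rows = []
    for recipe in recipes:
        entitled = TIER_RANK[recipe.get("tier", "open")] <= TIER_RANK[effective_tier]
        tools_present = all(tool in available_tools for tool in recipe["tools"])
        runnable = entitled and tools_present
        # Locked recipes are listed only when the toggle is enabled.
        if runnable or include_locked:
            rows.append((recipe["id"], runnable))
    return rows
```

With include_locked set to False, only runnable recipes are returned; with it set to True, locked recipes appear with runnable equal to False.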
Discovery and Sorting
Recipes are shown alphabetically in the panel for easier scanning.
Sorting applies to both built-in and user-defined recipes.
Recommended Team Practice
For teams, keep a shared recipe JSON under version control and point WBW_QGIS_USER_RECIPES to that path in your local environment setup.
This gives you repeatable, reviewable workflow definitions without modifying plugin source files.
Licensing and Tiers
Whitebox NG is available in two licensing tiers with different capabilities and licensing models.
License Tiers
- Open Tier (free): Governed by MIT/Apache 2.0 dual licensing. All Open-tier tools are free and open-source with no entitlement or activation required. Use this tier for learning, research, and open development.
- Pro Tier (commercial): Proprietary software governed by EULA. Pro-tier tools provide advanced capabilities and require activation with a valid license key. Once activated, the license persists locally so you do not need to re-authenticate on each QGIS session.
How QGIS Reflects Licensing
The Whitebox Workflows QGIS plugin is a frontend layer. Licensing authority and rules are enforced in the backend runtime.
Core Principle
The plugin reflects backend capabilities; it does not define licensing rules. Runtime mode (open vs. pro) determines which tools are available and functional.
Practical Behavior
- Open-tier tools are expected to run in all standard public QGIS environments.
- Pro-tier tools may be visible in the plugin but locked without an active Pro license.
- You can request a specific tier, but the effective tier depends on your entitlement state.
Why This Matters
- One plugin surface adapts to both open and pro capability tiers.
- Tool discovery remains consistent across Python, R, and QGIS frontends.
- Licensing decisions and enforcement remain centralized in backend logic.
Interactive License Management
The plugin provides convenient menu actions for license management without requiring external tools or command lines.
Activating a Pro License
- In QGIS, navigate to Plugins > Whitebox Workflows > Activate License.
- Enter your license information when prompted:
- License key (required)
- First name (required)
- Last name (required)
- Email (required)
- Provider URL (optional; defaults to production)
- Accept the EULA terms.
- Click OK. The plugin will activate and persist your license locally.
- The tool catalog automatically refreshes to show Pro-tier tools.
Important: License activation is tied to your machine. See Transferring a License to move to another machine.
Checking License Status
Navigate to Plugins > Whitebox Workflows > Plugin Settings (or look for diagnostics output) to see:
- License validity (active or expired)
- Effective tier (open or pro)
- License expiration time
Transferring a License
If you need to use your Pro license on a different machine, you must first deactivate it on the current machine and then activate on the destination.
- On the current machine, navigate to Plugins > Whitebox Workflows > Transfer License. This generates a portable activation payload and clears your local license state.
- Share the activation payload with the destination machine (or keep it for your own use on the other machine).
- On the destination machine, navigate to Plugins > Whitebox Workflows > Activate License and enter your license key using the same process as above. The destination will obtain its own local license state.
Deactivating a License
If you no longer plan to use Pro tools on this machine, navigate to Plugins > Whitebox Workflows > Deactivate License. This clears your local license state. Future sessions will fall back to Open-tier tools only.
Local License State
Once activated, your license information is stored locally at
~/.whitebox/wbw_ng_license_state.json (or override via the
WBW_LICENSE_STATE_PATH environment variable). On each QGIS startup:
- If valid local state exists, it is automatically loaded.
- If local state is expired or missing, the plugin falls back to Open-tier mode.
- You do not need to re-authenticate in QGIS on every session.
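When diagnosing an unexpected fallback to Open-tier mode, it helps to confirm which state file is being consulted. The sketch below only resolves the path and reports whether a file exists there. It does not read or interpret the file, since the file's schema is not documented in this manual, and the helper name license_state_path is illustrative.

```python
import os
from pathlib import Path

DEFAULT_STATE = Path("~/.whitebox/wbw_ng_license_state.json")


def license_state_path(environ=None) -> Path:
    """Return the license state path, honouring WBW_LICENSE_STATE_PATH if set."""
    env = os.environ if environ is None else environ
    override = env.get("WBW_LICENSE_STATE_PATH")
    # Assumption: an empty override falls back to the default path.
    chosen = Path(override) if override else DEFAULT_STATE
    return chosen.expanduser()


path = license_state_path()
print(path, "exists" if path.is_file() else "is missing (expect Open-tier mode)")
```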
Expected Local-Dev Outcome
For most source-based setups, assume open-tier behavior unless your runtime environment is explicitly configured for Pro-enabled integration testing or you have an active Pro license activated on your machine.
Supported Data Formats
This chapter documents format support exposed through WbW-QGIS.
Authoritative backend support comes from Whitebox core crates:
- Raster I/O: wbraster
- Vector I/O: wbvector
- LiDAR I/O: wblidar
The format tables below are aligned with those backend crates' README "Supported Formats" sections.
Raster Formats
Raster support in Whitebox is provided by wbraster.
| Format | Extension(s) | Read | Write | Notes |
|---|---|---|---|---|
| DTED | .dt0, .dt1, .dt2 | Yes | Yes | DTED 0/1/2 elevation; WGS-84 geographic only |
| ENVI HDR Labelled | .hdr + sidecar data | Yes | Yes | Multi-band (BSQ / BIL / BIP) |
| ER Mapper | .ers + data | Yes | Yes | Hierarchical header |
| ERDAS IMAGINE (HFA) | .img | Yes | No | Read-only MVP; RLC compression supported |
| Esri ASCII Grid | .asc, .grd | Yes | Yes | Handles xllcorner and xllcenter |
| Esri Binary Grid | workspace dir / .adf | Yes | Yes | Single-band float32, big-endian |
| Esri Float Grid | .flt, .hdr | Yes | Yes | Single-band float grid with header |
| JPEG + World File | .jpg, .jpeg + .jgw/.wld | Yes | Yes | Non-rotated georeferencing |
| PNG + World File | .png + .pgw/.wld | Yes | Yes | Non-rotated georeferencing |
| GeoTIFF / BigTIFF / COG | .tif, .tiff | Yes | Yes | Stripped/tiled GeoTIFF, BigTIFF, COG |
| GeoPackage Raster (Phase 4) | .gpkg | Yes | Yes | Multi-band tiled raster |
| GRASS ASCII Raster | .asc, .txt | Yes | Yes | north/south/east/west, rows/cols headers |
| Idrisi/TerrSet Raster | .rdc, .rst | Yes | Yes | byte, integer, real, RGB24 |
| JPEG2000 / GeoJP2 | .jp2 | Yes | Yes | Pure-Rust reader and writer |
| PCRaster | .map | Yes | Yes | Value-scale aware writer |
| SAGA GIS Binary | .sgrd, .sdat | Yes | Yes | SAGA data types supported |
| Surfer GRD | .grd | Yes | Yes | DSAA and DSRB |
| Zarr v2/v3 | .zarr | Yes | Yes | 2D and 3D (band,y,x) chunked arrays |
| XYZ ASCII Grid | .xyz | Yes | Yes | Whitespace or comma-delimited X Y Z points |
Notes:
- Whitebox avoids runtime dependence on GDAL.
- In QGIS workflows, GeoTIFF remains the safest default interchange raster.
Vector Formats
Vector support in Whitebox is provided by wbvector.
| Format | Read | Write | Notes |
|---|---|---|---|
| FlatGeobuf (.fgb) | Yes | Yes | High-performance binary interchange |
| GeoJSON (.geojson) | Yes | Yes | Web-friendly text format |
| TopoJSON (.topojson) | Yes | Yes | Topology-preserving JSON format |
| GeoPackage (.gpkg) | Yes | Yes | SQLite container; multi-layer workflows |
| GML (.gml) | Yes | Yes | Standards-based XML exchange |
| GPX (.gpx) | Yes | Yes | GPS tracks/routes/waypoints |
| KML (.kml) | Yes | Yes | Google Earth-style visualization |
| MapInfo Interchange (.mif + .mid) | Yes | Yes | Legacy MapInfo interoperability |
| ESRI Shapefile (.shp + sidecars) | Yes | Yes | Broad legacy compatibility |
| GeoParquet (.parquet) | Yes | Yes | Optional geoparquet feature |
| KMZ (.kmz) | Yes | Yes | Optional kmz feature |
| OSM PBF (.osm.pbf) | Yes | No | Read-only; optional osmpbf feature |
Feature-gated formats in wbvector:
- geoparquet for GeoParquet support
- kmz for KMZ support
- osmpbf for OSM PBF read support
In QGIS workflows, GeoPackage and FlatGeobuf are good modern interchange choices; Shapefile remains a compatibility fallback.
LiDAR / Point Cloud Formats
LiDAR support in Whitebox is provided by wblidar.
| Format | Read | Write | Notes |
|---|---|---|---|
| LAS | Yes | Yes | LAS 1.1-1.5, PDRF 0-15 |
| LAZ | Yes | Yes | Standards-compliant LASzip v2/v3 Point10/Point14 codecs |
| COPC | Yes | Yes | COPC 1.0 hierarchy with Point14-family payloads |
| PLY | Yes | Yes | ASCII, binary little-endian, binary big-endian |
| E57 | Yes | Yes | ASTM E2807 with CRC-32 page validation |
Optional features in wblidar:
- copc-http for HTTP range fetching of remote COPC
- copc-parallel for parallel COPC writing paths
- laz-parallel for optional parallel LAZ decode paths
- parallel umbrella feature (enables both parallel paths)
In QGIS workflows, .copc.laz is a strong default for large point-cloud
delivery and archive.
QGIS Practical Defaults
For most WbW-QGIS production workflows:
- Raster default: GeoTIFF (.tif)
- Vector default: GeoPackage (.gpkg) or FlatGeobuf (.fgb)
- LiDAR default: COPC LAZ (.copc.laz) or LAZ (.laz)
These defaults balance compatibility, file size, and performance.
Important Distinction
Backend format support means the Whitebox runtime can read/write those formats. Specific QGIS tool dialogs may still constrain certain outputs or defaults depending on parameter wiring and the tool category.
When in doubt:
- Use the tool's default output extension in QGIS.
- Re-open output and validate metadata.
- Use QGIS conversion tools only when you need a different interchange format.
Common Format Problems
| Problem | Likely cause | Fix |
|---|---|---|
| Output opens but schema is unexpected | Format-specific field/type constraints | Use GeoPackage or FlatGeobuf for richer schema |
| Shapefile field names truncated | 10-character DBF limit | Switch output to GeoPackage |
| Large cloud is slow to browse | Non-indexed point-cloud format | Use COPC LAZ for tiled access |
| Optional format not available | Feature not enabled in build | Use a non-optional format (for example GeoPackage/GeoJSON/Shapefile) |
| CRS appears missing in output | Sidecar or metadata issue | Confirm CRS in layer properties and re-export if needed |
Reprojection and CRS
A Coordinate Reference System (CRS) defines how coordinates in a dataset map to real-world locations. CRS mismatches are one of the most common sources of silent errors in GIS workflows: two layers may display correctly on screen (because QGIS reprojects them on-the-fly for display) while producing wrong results when passed to an analysis tool that expects matching CRS inputs.
This chapter explains how to identify, verify, and correct CRS issues in WbW-QGIS workflows.
Key Concepts
- Geographic CRS (GCS): Coordinates in angular units (degrees of latitude and longitude). Common examples: WGS84 (EPSG:4326), NAD83 (EPSG:4269). Not suitable as a working CRS for distance/area calculations.
- Projected CRS (PCS): Coordinates in linear units (metres or feet) on a flat map projection. Examples: UTM zones, Lambert Conformal Conic, Albers Equal Area.
- EPSG code: A numeric registry identifier for a CRS. EPSG:4326 = WGS84; EPSG:32617 = UTM Zone 17N (WGS84); EPSG:3978 = Canada Atlas Lambert.
- On-the-fly reprojection: QGIS displays all layers in the project CRS regardless of their native CRS. This is for display only — it does not change the file on disk.
- Reproject (warp): Permanently transform raster or vector data to a new CRS, writing a new file. Required before passing data to analysis tools.
- Z factor: A unit-conversion factor applied when DEM horizontal units (metres) differ from vertical units (feet), or vice versa.
Choosing a Working CRS
| Scenario | Recommended CRS type |
|---|---|
| Global or continental analysis | Geographic (WGS84 / EPSG:4326) for data exchange; Equal-Area projection for area measurements |
| Regional / national analysis | National projected CRS (e.g. Canada Atlas Lambert / EPSG:3978) |
| Local analysis (< 500 km extent) | UTM zone covering the study area |
| Terrain analysis, hydrology, LiDAR | Projected CRS in metres (UTM recommended) |
| Slope / distance calculations | Always use a projected CRS |
Finding your UTM zone: The UTM zone number equals ⌊(longitude + 180) / 6⌋ + 1. For Ottawa, Canada (longitude ≈ –75.7°): zone 18, northern hemisphere → EPSG:32618.
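The zone formula translates directly into code. The sketch below also derives the WGS84 UTM EPSG code, using the standard 326xx (northern hemisphere) and 327xx (southern hemisphere) numbering. It ignores the special-case zones around Norway and Svalbard, and it expects longitudes in the range -180 to just under 180.

```python
import math


def utm_zone(longitude: float) -> int:
    """UTM zone number for a longitude in degrees (-180 <= longitude < 180)."""
    return math.floor((longitude + 180.0) / 6.0) + 1


def utm_epsg_wgs84(longitude: float, latitude: float) -> int:
    """EPSG code of the WGS84 UTM zone: 326xx (north) or 327xx (south)."""
    base = 32600 if latitude >= 0 else 32700
    return base + utm_zone(longitude)


# Ottawa, Canada: longitude about -75.7, latitude about 45.4
print(utm_zone(-75.7))              # 18
print(utm_epsg_wgs84(-75.7, 45.4))  # 32618
```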
Checking the CRS of a Layer
- Right-click a layer in the Layers panel → Properties.
- Select the Information tab.
- Read the CRS field. Confirm:
  - Authority and code (e.g. EPSG:32618)
  - Unit (metres vs degrees)
  - Datum (WGS84, NAD83, etc.)
Or in the Python Console:
from qgis.core import QgsProject
layer = QgsProject.instance().mapLayersByName('dem')[0]
crs = layer.crs()
print(crs.authid()) # e.g. "EPSG:32618"
print(crs.mapUnits()) # 0 = metres, 6 = degrees
print(crs.isGeographic()) # True if GCS
Setting the Project CRS
The project CRS controls the display and the default output CRS for tools that do not inherit CRS from their inputs.
Project → Properties → CRS tab → search by EPSG code or name → click OK.
Or use View → Panels → CRS Status at the bottom-right of the QGIS window to set the project CRS from any loaded layer.
Reprojecting a Raster
Use Reproject Raster to permanently transform a raster to a new CRS. This is required before any terrain analysis on a DEM stored in geographic (degree) coordinates.
Processing Toolbox → Whitebox Workflows → Raster → Reproject Raster
| Parameter | Recommended value |
|---|---|
| Input layer | dem_wgs84.tif |
| Target EPSG code | 32618 |
| Resampling method | bilinear (elevation surfaces) |
| Output | dem_utm18n.tif |
The Resampling method parameter accepts any of the methods supported by WbW's raster engine:
| Method | Best for |
|---|---|
| nearest | Categorical / integer rasters (classification maps, stream grids) |
| bilinear | Continuous surfaces (DEMs, slope, TWI, reflectance) |
| cubic | High-quality continuous-surface resampling |
| lanczos | High-quality sinc-window resampling |
| average | 3×3 mean statistic |
| min / max | 3×3 extremum statistics |
| mode | 3×3 majority-class (smoothed categorical) |
| median | 3×3 median statistic |
| stddev | 3×3 standard deviation |
import processing
processing.run('whitebox_workflows:reproject_raster', {
'input': '/data/dem_wgs84.tif',
'epsg': 32618,
'resample': 'bilinear',
'output': '/data/dem_utm18n.tif',
})
After running, load the output and confirm in Layer Properties → Information:
CRS shows EPSG:32618 and the extent is in metres.
Reprojecting a Vector Layer
Use Reproject Vector to transform a vector dataset to a new CRS.
Processing Toolbox → Whitebox Workflows → Vector → Reproject Vector
| Parameter | Recommended value |
|---|---|
| Input layer | roads_wgs84.shp |
| Target EPSG code | 32618 |
| Output | roads_utm18n.shp |
import processing
processing.run('whitebox_workflows:reproject_vector', {
'input': '/data/roads_wgs84.shp',
'epsg': 32618,
'output': '/data/roads_utm18n.shp',
})
Reprojecting a LiDAR Dataset
Use Reproject LiDAR to transform a point cloud to a new CRS.
Processing Toolbox → Whitebox Workflows → LiDAR → Reproject LiDAR
| Parameter | Recommended value |
|---|---|
| Input LiDAR file | cloud_wgs84.laz |
| Target EPSG code | 32618 |
| Output | cloud_utm18n.laz |
import processing
processing.run('whitebox_workflows:reproject_lidar', {
'input': '/data/cloud_wgs84.laz',
'epsg': 32618,
'output': '/data/cloud_utm18n.laz',
})
Epoch-Aware Reprojection (Dynamic Datums)
For dynamic datums and realization-to-realization transformations, the reprojection tools support optional epoch-routing controls:
- coordinate_epoch
- source_reference_epoch
- target_reference_epoch
- operation_code
- prefer_official_operation
- epoch_policy (strict or allow_static_fallback)
These parameters are optional. Use them when you need deterministic routing for time-dependent CRS transformations.
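Because every epoch control is optional, scripts can build the parameter dictionary so that only the values you set are passed on. The helper below is a sketch; epoch_params is an illustrative name, the file paths are placeholders, and the only validation it performs is the epoch_policy check against the two values named above.

```python
EPOCH_POLICIES = ("strict", "allow_static_fallback")


def epoch_params(coordinate_epoch=None, source_reference_epoch=None,
                 target_reference_epoch=None, operation_code=None,
                 prefer_official_operation=None, epoch_policy=None):
    """Return a dict holding only the epoch controls that were supplied."""
    if epoch_policy is not None and epoch_policy not in EPOCH_POLICIES:
        raise ValueError(f"epoch_policy must be one of {EPOCH_POLICIES}")
    candidates = {
        "coordinate_epoch": coordinate_epoch,
        "source_reference_epoch": source_reference_epoch,
        "target_reference_epoch": target_reference_epoch,
        "operation_code": operation_code,
        "prefer_official_operation": prefer_official_operation,
        "epoch_policy": epoch_policy,
    }
    # Drop unset controls so the tool falls back to its own defaults.
    return {key: value for key, value in candidates.items() if value is not None}


params = {
    "input": "/data/dem_csrs_v3.tif",   # placeholder path
    "epsg": 22818,
    "output": "/data/dem_csrs_v8.tif",  # placeholder path
    **epoch_params(coordinate_epoch=2020.0, epoch_policy="strict"),
}
print(sorted(params))
```

The resulting dictionary can then be passed as the second argument to processing.run inside QGIS.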
CSRS operational status (current)
For NAD83(CSRS) realization-routing in current WbW builds:
- Active preferred-operation corridors (zone-matched UTM, zones 7-24):
  - all matched-zone CSRS realization pairs v2..v8 -> v2..v8 (excluding same-realization no-op pairs), operation 10715
For CRS pairs without a registered preferred operation mapping, standard reprojection remains available via the baseline transform path.
There is no reverse-corridor pending gate in this policy.
Expected active examples in current builds:
| Source | Target | Status | Operation | Zones |
|---|---|---|---|---|
| v3 | v8 | active | 10715 | 7-24 |
| v4 | v6 | active | 10715 | 7-24 |
| v5 | v8 | active | 10715 | 7-24 |
| v8 | v5 | active | 10715 | 7-24 |
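The policy described above is simple enough to mirror in a lookup function, which can be handy when planning batch jobs. This sketch encodes only what this section states (realizations v2 to v8, matched UTM zones 7 to 24, operation 10715). It is not the backend's routing code, and the authoritative answer for a given build comes from the diagnostics report.

```python
REALIZATIONS = tuple(f"v{n}" for n in range(2, 9))  # v2 .. v8
ZONE_MIN, ZONE_MAX = 7, 24
PREFERRED_OPERATION = 10715


def preferred_operation(source, target, zone):
    """Return the operation code for an active corridor, or None otherwise."""
    if source not in REALIZATIONS or target not in REALIZATIONS:
        return None
    # Same-realization pairs are no-ops and have no preferred operation.
    if source == target or not ZONE_MIN <= zone <= ZONE_MAX:
        return None
    return PREFERRED_OPERATION


print(preferred_operation("v3", "v8", 18))  # 10715
print(preferred_operation("v3", "v3", 18))  # None
```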
Inspect CSRS support in QGIS diagnostics
The plugin diagnostics report now includes a csrs_preferred_operation_support
summary with:
- zone range (zone_min, zone_max)
- active realization pairs and operation codes
- pending pair count (expected to be zero for non-identical realization pairs)
Open diagnostics from the Whitebox Workflows plugin menu, then inspect the
capabilities block in the report text or JSON section for
projection_csrs_preferred_operation_support.
Epoch-aware raster example
import processing
processing.run('whitebox_workflows:reproject_raster', {
'input': '/data/dem_csrs_v3.tif',
'epsg': 22818,
'resample': 'bilinear',
'coordinate_epoch': 2020.0,
'source_reference_epoch': 2010.0,
'target_reference_epoch': 2020.0,
'prefer_official_operation': True,
'epoch_policy': 'strict',
'output': '/data/dem_csrs_v8_epoch2020.tif',
})
Epoch-aware vector example
import processing
processing.run('whitebox_workflows:reproject_vector', {
'input': '/data/stations_csrs_v3.gpkg',
'epsg': 22818,
'coordinate_epoch': 2020.0,
'prefer_official_operation': True,
'epoch_policy': 'strict',
'output': '/data/stations_csrs_v8_epoch2020.gpkg',
})
Epoch-aware LiDAR example
import processing
processing.run('whitebox_workflows:reproject_lidar', {
'input': '/data/survey_csrs_v3.laz',
'epsg': 22818,
'coordinate_epoch': 2020.0,
'prefer_official_operation': True,
'epoch_policy': 'strict',
'output': '/data/survey_csrs_v8_epoch2020.laz',
})
Assigning a Missing CRS
If a file has correct coordinates but missing or wrong CRS metadata (shown as "Unknown CRS" in Layer Properties), use one of the three Assign Projection tools to write the correct EPSG code into the file without moving any coordinates.
Assign vs. Reproject: Assigning a CRS only updates the metadata label. Use it when coordinates are already in the target system but the file has no CRS tag. If the coordinate values themselves need to change, use the Reproject tools above instead.
Assign Projection to a Raster
Processing Toolbox → Whitebox Workflows → Raster → Assign Projection Raster
| Parameter | Value |
|---|---|
| Input layer | dem_no_crs.tif |
| EPSG code to assign | 32618 |
import processing
processing.run('whitebox_workflows:assign_projection_raster', {
'input': '/data/dem_no_crs.tif',
'epsg': 32618,
})
Assign Projection to a Vector
Processing Toolbox → Whitebox Workflows → Vector → Assign Projection Vector
| Parameter | Value |
|---|---|
| Input layer | roads_no_crs.shp |
| EPSG code to assign | 32618 |
import processing
processing.run('whitebox_workflows:assign_projection_vector', {
'input': '/data/roads_no_crs.shp',
'epsg': 32618,
})
Assign Projection to a LiDAR File
Processing Toolbox → Whitebox Workflows → LiDAR → Assign Projection LiDAR
| Parameter | Value |
|---|---|
| Input LiDAR file | cloud_no_crs.laz |
| EPSG code to assign | 32618 |
import processing
processing.run('whitebox_workflows:assign_projection_lidar', {
'input': '/data/cloud_no_crs.laz',
'epsg': 32618,
})
Georeferencing from Control Points
Use Georeference Raster From Control Points when the raster has no valid georeferencing and you have control points that relate pixel coordinates to map coordinates.
Processing Toolbox → Whitebox Workflows → Raster → Georeference Raster From Control Points
| Parameter | Recommended value |
|---|---|
| Input raster | historical_scan.tif |
| Control points CSV | historical_scan_gcps.csv |
| Destination EPSG code | 32618 |
| Resampling method | bilinear (continuous) or nearest (categorical) |
| Georeferenced raster output | historical_scan_georef.tif |
| Diagnostics report output | historical_scan_georef_report.json (optional) |
Control-points CSV fields must include source image coordinates and target map coordinates:
- source_col
- source_row
- target_x
- target_y
import processing
processing.run('whitebox_workflows:georeference_raster_from_control_points', {
'input': '/data/historical_scan.tif',
'control_points': '/data/historical_scan_gcps.csv',
'epsg': 32618,
'resample': 'bilinear',
'output': '/data/historical_scan_georef.tif',
'report': '/data/historical_scan_georef_report.json',
})
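Before running the tool, it can help to confirm that the control-points file carries the four required fields. The following is a minimal sketch using only the Python standard library. It assumes the CSV has a header row with exactly the field names listed above; the function name `check_gcp_columns` and the sample coordinates are illustrative and not part of WbW-QGIS.

```python
import csv
import io

REQUIRED_FIELDS = ("source_col", "source_row", "target_x", "target_y")

def check_gcp_columns(csv_text):
    """Return (row_count, missing_fields) for a control-points CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_FIELDS if name not in header]
    rows = list(reader)
    return len(rows), missing

# Hypothetical sample: three control points, for illustration only.
sample = (
    "source_col,source_row,target_x,target_y\n"
    "120,85,583210.0,4507890.0\n"
    "940,110,585660.0,4507815.0\n"
    "515,760,584385.0,4505865.0\n"
)

row_count, missing = check_gcp_columns(sample)
print(row_count, missing)
```

For a file on disk, read its text first (for example with `Path(...).read_text()`) and pass that to the function.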
The Z Factor for Terrain Tools
When a DEM's horizontal units differ from its vertical units, slope and curvature calculations are incorrect unless a Z factor is applied.
| Horizontal unit | Vertical unit | Z factor |
|---|---|---|
| Metres | Metres | 1.0 (no conversion needed) |
| Metres | Feet | 0.3048 |
| Feet | Feet | 1.0 |
| Degrees (geographic) | Metres | Do not use — reproject first |
All WbW terrain tools that accept a Z factor parameter apply the conversion as: slope = atan(rise × z_factor / run).
Best practice: Reproject the DEM to a projected CRS in metres before running any terrain analysis. Set Z factor to 1.0 after reprojection if vertical units are also metres.
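The effect of the Z factor can be checked by hand with the formula above. The sketch below is a plain-Python illustration, not the Whitebox implementation; the function name `slope_degrees` and the 10 ft over 10 m example are assumptions chosen for the demonstration.

```python
import math

def slope_degrees(rise, run, z_factor=1.0):
    """Slope in degrees from vertical rise and horizontal run, scaling rise by z_factor."""
    return math.degrees(math.atan(rise * z_factor / run))

# A 10 ft rise over a 10 m run.
uncorrected = slope_degrees(10.0, 10.0)          # treats the feet as if they were metres
corrected = slope_degrees(10.0, 10.0, 0.3048)    # converts feet to metres first
print(round(uncorrected, 2), round(corrected, 2))
```

Without the conversion the slope is reported as 45°, which is far steeper than the corrected value of roughly 17°.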
Batch Reprojection via Python Console
Reproject all GeoPackages in a folder to EPSG:32618 using the Whitebox vector reprojection tool:
import processing
from pathlib import Path
src_dir = Path('/data/raw_vectors')
out_dir = Path('/data/projected')
out_dir.mkdir(exist_ok=True)
# Reproject each GeoPackage, keeping the original file name in the output folder
for gpkg in src_dir.glob('*.gpkg'):
    out = out_dir / gpkg.name
    processing.run('whitebox_workflows:reproject_vector', {
        'input': str(gpkg),
        'epsg': 32618,
        'output': str(out),
    })
    print(f"Reprojected: {gpkg.name}")
print("Batch reprojection complete.")
Reproject a folder of LiDAR files in the same pattern:
import processing
from pathlib import Path
src_dir = Path('/data/raw_lidar')
out_dir = Path('/data/projected_lidar')
out_dir.mkdir(exist_ok=True)
# Reproject each LAZ file, keeping the original file name in the output folder
for las in src_dir.glob('*.laz'):
    out = out_dir / las.name
    processing.run('whitebox_workflows:reproject_lidar', {
        'input': str(las),
        'epsg': 32618,
        'output': str(out),
    })
    print(f"Reprojected: {las.name}")
print("Batch LiDAR reprojection complete.")
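For large folders it can be useful to separate planning from execution, so that an interrupted batch can be resumed without redoing finished files. The sketch below is one possible approach under that assumption; `plan_reprojection` and `to_params` are hypothetical helper names, and the parameter keys simply mirror the loops above.

```python
from pathlib import Path

def plan_reprojection(src_dir, out_dir, pattern):
    """List (input, output) path pairs for matching files, skipping outputs that already exist."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    jobs = []
    for src in sorted(src_dir.glob(pattern)):
        dst = out_dir / src.name
        if dst.exists():
            continue  # produced by an earlier run
        jobs.append((str(src), str(dst)))
    return jobs

def to_params(job, epsg=32618):
    """Build the parameter dictionary for one job, using the same keys as the loops above."""
    src, dst = job
    return {'input': src, 'epsg': epsg, 'output': dst}
```

Inside the QGIS Python Console, each planned job could then be passed to `processing.run('whitebox_workflows:reproject_vector', to_params(job))`.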
Common CRS Problems
| Problem | Likely cause | Fix |
|---|---|---|
| Layers display in wrong location | Layer has incorrect assigned CRS | Assign correct CRS (do not reproject) |
| Slope values implausibly steep (close to 90°) across the whole DEM | DEM in geographic CRS: cell size is in degrees (much smaller than 1) while elevation is in metres | Reproject DEM to a projected CRS in metres before running slope |
| Area calculations wildly wrong | Layer CRS is geographic (degrees) | Reproject to equal-area projected CRS |
| Watershed does not close properly | Raster and vector inputs in different CRS | Reproject all inputs to same CRS before processing |
| WbW tool silently returns NoData everywhere | CRS mismatch causes spatial extents not to overlap | Verify all inputs share the same CRS and extent |
| "Datum transform not found" warning in QGIS | Datum shift grid file not installed | Install proj-data package, or accept approximate transform |
Validation Checklist
- Project CRS is set to the intended working CRS before analysis.
- All raster inputs share the same CRS, extent, and cell size.
- All vector inputs share the same CRS as the raster grid.
- DEM CRS is projected (linear units — metres or feet), not geographic.
- Z factor is set to 1.0 when both horizontal and vertical units are metres.
- Reprojected outputs have been inspected (extent in metres, CRS code confirmed).
- No layers show "Unknown CRS" in the Layers panel.
Terrain Analysis and Geomorphometry
Terrain analysis — or geomorphometry — is the quantitative characterisation of land-surface form from digital elevation models (DEMs). It is one of the original strengths of the Whitebox platform and covers first-order derivatives (slope, aspect), curvature families, terrain position, roughness, multiscale analysis, and visibility.
This chapter walks through a complete primary-derivative workflow in the QGIS Processing Toolbox, followed by a Python console version for batch scripting.
Key Concepts
- DEM: Raster where each cell stores surface elevation. All terrain derivatives begin here. Common sources: LiDAR bare-earth, drone photogrammetry, SRTM, Copernicus DEM.
- Slope: Maximum rate of elevation change per unit distance (degrees or percent). Core input for erosion, landslide, and routing models.
- Aspect: Compass direction a slope faces (0–360°, clockwise from north). Flat cells are assigned –1. Controls solar insolation and moisture.
- Curvature: Rate of change of slope. Profile curvature describes flow acceleration/deceleration; plan curvature describes flow convergence/divergence.
- TPI / geomorphons: Terrain position indices and landform classification assign cells to ridge, slope, valley, etc., without manual thresholds.
- TWI: Topographic Wetness Index — ln(upslope area / tan(slope)) — predicts persistent soil moisture and runoff zones.
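The TWI formula in the list above can be evaluated for single cells to get a feel for its range. This is a minimal sketch in plain Python, not the Whitebox implementation; the function name and the two sample cells are illustrative assumptions, and the area term is taken to be the specific contributing area (upslope area per unit contour width).

```python
import math

def wetness_index(specific_area, slope_deg):
    """TWI = ln(a / tan(beta)) for one cell; slope_deg must be greater than zero."""
    beta = math.radians(slope_deg)
    return math.log(specific_area / math.tan(beta))

gentle = wetness_index(500.0, 2.0)   # large contributing area, gentle slope
steep = wetness_index(20.0, 30.0)    # small contributing area, steep slope
print(round(gentle, 2), round(steep, 2))
```

The gentle, high-accumulation cell scores well above the steep one. A slope of exactly zero makes the expression undefined, so flat cells need special handling in any real implementation.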
End-to-End Workflow: Primary Terrain Derivatives
This workflow takes a raw DEM through sink filling, then derives the most commonly used terrain surfaces.
Inputs
| Layer | Format | Notes |
|---|---|---|
| dem.tif | GeoTIFF raster | Projected CRS (e.g. UTM) strongly recommended |
Step 1 — Fill Depressions
Sinks (isolated low cells) in a DEM cause flow-routing artifacts in all downstream terrain derivatives. Fill them first.
Processing Toolbox → Whitebox Workflows → Spatial Hydrology → Fill Depressions
| Parameter | Recommended value |
|---|---|
| Input DEM | dem.tif |
| Fix flats | ✓ enabled |
| Flat increment | 0.001 (one thousandth of the DEM z unit) |
| Output | dem_filled.tif |
Why fix flats? Perfectly flat areas produce ambiguous flow directions. Adding a tiny gradient across flats ensures a routable surface.
Step 2 — Slope
Processing Toolbox → Whitebox Workflows → Terrain Analysis → Slope
| Parameter | Recommended value |
|---|---|
| Input DEM | dem_filled.tif |
| Output units | Degrees |
| Z conversion factor | 1.0 (set to 0.3048 if DEM z is in feet but CRS is metres) |
| Output | slope.tif |
Expected output range: 0° (flat) to ~85° (near-vertical cliff). Values above 70° often indicate interpolation artefacts — inspect those cells.
Step 3 — Aspect
Processing Toolbox → Whitebox Workflows → Terrain Analysis → Aspect
| Parameter | Recommended value |
|---|---|
| Input DEM | dem_filled.tif |
| Output | aspect.tif |
Flat cells receive –1. Apply a pseudocolor ramp (circular HSV) to aspect for intuitive visualisation of slope direction.
Step 4 — Hillshade
Processing Toolbox → Whitebox Workflows → Terrain Analysis → Hillshade
| Parameter | Recommended value |
|---|---|
| Input DEM | dem_filled.tif |
| Azimuth (°) | 315 (NW sun — standard cartographic convention) |
| Altitude (°) | 45 |
| Z factor | 1.0 |
| Output | hillshade.tif |
Set the hillshade layer to Multiply blend mode in QGIS and overlay it on a coloured DEM for a publication-quality relief map.
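To see how azimuth and altitude interact with slope and aspect, the sketch below evaluates the widely used Lambertian shading formula for a single cell. It is a textbook formulation offered for intuition only; whether the Whitebox Hillshade tool uses exactly this expression or scaling is not confirmed here, and the function name is an assumption.

```python
import math

def hillshade(slope_deg, aspect_deg, azimuth_deg=315.0, altitude_deg=45.0):
    """Illumination in [0, 1] for one cell, using the common Lambertian shading formula."""
    zenith = math.radians(90.0 - altitude_deg)
    slope = math.radians(slope_deg)
    # Angle between the sun direction and the direction the slope faces.
    relative = math.radians(azimuth_deg - aspect_deg)
    value = (math.cos(zenith) * math.cos(slope)
             + math.sin(zenith) * math.sin(slope) * math.cos(relative))
    return max(0.0, value)  # cells facing away from the sun are clamped to black

flat = hillshade(0.0, 0.0)
facing_sun = hillshade(45.0, 315.0)
facing_away = hillshade(45.0, 135.0)
print(round(flat, 3), round(facing_sun, 3), round(facing_away, 3))
```

With the default NW sun, a 45° slope facing north-west is fully lit, the same slope facing south-east is in shadow, and flat ground sits in between.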
Step 5 — Profile and Plan Curvature
Processing Toolbox → Whitebox Workflows → Terrain Analysis → Profile Curvature
| Parameter | Recommended value |
|---|---|
| Input DEM | dem_filled.tif |
| Output | profile_curv.tif |
Repeat for Plan Curvature → plan_curv.tif.
Style both outputs with a diverging colour ramp centred on 0. Negative profile curvature (blue) marks deceleration zones (valley bottoms); positive (red) marks acceleration zones (ridge crests). Plan curvature negatives mark convergent hollows; positives mark divergent noses.
Step 6 — Topographic Wetness Index
Processing Toolbox → Whitebox Workflows → Terrain Analysis → Wetness Index
| Parameter | Recommended value |
|---|---|
| Slope raster | slope.tif |
| Specific contributing area raster | (run D8 Flow Accumulation first — see Spatial Hydrology chapter) |
| Output | twi.tif |
High TWI values (> 8–10) indicate persistent moisture zones. Use as a predictor variable in soil, flood, and habitat models.
Python Console Equivalent
Paste the following into the QGIS Python Console (Plugins → Python Console) or save as a Processing script to batch multiple DEMs.
import processing
dem = '/data/dem.tif'
# Step 1: fill depressions
processing.run('whitebox_workflows:fill_depressions', {
'dem': dem,
'fix_flats': True,
'flat_increment': 0.001,
'output': '/data/dem_filled.tif',
})
# Step 2: slope
processing.run('whitebox_workflows:slope', {
'dem': '/data/dem_filled.tif',
'units': 'Degrees',
'zfactor': 1.0,
'output': '/data/slope.tif',
})
# Step 3: aspect
processing.run('whitebox_workflows:aspect', {
'dem': '/data/dem_filled.tif',
'output': '/data/aspect.tif',
})
# Step 4: hillshade
processing.run('whitebox_workflows:hillshade', {
'dem': '/data/dem_filled.tif',
'azimuth': 315.0,
'altitude': 45.0,
'zfactor': 1.0,
'output': '/data/hillshade.tif',
})
# Step 5: curvature
processing.run('whitebox_workflows:profile_curvature', {
'dem': '/data/dem_filled.tif',
'output': '/data/profile_curv.tif',
})
processing.run('whitebox_workflows:plan_curvature', {
'dem': '/data/dem_filled.tif',
'output': '/data/plan_curv.tif',
})
print("Terrain derivatives complete.")
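To batch several DEMs, the calls above can be generated from a single function instead of being copied per file. The sketch below only builds the algorithm IDs and parameter dictionaries (the same ones used in the script above) and does not call QGIS itself. The helper name `terrain_steps`, the output naming scheme, and the `/data/site_a.tif` path are illustrative assumptions.

```python
from pathlib import Path

def terrain_steps(dem_path, out_dir):
    """Return (algorithm_id, parameters) pairs mirroring Steps 1 to 5 for one DEM."""
    out = Path(out_dir)
    stem = Path(dem_path).stem

    def target(suffix):
        # Hypothetical naming scheme: <dem name>_<suffix>.tif inside out_dir.
        return str(out / f"{stem}_{suffix}.tif")

    filled = target("filled")
    return [
        ("whitebox_workflows:fill_depressions",
         {"dem": str(dem_path), "fix_flats": True, "flat_increment": 0.001, "output": filled}),
        ("whitebox_workflows:slope",
         {"dem": filled, "units": "Degrees", "zfactor": 1.0, "output": target("slope")}),
        ("whitebox_workflows:aspect",
         {"dem": filled, "output": target("aspect")}),
        ("whitebox_workflows:hillshade",
         {"dem": filled, "azimuth": 315.0, "altitude": 45.0, "zfactor": 1.0,
          "output": target("hillshade")}),
        ("whitebox_workflows:profile_curvature",
         {"dem": filled, "output": target("profile_curv")}),
        ("whitebox_workflows:plan_curvature",
         {"dem": filled, "output": target("plan_curv")}),
    ]

steps = terrain_steps("/data/site_a.tif", "/data/out")
for algorithm_id, params in steps:
    print(algorithm_id, params["output"])
```

In the QGIS Python Console, each pair would be executed with `processing.run(algorithm_id, params)`, looping over as many DEM paths as needed.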
Advanced: Geomorphons Landform Classification
Geomorphons classify each cell into one of ten landform elements (peak, ridge, shoulder, spur, slope, hollow, footslope, valley, pit, flat) by analysing horizon profiles in eight compass directions.
Processing Toolbox → Whitebox Workflows → Terrain Analysis → Geomorphons
| Parameter | Recommended value |
|---|---|
| Input DEM | dem_filled.tif |
| Search distance (cells) | 50 (adjust to DEM resolution and landscape scale) |
| Skip radius (cells) | 0 |
| Flatness threshold (°) | 1.0 |
| Output | geomorphons.tif |
The output is a categorical raster (1–10). Apply a predefined categorical colour map — the Geomorphons palette is available in many QGIS style repositories.
processing.run('whitebox_workflows:geomorphons', {
'dem': '/data/dem_filled.tif',
'search': 50,
'skip': 0,
'threshold': 1.0,
'output': '/data/geomorphons.tif',
})
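Because the search distance is given in cells, the same ground distance corresponds to different parameter values at different DEM resolutions. The sketch below shows the conversion; the helper name and the 500 m target distance are illustrative assumptions rather than recommended settings.

```python
def search_cells(distance_m, cell_size_m):
    """Convert a ground search distance to a whole number of cells (at least 1)."""
    return max(1, round(distance_m / cell_size_m))

print(search_cells(500.0, 10.0))  # 10 m DEM
print(search_cells(500.0, 2.0))   # 2 m DEM
```

A 500 m look-out distance is 50 cells on a 10 m DEM but 250 cells on a 2 m DEM, so reusing the value 50 on the finer grid would classify much smaller landforms.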
Pro Siting Sweep Diagnostics
For Pro terrain siting workflows (wind_turbine_siting and solar_site_suitability_analysis), providing a sweep specification executes a multi-run grid and emits extra diagnostics:
- run_matrix_summary (CSV)
- sensitivity_report (JSON)
- sensitivity_report_html (HTML)
- stability_map (GeoTIFF; 3 = high, 2 = medium, 1 = low)
Inside sensitivity_report, use these fields for quick robustness checks:
- metrics.primary_metric
- metrics.primary_relative_span
- metrics.stability_class (high, medium, low)
These outputs are intended for scenario comparison and shortlist stability review before field validation.
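A quick robustness check can be scripted by reading the three fields from the JSON report. The sketch below assumes the dotted names correspond to a top-level `metrics` object, which is inferred from the field names rather than from a published schema. The sample values, including the metric name `suitable_area_ha`, are invented for illustration.

```python
import json

# Hypothetical report content; real reports are written by the Pro siting tools.
sample_text = """
{"metrics": {"primary_metric": "suitable_area_ha",
             "primary_relative_span": 0.12,
             "stability_class": "high"}}
"""

# Raster codes of the stability_map output, as listed above.
STABILITY_CODES = {3: "high", 2: "medium", 1: "low"}

def summarise(report):
    """Pull the three quick-check fields out of a parsed sensitivity report."""
    metrics = report["metrics"]
    return (metrics["primary_metric"],
            metrics["primary_relative_span"],
            metrics["stability_class"])

summary = summarise(json.loads(sample_text))
print(summary)
```

For a report on disk, replace `json.loads(sample_text)` with `json.load(open(path))` using the path given for the diagnostics output.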
Common Pitfalls
| Problem | Likely cause | Fix |
|---|---|---|
| Slope values are unrealistically high (> 85°) | DEM has interpolation artefacts or NoData spikes | Run Remove Off-terrain Objects or inspect raw DEM |
| Flat areas produce zero slope everywhere | Depressions not filled before slope derivation | Run Fill Depressions first |
| Aspect shows –1 across large areas | Large flat regions in DEM | Expected for flat input; check DEM resolution |
| Curvature is noisy on fine-resolution DEMs | Sensor noise dominates at small spatial scales | Apply Gaussian Filter (σ ≈ 1–2 cells) before curvature |
| Units mismatch — Z factor warning | Horizontal CRS in metres but DEM z in feet | Set Z conversion factor to 0.3048 |
Validation Checklist
- DEM uses a projected CRS (not geographic degrees).
- No unexpected flat artefacts introduced by depression filling.
- Slope range plausible for local relief (inspect histogram).
- Aspect –1 cells are spatially limited to genuine flats.
- Curvature raster has a near-symmetric distribution centred on 0.
- Hillshade visually matches known ridgelines and valley geometry.
Terrain Derivatives
Accumulation Curvature
Function name: accumulation_curvature
Description
This tool calculates the accumulation curvature from a digital elevation model (DEM). Accumulation curvature is the product of profile (vertical) and tangential (horizontal) curvatures at a location (Shary, 1995). This variable has positive values, zero or greater. Florinsky (2017) states that accumulation curvature is a measure of the extent of local accumulation of flows at a given point in the topographic surface. Accumulation curvature is measured in units of m⁻².
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
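The transform is easy to reproduce outside the tool when comparing raw and log-transformed rasters. The sketch below applies the Shary et al. (2002) equation to single values. How the tool chooses n from the cell size is not stated here, so n = 5 is an arbitrary value used only for illustration.

```python
import math

def log_transform(theta, n):
    """Apply theta' = sign(theta) * ln(1 + 10**n * |theta|) to one value."""
    sign = (theta > 0) - (theta < 0)
    return sign * math.log(1.0 + 10.0 ** n * abs(theta))

n = 5  # illustrative only; the tool derives n from the grid cell size
for theta in (-1e-3, -1e-5, 0.0, 1e-5, 1e-3):
    print(theta, round(log_transform(theta, n), 4))
```

The transform keeps the sign of the input, maps zero to zero, and compresses several orders of magnitude into a range that is easier to style with a single colour ramp.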
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary PA (1995) Land surface in gravity points classification by a complete system of curvatures. Mathematical Geology 27: 373–390.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
tangential_curvature, profile_curvature, minimal_curvature, maximal_curvature, mean_curvature, gaussian_curvature
Python API
def accumulation_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Aspect
Function name: aspect
This tool calculates slope aspect (i.e. slope orientation in degrees clockwise from north) for each grid cell in an input digital elevation model (DEM). The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM, and the DEM is in a projected coordinate system. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor to perform the unit conversion.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Gallant, J. C., and J. P. Wilson, 2000, Primary topographic attributes, in Terrain Analysis: Principles and Applications, edited by J. P. Wilson and J. C. Gallant pp. 51-86, John Wiley, Hoboken, N.J.
See Also
slope, plan_curvature, profile_curvature
Python API
def aspect(self, dem: Raster, z_factor: float = 1.0) -> Raster:
Casorati Curvature
Function name: casorati_curvature
Experimental
Calculates Casorati curvature from a DEM.
geomorphometry terrain curvature casorati_curvature legacy-port
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input DEM raster path or typed raster object. | Required | dem.tif |
| z_factor | Optional z conversion factor (default 1.0). Alias: zfactor. | Optional | 1.0 |
| log_transform | Optional log-transform of output values (default false). Alias: log. | Optional | False |
| output | Optional output path. If omitted, result is stored in memory. | Optional | (none) |
Examples
Calculates casorati_curvature from a DEM.
wbe.casorati_curvature(input='dem.tif', log_transform=False, output='casorati_curvature.tif', z_factor=1.0)
Curvedness
Function name: curvedness
Description
This tool calculates the curvedness (Koenderink and van Doorn, 1992) from a digital elevation model (DEM). Curvedness is the root mean square of maximal and minimal curvatures, and measures the magnitude of surface bending, regardless of shape (Florinsky, 2017). Curvedness is characteristically low for flat areas and higher for areas of sharp bending (Florinsky, 2017). The index is also inversely proportional to the size of the object (Koenderink and van Doorn, 1992). Curvedness has values equal to or greater than zero and is measured in units of m⁻¹.
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Raw curvedness values are often challenging to visualize given their range and magnitude, and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
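The definition of curvedness as the root mean square of the two principal curvatures can be checked on a few hand-picked values. This is a minimal sketch of the definition only; it does not reproduce the tool's polynomial fitting, and the sample curvature values are illustrative.

```python
import math

def curvedness(k_max, k_min):
    """Root mean square of the maximal and minimal curvatures."""
    return math.sqrt((k_max ** 2 + k_min ** 2) / 2.0)

print(curvedness(0.0, 0.0))       # flat surface
print(curvedness(0.02, 0.02))     # dome-like: both curvatures positive
print(curvedness(0.02, -0.02))    # saddle-like: opposite signs
```

The dome-like and saddle-like cases give the same result, which illustrates why curvedness describes how strongly the surface bends but not the shape of the bend.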
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Koenderink, J. J., and Van Doorn, A. J. (1992). Surface shape and curvature scales. Image and vision computing, 10(8), 557-564.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
shape_index, minimal_curvature, maximal_curvature, tangential_curvature, profile_curvature, mean_curvature, gaussian_curvature
Python API
def curvedness(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Difference Curvature
Function name: difference_curvature
Description
This tool calculates the difference curvature from a digital elevation model (DEM). Difference curvature is half of the difference between profile and tangential curvatures, sometimes called the vertical and horizontal curvatures (Shary, 1995). This variable has an unbounded range that can take either positive or negative values. Florinsky (2017) states that difference curvature measures the extent to which the relative deceleration of flows (measured by kv) is higher than flow convergence at a given point of the topographic surface. Difference curvature is measured in units of m⁻¹.
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
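The relationship between difference curvature and its two inputs can be written out directly. The sketch below follows the wording above, taking half of profile (vertical, kv) minus tangential (horizontal, kh) curvature. The ordering of the subtraction is an assumption based on that wording, and the sample values are illustrative.

```python
def difference_curvature(kv, kh):
    """Half the difference between profile (kv) and tangential (kh) curvature."""
    return (kv - kh) / 2.0

def accumulation_curvature(kv, kh):
    """Product of profile (kv) and tangential (kh) curvature, for comparison."""
    return kv * kh

kv, kh = 0.03, 0.01
print(difference_curvature(kv, kh), accumulation_curvature(kv, kh))
```

Swapping the two inputs flips the sign of the difference curvature but leaves the accumulation curvature unchanged, which is one way to tell the two rasters apart when inspecting outputs.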
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary PA (1995) Land surface in gravity points classification by a complete system of curvatures. Mathematical Geology 27: 373–390.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
profile_curvature, tangential_curvature, rotor, minimal_curvature, maximal_curvature, mean_curvature, gaussian_curvature
Python API
def difference_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Gaussian Curvature
Function name: gaussian_curvature
This tool calculates the Gaussian curvature from a digital elevation model (DEM). Gaussian curvature is the product of maximal and minimal curvatures, and retains values in each point of the topographic surface after its bending without breaking, stretching, and compressing (Florinsky, 2017). Gaussian curvature is measured in units of m⁻².
The user must input a DEM (dem). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
tangential_curvature, profile_curvature, plan_curvature, mean_curvature, minimal_curvature, maximal_curvature
Python API
def gaussian_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Generating Function
Function name: generating_function
Description
This tool calculates the generating function (Shary and Stepanov, 1991) from a digital elevation model (DEM). Florinsky (2016) describes generating function as a measure for the deflection of tangential curvature from loci of extreme curvature of the topographic surface. Florinsky (2016) demonstrated the application of this variable for identifying landscape structural lines, i.e. ridges and thalwegs, for which the generating function takes values near zero. Ridges coincide with divergent areas where generating function is approximately zero, while thalwegs are associated with convergent areas with generating function values near zero. This variable has positive values, zero or greater and is measured in units of m⁻².
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Raw generating function values are often challenging to visualize given their range and magnitude, and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
This tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems, however, this tool cannot use the same 3x3 polynomial fitting method for equal angle grids, also described by Florinsky (2016), that is used by the other curvature tools in this software. That is because generating function uses 3rd order partial derivatives, which cannot be calculated using the 9 elevations in a 3x3; more elevation values are required (i.e. a 5x5 window). Thus, this tool uses the same 5x5 method used for DEMs in projected coordinate systems, and calculates the average linear distance between neighbouring cells in the vertical and horizontal directions using the Vincenty distance function. Note that this may cause a notable slow-down in algorithm performance and has a lower accuracy than would be achieved using an equal angle method, because it assumes a square pixel (in linear units).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Koenderink, J. J., and Van Doorn, A. J. (1992). Surface shape and curvature scales. Image and vision computing, 10(8), 557-564.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
Shary P. A. and Stepanov I. N. (1991) Application of the method of second derivatives in geology. Transactions (Doklady) of the USSR Academy of Sciences, Earth Science Sections 320: 87–92.
See Also
shape_index, minimal_curvature, maximal_curvature, tangential_curvature, profile_curvature, mean_curvature, gaussian_curvature
Python API
def generating_function(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Horizontal Excess Curvature
Function name: horizontal_excess_curvature
Description
This tool calculates the horizontal excess curvature from a digital elevation model (DEM). Horizontal excess curvature is the difference of tangential (horizontal) and minimal curvatures at a location (Shary, 1995). This variable has positive values, zero or greater. Florinsky (2017) states that horizontal excess curvature measures the extent to which the bending of a normal section tangential to a contour line is larger than the minimal bending at a given point of the surface. Horizontal excess curvature is measured in units of m⁻¹.
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary PA (1995) Land surface in gravity points classification by a complete system of curvatures. Mathematical Geology 27: 373–390.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
tangential_curvature, profile_curvature, minimal_curvature, maximal_curvature, mean_curvature, gaussian_curvature
Python API
def horizontal_excess_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Maximal Curvature
Function name: maximal_curvature
This tool calculates the maximal curvature from a digital elevation model (DEM). Maximal curvature is the curvature of a principal section with the highest value of curvature at a given point of the topographic surface (Florinsky, 2017). The values of this curvature are unbounded, and positive values correspond to ridge positions while negative values are indicative of closed depressions (Florinsky, 2016). Maximal curvature is measured in units of m⁻¹.
The user must input a DEM (dem). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
minimal_curvature, tangential_curvature, profile_curvature, plan_curvature, mean_curvature, gaussian_curvature
Python API
def maximal_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Mean Curvature
Function name: mean_curvature
This tool calculates the mean curvature, the average of the minimal and maximal (principal) curvatures at a point, from a digital elevation model (DEM). Curvature is the second derivative of the topographic surface defined by a DEM. For comparison, profile curvature characterizes the degree of downslope acceleration or deceleration within the landscape (Gallant and Wilson, 2000). The user must input a DEM (dem). WhiteboxTools reports curvature in radians multiplied by 100 for easier interpretation because curvature values are typically very small. The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. If the DEM is in the geographic coordinate system (latitude and longitude), the following equation is used:
zfactor = 1.0 / (111320.0 x cos(mid_lat))
where mid_lat is the latitude of the centre of the raster, in radians.
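The equation above can be sketched in a few lines of Python. This is an illustration only, not the tool's internal code; the function name geographic_z_factor is hypothetical, and it accepts the mid-latitude in degrees and converts it to radians before applying the formula.

```python
import math


def geographic_z_factor(mid_lat_degrees: float) -> float:
    """Z conversion factor for a DEM in geographic coordinates.

    Implements zfactor = 1.0 / (111320.0 x cos(mid_lat)), where mid_lat
    is the latitude of the raster centre converted to radians.
    """
    mid_lat = math.radians(mid_lat_degrees)
    return 1.0 / (111320.0 * math.cos(mid_lat))


# At the equator cos(mid_lat) is 1, so the factor is 1 / 111320.
equator_factor = geographic_z_factor(0.0)
# At 60 degrees latitude cos(mid_lat) is 0.5, so the factor doubles.
high_latitude_factor = geographic_z_factor(60.0)
print(equator_factor, high_latitude_factor)
```

The factor grows with latitude because a degree of longitude covers less ground distance away from the equator.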
The algorithm uses the same formula for the calculation of plan curvature as Gallant and Wilson (2000). Profile curvature is negative for slope increasing downhill (convex flow profile, typical of upper slopes) and positive for slope decreasing downhill (concave, typical of lower slopes).
Reference
Gallant, J. C., and J. P. Wilson, 2000, Primary topographic attributes, in Terrain Analysis: Principles and Applications, edited by J. P. Wilson and J. C. Gallant pp. 51-86, John Wiley, Hoboken, N.J.
See Also
profile_curvature, tangential_curvature, total_curvature, slope, aspect
Python API
def mean_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Minimal Curvature
Function name: minimal_curvature
This tool calculates the minimal curvature from a digital elevation model (DEM). Minimal curvature is the curvature of a principal section with the lowest value of curvature at a given point of the topographic surface (Florinsky, 2017). The values of this curvature are unbounded, and positive values correspond to hills while negative values are indicative of valley positions (Florinsky, 2016). Minimal curvature is measured in units of m⁻¹.
The user must input a DEM (dem). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
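The log transform of Shary et al. (2002) can be sketched as follows. This is a minimal illustration, not the tool's implementation: the function name is hypothetical, and because the manual does not state how n is derived from the grid cell size, n is passed in explicitly and the value 5 used below is an arbitrary example.

```python
import math


def log_transform_curvature(theta: float, n: int) -> float:
    """Sign-preserving log transform: sign(theta) * ln(1 + 10**n * |theta|).

    n is an assumption supplied by the caller; the tool derives it from
    the grid cell size in a way this sketch does not reproduce.
    """
    if theta == 0.0:
        return 0.0
    magnitude = math.log(1.0 + (10.0 ** n) * abs(theta))
    return math.copysign(magnitude, theta)


# Small curvature values of opposite sign map to symmetric outputs.
print(log_transform_curvature(0.00002, 5))
print(log_transform_curvature(-0.00002, 5))
```

The transform keeps the sign of the curvature and maps zero to zero, so convex and concave cells remain distinguishable after stretching.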
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
maximal_curvature, tangential_curvature, profile_curvature, plan_curvature, mean_curvature, gaussian_curvature
Python API
def minimal_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Plan Curvature
Function name: plan_curvature
This tool calculates the plan curvature (i.e. contour curvature), or the rate of change in aspect along a contour line, from a digital elevation model (DEM). Curvature is the second derivative of the topographic surface defined by a DEM. Plan curvature characterizes the degree of flow convergence or divergence within the landscape (Gallant and Wilson, 2000). The user must input a DEM (dem). WhiteboxTools reports curvature in degrees multiplied by 100 for easier interpretation. The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. If the DEM is in the geographic coordinate system (latitude and longitude), the following equation is used:
zfactor = 1.0 / (111320.0 x cos(mid_lat))
where mid_lat is the latitude of the centre of the raster, in radians.
The algorithm uses the same formula for the calculation of plan curvature as Gallant and Wilson (2000). Plan curvature is negative for diverging flow along ridges and positive for convergent areas, e.g. along valley bottoms.
Reference
Gallant, J. C., and J. P. Wilson, 2000, Primary topographic attributes, in Terrain Analysis: Principles and Applications, edited by J. P. Wilson and J. C. Gallant pp. 51-86, John Wiley, Hoboken, N.J.
See Also
profile_curvature, tangential_curvature, total_curvature, slope, aspect
Python API
def plan_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Principal Curvature Direction
Function name: principal_curvature_direction
Experimental
Calculates the principal curvature direction angle (degrees).
geomorphometry terrain curvature principal_curvature_direction legacy-port
Parameters
Name | Description | Required | Default
input | Input DEM raster path or typed raster object. | Required | dem.tif
z_factor | Optional z conversion factor (default 1.0). Alias: zfactor. | Optional | 1.0
log_transform | Optional log-transform of output values (default false). Alias: log. | Optional | False
output | Optional output path. If omitted, result is stored in memory. | Optional | none
Examples
Calculates principal_curvature_direction from a DEM.
wbe.principal_curvature_direction(input='dem.tif', log_transform=False, output='principal_curvature_direction.tif', z_factor=1.0)
Profile Curvature
Function name: profile_curvature
This tool calculates the profile curvature, or the rate of change in slope along a flow line, from a digital elevation model (DEM). Curvature is the second derivative of the topographic surface defined by a DEM. Profile curvature characterizes the degree of downslope acceleration or deceleration within the landscape (Gallant and Wilson, 2000). The user must input a DEM (dem). WhiteboxTools reports curvature in degrees multiplied by 100 for easier interpretation because curvature values are typically very small. The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. If the DEM is in the geographic coordinate system (latitude and longitude), the following equation is used:
zfactor = 1.0 / (111320.0 x cos(mid_lat))
where mid_lat is the latitude of the centre of the raster, in radians.
The algorithm uses the same formula for the calculation of profile curvature as Gallant and Wilson (2000). Profile curvature is negative for slope increasing downhill (convex flow profile, typical of upper slopes) and positive for slope decreasing downhill (concave, typical of lower slopes).
Reference
Gallant, J. C., and J. P. Wilson, 2000, Primary topographic attributes, in Terrain Analysis: Principles and Applications, edited by J. P. Wilson and J. C. Gallant pp. 51-86, John Wiley, Hoboken, N.J.
See Also
plan_curvature, tangential_curvature, total_curvature, slope, aspect
Python API
def profile_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Relative Aspect
Function name: relative_aspect
This tool creates a new raster in which each grid cell is assigned the terrain aspect relative to a user-specified direction (azimuth). Relative terrain aspect is the angular distance (measured in degrees) between the land-surface aspect and the assumed regional wind azimuth (Böhner and Antonić, 2009). It is bounded between 0 degrees (windward direction) and 180 degrees (leeward direction). Relative terrain aspect is the simplest of the measures of topographic exposure to wind, taking into account terrain orientation only and neglecting the influences of topographic shadowing by distant landforms and the deflection of wind by topography.
The user must input a digital elevation model (DEM) (dem) and an azimuth (i.e. a wind direction). The Z Conversion Factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM, and the DEM is in a projected coordinate system. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor.
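The angular-distance definition above can be illustrated for a single cell. This sketch assumes the aspect has already been computed in degrees; the function name relative_aspect_angle is hypothetical, and cells without a defined aspect (flat areas) are not handled.

```python
def relative_aspect_angle(aspect_deg: float, azimuth_deg: float) -> float:
    """Angular distance in degrees (0 to 180) between aspect and azimuth."""
    # Absolute difference wrapped into the range [0, 360).
    diff = abs(aspect_deg - azimuth_deg) % 360.0
    # Take the shorter way around the compass.
    return 360.0 - diff if diff > 180.0 else diff


# A slope facing 10 degrees with a 350-degree wind is only 20 degrees off.
print(relative_aspect_angle(10.0, 350.0))
# A south-facing slope with a northerly wind is fully leeward.
print(relative_aspect_angle(180.0, 0.0))
```

Wrapping the difference before folding it keeps the result correct when the aspect and the azimuth lie on either side of north.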
Reference
Böhner, J., and Antonić, O. (2009). Land-surface parameters specific to topo-climatology. Developments in Soil Science, 33, 195-226.
See Also
aspect
Python API
def relative_aspect(self, dem: Raster, azimuth: float = 0.0, z_factor: float = 1.0) -> Raster:
Ring Curvature
Function name: ring_curvature
Description
This tool calculates the ring curvature, which is the product of horizontal excess and vertical excess curvatures (Shary, 1995), from a digital elevation model (DEM). Like rotor, ring curvature is used to measure flow line twisting. Ring curvature has values equal to or greater than zero and is measured in units of m⁻².
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary PA (1995) Land surface in gravity points classification by a complete system of curvatures. Mathematical Geology 27: 373–390.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
rotor, minimal_curvature, maximal_curvature, mean_curvature, gaussian_curvature, profile_curvature, tangential_curvature
Python API
def ring_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Rotor
Function name: rotor
Description
This tool calculates the spatial pattern of rotor, which describes the degree to which a flow line twists (Shary, 1991), from a digital elevation model (DEM). Rotor has an unbounded range, with positive values indicating that a flow line turns clockwise and negative values indicating flow lines that turn counterclockwise (Florinsky, 2017). Rotor is measured in units of m⁻¹.
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary PA (1991) The second derivative topographic method. In: Stepanov IN (ed) The Geometry of the Earth Surface Structures. Pushchino, USSR: Pushchino Research Centre Press, 30–60 (in Russian).
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
ring_curvature, profile_curvature, tangential_curvature, plan_curvature, mean_curvature, gaussian_curvature, minimal_curvature, maximal_curvature
Python API
def rotor(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Shape Index
Function name: shape_index
Description
This tool calculates the shape index (Koenderink and van Doorn, 1992) from a digital elevation model (DEM). This variable ranges from -1 to 1, with positive values indicative of convex landforms and negative values corresponding to concave landforms (Florinsky, 2017). Absolute values from 0.5 to 1.0 are associated with elliptic surfaces (hills and closed depressions), while absolute values from 0.0 to 0.5 are typical of hyperbolic surface form (saddles). Shape index is a dimensionless variable and has utility in landform classification applications.
Koenderink and van Doorn (1992) make the following observations about the shape index:
Two shapes for which the shape index differs merely by sign represent complementary pairs that will fit together as ‘stamp’ and ‘mould’ when suitably scaled;
The shape for which the shape index vanishes - and consequently has indeterminate sign - represents the objects which are congruent to their own moulds;
Convexities and concavities find their places on opposite sides of the shape scale. These basic shapes are separated by those shapes which are neither convex nor concave, that are the saddle-like objects. The transitional shapes that divide the convexities/concavities from the saddle-shapes are the cylindrical ridge and the cylindrical rut.
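The value ranges described above lend themselves to a simple classification of an output cell. The sketch below is illustrative only: the function name and the returned labels are hypothetical, and the choice to place the boundary value 0.5 (the cylindrical ridge or rut) in the elliptic class is an assumption.

```python
def classify_shape_index(si: float) -> str:
    """Label a shape index value using the ranges given in the text."""
    if not -1.0 <= si <= 1.0:
        raise ValueError("shape index must lie between -1 and 1")
    if si == 0.0:
        # The sign is indeterminate where the shape index vanishes.
        return "hyperbolic, indeterminate sign"
    form = "convex" if si > 0.0 else "concave"
    # Assumption: |si| of exactly 0.5 is grouped with elliptic surfaces.
    surface = "elliptic" if abs(si) >= 0.5 else "hyperbolic"
    return f"{form} {surface}"


for value in (0.8, 0.2, -0.2, -0.8):
    print(value, classify_shape_index(value))
```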
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Koenderink, J. J., and Van Doorn, A. J. (1992). Surface shape and curvature scales. Image and vision computing, 10(8), 557-564.
See Also
curvedness, minimal_curvature, maximal_curvature, tangential_curvature, profile_curvature, mean_curvature, gaussian_curvature
Python API
def shape_index(self, dem: Raster, z_factor: float = 1.0) -> Raster:
Slope
Function name: slope
This tool calculates slope gradient (i.e. slope steepness in degrees, radians, or percent) for each grid cell in an input digital elevation model (DEM). The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM, and the DEM is in a projected coordinate system. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor to perform the unit conversion.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
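The three slope units mentioned above are related by simple trigonometry. The sketch below converts a dimensionless gradient (rise over run) into each unit. The function name is hypothetical, and while "degrees" matches the default in the Python signature, the strings "radians" and "percent" are assumed spellings rather than confirmed API values.

```python
import math


def slope_in_units(rise_over_run: float, units: str = "degrees") -> float:
    """Express a gradient (rise over run) as degrees, radians, or percent."""
    if units == "degrees":
        return math.degrees(math.atan(rise_over_run))
    if units == "radians":
        return math.atan(rise_over_run)
    if units == "percent":
        return 100.0 * rise_over_run
    raise ValueError(f"unsupported units: {units}")


# A gradient of 1.0 (one unit of rise per unit of run) in each unit.
for unit in ("degrees", "radians", "percent"):
    print(unit, slope_in_units(1.0, unit))
```

Percent slope is linear in the gradient, while the angular units saturate towards 90 degrees, so the two diverge quickly on steep terrain.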
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Gallant, J. C., and J. P. Wilson, 2000, Primary topographic attributes, in Terrain Analysis: Principles and Applications, edited by J. P. Wilson and J. C. Gallant pp. 51-86, John Wiley, Hoboken, N.J.
See Also
aspect, plan_curvature, profile_curvature
Python API
def slope(self, dem: Raster, units: str = "degrees", z_factor: float = 1.0) -> Raster:
Tangential Curvature
Function name: tangential_curvature
This tool calculates the tangential curvature, which is the curvature of an inclined plane perpendicular to both the direction of flow and the surface (Gallant and Wilson, 2000). Curvature is a second derivative of the topographic surface defined by a digital elevation model (DEM). The user must input a DEM (dem). The output reports curvature in degrees multiplied by 100 for easier interpretation, as curvature values are often very small. The Z Conversion Factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. If the DEM is in the geographic coordinate system (latitude and longitude), with XY units measured in degrees, an appropriate Z Conversion Factor is calculated internally based on site latitude.
Reference
Gallant, J. C., and J. P. Wilson, 2000, Primary topographic attributes, in Terrain Analysis: Principles and Applications, edited by J. P. Wilson and J. C. Gallant pp. 51-86, John Wiley, Hoboken, N.J.
See Also
plan_curvature, profile_curvature, total_curvature, slope, aspect
Python API
def tangential_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Total Curvature
Function name: total_curvature
This tool calculates the total curvature, which measures the curvature of the topographic surface rather than the curvature of a line across the surface in some direction (Gallant and Wilson, 2000). Total curvature can be positive or negative, with zero curvature indicating that the surface is either flat or the convexity in one direction is balanced by the concavity in another direction, as would occur at a saddle point. Curvature is a second derivative of the topographic surface defined by a digital elevation model (DEM). The user must input a DEM (dem). The output reports curvature in degrees multiplied by 100 for easier interpretation, as curvature values are often very small. The Z Conversion Factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. If the DEM is in the geographic coordinate system (latitude and longitude), with XY units measured in degrees, an appropriate Z Conversion Factor is calculated internally based on site latitude.
Reference
Gallant, J. C., and J. P. Wilson, 2000, Primary topographic attributes, in Terrain Analysis: Principles and Applications, edited by J. P. Wilson and J. C. Gallant pp. 51-86, John Wiley, Hoboken, N.J.
See Also
plan_curvature, profile_curvature, tangential_curvature, slope, aspect
Python API
def total_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Unsphericity
Function name: unsphericity
Description
This tool calculates the spatial pattern of unsphericity curvature, which describes the degree to which the shape of the topographic surface is nonspherical at a given point (Shary, 1995), from a digital elevation model (DEM). It is calculated as half the difference between the maximal_curvature and the minimal_curvature. Unsphericity has values equal to or greater than zero and is measured in units of m⁻¹. Larger values indicate locations that are less spherical in form.
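The definition above reduces to one line of arithmetic per cell. The following sketch assumes the maximal and minimal curvature values for a cell are already available; the function name is hypothetical and the tool itself works on rasters rather than single values.

```python
def unsphericity_from_principal(k_max: float, k_min: float) -> float:
    """Half the difference between maximal and minimal curvature."""
    if k_max < k_min:
        raise ValueError("maximal curvature cannot be below minimal curvature")
    return 0.5 * (k_max - k_min)


# Equal principal curvatures describe a locally spherical surface.
print(unsphericity_from_principal(0.01, 0.01))
# Curvatures of opposite sign (a saddle) give a larger value.
print(unsphericity_from_principal(0.02, -0.01))
```

Because the maximal curvature is never smaller than the minimal curvature, the result is always zero or greater, which matches the stated value range.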
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary PA (1995) Land surface in gravity points classification by a complete system of curvatures. Mathematical Geology 27: 373–390.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
minimal_curvature, maximal_curvature, mean_curvature, gaussian_curvature, profile_curvature, tangential_curvature
Python API
def unsphericity(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Vertical Excess Curvature
Function name: vertical_excess_curvature
Description
This tool calculates the vertical excess curvature from a digital elevation model (DEM). Vertical excess curvature is the difference between the profile (vertical) and minimal curvatures at a location (Shary, 1995). This variable is non-negative (zero or greater). Florinsky (2017) states that vertical excess curvature measures the extent to which the bending of a normal section having a common tangent line with a slope line is larger than the minimal bending at a given point of the surface. Vertical excess curvature is measured in units of m⁻¹.
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor. Curvature values are often very small and as such the user may opt to log-transform the output raster (log). Transforming the values applies the equation by Shary et al. (2002):
Θ' = sign(Θ) ln(1 + 10^n |Θ|)
where Θ is the parameter value and n is dependent on the grid cell size.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Florinsky, I. V. (2017). An illustrated introduction to general geomorphometry. Progress in Physical Geography, 41(6), 723-752.
Shary PA (1995) Land surface in gravity points classification by a complete system of curvatures. Mathematical Geology 27: 373–390.
Shary P. A., Sharaya L. S. and Mitusov A. V. (2002) Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–32.
See Also
tangential_curvature, profile_curvature, minimal_curvature, maximal_curvature, mean_curvature, gaussian_curvature
Python API
def vertical_excess_curvature(self, dem: Raster, log_transform: bool = False, z_factor: float = 1.0) -> Raster:
Visibility Analysis
Average Horizon Distance
Function name: average_horizon_distance
PRO, Experimental
Calculates average distance to horizon across azimuth directions.
geomorphometry terrain visibility
Horizon Angle
Function name: horizon_angle
This tool calculates the horizon angle (Sx), i.e. the maximum slope along a specified azimuth (0-360 degrees) for each grid cell in an input digital elevation model (DEM). Horizon angle is sometimes referred to as the maximum upwind slope in wind exposure/sheltering studies. Positive values can be considered sheltered with respect to the azimuth and negative values are exposed. Thus, Sx is a measure of exposure to a wind from a specific direction. The algorithm works by tracing a ray from each grid cell in the direction of interest and evaluating the slope for each location in which the DEM grid is intersected by the ray. Linear interpolation is used to estimate the elevation of the surface where a ray does not intersect the DEM grid precisely at one of its nodes.
The user is able to constrain the maximum search distance (max_dist) for the ray tracing by entering a valid maximum search distance value (in the same units as the X-Y coordinates of the input raster DEM). If the maximum search distance is left blank, each ray will be traced to the edge of the DEM, which will add to the computational time.
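The ray-tracing idea can be sketched in one dimension. The example below is a deliberate simplification, not the tool's algorithm: it assumes the azimuth is aligned with a row of the grid, so the ray passes exactly through cell centres and no interpolation is needed. The function name is hypothetical, and returning 0.0 when no cell lies within the search distance is an assumption.

```python
import math
from typing import Sequence


def horizon_angle_along_profile(
    elevations: Sequence[float],
    start: int,
    cell_size: float,
    max_dist: float = float("inf"),
) -> float:
    """Maximum slope, in degrees, from elevations[start] to later cells."""
    z0 = elevations[start]
    best_slope = None
    for i in range(start + 1, len(elevations)):
        dist = (i - start) * cell_size
        if dist > max_dist:
            break  # Stop tracing once the search distance is exceeded.
        slope = (elevations[i] - z0) / dist
        if best_slope is None or slope > best_slope:
            best_slope = slope
    if best_slope is None:
        return 0.0  # Assumption: no cells in range is treated as level.
    return math.degrees(math.atan(best_slope))


profile = [100.0, 100.0, 110.0, 100.0]
print(horizon_angle_along_profile(profile, 0, 10.0))
print(horizon_angle_along_profile(profile, 0, 10.0, max_dist=10.0))
```

Limiting max_dist shortens the loop for every cell, which is why a finite search distance reduces run time on large DEMs.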
Maximum upwind slope should not be calculated for very extensive areas over which the Earth's curvature must be taken into account. Also, this index does not take into account the deflection of wind by topography. However, averaging the horizon angle over a window of directions can yield a more robust measure of exposure, compensating for the deflection of wind from its regional average by the topography. For example, if you are interested in measuring the exposure of a landscape to a northerly wind, you could perform the following calculation:
Sx(N) = [Sx(345)+Sx(350)+Sx(355)+Sx(0)+Sx(5)+Sx(10)+Sx(15)] / 7.0
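The averaging in the formula above can be written generically. In this sketch the caller supplies a function that returns Sx for a given azimuth; for the real tool this would be one horizon_angle output per azimuth, but here it is a stand-in for a single cell. The names window_azimuths and averaged_horizon_angle are hypothetical.

```python
from typing import Callable, List


def window_azimuths(centre: float, half_width: float, step: float) -> List[float]:
    """Azimuths from centre - half_width to centre + half_width, wrapped to 0-360."""
    count = int(round(half_width / step))
    return [(centre + k * step) % 360.0 for k in range(-count, count + 1)]


def averaged_horizon_angle(
    sx: Callable[[float], float],
    centre: float = 0.0,
    half_width: float = 15.0,
    step: float = 5.0,
) -> float:
    """Mean of sx(azimuth) over the window of azimuths around centre."""
    azimuths = window_azimuths(centre, half_width, step)
    return sum(sx(a) for a in azimuths) / len(azimuths)


# The northerly window from the text: 345, 350, 355, 0, 5, 10, 15.
print(window_azimuths(0.0, 15.0, 5.0))
```

Wrapping with the modulo operator is what lets a window centred on north include azimuths on both sides of 0 degrees.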
Ray tracing is computationally intensive, so this tool may take considerable time to run on larger DEMs. Horizon angle (maximum upwind slope) is best visualized using an inverted greyscale (white-to-black) palette, rescaled from approximately -10 to 70.
See Also
time_in_daylight
Python API
def horizon_angle(self, dem: Raster, azimuth: float = 0.0, max_dist: float = float('inf')) -> Raster:
Horizon Area
Function name: horizon_area
PRO, Experimental
Calculates area of the horizon polygon (hectares).
geomorphometry terrain visibility
Openness
Function name: openness
Description
This tool calculates the Yokoyama et al. (2002) topographic openness index from an input DEM (input). Openness has two viewer perspectives, which correspond with positive and negative openness outputs (pos_output and neg_output). Positive values, expressing openness above the surface, are high for convex forms, whereas negative values describe this attribute below the surface and are high for concave forms. Openness is an angular value that is an average of the horizon angle in the eight cardinal directions to a maximum search distance (dist), measured in grid cells. Openness rasters are best visualized using a greyscale palette.
(Figures omitted: example positive openness and negative openness rasters.)
References
Yokoyama, R., Shirasawa, M., & Pike, R. J. (2002). Visualizing topography by openness: a new application of image processing to digital elevation models. Photogrammetric engineering and remote sensing, 68(3), 257-266.
See Also
viewshed, horizon_angle, time_in_daylight, hillshade
Python API
def openness(self, dem: Raster, dist: int = 20) -> Tuple[Raster, Raster]:
Shadow Animation
Function name: shadow_animation
PRO, Experimental
Creates an interactive HTML viewer and animated GIF showing terrain shadows throughout a day.
geomorphometry terrain visibility solar animation legacy-port
Examples
Generate a day-long shadow animation from a DEM or DSM.
wbe.shadow_animation(date='21/06/2021', dem='dsm.tif', location='43.5448/-80.2482/-4', output='shadow_animation.html')
Shadow Image
Function name: shadow_image
PRO, Experimental
Generates a terrain shadow intensity raster for a specified date, time, and location.
geomorphometry terrain visibility solar legacy-port
Sky View Factor
Function name: sky_view_factor
This tool calculates the sky-view factor (SVF) from an input digital elevation model (DEM) or digital surface model (DSM). The SVF is the proportion of the celestial hemisphere above a point on the earth's surface that is not obstructed by the surrounding land surface. It is often used to model the diffuse light that is received at the surface and has also been applied as a relief-shading technique (Böhner and Antonić, 2009; Zakšek et al., 2011).
The user must specify an input DEM (dem), the azimuth fraction (az_fraction), the maximum search distance (max_dist), and the height offset of the observer (observer_hgt_offset). The input DEM should usually be a digital surface model (DSM) that contains significant off-terrain objects. Such a model, for example, could be created using the first-return points of a LiDAR data set, or using the lidar_digital_surface_model tool. The azimuth fraction should be an even divisor of 360-degrees and must be between 1-45 degrees.
The tool operates by calculating horizon angle (see horizon_angle) rasters from the DSM based on the user-specified azimuth fraction (az_fraction). For example, if an azimuth fraction of 15-degrees is specified, horizon angle rasters would be calculated for the azimuths 0, 15, 30, 45... A horizon angle raster evaluates the vertical angle between each grid cell in a DSM and a distant obstacle (e.g. a mountain ridge, building, tree, etc.) that obscures the view in a specified direction. In calculating horizon angle, the user must specify the maximum search distance (max_dist), in map units, beyond which the query for higher, more distant objects will cease. This parameter strongly impacts the performance of the function, with larger values resulting in significantly longer processing times.
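The azimuth fraction rules described above can be checked before a long run. The sketch below lists the azimuths implied by a given az_fraction and rejects values outside the stated limits. The function name is hypothetical, and whether the tool itself raises an error or silently adjusts an invalid value is not stated in this manual, so the strict behaviour here is an assumption.

```python
from typing import List


def azimuths_for_fraction(az_fraction: float) -> List[float]:
    """Azimuths (degrees) sampled for a given azimuth fraction."""
    if not 1.0 <= az_fraction <= 45.0:
        raise ValueError("az_fraction must be between 1 and 45 degrees")
    count = 360.0 / az_fraction
    if abs(count - round(count)) > 1e-9:
        raise ValueError("az_fraction must divide 360 degrees evenly")
    return [i * az_fraction for i in range(int(round(count)))]


# A 15-degree fraction yields 24 directions: 0, 15, 30, 45, ...
directions = azimuths_for_fraction(15.0)
print(len(directions), directions[:4])
```

Each direction requires one horizon angle pass, so halving az_fraction roughly doubles the amount of ray tracing.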
This tool uses the method described by Zakšek et al. (2011) to calculate SVF, which differs slightly from the method described by Böhner and Antonić (2009), as implemented in the Saga software. Most notably the Whitebox implementation does not involve local surface slope gradient and is closer in definition to the Saga 'Visible Sky' index.
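As a rough illustration of the Zakšek et al. (2011) formulation, the SVF for one cell can be estimated as one minus the mean sine of the horizon angles over all sampled directions. The sketch below applies that expression to a list of angles. It is not the Whitebox implementation: the function name is hypothetical, and clamping negative horizon angles to zero is an assumption made here so that a horizon below the observer does not raise the result above 1.

```python
import math
from typing import Sequence


def sky_view_factor_from_horizon(horizon_angles_deg: Sequence[float]) -> float:
    """SVF estimate: 1 - mean(sin(horizon angle)) over all directions."""
    if not horizon_angles_deg:
        raise ValueError("at least one horizon angle is required")
    total = 0.0
    for angle in horizon_angles_deg:
        # Assumption: horizons below the observer count as unobstructed.
        clamped = max(angle, 0.0)
        total += math.sin(math.radians(clamped))
    return 1.0 - total / len(horizon_angles_deg)


# An open plain (all horizons at 0 degrees) sees the whole hemisphere.
print(sky_view_factor_from_horizon([0.0] * 8))
# A uniform 30-degree horizon hides about half of it under this estimate.
print(sky_view_factor_from_horizon([30.0] * 8))
```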
There are other significant differences between the Whitebox and Saga implementations of SVF. For a given maximum search distance, the Whitebox SVF will be substantially faster to calculate. Furthermore, the Whitebox implementation has the ability to specify a height offset of the observer from the ground surface, using the observer_hgt_offset parameter. For example, the following image shows the spatial pattern derived from a LiDAR DSM using observer_hgt_offset = 0.0:
Notice that there are several places, particularly on the flatter rooftops, where the local noise in the LiDAR DEM, associated with the individual scan lines, has resulted in a somewhat noisy pattern in the output. By adding a small height offset of the scale of this noise variation (0.15 m), we see that most of this noisy pattern is removed in the output below:
This feature makes the function more robust against DEM noise. As another example of the usefulness of this additional parameter, in the image below, the observer_hgt_offset parameter has been used to measure the pattern of the index at a typical human height (1.7 m):
Notice how overall visibility increases at this height.
References
Böhner, J. and Antonić, O., 2009. Land-surface parameters specific to topo-climatology. Developments in soil science, 33, pp.195-226.
Zakšek, K., Oštir, K. and Kokalj, Ž., 2011. Sky-view factor as a relief visualization technique. Remote sensing, 3(2), pp.398-415.
See Also
average_horizon_distance, horizon_area, openness, lidar_digital_surface_model, horizon_angle
Python API
def sky_view_factor(self, dem: Raster, az_fraction: float = 5.0, max_dist: float = float('inf'), observer_hgt_offset: float = 0.0) -> Raster:
Skyline Analysis
Function name: skyline_analysis
PRO Experimental
Performs skyline analysis for one or more observation points and writes a vector horizon trace plus HTML report.
geomorphometry terrain visibility skyline
Time In Daylight
Function name: time_in_daylight
This tool calculates the proportion of time a location is within daylight. That is, it calculates the proportion of time, during a user-defined time frame, that a grid cell in an input digital elevation model (dem) is outside of an area of shadow cast by a local object. The input DEM should truly be a digital surface model (DSM) that contains significant off-terrain objects. Such a model, for example, could be created using the first-return points of a LiDAR data set, or using the lidar_digital_surface_model tool.
The tool operates by calculating a solar almanac, which estimates the sun's position for the location, in latitude and longitude coordinate (lat, long), of the input DSM. The algorithm then calculates horizon angle (see horizon_angle) rasters from the DSM based on the user-specified azimuth fraction (az_fraction). For example, if an azimuth fraction of 15-degrees is specified, horizon angle rasters could be calculated for the solar azimuths 0, 15, 30, 45... In reality, horizon angle rasters are only calculated for azimuths for which the sun is above the horizon for some time during the tested time period. A horizon angle raster evaluates the vertical angle between each grid cell in a DSM and a distant obstacle (e.g. a mountain ridge, building, tree, etc.) that blocks the view along a specified direction. In calculating horizon angle, the user must specify the maximum search distance (max_dist) beyond which the query for higher, more distant objects will cease. This parameter strongly impacts the performance of the tool, with larger values resulting in significantly longer run-times. Users are advised to set the max_dist based on the maximum shadow length expected in an area. For example, in a relatively flat urban landscape, the tallest building will likely determine the longest shadow lengths. All grid cells for which the calculated solar positions throughout the time frame are higher than the cell's horizon angle are deemed to be illuminated during the time the sun is in the corresponding azimuth fraction.
By default, the tool calculates time-in-daylight for a time-frame spanning an entire year. That is, the solar almanac is calculated for each hour, at 10-second intervals, and for each day of the year. Users may alternatively restrict the time of year over which time-in-daylight is calculated by specifying a starting day (1-365; start_day) and ending day (1-365; end_day). Similarly, by specifying start time (start_time) and end time (end_time) parameters, the user is able to measure time-in-daylight for specific ranges of the day (e.g. for the morning or afternoon hours). These time parameters must be specified in 24-hour time (HH:MM:SS), e.g. 15:30:00. sunrise and sunset are also acceptable inputs for the start time and end time respectively. The timing of sunrise and sunset on each day in the tested time-frame will be determined using the solar almanac.
See Also
lidar_digital_surface_model, horizon_angle
Python API
def time_in_daylight(self, dem: Raster, az_fraction: float = 5.0, max_dist: float = float('inf'), latitude: float = 0.0, longitude: float = 0.0, utc_offset_str: str = "UTC+00:00", start_day: int = 1, end_day: int = 365, start_time: str = "sunrise", end_time: str = "sunset") -> Raster:
Viewshed
Function name: viewshed
This tool can be used to calculate the viewshed (i.e. the visible area) from a location (i.e. viewing station) or group of locations based on the topography defined by an input digital elevation model (DEM). The user must input a DEM (dem), a viewing station input vector file (stations) and the viewing height (height). Viewing station locations are specified as points within an input shapefile. The output image indicates the number of stations visible from each grid cell. The viewing height is in the same units as the elevations of the DEM and represents a height above the ground elevation from which the viewshed is calculated.
viewshed should be used when there are a relatively small number of target sites for which visibility needs to be assessed. If you need to assess general landscape visibility as a land-surface parameter, the visibility_index tool should be used instead.
Viewshed analysis is a very computationally intensive task. Depending on the size of the input DEM grid and the number of viewing stations, this operation may take considerable time to complete. Also, this implementation of the viewshed algorithm does not account for the curvature of the Earth. This should be accounted for if viewsheds are being calculated over very extensive areas.
See Also
visibility_index
Python API
def viewshed(self, dem: Raster, station_points: Vector, station_height: float = 2.0) -> Raster:
Visibility Index
Function name: visibility_index
This tool can be used to calculate a measure of landscape visibility based on the topography of an input digital elevation model (DEM). The user must input a DEM (dem), the viewing height (height), and a resolution factor (res_factor). Viewsheds are calculated for a subset of grid cells in the DEM based on the resolution factor. The visibility index value (0.0-1.0) indicates the proportion of tested stations (determined by the resolution factor) that each cell is visible from. The viewing height is in the same units as the elevations of the DEM and represents a height above the ground elevation. Each tested grid cell's viewshed will be calculated in parallel. However, visibility index is one of the most computationally intensive geomorphometric indices to calculate. Depending on the size of the input DEM grid and the resolution factor, this operation may take considerable time to complete. If the task is too long-running, it is advisable to raise the resolution factor. A resolution factor of 2 will skip every second row and every second column (effectively evaluating the viewsheds of a quarter of the DEM's grid cells). Increasing this value decreases the number of calculated viewsheds but will result in a lower-accuracy estimate of overall visibility. In addition to the high computational costs of this index, the tool also requires substantial memory resources to operate. Each of these limitations should be considered before running this tool on a particular data set. This tool is best applied on computer systems with high core counts and plenty of memory.
See Also
viewshed
Python API
def visibility_index(self, dem: Raster, station_height: float = 2.0, resolution_factor: int = 8) -> Raster:
Roughness and Texture
Average Normal Vector Angular Deviation
Function name: average_normal_vector_angular_deviation
This tool characterizes the spatial distribution of the average normal vector angular deviation, a measure of surface roughness. Working in the field of 3D printing, Ko et al. (2016) defined a measure of surface roughness based on quantifying the angular deviations in the direction of the normal vector of a real surface from its ideal (i.e. smoothed) form. This measure of surface complexity is therefore in units of degrees. Specifically, roughness is defined in this study as the neighborhood-averaged difference in the normal vectors of the original DEM and a smoothed DEM surface. Smoothed surfaces are derived by applying a Gaussian blur of the same size as the neighborhood (filter).
The multiscale_roughness tool calculates the same measure of surface roughness, except that it is designed to work with multiple spatial scales.
Reference
Ko, M., Kang, H., ulrim Kim, J., Lee, Y., & Hwang, J. E. (2016, July). How to measure quality of affordable 3D printing: Cultivating quantitative index in the user community. In International Conference on Human-Computer Interaction (pp. 116-121). Springer, Cham.
Lindsay, J. B., & Newman, D. R. (2018). Hyper-scale analysis of surface roughness. PeerJ Preprints, 6, e27110v1.
See Also
multiscale_roughness, spherical_std_dev_of_normals, circular_variance_of_aspect
Python API
def average_normal_vector_angular_deviation(self, dem: Raster, filter_size: int = 11) -> Raster:
Circular Variance Of Aspect
Function name: circular_variance_of_aspect
This tool can be used to calculate the circular variance (i.e. one minus the mean resultant length) of aspect for a digital elevation model (DEM). This is a measure of how variable slope aspect is within a local neighbourhood of a specified size (filter). circular_variance_of_aspect is therefore a measure of surface shape complexity, or texture. It will take a value of 0.0 for smooth sites and near 1.0 in areas of high surface roughness or complex topography.
The local neighbourhood size (filter) must be any odd integer equal to or greater than three. Grohmann et al. (2010) found that vector dispersion, a related measure of angular variance, increases monotonically with scale. This is the result of the angular dispersion measure integrating (accumulating) all of the surface variance of smaller scales up to the test scale. A more interesting scale relation can therefore be estimated by isolating the amount of surface complexity associated with specific scale ranges. That is, at large spatial scales, the metric should reflect the texture of large-scale landforms rather than the accumulated complexity at all smaller scales, including microtopographic roughness. As such, this tool normalizes the surface complexity of scales that are smaller than the filter size by applying Gaussian blur (with a standard deviation of one-third the filter size) to the DEM prior to calculating circular_variance_of_aspect. In this way, the resulting distribution is able to isolate and highlight the surface shape complexity associated with landscape features of a similar scale to that of the filter size.
This tool makes extensive use of integral images (i.e. summed-area tables) and parallel processing to ensure computational efficiency. It may, however, require substantial memory resources when applied to larger DEMs.
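The statistic itself, one minus the mean resultant length, can be sketched for a small set of aspect values. This is only an illustration of the definition; it does not reproduce the Gaussian pre-smoothing or the integral-image implementation described above, and the function name is hypothetical.

```python
import math


def circular_variance(aspects_deg: list[float]) -> float:
    """One minus the mean resultant length of a set of aspect angles (degrees)."""
    n = len(aspects_deg)
    if n == 0:
        raise ValueError("at least one aspect value is required")
    sum_cos = sum(math.cos(math.radians(a)) for a in aspects_deg)
    sum_sin = sum(math.sin(math.radians(a)) for a in aspects_deg)
    mean_resultant_length = math.hypot(sum_cos, sum_sin) / n
    return 1.0 - mean_resultant_length


# A planar hillslope: every cell faces the same direction.
print(circular_variance([90.0] * 9))
# Aspects spread evenly around the compass.
print(circular_variance([0.0, 90.0, 180.0, 270.0]))
```

The first case yields a value near 0.0 and the second a value near 1.0, matching the interpretation given for smooth and complex terrain.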
References
Grohmann, C. H., Smith, M. J., & Riccomini, C. (2010). Multiscale analysis of topographic surface roughness in the Midland Valley, Scotland. IEEE Transactions on Geoscience and Remote Sensing, 49(4), 1200-1213.
See Also
aspect, spherical_std_dev_of_normals, multiscale_roughness, edge_density, surface_area_ratio, ruggedness_index
Python API
def circular_variance_of_aspect(self, dem: Raster, filter_size: int = 11) -> Raster:
Edge Density
Function name: edge_density
This tool calculates the density of edges, or breaks-in-slope, within an input digital elevation model (DEM). A break-in-slope occurs between two neighbouring grid cells if the angular difference between their normal vectors is greater than a user-specified threshold value (norm_diff). edge_density calculates the proportion of edge cells within the neighbouring window, of square filter dimension filter, surrounding each grid cell. edge_density is therefore a measure of how complex the topographic surface is within a local neighbourhood, that is, a measure of topographic texture. It will take a value near 0.0 for smooth sites and 1.0 in areas of high surface roughness or complex topography.
The distribution of edge_density is highly dependent upon the value of the norm_diff used in the calculation. This threshold may require experimentation to find an appropriate value and is likely dependent upon the topography and source data. Nonetheless, experience has shown that edge_density provides one of the best measures of surface texture of any of the available roughness tools.
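The break-in-slope test can be sketched as an angle comparison between two unit normal vectors. All function names below are hypothetical, and deriving the normal from elevation gradients is an assumption made for illustration; the tool's internal normal estimation may differ.

```python
import math


def normal_from_gradient(dz_dx: float, dz_dy: float) -> tuple[float, float, float]:
    """Unit normal of a surface patch with the given elevation gradients."""
    length = math.sqrt(dz_dx**2 + dz_dy**2 + 1.0)
    return (-dz_dx / length, -dz_dy / length, 1.0 / length)


def angle_between_deg(a, b) -> float:
    """Angle in degrees between two unit vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def is_edge(normal_a, normal_b, normal_diff_threshold: float = 5.0) -> bool:
    """True when the angular difference exceeds the threshold (degrees)."""
    return angle_between_deg(normal_a, normal_b) > normal_diff_threshold


def edge_density_of_window(edge_flags: list[bool]) -> float:
    """Proportion of edge cells within a window."""
    return sum(edge_flags) / len(edge_flags)


flat = normal_from_gradient(0.0, 0.0)
tilted = normal_from_gradient(math.tan(math.radians(10.0)), 0.0)
print(angle_between_deg(flat, tilted), is_edge(flat, tilted))
```

Raising the threshold classifies fewer cell pairs as edges, which is why the output distribution depends so strongly on this parameter.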
See Also
circular_variance_of_aspect, multiscale_roughness, surface_area_ratio, ruggedness_index
Python API
def edge_density(self, dem: Raster, filter_size: int = 11, normal_diff_threshold: float = 5.0, z_factor: float = 1.0) -> Raster:
Ruggedness Index
Function name: ruggedness_index
The terrain ruggedness index (TRI) is a measure of local topographic relief. The TRI calculates the root-mean-square-deviation (RMSD) for each grid cell in a digital elevation model (DEM), calculating the residuals (i.e. elevation differences) between a grid cell and its eight neighbours. Notice that, unlike the output of this tool, the original Riley et al. (1999) TRI did not normalize for the number of cells in the local window (i.e. it is a root-square-deviation only). However, using the mean has the advantage of allowing for the varying number of neighbouring cells along the grid edges and in areas bordering NoData cells. This modification does, however, imply that the output of this tool cannot be directly compared with the index ranges of level to extremely rugged terrain provided in Riley et al. (1999).
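The RMSD form of the index can be sketched for a single cell of a small grid stored as nested lists. The function name, the nested-list representation, and the nodata argument are assumptions made for illustration; this is not the tool's implementation.

```python
import math


def ruggedness_rmsd(grid, row, col, nodata=None):
    """RMSD of elevation differences between a cell and its valid neighbours."""
    centre = grid[row][col]
    squared_diffs = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            # Skip neighbours that fall off the grid or hold NoData.
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                z = grid[r][c]
                if z != nodata:
                    squared_diffs.append((z - centre) ** 2)
    if not squared_diffs:
        return 0.0
    # Dividing by the neighbour count is what distinguishes this from
    # the root-square-deviation of Riley et al. (1999).
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


dem = [
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 1.0],
    [1.0, 1.0, 1.0],
]
print(ruggedness_rmsd(dem, 1, 1))  # interior cell, eight neighbours
print(ruggedness_rmsd(dem, 0, 0))  # corner cell, three neighbours
```

The corner cell shows why the mean is useful: only three neighbours contribute, yet the result stays on the same scale as interior cells.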
Reference
Riley, S. J., DeGloria, S. D., and Elliot, R. (1999). Index that quantifies topographic heterogeneity. Intermountain Journal of Sciences, 5(1-4), 23-27.
See Also
relative_topographic_position, DevFromMeanElev
Python API
def ruggedness_index(self, input: Raster) -> Raster:
Spherical Std Dev Of Normals
Function name: spherical_std_dev_of_normals
This tool can be used to calculate the spherical standard deviation of the distribution of surface normals for an input digital elevation model (DEM; dem). This is a measure of the angular dispersion of the surface normal vectors within a local neighbourhood of a specified size (filter). spherical_std_dev_of_normals is therefore a measure of surface shape complexity, texture, and roughness. The spherical standard deviation (s) is defined as:
s = √[-2ln(R / N)] × 180 / π
where R is the resultant vector length and N is the number of unit normal vectors within the local neighbourhood. s is measured in degrees and is zero for simple planes and increases infinitely with increasing surface complexity or roughness. Note that this formulation of the spherical standard deviation assumes an underlying wrapped normal distribution.
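The formula can be evaluated directly for a handful of unit normal vectors. The sketch below is illustrative only: the function name is hypothetical, and returning infinity when the resultant length is zero is an assumption used to avoid taking the logarithm of zero.

```python
import math


def spherical_std_dev(unit_normals) -> float:
    """Spherical standard deviation (degrees) of a set of unit normal vectors."""
    n = len(unit_normals)
    if n == 0:
        raise ValueError("at least one normal vector is required")
    rx = sum(v[0] for v in unit_normals)
    ry = sum(v[1] for v in unit_normals)
    rz = sum(v[2] for v in unit_normals)
    resultant = math.sqrt(rx * rx + ry * ry + rz * rz)  # R
    if resultant == 0.0:
        return math.inf  # assumption: fully dispersed normals
    ratio = min(1.0, resultant / n)  # R / N, clamped against rounding
    return math.sqrt(max(0.0, -2.0 * math.log(ratio))) * 180.0 / math.pi


plane = [(0.0, 0.0, 1.0)] * 9
theta = math.radians(30.0)
ridge = [
    (math.sin(theta), 0.0, math.cos(theta)),
    (-math.sin(theta), 0.0, math.cos(theta)),
]
print(spherical_std_dev(plane))
print(spherical_std_dev(ridge))
```

The planar case returns zero, while the two opposing 30-degree facets return a positive value in degrees.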
The local neighbourhood size (filter) must be any odd integer equal to or greater than three. Grohmann et al. (2010) found that vector dispersion, a related measure of angular dispersion, increases monotonically with scale. This is the result of the angular dispersion measure integrating (accumulating) all of the surface variance of smaller scales up to the test scale. A more interesting scale relation can therefore be estimated by isolating the amount of surface complexity associated with specific scale ranges. That is, at large spatial scales, s should reflect the texture of large-scale landforms rather than the accumulated complexity at all smaller scales, including microtopographic roughness. As such, this tool normalizes the surface complexity of scales that are smaller than the filter size by applying Gaussian blur (with a standard deviation of one-third the filter size) to the DEM prior to calculating R. In this way, the resulting distribution is able to isolate and highlight the surface shape complexity associated with landscape features of a similar scale to that of the filter size.
This tool makes extensive use of integral images (i.e. summed-area tables) and parallel processing to ensure computational efficiency. It may, however, require substantial memory resources when applied to larger DEMs.
References
Grohmann, C. H., Smith, M. J., & Riccomini, C. (2010). Multiscale analysis of topographic surface roughness in the Midland Valley, Scotland. IEEE Transactions on Geoscience and Remote Sensing, 49(4), 1200-1213.
Hodgson, M. E., and Gaile, G. L. (1999). A cartographic modeling approach for surface orientation-related applications. Photogrammetric Engineering and Remote Sensing, 65(1), 85-95.
Lindsay, J. B., Newman, D. R., and Francioni, A. (2019). Scale-optimized surface roughness for topographic analysis. Geosciences, 9(7), 322. DOI: 10.3390/geosciences9070322.
See Also
circular_variance_of_aspect, multiscale_roughness, edge_density, surface_area_ratio, ruggedness_index
Python API
def spherical_std_dev_of_normals(self, dem: Raster, filter_size: int = 11) -> Raster:
Standard Deviation Of Slope
Function name: standard_deviation_of_slope
Calculates the standard deviation of slope from an input DEM, a metric of roughness described by Grohmann et al. (2011).
Python API
def standard_deviation_of_slope(self, dem: Raster, filter_size: int = 11, z_factor: float = 1.0) -> Raster:
Landform Indices
Elev Relative To Min Max
Function name: elev_relative_to_min_max
This tool can be used to express the elevation of a grid cell in a digital elevation model (DEM) as a percentage of the relief between the DEM minimum and maximum values. As such, it provides a basic measure of relative topographic position.
See Also
elev_relative_to_watershed_min_max, elevation_above_stream, ElevAbovePit
Python API
def elev_relative_to_min_max(self, dem: Raster) -> Raster:
Geomorphons
Function name: geomorphons
This tool can be used to perform a geomorphons landform classification based on an input digital elevation model (dem). The geomorphons concept is based on line-of-sight analysis for the eight topographic profiles in the cardinal directions surrounding each grid cell in the input DEM. The relative sizes of the zenith angle of a profile's maximum elevation angle (i.e. horizon angle) and the nadir angle of a profile's minimum elevation angle are then used to generate a ternary (base-3) digit: 0 when the nadir angle is less than the zenith angle, 1 when the two angles differ by less than a user-defined flatness threshold (threshold), and 2 when the nadir angle is greater than the zenith angle. A ternary number is then derived from the digits assigned to each of the eight profiles, with digits sequenced counter-clockwise from east. This ternary number forms the geomorphons code assigned to the grid cell. There are 3^8 = 6561 possible codes, although many of these codes are equivalent geomorphons through rotations and reflections. Some of the remaining geomorphons also rarely if ever occur in natural topography. Jasiewicz et al. (2013) identified 10 common landform types by reclassifying related geomorphons codes. The user may choose to output these common forms (forms) rather than the raw ternary code. These landforms include:
Value | Landform Type
----- | -------------
1 | Flat
2 | Peak (summit)
3 | Ridge
4 | Shoulder
5 | Spur (convex)
6 | Slope
7 | Hollow (concave)
8 | Footslope
9 | Valley
10 | Pit (depression)
One of the main advantages of the geomorphons method is that, being based on minimum/maximum elevation angles, the scale used to estimate the landform type at a site adapts to the surrounding terrain. In principle, choosing a large value of search distance (search) should result in identification of a landform element regardless of its scale.
An experimental feature has been added to correct for global inclination. Global inclination biases the flatness threshold angle because it is measured relative to the z-axis, especially in locally flat areas. Including the residuals flag "flattens" the input by converting elevation to residuals of a 2-d linear model.
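The ternary encoding of the eight profiles can be sketched as follows. Both function names are hypothetical. The digit rule follows the description above; treating the first profile (east) as the most significant digit is an assumption, since the digit weighting used by the tool is not stated here.

```python
def geomorphon_digit(zenith_deg: float, nadir_deg: float,
                     flatness_threshold_deg: float = 1.0) -> int:
    """Ternary digit for one profile from its zenith and nadir angles."""
    if abs(nadir_deg - zenith_deg) < flatness_threshold_deg:
        return 1
    return 0 if nadir_deg < zenith_deg else 2


def geomorphon_code(digits: list[int]) -> int:
    """Combine eight ternary digits (counter-clockwise from east) into a code.

    Assumption: the first digit is the most significant one.
    """
    if len(digits) != 8 or any(d not in (0, 1, 2) for d in digits):
        raise ValueError("expected eight digits, each 0, 1 or 2")
    code = 0
    for digit in digits:
        code = code * 3 + digit
    return code


print(geomorphon_code([1] * 8))  # every profile within the flatness threshold
print(geomorphon_code([2] * 8))  # largest possible code
```

Whatever the digit order, the codes span 0 to 6560, which accounts for the 6561 possible values.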
Reference
Jasiewicz, J., and Stepinski, T. F. (2013). Geomorphons — a pattern recognition approach to classification and mapping of landforms. Geomorphology, 182, 147-156.
See Also
PennockLandformClass
Python API
def geomorphons(self, dem: Raster, search_distance: int = 1, flatness_threshold: float = 1.0, flatness_distance: int = 0, skip_distance: int = 0, output_forms: bool = True, analyze_residuals: bool = False) -> Raster:
Hypsometric Analysis
Function name: hypsometric_analysis
This tool can be used to derive the hypsometric curve, or area-altitude curve, of one or more input digital elevation models (DEMs) ('inputs'). A hypsometric curve is a histogram or cumulative distribution function of elevations in a geographical area.
See Also
SlopeVsElevationPlot
Python API
def hypsometric_analysis(self, dem_rasters: List[Raster], output_html_file: str, watershed_rasters: List[Raster] = None) -> None:
Multiscale Topographic Position Class
Function name: multiscale_topographic_position_class
Description
This tool classifies each DEM grid cell into one of nine multiscale topographic position classes by combining local- and broad-scale maximum standardized elevation deviation (DEVmax) responses. The tool computes DEVmax internally for two user-defined scale ranges and then applies ternary thresholds to both responses. The broad-scale response separates lowland, intermediate, and upland settings, while the local-scale response separates hollow, mid-position, and knoll settings.
The combined output classes are: 0 Lowland hollow, 1 Lowland mid-position, 2 Lowland knoll, 3 Intermediate hollow, 4 Intermediate mid-position, 5 Intermediate knoll, 6 Upland hollow, 7 Upland mid-position, and 8 Upland knoll. The output raster is categorical and is intended to be displayed using a fixed nine-class palette.
The local and broad scale ranges are each defined by minimum scale, maximum scale, and step size parameters. Thresholds (local_threshold and broad_threshold) control the ternary classification of the corresponding DEVmax mosaics. The optional min_patch_size parameter can be used to suppress very small mapped patches after classification. The optional output_confidence raster stores a class confidence value in the range [0, 1] based on the margin from the classification thresholds.
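The way two ternary responses combine into nine classes can be sketched in a few lines. The function names are hypothetical, and the symmetric rule used here (below -threshold is low, above +threshold is high, otherwise intermediate) is an assumption; the class numbering follows the list above.

```python
CLASS_NAMES = [
    "Lowland hollow", "Lowland mid-position", "Lowland knoll",
    "Intermediate hollow", "Intermediate mid-position", "Intermediate knoll",
    "Upland hollow", "Upland mid-position", "Upland knoll",
]


def ternary_level(dev_max: float, threshold: float) -> int:
    """0 = low, 1 = intermediate, 2 = high (assumed symmetric thresholds)."""
    if dev_max < -threshold:
        return 0
    if dev_max > threshold:
        return 2
    return 1


def position_class(local_dev: float, broad_dev: float,
                   local_threshold: float = 0.5,
                   broad_threshold: float = 0.5) -> int:
    """Combine the broad-scale and local-scale levels into a class 0-8."""
    broad = ternary_level(broad_dev, broad_threshold)
    local = ternary_level(local_dev, local_threshold)
    return 3 * broad + local


cls = position_class(local_dev=1.2, broad_dev=-0.9)
print(cls, CLASS_NAMES[cls])
```

In this example a locally elevated cell within a broadly low-lying setting falls into class 2, Lowland knoll.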
Reference
Lindsay, J. B., Cockburn, J. M. H., and Russell, H. A. J. (2015). An integral image approach to performing multi-scale topographic position analysis. Geomorphology, 245, 51-61.
See Also
max_elevation_deviation, multiscale_topographic_position_image
Python API
def multiscale_topographic_position_class(self, input: Raster, local_min_scale: int = 5, local_max_scale: int = 80, local_step_size: int = 1, broad_min_scale: int = 500, broad_max_scale: int = 2000, broad_step_size: int = 20, local_threshold: float = 0.5, broad_threshold: float = 0.5, min_patch_size: int = 0, output_path: Optional[str] = None, output_confidence_path: Optional[str] = None, callback: Any = None) -> Raster:
Pennock Landform Classification
Function name: pennock_landform_classification
This tool can be used to perform a simple landform classification based on measures of slope gradient and curvature derived from a user-specified digital elevation model (DEM). The classification scheme is based on the method proposed by Pennock, Zebarth, and DeJong (1987). The scheme divides a landscape into seven element types, including: convergent footslopes (CFS), divergent footslopes (DFS), convergent shoulders (CSH), divergent shoulders (DSH), convergent backslopes (CBS), divergent backslopes (DBS), and level terrain (L). The output raster image will record each of these base element types as:
Element Type | Code
------------- | -------
CFS | 1
DFS | 2
CSH | 3
DSH | 4
CBS | 5
DBS | 6
L | 7
The definition of each of the elements, based on the original Pennock et al. (1987) paper, is as follows:
PROFILE | GRADIENT | PLAN | Element
------- | -------- | ---- | -------
Concave (< -0.10) | High >3.0 | Concave ≤0.0 | CFS
Concave (< -0.10) | High >3.0 | Convex >0.0 | DFS
Convex (>0.10) | High >3.0 | Concave ≤0.0 | CSH
Convex (>0.10) | High >3.0 | Convex >0.0 | DSH
Linear (-0.10...0.10) | High >3.0 | Concave ≤0.0 | CBS
Linear (-0.10...0.10) | High >3.0 | Convex >0.0 | DBS
Where PROFILE is profile curvature, GRADIENT is the slope gradient, and PLAN is the plan curvature. Note that these values are likely landscape and data specific and can be adjusted by the user. Landscape classification schemes that are based on terrain attributes are highly sensitive to short-range topographic variability (i.e. roughness) and can benefit from pre-processing the DEM with a smoothing filter to reduce the effect of surface roughness and emphasize the longer-range topographic signal. The feature_preserving_smoothing tool offers excellent performance in smoothing DEMs without removing the sharpness of breaks-in-slope.
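The decision rules can be sketched as a small function. The function name is hypothetical. Two points are assumptions rather than documented behavior: cells whose gradient does not exceed the slope threshold are treated as level terrain (L), and values exactly on a curvature threshold are assigned to the linear or concave side.

```python
ELEMENT_CODES = {"CFS": 1, "DFS": 2, "CSH": 3, "DSH": 4, "CBS": 5, "DBS": 6, "L": 7}


def pennock_element(profile_curv: float, gradient: float, plan_curv: float,
                    slope_threshold: float = 3.0,
                    prof_curv_threshold: float = 0.1,
                    plan_curv_threshold: float = 0.0) -> str:
    """Classify one cell into a Pennock et al. (1987) element type."""
    # Assumption: low-gradient cells are level terrain.
    if gradient <= slope_threshold:
        return "L"
    # Assumption: plan curvature on the threshold counts as concave (convergent).
    convergent = plan_curv <= plan_curv_threshold
    if profile_curv < -prof_curv_threshold:   # concave profile: footslope
        return "CFS" if convergent else "DFS"
    if profile_curv > prof_curv_threshold:    # convex profile: shoulder
        return "CSH" if convergent else "DSH"
    return "CBS" if convergent else "DBS"     # linear profile: backslope


element = pennock_element(profile_curv=-0.5, gradient=5.0, plan_curv=-0.2)
print(element, ELEMENT_CODES[element])
```

The three threshold arguments mirror the defaults in the Python API signature below, so adjusting them shows how sensitive the classes are to landscape-specific values.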
Reference
Pennock, D.J., Zebarth, B.J., and DeJong, E. (1987) Landform classification and soil distribution in hummocky terrain, Saskatchewan, Canada. Geoderma, 40: 297-315.
See Also
feature_preserving_smoothing
Python API
def pennock_landform_classification(self, dem: Raster, slope_threshold: float = 3.0, prof_curv_threshold: float = 0.1, plan_curv_threshold: float = 0.0, z_factor: float = 1.0) -> Tuple[Raster, str]:
Percent Elev Range
Function name: percent_elev_range
Percent elevation range (PER) is a measure of local topographic position (LTP). It expresses the vertical position for a digital elevation model (DEM) grid cell (z0) as the percentage of the elevation range within the neighbourhood filter window, such that:
PER = z0 / (zmax - zmin) x 100
where z0 is the elevation of the window's center grid cell, zmax is the maximum neighbouring elevation, and zmin is the minimum neighbouring elevation.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
Compared with ElevPercentile and DevFromMeanElev, PER is a less robust measure of LTP that is susceptible to outliers in neighbouring elevations (e.g. the presence of off-terrain objects in the DEM).
References
Newman, D. R., Lindsay, J. B., and Cockburn, J. M. H. (2018). Evaluating metrics of local topographic position for multiscale geomorphometric analysis. Geomorphology, 312, 40-50.
See Also
ElevPercentile, DevFromMeanElev, DiffFromMeanElev, relative_topographic_position
Python API
def percent_elev_range(self, dem: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Relative Topographic Position
Function name: relative_topographic_position
Relative topographic position (RTP) is an index of local topographic position (i.e. how elevated or low-lying a site is relative to its surroundings) and is a modification of percent elevation range (PER; percent_elev_range) and accounts for the elevation distribution. Rather than positioning the central cell's elevation solely between the filter extrema, RTP is a piece-wise function that positions the central elevation relative to the minimum (zmin), mean (μ), and maximum values (zmax), within a local neighbourhood of a user-specified size (filterx, filtery), such that:
RTP = (z0 − μ) / (μ − zmin), if z0 < μ
OR
RTP = (z0 − μ) / (zmax - μ), if z0 >= μ
The resulting index is bound by the interval [−1, 1], where the sign indicates whether the cell is above or below the filter mean. Although RTP uses the mean to define two linear functions, the reliance on the filter extrema is expected to result in sensitivity to outliers. Furthermore, the use of the mean implies assumptions of unimodal and symmetrical elevation distribution.
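The piece-wise definition can be checked with a short sketch. The function name rtp is hypothetical, and returning 0.0 when the relevant denominator is zero (a perfectly flat window) is an assumption.

```python
def rtp(z0: float, z_min: float, z_mean: float, z_max: float) -> float:
    """Relative topographic position of the centre cell within its window."""
    if z0 < z_mean:
        denominator = z_mean - z_min
    else:
        denominator = z_max - z_mean
    if denominator == 0.0:
        return 0.0  # assumption: flat window
    return (z0 - z_mean) / denominator


# Window with minimum 10, mean 15 and maximum 30.
for z0 in (10.0, 12.5, 15.0, 20.0, 30.0):
    print(z0, rtp(z0, 10.0, 15.0, 30.0))
```

Cells at the window minimum and maximum map to -1 and 1 respectively, and a cell at the mean maps to 0, regardless of how skewed the window is.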
In many cases, Elevation Percentile (ElevPercentile) and deviation from mean elevation (DevFromMeanElev) provide more suitable and robust measures of relative topographic position.
Reference
Newman, D. R., Lindsay, J. B., and Cockburn, J. M. H. (2018). Evaluating metrics of local topographic position for multiscale geomorphometric analysis. Geomorphology, 312, 40-50.
See Also
DevFromMeanElev, DiffFromMeanElev, ElevPercentile, percent_elev_range
Python API
def relative_topographic_position(self, dem: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Multiscale Signatures
Max Anisotropy Dev
Function name: max_anisotropy_dev
Calculates the maximum anisotropy (directionality) in elevation deviation over a range of spatial scales.
Python API
def max_anisotropy_dev(self, dem: Raster, min_scale: int = 1, max_scale: int = 100, step_size: int = 1) -> Tuple[Raster, Raster]:
Max Anisotropy Dev Signature
Function name: max_anisotropy_dev_signature
Python API
def max_anisotropy_dev_signature(self, dem: Raster, points: Vector, output_html_file: str, min_scale: int = 1, max_scale: int = 100, step_size: int = 1) -> None:
Max Difference From Mean
Function name: max_difference_from_mean
Calculates the maximum difference from mean elevation over a range of spatial scales.
Python API
def max_difference_from_mean(self, dem: Raster, min_scale: int = 1, max_scale: int = 100, step_size: int = 1) -> Tuple[Raster, Raster]:
Max Elevation Deviation
Function name: max_elevation_deviation
This tool can be used to calculate the maximum deviation from mean elevation, DEVmax (Lindsay et al. 2015), for each grid cell in a digital elevation model (DEM) across a range of specified spatial scales. DEV is an elevation residual index and is essentially equivalent to a local elevation z-score. This attribute measures the relative topographic position as a fraction of local relief, and so is normalized to the local surface roughness. The multi-scaled calculation of DEVmax utilizes an integral image approach (Crow, 1984) to ensure highly efficient filtering that is invariant with filter size, which is the algorithm characteristic that allows for this densely sampled multi-scale analysis. In this way, max_elevation_deviation allows users to estimate the locally optimal scale with which to estimate DEV on a pixel-by-pixel basis. This multi-scaled version of local topographic position can reveal significant terrain characteristics and can aid with soil, vegetation, landform, and other mapping applications that depend on geomorphometric characterization.
The user must input a digital elevation model (DEM) (dem). The range of scales that are evaluated in calculating DEVmax are determined by the user-specified min_scale, max_scale, and step parameters. All filter radii between the minimum and maximum scales, increasing by step, will be evaluated. The scale parameters are in units of grid cells and specify kernel size "radii" (r), such that:
d = 2r + 1
That is, radii of 1, 2, 3... yield square filters of dimension (d) 3 x 3, 5 x 5, 7 x 7...
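The relationship between the scale parameters and the filter dimensions can be listed with a short sketch. The helper filter_dimensions is hypothetical and assumes that max_scale is included when the step lands on it.

```python
def filter_dimensions(min_scale: int, max_scale: int, step_size: int) -> list[tuple[int, int]]:
    """Return (radius, dimension) pairs for every tested scale, using d = 2r + 1."""
    if min_scale < 1 or step_size < 1 or max_scale < min_scale:
        raise ValueError("expected 1 <= min_scale <= max_scale and step_size >= 1")
    return [(r, 2 * r + 1) for r in range(min_scale, max_scale + 1, step_size)]


for radius, dimension in filter_dimensions(1, 10, 3):
    print(f"r = {radius}: {dimension} x {dimension} filter")
```

The length of the returned list is the number of filter passes, which is a useful first estimate of how a given scale range affects run time.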
DEV is estimated at each tested filter size and every grid cell is assigned the maximum DEV value across the evaluated scales.
Two output rasters will be generated, including the magnitude (DEVmax) and a second raster that assigns each pixel the scale at which DEVmax is encountered (DEVscale). The DEVscale raster can be very useful for revealing multi-scale landscape structure.
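The selection of DEVmax and DEVscale for one cell can be sketched with a brute-force window scan. This is not the tool's integral-image implementation. The function names are hypothetical, DEV is computed as a local z-score as described above, and choosing the scale with the largest absolute DEV (keeping its sign) is an assumption.

```python
import math


def dev_at_scale(grid, row, col, radius):
    """Local z-score of the centre cell within a (2r + 1) square window."""
    values = [
        grid[r][c]
        for r in range(max(0, row - radius), min(len(grid), row + radius + 1))
        for c in range(max(0, col - radius), min(len(grid[0]), col + radius + 1))
    ]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    if variance == 0.0:
        return 0.0
    return (grid[row][col] - mean) / math.sqrt(variance)


def dev_max(grid, row, col, min_scale, max_scale, step_size):
    """Return (DEVmax, DEVscale) for one cell across the tested radii."""
    best_dev, best_scale = 0.0, min_scale
    for radius in range(min_scale, max_scale + 1, step_size):
        dev = dev_at_scale(grid, row, col, radius)
        # Assumption: keep the DEV with the largest magnitude, sign included.
        if abs(dev) > abs(best_dev):
            best_dev, best_scale = dev, radius
    return best_dev, best_scale


# A single elevated cell in the middle of a flat 5 x 5 grid.
dem = [[0.0] * 5 for _ in range(5)]
dem[2][2] = 1.0
print(dev_max(dem, 2, 2, min_scale=1, max_scale=2, step_size=1))
```

For this isolated peak the larger window gives the stronger response, so the reported scale is the larger radius.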
Reference
Lindsay J, Cockburn J, Russell H. 2015. An integral image approach to performing multi-scale topographic position analysis. Geomorphology, 245: 51-61.
See Also
DevFromMeanElev, max_difference_from_mean, multiscale_elevation_percentile
Python API
def max_elevation_deviation(self, dem: Raster, min_scale: int = 1, max_scale: int = 100, step_size: int = 1) -> Tuple[Raster, Raster]:
Max Elev Dev Signature
Function name: max_elev_dev_signature
Experimental
Calculates multiscale elevation-deviation signatures for input point sites and writes an HTML report.
geomorphometry terrain signature multiscale legacy-port
Parameters
Name | Description | Required | Default
---- | ----------- | -------- | -------
input | Input DEM raster path. | Required | dem.tif
points | Input vector point or multipoint file path. | Required | sites.geojson
min_scale | Minimum half-window radius in cells (default 1). | Optional | 1
max_scale | Maximum half-window radius in cells (default 100). | Optional | 100
step_size | Scale increment in cells (default 10). Alias: step. | Optional | 10
output | Optional output path for the HTML signature report. | Optional | —
Examples
Generate DEV signatures for a set of sample locations.
wbe.max_elev_dev_signature(input='dem.tif', max_scale=150, min_scale=1, output='max_elev_dev_signature.html', points='sites.geojson', step_size=5)
Multiscale Curvatures
Function name: multiscale_curvatures
Description
This tool calculates several multiscale curvatures and curvature-based indices from an input DEM (--dem). There are 18 curvature types (--curv_type) available, including: accumulation curvature, curvedness, difference curvature, Gaussian curvature, generating function, horizontal excess curvature, maximal curvature, mean curvature, minimal curvature, plan curvature, profile curvature, ring curvature, rotor, shape index, tangential curvature, total curvature, unsphericity, and vertical excess curvature. Each of these curvatures can be measured in non-multiscale fashion using the corresponding tools available in either the WhiteboxTools open-core or the Whitebox extension.
Like many of the multi-scale land-surface parameter tools available in Whitebox, this tool can be run in two different modes: it can either be used to measure curvature at a single specific scale or to generate a curvature scale mosaic. To understand the difference between these two modes, we must first understand how curvatures are measured and how the non-multiscale curvature tools (e.g. ProfileCurvature) work. Curvatures are generally measured by fitting a mathematically defined surface to the elevation values within the local neighbourhood surrounding each grid cell in a DEM. The Whitebox curvature tools use the algorithms described by Florinsky (2016), which use the 25 elevations within a 5 x 5 local neighbourhood for projected DEMs, and the nine elevations within a 3 x 3 neighbourhood for DEMs in geographic coordinate systems. This is what determines the scale at which these land-surface parameters are calculated. Because they are calculated using small local neighbourhoods (kernels), these algorithms are heavily impacted by micro-topographic roughness and DEM noise. For example, in a fine-resolution DEM containing a great deal of micro-topographic roughness, the measured curvature value will be dominated by topographic variation at the scale of the roughness rather than the hillslopes on which that roughness is superimposed. This mismatched scaling can be a problem in many applications, e.g. in landform classification and slope failure modelling applications.
Using the MultiscaleCurvatures tool, the user can specify a certain desired scale, larger than that defined by the grid resolution and kernel size, over which a curvature should be characterized. The tool will then use a fast Gaussian scale-space method to remove the topographic variation in the DEM at scales less than the desired scale, and will then characterize the curvature using the usual method based on this scaled DEM. To measure curvature at a single non-local scale, the user must specify a minimum search neighbourhood radius in grid cells (--min_scale) greater than 0.0. Note that a minimum search neighbourhood of 0.0 will replicate the non-multiscale equivalent curvature tool and any --min_scale value > 0.0 will apply the Gaussian scale-space method to eliminate topographic variation less than the scale of the minimum search neighbourhood. The base step size (--step), number of steps (--num_steps), and step nonlinearity (--step_nonlinearity) parameters should all be left to their default values of 1 in this case. The output curvature raster will be written to the output magnitude file (--out_mag). The following animation shows several multiscale curvature rasters (tangential curvature) measured from a DEM across a range of spatial scales.
Alternatively, one can use this tool to create a curvature scale mosaic. In this case, the user specifies a range of spatial scales (i.e., a scale space) over which to measure curvature. The curvature scale-space is densely sampled and each grid cell is assigned the maximum absolute curvature value (for the specified curvature type) across the scale space. In this scale-mosaic mode, the user must also specify the output scale file name (--out_scale), which is an output raster that, for each grid cell, specifies the scale at which the maximum absolute curvature was identified. The following is an example of a scale mosaic of unsphericity for an area in Pole Canyon, Utah (min_scale=1.0, step=1, num_steps=50, step_nonlinearity=1.0).
Scale mosaics are useful when modelling spatial distributions of land-surface parameters, like curvatures, in complex and heterogeneous landscapes that contain an abundance of topographic variation (micro-topography, landforms, etc.) at widely varying spatial scales, often associated with different geomorphic processes. Notice how in the image above, relatively strong curvature values are being characterized for both the landforms associated with the smaller-scale mass-movement processes as well as the broader-scale fluvial erosion (i.e. valley incision and hillslopes). It would be difficult, or impossible, to achieve this effect using a single, uniform scale. Each location in a land-surface parameter scale mosaic represents the parameter measured at a characteristic scale, given the unique topography of the site and surroundings.
The properties of the sampled scale space are determined using the --min_scale, --step, --num_steps (greater than 1), and --step_nonlinearity parameters. Experience with multiscale curvature scale spaces has shown that they are more highly variable at shorter spatial scales and change more gradually at broader scales. Therefore, a nonlinear scale sampling interval is used by this tool to ensure that the scale sampling density is higher for short scale ranges and coarser at longer tested scales, such that:
$r_i = r_L + [\mathrm{step} \times (i - r_L)]^{p}$
where $r_i$ is the filter radius for step $i$, $r_L$ is the smallest tested radius (--min_scale), step is the base step size (--step), and $p$ is the nonlinear scaling factor (--step_nonlinearity).
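The sampling rule above can be tabulated with a few lines of Python. This is a sketch under stated assumptions: the function name sampled_radii is hypothetical, the step index i is assumed to run from the minimum scale for num_steps consecutive values, and fractional radii are assumed to be rounded down to whole cells.

```python
import math

def sampled_radii(min_scale, step, num_steps, step_nonlinearity):
    """List the filter radii r_i = r_L + [step * (i - r_L)]^p for each sampled step."""
    radii = []
    for i in range(min_scale, min_scale + num_steps):
        # Offset from the minimum scale, stretched by the nonlinearity exponent p.
        offset = (step * (i - min_scale)) ** step_nonlinearity
        radii.append(min_scale + math.floor(offset))
    return radii

print(sampled_radii(1, 1, 5, 1.0))  # linear sampling: [1, 2, 3, 4, 5]
print(sampled_radii(1, 1, 5, 2.0))  # nonlinear sampling: [1, 2, 5, 10, 17]
```

With a nonlinearity of 1.0 the radii are evenly spaced; larger exponents keep the sampling dense at short scales and progressively widen the gaps at broad scales.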
In scale-mosaic mode, the user must also decide whether or not to standardize the curvature values (--standardize). When this parameter is used, the algorithm will convert each curvature raster associated with each sampled region of scale-space to z-scores (i.e. differenced from the raster-wide mean and divided by the raster-wide standard deviation). It is usually the case that curvature values measured at broader spatial scales will on the whole become less strongly valued. Because the scale mosaic algorithm used in this tool assigns each grid cell the maximum absolute curvature observed within sampled scale-space, this implies that the curvature values associated with more local-scale ranges are more likely to be selected for the final scale-mosaic raster. By standardizing each scaled curvature raster, there is greater opportunity for the final scale-mosaic to represent broader scale topographic variation. Whether or not this is appropriate will depend on the application. However, it is important to stress that the sampled scale-space need not span the full range of possible scales, from the finest scale determined by the grid resolution up to the broadest scale possible, determined by the spatial extent of the input DEM. Often, a better approach is to use this tool to create multiple scale mosaics spanning this range, thereby capturing variation within broadly defined scale ranges. For example, one could create local-scale, meso-scale, and broad-scale curvature scale mosaics, each of which would capture topographic variation and landforms that are present in the landscape and reflective of processes operating at vastly different spatial scales. When this approach is used, it may not be necessary to standardize each scaled curvature raster, since the gradual decline in curvature values as scales increase is less pronounced within each of these broad scale ranges than across the entirety of possible scale-space. Again, however, this will depend on the application and on the characteristics of the landscape under study.
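The effect of standardization on scale selection can be illustrated with a small NumPy sketch. It assumes the per-scale curvature rasters have already been computed and stacked; the function name scale_mosaic and its arguments are hypothetical, and NoData handling is omitted.

```python
import numpy as np

def scale_mosaic(curvature_stack, radii, standardize=False):
    """Pick, per cell, the curvature with the largest absolute value across scales."""
    stack = np.asarray(curvature_stack, dtype=float)  # shape: (scales, rows, cols)
    if standardize:
        # Convert each scale's raster to z-scores using raster-wide statistics.
        mean = stack.mean(axis=(1, 2), keepdims=True)
        std = stack.std(axis=(1, 2), keepdims=True)
        stack = (stack - mean) / np.where(std > 0, std, 1.0)
    idx = np.abs(stack).argmax(axis=0)
    magnitude = np.take_along_axis(stack, idx[None, ...], axis=0)[0]
    scale = np.asarray(radii)[idx]
    return magnitude, scale

# Strong fine-scale values and weaker broad-scale values (made-up numbers).
fine = np.array([[10.0, -10.0], [10.0, -10.0]])
broad = np.array([[0.0, 0.0], [0.0, 2.0]])
_, raw_scale = scale_mosaic([fine, broad], radii=[1, 20])
_, std_scale = scale_mosaic([fine, broad], radii=[1, 20], standardize=True)
```

In this toy case the raw mosaic selects the fine scale everywhere, while the standardized mosaic selects the broad scale for the one cell that stands out at that scale.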
Raw curvedness values are often challenging to visualize given their range and magnitude, and as such the user may opt to log-transform the output raster (--log). Transforming the values applies the equation by Shary et al. (2002):
$\Theta' = \operatorname{sign}(\Theta) \ln(1 + 10^{n} |\Theta|)$
where Θ is the parameter value and n is dependent on the grid cell size.
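As a quick illustration, the transform can be written in NumPy as below. The function name log_transform is hypothetical, and n is passed in explicitly because the text states only that it depends on the grid cell size.

```python
import numpy as np

def log_transform(theta, n):
    """Apply sign(theta) * ln(1 + 10^n * |theta|) element-wise."""
    theta = np.asarray(theta, dtype=float)
    # log1p(x) computes ln(1 + x) accurately for small x.
    return np.sign(theta) * np.log1p(10.0 ** n * np.abs(theta))

values = np.array([-1e-3, 0.0, 1e-3])
transformed = log_transform(values, n=3)  # n=3 is an arbitrary example value
```

The transform preserves sign and zero, so convex and concave areas remain distinguishable while the dynamic range is compressed.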
References
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
See Also
gaussian_scale_space, accumulation_curvature, curvedness, difference_curvature, gaussian_curvature, generating_function, horizontal_excess_curvature, maximal_curvature, mean_curvature, minimal_curvature, plan_curvature, profile_curvature, ring_curvature, rotor, shape_index, tangential_curvature, total_curvature, unsphericity, vertical_excess_curvature
Python API
def multiscale_curvatures(self, dem: Raster, curv_type: str = 'profile', min_scale: int = 4, step_size: int = 1, num_steps: int = 10, step_nonlinearity: float = 1.0, log_transform: bool = True, standardize: bool = False) -> Tuple[Raster, Raster]:
Multiscale Elevated Index
Function name: multiscale_elevated_index
Experimental
Calculates multiscale elevated-index (MsEI) and key-scale rasters using Gaussian scale-space residuals.
geomorphometry multiscale gss elevated-index
Multiscale Elevation Percentile
Function name: multiscale_elevation_percentile
This tool calculates the most extreme elevation percentile (EP) across a range of spatial scales. EP is a measure of local topographic position (LTP) and expresses the vertical position for a digital elevation model (DEM) grid cell ($z_0$) as the percentile of the elevation distribution within the filter window, such that:
$EP = \mathrm{count}_{i \in C}(z_i > z_0) \times (100 / n_C)$
where $z_0$ is the elevation of the window's center grid cell, $z_i$ is the elevation of cell $i$ contained within the neighboring set $C$, and $n_C$ is the number of grid cells contained within the window.
EP is unsigned and expressed as a percentage, bounded between 0% and 100%. This tool outputs two rasters, the multiscale EP magnitude (out_mag) and the scale at which the most extreme EP value occurs (out_scale). The magnitude raster is the most extreme EP value (i.e. the furthest from 50%) for each grid cell encountered within the tested scales of EP.
Quantile-based estimates (e.g., the median and interquartile range) are often used in nonparametric statistics to provide data variability estimates without assuming the distribution is normal. Thus, EP is largely unaffected by irregularly shaped elevation frequency distributions or by outliers in the DEM, resulting in a highly robust metric of LTP. In fact, elevation distributions within small to medium sized neighborhoods often exhibit skewed, multimodal, and non-Gaussian distributions, where the occurrence of elevation errors can often result in distribution outliers. Thus, based on these statistical characteristics, EP is considered one of the most robust representations of LTP.
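The selection rule for the two outputs can be sketched as follows, assuming the single-scale EP rasters have already been computed. The function name most_extreme_ep is hypothetical, and ties are resolved in favour of the first (smallest) scale, which is an assumption.

```python
import numpy as np

def most_extreme_ep(ep_stack, radii):
    """Per cell, keep the EP value furthest from 50% and the radius where it occurs."""
    stack = np.asarray(ep_stack, dtype=float)  # shape: (scales, rows, cols)
    idx = np.abs(stack - 50.0).argmax(axis=0)
    magnitude = np.take_along_axis(stack, idx[None, ...], axis=0)[0]
    scale = np.asarray(radii)[idx]
    return magnitude, scale

# Two made-up EP rasters (one row, two cells) at radii of 3 and 9 cells.
ep_small = np.array([[40.0, 90.0]])
ep_large = np.array([[5.0, 60.0]])
out_mag, out_scale = most_extreme_ep([ep_small, ep_large], radii=[3, 9])
```

Here the first cell takes its value from the larger radius (5% is further from 50% than 40%), and the second cell from the smaller radius.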
The algorithm implemented by this tool uses the relatively efficient running-histogram filtering algorithm of Huang et al. (1979). Because most DEMs contain floating point data, elevation values must be rounded to be binned. The sig_digits parameter is used to determine the level of precision preserved during this binning process. The algorithm is parallelized to further aid with computational efficiency.
Experience with multiscale EP has shown that it is highly variable at shorter scales and changes more gradually at broader scales. Therefore, a nonlinear scale sampling interval is used by this tool to ensure that the scale sampling density is higher for short scale ranges and coarser at longer tested scales, such that:
$r_i = r_L + [\mathrm{step} \times (i - r_L)]^{p}$
where $r_i$ is the filter radius for step $i$, $r_L$ is the smallest tested radius (min_scale), step is the base step size (step), and $p$ is the nonlinear scaling factor (step_nonlinearity).
References
Newman, D. R., Lindsay, J. B., and Cockburn, J. M. H. (2018). Evaluating metrics of local topographic position for multiscale geomorphometric analysis. Geomorphology, 312, 40-50.
Huang, T., Yang, G. and Tang, G., 1979. A fast two-dimensional median filtering algorithm. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(1), pp.13-18.
See Also
elevation_percentile, max_elevation_deviation, max_difference_from_mean
Python API
def multiscale_elevation_percentile(self, dem: Raster, num_significant_digits: int = 3, min_scale: int = 4, step_size: int = 1, num_steps: int = 10, step_nonlinearity: float = 1.0) -> Tuple[Raster, Raster]:
Multiscale Low Lying Index
Function name: multiscale_low_lying_index
Experimental
Calculates multiscale low-lying-index (MsLLI) and key-scale rasters using Gaussian scale-space residuals.
geomorphometry multiscale gss low-lying-index
Multiscale Roughness
Function name: multiscale_roughness
Python API
def multiscale_roughness(self, dem: Raster, min_scale: int = 1, max_scale: int = 100, step_size: int = 1) -> Tuple[Raster, Raster]:
Multiscale Roughness Signature
Function name: multiscale_roughness_signature
Python API
def multiscale_roughness_signature(self, dem: Raster, points: Vector, output_html_file: str, min_scale: int = 1, max_scale: int = 100, step_size: int = 1) -> None:
Multiscale Std Dev Normals
Function name: multiscale_std_dev_normals
This tool can be used to map the spatial pattern of maximum spherical standard deviation (σs max; out_mag), as well as the scale at which maximum spherical standard deviation occurs (rmax; out_scale), for each grid cell in an input DEM (dem). This serves as a multi-scale measure of surface roughness, or topographic complexity. The spherical standard deviation (σs) is a measure of the angular spread among n unit vectors and is defined as:
$\sigma_s = \sqrt{-2 \ln(R / n)} \times 180 / \pi$
where $R$ is the resultant vector length, derived from the sum of the x, y, and z components of each of the $n$ normals contained within a filter kernel, which designates a tested spatial scale. Each unit vector is a 3-dimensional measure of the surface orientation and slope at each grid cell center. The maximum spherical standard deviation is:
$\sigma_{s\,max} = \max\{\sigma_s(r) : r = r_L \ldots r_U\}$
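The spherical standard deviation for a single kernel can be computed directly from the definition. The sketch below is illustrative only: the function name spherical_std_dev is hypothetical, and the input is assumed to be an array of surface normal vectors, one per grid cell in the kernel.

```python
import numpy as np

def spherical_std_dev(normals):
    """Angular spread, in degrees, of a set of surface normals."""
    normals = np.asarray(normals, dtype=float)
    # Normalize so that every vector has unit length.
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    n = unit.shape[0]
    # Resultant length R: magnitude of the summed x, y, and z components.
    resultant = np.linalg.norm(unit.sum(axis=0))
    # Guard against R / n creeping above 1 through rounding error.
    ratio = min(resultant / n, 1.0)
    return float(np.degrees(np.sqrt(-2.0 * np.log(ratio))))

smooth = spherical_std_dev([[0, 0, 1]] * 4)         # identical normals
rough = spherical_std_dev([[1, 0, 0], [0, 1, 0]])   # normals 90 degrees apart
```

Identical normals give a spread of zero, and the value grows as the normals within the kernel diverge, which is why it serves as a roughness measure.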
Experience with roughness scale signatures has shown that σs max is highly variable at shorter scales and changes more gradually at broader scales. Therefore, a nonlinear scale sampling interval is used by this tool to ensure that the scale sampling density is higher for short scale ranges and coarser at longer tested scales, such that:
$r_i = r_L + [\mathrm{step} \times (i - r_L)]^{p}$
where $r_i$ is the filter radius for step $i$, $r_L$ is the smallest tested radius (min_scale), step is the base step size (step), and $p$ is the nonlinear scaling factor (step_nonlinearity).
Use the spherical_std_dev_of_normals tool if you need to calculate σs for a single scale.
Reference
Lindsay JB, Newman DR, Francioni A. 2019. Scale-Optimized Surface Roughness for Topographic Analysis. Geosciences, 9(7): 322. doi: 10.3390/geosciences9070322.
See Also
spherical_std_dev_of_normals, multiscale_std_dev_normals_signature, multiscale_roughness
Python API
def multiscale_std_dev_normals(self, dem: Raster, min_scale: int = 4, step_size: int = 1, num_steps: int = 10, step_nonlinearity: float = 1.0, html_signature_file: str = "") -> Tuple[Raster, Raster]:
Multiscale Std Dev Normals Signature
Function name: multiscale_std_dev_normals_signature
Python API
def multiscale_std_dev_normals_signature(self, dem: Raster, points: Vector, output_html_file: str, min_scale: int = 4, step_size: int = 1, num_steps: int = 10, step_nonlinearity: float = 1.0) -> None:
Multiscale Topographic Position Image
Function name: multiscale_topographic_position_image
This tool creates a multiscale topographic position (MTP) image from three DEVmax rasters of differing spatial scale ranges. Specifically, multiscale_topographic_position_image takes three DEVmax magnitude rasters, created using the max_elevation_deviation tool, as inputs. The three inputs should correspond to the elevation deviations in the local (local), meso (meso), and broad (broad) scale ranges and will be forced into the blue, green, and red colour components of the colour composite output (output) raster. The image lightness value (lightness) controls the overall brightness of the output image, as depending on the topography and scale ranges, these images can appear relatively dark. Higher values result in brighter, more colourful output images.
The user may optionally specify an input hillshade raster. When specified, the hillshade will be used to provide a shaded-relief overlaid on top of the coloured multi-scale information, providing a very effective visualization. Any hillshade image may be used for this purpose, but we have found that multi-directional hillshade (multidirectional_hillshade), and specifically those derived using the 360-degree option, can be most effective for this application. However, experimentation is likely needed to find the optimal hillshade for each unique data set.
The output images can take some training to interpret correctly and a detailed explanation can be found in Lindsay et al. (2015). Sites within the landscape that occupy prominent topographic positions, either low-lying or elevated, will be apparent by their bright colouring in the MTP image. Those that are coloured more strongly in the blue are prominent at the local scale range; locations that are more strongly green coloured are prominent at the meso scale; and bright reds in the MTP image are associated with broad-scale landscape prominence. Of course, combination colours are also possible when topography is elevated or low-lying across multiple scale ranges. For example, a yellow area would indicate a site of prominent topographic position across the meso and broadest scale ranges.
Reference
Lindsay J, Cockburn J, Russell H. 2015. An integral image approach to performing multi-scale topographic position analysis. Geomorphology, 245: 51-61.
See Also
max_elevation_deviation
Python API
def multiscale_topographic_position_image(self, local: Raster, meso: Raster, broad: Raster, lightness: float = 1.2) -> Raster:
Topographic Position Animation
Function name: topographic_position_animation
PRO Experimental
Creates an interactive HTML viewer and animated GIF of DEV or DEVmax across nonlinearly sampled scales.
geomorphometry terrain topographic-position animation integral-image legacy-port
Examples
Animate terrain topographic position through a sequence of DEV scales.
wbe.topographic_position_animation(input='dem.tif', num_steps=8, output='topographic_position_animation.html', use_dev_max=True)
General Tools
Assess Route
Function name: assess_route
PRO Experimental
Segments route lines and evaluates per-segment terrain metrics from a DEM.
geomorphometry route vector legacy-port
Breakline Mapping
Function name: breakline_mapping
PRO Experimental
Maps breaklines by thresholding log-transformed curvedness and vectorizing thinned linear features.
geomorphometry breaklines curvature vectorization legacy-port
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input DEM raster path or typed raster object. | Required | dem.tif |
| threshold | Minimum log-curvedness threshold used for breakline extraction (default 0.8). | Optional | 0.8 |
| min_length | Minimum output line length in grid cells (default 3). | Optional | 3 |
| output | Optional output vector path (default temporary .shp). | Optional | breaklines.shp |
Examples
Extract breakline vectors from a DEM.
wbe.breakline_mapping(input='dem.tif', min_length=6, output='breaklines.shp', threshold=2.0)
Convergence Index
Function name: convergence_index
This tool calculates the convergence index (C), described by Koethe and Lehmeier (1996) and Kiss (2004), for each grid cell in an input digital elevation model (DEM). The convergence index measures the average amount by which the aspect value of each of the eight neighbours in a 3x3 kernel deviates from an aspect aligned with the direction towards the center cell. As such the index measures the degree to which the surrounding topography converges on the center cell.
$C = \frac{1}{8} \sum |\Phi - Az_0| - 90$
where $\Phi$ is the aspect of a neighbour of the center cell and $Az_0$ is the azimuth from the neighbour directed towards the center cell. Note, -90 < C < 90, where highly convergent areas have values near -90 and highly divergent areas have values near 90. Therefore, in actuality, C is more properly an index of divergence rather than a convergence index, despite its name.
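The index for a single 3x3 kernel can be worked through in Python as below. This is a sketch under stated assumptions: the function names are hypothetical, aspects are given in degrees clockwise from north, rows increase southward, and the absolute angular difference is wrapped into the range 0 to 180 degrees, a detail the formula leaves implicit.

```python
import math

# Offsets (d_row, d_col) of the eight neighbours relative to the center cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def azimuth_to_center(d_row, d_col):
    """Azimuth (degrees clockwise from north) from a neighbour towards the center."""
    # The vector from neighbour to center points east by -d_col and north by d_row.
    return math.degrees(math.atan2(-d_col, d_row)) % 360.0

def convergence_index(aspects):
    """aspects maps each (d_row, d_col) neighbour offset to its aspect in degrees."""
    total = 0.0
    for offset in NEIGHBOURS:
        diff = abs(aspects[offset] - azimuth_to_center(*offset)) % 360.0
        if diff > 180.0:
            diff = 360.0 - diff  # wrap to the smaller angle between the directions
        total += diff
    return total / 8.0 - 90.0

# All neighbours face the center, or all face directly away from it.
toward = {o: azimuth_to_center(*o) for o in NEIGHBOURS}
away = {o: (azimuth_to_center(*o) + 180.0) % 360.0 for o in NEIGHBOURS}
c_toward = convergence_index(toward)
c_away = convergence_index(away)
```

When every neighbour's aspect points at the center the index reaches its minimum of -90, and when every neighbour faces away it reaches 90, matching the range noted above.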
The user must specify the name of the input DEM (dem) and the output raster (output). The Z conversion factor (zfactor) is only important when the vertical and horizontal units are not the same in the DEM, and the DEM is in a projected coordinate system. When this is the case, the algorithm will multiply each elevation in the DEM by the Z Conversion Factor to perform the unit conversion.
For DEMs in projected coordinate systems, the tool uses the 3rd-order bivariate Taylor polynomial method described by Florinsky (2016). Based on a polynomial fit of the elevations within the 5x5 neighbourhood surrounding each cell, this method is considered more robust against outlier elevations (noise) than other methods. For DEMs in geographic coordinate systems (i.e. angular units), the tool uses the 3x3 polynomial fitting method for equal angle grids also described by Florinsky (2016).
Reference
Florinsky, I. (2016). Digital terrain analysis in soil science and geology. Academic Press.
Kiss, R. (2004). Determination of drainage network in digital elevation models, utilities and limitations. Journal of Hungarian geomathematics, 2, 17-29.
Koethe, R. and Lehmeier, F. (1996): SARA - System zur Automatischen Relief-Analyse. User Manual, 2. Edition [Dept. of Geography, University of Goettingen, unpublished]
See Also
aspect, plan_curvature, profile_curvature
Python API
def convergence_index(self, dem: Raster, z_factor: float = 1.0) -> Raster:
DEM Void Filling
Function name: dem_void_filling
Description
This tool implements a modified version of the Delta Surface Fill method of Grohman et al. (2006). It can fill voids (i.e., data holes) contained within a digital elevation model (dem) by fusing the data with a second DEM (fill) that defines the topographic surface within the void areas. The two surfaces are fused seamlessly so that the transition between the source and fill surfaces is undetectable. The fill surface need not have the same resolution as the source DEM.
The algorithm works by computing a DEM-of-difference (DoD) for each valid grid cell in the source DEM that also has a valid elevation in the corresponding location within the fill DEM. This difference surface is then used to define offsets within the near void-edge locations. The fill surface elevations are then combined with interpolated offsets, with the interpolation based on near-edge offsets, and used to define a new surface within the void areas of the source DEM in such a way that the data transitions seamlessly from the source data to the fill data. The image below provides an example of this method.
The user must specify the mean_plane_dist parameter, which defines the distance (measured in grid cells) within a void area from the void's edge. Grid cells within larger voids that are beyond this distance from their edges have their vertical offsets, needed during the fusion of the DEMs, set to the mean offset for all grid cells that have both valid source and fill elevations. Void cells that are nearer their void edges have vertical offsets that are interpolated based on nearby offset values (i.e., the DEM of difference). The interpolation uses an inverse-distance weighted (IDW) scheme, with a user-specified weight parameter (weight_value).
The edge_treatment parameter describes how the data fusion operates at the edges of voids, i.e., the first line of grid cells for which there are both source and fill elevation values. This parameter has values of "use DEM", "use Fill", and "average". Grohman et al. (2006) state that sometimes, due to a weakened signal within these marginal locations between the area of valid data and voids, the estimated elevation values are inaccurate. When this is the case, it is best to use fill elevations in the transitional areas. If this isn't the case, the "use DEM" option is the better choice. A compromise between the two options is to average the two elevation sources.
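The IDW step for a single void cell can be sketched as follows. This is not the tool's code: the function name idw_offset and its arguments are hypothetical, distances are Euclidean in grid cells, and the offsets are assumed to be source minus fill, so that adding the interpolated offset to the fill elevation yields the fused value.

```python
import numpy as np

def idw_offset(void_cell, edge_cells, edge_offsets, weight_value=2.0):
    """Interpolate a vertical offset at void_cell from offsets at void-edge cells."""
    edge_cells = np.asarray(edge_cells, dtype=float)      # (row, col) pairs
    edge_offsets = np.asarray(edge_offsets, dtype=float)  # DoD values at those cells
    dist = np.hypot(edge_cells[:, 0] - void_cell[0], edge_cells[:, 1] - void_cell[1])
    if np.any(dist == 0):
        # The cell coincides with an edge sample; use that offset directly.
        return float(edge_offsets[dist == 0][0])
    weights = 1.0 / dist ** weight_value
    return float((weights * edge_offsets).sum() / weights.sum())

# A void cell one cell from an edge with offset 2.0 and two cells from one with 4.0.
offset = idw_offset((0, 1), [(0, 0), (0, 3)], [2.0, 4.0], weight_value=2.0)
fill_elevation = 100.0             # made-up fill DEM value at the void cell
fused = fill_elevation + offset    # elevation written into the void
```

A larger weight_value makes the nearest edge offsets dominate, giving a tighter fit at the void margin at the cost of a less smooth offset surface.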
References
Grohman, G., Kroenung, G. and Strebeck, J., 2006. Filling SRTM voids: The delta surface fill method. Photogrammetric Engineering and Remote Sensing, 72(3), pp.213-216.
Python API
def dem_void_filling(self, dem: Raster, fill: Raster, mean_plane_dist: int = 20, edge_treatment: str = "dem", weight_value: float = 2.0) -> Raster:
Deviation From Mean Elevation
Function name: deviation_from_mean_elevation
This tool can be used to calculate the difference between the elevation of each grid cell and the mean elevation of the local neighbourhood centered on that cell, normalized by the standard deviation. Therefore, this index of topographic residual is essentially equivalent to a local z-score. This attribute measures the relative topographic position as a fraction of local relief, and so is normalized to the local surface roughness. DevFromMeanElev utilizes an integral image approach (Crow, 1984) to ensure highly efficient filtering that is invariant with filter size.
The user must input a digital elevation model (DEM) (dem) and the size of the neighbourhood in the x and y directions (filterx and filtery), measured in grid cells.
While DeviationFromMeanElev calculates the deviation from mean elevation (DEV) at a single, user-defined scale, the max_elevation_deviation tool can be used to output the per-pixel maximum DEV value across a range of input scales.
See Also
DiffFromMeanElev, max_elevation_deviation
Python API
def deviation_from_mean_elevation(self, dem: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Difference From Mean Elevation
Function name: difference_from_mean_elevation
This tool can be used to calculate the difference between the elevation of each grid cell and the mean elevation of the local neighbourhood centered on that cell. This is similar to what a high-pass filter calculates for imagery data, but is intended to work with DEM data instead. This attribute measures the relative topographic position. DiffFromMeanElev utilizes an integral image approach (Crow, 1984) to ensure highly efficient filtering that is invariant with filter size.
The user must specify a digital elevation model (DEM) (dem), and the size of the neighbourhood in the x and y directions (filterx and filtery), measured in grid cells.
While DiffFromMeanElev calculates the DIFF at a single, user-defined scale, the max_difference_from_mean tool can be used to output the per-pixel maximum DIFF value across a range of input scales.
See Also
DevFromMeanElev, max_difference_from_mean
Python API
def difference_from_mean_elevation(self, dem: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Directional Relief
Function name: directional_relief
This tool calculates the relief for each grid cell in a digital elevation model (DEM) in a specified direction. Directional relief is an index of the degree to which a DEM grid cell is higher or lower than its surroundings. It is calculated by subtracting the elevation of a DEM grid cell from the average elevation of those cells which lie between it and the edge of the DEM in a specified compass direction. Thus, positive values indicate that a grid cell is lower than the average elevation of the grid cells in a specific direction (i.e. relatively sheltered), whereas a negative directional relief indicates that the grid cell is higher (i.e. relatively exposed). The algorithm is based on a modification of the procedure described by Lapen and Martz (1993). The modifications include: (1) the ability to specify any direction between 0-degrees and 360-degrees (azimuth), and (2) the ability to use a distance-limited search (max_dist), such that the ray-tracing procedure terminates before the DEM edge is reached for longer search paths. The algorithm works by tracing a ray from each grid cell in the direction of interest and evaluating the average elevation along the ray. Linear interpolation is used to estimate the elevation of the surface where a ray does not intersect the DEM grid precisely at one of its nodes. The user must input a DEM raster file (dem) and a hypothetical wind direction. Furthermore, the user is able to constrain the maximum search distance for the ray tracing. If no maximum search distance is specified, each ray will be traced to the edge of the DEM. The units of the output image are the same as the input DEM.
Ray-tracing is a highly computationally intensive task and therefore this tool may take considerable time to operate for larger sized DEMs. This tool is parallelized to aid with computational efficiency. NoData valued grid cells in the input image will be assigned NoData values in the output image. The output raster is of the float data type and continuous data scale. Directional relief is best displayed using the blue-white-red bipolar palette to distinguish between the positive and negative values that are present in the output.
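The ray-tracing idea can be sketched in plain Python and NumPy. This is a simplified illustration, not the tool's algorithm: the function names are hypothetical, the ray is sampled at one-cell intervals with bilinear interpolation, max_dist is treated as a distance in grid cells, and cells with no samples along the ray are assigned zero.

```python
import math
import numpy as np

def sample_bilinear(dem, row, col):
    """Interpolate the surface elevation at a fractional (row, col) position."""
    rows, cols = dem.shape
    r0, c0 = int(math.floor(row)), int(math.floor(col))
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    fr, fc = row - r0, col - c0
    top = dem[r0, c0] * (1 - fc) + dem[r0, c1] * fc
    bottom = dem[r1, c0] * (1 - fc) + dem[r1, c1] * fc
    return top * (1 - fr) + bottom * fr

def directional_relief(dem, azimuth=0.0, max_dist=float("inf")):
    """Mean elevation along a ray in the azimuth direction minus the cell elevation."""
    dem = np.asarray(dem, dtype=float)
    rows, cols = dem.shape
    d_row = -math.cos(math.radians(azimuth))  # north corresponds to decreasing row
    d_col = math.sin(math.radians(azimuth))
    eps = 1e-9
    out = np.zeros_like(dem)
    for row in range(rows):
        for col in range(cols):
            total, count, step = 0.0, 0, 1
            while step <= max_dist:
                r, c = row + step * d_row, col + step * d_col
                if r < -eps or c < -eps or r > rows - 1 + eps or c > cols - 1 + eps:
                    break  # the ray has left the DEM
                r = min(max(r, 0.0), rows - 1.0)
                c = min(max(c, 0.0), cols - 1.0)
                total += sample_bilinear(dem, r, c)
                count += 1
                step += 1
            if count:
                out[row, col] = total / count - dem[row, col]
    return out

# A plane rising towards the north: row elevations 4, 3, 2, 1, 0 from top to bottom.
ramp = np.arange(4, -1, -1, dtype=float)[:, None] * np.ones((1, 5))
relief_north = directional_relief(ramp, azimuth=0.0)
```

Looking north on this ramp, every cell except those in the top row sees higher ground ahead and so receives a positive (sheltered) value.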
Reference
Lapen, D. R., & Martz, L. W. (1993). The measurement of two simple topographic indices of wind sheltering-exposure from raster digital elevation models. Computers & Geosciences, 19(6), 769-779.
See Also
fetch_analysis, horizon_angle, relative_aspect
Python API
def directional_relief(self, dem: Raster, azimuth: float = 0.0, max_dist: float = float('inf')) -> Raster:
Elev Above Pit
Function name: elev_above_pit
Experimental
Calculate elevation above the nearest depression (pit). Useful for drainage analysis and identifying topographic prominence.
geomorphometry terrain relative-elevation legacy-port
Elev Above Pit Dist
Function name: elev_above_pit_dist
Experimental
Compatibility alias for elev_above_pit.
geomorphometry terrain legacy-port
Elevation Percentile
Function name: elevation_percentile
Elevation percentile (EP) is a measure of local topographic position (LTP). It expresses the vertical position for a digital elevation model (DEM) grid cell (z0) as the percentile of the elevation distribution within the filter window, such that:
$EP = \mathrm{count}_{i \in C}(z_i > z_0) \times (100 / n_C)$
where $z_0$ is the elevation of the window's center grid cell, $z_i$ is the elevation of cell $i$ contained within the neighboring set $C$, and $n_C$ is the number of grid cells contained within the window.
EP is unsigned and expressed as a percentage, bounded between 0% and 100%. Quantile-based estimates (e.g., the median and interquartile range) are often used in nonparametric statistics to provide data variability estimates without assuming the distribution is normal. Thus, EP is largely unaffected by irregularly shaped elevation frequency distributions or by outliers in the DEM, resulting in a highly robust metric of LTP. In fact, elevation distributions within small to medium sized neighborhoods often exhibit skewed, multimodal, and non-Gaussian distributions, where the occurrence of elevation errors can often result in distribution outliers. Thus, based on these statistical characteristics, EP is considered one of the most robust representations of LTP.
The algorithm implemented by this tool uses the relatively efficient running-histogram filtering algorithm of Huang et al. (1979). Because most DEMs contain floating point data, elevation values must be rounded to be binned. The sig_digits parameter is used to determine the level of precision preserved during this binning process. The algorithm is parallelized to further aid with computational efficiency.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
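For small rasters, the EP definition can be evaluated by brute force, which is useful for checking intuition about the metric. The sketch below is not the running-histogram algorithm used by the tool: the function name elevation_percentile_naive is hypothetical, windows are clipped at the raster edges, no rounding or binning is applied, and the comparison counts cells strictly higher than the center exactly as the formula is written.

```python
import numpy as np

def elevation_percentile_naive(dem, filter_size_x=11, filter_size_y=11):
    """Brute-force EP: count window cells that satisfy z_i > z_0, as a percentage."""
    dem = np.asarray(dem, dtype=float)
    rows, cols = dem.shape
    half_x, half_y = filter_size_x // 2, filter_size_y // 2
    ep = np.zeros_like(dem)
    for row in range(rows):
        for col in range(cols):
            # Window centered on the cell, clipped where it overlaps the raster edge.
            window = dem[max(row - half_y, 0):row + half_y + 1,
                         max(col - half_x, 0):col + half_x + 1]
            ep[row, col] = np.count_nonzero(window > dem[row, col]) * 100.0 / window.size
    return ep

dem = np.arange(1, 10, dtype=float).reshape(3, 3)  # elevations 1..9
ep = elevation_percentile_naive(dem, filter_size_x=3, filter_size_y=3)
```

This version costs time proportional to the window area per cell, which is the overhead the running-histogram approach is designed to avoid.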
References
Newman, D. R., Lindsay, J. B., and Cockburn, J. M. H. (2018). Evaluating metrics of local topographic position for multiscale geomorphometric analysis. Geomorphology, 312, 40-50.
Huang, T., Yang, G., and Tang, G. (1979). A fast two-dimensional median filtering algorithm. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(1), 13-18.
See Also
dev_from_mean_elev, diff_from_mean_elev
Python API
def elevation_percentile(self, dem: Raster, filter_size_x: int = 11, filter_size_y: int = 11, sig_digits: int = 2) -> Raster:
Embankment Mapping
Function name: embankment_mapping
This tool can be used to map and/or remove road embankments from an input fine-resolution digital elevation model (dem). Fine-resolution LiDAR DEMs can represent surface features such as road and railway embankments with high fidelity. However, transportation embankments are problematic for several environmental modelling applications, including soil and vegetation distribution mapping, where the pre-embankment topography is the controlling factor. The algorithm utilizes repositioned (search_dist) transportation network cells, derived from rasterizing a transportation vector (road_vec), as seed points in a region-growing operation. The embankment region grows based on derived morphometric parameters, including road surface width (min_road_width), embankment width (typical_width and max_width), embankment height (max_height), and absolute slope (spillout_slope). The tool can be run in two modes. By default the tool will simply map embankment cells, with a Boolean output raster. If, however, the remove_embankments flag is specified, the tool will instead output a DEM for which the mapped embankment grid cells have been excluded and new surfaces have been interpolated based on the surrounding elevation values (see below).
Hillshade from original DEM:
Hillshade from embankment-removed DEM:
References
Van Nieuwenhuizen, N, Lindsay, JB, DeVries, B. 2021. Automated mapping of transportation embankments in fine-resolution LiDAR DEMs. Remote Sensing. 13(7), 1308; https://doi.org/10.3390/rs13071308
See Also
remove_off_terrain_objects, smooth_vegetation_residual
Python API
def embankment_mapping(self, dem: Raster, roads_vector: Vector, search_dist: float = 2.5, min_road_width: float = 6.0, typical_embankment_width: float = 30.0, typical_embankment_max_height: float = 2.0, embankment_max_width: float = 60.0, max_upwards_increment: float = 0.05, spillout_slope: float = 4.0, remove_embankments: bool = False) -> Tuple[Raster, Union[Raster, None]]:
Exposure Towards Wind Flux
Function name: exposure_towards_wind_flux
This tool creates a new raster in which each grid cell is assigned the exposure of the land-surface to a hypothetical wind flux. It can be conceptualized as the angle between a plane orthogonal to the wind and a plane that represents the local topography at a grid cell (Böhner and Antonić, 2009). The user must input a digital elevation model (dem), as well as the dominant wind azimuth (azimuth) and a maximum search distance (max_dist) used to calculate the horizon angle. Note that the specified azimuth represents a regional average wind direction.
Exposure towards the sloped wind flux essentially combines the relative terrain aspect and the maximum upwind slope (i.e. horizon angle). This terrain attribute accounts for land-surface orientation, relative to the wind, and shadowing effects of distant topographic features but does not account for deflection of the wind by topography. This tool should not be used on very extensive areas over which Earth's curvature must be taken into account. DEMs in projected coordinate systems are preferred.
Algorithm Description:
Exposure is measured based on the equation presented in Antonic and Legovic (1999):
cos(E) = cos(S) sin(H) + sin(S) cos(H) cos(Az - A)
where E is the angle between a plane defining the local terrain and a plane orthogonal to the wind flux, S is the terrain slope, A is the terrain aspect, Az is the azimuth of the wind flux, and H is the horizon angle of the wind flux, which is zero when only the horizontal component of the wind flux is accounted for.
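The exposure equation can be evaluated for a single cell as in the sketch below. The function name exposure_angle and the use of degrees for inputs and output are assumptions made for illustration. How the tool scales E into its output raster values is not specified here.

```python
import math


def exposure_angle(slope_deg: float, aspect_deg: float,
                   wind_azimuth_deg: float, horizon_deg: float = 0.0) -> float:
    """Evaluate cos(E) = cos(S) sin(H) + sin(S) cos(H) cos(Az - A).

    All angles are in degrees (an assumption of this sketch). A horizon
    angle of zero accounts only for the horizontal wind component.
    """
    s = math.radians(slope_deg)
    a = math.radians(aspect_deg)
    az = math.radians(wind_azimuth_deg)
    h = math.radians(horizon_deg)
    cos_e = math.cos(s) * math.sin(h) + math.sin(s) * math.cos(h) * math.cos(az - a)
    # Guard against tiny floating-point overshoot before calling acos.
    cos_e = max(-1.0, min(1.0, cos_e))
    return math.degrees(math.acos(cos_e))


# A 45-degree slope whose aspect matches the wind azimuth, and one whose
# aspect is opposite to it.
print(exposure_angle(45.0, 270.0, 270.0))
print(exposure_angle(45.0, 90.0, 270.0))
```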
Exposure images are best displayed using a greyscale or bipolar palette to distinguish between the positive and negative values that are present in the output.
References
Antonić, O., & Legović, T. 1999. Estimating the direction of an unknown air pollution source using a digital elevation model and a sample of deposition. Ecological modelling, 124(1), 85-95.
Böhner, J., & Antonić, O. 2009. Land-surface parameters specific to topo-climatology. Developments in Soil Science, 33, 195-226.
See Also
relative_aspect
Python API
def exposure_towards_wind_flux(self, dem: Raster, azimuth: float = 0.0, max_dist: float = float('inf'), z_factor: float = 1.0) -> Raster:
Feature Preserving Smoothing
Function name: feature_preserving_smoothing
Description
This tool implements a highly modified form of the DEM de-noising algorithm described by Sun et al. (2007). It is very effective at removing surface roughness from digital elevation models (DEMs), without significantly altering breaks-in-slope. As such, this tool should be used for smoothing DEMs rather than either smoothing with low-pass filters (e.g. mean, median, Gaussian filters) or grid size coarsening by resampling. The algorithm works by 1) calculating the surface normal 3D vector of each grid cell in the DEM, 2) smoothing the normal vector field using a filtering scheme that applies more weight to neighbours with lower angular difference in surface normal vectors, and 3) using the smoothed normal vector field to update the elevations in the input DEM.
Sun et al.'s (2007) original method was intended to work on input point clouds and fitted triangular irregular networks (TINs). The algorithm has been modified to work with input raster DEMs instead. In so doing, this algorithm calculates surface normal vectors from the planes fitted to 3 x 3 neighbourhoods surrounding each grid cell, rather than the triangular facet. The normal vector field smoothing and elevation updating procedures are also based on raster filtering operations. These modifications make this tool more efficient than Sun's original method, but will also result in a slightly different output than what would be achieved with Sun's method.
The user must specify the values of three key parameters: the filter size (filter), the normal difference threshold (norm_diff), and the number of iterations (num_iter). Lindsay et al. (2019) found that the degree of smoothing was less affected by the filter size than by either the normal difference threshold or the number of iterations. A filter size of 11, the default value, tends to work well in many cases. To increase the level of smoothing applied to the DEM, consider increasing the normal difference threshold, i.e. the angular difference in normal vectors between the center cell of a filter window and a neighbouring cell. This parameter determines which neighbouring values are included in a filtering operation; higher values result in a greater number of neighbouring cells being included, and therefore smoother surfaces. Similarly, increasing the number of iterations from the default value of 3 to upwards of 5-10 will result in significantly greater smoothing.
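The role of the normal difference threshold can be illustrated with a small sketch. It is not the tool's implementation: the function names and the use of a simple inclusion test (without the angular weighting described earlier) are assumptions made for illustration.

```python
import math


def normal_angle_deg(n1, n2) -> float:
    """Angle in degrees between two 3D surface normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    len1 = math.sqrt(sum(a * a for a in n1))
    len2 = math.sqrt(sum(b * b for b in n2))
    cos_t = max(-1.0, min(1.0, dot / (len1 * len2)))
    return math.degrees(math.acos(cos_t))


def included_neighbours(centre_normal, neighbour_normals,
                        normal_diff_threshold: float = 8.0):
    """Keep neighbours whose normals differ from the centre cell's normal
    by no more than the threshold angle."""
    return [n for n in neighbour_normals
            if normal_angle_deg(centre_normal, n) <= normal_diff_threshold]


def tilted(deg: float):
    """Hypothetical helper: a unit normal tilted from vertical by deg degrees."""
    return (0.0, math.sin(math.radians(deg)), math.cos(math.radians(deg)))


centre = (0.0, 0.0, 1.0)
neighbours = [tilted(5.0), tilted(20.0)]
print(len(included_neighbours(centre, neighbours, 8.0)))
print(len(included_neighbours(centre, neighbours, 25.0)))
```

Raising the threshold admits the more steeply tilted neighbour as well, which is why larger threshold values produce smoother surfaces.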
Before smoothing treatment:
After smoothing treatment with FPS:
For a video tutorial on how to use the feature_preserving_smoothing tool, please see this YouTube video.
References
Lindsay JB, Francioni A, Cockburn JMH. 2019. LiDAR DEM smoothing and the preservation of drainage features. Remote Sensing, 11(16), 1926; DOI: 10.3390/rs11161926.
Sun, X., Rosin, P., Martin, R., & Langbein, F. (2007). Fast and effective feature-preserving mesh denoising. IEEE Transactions on Visualization & Computer Graphics, (5), 925-938.
Parameters
dem (Raster): The input digital elevation model (DEM)
filter_size (int): The filter size used for smoothing. Default is 11.
normal_diff_threshold (float): The maximum allowable difference in the angle of the normals between two grid cells on the same facet. Default is 8.0.
iterations (int): The number of iterations used during smoothing. Default is 3.
max_elevation_diff (float): The maximum vertical distance by which a cell's elevation is allowed to change. Default is infinity (no limit).
z_factor (float): Used to convert elevation units so that they match the horizontal units. Unless the two units differ, this should be set to 1.0. Default is 1.0.
Returns
Raster: The smoothed DEM.
Python API
def feature_preserving_smoothing(self, dem: Raster, filter_size: int = 11, normal_diff_threshold: float = 8.0, iterations: int = 3, max_elevation_diff: float = float('inf'), z_factor: float = 1.0) -> Raster:
Feature Preserving Smoothing Multiscale
Function name: feature_preserving_smoothing_multiscale
No help documentation available for this tool.
Fetch Analysis
Function name: fetch_analysis
This tool creates a new raster in which each grid cell is assigned the distance, in meters, to the nearest topographic obstacle in a specified direction. It is a modification of the algorithm described by Lapen and Martz (1993). Unlike the original algorithm, Fetch Analysis is capable of analyzing fetch in any direction from 0-360 degrees. The user must input a digital elevation model (DEM) raster file, a hypothetical wind direction, and a value for the height increment parameter. The algorithm searches each grid cell in a path following the specified wind direction until the following condition is met:
$$Z_{test} \geq Z_{core} + D \times I$$
where Zcore is the elevation of the grid cell at which fetch is being determined, Ztest is the elevation of the grid cell being tested as a topographic obstacle, D is the distance between the two grid cells in meters, and I is the height increment in m/m. Lapen and Martz (1993) suggest values for I in the range of 0.025 m/m to 0.1 m/m based on their study of snow re-distribution in low-relief agricultural landscapes of the Canadian Prairies. If the directional search does not identify an obstacle grid cell before the edge of the DEM is reached, the distance between the DEM edge and Zcore is entered. Edge distances are assigned negative values to differentiate between these artificially truncated fetch values and those for which a valid topographic obstacle was identified. Notice that linear interpolation is used to estimate the elevation of the surface where a ray (i.e. the search path) does not intersect the DEM grid precisely at one of its nodes.
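The search rule can be sketched in one dimension as below. This is a simplification made for illustration: it walks a list of elevations that is assumed to already lie along the search direction at a constant spacing, so it omits the ray tracing and linear interpolation that the tool performs. The function name fetch_along_profile is hypothetical.

```python
def fetch_along_profile(elevations, core_index: int, cell_size: float,
                        height_increment: float = 0.05) -> float:
    """Return the distance to the first cell satisfying
    Ztest >= Zcore + D * I, or the negative edge distance if none does."""
    z_core = elevations[core_index]
    for i in range(core_index + 1, len(elevations)):
        distance = (i - core_index) * cell_size
        if elevations[i] >= z_core + distance * height_increment:
            return distance
    # No obstacle found: report the truncated distance as a negative value.
    edge_distance = (len(elevations) - 1 - core_index) * cell_size
    return -edge_distance


print(fetch_along_profile([10.0, 10.0, 10.0, 10.5], 0, 1.0))
print(fetch_along_profile([10.0, 10.0, 10.0], 0, 1.0))
```

In the first profile the cell 3 m away rises above the threshold plane; in the second the edge is reached first, so the result is negative.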
Ray-tracing is a highly computationally intensive task and therefore this tool may take considerable time to operate for larger sized DEMs. This tool is parallelized to aid with computational efficiency. NoData valued grid cells in the input image will be assigned NoData values in the output image. Fetch Analysis images are best displayed using the blue-white-red bipolar palette to distinguish between the positive and negative values that are present in the output.
Reference
Lapen, D. R., & Martz, L. W. (1993). The measurement of two simple topographic indices of wind sheltering-exposure from raster digital elevation models. Computers & Geosciences, 19(6), 769-779.
See Also
directional_relief, horizon_angle, relative_aspect
Python API
def fetch_analysis(self, dem: Raster, azimuth: float = 0.0, height_increment: float = 0.05) -> Raster:
Fill Missing Data
Function name: fill_missing_data
This tool can be used to fill in small gaps in a raster or digital elevation model (DEM). The gaps, or holes, must have recognized NoData values. If gaps do not currently have this characteristic, use the set_nodata_value tool and ensure that the data are stored using a raster format that supports NoData values. All valid, non-NoData values in the input raster will be assigned the same value in the output image.
The algorithm uses an inverse-distance weighted (IDW) scheme based on the valid values on the edge of NoData gaps to estimate gap values. The user must specify the filter size (filter), which determines the size of gap that is filled, and the IDW weight (weight).
The filter size, specified in grid cells, is used to determine how far the algorithm will search for valid, non-NoData values. Therefore, setting a larger filter size allows for the filling of larger gaps in the input raster.
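The inverse-distance weighting idea can be sketched for a single gap cell as follows. This is a generic IDW estimate and not necessarily the exact scheme used by the tool. The function name idw_estimate and the (value, distance) input format are assumptions.

```python
def idw_estimate(samples, weight: float = 2.0) -> float:
    """Estimate a gap value from (value, distance) pairs of valid edge cells.

    Each sample is weighted by 1 / distance**weight, so nearer cells
    contribute more. Distances must be greater than zero.
    """
    numerator = 0.0
    denominator = 0.0
    for value, distance in samples:
        w = 1.0 / distance ** weight
        numerator += w * value
        denominator += w
    return numerator / denominator


# Two valid cells: 10.0 at one cell away and 20.0 at two cells away.
print(idw_estimate([(10.0, 1.0), (20.0, 2.0)], weight=2.0))
```

A larger weight pulls the estimate further toward the nearest valid cells.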
The no_edges flag can be used to exclude NoData values that are connected to the edges of the raster. It is usually the case that irregularly shaped DEMs have large regions of NoData values along the containing raster edges. This flag can be used to exclude these regions from the gap-filling operation, leaving only interior gaps for filling.
See Also
set_nodata_value
Python API
def fill_missing_data(self, dem: Raster, filter_size: int = 11, weight: float = 2.0, exclude_edge_nodata: bool = False) -> Raster:
Find Ridges
Function name: find_ridges
This tool can be used to identify ridge cells in a digital elevation model (DEM). Ridge cells are those that have lower neighbours either to the north and south or the east and west. Line thinning can optionally be used to create single-cell wide ridge networks by specifying the line_thin parameter.
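The ridge definition above can be sketched on a small grid. This is an illustration under stated assumptions: the DEM is a list of rows, edge cells are skipped, and line thinning is not applied. The function name find_ridge_cells is hypothetical.

```python
def find_ridge_cells(dem):
    """Flag interior cells that have lower neighbours either to the north
    and south or to the east and west."""
    rows, cols = len(dem), len(dem[0])
    ridges = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            z = dem[r][c]
            lower_ns = dem[r - 1][c] < z and dem[r + 1][c] < z
            lower_ew = dem[r][c - 1] < z and dem[r][c + 1] < z
            ridges[r][c] = lower_ns or lower_ew
    return ridges


# A north-south ridge line running down the middle column.
dem = [
    [1.0, 5.0, 1.0],
    [1.0, 5.0, 1.0],
    [1.0, 5.0, 1.0],
]
print(find_ridge_cells(dem)[1][1])
```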
Python API
def find_ridges(self, dem: Raster, line_thin: bool = True) -> Raster:
Hillshade
Function name: hillshade
This tool performs a hillshade operation (also called shaded relief) on an input digital elevation model (DEM). The user must input a DEM. Other parameters that must be specified include the illumination source azimuth (azimuth), or sun direction (0-360 degrees), the illumination source altitude (altitude; i.e. the elevation of the sun above the horizon, measured as an angle from 0 to 90 degrees) and the Z conversion factor (zfactor). The Z conversion factor is only important when the vertical and horizontal units are not the same in the DEM, and the DEM is in a projected coordinate system. When this is the case, the algorithm will multiply each elevation in the DEM by the Z conversion factor. If the DEM is in the geographic coordinate system (latitude and longitude), the following equation is used:
zfactor = 1.0 / (111320.0 x cos(mid_lat))
where mid_lat is the latitude of the centre of the raster, in radians.
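The geographic z-factor equation can be evaluated as in the sketch below. The function name geographic_z_factor is hypothetical, and it takes the latitude in degrees for convenience, converting to radians before applying the equation.

```python
import math


def geographic_z_factor(mid_lat_deg: float) -> float:
    """Evaluate zfactor = 1.0 / (111320.0 x cos(mid_lat)).

    mid_lat_deg is the latitude of the raster centre in degrees; it is
    converted to radians here, as the equation expects.
    """
    mid_lat = math.radians(mid_lat_deg)
    return 1.0 / (111320.0 * math.cos(mid_lat))


# The factor grows with latitude because a degree of longitude shortens.
for lat in (0.0, 45.0, 60.0):
    print(lat, geographic_z_factor(lat))
```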
The hillshade value (HS) of a DEM grid cell is calculated as:
$$HS = \frac{\tan(s)}{[1 - \tan(s)^2]^{0.5}} \times \left[\frac{\sin(Alt)}{\tan(s)} - \cos(Alt) \times \sin(Az - a)\right]$$
where s and a are the local slope gradient and aspect (orientation) respectively and Alt and Az are the illumination source altitude and azimuth respectively. Slope and aspect are calculated using Horn's (1981) 3rd-order finite difference method.
Reference
Gallant, J. C., and J. P. Wilson, 2000, Primary topographic attributes, in Terrain Analysis: Principles and Applications, edited by J. P. Wilson and J. C. Gallant pp. 51-86, John Wiley, Hoboken, N.J.
See Also
hypsometrically_tinted_hillshade, multidirectional_hillshade, aspect, slope
Python API
def hillshade(self, dem: Raster, azimuth: float = 315.0, altitude: float = 30.0, z_factor: float = 1.0) -> Raster:
Hypsometrically Tinted Hillshade
Function name: hypsometrically_tinted_hillshade
This tool creates a hypsometrically tinted shaded relief (Swiss hillshading) image from an input digital elevation model (DEM). The tool combines a colourized version of the DEM with varying illumination provided by a hillshade image, to produce a composite relief model that can be used to visualize topography for more effective interpretation of landscapes. The output of the tool is a 24-bit red-green-blue (RGB) colour image.
The user must input a DEM. Other parameters that must be specified include the illumination source azimuth (azimuth), or sun direction (0-360 degrees), the illumination source altitude (altitude; i.e. the elevation of the sun above the horizon, measured as an angle from 0 to 90 degrees), the hillshade weight (hs_weight; 0-1), image brightness (brightness; 0-1), and atmospheric effects (atmospheric; 0-1). The hillshade weight can be used to increase or subdue the relative prevalence of the hillshading effect in the output image. The image brightness parameter is used to create an overall brighter or darker version of the terrain rendering; note however, that very high values may over-saturate the well-illuminated portions of the terrain. The atmospheric effects parameter can be used to introduce a haze or atmosphere effect to the output image. It is intended to reproduce the effect of viewing mountain valley bottoms through a thicker and more dense atmosphere. Values greater than zero will introduce a slightly blue tint, particularly at lower altitudes, blur the hillshade edges slightly, and create a random haze-like speckle in lower areas. The user must also specify the Z conversion factor (zfactor). The Z conversion factor is only important when the vertical and horizontal units are not the same in the DEM. When this is the case, the algorithm will multiply each elevation in the DEM by the Z conversion factor. If the DEM is in the geographic coordinate system (latitude and longitude), the following equation is used:
zfactor = 1.0 / (111320.0 x cos(mid_lat))
where mid_lat is the latitude of the centre of the raster, in radians.
See Also
hillshade, multidirectional_hillshade, aspect, slope
Python API
def hypsometrically_tinted_hillshade(self, dem: Raster, solar_altitude: float = 45.0, hillshade_weight: float = 0.5, brightness: float = 0.5, atmospheric_effects: float = 0.0, palette: str = "atlas", reverse_palette: bool = False, full_360_mode: bool = False, z_factor: float = 1.0) -> Raster:
Local Hypsometric Analysis
Function name: local_hypsometric_analysis
PRO, Experimental
Computes the minimum local hypsometric integral across a nonlinearly sampled range of neighbourhood scales.
geomorphometry multiscale hypsometry legacy-port
Parameters
Name | Description | Required | Default
input | Input DEM raster path or typed raster object. | Required | dem.tif
min_scale | Minimum half-window radius in cells (default 4). | Optional | 4
step_size | Base step size in cells (default 1). Alias: step. | Optional | 1
num_steps | Number of scales to evaluate (default 10). | Optional | 10
step_nonlinearity | Scale-step nonlinearity in [1,4] (default 1.0). | Optional | 1.0
output | Optional output path for local HI minimum raster. | Optional | (none)
output_scale | Optional output path for scale-of-minimum-HI raster. | Optional | (none)
Examples
Compute minimum local hypsometric integral and associated scale.
wbe.local_hypsometric_analysis(input='dem.tif', min_scale=4, num_steps=10, output='local_hypsometric_analysis.tif', output_scale='local_hypsometric_analysis_scale.tif', step_nonlinearity=1.0, step_size=1)
Low Points On Headwater Divides
Function name: low_points_on_headwater_divides
PRO, Experimental
Locates low pass points along divides between neighboring headwater subbasins.
geomorphometry streams subbasins passes legacy-port
Parameters
Name | Description | Required | Default
dem | Input depressionless DEM raster path or typed raster object. | Required | dem.tif
streams | Input stream raster path (positive values indicate channel cells). | Required | streams.tif
output | Optional output vector path (default temporary .shp). | Optional | low_points_on_headwater_divides.shp
Examples
Find low pass points between neighboring headwater basins.
wbe.low_points_on_headwater_divides(dem='dem.tif', output='low_points_on_headwater_divides.shp', streams='streams.tif')
Max Downslope Elev Change
Function name: max_downslope_elev_change
This tool calculates the maximum elevation drop between each grid cell and its neighbouring cells within a digital elevation model (DEM). The user must input a DEM (dem).
See Also
max_upslope_elev_change, min_downslope_elev_change, num_downslope_neighbours
Python API
def max_downslope_elev_change(self, raster: Raster) -> Raster:
Max Upslope Elev Change
Function name: max_upslope_elev_change
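This tool calculates the maximum upslope elevation change between each grid cell and its neighbouring cells within a digital elevation model (DEM). The user must input a DEM (dem).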
See Also
max_downslope_elev_change
Python API
def max_upslope_elev_change(self, raster: Raster) -> Raster:
Min Downslope Elev Change
Function name: min_downslope_elev_change
This tool calculates the minimum elevation drop between each grid cell and its neighbouring cells within a digital elevation model (DEM). The user must input a DEM (dem).
See Also
max_downslope_elev_change, num_downslope_neighbours
Python API
def min_downslope_elev_change(self, raster: Raster) -> Raster:
Multidirectional Hillshade
Function name: multidirectional_hillshade
This tool performs a hillshade operation (also called shaded relief) on an input digital elevation model (DEM) with multiple sources of illumination. The user must input a DEM (dem). Other parameters that must be specified include the altitude of the illumination sources (altitude; i.e. the elevation of the sun above the horizon, measured as an angle from 0 to 90 degrees) and the Z conversion factor (zfactor). The Z conversion factor is only important when the vertical and horizontal units are not the same in the DEM, and the DEM is in a projected coordinate system. When this is the case, the algorithm will multiply each elevation in the DEM by the Z conversion factor.
The hillshade value (HS) of a DEM grid cell is calculated as:
$$HS = \frac{\tan(s)}{[1 - \tan(s)^2]^{0.5}} \times \left[\frac{\sin(Alt)}{\tan(s)} - \cos(Alt) \times \sin(Az - a)\right]$$
where s and a are the local slope gradient and aspect (orientation) respectively and Alt and Az are the illumination source altitude and azimuth respectively. Slope and aspect are calculated using Horn's (1981) 3rd-order finite difference method.
Lastly, the user must specify whether or not to use full 360-degrees of illumination sources (full_mode). When this flag is not specified, the tool will perform a weighted summation of the hillshade images from four illumination azimuth positions at 225, 270, 315, and 360 (0) degrees, given weights of 0.1, 0.4, 0.4, and 0.1 respectively. When run in the full 360-degree mode, eight illumination source azimuths are used to calculate the output at 0, 45, 90, 135, 180, 225, 270, and 315 degrees, with weights of 0.15, 0.125, 0.1, 0.05, 0.1, 0.125, 0.15, and 0.2 respectively.
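The weighted summation can be sketched as follows. The azimuths and weights are taken from the description above; the function name combine_hillshades and the per-cell dictionary input are assumptions made for illustration, and the per-azimuth hillshade values are supplied by the caller rather than computed here.

```python
# Illumination azimuths (degrees) mapped to weights, as listed in the text.
FOUR_DIRECTION_WEIGHTS = {225.0: 0.1, 270.0: 0.4, 315.0: 0.4, 360.0: 0.1}
FULL_360_WEIGHTS = {
    0.0: 0.15, 45.0: 0.125, 90.0: 0.1, 135.0: 0.05,
    180.0: 0.1, 225.0: 0.125, 270.0: 0.15, 315.0: 0.2,
}


def combine_hillshades(hillshade_by_azimuth, full_360_mode: bool = False) -> float:
    """Weighted sum of one cell's hillshade values, keyed by azimuth."""
    weights = FULL_360_WEIGHTS if full_360_mode else FOUR_DIRECTION_WEIGHTS
    return sum(w * hillshade_by_azimuth[az] for az, w in weights.items())


# Hypothetical hillshade values for one cell under four illumination sources.
cell = {225.0: 120.0, 270.0: 180.0, 315.0: 200.0, 360.0: 150.0}
print(combine_hillshades(cell))
```

Both weight sets sum to 1.0, so the combined value stays within the range of the individual hillshade values.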
Classic hillshade (Azimuth=315, Altitude=45.0)
Multi-directional hillshade (Altitude=45.0, Four-direction mode)
Multi-directional hillshade (Altitude=45.0, 360-degree mode)
See Also
hillshade, hypsometrically_tinted_hillshade, aspect, slope
Python API
def multidirectional_hillshade(self, dem: Raster, altitude: float = 30.0, z_factor: float = 1.0, full_360_mode: bool = False) -> Raster:
Num Downslope Neighbours
Function name: num_downslope_neighbours
This tool calculates the number of downslope neighbours of each grid cell in a raster digital elevation model (DEM). The user must input a DEM (dem). The tool examines the eight neighbouring cells for each grid cell in the DEM and counts the number of neighbours with an elevation less than the centre cell of the 3 x 3 window. The output image can therefore have values ranging from 0 to 8. A raster grid cell with eight downslope neighbours is a peak and a cell with zero downslope neighbours is a pit. This tool can be used with the num_upslope_neighbours tool to assess the degree of local flow divergence/convergence.
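The neighbour count can be sketched in plain Python as below. This is an illustration and not the tool's implementation: the DEM is a list of rows, NoData handling is omitted, and neighbours that fall outside the grid are simply skipped, which is an assumption about edge behaviour. The function name count_downslope_neighbours is hypothetical.

```python
def count_downslope_neighbours(dem):
    """Count, for each cell, the 8-neighbours with a lower elevation."""
    rows, cols = len(dem), len(dem[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            count = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and dem[rr][cc] < dem[r][c]:
                        count += 1
            out[r][c] = count
    return out


peak = [[1.0, 1.0, 1.0], [1.0, 9.0, 1.0], [1.0, 1.0, 1.0]]
pit = [[9.0, 9.0, 9.0], [9.0, 1.0, 9.0], [9.0, 9.0, 9.0]]
print(count_downslope_neighbours(peak)[1][1])
print(count_downslope_neighbours(pit)[1][1])
```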
See Also
num_upslope_neighbours
Python API
def num_downslope_neighbours(self, dem: Raster) -> Raster:
Num Upslope Neighbours
Function name: num_upslope_neighbours
Experimental
Counts the number of 8-neighbour cells higher than each DEM cell.
geomorphometry terrain flow legacy-port
Profile
Function name: profile
This tool can be used to plot the data profile, along a set of one or more vector lines (lines), in an input (surface) digital elevation model (DEM), or other surface model. The data profile plots surface height (y-axis) against distance along profile (x-axis). The tool outputs an interactive SVG line graph embedded in an HTML document (output). If the vector lines file contains multiple line features, the output plot will contain each of the input profiles.
If you want to extract the longitudinal profile of a river, use the long_profile tool instead.
See Also
long_profile, hypsometric_analysis
Python API
def profile(self, lines_vector: Vector, surface: Raster, output_html_file: str) -> None:
Slope Vs Aspect Plot
Function name: slope_vs_aspect_plot
PRO, Experimental
Creates an HTML radial slope-vs-aspect analysis plot for an input DEM.
geomorphometry terrain plot html legacy-port
Slope Vs Elev Plot
Function name: slope_vs_elev_plot
This tool can be used to create a slope versus average elevation plot for one or more digital elevation models (DEMs). Similar to a hypsometric analysis (hypsometric_analysis), the slope-elevation relation can reveal the basic topographic character of a site. The output of this analysis is an HTML document (output) that contains the slope-elevation chart. The tool can plot multiple slope-elevation analyses on the same chart by specifying multiple input DEM files (inputs). Each input DEM can have an optional watershed to which the slope-elevation analysis is confined, specified using the optional watershed flag. If multiple input DEMs are used, and a watershed is used to confine the analysis to a sub-area, there must be the same number of input raster watershed files as input DEM files. The order of the DEM and watershed files must be the same (i.e. the first DEM file must correspond to the first watershed file, the second DEM file to the second watershed file, etc.). Each watershed file may contain one or more watersheds, designated by unique identifiers.
See Also
hypsometric_analysis, slope_vs_aspect_plot
Python API
def slope_vs_elev_plot(self, dem_rasters: List[Raster], output_html_file: str, watershed_rasters: List[Raster]) -> None:
Surface Area Ratio
Function name: surface_area_ratio
This tool calculates the ratio between the surface area and planar area of grid cells within digital elevation models (DEMs). The tool uses the method of Jenness (2004) to estimate the surface area of a DEM grid cell based on the elevations contained within the 3 x 3 neighbourhood surrounding each cell. The surface area ratio has a lower bound of 1.0 for perfectly flat grid cells and is greater than 1.0 for other conditions. In particular, surface area ratio is a measure of neighbourhood surface shape complexity (texture) and elevation variability (local slope).
Reference
Jenness, J. S. (2004). Calculating landscape surface area from digital elevation models. Wildlife Society Bulletin, 32(3), 829-839.
See Also
ruggedness_index, multiscale_roughness, circular_variance_of_aspect, edge_density
Python API
def surface_area_ratio(self, dem: Raster) -> Raster:
Topographic Hachures
Function name: topographic_hachures
PRO, Experimental
Creates topographic hachure polylines from a DEM using contour-seeded downslope and upslope flowlines. Legacy authorship attribution is intentionally preserved for this tool.
geomorphometry hachures contours vector legacy-port
Map Off Terrain Objects
Function name: map_off_terrain_objects
This tool can be used to map off-terrain objects in a digital surface model (DSM) based on cell-to-cell differences in elevations and local slopes. The algorithm works by using a region-growing operation to connect neighbouring grid cells outwards from seed cells. Two neighbouring cells are considered connected if the slope between the two cells is less than the user-specified maximum slope value (max_slope). Mapped segments that are less than the minimum feature size (min_size), in grid cells, are assigned a common background value. Note that this method of mapping off-terrain objects, and thereby separating ground cells from non-ground objects in DSMs, works best with fine-resolution DSMs that have been interpolated using a non-smoothing method, such as triangulation (TINing) or nearest-neighbour interpolation.
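The region-growing step can be sketched as below. This is a simplified illustration and not the tool's implementation: it grows segments over 4-connected neighbours, measures slope from the elevation difference over one cell width, and omits the minimum feature size step. The function name segment_by_slope is hypothetical.

```python
import math
from collections import deque


def segment_by_slope(dsm, cell_size: float, max_slope_deg: float):
    """Label connected segments; two neighbouring cells are connected when
    the slope between them is less than max_slope_deg."""
    rows, cols = len(dsm), len(dsm[0])
    labels = [[0] * cols for _ in range(rows)]
    max_gradient = math.tan(math.radians(max_slope_deg))
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] != 0:
                continue
            # Start a new segment from this unlabelled seed cell.
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] == 0:
                        gradient = abs(dsm[rr][cc] - dsm[r][c]) / cell_size
                        if gradient < max_gradient:
                            labels[rr][cc] = next_label
                            queue.append((rr, cc))
    return labels


# A flat surface with a 5 m high, two-cell building in the middle.
dsm = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 5.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
for row in segment_by_slope(dsm, cell_size=1.0, max_slope_deg=45.0):
    print(row)
```

The steep building walls block the region growing, so the roof cells end up in a separate segment from the surrounding ground.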
See Also
remove_off_terrain_objects
Python API
def map_off_terrain_objects(self, dem: Raster, max_slope: float = float('inf'), min_feature_size: int = 0) -> Raster:
Remove Off Terrain Objects
Function name: remove_off_terrain_objects
This tool can be used to create a bare-earth DEM from a fine-resolution digital surface model. The tool is typically applied to LiDAR DEMs which frequently contain numerous off-terrain objects (OTOs) such as buildings, trees and other vegetation, cars, fences and other anthropogenic objects. The algorithm works by finding and removing steep-sided peaks within the DEM. All peaks within a sub-grid, with a dimension of the user-specified maximum OTO size (filter), in pixels, are identified and removed. Each of the edge cells of the peaks are then examined to see if they have a slope that is less than the user-specified minimum OTO edge slope (slope) and a back-filling procedure is used. This ensures that OTOs are distinguished from natural topographic features such as hills. The DEM is preprocessed using a white top-hat transform, such that elevations are normalized for the underlying ground surface.
Note that this tool is appropriate to apply to rasterized LiDAR DEMs. Use the lidar_ground_point_filter tool to remove or classify OTOs within a LiDAR point-cloud.
Reference
J.B. Lindsay (2018) A new method for the removal of off-terrain objects from LiDAR-derived raster surface models. Available online, DOI: 10.13140/RG.2.2.21226.62401
See Also
map_off_terrain_objects, tophat_transform, lidar_ground_point_filter
Python API
def remove_off_terrain_objects(self, dem: Raster, filter_size: int = 11, slope_threshold: float = 15.0) -> Raster:
Ridge And Valley Vectors
Function name: ridge_and_valley_vectors
This function can be used to extract ridge and channel vectors from an input digital elevation model (DEM). The function works by first calculating elevation percentile (EP) from an input DEM using a neighbourhood size set by the user-specified filter_size parameter. Increasing the value of filter_size can result in more continuous mapped ridge and valley bottom networks. A thresholding operation is then applied to identify cells that have an EP less than the user-specified ep_threshold (valley bottom regions) and a second thresholding operation maps regions where EP is greater than 100 - ep_threshold (ridges). Each of these ridge and valley region maps is also multiplied by a slope mask created by identifying all cells with a slope greater than the user-specified slope_threshold value, which is set to zero by default. This slope thresholding can be helpful if the input DEM contains extensive flat areas, which can otherwise be confused for valleys. The filter_size and ep_threshold parameters are somewhat dependent on one another. Increasing the filter_size parameter generally requires also increasing the value of the ep_threshold. The ep_threshold can take values between 5.0 and 50.0, where larger values will generally result in more extensive and continuous mapped ridge and valley bottom networks. For many DEMs, a value on the higher end of the scale tends to work best.
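The thresholding logic can be sketched per cell as follows. This is an illustration only: the function name classify_cell and the string labels are hypothetical, and the sketch takes precomputed EP and slope values rather than deriving them from a DEM.

```python
def classify_cell(ep: float, slope: float, ep_threshold: float = 30.0,
                  slope_threshold: float = 0.0) -> str:
    """Classify one cell from its elevation percentile (0-100) and slope.

    Cells that fail the slope mask (slope not greater than the threshold)
    are excluded from both the ridge and valley maps.
    """
    if slope <= slope_threshold:
        return "none"
    if ep < ep_threshold:
        return "valley"
    if ep > 100.0 - ep_threshold:
        return "ridge"
    return "none"


for ep, slope in [(10.0, 5.0), (90.0, 5.0), (50.0, 5.0), (10.0, 0.0)]:
    print(ep, slope, classify_cell(ep, slope))
```

The last case shows the effect of the slope mask: a low-EP cell on flat ground is not mapped as a valley bottom.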
After applying the thresholding operations, the function then applies specialized shape generalization, line thinning, and vectorization algorithms to produce the final ridge and valley vectors. The user must also specify the value of the min_length parameter, which determines the minimum size, in grid cells, of a mapped line feature. The function outputs a tuple of two vectors, the first being the ridge network and the second being the valley-bottom network.
Code Example
```python
from whitebox_workflows import WbEnvironment

# Set up the WbW environment
license_id = 'my-license-id'  # Note, this tool requires a license for WbW-Pro
wbe = WbEnvironment(license_id)
try:
    wbe.verbose = True
    wbe.working_directory = '/path/to/data'

    # Read the input DEM
    dem = wbe.read_raster('DEM.tif')

    # Run the operation
    ridges, valleys = wbe.ridge_and_valley_vectors(dem, filter_size=21, ep_threshold=45.0, slope_threshold=1.0, min_length=25)

    # Write both output vectors to the working directory
    wbe.write_vector(ridges, 'ridges_lines.shp')
    wbe.write_vector(valleys, 'valley_lines.shp')
    print('Done!')
except Exception as e:
    print("Error: ", e)
finally:
    # Always return the license, even if the run failed
    wbe.check_in_license(license_id)
```
See Also
extract_valleys
Python API
def ridge_and_valley_vectors(self, dem: Raster, filter_size: int = 11, ep_threshold: float = 30.0, slope_threshold: float = 0.0, min_length: int = 20) -> Tuple[Raster, Raster]:
Smooth Vegetation Residual
Function name: smooth_vegetation_residual
PRO Experimental
Reduces canopy residual roughness by masking high local DEV responses at small scales and re-interpolating masked elevations.
geomorphometry lidar smoothing dem legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input DEM raster path or typed raster object. | Required | dem.tif |
| max_scale | Maximum DEV half-window radius in cells (default 30). | Optional | 30 |
| dev_threshold | Minimum DEV magnitude used to flag roughness cells (default 1.0). | Optional | 1.0 |
| scale_threshold | Maximum scale considered roughness (default 5). | Optional | 5 |
| output | Optional output path. If omitted, result stays in memory. | Optional | (none) |
Examples
Suppress vegetation-residual roughness in a LiDAR DEM.
wbe.smooth_vegetation_residual(dev_threshold=1.0, input='dem.tif', max_scale=30, output='smooth_vegetation_residual.tif', scale_threshold=5)
Workflow Products
Topo Render
Function name: topo_render
PRO Experimental
Creates a pseudo-3D topographic rendering using palette tinting, hillshade, shadows, and attenuation.
geomorphometry terrain rendering topographic
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| dem | Input DEM raster path. | Required | dem.tif |
| palette | Palette name (soft, atlas, high_relief, turbo, viridis, dem, grey, white). | Optional | soft |
| reverse_palette | Reverse palette order. | Optional | False |
| azimuth | Light-source azimuth in degrees [0, 360]. | Optional | 315.0 |
| altitude | Light-source altitude in degrees [0, 90]. | Optional | 30.0 |
| clipping_polygon | Optional polygon vector path; only DEM cells inside polygon(s) are rendered. | Optional | (none) |
| background_hgt_offset | Vertical offset from minimum DEM elevation to background plane. | Optional | 10.0 |
| background_clr | Background RGBA colour as array [r,g,b,a]. | Optional | [255, 255, 255, 255] |
| attenuation_parameter | Distance attenuation exponent (>= 0). | Optional | 0.3 |
| ambient_light | Ambient light amount in [0, 1]. | Optional | 0.2 |
| z_factor | Vertical exaggeration multiplier. | Optional | 1.0 |
| max_dist | Maximum shadow search distance in map units. | Optional | (none) |
| output | Output raster path. | Optional | topo_render.tif |
Examples
Generate a pseudo-3D topographic render from a DEM.
wbe.topo_render(altitude=30.0, azimuth=315.0, dem='dem.tif', output='topo_render.tif', palette='soft')
Wetland Hydrogeomorphic Classification
Function name: wetland_hydrogeomorphic_classification
PRO Production
Classify wetlands into hydrogeomorphic classes using DEM context and wetland masks.
workflow pro
Workflow Narrative
Wetland Hydrogeomorphic Classification
Problem It Solves
Which mapped wetland regions belong to key HGM classes, and where is class confidence high enough for permitting decisions?
Who It Is For
- Wetland scientists, permitting consultants, and mitigation planners.
Primary User
Environmental consulting firms, permitting agencies, and mitigation banking organizations.
What It Does
- Classifies wetland mask cells into HGM classes using terrain context.
- Builds confidence surfaces for wetland classification reliability.
- Polygonizes connected wetland regions with region-level attributes.
How It Works
- Computes local terrain signatures in wetland-mask cells from DEM gradients and relief context.
- Assigns HGM class codes using rule-based thresholds over those signatures.
- Aggregates connected components into polygons and summarizes dominant class and mean confidence.
- Indicative formula: class = rule(slope, relief, wetness_proxy), confidence from distance-to-threshold stability.
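As a rough illustration of the indicative formula above, the sketch below applies a rule set to one cell and derives confidence from how far the cell sits from the rule thresholds. Everything specific in it is an assumption made for illustration: the threshold values, the three class labels, and the `classify_cell` helper are hypothetical and do not come from the tool.

```python
from typing import Tuple

# Hypothetical thresholds and class names, for illustration only.
SLOPE_T = 2.0    # slope threshold, degrees
RELIEF_T = 3.0   # local relief threshold, metres
WETNESS_T = 0.5  # wetness-proxy threshold, unitless


def classify_cell(slope: float, relief: float, wetness_proxy: float) -> Tuple[str, float]:
    """Return (class label, confidence in [0, 1]) for one wetland-mask cell."""
    if slope >= SLOPE_T:
        label = "slope"
    elif relief <= RELIEF_T and wetness_proxy >= WETNESS_T:
        label = "depressional"
    else:
        label = "flats"
    # Confidence: the smallest relative distance of any signature from its
    # threshold, so a cell sitting close to a rule boundary scores low.
    margins = (
        abs(slope - SLOPE_T) / SLOPE_T,
        abs(relief - RELIEF_T) / RELIEF_T,
        abs(wetness_proxy - WETNESS_T) / WETNESS_T,
    )
    return label, min(1.0, min(margins))


label, confidence = classify_cell(slope=4.0, relief=1.0, wetness_proxy=0.8)
print(label, round(confidence, 2))
```

A cell whose slope equals the threshold exactly gets a confidence of zero, which is the behaviour the "distance-to-threshold stability" wording suggests.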
Why It Wins
- Produces connected-region polygons with explicit confidence and class attributes, not just cell-level labels.
Typical Buying Trigger
Permitting or mitigation workflows require polygon deliverables with defensible classification metadata.
Typical Presets
- default for full region polygon extraction.
- lower max_polygon_features for very large AOIs.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| dem, wetland_mask | no | DEM and wetland mask defining candidate wetland extent and hydrogeomorphic context. |
| max_polygon_features | no | Maximum number of polygon features to emit for mapped wetland units. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| hgm_class | GeoTIFF | Categorical hydrogeomorphic wetland class raster. |
| confidence | GeoTIFF | Confidence layer quantifying reliability of modeled outputs. |
| wetland_polygons | GeoPackage | Vectorized wetland class polygons for mapping and reporting. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

hgm, conf, polys, summary = wbe.wetland_hydrogeomorphic_classification(
    dem="data/dem.tif",
    wetland_mask="data/wetlands.tif",
    max_polygon_features=10000,
    output_prefix="output/wetland_hgm",
)

print(hgm)
print(conf)
print(polys)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Urban Expansion Impact Assessment
Function name: urban_expansion_impact_assessment
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Urban Expansion Impact Assessment
Problem It Solves
What ecological and stream-network impacts should we expect under this growth scenario, and where are priorities for mitigation?
Who It Is For
- Urban planners, environmental assessors, and watershed impact analysts.
Primary User
Municipal planning departments, environmental consultancies, and stormwater authorities.
What It Does
- Quantifies expansion-driven impact severity from baseline/scenario urban surfaces.
- Derives habitat-loss raster products from change footprint logic.
- Scores stream features against impact raster exposure.
How It Works
- Builds urban change footprint by differencing scenario and baseline urban rasters.
- Translates new/expanded footprint intensity into impact and habitat-loss scores.
- Samples stream geometries against impact surfaces to compute attributed reach-level metrics.
- Indicative formula: change = max(0, urban_scenario - urban_baseline); impact ~= change * habitat_weight.
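The indicative formula above maps directly onto a per-cell calculation. The sketch below is a minimal pure-Python version over small nested lists; the `urban_impact` helper is hypothetical, and treating a missing habitat layer as a uniform weight of 1.0 is an assumption rather than documented behaviour.

```python
from typing import List, Optional

Grid = List[List[float]]


def urban_impact(baseline: Grid, scenario: Grid, habitat_weight: Optional[Grid] = None) -> Grid:
    """impact = max(0, scenario - baseline) * habitat_weight, cell by cell."""
    impact = []
    for r, (b_row, s_row) in enumerate(zip(baseline, scenario)):
        out_row = []
        for c, (b, s) in enumerate(zip(b_row, s_row)):
            # Only new or intensified urban cover counts; losses clamp to zero.
            change = max(0.0, s - b)
            # Assumption: without a habitat layer every cell is weighted 1.0.
            weight = habitat_weight[r][c] if habitat_weight is not None else 1.0
            out_row.append(change * weight)
        impact.append(out_row)
    return impact


baseline_urban = [[0.0, 1.0], [0.0, 0.5]]
scenario_urban = [[1.0, 1.0], [0.0, 0.25]]
habitat_sensitivity = [[0.5, 1.0], [1.0, 1.0]]

impact = urban_impact(baseline_urban, scenario_urban, habitat_sensitivity)
print(impact)
```

Only the top-left cell gains urban cover here, so it is the only cell with a non-zero impact; the cell that loses urban cover is clamped to zero rather than producing a negative score.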
Why It Wins
- Links scenario change, habitat loss, and attributed stream impact in one report-ready package.
Typical Buying Trigger
Planning approvals require quantified environmental impact evidence under multiple development scenarios.
Typical Presets
- baseline/scenario-only impact analysis.
- add habitat_sensitivity for weighted ecological impact scoring.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| baseline_urban, scenario_urban, streams | no | Baseline and scenario urban rasters plus stream network used for impact assessment. |
| habitat_sensitivity | yes | Optional habitat sensitivity layer used to weight ecological impact severity. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| impact_severity | GeoTIFF | Impact severity raster for baseline-versus-scenario urban change. |
| habitat_loss | GeoTIFF | Raster estimate of habitat loss intensity under scenario comparison. |
| affected_streams | GeoPackage | Vector stream segments flagged as impacted under scenario conditions. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

impact, habitat, streams, summary = wbe.urban_expansion_impact_assessment(
    baseline_urban="data/urban_2020.tif",
    scenario_urban="data/urban_2035.tif",
    streams="data/streams.gpkg",
    habitat_sensitivity="data/habitat_sensitivity.tif",
    output_prefix="output/urban_impact",
)

print(impact)
print(habitat)
print(streams)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Wind Turbine Siting
Function name: wind_turbine_siting
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Wind Turbine Siting Analysis
Problem It Solves
Which candidate areas are most promising for wind siting at early-stage screening, and how confident are those rankings?
Who It Is For
- Renewable energy siting teams and feasibility analysts.
Primary User
Wind developers, utility planning groups, and engineering consultancies.
What It Does
- Creates suitability and confidence surfaces for wind siting screening.
- Combines slope constraints, terrain exposure, and settlement-visibility signals.
- Supports profile-based scoring behavior for speed vs quality tradeoffs.
How It Works
- Derives slope and terrain-exposure factors from the DEM.
- Applies distance/visibility penalties around settlement inputs.
- Combines normalized factors with profile-dependent weights into siting score and confidence.
- Indicative formula: score ~= w_s * terrain_suitability + w_e * exposure - w_v * visibility_penalty.
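The weighted combination in the indicative formula can be sketched as follows. The weight triples per profile are hypothetical placeholders (the actual profile weights are not documented here), and `siting_score` is an illustrative helper, not part of the Whitebox API.

```python
from typing import Dict, Tuple

# Hypothetical (w_s, w_e, w_v) weights per profile, for illustration only.
PROFILE_WEIGHTS: Dict[str, Tuple[float, float, float]] = {
    "fast": (0.6, 0.2, 0.2),
    "balanced": (0.5, 0.3, 0.2),
    "quality": (0.4, 0.3, 0.3),
}


def siting_score(
    terrain_suitability: float,
    exposure: float,
    visibility_penalty: float,
    profile: str = "balanced",
) -> float:
    """score = w_s * terrain_suitability + w_e * exposure - w_v * visibility_penalty."""
    w_s, w_e, w_v = PROFILE_WEIGHTS[profile]
    return w_s * terrain_suitability + w_e * exposure - w_v * visibility_penalty


# All three inputs are assumed to be normalized to [0, 1] beforehand.
score = siting_score(terrain_suitability=0.8, exposure=0.6, visibility_penalty=0.5)
print(round(score, 2))
```

Because the visibility term is subtracted, a cell close to settlements can score lower than a less exposed cell farther away, which matches the penalty behaviour described in the list above.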
Why It Wins
- Combines terrain and visibility constraints with confidence scoring to support transparent shortlist decisions.
Typical Buying Trigger
A development team needs to narrow a broad search region before expensive met mast or field campaigns.
Typical Presets
- fast for broad regional pre-screening.
- balanced for standard feasibility support.
- quality for higher-confidence candidate ranking.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| dem, settlements | no | Terrain model and settlement features used for slope/visibility siting constraints. |
| visibility_radius_meters | no | Maximum visibility analysis radius used during visual-impact screening. |
| min_slope_degrees, max_slope_degrees | no | Slope suitability bounds used to constrain turbine placement candidates. |
| profile (fast, balanced, or quality) | no | Siting profile controlling screening speed versus quality/strictness of constraints. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| siting_score | GeoTIFF | Core siting suitability score raster produced by the model. |
| confidence | GeoTIFF | Confidence layer quantifying reliability of modeled outputs. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
When sweep_spec is supplied, the workflow also emits run_matrix_summary, sensitivity_report, sensitivity_report_html, and stability_map. The sensitivity report includes metrics.primary_metric, metrics.primary_relative_span, and metrics.stability_class (high, medium, low), while stability_map uses classes 3=high, 2=medium, 1=low.
Python Example
`import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

score, conf, summary = wbe.wind_turbine_siting(
    dem="data/dem.tif",
    settlements="data/settlements.gpkg",
    profile="balanced",
    output_prefix="output/wind_siting",
)

print(score)
print(conf)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Solar Site Suitability Analysis
Function name: solar_site_suitability_analysis
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Solar Site Suitability Analysis
Problem It Solves
Where are top solar candidates based on terrain suitability, and which shortlisted sites are strong enough for deeper engineering review?
Who It Is For
- Solar development teams and regional screening analysts.
Primary User
Solar developers, renewable consultants, and utility planning units.
What It Does
- Computes solar siting suitability and visual-impact proxies from terrain.
- Selects and ranks candidate point sites.
- Emits attributed candidate vectors for downstream review.
How It Works
- Computes terrain suitability response from slope/aspect and neighborhood context.
- Estimates visual-impact proxy from terrain prominence and local exposure contrasts.
- Applies thresholding and ranking to emit top candidate points with attributes.
- Indicative formula: suitability ~= f(slope, aspect, relief); candidates = arg top-k(suitability) above threshold.
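The thresholding-and-ranking step can be sketched with the standard library's `heapq.nlargest`. This is an illustrative reading of "arg top-k above threshold": the `top_candidates` helper is hypothetical, cells are treated as candidates when their score is greater than or equal to the threshold (an assumption), and ties are broken by row and column index simply because of tuple ordering.

```python
import heapq
from typing import List, Tuple

Grid = List[List[float]]


def top_candidates(
    suitability: Grid, candidate_threshold: float, max_candidate_sites: int
) -> List[Tuple[float, int, int]]:
    """Return up to max_candidate_sites (score, row, col) tuples, best first."""
    passing = [
        (score, r, c)
        for r, row in enumerate(suitability)
        for c, score in enumerate(row)
        if score >= candidate_threshold  # assumption: threshold is inclusive
    ]
    return heapq.nlargest(max_candidate_sites, passing)


suitability = [
    [0.2, 0.9],
    [0.75, 0.6],
]
sites = top_candidates(suitability, candidate_threshold=0.7, max_candidate_sites=200)
print(sites)
```

Raising candidate_threshold shrinks the pool before ranking, while lowering max_candidate_sites truncates the ranked list, which is why the two presets listed under Typical Presets behave differently.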
Why It Wins
- Emits ranked candidate vectors with visual-impact attributes, enabling immediate GIS review and filtering.
Typical Buying Trigger
Early project screening demands a short list of high-probability solar candidates across a large area.
Typical Presets
- higher candidate_threshold for only top candidates.
- lower threshold plus larger max_candidate_sites for exploratory workflows.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| dem | no | Digital elevation model used as the terrain reference surface. |
| candidate_threshold | no | Minimum suitability threshold required for candidate-site extraction. |
| max_candidate_sites | no | Upper limit on the number of candidate sites emitted in vector output. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| suitability_score | GeoTIFF | Core suitability score raster produced by the model. |
| visual_impact | GeoTIFF | Visual-impact raster used to screen candidate sites and stakeholder constraints. |
| candidate_sites | GeoPackage | Vector candidate-site features passing selection thresholds. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
When sweep_spec is supplied, the workflow also emits run_matrix_summary, sensitivity_report, sensitivity_report_html, and stability_map. The sensitivity report includes metrics.primary_metric, metrics.primary_relative_span, and metrics.stability_class (high, medium, low), while stability_map uses classes 3=high, 2=medium, 1=low.
Python Example
`import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

suit, vis, sites, summary = wbe.solar_site_suitability_analysis(
    dem="data/dem.tif",
    candidate_threshold=0.7,
    max_candidate_sites=200,
    output_prefix="output/solar_siting",
)

print(suit)
print(vis)
print(sites)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Corridor Mapping Intelligence
Function name: corridor_mapping_intelligence
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Corridor Mapping and Route Planning
Problem It Solves
What is the terrain-optimal route for this linear infrastructure, and what alternative corridor band exists within an acceptable cost margin?
Who It Is For
- Infrastructure planners, environmental consultants, and natural resource agencies assessing route options for roads, pipelines, or utility lines.
Primary User
Forestry companies, energy utilities, and environmental regulatory consultancies evaluating new linear infrastructure corridors.
What It Does
- Builds a terrain impedance cost surface from DEM-derived slope and roughness.
- Finds the least-cost path between two user-specified endpoints using Dijkstra accumulation.
- Delineates a corridor suitability band of near-optimal alternative routes.
- Supports polygon exclusion zones (steep terrain, protected areas, existing infrastructure).
How It Works
- Computes per-cell cost from slope (and optionally local roughness) normalised to [0, 1].
- Runs Dijkstra from both start and end to accumulate cost surfaces.
- Traces the least-cost path back from end to start through predecessor pointers.
- Corridor band: cells where acc_from_start + acc_from_end ≤ optimal_cost × (1 + tolerance).
- Indicative formula: cost ~= w_slope × slope_norm + w_rough × roughness_norm (profile-controlled weights).
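The two-sided accumulation and the corridor-band rule can be sketched on a small grid with the standard library's `heapq`. This is a simplified illustration, not the tool's implementation: it assumes 4-connected moves, charges each move the average cost of the two cells involved, and omits exclusion zones and the predecessor trace-back. The `accumulate` and `corridor_band` names are hypothetical.

```python
import heapq
from typing import List, Tuple

Grid = List[List[float]]
Cell = Tuple[int, int]


def accumulate(cost: Grid, source: Cell) -> Grid:
    """Dijkstra accumulated cost from source over a 4-connected grid."""
    rows, cols = len(cost), len(cost[0])
    acc = [[float("inf")] * cols for _ in range(rows)]
    acc[source[0]][source[1]] = 0.0
    heap = [(0.0, source)]
    while heap:
        dist, (r, c) = heapq.heappop(heap)
        if dist > acc[r][c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumption: a move costs the mean of the two cell costs.
                new_dist = dist + 0.5 * (cost[r][c] + cost[nr][nc])
                if new_dist < acc[nr][nc]:
                    acc[nr][nc] = new_dist
                    heapq.heappush(heap, (new_dist, (nr, nc)))
    return acc


def corridor_band(cost: Grid, start: Cell, end: Cell, tolerance: float):
    """Cells where acc_from_start + acc_from_end <= optimal_cost * (1 + tolerance)."""
    from_start = accumulate(cost, start)
    from_end = accumulate(cost, end)
    optimal = from_start[end[0]][end[1]]
    limit = optimal * (1.0 + tolerance)
    band = [
        [from_start[r][c] + from_end[r][c] <= limit for c in range(len(cost[0]))]
        for r in range(len(cost))
    ]
    return optimal, band


# High-cost cells in the middle row force the route around the right-hand side.
cost_surface = [
    [0.1, 0.1, 0.1],
    [0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1],
]
optimal_cost, band = corridor_band(cost_surface, start=(0, 0), end=(2, 0), tolerance=0.15)
```

Because the move cost is symmetric, the sum of the two accumulated surfaces at any cell equals the cost of the cheapest start-to-end route passing through that cell, which is what makes the single inequality sufficient to delineate the band.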
Why It Wins
- Compared with OSS least-cost building blocks (cost_distance + cost_pathway + cost_allocation), this tool is an end-to-end siting workflow: it derives the terrain cost surface from the DEM, computes the optimal route, delineates near-optimal corridor alternatives, and packages decision-ready outputs in one run.
- Why it wins vs OSS least-cost tools:
- Requires fewer analyst preparation steps (no separate friction-raster engineering pipeline required).
- Returns vector route geometry with engineering-friendly attributes, not just a path raster.
- Produces a corridor suitability band for option-space review, not only a single least-cost trace.
- Supports polygon exclusion constraints directly in the routing workflow.
- Emits a machine-readable summary contract for reproducible QA/reporting integration.
Typical Buying Trigger
A feasibility study needs a first-pass route alignment and suitability band for stakeholder review before field surveys.
Cost profiles:
- slope_only: slope-only impedance; fastest, suitable for quick screening.
- slope_roughness: balanced slope + roughness blend (default); recommended for road and pipeline siting.
- conservative: equal slope/roughness weighting; preferred for sensitive terrain or pipelines.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| dem | no | Input DEM raster path. |
| start_features | no | Start feature vector path (point or polygon). |
| end_features | no | End feature vector path (point or polygon). |
| constraints | yes | Optional exclusion zone vector path (polygons to avoid). |
| cost_profile | yes | Cost weighting profile: slope_only, slope_roughness, or conservative. Default: slope_roughness. |
| terminal_anchor_strategy | yes | Polygon terminal anchor mode: mixed, centroid_only, or boundary_only. Default: mixed. |
| corridor_tolerance | yes | Fraction above optimal cost included in corridor suitability band. Default: 0.15. |
| output_prefix | yes | Output prefix for generated artifacts. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| cost_surface | GeoTIFF | Normalized terrain impedance surface [0-1]. |
| accumulated_cost | GeoTIFF | Dijkstra accumulated cost from selected start anchor. |
| optimal_route | GeoPackage | Least-cost route LineString with route metrics. |
| corridor_suitability | GeoTIFF | Binary suitability band (1 = within tolerance). |
| summary | JSON | Summary contract with metrics, selected anchors, and harmonization metadata. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

cost, acc_cost, route, suitability, summary = wbe.corridor_mapping_intelligence(
    dem="data/dem.tif",
    start_features="data/start_access_points.gpkg",
    end_features="data/forest_block_targets.gpkg",
    cost_profile="slope_roughness",
    terminal_anchor_strategy="mixed",
    corridor_tolerance=0.15,
    output_prefix="output/access_road",
)

print(route)        # path to optimal_route.gpkg
print(suitability)  # path to corridor suitability raster
print(summary)      # path to summary JSON`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Landslide Susceptibility Assessment
Function name: landslide_susceptibility_assessment
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Landslide Susceptibility Assessment
Problem It Solves
Which areas present elevated slope-failure risk today, and where is trigger pressure most concerning?
Who It Is For
- Hazard analysts, corridor planners, and geotechnical screening teams.
Primary User
Geological surveys, transportation agencies, and hazard consulting teams.
What It Does
- Estimates terrain-driven landslide susceptibility from slope/curvature context.
- Integrates optional rainfall pressure to strengthen trigger interpretation.
- Produces susceptibility, trigger-pressure, and confidence rasters with contract summary diagnostics.
How It Works
- Computes local slope and roughness/curvature proxies from DEM neighborhood derivatives.
- Blends terrain susceptibility with optional rainfall intensity to form trigger pressure.
- Applies profile-weighted scaling and thresholding to output susceptibility, confidence, and risk zones.
- Indicative formula: susceptibility ~= w1 * slope_term + w2 * roughness_term; trigger ~= susceptibility * (1 + rainfall_term).
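The blend of terrain susceptibility and rainfall forcing in the indicative formula can be sketched for a single cell as below. The weights `w1` and `w2` are hypothetical defaults chosen for illustration, both helper names are illustrative, and treating the threshold as inclusive is an assumption.

```python
from typing import Tuple


def susceptibility_and_trigger(
    slope_term: float,
    roughness_term: float,
    rainfall_term: float = 0.0,
    w1: float = 0.7,  # hypothetical slope weight
    w2: float = 0.3,  # hypothetical roughness weight
) -> Tuple[float, float]:
    """susceptibility = w1*slope + w2*roughness; trigger = susceptibility*(1 + rainfall)."""
    susceptibility = w1 * slope_term + w2 * roughness_term
    trigger = susceptibility * (1.0 + rainfall_term)
    return susceptibility, trigger


def is_risk_zone_candidate(susceptibility: float, susceptibility_threshold: float) -> bool:
    """Assumption: a cell qualifies when it meets or exceeds the threshold."""
    return susceptibility >= susceptibility_threshold


sus, trig = susceptibility_and_trigger(slope_term=0.5, roughness_term=0.5, rainfall_term=0.2)
print(round(sus, 2), round(trig, 2), is_risk_zone_candidate(sus, 0.65))
```

With no rainfall input the rainfall term is zero and trigger pressure reduces to the terrain susceptibility itself, which is consistent with rainfall being an optional input.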
Why It Wins
- Provides reproducible hazard screening outputs with structured summary metrics for downstream reporting.
Typical Buying Trigger
A public or infrastructure program requires defensible first-pass landslide risk zoning.
Typical Presets
- fast for rapid regional screening.
- balanced for standard hazard screening.
- conservative for stricter slope/curvature sensitivity.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| dem | no | Digital elevation model used as the terrain reference surface. |
| rainfall_intensity | yes | Optional rainfall forcing raster used to refine landslide trigger pressure estimation. |
| profile (fast, balanced, or conservative) | no | Processing profile controlling sensitivity, quality strictness, and runtime tradeoffs. |
| susceptibility_threshold | no | Threshold used to convert continuous susceptibility into risk-zone candidates. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| susceptibility | GeoTIFF | Landslide susceptibility raster used to identify unstable terrain. |
| trigger_pressure | GeoTIFF | Trigger-pressure raster indicating forcing required to initiate failures. |
| confidence | GeoTIFF | Confidence layer quantifying reliability of modeled outputs. |
| risk_zones | GeoPackage | Priority vector polygons highlighting intervention zones. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

sus, trig, conf, zones, summary = wbe.landslide_susceptibility_assessment(
    dem="data/dem.tif",
    rainfall_intensity="data/rainfall_intensity.tif",
    profile="balanced",
    susceptibility_threshold=0.65,
    max_zone_features=5000,
    output_prefix="output/landslide",
)

print(sus)
print(trig)
print(conf)
print(zones)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
River Corridor Health Assessment
Function name: river_corridor_health_assessment
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
River Corridor Health Assessment
Problem It Solves
Which river reaches are stable versus at-risk, and where should restoration action be prioritized?
Who It Is For
- Watershed restoration teams, river health analysts, and permit support consultants.
Primary User
Watershed councils, conservation authorities, and environmental consulting groups.
What It Does
- Derives erosion-pressure context from terrain.
- Scores stream reaches by sampled corridor pressure.
- Emits restoration-priority line outputs for intervention planning.
How It Works
- Calculates per-pixel erosion pressure from slope and local roughness terms.
- Samples stream vertices over the erosion raster and derives reach metrics (including high quantiles).
- Converts reach scores into health classes and restoration action categories.
- Indicative formula: erosion ~= a * slope_norm + b * roughness_norm; health ~= 1 - mean(erosion_samples_along_reach).
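The per-pixel erosion term and the reach-level health score can be sketched as follows. The coefficients `a` and `b` are hypothetical, stream vertices are represented as (row, col) grid indices rather than map coordinates to keep the example self-contained, and both helper names are illustrative.

```python
from statistics import mean
from typing import List, Tuple

Grid = List[List[float]]


def erosion_pressure(slope_norm: Grid, roughness_norm: Grid, a: float = 0.6, b: float = 0.4) -> Grid:
    """erosion = a * slope_norm + b * roughness_norm, cell by cell (a, b hypothetical)."""
    return [
        [a * s + b * r for s, r in zip(s_row, r_row)]
        for s_row, r_row in zip(slope_norm, roughness_norm)
    ]


def reach_health(erosion: Grid, vertices: List[Tuple[int, int]]) -> float:
    """health = 1 - mean of erosion samples taken at the reach's vertices."""
    samples = [erosion[r][c] for r, c in vertices]
    return 1.0 - mean(samples)


slope_norm = [[0.0, 0.5], [1.0, 0.5]]
roughness_norm = [[0.0, 0.5], [1.0, 0.5]]
erosion = erosion_pressure(slope_norm, roughness_norm)

# One reach running along the top row of the grid.
health = reach_health(erosion, vertices=[(0, 0), (0, 1)])
print(round(health, 2))
```

A reach that only crosses low-pressure cells scores close to 1, while a reach sampled entirely over high-pressure cells approaches 0.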
Why It Wins
- Combines raster pressure context and attributed reach-level restoration outputs in one workflow.
Typical Buying Trigger
A watershed plan needs rapid reach triage with GIS-ready outputs for restoration budgeting.
Typical Presets
- fast for rapid watershed screening.
- balanced for default corridor scoring.
- conservative for roughness-sensitive erosion scoring.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| dem | no | Digital elevation model used as the terrain reference surface. |
| streams | no | Vector stream network used for corridor health and restoration targeting. |
| profile (fast, balanced, or conservative) | no | Processing profile controlling sensitivity, quality strictness, and runtime tradeoffs. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| erosion_pressure | GeoTIFF | Erosion pressure raster for river corridor condition assessment. |
| corridor_confidence | GeoTIFF | Confidence surface for river corridor health interpretation. |
| stream_health_score | GeoPackage | Stream-reach health score output summarizing corridor condition. |
| restoration_zones | GeoPackage | Priority vector polygons highlighting restoration intervention zones. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

erosion, confidence, health, restoration, summary = wbe.river_corridor_health_assessment(
    dem="data/dem.tif",
    streams="data/streams.gpkg",
    profile="balanced",
    output_prefix="output/river_health",
)

print(erosion)
print(confidence)
print(health)
print(restoration)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Baseline Matching And Diagnostics Assessment
Function name: baseline_matching_and_diagnostics_assessment
No help documentation available for this tool.
Carbon Sequestration Verification Audit
Function name: carbon_sequestration_verification_audit
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Carbon Verification Audit
Problem It Solves
Where did vegetation-linked carbon proxy increase or decrease, and where is confidence high enough for audit triage?
Who It Is For
- Carbon program analysts, forest monitoring teams, and environmental verification workflows.
Primary User
Carbon project developers, ESG/MRV teams, and land-management compliance programs.
What It Does
- Quantifies baseline-to-current vegetation change using NDVI deltas.
- Derives a carbon-proxy change surface with confidence scoring.
- Optionally blends LiDAR biomass proxy inputs to strengthen stand-level interpretation.
- Produces verification zone polygons and an audit-ready contract output for MRV workflows.
How It Works
- Extracts baseline and current red/NIR bands from multiband bundles.
- Computes per-date NDVI and signed delta: NDVI_current - NDVI_baseline.
- Converts NDVI delta to a carbon-proxy index and computes confidence from vegetation signal and change magnitude.
- Optionally blends in biomass proxy values to improve relative stand-level carbon interpretation.
- Aggregates pixels into verification blocks and assigns zone-level change class labels (gain, loss, unchanged).
- Indicative formula: NDVI = (NIR - Red) / (NIR + Red), carbon_proxy ~= 10 * (NDVI_current - NDVI_baseline).
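The NDVI and carbon-proxy formulas above can be sketched per pixel as below. The NDVI definition and the factor of 10 come from the indicative formula; the zero-denominator handling and the class threshold of 0.5 are assumptions made only for this illustration, and the helper names are hypothetical.

```python
def ndvi(nir: float, red: float) -> float:
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are zero (assumption)."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else 0.0


def carbon_proxy(nir_base: float, red_base: float, nir_cur: float, red_cur: float) -> float:
    """carbon_proxy = 10 * (NDVI_current - NDVI_baseline)."""
    return 10.0 * (ndvi(nir_cur, red_cur) - ndvi(nir_base, red_base))


def change_class(proxy: float, threshold: float = 0.5) -> str:
    """Label a proxy value; the threshold is hypothetical, not a documented default."""
    if proxy > threshold:
        return "gain"
    if proxy < -threshold:
        return "loss"
    return "unchanged"


# Baseline NDVI is about 0.33 and current NDVI is 0.5 for this pixel.
proxy = carbon_proxy(nir_base=0.4, red_base=0.2, nir_cur=0.6, red_cur=0.2)
print(round(proxy, 2), change_class(proxy))
```

The sign of the proxy follows the sign of the NDVI delta, so vegetation loss yields negative values and the same threshold can be applied symmetrically for gain and loss.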
Why It Wins
- Combines change mapping, confidence scoring, zone-level vector outputs, and an audit contract in one reproducible run.
Typical Buying Trigger
Teams need standardized remote-sensing evidence packages ahead of formal carbon verification reporting.
Typical Presets
- conservative for stricter gain/loss interpretation thresholds.
- balanced for default operational monitoring.
- aggressive for early-signal monitoring workflows.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| baseline_bundle, current_bundle | no | Baseline and current multiband rasters used for NDVI change analysis. |
| baseline_red_band_index, baseline_nir_band_index | yes | Baseline red/NIR band indices. Defaults: 0, 1. |
| current_red_band_index, current_nir_band_index | yes | Current red/NIR band indices. Defaults: 0, 1. |
| biomass_proxy | yes | Optional LiDAR-derived biomass proxy raster for blended carbon-proxy interpretation. |
| profile (conservative, balanced, or aggressive) | yes | Sensitivity profile controlling gain/loss thresholds and confidence blending behavior. |
| zone_block_cells | yes | Pixel block size used for verification zone polygon aggregation. Defaults to 16. |
| mrv_template | yes | MRV profile: verra_vcs_vm0010, american_carbon_registry, gold_standard, or none. |
| methodology_reference | yes | Optional methodology lineage note for audit metadata (for example Verra references). |
| output_prefix | yes | Prefix used to name output artifacts. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| ndvi_baseline | GeoTIFF | Baseline NDVI surface (*_ndvi_baseline.tif). |
| ndvi_current | GeoTIFF | Current NDVI surface (*_ndvi_current.tif). |
| ndvi_delta | GeoTIFF | Signed NDVI change surface (*_ndvi_delta.tif). |
| carbon_proxy | GeoTIFF | Carbon-proxy change index (*_carbon_proxy.tif). |
| change_confidence | GeoTIFF | Confidence raster for change interpretation (*_change_confidence.tif). |
| verification_zones | GeoPackage | Block-aggregated verification polygons with change class and summary attributes (*_verification_zones.gpkg). |
| audit_contract | JSON | Machine-readable audit summary contract (*_audit_contract.json). |
| compliance_evidence_packet | JSON | Submission-oriented compliance evidence packet (*_compliance_evidence_packet.json). |
| regulator_ready_table | CSV | Flat regulator-ready summary table (*_regulator_ready_table.csv). |
| html_report | HTML | Human-readable report generated from the contract for stakeholder review. |
Python Example
`import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

result = wbe.carbon_sequestration_verification_audit(
    baseline_bundle="data/baseline_multiband.tif",
    current_bundle="data/current_multiband.tif",
    baseline_red_band_index=0,
    baseline_nir_band_index=1,
    current_red_band_index=0,
    current_nir_band_index=1,
    biomass_proxy="data/biomass_proxy.tif",
    profile="balanced",
    zone_block_cells=16,
    mrv_template="verra_vcs_vm0010",
    methodology_reference="Verra VM0010 v1.3",
    output_prefix="output/carbon_audit",
)

print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Wildfire Fuel Loading And Risk Matrix
Function name: wildfire_fuel_loading_and_risk_matrix
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Wildfire Fuel Risk Analysis
Problem It Solves
Where are the highest-priority fuel and spread-risk areas, and what dominant fuel conditions drive those zones?
Who It Is For
- Wildfire planning teams, utility risk programs, and fuels management analysts.
Primary User
Utility wildfire mitigation teams, land-management agencies, and hazard/risk operations groups.
What It Does
- Classifies sparse, surface, ladder, and canopy fuel classes from optical and optional structure inputs.
- Computes moisture index and ladder-fuel continuity diagnostics.
- Builds a terrain-amplified wildfire risk matrix with zone-level risk tiers.
- Emits risk-zone vector outputs and summary contracts for operations planning.
How It Works
- Extracts required optical bands and computes NDMI (if SWIR available) or NDWI proxy fallback.
- Uses NDVI, moisture thresholds, and optional biomass proxy to assign fuel class per cell.
- Computes ladder continuity and combines fuel risk with optional slope/aspect spread amplifiers.
- Produces a clipped risk score in [0,1] and polygonized risk zones with dominant fuel and risk tier.
- Indicative formula: risk ~= (base_fuel_risk + dryness_boost) * slope_amp * aspect_amp.
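The risk formula and the clipping to [0, 1] can be sketched for one cell as below. Defaulting the slope and aspect amplifiers to 1.0 when those optional rasters are absent is an assumption, and `wildfire_risk` is an illustrative helper rather than part of the Whitebox API.

```python
def wildfire_risk(
    base_fuel_risk: float,
    dryness_boost: float,
    slope_amp: float = 1.0,   # assumption: neutral when no slope raster is given
    aspect_amp: float = 1.0,  # assumption: neutral when no aspect raster is given
) -> float:
    """risk = (base_fuel_risk + dryness_boost) * slope_amp * aspect_amp, clipped to [0, 1]."""
    raw = (base_fuel_risk + dryness_boost) * slope_amp * aspect_amp
    return min(1.0, max(0.0, raw))


flat_cell = wildfire_risk(base_fuel_risk=0.25, dryness_boost=0.25)
steep_cell = wildfire_risk(base_fuel_risk=0.5, dryness_boost=0.25, slope_amp=2.0)
print(flat_cell, steep_cell)
```

The clip keeps strongly amplified cells comparable with the rest of the surface, at the cost of flattening differences among cells that all exceed 1.0 before clipping.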
Why It Wins
- Packages moisture, structure, terrain amplification, and zone-level risk outputs in a single reproducible workflow.
Typical Buying Trigger
Seasonal mitigation planning and risk-tier prioritization require consistent spatial evidence outputs.
Typical Presets
- conservative for lower sensitivity and tighter risk escalation.
- balanced for default planning workflows.
- aggressive for early-warning and preventative treatment prioritization.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| optical_bundle | no | Multispectral raster containing red/NIR and optionally SWIR bands. |
| red_band_index, nir_band_index | yes | Red and NIR band indices. Defaults: 0, 1. |
| swir_band_index | yes | Optional SWIR index; enables NDMI moisture estimation when available. |
| biomass_proxy | yes | Optional LiDAR biomass/height proxy used to refine fuel class and ladder continuity signals. |
| slope, aspect | yes | Optional terrain slope/aspect rasters used for spread amplification terms. |
| profile (conservative, balanced, or aggressive) | yes | Sensitivity profile controlling class/risk thresholds. |
| zone_block_cells | yes | Pixel block size used for risk-zone polygon aggregation. Defaults to 20. |
| output_prefix | yes | Prefix used to name output artifacts. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| moisture_index | GeoTIFF | NDMI/NDWI moisture response raster (*_moisture_index.tif). |
| fuel_load_class | GeoTIFF | Fuel class code raster (sparse/surface/ladder/canopy) (*_fuel_load_class.tif). |
| ladder_fuel_continuity | GeoTIFF | Ladder continuity index surface (*_ladder_fuel_continuity.tif). |
| risk_matrix | GeoTIFF | Terrain-amplified wildfire risk score raster (*_risk_matrix.tif). |
| risk_zones | GeoPackage | Aggregated risk polygons with tier and dominant fuel attributes (*_risk_zones.gpkg). |
| summary | JSON | Machine-readable summary contract (*_summary.json). |
| html_report | HTML | Human-readable report generated from the summary contract for review and QA traceability. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.wildfire_fuel_loading_and_risk_matrix(
    optical_bundle="data/optical_multiband.tif",
    red_band_index=0,
    nir_band_index=1,
    swir_band_index=2,
    biomass_proxy="data/biomass_proxy.tif",
    slope="data/slope.tif",
    aspect="data/aspect.tif",
    profile="balanced",
    zone_block_cells=20,
    output_prefix="output/wildfire_risk",
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Mine Site Reclamation Compliance Tracker
Function name: mine_site_reclamation_compliance_tracker
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Mine Reclamation Compliance Tracker
Problem It Solves
Are reclamation milestones being achieved spatially, and where are intervention priorities for compliance closure?
Who It Is For
- Mine closure teams, compliance analysts, and environmental reporting operations.
Primary User
Mine operators, reclamation contractors, and regulatory compliance monitoring groups.
What It Does
- Tracks vegetation recovery between baseline and current imagery.
- Computes per-cell reclamation progress against target NDVI milestone.
- Optionally evaluates slope stability milestone compliance.
- Produces compliance-zone outputs and a contract-ready compliance summary.
How It Works
- Extracts baseline/current red and NIR bands and computes per-date NDVI.
- Computes recovery delta and normalized progress toward target NDVI threshold.
- Optionally evaluates slope raster against maximum acceptable slope angle.
- Assigns milestone statuses and an overall compliance grade summary.
- Aggregates per-cell progress into compliance zones with pass/conditional/fail class attributes.
- Indicative formula: progress ~= (NDVI_current - NDVI_baseline) / (NDVI_target - NDVI_baseline), clipped to [0,1].
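The progress formula above can be checked by hand with a small per-cell function. This is a sketch under stated assumptions: the function name is illustrative, the default target of 0.35 is taken from the documented `reclamation_target_ndvi` default, and treating a baseline that already meets the target as complete is an assumption rather than documented behaviour.

```python
def reclamation_progress(ndvi_baseline: float, ndvi_current: float,
                         ndvi_target: float = 0.35) -> float:
    """Normalized progress toward the target NDVI, clipped to [0, 1]."""
    span = ndvi_target - ndvi_baseline
    if span <= 0:
        # Assumption: a baseline at or above the target counts as complete.
        return 1.0
    progress = (ndvi_current - ndvi_baseline) / span
    return min(1.0, max(0.0, progress))


# Hypothetical cells: partial recovery, regression, and overshoot.
for baseline, current in [(0.10, 0.25), (0.10, 0.05), (0.10, 0.50)]:
    print(baseline, current, round(reclamation_progress(baseline, current), 2))
```

The clipping means regression below the baseline reports as 0 and recovery beyond the target reports as 1, so the raw recovery delta is still needed to distinguish those cases.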
Why It Wins
- Integrates vegetation recovery, optional slope milestone evaluation, zone outputs, and compliance-grade contract reporting in one run.
Typical Buying Trigger
Regulatory milestone reporting cycles require reproducible evidence and zone-level compliance diagnostics.
Typical Presets
- spectral-only for vegetation recovery milestone tracking.
- full compliance mode with slope stability for stricter regulatory submissions.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| baseline_bundle, current_bundle | no | Baseline and current multiband rasters used for NDVI recovery analysis. |
| baseline_red_band_index, baseline_nir_band_index | yes | Baseline red/NIR band indices. Defaults: 0, 1. |
| current_red_band_index, current_nir_band_index | yes | Current red/NIR band indices. Defaults: 0, 1. |
| slope | yes | Optional slope raster (degrees) for stability milestone evaluation. |
| reclamation_target_ndvi | yes | Target NDVI threshold used for vegetation milestone scoring. Defaults to 0.35. |
| slope_stability_max_deg | yes | Maximum acceptable slope for stability compliance. Defaults to 30.0. |
| jurisdiction | yes | Compliance template: us_federal_mtbs, us_california_mining, us_pennsylvania_coal, aus_western_australia, canada_bc_mines, south_africa_dmre, none. |
| site_name | yes | Optional site name or permit identifier in contract outputs. |
| has_hydrology_evidence, has_soil_ph_evidence, has_perennial_vegetation_evidence | yes | Boolean evidence flags used by submission-readiness diagnostics. |
| report_interval_months | yes | Reported cadence in months (1-120) compared against template expectations. |
| zone_block_cells | yes | Pixel block size used for compliance-zone aggregation. Defaults to 20. |
| output_prefix | yes | Prefix used to name output artifacts. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| ndvi_baseline | GeoTIFF | Baseline NDVI raster (*_ndvi_baseline.tif). |
| ndvi_current | GeoTIFF | Current NDVI raster (*_ndvi_current.tif). |
| vegetation_recovery | GeoTIFF | NDVI recovery delta raster (*_vegetation_recovery.tif). |
| reclamation_progress | GeoTIFF | Normalized progress-to-target raster (*_reclamation_progress.tif). |
| compliance_zones | GeoPackage | Zone-level compliance polygons with progress/recovery attributes (*_compliance_zones.gpkg). |
| compliance_contract | JSON | Machine-readable compliance summary contract (*_compliance_contract.json). |
| validation_diagnostics | JSON | Evidence completeness and warning diagnostics (*_validation_diagnostics.json). |
| html_report | HTML | Human-readable report generated from compliance contract outputs. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.mine_site_reclamation_compliance_tracker(
    baseline_bundle="data/mine_baseline.tif",
    current_bundle="data/mine_current.tif",
    baseline_red_band_index=0,
    baseline_nir_band_index=1,
    current_red_band_index=0,
    current_nir_band_index=1,
    slope="data/slope.tif",
    reclamation_target_ndvi=0.35,
    slope_stability_max_deg=30.0,
    jurisdiction="canada_bc_mines",
    site_name="Mine Site Alpha",
    has_hydrology_evidence=True,
    has_soil_ph_evidence=True,
    has_perennial_vegetation_evidence=True,
    report_interval_months=12,
    zone_block_cells=20,
    output_prefix="output/reclamation",
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Terrain Constraint And Conflict Analysis
Function name: terrain_constraint_and_conflict_analysis
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Terrain Constraint and Conflict Analysis
Problem It Solves
Where are terrain constraints likely to create permitting, access, or design conflicts before field mobilization?
Who It Is For
- Infrastructure siting teams and engineering predesign analysts.
Primary User
Renewable developers, utilities, and civil engineering consultancies.
What It Does
- Builds a harmonized terrain conflict score from slope and optional risk overlays.
- Produces conflict classes for early-stage siting and routing screening.
How It Works
- Uses DEM-derived slope as the primary terrain constraint signal.
- Harmonizes optional wetness, flood-risk, and landcover-penalty rasters to the DEM grid.
- Blends constraints into a normalized conflict score and class map.
- QA acceptance guidance:
  - status=pass indicates the conflict fraction remained below review threshold.
  - diagnostics.acceptance_thresholds.high_conflict_fraction_review is the baseline threshold for escalation.
  - Elevated summary.high_conflict_fraction should trigger engineering review before downstream routing/siting commitments.
- MVP hardening assets:
  - Benchmark scaffold: tests/fixtures/terrain_siting_benchmark/
  - Promotion guide: docs/internal/development/TERRAIN_PRECISION_AG_BENCHMARK_PROMOTION_GUIDE_2026_04_14.md
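The QA acceptance guidance above could be applied to the summary JSON along these lines. This is a sketch only: the field names follow the dotted paths quoted in the guidance, but the exact nesting of the summary document, the example values, and the helper name `needs_engineering_review` are assumptions.

```python
import json


def needs_engineering_review(contract: dict) -> bool:
    """Flag a run whose status is not 'pass' or whose high-conflict fraction
    reaches the review threshold (assumed JSON layout)."""
    thresholds = contract["diagnostics"]["acceptance_thresholds"]
    threshold = thresholds["high_conflict_fraction_review"]
    fraction = contract["summary"]["high_conflict_fraction"]
    return contract.get("status") != "pass" or fraction >= threshold


# Hypothetical summary content; in practice this would be read from the summary JSON output.
example = json.loads("""
{
  "status": "pass",
  "summary": {"high_conflict_fraction": 0.12},
  "diagnostics": {"acceptance_thresholds": {"high_conflict_fraction_review": 0.25}}
}
""")
print(needs_engineering_review(example))
```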
Inputs
| Parameter | Optional | Description |
|---|---|---|
| dem | no | Reference DEM raster for terrain conflict analysis. |
| wetness | yes | Optional wetness raster normalized to [0,1]. |
| flood_risk | yes | Optional flood-risk raster normalized to [0,1]. |
| landcover_penalty | yes | Optional landcover penalty raster normalized to [0,1]. |
| slope_limit_deg | yes | Slope threshold where terrain conflict accelerates. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| conflict_score | GeoTIFF | Continuous terrain conflict score [0,1]. |
| conflict_class | GeoTIFF | Discrete conflict class raster for planning triage. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
conflict, classes, summary = wbe.terrain_constraint_and_conflict_analysis(
    dem="data/dem.tif",
    wetness="data/wetness.tif",
    flood_risk="data/flood_risk.tif",
    slope_limit_deg=15.0,
    output_prefix="output/terrain_conflict",
)
print(conflict)
print(classes)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Terrain Constructability And Cost Analysis
Function name: terrain_constructability_and_cost_analysis
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Terrain Constructability and Cost Analysis
Problem It Solves
Which areas are practically constructible at lower relative cost before detailed design?
Who It Is For
- Engineering predesign teams and capital planning groups.
Primary User
Infrastructure developers, engineering firms, and utility planning teams.
What It Does
- Converts terrain and optional risk/cost context into constructability and cost-class surfaces.
- Supports quick predesign ranking of lower-cost, lower-risk development zones.
How It Works
- Computes slope from DEM and harmonizes optional conflict, wetness, and access-cost rasters.
- Blends penalties into a constructability score and relative cost class output.
- QA acceptance guidance:
  - status=pass indicates high-cost fraction is within baseline tolerance.
  - diagnostics.acceptance_thresholds.high_cost_fraction_review defines review escalation threshold.
  - Use summary.high_cost_fraction with local access constraints before capital-stage budgeting decisions.
- MVP hardening assets:
  - Benchmark scaffold: tests/fixtures/terrain_siting_benchmark/
  - Promotion guide: docs/internal/development/TERRAIN_PRECISION_AG_BENCHMARK_PROMOTION_GUIDE_2026_04_14.md
Inputs
| Parameter | Optional | Description |
|---|---|---|
| dem | no | Reference DEM raster for constructability scoring. |
| existing_conflict | yes | Optional prior terrain conflict raster normalized to [0,1]. |
| wetness | yes | Optional wetness raster normalized to [0,1]. |
| access_cost | yes | Optional access-friction raster normalized to [0,1]. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| constructability_score | GeoTIFF | Constructability score raster [0,1] (higher is better). |
| cost_class | GeoTIFF | Relative cost class raster (1 low cost to 5 high cost). |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
constructability, cost_class, summary = wbe.terrain_constructability_and_cost_analysis(
    dem="data/dem.tif",
    existing_conflict="output/terrain_conflict_conflict_score.tif",
    access_cost="data/access_friction.tif",
    output_prefix="output/terrain_constructability",
)
print(constructability)
print(cost_class)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Utility Corridor Encroachment Intelligence
Function name: utility_corridor_encroachment_intelligence
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Utility Corridor Encroachment Detection
Problem It Solves
Which corridor segments present the highest near-term encroachment risk, and which assets should be prioritized first?
Who It Is For
- Utility vegetation-management teams and corridor maintenance planners.
Primary User
Transmission and distribution operators, utility contractors, and corridor asset-management groups.
What It Does
- Builds corridor-adjacent encroachment risk surfaces from LiDAR-derived canopy structure.
- Produces priority zones and per-asset risk summaries for maintenance planning.
How It Works
- Classifies vegetation and ground structure, then computes height-above-ground and local point-density context.
- Builds a corridor-relative risk surface that emphasizes canopy proximity, density, and confidence-weighted structure signals.
- Applies profile-based sensitivity settings and emits thresholded, ranked priority zones for maintenance triage.
- Indicative formula: risk ~= f(proximity_to_corridor, canopy_height, local_density, classification_confidence).
Why It Wins
- Produces both spatial priority zones and asset-level risk tables, enabling field-ready scheduling rather than map-only screening.
Typical Buying Trigger
Seasonal vegetation cycles or reliability programs require objective, repeatable encroachment prioritization across large networks.
Typical Presets
- fast for broad network screening.
- balanced for default operational planning.
- strict for conservative risk escalation and tighter action thresholds.
- Operational controls:
  - profile: fast | balanced | strict.
  - priority_zone_threshold and max_zone_features: bound priority-zone density for operational triage.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| input (LAS/LAZ) | no | Input LiDAR point cloud used to derive QA, terrain, structure, or encroachment products. |
| corridors and asset_features | yes | Optional corridor and asset vectors used to focus encroachment risk analysis. |
| profile (fast, balanced, or strict) | no | Operational profile controlling sensitivity and QA strictness for risk workflows. |
| priority_zone_threshold | no | Risk threshold used to classify high-priority encroachment zones. |
| max_zone_features | no | Upper cap on number of output zone features to control product size. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| encroachment_risk | GeoTIFF | Encroachment risk surface used for corridor maintenance prioritization. |
| corridor_priority_zones | GeoPackage | Vector priority zones for field action planning. |
| asset_risk_table | GeoPackage | Vector table/layer with per-asset encroachment risk summaries. |
| classification_confidence | GeoTIFF | Confidence surface for LiDAR-derived classification and risk outputs. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
risk, zones, assets, conf, summary = wbe.utility_corridor_encroachment_intelligence(
    input="data/corridor_points.laz",
    corridors="data/corridors.gpkg",
    asset_features="data/assets.gpkg",
    profile="balanced",
    priority_zone_threshold=0.75,
    max_zone_features=5000,
    output_prefix="output/utility_corridor",
)
print(risk)
print(zones)
print(assets)
print(conf)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Forestry Structure And Biomass Intelligence
Function name: forestry_structure_and_biomass_intelligence
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Forestry Structure and Biomass Analysis
Problem It Solves
Where are high-structure and high-biomass stands concentrated, and how confident are those estimates for inventory decisions?
Who It Is For
- Forest inventory analysts, carbon accounting teams, and silviculture planning groups.
Primary User
Forestry agencies, carbon-market project developers, and natural resource consultancies.
What It Does
- Generates canopy structure, vertical class, and biomass-proxy products from LiDAR.
- Produces stand-level structure units and confidence diagnostics for inventory workflows.
- Requires Pro runtime visibility (include_pro=True, tier='pro' or higher).
How It Works
- Grids LiDAR to derive canopy-height, density, and ground surfaces, then computes CHM-style structure metrics.
- Classifies canopy structure from adaptive height thresholds and density support into vertical strata classes.
- Scales a terrain-adaptive biomass proxy with configurable cap control.
- Aggregates stand-level structure units and confidence indicators for inventory and monitoring.
- Indicative formula: biomass_proxy ~= g(canopy_height, density_support, structure_class, terrain_relief), clamped by biomass_cap.
Why It Wins
- Combines structure classes, biomass proxy, and stand-unit outputs in one reproducible workflow suitable for operational reporting.
Typical Buying Trigger
Inventory refresh, carbon-baseline updates, or stand treatment planning needs consistent structure and biomass surfaces.
Typical Presets
- fast for regional reconnaissance.
- balanced for default inventory support.
- strict for conservative structure/banding and confidence interpretation.
- Operational controls:
  - profile: fast | balanced | strict.
  - terrain_adaptation: off | moderate | strong.
  - biomass_cap: bounds biomass-proxy scaling for predictable downstream comparisons.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| input (LAS/LAZ or Lidar) | no | Input LiDAR source used to derive forest structure and biomass products. |
| profile (fast, balanced, or strict) | no | Operational profile controlling sensitivity and QA strictness. |
| resolution | yes | Output raster resolution, default 2.0. |
| stand_block_cells | yes | Stand aggregation block size in cells, default 12. |
| biomass_cap | yes | Upper bound applied to biomass proxy estimates, default 25.0. |
| terrain_adaptation (off, moderate, or strong) | yes | Biomass terrain-adaptation mode, default moderate. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| canopy_height_metrics | GeoTIFF | Canopy height metrics raster for forestry structure interpretation. |
| vertical_structure_class | GeoTIFF | Categorical vertical-structure class raster. |
| stand_structure_units | GeoPackage | Vector stand structure units for reporting and management. |
| biomass_proxy | GeoTIFF | Biomass proxy raster derived from structural LiDAR metrics. |
| confidence | GeoTIFF | Confidence layer quantifying reliability of modeled outputs. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
lidar = wbw.Lidar("data/forest_points.laz")
height, vclass, stands, biomass, conf, summary = wbe.forestry_structure_and_biomass_intelligence(
    input=lidar,
    profile="balanced",
    resolution=2.0,
    stand_block_cells=12,
    biomass_cap=30.0,
    terrain_adaptation="moderate",
    output_prefix="output/forestry_structure",
)
print(height)
print(vclass)
print(stands)
print(biomass)
print(conf)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Spatial Hydrology
Spatial hydrology workflows extract hydrographic structure — flow directions, stream networks, watersheds, and hydrologic indices — from a terrain model. All hydrologic processing begins with a hydrologically conditioned DEM. This chapter demonstrates a complete watershed delineation workflow from raw DEM to labelled catchments.
Key Concepts
- Hydrologic conditioning: Removing or breaching depressions so that flow can route from every cell to a basin outlet without interruption.
- Flow direction: Per-cell pointer indicating which of the eight cardinal and diagonal neighbours receives runoff (D8 model). Multiple-flow-direction models (D-infinity, FD8) instead partition flow fractionally among several neighbours.
- Flow accumulation: Upslope contributing area (in cells or area units). High values mark channels; low values mark ridges.
- Stream extraction threshold: Minimum contributing area that defines a first-order channel. Smaller thresholds produce denser networks.
- Watershed / catchment: All cells draining to a common outlet. Delineated by tracing flow direction upstream from outlet points.
- Strahler order: Hierarchical stream ordering from headwaters (order 1) to main channel (highest order). Used to characterise drainage network complexity.
End-to-End Workflow: Watershed Delineation
Inputs
| Layer | Format | Notes |
|---|---|---|
| dem.tif | GeoTIFF raster | Projected CRS, metres |
| outlets.shp | Point vector | One or more pour points |
Step 1 — Breach Depressions (Preferred Conditioning Method)
Breaching cuts a narrow channel through depression rims rather than filling them, preserving more of the original topography.
Processing Toolbox → Whitebox Workflows → Spatial Hydrology →
Breach Depressions (Least Cost)
| Parameter | Recommended value |
|---|---|
| Input DEM | dem.tif |
| Maximum search distance (cells) | 10 |
| Maximum breach depth | 2.0 (metres) |
| Flat increment | 0.001 |
| Fill remaining depressions | ✓ enabled |
| Output | dem_conditioned.tif |
If the DEM has large lake or wetland depressions that should not be breached, use Fill Depressions instead.
Step 2 — D8 Flow Direction
Processing Toolbox → Whitebox Workflows → Spatial Hydrology →
D8 Pointer
| Parameter | Recommended value |
|---|---|
| Input DEM | dem_conditioned.tif |
| Output | d8_pointer.tif |
The output is an integer raster (powers of 2: 1, 2, 4, 8, 16, 32, 64, 128) encoding the direction to the steepest downslope neighbour.
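To make the pointer encoding concrete, the sketch below assigns D8 codes on a tiny DEM held as a list of lists. It is an illustration of the steepest-descent rule, not the tool's implementation; the function name `d8_pointer_sketch`, the sample elevations, and the tie-breaking order are assumptions. The codes follow the clockwise base-2 convention given in the D8 Pointer reference (1 = north-east through 128 = north, 0 = no lower neighbour).

```python
import math

# (row offset, column offset) -> pointer code, clockwise from north-east.
D8_CODES = {
    (-1, 1): 1,    # north-east
    (0, 1): 2,     # east
    (1, 1): 4,     # south-east
    (1, 0): 8,     # south
    (1, -1): 16,   # south-west
    (0, -1): 32,   # west
    (-1, -1): 64,  # north-west
    (-1, 0): 128,  # north
}


def d8_pointer_sketch(dem):
    """Return a grid of D8 codes pointing at each cell's steepest downslope neighbour."""
    rows, cols = len(dem), len(dem[0])
    pointer = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0  # only strictly downslope neighbours qualify
            for (dr, dc), code in D8_CODES.items():
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    distance = math.hypot(dr, dc)  # 1 or sqrt(2) cell widths
                    slope = (dem[r][c] - dem[rr][cc]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        pointer[r][c] = code
    return pointer


# Hypothetical 3 x 3 DEM draining toward the lower-right corner.
dem = [
    [9.0, 8.0, 7.0],
    [8.0, 6.0, 5.0],
    [7.0, 5.0, 3.0],
]
for row in d8_pointer_sketch(dem):
    print(row)
```

The lower-right cell has no lower neighbour, so it keeps the value 0; in a conditioned DEM this should only happen along the grid edge.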
Step 3 — D8 Flow Accumulation
Processing Toolbox → Whitebox Workflows → Spatial Hydrology →
D8 Flow Accumulation
| Parameter | Recommended value |
|---|---|
| Input D8 pointer | d8_pointer.tif |
| Output type | Cells |
| Log-transform output | ☐ (disable for threshold-based channel extraction) |
| Output | d8_accum.tif |
Visualise with a logarithmic stretch. The highest values form the main channel network.
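The idea behind the cell-count output can be sketched in a few lines: each cell passes its count to the single neighbour its pointer names, working from cells with no inflow toward the outlet. This is a minimal sketch, not the tool's algorithm; the function name `accumulate_cells` is hypothetical, and counting each cell as contributing itself (so headwater cells hold 1) is an assumption.

```python
# Pointer code -> (row offset, column offset), using the default Whitebox scheme.
OFFSETS = {1: (-1, 1), 2: (0, 1), 4: (1, 1), 8: (1, 0),
           16: (1, -1), 32: (0, -1), 64: (-1, -1), 128: (-1, 0)}


def accumulate_cells(pointer):
    """Return the number of cells draining through each cell, itself included."""
    rows, cols = len(pointer), len(pointer[0])

    def downstream(r, c):
        offset = OFFSETS.get(pointer[r][c])
        if offset is None:
            return None  # pointer 0: no downslope neighbour
        rr, cc = r + offset[0], c + offset[1]
        return (rr, cc) if 0 <= rr < rows and 0 <= cc < cols else None

    # Count how many neighbours drain directly into each cell.
    inflow = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            target = downstream(r, c)
            if target is not None:
                inflow[target[0]][target[1]] += 1

    # Start from cells with no inflow and push counts downstream.
    accum = [[1] * cols for _ in range(rows)]
    stack = [(r, c) for r in range(rows) for c in range(cols) if inflow[r][c] == 0]
    while stack:
        r, c = stack.pop()
        target = downstream(r, c)
        if target is not None:
            tr, tc = target
            accum[tr][tc] += accum[r][c]
            inflow[tr][tc] -= 1
            if inflow[tr][tc] == 0:
                stack.append(target)
    return accum


# Hypothetical pointer grid draining to the lower-right cell.
pointer = [[4, 4, 8], [4, 4, 8], [2, 2, 0]]
for row in accumulate_cells(pointer):
    print(row)
```

In this example the outlet cell accumulates all nine cells, which is the pattern to expect at a basin outlet.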
Step 4 — Extract Stream Network
Processing Toolbox → Whitebox Workflows → Spatial Hydrology →
Extract Streams
| Parameter | Recommended value |
|---|---|
| Flow accumulation raster | d8_accum.tif |
| Threshold | 500 (cells — adjust for DEM resolution and drainage density) |
| Zero background | ✓ enabled |
| Output | streams.tif |
Rule of thumb: for a 10 m DEM, a threshold of 500 cells ≈ 0.05 km² contributing area, producing a moderately dense first-order network. Halve or double the threshold to adjust density.
Convert to vector for display: Processing → Raster Streams to Vector
→ streams.shp.
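The rule of thumb for the stream extraction threshold is simple arithmetic on cell size, sketched below so a threshold can be carried over to a DEM of a different resolution. The helper names are illustrative and not part of the plugin.

```python
def threshold_area_km2(threshold_cells: float, cell_size_m: float) -> float:
    """Contributing area (km^2) represented by a threshold given in cells."""
    return threshold_cells * cell_size_m ** 2 / 1_000_000


def cells_for_area(area_km2: float, cell_size_m: float) -> int:
    """Cell-count threshold that approximates a target contributing area."""
    return round(area_km2 * 1_000_000 / cell_size_m ** 2)


print(threshold_area_km2(500, 10))  # the 10 m DEM case from the rule of thumb
print(cells_for_area(0.05, 30))     # same contributing area on a 30 m DEM
```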
Step 5 — Snap Pour Points
Pour points must sit on the channel raster. Snap them to the nearest high-accumulation cell to avoid off-channel watershed boundaries.
Processing Toolbox → Whitebox Workflows → Spatial Hydrology →
Snap Pour Points
| Parameter | Recommended value |
|---|---|
| Pour points | outlets.shp |
| Flow accumulation | d8_accum.tif |
| Snap distance (map units) | 200 (metres — adjust to point accuracy) |
| Output | outlets_snapped.shp |
Step 6 — Watershed Delineation
Processing Toolbox → Whitebox Workflows → Spatial Hydrology →
Watershed
| Parameter | Recommended value |
|---|---|
| D8 pointer | d8_pointer.tif |
| Pour points | outlets_snapped.shp |
| Output | watersheds.tif |
Each outlet receives a unique integer ID; cells are assigned that ID. Use
Raster to Vector (Polygons) in QGIS to produce watershed boundary
polygons, then dissolve by ID if multiple raster cells share an outlet.
Step 7 — Strahler Stream Order (Optional)
Processing Toolbox → Whitebox Workflows → Spatial Hydrology →
Strahler Stream Order
| Parameter | Recommended value |
|---|---|
| D8 pointer | d8_pointer.tif |
| Streams raster | streams.tif |
| Output | strahler.tif |
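The Strahler rule itself is easy to state on a link network: headwater links are order 1, and the order increases by one only where two tributaries of equal highest order meet. The sketch below applies that rule to a small hypothetical network described as a mapping from each link to its downstream link. It illustrates the ordering rule only and does not reproduce how the raster tool traverses the pointer grid.

```python
def strahler_orders(downstream):
    """Return the Strahler order of each link.

    `downstream` maps a link id to the link it flows into (None at the outlet).
    """
    upstream = {link: [] for link in downstream}
    for link, target in downstream.items():
        if target is not None:
            upstream[target].append(link)

    orders = {}

    def order_of(link):
        if link not in orders:
            tributaries = sorted((order_of(u) for u in upstream[link]), reverse=True)
            if not tributaries:
                orders[link] = 1  # headwater link
            elif len(tributaries) > 1 and tributaries[0] == tributaries[1]:
                orders[link] = tributaries[0] + 1  # two equal highest orders meet
            else:
                orders[link] = tributaries[0]
        return orders[link]

    for link in downstream:
        order_of(link)
    return orders


# Hypothetical network: a and b join to form c; c and d join to form e (outlet).
network = {"a": "c", "b": "c", "c": "e", "d": "e", "e": None}
print(strahler_orders(network))
```

Link e stays at order 2 because an order 1 tributary joining an order 2 channel does not raise the order.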
Python Console Equivalent
import processing
dem = '/data/dem.tif'
outlets = '/data/outlets.shp'
# Step 1: condition DEM
processing.run('whitebox_workflows:breach_depressions_least_cost', {
'dem': dem,
'max_dist': 10,
'max_depth': 2.0,
'flat_increment': 0.001,
'fill': True,
'output': '/data/dem_conditioned.tif',
})
# Step 2: flow direction
processing.run('whitebox_workflows:d8_pointer', {
'dem': '/data/dem_conditioned.tif',
'output': '/data/d8_pointer.tif',
})
# Step 3: flow accumulation
processing.run('whitebox_workflows:d8_flow_accumulation', {
'input': '/data/d8_pointer.tif',
'output_type': 'Cells',
'log': False,
'output': '/data/d8_accum.tif',
})
# Step 4: extract streams
processing.run('whitebox_workflows:extract_streams', {
'flow_accum': '/data/d8_accum.tif',
'threshold': 500.0,
'zero_background': True,
'output': '/data/streams.tif',
})
# Step 5: snap pour points
processing.run('whitebox_workflows:snap_pour_points', {
'pour_pts': outlets,
'flow_accum': '/data/d8_accum.tif',
'snap_dist': 200.0,
'output': '/data/outlets_snapped.shp',
})
# Step 6: watershed
processing.run('whitebox_workflows:watershed', {
'd8_pntr': '/data/d8_pointer.tif',
'pour_pts': '/data/outlets_snapped.shp',
'output': '/data/watersheds.tif',
})
print("Watershed delineation complete.")
Advanced: Topographic Wetness Index
TWI requires the specific catchment area (flow accumulation in area units per unit contour width) rather than the raw cell count.
# SCA-based flow accumulation
processing.run('whitebox_workflows:d8_flow_accumulation', {
'input': '/data/d8_pointer.tif',
'output_type': 'Specific Contributing Area',
'log': False,
'output': '/data/sca.tif',
})
# Slope in radians (required for TWI)
processing.run('whitebox_workflows:slope', {
'dem': '/data/dem_conditioned.tif',
'units': 'Radians',
'output': '/data/slope_rad.tif',
})
# TWI
processing.run('whitebox_workflows:wetness_index', {
'sca': '/data/sca.tif',
'slope': '/data/slope_rad.tif',
'output': '/data/twi.tif',
})
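For reference, the wetness index is conventionally computed per cell as ln(SCA / tan(slope)). The sketch below shows that calculation with a minimum slope floor, the mitigation suggested for flat areas in the pitfalls table. It is a hand calculation for checking values, with a hypothetical function name and sample inputs, and is not the tool's code.

```python
import math


def wetness_index(sca: float, slope_rad: float, min_slope_rad: float = 0.001) -> float:
    """Topographic wetness index ln(SCA / tan(slope)) with a slope floor.

    The floor keeps tan(slope) away from zero on flat cells.
    """
    slope = max(slope_rad, min_slope_rad)
    return math.log(sca / math.tan(slope))


# Hypothetical cells: a 5 degree hillslope and a perfectly flat valley floor.
print(round(wetness_index(100.0, math.radians(5.0)), 3))
print(round(wetness_index(100.0, 0.0), 3))
```

Without the floor, the second call would divide by zero; with it, flat cells receive a large but finite index.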
Common Pitfalls
| Problem | Likely cause | Fix |
|---|---|---|
| Watershed does not extend to expected ridgeline | Pour point not on channel raster | Run Snap Pour Points before Watershed |
| Parallel flow stripes in accumulation raster | Flat areas in conditioned DEM | Enable fix-flats during conditioning |
| Stream network is too sparse / too dense | Threshold too high / too low | Halve or double threshold and re-inspect |
| Watershed covers entire DEM | Pour point is at or near the DEM outlet cell | Check that outlet coordinates fall inside the DEM extent |
| TWI has very high values in flat areas | Slope is near-zero, causing division by tan(0) | Mask flat areas or apply a minimum slope floor (e.g. 0.001 rad) |
Validation Checklist
- Conditioned DEM has no isolated pits or flat areas (check the flow direction raster for interior cells with a pointer value of zero).
- Flow accumulation values increase monotonically downstream along each flowpath toward the basin outlet.
- Extracted channels follow expected valley geometry in the DEM.
- Snapped pour points lie on the highest-accumulation cells within snap distance.
- Watershed boundary is a closed polygon that contains the pour point.
- Strahler orders are consistent with tributary junctions.
Flow Routing
Average Flowpath Slope
Function name: average_flowpath_slope
This tool calculates the average slope gradient (i.e. slope steepness in degrees) of the flowpaths that pass through each grid cell in an input digital elevation model (DEM). The user must specify the name of a DEM raster (dem). It is important that this DEM is pre-processed to remove all topographic depressions and flat areas using a tool such as breach_depressions_least_cost. Several intermediate rasters are created and stored in memory during the operation of this tool, which may limit the size of DEM that can be processed, depending on available system resources.
See Also
average_upslope_flowpath_length, breach_depressions_least_cost
Python API
def average_flowpath_slope(self, dem: Raster) -> Raster:
Average Upslope Flowpath Length
Function name: average_upslope_flowpath_length
This tool calculates the average length of the flowpaths that pass through each grid cell in an input digital elevation model (DEM). The user must specify the name of a DEM raster (dem). It is important that this DEM is pre-processed to remove all topographic depressions and flat areas using a tool such as breach_depressions_least_cost. Several intermediate rasters are created and stored in memory during the operation of this tool, which may limit the size of DEM that can be processed, depending on available system resources.
See Also
average_flowpath_slope, breach_depressions_least_cost
Python API
def average_upslope_flowpath_length(self, dem: Raster) -> Raster:
D8 Flow Accum
Function name: d8_flow_accum
This tool is used to generate a flow accumulation grid (i.e. catchment area) using the D8 (O'Callaghan and Mark, 1984) algorithm. This algorithm is an example of a single-flow-direction (SFD) method because the flow entering each grid cell is routed to only one downslope neighbour, i.e. flow divergence is not permitted. The user must specify the name of the input digital elevation model (DEM) or flow pointer raster (input) derived using the D8 or Rho8 method (d8_pointer, rho8_pointer). If an input DEM is used, it must have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using the breach_depressions_least_cost or fill_depressions tools. If a D8 pointer raster is input, the user must also set the optional input_is_pointer flag. If the D8 pointer follows the Esri pointer scheme, rather than the default WhiteboxTools scheme, the user must also set the optional esri_pntr flag.
In addition to the input DEM/pointer, the user must specify the output type. The output flow accumulation can be expressed as 1) cells (i.e. the number of inflowing grid cells), 2) catchment area (i.e. the upslope area), or 3) specific contributing area (i.e. the catchment area divided by the flow width). The default value is cells. The user must also specify whether the output flow-accumulation grid should be log-transformed (log); if this option is selected, the output will be the natural logarithm of the accumulated flow value. This transformation is often performed to better visualize the contributing area distribution. Because contributing areas tend to be very high along valley bottoms and relatively low on hillslopes, the distribution of values on hillslopes tends to be 'washed out' when a flow-accumulation image is displayed, because the palette is stretched to represent the highest values. Log-transformation compensates for this effect. Importantly, however, log-transformed flow-accumulation grids must not be used to estimate other secondary terrain indices, such as the wetness index or relative stream power index.
Grid cells possessing the NoData value in the input DEM/pointer raster are assigned the NoData value in the output flow-accumulation image.
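The relationship between the three output types, and the effect of the log option, can be sketched for a single cell as below. This is an illustration under stated assumptions: the function name and the labels "cells", "catchment area", and "sca" are used here for illustration only (consult the tool for the accepted `out_type` strings), and the flow width is assumed equal to the cell size.

```python
import math


def convert_accumulation(cells: float, cell_size: float,
                         out_type: str = "cells", log_transform: bool = False) -> float:
    """Express an accumulated cell count in the requested output type."""
    area = cells * cell_size ** 2          # upslope (catchment) area
    if out_type == "cells":
        value = cells
    elif out_type == "catchment area":
        value = area
    elif out_type == "sca":
        value = area / cell_size           # assumption: flow width = cell size
    else:
        raise ValueError(f"unknown output type: {out_type}")
    # Natural logarithm, intended for display only.
    return math.log(value) if log_transform else value


# Hypothetical cell with 500 inflowing cells on a 10 m grid.
for kind in ("cells", "catchment area", "sca"):
    print(kind, convert_accumulation(500, 10.0, kind))
print("log(cells)", round(convert_accumulation(500, 10.0, "cells", True), 3))
```

Because the log option changes the values themselves, indices such as the wetness index should be computed from the untransformed output.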
Reference
O'Callaghan, J. F., & Mark, D. M. 1984. The extraction of drainage networks from digital elevation data. Computer Vision, Graphics, and Image Processing, 28(3), 323-344.
See Also:
FD8FlowAccumulation, quinn_flow_accumulation, qin_flow_accumulation, DInfFlowAccumulation, MDInfFlowAccumulation, rho8_pointer, d8_pointer, breach_depressions_least_cost, fill_depressions
Python API
def d8_flow_accum(self, raster: Raster, out_type: str = "sca", log_transform: bool = False, clip: bool = False, input_is_pointer: bool = False, esri_pntr: bool = False) -> Raster:
D8 Mass Flux
Function name: d8_mass_flux
This tool can be used to perform a mass flux calculation using DEM-based surface flow-routing techniques. For example, it could be used to model the distribution of sediment or phosphorus within a catchment. Flow-routing is based on a D8 flow pointer (i.e. flow direction) derived from an input depressionless DEM (dem). The user must also specify the names of loading (loading), efficiency (efficiency), and absorption (absorption) rasters, as well as the output raster. Mass Flux operates very much like a flow-accumulation operation except that, rather than accumulating catchment areas, the algorithm routes a quantity of mass, the spatial distribution of which is specified within the loading image. The efficiency and absorption rasters represent spatial distributions of losses to the accumulation process, the difference being that the efficiency raster is a proportional loss (e.g. only 50% of material within a particular grid cell will be directed downslope) and the absorption raster is a loss specified as a quantity in the same units as the loading image. The efficiency image can range from 0 to 1, or alternatively, can be expressed as a percentage. The equation for determining the mass sent from one grid cell to a neighbouring grid cell is:
Outflowing Mass = (Loading - Absorption + Inflowing Mass) × Efficiency
This tool assumes that all of the input rasters have the same number of rows and columns and that any NoData cells occur at the same locations in each input.
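As a worked illustration of the equation above, the following minimal Python sketch routes mass down a single, hypothetical chain of three cells. The function name `outflowing_mass` and the sample values are assumptions made for illustration only; the sketch does not reproduce the tool's implementation, which routes mass over a full D8 pointer grid.

```python
def outflowing_mass(loading, absorption, inflowing, efficiency):
    """Mass passed downslope from one cell, per the equation above."""
    return (loading - absorption + inflowing) * efficiency

# Hypothetical flow path of three cells, listed from upslope to downslope.
# Efficiency is expressed as a proportion between 0 and 1.
cells = [
    {"loading": 10.0, "absorption": 2.0, "efficiency": 0.5},
    {"loading": 4.0, "absorption": 0.0, "efficiency": 1.0},
    {"loading": 6.0, "absorption": 1.0, "efficiency": 0.8},
]

inflow = 0.0
flux = []
for cell in cells:
    # The outflow of each cell becomes the inflow of the next cell downslope.
    inflow = outflowing_mass(cell["loading"], cell["absorption"], inflow, cell["efficiency"])
    flux.append(inflow)

print(flux)
```

The first cell passes on half of its net load, the second passes on everything it receives plus its own load, and the third loses one unit to absorption before 80% of the remainder continues downslope.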
See Also
DInfMassFlux
Python API
def d8_mass_flux(self, dem: Raster, loading: Raster, efficiency: Raster, absorption: Raster) -> Raster:
D8 Pointer
Function name: d8_pointer
This tool is used to generate a flow pointer grid using the simple D8 (O'Callaghan and Mark, 1984) algorithm. The user must specify the name (dem) of a digital elevation model (DEM) that has been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using either the breach_depressions_least_cost or fill_depressions tool. The local drainage direction raster output (output) by this tool serves as a necessary input for several other spatial hydrology and stream network analysis tools in the toolset. Some tools will calculate this flow pointer raster directly from the input DEM.
By default, D8 flow pointers use the following clockwise, base-2 numeric index convention:
 64  128    1
 32    0    2
 16    8    4
Notice that grid cells that have no lower neighbours are assigned a flow direction of zero. In a DEM that has been pre-processed to remove all depressions and flat areas, this condition will only occur along the edges of the grid. If the pointer file contains ESRI flow direction values instead, the esri_pntr parameter must be specified.
Grid cells possessing the NoData value in the input DEM are assigned the NoData value in the output image.
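The convention above can be illustrated with a minimal Python sketch that assigns a D8 pointer value to the centre cell of a 3x3 elevation window by choosing the steepest downslope neighbour. The function name `d8_pointer_for_cell`, the sample window, and the tie-breaking behaviour (first steepest neighbour wins) are assumptions for illustration; NoData handling is omitted, and this is not the tool's own code.

```python
import math

# (row offset, column offset, pointer value) for the eight neighbours, following
# the clockwise base-2 convention: NE=1, E=2, SE=4, S=8, SW=16, W=32, NW=64, N=128.
NEIGHBOURS = [
    (-1, 1, 1), (0, 1, 2), (1, 1, 4), (1, 0, 8),
    (1, -1, 16), (0, -1, 32), (-1, -1, 64), (-1, 0, 128),
]

def d8_pointer_for_cell(window, cell_size=1.0):
    """Return the D8 pointer of the centre cell of a 3x3 elevation window."""
    centre = window[1][1]
    best_slope, best_value = 0.0, 0  # 0 means the cell has no lower neighbour
    for dr, dc, value in NEIGHBOURS:
        # Diagonal neighbours are farther away than cardinal neighbours.
        distance = cell_size * (math.sqrt(2.0) if dr != 0 and dc != 0 else 1.0)
        slope = (centre - window[1 + dr][1 + dc]) / distance
        if slope > best_slope:
            best_slope, best_value = slope, value
    return best_value

window = [
    [10.0, 9.0, 8.0],
    [10.0, 7.0, 6.0],
    [9.0, 6.5, 4.0],
]
print(d8_pointer_for_cell(window))
```

In this window the steepest drop is towards the south-east neighbour, so the sketch returns 4; a window with no lower neighbour returns 0.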
Memory Usage
The peak memory usage of this tool is approximately 10 bytes per grid cell.
Reference
O'Callaghan, J. F., & Mark, D. M. (1984). The extraction of drainage networks from digital elevation data. Computer vision, graphics, and image processing, 28(3), 323-344.
See Also
DInfPointer, fd8_pointer, breach_depressions_least_cost, fill_depressions
Python API
def d8_pointer(self, dem: Raster, esri_pointer: bool = False) -> Raster:
D-Infinity Flow Accum
Function name: dinf_flow_accum
This tool is used to generate a flow accumulation grid (i.e. contributing area) using the D-infinity algorithm (Tarboton, 1997). This algorithm is an example of a multiple-flow-direction (MFD) method because the flow entering each grid cell is routed to one or two downslope neighbours, i.e. flow divergence is permitted. The user must specify the name of the input digital elevation model or D-infinity pointer raster (input). If an input DEM is specified, the DEM should have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using the breach_depressions_least_cost or fill_depressions tool.
In addition to the input DEM/pointer raster name, the user must specify the output type (out_type). The output flow-accumulation can be 1) specific catchment area (SCA), which is the upslope contributing area divided by the contour length (taken as the grid resolution), 2) total catchment area in square metres, or 3) the number of upslope grid cells. The user must also specify whether the output flow-accumulation grid should be log-transformed, i.e. the output, if this option is selected, will be the natural logarithm of the accumulated area. This transformation is often performed to better visualize the contributing area distribution. Because contributing areas tend to be very high along valley bottoms and relatively low on hillslopes, when a flow-accumulation image is displayed, the distribution of values on hillslopes tends to be 'washed out' because the palette is stretched out to represent the highest values. Log-transformation (log) provides a means of compensating for this phenomenon. Importantly, however, log-transformed flow-accumulation grids must not be used to estimate other secondary terrain indices, such as the wetness index or relative stream power index.
Grid cells possessing the NoData value in the input DEM/pointer raster are assigned the NoData value in the output flow-accumulation image. The output raster is of the float data type and continuous data scale.
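The warning about log-transformed grids can be made concrete with a short Python sketch. It assumes the common form of the wetness index, ln(SCA / tan(slope)), and uses hypothetical values for a single grid cell; the variable names are illustrative and are not part of the Whitebox API.

```python
import math

sca = 250.0       # hypothetical specific catchment area for one cell
slope_deg = 5.0   # hypothetical local slope, in degrees

tan_slope = math.tan(math.radians(slope_deg))

# Wetness index computed from the untransformed SCA.
twi = math.log(sca / tan_slope)

# Feeding an already log-transformed SCA into the same expression applies the
# logarithm twice and produces a different, incorrect value.
log_sca = math.log(sca)
twi_from_log = math.log(log_sca / tan_slope)

print(round(twi, 3), round(twi_from_log, 3))
```

The two results differ substantially, which is why the log-transform option is best reserved for visualization.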
Reference
Tarboton, D. G. (1997). A new method for the determination of flow directions and upslope areas in grid digital elevation models. Water resources research, 33(2), 309-319.
See Also
DInfPointer, D8FlowAccumulation, quinn_flow_accumulation, qin_flow_accumulation, FD8FlowAccumulation, MDInfFlowAccumulation, rho8_pointer, breach_depressions_least_cost, fill_depressions
Python API
def dinf_flow_accum(self, dem: Raster, out_type: str = "sca", convergence_threshold: float = float('inf'), log_transform: bool = False, clip: bool = False, input_is_pointer: bool = False) -> Raster:
D-Infinity Mass Flux
Function name: dinf_mass_flux
This tool can be used to perform a mass flux calculation using DEM-based surface flow-routing techniques. For example, it could be used to model the distribution of sediment or phosphorus within a catchment. Flow-routing is based on a D-Infinity flow pointer derived from an input DEM (dem). The user must also specify the names of loading (loading), efficiency (efficiency), and absorption (absorption) rasters, as well as the output raster. Mass Flux operates very much like a flow-accumulation operation except that, rather than accumulating catchment areas, the algorithm routes a quantity of mass, the spatial distribution of which is specified within the loading image. The efficiency and absorption rasters represent spatial distributions of losses to the accumulation process. The difference is that the efficiency raster is a proportional loss (e.g. only 50% of material within a particular grid cell will be directed downslope), whereas the absorption raster is a loss specified as a quantity in the same units as the loading image. The efficiency image can range from 0 to 1, or alternatively, can be expressed as a percentage. The equation for determining the mass sent from one grid cell to a neighbouring grid cell is:
Outflowing Mass = (Loading - Absorption + Inflowing Mass) × Efficiency
This tool assumes that all of the input rasters have the same number of rows and columns and that any NoData cells occur at the same locations in each input.
See Also
d8_mass_flux
Python API
def dinf_mass_flux(self, dem: Raster, loading: Raster, efficiency: Raster, absorption: Raster) -> Raster:
D-Infinity Pointer
Function name: dinf_pointer
This tool is used to generate a flow pointer grid (i.e. flow direction) using the D-infinity (Tarboton, 1997) algorithm. Dinf is a multiple-flow-direction (MFD) method because the flow entering each grid cell is routed to one or two downslope neighbours, i.e. flow divergence is permitted. The user must specify the name of a digital elevation model (DEM; dem) that has been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using the breach_depressions_least_cost or fill_depressions tool. Flow directions are specified in the output flow-pointer grid (output) as azimuth degrees measured from north, i.e. any value between 0 and 360 degrees is possible. A pointer value of -1 is used to designate a grid cell with no flow-pointer. This occurs when a grid cell has no downslope neighbour, i.e. a pit cell or topographic depression. Like aspect grids, Dinf flow-pointer grids are best visualized using a circular greyscale palette.
Grid cells possessing the NoData value in the input DEM are assigned the NoData value in the output image. The output raster is of the float data type and continuous data scale.
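To show how an azimuth pointer can translate into flow towards one or two neighbours, the following Python sketch splits flow between the two of the eight neighbour directions (multiples of 45 degrees) that bracket a given azimuth, in proportion to angular closeness. This follows the general idea of Tarboton (1997); the function name `dinf_flow_split` and the return format are assumptions for illustration, not the behaviour of any Whitebox function.

```python
def dinf_flow_split(azimuth_deg):
    """Split flow between the two neighbours that bracket a D-infinity azimuth.

    Returns {neighbour direction in degrees: proportion of flow}.
    A pointer of -1 (no downslope neighbour) returns an empty dict.
    """
    if azimuth_deg < 0.0:
        return {}
    azimuth = azimuth_deg % 360.0
    lower = int(azimuth // 45.0) * 45   # neighbour direction at or below the azimuth
    upper = (lower + 45) % 360          # next neighbour direction clockwise
    share_upper = (azimuth - lower) / 45.0
    if share_upper == 0.0:
        return {lower: 1.0}             # azimuth points exactly at one neighbour
    return {lower: 1.0 - share_upper, upper: share_upper}

print(dinf_flow_split(90.0))    # all flow to the east neighbour
print(dinf_flow_split(100.0))   # split between east (90) and south-east (135)
print(dinf_flow_split(-1.0))    # pit cell: no flow
```

An azimuth that coincides with a neighbour direction sends all flow to that neighbour; any other azimuth divides flow between two adjacent neighbours, and the proportions always sum to one.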
Reference
Tarboton, D. G. (1997). A new method for the determination of flow directions and upslope areas in grid digital elevation models. Water resources research, 33(2), 309-319.
See Also
DInfFlowAccumulation, breach_depressions_least_cost, fill_depressions
Python API
def dinf_pointer(self, dem: Raster) -> Raster:
Downslope Flowpath Length
Function name: downslope_flowpath_length
This tool can be used to calculate the downslope flowpath length from each grid cell in a raster to an outlet cell either at the edge of the grid or at the outlet point of a watershed. The user must specify the name of a flow pointer grid (d8_pntr) derived using the D8 flow algorithm (d8_pointer). This grid should be derived from a digital elevation model (DEM) that has been pre-processed to remove artifact topographic depressions and flat areas (breach_depressions_least_cost, fill_depressions). The user may also optionally provide watershed (watersheds) and weights (weights) images. The optional watershed image can be used to define one or more irregular-shaped watershed boundaries. Flowpath lengths are measured within each watershed in the watershed image (each defined by a unique identifying number) as the flowpath length to the watershed's outlet cell.
The optional weight image is multiplied by the flow-length through each grid cell. This can be useful when there is a need to convert the units of the output image. For example, the default unit of flowpath lengths is the same as the input image(s). Thus, if the input image has X-Y coordinates measured in metres, the output image will likely contain very large values. A weight image containing a value of 0.001 for each grid cell will effectively convert the output flowpath lengths into kilometres. The weight image can also be used to convert the flowpath distances into travel times. Because travel time is distance divided by velocity, the weight for each grid cell should be the reciprocal of the average velocity through that cell.
NoData valued grid cells in any of the input images will be assigned NoData values in the output image. The output raster is of the float data type and continuous data scale.
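The effect of the weights image can be sketched in a few lines of Python for a single flow path. The helper `downslope_length`, the cell size, the path, and the velocity are hypothetical values chosen for illustration. Travel time is taken as distance divided by velocity, so the travel-time weights are 1 / velocity.

```python
import math

def downslope_length(step_lengths, weights=None):
    """Sum the (optionally weighted) length of each step along one flow path."""
    if weights is None:
        weights = [1.0] * len(step_lengths)
    return sum(length * weight for length, weight in zip(step_lengths, weights))

cell_size = 30.0  # metres, hypothetical
# Hypothetical path to the outlet: two cardinal steps followed by one diagonal step.
steps = [cell_size, cell_size, cell_size * math.sqrt(2.0)]

length_m = downslope_length(steps)
length_km = downslope_length(steps, weights=[0.001] * len(steps))

# Weights of 1 / velocity (seconds per metre) turn distance into travel time.
velocity = 0.5  # metres per second, hypothetical
travel_time_s = downslope_length(steps, weights=[1.0 / velocity] * len(steps))

print(length_m, length_km, travel_time_s)
```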
See Also
d8_pointer, elevation_above_stream, breach_depressions_least_cost, fill_depressions, watershed
Python API
def downslope_flowpath_length(self, d8_pointer: Raster, watersheds: Raster, weights: Raster, esri_pntr: bool = False) -> Raster:
FD8 Flow Accum
Function name: fd8_flow_accum
This tool is used to generate a flow accumulation grid (i.e. contributing area) using the FD8 algorithm (Freeman, 1991), sometimes referred to as FMFD. This algorithm is an example of a multiple-flow-direction (MFD) method because the flow entering each grid cell is routed to each downslope neighbour, i.e. flow divergence is permitted. The user must specify the name (dem) of the input digital elevation model (DEM). The DEM must have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using either the breach_depressions_least_cost or fill_depressions tool. A value must also be specified for the exponent parameter (exponent), a number that controls the degree of dispersion in the resulting flow-accumulation grid. A lower value yields greater apparent flow dispersion across divergent hillslopes. Some experimentation suggests that a value of 1.1 is appropriate (Freeman, 1991), although this is almost certainly landscape-dependent.
In addition to the input DEM, the user must specify the output type (out_type). The output flow-accumulation can be 1) cells (i.e. the number of inflowing grid cells), 2) catchment area (i.e. the upslope area), or 3) specific contributing area (i.e. the catchment area divided by the flow width). Note that the Python API signature for this function uses a default of "sca" (specific contributing area). The user must also specify whether the output flow-accumulation grid should be log-transformed (log), i.e. the output, if this option is selected, will be the natural logarithm of the accumulated flow value. This transformation is often performed to better visualize the contributing area distribution. Because contributing areas tend to be very high along valley bottoms and relatively low on hillslopes, when a flow-accumulation image is displayed, the distribution of values on hillslopes tends to be 'washed out' because the palette is stretched out to represent the highest values. Log-transformation provides a means of compensating for this phenomenon. Importantly, however, log-transformed flow-accumulation grids must not be used to estimate other secondary terrain indices, such as the wetness index or relative stream power index.
The non-dispersive threshold (threshold) is a flow-accumulation value (measured in upslope grid cells, which is directly proportional to area) above which flow dispersion is no longer permitted. Grid cells with flow-accumulation values above this threshold will have their flow routed in a manner that is similar to the D8 single-flow-direction algorithm, directing all flow towards the steepest downslope neighbour. This is usually done under the assumption that flow dispersion, whilst appropriate on hillslope areas, is not realistic once flow becomes channelized.
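The role of the exponent can be illustrated with a simplified Python sketch of slope-weighted flow partitioning, in which each downslope neighbour receives a share proportional to its slope raised to the exponent. This is a minimal sketch of the general FD8 idea: contour-length weighting, the non-dispersive threshold, and NoData handling are deliberately left out, and the function name `fd8_fractions` and sample slopes are assumptions.

```python
def fd8_fractions(slopes, exponent=1.1):
    """Share of flow sent to each neighbour; slopes <= 0 (not downslope) receive none."""
    weights = [s ** exponent if s > 0.0 else 0.0 for s in slopes]
    total = sum(weights)
    if total == 0.0:
        return [0.0] * len(slopes)  # no downslope neighbour at all
    return [w / total for w in weights]

# Hypothetical slopes (rise over run) towards three neighbours; the last is upslope.
slopes = [0.20, 0.10, -0.05]

low = fd8_fractions(slopes, exponent=1.1)
high = fd8_fractions(slopes, exponent=10.0)

print([round(f, 3) for f in low])
print([round(f, 3) for f in high])
```

With the low exponent the gentler neighbour still receives a sizeable share of the flow, whereas the high exponent concentrates almost all flow on the steepest neighbour, approaching single-flow-direction behaviour.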
Reference
Freeman, T. G. (1991). Calculating catchment area with divergent flow based on a regular grid. Computers and Geosciences, 17(3), 413-422.
See Also
D8FlowAccumulation, quinn_flow_accumulation, qin_flow_accumulation, DInfFlowAccumulation, MDInfFlowAccumulation, rho8_pointer
Python API
def fd8_flow_accum(self, dem: Raster, out_type: str = "sca", exponent: float = 1.1, convergence_threshold: float = float('inf'), log_transform: bool = False, clip: bool = False) -> Raster:
FD8 Pointer
Function name: fd8_pointer
This tool is used to generate a flow pointer grid (i.e. flow direction) using the FD8 (Freeman, 1991) algorithm. FD8 is a multiple-flow-direction (MFD) method because the flow entering each grid cell is routed to one or more downslope neighbours, i.e. flow divergence is permitted. The user must specify the name of a digital elevation model (DEM; dem) that has been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using the breach_depressions_least_cost or fill_depressions tools.
By default, D8 flow pointers use the following clockwise, base-2 numeric index convention:
 64  128    1
 32    0    2
 16    8    4
In the case of the FD8 algorithm, some portion of the flow entering a grid cell will be sent to each downslope neighbour. Thus, the FD8 flow-pointer value is the sum of the individual pointers for all downslope neighbours. For example, if a grid cell has downslope neighbours to the northeast, east, and south, the corresponding FD8 flow-pointer value will be 1 + 2 + 8 = 11. Using the numbering convention above, this is the only combination of flow-pointers that will result in the combined value of 11. The base-2 convention allows complex combinations of flow-pointers to be stored in a single numeric value, which is the reason for using this somewhat odd convention.
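Because each neighbour occupies its own bit, an FD8 pointer value can be built and taken apart with ordinary bitwise operations. The following Python sketch demonstrates this for the example above; the helper names `encode_fd8` and `decode_fd8` and the compass labels are illustrative assumptions and are not part of the Whitebox API.

```python
# Base-2 pointer values for the eight neighbours, clockwise from the north-east.
DIRECTIONS = {1: "NE", 2: "E", 4: "SE", 8: "S", 16: "SW", 32: "W", 64: "NW", 128: "N"}

def encode_fd8(names):
    """Sum the pointer values of the named downslope neighbours."""
    lookup = {name: value for value, name in DIRECTIONS.items()}
    return sum(lookup[name] for name in names)

def decode_fd8(pointer):
    """List the neighbours whose bit is set in an FD8 pointer value."""
    return [name for value, name in DIRECTIONS.items() if pointer & value]

print(encode_fd8(["NE", "E", "S"]))
print(decode_fd8(11))
```

Encoding the northeast, east, and south neighbours gives 11, and decoding 11 recovers exactly those three neighbours.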
Reference
Freeman, T. G. (1991). Calculating catchment area with divergent flow based on a regular grid. Computers and Geosciences, 17(3), 413-422.
See Also
FD8FlowAccumulation, d8_pointer, DInfPointer, breach_depressions_least_cost, fill_depressions
Python API
def fd8_pointer(self, dem: Raster) -> Raster:
Flow Accum Full Workflow
Function name: flow_accum_full_workflow
Resolves all of the depressions in a DEM, outputting a breached DEM, an aspect-aligned non-divergent flow pointer, and a flow accumulation raster.
Python API
def flow_accum_full_workflow(self, dem: Raster, out_type: str = "sca", log_transform: bool = False, clip: bool = False, esri_pntr: bool = False) -> Tuple[Raster, Raster, Raster]:
Flow Length Diff
Function name: flow_length_diff
FlowLengthDiff calculates the local maximum absolute difference in downslope flowpath length, which is useful in mapping drainage divides and ridges.
See Also
max_branch_length
Python API
def flow_length_diff(self, d8_pointer: Raster, esri_pointer: bool = False, log_transform: bool = False) -> Raster:
Max Upslope Flowpath Length
Function name: max_upslope_flowpath_length
This tool calculates the maximum length of the flowpaths that run through each grid cell (in map horizontal units) in an input digital elevation model (dem). The tool works by first calculating the D8 flow pointer (d8_pointer) from the input DEM. The DEM must be depressionless and should have been pre-processed using the breach_depressions_least_cost or fill_depressions tool. The user must also specify the name of the output raster (output).
See Also
d8_pointer, breach_depressions_least_cost, fill_depressions, average_upslope_flowpath_length, downslope_flowpath_length, downslope_distance_to_stream
Python API
def max_upslope_flowpath_length(self, dem: Raster) -> Raster:
Max Upslope Value
Function name: max_upslope_value
This tool calculates the maximum upslope value from an input values raster (values_raster) along the flowpaths that run through each grid cell of an input digital elevation model (dem). The tool works by first calculating the D8 flow pointer (d8_pointer) from the input DEM. The DEM must be depressionless and should have been pre-processed using the breach_depressions_least_cost or fill_depressions tool. The user must also specify the name of the output raster (output).
See Also
d8_pointer, breach_depressions_least_cost, fill_depressions, average_upslope_flowpath_length, downslope_flowpath_length, downslope_distance_to_stream
Python API
def max_upslope_value(self, dem: Raster, values_raster: Raster) -> Raster:
MD-Infinity Flow Accum
Function name: mdinf_flow_accum
This tool is used to generate a flow accumulation grid (i.e. contributing area) using the MD-infinity algorithm (Seibert and McGlynn, 2007). This algorithm is an example of a multiple-flow-direction (MFD) method because the flow entering each grid cell is routed to one or two downslope neighbours, i.e. flow divergence is permitted. The user must specify the name of the input digital elevation model (dem). The DEM should have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using the breach_depressions_least_cost or fill_depressions tool.
In addition to the input DEM, the user must specify the output type (out_type). The output flow-accumulation can be 1) specific catchment area (SCA), which is the upslope contributing area divided by the contour length (taken as the grid resolution), 2) total catchment area in square metres, or 3) the number of upslope grid cells. The user must also specify whether the output flow-accumulation grid should be log-transformed, i.e. the output, if this option is selected, will be the natural logarithm of the accumulated area. This transformation is often performed to better visualize the contributing area distribution. Because contributing areas tend to be very high along valley bottoms and relatively low on hillslopes, when a flow-accumulation image is displayed, the distribution of values on hillslopes tends to be 'washed out' because the palette is stretched out to represent the highest values. Log-transformation (log) provides a means of compensating for this phenomenon. Importantly, however, log-transformed flow-accumulation grids must not be used to estimate other secondary terrain indices, such as the wetness index or relative stream power index.
Grid cells possessing the NoData value in the input DEM raster are assigned the NoData value in the output flow-accumulation image. The output raster is of the float data type and continuous data scale.
Reference
Seibert, J., & McGlynn, B. L. (2007). A new triangular multiple flow direction algorithm for computing upslope areas from gridded digital elevation models. Water Resources Research, 43(4).
See Also
D8FlowAccumulation, FD8FlowAccumulation, quinn_flow_accumulation, qin_flow_accumulation, DInfFlowAccumulation, MDInfFlowAccumulation, rho8_pointer, breach_depressions_least_cost
Python API
def mdinf_flow_accum(self, dem: Raster, out_type: str = "sca", exponent: float = 1.1, convergence_threshold: float = float('inf'), log_transform: bool = False, clip: bool = False) -> Raster:
Minimal Dispersion Flow Algorithm
Function name: minimal_dispersion_flow_algorithm
Experimental
Generates MDFA flow-direction and flow-accumulation rasters from a DEM.
hydrology flow-direction flow-accumulation mdfa
Examples
Compute MDFA direction and specific contributing area from DEM
Num Inflowing Neighbours
Function name: num_inflowing_neighbours
This tool calculates the number of inflowing neighbours for each grid cell in a raster file. The user must specify the names of an input digital elevation model (DEM) file (dem) and the output raster file (output). The tool calculates the D8 pointer file internally in order to identify inflowing neighbouring cells.
Grid cells in the input DEM that contain the NoData value will be assigned the NoData value in the output image. The output image is of the integer data type and continuous data scale.
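The counting step can be sketched in Python, assuming a D8 pointer grid is already available and uses the clockwise base-2 convention described for d8_pointer (NE=1, E=2, SE=4, S=8, SW=16, W=32, NW=64, N=128). The function name `count_inflowing` and the sample grid are hypothetical; NoData handling is omitted and this is not the tool's own code.

```python
# (row offset, column offset) of the cell targeted by each D8 pointer value.
OFFSETS = {1: (-1, 1), 2: (0, 1), 4: (1, 1), 8: (1, 0),
           16: (1, -1), 32: (0, -1), 64: (-1, -1), 128: (-1, 0)}

def count_inflowing(pointer):
    """Count, for every cell, the neighbours whose D8 pointer targets it."""
    rows, cols = len(pointer), len(pointer[0])
    counts = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            offset = OFFSETS.get(pointer[r][c])
            if offset is None:  # 0 = no downslope neighbour
                continue
            tr, tc = r + offset[0], c + offset[1]
            if 0 <= tr < rows and 0 <= tc < cols:
                counts[tr][tc] += 1
    return counts

# Hypothetical 3x3 pointer grid in which every outer cell drains to the centre.
pointer = [
    [4, 8, 16],
    [2, 0, 32],
    [1, 128, 64],
]
print(count_inflowing(pointer))
```

In this grid the centre cell receives flow from all eight neighbours, and every outer cell has no inflowing neighbours.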
See Also
num_downslope_neighbours, NumUpslopeNeighbours
Python API
def num_inflowing_neighbours(self, dem: Raster) -> Raster:
Qin Flow Accumulation
Function name: qin_flow_accumulation
This tool is used to generate a flow accumulation grid (i.e. contributing area) using the Qin et al. (2007) flow algorithm, not to be confused with the similarly named quinn_flow_accumulation tool. This algorithm is an example of a multiple-flow-direction (MFD) method because the flow entering each grid cell is routed to more than one downslope neighbour, i.e. flow divergence is permitted. It is based on a modification of the Freeman (1991; FD8FlowAccumulation) and Quinn et al. (1995; quinn_flow_accumulation) methods. The Qin method relates the degree of flow dispersion from a grid cell to the local maximum downslope gradient. Specifically, steeper terrain experiences more convergent flow while flatter slopes experience more flow divergence.
The following equations are used to calculate the portion of flow (F_i) given to each neighbour, i:
$$F_i = \frac{L_i (\tan\beta)^{f(e)}}{\sum_{i=1}^{n} \left[ L_i (\tan\beta)^{f(e)} \right]}$$
$$f(e) = \frac{\min(e, e_U)}{e_U} \times (p_U - 1.1) + 1.1$$
Where L_i is the contour length, equal to 0.5×cell size for cardinal directions and 0.354×cell size for diagonal directions, and n = 8 is the number of neighbouring grid cells. The exponent f(e) controls the proportion of flow allocated to each downslope neighbour of a grid cell. It depends on the local maximum downslope gradient (e), the user-specified upper boundary of e (e_U; max_slope), and the upper boundary of the exponent (p_U; exponent). Note that the original Qin et al. (2007) implementation allowed for user-specified lower boundaries on the slope (e_L) and exponent (p_L) parameters as well. In this implementation, these parameters are assumed to be 0.0 and 1.1 respectively, and are not user adjustable. Also note that the exponent parameter should be less than 50.0, as higher values may cause numerical instability.
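The behaviour of the exponent function f(e) can be checked with a small Python sketch of the second equation above. The function name `qin_exponent` and its argument names are assumptions; the default values of 45.0 and 10.0 mirror the max_slope and exponent defaults in the Python signature, and the gradient is assumed to be expressed in the same units as max_slope.

```python
def qin_exponent(gradient, upper_slope=45.0, upper_exponent=10.0, lower_exponent=1.1):
    """Flow-partition exponent f(e) for a local maximum downslope gradient e."""
    clamped = min(gradient, upper_slope)
    return clamped / upper_slope * (upper_exponent - lower_exponent) + lower_exponent

for e in (0.0, 22.5, 45.0, 60.0):
    print(e, round(qin_exponent(e), 3))
```

The exponent rises linearly from 1.1 on flat terrain to the upper exponent at the upper slope boundary, and stays at that value for steeper gradients, which is how steeper terrain ends up with more convergent flow.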
The user must specify the name (dem) of the input digital elevation model (DEM) and the output file (output). The DEM must have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using either the breach_depressions_least_cost or fill_depressions tool.
The user-specified non-dispersive, channel initiation threshold (threshold) is a flow-accumulation value (measured in upslope grid cells, which is directly proportional to area) above which flow dispersion is no longer permitted. Grid cells with flow-accumulation values above this area threshold will have their flow routed in a manner that is similar to the D8 single-flow-direction algorithm, directing all flow towards the steepest downslope neighbour. This is usually done under the assumption that flow dispersion, whilst appropriate on hillslope areas, is not realistic once flow becomes channelized. Importantly, the threshold parameter sets the spatial extent of the stream network, with lower values resulting in more extensive networks.
In addition to the input DEM, output file (output), and exponent, the user must also specify the output type (out_type). The output flow-accumulation can be 1) cells (i.e. the number of inflowing grid cells), 2) catchment area (i.e. the upslope area), or 3) specific contributing area (i.e. the catchment area divided by the flow width). The default value is specific contributing area. The user must also specify whether the output flow-accumulation grid should be log-transformed (log), i.e. the output, if this option is selected, will be the natural logarithm of the accumulated flow value. This transformation is often performed to better visualize the contributing area distribution. Because contributing areas tend to be very high along valley bottoms and relatively low on hillslopes, when a flow-accumulation image is displayed, the distribution of values on hillslopes tends to be 'washed out' because the palette is stretched out to represent the highest values. Log-transformation provides a means of compensating for this phenomenon. Importantly, however, log-transformed flow-accumulation grids must not be used to estimate other secondary terrain indices, such as the wetness index (wetness_index) or relative stream power index (StreamPowerIndex).
Reference
Freeman, T. G. (1991). Calculating catchment area with divergent flow based on a regular grid. Computers and Geosciences, 17(3), 413-422.
Qin, C., Zhu, A. X., Pei, T., Li, B., Zhou, C., & Yang, L. (2007). An adaptive approach to selecting a flow-partition exponent for a multiple-flow-direction algorithm. International Journal of Geographical Information Science, 21(4), 443-458.
Quinn, P. F., Beven, K. J., & Lamb, R. (1995). The ln(a/tanβ) index: How to calculate it and how to use it within the TOPMODEL framework. Hydrological Processes, 9(2), 161-182.
See Also
D8FlowAccumulation, quinn_flow_accumulation, FD8FlowAccumulation, DInfFlowAccumulation, MDInfFlowAccumulation, rho8_pointer, wetness_index
Python API
def qin_flow_accumulation(self, dem: Raster, out_type: str = "sca", exponent: float = 10.0, max_slope: float = 45.0, convergence_threshold: float = float('inf'), log_transform: bool = False, clip: bool = False) -> Raster:
Quinn Flow Accumulation
Function name: quinn_flow_accumulation
This tool is used to generate a flow accumulation grid (i.e. contributing area) using the Quinn et al. (1995) flow algorithm, sometimes called QMFD or QMFD2, and not to be confused with the similarly named qin_flow_accumulation tool. This algorithm is an example of a multiple-flow-direction (MFD) method because the flow entering each grid cell is routed to more than one downslope neighbour, i.e. flow divergence is permitted. The user must specify the name (dem) of the input digital elevation model (DEM). The DEM must have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using either the breach_depressions_least_cost or fill_depressions tool. A value must also be specified for the exponent parameter (exponent), a number that controls the degree of dispersion in the resulting flow-accumulation grid. A lower value yields greater apparent flow dispersion across divergent hillslopes. The exponent value (h) should probably be less than 50.0, as higher values may cause numerical instability, and values between 1 and 2 are most common. The following equations are used to calculate the portion of flow (F_i) given to each neighbour, i:
$$F_i = \frac{L_i (\tan\beta)^{p}}{\sum_{i=1}^{n} \left[ L_i (\tan\beta)^{p} \right]}$$
$$p = \left( \frac{A}{\text{threshold}} + 1 \right)^{h}$$
Where L_i is the contour length, equal to 0.5×cell size for cardinal directions and 0.354×cell size for diagonal directions, n = 8 is the number of neighbouring grid cells, and A is the flow accumulation value assigned to the current grid cell, which is being apportioned downslope. The non-dispersive, channel initiation threshold (threshold) is a flow-accumulation value (measured in upslope grid cells, which is directly proportional to area) above which flow dispersion is no longer permitted. Grid cells with flow-accumulation values above this threshold will have their flow routed in a manner that is similar to the D8 single-flow-direction algorithm, directing all flow towards the steepest downslope neighbour. This is usually done under the assumption that flow dispersion, whilst appropriate on hillslope areas, is not realistic once flow becomes channelized. Importantly, the threshold parameter sets the spatial extent of the stream network, with lower values resulting in more extensive networks.
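The way the exponent p grows with flow accumulation can be seen in a minimal Python sketch of the second equation above. The function name `quinn_exponent` and the sample threshold are assumptions for illustration; with an infinite threshold, as in the default of the convergence_threshold argument in the Python signature, the ratio A / threshold is zero and p stays at 1.

```python
def quinn_exponent(accumulation, threshold, h=1.1):
    """Exponent p = (A / threshold + 1) ** h for a cell with flow accumulation A."""
    return (accumulation / threshold + 1.0) ** h

threshold = 1000.0  # hypothetical channel-initiation threshold, in upslope cells
for a in (0.0, 100.0, 1000.0, 5000.0):
    print(a, round(quinn_exponent(a, threshold), 3))

# With an infinite threshold the exponent never grows beyond 1.
print(quinn_exponent(5000.0, float("inf")))
```

As accumulation increases relative to the threshold, p increases, so a progressively larger share of flow goes to the steepest neighbour.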
In addition to the input DEM, output file (output), and exponent, the user must also specify the output type (out_type). The output flow-accumulation can be 1) cells (i.e. the number of inflowing grid cells), 2) catchment area (i.e. the upslope area), or 3) specific contributing area (i.e. the catchment area divided by the flow width). The default value is specific contributing area. The user must also specify whether the output flow-accumulation grid should be log-transformed (log), i.e. the output, if this option is selected, will be the natural logarithm of the accumulated flow value. This transformation is often performed to better visualize the contributing area distribution. Because contributing areas tend to be very high along valley bottoms and relatively low on hillslopes, when a flow-accumulation image is displayed, the distribution of values on hillslopes tends to be 'washed out' because the palette is stretched out to represent the highest values. Log-transformation provides a means of compensating for this phenomenon. Importantly, however, log-transformed flow-accumulation grids must not be used to estimate other secondary terrain indices, such as the wetness index (wetness_index) or relative stream power index (StreamPowerIndex). The Quinn et al. (1995) algorithm is commonly used to calculate the wetness index.
Reference
Quinn, P. F., Beven, K. J., & Lamb, R. (1995). The ln(a/tanβ) index: How to calculate it and how to use it within the TOPMODEL framework. Hydrological Processes, 9(2), 161-182.
See Also
D8FlowAccumulation, qin_flow_accumulation, FD8FlowAccumulation, DInfFlowAccumulation, MDInfFlowAccumulation, rho8_pointer, wetness_index
Python API
def quinn_flow_accumulation(self, dem: Raster, out_type: str = "sca", exponent: float = 1.1, convergence_threshold: float = float('inf'), log_transform: bool = False, clip: bool = False) -> Raster:
Rho8 Flow Accum
Function name: rho8_flow_accum
This tool is used to generate a flow accumulation grid (i.e. contributing area) using the Fairfield and Leymarie (1991) flow algorithm, often called Rho8. Like the D8 flow method, this algorithm is an example of a single-flow-direction (SFD) method because the flow entering each grid cell is routed to only one downslope neighbour, i.e. flow divergence is not permitted. The user must specify the name of the input file (input), which may be either a digital elevation model (DEM) or a Rho8 pointer file (see rho8_pointer). If a DEM is input, it must have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using either the breach_depressions_least_cost or fill_depressions tool.
In addition to the input and output (output) files, the user must also specify the output type (out_type). The output flow-accumulation can be expressed as: 1) cells (i.e. the number of inflowing grid cells), 2) catchment area (i.e. the upslope area), or 3) specific contributing area (i.e. the catchment area divided by the flow width). The default value is specific contributing area. The user must also specify whether the output flow-accumulation grid should be log-transformed (log), i.e. the output, if this option is selected, will be the natural logarithm of the accumulated flow value. This is a transformation that is often performed to better visualize the contributing area distribution. Because contributing areas tend to be very high along valley bottoms and relatively low on hillslopes, when a flow-accumulation image is displayed, the distribution of values on hillslopes tends to be 'washed out' because the palette is stretched out to represent the highest values. Log-transformation provides a means of compensating for this phenomenon. Importantly, however, log-transformed flow-accumulation grids must not be used to estimate other secondary terrain indices, such as the wetness index (wetness_index), or relative stream power index (StreamPowerIndex).
If a Rho8 pointer is used as the input raster, the user must specify this (pntr). Similarly, if a pointer input is used and the pointer follows the Esri pointer convention, rather than the default WhiteboxTools convention for pointer files, then this must also be specified (esri_pntr).
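The relationship between the three output types and the optional log-transform can be sketched in a few lines of plain Python. This is an illustration only, not the tool's implementation: the function name is hypothetical, the labels "cells" and "ca" are assumed (only "sca" appears as a default in the signature), and the flow width is assumed to equal the cell size.

```python
import math


def accumulation_output(n_cells, cell_size, out_type="sca", log_transform=False):
    """Convert a count of contributing cells into the requested output type.

    Hypothetical helper: the labels "cells" and "ca" are assumptions, and the
    flow width is assumed to equal the cell size.
    """
    cell_area = cell_size * cell_size
    if out_type == "cells":
        value = float(n_cells)
    elif out_type == "ca":
        # Catchment area: number of cells times the area of one cell.
        value = n_cells * cell_area
    elif out_type == "sca":
        # Specific contributing area: catchment area divided by flow width.
        value = n_cells * cell_area / cell_size
    else:
        raise ValueError(f"unknown out_type: {out_type!r}")
    # The natural logarithm is for display only; n_cells must be positive here.
    return math.log(value) if log_transform else value


for kind in ("cells", "ca", "sca"):
    print(kind, accumulation_output(100, 10.0, kind))
```

Because the logarithm compresses large values non-linearly, the untransformed output is the one to pass on to secondary indices.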
Reference
Fairfield, J., and Leymarie, P. 1991. Drainage networks from grid digital elevation models. Water Resources Research, 27(5), 709-717.
See Also
rho8_pointer, D8FlowAccumulation, qin_flow_accumulation, FD8FlowAccumulation, DInfFlowAccumulation, MDInfFlowAccumulation, wetness_index
Python API
def rho8_flow_accum(self, raster: Raster, out_type: str = "sca", log_transform: bool = False, clip: bool = False, input_is_pointer: bool = False, esri_pntr: bool = False) -> Raster:
Rho8 Pointer
Function name: rho8_pointer
This tool is used to generate a flow pointer grid (i.e. flow direction) using the stochastic Rho8 (J. Fairfield and P. Leymarie, 1991) algorithm. Like the D8 flow algorithm (d8_pointer), Rho8 is a single-flow-direction (SFD) method because the flow entering each grid cell is routed to only one downslope neighbour, i.e. flow divergence is not permitted. The user must specify the name of a digital elevation model (DEM) file (dem) that has been hydrologically corrected to remove all spurious depressions and flat areas (breach_depressions_least_cost, fill_depressions). The output of this tool (output) is often used as the input to the Rho8FlowAccumulation tool.
By default, the Rho8 flow pointers use the following clockwise, base-2 numeric index convention:
64 128 1
32 0 2
16 8 4
Notice that grid cells that have no lower neighbours are assigned a flow direction of zero. In a DEM that has been pre-processed to remove all depressions and flat areas, this condition will only occur along the edges of the grid. If the pointer file contains ESRI flow direction values instead, the esri_pntr parameter must be specified.
Grid cells possessing the NoData value in the input DEM are assigned the NoData value in the output image.
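A pointer value can be decoded into a neighbour offset with a small lookup table. The sketch below assumes the default convention lays out as 64, 128, 1 across the top row, 32, 0, 2 across the middle, and 16, 8, 4 across the bottom, with north at the top and row numbers increasing southward. The function name is hypothetical and the Esri convention is not handled.

```python
# Assumed default pointer convention, mapped to (row_offset, col_offset).
# Row numbers are taken to increase southward.
POINTER_OFFSETS = {
    1: (-1, 1),    # north-east
    2: (0, 1),     # east
    4: (1, 1),     # south-east
    8: (1, 0),     # south
    16: (1, -1),   # south-west
    32: (0, -1),   # west
    64: (-1, -1),  # north-west
    128: (-1, 0),  # north
}


def downslope_neighbour(row, col, pointer_value):
    """Return the (row, col) that a cell drains to, or None if it has no lower neighbour."""
    if pointer_value == 0:
        return None
    d_row, d_col = POINTER_OFFSETS[pointer_value]
    return row + d_row, col + d_col


print(downslope_neighbour(5, 5, 1))
print(downslope_neighbour(5, 5, 0))
```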
Memory Usage
The peak memory usage of this tool is approximately 10 bytes per grid cell.
References
Fairfield, J., and Leymarie, P. 1991. Drainage networks from grid digital elevation models. Water Resources Research, 27(5), 709-717.
See Also
Rho8FlowAccumulation, d8_pointer, fd8_pointer, DInfPointer, breach_depressions_least_cost, fill_depressions
Python API
def rho8_pointer(self, dem: Raster, esri_pntr: bool = False) -> Raster:
Trace Downslope Flowpaths
Function name: trace_downslope_flowpaths
This tool can be used to mark the flowpath initiated from user-specified locations downslope and terminating at either the grid's edge or a grid cell with undefined flow direction. The user must input the name of a D8 flow pointer grid (d8_pntr) and an input vector file indicating the location of one or more initiation points, i.e. 'seed points' (seed_pts). The seed point file must be a vector of the POINT VectorGeometryType. Note that the flow pointer should be generated from a DEM that has been processed to remove all topographic depressions (see breach_depressions_least_cost and fill_depressions) and created using the D8 flow algorithm (d8_pointer).
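The tracing logic described above can be sketched as a loop that follows pointer values from a seed cell until it leaves the grid or meets a cell with no defined direction. This is a minimal illustration on a list-of-lists grid, not the tool's implementation. The function name is hypothetical, the default (non-Esri) pointer convention is assumed, and a guard against revisiting cells is added for safety.

```python
def trace_flowpath(pointer, seed):
    """Follow D8 pointer values from `seed` and return the visited (row, col) cells."""
    # Assumed default pointer convention: value -> (row_offset, col_offset).
    offsets = {
        1: (-1, 1), 2: (0, 1), 4: (1, 1), 8: (1, 0),
        16: (1, -1), 32: (0, -1), 64: (-1, -1), 128: (-1, 0),
    }
    rows, cols = len(pointer), len(pointer[0])
    row, col = seed
    path, visited = [], set()
    while 0 <= row < rows and 0 <= col < cols and (row, col) not in visited:
        path.append((row, col))
        visited.add((row, col))
        value = pointer[row][col]
        if value not in offsets:
            # Zero or NoData: the flow direction is undefined, so stop here.
            break
        d_row, d_col = offsets[value]
        row, col = row + d_row, col + d_col
    return path


grid = [
    [2, 2, 8],
    [0, 0, 8],
    [0, 0, 0],
]
print(trace_flowpath(grid, (0, 0)))
```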
See Also
d8_pointer, breach_depressions_least_cost, fill_depressions, downslope_flowpath_length, downslope_distance_to_stream
Python API
def trace_downslope_flowpaths(self, seed_points: Vector, d8_pointer: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Depressions and Storage
Breach Depressions Least Cost
Function name: breach_depressions_least_cost
This tool can be used to perform a type of optimal depression breaching to prepare a digital elevation model (DEM) for hydrological analysis. Depression breaching is a common alternative to depression filling (fill_depressions) and often offers a lower-impact solution to the removal of topographic depressions. This tool implements a method that is loosely based on the algorithm described by Lindsay and Dhun (2015), furthering the earlier algorithm with efficiency optimizations and other significant enhancements. The approach uses a least-cost path analysis to identify the breach channel that connects pit cells (i.e. grid cells for which there is no lower neighbour) to some distant lower cell. Prior to breaching and in order to minimize the depth of breach channels, all pit cells are raised to the elevation of the lowest neighbour minus a small height value. Here, the cost of a breach path is determined by the amount of elevation lowering needed to cut the breach channel through the surrounding topography.
The user must specify the name of the input DEM file (dem), the output breached DEM file (output), the maximum search window radius (dist), the optional maximum breach cost (max_cost), and an optional flat height increment value (flat_increment). Notice that if the flat_increment parameter is not specified, the small number used to ensure flow across flats will be calculated automatically, which should be preferred in most applications of the tool. The tool operates by performing a least-cost path analysis for each pit cell, radiating outward until the operation identifies a potential breach destination cell or reaches the maximum breach length parameter. If a value is specified for the optional max_cost parameter, then least-cost breach paths that would require digging a channel that is more costly than this value will be left unbreached. The flat increment value is used to ensure that there is a monotonically descending path along breach channels to satisfy the necessary condition of a downslope gradient for flowpath modelling. It is best for this value to be a small value. If left unspecified, the tool will determine an appropriate value based on the range of elevation values in the input DEM, which should be the case in most applications. Notice that the need to specify these very small elevation increment values is one of the reasons why the output DEM will always be of a 64-bit floating-point data type, which will often double the storage requirements of a DEM (DEMs are often stored with 32-bit precision). Lastly, the user may optionally choose to apply depression filling (fill) on any depressions that remain unresolved by the earlier depression breaching operation. This filling step uses an efficient filling method based on flooding depressions from their pit cells until outlets are identified and then raising the elevations of flooded cells back and away from the outlets.
The tool can be run in two modes, based on whether the min_dist is specified. If the min_dist flag is specified, the accumulated cost (accum2) of breaching from cell1 to cell2 along a channel issuing from pit is calculated using the traditional cost-distance function:
cost1 = z1 - (zpit + l × s)
cost2 = z2 - [zpit + (l + 1)s]
accum2 = accum1 + g(cost1 + cost2) / 2.0
where cost1 and cost2 are the costs associated with moving through cell1 and cell2 respectively, z1 and z2 are the elevations of the two cells, zpit is the elevation of the pit cell, l is the length of the breach channel to cell1, g is the grid cell distance between cells (accounting for diagonal distances), and s is the small number used to ensure flow across flats. If the min_dist flag is not present, the accumulated cost is calculated as:
accum2 = accum1 + cost2
That is, without the min_dist flag, the tool works to minimize elevation changes to the DEM caused by breaching, without considering the distance of breach channels. Notice that the value max_cost, if specified, should account for this difference in the way cost/cost-distances are calculated. The first cell in the least-cost accumulation operation that is identified for which cost2 <= 0.0 is the target cell to which the breach channel will connect the pit along the least-cost path.
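The two accumulation modes can be written directly from the formulas above. The following is a sketch of the arithmetic only, not the tool's search procedure; the function names are hypothetical and the argument names mirror the symbols in the text.

```python
def breach_cost(z, z_pit, length, s):
    """Elevation lowering needed at a cell `length` steps along the breach channel."""
    return z - (z_pit + length * s)


def accumulate_cost(accum1, z1, z2, z_pit, l, g, s, minimize_dist):
    """Accumulated cost at cell2, given the accumulated cost at cell1."""
    cost1 = breach_cost(z1, z_pit, l, s)
    cost2 = breach_cost(z2, z_pit, l + 1, s)
    if minimize_dist:
        # Cost-distance form: average the two costs and weight by grid distance g.
        return accum1 + g * (cost1 + cost2) / 2.0
    # Without distance minimization only the lowering at cell2 is added.
    return accum1 + cost2


def is_breach_target(z2, z_pit, l, s):
    """A cell is a breach target once no lowering is needed to reach it."""
    return breach_cost(z2, z_pit, l + 1, s) <= 0.0


print(accumulate_cost(0.0, 102.0, 103.0, 100.0, 2, 1.0, 0.5, True))
print(accumulate_cost(0.0, 102.0, 103.0, 100.0, 2, 1.0, 0.5, False))
```

The two modes produce values on different scales for the same channel, which is why a max_cost threshold chosen for one mode should not be reused unchanged for the other.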
This breaching method often provides a satisfactory, lower-impact breaching solution and is often efficient. It is therefore advisable that users try the breach_depressions_least_cost tool first when removing depressions from their DEMs. This tool is particularly well suited to breaching through road embankments. There are instances when a breaching solution is inappropriate, e.g. when a very deep depression such as an open-pit mine occurs in the DEM and long, deep breach paths are created. Often, restricting breaching with the max_cost parameter, combined with subsequent depression filling (fill), can provide an adequate solution in these cases. Nonetheless, there are applications for which full depression filling using the fill_depressions tool may be preferred.
Reference
Lindsay J, Dhun K. 2015. Modelling surface drainage patterns in altered landscapes using LiDAR. International Journal of Geographical Information Science, 29: 1-15. DOI: 10.1080/13658816.2014.975715
See Also
fill_depressions, cost_pathway
Python API
def breach_depressions_least_cost(self, dem: Raster, max_cost: float = float('inf'), max_dist: int = 100, flat_increment: float = float('nan'), fill_deps: bool = False, minimize_dist: bool = False) -> Raster:
Breach Single Cell Pits
Function name: breach_single_cell_pits
This tool calculates the average slope gradient (i.e. slope steepness in degrees) of the flowpaths that pass through each grid cell in an input digital elevation model (DEM). The user must specify the name of a DEM raster (dem). It is important that this DEM is pre-processed to remove all topographic depressions and flat areas using a tool such as breach_depressions_least_cost. Several intermediate rasters are created and stored in memory during the operation of this tool, which may limit the size of DEM that can be processed, depending on available system resources.
See Also
average_upslope_flowpath_length, breach_depressions_least_cost
Python API
def breach_single_cell_pits(self, dem: Raster) -> Raster:
Burn Streams
Function name: burn_streams
Stable
Burns a stream network into a DEM by decreasing stream-cell elevations.
stream_network dem_preprocessing
Burn Streams At Roads
Function name: burn_streams_at_roads
This tool decrements (lowers) the elevations of pixels within an input digital elevation model (DEM) (dem) along an input vector stream network (streams) at the sites of road (roads) intersections. In addition to the input data layers, the user must specify the output raster DEM (output), and the maximum road embankment width (width), in map units. The road width parameter is used to determine the length of channel along stream lines, at the junctions between streams and roads, that the burning (i.e. decrementing) operation occurs. The algorithm works by identifying stream-road intersection cells, then traversing along the rasterized stream path in the upstream and downstream directions by half the maximum road embankment width. The minimum elevation in each stream traversal is identified and then elevations that are higher than this value are lowered to the minimum elevation during a second stream traversal.
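The two-pass traversal can be illustrated on a one-dimensional stream profile. This is a simplified sketch rather than the tool's implementation: the function name is hypothetical, the stream is assumed to be already rasterized into an ordered list of elevations, and the upstream and downstream traversals are assumed to be treated as one window around the crossing.

```python
def burn_at_crossing(profile, crossing_index, road_width, cell_size):
    """Lower stream-cell elevations around one stream-road crossing.

    `profile` holds elevations along the rasterized stream, in flow order.
    """
    # Half the embankment width, expressed as a whole number of cells.
    half_cells = int((road_width / 2.0) // cell_size)
    start = max(0, crossing_index - half_cells)
    end = min(len(profile) - 1, crossing_index + half_cells)
    # First traversal: find the minimum elevation inside the window.
    lowest = min(profile[start:end + 1])
    # Second traversal: lower anything above that minimum.
    burned = list(profile)
    for i in range(start, end + 1):
        if burned[i] > lowest:
            burned[i] = lowest
    return burned


print(burn_at_crossing([10.0, 9.0, 12.0, 13.0, 12.0, 8.0, 7.0], 3, 4.0, 1.0))
```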
Reference
Lindsay JB. 2016. The practice of DEM stream burning revisited. Earth Surface Processes and Landforms, 41(5): 658–668. DOI: 10.1002/esp.3888
See Also
raster_streams_to_vector, rasterize_streams
Python API
def burn_streams_at_roads(self, dem: Raster, streams: Vector, roads: Vector, road_width: float) -> Raster:
Depth In Sink
Function name: depth_in_sink
This tool measures the depth that each grid cell in an input (dem) raster digital elevation model (DEM) lies within a sink feature, i.e. a closed topographic depression. A sink, or depression, is a bowl-like landscape feature, which is characterized by interior drainage and groundwater recharge. The depth_in_sink tool operates by differencing a filled DEM, using the same depression filling method as fill_depressions, and the original surface model.
In addition to the names of the input DEM (dem) and the output raster (output), the user must specify whether the background value (i.e. the value assigned to grid cells that are not contained within sinks) should be set to 0.0 (zero_background). Without this optional parameter specified, the tool will use the NoData value as the background value.
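The differencing step and the background handling can be sketched as follows. This assumes a filled DEM is already available as a list of lists; the function name and the NoData value of -32768.0 are hypothetical choices for illustration.

```python
def depth_in_sink_sketch(dem, filled, zero_background=False, nodata=-32768.0):
    """Difference a filled DEM and the original DEM, cell by cell."""
    background = 0.0 if zero_background else nodata
    output = []
    for dem_row, filled_row in zip(dem, filled):
        out_row = []
        for z, z_filled in zip(dem_row, filled_row):
            if z == nodata:
                # NoData in the input stays NoData in the output.
                out_row.append(nodata)
                continue
            depth = z_filled - z
            # Cells that were not raised by filling lie outside any sink.
            out_row.append(depth if depth > 0.0 else background)
        output.append(out_row)
    return output


dem = [[5.0, 5.0, 5.0], [5.0, 2.0, 5.0], [5.0, 5.0, 5.0]]
filled = [[5.0, 5.0, 5.0], [5.0, 5.0, 5.0], [5.0, 5.0, 5.0]]
print(depth_in_sink_sketch(dem, filled, zero_background=True))
```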
Reference
Antonić, O., Hatic, D., & Pernar, R. (2001). DEM-based depth in sink as an environmental estimator. Ecological Modelling, 138(1-3), 247-254.
See Also
fill_depressions
Python API
def depth_in_sink(self, dem: Raster, zero_background: bool = False) -> Raster:
Fill Burn
Function name: fill_burn
Burns streams into a DEM using the FillBurn (Saunders, 1999) method which produces a hydro-enforced DEM. This tool uses the algorithm described in:
Lindsay JB. 2016. The practice of DEM stream burning revisited. Earth Surface Processes and Landforms, 41(5): 658-668. DOI: 10.1002/esp.3888
And:
Saunders, W. 1999. Preparation of DEMs for use in environmental modeling analysis, in: ESRI User Conference. pp. 24-30.
Python API
def fill_burn(self, dem: Raster, streams: Vector) -> Raster:
Fill Depressions
Function name: fill_depressions
This tool can be used to fill all of the depressions in a digital elevation model (DEM) and to remove the flat areas. This is a common pre-processing step required by many flow-path analysis tools to ensure continuous flow from each grid cell to an outlet located along the grid edge. The fill_depressions algorithm operates by first identifying single-cell pits, that is, interior grid cells with no lower neighbouring cells. Each pit cell is then visited from highest to lowest and a priority region-growing operation is initiated. The area of monotonically increasing elevation, starting from the pit cell and growing based on flood order, is identified. Once a cell is identified that has not been previously visited and that possesses a lower elevation than its discovering neighbour cell, the discovering neighbour is labelled as an outlet (spill point) and the outlet elevation is noted. The algorithm then back-fills the labelled region, raising the elevation in the output DEM (output) to that of the outlet. Once this process is completed for each pit cell (noting that nested pit cells are often solved by prior pits), the flat regions of filled pits are optionally treated (fix_flats) with an applied small slope gradient away from outlets (note, more than one outlet cell may exist for each depression). The user may optionally specify the size of the elevation increment used to solve flats (flat_increment), although it is best to not specify this optional value and to let the algorithm determine the most suitable value itself. The flat-fixing method applies a small gradient away from outlets using another priority region-growing operation (i.e. based on a priority queue operation), where priorities are set by the elevations in the input DEM (input). This in effect ensures a gradient away from outlet cells but also following the natural pre-conditioned topography internal to depression areas.
For example, if a large filled area occurs upstream of a damming road-embankment, the filled DEM will possess flow directions that are similar to the un-flooded valley, with flow following the valley bottom. In fact, the above case is better handled using the breach_depressions_least_cost tool, which would simply cut through the road embankment at the likely site of a culvert. However, the flat-fixing method of fill_depressions does mean that this common occurrence in LiDAR DEMs is less problematic.
The breach_depressions_least_cost tool, while slightly less efficient than other hydrological preprocessing methods, often provides a lower-impact solution to topographic depressions and should be preferred in most applications. In comparison with the breach_depressions_least_cost tool, the depression filling method often provides a less satisfactory, higher-impact solution. It is advisable that users try the breach_depressions_least_cost tool to remove depressions from their DEMs before using fill_depressions. Nonetheless, there are applications for which full depression filling using the fill_depressions tool may be preferred.
Note that this tool will not fill in NoData regions within the DEM. It is advisable to remove such regions using the fill_missing_data tool prior to application.
See Also
breach_depressions_least_cost, sink, depth_in_sink, fill_missing_data
Python API
def fill_depressions(self, dem: Raster, fix_flats: bool = True, flat_increment: float = float('nan'), max_depth: float = float('inf')) -> Raster:
Fill Depressions Planchon And Darboux
Function name: fill_depressions_planchon_and_darboux
This tool can be used to fill all of the depressions in a digital elevation model (DEM) and to remove the flat areas using the Planchon and Darboux (2002) method. This is a common pre-processing step required by many flow-path analysis tools to ensure continuous flow from each grid cell to an outlet located along the grid edge. This tool is currently not the most efficient depression-removal algorithm available in WhiteboxTools; fill_depressions and breach_depressions_least_cost are both more efficient and often produce better, lower-impact results.
The user may optionally specify the size of the elevation increment used to solve flats (flat_increment), although it is best not to specify this optional value and to let the algorithm determine the most suitable value itself.
Reference
Planchon, O. and Darboux, F., 2002. A fast, simple and versatile algorithm to fill the depressions of digital elevation models. Catena, 46(2-3), pp.159-176.
See Also
fill_depressions, breach_depressions_least_cost
Python API
def fill_depressions_planchon_and_darboux(self, dem: Raster, fix_flats: bool = True, flat_increment: float = float('nan')) -> Raster:
Fill Depressions Wang And Liu
Function name: fill_depressions_wang_and_liu
This tool can be used to fill all of the depressions in a digital elevation model (DEM) and to remove the flat areas. This is a common pre-processing step required by many flow-path analysis tools to ensure continuous flow from each grid cell to an outlet located along the grid edge. The fill_depressions_wang_and_liu algorithm is based on the computationally efficient approach of examining each cell based on its spill elevation, starting from the edge cells, and visiting cells from lowest order using a priority queue. As such, it is based on the algorithm first proposed by Wang and Liu (2006). However, it is currently not the most efficient depression-removal algorithm available in WhiteboxTools; fill_depressions and breach_depressions_least_cost are both more efficient and often produce better, lower-impact results.
If the input DEM has gaps, or missing-data holes, that contain NoData values, it is better to use the fill_missing_data tool to repair these gaps. This tool will interpolate values across the gaps and produce a more natural-looking surface than the flat areas that are produced by depression filling. Importantly, the fill_depressions tool algorithm implementation assumes that there are no 'donut hole' NoData gaps within the area of valid data. Any NoData areas along the edge of the grid will simply be ignored and will remain NoData areas in the output image.
The user may optionally specify the size of the elevation increment used to solve flats (flat_increment), although it is best not to specify this optional value and to let the algorithm determine the most suitable value itself.
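The edge-inward, lowest-first visiting order described for this method can be illustrated with a simplified priority-flood fill. This is a generic sketch of the technique on a list-of-lists grid, not the tool's code: the function name is hypothetical, NoData cells are not handled, and no flat increment is applied, so filled depressions come out perfectly flat.

```python
import heapq


def priority_flood_fill(dem):
    """Fill depressions by flooding inward from the grid edges, lowest cell first."""
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    visited = [[False] * cols for _ in range(rows)]
    heap = []
    # Seed the priority queue with every edge cell at its own elevation.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (dem[r][c], r, c))
                visited[r][c] = True
    while heap:
        spill, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                    visited[nr][nc] = True
                    # A cell can never end up lower than the spill elevation
                    # of the cell through which it was reached.
                    filled[nr][nc] = max(dem[nr][nc], spill)
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


print(priority_flood_fill([[5.0, 5.0, 5.0], [5.0, 2.0, 4.0], [5.0, 5.0, 5.0]]))
```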
Reference
Wang, L. and Liu, H. 2006. An efficient method for identifying and filling surface depressions in digital elevation models for hydrologic analysis and modelling. International Journal of Geographical Information Science, 20(2): 193-213.
See Also
fill_depressions, breach_depressions_least_cost, fill_missing_data
Python API
def fill_depressions_wang_and_liu(self, dem: Raster, fix_flats: bool = True, flat_increment: float = float('nan')) -> Raster:
Fill Pits
Function name: fill_pits
This tool can be used to remove pits from a digital elevation model (DEM). Pits are single grid cells with no downslope neighbours. They are important because they impede overland flow-paths. This tool will remove any pits in the input DEM that can be resolved by raising the elevation of the pit such that flow will continue past the pit cell to one of the downslope neighbours. Notice that this tool can be a useful pre-processing technique before running one of the more robust depression breaching (breach_depressions_least_cost) or filling (fill_depressions) techniques, which are designed to remove larger depression features.
See Also
breach_depressions_least_cost, fill_depressions, breach_single_cell_pits
Python API
def fill_pits(self, dem: Raster) -> Raster:
Flatten Lakes
Function name: flatten_lakes
This tool can be used to set the elevations contained in a set of input vector lake polygons (lakes) to a consistent value within an input (dem) digital elevation model (DEM). Lake flattening is a common pre-processing step for DEMs intended for use in hydrological applications. This algorithm determines lake elevation automatically based on the minimum perimeter elevation for each lake polygon. The minimum perimeter elevation is assumed to be the lake outlet elevation and is assigned to the entire interior region of lake polygons, excluding island geometries. Note, this tool will not provide satisfactory results if the input vector polygons contain wide river features rather than true lakes. When this is the case, the tool will lower the entire river to the elevation of its mouth, leading to the creation of an artificial gorge.
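The minimum-perimeter rule can be sketched on a raster mask. This is a raster analogue for illustration only: the tool itself works from vector polygons, the function name is hypothetical, perimeter cells are assumed to be lake cells with at least one non-lake 4-neighbour, and islands are not treated.

```python
def flatten_lake(dem, lake_mask):
    """Set every lake cell to the lowest elevation found on the lake perimeter."""
    rows, cols = len(dem), len(dem[0])

    def on_perimeter(r, c):
        # A lake cell is on the perimeter if any 4-neighbour is outside the lake.
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or not lake_mask[nr][nc]:
                return True
        return False

    perimeter = [
        dem[r][c]
        for r in range(rows)
        for c in range(cols)
        if lake_mask[r][c] and on_perimeter(r, c)
    ]
    if not perimeter:
        return [row[:] for row in dem]
    # The lowest perimeter elevation is taken to be the lake outlet elevation.
    level = min(perimeter)
    return [
        [level if lake_mask[r][c] else dem[r][c] for c in range(cols)]
        for r in range(rows)
    ]


dem = [
    [20.0, 20.0, 20.0, 20.0, 20.0],
    [20.0, 10.0, 10.0, 10.0, 20.0],
    [20.0, 10.0, 12.0, 9.0, 20.0],
    [20.0, 10.0, 10.0, 10.0, 20.0],
    [20.0, 20.0, 20.0, 20.0, 20.0],
]
mask = [[1 <= r <= 3 and 1 <= c <= 3 for c in range(5)] for r in range(5)]
print(flatten_lake(dem, mask)[2])
```

The same rule explains the river caveat above: a long river polygon has its lowest perimeter point at the mouth, so every cell upstream would be lowered to that elevation.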
See Also
fill_depressions
Python API
def flatten_lakes(self, dem: Raster, lakes: Vector) -> Raster:
Impoundment Size Index
Function name: impoundment_size_index
This tool can be used to calculate the impoundment size index (ISI) from a digital elevation model (DEM). The ISI is a land-surface parameter related to the size of the impoundment that would result from inserting a dam of a user-specified maximum length (damlength) into each DEM grid cell. The tool requires the user to specify the name of one or more of the possible outputs, which include the mean flooded depth (out_mean), the maximum flooded depth (out_max), the flooded volume (out_volume), the flooded area (out_area), and the dam height (out_dam_height).
Please note that this tool performs an extremely complex and computationally intensive flow-accumulation operation. As such, it may take a substantial amount of processing time and may encounter issues (including memory issues) when applied to very large DEMs. It is not necessary to pre-process the input DEM (dem) to remove topographic depressions and flat areas. The internal flow-accumulation operation will not be confounded by the presence of these features.
Reference
Lindsay, JB (2015) Modelling the spatial pattern of potential impoundment size from DEMs. Online resource: Whitebox Blog
See Also
insert_dams, stochastic_depression_analysis
Python API
def impoundment_size_index(self, dem: Raster, max_dam_length: float, output_mean: bool = False, output_max: bool = False, output_volume: bool = False, output_area: bool = False, output_height: bool = False) -> Tuple[Union[Raster, None], Union[Raster, None], Union[Raster, None], Union[Raster, None], Union[Raster, None]]:
Insert Dams
Function name: insert_dams
This tool can be used to insert dams at one or more user-specified points (dam_pts), and of a maximum length (damlength), within an input digital elevation model (DEM) (dem). This tool can be thought of as providing the impoundment feature that is calculated internally during a run of the impoundment size index (ISI) tool for a set of points of interest.
Reference
Lindsay, JB (2015) Modelling the spatial pattern of potential impoundment size from DEMs. Online resource: Whitebox Blog
See Also
impoundment_size_index, stochastic_depression_analysis
Python API
def insert_dams(self, dem: Raster, dam_points: Vector, dam_length: float) -> Raster:
Raise Walls
Function name: raise_walls
This tool is used to increment the elevations in a digital elevation model (DEM) along the boundaries of a vector line or polygon layer. The user must specify the name of the raster DEM (dem), the vector file (input), the output file name (output), the increment height (height), and an optional breach lines vector layer (breach). The breach lines layer can be used to breach a hole in the raised walls at intersections with the wall layer.
Python API
def raise_walls(self, dem: Raster, walls: Vector, breach_lines: Vector, wall_height: float = 100.0) -> Raster:
Sink
Function name: sink
This tool measures the depth that each grid cell in an input (dem) raster digital elevation model (DEM) lies within a sink feature, i.e. a closed topographic depression. A sink, or depression, is a bowl-like landscape feature, which is characterized by interior drainage and groundwater recharge. The depth_in_sink tool operates by differencing a filled DEM, using the same depression filling method as fill_depressions, and the original surface model.
In addition to the names of the input DEM (dem) and the output raster (output), the user must specify whether the background value (i.e. the value assigned to grid cells that are not contained within sinks) should be set to 0.0 (zero_background). Without this optional parameter specified, the tool will use the NoData value as the background value.
Reference
Antonić, O., Hatic, D., & Pernar, R. (2001). DEM-based depth in sink as an environmental estimator. Ecological Modelling, 138(1-3), 247-254.
See Also
fill_depressions
Python API
def sink(self, dem: Raster, zero_background: bool = False) -> Raster:
Stochastic Depression Analysis
Function name: stochastic_depression_analysis
This tool performs a stochastic analysis of depressions within a DEM, calculating the probability of each cell belonging to a depression. This land-surface parameter (pdep) has been widely applied in wetland and bottom-land mapping applications.
This tool differs from the original Whitebox GAT tool in a few significant ways:
- The Whitebox GAT tool took an error histogram as an input. In practice people found it difficult to create this input. Usually they just generated a normal distribution in a spreadsheet using information about the DEM root-mean-square-error (RMSE). As such, this tool takes an RMSE input and generates the histogram internally. This is more convenient for most applications but loses the flexibility of specifying the error distribution more completely.
- The Whitebox GAT tool generated the error fields using the turning bands method. This tool generates a random Gaussian error field with no spatial autocorrelation and then applies local spatial averaging using a Gaussian filter (the size of which depends on the error autocorrelation length input) to increase the level of autocorrelation. We use the Fast Almost Gaussian Filter of Peter Kovesi (2010), which uses five repeat passes of a mean filter, based on an integral image. This filter method is highly efficient. This results in a significant performance increase compared with the original tool.
- Parts of the tool's workflow utilize parallel processing. However, the depression filling operation, which is the most time-consuming part of the workflow, is not parallelized.
In addition to the input DEM (dem) and output pdep file name (output), the user must specify the nature of the error model, including the root-mean-square error (rmse) and the error field correlation length (range, in map units). These parameters determine the statistical frequency distribution and spatial characteristics of the modeled error fields added to the DEM in each iteration of the simulation. The user must also specify the number of iterations (iterations). A larger number of iterations will produce a smoother pdep raster.
This tool creates several temporary rasters in memory and, as a result, is very memory hungry. This will necessarily limit the size of DEMs that can be processed on more memory-constrained systems. As a rough guide for usage, the computer system will need 6-10 times more memory than the file size of the DEM. If your computer possesses insufficient memory, you may consider splitting the input DEM apart into smaller tiles.
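The Monte Carlo idea behind pdep can be illustrated on a one-dimensional elevation profile. The sketch below is a deliberate simplification, not the tool's method: the function names are hypothetical, the error field is uncorrelated Gaussian noise (no autocorrelation range is applied), and depressions are filled along a profile whose two ends act as outlets.

```python
import random


def fill_profile(z):
    """Fill depressions in a 1D profile; both ends of the profile are outlets."""
    n = len(z)
    left, right = [0.0] * n, [0.0] * n
    running = float("-inf")
    for i in range(n):
        running = max(running, z[i])
        left[i] = running
    running = float("-inf")
    for i in reversed(range(n)):
        running = max(running, z[i])
        right[i] = running
    # Water escapes over the lower of the two barriers on either side.
    return [min(a, b) for a, b in zip(left, right)]


def pdep_profile(z, rmse, iterations=100, seed=0):
    """Fraction of noisy realizations in which each cell lies in a depression."""
    rng = random.Random(seed)
    counts = [0] * len(z)
    for _ in range(iterations):
        noisy = [v + rng.gauss(0.0, rmse) for v in z]
        filled = fill_profile(noisy)
        for i, (z_filled, z_noisy) in enumerate(zip(filled, noisy)):
            if z_filled > z_noisy:
                counts[i] += 1
    return [count / iterations for count in counts]


profile = [10.0, 8.0, 3.0, 8.0, 6.0, 4.0, 2.0]
print(pdep_profile(profile, rmse=0.5, iterations=200, seed=1))
```

Increasing the number of iterations reduces the sampling noise in each probability, which corresponds to the smoother pdep raster mentioned above.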
For a video demonstrating the application of the stochastic_depression_analysis tool, see this YouTube video.
Reference
Lindsay, J. B., & Creed, I. F. (2005). Sensitivity of digital landscapes to artifact depressions in remotely-sensed DEMs. Photogrammetric Engineering & Remote Sensing, 71(9), 1029-1036.
See Also
impoundment_size_index, fast_almost_gaussian_filter
Python API
def stochastic_depression_analysis(self, dem: Raster, rmse: float, range: float, iterations: int = 100) -> Raster:
Topological Breach Burn
Function name: topological_breach_burn
PRO Experimental
Burns streams into a DEM, conditions the surface, and returns stream, DEM, pointer, and accumulation rasters.
hydrology stream_burning d8
Examples
Generate topologically conditioned stream-burning outputs
Upslope Depression Storage
Function name: upslope_depression_storage
This tool estimates the average upslope depression storage depth using the FD8 flow algorithm. The input DEM (dem) need not be hydrologically corrected; the tool will internally map depression storage and resolve flowpaths using depression filling. This input elevation model should be of a fine resolution (< 2 m), and is ideally derived using LiDAR. The tool calculates the total upslope depth of depression storage, which is divided by the number of upslope cells in the final step of the process, yielding the average upslope depression depth. Roughened surfaces tend to have higher values compared with smoothed surfaces. Values, particularly on hillslopes, may be very small (< 0.01 m).
See Also
FD8FlowAccumulation, fill_depressions, depth_in_sink
Python API
def upslope_depression_storage(self, dem: Raster) -> Raster:
Watersheds and Basins
Basins
Function name: basins
This tool can be used to delineate all of the drainage basins contained within a local drainage direction, or flow pointer raster (d8_pntr), and draining to the edge of the data. The flow pointer raster must be derived using the d8_pointer tool and should have been extracted from a digital elevation model (DEM) that has been hydrologically pre-processed to remove topographic depressions and flat areas, e.g. using the breach_depressions_least_cost tool. By default, the flow pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools:
64 128 1
32 0 2
16 8 4
If the pointer file contains ESRI flow direction values instead, the esri_pntr parameter must be specified.
The basins and watershed tools are similar in function but while the watershed tool identifies the upslope areas that drain to one or more user-specified outlet points, the basins tool automatically sets outlets to all grid cells situated along the edge of the data that do not have a defined flow direction (i.e. they do not have a lower neighbour). Notice that these edge outlets need not be situated along the edges of the flow-pointer raster, but rather along the edges of the region of valid data. That is, the DEM from which the flow-pointer has been extracted may incompletely fill the containing raster, if it is irregularly shaped, and NoData regions may occupy the periphery. Thus, the entire region of valid data in the flow pointer raster will be divided into a set of mutually exclusive basins using this tool.
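The way outlets partition the valid data into mutually exclusive basins can be sketched by following each cell's pointer until it reaches an outlet or an already-labelled cell. This is an illustrative sketch, not the tool's implementation: the function name and the NoData value are hypothetical, the default (non-Esri) pointer convention is assumed, and cells whose flow leaves the valid data are treated as outlets.

```python
def label_basins(pointer, nodata=-32768):
    """Assign a basin identifier to every valid cell of a D8 pointer grid."""
    # Assumed default pointer convention: value -> (row_offset, col_offset).
    offsets = {
        1: (-1, 1), 2: (0, 1), 4: (1, 1), 8: (1, 0),
        16: (1, -1), 32: (0, -1), 64: (-1, -1), 128: (-1, 0),
    }
    rows, cols = len(pointer), len(pointer[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not labelled"
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if pointer[r][c] == nodata or labels[r][c]:
                continue
            path, on_path = [], set()
            cr, cc = r, c
            while True:
                path.append((cr, cc))
                on_path.add((cr, cc))
                step = offsets.get(pointer[cr][cc])
                if step is None:
                    # No defined flow direction: this cell is an outlet.
                    label, next_label = next_label, next_label + 1
                    break
                nr, nc = cr + step[0], cc + step[1]
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or pointer[nr][nc] == nodata or (nr, nc) in on_path:
                    # Flow leaves the valid data (or loops): treat as an outlet.
                    label, next_label = next_label, next_label + 1
                    break
                if labels[nr][nc]:
                    # Joined a flowpath that already belongs to a basin.
                    label = labels[nr][nc]
                    break
                cr, cc = nr, nc
            for pr, pc in path:
                labels[pr][pc] = label
    return labels


print(label_basins([[0, 32, 32], [2, 2, 0]]))
```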
See Also
watershed, d8_pointer, breach_depressions_least_cost
Python API
def basins(self, d8_pntr: Raster, esri_pntr: bool = False) -> Raster:
Flood Order
Function name: flood_order
This tool takes an input digital elevation model (DEM) and creates an output raster where every grid cell contains the flood order of that cell within the DEM. The flood order is the sequence of grid cells that are encountered during a search, starting from the raster grid edges and the lowest grid cell, moving inward at increasing elevations. This is in fact similar to how the highly efficient Wang and Liu (2006) depression filling algorithm and the Breach Depressions (Fast) tool operate. The output flood order raster contains the sequential order, from lowest edge cell to the highest pixel in the DEM.
Like the fill_depressions tool, flood_order will read the entire DEM into memory. This may make the algorithm ill suited to processing massive DEMs except where the user's computer has substantial memory (RAM) resources.
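The edge-inward search can be sketched with a priority queue. This is a minimal illustration in plain Python, not the tool's implementation: the function name flood_order_sketch and the list-of-lists DEM are assumptions, NoData handling is omitted, and interior cells are ranked by their flooded elevation as in a Wang and Liu style search.

```python
import heapq


def flood_order_sketch(dem):
    """Return a grid holding the 1-based order in which cells are visited."""
    rows, cols = len(dem), len(dem[0])
    order = [[0] * cols for _ in range(rows)]
    queued = [[False] * cols for _ in range(rows)]
    heap = []
    # Seed the search with every cell on the edge of the grid.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (dem[r][c], r, c))
                queued[r][c] = True
    rank = 0
    while heap:
        z, r, c = heapq.heappop(heap)  # lowest pending cell comes out first
        rank += 1
        order[r][c] = rank
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not queued[nr][nc]:
                    # A cell cannot flood before the cell it was reached from.
                    heapq.heappush(heap, (max(dem[nr][nc], z), nr, nc))
                    queued[nr][nc] = True
    return order
```

Because every cell is held in memory and pushed onto the queue once, the memory footprint grows with the size of the DEM, which matches the caution above about massive rasters.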
Reference
Wang, L., and Liu, H. (2006). An efficient method for identifying and filling surface depressions in digital elevation models for hydrologic analysis and modelling. International Journal of Geographical Information Science, 20(2), 193-213.
See Also
fill_depressions
Python API
def flood_order(self, dem: Raster) -> Raster:
Hillslopes
Function name: hillslopes
This tool identifies the hillslopes associated with a user-specified stream network, i.e. the areas draining to each of the left and right banks of the stream links. The user must input a D8 flow pointer (flow direction) raster (d8_pntr) and a streams raster (streams). The flow pointer raster should be generated using the d8_pointer tool from a depressionless DEM. By default, the pointer raster is assumed to use the clockwise indexing method used by Whitebox; if the pointer file contains ESRI flow direction values instead, the esri_pntr parameter must be True.
Reference
Lindsay JB. 2016. The practice of DEM stream burning revisited. Earth Surface Processes and Landforms, 41(5): 658–668. DOI: 10.1002/esp.3888
See Also
raster_streams_to_vector, rasterize_streams
Python API
def hillslopes(self, d8_pntr: Raster, streams: Raster, esri_pntr: bool = False) -> Raster:
Isobasins
Function name: isobasins
This tool can be used to divide a landscape into a group of nearly equal-sized watersheds, known as isobasins. The user must specify the name (dem) of a digital elevation model (DEM), the output raster name (output), and the isobasin target area (target_size) specified in units of grid cells. The DEM must have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using either the breach_depressions_least_cost or fill_depressions tool. Several temporary rasters are created and stored in memory during the execution of this tool.
The tool can optionally (connections) output a CSV table that contains the upstream/downstream connections among isobasins. That is, this table will identify the downstream basin of each isobasin, or will list N/A in the event that there is no downstream basin, i.e. if it drains to an edge. Additionally, the CSV file will contain information about the number of grid cells in each isobasin and the isobasin outlet's row and column number and flow direction. The output CSV file will have the same name as the output raster, but with a *.csv file extension.
See Also
watershed, basins, breach_depressions_least_cost, fill_depressions
Python API
def isobasins(self, dem: Raster, target_size: float, connections: bool = False, csv_file: str = "") -> Raster:
Jenson Snap Pour Points
Function name: jenson_snap_pour_points
This tool moves the location of vector pour points (pour_pts), i.e. the outlets used in a watershed operation, to the nearest stream cell in a streams raster (streams) within a specified maximum distance (snap_dist). Because pour points are usually situated on a stream network, snapping ensures that digitized outlets coincide with the digital stream contained within the DEM flowpaths.
Pour points that fall off the main flow path, even by one cell, can otherwise yield anomalously small watersheds when they are input to the watershed tool.
Reference
Antonić, O., Hatic, D., & Pernar, R. (2001). DEM-based depth in sink as an environmental estimator. Ecological Modelling, 138(1-3), 247-254.
See Also
fill_depressions
Python API
def jenson_snap_pour_points(self, pour_pts: Vector, streams: Raster, snap_dist: float = 0.0) -> Vector:
Longest Flowpath
Function name: longest_flowpath
This tool delineates the longest flowpaths for a group of subbasins or watersheds. Flowpaths are initiated along drainage divides and continue along the D8-defined flow direction until either the subbasin outlet or DEM edge is encountered. Each input subbasin/watershed will have an associated vector flowpath in the output image. longest_flowpath is similar to the r.lfp plugin tool for GRASS GIS. The length of the longest flowpath draining to an outlet is related to the time of concentration, which is a parameter used in certain hydrological models.
The user must input the filename of a digital elevation model (DEM), a basins raster, and the output vector. The DEM must be depressionless and should have been pre-processed using the breach_depressions_least_cost or fill_depressions tool. The basins raster must contain features that are delineated by categorical (integer valued) unique identifier values. All non-NoData, non-zero valued grid cells in the basins raster are interpreted as belonging to features. In practice, this tool is usually run using either a single watershed, a group of contiguous non-overlapping watersheds, or a series of nested subbasins. These are often derived using the watershed tool, based on a series of input outlets, or the subbasins tool, based on an input stream network. If subbasins are input to longest_flowpath, each traced flowpath will include only the non-overlapping portions within nested areas. Therefore, this can be a convenient method of delineating the longest flowpath to each bifurcation in a stream network.
The output vector file will contain fields in the attribute table that identify the associated basin unique identifier (BASIN), the elevation of the flowpath source point on the divide (UP_ELEV), the elevation of the outlet point (DN_ELEV), the length of the flowpath (LENGTH), and finally, the average slope (AVG_SLOPE) along the flowpath, measured as a percent grade.
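As a worked illustration of how the attribute fields relate, the sketch below derives a percent-grade slope from the other three values. It assumes that AVG_SLOPE is the elevation drop divided by the flowpath length, times 100, and that elevations and lengths share the same units; the helper name average_slope_percent is hypothetical and not part of the tool.

```python
def average_slope_percent(up_elev, dn_elev, length):
    """Percent grade between the divide source point and the outlet."""
    if length <= 0:
        raise ValueError("length must be positive")
    return 100.0 * (up_elev - dn_elev) / length
```

For example, a flowpath that starts at 150 m, ends at 100 m, and is 1000 m long has an average slope of 5 percent under this assumption.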
See Also
max_upslope_flowpath_length, breach_depressions_least_cost, fill_depressions, watershed, subbasins
Python API
def longest_flowpath(self, dem: Raster, basins: Raster) -> Vector:
Max Branch Length
Function name: max_branch_length
Maximum branch length (Bmax) is the longest branch length between a grid cell's flowpath and the flowpaths initiated at each of its neighbours. It can be conceptualized as the downslope distance that a volume of water that is split into two portions by a drainage divide would travel before reuniting.
If the two flowpaths of neighbouring grid cells do not intersect, Bmax is simply the flowpath length from the starting cell to its terminus at the edge of the grid or a cell with undefined flow direction (i.e. a pit cell either in a topographic depression or at the edge of a major body of water).
The pattern of Bmax derived from a DEM should be familiar to anyone who has interpreted upslope contributing area images. In fact, Bmax can be thought of as the complement of upslope contributing area. Whereas contributing area is greatest along valley bottoms and lowest at drainage divides, Bmax is greatest at divides and lowest along channels. The two topographic attributes are also distinguished by their units of measurements; Bmax is a length rather than an area. The presence of a major drainage divide between neighbouring grid cells is apparent in a Bmax image as a linear feature, often two grid cells wide, of relatively high values. This property makes Bmax a useful land surface parameter for mapping ridges and divides.
Bmax is useful in the study of landscape structure, particularly with respect to drainage patterns. The index gives the relative significance of a specific location along a divide, with respect to the dispersion of materials across the landscape, in much the same way that stream ordering can be used to assess stream size.
See Also
flow_length_diff
Reference
Lindsay JB, Seibert J. 2013. Measuring the significance of a divide to local drainage patterns. International Journal of Geographical Information Science, 27: 1453-1468. DOI: 10.1080/13658816.2012.705289
Python API
def max_branch_length(self, dem: Raster, log_transform: bool = False) -> Raster:
Snap Pour Points
Function name: snap_pour_points
This tool moves the location of vector pour points (pour_pts), i.e. the outlets used in a watershed operation, to the cell with the largest flow accumulation value in a flow accumulation raster (flow_accum) within a specified maximum distance (snap_dist).
For pour points situated on a stream network, Jenson's method (jenson_snap_pour_points), which snaps points onto the stream network itself, is the recommended alternative.
Reference
Antonić, O., Hatic, D., & Pernar, R. (2001). DEM-based depth in sink as an environmental estimator. Ecological Modelling, 138(1-3), 247-254.
See Also
fill_depressions
Python API
def snap_pour_points(self, pour_pts: Vector, flow_accum: Raster, snap_dist: float = 0.0) -> Vector:
Subbasins
Function name: subbasins
This tool will identify the catchment areas to each link in a user-specified stream network, i.e. the network's sub-basins. subbasins effectively performs a stream link ID operation (stream_link_identifier) followed by a watershed operation. The user must specify the name of a flow pointer (flow direction) raster (d8_pntr), a streams raster (streams), and the output raster (output). The flow pointer and streams rasters should be generated using the d8_pointer algorithm. This will require a depressionless DEM, processed using either the breach_depressions_least_cost or fill_depressions tool.
Hillslopes are conceptually similar to sub-basins, except that sub-basins do not distinguish between the right-bank and left-bank catchment areas of stream links. The subbasins tool simply assigns a unique identifier to each stream link in a stream network.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the esri_pntr parameter must be specified.
NoData values in the input flow pointer raster are assigned NoData values in the output image.
See Also
stream_link_identifier, watershed, hillslopes, d8_pointer, breach_depressions_least_cost, fill_depressions
Python API
def subbasins(self, d8_pntr: Raster, streams: Raster, esri_pntr: bool = False) -> Raster:
Unnest Basins
Function name: unnest_basins
In some applications it is necessary to relate a measured variable for a group of hydrometric stations (e.g. characteristics of flow timing and duration or water chemistry) to some characteristics of each outlet's catchment (e.g. mean slope, area of wetlands, etc.). When the group of outlets is nested, i.e. some stations are located downstream of others, performing a watershed operation will result in inappropriate watershed delineation. In particular, the delineated watersheds of each nested outlet will not include the catchment areas of upstream outlets. This creates a serious problem for this type of application.
The Unnest Basins tool can be used to perform a watershedding operation based on a group of specified pour points, i.e. outlets or target cells, such that each complete watershed is delineated. The user must specify the name of a flow pointer (flow direction) raster, a pour point raster, and the name of the output rasters. Multiple numbered outputs will be created, one for each nesting level. Pour point, or target, cells are denoted in the input pour-point image as any non-zero, non-NoData value. The flow pointer raster should be generated using the D8 algorithm.
Python API
def unnest_basins(self, d8_pointer: Raster, pour_points: Vector, esri_pntr: bool = False) -> List[Raster]:
Watershed
Function name: watershed
This tool will perform a watershedding operation based on a group of input vector pour points (pour_points), i.e. outlets or points-of-interest. Watershedding is a procedure that identifies all of the cells upslope of a cell of interest (pour point) that are connected to the pour point by a flow-path. The user must input a D8-derived flow pointer (flow direction) raster (d8_pointer) and a vector pour point file (pour_points). The pour points must be of a Point ShapeType (i.e. Point, PointZ, PointM, MultiPoint, MultiPointZ, MultiPointM). Watersheds will be assigned the input pour point FID value. The flow pointer raster must be generated using the D8 algorithm, d8_pointer.
Pour point vectors can be obtained by on-screen digitizing to designate these points-of-interest locations. Because pour points are usually, although not always, situated on a stream network, it is recommended that you use Jenson's method (jenson_snap_pour_points) to snap pour points onto the stream network. This will ensure that the digitized outlets are coincident with the digital stream contained within the DEM flowpaths. If this is not done prior to inputting a pour-point set to the watershed tool, anomalously small watersheds may be output, as pour points that fall off the main flow path (even by one cell) in the D8 pointer will yield very different catchment areas.
If a raster pour point is specified instead of vector points, the watershed labels will derive their IDs from the grid cell values of all non-zero, non-NoData valued grid cells in the pour points file. Notice that this file can contain any integer data. For example, if a lakes raster, with each lake possessing a unique ID, is used as the pour points raster, the tool will map the watersheds draining to each of the input lake features. Similarly, a pour points raster may actually be a streams file, such as what is generated by the stream_link_identifier tool.
By default, the pointer raster is assumed to use the clockwise indexing method used by Whitebox Workflows. If the pointer file contains ESRI flow direction values instead, the esri_pntr parameter must be True.
There are several tools that perform similar watershedding operations in Whitebox Workflows. watershed is appropriate to use when you have a set of specific locations for which you need to derive the watershed areas. Use the basins tool instead when you simply want to find the watersheds draining to each outlet situated along the edge of a DEM. The isobasins tool can be used to divide a landscape into roughly equally sized watersheds. The subbasins and strahler_order_basins tools are useful when you need to find the areas draining to each link within a stream network. Finally, hillslopes can be used to identify the areas draining to each of the left and right banks of a stream network.
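The watershedding procedure described above, labelling every cell whose flow-path reaches a pour point, can be sketched as an upslope search over the pointer grid. This is a simplified illustration rather than the tool's implementation: label_watersheds is a hypothetical helper, the pointer grid is a list of lists using the default clockwise scheme (assumed here to start at 1 for the north-east neighbour), and pour points are given as a dictionary of (row, col) to watershed ID.

```python
from collections import deque

# (row_delta, col_delta) for each clockwise pointer value; rows increase southward.
D8_OFFSETS = {1: (-1, 1), 2: (0, 1), 4: (1, 1), 8: (1, 0),
              16: (1, -1), 32: (0, -1), 64: (-1, -1), 128: (-1, 0)}


def label_watersheds(pointer, pour_points):
    """Assign each cell the ID of the first pour point its flow-path reaches (0 = none)."""
    rows, cols = len(pointer), len(pointer[0])
    labels = [[0] * cols for _ in range(rows)]
    queue = deque()
    for (r, c), watershed_id in pour_points.items():
        labels[r][c] = watershed_id
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if labels[nr][nc] != 0:
                    continue  # already claimed, e.g. by an upstream pour point
                offset = D8_OFFSETS.get(pointer[nr][nc])
                # The neighbour is upslope only if its pointer leads to this cell.
                if offset is not None and (nr + offset[0], nc + offset[1]) == (r, c):
                    labels[nr][nc] = labels[r][c]
                    queue.append((nr, nc))
    return labels
```

In this sketch, a downstream pour point does not claim cells already assigned to an upstream one, so nested outlets yield non-overlapping partial watersheds.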
Reference
Jenson, S. K. (1991), Applications of hydrological information automatically extracted from digital elevation models, Hydrological Processes, 5, 31–44, doi:10.1002/hyp.3360050104.
Lindsay JB, Rothwell JJ, and Davies H. 2008. Mapping outlet points used for watershed delineation onto DEM-derived stream networks, Water Resources Research, 44, W08442, doi:10.1029/2007WR006507.
See Also
d8_pointer, basins, subbasins, isobasins, strahler_order_basins, hillslopes, jenson_snap_pour_points, breach_depressions_least_cost, fill_depressions
Python API
def watershed(self, d8_pointer: Raster, pour_points: Vector, esri_pntr: bool = False) -> Raster:
Watershed From Raster Pour Points
Function name: watershed_from_raster_pour_points
This tool will perform a watershedding operation based on an input raster containing pour points (pour_points). Watershedding is a procedure that identifies all of the cells upslope of a cell of interest (pour point) that are connected to the pour point by a flow-path. The user must input a D8-derived flow pointer (flow direction) raster (d8_pointer) and a pour points raster (pour_points). The flow pointer raster must be generated using the D8 algorithm, d8_pointer.
Watershed labels will derive their IDs from the grid cell values of all non-zero, non-NoData valued grid cells in the pour points file. Notice that this file can contain any integer data. For example, if a lakes raster, with each lake possessing a unique ID, is used as the pour points raster, the tool will map the watersheds draining to each of the input lake features. Similarly, a pour points raster may actually be a streams file, such as what is generated by the stream_link_identifier tool.
By default, the pointer raster is assumed to use the clockwise indexing method used by Whitebox Workflows. If the pointer file contains ESRI flow direction values instead, the esri_pntr parameter must be specified.
There are several tools that perform similar watershedding operations in Whitebox Workflows. watershed is appropriate to use when you have a set of specific locations for which you need to derive the watershed areas. Use the basins tool instead when you simply want to find the watersheds draining to each outlet situated along the edge of a DEM. The isobasins tool can be used to divide a landscape into roughly equally sized watersheds. The subbasins and strahler_order_basins tools are useful when you need to find the areas draining to each link within a stream network. Finally, hillslopes can be used to identify the areas draining to each of the left and right banks of a stream network.
Reference
Jenson, S. K. (1991), Applications of hydrological information automatically extracted from digital elevation models, Hydrological Processes, 5, 31–44, doi:10.1002/hyp.3360050104.
Lindsay JB, Rothwell JJ, and Davies H. 2008. Mapping outlet points used for watershed delineation onto DEM-derived stream networks, Water Resources Research, 44, W08442, doi:10.1029/2007WR006507.
See Also
d8_pointer, basins, subbasins, isobasins, strahler_order_basins, hillslopes, jenson_snap_pour_points, breach_depressions_least_cost, fill_depressions
Python API
def watershed_from_raster_pour_points(self, d8_pointer: Raster, pour_points: Raster, esri_pntr: bool = False) -> Raster:
Hydrologic Indices
Depth To Water
Function name: depth_to_water
Description
This tool calculates the cartographic depth-to-water (DTW) index described by Murphy et al. (2009). The DTW index has been shown to be related to soil moisture, and is useful for identifying low-lying positions that are likely to experience surface saturated conditions. In this regard, it is similar to each of wetness_index, elevation_above_stream (HAND), and probability-of-depressions (i.e. stochastic_depression_analysis).
The index is the cumulative slope gradient along the least-slope path connecting each grid cell in an input DEM (dem) to a surface water cell. Tangent slope (i.e. rise / run) is calculated for each grid cell based on the neighbouring elevation values in the input DEM. The algorithm operates much like a cost-accumulation analysis (cost_distance), where the cost of moving through a cell is determined by the cell's tangent slope value and the distance travelled. Therefore, lower DTW values are associated with wetter soils and higher values indicate drier conditions, over longer time periods. Areas of surface water have DTW values of zero. The user must input surface water features, including vector stream lines (streams) and/or vector waterbody polygons (lakes, i.e. lakes, ponds, wetlands, etc.). At least one of these two optional water feature inputs must be specified. The tool internally rasterizes these vector features, setting the DTW value in the output raster to zero. DTW tends to increase with greater distances from surface water features, and increases more slowly in flatter topography and more rapidly in steeper settings. Murphy et al. (2009) state that DTW is a probabilistic model that assumes uniform soil properties, climate, and vegetation.
Note that DTW values are highly dependent upon the accuracy and extent of the input streams/lakes layer(s).
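The cost-accumulation idea can be sketched with Dijkstra's algorithm over a grid of tangent slopes. This is a minimal illustration under stated assumptions, not the tool's implementation: depth_to_water_sketch is a hypothetical name, the inputs are plain lists of lists (a slope grid and a boolean water mask that stands in for the rasterized streams and lakes), and the cost of a step is taken as the mean slope of the two cells times the distance travelled, as in a conventional cost-distance analysis.

```python
import heapq
import math


def depth_to_water_sketch(slope, water, cell_size=1.0):
    """Accumulate slope * distance along the least-cost path to a water cell."""
    rows, cols = len(slope), len(slope[0])
    dtw = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r in range(rows):
        for c in range(cols):
            if water[r][c]:
                dtw[r][c] = 0.0  # surface water has a DTW of zero
                heap.append((0.0, r, c))
    heapq.heapify(heap)
    while heap:
        cost, r, c = heapq.heappop(heap)
        if cost > dtw[r][c]:
            continue  # stale queue entry; a cheaper path was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                distance = cell_size * math.hypot(dr, dc)
                step = 0.5 * (slope[r][c] + slope[nr][nc]) * distance
                if cost + step < dtw[nr][nc]:
                    dtw[nr][nc] = cost + step
                    heapq.heappush(heap, (cost + step, nr, nc))
    return dtw
```

Because every path starts at a water cell, omitting a stream or lake from the inputs raises the DTW of all cells that would have drained to it, which is the sensitivity noted above.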
References
Murphy, PNC, Gilvie, JO, and Arp, PA (2009) Topographic modelling of soil moisture conditions: a comparison and verification of two models. European Journal of Soil Science, 60, 94–109, DOI: 10.1111/j.1365-2389.2008.01094.x.
See Also
wetness_index, elevation_above_stream, stochastic_depression_analysis
Python API
def depth_to_water(self, dem: Raster, streams: Optional[Vector] = None, lakes: Optional[Vector] = None) -> Raster:
Distance To Outlet
Function name: distance_to_outlet
Description
This tool calculates the distance of stream grid cells to the channel network outlet cell for each grid cell belonging to a raster stream network. The user must input a raster containing streams data (streams_raster), where stream grid cells are denoted by all positive non-zero values, and a D8 flow pointer (i.e. flow direction) raster (d8_pointer). The pointer image is used to traverse the stream network and must only be created using the D8 algorithm. Stream cells are designated in the streams image as all grid cells with values greater than zero. Thus, all non-stream or background grid cells are commonly assigned either zeros or NoData values. Background cells will be assigned the NoData value in the output image, unless the zero_background parameter is True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by Whitebox. If the pointer file contains ESRI flow direction values instead, the esri_pointer parameter must be True.
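The traversal can be sketched for a single stream cell as follows. This is an illustration only: distance_to_outlet_sketch is a hypothetical helper that works on lists of lists, it assumes the default clockwise pointer scheme (starting at 1 for the north-east neighbour), and it treats the outlet as the last stream cell reached before the flow-path leaves the stream network or the grid.

```python
import math

# (row_delta, col_delta) for each clockwise pointer value; rows increase southward.
D8_OFFSETS = {1: (-1, 1), 2: (0, 1), 4: (1, 1), 8: (1, 0),
              16: (1, -1), 32: (0, -1), 64: (-1, -1), 128: (-1, 0)}


def distance_to_outlet_sketch(pointer, streams, row, col, cell_size=1.0):
    """Follow the D8 pointer downstream from a stream cell and sum the path length."""
    rows, cols = len(pointer), len(pointer[0])
    total = 0.0
    r, c = row, col
    # A valid D8 pointer has no cycles; the step cap only guards against bad input.
    for _ in range(rows * cols):
        offset = D8_OFFSETS.get(pointer[r][c])
        if offset is None:
            break  # no flow direction: this cell is the outlet
        nr, nc = r + offset[0], c + offset[1]
        if not (0 <= nr < rows and 0 <= nc < cols) or streams[nr][nc] <= 0:
            break  # the next cell leaves the grid or the stream network
        total += cell_size * math.hypot(offset[0], offset[1])
        r, c = nr, nc
    return total
```

Diagonal steps contribute the cell size times the square root of two, so distances depend on the path geometry and not only on the number of cells traversed.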
See Also
downslope_distance_to_stream, length_of_upstream_channels
Parameters
d8_pointer (Raster): The D8 pointer (flow direction) raster.
streams_raster (Raster): The raster object containing the streams data.
esri_pointer (bool): Determines whether the d8_pointer raster contains pointer data in the Esri format. Default is False.
zero_background (bool): Determines whether background cells in the output raster are assigned zero (True) or NoData values (False). Default is False.
Returns
Raster: The output raster containing, for each stream cell, the distance to the channel network outlet.
Python API
def distance_to_outlet(self, d8_pointer: Raster, streams_raster: Raster, esri_pointer: bool = False, zero_background: bool = False) -> Raster:
Downslope Distance To Stream
Function name: downslope_distance_to_stream
This tool can be used to calculate the distance from each grid cell in a raster to the nearest stream cell, measured along the downslope flowpath. The user must specify the name of an input digital elevation model (dem) and streams raster (streams). The DEM must have been pre-processed to remove artifact topographic depressions and flat areas (see breach_depressions_least_cost). The streams raster should have been created using one of the DEM-based stream mapping methods, i.e. contributing area thresholding. Stream cells are designated in this raster as all non-zero values. The output of this tool, along with the elevation_above_stream tool, can be useful for preliminary flood plain mapping when combined with high-accuracy DEM data.
By default, this tool calculates flow-paths using the D8 flow algorithm. However, the user may specify (use_dinf) that the tool should use the D-infinity algorithm instead.
See Also
elevation_above_stream, distance_to_outlet
Python API
def downslope_distance_to_stream(self, dem: Raster, streams: Raster, use_dinf: bool = False) -> Raster:
Downslope Index
Function name: downslope_index
This tool can be used to calculate the downslope index described by Hjerdt et al. (2004). The downslope index is a measure of the slope gradient between a grid cell and some downslope location (along the flowpath passing through the upslope grid cell) that represents a specified vertical drop (i.e. a potential head drop). The index has been shown to be useful for hydrological, geomorphological, and biogeochemical applications.
The user must input a digital elevation model (DEM) raster. This DEM should have been pre-processed to remove artifact topographic depressions and flat areas. The user must also specify the head potential drop (vertical_drop, referred to as d below), and the output type (output_type). The output type can be either 'tangent', 'degrees', 'radians', or 'distance'. If 'distance' is selected as the output type, the output grid actually represents the downslope flowpath length required to drop d meters from each grid cell. Linear interpolation is used when the specified drop value is encountered between two adjacent grid cells along a flowpath traverse.
Notice that this algorithm is affected by edge contamination. That is, for some grid cells, the edge of the grid will be encountered along a flowpath traverse before the specified vertical drop occurs. In these cases, the value of the downslope index is approximated by replacing d with the actual elevation drop observed along the flowpath. To avoid this problem, the entire watershed containing an area of interest should be contained in the DEM.
Grid cells containing NoData values in any of the input images are assigned the NoData value in the output raster. The output raster is of the float data type and continuous data scale.
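The calculation for a single cell, including the linear interpolation and the edge-contamination fallback, can be sketched as below. This is an illustrative helper, not the tool's implementation: downslope_index_sketch is a hypothetical name, and the flowpath is assumed to be supplied as a list of (distance along path, elevation) pairs that starts at the cell of interest and descends downslope.

```python
import math


def downslope_index_sketch(flowpath, vertical_drop, output_type="tangent"):
    """flowpath: [(distance_along_path, elevation), ...] starting at the cell of interest."""
    if vertical_drop <= 0:
        raise ValueError("vertical_drop must be positive")
    start_dist, start_elev = flowpath[0]
    target = start_elev - vertical_drop
    length = None
    drop = vertical_drop
    for (d0, z0), (d1, z1) in zip(flowpath, flowpath[1:]):
        if z1 <= target:
            # The target elevation lies between these two cells: interpolate linearly.
            frac = (z0 - target) / (z0 - z1)
            length = d0 + frac * (d1 - d0) - start_dist
            break
    if length is None:
        # Edge contamination: the path ends before the drop is reached,
        # so the observed drop replaces vertical_drop.
        end_dist, end_elev = flowpath[-1]
        length = end_dist - start_dist
        drop = start_elev - end_elev
    if output_type == "distance":
        return length
    if length <= 0:
        raise ValueError("flowpath is too short to compute a gradient")
    tangent = drop / length
    if output_type == "tangent":
        return tangent
    if output_type == "radians":
        return math.atan(tangent)
    if output_type == "degrees":
        return math.degrees(math.atan(tangent))
    raise ValueError("unknown output_type: " + output_type)
```

For a path of (0, 100), (10, 98), (20, 94) and a drop of 4, the target elevation of 96 is reached halfway along the second segment, giving a distance of 15 and a tangent of 4/15.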
Reference
Hjerdt, K.N., McDonnell, J.J., Seibert, J. Rodhe, A. (2004) A new topographic index to quantify downslope controls on local drainage, Water Resources Research, 40, W05602, doi:10.1029/2004WR003130.
Python API
def downslope_index(self, dem: Raster, vertical_drop: float, output_type: str = "tangent") -> Raster:
Edge Contamination
Function name: edge_contamination
This tool identifies grid cells in a DEM for which the upslope area extends beyond the raster data extent, so-called 'edge-contaminated cells'. If a significant number of edge-contaminated cells intersect with your area of interest, it is likely that any estimate of upslope area (i.e. flow accumulation) will be under-estimated.
The user must specify the name (dem) of the input digital elevation model (DEM) and the output file (output). The DEM must have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using either the breach_depressions_least_cost or fill_depressions tool.
Additionally, the user must specify the type of flow algorithm used for the analysis (flow_type), which must be one of 'd8', 'mfd', or 'dinf', based on the D8FlowAccumulation, FD8FlowAccumulation, and DInfFlowAccumulation methods respectively.
See Also
D8FlowAccumulation, FD8FlowAccumulation, DInfFlowAccumulation
Python API
def edge_contamination(self, dem: Raster, flow_type: str = "mfd", z_factor: float = -1.0) -> Raster:
Elev Relative To Watershed Min Max
Function name: elev_relative_to_watershed_min_max
This tool can be used to express the elevation of a grid cell in a digital elevation model (DEM) as a percentage of the relief between the watershed minimum and maximum values. As such, it provides a basic measure of relative topographic position. The user must input a DEM raster (dem) and a watersheds raster (watersheds).
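The percentage can be illustrated with a short plain-Python sketch. It is not the tool's implementation: relative_elevation_sketch is a hypothetical name, both inputs are lists of lists, NoData handling is omitted, and watersheds with no relief are assigned zero by assumption.

```python
def relative_elevation_sketch(dem, watersheds):
    """Express each elevation as a percentage of its watershed's min-max relief."""
    # First pass: find the minimum and maximum elevation in each watershed.
    stats = {}
    for dem_row, ws_row in zip(dem, watersheds):
        for z, ws_id in zip(dem_row, ws_row):
            lo, hi = stats.get(ws_id, (z, z))
            stats[ws_id] = (min(lo, z), max(hi, z))
    # Second pass: rescale each cell against its own watershed's range.
    out = []
    for dem_row, ws_row in zip(dem, watersheds):
        out_row = []
        for z, ws_id in zip(dem_row, ws_row):
            lo, hi = stats[ws_id]
            out_row.append(100.0 * (z - lo) / (hi - lo) if hi > lo else 0.0)
        out.append(out_row)
    return out
```

A cell at 20 m in a watershed spanning 10 m to 30 m therefore receives a value of 50, regardless of the elevations found in neighbouring watersheds.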
See Also
elev_relative_to_min_max, elevation_above_stream, ElevAbovePit
Python API
def elev_relative_to_watershed_min_max(self, dem: Raster, watersheds: Raster) -> Raster:
Elevation Above Stream
Function name: elevation_above_stream
This tool can be used to calculate the elevation of each grid cell in a raster above the nearest stream cell, measured along the downslope flowpath. This terrain index, a measure of relative topographic position, is essentially equivalent to the 'height above drainage' (HAND), as described by Renno et al. (2008). The user must specify the name of an input digital elevation model (dem) and streams raster (streams). The DEM must have been pre-processed to remove artifact topographic depressions and flat areas (see breach_depressions_least_cost). The streams raster should have been created using one of the DEM-based stream mapping methods, i.e. contributing area thresholding. Stream cells are designated in this raster as all non-zero values. The output of this tool, along with the downslope_distance_to_stream tool, can be useful for preliminary flood plain mapping when combined with high-accuracy DEM data.
The difference between elevation_above_stream and elevation_above_stream_euclidean is that the former calculates distances along drainage flow-paths while the latter calculates straight-line distances to stream channels.
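The contrast between the two approaches can be sketched for a single cell. Both helpers below are hypothetical and work on lists of lists; the flow-path version assumes a D8 pointer grid in the default clockwise scheme (starting at 1 for the north-east neighbour), whereas the real tool takes only a DEM and a streams raster and derives flow directions internally.

```python
import math

# (row_delta, col_delta) for each clockwise pointer value; rows increase southward.
D8_OFFSETS = {1: (-1, 1), 2: (0, 1), 4: (1, 1), 8: (1, 0),
              16: (1, -1), 32: (0, -1), 64: (-1, -1), 128: (-1, 0)}


def hand_flowpath(dem, pointer, streams, row, col):
    """Elevation above the first stream cell reached along the downslope flow-path."""
    rows, cols = len(dem), len(dem[0])
    r, c = row, col
    for _ in range(rows * cols):  # step cap guards against malformed pointers
        if streams[r][c] != 0:
            return dem[row][col] - dem[r][c]
        offset = D8_OFFSETS.get(pointer[r][c])
        if offset is None:
            return None  # flow-path ends without reaching a stream
        r, c = r + offset[0], c + offset[1]
        if not (0 <= r < rows and 0 <= c < cols):
            return None  # flow-path leaves the grid
    return None


def hand_euclidean(dem, streams, row, col):
    """Elevation above the stream cell that is nearest in a straight line."""
    best = None
    for r, stream_row in enumerate(streams):
        for c, value in enumerate(stream_row):
            if value != 0:
                distance = math.hypot(r - row, c - col)
                if best is None or distance < best[0]:
                    best = (distance, dem[r][c])
    return None if best is None else dem[row][col] - best[1]
```

The two values differ whenever the nearest stream cell in a straight line is not the one that the cell actually drains to.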
Reference
Renno, C. D., Nobre, A. D., Cuartas, L. A., Soares, J. V., Hodnett, M. G., Tomasella, J., & Waterloo, M. J. (2008). HAND, a new terrain descriptor using SRTM-DEM: Mapping terra-firme rainforest environments in Amazonia. Remote Sensing of Environment, 112(9), 3469-3481.
See Also
elevation_above_stream_euclidean, downslope_distance_to_stream, ElevAbovePit, breach_depressions_least_cost
Python API
def elevation_above_stream(self, dem: Raster, streams: Raster) -> Raster:
Elevation Above Stream Euclidean
Function name: elevation_above_stream_euclidean
This tool can be used to calculate the elevation of each grid cell in a raster above the nearest stream cell, measured along the straight-line distance. This terrain index, a measure of relative topographic position, is related to the 'height above drainage' (HAND), as described by Renno et al. (2008). HAND is generally estimated with distances measured along drainage flow-paths, which can be calculated using the elevation_above_stream tool. The user must specify the name of an input digital elevation model (dem) and streams raster (streams). Stream cells are designated in this raster as all non-zero values. The output of this tool, along with the downslope_distance_to_stream tool, can be useful for preliminary flood plain mapping when combined with high-accuracy DEM data.
The difference between elevation_above_stream and elevation_above_stream_euclidean is that the former calculates distances along drainage flow-paths while the latter calculates straight-line distances to stream channels.
Reference
Renno, C. D., Nobre, A. D., Cuartas, L. A., Soares, J. V., Hodnett, M. G., Tomasella, J., & Waterloo, M. J. (2008). HAND, a new terrain descriptor using SRTM-DEM: Mapping terra-firme rainforest environments in Amazonia. Remote Sensing of Environment, 112(9), 3469-3481.
See Also
elevation_above_stream, downslope_distance_to_stream, ElevAbovePit
Python API
def elevation_above_stream_euclidean(self, dem: Raster, streams: Raster) -> Raster:
Find Noflow Cells
Function name: find_noflow_cells
This tool can be used to find cells with undefined flow, i.e. no valid flow direction, based on the D8 flow direction algorithm (d8_pointer). These cells are therefore either at the bottom of a topographic depression or in the interior of a flat area. In a digital elevation model (DEM) that has been pre-processed to remove all depressions and flat areas (breach_depressions_least_cost), this condition will only occur along the edges of the grid; otherwise, no-flow grid cells can also be situated in the interior. The user must specify the name (dem) of the DEM.
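The no-flow condition can be illustrated as a test for cells that have no lower neighbour. This is a minimal sketch, not the tool's implementation: find_noflow_sketch is a hypothetical name, the DEM is a list of lists, and NoData handling is omitted.

```python
def find_noflow_sketch(dem):
    """Mark cells (1) that have no lower neighbour among their eight neighbours."""
    rows, cols = len(dem), len(dem[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            has_lower = False
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr, dc) != (0, 0) and 0 <= nr < rows and 0 <= nc < cols:
                        if dem[nr][nc] < dem[r][c]:
                            has_lower = True
            out[r][c] = 0 if has_lower else 1
    return out
```

Every cell of a perfectly flat DEM is flagged by this test, and so is the bottom cell of a pit, which corresponds to the two cases named above.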
See Also
d8_pointer, breach_depressions_least_cost
Python API
def find_noflow_cells(self, dem: Raster) -> Raster:
Find Parallel Flow
Function name: find_parallel_flow
This tool can be used to find cells in a stream network grid (streams) that possess parallel flow directions based on an input D8 flow-pointer grid (d8_pntr). Because streams rarely flow in parallel for significant distances, these areas are likely errors resulting from the biased assignment of flow direction based on the D8 method.
See Also
d8_pointer
Python API
def find_parallel_flow(self, d8_pntr: Raster, streams: Raster) -> Raster:
Hydrologic Connectivity
Function name: hydrologic_connectivity
Theory
This tool calculates two indices related to hydrologic connectivity within catchments, the downslope unsaturated length (DUL) and the upslope disconnected saturated area (UDSA). Both of these hydrologic indices are based on the topographic wetness index (wetness_index), which measures the propensity for a site to be saturated to the surface, and therefore, to contribute to surface runoff. The wetness index (WI) is commonly used in hydrologic modelling, and famously in the TOPMODEL, to simulate variable source area (VSA) dynamics within catchments. The VSA is a dynamic region of surface-saturated soils within catchments that contributes fast overland flow to downslope streams during periods of precipitation. As a catchment's soil saturation deficit decreases ('wetting up'), areas with increasingly lower WI values become saturated to the surface. That is, areas of high WI are the first to become saturated and as the moisture deficit decreases, lower WI-valued cells become saturated, increasing the spatial extent of the source area. As a catchment dries out, the opposite effect occurs. The distribution of WI can therefore be used to map the spatial dynamics of the VSA. However, the assumption in the TOPMODEL is that any rainfall over surface saturated areas will contribute to fast overland flow pathways and to stream discharge within the time step.
This method therefore implicitly assumes that all surface saturated grid cells are connected by continuously saturated areas along the downslope flow path connecting the cells to the stream. By comparison, Lane et al. (2004) proposed a modified WI, known as the network index (NI), which allowed for the modelling of disconnected, non-contributing saturated areas. The NI is essentially the downslope minimum WI. Grid cells for which WI > NI are likely to be disconnected during certain conditions from downslope streams, while similarly WI-valued cells are contributing. During these periods, any surface runoff from these cells is likely to contribute to downslope re-infiltration rather than directly to stream discharge via overland flow. This has implications for the timing and quality of stream discharge.
The DUL and UDSA indices extend the notion of the NI by mapping areas within catchments that are likely, at least during certain periods, to be sites of disconnected, non-contributing saturated areas and sites of re-infiltration, respectively. These combined indices allow hydrologists to study the hydrologic connectivity and disconnectivity among areas within catchments.
The DUL (see image below) is defined for a grid cell as the number of downslope cells with a WI value lower than the current cell. Areas with non-zero DUL are likely to become fully saturated, and to contribute to overland flow, before they are directly connected to downslope areas and can contribute to stream flow. Under the appropriate catchment saturation deficit conditions, these are sites of disconnected, non-contributing saturated areas. When non-zero DUL cells are initially saturated, their precipitation excess will contribute to downslope re-infiltration, lessening the catchment's overall saturation deficit, rather than contributing to stormflow.
The UDSA (see image below) is defined for a grid cell as the number of upslope cells with a WI value higher than the current cell. Areas with non-zero UDSA are likely to have saturation deficits that are at least partly satisfied by local re-infiltation of overland flow from upslope areas. These non-zero UDSA cells are key sites causing the hydrologic disconnectivity of the catchment during certain conditions.
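The two definitions can be made concrete with a small sketch that applies them along a single flow path. This is a deliberate simplification: real upslope areas form a branching tree rather than a single line of cells, and the function name and list-based input are assumptions for illustration only.

```python
def dul_and_udsa_along_path(wi_path):
    """Count lower-WI downslope cells (DUL) and higher-WI upslope cells (UDSA).

    wi_path lists WI values along one flow path, ordered from the most
    upslope cell to the most downslope cell.
    """
    dul, udsa = [], []
    for i, wi in enumerate(wi_path):
        # DUL: number of downslope cells with a WI lower than this cell.
        dul.append(sum(1 for other in wi_path[i + 1:] if other < wi))
        # UDSA: number of upslope cells with a WI higher than this cell.
        udsa.append(sum(1 for other in wi_path[:i] if other > wi))
    return dul, udsa

wi_path = [6.0, 9.0, 7.0, 8.0, 12.0]
dul, udsa = dul_and_udsa_along_path(wi_path)
```

Here the second cell (WI = 9.0) has two downslope cells with lower WI, so it may saturate before it is connected to the stream, and the two cells below it receive a non-zero UDSA.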
In the original Lane et al. (2004) NI paper, the authors state that the calculation of the index requires a unique, single downslope flow path for each grid cell. Therefore, the authors used the D8 single-direction flow algorithm to calculate NI. While the D8 method works well to model flow in convergent and channelized areas, it is generally recognized as a poor method for estimating WI on hillslopes, where divergent, non-channelized flow dominates. Furthermore, the use of the D8 algorithm implied that the only way that WI can decrease downslope is for slope gradient to decrease, since specific contributing area only increases downslope with the D8 method. However, theoretically, WI may also decrease downslope due to flow dispersion, which allows for the upslope area (a surrogate for discharge) to be spread over a larger downslope dispersal area. The original NI formulation could not account for this effect.
Thus, in the implementation of the hydrologic_connectivity tool, WI is first calculated using the multiple flow-direction (MFD) algorithm described by Quinn et al. (1995), which is commonly used to estimate WI. While this implies that there are a multitude of potential flow pathways connecting each grid cell to a downstream location, in reality, if the flow path that follows the path of maximum WI issuing from a cell experiences a reduction in WI (to the point where it becomes less than the issuing cell's WI), then we can safely assume that re-infiltration occurs and the issuing cell is at times disconnected from downslope sites. Thus, after WI has been estimated using the quinn_flow_accumulation algorithm, flow directions, which are used to calculate upslope and downslope flow paths for calculating the two indices, are approximated by identifying the downslope neighbour of highest WI value for each grid cell.
Operation
The user must specify the name of the input digital elevation model (DEM; dem), and the output DUL and UDSA rasters (output1 and output2). The DEM must have been hydrologically corrected to remove all spurious depressions and flat areas. DEM pre-processing is usually achieved using either the breach_depressions_least_cost or fill_depressions tool. The remaining two parameters are associated with the calculation of the Quinn et al. (1995) flow accumulation (quinn_flow_accumulation), used to estimate WI. A value must be specified for the exponent parameter (exponent), a number that controls the degree of dispersion in the flow-accumulation grid. A lower value yields greater apparent flow dispersion across divergent hillslopes. The exponent value (h) should probably be less than 10.0 and values between 1 and 2 are most common. The following equations are used to calculate the portion of flow (Fi) given to each neighbour, i:
$$F_i = \frac{L_i (\tan\beta)^p}{\sum_{i=1}^{n} \left[ L_i (\tan\beta)^p \right]}$$
$$p = \left( \frac{A}{\mathrm{threshold}} + 1 \right)^h$$
where Li is the contour length (0.5 × grid size for cardinal directions and 0.354 × grid size for diagonal directions), n = 8 is the number of neighbouring grid cells, and A is the flow accumulation value assigned to the current grid cell that is being apportioned downslope. The non-dispersive, channel initiation threshold (threshold) is a flow-accumulation value (measured in upslope grid cells, which is directly proportional to area) above which flow dispersion is no longer permitted. Grid cells with flow-accumulation values above this threshold will have their flow routed in a manner that is similar to the D8 single-flow-direction algorithm, directing all flow towards the steepest downslope neighbour. This is usually done under the assumption that flow dispersion, whilst appropriate on hillslope areas, is not realistic once flow becomes channelized. Importantly, the threshold parameter sets the spatial extent of the stream network, with lower values resulting in more extensive networks.
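The flow-partitioning equations can be sketched for a single grid cell as follows. This is an illustrative reading of the formulas above, not the tool's code: the function name and the offset-keyed dictionary are hypothetical, a positive threshold is assumed, and the switch to D8-like routing above the threshold is not modelled.

```python
import math

def quinn_flow_fractions(z_centre, z_neighbours, cell_size, accum, threshold, h):
    """Return the fraction of flow given to each downslope neighbour.

    z_neighbours maps (row_offset, col_offset) to the neighbour's elevation.
    """
    # Variable exponent: p grows as accumulation approaches the threshold.
    p = (accum / threshold + 1.0) ** h
    weights = {}
    for (dr, dc), z in z_neighbours.items():
        diagonal = dr != 0 and dc != 0
        distance = cell_size * (math.sqrt(2.0) if diagonal else 1.0)
        tan_beta = (z_centre - z) / distance
        if tan_beta > 0.0:  # only downslope neighbours receive flow
            contour_length = (0.354 if diagonal else 0.5) * cell_size
            weights[(dr, dc)] = contour_length * tan_beta ** p
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()} if total > 0.0 else {}

# East neighbour is steeper than the south neighbour; all others are higher.
neighbours = {(-1, -1): 11.0, (-1, 0): 11.0, (-1, 1): 11.0, (0, -1): 11.0,
              (0, 1): 8.0, (1, -1): 11.0, (1, 0): 9.0, (1, 1): 11.0}
low = quinn_flow_fractions(10.0, neighbours, 1.0, accum=0.0, threshold=100.0, h=1.1)
high = quinn_flow_fractions(10.0, neighbours, 1.0, accum=300.0, threshold=100.0, h=1.1)
```

With no upslope accumulation, p equals 1 and the steeper east neighbour receives two thirds of the flow. As accumulation grows, p increases and flow concentrates further on the steepest neighbour.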
References
Beven K.J., Kirkby M.J., 1979. A physically-based, variable contributing area model of basin hydrology. Hydrological Sciences Bulletin 24: 43–69.
Lane, S.N., Brookes, C.J., Kirkby, M.J. and Holden, J., 2004. A network‐index‐based version of TOPMODEL for use with high‐resolution digital topographic data. Hydrological processes, 18(1), pp.191-201.
Quinn, P. F., Beven, K. J., and Lamb, R., 1995. The ln(a/tanβ) index: How to calculate it and how to use it within the TOPMODEL framework. Hydrological Processes, 9(2): 161-182.
See Also
wetness_index, quinn_flow_accumulation
Python API
def hydrologic_connectivity(self, dem: Raster, exponent: float = 1.1, convergence_threshold: float = 0.0, z_factor: float = 1.0) -> Tuple[Raster, Raster]:
Relative Stream Power Index
Function name: relative_stream_power_index
This tool can be used to calculate the relative stream power (RSP) index. This index is directly related to the stream power if the assumption can be made that discharge is directly proportional to upslope contributing area (As; sca). The index is calculated as:
$$RSP = A_s^{p} \times \tan(\beta)$$
where As is the specific catchment area (i.e. the upslope contributing area per unit contour length) estimated using one of the available flow accumulation algorithms; β is the local slope gradient in degrees (slope); and, p (exponent) is a user-defined exponent term that controls the location-specific relation between contributing area and discharge. Notice that As must not be log-transformed prior to being used; As is commonly log-transformed to enhance visualization of the data. The slope raster can be created from the base digital elevation model (DEM) using the slope tool. The input images must have the same grid dimensions.
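A per-cell sketch of the index is shown below. The function name is hypothetical and NoData handling is omitted. The main point is that the slope is supplied in degrees, so it must be converted to radians before taking the tangent.

```python
import math

def relative_stream_power(sca, slope_deg, exponent=1.0):
    """RSP = As**p * tan(beta), with beta supplied in degrees."""
    return sca ** exponent * math.tan(math.radians(slope_deg))

rsp_steep = relative_stream_power(100.0, 45.0)        # tan(45 degrees) is 1
rsp_gentle = relative_stream_power(100.0, 5.0)
rsp_damped = relative_stream_power(100.0, 45.0, 0.5)  # exponent of 0.5 gives sqrt(As)
```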
Reference
Moore, I. D., Grayson, R. B., and Ladson, A. R. (1991). Digital terrain modelling: a review of hydrological, geomorphological, and biological applications. Hydrological processes, 5(1), 3-30.
See Also
sediment_transport_index, slope, D8FlowAccumulation, DInfFlowAccumulation, FD8FlowAccumulation
Python API
def relative_stream_power_index(self, specific_catchment_area: Raster, slope: Raster, exponent: float = 1.0) -> Raster:
Sediment Transport Index
Function name: sediment_transport_index
This tool calculates the sediment transport index, or sometimes, length-slope (LS) factor, based on input specific contributing area (As, i.e. the upslope contributing area per unit contour length; sca) and slope gradient (β, measured in degrees; slope) rasters. Moore et al. (1991) state that the physical potential for sheet and rill erosion in upland catchments can be evaluated by the product R K LS, a component of the Universal Soil Loss Equation (USLE), where R is a rainfall and runoff erosivity factor, K is a soil erodibility factor, and LS is the length-slope factor that accounts for the effects of topography on erosion. To predict erosion at a point in the landscape the LS factor can be written as:
$$LS = (n + 1) \left( \frac{A_s}{22.13} \right)^{n} \left( \frac{\sin(\beta)}{0.0896} \right)^{m}$$
where n = 0.4 (sca_exponent) and m = 1.3 (slope_exponent) in its original formulation.
This index is derived from unit stream-power theory and is sometimes used in place of the length-slope factor in the revised universal soil loss equation (RUSLE) for slope lengths less than 100 m and slope less than 14 degrees. Like many hydrological land-surface parameters, sediment_transport_index assumes that contributing area is directly related to discharge. Notice that As must not be log-transformed prior to being used; As is commonly log-transformed to enhance visualization of the data. Also, As can be derived using any of the available flow accumulation tools, although better results usually result from application of multiple-flow direction algorithms such as DInfFlowAccumulation and FD8FlowAccumulation. The slope raster can be created from the base digital elevation model (DEM) using the slope tool. The input images must have the same grid dimensions.
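The LS equation can be checked with a small per-cell sketch. The function name is hypothetical; the default exponents are those stated above. At the reference values (As = 22.13 and sin(β) = 0.0896) both ratios equal 1, so the index reduces to n + 1 = 1.4.

```python
import math

def sediment_transport_index_cell(sca, slope_deg, sca_exponent=0.4, slope_exponent=1.3):
    """LS = (n + 1) * (As / 22.13)**n * (sin(beta) / 0.0896)**m, beta in degrees."""
    sin_beta = math.sin(math.radians(slope_deg))
    return ((sca_exponent + 1.0)
            * (sca / 22.13) ** sca_exponent
            * (sin_beta / 0.0896) ** slope_exponent)

# Slope angle (in degrees) whose sine equals the reference value 0.0896.
reference_slope = math.degrees(math.asin(0.0896))
ls_reference = sediment_transport_index_cell(22.13, reference_slope)
ls_steeper = sediment_transport_index_cell(22.13, 2.0 * reference_slope)
```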
Reference
Moore, I. D., Grayson, R. B., and Ladson, A. R. (1991). Digital terrain modelling: a review of hydrological, geomorphological, and biological applications. Hydrological processes, 5(1), 3-30.
See Also
StreamPowerIndex, DInfFlowAccumulation, FD8FlowAccumulation
Python API
def sediment_transport_index(self, specific_catchment_area: Raster, slope: Raster, sca_exponent: float = 0.4, slope_exponent: float = 1.3) -> Raster:
Wetness Index
Function name: wetness_index
This tool can be used to calculate the topographic wetness index, commonly used in the TOPMODEL rainfall-runoff framework. The index describes the propensity for a site to be saturated to the surface given its contributing area and local slope characteristics. It is calculated as:
$$WI = \ln\left( \frac{A_s}{\tan(\mathrm{Slope})} \right)$$
where As is the specific catchment area (i.e. the upslope contributing area per unit contour length) estimated using one of the available flow accumulation algorithms in the Hydrological Analysis toolbox. Notice that As must not be log-transformed prior to being used; log-transformation of As is a common practice when visualizing the data. The slope image should be measured in degrees and can be created from the base digital elevation model (DEM) using the slope tool. Grid cells with a slope of zero will be assigned NoData in the output image because the index involves division by tan(Slope), which is zero for flat cells. These very flat sites likely coincide with the wettest parts of the landscape. The input images must have the same grid dimensions.
Grid cells possessing the NoData value in either of the input images are assigned NoData value in the output image. The output raster is of the float data type and continuous data scale.
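The NoData rules described above can be summarized in a per-cell sketch. The function name and the NODATA sentinel value are assumptions for illustration; the tool reads the NoData value from the input rasters.

```python
import math

NODATA = -32768.0  # hypothetical sentinel, chosen only for this example

def wetness_index_cell(sca, slope_deg, nodata=NODATA):
    """WI = ln(As / tan(slope)), with slope in degrees."""
    # NoData in either input propagates to the output.
    if sca == nodata or slope_deg == nodata:
        return nodata
    tan_slope = math.tan(math.radians(slope_deg))
    # Flat cells would require division by zero, so they become NoData.
    if tan_slope <= 0.0:
        return nodata
    return math.log(sca / tan_slope)

wi_value = wetness_index_cell(100.0, 45.0)
wi_flat = wetness_index_cell(100.0, 0.0)
wi_missing = wetness_index_cell(NODATA, 10.0)
```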
See Also
slope, D8FlowAccumulation, DInfFlowAccumulation, FD8FlowAccumulation, breach_depressions_least_cost
Python API
def wetness_index(self, specific_catchment_area: Raster, slope: Raster) -> Raster:
Stream Network Extraction
Extract Streams
Function name: extract_streams
Description
This tool can be used to extract, or map, the likely stream cells from an input flow-accumulation image (flow_accumulation). The algorithm applies a threshold to the input flow accumulation image such that streams are considered to be all grid cells with accumulation values greater than the specified threshold (threshold). As such, this threshold represents the minimum area (area is used here as a surrogate for discharge) required to initiate and maintain a channel. Smaller threshold values result in more extensive stream networks and vice versa. Unfortunately there is very little guidance regarding an appropriate method for determining the channel initiation area threshold in practice. As such, it is frequently determined either by examining map or imagery data, using field work, or by experimentation until a suitable or desirable channel network is identified. Notice that the threshold value will be unique for each landscape and dataset (including source and grid resolution), further complicating its a priori determination. There is also evidence that in some landscapes the threshold is a combined upslope area-slope function. Generally, a lower threshold is appropriate in humid climates and a higher threshold is appropriate in areas underlain by more resistant bedrock. Climate and bedrock resistance are two factors related to drainage density, i.e. the extent to which a landscape is dissected by drainage channels.
The background value of the output raster will be the NoData value unless zero_background is set to True.
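The thresholding rule and the zero_background option can be sketched as below. The function name, the NODATA sentinel, and the choice of 1.0 as the stream-cell value are assumptions made for this illustration rather than documented behaviour.

```python
NODATA = -32768.0  # hypothetical sentinel, chosen only for this example

def extract_streams_sketch(flow_accum, threshold, zero_background=False, nodata=NODATA):
    """Mark cells whose accumulation exceeds the threshold as stream cells."""
    background = 0.0 if zero_background else nodata
    return [
        [1.0 if value != nodata and value > threshold else background for value in row]
        for row in flow_accum
    ]

flow_accum = [[1.0, 5.0, 12.0],
              [2.0, 20.0, 3.0]]
streams_nodata = extract_streams_sketch(flow_accum, threshold=10.0)
streams_zero = extract_streams_sketch(flow_accum, threshold=10.0, zero_background=True)
dense_network = extract_streams_sketch(flow_accum, threshold=1.5, zero_background=True)
```

Lowering the threshold from 10.0 to 1.5 increases the number of stream cells from two to five, which matches the statement that smaller thresholds produce more extensive networks.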
See Also
extract_valleys
Parameters
flow_accumulation (Raster): The input flow accumulation Raster object.
threshold (float): The minimum accumulation value required to be part of a stream channel. Default is 0.0, but should be set higher.
zero_background (bool): Whether the output raster uses 0.0 for non-channel cells (True) or NoData (False). Default is False.
Returns:
Raster
Python API
def extract_streams(self, flow_accumulation: Raster, threshold: float = 0.0, zero_background: bool = False) -> Raster:
Extract Valleys
Function name: extract_valleys
This tool can be used to extract channel networks from an input digital elevation model (dem) using one of three techniques that are based on local topography alone.
The Lindsay (2006) 'lower-quartile' method (variant='LQ') algorithm is a type of 'valley recognition' method. Other channel mapping methods, such as the Johnston and Rosenfeld (1975) algorithm, experience problems because channel profiles are not always 'v'-shaped, nor are they always apparent in small 3 x 3 windows. The lower-quartile method was developed as an alternative and more flexible valley recognition channel mapping technique. The lower-quartile method operates by running a filter over the DEM that calculates the percentile value of the centre cell with respect to the distribution of elevations within the filter window. The roving window is circular, the diameter of which should reflect the topographic variation of the area (e.g. the channel width or average hillslope length). If this variant is selected, the user must specify the filter_size parameter, in pixels, and this value should be an odd number (e.g. 3, 5, 7, etc.). The appropriateness of the selected window diameter will depend on the grid resolution relative to the scale of topographic features. Cells that are within the lower quartile of the distribution of elevations of their neighbourhood are flagged. Thus, the algorithm identifies grid cells that are in relatively low topographic positions at a local scale. This approach to channel mapping is only appropriate in fluvial landscapes. In regions containing numerous lakes and wetlands, the algorithm will pick out the edges of features.
The Johnston and Rosenfeld (1975) algorithm (variant='JandR') is a type of 'valley recognition' method and operates as follows: channel cells are flagged in a 3 x 3 window if the north and south neighbours are higher than the centre grid cell or if the east and west neighbours meet this same criterion. The group of cells that are flagged after one pass of the roving window constitutes the drainage network. This method is best applied to DEMs that are relatively smooth and do not exhibit high levels of short-range roughness. As such, it may be desirable to use a smoothing filter before applying this tool. The feature_preserving_smoothing tool is a good option for removing DEM roughness while preserving the topographic information contained in breaks-in-slope (i.e. edges).
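The Johnston and Rosenfeld rule is simple enough to sketch directly. The function below is an illustration under stated assumptions, not the tool's code: the name is hypothetical, the DEM is a nested list, and a neighbour pair that falls outside the grid is treated as failing the test.

```python
def johnston_rosenfeld_sketch(dem):
    """Flag cells (1) whose N and S, or E and W, neighbours are both higher."""
    rows, cols = len(dem), len(dem[0])

    def higher(r, c, z):
        # True only when the neighbour exists and is higher than the centre cell.
        return 0 <= r < rows and 0 <= c < cols and dem[r][c] > z

    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            z = dem[r][c]
            north_south = higher(r - 1, c, z) and higher(r + 1, c, z)
            east_west = higher(r, c - 1, z) and higher(r, c + 1, z)
            if north_south or east_west:
                out[r][c] = 1
    return out

# A V-shaped valley running north-south down the middle column.
valley_dem = [[5.0, 3.0, 1.0, 3.0, 5.0],
              [5.0, 3.0, 1.0, 3.0, 5.0],
              [5.0, 3.0, 1.0, 3.0, 5.0]]
valley_cells = johnston_rosenfeld_sketch(valley_dem)
```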
The Peucker and Douglas (1975) algorithm (variant='PandD') is one of the simplest and earliest algorithms for topography-based network extraction. Their 'valley recognition' method operates by passing a 2 x 2 roving window over a DEM and flagging the highest grid cell in each group of four. Once the window has passed over the entire DEM, channel grid cells are left unflagged. This method is also best applied to DEMs that are relatively smooth and do not exhibit high levels of short-range roughness. Pre-processing the DEM with the feature_preserving_smoothing tool may also be useful when applying this method.
Each of these methods of extracting valley networks results in line networks that can be wider than a single grid cell. As such, it is often desirable to thin the resulting network using a line-thinning algorithm. The option to perform line-thinning is provided by the tool as a post-processing step (line_thin=True).
References
Johnston, E. G., & Rosenfeld, A. (1975). Digital detection of pits, peaks, ridges, and ravines. IEEE Transactions on Systems, Man, and Cybernetics, (4), 472-480.
Lindsay, J. B. (2006). Sensitivity of channel mapping techniques to uncertainty in digital elevation data. International Journal of Geographical Information Science, 20(6), 669-692.
Peucker, T. K., & Douglas, D. H. (1975). Detection of surface-specific points by local parallel processing of discrete terrain elevation data. Computer Graphics and image processing, 4(4), 375-387.
See Also
feature_preserving_smoothing
Python API
def extract_valleys(self, dem: Raster, variant: str = "lq", line_thin: bool = False, filter_size: int = 5) -> Raster:
Prune Vector Streams
Function name: prune_vector_streams
Description
This tool can be used to prune the smallest branches of a vector stream network based on a threshold in link magnitude. The function automatically calculates the Shreve magnitude of each link in the input streams vector. This operation requires an input digital elevation model (DEM). The function is also capable of calculating the link magnitude from stream networks that have some minor topological errors (e.g., line overshoots or undershoots). This requires the input of a snap_distance parameter (default = 0.001).
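Shreve magnitude assigns 1 to each headwater link and, at every confluence, the sum of the magnitudes of the inflowing links. The sketch below computes it for a small hand-built network. The dictionary representation, the function name, and the choice to keep links with magnitude greater than or equal to the threshold are assumptions for illustration; the tool derives topology from the vector geometry and the DEM.

```python
def shreve_magnitudes(downstream):
    """downstream maps each link id to the link it drains into (None at the outlet)."""
    upstream = {link: [] for link in downstream}
    for link, down in downstream.items():
        if down is not None:
            upstream[down].append(link)

    magnitudes = {}

    def magnitude(link):
        if link not in magnitudes:
            inflowing = upstream[link]
            # Headwater links have magnitude 1; others sum their tributaries.
            magnitudes[link] = 1 if not inflowing else sum(magnitude(u) for u in inflowing)
        return magnitudes[link]

    for link in downstream:
        magnitude(link)
    return magnitudes

# Links a and b join to form d; c and d join to form the outlet link e.
network = {"a": "d", "b": "d", "c": "e", "d": "e", "e": None}
mags = shreve_magnitudes(network)
kept = sorted(link for link, m in mags.items() if m >= 2)  # assumed pruning rule
```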
See Also
vector_stream_network_analysis, repair_stream_vector_topology
Python API
def prune_vector_streams(self, streams: Vector, dem: Raster, threshold: float, snap_distance: float = 0.001) -> Vector:
Raster Streams To Vector
Function name: raster_streams_to_vector
This tool converts a raster stream file into a vector file. The user must specify an input raster streams file (streams), and an input D8 flow pointer file (d8_pointer). Streams in the input raster streams file are denoted by cells containing any positive, non-zero integer. A field in the output vector's database file, called STRM_VAL, will correspond to this positive integer value. The database file will also have a field for the length of each link in the stream network. The flow pointer file must be calculated from a DEM with all topographic depressions and flat areas removed and must be calculated using the D8 flow pointer algorithm (d8_pointer). The output vector will contain PolyLine features.
See Also
rasterize_streams, raster_to_vector_lines
Python API
def raster_streams_to_vector(self, streams: Raster, d8_pointer: Raster, esri_pointer: bool = False) -> Vector:
Rasterize Streams
Function name: rasterize_streams
This tool can be used to rasterize an input vector stream network (streams) using the Lindsay (2016) method. The user inputs an existing raster (base_raster), from which the output raster's grid resolution is determined.
Reference
Lindsay JB. 2016. The practice of DEM stream burning revisited. Earth Surface Processes and Landforms, 41(5): 658–668. DOI: 10.1002/esp.3888
See Also
raster_streams_to_vector
Python API
def rasterize_streams(self, streams: Vector, base_raster: Raster = None, zero_background: bool = False, use_feature_id: bool = False) -> Raster:
Remove Short Streams
Function name: remove_short_streams
This tool can be used to remove stream links in a stream network that are shorter than a user-specified length (min_length). The user must input a streams raster image (streams_raster) and D8 pointer (flow direction) image (d8_pntr). Stream cells are designated in the streams raster as all positive, nonzero values. Thus all non-stream or background grid cells are commonly assigned either zeros or NoData values. The pointer raster is used to traverse the stream network and should only be created using the D8 algorithm (d8_pointer).
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the user must specify esri_pntr=True.
See Also
extract_streams, d8_pointer
Python API
def remove_short_streams(self, d8_pntr: Raster, streams_raster: Raster, min_length: float = 0.0, esri_pntr: bool = False) -> Raster:
Repair Stream Vector Topology
Function name: repair_stream_vector_topology
This tool can be used to resolve many of the topological errors and inconsistencies associated with manually digitized vector stream networks, i.e. hydrography data. A properly structured stream network should consist of a series of stream segments that connect a channel head to a downstream confluence, or an upstream confluence to a downstream confluence/outlet. This tool will join vector arcs that connect at arbitrary, non-confluence points along stream segments. It also splits an arc where a tributary stream connects at a mid-point, thereby creating a proper confluence where two upstream tributaries converge into a downstream segment. The tool also handles non-connecting tributaries caused by dangling arcs, i.e. overshoots and undershoots.
The user must specify the name of the input vector stream network (input) and the output file (output). Additionally, a distance threshold for snapping dangling arcs (snap) must be specified. This distance is in the input layer's x-y units. The tool works best on projected input data; however, if the input is in geographic coordinates (latitude and longitude), then specifying a small snap distance is advisable. Notice that the attributes of the input layer will not be carried over to the output file because there is not a one-for-one feature correspondence between the two files due to the joins and splits of stream segments. Instead, the output attribute table will only contain a feature ID (FID) entry.
Note: this tool should be used to pre-process vector streams that are input to the vector_stream_network_analysis tool.
See Also
vector_stream_network_analysis, fix_dangling_arcs
River Centerlines
Function name: river_centerlines
Note this tool is part of a WhiteboxTools extension product. Please visit Whitebox Geospatial Inc. for information about purchasing a license activation key (https://www.whiteboxgeo.com/extension-pricing/).
This tool can map river centerlines, or medial-lines, from input river rasters (input). The input river (or water) raster is often derived from an image classification performed on multispectral satellite imagery. The river raster must be a Boolean (1 for water, 0/NoData for not-water) and can be derived either by reclassifying the classification output, or derived using a 1-class classification procedure. For example, using the parallelepiped_classification tool, it is possible to train the classifier using water training polygons, and all other land classes will simply be left unclassified. It may be necessary to perform some pre-processing on the water Boolean raster before input to the centerlines tool. For example, you may need to remove smaller water polygons associated with small lakes and ponds, and you may want to remove small islands from the remaining water features. This tool will often create a bifurcating vector path around islands within rivers, even if those islands are a single cell in size. The RemoveRasterPolygonHoles tool can be used to remove islands in the water raster that are smaller than a user-specified size. The user must also specify the minimum line length (min_length), which determines the level of detail in the final rivers map. For example, in the first image below, a value of 30 grid cells was used for the min_length parameter, while a value of 5 was used in the second image, which possesses far more (arguably too much) detail.
Lastly, the user must specify the radius parameter value. At times, the tool will be able to connect distant water polygons that are part of the same feature and this parameter determines the size of the search radius used to identify separated line end-nodes that are candidates for connection. It is advisable that this value not be set too high, or else unexpected connections may be made between unrelated water features. However, a value of between 1 and 5 can produce satisfactory results. Experimentation may be needed to find an appropriate value for any one data set however. The image below provides an example of this characteristic of the tool, where the resulting vector stream centerline passes through disconnected raster water polygons in the underlying input image in four locations.
**Here** is a video that demonstrates how to apply this tool to map river center-lines taken from a water raster created by classifying a Sentinel-2 multi-spectral satellite imagery data set.
See Also
parallelepiped_classification, RemoveRasterPolygonHoles
Python API
def river_centerlines(self, raster: Raster, min_length: int = 3, search_radius: int = 9) -> Vector:
Longitudinal Profile Analysis
Farthest Channel Head
Function name: farthest_channel_head
Description
This tool calculates the upstream distance to the farthest stream head for each grid cell belonging to a raster stream network. The user must input a raster containing streams data (streams_raster), where stream grid cells are denoted by all positive, non-zero values, and a D8 flow pointer (i.e. flow direction) raster (d8_pointer). The pointer image is used to traverse the stream network and must only be created using the D8 algorithm. Non-stream or background grid cells in the streams image are commonly assigned either zeros or NoData values. Background cells will be assigned the NoData value in the output image, unless zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the user should specify esri_pointer=True.
See Also
length_of_upstream_channels, find_main_stem
Parameters
d8_pointer (Raster): The D8 pointer (flow direction) raster.
streams_raster (Raster): The raster object containing the streams data.
esri_pointer (bool): Determines whether the d8_pointer raster contains pointer data in the Esri format. Default is False.
zero_background (bool): Determines whether background cells in the output raster are assigned zero (True) or NoData (False). Default is False.
Returns
Raster: the upstream distance to the farthest channel head for each stream cell.
Python API
def farthest_channel_head(self, d8_pointer: Raster, streams_raster: Raster, esri_pointer: bool = False, zero_background: bool = False) -> Raster:
Find Main Stem
Function name: find_main_stem
This tool can be used to identify the main channel in a stream network. The user must input a D8 pointer (flow direction) raster (d8_pointer), and a streams raster (streams_raster). The pointer raster is used to traverse the stream network and should only be created using the d8_pointer tool. By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools:
64   128  1
32   0    2
16   8    4
If the pointer file contains ESRI flow direction values instead, you must set esri_pointer=True.
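The difference between the two pointer encodings can be sketched as a pair of lookup tables. The Whitebox values follow the clockwise scheme described above (1 for north-east, doubling clockwise to 128 for north). The ESRI values follow the widely documented ArcGIS flow-direction coding (1 for east, doubling clockwise to 128 for north-east). The function name and the (row, column) offset convention are assumptions for this example.

```python
# Offsets are (row_offset, col_offset); rows increase southward.
WHITEBOX_D8 = {1: (-1, 1), 2: (0, 1), 4: (1, 1), 8: (1, 0),
               16: (1, -1), 32: (0, -1), 64: (-1, -1), 128: (-1, 0)}
ESRI_D8 = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
           16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}

def downstream_cell(row, col, pointer, esri_pointer=False):
    """Return the (row, col) that a cell drains to, or None if it has no flow direction."""
    table = ESRI_D8 if esri_pointer else WHITEBOX_D8
    if pointer not in table:
        return None
    dr, dc = table[pointer]
    return row + dr, col + dc

east_whitebox = downstream_cell(5, 5, 2)                 # 2 means east in the Whitebox scheme
east_esri = downstream_cell(5, 5, 1, esri_pointer=True)  # 1 means east in the ESRI scheme
north_whitebox = downstream_cell(5, 5, 128)
no_flow = downstream_cell(5, 5, 0)
```

Reading an ESRI-coded pointer with the default table would send flow to the wrong neighbour, which is why the esri_pointer flag matters for every tool that traverses the network.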
The streams raster should have been created using one of the DEM-based stream mapping methods, i.e. contributing area thresholding. Stream grid cells are designated in the streams image as all positive, non-zero values. All non-stream cells will be assigned the NoData value in the output image, unless the user sets zero_background=True.
The algorithm operates by traversing each stream and identifying the longest stream-path draining to each outlet. When a confluence is encountered, the traverse follows the branch with the larger distance-to-head.
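The traversal can be sketched on a small network stored as a dictionary of downstream links. This is an illustrative simplification with hypothetical names: every step is given unit length, whereas a raster implementation would use cell-to-cell distances derived from the pointer grid.

```python
def find_main_stem_sketch(downstream):
    """downstream maps each stream cell to the next cell downstream (None at an outlet)."""
    inflows = {cell: [] for cell in downstream}
    for cell, down in downstream.items():
        if down is not None:
            inflows[down].append(cell)

    dist_to_head = {}

    def head_distance(cell):
        # Distance to the farthest channel head upstream of this cell.
        if cell not in dist_to_head:
            ups = inflows[cell]
            dist_to_head[cell] = 0 if not ups else 1 + max(head_distance(u) for u in ups)
        return dist_to_head[cell]

    main_stem = []
    for outlet in (c for c, d in downstream.items() if d is None):
        cell = outlet
        while cell is not None:
            main_stem.append(cell)
            ups = inflows[cell]
            # At a confluence, follow the branch with the larger distance-to-head.
            cell = max(ups, key=head_distance) if ups else None
    return main_stem

# A long branch (a1 -> a2 -> a3) and a short tributary (b1) meet at j.
network = {"a1": "a2", "a2": "a3", "a3": "j", "b1": "j", "j": "k", "k": None}
main_stem = find_main_stem_sketch(network)
```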
See Also
d8_pointer
Python API
def find_main_stem(self, d8_pointer: Raster, streams_raster: Raster, esri_pointer: bool = False, zero_background: bool = False) -> Raster:
Length Of Upstream Channels
Function name: length_of_upstream_channels
This tool calculates, for each stream grid cell in an input streams raster (streams_raster), the total length of channels upstream. The user must specify the name of a raster containing streams data (streams_raster), where stream grid cells are denoted by all positive, non-zero values, and a D8 flow pointer (i.e. flow direction) raster (d8_pointer). The pointer image is used to traverse the stream network and must only be created using the D8 algorithm. Non-stream or background grid cells in the streams image are commonly assigned either zeros or NoData values. Background cells will be assigned the NoData value in the output image, unless the user specifies zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, set esri_pointer=True.
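The quantity differs from the farthest-channel-head distance in that it sums over all upstream branches rather than following only the longest one. A minimal sketch on a dictionary-based network is shown below. The names are hypothetical, step lengths are assumed to be uniform, and whether the tool counts a cell's own length is not specified here, so this sketch counts only the steps upstream of each cell.

```python
def upstream_channel_length(downstream, step=1.0):
    """Total length of all channel steps upstream of each cell."""
    inflows = {cell: [] for cell in downstream}
    for cell, down in downstream.items():
        if down is not None:
            inflows[down].append(cell)

    totals = {}

    def total(cell):
        if cell not in totals:
            # Sum every inflowing branch: one step to reach it plus its own upstream total.
            totals[cell] = sum(step + total(u) for u in inflows[cell])
        return totals[cell]

    for cell in downstream:
        total(cell)
    return totals

network = {"a1": "a2", "a2": "a3", "a3": "j", "b1": "j", "j": "k", "k": None}
lengths = upstream_channel_length(network)
```

At the confluence j, the total is 4.0 because both the three-cell branch and the one-cell tributary contribute, whereas the farthest channel head is only 3 steps away.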
See Also
farthest_channel_head, find_main_stem
Python API
def length_of_upstream_channels(self, d8_pointer: Raster, streams_raster: Raster, esri_pointer: bool = False, zero_background: bool = False) -> Raster:
Long Profile
Function name: long_profile
This tool can be used to create a longitudinal profile plot. A longitudinal stream profile is a plot of elevation against downstream distance. Most long profiles use distance from channel head as the distance measure. This tool, however, uses the distance to the stream network outlet cell, or mouth, as the distance measure. The reason for this difference is that while for any one location within a stream network there is only ever one downstream outlet, there are usually many upstream channel heads. Thus plotted using the traditional downstream-distance method, the same point within a network will plot in many different long profile locations, whereas it will always plot on one unique location in the distance-to-mouth method. One consequence of this difference is that the long profile will be oriented from right-to-left rather than left-to-right, as would traditionally be the case.
The tool outputs an interactive SVG line graph embedded in an HTML document (output_html_file). The user must input a D8 pointer (flow direction) raster (d8_pointer), a streams raster image (streams_raster), and a digital elevation model (dem). Stream cells are designated in the streams image as all positive, nonzero values. Thus all non-stream or background grid cells are commonly assigned either zeros or NoData values. The pointer image is used to traverse the stream network and should only be created using the D8 algorithm (d8_pointer). The streams image should be derived using a flow accumulation based stream network extraction algorithm, also based on the D8 flow algorithm.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, set esri_pointer=True.
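The distance-to-mouth convention can be sketched as follows. The code only derives the (distance, elevation) pairs that such a plot would show for one flow path; it does not produce the HTML/SVG output. The function name, the path, and the elevations are hypothetical.

```python
import math


def long_profile(path, elevations, cell_size=1.0):
    """Return (distance_to_mouth, elevation) pairs for a head-to-mouth path."""
    # Step length between consecutive cells; diagonal steps are longer.
    steps = []
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        diagonal = r0 != r1 and c0 != c1
        steps.append(cell_size * (math.sqrt(2) if diagonal else 1.0))

    # Distance to mouth is the sum of the steps still remaining downstream.
    profile = []
    remaining = sum(steps)
    for i, (r, c) in enumerate(path):
        profile.append((remaining, elevations[r][c]))
        if i < len(steps):
            remaining -= steps[i]
    return profile


# Hypothetical single-column DEM with a 10 m cell size.
elevations = [[105.0], [102.0], [100.0]]
path = [(0, 0), (1, 0), (2, 0)]  # channel head first, mouth last
profile = long_profile(path, elevations, cell_size=10.0)
print(profile)  # [(20.0, 105.0), (10.0, 102.0), (0.0, 100.0)]
```

Because the mouth sits at distance zero, every cell in a network has exactly one x-coordinate regardless of which channel head the path started from.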
See Also
long_profile_from_points, profile, d8_pointer
Python API
def long_profile(self, d8_pointer: Raster, streams_raster: Raster, dem: Raster, output_html_file: str, esri_pointer: bool = False) -> None:
Long Profile From Points
Function name: long_profile_from_points
This tool can be used to create a longitudinal profile plot for a set of vector points (points). A longitudinal stream profile is a plot of elevation against downstream distance. Most long profiles use distance from channel head as the distance measure. This tool, however, uses the distance to the outlet cell, or mouth, as the distance measure.
The tool outputs an interactive SVG line graph embedded in an HTML document (output_html_file). The user must input a D8 pointer (flow direction) raster (d8_pointer), a vector points file (points), and a digital elevation model (dem). The pointer image is used to traverse the flow path issuing from each initiation point in the vector file; this pointer file should only be created using the D8 algorithm (d8_pointer).
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the esri_pointer parameter must be specified.
See Also
long_profile, profile, d8_pointer
Python API
def long_profile_from_points(self, d8_pointer: Raster, points: Vector, dem: Raster, output_html_file: str, esri_pointer: bool = False) -> None:
Stream Ordering and Metrics
Hack Stream Order
Function name: hack_stream_order
This tool can be used to assign the Hack stream order to each link in a stream network. According to this common stream numbering system, the main stream is assigned an order of one. All tributaries to the main stream (i.e. the trunk) are assigned an order of two; tributaries to second-order links are assigned an order of three, and so on. The trunk or main stream of the stream network is defined by the furthest upstream distance at each bifurcation (i.e. network junction).
Stream order is often used in hydro-geomorphic and ecological studies to quantify the relative size and importance of a stream segment to the overall river system. Unlike some other stream ordering systems, e.g. Horton-Strahler stream order (strahler_stream_order) and Shreve's stream magnitude (shreve_stream_magnitude), Hack's stream ordering method increases from the catchment outlet towards the channel heads. This has the main advantage that the catchment outlet is likely to be accurately located while the channel network extent may be less accurately mapped.
The user must input a streams raster image (streams_raster) and D8 pointer image (d8_pntr). Stream cells are designated in the streams image as all positive, nonzero values. Thus all non-stream or background grid cells are commonly assigned either zeros or NoData values. The pointer image is used to traverse the stream network and should only be created using the D8 algorithm (d8_pointer). Background cells will be assigned the NoData value in the output image, unless zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the user should specify esri_pntr=True.
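A link-level sketch of Hack ordering, assuming the trunk at each junction is the upstream branch with the greatest distance to a channel head. The dictionary-based network and the `hack_order` name are hypothetical and do not mirror the raster tool's internals.

```python
def hack_order(downstream, length):
    """Assign Hack order to each link of a {link: downstream_link} network."""
    upstream = {link: [] for link in downstream}
    for link, ds in downstream.items():
        if ds is not None:
            upstream[ds].append(link)

    def dist_to_head(link):
        return length[link] + max(
            (dist_to_head(u) for u in upstream[link]), default=0.0
        )

    order = {}
    # Start at each outlet with order one and work towards the channel heads.
    stack = [(link, 1) for link, ds in downstream.items() if ds is None]
    while stack:
        link, value = stack.pop()
        order[link] = value
        if upstream[link]:
            trunk = max(upstream[link], key=dist_to_head)
            for u in upstream[link]:
                # The trunk keeps the current order; tributaries get one more.
                stack.append((u, value if u == trunk else value + 1))
    return order


downstream = {"a": "c", "b": "c", "c": "e", "d": "e", "e": None}
length = {"a": 5.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 1.0}
orders = hack_order(downstream, length)
print(orders["e"], orders["c"], orders["d"], orders["b"])  # 1 1 2 2
```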
Reference
Hack, J. T. (1957). Studies of longitudinal stream profiles in Virginia and Maryland (Vol. 294). US Government Printing Office.
See Also
horton_stream_order, strahler_stream_order, shreve_stream_magnitude, topological_stream_order
Python API
def hack_stream_order(self, d8_pntr: Raster, streams_raster: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Horton Ratios
Function name: horton_ratios
This function can be used to calculate Horton's so-called laws of drainage network composition for an input stream network. The user must specify an input DEM (which has been suitably hydrologically pre-processed to remove any topographic depressions) and a raster stream network. The function will output a 4-element tuple containing the bifurcation ratio (Rb), the length ratio (Rl), the area ratio (Ra), and the slope ratio (Rs). These indices are related to drainage network geometry and are used in some geomorphological analysis. The calculation of the ratios is based on the method described by Knighton (1998) Fluvial Forms and Processes: A New Perspective.
Code Example
```python
from whitebox_workflows import WbEnvironment

# Set up the WbW environment
wbe = WbEnvironment()
wbe.verbose = True
wbe.working_directory = '/path/to/data'

# Read the inputs
dem = wbe.read_raster('DEM.tif')
streams = wbe.read_raster('streams.tif')

# Calculate the Horton ratios
(bifurcation_ratio, length_ratio, area_ratio, slope_ratio) = wbe.horton_ratios(dem, streams)

# Outputs
print(f"Bifurcation ratio (Rb): {bifurcation_ratio:.3f}")
print(f"Length ratio (Rl): {length_ratio:.3f}")
print(f"Area ratio (Ra): {area_ratio:.3f}")
print(f"Slope ratio (Rs): {slope_ratio:.3f}")
```
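To give a feel for what the returned ratios express, the following sketch computes a bifurcation ratio and a length ratio from per-order summaries by simple averaging of successive-order ratios. This is an illustrative simplification with made-up numbers; it is not claimed to be how horton_ratios derives its values, which may use a regression-based fit instead.

```python
def mean_ratio(values, increasing):
    """Average the ratio between values of successive stream orders."""
    ratios = []
    for lower, higher in zip(values, values[1:]):
        ratios.append(higher / lower if increasing else lower / higher)
    return sum(ratios) / len(ratios)


# Hypothetical per-order summaries for orders 1, 2, and 3.
stream_counts = [16, 4, 1]             # number of streams of each order
mean_lengths = [100.0, 250.0, 500.0]   # mean stream length of each order

rb = mean_ratio(stream_counts, increasing=False)  # N(w) / N(w + 1)
rl = mean_ratio(mean_lengths, increasing=True)    # L(w + 1) / L(w)
print(rb, rl)  # 4.0 2.25
```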
See Also
horton_stream_order
Python API
def horton_ratios(self, dem: Raster, streams_raster: Raster) -> Tuple[float, float, float, float]:
Horton Stream Order
Function name: horton_stream_order
This tool can be used to assign the Horton stream order to each link in a stream network. Stream ordering is often used in hydro-geomorphic and ecological studies to quantify the relative size and importance of a stream segment to the overall river system. There are several competing stream ordering schemes. According to this common stream numbering system, headwater stream links are assigned an order of one. Stream order only increases downstream when two links of equal order join; otherwise the downstream link is assigned the larger of the two link orders.
Strahler order and Horton order are similar approaches to assigning stream network hierarchy. Horton stream order essentially starts with the Strahler order scheme, but subsequently replaces each of the assigned stream order values along the main trunk of the network with the order value of the outlet. The main channel is not treated differently compared with other tributaries in the Strahler ordering scheme.
The user must input a streams raster image (streams_raster) and D8 pointer image (d8_pntr). Stream cells are designated in the streams image as all positive, nonzero values. Thus all non-stream or background grid cells are commonly assigned either zeros or NoData values. The pointer image is used to traverse the stream network and should only be created using the D8 algorithm (d8_pointer). Background cells will be assigned the NoData value in the output image, unless the user specifies zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the user must set esri_pntr=True.
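The relationship between the two schemes can be sketched on a toy network. The code first assigns Strahler orders, then overwrites the values along a given trunk with the outlet's order, as described above. The network, the function names, and the choice of trunk links are hypothetical; how the tool identifies the trunk is not shown here.

```python
def strahler_order(downstream):
    """Return Strahler order per link for a {link: downstream_link} network."""
    upstream = {link: [] for link in downstream}
    for link, ds in downstream.items():
        if ds is not None:
            upstream[ds].append(link)

    order = {}

    def visit(link):
        inflow = [visit(u) for u in upstream[link]]
        if not inflow:
            order[link] = 1  # headwater link
        else:
            top = max(inflow)
            # Order increases only when two or more links of the top order join.
            order[link] = top + 1 if inflow.count(top) >= 2 else top
        return order[link]

    for link, ds in downstream.items():
        if ds is None:
            visit(link)
    return order


def horton_from_strahler(strahler, trunk, outlet):
    """Overwrite the Strahler values along the trunk with the outlet's order."""
    horton = dict(strahler)
    for link in trunk:
        horton[link] = strahler[outlet]
    return horton


downstream = {"a": "c", "b": "c", "f": "d", "g": "d",
              "c": "e", "d": "e", "e": None}
strahler = strahler_order(downstream)
horton = horton_from_strahler(strahler, trunk=["e", "c", "a"], outlet="e")
print(strahler["c"], strahler["e"])  # 2 3
print(horton["c"], horton["a"])      # 3 3
```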
Reference
Horton, R. E. (1945). Erosional development of streams and their drainage basins; hydrophysical approach to quantitative morphology. Geological Society of America Bulletin, 56(3), 275-370.
See Also
hack_stream_order, shreve_stream_magnitude, strahler_stream_order, topological_stream_order
Python API
def horton_stream_order(self, d8_pntr: Raster, streams_raster: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Shreve Stream Magnitude
Function name: shreve_stream_magnitude
This tool can be used to assign the Shreve stream magnitude to each link in a stream network. Stream ordering is often used in hydro-geomorphic and ecological studies to quantify the relative size and importance of a stream segment to the overall river system. There are several competing stream ordering schemes. Shreve stream magnitude is equal to the number of headwater links upstream of each link. Headwater stream links are assigned a magnitude of one.
The user must input a streams raster image (streams_raster) and D8 pointer (flow direction) image (d8_pntr). Stream cells are designated in the streams raster as all positive, nonzero values. Thus all non-stream or background grid cells are commonly assigned either zeros or NoData values. The pointer image is used to traverse the stream network and should only be created using the D8 algorithm. Background cells will be assigned the NoData value in the output image, unless the user specifies zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the user should specify esri_pntr=True.
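Since Shreve magnitude is simply a count of upstream headwater links, it can be sketched by walking downstream from every headwater and incrementing each link passed. The dictionary-based network and the function name are hypothetical.

```python
def shreve_magnitude(downstream):
    """Count the headwater links upstream of (and including) each link."""
    magnitude = {link: 0 for link in downstream}
    # Any link that appears as a downstream target has something upstream of it.
    has_upstream = {ds for ds in downstream.values() if ds is not None}
    for link in downstream:
        if link in has_upstream:
            continue  # not a headwater link
        current = link
        while current is not None:
            magnitude[current] += 1
            current = downstream[current]
    return magnitude


downstream = {"a": "c", "b": "c", "f": "d", "g": "d",
              "c": "e", "d": "e", "e": None}
magnitude = shreve_magnitude(downstream)
print(magnitude["a"], magnitude["c"], magnitude["e"])  # 1 2 4
```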
Reference
Shreve, R. L. (1966). Statistical law of stream numbers. The Journal of Geology, 74(1), 17-37.
See Also
horton_stream_order, hack_stream_order, strahler_stream_order, topological_stream_order
Python API
def shreve_stream_magnitude(self, d8_pntr: Raster, streams_raster: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Strahler Order Basins
Function name: strahler_order_basins
This tool will identify the catchment areas of each Horton-Strahler stream order link in a user-specified stream network (streams), i.e. the network's Strahler basins. The tool effectively performs a Horton-Strahler stream ordering operation (horton_stream_order) followed by a watershed operation. The user must specify the name of a flow pointer (flow direction) raster (d8_pointer), a streams raster (streams), and the output raster (output). The flow pointer and streams rasters should be generated using the d8_pointer algorithm. This will require a depressionless DEM, processed using either the breach_depressions_least_cost or fill_depressions tool.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the esri_pntr parameter must be specified.
NoData values in the input flow pointer raster are assigned NoData values in the output image.
See Also
horton_stream_order, watershed, d8_pointer, breach_depressions_least_cost, fill_depressions
Python API
def strahler_order_basins(self, d8_pointer: Raster, streams: Raster, esri_pntr: bool = False) -> Raster:
Strahler Stream Order
Function name: strahler_stream_order
This tool can be used to assign the Strahler stream order to each link in a stream network. Stream ordering is often used in hydro-geomorphic and ecological studies to quantify the relative size and importance of a stream segment to the overall river system. There are several competing stream ordering schemes. According to this common stream numbering system, headwater stream links are assigned an order of one. Stream order only increases downstream when two links of equal order join; otherwise the downstream link is assigned the larger of the two link orders.
Strahler order and Horton order are similar approaches to assigning stream network hierarchy. Horton stream order essentially starts with the Strahler order scheme, but subsequently replaces each of the assigned stream order values along the main trunk of the network with the order value of the outlet. The main channel is not treated differently compared with other tributaries in the Strahler ordering scheme.
The user must input a streams raster image (streams_raster) and D8 pointer (flow direction) image (d8_pntr). Stream cells are designated in the streams image as all positive, nonzero values. Thus all non-stream or background grid cells are commonly assigned either zeros or NoData values. The pointer image is used to traverse the stream network and should only be created using the D8 algorithm (d8_pointer). Background cells will be assigned the NoData value in the output image, unless the user specifies zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, the user should specify esri_pntr=True.
Reference
Strahler, A. N. (1957). Quantitative analysis of watershed geomorphology. Eos, Transactions American Geophysical Union, 38(6), 913-920.
See Also
horton_stream_order, hack_stream_order, shreve_stream_magnitude, topological_stream_order
Python API
def strahler_stream_order(self, d8_pntr: Raster, streams_raster: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Stream Link Class
Function name: stream_link_class
This tool identifies all interior and exterior links, and source, link, and sink nodes in an input stream network (streams_raster). The input streams raster is used to designate which grid cells contain a stream and the pointer image is used to traverse the stream network. Stream cells are designated in the streams image as all values greater than zero. Thus, all non-stream or background grid cells are commonly assigned either zeros or NoData values. Background cells will be assigned the NoData value in the output image, unless zero_background=True, in which case non-stream cells will be assigned zero values in the output.
Each feature is assigned the following identifier in the output image:
| Value | Stream Type |
|---|---|
| 1 | Exterior Link |
| 2 | Interior Link |
| 3 | Source Node (head water) |
| 4 | Link Node |
| 5 | Sink Node |
The user must input a streams raster (streams_raster) and a pointer (flow direction) raster (d8_pntr). The flow pointer and streams rasters should be generated using the d8_pointer algorithm. This will require a depressionless DEM, processed using either the breach_depressions_least_cost or fill_depressions tools.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, set esri_pntr=True.
See Also
stream_link_identifier
Python API
def stream_link_class(self, d8_pntr: Raster, streams_raster: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Stream Link Identifier
Function name: stream_link_identifier
This tool can be used to assign each link in a stream network a unique numeric identifier. This grid is used by a number of other stream network analysis tools.
The input streams raster (streams_raster) is used to designate which grid cells contain a stream and the pointer image is used to traverse the stream network. Stream cells are designated in the streams image as all values greater than zero. Thus, all non-stream or background grid cells are commonly assigned either zeros or NoData values. Background cells will be assigned the NoData value in the output image, unless the user specifies zero_background=True, in which case non-stream cells will be assigned zero values in the output.
The user must specify the name of a flow pointer (flow direction) raster (d8_pntr) and a streams raster (streams_raster). The flow pointer and streams rasters should be generated using the d8_pointer algorithm. This will require a depressionless DEM, processed using either the breach_depressions_least_cost or fill_depressions tool.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, set esri_pntr=True.
See Also
d8_pointer, tributary_identifier, breach_depressions_least_cost, fill_depressions
Python API
def stream_link_identifier(self, d8_pntr: Raster, streams_raster: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Stream Link Length
Function name: stream_link_length
This tool can be used to measure the length of each link in a stream network. The user must input a stream link ID raster (streams_id_raster), created using the stream_link_identifier tool, and D8 pointer raster (d8_pointer). The flow pointer raster is used to traverse the stream network and should only be created using the d8_pointer algorithm. Stream cells are designated in the stream link ID raster as all non-zero, positive values. Background cells will be assigned the NoData value in the output image, unless zero_background=True, in which case non-stream cells will be assigned zero values in the output.
See Also
stream_link_identifier, d8_pointer, stream_link_slope
Python API
def stream_link_length(self, d8_pointer: Raster, streams_id_raster: Raster, esri_pointer: bool = False, zero_background: bool = False) -> Raster:
Stream Link Slope
Function name: stream_link_slope
This tool can be used to measure the average slope gradient, in degrees, of each link in a raster stream network. To estimate the slope of individual grid cells in a raster stream network, use the stream_slope_continuous tool instead. The user must input a stream link identifier raster image (streams_id_raster), a D8 pointer image (d8_pointer), and a digital elevation model (dem). The pointer image is used to traverse the stream network and must only be created using the D8 algorithm (d8_pointer). Stream cells are designated in the streams image as all values greater than zero. Thus, all non-stream or background grid cells are commonly assigned either zeros or NoData values. Background cells will be assigned the NoData value in the output image, unless zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, set esri_pointer=True.
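As a point of reference for the units, a slope in degrees can be derived from the elevation drop along a link and the link's length. The sketch below assumes the average slope is taken from the elevations at the two ends of the link; the tool may aggregate cell-level values differently, and the function name is hypothetical.

```python
import math


def link_slope_degrees(elev_upstream, elev_downstream, link_length):
    """Slope of a link in degrees from its end elevations and channel length."""
    drop = elev_upstream - elev_downstream
    return math.degrees(math.atan(drop / link_length))


# A 10 m drop over a 100 m link is a 10 % gradient.
print(round(link_slope_degrees(110.0, 100.0, 100.0), 2))  # 5.71
```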
See Also
stream_slope_continuous, d8_pointer
Python API
def stream_link_slope(self, d8_pointer: Raster, streams_id_raster: Raster, dem: Raster, esri_pointer: bool = False, zero_background: bool = False) -> Raster:
Stream Slope Continuous
Function name: stream_slope_continuous
This tool can be used to measure the slope gradient, in degrees, of each grid cell in a raster stream network. To estimate the average slope for each link in a stream network, use the stream_link_slope tool instead. The user must input a stream raster image (streams_raster), a D8 pointer image (d8_pointer), and a digital elevation model (dem). The pointer image is used to traverse the stream network and must only be created using the D8 algorithm (d8_pointer). Stream cells are designated in the streams image as all values greater than zero. Thus, all non-stream or background grid cells are commonly assigned either zeros or NoData values. Background cells will be assigned the NoData value in the output image, unless zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, set esri_pointer=True.
See Also
stream_link_slope, d8_pointer
Python API
def stream_slope_continuous(self, d8_pointer: Raster, streams_raster: Raster, dem: Raster, esri_pointer: bool = False, zero_background: bool = False) -> Raster:
Topological Stream Order
Function name: topological_stream_order
This tool can be used to assign the topological stream order to each link in a stream network. According to this stream numbering system, the link directly draining to the outlet is assigned an order of one. Each of the two tributaries draining to the order-one link is assigned an order of two, and so on until the most distant link from the catchment outlet has been assigned an order. The topological order can therefore be thought of as a measure of the topological distance of each link in the network to the catchment outlet and is likely to be related to travel time.
The user must input a streams raster image (streams_raster) and D8 pointer image (d8_pntr). Stream cells are designated in the streams image as all positive, nonzero values. Thus all non-stream or background grid cells are commonly assigned either zeros or NoData values. The pointer image is used to traverse the stream network and should only be created using the D8 algorithm. Background cells will be assigned the NoData value in the output image, unless zero_background=True, in which case non-stream cells will be assigned zero values in the output.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, set esri_pntr=True.
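Topological order is the number of links between a given link and the outlet, counting the outlet link as one. A minimal link-level sketch, using a hypothetical dictionary-based network and function name:

```python
def topological_order(downstream):
    """Return the link count from each link to its outlet (outlet link = 1)."""
    order = {}

    def resolve(link):
        if link not in order:
            ds = downstream[link]
            # One more than the downstream link, or one at the outlet.
            order[link] = 1 if ds is None else resolve(ds) + 1
        return order[link]

    for link in downstream:
        resolve(link)
    return order


downstream = {"a": "c", "b": "c", "f": "d", "g": "d",
              "c": "e", "d": "e", "e": None}
topo = topological_order(downstream)
print(topo["e"], topo["c"], topo["a"])  # 1 2 3
```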
See Also
hack_stream_order, horton_stream_order, strahler_stream_order, shreve_stream_magnitude
Python API
def topological_stream_order(self, d8_pntr: Raster, streams_raster: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Tributary Identifier
Function name: tributary_identifier
This tool can be used to assign a unique identifier to each tributary in a stream network. A tributary is a section of a stream network extending from a channel head downstream to a confluence with a larger stream. Relative stream size is estimated using stream length as a surrogate. Tributaries therefore extend from channel heads downstream until a confluence is encountered in which the intersecting stream is longer, or an outlet cell is detected.
The input streams raster (streams_raster) is used to designate which grid cells contain a stream and the pointer image is used to traverse the stream network. Stream cells are designated in the streams image as all values greater than zero. Thus, all non-stream or background grid cells are commonly assigned either zeros or NoData values. Background cells will be assigned the NoData value in the output image, unless zero_background=True, in which case non-stream cells will be assigned zero values in the output.
The user must specify the name of a flow pointer (flow direction) raster (d8_pntr) and a streams raster (streams_raster). The flow pointer and streams rasters should be generated using the d8_pointer algorithm. This will require a depressionless DEM, processed using either the breach_depressions_least_cost or fill_depressions tool.
By default, the pointer raster is assumed to use the clockwise indexing method used by WhiteboxTools. If the pointer file contains ESRI flow direction values instead, set esri_pntr=True.
See Also
d8_pointer, stream_link_identifier, breach_depressions_least_cost, fill_depressions
Python API
def tributary_identifier(self, d8_pntr: Raster, streams_raster: Raster, esri_pntr: bool = False, zero_background: bool = False) -> Raster:
Vector Stream Network Analysis
Function name: vector_stream_network_analysis
This tool performs common stream network analysis operations on an input vector stream file (streams). The network indices produced by this analysis are contained within the output vector's (output) attribute table. The following table shows each of the network indices that are calculated.
| Index Name | Description |
|---|---|
| OUTLET | Unique outlet identifying value, used as basin identifier |
| TRIB_ID | Unique tributary identifying value |
| DIST2MOUTH | Distance to outlet (i.e., mouth node) |
| DS_NODES | Number of downstream nodes |
| TUCL | Total upstream channel length; the channel equivalent to catchment area |
| MAXUPSDIST | Maximum upstream distance |
| HORTON | Horton stream order |
| STRAHLER | Strahler stream order |
| SHREVE | Shreve stream magnitude |
| HACK | Hack stream order |
| MAINSTREAM | Boolean value indicating whether link is the main stream trunk of its basin |
| MIN_ELEV | Minimum link elevation (from DEM) |
| MAX_ELEV | Maximum link elevation (from DEM) |
| IS_OUTLET | Boolean value indicating whether link is an outlet link |
In addition to the input and output files, the user must also specify the name of an input DEM file (dem), the maximum ridge-cutting height, in DEM z units (max_ridge_cutting_height), and the snap distance used for identifying any topological errors in the stream file (snap_distance). The main function of the input DEM is to distinguish between outlet and headwater links in the network, which can be differentiated by their elevations during the priority-flood operation used in the algorithm (see Lindsay et al. 2019). The maximum ridge-cutting height parameter is useful for preventing erroneous stream capture in the headwaters when channel heads are very near (within the snap distance), which is usually very rare. The snap distance parameter is used to deal with certain common topological errors. However, it is advisable that the input streams file be pre-processed prior to analysis.
Note: The input streams file for this tool should be pre-processed using the repair_stream_vector_topology tool. This is an important step.
[Figures illustrating the OUTLET, HORTON, SHREVE, and TRIB_ID indices]
Many of the network indices output by this tool for vector streams have raster equivalents in WhiteboxTools. For example, see the strahler_stream_order and shreve_stream_magnitude tools.
Tool outputs are: stream lines vector, confluences points vector, outlet points vector, and channel head points vector.
Reference
Lindsay, JB, Yang, W, Hornby, DD. 2019. Drainage network analysis and structuring of topologically noisy vector stream data. ISPRS International Journal of Geo-Information. 8(9), 422; DOI: 10.3390/ijgi8090422
See Also
repair_stream_vector_topology, strahler_stream_order, shreve_stream_magnitude
Python API
def vector_stream_network_analysis(self, streams: Vector, dem: Raster, max_ridge_cutting_height: float = 10.0, snap_distance: float = 0.001) -> Tuple[Vector, Vector, Vector, Vector]:
LiDAR Processing
LiDAR point clouds are the highest-resolution elevation data source available for most practitioners. WbW-QGIS exposes the full Whitebox LiDAR pipeline — from quality assurance through ground classification, surface modelling, and height normalisation — directly in the QGIS Processing Toolbox.
This chapter walks through a complete bare-earth and canopy-height workflow starting from a raw LAS/LAZ file.
Key Concepts
- Point cloud: A set of 3-D coordinates (X, Y, Z) plus attributes (intensity, return number, classification, scan angle, etc.) acquired by laser scanning from airborne or terrestrial platforms.
- LAS/LAZ: The industry-standard binary format for point clouds. LAZ is a losslessly compressed variant. COPC (Cloud-Optimised Point Cloud) is a tiled LAZ variant for efficient streaming.
- Classification codes: Numeric labels assigned to points indicating surface type (1 = unclassified, 2 = ground, 3–5 = vegetation, 6 = building, etc. per ASPRS convention).
- DTM: Digital Terrain Model — a raster surface interpolated from ground-classified points only.
- DSM: Digital Surface Model — a raster surface from first returns, representing the tops of all objects (vegetation, buildings).
- CHM: Canopy Height Model — DSM minus DTM, representing object height above ground.
- Height above ground (HAG): Per-point elevation relative to the interpolated ground surface. Enables classification of vegetation returns by height tier.
End-to-End Workflow: DTM, DSM, and Canopy Height Model
Inputs
| Layer | Format | Notes |
|---|---|---|
| cloud.laz | LAZ point cloud | Any ASPRS LAS version 1.0–1.4 |
Step 1 — Point Cloud Quality Check
Processing Toolbox → Whitebox Workflows → LiDAR →
LiDAR Point Stats
| Parameter | Recommended value |
|---|---|
| Input LiDAR file | cloud.laz |
| Output | cloud_stats.html (HTML report) |
Review the report for:
- Point density (pts/m²)
- Classification distribution (% ground, vegetation, unclassified)
- Z range and intensity histogram
- Scan angle range (should be ±20° for most airborne missions)
Step 2 — Thin High-Density Files (Optional)
For point clouds denser than 50 pts/m², thinning reduces processing time with minimal accuracy loss.
Processing Toolbox → Whitebox Workflows → LiDAR →
LiDAR Thin
| Parameter | Recommended value |
|---|---|
| Input LiDAR file | cloud.laz |
| Resolution | 0.5 (metres) |
| Retain ground points | ✓ enabled |
| Output | cloud_thin.laz |
Step 3 — Classify Ground Points
If the input file has all points as unclassified (class 1), classify ground returns before surface modelling.
Processing Toolbox → Whitebox Workflows → LiDAR →
LiDAR Ground Point Filter
| Parameter | Recommended value |
|---|---|
| Input LiDAR file | cloud.laz (or cloud_thin.laz) |
| Radius (m) | 2.0 |
| Minimum slope (°) | 5.0 |
| Maximum slope (°) | 85.0 |
| Terrain type | Normal |
| Output | cloud_classified.laz |
For complex terrain (steep slopes, dense vegetation), increase radius to 4.0 and reduce minimum slope to 2.0.
Step 4 — Build DTM from Ground Points
Processing Toolbox → Whitebox Workflows → LiDAR →
LiDAR IDW Interpolation
| Parameter | Recommended value |
|---|---|
| Input LiDAR file | cloud_classified.laz |
| IDW weight | 2.0 |
| Search radius (m) | 2.5 |
| Minimum number of points | 3 |
| Exclusion classes | (leave empty to use ground points only) |
| Returns | Last |
| Point classes included | 2 (ground) |
| Grid resolution | 0.5 |
| Output | dtm.tif |
Alternatively, use LiDAR TIN Gridding for faster interpolation on
uniformly distributed clouds.
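To show how the Step 4 parameters interact, here is a minimal inverse-distance-weighting sketch for a single grid cell. It assumes that a cell is left empty (`None`) when fewer than the minimum number of points fall inside the search radius; the actual tool may handle sparse neighbourhoods differently. The function name and sample points are hypothetical.

```python
import math


def idw_estimate(points, x, y, weight=2.0, radius=2.5, min_points=3):
    """Estimate z at (x, y) from (px, py, pz) points using IDW."""
    numerator = 0.0
    denominator = 0.0
    count = 0
    for px, py, pz in points:
        dist = math.hypot(px - x, py - y)
        if dist > radius:
            continue  # outside the search radius
        if dist == 0.0:
            return pz  # a point exactly at the cell centre decides the value
        w = 1.0 / dist ** weight  # closer points receive larger weights
        numerator += w * pz
        denominator += w
        count += 1
    if count < min_points:
        return None  # assumption: too few neighbours leaves the cell empty
    return numerator / denominator


# Three ground points 1 m from the cell centre and one far outlier.
ground = [(1.0, 0.0, 10.0), (-1.0, 0.0, 12.0), (0.0, 1.0, 14.0), (10.0, 10.0, 99.0)]
print(idw_estimate(ground, 0.0, 0.0))      # 12.0
print(idw_estimate(ground[:2], 0.0, 0.0))  # None
```

A larger search radius fills more cells in sparse areas at the cost of smoothing, which is why it is the first parameter to adjust when the DTM shows NoData holes.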
Step 5 — Build DSM from First Returns
Processing Toolbox → Whitebox Workflows → LiDAR →
LiDAR IDW Interpolation (second pass)
| Parameter | Recommended value |
|---|---|
| Input LiDAR file | cloud_classified.laz |
| Returns | First |
| Point classes included | (all — leave blank) |
| Grid resolution | 0.5 |
| Output | dsm.tif |
Step 6 — Canopy Height Model
Subtract DTM from DSM using the QGIS Raster Calculator, or use the dedicated CHM tool.
Processing Toolbox → Whitebox Workflows → LiDAR →
Canopy Height Model
| Parameter | Recommended value |
|---|---|
| Input LiDAR file | cloud_classified.laz |
| DTM raster | dtm.tif |
| Grid resolution | 0.5 |
| Output | chm.tif |
Apply a stretch from 0 to the 98th percentile height value. Negative cells (< 0) indicate DTM–DSM interpolation artefacts; clamp to 0 in post-processing.
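The subtraction and clamping described in Step 6 amount to a cell-by-cell operation, sketched below on plain nested lists. `None` stands in for NoData, and the function name and sample values are hypothetical; in practice the same expression would be applied with the Raster Calculator.

```python
def canopy_height(dsm, dtm):
    """Return DSM minus DTM per cell, clamped at zero; None marks NoData."""
    chm = []
    for dsm_row, dtm_row in zip(dsm, dtm):
        row = []
        for top, ground in zip(dsm_row, dtm_row):
            if top is None or ground is None:
                row.append(None)  # keep NoData where either input is missing
            else:
                row.append(max(0.0, top - ground))  # clamp negative artefacts
        chm.append(row)
    return chm


dsm = [[105.0, 100.2], [98.0, None]]
dtm = [[100.0, 100.5], [98.0, 97.0]]
chm = canopy_height(dsm, dtm)
print(chm)  # [[5.0, 0.0], [0.0, None]]
```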
Step 7 — Height Above Ground Normalisation
Assign a HAG value to every point for per-return vertical stratification analysis.
Processing Toolbox → Whitebox Workflows → LiDAR →
Height Above Ground
| Parameter | Recommended value |
|---|---|
| Input LiDAR file | cloud_classified.laz |
| Output | cloud_hag.laz |
The tool sets the Z coordinate of each point to its height above the interpolated ground surface. Ground points are set to 0.
Python Console Equivalent
import processing
cloud = '/data/cloud.laz'
# Step 1: stats / QA
processing.run('whitebox_workflows:lidar_point_stats', {
'input': cloud,
'output': '/data/cloud_stats.html',
})
# Step 3: ground classification
processing.run('whitebox_workflows:lidar_ground_point_filter', {
'input': cloud,
'radius': 2.0,
'min_slope': 5.0,
'max_slope': 85.0,
'terrain_type': 'Normal',
'output': '/data/cloud_classified.laz',
})
# Step 4: DTM
processing.run('whitebox_workflows:lidar_idw_interpolation', {
'input': '/data/cloud_classified.laz',
'parameter': 'elevation',
'returns': 'Last',
'classes_included': '2',
'weight': 2.0,
'radius': 2.5,
'min_points': 3,
'resolution': 0.5,
'output': '/data/dtm.tif',
})
# Step 5: DSM
processing.run('whitebox_workflows:lidar_idw_interpolation', {
'input': '/data/cloud_classified.laz',
'parameter': 'elevation',
'returns': 'First',
'classes_included': '',
'weight': 2.0,
'radius': 2.5,
'min_points': 3,
'resolution': 0.5,
'output': '/data/dsm.tif',
})
# Step 6: CHM
processing.run('whitebox_workflows:canopy_height_model', {
'input': '/data/cloud_classified.laz',
'dtm': '/data/dtm.tif',
'resolution': 0.5,
'output': '/data/chm.tif',
})
# Step 7: HAG
processing.run('whitebox_workflows:height_above_ground', {
'input': '/data/cloud_classified.laz',
'output': '/data/cloud_hag.laz',
})
print("LiDAR pipeline complete.")
Common Pitfalls
| Problem | Likely cause | Fix |
|---|---|---|
| DTM has large NoData holes | Ground point density too low | Increase IDW search radius or use TIN gridding |
| CHM has negative values | DTM higher than DSM in flat/water areas | Clamp CHM ≥ 0 with Raster Calculator after generation |
| Classification codes all zero after processing | Input was LAS point format 6–10 and legacy writer bug (see known issues) | Use WbW Next Gen pipeline — classification is preserved correctly |
| Ground filter over-segments in steep terrain | Slope parameters too restrictive | Increase max slope to 85° and radius to 4 m |
| LiDAR stats report extreme Z values | Outlier high/low points present | Run LiDAR Remove Outliers before classification |
Validation Checklist
- Point stats report shows expected classification distribution (> 5 % ground points).
- DTM has no large NoData holes in vegetated areas.
- DTM is smooth with no pits deeper than 1–2 m.
- DSM ≥ DTM across the entire overlap extent.
- CHM values are 0 on roads and bare ground.
- HAG-normalised cloud has all ground points at Z ≈ 0 (±0.1 m).
I/O and Data Management
Ascii To LAS
Function name: ascii_to_las
This tool can be used to convert one or more ASCII files, containing LiDAR point data, into LAS files. The user must specify the name(s) of the input ASCII file(s) (inputs). Each input file will have a correspondingly named output file with a .las file extension. The output point data, each on a separate line, will take the format:
x,y,z,intensity,class,return,num_returns
| Value | Interpretation |
|---|---|
| x | x-coordinate |
| y | y-coordinate |
| z | elevation |
| i | intensity value |
| c | classification |
| rn | return number |
| nr | number of returns |
| time | GPS time |
| sa | scan angle |
| r | red |
| b | blue |
| g | green |
The x, y, and z patterns must always be specified. If the rn pattern is used, the nr pattern must also be specified. Examples of valid pattern strings include:
- 'x,y,z,i'
- 'x,y,z,i,rn,nr'
- 'x,y,z,i,c,rn,nr,sa'
- 'z,x,y,rn,nr'
- 'x,y,z,i,rn,nr,r,g,b'
Use the las_to_ascii tool to convert a LAS file into a text file containing LiDAR point data.
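The pattern rules above can be checked before running the tool. The following is a minimal sketch in plain Python; `parse_pattern` and `parse_line` are hypothetical helper names, and the validation performed by the tool itself may differ in detail.

```python
# Field codes taken from the pattern table in this section.
VALID_FIELDS = {"x", "y", "z", "i", "c", "rn", "nr", "time", "sa", "r", "g", "b"}


def parse_pattern(pattern):
    """Split a pattern string into field codes and check the stated rules."""
    fields = [f.strip().lower() for f in pattern.split(",")]
    unknown = [f for f in fields if f not in VALID_FIELDS]
    if unknown:
        raise ValueError(f"unknown pattern fields: {unknown}")
    for required in ("x", "y", "z"):
        if required not in fields:
            raise ValueError(f"pattern must include '{required}'")
    if "rn" in fields and "nr" not in fields:
        raise ValueError("'rn' requires 'nr' to be specified as well")
    return fields


def parse_line(line, fields):
    """Map one comma-separated ASCII record onto the pattern's field codes."""
    values = line.strip().split(",")
    if len(values) != len(fields):
        raise ValueError("record length does not match the pattern")
    return {name: float(value) for name, value in zip(fields, values)}


fields = parse_pattern("x,y,z,i")
record = parse_line("562100.5,4819400.25,321.7,55", fields)
```

A pattern such as 'x,y,z,rn' is rejected by this sketch because rn appears without nr.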
See Also
las_to_ascii
Python API
def ascii_to_las(self, input_ascii_files: List[str], pattern: str, epsg_code: int) -> None:
LAS To Ascii
Function name: las_to_ascii
This tool can be used to convert one or more LAS files, containing LiDAR data, into ASCII files. The user must specify the name(s) of the input LAS file(s) (inputs). Each input file will have a correspondingly named output file with a .csv file extension. CSV files are comma separated value files and contain tabular data with each column corresponding to a field in the table and each row a point value. Fields are separated by commas in the ASCII formatted file. The output point data, each on a separate line, will take the format:
X,Y,Z,INTENSITY,CLASS,RETURN,NUM_RETURN,SCAN_ANGLE
If the LAS file has a point format that contains RGB data, the final three columns will contain the RED, GREEN, and BLUE values respectively. Use the ascii_to_las tool to convert a text file containing LiDAR point data into a LAS file.
See Also
ascii_to_las
Python API
def las_to_ascii(self, input_lidar: Optional[Lidar]) -> None:
LAS To Shapefile
Function name: las_to_shapefile
This tool converts one or more LAS files into a POINT vector. When the input parameter is not specified, the tool processes all LAS files contained within the working directory. The attribute table of the output Shapefile will contain fields for the z-value, intensity, point class, return number, and number of returns.
This tool can be used in place of the LasToMultipointShapefile tool when the number of points is relatively low and when the desire is to represent more than simply the x,y,z position of points. Notice, however, that because each point in the input LAS file will be represented as a separate record in the output Shapefile, the output file will be many times larger than the equivalent output of the LasToMultipointShapefile tool. There is also a practical limit on the total number of records that can be held in a single Shapefile, and large LAS files approach this limit. In these cases, the LasToMultipointShapefile tool should be preferred instead.
See Also
LasToMultipointShapefile
Python API
def las_to_shapefile(self, input_lidar: Optional[Lidar], output_multipoint: bool = False) -> Vector:
LiDAR Colourize
Function name: lidar_colourize
This tool can be used to add red-green-blue (RGB) colour values to the points contained within an input LAS file (in_lidar), based on the pixel values of an overlapping input colour image (in_image). Ideally, the image has been acquired at the same time as the LiDAR point cloud. If this is not the case, one may expect that transient objects (e.g. cars) in both input data sets will be incorrectly coloured. The input image should overlap in extent with the LiDAR data set and the two data sets should share the same projection. You may use the lidar_tile_footprint tool to determine the spatial extent of the LAS file.
See Also
colourize_based_on_class, colourize_based_on_point_returns, lidar_tile_footprint
Python API
def lidar_colourize(self, in_lidar: Lidar, in_image: Raster) -> Lidar:
LiDAR Join
Function name: lidar_join
This tool can be used to merge multiple LiDAR LAS files into a single output LAS file. Due to their large size, LiDAR data sets are often tiled into smaller, non-overlapping tiles. Sometimes it is more convenient to combine multiple tiles together for data processing and lidar_join can be used for this purpose.
See Also
lidar_tile
Python API
def lidar_join(self, inputs: List[Lidar]) -> Lidar:
LiDAR Shift
Function name: lidar_shift
This tool can be used to shift the x,y,z coordinates of points within a LiDAR file. The user must specify the name of the input file (input) and the output file (output). Additionally, the user must specify the x,y,z shift values (x_shift, y_shift, z_shift). At least one non-zero shift value is needed to run the tool. Notice that shifting the x,y,z coordinates of LiDAR points is also possible using the modify_lidar tool, which can also be used for more sophisticated point property manipulation (e.g. rotations).
See Also
modify_lidar, lidar_elevation_slice, height_above_ground
Python API
def lidar_shift(self, input: Lidar, x_shift: float = 0.0, y_shift: float = 0.0, z_shift: float = 0.0) -> Lidar:
LiDAR Tile
Function name: lidar_tile
This tool can be used to break a LiDAR LAS file into multiple, non-overlapping tiles, each saved as a single LAS file. The user must specify the parameters of the tile grid, including its origin (origin_x and origin_y) and the tile width and height (width and height). Tiles containing fewer points than specified in the min_points parameter will not be output. This can be useful when tiling terrestrial LiDAR datasets because the low point density at the edges of the point cloud (i.e. most distant from the scan station) can result in poorly populated tiles containing relatively few points.
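The tile grid described above can be sketched as a simple binning of points by origin, width, and height. This is an illustration only: `assign_tiles` is a hypothetical helper, the (row, col) indexing convention is an assumption, and the tool's own tile naming is not shown.

```python
import math
from collections import defaultdict


def assign_tiles(points, origin_x=0.0, origin_y=0.0,
                 width=1000.0, height=1000.0, min_points=2):
    """Group (x, y, z) points into grid tiles and drop sparse tiles.

    Tiles are keyed by (row, col) relative to the grid origin (assumed
    convention). Tiles with fewer than min_points points are not returned.
    """
    tiles = defaultdict(list)
    for x, y, z in points:
        col = math.floor((x - origin_x) / width)
        row = math.floor((y - origin_y) / height)
        tiles[(row, col)].append((x, y, z))
    return {key: pts for key, pts in tiles.items() if len(pts) >= min_points}


# Hypothetical points: two tiles are well populated, one holds a single point.
points = [(10.0, 10.0, 1.0), (20.0, 20.0, 2.0), (1500.0, 10.0, 3.0),
          (-5.0, -5.0, 1.0), (-6.0, -7.0, 2.0)]
tiles = assign_tiles(points)
```

The tile containing only the point at x = 1500 is dropped because it falls below the minimum point count.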
See Also
lidar_join, lidar_tile_footprint
Python API
def lidar_tile(self, input_lidar: Lidar, tile_width: float = 1000.0, tile_height: float = 1000.0, origin_x: float = 0.0, origin_y: float = 0.0, min_points_in_tile: int = 2, output_laz_format: bool = True) -> None:
LiDAR Tophat Transform
Function name: lidar_tophat_transform
This tool performs a white top-hat transform on a LiDAR point cloud (input). A top-hat transform is a common digital image processing operation used for various tasks, such as feature extraction, background equalization, and image enhancement. When applied to a LiDAR point cloud, the white top-hat transform provides an estimate of height above ground, which is useful for modelling the vegetation canopy.
As an example, notice that the input point cloud on the top of the image below has a substantial amount of topographic variability. After applying the top-hat transform (bottom point cloud), all of this topographic variability has been removed and point elevation values effectively become height above ground.
The white top-hat transform is defined as the difference between a point's original elevation and its opening. The opening operation can be thought of as the local neighbourhood maximum of a previous local minimum surface. The user must specify the size of the neighbourhood using the radius parameter. Setting this parameter can require some experimentation. Generally, it is appropriate to use a radius of a few meters in non-urban landscapes. However, in urban areas, the radius may need to be set much larger, reflective of the size of the largest building.
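The definition above (original elevation minus its opening, where the opening is a local maximum of a local minimum surface) can be illustrated on a one-dimensional profile. This is a simplified sketch, not the tool's implementation: the tool works on neighbourhoods in a full point cloud, and `tophat_1d` is a hypothetical name.

```python
def tophat_1d(xs, zs, radius):
    """White top-hat of elevations zs sampled at positions xs.

    Step 1 (erosion): local minimum within the radius.
    Step 2 (dilation of the eroded surface): local maximum within the radius.
    The result of step 2 is the opening; the top-hat is z minus the opening.
    """
    n = len(xs)

    def neighbours(i):
        return [j for j in range(n) if abs(xs[j] - xs[i]) <= radius]

    eroded = [min(zs[j] for j in neighbours(i)) for i in range(n)]
    opened = [max(eroded[j] for j in neighbours(i)) for i in range(n)]
    return [z - o for z, o in zip(zs, opened)]


# Sloping ground (10 m rising to 16 m) with one elevated return at x = 3.
xs = [0, 1, 2, 3, 4, 5, 6]
zs = [10, 11, 12, 25, 14, 15, 16]
heights = tophat_1d(xs, zs, radius=1)
```

In this profile the sloping ground is reduced to values near zero while the elevated return at x = 3 keeps a large value. Values at the ends of the profile can be slightly inflated by edge effects.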
If the input point cloud already has ground points classified, it may be better to use the height_above_ground tool, which simply measures the difference in height between each point and its nearest ground-classified point within the search radius.
See Also
height_above_ground, tophat_transform, closing, opening
Python API
def lidar_tophat_transform(self, input: Lidar, search_radius: float) -> Lidar:
Recover Flightline Info
Function name: recover_flightline_info
Description
Raw airborne LiDAR data are collected along flightlines and multiple flightlines are typically merged into square tiles to simplify data handling and processing. Commonly the Point Source ID attribute is used to store information about the origin flightline of each point. However, sometimes this information is lost (e.g. during data format conversion) or is omitted from some data sets. This tool can be used to identify groups of points within a LiDAR file (input) that belong to the same flightline.
The tool works by sorting points based on their timestamp and then identifying points for which the time difference from the previous point is greater than a user-specified maximum time difference (max_time_diff), which are deemed to be the start of a different flightline. The operational assumption is that the time between consecutive points within a flightline is usually quite small (usually a fraction of a second), while the time between points in different flightlines is often relatively large (consider the aircraft turning time needed to take multiple passes of the study area). By default the maximum time difference is set to 5.0 seconds, although it may be necessary to increase this value depending on the characteristics of a particular data set.
The tool works on individual LiDAR tiles and the flightline identifiers will range from 0 to the number of flightlines detected within the tile, minus one. Therefore, the flightline identifier created by this tool will not extend beyond the boundaries of the tile and into adjacent tiles. That is, a flightline that extends across multiple adjacent LiDAR tiles may have different flightline identifiers used in each tile. The identifiers are intended to discriminate between flightlines within a single file. The flightline identifier value can be optionally assigned to the Point Source ID point attribute (pt_src_id), the User Data point attribute (user_data), and the red-green-blue point colour data (rgb) within the output file (output). At least one of these output options must be selected and it is possible to select multiple output options. Notice that if the input file contains any information within the selected output fields, the original information will be overwritten in the output file; it will remain unaltered within the input file, which this tool does not modify. If the input file does not contain RGB colour data and the rgb output option is selected, the output file point format will be altered from the input file to accommodate the addition of RGB colour data. Flightlines are assigned random colours. The LAS User Data point attribute is stored as a single byte and, therefore, if this output option is selected and the input file contains more than 256 flightlines, the tool will assign the same flightline identifier to more than one flightline. It is very rare for this condition to occur in typical 1 km-square tiles. The Point Source ID attribute is stored as a 16-bit integer and can therefore store 65,536 unique flightline identifiers.
Outputting flightline information within the colour data point attribute can be useful for visualizing areas of flightline overlap within a file. This can be an important quality assurance/quality control (QA/QC) step after acquiring a new LiDAR data set.
Please note that because this tool sorts points by their timestamps, the order of points in the output file may not match that of the input file.
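The time-gap rule described above can be sketched in a few lines of plain Python. `recover_flightlines` is a hypothetical helper that works on a list of GPS timestamps; it illustrates the sorting and gap detection only and does not write any LAS attributes.

```python
def recover_flightlines(times, max_time_diff=5.0):
    """Assign a flightline ID (0, 1, 2, ...) to each point timestamp.

    Points are visited in time order; a gap larger than max_time_diff
    between consecutive points starts a new flightline. IDs are returned
    in the original input order.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    ids = [0] * len(times)
    current = 0
    for prev, idx in zip(order, order[1:]):
        if times[idx] - times[prev] > max_time_diff:
            current += 1
        ids[idx] = current
    return ids


# Hypothetical GPS times (seconds) from three passes, stored out of order.
times = [100.0, 100.1, 250.0, 100.2, 250.3, 400.0]
flightline_ids = recover_flightlines(times)
```

With the default 5.0 second threshold the six sample points fall into three flightlines, numbered from 0.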
See Also
flightline_overlap, find_flightline_edge_points, LidarSortByTime
Python API
def recover_flightline_info(self, input: Lidar, max_time_diff: float = 5.0, pt_src_id: bool = False, user_data: bool = False, rgb: bool = False) -> Lidar:
Select Tiles By Polygon
Function name: select_tiles_by_polygon
This tool copies LiDAR tiles overlapping with a polygon into an output directory. In actuality, the tool performs point-in-polygon operations, using the four corner points, the center point, and the four mid-edge points of each LiDAR tile bounding box and the polygons. This representation of overlapping geometry aids with performance. This approach generally works well when the polygon size is large relative to the LiDAR tiles. If, however, the input polygon is small relative to the tile size, this approach may fail to copy some overlapping tiles. It is advisable to buffer the polygon if this occurs.
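The nine-point test described above, and the reason small polygons can be missed, can be sketched as follows. The helper names are hypothetical and the ray-casting routine is a generic point-in-polygon test, not the tool's own code.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test for a simple polygon given as a list of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def tile_sample_points(min_x, min_y, max_x, max_y):
    """Four corners, four mid-edge points, and the centre of a bounding box."""
    mid_x = (min_x + max_x) / 2.0
    mid_y = (min_y + max_y) / 2.0
    return [(x, y) for x in (min_x, mid_x, max_x) for y in (min_y, mid_y, max_y)]


def tile_overlaps(bbox, polygon):
    """True if any of the nine sample points falls inside the polygon."""
    return any(point_in_polygon(x, y, polygon)
               for x, y in tile_sample_points(*bbox))


tile = (1000.0, 1000.0, 2000.0, 2000.0)
large_poly = [(0, 0), (3000, 0), (3000, 3000), (0, 3000)]
small_poly = [(1100, 1100), (1200, 1100), (1200, 1200), (1100, 1200)]
hit_large = tile_overlaps(tile, large_poly)
hit_small = tile_overlaps(tile, small_poly)
```

The small polygon lies entirely inside the tile but contains none of the nine sample points, so it is not detected. Buffering the polygon until it covers at least one sample point avoids this.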
See Also
lidar_tile_footprint
Python API
def select_tiles_by_polygon(self, input_directory: str, output_directory: str, polygons: Vector) -> None:
Sort LiDAR
Function name: sort_lidar
Description
This tool can be used to sort the points in an input LiDAR file (input) based on their properties with respect to one or more sorting criteria (criteria). The sorting criteria may include: the x, y or z coordinates (x, y, z), the intensity data (intensity), the point class value (class), the point user data field (user_data), the return number (ret_num), the point source ID (point_source_id), the point scan angle data (scan_angle), the scanner channel (scanner_channel; LAS 1.4 datasets only), and the acquisition time (time). The following is an example of a complex sorting criteria statement that includes multiple criteria:
x 100.0, y 100.0, z 10.0, scan_angle
Criteria should be separated by a comma, semicolon, or pipe (|). Each criterion may have an associated bin value. In the example above, point x values are sorted into bins of 100 m, which are then sorted by y values into bins of 100 m, and sorted by point z values into bins of 10 m, and finally sorted by their scan_angle.
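The binned, multi-key ordering described above can be sketched as follows. `parse_criteria` and `sort_key` are hypothetical helpers, and forming bins by floor division is an assumption about how the tool bins values.

```python
import math
import re


def parse_criteria(statement):
    """Parse 'x 100.0, y 100.0, scan_angle' into (name, bin_size) pairs."""
    criteria = []
    for part in re.split(r"[,;|]", statement):
        tokens = part.split()
        if not tokens:
            continue
        bin_size = float(tokens[1]) if len(tokens) > 1 else None
        criteria.append((tokens[0], bin_size))
    return criteria


def sort_key(point, criteria):
    """Build a tuple key: the bin index where a bin is given, else the raw value."""
    key = []
    for name, bin_size in criteria:
        value = point[name]
        key.append(math.floor(value / bin_size) if bin_size else value)
    return tuple(key)


criteria = parse_criteria("x 100.0, y 100.0, z 10.0, scan_angle")
points = [
    {"x": 250.0, "y": 10.0, "z": 5.0, "scan_angle": 3},
    {"x": 120.0, "y": 90.0, "z": 5.0, "scan_angle": -2},
    {"x": 180.0, "y": 20.0, "z": 5.0, "scan_angle": -7},
]
ordered = sorted(points, key=lambda p: sort_key(p, criteria))
```

The two points with x between 100 and 200 share the same x, y, and z bins, so their relative order is decided by scan_angle rather than by their exact x values.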
Sorting point values can have a significant impact on the compression rate when using certain compressed LiDAR data formats (e.g. LAZ, zLidar). Sorting values can also improve the visualization speed in some rendering software.
Note that if the user does not specify the optional input LiDAR file, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be useful for processing a large number of LiDAR files in batch mode. When this batch mode is applied, the output file names will be the same as the input file names but with a '_sorted' suffix added to the end.
Python API
def sort_lidar(self, sort_criteria: str, input_lidar: Optional[Lidar]) -> Optional[Lidar]:
Split LiDAR
Function name: split_lidar
Description
This tool can be used to split an input LiDAR file (input) into a series of output files, placing points into each output based on their properties with respect to a grouping criterion (criterion). Points can be grouped based on a specified number of points in the output file (num_pts; note the last file may contain fewer points), the x, y or z coordinates (x, y, z), the intensity data (intensity), the point class value (class), the point user data field (user_data), the point source ID (point_source_id), the point scan angle data (scan_angle), and the acquisition time (time). Points are binned into groupings based on a user-specified interval value (interval). For example, if an interval of 50.0 is used with the z criterion, a series of files will be output that are elevation bands of 50 m. The user may also optionally specify the minimum number of points needed before a particular grouping file is saved (min_pts). The interval value is not used for the class and point_source_id criteria.
With this tool, a single input file can generate many output files. The names of the output files will reflect the point attribute used for the grouping and the bin. For example, running the tool on an input file named my_file.las using the intensity criterion with an interval of 1000 may produce the following files:
- my_file_intensity0.las
- my_file_intensity1000.las
- my_file_intensity2000.las
- my_file_intensity3000.las
- my_file_intensity4000.las
Where the number after the attribute (intensity, in this case) reflects the lower boundary of the bin. Thus, the first file contains all of the input points with intensity values from 0 to just less than 1000.
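The bin and file naming scheme described above can be sketched as follows. `split_by_interval` is a hypothetical helper; the integer formatting of the bin's lower boundary is inferred from the example file names and may differ for fractional intervals.

```python
import math
from collections import defaultdict


def split_by_interval(values, interval, stem, criterion, min_pts=1, ext=".las"):
    """Group values into bins and name each group by the bin's lower boundary.

    Groups holding fewer than min_pts values are not returned.
    """
    groups = defaultdict(list)
    for value in values:
        lower = int(math.floor(value / interval) * interval)
        groups[lower].append(value)
    return {
        f"{stem}_{criterion}{lower}{ext}": members
        for lower, members in sorted(groups.items())
        if len(members) >= min_pts
    }


# Hypothetical intensity values for five points.
intensities = [5, 999, 1000, 2500, 4100]
outputs = split_by_interval(intensities, 1000, "my_file", "intensity")
```

Here the values 5 and 999 both land in my_file_intensity0.las, while 1000 starts the next bin. No file is produced for the empty 3000 bin.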
Note that if the user does not specify the optional input LiDAR file, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be useful for processing a large number of LiDAR files in batch mode. When this batch mode is applied, the output file names will be the same as the input file names but with a suffix added to the end reflective of the split criterion and value (see above).
See Also
sort_lidar, filter_lidar, modify_lidar, lidar_elevation_slice
Python API
def split_lidar(self, split_criterion: str, input_lidar: Optional[Lidar], interval: float = 5.0, min_pts: int = 5) -> None:
Filtering and Classification
Classify Buildings In LiDAR
Function name: classify_buildings_in_lidar
This tool can be used to assign the building class (classification value 6) to all points within an input LiDAR point cloud (input) that are contained within the polygons of an input buildings footprint vector (buildings). The tool performs a simple point-in-polygon operation to determine membership. The two inputs (i.e. the LAS file and vector) must share the same map projection. Furthermore, any error in the definition of the building footprints will result in misclassified points in the output LAS file (output). In particular, if the footprints extend slightly beyond the actual building, ground points situated adjacent to the building will be incorrectly classified. Thus, care must be taken in digitizing building footprint polygons. Furthermore, where there are tall trees that overlap significantly with the building footprint, these vegetation points will also be incorrectly assigned the building class value.
See Also
filter_lidar_classes, lidar_ground_point_filter, clip_lidar_to_polygon
Python API
def classify_buildings_in_lidar(self, in_lidar: Lidar, building_footprints: Vector) -> Lidar:
Classify LiDAR
Function name: classify_lidar
Description
This tool provides a basic classification of a LiDAR point cloud into ground, building, and vegetation classes. The algorithm performs the classification based on point neighbourhood geometric properties, including planarity, linearity, and height above the ground. There is also a point segmentation involved in the classification process.
The user may specify the names of the input and output LiDAR files (input and output). Note that if the user does not specify the optional input/output LiDAR files, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be useful for processing a large number of LiDAR files in batch mode. When this batch mode is applied, the output file names will be the same as the input file names but with a '_classified' suffix added to the end.
The search distance (radius), defining the radius of the neighbourhood window surrounding each point, must also be specified. If this parameter is set to a value that is too large, areas of high surface curvature on the ground surface will be left unclassed and smaller buildings, e.g. sheds, will not be identified. If the parameter is set too small, areas of low point density may provide unsatisfactory classification values. The larger this search distance is, the longer the algorithm will take to process a data set. For many airborne LiDAR data sets, a value between 1.0 and 3.0 meters is likely appropriate.
The ground threshold parameter (grd_threshold) determines how far above the tophat-transformed surface a point must be to be excluded from the ground surface. This parameter also determines the maximum distance a point can be from a plane or line model fit to a neighbourhood of points to be considered part of the model geometry. Similarly the off-terrain object threshold parameter (oto_threshold) is used to determine how high above the ground surface a point must be to be considered either a vegetation or building point. The ground threshold must be smaller than the off-terrain object threshold. If you find that breaks-in-slope in areas of more complex ground topography are left unclassed (class = 1), this can be addressed by raising the ground threshold parameter.
The planarity and linearity thresholds (planarity_threshold and linearity_threshold) describe the minimum proportion (0-1) of neighbouring points that must be part of a fitted model before the point is considered to be planar or linear. Both of these properties are used by the algorithm in a variety of ways to determine final class values. Planar and linear models are fit using a RANSAC-like algorithm, with the main user-specified parameter of the number of iterations (iterations). The larger the number of iterations the greater the processing time will be.
The facade threshold (facade_threshold) is the last user-specified parameter, and determines the maximum horizontal distance that a point beneath a rooftop edge point may be to be considered part of the building facade (i.e. walls). The default value is 0.5 m, although this value will depend on a number of factors, such as whether or not the building has balconies.
The algorithm generally does very well to identify deciduous (broad-leaf) trees but can at times incorrectly classify dense coniferous (needle-leaf) trees as buildings. When this is the case, you may counter this tendency by lowering the planarity threshold parameter value. Similarly, the algorithm will generally leave overhead power lines as unclassified (class = 1); however, if you find that the algorithm misclassifies most such points as high vegetation (class = 5), this can be countered by lowering the linearity threshold value.
Note that if the input file already contains class data, these data will be overwritten in the output file.
See Also
colourize_based_on_class, filter_lidar, modify_lidar, sort_lidar, split_lidar
Python API
def classify_lidar(self, input_lidar: Optional[Lidar], search_radius: float = 2.5, grd_threshold: float = 0.1, oto_threshold: float = 1.0, linearity_threshold: float = 0.5, planarity_threshold: float = 0.85, num_iter: int = 30, facade_threshold: float = 0.5) -> Optional[Lidar]:
Classify Overlap Points
Function name: classify_overlap_points
This tool can be used to flag points within an input LiDAR file (input) that overlap with other nearby points from different flightlines, i.e. to identify overlap points. The flightline associated with a LiDAR point is assumed to be contained within the point's Point Source ID (PSID) property. If the PSID property is not set, or has been lost, users may wish to apply the recover_flightline_info tool prior to running this tool.
Areas of multiple flightline overlap tend to have point densities that are far greater than areas of single flightlines. This can produce suboptimal results for applications that assume regular point distribution, e.g. in point classification operations.
The tool works by applying a square grid over the extent of the input LiDAR file. The grid cell size is determined by the user-defined resolution parameter. Grid cells containing multiple PSIDs, i.e. with more than one flightline, are then identified. Overlap points within these grid cells can then be flagged on the basis of a user-defined criterion. The flagging options include the following:
| Criterion | Overlap Point Definition |
|---|---|
| max scan angle | All points that share the PSID of the point with the maximum absolute scan angle |
| not min point source ID | All points with a different PSID to that of the point with the lowest PSID |
| not min time | All points with a different PSID to that of the point with the minimum GPS time |
| multiple point source IDs | All points in grid cells with multiple PSIDs, i.e. all overlap points |
Note that the max scan angle criterion may not be appropriate when more than two flightlines overlap, since it will result in only flagging points from one of the multiple flightlines.
It is important to set the resolution parameter appropriately, as setting this value too high will yield the filtering of points in non-overlap areas, and setting the resolution too low will result in fewer than expected points being flagged. An appropriate resolution value may require experimentation; however, a value that is 2-3 times the nominal point spacing has been previously recommended. The nominal point spacing can be determined using the lidar_info tool.
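The grid-based flagging described above can be sketched for two of the criteria. `flag_overlap` is a hypothetical helper; anchoring the grid at coordinate (0, 0) is an assumption made for brevity, and the max scan angle and not min time criteria are omitted.

```python
import math
from collections import defaultdict


def flag_overlap(points, resolution=1.0, criterion="multiple point source IDs"):
    """Flag overlap points in a list of (x, y, psid) tuples.

    Points are binned into square cells of the given resolution. Only cells
    containing more than one PSID are considered overlap cells.
    """
    cells = defaultdict(list)
    for idx, (x, y, _psid) in enumerate(points):
        cell = (math.floor(x / resolution), math.floor(y / resolution))
        cells[cell].append(idx)

    flagged = [False] * len(points)
    for members in cells.values():
        psids = {points[i][2] for i in members}
        if len(psids) < 2:
            continue  # single flightline in this cell, nothing to flag
        if criterion == "multiple point source IDs":
            for i in members:
                flagged[i] = True
        elif criterion == "not min point source ID":
            keep = min(psids)
            for i in members:
                flagged[i] = points[i][2] != keep
        else:
            raise ValueError(f"criterion not covered by this sketch: {criterion}")
    return flagged


# Hypothetical points: the first cell holds two flightlines, the second one.
points = [(0.2, 0.3, 1), (0.7, 0.6, 2), (1.5, 0.5, 1), (1.6, 0.4, 1)]
all_overlap = flag_overlap(points)
not_min = flag_overlap(points, criterion="not min point source ID")
```

With the multiple point source IDs criterion both points in the shared cell are flagged. With not min point source ID only the point from the higher-numbered flightline is flagged.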
By default, all flagged overlap points are reclassified in the output LiDAR file (output) to class 12. Alternatively, if the user specifies the filter parameter, then each overlap point will be excluded from the output file. Classified overlap points may also be filtered from LiDAR point clouds using the filter_lidar tool.
Note that this tool is intended to be applied to LiDAR tile data containing points that have been merged from multiple overlapping flightlines. It is commonly the case that airborne LiDAR data from each of the flightlines from a survey are merged and then tiled into 1 km² tiles, which are the target dataset for this tool.
See Also
flightline_overlap, recover_flightline_info, filter_lidar, lidar_info
Python API
def classify_overlap_points(self, in_lidar: Lidar, resolution: float = 1.0, overlap_criterion: str = "max scan angle", filter: bool = False) -> Lidar:
Clip LiDAR To Polygon
Function name: clip_lidar_to_polygon
This tool can be used to isolate, or clip, all of the LiDAR points in a LAS file (input) contained within one or more vector polygon features. The user must specify the name of the input clip file (polygons), which must be a vector of a Polygon base shape type. The clip file may contain multiple polygon features and polygon hole parts will be respected during clipping, i.e. LiDAR points within polygon holes will be removed from the output LAS file.
Use the erase_polygon_from_lidar tool to perform the complementary operation of removing points from a LAS file that are contained within a set of polygons.
See Also
erase_polygon_from_lidar, filter_lidar, clip, clip_raster_to_polygon
Python API
def clip_lidar_to_polygon(self, input: Lidar, polygons: Vector) -> Lidar:
Erase Polygon From LiDAR
Function name: erase_polygon_from_lidar
This tool can be used to remove, or erase, all of the LiDAR points in a LAS file (input) contained within one or more vector polygon features. The user must specify the name of the input erase file (polygons), which must be a vector of a Polygon base shape type. The erase file may contain multiple polygon features and polygon hole parts will be respected during erasing, i.e. LiDAR points within polygon holes will remain in the output LAS file.
Use the clip_lidar_to_polygon tool to perform the complementary operation of isolating the points in a LAS file that are contained within a set of polygons.
See Also
clip_lidar_to_polygon, filter_lidar, clip, clip_raster_to_polygon
Python API
def erase_polygon_from_lidar(self, input: Lidar, polygons: Vector) -> Lidar:
Filter LiDAR
Function name: filter_lidar
Description
The FilterLidar tool is a very powerful tool for filtering points within a LiDAR point cloud based on point properties. Complex filter statements (statement) can be used to include or exclude points in the output file (output).
Note that if the user does not specify the optional input LiDAR file (input), the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be useful for processing a large number of LiDAR files in batch mode. When this batch mode is applied, the output file names will be the same as the input file names but with a '_filtered' suffix added to the end.
Points are either included or excluded from the output file by creating conditional filter statements. Statements must be valid Rust syntax and evaluate to a Boolean. Any of the following variables are acceptable within the filter statement:
| Variable Name | Description |
|---|---|
| x | The point x coordinate |
| y | The point y coordinate |
| z | The point z coordinate |
| intensity | The point intensity value |
| ret | The point return number |
| nret | The point number of returns |
| is_only | True if the point is an only return (i.e. ret == nret == 1), otherwise false |
| is_multiple | True if the point is a multiple return (i.e. nret > 1), otherwise false |
| is_early | True if the point is an early return (i.e. ret == 1), otherwise false |
| is_intermediate | True if the point is an intermediate return (i.e. ret > 1 && ret < nret), otherwise false |
| is_late | True if the point is a late return (i.e. ret == nret), otherwise false |
| is_first | True if the point is a first return (i.e. ret == 1 && nret > 1), otherwise false |
| is_last | True if the point is a last return (i.e. ret == nret && nret > 1), otherwise false |
| class | The class value in numeric form, e.g. 0 = Never classified, 1 = Unclassified, 2 = Ground, etc. |
| is_noise | True if the point is classified noise (i.e. class == 7 \|\| class == 18), otherwise false |
| is_synthetic | True if the point is synthetic, otherwise false |
| is_keypoint | True if the point is a keypoint, otherwise false |
| is_withheld | True if the point is withheld, otherwise false |
| is_overlap | True if the point is an overlap point, otherwise false |
| scan_angle | The point scan angle |
| scan_direction | True if the scanner is moving from the left towards the right, otherwise false |
| is_flightline_edge | True if the point is situated along the flightline edge, otherwise false |
| user_data | The point user data |
| point_source_id | The point source ID |
| scanner_channel | The point scanner channel |
| time | The point GPS time, if it exists, otherwise 0 |
| red | The point red value, if it exists, otherwise 0 |
| green | The point green value, if it exists, otherwise 0 |
| blue | The point blue value, if it exists, otherwise 0 |
| nir | The point near infrared value, if it exists, otherwise 0 |
| pt_num | The point number within the input file |
| n_pts | The number of points within the file |
| min_x | The file minimum x value |
| mid_x | The file mid-point x value |
| max_x | The file maximum x value |
| min_y | The file minimum y value |
| mid_y | The file mid-point y value |
| max_y | The file maximum y value |
| min_z | The file minimum z value |
| mid_z | The file mid-point z value |
| max_z | The file maximum z value |
| dist_to_pt | The distance from the point to a specified xy or xyz point, e.g. dist_to_pt(562500, 4819500) or dist_to_pt(562500, 4819500, 320) |
| dist_to_line | The distance from the point to the line passing through two xy points, e.g. dist_to_line(562600, 4819500, 562750, 4819750) |
| dist_to_line_seg | The distance from the point to the line segment defined by two xy end-points, e.g. dist_to_line_seg(562600, 4819500, 562750, 4819750) |
| within_rect | 1 if the point falls within the bounds of a 2D or 3D rectangle, otherwise 0. Bounds are defined as within_rect(ULX, ULY, LRX, LRY) or within_rect(ULX, ULY, ULZ, LRX, LRY, LRZ) |
In addition to the point properties defined above, if the user applies the lidar_eigenvalue_features tool on the input LiDAR file, the filter_lidar tool will automatically read in the additional *.eigen file, which includes the eigenvalue-based point neighbourhood measures, such as lambda1, lambda2, lambda3, linearity, planarity, sphericity, omnivariance, eigentropy, slope, and residual. See the lidar_eigenvalue_features documentation for details on each of these metrics describing the structure and distribution of points within the neighbourhood surrounding each point in the LiDAR file.
Statements can be as simple or complex as desired. For example, to filter out all points that are classified noise (i.e. class numbers 7 or 18):
!is_noise
The following is a statement to retain only the late returns from the input file (i.e. both last and single returns):
ret == nret
Notice that equality uses the == symbol and inequality uses the != symbol. As an equivalent to the above statement, we could have used the is_late point property:
is_late
If we want to remove all points outside of a range of xy values:
x >= 562000 && x <= 562500 && y >= 4819000 && y <= 4819500
Notice how we can combine multiple constraints using the && (logical AND) and || (logical OR) operators. As an alternative to the above statement, we could have used the within_rect function:
within_rect(562000, 4819500, 562500, 4819000)
If we want instead to exclude all of the points within this defined region, rather than to retain them, we simply use the ! (logical NOT) operator:
!(x >= 562000 && x <= 562500 && y >= 4819000 && y <= 4819500)
or, simply:
!within_rect(562000, 4819500, 562500, 4819000)
If we need to find all of the ground points within 150 m of (562000, 4819500), we could use:
class == 2 && dist_to_pt(562000, 4819500) <= 150.0
The following statement outputs all non-vegetation classed points in the upper-right quadrant:
!(class == 3 || class == 4 || class == 5) && x > min_x + (max_x - min_x) / 2.0 && y > max_y - (max_y - min_y) / 2.0
As demonstrated above, the filter_lidar tool provides a flexible and powerful means of retaining and removing points from point clouds based on any of the common LiDAR point attributes.
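The spatial predicates used in the statements above can be read as ordinary comparisons. The Python sketch below mirrors the 2D forms of within_rect and dist_to_pt so their meaning can be checked by hand; it is not the tool's implementation, the point is passed explicitly rather than implied, and inclusive rectangle bounds are an assumption based on the equivalent >= and <= statement shown earlier.

```python
import math


def within_rect(x, y, ulx, uly, lrx, lry):
    """2D form: True if (x, y) lies inside the rectangle (bounds assumed inclusive)."""
    return ulx <= x <= lrx and lry <= y <= uly


def dist_to_pt(x, y, px, py):
    """Horizontal distance from the point (x, y) to (px, py)."""
    return math.hypot(x - px, y - py)


def keep_ground_near(point, px, py, radius):
    """Mirror of the statement: class == 2 && dist_to_pt(px, py) <= radius."""
    return point["class"] == 2 and dist_to_pt(point["x"], point["y"], px, py) <= radius


pt = {"x": 562100.0, "y": 4819400.0, "class": 2}
inside = within_rect(pt["x"], pt["y"], 562000, 4819500, 562500, 4819000)
explicit = 562000 <= pt["x"] <= 562500 and 4819000 <= pt["y"] <= 4819500
print(inside, explicit, keep_ground_near(pt, 562000, 4819500, 150.0))
```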
See Also
filter_lidar_classes, filter_lidar_scan_angles, modify_lidar, erase_polygon_from_lidar, clip_lidar_to_polygon, sort_lidar, lidar_eigenvalue_features
Python API
def filter_lidar(self, statement: str, input_lidar: Optional[Lidar]) -> Optional[Lidar]:
Filter LiDAR By Percentile
Function name: filter_lidar_by_percentile
Description
This tool can be used to extract a subset of points from an input LiDAR point cloud (input_lidar) that correspond to a user-specified percentile of the points within the local neighbourhood. The algorithm works by overlaying a grid of a specified size (block_size). The LiDAR points contained within each block of the superimposed grid are identified and sorted by elevation. The point with the elevation that corresponds most closely to the specified percentile is then inserted into the output LiDAR point cloud. For example, if percentile = 0.0, the lowest point within each block will be output, if percentile = 100.0 the highest point will be output, and if percentile = 50.0 the point that is nearest the median elevation will be output. Notice that the lower the number of points contained within a block, the more approximate the calculation will be. For example, if a block only contains three points, no single point occupies the 25th percentile. The equation that is used to identify the closest corresponding point (zero-based) from a list of n points sorted by elevation is:
point_num = ⌊percentile / 100.0 * (n - 1)⌉
Increasing the block size (default is 1.0 xy-units) will increase the average number of points within blocks, allowing for a more accurate percentile calculation.
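The index equation above can be checked with a few lines of Python. This is a sketch only: the helper names are hypothetical, and because the text does not state how exact half-way values are rounded, floor(x + 0.5) is assumed here.

```python
import math


def percentile_index(percentile: float, n: int) -> int:
    """Zero-based index of the sorted point nearest the requested percentile."""
    # floor(x + 0.5) rounds to the nearest integer; the tie-breaking rule
    # used by the tool itself is not stated in the text and is assumed here.
    return math.floor(percentile / 100.0 * (n - 1) + 0.5)


def percentile_point(block_z: list, percentile: float) -> float:
    """Elevation selected from the points that fall in one grid block."""
    ordered = sorted(block_z)
    return ordered[percentile_index(percentile, len(ordered))]


block = [12.4, 10.1, 15.8, 11.0, 13.3]
print(percentile_point(block, 0.0))    # lowest point in the block
print(percentile_point(block, 50.0))   # point nearest the median
print(percentile_point(block, 100.0))  # highest point in the block
```

With only three points in a block, a percentile of 25.0 falls exactly half-way between the first and second sorted points, which is the approximation issue described above.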
Like many of the LiDAR functions, the input LiDAR point cloud (input_lidar) is optional. If an input LiDAR file is not specified, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be very useful when you need to process a large number of LiDAR files contained within a directory. This batch processing mode enables the function to run in a more optimized parallel manner. When run in this batch mode, no output LiDAR object will be created. Instead the function will create an output file (saved to disc) with the same name as each input LiDAR file, but with the .tif extension. This can provide a very efficient means for processing extremely large LiDAR data sets.
See Also
filter_lidar, lidar_block_minimum, lidar_block_maximum
Python API
def filter_lidar_by_percentile(self, input_lidar: Optional[Lidar], percentile: float = 0.0, block_size: float = 1.0) -> Optional[Lidar]:
Filter LiDAR By Reference Surface
Function name: filter_lidar_by_reference_surface
Description
This tool can be used to extract a subset of points from an input LiDAR point cloud (input_lidar) that satisfy a query relation with a user-specified raster reference surface (ref_surface). For example, you may use this function to extract all of the points that are below (query="<" or query="<=") or above (query=">" or query=">=") a surface model. The default query mode is "within" (i.e. query="within"), which extracts all of the points that are within a specified absolute vertical distance (threshold) of the surface. Notice that the threshold parameter is ignored for query types other than "within".
By default, the function will return a point cloud containing only the subset of points in the input dataset that satisfy the condition of the query. Setting the classify parameter to True modifies this behaviour such that the output point cloud will contain all of the points within the input dataset, but the classification value of the query-satisfying points will be set to the true_class_value parameter (0-255) and points that do not satisfy the query will be assigned the false_class_value (0-255). By setting the preserve_classes parameter to True, all points that do not satisfy the query will have unmodified class values from the input dataset.
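The query and classification rules described above can be sketched per point in Python. The helper names are hypothetical, surface_z stands for the reference-surface value beneath the point, and treating a point exactly at the threshold distance as "within" is an assumption.

```python
def satisfies_query(z, surface_z, query="within", threshold=0.0):
    """Test one point elevation against the surface elevation beneath it."""
    diff = z - surface_z
    if query == "within":
        # Assumption: a point exactly at the threshold distance is included.
        return abs(diff) <= threshold
    if query == "<":
        return diff < 0.0
    if query == "<=":
        return diff <= 0.0
    if query == ">":
        return diff > 0.0
    if query == ">=":
        return diff >= 0.0
    raise ValueError(f"unsupported query: {query!r}")


def output_class(satisfied, input_class, true_class_value, false_class_value,
                 preserve_classes=False):
    """Class written for a point when classify is True."""
    if satisfied:
        return true_class_value
    return input_class if preserve_classes else false_class_value


hit = satisfies_query(101.0, 100.0, query="within", threshold=1.5)
print(hit, output_class(hit, input_class=5, true_class_value=2, false_class_value=1))
```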
Unlike many of the LiDAR functions, this function does not have a batch mode and operates on single tiles only.
See Also
filter_lidar
Python API
def filter_lidar_by_reference_surface(self, input_lidar: Lidar, ref_surface: Raster, query: str = "within", threshold: float = 0.0) -> Lidar:
Filter LiDAR Classes
Function name: filter_lidar_classes
This tool can be used to remove points within a LAS LiDAR file that possess certain specified class values. The user must input the names of the input (input) and output (output) LAS files and the class values to be excluded (exclude_cls). Class values are specified by their numerical values, such that:
Classification Value | Meaning
0 | Created, never classified
1 | Unclassified
2 | Ground
3 | Low Vegetation
4 | Medium Vegetation
5 | High Vegetation
6 | Building
7 | Low Point (noise)
8 | Reserved
9 | Water
10 | Rail
11 | Road Surface
12 | Reserved
13 | Wire – Guard (Shield)
14 | Wire – Conductor (Phase)
15 | Transmission Tower
16 | Wire-structure Connector (e.g. Insulator)
17 | Bridge Deck
18 | High noise
Thus, to filter out low and high noise points from a point cloud, specify exclude_cls='7,18'. Class ranges may also be specified, e.g. exclude_cls='3-5,7,18'. Notice that usage of this tool assumes that the LAS file has undergone a comprehensive point classification, which not all point clouds have had. Use the lidar_info tool to determine the distribution of various class values in your file.
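Because the Python API listed for this tool takes a list of integers rather than a string, a small helper can expand the comma and range notation shown above. This is an illustrative sketch; parse_class_list is a hypothetical name and is not part of the Whitebox API.

```python
def parse_class_list(text: str) -> list:
    """Expand a string such as '3-5,7,18' into a sorted list of class values."""
    classes = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            low, high = (int(value) for value in part.split("-", 1))
            classes.update(range(low, high + 1))  # ranges are inclusive
        else:
            classes.add(int(part))
    return sorted(classes)


print(parse_class_list("3-5,7,18"))  # [3, 4, 5, 7, 18]
```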
See Also
lidar_info
Python API
def filter_lidar_classes(self, input: Lidar, exclusion_classes: List[int]) -> Lidar:
Filter LiDAR Noise
Function name: filter_lidar_noise
This function can be used to remove both low (class = 7) and high (class = 18) noise classed points within a LiDAR file. The function is therefore equivalent to running the filter_lidar_classes function, specifying classes 7 and 18. Notice that usage of this tool assumes that the LAS file has undergone a comprehensive point classification, which not all point clouds have had. Use the lidar_info tool to determine the distribution of various class values in your file.
See Also
lidar_info, filter_lidar_classes
Python API
def filter_lidar_noise(self, input: Lidar) -> Lidar:
Filter LiDAR Scan Angles
Function name: filter_lidar_scan_angles
Python API
def filter_lidar_scan_angles(self, in_lidar: Lidar, threshold: int) -> Lidar:
Height Above Ground
Function name: height_above_ground
This tool normalizes an input LiDAR point cloud (input) such that point z-values in the output LAS file (output) are converted from elevations to heights above the ground, specifically the height above the nearest ground-classified point. The input LAS file must have ground-classified points, otherwise the tool will return an error. The lidar_tophat_transform tool can be used to perform the normalization if a ground classification is lacking.
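The normalization described above can be sketched in plain Python. The function name normalize_heights is hypothetical, the nearest ground point is assumed to be measured horizontally (in the xy plane), and the brute-force search is for illustration only; a real implementation would use a spatial index.

```python
import math


def normalize_heights(points):
    """points: list of (x, y, z, class). Returns (x, y, height, class) tuples."""
    ground = [(x, y, z) for x, y, z, cls in points if cls == 2]
    if not ground:
        # Mirrors the documented behaviour: ground-classified points are required.
        raise ValueError("input contains no ground-classified points")
    normalized = []
    for x, y, z, cls in points:
        # Assumption: "nearest" means nearest in horizontal distance.
        gx, gy, gz = min(ground, key=lambda g: math.hypot(g[0] - x, g[1] - y))
        normalized.append((x, y, z - gz, cls))
    return normalized


cloud = [(0.0, 0.0, 100.0, 2), (10.0, 0.0, 102.0, 2),
         (1.0, 0.0, 105.5, 5), (9.0, 0.0, 110.0, 5)]
heights = [h for _, _, h, _ in normalize_heights(cloud)]
print(heights)  # [0.0, 0.0, 5.5, 8.0]
```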
See Also
lidar_tophat_transform
Python API
def height_above_ground(self, input: Lidar) -> Lidar:
Improved Ground Point Filter
Function name: improved_ground_point_filter
This function provides a faster alternative to the lidar_ground_point_filter algorithm, provided in the free version of Whitebox Workflows, for the extraction of ground points from within a LiDAR point cloud. The algorithm works by overlaying a grid of a specified resolution (block_size, in xy-units) on the point cloud and identifying the subset of LiDAR points associated with the lowest position in each block. A raster surface is then created by TINing these points. The surface is further processed by removing any off-terrain objects (OTOs), including buildings smaller than the max_building_size parameter (xy-units). Removing OTOs also requires the user to specify the value of a slope_threshold, in degrees. Finally, the algorithm extracts all of the points in the input LiDAR point cloud (input) that are within a specified absolute vertical distance (elev_threshold) of this surface model.
This method of ground-point filtering is somewhat similar in concept to the cloth-simulation approach of Zhang et al. (2016). The difference is that the cloth is first fitted to the minimum surface with infinite flexibility and then the rigidity of the cloth is subsequently increased, via the identification and removal of OTOs from the minimal surface. The slope_threshold parameter effectively controls the eventual rigidity of the fitted surface.
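The first step of the procedure, selecting the lowest point in each grid block, can be sketched as follows. This covers only that step; the TIN interpolation and OTO removal are not shown, and block_minimum is a hypothetical helper name.

```python
import math


def block_minimum(points, block_size=1.0):
    """Return a dict mapping (column, row) block indices to the lowest (x, y, z) point."""
    lowest = {}
    for x, y, z in points:
        key = (math.floor(x / block_size), math.floor(y / block_size))
        if key not in lowest or z < lowest[key][2]:
            lowest[key] = (x, y, z)
    return lowest


sample = [(0.2, 0.3, 5.0), (0.8, 0.1, 3.0), (1.5, 0.5, 7.0)]
lowest = block_minimum(sample, block_size=1.0)
print(lowest)
```

The first two sample points share block (0, 0), so only the lower of the two (z = 3.0) is retained for that block.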
By default, the tool will return a point cloud containing only the subset of points in the input dataset that coincide with the identified ground points. Setting the classify parameter to True modifies this behaviour such that the output point cloud will contain all of the points within the input dataset, but will have the classification value of identified ground points set to '2' (i.e., the ground class value) and all other points will be set to '1' (i.e., the unclassified class value). By setting the preserve_classes parameter to True, all non-ground points in the output cloud will have the same classes as the corresponding point class values in the input dataset.
Compared with the lidar_ground_point_filter algorithm, the improved_ground_point_filter algorithm is generally far faster and is able to more effectively remove points associated with larger buildings. Removing large buildings from point clouds with the lidar_ground_point_filter algorithm requires use of very large search distances, which slows the operation considerably.
As a comparison of the two available methods, one test tile of LiDAR containing numerous large buildings and abundant vegetation required 600.5 seconds to process on the test system using the lidar_ground_point_filter algorithm (removing all but the largest buildings) and 9.8 seconds to process using the improved_ground_point_filter algorithm (with complete building removal), i.e., 61x faster.
The original test LiDAR tile, containing abundant vegetation and buildings:
The result of applying the lidar_ground_point_filter function, with a search radius of 25 m and max inter-point slope of 15 degrees:
The result of applying the improved_ground_point_filter method, with block_size = 1.0 m, max_building_size = 150.0 m, slope_threshold = 15.0 degrees, and elev_threshold = 0.15 m:
References
Zhang, W., Qi, J., Wan, P., Wang, H., Xie, D., Wang, X., & Yan, G. (2016). An easy-to-use airborne LiDAR data filtering method based on cloth simulation. Remote Sensing, 8(6), 501.
See Also
lidar_ground_point_filter
Python API
def improved_ground_point_filter(self, input: Lidar, block_size = 1.0, max_building_size = 150.0, slope_threshold = 15.0, elev_threshold = 0.15, classify = False, preserve_classes = False) -> Lidar:
Individual Tree Segmentation
Function name: individual_tree_segmentation
Stable
Segment individual tree crowns from LiDAR using adaptive bandwidth and vegetation filtering.
Parameters
Name | Description | Required | Default
input | Input LiDAR path or typed LiDAR object. | Required | -
only_use_veg | If true, process only vegetation classes (default true). | Optional | -
veg_classes | Vegetation classes as comma-delimited text or integer array (default '3,4,5'). | Optional | -
min_height | Minimum point height for segmentation (default 2.0). | Optional | -
max_height | Optional maximum point height. | Optional | -
bandwidth_min | Minimum horizontal bandwidth (default 1.0). | Optional | -
bandwidth_max | Maximum horizontal bandwidth (default 6.0). | Optional | -
adaptive_bandwidth | Estimate per-seed horizontal bandwidth from local crown geometry (default true). | Optional | -
adaptive_neighbors | Neighbour count used for adaptive local density scale (default 24). | Optional | -
adaptive_sector_count | Number of angular sectors for local crown-radius estimation (default 8). | Optional | -
grid_acceleration | Use MeanShift++-style grid approximation for faster mode updates (default false). | Optional | -
grid_cell_size | Grid cell size for accelerated mode updates (default 0.5). | Optional | -
grid_refine_exact | Run short exact-neighbour refinement after grid acceleration (default false). | Optional | -
grid_refine_iterations | Exact refinement iteration cap after grid mode updates (default 2). | Optional | -
tile_size | Optional tile size for seed scheduling; | Optional | -
tile_overlap | Tile overlap width for tiled seed scheduling (default 0.0). | Optional | -
vertical_bandwidth | Vertical kernel bandwidth (default 5.0). | Optional | -
max_iterations | Maximum mean-shift iterations per seed (default 30). | Optional | -
convergence_tol | Convergence tolerance for shift magnitude (default 0.05). | Optional | -
min_cluster_points | Minimum points per retained tree cluster (default 50). | Optional | -
mode_merge_dist | Distance threshold for merging converged modes (default 0.8). | Optional | -
threads | Thread count override (0 uses default Rayon pool). | Optional | -
simd | Enable SIMD-assisted arithmetic in weighting loops (default true). | Optional | -
output_id_mode | Output segment id encoding: rgb/user_data/point_source_id or combinations like rgb+user_data. | Optional | -
output_sidecar_csv | If true, write point_index,segment_id CSV beside lidar output. | Optional | -
seed | Deterministic seed for colour mapping (default 1). | Optional | -
output | Optional output LiDAR path. | Optional | -
LiDAR Classify Subset
Function name: lidar_classify_subset
This tool classifies points within a user-specified LiDAR point cloud (base) that correspond with points in a subset cloud (subset). The subset point cloud may have been derived by filtering the original point cloud. The user must specify the names of the two input LAS files (i.e. the full and subset clouds) and the class value (subset_class) to assign the matching points. This class value will be assigned to points in the base cloud, overwriting their input class values in the output LAS file (output). Class values should be numerical (integer valued) and should follow the LAS specifications below:
Classification Value | Meaning
0 | Created, never classified
1 | Unclassified
2 | Ground
3 | Low Vegetation
4 | Medium Vegetation
5 | High Vegetation
6 | Building
7 | Low Point (noise)
8 | Reserved
9 | Water
10 | Rail
11 | Road Surface
12 | Reserved
13 | Wire – Guard (Shield)
14 | Wire – Conductor (Phase)
15 | Transmission Tower
16 | Wire-structure Connector (e.g. Insulator)
17 | Bridge Deck
18 | High noise
The user may optionally specify a class value to be assigned to non-subset (i.e. non-matching) points (nonsubset_class) in the base file. If this parameter is not specified, output non-subset points will have the same class value as the base file.
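The class-assignment rule can be sketched in Python as follows. How the tool matches base points to subset points is not described in the text; matching on identical x, y, z coordinates is an assumption made purely for illustration, and classify_subset is a hypothetical helper name.

```python
def classify_subset(base, subset, subset_class, nonsubset_class=None):
    """base, subset: lists of (x, y, z, class). Returns the output class of each base point."""
    # Assumption: a base point matches when its coordinates appear in the subset.
    subset_coords = {(x, y, z) for x, y, z, _ in subset}
    classes = []
    for x, y, z, cls in base:
        if (x, y, z) in subset_coords:
            classes.append(subset_class)
        elif nonsubset_class is not None:
            classes.append(nonsubset_class)
        else:
            classes.append(cls)  # keep the base class when no override is given
    return classes


base = [(0.0, 0.0, 10.0, 1), (1.0, 0.0, 12.0, 5), (2.0, 0.0, 11.0, 1)]
subset = [(0.0, 0.0, 10.0, 1), (2.0, 0.0, 11.0, 1)]
print(classify_subset(base, subset, subset_class=2))
print(classify_subset(base, subset, subset_class=2, nonsubset_class=1))
```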
Python API
def lidar_classify_subset(self, base_lidar: Lidar, subset_lidar: Lidar, subset_class_value: int, nonsubset_class_value: int) -> Lidar:
LiDAR Elevation Slice
Function name: lidar_elevation_slice
This tool can be used to either extract or classify the elevation values (z) of LiDAR points within a specified elevation range (slice). In addition to the names of the input and output LiDAR files (input and output), the user must specify the lower (minz) and upper (maxz) bounds of the elevation range. By default, the tool will only output points within the elevation slice, filtering out all points lying outside of this range. If the class parameter is used, the tool will operate by assigning a class value (inclassval) to the classification bit of points within the slice and another class value (outclassval) to those points falling outside the range.
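The two modes described above, extraction and classification, can be sketched in Python using the parameter names from the Python API signature. The function below is an illustration rather than the tool's code, and inclusive slice bounds are an assumption.

```python
def elevation_slice(points, minz=float("-inf"), maxz=float("inf"),
                    classify=False, in_class_value=2, out_class_value=1):
    """points: list of (x, y, z, class). Returns the filtered or re-classified points."""
    result = []
    for x, y, z, cls in points:
        inside = minz <= z <= maxz  # assumption: bounds are inclusive
        if classify:
            result.append((x, y, z, in_class_value if inside else out_class_value))
        elif inside:
            result.append((x, y, z, cls))
    return result


cloud = [(0.0, 0.0, 95.0, 5), (1.0, 0.0, 100.0, 5), (2.0, 0.0, 130.0, 5)]
print(elevation_slice(cloud, minz=98.0, maxz=120.0))
print(elevation_slice(cloud, minz=98.0, maxz=120.0, classify=True))
```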
See Also
lidar_remove_outliers, lidar_classify_subset
Python API
def lidar_elevation_slice(self, input: Lidar, minz: float = float('-inf'), maxz: float = float('inf'), classify: bool = False, in_class_value: int = 2, out_class_value: int = 1) -> Lidar:
LiDAR Remove Outliers
Function name: lidar_remove_outliers
This tool will filter out points from a LiDAR point cloud if the absolute elevation difference between a point and the average elevation of its neighbourhood, calculated without the point, exceeds a threshold (elev_diff).
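The outlier test can be sketched for a single point in Python. The parameter names follow the Python API signature; the brute-force neighbour search, the horizontal search radius, and the handling of points with no neighbours are assumptions made for illustration.

```python
import math


def is_outlier(index, points, search_radius=2.0, elev_diff=50.0):
    """points: list of (x, y, z). True if points[index] differs too much from its neighbours."""
    x, y, z = points[index]
    neighbour_z = [pz for i, (px, py, pz) in enumerate(points)
                   if i != index and math.hypot(px - x, py - y) <= search_radius]
    if not neighbour_z:
        return False  # assumption: a point with no neighbours is kept
    mean_z = sum(neighbour_z) / len(neighbour_z)  # mean excludes the point itself
    return abs(z - mean_z) > elev_diff


cloud = [(0.0, 0.0, 100.0), (1.0, 0.0, 101.0), (0.0, 1.0, 99.0), (0.5, 0.5, 200.0)]
print([is_outlier(i, cloud) for i in range(len(cloud))])
```

The Python API also exposes a use_median option, which suggests the neighbourhood mean can be swapped for a median; that variant is not shown here.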
Python API
def lidar_remove_outliers(self, input: Lidar, search_radius: float = 2.0, elev_diff: float = 50.0, use_median: bool = False, classify: bool = False) -> Lidar:
LiDAR Segmentation
Function name: lidar_segmentation
This tool can be used to segment a LiDAR point cloud based on differences in the orientation of fitted planar surfaces and point proximity. The algorithm begins by attempting to fit planar surfaces to all of the points within a user-specified radius (radius) of each point in the LiDAR data set. The planar equation is stored for each point for which a suitable planar model can be fit. A region-growing algorithm is then used to group nearby points with similar planar models into segments. Similarity is based on a maximum allowable angular difference (in degrees) between the two neighbouring points' plane normal vectors (norm_diff). The norm_diff parameter can therefore be thought of as a way of specifying the magnitude of edges mapped by the region-growing algorithm. By setting this value appropriately, it is possible to segment each facet of a building's roof. Segment edges for planar points may also be determined by a maximum allowable height difference (maxzdiff) between neighbouring points on the same plane. Points for which no suitable planar model can be fit are assigned to 'volume' (non-planar) segments (e.g. vegetation points) using a region-growing method that connects neighbouring points based solely on proximity (i.e. all volume points within radius distance are considered to belong to the same segment).
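The similarity test between neighbouring plane normals can be sketched in Python. The helper names are hypothetical, and treating a normal and its negation as the same orientation (via the absolute dot product) is an assumption.

```python
import math


def normal_angle_deg(n1, n2):
    """Angle in degrees between two plane normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    len1 = math.sqrt(sum(a * a for a in n1))
    len2 = math.sqrt(sum(b * b for b in n2))
    # Assumption: n and -n describe the same plane, so the sign is ignored.
    cos_angle = max(-1.0, min(1.0, abs(dot) / (len1 * len2)))
    return math.degrees(math.acos(cos_angle))


def similar_planes(n1, n2, norm_diff=2.0):
    """True if two neighbouring points may be grown into the same planar segment."""
    return normal_angle_deg(n1, n2) <= norm_diff


flat = (0.0, 0.0, 1.0)
print(similar_planes(flat, (0.02, 0.0, 1.0)))  # tilt of about 1.1 degrees
print(similar_planes(flat, (0.1, 0.0, 1.0)))   # tilt of about 5.7 degrees
```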
The resulting point cloud will have both planar segments (largely ground surfaces and building roofs and walls) and volume segments (largely vegetation). Each segment is assigned a random red-green-blue (RGB) colour in the output LAS file. The largest segment in any airborne LiDAR dataset will usually belong to the ground surface. This largest segment will always be assigned a dark-green RGB of (25, 120, 0) by the tool.
This tool uses the random sample consensus (RANSAC) method to identify points within a LiDAR point cloud that belong to planar surfaces. RANSAC is a common method used in the field of computer vision to identify a subset of inlier points in a noisy data set containing abundant outlier points. Because LiDAR point clouds often contain vegetation points that do not form planar surfaces, this tool can be used to largely strip vegetation points from the point cloud, leaving behind the ground returns, buildings, and other points belonging to planar surfaces. If the classify flag is used, non-planar points will not be removed but rather will be assigned a different class (1) than the planar points (0).
The algorithm selects a random sample of a specified size (num_samples) from the points within the neighbourhood (radius) surrounding each LiDAR point. The sample is then used to parameterize a planar best-fit model. The distance between each neighbouring point and the plane is then evaluated; inliers are those neighbouring points within a user-specified distance threshold (threshold). Models with at least a minimum number of inlier points (model_size) are then accepted. This process of selecting models is iterated a user-specified number of times (num_iter).
One of the challenges with identifying planar surfaces in LiDAR point clouds is that these data are usually collected along scan lines. Therefore, each scan line can potentially yield a vertical planar surface, which is one reason that some vegetation points may be assigned to planes during the RANSAC plane-fitting method. To cope with this problem, the tool allows the user to specify a maximum planar slope (max_slope) parameter. Planes that have slopes greater than this threshold are rejected by the algorithm. This has the side-effect of removing building walls, however.
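The sample, fit, and count-inliers loop described above can be sketched in Python. This is a simplified RANSAC-style illustration, not the tool's implementation: it fits each candidate plane through a minimal three-point sample rather than a best-fit to num_samples points, and the function names are hypothetical.

```python
import random


def plane_from_points(p1, p2, p3):
    """Return (a, b, c, d) with a unit normal for the plane through three points, or None."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    a, b, c = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-12:
        return None  # the three points are collinear
    a, b, c = a / norm, b / norm, c / norm
    return a, b, c, -(a * p1[0] + b * p1[1] + c * p1[2])


def ransac_plane(neighbours, num_iter=50, threshold=0.15, model_size=30, seed=0):
    """Return the accepted plane with the most inliers, and those inliers."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(num_iter):
        plane = plane_from_points(*rng.sample(neighbours, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [p for p in neighbours
                   if abs(a * p[0] + b * p[1] + c * p[2] + d) <= threshold]
        # Accept only models with enough inliers; keep the best one seen.
        if len(inliers) >= model_size and len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# 36 points on the plane z = 0 plus four elevated, vegetation-like points.
pts = [(float(i), float(j), 0.0) for i in range(6) for j in range(6)]
pts += [(1.5, 1.5, 5.0), (2.5, 3.5, 6.0), (4.5, 0.5, 7.0), (0.5, 4.5, 8.0)]
plane, inliers = ransac_plane(pts)
print(len(inliers))
```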
References
Fischler MA and Bolles RC. 1981. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM, 24(6):381–395.
See Also
lidar_ransac_planes, lidar_ground_point_filter
Python API
def lidar_segmentation(self, in_lidar: Lidar, search_radius: float = 2.0, num_iterations: int = 50, num_samples: int = 10, inlier_threshold: float = 0.15, acceptable_model_size: int = 30, max_planar_slope: float = 75.0, norm_diff_threshold: float = 2.0, max_z_diff: float = 1.0, classes: bool = False, ground: bool = False) -> Lidar:
LiDAR Segmentation Based Filter
Function name: lidar_segmentation_based_filter
Python API
def lidar_segmentation_based_filter(self, in_lidar: Lidar, search_radius: float = 5.0, norm_diff_threshold: float = 2.0, max_z_diff: float = 1.0, classify_points: bool = False) -> Lidar:
Modify LiDAR
Function name: modify_lidar
Description
The ModifyLidar tool can be used to alter the properties of points within a LiDAR point cloud. The user provides a statement (statement) containing one or more expressions, separated by semicolons (;). The expressions are evaluated for each point within the input LiDAR file (input). Expressions assign altered values to the properties of points in the output file (output), based on any mathematically defined expression that may include the various properties of individual points (e.g. coordinates, intensity, return attributes, etc) or some file-level properties (e.g. min/max coordinates). As a basic example, the following statement:
x = x + 1000.0
could be used to translate the point cloud 1000 x-units (note, the increment operator could be used as a simpler equivalent, x += 1000.0).
Note that if the user does not specify the optional input LiDAR file, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be useful for processing a large number of LiDAR files in batch mode. When this batch mode is applied, the output file names will be the same as the input file names but with a '_modified' suffix added to the end.
Expressions may contain any of the following point-level or file-level variables:
Variable Name | Description | Type
Point-level properties
x | The point x coordinate | float
y | The point y coordinate | float
z | The point z coordinate | float
xy | An x-y coordinate tuple, (x, y) | (float, float)
xyz | An x-y-z coordinate tuple, (x, y, z) | (float, float, float)
intensity | The point intensity value | int
ret | The point return number | int
nret | The point number of returns | int
is_only | True if the point is an only return (i.e. ret == nret == 1), otherwise false | Boolean
is_multiple | True if the point is a multiple return (i.e. nret > 1), otherwise false | Boolean
is_early | True if the point is an early return (i.e. ret == 1), otherwise false | Boolean
is_intermediate | True if the point is an intermediate return (i.e. ret > 1 && ret < nret), otherwise false | Boolean
is_late | True if the point is a late return (i.e. ret == nret), otherwise false | Boolean
is_first | True if the point is a first return (i.e. ret == 1 && nret > 1), otherwise false | Boolean
is_last | True if the point is a last return (i.e. ret == nret && nret > 1), otherwise false | Boolean
class | The class value in numeric form, e.g. 0 = Never classified, 1 = Unclassified, 2 = Ground, etc. | int
is_noise | True if the point is classified noise (i.e. class == 7 || class == 18), otherwise false | Boolean
is_synthetic | True if the point is synthetic, otherwise false | Boolean
is_keypoint | True if the point is a keypoint, otherwise false | Boolean
is_withheld | True if the point is withheld, otherwise false | Boolean
is_overlap | True if the point is an overlap point, otherwise false | Boolean
scan_angle | The point scan angle | int
scan_direction | True if the scanner is moving from the left towards the right, otherwise false | Boolean
is_flightline_edge | True if the point is situated along the flightline edge, otherwise false | Boolean
user_data | The point user data | int
point_source_id | The point source ID | int
scanner_channel | The point scanner channel | int
time | The point GPS time, if it exists, otherwise 0 | float
rgb | A red-green-blue tuple (r, g, b) if it exists, otherwise (0,0,0) | (int, int, int)
nir | The point near infrared value, if it exists, otherwise 0 | int
pt_num | The point number within the input file | int
File-level properties (invariant)
n_pts | The number of points within the file | int
min_x | The file minimum x value | float
mid_x | The file mid-point x value | float
max_x | The file maximum x value | float
min_y | The file minimum y value | float
mid_y | The file mid-point y value | float
max_y | The file maximum y value | float
min_z | The file minimum z value | float
mid_z | The file mid-point z value | float
max_z | The file maximum z value | float
x_scale_factor | The file x scale factor | float
y_scale_factor | The file y scale factor | float
z_scale_factor | The file z scale factor | float
x_offset | The file x offset | float
y_offset | The file y offset | float
z_offset | The file z offset | float
Most of the point-level properties above are modifiable; however, some are not. The complete list of modifiable point attributes includes x, y, z, xy, xyz, intensity, ret, nret, class, user_data, point_source_id, scanner_channel, scan_angle, time, rgb, nir, is_synthetic, is_keypoint, is_withheld, and is_overlap. The immutable properties include is_only, is_multiple, is_early, is_intermediate, is_late, is_first, is_last, is_noise, and pt_num. Of the file-level properties, the modifiable properties include the x_scale_factor, y_scale_factor, z_scale_factor, x_offset, y_offset, and z_offset.
In addition to the point properties defined above, if the user applies the lidar_eigenvalue_features tool on the input LiDAR file, the modify_lidar tool will automatically read in the additional *.eigen file, which includes the eigenvalue-based point neighbourhood measures, such as lambda1, lambda2, lambda3, linearity, planarity, sphericity, omnivariance, eigentropy, slope, and residual. See the lidar_eigenvalue_features documentation for details on each of these metrics describing the structure and distribution of points within the neighbourhood surrounding each point in the LiDAR file.
Expressions may use any of the standard mathematical operators, +, -, *, /, % (modulo), ^ (exponentiation), comparison operators, <, >, <=, >=, == (equality), != (inequality), and logical operators, && (Boolean AND), || (Boolean OR). Expressions must evaluate to an assignment operation, where the variable that is assigned to must be a modifiable point-level property (see table above). That is, expressions should take the form:
pt_variable = ...
Other assignment operators are also possible (at least for numeric non-tuple properties), such as the increment (+=) operator (e.g. x += 1000.0) and the decrement (-=) operator (e.g. y -= 1000.0). Expressions may use a number of built-in mathematical functions, including:
Function Name | Description | Example
if | Performs an if(CONDITION, TRUE, FALSE) operation, returning either the value of TRUE or FALSE depending on CONDITION | ret = if(ret==0, 1, ret)
abs | Returns the absolute value of the argument | value = abs(x - mid_x)
min | Returns the minimum of the arguments | value = min(x, y, z)
max | Returns the maximum of the arguments | value = max(x, y, z)
floor | Returns the largest integer less than or equal to a number | x = floor(x)
round | Returns the nearest integer to a number. Rounds half-way cases away from 0.0 | x = round(x)
ceil | Returns the smallest integer greater than or equal to a number | x = ceil(x)
clamp | Forces a value to fall within a specified range, defined by a minimum and maximum | z = clamp(min_z+10.0, z, max_z-20.0)
int | Returns the integer equivalent of a number | intensity = int(z)
float | Returns the float equivalent of a number | z = float(intensity)
to_radians | Converts a number in degrees to radians | val = to_radians(scan_angle)
to_degrees | Converts a number in radians to degrees | scan_angle = int(to_degrees(val))
dist | Returns the distance between two points defined by two n-length tuples | d = dist(xy, (mid_x, mid_y)) or d = dist(xyz, (mid_x, mid_y, mid_z))
rotate_pt | Rotates an x-y point by a certain angle, in degrees | xy = rotate_pt(xy, 45.0) or orig_pt = (1000.0, 1000.0); xy = rotate_pt(xy, 45.0, orig_pt)
math::ln | Returns the natural logarithm of the number | z = math::ln(z)
math::log | Returns the logarithm of the number with respect to an arbitrary base | z = math::log(z, 10)
math::log2 | Returns the base 2 logarithm of the number | z = math::log2(z)
math::log10 | Returns the base 10 logarithm of the number | z = math::log10(z)
math::exp | Returns e^(number), (the exponential function) | z = math::exp(z)
math::pow | Raises a number to the power of the other number | z = math::pow(z, 2.0)
math::sqrt | Returns the square root of a number. Returns NaN for a negative number | z = math::sqrt(z)
math::cos | Computes the cosine of a number (in radians) | z = math::cos(to_radians(z))
math::sin | Computes the sine of a number (in radians) | z = math::sin(to_radians(z))
math::tan | Computes the tangent of a number (in radians) | z = math::tan(to_radians(z))
math::acos | Computes the arccosine of a number. The return value is in radians in the range [0, pi] or NaN if the number is outside the range [-1, 1] | z = math::acos(z)
math::asin | Computes the arcsine of a number. The return value is in radians in the range [-pi/2, pi/2] or NaN if the number is outside the range [-1, 1] | z = math::asin(z)
math::atan | Computes the arctangent of a number. The return value is in radians in the range [-pi/2, pi/2] | z = math::atan(z)
rand | Returns a random value between 0 and 1, with an optional seed value | rgb = (int(255.0 * rand()), int(255.0 * rand()), int(255.0 * rand()))
helmert_transformation | Performs a Helmert transformation on a point using a 7-parameter transform | xyz = helmert_transformation(xyz, -446.448, 125.157, -542.06, 20.4894, -0.1502, -0.247, -0.8421)
The hyperbolic trigonometric functions are also available for use in expression building, as is math::atan2 and the mathematical constants pi and e.
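Two of the built-in functions listed above, clamp and dist, take arguments in an order that is easy to misread. The Python sketch below mirrors their semantics as suggested by the table examples; it is an illustration of the argument order, not the tool's implementation.

```python
import math


def clamp(minimum, value, maximum):
    """Argument order follows the table example: clamp(min, value, max)."""
    return max(minimum, min(value, maximum))


def dist(p, q):
    """Euclidean distance between two tuples of equal length (2D or 3D)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Mirror of: z = clamp(min_z+10.0, z, max_z-20.0) for a file with z in [100, 400].
min_z, max_z = 100.0, 400.0
print(clamp(min_z + 10.0, 95.0, max_z - 20.0))   # raised to the lower bound
print(clamp(min_z + 10.0, 250.0, max_z - 20.0))  # unchanged
print(dist((0.0, 0.0), (3.0, 4.0)))
```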
You may use if operations within statements to implement a conditional modification of point properties. For example, the following expression demonstrates how you could modify a point's RGB colour based on its classification, assigning ground points (class 2) in the output file a green colour:
rgb = if(class==2, (0,255,0), rgb)
To colour all points within 50 m of the tile mid-point red and all other points blue:
rgb = if(dist(xy, (mid_x, mid_y))<50.0, (255,0,0), (0,0,255))
if operations may also be nested to create more complex compound conditional point modifications. For example, in the following statement, we assign first-return points red (255,0,0) and last-return points green (0,255,0) colours and white (255,255,255) to all other points (intermediate-returns and only-returns):
rgb = if(is_first, (255,0,0), if(is_last, (0,255,0), (255,255,255)))
Here we use an if expression to re-classify points above an elevation of 1000.0 as high noise (class 18):
class = if(z>1000.0, 18, class)
Expressions may be strung together within statements using semicolons (;), with each expression being evaluated individually. When this is the case, at least one of the expressions must assign a value to one of the variant point properties (see table above). The following statement demonstrates multi-expression statements, in this case to swap the x and y coordinates in a LiDAR file:
new_var = x; x = y; y = new_var
The rand function, used with the seeding option, can be useful when assigning colours to points based on common point properties. For example, to assign a point a random RGB colour based on its point_source_id (note that for many point clouds this operation will assign each flightline a unique colour; if flightline information is not stored in the file's point_source_id attribute, one could use the recover_flightline_info tool to calculate this data):
rgb=(int(255 * rand(point_source_id)), int(255 * rand(point_source_id+1)), int(255 * rand(point_source_id+2)))
This expression-based approach to modifying point properties provides a great deal of flexibility and power to the processing of LiDAR point cloud data sets.
See Also
filter_lidar, sort_lidar, lidar_eigenvalue_features
Python API
def modify_lidar(self, statement: str, input_lidar: Optional[Lidar]) -> Optional[Lidar]:
Normalize LiDAR
Function name: normalize_lidar
This tool can be used to normalize a LiDAR point cloud. A normalized point cloud is one for which the point z-values represent height above the ground surface rather than raw elevation values. Thus, a point that falls on the ground surface will have a z-value of zero and vegetation points, and points associated with other off-terrain objects, have positive, non-zero z-values. Point cloud normalization is an essential pre-processing method for many forms of LiDAR data analysis, including the characterization of many forestry related metrics and individual tree mapping (IndividualTreeDetection).
This tool works by measuring the elevation difference between each point in an input LiDAR file (input) and the elevation of an input raster digital terrain model (dtm). A DTM is a bare-earth digital elevation model. Typically, the input DTM is created from the same input LiDAR data by interpolating the ground surface using only ground-classified points. If the LiDAR point cloud does not contain ground-point classifications, you may wish to apply the LidarGroundPointFilter or ClassifyLidar tools before interpolating the DTM. While ground-point classification works well to identify the ground surface beneath vegetation cover, building points are sometimes left behind, so it may also be necessary to remove other off-terrain objects like buildings. The RemoveOffTerrainObjects tool can be useful for this purpose, creating a final bare-earth DTM. This tool outputs a normalized LiDAR point cloud (output). If the no_negatives parameter is True, any points that fall beneath the surface elevation defined by the DTM will have their z-value set to zero.
Note that the LidarTophatTransform tool similarly can be used to produce a type of normalized point cloud, although it does not require an input raster DTM. Rather, it attempts to model the ground surface within the point cloud by identifying the lowest points within local neighbourhoods surrounding each point in the cloud. While this approach can produce satisfactory results in some cases, the NormalizeLidar tool likely works better under more rugged topography and in areas with extensive building coverage, and provides greater control over the definition of the ground surface.
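The normalization described above amounts to subtracting the DTM elevation beneath each point from that point's z-value. The following is a minimal sketch in plain Python, not the Whitebox implementation: the normalize_points helper, the row-major dtm list, and the origin_x, origin_y, and cell_size arguments are all hypothetical, and the DTM is sampled with a simple nearest-cell lookup.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z)


def normalize_points(points: List[Point], dtm: List[List[float]],
                     origin_x: float, origin_y: float, cell_size: float,
                     no_negatives: bool = False) -> List[Point]:
    """Return points whose z is height above the DTM (hypothetical helper).

    dtm[row][col] holds ground elevation; row 0 is the northern edge and
    (origin_x, origin_y) is the upper-left corner of the grid.
    """
    out = []
    for x, y, z in points:
        col = math.floor((x - origin_x) / cell_size)
        row = math.floor((origin_y - y) / cell_size)
        if not (0 <= row < len(dtm) and 0 <= col < len(dtm[0])):
            continue  # assumption: points outside the DTM are skipped
        height = z - dtm[row][col]
        if no_negatives and height < 0.0:
            height = 0.0  # clamp points that fall beneath the DTM surface
        out.append((x, y, height))
    return out


dtm = [[100.0, 101.0],
       [102.0, 103.0]]
pts = [(0.5, 1.5, 100.0), (1.5, 1.5, 115.0), (0.5, 0.5, 101.5)]
normalized = normalize_points(pts, dtm, origin_x=0.0, origin_y=2.0,
                              cell_size=1.0, no_negatives=True)
```

In this toy example the first point lies on the ground (height 0.0), the second sits 14.0 above it, and the third falls 0.5 below the DTM and is clamped to zero because no_negatives is set.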
See Also
lidar_tophat_transform, individual_tree_detection, lidar_ground_point_filter, classify_lidar
Python API
def normalize_lidar(self, input_lidar: Lidar, dtm: Raster) -> Lidar:
Remove Duplicates
Function name: remove_duplicates
This tool removes duplicate points from a LiDAR data set. Duplicates are determined by their x, y, and optionally (include_z) z coordinates.
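As an illustration of the duplicate test, the sketch below keeps the first point seen at each coordinate and drops later ones. It is an assumption-laden example rather than the tool's code: drop_duplicates is a hypothetical helper, duplicates are detected by exact coordinate equality, and keeping the first occurrence is a choice made for this sketch.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z)


def drop_duplicates(points: List[Point], include_z: bool = False) -> List[Point]:
    """Keep the first point found at each (x, y) or (x, y, z) location."""
    seen = set()
    kept = []
    for p in points:
        # Compare on x and y only unless z is included in the test.
        key = p if include_z else p[:2]
        if key not in seen:
            seen.add(key)
            kept.append(p)
    return kept


pts = [(1.0, 2.0, 10.0), (1.0, 2.0, 12.0), (1.0, 2.0, 10.0), (3.0, 4.0, 5.0)]
xy_only = drop_duplicates(pts)
xyz = drop_duplicates(pts, include_z=True)
```

With include_z left at False the two points that share x and y but differ in z are treated as duplicates; with include_z set to True only the exact repeat is removed.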
See Also
eliminate_coincident_points
Python API
def remove_duplicates(self, input: Lidar, include_z: bool = False) -> Lidar:
Interpolation and Gridding
Flightline Overlap
Function name: flightline_overlap
This tool can be used to map areas of overlapping flightlines in an input LiDAR (LAS) file (input). The output raster file (output) will contain the number of different flightlines that are contained within each grid cell. The user must specify the desired cell size (resolution). The flightline associated with a LiDAR point is assumed to be contained within the point's Point Source ID property. Thus, the tool essentially counts the number of different Point Source ID values among the points contained within each grid cell. If the Point Source ID property is not set, or has been lost, users may wish to apply the recover_flightline_info tool prior to running flightline_overlap.
It is important to set the resolution parameter appropriately, as setting this value too high will yield the mis-characterization of non-overlap areas, and setting the resolution too low will result in fewer than expected overlap areas. Finding an appropriate resolution value may require experimentation; however, a value that is 2-3 times the nominal point spacing has been previously recommended. The nominal point spacing can be determined using the lidar_info tool.
Note that this tool is intended to be applied to LiDAR tile data containing points that have been merged from multiple overlapping flightlines. It is commonly the case that airborne LiDAR data from each of the flightlines from a survey are merged and then tiled into 1 km² tiles, which are the target dataset for this tool.
Like many of the LiDAR related tools, the input and output file parameters are optional. If left unspecified, the tool will locate all valid LiDAR files within the current Whitebox working directory and use these for calculation (specifying the output raster file name based on the associated input LiDAR file). This can be a helpful way to run the tool on a batch of user inputs within a specific directory.
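The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: flightline_counts is a hypothetical helper, points are given as (x, y, point_source_id) tuples, and the result is a sparse dictionary keyed by (row, col) rather than a raster.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def flightline_counts(points: List[Tuple[float, float, int]],
                      resolution: float) -> Dict[Tuple[int, int], int]:
    """Count distinct Point Source ID values per grid cell."""
    cells: Dict[Tuple[int, int], Set[int]] = defaultdict(set)
    for x, y, point_source_id in points:
        cell = (math.floor(y / resolution), math.floor(x / resolution))
        cells[cell].add(point_source_id)
    return {cell: len(ids) for cell, ids in cells.items()}


pts = [(0.2, 0.3, 1), (0.8, 0.1, 2), (0.5, 0.5, 1), (1.5, 0.5, 2)]
counts = flightline_counts(pts, resolution=1.0)
```

Here the cell at (0, 0) holds points from two flightlines and therefore counts as an overlap area, while the cell at (0, 1) holds points from only one.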
See Also
classify_overlap_points, recover_flightline_info, lidar_info
Python API
def flightline_overlap(self, input_lidar: Lidar, resolution: float = 1.0) -> Raster:
LiDAR Block Maximum
Function name: lidar_block_maximum
This function superimposes a raster grid of a user-specified resolution (cell_size) overtop of an input LiDAR point cloud (input_lidar) and identifies the highest point in each block. The output raster therefore approximates a digital surface model (DSM), representing the elevation of the ground surface in open areas and the elevations of off-terrain objects (OTOs), such as buildings and vegetation. While this function will be faster, it is recommended that you use lidar_digital_surface_model instead if you are trying to create a DSM, as that method will generally produce better results.
Like many of the LiDAR functions, the input LiDAR point cloud (input_lidar) is optional. If an input LiDAR file is not specified, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be very useful when you need to process a large number of LiDAR files contained within a directory. This batch processing mode enables the function to run in a more optimized parallel manner. When run in this batch mode, no output LiDAR object will be created. Instead the function will create an output file (saved to disc) with the same name as each input LiDAR file, but with the .tif extension. This can provide a very efficient means for processing extremely large LiDAR data sets.
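The block operation itself is simple to picture. The sketch below is a hypothetical, plain-Python illustration (block_extreme is not a Whitebox function) that bins points into square cells and keeps the highest z-value per cell; passing use_max=False keeps the lowest instead. Cells without points are simply absent from the returned dictionary.

```python
import math
from typing import Dict, List, Tuple


def block_extreme(points: List[Tuple[float, float, float]], cell_size: float,
                  use_max: bool = True) -> Dict[Tuple[int, int], float]:
    """Return the highest (or lowest) z-value found in each grid cell."""
    pick = max if use_max else min
    grid: Dict[Tuple[int, int], float] = {}
    for x, y, z in points:
        cell = (math.floor(y / cell_size), math.floor(x / cell_size))
        grid[cell] = pick(grid[cell], z) if cell in grid else z
    return grid


pts = [(0.1, 0.1, 5.0), (0.9, 0.4, 22.5), (0.5, 0.5, 7.0), (1.2, 0.3, 6.0)]
highest = block_extreme(pts, cell_size=1.0)
lowest = block_extreme(pts, cell_size=1.0, use_max=False)
```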
See Also
lidar_block_minimum, lidar_digital_surface_model, filter_lidar_by_percentile
Python API
def lidar_block_maximum(self, input_lidar: Optional[Lidar], cell_size: float = 1.0) -> Raster:
LiDAR Block Minimum
Function name: lidar_block_minimum
This function superimposes a raster grid of a user-specified resolution (cell_size) overtop of an input LiDAR point cloud (input_lidar) and identifies the lowest point in each block. The output raster therefore approximates a bare-earth digital elevation model (DEM), or a digital terrain model (DTM), although it is likely to contain several off-terrain objects (OTOs), such as buildings. Under heavier forest cover, the minimum-surface will also very likely contain some blocks that are not coincident with the ground surface, but rather will represent the elevation of the lower position of tree trunks and low vegetation.
Like many of the LiDAR functions, the input LiDAR point cloud (input_lidar) is optional. If an input LiDAR file is not specified, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be very useful when you need to process a large number of LiDAR files contained within a directory. This batch processing mode enables the function to run in a more optimized parallel manner. When run in this batch mode, no output LiDAR object will be created. Instead the function will create an output file (saved to disc) with the same name as each input LiDAR file, but with the .tif extension. This can provide a very efficient means for processing extremely large LiDAR data sets.
See Also
lidar_block_maximum, filter_lidar_by_percentile
Python API
def lidar_block_minimum(self, input_lidar: Optional[Lidar], cell_size: float = 1.0) -> Raster:
LiDAR Construct Vector TIN
Function name: lidar_construct_vector_tin
This tool creates a vector triangular irregular network (TIN) for a set of LiDAR points (input) using a 2D Delaunay triangulation algorithm. LiDAR points may be excluded from the triangulation operation based on a number of criteria, including the point return number (returns), point classification value (exclude_cls), or a minimum (minz) or maximum (maxz) elevation.
For vector points, use the construct_vector_tin tool instead.
See Also
construct_vector_tin
Python API
def lidar_construct_vector_tin(self, input_lidar: Optional[Lidar], returns_included: str = "all", excluded_classes: List[int] = None, min_elev: float = float('-inf'), max_elev: float = float('inf'), max_triangle_edge_length: float = float('inf')) -> Vector:
LiDAR Contour
Function name: lidar_contour
Description
This tool can be used to create a contour (i.e. isolines of elevation values) vector coverage from an input LiDAR points data set (input). The tool works by first creating a triangulation of the input LiDAR points. The user must specify the contour interval (interval), or vertical spacing between contour lines. The smooth parameter can be used to increase or decrease the degree to which contours are smoothed. This parameter should be either 0, indicating no smoothing, or an odd integer value (1, 3, 5...). The tool can interpolate contours based on the LiDAR point elevation values, intensity data, or the user data field (parameter), with 'elevation' as the default parameter. LiDAR points may be excluded from the contouring process based on a number of criteria, including their return value (returns, which may be 'all', 'last', 'first'), their class value (exclude_cls), and whether they fall outside of a user-specified elevation range (minz and maxz). The optional max_triangle_edge_length parameter can be used to suppress contours within sparsely populated areas of the data set, where the triangles formed by the Delaunay triangulation are too large. This is often the case within bodies of water; long and narrow triangular facets can also occur within the concave portions of the hull, or enclosing polygon, of the points when the data have an irregularly shaped extent. Setting this parameter can help alleviate the problem of contouring beyond the data footprint.
Like many of the LiDAR tools, both the input and output parameters are optional. If these parameters are not specified by the user, the tool will search for all LAS files contained within the current WhiteboxTools working directory. This feature can be useful when you need to contour a large number of LiDAR tiles. This batch processing mode allows the tool to process data in parallel, which can significantly improve efficiency for datasets with many LiDAR tiles. When run in this batch mode, the output file (output) also need not be specified; the tool will instead create an output file with the same name as each input LiDAR file, but with the .shp extension.
It is important to note that contouring is better suited to well-defined surfaces (e.g. the ground surface or building heights), rather than volume features, such as vegetation, which tend to produce extremely complex contour sets. It is advisable to use this tool with last-returns and/or ground-classified point returns. If the input data set does not contain ground classification, consider pre-processing with the lidar_ground_point_filter tool.
See Also
contours_from_points, contours_from_raster, lidar_ground_point_filter
Python API
def lidar_contour(self, input_lidar: Optional[Lidar], contour_interval: float = 10.0, base_contour: float = 0.0, smooth: int = 5, interpolation_parameter: str = "elevation", returns_included: str = "all", excluded_classes: Optional[List[int]] = None, min_elev: float = float('-inf'), max_elev: float = float('inf'), tile_overlap: float = 0.0, max_triangle_edge_length: float = float('inf')) -> Optional[Vector]:
LiDAR Digital Surface Model
Function name: lidar_digital_surface_model
This tool creates a digital surface model (DSM) from a LiDAR point cloud. A DSM reflects the elevation of the tops of all off-terrain objects (i.e. non-ground features) contained within the data set. For example, a DSM will model the canopy top as well as building roofs. This is in stark contrast to a bare-earth digital elevation model (DEM), which models the ground surface without off-terrain objects present. Bare-earth DEMs can be derived from LiDAR data by interpolating last-return points using one of the other LiDAR interpolators (e.g. lidar_tin_gridding). The algorithm used for interpolation in this tool is based on gridding a triangulation (TIN) fit to top-level points in the input LiDAR point cloud. All points in the input LiDAR data set that are below other neighbouring points, within a specified search radius (radius), and that have a large inter-point slope, are filtered out. Thus, this tool will remove the ground surface beneath, as well as any intermediate points within, a forest canopy, leaving only the canopy top surface to be interpolated. Similarly, building wall points and any ground points beneath roof overhangs will also be removed prior to interpolation. Note that because the ground points beneath overhead wires and utility lines are filtered out by this operation, these features tend to appear as 'walls' in the output DSM. If these points are classified in the input LiDAR file, you may wish to filter them out before using this tool (filter_lidar_classes).
The following images show the differences between creating a DSM using the lidar_digital_surface_model tool and by interpolating first-return points only using the lidar_tin_gridding tool, respectively. Note that the images show time_in_daylight, which is a more effective way of hillshading DSMs than the traditional hillshade method. Compare how the DSM created by the lidar_digital_surface_model tool (above) has far less variability in areas of tree-cover, more effectively capturing the canopy top. As well, notice how building rooftops are more extensive and straighter in the lidar_digital_surface_model DSM image. This is because this method eliminates ground returns beneath roof overhangs before the triangulation operation.
The user must specify the grid resolution of the output raster (resolution), and optionally, the name of the input LiDAR file (input) and output raster (output). Note that if an input LiDAR file (input) is not specified by the user, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be very useful when you need to interpolate a DSM for a large number of LiDAR files. Not only does this batch processing mode enable the tool to run in a more optimized parallel manner, but it will also allow the tool to include a small buffer of points extending into adjacent tiles when interpolating an individual file. This can significantly reduce edge-effects when the output tiles are later mosaicked together. When run in this batch mode, the output file (output) also need not be specified; the tool will instead create an output file with the same name as each input LiDAR file, but with the .tif extension. This can provide a very efficient means for processing extremely large LiDAR data sets.
Users may also exclude points from the interpolation if they fall below or above the minimum (minz) or maximum (maxz) thresholds respectively. This can be a useful means of excluding anomalously high or low points. Note that points that are classified as low points (LAS class 7) or high noise (LAS class 18) are automatically excluded from the interpolation operation.
Triangulation will generally completely fill the convex hull containing the input point data. This can sometimes result in very long and narrow triangles at the edges of the data or connecting vertices on either side of void areas. In LiDAR data, these void areas are often associated with larger waterbodies, and triangulation can result in very unnatural interpolated patterns within these areas. To avoid this problem, the user may specify the maximum allowable triangle edge length (max_triangle_edge_length), and all grid cells within triangular facets with edges larger than this threshold are simply assigned the NoData value in the output DSM. These NoData areas can later be better dealt with using the fill_missing_data tool after interpolation.
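The edge-length test can be illustrated with a short sketch. The helpers below (longest_edge and keep_triangle) are hypothetical names introduced for this example and only show the geometric check on a single 2D triangle; how the tool applies the test internally is not documented here.

```python
import math
from typing import Tuple

Vertex = Tuple[float, float]
Triangle = Tuple[Vertex, Vertex, Vertex]


def longest_edge(tri: Triangle) -> float:
    """Length of the longest of the three triangle edges."""
    return max(math.dist(tri[i], tri[(i + 1) % 3]) for i in range(3))


def keep_triangle(tri: Triangle, max_edge: float = float('inf')) -> bool:
    """True if the facet is interpolated, False if its cells get NoData."""
    return longest_edge(tri) <= max_edge


tri = ((0.0, 0.0), (3.0, 0.0), (3.0, 4.0))
kept_by_default = keep_triangle(tri)
kept_with_limit = keep_triangle(tri, max_edge=4.5)
```

With the default of infinity every facet is kept, which matches the default value in the function signature; with a 4.5 unit limit the example triangle, whose longest edge is 5.0, would be masked out.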
See Also
lidar_tin_gridding, filter_lidar_classes, fill_missing_data, time_in_daylight
Python API
def lidar_digital_surface_model(self, input_lidar: Optional[Lidar], cell_size: float = 1.0, search_radius: float = 0.5, min_elev: float = float('-inf'), max_elev: float = float('inf'), max_triangle_edge_length: float = float('inf')) -> Raster:
LiDAR Hex Bin
Function name: lidar_hex_bin
The practice of binning point data to form a type of 2D histogram, density plot, or what is sometimes called a heatmap, is quite useful as an alternative for the cartographic display of very dense point sets. This is particularly the case when the points experience significant overlap at the displayed scale. The lidar_point_density tool can be used to perform binning based on a regular grid (raster output). This tool, by comparison, bases the binning on a hexagonal grid.
The tool is similar to the CreateHexagonalVectorGrid tool; however, it instead creates an output hexagonal grid in which each hexagonal cell possesses a COUNT attribute specifying the number of points from an input points file (LAS file) that are contained within the hexagonal cell. The tool also calculates the minimum and maximum elevations and intensity values and outputs these data to the attribute table.
In addition to the names of the input points file and the output Shapefile, the user must also specify the desired hexagon width (w), which is the distance between opposing sides of each hexagon. The length (s) of each side of the hexagon can then be calculated as s = w / [2 x cos(PI / 6)]. The area of each hexagon (A) is A = 3s(w / 2). The user must also specify the orientation of the grid, with options of horizontal (pointy side up) and vertical (flat side up).
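The two hexagon formulas above are easy to check numerically. The following sketch uses hypothetical helper names (hexagon_side and hexagon_area) and simply evaluates s and A for a given width w.

```python
import math


def hexagon_side(width: float) -> float:
    """Side length s = w / [2 x cos(PI / 6)]."""
    return width / (2.0 * math.cos(math.pi / 6.0))


def hexagon_area(width: float) -> float:
    """Area A = 3s(w / 2)."""
    return 3.0 * hexagon_side(width) * (width / 2.0)


w = 2.0
s = hexagon_side(w)
a = hexagon_area(w)
```

As a cross-check, the result agrees with the standard regular-hexagon area formula, A = (3 x sqrt(3) / 2) x s^2, which is a useful sanity test when choosing a width that yields a target cell area.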
See Also
vector_hex_binning, lidar_point_density, CreateHexagonalVectorGrid
Python API
def lidar_hex_bin(self, input_lidar: Lidar, width: float, orientation: str = "h") -> Vector:
LiDAR Hillshade
Function name: lidar_hillshade
Python API
def lidar_hillshade(self, input: Lidar, search_radius: float = -1.0, azimuth: float = 315.0, altitude: float = 30.0) -> Lidar:
LiDAR IDW Interpolation
Function name: lidar_idw_interpolation
This tool interpolates LiDAR files using an inverse-distance weighting (IDW) scheme. The user must specify the value of the IDW weight parameter (weight). The output grid can be based on any of the stored LiDAR point parameters (parameter), including elevation (in which case the output grid is a digital elevation model, DEM), intensity, class, return number, number of returns, scan angle, RGB (colour) values, and user data values. Similarly, the user may specify which point return values (returns) to include in the interpolation, including all points, last returns (including single return points), and first returns (including single return points).
The user must specify the grid resolution of the output raster (resolution), and optionally, the name of the input LiDAR file (input) and output raster (output). Note that if an input LiDAR file (input) is not specified by the user, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be very useful when you need to interpolate a DEM for a large number of LiDAR files. Not only does this batch processing mode enable the tool to run in a more optimized parallel manner, but it will also allow the tool to include a small buffer of points extending into adjacent tiles when interpolating an individual file. This can significantly reduce edge-effects when the output tiles are later mosaicked together. When run in this batch mode, the output file (output) also need not be specified; the tool will instead create an output file with the same name as each input LiDAR file, but with the .tif extension. This can provide a very efficient means for processing extremely large LiDAR data sets.
Users may exclude points from the interpolation based on point classification values, which follow the LAS classification scheme. Excluded classes are specified using the exclude_cls parameter. For example, to exclude all vegetation and building classified points from the interpolation, use --exclude_cls='3,4,5,6'. Users may also exclude points from the interpolation if they fall below or above the minimum (minz) or maximum (maxz) thresholds respectively. This can be a useful means of excluding anomalously high or low points. Note that points that are classified as low points (LAS class 7) or high noise (LAS class 18) are automatically excluded from the interpolation operation.
The tool will search for the nearest input LiDAR point to each grid cell centre, up to a maximum search distance (radius). If a grid cell does not have a LiDAR point within this search distance, it will be assigned the NoData value in the output raster. In LiDAR data, these void areas are often associated with larger waterbodies. These NoData areas can later be better dealt with using the fill_missing_data tool after interpolation.
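For readers unfamiliar with IDW, the sketch below estimates a single grid-cell value from nearby points, weighting each point by 1 / d^weight. It is a generic textbook formulation, not the tool's source: idw_value is a hypothetical helper, the NoData value of -32768.0 is an assumption, and a brute-force scan stands in for whatever spatial index the tool uses.

```python
import math
from typing import List, Tuple


def idw_value(cx: float, cy: float, points: List[Tuple[float, float, float]],
              weight: float = 1.0, radius: float = 2.5,
              nodata: float = -32768.0) -> float:
    """Inverse-distance weighted estimate at the cell centre (cx, cy)."""
    numerator = 0.0
    denominator = 0.0
    for x, y, value in points:
        d = math.hypot(x - cx, y - cy)
        if d > radius:
            continue  # outside the search distance
        if d == 0.0:
            return value  # a point sits exactly on the cell centre
        w = 1.0 / d ** weight
        numerator += w * value
        denominator += w
    # No points within the search distance: the cell is assigned NoData.
    return numerator / denominator if denominator > 0.0 else nodata


pts = [(1.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
estimate = idw_value(0.0, 0.0, pts, weight=2.0)
empty = idw_value(10.0, 10.0, pts, weight=2.0)
```

Raising the weight makes the estimate lean more heavily on the closest points: with weight=2.0 the nearer point (value 10.0) dominates and the estimate is 12.0 rather than the simple mean of 15.0.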
See Also
lidar_tin_gridding, lidar_nearest_neighbour_gridding, lidar_sibson_interpolation
Python API
def lidar_idw_interpolation(self, input_lidar: Optional[Lidar], interpolation_parameter: str = "elevation", returns_included: str = "all", cell_size: float = 1.0, idw_weight: float = 1.0, search_radius: float = 2.5, excluded_classes: List[int] = None, min_elev: float = float('-inf'), max_elev: float = float('inf')) -> Raster:
LiDAR Nearest Neighbour Gridding
Function name: lidar_nearest_neighbour_gridding
This tool grids LiDAR files using a nearest-neighbour (NN) scheme, that is, each grid cell in the output image will be assigned the parameter value of the point nearest the grid cell centre. This method should not be confused with the similarly named natural-neighbour interpolation (a.k.a. Sibson's method). Nearest neighbour gridding is generally regarded as a poor way of interpolating surfaces from low-density point sets and results in the creation of a Voronoi diagram. However, this method has several advantages when applied to LiDAR data. NN gridding is one of the fastest methods for generating raster surfaces from large LiDAR data sets. NN gridding is also one of the few interpolation methods, along with triangulation, that will preserve vertical breaks-in-slope, such as occur at the edges of buildings. This characteristic can be important when using some post-processing methods, such as the remove_off_terrain_objects tool. Furthermore, because most LiDAR data sets have remarkably high point densities compared with other types of geographic data, this approach does often produce a satisfactory result; this is particularly true when the point density is high enough that there are multiple points in the majority of grid cells.
The output grid can be based on any of the stored LiDAR point parameters (parameter), including elevation (in which case the output grid is a digital elevation model, DEM), intensity, class, return number, number of returns, scan angle, RGB (colour) values, time, and user data values. Similarly, the user may specify which point return values (returns) to include in the interpolation, including all points, last returns (including single return points), and first returns (including single return points).
The user must specify the grid resolution of the output raster (resolution), and optionally, the name of the input LiDAR file (input) and output raster (output). Note that if an input LiDAR file (input) is not specified by the user, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be very useful when you need to interpolate a DEM for a large number of LiDAR files. Not only does this batch processing mode enable the tool to run in a more optimized parallel manner, but it will also allow the tool to include a small buffer of points extending into adjacent tiles when interpolating an individual file. This can significantly reduce edge-effects when the output tiles are later mosaicked together. When run in this batch mode, the output file (output) also need not be specified; the tool will instead create an output file with the same name as each input LiDAR file, but with the .tif extension. This can provide a very efficient means for processing extremely large LiDAR data sets.
Users may exclude points from the interpolation based on point classification values, which follow the LAS classification scheme. Excluded classes are specified using the exclude_cls parameter. For example, to exclude all vegetation and building classified points from the interpolation, use --exclude_cls='3,4,5,6'. Users may also exclude points from the interpolation if they fall below or above the minimum (minz) or maximum (maxz) thresholds respectively. This can be a useful means of excluding anomalously high or low points. Note that points that are classified as low points (LAS class 7) or high noise (LAS class 18) are automatically excluded from the interpolation operation.
The tool will search for the nearest input LiDAR point to each grid cell centre, up to a maximum search distance (radius). If a grid cell does not have a LiDAR point within this search distance, it will be assigned the NoData value in the output raster. In LiDAR data, these void areas are often associated with larger waterbodies. These NoData areas can later be better dealt with using the fill_missing_data tool after interpolation.
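The nearest-point search with a maximum distance can be sketched as follows. This is an illustrative brute-force version with hypothetical helper names (nearest_value and nn_grid) and an assumed NoData value of -32768.0; for simplicity row 0 of the toy grid starts at y = 0, which is not necessarily how the tool orients its rasters.

```python
import math
from typing import List, Tuple

PointValue = Tuple[float, float, float]  # (x, y, value)


def nearest_value(cx: float, cy: float, points: List[PointValue],
                  radius: float = 2.5, nodata: float = -32768.0) -> float:
    """Value of the point nearest (cx, cy), or nodata if none is in range."""
    best_value = nodata
    best_dist = radius
    for x, y, value in points:
        d = math.hypot(x - cx, y - cy)
        if d <= best_dist:
            best_dist = d
            best_value = value
    return best_value


def nn_grid(points: List[PointValue], rows: int, cols: int, cell_size: float,
            radius: float = 2.5, nodata: float = -32768.0) -> List[List[float]]:
    """Assign each cell the value of the point nearest its centre."""
    return [[nearest_value((c + 0.5) * cell_size, (r + 0.5) * cell_size,
                           points, radius, nodata)
             for c in range(cols)]
            for r in range(rows)]


pts = [(0.4, 0.4, 5.0), (1.6, 0.6, 9.0)]
grid = nn_grid(pts, rows=1, cols=3, cell_size=1.0, radius=0.5)
```

The third cell has no point within 0.5 units of its centre, so it receives the NoData value, which is the kind of void that fill_missing_data is meant to address afterwards.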
See Also
lidar_tin_gridding, lidar_idw_interpolation, remove_off_terrain_objects, fill_missing_data
Python API
def lidar_nearest_neighbour_gridding(self, input_lidar: Optional[Lidar], interpolation_parameter: str = "elevation", returns_included: str = "all", cell_size: float = 1.0, search_radius: float = 2.5, excluded_classes: List[int] = None, min_elev: float = float('-inf'), max_elev: float = float('inf')) -> Raster:
LiDAR Radial Basis Function Interpolation
Function name: lidar_radial_basis_function_interpolation
Python API
def lidar_radial_basis_function_interpolation(self, input_lidar: Optional[Lidar], interpolation_parameter: str = "elevation", returns_included: str = "all", cell_size: float = 1.0, num_points: int = 15, excluded_classes: List[int] = None, min_elev: float = float('-inf'), max_elev: float = float('inf'), func_type: str = "thinplatespline", poly_order: str = "none", weight: float = 0.1) -> Raster:
LiDAR Sibson Interpolation
Function name: lidar_sibson_interpolation
Description
This tool interpolates LiDAR files using Sibson's interpolation method, sometimes referred to as natural-neighbour interpolation (not to be confused with nearest-neighbour interpolation, lidar_nearest_neighbour_gridding). Sibson's method is based on assigning weight to points for which inserting a grid point would result in captured areas of the Voronoi tessellation of the input point set. The larger the captured area, the higher the weight assigned to the associated point. One of the main advantages of this natural neighbour approach to interpolation over similar techniques, such as inverse-distance weighting (IDW; lidar_idw_interpolation), is that there is no need to specify a search distance or other interpolation weighting parameters. Sibson's approach frequently provides a very suitable interpolation for LiDAR data. The method requires the calculation of a Delaunay triangulation, from which the Voronoi tessellation is calculated.
The output grid can be based on any of the stored LiDAR point parameters (parameter), including elevation (in which case the output grid is a digital elevation model, DEM), intensity, class, return number, number of returns, scan angle values, and user data values. Similarly, the user may specify which point return values (returns) to include in the interpolation, including all points, last returns (including single return points), and first returns (including single return points).
The user must specify the grid resolution of the output raster (resolution), and optionally, the name of the input LiDAR file (input) and output raster (output). Note that if an input LiDAR file (input) is not specified by the user, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be useful when you need to interpolate a DEM for a large number of LiDAR files. This batch processing mode enables the tool to include a small buffer of points extending into adjacent tiles when interpolating an individual file. This can significantly reduce edge-effects when the output tiles are later mosaicked together. When run in this batch mode, the output file (output) also need not be specified; the tool will instead create an output file with the same name as each input LiDAR file, but with the .tif extension. This can provide a very efficient means for processing extremely large LiDAR data sets.
Users may exclude points from the interpolation based on point classification values, which follow the LAS classification scheme. Excluded classes are specified using the exclude_cls parameter. For example, to exclude all vegetation and building classified points from the interpolation, use --exclude_cls='3,4,5,6'. Users may also exclude points from the interpolation if they fall below or above the minimum (minz) or maximum (maxz) thresholds respectively. This can be a useful means of excluding anomalously high or low points. Note that points that are classified as low points (LAS class 7) or high noise (LAS class 18) are automatically excluded from the interpolation operation.
See Also
lidar_tin_gridding, lidar_nearest_neighbour_gridding, lidar_idw_interpolation
Python API
def lidar_sibson_interpolation(self, input_lidar: Optional[Lidar], interpolation_parameter: str = "elevation", resolution: float = 1.0, returns_included: str = "all", excluded_classes: Optional[List[int]] = None, min_elev: float = float('-inf'), max_elev: float = float('inf')) -> Optional[Raster]:
LiDAR Thin
Function name: lidar_thin
Thins a LiDAR point cloud, reducing point density.
Python API
def lidar_thin(self, input: Lidar, resolution: float = 1.0, selection_method: str = "first", save_filtered: bool = False) -> Tuple[Lidar, Union[Lidar, None]]:
LiDAR Thin High Density
Function name: lidar_thin_high_density
Thins points from high density areas within a LiDAR point cloud.
Python API
def lidar_thin_high_density(self, input: Lidar, density: float, resolution: float = 1.0, save_filtered: bool = False) -> Tuple[Lidar, Union[Lidar, None]]:
LiDAR Tile Footprint
Function name: lidar_tile_footprint
This tool can be used to create a vector polygon of the bounding box or convex hull of a LiDAR point cloud (i.e. LAS file). If the user specifies an input file (input) and output file (output), the tool will calculate the footprint, containing all of the data points, and output this feature to a vector polygon file. If the input and output parameters are left unspecified, the tool will calculate the footprint of every LAS file contained within the working directory and output these features to a single vector polygon file. If this is the desired mode of operation, it is important to specify the working directory (wd) containing the group of LAS files; do not specify the optional input and output parameters in this case. Each polygon in the output vector will contain a LAS_NM field, specifying the source LAS file name, a NUM_PNTS field, containing the number of points within the source file, and Z_MIN and Z_MAX fields, containing the minimum and maximum elevations. This output can therefore be useful for creating an index map of a large tiled LiDAR dataset.
By default, this tool identifies the axis-aligned minimum rectangular hull, or bounding box, containing the points in each of the input tiles. If the user specifies the hull flag, the tool will identify the minimum convex hull instead of the bounding box. This option is considerably more computationally intensive and will be a far longer running operation if many tiles are specified as inputs.
A note on LAZ file inputs: While WhiteboxTools does not currently support the reading and writing of the compressed LiDAR format LAZ, it is able to read LAZ file headers. This tool, when run in the bounding box mode (rather than the convex hull mode), is able to take LAZ input files.
See Also
lidar_tile, LayerFootprint, minimum_bounding_box, minimum_convex_hull
Python API
def lidar_tile_footprint(self, input_lidar: Optional[Lidar], output_hulls: bool = False) -> Vector:
LiDAR TIN Gridding
Function name: lidar_tin_gridding
This tool creates a raster grid based on a Delaunay triangular irregular network (TIN) fitted to LiDAR points. The output grid can be based on any of the stored LiDAR point parameters (parameter), including elevation (in which case the output grid is a digital elevation model, DEM), intensity, class, return number, number of returns, scan angle, RGB (colour) values, and user data values. Similarly, the user may specify which point return values (returns) to include in the interpolation, including all points, last returns (including single return points), and first returns (including single return points).
The user must specify the grid resolution of the output raster (resolution), and optionally, the name of the input LiDAR file (input) and output raster (output). Note that if an input LiDAR file (input) is not specified by the user, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be very useful when you need to interpolate a DEM for a large number of LiDAR files. Not only does this batch processing mode enable the tool to run in a more optimized parallel manner, but it will also allow the tool to include a small buffer of points extending into adjacent tiles when interpolating an individual file. This can significantly reduce edge-effects when the output tiles are later mosaicked together. When run in this batch mode, the output file (output) also need not be specified; the tool will instead create an output file with the same name as each input LiDAR file, but with the .tif extension. This can provide a very efficient means for processing extremely large LiDAR data sets.
Users may exclude points from the interpolation based on point classification values, which follow the LAS classification scheme. Excluded classes are specified using the exclude_cls parameter. For example, to exclude all vegetation and building classified points from the interpolation, use --exclude_cls='3,4,5,6'. Users may also exclude points from the interpolation if they fall below the minimum (minz) or above the maximum (maxz) threshold. This can be a useful means of excluding anomalously high or low points. Note that points that are classified as low points (LAS class 7) or high noise (LAS class 18) are automatically excluded from the interpolation operation.
Triangulation will generally completely fill the convex hull containing the input point data. This can sometimes result in very long and narrow triangles at the edges of the data or connecting vertices on either side of void areas. In LiDAR data, these void areas are often associated with larger waterbodies, and triangulation can result in very unnatural interpolated patterns within these areas. To avoid this problem, the user may specify the maximum allowable triangle edge length (max_triangle_edge_length); all grid cells within triangular facets with edges longer than this threshold are simply assigned the NoData value in the output DSM. These NoData areas can later be better dealt with using the fill_missing_data tool after interpolation.
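The edge-length test described above can be illustrated with a small sketch. The helper names are hypothetical, and measuring edge length in the horizontal (x, y) plane is an assumption; the tool's internal test may differ.

```python
import math


def longest_edge(tri):
    """Longest planimetric edge of a triangle given as three (x, y) tuples."""
    a, b, c = tri
    return max(math.dist(a, b), math.dist(b, c), math.dist(c, a))


def keep_triangle(tri, max_triangle_edge_length=float("inf")):
    """True if cells inside this facet are interpolated; False means NoData."""
    return longest_edge(tri) <= max_triangle_edge_length


compact = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]     # typical facet within dense data
sliver = [(0.0, 0.0), (50.0, 0.5), (100.0, 0.0)]   # long, narrow facet spanning a void

print(keep_triangle(compact, 5.0))  # True
print(keep_triangle(sliver, 5.0))   # False
```

With the default threshold of infinity every facet is kept, which matches the default value of max_triangle_edge_length in the function signature.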
See Also
lidar_idw_interpolation, lidar_nearest_neighbour_gridding, lidar_tin_gridding, filter_lidar_classes, fill_missing_data
Python API
def lidar_tin_gridding(self, input_lidar: Optional[Lidar], interpolation_parameter: str = "elevation", returns_included: str = "all", cell_size: float = 1.0, excluded_classes: List[int] = None, min_elev: float = float('-inf'), max_elev: float = float('inf'), max_triangle_edge_length: float = float('inf')) -> Raster:
Analysis and Metrics
Colourize Based On Class
Function name: colourize_based_on_class
Description
This tool sets the RGB colour values of an input LiDAR point cloud (input) based on the point classifications. Rendering a point cloud in this way can aid with the determination of point classification accuracy, by allowing you to determine if there are certain areas within a LiDAR tile, or certain classes, that are problematic during the point classification process.
By default, the tool renders buildings in red (see table below). However, the tool also provides the option to render each building in a unique colour (use_unique_clrs_for_buildings), providing a visually stunning LiDAR-based map of built-up areas. When this option is selected, the user must also specify the radius parameter, which determines the search distance used during the building segmentation operation. The radius parameter is optional, and if unspecified (when the use_unique_clrs_for_buildings flag is used), a value of 2.0 will be used.
The specific colours used to render each point class can optionally be set by the user with the clr_str parameter. The value of this parameter may list specific class values (0-18) and corresponding colour values in either a red-green-blue (RGB) colour triplet form (i.e. (r, g, b)), or a hex-colour, of either form #e6d6aa or 0xe6d6aa (note the # and 0x prefixes used to indicate hexadecimal numbers; also either lowercase or capital letter values are acceptable). The following is an example of a valid clr_str that sets the ground (class 2) and high vegetation (class 5) colours used for rendering:
2: (184, 167, 108); 5: #9ab86c
Notice that 1) each class is separated by a semicolon (';'), 2) class values and colour values are separated by colons (':'), and 3) both RGB and hex-colour forms are valid.
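To make the clr_str format concrete, the following sketch parses such a string into a dictionary of class values and RGB triplets. It is an illustration of the format as described above, not the parser used by the tool; the function names parse_colour and parse_clr_str are hypothetical.

```python
import re


def parse_colour(text):
    """Parse '(r, g, b)', '#rrggbb' or '0xrrggbb' into an (r, g, b) tuple."""
    text = text.strip()
    if text.startswith("("):
        r, g, b = (int(v) for v in text.strip("()").split(","))
        return (r, g, b)
    # Drop the '#' or '0x' prefix, then read the remaining six hex digits.
    hex_digits = re.sub(r"^(#|0x)", "", text, flags=re.IGNORECASE)
    value = int(hex_digits, 16)
    return ((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF)


def parse_clr_str(clr_str):
    """Split 'class: colour; class: colour' into {class_value: (r, g, b)}."""
    colours = {}
    for entry in clr_str.split(";"):
        if not entry.strip():
            continue
        class_text, colour_text = entry.split(":", 1)
        colours[int(class_text)] = parse_colour(colour_text)
    return colours


print(parse_clr_str("2: (184, 167, 108); 5: #9ab86c"))
# {2: (184, 167, 108), 5: (154, 184, 108)}
```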
If a clr_str parameter is not provided, the tool will use the default colours used for each class (see table below).
Class values are assumed to follow the class designations listed in the LAS specification:

| Classification Value | Meaning | Default Colour |
|---|---|---|
| 0 | Created, never classified | |
| 1 | Unclassified | |
| 2 | Ground | |
| 3 | Low Vegetation | |
| 4 | Medium Vegetation | |
| 5 | High Vegetation | |
| 6 | Building | |
| 7 | Low Point (noise) | |
| 8 | Reserved | |
| 9 | Water | |
| 10 | Rail | |
| 11 | Road Surface | |
| 12 | Reserved | |
| 13 | Wire – Guard (Shield) | |
| 14 | Wire – Conductor (Phase) | |
| 15 | Transmission Tower | |
| 16 | Wire-structure Connector (e.g. Insulator) | |
| 17 | Bridge Deck | |
| 18 | High noise | |
The point RGB colour values can be blended with the intensity data to create a particularly effective visualization, further enhancing the visual interpretation of point return properties. The intensity_blending parameter value, which must range from 0% (no intensity blending) to 100% (all intensity), is used to set the degree of intensity/RGB blending.
Because the output file contains RGB colour data, it is possible that it will be larger than the input file. If the input file does contain valid RGB data, the output will be similarly sized, but the input colour data will be replaced in the output file with the class-based colours.
The output file can be visualized using any point cloud renderer capable of displaying point RGB information. We recommend the plas.io LiDAR renderer but many similar open-source options exist.
See Also
colourize_based_on_point_returns, lidar_colourize
Python API
def colourize_based_on_class(self, input_lidar: Optional[Lidar], intensity_blending_amount: float = 50.0, clr_str: str = "", use_unique_clrs_for_buildings: bool = False, search_radius: float = 2.0) -> Optional[Lidar]:
Colourize Based On Point Returns
Function name: colourize_based_on_point_returns
Description
This tool sets the RGB colour values of a LiDAR point cloud (input) based on the point returns. It specifically renders only-return, first-return, intermediate-return, and last-return points in different colours, storing these data in the RGB colour data of the output LiDAR file (output). Colourizing the points in a LiDAR point cloud based on return properties can aid with the visual inspection of point distributions, and therefore, the quality assurance/quality control (QA/QC) of LiDAR data tiles. For example, this visualization process can help to determine if there are areas of vegetation where there is insufficient coverage of ground points, perhaps due to acquisition of the data during leaf-on conditions. There is often an assumption in LiDAR data processing that the ground surface can be modelled using a subset of the only-return and last-return points (beige and blue in the image below). However, under heavy forest cover, and in particular if the data were collected during leaf-on conditions or if there is significant coverage of conifer trees, the only-return and last-return points may be poor approximations of the ground surface. This tool can help to determine the extent to which this is the case for a particular data set.
The specific colours used to render each return type can be set by the user with the only, first, intermediate, and last parameters. Each parameter takes either a red-green-blue (RGB) colour triplet, of the form (r,g,b), or a hex-colour, of either form #e6d6aa or 0xe6d6aa (note the # and 0x prefixes used to indicate hexadecimal numbers; also either lowercase or capital letter values are acceptable).
The point RGB colour values can be blended with the intensity data to create a particularly effective visualization, further enhancing the visual interpretation of point return properties. The intensity_blending parameter value, which must range from 0% (no intensity blending) to 100% (all intensity), is used to set the degree of intensity/RGB blending.
Because the output file contains RGB colour data, it is possible that it will be larger than the input file. If the input file does contain valid RGB data, the output will be similarly sized, but the input colour data will be replaced in the output file with the point-return colours.
The output file can be visualized using any point cloud renderer capable of displaying point RGB information. We recommend the plas.io LiDAR renderer but many similar open-source options exist.
This tool is a convenience function and can alternatively be achieved using the modify_lidar tool with the statement:
rgb=if(is_only, (230,214,170), if(is_last, (0,0,255), if(is_first, (0,255,0), (255,0,255))))
The colourize_based_on_point_returns tool is however significantly faster for this operation than the modify_lidar tool because the expression above must be executed dynamically for each point.
See Also
modify_lidar, lidar_colourize
Python API
def colourize_based_on_point_returns(self, input_lidar: Optional[Lidar], intensity_blending_amount: float = 50.0, only_ret_colour: str = "(230,214,170)", first_ret_colour: str = "(0,140,0)", intermediate_ret_colour: str = "(255,0,255)", last_ret_colour: str = "(0,0,255)") -> Optional[Lidar]:
Find Flightline Edge Points
Function name: find_flightline_edge_points
Python API
def find_flightline_edge_points(self, in_lidar: Lidar) -> Lidar:
Individual Tree Detection
Function name: individual_tree_detection
This tool can be used to identify points in a LiDAR point cloud that are associated with the tops of individual trees. The tool takes a LiDAR point cloud as an input (input_lidar) and it is best if the input file has been normalized using the lidar_tophat_transform function, such that points record height above the ground surface. Note that the input_lidar parameter is optional and if left unspecified the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This 'batch mode' operation is common among many of the LiDAR processing tools. Output vectors are saved to disc automatically for each processed LiDAR file when operating in batch mode and the function returns None. When an individual input_lidar Lidar object is specified, the tool will return a Vector object, containing the tree top points.
The tool will evaluate the points within a local neighbourhood around each point in the input point cloud and determine if it is the highest point within the neighbourhood. If a point is the highest local point, it will be entered into the output vector file. The neighbourhood size can vary, with higher canopy positions generally associated with larger neighbourhoods. The user specifies the min_search_radius and min_height parameters, which default to 1 m and 0 m respectively. If the min_height parameter is greater than zero, all points that are less than this value above the ground (assuming the input point cloud measures this height parameter) are ignored, which can be a useful mechanism for removing shorter trees and other vegetation from the analysis. If the user specifies the max_search_radius and max_height parameters, the search radius will be determined by linear interpolation based on point height and the min/max search radius and height parameter values. Points that are above the max_height parameter will be processed with search neighbourhoods sized max_search_radius. If the max radius and height parameters are unspecified, they are set to the same values as the minimum radius and height parameters, i.e., the neighbourhood size does not increase with canopy height.
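The height-dependent search radius described above can be sketched as follows. This is a minimal illustration of the documented rule, not the tool's source; the function name search_radius is hypothetical, and falling back to the minimum values when either maximum is missing is an assumption.

```python
def search_radius(height, min_search_radius=1.0, min_height=0.0,
                  max_search_radius=None, max_height=None):
    """Neighbourhood radius for a point at the given height above ground.

    Returns None for points below min_height, which are ignored.
    """
    # Assumption: if either maximum is missing, the radius stays constant.
    if max_search_radius is None or max_height is None:
        max_search_radius, max_height = min_search_radius, min_height
    if height < min_height:
        return None
    if height >= max_height:
        return max_search_radius
    # Linear interpolation between the (height, radius) end points.
    t = (height - min_height) / (max_height - min_height)
    return min_search_radius + t * (max_search_radius - min_search_radius)


print(search_radius(10.0, 1.0, 2.0, 3.0, 22.0))  # 40% of the way from 1.0 to 3.0
print(search_radius(30.0, 1.0, 2.0, 3.0, 22.0))  # above max_height: max radius
print(search_radius(1.0, 1.0, 2.0, 3.0, 22.0))   # below min_height: ignored
print(search_radius(15.0))                       # no maxima: constant radius
```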
If the point cloud contains point classifications, it may be useful to exclude all non-vegetation points. To do this simply set the only_use_veg parameter to True. This parameter should only be set to True when you know that the input file contains point classifications, otherwise the tool may generate an empty output vector file.
See Also
lidar_tophat_transform
Python API
def individual_tree_detection(self, input_lidar: Lidar, min_search_radius: float = 1.0, min_height: float = 0.0, max_search_radius: Optional[float] = None, max_height: Optional[float] = None, only_use_veg = False) -> Optional[Vector]:
LiDAR Eigenvalue Features
Function name: lidar_eigenvalue_features
Description
This tool can be used to measure eigenvalue-based features that describe the characteristics of the local neighbourhood surrounding each point in an input LiDAR file (input). These features can then be used in point classification applications, or as the basis for point filtering (filter_lidar) or modifying point properties (modify_lidar).
The algorithm begins by using the x, y, z coordinates of the points within a local spherical neighbourhood to calculate a covariance matrix. The three eigenvalues λ1, λ2, λ3 are then derived from the covariance matrix decomposition such that λ1 > λ2 > λ3. The eigenvalues are then used to describe the extent to which the neighbouring points can be characterized by a linear, planar, or volumetric distribution, by calculating the following three features:
linearity = (λ1 - λ2) / λ1
planarity = (λ2 - λ3) / λ1
sphericity = λ3 / λ1
In the case of a neighbourhood containing a 1-dimensional line, the first of the three components will possess most of the data variance, with very little contained within λ2 and λ3, and linearity will be nearly equal to 1.0. If the local neighbourhood contains a 2-dimensional plane, the first two components will possess most of the variance, with little variance within λ3, and planarity will be nearly equal to 1.0. Lastly, in the case of a 3-dimensional, random volumetric point distribution, each of the three components will be nearly equal in magnitude and sphericity will be nearly equal to 1.0.
Researchers in the field of LiDAR point classification also frequently define two additional eigenvalue-based features, the omnivariance (low values correspond to planar and linear regions and higher values occur for areas with a volumetric point distribution, such as vegetation), and the eigentropy, which is related to the Shannon entropy and is a measure of the unpredictability of the distribution of points in the neighbourhood:
omnivariance = (λ1 ⋅ λ2 ⋅ λ3)^(1/3)
eigentropy = -e1 ⋅ ln(e1) - e2 ⋅ ln(e2) - e3 ⋅ ln(e3)
where e1, e2, and e3 are the normalized eigenvalues.
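The five features defined above can be computed for a single neighbourhood with NumPy, as in the sketch below. It is an illustration of the formulas rather than the tool's implementation. The function name eigen_features is hypothetical, and normalizing the eigenvalues by their sum is an assumption about what "normalized" means here.

```python
import numpy as np


def eigen_features(points):
    """Eigenvalue features for one neighbourhood, given a (k, 3) array of x, y, z.

    Assumes the points are not all identical (λ1 > 0).
    """
    cov = np.cov(np.asarray(points, dtype=float), rowvar=False)
    # eigvalsh returns ascending eigenvalues of a symmetric matrix;
    # reverse so that λ1 >= λ2 >= λ3, and clip tiny negative round-off.
    l1, l2, l3 = np.clip(np.linalg.eigvalsh(cov)[::-1], 0.0, None)
    e = np.array([l1, l2, l3]) / (l1 + l2 + l3)  # assumed normalization
    nz = e[e > 0]                                # treat 0 * ln(0) as 0
    return {
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "sphericity": l3 / l1,
        "omnivariance": (l1 * l2 * l3) ** (1.0 / 3.0),
        "eigentropy": float(-np.sum(nz * np.log(nz))),
    }


# Points along a straight line: linearity should be close to 1.
line = np.array([[t, 0.0, 0.0] for t in range(10)])
# Points on a flat square grid: planarity should be close to 1.
plane = np.array([[x, y, 0.0] for x in range(5) for y in range(5)])

print(eigen_features(line))
print(eigen_features(plane))
```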
In addition to the eigenvalues, the eigendecomposition of the symmetric covariance matrix also yields the three eigenvectors, which describe the transformation coefficients of the principal components. The first two eigenvectors represent the basis of the plane resulting from the orthogonal regression analysis, while the third eigenvector represents the plane normal. From this normal, it is possible to calculate the slope of the plane, as well as the orthogonal distance between each point and the neighbourhood fitted plane, i.e. the point residual.
This tool outputs a binary file (*.eigen; output) that contains a total of 10 features for each point in the input file, including the point_num (for reference), lambda1, lambda2, lambda3, linearity, planarity, sphericity, omnivariance, eigentropy, slope, and residual. Users should bear in mind that most of these features describe the properties of the distribution of points within a spherical neighbourhood surrounding each point in the input file, rather than a characteristic of the point itself. The only one of the ten features that is a point property is the residual. Points for which the planarity value is high and the residual value is low may be assumed to be part of the plane that dominates the structure of their neighbourhoods. In addition to the binary data *.eigen file, the tool will also output a sidecar file, with a *.eigen.json extension, which describes the structure of the raw binary data file.
Local neighbourhoods are spherical in shape and the size of each neighbourhood is characterized by the num_neighbours and radius parameters. If the optional num_neighbours parameter is specified, the size of the neighbourhood will vary by point, increasing or decreasing to encompass the specified number of neighbours (notice that this value does not include the point itself). If the optional radius parameter is specified in addition to a number of neighbours, the specified radius value will serve as an upper bound, and neighbouring points that are beyond this radial distance from the centre point will be excluded. If a radius search distance is specified but the num_neighbours parameter is not, then a constant search distance will be used for each point in the input file, resulting in a varying number of points within local neighbourhoods, depending on local point densities. If point density varies significantly in the input file, then use of the num_neighbours parameter may be advisable. Notice that at least one of the two parameters must be specified. In cases where the number of neighbouring points is fewer than eight, each of the output feature values will be set to 0.0.
Note that if the user does not specify the optional input LiDAR file, the tool will search for all valid LiDAR (*.las, *.laz, *.zlidar) files contained within the current working directory. This feature can be useful for processing a large number of LiDAR files in batch mode.
The binary data file (*.eigen) can be used directly by the filter_lidar and modify_lidar tools, and will be automatically read by the tools when the *.eigen and *.eigen.json files are present in the same folder as the accompanying source LiDAR file. This allows users to apply data filters, or to modify point properties, using these point neighbourhood features. For example, the statement, rgb=(int(linearity*255), int(planarity*255), int(sphericity*255)), used with the modify_lidar tool, can render the point RGB colour values based on some of the eigenvalue features, allowing users to visually identify linear features (red), planar features (green), and volumetric regions (blue).
Additionally, these feature data can be readily incorporated into a Python-based point analysis or classification. As an example, the following script reads in a *.eigen binary data file for direct manipulation and analysis:
```python
import numpy as np

# Record layout of the *.eigen file: one little-endian record per point
dt = np.dtype([('point_num', '<u8'), ('lambda1', '<f4'), ('lambda2', '<f4'), ('lambda3', '<f4'),
               ('linearity', '<f4'), ('planarity', '<f4'), ('sphericity', '<f4'), ('omnivariance', '<f4'),
               ('eigentropy', '<f4'), ('slope', '<f4'), ('resid', '<f4')])
with open('/Users/johnlindsay/Documents/data/aaa2.eigen', 'rb') as f:
    b = f.read()
pt_features = np.frombuffer(b, dt)

# Print the first 100 point features to demonstrate
for i in range(min(100, len(pt_features))):
    print(f"{pt_features['point_num'][i]} {pt_features['linearity'][i]} {pt_features['planarity'][i]} {pt_features['sphericity'][i]}")
print("Done!")
```
References
Chehata, N., Guo, L., & Mallet, C. (2009). Airborne lidar feature selection for urban classification using random forests. In Laser Scanning IAPRS, Vol. XXXVIII, Part 3/W8 – Paris, France, September 1-2, 2009.
Gross, H., Jutzi, B., & Thoennessen, U. (2007). Segmentation of tree regions using data of a full-waveform laser. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36(part 3), W49A.
Niemeyer, J., Mallet, C., Rottensteiner, F., & Sörgel, U. (2012). Conditional Random Fields for the Classification of LIDAR Point Clouds. In XXII ISPRS Congress at Melbourne, ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences (Vol. 3).
West, K. F., Webb, B. N., Lersch, J. R., Pothier, S., Triscari, J. M., & Iverson, A. E. (2004). Context-driven automated target detection in 3D data. In Automatic Target Recognition XIV (Vol. 5426, pp. 133-143). SPIE.
See Also
filter_lidar, modify_lidar, sort_lidar, split_lidar
Python API
def lidar_eigenvalue_features(self, input_lidar: Optional[Lidar], num_neighbours: Optional[int], search_radius: Optional[float]) -> None:
LiDAR Histogram
Function name: lidar_histogram
This tool can be used to plot a histogram of data derived from a LiDAR file. The user must specify the name of the input LAS file (input), the name of the output HTML file (output), the parameter (parameter) to be plotted, and the amount (in percent) to clip the upper and lower tails of the frequency distribution (clip). The LiDAR parameters that can be plotted using lidar_histogram include the point elevations, intensity values, scan angles, and class values.
Use the lidar_point_stats tool instead to examine the spatial distribution of LiDAR points.
See Also
lidar_point_stats
Python API
def lidar_histogram(self, input_lidar: Lidar, output_html_file: str, parameter: str = "elevation", clip_percent: float = 1.0) -> None:
LiDAR Info
Function name: lidar_info
This tool can be used to print basic information about the data contained within a LAS file, used to store LiDAR data. The reported information will include data on the header, point return frequency, and classification data, as well as information about the variable length records (VLRs) and geokeys. If the output_html_file is specified, the function will write the output information as an HTML file that will be automatically displayed. If this parameter is unspecified, the function will return a string containing the information instead.
Python API
def lidar_info(self, input_lidar: Lidar, output_html_file: str = None, show_point_density: bool = True, show_vlrs: bool = True, show_geokeys: bool = True) -> str:
LiDAR Kappa
Function name: lidar_kappa
This tool performs a kappa index of agreement (KIA) analysis on the classification values of two LiDAR (LAS) files. The output report HTML file should be displayed automatically but can also be displayed afterwards in any web browser. As a measure of overall classification accuracy, the KIA is more robust than the percent agreement calculation because it takes into account the agreement occurring by random chance. In addition to the KIA, the tool will output the producer's and user's accuracy, the overall accuracy, and the error matrix. The KIA is often used as a means of assessing the accuracy of an image classification analysis; however, the lidar_kappa tool performs the analysis on a point-to-point basis, comparing the class values of the points in one input LAS file with the corresponding nearest points in the second input LAS file.
The user must also specify the name and resolution of an output raster file, which is used to show the spatial distribution of class accuracy. Each grid cell contains the overall accuracy, i.e. the points correctly classified divided by the total number of points contained within the cell, expressed as a percentage.
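To show how the KIA corrects the percent agreement for chance, the sketch below computes both values from an error matrix using the standard Cohen's kappa formula. The function name and the example matrix are hypothetical, and the sketch does not reproduce the tool's nearest-point matching or its report.

```python
def kappa_from_error_matrix(matrix):
    """Return (overall accuracy, kappa) for a square error matrix.

    matrix[i][j] is the number of points with class i in the first file
    and class j in the second file.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class comparison of 100 points.
accuracy, kappa = kappa_from_error_matrix([[40, 10], [5, 45]])
print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
# overall accuracy = 0.85, kappa = 0.70
```

In this example half of the agreement would be expected by chance alone, which is why the kappa value is noticeably lower than the overall accuracy.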
Python API
def lidar_kappa(self, input_lidar1: Lidar, input_lidar2: Lidar, output_html_file: str, cell_size: float = 1.0, output_class_accuracy: bool = False) -> Raster:
LiDAR Point Density
Function name: lidar_point_density
Python API
def lidar_point_density(self, input_lidar: Optional[Lidar], returns_included: str = "all", cell_size: float = 1.0, search_radius: float = 2.5, excluded_classes: List[int] = None, min_elev: float = float('-inf'), max_elev: float = float('inf')) -> Raster:
LiDAR Point Return Analysis
Function name: lidar_point_return_analysis
Description
This tool performs a quality control check on the return values of points in a LiDAR file. In particular, the tool will search for missing point returns, duplicate point returns, and points for which the return number (r) is larger than the encoded number of returns (n), all of which may be indicative of processing or encoding errors in the input file.
The user must specify the name of the input LiDAR file (input), and may optionally specify an output LiDAR file (output). If no output file is specified, only the text report is generated by the tool. If an output is specified, the tool will create an output LiDAR file in which missing returns are assigned class 13, duplicate returns are assigned class 14, points that are both part of a missing series and duplicate returns are assigned class 15, and all other non-problematic points are assigned class 1. Note that points designated as missing in the output file are not themselves missing; rather, they are part of a sequence of points that contains missing returns. Missing points are apparent when the first point in a series does not have r = 1, when the last point does not have r = n, or when the series is non-sequential (e.g. 1/3, 3/3, but no 2/3). This condition may occur because returns are split between tiles. However, when sequences with missing points are not located near the edges of tiles, it is usually an indication that either point filtering has taken place during pre-processing or that there has been some kind of processing or encoding error.
Duplicate points are defined as points that share the same time, scanner channel, r, and n. Note that these points may have different x, y, z coordinates. Duplicate points are always an indication of a processing or encoding error. For example, it may indicate that the scanner channel information from a multi-channel LiDAR sensor has not been encoded when creating the file or has been lost.
No point should have r > n. This always indicates some kind of processing or encoding error when it occurs.
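The three checks described above can be sketched on a simplified point record. This is an illustration only, not the tool's algorithm: the function name is hypothetical, each point is reduced to a (time, scanner channel, r, n) tuple, and grouping points into a series by shared time and scanner channel is an assumption.

```python
from collections import Counter, defaultdict


def analyse_returns(points):
    """Return index sets (missing_series, duplicates, r_greater_than_n).

    points is a list of (gps_time, scanner_channel, r, n) tuples.
    """
    counts = Counter(points)
    series = defaultdict(list)  # assumed grouping: one series per (time, channel)
    for idx, (t, ch, r, n) in enumerate(points):
        series[(t, ch)].append(idx)

    missing, duplicates, r_gt_n = set(), set(), set()
    for idx, (t, ch, r, n) in enumerate(points):
        if counts[(t, ch, r, n)] > 1:   # same time, channel, r and n
            duplicates.add(idx)
        if r > n:
            r_gt_n.add(idx)
    for members in series.values():
        n = max(points[i][3] for i in members)
        seen = {points[i][2] for i in members}
        if set(range(1, n + 1)) - seen:  # some expected return is absent
            missing.update(members)
    return missing, duplicates, r_gt_n


pts = [
    (1.0, 0, 1, 2), (1.0, 0, 2, 2),  # complete series
    (2.0, 0, 1, 3), (2.0, 0, 3, 3),  # 2/3 is missing
    (3.0, 0, 1, 1), (3.0, 0, 1, 1),  # duplicate returns
    (4.0, 0, 1, 1), (4.0, 0, 2, 1),  # second point has r > n
]
print(analyse_returns(pts))
```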
The following is a sample output report generated by this tool:
```
***************************************
* Welcome to LidarPointReturnAnalysis *

The Global Encoding for this file indicates that the point returns are not synthetic.

Missing Returns: 2441636 (16.336 percent) points are missing

r  n  Missing Pts
1  2  1127770
2  2  817
1  3  823240
2  3  569
3  3  718
1  4  285695
2  4  142890
3  4  142
4  4  213
1  5  29772
2  5  19848
3  5  9928
4  5  18
5  5  16

Duplicate Returns: 4311021 (28.844 percent) points are duplicates

r  n  Duplicates
1  1  2707083
1  2  332028
2  2  663717
1  3  70619
2  3  211834
3  3  282348
1  4  2856
2  4  8568
3  4  14280
4  4  17136
1  5  23
2  5  69
3  5  115
4  5  161
5  5  184

Return Greater Than Num. Returns: 0 (0.000 percent) points have r > n

Writing output LAS file... Complete! Elapsed Time (including I/O): 1.959s
```
Python API
def lidar_point_return_analysis(self, input: Lidar, create_output: bool = False) -> Optional[Lidar]:
LiDAR Point Stats
Function name: lidar_point_stats
This tool creates several rasters summarizing the distribution of LiDAR points in a LAS data file. The user must specify the name of an input LAS file (input) and the output raster grid resolution (resolution). Additionally, the user must specify one or more of the possible output rasters to create using the various available flags, which include:

| Flag | Meaning |
|---|---|
| num_points | Number of points (returns) in each grid cell |
| num_pulses | Number of pulses in each grid cell |
| avg_points_per_pulse | Average number of points per pulse in each grid cell |
| z_range | Elevation range within each grid cell |
| intensity_range | Intensity range within each grid cell |
| predom_class | Predominant class value within each grid cell |
If no output raster flags are specified, all of the output rasters will be created. All output rasters will have the same base name as the input LAS file but will have a suffix that reflects the statistic type (e.g. _num_pnts, _num_pulses, _avg_points_per_pulse, etc.). Output files will be in the GeoTIFF (*.tif) file format.
When the input/output parameters are not specified, the tool works on all LAS files contained within the working directory.
Notes:
1. The num_pulses output is actually the number of pulses with at least one return; specifically, it is the sum of the early returns (first and only) in a grid cell. In areas of low reflectance, such as over water surfaces, the system may have emitted significantly more pulses than the number of returns observed.
2. The memory requirement of this tool is high, particularly if the grid resolution is fine and the spatial extent is large.
See Also
lidar_block_minimum, lidar_block_maximum
Python API
def lidar_point_stats(self, input_lidar: Optional[Lidar], cell_size: float = 1.0, num_points: bool = False, num_pulses: bool = False, avg_points_per_pulse: bool = False, z_range: bool = False, intensity_range: bool = False, predominant_class: bool = False) :
LiDAR Ransac Planes
Function name: lidar_ransac_planes
This tool uses the random sample consensus (RANSAC) method to identify points within a LiDAR point cloud that belong to planar surfaces. RANSAC is a common method used in the field of computer vision to identify a subset of inlier points in a noisy data set containing abundant outlier points. Because LiDAR point clouds often contain vegetation points that do not form planar surfaces, this tool can be used to largely strip vegetation points from the point cloud, leaving behind the ground returns, buildings, and other points belonging to planar surfaces. If the classify flag is used, non-planar points will not be removed but rather will be assigned a different class (1) than the planar points (0).
The algorithm selects a random sample of a specified size (num_samples) from the points within the neighbourhood (radius) surrounding each LiDAR point. The sample is then used to parameterize a planar best-fit model. The distance between each neighbouring point and the plane is then evaluated; inliers are those neighbouring points within a user-specified distance threshold (threshold). Models with at least a minimum number of inlier points (model_size) are then accepted. This process of selecting models is iterated a user-specified number of times (num_iter).
One of the challenges with identifying planar surfaces in LiDAR point clouds is that these data are usually collected along scan lines. Therefore, each scan line can potentially yield a vertical planar surface, which is one reason that some vegetation points remain after applying the RANSAC plane-fitting method. To cope with this problem, the tool allows the user to specify a maximum planar slope (max_slope) parameter. Planes that have slopes greater than this threshold are rejected by the algorithm. A side effect, however, is that building walls are also removed.
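The sampling, fitting, and acceptance steps described above can be sketched for a single neighbourhood as follows. This is a minimal illustration under stated assumptions, not the tool's code: the function names are hypothetical, the plane is fitted by SVD, and the best accepted model is taken to be the one with the most inliers. The tool itself repeats this around every LiDAR point.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through points: returns (unit normal, centroid)."""
    centroid = points.mean(axis=0)
    # The normal is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(points - centroid)
    return vt[-1], centroid

def ransac_plane(neighbours, num_iter=50, num_samples=10, threshold=0.15,
                 model_size=30, max_slope=75.0, seed=0):
    """Return a boolean inlier mask for the best accepted plane, or None."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(num_iter):
        idx = rng.choice(len(neighbours), size=num_samples, replace=False)
        normal, centroid = fit_plane(neighbours[idx])
        # Plane slope is the angle between its normal and the vertical axis.
        slope = np.degrees(np.arccos(min(1.0, abs(normal[2]))))
        if slope > max_slope:
            continue  # reject near-vertical planes (for example scan-line artefacts)
        dist = np.abs((neighbours - centroid) @ normal)
        inliers = dist <= threshold
        # Accept models with enough inliers; keep the largest one seen so far.
        if inliers.sum() >= model_size and (best is None or inliers.sum() > best.sum()):
            best = inliers
    return best
```

Points that never fall inside an accepted model would be the ones removed, or assigned class 1 when the classify flag is used.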
References
Fischler MA and Bolles RC. 1981. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM, 24(6):381–395.
See Also
lidar_segmentation, lidar_ground_point_filter
Python API
def lidar_ransac_planes(self, in_lidar: Lidar, search_radius: float = 2.0, num_iterations: int = 50, num_samples: int = 10, inlier_threshold: float = 0.15, acceptable_model_size: int = 30, max_planar_slope: float = 75.0, classify: bool = False, only_last_returns: bool = False) -> Lidar:
LiDAR Rooftop Analysis
Function name: lidar_rooftop_analysis
This tool can be used to identify roof segments in a LiDAR point cloud.
See Also
classify_buildings_in_lidar, clip_lidar_to_polygon
Python API
def lidar_rooftop_analysis(self, lidar_inputs: List[Lidar], building_footprints: Vector, search_radius: float = 2.0, num_iterations: int = 50, num_samples: int = 10, inlier_threshold: float = 0.15, acceptable_model_size: int = 30, max_planar_slope: float = 75.0, norm_diff_threshold: float = 2.0, azimuth: float = 180.0, altitude: float = 30.0) -> Vector:
Normal Vectors
Function name: normal_vectors
Calculates normal vectors for points within a LAS file and stores these data (XYZ vector components) in the RGB field.
Python API
def normal_vectors(self, input: Lidar, search_radius: float = -1.0) -> Lidar:
Workflow Products
LiDAR QA And Confidence
Function name: lidar_qa_and_confidence
PRO | Production
Assess LiDAR point-cloud quality and compute confidence metrics for terrain extraction readiness.
workflow pro
Workflow Narrative
LiDAR QA and Confidence
Problem It Solves
Is this LiDAR deliverable trustworthy enough for production terrain modeling, and where are the risk zones?
Who It Is For
- LiDAR production QA teams and data acceptance reviewers.
Primary User
Survey/mapping firms, government mapping programs, and enterprise geospatial platforms.
What It Does
- Runs LiDAR QA workflow with ground-surface diagnostics.
- Produces confidence, uncertainty, and QA flags for acceptance screening.
- Supports qa_mode-driven strictness behavior (strict, balanced, permissive, auto).
- Supports fast_mode for exploratory runs that skip non-critical diagnostics.
How It Works
- Classifies and normalizes point-cloud structure to estimate ground-surface consistency.
- Builds rasterized QA metrics (confidence, uncertainty, flags) from neighborhood evidence.
- Applies mode-specific acceptance thresholds to summarize pass-risk patterns.
- Optionally runs checkpoint vertical validation and auto-mode recommendations in summary outputs.
- Indicative formula: confidence ~= 1 - normalized(local_residual + return_structure_penalty).
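The indicative formula in the last bullet can be read as follows. This sketch is only an interpretation: the manual does not say how the sum is normalized, so min-max scaling is used here as a stand-in, and the function name and inputs are hypothetical.

```python
import numpy as np

def indicative_confidence(local_residual, return_structure_penalty):
    """confidence ~= 1 - normalized(local_residual + return_structure_penalty)."""
    combined = (np.asarray(local_residual, dtype=float)
                + np.asarray(return_structure_penalty, dtype=float))
    lo, hi = np.nanmin(combined), np.nanmax(combined)
    if hi <= lo:
        # A constant surface carries no relative penalty.
        return np.ones_like(combined)
    # ASSUMPTION: min-max scaling stands in for the unspecified "normalized".
    normalized = (combined - lo) / (hi - lo)
    return 1.0 - normalized
```

Under this reading, cells with the largest combined residual and penalty receive a confidence of 0 and the cleanest cells receive 1.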
Why It Wins
- Converts QA from a binary pass/fail decision into spatial confidence and uncertainty diagnostics.
Typical Buying Trigger
Data acceptance teams need objective QA evidence before approving vendor LiDAR for production use.
Typical Presets
- strict for conservative acceptance checks.
- balanced for normal production QA.
- permissive for exploratory preprocessing.
- auto for recommendation-guided QA mode selection.
- fast_mode=true for rapid exploratory runs when full diagnostics are not required.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| input (LAS/LAZ) | no | Input LiDAR point cloud used to derive QA, terrain, structure, or encroachment products. |
| qa_mode and QA threshold controls | no | QA strictness mode and threshold controls for LiDAR acceptance diagnostics. |
| fast_mode | yes | Optional acceleration mode that skips hotspot extraction, stratified metrics, and checkpoint validation for faster exploratory runs. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| classified_lidar | optional LAS/LAZ | Optional classified LiDAR point cloud output from QA/terrain workflows. |
| dtm | GeoTIFF | Digital terrain model raster generated from workflow processing. |
| confidence | GeoTIFF | Confidence layer quantifying reliability of modeled outputs. |
| uncertainty | GeoTIFF | Uncertainty diagnostics layer highlighting low-certainty areas. |
| qa_flags | GeoTIFF | QA flag raster identifying cells that failed quality checks. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.lidar_qa_and_confidence(
    input="data/points.laz",
    qa_mode="balanced",
    fast_mode=False,
    output_prefix="output/lidar_qa",
)
print(result)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
LiDAR Terrain Product Suite
Function name: lidar_terrain_product_suite
PRO | Production
Generate terrain-focused raster products from LiDAR with consistent QA and surface controls.
workflow pro
Workflow Narrative
LiDAR Terrain Product Suite
Problem It Solves
How can we convert raw LiDAR into a consistent, production-ready terrain package quickly and reproducibly?
Who It Is For
- Terrain product operations teams and applied engineering GIS units.
Primary User
Mapping programs, engineering/environmental consultancies, and topographic data providers.
What It Does
- Produces a full terrain derivative package from raw LiDAR in one run.
- Includes core terrain products and propagated QA metrics.
- Emits metadata contract for reproducibility and downstream automation.
How It Works
- Performs LiDAR preprocessing/classification and interpolates terrain surfaces.
- Derives DTM/DSM/slope/hillshade products from the processed ground and surface model.
- Propagates QA diagnostics into confidence/uncertainty and records run metadata contract.
- Indicative formula: slope = atan(sqrt((dz/dx)^2 + (dz/dy)^2)); hillshade from slope/aspect and illumination azimuth/zenith.
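The indicative slope and hillshade formulas above can be sketched with NumPy as shown below. This is an illustration rather than the workflow's implementation: the function name, the default illumination angles (azimuth 315, altitude 45), the north-up row ordering, and the use of np.gradient for the derivatives are all assumptions.

```python
import numpy as np

def slope_and_hillshade(dem, cell_size=1.0, azimuth=315.0, altitude=45.0):
    """Return (slope in degrees, hillshade in 0..1) for a north-up DEM array."""
    dem = np.asarray(dem, dtype=float)
    # np.gradient returns the derivative along rows (southwards) first,
    # then along columns (eastwards).
    dz_drow, dz_dx = np.gradient(dem, cell_size)
    slope = np.arctan(np.sqrt(dz_dx**2 + dz_drow**2))
    # Aspect: compass direction the surface faces (downslope), clockwise from north.
    aspect = np.arctan2(-dz_dx, dz_drow)
    zenith = np.radians(90.0 - altitude)
    shade = (np.cos(zenith) * np.cos(slope)
             + np.sin(zenith) * np.sin(slope) * np.cos(np.radians(azimuth) - aspect))
    return np.degrees(slope), np.clip(shade, 0.0, 1.0)
```

A surface that rises one unit per cell towards the east has a 45 degree slope and faces west, so it is fully lit when the illumination azimuth is 270 and the altitude is 45.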
Why It Wins
- Unifies classification, terrain derivatives, and QA metadata into one operational pipeline.
Typical Buying Trigger
Production teams must deliver standardized terrain products at scale with minimal manual orchestration.
Typical Presets
- balanced for default production.
- strict for high-stakes engineering QA contexts.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| input (LAS/LAZ) | no | Input LiDAR point cloud used to derive QA, terrain, structure, or encroachment products. |
| profile, block size, slope/elevation thresholds | no | Terrain-suite processing profile and block/threshold controls for derivative generation. |
| hillshade and QA controls | no | Hillshade illumination and QA export controls for terrain package outputs. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| dtm | GeoTIFF | Digital terrain model raster generated from workflow processing. |
| dsm | GeoTIFF | Digital surface model raster generated from workflow processing. |
| slope | GeoTIFF | Slope derivative raster generated from terrain processing. |
| hillshade | GeoTIFF | Hillshade visualization raster generated from terrain derivatives. |
| confidence | GeoTIFF | Confidence layer quantifying reliability of modeled outputs. |
| uncertainty | GeoTIFF | Uncertainty diagnostics layer highlighting low-certainty areas. |
| metadata | JSON | Metadata contract describing generated products and provenance. |
| html_report | HTML | Human-readable customer-facing report generated from the metadata/summary contract for stakeholder review and QA traceability. |
| classified_lidar | optional LAS/LAZ | Optional classified LiDAR point cloud output from QA/terrain workflows. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.lidar_terrain_product_suite(
    input="data/points.laz",
    profile="balanced",
    output_prefix="output/terrain_suite",
)
print(result)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
LiDAR Change And Disturbance Analysis
Function name: lidar_change_and_disturbance_analysis
PRO | Production
Compare baseline and monitoring LiDAR epochs to quantify elevation change and disturbance intensity.
workflow pro
Workflow Narrative
LiDAR Change and Disturbance Analysis
Problem It Solves
Where has significant terrain or canopy disturbance occurred between LiDAR acquisition epochs?
Who It Is For
- LiDAR monitoring programs, forestry operations, and infrastructure change-monitoring teams.
Primary User
Asset monitoring groups, agencies running repeat LiDAR acquisitions, and environmental compliance teams.
What It Does
- Performs epoch-to-epoch LiDAR change analysis using tile-native processing.
- Produces per-tile delta rasters plus a disturbance manifest and run summary.
How It Works
- Accepts baseline and monitoring tile sets (arrays or directories), sorted and paired per tile.
- Grids each tile to surface rasters and harmonizes monitor tile CRS/grid to baseline per pair.
- Computes elevation delta and thresholded disturbed area metrics per tile.
- QA acceptance guidance:
  - status=pass indicates no seam-risk or low-support warnings were triggered.
  - diagnostics.acceptance_thresholds.min_valid_cells_per_tile defaults to 500 and flags sparse tile support.
  - diagnostics.acceptance_thresholds.seam_risk_warn_cv defaults to 0.75 and flags elevated inter-tile inconsistency.
  - Review diagnostics.tile_diagnostics before publishing disturbance totals for operational reporting.
- MVP hardening assets:
  - Municipal ingestion guide: docs/internal/development/LIDAR_CHANGE_SIDEWALK_MUNICIPAL_SCHEMA_INGESTION_GUIDE_2026_04_14.md
  - Benchmark fixture scaffold: tests/fixtures/lidar_change_city_benchmark/
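The per-tile delta and disturbed-area step can be sketched as below for one baseline/monitor pair that has already been gridded and aligned. This is an assumption-laden illustration, not the workflow's code: the function name and dictionary keys are hypothetical, the threshold comparison is taken to be inclusive, and the seam-risk check is omitted because its definition is not given here.

```python
import numpy as np

def tile_disturbance(baseline, monitor, resolution=2.0, min_change_m=1.0,
                     min_valid_cells_per_tile=500):
    """Return (metrics dict, delta raster) for one aligned tile pair."""
    baseline = np.asarray(baseline, dtype=float)
    monitor = np.asarray(monitor, dtype=float)
    delta = monitor - baseline  # positive values mean the surface rose
    valid = np.isfinite(delta)  # NaN marks cells without data in either epoch
    # ASSUMPTION: a cell counts as disturbed when |delta| >= min_change_m.
    disturbed = valid & (np.abs(np.where(valid, delta, 0.0)) >= min_change_m)
    metrics = {
        "valid_cells": int(valid.sum()),
        "disturbed_area": float(disturbed.sum() * resolution**2),
        # Mirrors the sparse-support warning described in the QA guidance.
        "low_support": bool(valid.sum() < min_valid_cells_per_tile),
    }
    return metrics, delta
```

With a 2 m resolution each disturbed cell contributes 4 square map units to the total, which is why the resolution and min_change_m inputs should be recorded alongside any reported area.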
Inputs
| Parameter | Optional | Description |
|---|---|---|
| baseline_tiles | no | Baseline LiDAR tiles (array or directory path). |
| monitor_tiles | no | Monitoring-epoch LiDAR tiles (array or directory path). |
| resolution | yes | Output tile-surface resolution in map units. |
| min_change_m | yes | Absolute change threshold for disturbance accounting. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| tile_directory | directory | Directory containing per-tile delta rasters. |
| tile_manifest | JSON | Per-tile output and disturbance metrics manifest. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
summary, manifest, tile_dir = wbe.lidar_change_and_disturbance_analysis(
    baseline_tiles="data/lidar_2023_tiles/",
    monitor_tiles="data/lidar_2025_tiles/",
    resolution=2.0,
    min_change_m=1.0,
    output_prefix="output/lidar_change",
)
print(summary)
print(manifest)
print(tile_dir)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Sidewalk Vegetation Accessibility Monitoring
Function name: sidewalk_vegetation_accessibility_monitoring
PRO | Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Sidewalk Vegetation Accessibility Monitoring
Problem It Solves
Which sidewalk segments are most obstructed by vegetation, and where is LiDAR coverage missing?
Who It Is For
- Municipal accessibility teams, urban forestry programs, and public-works corridor management.
Primary User
Municipal accessibility offices and operations teams responsible for pedestrian corridor clearance.
What It Does
- Scores vegetation encroachment along sidewalks using tile-native LiDAR processing.
- Aggregates all tile evidence into one city-level output layer.
- Supports centerline segmentation for line inputs and feature-level fallback for polygon inputs.
How It Works
- Reads LiDAR tiles from an array or directory and processes them one tile at a time.
- Builds per-tile DSM/DTM and computes height-above-ground obstruction surfaces.
- Reprojects sidewalks to tile CRS when needed, samples obstruction neighborhoods, and aggregates results.
- For line/multiline sidewalks, optionally segments centerlines into fixed-length analysis units.
- QA acceptance guidance:
  - status=pass indicates coverage and obstruction diagnostics met baseline QA thresholds.
  - diagnostics.acceptance_thresholds.minimum_coverage_fraction defaults to 0.75; values below this trigger review.
  - summary.no_lidar_coverage_features are unresolved units and should be targeted for additional capture or fallback policy.
  - Prioritize operational responses using STATUS, MAX_OBSTR, and tile-level diagnostics together.
- MVP hardening assets:
  - Municipal ingestion guide: docs/internal/development/LIDAR_CHANGE_SIDEWALK_MUNICIPAL_SCHEMA_INGESTION_GUIDE_2026_04_14.md
  - Benchmark fixture scaffold: tests/fixtures/sidewalk_accessibility_city_benchmark/
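The centerline segmentation step mentioned above (fixed-length analysis units for line inputs) can be illustrated with a pure-Python sketch. The function name and the coordinate-list representation are hypothetical, and the workflow's own handling of short remainders may differ; here the final piece simply keeps whatever length is left over.

```python
import math

def segment_polyline(coords, segment_length):
    """Split a polyline [(x, y), ...] into pieces of at most segment_length."""
    segments = []
    current = [coords[0]]
    remaining = segment_length  # distance left before the current piece is full
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        edge = math.hypot(x1 - x0, y1 - y0)
        travelled = 0.0
        # Cut the edge wherever the current piece reaches its full length.
        while edge - travelled > remaining:
            travelled += remaining
            t = travelled / edge
            cut = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            current.append(cut)
            segments.append(current)
            current, remaining = [cut], segment_length
        remaining -= edge - travelled
        current.append((x1, y1))
    if len(current) > 1:
        segments.append(current)
    return segments
```

For example, a 25 m straight centerline with segment_length_m=10.0 would yield pieces of 10 m, 10 m, and 5 m under this scheme.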
Inputs
| Parameter | Optional | Description |
|---|---|---|
| lidar_tiles | no | LiDAR tiles as array or directory path (LAS/LAZ/ZLidar). |
| sidewalks | no | Sidewalk layer (line, multiline, polygon, or multipolygon). |
| sidewalks_epsg | yes | EPSG override when sidewalk CRS metadata is missing. |
| resolution | yes | Intermediate tile raster resolution in map units. |
| segment_length_m | yes | Segment length for line inputs; ignored for polygon inputs. |
| clearance_height_m | yes | Height threshold used to label obstruction status. |
| buffer_distance_m | yes | Neighborhood sampling radius around sidewalk geometry. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| sidewalk_accessibility | GeoPackage | City-level aggregated sidewalk/segment accessibility layer. |
| summary | JSON | Machine-readable summary report containing coverage, obstruction counts, and QA diagnostics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
access, summary = wbe.sidewalk_vegetation_accessibility_monitoring(
    lidar_tiles="data/lidar_1km_tiles/",
    sidewalks="data/sidewalk_centerlines.gpkg",
    segment_length_m=10.0,
    clearance_height_m=2.5,
    output_prefix="output/sidewalk_access",
)
print(access)
print(summary)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Remote Sensing Analysis
Remote sensing workflows in WbW-QGIS cover multispectral and hyperspectral image analysis: spectral index computation, image enhancement, principal component analysis (PCA), segmentation, and change detection.
This chapter is aligned with the Python and R manuals while staying focused on QGIS Processing Toolbox execution patterns.
Core Concepts You Should Know First
- Spectral bands: Wavelength-specific image channels (for example blue, red, NIR, SWIR) used to separate land-cover materials.
- Spectral indices: Band combinations that highlight specific targets, such as NDVI (vegetation), NDWI (water), and NBR (burn severity).
- Spatial resolution: Pixel size affects detail and detectability of features.
- Temporal resolution: Revisit interval controls change-detection sensitivity.
- Atmospheric and cloud effects: Compare like-with-like by masking clouds and shadows and using corrected imagery where possible.
- Change detection: Compare index or class outputs across acquisition dates.
- Dimensionality reduction: PCA reduces band redundancy before segmentation or classification.
Typical Inputs
| Layer | Format | Notes |
|---|---|---|
| image_t1.tif | Multiband GeoTIFF | Time-1 scene, reflectance preferred |
| image_t2.tif | Multiband GeoTIFF | Time-2 scene, same sensor/preprocessing |
| cloud_mask_t1.tif | Raster | Optional cloud or QA-derived mask |
| cloud_mask_t2.tif | Raster | Optional cloud or QA-derived mask |
End-to-End Workflow
Step 1 - Quality Check and Harmonize Inputs
Before analysis:
- Confirm both scenes use the same CRS, grid, and pixel size.
- Confirm band order and data scale (for example 0-1 reflectance or scaled integer reflectance).
- Mask clouds/shadows using QA products or Raster Calculator conditions.
Use QGIS Raster menu tools as needed:
- Align Raster
- Warp (Reproject)
- Raster Calculator
Step 2 - Build Key Spectral Indices
Process both dates with the same settings.
Processing Toolbox -> Whitebox Workflows -> Remote Sensing:
- NDVI
- Normalized Difference Index (for NDWI, NBR, NDSI, NDBI patterns)
Recommended outputs:
- ndvi_t1.tif, ndvi_t2.tif
- ndwi_t1.tif, ndwi_t2.tif
- nbr_t1.tif, nbr_t2.tif
Example NDVI run:
| Parameter | Value |
|---|---|
| Input image | image_t1.tif |
| NIR band | sensor-specific (for example 4, 5, or 8 depending on product) |
| Red band | sensor-specific |
| Output | ndvi_t1.tif |
Repeat for time 2.
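For readers who want to check index values by hand, the normalized difference calculation behind these tools can be sketched in NumPy. The function name is hypothetical and the arrays are assumed to be already-loaded reflectance bands; NDVI uses (NIR, Red), NDWI in its common green/NIR form uses (Green, NIR), and NBR uses (NIR, SWIR2).

```python
import numpy as np

def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    total = a + b
    with np.errstate(divide="ignore", invalid="ignore"):
        # Zero-sum pixels (for example empty background) become NaN.
        return np.where(total != 0, (a - b) / total, np.nan)
```

Values produced this way always fall between -1 and 1, which is a quick sanity check for band mapping: results outside that range usually point to a scale mismatch or a wrong band assignment.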
Step 3 - Create Change Surfaces
Use QGIS Raster Calculator for differencing:
- NDVI change: ndvi_t2 - ndvi_t1
- NBR change: nbr_t2 - nbr_t1
Then classify into practical bins (loss, stable, gain) using:
- Whitebox Workflows -> Raster Analysis -> Reclass
- or Raster Calculator threshold expressions
Suggested interpretation for NDVI change:
| Class | Threshold |
|---|---|
| Strong loss | < -0.20 |
| Moderate loss | -0.20 to -0.10 |
| Stable | -0.10 to 0.10 |
| Moderate gain | 0.10 to 0.20 |
| Strong gain | > 0.20 |
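The binning in the table above can be expressed compactly with NumPy. This is a sketch for checking thresholds outside QGIS, not a Whitebox tool: the function and constant names are hypothetical, and because the table does not say which class owns a boundary value, the sketch assigns each boundary to the higher class (for example, exactly -0.20 is treated as moderate loss).

```python
import numpy as np

CLASS_NAMES = ["Strong loss", "Moderate loss", "Stable", "Moderate gain", "Strong gain"]
BREAKS = [-0.20, -0.10, 0.10, 0.20]

def classify_ndvi_change(delta):
    """Map NDVI change values to class indices 0..4; NaN cells become -1."""
    delta = np.asarray(delta, dtype=float)
    # digitize returns i such that BREAKS[i-1] <= value < BREAKS[i].
    classes = np.digitize(delta, BREAKS)
    return np.where(np.isnan(delta), -1, classes)
```

The returned indices line up with CLASS_NAMES, so a label for index 2 is "Stable". Thresholds should still be tuned to the local land cover and sensor.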
Step 4 - Dimensionality Reduction (PCA)
Processing Toolbox -> Whitebox Workflows -> Remote Sensing -> Principal Component Analysis
Use PCA when:
- Bands are highly correlated.
- You need compact inputs for segmentation or clustering.
- You are preparing a classification feature stack.
Inspect the output variance/eigenvalue diagnostics and retain only the components needed to explain most of the variance.
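One way to decide how many components to keep is to look at the cumulative explained-variance ratio. The sketch below computes it independently of the Whitebox tool with NumPy; the function name, the 0.95 target, and the (bands, rows, cols) array layout are assumptions chosen for illustration.

```python
import numpy as np

def components_for_variance(bands, target=0.95):
    """bands: array (n_bands, rows, cols). Returns (k, explained-variance ratios)."""
    flat = np.asarray(bands, dtype=float).reshape(len(bands), -1)
    # Drop pixels that are NoData (NaN or inf) in any band before computing stats.
    flat = flat[:, np.all(np.isfinite(flat), axis=0)]
    # Eigenvalues of the band covariance matrix, largest first.
    eigvals = np.linalg.eigvalsh(np.cov(flat))[::-1]
    ratio = eigvals / eigvals.sum()
    # Smallest k whose cumulative ratio reaches the target.
    k = int(np.searchsorted(np.cumsum(ratio), target) + 1)
    return min(k, len(ratio)), ratio
```

With highly correlated bands the first one or two ratios typically dominate, which is the situation in which PCA is most useful as a preprocessing step.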
Step 5 - Segmentation and Classification Prep
Processing Toolbox -> Whitebox Workflows -> Remote Sensing -> Image Segmentation
Typical tuning:
- Lower threshold -> more, smaller segments
- Higher threshold -> fewer, larger segments
- Minimum segment size removes speckle
After segmentation:
- Optionally polygonize segment rasters in QGIS.
- Join zonal metrics from index layers.
- Use resulting segment features for training/validation workflows.
QGIS Python Console Equivalent
Use this pattern for reproducible batch processing in QGIS:
```python
import processing

img_t1 = '/data/image_t1.tif'
img_t2 = '/data/image_t2.tif'

# Compute NDVI for both dates with identical settings.
for label, img in [('t1', img_t1), ('t2', img_t2)]:
    processing.run('whitebox_workflows:ndvi', {
        'input': img,
        'nir_band': 4,  # sensor-specific; adjust to your product
        'red_band': 3,  # sensor-specific; adjust to your product
        'output': f'/data/ndvi_{label}.tif',
    })

# Difference the two NDVI rasters (time 2 minus time 1).
processing.run('qgis:rastercalculator', {
    'EXPRESSION': '"ndvi_t2@1" - "ndvi_t1@1"',
    'LAYERS': ['/data/ndvi_t1.tif', '/data/ndvi_t2.tif'],
    'OUTPUT': '/data/ndvi_change.tif',
})

# Reduce the time-1 band stack to four principal components.
processing.run('whitebox_workflows:principal_component_analysis', {
    'input': img_t1,
    'num_comp': 4,
    'output': '/data/pca_t1.tif',
})
```
Common Pitfalls
| Problem | Likely cause | Fix |
|---|---|---|
| Index values are outside expected ranges | Wrong bands or scale mismatch | Verify band mapping and value scale |
| Apparent change is mostly cloud edges | Missing cloud/shadow masking | Mask QA classes before differencing |
| PCA output looks unstable | NoData included in stats | Mask NoData consistently |
| Segmentation over-merges features | Threshold too high | Lower threshold and retest |
| Change map is noisy | Different spatial grids or radiometry | Align rasters and normalize radiometry |
Validation Checklist
- Inputs are co-registered and in a common CRS.
- Band assignments match sensor metadata.
- Cloud/shadow/no-data masking applied consistently across dates.
- Index histograms are plausible for local land cover.
- Change classes were reviewed visually against source imagery.
- PCA/segmentation parameters were documented for reproducibility.
Image Enhancement and Contrast
Balance Contrast Enhancement
Function name: balance_contrast_enhancement
This tool can be used to reduce colour bias in a colour composite image based on the technique described by Liu (1991). Colour bias is a common phenomenon with colour images derived from multispectral imagery, whereby a higher average brightness value in one band results in over-representation of that band in the colour composite. The tool essentially applies a parabolic stretch to each of the three bands in a user-specified RGB colour composite, forcing the histograms of each band to have the same minimum, maximum, and average values while maintaining their overall histogram shape. For greater detail on the operation of the tool, please see Liu (1991). Aside from the names of the input and output colour composite images, the user must also set the value of E, the desired output band mean, where 20 < E < 235.
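To make the parabolic stretch concrete, the sketch below solves for a parabola y = a(x - b)^2 + c that maps the band minimum to a target minimum, the band maximum to a target maximum, and the band mean to E. The coefficients are derived here directly from those three constraints; the function name and the 0 to 255 output range are assumptions, and Liu (1991) remains the authoritative description of the method.

```python
import numpy as np

def balance_contrast(band, out_mean=100.0, out_min=0.0, out_max=255.0):
    """Parabolic stretch so that min -> out_min, max -> out_max, mean -> out_mean."""
    x = np.asarray(band, dtype=float)
    l, h, e = x.min(), x.max(), x.mean()
    s = np.mean(x**2)  # mean of squared values
    L, H, E = out_min, out_max, out_mean
    # Solve y = a*(x - b)**2 + c subject to y(l) = L, y(h) = H, mean(y) = E.
    # Note: the denominator is zero when a plain linear stretch already gives
    # mean E; that degenerate case is not handled in this sketch.
    b = ((H - L) * (s - l**2) - (E - L) * (h**2 - l**2)) / (
        2.0 * ((H - L) * (e - l) - (E - L) * (h - l)))
    a = (H - L) / ((h - l) * (h + l - 2.0 * b))
    c = L - a * (l - b)**2
    return a * (x - b)**2 + c
```

Applying the same target minimum, maximum, and mean to the red, green, and blue bands is what removes the brightness bias between them.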
Reference
Liu, J.G. (1991) Balance contrast enhancement technique and its application in image colour composition. International Journal of Remote Sensing, 12:10.
See Also
direct_decorrelation_stretch, histogram_matching, histogram_matching_two_images, histogram_equalization, gaussian_contrast_stretch
Python API
def balance_contrast_enhancement(self, image: Raster, band_mean: float = 100.0) -> Raster:
Create Colour Composite
Function name: create_colour_composite
This tool can be used to create a colour-composite image from three bands of multi-spectral imagery. The user must input images to enter into the red, green, and blue channels of the resulting composite image. The output image uses the 32-bit aRGB colour model, and therefore, in addition to red, green and blue bands, the user may optionally specify a fourth image that will be used to determine pixel opacity (the 'a' channel). If no opacity image is specified, each pixel will be opaque. An opacity image can be useful for cropping an image to an irregular-shaped boundary, and it can also be used to create transparent gradients in the composite image.
A balance contrast enhancement (BCE) can optionally be performed on the bands prior to creation of the colour composite. While this operation will add to the runtime of create_colour_composite, if the individual input bands have not already had contrast enhancements, then it is advisable that the BCE option be used to improve the quality of the resulting colour composite image.
NoData values in any of the input images are assigned NoData values in the output image and are not taken into account when performing the BCE operation. Please note, not all images have NoData values identified. When this is the case, and when the background value is 0 (often the case with multispectral imagery), then the create_colour_composite tool can be told to ignore zero values using the zeros flag.
See Also
balance_contrast_enhancement, split_colour_composite
Python API
def create_colour_composite(self, red: Raster, green: Raster, blue: Raster, opacity: Raster = None, enhance: bool = True, treat_zeros_as_nodata: bool = False) -> Raster:
Direct Decorrelation Stretch
Function name: direct_decorrelation_stretch
The Direct Decorrelation Stretch (DDS) is a simple type of saturation stretch. The stretch is applied to a colour composite image and is used to improve the saturation, or colourfulness, of the image. The DDS operates by reducing the achromatic (grey) component of a pixel's colour by a scale factor (k), such that the red (r), green (g), and blue (b) components of the output colour are defined as:
r_k = r - k * min(r, g, b)
g_k = g - k * min(r, g, b)
b_k = b - k * min(r, g, b)
The achromatic factor (k) can range between 0 (no effect) and 1 (full saturation stretch), although typical values range from 0.3 to 0.7. A linear stretch is used afterwards to adjust overall image brightness. Liu and Moore (1996) recommend applying a colour balance stretch, such as balance_contrast_enhancement, before using the DDS.
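The achromatic reduction defined above is simple enough to sketch directly. The code below is illustrative only: the function names are hypothetical, the image is assumed to be a (rows, cols, 3) array, and the follow-up brightness adjustment uses a plain min-max stretch to 0 to 255 rather than the tool's clip_percent-based stretch.

```python
import numpy as np

def reduce_achromatic(rgb, k=0.5):
    """Subtract k times the per-pixel minimum of (r, g, b) from every channel."""
    rgb = np.asarray(rgb, dtype=float)
    grey = rgb.min(axis=2, keepdims=True)  # achromatic component per pixel
    return rgb - k * grey

def linear_stretch(values, out_max=255.0):
    """ASSUMPTION: simple min-max stretch standing in for the tool's linear stretch."""
    lo, hi = values.min(), values.max()
    if hi <= lo:
        return np.zeros_like(values)
    return (values - lo) / (hi - lo) * out_max
```

For a pixel (100, 50, 200) and k = 0.5, the achromatic component is 50, so the result before the linear stretch is (75, 25, 175): the channel differences are preserved while the grey component shrinks.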
Reference
Liu, J.G., and Moore, J. (1996) Direct decorrelation stretch technique for RGB colour composition. International Journal of Remote Sensing, 17:5, 1005-1018.
See Also
create_colour_composite, balance_contrast_enhancement
Python API
def direct_decorrelation_stretch(self, image: Raster, achromatic_factor: float = 0.5, clip_percent: float = 1.0) -> Raster:
False Colour Composite
Function name: false_colour_composite
No help documentation available for this tool.
Gamma Correction
Function name: gamma_correction
This tool performs a gamma colour correction transform on an input image (input), such that each input pixel value (z_in) is mapped to the corresponding output value (z_out) as:
z_out = z_in ^ gamma
The user must specify the value of the gamma parameter. The input image may be of either a greyscale or RGB colour composite data type.
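The power-law mapping can be sketched as follows. The function name and the rescaling to a 0 to 1 range before applying the exponent are assumptions made for this illustration; the manual does not state how the tool scales input values.

```python
import numpy as np

def gamma_correct(values, gamma=0.5, max_value=255.0):
    """Apply z_out = z_in ** gamma on values scaled to 0..1, then rescale."""
    # ASSUMPTION: inputs lie in 0..max_value and are normalized before the power.
    z = np.asarray(values, dtype=float) / max_value
    return (z ** gamma) * max_value
```

With this scaling, gamma values below 1 brighten mid-tones (63.75 maps to 127.5 for gamma = 0.5), while values above 1 darken them; the end points 0 and max_value are unchanged.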
Python API
def gamma_correction(self, raster: Raster, gamma_value: float = 0.5) -> Raster:
Gaussian Contrast Stretch
Function name: gaussian_contrast_stretch
This tool performs a Gaussian stretch on a raster image. The observed histogram of the input image is fitted to a Gaussian histogram, i.e. normal distribution. A histogram matching technique is used to map the values from the input image onto the output Gaussian distribution. The user must input the number of tones (num_tones) used.
This tool is related to the more general histogram_matching tool, which can be used to fit any frequency distribution to an input image, and other contrast enhancement tools such as histogram_equalization, min_max_contrast_stretch, percentage_contrast_stretch, sigmoidal_contrast_stretch, and standard_deviation_contrast_stretch.
See Also
piecewise_contrast_stretch, histogram_equalization, min_max_contrast_stretch, percentage_contrast_stretch, sigmoidal_contrast_stretch, standard_deviation_contrast_stretch, histogram_matching
Python API
def gaussian_contrast_stretch(self, raster: Raster, num_tones: int = 256) -> Raster:
Histogram Equalization
Function name: histogram_equalization
This tool alters the cumulative distribution function (CDF) of a raster image to match, as closely as possible, the CDF of a uniform distribution. Histogram equalization works by first calculating the histogram of the input image. This input histogram is then converted into a CDF. Each grid cell value in the input image is then mapped to the corresponding value in the uniform distribution's CDF that has an equivalent (or as close as possible) cumulative probability value. Histogram equalization provides a very effective means of performing image contrast adjustment in an efficient manner with little need for human input.
The user must specify the name of the input image to perform histogram equalization on. The user must also specify the number of tones, corresponding to the number of histogram bins used in the analysis.
histogram_equalization is related to the histogram_matching_two_images tool (used when an image's CDF is to be matched to a reference CDF derived from a reference image). histogram_matching (which matches a user-supplied reference histogram) and gaussian_contrast_stretch (which matches a Gaussian, i.e. normal, reference CDF) are related tools frequently used for image contrast adjustment.
Notes:
- The algorithm can introduce gaps in the histograms (steps in the CDF). This is to be expected because the histogram is being distorted. This is more prevalent for integer-level images.
- Histogram equalization is not appropriate for images containing categorical (class) data.
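The histogram-to-CDF mapping described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the tool's code: the function name is hypothetical, and the output is scaled to the range 0 to num_tones - 1 as an assumption.

```python
import numpy as np

def equalize(image, num_tones=256):
    """Map each cell to its cumulative probability, scaled to 0..num_tones-1."""
    img = np.asarray(image, dtype=float)
    valid = np.isfinite(img)  # skip NaN NoData cells
    hist, edges = np.histogram(img[valid], bins=num_tones)
    cdf = np.cumsum(hist) / hist.sum()
    # Bin index of every valid cell, using the same edges as the histogram.
    bins = np.clip(np.digitize(img[valid], edges[1:-1]), 0, num_tones - 1)
    out = np.full(img.shape, np.nan)
    out[valid] = cdf[bins] * (num_tones - 1)
    return out
```

Because many input values can share one bin, the output takes at most num_tones distinct values, which is the source of the gaps in the histogram noted above.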
See Also
piecewise_contrast_stretch, histogram_matching, histogram_matching_two_images, gaussian_contrast_stretch
Python API
def histogram_equalization(self, raster: Raster, num_tones: int = 256) -> Raster:
Histogram Matching
Function name: histogram_matching
This tool alters the cumulative distribution function (CDF) of a raster image to match, as closely as possible, the CDF of a reference histogram. Histogram matching works by first calculating the histogram of the input image. This input histogram and reference histograms are each then converted into CDFs. Each grid cell value in the input image is then mapped to the corresponding value in the reference CDF that has an equivalent (or as close as possible) cumulative probability value. Histogram matching provides the most flexible means of performing image contrast adjustment.
The reference histogram must be specified to the tool in the form of a text file (.txt), provided using the histo_file flag. This file must contain two columns (delimited by a tab, space, comma, colon, or semicolon) where the first column contains the x value (i.e. the values that will be assigned to the grid cells in the output image) and the second column contains the frequency or probability. Note that 1) the file must not contain a header row, 2) each x value/frequency pair must be on a separate row. It is possible to create this type of histogram using the wide range of distribution tools available in most spreadsheet programs (e.g. Excel or LibreOffice's Calc program). You must save the file as a text-only (ASCII) file.
histogram_matching is related to the histogram_matching_two_images tool, which can be used when a reference CDF can be derived from a reference image. histogram_equalization and gaussian_contrast_stretch are similarly related tools frequently used for image contrast adjustment, where the reference CDFs are uniform and Gaussian (normal) respectively.
Notes:
- The algorithm can introduce gaps in the histograms (steps in the CDF). This is to be expected because the histogram is being distorted. This is more prevalent for integer-level images.
- Histogram matching is not appropriate for images containing categorical (class) data.
- This tool is not intended for images containing RGB data. If this is the case, the colour channels should be split using the split_colour_composite tool.
See Also
histogram_matching_two_images, histogram_equalization, gaussian_contrast_stretch, split_colour_composite
Python API
def histogram_matching(self, image: Raster, histogram: List[List[float]], histo_is_cumulative: bool = False) -> Raster:
Histogram Matching Two Images
Function name: histogram_matching_two_images
This tool alters the cumulative distribution function (CDF) of a raster image to match, as closely as possible, the CDF of a reference image. Histogram matching works by first calculating the histograms of the input image (i.e. the image to be adjusted) and the reference image. These histograms are then converted into CDFs. Each grid cell value in the input image is then mapped to the corresponding value in the reference CDF that has an equivalent (or as close as possible) cumulative probability value. A common application of this is to match the images from two sensors with slightly different responses, or images from the same sensor when the sensor's response is known to change over time. The sizes of the two images (rows and columns) do not need to be the same, nor do the images need to be geographically overlapping.
histogram_matching_two_images is related to the histogram_matching tool, which can be used when a reference CDF is used directly rather than deriving it from a reference image. histogram_equalization and gaussian_contrast_stretch are similarly related tools, where the reference CDFs are uniform and Gaussian (normal) respectively.
The algorithm may introduce gaps in the histograms (steps in the CDF). This is to be expected because the histograms are being distorted, and it is more prevalent for integer-level images. Histogram matching is not appropriate for images containing categorical (class) data. It is also not intended for images containing RGB data; in that case, the colour channels should first be split using the split_colour_composite tool.
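The CDF mapping described for this tool can be sketched in plain Python on flat lists of cell values. This is only an illustration of the idea, not the tool's implementation: the function names are hypothetical, and the rule of picking the first reference value whose cumulative probability is at least that of the input value is an assumption.

```python
from bisect import bisect_left

def empirical_cdf(values):
    """Return the sorted unique values and their cumulative probabilities."""
    ordered = sorted(values)
    n = len(ordered)
    xs, cdf = [], []
    for i, v in enumerate(ordered, start=1):
        if xs and xs[-1] == v:
            cdf[-1] = i / n  # repeated value: raise the step for that value
        else:
            xs.append(v)
            cdf.append(i / n)
    return xs, cdf

def match_histogram(input_values, reference_values):
    """Map each input value to a reference value with a matching cumulative probability."""
    in_xs, in_cdf = empirical_cdf(input_values)
    ref_xs, ref_cdf = empirical_cdf(reference_values)
    lookup = {}
    for x, p in zip(in_xs, in_cdf):
        # Assumed rule: first reference value whose CDF is at least p.
        j = min(bisect_left(ref_cdf, p), len(ref_xs) - 1)
        lookup[x] = ref_xs[j]
    return [lookup[v] for v in input_values]

image = [1, 1, 2, 3]
reference = [10, 20, 30, 40]
print(match_histogram(image, reference))  # [20, 20, 30, 40]
```

Note that the two lists have the same length here only for convenience; because the mapping works on cumulative probabilities, the input and reference can hold different numbers of cells.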
See Also
histogram_matching, histogram_equalization, gaussian_contrast_stretch, split_colour_composite
Python API
def histogram_matching_two_images(self, image1: Raster, image2: Raster) -> Raster:
IHS To RGB
Function name: ihs_to_rgb
This tool transforms three intensity, hue, and saturation (IHS; sometimes HSI or HIS) raster images into three equivalent multispectral images corresponding with the red, green, and blue channels of an RGB composite. Intensity refers to the brightness of a color, hue is related to the dominant wavelength of light and is perceived as color, and saturation is the purity of the color (Koutsias et al., 2000). There are numerous algorithms for performing a red-green-blue (RGB) to IHS transformation. This tool uses the transformation described by Haydn (1982). Note that, based on this transformation, the input IHS values must follow the ranges:
0 < I < 1
0 < H < 2PI
0 < S < 1
The output red, green, and blue images will have values ranging from 0 to 255. The user must specify the names of the intensity, hue, and saturation images (intensity, hue, saturation). These images will generally be created using the rgb_to_ihs tool. The user must also specify the names of the output red, green, and blue images (red, green, blue). Image enhancements, such as contrast stretching, are often performed on the individual IHS components, which are then inverse transformed back into RGB components using this tool. The output RGB components can then be used to create an improved color composite image.
References
Haydn, R., Dalke, G.W. and Henkel, J. (1982) Application of the IHS color transform to the processing of multisensor data and image enhancement. Proc. of the International Symposium on Remote Sensing of Arid and Semiarid Lands, Cairo, 599-616.
Koutsias, N., Karteris, M., and Chuvieco, E. (2000). The use of intensity-hue-saturation transformation of Landsat-5 Thematic Mapper data for burned land mapping. Photogrammetric Engineering and Remote Sensing, 66(7), 829-840.
See Also
rgb_to_ihs, balance_contrast_enhancement, direct_decorrelation_stretch
Python API
def ihs_to_rgb(self, intensity: Raster, hue: Raster, saturation: Raster) -> Tuple[Raster, Raster, Raster]:
Min Max Contrast Stretch
Function name: min_max_contrast_stretch
This tool performs a minimum-maximum contrast stretch on a raster image. This operation maps each grid cell value in the input raster image (zin) onto a new scale that ranges from the user-specified lower clip value (min_val) to the upper clip value (max_val), with the user-specified number of tonal values (num_tones), such that:
zout = ((zin - min_val)/(max_val - min_val)) x num_tones
where zout is the output value. This is a type of linear contrast stretch with saturation at the tails of the frequency distribution.
Related contrast enhancement tools include gaussian_contrast_stretch, histogram_equalization, percentage_contrast_stretch, sigmoidal_contrast_stretch, and standard_deviation_contrast_stretch, as well as the more general histogram_matching tool, which can be used to fit any frequency distribution to an input image.
See Also
piecewise_contrast_stretch, gaussian_contrast_stretch, histogram_equalization, percentage_contrast_stretch, sigmoidal_contrast_stretch, standard_deviation_contrast_stretch, histogram_matching
Python API
def min_max_contrast_stretch(self, raster: Raster, min_val: float, max_val: float, num_tones: int = 256) -> Raster:
Mosaic
Function name: mosaic
This tool will create an image mosaic from one or more input image files using one of three resampling methods: nearest neighbour, bilinear interpolation, or cubic convolution. The order of the input source image files is important. Grid cells in the output image will be assigned the corresponding value determined from the last image found in the list to possess an overlapping coordinate.
Note that when the inputs parameter is left unspecified, the tool will use all of the .tif, .tiff, .rdc, .flt, .sdat, and .dep files located in the working directory. This can be a useful way of mosaicing a large number of tiles, particularly when the text string that would be required to specify all of the input tiles is longer than the allowable limit.
This is the preferred mosaicing tool to use when appending multiple images with little to no overlapping areas, e.g. tiled data. When images have significant overlap areas, users are advised to use the mosaic_with_feathering tool instead.
Resample is very similar in operation to the Mosaic tool. The Resample tool should be used when there is an existing image into which you would like to dump information from one or more source images. If the source images are more extensive than the destination image, i.e. there are areas that extend beyond the destination image boundaries, these areas will not be represented in the updated image. Grid cells in the destination image that are not overlapping with any of the input source images will not be updated, i.e. they will possess the same value as before the resampling operation. The Mosaic tool is used when there is no existing destination image. In this case, a new image is created that represents the bounding rectangle of each of the two or more input images. Grid cells in the output image that do not overlap with any of the input images will be assigned the NoData value.
See Also
mosaic_with_feathering
Python API
def mosaic(self, images: List[Raster], resampling_method: str = "cc") -> Raster:
Mosaic With Feathering
Function name: mosaic_with_feathering
This tool will create a mosaic from two input images. It is similar in operation to the mosaic tool, however, this tool is the preferred method of mosaicing images when there is significant overlap between the images. For areas of overlap, the feathering method will calculate the output value as a weighted combination of the two input values, where the weights are derived from the squared distance of the pixel to the edge of the data in each of the input raster files. Therefore, less weight is assigned to an image's pixel value where the pixel is very near the edge of the image. Note that the distance is actually calculated to the edge of the grid and not necessarily the edge of the data, which can differ if the image has been rotated during registration. The result of this feathering method is that the output mosaic image should have very little evidence of the original image edges within the overlapping area.
Unlike the Mosaic tool, which can take multiple input images, this tool only accepts two input images. Mosaic is therefore useful when there are many adjacent or only slightly overlapping images, e.g. for tiled data sets.
Users may want to use the histogram_matching tool prior to mosaicing if the two input images differ significantly in their radiometric properties, i.e. if image contrast differences exist.
See Also
mosaic, histogram_matching
Python API
def mosaic_with_feathering(self, image1: Raster, image2: Raster, resampling_method: str = "cc", distance_weight: float = 4.0) -> Raster:
Normalized Difference Index
Function name: normalized_difference_index
This tool can be used to calculate a normalized difference index (NDI) from two bands of multispectral image data. An NDI of two band images (image1 and image2) takes the general form:
NDI = (image1 - image2) / (image1 + image2 + c)
Where c is a correction factor sometimes used to avoid division by zero. It is, however, often set to 0.0. In fact, the normalized_difference_index tool will set all pixels where image1 + image2 = 0 to 0.0 in the output image. While this is not strictly mathematically correct (0 / 0 is undefined), it is often the intended output in these cases.
NDIs generally take values in the range -1.0 to 1.0, although in practice the range of values for a particular image scene may be more restricted than this.
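The general NDI form and the zero-denominator rule above can be illustrated with a short per-cell sketch. The function name normalized_difference and the sample reflectance values are hypothetical, and the extra guard for a zero denominator when c is non-zero is an assumption added so the sketch never divides by zero.

```python
def normalized_difference(a, b, correction=0.0):
    """Per-cell NDI = (a - b) / (a + b + c), returning 0.0 where a + b = 0."""
    denominator = a + b + correction
    if a + b == 0.0 or denominator == 0.0:
        return 0.0  # documented behaviour for a + b = 0; second check is an assumption
    return (a - b) / denominator

# Hypothetical near-infrared and red reflectance values for three cells.
nir = [0.5, 0.0, 0.3]
red = [0.1, 0.0, 0.3]
ndvi = [normalized_difference(n, r) for n, r in zip(nir, red)]
osavi = [normalized_difference(n, r, correction=0.16) for n, r in zip(nir, red)]
print(ndvi)
print(osavi)
```

The second cell shows the zero-sum case, and the third shows that equal band values give an index of 0.0 regardless of the correction factor.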
NDIs have two important properties that make them particularly useful for remote sensing applications. First, they emphasize certain aspects of the shape of the spectral signatures of different land covers. Secondly, they can be used to de-emphasize the effects of variable illumination within a scene. NDIs are therefore frequently used in the field of remote sensing to create vegetation indices and other indices for emphasizing various land-covers and as inputs to analytical operations like image classification. For example, the normalized difference vegetation index (NDVI), one of the most common image-derived products in remote sensing, is calculated as:
NDVI = (NIR - RED) / (NIR + RED)
The optimal soil adjusted vegetation index (OSAVI) is:
OSAVI = (NIR - RED) / (NIR + RED + 0.16)
The normalized difference water index (NDWI), or normalized difference moisture index (NDMI), is:
NDWI = (NIR - SWIR) / (NIR + SWIR)
The normalized burn ratio 1 (NBR1) and normalized burn ratio 2 (NBR2) are:
NBR1 = (NIR - SWIR2) / (NIR + SWIR2)
NBR2 = (SWIR1 - SWIR2) / (SWIR1 + SWIR2)
In addition to NDIs, simple ratios of image bands are also commonly used as inputs to other remote sensing applications like image classification. Simple ratios can be calculated using the Divide tool. Division by zero, in this case, will result in an output NoData value.
See Also
Divide
Python API
def normalized_difference_index(self, nir_image: Raster, red_image: Raster, clip_percent: float = 0.0, correction_value: float = 0.0) -> Raster:
Panchromatic Sharpening
Function name: panchromatic_sharpening
Panchromatic sharpening, or simply pan-sharpening, refers to a range of techniques that can be used to merge finer spatial resolution panchromatic images with coarser spatial resolution multi-spectral images. The multi-spectral data provides colour information while the panchromatic image provides improved spatial information. This procedure is sometimes called image fusion. Jensen (2015) describes panchromatic sharpening in detail.
Whitebox provides two common methods for panchromatic sharpening including the Brovey transformation and the Intensity-Hue-Saturation (IHS) methods. Both of these techniques provide the best results when the range of wavelengths detected by the panchromatic image overlap significantly with the wavelength range covered by the three multi-spectral bands that are used. When this is not the case, the resulting colour composite will likely have colour properties that are dissimilar to the colour composite generated by the original multispectral images. For Landsat ETM+ data, the panchromatic band is sensitive to EMR in the range of 0.52-0.90 micrometres. This corresponds closely to the green (band 2), red (band 3), and near-infrared (band 4).
Reference
Jensen, J. R. (2015). Introductory Digital Image Processing: A Remote Sensing Perspective.
See Also
create_colour_composite
Python API
def panchromatic_sharpening(self, pan: Raster, colour_composite: Raster, red: Raster, green: Raster, blue: Raster, fusion_method: str = "brovey") -> Raster:
Percentage Contrast Stretch
Function name: percentage_contrast_stretch
This tool performs a percentage contrast stretch on a raster image. This operation maps each grid cell value in the input raster image (zin) onto a new scale that ranges from a lower-tail clip value (min_val) to the upper-tail clip value (max_val), with the user-specified number of tonal values (num_tones), such that:
zout = ((zin – min_val)/(max_val – min_val)) x num_tones
where zout is the output value. The values of min_val and max_val are determined from the frequency distribution and the user-specified tail clip value (clip). For example, if a value of 1% is specified, the tool will determine the values in the input image for which 1% of the grid cells have a lower value min_val and 1% of the grid cells have a higher value max_val. The user must also specify which tails (upper, lower, or both) to clip (tail).
This is a type of linear contrast stretch with saturation at the tails of the frequency distribution. This is the same kind of stretch that is used to display raster type data on the fly in many GIS software packages, such that the lower and upper tail values are set using the minimum and maximum display values and the number of tonal values is determined by the number of palette entries.
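A minimal sketch of the stretch formula above, applied to a flat list of cell values with both tails clipped. The function name is hypothetical, and the way the clip values are located (by index into the sorted values) and the clamping of values beyond the clip points are assumptions about how saturation at the tails could be realised.

```python
def percentage_stretch(values, clip=1.0, num_tones=256):
    """Linear stretch between the lower and upper tail clip values (both tails)."""
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * clip / 100.0)  # number of cells clipped from each tail
    min_val, max_val = ordered[k], ordered[n - 1 - k]
    span = max_val - min_val
    out = []
    for z in values:
        z = min(max(z, min_val), max_val)  # saturate values beyond the clip points
        out.append((z - min_val) / span * num_tones if span else 0.0)
    return out

values = list(range(100))  # cell values 0..99
stretched = percentage_stretch(values, clip=10.0, num_tones=256)
print(stretched[0], stretched[50], stretched[99])
```

With a 10% clip on this input, the lowest and highest ten values saturate at 0 and num_tones respectively, and the remaining values are spread linearly between them.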
See Also
piecewise_contrast_stretch, gaussian_contrast_stretch, histogram_equalization, min_max_contrast_stretch, sigmoidal_contrast_stretch, standard_deviation_contrast_stretch
Python API
def percentage_contrast_stretch(self, raster: Raster, clip: float = 1.0, tail: str = "both", num_tones: int = 256) -> Raster:
Piecewise Contrast Stretch
Function name: piecewise_contrast_stretch
Description
This tool can be used to perform a piecewise contrast stretch on an input image (input). The input image can be either a single-band image or a colour composite, in which case the contrast stretch will be performed on the intensity values of the hue-saturation-intensity (HSI) transform of the colour image. The user must also specify the name of the output image (output) and the break-points that define the piecewise function used to transfer brightness values from the input to the output image. The break-point values are specified as a string parameter (function), with each break-point taking the form of (input value, output proportion); (input value, output proportion); (input value, output proportion), etc. Piecewise functions can have as many break-points as desired, and each break-point should be separated by a semicolon (;). The input values are specified as brightness values in the same units as the input image (unless it is an input colour composite, in which case the intensity values range from 0 to 1). The output function must be specified as a proportion (from 0 to 1) of the output value range, which is specified by the number of output greytones (greytones). The greytones parameter is ignored if the input image is a colour composite. Note that there is no need to specify the initial break-point to the piecewise function, as (input min value, 0.0) will be inserted automatically. Similarly, an upper bound of the piecewise function of (input max value, 1.0) will also be inserted automatically.
Generally you want to set breakpoints by examining the image histogram. Typically it is desirable to map large unpopulated ranges of input brightness values in the input image onto relatively narrow ranges of the output brightness values (i.e. a shallow sloped segment of the piecewise function), and areas of the histogram that are well populated with pixels in the input image onto a larger range of brightness values in the output image (i.e. a steeper slope segment). This will have the effect of reducing the number of tones used to display the under-populated tails of the distribution and spreading out the well-populated regions of the histogram, thereby improving the overall contrast and the visual interpretability of the output image. The flexibility of the piecewise contrast stretch can often provide a very suitable means of significantly improving image quality.
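To make the break-point string concrete, the sketch below parses a statement of the form described above and evaluates the resulting piecewise-linear function for a single input value. Both helper names are hypothetical, the input range of 0 to 255 is an example, and the sketch returns the output proportion only (scaling to the greytone range is left out).

```python
import re

def parse_breakpoints(statement):
    """Parse '(x, p); (x, p); ...' into a sorted list of (x, p) tuples."""
    pairs = re.findall(r"\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)", statement)
    return sorted((float(x), float(p)) for x, p in pairs)

def piecewise_proportion(z, points, z_min, z_max):
    """Linearly interpolate the output proportion (0 to 1) for input value z."""
    # The end points are added automatically, as described in the text.
    pts = [(z_min, 0.0)] + points + [(z_max, 1.0)]
    z = min(max(z, z_min), z_max)
    for (x0, p0), (x1, p1) in zip(pts, pts[1:]):
        if x0 <= z <= x1:
            if x1 == x0:
                return p1
            return p0 + (p1 - p0) * (z - x0) / (x1 - x0)
    return 1.0

points = parse_breakpoints("(50, 0.1); (120, 0.8)")
for z in (0, 50, 85, 120, 255):
    print(z, piecewise_proportion(z, points, z_min=0.0, z_max=255.0))
```

In this example the sparsely populated range 0 to 50 is squeezed into the lowest 10% of the output range, while the range 50 to 120 is spread over 70% of it, which is the steep-segment behaviour recommended for well-populated parts of the histogram.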
See Also
raster_histogram, gaussian_contrast_stretch, min_max_contrast_stretch, standard_deviation_contrast_stretch
Python API
def piecewise_contrast_stretch(self, raster: Raster, transformation_statement: str, num_greytones: float = 1024.0) -> Raster:
Resample
Function name: resample
This tool can be used to modify the grid resolution of one or more rasters. The user specifies the names of one or more input rasters (inputs). The resolution of the output raster is determined either using a specified cell_size parameter, in which case the output extent is determined by the combined extent of the inputs, or by an optional base raster (base), in which case the output raster spatial extent matches that of the base file. This operation is similar to the mosaic tool, except that resample modifies the output resolution. The resample tool may also be used with a single input raster (when the user wants to modify its spatial resolution), whereas mosaic always includes multiple inputs.
If the input source images are more extensive than the base image (if optionally specified), the areas that extend beyond it will not be represented in the output image. Grid cells in the output image that are not overlapping with any of the input source images will be assigned the NoData value, which will be the same as that of the first input image. Grid cells in the output image that overlap with multiple input raster cells will be assigned the last input value in the stack. Thus, the order of input images is important.
See Also
mosaic
Python API
def resample(self, input_rasters: List[Raster], cell_size: float = 0.0, base_raster: Raster = None, method: str = "cc") -> Raster:
RGB To IHS
Function name: rgb_to_ihs
This tool transforms three raster images of multispectral data (red, green, and blue channels) into their equivalent intensity, hue, and saturation (IHS; sometimes HSI or HIS) images. Intensity refers to the brightness of a color, hue is related to the dominant wavelength of light and is perceived as color, and saturation is the purity of the color (Koutsias et al., 2000). There are numerous algorithms for performing a red-green-blue (RGB) to IHS transformation. This tool uses the transformation described by Haydn (1982). Note that, based on this transformation, the output IHS values follow the ranges:
0 < I < 1
0 < H < 2PI
0 < S < 1
The user must specify the names of the red, green, and blue images (red, green, blue). Importantly, these images need not necessarily correspond with the specific regions of the electromagnetic spectrum that are red, green, and blue. Rather, the input images are three multispectral images that could be used to create an RGB color composite. The user must also specify the names of the output intensity, hue, and saturation images (intensity, hue, saturation). Image enhancements, such as contrast stretching, are often performed on the IHS components, which are then inverse transformed back into RGB components to create an improved color composite image.
References
Haydn, R., Dalke, G.W. and Henkel, J. (1982) Application of the IHS color transform to the processing of multisensor data and image enhancement. Proc. of the International Symposium on Remote Sensing of Arid and Semiarid Lands, Cairo, 599-616.
Koutsias, N., Karteris, M., and Chuvieco, E. (2000). The use of intensity-hue-saturation transformation of Landsat-5 Thematic Mapper data for burned land mapping. Photogrammetric Engineering and Remote Sensing, 66(7), 829-840.
See Also
ihs_to_rgb, balance_contrast_enhancement, direct_decorrelation_stretch
Python API
def rgb_to_ihs(self, red: Optional[Raster] = None, green: Optional[Raster] = None, blue: Optional[Raster] = None, composite: Optional[Raster] = None) -> Tuple[Raster, Raster, Raster]:
Sigmoidal Contrast Stretch
Function name: sigmoidal_contrast_stretch
This tool performs a sigmoidal stretch on a raster image. This is a transformation where the input image value for a grid cell (zin) is transformed to an output value zout such that:
zout = ((1.0 / (1.0 + exp(gain x (cutoff - z))) - a) / b) x num_tones
where,
z = (zin - MIN) / RANGE,
a = 1.0 / (1.0 + exp(gain x cutoff)),
b = 1.0 / (1.0 + exp(gain x (cutoff - 1.0))) - 1.0 / (1.0 + exp(gain x cutoff)),
MIN and RANGE are the minimum value and data range in the input image respectively and gain and cutoff are user specified parameters (gain, cutoff).
Like all of WhiteboxTools's contrast enhancement tools, this operation will work on either greyscale or RGB input images.
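The sigmoidal transfer function defined above can be written out directly for a single cell value. This is a sketch of the formula as stated, with a hypothetical function name; z_min and z_range stand for MIN and RANGE, and the example parameter values are arbitrary.

```python
import math

def sigmoidal_stretch(z_in, z_min, z_range, cutoff=0.0, gain=1.0, num_tones=256):
    """Evaluate the sigmoidal stretch formula for one input value."""
    z = (z_in - z_min) / z_range  # rescale the input to 0..1
    a = 1.0 / (1.0 + math.exp(gain * cutoff))
    b = 1.0 / (1.0 + math.exp(gain * (cutoff - 1.0))) - a
    return (1.0 / (1.0 + math.exp(gain * (cutoff - z))) - a) / b * num_tones

# Example: input values spanning 0..100, cutoff at the midpoint, fairly steep gain.
for z_in in (0.0, 25.0, 50.0, 75.0, 100.0):
    print(z_in, sigmoidal_stretch(z_in, z_min=0.0, z_range=100.0, cutoff=0.5, gain=10.0))
```

The terms a and b normalise the curve so that the image minimum maps to 0 and the image maximum maps to num_tones; with the cutoff at 0.5, the midpoint of the input range lands at half of num_tones.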
See Also
piecewise_contrast_stretch, gaussian_contrast_stretch, histogram_equalization, min_max_contrast_stretch, percentage_contrast_stretch, standard_deviation_contrast_stretch
Python API
def sigmoidal_contrast_stretch(self, raster: Raster, cutoff: float = 0.0, gain: float = 1.0, num_tones: int = 256) -> Raster:
Split Colour Composite
Function name: split_colour_composite
This tool can be used to split a red-green-blue (RGB) colour-composite image into three separate bands of multi-spectral imagery. The user must specify the input image (input) and output red, green, blue images.
See Also
create_colour_composite
Python API
def split_colour_composite(self, composite_image: Raster) -> Tuple[Raster, Raster, Raster]:
Standard Deviation Contrast Stretch
Function name: standard_deviation_contrast_stretch
This tool performs a standard deviation contrast stretch on a raster image. This operation maps each grid cell value in the input raster image (zin) onto a new scale that ranges from a lower-tail clip value (min_val) to the upper-tail clip value (max_val), with the user-specified number of tonal values (num_tones), such that:
zout = ((zin – min_val)/(max_val – min_val)) x num_tones
where zout is the output value. The values of min_val and max_val are determined based on the image mean and standard deviation. Specifically, the user must specify the number of standard deviations (clip or stdev) to be used in determining the min and max clip values. The tool will then calculate the input image mean and standard deviation and estimate the clip values from these statistics.
This is the same kind of stretch that is used to display raster type data on the fly in many GIS software packages.
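The following sketch shows one way the clip values could be estimated from the image statistics. It assumes the clip values are the mean minus and plus the specified number of standard deviations, uses the population standard deviation, and clamps values outside that range; none of these details are confirmed by the text above, and the function name is hypothetical.

```python
import statistics

def stdev_stretch(values, clip=2.0, num_tones=256):
    """Linear stretch between mean - clip*sd and mean + clip*sd (assumed rule)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumption)
    min_val = mean - clip * sd
    max_val = mean + clip * sd
    span = max_val - min_val
    out = []
    for z in values:
        z = min(max(z, min_val), max_val)  # saturate values beyond the clip points
        out.append((z - min_val) / span * num_tones if span else 0.0)
    return out

stretched = stdev_stretch([0.0, 5.0, 10.0], clip=1.0)
print(stretched)
```

With a clip of one standard deviation, the two extreme values in this tiny example fall outside the clip range and saturate, while the mean value maps to the middle of the tonal range.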
See Also
piecewise_contrast_stretch, gaussian_contrast_stretch, histogram_equalization, min_max_contrast_stretch, percentage_contrast_stretch, sigmoidal_contrast_stretch
Python API
def standard_deviation_contrast_stretch(self, raster: Raster, clip: float = 2.0, num_tones: int = 256) -> Raster:
True Colour Composite
Function name: true_colour_composite
No help documentation available for this tool.
Image Filters
Adaptive Filter
Function name: adaptive_filter
This tool performs a type of adaptive filter on a raster image. An adaptive filter can be used to reduce the level of random noise (shot noise) in an image. The algorithm operates by calculating the average value in a moving window centred on each grid cell. If the absolute difference between the window mean value and the centre grid cell value is beyond a user-defined threshold (threshold), the grid cell in the output image is assigned the mean value, otherwise it is equivalent to the original value. Therefore, the algorithm only modifies the image where grid cell values are substantially different than their neighbouring values.
Neighbourhood size, or filter size, is specified in the x and y dimensions using filterx and filtery. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
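The thresholding rule described above can be sketched on a small grid stored as a list of rows. This is an illustration rather than the tool's implementation: the function here is a plain-Python stand-in, the window is assumed to include the centre cell, and windows are simply truncated at the grid edges.

```python
def adaptive_filter(grid, size_x=3, size_y=3, threshold=2.0):
    """Replace a cell with its window mean only when it differs from that mean by more than threshold."""
    rows, cols = len(grid), len(grid[0])
    half_x, half_y = size_x // 2, size_y // 2
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # Moving window centred on (r, c), truncated at the edges (assumption).
            window = [grid[rr][cc]
                      for rr in range(max(0, r - half_y), min(rows, r + half_y + 1))
                      for cc in range(max(0, c - half_x), min(cols, c + half_x + 1))]
            mean = sum(window) / len(window)
            if abs(mean - grid[r][c]) > threshold:
                out[r][c] = mean
    return out

grid = [[1.0, 1.0, 1.0],
        [1.0, 10.0, 1.0],
        [1.0, 1.0, 1.0]]
filtered = adaptive_filter(grid, threshold=3.0)
print(filtered)
```

Only the spike at the centre is replaced (by the window mean of 2.0); the surrounding cells stay within the threshold of their window means and are left unchanged.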
See Also
mean_filter
Python API
def adaptive_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11, threshold: float = 2.0) -> Raster:
Anisotropic Diffusion Filter
Function name: anisotropic_diffusion_filter
Experimental
Performs Perona-Malik edge-preserving anisotropic diffusion smoothing.
remote_sensing raster filter anisotropic_diffusion_filter legacy-port
Parameters
Name | Description | Required | Default
input | Input raster path or typed raster object. | Required | input.tif
iterations | Number of diffusion iterations (default 10). | Optional | 10
kappa | Edge sensitivity parameter (default 20.0). | Optional | 20.0
lambda | Time-step in (0, 0.25], default 0.2. | Optional | 0.2
output | Optional output path. If omitted, output remains in memory. | Optional | (none)
Examples
Applies anisotropic_diffusion_filter to an input raster.
wbe.anisotropic_diffusion_filter(input='image.tif', output='anisotropic_diffusion_filter.tif')
Bilateral Filter
Function name: bilateral_filter
This tool can be used to perform an edge-preserving smoothing filter, or bilateral filter, on an image. A bilateral filter can be used to emphasize the longer-range variability in an image, effectively acting to smooth the image, while reducing the edge blurring effect common with other types of smoothing filters. As such, this filter is very useful for reducing the noise in an image. Bilateral filtering is a non-linear filtering technique introduced by Tomasi and Manduchi (1998). The algorithm operates by convolving a kernel of weights with each grid cell and its neighbours in an image. The bilateral filter is related to Gaussian smoothing, in that the weights of the convolution kernel are partly determined by the 2-dimensional Gaussian (i.e. normal) curve, which gives stronger weighting to cells nearer the kernel centre. Unlike the gaussian_filter, however, the bilateral kernel weightings are also affected by their similarity to the intensity value of the central pixel. Pixels that are very different in intensity from the central pixel are weighted less, also based on a Gaussian weight distribution. Therefore, this non-linear convolution filter is determined by the spatial and intensity domains of a localized pixel neighborhood.
The heavier weighting given to nearer and similar-valued pixels makes the bilateral filter an attractive alternative for image smoothing and noise reduction compared to the much-used Mean filter. The size of the filter is determined by setting the standard deviation distance parameter (sigma_dist); the larger the standard deviation the larger the resulting filter kernel. The standard deviation can be any number in the range 0.5-20 and is specified in the unit of pixels. The standard deviation intensity parameter (sigma_int), specified in the same units as the z-values, determines the intensity domain contribution to kernel weightings.
References
Tomasi, C., & Manduchi, R. (1998, January). Bilateral filtering for gray and color images. In Sixth International Conference on Computer Vision (pp. 839-846). IEEE.
See Also
edge_preserving_mean_filter
Python API
def bilateral_filter(self, raster: Raster, sigma_dist: float = 0.75, sigma_int: float = 1.0) -> Raster:
Closing
Function name: closing
This tool performs a closing operation on an input greyscale image (input). A closing is a mathematical morphology operation involving an erosion (minimum filter) of a dilation (maximum filter) set. Closing operations, together with opening operations, are frequently used in the fields of computer vision and digital image processing for image noise removal. The user must specify the size of the moving window in both the x and y directions (filterx and filtery).
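The definition above, an erosion of a dilation, can be sketched with a generic moving-window helper. The helper name window_reduce is hypothetical, the closing function is a plain-Python stand-in for illustration, and windows are assumed to be truncated at the grid edges.

```python
def window_reduce(grid, size_x, size_y, reducer):
    """Apply reducer (min or max) over a moving window, truncated at the edges."""
    rows, cols = len(grid), len(grid[0])
    half_x, half_y = size_x // 2, size_y // 2
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - half_y), min(rows, r + half_y + 1))
                      for cc in range(max(0, c - half_x), min(cols, c + half_x + 1))]
            out_row.append(reducer(window))
        out.append(out_row)
    return out

def closing(grid, size_x=3, size_y=3):
    """Closing = erosion (minimum filter) of the dilation (maximum filter)."""
    dilated = window_reduce(grid, size_x, size_y, max)
    return window_reduce(dilated, size_x, size_y, min)

# A flat surface with a single dark pit at the centre.
grid = [[5] * 5 for _ in range(5)]
grid[2][2] = 0
closed = closing(grid)
print(closed[2][2])  # 5
```

The single-cell pit is smaller than the 3 x 3 window, so the dilation fills it and the following erosion leaves the filled surface intact.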
See Also
opening, tophat_transform
Python API
def closing(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Conservative Smoothing Filter
Function name: conservative_smoothing_filter
This tool performs a conservative smoothing filter on a raster image. A conservative smoothing filter can be used to remove short-range variability in an image, effectively acting to smooth the image. It is particularly useful for eliminating local spikes and reducing the noise in an image. The algorithm operates by calculating the minimum and maximum neighbouring values surrounding a grid cell. If the cell at the centre of the kernel is greater than the calculated maximum value, it is replaced with the maximum value in the output image. Similarly, if the cell value at the kernel centre is less than the neighbouring minimum value, the corresponding grid cell in the output image is replaced with the minimum value. This filter tends to alter an image very little compared with other smoothing filters such as the mean_filter, edge_preserving_mean_filter, bilateral_filter, median_filter, gaussian_filter, or olympic_filter.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
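A minimal sketch of the clamping rule described above, on a grid stored as a list of rows. The function is a plain-Python stand-in, the neighbourhood is taken to exclude the centre cell as the description implies, and truncating windows at the grid edges is an assumption.

```python
def conservative_smoothing(grid, size_x=3, size_y=3):
    """Clamp each cell to the minimum/maximum range of its neighbours."""
    rows, cols = len(grid), len(grid[0])
    half_x, half_y = size_x // 2, size_y // 2
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # Neighbouring values only; the centre cell itself is excluded.
            neighbours = [grid[rr][cc]
                          for rr in range(max(0, r - half_y), min(rows, r + half_y + 1))
                          for cc in range(max(0, c - half_x), min(cols, c + half_x + 1))
                          if (rr, cc) != (r, c)]
            if not neighbours:
                continue
            low, high = min(neighbours), max(neighbours)
            out[r][c] = min(max(grid[r][c], low), high)
    return out

grid = [[1, 1, 1],
        [1, 50, 1],
        [1, 1, 1]]
smoothed = conservative_smoothing(grid)
print(smoothed)  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

The spike is pulled down to the neighbouring maximum, while every other cell already lies within the range of its neighbours and is untouched, which is why this filter alters an image so little.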
See Also
mean_filter, edge_preserving_mean_filter, bilateral_filter, median_filter, gaussian_filter, olympic_filter
Python API
def conservative_smoothing_filter(self, raster: Raster, filter_size_x: int = 3, filter_size_y: int = 3) -> Raster:
Diff Of Gaussians Filter
Function name: diff_of_gaussians_filter
This tool can be used to perform a difference-of-Gaussians (DoG) filter on a raster image. In digital image processing, DoG is a feature enhancement algorithm that involves the subtraction of one blurred version of an image from another, less blurred version of the original. The blurred images are obtained by applying filters with Gaussian-weighted kernels of differing standard deviations to the input image (input). Blurring an image using a Gaussian-weighted kernel suppresses high-frequency spatial information and emphasizes lower-frequency variation. Subtracting one blurred image from the other preserves spatial information that lies between the range of frequencies that are preserved in the two blurred images. Thus, the difference-of-Gaussians is a band-pass filter that discards all but a specified range of spatial frequencies that are present in the original image.
The algorithm operates by differencing the results of convolving two kernels of weights with each grid cell and its neighbours in an image. The weights of the convolution kernels are determined by the 2-dimensional Gaussian (i.e. normal) curve, which gives stronger weighting to cells nearer the kernel centre. The sizes of the two convolution kernels are determined by setting the two standard deviation parameters (sigma1 and sigma2); the larger the standard deviation, the larger the resulting filter kernel. The second standard deviation should be larger than the first; if this is not the case, the tool will automatically swap the two parameters. Both standard deviations can range from 0.5 to 20.
The difference-of-Gaussians filter can be used to emphasize edges present in an image. Other edge-sharpening filters also operate by enhancing high-frequency detail, but because random noise also has a high spatial frequency, many of these sharpening filters tend to enhance noise, which can be an undesirable artifact. The difference-of-Gaussians filter can remove high-frequency noise while emphasizing edges. This filter can, however, reduce overall image contrast.
See Also
gaussian_filter, fast_almost_gaussian_filter, laplacian_filter, LaplacianOfGaussianFilter
Python API
def diff_of_gaussians_filter(self, raster: Raster, sigma1: float = 2.0, sigma2: float = 4.0) -> Raster:
Diversity Filter
Function name: diversity_filter
This tool assigns each cell in the output grid the number of different values in a moving window centred on each grid cell in the input raster. The input image should contain integer values but floating point data are allowable and will be handled by multiplying pixel values by 1000 and rounding. Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values, e.g. 3, 5, 7, 9... If the kernel filter size is the same in the x and y dimensions, the silent filter flag may be used instead (command-line interface only).
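The counting rule above, including the multiply-by-1000-and-round treatment of floating point values, can be sketched as follows. The function is a plain-Python stand-in for illustration, and truncating the window at the grid edges is an assumption.

```python
def diversity_filter(grid, size_x=3, size_y=3):
    """Count the distinct values in a moving window centred on each cell."""
    rows, cols = len(grid), len(grid[0])
    half_x, half_y = size_x // 2, size_y // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Floating point values are compared after scaling by 1000 and rounding.
            distinct = {round(grid[rr][cc] * 1000)
                        for rr in range(max(0, r - half_y), min(rows, r + half_y + 1))
                        for cc in range(max(0, c - half_x), min(cols, c + half_x + 1))}
            out[r][c] = len(distinct)
    return out

grid = [[1, 1, 2],
        [1, 3, 2],
        [4, 4, 2]]
diversity = diversity_filter(grid)
print(diversity[1][1])  # 4
```

Because of the scaling step, two floating point values that agree to three decimal places are counted as the same value in this sketch.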
See Also
majority_filter
Python API
def diversity_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Edge Preserving Mean Filter
Function name: edge_preserving_mean_filter
This tool performs a type of edge-preserving mean filter operation on an input image (input). The filter, a type of low-pass filter, can be used to emphasize the longer-range variability in an image, effectively acting to smooth the image and to reduce noise in the image. The algorithm calculates the average value in a moving window centred on each grid cell, including in the averaging only the set of neighbouring values for which the absolute value difference with the centre value is less than a specified threshold value (threshold). It is, therefore, similar to the bilateral_filter, except all neighbours within the threshold difference are equally weighted and neighbour distance is not accounted for. Filter kernels are always square, and the filter size is specified using the filter parameter. This dimension should be an odd, positive integer value, e.g. 3, 5, 7, 9...
This tool works with both greyscale and red-green-blue (RGB) input images. RGB images are decomposed into intensity-hue-saturation (IHS) and the filter is applied to the intensity channel. If an RGB image is input, the threshold value must be in the range 0.0-1.0 (more likely less than 0.15), where a value of 1.0 would result in an ordinary mean filter (mean_filter). NoData values in the input image are ignored during filtering.
See Also
mean_filter, bilateral_filter, gaussian_filter, median_filter, rgb_to_ihs
Python API
def edge_preserving_mean_filter(self, raster: Raster, filter_size: int = 11, threshold: float = 15.0) -> Raster:
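The thresholded averaging described for edge_preserving_mean_filter can be sketched as follows. This is a minimal illustration for a greyscale grid held as nested lists; the function name is hypothetical, windows are truncated at the grid edge (an assumption), and NoData and RGB handling are omitted.

```python
def edge_preserving_mean_sketch(grid, filter_size=3, threshold=15.0):
    """Average only the neighbours whose value is within threshold of the centre."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    rows, cols = len(grid), len(grid[0])
    half = filter_size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            centre = grid[r][c]
            total, count = 0.0, 0
            for rr in range(max(0, r - half), min(rows, r + half + 1)):
                for cc in range(max(0, c - half), min(cols, c + half + 1)):
                    if abs(grid[rr][cc] - centre) < threshold:
                        total += grid[rr][cc]
                        count += 1
            # count is at least 1 because the centre cell always qualifies.
            out[r][c] = total / count
    return out


# A smooth region (values near 10) next to a sharp edge (values of 100).
image = [
    [10, 12, 100],
    [11, 10, 100],
    [12, 11, 100],
]
smoothed = edge_preserving_mean_sketch(image, filter_size=3, threshold=15.0)
print(smoothed[1][1], smoothed[1][2])
```

The centre cell is averaged only with the values near 10, and the edge column keeps its value of 100, which is the edge-preserving behaviour the text describes.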
Emboss Filter
Function name: emboss_filter
This tool can be used to perform one of eight 3x3 emboss filters on a raster image. Like the sobel_filter and prewitt_filter, the emboss_filter is often applied in edge-detection applications. While these other two common edge-detection filters approximate the slope magnitude of the local neighbourhood surrounding each grid cell, the emboss_filter can be used to estimate the directional slope. The kernel weights for each of the eight available filters are as follows:
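North (n): [0, -1, 0], [0, 0, 0], [0, 1, 0]
Northeast (ne): [0, 0, -1], [0, 0, 0], [-1, 0, 0]
East (e): [0, 0, 0], [1, 0, -1], [0, 0, 0]
Southeast (se): [1, 0, 0], [0, 0, 0], [0, 0, -1]
South (s): [0, 1, 0], [0, 0, 0], [0, -1, 0]
Southwest (sw): [0, 0, 1], [0, 0, 0], [-1, 0, 0]
West (w): [0, 0, 0], [-1, 0, 1], [0, 0, 0]
Northwest (nw): [-1, 0, 0], [0, 0, 0], [0, 0, 1]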
The user must specify the direction; the options are 'n', 's', 'e', 'w', 'ne', 'se', 'nw', and 'sw'. The user may also optionally clip the output image distribution tails by a specified amount (e.g. 1%).
See Also
sobel_filter, prewitt_filter
Python API
def emboss_filter(self, raster: Raster, direction: str = "n", clip_amount: float = 0.0) -> Raster:
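To show how a 3x3 emboss kernel is applied, the sketch below convolves the North kernel (rows [0, -1, 0], [0, 0, 0], [0, 1, 0]) with a small grid. The helper name apply_kernel_3x3 is hypothetical, and the edge rule (reusing the nearest valid row or column) is an assumption rather than the tool's documented behaviour.

```python
# North emboss kernel, written as three rows (top to bottom).
NORTH = [
    [0, -1, 0],
    [0, 0, 0],
    [0, 1, 0],
]


def apply_kernel_3x3(grid, kernel):
    """Weighted sum of each cell's 3x3 neighbourhood, clamping at the edges."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += kernel[dr + 1][dc + 1] * grid[rr][cc]
            out[r][c] = acc
    return out


# Values increase from the top row to the bottom row.
ramp = [
    [0, 0, 0],
    [1, 1, 1],
    [2, 2, 2],
]
embossed = apply_kernel_3x3(ramp, NORTH)
print(embossed[1][1])
```

The response is the difference between the cell below and the cell above, so a surface with no north-south change gives zero everywhere.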
Fast Almost Gaussian Filter
Function name: fast_almost_gaussian_filter
The tool is somewhat modified from Dr. Kovesi's original Matlab code in that it works with both greyscale and RGB images (decomposes to HSI and uses the intensity data) and it handles the case of rasters that contain NoData values. This adds complexity to the original 20 additions and 5 multiplications assertion of the original paper.
Also note that for small values of sigma (< 1.8), you should probably just use the regular gaussian_filter tool.
Reference
P. Kovesi 2010 Fast Almost-Gaussian Filtering, Digital Image Computing: Techniques and Applications (DICTA), 2010 International Conference on.
Python API
def fast_almost_gaussian_filter(self, raster: Raster, sigma: float = 1.8) -> Raster:
Flip Image
Function name: flip_image
This tool can be used to flip, or reflect, an image (input) either vertically, horizontally, or both. The axis of reflection is specified using the direction parameter. The input image is not reflected in place; rather, the reflected image is stored in a separate output file.
Python API
def flip_image(self, raster: Raster, direction: str = "v") -> Raster:
Frangi Filter
Function name: frangi_filter
Experimental
Performs multiscale Frangi vesselness enhancement.
remote_sensing raster filter frangi_filter legacy-port
Parameters
Name | Description | Required | Default
input | Input raster path or typed raster object. | Required | input.tif
scales | List of Gaussian-like scales in pixels (default [1.0, 2.0, 3.0]). | Optional | [1.0, 2.0, 3.0]
beta | Frangi beta parameter for blob suppression (default 0.5). | Optional | 0.5
c | Frangi c parameter for structure sensitivity (default 15.0). | Optional | 15.0
output | Optional output path. If omitted, output remains in memory. | Optional | —
Examples
Applies frangi_filter to an input raster.
wbe.frangi_filter(input='image.tif', output='frangi_filter.tif')
Gabor Filter Bank
Function name: gabor_filter_bank
Experimental
Performs multi-orientation Gabor response filtering.
remote_sensing raster filter gabor_filter_bank legacy-port
Parameters
Name | Description | Required | Default
input | Input raster path or typed raster object. | Required | input.tif
sigma | Gaussian envelope sigma in pixels (default 2.0). | Optional | 2.0
frequency | Sinusoid spatial frequency in cycles/pixel (default 0.2). | Optional | 0.2
orientations | Number of orientations in the filter bank (default 6). | Optional | 6
output | Optional output path. If omitted, output remains in memory. | Optional | —
Examples
Applies gabor_filter_bank to an input raster.
wbe.gabor_filter_bank(input='image.tif', output='gabor_filter_bank.tif')
Gaussian Filter
Function name: gaussian_filter
This tool can be used to perform a Gaussian filter on a raster image. A Gaussian filter can be used to emphasize the longer-range variability in an image, effectively acting to smooth the image. This can be useful for reducing the noise in an image. The algorithm operates by convolving a kernel of weights with each grid cell and its neighbours in an image. The weights of the convolution kernel are determined by the 2-dimensional Gaussian (i.e. normal) curve, which gives stronger weighting to cells nearer the kernel centre. It is this characteristic that makes the Gaussian filter a more attractive option for image smoothing and noise reduction than the mean_filter. The size of the filter is determined by setting the standard deviation parameter (sigma), which is in units of grid cells; the larger the standard deviation, the larger the resulting filter kernel. The standard deviation can be any number in the range 0.5-20.
gaussian_filter works with both greyscale and red-green-blue (RGB) colour images. RGB images are decomposed into intensity-hue-saturation (IHS) and the filter is applied to the intensity channel. NoData values in the input image are ignored during processing.
Like many low-pass filters, Gaussian filtering can significantly blur well-defined edges in the input image. The edge_preserving_mean_filter and bilateral_filter offer more robust feature preservation during image smoothing. gaussian_filter is relatively slow compared to the fast_almost_gaussian_filter tool, which offers a fast-running approximation to a Gaussian filter for larger kernel sizes.
See Also
fast_almost_gaussian_filter, mean_filter, median_filter, rgb_to_ihs
Python API
def gaussian_filter(self, raster: Raster, sigma: float = 0.75) -> Raster:
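The relationship between sigma and the kernel weights can be sketched as below. This is an illustration only: the function name gaussian_kernel is hypothetical, and the cut-off of three standard deviations used to pick the kernel radius is an assumption, not the rule used by gaussian_filter.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Build a normalised 2-D Gaussian weight kernel for a given sigma (in cells)."""
    if radius is None:
        # Assumed cut-off: weights beyond about 3 sigma are negligible.
        radius = max(1, math.ceil(3 * sigma))
    weights = [
        [
            math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in range(-radius, radius + 1)
        ]
        for dy in range(-radius, radius + 1)
    ]
    total = sum(sum(row) for row in weights)
    # Normalise so the weights sum to 1 and the filter preserves mean brightness.
    return [[w / total for w in row] for row in weights]


small = gaussian_kernel(0.75)
large = gaussian_kernel(2.0)
print(len(small), len(large))  # kernel width grows with sigma
```

Cells nearer the kernel centre receive larger weights, and a larger sigma yields a larger kernel, matching the description above.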
GLCM Texture
Function name: glcm_texture
Description
Computes general-purpose local texture metrics from a single-band raster using a gray-level co-occurrence matrix (GLCM) within a moving window. Output is written as a multiband raster so that large metric sets remain manageable in Python/R APIs and QGIS.
Use features to choose which metrics are emitted. Supported feature names are contrast, dissimilarity, homogeneity, asm, energy, entropy, mean, variance, and correlation. Use direction_aggregation to combine directions (mean, min, max, range) or keep each direction as separate output bands (separate).
Angles are specified in degrees using a comma-separated list from 0,45,90,135. Increasing window_size and levels generally improves stability at higher computational cost.
Python API
def glcm_texture(self, input: Raster, window_size: int = 7, distance: int = 1, angles: str = "0,45,90,135", features: str = "contrast,homogeneity,energy,entropy", direction_aggregation: str = "mean", levels: int = 32, symmetric: bool = True) -> Raster:
Example
glcm = wbe.glcm_texture(
    input=raster,
    window_size=9,
    distance=1,
    angles="0,45,90,135",
    features="contrast,homogeneity,entropy",
    direction_aggregation="mean",
    levels=32,
    output="glcm_texture.tif",
)
See Also
image_segmentation, object_features_texture_glcm_basic
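For readers unfamiliar with the co-occurrence matrix itself, the sketch below builds a GLCM for a single offset inside one window and derives the contrast metric from it. It assumes the window values are already quantized to integers in the range 0 to levels - 1; the function names are hypothetical and the code is not taken from glcm_texture.

```python
def glcm(patch, levels, dx, dy, symmetric=True):
    """Normalised co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(patch), len(patch[0])
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dy, c + dx
            if 0 <= rr < rows and 0 <= cc < cols:
                i, j = patch[r][c], patch[rr][cc]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # also count the pair in reverse order
    total = sum(sum(row) for row in counts)
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """GLCM contrast: co-occurrence probability weighted by squared level difference."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


patch = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 2, 2],
]
p = glcm(patch, levels=3, dx=1, dy=0)  # angle 0 degrees, distance 1
print(contrast(p))
```

A uniform patch has zero contrast, while neighbouring cells with different grey levels raise it. Larger windows and more levels give more pairs per matrix, which is why they tend to produce more stable values at a higher cost.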
Guided Filter
Function name: guided_filter
Experimental
Performs edge-preserving guided filtering using local linear models.
remote_sensing raster filter guided_filter legacy-port
Parameters
Name | Description | Required | Default
input | Input raster path or typed raster object. | Required | input.tif
radius | Guided filter window radius in pixels (default 4). | Optional | 4
epsilon | Regularization parameter for local variance (default 0.01). | Optional | 0.01
output | Optional output path. If omitted, output remains in memory. | Optional | —
Examples
Applies guided_filter to an input raster.
wbe.guided_filter(input='image.tif', output='guided_filter.tif')
High Pass Bilateral Filter
Function name: high_pass_bilateral_filter
Experimental
Computes a high-pass residual by subtracting bilateral smoothing from the input raster.
raster image filter high-pass legacy-port
Parameters
Name | Description | Required | Default
input | Input raster path or typed raster object. | Required | input.tif
sigma_dist | Standard deviation of the spatial (distance) Gaussian kernel, in pixels (0.5–20.0, default 0.75). | Optional | 0.75
sigma_int | Standard deviation of the intensity Gaussian kernel, in raster-value units (default 1.0). | Optional | 1.0
treat_as_rgb | Set true to force HSI-intensity filtering for packed RGB rasters before high-pass differencing. | Optional | False
assume_three_band_rgb | When true (default), and no explicit color metadata is present, allow 3-band uint8/uint16 RGB interpretation. | Optional | True
output | Optional output file path. If omitted, output remains in memory. | Optional | —
Examples
Applies high-pass bilateral filtering to emphasize local texture.
wbe.high_pass_bilateral_filter(assume_three_band_rgb=True, input='image.tif', output='image_highpass_bilateral.tif', sigma_dist=1.5, sigma_int=25.0, treat_as_rgb=False)
High Pass Filter
Function name: high_pass_filter
This tool performs a high-pass filter on a raster image. High-pass filters can be used to emphasize the short-range variability in an image. The algorithm operates essentially by subtracting the average value in the surrounding neighbourhood (i.e. window) from the value at the grid cell at the centre of the window.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
See Also
high_pass_median_filter, mean_filter
Python API
def high_pass_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
High Pass Median Filter
Function name: high_pass_median_filter
This tool performs a high-pass median filter on a raster image. High-pass filters can be used to emphasize the short-range variability in an image. The algorithm operates essentially by subtracting the median value in the surrounding neighbourhood (i.e. window) from the value at the grid cell at the centre of the window.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
See Also
high_pass_filter, median_filter
Python API
def high_pass_median_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11, sig_digits: int = 2) -> Raster:
Integral Image Transform
Function name: integral_image_transform
This tool transforms an input raster image into an integral image, or summed area table. Integral images are the two-dimensional equivalent to a cumulative distribution function. Each pixel contains the sum of all pixels contained within the enclosing rectangle above and to the left of a pixel. Images with a very large number of grid cells will likely experience numerical overflow errors when converted to an integral image. Integral images are used in a wide variety of computer vision and digital image processing applications, including texture mapping. They allow for the efficient calculation of very large filters and are the basis of several of WhiteboxTools's image filters.
Reference
Crow, F. C. (1984, January). Summed-area tables for texture mapping. In ACM SIGGRAPH computer graphics (Vol. 18, No. 3, pp. 207-212). ACM.
Python API
def integral_image_transform(self, raster: Raster) -> Raster:
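The summed-area table and the way it supports constant-time window sums can be sketched in a few lines. The function names are hypothetical and the grid is a plain nested list; Python integers do not overflow, so the overflow caveat mentioned above does not show up in this sketch.

```python
def integral_image(grid):
    """Each cell holds the sum of all values above and to the left, inclusive."""
    rows, cols = len(grid), len(grid[0])
    table = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += grid[r][c]
            table[r][c] = row_sum + (table[r - 1][c] if r > 0 else 0)
    return table


def window_sum(table, r0, c0, r1, c1):
    """Sum of the original values in rows r0..r1 and columns c0..c1 (inclusive)."""
    total = table[r1][c1]
    if r0 > 0:
        total -= table[r0 - 1][c1]
    if c0 > 0:
        total -= table[r1][c0 - 1]
    if r0 > 0 and c0 > 0:
        total += table[r0 - 1][c0 - 1]
    return total


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(grid)
print(window_sum(table, 1, 1, 2, 2))  # 5 + 6 + 8 + 9
```

Any rectangular sum needs at most four table look-ups regardless of window size, which is why filters built on integral images can have a run time that does not depend on filter size.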
K Nearest Mean Filter
Function name: k_nearest_mean_filter
This tool performs a k-nearest mean filter on a raster image. A mean filter can be used to emphasize the longer-range variability in an image, effectively acting to smooth or blur the image. This can be useful for reducing the noise in an image. The algorithm operates by calculating the average of a specified number (k) of values in a moving window centred on each grid cell. The k values used in the average are those cells in the window with the nearest intensity values to that of the centre cell. As such, this is a type of edge-preserving smoothing filter. The bilateral_filter and edge_preserving_mean_filter are examples of more sophisticated edge-preserving smoothing filters.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
NoData values in the input image are ignored during filtering.
See Also
mean_filter, bilateral_filter, edge_preserving_mean_filter
Python API
def k_nearest_mean_filter(self, raster: Raster, filter_size_x: int = 3, filter_size_y: int = 3, k: int = 5) -> Raster:
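The selection of the k nearest intensity values can be sketched as follows. The function name is hypothetical, and two details are assumptions made for the illustration: the centre cell is treated as part of its own window, and windows are truncated at the grid edge.

```python
def k_nearest_mean_sketch(grid, filter_size=3, k=5):
    """Average the k window values closest in intensity to the centre cell."""
    rows, cols = len(grid), len(grid[0])
    half = filter_size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            centre = grid[r][c]
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            # Sort by absolute difference from the centre and keep the first k.
            nearest = sorted(window, key=lambda v: abs(v - centre))[:k]
            out[r][c] = sum(nearest) / len(nearest)
    return out


image = [
    [10, 12, 100],
    [11, 10, 100],
    [12, 11, 100],
]
smoothed = k_nearest_mean_sketch(image, filter_size=3, k=5)
print(smoothed[1][1])
```

For the centre cell, the five selected values are 10, 10, 11, 11 and 12, so the bright column on the right does not leak into the average.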
Kuwahara Filter
Function name: kuwahara_filter
Experimental
Performs edge-preserving Kuwahara filtering using minimum-variance subwindows.
remote_sensing raster filter kuwahara_filter legacy-port
Parameters
Name | Description | Required | Default
input | Input raster path or typed raster object. | Required | input.tif
radius | Kuwahara quadrant radius in pixels (default 2). | Optional | 2
output | Optional output path. If omitted, output remains in memory. | Optional | —
Examples
Applies kuwahara_filter to an input raster.
wbe.kuwahara_filter(input='image.tif', output='kuwahara_filter.tif')
Lee Filter
Function name: lee_filter
The Lee Sigma filter is a low-pass filter used to smooth the input image (input). The user must specify the dimensions of the filter (filterx and filtery) as well as the sigma (sigma) and M (m) parameters.
Reference
Lee, J. S. (1983). Digital image smoothing and the sigma filter. Computer vision, graphics, and image processing, 24(2), 255-269.
See Also
mean_filter, gaussian_filter
Python API
def lee_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11, sigma: float = 10.0, m_value: float = 5.0) -> Raster:
Line Detection Filter
Function name: line_detection_filter
This tool can be used to perform one of four 3x3 line-detection filters on a raster image. These filters can be used to find one-cell-thick vertical, horizontal, or angled (135-degrees or 45-degrees) lines in an image. Notice that line-finding is a similar application to edge-detection. Common edge-detection filters include the Sobel and Prewitt filters. The kernel weights for each of the four line-detection filters are as follows:
'v' (Vertical): [-1, 2, -1], [-1, 2, -1], [-1, 2, -1]
'h' (Horizontal): [-1, -1, -1], [2, 2, 2], [-1, -1, -1]
'45' (Northeast-Southwest): [-1, -1, 2], [-1, 2, -1], [2, -1, -1]
'135' (Northwest-Southeast): [2, -1, -1], [-1, 2, -1], [-1, -1, 2]
The user must specify the variant, including 'v', 'h', '45', and '135', for vertical, horizontal, northeast-southwest, and northwest-southeast directions respectively. The user may also optionally clip the output image distribution tails by a specified amount (e.g. 1%).
See Also
prewitt_filter, sobel_filter
Python API
def line_detection_filter(self, raster: Raster, variant: str = "v", abs_values: bool = False, clip_tails: float = 0.0) -> Raster:
Line Thinning
Function name: line_thinning
This image processing tool reduces all polygons in a Boolean raster image to their single-cell wide skeletons. This operation is sometimes called line thinning or skeletonization. In fact, the input image need not be truly Boolean (i.e. contain only 1's and 0's). All non-zero, positive values are considered to be foreground pixels while all zero valued cells are considered background pixels. The remove_spurs tool is useful for cleaning up an image before performing a line thinning operation.
Note: Unlike other filter-based operations in WhiteboxTools, this algorithm can't easily be parallelized because the output raster must be read and written to during the same loop.
See Also
remove_spurs, thicken_raster_line
Python API
def line_thinning(self, raster: Raster) -> Raster:
LiDAR Ground Point Filter
Function name: lidar_ground_point_filter
This tool can be used to perform a slope-based classification, or filtering (i.e. removal), of non-ground points within a LiDAR point-cloud. The user must specify the name of the input and output LiDAR files (input and output). Inter-point slopes are compared between pairs of points contained within local neighbourhoods of size radius. Neighbourhoods with fewer than the user-specified minimum number of points (min_neighbours) are extended until the minimum point number is equalled or exceeded. Points that are above neighbouring points by the minimum height (height_threshold) and have an inter-point slope greater than the user-specified threshold (slope_threshold) are considered non-ground points; depending on the classify option, they are either excluded from the output point-cloud or assigned the unclassified (value 1) class value.
Slope-based ground-point classification methods suffer from the challenge of using a constant slope threshold under varying terrain slopes. Some researchers have developed schemes for varying the slope threshold based on underlying terrain slopes. lidar_ground_point_filter instead allows the user to optionally (slope_norm) normalize the underlying terrain (i.e. flatten the terrain) using a white top-hat transform. A constant slope threshold may then be used without contributing to poorer performance under steep topography. Note that this option, while useful in rugged terrain, is computationally intensive. If the point-cloud is of a relatively flat terrain, this option may be excluded.
While this tool is appropriately applied to LiDAR point-clouds, the remove_off_terrain_objects tool can be used to remove off-terrain objects from rasterized LiDAR digital elevation models (DEMs).
Reference
Vosselman, G. (2000). Slope based filtering of laser altimetry data. International Archives of Photogrammetry and Remote Sensing, 33(B3/2; PART 3), 935-942.
See Also
improved_ground_point_filter, remove_off_terrain_objects
Python API
def lidar_ground_point_filter(self, input_lidar: Optional[Lidar], search_radius: float = 2.0, min_neighbours: int = 0, slope_threshold: float = 45.0, height_threshold: float = 1.0, classify: bool = False, slope_norm: bool = True, height_above_ground: bool = False) -> Lidar:
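The core slope test can be sketched with a brute-force neighbour search over a handful of points. This is a simplified illustration, not the tool's algorithm: the function name is hypothetical, the slope threshold is assumed to be in degrees, and the neighbourhood extension (min_neighbours) and terrain normalization (slope_norm) steps are left out.

```python
import math


def flag_non_ground(points, radius=2.0, slope_threshold=45.0, height_threshold=1.0):
    """Return True for each (x, y, z) point that sits steeply above a neighbour."""
    tan_limit = math.tan(math.radians(slope_threshold))
    flags = []
    for i, (x, y, z) in enumerate(points):
        off_terrain = False
        for j, (nx, ny, nz) in enumerate(points):
            if i == j:
                continue
            dist = math.hypot(x - nx, y - ny)
            if dist == 0.0 or dist > radius:
                continue  # skip coincident points and points outside the neighbourhood
            rise = z - nz
            if rise > height_threshold and rise / dist > tan_limit:
                off_terrain = True
                break
        flags.append(off_terrain)
    return flags


points = [
    (0.0, 0.0, 10.0),   # ground
    (1.0, 0.0, 10.1),   # ground
    (0.0, 1.0, 10.05),  # ground
    (0.5, 0.5, 15.0),   # roof or canopy return
]
flags = flag_non_ground(points)
print(flags)
```

A production implementation would use a spatial index rather than comparing every pair of points.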
Majority Filter
Function name: majority_filter
This tool performs a majority (modal) filter on an input image (input). A majority filter assigns to each cell in the output grid the most frequently occurring value (i.e. the mode) of the values contained within a moving window centred on each grid cell.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
See Also
total_filter
Python API
def majority_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Maximum Filter
Function name: maximum_filter
This tool assigns each cell in the output grid the maximum value in a moving window centred on each grid cell in the input raster (input). A maximum filter is the equivalent of the mathematical morphological dilation operator.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values, e.g. 3, 5, 7, 9... If the kernel filter size is the same in the x and y dimensions, the single filter flag may be used instead (command-line interface only).
This tool takes advantage of the redundancy between overlapping, neighbouring filters to enhance computational efficiency. Like most of WhiteboxTools' filters, it is also parallelized for further efficiency.
See Also
minimum_filter
Python API
def maximum_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Mean Filter
Function name: mean_filter
This tool performs a mean filter operation on a raster image. A mean filter, a type of low-pass filter, can be used to emphasize the longer-range variability in an image, effectively acting to smooth the image. This can be useful for reducing the noise in an image. This tool utilizes an integral image approach (Crow, 1984) to ensure highly efficient filtering that is invariant to filter size. The algorithm operates by calculating the average value in a moving window centred on each grid cell. Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values, e.g. 3, 5, 7, 9... If the kernel filter size is the same in the x and y dimensions, the single filter flag may be used instead (command-line interface only).
Although commonly applied in digital image processing, mean filters are generally considered to be quite harsh, with respect to their impact on the image, compared to other smoothing filters such as the edge-preserving smoothing filters including the bilateral_filter, median_filter, olympic_filter, edge_preserving_mean_filter and even gaussian_filter.
This tool works with both greyscale and red-green-blue (RGB) images. RGB images are decomposed into intensity-hue-saturation (IHS) and the filter is applied to the intensity channel. NoData values in the input image are ignored during filtering. NoData values are assigned to all sites beyond the raster.
Reference
Crow, F. C. (1984, January). Summed-area tables for texture mapping. In ACM SIGGRAPH computer graphics (Vol. 18, No. 3, pp. 207-212). ACM.
See Also
bilateral_filter, edge_preserving_mean_filter, gaussian_filter, median_filter, rgb_to_ihs
Python API
def mean_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Median Filter
Function name: median_filter
This tool performs a median filter on a raster image. Median filters, a type of low-pass filter, can be used to emphasize the longer-range variability in an image, effectively acting to smooth the image. This can be useful for reducing the noise in an image. The algorithm operates by calculating the median value (middle value in a sorted list) in a moving window centred on each grid cell. Specifically, this tool uses the efficient running-median filtering algorithm of Huang et al. (1979). The median value is not influenced by anomalously high or low values in the distribution to the extent that the average is. As such, the median filter is far less sensitive to shot noise in an image than the mean filter.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
Reference
Huang, T., Yang, G. and Tang, G., 1979. A fast two-dimensional median filtering algorithm. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(1), pp.13-18.
See Also
bilateral_filter, edge_preserving_mean_filter, gaussian_filter, mean_filter
Python API
def median_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11, sig_digits: int = 2) -> Raster:
Minimum Filter
Function name: minimum_filter
This tool assigns each cell in the output grid the minimum value in a moving window centred on each grid cell in the input raster (input). A minimum filter is the equivalent of the mathematical morphological erosion operator.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values, e.g. 3, 5, 7, 9... If the kernel filter size is the same in the x and y dimensions, the single filter flag may be used instead (command-line interface only).
This tool takes advantage of the redundancy between overlapping, neighbouring filters to enhance computational efficiency. Like most of WhiteboxTools' filters, it is also parallelized for further efficiency.
See Also
maximum_filter
Python API
def minimum_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Non Local Means Filter
Function name: non_local_means_filter
Experimental
Performs non-local means denoising using patch similarity weighting.
remote_sensing raster filter non_local_means_filter legacy-port
Parameters
Name | Description | Required | Default
input | Input raster path or typed raster object. | Required | input.tif
search_radius | Search window radius in pixels (default 5). | Optional | 5
patch_radius | Patch radius in pixels (default 1). | Optional | 1
h | Filtering strength parameter (default 10.0). | Optional | 10.0
output | Optional output path. If omitted, output remains in memory. | Optional | —
Examples
Applies non_local_means_filter to an input raster.
wbe.non_local_means_filter(input='image.tif', output='non_local_means_filter.tif')
Opening
Function name: opening
This tool performs an opening operation on an input greyscale image (input). An opening is a mathematical morphology operation involving a dilation (maximum filter) of an erosion (minimum filter) set. Opening operations, together with the closing operation, are frequently used in the fields of computer vision and digital image processing for image noise removal. The user must specify the size of the moving window in both the x and y directions (filterx and filtery).
See Also
closing, tophat_transform
Python API
def opening(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
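The composition of an opening from a minimum filter followed by a maximum filter can be sketched as below. The helper names are hypothetical and windows are truncated at the grid edge, which is an assumption made for the illustration.

```python
def window_filter(grid, size_x, size_y, reducer):
    """Apply reducer (for example min or max) to the window around every cell."""
    rows, cols = len(grid), len(grid[0])
    hx, hy = size_x // 2, size_y // 2
    return [
        [
            reducer(
                grid[rr][cc]
                for rr in range(max(0, r - hy), min(rows, r + hy + 1))
                for cc in range(max(0, c - hx), min(cols, c + hx + 1))
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def opening_sketch(grid, size_x=3, size_y=3):
    """Erosion (minimum filter) followed by dilation (maximum filter)."""
    eroded = window_filter(grid, size_x, size_y, min)
    return window_filter(eroded, size_x, size_y, max)


# A 3x3 bright block plus one isolated bright "noise" cell in the corner.
image = [[0] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 9
image[5][5] = 9

opened = opening_sketch(image)
print(opened[2][2], opened[5][5])
```

The block, which is at least as large as the window, survives the opening, while the isolated cell is removed. This is the noise-removal behaviour mentioned above.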
Olympic Filter
Function name: olympic_filter
This filter is a modification of the mean_filter, whereby the highest and lowest values in the kernel are dropped, and the remaining values are averaged to replace the central pixel. The result is a low-pass smoothing filter that is more robust than the mean_filter, which is more strongly impacted by the presence of outlier values. It is named after a system of scoring Olympic events.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
See Also
mean_filter
Python API
def olympic_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
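The trimming step that distinguishes the Olympic filter from a plain mean can be sketched as follows. The function names are hypothetical, windows are truncated at the grid edge (an assumption), and very small windows fall back to a plain mean so that something is always left to average.

```python
def olympic_mean(values):
    """Mean of the values after dropping one lowest and one highest value."""
    if len(values) <= 2:
        return sum(values) / len(values)
    trimmed = sorted(values)[1:-1]
    return sum(trimmed) / len(trimmed)


def olympic_filter_sketch(grid, size_x=3, size_y=3):
    rows, cols = len(grid), len(grid[0])
    hx, hy = size_x // 2, size_y // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - hy), min(rows, r + hy + 1))
                for cc in range(max(0, c - hx), min(cols, c + hx + 1))
            ]
            out[r][c] = olympic_mean(window)
    return out


# The centre value of 100 is an outlier.
image = [
    [1, 2, 3],
    [4, 100, 6],
    [7, 8, 9],
]
filtered = olympic_filter_sketch(image)
print(filtered[1][1])
```

With the outlier dropped, the centre value becomes 39/7 (about 5.57), whereas a plain mean of the same window would be 140/9 (about 15.6).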
Percentile Filter
Function name: percentile_filter
This tool calculates the percentile of the centre cell in a moving filter window applied to an input image (input). This indicates the value below which a given percentage of the neighbouring values within the filter window fall. For example, the 35th percentile is the value below which 35% of the neighbouring values in the filter window may be found. As such, the percentile of a pixel value is indicative of the relative location of the site within the statistical distribution of values contained within a filter window. When applied to input digital elevation models, percentile is a measure of local topographic position, or elevation residual.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values, e.g. 3, 5, 7, 9... If the kernel filter size is the same in the x and y dimensions, the single filter flag may be used instead (command-line interface only).
This tool takes advantage of the redundancy between overlapping, neighbouring filters to enhance computational efficiency, using a method similar to Huang et al. (1979). This efficient method of calculating percentiles requires rounding of floating-point inputs, and therefore the user must specify the number of significant digits (sig_digits) to be used during the processing. Like most of WhiteboxTools' filters, this tool is also parallelized for further efficiency.
Reference
Huang, T., Yang, G. and Tang, G., 1979. A fast two-dimensional median filtering algorithm. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(1), pp.13-18.
See Also
median_filter
Python API
def percentile_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11, sig_digits: int = 2) -> Raster:
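The percentile of the centre cell can be sketched directly as the share of window values that fall below it. This brute-force version is for illustration only: the function name is hypothetical, the centre cell is counted in the window total (an assumption), windows are truncated at the grid edge, and the histogram-based speed-up and sig_digits rounding are omitted.

```python
def percentile_filter_sketch(grid, size_x=3, size_y=3):
    """Percentage of window values that are lower than the centre value."""
    rows, cols = len(grid), len(grid[0])
    hx, hy = size_x // 2, size_y // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - hy), min(rows, r + hy + 1))
                for cc in range(max(0, c - hx), min(cols, c + hx + 1))
            ]
            below = sum(1 for v in window if v < grid[r][c])
            out[r][c] = 100.0 * below / len(window)
    return out


dem = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
position = percentile_filter_sketch(dem)
print(position[0][0], position[2][2])
```

On an elevation grid, low values mark cells that are low relative to their surroundings (here the top-left corner) and high values mark locally elevated cells (the bottom-right corner).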
Range Filter
Function name: range_filter
This tool performs a range filter on an input image (input). A range filter assigns to each cell in the output grid the range (maximum - minimum) of the values contained within a moving window centred on each grid cell.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
See Also
total_filter
Python API
def range_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Remove Spurs
Function name: remove_spurs
This image processing tool removes small irregularities (i.e. spurs) on the boundaries of objects in a Boolean input raster image (input). This operation is sometimes called pruning. Remove Spurs is a useful tool for cleaning an image before performing a line thinning operation. In fact, the input image need not be truly Boolean (i.e. contain only 1's and 0's). All non-zero, positive values are considered to be foreground pixels while all zero valued cells are considered background pixels.
Note: Unlike other filter-based operations in WhiteboxTools, this algorithm can't easily be parallelized because the output raster must be read and written to during the same loop.
See Also
line_thinning
Python API
def remove_spurs(self, raster: Raster, max_iterations: int = 10) -> Raster:
Savitzky Golay 2D Filter
Function name: savitzky_golay_2d_filter
Experimental
Performs 2D Savitzky-Golay smoothing.
remote_sensing raster filter savitzky_golay_2d_filter legacy-port
Parameters
NameDescriptionRequiredDefault
input | Input raster path or typed raster object. | Required | input.tif
window_size | Odd window size (default 5). Currently supports 5 for polynomial order 2. | Optional | 5
output | Optional output path. If omitted, output remains in memory. | Optional | —
Examples
Applies savitzky_golay_2d_filter to an input raster.
wbe.savitzky_golay_2d_filter(input='image.tif', output='savitzky_golay_2d_filter.tif')
Scharr Filter
Function name: scharr_filter
This tool performs a Scharr edge-detection filter on a raster image. The Scharr filter is similar to the sobel_filter and prewitt_filter, in that it identifies areas of high slope in the input image through the calculation of slopes in the x and y directions. A 3 × 3 Scharr filter uses the following schemes to calculate x and y slopes:
X-direction slope: [3, 0, -3], [10, 0, -10], [3, 0, -3]
Y-direction slope: [3, 10, 3], [0, 0, 0], [-3, -10, -3]
Each grid cell in the output image is assigned the square root of the sum of the squared x and y slopes.
The output image may be overwhelmed by a relatively small number of high-valued pixels, stretching the palette. The user may therefore optionally clip the output image distribution tails by a specified amount (clip) for improved visualization.
See Also
sobel_filter, prewitt_filter
Python API
def scharr_filter(self, raster: Raster, clip_tails: float = 0.0) -> Raster:
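The combination of the two directional responses into a single edge magnitude can be sketched as below, using the 3 × 3 Scharr weights given above. The function name is hypothetical, edge cells reuse the nearest valid row or column (an assumption), and tail clipping is not included.

```python
import math

# Scharr weights for the x and y slopes, rows listed top to bottom.
SCHARR_X = [[3, 0, -3], [10, 0, -10], [3, 0, -3]]
SCHARR_Y = [[3, 10, 3], [0, 0, 0], [-3, -10, -3]]


def scharr_magnitude(grid):
    """Edge strength: square root of the sum of the squared x and y slopes."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    gx += SCHARR_X[dr + 1][dc + 1] * grid[rr][cc]
                    gy += SCHARR_Y[dr + 1][dc + 1] * grid[rr][cc]
            out[r][c] = math.sqrt(gx * gx + gy * gy)
    return out


# A vertical step edge: the right-hand column is brighter.
step = [
    [0, 0, 10],
    [0, 0, 10],
    [0, 0, 10],
]
edges = scharr_magnitude(step)
print(edges[1][1])
```

For this step edge the y slope is zero and the x slope has magnitude 160, so the output at the centre cell is 160. A few such high values can dominate the display range, which is the motivation for the optional tail clipping.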
Standard Deviation Filter
Function name: standard_deviation_filter
This tool performs a standard deviation filter on an input image (input). A standard deviation filter assigns to each cell in the output grid the standard deviation, a measure of dispersion, of the values contained within a moving window centred on each grid cell.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
See Also
range_filter, total_filter
Python API
def standard_deviation_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Thicken Raster Line
Function name: thicken_raster_line
This image processing tool can be used to thicken single-cell wide lines within a raster file along diagonal sections of the lines. Because of the limitation of the raster data format, single-cell wide raster lines can be traversed along diagonal sections without passing through a line grid cell. This causes problems for various raster analysis functions for which lines are intended to be barriers. This tool will thicken raster lines, such that it is impossible to cross a line without passing through a line grid cell. While this can also be achieved using a maximum filter, unlike the filter approach, this tool will result in the smallest possible thickening to achieve the desired result.
All non-zero, positive values are considered to be foreground pixels while all zero valued cells or NoData cells are considered background pixels.
Note: Unlike other filter-based operations in WhiteboxTools, this algorithm can't easily be parallelized because the output raster must be read and written to during the same loop.
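The diagonal-gap problem can be illustrated with a short sketch. This is not the tool's algorithm; it is a minimal, hypothetical routine (thicken_diagonals) that scans every 2 × 2 block and fills one cell wherever two foreground cells touch only at a corner. Like the note above describes, it reads from and writes to the same output grid within one loop.

```python
def has_diagonal_gap(grid):
    """True if any 2 x 2 block has foreground cells touching only diagonally."""
    for r in range(len(grid) - 1):
        for c in range(len(grid[0]) - 1):
            nw, ne = grid[r][c], grid[r][c + 1]
            sw, se = grid[r + 1][c], grid[r + 1][c + 1]
            if (nw and se and not ne and not sw) or (ne and sw and not nw and not se):
                return True
    return False


def thicken_diagonals(grid):
    """Fill one cell per diagonal-only contact (illustrative, hypothetical)."""
    # Positive values are foreground; zero and None (NoData) are background.
    out = [[1 if (v is not None and v > 0) else 0 for v in row] for row in grid]
    for r in range(len(out) - 1):
        for c in range(len(out[0]) - 1):
            nw, ne = out[r][c], out[r][c + 1]
            sw, se = out[r + 1][c], out[r + 1][c + 1]
            if nw and se and not ne and not sw:
                out[r + 1][c] = 1      # close the NW-SE diagonal step
            elif ne and sw and not nw and not se:
                out[r + 1][c + 1] = 1  # close the NE-SW diagonal step
    return out


diagonal_line = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
thickened = thicken_diagonals(diagonal_line)
```

Only one cell is added per diagonal step, which is far less thickening than a 3 × 3 maximum filter would produce.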
See Also
line_thinning
Python API
def thicken_raster_line(self, raster: Raster) -> Raster:
Tophat Transform
Function name: tophat_transform
This tool performs either a white or black top-hat transform on an input image. A top-hat transform is a common digital image processing operation used for various tasks, such as feature extraction, background equalization, and image enhancement. The size of the rectangular structuring element used in the filtering can be specified using the filterx and filtery flags.
There are two distinct types of top-hat transform including white and black top-hat transforms. The white top-hat transform is defined as the difference between the input image and its opening by some structuring element. An opening operation is the dilation (maximum filter) of an erosion (minimum filter) image. The black top-hat transform, by comparison, is defined as the difference between the closing and the input image. The user specifies which of the two flavours of top-hat transform the tool should perform by specifying either 'white' or 'black' with the variant flag.
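The two definitions above can be sketched with plain-Python minimum and maximum filters over a rectangular window. This is an illustration under simple assumptions (windows truncated at the grid edge, no NoData handling); the helper names are hypothetical and it is not the tool's implementation.

```python
def window_filter(grid, size_x, size_y, reducer):
    """Apply min or max over a rectangular window centred on each cell."""
    half_x, half_y = size_x // 2, size_y // 2
    rows, cols = len(grid), len(grid[0])
    return [
        [
            reducer(
                grid[rr][cc]
                for rr in range(max(0, r - half_y), min(rows, r + half_y + 1))
                for cc in range(max(0, c - half_x), min(cols, c + half_x + 1))
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def opening(grid, sx, sy):
    # Erosion (minimum filter) followed by dilation (maximum filter).
    return window_filter(window_filter(grid, sx, sy, min), sx, sy, max)


def closing(grid, sx, sy):
    # Dilation (maximum filter) followed by erosion (minimum filter).
    return window_filter(window_filter(grid, sx, sy, max), sx, sy, min)


def tophat(grid, sx=3, sy=3, variant="white"):
    if variant == "white":
        ref = opening(grid, sx, sy)
        return [[v - o for v, o in zip(row, orow)] for row, orow in zip(grid, ref)]
    if variant == "black":
        ref = closing(grid, sx, sy)
        return [[c - v for v, c in zip(row, crow)] for row, crow in zip(grid, ref)]
    raise ValueError("variant must be 'white' or 'black'")


bright_spot = [[10] * 5 for _ in range(5)]
bright_spot[2][2] = 50
dark_pit = [[10] * 5 for _ in range(5)]
dark_pit[2][2] = 2

white = tophat(bright_spot, variant="white")
black = tophat(dark_pit, variant="black")
```

The white variant isolates small bright features (the spot), and the black variant isolates small dark features (the pit); everything larger than the structuring element is suppressed.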
See Also
closing, opening, maximum_filter, minimum_filter
Python API
def tophat_transform(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11, variant: str = "white") -> Raster:
Total Filter
Function name: total_filter
This tool performs a total filter on an input image. A total filter assigns to each cell in the output grid the total (sum) of all values in a moving window centred on each grid cell.
Neighbourhood size, or filter size, is specified in the x and y dimensions using the filterx and filtery flags. These dimensions should be odd, positive integer values (e.g. 3, 5, 7, 9, etc.).
See Also
range_filter
Python API
def total_filter(self, raster: Raster, filter_size_x: int = 11, filter_size_y: int = 11) -> Raster:
Unsharp Masking
Function name: unsharp_masking
Unsharp masking is an image edge-sharpening technique commonly applied in digital image processing. Admittedly, the name 'unsharp' seems somewhat counter-intuitive given the purpose of the filter, which is to enhance the definition of edge features within the input image (input). This name comes from the use of a blurred, or unsharpened, intermediate image (mask) in the process. The blurred image is combined with the positive (original) image, creating an image that exhibits enhanced feature definition. A caution is needed in that the output image, although clearer, may be a less accurate representation of the image's subject. The output may also contain more speckle than the input image.
In addition to the input (input) and output image files, the user must specify the values of three parameters: the standard deviation distance (sigma), which is a measure of the filter size in pixels; the amount (amount), a percentage value that controls the magnitude of each overshoot at edges; and lastly, the threshold (threshold), which controls the minimal brightness change that will be sharpened. Pixels whose values change by less than the threshold after the calculation of the filter are left unmodified in the output image.
unsharp_masking works with both greyscale and red-green-blue (RGB) colour images. RGB images are decomposed into intensity-hue-saturation (IHS) and the filter is applied to the intensity channel. Note that the intensity values range from 0 to 1, which is important when setting the threshold value for colour images. NoData values in the input image are ignored during processing.
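The roles of sigma, amount, and threshold can be illustrated on a one-dimensional profile. The sketch below assumes the common formulation output = input + (amount / 100) × (input − blurred), applied only where the absolute difference exceeds the threshold; the manual does not give the tool's exact formula, so treat this as an assumption. The function names are hypothetical.

```python
import math


def gaussian_blur_1d(signal, sigma):
    """Blur a 1D list with a normalized Gaussian kernel (edges are clamped)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    weights = [w / total for w in weights]
    n = len(signal)
    return [
        sum(w * signal[min(max(i + k - radius, 0), n - 1)] for k, w in enumerate(weights))
        for i in range(n)
    ]


def unsharp_1d(signal, sigma=0.75, amount=100.0, threshold=0.0):
    """Assumed formulation: add back a scaled (input - blurred) difference."""
    blurred = gaussian_blur_1d(signal, sigma)
    out = []
    for value, blur in zip(signal, blurred):
        diff = value - blur
        out.append(value + (amount / 100.0) * diff if abs(diff) > threshold else value)
    return out


profile = [10.0] * 5 + [20.0] * 5  # a step edge between indices 4 and 5
sharpened = unsharp_1d(profile)
untouched = unsharp_1d(profile, threshold=100.0)
```

The sharpened profile undershoots just before the edge and overshoots just after it, which is the overshoot that amount scales; a large threshold leaves the profile unchanged.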
See Also
gaussian_filter, high_pass_filter
Python API
def unsharp_masking(self, raster: Raster, sigma: float = 0.75, amount: float = 100.0, threshold: float = 0.0) -> Raster:
User Defined Weights Filter
Function name: user_defined_weights_filter
NoData values in the input image are ignored during the convolution operation. This can lead to unexpected behavior at the edges of images (since the default behavior is to return NoData when addressing cells beyond the grid edge) and where the grid contains interior areas of NoData values. Normalization of kernel weights can be useful for handling the edge effects associated with interior areas of NoData values. When the normalization option is selected, the sum of the cell value-weight product is divided by the sum of the weights on a cell-by-cell basis. Therefore, if the kernel at a particular grid cell contains neighboring cells of NoData values, normalization effectively re-adjusts the weighting to account for the missing data values. Normalization also ensures that the output image will possess values within the range of the input image and allows the user to specify integer value weights in the kernel. However, note that this implies that the sum of weights should equal one. In some cases, alternative sums (e.g. zero) are more appropriate, and as such normalization should not be applied in these cases.
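The effect of normalization on NoData cells can be sketched for a single kernel position. The function below is a hypothetical illustration, with None standing in for NoData; it divides the sum of value-weight products by the sum of the weights of the valid cells, as described above.

```python
def weighted_value(window, weights, normalize_weights=False):
    """Kernel response for one cell; None marks NoData and is skipped."""
    total = 0.0
    weight_sum = 0.0
    for window_row, weight_row in zip(window, weights):
        for value, weight in zip(window_row, weight_row):
            if value is None:
                continue  # NoData cells contribute neither value nor weight
            total += value * weight
            weight_sum += weight
    if normalize_weights and weight_sum != 0:
        return total / weight_sum
    return total


integer_weights = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
window_with_gaps = [[4, None, 4], [4, 4, 4], [None, 4, 4]]

raw = weighted_value(window_with_gaps, integer_weights)
normalized = weighted_value(window_with_gaps, integer_weights, normalize_weights=True)
```

With normalization the result stays within the range of the input values even though two cells are missing and the weights are integers. For a zero-sum kernel, such as an edge detector, the division is skipped here because the weight sum is zero, which is why normalization is unsuitable in those cases.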
Python API
def user_defined_weights_filter(self, raster: Raster, weights: List[List[float]], kernel_center: str = "center", normalize_weights: bool = False) -> Raster:
Wiener Filter
Function name: wiener_filter
Experimental
Performs adaptive Wiener denoising using local mean and variance.
remote_sensing raster filter wiener_filter legacy-port
Parameters
Name | Description | Required | Default
input | Input raster path or typed raster object. | Required | input.tif
radius | Wiener local window radius in pixels (default 2). | Optional | 2
noise_variance | Optional additive noise variance. If omitted, estimated from local variance map. | Optional | -
output | Optional output path. If omitted, output remains in memory. | Optional | -
Examples
Applies wiener_filter to an input raster.
wbe.wiener_filter(input='image.tif', output='wiener_filter.tif')
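For readers unfamiliar with adaptive Wiener denoising, the following one-dimensional sketch shows the textbook local-statistics form: each value is pulled towards its local mean, more strongly where the local variance is close to the noise variance. The manual does not document the tool's exact formula, so the shrinkage rule and the noise estimate (mean of the local variances) used here are assumptions, and the function name is hypothetical.

```python
def wiener_1d(signal, radius=2, noise_variance=None):
    """Textbook adaptive Wiener filter on a 1D list (illustrative only)."""
    n = len(signal)
    means, variances = [], []
    for i in range(n):
        window = signal[max(0, i - radius): i + radius + 1]
        mean = sum(window) / len(window)
        means.append(mean)
        variances.append(sum((x - mean) ** 2 for x in window) / len(window))
    if noise_variance is None:
        # Assumption: estimate the noise as the average local variance.
        noise_variance = sum(variances) / n
    out = []
    for value, mean, var in zip(signal, means, variances):
        if var <= noise_variance:
            out.append(mean)  # locally flat: keep only the local mean
        else:
            out.append(mean + (1.0 - noise_variance / var) * (value - mean))
    return out


constant = wiener_1d([5.0] * 7)
noisy_step = [1.0, 1.2, 0.9, 1.1, 9.0, 9.1, 8.8, 9.2]
passthrough = wiener_1d(noisy_step, noise_variance=0.0)
smoothed = wiener_1d(noisy_step, noise_variance=1e9)
```

With a noise variance of zero the input passes through unchanged, and with a very large noise variance every cell collapses to its local mean; practical settings fall between these extremes.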
Edge and Feature Detection
Canny Edge Detection
Function name: canny_edge_detection
Description
This tool performs a Canny edge-detection filtering operation on an input image (input). The Canny edge-detection filter is a multi-stage filter that combines a Gaussian filtering (gaussian_filter) operation with various thresholding operations to generate an output raster (output) of single-cell wide edges. The sigma parameter, measured in grid cells, determines the size of the Gaussian filter kernel. The low and high parameters determine the characteristics of the thresholding steps; both parameters range from 0.0 to 1.0.
By default, the output raster will be Boolean, with 1's designating edge cells. It is possible, using the add_back parameter, to add the edge cells back into the original image, providing an edge-enhanced output, similar in concept to the unsharp_masking operation.
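The role of the low and high thresholds is easiest to see in the hysteresis step that Canny-style detectors conventionally use. The sketch below is a generic illustration of that step on a gradient-magnitude grid already scaled to the 0.0 to 1.0 range; it is not taken from the tool's source, and the function name is hypothetical. Cells at or above high are kept as edges, and cells at or above low are kept only if they connect to such a cell.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Double-threshold edge tracking on a 2D list of values in [0, 1]."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed with strong cells.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = 1
                queue.append((r, c))
    # Grow into connected weak cells (8-neighbourhood).
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and not edges[rr][cc] and magnitude[rr][cc] >= low):
                    edges[rr][cc] = 1
                    queue.append((rr, cc))
    return edges


magnitude = [[0.20, 0.10, 0.00, 0.10]]
edges = hysteresis(magnitude, low=0.05, high=0.15)
```

The second cell is weak but touches a strong cell, so it is kept; the fourth cell is equally weak but isolated, so it is dropped.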
References
This implementation was inspired by the algorithm described here: https://towardsdatascience.com/canny-edge-detection-step-by-step-in-python-computer-vision-b49c3a2d8123
See Also
gaussian_filter, sobel_filter, unsharp_masking, scharr_filter
Python API
def canny_edge_detection(self, input: Raster, sigma: float = 0.5, low_threshold: float = 0.05, high_threshold: float = 0.15, add_back_to_image: bool = False) -> Raster:
Corner Detection
Function name: corner_detection
This tool identifies corner patterns in boolean images using hit-and-miss pattern matching. Foreground pixels in the input image (input) are designated by any positive, non-zero values. Zero-valued and NoData-valued grid cells are interpreted by the algorithm as background values.
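Hit-and-miss matching can be sketched as follows. The 3 × 3 pattern used here (1 = foreground required, 0 = background required, None = either) and its three rotations are an assumption in the style of the HIPR2 corner example; this manual does not list the patterns the tool actually uses, and the function names are hypothetical. Cells outside the grid are treated as background.

```python
def rotate(pattern):
    """Rotate a square pattern by 90 degrees."""
    return [list(row) for row in zip(*pattern[::-1])]


# Assumed corner pattern: foreground above and to the right of the centre,
# background to the left, below, and to the lower left.
BASE_PATTERN = [[None, 1, None], [0, 1, 1], [0, 0, None]]
PATTERNS = [BASE_PATTERN]
for _ in range(3):
    PATTERNS.append(rotate(PATTERNS[-1]))


def matches(grid, r, c, pattern):
    rows, cols = len(grid), len(grid[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            wanted = pattern[dr + 1][dc + 1]
            if wanted is None:
                continue  # "don't care" position
            rr, cc = r + dr, c + dc
            value = grid[rr][cc] if (0 <= rr < rows and 0 <= cc < cols) else None
            found = 1 if (value is not None and value > 0) else 0
            if found != wanted:
                return False
    return True


def find_corners(grid):
    return {
        (r, c)
        for r in range(len(grid))
        for c in range(len(grid[0]))
        if any(matches(grid, r, c, p) for p in PATTERNS)
    }


square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
corners = find_corners(square)
```

Only the four convex corners of the square match; edge and interior cells fail at least one required foreground or background position.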
Reference
Fisher, R, Brown, N, Cammas, N, Fitzgibbon, A, Horne, S, Koryllos, K, Murdoch, A, Robertson, J, Sharman, T, Strachan, C, 2004. Hypertext Image Processing Resource. online: http://homepages.inf.ed.ac.uk/rbf/HIPR2/hitmiss.htm
Python API
def corner_detection(self, raster: Raster) -> Raster:
Laplacian Filter
Function name: laplacian_filter
This tool can be used to perform a Laplacian filter on a raster image. A Laplacian filter can be used to emphasize the edges in an image. As such, this filter type is commonly used in edge-detection applications. The algorithm operates by convolving a kernel of weights with each grid cell and its neighbours in an image. Four 3x3 sized filters and two 5x5 filters are available for selection. The weights of the kernels are as follows:
3x3(1):
 0  -1   0
-1   4  -1
 0  -1   0
3x3(2):
 0  -1   0
-1   5  -1
 0  -1   0
3x3(3):
-1  -1  -1
-1   8  -1
-1  -1  -1
3x3(4):
 1  -2   1
-2   4  -2
 1  -2   1
5x5(1):
 0   0  -1   0   0
 0  -1  -2  -1   0
-1  -2  17  -2  -1
 0  -1  -2  -1   0
 0   0  -1   0   0
5x5(2):
 0   0  -1   0   0
 0  -1  -2  -1   0
-1  -2  16  -2  -1
 0  -1  -2  -1   0
 0   0  -1   0   0
The user must specify the variant, including '3x3(1)', '3x3(2)', '3x3(3)', '3x3(4)', '5x5(1)', and '5x5(2)'. The user may also optionally clip the output image distribution tails by a specified amount (e.g. 1%).
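One practical difference between the variants is the sum of the kernel weights, which the following sketch computes directly from the kernels listed in this section. The helper names kernel_sum and filter_cell are hypothetical, and the code only evaluates interior cells of a small grid.

```python
LAPLACIAN_KERNELS = {
    "3x3(1)": [[0, -1, 0], [-1, 4, -1], [0, -1, 0]],
    "3x3(2)": [[0, -1, 0], [-1, 5, -1], [0, -1, 0]],
    "3x3(3)": [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]],
    "3x3(4)": [[1, -2, 1], [-2, 4, -2], [1, -2, 1]],
    "5x5(1)": [[0, 0, -1, 0, 0], [0, -1, -2, -1, 0], [-1, -2, 17, -2, -1],
               [0, -1, -2, -1, 0], [0, 0, -1, 0, 0]],
    "5x5(2)": [[0, 0, -1, 0, 0], [0, -1, -2, -1, 0], [-1, -2, 16, -2, -1],
               [0, -1, -2, -1, 0], [0, 0, -1, 0, 0]],
}


def kernel_sum(kernel):
    return sum(sum(row) for row in kernel)


def filter_cell(grid, r, c, kernel):
    """Kernel response at an interior cell (r, c)."""
    size = len(kernel)
    half = size // 2
    return sum(
        kernel[i][j] * grid[r - half + i][c - half + j]
        for i in range(size)
        for j in range(size)
    )


sums = {name: kernel_sum(kernel) for name, kernel in LAPLACIAN_KERNELS.items()}
flat = [[7] * 5 for _ in range(5)]
flat_responses = {name: filter_cell(flat, 2, 2, k) for name, k in LAPLACIAN_KERNELS.items()}
```

Kernels whose weights sum to zero return zero over flat areas, so the output contains only the edge response. Kernels whose weights sum to one, here 3x3(2) and 5x5(1), return the original value over flat areas, so the edge response is effectively added back onto the input image.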
See Also
prewitt_filter, sobel_filter
Python API
def laplacian_filter(self, raster: Raster, variant: str = "3x3(1)", clip_amount: float = 0.0) -> Raster:
Laplacian Of Gaussians Filter
Function name: laplacian_of_gaussians_filter
The Laplacian-of-Gaussian (LoG) is a spatial filter used for edge enhancement and is closely related to the difference-of-Gaussians filter (DiffOfGaussianFilter). The formulation of the LoG filter algorithm is based on the equation provided in the Hypermedia Image Processing Reference (HIPR) 2. The LoG operator calculates the second spatial derivative of an image. In areas where image intensity is constant, the LoG response will be zero. Near areas of change in intensity the LoG will be positive on the darker side, and negative on the lighter side. This means that at a sharp edge, or boundary, between two regions of uniform but different intensities, the LoG response will be:
- zero at a long distance from the edge,
- positive just to one side of the edge,
- negative just to the other side of the edge,
- zero at some point in between, on the edge itself.
The user may optionally choose to reflect the data along image edges. NoData values in the input image are similarly valued in the output. The output raster is of the float data type and continuous data scale.
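The sign pattern described in the list above can be reproduced with a one-dimensional analogue. The sketch below uses a kernel proportional to the second derivative of a Gaussian, with the sign chosen so that the response is positive on the darker side of an edge, and shifts the weights so they sum to zero. It is an illustration only; the kernel scaling, the radius of 4 × sigma, and the function names are assumptions, not the tool's implementation.

```python
import math


def log_kernel_1d(sigma, radius):
    """Zero-sum kernel proportional to the second derivative of a Gaussian."""
    raw = [
        (x * x / (sigma * sigma) - 1.0) * math.exp(-x * x / (2.0 * sigma * sigma))
        for x in range(-radius, radius + 1)
    ]
    mean = sum(raw) / len(raw)
    return [w - mean for w in raw]  # force a zero sum so flat areas give zero


def log_response(signal, sigma=1.0):
    radius = max(1, math.ceil(4 * sigma))
    kernel = log_kernel_1d(sigma, radius)
    n = len(signal)
    return [
        sum(w * signal[min(max(i + k - radius, 0), n - 1)] for k, w in enumerate(kernel))
        for i in range(n)
    ]


step = [0.0] * 10 + [1.0] * 10  # dark region followed by a light region
response = log_response(step)
```

The response is zero far from the step, positive on the last dark cell, and negative on the first light cell, so the zero crossing falls on the edge itself.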
Reference
Fisher, R. 2004. Hypertext Image Processing Resources 2 (HIPR2). Available online: http://homepages.inf.ed.ac.uk/rbf/HIPR2/roberts.htm
See Also
DiffOfGaussianFilter
Python API
def laplacian_of_gaussians_filter(self, raster: Raster, sigma: float = 0.75) -> Raster:
Prewitt Filter
Function name: prewitt_filter
This tool performs a 3 × 3 Prewitt edge-detection filter on a raster image. The Prewitt filter is similar to the sobel_filter, in that it identifies areas of high slope in the input image through the calculation of slopes in the x and y directions. The Prewitt edge-detection filter, however, gives less weight to nearer cell values within the moving window, or kernel. For example, a Prewitt filter uses the following schemes to calculate x and y slopes:
X-direction slope:
-1   0   1
-1   0   1
-1   0   1
Y-direction slope:
 1   1   1
 0   0   0
-1  -1  -1
Each grid cell in the output image is assigned the square root of the sum of the squared x and y slopes.
The user may optionally clip the output image distribution tails by a specified amount (e.g. 1%).
See Also
sobel_filter
Python API
def prewitt_filter(self, raster: Raster, clip_tails: float = 0.0) -> Raster:
Roberts Cross Filter
Function name: roberts_cross_filter
This tool performs a Roberts Cross edge-detection filter on a raster image. The roberts_cross_filter is similar to the sobel_filter and prewitt_filter, in that it identifies areas of high slope in the input image through the calculation of slopes in the x and y directions. A Roberts Cross filter uses the following 2 × 2 scheme to calculate slope magnitude, |G|:
P1  P2
P3  P4
|G| = |P1 - P4| + |P2 - P3|
Note that the filter is centred on pixel P1, and P2, P3, and P4 are the neighbouring pixels towards the east, south, and south-east respectively.
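The 2 × 2 scheme can be written out directly. The sketch below assumes the absolute-difference form |G| = |P1 - P4| + |P2 - P3| given in the HIPR2 reference; the function name is hypothetical and cells on the last row or column are not handled.

```python
def roberts_cross(grid, r, c):
    """Approximate slope magnitude at cell (r, c), which plays the role of P1."""
    p1 = grid[r][c]          # centre cell
    p2 = grid[r][c + 1]      # east neighbour
    p3 = grid[r + 1][c]      # south neighbour
    p4 = grid[r + 1][c + 1]  # south-east neighbour
    return abs(p1 - p4) + abs(p2 - p3)


flat = [[1, 1], [1, 1]]
vertical_edge = [[0, 10], [0, 10]]

flat_response = roberts_cross(flat, 0, 0)
edge_response = roberts_cross(vertical_edge, 0, 0)
```

Because only four cells are involved, the filter is cheap to evaluate but more sensitive to noise than the 3 × 3 Sobel and Prewitt kernels.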
The output image may be overwhelmed by a relatively small number of high-valued pixels, stretching the palette. The user may therefore optionally clip the output image distribution tails by a specified amount (clip) for improved visualization.
Reference
Fisher, R. 2004. Hypertext Image Processing Resources 2 (HIPR2). Available online: http://homepages.inf.ed.ac.uk/rbf/HIPR2/roberts.htm
See Also
sobel_filter, prewitt_filter
Python API
def roberts_cross_filter(self, raster: Raster, clip_amount: float = 0.0) -> Raster:
Sobel Filter
Function name: sobel_filter
This tool performs a 3 × 3 or 5 × 5 Sobel edge-detection filter on a raster image. The Sobel filter is similar to the prewitt_filter, in that it identifies areas of high slope in the input image through the calculation of slopes in the x and y directions. The Sobel edge-detection filter, however, gives more weight to nearer cell values within the moving window, or kernel. For example, a 3 × 3 Sobel filter uses the following schemes to calculate x and y slopes:
X-direction slope:
-1   0   1
-2   0   2
-1   0   1
Y-direction slope:
 1   2   1
 0   0   0
-1  -2  -1
Each grid cell in the output image is assigned the square root of the sum of the squared x and y slopes.
The user must specify the variant, including '3x3' and '5x5' variants. The user may also optionally clip the output image distribution tails by a specified amount (e.g. 1%).
See Also
prewitt_filter
Python API
def sobel_filter(self, raster: Raster, variant: str = "3x3", clip_tails: float = 0.0) -> Raster:
Image Classification
Evaluate Training Sites
Function name: evaluate_training_sites
Description
This tool evaluates the reflectance properties of a multi-spectral image dataset for a group of digitized class polygons. This is often viewed as the first step in a supervised classification procedure, such as those performed using the min_dist_classification or parallelepiped_classification tools. The analysis is based on a series of one or more input images (inputs) and an input polygon vector file (polys). The user must also specify the attribute name (field), within the attribute table, containing the class ID associated with each feature in the input polygon vector. A single class may be designated by multiple polygon features in the test site polygon vector. Note that the input polygon file is generally created by digitizing training areas of exemplar reflectance properties for each class type. The input polygon vector should be in the same coordinate system as the input multi-spectral images. The input images must represent a multi-spectral data set made up of individual bands. Do not input colour composite images.
Lastly, the user must specify the name of the output HTML file. This file will contain a series of box-and-whisker plots, one for each band in the multi-spectral data set, that visualize the distribution of each class in the associated bands. This can be helpful in determining the overlap between spectral properties for the classes, which may be useful if further class or test site refinement is necessary. For a subsequent supervised classification to be successful, each class should not overlap significantly with the other classes in at least one of the input bands. If this is not the case, the user may need to refine the class system.
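The overlap check that the box-and-whisker plots support visually can be sketched numerically. The example below computes box statistics per class and band and reports the bands in which the interquartile ranges of two classes do not overlap. The sample values, the use of the interquartile range as the overlap test, and the function names are all hypothetical; the tool itself produces an HTML report rather than this output.

```python
import statistics


def box_stats(values):
    """Five-number summary used to draw a box-and-whisker plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


def boxes_overlap(a, b):
    """True if the interquartile ranges of two summaries overlap."""
    return a["q1"] <= b["q3"] and b["q1"] <= a["q3"]


def separable_bands(samples, class_a, class_b):
    """Bands in which the two classes have non-overlapping boxes."""
    return [
        band
        for band in samples[class_a]
        if not boxes_overlap(box_stats(samples[class_a][band]),
                             box_stats(samples[class_b][band]))
    ]


# Hypothetical pixel values sampled from training polygons, by class and band.
samples = {
    "water": {"band1": [10, 12, 11, 13, 12], "band2": [5, 6, 5, 7, 6]},
    "forest": {"band1": [11, 13, 12, 14, 12], "band2": [40, 42, 41, 45, 43]},
}

bands = separable_bands(samples, "water", "forest")
```

In this made-up example the two classes overlap in band1 but are clearly separated in band2, which is the situation the description above identifies as sufficient for a successful classification.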
See Also
min_dist_classification, parallelepiped_classification
Python API
def evaluate_training_sites(self, input_rasters: List[Raster], training_polygons: Vector, class_field_name: str, output_html_file: str) -> None:
Fuzzy kNN Classification
Function name: fuzzy_knn_classification
Experimental
Fuzzy k-NN classification that yields soft class membership and optional probability surfaces.
remote_sensing classification knn fuzzy legacy-port
Parameters
Name | Description | Required | Default
inputs | Array of single-band input rasters. | Required | ['band1.tif', 'band2.tif', 'band3.tif']
training_data | Point/polygon vector training data path. | Required | training.shp
class_field | Class field in training_data attributes. | Required | class
scaling | Feature scaling mode: none (default), normalize, standardize. | Optional | none
k | Number of neighbors (default 5). | Optional | 5
m | Fuzzy exponent parameter (> 1; default 2.0). | Optional | 2.0
output | Optional output classified raster path. | Optional | -
probability_output | Optional membership-probability raster path. | Optional | -
Examples
Run fuzzy kNN and output both class and confidence rasters.
wbe.fuzzy_knn_classification(class_field='class', inputs=['band1.tif', 'band2.tif', 'band3.tif'], k=7, m=2.0, output='fuzzy_knn_classified.tif', probability_output='fuzzy_knn_probability.tif', training_data='training.shp')
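To show how k and m interact, the sketch below computes soft class memberships for one sample using the widely used inverse-distance form, in which each of the k nearest training points contributes a weight of 1 / d^(2 / (m - 1)). The manual does not state the tool's membership formula, so this weighting, the crisp training labels, and the function name are assumptions.

```python
import math


def fuzzy_knn(sample, training, k=5, m=2.0):
    """Return {class_label: membership} for one feature vector.

    training is a list of (feature_tuple, class_label) pairs.
    """
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    exponent = 2.0 / (m - 1.0)
    weights = {}
    for features, label in nearest:
        distance = max(math.dist(sample, features), 1e-12)  # avoid division by zero
        weights[label] = weights.get(label, 0.0) + 1.0 / distance ** exponent
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


training = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((5.0, 5.0), "b")]
memberships = fuzzy_knn((0.0, 0.5), training, k=3, m=2.0)
hard_class = max(memberships, key=memberships.get)
```

The memberships sum to one and can be read as the kind of soft output a probability raster would hold, while the class with the largest membership gives the hard label. Values of m close to 1 sharpen the memberships towards the nearest neighbour; larger values flatten them.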
Generalize Classified Raster
Function name: generalize_classified_raster
Description
This tool can be used to generalize a raster containing class or object features. Such rasters are usually derived from some classification procedure (e.g. image classification and landform classification), or as the output of a segmentation procedure (image_segmentation). Rasters that are created in this way often contain many very small features that make their interpretation, or vectorization, challenging. Therefore, it is common for practitioners to remove the smaller features. Many different approaches have been used for this task in the past. For example, it is common to remove small features using a filtering based approach (majority_filter). While this can be an effective strategy, it does have the disadvantage of modifying all of the boundaries in the class raster, including those that define larger features. In many applications, this can be a serious issue of concern.
The generalize_classified_raster tool offers an alternative method for simplifying class rasters. The process begins by identifying each contiguous group of cells in the input (i.e. a clumping operation) and then defines the subset of features that are smaller than the user-specified minimum feature size (min_size), in grid cells. This set of small features is then dealt with using one of three methods (method). In the first method (longest), a small feature may be reassigned the class value of the neighbouring feature with the longest shared border. The sum of the neighbouring feature size and the small feature size must be larger than the specified size threshold, and the tool will iterate through this process of reassigning feature values to neighbouring values until each small feature has been resolved.
The second method, largest, operates in much the same way as the first, except that objects are reassigned the value of the largest neighbour. Again, this process of reassigning small feature values iterates until every small feature has been reassigned to a large neighbouring feature.
The third and last method (nearest) takes a different approach to resolving the reassignment of small features. Using the nearest generalization approach, each grid cell contained within a small feature is reassigned the value of the nearest large neighbouring feature. When there are two or more neighbouring features that are equally distanced to a small feature cell, the cell will be reassigned to the largest neighbour. Perhaps the most significant disadvantage of this approach is that it creates a new artificial boundary in the output image that is not contained within the input class raster. That is, with the previous two methods, boundaries associated with smaller features in the input images are 'erased' in the output map, but every boundary in the output raster exactly matches boundaries within the input raster (i.e. the output boundaries are a subset of the input feature boundaries). However, with the nearest method, artificial boundaries, determined by the divide between nearest neighbours, are introduced to the output raster and these new feature boundaries do not have any basis in the original classification/segmentation process. Thus caution should be exercised when using this approach, especially when larger minimum size thresholds are used. The longest method is the recommended approach to class feature generalization.
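The longest method can be sketched in a few steps: clump the class raster into contiguous features, find features smaller than the minimum size, and give each one the class of the neighbouring feature with which it shares the most cell edges. The code below is a simplified, single-pass illustration with hypothetical function names. It uses 4-connected clumping and does not implement the tool's iteration or its combined-size check.

```python
from collections import Counter, deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def clump(grid):
    """Label 4-connected groups of equal-valued cells; return labels and sizes."""
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue
            feature_id = len(sizes)
            labels[r][c] = feature_id
            queue, count = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1 and grid[ny][nx] == grid[r][c]):
                        labels[ny][nx] = feature_id
                        queue.append((ny, nx))
            sizes.append(count)
    return labels, sizes


def generalize_longest(grid, min_size):
    """Reassign small features to the neighbour sharing the longest border."""
    rows, cols = len(grid), len(grid[0])
    labels, sizes = clump(grid)
    class_of, borders = {}, {}
    for r in range(rows):
        for c in range(cols):
            fid = labels[r][c]
            class_of[fid] = grid[r][c]
            if sizes[fid] >= min_size:
                continue
            for dy, dx in NEIGHBOURS:
                ny, nx = r + dy, c + dx
                if 0 <= ny < rows and 0 <= nx < cols and labels[ny][nx] != fid:
                    borders.setdefault(fid, Counter())[labels[ny][nx]] += 1
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            fid = labels[r][c]
            if fid in borders:
                winner = borders[fid].most_common(1)[0][0]
                out[r][c] = class_of[winner]
    return out


classes = [[1, 1, 1, 1], [1, 2, 1, 1], [3, 3, 3, 3]]
generalized = generalize_longest(classes, min_size=2)
```

The single class-2 cell shares three edges with the class-1 feature and one with the class-3 feature, so it is absorbed into class 1, and no boundary between the larger features moves.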
For a video tutorial on how to use the generalize_classified_raster tool, see this YouTube video.
See Also
generalize_with_similarity, majority_filter, image_segmentation
Python API
def generalize_classified_raster(self, raster: Raster, area_threshold: int = 5, method: str = "longest") -> Raster:
Generalize With Similarity
Function name: generalize_with_similarity
Description
This tool can be used to generalize a raster containing class features (input) by reassigning the identifier values of small features (min_size) to those of neighbouring features. Therefore, this tool performs a very similar operation to the generalize_classified_raster tool. However, while the generalize_classified_raster tool re-labels small features based on the geometric properties of neighbouring features (e.g. neighbour with the longest shared border, largest neighbour, or nearest neighbour), the generalize_with_similarity tool reassigns feature labels based on similarity with neighbouring features. Similarity is determined using a series of input similarity criteria rasters (similarity), which may be factors used in the creation of the input class raster. For example, the similarity rasters may be bands of multi-spectral imagery, if the input raster is a classified land-cover map, or DEM-derived land surface parameters, if the input raster is a landform class map.
The tool works by identifying each contiguous group of pixels (features) in the input class raster (input), i.e. a clumping operation. The mean value is then calculated for each feature and each similarity input, which defines a multi-dimensional 'similarity centre point' associated with each feature. It should be noted that the similarity raster data are standardized prior to calculating these centre point values. Lastly, the tool then reassigns the input label values of all features smaller than the user-specified minimum feature size (min_size) to that of the neighbouring feature with the shortest distance between similarity centre points.
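The centre-point comparison described above can be sketched as follows. The code assumes that features have already been clumped into a label grid and that the similarity layers are already standardized; both function names and the sample data are hypothetical.

```python
import math


def feature_centres(labels, similarity_layers):
    """Mean of each similarity layer per feature id (the 'similarity centre point')."""
    sums, counts = {}, {}
    for r, row in enumerate(labels):
        for c, feature_id in enumerate(row):
            vector = sums.setdefault(feature_id, [0.0] * len(similarity_layers))
            for i, layer in enumerate(similarity_layers):
                vector[i] += layer[r][c]
            counts[feature_id] = counts.get(feature_id, 0) + 1
    return {fid: tuple(v / counts[fid] for v in vec) for fid, vec in sums.items()}


def most_similar_neighbour(small_id, neighbour_ids, centres):
    """Neighbour whose centre point is closest to that of the small feature."""
    return min(neighbour_ids, key=lambda nid: math.dist(centres[small_id], centres[nid]))


# One row of cells: feature 1 is a single cell between features 0 and 2.
labels = [[0, 0, 1, 2, 2]]
layers = [[[0.0, 0.0, 0.9, 1.0, 1.0]]]  # one standardized similarity layer

centres = feature_centres(labels, layers)
target = most_similar_neighbour(1, [0, 2], centres)
```

Both neighbours share a border of the same length with the small feature, so a purely geometric rule could not separate them, but the similarity layer clearly favours feature 2.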
For small features that are entirely enclosed by a single larger feature, this process will result in the same generalization solution presented by any of the geometric-based methods of the generalize_classified_raster tool. However, for small features that have more than one neighbour, this tool may provide a better generalization solution than those based solely on geometric information.
For a video tutorial on how to use the generalize_with_similarity tool, see this YouTube video.
See Also
generalize_classified_raster, majority_filter, image_segmentation
Python API
def generalize_with_similarity(self, raster: Raster, similarity_rasters: List[Raster], area_threshold: int = 5) -> Raster:
K Means Clustering
Function name: k_means_clustering
This tool can be used to perform a k-means clustering operation on two or more input images (inputs), typically several bands of multi-spectral satellite imagery. The tool creates two outputs: the classified image (output) and a classification HTML report (out_html). The user must specify the number of classes (classes), which should be known a priori, and the strategy for initializing class clusters (initialize). The initialization strategies include "diagonal" (clusters are initially located randomly along the multi-dimensional diagonal of spectral space) and "random" (clusters are initially located randomly throughout spectral space). The algorithm will continue updating cluster centre locations with each iteration of the process until either the user-specified maximum number of iterations (max_iterations) is reached, or a stability criterion (class_change) is achieved. The stability criterion is the percentage of the total number of pixels in the image that change class between consecutive iterations. Lastly, the user must specify the minimum allowable number of pixels in a cluster (min_class_size).
Note, each of the input images must have the same number of rows and columns and the same spatial extent because the analysis is performed on a pixel-by-pixel basis. NoData values in any of the input images will result in the removal of the corresponding pixel from the analysis.
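The iteration and stopping rules can be sketched in plain Python. Pixels are represented as tuples of band values. To keep the example deterministic, the initial centres are spaced evenly along the diagonal of spectral space rather than placed randomly along it, which is a simplification; the minimum class size is not implemented, and the function name is hypothetical.

```python
import math


def k_means(pixels, num_clusters=2, max_iterations=10, percent_changed_threshold=2.0):
    """Cluster band-value tuples; stop when few pixels change class."""
    dims = len(pixels[0])
    lows = [min(p[d] for p in pixels) for d in range(dims)]
    highs = [max(p[d] for p in pixels) for d in range(dims)]
    # Simplified "diagonal" start: centres evenly spaced from lows to highs.
    centres = [
        tuple(lows[d] + (highs[d] - lows[d]) * (i + 0.5) / num_clusters for d in range(dims))
        for i in range(num_clusters)
    ]
    assignment = [-1] * len(pixels)
    for _ in range(max_iterations):
        new_assignment = [
            min(range(num_clusters), key=lambda k: math.dist(p, centres[k]))
            for p in pixels
        ]
        changed = sum(a != b for a, b in zip(assignment, new_assignment))
        assignment = new_assignment
        # Move each centre to the mean of its members.
        for k in range(num_clusters):
            members = [p for p, a in zip(pixels, assignment) if a == k]
            if members:
                centres[k] = tuple(sum(p[d] for p in members) / len(members)
                                   for d in range(dims))
        if 100.0 * changed / len(pixels) < percent_changed_threshold:
            break  # stability criterion reached
    return assignment, centres


pixels = [(1, 1), (2, 1), (1, 2), (10, 10), (11, 10), (10, 11)]
assignment, centres = k_means(pixels)
```

The loop ends either when the iteration limit is reached or when the percentage of pixels that changed class in the last pass falls below the threshold, mirroring the two stopping conditions described above.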
See Also
modified_k_means_clustering
Python API
def k_means_clustering(self, input_rasters: List[Raster], output_html_file: str = "", num_clusters: int = 5, max_iterations: int = 10, percent_changed_threshold: float = 2.0, initialization_mode: str = "dia", min_class_size: int = 10) -> Raster:
kNN Classification
Function name: knn_classification
Description
This tool performs a supervised k-nearest neighbour (k-NN) classification using multiple predictor rasters (inputs), or features, and training data (training). It can be used to model the spatial distribution of class data, such as land-cover type, soil class, or vegetation type. The training data take the form of an input vector Shapefile containing a set of points or polygons, for which the known class information is contained within a field (field) of the attribute table. Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space. The algorithm works by identifying a user-defined number (k, -k) of feature-space neighbours from the training set for each grid cell. The class assigned to the grid cell in the output raster (output) is then determined as the most common class among the set of neighbours. Note that the knn_regression tool can be used to apply the k-NN method to the modelling of continuous data.
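The core voting step can be sketched for a single grid cell. The function below is a brute-force illustration with a hypothetical name and made-up training values; it ignores feature scaling, clipping, and tie handling, and it is not the tool's implementation.

```python
import math
from collections import Counter


def knn_classify(cell_features, training, k=5):
    """Most common class among the k nearest training points in feature space.

    training is a list of (feature_tuple, class_label) pairs.
    """
    nearest = sorted(training, key=lambda item: math.dist(cell_features, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training samples: (band1, band2) values and a class label.
training = [
    ((0.10, 0.20), "water"),
    ((0.20, 0.10), "water"),
    ((0.15, 0.15), "water"),
    ((0.80, 0.90), "forest"),
    ((0.90, 0.80), "forest"),
]

near_water = knn_classify((0.2, 0.2), training, k=3)
near_forest = knn_classify((0.85, 0.85), training, k=3)
```

Each grid cell's stack of raster values plays the role of cell_features, and the same vote is repeated for every cell in the output raster.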
The user has the option to clip the training set data (clip). When this option is selected, each training pixel for which the estimated class value, based on the k-NN procedure, is not equal to the known class value, is removed from the training set before proceeding with labelling all grid cells. This has the effect of removing outlier points within the training set and often improves the overall classification accuracy.
The tool splits the training data into two sets, one for training the classifier and one for testing the classification. These test data are used to calculate the overall accuracy and Cohen's kappa index of agreement, as well as to estimate the variable importance. The test_proportion parameter is used to set the proportion of the input training data used in model testing. For example, if test_proportion = 0.2, 20% of the training data will be set aside for testing, and this subset will be selected randomly. As a result of this random selection of test data, the tool behaves stochastically, and will result in a different model each time it is run.
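The two accuracy measures reported from the test subset can be computed as in the sketch below, which uses the standard definitions of overall accuracy and Cohen's kappa. The function name and the sample labels are hypothetical.

```python
from collections import Counter


def accuracy_and_kappa(actual, predicted):
    """Overall accuracy and Cohen's kappa for two equal-length label lists."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    kappa = (observed - expected) / (1.0 - expected) if expected < 1.0 else 1.0
    return observed, kappa


actual = ["a", "a", "b", "b"]
predicted = ["a", "a", "b", "a"]
accuracy, kappa = accuracy_and_kappa(actual, predicted)
```

Kappa is lower than the overall accuracy here because part of the observed agreement would be expected by chance given how often each class occurs.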
Note that the output image parameter (output) is optional. When unspecified, the tool will simply report the model accuracy statistics and variable importance, allowing the user to experiment with different parameter settings and input predictor raster combinations to optimize the model before applying it to classify the whole image data set.
Like all supervised classification methods, this technique relies heavily on proper selection of training data. Training sites are exemplar areas/points of known and representative class value (e.g. land cover type). The algorithm determines the feature signatures of the pixels within each training area. In selecting training sites, care should be taken to ensure that they cover the full range of variability within each class. Otherwise the classification accuracy will be impacted. If possible, multiple training sites should be selected for each class. It is also advisable to avoid areas near the edges of class objects (e.g. land-cover patches), where mixed pixels may impact the purity of training site values.
After selecting training sites, the feature value distributions of each class type can be assessed using the evaluate_training_sites tool. In particular, the distribution of class values should ideally be non-overlapping in at least one feature dimension.
The k-NN algorithm is based on the calculation of distances in multi-dimensional space. Feature scaling is essential to the application of k-NN modelling, especially when the ranges of the features are different, for example, if they are measured in different units. Without scaling, features with larger ranges will have greater influence in computing the distances between points. The tool offers three options for feature-scaling (scaling), including 'None', 'Normalize', and 'Standardize'. Normalization simply rescales each of the features onto a 0-1 range. This is a good option for most applications, but it is highly sensitive to outliers because it is determined by the range of the minimum and maximum values. Standardization rescales predictors using their means and standard deviations, transforming the data into z-scores. This is a better option than normalization when you know that the data contain outlier values; however, it does assume that the feature data are somewhat normally distributed, or are at least symmetrical in distribution.
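The difference between the two scaling options can be seen on a small feature with one outlier. The sketch below uses the usual definitions (min-max rescaling and z-scores); whether the tool uses the population or sample standard deviation is not stated in this manual, so the population form used here is an assumption, as are the function names.

```python
import statistics


def normalize(values):
    """Rescale values onto a 0-1 range using the minimum and maximum."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def standardize(values):
    """Convert values to z-scores using the mean and standard deviation."""
    mean = statistics.fmean(values)
    deviation = statistics.pstdev(values)
    return [(v - mean) / deviation for v in values]


feature = [1.0, 2.0, 3.0, 4.0, 100.0]  # the last value is an outlier

normalized = normalize(feature)
standardized = standardize(feature)
```

After normalization the four ordinary values are squeezed into roughly the lowest 3% of the range because the outlier defines the maximum, which illustrates the sensitivity to outliers described above. Both functions assume the feature is not constant.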
Because the k-NN algorithm calculates distances in feature-space, like many other related algorithms, it suffers from the curse of dimensionality. Distances become less meaningful in high-dimensional space because the vastness of these spaces means that distances between points are less significant (more similar). As such, if the predictor list includes insignificant or highly correlated variables, it is advisable to exclude these features during the model-building phase, or to use a dimension reduction technique such as principal_component_analysis to transform the features into a smaller set of uncorrelated predictors.
For a video tutorial on how to use the knn_classification tool, see this YouTube video.
Memory Usage
The peak memory usage of this tool is approximately 8 bytes per grid cell × # predictors.
See Also
knn_regression, random_forest_classification, svm_classification, parallelepiped_classification, evaluate_training_sites
Python API
def knn_classification(self, input_rasters: List[Raster], training_data: Vector, class_field_name: str, scaling_method: str = "none", k: int = 5, test_proportion: float = 0.2, use_clipping: bool = False, create_output: bool = False) -> Optional[Raster]:
kNN Regression
Function name: knn_regression
Description
This tool performs a supervised k-nearest neighbour (k-NN) regression analysis using multiple predictor rasters (inputs), or features, and training data (training). It can be used to model the spatial distribution of continuous data, such as soil properties (e.g. percent sand/silt/clay). The training data take the form of an input vector Shapefile containing a set of points, for which the known outcome information is contained within a field (field) of the attribute table. Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space. The algorithm works by identifying a user-defined number (k, -k) of feature-space neighbours from the training set for each grid cell. The value assigned to the grid cell in the output raster (output) is then determined as the mean of the outcome variable among the set of neighbours. The user may optionally choose to weight neighbour outcome values in the averaging calculation, with weights determined by the inverse distance function (weight). Note that the knn_classification tool can be used to apply the k-NN method to the modelling of categorical data.
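The averaging step, with and without inverse-distance weighting, can be sketched for a single grid cell. The function name and the training values are hypothetical, and a simple 1 / distance weight is assumed because the manual does not give the exact form of the inverse distance function.

```python
import math


def knn_regress(cell_features, training, k=5, weight=False):
    """Mean (optionally inverse-distance weighted) outcome of the k nearest points.

    training is a list of (feature_tuple, outcome_value) pairs.
    """
    nearest = sorted(
        (math.dist(cell_features, features), outcome) for features, outcome in training
    )[:k]
    if not weight:
        return sum(outcome for _, outcome in nearest) / len(nearest)
    # Assumed weighting: 1 / distance, guarded against zero distance.
    weights = [1.0 / max(distance, 1e-12) for distance, _ in nearest]
    weighted_sum = sum(w * outcome for w, (_, outcome) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical training points: one predictor value and a percent-sand outcome.
training = [((0.0,), 10.0), ((1.0,), 20.0), ((4.0,), 60.0)]

plain = knn_regress((0.25,), training, k=2)
weighted = knn_regress((0.25,), training, k=2, weight=True)
```

With weighting, the estimate moves towards the outcome of the closer training point, whereas the plain mean treats both neighbours equally.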
The tool splits the training data into two sets, one for training the model and one for testing the prediction. These test data are used to calculate the regression accuracy statistics, as well as to estimate the variable importance. The test_proportion parameter is used to set the proportion of the input training data used in model testing. For example, if test_proportion = 0.2, 20% of the training data will be set aside for testing, and this subset will be selected randomly. As a result of this random selection of test data, the tool behaves stochastically, and will result in a different model each time it is run.
Note that the output image parameter (output) is optional. When unspecified, the tool will simply report the model accuracy statistics and variable importance, allowing the user to experiment with different parameter settings and input predictor raster combinations to optimize the model before applying it to model the outcome variable across the whole region defined by the image data set.
The k-NN algorithm is based on the calculation of distances in multi-dimensional space. Feature scaling is essential to the application of k-NN modelling, especially when the ranges of the features are different, for example, if they are measured in different units. Without scaling, features with larger ranges will have greater influence in computing the distances between points. The tool offers three options for feature-scaling (scaling), including 'None', 'Normalize', and 'Standardize'. Normalization simply rescales each of the features onto a 0-1 range. This is a good option for most applications, but it is highly sensitive to outliers because it is determined by the range of the minimum and maximum values. Standardization rescales predictors using their means and standard deviations, transforming the data into z-scores. This is a better option than normalization when you know that the data contain outlier values; however, it does assume that the feature data are somewhat normally distributed, or are at least symmetrical in distribution.
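The two scaling options can be illustrated with a short sketch for a single feature. The helper names are hypothetical, and the use of the population standard deviation is an assumption; the tool may compute its z-scores differently.

```python
import statistics

def normalize(values):
    """Rescale one feature onto the 0-1 range using its minimum and maximum."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def standardize(values):
    """Convert one feature to z-scores (population standard deviation assumed)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd if sd else 0.0 for v in values]

feature = [1.0, 2.0, 3.0, 100.0]  # one outlier dominates the range
print(normalize(feature))
print(standardize(feature))
```

In the normalized output the three non-outlier values are compressed into a small part of the 0-1 range, which is the outlier sensitivity described above.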
Because the k-NN algorithm calculates distances in feature-space, like many other related algorithms, it suffers from the curse of dimensionality. Distances become less meaningful in high-dimensional space because the vastness of these spaces means that distances between points are less significant (more similar). As such, if the predictor list includes insignificant or highly correlated variables, it is advisable to exclude these features during the model-building phase, or to use a dimension reduction technique such as principal_component_analysis to transform the features into a smaller set of uncorrelated predictors.
Memory Usage
The peak memory usage of this tool is approximately 8 bytes per grid cell × # predictors.
See Also
knn_classification, random_forest_regression, svm_regression, principal_component_analysis
Python API
def knn_regression(self, input_rasters: List[Raster], training_data: Vector, field_name: str, scaling_method: str = "none", k: int = 5, distance_weighting: bool = False, test_proportion: float = 0.2, create_output: bool = False) -> Optional[Raster]:
Logistic Regression
Function name: logistic_regression
Description
This tool performs a logistic regression analysis using multiple predictor rasters (inputs), or features, and training data (training). Logistic regression is a type of linear statistical classifier that in its basic form uses a logistic function to model a binary outcome variable, although the implementation used by this tool can handle multi-class dependent variables. This tool can be used to model the spatial distribution of class data, such as land-cover type, soil class, or vegetation type.
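As a reminder of what the logistic function does in the basic binary case, the sketch below maps a linear combination of feature values to a probability between 0 and 1. The intercept and coefficients are hypothetical inputs chosen for illustration; the tool's fitting procedure and its multi-class handling are not shown.

```python
import math

def logistic(z):
    """Numerically stable logistic (sigmoid) function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_probability(features, intercept, coefficients):
    """Probability of the positive class for one grid cell's feature stack."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return logistic(z)

# Hypothetical model with two predictors.
print(predict_probability([0.2, 1.5], intercept=-1.0, coefficients=[0.8, 1.1]))
```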
The training data take the form of an input vector Shapefile containing a set of points or polygons, for which the known class information is contained within a field (field) of the attribute table. Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space.
The tool splits the training data into two sets, one for training the model and one for testing the prediction. These test data are used to calculate the classification accuracy stats, as well as to estimate the variable importance. The test_proportion parameter is used to set the proportion of the input training data used in model testing. For example, if test_proportion = 0.2, 20% of the training data will be set aside for testing, and this subset will be selected randomly. As a result of this random selection of test data, the tool behaves stochastically, and will result in a different model each time it is run.
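The random hold-out split can be sketched as follows. The function name and the seed argument are assumptions added for the example (a seed makes the sketch reproducible); the tool itself selects the test subset randomly on every run.

```python
import random

def split_training_data(samples, test_proportion=0.2, seed=None):
    """Randomly divide samples into (training, testing) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_proportion)
    # The first n_test shuffled samples are held out for testing.
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_training_data(range(10), test_proportion=0.2, seed=1)
print(len(train), len(test))
```

Because a different subset is held out on each unseeded run, the reported accuracy statistics vary slightly from run to run.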
Note that the output image parameter (output) is optional. When unspecified, the tool will simply report the model accuracy statistics and variable importance, allowing the user to experiment with different parameter settings and input predictor raster combinations to optimize the model before applying it to model the outcome variable across the whole region defined by the image data set.
The user may opt for feature scaling, which can be important when the ranges of the features are different, for example, if they are measured in different units. Without scaling, features with larger ranges will have greater influence in computing the distances between points. The tool offers three options for feature-scaling (scaling), including 'None', 'Normalize', and 'Standardize'. Normalization simply rescales each of the features onto a 0-1 range. This is a good option for most applications, but it is highly sensitive to outliers because it is determined by the range of the minimum and maximum values. Standardization rescales predictors using their means and standard deviations, transforming the data into z-scores. This is a better option than normalization when you know that the data contain outlier values; however, it does assume that the feature data are somewhat normally distributed, or are at least symmetrical in distribution.
Because the logistic regression calculates distances in feature-space, like many other related algorithms, it suffers from the curse of dimensionality. Distances become less meaningful in high-dimensional space because the vastness of these spaces means that distances between points are less significant (more similar). As such, if the predictor list includes insignificant or highly correlated variables, it is advisable to exclude these features during the model-building phase, or to use a dimension reduction technique such as principal_component_analysis to transform the features into a smaller set of uncorrelated predictors.
See Also
svm_classification, random_forest_classification, knn_classification, principal_component_analysis
Python API
def logistic_regression(self, input_rasters: List[Raster], training_data: Vector, class_field_name: str, scaling_method: str = "none", test_proportion: float = 0.2, create_output: bool = False) -> Optional[Raster]:
Min Dist Classification
Function name: min_dist_classification
Description
This tool performs a supervised minimum-distance classification using training site polygons (polys) and multi-spectral images (inputs). This classification method uses the mean vectors for each class and calculates the Euclidean distance from each unknown pixel to the class mean vector. Unclassed pixels are then assigned to the nearest class mean. A threshold distance (threshold), expressed in number of z-scores, may optionally be used and pixels whose multi-spectral distance is greater than this threshold will not be assigned a class in the output image (output). When a threshold distance is unspecified, all pixels will be assigned to a class.
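A minimal sketch of the minimum-distance rule follows. It is an illustration only: the helper names are hypothetical, and the threshold here is a plain Euclidean distance in the units of the input bands, whereas the tool expresses its threshold in z-scores.

```python
import math

def class_means(training):
    """training maps a class label to a list of pixel vectors (one value per band)."""
    means = {}
    for label, pixels in training.items():
        n = len(pixels)
        means[label] = [sum(band) / n for band in zip(*pixels)]
    return means

def min_dist_classify(pixel, means, threshold=math.inf):
    """Return the label of the nearest class mean, or None if it is too far away."""
    label, dist = min(
        ((lbl, math.dist(pixel, m)) for lbl, m in means.items()),
        key=lambda item: item[1],
    )
    return label if dist <= threshold else None

training = {
    "water": [(10.0, 20.0), (12.0, 22.0)],
    "forest": [(50.0, 80.0), (54.0, 84.0)],
}
means = class_means(training)
print(min_dist_classify((13.0, 25.0), means))
print(min_dist_classify((200.0, 200.0), means, threshold=10.0))
```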
Like all supervised classification methods, this technique relies heavily on proper selection of training data. Training sites are exemplar areas of known and representative land cover type. The algorithm determines the spectral signature of the pixels within each training area, and uses this information to define the mean vector of each class. It is preferable that training sites are based on either field-collected data or fine-resolution reference imagery. In selecting training sites, care should be taken to ensure that they cover the full range of variability within each class. Otherwise the classification accuracy will be impacted. If possible, multiple training sites should be selected for each class. It is also advisable to avoid areas near the edges of land-cover patches, where mixed pixels may impact the purity of training site reflectance values.
After selecting training sites, the reflectance values of each land-cover type can be assessed using the evaluate_training_sites tool. In particular, the distribution of reflectance values should ideally be non-overlapping in at least one band of the multi-spectral data set.
See Also
evaluate_training_sites, parallelepiped_classification
Python API
def min_dist_classification(self, input_rasters: List[Raster], training_data: Vector, class_field_name: str, dist_threshold: float = float('inf')) -> Raster:
Modified K Means Clustering
Function name: modified_k_means_clustering
This modified k-means algorithm is similar to that described by Mather and Koch (2011). The main difference between the traditional k-means and this technique is that the user does not need to specify the desired number of classes/clusters prior to running the tool. Instead, the algorithm initializes with a very liberal overestimate of the number of classes and then merges classes whose cluster centres are separated by less than a user-defined threshold. The main difference between this algorithm and the ISODATA technique is that clusters cannot be broken apart into two smaller clusters.
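The merging step can be sketched as below. This shows only the idea of collapsing cluster centres that lie closer together than the merge distance; the function names are hypothetical, and the count-weighted averaging of merged centres is an assumption rather than a description of the tool's internals.

```python
import math

def _first_close_pair(centres, merge_distance):
    """Return indices of the first pair of centres closer than merge_distance."""
    for i in range(len(centres)):
        for j in range(i + 1, len(centres)):
            if math.dist(centres[i], centres[j]) < merge_distance:
                return i, j
    return None

def merge_close_centres(centres, merge_distance):
    """Repeatedly merge centres until no pair is closer than merge_distance."""
    merged = [list(c) for c in centres]
    counts = [1] * len(merged)  # number of original centres behind each entry
    while (pair := _first_close_pair(merged, merge_distance)) is not None:
        i, j = pair
        total = counts[i] + counts[j]
        # Assumption: the merged centre is the count-weighted mean of the pair.
        merged[i] = [
            (a * counts[i] + b * counts[j]) / total
            for a, b in zip(merged[i], merged[j])
        ]
        counts[i] = total
        del merged[j]
        del counts[j]
    return merged

print(merge_close_centres([(0.0, 0.0), (0.5, 0.0), (10.0, 10.0)], merge_distance=1.0))
```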
Reference
Mather, P. M., & Koch, M. (2011). Computer processing of remotely-sensed images: an introduction. John Wiley & Sons.
See Also
k_means_clustering
Python API
def modified_k_means_clustering(self, input_rasters: List[Raster], output_html_file: str = "", num_start_clusters: int = 1000, merge_distance: float = 1.0, max_iterations: int = 10, percent_changed_threshold: float = 2.0) -> Raster:
Nnd Classification
Function name: nnd_classification
Description
This tool performs a supervised *k*-nearest neighbour (*k*-NN) classification using multiple predictor rasters (inputs), or features, and training data (training). It can be used to model the spatial distribution of class data, such as land-cover type, soil class, or vegetation type. The training data take the form of an input vector Shapefile containing a set of points or polygons, for which the known class information is contained within a field (field) of the attribute table. Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space. The algorithm works by identifying a user-defined number (k, -k) of feature-space neighbours from the training set for each grid cell. The class assigned to the grid cell in the output raster (output) is the most common class among that set of neighbours. Note that the knn_regression tool can be used to apply the k-NN method to the modelling of continuous data.
The user has the option to clip the training set data (clip). When this option is selected, each training pixel for which the estimated class value, based on the k-NN procedure, is not equal to the known class value, is removed from the training set before proceeding with labelling all grid cells. This has the effect of removing outlier points within the training set and often improves the overall classification accuracy.
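The clipping idea can be sketched as a filter over the training samples. The helper names are hypothetical, and excluding each sample from its own neighbour search (a leave-one-out vote) is an assumption made for this illustration.

```python
import math
from collections import Counter

def knn_vote(features, labels, query, k, skip=None):
    """Majority class among the k nearest training samples, ignoring index skip."""
    order = sorted(
        (i for i in range(len(features)) if i != skip),
        key=lambda i: math.dist(features[i], query),
    )
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def clip_training_set(features, labels, k=5):
    """Keep only samples whose k-NN estimated class matches their known class."""
    keep = [
        i for i in range(len(features))
        if knn_vote(features, labels, features[i], k, skip=i) == labels[i]
    ]
    return [features[i] for i in keep], [labels[i] for i in keep]

features = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (0.5, 0.5)]               # last sample sits inside the "a" cluster
labels = ["a"] * 4 + ["b"] * 4 + ["b"]  # but is labelled "b"
clipped_features, clipped_labels = clip_training_set(features, labels, k=3)
print(len(clipped_features))
```

In this toy data set the mislabelled sample inside the "a" cluster is removed, while the eight consistent samples are kept.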
The tool splits the training data into two sets, one for training the classifier and one for testing the classification. These test data are used to calculate the overall accuracy and Cohen's kappa index of agreement, as well as to estimate the variable importance. The test_proportion parameter is used to set the proportion of the input training data used in model testing. For example, if test_proportion = 0.2, 20% of the training data will be set aside for testing, and this subset will be selected randomly. As a result of this random selection of test data, the tool behaves stochastically, and will result in a different model each time it is run.
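For reference, overall accuracy and Cohen's kappa can be computed from the held-out test labels as sketched below. These are the standard textbook definitions; the function names are hypothetical and the tool's own reporting code is not shown.

```python
from collections import Counter

def overall_accuracy(actual, predicted):
    """Fraction of test samples whose predicted class equals the known class."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def cohens_kappa(actual, predicted):
    """Agreement corrected for chance; undefined when chance agreement is 1."""
    n = len(actual)
    observed = overall_accuracy(actual, predicted)
    actual_counts, predicted_counts = Counter(actual), Counter(predicted)
    # Expected chance agreement from the marginal class frequencies.
    expected = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    return (observed - expected) / (1.0 - expected)

actual = ["a", "a", "a", "b", "b", "b"]
predicted = ["a", "a", "b", "b", "b", "a"]
print(overall_accuracy(actual, predicted), cohens_kappa(actual, predicted))
```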
Note that the output image parameter (output) is optional. When unspecified, the tool will simply report the model accuracy statistics and variable importance, allowing the user to experiment with different parameter settings and input predictor raster combinations to optimize the model before applying it to classify the whole image data set.
Like all supervised classification methods, this technique relies heavily on proper selection of training data. Training sites are exemplar areas/points of known and representative class value (e.g. land cover type). The algorithm determines the feature signatures of the pixels within each training area. In selecting training sites, care should be taken to ensure that they cover the full range of variability within each class. Otherwise the classification accuracy will be impacted. If possible, multiple training sites should be selected for each class. It is also advisable to avoid areas near the edges of class objects (e.g. land-cover patches), where mixed pixels may impact the purity of training site values.
After selecting training sites, the feature value distributions of each class type can be assessed using the evaluate_training_sites tool. In particular, the distribution of class values should ideally be non-overlapping in at least one feature dimension.
The k-NN algorithm is based on the calculation of distances in multi-dimensional space. Feature scaling is essential to the application of k-NN modelling, especially when the ranges of the features are different, for example, if they are measured in different units. Without scaling, features with larger ranges will have greater influence in computing the distances between points. The tool offers three options for feature-scaling (scaling), including 'None', 'Normalize', and 'Standardize'. Normalization simply rescales each of the features onto a 0-1 range. This is a good option for most applications, but it is highly sensitive to outliers because it is determined by the range of the minimum and maximum values. Standardization rescales predictors using their means and standard deviations, transforming the data into z-scores. This is a better option than normalization when you know that the data contain outlier values; however, it does assume that the feature data are somewhat normally distributed, or are at least symmetrical in distribution.
Because the k-NN algorithm calculates distances in feature-space, like many other related algorithms, it suffers from the curse of dimensionality. Distances become less meaningful in high-dimensional space because the vastness of these spaces means that distances between points are less significant (more similar). As such, if the predictor list includes insignificant or highly correlated variables, it is advisable to exclude these features during the model-building phase, or to use a dimension reduction technique such as principal_component_analysis to transform the features into a smaller set of uncorrelated predictors.
For a video tutorial on how to use the knn_classification tool, see this YouTube video.
Memory Usage
The peak memory usage of this tool is approximately 8 bytes per grid cell × # predictors.
See Also
knn_regression, random_forest_classification, svm_classification, parallelepiped_classification, evaluate_training_sites
Python API
def knn_classification(self, input_rasters: List[Raster], training_data: Vector, class_field_name: str, scaling_method: str = "none", k: int = 5, test_proportion: float = 0.2, use_clipping: bool = False, create_output: bool = False) -> Optional[Raster]:
Otsu Thresholding
Function name: otsu_thresholding
This tool uses Otsu's method for optimal automatic binary thresholding, transforming an input image (--input) into background and foreground pixels (--output). Otsu's method uses the grayscale image histogram to detect an optimal threshold value that separates two regions with maximum inter-class variance. The process begins by calculating the image histogram of the input.
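The threshold search can be sketched directly from a histogram. This is a generic illustration of Otsu's criterion (maximizing the between-class variance), not the tool's code; the function name and the convention that bins at or below the threshold are background are assumptions.

```python
def otsu_threshold(histogram):
    """Return the bin index t that maximizes between-class variance.

    Bins 0..t are treated as background, bins above t as foreground.
    """
    total = sum(histogram)
    sum_all = sum(i * h for i, h in enumerate(histogram))
    best_t, best_var = 0, -1.0
    w_bg = 0       # pixel count in the background class so far
    sum_bg = 0.0   # sum of bin index * count in the background class
    for t, h in enumerate(histogram):
        w_bg += h
        sum_bg += t * h
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

# A clearly bimodal 8-bin histogram.
print(otsu_threshold([5, 5, 0, 0, 0, 0, 5, 5]))
```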
References
Otsu, N., 1979. A threshold selection method from gray-level histograms. IEEE transactions on systems, man, and cybernetics, 9(1), pp.62-66.
See Also
image_segmentation
Python API
def otsu_thresholding(self, raster: Raster) -> Raster:
Parallelepiped Classification
Function name: parallelepiped_classification
Description
This tool performs a supervised parallelepiped classification using training site polygons (polys) and multi-spectral images (inputs). This classification method uses the minimum and maximum reflectance values for each class within the training data to characterize a set of parallelepipeds, i.e. multi-dimensional geometric shapes. The algorithm then assigns each unknown pixel in the image data set to the first class for which the pixel's spectral vector is contained within the corresponding class parallelepiped. Pixels with spectral vectors that are not contained within any class parallelepiped will not be assigned a class in the output image.
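The box test described above can be sketched as follows. The helper names are hypothetical, and class order is taken from the insertion order of the training dictionary, which is an assumption about how "first class" is determined.

```python
def class_boxes(training):
    """Per-class (minimum, maximum) value in each band from the training pixels."""
    boxes = {}
    for label, pixels in training.items():
        bands = list(zip(*pixels))
        boxes[label] = ([min(b) for b in bands], [max(b) for b in bands])
    return boxes

def parallelepiped_classify(pixel, boxes):
    """Return the first class whose box contains the pixel, else None."""
    for label, (lows, highs) in boxes.items():
        if all(lo <= v <= hi for v, lo, hi in zip(pixel, lows, highs)):
            return label
    return None

training = {
    "water": [(10, 20), (14, 26)],
    "forest": [(12, 24), (60, 90)],
}
boxes = class_boxes(training)
print(parallelepiped_classify((13, 25), boxes))    # lies inside both boxes
print(parallelepiped_classify((100, 100), boxes))  # lies inside neither box
```

The first pixel falls inside both class boxes and is given to the class that is tested first, which illustrates why overlapping training signatures make this classifier order-dependent.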
Like all supervised classification methods, this technique relies heavily on proper selection of training data. Training sites are exemplar areas of known and representative land cover type. The algorithm determines the spectral signature of the pixels within each training area, and uses this information to define the minimum and maximum bounds of each class parallelepiped. It is preferable that training sites are based on either field-collected data or fine-resolution reference imagery. In selecting training sites, care should be taken to ensure that they cover the full range of variability within each class. Otherwise the classification accuracy will be impacted. If possible, multiple training sites should be selected for each class. It is also advisable to avoid areas near the edges of land-cover patches, where mixed pixels may impact the purity of training site reflectance values.
After selecting training sites, the reflectance values of each land-cover type can be assessed using the evaluate_training_sites tool. In particular, the distribution of reflectance values should ideally be non-overlapping in at least one band of the multi-spectral data set.
See Also
evaluate_training_sites, min_dist_classification
Python API
def parallelepiped_classification(self, input_rasters: List[Raster], training_data: Vector, class_field_name: str) -> Raster:
Random Forest Classification
Function name: random_forest_classification
Experimental
Supervised Random Forest classification for multisource raster features using point/polygon training data.
Tags: remote_sensing, classification, random_forest, legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| inputs | Array of single-band input rasters. | Required | ['band1.tif', 'band2.tif', 'band3.tif'] |
| training_data | Point/polygon vector training data path. | Required | training.shp |
| class_field | Class field in training_data attributes. | Required | class |
| scaling | Feature scaling mode: none (default), normalize, standardize. | Optional | none |
| n_trees | Number of trees in the forest (default 200). | Optional | 200 |
| min_samples_leaf | Minimum number of samples required at a leaf node (default 1). | Optional | 1 |
| min_samples_split | Minimum number of samples required to split an internal node (default 2). | Optional | 2 |
| output | Optional output raster path. | Optional | — |
Examples
Run random forest classification on multiband predictors.
wbe.random_forest_classification(class_field='class', inputs=['band1.tif', 'band2.tif', 'band3.tif'], n_trees=300, output='rf_classification.tif', scaling='standardize', training_data='training.shp')
Random Forest Regression
Function name: random_forest_regression
Experimental
Random Forest regression for continuous targets (e.g., biomass, moisture, temperature) from raster predictors.
Tags: remote_sensing, regression, random_forest, legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| inputs | Array of single-band input rasters. | Required | ['band1.tif', 'band2.tif', 'band3.tif'] |
| training_data | Point vector training data path. | Required | training_points.shp |
| field | Numeric target field in training_data attributes. | Required | value |
| scaling | Feature scaling mode: none (default), normalize, standardize. | Optional | none |
| n_trees | Number of trees in the forest (default 200). | Optional | 200 |
| min_samples_leaf | Minimum number of samples required at a leaf node (default 1). | Optional | 1 |
| min_samples_split | Minimum number of samples required to split an internal node (default 2). | Optional | 2 |
| output | Optional output raster path. | Optional | — |
Examples
Run random forest regression on multiband predictors.
wbe.random_forest_regression(field='target', inputs=['band1.tif', 'band2.tif', 'band3.tif'], n_trees=300, output='rf_regression.tif', scaling='standardize', training_data='training_points.shp')
SVM Classification
Function name: svm_classification
Description
This tool performs a support vector machine (SVM) binary classification using multiple predictor rasters (inputs), or features, and training data (training). SVMs are a common class of supervised learning algorithms widely applied in many problem domains. This tool can be used to model the spatial distribution of class data, such as land-cover type, soil class, or vegetation type. The training data take the form of an input vector Shapefile containing a set of points or polygons, for which the known class information is contained within a field (field) of the attribute table. Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space. Note that the svm_regression tool can be used to apply the SVM method to the modelling of continuous data.
The user must specify the values of three parameters used in the development of the model: the c parameter (-c), gamma (gamma), and the tolerance (tolerance). The c-value is the regularization parameter used in model optimization. The gamma parameter defines the radial basis function (Gaussian) kernel parameter. The tolerance parameter controls the stopping condition used during model optimization.
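To give a feel for the role of gamma, the sketch below evaluates a radial basis function kernel in its common form, exp(-gamma * ||x - y||^2). That the tool uses exactly this parameterization is an assumption; the function name is hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) similarity between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Larger gamma makes similarity fall off more quickly with distance.
for gamma in (0.1, 0.5, 2.0):
    print(gamma, rbf_kernel((0.0, 0.0), (1.0, 1.0), gamma))
```

Under this form, a larger gamma yields a more local, flexible decision surface, while a smaller gamma yields a smoother one.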
The tool splits the training data into two sets, one for training the classifier and one for testing the classification. These test data are used to calculate the overall accuracy and Matthews correlation coefficient (MCC). The test_proportion parameter is used to set the proportion of the input training data used in model testing. For example, if test_proportion = 0.2, 20% of the training data will be set aside for testing, and this subset will be selected randomly. As a result of this random selection of test data, the tool behaves stochastically, and will result in a different model each time it is run.
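For a binary classification, the MCC can be computed from the four confusion-matrix counts as sketched below. This is the standard definition; the function name, the positive argument, and the choice to return 0.0 when the denominator is zero are assumptions for the example.

```python
import math

def matthews_corrcoef(actual, predicted, positive):
    """Matthews correlation coefficient for a two-class problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0

actual = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 1]
print(matthews_corrcoef(actual, predicted, positive=1))
```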
Note that the output image parameter (output) is optional. When unspecified, the tool will simply report the model accuracy statistics, allowing the user to experiment with different parameter settings and input predictor raster combinations to optimize the model before applying it to classify the whole image data set.
Like all supervised classification methods, this technique relies heavily on proper selection of training data. Training sites are exemplar areas/points of known and representative class value (e.g. land cover type). The algorithm determines the feature signatures of the pixels within each training area. In selecting training sites, care should be taken to ensure that they cover the full range of variability within each class. Otherwise the classification accuracy will be impacted. If possible, multiple training sites should be selected for each class. It is also advisable to avoid areas near the edges of class objects (e.g. land-cover patches), where mixed pixels may impact the purity of training site values.
After selecting training sites, the feature value distributions of each class type can be assessed using the evaluate_training_sites tool. In particular, the distribution of class values should ideally be non-overlapping in at least one feature dimension.
The SVM algorithm is based on the calculation of distances in multi-dimensional space. Feature scaling is essential to the application of SVM-based modelling, especially when the ranges of the features are different, for example, if they are measured in different units. Without scaling, features with larger ranges will have greater influence in computing the distances between points. The tool offers three options for feature-scaling (scaling), including 'None', 'Normalize', and 'Standardize'. Normalization simply rescales each of the features onto a 0-1 range. This is a good option for most applications, but it is highly sensitive to outliers because it is determined by the range of the minimum and maximum values. Standardization rescales predictors using their means and standard deviations, transforming the data into z-scores. This is a better option than normalization when you know that the data contain outlier values; however, it does assume that the feature data are somewhat normally distributed, or are at least symmetrical in distribution.
Because the SVM algorithm calculates distances in feature-space, like many other related algorithms, it suffers from the curse of dimensionality. Distances become less meaningful in high-dimensional space because the vastness of these spaces means that distances between points are less significant (more similar). As such, if the predictor list includes insignificant or highly correlated variables, it is advisable to exclude these features during the model-building phase, or to use a dimension reduction technique such as principal_component_analysis to transform the features into a smaller set of uncorrelated predictors.
See Also
random_forest_classification, knn_classification, parallelepiped_classification, evaluate_training_sites, principal_component_analysis
Python API
def svm_classification(self, input_rasters: List[Raster], training_data: Vector, class_field_name: str, scaling_method: str = "none", c_value: float = 50.0, kernel_gamma: float = 0.5, tolerance: float = 0.1, test_proportion: float = 0.2, create_output: bool = False) -> Optional[Raster]:
SVM Regression
Function name: svm_regression
Description
This tool performs a supervised support vector machine (SVM) regression analysis using multiple predictor rasters (inputs), or features, and training data (training). SVMs are a common class of supervised learning algorithms widely applied in many problem domains. This tool can be used to model the spatial distribution of continuous data, such as soil properties (e.g. percent sand/silt/clay). The training data take the form of an input vector Shapefile containing a set of points for which the known outcome data is contained within a field (field) of the attribute table. Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space. Note that the svm_classification tool can be used to apply the SVM method to the modelling of categorical data.
The user must specify the c-value (-c), the regularization parameter used in model optimization, the epsilon-value (eps), used in the development of the epsilon-SVM regression model, and the gamma-value (gamma), which is used in defining the radial basis function (Gaussian) kernel parameter.
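The role of the epsilon value can be illustrated with the epsilon-insensitive loss that is standard in epsilon-SVM regression: residuals smaller than epsilon cost nothing, and larger residuals are penalized linearly. The function name is hypothetical, and this sketch shows only the loss, not the model fitting.

```python
def epsilon_insensitive_loss(observed, predicted, epsilon=10.0):
    """Zero inside the epsilon tube, linear in the residual outside it."""
    return max(0.0, abs(observed - predicted) - epsilon)

print(epsilon_insensitive_loss(100.0, 105.0))  # residual 5 is inside the tube
print(epsilon_insensitive_loss(100.0, 125.0))  # residual 25 is outside the tube
```

Because the loss is expressed in the units of the outcome variable, a suitable epsilon depends on the scale of the field being modelled.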
The tool splits the training data into two sets, one for training the model and one for testing the prediction. These test data are used to calculate the regression accuracy statistics, as well as to estimate the variable importance. The test_proportion parameter is used to set the proportion of the input training data used in model testing. For example, if test_proportion = 0.2, 20% of the training data will be set aside for testing, and this subset will be selected randomly. As a result of this random selection of test data, the tool behaves stochastically, and will result in a different model each time it is run.
Note that the output image parameter (output) is optional. When unspecified, the tool will simply report the model accuracy statistics and variable importance, allowing the user to experiment with different parameter settings and input predictor raster combinations to optimize the model before applying it to model the outcome variable across the whole region defined by the image data set.
The SVM algorithm is based on the calculation of distances in multi-dimensional space. Feature scaling is essential to the application of SVM modelling, especially when the ranges of the features are different, for example, if they are measured in different units. Without scaling, features with larger ranges will have greater influence in computing the distances between points. The tool offers three options for feature-scaling (scaling), including 'None', 'Normalize', and 'Standardize'. Normalization simply rescales each of the features onto a 0-1 range. This is a good option for most applications, but it is highly sensitive to outliers because it is determined by the range of the minimum and maximum values. Standardization rescales predictors using their means and standard deviations, transforming the data into z-scores. This is a better option than normalization when you know that the data contain outlier values; however, it does assume that the feature data are somewhat normally distributed, or are at least symmetrical in distribution.
Because the SVM algorithm calculates distances in feature-space, like many other related algorithms, it suffers from the curse of dimensionality. Distances become less meaningful in high-dimensional space because the vastness of these spaces means that distances between points are less significant (more similar). As such, if the predictor list includes insignificant or highly correlated variables, it is advisable to exclude these features during the model-building phase, or to use a dimension reduction technique such as principal_component_analysis to transform the features into a smaller set of uncorrelated predictors.
See Also
svm_classification, random_forest_regression, knn_regression, principal_component_analysis
Python API
def svm_regression(self, input_rasters: List[Raster], training_data: Vector, class_field_name: str, scaling_method: str = "none", c_value: float = 50.0, epsilon_value: float = 10.0, kernel_gamma: float = 0.5, test_proportion: float = 0.2, create_output: bool = False) -> Optional[Raster]:
Object-Based Image Analysis (OBIA)
Build Object Hierarchy Multiscale
Function name: build_object_hierarchy_multiscale
No help documentation available for this tool.
Classify Objects Ensemble Pro
Function name: classify_objects_ensemble_pro
No help documentation available for this tool.
Classify Objects Random Forest
Function name: classify_objects_random_forest
No help documentation available for this tool.
Classify Objects Rules Basic
Function name: classify_objects_rules_basic
No help documentation available for this tool.
Classify Objects Rules Hierarchical
Function name: classify_objects_rules_hierarchical
No help documentation available for this tool.
Classify Objects SVM
Function name: classify_objects_svm
No help documentation available for this tool.
Evaluate Object Classification Accuracy
Function name: evaluate_object_classification_accuracy
No help documentation available for this tool.
Evaluate Segmentation Quality Pro
Function name: evaluate_segmentation_quality_pro
No help documentation available for this tool.
Image Segmentation
Function name: image_segmentation
Description
This tool is used to segment a multi-spectral image data set, or multi-dimensional data stack. The algorithm is based on region-growing operations. Each of the input images is transformed into standard scores prior to analysis. The total multi-dimensional distance between each pixel and its eight neighbours is measured, which then serves as a priority value for selecting potential seed pixels for the region-growing operations, with pixels exhibiting the least difference with their neighbours more likely to serve as seeds. The region-growing operations initiate at seed pixels and grow outwards, connecting neighbouring pixels that have a multi-dimensional distance from the seed cell that is less than a threshold value. Thus, the region-growing operations attempt to identify contiguous, relatively homogeneous objects. The algorithm stratifies potential seed pixels into bands, based on their total difference with their eight neighbours. The user may control the size and number of these bands using the threshold and steps parameters respectively. Increasing the magnitude of the threshold parameter will result in fewer mapped objects and vice versa. All pixels that are not assigned to an object after the seeding-based region-growing operations are then clumped simply based on contiguity.
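The seed-priority measure can be sketched for a small in-memory stack. This illustrates only the "total difference with the eight neighbours" value; the function name is hypothetical, the bands are assumed to be standardized already, and skipping neighbours that fall outside the grid is an assumption about edge handling.

```python
import math

def neighbour_difference(bands, row, col):
    """Sum of feature-space distances between a pixel and its (up to) 8 neighbours.

    bands is a list of equally sized 2-D lists, one per standardized input image.
    """
    rows, cols = len(bands[0]), len(bands[0][0])
    centre = [band[row][col] for band in bands]
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            if (dr or dc) and 0 <= r < rows and 0 <= c < cols:
                total += math.dist(centre, [band[r][c] for band in bands])
    return total

band = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
print(neighbour_difference([band], 1, 1))  # centre pixel differs from all 8 neighbours
print(neighbour_difference([band], 0, 0))  # corner pixel has only 3 neighbours
```

Pixels with low values of this measure sit in locally homogeneous areas and are therefore the more likely seed candidates.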
It is commonly the case that there will be a large number of very small-sized objects identified using this approach. The user may optionally specify that objects that are less than a minimum area (expressed in pixels) be eliminated from the final output raster. The min_area parameter must be an integer between 1 and 8. In cleaning small objects from the output, the pixels belonging to these smaller features are assigned to the most homogeneous neighbouring object.
The input rasters (inputs) may be bands of satellite imagery, or any other attribute, such as measures of texture, elevation, or other topographic derivatives, such as slope. If satellite imagery is used as inputs, it can be beneficial to pre-process the data with an edge-preserving low-pass filter, such as the bilateral_filter and edge_preserving_mean_filter tools.
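The seed-priority step described above can be sketched in a few lines. This is a minimal illustration only, not the tool's implementation: the helper names `standardize` and `neighbour_difference` are hypothetical, bands are plain lists of rows, and edge pixels simply use the neighbours that exist.

```python
import math

def standardize(band):
    """Convert a 2-D band (list of rows) to standard scores (z-scores)."""
    values = [v for row in band for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # assumption: treat a constant band as all zeros
    return [[(v - mean) / std for v in row] for row in band]

def neighbour_difference(bands):
    """Sum of multi-dimensional distances from each pixel to its (up to) eight neighbours."""
    z = [standardize(b) for b in bands]
    rows, cols = len(z[0]), len(z[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                        total[r][c] += math.sqrt(
                            sum((b[r][c] - b[rr][cc]) ** 2 for b in z)
                        )
    return total

# Two tiny bands: a flat area with one bright corner pixel, and a constant band.
band_a = [[1, 1, 1], [1, 1, 1], [1, 1, 9]]
band_b = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
diff = neighbour_difference([band_a, band_b])

# Pixels with the smallest total difference are the most likely seeds.
seeds = sorted(
    ((diff[r][c], r, c) for r in range(3) for c in range(3))
)
```

In this toy example the top-left pixel has a total difference of zero and sorts first as a seed candidate, while the bright corner pixel sorts last.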
See Also
bilateral_filter, edge_preserving_mean_filter
Python API
def image_segmentation(self, input_rasters: List[Raster], dist_threshold: float = 0.5, num_steps: int = 10, area_threshold: int = 4) -> Raster:
Image Slider
Function name: image_slider
Description
This tool creates an interactive image slider from two input images (input1 and input2). An image slider is an interactive visualization of two overlapping images, in which the user moves the position of a slider bar to hide or reveal one of the overlapping images. The output (output) is an HTML file. Each of the two input images may be rendered in one of several available palettes. If the input image is a colour composite image, no palette is required. Labels may also be optionally associated with each of the images, displayed in the upper left and right corners. The user must also specify the image height (height) in the output file. Note that the output is simply HTML, CSS, and JavaScript code, which can be readily embedded in other documents.
Python API
def image_slider(self, left_raster: Raster, right_raster: Raster, output_html_file: str, left_palette: WbPalette = WbPalette.Grey, left_reverse_palette: bool = False, left_label: str = "", right_palette: WbPalette = WbPalette.Grey, right_reverse_palette: bool = False, right_label: str = "", image_height: int = 600) -> None:
Image Stack Profile
Function name: image_stack_profile
This tool can be used to plot an image stack profile (i.e. a signature) for a set of points (points) and a multispectral image stack (inputs). The tool outputs an interactive SVG line graph embedded in an HTML document. If the input points vector contains multiple points, each input point will be associated with a single line in the output plot. The order of vertices in each signature line is determined by the order of images specified in the inputs parameter. At least two input images are required to run this operation. Note that this tool does not require multispectral images as inputs; other types of data may also be used as the image stack. Also note that the input images should be single-band, continuous greytone rasters. RGB colour images are not good candidates for this tool.
If you require the raster values to be saved in the vector points file's attribute table, or if you need the raster values to be output as text, you may use the extract_raster_values_at_points tool instead.
See Also
extract_raster_values_at_points
Python API
def image_stack_profile(self, images: List[Raster], points: Vector, output_html_file: str) -> None:
OBIA Audit Report Pro
Function name: obia_audit_report_pro
No help documentation available for this tool.
OBIA Batch Orchestrator Pro
Function name: obia_batch_orchestrator_pro
No help documentation available for this tool.
OBIA Pipeline Basic
Function name: obia_pipeline_basic
No help documentation available for this tool.
Object Class Probability Maps
Function name: object_class_probability_maps
No help documentation available for this tool.
Object Features Context Neighbors
Function name: object_features_context_neighbors
No help documentation available for this tool.
Object Features Shape Basic
Function name: object_features_shape_basic
No help documentation available for this tool.
Object Features Spectral Basic
Function name: object_features_spectral_basic
No help documentation available for this tool.
Object Features Texture GLCM Basic
Function name: object_features_texture_glcm_basic
No help documentation available for this tool.
Object Features Topology Relations
Function name: object_features_topology_relations
No help documentation available for this tool.
Object Uncertainty Diagnostics Pro
Function name: object_uncertainty_diagnostics_pro
No help documentation available for this tool.
Objects Boundary Refinement Pro
Function name: objects_boundary_refinement_pro
No help documentation available for this tool.
Objects Enforce Min Mapping Unit
Function name: objects_enforce_min_mapping_unit
No help documentation available for this tool.
Polygons To Segments
Function name: polygons_to_segments
No help documentation available for this tool.
Propagate Labels Across Hierarchy
Function name: propagate_labels_across_hierarchy
No help documentation available for this tool.
Segment Graph Felzenszwalb
Function name: segment_graph_felzenszwalb
No help documentation available for this tool.
Segment Multiresolution Hierarchical
Function name: segment_multiresolution_hierarchical
No help documentation available for this tool.
Segment Scale Parameter Optimizer
Function name: segment_scale_parameter_optimizer
No help documentation available for this tool.
Segment SLIC Superpixels
Function name: segment_slic_superpixels
No help documentation available for this tool.
Segment Watershed Markers
Function name: segment_watershed_markers
No help documentation available for this tool.
Segments Merge Small Regions
Function name: segments_merge_small_regions
No help documentation available for this tool.
Segments Split Low Cohesion
Function name: segments_split_low_cohesion
No help documentation available for this tool.
Segments To Polygons
Function name: segments_to_polygons
No help documentation available for this tool.
Change Detection
Change Vector Analysis
Function name: change_vector_analysis
Change Vector Analysis (CVA) is a change detection method that characterizes the magnitude and direction of change in spectral space between two dates. A change vector is the difference vector between two vectors in n-dimensional feature space defined for two observations of the same geographical location (i.e. corresponding pixels) on two dates. The CVA inputs include the set of raster images corresponding to the multispectral data for each date. Note that there must be the same number of image files (bands) for the two dates and they must be entered in the same order, i.e. if three bands, red, green, and blue are entered for date one, these same bands must be entered in the same order for date two.
CVA outputs two image files. The first image contains the change vector length, i.e. magnitude, for each pixel in the multi-spectral dataset. The second image contains information about the direction of the change event in spectral feature space, which is related to the type of change event, e.g. deforestation will likely have a different change direction than, say, crop growth. The vector magnitude is a continuous numerical variable. The change vector direction is presented in the form of a code, referring to the multi-dimensional sector in which the change vector occurs. A text output will be produced to provide a key describing sector codes, relating the change vector to positive or negative shifts in n-dimensional feature space.
It is common to apply a simple thresholding operation on the magnitude data to determine 'actual' change (i.e. change above some assumed level of error). The type of change (qualitatively) is then defined according to the corresponding sector code. Jensen (2015) provides a useful description of this approach to change detection.
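For a single pixel, the magnitude and direction outputs can be illustrated as follows. This is a sketch under stated assumptions: the function name `change_vector` is hypothetical, and the bit-based sector encoding is only one plausible way to code the sign pattern. The tool's actual codes are defined by the key in its text output.

```python
import math

def change_vector(date1, date2):
    """Return (magnitude, sector_code) for one pixel's band values on two dates."""
    diffs = [b2 - b1 for b1, b2 in zip(date1, date2)]
    # Magnitude: Euclidean length of the change vector in n-dimensional space.
    magnitude = math.sqrt(sum(d * d for d in diffs))
    # Hypothetical encoding: bit k is set when band k increased between dates.
    sector = sum(1 << k for k, d in enumerate(diffs) if d > 0)
    return magnitude, sector

# Three bands; band 0 rises by 3, band 1 falls by 4, band 2 is unchanged.
magnitude, sector = change_vector([10, 20, 30], [13, 16, 30])

# A simple threshold on magnitude flags 'actual' change.
changed = magnitude > 4.0
```

Here the magnitude is 5.0 and only the first band increased, so the illustrative sector code is 1.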
Reference
Jensen, J. R. (2015). Introductory Digital Image Processing: A Remote Sensing Perspective.
See Also
write_function_memory_insertion
Python API
def change_vector_analysis(self, date1_rasters: List[Raster], date2_rasters: List[Raster]) -> Tuple[Raster, Raster, str]:
Image Difference Change Detection
Function name: image_difference_change_detection
No help documentation available for this tool.
PCA Based Change Detection
Function name: pca_based_change_detection
No help documentation available for this tool.
Post Classification Change
Function name: post_classification_change
No help documentation available for this tool.
Remote Sensing Change Detection
Function name: remote_sensing_change_detection
PRO | Production
Detect spectral change between baseline and change-date multispectral bundles with profile-based sensitivity.
workflow pro
Workflow Narrative
Remote Sensing Change Detection
Problem It Solves
Where is meaningful vegetation change occurring, and which detections are reliable enough for reporting?
Who It Is For
- Environmental monitoring analysts and EO operations teams.
Primary User
Environmental consultancies, forestry/conservation agencies, and compliance programs.
What It Does
- Detects vegetation loss/gain between two dates using NDVI-style change logic.
- Uses bundle-native multiband inputs with explicit red/NIR band indices.
- Produces a signed change raster and a confidence raster for analyst triage.
How It Works
- Computes NDVI-like indices per date from selected red and NIR bands.
- Calculates signed per-pixel delta between change-date and baseline NDVI-like response.
- Applies profile-dependent thresholds and local consistency checks to derive confidence.
- Indicative formula: NDVI = (NIR - Red) / (NIR + Red), then change = NDVI_change - NDVI_baseline.
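The indicative formula above can be written out for a single pixel. This is a minimal sketch: the function names are hypothetical, and returning 0.0 when NIR + Red is zero is an assumption made here to avoid division by zero, not documented tool behaviour.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red)."""
    total = nir + red
    # Assumption: return 0.0 where the denominator vanishes.
    return (nir - red) / total if total != 0 else 0.0

def ndvi_change(baseline, change):
    """Signed delta; baseline and change are (red, nir) pairs for the same pixel."""
    return ndvi(change[1], change[0]) - ndvi(baseline[1], baseline[0])

# Baseline: healthy vegetation; change date: much weaker NIR response.
delta = ndvi_change(baseline=(0.1, 0.5), change=(0.2, 0.3))
```

A negative delta, as in this example, indicates vegetation loss between the baseline and change dates.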
Why It Wins
- Bundle-native inputs reduce parameter mismatch errors and make cross-date processing more reproducible.
Typical Buying Trigger
A client or regulator requires repeatable, confidence-scored vegetation change evidence for periodic reporting.
Typical Presets
- aggressive: lower change threshold, more sensitive detection.
- balanced: default tradeoff for general monitoring.
- conservative: stricter thresholds, fewer false positives.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| baseline_bundle, baseline_red_band_index, baseline_nir_band_index | no | Baseline multispectral bundle and red/NIR band selectors used to compute baseline vegetation response. |
| change_bundle, change_red_band_index, change_nir_band_index | no | Change-date multispectral bundle and red/NIR band selectors used for signed change estimation. |
| optional intermediate_ndvi | yes | Intermediate-date NDVI raster used to strengthen temporal plausibility scoring. |
| profile: aggressive / balanced / conservative | yes | Processing profile controlling sensitivity, quality strictness, and runtime tradeoffs. |
| high_confidence_threshold | yes | Confidence threshold used for summary metrics (default 0.85). |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| change_map | GeoTIFF | Primary change-intelligence raster showing direction and magnitude of detected change. |
| confidence | GeoTIFF | Confidence layer quantifying reliability of modeled outputs. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

change_map, confidence, summary = wbe.remote_sensing_change_detection(
    baseline_bundle="data/baseline_bundle.tif",
    baseline_red_band_index=0,
    baseline_nir_band_index=1,
    change_bundle="data/change_bundle.tif",
    change_red_band_index=0,
    change_nir_band_index=1,
    profile="balanced",
    output="output/rs_change",
)

print(change_map)
print(confidence)
print(summary)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Time Series Change Intelligence
Function name: time_series_change_intelligence
PRO | Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Time-Series Change Analysis
Problem It Solves
Where and when are structural shifts emerging in the time series, and how confident are those signals?
Who It Is For
- Time-series monitoring teams and geospatial data science groups.
Primary User
Regional planning programs, policy/compliance analytics groups, and EO product teams.
What It Does
- Performs multi-date trend and breakpoint analysis from temporal stacks.
- Supports mode-dependent decomposition/segmentation behavior.
- Emits confidence-scored temporal change surfaces.
How It Works
- Traverses each pixel time series and estimates trend response over the observation window.
- Uses algorithm_mode-specific breakpoint logic to detect structural shifts.
- Converts fit strength, breakpoint support, and observation sufficiency into confidence output.
- Indicative formula: confidence ~= f(|trend|, breakpoint_support, n_observations), bounded to [0, 1].
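The manual does not state which trend estimator the tool uses, and the confidence function f is left unspecified. Purely as an illustration of a per-pixel trend, the sketch below computes an ordinary least-squares slope for one pixel's time series, assuming equally spaced observations. The function name `trend_slope` is hypothetical.

```python
def trend_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

# One pixel observed on four equally spaced dates.
slope = trend_slope([1.0, 3.0, 5.0, 7.0])
```

The series rises by 2 per time step, so the fitted slope is 2.0. Breakpoint detection and the confidence score are outside the scope of this sketch.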
Why It Wins
- Provides mode flexibility (fast/iterative/bfast) so users can balance throughput and analytical depth.
Typical Buying Trigger
A monitoring program moves from two-date snapshots to sustained time-series surveillance.
Typical Presets
- fast: throughput-oriented screening.
- iterative: stronger breakpoint refinement.
- bfast: decomposition-oriented analysis.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| input_stack (required) | no | Primary temporal raster stack used for trend and breakpoint analysis. |
| optional qa_stack | yes | Optional temporal QA stack used to weight or suppress low-quality observations. |
| algorithm_mode and thresholding controls | no | Time-series change algorithm mode and threshold controls. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| trend_change | GeoTIFF | Primary change-intelligence raster showing direction and magnitude of detected change. |
| breakpoint_count | GeoTIFF | Per-pixel count of detected temporal breakpoints. |
| breakpoint_date | GeoTIFF | Estimated timing raster for dominant detected breakpoint events. |
| change_confidence | GeoTIFF | Confidence surface for the time-series change detection result. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

result = wbe.time_series_change_intelligence(
    input_stack="data/time_stack.tif",
    qa_stack="data/time_stack_qa.tif",
    algorithm_mode="bfast",
    output_prefix="output/ts_change",
)

print(result)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Write Function Memory Insertion
Function name: write_function_memory_insertion
Jensen (2015) describes write function memory (WFM) insertion as a simple yet effective method of visualizing land-cover change between two or three dates. WFM insertion may be used to qualitatively inspect change in any type of registered, multi-date imagery. The technique operates by creating a red-green-blue (RGB) colour composite image based on co-registered imagery from two or three dates. If two dates are input, the first date image will be put into the red channel, while the second date image will be put into both the green and blue channels. The result is an image where the areas of change are displayed as red (date 1 is brighter than date 2) and cyan (date 1 is darker than date 2), and areas of little change are represented in grey-tones. The larger the change in pixel brightness between dates, the more intense the resulting colour will be.
If images from three dates are input, the resulting composite can contain many distinct colours. Again, more intense colours are indicative of areas of greater land-cover change among the dates, while areas of little change are represented in grey-tones. Interpreting the direction of change is more difficult when three dates are used. Note that for multi-spectral imagery, only one band from each date can be used for creating a WFM insertion image.
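The two-date channel assignment can be sketched directly. This is an illustration only, with a hypothetical function name; images are single-band lists of rows, and only the two-date case described above is shown.

```python
def wfm_two_date(date1, date2):
    """Date 1 goes to red; date 2 goes to both green and blue."""
    return [
        [(v1, v2, v2) for v1, v2 in zip(row1, row2)]
        for row1, row2 in zip(date1, date2)
    ]

# One row, three pixels: brighter on date 1, darker on date 1, unchanged.
rgb = wfm_two_date([[200, 50, 120]], [[50, 200, 120]])
```

The first pixel comes out red (200, 50, 50), the second cyan (50, 200, 200), and the unchanged pixel grey (120, 120, 120), matching the interpretation given above.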
Reference
Jensen, J. R. (2015). Introductory Digital Image Processing: A Remote Sensing Perspective.
See Also
create_colour_composite, change_vector_analysis
Python API
def write_function_memory_insertion(self, image1: Raster, image2: Raster, image3: Raster) -> Raster:
Radiometric Correction
BRDF Surface Reflectance Consistency
Function name: brdf_surface_reflectance_consistency
PRO | Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Surface Reflectance Consistency Analysis
Problem It Solves
Are directional and terrain illumination effects sufficiently normalized for reliable scene-to-scene reflectance comparison?
Who It Is For
- Optical remote sensing teams preparing cross-scene reflectance products for trend/change workflows.
Primary User
EO product teams, environmental mapping agencies, and analytics groups with multi-date reflectance pipelines.
What It Does
- Harmonizes reflectance across dates and sensors after terrain correction.
- Quantifies normalization magnitude and confidence for QA-aware downstream analysis.
- Packages reflectance consistency as a reproducible workflow-stage product.
How It Works
- Runs an OSS terrain-correction prep stage when the input scene is still topographically distorted.
- Derives normalization delta from pre/post correction residual magnitude.
- Computes consistency confidence using exponential damping of large correction residuals.
- Indicative formula: $C = \exp(-\Delta / s) \cdot Q$, where $\Delta$ is normalization delta and $Q$ is quality confidence.
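The indicative confidence formula can be evaluated for a single pixel as below. This is a sketch, not the tool's code: the function name is hypothetical, and `scale` stands in for $s$, which the manual does not define; it is treated here as a positive damping scale.

```python
import math

def consistency_confidence(delta, quality, scale=1.0):
    """C = exp(-delta / s) * Q, with s assumed to be a positive damping scale."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return math.exp(-delta / scale) * quality

# No correction needed: confidence equals the quality term.
c_small = consistency_confidence(delta=0.0, quality=0.9)
# A large normalization delta damps confidence exponentially.
c_large = consistency_confidence(delta=2.0, quality=0.9)
```

Larger correction residuals therefore always lower the confidence, and the quality term caps it from above.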
Why It Wins
- Couples normalization output with explicit delta and confidence diagnostics, enabling QA-aware downstream acceptance rules.
Typical Buying Trigger
Teams observe inconsistent reflectance behavior across acquisition geometries and need a reproducible consistency gate.
Typical Presets
- fast: quicker normalization for large-area throughput.
- balanced: default quality/speed tradeoff.
- conservative: stronger correction confidence thresholding.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| input_red, input_nir, input_dem | no | Core optical + terrain inputs used for topographic/reflectance normalization workflows. |
| solar_zenith_deg, solar_azimuth_deg | no | Solar geometry parameters used to model illumination and terrain incidence effects. |
| optional input_green | yes | Optional green band used by workflows that include green-channel diagnostics. |
| profile: fast / balanced / conservative | no | Processing profile controlling sensitivity, quality strictness, and runtime tradeoffs. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| brdf_normalized_reflectance | GeoTIFF | BRDF-normalized reflectance raster with improved angular consistency. |
| normalization_delta | GeoTIFF | Difference layer showing magnitude of BRDF normalization adjustments. |
| consistency_confidence | GeoTIFF | Confidence surface indicating reflectance consistency after normalization. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

normalized, delta, confidence, summary = wbe.brdf_surface_reflectance_consistency(
    input_red="data/red.tif",
    input_nir="data/nir.tif",
    input_dem="data/dem.tif",
    input_green="data/green.tif",
    solar_zenith_deg=40.0,
    solar_azimuth_deg=165.0,
    profile="balanced",
    output_prefix="output/brdf_consistency",
)

print(normalized)
print(delta)
print(confidence)
print(summary)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
BRDF Normalization
Function name: brdf_normalization
No help documentation available for this tool.
Correct Vignetting
Function name: correct_vignetting
This tool can be used to reduce vignetting within an image. Vignetting refers to the reduction of image brightness away from the image centre (i.e. the principal point). Vignetting is a radiometric distortion resulting from lens characteristics. The algorithm calculates the brightness value in the output image (BVout) as:
BVout = BVin / [cos^n(arctan(d / f))]
Where d is the photo-distance from the principal point in millimetres, f is the focal length of the camera, in millimetres, and n is a user-specified parameter. Pixel distances are converted to photo-distances (in millimetres) using the specified image width, i.e. distance between left and right edges (mm). For many cameras, 4.0 is an appropriate value of the n parameter. A second pass of the image is used to rescale the output image so that it possesses the same minimum and maximum values as the input image.
If an RGB image is input, the analysis will be performed on the intensity component of the HSI transform.
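The brightness correction can be checked numerically with a short sketch. The function names are hypothetical, and the pixel-to-millimetre conversion shown (image width divided by the number of columns) is an assumption about how the specified image width is applied. The final rescaling pass to the input minimum and maximum is not shown.

```python
import math

def pixel_to_photo_distance(d_pixels, image_width_mm, n_columns):
    """Assumed conversion: one pixel spans image_width_mm / n_columns millimetres."""
    return d_pixels * image_width_mm / n_columns

def correct_vignetting_value(bv_in, d_mm, focal_length_mm=304.8, n=4.0):
    """BVout = BVin / cos^n(arctan(d / f))."""
    return bv_in / (math.cos(math.atan(d_mm / focal_length_mm)) ** n)

# At the principal point (d = 0) the value is unchanged.
centre = correct_vignetting_value(100.0, 0.0)
# When d equals the focal length, arctan(d / f) is 45 degrees and cos^4 is 0.25.
edge = correct_vignetting_value(100.0, 304.8)
# A pixel 1000 columns from the principal point in a 4000-column, 228.6 mm wide image.
d_mm = pixel_to_photo_distance(1000, 228.6, 4000)
```

The correction grows with distance from the principal point, which is why the output is rescaled afterwards to the input value range.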
Python API
def correct_vignetting(self, image: Raster, principal_point: Vector, focal_length: float = 304.8, image_width: float = 228.6, n_param: float = 4.0) -> Raster:
Dark Object Subtraction
Function name: dark_object_subtraction
No help documentation available for this tool.
Dn To Toa Reflectance
Function name: dn_to_toa_reflectance
No help documentation available for this tool.
Terrain Corrected Optical Analytics
Function name: terrain_corrected_optical_analytics
PRO | Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Terrain-Corrected Optical Prep
Problem It Solves
Can we reduce terrain illumination bias so downstream indices and change products are defensible?
Who It Is For
- Remote sensing practitioners working in high-relief terrain.
Primary User
EO analytics teams, natural resource agencies, and monitoring service providers.
What It Does
- Applies terrain-aware optical correction (C-correction style) using DEM-derived geometry.
- Builds cloud/shadow masks and quality confidence diagnostics alongside corrected bands.
- Produces analysis-ready corrected optical outputs for downstream monitoring/classification.
How It Works
- Derives slope/aspect illumination terms from the DEM and solar geometry parameters.
- Applies per-band topographic normalization (C-correction style) to reduce relief-driven bias.
- Generates cloud-shadow and quality layers from correction residual behavior and masking rules.
- Indicative formula: L_corr ~= L_obs * (cos(theta_s) + C) / (cos(i) + C), where i is incidence angle from DEM slope/aspect.
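The C-correction formula above can be sketched for one pixel. The incidence-angle expression is the standard relation between slope, aspect, and solar geometry. The function names and the value of C used below are placeholders, since the manual does not say how the tool estimates C for each band.

```python
import math

def cos_incidence(slope_deg, aspect_deg, solar_zenith_deg, solar_azimuth_deg):
    """cos(i) = cos(z)cos(s) + sin(z)sin(s)cos(azimuth - aspect)."""
    s, a = math.radians(slope_deg), math.radians(aspect_deg)
    z, az = math.radians(solar_zenith_deg), math.radians(solar_azimuth_deg)
    return math.cos(z) * math.cos(s) + math.sin(z) * math.sin(s) * math.cos(az - a)

def c_correct(l_obs, cos_i, solar_zenith_deg, c):
    """L_corr = L_obs * (cos(theta_s) + C) / (cos(i) + C)."""
    cos_z = math.cos(math.radians(solar_zenith_deg))
    return l_obs * (cos_z + c) / (cos_i + c)

C = 0.5  # placeholder value, chosen only for illustration

# Flat terrain: incidence equals the solar zenith, so nothing changes.
flat = c_correct(0.3, cos_incidence(0, 0, 40, 165), 40, C)
# A 20-degree slope facing the sun is over-illuminated and is darkened.
sunlit = c_correct(0.3, cos_incidence(20, 165, 40, 165), 40, C)
# The same slope facing away from the sun is brightened.
shaded = c_correct(0.3, cos_incidence(20, 345, 40, 165), 40, C)
```

Sun-facing slopes are scaled down and shaded slopes are scaled up, which is the relief-driven bias the workflow aims to reduce.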
Why It Wins
- Integrates correction and QA/mask outputs in one workflow rather than forcing separate ad hoc preprocessing scripts.
Typical Buying Trigger
Teams see unstable index/change outputs across steep terrain and need a standardized correction stage.
Typical Presets
- conservative: stricter correction and masking behavior.
- balanced: default profile.
- fast: quicker execution on large scenes.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| input_red, input_nir, input_dem | no | Core optical + terrain inputs used for topographic/reflectance normalization workflows. |
| optional input_green | yes | Optional green band used by workflows that include green-channel diagnostics. |
| solar_zenith_deg, solar_azimuth_deg | no | Solar geometry parameters used to model illumination and terrain incidence effects. |
| profile: conservative / balanced / fast | no | Processing profile controlling quality-vs-throughput behavior for correction workflow execution. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| corrected optical bands | GeoTIFF | Terrain-corrected optical bands for downstream index and classification workflows. |
| cloud_shadow_mask | GeoTIFF | Cloud and shadow mask used to suppress unreliable optical pixels. |
| topographic_correction_factor | GeoTIFF | Per-pixel topographic correction factor used in radiometric normalization. |
| quality_confidence | GeoTIFF | Confidence surface indicating reliability of corrected optical values. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

result = wbe.terrain_corrected_optical_analytics(
    input_red="data/red.tif",
    input_nir="data/nir.tif",
    input_dem="data/dem.tif",
    solar_zenith_deg=40.0,
    solar_azimuth_deg=165.0,
    profile="balanced",
    output_prefix="output/tco",
)

print(result)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Thermal and Emissivity
Land Surface Temperature Single Channel
Function name: land_surface_temperature_single_channel
No help documentation available for this tool.
Land Surface Temperature Split Window
Function name: land_surface_temperature_split_window
No help documentation available for this tool.
NDVI Based Emissivity
Function name: ndvi_based_emissivity
No help documentation available for this tool.
Spectral Analytics
Continuum Removal
Function name: continuum_removal
No help documentation available for this tool.
Linear Spectral Unmixing
Function name: linear_spectral_unmixing
No help documentation available for this tool.
Minimum Noise Fraction
Function name: minimum_noise_fraction
No help documentation available for this tool.
Spectral Angle Mapper
Function name: spectral_angle_mapper
No help documentation available for this tool.
Spectral Library Matching
Function name: spectral_library_matching
No help documentation available for this tool.
SAR Processing
Cloude Pottier Decomposition
Function name: cloude_pottier_decomposition
No help documentation available for this tool.
Enhanced Lee Filter
Function name: enhanced_lee_filter
No help documentation available for this tool.
Freeman Durden Decomposition
Function name: freeman_durden_decomposition
No help documentation available for this tool.
Frost Filter
Function name: frost_filter
Experimental
Speckle reduction for SAR intensity imagery using Frost adaptive filtering.
remote_sensing raster filter frost_filter legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input raster path or typed raster object. | Required | input.tif |
| radius | Local window radius in pixels (default 2). | Optional | 2 |
| damping_factor | Frost damping factor controlling exponential decay (default 2.0). | Optional | 2.0 |
| output | Optional output path. If omitted, output remains in memory. | Optional | - |
Examples
Applies frost_filter to an input raster.
wbe.frost_filter(input='image.tif', output='frost_filter.tif')
Gamma Map Filter
Function name: gamma_map_filter
Experimental
Gamma-MAP speckle filter for SAR imagery with ENL-aware noise modeling.
remote_sensing raster filter gamma_map_filter legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input raster path or typed raster object. | Required | input.tif |
| radius | Local window radius in pixels (default 2). | Optional | 2 |
| enl | Equivalent number of looks (default 1.0). | Optional | 1.0 |
| output | Optional output path. If omitted, output remains in memory. | Optional | - |
Examples
Applies gamma_map_filter to an input raster.
wbe.gamma_map_filter(input='image.tif', output='gamma_map_filter.tif')
H Alpha Wisart Classification
Function name: h_alpha_wisart_classification
No help documentation available for this tool.
Kuan Filter
Function name: kuan_filter
Experimental
Kuan adaptive speckle filter for SAR intensity data.
remote_sensing raster filter kuan_filter legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input raster path or typed raster object. | Required | input.tif |
| radius | Local window radius in pixels (default 2). | Optional | 2 |
| enl | Equivalent number of looks (default 1.0). | Optional | 1.0 |
| output | Optional output path. If omitted, output remains in memory. | Optional | - |
Examples
Applies kuan_filter to an input raster.
wbe.kuan_filter(input='image.tif', output='kuan_filter.tif')
Refined Lee Filter
Function name: refined_lee_filter
No help documentation available for this tool.
SAR Analysis Readiness
Function name: sar_analysis_readiness
PRO | Production
Evaluate SAR scene readiness for downstream analysis, including optional terrain and pairwise coherence-proxy checks.
workflow pro
Workflow Narrative
SAR Readiness QA
Problem It Solves
Are SAR scenes normalized enough for robust multi-scene comparison and downstream analytics?
Who It Is For
- SAR specialists and all-weather monitoring teams.
Primary User
Disaster/hazard monitoring units, remote sensing service firms, and infrastructure monitoring groups.
What It Does
- Produces analysis-ready SAR derivatives from SAR + DEM.
- Applies calibration, speckle filtering, and RTC-support factor generation.
- Accepts either direct SAR rasters or supported SAR bundles and records bundle metadata provenance.
- Optionally emits a pair-based coherence proxy and can auto-coregister the pair first when explicit opt-in alignment is requested.
How It Works
- Converts input SAR intensity to calibrated backscatter-compatible values.
- Applies configurable speckle-window filtering for noise suppression.
- Computes terrain correction factors from DEM geometry, resolves bundle metadata when present, and optionally evaluates a pair coherence proxy.
- Indicative formula: gamma0 ~= sigma0 / cos(local_incidence), then local-window filtering and optional pair coherence estimation.
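The terrain normalization term in the indicative formula can be written for a single pixel as below. This is a sketch with a hypothetical function name, and it assumes backscatter in linear power units (not decibels) and a local incidence angle below 90 degrees.

```python
import math

def gamma0_from_sigma0(sigma0_linear, local_incidence_deg):
    """gamma0 ~= sigma0 / cos(local incidence angle), in linear power units."""
    if not 0.0 <= local_incidence_deg < 90.0:
        raise ValueError("local incidence angle must be in [0, 90) degrees")
    return sigma0_linear / math.cos(math.radians(local_incidence_deg))

# At 60 degrees incidence, cos is 0.5, so gamma0 is twice sigma0.
g0 = gamma0_from_sigma0(0.5, 60.0)
```

Speckle filtering and the optional pair coherence proxy would follow this step and are not shown here.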
Why It Wins
- Couples SAR preprocessing with terrain support, bundle-native provenance, and explicit geometry guardrails, while still allowing an audited auto-coregistration handoff when pair alignment is not pre-established.
- Boundary note: sar_analysis_readiness is not the dedicated interferogram+coherence production workflow.
- Dedicated companion tool (single combined workflow): sar_interferogram_coherence.
- Design spec and scope details: docs/sar_interferogram_coherence_spec.md.
Typical Buying Trigger
An organization needs dependable all-weather monitoring where optical imagery is frequently unavailable.
Typical Presets
- default with single scene for preprocessing.
- pair mode with either direct raster pairs or bundle-resolved pair inputs for coherence-proxy output.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| input_sar or input_sar_bundle, input_dem | no | Primary SAR source plus terrain model used for radiometric terrain correction and readiness metrics. |
| input_measurement_key | yes | Optional bundle measurement selector when the input SAR bundle contains multiple assets. |
| optional pair_sar or pair_sar_bundle and look-angle controls | yes | Optional secondary SAR source and geometry controls used for pair-based coherence-proxy diagnostics. |
| pair_measurement_key | yes | Optional bundle measurement selector when the pair SAR bundle contains multiple assets. |
| auto_coregister_pair, coreg_max_offset_px, coreg_decimation, coreg_min_overlap_fraction | yes | Optional opt-in handoff to translation-mode pair alignment before coherence-proxy estimation when CRS/grid do not already match. |
| speckle_window, z_factor | no | Noise-filter window and vertical scaling controls used in SAR preprocessing. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| sar_backscatter_calibrated | GeoTIFF | Calibrated SAR backscatter raster suitable for quantitative comparison. |
| speckle_filtered | GeoTIFF | Speckle-reduced SAR raster for improved interpretability and downstream analysis. |
| rtc_factor | GeoTIFF | Radiometric terrain correction factor raster used to normalize SAR signal. |
| coherence_proxy | optional GeoTIFF | Optional pair-based amplitude-domain coherence proxy raster when a compatible SAR pair is provided. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.sar_analysis_readiness(
    input_sar_bundle="data/S1_reference.SAFE",
    input_measurement_key="vv",
    input_dem="data/dem.tif",
    pair_sar_bundle="data/S1_pair.SAFE",
    pair_measurement_key="vv",
    auto_coregister_pair=True,
    coreg_max_offset_px=24,
    speckle_window=5,
    output_prefix="output/sar_ready",
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
SAR Coregistration
Function name: sar_coregistration
PRO Production
Coregister moving SAR imagery to a reference grid (translation/affine/local-offset-grid modes).
workflow pro
Workflow Narrative
SAR Coregistration
Problem It Solves
Can we put a moving SAR scene on the reference scene grid with enough geometric confidence for pairwise analytics?
Who It Is For
- SAR analysts, infrastructure monitoring teams, and registration-first EO pipelines.
Primary User
InSAR/infrastructure monitoring teams, geospatial intelligence groups, and SAR QA operations.
What It Does
- Aligns a moving SAR raster or resolved SAR-bundle measurement onto a reference SAR grid.
- Supports translation, affine, and local_offset_grid modes (affine and local_offset_grid remain experimental).
- Emits aligned output plus transform/summary diagnostics, including compatibility and acceptance gates for machine-checkable downstream use.
How It Works
- Harmonizes the moving SAR raster to the reference CRS/grid when needed and evaluates strict scene compatibility where metadata allows.
- Applies amplitude-domain matching with bounded search and subpixel refinement, then executes the selected transform model.
- Emits per-run QA including burst diagnostics (Sentinel-1), continuity checks, and Phase-A acceptance gates.
- Indicative formula: $\rho(\Delta x, \Delta y) = \mathrm{corr}(\log(1 + I_{ref}), \log(1 + I_{mov}(x-\Delta x, y-\Delta y)))$.
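The indicative formula above can be illustrated with a small, pure-Python sketch of a bounded integer-shift search. This is not the tool's implementation: the function names, the exhaustive search, and the `min_overlap_fraction` default are assumptions made for illustration, and the subpixel refinement, decimation, and affine/local models mentioned above are omitted.

```python
import math

def log_amplitude(img):
    """Apply log(1 + I) to every pixel of a 2D list of intensities."""
    return [[math.log1p(v) for v in row] for row in img]

def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def estimate_shift(ref, mov, max_offset_px, min_overlap_fraction=0.5):
    """Search integer (dx, dy) within +/- max_offset_px maximizing rho(dx, dy)."""
    ref_log, mov_log = log_amplitude(ref), log_amplitude(mov)
    rows, cols = len(ref), len(ref[0])
    min_pixels = min_overlap_fraction * rows * cols
    best = (0, 0, float("-inf"))
    for dy in range(-max_offset_px, max_offset_px + 1):
        for dx in range(-max_offset_px, max_offset_px + 1):
            a, b = [], []
            for y in range(rows):
                for x in range(cols):
                    sx, sy = x - dx, y - dy  # samples I_mov(x - dx, y - dy)
                    if 0 <= sx < cols and 0 <= sy < rows:
                        a.append(ref_log[y][x])
                        b.append(mov_log[sy][sx])
            if len(a) < min_pixels:
                continue  # too little overlap to trust the correlation
            rho = correlation(a, b)
            if rho > best[2]:
                best = (dx, dy, rho)
    return best  # (dx, dy, rho)
```

The overlap guard matters: with only a handful of overlapping pixels, the correlation can be spuriously high, which is presumably why the tool exposes a minimum-overlap control.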
Why It Wins
- Accepts both direct rasters and supported SAR bundles while turning pair alignment into a reproducible audited workflow artifact instead of an opaque manual preprocessing step.
Typical Buying Trigger
A team needs a defensible alignment stage before coherence, pair differencing, or cross-sensor fusion.
Typical Presets
- translation: default global shift path.
- affine: global affine refinement (experimental).
- local_offset_grid: local residual warp model (experimental).
Inputs
| Parameter | Optional | Description |
|---|---|---|
| reference_sar or reference_sar_bundle | no | Reference SAR source provided either as a raster path or a supported SAR bundle. |
| moving_sar or moving_sar_bundle | no | Moving SAR source provided either as a raster path or a supported SAR bundle. |
| reference_measurement_key, moving_measurement_key | yes | Optional bundle measurement selectors when either SAR bundle contains multiple measurement assets. |
| coreg_mode | yes | Coregistration mode: translation, affine, or local_offset_grid (experimental for affine/local). |
| max_offset_px, decimation, min_overlap_fraction | yes | Search-radius and sampled-overlap controls for correlation-based shift estimation. |
| input_dem, dem_z_factor | yes | Optional DEM-assisted initialization controls used for geometry-informed matching support. |
| phase_a_* thresholds | yes | Optional acceptance/continuity threshold overrides for deterministic quality gating. |
| resample_method, output_prefix | yes | Output resampling mode and artifact prefix. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| moving_aligned | GeoTIFF | Moving SAR raster resampled onto the reference SAR grid. |
| offset_x | GeoTIFF | Constant x-offset surface in map units for the estimated translation. |
| offset_y | GeoTIFF | Constant y-offset surface in map units for the estimated translation. |
| transform | JSON | Machine-readable transform and QA summary for the estimated alignment. |
| summary | JSON | Machine-readable workflow report containing parameters, QA diagnostics, and artifact paths. |
| html_report | HTML | Human-readable report generated from the summary contract for review and traceability. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.sar_coregistration(
    reference_sar_bundle="data/S1_reference.SAFE",
    reference_measurement_key="vv",
    moving_sar_bundle="data/S1_pair.SAFE",
    moving_measurement_key="vv",
    max_offset_px=24,
    decimation=4,
    resample_method="bilinear",
    output_prefix="output/sar_coreg",
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
SAR Interferogram Coherence
Function name: sar_interferogram_coherence
PRO Production
Estimate interferometric coherence between SAR acquisitions (direct, complex, or bundle modes).
workflow pro
Workflow Narrative
SAR Interferogram and Coherence
Problem It Solves
Can this SAR pair be converted into defensible interferogram/coherence products with auditable provenance and QA diagnostics?
Who It Is For
- InSAR practitioners and SAR operations teams requiring repeatable pair-level coherence/interferogram products.
Primary User
Infrastructure deformation monitoring teams, hazard intelligence groups, and SAR analytics operations.
What It Does
- Produces interferogram, coherence, and valid-mask outputs from compatible SAR pairs in one dedicated workflow.
- Accepts either direct SAR rasters, supported SAR bundles, or complex split real/imag SAR inputs.
- Emits standardized machine-readable summary and optional HTML report artifacts for auditable downstream use.
How It Works
- Resolves reference and moving SAR inputs from either direct raster mode, bundle mode, or complex component mode.
- Optionally performs internal sar_coregistration handoff when grids are mismatched, unless the caller explicitly asserts a prealigned pair.
- Computes either complex-domain interferometric phase and coherence magnitude (complex mode) or amplitude-domain proxy interferogram/coherence (scalar mode).
- Uses summed-area coherence kernels and parallel row-chunk evaluation to reduce runtime on larger scenes; optional fast mode can further decimate coherence sampling.
- Indicative formulas: $\phi = \mathrm{atan2}(\Im(z), \Re(z))$ for interferometric phase where $z = s_{ref}\,\overline{s_{mov}}$, and local-window coherence proxy from normalized cross-correlation magnitude.
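As a rough illustration of the complex-mode formulas above, the following pure-Python sketch computes per-pixel phase from the product of the reference sample and the conjugated moving sample, plus a windowed coherence magnitude. It assumes the standard estimator |Σz| / sqrt(Σ|s_ref|² · Σ|s_mov|²) and uses direct window sums rather than the summed-area kernels the tool describes; the function names and the `window` default are hypothetical.

```python
import cmath

def interferogram_phase(s_ref, s_mov):
    """Per-pixel phase of z = s_ref * conj(s_mov), i.e. atan2(Im z, Re z)."""
    return [[cmath.phase(a * b.conjugate()) for a, b in zip(row_ref, row_mov)]
            for row_ref, row_mov in zip(s_ref, s_mov)]

def windowed_coherence(s_ref, s_mov, window=3):
    """Coherence magnitude in [0, 1] over a square window clipped at the edges."""
    rows, cols = len(s_ref), len(s_ref[0])
    half = window // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            cross = 0j          # running sum of z = s_ref * conj(s_mov)
            p_ref = p_mov = 0.0  # running sums of |s_ref|^2 and |s_mov|^2
            for yy in range(max(0, y - half), min(rows, y + half + 1)):
                for xx in range(max(0, x - half), min(cols, x + half + 1)):
                    a, b = s_ref[yy][xx], s_mov[yy][xx]
                    cross += a * b.conjugate()
                    p_ref += abs(a) ** 2
                    p_mov += abs(b) ** 2
            denom = (p_ref * p_mov) ** 0.5
            out[y][x] = abs(cross) / denom if denom > 0.0 else 0.0
    return out
```

A pair that differs only by a constant phase offset yields that offset as the phase everywhere and a coherence of 1; uncorrelated speckle drives the coherence toward 0.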
Why It Wins
- Unifies SAR pair ingest, optional alignment handoff, product generation, acceptance gating, and QA/report artifacts in one reproducible contract workflow.
- Operational note:
  - When auto_coregister_pair=true, this tool invokes the existing sar_coregistration engine internally if the pair grids do not already match.
  - If the pair is known to already be aligned, prefer assume_prealigned_pair=true and leave auto_coregister_pair=false to avoid unnecessary alignment work.
  - Failed acceptance on the coreg residual gate indicates that the pair alignment is not of sufficient registration quality for trusted downstream interpretation.
Typical Buying Trigger
Teams need a dedicated production stage for interferogram/coherence outputs instead of ad hoc post-processing scripts.
Typical Presets
- scalar raster pair: direct amplitude-domain processing with optional auto-coregistration.
- bundle pair: bundle-native measurement resolution with identical output contract.
- complex split mode: complex-domain phase/coherence computation from explicit real/imag rasters.
- fast large-scene mode: reduced coherence workload via performance_profile="fast", optional coherence_decimation, and selective artifact writing.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| reference_sar or reference_sar_bundle | no (unless complex mode) | Reference SAR source provided either as a direct raster path or supported SAR bundle root. |
| moving_sar or moving_sar_bundle | no (unless complex mode) | Moving SAR source provided either as a direct raster path or supported SAR bundle root. |
| reference_sar_real, reference_sar_imag, moving_sar_real, moving_sar_imag | no (complex mode) | Complex input mode components for explicit complex-domain interferogram/coherence processing. |
| reference_measurement_key, moving_measurement_key | yes | Optional bundle measurement selectors when SAR bundles include multiple measurement assets. |
| auto_coregister_pair, assume_prealigned_pair, coreg_mode | yes | Pair-alignment controls: either invoke internal sar_coregistration when needed or explicitly assert the pair is already aligned. |
| coreg_max_offset_px, coreg_decimation, coreg_min_overlap_fraction | yes | Optional scalar-mode coreg handoff tuning controls for search radius, sampling stride, and minimum overlap. |
| coherence_window, performance_profile, coherence_decimation | yes | Coherence kernel controls. Fast mode can reduce effective window size and decimate coherence sampling on very large scenes. |
| input_dem | yes | Optional DEM used for terrain-context masking and geometry-support pathways. |
| write_interferogram, write_coherence, write_valid_mask, write_html_report | yes | Optional artifact suppression controls used to reduce heavy-run output cost. |
| output_layout, output_compression, output_tile_size | yes | GeoTIFF write-profile controls for standard vs COG-style output and compression behavior. |
| output_prefix | yes | Output artifact prefix. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| interferogram | optional GeoTIFF | Interferogram raster (complex phase in complex mode, amplitude-domain proxy in scalar mode) when writing is enabled. |
| coherence | optional GeoTIFF | Coherence magnitude raster from local-window pair statistics when writing is enabled. |
| valid_mask | optional GeoTIFF | Binary/flag mask indicating valid pair-support pixels used during computation when writing is enabled. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, timings, and artifact paths. |
| html_report | optional HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability when writing is enabled. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.sar_interferogram_coherence(
    reference_sar="output/sar_coreg_reference.tif",
    moving_sar="output/sar_coreg_moving_aligned.tif",
    assume_prealigned_pair=True,
    coherence_window=7,
    performance_profile="balanced",
    output_prefix="output/sar_ifg_coh",
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Wishart Iterative Clustering
Function name: wisart_iterative_clustering
No help documentation available for this tool.
Yamaguchi 4-Component Decomposition
Function name: yamaguchi_4component_decomposition
No help documentation available for this tool.
Workflow Products
Multi Sensor Fusion Monitoring
Function name: multi_sensor_fusion_monitoring
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Multi-Sensor Fusion Monitoring
Problem It Solves
Where do optical and SAR signals jointly support actionable change alerts, and where is confidence high enough to triage immediately?
Who It Is For
- Multi-source monitoring teams combining optical and SAR evidence.
Primary User
National/regional EO monitoring programs, environmental observatories, and risk intelligence teams.
What It Does
- Fuses optical change, SAR stability cues, and terrain context into one disturbance-monitoring product.
- Produces sensor-agreement and fused-probability outputs for confidence-first screening.
- Emits high-confidence change zones suitable for direct review and reporting.
How It Works
- Runs remote_sensing_change_detection and sar_analysis_readiness as upstream stages.
- Computes a per-pixel agreement score from optical confidence plus SAR consistency/coherence cues.
- Combines normalized change strength and agreement into fused change probability.
- Indicative formula: $P_{fused} = w_c \cdot |\Delta_{optical}|_{norm} + w_a \cdot A_{sensor}$.
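A minimal sketch of the fused-probability formula for a single pixel follows. The weights `w_c` and `w_a`, the normalization by `delta_max`, and the clamping to [0, 1] are illustrative assumptions; the tool's actual weights and normalization are not documented here.

```python
def fused_probability(delta_optical, agreement, w_c=0.6, w_a=0.4, delta_max=1.0):
    """Combine normalized optical change strength with sensor agreement.

    delta_optical: signed optical change (for example, an index difference).
    agreement:     cross-sensor agreement score, expected in [0, 1].
    w_c, w_a:      hypothetical weights, assumed here to sum to 1.
    delta_max:     hypothetical change magnitude that maps to 1.0.
    """
    norm_change = min(abs(delta_optical) / delta_max, 1.0)
    agreement = min(max(agreement, 0.0), 1.0)
    return w_c * norm_change + w_a * agreement

def is_high_confidence(p_fused, high_confidence_threshold=0.8):
    """Flag pixels that would fall inside a high-confidence change zone."""
    return p_fused >= high_confidence_threshold
```

With these assumed weights, a strong optical change that SAR does not corroborate stays below a 0.8 threshold, which is the one-sensor overconfidence the agreement term is meant to prevent.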
Why It Wins
- Prevents one-sensor overconfidence by explicitly encoding cross-sensor agreement into final change probability.
Typical Buying Trigger
Operations teams need lower-false-alarm change monitoring in cloudy/seasonally variable regions where single-sensor methods are unstable.
Typical Presets
- fast: lower-overhead configuration for broader candidate capture.
- balanced: default setting for operational triage.
- conservative: stricter thresholding for high-specificity alerting.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| baseline_bundle, baseline_red_band_index, baseline_nir_band_index | no | Baseline multispectral bundle and red/NIR band selectors used to compute baseline vegetation response. |
| change_bundle, change_red_band_index, change_nir_band_index | no | Change-date multispectral bundle and red/NIR band selectors used for signed change estimation. |
| input_sar, input_dem | no | SAR scene and terrain model used for radiometric terrain correction and readiness metrics. |
| optional pair_sar | yes | Optional second SAR scene used when pair/coherence diagnostics are enabled. |
| optional thermal_bundle, thermal_band_index | yes | Optional thermal raster and 0-based band index used for three-modality diagnostics. |
| profile: fast \| balanced \| conservative | no | Processing profile controlling sensitivity, quality strictness, and runtime tradeoffs. |
| harmonization_mode | yes | Cross-sensor bias harmonization mode: off, robust, or conservative. |
| high_confidence_threshold, max_zone_features | no | Threshold and feature-cap controls for extracting high-confidence change zones. |
| vector_output_format | yes | Output vector format for zones: gpkg, geojson, or shp. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| fused_change_probability | GeoTIFF | Cross-sensor fused probability of meaningful environmental change. |
| sensor_agreement | GeoTIFF | Agreement surface indicating where sensors support the same change interpretation. |
| terrain_context | GeoTIFF | Derived terrain context layer used by fused change interpretation. |
| uncertainty_inflation | GeoTIFF | Per-pixel uncertainty inflation diagnostic from cross-modality fusion. |
| high_confidence_change_zones | GeoPackage | Vector zones representing high-confidence change hotspots. |
| thermal_input_contract | JSON | Thermal coverage and weighting contract generated when the workflow runs. |
| modality_contribution_diagnostics | JSON | Relative modality contribution diagnostics for optical/SAR/thermal sources. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
(
    fused,
    agreement,
    terrain,
    uncertainty,
    zones,
    thermal_contract,
    modality_diagnostics,
    summary,
) = wbe.multi_sensor_fusion_monitoring(
    baseline_bundle="data/baseline_bundle.tif",
    baseline_red_band_index=0,
    baseline_nir_band_index=1,
    change_bundle="data/change_bundle.tif",
    change_red_band_index=0,
    change_nir_band_index=1,
    input_sar="data/sar_a.tif",
    input_dem="data/dem.tif",
    pair_sar="data/sar_b.tif",
    thermal_bundle="data/thermal.tif",
    thermal_band_index=0,
    profile="balanced",
    harmonization_mode="robust",
    vector_output_format="gpkg",
    high_confidence_threshold=0.8,
    max_zone_features=25000,
    output_prefix="output/ms_fusion",
)
print(fused)
print(agreement)
print(terrain)
print(uncertainty)
print(zones)
print(thermal_contract)
print(modality_diagnostics)
print(summary)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Guided UAV Image Intake Workflow
Function name: guided_uav_image_intake_workflow
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
UAV Image Intake QA
Problem It Solves
Is this image set good enough to proceed into expensive downstream processing, and what should be fixed first if not?
Who It Is For
- UAV mission operators, photogrammetry technicians, and geospatial production teams.
Primary User
Drone services teams, geomatics operations managers, and data-engineering teams responsible for production intake quality.
What It Does
- Scans UAV imagery and builds a metadata/coverage inventory.
- Computes intake QA metrics (GPS/EXIF completeness, overlap estimate, image-count sufficiency).
- Extracts per-image blur scores and flags blurry images with warnings.
- Parses RTK fix status and gimbal/flight orientation priors from DJI XMP metadata where available.
- Returns workflow status (pass, review, fail) with warnings suitable for operator triage.
How It Works
- Recursively discovers supported image files (jpg, jpeg, tif, tiff, png).
- Parses EXIF fields for timestamp and GPS coordinates; parses DJI XMP sidecar fields for RTK fix status and gimbal yaw/pitch/roll and flight yaw.
- Estimates overlap from GPS nearest-neighbor spacing against a nominal footprint heuristic.
- Computes a per-image blur score from a Laplacian variance kernel (configurable as off, fast, or full mode) with wide-SIMD acceleration.
- Metadata extraction and blur scoring run in parallel across all images using Rayon thread pools.
- Applies profile thresholds to classify readiness and emit actionable warnings.
- Indicative heuristic: overlap_proxy ~= 1 - (nn_spacing_m / nominal_footprint_m), clipped to [0, 1].
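Two of the intake heuristics described above, the overlap proxy and a Laplacian-variance blur score, can be sketched in plain Python. This is an illustration under stated assumptions, not the tool's code: image centres are assumed to be in projected metres, the mean nearest-neighbour distance stands in for `nn_spacing_m`, and a 4-neighbour Laplacian kernel is assumed for the blur score. All function names are hypothetical.

```python
import math

def nearest_neighbor_spacing(points):
    """Mean distance (m) from each (x, y) image centre to its nearest neighbour."""
    if len(points) < 2:
        raise ValueError("need at least two image centres")
    nearest = []
    for i, p in enumerate(points):
        nearest.append(min(math.dist(p, q) for j, q in enumerate(points) if j != i))
    return sum(nearest) / len(nearest)

def overlap_proxy(nn_spacing_m, nominal_footprint_m):
    """1 - (nn_spacing_m / nominal_footprint_m), clipped to [0, 1]."""
    raw = 1.0 - nn_spacing_m / nominal_footprint_m
    return min(max(raw, 0.0), 1.0)

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over interior pixels of a 2D list.

    Low values suggest a blurry image; the decision threshold is profile-specific.
    """
    rows, cols = len(gray), len(gray[0])
    responses = []
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            responses.append(gray[y - 1][x] + gray[y + 1][x]
                             + gray[y][x - 1] + gray[y][x + 1]
                             - 4 * gray[y][x])
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)
```

For example, image centres spaced 20 m apart with a 100 m nominal footprint give an overlap proxy of about 0.8, while spacing larger than the footprint clips to 0.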
Why It Wins
- Converts ad hoc manual preflight checks into a single reproducible QA workflow with machine-readable outputs, including blur quality, RTK status, and orientation priors alongside the standard GPS and overlap diagnostics.
Typical Buying Trigger
Frequent failed/rework-heavy photogrammetry or registration runs caused by avoidable intake quality issues.
Typical Presets
- fast: permissive intake thresholds with fast-mode blur scoring for rapid field feedback.
- balanced: default operations profile with fast-mode blur scoring.
- strict: tighter quality gate with full-mode blur scoring before production processing.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| images_dir | no | Input directory containing UAV images to screen. |
| profile: fast \| balanced \| strict | yes | [pro] QA profile controlling overlap, metadata readiness, and blur thresholds; drives pass/review/fail classification. Defaults to balanced. |
| recursive | yes | If true, scans subdirectories under images_dir. Defaults to true. |
| output_prefix | yes | Prefix used to name all output artifacts. |
| blur_mode: off \| fast \| full | yes | Blur scoring mode. off skips blur scoring; fast downsamples before scoring; full scores at original resolution. Defaults to fast. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| image_inventory | CSV | Per-image inventory with metadata flags, GPS coordinates, capture fields, blur score, and gimbal/RTK orientation columns (*_image_inventory.csv). |
| qa_report | JSON | Structured QA checks and warning details for intake triage, including blur and orientation hint sections (*_intake_qa_report.json). |
| summary | JSON | Workflow summary contract with status and aggregate metrics including blur coverage and RTK coverage fractions (*_intake_summary.json). |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
| image_centers | GeoJSON | Image center points from EXIF GPS with per-feature blur score and orientation properties (*_image_centers.geojson). |
| flight_path_lines | GeoJSON | Flight path line geometry built from ordered image centers with GPS (*_flight_path_lines.geojson). |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.guided_uav_image_intake_workflow(
    images_dir="data/uav_mission/images",
    profile="balanced",
    recursive=True,
    output_prefix="output/uav_intake",
    blur_mode="fast",
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Registration Oriented Feature Workflow
Function name: registration_oriented_feature_workflow
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Image Registration Diagnostics
Problem It Solves
Which pairs are registration-ready, and do we have enough correspondence quality to proceed confidently?
Who It Is For
- Geospatial registration specialists, EO preprocessing teams, and image-fusion analysts.
Primary User
Teams building repeatable image registration pipelines (RGB-RGB, thermal-RGB, and cross-date alignment).
What It Does
- Runs image-pair or image-set registration diagnostics using a RootSIFT feature engine.
- Produces tie-point outputs and pair-level quality metrics with full fallback attempt traces.
- Handles cross-modal pairs (e.g., thermal LWIR to RGB visible) via automatic histogram equalization preprocessing.
- Optionally emits annotated side-by-side pair visualizations with tie-point overlays.
- Emits workflow-level readiness status for registration-first pipelines.
How It Works
- Builds pair candidates (single pair mode or ranked set mode).
- Extracts RootSIFT descriptors from a Gaussian scale-space pyramid; descriptor distances are computed with wide-SIMD acceleration.
- Applies cross-verified nearest-neighbor matching with ratio test filtering.
- On match shortfall, escalates through a six-strategy fallback chain: baseline → high_feature_baseline → relaxed_ratio → high_feature_relaxed → preprocess_eq_right → preprocess_eq_both; the final two strategies apply histogram equalization to recover keypoints from low-contrast inputs such as LWIR thermal imagery.
- Records the full strategy attempt trace per pair in diagnostics for auditable QA.
- Reports per-pair keypoint counts, match counts, confidence summary, inlier-style proxy metrics, strategy used, and fallback policy.
- Indicative rule: accept match if d1 / d2 < ratio_test, where d1 and d2 are the descriptor distances to the nearest and second-nearest candidates.
Inputs
| Parameter | Optional | Description |
|---|---|---|
| mode: set \| pair | no | Execution mode for set-wide pair planning or explicit pair diagnostics. |
| images_dir | no (set mode) | Input image directory used when mode is set. |
| left_image, right_image | no (pair mode) | Explicit pair inputs used when mode is pair. |
| max_pairs | yes | Maximum number of candidate pairs evaluated in set mode. |
| max_features_per_image | yes | Upper bound on extracted keypoints per image. |
| ratio_test | yes | Descriptor ratio-test threshold controlling match strictness. |
| min_matches | yes | [pro] Minimum accepted match count per pair; drives the QA gate, fallback routing, and pass/review/fail classification. |
| output_prefix | yes | Prefix used to name workflow outputs. |
| emit_pair_match_viz | yes | If true, writes annotated side-by-side pair images with tie-point lines. Defaults to false. |
| max_pair_visualizations | yes | Maximum number of pair visualizations to write when emit_pair_match_viz is true. Defaults to 8. |
| max_lines_per_pair | yes | Maximum number of tie-point lines drawn per visualization. Defaults to 150. |
| viz_scale | yes | Downscale factor applied to visualization images, in [0.05, 1.0]. Defaults to 0.5. |
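To make the ratio_test and cross-verification behaviour concrete, here is a small sketch of nearest-neighbour matching with a ratio test and a mutual-nearest check. It assumes plain Euclidean distance on short descriptor tuples and a strict `d1 / d2 < ratio_test` acceptance rule; the helper names are hypothetical, and the tool's RootSIFT descriptors, SIMD distance code, and fallback chain are not reproduced.

```python
import math

def two_nearest(query, candidates):
    """Return (index_of_nearest, d1, d2) for a query descriptor.

    Requires at least two candidates so that a second-nearest distance exists.
    """
    dists = sorted((math.dist(query, c), i) for i, c in enumerate(candidates))
    (d1, i1), (d2, _) = dists[0], dists[1]
    return i1, d1, d2

def match_descriptors(left, right, ratio_test=0.8):
    """Cross-verified matches (left_index, right_index) passing the ratio test."""
    matches = []
    for li, desc in enumerate(left):
        ri, d1, d2 = two_nearest(desc, right)
        if d2 == 0.0 or d1 / d2 >= ratio_test:
            continue  # ambiguous: nearest is not clearly better than runner-up
        back, _, _ = two_nearest(right[ri], left)
        if back == li:  # keep only mutual nearest neighbours
            matches.append((li, ri))
    return matches

def passes_match_gate(matches, min_matches=24):
    """Pair-level gate comparable in spirit to the min_matches input."""
    return len(matches) >= min_matches
```

Lowering `ratio_test` makes matching stricter (fewer, more distinctive matches); raising it admits more matches at the cost of more ambiguity.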
Outputs
| Parameter | Type | Description |
|---|---|---|
| pair_diagnostics | JSON | Pair-level diagnostics including candidate score, keypoint counts, match quality, confidence proxies, strategy used, fallback flag, and full strategy attempt trace (*_pair_diagnostics.json). |
| match_summary | JSON | Workflow summary contract with aggregate match metrics, status, and fallback policy record (*_match_summary.json). |
| html_report | HTML | Human-readable customer-facing report generated from the workflow summary contract for stakeholder review and QA traceability. |
| tie_points | CSV | Tie-point table (pair_id,left_x,left_y,right_x,right_y,confidence) for downstream registration workflows (*_tie_points.csv). |
| pair_match_viz | directory | Annotated side-by-side JPEG images per pair with tie-point lines overlaid, written when emit_pair_match_viz is true. |
Python Example
`import whitebox_workflows as wbw
wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.registration_oriented_feature_workflow(
    mode="set",
    images_dir="data/uav_mission/images",
    max_pairs=24,
    max_features_per_image=500,
    ratio_test=0.80,
    min_matches=24,
    output_prefix="output/registration_workflow",
    emit_pair_match_viz=True,
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Raster Analysis
General raster analysis covers a broad set of cell-based operations: local algebra on single rasters, focal statistics across neighbourhoods, zonal summaries within polygon regions, reclassification, and suitability scoring. These tools form the computational backbone of many GIS modelling workflows.
This chapter demonstrates an environmental suitability analysis that combines multiple raster datasets through reclassification and weighted overlay.
Key Concepts
- Local operations: Applied to each cell independently — arithmetic, logic, trigonometry, conditional assignment. Result depends only on the cell's own value (and corresponding cells in other input rasters).
- Focal operations: Applied to each cell using values within a spatial neighbourhood (kernel). Examples: focal mean, focal max, focal standard deviation.
- Zonal statistics: Aggregate raster values by zone boundaries defined by a second raster or vector polygon layer.
- Reclassification: Map old cell values to new values via a lookup table or range intervals. Core step in suitability and habitat modelling.
- Weighted overlay: Combine multiple reclassified factor rasters using factor-specific weights. The weighted sum produces a composite suitability score.
- NoData / null handling: Cells with NoData propagate through most local operations. Ensure all input rasters share the same extent, resolution, and NoData mask before combining them.
End-to-End Workflow: Multi-Criteria Habitat Suitability
This workflow scores terrain for habitat suitability using slope, TWI, and distance from water as factors.
Inputs
| Layer | Format | Notes |
|---|---|---|
| slope.tif | GeoTIFF raster | Degrees, from terrain analysis |
| twi.tif | GeoTIFF raster | Topographic Wetness Index |
| streams.tif | GeoTIFF raster | Binary stream network |
All rasters must share the same projected CRS, extent, and cell size before
being combined. Use Snap Raster Extents or QGIS Warp (Reproject)
to align if needed.
Step 1 — Compute Distance from Water
Processing Toolbox → Whitebox Workflows → GIS Analysis →
Euclidean Distance
| Parameter | Recommended value |
|---|---|
| Input feature | streams.tif |
| Output | dist_water.tif |
Cells closer to water receive smaller values. This will be reclassified so that proximity = higher suitability.
Step 2 — Reclassify Slope Factor
Assign suitability scores 1–5 to slope ranges.
Processing Toolbox → Whitebox Workflows → Raster Analysis →
Reclass
| Class | Slope range (°) | Suitability score |
|---|---|---|
| 1 | > 30 | 1 (unsuitable) |
| 2 | 20–30 | 2 |
| 3 | 10–20 | 3 |
| 4 | 5–10 | 4 |
| 5 | 0–5 | 5 (most suitable) |
Use Reclass From File with a two-column table file (old value; new value)
or set up intervals in the tool dialogue.
| Parameter | Recommended value |
|---|---|
| Input raster | slope.tif |
| Reclass intervals file | slope_reclass.txt |
| Output | slope_reclass.tif |
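The slope class table above can be expressed as a small interval lookup. This sketch is independent of the Reclass tools and only shows the mapping logic; it assumes from-inclusive, to-exclusive intervals (so a slope of exactly 30° scores 1) and a placeholder NoData value of -32768.0, neither of which is stated in the table.

```python
# (from_value inclusive, to_value exclusive, suitability score)
SLOPE_CLASSES = [
    (0.0, 5.0, 5),
    (5.0, 10.0, 4),
    (10.0, 20.0, 3),
    (20.0, 30.0, 2),
    (30.0, float("inf"), 1),
]

def reclass_value(value, classes, nodata=-32768.0):
    """Map one cell value to its class score; NoData and unmatched values stay NoData."""
    if value == nodata:
        return nodata
    for lower, upper, score in classes:
        if lower <= value < upper:
            return score
    return nodata

def reclass_raster(raster, classes, nodata=-32768.0):
    """Apply reclass_value to every cell of a 2D list."""
    return [[reclass_value(v, classes, nodata) for v in row] for row in raster]
```

Writing the intervals out this way also makes gaps and overlaps easy to spot before they reach the reclass file.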
Step 3 — Reclassify TWI Factor
| Class | TWI range | Suitability score |
|---|---|---|
| 1 | < 4 | 1 |
| 2 | 4–6 | 2 |
| 3 | 6–8 | 3 |
| 4 | 8–10 | 4 |
| 5 | > 10 | 5 |
Processing Toolbox → Whitebox Workflows → Raster Analysis → Reclass
Input: twi.tif → Output: twi_reclass.tif
Step 4 — Reclassify Distance from Water
Closer = more suitable:
| Class | Distance range (m) | Suitability score |
|---|---|---|
| 1 | > 500 | 1 |
| 2 | 300–500 | 2 |
| 3 | 100–300 | 3 |
| 4 | 50–100 | 4 |
| 5 | 0–50 | 5 |
Processing Toolbox → Whitebox Workflows → Raster Analysis → Reclass
Input: dist_water.tif → Output: dist_water_reclass.tif
Step 5 — Weighted Overlay (Raster Calculator)
Combine the three factors using assigned weights (must sum to 1.0).
| Factor | Weight |
|---|---|
| Slope suitability | 0.4 |
| TWI suitability | 0.35 |
| Distance suitability | 0.25 |
QGIS Raster Calculator:
("slope_reclass@1" * 0.4) + ("twi_reclass@1" * 0.35) + ("dist_water_reclass@1" * 0.25)
Output: suitability.tif (range 1–5, continuous).
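The arithmetic behind the Raster Calculator expression can be checked by hand with a short sketch. It operates on small 2D lists rather than real rasters, assumes a placeholder NoData value of -32768.0, and propagates NoData whenever any factor is missing; the function name is hypothetical.

```python
def weighted_overlay(factors, weights, nodata=-32768.0):
    """Weighted sum of equally sized 2D lists; NoData in any factor propagates."""
    if len(factors) != len(weights):
        raise ValueError("one weight is required per factor")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    rows, cols = len(factors[0]), len(factors[0][0])
    out = [[nodata] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            values = [f[y][x] for f in factors]
            if any(v == nodata for v in values):
                continue  # leave the cell as NoData
            out[y][x] = sum(w * v for w, v in zip(weights, values))
    return out
```

Because the weights sum to 1.0 and every factor is scored 1–5, the result cannot leave the 1–5 range; a cell scoring 5, 3, and 2 on slope, TWI, and distance yields 0.4·5 + 0.35·3 + 0.25·2 = 3.55.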
Step 6 — Zonal Statistics (Optional)
Summarise suitability scores by catchment polygon.
Processing Toolbox → QGIS → Vector Analysis →
Zonal Statistics (QGIS native)
| Parameter | Recommended value |
|---|---|
| Input raster | suitability.tif |
| Vector layer | watersheds.shp |
| Statistics | Mean, Max, Std Dev |
| Output column prefix | suit_ |
Python Console Equivalent
import processing
# Step 1: distance from water
processing.run('whitebox_workflows:euclidean_distance', {
    'input': '/data/streams.tif',
    'output': '/data/dist_water.tif',
})
# Steps 2–4: reclassify each factor
for src, dst in [
    ('slope', 'slope_reclass'),
    ('twi', 'twi_reclass'),
    ('dist_water', 'dist_water_reclass'),
]:
    processing.run('whitebox_workflows:reclass_from_file', {
        'input': f'/data/{src}.tif',
        'reclass_vals': f'/data/{src}_reclass.txt',
        'output': f'/data/{dst}.tif',
    })
# Step 5: weighted overlay via Raster Calculator
processing.run('qgis:rastercalculator', {
    'EXPRESSION': '("slope_reclass@1" * 0.4) + ("twi_reclass@1" * 0.35) + ("dist_water_reclass@1" * 0.25)',
    'LAYERS': [
        '/data/slope_reclass.tif',
        '/data/twi_reclass.tif',
        '/data/dist_water_reclass.tif',
    ],
    'OUTPUT': '/data/suitability.tif',
})
print("Suitability analysis complete.")
Advanced: Focal Statistics
Focal statistics smooth or enhance spatial patterns at a neighbourhood scale.
Processing Toolbox → Whitebox Workflows → Raster Analysis →
Mean Filter
| Parameter | Recommended value |
|---|---|
| Input raster | suitability.tif |
| Filter size X | 5 (cells) |
| Filter size Y | 5 |
| Output | suitability_smooth.tif |
Use Standard Deviation Filter to highlight areas of high local variability,
or Percentile Filter for rank-based neighbourhood smoothing.
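For intuition about what a focal mean does to each cell, the following sketch averages the valid cells inside a rectangular window. It is not the Mean Filter tool's algorithm: the edge handling (averaging over whatever part of the window falls inside the raster), the NoData placeholder of -32768.0, and the function name are assumptions.

```python
def mean_filter(raster, size_x=5, size_y=5, nodata=-32768.0):
    """Focal mean over a size_x by size_y window, skipping NoData neighbours."""
    rows, cols = len(raster), len(raster[0])
    half_x, half_y = size_x // 2, size_y // 2
    out = [[nodata] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if raster[y][x] == nodata:
                continue  # NoData centre cells stay NoData
            total, count = 0.0, 0
            for yy in range(max(0, y - half_y), min(rows, y + half_y + 1)):
                for xx in range(max(0, x - half_x), min(cols, x + half_x + 1)):
                    value = raster[yy][xx]
                    if value != nodata:
                        total += value
                        count += 1
            out[y][x] = total / count
    return out
```

A single high-scoring cell surrounded by zeros is spread across its neighbourhood, which is why larger kernels blur small features; choose the kernel size relative to the features you want to keep.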
Common Pitfalls
| Problem | Likely cause | Fix |
|---|---|---|
| Raster Calculator outputs all NoData | Rasters have different extents or CRS | Clip/warp all inputs to common grid before combining |
| Reclass produces unexpected values | Range gaps or overlaps in reclass table | Verify that intervals are contiguous with no gap or overlap |
| Zonal statistics returns wrong polygon counts | Raster–vector CRS mismatch | Reproject vector to match raster CRS before running |
| Weighted overlay result > 5 | Weights do not sum to 1.0 | Recalculate weights so they sum to exactly 1.0 |
| Focal filter introduces edge NoData | Kernel extends beyond raster boundary | Pad raster with Expand Raster before filtering, or ignore edge cells |
Validation Checklist
- All input rasters aligned (same CRS, extent, cell size, NoData value).
- Reclass table covers the full observed value range with no gaps.
- Weighted overlay weights sum to 1.0.
- Output suitability range matches expected 1–5 interval.
- Zonal statistics polygon CRS matches raster CRS.
- Focal filter kernel size is appropriate for target feature scale.
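Two of the checklist items, contiguous reclass intervals and weights that sum to 1.0, are easy to verify programmatically before running the workflow. The helpers below are a hypothetical sketch, assuming each interval is given as a (lower, upper) pair with the upper bound exclusive.

```python
def interval_problems(intervals):
    """Return ('gap' | 'overlap', start, end) tuples found between sorted intervals."""
    problems = []
    ordered = sorted(intervals)
    for (lo_a, hi_a), (lo_b, hi_b) in zip(ordered, ordered[1:]):
        if lo_b > hi_a:
            problems.append(("gap", hi_a, lo_b))
        elif lo_b < hi_a:
            problems.append(("overlap", lo_b, min(hi_a, hi_b)))
    return problems

def weights_sum_to_one(weights, tol=1e-9):
    """True when the weights sum to 1.0 within a small floating-point tolerance."""
    return abs(sum(weights) - 1.0) <= tol
```

An empty list from `interval_problems` means the table is contiguous; it does not confirm that the table spans the full observed value range, which still needs a check against the raster's minimum and maximum.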
Overlay and Math
Add
Function name: add
Experimental
Adds two rasters on a cell-by-cell basis.
raster math add legacy-port
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input1 | First input raster (path string or typed raster object). | Required | input1.tif |
| input2 | Second input raster (path string or typed raster object). | Required | input2.tif |
| output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | — |
Examples
Runs add on two DEM rasters and writes the result to dem_sum.tif.
wbe.add(input1='dem_a.tif', input2='dem_b.tif', output='dem_sum.tif')
Average Overlay
Function name: average_overlay
This tool can be used to find the average value in each cell of a grid from a set of input images (inputs). It is therefore similar to the weighted_sum tool except that each input image is given equal weighting. This tool operates on a cell-by-cell basis. Therefore, each of the input rasters must share the same number of rows and columns and spatial extent. An error will be issued if this is not the case. At least two input rasters are required to run this tool. Like each of the WhiteboxTools overlay tools, this tool has been optimized for parallel processing.
See Also
weighted_sum
Python API
def average_overlay(self, input_rasters: List[Raster]) -> Raster:
Bool And
Function name: bool_and
This tool is a Boolean AND operator, i.e. it works on True or False (1 and 0) values. Grid cells for which the first and second input rasters (input1; input2) have True values are assigned 1 in the output raster, otherwise grid cells are assigned a value of 0. All non-zero values in the input rasters are considered to be True, while all zero-valued grid cells are considered to be False. Grid cells containing NoData values in either of the input rasters will be assigned a NoData value in the output raster.
See Also
bool_not, bool_or, bool_xor
Python API
def bool_and(self, input1: Raster, input2: Raster) -> Raster:
Bool Not
Function name: bool_not
This tool is a Boolean NOT operator, i.e. it works on True or False (1 and 0) values. Grid cells for which the first input raster (input1) has a True value and the second raster (input2) has a False value are assigned 1 in the output raster, otherwise grid cells are assigned a value of 0. All non-zero values in the input rasters are considered to be True, while all zero-valued grid cells are considered to be False. Grid cells containing NoData values in either of the input rasters will be assigned a NoData value in the output raster. Notice that the NOT operator is asymmetrical, and the order of inputs matters.
See Also
bool_and, bool_or, bool_xor
Python API
def bool_not(self, input1: Raster, input2: Raster) -> Raster:
Bool Or
Function name: bool_or
This tool is a Boolean OR operator, i.e. it works on True or False (1 and 0) values. Grid cells for which either the first or second input raster (input1; input2) has a True value are assigned 1 in the output raster, otherwise grid cells are assigned a value of 0. All non-zero values in the input rasters are considered to be True, while all zero-valued grid cells are considered to be False. Grid cells containing NoData values in either of the input rasters will be assigned a NoData value in the output raster.
See Also
bool_and, bool_not, bool_xor
Python API
def bool_or(self, input1: Raster, input2: Raster) -> Raster:
Bool Xor
Function name: bool_xor
This tool is a Boolean XOR operator, i.e. it works on True or False (1 and 0) values. Grid cells for which either the first or second input raster (input1; input2) has a True value, but not both, are assigned 1 in the output raster, otherwise grid cells are assigned a value of 0. All non-zero values in the input rasters are considered to be True, while all zero-valued grid cells are considered to be False. Grid cells containing NoData values in either of the input rasters will be assigned a NoData value in the output raster.
See Also
bool_and, bool_not, bool_or
Python API
def bool_xor(self, input1: Raster, input2: Raster) -> Raster:
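The truth convention shared by the four Boolean operators above can be illustrated without the Whitebox runtime. The following is a minimal pure-Python sketch, not the tool implementation: rasters are modelled as nested lists, and `NODATA` and `boolean_overlay` are hypothetical names introduced only for this example. It assumes that NOT yields 1 where the first input is True and the second is False.

```python
NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def boolean_overlay(grid1, grid2, op, nodata=NODATA):
    """Apply a Boolean operator cell by cell to two equally sized grids."""
    operators = {
        "and": lambda a, b: a and b,
        "or": lambda a, b: a or b,
        "xor": lambda a, b: a != b,
        "not": lambda a, b: a and not b,  # True only where a is True and b is False
    }
    combine = operators[op]
    result = []
    for row1, row2 in zip(grid1, grid2):
        out_row = []
        for v1, v2 in zip(row1, row2):
            if v1 == nodata or v2 == nodata:
                out_row.append(nodata)  # NoData in either input propagates
            else:
                # Any non-zero value counts as True
                out_row.append(1.0 if combine(v1 != 0, v2 != 0) else 0.0)
        result.append(out_row)
    return result


input1 = [[1.0, 0.0, 7.0, NODATA]]
input2 = [[1.0, 3.0, 0.0, 1.0]]

and_result = boolean_overlay(input1, input2, "and")
or_result = boolean_overlay(input1, input2, "or")
xor_result = boolean_overlay(input1, input2, "xor")
not_forward = boolean_overlay(input1, input2, "not")
not_reverse = boolean_overlay(input2, input1, "not")
print(and_result, or_result, xor_result, not_forward, not_reverse)
```

Swapping the inputs changes the NOT result but leaves AND, OR, and XOR unchanged, which is why input order only matters for bool_not.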
Count If
Function name: count_if
This tool counts the number of occurrences of a specified value (value) in a stack of input rasters (inputs). Each grid cell in the output raster (output) will contain the number of occurrences of the specified value in the stack of corresponding cells in the input image. At least two input rasters are required to run this tool. Each of the input rasters must share the same number of rows and columns and spatial extent. An error will be issued if this is not the case.
See Also
pick_from_list
Python API
def count_if(self, input_rasters: List[Raster], comparison_value: float) -> Raster:
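As a rough illustration of the counting rule described above, the sketch below models a raster stack as a list of nested lists. The function name `count_value_in_stack` is hypothetical, and the dimension check mirrors the documented requirement rather than the tool's actual error handling.

```python
def count_value_in_stack(stack, value):
    """Count, per cell, how many grids in the stack hold the given value."""
    if len(stack) < 2:
        raise ValueError("at least two input rasters are required")
    rows, cols = len(stack[0]), len(stack[0][0])
    for grid in stack:
        if len(grid) != rows or any(len(row) != cols for row in grid):
            raise ValueError("all input rasters must share rows and columns")
    return [
        [sum(1 for grid in stack if grid[r][c] == value) for c in range(cols)]
        for r in range(rows)
    ]


stack = [
    [[1, 2], [3, 1]],
    [[1, 1], [3, 0]],
    [[0, 1], [3, 1]],
]
counts = count_value_in_stack(stack, 1)
print(counts)
```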
Divide
Function name: divide
Experimental
Divides the first raster by the second on a cell-by-cell basis.
raster math divide legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input1 | First input raster (path string or typed raster object). | Required | input1.tif |
| input2 | Second input raster (path string or typed raster object). | Required | input2.tif |
| output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | — |
Examples
Runs divide on two DEM rasters and writes the result to dem_ratio.tif.
wbe.divide(input1='dem_a.tif', input2='dem_b.tif', output='dem_ratio.tif')
Highest Position
Function name: highest_position
This tool identifies the stack position (index) of the maximum value within a raster stack on a cell-by-cell basis. For example, if five raster images (inputs) are input to the tool, the output raster (output) would show which of the five input rasters contained the highest value for each grid cell. The index value in the output raster is the zero-based position of the raster in the stack, i.e. if the highest value in the stack is contained in the first image, the output value would be 0; if the highest stack value were in the second image, the output value would be 1, and so on. If any of the cell values within the stack is NoData, the output raster will contain the NoData value for the corresponding grid cell. The index value therefore depends on the order of the input images.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
lowest_position, pick_from_list
Python API
def highest_position(self, input_rasters: List[Raster]) -> Raster:
Lowest Position
Function name: lowest_position
This tool identifies the stack position (index) of the minimum value within a raster stack on a cell-by-cell basis. For example, if five raster images (inputs) are input to the tool, the output raster (output) would show which of the five input rasters contained the lowest value for each grid cell. The index value in the output raster is the zero-based position of the raster in the stack, i.e. if the lowest value in the stack is contained in the first image, the output value would be 0; if the lowest stack value were in the second image, the output value would be 1, and so on. If any of the cell values within the stack is NoData, the output raster will contain the NoData value for the corresponding grid cell. The index value therefore depends on the order of the input images.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
highest_position, pick_from_list
Python API
def lowest_position(self, input_rasters: List[Raster]) -> Raster:
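The zero-based indexing used by highest_position and lowest_position can be sketched as follows. This is an illustration only: `stack_position` and `NODATA` are hypothetical names, rasters are nested lists, and ties are resolved in favour of the earliest raster in the stack, which is an assumption rather than documented behaviour.

```python
NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def stack_position(stack, highest=True, nodata=NODATA):
    """Return the zero-based index of the grid holding the max (or min) value per cell."""
    rows, cols = len(stack[0]), len(stack[0][0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [grid[r][c] for grid in stack]
            if nodata in values:
                out_row.append(nodata)  # any NoData in the stack propagates
            else:
                target = max(values) if highest else min(values)
                # list.index returns the first match, so ties go to the earliest grid
                out_row.append(float(values.index(target)))
        result.append(out_row)
    return result


stack = [
    [[1.0, 9.0], [4.0, NODATA]],
    [[5.0, 2.0], [4.0, 3.0]],
    [[3.0, 7.0], [8.0, 6.0]],
]
highest = stack_position(stack, highest=True)
lowest = stack_position(stack, highest=False)
print(highest)
print(lowest)
```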
Max Absolute Overlay
Function name: max_absolute_overlay
This tool can be used to find the maximum absolute (non-negative) value in each cell of a grid from a set of input images (inputs). NoData values in any of the input images will result in a NoData pixel in the output image.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
max_overlay, min_absolute_overlay, min_overlay
Python API
def max_absolute_overlay(self, input_rasters: List[Raster]) -> Raster:
Max Overlay
Function name: max_overlay
This tool can be used to find the maximum value in each cell of a grid from a set of input images (inputs). NoData values in any of the input images will result in a NoData pixel in the output image (output). It is similar to the Max mathematical tool, except that it will accept more than two input images.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
min_overlay, max_absolute_overlay
Python API
def max_overlay(self, input_rasters: List[Raster]) -> Raster:
Min Absolute Overlay
Function name: min_absolute_overlay
This tool can be used to find the minimum absolute (non-negative) value in each cell of a grid from a set of input images (inputs). NoData values in any of the input images will result in a NoData pixel in the output image.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
min_overlay, max_absolute_overlay, max_overlay
Python API
def min_absolute_overlay(self, input_rasters: List[Raster]) -> Raster:
Min Overlay
Function name: min_overlay
This tool can be used to find the minimum value in each cell of a grid from a set of input images (inputs). NoData values in any of the input images will result in a NoData pixel in the output image (output). It is similar to the Min mathematical tool, except that it will accept more than two input images.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
max_overlay, max_absolute_overlay, min_absolute_overlay, Min
Python API
def min_overlay(self, input_rasters: List[Raster]) -> Raster:
Modulo
Function name: modulo
Experimental
Computes the remainder of dividing the first raster by the second on a cell-by-cell basis.
raster math modulo legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input1 | First input raster (path string or typed raster object). | Required | input1.tif |
| input2 | Second input raster (path string or typed raster object). | Required | input2.tif |
| output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | — |
Examples
Runs modulo on two DEM rasters and writes the result to dem_modulo.tif.
wbe.modulo(input1='dem_a.tif', input2='dem_b.tif', output='dem_modulo.tif')
Multiply
Function name: multiply
Experimental
Multiplies two rasters on a cell-by-cell basis.
raster math multiply legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input1 | First input raster (path string or typed raster object). | Required | input1.tif |
| input2 | Second input raster (path string or typed raster object). | Required | input2.tif |
| output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | — |
Examples
Runs multiply on two DEM rasters and writes the result to dem_product.tif.
wbe.multiply(input1='dem_a.tif', input2='dem_b.tif', output='dem_product.tif')
Multiply Overlay
Function name: multiply_overlay
This tool multiplies a stack of raster images (inputs) on a pixel-by-pixel basis. This tool is particularly well suited when you need to create a masking layer from the combination of several Boolean rasters, i.e. for constraint mapping applications. NoData values in any of the input images will result in a NoData pixel in the output image (output).
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
sum_overlay, weighted_sum
Python API
def multiply_overlay(self, input_rasters: List[Raster]) -> Raster:
Percent Equal To
Function name: percent_equal_to
This tool calculates the percentage of a raster stack (inputs) that have cell values equal to an input comparison raster. The user must specify the name of the value raster (comparison), the names of the raster files contained in the stack, and an output raster file name (output). The tool, working on a cell-by-cell basis, will count the number of rasters within the stack that have the same grid cell value as the corresponding grid cell in the comparison raster. This count is then expressed as a percentage of the number of rasters contained within the stack and output. If any of the rasters within the stack contain the NoData value, the corresponding grid cell in the output raster will be assigned NoData.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
percent_greater_than, percent_less_than
Python API
def percent_equal_to(self, input_rasters: List[Raster], comparison: Raster) -> Raster:
Percent Greater Than
Function name: percent_greater_than
This tool calculates the percentage of a raster stack (inputs) that have cell values greater than an input comparison raster. The user must specify the name of the value raster (comparison), the names of the raster files contained in the stack, and an output raster file name (output). The tool, working on a cell-by-cell basis, will count the number of rasters within the stack with grid cell values greater than the corresponding grid cell in the comparison raster. This count is then expressed as a percentage of the number of rasters contained within the stack and output. If any of the rasters within the stack contain the NoData value, the corresponding grid cell in the output raster will be assigned NoData.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
percent_less_than, percent_equal_to
Python API
def percent_greater_than(self, input_rasters: List[Raster], comparison: Raster) -> Raster:
Percent Less Than
Function name: percent_less_than
This tool calculates the percentage of a raster stack (inputs) that have cell values less than an input comparison raster. The user must specify the name of the value raster (comparison), the names of the raster files contained in the stack, and an output raster file name (output). The tool, working on a cell-by-cell basis, will count the number of rasters within the stack with grid cell values less than the corresponding grid cell in the comparison raster. This count is then expressed as a percentage of the number of rasters contained within the stack and output. If any of the rasters within the stack contain the NoData value, the corresponding grid cell in the output raster will be assigned NoData.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
percent_greater_than, percent_equal_to
Python API
def percent_less_than(self, input_rasters: List[Raster], comparison: Raster) -> Raster:
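The three percent tools above differ only in the comparison applied to each stack value. A minimal sketch of that shared logic is given below. The names `percent_of_stack` and `NODATA` are hypothetical, rasters are nested lists, and the result is scaled to 0-100 as an assumption; the text above does not state whether the tool reports a percentage or a 0-1 proportion.

```python
import operator

NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def percent_of_stack(stack, comparison, compare=operator.eq, nodata=NODATA):
    """Percentage of grids in the stack whose cell value satisfies compare(value, reference)."""
    rows, cols = len(comparison), len(comparison[0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [grid[r][c] for grid in stack]
            reference = comparison[r][c]
            if reference == nodata or nodata in values:
                out_row.append(nodata)
            else:
                hits = sum(1 for v in values if compare(v, reference))
                out_row.append(100.0 * hits / len(stack))  # assumed 0-100 scale
        result.append(out_row)
    return result


stack = [[[1.0, 5.0]], [[2.0, 5.0]], [[3.0, NODATA]], [[2.0, 5.0]]]
comparison = [[2.0, 5.0]]

equal_to = percent_of_stack(stack, comparison, operator.eq)
greater_than = percent_of_stack(stack, comparison, operator.gt)
less_than = percent_of_stack(stack, comparison, operator.lt)
print(equal_to, greater_than, less_than)
```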
Pick From List
Function name: pick_from_list
This tool outputs, for each grid cell, the value from the raster in a stack (inputs) that is selected by a position raster (pos_input). The user must specify the name of the position raster, the names of the raster files contained in the stack (i.e. group of rasters), and an output raster file name (output). Working on a cell-by-cell basis, the tool reads the cell value in the position raster, selects the stack image at that position, and assigns that image's value for the corresponding cell to the output grid cell. Importantly, the position raster must use zero-based indexing. That is, the first image in the stack is referenced by the value 0, the second by 1, and so on.
At least two input rasters are required to run this tool. Each of the input rasters must share the same number of rows and columns and spatial extent. An error will be issued if this is not the case.
See Also
count_if
Python API
def pick_from_list(self, input_rasters: List[Raster], pos_input: Raster) -> Raster:
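The selection rule can be sketched in a few lines. The example below is illustrative only: `pick_from_stack` and `NODATA` are hypothetical names, rasters are nested lists, and returning NoData for a missing or out-of-range position is an assumption, since the text above does not describe that case.

```python
NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def pick_from_stack(stack, positions, nodata=NODATA):
    """Select, per cell, the value from the grid whose zero-based index is in positions."""
    result = []
    for r, pos_row in enumerate(positions):
        out_row = []
        for c, pos in enumerate(pos_row):
            index = int(pos)
            if pos == nodata or not 0 <= index < len(stack):
                out_row.append(nodata)  # assumed handling of invalid positions
            else:
                out_row.append(stack[index][r][c])
        result.append(out_row)
    return result


stack = [
    [[10.0, 11.0], [12.0, 13.0]],
    [[20.0, 21.0], [22.0, 23.0]],
    [[30.0, 31.0], [32.0, 33.0]],
]
positions = [[0.0, 2.0], [1.0, 5.0]]
picked = pick_from_stack(stack, positions)
print(picked)
```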
Power
Function name: power
Experimental
Raises the first raster to the power of the second on a cell-by-cell basis.
raster math power legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input1 | First input raster (path string or typed raster object). | Required | input1.tif |
| input2 | Second input raster (path string or typed raster object). | Required | input2.tif |
| output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | — |
Examples
Runs power on two DEM rasters and writes the result to dem_power.tif.
wbe.power(input1='dem_a.tif', input2='dem_b.tif', output='dem_power.tif')
Standard Deviation Overlay
Function name: standard_deviation_overlay
This tool can be used to find the standard deviation of the values in each raster cell from a set of input rasters (inputs). NoData values in any of the input images will result in a NoData pixel in the output image (output).
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
min_overlay, max_overlay
Python API
def standard_deviation_overlay(self, input_rasters: List[Raster]) -> Raster:
Subtract
Function name: subtract
Experimental
Subtracts the second raster from the first on a cell-by-cell basis.
raster math subtract legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input1 | First input raster (path string or typed raster object). | Required | input1.tif |
| input2 | Second input raster (path string or typed raster object). | Required | input2.tif |
| output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | — |
Examples
Runs subtract on two DEM rasters and writes the result to dem_difference.tif.
wbe.subtract(input1='dem_a.tif', input2='dem_b.tif', output='dem_difference.tif')
Sum Overlay
Function name: sum_overlay
This tool calculates the sum for each grid cell from a group of raster images (inputs). NoData values in any of the input images will result in a NoData pixel in the output image (output).
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
weighted_sum, multiply_overlay
Python API
def sum_overlay(self, input_rasters: List[Raster]) -> Raster:
Update Nodata Cells
Function name: update_nodata_cells
This tool will assign the NoData valued cells in an input raster (input1) the values contained in the corresponding grid cells in a second input raster (input2). This operation is sometimes necessary because most other overlay operations exclude areas of NoData values from the analysis. This tool can be used when there is need to update the values of a raster within these missing data areas.
See Also
IsNodata
Python API
def update_nodata_cells(self, input1: Raster, input2: Raster) -> Raster:
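The fill rule is simple enough to show directly. In this minimal sketch, `fill_nodata` and `NODATA` are hypothetical names and rasters are nested lists; it is not the tool's implementation.

```python
NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def fill_nodata(grid1, grid2, nodata=NODATA):
    """Replace NoData cells in grid1 with the corresponding cells of grid2."""
    return [
        [v2 if v1 == nodata else v1 for v1, v2 in zip(row1, row2)]
        for row1, row2 in zip(grid1, grid2)
    ]


input1 = [[1.0, NODATA], [NODATA, 4.0]]
input2 = [[9.0, 8.0], [NODATA, 6.0]]
filled = fill_nodata(input1, input2)
print(filled)
```

A cell remains NoData in the result only when both inputs are NoData at that location.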
Weighted Overlay
Function name: weighted_overlay
This tool performs a weighted overlay on multiple input images. It can be used to combine multiple factors with varying levels of weight or relative importance. The weighted_overlay tool is similar to the weighted_sum tool but is more powerful because it automatically converts the input factors to a common user-defined scale and allows the user to specify benefit factors and cost factors. A benefit factor is a factor for which higher values are more suitable. A cost factor is a factor for which higher values are less suitable. By default, weighted_overlay assumes that input images are benefit factors, unless a cost value of 'true' is entered in the cost array. Constraints are absolute restrictions with values of 0 (unsuitable) and 1 (suitable). This tool is particularly useful for performing multi-criteria evaluations (MCE).
Notice that the algorithm will convert the user-defined factor weights internally such that the sum of the weights is always equal to one. As such, the user can specify the relative weights as decimals, percentages, or relative weightings (e.g. slope is 2 times more important than elevation, in which case the weights may not sum to 1 or 100).
NoData valued grid cells in any of the input images will be assigned NoData values in the output image. The output raster is of the float data type and continuous data scale.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
Python API
def weighted_overlay(self, factors: List[Raster], weights: List[float], cost: List[Raster] = None, constraints: List[Raster] = None, scale_max: float = 1.0) -> Raster:
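The internal weight normalization described above can be sketched as follows. This is a hedged illustration of the arithmetic only, using hypothetical helper names (`normalize_weights`, `weighted_cell`); it does not reproduce the tool's rescaling of factors, cost inversion, or constraint handling.

```python
def normalize_weights(weights):
    """Scale weights so they sum to one, whatever units the user supplied."""
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    return [w / total for w in weights]


def weighted_cell(values, weights):
    """Weighted combination of one cell's factor values using normalized weights."""
    normalized = normalize_weights(weights)
    return sum(v * w for v, w in zip(values, normalized))


# Relative weighting: slope is twice as important as elevation
relative = normalize_weights([2.0, 1.0])
# Percentages that sum to 100
percentages = normalize_weights([50.0, 30.0, 20.0])
# Two factor values already on a common 0-1 scale
score = weighted_cell([0.9, 0.3], [2.0, 1.0])
print(relative, percentages, score)
```

Because of this normalization, weights of [2, 1], [0.667, 0.333], and [66.7, 33.3] all express essentially the same relative importance.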
Weighted Sum
Function name: weighted_sum
This tool performs a weighted-sum overlay on multiple input raster images. If you have a stack of rasters that you would like to sum, each with an equal weighting (1.0), then use the sum_overlay tool instead.
Warning
Each of the input rasters must have the same spatial extent and number of rows and columns.
See Also
sum_overlay
Python API
def weighted_sum(self, input_rasters: List[Raster], weights: List[float]) -> Raster:
Distance and Cost
Buffer Raster
Function name: buffer_raster
This tool can be used to identify an area of interest within a specified distance of features of interest in a raster data set.
The Euclidean distance (i.e. straight-line distance) is calculated between each grid cell and the nearest 'target cell' in the input image. Distance is calculated using the efficient method of Shih and Wu (2004). Target cells are all non-zero, non-NoData grid cells. Because NoData values in the input image are assigned the NoData value in the output image, the only valid background value in the input image is zero.
The user must specify the input and output image names, the desired buffer size (size), and, optionally, whether the distance units are measured in grid cells (i.e. gridcells flag). If the gridcells flag is not specified, the linear units of the raster's coordinate reference system will be used.
Reference
Shih FY and Wu Y-T (2004), Fast Euclidean distance transformation in two scans using a 3 x 3 neighborhood, Computer Vision and Image Understanding, 93: 195-205.
See Also
euclidean_distance
Python API
def buffer_raster(self, input: Raster, buffer_size: float, grid_cells_units: bool = False) -> Raster:
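The relationship between target cells, distance units, and the buffer size can be illustrated with a brute-force sketch. It does not use the Shih and Wu distance transform, and several details are assumptions: the function name `buffer_grid`, the `cell_size` argument, the `NODATA` marker, and the choice to write 1 inside the buffer and 0 outside.

```python
import math

NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def buffer_grid(grid, buffer_size, cell_size=1.0, grid_cells_units=False, nodata=NODATA):
    """Mark cells within buffer_size of any non-zero, non-NoData target cell."""
    unit = 1.0 if grid_cells_units else cell_size  # distance covered by one cell step
    targets = [
        (r, c)
        for r, row in enumerate(grid)
        for c, v in enumerate(row)
        if v != 0 and v != nodata
    ]
    result = []
    for r, row in enumerate(grid):
        out_row = []
        for c, v in enumerate(row):
            if v == nodata:
                out_row.append(nodata)
                continue
            nearest = min(
                (math.hypot(r - tr, c - tc) for tr, tc in targets), default=math.inf
            )
            out_row.append(1.0 if nearest * unit <= buffer_size else 0.0)
        result.append(out_row)
    return result


grid = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, NODATA],
]
# Buffer of 10 map units on a 10-unit grid reaches one cell in each cardinal direction
map_units = buffer_grid(grid, buffer_size=10.0, cell_size=10.0)
# Buffer of 1.5 grid cells also reaches the diagonal neighbours
cell_units = buffer_grid(grid, buffer_size=1.5, cell_size=10.0, grid_cells_units=True)
print(map_units)
print(cell_units)
```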
Cost Allocation
Function name: cost_allocation
This tool can be used to identify the 'catchment area' of each source grid cell in a cost-distance analysis. The user must specify the names of the input source and back-link raster files. Source cells (i.e. starting points for the cost-distance or least-cost path analysis) are designated as all positive, non-zero valued grid cells in the source raster. A back-link raster file can be created using the cost_distance tool and is conceptually similar to the D8 flow-direction pointer raster grid in that it describes the connectivity between neighbouring cells on the accumulated cost surface.
NoData values in the input back-link image are assigned NoData values in the output image.
See Also
cost_distance, cost_pathway, euclidean_allocation
Python API
def cost_allocation(self, source: Raster, backlink: Raster) -> Raster:
Cost Distance
Function name: cost_distance
This tool can be used to perform cost-distance or least-cost pathway analyses. Specifically, this tool can be used to calculate the accumulated cost of traveling from the 'source grid cell' to each other grid cell in a raster dataset. It is based on the costs associated with traveling through each cell along a pathway represented in a cost (or friction) surface. If there are multiple source grid cells, each cell in the resulting cost-accumulation surface will reflect the accumulated cost to the source cell that is connected by the minimum accumulated cost-path. The user must specify the names of the raster file containing the source cells (source), the raster file containing the cost surface information (cost), the output cost-accumulation surface raster (out_accum), and the output back-link raster (out_backlink). Source cells are designated as all positive, non-zero valued grid cells in the source raster. The cost (friction) raster can be created by combining the various cost factors associated with the specific problem (e.g. slope gradient, visibility, etc.) using a raster calculator or the weighted_overlay tool.
While the cost-accumulation surface raster can be helpful for visualizing the three-dimensional characteristics of the 'cost landscape', it is actually the back-link raster that is used as input to the other two cost-distance tools, cost_allocation and cost_pathway, to determine the least-cost linkages among neighbouring grid cells on the cost surface. If the accumulated cost surface is analogous to a digital elevation model (DEM) then the back-link raster is equivalent to the D8 flow-direction pointer. In fact, it is created in a similar way and uses the same convention for designating 'flow directions' between neighbouring grid cells. The algorithm for the cost distance accumulation operation uses a type of priority-flood method similar to what is used for depression filling and flow accumulation operations.
NoData values in the input cost surface image are ignored during processing and assigned NoData values in the outputs. The output cost accumulation raster is of the float data type and continuous data scale.
See Also
cost_allocation, cost_pathway, weighted_overlay
Python API
def cost_distance(self, source: Raster, cost: Raster) -> Tuple[Raster, Raster]:
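To make the roles of the cost-accumulation and back-link outputs concrete, the sketch below runs a priority-queue (Dijkstra-style) accumulation over a small grid and then follows the back-links from a destination to the source. It is an illustration under stated assumptions, not the tool's algorithm: the step cost is taken as the mean of the two cell costs multiplied by the step length, the back-link is stored as a (row, column) offset rather than a D8 pointer code, and all names (`accumulate_cost`, `trace_path`, `NODATA`) are hypothetical.

```python
import heapq
import math

NODATA = -9999.0  # hypothetical NoData marker used only in this sketch

# Offsets to the eight neighbours of a cell
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def accumulate_cost(source, cost, nodata=NODATA):
    """Return (accumulated cost grid, back-link grid of offsets toward the source)."""
    rows, cols = len(cost), len(cost[0])
    accum = [[math.inf] * cols for _ in range(rows)]
    backlink = [[None] * cols for _ in range(rows)]
    heap = []
    for r in range(rows):
        for c in range(cols):
            is_source = source[r][c] != nodata and source[r][c] > 0
            if is_source and cost[r][c] != nodata:
                accum[r][c] = 0.0
                heapq.heappush(heap, (0.0, r, c))
    while heap:
        dist, r, c = heapq.heappop(heap)
        if dist > accum[r][c]:
            continue  # stale queue entry, a cheaper route was already found
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or cost[nr][nc] == nodata:
                continue
            # Assumed step cost: mean friction of both cells times step length
            step = math.hypot(dr, dc) * (cost[r][c] + cost[nr][nc]) / 2.0
            if dist + step < accum[nr][nc]:
                accum[nr][nc] = dist + step
                backlink[nr][nc] = (-dr, -dc)  # offset pointing back toward the source
                heapq.heappush(heap, (accum[nr][nc], nr, nc))
    for r in range(rows):
        for c in range(cols):
            if cost[r][c] == nodata:
                accum[r][c] = nodata
    return accum, backlink


def trace_path(backlink, start):
    """Follow back-link offsets from a destination cell to its source cell."""
    path = [start]
    r, c = start
    while backlink[r][c] is not None:
        dr, dc = backlink[r][c]
        r, c = r + dr, c + dc
        path.append((r, c))
    return path


source = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
cost = [[1.0, 4.0, 1.0], [1.0, NODATA, 1.0], [1.0, 1.0, 1.0]]
accum, backlink = accumulate_cost(source, cost)
path = trace_path(backlink, (0, 2))
print(accum)
print(path)
```

In this example the cheapest route to the top-right cell avoids the high-friction cell in the top row and detours around the NoData cell, which is the kind of pathway cost_pathway recovers from the back-link raster.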
Cost Pathway
Function name: cost_pathway
This tool can be used to map the least-cost pathway connecting each destination grid cell in a cost-distance analysis to a source cell. The user must specify the names of the input destination and back-link raster files. Destination cells (i.e. end points for the least-cost path analysis) are designated as all positive, non-zero valued grid cells in the destination raster. A back-link raster file can be created using the cost_distance tool and is conceptually similar to the D8 flow-direction pointer raster grid in that it describes the connectivity between neighbouring cells on the accumulated cost surface. All background grid cells in the output image are assigned the NoData value.
NoData values in the input back-link image are assigned NoData values in the output image.
See Also
cost_distance, cost_allocation
Python API
def cost_pathway(self, destination: Raster, backlink: Raster, zero_background: bool = False) -> Raster:
Euclidean Allocation
Function name: euclidean_allocation
This tool assigns grid cells in the output image the value of the nearest target cell in the input image, measured by the Euclidean distance (i.e. straight-line distance). Thus, euclidean_allocation essentially creates the Voronoi diagram for a set of target cells. Target cells are all non-zero, non-NoData grid cells in the input image. Distances are calculated using the same efficient algorithm (Shih and Wu, 2004) as the euclidean_distance tool.
Reference
Shih FY and Wu Y-T (2004), Fast Euclidean distance transformation in two scans using a 3 x 3 neighborhood, Computer Vision and Image Understanding, 93: 195-205.
See Also
euclidean_distance, voronoi_diagram, cost_allocation
Python API
def euclidean_allocation(self, input: Raster) -> Raster:
Euclidean Distance
Function name: euclidean_distance
This tool will estimate the Euclidean distance (i.e. straight-line distance) between each grid cell and the nearest 'target cell' in the input image. Target cells are all non-zero, non-NoData grid cells. Distance in the output image is measured in the same units as the horizontal units of the input image.
Algorithm Description
The algorithm is based on the highly efficient distance transform of Shih and Wu (2004). It makes four passes over the image: the first pass initializes the output image; the second and third passes calculate the minimum squared Euclidean distance by examining the 3 x 3 neighbourhood surrounding each cell; the last pass takes the square root of cell values, transforming them into true Euclidean distances, and deals with NoData values that may be present. All NoData value grid cells in the input image will contain NoData values in the output image. As such, NoData is not a suitable background value for non-target cells. Background areas should be designated with zero values.
Reference
Shih FY and Wu Y-T (2004), Fast Euclidean distance transformation in two scans using a 3 x 3 neighborhood, Computer Vision and Image Understanding, 93: 195-205.
See Also
euclidean_allocation, cost_distance
Python API
def euclidean_distance(self, input: Raster) -> Raster:
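The outputs of euclidean_distance and euclidean_allocation can be illustrated with a brute-force search over target cells. This sketch is deliberately not the Shih and Wu two-scan transform; it only shows what the results mean. The names `distance_and_allocation`, `cell_size`, and `NODATA` are hypothetical, and rasters are nested lists.

```python
import math

NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def distance_and_allocation(grid, cell_size=1.0, nodata=NODATA):
    """Return (distance grid, allocation grid) relative to non-zero, non-NoData targets."""
    targets = [
        (r, c, v)
        for r, row in enumerate(grid)
        for c, v in enumerate(row)
        if v != 0 and v != nodata
    ]
    if not targets:
        raise ValueError("the input grid contains no target cells")
    distance, allocation = [], []
    for r, row in enumerate(grid):
        dist_row, alloc_row = [], []
        for c, v in enumerate(row):
            if v == nodata:
                dist_row.append(nodata)  # NoData cells stay NoData in both outputs
                alloc_row.append(nodata)
                continue
            # Nearest target by straight-line distance, measured in cell units
            d, value = min((math.hypot(r - tr, c - tc), tv) for tr, tc, tv in targets)
            dist_row.append(d * cell_size)
            alloc_row.append(value)
        distance.append(dist_row)
        allocation.append(alloc_row)
    return distance, allocation


grid = [
    [3.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, NODATA, 7.0],
]
distance, allocation = distance_and_allocation(grid)
print(distance)
print(allocation)
```

Background cells are zero here, as recommended above; a NoData background would simply be copied to the output rather than receiving a distance.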
Spatial Statistics
Local Morans I Lisa Raster
Function name: local_morans_i_lisa_raster
No help documentation available for this tool.
Getis Ord Gi Star Raster
Function name: getis_ord_gi_star_raster
No help documentation available for this tool.
Ordinary Kriging
Function name: ordinary_kriging
No help documentation available for this tool.
Local Kriging
Function name: local_kriging
No help documentation available for this tool.
Simple Kriging
Function name: simple_kriging
No help documentation available for this tool.
Universal Kriging
Function name: universal_kriging
No help documentation available for this tool.
Spacetime Kriging
Function name: spacetime_kriging
No help documentation available for this tool.
Ordinary Cokriging
Function name: ordinary_cokriging
No help documentation available for this tool.
Spatial Lag Regression Raster
Function name: spatial_lag_regression_raster
No help documentation available for this tool.
Spatial Error Regression Raster
Function name: spatial_error_regression_raster
No help documentation available for this tool.
Geographically Weighted Regression Raster
Function name: geographically_weighted_regression_raster
No help documentation available for this tool.
Inhomogeneous Intensity Raster
Function name: inhomogeneous_intensity_raster
No help documentation available for this tool.
Reclass and Mask
Conditional Evaluation
Function name: conditional_evaluation
Experimental
Performs if-then-else conditional evaluation on raster cells.
raster math conditional legacy-port
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input raster path. | Required | input.tif |
| statement | Conditional expression evaluated per cell. | Required | value > 35.0 |
| true | Value or raster/expression used when condition is true. | Optional | 1.0 |
| false | Value or raster/expression used when condition is false. | Optional | 0.0 |
| output | Optional output raster path. | Optional | — |
Examples
Assign values based on a per-cell condition.
wbe.conditional_evaluation(false='dem.tif', input='dem.tif', output='conditional.tif', statement='value > 2500.0', true=2500.0)
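The example call above keeps cells at or below 2500 unchanged and caps everything else at 2500. The per-cell logic can be sketched in plain Python as follows. This is an illustration only: the condition is passed as a Python callable rather than a statement string, NoData cells are assumed to pass through unchanged, and `evaluate_condition` and `NODATA` are hypothetical names.

```python
NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def evaluate_condition(grid, condition, true_value, false_value, nodata=NODATA):
    """Per-cell if-then-else; true_value and false_value may be constants or grids."""

    def pick(choice, r, c):
        # A nested list is treated as a grid, anything else as a constant
        return choice[r][c] if isinstance(choice, list) else float(choice)

    result = []
    for r, row in enumerate(grid):
        out_row = []
        for c, value in enumerate(row):
            if value == nodata:
                out_row.append(nodata)  # assumed NoData handling
            elif condition(value):
                out_row.append(pick(true_value, r, c))
            else:
                out_row.append(pick(false_value, r, c))
        result.append(out_row)
    return result


dem = [[1200.0, 2600.0], [2500.0, NODATA]]
capped = evaluate_condition(dem, lambda value: value > 2500.0, 2500.0, dem)
print(capped)
```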
Reclass
Function name: reclass
This tool creates a new raster in which the value of each grid cell is determined by an input raster (input) and a collection of user-defined classes. The user must specify the New value, the From value, and the To Just Less Than value of each class triplet in the reclass_values parameter. Classes must be mutually exclusive. Reclass values must be given as a list of lists, where each inner list contains either three values (assign_mode=False) or two values (assign_mode=True). If assign_mode is True, each pair of values represents a New value and an Old value. As an example:
reclassed = wbe.reclass(raster, [[1.0, 0.0, 100.0], [2.0, 100.0, 200.0]], assign_mode=False)
Python API
def reclass(self, raster: Raster, reclass_values: List[List[float]], assign_mode: bool = False) -> Raster:
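The two reclass modes can be sketched without the Whitebox runtime. In this illustration the range test is taken as From <= value < To Just Less Than, following the parameter names above, and cells that match no class keep their original value, which is an assumption. `reclass_grid` and `NODATA` are hypothetical names.

```python
NODATA = -9999.0  # hypothetical NoData marker used only in this sketch


def reclass_grid(grid, reclass_values, assign_mode=False, nodata=NODATA):
    """Reclassify a grid using [new, from, to) triplets or [new, old] pairs."""

    def reclass_cell(value):
        if value == nodata:
            return nodata
        if assign_mode:
            for new, old in reclass_values:
                if value == old:
                    return new
        else:
            for new, lower, upper in reclass_values:
                if lower <= value < upper:  # upper bound is exclusive
                    return new
        return value  # assumed behaviour for unmatched values

    return [[reclass_cell(v) for v in row] for row in grid]


grid = [[50.0, 100.0, 199.9, 250.0, NODATA]]
ranges = reclass_grid(grid, [[1.0, 0.0, 100.0], [2.0, 100.0, 200.0]])
pairs = reclass_grid([[1.0, 2.0, 3.0]], [[10.0, 1.0], [20.0, 2.0]], assign_mode=True)
print(ranges)
print(pairs)
```

The value 100.0 falls into the second class, not the first, because each class ends just below its upper bound.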
Reclass Equal Interval
Function name: reclass_equal_interval
This tool reclassifies the values in an input raster (input) file based on an equal-interval scheme, where the user must specify the reclass interval value (interval), the starting value (start_val), and optionally, the ending value (end_val). Grid cells containing values that fall outside of the range defined by the starting and ending values, will be assigned their original values in the output grid. If the user does not specify an ending value, the tool will assign a very large positive value.
See Also
reclass
Python API
def reclass_equal_interval(self, raster: Raster, interval_size: float, start_value: float = float('-inf'), end_value: float = float('inf')) -> Raster:
Local and Neighborhood
Image Correlation Neighbourhood Analysis
Function name: image_correlation_neighbourhood_analysis
This tool can be used to perform neighbourhood-based (i.e. using roving search windows applied to each grid cell) correlation analysis on two input rasters (input1 and input2). The tool outputs a correlation value raster (output1) and a significance (p-value) raster (output2). Additionally, the user must specify the size of the search window (filter) and the correlation statistic (stat). Options for the correlation statistic include pearson, kendall, and spearman. Notice that Pearson's r is the most computationally efficient of the three correlation metrics but is unsuitable when the input distributions are non-linearly associated, in which case either Spearman's Rho or Kendall's tau-b correlation is better suited. Both Spearman and Kendall correlations evaluate monotonic associations without assuming linearity in the relation. Kendall's tau-b is by far the most computationally expensive of the three statistics and may not be suitable for larger search windows.
See Also
image_correlation, image_regression
Python API
def image_correlation_neighbourhood_analysis(self, raster1: Raster, raster2: Raster, filter_size: int = 11, correlation_stat: str = "pearson") -> Tuple[Raster, Raster]:
Natural Neighbour Interpolation
Function name: natural_neighbour_interpolation
This tool can be used to interpolate a set of input vector points (input) onto a raster grid using Sibson's (1981) natural neighbour method. Similar to inverse-distance-weight interpolation (idw_interpolation), the natural neighbour method performs a weighted averaging of nearby point values to estimate the attribute (field) value at grid cell intersections in the output raster (output). However, the two methods differ quite significantly in the way that neighbours are identified and in the weighting scheme. First, natural neighbour identifies neighbours to be used in the interpolation of a point by finding the points connected to the estimated value location in a Delaunay triangulation, that is, the so-called natural neighbours. This approach has the main advantage of not having to specify an arbitrary search distance or minimum number of nearest neighbours like many other interpolators do. Weights in the natural neighbour scheme are determined using an area-stealing approach, whereby the weight assigned to a neighbour's value is determined by the proportion of its Voronoi polygon that would be lost by inserting the interpolation point into the Voronoi diagram. That is, inserting the interpolation point into the Voronoi diagram results in the creation of a new polygon and shrinking the sizes of the Voronoi polygons associated with each of the natural neighbours. The larger the area by which a neighbour's polygon is reduced through the insertion, relative to the polygon of the interpolation point, the greater the weight given to the neighbour point's value in the interpolation. Interpolation weights sum to one because the sum of the reduced polygon areas must account for the entire area of the interpolation point's polygon.
The user must specify the attribute field containing point values (field). Alternatively, if the input Shapefile contains z-values, the interpolation may be based on these values (use_z). Either an output grid resolution (cell_size) must be specified or alternatively an existing base file (base) can be used to determine the output raster's (output) resolution and spatial extent. Natural neighbour interpolation generally produces a satisfactorily smooth surface within the region of data points but can produce spurious breaks in the surface outside of this region. Thus, it is recommended that the output surface be clipped to the convex hull of the input points (clip).
Reference
Sibson, R. (1981). "A brief description of natural neighbor interpolation (Chapter 2)". In V. Barnett (ed.). Interpolating Multivariate Data. Chichester: John Wiley. pp. 21–36.
See Also
idw_interpolation, NearestNeighbourGridding
Python API
def natural_neighbour_interpolation(self, points: Vector, field_name: str = "FID", use_z: bool = False, cell_size: float = 0.0, base_raster: Raster = None, clip_to_hull: bool = True) -> Raster:
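The area-stealing weights described above are easiest to see in one dimension, where Voronoi cells are intervals bounded by midpoints. The following is a minimal illustrative sketch, not the tool's implementation; the function name and the sample values are hypothetical. In this 1-D case the Sibson weights reduce to ordinary linear interpolation.

```python
from typing import Tuple


def natural_neighbour_weights_1d(a: float, b: float, x: float) -> Tuple[float, float]:
    """Sibson weights of samples at a < b for a query x between them (1-D case)."""
    if not a < x < b:
        raise ValueError("x must lie strictly between a and b")
    # Before insertion, the Voronoi boundary between a and b is their midpoint.
    old_boundary = (a + b) / 2.0
    # After insertion, x owns the interval between its midpoints with a and b.
    new_left = (a + x) / 2.0
    new_right = (x + b) / 2.0
    new_cell = new_right - new_left
    # Length "stolen" from each neighbour's cell by the new cell.
    stolen_from_a = old_boundary - new_left
    stolen_from_b = new_right - old_boundary
    return stolen_from_a / new_cell, stolen_from_b / new_cell


# Hypothetical samples: value 100 at x = 0 and value 200 at x = 10.
w_a, w_b = natural_neighbour_weights_1d(0.0, 10.0, 2.5)
estimate = w_a * 100.0 + w_b * 200.0
```

The two weights always sum to one because the stolen lengths together make up the whole of the new cell.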
Nearest Neighbour Interpolation
Function name: nearest_neighbour_interpolation
Creates a raster grid based on a set of vector points and assigns grid values using the nearest neighbour.
Python API
def nearest_neighbour_interpolation(self, points: Vector, field_name: str = "FID", use_z: bool = False, cell_size: float = 0.0, base_raster: Raster = None, max_dist: float = float('inf')) -> Raster:
General Tools
Abs
Function name: abs
Experimental
Calculates the absolute value of each raster cell.
raster math abs
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply abs transform to each non-nodata cell.
wbe.abs(input='dem.tif', output='abs_dem.tif')
Aggregate Raster
Function name: aggregate_raster
This tool can be used to reduce the grid resolution of a raster by a user-specified amount. For example, using an aggregation factor (agg_factor) of 2 would result in a raster with half the number of rows and columns. Depending on the aggregation type (type), the grid cell values in the output image will consist of the mean, sum, maximum, minimum, or range of the overlapping grid cells in the input raster (four cells in the case of an aggregation factor of 2).
See Also
resample
Python API
def aggregate_raster(self, raster: Raster, aggregation_factor: int = 2, aggregation_type: str = "mean") -> Raster:
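The block-wise reduction described above can be sketched in plain Python on a nested list. This is an illustrative sketch only: the statistic names in `REDUCERS`, the use of `None` for NoData, and the dropping of incomplete edge blocks are assumptions, not documented behaviour of aggregate_raster.

```python
from typing import Callable, Dict, List, Optional

Grid = List[List[Optional[float]]]  # None stands in for NoData (assumption)

# Hypothetical statistic names; the tool's accepted strings may differ.
REDUCERS: Dict[str, Callable[[List[float]], float]] = {
    "mean": lambda v: sum(v) / len(v),
    "sum": sum,
    "maximum": max,
    "minimum": min,
    "range": lambda v: max(v) - min(v),
}


def aggregate(grid: Grid, factor: int = 2, kind: str = "mean") -> Grid:
    """Reduce resolution by `factor`, summarising each factor x factor block."""
    reducer = REDUCERS[kind]
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out: Grid = []
    for r in range(rows):
        out_row: List[Optional[float]] = []
        for c in range(cols):
            block = [grid[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            valid = [v for v in block if v is not None]
            out_row.append(reducer(valid) if valid else None)
        out.append(out_row)
    return out


dem = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
coarse_mean = aggregate(dem, 2, "mean")   # 2 x 2 grid of block means
coarse_range = aggregate(dem, 2, "range")
```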
Anova
Function name: anova
This tool performs an Analysis of variance (ANOVA) test on the distribution of values in a raster (input) among a group of features (features). The ANOVA report is written to an output HTML report (output).
Python API
def anova(self, input_raster: Raster, features_raster: Raster, output_html_file: str) -> None:
Arccos
Function name: arccos
Experimental
Computes the inverse cosine (arccos) of each raster cell.
raster math arccos
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply arccos transform to each non-nodata cell.
wbe.arccos(input='dem.tif', output='arccos_dem.tif')
Arcosh
Function name: arcosh
Experimental
Computes the inverse hyperbolic cosine of each raster cell.
raster math arcosh
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply arcosh transform to each non-nodata cell.
wbe.arcosh(input='dem.tif', output='arcosh_dem.tif')
Arcsin
Function name: arcsin
Experimental
Computes the inverse sine (arcsin) of each raster cell.
raster math arcsin
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply arcsin transform to each non-nodata cell.
wbe.arcsin(input='dem.tif', output='arcsin_dem.tif')
Arctan
Function name: arctan
Experimental
Computes the inverse tangent (arctan) of each raster cell.
raster math arctan
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply arctan transform to each non-nodata cell.
wbe.arctan(input='dem.tif', output='arctan_dem.tif')
Arsinh
Function name: arsinh
Experimental
Computes the inverse hyperbolic sine of each raster cell.
raster math arsinh
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply arsinh transform to each non-nodata cell.
wbe.arsinh(input='dem.tif', output='arsinh_dem.tif')
Artanh
Function name: artanh
Experimental
Computes the inverse hyperbolic tangent of each raster cell.
raster math artanh
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply artanh transform to each non-nodata cell.
wbe.artanh(input='dem.tif', output='artanh_dem.tif')
Atan2
Function name: atan2
Experimental
Computes the four-quadrant inverse tangent using two rasters on a cell-by-cell basis.
raster math atan2 legacy-port
Parameters
Name | Description | Required | Default
input1 | First input raster (path string or typed raster object). | Required | input1.tif
input2 | Second input raster (path string or typed raster object). | Required | input2.tif
output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | —
Examples
Runs atan2 on two DEM rasters and writes the result to dem_atan2.tif.
wbe.atan2(input1='dem_a.tif', input2='dem_b.tif', output='dem_atan2.tif')
Block Maximum
Function name: block_maximum
Creates a raster grid based on a set of vector points and assigns grid values using a block maximum scheme.
Python API
def block_maximum(self, points: Vector, field_name: str = "FID", use_z: bool = False, cell_size: float = 0.0, base_raster: Raster = None) -> Raster:
Block Minimum
Function name: block_minimum
Creates a raster grid based on a set of vector points and assigns grid values using a block minimum scheme.
Python API
def block_minimum(self, points: Vector, field_name: str = "FID", use_z: bool = False, cell_size: float = 0.0, base_raster: Raster = None) -> Raster:
Boundary Shape Complexity
Function name: boundary_shape_complexity
This tool calculates a type of shape complexity index for raster objects, focused on the complexity of the boundary of polygons. The index uses the line_thinning tool to estimate a skeletonized network for each input raster polygon. The Boundary Shape Complexity (BSC) index is then calculated as the percentage of the skeletonized network belonging to exterior links. Polygons with more complex boundaries will possess more branching skeletonized networks, with each spur in the boundary possessing a short exterior branch. The two longest exterior links in the network are considered to be part of the main network. Therefore, polygons with complex-shaped boundaries will have a higher percentage of their skeleton networks consisting of exterior links. It is expected that simple convex hulls should have relatively low BSC index values.
Objects in the input raster (input) are designated by their unique identifiers. Identifier values should be positive, non-zero whole numbers.
See Also
shape_complexity_index_raster, line_thinning
Python API
def boundary_shape_complexity(self, raster: Raster) -> Raster:
Ceil
Function name: ceil
Experimental
Rounds each raster cell upward to the nearest integer.
raster math ceil
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply ceil transform to each non-nodata cell.
wbe.ceil(input='dem.tif', output='ceil_dem.tif')
Centroid Raster
Function name: centroid_raster
This tool calculates the centroid, or average location, of raster polygon objects. For vector features, use the centroid_vector tool instead.
See Also
centroid_vector
Python API
def centroid_raster(self, input: Raster) -> Tuple[Raster, str]:
Clip Raster To Polygon
Function name: clip_raster_to_polygon
This tool can be used to clip an input raster (input) to the extent of a vector polygon (shapefile). The user must specify the name of the input clip file (polygons), which must be a vector of a Polygon base shape type. The clip file may contain multiple polygon features. Polygon hole parts will be respected during clipping, i.e. polygon holes will be removed from the output raster by setting them to a NoData background value. Raster grid cells that fall outside of the polygons in the clip file will be assigned the NoData background value in the output file. By default, the output raster will be cropped to the spatial extent of the clip file, unless the maintain_dimensions parameter is used, in which case the output grid extent will match that of the input raster. The grid resolution of the output raster is the same as that of the input raster.
It is very important that the input raster and the input vector polygon file share the same projection. The result is unlikely to be satisfactory otherwise.
See Also
erase_polygon_from_raster
Python API
def clip_raster_to_polygon(self, raster: Raster, polygons: Vector, maintain_dimensions: bool = False) -> Raster:
Clump
Function name: clump
This tool re-categorizes data in a raster image by grouping cells that form discrete, contiguous areas into unique categories. Essentially this will produce a patch map from an input categorical raster, assigning each feature a unique identifier. The input raster should either be Boolean (1's and 0's) or categorical. The input raster could be created using the reclass tool or one of the comparison operators (GreaterThan, LessThan, EqualTo, NotEqualTo). Use the treat-zeros-as-background option (zero_back) if you would like to assign unique identifiers only to contiguous groups of non-zero values in the raster. Additionally, inter-cell connectivity can optionally include diagonally neighbouring cells if the diag flag is specified.
See Also
reclass, GreaterThan, LessThan, EqualTo, NotEqualTo
Python API
def clump(self, raster: Raster, diag: bool = False, zero_background: bool = False) -> Raster:
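The grouping of contiguous, equal-valued cells can be sketched as a breadth-first connected-component labelling. This is an illustration of the idea under stated assumptions, not the tool's code: the helper name `clump_grid` is hypothetical, labels are assumed to start at 1, and background cells are assumed to keep the value 0.

```python
from collections import deque
from typing import List


def clump_grid(grid: List[List[int]], diag: bool = False,
               zero_background: bool = False) -> List[List[int]]:
    """Label each contiguous group of equal-valued cells with a unique ID."""
    rows, cols = len(grid), len(grid[0])
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diag:  # optionally treat diagonal neighbours as connected
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    labels = [[0] * cols for _ in range(rows)]
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0 or (zero_background and grid[r][c] == 0):
                continue
            value = grid[r][c]
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # flood-fill the patch containing (r, c)
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == 0 and grid[nr][nc] == value):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels


boolean_image = [[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1]]
four_connected = clump_grid(boolean_image, diag=False, zero_background=True)
eight_connected = clump_grid(boolean_image, diag=True, zero_background=True)
```

With four-connectivity the three diagonal cells form three separate patches; with diagonal connectivity they merge into one.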
Cos
Function name: cos
Experimental
Computes the cosine of each raster cell value.
raster math cos
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply cos transform to each non-nodata cell.
wbe.cos(input='dem.tif', output='cos_dem.tif')
Cosh
Function name: cosh
Experimental
Computes the hyperbolic cosine of each raster cell.
raster math cosh
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply cosh transform to each non-nodata cell.
wbe.cosh(input='dem.tif', output='cosh_dem.tif')
Create Plane
Function name: create_plane
This tool can be used to create a new raster with values that are determined by the equation of a simple plane. The user must specify the name of a base raster (base) from which the output raster coordinate and dimensional information will be taken. In addition, the user must specify the planar slope gradient (S; gradient) in degrees, the planar slope direction or aspect (A; aspect; 0 to 360 degrees), and a constant value (k; constant). The equation of the plane is as follows:
Z = tan(S) × sin(A - 180) × X + tan(S) × cos(A - 180) × Y + k
where X and Y are the X and Y coordinates of each grid cell in the grid. Notice that A is the direction, or azimuth, that the plane is facing.
Python API
def create_plane(self, base_file: Raster, gradient: float, aspect: float, constant: float) -> Raster:
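As a worked instance of the plane equation above, the sketch below evaluates Z for a single coordinate. It assumes that both angles are supplied in degrees and converted to radians before the trigonometric functions are applied; the function name `plane_value` and the sample coordinates are hypothetical.

```python
import math


def plane_value(x: float, y: float, gradient_deg: float,
                aspect_deg: float, constant: float) -> float:
    """Z = tan(S) * sin(A - 180) * X + tan(S) * cos(A - 180) * Y + k."""
    s = math.radians(gradient_deg)          # slope gradient S
    a = math.radians(aspect_deg - 180.0)    # aspect A, rotated by 180 degrees
    return math.tan(s) * math.sin(a) * x + math.tan(s) * math.cos(a) * y + constant


# A 45-degree plane facing due south (A = 180): Z rises by 1 per unit of Y.
z_south = plane_value(x=10.0, y=20.0, gradient_deg=45.0, aspect_deg=180.0, constant=100.0)
```

For the south-facing case the X term vanishes and Z increases northward, so the surface slopes down toward the direction it faces.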
Crispness Index
Function name: crispness_index
The Crispness Index (C) provides a means of quantifying the crispness, or fuzziness, of a membership probability (MP) image. MP images describe the probability of each grid cell belonging to some feature or class. MP images contain values ranging from 0 to 1.
The index, as described by Lindsay (2006), is the ratio between the sum of the squared differences (from the image mean) in the MP image divided by the sum of the squared differences for the Boolean case in which the total probability, summed for the image, is arranged crisply.
C is closely related to a family of relative variation coefficients that measure variation in an MP image relative to the maximum possible variation (i.e. when the total probability is arranged such that grid cells contain only 1s or 0s). Notice that 0 ≤ C ≤ 1: a low C-value indicates a nearly uniform spatial distribution of any probability value, and C = 1 indicates a crisp spatial probability distribution, containing only 1's and 0's.
C is calculated as follows:
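$$C = \frac{SS_{mp}}{SS_B} = \frac{\sum (p_{ij} - \bar{p})^2}{\sum p_{ij}\,(1 - \bar{p})^2 + \bar{p}^2\,(RC - \sum p_{ij})}$$
where p_ij is the membership probability of the grid cell in row i and column j, p-bar is the image mean, and RC is the total number of grid cells (rows × columns).
Note that there is an error in the original published equation. Specifically, the denominator should read:
$$\sum p_{ij}\,(1 - \bar{p})^2 + \bar{p}^2\,(RC - \sum p_{ij})$$
instead of the originally published:
$$\sum p_{ij}\,(1 - \bar{p}^2) - \bar{p}^2\,(RC - \sum p_{ij})$$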
References
Lindsay, J. B. (2006). Sensitivity of channel mapping techniques to uncertainty in digital elevation data. International Journal of Geographical Information Science, 20(6), 669-692.
Python API
def crispness_index(self, raster: Raster, output_html_file: str) -> None:
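The corrected form of the index can be checked numerically with a short sketch. This is an illustration rather than the tool's implementation: the helper name `crispness` is hypothetical, NoData handling is omitted, and the index is undefined (division by zero) when every cell is 0 or every cell is 1.

```python
from typing import List


def crispness(mp: List[List[float]]) -> float:
    """Crispness Index C = SS_mp / SS_B using the corrected denominator."""
    values = [p for row in mp for p in row]
    n_cells = len(values)          # RC, the number of grid cells
    total = sum(values)            # sum of p_ij
    mean = total / n_cells         # p-bar
    ss_mp = sum((p - mean) ** 2 for p in values)
    ss_b = total * (1.0 - mean) ** 2 + mean ** 2 * (n_cells - total)
    return ss_mp / ss_b


c_crisp = crispness([[1.0, 0.0], [0.0, 1.0]])      # only 1s and 0s
c_uniform = crispness([[0.5, 0.5], [0.5, 0.5]])    # same value everywhere
c_mixed = crispness([[0.9, 0.1], [0.2, 0.8]])      # in between
```

The crisp image evaluates to 1, the uniform image to 0, and the mixed image to 0.5, matching the interpretation of C given above.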
Cross Tabulation
Function name: cross_tabulation
This tool can be used to perform a cross-tabulation on two input raster images (i1 and i2) containing categorical data, i.e. classes. It will output a contingency table in HTML format (output). A contingency table, also known as a cross tabulation or crosstab, is a type of table that displays the multivariate frequency distribution of the variables. These tables provide a basic picture of the interrelation between two categorical variables and can help find interactions between them. cross_tabulation can provide useful information about the nature of land-use/land-cover (LULC) changes between two dates of classified multi-spectral satellite imagery. For example, the extent of urban expansion could be described using the information about the extent of pixels in an 'urban' class in Date 2 that were previously assigned to other classes (e.g. agricultural LULC categories) in the Date 1 imagery.
Both input images must share the same grid, as the analysis requires a comparison of a pair of images on a cell-by-cell basis. If a grid cell contains a NoData value in either of the input images, the cell will be excluded from the analysis.
Python API
def cross_tabulation(self, raster1: Raster, raster2: Raster, output_html_file: str) -> None:
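The cell-by-cell counting behind a contingency table can be sketched with a `Counter` keyed by class pairs. The helper name, the use of `None` for NoData, and the class codes below are assumptions made for illustration only.

```python
from collections import Counter
from typing import List, Optional, Tuple

ClassGrid = List[List[Optional[int]]]  # None stands in for NoData (assumption)


def cross_tabulate(raster1: ClassGrid, raster2: ClassGrid) -> Counter:
    """Count co-occurrences of (class in raster1, class in raster2)."""
    table: Counter = Counter()
    for row1, row2 in zip(raster1, raster2):
        for a, b in zip(row1, row2):
            if a is None or b is None:
                continue  # cells with NoData in either image are excluded
            table[(a, b)] += 1
    return table


# Hypothetical class codes: 1 = forest, 2 = agriculture, 3 = water, 4 = urban.
date1 = [[1, 1, 2], [2, None, 3]]
date2 = [[1, 4, 4], [2, 4, 3]]
table = cross_tabulate(date1, date2)
became_urban = sum(n for (old, new), n in table.items() if new == 4 and old != 4)
```

Here `became_urban` counts the cells that moved into the urban class between the two dates, which is the kind of LULC change summary described above.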
Cumulative Distribution
Function name: cumulative_distribution
This tool converts the values in an input image (input) into a cumulative distribution function. Therefore, the output raster (output) will contain the cumulative probability value (0-1) of values equal to or less than the value in the corresponding grid cell in the input image. NoData values in the input image are not considered during the transformation and remain NoData values in the output image.
See Also
z_scores
Python API
def cumulative_distribution(self, raster: Raster) -> Raster:
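The transformation can be sketched as an empirical cumulative distribution computed from the sorted valid cells. This exact, sort-based approach is an assumption chosen for clarity (the tool may use a binned histogram instead), and `None` stands in for NoData.

```python
from bisect import bisect_right
from typing import List, Optional

Grid = List[List[Optional[float]]]


def cumulative_probability(grid: Grid) -> Grid:
    """Replace each valid cell with the fraction of valid cells <= its value."""
    valid = sorted(v for row in grid for v in row if v is not None)
    n = len(valid)
    return [[None if v is None else bisect_right(valid, v) / n for v in row]
            for row in grid]


image = [[1.0, 2.0], [2.0, None], [4.0, 3.0]]
cdf = cumulative_probability(image)
```

The largest valid value always maps to 1.0, and NoData cells pass through unchanged.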
Dbscan
Function name: dbscan
Description
This tool performs an unsupervised DBSCAN clustering operation, based on a series of input rasters (inputs). Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space. The DBSCAN algorithm identifies clusters in feature space by identifying regions of high density (core points) and the set of points connected to these high-density areas. Points in feature space that are not connected to high-density regions are labeled by the DBSCAN algorithm as 'noise' and the associated grid cell in the output raster (output) is assigned the nodata value. Areas of high density (i.e. core points) are defined as those points for which the number of neighbouring points within a search distance (search_dist) is greater than some user-defined minimum threshold (min_points).
The main advantages of the DBSCAN algorithm over other clustering methods, such as k-means (k_means_clustering), are that 1) you do not need to specify the number of clusters a priori, and 2) the method does not make assumptions about the shape of the clusters (spherical in the k-means method). However, DBSCAN does assume that the density of every cluster in the data is approximately equal, which may not be a valid assumption. DBSCAN may also produce unsatisfactory results if there is significant overlap among clusters, as it will aggregate the clusters. Finding search distance and minimum core-point density thresholds that apply globally to the entire data set may be very challenging or impossible for certain applications.
The DBSCAN algorithm is based on the calculation of distances in multi-dimensional space. Feature scaling is essential to the application of DBSCAN clustering, especially when the ranges of the features are different, for example, if they are measured in different units. Without scaling, features with larger ranges will have greater influence in computing the distances between points. The tool offers three options for feature scaling (scaling): 'None', 'Normalize', and 'Standardize'. Normalization simply rescales each of the features onto a 0-1 range. This is a good option for most applications, but it is highly sensitive to outliers because it is determined by the range between the minimum and maximum values. Standardization rescales predictors using their means and standard deviations, transforming the data into z-scores. This is a better option than normalization when you know that the data contain outlier values; however, it does assume that the feature data are somewhat normally distributed, or are at least symmetrical in distribution.
One should keep the impact of feature scaling in mind when setting the search_dist parameter. For example, if applying normalization, the entire range of values for each dimension of feature space will be bound within the 0-1 range, meaning that the search distance should be smaller than 1.0, and likely significantly smaller. If standardization is used instead, feature space is technically infinite, although the vast majority of the data are likely to be contained within the range -2.5 to 2.5.
Because the DBSCAN algorithm calculates distances in feature-space, like many other related algorithms, it suffers from the curse of dimensionality. Distances become less meaningful in high-dimensional space because the vastness of these spaces means that distances between points are less significant (more similar). As such, if the predictor list includes insignificant or highly correlated variables, it is advisable to exclude these features during the model-building phase, or to use a dimension reduction technique such as principal_component_analysis to transform the features into a smaller set of uncorrelated predictors.
Memory Usage
The peak memory usage of this tool is approximately 8 bytes per grid cell × # predictors.
See Also
k_means_clustering, modified_k_means_clustering, principal_component_analysis
Python API
def dbscan(self, input_rasters: List[Raster], scaling_method: str = "none", search_distance: float = 1.0, min_points: int = 5) -> Raster:
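The effect of the two scaling options on a feature containing an outlier can be sketched as follows. The helper names and sample values are hypothetical, and the use of the population standard deviation is an assumption; the tool may use a different estimator.

```python
import statistics
from typing import List


def normalize(values: List[float]) -> List[float]:
    """Rescale onto the 0-1 range using the minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values: List[float]) -> List[float]:
    """Convert to z-scores using the mean and (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical feature values with one outlier (2000.0).
feature = [100.0, 150.0, 200.0, 250.0, 2000.0]
normalized = normalize(feature)      # the four typical values fall below 0.08
standardized = standardize(feature)  # mean 0, standard deviation 1
```

After normalization the outlier pins the upper end of the range and compresses the remaining values near zero, which is why a search distance tuned for typical values can behave poorly when outliers are present.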
Decrement
Function name: decrement
Experimental
Subtracts 1 from each non-nodata raster cell.
raster math decrement
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply decrement transform to each non-nodata cell.
wbe.decrement(input='dem.tif', output='decrement_dem.tif')
Edge Proportion
Function name: edge_proportion
This tool will measure the edge proportion, i.e. the proportion of grid cells in a patch that are located along the patch's boundary, for an input raster image (input). Edge proportion is an indicator of polygon shape complexity and elongation. The user must specify the name of the output raster file (output), which will be a raster layer containing the input features assigned the edge proportion. The user may also optionally choose to output text data for easy input to a spreadsheet or database.
Objects in the input raster are designated by their unique identifiers. Identifier values must be positive, non-zero whole numbers.
See Also
shape_complexity_index_raster, linearity_index, elongation_ratio
Python API
def edge_proportion(self, raster: Raster) -> Tuple[Raster, str]:
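The measure can be sketched by counting, for each patch, the cells that touch a different value. This sketch assumes four-neighbour adjacency and treats cells on the raster border as edge cells; both choices, and the helper name, are assumptions rather than documented behaviour.

```python
from typing import Dict, List


def edge_proportions(grid: List[List[int]]) -> Dict[int, float]:
    """Proportion of each patch's cells that lie on the patch boundary."""
    rows, cols = len(grid), len(grid[0])
    totals: Dict[int, int] = {}
    edges: Dict[int, int] = {}
    for r in range(rows):
        for c in range(cols):
            pid = grid[r][c]
            if pid <= 0:
                continue  # identifiers must be positive, non-zero whole numbers
            totals[pid] = totals.get(pid, 0) + 1
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or grid[nr][nc] != pid:
                    edges[pid] = edges.get(pid, 0) + 1
                    break  # count each edge cell once
    return {pid: edges.get(pid, 0) / totals[pid] for pid in totals}


patches = [[2, 0, 0, 0, 0],
           [0, 1, 1, 1, 0],
           [0, 1, 1, 1, 0],
           [0, 1, 1, 1, 0],
           [0, 0, 0, 0, 0]]
proportions = edge_proportions(patches)
```

The compact 3 × 3 patch has eight of its nine cells on the boundary, while the single-cell patch has an edge proportion of 1.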
Equal To
Function name: equal_to
Experimental
Tests whether two rasters are equal on a cell-by-cell basis.
raster math equal_to legacy-port
Parameters
Name | Description | Required | Default
input1 | First input raster (path string or typed raster object). | Required | input1.tif
input2 | Second input raster (path string or typed raster object). | Required | input2.tif
output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | —
Examples
Runs equal_to on two DEM rasters and writes the result to dem_equal_to.tif.
wbe.equal_to(input1='dem_a.tif', input2='dem_b.tif', output='dem_equal_to.tif')
Erase Polygon From Raster
Function name: erase_polygon_from_raster
This tool can be used to set values in an input raster (input) to a NoData background value with a vector erasing polygon (polygons). The input erase polygon file must be a vector of a Polygon base shape type. The erase file may contain multiple polygon features. Polygon hole parts will be respected during erasing, i.e. polygon holes will not be removed from the output raster. Raster grid cells that fall inside of the polygons in the erase file will be assigned the NoData background value in the output file.
See Also
clip_raster_to_polygon
Python API
def erase_polygon_from_raster(self, raster: Raster, polygons: Vector) -> Raster:
Exp
Function name: exp
Experimental
Computes e raised to the power of each raster cell.
raster math exp
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply exp transform to each non-nodata cell.
wbe.exp(input='dem.tif', output='exp_dem.tif')
Exp2
Function name: exp2
Experimental
Computes 2 raised to the power of each raster cell.
raster math exp2
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply exp2 transform to each non-nodata cell.
wbe.exp2(input='dem.tif', output='exp2_dem.tif')
FFT Random Field
Function name: fft_random_field
No help documentation available for this tool.
Filter Raster Features By Area
Function name: filter_raster_features_by_area
This tool takes an input raster (input) containing integer-labelled features, such as the output of the clump tool, and removes all features that are smaller than a user-specified size (threshold), measured in grid cells. The user must specify the replacement value for removed features using the background parameter, which can be either zero or nodata.
See Also
clump
Python API
def filter_raster_features_by_area(self, input: Raster, threshold: int, zero_background: bool = False) -> Raster:
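The size filter can be sketched by counting the cells of each labelled feature and replacing those below the threshold. The helper name and the use of `None` for NoData are assumptions for illustration.

```python
from collections import Counter
from typing import List, Optional

Grid = List[List[Optional[int]]]  # None stands in for NoData (assumption)


def filter_by_area(grid: Grid, threshold: int, zero_background: bool = False) -> Grid:
    """Remove labelled features with fewer than `threshold` grid cells."""
    counts = Counter(v for row in grid for v in row if v is not None and v != 0)
    fill = 0 if zero_background else None
    return [[fill if (v in counts and counts[v] < threshold) else v for v in row]
            for row in grid]


labelled = [[1, 1, 2],
            [1, 0, 3],
            [3, 3, 3]]
kept_zero = filter_by_area(labelled, threshold=3, zero_background=True)
kept_nodata = filter_by_area(labelled, threshold=3, zero_background=False)
```

Feature 2 occupies a single cell and is removed; features 1 and 3 meet the threshold and are kept.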
Find Patch Edge Cells
Function name: find_patch_edge_cells
This tool will identify all grid cells situated along the edges of patches or class features within an input raster (input). Edge cells in the output raster (output) will have the patch identifier value assigned in the corresponding grid cell. All non-edge cells will be assigned zero in the output raster. Patches (or classes) are designated by positive, non-zero values in the input image. Zero-valued and NoData-valued grid cells are interpreted as background cells by the tool.
See Also
edge_proportion
Python API
def find_patch_edge_cells(self, raster: Raster) -> Raster:
Floor
Function name: floor
Experimental
Rounds each raster cell downward to the nearest integer.
raster math floor
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply floor transform to each non-nodata cell.
wbe.floor(input='dem.tif', output='floor_dem.tif')
Greater Than
Function name: greater_than
Experimental
Tests whether the first raster is greater than the second on a cell-by-cell basis.
raster math greater_than legacy-port
Parameters
Name | Description | Required | Default
input1 | First input raster (path string or typed raster object). | Required | input1.tif
input2 | Second input raster (path string or typed raster object). | Required | input2.tif
output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | —
Examples
Runs greater_than on two DEM rasters and writes the result to dem_greater_than.tif.
wbe.greater_than(input1='dem_a.tif', input2='dem_b.tif', output='dem_greater_than.tif')
Heat Map
Function name: heat_map
This tool is used to generate a raster heat map, or kernel density estimation surface raster from a set of vector points (input). Heat mapping is a visualization and modelling technique used to create the continuous density surface associated with the occurrences of a point phenomenon. Heat maps can therefore be used to identify point clusters by mapping the concentration of event occurrence. For example, heat maps have been used extensively to map the spatial distributions of crime events (i.e. crime mapping) or disease cases.
By default, the tool maps the density of raw occurrence events; however, the user may optionally specify an associated weights field (weights) from the point file's attribute table. When a weights field is specified, these values are simply multiplied by each of the individual components of the density estimate. Weights must be numeric.
The bandwidth parameter (bandwidth) determines the radius of the kernel used in calculation of the density surface. There are guidelines that statisticians use in determining an appropriate bandwidth for a particular population and data set, but often this parameter is determined through experimentation. The bandwidth of the kernel is a free parameter which exhibits a strong influence on the resulting estimate.
The user must specify the kernel function type (kernel). Options include 'uniform', 'triangular', 'epanechnikov', 'quartic', 'triweight', 'tricube', 'gaussian', 'cosine', 'logistic', 'sigmoid', and 'silverman'; 'quartic' is the default kernel type. Descriptions of each function can be found at the link above.
The characteristics of the output raster (resolution and extent) are determined by one of two optional parameters, cell_size and base. If the user optionally specifies the output grid cell size parameter (cell_size) then the coordinates of the output raster extent are determined by the input vector (i.e. the bounding box) and the specified cell size determines the number of rows and columns. If the user instead specifies the optional base raster file parameter (base), the output raster's coordinates (i.e. north, south, east, west) and row and column count, and therefore, resolution, will be the same as the base file.
Reference
Geomatics (2017) QGIS Heatmap Using Kernel Density Estimation Explained, online resource: https://www.geodose.com/2017/11/qgis-heatmap-using-kernel-density.html visited 02/06/2022.
Python API
def heat_map(self, points: Vector, field_name: str, bandwidth: float = 0.0, cell_size: float = 0.0, base_raster: Raster = None, kernel_function: str = "quartic") -> Raster:
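The role of the bandwidth and the optional weights can be sketched for a single output cell using the quartic kernel. The kernel is written here in its common one-dimensional form, K(u) = (15/16)(1 - u²)² for |u| ≤ 1; whether the tool applies this or another normalising constant is an assumption, as are the helper names and sample points.

```python
import math
from typing import List, Optional, Tuple


def quartic(u: float) -> float:
    """Quartic (biweight) kernel; zero beyond the bandwidth (|u| > 1)."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def density_at(x: float, y: float, points: List[Tuple[float, float]],
               bandwidth: float, weights: Optional[List[float]] = None) -> float:
    """Sum of kernel contributions from every point within the bandwidth."""
    total = 0.0
    for i, (px, py) in enumerate(points):
        u = math.hypot(x - px, y - py) / bandwidth  # distance in bandwidth units
        w = 1.0 if weights is None else weights[i]
        total += w * quartic(u)
    return total


events = [(0.0, 0.0), (3.0, 4.0), (100.0, 100.0)]
raw = density_at(0.0, 0.0, events, bandwidth=10.0)
weighted = density_at(0.0, 0.0, events, bandwidth=10.0, weights=[2.0, 1.0, 5.0])
```

The distant event contributes nothing because it lies outside the bandwidth, and doubling the weight of the coincident event raises the estimate by exactly one extra kernel peak.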
IDW Interpolation
Function name: idw_interpolation
Interpolation can be based on either a fixed number of neighbouring points or a fixed neighbourhood size. This tool is currently configured to perform the latter only, using a FixedRadiusSearch structure. Using a fixed number of neighbours will require use of a KD-tree structure. I've been testing one Rust KD-tree library but its performance does not appear to be satisfactory compared to the FixedRadiusSearch. I will need to explore other options here.
Another change that will need to be implemented is the use of a nodal function. The original Whitebox GAT tool allows for use of a constant or a quadratic. This tool only allows the former.
Python API
def idw_interpolation(self, points: Vector, field_name: str = "FID", use_z: bool = False, weight: float = 2.0, radius: float = 0.0, min_points: int = 0, cell_size: float = 0.0, base_raster: Raster = None) -> Raster:
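A fixed-radius inverse-distance-weighted estimate with a constant nodal function can be sketched for one grid cell as follows. This is not the tool's FixedRadiusSearch implementation: the helper name is hypothetical, a linear scan replaces the spatial index, and the infinite default radius is an assumption.

```python
import math
from typing import List, Optional, Tuple


def idw_estimate(x: float, y: float, samples: List[Tuple[float, float, float]],
                 weight: float = 2.0, radius: float = math.inf) -> Optional[float]:
    """Weighted mean of sample values within `radius`, weights = 1 / d**weight."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d > radius:
            continue  # outside the fixed search neighbourhood
        if d == 0.0:
            return value  # the cell coincides with a sample point
        w = 1.0 / d ** weight
        numerator += w * value
        denominator += w
    return numerator / denominator if denominator > 0.0 else None


# Hypothetical samples as (x, y, value).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (50.0, 50.0, 99.0)]
midway = idw_estimate(1.0, 0.0, samples, radius=5.0)
empty = idw_estimate(200.0, 200.0, samples, radius=5.0)
```

A cell with no samples inside the search radius has no estimate, which in a raster output would correspond to a NoData cell.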
Image Autocorrelation
Function name: image_autocorrelation
Spatial autocorrelation describes the extent to which a variable is either dispersed or clustered through space. In the case of a raster image, spatial autocorrelation refers to the similarity in the values of nearby grid cells. This tool measures the spatial autocorrelation of a raster image using the global Moran's I statistic. Moran's I varies from -1 to 1, where I = -1 indicates a dispersed, checkerboard type pattern and I = 1 indicates a clustered (smooth) surface. I = 0 occurs for a random distribution of values. image_autocorrelation computes Moran's I for the first lag only, meaning that it only takes into account the variability among the immediate neighbors of each grid cell.
The user must specify the names of one or more input raster images. In addition, the user must specify the contiguity type (contiguity; Rook's, King's, or Bishop's), which describes which neighboring grid cells are examined for the analysis. The following figure describes the available cases:
Rook's contiguity:
0 1 0
1 X 1
0 1 0
King's contiguity:
1 1 1
1 X 1
1 1 1
Bishop's contiguity:
1 0 1
0 X 0
1 0 1
The tool outputs an HTML report (output) which, for each input image (input), reports the Moran's I value and the variance, z-score, and p-value (significance) under normal and randomization sampling assumptions.
Use the image_correlation tool instead when there is need to determine the correlation among multiple raster inputs.
NoData values in the input image are ignored during the analysis.
See Also
image_correlation, image_correlation_neighbourhood_analysis
Python API
def image_autocorrelation(self, rasters: List[Raster], output_html_file: str, contiguity_type: str = "bishop") -> None:
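The first-lag global Moran's I can be sketched with binary neighbour weights defined by the contiguity type. This sketch computes only the statistic itself (not the variance, z-score, or p-value in the report); the helper name, the contiguity keys, and the use of `None` for NoData are assumptions.

```python
from typing import Dict, List, Optional, Tuple

OFFSETS: Dict[str, List[Tuple[int, int]]] = {
    "rook": [(-1, 0), (1, 0), (0, -1), (0, 1)],
    "bishop": [(-1, -1), (-1, 1), (1, -1), (1, 1)],
}
OFFSETS["king"] = OFFSETS["rook"] + OFFSETS["bishop"]


def morans_i(grid: List[List[Optional[float]]], contiguity: str = "rook") -> float:
    """Global Moran's I with binary weights for immediate neighbours only."""
    cells = {(r, c): v for r, row in enumerate(grid)
             for c, v in enumerate(row) if v is not None}
    mean = sum(cells.values()) / len(cells)
    dev = {key: v - mean for key, v in cells.items()}
    cross = 0.0       # sum of w_ij * dev_i * dev_j
    weight_sum = 0.0  # sum of w_ij
    for (r, c), d in dev.items():
        for dr, dc in OFFSETS[contiguity]:
            neighbour = dev.get((r + dr, c + dc))
            if neighbour is not None:
                cross += d * neighbour
                weight_sum += 1.0
    sum_sq = sum(d * d for d in dev.values())
    return len(cells) * cross / (weight_sum * sum_sq)


checkerboard = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
i_rook = morans_i(checkerboard, "rook")
i_bishop = morans_i(checkerboard, "bishop")
```

On a checkerboard every rook neighbour has the opposite value (I = -1), while every bishop neighbour has the same value (I = 1), which shows how strongly the contiguity choice can affect the result.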
Image Correlation
Function name: image_correlation
This tool can be used to estimate the Pearson product-moment correlation coefficient (r) between two or more input images (inputs). The r-value is a measure of the linear association in the variation of the input variables (images, in this case). The coefficient ranges from -1.0, indicating a perfect negative linear association, to 1.0, indicating a perfect positive linear association. An r-value of 0.0 indicates no correlation between the test variables.
Note that this index is a measure of the linear association; two variables may be strongly related by a non-linear association (e.g. a power function curve) which will lead to an apparent weak association based on the Pearson coefficient. In fact, non-linear associations are very common among spatial variables, e.g. terrain indices such as slope and contributing area. In such cases, it is advisable that the input images are transformed prior to the estimation of the Pearson coefficient, or that an alternative, non-parametric statistic be used, e.g. the Spearman rank correlation coefficient.
The user must specify the names of two or more input images (inputs). All input images must share the same grid, as the coefficient requires a comparison of a pair of images on a grid-cell-by-grid-cell basis. If more than two image names are selected, the correlation coefficient will be calculated for each pair of images and reported in the HTML output report (output) as a correlation matrix. Caution must be exercised when attempting to estimate the significance of a correlation coefficient derived from image data. The very high N-value (essentially the number of pixels in the image pair) means that even small correlation coefficients can be found to be statistically significant, despite being practically insignificant.
NoData values in either of the two input images are ignored during the calculation of the correlation between images.
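The effect of a non-linear association on the Pearson coefficient can be seen with a few numbers. The sketch below is plain Python and does not call the Whitebox API; `pearson_r` is a hypothetical helper, and the two value lists are made-up stand-ins for the cells of two co-registered rasters related by a power function.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical cell values: image_b is an exact power function of image_a.
image_a = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
image_b = [a ** 3 for a in image_a]

r_raw = pearson_r(image_a, image_b)
r_log = pearson_r(
    [math.log(a) for a in image_a],
    [math.log(b) for b in image_b],
)
print(f"r on raw values:       {r_raw:.4f}")
print(f"r after log transform: {r_log:.4f}")
```

Although the two series are perfectly related, r on the raw values is below 1.0; after a log transform of both series the relation becomes linear and r is essentially 1.0.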
See Also
image_correlation_neighbourhood_analysis, image_regression, image_autocorrelation
Python API
def image_correlation(self, rasters: List[Raster], output_html_file: str) -> None:
Image Regression
Function name: image_regression
This tool performs a bivariate linear regression analysis on two input raster images. The first image (i1) is considered to be the independent variable while the second image (i2) is considered to be the dependent variable in the analysis. Both input images must share the same grid, as the coefficient requires a comparison of a pair of images on a grid-cell-by-grid-cell basis. The tool will output an HTML report (output) summarizing the regression model, an Analysis of Variance (ANOVA), and the significance of the regression coefficients. The regression residuals can optionally be output as a new raster image (out_residuals) and the user can also optionally specify to standardize the residuals (standardize).
Note that the analysis performs a linear regression; two variables may be strongly related by a non-linear association (e.g. a power function curve) which will lead to an apparently weak fitting regression model. In fact, non-linear relations are very common among spatial variables, e.g. terrain indices such as slope and contributing area. In such cases, it is advisable that the input images are transformed prior to the analysis.
NoData values in either of the two input images are ignored during the regression analysis.
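The core of a bivariate regression, including the exclusion of NoData pairs, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the tool's code: `linear_regression` and `standardize` are hypothetical helpers, the NoData value is made up, and dividing residuals by the residual standard error is one common convention for standardization that may differ from what the tool does.

```python
import math

NODATA = -32768.0  # assumed NoData value for this example


def linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


def standardize(residuals):
    """Divide residuals by the residual standard error (n - 2 degrees of freedom)."""
    n = len(residuals)
    std_error = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return [r / std_error for r in residuals]


# Cell values of the independent (i1) and dependent (i2) images.
i1 = [1.0, 2.0, 3.0, NODATA, 4.0, 5.0]
i2 = [2.1, 3.9, 6.2, 7.0, 7.8, 10.0]

# Keep only cell pairs where both images hold valid data.
pairs = [(a, b) for a, b in zip(i1, i2) if a != NODATA and b != NODATA]
x = [a for a, _ in pairs]
y = [b for _, b in pairs]

slope, intercept, residuals = linear_regression(x, y)
std_residuals = standardize(residuals)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```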
Example usage
import whitebox_workflows
See Also
image_correlation, image_correlation_neighbourhood_analysis
Python API
def image_regression(self, independent_variable: Raster, dependent_variable: Raster, output_html_file: str, standardize_residuals: bool = False, output_scattergram: bool = False, num_samples: int = 1000) -> Raster:
Increment
Function name: increment
Experimental
Adds 1 to each non-nodata raster cell.
raster math increment
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply increment transform to each non-nodata cell.
wbe.increment(input='dem.tif', output='increment_dem.tif')
Inplace Add
Function name: inplace_add
Experimental
Performs an in-place addition operation (input1 += input2).
raster math legacy-port
Parameters
Name | Description | Required | Default
input1 | Input raster to modify. | Required | in1.tif
input2 | Input raster path or numeric constant. | Required | in2.tif
Examples
Modify input1 by adding input2.
wbe.inplace_add(input1='in1.tif', input2=10.5)
Inplace Divide
Function name: inplace_divide
Experimental
Performs an in-place division operation (input1 /= input2).
raster math legacy-port
Parameters
Name | Description | Required | Default
input1 | Input raster to modify. | Required | in1.tif
input2 | Input raster path or non-zero numeric constant. | Required | in2.tif
Examples
Modify input1 by dividing by input2.
wbe.inplace_divide(input1='in1.tif', input2=10.5)
Inplace Multiply
Function name: inplace_multiply
Experimental
Performs an in-place multiplication operation (input1 *= input2).
raster math legacy-port
Parameters
Name | Description | Required | Default
input1 | Input raster to modify. | Required | in1.tif
input2 | Input raster path or numeric constant. | Required | in2.tif
Examples
Modify input1 by multiplying with input2.
wbe.inplace_multiply(input1='in1.tif', input2=10.5)
Inplace Subtract
Function name: inplace_subtract
Experimental
Performs an in-place subtraction operation (input1 -= input2).
raster math legacy-port
Parameters
Name | Description | Required | Default
input1 | Input raster to modify. | Required | in1.tif
input2 | Input raster path or numeric constant. | Required | in2.tif
Examples
Modify input1 by subtracting input2.
wbe.inplace_subtract(input1='in1.tif', input2=10.5)
Integer Division
Function name: integer_division
Experimental
Divides two rasters and truncates each result toward zero.
raster math integer_division legacy-port
Parameters
Name | Description | Required | Default
input1 | First input raster (path string or typed raster object). | Required | input1.tif
input2 | Second input raster (path string or typed raster object). | Required | input2.tif
output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | —
Examples
Runs integer_division on two DEM rasters and writes the result to dem_integer_division.tif.
wbe.integer_division(input1='dem_a.tif', input2='dem_b.tif', output='dem_integer_division.tif')
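Truncation toward zero differs from Python's own `//` operator, which rounds toward negative infinity, so the two disagree whenever the quotient is negative. The short comparison below uses only the standard library and made-up operand pairs; it illustrates the rounding rule described above and does not call the tool.

```python
import math

# (numerator, denominator) pairs covering all sign combinations.
pairs = [(7, 2), (-7, 2), (7, -2), (-7, -2)]

# Truncation toward zero, as described for integer_division.
truncated = [math.trunc(a / b) for a, b in pairs]

# Python floor division, shown for contrast.
floored = [a // b for a, b in pairs]

for (a, b), t, f in zip(pairs, truncated, floored):
    print(f"{a:>2} / {b:>2}: truncated = {t:>2}, floored = {f:>2}")
```

Post-processing that reproduces the tool's output in Python should therefore use `math.trunc` (or `int`) on the true quotient rather than `//`.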
Inverse PCA
Function name: inverse_pca
Description
This tool takes two or more component images (inputs), and the principal component analysis (PCA) report derived using the principal_component_analysis tool, and performs the inverse PCA transform to derive the original series of input images. This inverse transform is frequently performed to reduce noise within a multi-spectral image data set. With a typical PCA transform, high-frequency noise will commonly map onto the higher component images. By excluding one or more higher-valued component images from the input component list, the inverse transform can produce a set of images in the original coordinate system that exclude the information contained within component images excluded from the input list. Note that the number of output images will also equal the number of original images input to the principal_component_analysis tool. The output images will be named automatically with an "inv_PCA_image" suffix.
See Also
principal_component_analysis
Python API
def inverse_pca(self, rasters: List[Raster], pca_report_file: str) -> List[Raster]:
Is Nodata
Function name: is_nodata
Experimental
Outputs 1 for nodata cells and 0 for all valid cells.
raster math is_nodata
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Identify nodata cells.
wbe.is_nodata(input='dem.tif', output='dem_is_nodata.tif')
Kappa Index
Function name: kappa_index
This tool calculates the Kappa index of agreement (KIA), or Cohen's Kappa, for two categorical input raster images (input1 and input2). The KIA is a measure of inter-rater reliability (i.e. classification accuracy) and is widely applied in many fields, notably remote sensing. For example, the KIA is often used as a means of assessing the accuracy of an image classification analysis. The KIA can be interpreted as the percentage improvement that the underlying classification has over and above a random classifier (i.e. random assignment to categories). The user must specify the output HTML file (output). The input images must be of a categorical data type, i.e. contain classes. As a measure of classification accuracy, the KIA is more robust than the overall percent agreement because it takes into account the agreement occurring by chance. A KIA of 0 would indicate that the classifier is no better than random class assignment. In addition to the KIA, this tool will also output the producer's and user's accuracy, the overall accuracy, and the error matrix.
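The relationship between overall agreement and chance agreement can be illustrated with a small calculation. The sketch below is plain Python, independent of the Whitebox API: `cohens_kappa` is a hypothetical helper and the two class lists are invented stand-ins for the cells of a classified raster and a reference raster.

```python
from collections import Counter


def cohens_kappa(classified, reference):
    """Cohen's kappa for two equal-length sequences of class labels."""
    n = len(classified)

    # Observed agreement: proportion of cells with matching classes.
    observed = sum(c == r for c, r in zip(classified, reference)) / n

    # Agreement expected by chance, from the marginal class frequencies.
    classified_counts = Counter(classified)
    reference_counts = Counter(reference)
    expected = sum(
        classified_counts[k] * reference_counts[k] for k in classified_counts
    ) / (n * n)

    return (observed - expected) / (1.0 - expected)


classified = [1, 1, 1, 1, 2, 2, 2, 2]
reference = [1, 1, 1, 2, 2, 2, 2, 1]

kappa = cohens_kappa(classified, reference)
print(f"kappa = {kappa:.2f}")
```

Here six of eight cells agree (75% overall agreement), but half of that agreement would be expected by chance, so the kappa is 0.5.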
See Also
cross_tabulation
Python API
def kappa_index(self, class_raster: Raster, reference_raster: Raster, output_html_file: str = "") -> None:
KS Normality Test
Function name: ks_normality_test
This tool will perform a Kolmogorov-Smirnov (K-S) test for normality to evaluate whether the values within a raster image are drawn from a Gaussian (normal) distribution. The user must specify the name of the raster image. The test can be performed optionally on the entire image or on a random sub-sample of pixel values of a user-specified size. In evaluating the significance of the test, it is important to keep in mind that given a sufficiently large sample, extremely small and non-notable differences can be found to be statistically significant. Furthermore, statistical significance says nothing about the practical significance of a difference.
See Also
two_sample_ks_test
Python API
def ks_normality_test(self, raster: Raster, output_html_file: str, num_samples: int) -> None:
Less Than
Function name: less_than
Experimental
Tests whether the first raster is less than the second on a cell-by-cell basis.
raster math less_than legacy-port
Parameters
Name | Description | Required | Default
input1 | First input raster (path string or typed raster object). | Required | input1.tif
input2 | Second input raster (path string or typed raster object). | Required | input2.tif
output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | —
Examples
Runs less_than on two DEM rasters and writes the result to dem_less_than.tif.
wbe.less_than(input1='dem_a.tif', input2='dem_b.tif', output='dem_less_than.tif')
List Unique Values Raster
Function name: list_unique_values_raster
This function can be used to list each of the unique values contained within a categorical raster (raster). The tool outputs a string containing a comma-separated values (CSV) table of the unique values and their frequency of occurrence within the data. The input raster should not contain continuous floating-point numerical data, because the number of categories will likely equal the number of pixels, which may be quite large.
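The kind of table described above can be mimicked with the standard library. The following sketch is only an illustration: `unique_values_csv` is a hypothetical helper, the cell values are invented, and the exact header and layout of the string returned by the tool are not documented here, so the `value,frequency` header is an assumption.

```python
from collections import Counter


def unique_values_csv(cells, nodata=None):
    """Build a CSV string of unique values and their frequencies."""
    counts = Counter(value for value in cells if value != nodata)
    lines = ["value,frequency"]  # assumed header, for illustration only
    lines += [f"{value},{counts[value]}" for value in sorted(counts)]
    return "\n".join(lines)


# Invented class values for the cells of a small categorical raster;
# -9999 plays the role of NoData.
cells = [1, 2, 1, 5, 2, 1, -9999]
table = unique_values_csv(cells, nodata=-9999)
print(table)
```

A raster of continuous floating-point values would produce close to one row per cell, which is why the tool is intended for categorical data.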
See Also
list_unique_values
Python API
def list_unique_values_raster(self, raster: Raster) -> str:
Ln
Function name: ln
Experimental
Computes the natural logarithm of each raster cell.
raster math ln
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply ln transform to each non-nodata cell.
wbe.ln(input='dem.tif', output='ln_dem.tif')
Log10
Function name: log10
Experimental
Computes the base-10 logarithm of each raster cell.
raster math log10
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply log10 transform to each non-nodata cell.
wbe.log10(input='dem.tif', output='log10_dem.tif')
Log2
Function name: log2
Experimental
Computes the base-2 logarithm of each raster cell.
raster math log2
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply log2 transform to each non-nodata cell.
wbe.log2(input='dem.tif', output='log2_dem.tif')
Map Features
Function name: map_features
Experimental
Maps discrete elevated terrain features from a raster using descending-priority region growth.
raster gis features legacy-port
Parameters
Name | Description | Required | Default
input | Input raster. | Required | input.tif
min_feature_height | Minimum vertical separation for independent features. | Required | 1.0
min_feature_size | Minimum feature size in cells. | Required | 1
output | Optional output raster path. | Optional | —
Examples
Labels terrain features using descending-elevation region growth.
wbe.map_features(input='input.tif', min_feature_height=1.0, min_feature_size=1, output='map_features.tif')
Max
Function name: max
Experimental
Performs a MAX operation on two rasters or a raster and a constant value.
raster math max legacy-port
Parameters
Name | Description | Required | Default
input1 | First raster path or numeric constant. | Required | in1.tif
input2 | Second raster path or numeric constant. | Required | in2.tif
output | Optional output raster path. | Optional | —
Examples
Compute cellwise maximum between a raster and a constant.
wbe.max(input1='in1.tif', input2='15.0', output='max_output.tif')
Min
Function name: min
Experimental
Performs a MIN operation on two rasters or a raster and a constant value.
raster math min legacy-port
Parameters
Name | Description | Required | Default
input1 | First raster path or numeric constant. | Required | in1.tif
input2 | Second raster path or numeric constant. | Required | in2.tif
output | Optional output raster path. | Optional | —
Examples
Compute cellwise minimum between a raster and a constant.
wbe.min(input1='in1.tif', input2='15.0', output='min_output.tif')
Modified Shepard Interpolation
Function name: modified_shepard_interpolation
This tool interpolates vector points into a raster surface using a radial basis function (RBF) scheme.
Python API
def radial_basis_function_interpolation(self, points: Vector, field_name: str = "FID", use_z: bool = False, radius: float = 0.0, min_points: int = 0, cell_size: float = 0.0, base_raster: Raster = None, func_type: str = "thinplatespline", poly_order: str = "none", weight: float = 0.1) -> Raster:
Narrowness Index
Function name: narrowness_index
This tool calculates a type of shape narrowness index (NI) for raster objects. The index is equal to:
NI = A / (π × MD²)
where A is the patch area and MD is the maximum distance-to-edge of the patch. Circular-shaped patches will have a narrowness index near 1.0, while more narrow patch shapes will have higher index values. The index may be conceptualized as the ratio of the patch area to the area of the largest contained circle, although in practice the circle defined by the radius of the maximum distance-to-edge will often fall outside the patch boundaries.
Objects in the input raster (input) are designated by their unique identifiers. Identifier values must be positive, non-zero whole numbers. It is quite common for identifiers to be set using the clump tool applied to some kind of thresholded raster.
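A quick numerical check of the formula helps with interpreting the index. The sketch below evaluates NI for two idealized shapes using plain Python; `narrowness_index` here is a hypothetical stand-alone function taking an area and a maximum distance-to-edge, not the raster-based tool, and the shape dimensions are made up.

```python
import math


def narrowness_index(area, max_distance_to_edge):
    """NI = A / (pi * MD^2) for a single patch."""
    return area / (math.pi * max_distance_to_edge ** 2)


# Circle of radius 10: the centre is 10 units from the edge.
circle_ni = narrowness_index(math.pi * 10.0 ** 2, 10.0)

# Strip of 100 x 4 units: no interior point is more than 2 units from an edge.
strip_ni = narrowness_index(100.0 * 4.0, 2.0)

print(f"circle: {circle_ni:.2f}")
print(f"strip:  {strip_ni:.2f}")
```

The circle evaluates to 1.0, while the long, thin strip gives a value above 30, consistent with narrower patches having higher index values.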
See Also
linearity_index, elongation_ratio, clump
Python API
def narrowness_index(self, raster: Raster) -> Raster:
Negate
Function name: negate
Experimental
Negates each non-nodata raster cell value.
raster math negate
Parameters
Name | Description | Required | Default
input | Input raster file path. | Required | input.tif
output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif
Examples
Apply negate transform to each non-nodata cell.
wbe.negate(input='dem.tif', output='negate_dem.tif')
Nibble
Function name: nibble
The nibble function assigns areas within an input class map raster that are coincident with a mask the value of their nearest neighbour. Nibble is typically used to replace erroneous sections in a class map. Cells in the mask raster that are either NoData or zero values will be replaced in the input image with their nearest non-masked value. All input raster values in non-mask areas will be unmodified.
There are two input parameters that are related to how NoData cells in the input raster are handled during the nibble operation. The use_nodata Boolean determines whether or not input NoData cells, not contained within masked areas, are treated as ordinary values during the nibble. It is False by default, meaning that NoData cells in the input raster do not extend into nibbled areas. When the nibble_nodata parameter is True, any NoData cells in the input raster that are within the masked area are also NoData in the output raster; when nibble_nodata is False these cells will be nibbled.
See Also
sieve
Python API
def nibble(self, input_raster: Raster, mask: Raster, use_nodata: bool = False, nibble_nodata: bool = True) -> Raster:
Not Equal To
Function name: not_equal_to
Experimental
Tests whether two rasters are not equal on a cell-by-cell basis.
raster math not_equal_to legacy-port
Parameters
Name | Description | Required | Default
input1 | First input raster (path string or typed raster object). | Required | input1.tif
input2 | Second input raster (path string or typed raster object). | Required | input2.tif
output | Optional output raster file path. If omitted, output remains in memory and is returned as a memory:// raster handle. | Optional | —
Examples
Runs not_equal_to on two DEM rasters and writes the result to dem_not_equal_to.tif.
wbe.not_equal_to(input1='dem_a.tif', input2='dem_b.tif', output='dem_not_equal_to.tif')
Paired Sample T Test
Function name: paired_sample_t_test
This tool will perform a paired-sample t-test to evaluate whether a significant statistical difference exists between the two rasters. The null hypothesis is that the difference between the paired population means is equal to zero. The paired-samples t-test makes an assumption that the differences between related samples follow a Gaussian distribution. The tool will output a cumulative probability distribution, with a fitted Gaussian, to help users evaluate whether this assumption is violated by the data. If this is the case, the wilcoxon_signed_rank_test should be used instead.
The user must specify the names of the two input raster images (input1 and input2) and the output report HTML file (output). The test can be performed optionally on the entire image or on a random sub-sample of pixel values of a user-specified size (num_samples). In evaluating the significance of the test, it is important to keep in mind that given a sufficiently large sample, extremely small and non-notable differences can be found to be statistically significant. Furthermore, statistical significance says nothing about the practical significance of a difference.
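The statistic behind a paired-sample t-test is the mean of the cell-by-cell differences divided by its standard error. The following sketch uses only the Python standard library and invented cell values; `paired_t_statistic` is a hypothetical helper, and the snippet does not compute a p-value or reproduce the tool's report.

```python
import math
import statistics


def paired_t_statistic(values1, values2):
    """Return (t, degrees of freedom) for paired samples."""
    differences = [a - b for a, b in zip(values1, values2)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    # Sample standard deviation of the differences (n - 1 denominator).
    sd_diff = statistics.stdev(differences)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Invented values for the same five cells in two rasters.
raster1_values = [11.0, 12.0, 13.0, 14.0, 15.0]
raster2_values = [10.0, 10.0, 10.0, 10.0, 10.0]

t_stat, dof = paired_t_statistic(raster1_values, raster2_values)
print(f"t = {t_stat:.4f} with {dof} degrees of freedom")
```

Because the standard error shrinks with the square root of the sample size, a full raster with millions of cells can yield a very large t value even for a trivially small mean difference, which is the motivation for the num_samples option.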
See Also
two_sample_ks_test, wilcoxon_signed_rank_test
Python API
def paired_sample_t_test(self, raster1: Raster, raster2: Raster, output_html_file: str, num_samples: int) -> None:
Phi Coefficient
Function name: phi_coefficient
Description
This tool performs a binary classification accuracy assessment, using the Phi coefficient. The Phi coefficient is a measure of association for two binary variables. Introduced by Karl Pearson, this measure is similar to the Pearson correlation coefficient in its interpretation and is related to the chi-squared statistic for a 2×2 contingency table. The user must specify the names of two input images (input1 and input2), containing categorical data.
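The Phi coefficient can be computed directly from the four cells of the 2×2 contingency table. The sketch below is plain Python on invented binary sequences; `phi_coefficient` here is a hypothetical stand-alone helper that assumes both inputs contain only 0 and 1, and it is not the tool's implementation.

```python
import math


def phi_coefficient(a, b):
    """Phi coefficient for two equal-length sequences of 0/1 values."""
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)

    # Product of the four marginal totals of the 2x2 table.
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    return (n11 * n00 - n10 * n01) / denominator


identical = phi_coefficient([1, 1, 0, 0], [1, 1, 0, 0])
unrelated = phi_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
opposite = phi_coefficient([1, 1, 0, 0], [0, 0, 1, 1])
print(identical, unrelated, opposite)
```

As with the Pearson coefficient, the result ranges from -1 (complete disagreement) through 0 (no association) to 1 (complete agreement).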
Python API
def phi_coefficient(self, raster1: Raster, raster2: Raster, output_html_file: str) -> None:
Principal Component Analysis
Function name: principal_component_analysis
Principal component analysis (PCA) is a common data reduction technique that is used to reduce the dimensionality of multi-dimensional space. In the field of remote sensing, PCA is often used to reduce the number of bands of multi-spectral, or hyper-spectral, imagery. Image correlation analysis often reveals a substantial level of correlation among bands of multi-spectral imagery. This correlation represents data redundancy, i.e. fewer images than the number of bands are required to represent the same information, where the information is related to variation within the imagery. PCA transforms the original data set of n bands into n 'component' images, where each component image is uncorrelated with all other components. The technique works by transforming the axes of the multi-spectral space such that they coincide with the directions of greatest correlation. Each of these new axes is orthogonal to the others, i.e. they are at right angles. PCA is therefore a type of coordinate system transformation. The PCA component images are arranged such that the greatest amount of variance (or information) within the original data set is contained within the first component and the amount of variance decreases with each component. It is often the case that the majority of the information contained in a multi-spectral data set can be represented by the first three or four PCA components. The higher-order components are often associated with noise in the original data set.
The user must specify the names of the multiple input images (inputs). Additionally, the user must specify whether to perform a standardized PCA (standardized) and the number of output components (num_comp) to generate (all components will be output unless otherwise specified). A standardized PCA is performed using the correlation matrix rather than the variance-covariance matrix. This is appropriate when the variances in the input images differ substantially, such as would be the case if they contained values that were recorded in different units (e.g. feet and meters) or on different scales (e.g. 8-bit vs. 16-bit).
Several outputs will be generated when the tool has completed. The PCA report will be embedded within an output (output) HTML file, which should be automatically displayed after the tool has completed. This report contains useful data summarizing the results of the PCA, including the explained variances of each factor, the Eigenvalues and Eigenvectors associated with factors, the factor loadings, and a scree plot. The first table in the PCA report lists the amount of explained variance (in non-cumulative and cumulative form), the Eigenvalue, and the Eigenvector for each component. Each PCA component refers to one of the transformed images created by running the tool. The amount of explained variance associated with each component can be thought of as a measure of how much of the information content within the original multi-spectral data set a component holds. The higher this value is, the more important the component is. This same information is presented in graphical form in the scree plot, found at the bottom of the PCA report. The Eigenvalue is another measure of the information content of a component and the Eigenvector describes the mathematical transformation (rotation coordinates) that corresponds to a particular component image.
Factor loadings are also output in a table within the PCA text report (second table). These loading values describe the correlation (i.e. r values) between each of the PCA components (columns) and the original images (rows). These values show you how the information contained in an image is spread among the components. An analysis of factor loadings can reveal useful information about the data set. For example, it can help to identify groups of similar images.
PCA is used to reduce the number of band images necessary for classification (i.e. as a data reduction technique), for noise reduction, and for change detection applications. When used as a change detection technique, the major PCA components tend to be associated with stable elements of the data set while variance due to land-cover change tends to manifest in the high-order, 'change components'. When used as a noise reduction technique, an inverse PCA is generally performed, leaving out one or more of the high-order PCA components, which account for noise variance.
Note: the current implementation reads every raster into memory at one time. This is because of the calculation of the co-variances. As such, if the entire image stack cannot fit in memory, the tool will likely experience an out-of-memory error. This tool should be run using the wd flag to specify the working directory into which the component images will be written.
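The difference between a standardized and an unstandardized PCA is easiest to see with two bands, where the eigenvalues of the 2×2 matrix have a closed form. The sketch below is plain Python with invented band values and hypothetical helper names; it only computes the percentage of explained variance per component and is not the tool's implementation.

```python
import math


def covariance_terms(band1, band2):
    """Return (var1, cov, var2) using the sample (n - 1) denominator."""
    n = len(band1)
    mean1 = sum(band1) / n
    mean2 = sum(band2) / n
    var1 = sum((v - mean1) ** 2 for v in band1) / (n - 1)
    var2 = sum((v - mean2) ** 2 for v in band2) / (n - 1)
    cov = sum((u - mean1) * (v - mean2) for u, v in zip(band1, band2)) / (n - 1)
    return var1, cov, var2


def explained_variance(band1, band2, standardized=False):
    """Percent of variance explained by each of the two components."""
    var1, cov, var2 = covariance_terms(band1, band2)
    if standardized:
        # Use the correlation matrix instead of the variance-covariance matrix.
        r = cov / math.sqrt(var1 * var2)
        var1, cov, var2 = 1.0, r, 1.0
    # Eigenvalues of the symmetric matrix [[var1, cov], [cov, var2]].
    centre = (var1 + var2) / 2.0
    half_gap = math.sqrt(((var1 - var2) / 2.0) ** 2 + cov ** 2)
    eigenvalues = [centre + half_gap, centre - half_gap]
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


# Invented values: band_b has a far larger numeric range than band_a.
band_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
band_b = [300.0, 100.0, 500.0, 200.0, 600.0, 400.0]

pct_raw = explained_variance(band_a, band_b, standardized=False)
pct_std = explained_variance(band_a, band_b, standardized=True)
print("unstandardized:", [round(p, 2) for p in pct_raw])
print("standardized:  ", [round(p, 2) for p in pct_std])
```

Without standardization the first component is dominated almost entirely by the band with the larger variance; with standardization both bands contribute on an equal footing, so the split reflects only how strongly they are correlated.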
Python API
def principal_component_analysis(self, rasters: List[Raster], output_html_file: str, num_components: int = 2, standardized: bool = False) -> List[Raster]:
Print Geotiff Tags
Function name: print_geotiff_tags
This tool can be used to view the tags contained within a GeoTiff file. Viewing the tags of a GeoTiff file can be useful when trying to import the GeoTiff to different software environments. The user must specify the name of a GeoTiff file and the tag information will be output to the StdOut output stream (e.g. console). Note that tags that contain greater than 100 values will be truncated in the output. GeoKeys will also be interpreted as per the GeoTIFF specification.
Python API
def print_geotiff_tags(self, file_name: str) :
Quantiles
Function name: quantiles
This tool transforms values in an input raster (input) into quantiles. In statistics, quantiles are cut points dividing the range of a probability distribution into continuous intervals with equal probabilities, or dividing the observations in a sample in the same way. There is one fewer quantile than the number of groups created. Thus quartiles are the three cut points that will divide a dataset into four equal-sized groups. Common quantiles have special names: for instance quartiles (4-quantiles), quintiles (5-quantiles), deciles (10-quantiles), and percentiles (100-quantiles).
The user must specify the desired number of quantiles, q (num_quantiles), in the output raster (output). The output raster will contain q equal-sized groups with values 1 to q, indicating which quantile group each grid cell belongs to.
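The mapping from cell values to quantile groups can be sketched with a simple rank-based assignment. This is an illustration in plain Python, not the tool's algorithm: `quantile_groups` is a hypothetical helper, the values are invented, and the way the tool handles ties and NoData cells is not covered here.

```python
def quantile_groups(values, num_quantiles=5):
    """Assign each value a group number from 1 to num_quantiles by rank."""
    n = len(values)
    # Indices of the values in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for rank, index in enumerate(order):
        # Ranks 0..n-1 are split into num_quantiles equal-sized bins.
        groups[index] = rank * num_quantiles // n + 1
    return groups


cell_values = [55, 10, 90, 30, 70, 20, 100, 40, 80, 60]
groups = quantile_groups(cell_values, num_quantiles=5)
print(groups)
```

With ten values and five quantile groups, each group receives exactly two cells, and the lowest two values (10 and 20) fall in group 1 while the highest two (90 and 100) fall in group 5.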
See Also
histogram_equalization
Python API
def quantiles(self, raster: Raster, num_quantiles: int = 5) -> Raster:
Radial Basis Function Interpolation
Function name: radial_basis_function_interpolation
This tool interpolates vector points into a raster surface using a radial basis function (RBF) scheme.
Python API
def radial_basis_function_interpolation(self, points: Vector, field_name: str = "FID", use_z: bool = False, radius: float = 0.0, min_points: int = 0, cell_size: float = 0.0, base_raster: Raster = None, func_type: str = "thinplatespline", poly_order: str = "none", weight: float = 0.1) -> Raster:
Radius Of Gyration
Function name: radius_of_gyration
This tool can be used to calculate the radius of gyration (RoG) for the polygon features within a raster image. RoG measures how far across the landscape a polygon extends its reach on average, given by the mean distance between cells in a patch (McGarigal et al. 2002). The radius of gyration can be considered a measure of the average distance an organism can move within a patch before encountering the patch boundary from a random starting point (McGarigal et al. 2002). The input raster grid should contain polygons with unique identifiers greater than zero. The user must also specify the name of the output raster file (where the radius of gyration will be assigned to each feature in the input file) and whether to output text data.
Python API
def radius_of_gyration(self, raster: Raster) -> Tuple[Raster, str]:
Random Field
Function name: random_field
This tool can be used to create a raster image filled with random values drawn from a standard normal distribution. The values range from approximately -4.0 to 4.0, with a mean of 0 and a standard deviation of 1.0. The dimensions and georeferencing of the output random field (output) are based on an existing, user-specified raster grid (base). Note that the output field will not possess any spatial autocorrelation. If spatially autocorrelated random fields are desired, the turning_bands_simulation tool is more appropriate, or alternatively, the fast_almost_gaussian_filter tool may be used to force spatial autocorrelation onto the distribution of the random_field tool.
See Also
turning_bands_simulation, fast_almost_gaussian_filter
Python API
def random_field(self, base_raster: Raster = None) -> Raster:
Random Forest Classification Fit
Function name: random_forest_classification_fit
Description
This tool performs a supervised random forest (RF) classification using multiple predictor rasters (inputs), or features, and training data (training). It can be used to model the spatial distribution of class data, such as land-cover type, soil class, or vegetation type. The training data take the form of an input vector Shapefile containing a set of points or polygons, for which the known class information is contained within a field (class_field_name) of the attribute table. Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space. Random forest is an ensemble learning method that works by creating a large number (n_trees) of decision trees and using a majority vote to determine estimated class values. Individual trees are created using a random sub-set of predictors. This ensemble approach overcomes the tendency of individual decision trees to overfit the training data. As such, the RF method is a widely and successfully applied machine-learning method in many domains.
Note that this function is part of a set of two tools, including random_forest_classification_fit and random_forest_classification_predict. The random_forest_classification_fit tool should be used first to create the RF model and the random_forest_classification_predict tool can then be used to apply that model for prediction. The output of the fit tool is a byte array that is a binary representation of the RF model. This model can then be used as the input to the predict tool, along with a list of input raster predictors, which must be in the same order as those used in the fit tool. The output of the predict tool is a classified raster. The reason that the RF workflow is split in this way is that often it is the case that you need to experiment with various input predictor sets and parameter values to create an adequate model. There is no need to generate an output classified raster during this experimentation stage, and because prediction can often be the slowest part of the RF modelling process, it is generally only performed after the final model has been identified. The binary representation of the RF-based model can be serialized (i.e., saved to a file) and then later read back into memory to serve as the input for the prediction step of the workflow (see code example below).
Also note that this tool is for RF-based classification. There is a similar set of fit and predict tools available for performing RF-based regression, including random_forest_regression_fit and random_forest_regression_predict. These tools are more appropriately applied to the modelling of continuous data, rather than categorical data.
The user must specify the splitting criteria (split_criterion) used in training the decision trees. Options for this parameter include 'Gini', 'Entropy', and 'ClassificationError'. The model can also be adjusted based on each of the number of trees (n_trees), the minimum number of samples required to be at a leaf node (min_samples_leaf), and the minimum number of samples required to split an internal node (min_samples_split) parameters.
The tool splits the training data into two sets, one for training the classifier and one for testing the model. These test data are used to calculate the overall accuracy and Cohen's kappa index of agreement, as well as to estimate the variable importance. The test_proportion parameter is used to set the proportion of the input training data used in model testing. For example, if test_proportion = 0.2, 20% of the training data will be set aside for testing, and this subset will be selected randomly. As a result of this random selection of test data, and the random selection of features used in decision tree creation, the tool is inherently stochastic, and will result in a different model each time it is run.
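The effect of test_proportion can be pictured as a random hold-out split. The sketch below uses only the Python standard library; `split_training_data` is a hypothetical helper and the integer sample IDs are invented. How the tool actually draws its test subset internally is not documented here, so treat this as an analogy rather than a description of its implementation.

```python
import random


def split_training_data(samples, test_proportion=0.2, seed=None):
    """Randomly split samples into (training, testing) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_proportion)
    return shuffled[n_test:], shuffled[:n_test]


# 100 invented training sample IDs.
samples = list(range(100))

train, test = split_training_data(samples, test_proportion=0.2)
print(len(train), "samples for training,", len(test), "held out for testing")
```

Because no seed is fixed in this call, each run selects a different test subset, mirroring the stochastic behaviour described above; reported accuracy and kappa values should therefore be expected to vary slightly between runs.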
Like all supervised classification methods, this technique relies heavily on proper selection of training data. Training sites are exemplar areas/points of known and representative class value (e.g. land cover type). The algorithm determines the feature signatures of the pixels within each training area. In selecting training sites, care should be taken to ensure that they cover the full range of variability within each class. Otherwise the classification accuracy will be impacted. If possible, multiple training sites should be selected for each class. It is also advisable to avoid areas near the edges of class objects (e.g. land-cover patches), where mixed pixels may impact the purity of training site values.
After selecting training sites, the feature value distributions of each class type can be assessed using the evaluate_training_sites tool. In particular, the distribution of class values should ideally be non-overlapping in at least one feature dimension.
RF, like decision trees, does not require feature scaling. That is, unlike the k-NN algorithm and other methods that are based on the calculation of distances in multi-dimensional space, there is no need to rescale the predictors onto a common scale prior to RF analysis. Because individual trees do not use the full set of predictors, RF is also more robust against the curse of dimensionality than many other machine learning methods. Nonetheless, there is still debate about whether or not it is advisable to use a large number of predictors with RF analysis and it may be better to exclude predictors that are highly correlated with others, or that do not contribute significantly to the model during the model-building phase. A dimension reduction technique such as principal_component_analysis can be used to transform the features into a smaller set of uncorrelated predictors.
Example Code
```python
import os
from whitebox_workflows import WbEnvironment

license_id = 'floating-license-id'
wbe = WbEnvironment(license_id)

try:
    wbe.verbose = True
    wbe.working_directory = "/path/to/data"

    # Read the input raster files into memory
    images = wbe.read_rasters(
        'LC09_L1TP_018030_20220614_20220615_02_T1_B2.TIF',
        'LC09_L1TP_018030_20220614_20220615_02_T1_B3.TIF',
        'LC09_L1TP_018030_20220614_20220615_02_T1_B4.TIF',
        'LC09_L1TP_018030_20220614_20220615_02_T1_B5.TIF'
    )

    # Read the input training polygons into memory
    training_data = wbe.read_vector('training_data.shp')

    # Train the model
    model = wbe.random_forest_classification_fit(
        images,
        training_data,
        class_field_name = 'CLASS',
        split_criterion = "Gini",
        n_trees = 50,
        min_samples_leaf = 1,
        min_samples_split = 2,
        test_proportion = 0.2
    )

    # Example of how to serialize the model, i.e., save the model, which is just binary data
    print('Saving the model to file...')
    file_path = os.path.join(wbe.working_directory, "rf_model.bin")
    with open(file_path, "wb") as file:
        file.write(bytearray(model))

    # Example of how to deserialize the model, i.e. read the model
    model = []
    with open(file_path, mode='rb') as file:
        model = list(file.read())

    # Use the model to predict
    rf_class_image = wbe.random_forest_classification_predict(images, model)
    wbe.write_raster(rf_class_image, 'rf_classification.tif', compress=True)

    print('All done!')
except Exception as e:
    print("The error raised is: ", e)
finally:
    wbe.check_in_license(license_id)
```
See Also
random_forest_classification_predict, random_forest_regression_fit, random_forest_regression_predict, knn_classification, svm_classification, parallelepiped_classification, evaluate_training_sites
Python API
def random_forest_classification_fit(self, input_rasters: List[Raster], training_data: Vector, class_field_name: str, split_criterion: str = "gini", n_trees: int = 500, min_samples_leaf: int = 1, min_samples_split: int = 2, test_proportion: float = 0.2) -> List[int]:
Random Forest Classification Predict
Function name: random_forest_classification_predict
Note this tool is part of a WhiteboxTools extension product. Please visit Whitebox Geospatial Inc. for information about purchasing a license activation key (https://www.whiteboxgeo.com/extension-pricing/).
This tool applies a pre-built random forest (RF) classification model trained using multiple predictor rasters (input_rasters), or features, and training data to predict a spatial distribution. This function is part of a set of two tools, random_forest_classification_fit and random_forest_classification_predict. The random_forest_classification_fit tool should be used first to create the RF model, and random_forest_classification_predict can then be used to apply that model for prediction. The output of the fit tool is a byte array that is a binary representation of the RF model. This model can then be used as the input to the predict tool, along with a list of input raster predictors, which must be in the same order as those used in the fit tool (see below). The output of the predict tool is a classified raster. The RF workflow is split in this way because you often need to experiment with various input predictor sets and parameter values to create an adequate model. There is no need to generate an output classified raster during this experimentation stage, and because prediction can often be the slowest part of the RF modelling process, it is generally only performed after the final model has been identified. The binary representation of the RF-based model can be serialized (i.e., saved to a file) and then later read back into memory to serve as the input for the prediction step of the workflow (see code example below).
Note: it is very important that the order of feature rasters is the same for both fitting the model and using the model for prediction. It is possible to use a model fitted to one data set to make predictions for another data set; however, the set of feature rasters specified to the prediction tool must be input in the same sequence used for building the model. For example, one may train an RF classifier on one set of multi-spectral satellite imagery and then apply that model to classify a different imagery scene, but the image band sequence must be the same for the fit and predict tools; otherwise inaccurate predictions will result.
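One way to guard against an accidental change in predictor order is to store the band sequence next to the serialized model and compare it before predicting. The sketch below is only a suggestion: the sidecar file convention and both helper functions are hypothetical and are not part of Whitebox Workflows.

```python
import json
import os
import tempfile

def save_predictor_order(model_path, band_labels):
    """Write the ordered band labels to a JSON sidecar next to the model file."""
    sidecar_path = model_path + ".predictors.json"
    with open(sidecar_path, "w") as f:
        json.dump(list(band_labels), f)
    return sidecar_path

def check_predictor_order(model_path, band_labels):
    """Raise ValueError if the labels differ in name or order from those used in fitting."""
    with open(model_path + ".predictors.json") as f:
        expected = json.load(f)
    if list(band_labels) != expected:
        raise ValueError(f"Predictor order mismatch: expected {expected}, got {list(band_labels)}")

# Usage: record the order at fit time, verify it before prediction
mismatch_detected = False
with tempfile.TemporaryDirectory() as folder:
    model_path = os.path.join(folder, "rf_model.bin")
    save_predictor_order(model_path, ["B2", "B3", "B4", "B5"])
    check_predictor_order(model_path, ["B2", "B3", "B4", "B5"])  # same order, passes silently
    try:
        check_predictor_order(model_path, ["B3", "B2", "B4", "B5"])  # swapped bands
    except ValueError as err:
        mismatch_detected = True
        print(err)
```

Using band labels rather than file names lets the same check work when the model is applied to a different scene whose files are named differently.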
Example Code
```python
import os
from whitebox_workflows import WbEnvironment

license_id = 'floating-license-id'
wbe = WbEnvironment(license_id)

try:
    wbe.verbose = True
    wbe.working_directory = "/path/to/data"

    # Read the input raster files into memory
    images = wbe.read_rasters(
        'LC09_L1TP_018030_20220614_20220615_02_T1_B2.TIF',
        'LC09_L1TP_018030_20220614_20220615_02_T1_B3.TIF',
        'LC09_L1TP_018030_20220614_20220615_02_T1_B4.TIF',
        'LC09_L1TP_018030_20220614_20220615_02_T1_B5.TIF'
    )

    # Read the input training polygons into memory
    training_data = wbe.read_vector('training_data.shp')

    # Train the model
    model = wbe.random_forest_classification_fit(
        images,
        training_data,
        class_field_name = 'CLASS',
        split_criterion = "Gini",
        n_trees = 50,
        min_samples_leaf = 1,
        min_samples_split = 2,
        test_proportion = 0.2
    )

    # Example of how to serialize the model, i.e., save the model, which is just binary data
    print('Saving the model to file...')
    file_path = os.path.join(wbe.working_directory, "rf_model.bin")
    with open(file_path, "wb") as file:
        file.write(bytearray(model))

    # Example of how to deserialize the model, i.e. read the model
    model = []
    with open(file_path, mode='rb') as file:
        model = list(file.read())

    # Use the model to predict
    rf_class_image = wbe.random_forest_classification_predict(images, model)
    wbe.write_raster(rf_class_image, 'rf_classification.tif', compress=True)

    print('All done!')
except Exception as e:
    print("The error raised is: ", e)
finally:
    wbe.check_in_license(license_id)
```
See Also
random_forest_classification_fit, random_forest_regression_fit, random_forest_regression_predict, knn_classification, svm_classification, parallelepiped_classification, evaluate_training_sites
Python API
def random_forest_classification_predict(self, input_rasters: List[Raster], model_bytes: List[int]) -> Raster:
Random Forest Regression Fit
Function name: random_forest_regression_fit
Description
This function performs a supervised random forest (RF) regression analysis using multiple predictor rasters (input_rasters), or features, and training data (training_data). The training data take the form of an input vector Shapefile containing a set of points, for which the known outcome information is contained within a field (field_name) of the attribute table. Each grid cell defines a stack of feature values (one value for each input raster), which serves as a point within the multi-dimensional feature space.
Note that this function is part of a set of two tools, random_forest_regression_fit and random_forest_regression_predict. The random_forest_regression_fit tool should be used first to create the RF model, and random_forest_regression_predict can then be used to apply that model for prediction. The output of the fit tool is a byte array that is a binary representation of the RF model. This model can then be used as the input to the predict tool, along with a list of input raster predictors, which must be in the same order as those used in the fit tool. The output of the predict tool is a continuous raster. The RF workflow is split in this way because you often need to experiment with various input predictor sets and parameter values to create an adequate model. There is no need to generate an output raster during this experimentation stage. Because prediction can often be the slowest part of the RF modelling process, it is generally only performed after the final model has been identified. The binary representation of the RF-based model can be serialized (i.e., saved to a file) and then later read back into memory to serve as the input for the prediction step of the workflow (see code example below).
Also note that this tool is for RF-based regression analysis. There is a similar pair of fit and predict tools available for performing RF-based classification, random_forest_classification_fit and random_forest_classification_predict. These tools are more appropriately applied to the modelling of categorical data, rather than continuous data.
Note: it is very important that the order of feature rasters is the same for both fitting the model and using the model for prediction. It is possible to use a model fitted to one data set to make predictions for another data set; however, the set of feature rasters specified to the prediction tool must be input in the same sequence used for building the model. For example, one may train an RF regressor on one set of land-surface parameters and then apply that model to predict the spatial distribution of a soil property on a land-surface parameter stack derived for a different landscape, but the raster sequence must be the same for the fit and predict tools; otherwise inaccurate predictions will result.
Random forest is an ensemble learning method that works by creating a large number (n_trees) of decision trees and averaging the predictions of the individual trees to determine estimated outcome values. Individual trees are created using a random subset of predictors. This ensemble approach overcomes the tendency of individual decision trees to overfit the training data. As such, the RF method is a widely and successfully applied machine-learning method in many domains.
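The averaging step can be pictured with a toy forest. In the sketch below each "tree" is a hand-written rule rather than a fitted decision tree, and the predictor names and values are invented for illustration; only the final averaging mirrors what is described above.

```python
from statistics import mean

def forest_predict(trees, features):
    """Average the predictions of all trees for one stack of feature values."""
    return mean(tree(features) for tree in trees)

# Three hypothetical trees, each using only a subset of the predictors
trees = [
    lambda x: 40.0 if x['slope'] < 5.0 else 25.0,
    lambda x: 38.0 if x['dev'] < 0.0 else 30.0,
    lambda x: 35.0 if x['slope'] < 8.0 else 20.0,
]

cell = {'slope': 6.0, 'dev': -0.4}  # feature values for one grid cell
print(forest_predict(trees, cell))
```

Because each tree sees different predictors and a different sample of the training data, their individual errors tend to cancel when averaged.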
Users must specify the number of trees (n_trees), the minimum number of samples required to be at a leaf node (min_samples_leaf), and the minimum number of samples required to split an internal node (min_samples_split) parameters, which determine the characteristics of the resulting model.
The function splits the training data into two sets, one for training the model and one for testing the prediction. These test data are used to calculate the regression accuracy statistics, as well as to estimate the variable importance. The test_proportion parameter is used to set the proportion of the input training data used in model testing. For example, if test_proportion = 0.2, 20% of the training data will be set aside for testing, and this subset will be selected randomly. As a result of this random selection of test data, as well as the randomness involved in establishing the individual decision trees, the tool is inherently stochastic, and will result in a different model each time it is run.
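The effect of test_proportion can be sketched as a random partition of the training samples. This is an illustration only: the function name is hypothetical, and the seed argument exists here purely to make the sketch repeatable; the tool's signature does not expose one.

```python
import random

def split_training_data(samples, test_proportion=0.2, seed=None):
    """Randomly partition samples into (training, testing) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_proportion)
    return shuffled[n_test:], shuffled[:n_test]

# 100 hypothetical training points identified by index
train, test = split_training_data(range(100), test_proportion=0.2, seed=1)
print(len(train), len(test))
```

Without a fixed seed, every run selects a different test subset, which is one reason repeated runs produce slightly different models and accuracy statistics.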
RF, like decision trees, does not require feature scaling. That is, unlike the k-NN algorithm and other methods that are based on the calculation of distances in multi-dimensional space, there is no need to rescale the predictors onto a common scale prior to RF analysis. Because individual trees do not use the full set of predictors, RF is also more robust against the curse of dimensionality than many other machine learning methods. Nonetheless, there is still debate about whether or not it is advisable to use a large number of predictors with RF analysis and it may be better to exclude predictors that are highly correlated with others, or that do not contribute significantly to the model during the model-building phase. A dimension reduction technique such as principal_component_analysis can be used to transform the features into a smaller set of uncorrelated predictors.
For a video tutorial on how to use the RandomForestRegression tool, see this YouTube video.
Code Example
```python
import os
from whitebox_workflows import WbEnvironment

license_id = 'floating-license-id'
wbe = WbEnvironment(license_id)

try:
    wbe.verbose = True
    wbe.working_directory = "/path/to/data"

    # Read the input raster files into memory
    images = wbe.read_rasters(
        'DEV.tif',
        'profile_curv.tif',
        'tan_curv.tif',
        'slope.tif'
    )

    # Read the input training points into memory
    training_data = wbe.read_vector('Ottawa_soils_data.shp')

    # Train the model
    model = wbe.random_forest_regression_fit(
        images,
        training_data,
        field_name = 'Sand',
        n_trees = 50,
        min_samples_leaf = 1,
        min_samples_split = 2,
        test_proportion = 0.2
    )

    # Example of how to serialize the model, i.e., save the model, which is just binary data
    print('Saving the model to file...')
    file_path = os.path.join(wbe.working_directory, "rf_model.bin")
    with open(file_path, "wb") as file:
        file.write(bytearray(model))

    # Example of how to deserialize the model, i.e. read the model
    model = []
    with open(file_path, mode='rb') as file:
        model = list(file.read())

    # Use the model to predict
    rf_image = wbe.random_forest_regression_predict(images, model)
    wbe.write_raster(rf_image, 'rf_regression.tif', compress=True)

    print('All done!')
except Exception as e:
    print("The error raised is: ", e)
finally:
    wbe.check_in_license(license_id)
```
See Also
random_forest_regression_predict, random_forest_classification_fit, random_forest_classification_predict, knn_classification, svm_classification, parallelepiped_classification, evaluate_training_sites
Python API
def random_forest_regression_fit(self, input_rasters: List[Raster], training_data: Vector, field_name: str, n_trees: int = 500, min_samples_leaf: int = 1, min_samples_split: int = 2, test_proportion: float = 0.2) -> List[int]:
Random Forest Regression Predict
Function name: random_forest_regression_predict
Note this tool is part of a WhiteboxTools extension product. Please visit Whitebox Geospatial Inc. for information about purchasing a license activation key (https://www.whiteboxgeo.com/extension-pricing/).
This tool applies a pre-built random forest (RF) regression model trained using multiple predictor rasters, or features (input_rasters), and training data to predict a spatial distribution. This function is part of a set of two tools, random_forest_regression_fit and random_forest_regression_predict. The random_forest_regression_fit function should be used first to create the RF model, and random_forest_regression_predict can then be used to apply that model for prediction. The output of the fit tool is a byte array that is a binary representation of the RF model. This model can then be used as the input to the predict tool, along with a list of input raster predictors, which must be in the same order as those used in the fit tool (see below). The output of the predict tool is a continuous raster. The RF workflow is split in this way because you often need to experiment with various input predictor sets and parameter values to create an adequate model. There is no need to generate an output raster during this experimentation stage, and because prediction can often be the slowest part of the RF modelling process, it is generally only performed after the final model has been identified. The binary representation of the RF-based model can be serialized (i.e., saved to a file) and then later read back into memory to serve as the input for the prediction step of the workflow (see code example below).
Note: it is very important that the order of feature rasters is the same for both fitting the model and using the model for prediction. It is possible to use a model fitted to one data set to make predictions for another data set; however, the set of feature rasters specified to the prediction tool must be input in the same sequence used for building the model. For example, one may train an RF regressor on one set of land-surface parameters and then apply that model to predict the spatial distribution of a soil property on a land-surface parameter stack derived for a different landscape, but the raster sequence must be the same for the fit and predict tools; otherwise inaccurate predictions will result.
Code Example
```python
import os
from whitebox_workflows import WbEnvironment

license_id = 'floating-license-id'
wbe = WbEnvironment(license_id)

try:
    wbe.verbose = True
    wbe.working_directory = "/path/to/data"

    # Read the input raster files into memory
    images = wbe.read_rasters(
        'DEV.tif',
        'profile_curv.tif',
        'tan_curv.tif',
        'slope.tif'
    )

    # Read the input training points into memory
    training_data = wbe.read_vector('Ottawa_soils_data.shp')

    # Train the model
    model = wbe.random_forest_regression_fit(
        images,
        training_data,
        field_name = 'Sand',
        n_trees = 50,
        min_samples_leaf = 1,
        min_samples_split = 2,
        test_proportion = 0.2
    )

    # Example of how to serialize the model, i.e., save the model, which is just binary data
    print('Saving the model to file...')
    file_path = os.path.join(wbe.working_directory, "rf_model.bin")
    with open(file_path, "wb") as file:
        file.write(bytearray(model))

    # Example of how to deserialize the model, i.e. read the model
    model = []
    with open(file_path, mode='rb') as file:
        model = list(file.read())

    # Use the model to predict
    rf_image = wbe.random_forest_regression_predict(images, model)
    wbe.write_raster(rf_image, 'rf_regression.tif', compress=True)

    print('All done!')
except Exception as e:
    print("The error raised is: ", e)
finally:
    wbe.check_in_license(license_id)
```
See Also
random_forest_regression_fit, random_forest_classification_fit, random_forest_classification_predict, knn_classification, svm_classification, parallelepiped_classification, evaluate_training_sites
Python API
def random_forest_regression_predict(self, input_rasters: List[Raster], model_bytes: List[int]) -> Raster:
Random Sample
Function name: random_sample
This tool can be used to create a random sample of grid cells. The user specifies the base raster file, which is used to determine the grid dimensions and georeference information for the output raster, and the number of random samples (num_samples). The output grid will contain num_samples non-zero grid cells, randomly distributed throughout the raster grid, and a background value of zero. This tool is useful when performing statistical analyses on raster images when you wish to obtain a random sample of data.
Only valid, non-nodata, cells in the base raster will be sampled.
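The sampling behaviour described above can be sketched on a small nested-list grid. The function name, the seed argument, and the choice to label each sampled cell with a sequential identifier are assumptions for illustration; the manual only states that sampled cells are non-zero.

```python
import random

def random_sample_grid(base, nodata, num_samples, seed=None):
    """Return a grid of zeros with num_samples randomly chosen valid cells set non-zero."""
    rng = random.Random(seed)
    # Candidate cells are those holding valid (non-NoData) values in the base raster
    valid = [(r, c) for r, row in enumerate(base)
             for c, value in enumerate(row) if value != nodata]
    chosen = rng.sample(valid, min(num_samples, len(valid)))
    out = [[0] * len(row) for row in base]
    for sample_id, (r, c) in enumerate(chosen, start=1):
        out[r][c] = sample_id  # assumption: a unique, sequential sample identifier
    return out

base = [
    [1.0, 2.0, -9999.0],
    [4.0, -9999.0, 6.0],
    [7.0, 8.0, 9.0],
]
sample = random_sample_grid(base, nodata=-9999.0, num_samples=3, seed=42)
for row in sample:
    print(row)
```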
Python API
def random_sample(self, base_raster: Raster = None, num_samples: int = 1000) -> Raster:
Raster Area
Function name: raster_area
This tool estimates the area of each category, polygon, or patch in an input raster. The input raster must be categorical in data scale. Rasters with floating-point cell values are not good candidates for an area analysis. The user must specify whether the output is given in grid cells or map units (units). Map units are physical units, e.g. if the raster's scale is in metres, areas will be reported in square metres. Note that square metres can be converted into hectares by dividing by 10,000 and into square kilometres by dividing by 1,000,000. If the input raster is in geographic coordinates (i.e. latitude and longitude), a warning will be issued and areas will be estimated based on per-row calculated degree lengths.
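For a projected raster, the area calculation amounts to counting the cells in each category and multiplying by the area of one cell. The sketch below shows that arithmetic on a small nested-list grid; the function name and the sample data are hypothetical, and the geographic-coordinate case is not covered.

```python
from collections import Counter

def class_areas(grid, cell_size_x, cell_size_y, nodata=None):
    """Return {class value: area in squared map units} for a categorical grid."""
    counts = Counter(value for row in grid for value in row if value != nodata)
    cell_area = cell_size_x * cell_size_y
    return {cls: n * cell_area for cls, n in counts.items()}

# A 3 x 3 categorical raster with 30 m x 30 m cells
grid = [
    [1, 1, 2],
    [1, 2, 2],
    [1, 3, 3],
]
areas_m2 = class_areas(grid, 30.0, 30.0)
areas_ha = {cls: area / 10_000 for cls, area in areas_m2.items()}
print(areas_m2)
print(areas_ha)
```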
The tool can be run with a raster output (output), a text output (out_text), or both. If neither output is specified, the tool will automatically output a raster named area.tif.
Zero values in the input raster may be excluded from the area analysis if the zero_back flag is used.
To calculate the area of vector polygons, use the polygon_area tool instead.
See Also
polygon_area, raster_histogram
Python API
def raster_area(self, raster: Raster, units: str = "map units", zero_background: bool = False) -> Tuple[Raster, str]:
Raster Calculator
Function name: raster_calculator
The raster_calculator tool can be used to perform complex mathematical operations on one or more input raster images on a cell-to-cell basis. The user inputs an expression and a list of input rasters (input_rasters), specified in the same order as the rasters contained within the statement. Rasters are treated like variables (that change value with each grid cell) and are specified within the statement as arbitrarily named variables contained within either double or single quotation marks (e.g. "DEM" > 500.0). The order of raster variables must match the order of rasters within the input_rasters list. Note that all input rasters must share the same number of rows and columns and spatial extent. If this is not the case, use the resample tool to convert one raster's grid resolution to match the others.
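The cell-to-cell evaluation described above can be pictured with a small sketch in which each raster acts as a variable that takes a new value at every grid cell. This is not the tool's expression parser; the function name, the nested-list raster representation, and the NODATA constant are assumptions made for illustration.

```python
NODATA = -32768.0  # hypothetical NoData value used by this sketch

def evaluate_cellwise(func, *rasters):
    """Apply func to the co-located cells of equally sized rasters."""
    output = []
    for rows in zip(*rasters):          # the same row from every raster
        out_row = []
        for cells in zip(*rows):        # the same cell from every raster
            if NODATA in cells:
                out_row.append(NODATA)  # a NoData input yields a NoData output
            else:
                out_row.append(func(*cells))
        output.append(out_row)
    return output

# Two 2 x 2 rasters, passed in the order in which the expression expects them
nir = [[0.5, 0.6], [NODATA, 0.4]]
red = [[0.1, 0.2], [0.1, 0.2]]
ndvi = evaluate_cellwise(lambda n, r: (n - r) / (n + r), nir, red)
print(ndvi)
```

Swapping the two rasters in the call would silently compute a different quantity, which is why the order of the input list must match the order of the variables in the expression.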
Example
(band3, band4) = wbe.read_rasters('band3.tif', 'band4.tif')
result = wbe.raster_calculator("('nir' - 'red') / ('nir' + 'red')", [band4, band3])
wbe.write_raster(result, 'result.tif', True)

The mathematical expression supports all of the standard algebraic unary and binary operators (+ - * / ^ %), as well as comparisons (< <= == != >= >) and logical operators (&& ||) with short-circuit support. The order of operations, from highest to lowest, is as follows.

Listed in order of precedence (symbol, then description):
- ^ : Exponentiation (highest precedence)
- % : Modulo
- / : Division
- * : Multiplication
- - : Subtraction
- + : Addition
- == != < > <= >= : Comparisons (all have equal precedence)
- && and : Logical AND with short-circuit
- || or : Logical OR with short-circuit (lowest precedence)
Several common mathematical functions are also available for use in the input statement. For example:
- log(base=10, val) -- Logarithm with optional 'base' as first argument. If not provided, 'base' defaults to '10'. Example: log(100) + log(e(), 100)
- e() -- Euler's number (2.718281828459045)
- pi() -- π (3.141592653589793)
- int(val)
- ceil(val)
- floor(val)
- round(modulus=1, val) -- Round with optional 'modulus' as first argument. Example: round(1.23456) == 1 && round(0.001, 1.23456) == 1.235
- abs(val)
- sign(val)
- min(val, ...) -- Example: min(1, -2, 3, -4) == -4
- max(val, ...) -- Example: max(1, -2, 3, -4) == 3
- sin(radians), asin(val)
- cos(radians), acos(val)
- tan(radians), atan(val)
- sinh(val), asinh(val)
- cosh(val), acosh(val)
- tanh(val), atanh(val)

Notice that the constants pi and e must be specified as functions, pi() and e(). A number of global variables are also available to build conditional statements. These include the following:
Special Variable Names For Use In Conditional Statements:

| Name | Description |
| --- | --- |
| nodata | An input raster's NoData value. |
| null | Same as nodata. |
| minvalue | An input raster's minimum value. |
| maxvalue | An input raster's maximum value. |
| rows | The input raster's number of rows. |
| columns | The input raster's number of columns. |
| row | The grid cell's row number. |
| column | The grid cell's column number. |
| rowy | The row's y-coordinate. |
| columnx | The column's x-coordinate. |
| north | The input raster's northern coordinate. |
| south | The input raster's southern coordinate. |
| east | The input raster's eastern coordinate. |
| west | The input raster's western coordinate. |
| cellsizex | The input raster's grid resolution in the x-direction. |
| cellsizey | The input raster's grid resolution in the y-direction. |
| cellsize | The input raster's average grid resolution. |
The special variable names are case-sensitive. If more than one raster input is used in the statement, the functional forms of the nodata, null, minvalue, and maxvalue variables should be used, e.g. nodata("InputRaster"); otherwise the value is assumed to specify the attribute of the first raster in the statement. The following are examples of valid statements:
- "raster" != 300.0
- "raster" >= (minvalue + 35.0)
- ("raster1" >= 25.0) && ("raster2" <= 75.0) -- Evaluates to 1 where both conditions are true.
- tan("raster" * pi() / 180.0) > 1.0
- "raster" == nodata

Any grid cell in the input rasters containing the NoData value will be assigned NoData in the output raster, unless a NoData grid cell value allows the statement to evaluate to True (i.e. the mathematical expression includes the nodata value).
See Also
ConditionalEvaluation
Python API
def raster_calculator(self, expression: str, input_rasters: List[Raster]) -> Raster:
Raster Cell Assignment
Function name: raster_cell_assignment
This tool can be used to create a new raster with the same coordinates and dimensions (i.e. rows and columns) as an existing base image. Grid cells in the new raster will be assigned either the row or column number or the x- or y-coordinate, depending on the selected option (assign flag). The user must also specify the name of the base image (input).
See Also
NewRasterFromBase
Python API
def raster_cell_assignment(self, raster: Raster, what_to_assign: str = "column") -> Raster:
Raster Histogram
Function name: raster_histogram
This tool produces a histogram (i.e. a frequency distribution graph) for the values contained within an input raster file (input). The histogram will be embedded within an output (output_html_file) HTML file, which should be automatically displayed after the tool has completed. The user may optionally specify the number of bins (num_bins) used in the histogram. If unspecified, this is calculated as:
num_bins = ((rows * columns)).log2().ceil() + 1
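The default bin count above can be reproduced in Python as follows. This is a direct transcription of the formula as written; the function name is hypothetical.

```python
import math

def default_num_bins(rows, columns):
    """Default histogram bin count: ceil(log2(rows * columns)) + 1."""
    return math.ceil(math.log2(rows * columns)) + 1

# A 1000 x 1000 raster has one million cells; log2(1e6) is about 19.93
print(default_num_bins(1000, 1000))
```

The bin count grows only logarithmically with raster size, so even very large rasters produce histograms with a few dozen bins.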
See Also
attribute_histogram
Python API
def raster_histogram(self, raster: Raster, output_html_file: str) -> None:
Raster Perimeter
Function name: raster_perimeter
This tool can be used to measure the length of the perimeter of polygon features in a raster layer. The user must specify the name of the input raster file (input) and optionally an output raster (output), which is the raster layer containing the input features assigned the perimeter length. The user may also optionally choose to output text data (out_text). Raster-based perimeter estimation uses the accurate, anti-aliasing algorithm of Prashker (2009).
The input file must be of a categorical data type, containing discrete polygon features that have been assigned unique identifiers. Such rasters are often created by region-grouping (clump) a classified raster.
Reference
Prashker, S. (2009) An anti-aliasing algorithm for calculating the perimeter of raster polygons. Geotec, Ottawa and Geomatics Atlantic, Wolfville, NS.
See Also
raster_area, clump
Python API
def raster_perimeter(self, raster: Raster, units: str = "map units", zero_background: bool = False) -> Tuple[Raster, str]:
Raster Summary Stats
Function name: raster_summary_stats
This tool outputs distribution summary statistics for input raster images (input). The distribution statistics include the raster minimum, maximum, range, total, mean, variance, and standard deviation. These summary statistics are output to the system stdout.
The following is an example of the summary report:
    *********************************
    * Welcome to RasterSummaryStats *
    *********************************
    Reading data...

    Number of non-nodata grid cells: 32083559
    Number of nodata grid cells: 3916441
    Image minimum: 390.266357421875
    Image maximum: 426.0322570800781
    Image range: 35.765899658203125
    Image total: 13030334843.332886
    Image average: 406.13745012929786
    Image variance: 31.370027239143383
    Image standard deviation: 5.600895217654351
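The reported statistics can be reproduced for a small list of cell values with the sketch below. The function name and sample data are hypothetical, and the variance is computed as the population variance (dividing by the number of valid cells), which is an assumption about the tool's behaviour rather than something the manual states.

```python
import math

def raster_summary(values, nodata):
    """Compute distribution statistics over the non-NoData cell values."""
    valid = [v for v in values if v != nodata]
    n = len(valid)
    total = math.fsum(valid)
    average = total / n
    # Assumption: population variance (divide by n, not n - 1)
    variance = math.fsum((v - average) ** 2 for v in valid) / n
    return {
        'valid_cells': n,
        'nodata_cells': len(values) - n,
        'minimum': min(valid),
        'maximum': max(valid),
        'range': max(valid) - min(valid),
        'total': total,
        'average': average,
        'variance': variance,
        'standard_deviation': math.sqrt(variance),
    }

cells = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0, -9999.0]
stats = raster_summary(cells, nodata=-9999.0)
for name, value in stats.items():
    print(f"{name}: {value}")
```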
See Also
raster_histogram, zonal_statistics
Python API
def raster_summary_stats(self, input: Raster) -> str:
Reciprocal
Function name: reciprocal
This tool creates a new raster (output) in which each grid cell is equal to one divided by the grid cell values in the input raster image (input). NoData values in the input image will be assigned NoData values in the output image.
Python API
def reciprocal(self, raster: Raster) -> Raster:
Rescale Value Range
Function name: rescale_value_range
Python API
def rescale_value_range(self, raster: Raster, out_min_val: float, out_max_val: float, clip_min: float = float('inf'), clip_max: float = float('-inf')) -> Raster:
Root Mean Square Error
Function name: root_mean_square_error
This tool calculates the root-mean-square-error (RMSE) or root-mean-square-difference (RMSD) from two input rasters. If the two input rasters possess the same number of rows and columns, the RMSE is calculated on a cell-by-cell basis; otherwise bilinear resampling is used. In addition to RMSE, the tool also reports other common accuracy statistics including the mean vertical error, the 95% confidence limit (RMSE x 1.96), and the 90% linear error (LE90), which is the 90th percentile of the residuals between two raster surfaces. The LE90 is the most robust of the reported accuracy statistics when the residuals are non-Gaussian. The LE90 requires sorting the residual values, which can be a relatively slow operation for larger rasters.
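The statistics listed above can be sketched for two equally sized lists of cell values. The function name and data are hypothetical, and two details are assumptions: LE90 is taken from the absolute residuals, and the percentile uses the nearest-rank method. The tool may differ on both points.

```python
import math

def accuracy_stats(surface, reference):
    """Return RMSE, mean error, 95% confidence limit, and LE90 for paired cell values."""
    residuals = [s - r for s, r in zip(surface, reference)]
    n = len(residuals)
    rmse = math.sqrt(math.fsum(d * d for d in residuals) / n)
    mean_error = math.fsum(residuals) / n
    # Assumption: nearest-rank 90th percentile of the absolute residuals
    abs_sorted = sorted(abs(d) for d in residuals)
    rank = (9 * n + 9) // 10  # integer form of ceil(0.9 * n)
    le90 = abs_sorted[rank - 1]
    return {
        'rmse': rmse,
        'mean_error': mean_error,
        'confidence_95': rmse * 1.96,
        'le90': le90,
    }

surface = [10.0, 12.0, 11.0, 13.0]
reference = [9.0, 12.0, 13.0, 13.0]
result = accuracy_stats(surface, reference)
print(result)
```

The sort needed for the percentile is the step that dominates run time on large rasters, as noted above.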
See Also
paired_sample_t_test, wilcoxon_signed_rank_test
Python API
def root_mean_square_error(self, input: Raster, reference: Raster) -> str:
Round
Function name: round
Experimental
Rounds each raster cell to the nearest integer.
raster math round
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply round transform to each non-nodata cell.
wbe.round(input='dem.tif', output='round_dem.tif')
Shape Complexity Index Raster
Function name: shape_complexity_index_raster
This tool calculates a type of shape complexity index for raster objects. The index is equal to the average number of intersections of the group of vertical and horizontal transects passing through an object. Simple objects will have a shape complexity index of 1.0, and more complex shapes, including those that contain numerous holes or are winding in shape, will have higher index values. Objects in the input raster (input) are designated by their unique identifiers. Identifier values should be positive, non-zero whole numbers.
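A rough sketch of the transect idea follows. It treats every row and column that touches the object as a transect and counts an "intersection" as one contiguous run of object cells along that transect; this interpretation, along with the function names, is an assumption and may not match the tool's exact algorithm.

```python
def count_runs(cells, object_id):
    """Count contiguous runs of object_id along one transect."""
    runs = 0
    inside = False
    for value in cells:
        if value == object_id and not inside:
            runs += 1
        inside = (value == object_id)
    return runs

def shape_complexity_index(grid, object_id):
    """Average number of runs over all row and column transects through the object."""
    transects = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    counts = [count_runs(t, object_id) for t in transects]
    counts = [c for c in counts if c > 0]  # keep transects that pass through the object
    return sum(counts) / len(counts)

solid = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
ring = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
print(shape_complexity_index(solid, 1))
print(shape_complexity_index(ring, 1))
```

The solid block yields 1.0 because every transect crosses it once, while the hole in the ring makes the middle row and column cross the object twice and raises the index.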
See Also
ShapeComplexityIndex, boundary_shape_complexity
Python API
def shape_complexity_index_raster(self, raster: Raster) -> Raster:
Sieve
Function name: sieve
The sieve function removes individual objects in a class map that are less than a threshold area, in grid cells. Pixels contained within the removed small polygons will be replaced with the nearest remaining class value. This operation is common when generalizing class maps, e.g. those derived from an image classification. Thus, this tool provides a similar function to the generalize_classified_raster and generalize_with_similarity functions.
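The two stages described above, finding small objects and filling them from their surroundings, can be sketched as follows. The sketch assumes 4-connected objects and measures "nearest" in grid steps by growing outward from the retained cells; the tool may use a different connectivity or distance measure, and the function name is hypothetical.

```python
from collections import deque

NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # assumption: 4-connectivity

def sieve_sketch(grid, threshold):
    """Replace objects smaller than threshold cells with the nearest remaining class."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [row[:] for row in grid]
    removed = set()
    # Stage 1: flood-fill each object and mark the small ones for removal
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            seen[r][c] = True
            members, queue = [(r, c)], deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in NEIGHBOURS:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] == grid[r][c]):
                        seen[nr][nc] = True
                        members.append((nr, nc))
                        queue.append((nr, nc))
            if len(members) < threshold:
                removed.update(members)
    # Stage 2: grow the retained classes outward into the removed cells
    queue = deque((r, c) for r in range(rows) for c in range(cols)
                  if (r, c) not in removed)
    while queue:
        cr, cc = queue.popleft()
        for dr, dc in NEIGHBOURS:
            cell = (cr + dr, cc + dc)
            if cell in removed:
                out[cell[0]][cell[1]] = out[cr][cc]
                removed.discard(cell)
                queue.append(cell)
    return out

class_map = [
    [1, 1, 1, 1],
    [1, 2, 1, 1],
    [1, 1, 1, 1],
    [3, 3, 3, 3],
]
for row in sieve_sketch(class_map, threshold=2):
    print(row)
```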
See Also:
generalize_classified_raster, generalize_with_similarity
Python API
def sieve(self, input_raster: Raster, threshold: int = 1, zero_background: bool = False) -> Raster:
Sin
Function name: sin
Experimental
Computes the sine of each raster cell value.
raster math sin
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply sin transform to each non-nodata cell.
wbe.sin(input='dem.tif', output='sin_dem.tif')
Sinh
Function name: sinh
Experimental
Computes the hyperbolic sine of each raster cell.
raster math sinh
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply sinh transform to each non-nodata cell.
wbe.sinh(input='dem.tif', output='sinh_dem.tif')
Sqrt
Function name: sqrt
Experimental
Computes the square-root of each raster cell.
raster math sqrt
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply sqrt transform to each non-nodata cell.
wbe.sqrt(input='dem.tif', output='sqrt_dem.tif')
Square
Function name: square
Experimental
Squares each raster cell value.
raster math square
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply square transform to each non-nodata cell.
wbe.square(input='dem.tif', output='square_dem.tif')
Tan
Function name: tan
Experimental
Computes the tangent of each raster cell value.
raster math tan
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply tan transform to each non-nodata cell.
wbe.tan(input='dem.tif', output='tan_dem.tif')
Tanh
Function name: tanh
Experimental
Computes the hyperbolic tangent of each raster cell.
raster math tanh
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply tanh transform to each non-nodata cell.
wbe.tanh(input='dem.tif', output='tanh_dem.tif')
TIN Interpolation
Function name: tin_interpolation
Creates a raster grid based on a triangular irregular network (TIN) fitted to vector points and linear interpolation within each triangular-shaped plane. The TIN creation algorithm is based on Delaunay triangulation.
The user must specify the attribute field containing point values (field). Alternatively, if the input Shapefile contains z-values, the interpolation may be based on these values (use_z). Either an output grid resolution (cell_size) must be specified or alternatively an existing base file (base) can be used to determine the output raster's (output) resolution and spatial extent. Natural neighbour interpolation generally produces a satisfactorily smooth surface within the region of data points but can produce spurious breaks in the surface outside of this region. Thus, it is recommended that the output surface be clipped to the convex hull of the input points (clip).
See Also
lidar_tin_gridding, construct_vector_tin, natural_neighbour_interpolation
Python API
def tin_interpolation(self, points: Vector, field_name: str = "FID", use_z: bool = False, cell_size: float = 0.0, base_raster: Raster = None, max_triangle_edge_length: float = float('inf')) -> Raster:
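The linear interpolation within each triangular plane can be sketched with barycentric weights. This is a minimal illustration for a single triangle, not the tool's implementation; the function name is hypothetical and the Delaunay triangulation step is assumed to have been done already.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate z at p=(x, y) inside triangle a, b, c (each (x, y, z)).

    Returns None when p lies outside the triangle or the triangle is degenerate.
    """
    (x, y), (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        return None  # the three vertices are collinear
    # Barycentric weights of p with respect to the three vertices.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None  # p is outside the triangle
    return w1 * z1 + w2 * z2 + w3 * z3

# Vertices lie on the plane z = 10 + x + 2y.
a, b, c = (0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)
z_inside = interpolate_in_triangle((2.0, 3.0), a, b, c)
z_outside = interpolate_in_triangle((10.0, 10.0), a, b, c)
```

A cell centre that falls in no triangle gets no value in this sketch, which mirrors the idea of limiting output to the area covered by the triangulation.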
To Degrees
Function name: to_degrees
Experimental
Converts each raster cell from radians to degrees.
raster math to_degrees
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply to_degrees transform to each non-nodata cell.
wbe.to_degrees(input='dem.tif', output='to_degrees_dem.tif')
To Radians
Function name: to_radians
Experimental
Converts each raster cell from degrees to radians.
raster math to_radians
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply to_radians transform to each non-nodata cell.
wbe.to_radians(input='dem.tif', output='to_radians_dem.tif')
Trend Surface
Function name: trend_surface
This tool can be used to interpolate a trend surface from a raster image. The technique uses a polynomial, least-squares regression analysis. The user must specify the name of the input raster file. In addition, the user must specify the polynomial order (1 to 10) for the analysis. A first-order polynomial is a planar surface with no curvature. As the polynomial order is increased, greater flexibility is allowed in the fitted surface. Although polynomial orders as high as 10 are accepted, numerical instability in the analysis often creates artifacts in trend surfaces of orders greater than 5. The operation will display a text report on completion, in addition to the output raster image. The report will list each of the coefficient values and the r-square value. Note that the entire raster image must be able to fit into computer memory, limiting the use of this tool to relatively small rasters. The Trend Surface (Vector Points) tool can be used instead if the input data is vector points contained in a shapefile.
Numerical stability is enhanced by transforming the x, y, z data by their minimum values before performing the regression analysis. These transform parameters are also reported in the output report.
Python API
def trend_surface(self, raster: Raster, output_html_file: str, polynomial_order: int = 1) -> Raster:
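As a rough illustration of the first-order case, the sketch below fits a plane by least squares after shifting x, y, and z by their minimum values, as described above. It is a pure-Python sketch with hypothetical helper names; the tool's own solver and report format are not shown in this manual.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_plane(points):
    """Fit z = c0 + c1*x + c2*y by least squares on min-shifted coordinates."""
    x0 = min(p[0] for p in points)
    y0 = min(p[1] for p in points)
    z0 = min(p[2] for p in points)
    rows = [(1.0, x - x0, y - y0) for x, y, _ in points]
    zs = [z - z0 for _, _, z in points]
    # Normal equations: (A^T A) c = A^T z
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve(ata, atz), (x0, y0, z0)

def predict(coeffs, shift, x, y):
    """Evaluate the fitted plane at (x, y) in the original coordinates."""
    x0, y0, z0 = shift
    return coeffs[0] + coeffs[1] * (x - x0) + coeffs[2] * (y - y0) + z0

# Sample cells on the plane z = 100 + 2*dx - 3*dy, with large map coordinates.
points = [(500000 + i, 4000000 + j, 100 + 2 * i - 3 * j)
          for i in range(4) for j in range(4)]
coeffs, shift = fit_plane(points)
z_hat = predict(coeffs, shift, 500001.5, 4000002.5)
```

Shifting by the minima keeps the sums in the normal equations small even when map coordinates are in the hundreds of thousands, which is the stability benefit the description refers to.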
Trend Surface Vector Points
Function name: trend_surface_vector_points
This tool can be used to interpolate a trend surface from a vector points file. The technique uses a polynomial, least-squares regression analysis. The user must specify the name of the input shapefile, which must be of a 'Points' base VectorGeometryType, and select the attribute in the shapefile's associated attribute table on which to base the trend surface analysis. The attribute must be numerical. In addition, the user must specify the polynomial order (1 to 10) for the analysis. A first-order polynomial is a planar surface with no curvature. As the polynomial order is increased, greater flexibility is allowed in the fitted surface. Although polynomial orders as high as 10 are accepted, numerical instability in the analysis often creates artifacts in trend surfaces of orders greater than 5. The operation will display a text report on completion, in addition to the output raster image. The report will list each of the coefficient values and the r-square value. The Trend Surface tool can be used instead if the input data is a raster image.
Numerical stability is enhanced by transforming the x, y, z data by their minimum values before performing the regression analysis. These transform parameters are also reported in the output report.
Python API
def trend_surface_vector_points(self, input: Vector, cell_size: float, output_html_file: str, field_name: str = "FID", polynomial_order: int = 1) -> Raster:
Truncate
Function name: truncate
Experimental
Truncates each raster cell value to its integer part.
raster math truncate
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input raster file path. | Required | input.tif |
| output | Optional output raster file path. If omitted, the result is stored in memory. | Optional | output.tif |
Examples
Apply truncate transform to each non-nodata cell.
wbe.truncate(input='dem.tif', output='truncate_dem.tif')
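Truncation differs from flooring for negative values. The short sketch below uses Python's standard math module to show the difference, assuming the tool truncates toward zero as "integer part" suggests; it does not call the Whitebox runtime.

```python
import math

values = [-2.7, -0.5, 0.5, 2.7]

# Truncation drops the fractional part, moving values toward zero.
truncated = [math.trunc(v) for v in values]

# Flooring always moves values toward negative infinity.
floored = [math.floor(v) for v in values]
```

The two results agree for positive cell values and differ by one for negative non-integer values.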
Turning Bands Simulation
Function name: turning_bands_simulation
This tool can be used to create a random field using the turning bands algorithm. The user must specify the name of a base raster image (base) from which the output raster will derive its geographical information, dimensions (rows and columns), and other information. In addition, the range (range), in x-y units, must be specified. The range determines the correlation length of the resulting field. For a good description of how the algorithm works, see Carr (2002). The turning bands method creates a number of 1-D simulations (called bands) and fuses these together to create a 2-D error field. There is no natural stopping condition in this process, so the user must specify the number of bands to create (iterations). The default value of 1000 iterations is reasonable. The fewer iterations used, the more prevalent the 1-D simulations will be in the output error image, effectively creating artifacts. Run time increases with the number of iterations.
Turning bands simulation is a commonly applied technique in Monte Carlo style simulations of uncertainty. As such, it is frequently run many times during a simulation (often 1000s of times). When this is the case, algorithm performance and efficiency are key considerations. One alternative method to efficiently generate spatially autocorrelated random fields is to apply the fast_almost_gaussian_filter tool to the output of the random_field tool. This can be used to generate a random field with the desired spatial characteristics and frequency distribution. This is the alternative approach used by the stochastic_depression_analysis tool.
Reference
Carr, J. R. (2002). Data visualization in the geosciences. Upper Saddle River, NJ: Prentice Hall. pp. 267.
See Also
random_field, fast_almost_gaussian_filter, stochastic_depression_analysis
Python API
def turning_bands_simulation(self, base_raster: Raster = None, range: float = 1.0, iterations: int = 1000) -> Raster:
Two Sample KS Test
Function name: two_sample_ks_test
This tool will perform a two-sample Kolmogorov-Smirnov (K-S) test to evaluate whether a significant statistical difference exists between the frequency distributions of two rasters. The null hypothesis is that both samples come from a population with the same distribution. Note that this test evaluates the two input rasters for differences in their overall distribution shape, with no assumption of normality. If there is need to compare the per-pixel differences between two input rasters, a paired-samples test such as the paired_sample_t_test or the non-parametric wilcoxon_signed_rank_test should be used instead.
The user must specify the names of the two input raster images (input1 and input2) and the output report HTML file (output). The test can be performed optionally on the entire image or on a random sub-sample of pixel values of a user-specified size (num_samples). In evaluating the significance of the test, it is important to keep in mind that given a sufficiently large sample, extremely small and non-notable differences can be found to be statistically significant. Furthermore, statistical significance says nothing about the practical significance of a difference.
See Also
KSTestForNormality, paired_sample_t_test, wilcoxon_signed_rank_test
Python API
def two_sample_ks_test(self, raster1: Raster, raster2: Raster, output_html_file: str, num_samples: int) -> None:
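The K-S statistic itself is the largest vertical gap between the two empirical cumulative distributions. The following is a minimal sketch on plain lists of sampled cell values; the function name is hypothetical, and the significance calculation in the tool's report is not reproduced.

```python
from bisect import bisect_right

def ks_statistic(sample1, sample2):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs."""
    s1, s2 = sorted(sample1), sorted(sample2)
    d = 0.0
    for v in s1 + s2:
        cdf1 = bisect_right(s1, v) / len(s1)  # share of sample1 values <= v
        cdf2 = bisect_right(s2, v) / len(s2)  # share of sample2 values <= v
        d = max(d, abs(cdf1 - cdf2))
    return d

same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
```

Because the statistic depends only on the pooled value distributions, it says nothing about where in the raster the differences occur, which is why a paired test is recommended for per-pixel comparisons.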
Wilcoxon Signed Rank Test
Function name: wilcoxon_signed_rank_test
This tool will perform a Wilcoxon signed-rank test to evaluate whether a significant statistical difference exists between the two rasters. The Wilcoxon signed-rank test is often used as a non-parametric equivalent to the paired-samples Student's t-test, and is used when the distribution of sample difference values between the paired inputs is non-Gaussian. The null hypothesis of this test is that the differences between the sample pairs follow a symmetric distribution around zero, i.e. that the median difference between pairs of observations is zero.
The user must specify the names of the two input raster images (input1 and input2) and the output report HTML file (output). The test can be performed optionally on the entire image or on a random sub-sample of pixel values of a user-specified size (num_samples). In evaluating the significance of the test, it is important to keep in mind that given a sufficiently large sample, extremely small and non-notable differences can be found to be statistically significant. Furthermore, statistical significance says nothing about the practical significance of a difference. Note that cells with a difference of zero are excluded from the ranking and tied difference values are assigned their average rank values.
See Also
paired_sample_t_test, two_sample_ks_test
Python API
def wilcoxon_signed_rank_test(self, raster1: Raster, raster2: Raster, output_html_file: str, num_samples: int) -> None:
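The ranking rules described above (zero differences dropped, ties given their average rank) can be sketched as follows. This is an illustrative pure-Python version with a hypothetical function name; it computes only the signed rank sums, not the p-value in the tool's report.

```python
def signed_rank_sums(values1, values2):
    """Return (W_plus, W_minus) for paired samples.

    Zero differences are dropped; tied absolute differences share their average rank.
    """
    diffs = [a - b for a, b in zip(values1, values2) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the block of equal absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus

# Differences: 0 (dropped), 2, -1, 3, -2
w_plus, w_minus = signed_rank_sums([10, 12, 9, 15, 8], [10, 10, 10, 12, 10])
```

With n non-zero differences the two sums always add up to n(n + 1)/2, which is a quick sanity check on any implementation.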
Z Scores
Function name: z_scores
This tool will transform the values in an input raster image (input) into z-scores. Z-scores are also called standard scores, normal scores, or z-values. A z-score is a dimensionless quantity that is calculated by subtracting the mean from an individual raw value and then dividing the difference by the standard deviation. This conversion process is called standardizing or normalizing and the result is sometimes referred to as a standardized variable. The mean and standard deviation are estimated using all values in the input image except for NoData values. The input image should not have a Boolean or categorical data scale, i.e. it should be on a continuous scale.
See Also
cumulative_distribution
Python API
def z_scores(self, raster: Raster) -> Raster:
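The standardization can be sketched on a flat list of cell values. The NoData marker of -9999 and the use of the population standard deviation are assumptions made for this example; the manual does not state which estimator the tool uses.

```python
from statistics import mean, pstdev

def z_scores(values, nodata=-9999.0):
    """Standardize values, leaving NoData cells untouched.

    Assumes at least two distinct valid values, so the standard deviation is non-zero.
    """
    valid = [v for v in values if v != nodata]
    mu, sigma = mean(valid), pstdev(valid)
    return [nodata if v == nodata else (v - mu) / sigma for v in values]

cells = [2, 4, 4, 4, 5, 5, 7, 9, -9999.0]
standardized = z_scores(cells)
```

The valid cells in this example have a mean of 5 and a standard deviation of 2, so a cell value of 9 maps to a z-score of 2.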
Zonal Statistics
Function name: zonal_statistics
This tool can be used to extract common descriptive statistics associated with the distribution of some underlying data raster based on feature units defined by a feature definition raster. For example, this tool can be used to measure the maximum or average slope gradient (data image) for each of a group of watersheds (feature definitions). Although the data raster can contain any type of data, the feature definition raster must be categorical, i.e. it must define area entities using integer values.
The stat parameter can take the values, 'mean', 'median', 'minimum', 'maximum', 'range', 'standard deviation', or 'total'.
If an output image name is specified, the tool will assign the descriptive statistic value to each of the spatial entities defined in the feature definition raster. If text output is selected, an HTML table will be output, which can then be readily copied into a spreadsheet program for further analysis. This is a very powerful and useful tool for creating numerical summary data from spatial data which can then be interrogated using statistical analyses. At least one output type (image or text) must be specified for the tool to operate.
NoData values in either of the two input images are ignored during the calculation of the descriptive statistic.
See Also
raster_summary_stats
Python API
def zonal_statistics(self, data_raster: Raster, feature_definitions_raster: Raster, stat_type: str = "mean") -> Tuple[Raster, str]:
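The grouping logic behind a zonal mean can be sketched with nested lists standing in for the two rasters. The function name and the NoData value of -9999 are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

def zonal_mean(data, zones, nodata=-9999):
    """Mean of data values per zone ID, skipping NoData in either grid."""
    groups = defaultdict(list)
    for data_row, zone_row in zip(data, zones):
        for value, zone in zip(data_row, zone_row):
            if value != nodata and zone != nodata:
                groups[zone].append(value)
    return {zone: mean(vals) for zone, vals in groups.items()}

slope = [
    [1.0, 2.0, -9999],
    [3.0, 4.0, 10.0],
]
watersheds = [
    [1, 1, 2],
    [1, 2, 2],
]
means = zonal_mean(slope, watersheds)
```

Swapping `mean` for `max`, `min`, or `sum` gives the other statistics in the same pattern.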
Vector Analysis
Vector analysis in WbW-QGIS covers geometry validation, overlay operations, attribute enrichment, spatial selection, proximity analysis, and spatial joining. Whitebox supplements the native QGIS vector toolbox with high-performance tools built on the wbtopology spatial index.
This chapter walks through a complete parcel-attribute enrichment workflow — a common task in land management and environmental assessment.
Key Concepts
- Geometry validity: Self-intersecting rings, duplicate vertices, and unclosed polygons cause silent failures in overlay tools. Always validate and repair geometry before any overlay operation.
- Spatial join: Assigns attributes from one layer to features in another based on spatial relationship (intersects, contains, nearest). Supports aggregation modes (first, last, sum, mean, count, min, max).
- Near analysis: Finds the nearest feature (or features within a distance) from a source layer to a target layer. Returns distance and optional target attributes.
- Clip / Intersection / Difference: Standard polygon overlay operations. Clip retains the geometry of input A bounded by B. Intersection produces the geometric overlap. Difference removes the overlap.
- Add geometry attributes: Computes and appends area, perimeter, length, centroid coordinates, or bounding box dimensions as new attribute fields.
- Select by location: Spatial predicate query (intersects, within, contains, etc.) that produces a feature selection or a new filtered layer.
End-to-End Workflow: Parcel Attribute Enrichment
This workflow assigns catchment statistics and proximity-to-road measurements to a parcel layer.
Inputs
| Layer | Format | Notes |
|---|---|---|
| parcels.shp | Polygon vector | Land parcel boundaries |
| catchments.shp | Polygon vector | Watershed polygons with area and slope stats |
| roads.shp | Polyline vector | Road network |
Step 1 — Validate and Repair Geometry
Processing Toolbox → Vector Geometry → Fix Geometries (QGIS native)
| Parameter | Recommended value |
|---|---|
| Input layer | parcels.shp |
| Output | parcels_valid.shp |
Run Check Validity on both catchments.shp and roads.shp and fix any
errors before proceeding.
Step 2 — Add Geometry Attributes to Parcels
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Add Geometry Attributes
| Parameter | Recommended value |
|---|---|
| Input vector | parcels_valid.shp |
| Units | Metres |
| Output | parcels_geom.shp |
This appends AREA, PERIMETER, and centroid X/Y fields to each parcel.
Step 3 — Spatial Join: Assign Catchment Attributes to Parcels
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Spatial Join
| Parameter | Recommended value |
|---|---|
| Target layer | parcels_geom.shp |
| Join layer | catchments.shp |
| Spatial relationship | Intersects |
| Join strategy | First (largest overlap catchment) |
| Fields to join | catch_id, mean_slope, area_km2 |
| Output | parcels_joined.shp |
Each parcel now carries the attributes of the catchment it intersects most.
Step 4 — Near: Distance from Each Parcel to Nearest Road
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Near
| Parameter | Recommended value |
|---|---|
| Input vector (source) | parcels_joined.shp |
| Near vector (target) | roads.shp |
| Max search distance (m) | 5000 (0 = search all) |
| Output | parcels_near.shp |
Appended fields: NEAR_DIST (metres to nearest road segment),
NEAR_FID (FID of nearest road feature).
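To make the meaning of NEAR_DIST and NEAR_FID concrete, the sketch below finds the nearest polyline to a single point. It is a simplification: it measures from one point (for example a parcel centroid), whereas the tool may measure from the parcel geometry itself, and the function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)  # a and b coincide
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_line(point, lines, max_distance=0.0):
    """Return (index, distance) of the nearest polyline.

    max_distance of 0 means unlimited; (None, None) means nothing was in range.
    """
    best = (None, None)
    for fid, line in enumerate(lines):
        for a, b in zip(line, line[1:]):
            d = point_segment_distance(point, a, b)
            if max_distance > 0 and d > max_distance:
                continue
            if best[1] is None or d < best[1]:
                best = (fid, d)
    return best

roads = [[(0, 0), (10, 0)], [(0, 5), (10, 5)]]
hit = nearest_line((3, 2), roads)
miss = nearest_line((3, 2), roads, max_distance=1.0)
```

A search distance that is too small for the data extent leaves every feature without a match, which is worth ruling out before suspecting the inputs.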
Step 5 — Select High-Priority Parcels
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Select By Attribute or use QGIS Select by Expression:
"AREA" > 10000 AND "NEAR_DIST" < 500 AND "mean_slope" < 10
Export the selection as priority_parcels.shp using
Layer → Export → Save Selected Features As.
Step 5b — Field Calculator Assistant (Expression + Preview Workflow)
Use this when you need guided expression authoring for derived attributes (for example TYPE-to-SPEED conversion before network impedance analysis).
Open from the Whitebox panel (recommended path):
Whitebox panel → tool search → field_calculator
The assistant provides:
- expression editor with SQL-style presets/snippets
- geometry token insertion ($area, $length, $perimeter, centroid tokens)
- category and keyword snippet filtering
- preview table driven by backend preview_rows payload
- one-click handoff to the standard processing dialog with prefilled parameters
Supported expression features include:
- CASE WHEN ... THEN ... ELSE ... END
- simple CASE field WHEN value THEN ... END
- optional UPDATE ... SET ... WHERE ... wrapper syntax
- SQL operators (=, <>, AND, OR, NOT) and null predicates
- CAST(... AS integer|float|text|boolean)
Example expression:
UPDATE roads SET SPEED = CASE
    WHEN TYPE = 'motorway' THEN 100
    WHEN TYPE = 'primary' THEN 80
    WHEN TYPE = 'collector' THEN 60
    ELSE 40
END
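For readers more comfortable with Python than SQL, the CASE expression above maps each road type to a speed with a fallback value. The sketch below expresses the same mapping as a plain function; it is only an illustration of the logic, not how the assistant evaluates expressions.

```python
def speed_for(road_type):
    """Python equivalent of the CASE expression: known types map to a speed, others to 40."""
    speeds = {"motorway": 100, "primary": 80, "collector": 60}
    return speeds.get(road_type, 40)

road_types = ["motorway", "primary", "collector", "track", None]
speed_values = [speed_for(t) for t in road_types]
```

Unknown and missing types both fall through to the ELSE value, so it is worth previewing the result to confirm that 40 is an acceptable default.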
Notes:
- Launching field_calculator from the Whitebox panel opens the assistant.
- Launching from the generic Processing Toolbox can open the standard dialog directly, depending on host/API path.
Step 6 — Clip Parcels to Study Area (Optional)
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Line Polygon Clip (for lines) or QGIS Clip for polygon-on-polygon.
| Parameter | Recommended value |
|---|---|
| Input vector | priority_parcels.shp |
| Clip polygon | study_area.shp |
| Output | priority_parcels_clipped.shp |
TopoJSON Conversion Chain (QGIS Interop)
Use this workflow when you need to exchange shared-boundary vector data with web clients while keeping a GeoPackage working copy for analysis.
Inputs
| Layer | Format | Notes |
|---|---|---|
| zones.gpkg | Polygon vector | Authoritative analysis dataset |
Step 1 — Run a Whitebox vector operation and emit TopoJSON
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Add Geometry Attributes
| Parameter | Recommended value |
|---|---|
| Input vector | zones.gpkg |
| Units | Metres |
| Output | zones_metrics.topojson |
This confirms the plugin accepts .topojson output targets in a normal vector
processing chain.
Step 2 — Re-open TopoJSON and convert back to GeoPackage
- Add zones_metrics.topojson to the QGIS project.
- Right-click the layer, then choose Export → Save Features As...
- Set format to GeoPackage and save as zones_metrics_roundtrip.gpkg.
Step 3 — Validate roundtrip integrity
Check these before publishing or reusing the roundtrip layer:
- Feature count matches source layer.
- Core attributes (e.g., ID fields and geometry metrics) are preserved.
- CRS is correctly populated on the roundtrip GeoPackage.
Recommended use pattern
- Keep .gpkg as the editable analysis master.
- Generate .topojson as interchange or web-delivery artifacts.
- Re-import to .gpkg for heavier downstream spatial analysis.
TopoJSON Boundary-Preserving Generalization Chain
Use this chain when you need smaller delivery payloads while preserving shared boundary consistency during simplification.
Inputs
| Layer | Format | Notes |
|---|---|---|
| admin_units.gpkg | Polygon vector | Shared boundaries between adjacent polygons |
Step 1 — Simplify and emit TopoJSON
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Simplify Features
| Parameter | Recommended value |
|---|---|
| Input vector | admin_units.gpkg |
| Algorithm | Douglas-Peucker |
| Tolerance | 25.0 (adjust to target scale) |
| Output | admin_units_simplified.topojson |
Step 2 — Inspect topology consistency in QGIS
- Add admin_units_simplified.topojson to the map.
- Inspect shared boundaries at high zoom for slivers/gaps.
- Validate feature count versus source before publication.
Step 3 — Export analysis copy
Export to admin_units_simplified.gpkg for downstream joins/overlay work.
TopoJSON Transport + Enrichment Return Chain
Use this chain when TopoJSON is used only for transport and you need to return to an analysis-grade format for attribute enrichment.
Inputs
| Layer | Format | Notes |
|---|---|---|
| transport_in.topojson | Topology-preserving vector | Interchange input received from external system |
Step 1 — Convert transport input to GeoPackage
- Add transport_in.topojson to the project.
- Export as transport_stage.gpkg.
Step 2 — Apply enrichment tools
Run Whitebox vector tools against transport_stage.gpkg:
- Add Geometry Attributes for geometry metrics.
- Spatial Join for contextual attribute enrichment.
- Near for proximity attributes.
Step 3 — Emit deliverables
Write two outputs:
- transport_enriched.gpkg for analytic persistence.
- transport_enriched.topojson for interchange/web handoff.
Python Console Equivalent
import processing

# Step 1: fix geometry (QGIS native algorithm)
processing.run('native:fixgeometries', {
    'INPUT': '/data/parcels.shp',
    'OUTPUT': '/data/parcels_valid.shp',
})

# Step 2: add geometry attributes
processing.run('whitebox_workflows:add_geometry_attributes', {
    'input': '/data/parcels_valid.shp',
    'units': 'Metres',
    'output': '/data/parcels_geom.shp',
})

# Step 3: spatial join (parameter names follow the Spatial Join reference entry)
processing.run('whitebox_workflows:spatial_join', {
    'target': '/data/parcels_geom.shp',
    'join': '/data/catchments.shp',
    'predicate': 'intersects',
    'strategy': 'first',
    'output': '/data/parcels_joined.shp',
})

# Step 4: near (parameter names follow the Near reference entry)
processing.run('whitebox_workflows:near', {
    'input': '/data/parcels_joined.shp',
    'near': '/data/roads.shp',
    'max_distance': 5000.0,
    'output': '/data/parcels_near.shp',
})

print("Parcel enrichment complete.")
Advanced: Simplify Features for Cartographic Output
Large polygon datasets with many vertices slow down rendering and tile export. Simplify geometries while preserving topology.
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Simplify Features
| Parameter | Recommended value |
|---|---|
| Input vector | priority_parcels_clipped.shp |
| Algorithm | Douglas-Peucker |
| Tolerance (m) | 5.0 (adjust to display scale) |
| Output | parcels_simplified.shp |
processing.run('whitebox_workflows:simplify_features', {
    'input': '/data/priority_parcels_clipped.shp',
    'algorithm': 'DouglasPeucker',
    'tolerance': 5.0,
    'output': '/data/parcels_simplified.shp',
})
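To show how the tolerance affects the result, here is a minimal sketch of the classic Douglas-Peucker algorithm on a single line. It is a textbook version with hypothetical function names, not the Whitebox implementation, and it simplifies each line independently, so it does not by itself keep shared polygon boundaries consistent.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from p to the infinite line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)

def douglas_peucker(points, tolerance):
    """Drop vertices that lie within tolerance of the simplified line."""
    if len(points) < 3:
        return list(points)
    distances = [perpendicular_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    index = max(range(len(distances)), key=distances.__getitem__) + 1
    if distances[index - 1] <= tolerance:
        return [points[0], points[-1]]  # every interior vertex is within tolerance
    # Keep the farthest vertex and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right

line = [(0, 0), (5, 0.5), (10, 0), (15, 6), (20, 0)]
fine = douglas_peucker(line, 1.0)
coarse = douglas_peucker(line, 10.0)
```

With a tolerance of 1.0 only the small 0.5-unit wiggle is removed, while a tolerance of 10.0 collapses the line to its end points, including the 6-unit peak. That is the mechanism behind narrow features disappearing when the tolerance is too large.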
Common Pitfalls
| Problem | Likely cause | Fix |
|---|---|---|
| Spatial join returns no matches | CRS mismatch between target and join layers | Reproject both to the same CRS before joining |
| Near returns -1 for all distances | Search distance too small for data extent | Increase max_distance or set to 0 for unlimited search |
| Add geometry attributes returns wrong area | Layer CRS is geographic (degrees) | Reproject to a projected CRS (metres) first |
| Simplify removes valid narrow features | Tolerance too large | Use a smaller tolerance (< 1 m for cadastral data) |
| Select by location selects too many features | Predicate too inclusive (intersects vs. within) | Switch to Within or Contains for strict containment |
Validation Checklist
- All input layers pass geometry validity check.
- All vector layers share the same projected CRS.
- Spatial join result preserves original feature count (check attribute table row count).
- NEAR_DIST values are plausible (inspect histogram).
- Simplified geometry does not self-intersect at the chosen tolerance.
- Attribute field names in output do not exceed shapefile 10-character limit.
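The last checklist item is easy to automate before writing a shapefile. The sketch below flags field names longer than 10 characters; the example names combine fields mentioned in this chapter with the JOIN_ prefix used by Spatial Join, and the function name is hypothetical.

```python
def overlong_field_names(field_names, limit=10):
    """Return the field names that exceed the shapefile (DBF) name length limit."""
    return [name for name in field_names if len(name) > limit]

fields = ["AREA", "PERIMETER", "NEAR_DIST", "NEAR_FID",
          "JOIN_mean_slope", "JOIN_catch_id"]
too_long = overlong_field_names(fields)
```

Prefixed join fields are the usual offenders. Writing to GeoPackage instead of shapefile avoids the limit entirely.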
Overlay Analysis
Clip
Function name: clip
This tool will extract all the features, or parts of features, that overlap with the features of the clip vector file. The clipping operation is one of the most common vector overlay operations in GIS and effectively imposes the boundary of the clip layer on a set of input vector features, or target features. The operation is sometimes likened to a 'cookie-cutter'. The input vector file can be of any feature type (i.e. points, lines, polygons), however, the clip vector must consist of polygons.
See Also
erase
Python API
def clip(self, input: Vector, clip_layer: Vector) -> Vector:
Difference
Function name: difference
This tool will remove all the overlapping features, or parts of overlapping features, between input and overlay vector files, outputting only the features that occur in one of the two inputs but not both. The Symmetrical Difference is related to the Boolean exclusive-or (XOR) operation in set theory and is one of the common vector overlay operations in GIS. The user must specify the names of the input and overlay vector files as well as the output vector file name. The tool operates on vector points, lines, or polygon, but both the input and overlay files must contain the same VectorGeometryType.
The Symmetrical Difference can also be derived using a combination of other vector overlay operations, as either (A union B) difference (A intersect B), or (A difference B) union (B difference A).
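The two derivations can be checked with Python's built-in sets, which follow the same Boolean logic as the overlay operations. The integers here simply stand in for pieces of area; this is an analogy, not a geometry computation.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

# (A union B) difference (A intersect B)
via_union = (a | b) - (a & b)

# (A difference B) union (B difference A)
via_differences = (a - b) | (b - a)

# Python's own symmetric difference (XOR) operator
builtin = a ^ b
```

All three expressions yield the elements that belong to exactly one of the two inputs.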
The attributes of the two input vectors will be merged in the output attribute table. Fields that are duplicated between the inputs will share a single attribute in the output. Fields that only exist in one of the two inputs will be populated by null in the output table. Multipoint VectorGeometryTypes however will simply contain a single output feature identifier (FID) attribute. Also, note that depending on the VectorGeometryType (polylines and polygons), Measure and Z ShapeDimension data will not be transferred to the output geometries. If the input attribute table contains fields that measure the geometric properties of their associated features (e.g. length or area), these fields will not be updated to reflect changes in geometry shape and size resulting from the overlay operation.
See Also
intersect, difference, union, clip, erase
Python API
def difference(self, input: Vector, overlay: Vector) -> Vector:
Dissolve
Function name: dissolve
This tool can be used to remove the interior, or shared, boundaries within a vector polygon coverage. You can either dissolve all interior boundaries or dissolve those boundaries along polygons with the same value of a user-specified attribute within the vector's attribute table. It may be desirable to use the VectorCleaning tool to correct any topological errors resulting from the slight misalignment of nodes along shared boundaries in the vector coverage before performing the dissolve operation.
See Also
clip, erase, polygonize
Python API
def dissolve(self, input: Vector, dissolve_field: str = "", snap_tolerance: float = 2.220446049250313e-16) -> Vector:
Erase
Function name: erase
This tool will remove all the features, or parts of features, that overlap with the features of the erase vector file. The erasing operation is one of the most common vector overlay operations in GIS and effectively imposes the boundary of the erase layer on a set of input vector features, or target features.
See Also
clip
Python API
def erase(self, input: Vector, erase_layer: Vector) -> Vector:
Identity
Function name: identity
No help documentation available for this tool.
Intersect
Function name: intersect
The result of the intersect vector overlay operation includes all the feature parts that occur in both input layers, excluding all other parts. It is analogous to the AND logical operator and multiplication in arithmetic. This tool is one of the common vector overlay operations in GIS. The user must specify the names of the input and overlay vector files as well as the output vector file name. The tool operates on vector points, lines, or polygons, but both the input and overlay files must contain the same VectorGeometryType.
The intersect tool is similar to the clip tool. The difference is that the overlay vector layer in a clip operation must always be polygons, regardless of whether the input layer consists of points or polylines.
The attributes of the two input vectors will be merged in the output attribute table. Note that duplicate fields should not exist between the input layers, as they will share a single attribute in the output (assigned from the first layer). Multipoint VectorGeometryTypes will simply contain a single output feature identifier (FID) attribute. Also, note that depending on the VectorGeometryType (polylines and polygons), Measure and Z ShapeDimension data will not be transferred to the output geometries. If the input attribute table contains fields that measure the geometric properties of their associated features (e.g. length or area), these fields will not be updated to reflect changes in geometry shape and size resulting from the overlay operation.
See Also
difference, union, symmetrical_difference, clip, erase
Python API
def intersect(self, input: Vector, overlay: Vector, snap_tolerance: float = 2.220446049250313e-16) -> Vector:
Line Intersections
Function name: line_intersections
This tool identifies points where the features of two vector line/polygon layers intersect. The user must specify the names of two input vector line files and the output file. The output file will be a vector of POINT VectorGeometryType. If the input vectors intersect at a line segment, the beginning and end vertices of the segment will be present in the output file. A warning is issued if intersection line segments are identified during analysis. If no intersections are found between the input line files, the output file will not be saved and a warning will be issued.
Each intersection point will contain PARENT1 and PARENT2 attribute fields, identifying the intersecting features in the first and second input line files respectively. Additionally, the output attribute table will contain all of the attributes (excluding FIDs) of the two parent line features.
Python API
def line_intersections(self, input1: Vector, input2: Vector) -> Vector:
Line Polygon Clip
Function name: line_polygon_clip
Experimental
Clips line features to polygon interiors and outputs clipped line segments.
vector clip line
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input line layer. | Required | lines.shp |
| clip | Clip polygon layer. | Required | clip_polygons.shp |
| output | Output vector path. | Required | — |
Examples
Returns clipped line segments inside clip polygons.
wbe.line_polygon_clip(clip='clip_polygons.shp', input='lines.shp', output='line_polygon_clip.shp')
Near
Function name: near
Experimental
Find nearest neighbor features and optionally compute distance. Efficient for proximity analysis and distance calculations.
vector nearest distance
Parameters
Name | Description | Required | Default
input | Input vector layer. | Required | input.shp
near | Near-feature vector layer. | Required | near.shp
max_distance | Optional maximum search distance. | Optional | —
output | Output vector path. | Required | —
Examples
Computes nearest feature IDs and distances.
wbe.near(input='input.shp', near='near.shp', output='near_output.shp')
Select By Location
Function name: select_by_location
Experimental
Extracts target features that satisfy a spatial relationship to query features.
vector query spatial
Parameters
Name | Description | Required | Default
target | Target feature layer to filter. | Required | target.shp
query | Query feature layer. | Required | query.shp
predicate | Spatial predicate: intersects, within, contains, touches, crosses, overlaps, disjoint, within_distance. | Required | intersects
distance | Distance threshold for within_distance predicate. | Optional | —
output | Output vector path. | Required | —
Examples
Selects target features that intersect query features.
wbe.select_by_location(output='selected.shp', predicate='intersects', query='query.shp', target='target.shp')
Spatial Join
Function name: spatial_join
Experimental
Join attributes from one vector layer to another based on spatial relationship. Uses spatial indexing for efficient processing.
vector join spatial
Parameters
Name | Description | Required | Default
target | Target layer receiving joined attributes. | Required | target.shp
join | Join layer providing attributes. | Required | join.shp
predicate | Spatial predicate: intersects, within, contains, touches, crosses, overlaps, within_distance. | Required | intersects
distance | Distance threshold for within_distance predicate. | Optional | —
strategy | Join strategy: first, last, count, sum, mean, min, max. | Optional | first
prefix | Prefix for joined field names (default JOIN_). | Optional | JOIN_
output | Output vector path. | Required | —
Examples
Transfers join-layer attributes where geometries intersect.
wbe.spatial_join(join='join.shp', output='spatial_join.shp', predicate='intersects', prefix='JOIN_', strategy='first', target='target.shp')
Symmetrical Difference
Function name: symmetrical_difference
This tool will remove all the overlapping features, or parts of overlapping features, between input and overlay vector files, outputting only the features that occur in one of the two inputs but not both. The Symmetrical Difference is related to the Boolean exclusive-or (XOR) operation in set theory and is one of the common vector overlay operations in GIS. The user must specify the names of the input and overlay vector files as well as the output vector file name. The tool operates on vector points, lines, or polygons, but both the input and overlay files must contain the same VectorGeometryType.
The Symmetrical Difference can also be derived using a combination of other vector overlay operations, as either (A union B) difference (A intersect B), or (A difference B) union (B difference A).
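The equivalence of the two derivations can be checked with a small set-based sketch. It is only an analogy: Python sets of made-up feature labels stand in for geometry, and no Whitebox function is called.

```python
# Labels stand in for the pieces of space covered by each layer (hypothetical data).
a = {"p1", "p2", "p3"}
b = {"p3", "p4"}

via_union = (a | b) - (a & b)        # (A union B) difference (A intersect B)
via_differences = (a - b) | (b - a)  # (A difference B) union (B difference A)
builtin = a ^ b                      # Python's own symmetric-difference operator

# Both derivations agree with the built-in operator.
assert via_union == via_differences == builtin
print(sorted(builtin))  # ['p1', 'p2', 'p4']
```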
The attributes of the two input vectors will be merged in the output attribute table. Fields that are duplicated between the inputs will share a single attribute in the output. Fields that only exist in one of the two inputs will be populated by null in the output table. Multipoint VectorGeometryTypes however will simply contain a single output feature identifier (FID) attribute. Also, note that depending on the VectorGeometryType (polylines and polygons), Measure and Z ShapeDimension data will not be transferred to the output geometries. If the input attribute table contains fields that measure the geometric properties of their associated features (e.g. length or area), these fields will not be updated to reflect changes in geometry shape and size resulting from the overlay operation.
See Also
intersect, difference, union, clip, erase
Python API
def symmetrical_difference(self, input: Vector, overlay: Vector, snap_tolerance: float = 2.220446049250313e-16) -> Vector:
Union
Function name: union
This tool splits vector layers at their overlaps, creating a layer containing all the portions from both input and overlay layers. The Union is related to the Boolean OR operation in set theory and is one of the common vector overlay operations in GIS. The user must specify the names of the input and overlay vector files as well as the output vector file name. The tool operates on vector points, lines, or polygons, but both the input and overlay files must contain the same VectorGeometryType.
The attributes of the two input vectors will be merged in the output attribute table. Fields that are duplicated between the inputs will share a single attribute in the output. Fields that only exist in one of the two inputs will be populated by null in the output table. Multipoint VectorGeometryTypes however will simply contain a single output feature identifier (FID) attribute. Also, note that depending on the VectorGeometryType (polylines and polygons), Measure and Z ShapeDimension data will not be transferred to the output geometries. If the input attribute table contains fields that measure the geometric properties of their associated features (e.g. length or area), these fields will not be updated to reflect changes in geometry shape and size resulting from the overlay operation.
See Also
intersect, difference, symmetrical_difference, clip, erase
Python API
def union(self, input: Vector, overlay: Vector, snap_tolerance: float = 2.220446049250313e-16) -> Vector:
Update
Function name: update
No help documentation available for this tool.
Geometry Processing
Centroid Vector
Function name: centroid_vector
This tool can be used to identify the centroid point of a vector polyline or polygon feature or a group of vector points. The output is a vector shapefile of points. For multi-part polyline or polygon features, the user can optionally specify whether to identify the centroid of each part. The default is to treat multi-part features as a single entity.
For raster features, use the Centroid tool instead.
See Also
Centroid, medoid
Python API
def centroid_vector(self, input: Vector) -> Vector:
Concave Hull
Function name: concave_hull
Experimental
Creates concave hull polygons around all input feature coordinates.
vector hull boundary
Parameters
Name | Description | Required | Default
input | Input vector layer. | Required | input.shp
max_edge_length | Maximum edge length controlling hull detail. | Required | 50.0
epsilon | Robustness epsilon (default 1e-9). | Optional | 1e-09
output | Output vector path. | Required | —
Examples
Builds a concave hull from all input coordinates.
wbe.concave_hull(epsilon=1e-09, input='input.shp', max_edge_length=50.0, output='concave_hull.shp')
Densify Features
Function name: densify_features
Experimental
Add intermediate vertices to geometries at regular intervals. Improves accuracy for curved features or when reprojecting.
vector densify vertices
Parameters
Name | Description | Required | Default
input | Input vector layer. | Required | input.shp
spacing | Maximum spacing between adjacent vertices. | Required | 25.0
output | Output vector path. | Required | —
Examples
Adds regularly spaced vertices along geometry segments.
wbe.densify_features(input='input.shp', output='densified.shp', spacing=25.0)
Eliminate Coincident Points
Function name: eliminate_coincident_points
This tool can be used to remove any coincident, or nearly coincident, points from a vector points file. The user must specify the name of the input file, which must be of a POINTS VectorGeometryType, the output file name, and the tolerance distance. All points that are within the specified tolerance distance will be eliminated from the output file. A tolerance distance of 0.0 indicates that points must be exactly coincident to be removed.
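One way to picture the tolerance rule is the brute-force sketch below. It assumes that the first point encountered in each cluster is the one retained, which the description above does not specify. `drop_coincident` is a hypothetical helper for illustration and is not part of the Whitebox API.

```python
import math

def drop_coincident(points, tolerance):
    """Keep a point only if it is farther than `tolerance` from every kept point."""
    kept = []
    for p in points:
        # With tolerance 0.0, only exactly coincident points (distance 0) fail this test.
        if all(math.dist(p, q) > tolerance for q in kept):
            kept.append(p)
    return kept

pts = [(0.0, 0.0), (0.0, 0.0), (0.05, 0.0), (1.0, 1.0)]
exact_only = drop_coincident(pts, 0.0)  # removes only the exact duplicate
loose = drop_coincident(pts, 0.1)       # also removes the near-duplicate
```

The pairwise comparison is quadratic in the number of points; a production implementation would normally use a spatial index instead.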
See Also
LidarRemoveDuplicates
Python API
def eliminate_coincident_points(self, input: Vector, tolerance_dist: float) -> Vector:
Extend Vector Lines
Function name: extend_vector_lines
This tool can be used to extend vector lines by a specified distance. The user must input the names of the input and output shapefiles, the distance to extend features by, and whether to extend both ends, line starts, or line ends. The input shapefile must be of a POLYLINE base shape type and should be in a projected coordinate system.
Python API
def extend_vector_lines(self, input: Vector, distance: float, extend_direction: str = "both") -> Vector:
Merge Line Segments
Function name: merge_line_segments
Vector lines can sometimes contain two features that are connected by a shared end vertex. This tool identifies connected line features in an input vector file (input) and merges them in the output file (output). Two line features are merged if their ends are coincident, and are not coincident with any other feature (i.e. a bifurcation junction). End vertices are considered to be coincident if they are within the specified snap distance (snap).
See Also
split_with_lines
Python API
def merge_line_segments(self, input: Vector, snap_tolerance: float = 2.220446049250313e-16) -> Vector:
Minimum Bounding Box
Function name: minimum_bounding_box
This tool delineates the minimum bounding box (MBB) for a group of vectors. The MBB is the smallest box to completely enclose a feature. The algorithm works by rotating the feature, calculating the axis-aligned bounding box for each rotation, and finding the box with the smallest area, length, width, or perimeter. The MBB is needed to compute several shape indices, such as the Elongation Ratio. The minimum_bounding_envelope tool can be used to calculate the axis-aligned bounding rectangle around each feature in a vector file.
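The rotate-and-measure idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the tool's implementation: it samples rotations at a fixed 1-degree step, minimizes area only, and `min_area_box` is a hypothetical helper name.

```python
import math

def min_area_box(points, step_deg=1.0):
    """Return (area, angle_deg, width, height) of the smallest box among sampled rotations."""
    best = None
    steps = int(round(90.0 / step_deg))  # box shape repeats every 90 degrees
    for i in range(steps):
        angle = i * step_deg
        t = math.radians(angle)
        c, s = math.cos(t), math.sin(t)
        # Rotate every point, then take the axis-aligned bounding box.
        xs = [x * c - y * s for x, y in points]
        ys = [x * s + y * c for x, y in points]
        w = max(xs) - min(xs)
        h = max(ys) - min(ys)
        if best is None or w * h < best[0]:
            best = (w * h, angle, w, h)
    return best

# A 4 x 2 rectangle rotated by 30 degrees (hypothetical test data).
t30 = math.radians(30.0)
rect = [(x * math.cos(t30) - y * math.sin(t30), x * math.sin(t30) + y * math.cos(t30))
        for x, y in [(0, 0), (4, 0), (4, 2), (0, 2)]]
area, angle, width, height = min_area_box(rect)
```

Replacing `w * h` with `w + h`, `w`, or `h` would give a perimeter-, width-, or length-based criterion in the same loop.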
See Also
minimum_bounding_circle, minimum_bounding_envelope, minimum_convex_hull
Python API
def minimum_bounding_box(self, input: Vector, min_criteria: str = "area", individual_feature_hulls: bool = True) -> Vector:
Minimum Bounding Circle
Function name: minimum_bounding_circle
This tool delineates the minimum bounding circle (MBC) for a group of vectors. The MBC is the smallest enclosing circle to completely enclose a feature.
See Also
minimum_bounding_box, minimum_bounding_envelope, minimum_convex_hull
Python API
def minimum_bounding_circle(self, input: Vector, individual_feature_hulls: bool = True) -> Vector:
Minimum Bounding Envelope
Function name: minimum_bounding_envelope
This tool delineates the minimum bounding axis-aligned box for a group of vector features. The envelope is the smallest rectangle to completely enclose a feature, in which the sides of the envelope are aligned with the x and y axes of the coordinate system. The minimum_bounding_box tool can be used instead to find the smallest possible non-axis-aligned rectangular envelope.
See Also
minimum_bounding_box, minimum_bounding_circle, minimum_convex_hull
Python API
def minimum_bounding_envelope(self, input: Vector, individual_feature_hulls: bool = True) -> Vector:
Minimum Convex Hull
Function name: minimum_convex_hull
This tool creates a vector convex polygon around vector features. The convex hull is a convex closure of a set of points or polygon vertices and can be conceptualized as the shape enclosed by a rubber band stretched around the point set. The convex hull has many applications and is most notably used in various shape indices. The Delaunay triangulation of a point set and its dual, the Voronoi diagram, are mathematically related to convex hulls.
See Also
minimum_bounding_box, minimum_bounding_circle, minimum_bounding_envelope
Python API
def minimum_convex_hull(self, input: Vector, individual_feature_hulls: bool = True) -> Vector:
Polygonize
Function name: polygonize
This tool outputs a vector polygon layer from two or more intersecting line features contained in one or more input vector line files. Each space enclosed by the intersecting line set is converted to a polygon and added to the output layer. This tool should not be confused with the lines_to_polygons tool, which can be used to convert a vector file of polylines into a set of polygons, simply by closing each line feature. The lines_to_polygons tool does not deal with line intersection in the same way that the polygonize tool does.
See Also
lines_to_polygons
Python API
def polygonize(self, input_layers: List[Vector]) -> Vector:
Representative Point Vector
Function name: representative_point_vector
No help documentation available for this tool.
Simplify Features
Function name: simplify_features
Experimental
Simplify geometries by removing detail while preserving shape. Reduces complexity for visualization or processing speed.
vector simplify generalization
Parameters
Name | Description | Required | Default
input | Input vector layer. | Required | input.shp
tolerance | Simplification tolerance in map units. | Required | 5.0
output | Output vector path. | Required | —
Examples
Simplifies geometry complexity while retaining shape.
wbe.simplify_features(input='input.shp', output='simplified.shp', tolerance=5.0)
Smooth Vectors
Function name: smooth_vectors
This tool smooths a vector coverage of either a POLYLINE or POLYGON base VectorGeometryType. The algorithm uses a simple moving average method for smoothing, where the size of the averaging window is specified by the user. The default filter size is 3 and can be any odd integer larger than or equal to 3. The larger the averaging window, the greater the degree of line smoothing.
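A moving-average smoother of this kind can be sketched as follows. The handling of line ends is an assumption made for illustration (endpoints are kept fixed and the window is truncated near them); the tool's actual edge handling is not described above. `smooth_line` is a hypothetical helper.

```python
def smooth_line(vertices, filter_size=3):
    """Moving-average smoothing of an open polyline given as (x, y) tuples."""
    if filter_size < 3 or filter_size % 2 == 0:
        raise ValueError("filter_size must be an odd integer >= 3")
    half = filter_size // 2
    n = len(vertices)
    out = []
    for i in range(n):
        if i == 0 or i == n - 1:
            out.append(vertices[i])  # assumption: endpoints are not moved
            continue
        # Window centred on vertex i, truncated at the ends of the line.
        window = vertices[max(0, i - half):min(n, i + half + 1)]
        out.append((sum(x for x, _ in window) / len(window),
                    sum(y for _, y in window) / len(window)))
    return out

zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0), (4.0, 0.0)]
smoothed = smooth_line(zigzag, filter_size=3)
```

With a larger `filter_size` each vertex is averaged with more neighbours, which is why wider windows produce smoother lines.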
Python API
def smooth_vectors(self, input: Vector, filter_size: int = 3) -> Vector:
Snap Endnodes
Function name: snap_endnodes
Experimental
Snaps nearby polyline endpoints to a shared location within a tolerance.
vector gis linework legacy-port
Parameters
Name | Description | Required | Default
input | Input polyline vector layer. | Required | input_lines.shp
snap_tolerance | Endpoint snapping tolerance in map units. | Optional | 2.220446049250313e-16
output | Output snapped polyline vector path. | Required | —
Examples
Snaps adjacent line endpoints by a tolerance threshold.
wbe.snap_endnodes(input='input_lines.shp', output='snapped_endnodes.shp', snap_tolerance=2.220446049250313e-16)
Split Vector Lines
Function name: split_vector_lines
This tool can be used to divide longer vector lines (input) into segments of a maximum specified length (length).
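The sketch below shows one plausible way to cut a polyline into pieces of a maximum length by walking along it and inserting a cut vertex whenever the length budget is used up. It is an illustration only; `split_line` is a hypothetical helper, and the tool may treat the final short piece or attributes differently.

```python
import math

def split_line(vertices, segment_length):
    """Cut a polyline (list of (x, y) tuples) into pieces no longer than segment_length."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    pieces = []
    current = [vertices[0]]
    remaining = segment_length  # length still available in the current piece
    for end in vertices[1:]:
        start = current[-1]
        span = math.dist(start, end)
        while span > remaining:
            # Interpolate a cut point where the current piece reaches full length.
            f = remaining / span
            cut = (start[0] + f * (end[0] - start[0]),
                   start[1] + f * (end[1] - start[1]))
            current.append(cut)
            pieces.append(current)
            current, start = [cut], cut
            span = math.dist(start, end)
            remaining = segment_length
        current.append(end)
        remaining -= span
        if remaining == 0:
            # The piece ends exactly on an existing vertex.
            pieces.append(current)
            current, remaining = [end], segment_length
    if len(current) > 1:
        pieces.append(current)
    return pieces

pieces = split_line([(0.0, 0.0), (10.0, 0.0)], 4.0)
piece_lengths = [math.dist(p[0], p[-1]) for p in pieces]  # straight pieces here
```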
See Also
assess_route
Python API
def split_vector_lines(self, input: Vector, segment_length: float) -> Vector:
Split With Lines
Function name: split_with_lines
This tool splits the lines or polygons in one layer using the lines in another layer to define the breaking points. Intersection points between geometries in both layers are considered as split points. The input layer (input) can be of either POLYLINE or POLYGON VectorGeometryType and the output file will share this geometry type. The user must also specify a split layer (split), of POLYLINE VectorGeometryType, used to bisect the input geometries.
Each split geometry's attribute record will contain FID and PARENT_FID values and all of the attributes (excluding FIDs) of the input layer.
See Also
merge_line_segments
Python API
def split_with_lines(self, input: Vector, split_vector: Vector) -> Vector:
Shape Metrics
Compactness Ratio
Function name: compactness_ratio
The compactness ratio is an indicator of polygon shape complexity. The compactness ratio is defined as the polygon area divided by its perimeter. Unlike some other shape parameters (e.g. ShapeComplexityIndex), compactness ratio does not standardize to a simple Euclidean shape. Although widely used for landscape analysis, compactness ratio, like its inverse, the perimeter_area_ratio, exhibits the undesirable property of polygon size dependence (McGarigal et al. 2002). That is, holding shape constant, an increase in polygon size will cause a change in the compactness ratio.
The output data will be contained in the input vector's attribute table as a new field (COMPACT).
See Also
perimeter_area_ratio, ShapeComplexityIndex, related_circumscribing_circle
Python API
def compactness_ratio(self, input: Vector) -> Vector:
Deviation From Regional Direction
Function name: deviation_from_regional_direction
This tool calculates the degree to which each polygon in an input shapefile (input) deviates from the average, or regional, direction. The input file will have a new attribute inserted in the attribute table, DEV_DIR, which will contain the calculated values. The deviation values are in degrees. The orientation of each polygon is determined based on the long-axis of the minimum bounding box fitted to the polygon. The regional direction is based on the mean direction of the polygons, weighted by long-axis length (longer polygons contribute more weight) and elongation, i.e., a function of the long and short axis lengths (greater elongation contributes more weight). Polygons with elongation values lower than the elongation threshold value (elongation_threshold), which has values between 0 and 1, will be excluded from the calculation of the regional direction.
See Also
patch_orientation, elongation_ratio
Python API
def deviation_from_regional_direction(self, input: Vector, elongation_threshold: float = 0.75) -> Vector:
Elongation Ratio
Function name: elongation_ratio
This tool can be used to calculate the elongation ratio for vector polygons. The elongation ratio values calculated for each vector polygon feature will be placed in the accompanying database file (.dbf) as an elongation field (ELONGATION).
The elongation ratio (E) is:
E = 1 - S / L
Where S is the short-axis length, and L is the long-axis length. Axes lengths are determined by estimating the minimum bounding box.
The elongation ratio provides information similar to the Linearity Index. The ratio is not an adequate measure of overall polygon narrowness, because a highly sinuous but narrow polygon will have a low linearity (elongation) owing to the compact nature of these polygons.
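As a quick numeric check of E = 1 - S / L, the following sketch evaluates the formula for given axis lengths. `elongation_from_axes` is a hypothetical helper; it takes the two side lengths of an already computed minimum bounding box and does not call the Whitebox tool.

```python
def elongation_from_axes(short_axis, long_axis):
    """E = 1 - S / L, where S and L are the short and long sides of the bounding box."""
    if long_axis <= 0 or short_axis < 0 or short_axis > long_axis:
        raise ValueError("expected 0 <= short_axis <= long_axis and long_axis > 0")
    return 1.0 - short_axis / long_axis

square_like = elongation_from_axes(2.0, 2.0)  # equal axes give E = 0
stretched = elongation_from_axes(1.0, 4.0)    # a 1 x 4 box gives E = 0.75
```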
Python API
def elongation_ratio(self, input: Vector) -> Vector:
Hole Proportion
Function name: hole_proportion
This calculates the proportion of the total area of a polygon's holes (i.e. islands) relative to the area of the polygon's hull. It can be a useful measure of shape complexity, or how discontinuous a patch is. The user must specify the name of the input vector file and the output data will be contained within the input vector's database file as a new field (HOLE_PROP).
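A minimal sketch of the calculation, assuming that "hull" here means the polygon's exterior ring and that its area is taken without subtracting the holes. Ring areas use the shoelace formula; `ring_area` and `hole_proportion_of` are hypothetical helpers.

```python
def ring_area(ring):
    """Unsigned area of a closed ring given as (x, y) tuples (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def hole_proportion_of(exterior, holes):
    """Total hole area divided by the area enclosed by the exterior ring."""
    return sum(ring_area(h) for h in holes) / ring_area(exterior)

# A 10 x 10 square with a single 2 x 2 hole (hypothetical data).
exterior = [(0, 0), (10, 0), (10, 10), (0, 10)]
holes = [[(4, 4), (6, 4), (6, 6), (4, 6)]]
proportion = hole_proportion_of(exterior, holes)
```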
See Also
ShapeComplexityIndex, elongation_ratio, perimeter_area_ratio
Python API
def hole_proportion(self, input: Vector) -> Vector:
Linearity Index
Function name: linearity_index
This tool calculates the linearity index of polygon features based on a regression analysis. The index is simply the coefficient of determination (r-squared) calculated from a regression analysis of the x and y coordinates of the exterior hull nodes of a vector polygon. Linearity index is a measure of how well a polygon can be described by a straight line. It is a related index to the elongation_ratio, but is more efficient to calculate as it does not require finding the minimum bounding box. The Pearson correlation coefficient between linearity index and the elongation ratio for a large data set of lake polygons in northern Canada was found to be 0.656, suggesting a moderate level of association between the two measures of polygon linearity. Note that this index is not useful for identifying narrow yet sinuous polygons, such as meandering rivers.
The only required input is the name of the file. The linearity values calculated for each vector polygon feature will be placed in the accompanying attribute table as a new field (LINEARITY).
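The coefficient of determination described above can be sketched directly from the hull coordinates. This is an illustrative calculation in plain Python, not the tool's code; `r_squared` is a hypothetical helper, and degenerate input with no spread in x or y is rejected rather than given a value.

```python
def r_squared(xs, ys):
    """Coefficient of determination between x and y coordinates of the hull nodes."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r-squared is undefined when x or y has no spread")
    return (sxy * sxy) / (sxx * syy)

# A unit square has no linear trend; a thin diagonal strip is almost a line.
square = r_squared([0, 1, 1, 0], [0, 0, 1, 1])
strip = r_squared([0, 10, 10, 0], [0, 10, 10.5, 0.5])
```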
See Also
elongation_ratio, patch_orientation
Python API
def linearity_index(self, input: Vector) -> Vector:
Narrowness Index Vector
Function name: narrowness_index_vector
No help documentation available for this tool.
Patch Orientation
Function name: patch_orientation
This tool calculates the orientation of polygon features based on the slope of a reduced major axis (RMA) regression line. The regression analysis uses the vertices of the exterior hull of a vector polygon. The only required input is the name of the vector polygon file. The orientation values, measured in degrees from north, will be placed in the accompanying attribute table as a new field (ORIENT). The value of the orientation measure for any polygon will depend on how elongated the feature is.
Note that the output values are polygon orientations and not true directions. While directions may take values ranging from 0-360, orientation is expressed as an angle between 0 and 180 degrees clockwise from north. Lastly, the orientation measure may become unstable when polygons are oriented nearly vertical or horizontal.
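The following sketch shows how an RMA slope can be turned into an orientation between 0 and 180 degrees clockwise from north. It assumes x increases eastward and y northward, and uses the standard RMA slope (ratio of the standard deviations, signed by the covariance). Whether the tool computes the value in exactly this way is not confirmed by the text above, and `rma_orientation` is a hypothetical helper.

```python
import math

def rma_orientation(xs, ys):
    """Orientation in degrees clockwise from north, from a reduced major axis fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0:
        return 0.0  # no spread in x: the feature runs north-south
    # RMA slope: magnitude sd_y / sd_x, sign taken from the covariance.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    # atan gives the angle from east; convert to an azimuth and fold into [0, 180).
    return (90.0 - math.degrees(math.atan(slope))) % 180.0

north_east = rma_orientation([0, 1, 2, 3], [0, 1, 2, 3])
north_west = rma_orientation([0, 1, 2, 3], [3, 2, 1, 0])
east_west = rma_orientation([0, 1, 2, 3], [5, 5, 5, 5])
```

Near-vertical features drive `sxx` toward zero and near-horizontal ones drive `syy` toward zero, which is one way to see why the measure can become unstable in those cases.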
See Also
linearity_index, elongation_ratio
Python API
def patch_orientation(self, input: Vector) -> Vector:
Perimeter Area Ratio
Function name: perimeter_area_ratio
The perimeter-area ratio is an indicator of polygon shape complexity. Unlike some other shape parameters (e.g. shape complexity index), perimeter-area ratio does not standardize to a simple Euclidean shape. Although widely used for landscape analysis, perimeter-area ratio exhibits the undesirable property of polygon size dependence (McGarigal et al. 2002). That is, holding shape constant, an increase in polygon size will cause a decrease in the perimeter-area ratio. The perimeter-area ratio is the inverse of the compactness ratio.
The output data will be displayed as a new field (P_A_RATIO) in the input vector's database file.
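The size dependence is easy to see for squares, where the shape is constant and only the side length changes. The sketch below is plain arithmetic with a hypothetical helper, `square_ratios`, and does not call the Whitebox tools.

```python
def square_ratios(side):
    """Return (perimeter-area ratio, compactness ratio) for a square of the given side."""
    area = side * side
    perimeter = 4.0 * side
    return perimeter / area, area / perimeter

# Same shape, growing size: P/A falls as 4/side while A/P rises as side/4.
for side in (1.0, 10.0, 100.0):
    pa_ratio, compactness = square_ratios(side)
    print(side, pa_ratio, compactness)
```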
Python API
def perimeter_area_ratio(self, input: Vector) -> Vector:
Polygon Area
Function name: polygon_area
This tool calculates the area of vector polygons, adding the result to the vector's attribute table (AREA field). The area calculation will account for any holes contained within polygons. The vector should be in a projected coordinate system.
To calculate the area of raster polygons, use the raster_area tool instead.
See Also
raster_area
Python API
def polygon_area(self, input: Vector) -> Vector:
Polygon Long Axis
Function name: polygon_long_axis
This tool can be used to map the long axis of polygon features. The long axis is the longer of the two primary axes of the minimum bounding box (MBB), i.e. the smallest box to completely enclose a feature. The long axis is drawn for each polygon in the input vector file such that it passes through the centre point of the MBB. The output file is therefore a vector of simple two-point polylines forming a vector field.
Python API
def polygon_long_axis(self, input: Vector) -> Vector:
Polygon Perimeter
Function name: polygon_perimeter
This tool calculates the perimeter of vector polygons, adding the result to the vector's attribute table (PERIMETER field). The perimeter calculation will account for any holes contained within polygons. The vector should be in a projected coordinate system.
Python API
def polygon_perimeter(self, input: Vector) -> Vector:
Polygon Short Axis
Function name: polygon_short_axis
This tool can be used to map the short axis of polygon features. The short axis is the shorter of the two primary axes of the minimum bounding box (MBB), i.e. the smallest box to completely enclose a feature. The short axis is drawn for each polygon in the input vector file such that it passes through the centre point of the MBB. The output file is therefore a vector of simple two-point polylines forming a vector field.
Python API
def polygon_short_axis(self, input: Vector) -> Vector:
Related Circumscribing Circle
Function name: related_circumscribing_circle
This tool can be used to calculate the related circumscribing circle (McGarigal et al. 2002) for vector polygon features. The related circumscribing circle values calculated for each vector polygon feature will be placed in the accompanying attribute table as a new field (RC_CIRCLE).
Related circumscribing circle (RCC) is defined as:
RCC = 1 - A / Ac
Where A is the polygon's area and Ac the area of the smallest circumscribing circle.
Theoretically, related_circumscribing_circle ranges from 0 to 1, where a value of 0 indicates a circular polygon and a value of 1 indicates a highly elongated shape. The circumscribing circle provides a measure of polygon elongation. Unlike the elongation_ratio, however, it does not provide a measure of polygon direction in addition to overall elongation. Like the elongation_ratio and linearity_index, related_circumscribing_circle is not an adequate measure of overall polygon narrowness, because a highly sinuous but narrow patch will have a low related circumscribing circle index owing to the compact nature of these polygons.
Note: Holes are excluded from the area calculation of polygons.
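For rectangles the index can be worked out by hand, because the smallest circumscribing circle is centred on the rectangle and has a diameter equal to the diagonal. The sketch below evaluates RCC = 1 - A / Ac for that case; `rcc_index` and `rcc_rectangle` are hypothetical helpers, not Whitebox functions.

```python
import math

def rcc_index(area, circle_radius):
    """RCC = 1 - A / Ac, with Ac the area of the smallest circumscribing circle."""
    circle_area = math.pi * circle_radius ** 2
    return 1.0 - area / circle_area

def rcc_rectangle(width, height):
    """RCC of a rectangle, whose circumscribing circle has the diagonal as diameter."""
    radius = math.hypot(width, height) / 2.0
    return rcc_index(width * height, radius)

square_rcc = rcc_rectangle(1.0, 1.0)      # 1 - 2/pi, roughly 0.363
elongated_rcc = rcc_rectangle(10.0, 1.0)  # closer to 1 for a long, thin shape
circle_rcc = rcc_index(math.pi * 3.0 ** 2, 3.0)  # a circle fills its own circle
```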
Python API
def related_circumscribing_circle(self, input: Vector) -> Vector:
Shape Complexity Index Vector
Function name: shape_complexity_index_vector
This tool provides a measure of overall polygon shape complexity, or irregularity, for vector polygons. Several shape indices have been created to compare a polygon's shape to simple Euclidean shapes (e.g. circles, squares, etc.). One of the problems with this approach is that it inherently convolves the characteristics of polygon complexity and elongation. The Shape Complexity Index (SCI) was developed as a parameter for assessing the complexity of a polygon that is independent of its elongation.
SCI relates a polygon's shape to that of an encompassing convex hull. It is defined as:
SCI = 1 - A / Ah
Where A is the polygon's area and Ah is the area of the convex hull containing the polygon. Convex polygons, i.e. those that do not contain concavities or holes, have a value of 0. As the shape of the polygon becomes more complex, the SCI approaches 1. Note that polygon shape complexity also increases with a greater number of holes (i.e. islands), since holes have the effect of reducing the polygon area.
The SCI values calculated for each vector polygon feature will be placed in the accompanying database file (.dbf) as a complexity field (COMPLEXITY).
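The formula SCI = 1 - A / Ah can be illustrated with a small self-contained sketch. It uses Andrew's monotone chain for the convex hull and the shoelace formula for areas, considers the exterior ring only (holes are ignored for brevity), and all helper names are hypothetical rather than part of the Whitebox API.

```python
def shoelace_area(ring):
    """Unsigned area of a closed ring given as (x, y) tuples."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def shape_complexity(ring):
    """SCI = 1 - A / Ah for a polygon's exterior ring."""
    return 1.0 - shoelace_area(ring) / shoelace_area(convex_hull(ring))

# An L-shaped polygon (area 3) whose convex hull has area 3.5.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
sci_l = shape_complexity(l_shape)
sci_square = shape_complexity([(0, 0), (1, 0), (1, 1), (0, 1)])
```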
See Also
shape_complexity_index_raster
Python API
def shape_complexity_index_vector(self, input: Vector) -> Vector:
Sampling and Gridding
Construct Vector TIN
Function name: construct_vector_tin
This tool creates a vector triangular irregular network (TIN) for a set of vector points (input) using a 2D Delaunay triangulation algorithm. TIN vertex heights can be assigned based on either a field in the vector's attribute table (field), or alternatively, if the vector is of a z-dimension VectorGeometryTypeDimension, the point z-values may be used for vertex heights (use_z). For LiDAR points, use the lidar_construct_vector_tin tool instead.
Triangulation often creates very long, narrow triangles near the edges of the data coverage, particularly in convex regions along the data boundary. To avoid these spurious triangles, the user may optionally specify the maximum allowable edge length of a triangular facet (max_triangle_edge_length).
See Also
lidar_construct_vector_tin
Python API
def construct_vector_tin(self, input_points: Vector, field_name: str = "FID", use_z: bool = False, max_triangle_edge_length: float = float('inf')) -> Vector:
Contours From Points
Function name: contours_from_points
This tool creates a contour coverage from a set of input points (input). The user must specify the contour interval (interval) and optionally, the base contour value (base). The degree to which contours are smoothed is controlled by the Smoothing Filter Size parameter (smooth). This value, which determines the size of a mean filter applied to the x-y position of vertices in each contour, should be an odd integer value, e.g. 3, 5, 7, 9, 11, etc. Larger values will result in smoother contour lines.
See Also
contours_from_raster
Python API
def contours_from_points(self, input: Vector, field_name: str = "", use_z_values: bool = False, max_triangle_edge_length: float = float('inf'), contour_interval: float = 10.0, base_contour: float = 0.0, smoothing_filter_size: int = 9) -> Vector:
Contours From Raster
Function name: contours_from_raster
This tool can be used to create a vector contour coverage from an input raster surface model (input), such as a digital elevation model (DEM). The user must specify the contour interval (interval) and optionally, the base contour value (base). The degree to which contours are smoothed is controlled by the Smoothing Filter Size parameter (smooth). This value, which determines the size of a mean filter applied to the x-y position of vertices in each contour, should be an odd integer value, e.g. 3, 5, 7, 9, 11, etc. Larger values will result in smoother contour lines. The tolerance parameter (tolerance) controls the amount of line generalization. That is, vertices in a contour line will be selectively removed from the line if they do not result in an angular deflection in the line's path of at least this threshold value. Increasing this value can significantly decrease the size of the output contour vector file, at the cost of generating straighter contour line segments.
See Also
raster_to_vector_polygons
Python API
def contours_from_raster(self, raster_surface: Raster, contour_interval: float = 10.0, base_contour: float = 0.0, smoothing_filter_size: int = 9, deflection_tolerance: float = 10.0) -> Vector:
Extract Nodes
Function name: extract_nodes
This tool converts vector lines or polygons into vertex points. The user must specify the name of the input vector, which must be of a polyline or polygon base shape type, and the name of the output point-type vector.
Python API
def extract_nodes(self, input: Vector) -> Vector:
Extract Raster Values At Points
Function name: extract_raster_values_at_points
This tool can be used to extract the values of one or more rasters (inputs) at the sites of a set of vector points. By default, the data is output to the attribute table of the input points (points) vector; however, if the out_text parameter is specified, the tool will additionally output point values as text data to standard output (stdout). Attribute fields will be added to the table of the points file, with field names, VALUE1, VALUE2, VALUE3, etc. each corresponding to the order of input rasters.
If you need to plot a chart of values from a raster stack at a set of points, the image_stack_profile may be more suitable for this application.
See Also
image_stack_profile, find_lowest_or_highest_points
Python API
def extract_raster_values_at_points(self, rasters: List[Raster], points: Vector) -> Tuple[Vector, str]:
Find Lowest Or Highest Points
Function name: find_lowest_or_highest_points
This tool locates the lowest and/or highest cells in a raster and outputs these locations to a vector points file. The user must specify the name of the input raster (input) and the name of the output vector file (output). The user also has the option (out_type) to locate either the lowest value, highest value, or both values. The output vector's attribute table will contain fields for the points' XY coordinates and their values.
See Also
extract_raster_values_at_points
Python API
def find_lowest_or_highest_points(self, raster: Raster, output_type: str = "lowest") -> Vector:
Hexagonal Grid From Raster Base
Function name: hexagonal_grid_from_raster_base
This tool can be used to create a hexagonal vector grid. The extent of the hexagonal grid is based on the extent of an input raster base file (base). The user must also specify the hexagonal cell width (width) and whether the hexagonal orientation (orientation) is horizontal or vertical. To use a vector base image instead of a raster, use the hexagonal_grid_from_vector_base tool.
See Also
hexagonal_grid_from_vector_base
Python API
def hexagonal_grid_from_raster_base(self, base: Raster, width: float, orientation: str = "h") -> Vector:
Hexagonal Grid From Vector Base
Function name: hexagonal_grid_from_vector_base
This tool can be used to create a hexagonal vector grid. The extent of the hexagonal grid is based on the extent of an input vector base file (base). The user must also specify the hexagonal cell width (width) and whether the hexagonal orientation (orientation) is horizontal or vertical. To use a raster base image instead of a vector, use the hexagonal_grid_from_raster_base tool.
See Also
hexagonal_grid_from_raster_base
Python API
def hexagonal_grid_from_vector_base(self, base: Vector, width: float, orientation: str = "h") -> Vector:
Layer Footprint Raster
Function name: layer_footprint_raster
This tool creates a vector polygon footprint of the area covered by an input raster grid (input). It will create a vector rectangle corresponding to the bounding box of the input raster.
If the input data have an irregular shape (i.e. there is a boundary of NoData cells), the resulting vector will still correspond to the full grid extent, ignoring the irregular boundary. If this is not the desired effect, you may consider the minimum_bounding_envelope tool instead.
See Also
layer_footprint_vector, minimum_bounding_envelope
Python API
def layer_footprint_raster(self, input: Raster) -> Vector:
Layer Footprint Vector
Function name: layer_footprint_vector
This tool creates a vector polygon footprint of the area covered by a vector layer. It will create a vector rectangle corresponding to the bounding box. The user must specify the name of the input file (input).
If the input data have an irregular shape, the resulting vector will still correspond to the full bounding-box extent, ignoring the irregular boundary. If this is not the desired effect, you should use the minimum_bounding_envelope tool instead.
See Also
layer_footprint_raster, minimum_bounding_envelope
Python API
def layer_footprint_vector(self, input: Vector) -> Vector:
Medoid
Function name: medoid
This tool calculates the medoid for a series of vector features contained in a shapefile. The medoid of a two-dimensional feature is conceptually similar to its centroid, or mean position, but the medoid is always a member of the input feature data set. Thus, the medoid is a measure of central tendency that is robust in the presence of outliers. If the input vector is of a POLYLINE or POLYGON VectorGeometryType, the nodes of each feature will be used to estimate the feature medoid. If the input vector is of a POINT base VectorGeometryType, the medoid will be calculated for the collection of points. While there is more than one competing method of calculating the medoid, this tool uses an algorithm that works as follows:
- The x-coordinate and y-coordinate of each point/node are placed into two arrays.
- The x- and y-coordinate arrays are then sorted and the median x-coordinate (Med X) and median y-coordinate (Med Y) are calculated.
- The point/node in the dataset that is nearest the point (Med X, Med Y) is identified as the medoid.
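The three steps above can be sketched in plain Python. This is an illustrative reimplementation under the stated description, not the tool's own code; the function name and sample coordinates are made up for the example.

```python
from statistics import median

def medoid(points):
    """Return the input point nearest to (median x, median y)."""
    if not points:
        raise ValueError("points must not be empty")
    # Steps 1-2: collect the coordinates and take the median of each axis.
    med_x = median(x for x, _ in points)
    med_y = median(y for _, y in points)
    # Step 3: the medoid is the input point closest to (Med X, Med Y).
    return min(points, key=lambda p: (p[0] - med_x) ** 2 + (p[1] - med_y) ** 2)

# One extreme outlier at (100, 100) barely affects the result.
pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0), (3.0, 2.0), (100.0, 100.0)]
print(medoid(pts))  # (2.0, 1.0)
```

For the same points the mean position is (21.2, 20.8), which is not an input point and is pulled strongly toward the outlier.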
See Also
centroid_vector
Python API
def medoid(self, input: Vector) -> Vector:
Random Points In Polygon
Function name: random_points_in_polygon
Experimental
Generates random points uniformly within input polygon geometries.
vector sampling random
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input polygon layer. | Required | polygons.shp |
| num_points | Number of random points to create. | Required | 100 |
| seed | Optional RNG seed for reproducibility. | Optional | — |
| output | Output vector path. | Required | — |
Examples
Generates random sample points inside polygon boundaries.
wbe.random_points_in_polygon(input='polygons.shp', num_points=100, output='random_points.shp')
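The manual does not say how the tool samples points, so the following is only a sketch of one common approach: rejection sampling inside the polygon's bounding box, with a seeded generator for reproducibility. The helper names and the L-shaped test polygon are hypothetical, and holes or multipart features are ignored.

```python
import random

def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (first vertex not repeated)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Count crossings of a horizontal ray extending to the right of (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def sample_points_in_polygon(ring, num_points, seed=None):
    """Draw uniform candidates in the bounding box and keep those inside the ring."""
    rng = random.Random(seed)  # a fixed seed gives a repeatable point set
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    points = []
    while len(points) < num_points:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, ring):
            points.append((x, y))
    return points

l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
sample = sample_points_in_polygon(l_shape, 100, seed=42)
print(len(sample))  # 100
```

Because candidates are uniform over the bounding box and accepted only when inside, the accepted points are uniform over the polygon area.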
Rectangular Grid From Raster Base
Function name: rectangular_grid_from_raster_base
This tool can be used to create a rectangular vector grid. The extent of the rectangular grid is based on the extent of an input base raster (base). The user may also specify the origin of the grid (xorig and yorig, defaults are 0.0) and the grid cell width and height (width and height).
See Also
rectangular_grid_from_vector_base, hexagonal_grid_from_raster_base
Python API
def rectangular_grid_from_raster_base(self, base: Raster, width: float, height: float, x_origin: float = 0.0, y_origin: float = 0.0) -> Vector:
Rectangular Grid From Vector Base
Function name: rectangular_grid_from_vector_base
This tool can be used to create a rectangular vector grid. The extent of the rectangular grid is based on the extent of an input base vector (base). The user may also specify the origin of the grid (xorig and yorig, defaults are 0.0) and the grid cell width and height (width and height).
See Also
rectangular_grid_from_raster_base, hexagonal_grid_from_vector_base
Python API
def rectangular_grid_from_vector_base(self, base: Vector, width: float, height: float, x_origin: float = 0.0, y_origin: float = 0.0) -> Vector:
Vector Hex Binning
Function name: vector_hex_binning
The practice of binning point data to form a type of 2D histogram, density plot, or what is sometimes called a heatmap, is quite useful as an alternative for the cartographic display of very dense point sets. This is particularly the case when the points experience significant overlap at the displayed scale. The PointDensity tool can be used to perform binning based on a regular grid (raster output). This tool, by comparison, bases the binning on a hexagonal grid.
The tool is similar to the CreateHexagonalVectorGrid tool; however, it instead creates an output hexagonal grid in which each hexagonal cell possesses a COUNT attribute specifying the number of points from an input points file (Shapefile vector) that are contained within the hexagonal cell.
In addition to the names of the input points file and the output Shapefile, the user must also specify the desired hexagon width (w), which is the distance between opposing sides of each hexagon. The length (s) of each side of the hexagon can then be calculated as s = w / [2 x cos(PI / 6)]. The area of each hexagon (A) is A = 3s(w / 2). The user must also specify the orientation of the grid with options of horizontal (pointy side up) and vertical (flat side up).
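As a quick check of the two formulas above, the side length and area for a given hexagon width can be computed directly. The function name is illustrative only.

```python
import math

def hexagon_side_and_area(width):
    """Side length s and area A of a regular hexagon with side-to-side width w."""
    side = width / (2.0 * math.cos(math.pi / 6.0))  # s = w / [2 x cos(PI / 6)]
    area = 3.0 * side * (width / 2.0)               # A = 3s(w / 2)
    return side, area

side, area = hexagon_side_and_area(100.0)
print(round(side, 3), round(area, 2))  # 57.735 8660.25
```

The area agrees with the standard regular-hexagon formula A = (3 x sqrt(3) / 2) x s^2, which is a useful sanity check when choosing a width for a target cell area.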
See Also
LidarHexBinning, PointDensity, CreateHexagonalVectorGrid
Python API
def vector_hex_binning(self, vector_points: Vector, width: float, orientation: str = "h") -> Vector:
Voronoi Diagram
Function name: voronoi_diagram
This tool creates a vector Voronoi diagram for a set of vector points. The Voronoi diagram is the dual graph of the Delaunay triangulation. The tool operates by first constructing the Delaunay triangulation and then connecting the circumcenters of each triangle. Each Voronoi cell contains one point of the input vector points. All locations within the cell are nearer to the contained point than any other input point.
A dense frame of 'ghost' (hidden) points is inserted around the input point set to limit the spatial extent of the diagram. The frame is set back from the bounding box of the input points by 2 x the average point spacing. The polygons of these ghost points are not output; however, points that are situated along the edges of the data will have somewhat rounded (parabolic) exterior boundaries as a result of this edge condition. If this property is unacceptable for your application, clipping the Voronoi diagram to the convex hull may be a better alternative.
This tool works on vector input data only. If a Voronoi diagram is needed to tessellate regions associated with a set of raster points, use the euclidean_allocation tool instead. To use Voronoi diagrams for gridding data (i.e. raster interpolation), use the NearestNeighbourGridding tool.
See Also
construct_vector_tin, euclidean_allocation, NearestNeighbourGridding
Python API
def voronoi_diagram(self, input_points: Vector) -> Vector:
Attribute Analysis
Add Field
Function name: add_field
Experimental
Adds a new attribute field with an optional default value.
vector schema attributes
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input vector layer. | Required | input.shp |
| field | New field name. | Required | NEW_FIELD |
| field_type | Field type: integer, float, text, boolean. | Required | float |
| default | Optional default value. | Optional | — |
| output | Output vector path. | Required | — |
Examples
Adds a typed field to the layer schema.
wbe.add_field(default=0.0, field='NEW_FIELD', field_type='float', input='input.shp', output='add_field.shp')
Add Geometry Attributes
Function name: add_geometry_attributes
Experimental
Adds area, length, perimeter, and centroid attributes to vector features.
vector attributes measurements
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input vector layer. | Required | input.shp |
| area | Include AREA field (default true). | Optional | True |
| length | Include LENGTH field (default true). | Optional | True |
| perimeter | Include PERIMETER field (default true). | Optional | True |
| centroid | Include centroid X/Y fields (default true). | Optional | True |
| output | Output vector path. | Required | — |
Examples
Adds geometry-derived attributes to each feature.
wbe.add_geometry_attributes(area=True, centroid=True, input='input.shp', length=True, output='geometry_attributes.shp', perimeter=True)
Attribute Correlation
Function name: attribute_correlation
This tool can be used to estimate the Pearson product-moment correlation coefficient (r) for each pair among a group of attributes associated with the database file of a shapefile. The r-value is a measure of the linear association in the variation of the attributes. The coefficient ranges from -1, indicating a perfect negative linear association, to 1, indicating a perfect positive linear association. An r-value of 0 indicates no linear association between the test variables.
Notice that this index is a measure of the linear association; two variables may be strongly related by a non-linear association (e.g. a power function curve) which will lead to an apparent weak association based on the Pearson coefficient. In fact, non-linear associations are very common among spatial variables, e.g. terrain indices such as slope and contributing area. In such cases, it is advisable that the input attributes be transformed prior to the estimation of the Pearson coefficient, or that an alternative, non-parametric statistic be used, e.g. the Spearman rank correlation coefficient.
The user must specify the name of the input vector Shapefile (input). Correlations will be calculated for each pair of numerical attributes contained within the input file's attribute table and presented in a correlation matrix HTML output (output).
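The difference between the Pearson coefficient and a rank-based alternative can be illustrated with a small standalone sketch. The sample data are invented, the rank helper assumes no tied values, and none of this reflects the tool's internal implementation.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Rank values from 1..n (assumes no ties, for brevity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [v ** 4 for v in x]  # perfectly monotonic but strongly non-linear

print(round(pearson_r(x, y), 3))  # 0.896
print(spearman_rho(x, y))         # 1.0
```

Here the relationship is perfectly monotonic, so the rank-based statistic reports 1.0 while the Pearson coefficient understates the strength of the association.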
See Also
image_correlation, attribute_scattergram, attribute_histogram
Python API
def attribute_correlation(self, input: Vector, output_html_file: str) -> None:
Attribute Histogram
Function name: attribute_histogram
This tool can be used to create a histogram, which is a graph displaying the frequency distribution of data, for the values contained in a field of an input vector's attribute table. The user must specify the name of an input vector (input) and the name of one of the fields (field) contained in the associated attribute table. The tool output (output) is an HTML formatted histogram analysis report. If the specified field is non-numerical, the tool will produce a bar-chart of class frequency, similar to the tabular output of the list_unique_values tool.
See Also
list_unique_values, raster_histogram
Python API
def attribute_histogram(self, input: Vector, field_name: str, output_html_file: str) -> None:
Attribute Scattergram
Function name: attribute_scattergram
This tool can be used to create a scattergram for two numerical fields (fieldx and fieldy) contained within an input vector's attribute table (input). The user must specify the name of an input shapefile and the names of two of the fields contained in the associated attribute table. The tool output (output) is an HTML formatted report containing a graphical scattergram plot.
See Also
attribute_histogram, attribute_correlation
Python API
def attribute_scattergram(self, input: Vector, field_name_x: str, field_name_y: str, output_html_file: str, add_trendline: bool = False) -> None:
Delete Field
Function name: delete_field
Experimental
Deletes one or more attribute fields from a vector layer.
vector schema attributes
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input vector layer. | Required | input.shp |
| fields | Comma-delimited field names to delete. | Required | FIELD_A,FIELD_B |
| output | Output vector path. | Required | — |
Examples
Removes selected fields from a layer schema.
wbe.delete_field(fields='FIELD_A,FIELD_B', input='input.shp', output='fields_deleted.shp')
Extract By Attribute
Function name: extract_by_attribute
This tool extracts features from an input vector into an output file based on attribute properties. The user must specify the name of the input (--input) and output (--output) files, along with the filter statement (--statement). The conditional statement is a single-line logical condition containing one or more attribute variables contained in the file's attribute table that evaluates to TRUE/FALSE. In addition to the common comparison and logical operators, i.e. < > <= >= == (EQUAL TO) != (NOT EQUAL TO) || (OR) && (AND), conditional statements may contain any valid mathematical operation and the null value.
| Identifier | Argument Amount | Argument Types | Description |
|---|---|---|---|
| min | >= 1 | Numeric | Returns the minimum of the arguments |
| max | >= 1 | Numeric | Returns the maximum of the arguments |
| len | 1 | String/Tuple | Returns the character length of a string, or the amount of elements in a tuple (not recursively) |
| floor | 1 | Numeric | Returns the largest integer less than or equal to a number |
| round | 1 | Numeric | Returns the nearest integer to a number. Rounds half-way cases away from 0.0 |
| ceil | 1 | Numeric | Returns the smallest integer greater than or equal to a number |
| if | 3 | Boolean, Any, Any | If the first argument is true, returns the second argument, otherwise, returns the third |
| contains | 2 | Tuple, any non-tuple | Returns true if second argument exists in first tuple argument. |
| contains_any | 2 | Tuple, Tuple of any non-tuple | Returns true if one of the values in the second tuple argument exists in first tuple argument. |
| typeof | 1 | Any | Returns "string", "float", "int", "boolean", "tuple", or "empty" depending on the type of the argument |
| math::is_nan | 1 | Numeric | Returns true if the argument is the floating-point value NaN, false if it is another floating-point value, and throws an error if it is not a number |
| math::is_finite | 1 | Numeric | Returns true if the argument is a finite floating-point number, false otherwise |
| math::is_infinite | 1 | Numeric | Returns true if the argument is an infinite floating-point number, false otherwise |
| math::is_normal | 1 | Numeric | Returns true if the argument is a floating-point number that is neither zero, infinite, subnormal, or NaN, false otherwise |
| math::ln | 1 | Numeric | Returns the natural logarithm of the number |
| math::log | 2 | Numeric, Numeric | Returns the logarithm of the number with respect to an arbitrary base |
| math::log2 | 1 | Numeric | Returns the base 2 logarithm of the number |
| math::log10 | 1 | Numeric | Returns the base 10 logarithm of the number |
| math::exp | 1 | Numeric | Returns e^(number), (the exponential function) |
| math::exp2 | 1 | Numeric | Returns 2^(number) |
| math::pow | 2 | Numeric, Numeric | Raises a number to the power of the other number |
| math::cos | 1 | Numeric | Computes the cosine of a number (in radians) |
| math::acos | 1 | Numeric | Computes the arccosine of a number. The return value is in radians in the range [0, pi] or NaN if the number is outside the range [-1, 1] |
| math::cosh | 1 | Numeric | Hyperbolic cosine function |
| math::acosh | 1 | Numeric | Inverse hyperbolic cosine function |
| math::sin | 1 | Numeric | Computes the sine of a number (in radians) |
| math::asin | 1 | Numeric | Computes the arcsine of a number. The return value is in radians in the range [-pi/2, pi/2] or NaN if the number is outside the range [-1, 1] |
| math::sinh | 1 | Numeric | Hyperbolic sine function |
| math::asinh | 1 | Numeric | Inverse hyperbolic sine function |
| math::tan | 1 | Numeric | Computes the tangent of a number (in radians) |
| math::atan | 1 | Numeric | Computes the arctangent of a number. The return value is in radians in the range [-pi/2, pi/2] |
| math::atan2 | 2 | Numeric, Numeric | Computes the four quadrant arctangent in radians |
| math::tanh | 1 | Numeric | Hyperbolic tangent function |
| math::atanh | 1 | Numeric | Inverse hyperbolic tangent function. |
| math::sqrt | 1 | Numeric | Returns the square root of a number. Returns NaN for a negative number |
| math::cbrt | 1 | Numeric | Returns the cube root of a number |
| math::hypot | 2 | Numeric | Calculates the length of the hypotenuse of a right-angle triangle given legs of length given by the two arguments |
| math::abs | 1 | Numeric | Returns the absolute value of a number, returning an integer if the argument was an integer, and a float otherwise |
| str::regex_matches | 2 | String, String | Returns true if the first argument matches the regex in the second argument (Requires regex_support feature flag) |
| str::regex_replace | 3 | String, String, String | Returns the first argument with all matches of the regex in the second argument replaced by the third argument (Requires regex_support feature flag) |
| str::to_lowercase | 1 | String | Returns the lower-case version of the string |
| str::to_uppercase | 1 | String | Returns the upper-case version of the string |
| str::trim | 1 | String | Strips whitespace from the start and the end of the string |
| str::from | >= 0 | Any | Returns passed value as string |
| bitand | 2 | Int | Computes the bitwise and of the given integers |
| bitor | 2 | Int | Computes the bitwise or of the given integers |
| bitxor | 2 | Int | Computes the bitwise xor of the given integers |
| bitnot | 1 | Int | Computes the bitwise not of the given integer |
| shl | 2 | Int | Computes the given integer bitwise shifted left by the other given integer |
| shr | 2 | Int | Computes the given integer bitwise shifted right by the other given integer |
| random | 0 | Empty | Return a random float between 0 and 1. Requires the rand feature flag. |
| pi | 0 | Empty | Return the value of the PI constant. |
The following are examples of valid conditional statements:
`HEIGHT >= 300.0
CROP == "corn"
(ELEV >= 525.0) && (HGT_AB_GR <= 5.0)
math::ln(CARBON) > 1.0
VALUE == null `
Python API
def extract_by_attribute(self, input: Vector, statement: str) -> Vector:
Field Calculator
Function name: field_calculator
Experimental
Calculates or updates a field value from SQL-style or expression-style formulas using feature attributes and geometry variables.
vector attributes expression
Parameters
NameDescriptionRequiredDefault
inputInput vector layer.Requiredinput.shp
fieldOutput field name.Requiredscore
field_typeOutput field type: float, integer, text.Optionalfloat
expressionExpression or SQL-style UPDATE assignment evaluated per feature.RequiredVALUE * 2.0 + $area
overwriteOverwrite existing field if present (default true).OptionalTrue
preview_rowsOptional number of preview rows to return in payload.Optional0
outputOutput vector path. Optional when preview_rows > 0.Optional—
Examples
Computes a derived numeric field using attributes and geometry.
wbe.field_calculator(expression='VALUE * 2.0 + $area', field='score', field_type='float', input='input.shp', output='field_calc.shp', overwrite=True)
SQL-style conditional update using CASE and UPDATE wrapper.
wbe.field_calculator(input='roads.gpkg', field='SPEED', field_type='integer', expression="UPDATE roads SET SPEED = CASE WHEN TYPE == 'motorway' THEN 100 WHEN TYPE == 'primary' THEN 80 ELSE 60 END", overwrite=True, output='roads_speed.gpkg')
Preview-only evaluation for first 10 rows (no output write).
wbe.field_calculator(input='roads.gpkg', field='SPEED', field_type='integer', expression="CASE TYPE WHEN 'motorway' THEN 100 ELSE 60 END", overwrite=True, preview_rows=10)
Filter Vector Features By Area
Function name: filter_vector_features_by_area
Experimental
Filters polygon features below a minimum area threshold.
vector gis filter polygon legacy-port
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input polygon vector layer. | Required | polygons.shp |
| threshold | Minimum polygon area to retain, in layer coordinate units squared. | Required | 1000.0 |
| output | Output vector path. | Required | — |
Examples
Removes polygons smaller than the specified area threshold.
wbe.filter_vector_features_by_area(input='polygons.shp', output='filtered_polygons.shp', threshold=1000.0)
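The idea behind the area filter can be sketched with the shoelace formula for planar polygon area. This is a simplified illustration, not the tool's implementation: it ignores holes and multipart features, and it assumes a polygon whose area exactly equals the threshold is retained.

```python
def polygon_area(ring):
    """Planar area of a simple polygon ring via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0  # abs() makes the result independent of winding order

def filter_by_area(polygons, threshold):
    """Keep rings whose area is at least the threshold (inclusive by assumption)."""
    return [ring for ring in polygons if polygon_area(ring) >= threshold]

square = [(0, 0), (40, 0), (40, 40), (0, 40)]  # 1600 square units
sliver = [(0, 0), (100, 0), (100, 2), (0, 2)]  # 200 square units
kept = filter_by_area([square, sliver], threshold=1000.0)
print(len(kept))  # 1
```

Because the threshold is expressed in squared layer coordinate units, the same numeric value means very different things in a projected CRS (metres) and a geographic CRS (degrees).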
List Unique Values
Function name: list_unique_values
This tool can be used to list each of the unique values contained within a categorical field of an input vector file's attribute table. The tool outputs an HTML formatted report (output) containing a table of the unique values and their frequency of occurrence within the data. The user must specify the name of an input shapefile (input) and the name of one of the fields (field) contained in the associated attribute table. The specified field should not contain floating-point numerical data, since the number of categories will likely equal the number of records, which may be quite large. The tool effectively provides tabular output that is similar to the graphical output provided by the attribute_histogram tool, which, however, can be applied to continuous data.
See Also
attribute_histogram
Python API
def list_unique_values(self, input: Vector, field_name: str) -> Tuple[str, int]:
Rename Field
Function name: rename_field
Experimental
Renames an attribute field in a vector layer.
vector schema attributes
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input vector layer. | Required | input.shp |
| field | Existing field name. | Required | OLD_NAME |
| new_field | Replacement field name. | Required | NEW_NAME |
| output | Output vector path. | Required | — |
Examples
Renames one attribute field.
wbe.rename_field(field='OLD_NAME', input='input.shp', new_field='NEW_NAME', output='renamed.shp')
Spatial Statistics
Global Morans I
Function name: global_morans_i
No help documentation available for this tool.
Getis Ord Gi Star
Function name: getis_ord_gi_star
No help documentation available for this tool.
Local Morans I Lisa
Function name: local_morans_i_lisa
No help documentation available for this tool.
Nearest Neighbour Index
Function name: nearest_neighbour_index
No help documentation available for this tool.
Quadrat Count Test
Function name: quadrat_count_test
No help documentation available for this tool.
Estimate Variogram
Function name: estimate_variogram
No help documentation available for this tool.
Fit Variogram
Function name: fit_variogram
No help documentation available for this tool.
Directional Variogram
Function name: directional_variogram
No help documentation available for this tool.
Kriging Cross Validation
Function name: kriging_cross_validation
No help documentation available for this tool.
Spatial Lag Regression
Function name: spatial_lag_regression
No help documentation available for this tool.
Spatial Error Regression
Function name: spatial_error_regression
No help documentation available for this tool.
Geographically Weighted Regression
Function name: geographically_weighted_regression
No help documentation available for this tool.
Ripleys K Test
Function name: ripleys_k_test
No help documentation available for this tool.
Envelope Test
Function name: envelope_test
No help documentation available for this tool.
Point Process Residuals
Function name: point_process_residuals
No help documentation available for this tool.
Ripleys K Function
Function name: ripleys_k_function
No help documentation available for this tool.
Point Pattern Envelope
Function name: point_pattern_envelope
No help documentation available for this tool.
Inhomogeneous Baseline
Function name: inhomogeneous_baseline
No help documentation available for this tool.
Hotspot Vs Process
Function name: hotspot_vs_process
No help documentation available for this tool.
Point Process Residuals Comparison
Function name: point_process_residuals_comparison
No help documentation available for this tool.
Online Data
Download OSM Vector
Function name: download_osm_vector
No help documentation available for this tool.
Workflow Products
Utility Corridor Encroachment And Access Planning
Function name: utility_corridor_encroachment_and_access_planning
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Utility Corridor Access Planning
Who It Is For
- Utility corridor maintenance planners and linear-infrastructure field operations teams.
- Vegetation/encroachment risk analysts coordinating access logistics.
Primary User
Transmission/distribution utilities and corridor maintenance operations.
What It Does
- Identifies encroachment hotspots that fall within a configurable corridor influence distance.
- Scores hotspot risk based on proximity to corridor centerlines.
- Assigns nearest access points for field-response feasibility.
- Produces ranked hotspot CSV plus planning summary JSON for operations teams.
- Adds dispatch-ready priority bands, SLA guidance, and optional response queue export.
How It Works
- Loads corridor line geometry, encroachment observations, and access points.
- Converts corridor geometry to line segments and computes minimum point-to-segment distance for each encroachment.
- Keeps in-range encroachments as hotspots using the corridor_influence_distance threshold.
- Computes risk score as inverse-distance to corridor (closer = higher risk).
- Computes access score from nearest access-point distance and combines with risk into priority score.
- Classifies each hotspot into critical, high, medium, or low response bands and assigns response SLA hours.
- Emits hotspot vector with attributes: ENC_FID, DIST_CORR, RISK_SCORE, ACCESS_FID, ACCESS_DIST, PRIORITY, PRIOR_BAND, SLA_HOURS, ACCESS_DIFF.
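The distance and hotspot-selection steps above can be sketched as follows. The minimum point-to-segment distance is standard geometry; the linear risk_score function, however, is a hypothetical stand-in, since the workflow's actual inverse-distance formula is not documented here. All coordinates are invented.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment's line and clamp to the segment ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_corridor(p, segments):
    """Minimum distance from p to any corridor segment."""
    return min(point_segment_distance(p, a, b) for a, b in segments)

def risk_score(dist, influence_distance=30.0):
    """Hypothetical linear score in [0, 1]: closer means higher (not the tool's formula)."""
    return max(0.0, 1.0 - dist / influence_distance)

segments = [((0.0, 0.0), (100.0, 0.0)), ((100.0, 0.0), (100.0, 100.0))]
encroachments = [(50.0, 6.0), (110.0, 50.0), (50.0, 45.0)]

hotspots = []
for fid, p in enumerate(encroachments):
    d = distance_to_corridor(p, segments)
    if d <= 30.0:  # corridor_influence_distance default from the Inputs table
        hotspots.append((fid, d, risk_score(d)))

print([fid for fid, _, _ in hotspots])  # [0, 1]
```

The third observation lies 45 units from the nearest segment, outside the 30-unit influence distance, so it is not retained as a hotspot.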
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| corridors | Line vector path | Required | Corridor centerline layer |
| encroachments | Vector path | Required | Encroachment observations (point/line/polygon; representative point sampled) |
| access_points | Point vector path | Required | Field access points |
| corridor_influence_distance | float | Optional | Max distance from corridor to retain as hotspot (default 30.0) |
| high_risk_distance | float | Optional | Distance considered highest-risk zone (default 10.0) |
| hotspots | vector path | Required | Output hotspot vector |
| priority_csv | path | Required | Output ranked hotspot CSV |
| planning_report | path | Required | Output planning summary JSON |
| response_queue_csv | path | Optional | Optional dispatch-ready response queue with SLA guidance |
Outputs
| Output | Type | Contents |
|---|---|---|
| hotspots | Vector | Hotspot points with risk/access/priority attributes plus corridor lineage fields (CORR_FID, SEG_IDX, LINEAGE_ID) |
| priority_csv | CSV | Ranked hotspots: rank, enc_fid, corridor_fid, segment_idx, dist_to_corridor, risk_score, nearest_access_fid, access_dist, priority_score, priority_band, response_sla_hours, access_difficulty, lineage_id |
| planning_report | JSON | Counts, averages, thresholds, counts by priority band, and top hotspot summary |
| response_queue_csv | CSV | Response queue with priority band, SLA target, access difficulty, recommended action, and lineage_id |
Python Example
`env = WbEnvironment(license_tier="pro")
result = env.run_tool(
    "utility_corridor_encroachment_and_access_planning",
    corridors="corridor_centerlines.gpkg",
    encroachments="encroachment_points.gpkg",
    access_points="field_access_points.gpkg",
    corridor_influence_distance=30.0,
    high_risk_distance=10.0,
    hotspots="output/corridor_hotspots.gpkg",
    priority_csv="output/corridor_priority.csv",
    planning_report="output/corridor_planning_report.json",
    response_queue_csv="output/corridor_response_queue.csv",
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Parcel And Land Fabric Topology Compliance Workflow
Function name: parcel_and_land_fabric_topology_compliance_workflow
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Parcel Fabric Topology Compliance
Who It Is For
- Local government cadastral teams and parcel QA workflows.
- Land administration vendors with regulatory topology compliance requirements.
Primary User
Municipal cadastral programs and land administration platforms.
What It Does
- Audits parcel fabrics for topology compliance using rule-based checks.
- Flags parcel slivers below a configurable minimum area threshold.
- Produces violations vector, issues CSV, and a compliance summary JSON.
- Optionally runs topology auto-fix and outputs corrected parcel geometry.
- Supports jurisdiction templates (generic, ontario_mpac) for calibrated defaults.
- Optionally emits a remediation queue CSV with prioritized corrective actions.
- Emits sliver-threshold calibration diagnostics so parcel fabrics can be profiled before tightening production thresholds.
How It Works
- Runs topology_validation_report to produce per-feature topology issue CSV.
- Runs topology_rule_validate with parcel-focused rules (polygon_must_not_overlap, polygon_must_not_have_gaps).
- Performs additional sliver detection on polygon area against the min_sliver_area threshold.
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| parcels | Polygon vector path | Required | Input parcel polygon layer |
| min_sliver_area | float | Optional | Area threshold for sliver detection (default 1.0) |
| auto_fix | bool | Optional | Run topology auto-fix and emit corrected output (default false) |
| jurisdiction_template | string | Optional | Rule-template preset (generic or ontario_mpac) used for calibrated defaults |
| topology_violations | vector path | Required | Output topology violations layer |
| issues_csv | path | Required | Output topology validation CSV |
| compliance_report | path | Required | Output compliance summary JSON |
| corrected_parcels | vector path | Optional | Optional corrected parcel output path when auto_fix=true |
| remediation_queue_csv | path | Optional | Optional prioritized remediation action queue CSV |
| html_report | path | Optional | Optional output HTML path for the compliance dashboard report |
Outputs
| Output | Type | Contents |
|---|---|---|
| topology_violations | Vector | Rule violations generated by topology rule validation |
| issues_csv | CSV | Per-feature topology issue report from topology validation |
| compliance_report | JSON | Summary counts by rule, sliver diagnostics, sliver calibration profile, autofix summary, pass/fail |
| corrected_parcels | Vector | Auto-fix output when enabled and path provided |
| remediation_queue_csv | CSV | Priority-ranked remediation actions by issue type/rule |
| html_report | HTML | Optional compliance dashboard report with visual summary |
Python Example
`env = WbEnvironment(license_tier="pro")
result = env.run_tool(
    "parcel_and_land_fabric_topology_compliance_workflow",
    parcels="parcel_fabric.gpkg",
    min_sliver_area=1.0,
    jurisdiction_template="ontario_mpac",
    auto_fix=True,
    topology_violations="output/parcel_violations.gpkg",
    issues_csv="output/parcel_issues.csv",
    compliance_report="output/parcel_compliance.json",
    corrected_parcels="output/parcel_corrected.gpkg",
    remediation_queue_csv="output/parcel_remediation_queue.csv",
    html_report="output/parcel_compliance_report.html",
)
print(result)`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Network Analysis
Network analysis in WbW-QGIS spans both transportation and hydrologic networks. This chapter is aligned with the Python and R manuals and now covers three common tracks:
- Transportation routing and service areas
- OD and nearest-facility analysis
- Stream-network hierarchy and connectivity
Capability Note (Open Tier)
Whitebox open tier provides advanced network tools directly in the QGIS plugin, including shortest path, k-shortest alternatives, service areas, OD matrices, closest facility, location-allocation, and multimodal OD/routes. Advanced impedance controls include one-way directionality, turn/u-turn penalties, optional node-entry costs, and optional temporal cost profiles.
Core Concepts You Should Know First
- Network: A graph of edges (line segments) and nodes (junctions/endpoints).
- Cost or impedance: Value minimized by routing (distance, minutes, or other weighted friction).
- OD pair: Origin and destination used in path queries.
- Service area: All network locations reachable under a cost budget.
- Closest facility: Nearest destination by network cost, not straight-line distance.
- Connectivity: Whether all required features are in connected components.
- Directed network: Edge direction matters (one-way roads, downstream streams).
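To make the service-area and closest-facility concepts concrete, here is a minimal sketch using Dijkstra's algorithm on a small hand-built directed graph. The graph, node names, and function name are illustrative assumptions and are unrelated to how the Whitebox network tools are implemented.

```python
import heapq

def network_costs(graph, source, budget=float("inf")):
    """Dijkstra: least cost from source to every node reachable within the budget."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost <= budget and new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return best

# Directed graph: each entry lists (neighbour, cost) for edges leaving the node.
graph = {
    "A": [("B", 4.0), ("C", 2.0)],
    "B": [("A", 4.0), ("D", 5.0)],
    "C": [("A", 2.0), ("B", 1.0), ("D", 8.0)],
    "D": [("B", 5.0), ("C", 8.0)],
}

# Service area: every node reachable from A within a cost budget of 5.
service_area = network_costs(graph, "A", budget=5.0)
print(sorted(service_area.items()))  # [('A', 0.0), ('B', 3.0), ('C', 2.0)]

# Closest facility: the candidate with the lowest network cost from A.
all_costs = network_costs(graph, "A")
facilities = ["B", "D"]
closest = min(facilities, key=lambda f: all_costs.get(f, float("inf")))
print(closest)  # B
```

Note that B is reached more cheaply through C (cost 3) than directly (cost 4), which is why network cost rather than a single edge or straight-line distance determines the result.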
Typical Inputs
| Layer | Format | Notes |
|---|---|---|
| roads.shp | Polyline vector | Cleaned road centerlines |
| facilities.shp | Point vector | Hospitals, depots, schools, etc. |
| demand_points.shp | Point vector | Incidents, customers, or population centroids |
| streams.tif | Raster | Binary stream raster for hydrologic hierarchy |
| d8_pointer.tif | Raster | D8 flow-direction raster |
Workflow A: Transportation Network Preparation
Step 1 - Topology QA and Geometry Cleanup
Use standard QGIS cleanup first:
- Check validity
- Snap Geometries to Layer
- Fix Geometries
Then enrich network attributes with Whitebox tools:
Processing Toolbox -> Whitebox Workflows -> Vector Analysis -> Add Geometry Attributes
This provides segment length fields needed for distance-based routing.
If travel-time routing is required, use Field Calculator to compute a time field such as:
- TIME_MIN = LENGTH_M / SPEED_M_PER_MIN
where SPEED_M_PER_MIN = SPEED_KMH × 1000 / 60 when speeds are stored in km/h.
Step 2 - Build Cost-Aware Road Layer
Recommended fields:
- LENGTH_M (meters)
- SPEED_KMH (if available)
- TIME_MIN (derived)
- ONEWAY (optional directional control)
Use this prepared layer as the routing network for Whitebox network-analysis tools in the Processing Toolbox.
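The TIME_MIN derivation can be checked outside QGIS with a few lines of Python. This is only a sketch of the unit arithmetic; the function name is hypothetical, and it assumes lengths in meters and speeds in km/h as in the recommended fields above.

```python
def travel_time_minutes(length_m, speed_kmh):
    """Convert a segment length (m) and speed (km/h) into travel time in minutes."""
    if speed_kmh <= 0:
        raise ValueError("speed_kmh must be positive")
    speed_m_per_min = speed_kmh * 1000.0 / 60.0
    return length_m / speed_m_per_min

# A 1500 m segment at 30 km/h (500 m per minute).
time_min = travel_time_minutes(1500.0, 30.0)
```

Segments with a missing or zero speed need a fallback value before the division, otherwise the cost field will contain nulls or errors that routing cannot use.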
Step 2.5 - Build Network Topology and Snap Points (Optional)
If your network lacks proper node structure or you need to snap facility/demand points to the network:
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Build Network Topology
| Parameter | Value |
|---|---|
| Input vector | roads_prepared.shp |
| Snap tolerance | 0.5 |
| Output | roads_noded.shp |
| Output nodes | network_nodes.shp |
Then snap your facilities and demand points:
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Snap Points to Network
| Parameter | Value |
|---|---|
| Network layer | roads_noded.shp |
| Points layer | fire_stations.shp |
| Snap distance | 50.0 (meters) |
| Output | fire_stations_snapped.shp |
Output includes SNAP_DIST (offset to network) for diagnostics.
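The snapping idea can be sketched as a nearest-node search with a distance limit. This illustration assumes points snap to network nodes only and that points beyond the limit are reported as unsnapped; the actual tool may snap to edges or handle out-of-range points differently. All names other than SNAP_DIST are hypothetical.

```python
import math

def snap_points(points, nodes, max_snap_distance):
    """Assign each point to its nearest node and record the offset as SNAP_DIST."""
    results = []
    for pid, (px, py) in points.items():
        best_node, best_dist = None, float("inf")
        for nid, (nx, ny) in nodes.items():
            d = math.hypot(px - nx, py - ny)
            if d < best_dist:
                best_node, best_dist = nid, d
        # Assumption: points farther than the limit stay unsnapped (node is None).
        snapped_node = best_node if best_dist <= max_snap_distance else None
        results.append({"id": pid, "node": snapped_node, "SNAP_DIST": best_dist})
    return results

nodes = {"n1": (0.0, 0.0), "n2": (100.0, 0.0)}
points = {"station_a": (3.0, 4.0), "station_b": (100.0, 80.0)}
snapped = snap_points(points, nodes, max_snap_distance=50.0)
```

Sorting the result by SNAP_DIST is a quick way to find facilities that sit far from the network and may need manual review.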
Workflow B: Routing, Service Areas, and Closest Facility
Intersection Delay / Node Cost Modeling
For Whitebox network tools that support advanced impedance (for example service-area and closest-facility workflows), you can include node-entry costs to model intersection delay:
- node_cost_points: point layer of intersection/gate delay observations.
- node_cost_field: numeric field in node_cost_points with non-negative delay/cost values.
- node_cost_snap_distance: optional max assignment distance from each node-cost point to a network node.
Practical pattern:
- Build/snap a clean network first.
- Prepare an intersection delay points layer (signals, crossings, gates).
- Run network tools with both edge impedance and node-cost parameters.
When node-cost parameters are omitted, routing uses edge impedance only.
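One way to picture node-entry costs is as an extra charge added whenever a route enters a node. The sketch below applies that idea in a small Dijkstra search. It assumes the delay is added on entry to the node and is illustrative only; the graph layout and function name are hypothetical.

```python
import heapq

def route_cost(graph, node_cost, origin, destination):
    """Least-cost search where entering a node adds node_cost.get(node, 0.0)."""
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == destination:
            return dist
        if dist > best.get(node, float("inf")):
            continue
        for nxt, edge_cost in graph.get(node, []):
            # Edge impedance plus the delay for entering the next node.
            cand = dist + edge_cost + node_cost.get(nxt, 0.0)
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(queue, (cand, nxt))
    return None

graph = {
    "A": [("B", 1.0), ("C", 2.0)],
    "B": [("D", 1.0)],
    "C": [("D", 1.0)],
    "D": [],
}
edge_only = route_cost(graph, {}, "A", "D")           # A -> B -> D
with_delay = route_cost(graph, {"B": 1.5}, "A", "D")  # signal delay at B shifts the route to C
```

With an empty node-cost mapping the search reduces to edge impedance only, which matches the behavior described when node-cost parameters are omitted.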
Step 3 - Shortest Path and K-Shortest Alternatives
Processing Toolbox -> Whitebox Workflows -> Network Analysis:
- Shortest Path Network
- K Shortest Paths Network
Recommended parameters:
| Parameter | Example |
|---|---|
| Input network | roads_prepared.shp |
| Start / End | route endpoints or point features |
| Edge cost field | TIME_MIN |
| One-way field | ONEWAY |
| Turn penalty | 0.3 to 0.8 (minutes) |
| U-turn penalty | 2.0 to 4.0 (minutes) |
Use k-shortest outputs when teaching resilience, alternate routing, or choice-model concepts.
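For teaching purposes, k-shortest alternatives can be demonstrated on a toy graph by enumerating every simple path and keeping the k cheapest. This brute-force sketch is not the algorithm used by the tool and only scales to very small networks; names are hypothetical.

```python
def k_shortest_simple_paths(graph, start, end, k):
    """Enumerate simple (loop-free) paths by depth-first search and return the k cheapest."""
    found = []

    def walk(node, path, cost):
        if node == end:
            found.append((cost, list(path)))
            return
        for nxt, edge_cost in graph.get(node, []):
            if nxt not in path:  # keep the path simple
                path.append(nxt)
                walk(nxt, path, cost + edge_cost)
                path.pop()

    walk(start, [start], 0.0)
    found.sort(key=lambda item: item[0])
    return found[:k]

graph = {
    "A": [("B", 1.0), ("C", 2.0)],
    "B": [("C", 0.5), ("D", 3.0)],
    "C": [("D", 1.0)],
    "D": [],
}
alternatives = k_shortest_simple_paths(graph, "A", "D", k=3)
```

Comparing the cost gap between the first and later alternatives gives a simple resilience indicator: a small gap means a closure on the best route has little effect.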
Step 4 - Service Area (Isochrone)
Processing Toolbox -> Whitebox Workflows -> Network Analysis ->
Network Service Area
Recommended parameters:
| Parameter | Example |
|---|---|
| Network layer | roads_prepared.shp |
| Origins | facilities.shp |
| Max cost | 5.0 (minutes) or 3000 (meters) |
| Output mode | polygon or edges |
| Polygon merge origins | true/false |
| Edge cost field | TIME_MIN |
| One-way field | ONEWAY |
Advanced options (recommended for realistic urban travel times):
- node_cost_points, node_cost_field, node_cost_snap_distance
- turn_penalty, u_turn_penalty, forbid_u_turns
- temporal_cost_profile, departure_time, temporal_mode
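Conceptually, a service area is the set of network locations whose least cost from an origin stays within the budget. The sketch below computes that set at node level only; turning the result into polygons or edge subsets, as the tool's output modes do, is out of scope here. The graph and function name are hypothetical.

```python
import heapq

def service_area_nodes(graph, origin, max_cost):
    """Return {node: cost} for every node reachable from origin within max_cost."""
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best[node]:
            continue
        for nxt, edge_cost in graph.get(node, []):
            cand = dist + edge_cost
            # Prune anything beyond the budget (valid because costs are non-negative).
            if cand <= max_cost and cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(queue, (cand, nxt))
    return best

graph = {
    "station": [("a", 2.0), ("b", 4.0)],
    "a": [("c", 3.5)],
    "b": [("c", 3.0)],
    "c": [],
}
reachable = service_area_nodes(graph, "station", max_cost=5.0)
```

Node c is excluded because its cheapest route costs 5.5, just over the 5.0 budget. Small changes in the cost field or budget can therefore move the service-area boundary noticeably.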
Step 5 - Closest Facility and OD Matrices
Processing Toolbox -> Whitebox Workflows -> Network Analysis:
- Closest Facility Network
- Network OD Cost Matrix
- Network Routes From OD
Use these for assignment, accessibility summaries, and route materialization. OD matrix output is ideal for downstream tabular analysis (Python/R/pandas).
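As an example of downstream tabular analysis, the sketch below picks the lowest-cost destination for each origin from OD rows. The column names origin_id, destination_id, and cost are assumptions for illustration; check the header of the CSV your run actually produces before adapting it.

```python
def closest_facility(od_rows):
    """Return {origin: (destination, cost)} keeping the cheapest destination per origin."""
    best = {}
    for row in od_rows:
        origin, cost = row["origin_id"], float(row["cost"])
        if origin not in best or cost < best[origin][1]:
            best[origin] = (row["destination_id"], cost)
    return best

# Rows shaped like csv.DictReader output (all values are strings); data is made up.
od_rows = [
    {"origin_id": "incident_1", "destination_id": "hospital_a", "cost": "7.5"},
    {"origin_id": "incident_1", "destination_id": "hospital_b", "cost": "4.0"},
    {"origin_id": "incident_2", "destination_id": "hospital_a", "cost": "3.2"},
    {"origin_id": "incident_2", "destination_id": "hospital_b", "cost": "9.1"},
]
nearest = closest_facility(od_rows)
```

The same rows can be loaded into pandas or R for summaries such as mean cost per facility or the share of origins within a target cost.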
Workflow C: Location-Allocation and Accessibility
Processing Toolbox -> Whitebox Workflows -> Network Analysis:
- Location Allocation Network
- Compute Network Accessibility
Recommended location-allocation pattern:
- Prepare candidate sites and weighted demand points.
- Select solver mode (minimize_impedance, maximize_coverage, or maximize_attendance).
- Compare static-cost and peak-period temporal-profile runs.
Recommended accessibility pattern:
- Provide origins and destination opportunities.
- Set impedance cutoff and decay function.
- Map/compare resulting accessibility scores across scenarios.
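The role of the cutoff and decay function can be illustrated with a simple gravity-style score. This sketch assumes a negative-exponential decay, which is one common choice; the decay functions offered by the tool are not listed here, so treat the formula and parameter names as assumptions.

```python
import math

def accessibility_score(costs_and_opportunities, cutoff, beta):
    """Sum opportunities within the cutoff, discounted by exp(-beta * cost)."""
    score = 0.0
    for cost, opportunities in costs_and_opportunities:
        if cost <= cutoff:
            score += opportunities * math.exp(-beta * cost)
    return score

# (network cost in minutes, opportunity count) for one origin; values are made up.
destinations = [(0.0, 10.0), (5.0, 20.0), (30.0, 100.0)]
score = accessibility_score(destinations, cutoff=15.0, beta=0.1)
```

The large destination at 30 minutes contributes nothing because it lies beyond the cutoff. Raising the cutoff or lowering beta therefore changes scores most for origins with large but distant opportunities.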
Workflow D: Multimodal Network Analysis
Processing Toolbox -> Whitebox Workflows -> Network Analysis:
- Multimodal Shortest Path
- Multimodal OD Cost Matrix
- Multimodal Routes From OD
Requirements:
- mode field on network edges (e.g., walk/bus/rail)
- allowed-modes and transfer-penalty configuration
- optional temporal profile for schedule/peak scenarios
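To show how mode speeds and transfer penalties interact, the sketch below prices a fixed sequence of segments. It assumes cost is length divided by mode speed plus one penalty per mode change; the function name and data are hypothetical and the tool's internal cost model may differ.

```python
def multimodal_route_cost(segments, mode_speed, transfer_penalty, allowed_modes=None):
    """Price a route given (length, mode) segments, per-mode speeds, and a transfer penalty."""
    total = 0.0
    previous_mode = None
    for length, mode in segments:
        if allowed_modes is not None and mode not in allowed_modes:
            raise ValueError(f"mode {mode!r} is not allowed")
        total += length / mode_speed[mode]
        if previous_mode is not None and mode != previous_mode:
            total += transfer_penalty  # charged each time the mode changes
        previous_mode = mode
    return total

# Walk to the stop, ride the bus, walk to the destination (lengths in map units).
segments = [(140.0, "walk"), (1600.0, "bus"), (70.0, "walk")]
cost = multimodal_route_cost(segments, {"walk": 1.4, "bus": 8.0}, transfer_penalty=30.0)
```

Here the two transfers add 60 cost units to 350 units of in-motion time, so the penalty value has a strong influence on whether short transit hops are worthwhile.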
Workflow E: Hydrologic Stream Networks
Hydrologic network tools remain an important part of network analysis and are included here as a dedicated sub-workflow rather than the entire chapter.
Step 6 - Stream Hierarchy
Processing Toolbox -> Whitebox Workflows -> Spatial Hydrology:
- Strahler Stream Order
- Shreve Stream Magnitude
- Hack Stream Order
These tools characterize stream position and downstream accumulation.
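The ordering rules can be illustrated on a small link table. The Whitebox tools operate on stream and D8 pointer rasters, so this sketch, which uses a hypothetical mapping from each stream link to its downstream link, only demonstrates the Strahler and Shreve rules themselves.

```python
def stream_orders(downstream):
    """downstream maps each link id to the link it drains into (None at the outlet)."""
    upstream = {link: [] for link in downstream}
    for link, target in downstream.items():
        if target is not None:
            upstream[target].append(link)

    strahler, shreve = {}, {}

    def visit(link):
        tributaries = upstream[link]
        if not tributaries:
            # Headwater links start at order 1 and magnitude 1.
            strahler[link], shreve[link] = 1, 1
            return
        for t in tributaries:
            visit(t)
        orders = sorted((strahler[t] for t in tributaries), reverse=True)
        top = orders[0]
        # Strahler order increases only when two or more tributaries share the highest order.
        strahler[link] = top + 1 if len(orders) > 1 and orders[1] == top else top
        # Shreve magnitude is the sum of upstream magnitudes.
        shreve[link] = sum(shreve[t] for t in tributaries)

    for link, target in downstream.items():
        if target is None:
            visit(link)
    return strahler, shreve

downstream = {"h1": "m1", "h2": "m1", "h3": "out", "m1": "out", "out": None}
strahler, shreve = stream_orders(downstream)
```

At the outlet an order-2 link meets an order-1 headwater, so the Strahler order stays at 2 while the Shreve magnitude rises to 3. Confluences like this are good places to spot-check raster outputs.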
Step 7 - Stream Vectorization
Processing Toolbox -> Whitebox Workflows -> Spatial Hydrology -> Raster Streams to Vector
Convert ordered stream rasters to vector lines for cartography and further network operations.
QGIS Python Console Equivalent
import processing

# Add geometry attributes for road cost preparation
processing.run('whitebox_workflows:add_geometry_attributes', {
    'input': '/data/roads.shp',
    'output': '/data/roads_prepared.shp',
})

# Whitebox network service area
processing.run('whitebox_workflows:network_service_area', {
    'input': '/data/roads_prepared.shp',
    'origins': '/data/facilities.shp',
    'max_cost': 5.0,
    'output_mode': 'polygon',
    'edge_cost_field': 'TIME_MIN',
    'one_way_field': 'ONEWAY',
    'turn_penalty': 0.4,
    'u_turn_penalty': 2.5,
    'output': '/data/service_area_5min.shp',
})

# Whitebox OD matrix
processing.run('whitebox_workflows:network_od_cost_matrix', {
    'input': '/data/roads_prepared.shp',
    'origins': '/data/origins.shp',
    'destinations': '/data/destinations.shp',
    'edge_cost_field': 'TIME_MIN',
    'one_way_field': 'ONEWAY',
    'output': '/data/od_costs.csv',
})

# Stream order
processing.run('whitebox_workflows:strahler_stream_order', {
    'd8_pntr': '/data/d8_pointer.tif',
    'streams': '/data/streams.tif',
    'output': '/data/strahler.tif',
})
Common Pitfalls
| Problem | Likely cause | Fix |
|---|---|---|
| No route found between known-connected points | Topology gaps or unsnapped endpoints | Run snapping and revalidate connectivity |
| Service area too small or too large | Cost units inconsistent | Keep all costs in either meters or minutes |
| One-way streets ignored | Direction field not configured | Verify direction settings in network algorithm |
| Batch routing is slow | Unnecessary repeated reprojection or heavy geometry | Preprocess to common CRS and simplify where appropriate |
| Stream order appears uniform | Bad stream threshold or mismatched d8/stream rasters | Rebuild streams and ensure matching extent/grid |
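When a route is unexpectedly missing, a connectivity check often explains why. The sketch below groups nodes into connected components with a small union-find structure, treating edges as undirected; it is an illustrative diagnostic with hypothetical names, not a Whitebox tool.

```python
def connected_components(edges):
    """Group nodes into connected components, ignoring edge direction."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

# n4-n5 is an island, for example a road segment that failed to snap.
edges = [("n1", "n2"), ("n2", "n3"), ("n4", "n5")]
components = connected_components(edges)
```

If an origin and destination fall in different components, no cost setting will produce a route. The fix is in topology (snapping, noding), not in routing parameters.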
Validation Checklist
- Routing network passes geometry validity and snapping checks.
- Cost field units are consistent across all analyses.
- Directionality assumptions are documented (directed vs undirected).
- Service-area outputs were spot-checked against known travel behavior.
- Stream-order outputs were checked at confluences.
- Workflow parameters were saved in model or processing history.
Network Analysis — Tool Reference
Build Network Topology
Function name: build_network_topology
No help documentation available for this tool.
Closest Facility Network
Function name: closest_facility_network
Experimental
Finds the minimum-cost network route from each incident point to its nearest reachable facility point.
vector network closest-facility routing
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input line network layer. | Required | network.shp |
| incidents | Incident/demand point layer. | Required | incidents.shp |
| facilities | Facility/supply point layer. | Required | facilities.shp |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from incident/facility points to nearest network node. | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| barriers | Optional barrier point layer; nearest network nodes are blocked from traversal. | Optional | — |
| barrier_snap_distance | Optional max distance from each barrier point to a network node for blocking. | Optional | — |
| turn_penalty | Optional additive cost applied to non-straight turns at network nodes. | Optional | — |
| u_turn_penalty | Optional additive cost applied to U-turn transitions. | Optional | — |
| forbid_u_turns | If true, disallow U-turn transitions. | Optional | — |
| forbid_left_turns | If true, disallow left-turn transitions. | Optional | — |
| forbid_right_turns | If true, disallow right-turn transitions. | Optional | — |
| turn_restrictions_csv | Optional CSV of turn transitions using columns prev_x,prev_y,node_x,node_y,next_x,next_y. Optional columns: forbidden (default true when no turn_cost column is provided) and turn_cost (or penalty/cost/extra_cost) for per-turn additive cost. | Optional | — |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics (coverage, unmatched edges, fallback usage). | Optional | — |
| output | Output closest-facility route line vector path. | Required | — |
Examples
Routes each incident point to the nearest reachable facility by network cost.
wbe.closest_facility_network(facilities='facilities.shp', incidents='incidents.shp', input='network.shp', output='closest_facility_routes.shp')
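The temporal parameters above can be read as a lookup from (edge, day of week, minute of day) to a value that either multiplies or replaces the static cost. The sketch below illustrates that reading. It assumes dow uses 0 for Monday and that intervals include start_minute and exclude end_minute; neither convention is confirmed by this manual, so verify both against your profile data.

```python
from datetime import datetime

def temporal_edge_cost(static_cost, edge_id, departure_time, profile_rows,
                       temporal_mode="multiplier", temporal_fallback="static_cost"):
    """Resolve an edge cost from temporal profile rows (edge_id,dow,start_minute,end_minute,value)."""
    when = datetime.fromisoformat(departure_time)
    dow = when.weekday()                    # assumption: 0 = Monday
    minute = when.hour * 60 + when.minute   # minute of day
    for row in profile_rows:
        if (row["edge_id"] == edge_id and int(row["dow"]) == dow
                and int(row["start_minute"]) <= minute < int(row["end_minute"])):
            value = float(row["value"])
            return static_cost * value if temporal_mode == "multiplier" else value
    if temporal_fallback == "error":
        raise LookupError(f"no temporal row for edge {edge_id!r}")
    return static_cost  # static_cost fallback

# One hypothetical morning-peak row for edge E1 on Mondays, 07:00 to 09:00.
profile = [{"edge_id": "E1", "dow": "0", "start_minute": "420", "end_minute": "540", "value": "1.5"}]
departure = "2024-03-04T08:15:00+00:00"  # a Monday

peak_multiplied = temporal_edge_cost(2.0, "E1", departure, profile)
peak_absolute = temporal_edge_cost(2.0, "E1", departure, profile, temporal_mode="absolute")
unmatched = temporal_edge_cost(4.0, "E2", departure, profile)
```

Running with temporal_fallback set to error during testing is a useful way to find edges that the profile does not cover before relying on static-cost fallbacks.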
Emergency Scenario Routing And Accessibility Simulator
Function name: emergency_scenario_routing_and_accessibility_simulator
PRO · Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Emergency Accessibility Scenario Planning
Who It Is For
- Emergency management teams planning resilience under flood/fire/closure scenarios.
- Public safety routing analysts evaluating critical-facility reachability under disruptions.
Primary User
Emergency management, public safety operations, and municipal resilience planning teams.
What It Does
- Simulates emergency network accessibility under multiple disruption scenarios.
- Compares scenario accessibility against baseline service-area coverage.
- Supports scenario-specific blocked-edge simulation using network attribute mapping.
- Outputs baseline and worst-case service areas, scenario KPI table, and simulation report JSON.
How It Works
- Runs baseline merged multi-ring service areas from critical facilities using network_service_area.
- Reads scenario CSV (scenario_id,max_cost_multiplier[,blocked_value]).
- For each scenario:
  - Scales max travel cost by max_cost_multiplier.
  - Optionally maps blocked_value to scenario-blocked edges using scenario_block_source_field.
  - Computes scenario service area polygons and demand-point coverage percent.
- Computes delta_from_baseline_pct per scenario and identifies best/worst scenario outcomes.
- Scope boundary: Emergency Accessibility Scenario Planning simulates disrupted-network conditions for critical facility coverage; it is the service-area tool for emergency resilience and response planning. Service Area Planning and Coverage Optimization (6.2) addresses public infrastructure coverage under normal operating conditions. Market Access and Site Planning (6.7) evaluates commercial expansion by drive-time access and competitive positioning. Choose this tool for resilience analysis, 6.2 for infrastructure coverage gaps, and 6.7 for commercial site decisions.
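The scenario CSV layout and the cost-scaling step can be sketched as follows. The scenario names and values are invented for illustration, and applying the multiplier to every ring cost is an assumption about how "max travel cost" is scaled.

```python
import csv
import io

# Hypothetical scenario file using the documented columns.
scenario_csv = io.StringIO(
    "scenario_id,max_cost_multiplier,blocked_value\n"
    "baseline_like,1.0,\n"
    "flood_major,0.6,FLOODED\n"
)

def scenario_ring_costs(ring_costs, scenario_file):
    """Return per-scenario scaled ring costs and the optional blocked_value."""
    plans = {}
    for row in csv.DictReader(scenario_file):
        multiplier = float(row["max_cost_multiplier"])
        plans[row["scenario_id"]] = {
            # Assumption: every ring cost is scaled by the scenario multiplier.
            "ring_costs": [cost * multiplier for cost in ring_costs],
            # An empty blocked_value means no scenario-specific closures.
            "blocked_value": row.get("blocked_value") or None,
        }
    return plans

plans = scenario_ring_costs([5, 10, 15], scenario_csv)
```

In this reading, a multiplier below 1.0 shrinks the reachable area to represent slower travel, while blocked_value selects the edges to close by matching the field named in scenario_block_source_field.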
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| network | LineVector path | Required | Network layer for routing |
| critical_facilities | PointVector path | Required | Origin facilities (hospitals/fire/EMS/etc.) |
| demand_points | PointVector path | Optional | Demand points for scenario coverage KPIs |
| ring_costs | array[float] | Required | Service area ring costs (e.g., [5,10,15]) |
| scenario_csv | path | Required | CSV: scenario_id,max_cost_multiplier[,blocked_value] |
| scenario_template | string | Optional | Scenario authoring template: custom, flood, wildfire, or earthquake; applies template guardrails |
| scenario_block_source_field | string | Optional | Network attribute used to match scenario blocked_value |
| baseline_service_areas | vector path | Required | Output baseline service areas |
| worst_case_service_areas | vector path | Required | Output worst-scenario service areas |
| scenario_summary_csv | path | Required | Output scenario KPI summary CSV |
| simulation_report | path | Required | Output simulation summary JSON |
Outputs
| Output | Type | Contents |
|---|---|---|
| baseline_service_areas | Vector | Baseline merged service area polygons |
| worst_case_service_areas | Vector | Worst-performing scenario service area polygons |
| scenario_summary_csv | CSV | scenario_id, blocked_value, covered_pct, delta_from_baseline_pct and related KPIs |
| simulation_report | JSON | baseline stats, scenario comparisons, best/worst scenario summary |
Python Example
env = WbEnvironment(license_tier="pro")
result = env.run_tool(
    "emergency_scenario_routing_and_accessibility_simulator",
    network="city_network.gpkg",
    critical_facilities="critical_facilities.gpkg",
    demand_points="demand_points.gpkg",
    ring_costs=[5, 10, 15],
    scenario_csv="scenarios.csv",
    scenario_block_source_field="STATUS",
    baseline_service_areas="output/baseline_service_areas.gpkg",
    worst_case_service_areas="output/worst_service_areas.gpkg",
    scenario_summary_csv="output/scenario_summary.csv",
    simulation_report="output/simulation_report.json",
)
print(result)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Fleet Routing And Dispatch Optimizer
Function name: fleet_routing_and_dispatch_optimizer
PRO · Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Fleet Routing and Dispatch Optimization
Who It Is For
- Logistics dispatch teams planning daily vehicle assignments.
- Municipal waste and field-maintenance operations with constrained fleet capacity/time.
- Courier and distribution operations requiring transparent route exceptions.
Primary User
Logistics operations leaders, municipal service planners, and fleet-dispatch platform teams.
What It Does
- Optimizes vehicle dispatch plans across depots and stops for logistics and field operations.
- Builds feasible routes under capacity and shift-time constraints.
- Produces assignment, KPI, and exception outputs for operations review.
- Supports objective mode selection (minimize_distance, minimize_time, minimize_cost, balanced).
How It Works
- Parses depot and stop layers plus fleet specifications from vehicles_csv.
- Harmonizes depot/stop CRS to the network CRS when EPSG metadata is available and validates projected CRS for distance/time calculations.
- Filters infeasible stops where demand exceeds max vehicle capacity.
- Applies objective-aware greedy nearest-feasible stop construction for initial route assignment (distance, time, cost, balanced).
- Applies optional edge restrictions from CSV (from_x,from_y,to_x,to_y,closed,penalty_factor) as closures or impedance penalties.
- Runs local 2-opt sequence refinement for incremental route improvement.
- Computes per-route and fleet KPIs and emits exceptions with reason codes.
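The 2-opt refinement step can be illustrated on a single route. This sketch uses straight-line distances between hypothetical coordinates, whereas the workflow evaluates network-based costs with capacity and time constraints, so it shows the technique only.

```python
import math

def route_length(route, coords):
    """Total straight-line length of a route given as a list of stop ids."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))

def two_opt(route, coords):
    """Reverse interior segments while doing so shortens the route; endpoints stay fixed."""
    best = list(route)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if route_length(candidate, coords) < route_length(best, coords) - 1e-12:
                    best, improved = candidate, True
    return best

coords = {"depot": (0, 0), "s1": (0, 1), "s2": (1, 1), "s3": (1, 0)}
initial = ["depot", "s2", "s1", "s3", "depot"]  # crosses itself
refined = two_opt(initial, coords)
```

The initial sequence crosses the square diagonally twice. Reversing one segment removes the crossing and yields the perimeter tour, the kind of incremental gain 2-opt is designed to find after a greedy construction.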
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| network | LineVector path | Required | Street/network layer used by routing workflow |
| depots | PointVector path | Required | Depot or start/end locations for vehicles |
| stops | PointVector path | Required | Stops/tasks to assign to routes |
| vehicles_csv | path | Required | Fleet specs (vehicle_id, capacity, available_time_minutes, cost_per_minute, cost_per_km, depot_id) |
| objective | string | Optional | Objective mode: minimize_distance, minimize_time, minimize_cost, balanced |
| restrictions | path | Optional | Optional restrictions CSV (from_x,from_y,to_x,to_y[,closed][,penalty_factor]) used for edge closures or impedance penalties |
| routes_output | vector path | Required | Output route vector path |
| assignment_csv_output | path | Required | Output stop-to-route assignment CSV |
| route_kpis_csv_output | path | Required | Output per-route/fleet KPI CSV |
| exceptions_csv_output | path | Required | Output infeasible stop diagnostics CSV |
Outputs
| Output | Type | Contents |
|---|---|---|
| routes_output | Vector | Route geometries and route-level summaries by vehicle |
| assignment_csv_output | CSV | stop_id, route_id, vehicle_id, sequence_order, arrival/departure times |
| route_kpis_csv_output | CSV | Per-route metrics and fleet roll-up summary |
| exceptions_csv_output | CSV | stop_id with reason code (demand_exceeds_max_vehicle_capacity, no_feasible_route) |
Python Example
env = WbEnvironment(license_tier="pro")
result = env.run_tool(
    "fleet_routing_and_dispatch_optimizer",
    network="city_network.gpkg",
    depots="depots.gpkg",
    stops="daily_stops.gpkg",
    vehicles_csv="fleet_specs.csv",
    objective="minimize_cost",
    routes_output="output/routes.gpkg",
    assignment_csv_output="output/assignments.csv",
    route_kpis_csv_output="output/route_kpis.csv",
    exceptions_csv_output="output/exceptions.csv",
)
print(result)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Generate Network Nodes
Function name: generate_network_nodes
No help documentation available for this tool.
K Shortest Paths Network
Function name: k_shortest_paths_network
Experimental
Finds the k shortest simple paths between start and end coordinates over a line network.
vector network k-shortest-paths
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input line network layer. | Required | network.shp |
| start_x | Start x coordinate. | Required | 0.0 |
| start_y | Start y coordinate. | Required | 0.0 |
| end_x | End x coordinate. | Required | 100.0 |
| end_y | End y coordinate. | Required | 100.0 |
| k | Number of shortest paths to return. | Required | 3 |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from start/end coordinates to nearest network node. | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| barriers | Optional barrier point layer; nearest network nodes are blocked from traversal. | Optional | — |
| barrier_snap_distance | Optional max distance from each barrier point to a network node for blocking. | Optional | — |
| turn_penalty | Optional additive cost applied to non-straight turns at network nodes. | Optional | — |
| u_turn_penalty | Optional additive cost applied to U-turn transitions. | Optional | — |
| forbid_u_turns | If true, disallow U-turn transitions. | Optional | — |
| forbid_left_turns | If true, disallow left-turn transitions. | Optional | — |
| forbid_right_turns | If true, disallow right-turn transitions. | Optional | — |
| turn_restrictions_csv | Optional CSV of turn transitions using columns prev_x,prev_y,node_x,node_y,next_x,next_y. Optional columns: forbidden (default true when no turn_cost column is provided) and turn_cost (or penalty/cost/extra_cost) for per-turn additive cost. | Optional | — |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics (coverage, unmatched edges, fallback usage). | Optional | — |
| output | Output line vector path. | Required | — |
Examples
Computes multiple alternative simple paths between two points on a line network.
wbe.k_shortest_paths_network(end_x=100.0, end_y=100.0, input='network.shp', k=3, output='k_shortest_paths.shp', start_x=0.0, start_y=0.0)
Location Allocation Network
Function name: location_allocation_network
Experimental
Selects k facilities and allocates demand points by network cost with greedy or exact solving, optional capacities, and required/forbidden candidate constraints.
vector network location-allocation allocation
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input line network layer. | Required | network.shp |
| demand_points | Demand point layer to allocate. | Required | demand.shp |
| facilities | Candidate facility point layer. | Required | facilities.shp |
| facility_count | Number of facilities to select (k). | Required | 2 |
| solver_mode | Solver mode: auto, greedy, or exact (exact is intended for smaller problems). | Optional | — |
| demand_weight_field | Optional numeric demand weight field in demand_points (default weight=1). | Optional | — |
| facility_capacity_field | Optional numeric capacity field in facilities; capacity is consumed by demand_weight_field values. | Optional | — |
| required_facility_field | Optional boolean facility field marking candidates that must be selected. | Optional | — |
| forbidden_facility_field | Optional boolean facility field marking candidates that must not be selected. | Optional | — |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from demand/facility points to nearest network node. | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| barriers | Optional barrier point layer; nearest network nodes are blocked from traversal. | Optional | — |
| barrier_snap_distance | Optional max distance from each barrier point to a network node for blocking. | Optional | — |
| turn_penalty | Optional additive cost applied to non-straight turns at network nodes. | Optional | — |
| u_turn_penalty | Optional additive cost applied to U-turn transitions. | Optional | — |
| forbid_u_turns | If true, disallow U-turn transitions. | Optional | — |
| forbid_left_turns | If true, disallow left-turn transitions. | Optional | — |
| forbid_right_turns | If true, disallow right-turn transitions. | Optional | — |
| turn_restrictions_csv | Optional CSV of turn transitions using columns prev_x,prev_y,node_x,node_y,next_x,next_y. Optional columns: forbidden (default true when no turn_cost column is provided) and turn_cost (or penalty/cost/extra_cost) for per-turn additive cost. | Optional | — |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics (coverage, unmatched edges, fallback usage). | Optional | — |
| output | Output allocated route line vector path. | Required | — |
Examples
Selects facilities and allocates demand points using network travel cost.
wbe.location_allocation_network(demand_points='demand.shp', facilities='facilities.shp', facility_count=2, input='network.shp', output='location_allocation_routes.shp')
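A greedy solver of the kind named in solver_mode can be sketched as repeatedly adding the facility that most reduces total weighted cost. This is an illustrative greedy heuristic on a precomputed cost table, without capacities or required/forbidden constraints; it is not the tool's implementation, and all names are hypothetical.

```python
def greedy_location_allocation(cost, facility_count, weights=None):
    """cost[demand][facility] is a network cost. Returns (selected facilities, assignment)."""
    demands = list(cost)
    candidates = sorted({f for row in cost.values() for f in row})
    weights = weights or {d: 1.0 for d in demands}

    def total(selected):
        # Each demand point is served by its cheapest selected facility.
        return sum(weights[d] * min(cost[d][f] for f in selected) for d in demands)

    selected = []
    while len(selected) < facility_count:
        remaining = [f for f in candidates if f not in selected]
        selected.append(min(remaining, key=lambda f: total(selected + [f])))
    assignment = {d: min(selected, key=lambda f: cost[d][f]) for d in demands}
    return selected, assignment

# Made-up network costs from three demand points to three candidate sites.
cost = {
    "d1": {"f1": 1.0, "f2": 9.0, "f3": 5.0},
    "d2": {"f1": 2.0, "f2": 8.0, "f3": 5.0},
    "d3": {"f1": 9.0, "f2": 1.0, "f3": 5.0},
}
selected, assignment = greedy_location_allocation(cost, facility_count=2)
```

Greedy selection is fast but can miss the best combination on some inputs, which is why an exact mode is useful for smaller problems where it is affordable.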
Map Matching v1
Function name: map_matching_v1
Experimental
Snaps trajectory points onto a line network and reconstructs an inferred route with diagnostics.
vector network map-matching
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input line network layer. | Required | network.shp |
| trajectory_points | Input trajectory point layer. | Required | trajectory_points.shp |
| timestamp_field | Trajectory field used for time ordering. | Required | timestamp |
| search_radius | Optional candidate search radius around each trajectory point. | Optional | 25.0 |
| candidate_k | Optional number of nearest candidates retained per point. | Optional | 5 |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance for snapping trajectory points to network nodes. | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| barriers | Optional barrier point layer; nearest network nodes are blocked from traversal. | Optional | — |
| barrier_snap_distance | Optional max distance from each barrier point to a network node for blocking. | Optional | — |
| turn_penalty | Optional additive cost applied to non-straight turns at network nodes. | Optional | — |
| u_turn_penalty | Optional additive cost applied to U-turn transitions. | Optional | — |
| forbid_u_turns | If true, disallow U-turn transitions. | Optional | — |
| forbid_left_turns | If true, disallow left-turn transitions. | Optional | — |
| forbid_right_turns | If true, disallow right-turn transitions. | Optional | — |
| turn_restrictions_csv | Optional CSV of turn transitions using columns prev_x,prev_y,node_x,node_y,next_x,next_y. Optional columns: forbidden (default true when no turn_cost column is provided) and turn_cost (or penalty/cost/extra_cost) for per-turn additive cost. | Optional | — |
| matched_points_output | Optional output vector path for per-point diagnostics. | Optional | — |
| match_report | Optional JSON output path for summary diagnostics. | Optional | — |
| output | Output line vector path for inferred route. | Required | — |
Examples
Matches time-ordered trajectory points to a network and emits route and diagnostics outputs.
wbe.map_matching_v1(candidate_k=5, input='network.shp', matched_points_output='matched_points.shp', output='matched_route.shp', search_radius=25.0, timestamp_field='timestamp', trajectory_points='trajectory_points.shp')
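The roles of timestamp_field, search_radius, and candidate_k can be illustrated with the candidate-selection step alone. This sketch orders fixes by time and keeps the nearest network nodes within the radius; it assumes node-based candidates and does not reconstruct a route, so it covers only a small part of what the tool does. Names and data are hypothetical.

```python
import math

def nearest_candidates(point, nodes, search_radius, candidate_k):
    """Return up to candidate_k node ids within search_radius, nearest first."""
    scored = []
    for node_id, (nx, ny) in nodes.items():
        d = math.hypot(point[0] - nx, point[1] - ny)
        if d <= search_radius:
            scored.append((d, node_id))
    scored.sort()
    return [node_id for _, node_id in scored[:candidate_k]]

def candidates_in_time_order(trajectory, nodes, search_radius=25.0, candidate_k=5):
    """Sort fixes by timestamp, then list candidate nodes for each fix."""
    ordered = sorted(trajectory, key=lambda fix: fix["timestamp"])
    return [
        (fix["timestamp"], nearest_candidates(fix["xy"], nodes, search_radius, candidate_k))
        for fix in ordered
    ]

nodes = {"n1": (0.0, 0.0), "n2": (20.0, 0.0), "n3": (100.0, 0.0)}
trajectory = [  # deliberately out of order
    {"timestamp": "2024-01-01T08:00:10", "xy": (18.0, 3.0)},
    {"timestamp": "2024-01-01T08:00:00", "xy": (1.0, 2.0)},
]
matched = candidates_in_time_order(trajectory, nodes)
```

A radius that is too small leaves noisy fixes with no candidates, while a radius that is too large admits parallel streets. Per-point diagnostics are the place to check which case applies.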
Market Access And Site Intelligence Workflow
Function name: market_access_and_site_intelligence_workflow
PRO · Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Market Access and Site Planning
Who It Is For
- Retail chains evaluating expansion locations based on demand accessibility and competitive saturation.
- Healthcare networks rating candidate clinic/hospital sites for market coverage and competitive positioning.
- Franchise developers assessing new territory opportunities via catchment analysis.
Primary User
Commercial real estate, retail operations, healthcare network planning, franchise development.
What It Does
- Evaluates candidate site locations for commercial expansion (retail, healthcare, franchise).
- Computes drive-time catchment areas for each candidate.
- Measures demand coverage and competitive overlap positioning.
- Ranks candidates by composite score: 50% demand coverage + 25% accessibility + 25% low competitive overlap.
- Outputs ranked candidates, competitive analysis, and executive summary with decision metrics.
- Adds opportunity bands and optional market action queue output for expansion triage.
How It Works
- For each candidate site:
- Compute demand coverage (% of demand points within max ring cost).
- Compute average distance to demand points (accessibility proxy).
- Compute competitive overlap vs existing/competitor sites (% within catchment radius).
- Existing-site baseline coverage is computed from the existing site layer, not from candidate seed sites.
- Accessibility score is normalized across the evaluated candidate set.
- Composite rank score = 0.50 × coverage + 0.25 × accessibility + 0.25 × (100 - overlap).
- Scope boundary: Market Access and Site Planning evaluates candidate commercial sites for expansion using drive-time access, demand coverage, and competitive positioning — it is the service-area tool for commercial decisions. Service Area Planning and Coverage Optimization (6.2) addresses public infrastructure coverage diagnostics. Emergency Accessibility Scenario Planning (6.6) simulates disrupted-network scenarios for resilience analysis. Choose this tool for commercial expansion, 6.2 for infrastructure planning, and 6.6 for emergency response planning.
- Rank candidates by composite score; classify each site as expand_now, pilot, monitor, or saturated; emit top candidates + decision gate (coverage > 70% AND overlap [...]).
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| network | LineVector path | Required | Street/transit network for routing |
| sites_existing | PointVector path | Required | Existing own or benchmark competitive sites |
| sites_candidates | PointVector path | Required | Candidate expansion site locations |
| demand_surface | PointVector path | Required | Demand locations (customers, population centroids) |
| competition_sites | PointVector path | Optional | Competitor locations (separate from own/benchmark) |
| ring_costs | array[float] | Required | Drive-time costs for catchments (e.g., [5, 10, 15]) |
| catchments_output | vector path | Required | Output drive-time catchment polygons |
| overlap_analysis_output | vector path | Required | Output competitive overlap analysis layer |
| candidate_rank_csv | path | Required | Output CSV: ranked candidates with KPIs |
| executive_summary_json | path | Required | Output JSON: market metrics and decision gate |
| market_action_queue_csv | path | Optional | Optional prioritized expansion action queue CSV |
Important input roles: sites_existing is the incumbent baseline used for existing coverage and coverage-gain calculations. competition_sites is an optional separate competitor layer used for overlap pressure. If competition_sites is omitted, overlap falls back to sites_existing.
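The composite rank score stated under How It Works can be reproduced directly for a sanity check of candidate_rank_csv values. The sketch assumes all three inputs are on a 0 to 100 scale, which the formula implies for overlap but is an assumption for the normalized accessibility score; site names and values are invented.

```python
def composite_rank_score(coverage_pct, accessibility_score, overlap_pct):
    """0.50 x coverage + 0.25 x accessibility + 0.25 x (100 - overlap)."""
    return 0.50 * coverage_pct + 0.25 * accessibility_score + 0.25 * (100.0 - overlap_pct)

# (demand coverage %, accessibility score, competitive overlap %) per candidate.
candidates = {
    "site_a": (80.0, 60.0, 20.0),
    "site_b": (65.0, 90.0, 50.0),
}
scores = {site: composite_rank_score(*values) for site, values in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because coverage carries half the weight, site_a ranks first despite its lower accessibility score. Recomputing the score this way helps explain rankings to stakeholders.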
Outputs
| Output | Type | Contents |
|---|---|---|
| catchments_output | Vector | Drive-time catchment polygons per candidate |
| overlap_analysis_output | Vector/GeoJSON | Candidate-level overlap visualization with coverage gain and opportunity band |
| candidate_rank_csv | CSV | rank, site_id, x, y, demand_coverage_pct, coverage_gain_pct, avg_distance_to_demand, competitive_overlap_pct, accessibility_score, composite_rank_score, opportunity_band |
| executive_summary_json | JSON | total_candidates, market_metrics, top_candidates, recommendation, decision_gate, decision_rationale |
| market_action_queue_csv | CSV | Prioritized expansion actions by candidate/opportunity band |
Map interpretation: catchments_output contains one candidate-level polygon per candidate (not separate ring-band polygons). Catchments are convex-hull trade areas of covered demand plus candidate location; if a hull cannot be formed, a small fallback square polygon may be written. overlap_analysis_output is a point layer; square symbols in that layer usually come from the marker style rather than from fallback polygons.
Python Example
env = WbEnvironment(license_tier="pro")
result = env.run_tool(
    "market_access_and_site_intelligence_workflow",
    network="city_network.gpkg",
    sites_existing="existing_retail.gpkg",
    sites_candidates="candidate_expansion_sites.gpkg",
    demand_surface="customer_demand_points.gpkg",
    competition_sites="competitor_locations.gpkg",
    ring_costs=[5, 10, 15],
    catchments_output="output/candidate_catchments.gpkg",
    overlap_analysis_output="output/competitive_overlap.gpkg",
    candidate_rank_csv="output/candidate_ranking.csv",
    executive_summary_json="output/market_summary.json",
    market_action_queue_csv="output/market_action_queue.csv",
)
print(result)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Multimodal OD Cost Matrix
Function name: multimodal_od_cost_matrix
Experimental
Computes batched multimodal OD costs and mode summaries between origin and destination point sets.
vector network multimodal od-matrix
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| origins | Origin point layer. | Required | origins.shp |
| destinations | Destination point layer. | Required | destinations.shp |
| mode_field | Line attribute field that identifies travel mode per segment. | Required | MODE |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from origin/destination points to nearest network node. | Optional | — |
| default_mode_speed | Default mode speed in coordinate-units per time unit (default: 1). | Optional | 1.0 |
| mode_speed_overrides | Optional comma-separated mode:speed overrides (for example: walk:1.4,drive:12,transit:8). | Optional | — |
| allowed_modes | Optional comma-separated allow-list of modes to include in routing. | Optional | — |
| transfer_penalty | Optional additive penalty applied each time the route changes mode. | Optional | 0.0 |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics when using direct temporal input. | Optional | — |
| scenario_bundle_csv | Optional CSV listing named temporal scenarios for comparative multi-scenario OD output. | Optional | — |
| output | Output CSV path. | Required | — |
Examples
Creates a multimodal OD matrix with route cost and mode-sequence summaries.
wbe.multimodal_od_cost_matrix(default_mode_speed=1.0, destinations='destinations.shp', input='network.shp', mode_field='MODE', mode_speed_overrides='walk:1.4,transit:8', origins='origins.shp', output='multimodal_od_matrix.csv', transfer_penalty=0.0)
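To make the interaction between default_mode_speed, mode_speed_overrides, and transfer_penalty more concrete, here is a minimal sketch. It assumes that a segment's cost is its length divided by the speed of its mode and that the penalty is added once per mode change along the route; the helper names are hypothetical and are not part of the Whitebox API.

```python
def parse_mode_speeds(overrides):
    """Parse a 'mode:speed,mode:speed' string into a dict of floats."""
    speeds = {}
    for item in overrides.split(","):
        item = item.strip()
        if not item:
            continue
        mode, _, speed = item.partition(":")
        speeds[mode.strip()] = float(speed)
    return speeds


def route_cost(segments, overrides="", default_mode_speed=1.0, transfer_penalty=0.0):
    """Sum length / speed per segment and add a penalty at each mode change.

    `segments` is a list of (mode, length) tuples in travel order.
    """
    speeds = parse_mode_speeds(overrides)
    total = 0.0
    previous_mode = None
    for mode, length in segments:
        # Modes without an override fall back to the default speed.
        total += length / speeds.get(mode, default_mode_speed)
        if previous_mode is not None and mode != previous_mode:
            total += transfer_penalty
        previous_mode = mode
    return total


segments = [("walk", 140.0), ("transit", 800.0), ("walk", 70.0)]
cost = route_cost(segments, "walk:1.4,transit:8", transfer_penalty=2.0)
print(round(cost, 3))
```

In this example the route changes mode twice, so the penalty contributes 4 units on top of roughly 250 units of travel time.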
Multimodal Routes From OD
Function name: multimodal_routes_from_od
Experimental
Builds route geometries for multimodal origin-destination point pairs with per-route mode summaries.
vector network multimodal routes
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| origins | Origin point layer. | Required | origins.shp |
| destinations | Destination point layer. | Required | destinations.shp |
| mode_field | Line attribute field that identifies travel mode per segment. | Required | MODE |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from origin/destination points to nearest network node. | Optional | — |
| default_mode_speed | Default mode speed in coordinate-units per time unit (default: 1). | Optional | 1.0 |
| mode_speed_overrides | Optional comma-separated mode:speed overrides (for example: walk:1.4,drive:12,transit:8). | Optional | — |
| allowed_modes | Optional comma-separated allow-list of modes to include in routing. | Optional | — |
| transfer_penalty | Optional additive penalty applied each time the route changes mode. | Optional | 0.0 |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics when using direct temporal input. | Optional | — |
| scenario_bundle_csv | Optional CSV listing named temporal scenarios for comparative multi-scenario route output. | Optional | — |
| output | Output route line vector path. | Required | — |
Examples
Creates route lines for each reachable multimodal origin-destination pair.
wbe.multimodal_routes_from_od(default_mode_speed=1.0, destinations='destinations.shp', input='network.shp', mode_field='MODE', mode_speed_overrides='walk:1.4,transit:8', origins='origins.shp', output='multimodal_routes_from_od.gpkg', transfer_penalty=0.0)
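The temporal parameters (temporal_cost_profile, departure_time, temporal_mode, temporal_fallback) describe a lookup from a departure time to a per-edge cost. The sketch below shows one plausible reading of that lookup. It assumes dow uses 0 for Monday, that start_minute is inclusive and end_minute exclusive, and that minutes are counted from midnight in the timestamp's own offset; none of these conventions is confirmed by this manual, and the function names are hypothetical.

```python
import csv
import io
from datetime import datetime

PROFILE_CSV = """edge_id,dow,start_minute,end_minute,value
E1,0,420,540,1.5
E1,0,960,1080,1.8
E2,0,420,540,30.0
"""


def load_profile(text):
    """Read profile rows into dicts with numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "edge_id": row["edge_id"],
            "dow": int(row["dow"]),
            "start_minute": int(row["start_minute"]),
            "end_minute": int(row["end_minute"]),
            "value": float(row["value"]),
        })
    return rows


def temporal_cost(rows, edge_id, static_cost, departure_time,
                  temporal_mode="multiplier", temporal_fallback="static_cost"):
    """Return the edge cost at departure_time, falling back when no row matches."""
    if temporal_mode not in ("multiplier", "absolute"):
        raise ValueError(f"unknown temporal_mode: {temporal_mode}")
    when = datetime.fromisoformat(departure_time)
    dow = when.weekday()  # assumption: 0 = Monday
    minute = when.hour * 60 + when.minute
    for row in rows:
        if (row["edge_id"] == edge_id and row["dow"] == dow
                and row["start_minute"] <= minute < row["end_minute"]):
            if temporal_mode == "multiplier":
                return static_cost * row["value"]
            return row["value"]  # "absolute": the value replaces the static cost
    if temporal_fallback == "error":
        raise LookupError(f"no temporal row for edge {edge_id}")
    return static_cost


rows = load_profile(PROFILE_CSV)
departure = "2024-01-08T08:00:00+00:00"  # a Monday, 08:00
peak_multiplier = temporal_cost(rows, "E1", 10.0, departure)
peak_absolute = temporal_cost(rows, "E2", 10.0, departure, temporal_mode="absolute")
unmatched = temporal_cost(rows, "E3", 10.0, departure)
print(peak_multiplier, peak_absolute, unmatched)
```

With temporal_fallback set to error, the lookup for the unmatched edge E3 would raise instead of returning the static cost.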
Multimodal Shortest Path
Function name: multimodal_shortest_path
Experimental
Finds a mode-aware shortest path over a line network with configurable transfer penalties.
vector network multimodal shortest-path
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| start_x | Start x coordinate. | Required | 0.0 |
| start_y | Start y coordinate. | Required | 0.0 |
| end_x | End x coordinate. | Required | 100.0 |
| end_y | End y coordinate. | Required | 100.0 |
| mode_field | Line attribute field that identifies travel mode per segment. | Required | MODE |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from start/end coordinates to nearest network node. | Optional | — |
| default_mode_speed | Default mode speed in coordinate-units per time unit (default: 1). | Optional | 1.0 |
| mode_speed_overrides | Optional comma-separated mode:speed overrides (for example: walk:1.4,drive:12,transit:8). | Optional | — |
| allowed_modes | Optional comma-separated allow-list of modes to include in routing. | Optional | — |
| transfer_penalty | Optional additive penalty applied each time the route changes mode. | Optional | 0.0 |
| output | Output line vector path. | Required | — |
Examples
Routes between two coordinates using mode-aware costs and transfer penalties.
wbe.multimodal_shortest_path(default_mode_speed=1.0, end_x=100.0, end_y=100.0, input='network.shp', mode_field='MODE', mode_speed_overrides='walk:1.4,transit:8', output='multimodal_shortest_path.shp', start_x=0.0, start_y=0.0, transfer_penalty=0.0)
Demonstrates walk-drive routing with mode filtering and transfer penalty.
wbe.multimodal_shortest_path(allowed_modes='walk,drive', default_mode_speed=1.0, end_x=100.0, end_y=100.0, input='network.shp', mode_field='MODE', mode_speed_overrides='walk:1.4,drive:12', output='multimodal_walk_drive_path.shp', start_x=0.0, start_y=0.0, transfer_penalty=2.0)
Demonstrates walk-transit routing with mode filtering and transfer penalty.
wbe.multimodal_shortest_path(allowed_modes='walk,transit', default_mode_speed=1.0, end_x=100.0, end_y=100.0, input='network.shp', mode_field='MODE', mode_speed_overrides='walk:1.4,transit:8', output='multimodal_walk_transit_path.shp', start_x=0.0, start_y=0.0, transfer_penalty=1.0)
Network Accessibility Metrics
Function name: network_accessibility_metrics
Experimental
Computes accessibility indices for origin points based on reachability to destinations with optional impedance cutoffs and decay functions.
vector network accessibility
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| origins | Origin point layer. | Required | origins.shp |
| destinations | Destination point layer. | Required | destinations.shp |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | 0.0 |
| max_snap_distance | Optional max distance from origin/destination points to nearest network node. | Optional | — |
| impedance_cutoff | Optional maximum distance threshold for counting reachable destinations (default: infinite). | Optional | — |
| decay_function | Optional decay function: 'none' (default), 'linear', or 'exponential' for distance-weighted accessibility. | Optional | none |
| decay_parameter | Optional decay parameter (lambda for exponential, rate for linear). | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| parallel_execution | If true (default), evaluate origins in parallel for faster accessibility computation. | Optional | — |
| output | Output point vector path (origins with accessibility metrics). | Required | — |
Examples
Computes accessibility index for origins to destinations within cutoff distance.
wbe.network_accessibility_metrics(decay_function='none', destinations='destinations.shp', input='network.shp', origins='origins.shp', output='origins_accessibility.shp', snap_tolerance=0.0)
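The effect of impedance_cutoff, decay_function, and decay_parameter on an origin's accessibility index can be sketched as follows. The exact formulas used by the tool are not given in this manual, so the weights below (1 for none, max(0, 1 - rate * cost) for linear, exp(-lambda * cost) for exponential) are assumptions chosen as common textbook forms, and the function names are hypothetical.

```python
import math


def decay_weight(cost, decay_function="none", decay_parameter=0.0):
    """Weight of one reachable destination at the given network cost."""
    if decay_function == "none":
        return 1.0
    if decay_function == "linear":
        return max(0.0, 1.0 - decay_parameter * cost)
    if decay_function == "exponential":
        return math.exp(-decay_parameter * cost)
    raise ValueError(f"unknown decay_function: {decay_function}")


def accessibility_index(costs, impedance_cutoff=math.inf,
                        decay_function="none", decay_parameter=0.0):
    """Sum the decay weights of destinations within the cutoff."""
    return sum(
        decay_weight(cost, decay_function, decay_parameter)
        for cost in costs
        if cost <= impedance_cutoff
    )


# Network costs from one origin to three destinations.
costs = [100.0, 250.0, 900.0]
count_only = accessibility_index(costs, impedance_cutoff=500.0)
linear = accessibility_index(costs, 500.0, "linear", 0.002)
exponential = accessibility_index(costs, 500.0, "exponential", 0.01)
print(count_only, round(linear, 3), round(exponential, 3))
```

With no decay the index is simply a count of destinations within the cutoff; the decayed variants give nearer destinations more weight.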
Network Centrality Metrics
Function name: network_centrality_metrics
Experimental
Computes baseline degree, closeness, and betweenness centrality metrics for network nodes.
vector network centrality
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | 0.0 |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from graph construction (true/1/yes blocks). | Optional | — |
| output | Output point vector path. | Required | — |
Examples
Computes node-level centrality metrics for a line network.
wbe.network_centrality_metrics(input='network.shp', output='network_centrality.gpkg', snap_tolerance=0.0)
Network Connected Components
Function name: network_connected_components
Experimental
Assigns a connected-component ID to each line feature in a network.
vector network components
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | 0.0 |
| output | Output line vector path. | Required | — |
Examples
Labels disconnected subnetworks with unique component IDs.
wbe.network_connected_components(input='network.shp', output='network_components.shp', snap_tolerance=0.0)
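As an illustration of what a connected-component ID per line feature means, the following sketch labels lines with a union-find structure over their endpoints. It is a simplified stand-in and not the tool's implementation: endpoints are treated as the same node only when their coordinates are identical, so snap_tolerance is not modelled, and the function name is hypothetical.

```python
def component_ids(lines):
    """Return a 1-based component ID for each line.

    `lines` is a list of coordinate lists, e.g. [(x0, y0), (x1, y1), ...].
    Two lines share a component when they are linked through shared endpoints.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Each line connects its first and last vertex.
    for line in lines:
        union(line[0], line[-1])

    ids = {}
    result = []
    for line in lines:
        root = find(line[0])
        result.append(ids.setdefault(root, len(ids) + 1))
    return result


lines = [
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1)],
    [(5, 5), (6, 5)],
    [(6, 5), (7, 5), (8, 5)],
    [(1, 1), (2, 2)],
]
labels = component_ids(lines)
print(labels)
```

Lines 1, 2, and 5 form one subnetwork and lines 3 and 4 another, so routing between the two groups would fail until they are connected.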
Network Node Degree
Function name: network_node_degree
Experimental
Extracts network nodes from line features and computes node degree and node type.
vector network topology
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | 0.0 |
| output | Output point vector path. | Required | — |
Examples
Creates a node point layer with network degree attributes.
wbe.network_node_degree(input='network.shp', output='network_nodes.shp', snap_tolerance=0.0)
Network OD Cost Matrix
Function name: network_od_cost_matrix
Experimental
Computes an origin-destination cost matrix between point pairs, measuring travel distance or cost along network paths.
vector network od-matrix
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| origins | Origin point layer. | Required | origins.shp |
| destinations | Destination point layer. | Required | destinations.shp |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from origin/destination points to nearest network node. | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| barriers | Optional barrier point layer; nearest network nodes are blocked from traversal. | Optional | — |
| barrier_snap_distance | Optional max distance from each barrier point to a network node for blocking. | Optional | — |
| turn_penalty | Optional additive cost applied to non-straight turns at network nodes. | Optional | — |
| u_turn_penalty | Optional additive cost applied to U-turn transitions. | Optional | — |
| forbid_u_turns | If true, disallow U-turn transitions. | Optional | — |
| forbid_left_turns | If true, disallow left-turn transitions. | Optional | — |
| forbid_right_turns | If true, disallow right-turn transitions. | Optional | — |
| turn_restrictions_csv | Optional CSV of turn transitions using columns prev_x,prev_y,node_x,node_y,next_x,next_y. Optional columns: forbidden (default true when no turn_cost column is provided) and turn_cost (or penalty/cost/extra_cost) for per-turn additive cost. | Optional | — |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics (coverage, unmatched edges, fallback usage). | Optional | — |
| output | Output CSV path. | Required | — |
Examples
Creates an OD cost matrix from origins and destinations on a line network.
wbe.network_od_cost_matrix(destinations='destinations.shp', input='network.shp', origins='origins.shp', output='od_matrix.csv')
Network Readiness And Diagnostics Intelligence
Function name: network_readiness_and_diagnostics_intelligence
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Network Readiness and Diagnostics
Problem It Solves
Is this network structurally and cost-wise ready for reliable routing and service optimization workflows?
Who It Is For
- Routing analysts, municipal transportation planning teams, and utility network operations.
Primary User
Municipal/public works GIS teams, utilities, and logistics operations requiring reproducible network QA gates.
What It Does
- Audits line-based transportation/utility networks for operational routing readiness.
- Detects dead-end concentration and cost-consistency anomalies before routing runs.
- Emits a machine-readable readiness score with pass/fail quality gate and diagnostics outputs.
How It Works
- Loads a line network and validates geometry is line-based (`LineString`/`MultiLineString`).
- Builds node degree counts from line endpoints to quantify dead-end prevalence.
- Computes per-segment length costs and assesses variance using z-score outlier detection.
- Computes weighted readiness score from connectivity and cost-consistency components.
- Indicative formula: `overall = 0.6 * connectivity_score + 0.4 * cost_consistency_score`.
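The scoring steps above can be sketched in a few lines. Only the 0.6/0.4 weighting comes from the indicative formula; how each component score is derived is not specified in this manual, so the definitions below (share of non-dead-end nodes, share of segments within a z-score threshold of 3, and a pass mark of 70) are assumptions for illustration, as is the function name.

```python
from collections import Counter
from statistics import mean, pstdev


def readiness(lines, lengths, z_threshold=3.0, pass_mark=70.0):
    """Score a network from endpoint degrees and segment-length outliers."""
    # Node degree = number of line endpoints that touch each coordinate.
    degree = Counter()
    for line in lines:
        degree[line[0]] += 1
        degree[line[-1]] += 1
    dead_ends = sum(1 for d in degree.values() if d == 1)
    connectivity_score = 100.0 * (1.0 - dead_ends / len(degree))

    # Z-score outliers among per-segment length costs.
    mu, sigma = mean(lengths), pstdev(lengths)
    outliers = 0
    if sigma > 0:
        outliers = sum(1 for v in lengths if abs(v - mu) / sigma > z_threshold)
    cost_consistency_score = 100.0 * (1.0 - outliers / len(lengths))

    overall = 0.6 * connectivity_score + 0.4 * cost_consistency_score
    return {
        "connectivity_score": connectivity_score,
        "cost_consistency_score": cost_consistency_score,
        "overall": overall,
        "gate": "pass" if overall >= pass_mark else "fail",
    }


# A closed square with one dangling spur.
lines = [
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1)],
    [(1, 1), (0, 1)],
    [(0, 1), (0, 0)],
    [(1, 1), (2, 2)],
]
lengths = [1.0, 1.0, 1.0, 1.0, 1.414]
score = readiness(lines, lengths)
print(score)
```

In this toy network one of five nodes is a dead end and no segment length is an outlier, so the connectivity component drives the overall score.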
Why It Wins
- Replaces manual topology spot-checks with a reproducible score + diagnostics package that is machine-checkable and reportable.
Typical Buying Trigger
Teams encounter unstable routing outcomes and need an auditable pre-routing network quality gate.
Typical Presets
- default: balanced scoring for day-to-day network readiness checks.
- pre-routing gate: run before route optimization or service-area generation.
- data onboarding QA: run when ingesting new road/utility centerline deliveries.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| network | no | Input line network layer (street, transit, or utility network). |
| qa_report | no | Output CSV path for detailed QA findings and issue counts. |
| diagnostics_layer | no | Output GeoJSON/GeoPackage path containing diagnostic geometries. |
| readiness_score | no | Output JSON path with readiness score, component scores, and pass/fail gate. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| qa_report | CSV | Tabular QA findings including severity, check type, count, and descriptive diagnostics. |
| diagnostics_layer | GeoJSON/GeoPackage | Spatial diagnostics layer for issue visualization and spatial triage. |
| readiness_score | JSON | Machine-readable readiness contract with overall score, component scores, penalties, and pass/fail gate. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.network_readiness_and_diagnostics_intelligence(
    network="data/street_network.shp",
    qa_report="output/network_readiness_qa.csv",
    diagnostics_layer="output/network_readiness_diagnostics.geojson",
    readiness_score="output/network_readiness_score.json",
)
print(result)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Network Routes From OD
Function name: network_routes_from_od
Experimental
Builds route geometries for origin-destination point pairs over a line network.
vector network routes
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| origins | Origin point layer. | Required | origins.shp |
| destinations | Destination point layer. | Required | destinations.shp |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from origin/destination points to nearest network node. | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| barriers | Optional barrier point layer; nearest network nodes are blocked from traversal. | Optional | — |
| barrier_snap_distance | Optional max distance from each barrier point to a network node for blocking. | Optional | — |
| turn_penalty | Optional additive cost applied to non-straight turns at network nodes. | Optional | — |
| u_turn_penalty | Optional additive cost applied to U-turn transitions. | Optional | — |
| forbid_u_turns | If true, disallow U-turn transitions. | Optional | — |
| forbid_left_turns | If true, disallow left-turn transitions. | Optional | — |
| forbid_right_turns | If true, disallow right-turn transitions. | Optional | — |
| turn_restrictions_csv | Optional CSV of turn transitions using columns prev_x,prev_y,node_x,node_y,next_x,next_y. Optional columns: forbidden (default true when no turn_cost column is provided) and turn_cost (or penalty/cost/extra_cost) for per-turn additive cost. | Optional | — |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics (coverage, unmatched edges, fallback usage). | Optional | — |
| output | Output route line vector path. | Required | — |
Examples
Creates route line features for OD point pairs on a network.
wbe.network_routes_from_od(destinations='destinations.shp', input='network.shp', origins='origins.shp', output='network_routes.shp')
Network Service Area
Function name: network_service_area
Experimental
Computes reachable network nodes from origin points within a maximum network cost.
vector network service-area
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| origins | Origin point layer. | Required | origins.shp |
| max_cost | Maximum reachable path cost. | Required | 1000.0 |
| ring_costs | Optional comma-separated ring thresholds for multi-ring outputs (for example: 5,10,15). | Optional | — |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from origin points to nearest network node. | Optional | — |
| output_mode | Output mode: 'nodes' (default), 'edges' for cost-trimmed reachable edge segments, or 'polygons' for per-origin isochrone-like polygons from reachable edge envelopes. | Optional | — |
| polygon_merge_origins | If true and output_mode='polygons', dissolve overlapping origin polygons into merged coverage per ring instead of emitting one polygon per origin. | Optional | — |
| mode_field | Optional line attribute field identifying travel mode per segment; enables mode-aware service-area costs. | Optional | — |
| default_mode_speed | Default mode speed in coordinate-units per time unit when mode_field is provided (default: 1). | Optional | — |
| mode_speed_overrides | Optional comma-separated mode:speed overrides (for example: walk:1.4,drive:12). | Optional | — |
| allowed_modes | Optional comma-separated allow-list of modes to include when mode_field is provided. | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| barriers | Optional barrier point layer; nearest network nodes are blocked from traversal. | Optional | — |
| barrier_snap_distance | Optional max distance from each barrier point to a network node for blocking. | Optional | — |
| turn_penalty | Optional additive cost applied to non-straight turns at network nodes. | Optional | — |
| u_turn_penalty | Optional additive cost applied to U-turn transitions. | Optional | — |
| forbid_u_turns | If true, disallow U-turn transitions. | Optional | — |
| forbid_left_turns | If true, disallow left-turn transitions. | Optional | — |
| forbid_right_turns | If true, disallow right-turn transitions. | Optional | — |
| turn_restrictions_csv | Optional CSV of turn transitions using columns prev_x,prev_y,node_x,node_y,next_x,next_y. Optional columns: forbidden (default true when no turn_cost column is provided) and turn_cost (or penalty/cost/extra_cost) for per-turn additive cost. | Optional | — |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics (coverage, unmatched edges, fallback usage). | Optional | — |
| output | Output service-area vector path. | Required | — |
Examples
Finds all nodes reachable from origins within max_cost.
wbe.network_service_area(input='network.shp', max_cost=1000.0, origins='origins.shp', output='service_area_nodes.shp')
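How max_cost and ring_costs relate to the reachable nodes can be shown with a small bounded Dijkstra search. This is a sketch under simplifying assumptions, not the tool's implementation: origins are given directly as node names rather than snapped points, edges are undirected, and each reached node is assigned to the smallest ring threshold that is at least its cost. The function name is hypothetical.

```python
import heapq


def service_area(edges, origins, max_cost, ring_costs=None):
    """Return {node: (cost, ring)} for nodes reachable within max_cost.

    `edges` is a list of (node_a, node_b, cost) tuples treated as undirected.
    """
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    # Multi-source Dijkstra: every origin starts at cost 0.
    best = {}
    heap = [(0.0, origin) for origin in origins]
    heapq.heapify(heap)
    while heap:
        cost, node = heapq.heappop(heap)
        if node in best:
            continue
        best[node] = cost
        for neighbour, step in graph.get(node, []):
            new_cost = cost + step
            if neighbour not in best and new_cost <= max_cost:
                heapq.heappush(heap, (new_cost, neighbour))

    # Assign each node to the first ring threshold that contains it.
    rings = sorted(ring_costs or [max_cost])
    result = {}
    for node, cost in best.items():
        ring = next((r for r in rings if cost <= r), None)
        result[node] = (cost, ring)
    return result


edges = [("A", "B", 4.0), ("B", "C", 5.0), ("C", "D", 5.0), ("D", "E", 10.0)]
reached = service_area(edges, ["A"], max_cost=15.0, ring_costs=[5, 10, 15])
print(reached)
```

Node E lies 24 cost units from the origin, beyond max_cost, so it is absent from the result; the remaining nodes fall into the 5, 10, and 15 rings according to their accumulated cost.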
Network Topology Audit
Function name: network_topology_audit
Experimental
Audits a line network for topology anomalies—disconnected components, dead ends, and degree anomalies—that cause routing failures.
vector network diagnostics topology
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| one_way_field | Optional line field marking one-way edges for directional analysis. | Optional | — |
| blocked_field | Optional line field marking blocked edges to exclude from analysis. | Optional | — |
| report | Optional JSON output path for the audit summary report. | Optional | — |
| output | Output point vector path for per-node diagnostics. | Required | — |
Examples
Writes per-node degree and component diagnostics and a summary JSON report for the input network.
wbe.network_topology_audit(input='network.shp', output='network_node_audit.shp', report='audit_report.json')
OD Sensitivity Analysis
Function name: od_sensitivity_analysis
Experimental
Computes OD shortest-path costs with impedance perturbations and outputs sensitivity statistics via Monte Carlo sampling.
vector network sensitivity
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| origins | Origin point layer. | Required | origins.shp |
| destinations | Destination point layer. | Required | destinations.shp |
| edge_cost_field | Required numeric line field used as an impedance multiplier for perturbation analysis. | Required | cost |
| impedance_disturbance_range | Range for cost perturbation as 'min_factor,max_factor' (e.g., '0.8,1.2' for ±20% variation). | Optional | 0.8,1.2 |
| monte_carlo_samples | Number of Monte Carlo samples for perturbation analysis (default 1, max 100). | Optional | 10 |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from origin/destination points to nearest network node. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges. | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges. | Optional | — |
| parallel_execution | If true (default), evaluates origin searches in parallel for baseline and perturbed OD runs. | Optional | — |
| output | Output CSV path with OD pairs and sensitivity statistics. | Required | — |
Examples
Computes OD costs with Monte Carlo impedance perturbation sensitivity.
wbe.od_sensitivity_analysis(destinations='destinations.shp', edge_cost_field='cost', impedance_disturbance_range='0.8,1.2', input='network.shp', monte_carlo_samples=10, origins='origins.shp', output='od_sensitivity.csv')
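The idea behind the Monte Carlo perturbation can be sketched for a single OD pair. The example assumes each edge cost is multiplied by an independent uniform factor drawn from impedance_disturbance_range in every sample; the sampling scheme and the statistics the tool actually writes are not documented here, so both are assumptions, and the function names are hypothetical.

```python
import heapq
import random
from statistics import mean, pstdev


def shortest_cost(edges, source, target):
    """Dijkstra shortest-path cost over undirected (a, b, cost) edges."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    heap, seen = [(0.0, source)], set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            return cost
        if node in seen:
            continue
        seen.add(node)
        for neighbour, step in graph.get(node, []):
            if neighbour not in seen:
                heapq.heappush(heap, (cost + step, neighbour))
    return float("inf")


def od_sensitivity(edges, source, target, disturbance_range="0.8,1.2",
                   samples=10, seed=0):
    """Re-run the shortest path with randomly scaled edge costs."""
    low, high = (float(v) for v in disturbance_range.split(","))
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    baseline = shortest_cost(edges, source, target)
    costs = []
    for _ in range(samples):
        perturbed = [(a, b, c * rng.uniform(low, high)) for a, b, c in edges]
        costs.append(shortest_cost(perturbed, source, target))
    return {
        "baseline": baseline,
        "mean": mean(costs),
        "std": pstdev(costs),
        "min": min(costs),
        "max": max(costs),
    }


edges = [("A", "B", 10.0), ("B", "C", 10.0), ("A", "C", 25.0)]
stats = od_sensitivity(edges, "A", "C")
print(stats)
```

A wide spread between min and max relative to the baseline suggests that the OD cost, and possibly the chosen route, is sensitive to uncertainty in the edge cost field.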
Shortest Path Network
Function name: shortest_path_network
Experimental
Finds the shortest path between start and end coordinates over a line network.
vector network shortest-path
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input line network layer. | Required | network.shp |
| start_x | Start x coordinate. | Required | 0.0 |
| start_y | Start y coordinate. | Required | 0.0 |
| end_x | End x coordinate. | Required | 100.0 |
| end_y | End y coordinate. | Required | 100.0 |
| snap_tolerance | Optional node snapping tolerance for graph construction. | Optional | — |
| max_snap_distance | Optional max distance from start/end coordinates to nearest network node. | Optional | — |
| edge_cost_field | Optional numeric line field used as an impedance multiplier for segment length. | Optional | — |
| one_way_field | Optional line field marking one-way digitized edges (true/1/yes means from first to second vertex only). | Optional | — |
| blocked_field | Optional line field marking blocked/closed edges to exclude from routing (true/1/yes blocks). | Optional | — |
| barriers | Optional barrier point layer; nearest network nodes are blocked from traversal. | Optional | — |
| barrier_snap_distance | Optional max distance from each barrier point to a network node for blocking. | Optional | — |
| turn_penalty | Optional additive cost applied to non-straight turns at network nodes. | Optional | — |
| u_turn_penalty | Optional additive cost applied to U-turn transitions. | Optional | — |
| forbid_u_turns | If true, disallow U-turn transitions. | Optional | — |
| forbid_left_turns | If true, disallow left-turn transitions. | Optional | — |
| forbid_right_turns | If true, disallow right-turn transitions. | Optional | — |
| turn_restrictions_csv | Optional CSV of turn transitions using columns prev_x,prev_y,node_x,node_y,next_x,next_y. Optional columns: forbidden (default true when no turn_cost column is provided) and turn_cost (or penalty/cost/extra_cost) for per-turn additive cost. | Optional | — |
| temporal_cost_profile | Optional CSV defining time-dependent edge costs (columns: edge_id,dow,start_minute,end_minute,value). | Optional | — |
| temporal_edge_id_field | Optional network field used to match temporal_cost_profile edge_id values (default EDGE_ID). | Optional | — |
| departure_time | Optional RFC3339 departure time used for temporal profile lookup. | Optional | — |
| temporal_mode | Optional temporal interpretation mode: multiplier or absolute. | Optional | — |
| temporal_fallback | Optional fallback when temporal row is missing: static_cost or error. | Optional | — |
| temporal_profile_report | Optional JSON output path for temporal profile diagnostics (coverage, unmatched edges, fallback usage). | Optional | — |
| output | Output line vector path. | Required | — |
Examples
Computes shortest path between two points on a line network.
wbe.shortest_path_network(end_x=100.0, end_y=100.0, input='network.shp', output='shortest_path.shp', start_x=0.0, start_y=0.0)
Split Lines At Intersections
Function name: split_lines_at_intersections
No help documentation available for this tool.
Snap Points To Network
Function name: snap_points_to_network
No help documentation available for this tool.
Service Area Planning And Coverage Optimization
Function name: service_area_planning_and_coverage_optimization
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Service Area Planning and Coverage Optimization
Problem It Solves
Which facilities and scenarios provide the strongest service coverage, and where do unmet demand gaps remain?
Who It Is For
- Accessibility planners, emergency coverage teams, and utility service design analysts.
Primary User
Municipal/public safety GIS teams, utilities, and logistics planners managing service-area targets.
What It Does
- Builds network-based multi-ring service-area polygons from facility points over a line network.
- Flags uncovered demand points outside baseline service coverage.
- Produces scenario summary and candidate ranking CSV outputs for open/close planning workflows.
How It Works
- Validates network/facility geometry types and parses ring costs.
- Runs network service-area generation using the OSS network engine with polygon outputs.
- Computes demand coverage and uncovered demand diagnostics against generated polygons.
- Optionally evaluates scenario CSV variants (`scenario_id,facility_id,is_open[,capacity]`) and exports comparative KPI rows.
- Indicative KPI formula: `coverage_pct = 100 * covered_demand / total_demand`.
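The scenario evaluation and the coverage KPI can be sketched together. The example assumes that is_open accepts 1/true/yes as open, that demand is a weight per demand point, and that a demand point counts as covered when it falls inside the service area of at least one open facility. These conventions, the sample data, and the function name are hypothetical and not taken from the workflow itself.

```python
import csv
import io

SCENARIOS_CSV = """scenario_id,facility_id,is_open
baseline,F1,1
baseline,F2,1
close_f2,F1,1
close_f2,F2,0
"""

# Facility -> IDs of demand points inside its service area (illustrative data).
covered_by = {"F1": {"d1", "d2"}, "F2": {"d2", "d3", "d4"}}
# Demand weight per demand point; d5 is outside every service area.
demand = {"d1": 10.0, "d2": 20.0, "d3": 30.0, "d4": 40.0, "d5": 100.0}


def coverage_by_scenario(scenarios_text, covered_by, demand):
    """Return {scenario_id: coverage_pct} using 100 * covered / total."""
    open_facilities = {}
    for row in csv.DictReader(io.StringIO(scenarios_text)):
        facilities = open_facilities.setdefault(row["scenario_id"], set())
        if row["is_open"].strip().lower() in {"1", "true", "yes"}:
            facilities.add(row["facility_id"])

    total_demand = sum(demand.values())
    result = {}
    for scenario_id, facilities in open_facilities.items():
        # A point covered by several open facilities is counted once.
        covered = set().union(*(covered_by.get(f, set()) for f in facilities))
        covered_demand = sum(demand[d] for d in covered)
        result[scenario_id] = 100.0 * covered_demand / total_demand
    return result


coverage = coverage_by_scenario(SCENARIOS_CSV, covered_by, demand)
print(coverage)
```

Closing F2 in this toy data drops coverage from 50 percent to 15 percent, which is the kind of comparison the scenario summary is meant to support.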
Why It Wins
- Replaces disconnected manual GIS steps with a reproducible network-derived coverage workflow and explicit planning artifacts.
Typical Buying Trigger
Teams need auditable facility coverage plans with scenario comparison outputs for governance or budget review.
Typical Presets
- baseline-only: generate default service areas + uncovered demand.
- scenario-planning: include open/close scenario CSV for option analysis.
- candidate-screening: rank facilities by demand coverage proxy for expansion planning.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| network | no | Input line network layer (roads, trails, utility lines). |
| facilities | no | Input facility point layer used as service origins. |
| demand_points | yes | Optional demand point layer used for covered/uncovered diagnostics and KPI generation. |
| ring_costs | no | Numeric array of travel-cost ring thresholds (e.g., [5, 10, 15]). |
| scenarios | yes | Optional CSV with scenario_id,facility_id,is_open[,capacity] for open/close scenario runs. |
| service_areas | no | Output vector path for service-area polygons. |
| uncovered_demand | no | Output vector path for uncovered demand points. |
| scenario_summary_csv | no | Output CSV path for scenario KPIs. |
| ranked_candidates_csv | no | Output CSV path for candidate ranking metrics. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| service_areas | GeoJSON/GeoPackage/Shapefile | Baseline network-derived multi-ring service-area polygons. |
| uncovered_demand | GeoJSON/GeoPackage/Shapefile | Demand points outside baseline service-area coverage. |
| scenario_summary_csv | CSV | Scenario-level KPI table (scenario_id,total_demand_covered_pct,avg_accessibility,outlier_count). |
| ranked_candidates_csv | CSV | Candidate ranking table (candidate_id,coverage_gain_pct,avg_distance_improvement,rank). |
Python Example
```python
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")
result = wbe.service_area_planning_and_coverage_optimization(
    network="data/street_network.shp",
    facilities="data/facilities.shp",
    demand_points="data/demand_points.shp",
    ring_costs=[5.0, 10.0, 15.0],
    scenarios="data/service_scenarios.csv",
    service_areas="output/service_areas.geojson",
    uncovered_demand="output/uncovered_demand.geojson",
    scenario_summary_csv="output/scenario_summary.csv",
    ranked_candidates_csv="output/ranked_candidates.csv",
)
print(result)
```
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Transfer Attributes
Function name: transfer_attributes
No help documentation available for this tool.
Travelling Salesman Problem
Function name: travelling_salesman_problem
This tool finds approximate solutions to travelling salesman problems, where the goal is to identify the shortest route connecting a set of locations. The tool applies a 2-opt heuristic, with a 3-opt heuristic as a fall-back if the initial approach takes too long. The user must specify the names of the input points vector (input) and output lines vector file (output), as well as the duration, in seconds, over which the algorithm is allowed to search for improved solutions (duration). The tool searches in parallel to find better solutions.
Python API
def travelling_salesman_problem(self, input: Vector, duration: int = 60) -> Vector:
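To make the 2-opt idea concrete, the sketch below reverses tour segments whenever doing so shortens a closed tour. It is a plain-Python toy under stated assumptions (Euclidean distances, hypothetical helper names `tour_length` and `two_opt`), not the tool's internal implementation, which also uses a 3-opt fallback, a time budget, and parallel search.

```python
import math

def tour_length(points, order):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def two_opt(points, order):
    """Apply 2-opt moves until no segment reversal shortens the tour."""
    best = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the visiting order between positions i and j.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(points, candidate) < tour_length(points, best) - 1e-12:
                    best = candidate
                    improved = True
    return best

# Corners of a unit square visited in a self-crossing order.
square = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
start = [0, 1, 2, 3]
result = two_opt(square, start)
print(tour_length(square, start), tour_length(square, result))
```

The crossing diagonals in the starting order are removed by a single reversal, leaving the square's perimeter as the tour.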
Vehicle Routing CVRP
Function name: vehicle_routing_cvrp
Experimental
Builds capacity-constrained multi-depot delivery routes with heterogeneous fleet controls, objective modes, and optional local optimization.
vector network routing optimization
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| network | Input line network layer (validated for contract parity). | Required | network.gpkg |
| depot_points | Depot point layer; each point can contribute one or more vehicles. | Required | depots.gpkg |
| stop_points | Delivery stop point layer. | Required | stops.gpkg |
| demand_field | Numeric demand field in stop_points (default: demand). | Optional | demand |
| priority_field | Optional stop priority field using values like required/high/normal/low or numeric ranks. | Optional | priority |
| allowed_vehicle_profiles_field | Optional stop field listing compatible vehicle profiles (comma/semicolon/pipe-delimited). | Optional | — |
| allowed_route_classes_field | Optional alias of allowed_vehicle_profiles_field for route-class compatibility rules. | Optional | — |
| depot_id_field | Optional depot ID field used in route/assignment outputs. | Optional | — |
| vehicle_count_field | Optional depot field for number of vehicles spawned at each depot. | Optional | — |
| vehicle_capacity_field | Optional depot field overriding vehicle_capacity per depot/vehicle template. | Optional | — |
| vehicle_fixed_cost_field | Optional depot field overriding vehicle_fixed_cost per depot/vehicle template. | Optional | — |
| travel_speed_field | Optional depot field overriding travel_speed per depot/vehicle template. | Optional | — |
| max_route_distance_field | Optional depot field overriding max_route_distance per depot/vehicle template. | Optional | — |
| max_route_time_field | Optional depot field overriding max_route_time per depot/vehicle template. | Optional | — |
| vehicle_profile_field | Optional depot field defining vehicle profile/category token used for stop compatibility. | Optional | — |
| vehicle_route_class_field | Optional alias of vehicle_profile_field for route-class compatibility rules. | Optional | — |
| vehicle_capacity | Per-vehicle capacity (> 0). | Required | 100.0 |
| vehicle_fixed_cost | Optional fixed cost charged per dispatched vehicle/route (default: 0). | Optional | 0.0 |
| max_vehicles | Optional maximum number of vehicles/routes to construct. | Optional | — |
| max_route_distance | Optional maximum travel distance per route, including return to depot. | Optional | — |
| travel_speed | Travel speed in coordinate-units per time unit (default: 1). | Optional | — |
| max_route_time | Optional maximum route duration in model time units, including return to depot. | Optional | — |
| max_stops_per_vehicle | Optional maximum number of stops assigned to each vehicle route. | Optional | — |
| objective_mode | Route-construction objective: minimize_distance, minimize_vehicles, or minimize_cost. | Optional | minimize_distance |
| apply_local_optimization | When true, applies a deterministic 2-opt local improvement pass to each constructed route (default: true). | Optional | True |
| apply_simulated_annealing | When true, applies a seeded simulated annealing refinement pass per route after greedy/local optimization (default: false). | Optional | False |
| sa_iterations | Maximum simulated annealing iterations per route when apply_simulated_annealing=true (default: 1500). | Optional | 1500 |
| sa_initial_temperature | Initial simulated annealing temperature (> 0, default: 1.0). | Optional | 1.0 |
| sa_cooling_rate | Simulated annealing cooling multiplier in (0, 1); default 0.995. | Optional | 0.995 |
| sa_seed | Optional deterministic random seed for simulated annealing (default: 42). | Optional | 42 |
| output | Output route line vector path. | Required | — |
| assignment_output | Optional stop assignment point output with visit order/load diagnostics. | Optional | — |
Examples
Builds CVRP routes and writes route lines with deterministic local optimization and optional simulated annealing controls.
wbe.vehicle_routing_cvrp(apply_local_optimization=True, apply_simulated_annealing=False, demand_field='demand', depot_points='depots.gpkg', network='network.gpkg', objective_mode='minimize_distance', output='cvrp_routes.gpkg', priority_field='priority', sa_cooling_rate=0.995, sa_initial_temperature=1.0, sa_iterations=1500, sa_seed=42, stop_points='stops.gpkg', vehicle_capacity=100.0, vehicle_fixed_cost=0.0)
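For intuition about how sa_initial_temperature, sa_cooling_rate, and sa_iterations interact, the following sketch assumes a geometric cooling schedule (temperature multiplied by the cooling rate once per iteration) and the standard Metropolis acceptance rule. Both are common simulated annealing conventions and are assumptions here; the manual does not state the tool's exact update rule. The helper names are hypothetical.

```python
import math
import random

def temperature(initial, cooling_rate, iteration):
    """Geometric cooling: temperature after the given number of iterations."""
    return initial * cooling_rate ** iteration

def accept(delta, temp, rng):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temp)

rng = random.Random(42)  # mirrors the documented sa_seed default
final_temp = temperature(1.0, 0.995, 1500)
print(f"temperature after 1500 iterations: {final_temp:.6f}")
```

Under these assumptions the default settings end at a temperature several orders of magnitude below the start, so late iterations accept almost no worsening moves.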
Vehicle Routing Pickup Delivery
Function name: vehicle_routing_pickup_delivery
Experimental
Builds paired pickup-delivery routes with precedence and capacity constraints using a deterministic nearest-neighbour baseline.
vector network routing optimization pickup-delivery
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| network | Input line network layer (validated for contract parity). | Required | network.gpkg |
| depot_points | Depot point layer; first point is used as the active depot in this baseline implementation. | Required | depots.gpkg |
| stop_points | Stop point layer containing paired pickup and delivery records. | Required | stops.gpkg |
| request_id_field | Request identifier field in stop_points used to pair pickup and delivery records (default: request_id). | Optional | request_id |
| stop_type_field | Stop type field in stop_points containing pickup/delivery labels (default: stop_type). | Optional | stop_type |
| demand_field | Numeric demand field in stop_points; pickup demand is loaded and delivered demand is ignored (default: demand). | Optional | demand |
| vehicle_capacity | Per-vehicle capacity (> 0). | Required | 100.0 |
| max_vehicles | Optional maximum number of vehicles/routes to construct. | Optional | — |
| output | Output route line vector path. | Required | — |
| assignment_output | Optional stop assignment point output with request and precedence diagnostics. | Optional | — |
Examples
Builds baseline pickup-delivery routes and writes route lines.
wbe.vehicle_routing_pickup_delivery(demand_field='demand', depot_points='depots.gpkg', network='network.gpkg', output='pickup_delivery_routes.gpkg', request_id_field='request_id', stop_points='stops.gpkg', stop_type_field='stop_type', vehicle_capacity=100.0)
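The precedence and capacity constraints can be checked on any candidate stop sequence with a few lines of Python. The sketch below is an illustrative validator, not part of the tool; the `check_route` helper and the tuple layout are hypothetical. It follows the documented convention that pickup demand is loaded and the demand value on the delivery record is ignored.

```python
def check_route(stops, capacity):
    """Return a list of problems found in one ordered route.

    stops: (request_id, stop_type, demand) tuples, where stop_type is
    'pickup' or 'delivery'.
    """
    problems = []
    on_board = {}  # request_id -> demand currently carried
    load = 0.0
    for request_id, stop_type, demand in stops:
        if stop_type == "pickup":
            on_board[request_id] = demand
            load += demand
            if load > capacity:
                problems.append(f"capacity exceeded at pickup {request_id}")
        elif request_id in on_board:
            # Unload what was picked up; the delivery row's demand is ignored.
            load -= on_board.pop(request_id)
        else:
            problems.append(f"delivery {request_id} precedes its pickup")
    problems.extend(f"request {r} never delivered" for r in on_board)
    return problems

good = [("A", "pickup", 60), ("A", "delivery", 60),
        ("B", "pickup", 50), ("B", "delivery", 50)]
bad = [("A", "pickup", 60), ("B", "pickup", 50),
       ("B", "delivery", 50), ("A", "delivery", 60)]
print(check_route(good, 100), check_route(bad, 100))
```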
Vehicle Routing VRPTW
Function name: vehicle_routing_vrptw
Experimental
Builds capacity-constrained multi-depot VRPTW routes with heterogeneous fleet settings, break windows, and objective-mode controls.
vector network routing optimization time-window
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| network | Input line network layer (validated for contract parity). | Required | network.gpkg |
| depot_points | Depot point layer; each point can contribute one or more vehicles. | Required | depots.gpkg |
| stop_points | Delivery stop point layer with demand and time-window fields. | Required | stops.gpkg |
| demand_field | Numeric demand field in stop_points (default: demand). | Optional | demand |
| priority_field | Optional stop priority field using values like required/high/normal/low or numeric ranks. | Optional | priority |
| allowed_vehicle_profiles_field | Optional stop field listing compatible vehicle profiles (comma/semicolon/pipe-delimited). | Optional | — |
| allowed_route_classes_field | Optional alias of allowed_vehicle_profiles_field for route-class compatibility rules. | Optional | — |
| tw_start_field | Numeric time-window start field in stop_points (default: tw_start). | Optional | tw_start |
| tw_end_field | Numeric time-window end field in stop_points (default: tw_end). | Optional | tw_end |
| service_time_field | Numeric per-stop service time field in stop_points (default: service_time). | Optional | service_time |
| depot_id_field | Optional depot ID field used in route/assignment outputs. | Optional | — |
| vehicle_count_field | Optional depot field for number of vehicles spawned at each depot. | Optional | — |
| vehicle_capacity_field | Optional depot field overriding vehicle_capacity per depot/vehicle template. | Optional | — |
| vehicle_fixed_cost_field | Optional depot field overriding vehicle_fixed_cost per depot/vehicle template. | Optional | — |
| travel_speed_field | Optional depot field overriding travel_speed per depot/vehicle template. | Optional | — |
| max_route_distance_field | Optional depot field overriding max_route_distance per depot/vehicle template. | Optional | — |
| max_route_time_field | Optional depot field overriding max_route_time per depot/vehicle template. | Optional | — |
| vehicle_profile_field | Optional depot field defining vehicle profile/category token used for stop compatibility. | Optional | — |
| vehicle_route_class_field | Optional alias of vehicle_profile_field for route-class compatibility rules. | Optional | — |
| depot_close_time_field | Optional depot field overriding depot_close_time per depot/vehicle template. | Optional | — |
| break_start_field | Optional depot field overriding break_start_time per depot/vehicle template. | Optional | — |
| break_end_field | Optional depot field overriding break_end_time per depot/vehicle template. | Optional | — |
| break_duration_field | Optional depot field overriding break_duration per depot/vehicle template. | Optional | — |
| vehicle_capacity | Per-vehicle capacity (> 0). | Required | 100.0 |
| vehicle_fixed_cost | Optional fixed cost charged per dispatched vehicle/route (default: 0). | Optional | 0.0 |
| start_time | Route start time in model time units (default: 0). | Optional | 0.0 |
| travel_speed | Travel speed in coordinate-units per time unit (default: 1). | Optional | 1.0 |
| enforce_time_windows | When true, only stops with lateness … | Optional | False |
| allowed_lateness | Maximum lateness tolerated when enforce_time_windows=true (default: 0). | Optional | 0.0 |
| depot_close_time | Optional hard close time by which each route must return to depot. | Optional | — |
| break_start_time | Optional global break-window start time for all vehicles. | Optional | — |
| break_end_time | Optional global break-window end time for all vehicles. | Optional | — |
| break_duration | Optional global break duration applied once per route when break window is intersected. | Optional | — |
| use_priority_scoring | When true, ranks feasible candidates by projected lateness/slack before travel distance; when false, uses nearest-neighbour baseline (default: true). | Optional | True |
| max_vehicles | Optional maximum number of vehicles/routes to construct. | Optional | — |
| max_route_distance | Optional maximum route travel distance, including return to depot. | Optional | — |
| max_route_time | Optional maximum route duration in model time units, including return to depot. | Optional | — |
| max_stops_per_vehicle | Optional maximum number of stops assigned to each vehicle route. | Optional | — |
| objective_mode | Route-construction objective: minimize_lateness, minimize_distance, minimize_vehicles, or minimize_cost. | Optional | minimize_lateness |
| output | Output route line vector path. | Required | — |
| assignment_output | Optional stop assignment point output with time-window diagnostics. | Optional | — |
Examples
Builds baseline VRPTW routes and reports time-window diagnostics.
wbe.vehicle_routing_vrptw(allowed_lateness=0.0, demand_field='demand', depot_points='depots.gpkg', enforce_time_windows=False, network='network.gpkg', objective_mode='minimize_lateness', output='vrptw_routes.gpkg', priority_field='priority', service_time_field='service_time', start_time=0.0, stop_points='stops.gpkg', travel_speed=1.0, tw_end_field='tw_end', tw_start_field='tw_start', use_priority_scoring=True, vehicle_capacity=100.0, vehicle_fixed_cost=0.0)
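The time-window fields are easier to reason about with a small worked schedule. The sketch below walks one route and reports arrival, service start, and lateness per stop. It assumes straight-line travel (the tool routes over the network), waiting when a vehicle arrives before tw_start, and lateness measured as arrival time past tw_end; these are illustrative assumptions, and `schedule` is a hypothetical helper rather than part of the API.

```python
import math

def schedule(depot, stops, start_time=0.0, travel_speed=1.0):
    """Walk a route and report arrival, service start, and lateness per stop.

    stops: ordered dicts with x, y, tw_start, tw_end, and service_time keys.
    """
    clock = start_time
    here = depot
    rows = []
    for stop in stops:
        there = (stop["x"], stop["y"])
        clock += math.dist(here, there) / travel_speed
        arrival = clock
        begin = max(arrival, stop["tw_start"])        # wait if early
        lateness = max(0.0, arrival - stop["tw_end"])  # zero when on time
        rows.append({"arrival": arrival, "begin": begin, "lateness": lateness})
        clock = begin + stop["service_time"]
        here = there
    return rows

route = [
    {"x": 3.0, "y": 4.0, "tw_start": 10.0, "tw_end": 20.0, "service_time": 2.0},
    {"x": 3.0, "y": 10.0, "tw_start": 0.0, "tw_end": 15.0, "service_time": 1.0},
]
rows = schedule((0.0, 0.0), route)
for row in rows:
    print(row)
```

In this example the vehicle waits at the first stop and, as a result, reaches the second stop after its window has closed.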
Linear Referencing
Linear referencing (LRS) is a data model where features are located along routes by a measured distance from a known origin — rather than by absolute X/Y coordinates. It is the foundation of road/rail inventory, pipeline inspection data, accident records, and pavement condition databases.
WbW-QGIS provides tools for building measure fields on route networks, locating point and line events along routes, and exporting event geometries for spatial analysis.
Key Concepts
- Route: A polyline feature with a unique, stable route identifier (ROUTE_ID) and a monotonically increasing measure value (M-value) along its length.
- M-value: The accumulated distance (or time, or post number) from the route origin to each vertex. Stored as the M coordinate in an MZ geometry.
- Event: A point or interval on a route located by one measure (point event) or two measures (line event: from-measure, to-measure).
- Event table: A tabular record set with ROUTE_ID, MEASURE (point) or FROM_M/TO_M (line), and any associated attributes.
- Dynamic segmentation: The process of converting event tables to geometry by interpolating measure positions along routes.
- Calibration: Adjusting M-values to match real-world control points — for example, aligning stationing to kilometre posts.
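As a minimal sketch of these concepts, the code below builds cumulative M-values for a route and interpolates the XY position of a point event at a given measure. It assumes planar coordinates and measures equal to cumulative distance; the function names are hypothetical and are not WbW-QGIS API calls.

```python
import math

def cumulative_measures(vertices):
    """M-value (distance from the route origin) at every vertex."""
    measures = [0.0]
    for a, b in zip(vertices, vertices[1:]):
        measures.append(measures[-1] + math.dist(a, b))
    return measures

def point_at_measure(vertices, measures, m):
    """Interpolate the XY position of a point event at measure m."""
    if not measures[0] <= m <= measures[-1]:
        raise ValueError(f"measure {m} is outside the route range")
    for a, b, m0, m1 in zip(vertices, vertices[1:], measures, measures[1:]):
        if m <= m1:
            t = 0.0 if m1 == m0 else (m - m0) / (m1 - m0)
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    return vertices[-1]

route = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
ms = cumulative_measures(route)
event_xy = point_at_measure(route, ms, 120.0)
print(ms, event_xy)
```

A measure of 120 falls 20 units into the second segment, so the event is placed part-way up the vertical leg of the route.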
End-to-End Workflow: Locating Inspection Events Along a Road Network
This workflow builds measure fields on a road network, then locates a set of field inspection points as events on their respective routes.
Inputs
| Layer | Format | Notes |
|---|---|---|
| roads.shp | Polyline vector | Road centrelines, unique ROUTE_ID field |
| inspections.csv | CSV table | Columns: ROUTE_ID, CHAINAGE_M, CONDITION |
Step 1 — Add Cumulative Distance (Measure) Field
WbW computes the cumulative distance from each route's start vertex to every subsequent vertex and writes it as the M coordinate.
Processing Toolbox → Whitebox Workflows → Vector Analysis →
Add Geometry Attributes
| Parameter | Recommended value |
|---|---|
| Input vector | roads.shp |
| Units | Metres |
| Output | roads_geom.shp |
This step appends a LENGTH attribute to each segment. To build full route
M-values, first merge segments by ROUTE_ID, then use the QGIS Set M Value
tool (Vector Geometry group).
Alternative — set M values from a field:
Processing Toolbox → Vector Geometry → Set M Value (QGIS native)
| Parameter | Recommended value |
|---|---|
| Input layer | roads.shp (merged per route) |
| M value | Expression: $length (QGIS expression — cumulative along merged route) |
| Output | routes_m.shp |
Route Calibration and Recalibration
Measures are only useful when anchored to real-world control points such as kilometre posts or survey stations. If your routes lack calibration or have been edited, use these tools to establish stable, field-verified measures.
Calibrate Routes from Control Points
Processing Toolbox → Whitebox Workflows → Linear Referencing →
Route Calibrate
| Parameter | Value |
|---|---|
| Input routes | roads.shp (with ROUTE_ID field) |
| Control points | km_posts.shp (with ROUTE_ID and KNOWN_MEASURE fields) |
| Control measure field | KNOWN_MEASURE |
| Snap tolerance | 10.0 (metres) |
| Output | routes_calibrated.shp |
Output adds FROM_MEASURE and TO_MEASURE fields containing the calibrated values.
Recalibrate After Route Edits
If you split, merge, or redraw routes, use recalibration to scale measures proportionally:
Processing Toolbox → Whitebox Workflows → Linear Referencing →
Route Recalibrate
| Parameter | Value |
|---|---|
| Original routes | routes_calibrated.shp (reference with valid measures) |
| Edited routes | routes_edited.shp (after geometric changes) |
| Output | routes_recalibrated.shp |
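Proportional scaling can be illustrated with a one-line linear mapping. The sketch below assumes that recalibration rescales each measure by the ratio of the new route range to the original range; the exact rule used by Route Recalibrate is not documented here, and `recalibrate` is a hypothetical helper.

```python
def recalibrate(measure, old_from, old_to, new_from, new_to):
    """Map a measure from the original route range onto the edited route range."""
    if old_to == old_from:
        raise ValueError("original route has zero measure length")
    fraction = (measure - old_from) / (old_to - old_from)
    return new_from + fraction * (new_to - new_from)

# Original route measured 0 to 1000 m; after realignment it measures 0 to 1050 m.
old_events = [0.0, 250.0, 1000.0]
new_events = [recalibrate(m, 0.0, 1000.0, 0.0, 1050.0) for m in old_events]
print(new_events)
```

Events keep their relative position along the route: an event a quarter of the way along the original route stays a quarter of the way along the edited one.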
Step 2 — Validate Route Identifiers
Route IDs must be unique per route and stable across updates. Check for duplicates.
QGIS → Open Attribute Table → Field Calculator or via the Python Console:
from collections import Counter
from qgis.core import QgsVectorLayer
layer = QgsVectorLayer('/data/routes_m.shp', 'routes', 'ogr')
# Count each ROUTE_ID in one pass instead of rescanning the list per feature.
counts = Counter(f['ROUTE_ID'] for f in layer.getFeatures())
duplicates = {route_id for route_id, n in counts.items() if n > 1}
if duplicates:
    print(f"Duplicate route IDs found: {duplicates}")
else:
    print("All route IDs are unique.")
Step 3 — Locate Point Events (Dynamic Segmentation)
Processing Toolbox → Whitebox Workflows → Linear Referencing →
Locate Point Events
| Parameter | Recommended value |
|---|---|
| Input routes | routes_m.shp |
| Event table | inspections.csv |
| Route ID field (routes) | ROUTE_ID |
| Route ID field (events) | ROUTE_ID |
| Measure field | CHAINAGE_M |
| Output | inspection_points.shp |
Each CSV row becomes a point geometry placed at the corresponding measure position on its route. Rows with unmatched route IDs or out-of-range measures are written to an error table.
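Before running the tool, it can help to predict which rows will land in the error table. The sketch below pre-screens an event CSV against known route lengths, using only the standard library. The `split_events` helper, the `route_lengths` dictionary, and the reason strings are hypothetical; the tool's own error table format is not shown in this manual.

```python
import csv
import io

def split_events(csv_text, route_lengths,
                 route_field="ROUTE_ID", measure_field="CHAINAGE_M"):
    """Separate locatable rows from rows that would be rejected."""
    located, errors = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        length = route_lengths.get(row[route_field])
        try:
            measure = float(row[measure_field])
        except ValueError:
            errors.append((row, "measure is not numeric"))
            continue
        if length is None:
            errors.append((row, "unmatched route ID"))
        elif not 0.0 <= measure <= length:
            errors.append((row, "measure out of range"))
        else:
            located.append(row)
    return located, errors

sample = "ROUTE_ID,CHAINAGE_M,CONDITION\nR1,120.5,good\nR9,10,poor\nR1,9999,fair\n"
located, errors = split_events(sample, {"R1": 1500.0})
print(len(located), [reason for _, reason in errors])
```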
Step 4 — Locate Line Events (Optional)
If the inspection table records intervals (e.g. pavement condition rated over 100 m segments):
Processing Toolbox → Whitebox Workflows → Linear Referencing →
Locate Line Events
| Parameter | Recommended value |
|---|---|
| Input routes | routes_m.shp |
| Event table | pavement.csv |
| Route ID field (routes) | ROUTE_ID |
| Route ID field (events) | ROUTE_ID |
| From-measure field | FROM_M |
| To-measure field | TO_M |
| Output | pavement_segments.shp |
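Conceptually, a line event is the portion of the route between its from-measure and to-measure. The sketch below extracts that portion from a plain vertex list, assuming planar coordinates, measures equal to cumulative distance, and 0 <= from_m <= to_m <= route length. The helper names are hypothetical and do not reflect the tool's internals.

```python
import math

def locate(vertices, m):
    """XY position at measure m, where measures are cumulative distances."""
    walked = 0.0
    for a, b in zip(vertices, vertices[1:]):
        seg = math.dist(a, b)
        if seg > 0 and m <= walked + seg:
            t = (m - walked) / seg
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        walked += seg
    return vertices[-1]

def line_event(vertices, from_m, to_m):
    """Vertices of the route portion between from_m and to_m."""
    part = [locate(vertices, from_m)]
    walked = 0.0
    for a, b in zip(vertices, vertices[1:]):
        walked += math.dist(a, b)
        if from_m < walked < to_m:
            part.append(b)  # keep route vertices that fall inside the interval
    part.append(locate(vertices, to_m))
    return part

route = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
segment = line_event(route, 50.0, 150.0)
print(segment)
```

The resulting segment keeps the route's corner vertex, so the event geometry follows the centreline rather than cutting across it.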
Step 5 — Inspect and Validate Event Geometry
Load inspection_points.shp in QGIS. Pan to several known inspection records
and confirm point positions against the road centreline.
Use the QGIS Identify tool to click a point and verify that CHAINAGE_M
matches the M-value of the nearest route vertex within acceptable tolerance
(typically ± half the route vertex spacing).
Python Console Equivalent
import processing
# Step 3: locate point events
processing.run('whitebox_workflows:locate_point_events', {
    'routes': '/data/routes_m.shp',
    'events': '/data/inspections.csv',
    'route_id_field': 'ROUTE_ID',
    'event_route_id_field': 'ROUTE_ID',
    'measure_field': 'CHAINAGE_M',
    'output': '/data/inspection_points.shp',
})
# Step 4: locate line events
processing.run('whitebox_workflows:locate_line_events', {
    'routes': '/data/routes_m.shp',
    'events': '/data/pavement.csv',
    'route_id_field': 'ROUTE_ID',
    'event_route_id_field': 'ROUTE_ID',
    'from_measure_field': 'FROM_M',
    'to_measure_field': 'TO_M',
    'output': '/data/pavement_segments.shp',
})
print("Linear referencing complete.")
Advanced: Calibrate Routes Against Control Points
If field-collected kilometre posts differ from computed cumulative distance, calibrate M-values by interpolating between control points.
Processing Toolbox → Whitebox Workflows → Linear Referencing →
Calibrate Route
| Parameter | Recommended value |
|---|---|
| Input routes | routes_m.shp |
| Calibration points | km_posts.shp (with ROUTE_ID and KNOWN_M fields) |
| Route ID field | ROUTE_ID |
| Measure field | KNOWN_M |
| Search tolerance (m) | 50 |
| Output | routes_calibrated.shp |
After calibration, re-run Locate Point Events on routes_calibrated.shp to
position events against field-verified measures.
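Interpolating between control points amounts to a piecewise-linear mapping from computed measures to known measures. The sketch below shows that mapping for one route, assuming each control point has already been snapped to the route and paired with its computed cumulative distance. The `calibrate` helper and the sample values are hypothetical; the tool's actual interpolation and extrapolation rules may differ.

```python
from bisect import bisect_right

def calibrate(raw_m, controls):
    """Convert a computed measure to a calibrated one.

    controls: (raw_measure, known_measure) pairs for one route, sorted by
    raw_measure. Values outside the control range are extrapolated from the
    nearest pair of control points.
    """
    if len(controls) < 2:
        raise ValueError("at least two control points are needed per route")
    raws = [raw for raw, _ in controls]
    # Index of the control interval containing raw_m, clamped to valid pairs.
    i = min(max(bisect_right(raws, raw_m) - 1, 0), len(controls) - 2)
    (r0, k0), (r1, k1) = controls[i], controls[i + 1]
    return k0 + (raw_m - r0) * (k1 - k0) / (r1 - r0)

# Kilometre posts: computed distance vs. posted measure (both in metres).
posts = [(0.0, 0.0), (980.0, 1000.0), (2010.0, 2000.0)]
calibrated = calibrate(490.0, posts)
print(calibrated)
```

A single control point cannot define a slope, which is why the sketch refuses fewer than two points per route.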
Common Pitfalls
| Problem | Likely cause | Fix |
|---|---|---|
| Events do not locate (0 features output) | Route ID field names do not match | Check both ROUTE_ID parameter values match the actual field names |
| Events placed far from expected position | M-values in event table use different units | Confirm both routes and events use the same unit (metres vs km) |
| Out-of-range events produce no error output | Measure > route end measure | Check that FROM_M/TO_M do not exceed the route total length |
| Calibration shifts all events uniformly | Only one control point per route | Add at least two control points per route for interpolation |
| Duplicate route IDs cause incorrect event assignment | Merged route has repeated IDs | Dissolve route features by ROUTE_ID before building M-values |
Validation Checklist
- Route IDs are unique per route with no duplicates.
- M-values are monotonically increasing along each route (no reversals).
- Event table measures fall within the range [0, route total length].
- Located point events visually snap to the correct road centreline.
- Line event segment lengths match TO_M - FROM_M within 0.1 m.
- Error table from Locate Events contains zero unmatched records.
Linear Referencing — Tool Reference
Snap Events To Routes
Function name: snap_events_to_routes
No help documentation available for this tool.
Route Event Governance For Linear Assets
Function name: route_event_governance_for_linear_assets
PRO Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Route Event Governance
Who It Is For
- Departments of Transportation managing pavement/paving event datasets.
- Pipeline operators maintaining linear segment datasets (coating, pressure class, material type).
- Rail and powerline operators tracking inspection or maintenance event inventories.
- GIS production teams responsible for LRS data quality before system integration.
Primary User
DOTs, pipeline operators, rail/powerline operators, and telecom managers.
What It Does
- Validates route event datasets (line segments with from/to measures) against production governance rules.
- Detects overlapping events (events sharing measure ranges on the same route).
- Detects measure gaps (discontinuities between sequential events).
- Detects descending intervals (to_measure less than from_measure).
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| events | Vector path | Required | Event layer (GeoPackage, GeoJSON, Shapefile) with route ID and from/to measure fields |
| route_id_field | string | Required | Field name containing route identifiers |
| from_measure_field | string | Required | Field name for interval start measure |
| to_measure_field | string | Required | Field name for interval end measure |
| gap_tolerance | float | Optional | Gaps smaller than this value are not flagged (default 0.0) |
| overlap_tolerance | float | Optional | Overlaps smaller than this value are not flagged (default 0.0) |
| auto_fix | bool | Optional | Enable auto-correction of descending intervals and trimming of overlaps (default false) |
| domain_rules_json | path | Optional | JSON file defining per-field validation rules with allowed_values, regex, min, and max checks |
| governed_events | vector path | Required | Output path for QA-passed events with GOVERNANCE_STATUS and CORRECTIONS attributes |
| issues_csv | path | Required | Output CSV path for per-event issue log |
| corrected_events | vector path | Optional | Output path for auto-corrected events (only written when auto_fix=true) |
| governance_report | path | Required | Output JSON path for governance summary report |
| remediation_queue_csv | path | Optional | Optional prioritized remediation queue with recommended corrective action |
Outputs
| Output | Type | Contents |
|---|---|---|
| governed_events | Vector | QA-passed events; attributes include GOVERNANCE_STATUS ("PASSED"/"CORRECTED") and CORRECTIONS (correction type or "none") |
| issues_csv | CSV | Per-violation log: event_id, route_id, rule_violated, severity, description, measure_start, measure_end |
| corrected_events | Vector | Only written when auto_fix=true; lists corrected events with CORR_TYPE, ORIG_FROM, ORIG_TO, CORR_FROM, CORR_TO columns |
| governance_report | JSON | Summary: total_events, passed_events, failed_events, pass_rate_percent, rules_violated, severity_distribution, correctable_count, domain_rules_applied |
| remediation_queue_csv | CSV | Prioritized queue with rule category, severity, and recommended corrective action |
Python Example
`import json
from whitebox_workflows import WbEnvironment
env = WbEnvironment(license_tier="pro")
result = env.run_tool(
    "route_event_governance_for_linear_assets",
    events="pavement_events.gpkg",
    route_id_field="ROUTE_ID",
    from_measure_field="FROM_MEAS",
    to_measure_field="TO_MEAS",
    gap_tolerance=0.5,
    overlap_tolerance=0.1,
    auto_fix=True,
    domain_rules_json="output/route_event_rules.json",
    governed_events="output/governed_events.gpkg",
    issues_csv="output/issues.csv",
    corrected_events="output/corrected_events.gpkg",
    governance_report="output/governance_report.json",
    remediation_queue_csv="output/remediation_queue.csv",
)
# Summarize the governance report written by the tool.
with open(result["governance_report"]) as fh:
    report = json.load(fh)
print(f"Pass rate: {report['pass_rate_percent']:.1f}% "
      f"({report['passed_events']}/{report['total_events']} events)")`
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Locate Points Along Routes
Function name: locate_points_along_routes
Experimental
Locates point features along route lines and writes route-measure attributes.
vector linear-referencing routes points
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| routes | Input route line layer. | Required | routes.shp |
| points | Input point layer to locate along routes. | Required | events.shp |
| max_offset_distance | Optional maximum point-to-route offset distance. | Optional | — |
| output | Output point vector path. | Required | — |
Examples
Adds route-measure attributes to points by locating them on the nearest route.
wbe.locate_points_along_routes(max_offset_distance=25.0, output='located_points.shp', points='events.shp', routes='routes.shp')
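The underlying geometry is a point-to-polyline projection: find the closest position on the route, then report its measure and offset. The sketch below shows that calculation in plain Python, assuming planar coordinates and that points beyond max_offset_distance are left unlocated. That treatment, and the `locate_point` helper itself, are assumptions for illustration.

```python
import math

def locate_point(route, point, max_offset_distance=None):
    """Return (measure, offset) of the closest position on the route, or None."""
    best = None
    walked = 0.0
    for a, b in zip(route, route[1:]):
        dx, dy = b[0] - a[0], b[1] - a[1]
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue
        # Fraction along the segment of the perpendicular foot, clamped to [0, 1].
        t = ((point[0] - a[0]) * dx + (point[1] - a[1]) * dy) / seg_len**2
        t = min(1.0, max(0.0, t))
        foot = (a[0] + t * dx, a[1] + t * dy)
        offset = math.dist(point, foot)
        if best is None or offset < best[1]:
            best = (walked + t * seg_len, offset)
        walked += seg_len
    if best and max_offset_distance is not None and best[1] > max_offset_distance:
        return None
    return best

route = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
hit = locate_point(route, (40.0, 3.0), max_offset_distance=25.0)
miss = locate_point(route, (40.0, 60.0), max_offset_distance=25.0)
print(hit, miss)
```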
Points Along Lines
Function name: points_along_lines
Experimental
Creates regularly spaced point features along input line geometries.
vector points lines
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input line layer. | Required | lines.shp |
| spacing | Spacing distance between points. | Required | 50.0 |
| include_end | Include line endpoints (default true). | Optional | True |
| output | Output point vector path. | Required | — |
Examples
Creates points at fixed spacing along each line.
wbe.points_along_lines(include_end=True, input='lines.shp', output='points_along_lines.shp', spacing=50.0)
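The effect of spacing and include_end can be previewed with a short sketch. It assumes points start at the first vertex and that include_end appends the final vertex when the last regular point does not already coincide with it; the tool's exact endpoint handling is not specified in this manual, and `points_along_line` is a hypothetical helper.

```python
import math

def points_along_line(vertices, spacing, include_end=True):
    """Points every `spacing` units along a polyline, starting at its first vertex."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = [vertices[0]]
    next_at = spacing  # distance along the line of the next point
    walked = 0.0
    for a, b in zip(vertices, vertices[1:]):
        seg = math.dist(a, b)
        while seg > 0 and next_at <= walked + seg:
            t = (next_at - walked) / seg
            points.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
            next_at += spacing
        walked += seg
    if include_end and points[-1] != vertices[-1]:
        points.append(vertices[-1])
    return points

line = [(0.0, 0.0), (120.0, 0.0)]
pts = points_along_line(line, 50.0)
pts_no_end = points_along_line(line, 50.0, include_end=False)
print(pts)
```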
Route Calibrate
Function name: route_calibrate
Experimental
Calibrates route start/end measures from control points with known measures.
vector linear-referencing calibration
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| routes | Input route line layer. | Required | routes.gpkg |
| control_points | Input control-point layer. | Required | control_points.gpkg |
| control_measure_field | Control-point field containing known measure values. | Required | measure |
| route_id_field | Optional route identifier field in routes. Defaults to feature FID. | Optional | — |
| control_route_id_field | Optional route identifier field in control points. Defaults to feature FID. | Optional | — |
| from_measure_field | Output field for route start measure (default 'from_measure'). | Optional | — |
| to_measure_field | Output field for route end measure (default 'to_measure'). | Optional | — |
| snap_tolerance | Maximum control-point offset distance from route geometry (default 1.0). | Optional | 1.0 |
| output | Output calibrated route layer. | Required | — |
Examples
Calibrates route start/end measures using route control points.
wbe.route_calibrate(control_measure_field='measure', control_points='control_points.gpkg', control_route_id_field='route_id', output='routes_calibrated.gpkg', route_id_field='route_id', routes='routes.gpkg', snap_tolerance=1.0)
Route Event Lines From Layer
Function name: route_event_lines_from_layer
Experimental
Creates routed line events from an event vector layer using from/to measures.
vector linear-referencing events
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| routes | Input route line layer. | Required | routes.gpkg |
| events | Input event vector layer. | Required | line_events.gpkg |
| event_route_field | Event-layer field containing route identifiers. | Required | route_id |
| from_measure_field | Event-layer field containing start measures. | Required | from_m |
| to_measure_field | Event-layer field containing end measures. | Required | to_m |
| route_id_field | Optional route-layer field containing route identifiers. Defaults to feature FID. | Optional | — |
| write_event_fid | Write EVENT_FID to preserve source event feature IDs (default true). | Optional | True |
| write_event_xy | Write source event geometry X/Y attributes (default false). | Optional | False |
| output | Output line vector path. | Required | — |
Examples
Creates line events on routes from from/to measures in an event vector layer.
wbe.route_event_lines_from_layer(event_route_field='route_id', events='line_events.gpkg', from_measure_field='from_m', output='route_event_lines_layer.gpkg', route_id_field='RID', routes='routes.gpkg', to_measure_field='to_m', write_event_fid=True, write_event_xy=False)
Route Event Lines From Table
Function name: route_event_lines_from_table
Experimental
Creates routed line events from a CSV event table and a route layer using from/to measures.
vector linear-referencing events csv
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| routes | Input route line layer. | Required | routes.gpkg |
| events | Input CSV event table path. | Required | line_events.csv |
| event_route_field | CSV field containing route identifiers. | Required | route_id |
| from_measure_field | CSV field containing start measures. | Required | from_m |
| to_measure_field | CSV field containing end measures. | Required | to_m |
| route_id_field | Optional route-layer field containing route identifiers. Defaults to feature FID. | Optional | — |
| output | Output line vector path. | Required | — |
Examples
Creates line events on routes from from/to measures stored in a CSV table.
wbe.route_event_lines_from_table(event_route_field='route_id', events='line_events.csv', from_measure_field='from_m', output='route_event_lines.gpkg', route_id_field='RID', routes='routes.gpkg', to_measure_field='to_m')
Route Event Merge
Function name: route_event_merge
Experimental
Merges adjacent compatible route events.
vector linear-referencing events merge
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| events | Input event layer containing route intervals. | Required | events.gpkg |
| event_route_field | Event-layer route identifier field. | Required | route_id |
| from_measure_field | Event-layer interval start field. | Required | from_m |
| to_measure_field | Event-layer interval end field. | Required | to_m |
| group_fields | Optional comma-delimited fields used for merge compatibility. Defaults to all non-measure fields. | Optional | — |
| gap_tolerance | Maximum gap allowed for adjacency merge (default 0.0). | Optional | 0.0 |
| conflict_mode | Overlap handling mode: error or skip (default error). | Optional | error |
| output | Output merged-event layer. | Required | — |
Examples
Merges compatible adjacent events on each route.
wbe.route_event_merge(conflict_mode='error', event_route_field='route_id', events='events.gpkg', from_measure_field='from_m', gap_tolerance=0.0, group_fields='route_id,road_class,speed', output='events_merged.gpkg', to_measure_field='to_m')
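The merge rule can be sketched as a single pass over sorted intervals: two events merge when they share a route, carry identical grouping attributes, and are separated by a gap no larger than gap_tolerance. The code below illustrates that rule only. Overlapping events are simply left unmerged here, whereas the tool handles them through conflict_mode; `merge_events` and the tuple layout are hypothetical.

```python
def merge_events(events, gap_tolerance=0.0):
    """Merge adjacent intervals that share a route and identical attributes.

    events: (route_id, from_m, to_m, attrs) tuples, where attrs is a tuple of
    the grouping field values.
    """
    merged = []
    ordered = sorted(events, key=lambda e: (e[0], e[3], e[1]))
    for route_id, from_m, to_m, attrs in ordered:
        last = merged[-1] if merged else None
        if (last and last[0] == route_id and last[3] == attrs
                and 0.0 <= from_m - last[2] <= gap_tolerance):
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (route_id, last[1], max(last[2], to_m), attrs)
        else:
            merged.append((route_id, from_m, to_m, attrs))
    return merged

events = [
    ("R1", 0.0, 100.0, ("arterial", 60)),
    ("R1", 100.0, 250.0, ("arterial", 60)),
    ("R1", 250.0, 400.0, ("local", 40)),
]
merged = merge_events(events)
print(merged)
```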
Route Event Overlay
Function name: route_event_overlay
Experimental
Overlays two route event layers by interval overlap.
vector linear-referencing events overlay
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| primary_events | Primary event layer. | Required | primary_events.gpkg |
| overlay_events | Overlay event layer. | Required | overlay_events.gpkg |
| primary_route_field | Primary layer route identifier field. | Required | route_id |
| primary_from_measure_field | Primary layer interval start field. | Required | from_m |
| primary_to_measure_field | Primary layer interval end field. | Required | to_m |
| overlay_route_field | Overlay layer route identifier field. | Required | route_id |
| overlay_from_measure_field | Overlay layer interval start field. | Required | from_m |
| overlay_to_measure_field | Overlay layer interval end field. | Required | to_m |
| min_overlap_length | Minimum overlap length to keep (default 0.0). | Optional | 0.0 |
| output | Output overlay layer. | Required | — |
Examples
Computes overlapping route-event intervals between two event layers.
wbe.route_event_overlay(min_overlap_length=0.0, output='events_overlay.gpkg', overlay_events='overlay_events.gpkg', overlay_from_measure_field='from_m', overlay_route_field='route_id', overlay_to_measure_field='to_m', primary_events='primary_events.gpkg', primary_from_measure_field='from_m', primary_route_field='route_id', primary_to_measure_field='to_m')
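Interval overlay reduces to intersecting measure ranges on matching routes. The sketch below shows that calculation with plain tuples. It assumes overlaps must be strictly longer than min_overlap_length to be kept, so intervals that merely touch are dropped; the tool's boundary handling may differ, and the `overlay` helper is hypothetical.

```python
def overlay(primary, secondary, min_overlap_length=0.0):
    """Intersect two interval event lists route by route.

    Each event is a (route_id, from_m, to_m, label) tuple.
    """
    results = []
    for p_route, p_from, p_to, p_label in primary:
        for o_route, o_from, o_to, o_label in secondary:
            if p_route != o_route:
                continue
            start, end = max(p_from, o_from), min(p_to, o_to)
            if end - start > min_overlap_length:
                results.append((p_route, start, end, p_label, o_label))
    return results

surface = [("R1", 0.0, 300.0, "asphalt"), ("R1", 300.0, 500.0, "gravel")]
speed = [("R1", 200.0, 400.0, 80)]
pieces = overlay(surface, speed)
print(pieces)
```

Each output piece carries attributes from both inputs, which is what makes questions such as "which gravel sections have an 80 limit" answerable.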
Route Event Points From Layer
Function name: route_event_points_from_layer
Experimental
Creates routed point events from an event vector layer and a route layer.
vector linear-referencing events
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| routes | Input route line layer. | Required | routes.gpkg |
| events | Input event vector layer. | Required | point_events.gpkg |
| event_route_field | Event-layer field containing route identifiers. | Required | route_id |
| measure_field | Event-layer field containing point-event measures. | Required | measure |
| route_id_field | Optional route-layer field containing route identifiers. Defaults to feature FID. | Optional | — |
| write_event_fid | Write EVENT_FID to preserve source event feature IDs (default true). | Optional | True |
| write_event_xy | Write source event geometry X/Y attributes (default false). | Optional | False |
| output | Output point vector path. | Required | — |
Examples
Creates point events on routes from measure values in an event vector layer.
wbe.route_event_points_from_layer(event_route_field='route_id', events='point_events.gpkg', measure_field='measure', output='route_event_points_layer.gpkg', route_id_field='RID', routes='routes.gpkg', write_event_fid=True, write_event_xy=False)
Route Event Points From Table
Function name: route_event_points_from_table
Experimental
Creates routed point events from a CSV event table and a route layer.
vector linear-referencing events csv
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| routes | Input route line layer. | Required | routes.gpkg |
| events | Input CSV event table path. | Required | point_events.csv |
| event_route_field | CSV field containing route identifiers. | Required | route_id |
| measure_field | CSV field containing point-event measures. | Required | measure |
| route_id_field | Optional route-layer field containing route identifiers. Defaults to feature FID. | Optional | — |
| output | Output point vector path. | Required | — |
Examples
Creates point events on routes from measure values stored in a CSV table.
wbe.route_event_points_from_table(event_route_field='route_id', events='point_events.csv', measure_field='measure', output='route_event_points.gpkg', route_id_field='RID', routes='routes.gpkg')
Route Event Split
Function name: route_event_split
Experimental
Splits route events by per-route boundary measures.
vector linear-referencing events split
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| events | Input event layer containing route intervals. | Required | events.gpkg |
| boundaries | Input boundary layer containing route measure breakpoints. | Required | event_boundaries.gpkg |
| event_route_field | Event-layer route identifier field. | Required | route_id |
| from_measure_field | Event-layer interval start field. | Required | from_m |
| to_measure_field | Event-layer interval end field. | Required | to_m |
| boundary_route_field | Boundary-layer route identifier field. | Required | route_id |
| boundary_measure_field | Boundary-layer measure field. | Required | measure |
| min_segment_length | Optional minimum split segment length to keep (default 0.0). | Optional | 0.0 |
| output | Output split-event layer. | Required | — |
Examples
Splits route event intervals at supplied route measure boundaries.
wbe.route_event_split(boundaries='event_boundaries.gpkg', boundary_measure_field='measure', boundary_route_field='route_id', event_route_field='route_id', events='events.gpkg', from_measure_field='from_m', min_segment_length=0.0, output='events_split.gpkg', to_measure_field='to_m')
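To make the splitting behaviour concrete, here is a minimal pure-Python sketch of cutting one event interval at boundary measures. It is an illustration only: `split_interval` is a hypothetical helper, and dropping segments shorter than `min_segment_length` is an assumed reading of that parameter's description.

```python
def split_interval(from_m, to_m, boundaries, min_segment_length=0.0):
    """Split [from_m, to_m] at every boundary measure strictly inside it."""
    cuts = sorted(b for b in set(boundaries) if from_m < b < to_m)
    edges = [from_m, *cuts, to_m]
    segments = list(zip(edges, edges[1:]))
    # Assumption: segments shorter than the minimum are discarded.
    return [(a, b) for a, b in segments if (b - a) >= min_segment_length]


# Boundaries outside the interval (150.0) are ignored.
segments = split_interval(0.0, 100.0, [25.0, 60.0, 150.0])

# The 2-unit sliver between 98.0 and 100.0 is dropped.
filtered = split_interval(0.0, 100.0, [25.0, 98.0], min_segment_length=5.0)
```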
Route Measure QA
Function name: route_measure_qa
Experimental
Diagnoses route-event measure gaps, overlaps, non-monotonic sequences, and duplicate measures.
vector linear-referencing qa diagnostics
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| events | Input event layer containing route intervals. | Required | events.gpkg |
| route_field | Route identifier field. | Required | route_id |
| from_measure_field | Interval start field. | Required | from_m |
| to_measure_field | Interval end field. | Required | to_m |
| gap_tolerance | Gap tolerance (default 0.0). | Optional | 0.0 |
| overlap_tolerance | Overlap tolerance (default 0.0). | Optional | 0.0 |
| output | Output QA diagnostics layer. | Required | — |
Examples
Generates route-measure diagnostics for interval event data.
wbe.route_measure_qa(events='events.gpkg', from_measure_field='from_m', gap_tolerance=0.0, output='route_measure_qa.gpkg', overlap_tolerance=0.0, route_field='route_id', to_measure_field='to_m')
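The gap and overlap checks can be pictured with a short sketch over the intervals of a single route. This is an assumption-laden illustration, not the tool's logic: `measure_qa` is a hypothetical function, it covers only gaps and overlaps, and the way tolerances are compared is a guess at the intent of `gap_tolerance` and `overlap_tolerance`.

```python
def measure_qa(intervals, gap_tolerance=0.0, overlap_tolerance=0.0):
    """Flag gaps and overlaps between consecutive (from_m, to_m) intervals."""
    issues = []
    ordered = sorted(intervals)
    for (f0, t0), (f1, t1) in zip(ordered, ordered[1:]):
        delta = f1 - t0  # positive: gap, negative: overlap
        if delta > gap_tolerance:
            issues.append(("gap", t0, f1))
        elif -delta > overlap_tolerance:
            issues.append(("overlap", f1, t0))
    return issues


# Hypothetical intervals on one route.
intervals = [(0.0, 10.0), (12.0, 20.0), (18.0, 30.0)]
issues = measure_qa(intervals)
tolerant = measure_qa(intervals, gap_tolerance=2.5, overlap_tolerance=2.5)
```

With zero tolerances the sample reports one gap (10.0 to 12.0) and one overlap (18.0 to 20.0); raising both tolerances to 2.5 suppresses them.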
Route Recalibrate
Function name: route_recalibrate
Experimental
Recalibrates edited route measures from a reference route layer while preserving route measure continuity.
vector linear-referencing recalibration
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| original_routes | Reference routes containing prior calibrated measures. | Required | routes_original.gpkg |
| edited_routes | Edited routes to recalibrate. | Required | routes_edited.gpkg |
| route_id_field | Optional shared route identifier field. Defaults to feature FID. | Optional | — |
| from_measure_field | Measure-start field name (default 'from_measure'). | Optional | — |
| to_measure_field | Measure-end field name (default 'to_measure'). | Optional | — |
| output | Output recalibrated route layer. | Required | — |
Examples
Scales edited route measures from a previously calibrated route layer.
wbe.route_recalibrate(edited_routes='routes_edited.gpkg', original_routes='routes_original.gpkg', output='routes_recalibrated.gpkg', route_id_field='route_id')
Projection and Georeferencing
Accurate coordinate reference systems (CRS) and georeferencing are foundational to all GIS work. This chapter covers tools for assigning, reprojecting, and transforming spatial data between coordinate systems, as well as tools for georeferencing rasters from ground control points.
Key Concepts
- CRS / Projection: A mathematical model that defines how geographic coordinates map to a flat surface. All spatial data in a GIS project must share a common CRS for overlay and analysis to be meaningful.
- Reprojection: Transforming data from one CRS to another. Whitebox Workflows supports epoch-aware datum transformations for sub-metre accuracy.
- Georeferencing: Assigning spatial coordinates to a raster image using ground control points (GCPs), typically for aerial photography, scanned maps, or satellite imagery without embedded metadata.
- Orthorectification: Correcting geometric distortions in aerial/satellite imagery caused by terrain relief and sensor tilt.
Tool Reference
The tools in this chapter are accessible from the QGIS Processing Toolbox under Whitebox Workflows → Projection and Georeferencing.
General Tools
Assign Projection LiDAR
Function name: assign_projection_lidar
No help documentation available for this tool.
Assign Projection Raster
Function name: assign_projection_raster
No help documentation available for this tool.
Assign Projection Vector
Function name: assign_projection_vector
No help documentation available for this tool.
Georeference Raster From Control Points
Function name: georeference_raster_from_control_points
No help documentation available for this tool.
Reproject LiDAR
Function name: reproject_lidar
No help documentation available for this tool.
Reproject Raster
Function name: reproject_raster
No help documentation available for this tool.
Reproject Vector
Function name: reproject_vector
Experimental
Reprojects an input vector layer to a destination EPSG code.
vector projection crs
Parameters
| Name | Description | Required | Default |
| --- | --- | --- | --- |
| input | Input vector layer. | Required | input.shp |
| epsg | Destination EPSG code. | Required | 4326 |
| output | Output vector path. | Required | — |
Examples
Reprojects a vector layer to EPSG:4326.
wbe.reproject_vector(epsg=4326, input='input.shp', output='reprojected.shp')
Orthorectification
Function name: orthorectification
No help documentation available for this tool.
Precision Agriculture
Precision agriculture tools integrate high-resolution geospatial data — LiDAR, multispectral imagery, soil surveys, and yield maps — to support evidence-based farm management decisions. These are specialised Pro tier workflow tools that automate complex multi-source analyses into actionable management zones and field intelligence reports.
Key Concepts
- Management Zones: Spatially delineated areas of a field that share similar soil, crop, or topographic characteristics, enabling variable-rate application of inputs such as fertiliser, seed, or irrigation water.
- Yield Data Conditioning: Raw yield monitor data contains significant noise and artifacts; conditioning standardises and quality-controls it before analysis.
- Crop Stress Detection: Using multispectral imagery (NDVI, NDRE, CWSI) to identify in-season stress before visible symptoms appear, enabling targeted intervention.
- Field Trafficability: Assessing soil compaction risk and field access conditions based on soil moisture models and terrain analysis.
Tool Reference
The tools in this chapter are accessible from the QGIS Processing Toolbox under Whitebox Workflows → Precision Agriculture.
Note: All Precision Agriculture tools require a Pro license.
General Tools
Field Trafficability And Operation Planning
Function name: field_trafficability_and_operation_planning
PRO, Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Field Trafficability and Operation Planning
Problem It Solves
Where can equipment operate safely now, and which areas should be delayed or rerouted?
Who It Is For
- Machinery planning teams, farm managers, and field operations coordinators.
Primary User
Farm operations teams and precision agriculture service providers.
What It Does
- Scores field trafficability and operation timing risk from terrain and saturation context.
- Produces operation classes for go/hold/reroute-style field execution decisions.
How It Works
- Derives slope from DEM and harmonizes moisture/rainfall context to the same grid.
- Blends terrain, soil saturation, and rainfall-risk penalties into trafficability scores.
- Converts continuous scores into operation classes for practical planning use.
- QA acceptance guidance:
  - status=pass indicates low-trafficability burden stayed within baseline tolerance.
  - diagnostics.acceptance_thresholds.low_trafficability_fraction_review defines review escalation threshold.
  - High summary.low_trafficability_fraction should trigger cautious equipment scheduling and field confirmation.
- MVP hardening assets:
  - Benchmark scaffold: tests/fixtures/precision_ag_ops_benchmark/
  - Promotion guide: docs/internal/development/TERRAIN_PRECISION_AG_BENCHMARK_PROMOTION_GUIDE_2026_04_14.md
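The blend-then-classify idea described under How It Works can be sketched for a single cell as follows. Everything numeric here is a placeholder: the weights, the 15-degree slope cap, and the class breakpoints are hypothetical and are not taken from the workflow itself. Only the output conventions (score in [0,1] where higher is better, classes 1 favorable to 4 poor) follow this manual.

```python
def trafficability_score(slope_deg, saturation, rain_risk=0.0,
                         max_slope_deg=15.0, weights=(0.4, 0.4, 0.2)):
    """Blend slope, saturation, and rainfall penalties into a [0, 1] score.

    `saturation` and `rain_risk` are assumed to be normalized to [0, 1].
    """
    slope_penalty = min(max(slope_deg / max_slope_deg, 0.0), 1.0)
    w_slope, w_sat, w_rain = weights
    penalty = w_slope * slope_penalty + w_sat * saturation + w_rain * rain_risk
    return 1.0 - min(max(penalty, 0.0), 1.0)


def operation_class(score):
    """Map a score to classes 1 (favorable) through 4 (poor)."""
    if score >= 0.75:
        return 1
    if score >= 0.5:
        return 2
    if score >= 0.25:
        return 3
    return 4


dry_flat = trafficability_score(slope_deg=3.0, saturation=0.2, rain_risk=0.1)
wet_steep = trafficability_score(slope_deg=12.0, saturation=0.9, rain_risk=0.8)
classes = (operation_class(dry_flat), operation_class(wet_steep))
```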
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| dem | no | DEM raster for terrain slope context. |
| soil_moisture | no | Soil saturation/moisture raster normalized to [0,1]. |
| rainfall_forecast | yes | Optional rainfall-risk raster normalized to [0,1]. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| trafficability_score | GeoTIFF | Continuous trafficability score [0,1] (higher is better). |
| operation_class | GeoTIFF | Discrete operation class raster (1 favorable to 4 poor). |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

traffic, op_class, summary = wbe.field_trafficability_and_operation_planning(
    dem="data/dem.tif",
    soil_moisture="data/soil_saturation.tif",
    rainfall_forecast="data/rain_risk.tif",
    output_prefix="output/field_trafficability",
)

print(traffic)
print(op_class)
print(summary)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
In Season Crop Stress Intervention Planning
Function name: in_season_crop_stress_intervention_planning
PRO, Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
In-Season Crop Stress Intervention Planning
Problem It Solves
Where should interventions be prioritized this week to limit stress-driven yield loss?
Who It Is For
- Agronomy advisors and farm operations teams managing in-season response.
Primary User
Precision agronomy service providers and large production farms.
What It Does
- Prioritizes intervention zones from in-season crop stress indicators.
- Combines vigor decline with optional thermal and moisture stress context.
How It Works
- Uses NDVI/vigor deficit as the primary stress signal.
- Harmonizes optional canopy-temperature and soil-moisture rasters to the NDVI grid.
- Produces intervention-priority and intervention-class surfaces.
- QA acceptance guidance:
  - status=pass indicates intervention burden is below review threshold.
  - diagnostics.acceptance_thresholds.high_priority_fraction_review defines escalation threshold for broad intervention risk.
  - High summary.high_priority_fraction should trigger stress-source verification before large-scale treatment rollout.
- MVP hardening assets:
  - Benchmark scaffold: tests/fixtures/precision_ag_ops_benchmark/
  - Promotion guide: docs/internal/development/TERRAIN_PRECISION_AG_BENCHMARK_PROMOTION_GUIDE_2026_04_14.md
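One way to act on the QA acceptance guidance is to compare the reported fraction against the review threshold after a run. The sketch below assumes the dotted key names in the guidance map to nested JSON objects; that layout, the sample values, and the `needs_review` helper are all hypothetical and should be checked against a real summary file.

```python
import json

# Hypothetical summary content; values are illustrative, not real output.
summary_text = """
{
  "status": "pass",
  "summary": {"high_priority_fraction": 0.12},
  "diagnostics": {
    "acceptance_thresholds": {"high_priority_fraction_review": 0.30}
  }
}
"""


def needs_review(report):
    """Return True when the run should be escalated for manual review."""
    fraction = report["summary"]["high_priority_fraction"]
    thresholds = report["diagnostics"]["acceptance_thresholds"]
    threshold = thresholds["high_priority_fraction_review"]
    return report.get("status") != "pass" or fraction >= threshold


report = json.loads(summary_text)
first_check = needs_review(report)

# Simulate a run where a large share of the field is high priority.
report["summary"]["high_priority_fraction"] = 0.45
second_check = needs_review(report)
```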
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| ndvi | no | NDVI/vigor raster normalized to [0,1]. |
| canopy_temperature | yes | Optional thermal-stress raster normalized to [0,1]. |
| soil_moisture | yes | Optional moisture-deficit raster normalized to [0,1]. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| intervention_priority | GeoTIFF | Continuous intervention-priority score [0,1]. |
| intervention_class | GeoTIFF | Discrete intervention class raster (1 low to 4 urgent). |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

priority, classes, summary = wbe.in_season_crop_stress_intervention_planning(
    ndvi="data/ndvi_current.tif",
    canopy_temperature="data/canopy_temp_stress.tif",
    soil_moisture="data/moisture_deficit.tif",
    output_prefix="output/in_season_stress",
)

print(priority)
print(classes)
print(summary)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Precision Irrigation Optimization
Function name: precision_irrigation_optimization
PRO, Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Precision Irrigation Optimization
Problem It Solves
Where should irrigation depth be increased or reduced to achieve target moisture with less water waste?
Who It Is For
- Precision agriculture teams, irrigation planners, and agronomy analytics groups.
Primary User
Large growers, precision ag service providers, and irrigation technology partners.
What It Does
- Generates variable-rate irrigation prescription depth from terrain and moisture context.
- Estimates moisture stress risk to prioritize intervention zones.
- Emits summary diagnostics suitable for irrigation planning dashboards.
How It Works
- Estimates local moisture deficit from target_moisture and measured or terrain-inferred moisture.
- Adjusts prescription depth by slope-derived terrain factor and max_irrigation_mm limits.
- Converts prescribed depth to VRI zones and computes moisture-stress risk index.
- Indicative formula: prescribed_mm = max(0, target - moisture) * max_irrigation_mm * terrain_factor.
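The indicative formula can be evaluated directly for a single cell. The sketch below is a literal transcription of that formula and nothing more: `prescribed_mm` as a Python function and the `terrain_factor` default of 1.0 are assumptions, and the workflow's actual terrain adjustment is not described in enough detail here to reproduce.

```python
def prescribed_mm(target_moisture, moisture, max_irrigation_mm, terrain_factor=1.0):
    """Irrigation depth for one cell, following the indicative formula."""
    deficit = max(0.0, target_moisture - moisture)
    return deficit * max_irrigation_mm * terrain_factor


# A cell below the target receives water in proportion to its deficit.
dry_cell = prescribed_mm(target_moisture=0.6, moisture=0.35, max_irrigation_mm=18.0)

# A cell at or above the target receives none.
wet_cell = prescribed_mm(target_moisture=0.6, moisture=0.7, max_irrigation_mm=18.0)

# A hypothetical terrain factor below 1 reduces the depth on sloping ground.
sloped_cell = prescribed_mm(0.6, 0.35, 18.0, terrain_factor=0.8)
```

With these inputs the deficit is 0.25, so the flat cell is prescribed 4.5 mm and the sloped cell 3.6 mm.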
Why It Wins
- Produces direct raster-ready prescription depths and risk diagnostics from reproducible inputs.
Typical Buying Trigger
A grower or consultant needs to shift from uniform irrigation to variable-rate execution.
Typical Presets
- fast for quick field-scale recommendations.
- balanced for default VRI planning.
- conservative for stronger terrain-risk penalties.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| dem | no | Digital elevation model used as the terrain reference surface. |
| optional soil_moisture | yes | Optional soil moisture raster used to refine irrigation need and stress scoring. |
| profile (fast, balanced, conservative) | no | Processing profile controlling sensitivity, quality strictness, and runtime tradeoffs. |
| target_moisture | no | Target moisture level used to compute irrigation prescription amounts. |
| max_irrigation_mm | no | Maximum irrigation depth allowed in generated irrigation recommendations. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| irrigation_prescription | GeoTIFF | Irrigation prescription raster indicating recommended application depth. |
| moisture_stress_risk | GeoTIFF | Moisture stress risk raster supporting irrigation prioritization. |
| vri_zones | GeoTIFF | Variable-rate irrigation zone raster. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
When sweep_spec is supplied, the workflow also emits run_matrix_summary, sensitivity_report, sensitivity_report_html, and stability_map. The sensitivity report includes metrics.primary_metric, metrics.primary_relative_span, and metrics.stability_class (high, medium, low), while stability_map uses classes 3=high, 2=medium, 1=low.
Python Example
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

prescription, stress, zones, summary = wbe.precision_irrigation_optimization(
    dem="data/dem.tif",
    soil_moisture="data/soil_moisture.tif",
    profile="balanced",
    target_moisture=0.6,
    max_irrigation_mm=18.0,
    output_prefix="output/irrigation",
)

print(prescription)
print(stress)
print(zones)
print(summary)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Precision Ag Yield Zone Intelligence
Function name: precision_ag_yield_zone_intelligence
PRO, Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Precision Ag Yield Zone Analysis
Problem It Solves
Which management zones should receive differentiated input strategy based on stable productivity patterns?
Who It Is For
- Precision agronomy teams and farm analytics providers.
Primary User
Enterprise farms, agronomy consultancies, and digital ag platforms.
What It Does
- Builds yield-stability surfaces from yield productivity context.
- Integrates optional terrain context to improve zone differentiation.
- Produces management-zone rasters for variable-rate strategy design.
How It Works
- Normalizes yield response and computes local stability/variability signatures.
- Optionally blends terrain_context influence with profile-dependent weighting.
- Partitions pixels into zone_count management classes and emits zone confidence and polygons.
- Indicative formula: zone_score ~= w_y * yield_stability + w_t * terrain_term; zone = discretize(zone_score, zone_count).
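For intuition, the weighted score and its discretization can be sketched per pixel. The weights (0.7 and 0.3) and the equal-interval binning are placeholders chosen for illustration; the manual does not state the profile-dependent weights or the partitioning method the workflow uses.

```python
def zone_score(yield_stability, terrain_term=0.0, w_y=0.7, w_t=0.3):
    """Weighted blend of yield stability and terrain context (both in [0, 1])."""
    return w_y * yield_stability + w_t * terrain_term


def discretize(score, zone_count):
    """Equal-interval binning of a [0, 1] score into zones 1..zone_count."""
    if not 2 <= zone_count <= 8:
        raise ValueError("zone_count must be between 2 and 8")
    clipped = min(max(score, 0.0), 1.0)
    # A score of exactly 1.0 falls in the top zone rather than overflowing.
    return min(int(clipped * zone_count) + 1, zone_count)


stable_pixel = discretize(zone_score(0.9, 0.5), zone_count=4)
unstable_pixel = discretize(zone_score(0.2, 0.1), zone_count=4)
top_edge = discretize(1.0, zone_count=4)
```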
Why It Wins
- Converts yield and terrain context into explicit, contract-backed management zone outputs.
Typical Buying Trigger
A farm program is formalizing zone-based management for seed, fertility, and water decisions.
Typical Presets
- fast for high-throughput zone generation.
- balanced for default zone planning.
- conservative for stronger terrain-context influence.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| yield_surface | no | Yield raster used to estimate productivity stability and zone segmentation. |
| optional terrain_context | yes | Optional terrain context raster used to strengthen management-zone differentiation. |
| profile (fast, balanced, conservative) | no | Processing profile controlling sensitivity, quality strictness, and runtime tradeoffs. |
| zone_count (2..8) | no | Requested number of management zones to generate. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| yield_stability | GeoTIFF | Yield stability raster used to delineate management zones. |
| management_zones | GeoTIFF | Management-zone classification raster. |
| management_zones_vector | GeoPackage | Vectorized management zones for operations and reporting. |
| zone_confidence | GeoTIFF | Confidence surface associated with generated management zones. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
When sweep_spec is supplied, the workflow also emits run_matrix_summary, sensitivity_report, sensitivity_report_html, and stability_map. The sensitivity report includes metrics.primary_metric, metrics.primary_relative_span, and metrics.stability_class (high, medium, low), while stability_map uses classes 3=high, 2=medium, 1=low.
Python Example
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

stability, zones, zone_polys, confidence, summary = wbe.precision_ag_yield_zone_intelligence(
    yield_surface="data/yield_surface.tif",
    terrain_context="data/terrain_context.tif",
    profile="balanced",
    zone_count=4,
    max_zone_features=5000,
    output_prefix="output/yield_zone",
)

print(stability)
print(zones)
print(zone_polys)
print(confidence)
print(summary)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Soil Landscape Classification
Function name: soil_landscape_classification
PRO, Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Soil Landscape Classification
Problem It Solves
Which terrain-defined landform units dominate the area, and how can they guide management zoning or sampling strategy?
Who It Is For
- Geomorphometry analysts, soil-landscape researchers, and precision-ag planning teams.
Primary User
Agronomy/land capability groups, environmental consulting teams, and land resource agencies.
What It Does
- Classifies landform units using multiscale curvature and slope signatures.
- Produces raster class maps and optional polygon outputs.
- Emits summary distributions for each landform class.
How It Works
- Computes slope, profile curvature, and plan curvature at fine and coarse scales.
- Applies rule-based landform assignment thresholds to each pixel.
- Optionally polygonizes class regions and aggregates class-area summary statistics.
- Indicative formula: landform_class = rule(slope, k_profile, k_plan, multiscale_signature).
Why It Wins
- Couples interpretable geomorphometric classes with optional polygon outputs for direct planning integration.
Typical Buying Trigger
A land management or agronomy program needs terrain-driven zones for sampling design or variable-rate planning.
Typical Presets
- default thresholds for general terrain partitioning.
- tune fine/coarse scales for local vs regional landform emphasis.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| input DEM | no | Input DEM used for terrain-derivative and landform classification workflows. |
| flat/profile/plan thresholds | no | Curvature and surface-form thresholds used to separate landform classes. |
| fine_scale, coarse_scale | no | Multiscale analysis windows used to capture local and broad terrain structure. |
| optional landform_polygons_output | yes | Optional switch/path to emit vectorized landform polygons. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| landform_units | GeoTIFF | Categorical landform unit raster from terrain analysis. |
| multiscale_signature | GeoTIFF | Multiscale terrain-signature raster for landscape interpretation. |
| landform_polygons | optional vector | Optional vectorized landform polygons for cartographic/reporting use. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

landform, signature, polygons, summary = wbe.soil_landscape_classification(
    input="data/dem.tif",
    fine_scale=2.0,
    coarse_scale=8.0,
    output_prefix="output/soil_landscape",
    landform_polygons_output="output/landforms.gpkg",
)

print(landform)
print(signature)
print(polygons)
print(summary)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Yield Data Conditioning And QA
Function name: yield_data_conditioning_and_qa
PRO, Production
Workflow-grade Pro analysis with audit-ready outputs.
workflow pro
Workflow Narrative
Yield Data Conditioning and QA
Problem It Solves
How do we turn raw, noisy harvest monitor points into defensible, analysis-ready yield products for management decisions?
Who It Is For
- Precision-ag service providers, agronomy analytics teams, and farm data engineering groups.
Primary User
Enterprise farms, precision agriculture consultancies, and digital agronomy platforms.
What It Does
- Runs an end-to-end conditioning pipeline for noisy combine-harvester yield points.
- Orchestrates edge QA, pass reconstruction, optional multi-header reconciliation, filtering, normalization, and swath map generation.
- Produces contract-ready summary outputs for auditability and downstream zoning workflows.
How It Works
- Flags/removes likely edge points using local neighborhood support.
- Optionally filters telemetry anomalies using speed and heading consistency thresholds.
- Auto-resolves common monitor-export field aliases when requested names are missing.
- Reconstructs pass structure from point geometry and heading continuity, then smooths yield using pass-aware neighborhood statistics.
- Optionally applies distance-based lag correction along reconstructed passes to compensate harvest monitor delay.
- Optionally converts raw yield to a target moisture basis using moisture_field_name and target_moisture_pct.
- Optionally applies robust MAD-based clipping on filtered yield values before normalization.
- Propagates a QA confidence score to final point outputs, then builds swath polygons for map-ready products.
- Indicative formula: filtered_yield ~= weighted_neighborhood(yield) with outlier replacement when z = |y - mean_adjacent_pass| / sd_adjacent_pass exceeds threshold.
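The z-score test in the indicative formula can be illustrated for one yield point against its adjacent-pass neighbours. This is a simplified sketch: replacing an outlier with the plain neighbour mean stands in for the weighted neighbourhood statistic, the use of the sample standard deviation is an assumption, and `filter_yield` is a hypothetical name.

```python
from statistics import mean, stdev


def filter_yield(value, adjacent_pass_values, z_score_threshold=3.0):
    """Replace `value` with the adjacent-pass mean when its z-score is too high."""
    centre = mean(adjacent_pass_values)
    spread = stdev(adjacent_pass_values)  # sample standard deviation (assumed)
    if spread == 0:
        return value
    z = abs(value - centre) / spread
    return centre if z > z_score_threshold else value


# Hypothetical yield values from neighbouring passes.
adjacent = [9.8, 10.1, 10.0, 9.9, 10.2]

spike = filter_yield(14.0, adjacent)     # far from neighbours: replaced
ordinary = filter_yield(10.3, adjacent)  # within threshold: kept
```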
Why It Wins
- Provides a reproducible and transparent preprocessing chain with explicit QA outputs rather than opaque one-click cleaning.
Typical Buying Trigger
A team needs to standardize inconsistent yield data cleaning before zone generation, variable-rate planning, or year-over-year analytics.
Typical Presets
- fast: lighter cleanup with larger tolerances for rapid turnaround.
- balanced: default production profile.
- strict: stronger edge/outlier controls for high-confidence analytics.
Inputs
| Parameter | Optional | Description |
| --- | --- | --- |
| input (yield point vector) | no | Raw yield telemetry points used as input to the conditioning and QA pipeline. |
| yield_field_name, moisture_field_name, target_moisture_pct, header_field_name | no | Field-name mappings for yield/moisture/header attributes used by cleaning stages. |
| use_field_aliases | no | Whether known attribute aliases should be resolved automatically. |
| speed_field_name, heading_field_name | no | Telemetry field mappings used for speed and heading quality checks. |
| min_speed_kmh, max_speed_kmh, max_heading_change_deg | no | Operating-speed and heading-change limits used for telemetry QC filtering. |
| profile (fast, balanced, strict) | no | Operational profile controlling sensitivity and QA strictness for risk workflows. |
| swath_width, edge_radius, reconcile_radius, normalization_radius | no | Geometric parameters controlling swath-edge handling and neighborhood reconciliation. |
| lag_correction_mode (none, distance) | no | Lag correction mode controlling whether and how harvest lag compensation is applied. |
| lag_distance_m | no | Distance offset used when distance-based lag correction is enabled. |
| filtering_mode (standard, robust) | no | Outlier-filtering method selection for yield cleaning. |
| robust_mad_threshold, z_score_threshold, min_yield, max_yield | no | Statistical thresholds and hard limits used for yield outlier rejection. |
| optional mean_tonnage | yes | Optional mean tonnage override used during normalization/reconciliation steps. |
Outputs
| Parameter | Type | Description |
| --- | --- | --- |
| qa_flags | GeoPackage | QA flag layer identifying records that failed yield telemetry checks. |
| telemetry_qc_points | optional GeoPackage | Optional point layer of telemetry QC diagnostics by record. |
| clean_points | GeoPackage | Cleaned yield points after telemetry and statistical filtering. |
| clean_map | GeoPackage | Cartography-ready cleaned yield layer for rapid map production. |
| confidence_points | GeoPackage | Point-level confidence diagnostics for cleaned yield records. |
| pass_lines | GeoPackage | Harvest pass line features inferred from telemetry trajectories. |
| pass_points | GeoPackage | Harvest pass points used in overlap and reconciliation analyses. |
| lag_corrected_points | optional GeoPackage | Optional points after lag-correction adjustment. |
| moisture_adjusted_points | optional GeoPackage | Optional points after moisture normalization adjustments. |
| filtered_points | GeoPackage | Points retained after standard filtering criteria are applied. |
| robust_filtered_points | optional GeoPackage | Optional points retained by robust filtering mode diagnostics. |
| normalized_points | GeoPackage | Yield points after normalization to common analytical basis. |
| reconciled_points | optional GeoPackage | Optional reconciled points after overlap and pass harmonization. |
| summary | JSON | Machine-readable summary report containing run metadata, QA diagnostics, and key metrics. |
| html_report | HTML | Human-readable customer-facing report generated from the summary contract for stakeholder review and QA traceability. |
Python Example
import whitebox_workflows as wbw

wbe = wbw.WbEnvironment(include_pro=True, tier="pro")

result = wbe.yield_data_conditioning_and_qa(
    input="data/yield_points.gpkg",
    yield_field_name="YIELD",
    header_field_name="HEADER",
    profile="balanced",
    swath_width=6.096,
    output_prefix="output/yield_pipeline",
)

print(result)
License Notice
Use of this function requires a license for Whitebox Workflows Professional (WbW-Pro). Please visit www.whiteboxgeo.com to purchase a license.
Data Conversion and Format Tools
Data conversion tools handle format transformations, topology repairs, and attribute table operations that are essential plumbing in any GIS workflow. These tools prepare data for analysis, export results to standard formats, and ensure geometric and topological consistency.
Key Concepts
- Vector-Raster Conversion: Many analysis pipelines require moving between vector and raster representations. Whitebox provides precise control over cell size, nodata handling, and attribute transfer during conversion.
- Topology Repair: Real-world vector data often contains geometric errors — dangling arcs, unclosed polygons, multipart features — that cause downstream analysis failures. The topology tools detect and fix these automatically.
- Attribute Table I/O: Joining external CSVs, merging attribute tables across layers, and exporting to standard tabular formats is routine data preparation work.
Tool Reference
The tools in this chapter are accessible from the QGIS Processing Toolbox under Whitebox Workflows → Data Conversion.
Vector and Table I/O
Add Point Coordinates To Table
Function name: add_point_coordinates_to_table
Description
This tool modifies the attribute table of a vector of POINT VectorGeometryType by adding two fields, XCOORD and YCOORD, containing each point's X and Y coordinates respectively.
Parameters
input (Vector): The input Vector object
Returns
Vector: the returning value
Python API
def add_point_coordinates_to_table(self, input: Vector) -> Vector:
Clean Vector
Function name: clean_vector
Description
This tool can be used to remove all features in Shapefiles that are of the null VectorGeometryType. It also removes line features with fewer than two vertices and polygon features with fewer than three vertices.
Parameters
input (Vector): The input Vector object
Returns
Vector: the returning value
Python API
def clean_vector(self, input: Vector) -> Vector:
CSV Points To Vector
Function name: csv_points_to_vector
This tool can be used to import a series of points contained within a comma-separated values (*.csv) file (input_file) into a vector shapefile of a POINT VectorGeometryType. The input file must be an ASCII text file with a .csv extension. The tool will automatically detect the field data type; for numeric fields, it will also determine the appropriate length and precision. The user must specify the x-coordinate (x_field_num) and y-coordinate (y_field_num) fields. All fields are imported as attributes in the output (output) vector file. The tool assumes that the first line of the file is a header line from which field names are retrieved.
See Also
merge_table_with_csv, export_table_to_csv
Python API
def csv_points_to_vector(self, input_file: str, x_field_num: int = 0, y_field_num: int = 1, epsg: int = 0) -> Vector:
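To show the kind of input this tool expects, the sketch below parses a small CSV with the standard library, taking the first line as the header and reading coordinates by zero-based column index (matching the defaults `x_field_num=0` and `y_field_num=1` in the signature above). It only mimics the described behaviour in plain Python; `read_csv_points` and the sample rows are hypothetical and no Whitebox call is made.

```python
import csv
import io


def read_csv_points(text, x_field_num=0, y_field_num=1):
    """Parse CSV text into ((x, y), attributes) pairs using a header row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # first line supplies the field names
    points = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        x = float(row[x_field_num])
        y = float(row[y_field_num])
        # Every column, including the coordinates, is kept as an attribute.
        points.append(((x, y), dict(zip(header, row))))
    return header, points


# Illustrative sample data.
sample = "x,y,name\n-80.25,43.53,SiteA\n-79.38,43.65,SiteB\n"
header, points = read_csv_points(sample)
```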
Export Table To CSV
Function name: export_table_to_csv
This tool can be used to export a vector's attribute table to a comma-separated values (CSV) file. CSV files store tabular data (numbers and text) in plain-text form such that each row corresponds to a record and each column to a field. Fields are typically separated by commas within records. The user must specify the name of the vector (and associated attribute file), the name of the output CSV file, and whether or not to include the field names as a header row in the output CSV file.
See Also
merge_table_with_csv
Python API
def export_table_to_csv(self, input: Vector, output_csv_file: str, headers: bool = True) -> None:
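The effect of the `headers` flag can be pictured with a small standard-library sketch that serializes attribute rows to CSV text. It is a stand-in for illustration only: `table_to_csv` and the sample rows are hypothetical, and the real tool writes to `output_csv_file` from a Vector's attribute table.

```python
import csv
import io


def table_to_csv(rows, headers=True):
    """Serialize a list of attribute dicts to CSV text."""
    fields = list(rows[0].keys())
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if headers:
        writer.writerow(fields)  # field names as the first row
    for row in rows:
        writer.writerow([row[name] for name in fields])
    return buffer.getvalue()


# Illustrative attribute records.
rows = [{"FID": 1, "NAME": "a"}, {"FID": 2, "NAME": "b"}]
with_header = table_to_csv(rows)
without_header = table_to_csv(rows, headers=False)
```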
Join Tables
Function name: join_tables
This tool can be used to join (i.e. merge) a vector's attribute table with a second table. The user must specify the name of the vector file (and associated attribute file) as well as the primary key within the table. The primary key (pkey flag) is the field within the table that is being appended to that serves as the identifier. Additionally, the user must specify the name of a second vector from which the data appended into the first table will be derived. The foreign key (fkey flag), the identifying field within the second table that corresponds with the data contained within the primary key in the table, must be specified. Both the primary and foreign keys should either be strings (text) or integer values. Fields containing decimal values are not good candidates for keys. Lastly, the name of the field within the second file to include in the merge operation can also be input (import_field). If import_field is not specified, all fields in the attribute table of the second file that are neither the foreign key nor FID will be imported to the first table.
Merging works for one-to-one and many-to-one database relations. A one-to-one relation exists when each record in the attribute table corresponds to one record in the second table and each primary key is unique. Since each record in the attribute table is associated with a geospatial feature in the vector, an example of a one-to-one relation may be where the second file contains AREA and PERIMETER fields for each polygon feature in the vector. This is the most basic type of relation. A many-to-one relation would exist when each record in the first attribute table corresponds to one record in the second file and the primary key is NOT unique. Consider as an example a vector and attribute table associated with a world map of countries. Each country has one or more polygon features in the shapefile, e.g. Canada has its mainland and many hundreds of large islands. You may want to append a table containing data about the population and area of each country. In this case, the COUNTRY columns in the attribute table and the second file serve as the primary and foreign keys respectively. While there may be many duplicate primary keys (all of those Canadian polygons), each will correspond to only one foreign key containing the population and area data. This is a many-to-one relation. The join_tables tool does not support one-to-many nor many-to-many relations.
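The many-to-one relation described above can be sketched in plain Python with lists of dictionaries standing in for attribute tables. The helper name `join_many_to_one`, the row layout, and the population and area numbers are all made up for illustration. Raising an error on a duplicate foreign key is a choice of this sketch, not documented tool behaviour.

```python
def join_many_to_one(primary_rows, pkey, foreign_rows, fkey, import_field=""):
    """Append foreign attributes to each primary row; foreign keys must be unique."""
    lookup = {}
    for row in foreign_rows:
        key = row[fkey]
        if key in lookup:
            raise ValueError(f"duplicate foreign key {key!r}: one-to-many is unsupported")
        lookup[key] = row

    joined = []
    for row in primary_rows:
        match = lookup.get(row[pkey], {})
        # Skip the foreign key and FID; optionally keep only one named field.
        extra = {
            name: value
            for name, value in match.items()
            if name not in (fkey, "FID") and (not import_field or name == import_field)
        }
        joined.append({**row, **extra})
    return joined


polygons = [
    {"FID": 1, "COUNTRY": "Canada"},   # mainland
    {"FID": 2, "COUNTRY": "Canada"},   # an island, same primary key
    {"FID": 3, "COUNTRY": "Iceland"},
]
stats = [  # placeholder numbers, not real statistics
    {"COUNTRY": "Canada", "POP": 100, "AREA": 10},
    {"COUNTRY": "Iceland", "POP": 5, "AREA": 1},
]
result = join_many_to_one(polygons, "COUNTRY", stats, "COUNTRY")
```

Both Canadian polygons receive the same POP and AREA values, which is the many-to-one case: duplicate primary keys, one matching foreign key.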
See Also
merge_table_with_csv, reinitialize_attribute_table, export_table_to_csv
Python API
def join_tables(self, primary_vector: Vector, primary_key_field: str, foreign_vector: Vector, foreign_key_field: str, import_field: str = "") -> None:
Merge Table With CSV
Function name: merge_table_with_csv
This tool can be used to merge a vector's attribute table with data contained within a comma separated values (CSV) text file. CSV files store tabular data (numbers and text) in plain-text form such that each row is a record and each column a field. Fields are typically separated by commas, although the tool will also support semicolon, tab, and space delimited files. The user must specify the name of the vector (and associated attribute file) as well as the primary key within the table. The primary key (pkey flag) is the field within the table that is being appended to that serves as the unique identifier. Additionally, the user must specify the name of a CSV text file with either a *.csv or *.txt extension. The file must possess a header row, i.e. the first row must contain information about the names of the various fields. The foreign key (fkey flag), that is the identifying field within the CSV file that corresponds with the data contained within the primary key in the table, must also be specified. Both the primary and foreign keys should either be strings (text) or integer values. Fields containing decimal values are not good candidates for keys. Lastly, the user may optionally specify the name of a field within the CSV file to import in the merge operation (import_field flag). If this flag is not specified, all of the fields within the CSV, with the exception of the foreign key, will be appended to the attribute table.
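For readers preparing input files, the sketch below shows one simple way to read a delimited text file with a header row when the delimiter may be a comma, semicolon, tab, or space. The helper name `read_delimited` and the sample data are hypothetical, and the delimiter guess (most frequent candidate in the header line) is an assumption of this sketch rather than the tool's own detection logic.

```python
import csv
import io


def read_delimited(text):
    """Parse delimited text with a header row, guessing the delimiter."""
    header = text.splitlines()[0]
    # Pick the candidate that occurs most often in the header line.
    delimiter = max([",", ";", "\t", " "], key=header.count)
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return delimiter, [dict(row) for row in reader]


delimiter, rows = read_delimited("COUNTRY;POP\nCanada;100\nIceland;5\n")
print(repr(delimiter))  # ';'
print(rows[0])          # {'COUNTRY': 'Canada', 'POP': '100'}
```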
Merging works for one-to-one and many-to-one database relations. A one-to-one relation exists when each record in the attribute table corresponds to one record in the second table and each primary key is unique. Since each record in the attribute table is associated with a geospatial feature in the vector, an example of a one-to-one relation may be where the second file contains AREA and PERIMETER fields for each polygon feature in the vector. This is the most basic type of relation. A many-to-one relation would exist when each record in the first attribute table corresponds to one record in the second file and the primary key is NOT unique. Consider as an example a vector and attribute table associated with a world map of countries. Each country has one or more polygon features in the shapefile, e.g. Canada has its mainland and many hundreds of large islands. You may want to append a table containing data about the population and area of each country. In this case, the COUNTRY columns in the attribute table and the second file serve as the primary and foreign keys respectively. While there may be many duplicate primary keys (all of those Canadian polygons), each will correspond to only one foreign key containing the population and area data. This is a many-to-one relation. The join_tables tool does not support one-to-many nor many-to-many relations.
See Also
join_tables, reinitialize_attribute_table, export_table_to_csv
Python API
def merge_table_with_csv(self, primary_vector: Vector, primary_key_field: str, foreign_csv_filename: str, foreign_key_field: str, import_field: str = "") -> None:
Merge Vectors
Function name: merge_vectors
Combines two or more input vectors of the same ShapeType creating a single, new output vector. Importantly, the attribute table of the output vector will contain the ubiquitous file-specific FID, the parent file name, the parent FID, and the list of attribute fields that are shared among each of the input files. For a field to be considered common between tables, it must have the same name and field_type (i.e. data type and precision).
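The rule for shared attribute fields can be pictured as a set intersection over (name, field_type) pairs. In this sketch the helper `common_fields`, the field names, and the type labels (such as "float:2" for a float with precision 2) are illustrative assumptions, not the library's data model.

```python
def common_fields(tables):
    """Return (name, field_type) pairs present in every table, in first-table order."""
    if not tables:
        return []
    shared = set(tables[0])
    for fields in tables[1:]:
        shared &= set(fields)
    return [field for field in tables[0] if field in shared]


roads_a = [("NAME", "text"), ("LANES", "integer"), ("WIDTH", "float:2")]
roads_b = [("NAME", "text"), ("WIDTH", "float:3"), ("LANES", "integer")]

# WIDTH is dropped because its precision differs between the two inputs.
kept = common_fields([roads_a, roads_b])
print(kept)  # [('NAME', 'text'), ('LANES', 'integer')]
```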
Overlapping features will not be identified nor handled in the merging. If you have significant areas of overlap, it is advisable to use one of the vector overlay tools instead.
The difference between merge_vectors and the Append tool is that merging takes two or more files and creates one new file containing the features of all inputs, and Append places the features of a single vector into another existing (appended) vector.
This tool only operates on vector files. Use the mosaic tool to combine raster data.
See Also
Append, mosaic
Python API
def merge_vectors(self, input_vectors: List[Vector]) -> Vector:
Reinitialize Attribute Table
Function name: reinitialize_attribute_table
Reinitializes a vector's attribute table deleting all fields but the feature ID (FID). Caution: this tool overwrites the input file's attribute table.
Python API
def reinitialize_attribute_table(self, input: Vector) -> None:
Vector Summary Statistics
Function name: vector_summary_statistics
Experimental
Computes grouped summary statistics for a numeric field and writes the result to CSV.
vector statistics table
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input vector layer. | Required | input.shp |
| group_field | Grouping field name. | Required | CLASS |
| value_field | Numeric value field name. | Required | VALUE |
| output | Output CSV path. | Required | - |
Examples
Summarizes a value field by category.
wbe.vector_summary_statistics(group_field='CLASS', input='input.shp', output='summary.csv', value_field='VALUE')
Geometry and Topology
Fix Dangling Arcs
Function name: fix_dangling_arcs
Description
This tool can be used to fix undershot and overshot arcs, two common topological errors, in an input vector lines file (input). In addition to the input lines vector, the user must also specify the output vector (output) and the snap distance (snap). All dangling arcs that are within this threshold snap distance of another line feature will be connected to the neighbouring feature. If the input lines network is a vector stream network, users are advised to apply the repair_stream_vector_topology tool instead.
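The role of the snap distance can be illustrated with a deliberately simplified sketch that moves a dangling endpoint onto the nearest vertex of neighbouring lines when that vertex lies within the threshold. The helper `snap_endpoint` is hypothetical, and snapping to vertices only (rather than to the nearest position along a line) is a simplification of this sketch, not a description of the tool's algorithm.

```python
import math


def snap_endpoint(endpoint, other_vertices, snap_dist):
    """Return the nearest vertex within snap_dist, or the endpoint unchanged."""
    best, best_dist = None, snap_dist
    for vertex in other_vertices:
        dist = math.dist(endpoint, vertex)
        if dist <= best_dist:
            best, best_dist = vertex, dist
    return best if best is not None else endpoint


neighbours = [(0.5, 0.0), (3.0, 3.0)]
print(snap_endpoint((0.0, 0.0), neighbours, snap_dist=1.0))  # (0.5, 0.0)
print(snap_endpoint((0.0, 0.0), neighbours, snap_dist=0.1))  # (0.0, 0.0)
```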
See Also
repair_stream_vector_topology, clean_vector
Python API
def fix_dangling_arcs(self, input: Vector, snap_dist: float) -> Vector:
Lines To Polygons
Function name: lines_to_polygons
This tool converts vector polylines into polygons. Note that this tool will close polygons that are open and will ensure that the first part of an input line is interpreted as the polygon hull and subsequent parts are considered holes. The tool does not examine input lines for line crossings (self intersections), which are topological errors.
See Also
polygons_to_lines
Python API
def lines_to_polygons(self, input: Vector) -> Vector:
Multipart To Singlepart
Function name: multipart_to_singlepart
This tool can be used to convert a vector file containing multi-part features into a vector containing only single-part features. Any multi-part polygons or lines within the input vector file will be split into separate features in the output file, each possessing their own entry in the associated attribute file. For polygon-type vectors, the user may optionally choose to exclude hole-parts from being separated from their containing polygons. That is, with the exclude_holes parameter, hole parts in the input vector will continue to belong to their enclosing polygon in the output vector. The tool will also convert MultiPoint Shapefiles into single Point vectors.
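Conceptually, the split gives every part its own feature and its own copy of the parent's attributes. The sketch below uses a hypothetical dictionary layout for features and ignores hole handling entirely, so it corresponds only to the simplest case described above.

```python
def split_multipart(features):
    """Turn each part of each feature into its own single-part feature."""
    singles = []
    for feature in features:
        for part in feature["parts"]:
            # Each output feature gets its own copy of the attribute record.
            singles.append({"parts": [part], "attributes": dict(feature["attributes"])})
    return singles


islands = [
    {"parts": [[(0, 0), (1, 0), (1, 1)], [(5, 5), (6, 5), (6, 6)]],
     "attributes": {"NAME": "Archipelago"}},
]
singles = split_multipart(islands)
print(len(singles))  # 2
```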
See Also
singlepart_to_multipart
Python API
def multipart_to_singlepart(self, input: Vector, exclude_holes: bool = False) -> Vector:
Polygons To Lines
Function name: polygons_to_lines
This tool converts vector polygons into polylines, simply by modifying the Shapefile geometry type.
See Also
lines_to_polygons
Python API
def polygons_to_lines(self, input: Vector) -> Vector:
Remove Polygon Holes
Function name: remove_polygon_holes
This tool can be used to remove holes from the features within a vector polygon file. The user must specify the name of the input vector file, which must be of a polygon VectorGeometryType, and the name of the output file.
Python API
def remove_polygon_holes(self, input: Vector) -> Vector:
Singlepart To Multipart
Function name: singlepart_to_multipart
This tool can be used to convert a vector file containing single-part features into a vector containing multi-part features. The user has the option to group features based on an ID Field (field flag), which is a categorical field within the vector's attribute table. The ID Field should either be of String (text) or Integer type. Fields containing decimal values are not good candidates for the ID Field. If no field flag is specified, all features will be grouped together into one large multi-part vector.
This tool works for vectors containing either point, line, or polygon features. Since vectors of a POINT VectorGeometryType cannot represent multi-part features, the VectorGeometryType of the output file will be modified to a MULTIPOINT VectorGeometryType if the input file is of a POINT VectorGeometryType. If the input vector is of a POLYGON VectorGeometryType, the user can optionally set the algorithm to search for polygons that should be represented as hole parts. In the case of grouping based on an ID Field, hole parts are polygon features contained within larger polygons of the same ID Field value. Please note that searching for polygon holes may significantly increase processing time for larger polygon coverages.
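The grouping behaviour described above can be sketched as follows: with an ID field, parts are collected per field value, and without one, everything ends up in a single multi-part feature. The helper `group_singleparts` and the feature layout are hypothetical, and hole detection is not modelled.

```python
from collections import defaultdict


def group_singleparts(features, field_name=None):
    """Group single-part features into multi-part features by an ID field."""
    groups = defaultdict(list)
    for feature in features:
        key = feature["attributes"][field_name] if field_name else None
        groups[key].extend(feature["parts"])
    return [{"id": key, "parts": parts} for key, parts in groups.items()]


features = [
    {"parts": ["part1"], "attributes": {"OWNER": "A"}},
    {"parts": ["part2"], "attributes": {"OWNER": "B"}},
    {"parts": ["part3"], "attributes": {"OWNER": "A"}},
]
by_owner = group_singleparts(features, "OWNER")
everything = group_singleparts(features)
print(len(by_owner), len(everything))  # 2 1
```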
See Also
multipart_to_singlepart
Python API
def singlepart_to_multipart(self, input: Vector, field_name: str) -> Vector:
Topology Rule Autofix
Function name: topology_rule_autofix
Experimental
Automatically applies safe, auditable fixes to topology violations detected by topology_rule_validate.
data-tools vector topology fix quality
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input vector path. | Required | input.gpkg |
| rule_set | Rule configuration as JSON array/object, CSV string, or file path. Applies fixes for supported rules only: line_endpoints_must_snap_within_tolerance, point_must_be_covered_by_line, polygon_must_not_have_gaps, line_must_not_have_dangles. | Optional | ['line_endpoints_must_snap_within_tolerance', 'point_must_be_covered_by_line', 'polygon_must_not_have_gaps', 'line_must_not_have_dangles'] |
| snap_tolerance | Tolerance for snapping operations in coordinate units. Defaults to 0.01. | Optional | 0.01 |
| dry_run | If true (default), emits change report without modifying input. If false, applies changes. | Optional | True |
| output | Output vector path for fixed features. If omitted, derived beside input. | Optional | topology_fixed.gpkg |
| change_report | Optional JSON change audit-trail report path. | Optional | - |
Examples
Preview endpoint snapping fixes in dry-run mode with change audit trail.
wbe.topology_rule_autofix(change_report='network_changes.json', dry_run=True, input='network_violations.gpkg', output='network_fixed.gpkg', rule_set=['line_endpoints_must_snap_within_tolerance'], snap_tolerance=0.01)
Topology Rule Validate
Function name: topology_rule_validate
Experimental
Validates vector topology against rule-set checks (self-intersection, overlap, gaps, dangles, point coverage, endpoint snapping) and emits feature-level violations.
data-tools vector topology rules qa
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input vector path. | Required | input.gpkg |
| rule_set | Rule configuration as JSON array/object, CSV string, or file path. Defaults to all 6 MVP rules. Supported: line_must_not_self_intersect, polygon_must_not_overlap, polygon_must_not_have_gaps, line_must_not_have_dangles, point_must_be_covered_by_line, line_endpoints_must_snap_within_tolerance. | Optional | ['line_must_not_self_intersect', 'polygon_must_not_overlap', 'polygon_must_not_have_gaps', 'line_must_not_have_dangles', 'point_must_be_covered_by_line', 'line_endpoints_must_snap_within_tolerance'] |
| snap_tolerance | Tolerance for line_endpoints_must_snap_within_tolerance rule in coordinate units. Defaults to 1.0. | Optional | 1.0 |
| output | Output vector path for violations. If omitted, derived beside input. | Optional | topology_rule_violations.gpkg |
| report | Optional JSON summary report path. | Optional | - |
Examples
Validate network topology including self-intersections, dangles, and endpoint snapping.
wbe.topology_rule_validate(input='network.gpkg', output='network_topology_violations.gpkg', report='network_topology_violations.json', rule_set=['line_must_not_self_intersect', 'line_endpoints_must_snap_within_tolerance'], snap_tolerance=0.5)
Topology Validation Report
Function name: topology_validation_report
Experimental
Audits a vector layer for topology issues and writes a per-feature CSV report.
data-tools vector topology qa
Parameters
| Name | Description | Required | Default |
|---|---|---|---|
| input | Input vector path. | Required | input.gpkg |
| output | Output CSV path. If omitted, a CSV path is derived beside the input. | Optional | topology_report.csv |
Examples
Generate a CSV report of topology issues for a vector layer.
wbe.topology_validation_report(input='parcels.gpkg', output='parcels_topology_report.csv')
Raster-Vector Conversion
Convert Nodata To Zero
Function name: convert_nodata_to_zero
Description
This tool can be used to change the value within the grid cells of a raster (input) that contain NoData to zero. The most common reason for using this tool is to change the background region of a raster image such that it can be included in analysis, since NoData values are usually ignored by most tools. This change, however, will result in the background no longer displaying transparently in most GIS. This change can be reversed using the set_nodata_value tool.
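The operation itself is a simple cell-by-cell substitution. The sketch below shows the idea on a nested list, assuming a numeric sentinel NoData value of -32768.0; the helper name `nodata_to_zero` and the grid layout are illustrative only.

```python
def nodata_to_zero(grid, nodata):
    """Return a copy of the grid with every NoData cell replaced by zero."""
    return [[0.0 if value == nodata else value for value in row] for row in grid]


dem = [
    [-32768.0, 12.5],
    [7.0, -32768.0],
]
print(nodata_to_zero(dem, nodata=-32768.0))  # [[0.0, 12.5], [7.0, 0.0]]
```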
See Also
set_nodata_value, Raster.is_nodata
Parameters
raster (Raster): The input Raster object
Returns
Raster: the returning value
Python API
def convert_nodata_to_zero(self, raster: Raster) -> Raster:
Modify Nodata Value
Function name: modify_nodata_value
This tool can be used to modify the value of pixels containing the NoData value for an input raster image. This operation differs from the set_nodata_value tool, which sets the NoData value for an image in the image header without actually modifying pixel values. Also, set_nodata_value does not overwrite the input file, while the modify_nodata_value tool does. This tool cannot modify the input image data type, which is important to note since it may cause an unexpected behaviour if the new NoData value is negative and the input image data type is an unsigned integer type.
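To see why a negative NoData value is a problem for unsigned integer data, consider what happens when -32768 is forced into an unsigned 16-bit slot. The sketch below uses Python's ctypes, which performs no overflow checking; it only demonstrates that the value cannot be represented, and it makes no claim about how the tool itself responds in this situation.

```python
import ctypes

requested_nodata = -32768

# c_uint16 does no overflow checking, so the negative value wraps around.
stored = ctypes.c_uint16(requested_nodata).value
print(stored)  # 32768, not -32768

# A signed 16-bit type can hold the value unchanged.
signed = ctypes.c_int16(requested_nodata).value
print(signed)  # -32768
```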
See Also
set_nodata_value, convert_nodata_to_zero
Python API
def modify_nodata_value(self, input: Raster, new_value: float = -32768.0):
New Raster From Base Raster
Function name: new_raster_from_base_raster
This tool can be used to create a new raster with the same coordinates and dimensions (i.e. rows and columns) as an existing base image. The user must input a base file (base), the value that the new grid will be filled with (out_val; NoData if unspecified), and the data type (data_type flag; options include 'double', 'float', and 'integer').
See Also
new_raster_from_base_vector, raster_cell_assignment
Python API
def new_raster_from_base_raster(self, base: Raster, out_val: float = float('nan'), data_type: str = "float") -> Raster:
New Raster From Base Vector
Function name: new_raster_from_base_vector
This tool can be used to create a new raster with the same spatial extent as an input vector file (base). The user must specify the name of the base file, the value that the new grid will be filled with (out_val; NoData if unspecified), and the data type (data_type flag; options include 'double', 'float', and 'integer'). The grid cell size (cell_size) must also be specified.
See Also
new_raster_from_base_raster, raster_cell_assignment
Python API
def new_raster_from_base_vector(self, base: Vector, cell_size: float, out_val: float = float('nan'), data_type: str = "float") -> Raster:
Raster To Vector Lines
Function name: raster_to_vector_lines
This tool converts raster lines features into a vector of the POLYLINE VectorGeometryType. Grid cells associated with line features will contain non-zero, non-NoData cell values. The algorithm requires three passes of the raster. The first pass counts the number of line neighbours of each line cell; the second pass traces line segments starting from line ends (i.e. line cells with only one neighbouring line cell); lastly, the final pass traces any remaining line segments, which are likely forming closed loops (and therefore do not have line ends).
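The first pass described above, counting line neighbours and identifying line ends, can be sketched on a small grid. The helper `count_line_neighbours` is hypothetical, and the use of an 8-cell neighbourhood and a NoData sentinel of -32768 are assumptions of this sketch.

```python
def count_line_neighbours(grid, nodata=-32768):
    """Map each line cell (row, col) to its number of neighbouring line cells."""
    rows, cols = len(grid), len(grid[0])

    def is_line(r, c):
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] not in (0, nodata)

    counts = {}
    for r in range(rows):
        for c in range(cols):
            if is_line(r, c):
                counts[(r, c)] = sum(
                    is_line(r + dr, c + dc)
                    for dr in (-1, 0, 1)
                    for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0)
                )
    return counts


grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
counts = count_line_neighbours(grid)
# Line ends are line cells with exactly one neighbouring line cell.
line_ends = sorted(cell for cell, n in counts.items() if n == 1)
print(line_ends)  # [(1, 0), (2, 3)]
```

A closed loop would produce no cells with a count of one, which is why the final pass is needed to trace any segments left over.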
If the line raster contains streams, it is preferable to use the raster_streams_to_vector tool instead. That tool uses knowledge of flow directions to ensure connections between stream segments at confluence sites, whereas raster_to_vector_lines does not.
See Also
raster_to_vector_polygons, raster_to_vector_points, raster_streams_to_vector
Python API
def raster_to_vector_lines(self, raster: Raster) -> Vector:
Raster To Vector Points
Function name: raster_to_vector_points
Converts a raster data set to a vector of the POINT VectorGeometryType. The user must specify the name of a raster file (input) and the name of the output vector (output). Points will correspond with grid cell centre points. All grid cells containing non-zero, non-NoData values will be considered a point. The vector's attribute table will contain a field called 'VALUE' that will contain the cell value for each point feature.
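The mapping from grid cells to cell-centre points can be sketched as below. The georeferencing parameters (`west`, `north`, `cell_size`), the NoData sentinel, and the helper name `cells_to_points` are assumptions made for illustration; square cells and a north-up raster are assumed.

```python
def cells_to_points(grid, west, north, cell_size, nodata=-32768):
    """Return a point at the centre of every non-zero, non-NoData cell."""
    points = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value != 0 and value != nodata:
                x = west + (c + 0.5) * cell_size   # centre of column c
                y = north - (r + 0.5) * cell_size  # centre of row r, rows run southward
                points.append({"x": x, "y": y, "VALUE": value})
    return points


pts = cells_to_points([[0, 7], [-32768, 3]], west=100.0, north=200.0, cell_size=10.0)
print(pts[0])  # {'x': 115.0, 'y': 195.0, 'VALUE': 7}
```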
See Also
raster_to_vector_polygons, raster_to_vector_lines
Python API
def raster_to_vector_points(self, raster: Raster) -> Vector:
Raster To Vector Polygons
Function name: raster_to_vector_polygons
Converts a raster data set to a vector of the POLYGON geometry type. The user must specify the name of a raster file (input) and the name of the output (output) vector. All grid cells containing non-zero, non-NoData values will be considered part of a polygon feature. The vector's attribute table will contain a field called 'VALUE' that will contain the cell value for each polygon feature, in addition to the standard feature ID (FID) attribute.
See Also
raster_to_vector_points, raster_to_vector_lines
Python API
def raster_to_vector_polygons(self, raster: Raster) -> Vector:
Remove Raster Polygon Holes
Function name: remove_raster_polygon_holes
Description
This tool can be used to remove holes in raster polygons. Holes are areas of background values (either zero or NoData), completely surrounded by foreground values (any value other than zero or NoData). Therefore, this tool can somewhat be considered to be the raster equivalent to the vector-based RemovePolygonHoles tool. Users may optionally remove holes less than a specified threshold size (--threshold), measured in grid cells. Hole size is determined using a clumping operation, similar to what is used by the Clump tool. Users may also optionally specify whether or not to include 8-cell diagonal connectedness during the clumping operation (--use_diagonals).
Some GIS professionals have previously used a closing operation to lessen the extent of polygon holes in raster data. A closing is a mathematical morphology operation that involves expanding the raster polygons using a dilation filter (MaximumFilter), followed by an erosion filter (MinimumFilter) on the resulting image. While this common image processing technique can be helpful for reducing the prevalence of polygon holes, it can also have considerable impact on non-hole features within the image. The RemoveRasterPolygonHoles tool, by comparison, will only affect hole features and does not impact the boundaries of other polygons at all. The following image compares the removal of polygon holes (islands in a lake polygon) using a closing operation (middle) calculated using an 11x11 convolution filter and the output of the RemoveRasterPolygonHoles tool. Notice how the convolution operation impacts the edges of the polygon, particularly in convex regions, compared with the RemoveRasterPolygonHoles.
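The hole definition above can be sketched with a clumping pass over background cells. This is a minimal illustration on a binary grid (0 as background, 1 as foreground), not the tool's implementation: the helper `remove_holes` is hypothetical, background clumps that touch the raster edge are assumed not to be holes, and holes are filled with 1 rather than with the value of the surrounding polygon.

```python
from collections import deque


def remove_holes(grid, threshold=None, use_diagonals=False):
    """Fill background clumps (value 0) that do not touch the raster edge."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if use_diagonals:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    out = [row[:] for row in grid]
    seen = set()
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] != 0 or (r0, c0) in seen:
                continue
            # Clump: collect one connected region of background cells.
            clump, touches_edge = [], False
            queue = deque([(r0, c0)])
            seen.add((r0, c0))
            while queue:
                r, c = queue.popleft()
                clump.append((r, c))
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_edge = True
                for dr, dc in steps:
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == 0 and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            # Only holes smaller than the threshold (in cells) are filled.
            small_enough = threshold is None or len(clump) < threshold
            if not touches_edge and small_enough:
                for r, c in clump:
                    out[r][c] = 1
    return out


lake = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
filled = remove_holes(lake)
print(filled[2])  # [0, 1, 1, 1, 0]
```

Only the enclosed cell changes; the polygon's outer boundary is untouched.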
**Here** is a video that demonstrates how to apply this tool to a classified Sentinel-2 multi-spectral satellite imagery data set.
See Also
closing, remove_polygon_holes, clump, generalize_classified_raster
Python API
def remove_raster_polygon_holes(self, input: Raster, threshold_size: int = sys.maxsize, use_diagonals: bool = False) -> Raster:
Set Nodata Value
Function name: set_nodata_value
This tool assigns the NoData value to a user-defined background value in an input raster image. More precisely, the NoData value will be changed to the specified background value and any existing grid cells containing the previous NoData value, if it had been defined, will be changed to this new value. Most WhiteboxTools tools recognize NoData grid cells and treat them specially. NoData grid cells are also often displayed transparently by GIS software. The user must specify the names of the input and output rasters and the background value. The default background value is zero, although any numeric value is possible.
This tool differs from the modify_nodata_value tool in that it simply updates the NoData value in the raster header, without modifying pixel values. The modify_nodata_value tool will update the value in the header, and then modify each existing NoData pixel to contain this new value. Also, set_nodata_value does not overwrite the input file, while the modify_nodata_value tool does.
This tool may result in a change in the data type of the output image compared with the input image, if the background value is set to a negative value and the input image data type is an unsigned integer. In some cases, this may result in a doubling of the storage size of the output image.
See Also
modify_nodata_value, convert_nodata_to_zero, Raster.is_nodata
Python API
def set_nodata_value(self, raster: Raster, back_value: float = 0.0) -> Raster:
Vector Lines To Raster
Function name: vector_lines_to_raster
This tool can be used to convert a vector lines or polygon file into a raster grid of lines. If a vector of one of the polygon VectorGeometryTypes is selected, the resulting raster will outline the polygons without filling these features. Use the vector_polygons_to_raster tool if you need to fill the polygon features.
The user must specify the name of the input vector (input) and the output raster file (output). The Field Name (field) is the field from the attributes table, from which the tool will retrieve the information to assign to grid cells in the output raster. Note that if this field contains numerical data with no decimals, the output raster data type will be INTEGER; if it contains decimals it will be of a FLOAT data type. The field must contain numerical data. If the user does not supply a Field Name parameter, each feature in the raster will be assigned the record number of the feature. The assignment operation determines how the situation of multiple points contained within the same grid cell is handled. The background value is the value that is assigned to grid cells in the output raster that do not correspond to the location of any points in the input vector. This value can be any numerical value (e.g. 0) or the string 'NoData', which is the default.
If the user optionally specifies the cell_size parameter then the coordinates will be determined by the input vector (i.e. the bounding box) and the specified Cell Size. This will also determine the number of rows and columns in the output raster. If the user instead specifies the optional base raster file parameter (base), the output raster's coordinates (i.e. north, south, east, west) and row and column count will be the same as the base file. If the user does not specify either of these two optional parameters, the tool will determine the cell size automatically as the larger of the north-south and east-west extents (determined from the shapefile's bounding box) divided by 500.
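As a worked example of the automatic cell size, the sketch below reads the rule as the larger of the two bounding-box extents divided by 500. That reading, the helper name `default_cell_size`, and the coordinates are assumptions for illustration.

```python
def default_cell_size(north, south, east, west):
    """Larger bounding-box extent divided by 500."""
    return max(north - south, east - west) / 500.0


# A bounding box 5000 units tall and 2500 units wide.
size = default_cell_size(north=5000.0, south=0.0, east=2500.0, west=0.0)
print(size)  # 10.0
```

Under this reading, the longer side of the output raster spans about 500 cells.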
See Also
vector_points_to_raster, vector_polygons_to_raster
Python API
def vector_lines_to_raster(self, input: Vector, field_name: str = "FID", zero_background: bool = False, cell_size: float = 0.0, base_raster: Raster = None) -> Raster:
Vector Points To Raster
Function name: vector_points_to_raster
This tool can be used to convert a vector points file into a raster grid. The user must specify the name of the input vector and the output raster file. The field name (field) is the field from the attributes table from which the tool will retrieve the information to assign to grid cells in the output raster. The field must contain numerical data. If the user does not supply a field name parameter, each feature in the raster will be assigned the record number of the feature. The assignment operation determines how the situation of multiple points contained within the same grid cell is handled. The background value is zero by default but can be set to NoData optionally using the nodata value.
If the user optionally specifies the grid cell size parameter (cell_size) then the coordinates will be determined by the input vector (i.e. the bounding box) and the specified cell size. This will also determine the number of rows and columns in the output raster. If the user instead specifies the optional base raster file parameter (base), the output raster's coordinates (i.e. north, south, east, west) and row and column count will be the same as the base file.
In the case that multiple points are contained within a single grid cell, the output can be assigned (assign) the first, last (default), min, max, sum, mean, or number of the contained points.
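The assignment operations listed above can be pictured as simple reductions over the values of the points that fall in one cell, taken in input order. The helper `assign_cell_value` is hypothetical, and the option strings other than the default 'last' are taken from the wording of the paragraph rather than from a verified list of accepted values.

```python
def assign_cell_value(values, assign_op="last"):
    """Reduce the values of all points in one grid cell to a single cell value."""
    ops = {
        "first": lambda v: v[0],
        "last": lambda v: v[-1],
        "min": min,
        "max": max,
        "sum": sum,
        "mean": lambda v: sum(v) / len(v),
        "number": len,
    }
    return ops[assign_op](values)


in_cell = [4.0, 1.0, 7.0]  # values of three points sharing one cell
print(assign_cell_value(in_cell))            # 7.0
print(assign_cell_value(in_cell, "mean"))    # 4.0
print(assign_cell_value(in_cell, "number"))  # 3
```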
See Also
vector_polygons_to_raster, vector_lines_to_raster
Python API
def vector_points_to_raster(self, input: Vector, field_name: str = "FID", assign_op: str = "last", zero_background: bool = False, cell_size: float = 0.0, base_raster: Raster = None) -> Raster:
Vector Polygons To Raster
Function name: vector_polygons_to_raster
This tool converts a vector polygon file into a raster grid, filling the polygon features.
Python API
def vector_polygons_to_raster(self, input: Vector, field_name: str = "FID", zero_background: bool = False, cell_size: float = 0.0, base_raster: Raster = None) -> Raster:
Workflow Index
This index provides task-first entry points for WbW-QGIS workflows. Each entry links to the chapter section where the complete step-by-step example can be found, and lists the key Processing Toolbox tool IDs needed for the task.
Terrain Analysis
| Task | Chapter section | Key tools |
|---|---|---|
| Compute slope from DEM | Terrain Analysis — Step 2 | whitebox_workflows:slope |
| Compute aspect from DEM | Terrain Analysis — Step 3 | whitebox_workflows:aspect |
| Generate hillshade for visualisation | Terrain Analysis — Step 4 | whitebox_workflows:hillshade |
| Compute profile and plan curvature | Terrain Analysis — Step 5 | whitebox_workflows:profile_curvature, whitebox_workflows:plan_curvature |
| Classify terrain into landform elements | Terrain Analysis — Geomorphons | whitebox_workflows:geomorphons |
| Compute topographic wetness index | Terrain Analysis — Step 6 | whitebox_workflows:wetness_index |
| Fill depressions before terrain derivatives | Terrain Analysis — Step 1 | whitebox_workflows:fill_depressions |
Spatial Hydrology
| Task | Chapter section | Key tools |
|---|---|---|
| Condition DEM for hydrologic routing | Spatial Hydrology — Step 1 | whitebox_workflows:breach_depressions_least_cost |
| Derive D8 flow direction | Spatial Hydrology — Step 2 | whitebox_workflows:d8_pointer |
| Compute flow accumulation | Spatial Hydrology — Step 3 | whitebox_workflows:d8_flow_accumulation |
| Extract stream network from accumulation | Spatial Hydrology — Step 4 | whitebox_workflows:extract_streams |
| Snap pour points to channel raster | Spatial Hydrology — Step 5 | whitebox_workflows:snap_pour_points |
| Delineate watershed / catchment | Spatial Hydrology — Step 6 | whitebox_workflows:watershed |
| Compute Topographic Wetness Index | Spatial Hydrology — TWI | whitebox_workflows:wetness_index |
LiDAR Processing
| Task | Chapter section | Key tools |
|---|---|---|
| QA — inspect point cloud statistics | LiDAR Processing — Step 1 | whitebox_workflows:lidar_point_stats |
| Thin high-density point cloud | LiDAR Processing — Step 2 | whitebox_workflows:lidar_thin |
| Classify ground returns | LiDAR Processing — Step 3 | whitebox_workflows:lidar_ground_point_filter |
| Build DTM from ground-classified cloud | LiDAR Processing — Step 4 | whitebox_workflows:lidar_idw_interpolation |
| Build DSM from first returns | LiDAR Processing — Step 5 | whitebox_workflows:lidar_idw_interpolation |
| Derive canopy height model (CHM) | LiDAR Processing — Step 6 | whitebox_workflows:canopy_height_model |
| Normalise heights above ground | LiDAR Processing — Step 7 | whitebox_workflows:height_above_ground |
Remote Sensing
| Task | Chapter section | Key tools |
|---|---|---|
| Compute NDVI from multispectral image | Remote Sensing — Step 2 | whitebox_workflows:ndvi |
| Threshold vegetation and classify change bins | Remote Sensing — Step 3 | QGIS Raster Calculator, whitebox_workflows:reclass |
| NDVI/NBR-based change detection | Remote Sensing — Step 3 | QGIS Raster Calculator |
| Reduce bands with PCA | Remote Sensing — Step 4 | whitebox_workflows:principal_component_analysis |
| Segment image into homogeneous objects | Remote Sensing — Step 5 | whitebox_workflows:image_segmentation |
Raster Analysis
| Task | Chapter section | Key tools |
|---|---|---|
| Compute distance from a binary feature raster | Raster Analysis — Step 1 | whitebox_workflows:euclidean_distance |
| Reclassify raster into suitability scores | Raster Analysis — Steps 2–4 | whitebox_workflows:reclass_from_file |
| Combine reclassified factors by weight | Raster Analysis — Step 5 | QGIS Raster Calculator |
| Summarise raster values within polygons | Raster Analysis — Step 6 | QGIS Zonal Statistics |
| Smooth raster with focal mean | Raster Analysis — Focal Statistics | whitebox_workflows:mean_filter |
Vector Analysis
| Task | Chapter section | Key tools |
|---|---|---|
| Validate and repair polygon geometry | Vector Analysis — Step 1 | QGIS native:fixgeometries |
| Add area, perimeter, centroid attributes | Vector Analysis — Step 2 | whitebox_workflows:add_geometry_attributes |
| Join attributes from overlapping polygons | Vector Analysis — Step 3 | whitebox_workflows:spatial_join |
| Compute distance to nearest feature | Vector Analysis — Step 4 | whitebox_workflows:near |
| Select features by spatial predicate | Vector Analysis — Step 5 | QGIS Select by Expression |
| Simplify polygon boundaries | Vector Analysis — Simplify | whitebox_workflows:simplify_features |
| Convert GeoPackage to TopoJSON and back | Vector Analysis — TopoJSON Conversion Chain | whitebox_workflows:add_geometry_attributes, QGIS Export |
| Simplify shared boundaries and emit TopoJSON | Vector Analysis — TopoJSON Boundary-Preserving Generalization Chain | whitebox_workflows:simplify_features, QGIS Export |
| Convert TopoJSON transport input, enrich, and re-emit | Vector Analysis — TopoJSON Transport + Enrichment Return Chain | QGIS Export, whitebox_workflows:add_geometry_attributes, whitebox_workflows:spatial_join, whitebox_workflows:near |
Network Analysis
| Task | Chapter section | Key tools |
|---|---|---|
| Prepare road network geometry and costs | Network Analysis — Workflow A | whitebox_workflows:add_geometry_attributes, QGIS geometry tools |
| Compute shortest path routes | Network Analysis — Workflow B | QGIS Network Analysis shortest path tools |
| Delineate road service areas | Network Analysis — Workflow B | QGIS native:serviceareafromlayer |
| Build OD-style batch travel-cost summaries | Network Analysis — Workflow C | QGIS shortest path batch/model workflows |
| Compute Strahler and Shreve stream hierarchy | Network Analysis — Workflow D | whitebox_workflows:strahler_stream_order, whitebox_workflows:shreve_stream_magnitude |
| Convert raster stream network to vector | Network Analysis — Workflow D | whitebox_workflows:raster_streams_to_vector |
Linear Referencing
| Task | Chapter section | Key tools |
|---|---|---|
| Add measure (M) values to route network | Linear Referencing — Step 1 | QGIS native:setmvalue |
| Validate unique route IDs | Linear Referencing — Step 2 | QGIS Python Console check |
| Locate point events along routes | Linear Referencing — Step 3 | whitebox_workflows:locate_point_events |
| Locate line events along routes | Linear Referencing — Step 4 | whitebox_workflows:locate_line_events |
| Calibrate M-values against control points | Linear Referencing — Calibrate | whitebox_workflows:calibrate_route |
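The "QGIS Python Console check" listed for validating unique route IDs can be as small as a duplicate count. The following is a minimal sketch, not the chapter's own snippet: the helper name `find_duplicate_ids` and the field name `route_id` are hypothetical, and the function works on a plain list so that it can be tried outside QGIS.

```python
from collections import Counter


def find_duplicate_ids(route_ids):
    """Return {route_id: count} for every ID that occurs more than once."""
    counts = Counter(route_ids)
    return {rid: n for rid, n in counts.items() if n > 1}


# In the QGIS Python Console the IDs could be collected from the active layer,
# assuming the route ID field is called "route_id" (hypothetical name):
#   ids = [f["route_id"] for f in iface.activeLayer().getFeatures()]
ids = ["R1", "R2", "R3", "R2"]
print(find_duplicate_ids(ids))
```

An empty dictionary means every route ID is unique; any entry names an ID that should be resolved before locating events.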
By Data Type
Raster input tasks
- Fill depressions → see Terrain Analysis / Spatial Hydrology
- Slope, aspect, curvature → see Terrain Analysis
- Flow direction, accumulation → see Spatial Hydrology
- Reclassification, suitability → see Raster Analysis
- Spectral indices, PCA, change → see Remote Sensing
Point cloud input tasks
- Ground classification, DTM/DSM/CHM → see LiDAR Processing
- Height normalisation → see LiDAR Processing
Vector input tasks
- Geometry validation, overlay, joins → see Vector Analysis
- Routing, service areas, and stream hierarchy → see Network Analysis
- Route events, calibration → see Linear Referencing
Troubleshooting
Plugin Does Not Appear in QGIS
Checks:
- Confirm that the plugin directory path is correct for the active QGIS profile.
- Confirm that the plugin folder is named whitebox_workflows_qgis.
- Restart QGIS after any install or symlink change.
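The first two checks can be scripted. This is a minimal sketch under stated assumptions: the helper `check_plugin_folder` is hypothetical, and you pass it the plugins directory of the profile you are actually running. It relies only on the folder name given above and on the general QGIS requirement that a Python plugin folder contains `__init__.py` and `metadata.txt`.

```python
from pathlib import Path

PLUGIN_FOLDER = "whitebox_workflows_qgis"  # folder name from the checklist


def check_plugin_folder(plugins_dir):
    """Return a list of problems; an empty list means the layout looks sane."""
    plugins_dir = Path(plugins_dir)
    if not plugins_dir.is_dir():
        return [f"plugins directory not found: {plugins_dir}"]

    plugin_path = plugins_dir / PLUGIN_FOLDER
    if not plugin_path.exists():
        # exists() follows symlinks, so a dangling symlink also ends up here
        if plugin_path.is_symlink():
            return [f"symlink target is missing: {plugin_path}"]
        return [f"plugin folder not found: {plugin_path}"]

    problems = []
    # Files that QGIS expects in every Python plugin folder
    for required in ("__init__.py", "metadata.txt"):
        if not (plugin_path / required).is_file():
            problems.append(f"missing {required} in {plugin_path}")
    return problems
```

In the QGIS Python Console, the active profile folder can usually be read from `QgsApplication.qgisSettingsDirPath()`, with user plugins under its `python/plugins` subfolder; treat that location as an assumption to confirm on your platform.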
Whitebox Provider Missing from Processing
Checks:
- Confirm plugin is enabled.
- Trigger discovery refresh.
- Confirm that whitebox_workflows imports without error in the QGIS Python environment.
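One way to run the import check is to paste a small probe into the QGIS Python Console. The sketch below uses only the standard library; the helper name `check_import` is hypothetical. It separates "module not installed in this interpreter" from "module found but failed to load", since those two cases call for different fixes.

```python
import importlib
import importlib.util
import sys


def check_import(module_name="whitebox_workflows"):
    """Report whether a module can be located and imported by this interpreter."""
    report = {
        "module": module_name,
        "python": sys.executable,  # which interpreter ran the check
        "found": False,
        "imported": False,
        "location": None,
        "error": None,
    }
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return report  # not installed in this environment
    report["found"] = True
    report["location"] = spec.origin
    try:
        importlib.import_module(module_name)
        report["imported"] = True
    except Exception as exc:  # e.g. a native library that fails to load
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report


print(check_import())
```

If `found` is false, the runtime is not installed in the environment QGIS is using. If `found` is true but `imported` is false, the `error` field shows why loading failed.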
Tools Are Missing or Unexpectedly Locked
Checks:
- Rebuild/reinstall whitebox_workflows.
- Refresh discovery.
- Confirm runtime capability metadata matches expected tier.
- Confirm tool taxonomy and generated provider state are synchronized.
Tool Runs but Output Is Missing
Checks:
- Verify output path exists and is writable.
- Verify input paths and formats are valid.
- Re-run on a small dataset to isolate data-specific failures.
- Check Processing logs for warnings/errors.
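The output-path check can be done before re-running a tool. This is a minimal standard-library sketch; the helper name `check_output_path` and the example path are hypothetical. It only tests the file system and says nothing about whether the tool itself succeeded.

```python
import os
from pathlib import Path


def check_output_path(output_file):
    """Return a list of problems that would stop a tool writing this file."""
    out = Path(output_file)
    parent = out.parent
    problems = []
    if not parent.is_dir():
        problems.append(f"output folder does not exist: {parent}")
    elif not os.access(parent, os.W_OK):
        problems.append(f"output folder is not writable: {parent}")
    # An existing read-only file blocks overwriting even in a writable folder
    if out.is_file() and not os.access(out, os.W_OK):
        problems.append(f"existing output file is not writable: {out}")
    return problems


# Hypothetical path, for illustration only
print(check_output_path("results/slope.tif"))
```

An empty list means the path itself is usable, so the next places to look are the input data and the Processing log.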
Environment Mismatch Problems
Symptoms:
- import failures in plugin startup,
- inconsistent behavior between terminal and QGIS,
- discovery succeeds in one environment but not another.
Resolution:
- Ensure that QGIS and your install command use the same Python environment.
- Re-run the runtime install in that exact environment.
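To see whether the terminal and QGIS really share an environment, the same snippet can be run in both places and the results compared. This is a minimal sketch using only the standard library; the helper name `python_fingerprint` is hypothetical.

```python
import platform
import sys


def python_fingerprint():
    """Collect the details that identify the running Python environment."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": platform.python_version(),
        # Folders where installed packages are searched for
        "package_dirs": [
            p for p in sys.path
            if "site-packages" in p or "dist-packages" in p
        ],
    }


for key, value in python_fingerprint().items():
    print(f"{key}: {value}")
```

Run it once in the terminal where you installed the runtime and once in the QGIS Python Console. If `prefix`, `version`, or `package_dirs` differ, the install went into a different environment from the one QGIS loads. Inside QGIS, `executable` may point at the QGIS binary rather than a Python interpreter on some platforms, so `prefix` and `package_dirs` are the more reliable fields to compare.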