Overview
This manual is the long-form user guide for the WbW-R API.
WbW-R (Whitebox Workflows for R) is the R frontend to Whitebox Next Gen, providing access to modern geospatial processing for raster, vector, lidar, and sensor-bundle data through an R-friendly API and workflow model.
Whitebox is a long-running geospatial project that originated in academia and has grown into a broad analysis platform with recognized strengths in geomorphometry, hydrology, lidar processing, and remote sensing. Whitebox Next Gen is the current Rust-based major iteration focused on performance, cross-platform reliability, and modern data formats.
Whitebox Next Gen is intentionally full-stack: core geospatial capabilities that are often delegated to external C/C++ dependencies in other GIS platforms (for example raster I/O, projections, geometry/topology operations, and lidar handling) are implemented in the Whitebox codebase itself. This architecture is unusual in GIS and provides practical benefits for users: consistent behavior across platforms, tighter control over correctness and performance, fewer system-level dependency issues during installation, and faster iteration when fixing bugs or introducing new capabilities.
In this architecture, WbW-R is the orchestration layer for R users who need both practical scripting ergonomics and backend-scale performance.
This manual is written to be both:
- beginner friendly: clear progression and runnable examples,
- canonical reference: explicit operational patterns, constraints, and validation guidance aligned with backend behavior.
How to Use This Manual
This guide is intended for analysts who want to move from one-off exploratory commands to stable, scriptable geospatial workflows. The examples are practical and executable, but each chapter also explains the operational intent behind the code: when to keep work in memory, when to persist outputs, and how to validate results between stages.
A good first pass is chapter-order reading. After that, use the script index as a task-oriented entry point for adapting workflows to your own projects.
When adapting examples, keep a consistent script shape:
- session setup,
- input loading,
- transformation chain,
- validation and output persistence.
This structure keeps scripts easier to review, test, and maintain.
Goals:
- Comprehensive API coverage.
- Script-first documentation style.
- Chapter layout aligned with the project README.
Documentation style rules:
- Every major concept includes runnable code snippets.
- Each chapter includes at least one end-to-end workflow script.
- Tool parameter specifics are linked to shared tool reference docs as needed.
Common Pitfalls and Validation
- Confirm inputs exist and have the expected CRS/schema/metadata before running long workflows.
- Prefer explicit output names and formats for reproducibility, especially in batch scripts.
- Re-open and inspect outputs after major steps to validate assumptions before chaining more tools.
- For performance-sensitive runs, start with a small representative subset, then scale to full data.
Write Option References
For quick access to output option tables:
- Raster: Output Controls -> Raster Write Option Reference
- Vector: Output Controls -> Vector Write Option Reference
- Lidar: Output Controls -> Lidar Write Option Reference
Installation and Setup
Treat setup as an explicit validation stage, not a one-time administrative task. The smoke test in this chapter confirms that the package and runtime can create a session and enumerate tools, which catches many environment issues before they surface in long-running processing scripts.
For collaborative projects, this chapter doubles as a baseline checklist for new machines and CI environments.
Install
Use this to install the current local package build for development and testing.
R CMD INSTALL crates/wbw_r/r-package/whiteboxworkflows
Smoke Test
This verifies that startup succeeds and tool enumeration is available in your current runtime.
Rscript -e 'library(whiteboxworkflows); s <- wbw_session(); cat(length(wbw_tool_ids(session = s)), "\n")'
Build and Preview
This manual is structured as an mdBook project.
Documentation build steps are part of quality control. Running these commands early and often helps catch broken links, malformed markdown, and chapter order drift before those issues accumulate. Live preview is especially useful when iterating on explanatory text and section hierarchy.
Install mdBook
cargo install mdbook
Build the Manual
Run this when changing chapter content, links, or section ordering.
From repository root:
cd crates/wbw_r/manual
mdbook build
Generated HTML is written to book/.
Live Preview
Use live preview while refining long explanatory sections and chapter flow.
cd crates/wbw_r/manual
mdbook serve --open
Authoring Notes
- Keep chapter order aligned with
src/SUMMARY.md. - Prefer runnable snippets over pseudo-code.
- Link chapter recipes to concrete scripts listed in the script index.
Quick Start
This chapter is a minimal "first success" path. It verifies that your R installation, package setup, session creation, tool execution, and output write path all work together before you take on larger workflows.
Once this slice runs end-to-end, the remaining chapters explain how to make the same pattern more explicit, reproducible, and resilient.
The example below demonstrates the core session-first model in WbW-R: create a session, load data, run a tool, and inspect the result object.
library(whiteboxworkflows)
s <- wbw_session()
dem <- wbw_read_raster("dem.tif")
result <- wbw_slope(dem = dem$file_path(), output = "slope.tif")
print(result)
What to Read Next
These chapters extend the same lifecycle with discovery, progress handling, and larger workflow templates.
- Session lifecycle and discovery: Session and Discovery
- Progress callbacks and execution patterns: Tool Execution and Progress
- End-to-end workflow scripts: Script Index
Quick-Start Conventions
- Prefer explicit
wbw_session(...)construction in scripts. - Validate tool visibility before long-running pipelines.
- Persist outputs deliberately and re-open typed objects for verification.
Session and Discovery
This chapter covers session lifecycle and tool discovery patterns.
Session and discovery APIs establish runtime certainty before execution. By creating a session explicitly and checking tool visibility up front, you avoid late failures in long pipelines and make script behavior easier to reason about across different environments and deployment targets.
Create Session
This is the standard session bootstrap for scripts. Keep it explicit so runtime state is obvious to future readers.
library(whiteboxworkflows)
s <- wbw_session()
print(s)
Visibility Checks
Use hard visibility checks before long jobs so missing-tool failures happen immediately.
library(whiteboxworkflows)
s <- wbw_session()
ids <- wbw_tool_ids(session = s)
cat('Visible tools:', length(ids), '\n')
if (!wbw_has_tool('slope', session = s)) {
stop('slope is not available in this runtime session')
}
Search and Describe
Use this when you know the task objective but need to confirm exact tool IDs and parameter contracts.
library(whiteboxworkflows)
matches <- wbw_search_tools('lidar')
print(matches[1:min(5, length(matches))])
desc <- wbw_describe_tool('slope')
print(desc)
Category-Based Discovery
Category discovery is useful for UI generation, guided workflows, and policy checks in larger automation systems.
library(whiteboxworkflows)
summary <- wbw_category_summary()
print(summary)
cats <- wbw_get_all_categories()
print(cats)
raster_tools <- wbw_tools_in_category('Raster')
print(head(raster_tools, 20))
proj_tools <- wbw_tools_in_category('projection_georeferencing')
print(head(proj_tools, 20))
Session API Reference
WbW-R is function-first, but the wbw_session() object still exposes a useful
set of callable methods for execution, projection utilities, topology checks,
and typed I/O helpers.
Session Construction and Execution
| Method | Description |
|---|---|
run_tool | Execute a tool with a named argument list. |
run_tool_with_progress | Execute a tool and return structured progress/result output. |
list_tools | Return visible tool IDs for the session/license context. |
Typed I/O Helpers
| Method | Description |
|---|---|
read_vector | Read vector data with optional read options. |
write_raster | Write one raster with optional format options. |
write_rasters | Write multiple rasters in one call. |
write_vector | Write one vector with optional format options. |
Projection Utility Methods
| Method | Description |
|---|---|
projection_to_ogc_wkt | Convert EPSG code to OGC WKT text. |
projection_identify_epsg | Identify EPSG from CRS text where possible. |
projection_reproject_points | Reproject point collections between EPSG systems. |
projection_reproject_point | Reproject a single point between EPSG systems. |
Topology Utility Methods
| Method | Description |
|---|---|
topology_intersects_wkt, topology_contains_wkt, topology_within_wkt | Core spatial predicate checks on WKT geometries. |
topology_touches_wkt, topology_disjoint_wkt, topology_crosses_wkt, topology_overlaps_wkt | Additional topological predicates for rule-based QA. |
topology_covers_wkt, topology_covered_by_wkt | Boundary-aware containment checks. |
topology_relate_wkt | Return DE-9IM relation text for exact topology logic. |
topology_distance_wkt | Return shortest distance between WKT geometries. |
topology_vector_feature_relation | Evaluate topology relation between indexed features in two vector objects. |
topology_is_valid_polygon_wkt, topology_make_valid_polygon_wkt | Validate and repair polygon WKT geometries. |
topology_buffer_wkt | Build buffered geometry from WKT and distance. |
Recommended Discovery Pattern
- Build session explicitly.
- Check required tools with
wbw_has_tool(...). - Use
wbw_describe_tool(...)for parameter verification. - Use category functions to drive guided UX or script validation.
Tool Execution and Progress
This chapter documents execution styles and progress handling in WbW-R.
Execution structure affects observability and reproducibility. Explicit session execution makes dependencies and runtime state clear, while progress-aware patterns help with monitoring, troubleshooting, and user feedback during long operations. Think of callbacks as operational instrumentation, not just console output.
Explicit Session Execution
Use explicit session execution as the default in production scripts so runtime dependencies and state are clear.
library(whiteboxworkflows)
s <- wbw_session()
result <- wbw_slope(dem = 'dem.tif', output = 'slope.tif')
print(result)
Progress-Aware Execution
Use progress-aware execution for long operations, remote runs, and logs that require periodic status updates.
library(whiteboxworkflows)
s <- wbw_session()
result <- wbw_run_tool_with_progress(
'slope',
args = list(dem = 'dem.tif', output = 'slope.tif'),
session = s,
on_progress = wbw_print_progress
)
str(result$progress)
Custom Progress Callback
This callback pattern is useful when progress events need to feed custom log formats or monitoring pipelines.
progress_cb <- local({
last <- -1L
function(pct = NA_real_, message = '') {
msg <- if (is.null(message)) '' else as.character(message[[1]])
if (!is.numeric(pct) || length(pct) == 0L || is.na(pct[[1]])) {
m <- regexpr('(-?[0-9]+(\\.[0-9]+)?)\\s*%', msg, perl = TRUE)
if (m[[1]] >= 0L) {
token <- regmatches(msg, m)
pct <- as.numeric(sub('%.*$', '', token))
} else {
pct <- NA_real_
}
}
if (is.numeric(pct) && length(pct) > 0L && !is.na(pct[[1]])) {
value <- as.numeric(pct[[1]])
if (value <= 1.0) value <- value * 100.0
pct_int <- as.integer(max(0, min(100, floor(value))))
if (pct_int > last) {
cat(sprintf('[%3d%%] %s\\n', pct_int, msg))
last <<- pct_int
}
}
}
})
# Use: on_progress = progress_cb
Recommended Execution Pattern
- Validate tool visibility first.
- Run with explicit session.
- Capture progress for long operations.
- Persist outputs and re-open typed objects for post-processing.
Working with Rasters
This chapter documents practical raster workflows in WbW-R with emphasis on inspection, iteration, modification, and persistence.
Raster processing usually alternates between backend tools for heavy transformations and array-level edits for custom logic. The key design choice is to use tool operations for scale and consistency, then reserve manual array passes for domain-specific adjustments that are hard to express with existing tools.
Raster Lifecycle
This lifecycle makes assumptions explicit before computation and keeps your workflow reviewable.
Typical lifecycle:
- Read raster.
- Inspect metadata.
- Transform values (tool-driven or array-driven).
- Persist outputs with explicit options when needed.
library(whiteboxworkflows)
r <- wbw_read_raster('dem.tif')
meta <- r$metadata()
print(meta$rows)
print(meta$columns)
print(meta$nodata)
Memory-Backed Rasters for Pipeline Efficiency
For workflows that chain multiple tool operations, memory-backed rasters eliminate disk I/O between steps. This is especially valuable when processing large rasters in complex pipelines. Rasters remain in process memory, accessible to subsequent tools without writing to disk.
Load a raster into memory with file_mode = "m":
library(whiteboxworkflows)
# Read directly into memory; no disk path required for subsequent ops
r1 <- wbw_read_raster('dem.tif', file_mode = "m")
r2 <- wbw_read_raster('slope.tif', file_mode = "m")
# Both rasters are now memory-backed. Chain operations without disk:
result <- r1 + r2
print(result$file_path()) # prints: memory://raster/...
Memory-backed paths are compatible with all downstream raster operations:
library(whiteboxworkflows)
r <- wbw_read_raster('dem.tif', file_mode = "m")
# Inspect and transform
meta <- r$metadata()
scaled <- wbw_multiply(input = r$file_path(), multiplier = 1.5)
# Export to disk when ready
wbw_write_raster(scaled, 'dem_scaled_1p5x.tif')
Memory Lifecycle and Cleanup
Memory-backed rasters persist in the process store until explicitly removed or cleared. For long-running jobs, manage memory explicitly to avoid accumulation:
library(whiteboxworkflows)
# Check current memory usage
count_before <- wbw_raster_memory_count()
bytes_before <- wbw_raster_memory_bytes()
cat('Rasters in memory:', count_before, '\n')
cat('Bytes used:', bytes_before, '\n')
# Read two rasters
r1 <- wbw_read_raster('large1.tif', file_mode = "m")
r2 <- wbw_read_raster('large2.tif', file_mode = "m")
cat('After reads:', wbw_raster_memory_count(), '\n')
# Remove one raster when done
wbw_remove_raster_from_memory(r1)
cat('After remove:', wbw_raster_memory_count(), '\n')
# Or clear all rasters at once
wbw_clear_raster_memory()
cat('After clear:', wbw_raster_memory_count(), '\n')
Best practices:
- Use
file_mode = "m"for intermediate results in tool chains. - Export memory-backed rasters to disk with
write()when persisting results. - Call
remove_raster_from_memory()after a raster is no longer needed. - Use
clear_raster_memory()between independent job phases. - Use
wbw_clear_memory()when resetting all in-process raster/vector/lidar stores together. - Monitor
raster_memory_count()andraster_memory_bytes()in large pipelines.
Iterating Through Grid Cells
Use this only for logic that cannot be expressed through existing tools or vectorized operations.
Use to_array() for cell-level logic.
library(whiteboxworkflows)
r <- wbw_read_raster('dem.tif')
a <- r$to_array()
meta <- r$metadata()
nr <- dim(a)[1]
nc <- dim(a)[2]
for (row in seq_len(nr)) {
for (col in seq_len(nc)) {
z <- a[row, col]
if (!is.na(z) && z != meta$nodata) {
if (z < 0) {
a[row, col] <- 0
}
}
}
}
Fast Random Cell Access with Pinning
For neighbourhood logic or flow-path traversal that performs frequent random cell reads/writes, use pinned raster views. Pinning loads the raster values once and avoids repeated conversion overhead inside tight loops.
Single-raster pinning:
library(whiteboxworkflows)
r <- wbw_read_raster("dem.tif")
p <- wbw_pinned_raster(r)
v <- p$get_value(100, 200)
p$set_value(100, 200, v + 1)
p$close()
Two-raster read/write scan loop:
library(whiteboxworkflows)
src <- wbw_read_raster("D8Pointer.tif")
dst <- wbw_read_raster("output_template.tif")
wbw_with_pinned_rasters(src, dst, .f = function(srcp, dstp) {
meta <- src$metadata()
for (row in seq_len(meta$rows)) {
for (col in seq_len(meta$columns)) {
value <- srcp$get_value(row, col)
dstp$set_value(row, col, value)
}
}
})
Notes:
wbw_with_pinned_rasters()closes all pins automatically.wbw_pin_rasters(...)is available when manual close control is preferred.- Pinned write-back currently targets file-backed rasters in wbw_r.
Writing Modified Data Back
This demonstrates creating a derived raster while preserving base geospatial context.
library(whiteboxworkflows)
base <- wbw_read_raster('dem.tif')
a <- base$to_array()
a <- a * 1.05
out <- wbw_array_to_raster(a, base, output_path = 'dem_scaled.tif')
print(out)
Supported raster write keys and valid values are documented in Output Controls.
Multi-Band Iteration
Use this when transform rules depend on band identity or per-band thresholds.
library(whiteboxworkflows)
r <- wbw_read_raster('multiband.tif')
a <- r$to_array()
# If array is [rows, cols, bands], loop accordingly.
d <- dim(a)
if (length(d) == 3) {
nr <- d[1]
nc <- d[2]
nb <- d[3]
for (b in seq_len(nb)) {
for (row in seq_len(nr)) {
for (col in seq_len(nc)) {
v <- a[row, col, b]
if (!is.na(v) && v < 0) {
a[row, col, b] <- 0
}
}
}
}
}
wbw_array_to_raster(a, r, output_path = 'multiband_clamped.tif')
Tool-First Raster Processing
Prefer this pattern for heavy processing: let optimized tools do most work, then apply targeted custom edits.
library(whiteboxworkflows)
s <- wbw_session()
dem <- wbw_read_raster('dem.tif')
wbw_fill_depressions(dem = dem$file_path(), output = 'filled.tif')
filled <- wbw_read_raster('filled.tif')
slope <- filled$square() # placeholder unary op example
slope$write('slope_out.tif', overwrite = TRUE)
NoData and Performance Guidance
- Always account for
metadata()$nodatain per-cell loops. - Prefer tool operations for heavy transforms.
- Use array loops only for custom logic not available as tools.
Raster Object Method Reference
Metadata and Conversion
| Method | Description |
|---|---|
metadata | Return raster metadata (dimensions, extent, nodata, CRS fields). |
file_path | Return the backing raster path. |
band_count | Return raster band count. |
active_band | Return active band index. |
get_value, set_value | Read/write single cell values using 1-based row/column/band indices. |
crs_epsg, crs_wkt | Inspect CRS metadata. |
to_array | Convert raster to in-memory array for custom processing. |
to_stars | Convert raster to a stars object (requires terra support path). |
pin | Create a pinned view for low-overhead random cell access in tight loops. |
Arithmetic and Unary Transform Methods
| Method | Description |
|---|---|
add, subtract, multiply, divide | Cellwise binary arithmetic with another raster/path operand. |
abs, ceil, floor, round, square, sqrt | Common unary numeric transforms. |
log10, log2, exp, exp2 | Log and exponential transforms. |
sin, cos, tan, sinh, cosh, tanh | Trigonometric and hyperbolic transforms. |
Persistence
| Method | Description |
|---|---|
deep_copy | Copy raster to a new output path and return a raster object. |
write | Write raster to disk with optional output options. |
Working with Vectors
This chapter covers schema inspection, feature iteration, attribute reads/writes, and persistence workflows.
Reliable vector workflows depend on stable schema and attribute contracts as much as geometry itself. The patterns in this chapter emphasize inspecting structure early, applying deterministic edits, and validating outputs after persistence so downstream analysis remains predictable.
See Also: Online Sources
If your workflow starts by downloading vectors from web providers (starting with OSM Overpass), use the dedicated chapter:
Read and Inspect
Begin with schema and metadata inspection so edits are grounded in the actual field model.
library(whiteboxworkflows)
v <- wbw_read_vector('roads.gpkg')
schema <- v$schema()
print(schema)
print(v$metadata())
Memory-Backed Vectors for Pipeline Efficiency
For workflows that chain multiple vector operations, memory-backed vectors eliminate disk I/O between steps. This is valuable for complex pipelines where intermediate results are passed between spatial operations.
Load a vector into memory with file_mode = "m":
library(whiteboxworkflows)
# Read directly into memory
roads <- wbw_read_vector('roads.gpkg', file_mode = "m")
rivers <- wbw_read_vector('rivers.gpkg', file_mode = "m")
print(roads$path) # prints: memory://vector/...
Memory-backed vectors are compatible with all downstream operations:
library(whiteboxworkflows)
v <- wbw_read_vector('polygons.gpkg', file_mode = "m")
# Inspect schema and metadata
schema <- v$schema()
meta <- v$metadata()
# Pass to spatial tools
centroid_path <- wbw_centroid_vector(input = v$path, output = 'centroids')
# Export to disk when ready
result <- wbw_read_vector(centroid_path)
result$write('centroids_final.gpkg')
Vector Memory Lifecycle
Memory-backed vectors persist until explicitly removed or cleared. For long-running vector pipelines, manage memory explicitly:
library(whiteboxworkflows)
# Check current memory
cat('Vectors in memory:', wbw_vector_memory_count(), '\n')
# Read vectors
v1 <- wbw_read_vector('large1.gpkg', file_mode = "m")
v2 <- wbw_read_vector('large2.gpkg', file_mode = "m")
cat('After reads:', wbw_vector_memory_count(), '\n')
# Remove when done
wbw_remove_vector_from_memory(v1)
cat('After remove:', wbw_vector_memory_count(), '\n')
# Or clear all
wbw_clear_vector_memory()
cat('After clear:', wbw_vector_memory_count(), '\n')
Implicit Memory Output from Tools
All vector-output tools store their result in memory automatically when the
output argument is omitted (NULL). You do not need to pass file_mode = "m"
or choose a temporary path — simply leave output out and the returned
wbw_vector object is already memory-backed:
library(whiteboxworkflows)
wbe <- wbw_make_session()
roads <- wbw_read_vector('roads.gpkg')
# No output path — result is stored in memory automatically
centroids <- wbe$centroid_vector(input = roads$path)
cat(centroids$path, '\n') # prints: memory://vector/...
# Chain operations without any intermediate files
clipped <- wbe$clip(input = centroids$path, clip = 'boundary.gpkg')
cat(clipped$path, '\n') # also memory://vector/...
# Persist the final result only
wbw_write_vector(clipped, 'result.gpkg')
This applies to all tool categories — GIS, hydrology, geomorphometry, and stream
network tools all follow the same rule. Providing an explicit output path
writes to disk as before.
Best practices:
- Use
file_mode = "m"for intermediate spatial analysis results. - Export memory-backed vectors to disk with
write()when persisting final outputs. - Call
remove_vector_from_memory()after a vector is no longer needed. - Use
clear_vector_memory()between independent analysis phases. - Use
wbw_clear_memory()when resetting all in-process raster/vector/lidar stores together.
Iterate Through Features
Use this for quality checks, custom filters, and record-level diagnostics.
library(whiteboxworkflows)
v <- wbw_read_vector('roads.gpkg')
count <- v$metadata()$feature_count
for (i in seq_len(count)) {
attrs <- v$attributes(i)
# attrs is a named list
print(attrs)
}
Read and Update Attribute Table
This example combines common edit actions: single-value updates, grouped updates, and field creation.
library(whiteboxworkflows)
v <- wbw_read_vector('roads.gpkg')
name1 <- v$attribute(1, 'name')
print(name1)
v$update_attribute(1, 'name', 'Main Street')
v$update_attributes(2, list(speed = 50, class = 'collector'))
v$add_field('reviewed', field_type = 'integer', default_value = 0)
Persist Vector Outputs
This pattern demonstrates tool-driven persistence and post-write verification.
For complete write-option keys and allowed values, see Output Controls.
library(whiteboxworkflows)
s <- wbw_session()
roads <- wbw_read_vector('roads.gpkg')
wbw_centroid_vector(input = roads$file_path(), output = 'roads_centroids.gpkg')
centroids <- wbw_read_vector('roads_centroids.gpkg')
print(centroids$metadata())
Practical Notes
- Call
schema()first to confirm field names and expected types. - Use
update_attributes()for grouped feature edits. - Re-read output files to validate schema and values after writes.
Vector Object Method Reference
Metadata and Structure
| Method | Description |
|---|---|
metadata | Return vector metadata (geometry type, feature count, CRS, fields). |
schema | Return field names and types as a data frame. |
path | Return backing vector path. |
to_terra, to_sf | Convert to terra/sf objects for ecosystem workflows. |
Attribute Access and Edits
| Method | Description |
|---|---|
attributes | Return all attribute values for one feature index. |
attribute | Return one field value for one feature index. |
update_attributes | Update multiple fields for one feature index. |
update_attribute | Update one field for one feature index. |
add_field | Add a new field with declared type and default value. |
Persistence
| Method | Description |
|---|---|
deep_copy | Copy vector to a new path with optional write options. |
write | Write vector to a new output path. |
Working with Lidar
This chapter documents lidar workflows in WbW-R, including file-backed processing, metadata checks, and output control patterns.
Lidar workflows are strongly shaped by dataset size and I/O cost. The guiding pattern here is to use file-backed objects and vectorized matrix operations for normal workloads, then switch to chunked processing when point counts exceed comfortable memory limits.
Baseline Workflow
Use this as a first-run validation before introducing matrix edits or chunked processing.
library(whiteboxworkflows)
l <- wbw_read_lidar('survey.las')
print(l$metadata())
copy <- l$deep_copy('survey_copy.las', overwrite = TRUE)
print(copy$metadata())
Memory-Backed Lidar for Pipeline Efficiency
For workflows that chain multiple lidar operations, memory-backed lidar objects eliminate disk I/O between steps. This is valuable when processing large point clouds through sequential filtering or classification steps.
Load a point cloud into memory with file_mode = "m":
library(whiteboxworkflows)
# Read directly into memory
survey <- wbw_read_lidar('survey.las', file_mode = "m")
print(survey$file_path()) # prints: memory://lidar/...
Memory-backed lidar objects support the full matrix/data-frame API and all downstream operations:
library(whiteboxworkflows)
# Read into memory
survey <- wbw_read_lidar('survey.las', file_mode = "m")
# Inspect and process
meta <- survey$metadata()
cat('Points:', meta$point_count, '\n')
# Extract and edit points
pts <- survey$to_matrix(fields = c('x', 'y', 'z', 'classification'))
high <- pts[, 3] > 250
pts[high, 4] <- 6
# Write edits back to disk
edited <- survey$from_matrix(
pts,
output_path = 'survey_filtered.laz',
overwrite = TRUE,
fields = c('x', 'y', 'z', 'classification')
)
Lidar Memory Lifecycle
Memory-backed lidar objects persist until explicitly removed or cleared. For long-running lidar pipelines, manage memory explicitly:
library(whiteboxworkflows)
# Check current memory
cat('Lidar objects in memory:', wbw_lidar_memory_count(), '\n')
# Read point clouds
survey1 <- wbw_read_lidar('large1.las', file_mode = "m")
survey2 <- wbw_read_lidar('large2.las', file_mode = "m")
cat('After reads:', wbw_lidar_memory_count(), '\n')
# Remove when done
wbw_remove_lidar_from_memory(survey1)
cat('After remove:', wbw_lidar_memory_count(), '\n')
# Or clear all
wbw_clear_lidar_memory()
cat('After clear:', wbw_lidar_memory_count(), '\n')
Implicit Memory Output from Tools
All lidar-output tools store their result in memory automatically when the
output argument is omitted (NULL). You do not need to pass file_mode = "m"
or choose a temporary path — simply leave output out and the returned
wbw_lidar object is already memory-backed:
library(whiteboxworkflows)
wbe <- wbw_make_session()
survey <- wbw_read_lidar('survey.laz')
# No output path — result is stored in memory automatically
filtered <- wbe$filter_lidar_classes(input = survey$path, excluded_classes = list(7L))
cat(filtered$path, '\n') # prints: memory://lidar/...
# Chain operations without any intermediate files
thinned <- wbe$lidar_thin(input = filtered$path, resolution = 0.5)
cat(thinned$path, '\n') # also memory://lidar/...
# Persist the final result only
wbw_write_lidar(thinned, 'survey_clean.copc.laz')
This applies to all lidar tools. Providing an explicit output argument writes
to disk as before.
Best practices:
- Use
file_mode = "m"for intermediate point cloud processing. - Export memory-backed lidar to disk with
write()when persisting final outputs. - Call
remove_lidar_from_memory()after a point cloud is no longer needed. - Use
clear_lidar_memory()between independent analysis phases. - Use
wbw_clear_memory()when resetting all in-process raster/vector/lidar stores together. - Monitor
lidar_memory_count()for large processing jobs.
Iterating Through Lidar Points
Stable WbW-R lidar objects are file-backed and tool-oriented, with explicit columnar point access through matrix/data-frame APIs. Direct point-by-point iteration is still not the primary API path.
Recommended point-level workflow:
- Use
to_matrix()orto_data_frame()with selected point fields. - Apply vectorized edits in base R/tidy workflows.
- Write updates with
from_matrix(...)orfrom_data_frame(...).
The example below uses a simple reclassification rule to illustrate the matrix roundtrip pattern.
library(whiteboxworkflows)
l <- wbw_read_lidar('survey.las')
pts <- l$to_matrix(fields = c('x', 'y', 'z', 'classification'))
ground <- pts[, 4] == 2
pts[ground, 4] <- 6
edited <- l$from_matrix(
pts,
output_path = 'survey_reclassified.laz',
overwrite = TRUE,
fields = c('x', 'y', 'z', 'classification')
)
print(edited$point_count())
Tool-Driven Lidar Processing
Use tool-driven processing when extracting QA outputs and diagnostics from lidar datasets.
For complete lidar write-option keys and allowed values, see Output Controls.
library(whiteboxworkflows)
s <- wbw_session()
wbw_lidar_info(input = 'survey.las', output = 'survey_info.html')
Chunked Matrix Streaming
For very large point clouds, use chunked matrix streaming to avoid keeping the full point matrix in memory.
Recommended chunked workflow:
- Read chunks with
to_matrix_chunks(...). - Apply vectorized edits in each chunk.
- Write edited chunks with
from_matrix_chunks(...).
Use this when full-matrix operations would exceed available memory.
library(whiteboxworkflows)
l <- wbw_read_lidar('survey.las')
fields <- c('x', 'y', 'z', 'classification')
chunks <- l$to_matrix_chunks(chunk_size = 200000, fields = fields)
for (i in seq_along(chunks)) {
high <- chunks[[i]][, 3] > 250
chunks[[i]][high, 4] <- 6
}
edited <- l$from_matrix_chunks(
chunks,
output_path = 'survey_chunked_reclassified.laz',
overwrite = TRUE,
fields = fields
)
print(edited$point_count())
Notes:
- LAS/LAZ chunk outputs use shared core streaming rewrite.
- Other output formats currently fall back to non-streaming assembly in this API path.
Best Practices
- Validate CRS before and after reprojection.
- Keep source lidar immutable; write derived products to new files.
- Prefer COPC/LAZ outputs for large cloud workflows.
Lidar Object Method Reference
Metadata and Access
| Method | Description |
|---|---|
metadata | Return lidar metadata (bounds, point format, CRS, counts). |
point_count | Return total number of points. |
file_path, path | Return backing lidar path. |
get_short_filename | Return basename of lidar file. |
Matrix and Data-Frame Conversion
| Method | Description |
|---|---|
to_matrix | Read selected lidar fields as numeric matrix. |
to_data_frame | Read selected lidar fields as data frame. |
to_matrix_chunks | Read selected lidar fields in chunked matrix blocks. |
Writing Edited Point Data
| Method | Description |
|---|---|
from_matrix | Write edited matrix data to lidar output. |
from_data_frame | Write edited data frame fields to lidar output. |
from_matrix_chunks | Write chunked matrix edits to lidar output. |
Persistence
| Method | Description |
|---|---|
deep_copy | Copy lidar to a new path with optional write options. |
write | Write lidar to a new output path. |
Working with Sensor Bundles
This chapter documents supported sensor bundle families and common workflows.
Bundle APIs provide a common operational interface across heterogeneous sensor families. The conceptual goal is to standardize ingestion and inspection steps even when archive layout, naming conventions, or QA assets differ between providers.
Supported Families
- Generic auto-detect:
wbw_read_bundle(...) - Landsat:
wbw_read_landsat(...) - Sentinel-1 SAFE:
wbw_read_sentinel1(...) - Sentinel-2 SAFE:
wbw_read_sentinel2(...) - PlanetScope:
wbw_read_planetscope(...) - ICEYE:
wbw_read_iceye(...) - DIMAP:
wbw_read_dimap(...) - Maxar/WorldView:
wbw_read_maxar_worldview(...) - RADARSAT-2:
wbw_read_radarsat2(...) - RCM:
wbw_read_rcm(...)
Common Inspection Pattern
Start here to establish what each bundle contains before choosing analysis channels.
library(whiteboxworkflows)
b <- wbw_read_bundle('BUNDLE_ROOT')
print(b$family)
print(b$key_summary())
print(b$list_band_keys())
print(b$list_measurement_keys())
print(b$list_qa_keys())
print(b$list_asset_keys())
Family-Specific Examples
These examples follow one shared shape across providers: load, inspect, sample key channels, then generate QA-ready outputs.
Sentinel-2
Use this for optical band workflows where band identity is explicit.
s2 <- wbw_read_sentinel2('S2_SCENE.SAFE')
red <- s2$read_band('B04')
nir <- s2$read_band('B08')
rgb <- s2$write_true_colour('s2_true_colour.tif')
Landsat
Use this when metadata such as cloud cover and processing level drive filtering.
ls <- wbw_read_landsat('LC09_SCENE')
print(ls$processing_level())
print(ls$cloud_cover_percent())
preview <- ls$read_band(ls$list_band_keys()[[1]])
Sentinel-1 / SAR Families
Use this pattern for SAR measurement access and quick false-colour QA.
s1 <- wbw_read_sentinel1('S1_SCENE.SAFE')
print(s1$polarizations())
meas <- s1$read_measurement(s1$list_measurement_keys()[[1]])
fc <- s1$write_false_colour('s1_false_colour.tif')
PlanetScope / ICEYE / DIMAP / Maxar-WorldView / RADARSAT-2 / RCM
Use this loop for multi-provider intake checks with a consistent script shape.
loaders <- list(
wbw_read_planetscope,
wbw_read_iceye,
wbw_read_dimap,
wbw_read_maxar_worldview,
wbw_read_radarsat2,
wbw_read_rcm
)
paths <- list('PLANETSCOPE_SCENE', 'ICEYE_SCENE', 'DIMAP_SCENE', 'MAXAR_SCENE', 'RADARSAT2_SCENE', 'RCM_SCENE')
for (i in seq_along(loaders)) {
b <- loaders[[i]](paths[[i]])
print(c(b$family, b$bundle_root))
}
Recommended Workflow
- Open with family-specific reader when known.
- Inspect key groups with
key_summary()andlist_*_keys(). - Read representative channels via
read_any()or family-specific methods. - Generate preview/composite rasters for QA.
- Persist derived outputs and record source family + key choices.
Calibration Contract Notes (Landsat/Sentinel-2)
Remote-sensing radiometric and thermal tools in WbW-R now rely on a
standardized sensor-bundle calibration contract in wbraster.
- Sentinel-2 bundle metadata provides quantification values used for DN to TOA reflectance scaling.
- Landsat bundles provide typed per-band reflectance and thermal constants used in TOA reflectance and LST workflows.
For best results, keep original SAFE/MTL metadata with scene rasters when moving or staging bundle data for analysis.
Sensor Bundle Object Method Reference
Metadata and Key Discovery
| Method | Description |
|---|---|
metadata | Return bundle metadata (family, product descriptors, CRS hints). |
family, bundle_root, path | Return family ID and resolved bundle root/path. |
key_summary | Summarize available key groups and counts. |
list_band_keys, list_measurement_keys | List band/measurement keys. |
list_qa_keys, list_aux_keys, list_asset_keys | List QA, auxiliary, and asset keys. |
resolve_key, has_key | Resolve and verify key existence across key types. |
Raster Access by Key
| Method | Description |
|---|---|
read_any | Read key by searching configured key types. |
read_band | Read band key as raster. |
read_measurement | Read measurement key as raster. |
read_qa_layer, read_aux_layer, read_asset | Read QA/auxiliary/asset keys as rasters. |
read_preview_raster | Return selected preview raster descriptor. |
Composite Generation
| Method | Description |
|---|---|
write_true_colour | Write true-colour composite raster. |
write_false_colour | Write false-colour composite raster. |
Common Metadata Accessors
| Method | Description |
|---|---|
acquisition_datetime_utc | Return acquisition timestamp when available. |
processing_level, product_type | Return processing/product descriptors. |
tile_id, mission, acquisition_mode | Return common platform and acquisition descriptors. |
cloud_cover_percent, polarizations | Return cloud or polarization metadata when available. |
Output Controls
This chapter documents practical output controls for raster, vector, and lidar workflows in WbW-R.
Output configuration is where reproducibility is made explicit. Defaults are useful during exploration, but production scripts should pin extensions and format options so artifacts remain comparable across runs and environments. Treat this chapter as the policy layer for how outputs are persisted.
General Principles
- Start with defaults until explicit format constraints are required.
- Use explicit output extensions for reproducibility.
- Re-open outputs and validate metadata after writes.
Raster Output Controls
Raster objects expose write(...) and deep_copy(...) with optional options.
Use explicit raster options when output layout and compression behavior must be stable across environments.
library(whiteboxworkflows)
r <- wbw_read_raster('dem.tif')
# Default write
r$write('dem_default.tif', overwrite = TRUE)
# Write with options list
r$write(
'dem_cog.tif',
overwrite = TRUE,
options = list(
compress = TRUE,
strict_format_options = TRUE,
geotiff = list(
layout = 'cog',
tile_size = 512,
compression = 'deflate',
bigtiff = FALSE
)
)
)
Raster Write Option Reference
Raster write options are supplied as an R list. The current top-level keys
are:
compress: logical convenience switch for GeoTIFF-family outputs.strict_format_options: logical validation switch.geotiff: nested list for GeoTIFF / BigTIFF / COG-specific controls.jpeg2000: nested list for JPEG2000 (.jp2) controls.
compress
compress is a convenience flag used for GeoTIFF-family outputs.
TRUE: maps to GeoTIFFdeflatecompression.FALSE: maps to uncompressed GeoTIFF output.
Default behavior:
compressis unset by default (notTRUEand notFALSE).- For GeoTIFF-family outputs, an unset
compressfalls through to the GeoTIFF writer default compression, which isdeflate. - For non-GeoTIFF outputs,
compresshas no effect.
If geotiff$compression is also supplied, the explicit
geotiff$compression value takes precedence.
strict_format_options
strict_format_options must be TRUE or FALSE.
FALSE(default): GeoTIFF-specific options are ignored when writing a non-GeoTIFF output; JPEG2000-specific options are ignored when writing a non-.jp2output.TRUE: GeoTIFF-specific options on a non-GeoTIFF output path, or JPEG2000-specific options on a non-.jp2output path, raise an error.
jpeg2000
The nested jpeg2000 list supports these keys:
compression: character scalar compression mode.quality_db: numeric quality target in dB used whencompression = 'lossy'.decomp_levels: non-negative integer decode-resolution level hint (0to255).
Default values when keys are omitted:
compression:'lossy'.quality_db:35.0whencompression = 'lossy'and no explicitquality_dbis provided.decomp_levels: writer default (unset).
Supported jpeg2000$compression values:
'lossless''lossy'
Notes:
quality_dbis optional; when omitted withcompression = 'lossy', the writer default quality is used.- JPEG2000 color-space controls are not exposed in the R wrapper yet.
geotiff
The nested geotiff list supports these keys:
compression: character scalar naming the compression codec.bigtiff: logical.layout: character scalar naming the layout.rows_per_strip: positive integer used whenlayout = 'stripped'.tile_width: positive integer used whenlayout = 'tiled'.tile_height: positive integer used whenlayout = 'tiled'.tile_size: positive integer shortcut for COG tile size, and also accepted as a shortcut for both tile width and tile height whenlayout = 'tiled'.cog_tile_size: positive integer alias fortile_sizewhenlayout = 'cog'.
Default values when keys are omitted:
compression:'deflate'.bigtiff:FALSE.layout:'standard'.rows_per_strip:1whenlayout = 'stripped'.tile_width:512whenlayout = 'tiled'.tile_height: defaults totile_widthwhenlayout = 'tiled'.tile_size/cog_tile_size:512whenlayout = 'cog'.
Supported geotiff$compression values:
'none''off''uncompressed''deflate''zip''lzw''packbits''pack_bits''jpeg''webp''web_p''jpegxl''jpeg_xl''jxl'
These are accepted aliases for the same underlying codecs:
'none','off', and'uncompressed''deflate'and'zip''packbits'and'pack_bits''webp'and'web_p''jpegxl','jpeg_xl', and'jxl'
Supported geotiff$layout values:
'standard': default GeoTIFF writer behavior.'stripped': strip-organized GeoTIFF.'striped': alias for'stripped'.'tiled': tiled GeoTIFF.'cog': Cloud-Optimized GeoTIFF.
Layout-specific parameter behavior:
layout = 'standard': ignores strip/tile size keys.layout = 'stripped'or'striped': usesrows_per_strip. Default is1if omitted.layout = 'tiled': usestile_widthandtile_height.tile_widthdefaults to512if omitted.tile_heightdefaults totile_width.tile_sizeis accepted as a shortcut for both dimensions.layout = 'cog': usestile_sizeorcog_tile_size. Default is512if neither is supplied.
Extensionless Raster Outputs
If you omit the output extension, raster writes default to .tif. In that
case the wrapper also defaults the layout to COG unless you explicitly specify
another layout.
# Writes my_surface.tif and defaults to COG-style layout.
r$write('my_surface', overwrite = TRUE)
Practical Patterns
# Explicit uncompressed standard GeoTIFF
r$write(
'out_standard_uncompressed.tif',
overwrite = TRUE,
options = list(
compress = FALSE,
geotiff = list(layout = 'standard')
)
)
# Stripped GeoTIFF with 64 rows per strip
r$write(
'out_stripped.tif',
overwrite = TRUE,
options = list(
geotiff = list(layout = 'stripped', rows_per_strip = 64)
)
)
# Tiled GeoTIFF using a single tile_size shortcut
r$write(
'out_tiled.tif',
overwrite = TRUE,
options = list(
geotiff = list(layout = 'tiled', tile_size = 256)
)
)
# COG with explicit codec and BigTIFF toggle
r$write(
'out_cog.tif',
overwrite = TRUE,
options = list(
strict_format_options = TRUE,
geotiff = list(
layout = 'cog',
tile_size = 512,
compression = 'deflate',
bigtiff = FALSE
)
)
)
Memory Lifecycle and Cleanup
For workflows that use memory-backed rasters, vectors, and lidar objects
(file_mode = "m"), explicit lifecycle management prevents unbounded memory
growth in long-running jobs.
When to use memory mode
Memory mode is most valuable when:
- Chaining multiple operations on the same data without disk I/O.
- Processing intermediate results that are never persisted to disk.
- Running batched analysis where you load, process, and clear per batch.
- Working with smaller datasets where memory is not a constraint.
Avoid memory mode when:
- Working with data larger than available RAM.
- Processing single operations on large files.
- Running unattended long-running jobs without explicit cleanup.
Explicit cleanup in long-running pipelines
library(whiteboxworkflows)
# Long-running batch analysis
for (tile_id in 1:1000) {
cat('Processing tile', tile_id, '\n')
# Read data into memory for this tile
r <- wbw_read_raster(sprintf('tile_%d.tif', tile_id), file_mode = "m")
v <- wbw_read_vector(sprintf('bounds_%d.gpkg', tile_id), file_mode = "m")
# Process
result <- wbw_clip_raster_by_polygon(input = r$file_path(), polygon = v$file_path(),
output = sprintf('clipped_%d.tif', tile_id))
# Explicit cleanup before next iteration
wbw_remove_raster_from_memory(r)
wbw_remove_vector_from_memory(v)
}
Monitoring memory usage
For production scripts, track memory explicitly:
library(whiteboxworkflows)
cat('Initial raster memory:', wbw_raster_memory_bytes() / 1e6, 'MB\n')
cat('Initial vector memory:', wbw_vector_memory_bytes() / 1e6, 'MB\n')
# ... run operations ...
# Before returning or starting new phase
cat('Final raster count:', wbw_raster_memory_count(), '\n')
cat('Final raster memory:', wbw_raster_memory_bytes() / 1e6, 'MB\n')
# Explicit reset if needed
wbw_clear_memory()
cat('After clear:', wbw_raster_memory_count(), '\n')
Lidar Output Controls
Lidar objects expose write(...) and deep_copy(...) with optional options.
The standalone wbw_write_lidar() function persists a wbw_lidar result
returned by a tool call directly to disk.
When a session method is called without an output argument the result is
stored in memory automatically; pass it to wbw_write_lidar() to persist it:
library(whiteboxworkflows)
wbe <- wbw_make_session()
survey <- wbw_read_lidar('survey.laz')
# Omit output — result is memory-backed automatically
filtered <- wbe$filter_lidar_classes(input = survey$path, excluded_classes = list(7L))
cat(filtered$path, '\n') # memory://lidar/...
# Persist when ready
wbw_write_lidar(filtered, 'survey_clean.copc.laz')
Write options for wbw_write_lidar() and l$write():
Use these options to tune archive size, cloud-read behavior, and downstream compatibility.
library(whiteboxworkflows)
l <- wbw_read_lidar('survey.las')
# LAZ controls
l$write(
'survey_out.laz',
overwrite = TRUE,
options = list(
laz = list(
chunk_size = 25000,
compression_level = 7
)
)
)
# COPC controls
l$write(
'survey_out.copc.laz',
overwrite = TRUE,
options = list(
copc = list(
max_points_per_node = 75000,
max_depth = 8,
node_point_ordering = 'hilbert'
)
)
)
Lidar Write Option Reference
For lidar writes, the options list supports these top-level keys:
laz: nested list with LAZ writer controls.copc: nested list with COPC hierarchy controls.
laz
Supported keys:
chunk_size: positive integer (> 0).compression_level: integer in the range0to9.
Default values when keys are omitted:
chunk_size:50000.compression_level:6.
copc
Supported keys:
max_points_per_node: positive integer (> 0).max_depth: positive integer (> 0).node_point_ordering: one of:'auto''morton''hilbert'
Default values when keys are omitted:
max_points_per_node:100000.max_depth:8.node_point_ordering:'auto'.
Notes:
- If no output extension is provided, lidar writes default to
.copc.laz. - COPC options are relevant when writing COPC output (
.copc.laz/.copc.las).
# Extensionless output defaults to COPC
l$write(
'survey_out',
overwrite = TRUE,
options = list(
copc = list(
max_points_per_node = 75000,
max_depth = 8,
node_point_ordering = 'hilbert'
)
)
)
# Explicit LAZ controls
l$write(
'survey_out.laz',
overwrite = TRUE,
options = list(
laz = list(
chunk_size = 25000,
compression_level = 7
)
)
)
Vector Output Controls
wbw_write_vector(...) persists a wbw_vector object to disk. When a session
method is called without an output argument, the result is automatically
stored in memory and can be written to disk with wbw_write_vector().
library(whiteboxworkflows)
wbe <- wbw_make_session()
roads <- wbw_read_vector('roads.gpkg')
# Omit output — result is memory-backed automatically
centroids <- wbe$centroid_vector(input = roads$path)
cat(centroids$path, '\n') # memory://vector/...
# Persist when ready
wbw_write_vector(centroids, 'roads_centroids.gpkg')
# Or supply output explicitly to write directly to disk
wbe$centroid_vector(input = roads$path, output = 'roads_centroids_direct.gpkg')
Vector Write Option Reference
For vector writes, the options list supports these keys:
strict_format_options: logical validation switch.geoparquet: nested list with GeoParquet writer controls.
strict_format_options
FALSE(default): format-specific controls are ignored when they do not apply to the selected output format.TRUE: format-specific controls on incompatible output formats raise an error.
geoparquet
Supported keys:
max_rows_per_group: positive integer.data_page_size_limit: positive integer.write_batch_size: positive integer.data_page_row_count_limit: positive integer.compression: character scalar.
Default values when keys are omitted:
max_rows_per_group:1_048_576.data_page_size_limit: Parquet library default page size.write_batch_size: Parquet library default write batch size.data_page_row_count_limit: Parquet library default row-count limit.compression: Parquet library default compression codec.
Supported geoparquet$compression values:
'none''snappy''gzip''lz4''zstd''brotli'
Notes:
- GeoParquet controls are only applied for
.parquetoutputs. - If no output extension is provided, vector writes default to
.gpkg.
wbw_write_vector(
buffered,
'roads.parquet',
options = list(
strict_format_options = TRUE,
geoparquet = list(
compression = 'zstd',
max_rows_per_group = 250000,
data_page_size_limit = 1048576,
write_batch_size = 8192,
data_page_row_count_limit = 20000
)
)
)
Reproducibility Checklist
- Pin output extension explicitly.
- Capture option list values in scripts.
- Verify metadata after write (
metadata()and CRS values). - Keep source files immutable; write derived outputs separately.
Supported Data Formats
This chapter summarizes practical format support exposed through WbW-R.
Choosing formats is a workflow architecture decision, not just a file-extension choice. You should evaluate read/write support together with ecosystem fit, compression behavior, and long-term maintainability of outputs shared across toolchains.
Backend support comes from core crates:
- Raster:
wbraster - Vector:
wbvector - Lidar:
wblidar
Raster Formats
Exhaustive raster format support in the current WbW-R build:
| Format | Read (wbw_read_raster) | Write (wbw_write_raster) | Common extensions / path rules |
|---|---|---|---|
| DTED | Yes | Yes | .dt0, .dt1, .dt2 (DTED 0, 1, 2 elevation data; WGS-84 geographic only) |
| Esri ASCII Grid | Yes | Yes | .asc (and .grd when detected as ASCII) |
| Esri Binary Grid workspace | Yes | Backend-only | Esri Binary workspace directory (hdr.adf + w001001.adf) or .adf |
| Esri Float Grid | Yes | Yes | .flt, .hdr (single-band float grid with header file) |
| ERDAS IMAGINE (HFA) | Yes | No | .img - read-only MVP; RLC (run-length) compression supported |
| GRASS ASCII Raster | Yes | Yes | .txt / .asc when GRASS header keys are detected |
| Surfer GRD | Yes | Yes | .grd (DSAA / DSRB signatures) |
| PCRaster | Yes | Yes | .map (CSF signature) |
| SAGA Binary Grid | Yes | Yes | .sdat, .sgrd |
| Idrisi / TerrSet Raster | Yes | Yes | .rst, .rdc |
| ER Mapper | Yes | Yes | .ers |
| ENVI HDR-labelled raster | Yes | Yes | .hdr, or data files (.img, .dat, .bin, .raw, .bil, .bsq, .bip) with .hdr sidecar |
| GeoTIFF / BigTIFF / COG | Yes | Yes | .tif, .tiff |
| GeoPackage raster | Yes | Yes | .gpkg |
| JPEG2000 / GeoJP2 | Yes | Yes | .jp2 |
| JPEG + World File | Yes | Yes | .jpg, .jpeg with .jgw, .jpgw, .jpegw, or .wld world file |
| PNG + World File | Yes | Yes | .png with .pgw, .pngw, or .wld world file |
| Zarr | Yes | Yes | .zarr store (directory / suffix) |
| XYZ ASCII Grid | Yes | Yes | .xyz (whitespace or comma-delimited X Y Z points) |
Typical pattern:
library(whiteboxworkflows)
r <- wbw_read_raster('dem.tif')
wbw_write_raster(r, 'dem_out.tif')
Vector Formats
Exhaustive vector format support in the current WbW-R build:
| Format | Read (wbw_read_vector) | Write (wbw_write_vector) | Extensions / notes |
|---|---|---|---|
| FlatGeobuf | Yes | Yes | .fgb |
| GeoJSON | Yes | Yes | .geojson, .json |
| TopoJSON | Yes | Yes | .topojson |
| GeoPackage | Yes | Yes | .gpkg |
| GeoParquet | Yes | Yes | .parquet |
| GML | Yes | Yes | .gml |
| GPX | Yes | Yes | .gpx |
| KML | Yes | Yes | .kml |
| KMZ | Yes | Yes | .kmz |
| MapInfo Interchange | Yes | Yes | .mif with .mid sidecar |
| OSM PBF | Yes | No | .osm.pbf (read workflows only) |
| ESRI Shapefile | Yes | Yes | .shp plus dataset sidecars |
When wbw_write_vector(...) is called without an extension, WbW-R defaults output to .gpkg.
Typical pattern:
library(whiteboxworkflows)
v <- wbw_read_vector('roads.gpkg')
wbw_write_vector(v, 'roads_out.gpkg')
Lidar Formats
Exhaustive lidar format support in the current WbW-R build:
| Format | Read (wbw_read_lidar) | Write (wbw_write_lidar) | Extensions / notes |
|---|---|---|---|
| LAS | Yes | Yes | .las |
| LAZ | Yes | Yes | .laz |
| COPC | Yes | Yes | .copc.las, .copc.laz |
| PLY | Yes | Yes | .ply |
| E57 | Yes | Yes | .e57 |
When wbw_write_lidar(...) is called without an extension, WbW-R defaults output to .copc.laz.
Typical pattern:
library(whiteboxworkflows)
l <- wbw_read_lidar('survey.las')
wbw_write_lidar(l, 'survey_out.copc.laz')
Sensor Bundle Families
Supported bundle readers include:
wbw_read_bundle(...)(auto-detect)wbw_read_landsat(...)wbw_read_sentinel1(...)wbw_read_sentinel2(...)wbw_read_planetscope(...)wbw_read_iceye(...)wbw_read_dimap(...)wbw_read_maxar_worldview(...)wbw_read_radarsat2(...)wbw_read_rcm(...)
Bundle inputs may be either extracted directories or supported archives:
.zip.tar.tar.gz.tgz
See Working with Sensor Bundles for family-specific examples.
Validation Guidance
- Prefer stable interchange formats (
.tif,.gpkg,.copc.laz) for production pipelines. - Re-open outputs and verify metadata after write operations.
- Use explicit options where format behavior must be reproducible.
Reprojection and CRS
This chapter covers reprojection and CRS validation workflows in WbW-R.
CRS operations are correctness-critical. A workflow can run successfully while still producing invalid spatial interpretation if source and destination reference systems are misunderstood. The patterns here prioritize explicit CRS inspection, controlled reprojection calls, and immediate validation of outputs.
Inspect CRS
Run this check before any reprojection to catch missing or incorrect source CRS assumptions.
library(whiteboxworkflows)
r <- wbw_read_raster('dem.tif')
v <- wbw_read_vector('roads.gpkg')
l <- wbw_read_lidar('survey.las')
print(r$crs_epsg())
print(v$crs_epsg())
print(l$crs_epsg())
Assigning Projection Metadata
Projection assignment and reprojection are different operations:
- Assignment updates CRS metadata only.
- Reprojection changes coordinate values.
In current WbW-R, direct object-level CRS assignment helpers are not exposed in the same way as WbW-Py object methods. The practical assignment pattern is to use R spatial libraries for metadata repair, then re-open data in WbW-R.
library(whiteboxworkflows)
library(terra)
library(sf)
# Raster CRS assignment (metadata only)
r <- rast('dem_without_crs.tif')
crs(r) <- 'EPSG:26917'
writeRaster(r, 'dem_with_crs.tif', overwrite = TRUE)
# Vector CRS assignment (metadata only)
v <- st_read('roads_without_crs.gpkg', quiet = TRUE)
v <- st_set_crs(v, 26917)
st_write(v, 'roads_with_crs.gpkg', delete_dsn = TRUE, quiet = TRUE)
# Re-open in WbW-R for downstream analysis
r_wbw <- wbw_read_raster('dem_with_crs.tif')
v_wbw <- wbw_read_vector('roads_with_crs.gpkg')
print(r_wbw$crs_epsg())
print(v_wbw$crs_epsg())
If you need coordinate changes, use reprojection workflows in the next sections rather than metadata assignment.
Raster Reprojection Pattern
The core raster API provides six reprojection method patterns (documented here so behavior is explicit across language bindings):
- Full-options reprojection (
reproject) - Nearest convenience (
reproject_nearest) - Bilinear convenience (
reproject_bilinear) - Reproject to match another raster grid (
reproject_to_match_grid) - Reproject to match another raster resolution (
reproject_to_match_resolution) - Reproject to target EPSG while matching a reference resolution
(
reproject_to_match_resolution_in_epsg)
In WbW-R, practical workflows typically call reprojection tools through
wbw_<tool>(...) wrappers (or wbw_run_tool(...) for dynamic tool-id workflows)
and then reopen outputs as typed objects.
Available resampling methods (wbraster)
Use these method strings in reprojection workflows:
nearestbilinearcubiclanczosaverageminmaxmodemedianstddev
Method guidance:
- Categorical/class rasters:
nearest(ormodewhere majority behavior is desired). - Continuous rasters:
bilinear,cubic, orlanczos. - Statistical downscaling/generalization:
average,min,max,median,stddev.
Example: explicit raster reprojection
library(whiteboxworkflows)
s <- wbw_session()
wbw_reproject_raster(input = 'dem.tif',
output = 'dem_utm.tif',
epsg = 32618,
method = 'bilinear')
dem_utm <- wbw_read_raster('dem_utm.tif')
print(dem_utm$crs_epsg())
Example: match-grid categorical reprojection
library(whiteboxworkflows)
s <- wbw_session()
wbw_reproject_raster(input = 'landcover_4326.tif',
output = 'landcover_utm_aligned.tif',
epsg = 32618,
method = 'nearest')
Automatic reprojection in raster-stack tools
Stack-based tools now support automatic alignment controls:
auto_reproject(defaulttrue)auto_reproject_method(optional override)
Behavior for raster stacks:
inputs[0]/input_rasters[0]is the reference raster.- CRS-mismatched stack members are auto-reprojected to the reference grid when
auto_reproject=true. - If
auto_reproject_methodis not set:- categorical rasters infer
nearest - continuous rasters infer
bilinear
- categorical rasters infer
- Non-overlapping extents are treated as hard validation errors.
This matters most for tools that combine raster stacks (overlay, weighted sum, PCA, inverse PCA, raster calculator, segmentation).
library(whiteboxworkflows)
s <- wbw_session()
wbw_weighted_sum(input_rasters = c('slope_utm.tif', 'landcover_4326.tif', 'distance_utm.tif'),
weights = c(0.4, 0.35, 0.25),
auto_reproject = TRUE,
auto_reproject_method = '',
output = 'weighted_sum.tif')
Vector Reprojection Pattern
Use this when downstream geometry processing depends on a specific projected CRS.
library(whiteboxworkflows)
s <- wbw_session()
wbw_reproject_vector(input = 'roads.gpkg', output = 'roads_utm.gpkg', epsg = 32618)
roads_utm <- wbw_read_vector('roads_utm.gpkg')
print(roads_utm$crs_epsg())
Lidar Reprojection Pattern
Use this when point-cloud alignment and metric operations require a target CRS.
library(whiteboxworkflows)
s <- wbw_session()
wbw_reproject_lidar(input = 'survey.las', output = 'survey_utm.laz', epsg = 32618)
survey_utm <- wbw_read_lidar('survey_utm.laz')
print(survey_utm$crs_epsg())
Georeference Raster from Control Points
Use this when a raster/image lacks reliable georeferencing and you have control points mapping pixel coordinates to map coordinates.
Required CSV fields:
source_colsource_rowtarget_xtarget_y
library(whiteboxworkflows)
s <- wbw_session()
wbw_georeference_raster_from_control_points(input = 'historical_scan.tif',
control_points = 'historical_scan_gcps.csv',
epsg = 32618,
resample = 'bilinear',
output = 'historical_scan_georef.tif',
report = 'historical_scan_georef_report.json')
Projection Utility Functions
These functions provide CRS diagnostics and point-level coordinate transforms outside of full dataset reprojection workflows.
WKT and EPSG identification
library(whiteboxworkflows)
# Convert EPSG to OGC WKT.
wkt <- wbw_projection_to_ogc_wkt(32618)
cat(wkt, '\n')
# Identify EPSG from WKT or CRS text.
epsg <- wbw_projection_identify_epsg(wkt)
print(epsg) # 32618
# Reproject a batch of points.
pts <- data.frame(x = -79.3832, y = 43.6532)
result <- wbw_projection_reproject_points(pts, src_epsg = 4326L, dst_epsg = 32618L)
print(result)
Parse a PROJ string
Use wbw_projection_from_proj_string when you have a PROJ4-style string from a
legacy file header or third-party metadata source and need the corresponding
EPSG code or OGC WKT.
The function returns a named list with exactly one element:
list(epsg = integer)— EPSG code identifiedlist(wkt = character)— no EPSG match, WKT representation availablelist(unknown = TRUE)— PROJ string parsed but CRS could not be resolved
library(whiteboxworkflows)
proj_str <- '+proj=utm +zone=17 +datum=NAD83 +units=m +no_defs'
result <- wbw_projection_from_proj_string(proj_str)
if (!is.null(result$epsg)) {
cat('Identified EPSG:', result$epsg, '\n') # e.g. 26917
} else if (!is.null(result$wkt)) {
cat('WKT:', result$wkt, '\n')
} else {
cat('CRS unknown\n')
}
This is the recommended approach for legacy rasters or vectors whose metadata
carries only a PROJ4 string. WbW-R uses this path internally in
wbw_projection_identify_epsg as the third fallback step.
Area-of-use bounding box
Use wbw_projection_area_of_use to retrieve the geographic domain of valid use
for an EPSG code. This is useful for validating that data falls within the CRS
before or after reprojection.
Returns a named list list(lon_min, lat_min, lon_max, lat_max), or NULL if no
bounding box is registered for the code.
library(whiteboxworkflows)
bbox <- wbw_projection_area_of_use(32618) # UTM Zone 18N
if (!is.null(bbox)) {
cat(sprintf('valid lon: %.1f to %.1f\n', bbox$lon_min, bbox$lon_max))
cat(sprintf('valid lat: %.1f to %.1f\n', bbox$lat_min, bbox$lat_max))
}
# Returns NULL for unregistered codes.
print(wbw_projection_area_of_use(9999)) # NULL
Best Practices
- Confirm source CRS before transformation.
- Use interpolation appropriate to data type for raster reprojection.
- Re-open outputs and verify CRS metadata.
- Keep transform arguments explicit in reproducible scripts.
Topology Utilities
WbW-R currently emphasizes topology through tool workflows (category-driven) rather
than a dedicated R-native topology utility namespace equivalent to Python.
At the same time, session-level topology helpers provide a fast path for geometry QA checks and validation gates before heavier vector operations.
Topology Workflow Pattern
A robust topology-first script pattern is:
- Validate CRS compatibility first.
- Run fast predicate checks to detect obvious incompatibilities.
- Repair invalid polygon geometries if needed.
- Re-run critical predicates after repair.
- Continue into heavier vector processing only after checks pass.
This keeps failures early, localized, and easier to diagnose.
Session-Level Topology Predicate Checks
Use these methods when you need script-level yes/no checks on WKT geometries.
library(whiteboxworkflows)
s <- wbw_session()
poly <- 'POLYGON((0 0,10 0,10 10,0 10,0 0))'
pt <- 'POINT(5 5)'
print(s$topology_contains_wkt(poly, pt))
print(s$topology_intersects_wkt(poly, pt))
print(s$topology_within_wkt(pt, poly))
print(s$topology_touches_wkt(poly, pt))
print(s$topology_disjoint_wkt(poly, pt))
Topology via Tool Workflows
Use session execution and topology/geometry tool IDs through category discovery.
library(whiteboxworkflows)
s <- wbw_session()
# Discover likely topology-related tools
hits <- wbw_search_tools('topology')
print(hits)
# Inspect a specific candidate tool
if (length(hits) > 0) {
first_hit <- hits[[1]]
tool_id <- if (is.list(first_hit) && !is.null(first_hit$tool_id)) first_hit$tool_id else as.character(first_hit)
print(wbw_describe_tool(tool_id))
}
Use this pattern when topology checks are part of larger, file-based vector pipelines that should remain in Whitebox tooling end to end.
Vector Topology Pattern
library(whiteboxworkflows)
s <- wbw_session()
# Example pattern: run a topology-oriented vector tool once selected.
# Replace 'tool_id_here' and args with the concrete tool from discovery.
# wbw_run_tool(
# 'tool_id_here',
# args = list(input = 'input.gpkg', output = 'output.gpkg'),
# session = s
# )
Use this template after selecting a concrete tool ID through discovery.
Geometry Validation and Repair
Use session helpers for pure WKT checks and repairs.
library(whiteboxworkflows)
s <- wbw_session()
invalid <- 'POLYGON((0 0,4 4,4 0,0 4,0 0))'
print(s$topology_is_valid_polygon_wkt(invalid))
fixed <- s$topology_make_valid_polygon_wkt(invalid)
print(fixed)
buf <- s$topology_buffer_wkt('LINESTRING(0 0, 10 0)', 1.5)
print(buf)
For broader in-memory feature editing, an sf interop path is still practical:
library(whiteboxworkflows)
library(sf)
v <- wbw_read_vector('polygons.gpkg')
g <- v$to_sf()
valid_flags <- st_is_valid(g)
print(table(valid_flags))
g_fixed <- st_make_valid(g)
st_write(g_fixed, 'polygons_valid.gpkg', delete_dsn = TRUE, quiet = TRUE)
Relationship and Distance
Use relation/distance methods when binary predicates are not expressive enough.
library(whiteboxworkflows)
s <- wbw_session()
a <- 'LINESTRING(0 0, 10 0)'
b <- 'LINESTRING(0 1, 10 1)'
print(s$topology_distance_wkt(a, b))
print(s$topology_relate_wkt(a, b))
Topology Method Reference
| Session Method | Description |
|---|---|
topology_intersects_wkt | Return TRUE when two geometries share any space. |
topology_contains_wkt | Return TRUE when geometry A strictly contains geometry B. |
topology_within_wkt | Return TRUE when geometry A is strictly within geometry B. |
topology_touches_wkt | Return TRUE when geometries meet at boundaries only. |
topology_disjoint_wkt | Return TRUE when geometries do not intersect. |
topology_crosses_wkt | Return TRUE when geometries cross with dimensional reduction behavior. |
topology_overlaps_wkt | Return TRUE when same-dimension geometries partially overlap. |
topology_covers_wkt | Boundary-aware containment test (A covers B). |
topology_covered_by_wkt | Boundary-aware containment test (A covered by B). |
topology_relate_wkt | Return DE-9IM relationship text for exact topology-rule evaluation. |
topology_distance_wkt | Return shortest distance between geometries. |
topology_vector_feature_relation | Evaluate relation between indexed features in vector objects. |
topology_is_valid_polygon_wkt | Check polygon validity before topology-sensitive workflows. |
topology_make_valid_polygon_wkt | Repair an invalid polygon WKT representation. |
topology_buffer_wkt | Create a buffered geometry from WKT and distance. |
Guidance
- Use
wbw_search_tools(...)+wbw_describe_tool(...)to select backend topology tools. - Use session topology helpers for fast script-level predicates and repairs.
- Use
sffor broader in-memory feature editing when needed. - Re-open outputs with
wbw_read_vector(...)and verify schema/CRS after topology operations.
Interoperability
This chapter provides practical roundtrip patterns between WbW-R and R spatial tooling.
Interoperability should be treated as deliberate boundary crossing. Each conversion step can affect metadata fidelity, CRS representation, numeric precision, and schema details. The examples here emphasize explicit handoff points and validation checkpoints so multi-package workflows remain defensible.
Copy-Boundary Model
- Array exchange via
to_array()andwbw_array_to_raster(...)is an explicit in-memory boundary. - terra/stars/sf workflows are typically file or object conversion boundaries.
- Always validate metadata after roundtrip.
terra Roundtrip
Use this when your raster workflow depends on terra-native functions.
library(whiteboxworkflows)
library(terra)
r <- wbw_read_raster('dem.tif')
terra_r <- r$to_terra()
# terra-side operation
terra_r2 <- terra::focal(terra_r, w = 3, fun = mean, na.rm = TRUE)
terra::writeRaster(terra_r2, 'dem_terra_smoothed.tif', overwrite = TRUE)
r_back <- wbw_read_raster('dem_terra_smoothed.tif')
print(r_back$metadata())
stars Roundtrip
Use this for labeled-array operations and grid math in stars workflows.
library(whiteboxworkflows)
library(stars)
r <- wbw_read_raster('dem.tif')
st <- r$to_stars()
# stars-side operation
st2 <- st * 1.05
st_as_stars <- st_as_stars(st2)
write_stars(st_as_stars, 'dem_stars_scaled.tif', driver = 'GTiff')
r_back <- wbw_read_raster('dem_stars_scaled.tif')
print(r_back$metadata())
sf Roundtrip (Vector)
Use this when vector editing and validity checks are more convenient in sf.
library(whiteboxworkflows)
library(sf)
v <- wbw_read_vector('roads.gpkg')
sf_obj <- v$to_sf()
# sf-side edit
sf_obj$len_m <- as.numeric(st_length(sf_obj))
sf_obj <- sf_obj[sf_obj$len_m > 20, ]
st_write(sf_obj, 'roads_sf_filtered.gpkg', delete_dsn = TRUE, quiet = TRUE)
v_back <- wbw_read_vector('roads_sf_filtered.gpkg')
print(v_back$schema())
Validation Checklist
- Re-check CRS and bounds after conversion.
- Re-check schema and representative attributes for vector flows.
- Prefer stable interchange formats (
.tif,.gpkg) for routine roundtrips.
Licensing
This chapter documents open, signed entitlement, and floating license startup patterns in WbW-R.
Licensing mode is an operational dependency that should be explicit in scripts. The objective is to make startup behavior predictable and auditable: define which mode is expected, how fallback is handled, and what diagnostics are captured when activation fails.
Open Mode
Use this mode for OSS-only workflows or environments without entitlement needs.
library(whiteboxworkflows)
s <- wbw_session()
print(s)
Signed Entitlement Mode
Use this for managed deployments with entitlement payload verification.
library(whiteboxworkflows)
signed_entitlement_json <- '...'
s <- wbw_session(
signed_entitlement_json = signed_entitlement_json,
public_key_kid = 'k1',
public_key_b64url = 'REPLACE_WITH_PROVIDER_KEY',
include_pro = TRUE,
tier = 'open'
)
print(s)
Floating License Mode
Use this for centrally managed license allocation across users or machines.
library(whiteboxworkflows)
s <- wbw_session(
floating_license_id = 'fl_12345',
include_pro = TRUE,
tier = 'open',
provider_url = 'https://license.example.com',
machine_id = 'machine-01',
customer_id = 'customer-abc'
)
print(s)
Failure Handling Guidance
- Validate session creation at script startup and fail early.
- Capture and log runtime startup errors for entitlement/floating modes.
- Use open mode fallback only when policy allows.
Security and Operations Notes
- Keep entitlement payloads and keys out of source control.
- Prefer environment-variable or secret-store injection.
- Record runtime version and startup mode for reproducibility.
API Reference Strategy
Manual chapters provide narrative and recipes.
This separation between narrative guidance and parameter-complete references is intentional. The manual answers workflow and design questions, while shared tool docs provide exhaustive argument contracts. Keeping these concerns separate makes the manual easier to read without reducing technical precision.
Detailed tool-level parameter references remain in shared tool docs.
Reference Boundaries
- This manual focuses on concepts, workflow patterns, and runnable examples.
- Tool parameter completeness remains in shared tool documentation.
- Facade-level usage details are emphasized in object workflow chapters.
Navigation
- Discovery and execution patterns: Session and Discovery
- Data-object workflows: Working with Rasters, Working with Vectors, Working with Lidar
- Concrete runnable scripts: Script Index
Terrain Analysis and Geomorphometry
Digital terrain analysis in WbW-R encompasses the full range of operations from surface derivatives (slope, aspect, curvature) through multi-scale geomorphometric indices and geomorphological classification. All computation is performed by the Whitebox backend; this chapter shows how to wire those tools into reproducible R workflows.
Core Concepts
Terrain analysis relies on these fundamental concepts:
- Slope: The gradient of the land surface, typically expressed in degrees (0–90) or percent. High values indicate steep terrain; low values indicate flat terrain. Critical for erosion, landslide, and hydrology models.
- Aspect: Compass direction a slope faces (0–360°, measured clockwise from north). Flat cells are typically -1 or nodata. Controls solar radiation, drainage direction, and habitat quality.
- Curvature: Rate of change of slope (usually planform and profile separately). Positive profile curvature indicates acceleration zones (ridges); negative indicates deceleration zones (valleys).
- Flow direction: The steepest downslope direction from each cell. Forms the basis for flow accumulation, drainage network delineation, and watershed boundaries.
- Flow accumulation: Number of upslope cells draining to each cell, proportional to contributing area. High values indicate stream channels and valley bottoms.
- Topographic Wetness Index (TWI): Ln(upslope area / tan(slope)), predicts persistent moisture. High TWI indicates probable saturated zones; used in soil moisture and flood risk mapping.
- Landform classification: Categorizing terrain into types (e.g. summits, ridges, valleys, footslopes) via multivariate analysis. Provides interpretable terrain structure without tuning thresholds.
- Multiscale analysis: Deriving terrain metrics at multiple scales. Single-scale derivatives can miss important structure; multiscale approaches reveal process-relevant scales.
- Viewshed: Set of cells visible from a vantage point. Used in landscape perception, military analysis, and wind farm siting.
Session Setup
library(whiteboxworkflows)
s <- wbw_session()
# Set working directory for relative file paths
setwd('/data/terrain')
Reading a DEM
dem <- wbw_read_raster('dem.tif')
meta <- dem$metadata()
cat('Rows:', meta$rows, ' Cols:', meta$columns, '\n')
cat('Cell size:', meta$resolution_x, 'm\n')
cat('NoData:', meta$nodata, '\n')
Surface Derivatives
Slope
# Slope in degrees
slope_deg <- wbw_slope(dem = dem$file_path(),
output = 'slope_deg.tif',
units = 'degrees',
z_factor = 1.0)
slope <- wbw_read_raster('slope_deg.tif')
Aspect
wbw_aspect(dem = dem$file_path(),
output = 'aspect.tif',
zero_aspect = FALSE)
Hillshade
wbw_hillshade(dem = dem$file_path(),
output = 'hillshade.tif',
azimuth = 315.0,
altitude = 30.0)
Profile Curvature, Plan Curvature, Tangential Curvature
wbw_profile_curvature(dem = dem$file_path(), output = 'profc.tif')
wbw_plan_curvature(dem = dem$file_path(), output = 'planc.tif')
wbw_tangential_curvature(dem = dem$file_path(), output = 'tangc.tif')
Mean Curvature and Other Modes
wbw_mean_curvature(dem = dem$file_path(), output = 'meanc.tif')
wbw_minimal_curvature(dem = dem$file_path(), output = 'minc.tif')
wbw_maximal_curvature(dem = dem$file_path(), output = 'maxc.tif')
wbw_gaussian_curvature(dem = dem$file_path(), output = 'gaussc.tif')
Topographic Wetness and Flow Accumulation
wbw_wetness_index(sca = 'sca.tif',
slope = 'slope_deg.tif',
output = 'twi.tif')
Computing the specific contributing area (SCA) first:
wbw_d_inf_flow_accum(dem = dem$file_path(),
output = 'sca.tif',
out_type = 'sca',
threshold = 0.0,
log = FALSE,
clip = FALSE)
Relief and Position Indices
Relative Topographic Position
wbw_relative_topographic_position(dem = dem$file_path(),
output = 'rtp.tif',
filterx = 101,
filtery = 101)
TPI, Deviation from Mean, Scale Standardised Elevation
wbw_topographic_position_index(dem = dem$file_path(),
output = 'tpi.tif',
minrad = 1.0,
maxrad = 25.0,
steps = 10,
num_sig_digits = 3)
wbw_deviation_from_mean_elevation(dem = dem$file_path(),
output = 'dev_mean.tif',
filterx = 11,
filtery = 11)
wbw_elev_above_pit(dem = dem$file_path(),
output = 'elev_above_pit.tif')
Multi-Scale Roughness and Complexity
wbw_multiscale_roughness(dem = dem$file_path(),
out_mag = 'ms_rough_mag.tif',
out_scale = 'ms_rough_scale.tif',
min_scale = 1,
max_scale = 100,
step = 1)
wbw_vector_ruggedness_measure(dem = dem$file_path(),
output = 'vrm.tif',
filterx = 11,
filtery = 11)
wbw_ruggedness_index(dem = dem$file_path(),
output = 'tri.tif')
Terrain Smoothing
DEM-derived curvature, landform classes, and terrain-position indices can be overly sensitive to short-range roughness. Whitebox Next Gen now includes a multiscale feature-preserving smoother that is better aligned with modern terrain-analysis workflows than relying only on the older single-scale tool.
wbw_feature_preserving_smoothing_multiscale(input = dem$file_path(),
output = 'dem_smooth_multiscale.tif',
smoothing_amount = 0.65,
edge_preservation = 0.80,
scale_levels = 4,
fidelity = 0.45,
z_factor = 1.0)
dem_smooth <- wbw_read_raster('dem_smooth_multiscale.tif')
The key idea is coarse-to-fine smoothing: larger-scale terrain form is stabilized first, then finer detail is reintroduced while preserving major breaks-in-slope. This is especially useful before curvature, geomorphon, and terrain-position workflows.
Geomorphons — Landform Classification
wbw_geomorphons(dem = dem$file_path(),
output = 'geomorphons.tif',
search = 50,
threshold = 1.0,
fdist = 0,
skip = 0,
forms = TRUE,
residuals = FALSE)
Geomorphon class codes: 1=flat, 2=peak, 3=ridge, 4=shoulder, 5=spur, 6=slope, 7=hollow, 8=footslope, 9=valley, 10=pit.
Multi-Scale Topographic Position
wbw_multiscale_topographic_position_image(local = 'tpi_local.tif',
meso = 'tpi_meso.tif',
broad = 'tpi_broad.tif',
output = 'mstpi_rgb.tif',
hillshade = 'hillshade.tif')
Solar and Horizon Analysis
wbw_time_in_daylight(dem = dem$file_path(),
output = 'daylight_hrs.tif',
lat = 43.5,
long = -80.5,
az_fraction = 10.0,
max_dist = 100.0,
utc_offset = '-5:00',
start_day = 172,
end_day = 172,
start_time = '06:00:00',
end_time = '20:00:00')
WbW-Pro Spotlight: Terrain Constraint and Conflict Analysis
- Problem: Screen terrain constraints early for siting and corridor decisions.
- Tool:
terrain_constraint_and_conflict_analysis - Typical inputs: DEM, optional wetness, optional flood-risk surface, optional land-cover penalty, slope threshold.
- Typical outputs: Terrain-conflict score raster, conflict classes, and summary outputs.
result <- s$terrain_constraint_and_conflict_analysis(
dem = 'dem.tif',
wetness = 'wetness_index_norm.tif',
flood_risk = 'flood_risk_norm.tif',
landcover_penalty = 'landcover_penalty_norm.tif',
slope_limit_deg = 15.0,
output_prefix = 'terrain_conflict_corridor_a'
)
print(result)
Note: This workflow requires a session initialized with a valid Pro licence.
Complete Terrain Analysis Workflow
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/terrain_workflow')
dem <- wbw_read_raster('dem.tif')
# Smooth the DEM before derivative-heavy work.
wbw_feature_preserving_smoothing_multiscale(input = dem$file_path(),
output = 'dem_smooth_multiscale.tif',
smoothing_amount = 0.65,
edge_preservation = 0.80,
scale_levels = 4,
fidelity = 0.45)
dem_smooth <- wbw_read_raster('dem_smooth_multiscale.tif')
# Derivatives
for (tool in c('slope', 'aspect')) {
wbw_run_tool(tool, args = list(dem = dem_smooth$file_path(),
output = paste0(tool, '.tif')), session = s)
}
wbw_profile_curvature(dem = dem_smooth$file_path(), output = 'profc.tif')
wbw_plan_curvature(dem = dem_smooth$file_path(), output = 'planc.tif')
# Hillshade for visualisation
wbw_hillshade(dem = dem_smooth$file_path(), output = 'hillshade.tif',
azimuth = 315.0, altitude = 30.0)
# TWI
wbw_d_inf_flow_accum(dem = dem_smooth$file_path(), output = 'sca.tif',
out_type = 'sca')
wbw_wetness_index(sca = 'sca.tif', slope = 'slope.tif',
output = 'twi.tif')
# Roughness and classification
wbw_vector_ruggedness_measure(dem = dem_smooth$file_path(), output = 'vrm.tif', filterx = 11, filtery = 11)
wbw_geomorphons(dem = dem_smooth$file_path(), output = 'geomorphons.tif', search = 50, threshold = 1.0,
fdist = 0, skip = 0, forms = TRUE, residuals = FALSE)
cat('Terrain analysis complete.\n')
Tips
- Pre-process your DEM: Remove spikes, pits, and fill sinks before computing derivatives. Use
breach_depressions_least_cost()(preserves real depressions) orfill_depressions()(infills) depending on your application. - Resolution matters: Coarse DEMs (e.g. 30 m) are smoothed and may miss local process features. Fine DEMs (e.g. 1 m LiDAR) can be noisy. Choose resolution to match your process scale.
- Flow direction algorithms: D8 (8-directional) is faster but can cause artificial flow alignments. D-infinity and Dinf-Rho distribute flow more naturally and are preferred for continuous analyses.
- Curvature is scale-dependent: Always compute curvature at the scale matching your DEM resolution. A 10 m window on a 1 m DEM can overinterpret noise; a 1 m window on a 30 m DEM misses important structure.
- Multiscale position classification: Run classification at 3–5 scales (e.g. local, neighbourhood, regional) and examine layer coherence. Inconsistent multi-scale patterns suggest model overfitting.
- Viewshed validation: Viewshed results are sensitive to DEM quality and observer height assumptions. Always validate against ground observation or high-res ortho imagery.
- Hydrological thresholds are empirical: Contributing area thresholds for stream initiation vary by geology and climate (typically 0.5–5 km²). Calibrate against observed stream networks.
- Openness and exposure: Sky-view factor (SVF) and terrain openness support better hillshading and visibility assessment than raw slope or aspect. Use for visual interpretation and photogrammetry.
Spatial Hydrology
Hydrological analysis in WbW-R covers the full DEM-to-drainage workflow: depression removal, flow routing, stream extraction, watershed delineation, and hydrological indices. All heavy computation runs in the Whitebox backend via wbw_<tool>(...) wrappers (or wbw_run_tool(...) for dynamic tool-id workflows); R handles orchestration, conditional logic, and result inspection.
Session Setup
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/hydrology')
dem <- wbw_read_raster('dem.tif')
Depression Removal
Raw DEMs often contain spurious pits that block modelled drainage. Choose the method appropriate to your landscape and accuracy requirements.
Breach Depressions — Least-Cost
The preferred pre-treatment for most applications. Carves the minimum-width, minimum-depth channel through each depression barrier:
wbw_breach_depressions_least_cost(dem = dem$file_path(),
output = 'dem_breached.tif',
dist = 500, # maximum breach distance (cells)
max_cost = -1.0, # -1 = no cost limit
min_dist = TRUE,
flat_increment = 0.0001,
fill_deps = FALSE)
Fill Depressions — Wang & Liu
Use fill after breach, or alone on smooth low-relief landscapes:
wbw_fill_depressions_wang_and_liu(dem = dem$file_path(),
output = 'dem_filled.tif',
fix_flats = TRUE,
flat_increment = 0.001,
max_depth = -1.0)
Burn-in Stream Network
If you have a mapped stream network, burn it into the DEM before filling to improve drainage coherence:
streams <- wbw_read_vector('streams.shp')
wbw_burn_streams_at_roads(dem = dem$file_path(),
streams = streams$file_path(),
roads = 'roads.shp',
output = 'dem_burn.tif',
width = 6.0)
Flow Routing
D8 Flow Pointer and Accumulation
wbw_d8_pointer(dem = 'dem_breached.tif',
output = 'd8_ptr.tif',
esri_pntr = FALSE)
wbw_d8_flow_accum(i = 'd8_ptr.tif',
output = 'd8_acc.tif',
out_type = 'cells',
log = FALSE,
clip = FALSE,
pntr = TRUE)
D-Infinity Routing
Distributes flow across two cells proportionally:
wbw_d_inf_pointer(dem = 'dem_breached.tif',
output = 'dinf_ptr.tif')
wbw_d_inf_flow_accum(dem = 'dem_breached.tif',
output = 'dinf_sca.tif',
out_type = 'sca',
log = FALSE)
FD8 Multi-Flow
wbw_fd8_flow_accum(dem = 'dem_breached.tif',
output = 'fd8_acc.tif',
out_type = 'cells',
exponent = 1.1,
threshold = 0.0,
log = FALSE,
clip = FALSE)
Stream Extraction
wbw_extract_streams(flow_accum = 'd8_acc.tif',
output = 'streams.tif',
threshold = 5000,
zero_background = TRUE)
Extract Valley Bottom by Region Growing
wbw_extract_valley_bottoms(dem = 'dem_breached.tif',
output = 'valley_bottoms.tif',
threshold = 0.5,
line_thin = TRUE)
Watershed Delineation
Snap Pour Points
outlets <- wbw_read_vector('gauges.shp')
wbw_snap_pour_points(pour_pts = outlets$file_path(),
flow_accum = 'd8_acc.tif',
output = 'gauges_snapped.shp',
snap_dist = 200.0)
Single and Multi-Watershed Delineation
wbw_watershed(d8_pntr = 'd8_ptr.tif',
pour_pts = 'gauges_snapped.shp',
output = 'watersheds.tif',
esri_pntr = FALSE)
Unnest Basins
wbw_unnest_basins(d8_pntr = 'd8_ptr.tif',
pour_pts = 'gauges_snapped.shp',
output = 'unnested_basins.tif',
esri_pntr = FALSE)
Flow Path Analysis
# Downslope flowpath length
wbw_downslope_flowpath_length(d8_pntr = 'd8_ptr.tif',
output = 'ds_flowpath_len.tif',
watersheds = '',
weights = '',
esri_pntr = FALSE)
# Distance to stream outlet
wbw_distance_to_outlet(d8_pntr = 'd8_ptr.tif',
streams = 'streams.tif',
output = 'dist_to_outlet.tif',
esri_pntr = FALSE,
zero_background = TRUE)
# HAND — Height Above Nearest Drainage
wbw_elevation_above_stream(dem = 'dem_breached.tif',
streams = 'streams.tif',
output = 'hand.tif')
Hydrological Indices
Topographic Wetness Index
wbw_wetness_index(sca = 'dinf_sca.tif',
slope = 'slope.tif',
output = 'twi.tif')
Sediment Transport Index
wbw_sediment_transport_index(sca = 'dinf_sca.tif',
slope = 'slope.tif',
output = 'sti.tif',
sca_exponent = 0.4,
slope_exponent = 1.3)
Stream Power Index
wbw_stream_power_index(sca = 'dinf_sca.tif',
slope = 'slope.tif',
output = 'spi.tif',
exponent = 1.0)
Isobasins — Equal-Area Basin Partitioning
wbw_isobasins(dem = 'dem_breached.tif',
output = 'isobasins.tif',
size = 5000,
connections = TRUE,
csv_file = 'isobasin_connectivity.csv')
Stochastic Depression Analysis
wbw_stochastic_depression_analysis(dem = dem$file_path(),
output = 'prob_flooded.tif',
rmse = 0.18,
range = 20.0,
iterations = 1000)
Complete Hydrological Analysis Workflow
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/hydro_workflow')
dem <- wbw_read_raster('dem.tif')
# 1. Breach depressions
wbw_breach_depressions_least_cost(dem = dem$file_path(), output = 'dem_b.tif', dist = 500,
flat_increment = 0.0001, fill_deps = FALSE)
# 2. D8 routing
wbw_d8_pointer(dem = 'dem_b.tif', output = 'd8_ptr.tif')
wbw_d8_flow_accum(i = 'd8_ptr.tif', output = 'd8_acc.tif',
out_type = 'cells', pntr = TRUE)
# 3. DInf SCA for TWI
wbw_d_inf_flow_accum(dem = 'dem_b.tif', output = 'sca.tif',
out_type = 'sca')
# 4. Slope for indices
wbw_slope(dem = 'dem_b.tif', output = 'slope.tif',
units = 'degrees')
# 5. Streams
wbw_extract_streams(flow_accum = 'd8_acc.tif',
output = 'streams.tif', threshold = 3000, zero_background = TRUE)
# 6. Watershed
outlets <- wbw_read_vector('gauges.shp')
wbw_snap_pour_points(pour_pts = outlets$file_path(),
flow_accum = 'd8_acc.tif', output = 'gauges_snap.shp', snap_dist = 100.0)
wbw_watershed(d8_pntr = 'd8_ptr.tif',
pour_pts = 'gauges_snap.shp', output = 'watersheds.tif')
# 7. Indices
wbw_wetness_index(sca = 'sca.tif', slope = 'slope.tif',
output = 'twi.tif')
wbw_elevation_above_stream(dem = 'dem_b.tif',
streams = 'streams.tif', output = 'hand.tif')
cat('Hydrological analysis complete.\n')
LiDAR Processing
LiDAR point cloud processing in WbW-R covers the full pipeline from raw flight-line data through to classified point clouds and derived raster products. All processing tools run in the Whitebox backend; R handles session management, file discovery, and result validation.
Core Concepts
Before processing LiDAR data, understand these foundational terms:
- Return number: Which reflection from a single laser pulse. Pulse 1 is the first (often canopy top); pulse 2–5 capture midstory and ground returns.
- Point classification: ASPRS standard categories — ground (2), low veg (3), medium veg (4), high veg (5), buildings (6), noise (7), overlap (12), and others.
- Intensity: Reflectance value (0–65535) proportional to target brightness. Useful for vegetation density estimation and water detection.
- Ground filtering: Separating terrain points (classification 2) from vegetation and buildings; critical for accurate digital terrain models (DTMs).
- Digital terrain model (DTM): Raster surface of bare earth, computed from ground returns only. Used for hydrology, geomorphometry, and flood modelling.
- Digital surface model (DSM): Raster surface of highest returns (canopy top). Used for building detection and volume calculations.
- Canopy height model (CHM): DSM minus DTM; represents vegetation height above ground. Standard input for tree detection and segmentation.
- Point density: Points per square unit (typically points/m²). Higher density enables finer segmentation; lower density requires smoothing.
- Normalization: Converting raw Z-values to height-above-ground by subtracting DTM, creating a normalized point cloud for structural analysis.
Session Setup and File Discovery
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/lidar')
# List all LAS/LAZ files in the project folder
las_files <- list.files('.', pattern = '\\.(las|laz)$', full.names = TRUE,
recursive = TRUE)
cat('Found', length(las_files), 'point cloud files.\n')
Reading and Inspecting LiDAR Files
las <- wbw_read_lidar('survey.las')
meta <- las$metadata()
cat('Point count:', meta$number_of_points, '\n')
cat('X range:', meta$min_x, 'to', meta$max_x, '\n')
cat('Y range:', meta$min_y, 'to', meta$max_y, '\n')
cat('Z range:', meta$min_z, 'to', meta$max_z, '\n')
cat('Point density per m²:', meta$point_density, '\n')
Point Cloud Filtering and Outlier Removal
# Remove isolated low and high points
wbw_lidar_elevation_slice(i = 'survey.las',
output = 'survey_sliced.las',
minz = 0.0,
maxz = 200.0,
cls = FALSE)
# Statistical outlier removal
wbw_lidar_remove_outliers(i = 'survey.las',
output = 'survey_clean.las',
radius = 2.0,
elev_diff = 25.0,
use_median = FALSE)
Ground Point Classification
Progressive Morphological Filter
wbw_lidar_ground_point_filter(i = 'survey_clean.las',
output = 'survey_classified.las',
radius = 2.0,
min_elev_diff = 0.15,
max_elev_diff = 1.3,
num_iter = 5,
threshold = 0.15,
slope_threshold = 0.15,
height_threshold = 1.0,
classify = TRUE,
slope = FALSE,
height_diff = TRUE,
filter_return_all = FALSE)
Digital Elevation Model Interpolation
Tin Gridding (TIN Interpolation)
wbw_tin_gridding(i = 'survey_classified.las',
output = 'dtm.tif',
returns = 'last',
resolution = 1.0,
exclude_cls = '1,3,4,5,6,7',
minz = -50.0,
maxz = 250.0)
LiDAR IDW Interpolation
wbw_lidar_idw_interpolation(i = 'survey_classified.las',
output = 'dtm_idw.tif',
parameter = 'elevation',
returns = 'last',
resolution = 1.0,
weight = 1.0,
radius = 2.5,
exclude_cls = '1,3,4,5,6,7')
Normalised Height Above Ground
wbw_normalize_lidar(i = 'survey_classified.las',
ground = 'survey_classified.las',
output = 'survey_normalised.las',
ignore_ground_distance = FALSE)
Canopy Height Model (CHM) and DSM
# First return DSM
wbw_lidar_idw_interpolation(i = 'survey_classified.las',
output = 'dsm.tif',
parameter = 'elevation',
returns = 'first',
resolution = 1.0,
weight = 1.0,
radius = 2.5,
exclude_cls = '7')
# Canopy Height Model = DSM - DTM
wbw_subtract(input1 = 'dsm.tif',
input2 = 'dtm.tif',
output = 'chm.tif')
Point Density and Distribution Analysis
wbw_lidar_point_stats(i = 'survey.las',
resolution = 1.0,
num_points = TRUE,
num_pulses = FALSE)
wbw_lidar_density(i = 'survey.las',
output = 'density.tif',
resolution = 1.0,
returns = 'all',
exclude_cls = '7')
Scan-Angle and Return Analysis
# Filter by scan angle
wbw_filter_lidar_scan_angles(i = 'survey.las',
output = 'survey_nadir.las',
threshold = 15.0)
# Intensity image
wbw_lidar_idw_interpolation(i = 'survey_nadir.las',
output = 'intensity.tif',
parameter = 'intensity',
returns = 'all',
resolution = 1.0,
weight = 1.0,
radius = 2.5)
Tile Management
Working with large surveys requires tiling:
# Tile large dataset
wbw_lidar_tile(i = 'full_survey.las',
width = 500.0,
height = 500.0,
origin_x = 0.0,
origin_y = 0.0)
# Add a buffer overlap to each tile
wbw_lidar_tile_footprint(i = 'tile_000_000.las',
output = 'tile_000_000_footprint.shp',
hull = FALSE)
Segmentation and Vegetation Analysis
# Individual tree segmentation from normalised LiDAR
wbw_individual_tree_detection(i = 'survey_normalised.las',
output = 'tree_tops.shp',
min_search_radius = 1.0,
min_height = 2.0,
max_search_radius = 5.0,
max_height = 40.0)
# Canopy cover (fraction first return above height threshold)
wbw_lidar_segmentation_based_filter(i = 'survey_classified.las',
output = 'survey_seg_filtered.las',
slope_threshold = 15.0,
max_edge_length = 0.5,
classify = TRUE)
WbW-Pro Spotlight: LiDAR Change and Disturbance Analysis
- Problem: Compare repeat LiDAR epochs to detect disturbance in a repeatable way.
- Tool:
lidar_change_and_disturbance_analysis - Typical inputs: Baseline tile set, monitoring tile set, output resolution, minimum change threshold.
- Typical outputs: Change rasters plus summary metrics for affected area, hotspot intensity, and QA review.
result <- s$lidar_change_and_disturbance_analysis(
baseline_tiles = '/data/lidar_epoch_2022/',
monitor_tiles = '/data/lidar_epoch_2025/',
resolution = 1.0,
min_change_m = 0.5,
output_prefix = 'disturbance_2025_vs_2022'
)
print(result)
Note: This workflow requires a session initialized with a valid Pro licence.
Complete LiDAR Processing Workflow
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/lidar_project')
las_file <- 'raw_survey.las'
# 1. Remove outliers
wbw_lidar_remove_outliers(i = las_file, output = 'clean.las', radius = 2.0, elev_diff = 25.0)
# 2. Ground classification
wbw_lidar_ground_point_filter(i = 'clean.las', output = 'classified.las', radius = 2.0,
min_elev_diff = 0.15, max_elev_diff = 1.3, num_iter = 5,
threshold = 0.15, slope_threshold = 0.15, height_threshold = 1.0,
classify = TRUE, slope = FALSE, height_diff = TRUE,
filter_return_all = FALSE)
# 3. DTM
wbw_tin_gridding(i = 'classified.las', output = 'dtm.tif', returns = 'last',
resolution = 1.0, exclude_cls = '1,3,4,5,6,7')
# 4. DSM and CHM
wbw_lidar_idw_interpolation(i = 'classified.las', output = 'dsm.tif', parameter = 'elevation',
returns = 'first', resolution = 1.0, weight = 1.0, radius = 2.5,
exclude_cls = '7')
wbw_subtract(input1 = 'dsm.tif', input2 = 'dtm.tif', output = 'chm.tif')
# 5. Normalise for vegetation analysis
wbw_normalize_lidar(i = 'classified.las', ground = 'classified.las',
output = 'normalised.las', ignore_ground_distance = FALSE)
# 6. Tree detection
wbw_individual_tree_detection(i = 'normalised.las', output = 'tree_tops.shp',
min_search_radius = 1.0, min_height = 2.0,
max_search_radius = 5.0, max_height = 40.0)
# 7. Point density
wbw_lidar_density(i = 'classified.las', output = 'density.tif',
resolution = 1.0, returns = 'all', exclude_cls = '7')
cat('LiDAR processing complete.\n')
Tips
- Choose your format wisely: LAS is universal and compact; LAZ adds compression and is ideal for archival or transmission. COPC is cloud-optimized and best for remote HTTP range-request access. Use LAZ or LAS for terrestrial/airborne surveys; COPC for cloud-native workflows.
- Always validate classifications: Use
lidar_histogram()andlidar_info()to inspect point distributions by return and classification. Misclassified ground points silently corrupt DTMs and downstream hydrology. - DTM vs. DSM vs. CHM: Generate all three at the same resolution so derivatives align. A common pitfall is mixing DTM and DSM resolution.
- Ground filtering is critical: Outliers and noise (classification 7) should be excluded before gridding. Use
lidar_filter_for_ground()to remove spikes and erratic points. - Normalization enables vegetation analysis: Always normalize point clouds (subtract DTM) before individual tree detection or canopy structural metrics.
- Monitor memory for large surveys: Point clouds are memory-intensive. For datasets > 1 GB, use streaming APIs (
lidar_read_chunked()) rather than loading the entire tile at once. - Coordinate reference systems matter: LAS headers carry CRS as WKT. Verify WKT matches your project CRS before gridding or merging tiles.
- Density and grid resolution: If point density is < 0.5 pts/m², consider upsampling or smoothing the output grid to avoid isolated pits or peaks.
Remote Sensing Analysis
Remote sensing workflows in WbW-R cover multispectral and hyperspectral image analysis: spectral index computation, image enhancement, principal component analysis, image segmentation, unsupervised and supervised classification, accuracy assessment, and change detection. All computation runs in the Whitebox backend.
Sensor Bundle First: When working with Sentinel-2, Landsat, PlanetScope, or SAR product folders, open the scene as a sensor bundle using
wbw_sensor_bundle_from_path(). Bundles provide automatic metadata discovery, key-based band access, and one-call true/false-colour composites without hardcoding file names.
Core Concepts
Remote sensing image analysis requires familiarity with these core ideas:
- Spectral bands: Distinct wavelength ranges (e.g. blue 450–510 nm, red 620–750 nm, NIR 750–900 nm). Different materials reflect different bands, enabling material discrimination.
- Spectral indices: Normalized ratios of bands that isolate phenomena. NDVI (Normalized Difference Vegetation Index) uses NIR and red to measure greenness; NDWI uses NIR and SWIR for water content.
- Spatial resolution: Pixel size in meters (Sentinel-2: 10 m for visible/NIR, 20 m for SWIR; Landsat: 30 m; SAR: 5–30 m depending on mode).
- Temporal resolution: Revisit interval (Sentinel-2: 5 days; Landsat: 16 days; PlanetScope: daily or higher).
- Atmospheric effects: Raw radiance is affected by aerosols, water vapour, and ozone. Surface reflectance products are atmospherically corrected; still require relative normalization for multi-date analysis.
- Cloud masking: Cloud and cloud shadow pixels must be identified and excluded before classification or change detection using quality/QA bands.
- Supervised classification: Training polygons with known labels teach a model to assign classes to unlabelled pixels. Training quality directly controls classification accuracy.
- Unsupervised classification: Clustering algorithms (K-means, ISODATA) discover spectral clusters without training labels; useful for exploratory analysis.
- Change detection: Comparing indices or classifications across dates to identify land-use changes, disturbance, or phenological shifts.
Session Setup
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/remote_sensing')
Sensor Bundles — Cross-Family Scene Ingestion
wbw_sensor_bundle_from_path() wraps a product folder and returns a bundle object with a consistent interface across sensor families.
Opening a Bundle
# Works with Sentinel-2, Landsat 8/9, PlanetScope, SPOT, SAR, etc.
bundle <- wbw_sensor_bundle_from_path(
'/data/S2B_MSIL2A_20240601T105619_N0510_R094_T32TPT.SAFE',
session = s
)
# Or from a zip archive — WbW extracts it automatically
bundle <- wbw_sensor_bundle_from_path('/data/LC09_L2SP_017030_20240610.tar', session = s)
Discovering Available Layers
meta <- bundle$metadata()
cat('Family: ', meta$family, '\n')
cat('Mission: ', meta$mission, '\n')
cat('Tile: ', meta$tile_id, '\n')
cat('Level: ', meta$processing_level, '\n')
cat('Cloud %: ', meta$cloud_cover_percent, '\n')
cat('Datetime: ', meta$acquisition_datetime_utc, '\n')
print(bundle$list_band_keys()) # e.g. c('B02', 'B03', 'B04', 'B08', ...)
print(bundle$list_measurement_keys()) # e.g. c('ndvi', 'ndwi')
print(bundle$list_qa_keys()) # e.g. c('SCL', 'QA_PIXEL')
print(bundle$list_aux_keys()) # e.g. c('AOT', 'WVP')
print(bundle$list_asset_keys())
Reading Bands by Key
# Keys follow sensor-native naming — no need to memorise file paths
blue <- bundle$read_band('B02') # Sentinel-2 blue
green <- bundle$read_band('B03')
red <- bundle$read_band('B04')
nir <- bundle$read_band('B08')
swir1 <- bundle$read_band('B11')
swir2 <- bundle$read_band('B12')
# Landsat 9 uses different key names
# blue <- bundle$read_band('SR_B2')
# nir <- bundle$read_band('SR_B5')
# QA / cloud mask
scl <- bundle$read_qa_layer('SCL') # Sentinel-2 scene classification
Quick-Look Composites
# True-colour GeoTIFF (auto-enhanced by default)
bundle$write_true_colour(output_path = 'true_colour.tif')
# False-colour (NIR-Red-Green) composite
bundle$write_false_colour(output_path = 'false_colour.tif')
Cross-Family NDVI Pattern
compute_ndvi_from_bundle <- function(bundle_path, output_path, session = NULL) {
b <- wbw_sensor_bundle_from_path(bundle_path, session = session)
meta <- b$metadata()
if ('ndvi' %in% b$list_measurement_keys()) {
ndvi_r <- b$read_measurement('ndvi')
wbw_write_raster(ndvi_r, output_path)
} else {
# Fall back to band-based NDVI
if (meta$family == 'sentinel2') {
red_r <- b$read_band('B04')
nir_r <- b$read_band('B08')
} else {
red_r <- b$read_band('SR_B4')
nir_r <- b$read_band('SR_B5')
}
wbw_normalized_difference_index(input1 = nir_r$file_path(), input2 = red_r$file_path(),
output = output_path)
}
}
compute_ndvi_from_bundle('/data/S2B_MSIL2A_20240601.SAFE', 'ndvi_s2.tif', s)
compute_ndvi_from_bundle('/data/LC09_L2SP_017030.tar', 'ndvi_l9.tif', s)
Cloud and No-Data Masking
# Sentinel-2 SCL: 3=cloud shadow, 8=med cloud, 9=high cloud, 10=thin cirrus
scl <- bundle$read_qa_layer('SCL')
wbw_raster_calculator(output = 'red_cloud_free.tif',
statement = paste0("if('", scl$file_path(), "' == 3 or '", scl$file_path(), "' == 8 ",
"or '", scl$file_path(), "' == 9 or '", scl$file_path(), "' == 10, ",
"nodata, '", red$file_path(), "')"))
# Landsat Collection 2 QA_PIXEL — use bitwise expressions via raster_calculator
# qa <- bundle$read_qa_layer('QA_PIXEL')
Reading Multi-Band Imagery (Individual Files)
When bundles are not available (e.g., reprojected mosaics, custom composites, legacy archives), load individual band files in the conventional way:
b2 <- wbw_read_raster('LC08_B2_blue.tif')
b3 <- wbw_read_raster('LC08_B3_green.tif')
b4 <- wbw_read_raster('LC08_B4_red.tif')
b5 <- wbw_read_raster('LC08_B5_nir.tif')
b6 <- wbw_read_raster('LC08_B6_swir1.tif')
b7 <- wbw_read_raster('LC08_B7_swir2.tif')
# Resample if bands have different resolutions
wbw_resample(inputs = b6$file_path(),
output = 'b6_10m.tif',
cell_size = 0.0,
base = b4$file_path(),
method = 'bilinear')
Spectral Indices
Normalised Difference Vegetation Index (NDVI)
$$NDVI = \frac{NIR - Red}{NIR + Red}$$
# Using bands read from a bundle
wbw_normalized_difference_index(input1 = nir$file_path(), # NIR
input2 = red$file_path(), # Red
output = 'ndvi.tif')
Common Water, Snow, and Urban Indices
# Define index triplets: (output name, numerator band, denominator band)
indices <- list(
list(name = 'ndwi', b1 = green, b2 = nir), # open water; Green/NIR
list(name = 'mndwi', b1 = green, b2 = swir1), # urban water; Green/SWIR1
list(name = 'nbr', b1 = nir, b2 = swir2), # fire severity; NIR/SWIR2
list(name = 'ndsi', b1 = green, b2 = swir1), # snow/ice; Green/SWIR1
list(name = 'ndbi', b1 = swir1, b2 = nir) # built-up; SWIR1/NIR
)
for (idx in indices) {
wbw_normalized_difference_index(input1 = idx$b1$file_path(),
input2 = idx$b2$file_path(),
output = paste0(idx$name, '.tif'))
}
EVI (Enhanced Vegetation Index)
$$EVI = 2.5 \cdot \frac{NIR - Red}{NIR + 6 \cdot Red - 7.5 \cdot Blue + 1}$$
wbw_raster_calculator(output = 'evi.tif',
statement = paste0("2.5 * ('", nir$file_path(), "' - '", red$file_path(), "') / (",
"'", nir$file_path(), "' + 6.0 * '", red$file_path(), "' - 7.5 * '",
blue$file_path(), "' + 1.0)"))
Image Enhancement
Percentage Linear Stretch
wbw_percentage_contrast_stretch(i = b4$file_path(),
output = 'b4_stretched.tif',
clip = 1.0,
tail = 'both',
num_tones = 256)
Standard Deviation Stretch
wbw_standard_deviation_contrast_stretch(i = b4$file_path(),
output = 'b4_sd_stretch.tif',
stdev = 2.0,
num_tones = 256)
Gamma Correction
wbw_gamma_correction(i = b4$file_path(),
output = 'b4_gamma.tif',
gamma = 0.55)
Histogram Matching
wbw_histogram_matching(i = 'image_target.tif',
histo_file = 'reference_histogram.html',
output = 'image_matched.tif')
IHS Colour Space and Pan-Sharpening
wbw_rgb_to_ihs(intensity = b4$file_path(),
hue = b3$file_path(),
saturation = b2$file_path(),
output = 'ihs.tif')
# Pan-sharpen
wbw_ihs_to_rgb(intensity = 'pan_band.tif',
hue = 'ihs_hue.tif',
saturation = 'ihs_sat.tif',
output = 'pansharpened.tif')
Spatial Filtering
# Gaussian smoothing
wbw_gaussian_filter(i = b5$file_path(), output = 'b5_gauss.tif', sigma = 2.0)
# Edge detection — Canny
wbw_canny_edge_detection(i = b4$file_path(), output = 'edges.tif', sigma = 0.5,
low_threshold = 0.05, high_threshold = 0.15, add_back = FALSE)
# General-purpose GLCM texture (multiband output)
wbw_glcm_texture(input = b5$file_path(),
window_size = 9,
distance = 1,
angles = '0,45,90,135',
features = 'contrast,homogeneity,entropy',
direction_aggregation = 'mean',
levels = 32,
output = 'glcm_texture.tif')
Principal Component Analysis
# Supply a comma-separated list of band files
all_bands <- paste(c(b2$file_path(), b3$file_path(), b4$file_path(),
b5$file_path(), b6$file_path(), b7$file_path()),
collapse = ';')
wbw_principal_component_analysis(inputs = all_bands,
output = 'pca_loadings.html',
num_comp = 6,
standardised = FALSE)
# PC scores are written as pc1.tif, pc2.tif, ... in the working directory
Image Segmentation
wbw_image_segmentation(inputs = all_bands,
output = 'segments.tif',
threshold = 30.0,
steps = 10,
min_area = 10)
Object-Based Image Analysis (OBIA) Baseline
Phase 1 open-core OBIA tools are available through standard wbw_<tool>(...)
wrappers and can be discovered as a grouped set:
obia_tools <- wbw_tools_in_remote_sensing_obia(session = s)
print(obia_tools$id)
# 1) Segment imagery into compact object candidates
wbw_segment_slic_superpixels(inputs = all_bands,
region_size = 18,
compactness = 12.0,
output = 'segments_slic.tif')
# 2) Merge undersized regions
wbw_segments_merge_small_regions(segments = 'segments_slic.tif',
min_size = 12,
method = 'longest',
output = 'segments_clean.tif')
# 3) Extract spectral/shape/texture object features
wbw_object_features_spectral_basic(segments = 'segments_clean.tif',
inputs = all_bands,
output = 'object_features_spectral.csv')
wbw_object_features_shape_basic(segments = 'segments_clean.tif',
output = 'object_features_shape.csv')
wbw_object_features_texture_glcm_basic(segments = 'segments_clean.tif',
input = b5$file_path(),
levels = 16,
output = 'object_features_texture.csv')
# 4) Train/apply object-level RF model
wbw_classify_objects_random_forest(features = 'object_features_all.csv',
training = 'training_segments.csv',
output = 'object_predictions.csv')
# 5) Evaluate object-level accuracy
wbw_evaluate_object_classification_accuracy(predictions = 'object_predictions.csv',
reference = 'validation_segments.csv',
output = 'object_accuracy.json')
# Optional one-call baseline pipeline
wbw_obia_pipeline_basic(inputs = all_bands,
training = 'training_segments.csv',
output_prefix = 'obia_field01',
segment_method = 'slic')
Advanced OBIA Capabilities (Open Tier)
The OBIA surface now includes advanced open-tier tools. You can list them from
the grouped discovery helper and run each directly with wbw_<tool>(...) wrappers
(or with wbw_run_tool(...) for dynamic tool-id dispatch).
Segmentation and scale control:
segment_watershed_markerssegment_multiresolution_hierarchicalsegment_scale_parameter_optimizersegments_split_low_cohesion
Object conversion and interoperability:
segments_to_polygonspolygons_to_segments
Advanced features:
object_features_context_neighborsobject_features_topology_relations
Advanced classification:
classify_objects_svmclassify_objects_ensemble_proclassify_objects_rules_basicclassify_objects_rules_hierarchicalobject_class_probability_mapsobject_uncertainty_diagnostics_pro
Hierarchy and propagation:
build_object_hierarchy_multiscalepropagate_labels_across_hierarchy
Post-processing and quality:
objects_enforce_min_mapping_unitobjects_boundary_refinement_proevaluate_segmentation_quality_pro
Workflow operations:
obia_batch_orchestrator_proobia_audit_report_pro
# 1) Build multiscale hierarchy products
wbw_segment_multiresolution_hierarchical(inputs = all_bands,
coarse_k = 900.0,
fine_k = 280.0,
output_prefix = 'site01_hier')
# 2) Context and topology features for difficult class boundaries
wbw_object_features_context_neighbors(segments = 'site01_hier_segments_fine.tif',
output = 'site01_context.csv')
wbw_object_features_topology_relations(segments = 'site01_hier_segments_fine.tif',
output = 'site01_topology.csv')
# 3) Ensemble and rule-hierarchical object classification
wbw_classify_objects_ensemble_pro(features = 'site01_features_all.csv',
training = 'site01_training_segments.csv',
output = 'site01_pred_ensemble.csv')
wbw_classify_objects_rules_hierarchical(features = 'site01_features_all.csv',
rules = 'site01_rules.csv',
output = 'site01_pred_rules.csv')
# 4) Probability maps and uncertainty diagnostics
wbw_object_class_probability_maps(predictions = 'site01_pred_ensemble.csv',
output = 'site01_probabilities.csv')
wbw_object_uncertainty_diagnostics_pro(probabilities = 'site01_probabilities.csv',
low_conf_threshold = 0.7,
output = 'site01_uncertainty.json')
# Batch orchestration for multi-scene runs
wbw_obia_batch_orchestrator_pro(jobs = list(
list(
inputs = list('s1_red.tif', 's1_green.tif', 's1_nir.tif'),
training = 's1_training.csv',
output_prefix = 'prod/s1',
segment_method = 'graph'
),
list(
inputs = list('s2_red.tif', 's2_green.tif', 's2_nir.tif'),
training = 's2_training.csv',
output_prefix = 'prod/s2',
segment_method = 'slic'
)
),
output = 'prod/obia_batch_report.json')
wbw_obia_audit_report_pro(artifacts = list(
'prod/s1_object_predictions.csv',
'prod/s2_object_predictions.csv',
'prod/obia_batch_report.json'
),
output = 'prod/obia_audit.json')
Use segments_to_polygons and polygons_to_segments when analysts need to
edit object boundaries in vector format and then return those objects back into
raster-segment workflows.
Unsupervised Classification
KMeans
wbw_modified_k_means_clustering(inputs = all_bands,
output = 'kmeans.tif',
out_html = 'kmeans_report.html',
start_clusters = 5,
end_clusters = 10,
max_iterations = 25,
class_change = 2.0)
DBSCAN
wbw_dbscan(inputs = all_bands,
output = 'dbscan.tif',
search_dist = 2.5,
min_points = 10)
Supervised Classification
Random Forest
# 1. Fit the model
wbw_random_forest_classification_fit(inputs = all_bands,
training = 'training_polygons.shp',
field = 'CLASS_ID',
output = 'rf_model.bin',
n_trees = 100,
min_leaf_size = 1,
max_features = 0.0)
# 2. Apply the model
wbw_random_forest_classification_predict(inputs = all_bands,
model = 'rf_model.bin',
output = 'rf_classification.tif')
Support Vector Machine
wbw_svm_classification(inputs = all_bands,
training = 'training_polygons.shp',
field = 'CLASS_ID',
output = 'svm_classification.tif',
c = 200.0,
gamma = 50.0,
cost = 10.0,
tolerance = 0.1,
test_proportion = 0.2)
K-Nearest Neighbours
wbw_knn_classification(inputs = all_bands,
training = 'training_polygons.shp',
field = 'CLASS_ID',
output = 'knn_classification.tif',
k = 5,
scaling = TRUE,
test_proportion = 0.2)
Accuracy Assessment
wbw_kappa_index(i1 = 'rf_classification.tif',
i2 = 'reference_classification.tif',
output = 'accuracy_report.html')
Change Detection
Image Differencing
wbw_change_vector_analysis(date1 = paste(c('t1_b2.tif','t1_b3.tif','t1_b4.tif','t1_b5.tif'), collapse=';'),
date2 = paste(c('t2_b2.tif','t2_b3.tif','t2_b4.tif','t2_b5.tif'), collapse=';'),
magnitude = 'cva_magnitude.tif',
direction = 'cva_direction.tif')
Additional Change, Thermal, and Spectral Sprint Tools
The newest remote sensing tools are available through session tool execution, which is useful while dedicated R helper wrappers are still catching up:
# Image difference change detection
img_diff <- s$run_tool('image_difference_change_detection', list(
t1_inputs = c('t1_b2.tif', 't1_b3.tif', 't1_b4.tif'),
t2_inputs = c('t2_b2.tif', 't2_b3.tif', 't2_b4.tif'),
mode = 'magnitude',
threshold_sigma = 2.0,
output = 'img_diff_mag.tif',
output_signed = 'img_diff_signed.tif',
output_mask = 'img_diff_mask.tif'
))
# Post-classification change with class remapping
post_class <- s$run_tool('post_classification_change', list(
t1_classified = 'lulc_t1.tif',
t2_classified = 'lulc_t2.tif',
transition_scale = 1000,
t1_class_remap = list('11' = 1, '12' = 1),
t2_class_remap = list('41' = 4, '42' = 4),
output = 'post_class_transition.tif'
))
# PCA-based change detection
pca_change <- s$run_tool('pca_based_change_detection', list(
t1_inputs = c('t1_b2.tif', 't1_b3.tif', 't1_b4.tif'),
t2_inputs = c('t2_b2.tif', 't2_b3.tif', 't2_b4.tif'),
component = 1,
threshold_sigma = 2.0,
output = 'pca_change_pc1.tif',
output_mask = 'pca_change_mask.tif',
output_report = 'pca_change_report.json'
))
# Dark object subtraction and DN->TOA reflectance
dos <- s$run_tool('dark_object_subtraction', list(
inputs = c('B02.tif', 'B03.tif', 'B04.tif', 'B08.tif'),
percentile = 1.0,
output = 'dos_stack.tif',
output_diagnostic_offsets = 'dos_offsets.tif'
))
toa <- s$run_tool('dn_to_toa_reflectance', list(
inputs = c('B02.tif', 'B03.tif', 'B04.tif', 'B08.tif'),
sensor_bundle_root = '/data/LC09_L1TP_017030_20240420_20240426_02_T1',
output = 'toa_stack.tif'
))
# Thermal emissivity and LST
emiss <- s$run_tool('ndvi_based_emissivity', list(
red_input = 'B04.tif',
nir_input = 'B08.tif',
output = 'emissivity.tif'
))
lst_sc <- s$run_tool('land_surface_temperature_single_channel', list(
thermal_input = 'B10.tif',
sensor_bundle_root = '/data/LC09_L1TP_017030_20240420_20240426_02_T1',
output = 'lst_single_channel.tif'
))
lst_sw <- s$run_tool('land_surface_temperature_split_window', list(
thermal1_input = 'B10.tif',
thermal2_input = 'B11.tif',
emissivity_mean_constant = 0.98,
emissivity_delta_constant = 0.0,
output = 'lst_split_window.tif'
))
# Spectral analytics
sam <- s$run_tool('spectral_angle_mapper', list(
inputs = c('B02.tif', 'B03.tif', 'B04.tif', 'B08.tif'),
endmembers = list(
list(name = 'water', values = c(0.03, 0.02, 0.01, 0.00)),
list(name = 'veg', values = c(0.05, 0.10, 0.06, 0.40))
),
output = 'sam_classes.tif',
output_angle = 'sam_angle.tif'
))
cont <- s$run_tool('continuum_removal', list(
inputs = c('hyp_b1.tif', 'hyp_b2.tif', 'hyp_b3.tif'),
wavelengths = c(450.0, 550.0, 650.0),
output = 'continuum_removed.tif'
))
unmix <- s$run_tool('linear_spectral_unmixing', list(
inputs = c('B02.tif', 'B03.tif', 'B04.tif', 'B08.tif'),
endmembers = list(
list(name = 'soil', values = c(0.18, 0.20, 0.22, 0.24)),
list(name = 'veg', values = c(0.05, 0.09, 0.06, 0.40))
),
output = 'unmix_frac.tif',
output_residual = 'unmix_residual.tif'
))
mnf <- s$run_tool('minimum_noise_fraction', list(
inputs = c('B02.tif', 'B03.tif', 'B04.tif', 'B08.tif'),
num_components = 3,
output = 'mnf_components.tif',
output_inverse = 'mnf_inverse.tif'
))
lib_match <- s$run_tool('spectral_library_matching', list(
inputs = c('B02.tif', 'B03.tif', 'B04.tif', 'B08.tif'),
metric = 'sam',
library = list(
list(name = 'water', values = c(0.03, 0.02, 0.01, 0.00)),
list(name = 'soil', values = c(0.18, 0.20, 0.22, 0.24))
),
output = 'lib_class.tif',
output_score = 'lib_score.tif'
))
# PolSAR decomposition tools
cp <- s$run_tool('cloude_pottier_decomposition', list(
inputs = c('t11.tif', 't22.tif', 't33.tif'),
matrix_format = 'diag3',
output = 'cloude_pottier_haa.tif'
))
fd <- s$run_tool('freeman_durden_decomposition', list(
inputs = c('c11.tif', 'c22.tif', 'c33.tif'),
matrix_format = 'diag3',
output = 'freeman_durden.tif',
output_clip_mask = 'freeman_clip.tif'
))
WbW-Pro Spotlight: In-Season Crop Stress Intervention Planning
- Problem: Turn in-season crop stress signals into actionable intervention priorities.
- Tool:
in_season_crop_stress_intervention_planning - Typical inputs: NDVI raster, canopy-temperature raster, soil-moisture raster.
- Typical outputs: Intervention-priority and intervention-class products with summary reporting.
result <- s$in_season_crop_stress_intervention_planning(
ndvi = 'ndvi_latest.tif',
canopy_temperature = 'lst_latest.tif',
soil_moisture = 'soil_moisture_latest.tif',
output_prefix = 'field_07_stress'
)
print(result)
Note: This workflow requires a session initialized with a valid Pro licence.
Complete Remote Sensing Workflow
The following end-to-end example uses a sensor bundle for ingestion and random forest for land-cover mapping:
library(whiteboxworkflows)
s <- wbw_session()
# 1. Open scene as a bundle — works with S2, Landsat 9, PlanetScope, etc.
bundle <- wbw_sensor_bundle_from_path(
'/data/S2B_MSIL2A_20240601T105619_N0510_R094_T32TPT.SAFE',
session = s
)
meta <- bundle$metadata()
cat(sprintf('Family: %s, tile: %s, clouds: %.1f%%\n',
meta$family, meta$tile_id, meta$cloud_cover_percent))
if (meta$cloud_cover_percent > 20) {
stop('Too cloudy for classification')
}
# 2. Read bands by key (family-agnostic)
blue <- bundle$read_band('B02')
green <- bundle$read_band('B03')
red <- bundle$read_band('B04')
nir <- bundle$read_band('B08')
wbw_resample(inputs = bundle$read_band('B11')$file_path(), output = 'swir1_10m.tif',
cell_size = 0.0, base = red$file_path(), method = 'bilinear')
wbw_resample(inputs = bundle$read_band('B12')$file_path(), output = 'swir2_10m.tif',
cell_size = 0.0, base = red$file_path(), method = 'bilinear')
swir1 <- wbw_read_raster('swir1_10m.tif')
swir2 <- wbw_read_raster('swir2_10m.tif')
bands <- list(blue, green, red, nir, swir1, swir2)
band_paths <- paste(sapply(bands, function(b) b$file_path()), collapse = ';')
# 3. Mask clouds using SCL
scl <- bundle$read_qa_layer('SCL')
# 4. Spectral indices
wbw_normalized_difference_index(input1 = nir$file_path(), input2 = red$file_path(),
output = 'ndvi.tif')
wbw_normalized_difference_index(input1 = green$file_path(), input2 = swir1$file_path(),
output = 'mndwi.tif')
wbw_normalized_difference_index(input1 = nir$file_path(), input2 = swir2$file_path(),
output = 'nbr.tif')
ndvi <- wbw_read_raster('ndvi.tif')
mndwi <- wbw_read_raster('mndwi.tif')
nbr <- wbw_read_raster('nbr.tif')
# 5. PCA decorrelation
wbw_principal_component_analysis(inputs = band_paths,
output = 'pca_loadings.html',
num_comp = 4,
standardised = TRUE)
pc_paths <- paste(paste0('pc', 1:4, '.tif'), collapse = ';')
# 6. Feature stack = bands + indices + PCA scores
feature_paths <- paste(c(band_paths, 'ndvi.tif', 'mndwi.tif', 'nbr.tif', pc_paths),
collapse = ';')
# 7. Train random forest
wbw_random_forest_classification_fit(inputs = feature_paths,
training = 'training_polygons.shp',
field = 'CLASS',
output = 'rf_model.bin',
n_trees = 500,
test_proportion = 0.25)
# 8. Predict
wbw_random_forest_classification_predict(inputs = feature_paths,
model = 'rf_model.bin',
output = 'lulc.tif')
# 9. Generalise
wbw_generalize_classified_raster(i = 'lulc.tif', output = 'lulc_clean.tif', min_size = 9)
# 10. Accuracy assessment
wbw_kappa_index(i1 = 'lulc_clean.tif', i2 = 'validation_reference.tif',
output = 'accuracy.html')
# 11. Quick-look composite for QC
bundle$write_true_colour(output_path = 'quicklook_truecolour.tif')
cat('Remote sensing workflow complete.\n')
Tips
- Use bundles for scene-level ingestion:
wbw_sensor_bundle_from_path()eliminates hardcoded band filenames and works identically across Sentinel-2, Landsat, PlanetScope, and other supported families. - Screen cloud cover before processing: Check bundle metadata
cloud_cover_percentbefore loading all bands to skip unusable scenes early in a batch workflow. - Atmospheric correction is iterative: Raw digital numbers and even surface reflectance products may still carry illumination and atmospheric artefacts. Histogram matching across scenes is a simple relative normalisation that works well when no reference spectrum is available.
- Align band resolution before stacking: Sentinel-2 delivers 10 m (Blue, Green, Red, NIR) and 20 m (Red Edge, SWIR) bands. Upscale the 20 m bands to 10 m using bilinear resampling before combining them in a feature stack.
- Always mask clouds and cloud shadows: Use quality bands (Sentinel-2 SCL or Landsat QA_PIXEL) and apply bitwise tests via raster calculator to isolate valid pixels.
- Balance training data: Training datasets often over-represent dominant classes (e.g., forest) relative to rare classes (e.g., wetland). Sample training polygons proportional to expected class area, or oversample rare classes to improve model accuracy.
- Check spectral separability: If accuracy assessment reports Jeffries-Matusita distance < 1.5 between any two classes, consider merging classes or collecting more spectrally diverse training polygons.
- Use time-series techniques for change detection: Multi-date classifications can show spurious change due to atmospheric variation. Spectral indices are more robust; consider computing NDVI or NDWI time series and detecting change directly in index space.
- Memory is your bottleneck: Large multispectral stacks fill RAM quickly. Use band-by-band processing or out-of-memory methods for scenes > 1 GB.
Raster Analysis
Raster analysis in WbW-R covers the end-to-end workflow of reading, transforming, and writing gridded data — from simple cell-value arithmetic through multi-layer overlays, proximity operations, interpolation, and statistical testing. All heavy computation runs in the Whitebox backend.
Core Concepts
Raster analysis requires understanding these fundamental terms:
- Cell (pixel): The smallest unit of a raster. Each cell stores a single value (integer or floating-point) and has a uniform spatial extent (e.g. 10 m × 10 m).
- Data type: Integer (Int8, Int16, Int32) for categorical data; Float32 or Float64 for continuous measurements. Data type affects precision, file size, and computation speed.
- NoData value: Sentinel value representing missing or invalid data (e.g. -9999 or NaN). Critical for masking water, clouds, or invalid measurements in focal operations.
- Spatial reference (CRS): Coordinate system and projection. Mismatched CRS between rasters causes silent misalignment; always verify before overlay operations.
- Extent: The bounding box (xmin, ymin, xmax, ymax) of the raster in real-world coordinates.
- Cell size (resolution): Cell width and height in map units. Coarser resolution is faster but loses detail; finer resolution requires more memory and computation.
- Focal operation: Uses neighbourhood values (e.g. 3×3 kernel) to compute output. Examples: moving average, Sobel edge detection, local extrema.
- Zonal operation: Aggregates grid values by zone (polygon or categorical layer). Examples: mean by land-cover class, sum by administrative boundary.
- Reclassification: Reassigning cell values according to lookup rules. Common for categorizing continuous data (e.g. slope classes) or remapping land-cover codes.
- Resampling: Changing cell size or alignment. Methods: nearest-neighbour (preserves categories), bilinear (smooth), cubic (smoother).
Session Setup
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/raster')
Reading and Inspecting Rasters
r <- wbw_read_raster('input.tif')
meta <- r$metadata()
cat('Rows:', meta$rows, ' Cols:', meta$columns, '\n')
cat('Cell size:', meta$resolution_x, 'x', meta$resolution_y, '\n')
cat('EPSG:', meta$epsg, '\n')
cat('NoData:', meta$nodata, '\n')
cat('Data type:', meta$data_type, '\n')
Raster Calculator
raster_calculator evaluates an algebraic expression that combines one or more raster inputs:
# Single-raster expression — multiply by constant
wbw_raster_calculator(statement = "'elevation.tif' * 3.28084",
output = 'elevation_ft.tif')
# Multi-raster NDVI expression
wbw_raster_calculator(statement = "('nir.tif' - 'red.tif') / ('nir.tif' + 'red.tif')",
output = 'ndvi.tif')
# Conditional expression using isnull() and nodata()
wbw_raster_calculator(statement = "if(isnull('input.tif'), nodata(), 'input.tif' + 100.0)",
output = 'result.tif')
Special tokens available in statements: nodata(), isnull(), if(), abs(), sqrt(), log(), log2(), exp(), min(), max(), pi, integer constants, and floating-point constants.
Reclassification
# Reclassify using from-to-becomes triplets
# Format: "from;to;new;from;to;new;..."
wbw_reclass(i = 'slope.tif',
output = 'slope_class.tif',
reclass_vals = '0;5;1;5;15;2;15;30;3;30;45;4;45;90;5',
assign_mode = FALSE)
# Equal-interval reclassification
wbw_reclass_equal_interval(i = 'ndvi.tif',
output = 'ndvi_class.tif',
interval = 0.1,
start_val = -1.0,
end_val = 1.0)
# Reclassify from a CSV lookup table
wbw_reclass_from_file(i = 'landcover.tif',
reclass_file = 'reclass_table.txt',
output = 'landcover_reclassed.tif')
Focal Statistics (Moving Windows)
# Gaussian filter
wbw_gaussian_filter(i = 'dem.tif', output = 'dem_gauss.tif', sigma = 1.5)
# Median filter (feature-preserving)
wbw_median_filter(i = 'dem.tif', output = 'dem_median.tif', filterx = 5, filtery = 5,
sig_digits = 2)
# Feature-preserving smoothing (Zhang et al.)
wbw_feature_preserving_smoothing(dem = 'dem.tif', output = 'dem_fps.tif', filter = 11,
norm_diff = 8.0, num_iter = 3, max_diff = 0.5, zfactor = 1.0)
# Standard deviation in a window
wbw_standard_dev_filter(i = 'dem.tif', output = 'dem_sd.tif', filterx = 11, filtery = 11)
# Percentile filter
wbw_percentile_filter(i = 'dem.tif', output = 'dem_pct75.tif', filterx = 11, filtery = 11,
sig_digits = 2)
Morphological Operations
# Morphological closing (fills gaps in foreground)
wbw_closing(i = 'binary.tif', output = 'binary_close.tif', filterx = 3, filtery = 3)
# Morphological opening (removes small foreground blobs)
wbw_opening(i = 'binary.tif', output = 'binary_open.tif', filterx = 3, filtery = 3)
# Top-hat transform (white)
wbw_tophat_transform(i = 'raster.tif', output = 'tophat.tif', filterx = 11, filtery = 11,
variant = 'white')
Global and Zonal Statistics
# Global statistics (summary of entire raster)
wbw_raster_histogram(i = 'dem.tif', output = 'dem_histogram.html')
# Zonal statistics — mean elevation per watershed zone
wbw_zonal_statistics(i = 'dem.tif',
features = 'watersheds.tif',
output = 'watershed_stats.html',
stat = 'mean',
out_raster = 'watershed_mean_elev.tif')
Raster Overlay Operations
# Weighted sum (multi-criteria suitability)
wbw_weighted_sum(inputs = 'soil.tif;slope.tif;distance.tif',
weights = '0.3;0.5;0.2',
output = 'suitability.tif')
# Weighted overlay (MCE) with constraint
wbw_weighted_overlay(inputs = 'factor1.tif;factor2.tif;factor3.tif',
weights = '0.4;0.4;0.2',
output = 'suitability_mce.tif',
scale_max = 5.0,
scale_min = 0.0,
scale_factor = 1.0)
Resampling and Aggregation
wbw_resample(inputs = 'dem.tif',
output = 'dem_10m.tif',
cell_size = 10.0,
method = 'bilinear')
wbw_aggregate_raster(i = 'dem.tif', output = 'dem_agg.tif', agg_factor = 5,
type = 'mean')
Proximity Analysis
# Euclidean distance
wbw_euclidean_distance(i = 'sources.tif', output = 'euclidean_dist.tif')
# Cost-distance accumulation
wbw_cost_distance(source = 'sources.tif',
cost = 'friction.tif',
out_accum = 'cost_accum.tif',
out_backlink = 'cost_backlink.tif')
# Least-cost path
wbw_cost_pathway(destination = 'destinations.tif',
backlink = 'cost_backlink.tif',
output = 'least_cost_path.tif',
zero_background = FALSE)
# Raster buffer
wbw_buffer_raster(i = 'features.tif', output = 'buffered.tif',
size = 250.0, gridcells = FALSE)
Raster Object Analysis
# Label connected patches (foreground = non-zero)
wbw_clump(i = 'binary.tif', output = 'patches.tif',
diag = TRUE, zero_back = TRUE)
# Remove small patches below area threshold (10 000 m²)
wbw_remove_spurs(i = 'patches.tif', output = 'patches_clean.tif',
iterations = 10)
# Raster area of each patch value
wbw_raster_area(i = 'patches.tif', output = 'patch_area.tif',
out_text = FALSE, units = 'map units', zero_back = TRUE)
Interpolation from Points
pts <- wbw_read_vector('sample_points.shp')
# IDW
wbw_idw_interpolation(i = pts$file_path(),
field = 'ELEV',
output = 'idw_surf.tif',
use_z = FALSE,
weight = 2.0,
radius = 2.5,
min_points = 2,
cell_size = 5.0)
# Natural Neighbour
wbw_natural_neighbour_interpolation(i = pts$file_path(),
field = 'ELEV',
output = 'nn_surf.tif',
use_z = FALSE,
cell_size = 5.0)
# Radial Basis Function
wbw_radial_basis_function_interpolation(i = pts$file_path(),
field = 'ELEV',
output = 'rbf_surf.tif',
num_points = 8,
cell_size = 5.0,
func_type = 'ThinPlateSpline',
poly_order = 1,
weight = 0.1)
Statistical Tests
# Kolmogorov-Smirnov normality test
wbw_ks_test_for_normality(i = 'dem.tif', output = 'ks_normality.html')
# Two-raster paired samples t-test
wbw_paired_sample_t_test(input1 = 'dem_2010.tif', input2 = 'dem_2020.tif',
output = 'ttest.html', num_samples = 1000)
Contour Generation
wbw_contours_from_raster(i = 'dem.tif',
output = 'contours.shp',
interval = 10.0,
base = 0.0,
smooth = 5,
tolerance = 10.0)
WbW-Pro Spotlight: Field Trafficability and Operation Planning
- Problem: Plan equipment timing and field access under variable moisture and weather conditions.
- Tool:
field_trafficability_and_operation_planning - Typical inputs: DEM, normalized soil-moisture raster, optional rainfall-risk raster.
- Typical outputs: Trafficability score raster, operation-class raster, and summary outputs.
result <- s$field_trafficability_and_operation_planning(
dem = 'field_dem.tif',
soil_moisture = 'soil_moisture_norm.tif',
rainfall_forecast = 'rainfall_risk_norm.tif',
output_prefix = 'field_12_trafficability'
)
print(result)
Note: This workflow requires a session initialized with a valid Pro licence.
Complete Raster Analysis Workflow
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/raster_workflow')
dem <- wbw_read_raster('dem.tif')
# 1. Smooth DEM
wbw_feature_preserving_smoothing(dem = dem$file_path(), output = 'dem_smooth.tif', filter = 11,
norm_diff = 8.0, num_iter = 3, max_diff = 0.5)
# 2. Slope
wbw_slope(dem = 'dem_smooth.tif', output = 'slope.tif', units = 'degrees')
# 3. Reclassify slope into erosion risk classes
wbw_reclass(i = 'slope.tif', output = 'slope_risk.tif',
reclass_vals = '0;5;1;5;15;2;15;30;3;30;90;4')
# 4. Euclidean distance to water
wbw_euclidean_distance(i = 'water_bodies.tif', output = 'dist_water.tif')
# 5. Multi-criteria suitability overlay
wbw_weighted_sum(inputs = paste('slope_risk.tif', 'dist_water.tif', 'soil_type.tif', sep=';'),
weights = '0.5;0.3;0.2',
output = 'suitability.tif')
# 6. Reclassify to binary mask (suitability > threshold)
wbw_raster_calculator(statement = "if('suitability.tif' >= 3.0, 1.0, 0.0)",
output = 'suitable_areas.tif')
# 7. Generate contours
wbw_contours_from_raster(i = dem$file_path(), output = 'contours_10m.shp',
interval = 10.0, base = 0.0, smooth = 3)
cat('Raster analysis complete.\n')
Tips
- Choose your data type: Use integers for categorical data (land cover, classifications) to minimize file size and computation time. Use floating-point (Float32 or Float64) only for continuous measurements (elevation, temperature, probability).
- Set NoData explicitly: Ensure your source rasters carry a valid NoData value. Missing NoData declarations can corrupt statistics and focal operations by including invalid pixels as zeros or false elevations.
- Compress carefully: LZW and DEFLATE compression work well for most data; avoid if you need rapid random access to interior tiles. Use COMPRESS=JPEG for photographic data only (lossy, unsuitable for analysis).
- Focal operations require buffering: Cells at raster edges cannot compute full neighbourhoods. Use
expand()or accept edge effects; never assume borders are valid in derivative rasters. - Zonal statistics are only as good as your zones: Ensure zone boundaries are topologically clean (no overlaps, no gaps). Overlapping zones cause double-counting; gaps cause NoData regions in output.
- Reclassification is fast but risky: Always validate output distributions (histogram) after reclassification. Off-by-one errors in class boundaries can silently produce wrong land-cover or suitability classes.
- Memory is the constraint for large rasters: Tiles > 2 GB require out-of-core or streaming processing. Use
read_by_block()for large files; avoid loading entire rasters into memory if they exceed available RAM. - Upsampling introduces artifacts: Never upsample (finer resolution) without a valid interpolation method. Nearest-neighbour upsampling creates blocky artefacts; bilinear is smoother but may violate data range (e.g. probability values > 1.0).
Vector Analysis
Vector GIS analysis in WbW-R covers attribute management, geometric measurement, shape analysis, spatial overlay, proximity tools, spatial joins, and vector-to-raster conversion. All computation runs in the Whitebox backend through wbw_<tool>(...) wrappers (or wbw_run_tool(...) when tool IDs are dynamic); R handles session management, sequencing, and result processing.
Core Concepts
Vector analysis depends on understanding these core concepts:
- Feature geometry: Points (single coordinate pairs), lines (ordered sequences of coordinate pairs), and polygons (rings of coordinates forming closed boundaries). Each feature type supports different analyses.
- Topology: The spatial relationships between features (adjacency, containment, intersection). Topological errors (overshoots, undershoots, self-intersections) corrupt overlay operations and topology queries.
- Attribute table: The database associated with each feature layer, carrying descriptive fields and values. Attribute queries filter features; joins link external tables.
- Spatial index: Internal index structure (R-tree or quadtree) enabling fast spatial queries. Used by intersection, containment, and proximity operations. Always build spatial indices on frequently queried layers.
- Envelope (bounding box): The minimum rectangular boundary of a feature or layer; used for quick spatial culling before geometry tests.
- Buffer: A polygon created at a fixed distance around a feature. Buffers model proximity zones and are fundamental to distance-based analysis.
- Overlay (intersection, union, difference): Combining two polygon layers to create a new layer. Union merges boundaries; intersection keeps overlapping area only; difference removes one layer from another.
- Dissolve (aggregation): Merging adjacent features with identical attribute values, creating larger aggregate features. Reduces feature count and simplifies geometry.
- Spatial join: Associating features from one layer with features from another based on spatial relationship (overlap, containment, proximity). Assigns attributes across layers.
- Proximity analysis: Finding nearest features, distances, or connectivity. Foundation for network analysis, market analysis, and accessibility studies.
Session Setup
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/vector')
Reading and Writing Vectors
polys <- wbw_read_vector('parcels.shp')
lines <- wbw_read_vector('roads.shp')
points <- wbw_read_vector('samples.shp')
meta <- polys$metadata()
cat('Feature count:', meta$num_features, '\n')
cat('Geometry type:', meta$geom_type, '\n')
cat('CRS:', meta$wkt, '\n')
Attribute Management
# Add a new field
wbw_add_field(i = polys$file_path(),
output = 'parcels_v2.shp',
field_name = 'AREA_HA',
field_type = 'Float')
# Delete an unwanted field
wbw_delete_field(i = 'parcels_v2.shp',
output = 'parcels_v3.shp',
field_name = 'OLD_FIELD')
# Rename a field
wbw_rename_field(i = 'parcels_v3.shp',
output = 'parcels_v4.shp',
input_field = 'AREA_HA',
output_field = 'HECTARES')
# Extract features by attribute value
wbw_extract_by_attribute(i = polys$file_path(),
output = 'large_parcels.shp',
field = 'AREA_M2',
filter_value = 10000.0,
filter_type = 'Greater Than')
Geometric Measurements
# Add area, perimeter, and basic shape metrics to polygon attribute table
wbw_add_polygon_coordinates_to_table(i = polys$file_path(),
output = 'parcels_geom.shp')
# Polygon shape index — compactness
wbw_compactness_ratio(i = polys$file_path(), output = 'parcels_compact.shp')
wbw_elongation_ratio(i = polys$file_path(), output = 'parcels_elong.shp')
wbw_related_circumscribing_circle(i = polys$file_path(), output = 'parcels_rcc.shp')
wbw_patch_orientation(i = polys$file_path(), output = 'parcels_orient.shp')
wbw_radius_of_gyration(i = polys$file_path(), output = 'parcels_rog.shp')
Geometric Operations
# Centroids
wbw_centroid_vector(i = polys$file_path(), output = 'centroids.shp')
# Convex hull
wbw_convex_hull(i = polys$file_path(), output = 'convex_hulls.shp')
# Minimum bounding envelopes
wbw_minimum_bounding_envelope(i = polys$file_path(), output = 'mbe.shp')
# Smooth vector polygons
wbw_smooth_vectors(i = polys$file_path(), output = 'parcels_smooth.shp', filter = 5)
# Simplify features (Douglas-Peucker)
wbw_simplify_line_or_polygon(i = polys$file_path(), output = 'parcels_simplified.shp',
dist = 5.0, remove_spurs = TRUE, errors_only = FALSE)
# Dissolve polygons on field value
wbw_dissolve(i = polys$file_path(), output = 'parcels_dissolved.shp',
field = 'LAND_USE', snap_tol = 0.001)
Spatial Overlay
# Clip
wbw_clip(i = polys$file_path(),
clip = 'study_area.shp',
output = 'clipped.shp',
snap_tol = 0.001)
# Intersect
wbw_intersect(i = polys$file_path(),
overlay = 'zones.shp',
output = 'intersection.shp',
snap_tol = 0.001)
# Erase
wbw_erase(i = polys$file_path(),
erase = 'exclusion_areas.shp',
output = 'erased.shp',
snap_tol = 0.001)
# Union
wbw_union(i = polys$file_path(),
overlay = 'other_layer.shp',
output = 'union.shp',
snap_tol = 0.001)
# Symmetrical difference
wbw_symmetrical_difference(i = polys$file_path(),
overlay = 'other_layer.shp',
output = 'symdiff.shp',
snap_tol = 0.001)
Proximity Analysis
# Euclidean distance from vector features
wbw_vector_points_to_raster(i = points$file_path(), output = 'points.tif', field = 'FID',
assign = 'last', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')
wbw_euclidean_distance(i = 'points.tif', output = 'euclidean_dist.tif')
# Voronoi diagram
wbw_voronoi_diagram(i = points$file_path(), output = 'voronoi.shp')
Select by Location
wbw_select_by_location(input = polys$file_path(),
select = 'stream_buffer.shp',
output = 'parcels_near_streams.shp',
condition = 'within')
Spatial Join
wbw_spatial_join(target = points$file_path(),
join = polys$file_path(),
output = 'points_joined.shp',
condition = 'within',
attr = 'first')
Aggregation Strategies
# Join and aggregate field values from nearest polygon
for (stat in c('count', 'sum', 'mean', 'min', 'max')) {
wbw_spatial_join(target = 'zones.shp',
join = 'observations.shp',
output = paste0('zones_', stat, '.shp'),
condition = 'contains',
attr = stat)
}
Vector-to-Raster Conversion
# Rasterize polygon layer
wbw_vector_polygons_to_raster(i = polys$file_path(), output = 'parcels_raster.tif',
field = 'LAND_USE_ID', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')
# Rasterize line layer
wbw_vector_lines_to_raster(i = lines$file_path(), output = 'roads_raster.tif',
field = 'FID', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')
# Rasterize points
wbw_vector_points_to_raster(i = points$file_path(), output = 'points_raster.tif',
field = 'VALUE', assign = 'max', nodata = TRUE, cell_size = 5.0)
Field Calculator
# Compute area in hectares and write to existing field
wbw_field_calculator(i = polys$file_path(),
output = 'parcels_calc.shp',
field_name = 'AREA_HA',
py_statement = '@Area / 10000.0',
analyse = FALSE)
Point Cluster Analysis
# Kernel density estimation (heat map)
wbw_kernel_density_estimation(i = points$file_path(),
output = 'heatmap.tif',
bandwidth = 200.0,
kernel_type = 'quartic',
cell_size = 5.0,
base = 'dem.tif')
# Hexagonal binning
wbw_create_hexagonal_vector_grid(i = 'study_area.shp', output = 'hex_grid.shp', width = 500.0, orientation = 'horizontal')
wbw_spatial_join(target = 'hex_grid.shp', join = points$file_path(),
output = 'hex_counts.shp', condition = 'contains', attr = 'count')
WbW-Pro Spotlight: Market Access and Site Intelligence
- Problem: Rank candidate sites using repeatable network-access and demand logic.
- Tool:
market_access_and_site_intelligence_workflow - Typical inputs: Network, existing sites, candidate sites, demand surface, drive-time rings.
- Typical outputs: Catchment polygons, competitive-overlap layer, candidate-ranking CSV, executive summary JSON.
result <- s$run_tool(
'market_access_and_site_intelligence_workflow',
list(
network = 'street_network.shp',
sites_existing = 'existing_sites.shp',
sites_candidates = 'candidate_sites.shp',
demand_surface = 'demand_points.shp',
ring_costs = c(5.0, 10.0, 15.0),
catchments_output = 'candidate_catchments.shp',
overlap_analysis_output = 'competitive_overlap.shp',
candidate_rank_csv = 'candidate_rankings.csv',
executive_summary_json = 'market_summary.json'
)
)
print(result)
Note: This workflow requires a session initialized with a valid Pro licence.
Complete Vector Analysis Workflow
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/vector_workflow')
parcels <- wbw_read_vector('parcels.shp')
study <- wbw_read_vector('study_area.shp')
streams <- wbw_read_vector('streams.shp')
# 1. Clip to study area
wbw_clip(i = parcels$file_path(), clip = study$file_path(),
output = 'parcels_clipped.shp', snap_tol = 0.001)
# 2. Add shape metrics
wbw_compactness_ratio(i = 'parcels_clipped.shp', output = 'parcels_shape.shp')
# 3. Buffer streams and intersect with parcels
wbw_buffer_raster(i = streams$file_path(), output = 'stream_buf.shp',
size = 30.0, gridcells = FALSE)
wbw_intersect(i = 'parcels_shape.shp', overlay = 'stream_buf.shp',
output = 'riparian_parcels.shp', snap_tol = 0.001)
# 4. Dissolve by land-use class
wbw_dissolve(i = 'riparian_parcels.shp', output = 'riparian_dissolved.shp',
field = 'LAND_USE', snap_tol = 0.001)
# 5. Rasterize result
wbw_vector_polygons_to_raster(i = 'riparian_dissolved.shp', output = 'riparian.tif',
field = 'LAND_USE_ID', nodata = TRUE, cell_size = 5.0)
cat('Vector analysis complete.\n')
Tips
- Always validate topology before analysis: Run
check_vector_topology()to detect overshoots, undershoots, self-intersections, and sliver polygons. Topological errors propagate through overlay and spatial join operations. - Build spatial indices on large layers: Large datasets (> 10,000 features) benefit from spatial indexing. Use
build_spatial_index()explicitly before repeated spatial queries; operations like containment or proximity are fast with indices. - Choose your overlay operation carefully: Union retains all boundaries and combines attributes (can create many small slivers). Intersection keeps only overlapping regions. Difference retains Polygon A minus Polygon B. Test on small subsets first.
- Dissolve reduces feature count and file size: After overlay, dissolve by ownership or category to collapse unnecessary edges. Dissolved layers render faster and are cleaner for publication.
- Spatial joins are sensitive to alignment: Ensure both input layers use the same CRS and are free of topology errors. Reproject to equal-area projection before computing buffer distances or areas for analysis.
- Buffer distance and units matter: Buffer distances are in map units (meters, feet, degrees). Use an equal-area projection if precise areas or distances are critical. Negative buffers can collapse small polygons (inset); test with small buffer values first.
- Attribute table size is a memory constraint: Attribute tables with millions of rows and dozens of fields consume RAM. Export to CSV or database for large tables; work with summaries or samples when memory is limited.
- Point-in-polygon operations scale with complexity: Containment tests are O(n) per point; on large datasets (> 1 million points), consider spatial index binning or vector-to-raster conversion for speed.
Online Data Downloads
This chapter covers downloading vector data directly from online providers into WbW-R workflows. The first provider is OpenStreetMap (OSM) through the Overpass API using tool ID download_osm_vector.
Scope and Current Provider
Current online-data tool:
download_osm_vector
In R, use wbw_run_tool(...) with explicit argument lists.
Quick Start
library(whiteboxworkflows)
session <- wbw_session()
result <- wbw_run_tool(
"download_osm_vector",
args = list(
west = -80.54,
south = 43.41,
east = -80.47,
north = 43.47,
filter_preset = "roads",
include_points = FALSE,
include_lines = TRUE,
include_polygons = FALSE,
output = "kitchener_roads.geojson"
),
session = session
)
print(result$outputs$path)
Presets and Filters
Preset classes:
allroadsbuildingswaterlandusetrailsparksrailamenitiesboundariestransitpoi
Optional custom filters:
filter_key = "amenity"filter_key_value = "amenity=school"
When custom filters are provided, they are used instead of preset filtering.
Geometry Controls
Use geometry toggles to reduce payload and output complexity:
include_pointsinclude_linesinclude_polygons
Examples:
- Streets only: points off, lines on, polygons off
- Building footprints: points off, lines off, polygons on
Phase 2 options:
split_output_by_geometry = TRUEwrites separate files with_points,_lines, and_polygonssuffixes.
Caching and Provenance
Use optional caching when running repeated AOI/filter requests:
cache_dir = ".wbw_cache/osm"cache_ttl_hours = 24(set0to disable TTL checks)
Use provenance_output to write a JSON sidecar with endpoint, bbox, filters, feature counts, and cache usage metadata.
wbw_run_tool(
"download_osm_vector",
args = list(
west = -80.54,
south = 43.41,
east = -80.47,
north = 43.47,
filter_preset = "trails",
include_points = FALSE,
include_lines = TRUE,
include_polygons = FALSE,
split_output_by_geometry = TRUE,
cache_dir = ".wbw_cache/osm",
cache_ttl_hours = 24,
provenance_output = "kitchener_trails_provenance.json",
output = "kitchener_trails.geojson"
),
session = session
)
Projection and Output
Rules:
- Query extent is in EPSG:4326 (longitude/latitude).
- Set
input_extent_epsgto provide west/south/east/north in another CRS (the bbox is transformed to EPSG:4326 before querying Overpass). - Use
output_epsgto reproject output. - Output format is inferred from output extension.
Endpoint selection:
overpass_profilesupports:main,kumi,fr,custom.overpass_urloverrides the selected profile URL when provided.
Large-AOI chunking:
chunk_large_aoi = TRUE(default) automatically tiles large query extents.chunk_max_area_deg2 = 4.0controls maximum area per chunk.max_chunk_count = 64caps the number of generated chunk requests.chunk_parallel_requests = 1(default) controls bounded parallel chunk fetch; set >1 to fetch chunks concurrently.
wbw_run_tool(
"download_osm_vector",
args = list(
west = -80.54,
south = 43.41,
east = -80.47,
north = 43.47,
filter_preset = "buildings",
include_points = FALSE,
include_lines = FALSE,
include_polygons = TRUE,
output_epsg = 32617,
output = "kitchener_buildings_utm17n.gpkg"
),
session = session
)
Operational Guidance
Overpass public endpoints have rate limits and shared infrastructure constraints.
Recommended practice:
- Use smaller AOIs
- Prefer thematic filters
- Cap responses with
max_elements - Increase
timeout_secondsonly when needed
Attribution and Licensing
OSM data are licensed under ODbL. Ensure appropriate OpenStreetMap attribution and verify obligations for redistributed outputs.
Companion Example
See:
crates/wbw_r/examples/osm_download_vector.R
Network Analysis
Whitebox Next Gen R brings the same deep network-analysis engine as the Python API: topology auditing, point-to-point routing, service areas, closest facility, OD cost matrices, location-allocation, accessibility metrics, sensitivity analysis, multimodal transit modelling, map matching, and fleet dispatch optimization. This chapter walks through those capabilities in order, mirroring the prepare-then-query-then-scale-up workflow used in practice.
Session Setup
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/network')
Core Concepts You Should Know First
Before running tools, it helps to align on a few core terms used throughout this chapter.
- Network: A graph made of edges (road or transit segments) and nodes (intersections, stops, junctions).
- Cost / impedance: The value minimized during routing. This can be distance, travel time, generalized cost, or another friction metric.
- Origin / destination (OD): Origins are trip start points; destinations are trip end points.
- OD matrix: A table of costs from many origins to many destinations. This is the standard structure for accessibility, market access, and assignment analyses.
- Shortest path: The minimum-cost path between one origin and one destination.
- K-shortest paths: The best k distinct alternatives between the same OD pair, useful for resilience and choice modelling.
- Service area (isochrone): The portion of the network reachable from an origin within a cost threshold (for example 10 minutes).
- Closest facility: For each incident or demand point, the least-cost route to the nearest candidate facility on the network.
- Location-allocation: Selecting facility sites that optimize a demand objective, such as minimizing total travel cost or maximizing coverage.
- Connectivity: Whether all required origins and destinations are in the same connected component. Disconnected components cause failed routes.
- Node degree: The number of edges touching a node. Degree supports basic network centrality interpretation and QA for odd junction structure.
- Multimodal routing: Pathfinding across multiple transport modes (walk/bus/rail) with transfer penalties and mode constraints.
- Map matching: Snapping GPS trajectories to the most plausible sequence of network edges.
If you keep these definitions in mind, each workflow step below becomes easier to interpret and validate.
Step 1 — Prepare and Audit the Network
Every routing workflow should begin with a topology check.
Topology Audit
network_topology_audit scans a line network for dead ends, pseudo-nodes,
overshoots, and isolated islands, writing each flagged location as a point
feature and optionally producing a text report.
wbw_network_topology_audit(i = 'roads.shp',
output = 'topology_errors.shp',
snap_tolerance = 0.5,
one_way_field = 'ONEWAY',
report = 'topology_report.txt')
Review topology_report.txt to understand the error count and class
distribution before continuing.
Connected Components
network_connected_components assigns a component identifier to every edge so
you can identify and resolve disconnected islands before running OD queries.
wbw_network_connected_components(i = 'roads.shp',
output = 'roads_components.shp',
snap_tolerance = 0.5)
# Edges not in the dominant component are candidates for removal or bridging.
Node Degree
network_node_degree writes the degree of every node as a point layer.
Degree-1 nodes are dead ends; unusually high-degree nodes may be duplicates.
wbw_network_node_degree(i = 'roads.shp',
output = 'node_degree.shp',
snap_tolerance = 0.5)
Step 2 — Shortest Path and Alternatives
Single Shortest Path
shortest_path_network finds the minimum-cost path between two coordinates
using Dijkstra's algorithm. Supply edge_cost_field for travel-time routing;
omit it to route by Euclidean arc length.
wbw_shortest_path_network(i = 'roads.shp',
output = 'route_shortest.shp',
start_x = 454230.0,
start_y = 4823150.0,
end_x = 458900.0,
end_y = 4819700.0,
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES',
one_way_field = 'ONEWAY')
Turn penalties model the real cost of left, right, and U-turns in dense urban networks.
wbw_shortest_path_network(i = 'roads.shp',
output = 'route_with_turns.shp',
start_x = 454230.0,
start_y = 4823150.0,
end_x = 458900.0,
end_y = 4819700.0,
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES',
one_way_field = 'ONEWAY',
turn_penalty = 0.5,
u_turn_penalty = 3.0,
forbid_u_turns = TRUE)
K-Shortest Alternative Paths
k_shortest_paths_network returns the k least-cost distinct paths for the
same endpoints — useful for resilience analysis and presenting alternatives to
planners.
wbw_k_shortest_paths_network(i = 'roads.shp',
output = 'routes_k3.shp',
start_x = 454230.0,
start_y = 4823150.0,
end_x = 458900.0,
end_y = 4819700.0,
k = 3L,
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES',
one_way_field = 'ONEWAY')
# Each feature carries a PATH_RANK attribute (1 = shortest).
Step 3 — Service Areas
network_service_area delineates every part of the network reachable within a
cost threshold from one or more origins. Typical uses include drive-time
catchments for emergency services, walking isochrones around transit stops, and
delivery zones.
wbw_network_service_area(i = 'roads.shp',
origins = 'fire_stations.shp',
output = 'fire_catchment_5min.shp',
max_cost = 5.0,
snap_tolerance = 20.0,
output_mode = 'polygon',
polygon_merge_origins = TRUE,
edge_cost_field = 'MINUTES',
one_way_field = 'ONEWAY')
Use output_mode = 'edges' to retain actual road arcs inside the catchment
rather than fill a polygon — more appropriate when the network is sparse.
Step 4 — Closest Facility
closest_facility_network routes each incident point to its nearest facility,
measuring cost along the network rather than in straight-line distance. This
is the core tool for emergency-response siting, healthcare access studies, and
school-catchment delineation.
wbw_closest_facility_network(i = 'roads.shp',
incidents = 'accidents.shp',
facilities = 'hospitals.shp',
output = 'routes_to_hospital.shp',
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES',
one_way_field = 'ONEWAY')
# Output carries INCIDENT_FID, FACILITY_FID, and COST fields per route.
Step 5 — OD Cost Matrix and Batch Route Export
OD Cost Matrix
network_od_cost_matrix solves all pairwise paths and writes results to a CSV.
Each row contains an origin identifier, a destination identifier, and the
network cost between them.
wbw_network_od_cost_matrix(i = 'roads.shp',
origins = 'schools.shp',
destinations = 'libraries.shp',
output = 'od_costs.csv',
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES',
one_way_field = 'ONEWAY')
od <- read.csv('od_costs.csv')
# Minimum travel time from each school to any library:
print(aggregate(COST ~ ORIGIN_FID, data = od, FUN = min))
Materializing OD Routes as Geometry
network_routes_from_od creates the actual path lines between OD pairs.
wbw_network_routes_from_od(i = 'roads.shp',
origins = 'schools.shp',
destinations = 'libraries.shp',
output = 'od_routes.shp',
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES',
one_way_field = 'ONEWAY')
Step 6 — Location-Allocation
location_allocation_network solves the p-median problem: given candidate
facility locations and weighted demand points, which p facilities minimise
total travel cost? Directly supports clinic siting, school consolidation, and
warehouse network design.
wbw_location_allocation_network(i = 'roads.shp',
demand_points = 'demand_points.shp',
facilities = 'candidate_sites.shp',
output = 'selected_facilities.shp',
facility_count = 4L,
solver_mode = 'minimize_impedance',
demand_weight_field = 'POP',
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES')
# SELECTED == 1 on the four chosen candidate sites.
# Demand points carry ASSIGNED_FID linking each to its nearest selected site.
Supported solver modes: minimize_impedance (p-median), maximize_coverage,
and maximize_attendance.
Step 7 — Network Accessibility Metrics
compute_network_accessibility() is exposed as a typed facade wrapper. It
computes a gravity-model or cumulative-opportunity accessibility score per
origin — a standard indicator in transport equity analysis.
residents <- wbw_read_vector('resident_centroids.shp', session = s)
supermarkets <- wbw_read_vector('supermarkets.shp', session = s)
result <- s$compute_network_accessibility(
input = 'roads.shp',
origins = residents$file_path(),
destinations = supermarkets$file_path(),
output = 'food_accessibility.shp',
edge_cost_field = 'MINUTES',
impedance_cutoff = 30.0,
decay_function = 'negative_exponential',
decay_parameter = 0.1
)
# Each origin point carries an ACCESS_SCORE field.
Step 8 — OD Sensitivity Analysis
analyze_od_cost_sensitivity() quantifies how stable OD costs are under
stochastic perturbation of edge weights — useful for stress-testing a routing
model against travel-time uncertainty or hypothetical road-closure scenarios.
result <- s$analyze_od_cost_sensitivity(
input = 'roads.shp',
origins = 'schools.shp',
destinations = 'libraries.shp',
output = 'od_sensitivity.shp',
edge_cost_field = 'MINUTES',
impedance_disturbance_range = 0.2, # ±20 % perturbation
monte_carlo_samples = 500L
)
Step 9 — Multimodal Analysis
Whitebox Next Gen supports networks with a MODE field on each edge (e.g.
walk, cycle, bus, rail). Multimodal tools honour mode-specific speeds,
transfer penalties, and time-of-day profiles.
Multimodal OD Scenarios
analyze_multimodal_od_scenarios() runs a batch of named scenarios from a CSV,
each with different mode allowances, speed overrides, or departure times.
transit_net <- wbw_read_vector('transit_network.shp', session = s)
bus_stops <- wbw_read_vector('bus_stops.shp', session = s)
destinations <- wbw_read_vector('key_destinations.shp', session = s)
result <- s$analyze_multimodal_od_scenarios(
input = transit_net$file_path(),
origins = bus_stops$file_path(),
destinations = destinations$file_path(),
output = 'multimodal_od_scenarios.shp',
mode_field = 'MODE',
allowed_modes = 'walk,bus,rail',
transfer_penalty = 3.0,
edge_cost_field = 'MINUTES',
scenario_bundle_csv = 'scenarios.csv',
departure_time = '08:00',
temporal_mode = 'scheduled'
)
Exporting Multimodal Route Geometry
export_multimodal_routes_for_od_pairs() materializes the optimal multimodal
route for each OD pair with per-segment mode attributes.
result <- s$export_multimodal_routes_for_od_pairs(
input = transit_net$file_path(),
origins = bus_stops$file_path(),
destinations = destinations$file_path(),
output = 'multimodal_routes.shp',
mode_field = 'MODE',
allowed_modes = 'walk,bus,rail',
transfer_penalty = 3.0,
edge_cost_field = 'MINUTES'
)
Step 10 — Map Matching
map_matching_v1 snaps a raw GPS trajectory to the most plausible sequence of
network edges using a hidden Markov model with candidate expansion. It is the
first step in any floating-vehicle data or probe-data workflow.
wbw_map_matching_v1(i = 'roads.shp',
trajectory_points = 'gps_probe_points.shp',
timestamp_field = 'TIMESTAMP',
output = 'matched_route.shp',
search_radius = 30.0,
candidate_k = 5L,
snap_tolerance = 10.0,
edge_cost_field = 'MINUTES',
matched_points_output = 'matched_points.shp',
match_report = 'match_report.txt')
Step 11 — Fleet and Vehicle Routing (Pro)
fleet_routing_and_dispatch_optimizer solves CVRP and VRPTW problems: given
a fleet of vehicles and a set of service or delivery stops, it assigns and
sequences routes to minimise total cost subject to capacity and time-window
constraints.
result <- s$run_tool(
'fleet_routing_and_dispatch_optimizer',
list(
network = 'roads.shp',
depots = 'depots.shp',
stops = 'delivery_stops.shp',
vehicles_csv = 'fleet.csv',
route_output = 'fleet_routes.shp',
route_kpis_csv_output = 'fleet_kpis.csv',
edge_cost_field = 'MINUTES',
one_way_field = 'ONEWAY',
vrp_mode = 'VRPTW'
)
)
kpis <- read.csv('fleet_kpis.csv')
print(kpis)
Note: This tool requires a session initialised with a valid Pro licence.
Complete Workflow: Emergency Response Planning
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/emergency_planning')
# 1. Audit topology.
wbw_network_topology_audit(i = 'roads.shp', output = 'topology_errors.shp',
snap_tolerance = 0.5, report = 'topology_report.txt')
# 2. Five-minute drive catchments from existing stations.
wbw_network_service_area(i = 'roads.shp',
origins = 'fire_stations.shp',
output = 'existing_catchment_5min.shp',
max_cost = 5.0,
output_mode = 'polygon',
polygon_merge_origins = TRUE,
edge_cost_field = 'MINUTES',
snap_tolerance = 20.0)
# 3. Route historical incidents to nearest existing station.
wbw_closest_facility_network(i = 'roads.shp',
incidents = 'historical_incidents.shp',
facilities = 'fire_stations.shp',
output = 'incident_routes.shp',
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES')
# 4. Find two new station sites that maximise coverage.
wbw_location_allocation_network(i = 'roads.shp',
demand_points = 'historical_incidents.shp',
facilities = 'candidate_stations.shp',
output = 'new_station_sites.shp',
facility_count = 2L,
solver_mode = 'maximize_coverage',
snap_tolerance = 20.0,
edge_cost_field = 'MINUTES')
Tips
- Always run
network_topology_auditfirst — even one disconnected segment can cause a path query to return no result without an explicit error. - Use
network_connected_componentsto confirm all origins and destinations belong to the same component before running OD queries. - Supply
edge_cost_fieldpointing to a travel-time column for realistic routing; omit it only for pure geometric distance problems. - For scheduled transit, pass
temporal_cost_profileanddeparture_timeto load time-of-day speeds. - The
fleet_routing_and_dispatch_optimizerPro tool requires a session initialised with a valid Pro licence.
Linear Referencing
Linear referencing in WbW-R is the practice of locating events, measurements, or observations along a route using a distance-based measure rather than absolute coordinates. A road pavement database records a pothole at 2.4 km along route R-12; a pipeline corridor flags an anomaly at 847 m along a trunk line. Whitebox Next Gen provides the tools to locate observations onto measured routes, place events from tabular or spatial sources, and — with a Pro licence — audit the consistency of large linear asset datasets.
Session Setup
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/linear_referencing')
Core Concepts
A linear-referencing workflow has three parts:
- Routes — line features defining the measurement axis. Each route has a unique identifier and M-values (cumulative distance from its start).
- Measures — the distance value used to locate a position along a route.
- Events — point or line observations located by (route_id, measure) or (route_id, from_measure, to_measure) pairs.
Common applications include road-pavement condition assessment, pipeline integrity monitoring, trail difficulty reporting, environmental transect sampling, and GPS probe data quality control.
Step 1 — Understand Your Route Geometry
Routes must be single-part polylines with a consistent digitizing direction. Before dropping events, confirm:
- Each route has a unique identifier stored in a field (e.g.
ROUTE_ID). - No route self-intersects.
- Routes forming a corridor are merged into one feature per identifier.
Use snap_endnodes and merge_line_segments via wbw_run_tool to clean
ragged street-centreline inputs before treating them as routes.
Step 2 — Locate Points Along Routes
locate_points_along_routes takes an existing point layer and finds the
nearest position on each matching route, writing back the M-value (measure)
for every point. Use this when field teams have collected GPS observation points
and you need to convert them to route-distance offsets.
wbw_locate_points_along_routes(routes = 'roads_measured.shp',
points = 'field_observations.shp',
output = 'observations_located.shp',
max_offset_distance = 15.0)
# Output adds ROUTE_ID, MEASURE, and OFFSET fields to every input point.
The MEASURE field is the along-route distance from the route start.
OFFSET is the perpendicular snap distance. Points beyond
max_offset_distance are excluded from the output.
Step 3 — Place Events from a Table
Point Events
route_event_points_from_table reads a CSV where each row specifies a route
identifier and a measure value, and places a point feature at that position.
This is the standard import path for lab results, inspection records, or
maintenance logs stored in external databases.
# pavement_defects.csv columns: ROUTE_ID, MEASURE, SEVERITY, NOTES
wbw_route_event_points_from_table(routes = 'roads_measured.shp',
events = 'pavement_defects.csv',
event_route_field = 'ROUTE_ID',
measure_field = 'MEASURE',
output = 'pavement_defects_located.shp')
Line (Interval) Events
route_event_lines_from_table reads FROM_MEASURE and TO_MEASURE columns
to produce line segments — useful for pavement condition ratings, speed zones,
or any attribute that applies to a stretch of route rather than a single point.
# pavement_condition.csv columns: ROUTE_ID, FROM_MEASURE, TO_MEASURE, IRI, CONDITION
wbw_route_event_lines_from_table(routes = 'roads_measured.shp',
events = 'pavement_condition.csv',
event_route_field = 'ROUTE_ID',
from_measure_field = 'FROM_MEASURE',
to_measure_field = 'TO_MEASURE',
output = 'pavement_condition_segments.shp')
Step 4 — Place Events from a Spatial Layer
When your event data is already a vector layer rather than a plain table, use
the _from_layer variants. These carry across all attributes of the source
feature and can optionally write the original FID and XY into the output.
Point Events from a Layer
wbw_route_event_points_from_layer(routes = 'roads_measured.shp',
events = 'manhole_inspections.shp',
event_route_field = 'ROUTE_ID',
measure_field = 'MEASURE',
output = 'manholes_on_routes.shp',
write_event_fid = TRUE,
write_event_xy = TRUE)
Line Events from a Layer
wbw_route_event_lines_from_layer(routes = 'roads_measured.shp',
events = 'speed_zone_events.shp',
event_route_field = 'ROUTE_ID',
from_measure_field = 'FROM_M',
to_measure_field = 'TO_M',
output = 'speed_zones_on_routes.shp',
write_event_fid = TRUE)
Step 5 — Linear Asset Governance (Pro)
route_event_governance_for_linear_assets audits a complete linear asset
dataset for measure gaps, overlaps, duplicate events, orphaned route
references, and out-of-range measures. It produces a flagged event output and
a structured report — essential for maintaining the integrity of a production
road or utility asset database.
result <- s$run_tool(
'route_event_governance_for_linear_assets',
list(
routes = 'roads_measured.shp',
events = 'pavement_condition.shp',
route_id_field = 'ROUTE_ID',
from_measure_field = 'FROM_MEASURE',
to_measure_field = 'TO_MEASURE',
flagged_output = 'event_flags.shp',
report = 'governance_report.csv'
)
)
flags <- read.csv('governance_report.csv')
print(table(flags$ERROR_CLASS))
Note: This tool requires a session initialised with a valid Pro licence.
Complete Workflow: Road Pavement Assessment
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/pavement_assessment')
# Step 1: Snap GPS observation points onto routes and extract M-values.
wbw_locate_points_along_routes(routes = 'roads_measured.shp',
points = 'field_inspection_gps.shp',
output = 'gps_on_routes.shp',
max_offset_distance = 10.0)
# Step 2: Place point defect records from the inspection database.
wbw_route_event_points_from_table(routes = 'roads_measured.shp',
events = 'defect_records.csv',
event_route_field = 'ROUTE_ID',
measure_field = 'MEASURE',
output = 'defects_located.shp')
# Step 3: Place condition rating intervals.
wbw_route_event_lines_from_table(routes = 'roads_measured.shp',
events = 'condition_ratings.csv',
event_route_field = 'ROUTE_ID',
from_measure_field = 'FROM_M',
to_measure_field = 'TO_M',
output = 'condition_segments.shp')
# Step 4 (Pro): Audit the condition layer for gaps and overlaps.
result <- s$run_tool(
'route_event_governance_for_linear_assets',
list(
routes = 'roads_measured.shp',
events = 'condition_segments.shp',
route_id_field = 'ROUTE_ID',
from_measure_field = 'FROM_M',
to_measure_field = 'TO_M',
flagged_output = 'condition_flags.shp',
report = 'governance_report.csv'
)
)
cat('Governance report:', 'governance_report.csv', '\n')
Tips
- Routes must have a consistent digitizing direction. Run
snap_endnodesand confirm that all segments in a route are digitized in the same direction before locating events. locate_points_along_routesexcludes points beyondmax_offset_distance. Inspect unmatched points to identify GPS outliers or route coverage gaps.- Use
route_event_points_from_tableandroute_event_lines_from_tablefor bulk imports from asset management databases where locations are already stored as route+measure pairs. - Use the
_from_layervariants when existing vector event layers already carry route identifier and measure fields. - The
route_event_governance_for_linear_assetsPro tool scales to production databases with millions of events and produces actionable error reports for integration into maintenance management systems.
Script Index
Use this index to map manual chapters to runnable examples.
Treat this chapter as the transition from concept to implementation. When building a new workflow, start from the closest example script and adapt it with your paths and parameters rather than composing from isolated snippets. This usually produces safer and more maintainable scripts.
Core Session and Discovery
crates/wbw_r/examples/generated_session_example.R- baseline session creation and discovery flowcrates/wbw_r/r-package/whiteboxworkflows/inst/examples/generated_session_example.R- package-scoped session bootstrap patterncrates/wbw_r/r-package/whiteboxworkflows/inst/examples/golden_path_workflows.R- representative end-to-end workflow chain
Raster Workflows
crates/wbw_r/r-package/whiteboxworkflows/inst/examples/raster_object_quickstart.R- object lifecycle and metadata checkscrates/wbw_r/r-package/whiteboxworkflows/inst/examples/raster_array_roundtrip.R- array conversion and persistence roundtrip
Vector Workflows
crates/wbw_r/r-package/whiteboxworkflows/inst/examples/vector_object_quickstart.R- schema-first vector workflow baselinecrates/wbw_r/examples/vector_topojson_roundtrip.R- TopoJSON read/write roundtrip workflowcrates/wbw_r/examples/osm_download_vector.R- OpenStreetMap download workflow via Overpass API
Lidar Workflows
crates/wbw_r/r-package/whiteboxworkflows/inst/examples/lidar_object_quickstart.R- lidar object lifecycle and metadata checkscrates/wbw_r/r-package/whiteboxworkflows/inst/examples/lidar_chunked_matrix_streaming.R- chunked matrix processing at scalecrates/wbw_r/examples/lidar_write_options.R- output option tuning for lidar formats
Sensor Bundle Workflows
crates/wbw_r/r-package/whiteboxworkflows/inst/examples/sensor_bundle_quickstart.R- bundle intake and key inspection baselinecrates/wbw_r/r-package/whiteboxworkflows/inst/examples/sensor_bundle_multi_family_preview.R- cross-family preview pattern
Licensing
crates/wbw_r/examples/licensing_offline.R- local entitlement startup examplecrates/wbw_r/examples/licensing_floating_online.R- floating license provider startup example