Online Data Downloads
This chapter covers downloading vector data directly from online providers into WbW-R workflows. The first provider is OpenStreetMap (OSM) through the Overpass API using tool ID download_osm_vector.
Scope and Current Provider
Current online-data tool:
download_osm_vector
In R, use wbw_run_tool(...) with explicit argument lists.
Quick Start
library(whiteboxworkflows)
session <- wbw_session()
result <- wbw_run_tool(
"download_osm_vector",
args = list(
west = -80.54,
south = 43.41,
east = -80.47,
north = 43.47,
filter_preset = "roads",
include_points = FALSE,
include_lines = TRUE,
include_polygons = FALSE,
output = "kitchener_roads.geojson"
),
session = session
)
print(result$outputs$path)
Presets and Filters
Preset classes:
allroadsbuildingswaterlandusetrailsparksrailamenitiesboundariestransitpoi
Optional custom filters:
filter_key = "amenity"filter_key_value = "amenity=school"
When custom filters are provided, they are used instead of preset filtering.
Geometry Controls
Use geometry toggles to reduce payload and output complexity:
include_pointsinclude_linesinclude_polygons
Examples:
- Streets only: points off, lines on, polygons off
- Building footprints: points off, lines off, polygons on
Phase 2 options:
split_output_by_geometry = TRUEwrites separate files with_points,_lines, and_polygonssuffixes.
Caching and Provenance
Use optional caching when running repeated AOI/filter requests:
cache_dir = ".wbw_cache/osm"cache_ttl_hours = 24(set0to disable TTL checks)
Use provenance_output to write a JSON sidecar with endpoint, bbox, filters, feature counts, and cache usage metadata.
wbw_run_tool(
"download_osm_vector",
args = list(
west = -80.54,
south = 43.41,
east = -80.47,
north = 43.47,
filter_preset = "trails",
include_points = FALSE,
include_lines = TRUE,
include_polygons = FALSE,
split_output_by_geometry = TRUE,
cache_dir = ".wbw_cache/osm",
cache_ttl_hours = 24,
provenance_output = "kitchener_trails_provenance.json",
output = "kitchener_trails.geojson"
),
session = session
)
Projection and Output
Rules:
- Query extent is in EPSG:4326 (longitude/latitude).
- Set
input_extent_epsgto provide west/south/east/north in another CRS (the bbox is transformed to EPSG:4326 before querying Overpass). - Use
output_epsgto reproject output. - Output format is inferred from output extension.
Endpoint selection:
overpass_profilesupports:main,kumi,fr,custom.overpass_urloverrides the selected profile URL when provided.
Large-AOI chunking:
chunk_large_aoi = TRUE(default) automatically tiles large query extents.chunk_max_area_deg2 = 4.0controls maximum area per chunk.max_chunk_count = 64caps the number of generated chunk requests.chunk_parallel_requests = 1(default) controls bounded parallel chunk fetch; set >1 to fetch chunks concurrently.
wbw_run_tool(
"download_osm_vector",
args = list(
west = -80.54,
south = 43.41,
east = -80.47,
north = 43.47,
filter_preset = "buildings",
include_points = FALSE,
include_lines = FALSE,
include_polygons = TRUE,
output_epsg = 32617,
output = "kitchener_buildings_utm17n.gpkg"
),
session = session
)
Operational Guidance
Overpass public endpoints have rate limits and shared infrastructure constraints.
Recommended practice:
- Use smaller AOIs
- Prefer thematic filters
- Cap responses with
max_elements - Increase
timeout_secondsonly when needed
Attribution and Licensing
OSM data are licensed under ODbL. Ensure appropriate OpenStreetMap attribution and verify obligations for redistributed outputs.
Companion Example
See:
crates/wbw_r/examples/osm_download_vector.R