A basic species distribution modelling workflow

Load occurrences for the Mountain Pygmy Possum using GBIF.jl

Load¤

using Rasters, GBIF2
using RasterDataSources
const RS = Rasters

records = GBIF2.occurrence_search("Burramys parvus"; limit=300)

300-element GBIF2.Table{GBIF2.Occurrence, JSON3.Array{JSON3.Object, Vector{UInt8}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}}┌──────────────────────────┬─────────┬─────────┬─────────┬──────────┬───────────
│                 geometry │    year │   month │     day │  kingdom │   phylum ⋯
│ Tuple{Float64, Float64}? │  Int64? │  Int64? │  Int64? │  String? │  String? ⋯
├──────────────────────────┼─────────┼─────────┼─────────┼──────────┼───────────
│                  missing │    2021 │       1 │       6 │ Animalia │ Chordata ⋯
│      (148.391, -36.3036) │    2015 │      11 │      15 │ Animalia │ Chordata ⋯
│      (148.333, -36.4333) │    2011 │      11 │      21 │ Animalia │ Chordata ⋯
│      (148.396, -36.3818) │    2016 │      11 │      15 │ Animalia │ Chordata ⋯
│                  missing │ missing │ missing │ missing │ Animalia │ Chordata ⋯
│      (148.329, -36.4317) │    2016 │       1 │       3 │ Animalia │ Chordata ⋯
│      (147.096, -36.9357) │    2020 │       2 │      10 │ Animalia │ Chordata ⋯
│      (148.236, -36.5249) │    2012 │      11 │      23 │ Animalia │ Chordata ⋯
│      (148.347, -36.5047) │    2012 │      11 │      22 │ Animalia │ Chordata ⋯
│                  missing │ missing │ missing │ missing │ Animalia │ Chordata ⋯
│      (148.241, -36.4001) │    2011 │      11 │      18 │ Animalia │ Chordata ⋯
│                  missing │ missing │ missing │ missing │ Animalia │ Chordata ⋯
│             (147.1, -37) │    2008 │ missing │ missing │ Animalia │ Chordata ⋯
│                  missing │ missing │ missing │ missing │ Animalia │ Chordata ⋯
│                  missing │ missing │ missing │ missing │ Animalia │ Chordata ⋯
│            ⋮             │    ⋮    │    ⋮    │    ⋮    │    ⋮     │    ⋮     ⋱
└──────────────────────────┴─────────┴─────────┴─────────┴──────────┴───────────
                                                 78 columns and 285 rows omitted

Extract coordinates¤

Extract the longitude/latitude value to a Vector of points (a Tuple counts as a (x, y) point in GeoInterface.jl):

coords = [(r.decimalLongitude, r.decimalLatitude) for r in records if !ismissing(r.decimalLatitude)]

254-element Vector{Tuple{Float64, Float64}}:
 (148.391097, -36.30362)
 (148.332969, -36.433349)
 (148.396453, -36.381847)
 (148.328896, -36.431684)
 (147.096394, -36.935687)
 (148.235596, -36.524924)
 (148.347186, -36.504673)
 (148.240881, -36.400058)
 (147.1, -37.0)
 (151.250866, -33.831883)
 ⋮
 (147.2174, -36.9357)
 (147.1422, -36.9854)
 (146.4297, -37.1554)
 (147.1429, -36.9901)
 (147.2208, -36.9564)
 (146.435, -37.1516)
 (147.1445, -36.9836)
 (147.2262, -36.9401)
 (146.435, -37.1516)

Get layer / Band¤

Get BioClim layers and subset to south-east Australia

A = RasterStack(WorldClim{BioClim}, (1, 3, 7, 12))
se_aus = A[X(138 .. 155), Y(-40 .. -25), RS.Band(1)]

╭────────────────────╮
│ 102×89 RasterStack │
├────────────────────┴─────────────────────────────────────────────────── dims ┐
  ↓ X Projected{Float64} LinRange{Float64}(138.0, 154.83333333333331, 102) ForwardOrdered Regular Intervals{Start},
  → Y Projected{Float64} LinRange{Float64}(-25.16666666666667, -39.83333333333333, 89) ReverseOrdered Regular Intervals{Start}
├────────────────────────────────────────────────────────────────────── layers ┤
  :bio1  eltype: Float32 dims: X, Y size: 102×89
  :bio3  eltype: Float32 dims: X, Y size: 102×89
  :bio7  eltype: Float32 dims: X, Y size: 102×89
  :bio12 eltype: Float32 dims: X, Y size: 102×89
├────────────────────────────────────────────────────────────────────── raster ┤
  extent: Extent(X = (138.0, 154.99999999999997), Y = (-39.83333333333333, -25.000000000000004))
  missingval: -3.4f38
  crs: GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
└──────────────────────────────────────────────────────────────────────────────┘

Plot BioClim predictors and scatter occurrence points on all subplots

using Plots
p = plot(se_aus);
kw = (legend=:none, opacity=0.5, markershape=:cross, markercolor=:black)
foreach(i -> scatter!(p, coords; subplot=i, kw...), 1:4)
p

Then extract predictor variables and write to CSV.

using CSV
predictors = collect(extract(se_aus, coords))
CSV.write("burramys_parvus_predictors.csv", predictors)

"burramys_parvus_predictors.csv"

Or convert them to a DataFrame:

using DataFrames
df = DataFrame(predictors)
df[1:5, :]

5×5 DataFrame

Row	geometry	bio1	bio3	bio7	bio12
	Tuple…	Float32	Float32	Float32	Float32
1	(148.391, -36.3036)	6.1707	41.1198	23.4645	1692.0
2	(148.333, -36.4333)	7.83572	41.5975	23.5028	1500.0
3	(148.396, -36.3818)	6.88158	42.2681	23.133	1544.0
4	(148.329, -36.4317)	7.83572	41.5975	23.5028	1500.0
5	(147.096, -36.9357)	9.40835	40.7905	23.0895	1292.0

This page was generated using Literate.jl.