Lookups
Module for Lookups and Selectors used in DimensionalData.jl.
Lookups defines traits and AbstractArray wrappers that give specific behaviours for a lookup index when indexed with a Selector.
For example, these allow tracking of array order, so fast indexing works even when the array is reversed.
To load Lookup types and methods into scope:
using DimensionalData
using DimensionalData.Lookups
Lookup
Types defining the behaviour of a lookup index, how it is plotted and how Selectors like Between work.
A Lookup may be NoLookup, indicating that there are no lookup values, Categorical for ordered or unordered categories, or a Sampled index for Points or Intervals.
Aligned <: Lookup
Abstract supertype for Lookups where the lookup is aligned with the array axes.
This is by far the most common supertype for Lookup.
AbstractSampled <: Aligned
Abstract supertype for Lookups where the lookup is aligned with the array, and is independent of other dimensions. Sampled is provided by this package.
AbstractSampled must have order, span and sampling fields, or a rebuild method that accepts them as keyword arguments.
Sampled <: AbstractSampled
Sampled(data::AbstractVector, order::Order, span::Span, sampling::Sampling, metadata)
Sampled(data=AutoValues(); order=AutoOrder(), span=AutoSpan(), sampling=Points(), metadata=NoMetadata())
A concrete implementation of the Lookup AbstractSampled. It can be used to represent Points or Intervals.
Sampled is capable of representing gridded data from a wide range of sources, allowing correct bounds and Selectors for points or intervals of regular, irregular, forward and reverse lookups.
On AbstractDimArray construction, a Sampled lookup is assigned for all lookups of AbstractRange not assigned to Categorical.
Arguments
data: An AbstractVector of lookup values, matching the length of the corresponding array axis.
order: Order indicating the order of the lookup, AutoOrder by default, detected from the order of data to be ForwardOrdered, ReverseOrdered or Unordered. These can be provided explicitly if they are known and performance is important.
span: indicates the size of intervals or distance between points, and will be set to Regular for AbstractRange and Irregular for AbstractArray, unless assigned manually.
sampling: is assigned to Points, unless set to Intervals manually. Using Intervals will change the behaviour of bounds and Selectors to take account of the full size of the interval, rather than the point alone.
metadata: a Dict or Metadata wrapper that holds any metadata object adding more information about the array axis. Useful for extending DimensionalData for specific contexts, like geospatial data in Rasters.jl. By default it is NoMetadata().
Example
Create an array with Intervals sampling, and Regular span for a vector with known spacing.
We set the locus of the Intervals to Start, specifying that the lookup values are for the locus at the start of each interval.
using DimensionalData, DimensionalData.Lookups
x = X(Sampled(100:-20:10; sampling=Intervals(Start())))
y = Y(Sampled([1, 4, 7, 10]; span=Regular(3), sampling=Intervals(Start())))
A = ones(x, y)
# output
╭─────────────────────────╮
│ 5×4 DimArray{Float64,2} │
├─────────────────────────┴────────────────────────────────────────── dims ┐
↓ X Sampled{Int64} 100:-20:20 ReverseOrdered Regular Intervals{Start},
→ Y Sampled{Int64} [1, 4, 7, 10] ForwardOrdered Regular Intervals{Start}
└──────────────────────────────────────────────────────────────────────────┘
↓ → 1 4 7 10
100 1.0 1.0 1.0 1.0
80 1.0 1.0 1.0 1.0
60 1.0 1.0 1.0 1.0
40 1.0 1.0 1.0 1.0
20 1.0 1.0 1.0 1.0
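The effect of sampling on bounds described in the arguments above can be sketched as follows. This is a minimal illustration, assuming that `bounds(x, dim)` behaves as documented further down this page; the variable names are made up for the example.

```julia
using DimensionalData, DimensionalData.Lookups

# The same lookup values, once as Points and once as Intervals
pts = X(Sampled(10:10:30; sampling=Points()))
ivs = X(Sampled(10:10:30; sampling=Intervals(Start())))

A = DimArray(ones(3), (pts,))
B = DimArray(ones(3), (ivs,))

# Points: the bounds are the first and last lookup values
point_bounds = bounds(A, X)      # (10, 30)

# Intervals with a Start locus: the last interval extends one step past 30
interval_bounds = bounds(B, X)   # (10, 40)
```

With Intervals, the upper bound includes the full width of the final interval rather than stopping at the last lookup value.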
AbstractCyclic <: AbstractSampled
An abstract supertype for cyclic lookups.
These are AbstractSampled lookups that are cyclic for Selectors.
Cyclic <: AbstractCyclic
Cyclic(data; order=AutoOrder(), span=AutoSpan(), sampling=Points(), metadata=NoMetadata(), cycle)
A Cyclic lookup is similar to Sampled, but out-of-range Selectors At, Near, Contains will cycle the values to typemin or typemax over the length of cycle. Where and .. work as for Sampled.
This is useful when we are using mean annual datasets over a real time-span, or for wrapping longitudes so that -360 and 360 are the same.
Arguments
data: An AbstractVector of lookup values, matching the length of the corresponding array axis.
order: Order indicating the order of the lookup, AutoOrder by default, detected from the order of data to be ForwardOrdered, ReverseOrdered or Unordered. These can be provided explicitly if they are known and performance is important.
span: indicates the size of intervals or distance between points, and will be set to Regular for AbstractRange and Irregular for AbstractArray, unless assigned manually.
sampling: is assigned to Points, unless set to Intervals manually. Using Intervals will change the behaviour of bounds and Selectors to take account of the full size of the interval, rather than the point alone.
metadata: a Dict or Metadata wrapper that holds any metadata object adding more information about the array axis. Useful for extending DimensionalData for specific contexts, like geospatial data in Rasters.jl. By default it is NoMetadata().
cycle: the length of the cycle. This does not have to exactly match the data: the step size can be Week(1) while the cycle is Year(1).
Notes
If you use dates and e.g. cycle over a Year, every year will have the same number and spacing of Weeks and Days as the cycle year. Using At may not be reliable in terms of exact dates, as it will be applied to the specified date plus or minus n years.
Indexing into a Cyclic with any AbstractArray or AbstractRange will return a Sampled, as the full cycle is likely no longer available.
.. or Between selectors do not work in a cycled way: they work as for Sampled. This may change in future to return cycled values, but there are problems with this, such as leap years breaking correct date cycling of a single year. If you actually need this behaviour, please make a GitHub issue.
AbstractCategorical <: Aligned
Lookups where the values are categories.
Categorical is the provided concrete implementation, but this can easily be extended: all methods are defined for AbstractCategorical.
All AbstractCategorical must provide a rebuild method with data, order and metadata keyword arguments.
Categorical <: AbstractCategorical
Categorical(o::Order)
Categorical(; order=Unordered())
A Lookup where the values are categories.
This will be automatically assigned if the lookup contains AbstractString, Symbol or Char. Otherwise it can be assigned manually.
Order will be determined automatically where possible.
Arguments
data: An AbstractVector matching the length of the corresponding array axis.
order: Order indicating the order of the lookup, AutoOrder by default, detected from the order of data to be ForwardOrdered, ReverseOrdered or Unordered. Can be provided if this is known and performance is important.
metadata: a Dict or Metadata wrapper that holds any metadata object adding more information about the array axis. Useful for extending DimensionalData for specific contexts, like geospatial data in Rasters.jl. By default it is NoMetadata().
Example
Create an array with Categorical lookups.
using DimensionalData
ds = X(["one", "two", "three"]), Y([:a, :b, :c, :d])
A = DimArray(rand(3, 4), ds)
Dimensions.lookup(A)
# output
Categorical{String} ["one", "two", "three"] Unordered,
Categorical{Symbol} [:a, :b, :c, :d] ForwardOrdered
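Categorical lookups can then be indexed by category value. The following is a small sketch using the At and Where selectors documented later on this page; the data values are arbitrary and chosen only so the result is easy to check.

```julia
using DimensionalData

ds = X(["one", "two", "three"]), Y([:a, :b, :c, :d])
A = DimArray(reshape(1:12, 3, 4), ds)

# Select a single element by category: row "two", column :c
single = A[X(At("two")), Y(At(:c))]   # 8

# Keep only the rows whose category starts with "t" ("two" and "three")
subset = A[X(Where(startswith("t")))]
```

Here `single` is the element at row 2, column 3 of the underlying matrix, and `subset` keeps two of the three rows.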
Unaligned <: Lookup
Abstract supertype for Lookups where the lookup is not aligned to the grid.
Indexing an Unaligned with Selectors must provide all other Unaligned dimensions.
Transformed <: Unaligned
Transformed(f, dim::Dimension; metadata=NoMetadata())
Lookup that uses an affine transformation to convert dimensions from dims(lookup) to dims(array). This can be useful when the dimensions are e.g. rotated from a more commonly used axis.
Any function can be used to do the transformation, but transformations from CoordinateTransformations.jl may be useful.
Arguments
f: transformation function
dim: a dimension to transform to.
Keyword Arguments
metadata
:
Example
using DimensionalData, DimensionalData.Lookups, CoordinateTransformations
m = LinearMap([0.5 0.0; 0.0 0.5])
A = [1 2 3 4
5 6 7 8
9 10 11 12];
da = DimArray(A, (X(Transformed(m)), Y(Transformed(m))))
da[X(At(6.0)), Y(At(2.0))]
# output
9
MergedLookup <: Lookup
MergedLookup(data, dims; [metadata])
A Lookup that holds multiple combined dimensions.
MergedLookup can be indexed with Selectors like At, Between, and Where, although Near has undefined meaning.
Arguments
data: A Vector of Tuple.
dims: A Tuple of Dimension indicating the dimensions in the tuples in data.
Keywords
metadata: a Dict or Metadata object to attach dimension metadata.
NoLookup <: Lookup
NoLookup()
A Lookup that is identical to the array axis. Selectors can't be used on this lookup.
Example
When a DimArray is defined without passing lookup values to the dimensions, it will be assigned NoLookup:
using DimensionalData
A = DimArray(rand(3, 3), (X, Y))
Dimensions.lookup(A)
# output
NoLookup, NoLookup
Which is identical to:
using .Lookups
A = DimArray(rand(3, 3), (X(NoLookup()), Y(NoLookup())))
Dimensions.lookup(A)
# output
NoLookup, NoLookup
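Because a NoLookup dimension carries no lookup values, such an array is indexed with plain integer positions wrapped in the dimension. A brief sketch, with arbitrary data:

```julia
using DimensionalData, DimensionalData.Lookups

A = DimArray(reshape(1:9, 3, 3), (X, Y))

# Every dimension has been assigned NoLookup
all_nolookup = all(l -> l isa NoLookup, Dimensions.lookup(A))

# Index by position: row 2, column 3
value = A[X(2), Y(3)]   # 8
```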
AutoLookup <: Lookup
AutoLookup()
AutoLookup(values=AutoValues(); kw...)
Automatic Lookup, the default lookup. It will be converted automatically to another Lookup when it is possible to detect it from the lookup values.
Keywords will be used in the detected Lookup constructor.
AutoValues
Detect Lookup values from the context. This is used in NoLookup to simply use the array axis as the index when the array is constructed, and in set to change the Lookup type without changing the index values.
The generic value getter val
val(x)
val(dims::Tuple) => Tuple
Return the contained value of a wrapper object.
dims can be Dimension, Dimension types, or Symbols for Dim{Symbol}.
Objects that don't define a val method are returned unaltered.
Lookup methods:
bounds(xs, [dims::Tuple]) => Tuple{Vararg{Tuple{T,T}}}
bounds(xs::Tuple) => Tuple{Vararg{Tuple{T,T}}}
bounds(x, dim) => Tuple{T,T}
bounds(dim::Union{Dimension,Lookup}) => Tuple{T,T}
Return the bounds of all dimensions of an object, of a specific dimension, or of a tuple of dimensions.
If bounds are not known, one or both values may be nothing.
dims can be a Dimension, a dimension type, or a tuple of either.
hasselection(x, selector) => Bool
hasselection(x, selectors::Tuple) => Bool
Check if indexing into x with selectors can be performed, where x is some object with a dims method, and selectors is a Selector or Dimension or a tuple of either.
sampling(x, [dims::Tuple]) => Tuple
sampling(x, dim) => Sampling
sampling(xs::Tuple) => Tuple{Vararg{Sampling}}
sampling(x::Union{Dimension,Lookup}) => Sampling
Return the Sampling for each dimension.
Second argument dims can be Dimensions, Dimension types, or Symbols for Dim{Symbol}.
span(x, [dims::Tuple]) => Tuple
span(x, dim) => Span
span(xs::Tuple) => Tuple{Vararg{Span,N}}
span(x::Union{Dimension,Lookup}) => Span
Return the Span for each dimension.
Second argument dims can be Dimensions, Dimension types, or Symbols for Dim{Symbol}.
order(x, [dims::Tuple]) => Tuple
order(xs::Tuple) => Tuple
order(x::Union{Dimension,Lookup}) => Order
Return the Order of the dimension lookup for each dimension: ForwardOrdered, ReverseOrdered, or Unordered.
Second argument dims can be Dimensions, Dimension types, or Symbols for Dim{Symbol}.
locus(x, [dims::Tuple]) => Tuple
locus(x, dim) => Locus
locus(xs::Tuple) => Tuple{Vararg{Locus,N}}
locus(x::Union{Dimension,Lookup}) => Locus
Return the Position of lookup values for each dimension.
Second argument dims can be Dimensions, Dimension types, or Symbols for Dim{Symbol}.
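The trait getters above can be tried on a small array. The sketch below uses only the single-dimension forms (`order(dim)`, `span(dim)` and so on) listed in the signatures; the array contents are arbitrary.

```julia
using DimensionalData, DimensionalData.Lookups

x = X(Sampled(100:-20:20; sampling=Intervals(Start())))
y = Y([1, 4, 7, 10])
A = DimArray(zeros(5, 4), (x, y))

xdim = dims(A, X)
ydim = dims(A, Y)

# X is a reversed range with interval sampling
x_order = order(xdim)         # ReverseOrdered
x_span = span(xdim)           # Regular
x_sampling = sampling(xdim)   # Intervals
x_locus = locus(xdim)         # Start

# Y is a plain vector, so it is detected as forward-ordered irregular points
y_order = order(ydim)         # ForwardOrdered
y_span = span(ydim)           # Irregular
y_sampling = sampling(ydim)   # Points
```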
shiftlocus(locus::Locus, x)
Shift the values of x from the current locus to the new locus.
We only shift Sampled lookups with Regular or Explicit Intervals.
Selectors
Selector
Abstract supertype for all selectors.
Selectors are wrappers that indicate that passed values are not the array indices, but values to be selected from the dimension index, such as DateTime objects for a Ti dimension.
Selectors provided in DimensionalData are:
At
Near
Between
Touches
Contains
Where
All
Note: Selectors can be modified using:
Not: as in Not(At(x))
And IntervalSets.jl Interval can be used instead of Between:
..
Interval
OpenInterval
ClosedInterval
IntSelector <: Selector
Abstract supertype for Selectors that return a single Int index.
IntSelectors provided by DimensionalData are:
At
Near
Contains
ArraySelector <: Selector
Abstract supertype for Selectors that return an AbstractArray.
ArraySelectors provided by DimensionalData are:
Between
Touches
Where
At <: IntSelector
At(x, atol, rtol)
At(x; atol=nothing, rtol=nothing)
Selector that exactly matches the value on the passed-in dimensions, or throws an error. For ranges and arrays, every intermediate value must match an existing value, not just the end points.
x can be any value or Vector of values.
atol and rtol are passed to isapprox. For Number, rtol will be set to Base.rtoldefault, otherwise nothing, and won't be used.
Example
using DimensionalData
A = DimArray([1 2 3; 4 5 6], (X(10:10:20), Y(5:7)))
A[X(At(20)), Y(At(6))]
# output
5
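When lookup values are floating point, an absolute tolerance can be passed through the atol keyword shown in the signature above. A hedged sketch, with tolerance and values chosen only for illustration:

```julia
using DimensionalData

A = DimArray([1 2 3; 4 5 6], (X(10.0:10.0:20.0), Y(5:7)))

# 20.1 is not a lookup value, but it is within 0.2 of 20.0
value = A[X(At(20.1; atol=0.2)), Y(At(6))]   # 5
```

Without atol, `At(20.1)` would not match any value in the X lookup and indexing would throw an error.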
Near <: IntSelector
Near(x)
Selector that selects the nearest index to x.
With Points this is simply the index value nearest to x, however with Intervals it is the interval center nearest to x. This will be offset from the index value for Start and End locus.
Example
using DimensionalData
A = DimArray([1 2 3; 4 5 6], (X(10:10:20), Y(5:7)))
A[X(Near(23)), Y(Near(5.1))]
# output
4
Between <: ArraySelector
Between(a, b)
Deprecated: use a..b instead of Between(a, b). Other Interval objects from IntervalSets.jl, like OpenInterval(a, b), will also work, giving the correct open/closed boundaries.
Between will be removed in future to avoid clashes with DataFrames.Between.
Selector that retrieves all indices located between 2 values, evaluated with >= for the lower value, and < for the upper value. This means the same value will not be counted twice in 2 adjacent Between selections.
For Intervals the whole interval must lie between the values. For Points the points must fall between the values. Different Sampling types may give different results with the same input; this is the intended behaviour.
Between for Irregular intervals is a little complicated. The interval is the distance between a value and the next (for Start locus) or previous (for End locus) value.
For Center, we take the mid point between two index values as the start and end of each interval. This may or may not make sense for the values in your index, so use Between with Irregular Intervals(Center()) with caution.
Example
using DimensionalData
A = DimArray([1 2 3; 4 5 6], (X(10:10:20), Y(5:7)))
A[X(Between(15, 25)), Y(Between(4, 6.5))]
# output
╭───────────────────────╮
│ 1×2 DimArray{Int64,2} │
├───────────────────────┴────────────────────────────── dims ┐
↓ X Sampled{Int64} 20:10:20 ForwardOrdered Regular Points,
→ Y Sampled{Int64} 5:6 ForwardOrdered Regular Points
└────────────────────────────────────────────────────────────┘
↓ → 5 6
20 4 5
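Since Between is deprecated, the same selection can be written with the `..` interval syntax recommended above. A short sketch on the same array; note that `..` builds a closed interval, which selects the same values here:

```julia
using DimensionalData

A = DimArray([1 2 3; 4 5 6], (X(10:10:20), Y(5:7)))

# Closed intervals in place of Between(15, 25) and Between(4, 6.5)
B = A[X(15..25), Y(4..6.5)]

selected = parent(B)   # the underlying 1×2 matrix [4 5]
```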
Touches <: ArraySelector
Touches(a, b)
Selector that retrieves all indices touching the closed interval between 2 values, for the maximum possible area that could interact with the supplied range.
This can be better than .. when e.g. subsetting an area to rasterize, as you may wish to include pixels that just touch the area, rather than those that fall within it.
Touches is different to using closed intervals when the lookups also contain intervals: if any of the intervals touch, they are included. With .. they are discarded unless the whole cell interval falls inside the selector interval.
Example
using DimensionalData
A = DimArray([1 2 3; 4 5 6], (X(10:10:20), Y(5:7)))
A[X(Touches(15, 25)), Y(Touches(4, 6.5))]
# output
╭───────────────────────╮
│ 1×2 DimArray{Int64,2} │
├───────────────────────┴────────────────────────────── dims ┐
↓ X Sampled{Int64} 20:10:20 ForwardOrdered Regular Points,
→ Y Sampled{Int64} 5:6 ForwardOrdered Regular Points
└────────────────────────────────────────────────────────────┘
↓ → 5 6
20 4 5
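The example above uses Points, where Touches and `..` select the same values. The difference described in the text only appears with Intervals sampling, which the following sketch tries to show. It assumes the semantics stated above (whole interval inside for `..`, any overlap for Touches); the lookup and selector values are made up for illustration.

```julia
using DimensionalData, DimensionalData.Lookups

# Intervals [10, 20), [20, 30) and [30, 40), labelled by their start
x = X(Sampled(10:10:30; sampling=Intervals(Start())))
A = DimArray([1, 2, 3], (x,))

# Only the interval starting at 20 lies completely inside 12..32
inside = A[X(12..32)]

# All three intervals overlap 12..32 at least partially
touching = A[X(Touches(12, 32))]
```

Under those assumptions, `inside` holds one element and `touching` holds all three.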
Contains <: IntSelector
Contains(x)
Selector that selects the interval the value is contained by. If the interval is not present in the index, an error will be thrown.
Can only be used for Intervals or Categorical. For Categorical it falls back to using At. Contains should not be confused with Base.contains: use Where(contains(x)) to check whether values are contained in categorical values like strings.
Example
using DimensionalData; const DD = DimensionalData
dims_ = X(10:10:20; sampling=DD.Intervals(DD.Center())),
Y(5:7; sampling=DD.Intervals(DD.Center()))
A = DimArray([1 2 3; 4 5 6], dims_)
A[X(Contains(8)), Y(Contains(6.8))]
# output
3
Where <: ArraySelector
Where(f::Function)
Selector that filters a dimension lookup by any function that accepts a single value and returns a Bool.
Example
using DimensionalData
A = DimArray([1 2 3; 4 5 6], (X(10:10:20), Y(19:21)))
A[X(Where(x -> x > 15)), Y(Where(x -> x in (19, 21)))]
# output
╭───────────────────────╮
│ 1×2 DimArray{Int64,2} │
├───────────────────────┴─────────────────────────────── dims ┐
↓ X Sampled{Int64} [20] ForwardOrdered Irregular Points,
→ Y Sampled{Int64} [19, 21] ForwardOrdered Irregular Points
└─────────────────────────────────────────────────────────────┘
↓ → 19 21
20 4 6
All <: Selector
All(selectors::Selector...)
Selector that combines the results of other selectors. The indices used will be the union of all results, sorted in ascending order.
Example
using DimensionalData, Unitful
dimz = X(10.0:20:200.0), Ti(1u"s":5u"s":100u"s")
A = DimArray((1:10) * (1:20)', dimz)
A[X=All(At(10.0), At(50.0)), Ti=All(1u"s"..10u"s", 90u"s"..100u"s")]
# output
╭───────────────────────╮
│ 2×4 DimArray{Int64,2} │
├───────────────────────┴──────────────────────────────────────────────── dims ┐
↓ X Sampled{Float64} [10.0, 50.0] ForwardOrdered Irregular Points,
→ Ti Sampled{Unitful.Quantity{Int64, 𝐓, Unitful.FreeUnits{(s,), 𝐓, nothing}}} [1 s, 6 s, 91 s, 96 s] ForwardOrdered Irregular Points
└──────────────────────────────────────────────────────────────────────────────┘
↓ → 1 s 6 s 91 s 96 s
10.0 1 2 19 20
50.0 3 6 57 60
Lookup traits
LookupTrait
Abstract supertype of all traits of a Lookup.
These modify the behaviour of the lookup index.
The term "Trait" is used loosely: these may be fields of an object, or traits hard-coded to specific types.
Order
Order <: LookupTrait
Traits for the order of a Lookup. These determine how searchsorted finds values in the index, and how objects are plotted.
Ordered <: Order
Supertype for the order of an ordered Lookup, including ForwardOrdered and ReverseOrdered.
ForwardOrdered <: Ordered
ForwardOrdered()
Indicates that the Lookup index is in the normal forward order.
ReverseOrdered <: Ordered
ReverseOrdered()
Indicates that the Lookup index is in the reverse order.
Unordered <: Order
Unordered()
Indicates that Lookup is unordered.
This means the index cannot be searched with searchsortedfirst or similar optimised methods; instead it will use findfirst.
AutoOrder <: Order
AutoOrder()
Specifies that the Order of a Lookup
will be found automatically where possible.
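Automatic order detection can be observed by constructing arrays from differently ordered vectors. A minimal sketch with arbitrary values:

```julia
using DimensionalData, DimensionalData.Lookups

fwd = DimArray([1, 2, 3], (X([10, 20, 30]),))
rev = DimArray([1, 2, 3], (X([30, 20, 10]),))
unord = DimArray([1, 2, 3], (X([20, 30, 10]),))

fwd_order = order(dims(fwd, X))       # ForwardOrdered
rev_order = order(dims(rev, X))       # ReverseOrdered
unord_order = order(dims(unord, X))   # Unordered

# Selectors still work on an unordered lookup, without a sorted search
value = unord[X(At(30))]   # 2

# The order can also be stated up front when it is already known
explicit = X(Sampled([10, 20, 30]; order=ForwardOrdered()))
```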
Span
Span <: LookupTrait
Defines the type of span used in a Sampling index. These are Regular or Irregular.
Regular <: Span
Regular(step=AutoStep())
Points or Intervals that have a fixed, regular step.
Irregular <: Span
Irregular(bounds::Tuple)
Irregular(lowerbound, upperbound)
Points or Intervals that have an Irregular step size. To enable bounds tracking and accurate selectors, the starting bounds are provided as a 2-tuple, or 2 arguments. (nothing, nothing) is acceptable input: the bounds will be guessed from the index, but may be inaccurate.
Explicit(bounds::AbstractMatrix)
Intervals where the span is explicitly listed for every interval.
This uses a matrix with a length-2 column for each index value, holding the lower and upper bounds for that specific index.
AutoSpan <: Span
AutoSpan()
The span will be guessed and replaced in format or set.
Sampling
Sampling <: LookupTrait
Indicates the sampling method used by the index: Points or Intervals.
Points <: Sampling
Points()
Sampling where the lookup values are single samples at exact points.
These are always plotted at the center of array cells.
Intervals <: Sampling
Intervals(locus::Position)
Sampling specifying that sampled values are the mean (or similar) value over an interval, rather than at one specific point.
Intervals require a locus of Start, Center or End to define the location in the interval that the index values refer to.
Positions
Position <: LookupTrait
Abstract supertype of types that indicate the locus of index values where they represent Intervals.
These allow the values of array cells to align with the Start, Center, or End of values in the lookup index.
This means they can be plotted with correct axis markers, and allows automatic conversions between formats with different standards (such as NetCDF and GeoTiff).
Center <: Position
Center()
Used to specify that lookup values correspond to the center locus in an interval.
Start <: Position
Start()
Used to specify that lookup values correspond to the start locus of an interval.
Begin <: Position
Begin()
Used to specify the begin index of a Dimension axis, as regular begin will not work with named dimensions.
Can be used with : to create a BeginEndRange or BeginEndStepRange.
End <: Position
End()
Used to specify the end index of a Dimension axis, as regular end will not work with named dimensions. Can be used with : to create a BeginEndRange or BeginEndStepRange.
Also used to specify that lookup values correspond to the end locus of an interval.
AutoPosition <: Position
AutoPosition()
Indicates an interval where the index locus is not yet known. This will be filled with a default value on object construction.
Metadata
AbstractMetadata{X,T}
Abstract supertype for all metadata wrappers.
Metadata wrappers allow tracking the contents and origin of metadata. This can facilitate conversion between metadata types (for saving a file to a different format) or simply saving data back to the same file type with identical metadata.
Using a wrapper instead of Dict or NamedTuple also lets us pass metadata objects to set without ambiguity about where to put them.
Metadata <: AbstractMetadata
Metadata{X}(val::Union{Dict,NamedTuple})
Metadata{X}(pairs::Pair...) => Metadata{Dict}
Metadata{X}(; kw...) => Metadata{NamedTuple}
General Metadata object. The X type parameter categorises the metadata for method dispatch, if required.
NoMetadata <: AbstractMetadata
NoMetadata()
Indicates an object has no metadata. But unlike using nothing, get, keys and haskey will still work on it, get always returning the fallback argument. keys returns () while haskey always returns false.
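The difference between Metadata and NoMetadata can be sketched as below. The `:example` type parameter and the `:units` key are hypothetical, chosen only for this illustration; the NoMetadata behaviour follows the description above.

```julia
using DimensionalData, DimensionalData.Lookups

# :example is a made-up category tag used as the X type parameter
md = Metadata{:example}(Dict(:units => "m"))
stored = md[:units]   # "m"

nm = NoMetadata()
has_units = haskey(nm, :units)            # false
fallback = get(nm, :units, "unknown")     # "unknown"
no_keys = keys(nm)                        # ()
```

Code that reads metadata can therefore call get with a fallback value without first checking whether any metadata is attached.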
units(x) => Union{Nothing,Any}
units(xs::Tuple) => Tuple
units(A::AbstractDimArray, dims::Tuple) => Tuple
units(A::AbstractDimArray, dim) => Union{Nothing,Any}
Get the units of an array or Dimension, or a tuple of either.
Units do not have a set field, and may or may not be included in metadata. This method is to facilitate use in labels and plots when units are available, not a guarantee that they will be. If not available, nothing is returned.
Second argument dims can be Dimensions, Dimension types, or Symbols for Dim{Symbol}.