Tables and DataFrames
Tables.jl provides an ecosystem-wide interface to tabular data in Julia, giving interoperability with DataFrames.jl, CSV.jl and hundreds of other packages that implement the standard.
DimensionalData.jl implements the Tables.jl interface for AbstractDimArray and AbstractDimStack. DimStack layers are unrolled so they are all the same size, and dimensions loop to match the length of the largest layer.
Columns are given the name of the array or the stack layer key. Dimension columns use the Symbol version (the result of DD.dim2key(dimension)).
Looping of dimensions and stack layers is done lazily, and does not allocate unless collected.
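As a rough illustration of the interface described above, the following sketch inspects an array through generic Tables.jl functions rather than through a specific sink such as DataFrames.jl. It assumes Tables.jl is installed in the active environment alongside DimensionalData.jl; the variable name cols is only illustrative.

```julia
using DimensionalData, Tables

# Same dimensions and array as in the example on this page.
x, y, c = X(1:10), Y(1:10), Dim{:category}('a':'z')
A = rand(x, y, c; name=:data)

# Any Tables.jl consumer can query whether `A` is a table.
Tables.istable(A)

# Collect the lazy columns into a NamedTuple of vectors.
# This is the point where memory is actually allocated.
cols = Tables.columntable(A)

# Dimension columns come first, followed by the data column
# named after the array.
keys(cols)

# Every column has one entry per array element.
length(cols.data) == length(A)
```

Tables.columntable is used here only because it returns a plain NamedTuple that is easy to inspect; any other Tables.jl sink should see the same columns.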
Example
using DimensionalData, Dates, DataFrames
Define some dimensions:
julia> x, y, c = X(1:10), Y(1:10), Dim{:category}('a':'z')
↓ X 1:10,
→ Y 1:10,
↗ category 'a':1:'z'
julia> A = rand(x, y, c; name=:data)
╭───────────────────────────────────╮
│ 10×10×26 DimArray{Float64,3} data │
├───────────────────────────────────┴───────────────────── dims ┐
↓ X Sampled{Int64} 1:10 ForwardOrdered Regular Points,
→ Y Sampled{Int64} 1:10 ForwardOrdered Regular Points,
↗ category Categorical{Char} 'a':1:'z' ForwardOrdered
└───────────────────────────────────────────────────────────────┘
[:, :, 1]
↓ → 1 2 3 … 8 9 10
1 0.0346544 0.309024 0.482539 0.17394 0.0623452 0.641241
2 0.132353 0.114103 0.339602 0.0372641 0.0991284 0.853618
3 0.790219 0.447847 0.724883 0.245281 0.746798 0.446983
4 0.178662 0.989972 0.726265 0.578061 0.826863 0.0181295
⋮ ⋱ ⋮
7 0.160163 0.706573 0.635854 0.653148 0.0555744 0.800387
8 0.969558 0.592174 0.136896 0.144427 0.340806 0.842163
9 0.244398 0.533436 0.600217 0.772372 0.455109 0.551695
10 0.417803 0.392001 0.495234 … 0.66079 0.249985 0.264881
Converting to DataFrame
Arrays will have a column for each dimension, and only one data column:
julia> DataFrame(A)
2600×4 DataFrame
Row │ X Y category data
│ Int64 Int64 Char Float64
──────┼───────────────────────────────────
1 │ 1 1 a 0.0346544
2 │ 2 1 a 0.132353
3 │ 3 1 a 0.790219
4 │ 4 1 a 0.178662
5 │ 5 1 a 0.411505
6 │ 6 1 a 0.505808
7 │ 7 1 a 0.160163
8 │ 8 1 a 0.969558
⋮ │ ⋮ ⋮ ⋮ ⋮
2594 │ 4 10 z 0.408114
2595 │ 5 10 z 0.344591
2596 │ 6 10 z 0.0682991
2597 │ 7 10 z 0.92104
2598 │ 8 10 z 0.125902
2599 │ 9 10 z 0.788728
2600 │ 10 10 z 0.06894
2585 rows omitted
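Stacks convert in the same way, with one column per dimension and one per layer. The CSV example that follows writes a stack named st, which is not defined on this page. The definition below is an assumption reconstructed from the data1 and data2 columns in the CSV header, so treat it as a minimal sketch rather than the exact stack that produced that output.

```julia
using DimensionalData, DataFrames

x, y, c = X(1:10), Y(1:10), Dim{:category}('a':'z')

# Assumed definition: a 2-D layer and a 3-D layer that share X and Y.
# The layer names are taken from the CSV header shown on this page.
st = DimStack((data1 = rand(x, y), data2 = rand(x, y, c)))

# The smaller layer is repeated along `category` so that both
# layers have one row per cell of the largest layer.
df = DataFrame(st)
size(df)
names(df)
```

Because data1 has no category dimension, its values should repeat for each category in the resulting table.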
Converting to CSV
We can also write arrays and stacks directly to a CSV file with CSV.jl, or pass them to any other package that supports the Tables.jl interface.
julia> using CSV
julia> CSV.write("dimstack.csv", st)
"dimstack.csv"
julia> readlines("dimstack.csv")
2601-element Vector{String}:
"X,Y,category,data1,data2"
"1,1,a,0.40611979552595645,0.5871361574018922"
"2,1,a,0.7294300846156245,0.969160102348011"
"3,1,a,0.6133074780689224,0.8381373016529489"
"4,1,a,0.25409539862794295,0.05522210116858772"
"5,1,a,0.9643477283388405,0.25136524862034026"
"6,1,a,0.7789874610873342,0.6595469807660678"
"7,1,a,0.022821167957493782,0.3327672718806979"
"8,1,a,0.19324340139001805,0.3046052877910228"
"9,1,a,0.441709972307886,0.3273422537456421"
⋮
"2,10,z,0.14448377410912827,0.9802011939451115"
"3,10,z,0.9471495700282515,0.8234514562097587"
"4,10,z,0.9757301428903349,0.48388009035751545"
"5,10,z,0.029059050472127756,0.525134552199902"
"6,10,z,0.6034151766430393,0.996441808438332"
"7,10,z,0.7327040524529884,0.7847558563943032"
"8,10,z,0.01635542879454055,0.25912091365320333"
"9,10,z,0.48916771729948794,0.3204759326894585"
"10,10,z,0.47319891256881264,0.19935723592756383"