
Tables and DataFrames

Tables.jl provides an ecosystem-wide interface to tabular data in Julia, giving interoperability with DataFrames.jl, CSV.jl and hundreds of other packages that implement the standard.

DimensionalData.jl implements the Tables.jl interface for AbstractDimArray and AbstractDimStack. DimStack layers are unrolled so that they all have the same length, and dimension values are repeated to match the length of the largest layer.

Data columns are given the name of the array or the stack layer key. Dimension columns use the Symbol version of the dimension name (the result of DD.dim2key(dimension)).

Looping of dimensions and stack layers is done lazily, and does not allocate unless collected.
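The column naming and lazy behaviour described above can be checked through the generic Tables.jl functions. The following is a minimal sketch, assuming Tables.jl is available in the active environment (it is a dependency of DimensionalData.jl, but may need to be added explicitly before it can be imported). The small array and the variable names are illustrative only.

```julia
using DimensionalData
import Tables  # assumption: Tables.jl is installed in the active environment

# A small named array, so the resulting table has only 2 * 3 = 6 rows
A = rand(X(1:2), Y(1:3); name=:data)

# Ask Tables.jl for a column-oriented view of the array
tbl = Tables.columns(A)

# One column per dimension, followed by the data column
colnames = collect(Tables.columnnames(tbl))

# Columns are produced on demand; `collect` is what allocates a Vector
xcol = collect(Tables.getcolumn(tbl, :X))
```

Here `colnames` should list the two dimension names followed by the array name, and `xcol` should hold one entry per array element, with the values of `X` repeated across `Y`.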

Example

julia
using DimensionalData, Dates, DataFrames

Define some dimensions:

julia
julia> x, y, c = X(1:10), Y(1:10), Dim{:category}('a':'z')
X        1:10,
Y        1:10,
category 'a':1:'z'
julia
julia> A = rand(x, y, c; name=:data)
╭───────────────────────────────────╮
10×10×26 DimArray{Float64,3} data
├───────────────────────────────────┴───────────────────── dims ┐
X        Sampled{Int64} 1:10 ForwardOrdered Regular Points,
Y        Sampled{Int64} 1:10 ForwardOrdered Regular Points,
category Categorical{Char} 'a':1:'z' ForwardOrdered
└───────────────────────────────────────────────────────────────┘
[:, :, 1]
       1          2         3         …  8          9           10
  1    0.0346544  0.309024  0.482539     0.17394    0.0623452   0.641241
  2    0.132353   0.114103  0.339602     0.0372641  0.0991284   0.853618
  3    0.790219   0.447847  0.724883     0.245281   0.746798    0.446983
  4    0.178662   0.989972  0.726265     0.578061   0.826863    0.0181295
  ⋮                                   ⋱                         ⋮
  7    0.160163   0.706573  0.635854     0.653148   0.0555744   0.800387
  8    0.969558   0.592174  0.136896     0.144427   0.340806    0.842163
  9    0.244398   0.533436  0.600217     0.772372   0.455109    0.551695
 10    0.417803   0.392001  0.495234  …  0.66079    0.249985    0.264881

Converting to DataFrame

Arrays have one column for each dimension and a single data column:

julia
julia> DataFrame(A)
2600×4 DataFrame
  Row │ X      Y      category  data
      │ Int64  Int64  Char      Float64
──────┼───────────────────────────────────
    1 │     1      1  a         0.0346544
    2 │     2      1  a         0.132353
    3 │     3      1  a         0.790219
    4 │     4      1  a         0.178662
    5 │     5      1  a         0.411505
    6 │     6      1  a         0.505808
    7 │     7      1  a         0.160163
    8 │     8      1  a         0.969558
  ⋮   │   ⋮      ⋮       ⋮          ⋮
 2594 │     4     10  z         0.408114
 2595 │     5     10  z         0.344591
 2596 │     6     10  z         0.0682991
 2597 │     7     10  z         0.92104
 2598 │     8     10  z         0.125902
 2599 │     9     10  z         0.788728
 2600 │    10     10  z         0.06894
                         2585 rows omitted
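Once converted, the result is an ordinary DataFrame, so the usual DataFrames.jl tools apply. The sketch below rebuilds the array from the example and selects the rows for a single category. The variable names `df` and `df_a` are illustrative, and the values differ on every run because the data is random.

```julia
using DimensionalData, DataFrames

x, y, c = X(1:10), Y(1:10), Dim{:category}('a':'z')
A = rand(x, y, c; name=:data)

# One row per array element: 10 * 10 * 26 = 2600 rows
df = DataFrame(A)

# Column names follow the dimension names, then the array name
column_names = names(df)

# Standard DataFrames.jl filtering: keep only the rows for category 'a'
df_a = subset(df, :category => ByRow(==('a')))
```

Filtering on `category` leaves one row for each `X`, `Y` pair, so `df_a` should have 100 rows.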

Converting to CSV

We can also write arrays and stacks directly to a CSV file with CSV.jl, or pass them to any other package that accepts the Tables.jl interface.
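The `st` object used in the following command is not defined in this section. A minimal sketch of one way to build a comparable stack is shown here. The layer names `data1` and `data2` are taken from the CSV header in the output, but the layer shapes are an assumption: `data1` is given only the `X` and `Y` dimensions to illustrate how a smaller layer is repeated across `category`. The sketch also writes to a temporary path rather than the working directory, and reads the file back to check the result.

```julia
using DimensionalData, CSV, DataFrames

x, y, c = X(1:10), Y(1:10), Dim{:category}('a':'z')

# Hypothetical stand-in for `st`: the layer names follow the CSV header,
# the layer shapes are assumed
st = DimStack((data1 = rand(x, y), data2 = rand(x, y, c)))

# Write to a temporary directory so the example leaves no file behind
path = joinpath(mktempdir(), "dimstack.csv")
CSV.write(path, st)

# Reading back gives a plain DataFrame; dimensions are now ordinary columns
df = CSV.read(path, DataFrame)
```

The file read back should contain one column per dimension plus one per layer. The dimension structure itself is not restored, since CSV stores only flat columns.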

julia
julia> using CSV

julia> CSV.write("dimstack.csv", st)
"dimstack.csv"
julia
julia> readlines("dimstack.csv")
2601-element Vector{String}:
 "X,Y,category,data1,data2"
 "1,1,a,0.40611979552595645,0.5871361574018922"
 "2,1,a,0.7294300846156245,0.969160102348011"
 "3,1,a,0.6133074780689224,0.8381373016529489"
 "4,1,a,0.25409539862794295,0.05522210116858772"
 "5,1,a,0.9643477283388405,0.25136524862034026"
 "6,1,a,0.7789874610873342,0.6595469807660678"
 "7,1,a,0.022821167957493782,0.3327672718806979"
 "8,1,a,0.19324340139001805,0.3046052877910228"
 "9,1,a,0.441709972307886,0.3273422537456421"
 ⋮
 "2,10,z,0.14448377410912827,0.9802011939451115"
 "3,10,z,0.9471495700282515,0.8234514562097587"
 "4,10,z,0.9757301428903349,0.48388009035751545"
 "5,10,z,0.029059050472127756,0.525134552199902"
 "6,10,z,0.6034151766430393,0.996441808438332"
 "7,10,z,0.7327040524529884,0.7847558563943032"
 "8,10,z,0.01635542879454055,0.25912091365320333"
 "9,10,z,0.48916771729948794,0.3204759326894585"
 "10,10,z,0.47319891256881264,0.19935723592756383"