Do you remember what vectors, matrices, and data frames are? A vector is a finite sequence of values, e.g. a sequence of numbers or a list of words. It has only one dimension and is written either as a row or as a column. A matrix and a data frame are both like a table with rows and columns, where each cell holds a value. All rows have the same length and all columns have the same length. A matrix has two dimensions: the row direction and the column direction. The main difference between matrices and data frames is that a matrix may only have elements of one type, whereas a data frame may have columns of different data types.
An array is similar to a matrix, but it can have more than two dimensions. So an array is a multi-dimensional table in which data is organised along rows, columns, and further dimensions, while the values are still restricted to a single data type. Arrays are helpful when working with digital images, where you can store the red/green/blue values of the pixels as a 3-dimensional array. Another application of a three-dimensional array is two-dimensional data that changes over time: time is then the third dimension of the data set.
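To make the image application concrete, here is a minimal sketch of a tiny 2 x 2 pixel image stored as a three-dimensional array. The name img and the convention that intensities run from 0 to 1 are assumptions made for this illustration only.

```r
# A hypothetical 2 x 2 pixel image with three colour channels (red, green, blue).
# Intensities between 0 and 1 are an assumption made for this sketch.
img <- array(0, dim = c(2, 2, 3))   # start with an all-black image
img[, , 1] <- 1                     # set the red channel of every pixel to 1
img[1, 1, ] <- c(1, 1, 1)           # make the top-left pixel white

dim(img)        # 2 2 3
img[1, 1, ]     # the three channel values of the top-left pixel
img[2, 2, ]     # 1 0 0: a pure red pixel
```

The first two dimensions locate the pixel, and the third dimension selects the colour channel.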
Explanation
You can create an array with the function array(). It expects a vector and uses this vector to create an object with the specified dimensions. The values are filled in column by column and layer by layer; if the vector is shorter than the array, its values are recycled. The dimensions of an array can be displayed and set with the function dim(). The sample session below illustrates this.
Sample session
> A <- array(1:12, dim=c(2,3,2)); A
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

> dim(A) <- c(3,2,2); A
, , 1

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

, , 2

     [,1] [,2]
[1,]    7   10
[2,]    8   11
[3,]    9   12

> A <- array(1:2, c(2,3,2)); A
, , 1

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    2    2

, , 2

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    2    2
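The fill order seen in the session above can be checked directly. The following sketch assumes the first array from the session; the helper vector_position() is hypothetical and written only for this illustration. It computes where the element in row i, column j, layer k sits in the underlying vector.

```r
A <- array(1:12, dim = c(2, 3, 2))
d <- dim(A)

# Position in the underlying vector of the element in row i, column j, layer k
# (hypothetical helper written for this illustration).
vector_position <- function(i, j, k, d) {
  i + (j - 1) * d[1] + (k - 1) * d[1] * d[2]
}

vector_position(2, 1, 2, d)                    # 8
A[2, 1, 2] == A[vector_position(2, 1, 2, d)]   # TRUE

# An array always holds prod(dim) elements.
length(A) == prod(dim(A))                      # TRUE
```

This is also why dim(A) <- c(3,2,2) only rearranges the same twelve values: the underlying vector stays the same and only the way it is cut into rows, columns, and layers changes.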
Consider three children, Christie, Stephan, and Robin, whose height and weight you track for 4 consecutive years (ages 5 through 8). Each year you record a data frame of the following format, where the height is stored in the column Length:
Name     | Length | Weight |
Christie |        |        |
Stephan  |        |        |
Robin    |        |        |
In the end you want the data to be recorded in a \(3\times 2\times 4\) array.
First you enter the four data frames:
> df5 <- data.frame(Name = c("Christie", "Stephan", "Robin"),
+ Length = c(110, 111, 109),
+ Weight = c(15.5, 16, 15))
> df6 <- data.frame(Name = c("Christie", "Stephan", "Robin"),
+ Length = c(117, 118, 117),
+ Weight = c(20, 21, 20.5))
> df7 <- data.frame(Name = c("Christie", "Stephan", "Robin"),
+ Length = c(125, 124, 124),
+ Weight = c(24.5, 24, 24))
> df8 <- data.frame(Name = c("Christie", "Stephan", "Robin"),
+ Length = c(132, 131, 132),
+ Weight = c(27, 26, 26.5))
Next you create a data frame that combines these four data frames, but without the first column (Name) of each frame. Repeated column names automatically get a numeric suffix:
> df <- data.frame(df5[,-1], df6[,-1], df7[,-1], df8[,-1])
> df
  Length Weight Length.1 Weight.1 Length.2 Weight.2 Length.3 Weight.3
1    110   15.5      117     20.0      125     24.5      132     27.0
2    111   16.0      118     21.0      124     24.0      131     26.0
3    109   15.0      117     20.5      124     24.0      132     26.5
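The trick is that you now get all values in one sequence via the function unlist(), which concatenates the columns of the data frame one after another, and then rearrange these numbers into a three-dimensional array. Because array() also fills its result column by column, each pair of Length and Weight columns ends up in its own layer.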
> array(unlist(df), dim=c(3,2,4))
, , 1

     [,1] [,2]
[1,]  110 15.5
[2,]  111 16.0
[3,]  109 15.0

, , 2

     [,1] [,2]
[1,]  117 20.0
[2,]  118 21.0
[3,]  117 20.5

, , 3

     [,1] [,2]
[1,]  125 24.5
[2,]  124 24.0
[3,]  124 24.0

, , 4

     [,1] [,2]
[1,]  132 27.0
[2,]  131 26.0
[3,]  132 26.5
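The resulting array no longer shows which row belongs to which child or which layer belongs to which age. One optional refinement, not part of the example above, is to attach labels through the dimnames argument of array(). The sketch below enters the same numbers directly as one vector; the object name growth and the dimension labels Name, Measure, and Age are assumptions chosen for this illustration.

```r
# Same data as above, entered directly as one vector in column-major order:
# for each age, first the three lengths, then the three weights.
growth <- array(
  c(110, 111, 109, 15.5, 16, 15,      # age 5
    117, 118, 117, 20, 21, 20.5,      # age 6
    125, 124, 124, 24.5, 24, 24,      # age 7
    132, 131, 132, 27, 26, 26.5),     # age 8
  dim = c(3, 2, 4),
  dimnames = list(Name = c("Christie", "Stephan", "Robin"),
                  Measure = c("Length", "Weight"),
                  Age = c("5", "6", "7", "8"))
)

growth["Christie", "Length", ]   # Christie's length at ages 5 to 8
growth[, "Weight", "8"]          # all weights at age 8
```

With labels in place, elements can be selected by name instead of by position, which makes it harder to mix up the dimensions.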
Explanation
Like vectors and matrices, arrays are subscriptable and mutable objects. This means that the components of an array (also called elements) can be accessed via a positional index, where indexing starts at 1, and that you can change an array by assigning a value to a single component. Everything is just a generalisation of what you learned for matrices towards more dimensions: you give one index per dimension, and leaving an index empty selects everything along that dimension.
Have a close look at the examples in the sample session below.
Sample session
> A <- array(1:12, dim=c(2,3,2)); A
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

> A[2,1,2]            # selection of one element
[1] 8
> A[2,3,2] <- 15; A   # change of one element
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   15

> A[ , 3, ]           # extraction of a specific part
     [,1] [,2]
[1,]    5   11
[2,]    6   15
> A[1, 2, ]           # extraction of values along one dimension
[1] 3 9
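The last two extractions return a matrix and a plain vector because R drops dimensions of length one by default. The following sketch, which starts again from the unmodified array 1:12, shows how the drop argument of the subscript operator keeps all three dimensions, and how a whole layer can be changed in one assignment.

```r
A <- array(1:12, dim = c(2, 3, 2))

dim(A[, 3, ])                  # 2 2: the column dimension is dropped
dim(A[, 3, , drop = FALSE])    # 2 1 2: all three dimensions are kept

A[, 2:3, 1]                    # several columns of the first layer
A[, , 1] <- 0                  # assign to a whole layer at once
sum(A)                         # 57: only the second layer (7 to 12) remains
```

Keeping the dimensions with drop = FALSE can be useful when later code expects a three-dimensional object regardless of how many columns were selected.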