Eddie Cantor said: “When I see the Ten Most Wanted lists, I always have this thought: If we’d made them feel wanted earlier, they wouldn’t be wanted now.” In a more analytical vein, lists — along with matrices and arrays — are data structures that have powerful capabilities as data containers. Lists store character strings, variables, numbers, expressions and matrices. Matrices store numbers in a rectangular array allowing for calculations using matrix algebra. A vector can be thought of as a matrix (or array) in one dimension. An array is a broader term used to describe data stored in various dimensions, such as character strings or numbers in one dimension, numbers in two dimensions, or data tables in three dimensions. There is also a type of array called an associative array, which functions as a lookup table.
The purpose of any of these data structures is to simplify programming and calculations. Individual members of a data structure are identified through indices or subscripts.
A number of operations can be performed on a list:
- Members can be assigned to the list.
- Once a list is created, members can be removed from it or new members assigned to it.
- The list can be sorted.
- A specific member or group of members of the list can be located according to various criteria.
- The number of members in the list can be counted. This proves useful, since it allows iteration over the list, such as in a loop where an operation is performed on each item in sequence (a brief sketch follows this list).
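As a minimal sketch, the snippet below illustrates these operations with Python's built-in list type; the values are arbitrary examples, not taken from the article.

```python
# A minimal sketch of list operations using Python's built-in list type.
items = ["a", 5, 3 ** 0.5]               # assign members of mixed types to the list
items.append(42)                          # add a new member
items.remove(5)                           # remove a specific member

numbers = [20, 5, 15, 10]
numbers.sort()                            # sort the list in place -> [5, 10, 15, 20]
large = [n for n in numbers if n > 8]     # locate members matching a criterion

print(len(numbers))                       # number of members, useful for iteration
for n in numbers:                         # perform an operation on each item in sequence
    print(n * 2)
```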
A matrix is composed of numbers in two dimensions: rows and columns. It is similar to a data table composed only of numbers. A member is identified by two indices, one for the row and another for the column. Entire rows or columns can be selected, deleted or concatenated, and comparisons or inquiries can be performed on them. Functions, such as minimums, maximums, ranges or standard deviations, can be applied to individual rows or columns. The chief use of matrices, however, is in operations governed by linear algebra, which are computationally efficient. Two matrices are equal if they are of the same size and each pair of corresponding members is equal. If they are of the same size, matrices can be added; addition is commutative, meaning it can be performed in either order. A matrix can be multiplied by a scalar (a single number) by multiplying each member by the scalar. Two matrices can be multiplied if the number of columns of the first matrix equals the number of rows of the second matrix; matrix multiplication is not commutative. A matrix can be transposed by turning its rows into columns and vice versa, denoted Aᵀ. A matrix equal to its transpose is said to be a symmetric matrix.
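As a brief sketch (using Python with NumPy, which the article does not prescribe; the matrices are arbitrary examples), these operations look like this:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(np.array_equal(A, B))              # equality: same size and every member equal -> False
print(A + B)                             # addition of same-sized matrices
print(np.array_equal(A + B, B + A))      # addition is commutative -> True
print(3 * A)                             # scalar multiplication: each member times 3
print(A @ B)                             # matrix multiplication (columns of A == rows of B)
print(np.array_equal(A @ B, B @ A))      # multiplication is not commutative -> False
print(A.T)                               # transpose: rows become columns
S = A + A.T
print(np.array_equal(S, S.T))            # S equals its transpose, so it is symmetric -> True
```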
An important type of matrix is the identity matrix (I), which is a square matrix with ones along the diagonal and zeros elsewhere. If there is a matrix B such that A●B = I, then B is said to be the inverse of A, denoted A⁻¹. The trace of a square matrix is the sum of the elements on the diagonal from upper left to lower right. It has several properties:
- The trace of the sum of two matrices equals the sum of their traces.
- The trace of a scalar multiplied by a matrix equals the scalar multiplied by the trace of the matrix.
- The trace of the product of two matrices is commutative, i.e., tr(A●B) = tr(B●A). (A brief numerical check follows this list.)
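As a hedged sketch (again with NumPy; the matrices are arbitrary examples), these properties of the identity, the inverse and the trace can be verified numerically:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[0.0, 4.0], [5.0, 6.0]])

I = np.eye(2)                                    # identity matrix: ones on the diagonal
A_inv = np.linalg.inv(A)                         # inverse of A
print(np.allclose(A @ A_inv, I))                 # A times its inverse gives I -> True

print(np.isclose(np.trace(A + B), np.trace(A) + np.trace(B)))   # trace of a sum = sum of traces
print(np.isclose(np.trace(5 * A), 5 * np.trace(A)))             # trace of scalar*A = scalar*trace
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))             # tr(A·B) = tr(B·A)
```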
For a square matrix, the trace equals the sum of the eigenvalues. Eigenvalue decomposition is used in principal components and canonical correlations. Another important property of a matrix is its determinant. For a 2×2 matrix A with first row (a, b) and second row (c, d), det(A) = ad – bc. A matrix is invertible if its determinant is nonzero. An important relationship is that the determinant of the product of two square matrices equals the product of their determinants, i.e., det(A●B) = det(A)●det(B). For the calculation of eigenvalues and eigenvectors, if A is a square matrix with eigenvector v and eigenvalue λ satisfying Av = λv, then the eigenvalues are found from det(A – λI) = 0.
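A minimal numerical sketch of these determinant and eigenvalue relationships (NumPy again; the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[4.0, 0.0], [2.0, 1.0]])

print(np.linalg.det(A))                           # 2x2 determinant: ad - bc = 2*3 - 1*1 = 5
print(np.isclose(np.linalg.det(A @ B),            # det(A·B) = det(A)·det(B)
                 np.linalg.det(A) * np.linalg.det(B)))

eigvals, eigvecs = np.linalg.eig(A)               # solve A·v = lambda·v
print(np.isclose(np.trace(A), eigvals.sum()))     # trace equals the sum of the eigenvalues
lam = eigvals[0]
print(np.isclose(np.linalg.det(A - lam * np.eye(2)), 0.0))   # det(A - lambda·I) = 0
```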
Matrices are used to represent a series of algebraic equations that are solved by linear algebra. An equation A●x = b can be used, where A is a matrix of constants, x is a vector of unknowns, and b is a vector of constants. The rank of A is the maximum number of linearly independent row vectors. To solve the equation, it can be rearranged to x = A⁻¹●b, provided A is invertible.
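A sketch of solving such a system (the 2×2 system below is an invented example), both by the rearrangement x = A⁻¹●b and by NumPy's direct solver:

```python
import numpy as np

# Represents the system:  3x + 2y = 12
#                          x + 4y = 14
A = np.array([[3.0, 2.0], [1.0, 4.0]])
b = np.array([12.0, 14.0])

print(np.linalg.matrix_rank(A))         # rank: number of linearly independent rows (2 here)

x_inv = np.linalg.inv(A) @ b            # x = A^-1 · b, valid because A is invertible
x_solve = np.linalg.solve(A, b)         # the usual numerical route: solve A·x = b directly
print(x_inv, x_solve)                   # both give x = 2, y = 3
print(np.allclose(x_inv, x_solve))      # True
```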
Often, certain operations are difficult to perform on a matrix in its existing form, and various matrix decompositions are performed to put it into a more accessible form. An example is the use of the Cholesky decomposition to obtain eigenvalues. Singular value decomposition is used in correspondence analysis. The goal of these techniques is to preserve certain properties of a matrix, such as rank, determinant or inverse, so these quantities can be calculated after transformation.
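As a hedged illustration (NumPy; the specific algorithms mentioned above are not reproduced here, and the matrix is an arbitrary symmetric positive-definite example), the Cholesky and singular value decompositions can be computed and checked against the properties they preserve:

```python
import numpy as np

A = np.array([[4.0, 2.0], [2.0, 3.0]])           # symmetric positive-definite example

L = np.linalg.cholesky(A)                        # Cholesky: A = L·Lᵀ with L lower-triangular
print(np.allclose(L @ L.T, A))                   # the decomposition reproduces A -> True

U, s, Vt = np.linalg.svd(A)                      # singular value decomposition: A = U·diag(s)·Vᵀ
print(np.allclose(U @ np.diag(s) @ Vt, A))       # True
print(np.isclose(np.prod(s), abs(np.linalg.det(A))))            # singular values preserve |det(A)|
print(np.count_nonzero(s > 1e-12) == np.linalg.matrix_rank(A))  # and the rank
```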
A specific type of array, called an associative array, can be used as a hash table, map or dictionary. Each entry contains two elements, a key and a value, which are linked. The values can be numbers, matrices, lists and so on. An example would be an array of U.S. states and their capitals. The elements of the array can be iterated over, unique elements can be identified, and specific elements can be counted. For large lists, this data structure can be very efficient.
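A minimal sketch of an associative array, using a Python dictionary with states as keys and capitals as values (mirroring the example in Table 1):

```python
# A minimal sketch of an associative array using a Python dict:
# keys are U.S. states, values are their capitals.
capitals = {"Alabama": "Montgomery", "Alaska": "Juneau", "Arizona": "Phoenix"}

print(capitals["Alaska"])                   # look up a value by its key -> "Juneau"
capitals["Arkansas"] = "Little Rock"        # add a new key-value pair

for state, capital in capitals.items():     # iterate over the elements
    print(state, capital)

print(len(capitals))                        # count the elements
print("Phoenix" in capitals.values())       # test whether a specific value is present
```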
A comparison of several data structures is shown in Table 1.
Table 1: Comparison of Data Structures
Data Structure | Components | Example
List | Character strings, variables, numbers, expressions, matrices | {a, 5, sqrt(3)}
Vector | Numbers in one dimension | [1 5 10 15 20 25]
Matrix | Numbers in two dimensions (rows and columns) | [1 3 5, 6 8 10, 11 13 15]
Associative Array | Key-value pairs | {{"Alabama", "Alaska", "Arizona"}, {"Montgomery", "Juneau", "Phoenix"}}
Array | Data in one or more dimensions | {sqrt(3), sqrt(5), sqrt(7)}
Data Table | Rows and columns of data, not limited to numbers |
Lists, matrices and arrays have many applications in areas such as game theory, graphics, probability, statistics and electronics. They are used to simplify programming and calculations.
Mark Anawis is a Principal Scientist and ASQ Six Sigma Black Belt at Abbott. He may be reached at [email protected].