The singular value decomposition can also be used to produce an approximation to a given \(n \times d\) matrix, \(\mathbf{X}\). For example, let the singular value decomposition of \(\mathbf{X}\) be \[\mathbf{X}= \mathbf{U} \mathbf{D}_{\lambda} {\mathbf{V}}^{\mkern-1.5mu\mathsf{T}} \] where \(\mathbf{U} = [\mathbf{U}_1, \mathbf{U}_2, \ldots , \mathbf{U}_d]\), \(\mathbf{D}_{\lambda} = diag(\lambda_1, \lambda_2, \ldots , \lambda_d)\), and \(\mathbf{V} = [\mathbf{v}_1, \mathbf{v}_2, \ldots , \mathbf{v}_d]\).
Suppose we choose \(r < d\). Then the matrix \[\widehat{\mathbf{X}}= \sum_{i=1}^r \lambda_i \mathbf{U}_i {\mathbf{v}_i}^{\mkern-1.5mu\mathsf{T}} = \widehat{\mathbf{U}} \widehat{\mathbf{D}}_{\lambda} {\widehat{\mathbf{V}}}^{\mkern-1.5mu\mathsf{T}} \]
where \(\widehat{\mathbf{U}}=[\mathbf{U}_1, \mathbf{U}_2, \ldots , \mathbf{U}_r]\), \(\widehat{\mathbf{D}}_{\lambda} = diag(\lambda_1, \lambda_2, \ldots , \lambda_r)\), and \(\widehat{\mathbf{V}} = [\mathbf{v}_1, \mathbf{v}_2, \ldots , \mathbf{v}_r]\) can be regarded as an approximation to the original matrix \(\mathbf{X}\).
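As a quick sanity check, the two forms of the rank-\(r\) approximation (the sum of rank-one terms and the truncated matrix product) should agree. The sketch below verifies this on a small random matrix; the names `X`, `s`, and `r` are illustrative, not taken from the example data used later.

```r
# Sketch: on a small random matrix, confirm that the sum of the first r
# rank-one terms lambda_i * U_i v_i^T equals U_hat D_hat V_hat^T.
set.seed(1)
X <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)
s <- svd(X)
r <- 2

# Rank-one form: sum over the first r singular triplets
X_sum <- matrix(0, nrow(X), ncol(X))
for (i in 1:r) {
  X_sum <- X_sum + s$d[i] * s$u[, i] %*% t(s$v[, i])
}

# Truncated-product form
X_hat <- s$u[, 1:r] %*% diag(s$d[1:r]) %*% t(s$v[, 1:r])

all.equal(X_sum, X_hat)  # TRUE up to floating-point tolerance
```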
Below, I am simply going to investigate the quality of this approximation on a particular example.
The original matrix \(\mathbf{X}\) requires \(n \times d\) numbers to be stored in an array. The approximation, \(\widehat{\mathbf{X}}\), contains just as many entries, but it need not be stored this way. Instead, only the numbers contained in the matrices \(\widehat{\mathbf{U}}\), \(\widehat{\mathbf{D}}_{\lambda}\), and \(\widehat{\mathbf{V}}\) need to be stored.
\(\widehat{\mathbf{U}}\) has dimension \(n\times r\), \(\widehat{\mathbf{D}}_{\lambda}\) has dimension \(r\times r\), and \(\widehat{\mathbf{V}}\) has dimension \(d \times r\). However, \(\widehat{\mathbf{D}}_{\lambda}\), being a diagonal matrix, has only \(r\) non-zero entries.
Therefore, the total number of values that must be stored to reproduce \(\widehat{\mathbf{X}}\) is \(n\times r + r + d \times r\).
\(\mathbf{X}\) has dimension \(n \times d\), so storing it requires \(n \times d\) numbers.
The approximation will require fewer numbers than the original matrix if \(n\times r + r + d \times r < n \times d\), i.e., if \(r < \frac{n \times d}{n + d + 1}\).
For example, for \(n = 147\) and \(d = 104\), this gives \(r < 60.666\). Therefore, \(r\) can be at most 60 for the approximation to require fewer numbers to be stored.
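The break-even point can be checked directly from the inequality \(n\times r + r + d \times r < n \times d\):

```r
# Sketch: largest r for which the truncated SVD stores fewer numbers
# than the original 147 x 104 matrix.
n <- 147
d <- 104
r_break <- n * d / (n + d + 1)  # 15288 / 252 = 60.666...
r_max <- floor(r_break)         # largest integer r satisfying the strict inequality

r_max                                      # 60
(n * r_max + r_max + d * r_max) < n * d    # TRUE: r = 60 still saves space
```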
Below, I calculate the singular value decomposition of bb and plot the corresponding values of \(\lambda_i / \lambda_1\) versus \(i\).
source("bb.R")
bb <- matrix(scan("bbdata.R", sep = ","),
             nrow = 147, ncol = 104, byrow = TRUE)
svd_values <- svd(bb)
singular_values <- svd_values$d
# Normalize by the first singular value
normalized_singular_values <- singular_values / singular_values[1]
# Plot the normalized singular values against their index
plot(normalized_singular_values, type = "b", xlab = "i",
     ylab = expression(lambda[i] / lambda[1]),
     main = "Normalized Singular Values")
Based on this plot, I would guess that a value of \(r = 15\) might be enough to provide an adequate approximation, as the marginal benefit of further increasing the dimension is low.
Suppose that an approximation, newbb (or \(\widehat{\mathbf{X}}\)), of bb (or \(\mathbf{X}\)) is constructed based on dim (or \(r\)) dimensions.
Below, I produce five such approximations (for \(r=5, 10, 15, 20, 40\)) and display them in a single plot, along with a picture of the original matrix.
par(mfrow = c(2, 3))
breaks <- seq(0, 256, 1)
cols <- heat.colors(256, alpha = 0.75)
col_areas(bb, breaks, cols, main = "Original Matrix")
# Reconstruct and plot the rank-r approximation of bb
plot_approximation <- function(r) {
  U_r <- svd_values$u[, 1:r]
  D_r <- diag(svd_values$d[1:r])
  V_r <- svd_values$v[, 1:r]
  newbb <- U_r %*% D_r %*% t(V_r)
  col_areas(newbb, breaks, cols, main = paste("Number of dimensions =", r))
}
for (r in c(5, 10, 15, 20, 40)) {
  plot_approximation(r)
}
I would choose \(r = 15\), as it reconstructs the finer structures, such as the eyes, while using the least amount of space.
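The visual comparison can be supplemented with a numerical one: the relative Frobenius error \(\lVert\mathbf{X} - \widehat{\mathbf{X}}\rVert_F / \lVert\mathbf{X}\rVert_F\) for each \(r\). Since the bbdata.R file is not reproduced here, the sketch below uses a random matrix of the same dimensions as a stand-in for bb, and the helper name `rel_err` is my own.

```r
# Sketch: relative Frobenius error of the rank-r approximation,
# using a synthetic 147 x 104 matrix in place of bb.
set.seed(1)
X <- matrix(rnorm(147 * 104), nrow = 147, ncol = 104)
s <- svd(X)

rel_err <- function(r) {
  X_hat <- s$u[, 1:r] %*% diag(s$d[1:r]) %*% t(s$v[, 1:r])
  norm(X - X_hat, "F") / norm(X, "F")
}

# Error shrinks monotonically as r grows
sapply(c(5, 10, 15, 20, 40), rel_err)
```

For real image data, whose singular values decay quickly, the error at \(r = 15\) would typically be much smaller than for this random stand-in.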
Original space: \(147 \times 104 = 15288\). Approximation space: \(147 \times 15 + 15 + 104 \times 15 = 3780\).
Therefore, the space saved by choosing \(r = 15\) is \(15288 - 3780 = 11508\).