Singular Value Decomposition with TensorFlow 2.0
Mukesh Mithrakumar ・5 min read
The singular value decomposition (SVD) provides another way to factorize a matrix into singular vectors and singular values. The SVD reveals much of the same information as the eigendecomposition, but it is more generally applicable: every real matrix has a singular value decomposition, whereas the same is not true of the eigendecomposition. The SVD is written as:
A = UDV^{T}
Suppose A is an m x n matrix. Then U is defined to be an m x m rotation matrix, D an m x n scaling-and-projecting matrix, and V an n x n rotation matrix.
Each of these matrices has a special structure. The matrices U and V are both orthogonal, meaning U^{T} = U^{-1} and V^{T} = V^{-1}. The matrix D is diagonal, though not necessarily square. The elements along the diagonal of D are known as the singular values of the matrix A. The columns of U are known as the left-singular vectors, and the columns of V as the right-singular vectors.
# mxn matrix A
svd_matrix_A = tf.constant([[2, 3], [4, 5], [6, 7]], dtype=tf.float32)
print("Matrix A: \n{}\n".format(svd_matrix_A))
# Using tf.linalg.svd to calculate the singular value decomposition, where d holds the singular values, u is the matrix U, and v is the matrix V (note that tf.linalg.svd returns V itself, not V^T)
d, u, v = tf.linalg.svd(svd_matrix_A, full_matrices=True, compute_uv=True)
print("Singular values d: \n{} \n\nMatrix U: \n{} \n\nMatrix V: \n{}".format(d, u, v))
Matrix A:
[[2. 3.]
[4. 5.]
[6. 7.]]
Singular values d:
[11.782492 0.41578525]
Matrix U:
[[ 0.30449855 -0.86058956 0.40824753]
[ 0.54340035 -0.19506174 -0.81649673]
[ 0.78230214 0.47046405 0.40824872]]
Matrix V:
[[ 0.63453555 0.7728936 ]
[ 0.7728936 -0.63453555]]
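Since U and V are defined to be orthogonal, we can verify that property numerically: U^{T}U should come out as the 3x3 identity and V^{T}V as the 2x2 identity. A minimal check, recomputing the decomposition of the same matrix A:

```python
import tensorflow as tf

A = tf.constant([[2, 3], [4, 5], [6, 7]], dtype=tf.float32)
d, u, v = tf.linalg.svd(A, full_matrices=True, compute_uv=True)

# U^T U should be the 3x3 identity, V^T V the 2x2 identity
utu = tf.matmul(u, u, transpose_a=True)
vtv = tf.matmul(v, v, transpose_a=True)
print(tf.round(utu))
print(tf.round(vtv))
```

Rounding hides the tiny floating-point residue (entries on the order of 1e-7) that any numerical SVD leaves behind.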
# Let's see if we can recover the original matrix from the values we have
# mxm orthogonal matrix U
svd_matrix_U = tf.constant([[0.30449855, -0.86058956, 0.40824753], [0.54340035, -0.19506174, -0.81649673], [0.78230214, 0.47046405, 0.40824872]])
print("Orthogonal Matrix U: \n{}\n".format(svd_matrix_U))
# mxn diagonal matrix D
svd_matrix_D = tf.constant([[11.782492, 0], [0, 0.41578525], [0, 0]], dtype=tf.float32)
print("Diagonal Matrix D: \n{}\n".format(svd_matrix_D))
# nxn transpose of matrix V (for this particular A, V happens to be symmetric, so V^T = V)
svd_matrix_V_trans = tf.constant([[0.63453555, 0.7728936], [0.7728936, -0.63453555]], dtype=tf.float32)
print("Transpose Matrix V: \n{}\n".format(svd_matrix_V_trans))
# UDV(^T)
svd_RHS = tf.tensordot(tf.tensordot(svd_matrix_U, svd_matrix_D, axes=1), svd_matrix_V_trans, axes=1)
predictor = tf.reduce_all(tf.equal(tf.round(svd_RHS), svd_matrix_A))
def true_print(): print("It WORKS. \nRHS: \n{} \n\nLHS: \n{}".format(tf.round(svd_RHS), svd_matrix_A))
def false_print(): print("Condition FAILED. \nRHS: \n{} \n\nLHS: \n{}".format(tf.round(svd_RHS), svd_matrix_A))
tf.cond(predictor, true_print, false_print)
Orthogonal Matrix U:
[[ 0.30449855 -0.86058956 0.40824753]
[ 0.54340035 -0.19506174 -0.81649673]
[ 0.78230214 0.47046405 0.40824872]]
Diagonal Matrix D:
[[11.782492 0. ]
[ 0. 0.41578525]
[ 0. 0. ]]
Transpose Matrix V:
[[ 0.63453555 0.7728936 ]
[ 0.7728936 -0.63453555]]
It WORKS.
RHS:
[[2. 3.]
[4. 5.]
[6. 7.]]
LHS:
[[2. 3.]
[4. 5.]
[6. 7.]]
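Instead of typing the factors back in by hand, we can rebuild A directly from what tf.linalg.svd returns. Two details matter: the singular values come back as a vector, so the m x n matrix D has to be assembled by padding diag(d) with a row of zeros, and TensorFlow returns V (not V^T, unlike NumPy's np.linalg.svd), so we transpose it during the multiplication. A sketch:

```python
import tensorflow as tf

A = tf.constant([[2, 3], [4, 5], [6, 7]], dtype=tf.float32)
d, u, v = tf.linalg.svd(A, full_matrices=True, compute_uv=True)

# Build the m x n matrix D: diag(d) is 2x2, pad one zero row below to get 3x2
D = tf.pad(tf.linalg.diag(d), [[0, 1], [0, 0]])

# A = U D V^T; transpose_b handles the V -> V^T conversion
A_rebuilt = tf.matmul(tf.matmul(u, D), v, transpose_b=True)
print(A_rebuilt)
```

This gives back A up to floating-point error, with no hard-coded values.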
Matrix A can be seen as a linear transformation. This transformation can be decomposed into three sub-transformations:
- Rotation,
- Re-scaling and projecting,
- Rotation.
These three steps correspond to the three matrices U, D, and V.
Let's see how these transformations take place, in order.
# Let's define a unit square
svd_square = tf.constant([[0, 0, 1, 1],[0, 1, 1, 0]], dtype=tf.float32)
# a new 2x2 matrix
svd_new_matrix = tf.constant([[1, 1.5], [0, 1]])
# SVD for the new matrix
new_d, new_u, new_v = tf.linalg.svd(svd_new_matrix, full_matrices=True, compute_uv=True)
# let's turn d into a diagonal matrix
new_d_matrix = tf.linalg.diag(new_d)
# tf.linalg.svd returns V, so transpose it to get V^T
new_v_trans = tf.transpose(new_v)
# Rotation: V^T applied to the unit square
plot_transform(svd_square, tf.tensordot(new_v_trans, svd_square, axes=1), "$Square$", "$V^T \cdot Square$", "Rotation", axis=[-0.5, 3.5 , -1.5, 1.5])
plt.show()
# Scaling and Projecting: DV^(T)
plot_transform(tf.tensordot(new_v_trans, svd_square, axes=1), tf.tensordot(new_d_matrix, tf.tensordot(new_v_trans, svd_square, axes=1), axes=1), "$V^T \cdot Square$", "$D \cdot V^T \cdot Square$", "Scaling and Projecting", axis=[-0.5, 3.5 , -1.5, 1.5])
plt.show()
# Second Rotation: UDV^(T)
trans_1 = tf.tensordot(tf.tensordot(new_d_matrix, new_v_trans, axes=1), svd_square, axes=1)
trans_2 = tf.tensordot(tf.tensordot(tf.tensordot(new_u, new_d_matrix, axes=1), new_v_trans, axes=1), svd_square, axes=1)
plot_transform(trans_1, trans_2, "$D \cdot V^T \cdot Square$", "$U \cdot D \cdot V^T \cdot Square$", "Second Rotation", color=['#1190FF', '#FF9A13'], axis=[-0.5, 3.5 , -1.5, 1.5])
plt.show()
<img src="https://raw.githubusercontent.com/adhiraiyan/DeepLearningWithTF2.0/master/notebooks/figures/ch02/output_79_0.png" alt="Rotation" class="center-image">
<img src="https://raw.githubusercontent.com/adhiraiyan/DeepLearningWithTF2.0/master/notebooks/figures/ch02/output_79_1.png" alt="Scaling and Projecting" class="center-image">
<img src="https://raw.githubusercontent.com/adhiraiyan/DeepLearningWithTF2.0/master/notebooks/figures/ch02/output_79_2.png" alt="Second Rotation" class="center-image">
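After the second rotation, the composition U·D·V^T should act on the square exactly like the original shear matrix. A quick numerical confirmation of that, recomposing the factors of the 2x2 matrix:

```python
import tensorflow as tf

M = tf.constant([[1, 1.5], [0, 1]], dtype=tf.float32)
s, u, v = tf.linalg.svd(M, full_matrices=True, compute_uv=True)

# Recompose U D V^T; transpose_b turns the returned V into V^T
recomposed = tf.matmul(tf.matmul(u, tf.linalg.diag(s)), v, transpose_b=True)
print(recomposed)
```

The result matches [[1, 1.5], [0, 1]] up to floating-point error, so the three sub-transformations really do compose back into the original linear map.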
The above sub-transformations can be related to eigendecompositions as follows:
- U is formed from the eigenvectors of A A^{T}
- V is formed from the eigenvectors of A^{T} A
- The singular values in D are the square roots of the eigenvalues of A^{T} A (equivalently, of the nonzero eigenvalues of A A^{T}, which are the same).
As an exercise try proving this is the case.
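As a numerical sanity check (not a proof), we can compare the squared singular values of A against the eigenvalues of A^{T} A computed with tf.linalg.eigh, which returns eigenvalues in ascending order:

```python
import tensorflow as tf

A = tf.constant([[2, 3], [4, 5], [6, 7]], dtype=tf.float32)
d, u, v = tf.linalg.svd(A, full_matrices=True, compute_uv=True)

# A^T A is symmetric, so eigh applies; eigenvalues come back ascending
eigvals, eigvecs = tf.linalg.eigh(tf.matmul(A, A, transpose_a=True))

# Squared singular values, sorted ascending to match eigh's ordering
print(tf.sort(d ** 2))
print(eigvals)
```

The two printed vectors agree up to floating-point error, consistent with the correspondence above.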
Perhaps the most useful feature of the SVD is that we can use it to partially generalize matrix inversion to nonsquare matrices, as we will see in the next section.
This is section eight of the chapter on Linear Algebra with TensorFlow 2.0 of the book Deep Learning with TensorFlow 2.0.
You can read this section and the following topics:
02.01 — Scalars, Vectors, Matrices, and Tensors
02.02 — Multiplying Matrices and Vectors
02.03 — Identity and Inverse Matrices
02.04 — Linear Dependence and Span
02.05 — Norms
02.06 — Special Kinds of Matrices and Vectors
02.07 — Eigendecomposition
02.08 — Singular Value Decomposition
02.09 — The Moore-Penrose Pseudoinverse
02.10 — The Trace Operator
02.11 — The Determinant
02.12 — Example: Principal Components Analysis
at Deep Learning With TF 2.0: 02.00- Linear Algebra. You can get the code for this article and the rest of the chapter here. Links to the notebook in Google Colab and Jupyter Binder are at the end of the notebook.