Terra

Posted on Jul 10, 2024 • Originally published at pourterra.com

Mathematics for Machine Learning - Day 3

#machinelearning #learning #tutorial #beginners

Inverse and Transpose

Don't feel bad! Even a matrix, a widely used object in mathematics, engineering, and computer science, can still be single. But how can a matrix be single?

It's actually called a singular, or non-invertible, matrix. A square matrix is singular when it doesn't have an inverse! That happens exactly when its determinant is zero (or because of its lack of personality and its reliance on materialistic tendencies instead of working on itself!).

Inverse

Take matrix A for example:

A = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}

Matrix A has two rows and two columns. For a 2x2 matrix, the inverse formula is the following, where A' is the adjugate of A:

A^{-1} = \frac{1}{\det(A)} A'

We can calculate the inverse as below:

A^{-1} = \frac{1}{(2 \cdot 1) - (4 \cdot 3)} \begin{pmatrix} 1 & -4 \\ -3 & 2 \end{pmatrix}

Then we simplify the determinant:

A^{-1} = \frac{1}{-10} \begin{pmatrix} 1 & -4 \\ -3 & 2 \end{pmatrix}

Finally, we divide each entry of the matrix by the determinant!

A^{-1} = \begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}
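The calculation above can be sketched in a few lines of Python. This is a minimal sketch rather than a library routine: the helper names `det2`, `adjugate2`, and `inverse2` are made up for illustration, and `fractions.Fraction` is used so the arithmetic stays exact.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c


def adjugate2(m):
    """Adjugate of a 2x2 matrix: swap the diagonal, negate the off-diagonal."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]


def inverse2(m):
    """Inverse via adjugate / determinant; fails for singular matrices."""
    det = det2(m)
    if det == 0:
        raise ValueError("matrix is singular (determinant is zero)")
    return [[Fraction(x) / det for x in row] for row in adjugate2(m)]


A = [[2, 4], [3, 1]]
A_inv = inverse2(A)
print([[float(x) for x in row] for row in A_inv])
# [[-0.1, 0.4], [0.3, -0.2]]
```

Calling `inverse2([[1, 2], [2, 4]])` raises the `ValueError`, because that matrix has determinant zero and is therefore singular.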

What was that? I don't understand anything

Let's go step by step. Our goal is to prove this equation:

A \cdot A' = \det(A) \cdot I

This equation is just the inverse formula with both sides multiplied by A and by the determinant, which leaves the identity matrix on the right. Working through it also shows you the basics of the 2x2 adjugate matrix (A'), which is sometimes called the classical adjoint.

Given:

A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}

The adjugate of (A) is:

A' = \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}

The determinant of (A) is:

\det(A) = a_{11}a_{22} - a_{12}a_{21}

Now, compute A \cdot A':

A \cdot A' = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}

Perform the matrix multiplication:

= \begin{pmatrix} (a_{11}a_{22} - a_{12}a_{21}) & (a_{11}(-a_{12}) + a_{12}a_{11}) \\ (a_{21}a_{22} - a_{22}a_{21}) & (a_{21}(-a_{12}) + a_{22}a_{11}) \end{pmatrix}

Simplify the elements:

= \begin{pmatrix} \det(A) & 0 \\ 0 & \det(A) \end{pmatrix}

Thus:

A \cdot A' = \det(A) \cdot I
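The identity just derived can be spot-checked numerically. The snippet below is a small sketch with hypothetical helper names (`matmul2`, `adjugate2`, `det2`, `check_identity`); it tests a few integer matrices, including a singular one, since the identity does not need A to be invertible.

```python
def matmul2(x, y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjugate2(m):
    """Adjugate of a 2x2 matrix."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def check_identity(m):
    """True when m * adj(m) equals det(m) times the identity matrix."""
    d = det2(m)
    return matmul2(m, adjugate2(m)) == [[d, 0], [0, d]]


# The second sample is singular (determinant zero); the identity still holds.
samples = [[[2, 4], [3, 1]], [[1, 2], [2, 4]], [[0, -5], [7, 3]]]
print(all(check_identity(m) for m in samples))  # True
```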

Properties

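These are the properties of the inverse and the transpose, assuming A and B are invertible wherever an inverse appears. Most of them look like the rules you already know from ordinary arithmetic. The things to watch are that the order of the factors flips for the inverse and the transpose of a product, and that the inverse of a sum is not the sum of the inverses. Remember those and you're all set!

A A^{-1} = I = A^{-1} A \\ (AB)^{-1} = B^{-1} A^{-1} \\ (A + B)^{-1} \neq A^{-1} + B^{-1} \\ (A^T)^T = A \\ (A + B)^T = A^T + B^T \\ (AB)^T = B^T A^T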

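If A = A^T, then A is a symmetric matrix. Only square matrices can be symmetric:

A \in \mathbb{R}^{m \times n}, \; A = A^T \implies m = n

If A is invertible, then so is A^T, and the inverse of the transpose equals the transpose of the inverse. Both are written A^{-T}:

A^{-T} := (A^{-1})^T = (A^T)^{-1}

If A is also symmetric, then its inverse is symmetric too:

A = A^T \implies A^{-1} = A^{-T}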
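The inverse and transpose properties listed above can be spot-checked on small examples. This is only a sketch: the matrices `A`, `B`, and `S` are arbitrary choices, the helper names are hypothetical, and a passing check on a few examples illustrates the rules without proving them.

```python
from fractions import Fraction


def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def inverse2(m):
    """Exact inverse of an invertible 2x2 matrix via the adjugate."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[Fraction(v) / det for v in row] for row in [[d, -b], [-c, a]]]


A = [[2, 4], [3, 1]]
B = [[1, 2], [0, 1]]
S = [[2, 1], [1, 3]]  # symmetric example
I = [[1, 0], [0, 1]]

checks = {
    "A A^-1 = I = A^-1 A": matmul(A, inverse2(A)) == I == matmul(inverse2(A), A),
    "(AB)^-1 = B^-1 A^-1": inverse2(matmul(A, B)) == matmul(inverse2(B), inverse2(A)),
    "(A+B)^-1 != A^-1 + B^-1": inverse2(add(A, B)) != add(inverse2(A), inverse2(B)),
    "(A^T)^T = A": transpose(transpose(A)) == A,
    "(A+B)^T = A^T + B^T": transpose(add(A, B)) == add(transpose(A), transpose(B)),
    "(AB)^T = B^T A^T": transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)),
    "(A^-1)^T = (A^T)^-1": transpose(inverse2(A)) == inverse2(transpose(A)),
    "S symmetric => S^-1 symmetric": inverse2(S) == transpose(inverse2(S)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Every line should print `True` for these particular matrices.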

Specific Case: Multiplication by a scalar

Though these properties are the same as the ones for multiplying by a matrix (associativity and distributivity), it's best to drill them one more time.

Given:

C := \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}

Consider (\lambda + \psi)C:

(\lambda + \psi)C = \begin{pmatrix} (\lambda + \psi) & 2(\lambda + \psi) \\ 3(\lambda + \psi) & 4(\lambda + \psi) \end{pmatrix}

Distribute (\lambda) and (\psi):

= \begin{pmatrix} \lambda + \psi & 2\lambda + 2\psi \\ 3\lambda + 3\psi & 4\lambda + 4\psi \end{pmatrix}

Separate terms:

= \begin{pmatrix} \lambda & 2\lambda \\ 3\lambda & 4\lambda \end{pmatrix} + \begin{pmatrix} \psi & 2\psi \\ 3\psi & 4\psi \end{pmatrix}

Factor out (\lambda) and (\psi):

= \lambda \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + \psi \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}

Express in terms of (C):

= \lambda C + \psi C
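The same steps can be checked with concrete numbers. In this small sketch the values chosen for `lam` and `psi` are arbitrary, and `scale` and `add` are hypothetical helpers written just for this example.

```python
def scale(s, m):
    """Multiply every entry of matrix m by the scalar s."""
    return [[s * x for x in row] for row in m]


def add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(r, t)] for r, t in zip(x, y)]


C = [[1, 2], [3, 4]]
lam, psi = 3, -5  # arbitrary scalars chosen for illustration

# Distributivity: (lam + psi) C = lam C + psi C
distributive = scale(lam + psi, C) == add(scale(lam, C), scale(psi, C))

# Associativity: (lam * psi) C = lam (psi C)
associative = scale(lam * psi, C) == scale(lam, scale(psi, C))

print(distributive, associative)  # True True
```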

Compact Representation of System of Linear Equations

System of Linear Equations

This section shows that a large system of equations can be written in a much more compact form. That takes away the load of seeing so many numbers everywhere, so we can focus on the problem at hand, which in this case is finding x.

\begin{align*} (1) \quad & 2x_1 + 3x_2 + 5x_3 = 1 \\ (2) \quad & 4x_1 - 2x_2 - 7x_3 = 8 \\ (3) \quad & 9x_1 + 5x_2 - 3x_3 = 2 \end{align*}

The matrix form:

\begin{pmatrix} 2 & 3 & 5 \\ 4 & -2 & -7 \\ 9 & 5 & -3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 8 \\ 2 \end{pmatrix}

which is equivalent to:

\sum_{j=1}^{3} A_{ij} x_j = b_i \textit{ for i = 1, 2, 3}
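To make the compact form concrete, here is a hedged sketch in Python. The function `matvec` computes each row of A times x as a sum over the columns. To have an x worth checking, the sketch also includes a small `solve` helper based on Gauss-Jordan elimination; that method is not covered in this post and is only an assumption used for illustration. Both function names are hypothetical.

```python
from fractions import Fraction

A = [[2, 3, 5], [4, -2, -7], [9, 5, -3]]
b = [1, 8, 2]


def matvec(a, x):
    """Entry i of the result is the sum over j of a[i][j] * x[j]."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def solve(a, rhs):
    """Gauss-Jordan elimination with exact fractions.

    Assumes a is square and invertible; a singular matrix makes the
    pivot search fail with StopIteration.
    """
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(a, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot entry becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


x = solve(A, b)
print(x)
print(matvec(A, x) == b)  # True
```

The final check is the point of the compact form: whatever x is, plugging it into `matvec(A, x)` must reproduce the right-hand side b.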

Acknowledgement

I can't overstate this: I'm truly grateful that this book is open-sourced for everyone. Many people will be able to learn and understand machine learning on a fundamental level. Whether you're changing careers, demystifying AI, or just learning in general, this book offers immense value, even for a fledgling composer such as myself. So, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, thank you for this book.

Source:
Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge: Cambridge University Press.
https://mml-book.com
