Matrix multiplication is somewhat more complicated than matrix addition because it does not work componentwise, and because the size and the order of the matrices in a matrix multiplication matter.
We define the product \(A\,B\) of two matrices \(A\) and \(B\) only if every row of \(A\) is as long as each column of \(B\), that is, if \(A\) has as many columns as \(B\) has rows. So if \(A\) is an \(m\times n\) matrix, then \(B\) needs to be an \(n\times p\) matrix for some \(p\). If this is the case, then the matrix product \(A\,B\) is an \(m\times p\) matrix. The element \(c_{ij}\) in the \(i\)-th row and the \(j\)-th column of the matrix product \( C=A\,B\) is defined as follows:
\[
c_{ij}=a_{i1}b_{1j}+a_{i2}b_{2j}+\cdots +a_{in}b_{nj} \quad\text{for }\quad i=1,\ldots, m; \, j=1,\ldots, p
\]
We can also write the right-hand side of the definition of \(c_{ij}\) as the dot product of the #i#-th row vector of #A# and #j#-th column vector of #B# (interpreted as column vectors) or the matrix product of the #i#-th row vector of #A# and #j#-th column vector of #B# (interpreted as matrices): \[c_{ij} = \cv{a_{i1}\\ \vdots\\ a_{in}}\boldsymbol{\cdot}\cv{b_{1j}\\ \vdots\\ b_{nj}}= \matrix{a_{i1}& \cdots & a_{in}}\matrix{b_{1j}\\ \vdots\\ b_{nj}}\]
The product of the matrices #A# and #B# can be visualised as follows: \[\text{If}\quad A=\matrix{a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn}}\quad\text{and}\quad B=\matrix{b_{11} & \cdots & b_{1p}\\ \vdots & \ddots & \vdots \\ b_{n1} & \cdots & b_{np}}\] then \(C=AB\) is the \(m\times p\) matrix of which the element \(c_{ij}\) is the dot product of the magenta coloured row and column of the matrices #A# and #B#: \[\matrix{a_{11} & a_{12} &\cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots\\ \color{magenta}{a_{i1}} & \color{magenta}{a_{i2}} & \color{magenta}{\cdots} & \color{magenta}{a_{in}}\\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn}} \matrix{b_{11} & b_{12} & \cdots & \color{magenta}{b_{1j}} & \cdots & b_{1p} \\ b_{21} & b_{22} & \cdots & \color{magenta}{b_{2j}} & \cdots & b_{2p}\\ \vdots & \vdots & \ddots & \color{magenta}{\vdots} & \ddots & \vdots \\ b_{n1} & b_{n2} & \cdots & \color{magenta}{b_{nj}} & \cdots & b_{np}} = \matrix{ c_{11} & \cdots & c_{1p} \\ \vdots & \color{magenta}{c_{ij}} & \vdots \\ c_{m1} & \cdots & c_{mp}} \]
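The definition translates almost literally into code. The following is a minimal Python sketch, not a reference implementation: the function name `mat_mul` and the representation of a matrix as a list of rows are assumptions made for illustration only.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x p matrix B (lists of rows)."""
    m, n, p = len(A), len(B), len(B[0])
    # The product exists only if every row of A is as long as every column of B.
    if any(len(row) != n for row in A):
        raise ValueError("each row of A must have as many entries as B has rows")
    # c_ij = a_i1 * b_1j + a_i2 * b_2j + ... + a_in * b_nj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


# A 2 x 3 matrix times a 3 x 2 matrix gives a 2 x 2 matrix.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [0, 1],
     [1, 1]]
C = mat_mul(A, B)
print(C)  # [[4, 5], [10, 11]]
```

Each entry of `C` is the dot product of one row of `A` with one column of `B`, exactly as in the coloured picture above.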
The definition of the product of matrices finds its inspiration in the writing of linear equations and systems of linear equations. A linear equation \[a_1x_1+a_2x_2+\cdots + a_nx_n=b\] can be written as the dot product of vectors \(\cv{a_1\\ \vdots\\ a_n}\) and \(\cv{x_1\\ \vdots\\ x_n}\), or as \[a_1x_1+a_2x_2+\cdots + a_nx_n= \cv{a_1\\ \vdots\\ a_n}\boldsymbol {\cdot} \cv{x_1\\ \vdots\\ x_n} = (a_1\;\cdots\;a_n)^{\top}\cdot \cv{x_1\\ \vdots\\ x_n}=b\] This then appears in matrix notation as \[\sum_{i=1}^n a_ix_i = \matrix{a_1 & \cdots & a_n}\matrix{x_1\\ \vdots\\ x_n}=b\] It suggests the concept of 'row vector times column vector'.
Similarly we can write a system of linear equations in matrix notation by stitching together linear equations in matrix notation. The system of \(m\) linear equations with \(n\) unknowns \(x_1, \ldots, x_n\) \[\left\{\;\begin{array}{llllllll} a_{11}x_1 \!\!\!\!&+&\!\!\!\! \cdots \!\!\!\!&+&\!\!\!\! a_{1n}x_n\!\!\!\!&=&\!\!\!\!b_1\\ \;\;\vdots &&&& \vdots&&\!\!\!\!\vdots\\ a_{m1}x_1 \!\!\!\!&+&\!\!\!\! \cdots \!\!\!\!&+&\!\!\!\! a_{mn}x_n\!\!\!\!&=&\!\!\!\!b_m\end{array}\right.\] can be written in matrix notation as \[\begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \ldots & a_{mn} \end{pmatrix} \begin{pmatrix}x_1\\ \vdots \\ x_n\end{pmatrix}= \begin{pmatrix}b_1\\ \vdots \\ b_m\end{pmatrix}\]
Solving the system of linear equations amounts to finding a column matrix \(x\) which, when multiplied from the left by the coefficient matrix \(A\), yields the column matrix \(b\). For a given coefficient matrix \(A\), the system can be solved for several column vectors \(b\) at the same time. This can be recorded concisely by replacing \(x\) and \(b\) by multiple columns, so in fact by matrices \(X\) and \(B\). This forces a unique choice for #AX#: the matrix multiplication in the way it is defined.
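The column-by-column reading can be checked on a small case. In the Python sketch below, the helper names `mat_mul` and `column` and the numbers in `A` and `X` are chosen purely for illustration; the point is that the \(j\)-th column of \(A\,X\) equals \(A\) times the \(j\)-th column of \(X\).

```python
def mat_mul(A, B):
    """Matrix product of A and B, both given as lists of rows."""
    n = len(B)
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for row in A]


def column(M, j):
    """The j-th column of M (counting from 0) as a matrix with one column."""
    return [[row[j]] for row in M]


A = [[2, 1],
     [1, 3]]
X = [[1, 0],
     [2, -1]]   # two candidate solution vectors, stored as columns

B = mat_mul(A, X)
print(B)  # [[4, -1], [7, -3]]

# Each column of B is the right-hand side belonging to the matching column of X.
for j in range(2):
    print(column(B, j) == mat_mul(A, column(X, j)))  # True, True
```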
If #A=\matrix{0&1\\ 0&0}# and #B = \matrix{0&0\\ 1&0}#, then\[\begin{array}{rcl}A\, B &=& \matrix{1&0\\ 0&0}\\ B\, A &=& \matrix{0&0\\ 0&1}\end{array}\]This shows that matrix multiplication is not commutative if the sizes are greater than #1#.
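The two products above can be verified with a few lines of Python. This is only a sketch; `mat_mul` is a hypothetical helper that implements the definition of the matrix product for lists of rows.

```python
def mat_mul(A, B):
    """Matrix product of A and B, both given as lists of rows."""
    n = len(B)
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for row in A]


A = [[0, 1],
     [0, 0]]
B = [[0, 0],
     [1, 0]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)        # [[1, 0], [0, 0]]
print(BA)        # [[0, 0], [0, 1]]
print(AB == BA)  # False
```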
Use as many examples as you need to become familiar with the product of matrices.
\(\matrix{-4&-4\\-2&-1} \cdot \matrix{-3&-4\\-3&-5}={}\)\(\matrix{24&36\\9&13}\)
By way of example, we calculate the matrix coefficient at entry \(\rv{1,2}\), that is to say, in the first row and second column. To this end we calculate the dot product of the magenta coloured row and column vectors: the first row of the first factor and the second column of the second factor of the matrix product:
\[\begin{aligned}
\left[\matrix{\color{magenta}{-4}&\color{magenta}{-4}\\-2&-1}\cdot\matrix{-3&\color{magenta}{-4}\\-3&\color{magenta}{-5}}\right]_{12} &=\matrix{\color{magenta}{-4}\\\color{magenta}{-4}}\boldsymbol{\cdot}\matrix{\color{magenta}{-4}\\\color{magenta}{-5}}\\ \\
&= -4 \cdot -4 -4 \cdot -5\\\\ &= 16+20\\\\ &=36\tiny{.}
\end{aligned}\] In the same way, the other matrix coefficients can be calculated.
\[\begin{aligned}
\matrix{-4&-4\\-2&-1}\cdot \matrix{-3&-4\\-3&-5}&=\matrix{-4 \cdot -3 -4\cdot -3 & -4 \cdot -4 -4 \cdot -5\\-2\cdot -3-1\cdot -3&-2\cdot -4 -1 \cdot -5}\\ \\
&=\matrix{24&36\\9&13}\tiny{.}
\end{aligned}\]
Most calculation rules that we already know from the multiplication of numbers also apply to matrix multiplication, at least if the dimensions of the matrices are such that the sums and products involved exist. We mention:
\[
\begin{array}{rcl}
A\,(B+C)&=& A\,B+A\,C \\
(\lambda A)\,B&=&\lambda (A\,B) \\
(A\,B)\,C&=&A\,(B\,C) \\
(A\,B)^{\top}&=&B^{\top}A^{\top} \end{array} \]
The first two lines and the last line follow directly from the definitions. The proof of the third line from the definition of the matrix product is a straightforward but tedious calculation.
Thanks to the second rule we can write \(\lambda\, A\,B\) without parentheses; it does not matter if we calculate this expression as \((\lambda\, A)\,B\) or as \(\lambda\,(A\,B)\). Similarly, the third line enables us to write \(A\,B\,C\); the way in which the factors are grouped does not matter, as long as their order from left to right is kept.
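A numerical check is no proof, but it can make the four rules more tangible. The Python sketch below tests them on arbitrarily chosen \(2\times 2\) integer matrices; the helper names (`mat_mul`, `mat_add`, `scale`, `transpose`) and the matrices themselves are assumptions for illustration.

```python
def mat_mul(A, B):
    """Matrix product: dot product of each row of A with each column of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def mat_add(A, B):
    """Componentwise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(lam, A):
    """Scalar multiple lam * A."""
    return [[lam * a for a in row] for row in A]


def transpose(A):
    """Transpose: the rows of the result are the columns of A."""
    return [list(col) for col in zip(*A)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0], [-1, 3]]
lam = 3

rules = {
    "A(B+C) = AB + AC": mat_mul(A, mat_add(B, C))
                        == mat_add(mat_mul(A, B), mat_mul(A, C)),
    "(lam A)B = lam (AB)": mat_mul(scale(lam, A), B)
                           == scale(lam, mat_mul(A, B)),
    "(AB)C = A(BC)": mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C)),
    "(AB)^T = B^T A^T": transpose(mat_mul(A, B))
                        == mat_mul(transpose(B), transpose(A)),
}
for name, holds in rules.items():
    print(name, holds)  # each rule prints True
```

Integer entries are used on purpose, so that the comparisons are exact and not affected by rounding.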
Suppose that \(A\) is a matrix, not necessarily square. Then, the matrix product \(A^{\top}\!A\) exists and this is a symmetric matrix.
Indeed it is true that \((A^{\top}\!A)^{\top}=A^{\top}\!(A^{\top})^{\top}=A^{\top}\!A\).
Suppose we have \(n\) random variables \(X_1,\ldots,X_n\) and have a sample of \(N\) data points for each random variable \(X_j\) such that the sample mean for each random variable is zero. Put these data in an \(N\times n\) matrix \(X=(x_{ij})_{i=1,\ldots,N;\; j=1,\ldots,n}\), in which the \(j\)-th column contains the sample of \(X_j\). Then, the sample covariance matrix \(C\) is a symmetric matrix, and can be written as \[C=\frac{1}{N-1}X^{\top}\!X\]
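As a small illustration, the following Python sketch evaluates this formula for made-up data with \(N=3\) observations of \(n=2\) variables. The numbers are invented, the column means are zero by construction, and the helper names `mat_mul` and `transpose` are assumptions.

```python
def mat_mul(A, B):
    """Matrix product: dot product of each row of A with each column of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    """Transpose: the rows of the result are the columns of A."""
    return [list(col) for col in zip(*A)]


# Invented data: 3 observations (rows) of 2 variables (columns).
# Each column sums to zero, so each sample mean is zero.
X = [[-1, -2],
     [ 0,  1],
     [ 1,  1]]
N = len(X)

XtX = mat_mul(transpose(X), X)                    # a 2 x 2 matrix
C = [[entry / (N - 1) for entry in row] for row in XtX]

print(C)                  # [[1.0, 1.5], [1.5, 3.0]]
print(C == transpose(C))  # True: C is symmetric
```

Here \(X\) is a \(3\times 2\) matrix, so the example also shows that \(X^{\top}\!X\) is symmetric when \(X\) itself is not square.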