Systems of linear equations: Basic concepts and methods
The notion of linear equation
Suppose that \(x\) represents the number \(3\) and \(y\) the number \(2\). Then the statement \(x+1=6-y\) is true. This means that \(x=3, y=2\) satisfies the equation \(x+1=6-y\). For the numbers \(3\) and \(2\) you can write down many more equations that they satisfy.
In practice, the situation is the other way around: \(x\) and \(y\) are unknown numbers that satisfy the equation \(x+1=6-y\), and we are after the possible values of \(x\) and \(y\). In other words, we want to solve the equation. This can be done by reduction, i.e., by repeatedly writing down an equation that is simpler than the previous one and still has the same solutions. In the example chosen, the equation can be reduced to \(y=5-x\), which means that for an arbitrary value of \(x\), say \(x=a\), the value of \(y\) is given by \(y=5-a\).
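The reduced form \(y=5-x\) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `satisfies` and the sample values of \(a\) are chosen here purely for illustration, and exact fractions are used to avoid rounding effects.

```python
# Check that every pair (a, 5 - a) satisfies x + 1 = 6 - y.
from fractions import Fraction


def satisfies(x, y):
    """Return True when x + 1 equals 6 - y."""
    return x + 1 == 6 - y


# Illustrative sample values for a (an assumption, any numbers would do).
sample_values = [Fraction(-2), Fraction(0), Fraction(3), Fraction(7, 2)]

# For x = a, the reduced equation gives y = 5 - a.
pairs = [(a, 5 - a) for a in sample_values]

all_ok = all(satisfies(x, y) for x, y in pairs)
print(all_ok)
```

A check on finitely many values is only a sanity test; the reduction itself is what shows that every pair \((a, 5-a)\) is a solution.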
The given example is of a special type: it is a linear equation in \(x\) and \(y\). In this section we focus on the case of one linear equation in one or more unknowns.
Name convention
Besides the letters \(x\), \(y\) and \(z\), indexed names such as \(x_1\), \(x_2\) and \(x_3\) are also used. For example, we write \(x_1+4x_2+5x_3+6=0\) instead of \(x+4y+5z+6=0\). This makes it easier to describe the theory, methods and techniques for an arbitrary number \(n\) of unknowns. The convention is to use plain letters only when the number of variables is small (\({}\le 4\)).
In this chapter we will use the two formats together. In computer exercises we prefer to use letters for variables because they are easier to implement than indexed names.
General terminology
Let \(x_1,\ldots, x_n\) be variables.
A linear equation with unknowns \(x_1,\ldots, x_n\) is an equation that can be reduced, by use of elementary operations, to a (linear) basic form \[a_1x_1 + \cdots + a_nx_n + b = 0\] where \(a_1,\ldots,a_n\) and \(b\) are real or complex numbers. We also speak of a linear equation in \(x_1,\ldots ,x_n\).
There is no unique basic form: the equations \(2x-2=0\) and \(x-1=0\) are both in basic form and differ from each other, yet they can be transformed into one another by elementary operations.
By an elementary operation we mean the expansion of brackets, the regrouping of subexpressions, the addition or subtraction of the same expression on both sides of the equation, or the multiplication or division of both sides of the equation by a nonzero number. We speak of an elementary reduction when all the steps in the reduction are elementary operations.
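As an illustration, the equation \(x+1=6-y\) from the introduction can be brought into basic form by an elementary reduction. The choice and order of steps below is one possibility among several, not a prescribed route: \[\begin{aligned} x+1 &= 6-y \\ x+y+1 &= 6 && \text{(add } y \text{ to both sides)} \\ x+y+1-6 &= 0 && \text{(subtract } 6 \text{ from both sides)} \\ x+y-5 &= 0 && \text{(regroup the constants)} \end{aligned}\] The result is a basic form with \(n=2\), \(a_1=1\), \(a_2=1\) and \(b=-5\).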
The expression to the left of the equal sign (\(=\)) is called the left-hand side of the equation (above, this is \(a_1x_1 + \cdots + a_nx_n + b\)), and the expression to the right is called the right-hand side (above, this is \(0\)).
The expressions \(a_1x_1, \ldots, a_nx_n\) and \(b\) in the left-hand side of the basic form are called terms. For each index \(i\), we call the number \(a_i\) the coefficient of \(x_i\). Terms that do not contain an unknown are called constant terms, or constants for short (above, these are the numbers \(b\) and \(0\)).
A list \([s_1,\ldots, s_n]\) of \(n\) numbers is called a solution of the equation if entering \(x_1=s_1, \ldots, x_n=s_n\) turns the equation into a true statement. All values of \(x_1,\ldots ,x_n\) for which the equation is true together constitute the solution of the equation.
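The definition of a solution translates directly into a check: substitute the list into the basic form and test whether the left-hand side becomes \(0\). The sketch below assumes the equation is already in basic form; the name `is_solution` and the candidate lists are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def is_solution(coefficients, constant, candidate):
    """Check whether [s_1, ..., s_n] satisfies a_1*x_1 + ... + a_n*x_n + b = 0."""
    if len(coefficients) != len(candidate):
        raise ValueError("need exactly one value per unknown")
    # Substitute x_i = s_i and evaluate the left-hand side.
    total = sum(a * s for a, s in zip(coefficients, candidate)) + constant
    return total == 0


# The equation x_1 + 4*x_2 + 5*x_3 + 6 = 0 in basic form.
coefficients = [1, 4, 5]
constant = 6

print(is_solution(coefficients, constant, [-6, 0, 0]))  # -6 + 0 + 0 + 6 = 0
print(is_solution(coefficients, constant, [1, 1, 1]))   # 1 + 4 + 5 + 6 = 16
# Exact fractions work as well: 1/2 - 3/2 - 5 + 6 = 0.
print(is_solution(coefficients, constant, [Fraction(1, 2), Fraction(-3, 8), -1]))
```

The first and third lists are solutions and the second is not, which shows that an equation in several unknowns can have more than one solution.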
Two linear equations are called equivalent when they have the same solutions.
If an equation can be reduced to another by elementary operations, then the two equations are equivalent.
Substituting \(n=1\), \(a_1=2\), and \(b=3\) in the above basic form of a linear equation gives \(2x_1+3=0\). A solution of this equation is \(x_1=-\frac{3}{2}\). In fact, it is the solution: there are no others.
We then say that \(x_1=-\frac{3}{2}\) is the solution of the equation \(2x_1+3=0\).
The equation \(x_1+2=-(x_1+1)\) is an equivalent linear equation, and thus has the same solution.
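The equivalence can be seen from a short elementary reduction. The intermediate steps are not spelled out in the text, so the route below is one possible sketch: \[\begin{aligned} x_1+2 &= -(x_1+1) \\ x_1+2 &= -x_1-1 && \text{(expand the brackets)} \\ 2x_1+3 &= 0 && \text{(add } x_1+1 \text{ to both sides)} \end{aligned}\] Both equations therefore have the single solution \(x_1=-\frac{3}{2}\).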
An equation that can be reduced to \(0\cdot x=0\) is also linear according to the definition.
This is a degenerate situation in which the unknown is not really present in the equation.
Incidentally, such an equation has infinitely many solutions: every number \(x\) satisfies it.
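The cases discussed for one unknown can be collected in a small sketch that solves \(ax+b=0\) over the rational numbers. The function name `solve_one_unknown` and its return conventions are assumptions made here for illustration. The case \(a=0\), \(b\neq 0\) is not treated in the text above; it is included only for completeness, since \(0\cdot x+b=0\) then holds for no number \(x\).

```python
from fractions import Fraction


def solve_one_unknown(a, b):
    """Solve a*x + b = 0 over the rationals.

    Returns a Fraction when there is exactly one solution, the string "all"
    when every number is a solution, and None when there is no solution.
    """
    a, b = Fraction(a), Fraction(b)
    if a != 0:
        # Subtract b from both sides, then divide both sides by the nonzero a.
        return -b / a
    # Degenerate case: the unknown is not really present in the equation.
    return "all" if b == 0 else None


print(solve_one_unknown(2, 3))  # the equation 2*x_1 + 3 = 0
print(solve_one_unknown(0, 0))  # the degenerate equation 0*x = 0
```

Returning three different kinds of value keeps the sketch short; a fuller implementation would probably use a dedicated result type.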