Independence
Basic independence
Let us go back to our initial example with a customer rating our movie on a scale of 1 to 5, denoted as \(R\), and whether they are happy, denoted as \(H\). Suppose we now introduce yet another variable, say \(B\), which denotes whether the user has blue eyes, again taking the values 1 and 0 for whether or not the user has blue eyes, respectively. In this case, there is no reason to assume that my belief that someone is happy changes after observing whether or not they have blue eyes, i.e. \[\mathbb{P}(H \mid B=1) = \mathbb{P}(H).\] Without defining these notions formally here, you can convince yourself that there is no information about \(H\) in \(B\). However, we could imagine a situation where someone gives a higher rating for a movie because they are happy, i.e. if we know that \(H=1\), our belief in \(R=5\) increases, that is: \[\mathbb{P}(R=5 \mid H=1) > \mathbb{P}(R=5).\]

When there is no information about \(X\) in \(Y\), or when \[\mathbb{P}(X \mid Y) = \mathbb{P}(X),\] we say that \(X\) and \(Y\) are independent. You may also have encountered this as \[\mathbb{P}(X, Y) = \mathbb{P}(X) \cdot \mathbb{P}(Y),\] which is exactly the same thing after multiplying both sides by \(\mathbb{P}(Y)\) and making use of the fact that \(\mathbb{P}(X, Y) = \mathbb{P}(Y) \cdot \mathbb{P}(X \mid Y).\)
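To make the factorization concrete, here is a minimal simulation sketch (the probabilities 0.7 and 0.3 are arbitrary choices of ours, not part of the example) that draws two independent binary variables and checks empirically that the joint probability is approximately the product of the marginals:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two independent binary variables, e.g. H (happy) and B (blue eyes).
# The probabilities 0.7 and 0.3 are arbitrary choices for this sketch.
h = rng.random(n) < 0.7
b = rng.random(n) < 0.3

p_h = h.mean()          # estimate of P(H = 1)
p_b = b.mean()          # estimate of P(B = 1)
p_hb = (h & b).mean()   # estimate of P(H = 1, B = 1)

print(f"P(H=1) * P(B=1) = {p_h * p_b:.4f}")
print(f"P(H=1, B=1)     = {p_hb:.4f}")  # approximately equal
```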
When \(X\) and \(Y\) are not independent, i.e. when \(X\) tells me something about \(Y\), we call them dependent.
Exercise Suppose we want to parametrize a joint distribution over 10 binary random variables, \(X_1, \cdots, X_{10}\). If we do not assume any independence, we need \(2^{10} - 1 = 1023\) parameters to describe this joint distribution. Suppose now that all \(X_i\) are independent, i.e. \(\mathbb{P}(X_1, \cdots, X_{10}) = \prod_{i=1}^{10} \mathbb{P}(X_i)\). Explain why we now need only \(10\) parameters to describe our joint distribution.
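As a hint, the following sketch (with ten arbitrarily chosen marginal probabilities) reconstructs the full joint table from just the \(10\) numbers \(\mathbb{P}(X_i = 1)\); the table has \(2^{10} = 1024\) entries, yet under independence it is completely determined by those \(10\) parameters:

```python
import itertools
import numpy as np

# Ten marginal probabilities P(X_i = 1): under full independence,
# these 10 numbers are the only parameters we need.
p = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.5])

# Reconstruct the full joint table P(X_1 = x_1, ..., X_10 = x_10)
# as the product of the marginals.
joint = {}
for xs in itertools.product([0, 1], repeat=10):
    joint[xs] = np.prod([p[i] if x == 1 else 1 - p[i]
                         for i, x in enumerate(xs)])

print(len(joint))           # 1024 entries (1023 free parameters) ...
print(sum(joint.values()))  # ... summing to 1, all derived from
                            # the 10 numbers in p
```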
Conditional independence
In real-life situations, you almost never encounter plain independence. A much more common scenario is that two random variables are independent given some third variable.
Example of conditional independence Suppose we consider two students who both travel by the same train to the university, and we denote by \(S_1\) the binary variable describing whether student 1 is on time for the lecture, and by \(S_2\) the binary variable describing whether student 2 is on time. Intuitively, these events are related: when you tell me that student 1 is on time, that increases my belief that student 2 is also on time, for the reason that the trains were apparently not delayed. In other words, \[\mathbb{P}(S_2 = 1 \mid S_1 = 1) > \mathbb{P}(S_2 = 1).\] Similarly, if student 1 is not on time, my belief increases that student 2 won't be on time either. Therefore, we observe that \(S_1\) and \(S_2\) are not independent.
However, let us now introduce a third binary variable \(T\), denoting whether the train was on time. How does knowing \(T\) change the above situation? In this case, observing whether or not student 1 is on time when I already know \(T\) gives me no information beyond what I could already derive from \(T\). The reason is that the only thing it could tell me is something about whether or not the train was delayed, and since we already know whether the train was delayed, we obtain no new information.
In this case, we have \[\mathbb{P}(S_2 \mid T, S_1) = \mathbb{P}(S_2 \mid T).\] In that case, we say that \(S_2\) is conditionally independent of \(S_1\) given \(T\). In general, we say that \(X\) is conditionally independent of \(Y\) given \(Z\) if my belief in \(X\) is not affected by the outcome of \(Y\) once I already know \(Z\).
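A small simulation sketch of the train example makes both claims visible (the probabilities 0.8, 0.9 and 0.2 are our own assumptions, purely for illustration): \(S_1\) and \(S_2\) are marginally dependent, but conditionally independent given \(T\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Assumed probabilities for this sketch: the train is on time with
# probability 0.8, and each student is on time with probability 0.9
# if the train is on time and 0.2 otherwise.
t = rng.random(n) < 0.8
s1 = np.where(t, rng.random(n) < 0.9, rng.random(n) < 0.2)
s2 = np.where(t, rng.random(n) < 0.9, rng.random(n) < 0.2)

# Marginally, S1 carries information about S2 ...
print(s2[s1].mean(), s2.mean())         # P(S2=1 | S1=1) > P(S2=1)

# ... but given T, it does not.
print(s2[t & s1].mean(), s2[t].mean())  # P(S2=1 | T=1, S1=1) ≈ P(S2=1 | T=1)
```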
In the above example, we had a situation where \(X\) and \(Y\) were dependent, but \(X\) and \(Y\) became independent once informed about some \(Z\). That is, even if there is an 'information flow' between \(X\) and \(Y\), there can be no 'information flow' from \(X\) to \(Y\) once we know \(Z\). The other case is also possible: the situation where \(X\) and \(Y\) are independent variables, but they become dependent given some variable \(Z\). To illustrate this, let us consider a new scenario, quite clearly inspired by our breakfast.
Example of conditional dependence Suppose that \(S\) is the random variable that describes whether or not the first strawberry I eat on a day is sweet. Moreover, let \(E\) denote whether or not the first coffee I prepare is an espresso. Needless to say, the outcomes of these events do not affect each other, and hence \(S\) and \(E\) are independent. The reason we picked such a weird example is that even two such (very) independent events can be made conditionally dependent if we consider a crazy enough third variable...
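As a sketch of how such a third variable could work, suppose (our own choice, not part of the original example) that \(Z = 1\) exactly when breakfast is 'perfect', i.e. the strawberry is sweet and the coffee is an espresso. Conditioning on \(Z\) then couples the two otherwise independent variables:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# S: first strawberry is sweet; E: first coffee is an espresso.
# They are drawn independently (probabilities are arbitrary choices).
s = rng.random(n) < 0.5
e = rng.random(n) < 0.5

# A hypothetical third variable of our own choosing: Z = 1 exactly
# when the strawberry is sweet AND the coffee is an espresso.
z = s & e

# Marginally independent ...
print(s[e].mean(), s.mean())           # P(S=1 | E=1) ≈ P(S=1)

# ... but dependent once we condition on Z = 0:
print(s[~z & e].mean(), s[~z].mean())  # P(S=1 | Z=0, E=1) = 0, while
                                       # P(S=1 | Z=0) ≈ 1/3
```

Intuitively, once I know that breakfast was not perfect (\(Z = 0\)), learning that the coffee was an espresso forces me to conclude that the strawberry was not sweet: conditioning created a dependence.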
Summary In this theory page, you have learned about independence, conditional independence and conditional dependence of random variables.