Definition 69
Events $A_{1},\ldots ,A_{n}$ in a common probability space are said to be independent if $ \mathbf{P}\left(A_{i_{1} }\cap A_{i_{2} }\cap \ldots \cap A_{i_{m} }\right)=\mathbf{P}(A_{i_{1} })\mathbf{P}(A_{i_{2} })\ldots \mathbf{P}(A_{i_{m} })$ for every choice of $m\le n$ and every choice of $1\le i_{1} < i_{2} < \ldots < i_{m}\le n$.
The independence of $n$ events requires us to check roughly $2^{n}$ equations — one for each choice of $1\le i_{1}<\ldots <i_{m}\le n$ with $m\ge 2$, that is, $2^{n}-n-1$ equations in all. Should it not suffice to check that each pair $A_{i}$, $A_{j}$ is independent? The following example shows that this is not the case!

 

Example 70
Let $\Omega=\{0,1\}^{n}$ with $p_{\underline{\omega}}=2^{-n}$ for each $\underline{\omega}\in \Omega$. Define the events $A=\{\underline{\omega}{\; : \;} \omega_{1}=0\}$, $B=\{\underline{\omega}{\; : \;} \omega_{2}=0\}$ and $C=\{\underline{\omega}{\; : \;} \omega_{1}+\omega_{2}=0 \mbox{ or }2\}$. In words, we toss a fair coin $n$ times; $A$ denotes the event that the first toss is a tail, $B$ the event that the second toss is a tail, and $C$ the event that the first two tosses are both heads or both tails. Then $\mathbf{P}(A)=\mathbf{P}(B)=\mathbf{P}(C)=\frac{1}{2}$. Further, \[\begin{aligned} \mathbf{P}(A\cap B)=\frac{1}{4}, \quad \mathbf{P}(B\cap C)=\frac{1}{4}, \quad \mathbf{P}(A\cap C)=\frac{1}{4}, \quad \mathbf{P}(A\cap B\cap C)=\frac{1}{4}. \end{aligned}\] Thus, $A,B,C$ are pairwise independent, but not independent by our definition because $\mathbf{P}(A\cap B\cap C)=\frac{1}{4}\not= \frac{1}{8}=\mathbf{P}(A)\mathbf{P}(B)\mathbf{P}(C)$.
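The probabilities in this example can be checked by brute-force enumeration. The sketch below (an illustration, not part of the original text) takes $n=2$, since the three events depend only on the first two tosses, and uses exact rational arithmetic:

```python
from itertools import product
from fractions import Fraction

# Omega = {0,1}^2 with the uniform measure: each outcome has probability 1/4.
omega = list(product([0, 1], repeat=2))
p = Fraction(1, len(omega))

def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    return sum(p for w in omega if event(w))

A = lambda w: w[0] == 0                 # first toss is a tail
B = lambda w: w[1] == 0                 # second toss is a tail
C = lambda w: w[0] + w[1] in (0, 2)     # first two tosses agree

pA, pB, pC = prob(A), prob(B), prob(C)
pAB = prob(lambda w: A(w) and B(w))
pBC = prob(lambda w: B(w) and C(w))
pAC = prob(lambda w: A(w) and C(w))
pABC = prob(lambda w: A(w) and B(w) and C(w))

# Pairwise independence holds ...
assert pAB == pA * pB and pBC == pB * pC and pAC == pA * pC
# ... but the triple equality fails: 1/4 != 1/8.
assert pABC == Fraction(1, 4) and pABC != pA * pB * pC
```

Running this confirms the numbers in the example: each pair multiplies correctly, while the triple intersection has probability $\frac14$ rather than $\frac18$.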

Intuitively this is right. Knowing $A$ does not give any information about $C$ (similarly with $A$ and $B$, or $B$ and $C$), but knowing $A$ and $B$ tells us completely whether or not $C$ occurred! Thus it is right that the definition should not declare them to be independent.

 

Exercise 71
Let $A_{1},\ldots ,A_{n}$ be events in a common probability space. Then, $A_{1}, A_{2},\ldots ,A_{n}$ are independent if and only if the following equalities hold: for each $i$, let $B_{i}$ be either $A_{i}$ or $A_{i}^{c}$. Then \[\begin{aligned} \mathbf{P}(B_{1}\cap B_{2}\cap \ldots \cap B_{n})=\mathbf{P}(B_{1})\mathbf{P}(B_{2})\ldots \mathbf{P}(B_{n}). \end{aligned}\]
Note: This should hold for every possible choice of the $B_{i}$s. In other words, the system of $2^{n}$ equalities in the definition of independence may be replaced by this new system of $2^{n}$ equalities. The latter system has the advantage that it immediately tells us that if $A_{1},\ldots ,A_{n}$ are independent, then so are $A_{1},A_{2}^{c}, A_{3},\ldots$ (for each $i$, choose either $A_{i}$ or its complement).
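One direction of the exercise can be sanity-checked numerically. The sketch below (an illustration, not part of the original text) builds a product measure on $\{0,1\}^{3}$, so that the events $A_{i}=\{\underline{\omega}: \omega_{i}=0\}$ are independent by construction, and verifies all $2^{3}$ equalities over the choices of $B_{i}\in\{A_{i},A_{i}^{c}\}$; the bias probabilities $p_i$ are arbitrary choices for the demonstration:

```python
from itertools import product
from fractions import Fraction

# Product measure on {0,1}^3: coordinate i equals 0 with probability p[i],
# so the events A_i = {omega_i = 0} are independent by construction.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]
n = len(p)

def weight(w):
    """Probability of a single outcome under the product measure."""
    out = Fraction(1)
    for wi, pi in zip(w, p):
        out *= pi if wi == 0 else 1 - pi
    return out

omega = list(product([0, 1], repeat=n))

def prob(event):
    return sum(weight(w) for w in omega if event(w))

# For every choice of B_i in {A_i, A_i^c}, check
# P(B_1 cap ... cap B_n) = P(B_1) P(B_2) ... P(B_n).
checked = 0
for signs in product([0, 1], repeat=n):   # 0 -> B_i = A_i, 1 -> B_i = A_i^c
    lhs = prob(lambda w: all(w[i] == signs[i] for i in range(n)))
    rhs = Fraction(1)
    for i in range(n):
        rhs *= prob(lambda w, i=i: w[i] == signs[i])
    assert lhs == rhs
    checked += 1
```

All $2^{n}=8$ equalities hold exactly, illustrating that independence of the $A_{i}$ forces the product rule for every complement pattern.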

Chapter 12. Subtleties of conditional probability