In particular, if the $n$ coins are identical, we may write $p=\alpha_{1}^{(j)}$ (for any $j$) and the elementary probabilities become $p_{\underline{\omega}}=p^{\sum_{i}\omega_{i} }q^{n-\sum_{i}\omega_{i} }$ where $q=1-p$.
Fix $0\le k\le n$ and let $B_{k}=\{\underline{\omega}{\; : \;} \sum_{i=1}^{n}\omega_{i}=k\}$ be the event that we see exactly $k$ heads out of $n$ tosses. Then $\mathbf{P}(B_{k})=\binom{n}{k}p^{k}q^{n-k}$. If $A_{k}$ is the event that there are at least $k$ heads, then $\mathbf{P}(A_{k})=\sum\limits_{\ell=k}^{n}\binom{n}{\ell}p^{\ell}q^{n-\ell}$.
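These formulas are easy to evaluate directly. Below is a minimal Python sketch (the helper names are ours) computing $\mathbf{P}(B_{k})$ and $\mathbf{P}(A_{k})$ for a coin with $\mathbf{P}(\mbox{heads})=p$:

```python
from math import comb

def prob_exactly_k(n, k, p):
    """P(B_k): exactly k heads in n independent tosses of a p-coin."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

def prob_at_least_k(n, k, p):
    """P(A_k): at least k heads, obtained by summing P(B_l) over l = k, ..., n."""
    return sum(prob_exactly_k(n, l, p) for l in range(k, n + 1))

# A fair coin tossed 10 times:
print(prob_exactly_k(10, 5, 0.5))   # 252/1024, about 0.2461
print(prob_at_least_k(10, 5, 0.5))  # 638/1024, about 0.6230
```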
An interesting random variable is the number of correct guesses. This is the function $X:\Omega\rightarrow \mathbb{R}$ defined by $X(\pi,{\sigma})=\sum_{i=1}^{52}{\mathbf 1}_{\pi_{i}={\sigma}_{i} }$. Correspondingly we have the events $A_{k}=\{(\pi,{\sigma}){\; : \;} X(\pi,{\sigma})\ge k\}$.
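Under the uniform model, $X$ counts the positions at which two independent uniformly random orderings agree (equivalently, the number of fixed points of a single random permutation). A small Monte Carlo sketch, with names of our own choosing, estimates $\mathbf{P}(A_{k})$:

```python
import random

def correct_guesses(n=52):
    """One sample of X: shuffle the deck (pi) and the guesses (sigma)
    independently and count the positions where they agree."""
    pi = list(range(n))
    sigma = list(range(n))
    random.shuffle(pi)
    random.shuffle(sigma)
    return sum(a == b for a, b in zip(pi, sigma))

trials, k = 100_000, 2
estimate = sum(correct_guesses() >= k for _ in range(trials)) / trials
print(estimate)  # X is approximately Poisson(1), so this is close to 1 - 2/e, about 0.264
```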
Let $A=\{0^{k}1{\; : \;} k\ge n\}$ be the event that at least $n$ tails fall before a head turns up. Then, summing the geometric series, $\mathbf{P}(A)=q^{n}p+q^{n+1}p+\ldots =q^{n}p\sum_{j=0}^{\infty}q^{j}=\frac{q^{n}p}{1-q}=q^{n}$.
The two models are clearly different. Which one captures reality? We can arbitrarily label the balls for our convenience and erase the labels at the end. This clearly yields the elementary probabilities $p^{MB}$. Put another way: pick the balls one by one and assign each of them, uniformly at random, to one of the urns. This suggests that $p^{MB}$ is the ''right one''.
This leaves open the question of whether there is a natural mechanism of assigning balls to urns so that the probabilities $p^{BE}$ show up. No such mechanism has been found. But this probability space does occur in the physical world: if $r$ photons (''indistinguishable balls'') are to occupy $m$ energy levels (''urns''), then it has been verified empirically that the correct probability space is the second one! The probabilities $p^{MB}$ and $p^{BE}$ are called Maxwell-Boltzmann statistics and Bose-Einstein statistics, respectively. There is a third kind, called Fermi-Dirac statistics, which is obeyed by electrons. For general $m\ge r$, the sample space is $\Omega_{FD}=\{(\ell_{1},\ldots ,\ell_{m}){\; : \;} \ell_{i}=0 \mbox{ or }1 \mbox{ and }\ell_{1}+\ldots +\ell_{m}=r\}$ with equal probabilities for each element. In words, all distinguishable configurations are equally likely, subject to the constraint that at most one electron can occupy each energy level.
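For small $m$ and $r$, the three sample spaces can be enumerated explicitly. A small Python sketch (helper names are ours), illustrating the counts $m^{r}$, $\binom{m+r-1}{r}$ and $\binom{m}{r}$:

```python
from itertools import product

def occupation_vectors(m, r, max_per_level):
    """All (l_1, ..., l_m) with 0 <= l_i <= max_per_level and l_1 + ... + l_m = r."""
    return [v for v in product(range(min(r, max_per_level) + 1), repeat=m)
            if sum(v) == r]

m, r = 3, 2
# Maxwell-Boltzmann: balls are labelled; each of the m^r assignments of balls
# to urns is equally likely, and an occupation vector inherits the total
# probability of the assignments that produce it.
omega_MB = list(product(range(m), repeat=r))   # m^r = 9 assignments
# Bose-Einstein: all occupation vectors are equally likely.
omega_BE = occupation_vectors(m, r, r)         # C(m+r-1, r) = 6 vectors
# Fermi-Dirac: occupation vectors with l_i in {0, 1}, all equally likely.
omega_FD = occupation_vectors(m, r, 1)         # C(m, r) = 3 vectors
print(len(omega_MB), len(omega_BE), len(omega_FD))  # 9 6 3
```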
Fix $m < N$ and define the random variable $X(\underline{\omega})=\sum_{i=1}^{k}{\mathbf 1}_{\omega_{i}\le m}$. If the population $[N]$ contains a distinguished subset, say $[m]$ (for instance, the people having a certain disease), then $X(\underline{\omega})$ counts the number of people in the sample who have the disease. Using $X$ one can define events such as $A=\{\underline{\omega}{\; : \;} X(\underline{\omega})=\ell\}$ for some $\ell\le \min(k,m)$. If $\underline{\omega}\in A$, then $\ell$ of the $\omega_{i}$ must be in $[m]$ and the rest in $[N]\setminus [m]$: there are $\binom{k}{\ell}$ ways to choose which positions of the sample fall in $[m]$, and the entries can then be chosen in order. Hence $$\#A=\binom{k}{\ell}m(m-1)\ldots (m-\ell+1)(N-m)(N-m-1)\ldots (N-m-(k-\ell)+1).$$ As the probabilities are equal for all sample points, we get $$\begin{align*} \mathbf{P}(A)&= \frac{\binom{k}{\ell}m(m-1)\ldots (m-\ell+1)(N-m)(N-m-1)\ldots (N-m-(k-\ell)+1)}{N(N-1)\ldots (N-k+1)} \\ &= \frac{1}{\binom{N}{k} }\binom{m}{\ell}\binom{N-m}{k-\ell}. \end{align*}$$ This expression arises whenever the population is subdivided into two parts and we count the number of samples that fall in one of the sub-populations.
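This is the hypergeometric distribution, and the final formula is easy to evaluate directly in Python (the function name is ours):

```python
from math import comb

def hypergeom_pmf(N, m, k, ell):
    """P(X = ell): a uniform sample of size k from a population of N,
    containing m marked individuals, hits exactly ell marked ones."""
    return comb(m, ell) * comb(N - m, k - ell) / comb(N, k)

# 10 diseased people in a population of 100, sample of size 20:
print(hypergeom_pmf(100, 10, 20, 3))                          # P(exactly 3 in the sample)
print(sum(hypergeom_pmf(100, 10, 20, l) for l in range(11)))  # the pmf sums to 1
```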
This is a class of examples from statistical physics. In that context, $\Omega$ is the set of all possible states of a system and ${\mathcal H}(\omega)$ is the energy of the state $\omega$. In mechanics a system settles down to the state with the lowest possible energy, but if there are thermal fluctuations (meaning the ambient temperature is not absolute zero), then the system may also be found in other states, with higher-energy states being less and less likely. Indeed, in the above assignment, for two states $\omega$ and $\omega'$ we see that $p_{\omega}/p_{\omega'}=e^{\beta ({\mathcal H}(\omega')-{\mathcal H}(\omega))}$, showing that higher-energy states are less probable. When $\beta=0$, we get $p_{\omega}=1/|\Omega|$, the uniform distribution on $\Omega$. In statistical physics, $\beta$ is equated to $1/(\kappa T)$, where $T$ is the temperature and $\kappa$ is Boltzmann's constant.
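As a toy illustration (the state space and energies below are our own choices), the following sketch computes these probabilities on a finite $\Omega$ and shows the effect of varying $\beta$:

```python
import math

def gibbs_probabilities(energies, beta):
    """p_omega proportional to exp(-beta * H(omega)) on a finite state space,
    specified by the list of energies H(omega)."""
    weights = [math.exp(-beta * h) for h in energies]
    z = sum(weights)              # normalising constant (the partition function)
    return [w / z for w in weights]

# Three states with energies 0, 1, 2:
for beta in (0.0, 1.0, 5.0):
    print(beta, gibbs_probabilities([0.0, 1.0, 2.0], beta))
# beta = 0 gives the uniform distribution; increasing beta (lowering the
# temperature) concentrates the mass on the lowest-energy state.
```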
Different physical systems are defined by choosing $\Omega$ and ${\mathcal H}$ differently. Hence this provides a rich class of examples which are of great importance in probability.
It may seem that probability is trivial, since the only remaining problem is to find the sum of $p_{\omega}$ over all $\omega$ belonging to the event of interest. This is far from the case. The following example is an illustration.
This may be thought of as follows. Imagine that each edge is a pipe through which water can flow; however, each pipe may be open or blocked, and $\omega$ is the subset of pipes that are open. Now pour water at the top of the rectangle $R$. Will water trickle down to the bottom? The answer is yes if and only if $\omega$ belongs to $A$.
Finding $\mathbf{P}(A)$ is a very difficult problem. When $n$ is large and $m=2n$, it is expected that $\mathbf{P}(A)$ converges to a specific number, but proving it is an open problem as of today! In a very similar problem on the triangular lattice, the analogous statement was proved by Stanislav Smirnov (2001), a result for which he won a Fields Medal. Proof that computing probabilities is not always trivial!
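While the exact value of $\mathbf{P}(A)$ is out of reach, it is easy to estimate by simulation. A minimal Monte Carlo sketch, assuming each edge of an $n\times m$ rectangle of the square lattice is open with probability $p$ independently (all names below are ours):

```python
import random
from collections import deque

def percolates(n, m, p, rng):
    """Sample one configuration of bond percolation on the vertices (i, j),
    0 <= i <= n, 0 <= j <= m, and check for an open path from the top row
    (i = 0) to the bottom row (i = n)."""
    open_edge = {}
    def is_open(u, v):
        e = (min(u, v), max(u, v))       # canonical key for an undirected edge
        if e not in open_edge:           # sample each edge lazily, once
            open_edge[e] = rng.random() < p
        return open_edge[e]
    queue = deque((0, j) for j in range(m + 1))
    seen = set(queue)
    while queue:                         # breadth-first search along open edges
        i, j = queue.popleft()
        if i == n:
            return True
        for a, b in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= a <= n and 0 <= b <= m and (a, b) not in seen \
                    and is_open((i, j), (a, b)):
                seen.add((a, b))
                queue.append((a, b))
    return False

rng = random.Random(42)
trials = 2000
hits = sum(percolates(20, 40, 0.5, rng) for _ in range(trials))
print(hits / trials)   # a Monte Carlo estimate of P(A)
```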
We now give two non-examples.
This appears obvious, but many folklore puzzles and paradoxes in probability are based on the faulty assumption that it is possible to pick a natural number at random. For example, when asked a question like ''What is the probability that a random integer is odd?'', many people answer $1/2$. We want to emphasize that the probability space has to be defined first, and only then can probabilities of events be calculated. Thus, the question does not make sense to us and we do not have to answer it! For those interested, there is one way to make sense of such questions: consider the sequence of probability spaces $\Omega_{n}=\{1,2,\ldots ,n\}$ with elementary probabilities $p^{(n)}_{i}=1/n$ for each $i\in \Omega_{n}$. Then, for a subset $A\subseteq \mathbb{N}$, we consider $\mathbf{P}_{n}(A\cap \Omega_{n})=\#(A\cap [n])/n$. If these probabilities converge to a limit $x$ as $n\rightarrow \infty$, then we could say that $A$ has asymptotic probability $x$. In this sense, the set of odd numbers does have asymptotic probability $1/2$, the set of numbers divisible by $7$ has asymptotic probability $1/7$, and the set of prime numbers has asymptotic probability $0$. However, this notion of asymptotic probability has many shortcomings. Many subsets of natural numbers do not have an asymptotic probability, and even sets that do can fail to satisfy basic rules of probability that we shall see later. Hence, we shall keep such examples out of our system.
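These asymptotic probabilities are easy to compute numerically. A short sketch (helper names ours):

```python
def density_up_to_n(indicator, n):
    """#(A intersect [n]) / n for the set A described by `indicator`."""
    return sum(1 for i in range(1, n + 1) if indicator(i)) / n

def is_prime(i):
    return i > 1 and all(i % d for d in range(2, int(i**0.5) + 1))

for n in (100, 10_000, 100_000):
    print(n,
          density_up_to_n(lambda i: i % 2 == 1, n),   # odd numbers: tends to 1/2
          density_up_to_n(lambda i: i % 7 == 0, n),   # multiples of 7: tends to 1/7
          density_up_to_n(is_prime, n))               # primes: tends to 0, slowly (~ 1/log n)
```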
The dart board can be considered to be the disk $\Omega=\{(x,y){\; : \;} x^{2}+y^{2}\le r^{2}\}$ of given radius $r$. This is an uncountable set, and we cannot assign elementary probabilities $p_{(x,y)}$ for each $(x,y)\in \Omega$ in any reasonable way. In fact, the only sensible assignment would be to set $p_{(x,y)}=0$ for each $(x,y)$, but then what is $\mathbf{P}(A)$ for a subset $A$? Uncountable sums are not well defined.
We need a branch of mathematics called measure theory to make proper sense of uncountable probability spaces. This will not be done in this course, although we shall later say a bit about the difficulties involved. The same difficulty shows up in the following ''random experiments'' as well.