Monday, 12 December 2011

Hypergeometric Distribution

  Manoj       Monday, 12 December 2011

This tutorial covers hypergeometric experiments, hypergeometric distributions, and hypergeometric probability.

Definition

A discrete random variable $X$ is said to follow the hypergeometric distribution with parameters $N$, $M$, and $n$ if it takes only non-negative integer values and its probability mass function is given by:
\[P(X=k)=h(k;N,M,n)=\begin{cases}
\dfrac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}, & k=0,1,2,\ldots,\min(n,M) \\
0, & \text{otherwise}
\end{cases}\]
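where $N$ is a positive integer, $M$ is a positive integer not exceeding $N$, and $n$ is a positive integer that is at most $N$. Any term with $n-k > N-M$ is zero, because the binomial coefficient $\binom{N-M}{n-k}$ vanishes there.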


Hypergeometric Experiments

A hypergeometric experiment is a statistical experiment that has the following properties: 
A sample of size $n$ is randomly selected without replacement from a population of $N$ items.
In the population, $k$ items can be classified as successes, and the remaining $N - k$ items can be classified as failures.

Consider the following statistical experiment. You have an urn of 10 marbles - 5 red and 5 green. You randomly select 2 marbles without replacement and count the number of red marbles you have selected. This would be a hypergeometric experiment. 

Note that it would not be a binomial experiment. A binomial experiment requires that the probability of success be constant on every trial. With the above experiment, the probability of a success changes on every trial. In the beginning, the probability of selecting a red marble is $5/10$. If you select a red marble on the first trial, the probability of selecting a red marble on the second trial is $4/9$. And if you select a green marble on the first trial, the probability of selecting a red marble on the second trial is $5/9$. 

Note further that if you selected the marbles with replacement, the probability of success would not change. It would be 5/10 on every trial. Then, this would be a binomial experiment. 
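The contrast between the two sampling schemes can be checked by brute force. The following is a minimal sketch in Python, assuming the urn of 5 red and 5 green marbles described above; the helper name `red_count_distribution` is made up for illustration. It lists every equally likely draw of 2 marbles and tallies how many red marbles each draw contains.

```python
from fractions import Fraction
from itertools import combinations, product

# The urn from the text: marbles 0-4 are red, marbles 5-9 are green.
marbles = ["red"] * 5 + ["green"] * 5

def red_count_distribution(draws):
    """Exact distribution of the number of red marbles over equally likely draws."""
    draws = list(draws)
    counts = {}
    for draw in draws:
        reds = sum(1 for i in draw if marbles[i] == "red")
        counts[reds] = counts.get(reds, 0) + 1
    return {r: Fraction(c, len(draws)) for r, c in sorted(counts.items())}

# Without replacement: every unordered pair of distinct marbles (hypergeometric).
without_replacement = red_count_distribution(combinations(range(10), 2))

# With replacement: every ordered pair, repeats allowed (binomial).
with_replacement = red_count_distribution(product(range(10), repeat=2))

print(without_replacement)  # probabilities 2/9, 5/9, 2/9 for 0, 1, 2 reds
print(with_replacement)     # probabilities 1/4, 1/2, 1/4 for 0, 1, 2 reds
```

The two distributions differ even though both start with a 5/10 chance of red on the first draw, which is the point of the comparison above.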
Notation

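The following notation is helpful when we talk about hypergeometric distributions and hypergeometric probability. It differs from the definition above, where $M$ plays the role of $k$ and $k$ plays the role of $x$.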
$N$: The number of items in the population. 
$k$: The number of items in the population that are classified as successes. 
$n$: The number of items in the sample. 
$x$: The number of items in the sample that are classified as successes. 
${}_kC_x$: The number of combinations of $k$ things, taken $x$ at a time.
$h(x; N, n, k)$: hypergeometric probability - the probability that an $n$-trial hypergeometric experiment results in exactly $x$ successes, when the population consists of $N$ items, $k$ of which are classified as successes.
Hypergeometric Distribution

A hypergeometric random variable is the number of successes that result from a hypergeometric experiment. The probability distribution of a hypergeometric random variable is called a hypergeometric distribution. 

Given $x, N, n,$ and $k$, we can compute the hypergeometric probability based on the following formula: 
Hypergeometric Formula. Suppose a population consists of $N$ items, $k$ of which are successes, and a random sample drawn from that population consists of $n$ items, $x$ of which are successes. Then the hypergeometric probability is: $h(x; N, n, k) = [ {}_kC_x ] [ {}_{N-k}C_{n-x} ] / [ {}_NC_n ]$
The hypergeometric distribution has the following properties: 
The mean of the distribution is equal to $n k / N$.
The variance is $n k ( N - k ) ( N - n ) / [ N^2 ( N - 1 ) ]$.
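The mean and variance formulas can be checked against the probability formula directly. The sketch below assumes the illustrative values $N = 52$, $n = 5$, $k = 26$ and a helper named `hypergeom_pmf`, neither of which comes from the formulas themselves. It uses exact fractions so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k) as an exact fraction."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

# Illustrative parameters (an assumption for this check).
N, n, k = 52, 5, 26

# Values of x for which the sample can actually contain x successes.
support = range(max(0, n - (N - k)), min(n, k) + 1)

# Mean and variance computed from the definition of expectation.
mean = sum(x * hypergeom_pmf(x, N, n, k) for x in support)
variance = sum((x - mean) ** 2 * hypergeom_pmf(x, N, n, k) for x in support)

# The closed-form expressions stated above.
mean_formula = Fraction(n * k, N)
variance_formula = Fraction(n * k * (N - k) * (N - n), N**2 * (N - 1))

print(mean, mean_formula)          # 5/2 5/2
print(variance, variance_formula)  # 235/204 235/204
```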
Example 1
Suppose we randomly select 5 cards without replacement from an ordinary deck of playing cards. What is the probability of getting exactly 2 red cards (i.e., hearts or diamonds)? 
Solution: This is a hypergeometric experiment in which we know the following: 
$N = 52$; since there are 52 cards in a deck. 
$k = 26$; since there are 26 red cards in a deck. 
$n = 5$; since we randomly select 5 cards from the deck. 
$x = 2$; since 2 of the cards we select are red. 
We plug these values into the hypergeometric formula as follows:
$h(x; N, n, k) = [ {}_kC_x ] [ {}_{N-k}C_{n-x} ] / [ {}_NC_n ]$
$h(2; 52, 5, 26) = [ {}_{26}C_2 ] [ {}_{26}C_3 ] / [ {}_{52}C_5 ]$
$h(2; 52, 5, 26) = [ 325 ] [ 2600 ] / [ 2,598,960 ] = 0.32513$
Thus, the probability of randomly selecting exactly 2 red cards is 0.32513.
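The arithmetic in Example 1 can be reproduced with Python's standard library, where `math.comb(n, r)` returns the number of combinations of `n` things taken `r` at a time. This is a minimal sketch; the variable names are chosen for illustration only.

```python
from math import comb

# Example 1: 5 cards drawn from 52, of which 26 are red; exactly 2 red wanted.
N, k, n, x = 52, 26, 5, 2

ways_red = comb(k, x)            # choose 2 of the 26 red cards
ways_black = comb(N - k, n - x)  # choose the other 3 from the 26 black cards
ways_total = comb(N, n)          # choose any 5 of the 52 cards

probability = ways_red * ways_black / ways_total

print(ways_red, ways_black, ways_total)  # 325 2600 2598960
print(round(probability, 5))             # 0.32513
```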
Cumulative Hypergeometric Probability
A cumulative hypergeometric probability refers to the probability that the hypergeometric random variable is greater than or equal to some specified lower limit and less than or equal to some specified upper limit.

For example, suppose we randomly select five cards from an ordinary deck of playing cards. We might be interested in the cumulative hypergeometric probability of obtaining 2 or fewer hearts. This would be the probability of obtaining 0 hearts plus the probability of obtaining 1 heart plus the probability of obtaining 2 hearts, as shown in the example below. 
Example 2
Suppose we select 5 cards from an ordinary deck of playing cards. What is the probability of obtaining 2 or fewer hearts? 
Solution: This is a hypergeometric experiment in which we know the following:
$N = 52$; since there are 52 cards in a deck. 
$k = 13$; since there are 13 hearts in a deck. 
$n = 5$; since we randomly select 5 cards from the deck. 
$x = 0$ to 2; since our selection includes 0, 1, or 2 hearts. 
We plug these values into the hypergeometric formula as follows:
$h(x \le 2; N, n, k) = h(x \le 2; 52, 5, 13)$
$h(x \le 2; 52, 5, 13) = h(x = 0; 52, 5, 13) + h(x = 1; 52, 5, 13) + h(x = 2; 52, 5, 13)$
$h(x \le 2; 52, 5, 13) = [ ({}_{13}C_0) ({}_{39}C_5) / ({}_{52}C_5) ] + [ ({}_{13}C_1) ({}_{39}C_4) / ({}_{52}C_5) ]$
$+ [ ({}_{13}C_2) ({}_{39}C_3) / ({}_{52}C_5) ]$
$h(x \le 2; 52, 5, 13) = [ (1)(575,757)/(2,598,960) ] + [ (13)(82,251)/(2,598,960) ]$
$+ [ (78)(9,139)/(2,598,960) ]$
$h(x \le 2; 52, 5, 13) = [ 0.2215 ] + [ 0.4114 ] + [ 0.2743 ]$
$h(x \le 2; 52, 5, 13) = 0.9072$
Thus, the probability of randomly selecting at most 2 hearts is 0.9072.
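A cumulative probability such as the one in Example 2 is a sum of individual hypergeometric probabilities, which is easy to sketch in Python. The helper names `hypergeom_pmf` and `hypergeom_cdf` are assumptions made for this illustration, and the code assumes the upper limit does not exceed the sample size $n$.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): probability of exactly x successes in the sample."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def hypergeom_cdf(x_max, N, n, k):
    """Probability of at most x_max successes (assumes 0 <= x_max <= n)."""
    return sum(hypergeom_pmf(x, N, n, k) for x in range(x_max + 1))

# Example 2: 5 cards drawn from 52, of which 13 are hearts; at most 2 hearts.
terms = [hypergeom_pmf(x, 52, 5, 13) for x in range(3)]
at_most_two = hypergeom_cdf(2, 52, 5, 13)

print([round(t, 4) for t in terms])  # [0.2215, 0.4114, 0.2743]
print(round(at_most_two, 4))         # 0.9072
```

Every term shares the denominator ${}_{52}C_5 = 2,598,960$, since each counts favorable 5-card hands out of all possible 5-card hands.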