Wednesday, February 12, 2025

Multinomial Distribution: Theory, Applications, and a Real-World Example

  Manoj

🔎 Introduction

The multinomial distribution generalizes the binomial distribution to more than two outcomes. It is widely used in statistics to model counts of categorical outcomes across multiple classes, in areas such as survey analysis, genetics, and market research.

👉 If you are not familiar with the binomial distribution, check my earlier post: Binomial Distribution Explained.


📘 Definition

Suppose we perform n independent trials, each of which results in exactly one of k possible outcomes. Let the probability of outcome i be \(p_i\), the same on every trial, where:

$$ \sum_{i=1}^k p_i = 1 $$

If \(X_i\) denotes the number of trials in which outcome i occurs, then the joint distribution of \((X_1, X_2, \dots, X_k)\) is multinomial with parameters n and \((p_1, \dots, p_k)\).
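
To make the definition concrete, here is a minimal simulation sketch using only the Python standard library. The function name `sample_counts`, the seed, and the values n = 10 and p = (0.5, 0.3, 0.2) are illustrative assumptions, not part of the definition.

```python
import random
from collections import Counter

def sample_counts(n, probs, rng):
    """Run n independent trials and return the count vector (X_1, ..., X_k)."""
    k = len(probs)
    # Each trial picks one outcome index 0..k-1 with the given probabilities.
    outcomes = rng.choices(range(k), weights=probs, k=n)
    tally = Counter(outcomes)
    # Counter returns 0 for outcomes that never occurred.
    return [tally[i] for i in range(k)]

rng = random.Random(0)  # arbitrary fixed seed, only for reproducibility
counts = sample_counts(10, [0.5, 0.3, 0.2], rng)
print(counts, sum(counts))  # the counts always add up to n
```

Each call produces one draw of \((X_1, \dots, X_k)\); repeating it many times would trace out the multinomial distribution.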

The probability mass function (pmf) is:

$$ P(X_1=x_1,\dots,X_k=x_k) = \frac{n!}{x_1!\,x_2!\,\dots\,x_k!} \, p_1^{x_1} p_2^{x_2} \dots p_k^{x_k} $$

where each \(x_i\) is a non-negative integer and \(\sum_{i=1}^k x_i = n\); for any other values the probability is 0.
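
The pmf translates almost directly into code. The following is a sketch, assuming Python 3.8 or later (for `math.prod` and `math.comb`); the name `multinomial_pmf` is hypothetical, and n is taken to be the sum of the counts.

```python
from math import comb, factorial, isclose, prod

def multinomial_pmf(counts, probs):
    """P(X_1 = x_1, ..., X_k = x_k) for n = sum(counts) trials."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if any(x < 0 for x in counts):
        return 0.0
    n = sum(counts)
    # Multinomial coefficient n! / (x_1! x_2! ... x_k!), kept exact in integers.
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    # Product p_1^{x_1} p_2^{x_2} ... p_k^{x_k}.
    return coeff * prod(p ** x for p, x in zip(probs, counts))

# With k = 2 the formula reduces to the binomial pmf.
two_class = multinomial_pmf([2, 3], [0.4, 0.6])
binomial = comb(5, 2) * 0.4**2 * 0.6**3
print(isclose(two_class, binomial))
```

The last lines check the k = 2 case: with \(x_2 = n - x_1\) and \(p_2 = 1 - p_1\), the multinomial coefficient becomes \(\binom{n}{x_1}\) and the pmf is the familiar binomial one.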


📐 Properties

  • Mean: \(E[X_i] = np_i\)
  • Variance: \(\mathrm{Var}(X_i) = np_i(1-p_i)\)
  • Covariance: \(\mathrm{Cov}(X_i, X_j) = -np_ip_j, \; i \neq j\)
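
These three formulas fill in a full k × k covariance matrix. The sketch below builds it for illustrative values n = 5 and p = (0.5, 0.3, 0.2); the helper name `multinomial_moments` is an assumption made for this example. Because \(X_1 + \dots + X_k = n\) is constant, every row of the matrix should sum to zero, which gives a quick sanity check on the negative covariances.

```python
def multinomial_moments(n, probs):
    """Return (means, cov) where cov[i][j] = Cov(X_i, X_j)."""
    k = len(probs)
    means = [n * p for p in probs]
    cov = [
        [
            # Diagonal: Var(X_i) = n p_i (1 - p_i); off-diagonal: -n p_i p_j.
            n * probs[i] * (1 - probs[i]) if i == j else -n * probs[i] * probs[j]
            for j in range(k)
        ]
        for i in range(k)
    ]
    return means, cov

means, cov = multinomial_moments(5, [0.5, 0.3, 0.2])
print(means)
for row in cov:
    # Each row sums to zero up to floating-point rounding.
    print(row, "row sum:", round(sum(row), 12))
```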

🌍 Applications

  • Genetics: Predicting genotype frequencies.
  • Marketing: Customer choices across brands.
  • Polling: Election vote share estimates across parties.
  • Gaming: Probability of outcomes with dice, cards, or random draws.

Figure 1: Visual representation of multinomial categories with probabilities \(p_1, p_2, \dots, p_k\).

📊 Example: Genetics

A plant produces flowers that may be red, white, or yellow with probabilities \(p_1=0.5\), \(p_2=0.3\), \(p_3=0.2\), and the color of each flower is independent of the others. For \(n=5\) flowers, find the probability of getting 2 red, 2 white, and 1 yellow flower.

Solution:

$$ P(X_1=2, X_2=2, X_3=1) = \frac{5!}{2!\,2!\,1!} (0.5)^2 (0.3)^2 (0.2)^1 $$

Step-by-step:

  • Coefficient: \(\frac{5!}{2!2!1!} = 30\)
  • Product: \((0.5)^2 (0.3)^2 (0.2)^1 = 0.0045\)
  • Final probability: \(30 \times 0.0045 = 0.135\)

So, the probability is 0.135 (13.5%).
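
As a cross-check that does not use the formula at all, the result can be reproduced by brute force: list every ordered sequence of five flowers, keep those with 2 red, 2 white, and 1 yellow, and add up their probabilities. This is only a sketch for this small example; the variable names are illustrative.

```python
from itertools import product

colors = ["red", "white", "yellow"]
probs = {"red": 0.5, "white": 0.3, "yellow": 0.2}
target = {"red": 2, "white": 2, "yellow": 1}

matching = 0
total = 0.0
# Walk through all 3^5 = 243 ordered sequences of five flowers.
for seq in product(colors, repeat=5):
    if all(seq.count(c) == target[c] for c in colors):
        matching += 1
        # Probability of this particular ordering (independent flowers).
        p = 1.0
        for c in seq:
            p *= probs[c]
        total += p

print(matching)         # 30 orderings, matching the coefficient
print(round(total, 6))  # 0.135
```

The count of matching orderings is exactly the multinomial coefficient, and each ordering has the same probability 0.0045, which is why the pmf is their product.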


📝 Key Takeaways

  • The multinomial distribution extends the binomial to k categories.
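  • The pmf is the multinomial coefficient \(\frac{n!}{x_1!\,\dots\,x_k!}\) times the product of the outcome probabilities raised to their counts.
  • Mean: \(E[X_i] = np_i\); Variance: \(\mathrm{Var}(X_i) = np_i(1-p_i)\).
  • Covariance between different categories is negative, \(-np_ip_j\), because the counts must sum to n.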
  • Applications include genetics, marketing, surveys, and games.

Thank you for reading!
If you found this helpful, please share & subscribe for more statistics lectures.
Drop your questions in the comments 👇
