
Probability for AI/ML - Chapter 7: Law of Large Numbers & Central Limit Theorem

Chapter 7 — Law of Large Numbers & Central Limit Theorem

Understanding how sample statistics converge to population parameters and how sample means behave is crucial in AI/ML for estimation, sampling, and simulation.

7.1 Law of Large Numbers (LLN)

The Law of Large Numbers states that as the number of independent samples increases, the sample average converges to the expected value (mean) of the population.

Mathematical Form: For independent, identically distributed random variables X₁, X₂, ..., Xₙ with expected value μ, lim (n → ∞) (1/n) Σ Xᵢ = μ

Example: Rolling a fair six-sided die many times — the average of outcomes approaches 3.5. AI/ML context: Ensures that empirical estimates (e.g., mean loss, accuracy) converge to true values when using large datasets or Monte Carlo samples.
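
A minimal sketch of the Monte Carlo point (assuming a toy target quantity, E[X²] for X ~ N(0, 1), whose exact value is 1): the empirical average below moves toward the true value as the number of samples grows.

import numpy as np

np.random.seed(1)
true_value = 1.0  # E[X^2] = Var(X) + E[X]^2 = 1 for X ~ N(0, 1)
for n in [100, 10_000, 1_000_000]:
    x = np.random.normal(size=n)
    estimate = np.mean(x ** 2)  # empirical average of f(X) = X^2
    print(f"n={n:>9,}: estimate={estimate:.4f}, error={abs(estimate - true_value):.4f}")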

7.2 Central Limit Theorem (CLT)

The Central Limit Theorem states that, as sample size increases, the distribution of sample means approaches a normal distribution regardless of the population’s original distribution, provided the population has a finite mean and variance.

Mathematical Form: For the sample mean X̄ = (1/n) Σ Xᵢ of i.i.d. variables with mean μ and variance σ², X̄ is approximately N(μ, σ²/n) for large n; equivalently, √n (X̄ − μ)/σ → N(0, 1) in distribution as n → ∞
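
A quick numerical check of the σ²/n scaling (a sketch, assuming an exponential population with scale 2, so σ = 2, and samples of size n = 30, the same setup used in Section 7.3):

import numpy as np

np.random.seed(2)
n, scale = 30, 2.0  # an exponential with scale 2 has mean 2 and standard deviation 2
sample_means = np.random.exponential(scale=scale, size=(10000, n)).mean(axis=1)
print("empirical std of sample means:", sample_means.std())
print("CLT prediction sigma/sqrt(n): ", scale / np.sqrt(n))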

Example: Sampling the average height of 30 people repeatedly — the distribution of the averages forms a bell curve, even if individual heights are not perfectly normal. AI/ML context: Justifies using Gaussian approximations, confidence intervals, and statistical tests in ML models and Monte Carlo simulations.
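
For instance, a CLT-based 95% confidence interval for a mean is X̄ ± 1.96 · s/√n, where s is the sample standard deviation. A minimal sketch, assuming hypothetical per-example losses from a skewed distribution:

import numpy as np

np.random.seed(3)
losses = np.random.exponential(scale=0.7, size=500)  # hypothetical per-example losses
se = losses.std(ddof=1) / np.sqrt(losses.size)       # standard error of the mean
low, high = losses.mean() - 1.96 * se, losses.mean() + 1.96 * se
print(f"mean loss = {losses.mean():.3f}, 95% CI = ({low:.3f}, {high:.3f})")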

7.3 Practical Examples in Python

import numpy as np
import matplotlib.pyplot as plt

# LLN demonstration: running average of fair die rolls converges to 3.5
np.random.seed(0)
die_rolls = np.random.randint(1, 7, size=10000)
cumulative_avg = np.cumsum(die_rolls) / np.arange(1, die_rolls.size + 1)
plt.plot(cumulative_avg)
plt.axhline(y=3.5, color='r', linestyle='--')  # true expected value of a fair die
plt.title("Law of Large Numbers: Average of Die Rolls")
plt.xlabel("Number of Rolls")
plt.ylabel("Cumulative Average")
plt.show()

# CLT demonstration: means of samples drawn from a skewed (non-normal) distribution
sample_means = []
for _ in range(10000):
    sample = np.random.exponential(scale=2.0, size=30)  # non-normal population
    sample_means.append(np.mean(sample))
plt.hist(sample_means, bins=50, density=True)
plt.title("Central Limit Theorem: Distribution of Sample Means")
plt.xlabel("Sample Mean")
plt.ylabel("Density")  # density=True normalizes the histogram
plt.show()
plt.show()

7.4 Key Takeaways

  • LLN ensures that averages computed on large datasets approximate true expectations.
  • CLT allows treating sample means as normally distributed, enabling statistical inference and confidence intervals.
  • Both concepts are fundamental in Monte Carlo simulations, probabilistic modeling, and evaluating ML algorithms on sampled data (see the sketch after this list).
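
A closing sketch of the last point, assuming a hypothetical classifier with true accuracy 0.85 evaluated on a finite test sample: the LLN says the measured accuracy approaches 0.85 as the test set grows, while the CLT supplies an error bar for any fixed test size.

import numpy as np

np.random.seed(4)
true_accuracy = 0.85                                       # hypothetical classifier
correct = np.random.binomial(1, true_accuracy, size=2000)  # 1 = correct prediction on a test point
acc = correct.mean()                                       # LLN: approaches 0.85 as the test set grows
se = np.sqrt(acc * (1 - acc) / correct.size)               # CLT-based standard error of a proportion
print(f"estimated accuracy = {acc:.3f} ± {1.96 * se:.3f} (95% CI half-width)")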

Next chapter: Entropy, Information & KL Divergence — understanding uncertainty and information in AI/ML models.
