Chapter 3 — Random Variables
Random variables allow us to model uncertain quantities numerically. They are the foundation for representing features, outcomes, and predictions in AI & ML.
3.1 Discrete Random Variables
A discrete random variable takes on a finite or countable set of values. Example: Number of heads when flipping 3 coins (values: 0, 1, 2, 3).
Probability Mass Function (PMF): gives the probability of each possible value:
P(X=x) = probability that random variable X takes value x
Example PMF: Flipping 3 coins, X = # of heads:
P(X=0) = 1/8, P(X=1) = 3/8, P(X=2) = 3/8, P(X=3) = 1/8
AI/ML context: Discrete features, such as categorical variables or counts, can be modeled as discrete random variables.
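As a quick sanity check, the PMF above can be derived by enumerating all 2³ equally likely flip sequences and counting heads; a minimal sketch:

```python
from itertools import product
from collections import Counter

# Enumerate all 2^3 equally likely sequences of 3 coin flips (H = 1, T = 0)
outcomes = list(product([0, 1], repeat=3))

# Count how many sequences give each number of heads
counts = Counter(sum(seq) for seq in outcomes)

# PMF: P(X = k) = (# of sequences with k heads) / 8
pmf = {k: counts[k] / len(outcomes) for k in sorted(counts)}
print(pmf)  # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```

The printed fractions match the 1/8, 3/8, 3/8, 1/8 values stated above.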
3.2 Continuous Random Variables
A continuous random variable can take any value in an interval. Example: Height of students in a class or pixel intensity in an image.
Probability Density Function (PDF): The probability of the variable falling within an interval [a, b] is:
P(a ≤ X ≤ b) = ∫[a to b] f(x) dx, where f(x) is the PDF
Example: Standard normal distribution:
Mean μ = 0, Standard deviation σ = 1, PDF f(x) = (1/√(2π)) * exp(-x²/2)
AI/ML context: Continuous features (e.g., sensor readings, normalized pixel values) are modeled as continuous random variables. PDFs are used in Gaussian Naive Bayes and probabilistic models.
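The integral in the interval-probability definition above rarely needs to be evaluated by hand: SciPy's `norm.cdf` gives P(X ≤ x), so P(a ≤ X ≤ b) is just a difference of two CDF values. A small sketch for the standard normal:

```python
from scipy.stats import norm

# P(a <= X <= b) = F(b) - F(a), where F is the CDF of the standard normal
a, b = -1.0, 1.0
prob = norm.cdf(b) - norm.cdf(a)
print(prob)  # ~0.6827: about 68% of the mass lies within one standard deviation
```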
3.3 Expectation & Variance of Random Variables
- Expected value (mean): E[X] = Σ x * P(X=x) (discrete) or ∫ x f(x) dx (continuous)
- Variance: Var(X) = E[(X - E[X])²]
- Standard deviation: σ = √Var(X)
AI/ML context: Expectation and variance help in feature normalization, uncertainty estimation, and probabilistic reasoning in ML.
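The continuous-case formulas above can be checked numerically: integrating x·f(x) and (x − E[X])²·f(x) for the standard normal should recover its known mean 0 and variance 1. A sketch using SciPy's `quad` integrator:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# E[X] = integral of x * f(x) over the real line
E_X, _ = quad(lambda x: x * norm.pdf(x), -np.inf, np.inf)

# Var(X) = integral of (x - E[X])^2 * f(x) over the real line
Var_X, _ = quad(lambda x: (x - E_X)**2 * norm.pdf(x), -np.inf, np.inf)

print(E_X, Var_X)  # ~0.0 and ~1.0, matching mu = 0 and sigma = 1
```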
3.4 Practical Examples in Python
import numpy as np
from scipy.stats import binom, norm

# Discrete random variable: number of heads in 3 fair coin flips
values = np.array([0, 1, 2, 3])
pmf = binom.pmf(values, n=3, p=0.5)
print("PMF for 0-3 heads:", pmf)

# Continuous random variable: standard normal
x = np.linspace(-3, 3, 100)
pdf = norm.pdf(x, loc=0, scale=1)
print("First 5 values of PDF:", pdf[:5])

# Expectation and variance of the discrete example
E_X = np.sum(values * pmf)               # matches the closed form n*p = 1.5
Var_X = np.sum((values - E_X)**2 * pmf)  # matches n*p*(1-p) = 0.75
print("Expected value:", E_X, "Variance:", Var_X)
3.5 Key Takeaways
- Random variables allow modeling of uncertainty in features and outcomes.
- Whether a random variable is discrete or continuous depends on the type of data it models.
- PMF and PDF describe the distribution of probabilities for discrete and continuous variables, respectively.
- Expectation and variance are essential for feature scaling, probabilistic models, and risk estimation in ML.
Next chapter: Expectation, Variance, and Moments — how to quantify the center, spread, and shape of distributions for AI & ML applications.