Random Variables

Understanding discrete and continuous random variables, PMF, PDF, and CDF

What is a Random Variable?

🎲 Random Variable - Mapping Outcomes to Numbers

Definition: A function that assigns a numerical value to each outcome in a sample space

Notation: Usually denoted by capital letters: $X$, $Y$, $Z$

Purpose: Allows us to perform mathematical operations on random events

Example: $X$ = number of heads in 3 coin flips (maps outcomes like HHT → 2)

Formal Definition
$$X: \Omega \rightarrow \mathbb{R}$$
Where:
$$\Omega = \text{sample space (set of all possible outcomes)}$$
$$\mathbb{R} = \text{real numbers}$$
$$X(\omega) = \text{numerical value assigned to outcome } \omega$$
Concrete Example - Coin Flip:
$$\text{Sample space: } \Omega = \{H, T\}$$
$$\text{Random variable: } X(H) = 1, \quad X(T) = 0$$
$$\text{Now we can do math: } P(X = 1) = 0.5, \quad E[X] = 0.5, \text{ etc.}$$
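This mapping really is just a function, which a few lines of Python make concrete. A minimal sketch of the coin-flip example above (the name `X` mirrors the notation; nothing here is a library API):

```python
# A random variable is just a function from outcomes to numbers.
def X(outcome):
    """Coin-flip random variable: H -> 1, T -> 0."""
    return 1 if outcome == "H" else 0

sample_space = ["H", "T"]  # Omega, with equally likely outcomes

# P(X = 1): fraction of outcomes mapped to 1.
p_X_equals_1 = sum(1 for w in sample_space if X(w) == 1) / len(sample_space)

# E[X]: sum of X(w) * P(w) over the sample space.
expected_X = sum(X(w) / len(sample_space) for w in sample_space)

print(p_X_equals_1, expected_X)  # 0.5 0.5
```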

Intuitive Examples

🎲 Rolling a Die
Sample Space: $$\{1, 2, 3, 4, 5, 6\}$$
Random Variable $X$: The number shown
Values: $$X \in \{1, 2, 3, 4, 5, 6\}$$
Type: Discrete
🌡️ Temperature
Sample Space: All possible temperature readings
Random Variable $X$: Temperature in °C
Values: $$X \in \mathbb{R} \quad \text{(any real number)}$$
Type: Continuous
📞 Phone Calls
Sample Space: All possible call patterns
Random Variable $X$: Number of calls per hour
Values: $$X \in \{0, 1, 2, 3, ...\}$$
Type: Discrete
⏱️ Waiting Time
Sample Space: All possible wait times
Random Variable $X$: Time until bus arrives (minutes)
Values: $$X \in [0, \infty)$$
Type: Continuous

Discrete Random Variables

📊 Discrete Random Variables - Countable Values

Definition: Takes on a countable number of distinct values

Examples: Number of heads in coin flips, count of defects, number of customers

Key property: Can list all possible values (finite or countably infinite)

Probability Mass Function (PMF)
$$p_X(x) = P(X = x)$$
Definition: $$\text{Probability that random variable } X \text{ takes value } x$$
Properties:
1. Non-negative: $$p_X(x) \geq 0 \text{ for all } x$$
2. Sums to 1: $$\sum_{\text{all } x} p_X(x) = 1$$
3. Only defined at discrete points
Notation: $$\text{Also written as } P(X = x) \text{ or } f_X(x)$$ $$\text{"Mass" because probability is concentrated at discrete points}$$
Interpretation: $$\text{PMF gives exact probabilities for discrete values}$$

Example: Fair Die Roll

Random Variable $X$: Outcome of rolling a fair six-sided die

PMF: $$p_X(x) = \begin{cases} \frac{1}{6} & \text{if } x \in \{1, 2, 3, 4, 5, 6\} \\ 0 & \text{otherwise} \end{cases}$$
Verification: $$\sum_{x=1}^{6} p_X(x) = 6 \times \frac{1}{6} = 1 \quad \checkmark$$
Calculations:
$$P(X = 3) = \frac{1}{6} \approx 0.167$$
$$P(X \leq 2) = P(X=1) + P(X=2) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$$
$$P(X > 4) = P(X=5) + P(X=6) = \frac{2}{6} = \frac{1}{3}$$
$$P(X \text{ is even}) = P(X=2) + P(X=4) + P(X=6) = \frac{1}{2}$$
Visual Representation: $$\text{PMF would be plotted as vertical bars at } x = 1, 2, 3, 4, 5, 6, \text{ each with height } \frac{1}{6}$$
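These sums are easy to reproduce with a dictionary PMF; `fractions.Fraction` keeps the arithmetic exact (a sketch with illustrative names, not a library API):

```python
from fractions import Fraction

# PMF of a fair die: each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
assert sum(pmf.values()) == 1  # property 2: probabilities sum to 1

# The calculations above, written as sums over the PMF:
p_at_most_2 = sum(p for x, p in pmf.items() if x <= 2)  # P(X <= 2)
p_greater_4 = sum(p for x, p in pmf.items() if x > 4)   # P(X > 4)
p_even = sum(p for x, p in pmf.items() if x % 2 == 0)   # P(X is even)

print(p_at_most_2, p_greater_4, p_even)  # 1/3 1/3 1/2
```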

Example: Number of Heads in 3 Coin Flips

Random Variable $X$: Number of heads when flipping a fair coin 3 times

Possible values: $$X \in \{0, 1, 2, 3\}$$
Sample space: $$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\} \quad \text{(8 outcomes)}$$
PMF Calculation:
$$P(X = 0) = P(\{TTT\}) = \frac{1}{8}$$
$$P(X = 1) = P(\{HTT, THT, TTH\}) = \frac{3}{8}$$
$$P(X = 2) = P(\{HHT, HTH, THH\}) = \frac{3}{8}$$
$$P(X = 3) = P(\{HHH\}) = \frac{1}{8}$$
| $x$ | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| $p_X(x)$ | $\frac{1}{8}$ | $\frac{3}{8}$ | $\frac{3}{8}$ | $\frac{1}{8}$ |
| Decimal | 0.125 | 0.375 | 0.375 | 0.125 |
Verification: $$\frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1 \quad \checkmark$$
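The same PMF falls out of brute-force enumeration of the sample space, which makes for a quick sanity check (a Python sketch):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of 3 coin flips.
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads, then normalize.
counts = Counter(seq.count("H") for seq in outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in counts.items()}

print(sorted(pmf.items()))
# [(0, Fraction(1, 8)), (1, Fraction(3, 8)), (2, Fraction(3, 8)), (3, Fraction(1, 8))]
assert sum(pmf.values()) == 1
```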

Continuous Random Variables

📈 Continuous Random Variables - Uncountable Values

Definition: Takes on any value in an interval (uncountably infinite values)

Examples: Height, weight, time, temperature, voltage

Key property: Cannot list all values - must use intervals

Important: $P(X = x) = 0$ for any specific value! (Probability is over intervals)

Probability Density Function (PDF)
$$f_X(x) \geq 0, \quad \int_{-\infty}^{\infty} f_X(x) \, dx = 1$$
Definition: $$\text{Function whose integral over an interval gives probability}$$
Key Difference from PMF:
$$\text{PMF gives exact probabilities: } P(X = x)$$
$$\text{PDF does NOT give probabilities directly!}$$
$$f_X(x) \text{ is a probability \textit{density}, not a probability}$$
$$f_X(x) \text{ can be } > 1 \text{ (densities are not probabilities!)}$$
Probability Calculation:
$$P(a \leq X \leq b) = \int_a^b f_X(x) \, dx$$
$$\text{Probability is the \textit{area under the curve}}$$
Properties:
1. Non-negative: $$f_X(x) \geq 0 \text{ for all } x$$
2. Integrates to 1: $$\int_{-\infty}^{\infty} f_X(x) \, dx = 1$$
3. Zero at any point: $$P(X = x) = 0 \text{ for any specific } x \text{ (zero area at a point)}$$
4. Endpoints don't matter: $$P(a \leq X \leq b) = P(a < X < b)$$

Example: Uniform Distribution on [0, 1]

Random Variable $X$: Uniformly distributed on interval $[0, 1]$

PDF: $$f_X(x) = \begin{cases} 1 & \text{if } 0 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases}$$
Verification: $$\int_{-\infty}^{\infty} f_X(x) \, dx = \int_0^1 1 \, dx = 1 \quad \checkmark$$
Probability Calculations:
$$P(X = 0.5) = 0 \quad \text{(probability at a single point is zero)}$$
$$P(0.2 \leq X \leq 0.7) = \int_{0.2}^{0.7} 1 \, dx = 0.7 - 0.2 = 0.5$$
$$P(X \leq 0.3) = \int_0^{0.3} 1 \, dx = 0.3$$
$$P(X > 0.8) = \int_{0.8}^1 1 \, dx = 0.2$$
Interpretation:
$$\text{PDF is constant (height = 1) over } [0,1]$$
$$\text{Probability of any interval is proportional to its length}$$
$$\text{The area under the curve from 0.2 to 0.7 is } 1 \times 0.5 = 0.5$$
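The interval probabilities above can also be approximated by numerically integrating the PDF, mirroring the definition $P(a \leq X \leq b) = \int_a^b f_X(x)\,dx$. A sketch using a simple midpoint rule (the helper names are illustrative):

```python
def uniform_pdf(x):
    """PDF of Uniform[0, 1]: height 1 on the interval, 0 elsewhere."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def prob_interval(pdf, a, b, n=10_000):
    """Approximate P(a <= X <= b) by midpoint-rule integration of the PDF."""
    h = (b - a) / n
    return sum(pdf(a + (i + 0.5) * h) for i in range(n)) * h

print(prob_interval(uniform_pdf, 0.2, 0.7))  # ~0.5
print(prob_interval(uniform_pdf, 0.0, 0.3))  # ~0.3
print(prob_interval(uniform_pdf, 0.8, 1.0))  # ~0.2
```

For this constant PDF the midpoint rule is exact up to rounding; for a curved PDF it would only approximate the area.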

Example: Exponential Distribution

Use case: Modeling waiting times, time until failure (memoryless process)

PDF: $$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \geq 0 \\ 0 & \text{if } x < 0 \end{cases}$$ $$\text{where } \lambda > 0 \text{ is the rate parameter}$$
Example with $\lambda = 1$:
$$P(X \leq 2) = \int_0^2 e^{-x} \, dx = [-e^{-x}]_0^2 = 1 - e^{-2} \approx 0.865$$
$$P(X > 1) = \int_1^{\infty} e^{-x} \, dx = e^{-1} \approx 0.368$$
$$P(0.5 \leq X \leq 1.5) = \int_{0.5}^{1.5} e^{-x} \, dx = e^{-0.5} - e^{-1.5} \approx 0.383$$
Note: $$\text{The PDF } f_X(0) = \lambda \text{ can be greater than 1 if } \lambda > 1$$ $$\text{This is okay because PDF is a density, not a probability!}$$
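These numbers can be checked in Python with the closed-form CDF $F_X(x) = 1 - e^{-\lambda x}$, which follows from integrating the PDF (the helper name is illustrative):

```python
import math

def expon_cdf(x, lam=1.0):
    """CDF of the exponential distribution: P(X <= x) = 1 - e^(-lambda*x)."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

# The calculations above, with lambda = 1:
print(expon_cdf(2.0))                   # 1 - e^-2, ~0.865
print(1.0 - expon_cdf(1.0))             # P(X > 1) = e^-1, ~0.368
print(expon_cdf(1.5) - expon_cdf(0.5))  # P(0.5 <= X <= 1.5), ~0.383

# The density at 0 equals lambda, which exceeds 1 whenever lambda > 1:
print(3.0 * math.exp(-3.0 * 0.0))  # PDF at 0 with lambda = 3 is 3.0
```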

Cumulative Distribution Function (CDF)

📊 CDF - The Universal Description

Works for both: Discrete and continuous random variables

Key property: Always non-decreasing, ranges from 0 to 1

Advantage: Unified way to describe any random variable

CDF Definition
$$F_X(x) = P(X \leq x)$$
Interpretation: $$\text{Probability that } X \text{ is at most } x$$
Properties:
1. Non-decreasing: $$\text{If } x_1 < x_2, \text{ then } F_X(x_1) \leq F_X(x_2)$$
2. Right-continuous: $$\lim_{h \to 0^+} F_X(x+h) = F_X(x)$$
3. Limits: $$\lim_{x \to -\infty} F_X(x) = 0 \quad \text{and} \quad \lim_{x \to \infty} F_X(x) = 1$$
4. Range: $$0 \leq F_X(x) \leq 1 \text{ for all } x$$
Key Insight: $$\text{CDF is defined for ALL random variables (discrete, continuous, or mixed)}$$
Relationship: PMF/PDF ↔ CDF
For Discrete Random Variables:
From PMF to CDF: $$F_X(x) = \sum_{k \leq x} p_X(k)$$
From CDF to PMF: $$p_X(x) = F_X(x) - F_X(x^-)$$ $$\text{where } F_X(x^-) \text{ is the left limit}$$
For Continuous Random Variables:
From PDF to CDF: $$F_X(x) = \int_{-\infty}^x f_X(t) \, dt$$
From CDF to PDF: $$f_X(x) = \frac{dF_X(x)}{dx}$$ $$\text{(derivative of CDF is PDF)}$$
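For the discrete direction, the conversions are just a running sum and its jump sizes. A Python sketch using the fair die; note that taking `x - 1` as the left limit works here only because the die's values are integers at least 1 apart:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die

def cdf(x, pmf):
    """F_X(x) = sum of p_X(k) over all k <= x."""
    return sum(p for k, p in pmf.items() if k <= x)

# Recover the PMF from the CDF jumps: p_X(x) = F_X(x) - F_X(x^-).
recovered = {x: cdf(x, pmf) - cdf(x - 1, pmf) for x in pmf}
assert recovered == pmf

print(cdf(2.5, pmf))  # 1/3, the same as P(X <= 2)
```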

Example: CDF for Fair Die

Random Variable $X$: Outcome of fair six-sided die

PMF: $$p_X(k) = \frac{1}{6} \quad \text{for } k \in \{1,2,3,4,5,6\}$$
CDF: $$F_X(x) = \begin{cases} 0 & \text{if } x < 1 \\ \frac{1}{6} & \text{if } 1 \leq x < 2 \\ \frac{2}{6} & \text{if } 2 \leq x < 3 \\ \frac{3}{6} & \text{if } 3 \leq x < 4 \\ \frac{4}{6} & \text{if } 4 \leq x < 5 \\ \frac{5}{6} & \text{if } 5 \leq x < 6 \\ 1 & \text{if } x \geq 6 \end{cases}$$
| $x$ | $F_X(x) = P(X \leq x)$ | Interpretation |
|---|---|---|
| $x = 2.5$ | $\frac{2}{6} \approx 0.333$ | Prob. of rolling ≤ 2 |
| $x = 4$ | $\frac{4}{6} \approx 0.667$ | Prob. of rolling ≤ 4 |
| $x = 10$ | $1$ | Certain (all outcomes ≤ 10) |
Visualization:
$$\text{CDF for discrete variables is a step function with jumps at each possible value}$$
$$\text{Jump size equals PMF at that point}$$

Example: CDF for Uniform[0,1]

Random Variable $X$: Uniform on $[0,1]$

PDF: $$f_X(x) = 1 \quad \text{for } x \in [0,1]$$
CDF: $$F_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ x & \text{if } 0 \leq x \leq 1 \\ 1 & \text{if } x > 1 \end{cases}$$
Derivation for $0 \leq x \leq 1$: $$F_X(x) = \int_{-\infty}^x f_X(t) \, dt = \int_0^x 1 \, dt = x$$
Examples:
$$F_X(0.3) = 0.3 \quad \rightarrow \quad \text{30% chance } X \leq 0.3$$
$$F_X(0.7) = 0.7 \quad \rightarrow \quad \text{70% chance } X \leq 0.7$$
$$F_X(-1) = 0 \quad \rightarrow \quad \text{Impossible for } X \leq -1$$
$$F_X(2) = 1 \quad \rightarrow \quad \text{Certain that } X \leq 2$$
Verification (PDF from CDF):
$$f_X(x) = \frac{dF_X(x)}{dx} = \frac{d(x)}{dx} = 1 \quad \text{for } x \in [0,1] \quad \checkmark$$
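The same check can be done numerically: a central difference applied to the CDF recovers the PDF at interior points of $[0,1]$ (a sketch; function names are illustrative):

```python
def uniform_cdf(x):
    """CDF of Uniform[0, 1]: 0 below 0, x on [0, 1], 1 above 1."""
    return min(max(x, 0.0), 1.0)

def pdf_from_cdf(cdf, x, h=1e-6):
    """Approximate f_X(x) = dF_X/dx with a central difference."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)

print(uniform_cdf(0.3))   # 0.3
print(uniform_cdf(-1.0))  # 0.0
print(uniform_cdf(2.0))   # 1.0
print(pdf_from_cdf(uniform_cdf, 0.5))  # ~1.0
```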

Using CDF for Probability Calculations

Important Formulas
$$P(a < X \leq b) = F_X(b) - F_X(a)$$
$$P(X > a) = 1 - F_X(a)$$
$$P(X < a) = F_X(a^-)$$
Example with Uniform[0,1]:
$$P(0.2 < X \leq 0.7) = F_X(0.7) - F_X(0.2) = 0.7 - 0.2 = 0.5$$
$$P(X > 0.6) = 1 - F_X(0.6) = 1 - 0.6 = 0.4$$
$$P(X \leq 0.25) = F_X(0.25) = 0.25$$
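Applied to the Uniform[0,1] CDF from the previous example, these formulas become one-liners (floating-point results only approximate the exact fractions):

```python
def F(x):
    """CDF of Uniform[0, 1] (the example distribution above)."""
    return min(max(x, 0.0), 1.0)

p_between = F(0.7) - F(0.2)  # P(0.2 < X <= 0.7), ~0.5
p_greater = 1.0 - F(0.6)     # P(X > 0.6), ~0.4
p_at_most = F(0.25)          # P(X <= 0.25) = 0.25

print(p_between, p_greater, p_at_most)
```

For a continuous variable $F_X(a^-) = F_X(a)$, so `P(X < a)` and `P(X <= a)` coincide here.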

Comparison: Discrete vs Continuous

| Property | Discrete | Continuous |
|---|---|---|
| Values | Countable set | Interval (uncountable) |
| Distribution | PMF (Probability Mass) | PDF (Probability Density) |
| Exact probability | $P(X = x) = p_X(x)$ | $P(X = x) = 0$ |
| Interval probability | $\sum_{k=a}^b p_X(k)$ | $\int_a^b f_X(x) \, dx$ |
| Normalization | $\sum_{\text{all } x} p_X(x) = 1$ | $\int_{-\infty}^{\infty} f_X(x) \, dx = 1$ |
| CDF formula | $F_X(x) = \sum_{k \leq x} p_X(k)$ | $F_X(x) = \int_{-\infty}^x f_X(t) \, dt$ |
| CDF shape | Step function (jumps) | Smooth curve |
| Examples | Coin flips, dice, counts | Height, time, temperature |

Key Insights

Random Variables Map Outcomes to Numbers:
This allows us to use calculus and algebra to analyze random events mathematically.

PMF vs PDF - Critical Difference:
PMF gives exact probabilities ($P(X=x)$), but PDF gives densities. For continuous variables, $P(X=x) = 0$ always!

PDF Can Exceed 1:
Common misconception - PDF values can be > 1 because they are densities, not probabilities. Only integrals (areas) are probabilities.

CDF is Universal:
Works for both discrete and continuous. Always non-decreasing, goes from 0 to 1. You can always recover PMF/PDF from CDF.

Discrete: Sum, Continuous: Integrate:
This is the fundamental difference in calculations. Discrete uses sums ($\sum$), continuous uses integrals ($\int$).

Endpoints Don't Matter for Continuous:
$P(a < X < b) = P(a \leq X \leq b)$ for continuous variables (since $P(X=a) = 0$). This is NOT true for discrete!

CDF Always Defined:
Even for mixed distributions (part discrete, part continuous), CDF is always well-defined. It's the most general description.

Visualization:
PMF: vertical bars at discrete points
PDF: smooth curve, area under curve = probability
CDF (discrete): step function
CDF (continuous): smooth S-shaped curve