Probability Theory Fundamentals

Probability

Statistics

Mathematics

Introduction to fundamental probability concepts including sample spaces, events, and probability laws with practical Python implementations

Author

Nipun Batra

Published

February 2, 2025

Keywords

probability, sample space, events, combinatorics, probability law, event space

Probability Theory Fundamentals

This notebook covers the fundamental concepts of probability theory, including sample spaces, events, and probability laws. We’ll explore these concepts through practical examples using Python.

Learning Objectives

Understand sample spaces and events
Learn about combinatorics and power sets
Implement probability laws
Work with discrete probability distributions

Let’s begin by exploring sample spaces for common probability experiments:

import numpy as np

# Define the sample space for a fair coin
sample_space = np.array(['H', 'T'])
print("Sample Space for Coin Toss:", sample_space)

# Define the sample space for a single throw of a six-sided die
sample_space_dice = [1, 2, 3, 4, 5, 6]
print("Sample Space for Dice Throw:", sample_space_dice)

Sample Space for Coin Toss: ['H' 'T']
Sample Space for Dice Throw: [1, 2, 3, 4, 5, 6]

Sample Spaces and Events

In probability theory, the sample space (Ω) is the set of all possible outcomes of an experiment. An event is any subset of the sample space.
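Since an event is just a subset of Ω, the definition can be checked directly with Python's built-in sets. The following is a minimal sketch; the names `omega`, `odd`, `at_least_five`, and `is_event` are illustrative and not part of the notebook's own code.

```python
# Sample space for one throw of a six-sided die
omega = frozenset({1, 2, 3, 4, 5, 6})

# Two events, written as plain Python sets (illustrative names)
odd = {1, 3, 5}
at_least_five = {5, 6}

def is_event(candidate, sample_space):
    """Return True if candidate is a subset of sample_space."""
    return set(candidate) <= set(sample_space)

print(is_event(odd, omega))          # True
print(is_event({0, 1}, omega))       # False: 0 is not a possible outcome

# Set operations on events produce new events
print(odd & at_least_five)           # {5}  (both events occur)
print(sorted(odd | at_least_five))   # [1, 3, 5, 6]  (at least one occurs)
print(sorted(omega - odd))           # [2, 4, 6]  (complement of "odd")
```

Intersection, union, and complement of events are again subsets of Ω, so they are events too.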

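The coin toss and dice throw defined earlier are simple examples: Ω = {H, T} for the coin and Ω = {1, 2, 3, 4, 5, 6} for the die.

Specific events are subsets of these sample spaces: getting a head is {H}, and rolling an odd number is {1, 3, 5}.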

Combinatorics and Power Sets

To enumerate all possible events, we need some combinatorics. For a finite sample space, the event space (a σ-algebra) can be taken to be the power set of the sample space, that is, the set of all its subsets. A sample space with n outcomes therefore has 2^n events.

The code below first defines two specific events, then uses combinations from Python’s itertools to enumerate subsets of a list:

# Define the event of getting a head (a subset of the coin's sample space)
event_space_head = np.array(['H'])

# Define the event of getting an odd number in a dice throw
event_space_odd = [1, 3, 5]

# Mini tutorial on itertools
from itertools import combinations, permutations, product

x = [1, 2, 3, 4]
# Combinations
print('Combinations')
print(list(combinations(x, 2)))

Combinations
[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

list(combinations(x, 0))

[()]

for i in range(0, len(x)+1):
    print("Combinations of length ", i)
    print(list(combinations(x, i)))
    print()

Combinations of length  0
[()]

Combinations of length  1
[(1,), (2,), (3,), (4,)]

Combinations of length  2
[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

Combinations of length  3
[(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]

Combinations of length  4
[(1, 2, 3, 4)]
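The import earlier also brings in `permutations` and `product`, which the cells here do not exercise. As a brief sketch of how they differ from `combinations` (the list `items` and the two-toss experiment are illustrative choices, not taken from the notebook):

```python
from itertools import combinations, permutations, product

items = [1, 2, 3]

# combinations: order does not matter, so (1, 2) and (2, 1) count once
print(list(combinations(items, 2)))
# [(1, 2), (1, 3), (2, 3)]

# permutations: order matters, so both (1, 2) and (2, 1) appear
print(list(permutations(items, 2)))
# [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]

# product: Cartesian product, useful for repeated experiments
# such as tossing a coin twice
two_tosses = list(product(['H', 'T'], repeat=2))
print(two_tosses)
# [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
```

Subsets of a sample space ignore order, which is why the power set is built from `combinations` rather than `permutations`.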

# Combine all using chain 
from itertools import chain
powerset = list(chain.from_iterable(combinations(x, i) for i in range(0, len(x)+1)))
print(powerset)

[(), (1,), (2,), (3,), (4,), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4), (1, 2, 3, 4)]
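The power set printed above has 16 entries, one per subset of the four-element list. A quick sanity check, sketched here with `math.comb`, is that the number of subsets of each size is a binomial coefficient and that these counts sum to 2^n:

```python
from itertools import chain, combinations
from math import comb

x = [1, 2, 3, 4]
n = len(x)

# Rebuild the power set exactly as in the cell above
powerset = list(chain.from_iterable(combinations(x, k) for k in range(n + 1)))

# Number of subsets of each size k is "n choose k"
sizes = [comb(n, k) for k in range(n + 1)]
print(sizes)                              # [1, 4, 6, 4, 1]

# The sizes add up to 2**n, the total number of subsets
print(sum(sizes), len(powerset), 2 ** n)  # 16 16 16
```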

import itertools
# Generate the entire event space (power set)
def generate_event_space(sample_space):
    """
    Generates the power set of the sample space, which is the event space.

    Args:
        sample_space (array-like): The sample space (list or np.ndarray).

    Returns:
        list: A list of NumPy arrays representing all possible events.
    """
    n = len(sample_space)
    # Use itertools to generate all subsets
    power_set = list(itertools.chain.from_iterable(
        itertools.combinations(sample_space, r) for r in range(n + 1)
    ))
    # Convert tuples to NumPy arrays
    return [np.array(event) for event in power_set]

generate_event_space(sample_space)

[array([], dtype=float64),
 array(['H'], dtype='<U1'),
 array(['T'], dtype='<U1'),
 array(['H', 'T'], dtype='<U1')]

sample_space_dice

[1, 2, 3, 4, 5, 6]

generate_event_space(sample_space_dice)

[array([], dtype=float64),
 array([1]),
 array([2]),
 array([3]),
 array([4]),
 array([5]),
 array([6]),
 array([1, 2]),
 array([1, 3]),
 array([1, 4]),
 array([1, 5]),
 array([1, 6]),
 array([2, 3]),
 array([2, 4]),
 array([2, 5]),
 array([2, 6]),
 array([3, 4]),
 array([3, 5]),
 array([3, 6]),
 array([4, 5]),
 array([4, 6]),
 array([5, 6]),
 array([1, 2, 3]),
 array([1, 2, 4]),
 array([1, 2, 5]),
 array([1, 2, 6]),
 array([1, 3, 4]),
 array([1, 3, 5]),
 array([1, 3, 6]),
 array([1, 4, 5]),
 array([1, 4, 6]),
 array([1, 5, 6]),
 array([2, 3, 4]),
 array([2, 3, 5]),
 array([2, 3, 6]),
 array([2, 4, 5]),
 array([2, 4, 6]),
 array([2, 5, 6]),
 array([3, 4, 5]),
 array([3, 4, 6]),
 array([3, 5, 6]),
 array([4, 5, 6]),
 array([1, 2, 3, 4]),
 array([1, 2, 3, 5]),
 array([1, 2, 3, 6]),
 array([1, 2, 4, 5]),
 array([1, 2, 4, 6]),
 array([1, 2, 5, 6]),
 array([1, 3, 4, 5]),
 array([1, 3, 4, 6]),
 array([1, 3, 5, 6]),
 array([1, 4, 5, 6]),
 array([2, 3, 4, 5]),
 array([2, 3, 4, 6]),
 array([2, 3, 5, 6]),
 array([2, 4, 5, 6]),
 array([3, 4, 5, 6]),
 array([1, 2, 3, 4, 5]),
 array([1, 2, 3, 4, 6]),
 array([1, 2, 3, 5, 6]),
 array([1, 2, 4, 5, 6]),
 array([1, 3, 4, 5, 6]),
 array([2, 3, 4, 5, 6]),
 array([1, 2, 3, 4, 5, 6])]
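The listing above contains 2^6 = 64 events. NumPy arrays are not hashable, so checking whether a particular event appears in that list takes some care. One possible alternative, sketched below under the assumption that order within an event is irrelevant, stores each event as a `frozenset`; the helper `event_space_as_sets` is hypothetical and not part of the notebook.

```python
from itertools import chain, combinations

def event_space_as_sets(sample_space):
    """Hypothetical variant of generate_event_space: returns a set of frozensets."""
    outcomes = list(sample_space)
    subsets = chain.from_iterable(
        combinations(outcomes, r) for r in range(len(outcomes) + 1)
    )
    return {frozenset(s) for s in subsets}

dice_events = event_space_as_sets([1, 2, 3, 4, 5, 6])

print(len(dice_events))                      # 64
print(frozenset({1, 3, 5}) in dice_events)   # True: "odd" is an event
print(frozenset() in dice_events)            # True: the empty event
print(frozenset({7}) in dice_events)         # False: 7 is not an outcome
```

Membership tests on a set of frozensets take constant time on average, whereas a list of arrays would need an element-by-element comparison.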

# Probability law for equally likely outcomes (fair coin, fair die)
def probability(event, sample_space):
    """
    Computes the probability of an event when all outcomes are equally likely.

    Args:
        event (array-like): The event (subset of the sample space).
        sample_space (array-like): The sample space.

    Returns:
        float: The probability of the event.

    Raises:
        ValueError: If the event is not a subset of the sample space.
    """
    # Convert the event into a NumPy array for comparison
    event = np.array(event)

    # Validate that the event is a subset of the sample space
    if not np.all(np.isin(event, sample_space)):
        raise ValueError("Invalid event. Event must be a subset of the sample space.")

    # Probability logic: P(E) = |E| / |Ω| for equally likely outcomes
    if len(event) == 0:         # Empty set (impossible event)
        return 0.0
    elif np.array_equal(event, sample_space):  # Entire sample space (certain event)
        return 1.0
    else:                       # Any other event, e.g. {H} or {1, 3, 5}
        return len(event) / len(sample_space)

for event in generate_event_space(sample_space):
    print(f"Event: {event} -> Probability: {probability(event, sample_space)}")

Event: [] -> Probability: 0.0
Event: ['H'] -> Probability: 0.5
Event: ['T'] -> Probability: 0.5
Event: ['H' 'T'] -> Probability: 1.0
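The `probability` function above assumes every outcome is equally likely. For a general discrete distribution, the probability of an event is the sum of the probabilities of its outcomes. The following is a rough sketch of that idea; the function `probability_from_pmf` and the 3/5 versus 2/5 weights of the biased coin are made up for illustration and do not come from the notebook.

```python
from fractions import Fraction

def probability_from_pmf(event, pmf):
    """Sum the probability mass of every outcome in the event.

    pmf maps each outcome to its probability (hypothetical helper).
    """
    event = set(event)
    unknown = event - set(pmf)
    if unknown:
        raise ValueError(f"Outcomes not in the sample space: {unknown}")
    return sum((pmf[outcome] for outcome in event), Fraction(0))

# Illustrative biased coin: heads with probability 3/5 (assumed values)
biased_coin = {'H': Fraction(3, 5), 'T': Fraction(2, 5)}
assert sum(biased_coin.values()) == 1  # a valid pmf sums to one

print(probability_from_pmf(set(), biased_coin))       # 0
print(probability_from_pmf({'H'}, biased_coin))       # 3/5
print(probability_from_pmf({'T'}, biased_coin))       # 2/5
print(probability_from_pmf({'H', 'T'}, biased_coin))  # 1
```

`Fraction` is used here only to keep the arithmetic exact; with floats the same sums would be subject to rounding.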

import pandas as pd
# Initialize an empty list to store events and probabilities
events_data = []

# Generate events and probabilities
for event in generate_event_space(sample_space_dice):
    event_tuple = tuple(event)
    event_prob = probability(event, sample_space_dice)
    events_data.append((event_tuple, event_prob))

# Create a pandas DataFrame
df = pd.DataFrame(events_data, columns=['Event', 'Probability'])

df

	Event	Probability
0	()	0.000000
1	(1,)	0.166667
2	(2,)	0.166667
3	(3,)	0.166667
4	(4,)	0.166667
...	...	...
59	(1, 2, 3, 5, 6)	0.833333
60	(1, 2, 4, 5, 6)	0.833333
61	(1, 3, 4, 5, 6)	0.833333
62	(2, 3, 4, 5, 6)	0.833333
63	(1, 2, 3, 4, 5, 6)	1.000000

64 rows × 2 columns
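A probability law is expected to satisfy three standard axioms: non-negativity, normalization (P(Ω) = 1), and additivity over disjoint events. As a hedged sketch, the check below verifies all three for the equally-likely law on the die. It re-implements the law with sets and exact fractions rather than reusing the notebook's `probability` function, and `uniform_probability` is an illustrative name.

```python
from fractions import Fraction
from itertools import chain, combinations

def uniform_probability(event, sample_space):
    """P(E) = |E| / |Ω| for equally likely outcomes (illustrative helper)."""
    event, sample_space = set(event), set(sample_space)
    if not event <= sample_space:
        raise ValueError("Event must be a subset of the sample space.")
    return Fraction(len(event), len(sample_space))

omega = {1, 2, 3, 4, 5, 6}
events = [
    set(s)
    for s in chain.from_iterable(
        combinations(sorted(omega), r) for r in range(len(omega) + 1)
    )
]

# Axiom 1: non-negativity
assert all(uniform_probability(e, omega) >= 0 for e in events)

# Axiom 2: normalization
assert uniform_probability(omega, omega) == 1

# Axiom 3: additivity for every pair of disjoint events
for a in events:
    for b in events:
        if a.isdisjoint(b):
            assert uniform_probability(a | b, omega) == (
                uniform_probability(a, omega) + uniform_probability(b, omega)
            )

print(uniform_probability({1, 3, 5}, omega))  # 1/2
```

The additivity loop covers all 64 × 64 pairs of events and only tests the disjoint ones.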