[ENG] Probability and Statistics: Two Sides of the Same Coin


Probability and statistics are closely related fields that are often mentioned together, leading to the misconception that they are one and the same. This article clarifies the distinction between probability and statistics, providing definitions of fundamental concepts within these domains.

1. Probability: Embracing Randomness

The concept of randomness refers to events whose outcomes cannot be predicted with certainty. A classic example is flipping a coin: randomness means that we cannot know for sure whether the coin will land heads or tails. For a fair coin, the probability of landing heads is 50%.

Therefore, probability is the study of randomness. It deals with the likelihood of events occurring. There are two common interpretations of probability:

  • Relative frequency: This refers to the long-run proportion of times an event occurs. For instance, if we flip a fair coin many times, we expect heads to appear roughly half the time. As the number of flips increases, the proportion of heads approaches 50%; this convergence is guaranteed by the law of large numbers (see the sketch after this list).
  • Degree of belief: Probability can also be interpreted as our level of confidence that an event will happen. This interpretation is more subjective but useful in situations where multiple trials are not feasible, such as weather forecasting. When considering the likelihood of rain, we analyze factors like cloud cover, humidity, and the historical probability of rain under similar conditions. Different individuals may arrive at different probability estimates based on their interpretations of these factors.
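
To make the relative-frequency interpretation concrete, here is a minimal Python sketch (the flip counts and random seed are arbitrary choices for illustration) that flips a simulated fair coin and tracks the proportion of heads:

```python
import random

# Estimate P(heads) by the relative frequency of heads over many flips.
# As the number of flips grows, the estimate should approach 0.5
# (law of large numbers).
random.seed(42)
heads = 0
for n in range(1, 100_001):
    heads += random.random() < 0.5   # simulate one fair coin flip
    if n in (10, 100, 1_000, 10_000, 100_000):
        print(f"after {n:>6} flips: proportion of heads = {heads / n:.4f}")
```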

One application of probability is in communication systems. These systems transmit information from one point to another. When we speak on the phone, our voice is converted into a sequence of 0s and 1s called information bits. These bits are then transmitted from our phone's antenna to a nearby cell tower. However, the transmission process is susceptible to noise. For example, a phone might transmit the sequence 0-1-0-0-1-0, but the received sequence is 0-1-0-1-1-0. The fourth bit is an error, affecting the call's sound quality. This noise is a random phenomenon: we cannot know beforehand whether a bit will be affected. It's like flipping a coin for each bit to determine if it will be corrupted. Probability theory is extensively used in designing modern communication systems to understand noise behavior and implement error correction techniques.
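
The bit errors described above can be modeled as an independent coin flip per bit. Below is a minimal sketch of this idea, assuming a simple binary symmetric channel with a hypothetical flip probability of 0.1; real systems use far more elaborate channel models and error-correcting codes:

```python
import random

def transmit(bits, flip_prob=0.1):
    """Simulate a noisy channel: each bit is flipped independently
    with probability flip_prob (a binary symmetric channel)."""
    return [b ^ 1 if random.random() < flip_prob else b for b in bits]

sent = [0, 1, 0, 0, 1, 0]
received = transmit(sent)
errors = sum(s != r for s, r in zip(sent, received))
print(f"sent     = {sent}")
print(f"received = {received}")
print(f"bit errors: {errors}")
```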

Therefore, randomness is ubiquitous, and probability theory, simply put, is the study of randomness.

Here are examples demonstrating probability [2]:

  • Example 1: Rolling a die once

    • Alice receives $1 if the die roll is ≤ 3
    • Bob receives $2 if the die roll is ≤ 2

    The question is, would we rather be Alice or Bob?

    Let's denote the expected winnings of Alice and Bob as E[A] and E[B], respectively. We see that E[B] > E[A], so we would rather be Bob:

    $$E[A] = 1 \times \frac{3}{6} = \frac{1}{2}, \qquad E[B] = 2 \times \frac{2}{6} = \frac{2}{3} > \frac{1}{2}$$
  • Example 2: Rolling a die twice

    • Choose a number between 2 and 12
    • You win $100 if you correctly guess the sum of the two rolls

    Let's denote the sum of the two rolls as X+Y and the number we choose as Z. Our expected winnings are then 100 × P[X+Y = Z], so we need to calculate P[X+Y = Z] for Z ranging from 2 to 12 (see the sketch after this list).
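
As a quick check for Example 2, a minimal Python sketch can enumerate all 36 equally likely outcomes of two rolls and tabulate P[X+Y = Z]; the best choice turns out to be Z = 7, with probability 6/36:

```python
from fractions import Fraction
from collections import Counter

# Enumerate all 36 equally likely outcomes of two die rolls
# and count how often each sum Z appears.
counts = Counter(x + y for x in range(1, 7) for y in range(1, 7))

for z in range(2, 13):
    p = Fraction(counts[z], 36)
    print(f"P[X+Y={z:2d}] = {counts[z]}/36 = {float(p):.3f}")
```

With Z = 7, the expected winnings are 100 × 6/36 ≈ $16.67.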

2. Statistics: Inferring from Data

Continuing the previous example: from a probabilistic perspective, we assume that each face of the die has a probability of 1/6. From a statistical perspective, we do not take this for granted; instead, we roll the die repeatedly, count the occurrences of each face, and estimate their probabilities from the observed data.
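
Here is a minimal sketch of that estimation procedure, simulating a fair die in Python (the 10,000-roll count is an arbitrary choice); with more rolls, the estimates get closer to 1/6:

```python
import random
from collections import Counter

# Simulate rolling a fair die many times and estimate P(face) from the counts.
n_rolls = 10_000
rolls = [random.randint(1, 6) for _ in range(n_rolls)]
counts = Counter(rolls)

for face in range(1, 7):
    print(f"P(face={face}) ≈ {counts[face] / n_rolls:.3f}  (true value 1/6 ≈ 0.167)")
```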

However, for more complicated processes, we can view them as follows:

$$\text{Complicated process} = \text{Simple process} + \text{Random noise}$$

Therefore, statistics is the process of estimating parameters from data, and data originates from random processes. This forms a circle of truth:

[Figure: circle of truth]

When we work with probability, we are given a model and use it to predict data (given model, predict data). Statistics goes the other way: we use data to estimate parameters and refine the model (given data, predict model). This is sometimes referred to as the "Central Dogma of Inference":

  • We collect a set of observations about a population or phenomenon, also known as data. This data is recorded, plotted, analyzed, and interpreted to extract meaningful information.
  • However, data can be variable, and there is inherent uncertainty in making inferences from it.

Statistics uses sample data to make inferences about the entire population. For example, we can use data on the heights of 100 randomly selected students to estimate the average height of all students in a school.
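
Here is a minimal sketch of this kind of inference, with simulated heights standing in for real measurements (the 170 cm mean, 8 cm spread, and the normal-approximation confidence interval are assumptions made purely for illustration):

```python
import math
import random

# Hypothetical data: heights (cm) of 100 randomly sampled students.
# In practice these would be real measurements; here we simulate them.
random.seed(0)
sample = [random.gauss(170, 8) for _ in range(100)]

n = len(sample)
mean = sum(sample) / n
var = sum((x - mean) ** 2 for x in sample) / (n - 1)   # sample variance
std_err = math.sqrt(var / n)                           # standard error of the mean

# Approximate 95% confidence interval for the population mean
# (normal approximation, reasonable for n = 100).
lo, hi = mean - 1.96 * std_err, mean + 1.96 * std_err
print(f"estimated average height: {mean:.1f} cm (95% CI: [{lo:.1f}, {hi:.1f}] cm)")
```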

[Figure: central dogma of inference]

From this, we can see that a probabilistic model or statistical inference aims to:

  • Characterize the randomness or "noise" in the data
  • Quantify the uncertainty in models or decisions derived from the data
  • Predict future observations or decisions in the face of uncertainty

3. Conclusion: Two Sides of the Same Coin

Probability and statistics are intricately connected fields.

  • Probability studies randomness by describing the random process that generates a sample and providing the likelihood of specific outcomes.

  • Statistics estimates parameters from data, aiming to make inferences about the population based on the observed sample.


References

  1. Pishro-Nik, H. (2016). Introduction to probability, statistics, and random processes.
  2. MIT 18.650 Statistics for Applications (Fall 2016). Lecture 1: Introduction.